Summary Statistics With Two Variables – R For Economists Basics 9

Okay, we are back again. Last time we talked about how to get summary statistics of a single variable: things like means, standard deviations, summary statistics tables, all that good sort of stuff. This time we’re going to expand a little bit and look at summary statistics, or basic calculations, for two variables at the same time. Basically, we’re going to look at how two variables relate to each other and work together.

We’re going to be working with the exact same data set as last time, so I’ve already loaded the foreign package and then used it to load the wage1 data set, so we should be ready to go.

The first thing we’re going to do, right off the bat, is a correlation: a regular old correlation. As you might expect, the name of the function has to do with correlation: it’s cor. It calculates the correlation between two variables. In particular, let’s say we’re interested in whether there might be some sort of correlation between education and wages. So we’ve got the wage variable from the wage1 data set as the first input, and the education variable from the wage1 data set as the second. Calculate it out, and it tells us the correlation between the two is 0.406: pretty good, and positive, as you would expect. We might also be interested in whether that correlation is statistically significant, which we can check.
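As a rough sketch of the correlation call just described: the snippet below builds a tiny made-up stand-in for wage1 so it runs without the data file. The numbers are invented for illustration, and the column names wage and educ are assumed to match the real data set, so the result will not be the 0.406 reported from the actual data.

```r
# Toy stand-in for the wage1 data (invented values, NOT the real data set),
# so the example runs without the .dta file.
wage1 <- data.frame(
  wage = c(3.10, 3.24, 3.00, 6.00, 5.30, 8.75, 11.25, 5.00, 3.60, 18.18),
  educ = c(11, 12, 11, 8, 12, 16, 18, 12, 12, 17)
)

# Pearson correlation between wages and years of education.
# cor() takes the two variables as its first two arguments.
r <- cor(wage1$wage, wage1$educ)
print(r)
```

With the real wage1 data loaded, the same cor(wage1$wage, wage1$educ) line is all you would need.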

To test that, all we have to do is the exact same thing we just did, except with the test version of the function: cor.test. By the way, if you weren’t sure how to do this, a lot of functions have variations like that. If I just type cor. to see whether there are any variations I can run, test pops up for me immediately, right there. I don’t have to go looking it up; it just tells me exactly what’s available. So if I run cor.test, it gives me all the information: it does the same calculation, 0.406 as the correlation, it gives me the 95% confidence interval for that correlation, and it gives me the p-value for the null hypothesis that the correlation is equal to zero.
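Here is a minimal sketch of the test version, again on an invented stand-in for wage1 (the values are made up and the column names wage and educ are assumptions, so the numbers will differ from the ones in the video). It shows where the estimate, the confidence interval, and the p-value live in the result.

```r
# Toy stand-in for the wage1 data (invented values, NOT the real data set).
wage1 <- data.frame(
  wage = c(3.10, 3.24, 3.00, 6.00, 5.30, 8.75, 11.25, 5.00, 3.60, 18.18),
  educ = c(11, 12, 11, 8, 12, 16, 18, 12, 12, 17)
)

# Same two inputs as cor(), but cor.test() also tests whether the
# correlation is equal to zero.
ct <- cor.test(wage1$wage, wage1$educ)
print(ct)

# Individual pieces of the result:
ct$estimate  # the correlation itself
ct$conf.int  # 95% confidence interval (the default confidence level)
ct$p.value   # p-value for the null that the correlation is zero
```

Printing ct gives the full summary in one go; pulling out the pieces is handy if you want to use them in later calculations.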

So there we go: now we have a correlation, along with a test of whether it differs from zero.

Another common way of looking at the relationship between two variables is simply looking at a frequency table. We used table last time to look at the different values of a variable and how many observations took each value. For example, we did table(wage1$educ) to show the distribution of education: how many observations there are for each number of years of education. We can also do this easily with two variables, and it will show us the cross-tabulation: how many observations fall into each combination of values of the two variables. If we did that with education, that’d be a lot of different cells to track, so let’s just do it with some simpler ones.
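To make the one-variable and two-variable versions concrete, here is a small sketch on invented data. The column names educ, female, and married are my own choice of example variables; the transcript does not say at this point which "simpler" variables are used.

```r
# Toy stand-in for the wage1 data (invented values, NOT the real data set).
# female and married are hypothetical 0/1 indicators chosen for illustration.
wage1 <- data.frame(
  educ    = c(12, 12, 16, 11, 12, 16, 12, 11),
  female  = c(1, 0, 0, 1, 1, 0, 0, 1),
  married = c(0, 1, 1, 0, 1, 1, 0, 0)
)

# One variable: how many observations take each value of education.
one_way <- table(wage1$educ)
print(one_way)

# Two variables: a cross-tabulation. Rows are values of the first
# variable, columns are values of the second.
two_way <- table(wage1$female, wage1$married)
print(two_way)
```

Each cell of the two-way table counts the observations with that particular combination of values, which is why variables with only a few distinct values keep the table readable.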

So we’re going to calculate a cross-tabulation. This is the same table function as before, but this time we’re going to put in two variables instead of one, and it will automatically know that we want a cross-tabulation. So the