Data Science Initiative Introduction To R Bootcamp (part 2)

Okay, let's just do this. This is a different way of faceting, okay? This just facets by the type, and it just goes across as a single row, and it may or may not work, okay? The aspect ratio is very different from the other version.

Again, the other version was just this facet_wrap with type and ncol = 3. We actually have to use two different commands here, one for grid and one for wrapping, so that it would go around rows and columns.

But look, I didn't redo the actual instructions. All I'm basically saying is: build up all these layers, don't do anything with it yet. Now I want you to go render it in a particular way. Oh, hang on a second, that's not what I wanted. Take the plot again, the description of the plot, and draw it or organize it a different way.

And look, this is a reusable object, okay? I don't need to reissue commands to do this. It's fully self-contained. If you wanted to, you could send me this plot, okay? You could save this object, this p, send it to me, and I would be able to organize it the way I want, but you've spent hours constructing all these layers, okay? And that's a very useful thing to be able to do.
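As a rough sketch of the idea of building the layers once and rendering them in different ways: the example below uses the mpg data set that ships with ggplot2 as a stand-in for the data used in class, and the variable names p_row, p_wrap, f and p2 are made up for illustration.

```r
library(ggplot2)

# Build the layers once and keep the description of the plot in p.
# mpg is only a stand-in for the data used in the session.
p <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point()

# Render the same object two ways without rebuilding it.
p_row  <- p + facet_grid(. ~ class)          # one row, one panel per class
p_wrap <- p + facet_wrap(~ class, ncol = 3)  # panels wrapped over 3 columns

# The plot object is self-contained: save it, send the file, reload it.
f <- tempfile(fileext = ".rds")
saveRDS(p, f)
p2 <- readRDS(f)

# Whoever receives the file can organize the panels their own way.
p2 + facet_wrap(~ class, ncol = 2)
```

Nothing is drawn until a plot object is printed, which is why p can be passed around and faceted later.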

Here we're minimizing the code that we actually have to repeat to express all of this stuff. Everyone happy? Let's take a break for ten minutes, okay? Come back at a quarter of eleven and then we'll talk about some other stuff.

[Partly inaudible exchange with the audience.] Yeah, sure, there's a thing called dodge. So if you look up dodge and jitter, and also, I believe I have it here, if you google it, or if you go to... and of course, wouldn't you know it, Zoom is sitting on top of me.

If you go here, ggplot2.tidyverse.org, there's a whole reference guide to this stuff. Whether this is enough for you to learn ggplot, probably not, but it's enough for a reference, okay? So once you understand the concepts you can actually use these, or you can look at them and say, okay, I should squirrel that away in my head, I may need that in the future.

So this talks about the basic functions that you have, and the layers, and so forth. We've talked about geom_point for making a scatter plot; we also did geom_boxplot. Well, there's geom_abline, geom_vline, geom_hline, and geom_bar for bar plots, okay? These correspond to different types of plots, but you can also use them in creative ways, so you can go and do a bunch of different things.
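A small sketch of how those geoms can be combined, using the built-in mtcars data rather than the data from the session; the object names scatter, boxes, bars and fit are made up for illustration.

```r
library(ggplot2)

# Least-squares line for mpg against weight, used for geom_abline below.
fit <- coef(lm(mpg ~ wt, data = mtcars))

# Scatter plot with reference lines layered on top.
scatter <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_hline(yintercept = mean(mtcars$mpg), linetype = "dashed") +  # horizontal line
  geom_vline(xintercept = mean(mtcars$wt), linetype = "dashed") +   # vertical line
  geom_abline(intercept = fit[[1]], slope = fit[[2]])               # y = a + b * x

# Box plot and bar plot of the same data.
boxes <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) + geom_boxplot()
bars  <- ggplot(mtcars, aes(x = factor(cyl))) + geom_bar()  # counts per group

scatter
```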

So it's worthwhile looking through and reading about these. I don't think I have it here, I think it's in my office, but there's also the R Graphics Cookbook, which has a lot of interesting, you know, one- or two-page recipes for things. It's a very easy read, so you can kind of flip through it and think, oh, I should remember how to make that sort of plot. So there are all these different geometries, and then as you go along...

So if we search for dodge, this is what you can do. Inside the aesthetics, or in the geoms, I think it's in the geoms, you can actually specify a position, and you can say dodge, okay? When things overlap, it dodges: it tries to compute ways to avoid overlapping objects and puts them side by side.

Whereas the jitter, well, the jitter just adds this little bit of noise. We just said jitter, but we didn't tell it how much to jitter it by. We could say jitter by 0.01, up to 0.02, 0.1, or whatever. Or, hey, go between zero and 100, and then we would have got a mess of a plot, because all the bedrooms would have gone flying all over the place.

So you want to be able to control this, and you can; this gives you options. There are also things like nudge, okay? Just nudge them a fixed distance. So there are all these different things to try to help with layout.
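To make the three position adjustments concrete, here is a minimal sketch using the mpg and mtcars data sets as stand-ins for the class data; the jitter widths are illustrative values, not the ones used in the session.

```r
library(ggplot2)

# Dodge: bars for each drv within a class are placed side by side
# instead of being stacked on top of each other.
dodged <- ggplot(mpg, aes(x = class, fill = drv)) +
  geom_bar(position = "dodge")

# Jitter with an explicit amount: a little noise separates overlapping points.
small_jitter <- ggplot(mpg, aes(x = cyl, y = hwy)) +
  geom_point(position = position_jitter(width = 0.1, height = 0))

# Far too much jitter: points land nowhere near their true x value.
big_jitter <- ggplot(mpg, aes(x = cyl, y = hwy)) +
  geom_point(position = position_jitter(width = 100, height = 0))

# Nudge: shift every label by the same fixed distance.
nudged <- ggplot(mtcars, aes(x = wt, y = mpg, label = rownames(mtcars))) +
  geom_point() +
  geom_text(position = position_nudge(y = 1), size = 3)

small_jitter
```

Printing big_jitter next to small_jitter shows why the amount of jitter needs to be controlled.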

There is another thing, by the way. When I look at this... hang on a second. So let's just do the following. Here's a horrible example. Let's just do this, which is readRDS, and the Craigslist data, oh no, I gave it away, which is Davis and Sac.

Okay, so let me just grab this guy. Again, this is easy, I just grab this guy. You can see this RDS file; I put it up on the web page, okay? So let's just go and call this sac, okay?

So let's go off and say, what's the class of this? It's a data frame. What's the dimension of this guy? It's slightly different, I think, but there are thirteen hundred, no, thirteen thousand observations.

So let's just do the following: the square-foot column of sac against sac$price. Hello, what has gone on? Oh, maybe I didn't actually do this for this one. I haven't computed square foot in this case.

So let's plot the lat and long in sac instead. I'm just trying to make one point here. What do you see here? There are thirteen thousand points here. Clearly this guy's odd, okay? But I don't know how many there are. There could be ten points here, or fifteen points, or something like that.
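The steps just described (read the RDS file, check its class and dimensions, plot the coordinates) might look roughly like this. The real file is not reproduced here, so the sketch writes a simulated data frame to a temporary RDS file first; the column names long and lat and all of the values are assumptions made for illustration.

```r
# Simulated stand-in for the listings data; the column names long and lat
# and every value here are made up for illustration.
set.seed(1)
n <- 13000
fake <- data.frame(
  long = c(rnorm(n - 10, mean = -121.5, sd = 0.2), rep(-80, 10)),  # 10 odd points
  lat  = c(rnorm(n - 10, mean = 38.6, sd = 0.2), rep(35, 10))
)

# Round-trip through an RDS file, as one would after downloading it.
f <- tempfile(fileext = ".rds")
saveRDS(fake, f)
sac <- readRDS(f)

class(sac)  # "data.frame"
dim(sac)    # 13000 rows, 2 columns

# The ten identical odd points draw as a single dot, so the plot hides the count.
plot(sac$long, sac$lat)
```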

So we can, in ggplot or in... let's do this, which is we will say xlim equals... so xlim is equal to minus 122, well, no, minus 125, to minus 120, okay? And this just spreads out. Which has more points, this one or this one? No idea. You can't see anything here, so this is not a good plot, and you're getting a huge amount of overplotting.

So we say scatter smooth is a good way to do this, yeah, smoothScatter. I'm just trying to do this quickly. What this actually does is try to put down densities, and it shows you how many points there are in each region.

We still end up with a lot of nonsense here that we should actually control for; we need to clean up this data set a lot. But we're actually able to see much, much richer relationships between the x and the y here, and this
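A hedged sketch of the contrast between a plain scatter plot and smoothScatter, again on simulated coordinates rather than the real data. The axis limits are one reading of what was said in the session, and the vectors long and lat are made up.

```r
# Simulated coordinates; the names and values are made up for illustration.
set.seed(2)
n <- 13000
long <- c(rnorm(n / 2, -121.7, 0.05), rnorm(n / 2, -121.4, 0.15))
lat  <- c(rnorm(n / 2, 38.55, 0.05), rnorm(n / 2, 38.6, 0.15))

# Restricting the x axis with xlim spreads the points out,
# but heavy overplotting remains.
plot(long, lat, xlim = c(-125, -120))

# smoothScatter() draws a 2D kernel density estimate, so darker regions
# hold more points.
smoothScatter(long, lat, xlim = c(-125, -120))

# A rough count per half-degree band of longitude is also available directly.
counts <- table(cut(long, breaks = seq(-125, -120, by = 0.5)))
counts
```

smoothScatter is in the graphics package that comes with R and relies on the KernSmooth package, which is normally installed alongside R.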