Data Science Initiative Introduction To R Bootcamp (part 2)

Messages and make any sense of them. Stack Overflow is your friend: everyone else has run into the same problems you have, and hopefully somebody has actually written a coherent answer. So Stack Overflow is great, but you need to understand the art of reading Stack Overflow.

There’s an art to actually looking through 25 answers and going, “I know which one it is that’s actually relevant.” Stack Overflow tries to rank them, but sometimes it gets the ranking wrong. So what should we do here? This is debugging and problem solving, which unfortunately is not written down in many books. I want the shape. I want a plot. I mean, otherwise, you know, let’s just not make a plot at all.

Let’s just go home. No, I wanted a plot. Okay, and again I’ll show you: what we actually wanted is a plot that looks like this. What did we actually say here? We said bedrooms is here, and we have the price here.

We have plotting characters here, okay, triangles and so on, and this is what I wanted because I wanted to see the number of baths. I don’t know why, but I want to see the number of baths. There may be many ways to plot this, but remember, we basically said that there’s the square footage, there’s the price, there’s the number of bedrooms, and there’s the number of baths; there’s the latitude and longitude.

I want to look at this data and try to understand how much I should be paying for a house or an apartment or whatever. So we’ve also got the type here. Oh my goodness, there’s a lot of data, there are a lot of variables here, and I need to look at all of this. It would be great if we could draw a five-dimensional plot. It’s hard.

Seriously hard to look at a five-dimensional plot. But you can get two dimensions this way. If you wanted to, you could get three dimensions, but that doesn’t work very well on a two-dimensional screen. But we get an extra dimension this way, and that’s three dimensions. Even if we take this one away, there are three dimensions, okay: we’ve got the bedrooms, the price, and the plotting characters that indicate the number of baths. And if we change the color over here as well, we could actually get another variable in, potentially. Okay.

So how can I get five dimensions? How about if I actually put in the square footage? Okay, so what have we got? Color, I said, was for type. The plotting character, the actual glyph that we use, is for the baths. So that would be bath and type. We’ve got the number of bedrooms there, we’ve got the price here. All I need is the square footage. Can anyone tell me how I can get my fifth dimension? Size. I can change the size. Got it.
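The five mappings just listed can be sketched in ggplot2 roughly as follows. This is only an illustration: the `houses` data frame, its column names, and its values are invented stand-ins for the data on screen, and `bath` is wrapped in `factor()` because ggplot2 needs a categorical variable for `shape`.

```r
library(ggplot2)

# Hypothetical stand-in for the housing data; names and values are invented.
houses <- data.frame(
  bedrooms = c(2L, 3L, 4L, 1L, 3L, 2L),
  price    = c(350000, 520000, 710000, 240000, 480000, 390000),
  bath     = c(1L, 2L, 3L, 1L, 2L, 2L),
  sqft     = c(900, 1500, 2400, 650, 1400, 1100),
  type     = c("house", "house", "house", "apartment", "townhouse", "apartment")
)

# Two dimensions on the axes, three more on colour, shape, and size.
p <- ggplot(houses,
            aes(x = bedrooms, y = price,
                colour = type,          # third dimension: colour
                shape  = factor(bath),  # fourth: glyph, needs categories
                size   = sqft)) +       # fifth: point size
  geom_point()

print(p)
```

With six points this is readable; with thousands of points the same call still runs, but the overplotting makes it very hard to read.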

Okay, so I’m going to make this one big because it’s got a lot of square feet. Okay, five-dimensional plot, no problem. Throw down eleven thousand points like this and your head will explode; all you’ll see is a big mess. So now we may want to think about something slightly different. Is there another way of doing a five-dimensional plot, showing all the data for five variables, but with only four on a scatter plot? The answer is yes. Just be creative.

So let’s take away type here. We’ll just say, let’s do it for a house: I subset and do it for a house. Now what about an apartment? Well, I just subset and do it for an apartment. So what I can do is this: this one is for a house, this is for an apartment, this is for a townhouse, this is for a loft, and so forth. If I have four dimensions on each of these, I get five dimensions in total. So I can layer everything onto one plot with many layers, or I can have what are called facets, or panels, where I subdivide the data just like we did over here and draw separate plots. Everyone happy?
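The subset-per-type idea can be sketched with ggplot2 facets along these lines. Again, `houses` and its columns are invented for illustration, and `facet_wrap()` is one way to get panels, not necessarily the call used in the session.

```r
library(ggplot2)

# Hypothetical stand-in for the housing data; names and values are invented.
houses <- data.frame(
  bedrooms = c(2L, 3L, 4L, 1L, 3L, 2L, 1L, 2L),
  price    = c(350000, 520000, 710000, 240000, 480000, 390000, 300000, 450000),
  bath     = c(1L, 2L, 3L, 1L, 2L, 2L, 1L, 2L),
  sqft     = c(900, 1500, 2400, 650, 1400, 1100, 700, 1000),
  type     = c("house", "house", "house", "apartment",
               "townhouse", "apartment", "loft", "loft")
)

# Doing it by hand: subset to one type and plot only those rows.
only_houses <- subset(houses, type == "house")

# Letting ggplot2 do the subdividing: four dimensions per panel,
# one panel per type, so five dimensions in total.
p <- ggplot(houses,
            aes(x = bedrooms, y = price,
                shape = factor(bath), size = sqft)) +
  geom_point() +
  facet_wrap(~ type)

print(p)
```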

That’s the goal. So I want to see these things. I can’t give up shape here, because I want to keep the five dimensions; I don’t want to go backwards. So the problem with this is that it’s a continuous variable. What’s a continuous variable? Non-discrete. It’s supposed to take on, potentially, any value between a minimum and a maximum. Bath is not like that. So let’s take a look at the class of the bath column in the Davis data frame.
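A minimal sketch of that check, using an invented `houses` data frame in place of the real one:

```r
# Hypothetical stand-in for the bath column; the values are invented.
houses <- data.frame(bath = c(1L, 2L, 3L, 1L, 2L))

class(houses$bath)          # "integer"
is.integer(houses$bath)     # TRUE
sort(unique(houses$bath))   # 1 2 3: a handful of whole numbers, no 1.5
```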

This is why I like to know how I can actually find these things out. What’s the class of this? It’s an integer. That’s not continuous: it cannot take on a value of 1.5, because I told it it couldn’t. But ggplot says it could be anything from minus infinity to plus infinity, and that’s continuous in my mind. So what would we do? I need to make that into a categorical variable. So how will I make it? I need a plotting character for shape: an X, let’s say, corresponds to one category, and a plus corresponds to a different category. I need to map these into categories, not into real-valued numbers, because I can’t map this thing into half an X or three-quarters of an X.

I need to actually make them into categories. How do I make something into a category? What is a categorical variable in R? Remember, we came across them yesterday. What was the name of them? We had them in our data set. So let’s go see if we can remember what the name of it is. I’m going to loop over the columns, and for each column I’m going to call class. I’ll get the class name and then we’ll see if we can recognize it. So what was it?

Factor. Okay, this is a very handy thing: I can’t remember the name of the stupid thing? Well, let’s just ask. I know there’s one in the Davis data frame, so let’s just ask what the class of each column is. This just loops over all the columns and calls class on each of them. You can also look in the output of str.
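A rough sketch of that lookup, followed by the conversion it points to. The `houses` data frame is invented for illustration, and using `factor()` on `bath` is the standard base R way to make categories, assumed here to be what the session does next.

```r
# Hypothetical stand-in for the housing data; names and values are invented.
houses <- data.frame(
  price    = c(350000, 240000, 480000),
  bedrooms = c(3L, 1L, 3L),
  bath     = c(2L, 1L, 2L),
  type     = factor(c("house", "apartment", "townhouse"))
)

# Loop over the columns and call class() on each one.
col_classes <- sapply(houses, class)
print(col_classes)   # price: "numeric", bedrooms and bath: "integer", type: "factor"

# str() reports the same information, with more detail per column.
str(houses)

# Turn the integer bath counts into categories.
houses$bath <- factor(houses$bath)
class(houses$bath)    # "factor"
levels(houses$bath)   # "1" "2"
```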

Somebody down here said that was a little overwhelming, though. Okay, we just asked a more.