Data Science Initiative Introduction To R Bootcamp (part 2)

This goes element by element, so for each observation it compares that observation’s number of bedrooms to its number of bathrooms, and you think: oh hey, this one has more bathrooms than bedrooms. A bit odd. Okay, potentially. Maybe, maybe not. Now let’s go and investigate those. I suspect when you do this, as in that question, you’ll find out that I probably got the number of bathrooms wrong. It’s not that the unit actually has more baths than bedrooms. It may be that the posting was wrong, but it’s more likely that the code I used was wrong, because I was taking shortcuts.

But this is just a logical vector, and now that’s a different condition, and then we can repeat everything here: drop all of those guys, or just look at those guys. That’s all we do, all the time: we start looking at groups, subsets that we care about based on some condition. Everyone happy? How are you guys doing?

Again, the point of this is to try to practice and play. When we organized this, we said we’d go to one o’clock today, and afterwards I’m around, so we can go beyond one o’clock. You can also just sit here and actually practice things and ask questions. The practicing is absolutely key. As I said, I didn’t quite get a chance to do it yesterday, partially because there was too much to cover.
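For practicing, here is a minimal sketch of the bathrooms-versus-bedrooms check described above. It assumes a small made-up data frame; the names `rentals`, `bedrooms` and `bathrooms` are placeholders, not the actual data or column names from the session.

```r
# Toy stand-in for the rental listings; names and values are made up.
rentals <- data.frame(
  price     = c(1200, 1850, 950, 2400),
  bedrooms  = c(2, 3, 1, 2),
  bathrooms = c(1, 2, 2, 3)
)

# Element-by-element comparison: one TRUE/FALSE per listing.
odd <- rentals$bathrooms > rentals$bedrooms
odd

# Look at just the suspicious listings ...
rentals[odd, ]

# ... or drop them and keep the rest.
kept <- rentals[!odd, ]
nrow(kept)
```

The same logical vector is used both ways: as is to look at the group, and negated with `!` to drop it.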

But you can play with these. You can come up with your own questions and just play and ask, how do I do this? And hopefully all this stuff will become reasonably clear. Again, all we’re building on is just vectors. We’re pulling columns out of a data frame, we’re comparing things, we’re doing subtraction and addition and so forth, and we’re then subsetting the rows, or the elements of a vector. That’s all there is to it. We do a huge amount of this sort of thing.

So let’s just do this. I’m going to type in here rather than in RStudio. Are there any questions about what you did over here? Does anyone want to talk about any of these things, or recap anything from yesterday? No? Yes? Yes, I most certainly can.

So let’s add a new column to the data frame. That’s first of all. I want to measure something that, for some reason, I may actually want to use. We can look at the relationship between price and square footage, but we may also find it useful to just compute the price divided by square foot and look for weird things there, or use that in a model. So we want to compute the rent price per square foot. The way we do that is very simple. There are two steps.

One of them is to compute it, and the other is to put it back into the data frame. We may not want to put it into the data frame, but I decided we did. Okay, so what do we do? We basically ask: what do we actually have here?

We have Davis. I’m coming back to this after a while, so let’s just check this is what I think it is. Let’s go and take a look. Okay, good: it’s got a hundred and seventy-five observations, so this is the one I want. We just checked, and I do this just to remind myself that we really do have what we think we have, as opposed to getting surprised later on.

So what do I want? This is very simple: Davis dollar price divided by Davis dollar square foot. And then we get all those values. Now, we get missing values because the price or the square foot might be missing. We can take a look and see what it is.
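A minimal sketch of those two moves, checking what we have and then dividing one column by another. The data frame `davis` and its columns `price` and `sqft` are assumptions made up for illustration; the real data has 175 rows and its column names may differ.

```r
# Toy stand-in for the Davis listings; names and values are assumptions.
davis <- data.frame(
  price = c(1200, 1850, 950, 2400),
  sqft  = c(800, NA, 500, 1200)
)

# Remind ourselves what we actually have before computing anything.
dim(davis)    # number of rows and columns
names(davis)  # column names

# Vectorized division: one result per listing.
# Wherever price or sqft is NA, the result is NA as well.
ppsf <- davis$price / davis$sqft
ppsf
```

The division never complains about the missing square footage; it quietly returns `NA` for that listing, which is why it is worth looking at the result.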

Remember, these missing values are important. So we can take Davis dollar price and ask how many are missing. There are none missing in there, but in the square footage we do actually have some that are missing. Before we compute price per square foot, we probably want to go and fix the square footage if we can. We might want to look at the posts and see if we can actually compute the square footage. But that’s fine, we can always fix them up later on.
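One way to ask "how many are missing" is sketched below. It again assumes a made-up `davis` data frame with hypothetical `price` and `sqft` columns; the counts here are for the toy data, not the real listings.

```r
# Toy data; names and values are assumptions for illustration.
davis <- data.frame(
  price = c(1200, 1850, 950, 2400),
  sqft  = c(800, NA, 500, NA)
)

# is.na() gives TRUE/FALSE per element; sum() counts the TRUEs.
n_missing_price <- sum(is.na(davis$price))
n_missing_sqft  <- sum(is.na(davis$sqft))
n_missing_price
n_missing_sqft

# The same count for every column at once.
missing_by_column <- colSums(is.na(davis))
missing_by_column
```

Counting first tells you whether a fix is worth the effort before you compute anything from that column.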

So now we go back over here. Okay, we’ve computed the price per square foot. The only thing left to do now is put it back into the data frame. Can anyone suggest how I do that? I’ll give you a hint: it starts with Davis, because that’s where we’re putting it.

And I’ll give you another hint: the value goes on the right-hand side. This is the thing we computed, and now I need to put it back in here. So just guess. I’m going to call it price square foot.

I need a new column called price square foot. I could type price per square foot, but life is short, so this is the name I want. And now the only thing I need to do is this: how do I get a column out? The dollar sign. How do I get a column in? The dollar sign. Remember what I said: the subsetting works when you’re getting stuff out or when you’re putting stuff in. So I will now just create a new variable which contains this value. And as always, I tend to compute it first, check to see that it makes sense, and then shove it back in later on. So do the calculations in two steps. There we go.
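The two steps can be sketched like this. The data frame `davis`, the columns `price` and `sqft`, and the new column name `price_sqft` are all placeholders chosen for the example; the exact spelling used in the session is not known.

```r
# Toy data; names and values are assumptions for illustration.
davis <- data.frame(
  price = c(1200, 1850, 950, 2400),
  sqft  = c(800, 925, 500, 1200)
)

# Step 1: compute the values and check that they make sense.
ppsf <- davis$price / davis$sqft
summary(ppsf)

# Step 2: put them back in. The same $ that gets a column out
# also puts a column in when it appears on the left of <-.
davis$price_sqft <- ppsf

names(davis)
```

Keeping the computation in its own variable first means a mistake shows up before it is written into the data frame.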

What is the class of Davis dollar price square foot? It’s numeric. What are the names of Davis? Oh look, we’ve got price square foot tacked onto the end. Life is good. So we just shoved it in. And again, these line up observation by observation with the original data. If you screw up the order, you’re in trouble: you’ve just now said that that price per square foot belongs to a different rental unit. That’s really bad news.
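A small sketch of those checks, and of what goes wrong when the order changes. Everything here is made up for illustration: the `davis` toy data, the column names, and the use of `sort()` as one example of accidentally reordering values.

```r
# Toy data; names and values are assumptions for illustration.
davis <- data.frame(
  price = c(1200, 1850, 950),
  sqft  = c(800, 925, 500)
)
davis$price_sqft <- davis$price / davis$sqft

class(davis$price_sqft)  # a numeric vector
names(davis)             # the new column sits at the end

# Pitfall: if the values get reordered before assignment,
# each one lands on a different rental unit.
wrong <- davis
wrong$price_sqft <- sort(davis$price / davis$sqft)

# Recomputing from the row's own price and sqft exposes the mismatch.
isTRUE(all.equal(wrong$price_sqft, wrong$price / wrong$sqft))
```

R raises no error in the second case, because the vector still has the right length; only a check against the original columns reveals the problem.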

Any other questions? No? None. So things like is.na are your friend. You need that to actually identify the missing values, and then you’re just going to go off and subset. Life is good.
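As a rough sketch of "identify, then subset" with `is.na()`: the data frame and column names below are invented for the example.

```r
# Toy data; names and values are assumptions for illustration.
davis <- data.frame(
  price = c(1200, 1850, 950, 2400),
  sqft  = c(800, NA, 500, NA)
)

# Identify the missing values: one TRUE/FALSE per row.
missing_sqft <- is.na(davis$sqft)

# Rows that need a closer look (go back to the posting).
to_fix <- davis[missing_sqft, ]
to_fix

# Rows that are safe to use in a price-per-square-foot calculation.
complete <- davis[!missing_sqft, ]
complete
```

This is the same pattern as any other condition: build a logical vector, then use it inside the square brackets to pick rows.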

So, one of the things about this data, and one of the reasons I chose it, is that you are all living somewhere near here, hopefully. And so: are you getting a good deal on what you’re renting?