Data Science Initiative Introduction To R Bootcamp (part 2)

Today is the day that we try to make sense of it all. So no, you just come with questions, and come with, like, “Okay, fine, we covered a lot of stuff, now we’ve got to pull it all together.” I’m with you. Learning around here, it doesn’t like capitals when you.

Don’t win it when you lose cap is this whose cap I thought I thought might be yours.

And anyone recognize this anyone new book excellent good with it and that one’s gone for six months it’s just minding its permanently no.

No, you want to get rid of it, because it’s going to show up in plots, it’s going to show up in all these computations. So if you want to get rid of something, you’ve got to... So we’ve got all these things that we’ve learned.

And today we’ll try to assemble them. But remember, if you’ve got a data frame d and you knew the row number, you could say d[-10, ]. That will give you back a new object, which is a copy of d but with that row gone, and you could then say d2 equals this thing. Okay, but you have to find out which row it is. And you may not want to remove it from the original data set; you may want to create a new data set that actually has it removed.

I think we’re just creating copies, or new variables. So that’s one way. But remember what we were doing before.
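A minimal sketch of dropping a row by position, as described above. The data frame here is a made-up stand-in for the postings data; the column names and values are assumptions for illustration only.

```r
# Hypothetical toy data frame standing in for the apartment postings.
d <- data.frame(
  price    = c(1200, 15, 950, 2100),
  bedrooms = c(2, 1, 1, 3)
)

# A negative row index returns a copy without that row.
# The original d is not modified.
d2 <- d[-2, ]

nrow(d)   # number of rows in the original
nrow(d2)  # number of rows in the copy
```

This only helps once the row number is known, which is why the next step is to find the rows with a condition rather than by position.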

Okay, let’s compute the minimum of, I’m going to call it d for Davis, d$price: min(d$price). So now we’ve actually got that minimum value. Now we need to go find out which rows are equal to that, which rows have that value of price. So we now say d$price == that minimum, and I’m going to call this w, for kind of “which” or whatever.

Okay, and this is now a logical vector, that is, TRUEs and FALSEs. And again, I test everything afterwards: table(w), that’ll tell me how many TRUEs and how many FALSEs there are. I should check the class of this too, but I’ll just do it here and hopefully I’ll just see the TRUEs and FALSEs.

So now that’s giving me a logical vector, and then if I do something like d[w, ], this will pull out all of the columns for just the rows which are TRUE. Remember, we can subset by a logical vector, which is the same thing as if we put in which(w) here.
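The steps just described can be sketched as follows. The data frame is a hypothetical stand-in; only the names d, w, price and body come from the session, and the values are invented.

```r
# Hypothetical toy data; two rows share the lowest price.
d <- data.frame(
  price    = c(1200, 15, 950, 2100, 15),
  bedrooms = c(2, 1, 1, 3, 1),
  body     = c("Sunny 2br", "house cleaner", "Studio",
               "Big 3br", "house cleaner, repost"),
  stringsAsFactors = FALSE
)

m <- min(d$price)    # smallest price in the data
w <- d$price == m    # logical vector: TRUE where price equals the minimum

class(w)             # check that it really is a logical vector
table(w)             # counts of FALSE and TRUE

d[w, ]               # all columns, only the rows where w is TRUE
d[which(w), ]        # same rows, selected by integer position instead
```

One caveat, assuming the real data may have missing prices: min() returns NA when the vector contains NA unless na.rm = TRUE is given, and a logical index containing NA yields rows of NAs, whereas which() drops them.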

We just don’t have to, and I don’t like typing any more than I have to. But that’s what we’re doing here: find the observation, or maybe multiple ones, with the lowest value for price. So here’s the lowest value of price; here’s a logical vector that says, for each observation, does it have the same value as the minimum price? And now we can go in and pull out that observation, and then we can look at the whole thing, all 19 or 18 variables, and say, does it actually make any sense?

Has anyone found it yet? Anyone know why it’s only fifteen dollars? A house cleaner in Houston. The actual location that we were scraping... it’s amazing. Seventeen... So just to be clear, we haven’t removed anything yet, okay?

We just said, tell me the ones which satisfy this criterion. Then go get the subset of those observations, or rows, in the data frame, showing all the columns. I might go in here and type "body" in quotes here; I just want to look at the body. I’ll read the whole posting, and that’ll be sufficient to actually say, this is nuts. Okay, so this would just show me.

Now, if I wanted to get rid of these guys, there’s two things we want to do. I started off with a data frame that looks like this, and then I ended up with a data frame that just looks like this, one row, because I just wanted to see it. Then I make up my mind and say, you know, that one’s just garbage. Okay, but I’m a little conservative. I’m going to say d2 = d[!w, ].

What’s this? This is w, which is TRUEs and FALSEs; TRUE means it is equal to the minimum. So !w means turn all the TRUEs into FALSEs and all the FALSEs into TRUEs. You can write this as w == FALSE, which is kind of weird: if the value is equal to FALSE, then the answer is TRUE. This is starting to get a little bit weird, so let’s not do that. The ! just inverts the TRUEs and FALSEs; it’s just my way of thinking about that. So now this one will actually pull out all the rows which don’t match the minimum and then assign it to d2.
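A short sketch of the two steps above: look at one column for the flagged rows, then build a new data frame without them. The data are hypothetical; d, d2, w, price and body are the names used in the session, everything else is assumed.

```r
# Hypothetical toy data standing in for the postings.
d <- data.frame(
  price = c(1200, 15, 950, 2100),
  body  = c("Sunny 2br", "house cleaner", "Studio", "Big 3br"),
  stringsAsFactors = FALSE
)

w <- d$price == min(d$price)   # TRUE for the row(s) at the minimum price

d[w, "body"]                   # just the posting text for the flagged rows

d2 <- d[!w, ]                  # keep the rows where w is FALSE; d is unchanged

identical(!w, w == FALSE)      # the two spellings give the same vector

min(d2$price)                  # lowest remaining price, for the next look
```

Assigning to d2 rather than overwriting d is the conservative choice mentioned above: the original rows stay available if the decision turns out to be wrong.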

With d2, we could then go on and try again and repeat the same process: find the minimum in d2, and then you get 17, let’s say, and you go look at that one. Is that clear, that you could just repeat the process?

Alternatively, the same logic works. We said, fine, this is the condition I’m looking for. Well, what if we just said something like, you know, I just don’t want to see any apartments less than 50 bucks, because they’re probably not apartments? We may make that up, or we may look at the values of price and then say, you know, it looks like 200 bucks is the minimum that I care about. So we could change this to: price is less than, let’s just say, 50 bucks. Now that’s going to include multiple observations. Actually, I don’t think it will; I think the real minimum is actually about $600. But this could potentially pull in more, and then we just drop all of those guys. Okay, happy?

But again, there’s a principle here: this is TRUEs and FALSEs, and we can make the TRUEs and FALSEs using any condition we want.
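The same pattern with a threshold instead of the exact minimum might look like this. The cutoff of 50 is the figure mentioned above; the data frame and the variable name cutoff are assumptions for illustration.

```r
# Hypothetical toy data with two implausibly cheap postings.
d <- data.frame(
  price    = c(1200, 15, 17, 950, 2100),
  bedrooms = c(2, 1, 1, 1, 3)
)

cutoff <- 50               # hypothetical threshold; pick it by looking at price
w <- d$price < cutoff      # TRUE for every row below the cutoff

table(w)                   # how many rows would be dropped

d2 <- d[!w, ]              # new data frame without the flagged rows
```

Only the line that builds w changes from one rule to the next; the checking and subsetting steps stay the same.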

I could say it’s the minimum price, or it’s less than 50, or we could do something like this, and I’m not saying this is a good idea: identify the ones where the number of bedrooms is less than the number of bathrooms. Okay, and this is going