Data Science Initiative Introduction To R Bootcamp (part 2)

You were guessing what that might do, and it did it. It took a few shortcuts, though, so we need to do a bit more. What if we did the following? Come on, let’s just try this one: 1.232. Okay, so there’s a string, and there we go, it got it right. What if I do the following: 1.32, and what if I put in abc? Again, you can do tests; just check. It says that’s 1.32 and this is a missing value. So that gives me a hint. And this, by the way, is just a generic approach that you can use.

Okay, let’s do the following: as.character(d$velocity). I’m just going to say, forget the fact that they’re factors, just turn them into strings. Let’s go straight to strings and get them back. By the way, I could have said something was wrong when we did read.csv, and I could have said stringsAsFactors = FALSE. That is, look, I’m in diagnostic mode; I don’t want you to do the right thing, I want you to leave it like that so I can fix this up. But I can also just do it this way. So now I’m going to call this temp. Let’s just take a quick look at this. There we go. Okay, these all look like numbers to me.
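A minimal sketch of the two checks just described, using a small made-up data frame in place of the real file. The column values and the "sensor status" text are invented for illustration, and stringsAsFactors = TRUE is set explicitly because newer versions of R no longer create factors by default.

```r
# as.numeric() parses strings that look like numbers and returns NA
# (with a coercion warning) for anything it cannot convert.
as.numeric("1.232")
suppressWarnings(as.numeric(c("1.32", "abc")))

# Hypothetical stand-in for the real data: a single non-numeric entry
# makes read.csv() treat the whole column as text, here as a factor.
csv_text <- "velocity\n1.5\n2.25\nsensor status\n3.0"
d <- read.csv(text = csv_text, stringsAsFactors = TRUE)
class(d$velocity)    # "factor"

# Go straight back to the strings, ignoring the factor encoding.
temp <- as.character(d$velocity)
head(temp)

# Alternative: ask read.csv() not to create factors in the first place.
d2 <- read.csv(text = csv_text, stringsAsFactors = FALSE)
class(d2$velocity)   # "character"
```

Either route ends with a plain character vector, which is the easier thing to inspect when a column refuses to be numeric.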

I don’t know why R is still being so stupid. By the way, how big is d? 16,000. It seems like a good enough random sample to just look at the first six, right? Let’s look at the first six: they all look like numbers. What could possibly go wrong? There’s only another 16,000 later on. Okay, that’s probably going to be our issue: we’re only looking at the top of the file. So let’s do the following: as.numeric(temp).

Okay, and let’s call this vals. Hang on a second, that wasn’t good: we got NAs. That’s why I did that thing earlier on. I said if we give it something that it can’t turn into a number, we get an NA. So let’s go back and look at temp, which is my character vector, where vals is missing. This says which elements in vals are NA, and then we go back and look at where they came from, which is temp: the corresponding elements in temp. And, oh, there we go, that’s my problem. Whoever edited this forgot to take out these lines. Where are those lines? There you go.
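The lookup described here can be sketched as follows. The vector below is a hypothetical stand-in for temp; the two "sensor status" entries are invented and are not values from the real file.

```r
# Hypothetical character vector standing in for temp.
temp <- c("1.5", "2.25", "sensor status", "3.0", "4.75", "sensor status")

# Entries that cannot be parsed become NA (R also emits a coercion warning,
# suppressed here to keep the output quiet).
vals <- suppressWarnings(as.numeric(temp))

# Positions of the missing values in vals ...
bad <- which(is.na(vals))
bad

# ... and the original strings at those same positions in temp.
temp[bad]
```

Indexing the original strings with the positions of the NAs shows exactly which raw entries failed to convert, without scrolling through the whole column.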

So this, unfortunately, is a status report or message from the sensor that’s actually delivering this data. So this is not a CSV file. Well, the file is close enough, because this is in CSV form, but this value is messing up the entire column. So what do we do? Remember, this was done by hand; this was edited by hand. 16,000 lines: somebody went through, and they missed... oh, they only missed 16.
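One way to get that kind of count without reading the file by eye, sketched on invented data (the vector and the message text below are hypothetical):

```r
# Hypothetical stand-in for the velocity strings; the real column has
# about 16,000 entries.
temp <- c("1.5", "sensor status", "2.25", NA, "3.0", "sensor status")
vals <- suppressWarnings(as.numeric(temp))

# Entries that were present as text but failed to convert, as opposed to
# entries that were already missing in the input.
failed <- is.na(vals) & !is.na(temp)

length(temp)          # total number of entries
n_bad <- sum(failed)  # number of stray non-numeric entries
n_bad

# Which strings caused the failures, and how often each occurs.
table(temp[failed])
```

Separating conversion failures from values that were already missing keeps genuine gaps in the data from being counted as leftover status messages.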

How do I delete them? Okay, let’s suppose there’s a hundred, a thousand of them. We want to delete them, we want to ignore them, but I don’t want to do it by hand. And again, you know, do it whatever way makes sense. But I guarantee you that this person may have gone through and actually removed a thousand of them in this 60 because you.