mari681
2012-Mar-14 17:49 UTC
[R] check for data in a data.frame and return correspondent number
Dear R-ers,

still the newbie, with a question about whether the elements of a vector appear in a data.frame.

I have a data.frame (MyData) with 3 columns, which looks like this:

    V1     V4       redNew
    red-j  10.5032  appearance blood-n
    red-j   9.3749  appearance ground-n
    red-j  10.2167  appearance sea-n
    red-j  10.8200  appearance sky-n
    red-j   9.2831  area chicken-n
    red-j   8.2838  area color-n

and a MyVector which contains (among other things) the values in the 3rd column:

    " appearance blood-n"
    " appearance ground-n"
    "appearance sea-n"
    "as_adj_as fire-n"
    "as_adj_as carrot-n"
    "appearance sky-n"
    " area chicken-n"
    "area color-n"

I would like to get a data.frame of 2 columns where the first column holds all of MyVector, and the second column holds either the corresponding number found in MyData (shown in column 2) or a "0" if the entry is not found.

I've tried some options, among them a loop:

    out<-for(x in MyVector) if (x %in% MyData) print (MyData[,2])

but it obviously doesn't work.
How can I select the corresponding element in column 2 for each x found in column 3?

Suggestions in general?
Thank you for your consideration!

Have a nice day,
Marianna

--
View this message in context: http://r.789695.n4.nabble.com/check-for-data-in-a-data-frame-and-return-correspondent-number-tp4472634p4472634.html
Sent from the R help mailing list archive at Nabble.com.
Jan van der Laan
2012-Mar-14 20:34 UTC
[R] check for data in a data.frame and return correspondent number
Marianna,

You can use merge for that (or match). Using merge:

    MyData <- data.frame(
      V1=c("red-j", "red-j", "red-j", "red-j", "red-j", "red-j"),
      V4=c(10.5032, 9.3749, 10.2167, 10.8200, 9.2831, 8.2838),
      redNew=c("appearance blood-n", "appearance ground-n", "appearance sea-n",
        "appearance sky-n", "area chicken-n", "area color-n")
    )

    MyVector <- data.frame(
      V1 = c("appearance blood-n", "appearance ground-n", "appearance sea-n",
        "as_adj_as fire-n", "as_adj_as carrot-n", "appearance sky-n",
        "area chicken-n", "area color-n")
    )

    # all.x=TRUE keeps every row of MyVector; rows without a match get NA in V4
    merge(MyVector, MyData[, c("V4", "redNew")], by.x="V1", by.y="redNew",
      all.x=TRUE)

Btw, I saw leading spaces in some of your strings (I have removed these in the example above). Be aware that the character string " appearance ground-n" is not equal to "appearance ground-n".

HTH
Jan
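The reply mentions match as an alternative without showing it. What follows is only a minimal sketch of how that might look, not necessarily what Jan had in mind. It assumes MyVector is a plain character vector as in the original question (stray leading spaces included), uses trimws() from base R to remove them, and fills unmatched entries with 0 as requested. The names key, idx, result, term and value are made up for illustration.

```r
# Example data as in the reply above (strings in MyData already trimmed).
MyData <- data.frame(
  V1 = rep("red-j", 6),
  V4 = c(10.5032, 9.3749, 10.2167, 10.8200, 9.2831, 8.2838),
  redNew = c("appearance blood-n", "appearance ground-n", "appearance sea-n",
             "appearance sky-n", "area chicken-n", "area color-n"),
  stringsAsFactors = FALSE
)

# MyVector as originally posted, including the stray leading spaces.
MyVector <- c(" appearance blood-n", " appearance ground-n", "appearance sea-n",
              "as_adj_as fire-n", "as_adj_as carrot-n", "appearance sky-n",
              " area chicken-n", "area color-n")

# Remove leading/trailing whitespace so " area chicken-n" can match.
key <- trimws(MyVector)

# For each key, match() gives its row position in MyData$redNew, or NA if absent.
idx <- match(key, MyData$redNew)

# Look up V4 by position; unmatched entries come back as NA and are set to 0.
result <- data.frame(term = key, value = MyData$V4[idx],
                     stringsAsFactors = FALSE)
result$value[is.na(idx)] <- 0

result
```

Unlike merge, which sorts the result by the key column by default, this keeps the rows in the order of MyVector. Note that a 0 filled in this way cannot be told apart from a genuine value of 0 in V4, so leaving the NA in place may be the safer choice if zeros can occur in the data.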