abo dalash
2017-Apr-29 15:13 UTC
[R] Finding nrows with specific values & converting a matrix into a table
Hi all,

I'm trying to count the rows that contain two specific values. I tried which(mydata == 566,235), but this returns logical values for all rows, and any T in a row indicates that one of these values is present. What I need is only the number of rows in my data set that contain both values, treating the two values as one pair. For example:

1  123  566  235
2  443   54  566
3  566   44  235

Here the number of rows with the values 566 & 235 is 2, namely rows 1 and 3. Row 2 has only 566, so it should not be included in the count.

I also have a large matrix that I want to convert into a table so I can easily identify the combinations with the highest frequencies. The matrix looks like this:

   x   y   z
x  0   5  67
y  na  0  23
z  na  na  0

and I would like to convert it into a table sorted with the highest values first, like this:

x z 67
y z 23
x y 5
x x 0
y y 0
z z 0
y x na
z x na
z y na

Is there a simple function to perform this conversion, with some explanation of the syntax?

Regards

[[alternative HTML version deleted]]
David L Carlson
2017-Apr-29 20:38 UTC
[R] Finding nrows with specific values & converting a matrix into a table
First. Do not use HTML messages, only plain text.
Second. Provide a small example data set, preferably using dput(). Just printing your data can hide important information.
Third. Read the documentation.

Your first example does not return a logical vector at all:

> dput(mydata)
structure(list(Col1 = c(123L, 443L, 566L), Col2 = c(566L, 54L, 44L), Col3 = c(235L, 566L, 235L)), .Names = c("Col1", "Col2", "Col3"), class = "data.frame", row.names = c(NA, -3L))
> which(mydata == 566,235)
     row col
[1,]   3   1
[2,]   1   2
[3,]   2   3

It locates cells with 566, but not 235, which is not a surprise because you did not provide a valid logical expression to which(). There are a number of ways to get what you want, but since you want to process rows, apply() is straightforward:

> Val566 <- apply(mydata, 1, function(x) any(x == 566))
> Val566
[1] TRUE TRUE TRUE
> Val235 <- apply(mydata, 1, function(x) any(x == 235))
> Val235
[1]  TRUE FALSE  TRUE
> which(Val235 & Val566)
[1] 1 3

You should read the manual pages on any(), apply(), dput() and which() and logical expressions:

> ?apply
> ?any
> ?dput
> ?which
> ?Comparison # ?"==" will also get you there.

For the second question, assuming you are beginning with a table object as R defines that term and not a matrix (since all tables are matrices, but not all matrices are tables):

> dput(moredata)
structure(c(0L, NA, NA, 5L, 0L, NA, 67L, 23L, 0L), .Dim = c(3L, 3L), .Dimnames = list(c("x", "y", "z"), c("x", "y", "z")), class = "table")
> moredata
   x  y  z
x  0  5 67
y NA  0 23
z NA NA  0

Note that your example uses na rather than NA. R is case sensitive, so na is just an ordinary character string while NA is a missing value indicator.
This is one of the reasons that dput() is important.

> moredata.df <- as.data.frame(moredata)
> moredata.df
  Var1 Var2 Freq
1    x    x    0
2    y    x   NA
3    z    x   NA
4    x    y    5
5    y    y    0
6    z    y   NA
7    x    z   67
8    y    z   23
9    z    z    0
> moredata.df[order(moredata.df$Freq, decreasing=TRUE), ]
  Var1 Var2 Freq
7    x    z   67
8    y    z   23
4    x    y    5
1    x    x    0
5    y    y    0
9    z    z    0
2    y    x   NA
3    z    x   NA
6    z    y   NA

For this you should read the following manual pages:

> ?as.data.frame
> ?order
> ?Extract

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
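A minimal sketch of the row-counting step discussed in this reply, generalised to any set of values. In the original call, which() takes arr.ind as its second argument, so 235 is read as that flag rather than as a second value to match. The helper name rows_with_all is hypothetical and not from the thread; it assumes the data can be coerced to a matrix with as.matrix(), and mydata is rebuilt from the dput() output quoted above.

```r
# Example data, rebuilt from the dput() output quoted in the thread.
mydata <- data.frame(Col1 = c(123L, 443L, 566L),
                     Col2 = c(566L, 54L, 44L),
                     Col3 = c(235L, 566L, 235L))

# Hypothetical helper (not from the thread): returns TRUE for each row
# that contains every value listed in `targets`.
rows_with_all <- function(data, targets) {
  m <- as.matrix(data)
  # One logical vector per target: does the row contain that value?
  per_value <- lapply(targets, function(v) rowSums(m == v, na.rm = TRUE) > 0)
  # A row qualifies only if it contains all targets.
  Reduce(`&`, per_value)
}

both <- rows_with_all(mydata, c(566, 235))
matching_rows <- unname(which(both))  # row numbers that contain both values
n_matching <- sum(both)               # how many such rows there are
```

On the three-row example, matching_rows should hold rows 1 and 3 and n_matching should be 2, in agreement with which(Val235 & Val566) above. Using rowSums() on the comparison matrix avoids a call to apply() per value, which may matter on a large data set.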
abo dalash
2017-Apr-30 15:09 UTC
[R] Finding nrows with specific values & converting a matrix into a table
Dear David,

Many thanks for this detailed answer. Your guidance regarding the first task has resolved my issue, and I now understand how to perform this type of analysis. I have saved your learning tips in my script.

Regarding the matrix-to-table conversion, could you please clarify this further? I applied as.data.frame(), but it returned the same matrix without converting it into a list-style table. I'm not sure what the problem is in my code: mymatrix <- as.data.frame(mymatrix).

Many thanks for your support.

Regards
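A possible explanation for the behaviour described in this follow-up, offered as a sketch rather than a diagnosis of the poster's actual data: as.data.frame() applied to a plain matrix keeps the wide layout (one column per matrix column), whereas the long Var1/Var2/Freq layout shown earlier in the thread comes from the method for table objects. Converting the matrix with as.table() first should give the long form. The object mymatrix below is a hypothetical stand-in built from the example values in the thread.

```r
# Hypothetical matrix mirroring the example in the thread (NA, not "na").
mymatrix <- matrix(c(0L, NA, NA, 5L, 0L, NA, 67L, 23L, 0L), nrow = 3,
                   dimnames = list(c("x", "y", "z"), c("x", "y", "z")))

# On a plain matrix, as.data.frame() keeps the wide 3 x 3 layout.
wide <- as.data.frame(mymatrix)

# Converting to a table first gives one row per cell: Var1, Var2, Freq.
long <- as.data.frame(as.table(mymatrix))

# Sort with the highest values first; order() puts NA last by default.
long_sorted <- long[order(long$Freq, decreasing = TRUE), ]
```

Here wide has 3 rows and 3 columns, while long has 9 rows, and long_sorted starts with the x/z pair (67) and ends with the three NA cells. Checking class(mymatrix) before converting shows whether the object is a plain matrix or already a table.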