Ashim Kapoor
2017-Jun-12 05:23 UTC
[R] Keep only those values in a row in a data frame which occur only once.
Dear All,

I have a file data.txt as follows:

Name_1,A,B,C
Name_2,E,F
Name_3,I,J,I,K,L,M

I will read this with:

my_data <- read.csv("data.txt", header=FALSE, col.names=paste0("V", seq(1:10)), fill=TRUE)

Then the data frame will have 10 columns. I am assuming that each row in data.txt will have at most 10 entries.

Note: each row in data.txt has a different number of fields, but after reading, each row will have 10 columns (some of them trailing blanks).

My query is: how can I keep only the unique elements in each row? For example, I want row 3 to be

Name_3,I,J,K,L,M

Please note I don't want the 2nd I to appear.

How can I do this?

Best Regards,
Ashim
Jim Lemon
2017-Jun-12 07:02 UTC
[R] Keep only those values in a row in a data frame which occur only once.
Hi Ashim,

One way is this, assuming that your data frame is named akdf:

akdf<-t(apply(akdf,1,function(x) return(unique(x)[1:length(x)])))

If you want factors instead of strings, more processing will be required.

Jim
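For reference, here is a small self-contained run of the one-liner above on the sample rows from the question. It is only a sketch: the data is supplied inline through read.csv(text = ...) instead of data.txt, and stringsAsFactors = FALSE is an added assumption so that the values are strings regardless of R version. Indexing unique(x) past its end is what supplies the NA padding.

```r
# Sample rows from the question, supplied inline
# (assumption: this stands in for data.txt)
txt <- "Name_1,A,B,C
Name_2,E,F
Name_3,I,J,I,K,L,M"

akdf <- read.csv(text = txt, header = FALSE,
                 col.names = paste0("V", 1:10), fill = TRUE,
                 stringsAsFactors = FALSE)

# Keep the first occurrence of each value in a row; indexing beyond
# the end of unique(x) returns NA, which pads the row back to full width
akdf <- t(apply(akdf, 1, function(x) unique(x)[1:length(x)]))

# Third row: the second "I" is gone
akdf[3, 1:6]
```

Note that the result is a character matrix, not a data frame. Any padding already present in a row (empty strings or NA produced by fill=TRUE) is treated by unique() like any other value, so one of each may remain ahead of the trailing NA entries.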
S Ellison
2017-Jun-12 13:54 UTC
[R] Keep only those values in a row in a data frame which occur only once.
> I have a file data.txt as follows:
>
> Name_1,A,B,C
> Name_2,E,F
> Name_3,I,J,I,K,L,M
>
> My query is how can I keep only the unique elements in each row? For
> example: I want the row 3 to be Name_3,I,J,K,L,M
>
> Please note I don't want the 2nd I to appear.
>
> How can I do this?

Use unique() on each row and pad with NA?

Example:

uniq10 <- function(x, L=10) {
    u <- unique(x)
    c(u, rep(NA, L-length(u)) )
}

as.data.frame( t( apply(tmp, 1, uniq10) ) )

assuming tmp is the name of your initial data frame.

S Ellison
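One caveat when padding after unique(): depending on how the file was read, the shorter rows may already contain padding (empty strings or NA), and unique() keeps one copy of each. The sketch below is a variant of uniq10 that drops such padding before de-duplicating. The name uniq_pad, the inline sample data and stringsAsFactors = FALSE are illustrative assumptions, not part of the reply above.

```r
# Hypothetical variant of uniq10: drop existing padding first,
# then de-duplicate and pad back to the original width with NA
uniq_pad <- function(x, L = length(x)) {
  x <- x[!is.na(x) & nzchar(x)]   # remove NA and "" padding
  u <- unique(x)                  # first occurrence of each value, in order
  c(u, rep(NA_character_, L - length(u)))
}

# Sample rows from the question, supplied inline
# (assumption: this stands in for data.txt)
txt <- "Name_1,A,B,C
Name_2,E,F
Name_3,I,J,I,K,L,M"

tmp <- read.csv(text = txt, header = FALSE,
                col.names = paste0("V", 1:10), fill = TRUE,
                stringsAsFactors = FALSE)

res <- as.data.frame(t(apply(tmp, 1, uniq_pad)), stringsAsFactors = FALSE)

# Third row: Name_3, I, J, K, L, M followed by NA padding
res[3, ]
```

Defaulting L to length(x) keeps the output as wide as the input, so the function does not depend on the 10-column assumption.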