thr3ads.net - similar to: "Deleting rows based on duplicate entries in one columns in a data matrix"

Displaying 20 results from an estimated 10000 matches similar to: "Deleting rows based on duplicate entries in one columns in a data matrix"

Using changing names in loop in R

2010 Nov 06

Hello everybody, I have usually solved this problem by repeating lines of code instead of using a loop, but it's such a waste of time that I thought I should really learn how to do it with loops. What I want to do: say I have several data files that differ only in a number, e.g. data points (or vectors, or matrices...) Data_1, Data_2, Data_3,... and I want to manipulate them, e.g. a simple sum of

Deleting rows and columns containing NA's and "" only

2012 Feb 13

Hello, I use read.xls from the gdata package to read in xlsx files. Sometimes these data.frames contain NA columns and rows only. I know how to get rid of those ones but here is the R output of a test data set read in with read.xls > t1 A B X D X.1 X.2 1 test 1 NA NA 2 <NA> asd NA

importing text file with duplicate rows / indexing rows and columns

2004 May 16

Could somebody advise me about importing a txt file as a frame? I am using the command: test <- read.delim ("~/docs/perl/expr_ctx.txt2", header=T, sep = "\t", row.names = 1) This gives me an error because there are duplicate rows. In the txt file, the columns are unique subjects and the rows are variables, so I had planned to transform the file after importing. The first

Unique command not deleting all duplicate rows

2009 Aug 24

Hello everyone, when I run the "unique" command on my data frame, it deletes the majority of duplicate rows, but not all of them. Here is a sample of my data. How do I get it to delete all the rows?
6 -115.38 32.894 195 162.94 D 8419 D
7 -115.432 32.864 115 208.91 D 8419 D
8 -115.447 32.773 1170 264.57 D 8419 D
9 -115.447 32.773 1170 264.57 D 8419 D
10 -115.447 32.773 1170

Deleting duplicate rows in a matrix at random

2010 Mar 24

Hello, I am relatively new to R, and I've run into a problem formatting my data for input into the package RankAggreg. I have a matrix of gene titles and P-values (weights) in two columns:
KCTD12 4.06904E-22
UNC93A 9.91852E-22
CDKN3 1.24695E-21
CLEC2B 4.71759E-21
DAB2 1.12062E-20
HSPB1 1.23125E-20
...
The data contains many, many duplicate gene titles, and I need to remove all but one of

Deleting Rows/Columns

2005 Nov 07

Sorry to bother the group, but I am wondering whether there are official ways to delete a row or column, i.e., functions for dataTable manipulation? For row operations I use subset(), but what about columns? Any advice is welcome, and I would be more than grateful if somebody could summarize this issue. Xiaofan Li

R: deleting rows

2005 Sep 15

Hi all, hopefully someone can help. Assume that I imported the following data into R (say the data frame is called a):
x1 x2 x3
1 NA 3
1 2 NA
1 2 3
3 NA 6
4 5 9
7 5 6
7 8 9
NA 7 9
How can I construct a new data frame that contains only those rows that do not contain NAs? Is there a quick way? I.e.:
x1 x2 x3
1 2 3
4 5 9
7 5 6
7 8 9
In this example we can simply use
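One common way to approach the question above, sketched in base R. The data frame a is rebuilt here from the values in the post; the name a_complete is made up for illustration, and this is not necessarily the method the truncated post goes on to mention.

```r
# Rebuild the example data frame from the post
a <- data.frame(x1 = c(1, 1, 1, 3, 4, 7, 7, NA),
                x2 = c(NA, 2, 2, NA, 5, 5, 8, 7),
                x3 = c(3, NA, 3, 6, 9, 6, 9, 9))

# complete.cases() returns TRUE for rows that contain no NA in any column
a_complete <- a[complete.cases(a), ]
print(a_complete)
```

na.omit(a) selects the same rows; complete.cases() is used here because the logical index can be reused or combined with other conditions.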

how to simplify a data.frame and add the counts of duplicate rows as a new column

2011 Mar 02

Hello List, I would like to simplify a data.frame like this:
columnA columnB
user10 proj12
user10 proj19
user10 proj12
into something like:
columnA columnB columnC
user10 proj12 2
user10 proj19 1
I know unique() can simplify the data.frame, but how do I count and store the duplicates? Thanks in advance for any help. Best regards, Simone
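A minimal sketch of one possible answer in base R, assuming the data is a plain data.frame with the two columns shown. The names df and counts are hypothetical; columnC is taken from the post.

```r
# Example data mirroring the post
df <- data.frame(columnA = c("user10", "user10", "user10"),
                 columnB = c("proj12", "proj19", "proj12"))

# Add a helper column of ones, then sum it within each unique
# combination of columnA and columnB
df$columnC <- 1
counts <- aggregate(columnC ~ columnA + columnB, data = df, FUN = sum)
print(counts)
```

The helper column keeps the sketch within base R; as.data.frame(table(df)) would also count combinations, but it additionally lists combinations that never occur.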

Deleting multiple rows based on a variable

2008 Feb 20

Hello, I have a dataset which consists of 9 columns (variables) and 35 rows (observations). I am doing a simple linear regression of one variable on the other. There are some observations that are outliers and I would like to remove them based on another variable (it's a unique, numeric variable). How do you tell R to remove multiple rows (observations) based on a variable value?

Deleting multiple rows from a data matrix based on exp value

2011 Nov 20

Dear List, I have a data matrix that consists of ~4500 rows and 25 columns (i.e. an exprSet object that I converted via the 'exprs' function into a data matrix). Now I want to remove the rows where all exp. values in that particular row are below or equal to a specific cut-off value (e.g. 1.11). I have tried using several commands to address this issue: >Matrix[rowSums(Matrix

Deleting rows with NA from isolated column in matrix

2010 May 14

Hi all, I'm relatively new to R and have a data management problem. I am importing a data matrix with some columns that have missing values. I am trying to figure out how to delete rows with NA for data FOR JUST ONE SPECIFIED column. For instance, with the example matrix: x<-matrix(nrow=5,ncol=3) x[,]<-1 x[5,1]<-NA x[3,3]<-NA how do I tell R to delete any rows with an NA value
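A minimal sketch of one way to do this in base R, reusing the example matrix from the post. The post is cut off before it names the column of interest, so the choice of column 1 (the variable col) is purely an assumption for illustration, as is the name x_clean.

```r
# Example matrix from the post
x <- matrix(nrow = 5, ncol = 3)
x[, ] <- 1
x[5, 1] <- NA
x[3, 3] <- NA

# Assumption: column 1 is the column to check (not stated in the post)
col <- 1

# Keep only rows whose value in that column is not NA;
# drop = FALSE keeps the result a matrix even if one row remains
x_clean <- x[!is.na(x[, col]), , drop = FALSE]
print(x_clean)
```

Row 5 is dropped because of the NA in column 1, while row 3 is kept even though it has an NA in column 3.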

How to delete rows using conditions on all columns

2011 Oct 24

n <- 10
P1 <- runif(n)
P2 <- runif(n)
P3 <- P1 + P2 + runif(n)/100
P4 <- P1 + P2 + P3 + runif(n)/100
mydata <- data.frame(cbind(P1,P2,P3,P4))
mydata[1,1] <- 8
mydata[3,1] <- -5
mydata[2,3] <- -6
mydata[7,3] <- 7
f=function(z){quantile(z, c(0.01, 0.99)) }
temp1 <- lapply(mydata, f)
temp1
$P1
1% 99%
-4.542391 7.354209
$P2
1% 99%

delete selecting rows and columns

2007 Feb 28

Hi, I'm working with a big square matrix (15k x 15k) and I have some trouble. I want to delete selected rows and columns. I'm using something like this:
> sel_r=c(15,34,384,985,4302,6213)
> sel_c=c(3,151,324,3384,7985,14302)
> matrix=matrix[-sel_r,-sel_c]
but it runs very slowly. Does anybody know how to do it in a faster way? Thanks

Deleting columns based on the number of non-blank observations

2009 Jan 18

Hello, I have a dataset (named "x") with many (966) columns. What I would like to do is delete any columns that do not have at least 375 non-blank observations (i.e., the cells have some value in them besides NA). How can I do this? I have come up with the following code to _count_ the non-blank observations in each column, but how would I adapt this code to _delete_ columns from the

Delete rows in the data frame by limiting values in two columns

2010 Jun 25

Hi, folks, Finally Friday~~ Here comes the question:
x=c('germany','poor italy','usa','england','poor italy','japan')
y=c('Spain','germany','usa','brazil','england','chile')
s=1:6
z=3:8
test=data.frame(x,y,s,z)
#Now I only concern the countries ('germany','england','brazil').

Delete rows with duplicate field...

2010 May 03

As an R noob I am having another problem: I have a big data frame where each row corresponds to one entry and each column is a field. For instance, I have the columns ID and time and many more. I'd like to get a data frame where each ID is included just once (some users with that ID might have several entries, but I'd like to keep only one). When I use unique I only get a list of the levels (or
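A minimal sketch of one way to keep a single row per ID in base R. The data frame below is invented for illustration (only the column names ID and time come from the post), and keeping the first occurrence of each ID is an assumption, since the post does not say which entry should survive.

```r
# Hypothetical example data; only the column names come from the post
df <- data.frame(ID   = c(1, 2, 1, 3, 2),
                 time = c(10, 20, 30, 40, 50))

# duplicated() flags every repeat of an ID after its first occurrence,
# so negating it keeps the first row seen for each ID
df_unique <- df[!duplicated(df$ID), ]
print(df_unique)
```

To keep the last entry per ID instead, duplicated(df$ID, fromLast = TRUE) can be used in the same way.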

How to delete rows with specific values on all columns (variables)?

2011 Feb 21

Hi, I need to filter my data. I think it's easy, but I'm stuck, so I'd appreciate some help: I have a data frame with 14 variables and 6 million rows. About half of these rows have a value of "0" in 12 variables (the other two variables always have values). How can I delete the rows in which all 12 variables have the value "0"? Example (from my data, variable 14 is

Replace rows in dataframe based on values in other columns

2013 May 09

Hi,
dat1<- read.table(text="
Restaurant owner purchase_date
23 Chuck 3/4/2011
23 Chuck 3/4/2011
23 Chuck 3/4/2011
23 Chuck 3/4/2011
23 Bob 1/1/2013
23 Bob 1/1/2013
23 Bob 1/1/2013
15 Hazel 4/11/2010
15 Hazel 4/11/2010
15 Hazel 4/11/2010

[help] deleting rows which contain more than 2 NAs or zeros

2010 Mar 08

Hello. I have just started learning how to work with R, but I have encountered a problem. I can't figure out how to remove the rows which contain two (2) or more NAs or zeros (0). I would be glad if you could help me, because I have only some basic knowledge so far and haven't even mastered all the basics yet. Thanks in advance.

Addition operation based on specific columns and rows of two data frames

2007 Oct 12

#Hello,
# I have a question about the addition of values in specific columns and rows of a Data frame.
# Below I have created two data frames, X.df and "Y.df".
## creation of X.df data frame
X<- matrix(0,16,3)
X.df<-data.frame(X)
X.df[,1] <- c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4)
X.df[,2] <- c(1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4)
names(X.df)[1]<-"L(A)a(i)"
