Dear all,
Probably a very basic question but I need some help.
I have a data frame (made by read.table from a text file) of microarray data, of
which the first column is a factor and the rest of the columns are numeric.
The factor column contains chromosome names, so values 1 through 22 plus X, Y
and XY. The numeric columns contain positions or intensity measurements.
What I need to do is change the X's in the first column to a value of 23.
This is what I thought I would do:
BAF_temp <- read.table("BAF_all.txt", sep="\t", header=TRUE)  # to read in the table
BAF_temp[,1][BAF_temp[,1]=="X"] <- 23
# "in rows where the first column of BAF_temp is X, change the first column
# of BAF_temp to 23"
However, with this last line I get an error: "Invalid factor level, NAs
generated in `[<-.factor`(`*tmp*`, BAF_temp[,1]=="X", value=23)"
(I tested if my syntax for selecting the rows of chromosome X was correct by
trying
BAF_X <- BAF_temp[BAF_temp[,1]=="X",]
which worked to give me a data frame with only the rows of the X chromosome.)
I then thought it might work better if I changed the data frame to a matrix.
When I change the BAF_temp data frame into a matrix (by BAF_matrix <-
as.matrix(BAF_temp)), then the command I used above:
BAF_temp[,1][BAF_temp[,1]=="X"] <- 23
works fine and the end result is as I meant it to be, with all the X's
changed into 23's.
However, by using as.matrix all columns are changed to 'character'
including the numeric measurements (I understand this is because one of the
columns of the data frame is 'factor')
I would like some help on what is the best option to solve this. I have thought
of a few options myself and would like your comment/help:
1. Is there another syntax I can use on the data frame to change the X's to
23's, so I don't have to change the data frame into a matrix first?
2. I could change the data frame into a matrix and run the syntax as I
described, resulting in all columns becoming 'character'; is there then
an easy way to turn the columns with measurements (columns 2 and further) back
into 'numeric' while leaving the first column with the chromosome
numbers as 'character'?
3. I thought of using data.matrix(BAF_temp) and making use of the fact that the
first column of factors would be changed to the underlying numbers (because X,
being the 23rd level in the list, would automatically be changed to 23). However,
because the levels (chromosome names) of the factor column are ordered as
"1", "10", "11", "12", ..., "19", "2", "20", "21", "3", "4", etc. (I see this
when using str(BAF_temp)), this results in chromosome 10 being changed into a
value of 2, chromosome 11 into 3, chromosome 2 into 12, etc. For info: the
chromosome names in the text file that is imported are ordered just 1, 2, 3, etc.
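
The level ordering described in option 3 can be reproduced on a toy factor. The following is only a sketch: the vector `chr` and the desired order in `wanted` are assumptions for illustration, not the contents of BAF_all.txt. It shows that the integer codes follow the text-sorted levels, and that supplying the levels explicitly makes "X" the 23rd level.

```r
# Toy chromosome column (hypothetical values standing in for column 1).
chr <- factor(c("1", "2", "10", "11", "X", "Y", "XY"))
levels(chr)        # levels are sorted as text, so "10" comes before "2"
as.integer(chr)    # the integer codes follow that text ordering

# Assumed desired order: 1..22, then X, Y, XY.
wanted <- c(1:22, "X", "Y", "XY")
chr2 <- factor(as.character(chr), levels = wanted)
codes <- as.integer(chr2)   # "X" is now the 23rd level, so its code is 23
codes
```

With the levels fixed this way, data.matrix() on a data frame holding `chr2` should return the same codes for that column, though Y and XY would then become 24 and 25, which may or may not be wanted.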
If anyone has some tips for me I would greatly appreciate it.
Best wishes,
Marije
The contents of this message are confidential and only intended for the eyes of
the addressee(s). Others than the addressee(s) are not allowed to use this
message, to make it public or to distribute or multiply this message in any way.
The UMCG cannot be held responsible for incomplete reception or delay of this
transferred message.
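x <- c(1:25)
x[23] <- "X"                        # assigning a string turns x into a character vector
x
x.new <- ifelse(x == "X", 23, x)    # replace "X" with 23; the result is still character
x.new <- as.numeric(x.new)          # convert back to numeric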
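
The same idea can be carried over to a factor column in a data frame by converting the column to character first. This is a minimal sketch; the data frame `df` and its column names `Chr` and `BAF` are made up for illustration and stand in for BAF_temp.

```r
# Hypothetical stand-in for BAF_temp: a factor column plus one numeric column.
df <- data.frame(Chr = factor(c("1", "2", "X", "Y", "X")),
                 BAF = c(0.1, 0.5, 0.9, 0.4, 0.6))

chr <- as.character(df$Chr)   # drop the factor, keep the labels as text
chr[chr == "X"] <- "23"       # plain character replacement, no level check involved
df$Chr <- chr                 # first column is now character; BAF stays numeric
str(df)
```

Calling as.numeric() on the column afterwards would turn the remaining "Y" and "XY" entries into NA (with a warning), so those would need their own replacement values first.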
Best,
Daniel
-------------------------
cuncta stricte discussurus
-------------------------
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Booman, M
Sent: Wednesday, July 09, 2008 5:14 AM
To: r-help at r-project.org
Subject: [R] replacing value in column of data frame
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Try this; what you want to do is to change the 'levels' of the factor.

> x <- factor(c(1:10,'x','y','xy'))
> str(x)
 Factor w/ 13 levels "1","10","2","3",..: 1 3 4 5 6 7 8 9 10 2 ...
> x
 [1] 1 2 3 4 5 6 7 8 9 10 x y xy
Levels: 1 10 2 3 4 5 6 7 8 9 x xy y
> # your error
> x[x == 'x'] <- 23
Warning message:
In `[<-.factor`(`*tmp*`, x == "x", value = 23) :
  invalid factor level, NAs generated
> x
 [1] 1 2 3 4 5 6 7 8 9 10 <NA> y xy
Levels: 1 10 2 3 4 5 6 7 8 9 x xy y
>
> # work with the levels which is what you want to change
> x <- factor(c(1:10,'x','y','xy'))
> levels(x)[levels(x) == 'x'] <- '23'
> x
 [1] 1 2 3 4 5 6 7 8 9 10 23 y xy
Levels: 1 10 2 3 4 5 6 7 8 9 23 xy y
>

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
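
Applied to a data frame rather than a bare vector, the level-renaming approach might look like the sketch below. The data frame `df` and its column names `Chr` and `Pos` are hypothetical and stand in for BAF_temp; only the first column is touched.

```r
# Hypothetical stand-in for BAF_temp.
df <- data.frame(Chr = factor(c("1", "10", "2", "X", "Y", "XY", "X")),
                 Pos = c(100, 200, 300, 400, 500, 600, 700))

# Rename the level itself; every row carrying that level changes with it.
levels(df$Chr)[levels(df$Chr) == "X"] <- "23"

df$Chr           # still a factor, with "23" in place of "X"
sapply(df, class)  # Pos is still numeric, no matrix conversion needed
```

Indexing the levels by `levels(df$Chr) == "X"` selects the level by its label, so it does not depend on where the "X" rows happen to sit in the data.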