thr3ads.net - R help - [R] lapply getting names of the list [Dec 2010]


Sashi Challa

2010-Dec-09 17:44 UTC

[R] lapply getting names of the list

Hello All,

I have a toy dataframe like this. It has 8 columns separated by tab.

Name       SampleID  Al1  Al2  X       Y       R       Th
rs191191   A1        A    B    0.999   0.09    0.78    0.090
abc928291  A1        B    J    0.3838  0.3839  0.028   0.888
abcnab     A1        H    K    0.3939  0.939   0.3939  0.77
rx82922    B1        J    K    0.3838  0.393   0.393   0.00
rcn3939    B1        M    O    0.000   0.000   0.000   0.77
tcn39399   B1        P    I    0.393   0.393   0.393   0.56

Note that the SampleID is repeating. So I want to be able to split the dataset
based on the SampleID and write the split dataset of every SampleID into a
new file.
I tried split followed by lapply to do this.

infile <- read.csv("test.txt", sep="\t", as.is = TRUE, header = TRUE)
infile.split  <- split(infile, infile$SampleID)
names(infile.split[1])  ## outputs "A1"
## now A1, B1 are two lists in infile.split as I understand it. Correct me if I
## am wrong.

lapply(infile.split,function(x){
              ## here I expect to see A1 or B1, I didn't. I tried (names(x)[1])
              ## and that gave me "Name" and not A1 or B1.
              filename <- names(x)
              final_filename <- paste(filename,"toy_set.txt",sep="_")
              write.table(x, file = paste(path, final_filename,sep="/",
row.names=FALSE, quote=FALSE,sep="\t")
  } )

In lapply I wanted to give a unique filename to all the split Sample Ids, i.e.
name them here as A1_toy_set.txt, B1_toy_set.txt.
How do I get those names, i.e. A1, B1, to create a filename like above?
When I write each of the elements in the list obtained after split into a file,
the column names would have names like A1.Name, A1.SampleID, A1.Al1, ... Can I
get rid of "A1" in the column names within the lapply (other than reading in the
file again and changing the names)?

Thanks for your time,

Regards
Sashi



Joshua Wiley

2010-Dec-09 18:06 UTC


[R] lapply getting names of the list

Hi Sashi,

On Thu, Dec 9, 2010 at 9:44 AM, Sashi Challa <challa at ohsu.edu> wrote:
> Hello All,
>
> I have a toy dataframe like this. It has 8 columns separated by tab.
>
> Name       SampleID  Al1  Al2  X       Y       R       Th
> rs191191   A1        A    B    0.999   0.09    0.78    0.090
> abc928291  A1        B    J    0.3838  0.3839  0.028   0.888
> abcnab     A1        H    K    0.3939  0.939   0.3939  0.77
> rx82922    B1        J    K    0.3838  0.393   0.393   0.00
> rcn3939    B1        M    O    0.000   0.000   0.000   0.77
> tcn39399   B1        P    I    0.393   0.393   0.393   0.56
>
> Note that the SampleID is repeating. So I want to be able to split the
> dataset based on the SampleID and write the split dataset of every SampleID
> into a new file.
> I tried split followed by lapply to do this.
>
> infile <- read.csv("test.txt", sep="\t", as.is = TRUE, header = TRUE)
> infile.split  <- split(infile, infile$SampleID)
> names(infile.split[1])  ## outputs "A1"
Correct. infile.split[1] is a one-element list, so names() returns just its
name, "A1"; names(infile.split) would return the names of both data frames.
> ## now A1, B1 are two lists in infile.split as I understand it. Correct me
> if I am wrong.
It is a single, named list containing two data frames (A1 and B1).
Data frames are themselves built from lists, I think, so I suppose in a
way it contains two lists, but that is not really the point.
>
> lapply(infile.split,function(x){
>              filename <- names(x) #### here I expect to see A1 or B1, I
> didn't, I tried (names(x)[1]) and that gave me "Name" and not A1 or B1.
By using lapply() on the actual object, your function receives each
element of the list in turn, without its name. That is:

infile.split[[1]]
infile.split[[2]]

trying names() on those:

names(infile.split[[1]])

should show what you are getting
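A minimal sketch of the difference, using a cut-down toy data frame (two of the columns from this thread, three rows; the object name `seen` is made up for illustration). The names of the list are the SampleIDs, while the names of each element are its column names, which is all the function inside lapply() gets to see:

```r
# Toy data mirroring two of the thread's columns
infile <- data.frame(
  Name     = c("rs191191", "abc928291", "rx82922"),
  SampleID = c("A1", "A1", "B1"),
  stringsAsFactors = FALSE
)
infile.split <- split(infile, infile$SampleID)

names(infile.split)        # names of the list elements: "A1" "B1"
names(infile.split[[1]])   # column names of the first data frame: "Name" "SampleID"

# Inside lapply(), FUN receives only the element itself,
# so names(x) gives the column names, not "A1" or "B1"
seen <- lapply(infile.split, function(x) names(x))
```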
>              final_filename <- paste(filename,"toy_set.txt",sep="_")
>              write.table(x, file = paste(path, final_filename,sep="/",
> row.names=FALSE, quote=FALSE,sep="\t")
FYI, I think you are missing a closing parenthesis in there somewhere
(it looks like the paste() call is never closed).
>  } )
>
> In lapply I wanted to give a unique filename to all the split Sample Ids,
> i.e. name them here as A1_toy_set.txt, B1_toy_set.txt.
> How do I get those names, i.e. A1, B1, to create a filename like above?
Try this:

## read your data from the clipboard (obviously you do not need to)
infile <- read.table("clipboard", header = TRUE)
split.infile <- split(infile, infile$SampleID) # split data by SampleID
path <- "~" # generic path

## rather than applying to the data itself, instead apply to the names
lapply(names(split.infile), function(x) {
  ## x is the SampleID ("A1", "B1"); use it both to look up the
  ## data frame and to build the file name
  write.table(x = split.infile[[x]],
    file = paste(path, paste(x, "toy_set.txt", sep = "_"), sep = "/"),
    row.names = FALSE, quote = FALSE, sep = "\t")
  cat("wrote ", x, fill = TRUE)
})

It will return a list of two NULLs (the return value of cat()), but that is
fine because the files should have been written.
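A possible variation, sketched on the assumption that you would rather not look each piece up by name: Map() walks the data frames and their names in parallel. The names `pieces`, `out_dir` and `written` are made up for illustration, and tempdir() stands in for `path` so the example can be run anywhere.

```r
# Toy data with part of the thread's column layout (three rows only)
infile <- data.frame(
  Name     = c("rs191191", "abc928291", "rx82922"),
  SampleID = c("A1", "A1", "B1"),
  X        = c(0.999, 0.3838, 0.3838),
  stringsAsFactors = FALSE
)
pieces  <- split(infile, infile$SampleID)
out_dir <- tempdir()  # assumption: stand-in for the thread's `path`

# Map() passes each data frame together with its name
written <- Map(function(piece, id) {
  out_file <- file.path(out_dir, paste(id, "toy_set.txt", sep = "_"))
  write.table(piece, file = out_file,
              row.names = FALSE, quote = FALSE, sep = "\t")
  out_file  # return the path instead of NULL
}, pieces, names(pieces))
```

Here `written` is a named list of file paths, so unlist(written) shows where each SampleID went.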
> When I write each of the elements in the list obtained after split into a
> file, the column names would have names like A1.Name, A1.SampleID, A1.Al1, ...
> Can I get rid of "A1" in the column names within the lapply (other than reading
> in the file again and changing the names)?
Can you report the results of str(yourdataframe)? I did not have
that issue when copying and pasting from your email and using the code
I showed above.
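One guess at where such prefixes could come from (the thread does not confirm that this is what happened): if the whole split list, rather than a single element, gets turned into one data frame, R prefixes each column with the list name. A small sketch with made-up object names `pieces` and `combined`:

```r
# Toy data: two rows per SampleID so the pieces can sit side by side
infile <- data.frame(
  Name     = c("rs191191", "abc928291", "rx82922", "rcn3939"),
  SampleID = c("A1", "A1", "B1", "B1"),
  stringsAsFactors = FALSE
)
pieces <- split(infile, infile$SampleID)

# Combining the whole list into one data frame prefixes the columns
combined <- do.call(data.frame, pieces)
names(combined)         # "A1.Name" "A1.SampleID" "B1.Name" "B1.SampleID"

# A single element keeps the plain column names
names(pieces[["A1"]])   # "Name" "SampleID"
```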

Cheers,

Josh
>
> Thanks for your time,
>
> Regards
> Sashi
>
>
>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

David Winsemius

2010-Dec-09 19:21 UTC


[R] lapply getting names of the list

On Dec 9, 2010, at 12:44 PM, Sashi Challa wrote:
> Hello All,
>
> I have a toy dataframe like this. It has 8 columns separated by tab.
>
> Name    SampleID        Al1     Al2     X       Y       R       Th
> rs191191        A1      A       B       0.999   0.09    0.78    0.090
> abc928291       A1      B       J       0.3838  0.3839  0.028   0.888
> abcnab  A1      H       K       0.3939  0.939   0.3939  0.77
> rx82922 B1      J       K       0.3838  0.393   0.393   0.00
> rcn3939 B1      M       O       0.000   0.000   0.000   0.77
> tcn39399        B1      P       I       0.393   0.393   0.393   0.56
>
> Note that the SampleID is repeating. So I want to be able to split
> the dataset based on the SampleID and write the split dataset of
> every SampleID into a new file.
> I tried split followed by lapply to do this.
>
> infile <- read.csv("test.txt", sep="\t", as.is = TRUE, header = TRUE)
> infile.split  <- split(infile, infile$SampleID)
> names(infile.split[1])  ## outputs "A1"
> ## now A1, B1 are two lists in infile.split as I understand it.
> Correct me if I am wrong.
>
> lapply(infile.split,function(x){
>              filename <- names(x) #### here I expect to see A1 or
> B1, I didn't, I tried (names(x)[1]) and that gave me "Name" and not
> A1 or B1.
>              final_filename <- paste(filename,"toy_set.txt",sep="_")
>              write.table(x, file = paste(path,
> final_filename,sep="/", row.names=FALSE, quote=FALSE,sep="\t")
>  } )
>
> In lapply I wanted to give a unique filename to all the split Sample  
> Ids, i.e. name them here as <dragged to the c() construct>.
> How do I get those names, i.e. A1, B1 to a create a filename like  
> above.
names(infile.split) <- c("A1_toy_set.txt", "B1_toy_set.txt")
> When I write each of the elements in the list obtained after split
> into a file,
How are you proposing to do this "writing"?
> the column names would have names like A1.Name, A1.SampleID, A1.Al1,
> ...
Are you sure? Why would you think that?

-- 
David.
> Can I get rid of "A1" in the column names within the lapply (other
> than reading in the file again and changing the names)?
>
> Thanks for your time,
>
> Regards
> Sashi
>
>
David Winsemius, MD
West Hartford, CT

Sashi Challa

2010-Dec-09 19:24 UTC


[R] lapply getting names of the list

Thanks a lot Joshua, that works perfectly.
It had not occurred to me to use lapply on the names instead of the data itself.
I no longer see the SampleID names in the column names.
Thanks for your time,

-Sashi


Jon Erik Ween

2010-Dec-16 14:43 UTC


[R] Optimal method to scan cells and set flag

Hi

I am still working on my large dataset (sample attached) that contains a series
of binary variables (flags, yes/no) regarding affected brain areas
("Lica", "LAtChor", "LA1", etc.).

I need to scan these columns: if the value is Y for any "Lxxx" column, set
"LesionSide" to "L"; if Y for any "Rxxx" column, set it to "R"; and set it to
"B" if both. There are >2500 records, so for-loops would be inefficient.
Any suggestions?
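One vectorised way this could be approached, as a sketch only. The column `Rica`, the object names, and the assumption that flags are coded exactly as "Y" are illustrative guesses, not taken from the attached data:

```r
# Toy data; Lica and LAtChor are from the post, Rica is a hypothetical
# right-side column, and "Y"/"N" coding is assumed
lesions <- data.frame(
  Lica    = c("Y", "N", "N", "Y"),
  LAtChor = c("N", "N", "N", "N"),
  Rica    = c("N", "Y", "N", "Y"),
  stringsAsFactors = FALSE
)

# Pick the flag columns by prefix; do this before creating LesionSide,
# since that name also starts with "L"
left_cols  <- grep("^L", names(lesions), value = TRUE)
right_cols <- grep("^R", names(lesions), value = TRUE)

# TRUE for rows with at least one "Y" on that side
has_left  <- rowSums(lesions[left_cols]  == "Y", na.rm = TRUE) > 0
has_right <- rowSums(lesions[right_cols] == "Y", na.rm = TRUE) > 0

# "B" if both sides, "L" or "R" if only one, NA if neither
lesions$LesionSide <- ifelse(has_left & has_right, "B",
                      ifelse(has_left, "L",
                      ifelse(has_right, "R", NA_character_)))
```

The comparison and rowSums() work on whole columns at once, so no explicit loop over the records is needed.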

Much obliged.

Best

Jon

Soli Deo Gloria

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: test.txt
URL:
<https://stat.ethz.ch/pipermail/r-help/attachments/20101216/a1ded1e6/attachment.txt>
-------------- next part --------------


Jon Erik Ween, MD, MS
Scientist, Kunin-Lunenfeld Applied Research Unit 
Director, Stroke Clinic, Brain Health Clinic, Baycrest Centre
Assistant Professor, Dept. of Medicine, Div. of Neurology
    University of Toronto Faculty of Medicine

Kimel Family Building, 6th Floor, Room 644 
Baycrest Centre
3560 Bathurst Street 
Toronto, Ontario M6A 2E1
Canada 

Phone: 416-785-2500 x3648
Fax: 416-785-2484
Email: jween at klaru-baycrest.on.ca





