Hi everybody!
To use some functions, I have to transform my dataset into a list, where
each element contains one group, and I have to prepare a list for each
variable I have (altogether I have 15 variables, and many entries per
factor level).
Here is some part of my dataset:
SPECSHOR BONE Asfc Smc epLsar
cotau tx 454.390369 29.261638 0.001136
cotau tx 117.445711 4.291884 0.00056
cotau tx 381.024682 15.313017 0.002324
cotau tx 159.081789 18.134533 0.000462
cotau tm 160.641503 6.411332 0.000571
cotau tm 79.238023 3.828254 0.001182
cotau tm 143.20655 11.921899 0.000192
cotau tm 115.476996 33.116386 0.000417
cotau tm 594.256234 72.538131 0.000477
eqgre tx 188.261324 8.279096 0.000777
eqgre tx 152.444216 2.596325 0.001022
eqgre tx 256.601507 8.279096 0.000566
eqgre tx 250.816445 18.134533 0.000535
eqgre tx 272.396711 24.492879 0.000585
eqgre tm 172.63264 4.291884 0.001781
eqgre tm 189.441097 14.425498 0.001347
eqgre tm 170.743788 13.564472 0.000602
eqgre tm 158.960849 10.385299 0.001189
eqgre tm 80.972408 3.828254 0.000644
gicam tx 294.494001 9.656738 0.000524
gicam tx 267.126765 19.128024 0.000647
gicam tx 81.888658 4.782006 0.000492
gicam tx 168.32908 12.729939 0.001097
gicam tx 123.296056 7.007427 0.000659
gicam tm 94.264887 18.134533 0.000752
gicam tm 54.317395 3.828254 0.00038
gicam tm 55.978883 17.167534 0.000141
gicam tm 279.597993 15.313017 0.000398
gicam tm 288.262556 18.134533 0.001043
What I do next is:
----
list_Asfc <- list()
list_Asfc[[1]] <- ssfamed[ssfamed$SPECSHOR=='cotau'&ssfamed$BONE=='tx', 3]
list_Asfc[[2]] <- ssfamed[ssfamed$SPECSHOR=='cotau'&ssfamed$BONE=='tm', 3]
----
And so on for each level of SPECSHOR and BONE
I'm stuck on 2 parts:
- in a loop or something similar, I would like the 1st element of the
list to be filled with the values of the 1st variable for the first
combination of factor levels (i.e. cotau + tx), then the 2nd element for
the 2nd combination (i.e. cotau + tm), and so on. As shown above, I know
how to do it if I enter the different levels manually, but I have no idea
which function I should use so that each combination of factor levels is
used. See what I mean?
- I would then like to run it in a loop or something for each variable.
This is not so complicated by itself, but I don't know how to give the
correct name to my list. I want the list containing the data for Asfc to
be named "list_Asfc".
Here is what I tried:
----
seq.num <- c(seq(3,5,1)) #the indexes of the variables
for(i in 1:length(seq.num)) {
  k <- seq.num[i]
  name.num <- names(ssfamed)[k]
  list <- list()
  list[[1]] <- ssfamed[ssfamed$SPECSHOR=='cotau'&ssfamed$BONE=='tx', k]
  list[[2]] <- ssfamed[ssfamed$SPECSHOR=='cotau'&ssfamed$BONE=='tm', k]
  #I have more and the 1st question should help me on that too
  names(list) <- c("cotau_tx", "cotau_tm")
}
----
After names(list) I need to insert something like: name_list <- list
But I don't know how to give it the correct name. How do we change the
name of an object? Or am I on the wrong path?
Thank you in advance for your help.
Ivan
PS: if necessary: under Windows XP, R 2.10.
One way is:
----
dataset = data.table(ssfamed)
dataset[, < whatever "some functions" are on Asfc, Smc, epLsar, etc >, by="SPECSHOR,BONE"]
----
Your SPECSHOR and BONE names will be in your result alongside the results of the <whatever ...>.
Or try package plyr, which does this sort of thing too. And sqldf may be better if you know SQL and prefer it. There are actually zillions of ways to do it: by(), the doBy package, etc.
If you get your code working the way it's constructed currently, it's going to be very slow, because of those "==". data.table doesn't do that and is pretty fast for this kind of thing. You might find that plyr is easier to use and more flexible though, if speed isn't an issue, depending on exactly what you want to do.
Whichever way you decide, consider voting on crantastic for the package you end up using. That may be a quick and easy way for you to help new R users in the future, and help us all by reducing the r-help traffic on the same subject over and over again. Note that plyr is in the 2nd spot on crantastic; it would have solved your problem without needing to write that code. If you check crantastic first and make sure you're aware of popular packages, you might avoid getting stuck in this way again. It only works if users contribute to it though.
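For readers who want to try the grouped approach without installing extra packages, here is a minimal base-R sketch. It uses aggregate(), which is not one of the functions named in the reply, and mean() only stands in for the unspecified "some functions". The data frame contains just the first two rows of four of the posted groups.

```r
# Excerpt of the posted data: first two rows of the cotau and eqgre groups.
ssfamed <- data.frame(
  SPECSHOR = rep(c("cotau", "eqgre"), each = 4),
  BONE     = rep(c("tx", "tx", "tm", "tm"), times = 2),
  Asfc     = c(454.390369, 117.445711, 160.641503, 79.238023,
               188.261324, 152.444216, 172.63264, 189.441097),
  Smc      = c(29.261638, 4.291884, 6.411332, 3.828254,
               8.279096, 2.596325, 4.291884, 14.425498),
  epLsar   = c(0.001136, 0.00056, 0.000571, 0.001182,
               0.000777, 0.001022, 0.001781, 0.001347)
)

# Apply one function to every variable within each SPECSHOR/BONE combination.
# mean() is a placeholder for whatever function is actually needed.
group_means <- aggregate(cbind(Asfc, Smc, epLsar) ~ SPECSHOR + BONE,
                         data = ssfamed, FUN = mean)
print(group_means)
```

The result is a data frame with one row per factor combination, so no intermediate lists are needed when the function returns a single value per group.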
Without reading all the details of your question, it looks like maybe split() is what you want.
----
split( dataset, paste(dataset$SPECSHOR,dataset$BONE) )
----
or
----
split( dataset[,3], paste(dataset$SPECSHOR,dataset$BONE) )
----
-Don
(In reply to Ivan Calandra's message of 1/21/10, 5:12 PM +0100.)
--
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062
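The split() suggestion can be extended to cover both parts of the original question. The following is only a sketch under a few assumptions: the data frame holds a small excerpt of the posted data, the "_" separator in the group labels is my choice (the reply uses paste() with its default space), and all_lists is a hypothetical name. Note that split() orders the elements alphabetically by label, not by their order in the data.

```r
# Excerpt of the posted data: first two rows of the cotau and eqgre groups.
ssfamed <- data.frame(
  SPECSHOR = rep(c("cotau", "eqgre"), each = 4),
  BONE     = rep(c("tx", "tx", "tm", "tm"), times = 2),
  Asfc     = c(454.390369, 117.445711, 160.641503, 79.238023,
               188.261324, 152.444216, 172.63264, 189.441097),
  Smc      = c(29.261638, 4.291884, 6.411332, 3.828254,
               8.279096, 2.596325, 4.291884, 14.425498),
  epLsar   = c(0.001136, 0.00056, 0.000571, 0.001182,
               0.000777, 0.001022, 0.001781, 0.001347)
)

# One grouping label per row, e.g. "cotau_tx".
groups <- paste(ssfamed$SPECSHOR, ssfamed$BONE, sep = "_")

# Part 1: one list element per factor combination, for a single variable.
list_Asfc <- split(ssfamed$Asfc, groups)

# Part 2: the same for every measurement column (columns 3 to 5 here),
# collected in one list whose elements are named after the variables.
vars <- names(ssfamed)[3:5]
all_lists <- lapply(ssfamed[vars], split, f = groups)

# If separate objects called list_Asfc, list_Smc, ... are really wanted,
# assign() creates an object from a name built as a string.
for (v in vars) {
  assign(paste("list", v, sep = "_"), all_lists[[v]])
}
```

Keeping everything in one named list (all_lists$Asfc, all_lists$Smc, ...) is usually easier to loop over afterwards than a set of separately named objects, but assign() answers the literal question of how to name an object programmatically.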