Alison,
Your code works fine on the first six lines of the data that you provided.
Rumino_Reps_agreeWalign <- data.frame(
    geneid = c("657313.locus_tag:RTO_08940",
               "457412.251848018",
               "657314.locus_tag:CK5_20630",
               "657323.locus_tag:CK1_33060",
               "657313.locus_tag:RTO_09690",
               "471875.197297106"),
    count_Conser = c(7, 1, 2, 1, 3, 0),
    count_NonCons = c(5, 4, 4, 0, 0, 2),
    count_ConsSubst = c(5, 3, 1, 1, 3, 1),
    count_NCSubst = c(1, 0, 0, 0, 1, 1))
gene.list <- strsplit(as.character(Rumino_Reps_agreeWalign$geneid),
    "\\.")
Rumino_Reps_agreeWalignTR <- transform(Rumino_Reps_agreeWalign,
    taxid=do.call(rbind, gene.list))
Perhaps in later rows of the data there are cases where geneid has no
"." or more than one "."? If not, can you provide a subset of your data
that results in the warning? Use the dput() function.
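A quick way to look for such rows, as a rough sketch: count the pieces
that strsplit() returns for each geneid. Anything other than two pieces
(no ".", or more than one) is worth a look. The ids below are made up
for illustration and are not from your data.

```r
# Hypothetical ids: the second has no ".", the third has two.
geneid <- c("657313.locus_tag:RTO_08940", "457412", "471875.197297106.1")
gene.list <- strsplit(geneid, "\\.")

# Number of pieces per id. rbind() only stacks these cleanly when
# every id yields the same number of pieces.
n.pieces <- sapply(gene.list, length)
table(n.pieces)

# Positions of the ids that do not split into exactly two pieces
bad.rows <- which(n.pieces != 2)
geneid[bad.rows]
```

On your real data you would use Rumino_Reps_agreeWalign$geneid in place
of the made-up vector, and dput() the rows that turn up in bad.rows.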
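It's not a good idea to create an object named "strsplit". R will still
find the function when you call strsplit(), but anywhere you use the bare
name (as in do.call(rbind, strsplit)) you get whichever one happens to be
visible, which is easy to get wrong in later runs.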
If time is an issue, a slightly faster way to do this, after the
strsplit() call, is:
Rumino_Reps_agreeWalign$geneid.prefix <- sapply(gene.list, "[", 1)
Rumino_Reps_agreeWalign$geneid.suffix <- sapply(gene.list, "[", 2)
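Note that the two sapply() lines keep only the first two pieces, so
anything after a second "." would be dropped. If that matters, one
possible alternative is to split at the first "." only, using sub().
This is just a sketch on made-up ids, and it assumes the part before the
first "." is always the prefix you want.

```r
# Hypothetical ids: one normal, one with two ".", one with none
geneid <- c("657313.locus_tag:RTO_08940", "471875.197297106.1", "457412")

# Everything before the first "."
prefix <- sub("\\..*$", "", geneid)

# Everything after the first "."; NA when the id has no "." at all
suffix <- ifelse(grepl(".", geneid, fixed = TRUE),
                 sub("^[^.]*\\.", "", geneid),
                 NA_character_)

data.frame(geneid, prefix, suffix)
```

Because sub() works on the whole vector at once, this also avoids the
list returned by strsplit().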
Jean
alison waller wrote on 04/11/2012 08:23:29 AM:
> Dear all,
>
> I want to use string split to parse column names, however, I am having
> some errors that I don't understand.
> I see a problem when I try to rbind the output from strsplit.
>
> please let me know if I'm missing something obvious,
>
> thanks,
> alison
>
> here are my commands:
>
> > strsplit<-strsplit(as.character(Rumino_Reps_agreeWalign$geneid),"\\.")
> > Rumino_Reps_agreeWalignTR<-transform(Rumino_Reps_agreeWalign,taxid=do.call(rbind,strsplit))
> Warning message:
> In function (..., deparse.level = 1) :
> number of columns of result is not a multiple of vector length (arg 1)
>
> here is my data:
>
> > head(Rumino_Reps_agreeWalign)
> geneid count_Conser count_NonCons count_ConsSubst
> 1 657313.locus_tag:RTO_08940 7 5 5
> 2 457412.251848018 1 4 3
> 3 657314.locus_tag:CK5_20630 2 4 1
> 4 657323.locus_tag:CK1_33060 1 0 1
> 5 657313.locus_tag:RTO_09690 3 0 3
> 6 471875.197297106 0 2 1
> count_NCSubst
> 1 1
> 2 0
> 3 0
> 4 0
> 5 1
> 6 1
>
> here are the results from strsplit:
> > head(strsplit)
> [[1]]
> [1] "657313" "locus_tag:RTO_08940"
>
> [[2]]
> [1] "457412" "251848018"
>
> [[3]]
> [1] "657314" "locus_tag:CK5_20630"
>
> [[4]]
> [1] "657323" "locus_tag:CK1_33060"
>
> [[5]]
> [1] "657313" "locus_tag:RTO_09690"
>
> [[6]]
> [1] "471875" "197297106"