Martin Schilling
2012-May-23 12:50 UTC
[R] data conversion (possibly with reshape package)
Hi everyone,
I have an issue with a data conversion. I first tried it with the
reshape package, but it has been quite a while since I last used it, so I
feel kind of rusty...
I have a data.frame like this:
id Sample.Name Marker Allele.1 Allele.2 sample_id species
1  01_primer01 Dalb01 165      179      SH233     D. madagascariensis
2  01_primer04 Dalb04 221      225      SH233     D. madagascariensis
3  01_primer08 Dalb08 218      218      SH233     D. madagascariensis
4  01_primer10 Dalb10 134      134      SH233     D. madagascariensis
5  01_primer14 Dalb14 250      250      SH233     D. madagascariensis
6  01_primer16 Dalb16 232      232      SH233     D. madagascariensis
This is just the head(); in the full data the sample_id column contains
several different ids. I would like to collapse all rows with the same
sample_id into a single row, to get something like this:
species             sample_id Marker1_Allele1 Marker1_Allele2 Marker2_Allele1 Marker2_Allele2 ... Marker31_Allele1 Marker31_Allele2
D. madagascariensis SH233     179             225             134             244             ... 308              322
D. baronii          SH151     123             134             155             155             ... 307              312
I tried to set up the cast() call but couldn't quite figure out how to
achieve this. I first melted the data with the following:
genMelt <- melt(geno, id = c(1:2, 5:6))
then I created a column:
genMelt$Locus <- substring(as.character(genMelt$Panel),5, 6)
genMelt$Locus <- paste(genMelt$Locus, genMelt$variable, sep = "_")
So I get a column in the appropriate format. But when I cast:
mycast <- cast(genMelt,sample_id~Locus)
I just get the frequencies (counts) of the loci per sample_id, not the
allele values.
Maybe I don't even need reshape; I also thought about a loop. If you have
any ideas on this, I would appreciate them.
Cheers,
Martin
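
A minimal sketch of one way the long-to-wide conversion described above might be done without the reshape package, using base R's reshape(). The toy data frame below is an assumption: it reuses the column names from the head() shown, the SH151 rows are made up for illustration, and it assumes each sample_id/Marker combination occurs only once. If a combination occurs more than once, reshape() keeps the first match and warns, and cast() falls back to counting (length), which may be where the frequencies come from.

```r
# Toy data modelled on the head() in the post. The SH151 rows (and their
# Sample.Name values) are hypothetical and only serve as a second sample.
geno <- data.frame(
  Sample.Name = c("01_primer01", "01_primer04", "02_primer01", "02_primer04"),
  Marker      = c("Dalb01", "Dalb04", "Dalb01", "Dalb04"),
  Allele.1    = c(165, 221, 123, 155),
  Allele.2    = c(179, 225, 134, 155),
  sample_id   = c("SH233", "SH233", "SH151", "SH151"),
  species     = c("D. madagascariensis", "D. madagascariensis",
                  "D. baronii", "D. baronii"),
  stringsAsFactors = FALSE
)

# Keep only the columns needed. Sample.Name differs on every row, so it is
# dropped before collapsing to one row per sample.
long <- geno[, c("species", "sample_id", "Marker", "Allele.1", "Allele.2")]

# Long to wide: one row per species/sample_id, and one pair of allele
# columns per marker (named e.g. Allele.1_Dalb01, Allele.2_Dalb01).
wide <- reshape(
  long,
  idvar     = c("species", "sample_id"),
  timevar   = "Marker",
  v.names   = c("Allele.1", "Allele.2"),
  direction = "wide",
  sep       = "_"
)
rownames(wide) <- NULL

print(wide)
```

The resulting column names follow the pattern Allele.1_Dalb01 rather than Marker1_Allele1; they could be renamed afterwards with names(wide) <- ... if the other layout is needed.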
