Carl-Göran CG. Pettersson
2010-Jan-11 07:47 UTC
[R] Illustrating kernel distribution in wheat ears
Dear all

R 2.10, WinXP

I have a dataset dealing with the way different wheat cultivars build their yield.
Wheat ears are organised in spikelets, which can be numbered from the bottom, with even numbers on one side and odd numbers on the other.
I know how many kernels there were in each spikelet, after some months spent counting them...

Now I want to illustrate the differences between the cultivars in how the kernels are distributed in the ears.
In the best of all possible worlds it would be possible to place histograms or boxplots on adjacent sides of vertical lines representing different cultivars.
I have done some experimenting using boxplot() but I am stuck and out of ideas right now.

All ideas are welcome!
/CG

Here is a sample dataset with the kernel counts for the first 14 spikelets:

cn       spl01 spl02 spl03 spl04 spl05 spl06 spl07 spl08 spl09 spl10 spl11 spl12 spl13 spl14
Lans       1.8   3.1   3.5   3.8   3.8   4.1   4.2   4.3   4.4   4.5   4.2   4.1   3.9   3.8
Kranich    0.6   2.4   3.4   4.2   4.5   4.7   4.9   4.9   4.8   4.7   4.4   4.1   4.1   3.9
Loyal      1.1   2.7   3.6   3.7   4.1   4.4   4.4   4.6   4.3   4.5   4.3   4.1   3.8   3.7
Boomer      NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
Oakley      NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
Hereford   0.6   2.3   3.3   3.6   3.9     4   4.2   4.1   4.1   3.9   3.9   3.6   3.4   3.2
Kranich    0.3   2.5   3.6     4   4.4   4.5   4.3   4.8   4.7   4.6   4.4   4.3   4.1     4
Oakley     0.5   2.1   3.2   3.4   3.8   4.4   4.3   4.3   4.3   4.2   4.2   3.9   3.8   3.6
Loyal      1.6   3.3   3.9   4.2   4.3   4.4   4.4   4.6   4.6   4.5   4.3   4.3   4.2     4
Hereford    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
Oakley     0.5   2.1   3.2   3.6     4     4   4.1   4.4   4.4   4.2   4.1   3.8   3.8     4
Kranich     NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
Lans       1.4     3   3.3   3.8   3.9   4.3     4   4.3   4.3   4.3     4   4.1     4     4
Hereford   1.2   2.7   3.6   3.8     4     4   4.1   4.2   4.1   4.1   3.9   3.6   3.8   3.3
Boomer     0.3   2.5   3.1   3.8   3.9   4.4   4.1   4.2   4.3     4   4.2     4   3.8   3.7
Lans        NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
Boomer     0.2   1.9     3   3.4   3.7   3.9   3.9     4     4     4   3.8   3.8   3.6   3.4
Loyal       NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
Boomer      NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
Kranich     NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
Kranich    0.3   1.1   2.9   3.5   3.9   4.3   4.4   4.4     4   4.2   4.2     4   3.9   3.8
Hereford   0.5   2.1   3.1   3.6   3.7   3.9     4   3.8     4   3.8   3.6   3.6   3.1     3
Loyal       NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
Boomer     0.3   0.8   2.8     3   3.6   3.7   3.8     4   3.8   3.5   3.3   3.2   3.2   2.9
Oakley     0.5   2.7   3.4   3.8     4   3.9   4.2   4.5   4.3   4.4     4     4   3.9   3.9
Loyal      0.9   2.6   3.6   3.8   3.8   4.4   4.2   4.4   4.2   3.9   3.8     4   3.4   3.7
Oakley      NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
Hereford   0.7   2.9   3.6     4     4   3.9     4     4     4   3.9   3.8   3.7     3     3
Hereford    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
Loyal      0.7   2.3   3.5   3.7   3.9   3.8   4.2   4.1   4.1   4.1     4     4   3.4   3.6
Boomer     0.7     2   3.3   3.5   3.9   3.7     4   3.9   3.8     4   3.7   3.8   3.5   3.4
Lans        NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
Lans       1.9     3   3.7   3.8   3.9     4   3.9   4.3   4.1   4.1   4.1   3.8   3.8   3.9
Lans       1.1   2.6   3.3   3.7   4.1     4   4.2   4.2   4.2     4   4.1   4.1   3.8   3.6
Kranich    0.5   1.3   2.9   3.8   3.8   4.3   4.3   4.4   4.4     4   4.3   3.9   3.6   3.4
Oakley     0.1     2   3.1   3.5   4.1   3.9   4.1   4.2   4.2   4.2   4.1     4   3.9   3.8
On 01/11/2010 06:47 PM, Carl-Göran CG. Pettersson wrote:
> Dear all
>
> R2.10 WinXP
>
> I have a dataset dealing with the way different wheat cultivars build their yield.
> Wheat ears are organised in spikelets where the spikelets can be numbered from the bottom, with even numbers on one side and odd on the other.
> I know how many kernels there were in each spikelet after some months spent counting them...
>
> Now I want to illustrate the differences between the cultivars in how the kernels are distributed in the ears.
> In the best of all possible worlds it would be possible to place histograms or boxplots on adjacent sides of vertical lines representing different cultivars.
> I have done some experimenting using boxplot() but I am stuck and out of ideas right now.
>

Hi Carl,
Is this what you are looking for?

# empty plotting frame with one x position per cultivar
plot(0, xlim=c(0.5,6.5), ylim=c(0,6),
 main="Kernel distribution", xlab="Cultivar",
 ylab="Count", type="n", xaxt="n")
cultivars <- unique(spikernel$cn)
axis(1, at=1:6, labels=cultivars)
# one boxplot per cultivar; column 1 (cn) is dropped so only kernel counts are plotted
for(cultivar in 1:6)
 boxplot(unlist(spikernel[spikernel$cn==cultivars[cultivar],-1]),
  add=TRUE, at=cultivar)
# vertical separators between cultivars
abline(v=1.5:5.5)

Jim
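The loop above draws one box per cultivar with all spikelets pooled. A rough sketch of how the same base-graphics idea might be pushed towards the "adjacent sides of a vertical line" layout from the original post, assuming odd-numbered spikelets belong to one side of the ear and even-numbered ones to the other. The toy data frame (four values per row, copied from the first rows of the posted sample), the offsets of 0.2 and the boxwex value are illustrative assumptions, not part of the thread.

```r
# Toy stand-in for the poster's data frame: two cultivars, four spikelets.
spikernel <- data.frame(
  cn    = c("Lans", "Kranich", "Lans", "Kranich"),
  spl01 = c(1.8, 0.6, 1.4, 0.3),
  spl02 = c(3.1, 2.4, 3.0, 2.5),
  spl03 = c(3.5, 3.4, 3.3, 3.6),
  spl04 = c(3.8, 4.2, 3.8, 4.0)
)

cultivars <- unique(as.character(spikernel$cn))
counts    <- spikernel[, -1]                        # kernel counts only
spl.no    <- as.numeric(substring(names(counts), 4))
odd       <- spl.no %% 2 == 1                       # odd spikelets: one side of the ear

# Empty frame, one x position per cultivar
plot(0, type = "n", xaxt = "n",
     xlim = c(0.5, length(cultivars) + 0.5), ylim = c(0, 6),
     main = "Kernel distribution", xlab = "Cultivar", ylab = "Count")
axis(1, at = seq_along(cultivars), labels = cultivars)

for (i in seq_along(cultivars)) {
  rows  <- spikernel$cn == cultivars[i]
  left  <- unlist(counts[rows, odd])                # odd-numbered spikelets
  right <- unlist(counts[rows, !odd])               # even-numbered spikelets
  boxplot(left,  add = TRUE, at = i - 0.2, boxwex = 0.3, axes = FALSE)
  boxplot(right, add = TRUE, at = i + 0.2, boxwex = 0.3, axes = FALSE)
}
abline(v = seq_along(cultivars), lty = 3)           # one vertical line per cultivar
```

Each cultivar then gets a dotted vertical line with the odd-side box to its left and the even-side box to its right.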
Carl-Göran CG. Pettersson
2010-Jan-11 13:41 UTC
[R] Illustrating kernel distribution in wheat ears
Thanks a lot for the quick response!

The suggested code worked fine up to a certain point: the actual plotting from the datasets...
With the sample dataset in "samp", my code looks like this at the end:

> for(cultivar in 1:6)
+  boxplot(unlist(samp[samp$cn==cultivar[cultivar],]),
+   add=TRUE,at=cultivar)
There were 12 warnings (use warnings() to see them)
> warnings()
Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
. . . . .

What is happening and why?
/CG

________________________________________
From: Jim Lemon [jim at bitwrit.com.au]
Sent: 11 January 2010 10:56
To: Carl-Göran CG. Pettersson
Cc: r-help at r-project.org
Subject: Re: [R] Illustrating kernel distribution in wheat ears

Hi Carl,
Is this what you are looking for?

plot(0, xlim=c(0.5,6.5), ylim=c(0,6),
 main="Kernel distribution", xlab="Cultivar",
 ylab="Count", type="n", xaxt="n")
cultivars <- unique(spikernel$cn)
axis(1, at=1:6, labels=cultivars)
for(cultivar in 1:6)
 boxplot(unlist(spikernel[spikernel$cn==cultivars[cultivar],-1]),
  add=TRUE, at=cultivar)
abline(v=1.5:5.5)

Jim
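On the warnings reported above: they look consistent with a small typo in the loop, where the data are subset with cultivar[cultivar] (the loop counter indexed by itself) instead of cultivars[cultivar] (the vector of cultivar names). A minimal sketch of that reading follows; the three-row data frame is a toy stand-in for "samp" and is an assumption made for illustration.

```r
# Toy stand-in for 'samp' (values taken from the first rows of the posted sample)
samp <- data.frame(cn    = c("Lans", "Kranich", "Lans"),
                   spl01 = c(1.8, 0.6, 1.4),
                   spl02 = c(3.1, 2.4, 3.0),
                   stringsAsFactors = TRUE)
cultivars <- unique(samp$cn)

# Inside the loop 'cultivar' is a single integer, so indexing it by itself
# gives 1 on the first pass and NA afterwards; it never yields a cultivar name.
for (cultivar in 1:2) print(cultivar[cultivar])

cultivar <- 2
bad  <- unlist(samp[samp$cn == cultivar[cultivar], -1])   # as typed in the session
good <- unlist(samp[samp$cn == cultivars[cultivar], -1])  # as suggested

sum(!is.na(bad))   # no usable kernel counts are selected
good               # the Kranich counts

# min() and max() over an empty set return Inf and -Inf (with the warnings quoted above)
suppressWarnings(c(min(numeric(0)), max(numeric(0))))
```

With six passes through the loop and two warnings per pass (one from min, one from max), this would also account for the count of 12.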
Hi:

It wasn't clear to me precisely what you wanted, but here are a couple of ideas in the hope that they will help. I used ggplot2 for the graphics, which requires reshaping your dataset from 'wide' format to 'long'. I also add an indicator for the side of the ear (odd is side one (L?), even is side 2) and a variable I call 'loc' to hold the numeric part of the splxx variable name.

I read the data into a data frame called spikelets. The first step is to remove the rows of missing responses:

naind <- apply(spikelets[, -1], 1, function(x) all(is.na(x)))
spikelets2 <- spikelets[!naind, ]

Next, I use the melt() function from the reshape package to convert the data frame from 'wide' to 'long' form:

library(ggplot2)   # at the time of this post, loading ggplot2 also attached reshape and plyr
library(reshape)   # provides melt(); load it explicitly if ggplot2 does not attach it
spikes.long <- melt(spikelets2, id = 'cn')

The variable 'variable' contains the variable names as a vector (spl01, spl02, ..., spl14).

Next, I create a variable called loc, which represents the numeric part of the spl variables, and then create a variable side to distinguish one side of the ear from the other. 'variable' is then removed...

spikes.long$loc <- as.numeric(substring(spikes.long$variable, 4))
spikes.long$side <- factor(2 - spikes.long$loc %% 2)
spikes.long$variable <- NULL

Now we're in a position to plot. The first is a scatterplot of the response by location, stratified by cultivar; it uses color to distinguish sides.

# With color:
p <- qplot(loc, value, data = spikes.long, group = cn, colour = side)
p + facet_grid(cn ~ .)

The color is not terribly informative, so to get rid of it, remove the colour = side argument. One could also merge the plots together and fit smooths to the different cultivars.

ggplot(spikes.long, aes(loc, value, colour = cn)) + geom_point() + geom_smooth(se = FALSE)

I also came up with boxplot pairs by side for each cultivar, which is shown below:

q <- ggplot(spikes.long, aes(side, value))
q + geom_boxplot() + facet_grid(. ~ cn)   # facet by cn, the cultivar column

For some reason, I kept getting these messages from every ggplot2 call:

Error in recordGraphics(drawGTree(x), list(x = x), getNamespace("grid")) :
  invalid graphics state

but all of the plots rendered as expected.

HTH,
Dennis
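For readers without the reshape package, the wide-to-long steps described above can be approximated in base R. This is only a sketch: it swaps melt() for stack(), and the three-row data frame is a toy stand-in for "spikelets" (values from the posted sample plus one all-NA row), both of which are assumptions rather than part of the original reply.

```r
# Toy stand-in for 'spikelets'
spikelets <- data.frame(
  cn    = c("Lans", "Kranich", "Boomer"),
  spl01 = c(1.8, 0.6, NA),
  spl02 = c(3.1, 2.4, NA),
  spl03 = c(3.5, 3.4, NA),
  stringsAsFactors = FALSE
)

# Drop rows whose counts are all missing
naind      <- apply(spikelets[, -1], 1, function(x) all(is.na(x)))
spikelets2 <- spikelets[!naind, ]

# Wide to long: stack() puts every count in 'values' and the
# name of the column it came from in 'ind'
long <- stack(spikelets2[, -1])
spikes.long <- data.frame(
  cn    = rep(spikelets2$cn, times = ncol(spikelets2) - 1),
  value = long$values,
  loc   = as.numeric(substring(as.character(long$ind), 4))
)

# Side of the ear: 1 for odd-numbered spikelets, 2 for even-numbered ones
spikes.long$side <- factor(2 - spikes.long$loc %% 2)

spikes.long
```

The result has the same cn, value, loc and side columns that the plotting calls above expect, so it could be passed to them unchanged.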