thr3ads.net - R help - [R] plot question [Oct 2007]

Tiandao Li

2007-Oct-02 16:03 UTC

[R] plot question

Hello,

I have a question about how to plot a series of data. The following is my 
data matrix n:
> n
             25p    5p  2.5p 0.5p
16B-E06.g 45379  4383  5123   45
16B-E06.g 45138  4028  6249   52
16B-E06.g 48457  4267  5470   54
16B-E06.g 47740  4676  6769   48
37B-B02.g 42860  6152 19276   72
35B-A02.g 48325 12863 38274  143
35B-A02.g 48410 12806 39013  175
35B-A02.g 48417  9057 40923  176
35B-A02.g 51403 13865 43338  161
45B-C12.g 50939  3656  5783   43
45B-C12.g 52356  5524  6041   55
45B-C12.g 49338  5141  5266   41
45B-C12.g 51567  3915  5677   43
35A-G04.g 40365  5513  6971   32
35B-D01.g 54217 12607 13067   93
35B-D01.g 55283 11441 14964  101
35B-D01.g 55041  9626 14928   94
35B-D01.g 54058  9465 14912   88
35B-A04.g 42745 12080 34271  105
35B-A04.g 41055 12423 34874  126

colnames(n) holds the concentrations, rownames(n) holds the gene IDs, and the 
remaining values are intensities. I want to plot the data this way:
The x-axis is colnames(n) in the order 0.5p, 2.5p, 5p, and 25p.
The y-axis is intensity.
Inside the plot are the intensity points over the 4 concentrations, with points 
from different genes in a different color or shape. A regression line for 
each gene crosses the different concentrations, and at the end of each line is 
the gene ID.

Thanks,

Tiandao

Eric Thompson

2007-Oct-02 17:13 UTC

[R] plot question

If I've correctly interpreted what you want, you first need to get the x
values:

x <- colnames(n)
x <- as.numeric(substr(x, 1, nchar(x) - 1))
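
As a quick check of that conversion, here is a minimal sketch using the column names from the original post. The names cn and ord are only illustrative and are not part of the original code.

```r
# Column names as they appear in the posted matrix
cn <- c("25p", "5p", "2.5p", "0.5p")

# Drop the trailing "p" and convert to numeric concentrations
x <- as.numeric(substr(cn, 1, nchar(cn) - 1))

# Index that puts the columns in increasing concentration,
# i.e. the 0.5p, 2.5p, 5p, 25p order asked for in the question
ord <- order(x)
cn[ord]
```

Plotting against the numeric x makes the axis order follow the concentrations themselves, so the columns of n do not need to be rearranged first.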

Then it seems fairly easy to use matplot to plot the values, with a
different color for each row of n (each column of t(n) is drawn as its
own series)

dim(x) <- c(length(x), 1)
matplot(x, t(n), pch = 1)

But this does not look like a simple line will fit the data for each
gene well, so perhaps I've misunderstood something. You will have to
decide how you want to do the regression. It will also get very messy
and difficult to read with 20 lines (a different regression for each
gene). To do the regressions, plot the lines, and label with the gene
ID, see

?lm
?predict
?abline
?text
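
One way those four functions could fit together is sketched below in base R. This is only a sketch under assumptions that are not in the original post: the fit is a straight line on log10-log10 axes (a modelling choice, not a recommendation), replicate rows sharing a gene ID are pooled into one fit, and lines() is used instead of abline() so that each line stops at the largest concentration where the label is placed. The names genes, ids and fits are illustrative.

```r
# Intensity matrix from the original post: rows are gene replicates,
# columns are the concentrations 25p, 5p, 2.5p and 0.5p.
genes <- c(rep("16B-E06.g", 4), "37B-B02.g", rep("35B-A02.g", 4),
           rep("45B-C12.g", 4), "35A-G04.g", rep("35B-D01.g", 4),
           rep("35B-A04.g", 2))
n <- matrix(c(
  45379,  4383,  5123,  45,
  45138,  4028,  6249,  52,
  48457,  4267,  5470,  54,
  47740,  4676,  6769,  48,
  42860,  6152, 19276,  72,
  48325, 12863, 38274, 143,
  48410, 12806, 39013, 175,
  48417,  9057, 40923, 176,
  51403, 13865, 43338, 161,
  50939,  3656,  5783,  43,
  52356,  5524,  6041,  55,
  49338,  5141,  5266,  41,
  51567,  3915,  5677,  43,
  40365,  5513,  6971,  32,
  54217, 12607, 13067,  93,
  55283, 11441, 14964, 101,
  55041,  9626, 14928,  94,
  54058,  9465, 14912,  88,
  42745, 12080, 34271, 105,
  41055, 12423, 34874, 126),
  ncol = 4, byrow = TRUE,
  dimnames = list(genes, c("25p", "5p", "2.5p", "0.5p")))

# Numeric concentrations taken from the column names
x <- as.numeric(sub("p$", "", colnames(n)))
ids <- unique(rownames(n))

# Empty log-log plot; extra room on the right for the gene labels
plot(range(x), range(n), type = "n", log = "xy",
     xlim = c(min(x), max(x) * 4),
     xlab = "Concentration", ylab = "Intensity")

fits <- list()
for (i in seq_along(ids)) {
  rows <- rownames(n) == ids[i]
  # Long format for this gene: one row per (replicate, concentration)
  d <- data.frame(conc = rep(x, each = sum(rows)),
                  intensity = as.vector(n[rows, , drop = FALSE]))
  points(d$conc, d$intensity, col = i, pch = i)

  # Assumed model: linear in log10(conc) vs log10(intensity)
  fit <- lm(log10(intensity) ~ log10(conc), data = d)
  fits[[ids[i]]] <- fit

  # Fitted values at the smallest and largest concentration
  pred <- 10^predict(fit, newdata = data.frame(conc = range(x)))
  lines(range(x), pred, col = i)

  # Gene ID at the right-hand end of the line
  text(max(x), pred[2], labels = ids[i], pos = 4, col = i, cex = 0.7)
}
```

With several genes ending at similar intensities the labels will overlap, which is the readability problem mentioned above; a legend may be easier to read than end-of-line labels.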





hadley wickham

2007-Oct-02 19:38 UTC

[R] plot question

I might do it something like this:

df <- structure(list(
  gene = structure(c(1L, 1L, 1L, 1L, 6L, 3L, 3L, 3L, 3L, 7L, 7L, 7L,
    7L, 2L, 5L, 5L, 5L, 5L, 4L, 4L),
    .Label = c("16B-E06.g", "35A-G04.g", "35B-A02.g", "35B-A04.g",
      "35B-D01.g", "37B-B02.g", "45B-C12.g"),
    class = "factor"),
  X25p = c(45379L, 45138L, 48457L, 47740L, 42860L, 48325L, 48410L,
    48417L, 51403L, 50939L, 52356L, 49338L, 51567L, 40365L, 54217L,
    55283L, 55041L, 54058L, 42745L, 41055L),
  X5p = c(4383L, 4028L, 4267L, 4676L, 6152L, 12863L, 12806L, 9057L,
    13865L, 3656L, 5524L, 5141L, 3915L, 5513L, 12607L, 11441L, 9626L,
    9465L, 12080L, 12423L),
  X2.5p = c(5123L, 6249L, 5470L, 6769L, 19276L, 38274L, 39013L,
    40923L, 43338L, 5783L, 6041L, 5266L, 5677L, 6971L, 13067L, 14964L,
    14928L, 14912L, 34271L, 34874L),
  X0.5p = c(45L, 52L, 54L, 48L, 72L, 143L, 175L, 176L, 161L, 43L,
    55L, 41L, 43L, 32L, 93L, 101L, 94L, 88L, 105L, 126L)),
  .Names = c("gene", "X25p", "X5p", "X2.5p", "X0.5p"),
  class = "data.frame", row.names = c(NA, -20L))

library(reshape)
library(ggplot2)

dfm <- melt(df, id=1)
names(dfm) <- c("gene", "conc", "intensity")
dfm$conc <- as.numeric(gsub("[Xp]", "", as.character(dfm$conc)))

qplot(conc, intensity, data=dfm, colour=gene, log="xy") +
  geom_smooth(method=lm)

Note that I've converted the concentrations to numeric values and
plotted them on a log scale.  If you want to treat concentration as a
factor, then you'll need the following code:

dfm$conc <- factor(dfm$conc)
qplot(conc, intensity, data=dfm, colour=gene, group=gene, log="y") +
  geom_smooth(method=lm, xseq=levels(dfm$conc))

But in that case, fitting a linear model seems a bit dubious.
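
For anyone without the reshape package, the wide-to-long step can also be sketched in base R. This is an illustration rather than a drop-in replacement for melt: it uses only three rows from the posted data, and the names df_small, value_cols and long are made up for the example.

```r
# Three rows from the posted data, in the same wide layout as df
df_small <- data.frame(
  gene  = c("16B-E06.g", "37B-B02.g", "35A-G04.g"),
  X25p  = c(45379, 42860, 40365),
  X5p   = c(4383, 6152, 5513),
  X2.5p = c(5123, 19276, 6971),
  X0.5p = c(45, 72, 32)
)

# Every column except the gene ID holds intensities
value_cols <- setdiff(names(df_small), "gene")

# Stack the intensity columns one after another; gene IDs repeat once
# per column, and each column name repeats once per row
long <- data.frame(
  gene      = rep(df_small$gene, times = length(value_cols)),
  conc      = rep(value_cols, each = nrow(df_small)),
  intensity = unlist(df_small[value_cols], use.names = FALSE)
)

# Same cleanup as above: strip the "X" prefix and "p" suffix
long$conc <- as.numeric(gsub("[Xp]", "", long$conc))
```

The result has one row per gene and concentration, with the same three columns as dfm.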

Note that you can also use this format of data with lattice:

library(lattice)
xyplot(intensity ~ conc, data=dfm, type=c("p","r"),
       group=gene, auto.key=TRUE)

Hadley

-- 
http://had.co.nz/
