On 03/23/2010 01:11 AM, kathy_BJ wrote:
> I am new to R, can anyone help with boxplot for a dataset like:
> file1
> col1 col2 col3 col4 col5
> 050350005 101 56.625 48.318 RED
> 051010002 106 50.625 46.990 GREEN
> 051190007 25 65.875 74.545 BLUE
> 051191002 246 52.875 57.070 RED
> 220050004 55 70 80.274 BLUE
> 220150008 75 67.750 62.749 RED
> 220170001 77 65.750 54.307 GREEN
> file2
> col1 col2 col3 col4 col5
> 050350005 101 56.625 57 RED
> 051010002 106 50.625 77 GREEN
> 051190007 25 65.875 51.6 BLUE
> 051191002 246 52.875 55.070 RED
> 220050004 55 70 32 BLUE
> 220150008 75 67.750 32.49 RED
>
> for each color (red,green and blue), I need to compare file1 and file2 by
> making box plot with MB and RMSE for (col4-col3) for file1 and file2 by
> dividing col2 in different group: if col2<20,20<=col2<50, 50<=col2<70,
> col2>=70. That is, for the boxplot, the x is (<20, 20-50,50-70,>70), while
> y is MB (and RMSE) of the difference of col4 and col3
>
Creating a category variable for your col2 values will probably make
things easier:
kbj1$col2_group<-cut(kbj1$col2,breaks=c(0,20,50,70,max(kbj1$col2)+1),
right=FALSE)
kbj2$col2_group<-cut(kbj2$col2,breaks=c(0,20,50,70,max(kbj2$col2)+1),
right=FALSE)
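
To see what right=FALSE does to the group boundaries, here is a minimal sketch using the col2 values from file1. The labels and the -Inf/Inf end breaks are illustrative choices, not part of the call above; they simply avoid computing max()+1 and give the axis names asked for in the question.

```r
# col2 values taken from file1 in the question
col2 <- c(101, 106, 25, 246, 55, 75, 77)

# right = FALSE closes each interval on the left, so "20-50" holds 20 but not 50.
# The labels and the -Inf/Inf end breaks are illustrative assumptions.
col2_group <- cut(col2,
                  breaks = c(-Inf, 20, 50, 70, Inf),
                  labels = c("<20", "20-50", "50-70", ">=70"),
                  right = FALSE)

# Count how many rows fall into each group
print(table(col2_group))
```

With these seven rows most values land in the top group, so some boxes will be built from very few points.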
I assume that your MB refers to megabases (and not megabytes, Monica
Bellucci or the Muslim Brotherhood) so:
kbj1$mbdiff<-kbj1$col4-kbj1$col3
kbj2$mbdiff<-kbj2$col4-kbj2$col3
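
The RMSE step is not spelled out here, so the following is only a sketch under an assumption: that MB and RMSE are meant as mean bias and root mean square error of col4-col3, summarised within each color and col2 group. If MB means something else, only the mbdiff line applies. The data are the file1 rows from the question.

```r
# file1 from the question, read into the kbj1 data frame used in this reply
kbj1 <- read.table(text = "col1 col2 col3 col4 col5
050350005 101 56.625 48.318 RED
051010002 106 50.625 46.990 GREEN
051190007 25 65.875 74.545 BLUE
051191002 246 52.875 57.070 RED
220050004 55 70 80.274 BLUE
220150008 75 67.750 62.749 RED
220170001 77 65.750 54.307 GREEN",
  header = TRUE,
  colClasses = c("character", "numeric", "numeric", "numeric", "character"))

kbj1$col2_group <- cut(kbj1$col2,
                       breaks = c(0, 20, 50, 70, max(kbj1$col2) + 1),
                       right = FALSE)
kbj1$mbdiff <- kbj1$col4 - kbj1$col3

# Assumed definitions: MB = mean of the differences,
# RMSE = square root of the mean squared difference.
# Rows of the result are colors, columns are col2 groups; empty cells are NA.
mb   <- tapply(kbj1$mbdiff, list(kbj1$col5, kbj1$col2_group), mean)
rmse <- tapply(kbj1$mbdiff, list(kbj1$col5, kbj1$col2_group),
               function(d) sqrt(mean(d^2)))

print(round(mb, 3))
print(round(rmse, 3))
```

Note that a per-group RMSE is a single number, so a boxplot of it needs some further grouping (several sites or runs per cell); otherwise a barplot of the table may be the more natural display.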
Having calculated that and your RMSE (stored here in a column named
rmse), you can turn to getting a boxplot:
boxplot(mbdiff~col2_group,data=kbj1[kbj1$col5=="RED",])
boxplot(rmse~col2_group,data=kbj1[kbj1$col5=="RED",])
...
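
To put file1 and file2 side by side in one plot, one possible approach (a sketch, not the only way) is to stack the two data frames with a column marking the source and let the formula cross that column with the col2 group. The column name src and the group labels below are made up for illustration.

```r
cols <- c("character", "numeric", "numeric", "numeric", "character")

kbj1 <- read.table(text = "col1 col2 col3 col4 col5
050350005 101 56.625 48.318 RED
051010002 106 50.625 46.990 GREEN
051190007 25 65.875 74.545 BLUE
051191002 246 52.875 57.070 RED
220050004 55 70 80.274 BLUE
220150008 75 67.750 62.749 RED
220170001 77 65.750 54.307 GREEN", header = TRUE, colClasses = cols)

kbj2 <- read.table(text = "col1 col2 col3 col4 col5
050350005 101 56.625 57 RED
051010002 106 50.625 77 GREEN
051190007 25 65.875 51.6 BLUE
051191002 246 52.875 55.070 RED
220050004 55 70 32 BLUE
220150008 75 67.750 32.49 RED", header = TRUE, colClasses = cols)

# Mark where each row came from ("src" is a hypothetical column name), then stack
kbj1$src <- "file1"
kbj2$src <- "file2"
both <- rbind(kbj1, kbj2)
both$src <- factor(both$src)

both$mbdiff <- both$col4 - both$col3
both$col2_group <- cut(both$col2,
                       breaks = c(-Inf, 20, 50, 70, Inf),
                       labels = c("<20", "20-50", "50-70", ">=70"),
                       right = FALSE)

# One color at a time; repeat with "GREEN" and "BLUE"
red <- both[both$col5 == "RED", ]

# src + col2_group gives one box per file within each col2 group
bp <- boxplot(mbdiff ~ src + col2_group, data = red,
              las = 2, ylab = "col4 - col3", main = "RED")
```

With the sample rows shown in the question, every RED row has col2 of 70 or more, so only the two ">=70" boxes are filled; the full files should populate the other groups.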
> I hope I didn't confuse anybody. Thank you so much
>
You can confuse all of the people some of the time and some of the
people all of the time, but if you can confuse all of the people all of
the time, you probably have a great career in politics waiting.
Jim