maryam moazam
2016-Jan-09 14:29 UTC
[R] Combining dataframes with different row numbers and plotting with ggplot2
Dear Michael,

Thanks for your feedback. Actually, I would like to show (and compare) the size distribution of df1 and df2 in a single plot using ggplot2, something like the attached picture. The command doesn't get me there. However, I'm really new here; could you please help me more with this?

Thanks in advance,
Maryam

On Sat, Jan 9, 2016 at 5:38 PM, Michael Dewey <lists at dewey.myzen.co.uk> wrote:

> Dear Maryam
>
> If you just need all the values of size would
> c(df1$size, df2$size)
> work?
>
> On 08/01/2016 21:44, maryam moazam wrote:
>
>> Dear Sir / Madam,
>>
>> I have just come to the amazing R software, so please be patient if my
>> question is basic for you. I have 2 text files (say 1.txt and 2.txt), each
>> file containing 2 columns and a different number of rows, like below
>>
>> case size
>> case1 120
>> case2 120
>> case3 121
>> case4 121
>> case5 121
>> case6 122
>> case7 122
>> case8 123
>>
>> I would like to have one plot for all text files, with the x-axis showing the
>> size between 300-1200 with the interval of 200 (300,500,700,900,1200) and
>> size between 1201-1500 with the interval of 1000. For dataframes with
>> equal row numbers, the following code worked well,
>>
>> df1 = read.table("1.txt", header=T)
>> df2 = read.table("2.txt", header=T)
>> # combining two dataframes with equal row number
>>
>> df = data.frame(df1$size, df2$size)
>> library(reshape)
>> library(ggplot2)
>> melted <- melt(df)
>>
>> ggplot(data=melted, aes(value)) + aes(fill=variable) +
>>   geom_histogram(binwidth=500) +
>>   scale_x_continuous(breaks=c(seq(300,1000,by=200), seq(1001,15000,by=1000)))
>>
>> but I couldn't reproduce the plot with this code for dataframes with
>> different row numbers. I think the problem is *how to combine dataframes
>> with different row numbers*; could you please help me out on this
>> issue?
>>
>> Thank you in advance
>>
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> --
> Michael
> http://www.dewey.myzen.co.uk/home.html

-------------- next part --------------
A non-text attachment was scrubbed...
Name: plot.png
Type: image/png
Size: 18641 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20160109/b721b1c6/attachment.png>
Michael Dewey
2016-Jan-09 15:11 UTC
[R] Combining dataframes with different row numbers and plotting with ggplot2
Sorry Maryam, but I use neither reshape nor ggplot2, so I will leave it to others to advise you.

--
Michael
http://www.dewey.myzen.co.uk/home.html
Jeff Newmiller
2016-Jan-09 19:05 UTC
[R] Combining dataframes with different row numbers and plotting with ggplot2
Please study each line of code, and use the str command to study the
intermediate data objects... the examples on this list are almost never
plug-and-play for your real work. Note that while you provided some of the
code necessary to make your example reproducible, I had to fill in blanks
with additional code... the Posting Guide asks you to make your example
run as-is to get us to the point where you are having problems. The below
code is a model for posing your future questions as well as an answer to
this one.
library(ggplot2)
DF1 <- read.table( text = "case size
case1 120
case2 120
case3 121
case4 121
case5 121
case6 122
case7 122
case8 123
", header=TRUE, as.is=TRUE )
# note the fewer records below
DF2 <- read.table( text = "case size
case1 120
case2 120
case3 121
case4 121
case5 121
case6 122
case7 122
", header=TRUE, as.is=TRUE )
# While you CAN use reshape to make long data out of wide data, that
# method for making long data will always presume you have the same number
# of records for each case. Combine your data directly into long form if
# that is how it is best represented.
# Below note the use of labels such as "Source" to organize the data
# Also note the use of "stringsAsFactors = FALSE" because concatenating
# factors is almost never a good idea... go read (again?) about what
# factors are if you don't understand why concatenating factors doesn't
# work well
DFL <- rbind( data.frame( Source = "DF1"
                        , size = DF1$size
                        , stringsAsFactors = FALSE
                        )
            , data.frame( Source = "DF2"
                        , size = DF2$size
                        , stringsAsFactors = FALSE
                        )
            )
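To see why the wide approach breaks down while the long one does not, the following minimal sketch may help. It rebuilds the two sample data frames inline (the values are the ones from the example above, standing in for the real files), shows that data.frame() rejects columns of unequal length, and then inspects the long result with str() as suggested earlier.

```r
# Stand-ins for the two files; only the differing row counts matter here
DF1 <- data.frame(case = paste0("case", 1:8),
                  size = c(120, 120, 121, 121, 121, 122, 122, 123),
                  stringsAsFactors = FALSE)
DF2 <- data.frame(case = paste0("case", 1:7),
                  size = c(120, 120, 121, 121, 121, 122, 122),
                  stringsAsFactors = FALSE)

# Wide form: data.frame() stops with an error because 8 and 7 rows
# cannot be recycled against each other; try() captures that error
wide <- try(data.frame(DF1$size, DF2$size), silent = TRUE)
print(inherits(wide, "try-error"))

# Long form: one row per observation, labelled by where it came from
DFL <- rbind(
  data.frame(Source = "DF1", size = DF1$size, stringsAsFactors = FALSE),
  data.frame(Source = "DF2", size = DF2$size, stringsAsFactors = FALSE)
)

# Inspect the intermediate object before plotting
str(DFL)
print(table(DFL$Source))
```

The long form has no requirement that the sources contribute the same number of rows, which is why it suits this data better than melting a wide data frame.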
# Your intent in making this graph is still a little opaque to me.. the
# breaks are causing logarithmic axis labels, but not all of the breaks
# show up
ggplot( data = DFL
      , aes( x = size, fill = Source ) ) +
  geom_histogram( binwidth = 500 ) + # might want position = "dodge"?
  scale_x_continuous( breaks = c( seq( 300, 800, by = 200 )
                                , seq( 1000, 15000, by = 1000 )
                                )
                    )
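If there are more than two files, the same long-form idea can be sketched with a named list. This is only an illustration under stated assumptions: the list `sizes` and its names are made-up stand-ins for the size columns read from each file, and the binwidth of 1 is chosen to suit the 120-123 sample values rather than the real data. The table of counts gives the numbers that side-by-side (dodged) bars would display, and the plot itself is only built when ggplot2 is installed.

```r
# Hypothetical stand-ins for the size column of each file
sizes <- list(
  DF1 = c(120, 120, 121, 121, 121, 122, 122, 123),
  DF2 = c(120, 120, 121, 121, 121, 122, 122)
)

# Build one long data frame: one row per observation, labelled by source
DFL <- do.call(rbind, Map(function(src, x) {
  data.frame(Source = src, size = x, stringsAsFactors = FALSE)
}, names(sizes), sizes))
rownames(DFL) <- NULL

# Counts per size value and source, i.e. the heights of dodged bars
counts <- table(DFL$size, DFL$Source)
print(counts)

# Build the dodged histogram only if ggplot2 is available;
# call print(p) to draw it
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(DFL, ggplot2::aes(x = size, fill = Source)) +
    ggplot2::geom_histogram(binwidth = 1, position = "dodge")
}
```

Adding a third file then only means adding a third element to the list; nothing else in the sketch needs to change.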
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k