maryam moazam
2016-Jan-09 14:29 UTC
[R] Combining dataframes with different row numbers and plotting with ggplot2
Dear Michael,

Thanks for your feedback. Actually, I would like to show (and compare) the size distributions of df1 and df2 in a single plot using ggplot2, something like the attached picture. The command doesn't get me there. I'm really new to R; could you please help me further with this?

Thanks in advance,
Maryam

On Sat, Jan 9, 2016 at 5:38 PM, Michael Dewey <lists at dewey.myzen.co.uk> wrote:

> Dear Maryam
>
> If you just need all the values of size would
> c(df1$size, df2$size)
> work?
>
> On 08/01/2016 21:44, maryam moazam wrote:
>
>> Dear Sir / Madam,
>>
>> I have just come to the amazing R software, so please be patient if my
>> question is basic for you. I have 2 text files (say 1.txt and 2.txt), each
>> file containing 2 columns and different row numbers, like below
>>
>> case size
>> case1 120
>> case2 120
>> case3 121
>> case4 121
>> case5 121
>> case6 122
>> case7 122
>> case8 123
>>
>> I would like to have one plot for all text files, with the x-axis showing the
>> size between 300-1200 with the interval of 200 (300,500,700,900,1200) and
>> size between 1201-1500 with the interval of 1000. For dataframes with
>> equal row numbers, the following code worked well,
>>
>> df1 = data.frame("1.txt", header=T)
>> df2 = data.frame("2.txt", header=T)
>> *combining two dataframes with equal row number*
>>
>> df = data.frame(df1$size,df2$size)
>> library(reshape)
>> melted <- melt(df)
>>
>> ggplot(data=melted, aes(value))+aes(fill=variable)+ geom_histogram
>> (binwidth =500)+
>>
>> +scale_x_continuous(breaks=c(seq(300,1000,by=200),seq(1001,15000,by=1000)))
>>
>> but I couldn't reproduce the plot with this code for dataframes with
>> different row numbers. I think the problem is *how to combine dataframes
>> with different row numbers*, could you please help me out on this
>> issue?
>>
>> Thank you in advance
>>
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> --
> Michael
> http://www.dewey.myzen.co.uk/home.html

-------------- next part --------------
A non-text attachment was scrubbed...
Name: plot.png
Type: image/png
Size: 18641 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20160109/b721b1c6/attachment.png>
Michael Dewey
2016-Jan-09 15:11 UTC
[R] Combining dataframes with different row numbers and plotting with ggplot2
Sorry Maryam but I use neither reshape nor ggplot2 so I will leave it to others to advise you.

--
Michael
http://www.dewey.myzen.co.uk/home.html
Jeff Newmiller
2016-Jan-09 19:05 UTC
[R] Combining dataframes with different row numbers and plotting with ggplot2
Please study each line of code, and use the str command to study the intermediate data objects... the examples on this list are almost never plug-and-play for your real work.

Note that while you provided some of the code necessary to make your example reproducible, I had to fill in blanks with additional code... the Posting Guide asks you to make your example run as-is to get us to the point where you are having problems. The below code is a model for posing your future questions as well as an answer to this one.

library(ggplot2)

DF1 <- read.table( text =
"case size
case1 120
case2 120
case3 121
case4 121
case5 121
case6 122
case7 122
case8 123
", header=TRUE, as.is=TRUE )

# note the fewer records below
DF2 <- read.table( text =
"case size
case1 120
case2 120
case3 121
case4 121
case5 121
case6 122
case7 122
", header=TRUE, as.is=TRUE )

# While you CAN use reshape to make long data out of wide data, that
# method for making long data will always presume you have the same number
# of records for each case. Combine your data directly into long form if
# that is how it is best represented.
# Below note the use of labels such as "Source" to organize the data
# Also note the use of "stringsAsFactors = FALSE" because concatenating
# factors is almost never a good idea... go read (again?) about what
# factors are if you don't understand why concatenating factors doesn't
# work well
DFL <- rbind( data.frame( Source = "DF1"
                        , size = DF1$size
                        , stringsAsFactors = FALSE )
            , data.frame( Source = "DF2"
                        , size = DF2$size
                        , stringsAsFactors = FALSE )
            )

# Your intent in making this graph is still a little opaque to me.. the
# breaks are causing logarithmic axis labels, but not all of the breaks
# show up
ggplot( data = DFL, aes( x = size, fill = Source ) ) +
  geom_histogram( binwidth = 500 ) + # might want position = "dodge"?
  scale_x_continuous( breaks = c( seq( 300, 800, by = 200 )
                                , seq( 1000, 15000, by = 1000 ) ) )

---------------------------------------------------------------------------
Jeff Newmiller
DCN:<jdnewmil at dcn.davis.ca.us>
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
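
A follow-up sketch, not from the original posters: the long-form approach above can be generalized to any number of data frames, and it also shows why the side-by-side combination from the question fails when the row counts differ. The inline data stand in for 1.txt and 2.txt, and the names `sources` and `wide`, the `binwidth = 1` setting, and the use of `position = "dodge"` are assumptions chosen for illustration on the small sample values, not settings taken from the thread.

```r
# Hypothetical stand-ins for the two files; the row counts differ on purpose.
DF1 <- data.frame(case = paste0("case", 1:8),
                  size = c(120, 120, 121, 121, 121, 122, 122, 123),
                  stringsAsFactors = FALSE)
DF2 <- data.frame(case = paste0("case", 1:7),
                  size = c(120, 120, 121, 121, 121, 122, 122),
                  stringsAsFactors = FALSE)

# The wide, side-by-side combination needs columns of equal length,
# so with 8 and 7 rows data.frame() stops with an error.
# Here the error is caught and its message kept as a character string.
wide <- tryCatch(data.frame(DF1$size, DF2$size),
                 error = function(e) conditionMessage(e))
print(wide)

# Long form: one row per observation, labelled with where it came from.
# Adding a third data frame only means adding it to this list.
sources <- list(DF1 = DF1, DF2 = DF2)
DFL <- do.call(rbind, Map(function(d, nm) {
  data.frame(Source = nm, size = d$size, stringsAsFactors = FALSE)
}, sources, names(sources)))
rownames(DFL) <- NULL

# Inspect the intermediate object before plotting.
str(DFL)
print(table(DFL$Source))

# Build the plot only if ggplot2 is installed.
# binwidth = 1 is an assumption that suits sizes of 120 to 123.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(DFL, aes(x = size, fill = Source)) +
    geom_histogram(binwidth = 1, position = "dodge")
  # print(p) draws the histogram with the two sources side by side
}
```

With real files, the two inline data frames would be replaced by `read.table()` calls on the file names, and the bin width and axis breaks would be adjusted to the actual range of `size`.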