Hi all,

I'm a beginner user of R. I am stuck on what I thought was a very obvious problem, but surprisingly, I haven't found any solution on the forum or online so far.

My problem is simple. I have a file which has entries like the following:

#ID Value1 List_of_values
ID1 0.342 0.01,1.2,0,0.323,0.67
ID2 0.010 0.987,0.056,1.3,1.5,0.4
ID3 0.146 0.1173,0.1494,0.211,0.1257
...
...

I want to split the third column (by comma) into individual values and put them in a variable so that I can plot a boxplot with those values, one boxplot per row. I have been having three issues:
1) R identifies the third column as an integer, instead of a list of lists
2) I haven't been able to split the third column into individual values
3) How do I get it in a format suitable for plotting a boxplot?

Any suggestions? I'd really appreciate any help on this.

Thank you,
Gaurav
Two questions: What is your code? What do you get with:

> options()$dec
decimal_point
          "."

--
David

On Dec 8, 2009, at 3:55 PM, Gaurav Moghe wrote:
> [...]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
Gaurav -
   Here's one way:

> x = textConnection('ID1 0.342 0.01,1.2,0,0.323,0.67
+ ID2 0.010 0.987,0.056,1.3,1.5,0.4
+ ID3 0.146 0.1173,0.1494,0.211,0.1257
+
+ ')
> y = read.table(x,stringsAsFactors=FALSE)
> res = apply(y,1,function(x)as.numeric(strsplit(x[3],',')[[1]]))
> names(res) = y[,1]
> boxplot(res)

   - Phil Spector
     Statistical Computing Facility
     Department of Statistics
     UC Berkeley
     spector at stat.berkeley.edu

On Tue, 8 Dec 2009, Gaurav Moghe wrote:
> [...]
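A variant of the approach above, offered as a sketch rather than a correction: splitting the column directly with strsplit() and lapply() avoids apply(), which would hand back a matrix instead of a list if every row happened to contain the same number of values. The object names (txt, y, res) are illustrative, and read.table(text = ...) is assumed to be available (it is in current versions of R).

```r
# Sample rows copied from the question; in practice these come from the file.
txt <- "ID1 0.342 0.01,1.2,0,0.323,0.67
ID2 0.010 0.987,0.056,1.3,1.5,0.4
ID3 0.146 0.1173,0.1494,0.211,0.1257"

y <- read.table(text = txt, stringsAsFactors = FALSE)

# strsplit() returns one character vector per row; convert each to numeric.
res <- lapply(strsplit(y[[3]], ",", fixed = TRUE), as.numeric)
names(res) <- y[[1]]

# boxplot() accepts a list and draws one box per element.
boxplot(res)
```

Because lapply() always returns a list, the result keeps the same shape whether or not the rows have equal lengths.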
Hi Gaurav,

1) tell R when reading the data to consider the third column as "character" via the colClasses argument to read.table()
2) foo <- lapply(strsplit(dataset$List_of_values,","), as.numeric)
3) unlist(foo) or some such

HTH,
Stephan

Gaurav Moghe wrote:
> [...]
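The three steps above might look like this in practice. This is a minimal sketch: the temporary file and the names path, all_values and n_per_row are hypothetical, and comment.char = "" is an assumption needed only because the header line in the sample starts with "#".

```r
# Hypothetical file, written to a temporary location so the sketch is self-contained.
path <- tempfile(fileext = ".txt")
writeLines(c("#ID Value1 List_of_values",
             "ID1 0.342 0.01,1.2,0,0.323,0.67",
             "ID2 0.010 0.987,0.056,1.3,1.5,0.4",
             "ID3 0.146 0.1173,0.1494,0.211,0.1257"), path)

# Step 1: force the third column to character; comment.char = "" keeps the "#ID" header.
dataset <- read.table(path, header = TRUE, comment.char = "",
                      colClasses = c("character", "numeric", "character"))

# Step 2: split each entry on commas and convert the pieces to numbers.
foo <- lapply(strsplit(dataset$List_of_values, ","), as.numeric)

# Step 3: unlist() pools everything into one flat vector, losing the row grouping.
all_values <- unlist(foo)
n_per_row <- lengths(foo)
```

Since unlist() discards which row each value came from, passing the list foo itself to boxplot() is the more direct route to one box per row.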
> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Gaurav Moghe
> Sent: Tuesday, December 08, 2009 12:56 PM
> To: r-help at r-project.org
> Subject: [R] Split comma separated list
>
> [...]
> I have been having three issues:
> 1) R identifies the third column as an integer, instead of a
> list of lists
> 2) I haven't been able to split the third column into individual values

For 1) and 2) try:

   > z <- read.table(textConnection(input), header=T, comment="",
   +      row.names=1, stringsAsFactors=FALSE)
   > z$List <- strsplit(z$List_of_values, ",") # strsplit fails on factors
   > z$List <- lapply(z$List, as.numeric) # strings->numbers
   > z
       Value1             List_of_values                               List
   ID1  0.342      0.01,1.2,0,0.323,0.67  0.010, 1.200, 0.000, 0.323, 0.670
   ID2  0.010    0.987,0.056,1.3,1.5,0.4  0.987, 0.056, 1.300, 1.500, 0.400
   ID3  0.146 0.1173,0.1494,0.211,0.1257     0.1173, 0.1494, 0.2110, 0.1257

> 3) How do I get it in a format suitable for plotting a boxplot?

   > boxplot(z$List, names=z$Value1)

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
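The reply above does not show how `input` was created; the sketch below assumes it is a character string holding the file's text. It also shows one alternative layout for plotting, a "long" data frame built with stack(), which suits boxplot()'s formula interface. The names vals and long are illustrative.

```r
# Assumption: 'input' holds the raw text of the file from the question.
input <- "#ID Value1 List_of_values
ID1 0.342 0.01,1.2,0,0.323,0.67
ID2 0.010 0.987,0.056,1.3,1.5,0.4
ID3 0.146 0.1173,0.1494,0.211,0.1257"

z <- read.table(text = input, header = TRUE, comment.char = "",
                row.names = 1, stringsAsFactors = FALSE)

# One numeric vector per row, named by the row ID.
vals <- lapply(strsplit(z$List_of_values, ","), as.numeric)
names(vals) <- rownames(z)

# stack() turns the named list into a two-column data frame (values, ind).
long <- stack(vals)

# The formula interface draws one box per level of 'ind', i.e. per original row.
boxplot(values ~ ind, data = long)
```

The long layout is optional for base graphics, since boxplot() takes the list directly, but it is the shape most other plotting and modelling functions expect.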