Hi,

I am relatively new to R, so this is probably a really basic issue that I keep hitting.

I read my data into R using the read.csv command:

x = rep("numeric", 3)
CB_un = read.csv("Corn_Belt_unirr.csv", header=TRUE, colClasses=c("factor", x))

# I have clearly told R that I have one factor variable and 3 numeric variables in this table,
# but if I try to do anything with them, I get an error

boxplot(CB_un["Value"]~CB_un["State.Fips"])

Error in model.frame.default(formula = CB_un["Value"] ~ CB_un["State.Fips"]) :
  invalid type (list) for variable 'CB_un["Value"]'

# Because these variables are all stored as lists,
# I have to unpack them

CB_unirr_rent<-as.numeric(unlist(CB_un["Value"]))
CB_unirr_State<-as.factor(unlist(CB_un["State.Fips"]))

# before I can do anything with them

boxplot(CB_unirr_rent~CB_unirr_State)

Is there a reason my data is always imported as lists? Is there a way to skip this unpacking step?

Thanks,

Sam
They aren't quite lists: they are actually data.frame()s, which are a special sort of list with rownames and other nice things.

To your immediate question, I think you're looking for the formula interface:

boxplot(Value ~ State.Fips, data = CB_un)

The data= argument is important so boxplot knows where to look for "Value" and "State.Fips".

Best,
Michael

On Mon, Jun 11, 2012 at 11:29 AM, Samantha Sifleet <Sifleet.Samantha at epamail.epa.gov> wrote:
> Is there a reason my data is always imported as lists? Is there a way to
> skip this unpacking step?
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
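For anyone following along without the original CSV, here is a minimal sketch of the point above. The data frame `rents` and its values are invented stand-ins for CB_un, and `plot = FALSE` is used only so the calls return their summary statistics instead of drawing.

```r
# Toy stand-in for CB_un; the values are invented for illustration.
rents <- data.frame(
  State.Fips = factor(c("17", "17", "17", "19", "19", "19")),
  Value      = c(180, 195, 210, 150, 160, 175)
)

# Single brackets return a one-column data frame, which the formula
# machinery rejects with the "invalid type (list)" error.
err <- tryCatch(
  boxplot(rents["Value"] ~ rents["State.Fips"], plot = FALSE),
  error = function(e) conditionMessage(e)
)
print(err)

# With data=, the bare column names in the formula are looked up in rents.
res <- boxplot(Value ~ State.Fips, data = rents, plot = FALSE)
print(res$names)  # one group per factor level
print(res$n)      # number of observations in each group
```

Dropping `plot = FALSE` from the last call should draw the usual side-by-side boxplot.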
Hi,

Have you tried str(CB_un) to make sure the structure of your data is what you expect?

Does
boxplot(CB_un[, "Value"]~CB_un[, "State.Fips"])
work?

Look at this:

> testdf <- data.frame(a=1:3, b=11:13)
> class(testdf["a"])
[1] "data.frame"
> class(testdf[["a"]])
[1] "integer"
> class(testdf[, "a"])
[1] "integer"

A data frame is a special form of list, so the usual list subsetting rules apply. Extracting a named component of a data frame with single square brackets gives you a data frame. Using row, column notation or double brackets gives a vector.

?"[" will give you more detail.

You have to use a data frame, and thus a list, for your data, since you can't mix factor and numeric data types in a matrix.

Sarah

--
Sarah Goslee
http://www.functionaldiversity.org
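The str() check and the remark about matrices can be tried on a small inline example. This is only a sketch: the CSV text below is a hypothetical two-column extract, not the real Corn_Belt_unirr.csv, and `cb` and `m` are names made up for the illustration.

```r
# Hypothetical two-column extract standing in for Corn_Belt_unirr.csv.
csv_text <- "State.Fips,Value
17,180
17,195
19,150
19,160"

cb <- read.csv(text = csv_text, header = TRUE,
               colClasses = c("factor", "numeric"))

str(cb)                    # one factor column, one numeric column
is.list(cb)                # TRUE: a data frame is a list of columns

class(cb["Value"])         # "data.frame" (still a list)
class(cb[["Value"]])       # "numeric"
class(cb[, "State.Fips"])  # "factor"

# Forcing the same data into a matrix coerces everything to one type.
m <- as.matrix(cb)
typeof(m)                  # "character"
```

So colClasses did its job: the columns have the requested types, and it is only the single-bracket extraction that hands back a list.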
On Jun 11, 2012, at 12:29 PM, Samantha Sifleet wrote:

> boxplot(CB_un["Value"]~CB_un["State.Fips"])
>
> Error in model.frame.default(formula = CB_un["Value"] ~
> CB_un["State.Fips"]) :
> invalid type (list) for variable 'CB_un["Value"]'

If you were steadfastly intent on using direct extraction in the formula, then this would be the way to do so:

boxplot(CB_un[["Value"]]~CB_un[["State.Fips"]])

Better would be to use the formula interface the way it was designed to operate:

boxplot( Value ~ State.Fips, data=CB_un)

--
David.

> Is there a reason my data is always imported as lists?

It's in a data frame, and data frames are lists.

--
David Winsemius, MD
West Hartford, CT
A data.frame is a list with some extra attributes. When you subset a data.frame as z["Column"] you get a one-column data.frame (which boxplot rejects because it wants numeric or character data). Subsetting it as either z[, "Column"] or z[["Column"]] gives you the column itself, not a data.frame containing one column.

> z <- data.frame(One=log(1:10), Two=rep(c("i","ii","iii"),c(3,4,3)))
> str(z["One"])
'data.frame':   10 obs. of  1 variable:
 $ One: num  0 0.693 1.099 1.386 1.609 ...
> str(z[, "One"])
 num [1:10] 0 0.693 1.099 1.386 1.609 ...
> str(z[["One"]])
 num [1:10] 0 0.693 1.099 1.386 1.609 ...

In the particular case of the formula interface to boxplot (and to other functions), you can avoid having to choose the column-extraction operator by using the data= argument. The following three examples give the same result:

boxplot(data=z, One ~ Two)
boxplot(z[["One"]] ~ z[["Two"]])
boxplot(z[, "One"] ~ z[, "Two"])

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of Samantha Sifleet
> Sent: Monday, June 11, 2012 9:29 AM
> To: r-help at r-project.org
> Subject: [R] Why is my data always imported as a list?
>
> Is there a reason my data is always imported as lists? Is there a way to
> skip this unpacking step?
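The claim that the three calls give the same result can be checked without looking at the plots. A small sketch, reusing the data frame `z` from the message above; `plot = FALSE` makes boxplot return its computed statistics, and the names `a`, `b` and `c3` are just labels for this illustration.

```r
z <- data.frame(One = log(1:10),
                Two = rep(c("i", "ii", "iii"), c(3, 4, 3)))

# Same boxplot three ways; plot = FALSE returns the statistics only.
a  <- boxplot(One ~ Two, data = z, plot = FALSE)
b  <- boxplot(z[["One"]] ~ z[["Two"]], plot = FALSE)
c3 <- boxplot(z[, "One"] ~ z[, "Two"], plot = FALSE)

identical(a$stats, b$stats)   # compare the five-number summaries
identical(a$stats, c3$stats)
a$n                           # group sizes
```

The only visible difference when actually plotting should be the default axis labels, which come from the expressions used in the formula.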
No one has yet mentioned the 'standard' way to reference a data frame item, which uses $

boxplot(CB_un[["Value"]]~CB_un[["State.Fips"]]) #Note the double [[ ]]
# is equivalent to
boxplot(CB_un$Value~CB_un$State.Fips)
# and can be achieved by
boxplot(Value~State.Fips, data=CB_un)

The rather awkward [["name"]] notation is usually only needed if the name does not comply with variable naming requirements (including list and data frame names); for example, names containing spaces or operator symbols (like "+") are not allowed for ordinary variables.

It is possible and sometimes useful to create non-standard data frame names for display; for example, anova.lm actually returns a data frame with names "Df", "Sum Sq", "Mean Sq", "F value" and "Pr(>F)". But it makes manipulation a tad trickier, and you would have to backquote such names to use them in a formula with a data argument.

S Ellison
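To make the point about non-standard names concrete, here is a short sketch using the built-in `cars` data set. The second data frame `d` and its column name "my value" are invented for illustration. As far as I know, backquoting is one way to refer to such a name, both after $ and inside a formula with data=.

```r
fit <- lm(dist ~ speed, data = cars)   # cars ships with base R
tab <- anova(fit)

names(tab)           # includes names with spaces, such as "Sum Sq"
tab[["Sum Sq"]]      # double brackets take the name as a string
tab$`Sum Sq`         # $ needs backquotes around a non-syntactic name

# Toy data frame with a space in a column name (invented values).
d <- data.frame(g = rep(c("a", "b"), each = 3),
                `my value` = c(1, 2, 3, 4, 5, 6),
                check.names = FALSE)

# Backquotes let the non-syntactic name appear in a formula.
res <- boxplot(`my value` ~ g, data = d, plot = FALSE)
res$n
```

For ordinary column names such as Value and State.Fips none of this is needed, and `CB_un$Value` or the data= form is simpler.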
On Mon, 11-Jun-2012 at 12:29PM -0400, Samantha Sifleet wrote:

|> boxplot(CB_un["Value"]~CB_un["State.Fips"])

Others have given good suggestions, but a slight modification of your code would work if your dataframe is what we'd like to think it is:

boxplot(CB_un[,"Value"]~CB_un[,"State.Fips"])

or

boxplot(CB_un[["Value"]]~CB_un[["State.Fips"]])

I can't check if those will work, but even if they do, the formula with a data argument is more elegant.

HTH

--
Patrick Connolly