Hi,
I am relatively new to R, so this is probably a really basic issue that
I keep hitting.
I read my data into R using the read.csv command:
x = rep("numeric", 3)
CB_un = read.csv("Corn_Belt_unirr.csv", header=TRUE,
                 colClasses=c("factor", x))
# I have clearly told R that I have one factor variable and 3 numeric
# variables in this table,
# but if I try to do anything with them, I get an error
boxplot(CB_un["Value"]~CB_un["State.Fips"])
Error in model.frame.default(formula = CB_un["Value"] ~
CB_un["State.Fips"]) :
invalid type (list) for variable 'CB_un["Value"]'
# Because these variables are all stored as lists.
#So, I have to unpack them.
CB_unirr_rent<-as.numeric(unlist(CB_un["Value"]))
CB_unirr_State<-as.factor(unlist(CB_un["State.Fips"]))
#before I can do anything with them
boxplot(CB_unirr_rent~CB_unirr_State)
Is there a reason my data is always imported as lists? Is there a way to
skip this unpacking step?
Thanks,
Sam
They aren't quite lists: they are actually data.frame()s, which are a
special sort of list with rownames and other nice things.

To your immediate question, I think you're looking for the formula
interface:

boxplot(Value ~ State.Fips, data = CB_un)

The data= argument is important so boxplot knows where to look for
"Value" and "State.Fips".

Best,
Michael

On Mon, Jun 11, 2012 at 11:29 AM, Samantha Sifleet
<Sifleet.Samantha at epamail.epa.gov> wrote:
> Is there a reason my data is always imported as lists? Is there a way to
> skip this unpacking step?
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
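Since Corn_Belt_unirr.csv is not available here, the following is only a sketch of the formula interface on an invented stand-in table. The name CB_toy and all of its values are assumptions made for illustration; plot = FALSE is used so that the computed statistics can be inspected without drawing.

```r
# Toy stand-in for the poster's table; the values are invented for illustration.
CB_toy <- data.frame(
  State.Fips = factor(c("17", "17", "17", "19", "19", "19")),
  Value      = c(100, 120, 110, 150, 170, 160)
)

# With data = CB_toy, the bare names Value and State.Fips are looked up
# as columns of CB_toy. plot = FALSE returns the statistics without plotting.
res <- boxplot(Value ~ State.Fips, data = CB_toy, plot = FALSE)

group_names   <- res$names       # one box per level of State.Fips
group_medians <- res$stats[3, ]  # third row of stats holds the medians
```

Dropping plot = FALSE draws the usual plot with one box per state.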
Hi,

Have you tried str(CB_un) to make sure the structure of your data is
what you expect?

Does
boxplot(CB_un[, "Value"] ~ CB_un[, "State.Fips"])
work?

Look at this:

> testdf <- data.frame(a=1:3, b=11:13)
> class(testdf["a"])
[1] "data.frame"
> class(testdf[["a"]])
[1] "integer"
> class(testdf[, "a"])
[1] "integer"

A data frame is a special form of list, so the usual list subsetting
rules apply. Extracting a named component of a data frame with single
square brackets gives you a data frame. Using row, column notation or
double brackets gives a vector. ?"[" will give you more detail.

You have to use a data frame, and thus a list, for your data, since you
can't mix factor and numeric data types in a matrix.

Sarah

--
Sarah Goslee
http://www.functionaldiversity.org
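The point about matrices can be checked with a small sketch. The data frame below is invented for illustration (the column f is an assumption, not part of the poster's data): a data frame keeps one class per column, while converting it to a matrix forces a single type.

```r
# Invented example: one integer column and one factor column.
testdf <- data.frame(a = 1:3, f = factor(c("x", "y", "x")))

# Each column of the data frame keeps its own class.
col_classes <- sapply(testdf, class)

# A matrix holds a single type, so mixing a factor with numbers
# coerces every cell to character.
m <- as.matrix(testdf)
matrix_type <- typeof(m)
```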
On Jun 11, 2012, at 12:29 PM, Samantha Sifleet wrote:

> boxplot(CB_un["Value"]~CB_un["State.Fips"])
>
> Error in model.frame.default(formula = CB_un["Value"] ~
> CB_un["State.Fips"]) :
> invalid type (list) for variable 'CB_un["Value"]'

If you were steadfastly intent on using direct extraction in the
formula, then this would be the way to do so:

boxplot(CB_un[["Value"]]~CB_un[["State.Fips"]])

Better would be to use the formula interface the way it was designed to
operate:

boxplot( Value ~ State.Fips, data=CB_un)

-- David.

> Is there a reason my data is always imported as lists?

It's in a data frame, and data frames are lists.

--
David Winsemius, MD
West Hartford, CT
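A short sketch of why the single-bracket form fails, using an invented three-row data frame (df and its values are assumptions for illustration). The error is caught with tryCatch so the snippet runs to completion.

```r
# Invented stand-in data; not the poster's table.
df <- data.frame(Value = c(1.5, 2.5, 3.5),
                 State.Fips = factor(c("17", "19", "17")))

is_list <- is.list(df)     # a data frame is a list of columns
single  <- df["Value"]     # single brackets: still a data frame
double  <- df[["Value"]]   # double brackets: the numeric column itself

# The single-bracket form puts one-column data frames into the formula,
# which model.frame() rejects. Catch the error instead of stopping.
msg <- tryCatch(
  {
    boxplot(df["Value"] ~ df["State.Fips"], plot = FALSE)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
```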
A data.frame is a list with some extra attributes. When you
subset a data.frame as
z["Column"]
you get a one-column data.frame (which boxplot rejects because
it wants numeric or character data). Subsetting it as either
z[, "Column"]
or
z[["Column"]]
gives you the column itself, not a data.frame containing one column.
> z <- data.frame(One=log(1:10),
+                 Two=rep(c("i","ii","iii"),c(3,4,3)))
> str(z["One"])
'data.frame': 10 obs. of 1 variable:
$ One: num 0 0.693 1.099 1.386 1.609 ...
> str(z[, "One"])
num [1:10] 0 0.693 1.099 1.386 1.609 ...
> str(z[["One"]])
num [1:10] 0 0.693 1.099 1.386 1.609 ...
In the particular case of the formula interface to boxplot (and to other
functions), you can avoid having to choose the column-extraction operator
by using the data= argument. The following three examples give the same
result:
boxplot(data=z, One ~ Two)
boxplot(z[["One"]] ~ z[["Two"]])
boxplot(z[, "One"] ~ z[, "Two"])
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
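The claim that the three calls give the same result can be checked without drawing anything. This is a sketch that reuses the data frame z from the message above and compares the statistics returned with plot = FALSE; the names b1, b2 and b3 are introduced here for illustration.

```r
z <- data.frame(One = log(1:10),
                Two = rep(c("i", "ii", "iii"), c(3, 4, 3)))

# plot = FALSE returns the box statistics instead of drawing the plot.
b1 <- boxplot(One ~ Two, data = z, plot = FALSE)
b2 <- boxplot(z[["One"]] ~ z[["Two"]], plot = FALSE)
b3 <- boxplot(z[, "One"] ~ z[, "Two"], plot = FALSE)

# All three calls group the same values in the same way.
same_stats <- identical(b1$stats, b2$stats) && identical(b2$stats, b3$stats)
groups <- b1$names
```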
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Samantha Sifleet
> Sent: Monday, June 11, 2012 9:29 AM
> To: r-help at r-project.org
> Subject: [R] Why is my data always imported as a list?
>
> Is there a reason my data is always imported as lists? Is there a way to
> skip this unpacking step?
No one has yet mentioned the 'standard' way to reference a data frame
item, which uses $:

boxplot(CB_un[["Value"]]~CB_un[["State.Fips"]]) # Note the double [[ ]]
# is equivalent to
boxplot(CB_un$Value~CB_un$State.Fips)
# and can be achieved by
boxplot(Value~State.Fips, data=CB_un)

The rather awkward [["name"]] notation is usually only needed if the
name does not comply with variable naming requirements (including list
and data frame names); for example, names containing spaces or operator
symbols (like "+") are not allowed for ordinary variables. It is
possible and sometimes useful to create non-standard data frame names
for display; for example, anova.lm actually returns a data frame with
names "Df", "Sum Sq", "Mean Sq", "F value" and "Pr(>F)". But it makes
manipulation a tad trickier, and you would have to quote such names with
backticks to use them in a formula with a data argument.

S Ellison
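A small sketch of the $ and [[ ]] equivalence, and of what happens with a name containing a space. The data frame and the column name "State Fips" are invented for illustration. As far as I know, wrapping such a name in backticks lets it be used both with $ and in a formula with a data argument.

```r
# Invented data; "State Fips" (with a space) is a deliberately awkward name.
df <- data.frame(Value = c(10, 20, 30, 40))
df[["State Fips"]] <- factor(c("17", "17", "19", "19"))

# $ and [[ ]] return the same column for an ordinary name.
same_col <- identical(df$Value, df[["Value"]])

# Backticks make the non-standard name usable with $ ...
via_dollar <- df$`State Fips`

# ... and inside a formula that uses the data argument.
res <- boxplot(Value ~ `State Fips`, data = df, plot = FALSE)
groups <- res$names
```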
On Mon, 11-Jun-2012 at 12:29PM -0400, Samantha Sifleet wrote:
|> boxplot(CB_un["Value"]~CB_un["State.Fips"])
Others have given good suggestions, but a slight modification of your
code would work if your dataframe is what we'd like to think it is:
boxplot(CB_un[,"Value"]~CB_un[,"State.Fips"])
or
boxplot(CB_un[["Value"]]~CB_un[["State.Fips"]])
I can't check if those will work, but even if they do, the formula
with a data argument is more elegant.
HTH
--
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
___ Patrick Connolly
{~._.~} Great minds discuss ideas
_( Y )_ Average minds discuss events
(:_~*~_:) Small minds discuss people
(_)-(_) ..... Eleanor Roosevelt
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.