thr3ads.net - R help - [R] adding zeros to dataframe [May 2009]


Collins, Cathy

2009-May-01 17:20 UTC

[R] adding zeros to dataframe

Greetings,
 
I am new to R and am hoping to get some tips from experienced R-programmers.
 
I have a dataset that I've read into R as a dataframe. There are 5 columns:
Plot location, species name, a species number code (unique to each species name),
abundance, and treatment. There are 272 plots in each treatment, but only the
plots in which the species was recorded have an abundance value.  For all
species in the dataset, I would like to add zeros to the abundance column for
any plots in which the species was not recorded, so that each species has 272
rows.  The data are sorted by species and then abundance, so all of the zeros
can presumably just be tacked on to the last (272-occupied plots) row for each
species.
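
To make the layout concrete, here is a small sketch of data with this shape. The column names (plot, species, code, abundance, treatment) and the values are assumptions made up for illustration, and 4 plots stand in for the real 272. Counting the rows per species shows how many zero rows each species would still need.

```r
# Hypothetical toy data: only plots where a species was recorded have a row.
n.plots <- 4   # stands in for the 272 plots in the real data
obs <- data.frame(
  plot      = c(1L, 3L, 2L, 3L, 4L),
  species   = c("sA", "sA", "sB", "sB", "sB"),
  code      = c(101, 101, 102, 102, 102),
  abundance = c(5, 2, 7, 1, 3),
  treatment = "control",
  stringsAsFactors = FALSE
)

# Rows present per species, and how many zero rows each one still needs
rows.per.species <- table(obs$species)
missing.rows <- n.plots - rows.per.species
print(missing.rows)
```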
 
My programming skills are still somewhat rudimentary (and biased toward
VBA-style looping...which seems to be leading me astray). Though I have
searched, I have not yet seen this particular problem addressed in the help
files.
 
Many thanks for any suggestions,
Cathy
 
<mailto:ccollins at ku.edu>

(Ted Harding)

2009-May-01 17:40 UTC

[R] adding zeros to dataframe

Suppose we call your dataframe "abun.df". Then its columns will be
something like abun.df$location, abun.df$name, abun.df$abundance,
abun.df$trtmt (depending on what you called them in the first place).
From your description, I am presuming that abundance values where a
species was not recorded have no value entered. In that case,
presumably they have gone into abun.df$abundance as "NA". You can
check this with a command like

  sum(is.na(abun.df$abundance))

If you get a positive result, then that is likely to be the case.
As a cross-check:

  sum(abun.df$abundance > 0, na.rm=TRUE)

should give another number which, together with the first, should
add up to the total number of rows in the dataframe.

Assuming, then, that this is the case, the simplest method to set
the non-recorded values to 0 is on the lines of

  ix <- (is.na(abun.df$abundance))
  abun.df$abundance[ix] <- 0

Then you can run the check

  sum(abun.df$abundance == 0)

and you should get the same number as you got earlier, before the
replacement, from

  sum(is.na(abun.df$abundance))
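
The steps above can be tried end to end on a small stand-in for abun.df. This is only a sketch: the values are made up, and it assumes the unrecorded abundances really were read in as NA and that no zeros were recorded beforehand.

```r
# Toy stand-in for abun.df; the values are invented for illustration.
abun.df <- data.frame(
  location  = 1:6,
  abundance = c(4, NA, 1, NA, NA, 9)
)

n.na  <- sum(is.na(abun.df$abundance))             # unrecorded rows
n.pos <- sum(abun.df$abundance > 0, na.rm = TRUE)  # recorded, non-zero rows
# The two counts cover every row only if no zeros were recorded already
stopifnot(n.na + n.pos == nrow(abun.df))

# Replace the NAs with zeros
abun.df$abundance[is.na(abun.df$abundance)] <- 0

# After the replacement, the zero count equals the earlier NA count
stopifnot(sum(abun.df$abundance == 0) == n.na)
```

Saving the NA count in n.na before the replacement matters, because is.na() finds nothing once the zeros are in place.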

Hoping this helps,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 01-May-09                                       Time: 18:39:58
------------------------------ XFMail ------------------------------

Jorge Ivan Velez

2009-May-01 17:48 UTC

[R] adding zeros to dataframe

Dear Cathy,
Try this:

# Some data
set.seed(123)
DF <- data.frame(
       Plot=sample(10),
       Location = sample(10),
       SpeciesName = sample(LETTERS[1:10]),
       SpeciesNumber = sample(10),
       abundance = rnorm(10),
       treatment = sample(letters[1:3],5, replace=TRUE)
       )

# Making some NAs in Plot
DF[c(2,7,10),1] <- NA
DF

# Hopefully what you asked for :-)
DF$abundance[is.na(DF$Plot)] <- 0
DF    # note that rows 2, 7 and 10 in abundance are now zero

See ?is.na for details.

HTH,

Jorge




Mark Wilkinson

2009-May-01 20:58 UTC

[R] adding zeros to dataframe

Hi Cathy,

I interpreted your situation a little differently than the other
responses.  Please ignore this if their suggestions solved your
problem.

I assumed you have abundance where available, but otherwise it wasn't
recorded--not as NA, just unrecorded.  You want to fill in the missing
"rows" with zeros for abundance, for each treatment, for 272 plots
within treatment, for all possible species within a plot.

(I now see from your repost that this is the case.)

R code and comments follow.

## I'll try to reproduce some of your data.  You can ignore this part
## for your code.

## Say there are 5 treatments, 272 plots per treatment, and 10
## *possible* species
set.seed(1001)
your.data <- expand.grid(treatment = c("A", "B", "C", "D", "E"),
                         plot.location = 1:272,
                         species = paste("s", 1:10, sep = ""))
your.data$abundance <- rpois(nrow(your.data), 3)
your.data <- your.data[sample(nrow(your.data), size = 100), ]
row.names(your.data) <- seq(nrow(your.data))

## Your data looks something like this:
print(your.data)

## You need to generate all combinations of values of your variables

## Assuming all are currently represented somewhere in your data set,
(treatments <- unique(your.data$treatment[!is.na(your.data$treatment)]))
plot.locations <- 1:272
## or: plot.locations <- unique(your.data$plot.location[!is.na(your.data$plot.location)])
(species <- unique(your.data$species[!is.na(your.data$species)]))

## The complete data with all species, for all locations, for all
## treatments, present is
complete.data <- expand.grid(tx = treatments, pl = plot.locations,
                             sp = species)

## Put the two together, with NA for unrecorded abundance
your.complete.data <- merge(complete.data, your.data,
                            by.x = c("tx", "pl", "sp"),
                            by.y = c("treatment", "plot.location", "species"),
                            all.x = TRUE)

## Fill in the NAs
your.complete.data$abundance[is.na(your.complete.data$abundance)] <- 0
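
The same expand.grid() plus merge() idea can be checked on a dataset small enough to inspect by eye. This is a minimal sketch under assumed column names (treatment, plot, species, code, abundance) and made-up values; it also shows one possible way to carry the species number code onto the newly added zero rows, which the code above does not cover.

```r
# Hypothetical toy data: 2 treatments x 3 plots x 2 species, partly recorded.
obs <- data.frame(
  treatment = c("A", "A", "B", "B"),
  plot      = c(1L, 3L, 2L, 2L),
  species   = c("s1", "s2", "s1", "s2"),
  code      = c(11, 12, 11, 12),
  abundance = c(5, 2, 7, 1),
  stringsAsFactors = FALSE
)

# Every treatment x plot x species combination
full <- expand.grid(
  treatment = unique(obs$treatment),
  plot      = 1:3,
  species   = unique(obs$species),
  stringsAsFactors = FALSE
)

# Left join on the shared columns: combinations with no observation get NA
filled <- merge(full,
                obs[, c("treatment", "plot", "species", "abundance")],
                all.x = TRUE)
filled$abundance[is.na(filled$abundance)] <- 0

# Re-attach the species code from a species -> code lookup table
codes <- unique(obs[, c("species", "code")])
filled$code <- codes$code[match(filled$species, codes$species)]

# Each species should now have (treatments x plots) rows
print(table(filled$species))
```

Looking the code up per species, rather than merging it in with the abundance, keeps it from coming out as NA on the rows that merge() had to create.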

Hope this helps,
Mark


