thr3ads.net - R help - [R] help with data layout [Jul 2008]

If this information is useful, please help other people find it:
Share via:

Iain Gallagher

2008-Jul-17 15:50 UTC

[R] help with data layout

Hello list

I have been given some Excel sheets with data laid like this:

Col1    Col2    
A         3
           2
           3
B         4
           5
           4
C         1
           4
           3

I was hoping to import this into R as a csv and then get the mean and SD for
each letter in column 1.

Could someone give me some guidance on best to approach this?

Thanks

Iain

	[[alternative HTML version deleted]]

Erik Iverson

2008-Jul-17 16:19 UTC

head link

[R] help with data layout

Iain Gallagher wrote:> Hello list
> 
> I have been given some Excel sheets with data laid like this:
> 
> Col1    Col2 A         3 2 3 B         4 5 4 C         1 4 3
> 
> I was hoping to import this into R as a csv and then get the mean and
> SD for each letter in column 1.
> 
> Could someone give me some guidance on best to approach this?
> 
Sure.  Reading in Excel sheets can be done at least a few ways, see the 
R Data Import/Export manual on CRAN.  The only way I have done it is to 
save the Excel sheet as a CSV file, and then use read.csv in R to get a 
data.frame.  One note here is that sometimes the Excel sheet has 
'missing' cells where someone has inserted blanks.  These may get 
written out to the CSV file, you'll have to check.  For example, I've 
seen an Excel sheet with something like 10 rows of data that outputs 
about 100 to the CSV file, mostly all missing.

Anyway, once you have the data.frame, I'd use na.locf from the zoo 
package to 'fill' in the missing Col1 values, and then use an R function
such as ave, tapply, aggregate, or by to do whatever you'd like.

> Thanks
> 
> Iain
> 
> [[alternative HTML version deleted]]
> 
> ______________________________________________ R-help at r-project.org
> mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do
> read the posting guide http://www.R-project.org/posting-guide.html 
> and provide commented, minimal, self-contained, reproducible code.

Henrique Dallazuanna

2008-Jul-17 16:24 UTC

head link

[R] help with data layout

Try this:

#x <- read.csv('your_file.csv')

x$Col1 <- rep(as.character(x$Col1[x$Col1!="" ]),
              each = unique(diff(which(x$Col1 != ""))))
with(x, sapply(c(sd, mean), function(x)tapply(Col2, Col1, x)))

On Thu, Jul 17, 2008 at 12:50 PM, Iain Gallagher
<iaingallagher at btopenworld.com> wrote:> Hello list
>
> I have been given some Excel sheets with data laid like this:
>
> Col1    Col2
> A         3
>           2
>           3
> B         4
>           5
>           4
> C         1
>           4
>           3
>
> I was hoping to import this into R as a csv and then get the mean and SD
for each letter in column 1.
>
> Could someone give me some guidance on best to approach this?
>
> Thanks
>
> Iain
>
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
Henrique Dallazuanna
Curitiba-Paran?-Brasil
25? 25' 40" S 49? 16' 22" O

Stephen Tucker

2008-Jul-17 16:29 UTC

head link

[R] help with data layout

Hi, hope this will help:

txt <- "Col1,    Col2   
A,        3
,         2
,        3
B,        4
,         5
,         4
C,        1
,         4
,         3"

## read data
dat <- read.csv(textConnection(txt),na.string="")
## fill in empty cells with correct category
dat$Col1[] <-
  Reduce(function(x,y) c(x,ifelse(is.na(y),tail(x,1),y)),dat$Col1)
## calculate mean and standard deviation
mat <- t(sapply(split(dat$Col2,f=dat$Col1),function(X)
              c(mean=mean(X),sd=sd(X))))
## look at results (stored in a matrix)> print(mat)      mean        sd
A 2.666667 0.5773503
B 4.333333 0.5773503
C 2.666667 1.5275252



----- Original Message ----
From: Iain Gallagher <iaingallagher at btopenworld.com>
To: r-help at stat.math.ethz.ch
Sent: Thursday, July 17, 2008 8:50:42 AM
Subject: [R] help with data layout

Hello list

I have been given some Excel sheets with data laid like this:

Col1    Col2    
A         3
           2
           3
B         4
           5
           4
C         1
           4
           3

I was hoping to import this into R as a csv and then get the mean and SD for
each letter in column 1.

Could someone give me some guidance on best to approach this?

Thanks

Iain

    [[alternative HTML version deleted]]

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

jim holtman

2008-Jul-17 16:33 UTC

head link

[R] help with data layout

Does this do at least the means for you:
> x <- read.csv(textConnection("Col1 ,   Col2+ A    ,     3
+      ,     2
+      ,     3
+ B    ,     4
+      ,     5
+     ,      4
+ C   ,      1
+     ,      4
+     ,      3"), strip.white=TRUE)> x  Col1 Col2
1    A    3
2         2
3         3
4    B    4
5         5
6         4
7    C    1
8         4
9         3> # replace blanks with NAs in first column
> is.na(x$Col1) <- x$Col1 == ''
> require(zoo)
> x$Col1 <- na.locf(x$Col1)
> x  Col1 Col2
1    A    3
2    A    2
3    A    3
4    B    4
5    B    5
6    B    4
7    C    1
8    C    4
9    C    3> aggregate(x$Col2, list(x$Col1), FUN=mean)  Group.1        x
1       A 2.666667
2       B 4.333333
3       C 2.666667>

On Thu, Jul 17, 2008 at 11:50 AM, Iain Gallagher
<iaingallagher at btopenworld.com> wrote:> Hello list
>
> I have been given some Excel sheets with data laid like this:
>
> Col1    Col2
> A         3
>           2
>           3
> B         4
>           5
>           4
> C         1
>           4
>           3
>
> I was hoping to import this into R as a csv and then get the mean and SD
for each letter in column 1.
>
> Could someone give me some guidance on best to approach this?
>
> Thanks
>
> Iain
>
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

Reasonably Related Threads

Search for more reasonably related threads

R help - Jul 2008 - help with data layout

[R] help with data layout

[R] help with data layout

[R] help with data layout

[R] help with data layout

[R] help with data layout

Reasonably Related Threads