thr3ads.net - R help - [R] transpose dataset to PC-ORD? [May 2006]

Daniel Gruner

2006-May-23 19:36 UTC

[R] transpose dataset to PC-ORD?

Hello:

I need to take a species-sample matrix and transpose it to the format 
used by PC-ORD for analysis. Unfortunately, the number of species is 
very large (>5000), and so this operation cannot be performed simply 
in an application like Excel, which has a 255 column limit. So, I 
wrote relatively simple code in R that I hoped would do this 
(appended below). But there are glitches.

The format needed for PC-ORD (where "NA" shows an empty cell):

NA,3,sites,NA
NA,3,species,NA
NA,Q,Q,Q
NA,sp1,sp2,sp3
site1,1,0,0
site2,0,1,2
site3,0,3,0

The first row gives the number of samples (rows), the second row 
gives the number of species (columns), the third row indicates the 
variable type (Q = quantitative), and the fourth row shows the column 
headers (species names). So, one can create a transposable matrix in 
a spreadsheet where the 5000+ species are the rows:

NA,NA,NA,NA,site1,site2,site3
3,3,Q,sp1,1,0,0
sites,species,Q,sp2,0,1,3
NA,NA,Q,sp3,0,2,0


It is important that the data file written out is totally clean and 
ready to go for PC-ORD, because I cannot open and edit it in a 
spreadsheet. The code performs the transpose operation and writes the 
file, but now the former row IDs are the first row in the new file 
(NA,1,2,3), and the 4 leading blank cells have become "X, X.1, X.2, 
X.3". I'd like to delete the first row and blank out the first 4 
values of column 1, without deleting the column.

NA,1,2,3
X,3,islands,NA
X.1,3,species,NA
X.2,Q,Q,Q
X.3,sp1,sp2,sp3
site1,1,0,0
site2,0,1,2
site3,0,3,0
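
The stray labels in the output above appear to come from two defaults working together. The sketch below reproduces the first one on a hypothetical copy of the small example (the `csv` vector is made up for illustration and is not the poster's file): with `header=TRUE`, the blank cells in the first line are turned into the names X, X.1, X.2, X.3, and `t()` then carries them into the row names that `write.csv` always writes.

```r
# Hypothetical copy of the species-as-rows spreadsheet, blank cells left empty.
csv <- c(",,,,site1,site2,site3",
         "3,3,Q,sp1,1,0,0",
         "sites,species,Q,sp2,0,1,3",
         ",,Q,sp3,0,2,0")

# header = TRUE turns the first line into column names; the four blank
# names are repaired by make.names(), which yields X, X.1, X.2, X.3.
d <- read.csv(text = csv, header = TRUE, as.is = TRUE)
col_labels <- names(d)

# t() moves the column names into the row names of the transposed
# matrix, and write.csv() writes row names as the first column.
row_labels <- rownames(t(as.matrix(d)))

print(col_labels)
print(row_labels)
```

Because `write.csv` ignores attempts to change `col.names`, the extra header line cannot be switched off there; `write.table` is the function that allows it.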

I have tried various tricks that I will not list/belabor here 
(various col.names, row.names, header, Extract, etc commands). Any 
further hints on code that will either stop R from adding these, or 
strip them at the end?

(PS, yes, I can learn how to do my multivariate analyses in R and skip 
PC-ORD, but I am time limited on this one, and it seems that this 
code could be very useful in numerous ways)

Many thanks for the help,
Dan Gruner
(Windows XP, R vers2.2)



##transpose datasets to convert to PC-ORD format

data<-read.csv("data.csv", header=TRUE, as.is=T,
    strip.white=T, na.strings="NA")
data<-as.matrix(data)
data.trans <- t(data)
write.csv(data.trans, file = "datatransp.csv",
    quote = F, na = "")



*******************************

Daniel S. Gruner, Postdoctoral Scholar
Bodega Marine Lab, University of California -- Davis
PO Box 247, 2099 Westside Rd
Bodega Bay, CA 94923-0247
(o) 707.875.2022  (f) 707.875.2009   (m) 707.338.5722
email:  dsgruner_at_ucdavis.edu
http://www.bml.ucdavis.edu/facresearch/gruner.html
http://www.hawaii.edu/ant/

Jean Eid

2006-May-23 20:30 UTC

[R] transpose dataset to PC-ORD?

I do not know exactly what you are looking for but it seems that you are 
writing the column names (which become row names) when transposing the 
data. So to fix this try using write.table(..., sep=",", row.names=F)


Jean
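
A minimal sketch of how the `write.table` suggestion might be applied end to end. It assumes the input is laid out like the species-as-rows example in the original post; the temporary files and the names `infile`, `raw`, `trans`, `outfile` and `result` are hypothetical. Reading with `header=FALSE` and writing with both `row.names=FALSE` and `col.names=FALSE` is one way to keep R from adding any labels of its own.

```r
# Hypothetical stand-in for the poster's "data.csv": species as rows,
# with the PC-ORD header cells in the first four columns.
infile <- tempfile(fileext = ".csv")
writeLines(c(",,,,site1,site2,site3",
             "3,3,Q,sp1,1,0,0",
             "sites,species,Q,sp2,0,1,3",
             ",,Q,sp3,0,2,0"), infile)

# header = FALSE keeps the first line as data, so no X, X.1, ... names
# are invented; colClasses = "character" leaves every cell as text and
# blank cells as empty strings.
raw <- read.csv(infile, header = FALSE, colClasses = "character",
                strip.white = TRUE)

# Transpose: the 4 x 7 table becomes a 7 x 4 character matrix.
trans <- t(as.matrix(raw))

# Suppress both row and column labels so only the cell contents are written.
outfile <- tempfile(fileext = ".csv")
write.table(trans, file = outfile, sep = ",", quote = FALSE, na = "",
            row.names = FALSE, col.names = FALSE)

result <- readLines(outfile)
print(result)
```

On the three-site example this writes the seven lines of the target layout, with empty cells where the post shows "NA".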


Dave Roberts

2006-May-23 21:46 UTC

[R] transpose dataset to PC-ORD?

Daniel,

     I can help somewhat I think.  PC-ORD also allows data input in what 
it calls "database" format, where each row is

sample, taxon, abundance

There are as many rows per sample as there are non-zero species, and 
only three columns.  To get your taxon data.frame (currently samples as 
rows, species as columns, called data in your example) in that format try

dematrify(data,filename='whatever.csv')

with the function pasted below (watch out for email-altered line 
breaks).  That will create a CSV file you can import into PC-ORD.
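
For comparison, the same "database" layout can be sketched without a loop. This is not the function from the post but a vectorised alternative under the same assumption (samples as rows, species as columns); the example data frame `abund` and the names `m`, `idx`, `long` and `outfile` are hypothetical.

```r
# Hypothetical example data: samples as rows, species as columns.
abund <- data.frame(sp1 = c(1, 0, 0),
                    sp2 = c(0, 1, 3),
                    sp3 = c(0, 2, 0),
                    row.names = c("site1", "site2", "site3"))

m <- as.matrix(abund)

# Row/column positions of every non-zero cell.
idx <- which(m > 0, arr.ind = TRUE)

# One row per non-zero cell: sample, taxon, abundance.
long <- data.frame(sample = rownames(m)[idx[, "row"]],
                   taxon = colnames(m)[idx[, "col"]],
                   abundance = m[idx],
                   stringsAsFactors = FALSE)
long <- long[order(long$sample, long$taxon), ]

# Write the three columns with no header line and no row labels.
outfile <- file.path(tempdir(), "long.csv")
write.table(long, file = outfile, sep = ",", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

print(long)
```

Whether PC-ORD expects a header line in this format is not stated in the thread, so `col.names=FALSE` here is an assumption that simply mirrors the output of the function in the post.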

     Just to encourage you a little, you really should try the Ecology 
packages in R.  See packages vegan, ade4, and labdsv, for example, and 
take a look at

http://ecology.msu.montana.edu/labdsv/R

Dave R.
*********************************************************************
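dematrify <- function (df,filename=NULL,sep=",")
{
     # df: data.frame with samples as rows and taxa as columns
     # locate every non-zero cell as (row, col) index pairs
     tmp <- which(df>0,arr.ind=TRUE)
     stack <- NULL
     samples <- row.names(df)[tmp[,1]]
     taxon <- names(df)[tmp[,2]]
     abund <- rep(NA,nrow(tmp))
     for (i in seq_len(nrow(tmp))) {
         abund[i] <- df[samples[i],taxon[i]]
         # one "sample,taxon,abundance" line per non-zero cell
         stack <- rbind(stack,paste(samples[i],sep,taxon[i],sep,abund[i],"\n",sep=""))
     }
     if (is.null(filename)) {
         # no file requested: return the three columns sorted by sample
         tmp2 <- cbind(samples,taxon,abund)
         tmp2 <- data.frame(tmp2[order(tmp2[,1]),])
         return(tmp2)
     }
     else {
         stack <- sort(stack)
         sink(file=filename)
         # sep="" keeps cat() from putting a space at the start of each line
         cat(stack,sep="")
         sink()
     }
}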

-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
David W. Roberts                                     office 406-994-4548
Professor and Head                                      FAX 406-994-3190
Department of Ecology                         email droberts at montana.edu
Montana State University
Bozeman, MT 59717-3460
