Thilini Maddegoda Vidanelage
2017-Mar-07 01:16 UTC
[R] FOR TAKING PERCENTAGES of OTUS in each column (n=2910 COLUMNs)
Hi,

I am analyzing a huge Excel table of OTUs. The table has 2910 columns and 365 rows. Each column represents one individual (n=2910), and each row represents a microbial species (n=365). I have the total of all OTUs across microbial species under each column, and I need to get the percentage of each species in each individual. I started to do this in Excel, but I would have to repeat it 2910 times, which is going to be very time-consuming. I am sure there is a smarter way to do this, and I am wondering whether there is an R script for it. Any help is much appreciated.

Many thanks,
Thilini

Thilini Jayasinghe
PhD Candidate
Liggins Institute
The University of Auckland
Building 503/201, 85 Park Road, Grafton, Auckland 2023
Mobile: +64 220211604
Email: tmad109 at aucklanduni.ac.nz
Jeff Newmiller
2017-Mar-07 06:46 UTC
[R] FOR TAKING PERCENTAGES of OTUS in each column (n=2910 COLUMNs)
If your problem really requires genetics jargon to be expressed, then perhaps you should be asking it in a forum where more of the participants are likely to understand it... like the Bioconductor help forum? https://www.bioconductor.org/help/support/

The Posting Guide mentioned in the footer has quite a lot of helpful context for the kinds of questions that are appropriate here... I recommend reading it.

--
Sent from my phone. Please excuse my brevity.

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Jim Lemon
2017-Mar-07 08:24 UTC
[R] FOR TAKING PERCENTAGES of OTUS in each column (n=2910 COLUMNs)
Hi Thilini,

It is fairly simple in R once you have imported the data. Say you have a data frame obtained by exporting the Excel table to CSV and then importing it with "read.csv". I'm not sure whether you have a number in each cell or just a 0/1 absent/present value, but it may not matter. Assume the data frame is named "tjdf":

    for(column in 1:dim(tjdf)[2])
      tjdf[,paste("pct",column,sep="")]<-100*tjdf[,column]/sum(tjdf[,column])

Alternatively, you could create a new data frame with just the percentages.

Jim
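Jim's closing suggestion, a separate data frame holding only the percentages, could look something like the sketch below. The toy data, the species and individual names, and the name "tjpct" are made up for illustration; only "tjdf" comes from the post, and the real table is assumed to contain nothing but numeric counts.

```r
# Toy stand-in for the imported table: rows are species, columns are individuals.
# The names and counts are invented for this example.
tjdf <- data.frame(
  ind1 = c(10, 30, 60),
  ind2 = c(5, 0, 15),
  row.names = c("sp_a", "sp_b", "sp_c")
)

# Start from a copy so the result keeps the same row and column names,
# then overwrite every column with its percentage of the column total.
tjpct <- tjdf
tjpct[] <- lapply(tjdf, function(counts) 100 * counts / sum(counts))

print(tjpct)
print(colSums(tjpct))  # each column should total 100
```

Unlike the loop, this leaves "tjdf" untouched, so the raw counts and the percentages stay in separate objects.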
David L Carlson
2017-Mar-07 13:41 UTC
[R] FOR TAKING PERCENTAGES of OTUS in each column (n=2910 COLUMNs)
If you read your data into R, it is simple to compute the percentages. Use Save As in Excel to save your data as a .csv (comma-separated values) file and put it in the default directory that R is using (this depends on what operating system you are using). Then use read.csv() to create a data frame in R, as Jim indicated:

    raw_data <- read.csv("YourData.csv")

You may need to add some arguments to read.csv() depending on whether you have column headings. Blank fields in Excel will be interpreted as missing values, not zeros, but you did not give us any of your data (even just the first 10 rows and columns), so it is impossible to be more specific.

Once you have the data frame (and have replaced the missing values with zeros if necessary), the process is simple:

    pct_data <- prop.table(as.matrix(raw_data), 2) * 100

This will produce a matrix with percentages down each column and store it as a matrix object (variable) called pct_data. R uses different methods to store different kinds of data. The read.csv() function creates a data frame, which can handle a mixture of character and numeric data, but the prop.table() function only accepts a matrix of numeric data and returns a matrix of numeric data. The data you described is all numeric, so it is easy to switch the data frame to a matrix (and then back again if you want).

If you are going to use R, you will need to spend some time reading about how it works, but as you can see, that time invested will make some operations much simpler than in Excel and will allow you to conduct analyses that Excel does not even attempt.
You can get details on these three functions by running the following commands in R:

    ?read.csv
    ?prop.table
    ?as.matrix

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352
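The steps David describes (import, zero-fill, prop.table(), and back to a data frame) can be sketched end to end as follows. This is only an illustration under stated assumptions: the CSV text is invented and stands in for "YourData.csv", the first column is assumed to hold species names, and blank cells are assumed to mean zero counts. If the exported sheet still contains the row of column totals mentioned in the original question, it would need to be dropped first, otherwise every percentage comes out halved.

```r
# Made-up CSV text standing in for "YourData.csv"; the first column holds
# species names and one cell is blank, as an exported Excel sheet might be.
csv_text <- "species,ind1,ind2
sp_a,10,5
sp_b,30,
sp_c,60,15"

# row.names = 1 keeps the species names out of the numeric data.
raw_data <- read.csv(text = csv_text, row.names = 1)

# Blank cells arrive as NA; treat them as zero counts (an assumption).
raw_data[is.na(raw_data)] <- 0

# Column percentages, using the same prop.table() call as in the post.
pct_data <- prop.table(as.matrix(raw_data), 2) * 100

# Back to a data frame if that is more convenient downstream.
pct_df <- as.data.frame(pct_data)

print(round(pct_df, 1))
print(colSums(pct_df))  # every column should total 100
```

With a real file, the `text = csv_text` argument would be replaced by the file name, and the final `colSums()` check is a quick way to confirm that no totals row or stray NA has slipped through.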