similar to: Do you use R for data manipulation?

Displaying 20 results from an estimated 20000 matches similar to: "Do you use R for data manipulation?"

2006 Dec 31
4
Does SQL group by have a heavy duty equivalent in R
I have hundreds of humans who have undergone SNP genotyping at hundreds of loci. Some have even undergone the procedure twice or thrice (kind of an internal control). So obviously I need to find those replications, and confirm that the results are the same. If there is discordance then I need to address it. I tried to use the aggregate function nr.attempts
2006 Nov 29
2
reshape command is (stats) dropping instances
I would really appreciate it if anyone could determine what is going on with the following command. It is only half-working and is losing lots of data. For the life of me I cannot even see the pattern of what it is losing and what it is not. I am attaching the R data set which you can use with the Load Workspace menu function.
2007 Oct 02
2
Calculating proportions from a data frame rather than a table
When one has raw data it is easy to create a table of one variable against another and then calculate proportions. For example: a.nice.table<-table(a,b) prop.table(a.nice.table,1) However, I looked at several papers and created a data frame of the aggregate data. That means I actually created a table except it is a data frame. The first column lists the name of the first author and the year. I
2006 Apr 06
4
Reshaping genetic data from long to wide
Bottom Line Up Front: How does one reshape genetic data from long to wide? I currently have a lot of data. About 180 individuals (some probands/patients, some parents, rare siblings) and SNP data from 6000 loci on each. The standard formats seem to be something along the lines of Famid, pid, fatid, motid, affected, sex, locus1Allele1, locus1Allele2, locus2Allele1, locus2Allele2, etc In other
2009 Jul 19
4
space in column name
I read a table from Microsoft Access using RODBC. Some of the variables had a name with a space in it. R has no problem with it but I do. I cannot find out how to specify the space names(alltime) [1] "ID" "LVL7" "Ref Pv No" "Ref Pv Name" "DOS" "Pt Last Name" "Pt First Name" "MRN"
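For the space-in-name question above, a minimal sketch of the usual ways to address such a column. The small data frame is a hypothetical stand-in for `alltime` (the real one came from RODBC), and the values are made up for illustration.

```r
# Hypothetical stand-in for the 'alltime' data frame read via RODBC
alltime <- data.frame(ID = 1:3)
alltime[["Ref Pv No"]] <- c(11, 12, 13)

# Backticks quote a non-syntactic name after the $ operator
ref_by_dollar <- alltime$`Ref Pv No`

# Double brackets take the name as an ordinary string
ref_by_bracket <- alltime[["Ref Pv No"]]

# Optionally compute syntactic names (spaces become dots)
clean_names <- make.names(names(alltime))
```

Whether to rename the columns or keep the original names and quote them is a matter of taste; renaming loses the link to the Access field names.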
2008 Dec 15
3
Reading from Google Docs
I saw a thread from September 24 in which Duncan Temple Lang told us: - The package currently has no Rd files, but there is a brief "user's guide". The package is available from http://www.omegahat.org/RGoogleDocs I could not find it by using Tinn-R or RGui's package install tool. Then when I went to the website I saw that package is only available as
2008 Apr 25
2
Differentiate alphanumeric vs numeric strings
I have a bunch of tables in a Microsoft Access database. An updated database is sent to me every week containing a new table. I know that is inefficient and weird but welcome to my life. I want to read the tables whose names are something such as "040207" but not the ones that have alphanumeric names such as "everyone". Using RODBC I am easily able to create a character vector
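One possible way to separate the all-digit table names from the rest is a regular expression anchored at both ends. This is only a sketch: the vector `table_names` is hypothetical and stands in for the character vector obtained through RODBC.

```r
# Hypothetical table names; in practice these would come from the database
table_names <- c("040207", "everyone", "041107", "notes2007")

# TRUE only when the whole string consists of digits
is_all_digits <- grepl("^[0-9]+$", table_names)

numeric_tables <- table_names[is_all_digits]
```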
2006 Dec 29
1
Genotypes are not all the same
I have been merrily using the genetics package and more specifically have been using the makeGenotypes and genotypes function. I check my accomplishments by going > class(g2) [1] "genotype" "factor" and likewise > class(g1) [1] "genotype" "factor" Yet when I execute a command such as allele count I get this > allele.count(g1) D I [1,]
2008 Dec 10
2
converting multiple columns from POSIX* to Date
Converting a POSIX class variable to a date class is easy. dates<-as.Date(x) #where x is of class POSIX How does one do that to all columns in a data frame that are of POSIX class and leave all the other columns (integers, factors) as is? Feel free to reply with just one or two buzzwords that I could then search for to find how to do it. Farrel Buchinsky
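A minimal sketch of one approach to the question above: find the columns that inherit from "POSIXt" and convert only those. The data frame `df` and its column names are hypothetical.

```r
# Hypothetical data frame with one POSIXct column among other types
df <- data.frame(
  id = 1:2,
  seen = as.POSIXct(c("2008-12-01 10:00:00", "2008-12-09 15:30:00"), tz = "UTC"),
  group = factor(c("a", "b"))
)

# Logical vector marking the POSIXct/POSIXlt columns
is_posix <- vapply(df, inherits, logical(1), what = "POSIXt")

# Convert only those columns; the others are left untouched
df[is_posix] <- lapply(df[is_posix], as.Date)
```

The buzzwords here would be `inherits`, `vapply`/`sapply`, and `lapply`. Note that `as.Date` on a POSIXct value takes a `tz` argument, which matters for times close to midnight.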
2009 Mar 24
2
two different date formats in the same variable
How does one convert to a date format when survey respondents have used two different date formats whilst entering their data? They were clearly told to use mm/dd/yyyy but, humans being humans, some entered mm/dd/yy. There were even validity checks on the forms but I allowed them to be overridden since the data is more holy than the format. The data was downloaded as a csv and read.csv was used to
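One way to handle the mixed mm/dd/yyyy and mm/dd/yy entries is to detect the four-digit years first and parse each group with its own format. This is a sketch under the assumption that the dates arrive as character strings; the vector `raw` is made up for illustration.

```r
# Hypothetical survey entries mixing the two formats
raw <- c("03/24/2009", "3/5/09", "12/31/2008", "01/02/08")

# TRUE when the string ends in a slash followed by exactly four digits
four_digit_year <- grepl("/[0-9]{4}$", raw)

# Start with missing dates, then fill each group using its own format
parsed <- rep(as.Date(NA), length(raw))
parsed[four_digit_year]  <- as.Date(raw[four_digit_year],  format = "%m/%d/%Y")
parsed[!four_digit_year] <- as.Date(raw[!four_digit_year], format = "%m/%d/%y")
```

With `%y`, two-digit years 00 to 68 are read as 20xx and 69 to 99 as 19xx, so birth dates and similar older values may need an extra check.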
2007 Jan 09
3
dimensions of all objects
Why will the following command not work? sapply(objects(),dim) What does it say about the objects list? What does it say about the dim command? Likewise, the following also does not work: all<-ls() for (f in all) print(dim(f)) -- Farrel Buchinsky
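The likely explanation is that `objects()` and `ls()` return the names of the objects as character strings, so `dim` is applied to each string rather than to the object it names, and a plain string has no dimensions. A minimal sketch that fetches the objects by name first; the three objects are hypothetical.

```r
# Hypothetical objects in the workspace
a_matrix <- matrix(1:6, nrow = 2)
a_frame  <- data.frame(x = 1:3, y = 4:6)
a_vector <- 1:10

obj_names <- c("a_matrix", "a_frame", "a_vector")

# dim() of a name (a character string) is always NULL
dim_of_name <- dim("a_matrix")

# mget() retrieves the objects themselves, so dim() sees the real thing
dims <- lapply(mget(obj_names), dim)
```

In an interactive session `lapply(mget(ls()), dim)` would cover every object in the workspace; plain vectors still give NULL because they have no dim attribute.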
2007 Jun 19
1
genetics package not working
Has something changed in R that requires an update in the genetics package by Gregory Warnes? I am using R version 2.5.0 This used to work > summary(founders[,59]) to prove that it is a genotype class > class(founders[,59]) [1] "genotype" "factor" Now when I issue the command: > summary(founders[,59]) I get: Error in attr(retval, "which") <- which :
2006 Nov 24
1
Sunflower plot error; how to deal with NA
I suspect the problem stems from the fact that there are a couple of NA values. > sunflowerplot(lastoto,maxear) Error in rep.int(i.multi, number[number > 1]) : invalid number of copies in rep.int() So I used the subset command to get rid of the cases with NA hell<-subset(ChinOtoMayB,is.na(lastoto)==FALSE) Then it worked perfectly sunflowerplot(hell$lastoto,hell$maxear) Is
2008 Oct 03
1
Tinn-R explorer used to be my friend
I have upgraded everything lately and can no longer get the Tinn-R explorer to work. I think I have had this problem before but cannot recall how I solved it. I run Tinn-R 2.0.0.7 and Rgui version 2.7.2. When I click on the explorer button I get > trObjList(envir='.GlobalEnv', pattern='', group='', path=.trPaths[3]) Error in trObjList(envir = ".GlobalEnv",
2009 Jan 23
2
forward slash vs double backslash R and Tinn-R
I installed the newest version of R and once again ran into a problem with Tinn-R failing when trying to use the R explorer. I had this problem once before and solved it when I added the following .trPaths = c( 'C:/Documents and Settings/fbuchins/Application Data/Tinn-R/tmp/', 'C:/Documents and Settings/fbuchins/Application Data/Tinn-R/tmp/search.txt', 'C:/Documents and
2009 Dec 10
3
Have you used RGoogleDocs and RGoogleData?
Both of these applications fulfill a great need of mine: to read data directly from google spreadsheets that are private to myself and one or two collaborators. Thanks to the authors. I had been using RGoogleDocs for about 6 months (maybe more) but have had to stop using it in the past month since for some reason that I do not understand it no longer reads google spreadsheets. I loved it. Its
2006 Apr 29
3
Writing responses to the R-Help list
A while back Gabor Grothendieck suggested that I try http://news.gmane.org/gmane.comp.lang.r.general. This was after I asked how to easily reply to posts on the listserve. Ideally I would like the functionality that I find in Microsoft Outlook Express newsreader for usenet groups or what I find in Google Groups. I started using gmane about 3 weeks ago. I find it fantastic for searching and for
2006 May 03
5
Listing Variables
How does one create a vector whose contents is the list of variables in a dataframe pertaining to a particular pattern? This is so simple but I cannot find a straightforward answer. I want to be able to pass the contents of that list to a "for" loop. So let us assume that one has a dataframe whose name is Data. And let us assume one had the height of a group of people measured at
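A minimal sketch of one way to collect the matching variable names and loop over them. The snippet is cut off before the column names appear, so `height1`, `height2` and the other columns below are assumptions made purely for illustration.

```r
# Hypothetical data frame; the real column names are not shown in the post
Data <- data.frame(
  id = 1:3,
  height1 = c(150, 160, 170),
  height2 = c(151, 162, 171),
  weight1 = c(50, 60, 70)
)

# Names of all variables that start with "height"
height_vars <- grep("^height", names(Data), value = TRUE)

# Pass the names to a for loop, here computing one mean per variable
means <- numeric(0)
for (v in height_vars) {
  means[v] <- mean(Data[[v]])
}
```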
2007 Dec 14
6
Analyzing Publications from Pubmed via XML
I would like to track in which journals articles about a particular disease are being published. Creating a pubmed search is trivial. The search provides data but obviously not as an R dataframe. I can get the search to export the data as an xml feed and the xml package seems to be able to read it. xmlTreeParse("
2007 Jan 01
1
Subset by using multiple values
I have a vector containing about 20 unique values. It is called rejectrs$rs. It is a factor. I have a data frame with about 100000 rows. I want to exclude all rows where in variable rs the value is one of the 20 on the exclude list. I thought these would work but neither did. RawSeqBig<-subset(RawSeqBig,ASSAY_ID!=rejectrs$rs) RawSeqBig<-subset(RawSeqBig,ASSAY_ID!=list(rejectrs$rs)) -- Farrel
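The `!=` operator compares element by element (recycling the shorter vector), so it does not test membership in a set. A sketch of the set-membership approach with `%in%`, using small made-up stand-ins for `rejectrs` and `RawSeqBig`:

```r
# Hypothetical stand-ins for the exclude list and the large data frame
rejectrs  <- data.frame(rs = factor(c("rs2", "rs4")))
RawSeqBig <- data.frame(
  ASSAY_ID = c("rs1", "rs2", "rs3", "rs4", "rs2"),
  value = 1:5
)

# Keep only rows whose ASSAY_ID is NOT on the exclude list
RawSeqBig <- subset(RawSeqBig, !(ASSAY_ID %in% rejectrs$rs))
```

`%in%` compares factors by their labels, so it does not matter that `rejectrs$rs` is a factor.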