thr3ads.net - similar to: "data.frame question"

Displaying 20 results from an estimated 90000 matches similar to: "data.frame question"

2008 Sep 03

subsetting a data frame

I have a data frame that looks like this:

V1 V2 V3
a  b  0:1:12
d  f  1:2:1
c  d  1:0:9

where V3 is in the form x:y:z. Can someone show me how to subset the rows where the values of x, y and z are all <= 10:

V1 V2 V3
d  f  1:2:1
c  d  1:0:9

Thanks, Joseph
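One possible base R approach to the question above, offered as a sketch rather than the answer given in the thread: split each x:y:z string on the colon and keep the rows where every part is at most 10. The names `parts`, `keep` and `result` are illustrative only.

```r
# Example data taken from the snippet above
df <- data.frame(V1 = c("a", "d", "c"),
                 V2 = c("b", "f", "d"),
                 V3 = c("0:1:12", "1:2:1", "1:0:9"),
                 stringsAsFactors = FALSE)

# Split each "x:y:z" string into its three parts
parts <- strsplit(df$V3, ":", fixed = TRUE)

# TRUE for rows where all three values are <= 10
keep <- vapply(parts, function(p) all(as.numeric(p) <= 10), logical(1))

result <- df[keep, ]
result
```

If V3 is stored as a factor, wrapping it in as.character() before strsplit() should be enough.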

data frame question

2008 Feb 10

Hello I have 2 data frames df1 and df2. I would like to create a new data frame new_df which will contain only the common rows based on the first 2 columns (chrN and start). The column score in the new data frame should be replaced with a column containing the average score (average_score) from df1 and df2. df1= data.frame(chrN= c("chr1", "chr1", "chr1", "chr1", "chr2", "chr2", "chr2"),
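A minimal sketch of how this could be done with merge(), which by default keeps only the rows whose key columns appear in both inputs. The data below is made up for illustration, since the original example is cut off; only the column names chrN, start, score and average_score come from the question.

```r
# Hypothetical data; the values in the original post are truncated
df1 <- data.frame(chrN = c("chr1", "chr1", "chr2"),
                  start = c(10, 20, 30),
                  score = c(1, 2, 3),
                  stringsAsFactors = FALSE)
df2 <- data.frame(chrN = c("chr1", "chr2", "chr2"),
                  start = c(10, 30, 40),
                  score = c(3, 5, 7),
                  stringsAsFactors = FALSE)

# Inner join on the two key columns; the clashing score columns get suffixes
both <- merge(df1, df2, by = c("chrN", "start"), suffixes = c(".1", ".2"))

# Replace the two score columns with their average
new_df <- data.frame(chrN = both$chrN,
                     start = both$start,
                     average_score = (both$score.1 + both$score.2) / 2)
new_df
```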

difference of two data frames

2008 Sep 14

Hello I have 2 data frames DF1 and DF2 where DF2 is a subset of DF1: DF1= data.frame(V1=1:6, V2= letters[1:6]) DF2= data.frame(V1=1:3, V2= letters[1:3]) How do I create a new data frame of the difference between DF1 and DF2 newDF=data.frame(V1=4:6, V2= letters[4:6]) In my real data, the rows are not in order as in the example I provided. Thanks much Joseph
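One way this could be approached in base R, sketched under the assumption that a row counts as "the same" when all of its columns match: build a single key string per row and keep the rows of DF1 whose key does not occur in DF2. Row order does not matter with this approach. The names `key1` and `key2` are illustrative.

```r
DF1 <- data.frame(V1 = 1:6, V2 = letters[1:6], stringsAsFactors = FALSE)
DF2 <- data.frame(V1 = 1:3, V2 = letters[1:3], stringsAsFactors = FALSE)

# Collapse every row into one string, using a separator unlikely to occur in the data
key1 <- do.call(paste, c(DF1, sep = "\r"))
key2 <- do.call(paste, c(DF2, sep = "\r"))

# Keep the rows of DF1 that have no counterpart in DF2
newDF <- DF1[!(key1 %in% key2), ]
newDF
```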

merging more than 2 data frames

2008 Feb 12

Hi, merge() takes only 2 data frames. What can you do to make it take more than two data frames? Or is there another function that does that? Thanks, joseph
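A common base R idiom for this, shown here as a sketch with made-up data, is to fold merge() over a list of data frames with Reduce(). The data frames and the key column `id` are hypothetical.

```r
# Hypothetical data frames sharing a key column "id"
df_a <- data.frame(id = 1:3, height = c(10, 20, 30))
df_b <- data.frame(id = 1:3, label = c("x", "y", "z"), stringsAsFactors = FALSE)
df_c <- data.frame(id = 2:4, flag = c(TRUE, FALSE, TRUE))

# Merge the first two, then merge that result with the third, and so on
merged <- Reduce(function(x, y) merge(x, y, by = "id"),
                 list(df_a, df_b, df_c))
merged
```

Because merge() defaults to an inner join, only ids present in every data frame survive; passing all = TRUE inside the function would keep unmatched rows instead.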

remove column names from a data frame

2008 Feb 18

I want to remove the column names from a data frame. I do it the long way; can anybody show me a better way?

df= data.frame(chrN= c("chr1", "chr2", "chr3"), start= c(1, 2, 3), end= c(4, 5, 6), score= c(7, 8, 9))
df
#I write a txt file without row or column names
write.table(df,"df1.txt",sep='\t',quote=FALSE,row.names=F,col.names=F)
#then I read it with the header = F

inserting text lines in a data frame

2008 Feb 06

Hi Jim, I am trying to prepare a bed file to load as a custom track on the UCSC genome browser. I have a data frame that looks like the one below.

> x
     V1    V2 V3
1  chr1 11255 55
2  chr1 11320 29
3  chr1 11400 45
4  chr2 21680 35
5  chr2 21750 84
6  chr2 21820 29
7  chr2 31890 46
8  chr3 32100 29
9  chr3 52380 29
10 chr3 66450 46

I would like to insert the following 4 lines at the beginning:

64-bit R on Mac OS X 10.5.4

2008 Jul 27

Hi Matt Your method is the easiest way for me to install the 64-bit R. I followed the directions on your web site and then did the following: R --arch=x86_64 source("http://bioconductor.org/biocLite.R") biocLite(type = "source",lib = "/Library/Frameworks/R.framework/Versions/2.8/Resources/RLib64") I got many errors and warnings which I copied to the attached file.

counting identical data in a column

2008 Feb 04

Hi Peter, I have the following data frame with chromosome name, start and end positions:

  chrN     start       end
1 chr1  11122333  11122633
2 chr1  11122333  11122633
3 chr3  11122333  11122633
8 chr3 111273334 111273634
7 chr2  12122334  12122634
4 chr1  21122377  21122677
5 chr2  33122355  33122655
6 chr2  33122355  33122655

I would like to count the positions that have the same start and

naming components of a list

2008 May 25

Hi, I have a character vector with thousands of names which looks like this:

> V=c("Fred", "Mary", "SAM")
> V
[1] "Fred" "Mary" "SAM"
> class(V)
[1] "character"

I would like to change it to a list:

> L=as.list(V)
> L
[[1]]
[1] "Fred"

[[2]]
[1] "Mary"

[[3]]
[1] "SAM"

but I need to

converting a data frame column from character to numeric

2008 Feb 08

I have a data.frame with all character columns; I would like to convert the last two columns into numeric.

> x[1:5, ]
   chrN    start      end
1  chr1 71310034 71310064
2 chr14 23354088 23354118
3 chr14 71310034 71310064
4 chr15 37759058 37759088
5 chr22 18262638 18262668
> apply(x, 2, FUN = mode)
chrN start end
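A short sketch of one way to do the conversion in base R, assuming the columns really are character (not factor): apply as.numeric() to the chosen columns and assign the result back. Only the first two rows of the example are reproduced here.

```r
# First two rows of the example, stored as character on purpose
x <- data.frame(chrN = c("chr1", "chr14"),
                start = c("71310034", "23354088"),
                end = c("71310064", "23354118"),
                stringsAsFactors = FALSE)

# Convert only the selected columns; the others are left untouched
cols <- c("start", "end")
x[cols] <- lapply(x[cols], as.numeric)

sapply(x, mode)
```

If the columns are factors instead, as.numeric(as.character(col)) avoids getting the internal level codes.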

data.frame or list

2008 Apr 03

Dear R list, I'm having difficulties in choosing between a list or a data.frame, or an array for the storage and manipulation of my data (example follows). I've been using the three for different purposes but I would rather like to know which is more adapted to what task. Here is the data I'm currently working on: 200 observations, each observation being a vector of length

counting sequence mismatches

2008 Feb 23

Hello I have 2 columns of short sequences that I would like to compare and count the number of mismatches and record the number of mismatches in a new column. The sequences are part of a data frame that looks like this: seq1=c("CGGTGTAGAGGAAAAAAAGGAAACAGGAGTTC","CGGTGGTCAGTCTGGGACCTGGGCAGCAGGCT", "CGGGCCTCTCGGCCTGCAGCCCCCAACAGCCA")
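A minimal sketch of a position-by-position comparison in base R, assuming the two sequences in each row have the same length. The short sequences and the helper `count_mismatches` are hypothetical; the second column of the original data is not shown in the snippet.

```r
# Hypothetical short sequences, used in place of the truncated example data
seqs <- data.frame(seq1 = c("ACGT", "AAAA", "GGCC"),
                   seq2 = c("ACGA", "AAAA", "CCGG"),
                   stringsAsFactors = FALSE)

# Count positions where the two strings differ (assumes equal lengths)
count_mismatches <- function(a, b) {
  mapply(function(s, t) {
    sum(strsplit(s, "")[[1]] != strsplit(t, "")[[1]])
  }, a, b, USE.NAMES = FALSE)
}

seqs$mismatches <- count_mismatches(seqs$seq1, seqs$seq2)
seqs
```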

transforming one column into 2 columns

2008 Feb 02

Hello, I have a data frame and one of its columns is as follows:

Col
chr1:71310034
chr14:23354088
chr15:37759058
chr22:18262638
chrUn:31337214
chr10_random:4369261
chrUn:3545097

I would like to get rid of the colon (:) and replace this column with two new columns containing the terms on each side of the colon. The new columns should look as follows:

Col_a Col_b
chr1
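One possible way to do the split in base R, sketched with three of the values from the question: strsplit() on the colon, bind the pieces into a two-column matrix, then assign the columns. Treating Col_b as numeric is an assumption; the question does not say which type is wanted.

```r
df <- data.frame(Col = c("chr1:71310034", "chr14:23354088", "chr10_random:4369261"),
                 stringsAsFactors = FALSE)

# Each element splits into exactly two pieces, so rbind gives a 2-column matrix
parts <- do.call(rbind, strsplit(df$Col, ":", fixed = TRUE))

df$Col_a <- parts[, 1]
df$Col_b <- as.numeric(parts[, 2])  # assumption: the position should be numeric
df$Col <- NULL                      # drop the original combined column
df
```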

Scatter Plot Command Syntax Using Data.Frame Source

2011 Aug 31

I've tried various commands. ?plot, Teetor's book, "R Cookbook", and Mittal's book, "R Graphs Cookbook" without seeing how to write the command to create scatterplots from my data.frame. The structure is: > str(chemdata) 'data.frame': 14886 obs. of 4 variables: $ site : Factor w/ 148 levels "BC-0.5","BC-1",..: 104 145 126 115

"factorise" variables in a data.frame

2009 Apr 03

"factorise" variables in a data.frame

Dear list, I often need to convert several variables from numeric or integer into factors (before plotting, for instance), as in the following example,

d <- data.frame(x = seq(1, 10), y = seq(1, 10), z = rnorm(10), a = letters[1:10])
d2 <- within(d, {
  x = factor(x)
  y = factor(y)
})
str(d)
str(d2)

I'd like to write a function factorise() which takes a data.frame and

drop unused levels in subset.data.frame

2009 Nov 10

Dear list, subset has a 'drop' argument that I had often mistaken for the one in [.factor which removes unused levels. Clearly it doesn't work that way, as shown below,

d <- data.frame(x = factor(letters[1:15]), y = factor(LETTERS[1:3]))
s <- subset(d, y=="A", drop=TRUE)
str(s)
'data.frame': 5 obs. of 2 variables:
 $ x: Factor w/ 15 levels
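In current versions of R, droplevels() can be applied to the subset afterwards to discard unused factor levels. This is a sketch using the data from the question; droplevels() was added to base R after this thread was posted, so it is not necessarily the answer given at the time.

```r
d <- data.frame(x = factor(letters[1:15]), y = factor(LETTERS[1:3]))

# subset() keeps all 15 levels of x even though only 5 values remain
s <- subset(d, y == "A")
nlevels(s$x)

# droplevels() removes the unused levels from every factor column
s2 <- droplevels(s)
nlevels(s2$x)
```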

Beginner data.frame

2010 Jan 12

Hello, I use R 2.10, and I am new in R (I used to use SAS and lately Stata), I am using XP. I have a data which has a data.frame format called x.df (read from a csv file). I want to take from this data observations for which the variable "Code" starts with an "R". I took all the Code and put them into a vector vec<-grep("R[A-Z][A-Z]",x.df$Code,value=TRUE) Then I

tapply within a data.frame: a simpler alternative?

2008 Dec 10

Dear list, I have a data.frame with x, y values and a 3-level factor "group", say. I want to create a new column in this data.frame with the values of y scaled to 1 by group. Perhaps the example below describes it best: > x <- seq(0, 10, len=100) > my.df <- data.frame(x = rep(x, 3), y=c(3*sin(x), 2*cos(x), > cos(2*x)), # note how the y values have a different
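A possible alternative to tapply() here is ave(), which returns a vector of the same length as its input and so can be assigned straight back as a new column. The sketch below assumes "scaled to 1" means dividing by the group maximum; the group labels and the column name `y_scaled` are made up, since the original example is cut off.

```r
x <- seq(0, 10, length.out = 100)
my.df <- data.frame(x = rep(x, 3),
                    y = c(3 * sin(x), 2 * cos(x), cos(2 * x)),
                    group = factor(rep(c("a", "b", "c"), each = 100)))  # hypothetical labels

# Divide each y by the maximum of its own group
my.df$y_scaled <- ave(my.df$y, my.df$group, FUN = function(v) v / max(v))

# Each group's maximum of the scaled values should now be 1
tapply(my.df$y_scaled, my.df$group, max)
```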

Using by() and stacking back sub-data frames to one data frame

2009 Jun 25

Dear all, I have code where I subset a data frame to match entries within levels of a factor (actually, the full script uses three different factors to do that). I'm very happy with the precision with which I can work with R, but since I loop over factor levels, and the data frame is big, the process is slow. So I've been trying to speed up the process using by(), but I got stuck at

"[.data.frame" and lapply

2009 Mar 25

"[.data.frame" and lapply

Dear all, Trying to extract a few rows for each element of a list of data.frames, I'm puzzled by the following behaviour, > d <- lapply(1:4, function(i) data.frame(x=rnorm(5), y=rnorm(5))) > str(d) > > lapply(d, "[", i= c(1)) # fine, this extracts the first columns > lapply(d, "[", j= c(1, 3)) # doesn't do nothing ?! > > library(plyr)
