thr3ads.net - similar to: "data frames; matching/merging"

Displaying 20 results from an estimated 70000 matches similar to: "data frames; matching/merging"

Follow-up Question: data frames; matching/merging

2010 Feb 08

Wow.. thanks for the deluge of responses! Aggregate seems like the way to go here. But, suppose that instead of integers in column V2, I actually have dates (and instead of keeping the minimum integer, I want to keep the earliest date): > df =

Need help on dataframe

2013 Jan 05

Dear R users, I ran into a problem when taking means (or other summary statistics) of a big data frame. Suppose we have a data frame:

    ID  V1  V2  V3  V4  ...  V71
    1   6   5   3   2   ...  3
    2   3   2   2   1   ...  1
    3   6   5   3   2   ...  3
    4   12  15  3   2   ...  100

1.8.1 behavior change?

2003 Nov 22

In R versions before 1.8.1 the following fragment worked properly; now (1.8.1) it produces the following warning/error. Any advice appreciated.

    stt <- data.frame()
    # load all datasets into a dataframe
    for (ds in 1:n) {
      stt[ds] <- as.matrix(read.table(fileList[ds]))
    }

    --
    > stt <- data.frame()
    > # load all datasets into a dataframe
    > for (ds in 1:n) {
    + stt[ds] <-

Calculated mean value based on another column bin from dataframe.

2011 Apr 06

Dear list, I have a data frame with two columns as follows.

    > head(dat)
    V1       V2
    0.15624  0.94567
    0.26039  0.66442
    0.16629  0.97822
    0.23474  0.72079
    0.11037  0.83760
    0.14969  0.91312

I want to get the mean of column V2 within each bin of column V1. I wrote the code as follows. It works, but I think this is not the most elegant way. Any suggestions?
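One possible approach to the binning question above, sketched under assumptions: the bin edges in `breaks` are invented for illustration (the post does not state them), and the data are just the six rows shown by `head(dat)`. It uses base R `cut()` to assign bins and `tapply()` to average within them.

```r
# Rows copied from the head(dat) output in the post
dat <- data.frame(
  V1 = c(0.15624, 0.26039, 0.16629, 0.23474, 0.11037, 0.14969),
  V2 = c(0.94567, 0.66442, 0.97822, 0.72079, 0.83760, 0.91312)
)

# Assumed bin edges (hypothetical; choose edges that suit the real data)
breaks <- seq(0.1, 0.3, by = 0.05)

# Assign each V1 value to a bin, then average V2 within each bin
dat$bin <- cut(dat$V1, breaks = breaks)
bin_means <- tapply(dat$V2, dat$bin, mean)
print(bin_means)
```

`cut()` returns a factor, so empty bins would appear in the result as `NA` rather than being dropped silently.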

merging with aggregating

2005 Dec 06

Dear List, I have two data.frames of the following form:

    A:
    n   V1  V2
    1   12  0
    2   10  8
    3   3   8
    4   8   4
    6   7   3
    7   12  0
    8   1   0
    9   18  0
    10  1   0
    13  2   0

    B:
    n   V1  V2
    1   0   2
    2   0   3
    3   1   9
    4   12  8
    5   2   9
    6   2   9
    8   2   0
    10  4   1
    11  7   1
    12  0   1

Now I want to merge those frames into one data.frame, summing up the columns V1 and V2 but not the column n. So the result
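A minimal sketch of one way to do this kind of merge, assuming the intended result keeps every `n` that occurs in either frame and sums V1 and V2 per `n` (the post is cut off before showing the expected result). It stacks the frames with `rbind()` and sums with `aggregate()`.

```r
# Data frames A and B as given in the post
A <- data.frame(n  = c(1, 2, 3, 4, 6, 7, 8, 9, 10, 13),
                V1 = c(12, 10, 3, 8, 7, 12, 1, 18, 1, 2),
                V2 = c(0, 8, 8, 4, 3, 0, 0, 0, 0, 0))
B <- data.frame(n  = c(1, 2, 3, 4, 5, 6, 8, 10, 11, 12),
                V1 = c(0, 0, 1, 12, 2, 2, 2, 4, 7, 0),
                V2 = c(2, 3, 9, 8, 9, 9, 0, 1, 1, 1))

# Stack both frames, then sum V1 and V2 within each value of n
combined <- aggregate(cbind(V1, V2) ~ n, data = rbind(A, B), FUN = sum)
print(combined)
```

Rows whose `n` appears in only one frame pass through with their original values, since the sum is taken over a single row.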

For help in R coding

2011 Jul 01

Dear all, I am doing a project on variant calling using R. I am working on a pileup file. There are 10 columns in my data frame and I want to count the number of A, C, G and T in each row for column 9. Examples of column 9 are given below: .a,g,, .t,t,, .,c,c, .,a,,, .,t,t,t .c,,g,^!. .g,ggg.^!, .$,,,,,.,

Efficient ways of merging data frames

2010 Sep 12

Hi all, I am just wondering if there is a more efficient way of merging two large datasets based on the values of multiple columns, some of which are not numerical. The default merge function in dataframe is very inefficient and the merge function in data.table seems to be faster, but it does not seem to allow keys that are not numerical in nature. Any other suggestion? Thanks a lot!

Aggregate to find majority level of a factor

2007 May 31

I want to use the aggregate function to summarize data by a factor (my field plots), but I want the summary to be the majority level of another factor. For example, given the dataframe:

    Plot1  big
    Plot1  big
    Plot1  small
    Plot2  big
    Plot2  small
    Plot2  small
    Plot3  small
    Plot3  small
    Plot3  small

My desired result would be:

    Plot1  big
    Plot2  small
    Plot3  small

I
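A sketch of one way to get the majority level per plot with `aggregate()`. The column names `plot` and `size` and the helper `majority()` are hypothetical, since the post does not name them. Ties are resolved by `which.max()`, which returns the first maximum.

```r
# Example data from the post; column names are assumptions
plots <- data.frame(
  plot = c("Plot1", "Plot1", "Plot1", "Plot2", "Plot2", "Plot2",
           "Plot3", "Plot3", "Plot3"),
  size = c("big", "big", "small", "big", "small", "small",
           "small", "small", "small")
)

# Hypothetical helper: the most frequent value in x (first one wins on ties)
majority <- function(x) names(which.max(table(x)))

# Apply the helper to 'size' within each 'plot'
result <- aggregate(plots["size"], by = plots["plot"], FUN = majority)
print(result)
```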

Cross-tabulation Question

2008 Sep 29

Hi R, This is a cross tabulation question. Suppose that,

    > d=read.table("clipboard",header=F)
    > d
      V1  V2   V3
      A   One  Apple
      A   One  Cake
      A   One  Cake
      B   One  Apple
      B   One  Apple
      B   One  Apple
    > table(d$V2,d$V3)
           Apple  Cake
      One      4     2

But, I don't want the count to be like the above. Here, it is counting the

function to include factors in summary data frame

2011 Sep 12

Hi all, I have a dataframe that includes data on individuals that are distributed across multiple rows. I have aggregated the data using ddply, but I have columns in the original data frame that are factors ( such as sites "A", "B", and "C") that I would like to include in the new data frame. I have done this in a clunky way using match() and a loop, but am

how to combine data of several csv-files

2007 Jul 30

Hello, I'm looking for a solution for the following problem:
1) I have a folder with several csv files; each contains a set of measurement values
2) The measurements of each file belong to a position in a two-dimensional matrix (let's say "B02.csv" belongs to position 2,2)
3) The size of the matrix is fixed
4) I cannot be sure to have a csv file for each position
5) Each position

aggregate function with a dataframe for both "x" and "by"

2011 Oct 05

I have 2 dataframes. "mydata" contains numerical data. "mybys" contains information on the "group" each row of the data is in. I wish to aggregate each column in mydata using the corresponding column in mybys. Please see the example below. What is a more elegant or "better" way to accomplish this task? Thanks! mydata =

Finding a Diff within a Dataframe columns

2011 Jan 31

Hi, I have a Dataframe.

    A     B     C     D
    0.1   0.7   0.9   0.8
    0.20  0.60  0.80  0.70
    0.40  0.80  0.70  0.76

I need a resultant dataframe

    (A-B)  (C-D)
    -0.6   0.1
    -0.40  0.1
    -0.40  -0.06

Any suggestion would be of great help. Thanks, Ramya

-- View this message in context: http://r.789695.n4.nabble.com/Finding-a-Diff-within-a-Dataframe-columns-tp3247943p3247943.html Sent from
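The column differences asked for above can be sketched with plain vector arithmetic. The result column names `A-B` and `C-D` are a choice made here to mirror the post; `check.names = FALSE` keeps them from being rewritten to syntactic names.

```r
# Data frame from the post
df <- data.frame(A = c(0.1, 0.20, 0.40),
                 B = c(0.7, 0.60, 0.80),
                 C = c(0.9, 0.80, 0.70),
                 D = c(0.8, 0.70, 0.76))

# Element-wise differences between the column pairs
res <- data.frame("A-B" = df$A - df$B,
                  "C-D" = df$C - df$D,
                  check.names = FALSE)
print(res)
```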

Aggregating variables of a data frame

2012 Mar 20

Hello everyone. I would like to know whether there is a more appropriate way to do this: I have a data frame of 40 variables, one of which is a date. What I want is a table aggregated by sum, using this date variable as the grouping criterion. In SQL it would be something like: select fecha, sum(v1), sum(v2)..., sum(v39) from tabla group by fecha; my problem is that the vn can be of changing dimension, that is,
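A minimal sketch of the SQL-style grouping in R, assuming every column other than `fecha` is numeric. The toy data frame `tabla` is invented for illustration. The formula `. ~ fecha` sums whatever other columns are present, so the number of `vn` columns can change without editing the call.

```r
# Hypothetical toy data; the real table has about 40 columns
tabla <- data.frame(
  fecha = as.Date(c("2012-03-01", "2012-03-01", "2012-03-02")),
  v1 = c(1, 2, 3),
  v2 = c(10, 20, 30)
)

# Sum every remaining column within each date
totals <- aggregate(. ~ fecha, data = tabla, FUN = sum)
print(totals)
```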

Efficient way to do a merge in R

2011 Oct 03

Dear all, I am new to R and have run into the following problem, which slows me down a lot. I am short of ideas to circumvent it, so any help would be highly appreciated: I have 2 dataframes x and y. x is very big (70 million observations), whereas y is smaller (300000 observations). All the observations of y are present in x. But y has one additional variable that I would like to

Help with aggregate and cor

2010 Mar 10

Hello, I do not understand the correct way to approach the following problem in R. I have observations of pairs of variables, v1, o1, v2, o2, etc, observed every 30 seconds. What I would like to do is compute the correlation matrix, but not for all my data, just for, say 5 minutes or 1 hour chunks. In sql, what I would say is select id, date_trunc('hour'::text, ts) as tshour,

difference of two data frames

2008 Sep 14

Hello, I have 2 data frames DF1 and DF2 where DF2 is a subset of DF1:

    DF1= data.frame(V1=1:6, V2= letters[1:6])
    DF2= data.frame(V1=1:3, V2= letters[1:3])

How do I create a new data frame of the difference between DF1 and DF2?

    newDF=data.frame(V1=4:6, V2= letters[4:6])

In my real data, the rows are not in order as in the example I provided. Thanks much, Joseph
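One way to sketch the row difference, assuming a row counts as "in DF2" when both V1 and V2 match. The rows are compared through a pasted key, so row order does not matter. The names `key1` and `key2` and the separator are choices made here, and the separator is assumed not to occur in the data.

```r
DF1 <- data.frame(V1 = 1:6, V2 = letters[1:6])
DF2 <- data.frame(V1 = 1:3, V2 = letters[1:3])

# Build one key per row from all columns (separator assumed absent from the data)
key1 <- paste(DF1$V1, DF1$V2, sep = "\r")
key2 <- paste(DF2$V1, DF2$V2, sep = "\r")

# Keep the rows of DF1 whose key does not occur in DF2
newDF <- DF1[!(key1 %in% key2), ]
print(newDF)
```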

factors

2011 May 05

Hi, I'm requesting you don't berate me for asking this question: I clearly don't have the gist of factors. I have two dataframes, A and B. Each of them has a column containing strings (they're labels). I want to, one-by-one in a loop, compare the particular string in an entry from dataframe A to an entry in B, to see if they're the same. The problem, when posing the

Change data frame column names

2009 Jul 15

Hi R helpers, I have a data frame and I want to change the column names to names I have held in another data frame, but I am having difficulty. All data frames are large so I can't input them manually. Below is what I have tried so far:

    df<-data.frame(a=1:5, b=2:6, d=3:7, e=4:8)
    coltitles<-data.frame(v1="col number one", v2="col number two", v3="col number

more dates and data frames

2010 Jun 08

Dear R People: So thanks to your help, I have the following:

    > dog3.df <- read.delim("c:/Users/erin/Documents/dog1.txt",header=FALSE,sep="\t")
    > dog3.df
       V1        V2
    1  1/1/2000  dog
    2  1/1/2000  cat
    3  1/1/2000  tree
    4  1/1/2000  dog
    5  1/2/2000  cat
    6  1/2/2000  cat
    7  1/2/2000  cat
    8  1/2/2000  tree
    9  1/3/2000  dog
    10 1/3/2000  tree
    11 1/6/2000  dog
    12 1/6/2000
