Displaying 20 results from an estimated 70000 matches similar to: "data frames; matching/merging"
2010 Feb 08
1
Follow-up Question: data frames; matching/merging
Wow.. thanks for the deluge of responses!
Aggregate seems like the way to go here.
But, suppose that instead of integers in column V2, I actually have
dates (and instead of keeping the minimum integer, I want to keep the
earliest date):
> df =
2013 Jan 05
5
Need help on dataframe
Dear R users, I ran into a problem when taking means (or other summary
statistics) of a big dataframe.
Suppose we have a dataframe:
ID V1 V2 V3 V4 ........................ V71
1 6 5 3 2 ........................ 3
2 3 2 2 1 ........................ 1
3 6 5 3 2 ........................ 3
4 12 15 3 2 ........................ 100
2003 Nov 22
3
1.8.1 behavior change?
In R versions before 1.8.1 the following fragment worked properly; now (in 1.8.1)
it produces the following warning/error.
Any advice appreciated.
stt <- data.frame()
# load all datasets into a dataframe
for (ds in 1:n) {
stt[ds] <- as.matrix(read.table(fileList[ds]))
}
--
> stt <- data.frame()
> # load all datasets into a dataframe
> for (ds in 1:n) {
+ stt[ds] <-
2011 Apr 06
3
Calculated mean value based on another column bin from dataframe.
Dear list,
I have a dataframe with two columns, as follows.
> head(dat)
V1 V2
0.15624 0.94567
0.26039 0.66442
0.16629 0.97822
0.23474 0.72079
0.11037 0.83760
0.14969 0.91312
I want to get the mean of column V2 for each bin of column V1.
I wrote the code as follows. It works, but I don't think it is an
elegant approach. Any suggestions?
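One possible way to compute a per-bin mean is to label each V1 value with `cut()` and then average V2 within each label using `tapply()`. This is only a sketch: the bin edges (0.1, 0.2, 0.3) are an assumption, since the post's own code and bin definition are not shown in the excerpt.

```r
# Sample rows copied from head(dat) in the question.
dat <- data.frame(
  V1 = c(0.15624, 0.26039, 0.16629, 0.23474, 0.11037, 0.14969),
  V2 = c(0.94567, 0.66442, 0.97822, 0.72079, 0.83760, 0.91312)
)

# Assumed bin edges; replace with whatever bins are actually needed.
bins <- cut(dat$V1, breaks = c(0.1, 0.2, 0.3))

# Mean of V2 within each V1 bin.
bin_means <- tapply(dat$V2, bins, mean)
print(bin_means)
```

`aggregate(V2 ~ bins, data = dat, FUN = mean)` would give the same numbers as a data frame rather than a named vector.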
2005 Dec 06
3
merging with aggregating
Dear List,
I have two data frames of the following form:
A:
n V1 V2
1 12 0
2 10 8
3 3 8
4 8 4
6 7 3
7 12 0
8 1 0
9 18 0
10 1 0
13 2 0
B:
n V1 V2
1 0 2
2 0 3
3 1 9
4 12 8
5 2 9
6 2 9
8 2 0
10 4 1
11 7 1
12 0 1
Now I want to merge those frames into one data.frame, summing up the
columns V1 and V2 but not the column n. So the result
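A minimal sketch of one reading of this request: stack the two frames with `rbind()` and sum V1 and V2 for each value of n with `aggregate()`. The desired result is cut off in the excerpt, so the assumption here is that rows sharing the same n should be added together and rows present in only one frame should be kept unchanged.

```r
# Data copied from the question.
A <- data.frame(n  = c(1, 2, 3, 4, 6, 7, 8, 9, 10, 13),
                V1 = c(12, 10, 3, 8, 7, 12, 1, 18, 1, 2),
                V2 = c(0, 8, 8, 4, 3, 0, 0, 0, 0, 0))
B <- data.frame(n  = c(1, 2, 3, 4, 5, 6, 8, 10, 11, 12),
                V1 = c(0, 0, 1, 12, 2, 2, 2, 4, 7, 0),
                V2 = c(2, 3, 9, 8, 9, 9, 0, 1, 1, 1))

# Stack both frames, then sum V1 and V2 within each n.
combined <- aggregate(cbind(V1, V2) ~ n, data = rbind(A, B), FUN = sum)
print(combined)
```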
2011 Jul 01
13
For help in R coding
Dear all,
I am doing a project on variant calling using R. I am working on a pileup file. There are 10 columns in my data frame, and I want to count the number of A, C, G and T in each row for column 9. An example of column 9 is given below:
.a,g,,
.t,t,,
.,c,c,
.,a,,,
.,t,t,t
.c,,g,^!.
.g,ggg.^!,
.$,,,,,.,
2010 Sep 12
2
Efficient ways of merging data frames
Hi all,
I am just wondering if there is a more efficient way of merging two large
datasets based on the values of multiple columns, some of which are not
numerical.
The default merge function in dataframe is very inefficient and the merge
function in data.table seems to be faster, but it does not seem to allow
keys that are not numerical in nature.
Any other suggestion?
Thanks a lot!
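For reference, base R's `merge()` accepts several key columns through its `by` argument, and those keys may be character columns. The sketch below uses made-up data frames (`left`, `right`) and column names (`site`, `year`) purely for illustration; it shows the multi-column call, not a speed-up.

```r
# Hypothetical example data with one character key and one numeric key.
left <- data.frame(site = c("A", "A", "B"),
                   year = c(2009, 2010, 2009),
                   x    = c(1.5, 2.5, 3.5),
                   stringsAsFactors = FALSE)
right <- data.frame(site = c("A", "B", "B"),
                    year = c(2010, 2009, 2010),
                    y    = c(10, 20, 30),
                    stringsAsFactors = FALSE)

# Inner join on both key columns; only rows matching on site AND year remain.
merged <- merge(left, right, by = c("site", "year"))
print(merged)
```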
2007 May 31
4
Aggregate to find majority level of a factor
I want to use the aggregate function to summarize data by a factor (my
field plots), but I want the summary to be the majority level of another
factor.
For example, given the dataframe:
Plot1 big
Plot1 big
Plot1 small
Plot2 big
Plot2 small
Plot2 small
Plot3 small
Plot3 small
Plot3 small
My desired result would be:
Plot1 big
Plot2 small
Plot3 small
I
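A rough sketch of one way to get the majority level per plot: pass `aggregate()` a small helper that tabulates the values in each group and returns the most frequent one. The column names `plot` and `size` and the helper name `majority` are assumptions, since the question shows the data without headers. Ties go to whichever level `table()` lists first.

```r
# Data copied from the question; column names are assumed.
d <- data.frame(
  plot = rep(c("Plot1", "Plot2", "Plot3"), each = 3),
  size = c("big", "big", "small",
           "big", "small", "small",
           "small", "small", "small")
)

# Hypothetical helper: most frequent value in x.
majority <- function(x) names(which.max(table(x)))

# One row per plot, holding the majority size.
result <- aggregate(d["size"], by = d["plot"], FUN = majority)
print(result)
```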
2008 Sep 29
3
Cross-tabulation Question
Hi R,
This is a cross tabulation question. Suppose that,
> d=read.table("clipboard",header=F)
> d
V1 V2 V3
A One Apple
A One Cake
A One Cake
B One Apple
B One Apple
B One Apple
> table(d$V2,d$V3)
Apple Cake
One 4 2
But, I don't want the count to be like the above. Here, it is counting
the
2011 Sep 12
2
function to include factors in summary data frame
Hi all,
I have a dataframe that includes data on individuals that are distributed
across multiple rows. I have aggregated the data using ddply, but I have
columns in the original data frame that are factors ( such as sites "A",
"B", and "C") that I would like to include in the new data frame. I have
done this in a clunky way using match() and a loop, but am
2007 Jul 30
4
how to combine data of several csv-files
Hello,
I'm looking for a solution for the following problem:
1) I have a folder with several csv files; each contains a set of
measurement values
2) The measurements of each file belong to a position in a
two-dimensional matrix (let's say "B02.csv" belongs to position 2,2)
3) The size of the matrix is fixed
4) I cannot guarantee that there is a csv file for each position
5) Each position
2011 Oct 05
2
aggregate function with a dataframe for both "x" and "by"
I have 2 dataframes. "mydata" contains numerical data. "mybys" contains
information on the "group" each row of the data is in. I wish to aggregate
each column in mydata using the corresponding column in mybys.
Please see the example below. What is a more elegant or "better" way to
accomplish this task?
Thanks!
mydata =
2011 Jan 31
5
Finding a Diff within a Dataframe columns
Hi,
I have a Dataframe.
A B C D
0.1 0.7 0.9 0.8
0.20 0.60 0.80 0.70
0.40 0.80 0.70 0.76
I need a resultant dataframe
(A-B) (C-D)
-0.6 0.1
-0.40 0.1
-0.40 -0.06
Any suggestions would be a great help
Thanks
Ramya
--
View this message in context: http://r.789695.n4.nabble.com/Finding-a-Diff-within-a-Dataframe-columns-tp3247943p3247943.html
Sent from
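The requested result can be sketched by subtracting the columns directly and building a new data frame from the differences. The variable names `df` and `result` are assumptions; `check.names = FALSE` is used so the column labels can match the "(A-B)" and "(C-D)" headings in the question.

```r
# Data copied from the question.
df <- data.frame(A = c(0.1, 0.20, 0.40),
                 B = c(0.7, 0.60, 0.80),
                 C = c(0.9, 0.80, 0.70),
                 D = c(0.8, 0.70, 0.76))

# Element-wise column differences, kept under the labels used in the post.
result <- data.frame("(A-B)" = df$A - df$B,
                     "(C-D)" = df$C - df$D,
                     check.names = FALSE)
print(result)
```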
2012 Mar 20
3
Aggregating variables of a dataframe
Hi everyone. I would like to know whether there is a more appropriate way to do this: I have a dataframe with 40 variables, and one of them is a date. What I want is a table aggregated by sum, using this date variable as the grouping criterion. In SQL it would be something like this:
select fecha, sum(v1), sum(v2)..., sum(v39) from tabla group by fecha;
My problem is that the vn can vary in dimension, that is,
2011 Oct 03
1
Efficient way to do a merge in R
Dear all,
I am new to R and I have run into the following problem, which slows
me down a lot. I am out of ideas for getting around it, so any help would be
highly appreciated:
I have 2 dataframes x and y. x is very big (70 million observations),
whereas y is smaller (300000 observations).
All the observations of y are present in x. But y has one additional
variable that I would like to
2010 Mar 10
3
Help with aggregate and cor
Hello,
I do not understand the correct way to approach the following problem
in R.
I have observations of pairs of variables, v1, o1, v2, o2, etc,
observed every 30 seconds. What I would like to do is compute the
correlation matrix, but not for all my data, just for, say 5 minutes
or 1 hour chunks.
In sql, what I would say is
select id, date_trunc('hour'::text, ts) as tshour,
2008 Sep 14
5
difference of two data frames
Hello
I have 2 data frames DF1 and DF2 where DF2 is a subset of DF1:
DF1= data.frame(V1=1:6, V2= letters[1:6])
DF2= data.frame(V1=1:3, V2= letters[1:3])
How do I create a new data frame of the difference between DF1 and DF2
newDF=data.frame(V1=4:6, V2= letters[4:6])
In my real data, the rows are not in order as in the example I provided.
Thanks much
Joseph
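One base-R sketch of such a difference: build a key string for every row of each data frame and keep the rows of DF1 whose key does not appear in DF2. This does not depend on row order. The separator "\r" is an assumption, chosen as a character unlikely to occur in the data; `key1` and `key2` are illustrative names.

```r
# Data frames copied from the question.
DF1 <- data.frame(V1 = 1:6, V2 = letters[1:6])
DF2 <- data.frame(V1 = 1:3, V2 = letters[1:3])

# One key string per row, built from all columns.
key1 <- do.call(paste, c(DF1, sep = "\r"))
key2 <- do.call(paste, c(DF2, sep = "\r"))

# Keep rows of DF1 that have no matching row in DF2.
newDF <- DF1[!(key1 %in% key2), ]
rownames(newDF) <- NULL
print(newDF)
```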
2011 May 05
3
factors
Hi, I'm requesting you don't berate me for asking this question:
I clearly don't have the gist of factors.
I have two dataframes, A and B.
Each of them has a column containing strings (they're labels).
I want to, one-by-one in a loop, compare the particular string in an entry from dataframe A to an entry in B, to see if they're the same.
The problem, when posing the
2009 Jul 15
4
Change data frame column names
Hi R helpers,
I have a data frame and I want to change the column names to names I have held in another data frame, but I am having difficulty. All data frames are large, so I can't enter them manually. Below is what I have tried so far:
df<-data.frame(a=1:5, b=2:6, d=3:7, e=4:8)
coltitles<-data.frame(v1="col number one", v2="col number two", v3="col number
2010 Jun 08
3
more dates and data frames
Dear R People:
So thanks to your help, I have the following:
> dog3.df <- read.delim("c:/Users/erin/Documents/dog1.txt",header=FALSE,sep="\t")
> dog3.df
V1 V2
1 1/1/2000 dog
2 1/1/2000 cat
3 1/1/2000 tree
4 1/1/2000 dog
5 1/2/2000 cat
6 1/2/2000 cat
7 1/2/2000 cat
8 1/2/2000 tree
9 1/3/2000 dog
10 1/3/2000 tree
11 1/6/2000 dog
12 1/6/2000