Displaying 20 results from an estimated 10000 matches similar to: "selecting first row of a variable with long-format data"
2010 Feb 20
3
aggregating using 'with' function
Hi All,
I am interested in aggregating a data frame based on 2
categories--mean effect size (r) for each 'id's' 'mod1'. The
'with' function works well when aggregating on one category (e.g.,
based on 'id' below) but doesn't work if I try 2 categories. How can
this be accomplished?
# sample data
id<-c(1,1,1,rep(4:12))
n<-c(10,20,13,22,28,12,12,36,19,12,
2010 Jan 28
2
Data.frame manipulation
Hi All,
I'm conducting a meta-analysis and have taken a data.frame with multiple rows per
study (for each effect size) and performed a weighted average of effect size for
each study. This results in a reduced # of rows. I am particularly interested in
simply reducing the additional variables in the data.frame to the first row of the
corresponding id variable. For example:
2009 May 25
3
long format - find age when another variable is first 'high'
Dear R,
I've got a data frame with children examined multiple times and at various
ages. I'm trying to find the first age at which another variable
(LDL-Cholesterol) is >= 130 mg/dL; for some children, this may never happen.
I can do this with transformBy and ddply, but with 10,000 different
children, these functions take some time on my PCs - is there a faster way
to do this in R?
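One possible base-R approach to the question above, offered only as a sketch: keep the visits where LDL is at least 130, then take the minimum age per child. The data frame `visits` and its columns `child`, `age`, and `ldl` are made-up names for illustration and do not come from the original post. Children who never reach the threshold simply do not appear in the result. Whether this is faster than ddply on 10,000 children is not benchmarked here.

```r
# Hypothetical example data: one row per child per visit.
visits <- data.frame(
  child = c(1, 1, 1, 2, 2, 3, 3, 3),
  age   = c(5, 7, 9, 6, 8, 4, 6, 10),
  ldl   = c(110, 135, 140, 100, 120, 131, 90, 150)
)

# Keep only visits at or above the threshold (dropping missing LDL values).
high <- visits[!is.na(visits$ldl) & visits$ldl >= 130, ]

# Earliest age per child among those visits.
first_high <- aggregate(age ~ child, data = high, FUN = min)
first_high
```

Child 2 never reaches 130 in this toy data, so it is absent from `first_high`; merging back onto the full list of children would give it an NA.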
2011 Nov 03
2
Take variables in data.frame and create list of matrices
Hi,
I have this sample data below and would like to create a list of matrices.
set.seed(1254)
id <- c(1,1,1,1 ,2,2,2)
o <- as.factor(c(1:4, 1, 3, 4))
r <- rep(.5, 7)
v <- rnorm(7)
s <- rnorm(7)
dat <-data.frame(id, o, r, v, s)
dat
#> dat
# id o r v s
# 1 1 0.5 0.7024631 2.0813672
# 1 2 0.5 -0.5541955 0.1095156
# 1 3 0.5 -1.0418167 0.4164930
# 1
2009 Jan 02
7
the first and last observation for each subject
I have the following data
ID x y time
1 10 20 0
1 10 30 1
1 10 40 2
2 12 23 0
2 12 25 1
2 12 28 2
2 12 38 3
3 5 10 0
3 5 15 2
.....
x is time invariant, ID is the subject id number, y is changing over time.
I want to find out the difference between the first and last observed y
value for each subject and get a table like
ID x y
1 10 20
2 12 15
3 5 5
......
Is there any easy way to generate
2011 Nov 22
4
Removing rows in dataframe w'o duplicated values
Hi,
Is there an easy way to remove dataframe rows without duplicated values of
a specified column ('id')? e.g.,
dat <- data.frame(id = c(1,1,1,2,3,3), value = c(5,6,7,4,5,4), value2 =
c(1,4,3,3,4,3))
dat
id value value2
1 1 5 1
2 1 6 4
3 1 7 3
4 2 4 3
5 3 5 4
6 3 4 3
This is sample data and the real data has hundreds of
2008 Sep 25
2
Equivalent of 'first.var' or 'last.var' from SAS in R?
Hi,
I want to sort a data frame by multiple columns and then take the
first record in each unique level of the "by" group I used to sort the
data frame. Does someone have an example of how to do this?
Thanks,
Matt
--
It is from the wellspring of our despair and the places that we are
broken that we come to repair the world.
-- Murray Waas
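A minimal sketch of one way to get the effect of SAS's first./last. in base R, assuming a data frame `dat` with a grouping column `grp` and sort columns `score` and `time` (all hypothetical names, not from the post): sort with `order()`, then use `duplicated()` to keep the first, or with `fromLast = TRUE` the last, row of each group.

```r
# Hypothetical example data.
dat <- data.frame(
  grp   = c("b", "a", "a", "b", "c", "a"),
  score = c(3, 2, 1, 1, 5, 1),
  time  = c(10, 20, 30, 40, 50, 5)
)

# Sort by group, then score, then time.
sorted <- dat[order(dat$grp, dat$score, dat$time), ]

# duplicated() flags every row after the first within each group,
# so negating it keeps the first record per group.
first_rows <- sorted[!duplicated(sorted$grp), ]

# Scanning from the end keeps the last record per group instead.
last_rows <- sorted[!duplicated(sorted$grp, fromLast = TRUE), ]
```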
2010 Mar 17
2
Retaining variable name in a function
Hi All,
I'm interested in creating a function that will convert a variable within a
data.frame to a factor while retaining the original name (yes, I know that I
can just: var <-factor(var) but I need it as a function for other
purposes). e.g.:
# this was an attempt but fails.
facts <- function(meta, mod, modname = "spec") {
meta$mod <- factor(meta$mod)
2010 Feb 22
2
how do I calculate means or cov matrix for multivariate groups
Hello,
Having the matrix d
> d
value value2 class
1 1 1 x
2 2 2 x
3 3 3 x
4 4 2 x
5 5 1 y
6 11 3 y
7 12 4 z
8 13 5 z
9 14 6 z
10 15 7 z
I want to calculate the means and cov matrix for groups x,y,z.
I know how to do it the long way.
I tried to use tapply and
2010 Sep 22
2
speeding up regressions using ddply
Hi,
I have a data set that I'd like to run logistic regressions on, using
ddply to speed up the computation of many models with different
combinations of variables. I would like to run regressions on every
unique two-variable combination in a portion of my data set, but I
can't quite figure out how to do this using ddply. The data set looks like
this, with "status" as
2011 Aug 24
3
ddply from plyr package - any alternatives?
Hello everyone,
I was asked to repost this again, sorry for any inconvenience.
I'm looking for a replacement for the ddply function from the plyr package.
The function applies a function by category, where the categories are stored in one or more columns.
Regular loops or lapply calls slow down greatly because my count of unique
combinations exceeds 9000. Is there any available solution that allows me to apply
a function by category?
2010 Apr 14
6
sum specific rows in a data frame
I have a data frame called "pose":
DESCRIPTION QUANITY CLOSING.PRICE
1 WHEAT May/10 1 467.75
2 WHEAT May/10 2 467.75
3 WHEAT May/10 1 467.75
4 WHEAT May/10 1 467.75
5 COTTON NO.2 May/10 1 78.13
6 COTTON NO.2 May/10 3 78.13
7 COTTON NO.2 May/10 1 78.13
2011 Aug 23
3
ddply - how to transform df column "in place"
Dear R-users,
I am trying to get the plyr syntax right, without much success.
Given:
d<- data.frame(cbind(x=1,y=seq(20100801,20100830,1)))
names(d)<-c("first", "daterep")
d2<-d
# I can convert the daterep column in place the classic way:
d$daterep<-as.Date(strptime(d$daterep, format="%Y%m%d"))
# How to do it the plyr way?
ddply(d2,
2010 Jun 17
1
big big problem
Dear list,
I'll try to be more clear in explaining my problem. I have a data frame like this called X:
CLUSTER YEAR variable value1 value2
M1 2005 EC01 NA NA
M1 2006 EC01 2 5
M1 2007
2012 Jan 17
1
New PLYR issue
Hello everyone,
I have got the same problem, with the same error message.
Using R 2.14.1, plyr 1.7.1, R.Studio 0.94.110, Windows XP
The plyr mailing list has not provided any help so far.
>require(plyr)
>c(sample(c(1:100), 50, replace=TRUE))->V1
>c(rep( 1:5, 10))->f1 #variable to group V1
>data.frame(cbind(V1, f1))->DF
>str(DF)
>ddply(DF$V1, DF$f1,
2012 Jan 27
3
Subsetting for the ten highest values by group in a dataframe
Hello,
I am looking for a way to subset a data frame by choosing the ten
highest values from that data frame, separately within each level of a
factor.
## I've used plyr here but I'm not married to this approach
require(plyr)
## I've created a data.frame with two groups and then an id variable (y)
df <- data.frame(x=rnorm(400, mean=20), y=1:400,
2009 Apr 03
3
plyr and table question
Dear all,
I'm puzzled by the following example inspired by a recent question on
R-help,
cc <- textConnection("user_id website time
20 google 0930
21 yahoo 0935
20 facebook 1000
25 facebook 1015
61 google 0940")
d <- read.table(cc, head=T) ; close(cc)
table(d$user_id) # count the
2011 Jun 21
4
ddply to count frequency of combinations
I have a dataframe df with two columns x and y. I want to count the number
of times a unique x, y combination occurs.
For example
x<- c(1,2,3,4,5,1,2,3,4)
y<- c(1,2,3,4,5,1,2,4,1)
df<-as.data.frame(cbind(x, y))
#what is the correct way to use ddply for this example?
ddply(df, c('x','y'), summarize, ??)
#desired output -- format and order doesn't matter
# (x, y)
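As a hedged illustration of the counting task, here is a base-R alternative rather than the ddply call the poster asked about: add a column of ones and sum it within each x, y combination using `aggregate()`. The helper column name `n` is an assumption introduced for this sketch.

```r
# Data from the post.
x <- c(1, 2, 3, 4, 5, 1, 2, 3, 4)
y <- c(1, 2, 3, 4, 5, 1, 2, 4, 1)
df <- data.frame(x, y)

# Add a column of ones (n is a hypothetical name), then sum it per combination.
# Only combinations that actually occur appear in the result.
counts <- aggregate(n ~ x + y, data = transform(df, n = 1), FUN = sum)
counts
```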
2012 Oct 11
2
Selecting n observation
Hello R help,
I have a question similar to one posted before. My
problem is that instead of the last assessment, I want to choose the last two.
I have a data set with several time assessments for each participant.
I want to select the last assessment for each participant. My dataset
looks like this:
ID week outcome
1 2 14
1 4 28
1 6 42
4 2 14
4 6 46
4 9 64
4 9
2013 Aug 27
1
[plyr] Moving average filter with plyr
Dear all,
I'm stuck with a problem using plyr to process a rather large chunk of data. What I'm trying to do is apply a moving average to all the subparts of the dataframe (the example data can be found here https://dl.dropboxusercontent.com/u/2414056/testData.Rdata).
require(plyr)
load("testData.Rdata")
applyfilter<-function(x){
return(filter(x,rep(1/5, times=5)))
}