Displaying 20 results from an estimated 40000 matches similar to: "the first and last observation for each subject"
2012 Mar 21
2
Best way to compute the difference between two levels of a factor ?
Dear R-help Members,
I am wondering if anyone can think of the best way to compute, for
several numeric variables, the difference between 2 levels of a factor.
To be clear, let's generate a simple data frame with 2 numeric variables
collected for different subjects (ID) and 2 levels of a TIME factor
(time of evaluation)
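
One possible base-R approach is sketched here. The poster's data frame is cut off in this listing, so the data, the column names `V1` and `V2`, and the level names `T1` and `T2` are all assumptions made for illustration. The idea is to split the rows by level, check that subjects line up, and subtract the numeric columns in one step.

```r
# Hypothetical data (not the poster's): 3 subjects, 2 time points, 2 numeric variables
d <- data.frame(
  ID   = rep(1:3, each = 2),
  TIME = factor(rep(c("T1", "T2"), times = 3)),
  V1   = c(10, 12, 20, 25, 30, 29),
  V2   = c(1.0, 1.5, 2.0, 2.5, 3.0, 2.0)
)

# Sort so that the two halves are in the same subject order
d  <- d[order(d$ID, d$TIME), ]
t1 <- d[d$TIME == "T1", ]
t2 <- d[d$TIME == "T2", ]

# Guard: every subject must appear exactly once at each level
stopifnot(identical(t1$ID, t2$ID))

# Subtract all numeric variables at once (T2 minus T1)
num_vars <- c("V1", "V2")
diffs <- data.frame(ID = t1$ID, t2[num_vars] - t1[num_vars], row.names = NULL)
print(diffs)
```

The `stopifnot()` guard matters: if a subject is missing one of the two levels, the row-wise subtraction would silently pair the wrong rows.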
2009 Nov 16
8
extracting the last row of each group in a data frame
Hi,
I would like to extract the last row of each group in a data frame.
The data frame is as follows
Name Value
A 1
A 2
A 3
B 4
B 8
C 2
D 3
I would like to get a data frame as
Name Value
A 3
B 8
C 2
D 3
Thank you for your suggestions in advance
Jeff
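
A minimal base-R sketch for this question, using the data shown in the post. It assumes the rows are already in the order that defines "last" within each group; `last_rows` is just an illustrative name.

```r
df <- data.frame(
  Name  = c("A", "A", "A", "B", "B", "C", "D"),
  Value = c(1, 2, 3, 4, 8, 2, 3),
  stringsAsFactors = FALSE
)

# duplicated(..., fromLast = TRUE) flags every row that is followed by a
# later row with the same Name; negating it keeps the last row per group
last_rows <- df[!duplicated(df$Name, fromLast = TRUE), ]
print(last_rows)
```

Dropping `fromLast = TRUE` keeps the first row of each group instead.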
2010 Apr 22
4
how to select the first observation only?
Dear r-helpers,
I have a very simple question. Suppose my data is like
id=c(rep(1,2),rep(2,2))
b=c(2,3,4,5)
m=cbind(id,b)
> m
id b
[1,] 1 2
[2,] 1 3
[3,] 2 4
[4,] 2 5
I wish to select the first observation for each id. That is, I want to
quickly select two rows:
id b
1 2
2 4
only. How should I do this?
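
A minimal sketch for the matrix shown in the post, assuming the rows are already ordered so that the wanted observation comes first within each id. The name `first_rows` is illustrative.

```r
id <- c(rep(1, 2), rep(2, 2))
b  <- c(2, 3, 4, 5)
m  <- cbind(id, b)

# Keep only the first row seen for each id; drop = FALSE keeps the result
# a matrix even if a single row is selected
first_rows <- m[!duplicated(m[, "id"]), , drop = FALSE]
print(first_rows)
```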
2012 Oct 11
2
Selecting n observation
Hello R help,
I have a question similar to one posted here before. My
problem is that instead of the last assessment, I want to choose the last two.
I have a data set with several time assessments for each participant.
I want to select the last assessment for each participant. My dataset
looks like this:
ID week outcome
1 2 14
1 4 28
1 6 42
4 2 14
4 6 46
4 9 64
4 9
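
A possible way to keep the last two assessments per participant is sketched below. It uses only the six complete rows visible above (the rest of the table is cut off in this listing), and the names `from_end` and `last_two` are illustrative.

```r
d <- data.frame(
  ID      = c(1, 1, 1, 4, 4, 4),
  week    = c(2, 4, 6, 2, 6, 9),
  outcome = c(14, 28, 42, 14, 46, 64)
)

# Make sure rows are in time order within each participant
d <- d[order(d$ID, d$week), ]

# Number the rows within each ID counting back from the end
# (1 = last assessment, 2 = second to last, ...)
from_end <- ave(seq_len(nrow(d)), d$ID, FUN = function(i) rev(seq_along(i)))

# Keep the last two assessments per participant
last_two <- d[from_end <= 2, ]
print(last_two)
```

Changing the `2` in the final filter selects a different number of trailing assessments; participants with fewer rows than that simply keep all of them.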
2008 Sep 25
2
Equivalent of 'first.var' or 'last.var' from SAS in R?
Hi,
I want to sort a data frame by multiple columns and then take the
first record in each unique level of the "by" group I used to sort the
data frame. Does someone have an example of how to do this?
Thanks,
Matt
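
A rough base-R sketch of this: sort with `order()`, then use `duplicated()` to pick the first or last record per group. The data frame and its column names (`grp`, `date`, `value`) are made up for illustration, since the post gives no example data.

```r
# Hypothetical data (not from the post)
d <- data.frame(
  grp   = c("b", "a", "b", "a", "c"),
  date  = c(3, 2, 1, 5, 4),
  value = c(10, 20, 30, 40, 50),
  stringsAsFactors = FALSE
)

# Sort by the "by" variable, then by a second column within it
sorted <- d[order(d$grp, d$date), ]

# First record per group: rows whose grp has not been seen yet
first_rec <- sorted[!duplicated(sorted$grp), ]

# Last record per group: scan from the bottom instead
last_rec <- sorted[!duplicated(sorted$grp, fromLast = TRUE), ]

print(first_rec)
print(last_rec)
```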
2011 Feb 21
3
Subset according to groups NA proportion within specific variables
Dear R-List,
I have a dataframe with one grouping variable (x) and three response variables (y,z,w).
df<-data.frame(x=c(rep(1,3),rep(2,4),rep(3,5)),y=rnorm(12),z=c(3,4,5,NA,NA,NA,NA,1,2,1,2,1),w=c(1,2,3,3,4,3,5,NA,5,NA,7,8))
>df
x y z w
1 0.29306106 3 1
1 0.54797780 4 2
1 -1.38365548 5 3
2 -0.20407986
2013 Mar 15
2
Help finding first value in a BY group
I have a large Excel file with SKU numbers (stock keeping units) and
forecasts which can be mimicked with the following:
Period <- c(1, 2, 3, 1, 2, 3, 4, 1, 2)
SKU <- c("A1","A1","A1","X4","X4","X4","X4","K2","K2")
Forecast <- c(99, 103, 128, 63, 69, 72, 75, 207, 201)
PeriodSKUForecast <-
2011 Aug 23
3
ddply - how to transform df column "in place"
Dear R-users,
I am trying to get the plyr syntax right, without much success.
Given:
d<- data.frame(cbind(x=1,y=seq(20100801,20100830,1)))
names(d)<-c("first", "daterep")
d2<-d
# I can convert the daterep column in place the classic way:
d$daterep<-as.Date(strptime(d$daterep, format="%Y%m%d"))
# How to do it the plyr way?
ddply(d2,
2012 Jan 17
1
New PLYR issue
Hello everyone,
I have got the same problem, with the same error message.
Using R 2.14.1, plyr 1.7.1, R.Studio 0.94.110, Windows XP
The plyr mailing list has not provided any help so far.
>require(plyr)
>c(sample(c(1:100), 50, replace=TRUE))->V1
>c(rep( 1:5, 10))->f1 #variable to group V1
>data.frame(cbind(V1, f1))->DF
>str(DF)
>ddply(DF$V1, DF$f1,
2010 Apr 14
6
sum specific rows in a data frame
I have a data frame called "pose":
DESCRIPTION QUANITY CLOSING.PRICE
1 WHEAT May/10 1 467.75
2 WHEAT May/10 2 467.75
3 WHEAT May/10 1 467.75
4 WHEAT May/10 1 467.75
5 COTTON NO.2 May/10 1 78.13
6 COTTON NO.2 May/10 3 78.13
7 COTTON NO.2 May/10 1 78.13
2012 Jan 27
3
Subsetting for the ten highest values by group in a dataframe
Hello,
I am looking for a way to subset a data frame by choosing the ten
highest values from that data frame. This also needs to happen within
the levels of a factor.
## I've used plyr here but I'm not married to this approach
require(plyr)
## I've created a data.frame with two groups and then an id variable (y)
df <- data.frame(x=rnorm(400, mean=20), y=1:400,
2009 May 25
3
long format - find age when another variable is first 'high'
Dear R,
I've got a data frame with children examined multiple times and at various
ages. I'm trying to find the first age at which another variable
(LDL-Cholesterol) is >= 130 mg/dL; for some children, this may never happen.
I can do this with transformBy and ddply, but with 10,000 different
children, these functions take some time on my PCs - is there a faster way
to do this in R?
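
One vectorized alternative is sketched below: filter, sort, and de-duplicate rather than calling a function once per child. It may well be faster on 10,000 children, though that is not measured here. The data and the column names `id`, `age` and `ldl` are assumptions, since the post shows no data.

```r
# Hypothetical long-format data (not from the post)
kids <- data.frame(
  id  = c(1, 1, 1, 2, 2, 3, 3, 3),
  age = c(5, 7, 9, 6, 8, 4, 6, 10),
  ldl = c(110, 135, 140, 100, 120, 131, 125, 150)
)

# Keep only the visits at or above the threshold, sorted by age within child
high <- kids[kids$ldl >= 130, ]
high <- high[order(high$id, high$age), ]

# The first remaining row per child is the earliest "high" visit
first_high <- high[!duplicated(high$id), c("id", "age")]

# Children who never reach the threshold get NA
result <- merge(data.frame(id = unique(kids$id)), first_high,
                by = "id", all.x = TRUE)
print(result)
```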
2009 Aug 05
2
using ddply but preserving some of the outside data
I have a bit of a quandary. I'm working with a data set for which I
have sampled sites at a variety of dates. I want to use this data,
and get a running average of the sampled values for the current and
previous date.
I originally thought something like ddply would be ideal for this,
however, I cannot break up my data by date, and then apply a function
that requires information
2011 Jun 21
4
ddply to count frequency of combinations
I have a dataframe df with two columns x and y. I want to count the number
of times a unique x, y combination occurs.
For example
x<- c(1,2,3,4,5,1,2,3,4)
y<- c(1,2,3,4,5,1,2,4,1)
df<-as.data.frame(cbind(x, y))
#what is the correct way to use ddply for this example?
ddply(df, c('x','y'), summarize, ??)
#desired output -- format and order doesn't matter
# (x, y)
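
A base-R sketch that counts each (x, y) combination without plyr. It adds a helper column `n` (an illustrative name) and sums it within each combination.

```r
x <- c(1, 2, 3, 4, 5, 1, 2, 3, 4)
y <- c(1, 2, 3, 4, 5, 1, 2, 4, 1)
df <- data.frame(x, y)

# Add a column of ones, then sum it within each observed (x, y) pair
counts <- aggregate(n ~ x + y, data = transform(df, n = 1), FUN = sum)
print(counts)
```

With plyr loaded, `ddply(df, c("x", "y"), summarise, n = length(x))` should give the same counts.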
2010 Sep 10
4
Counting occurrences of a letter by a factor
I'm trying to find a more elegant way of doing this. What I'm trying to accomplish is to count the frequency of letters (major / minor alleles) in a string grouped by the factor levels in another column of my data frame.
Ex.
> DF<-data.frame(c("CC", "CC", NA, "CG", "GG", "GC"), c("L", "U", "L",
2011 Apr 25
2
Problem with ddply in the plyr-package: surprising output of a date-column
Hi all,
I have a problem with the plyr package - more precisely with the ddply
function - and would be very grateful for any help. I hope the example
here is precise enough for someone to identify the problem. Basically,
in this step I want to identify observations that are identical in
terms of certain identifiers (ID1, ID2, ID3) and just want to save
those observations (in this step,
2011 Oct 12
3
Applying function to only numeric variable (plyr package?)
My data frame consists of character variables, factors, and proportions,
something like
c1 <- c("A", "B", "C", "C")
c2 <- factor(c(1, 1, 2, 2), labels = c("Y","N"))
x <- c(0.5234, 0.6919, 0.2307, 0.1160)
y <- c(0.9251, 0.7616, 0.3624, 0.4462)
df <- data.frame(c1, c2, x, y)
pct <- function(x) round(100*x, 1)
I want to
2011 Apr 21
1
Stymied by plyr
Hello, This is my first time trying to use plyr, and I'm getting
nowhere. I have teacher ratings data (1:4), on 10 components, by
external observers and internal observers, in schools in areas. I want
to calculate the percentage of each rating given on each component, by
each type of observer, within each school, within each area. The data
look like this:
unit area ext.obs rating comp
11
2011 Sep 25
4
selecting first row of a variable with long-format data
Hi,
I am trying to select the first row of a variable with data in long-format,
e.g.,
# sample data
id <- c(1,1,1,2,2)
value <- c(5,6,7,4,5)
dat <- data.frame(id, value)
dat
How can I select/subset the first 'value' for each unique 'id'?
Thanks,
AC
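
A short sketch using the sample data from the post. `match()` returns the position of the first occurrence of each id, which can be used directly as a row index. It assumes the rows are already in the desired order within each id, and `first_val` is an illustrative name.

```r
# sample data from the post
id <- c(1, 1, 1, 2, 2)
value <- c(5, 6, 7, 4, 5)
dat <- data.frame(id, value)

# match() gives the row number of the first occurrence of each unique id
first_val <- dat[match(unique(dat$id), dat$id), ]
print(first_val)
```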
2011 May 01
1
Mean/SD of Each Position in Table
I have 100+ .csv files which have the basic format:
> test
X Substance1 Substance2 Substance3 Substance4 Substance5
1 Time1 10 0 0 0 0
2 Time2 9 5 0 0 0
3 Time3 8 10 1 0 0
4 Time4 7 20 2 1 0
5 Time5