Displaying 20 results from an estimated 10000 matches similar to: "Identify first row of each ID within a data frame, create a variable first =1 for the first row and first=0 of all other rows"
2024 Dec 02
0
Identify first row of each ID within a data frame, create a variable first =1 for the first row and first=0 of all other rows
John,
Thanks for enlightening us so we better understand.
I won't argue with your wish to learn to do things in base R first. I started that way, myself, and found lots of the commands not particularly easy to fit into a single worldview. Many functions I read about were promptly forgotten, especially those without great documentation and not enough examples of real world usage.
This is why
2024 Dec 01
6
Identify first row of each ID within a data frame, create a variable first =1 for the first row and first=0 of all other rows
Dear R help folks,
First, my apologies for sending several related questions to the list server. I am trying to learn how to manipulate data in R . . . and am having difficulty getting my program to work. I greatly appreciate the help and support list members give!
I am trying to write a program that will run through a data frame organized by ID and for the first line of each new group of data
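One way to sketch the flag described in the subject line is with `duplicated()`, which marks every repeat of a value that has already been seen. The data frame name `dat` and the example values are assumptions for illustration and are not taken from the post.

```r
# Hypothetical example data; the names `dat` and `x` are assumptions.
dat <- data.frame(ID = c(1, 1, 1, 2, 2, 3), x = c(5, 6, 7, 8, 9, 10))

# duplicated() is FALSE the first time an ID appears and TRUE afterwards,
# so negating it flags the first row of each ID.
dat$first <- as.integer(!duplicated(dat$ID))
```

This flags the first occurrence of each ID even when the rows are not sorted by ID, which may or may not be what a SAS-style `first.` flag should do on unsorted data.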
2024 Dec 01
2
Identify first row of each ID within a data frame, create a variable first =1 for the first row and first=0 of all other rows
Rui:
"Of these two, diff is faster. But of all the solutions posted so far,
Ben Bolker's is the fastest."
But the explicit version of diff is still considerably faster:
> D <- c(rep(1,10),rep(2,6),rep(3,2))
> microbenchmark(c(1L,diff(D)), times = 1000L)
Unit: microseconds
           expr   min    lq    mean median uq max neval
 c(1L, diff(D)) 3.075 3.198 3.34396
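The "explicit version of diff" is cut off in this excerpt. A guess at what is meant, offered only as a sketch, is to subtract a shifted copy of the vector by hand, which avoids the generic dispatch inside `diff()`. Both subtraction forms give a 0/1 flag only when the IDs are sorted and increase in steps of exactly 1; the comparison variant at the end is an added alternative that works for any step size.

```r
D <- c(rep(1, 10), rep(2, 6), rep(3, 2))
n <- length(D)

# diff()-based flag, as in the post
flag_diff <- c(1L, diff(D))

# Assumed "explicit" equivalent: subtract the lagged vector by hand
flag_explicit <- c(1L, D[-1L] - D[-n])

# Comparison variant: 1 wherever the ID changes, whatever the step size
flag_change <- c(1L, as.integer(D[-1L] != D[-n]))
```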
2024 Nov 27
1
R Processing dataframe by group - equivalent to SAS by group processing with a first. and retain statments
At 16:30 on 27/11/2024, Sorkin, John wrote:
> I am an old, long time SAS programmer. I need to produce R code that processes a dataframe in a manner that is equivalent to that produced by using a by statement in SAS and an if first.day statement and a retain statement:
>
> I want to take data (olddata) that looks like this
> ID Day
> 1 1
> 1 1
> 1 2
> 1 2
> 1 3
>
2024 Nov 27
1
R Processing dataframe by group - equivalent to SAS by group processing with a first. and retain statments
On 11/27/24 08:30, Sorkin, John wrote:
> I am an old, long time SAS programmer. I need to produce R code that processes a dataframe in a manner that is equivalent to that produced by using a by statement in SAS and an if first.day statement and a retain statement:
>
> I want to take data (olddata) that looks like this
> ID Day
> 1 1
> 1 1
> 1 2
> 1 2
> 1 3
> 1 3
>
2024 Nov 27
1
R Processing dataframe by group - equivalent to SAS by group processing with a first. and retain statments
Was wondering when this would be suggested. But the question was about getting the final dataframe...
newdata <- olddata
newdata$FirstDay <- ave(newdata$Day, newdata$ID, FUN = \(x) x[1L])
On November 27, 2024 11:13:49 AM PST, Rui Barradas <ruipbarradas at sapo.pt> wrote:
>At 16:30 on 27/11/2024, Sorkin, John wrote:
>> I am an old, long time SAS programmer. I need to
2024 Nov 27
4
R Processing dataframe by group - equivalent to SAS by group processing with a first. and retain statments
Check out the dplyr package, specifically the mutate function.
# Create new column based on existing column value
library(dplyr)
df <- df %>% mutate(FirstDay = ifelse(ID == 2, 5, NA))
df
Repeat as needed to capture all of the day/firstday combinations you want to account for.
Like everything else in R, there are probably at least a dozen other ways to do this, between base R and all of the library packages
2024 Nov 27
7
R Processing dataframe by group - equivalent to SAS by group processing with a first. and retain statments
I am an old, long time SAS programmer. I need to produce R code that processes a dataframe in a manner that is equivalent to that produced by using a by statement in SAS and an if first.day statement and a retain statement:
I want to take data (olddata) that looks like this
ID Day
1 1
1 1
1 2
1 2
1 3
1 3
1 4
1 4
1 5
1 5
2 5
2 5
2 5
2 6
2 6
2 6
3 10
3 10
and make it look like this:
(within each
2011 May 19
1
Creating a "shifted" month (one that starts not on the first of each month but on another date)
Hello!
I have a data frame with dates. I need to create a new "month" that
starts on the 20th of each month - because I'll need to aggregate my
data later by that "shifted" month.
I wrote the code below and it works. However, I was wondering if there
is some ready-made function in some package - that makes it
easier/more elegant?
Thanks a lot!
# Example data:
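The poster's own code is cut off here. As an independent sketch (not the poster's approach), subtracting 19 days moves the 20th onto the 1st of the same calendar month, so ordinary month formatting then labels each date with the period that starts on the 20th. The helper name `shift_month` and its `start_day` argument are hypothetical.

```r
# Hypothetical helper: label each date with the "shifted" month that
# starts on `start_day` (the 20th by default). Assumes start_day <= 28.
shift_month <- function(dates, start_day = 20L) {
  # Moving every date back by (start_day - 1) days maps the start day
  # onto the 1st of the same calendar month.
  format(as.Date(dates) - (start_day - 1L), "%Y-%m")
}

d <- as.Date(c("2011-05-19", "2011-05-20", "2011-06-19", "2011-06-20"))
shifted <- shift_month(d)
```

The resulting labels can be passed as a grouping variable to `aggregate()` or `tapply()`. Here each period is named after the calendar month in which it starts; naming it after the month in which it ends would need a different offset.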
2012 Feb 13
2
finding and describing missing data runs in a time series
Hi -
I am trying to find and describe missing data in a time series. For instance, in the library openair, there is a data frame called "mydata":
library(openair)
head(mydata)
                 date   ws  wd nox no2 o3 pm10    so2     co pm25
1 1998-01-01 00:00:00 0.60 280 285  39  1   29 4.7225 3.3725   NA
2 1998-01-01 01:00:00 2.16 230  NA  NA NA   37     NA     NA   NA
3 1998-01-01 02:00:00
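A common base R idiom for locating runs of missing values is `rle()` applied to `is.na()`. The sketch below uses a small made-up vector `x` in place of a real column such as `mydata$nox`, so it does not need openair installed; the names `x` and `na_runs` are assumptions.

```r
# Hypothetical series with gaps; stands in for a column such as mydata$nox.
x <- c(285, NA, NA, 310, 295, NA, 300, NA, NA, NA)

# rle() collapses the TRUE/FALSE missingness indicator into runs.
r <- rle(is.na(x))

# End and start position of every run, then keep only the NA runs.
ends <- cumsum(r$lengths)
starts <- ends - r$lengths + 1L
na_runs <- data.frame(start  = starts[r$values],
                      end    = ends[r$values],
                      length = r$lengths[r$values])
```

The `start` and `end` positions can be used to index the date column, which gives the timestamps at which each gap begins and ends.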
2010 Sep 21
1
partial dbRDA or CCA with two distance objects in Vegan.
I am trying to use the cca/rda/capscale functions in vegan to analyse
genetic distance data (provided as a dist object calculated using
dist.genpop in package adegenet) with geographic distance partialled out
(provided as a distance object using the dist function in vegan). This method
is attempting to follow the method used by Geffen et al 2004 as
suggested by Legendre and Fortin (2010).
I
2013 Jan 15
1
Using system() to dump HDF to text file?
Hi all --
I am working on 64-bit Windows XP. I'm not a very technical person when it
comes to the command line stuff, so please forgive me if this is a stupid
question.
I have a bunch of HDF files, and I want to dump a single PM25 data layer
from each file to .txt by invoking ncdump.exe from system(). Here's a
sample command string:
"C:/ncdump -v PM25 C:/01aug2010.hdf >
2012 Apr 29
1
CForest Error Logical Subscript Too Long
Hi,
This is my code (my data is attached):
library(languageR)
library(rms)
library(party)
OLDDATA <- read.csv("/Users/Abigail/Documents/OldData250412.csv")
OLDDATA$YD <- factor(OLDDATA$YD, label=c("Yes", "No"))
OLDDATA$ND <- factor(OLDDATA$ND, label=c("Yes", "No"))
attach(OLDDATA)
defaults <- cbind(YD, ND)
set.seed(47)
data.controls
2023 Jul 25
1
Seeking Assistance: Plotting Sea Current Vectors in R
Hi Kostas,
The function vectorField in the plotrix package may do what you want.
See the example.
Jim
On Tue, Jul 25, 2023 at 9:30 PM konstantinos christodoulou
<konstantinos.christodoulou1 at gmail.com> wrote:
>
> Dear Rcommunity,
>
> I hope this email finds you well. I am writing to seek your assistance with
> a data visualization problem I am facing while working with R.
2013 Jun 08
0
modify and append new rows in a dataframe
My data frame shows changes in the variable act, which records the consecutive duration (in seconds) of two states (wet-dry) over a few days for several individuals (identified by Ring). Since I want to work with daytime (i.e. from dawn till dusk) and night time (i.e. from dusk till next dawn), I have to split act in two: from time[i] till dusk and from dusk until time[i+1], and from time[k] till
2001 Dec 07
2
question
Isn't anything in a data frame that is not explicitly numeric a *factor*?
-Greg
> -----Original Message-----
> From: Peter Dalgaard BSA [mailto:p.dalgaard@biostat.ku.dk]
> Sent: Friday, December 07, 2001 5:32 PM
> To: Erich Neuwirth
> Cc: r-devel@stat.math.ethz.ch
> Subject: Re: [Rd] question
>
>
> Erich Neuwirth
2012 Mar 26
1
assigning vector or matrix sparsely (for use with mclapply)
Dear R wizards---
I have a wrapper on mclapply() that makes it a little easier for me to
do multiprocessing. (Posting this may make life easier for other
googlers.) I pass a data frame, a vector that tells me what rows
should be recomputed, and the function; and I get back a vector or
matrix of answers.
d <- data.frame( id=1:6, val=11:16 )
loc <- c(TRUE,TRUE,FALSE,TRUE,FALSE,TRUE)
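The rest of the post is cut off, so the following is only a sketch of the general idea: compute results for the selected rows with `mclapply()` and assign them into an all-`NA` vector through the logical index. The function `slow_fun` is a placeholder assumption for the real per-row computation, and `mc.cores = 1L` is used so that the sketch also runs on platforms where forking is unavailable.

```r
library(parallel)

d <- data.frame(id = 1:6, val = 11:16)
loc <- c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE)

# Placeholder for the expensive per-row computation (an assumption).
slow_fun <- function(i) d$val[i] * 2

# Start from an all-NA result and fill in only the selected rows.
# mc.cores = 1L keeps the sketch portable; raise it on Unix-alikes.
res <- rep(NA_real_, nrow(d))
res[loc] <- unlist(mclapply(which(loc), slow_fun, mc.cores = 1L))
```

Rows where `loc` is `FALSE` stay `NA`. For a matrix of answers the same pattern applies with `res[loc, ] <- do.call(rbind, ...)`, assuming every call returns a vector of the same length.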
2023 Jul 25
2
Seeking Assistance: Plotting Sea Current Vectors in R
Dear Rcommunity,
I hope this email finds you well. I am writing to seek your assistance with
a data visualization problem I am facing while working with R.
Problem Description:
I have a dataframe named "df" containing the following columns:
"longitude", "latitude", "sea_currents_mag", and "sea_currents_direction".
The dataframe includes sea
2017 Jun 09
0
Extremely slow du
Can you please provide more details about your volume configuration and the
version of gluster that you are using?
Regards,
Vijay
On Fri, Jun 9, 2017 at 5:35 PM, mohammad kashif <kashif.alig at gmail.com>
wrote:
> Hi
>
> I have just moved our 400 TB HPC storage from lustre to gluster. It is
> part of a research institute and users have very small files to big files
> ( few
2017 Jun 09
2
Extremely slow du
Hi
I have just moved our 400 TB HPC storage from lustre to gluster. It is part
of a research institute and users have files ranging from very small to big (a few
KB to 20 GB). Our setup consists of 5 servers, each with 96 TB RAID 6 disks.
All servers are connected through 10G ethernet but not all clients.
Gluster volumes are distributed without any replication. There are
approximately 80 million files in