thr3ads.net - R help - [R] Help with selection of continuous data [Jun 2021]


Eric Berger

2021-Jun-21 08:17 UTC

[R] Help with selection of continuous data

Hi André,
It's not 100% clear to me what you are asking. I am interpreting the
question as selecting the data from those dates for which all of
1,2,3,4,5,6,7,8 appear in the ID column.
My approach determines the dates satisfying this property, which I put into
a vector dtV. Then I take the rows of A for which the date is in the vector
dtV.

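library(dplyr)
# ID k is encoded as bit 2^(k-1); IDs 1..8 all present <=> low 8 bits sum to 255
dtV <- A %>%
  mutate(x = 2^(ID - 1)) %>%
  group_by(Date) %>%
  summarise(y = (sum(unique(x)) %% 256 == 255)) %>%
  filter(y == TRUE) %>%
  select(Date)
# keep the rows of A whose date passed the test
B <- A[A$Date %in% dtV$Date, ]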

B is the subset of A that you want.
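
For anyone puzzling over the bitmask test, here is a minimal sketch on toy data. The data frame `toy` and its dates "d1" and "d2" are made up for illustration and are not part of A. It compares the bitmask check with a plain membership test, `all(1:8 %in% id)`. The two should agree as long as the IDs are positive whole numbers small enough for 2^(ID-1) to be exact in double precision.

```r
# Toy data (hypothetical): date "d1" has IDs 1..9, date "d2" only IDs 1..4
toy <- data.frame(Date = c(rep("d1", 9), rep("d2", 4)),
                  ID   = c(1:9, 1:4))

# Bitmask test: ID k contributes 2^(k-1); IDs 1..8 together fill the low 8 bits (255)
mask_ok <- tapply(toy$ID, toy$Date,
                  function(id) sum(unique(2^(id - 1))) %% 256 == 255)

# Direct membership test: are all of 1..8 among this date's IDs?
set_ok <- tapply(toy$ID, toy$Date, function(id) all(1:8 %in% id))

mask_ok
set_ok
```

Both vectors are named by date, so either one can be used to pick the qualifying dates, e.g. `names(set_ok)[set_ok]`.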

HTH,
Eric



On Mon, Jun 21, 2021 at 10:23 AM André Luis Neves <andrluis at ualberta.ca> wrote:
> Dear R users,
>
> I want to select only the data containing a continuous run of *ID* values from
> 1-8 in each *DATE*. Note, I do not want to select data that do not contain
> a continuous run of *ID* values from 1-8 (e.g., data on *DATE* 01/02/2020 and
> 01/03/2020). The dataset is a huge matrix with 24 columns and 1.5 million
> rows, but I have prepared reproducible code for your reference below.
>
> Here is the reproducible code:
>
> A <- data.frame(c("01/01/2020","01/01/2020","01/01/2020","01/01/2020","01/01/2020","01/01/2020","01/01/2020",
>                "01/01/2020","01/01/2020","01/01/2020","01/01/2020","01/01/2020","01/02/2020","01/02/2020",
>                "01/02/2020","01/02/2020","01/03/2020","01/03/2020","01/03/2020","01/03/2020","01/03/2020",
>                "01/03/2020","01/03/2020","01/04/2020","01/04/2020","01/04/2020","01/04/2020","01/04/2020",
>                "01/04/2020","01/04/2020","01/04/2020","01/04/2020"),
>                c(23,22,12,24,26,19,34,15,17,19,23,33,
>                23,34,25,23,25,24,34,33,31,32,24,22,21,23,22,22,21,23,23,21),
>                c(13,11,12,9,8,9,7,10,11,9,6,11,
>                9,8,9,10,11,12,9,8,10,4,6,9,8,9,10,11,14,12,13,11),
>                c(1,2,3,4,5,6,7,8,9,10,11,12,1,2,3,4,1,2,
>                3,4,5,6,7,1,2,3,4,5,6,7,8,9))
> colnames(A) <- c("Date", "CO2", "CH4", "ID")
> A
>
> Thank you,
> --
> Andre
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

Jim Lemon

2021-Jun-21 10:10 UTC


[R] Help with selection of continuous data

Hi Andre,
I've taken a different approach to that employed by Eric:

A<-data.frame(c("01/01/2020","01/01/2020","01/01/2020","01/01/2020","01/01/2020",
 "01/01/2020","01/01/2020","01/01/2020","01/01/2020","01/01/2020","01/01/2020",
 "01/01/2020","01/02/2020","01/02/2020","01/02/2020","01/02/2020","01/03/2020",
 "01/03/2020","01/03/2020","01/03/2020","01/03/2020","01/03/2020","01/03/2020",
 "01/04/2020","01/04/2020","01/04/2020","01/04/2020","01/04/2020",
 "01/04/2020","01/04/2020","01/04/2020","01/04/2020"),
c(23,22,12,24,26,19,34,15,17,19,23,33,23,34,25,23,25,24,34,33,31,32,24,22,21,
 23,22,22,21,23,23,21),
c(13,11,12,9,8,9,7,10,11,9,6,11,9,8,9,10,11,12,9,8,10,4,6,9,8,9,10,11,14,12,
 13,11),
c(1,2,3,4,5,6,7,8,9,10,11,12,1,2,3,4,1,2,
 3,4,5,6,7,1,2,3,4,5,6,7,8,9))
colnames(A) <- c("Date", "CO2", "CH4", "ID")
# add a variable to compile selected rows
A$select<-FALSE
# get all unique dates
alldates<-unique(A$Date)
for(date in alldates) {
 # get indices for this date
 date_indices<-which(A$Date == date)
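 # mark the first 8 rows of this date as TRUE only if IDs 1..8 are all present
 # (this assumes the rows are ordered by ID within each date)
 A$select[date_indices[1:8]]<-all(1:8 %in% A$ID[date_indices])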
}
A
A[A$select,]

If you don't want to add a column you can set up "select" as a
vector.
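
A rough sketch of that vector variant, on a small stand-in data frame (`toy`, with made-up dates "d1" and "d2", is an assumption for illustration and not the real A). Note one deliberate difference from the loop above: this version marks rows by their ID value (1 to 8) rather than by their position within the date, so it does not depend on the rows being sorted.

```r
# Small stand-in for A (hypothetical): date "d1" has IDs 1..9, date "d2" only IDs 1..4
toy <- data.frame(Date = c(rep("d1", 9), rep("d2", 4)),
                  ID   = c(1:9, 1:4))

# Logical vector used instead of adding a column to the data frame
select <- logical(nrow(toy))
for (date in unique(toy$Date)) {
  # rows belonging to this date
  date_indices <- which(toy$Date == date)
  # TRUE only when every ID from 1 to 8 occurs on this date
  complete <- all(1:8 %in% toy$ID[date_indices])
  # mark the rows with ID 1..8, and only for complete dates
  select[date_indices] <- complete & toy$ID[date_indices] %in% 1:8
}
toy[select, ]
```

The same loop should work on A by replacing `toy` with `A`.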

Jim

