thr3ads.net - R help - [R] make new collumns with conditions [Jan 2021]

krissievdh

2021-Jan-25 16:01 UTC

[R] make new collumns with conditions

Hi,
Thanks for your response.

I do get what you're doing. However, the table I sent is just a small piece
of the complete database, so adding everything in by hand with
structure list (c ......) would be too much work.
Just to give you an idea, the database has around 16000 rows and 40
columns with other variables that I do want to keep. So I want to
find a way to keep everything and just add a couple of columns with the
calculated time for vigilant behavior and the percentage.

Still, thanks for thinking along with me. I am looking into the aggregate
function. Hopefully, this could be a solution.

krissie
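
One way to keep every row and column and only add the new columns is base R's ave(), which returns a group-wise result with the same length as its input. The following is a minimal sketch, not the method used in the thread. It assumes simplified column names (behavior, duration, obs_nr) in place of the real ones, and a small stand-in data frame built from the sample table:

```r
# Stand-in for the real data; the column names here are simplified assumptions.
d_vigi <- data.frame(
  behavior = c("Non-vigilant", "Vigilant", "Vigilant", "Non-vigilant",
               "Vigilant", "Vigilant", "Non-vigilant", "Unkown"),
  duration = c(5, 2, 2, 3, 7, 2, 1, 2),
  obs_nr   = c(1, 1, 1, 1, 2, 2, 2, 2),
  species  = "red deer"
)

# Total seconds per observation, repeated on every row of that observation.
d_vigi$total_s <- ave(d_vigi$duration, d_vigi$obs_nr, FUN = sum)

# Vigilant seconds per observation: set the other rows to 0 before summing,
# so an observation without vigilant behaviour gets 0 instead of dropping out.
vigi_only <- ifelse(d_vigi$behavior == "Vigilant", d_vigi$duration, 0)
d_vigi$vigilant_s <- ave(vigi_only, d_vigi$obs_nr, FUN = sum)

# Percentage of the observation spent on vigilant behaviour.
d_vigi$vigilant_pct <- round(100 * d_vigi$vigilant_s / d_vigi$total_s)
```

Because nothing is aggregated away, all other columns of the data frame stay untouched and the three new columns are simply appended.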






On Mon, 25 Jan 2021 at 16:44, Rui Barradas <ruipbarradas at sapo.pt> wrote:
> Hello,
>
> Try the following.
> First aggregate the data, then get the totals, then the percentages.
> Finally, put the species in the result.
>
>
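> # Sum the vigilant durations per observation. The formula is passed
> # positionally because the name of aggregate()'s first argument differs
> # between R versions.
> agg <- aggregate(`duration(s)` ~ `observation nr` + `behavior type`,
>                   data = d_vigi,
>                   FUN = sum,
>                   subset = `behavior type` == 'Vigilant')
> # Total duration per observation, all behaviors included.
> agg$total <- tapply(d_vigi$`duration(s)`, d_vigi$`observation nr`, FUN = sum)
> agg$percent <- round(100 * agg$`duration(s)`/agg$total)
>
> # Attach the species and drop the duplicated rows.
> res <- merge(agg, d_vigi[c(1, 3:4)])
> res[!duplicated(res), ]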
>
>
> Data in dput format:
>
>
> d_vigi <-
> structure(list(`behavior type` = c("Non-vigilant", "Vigilant",
> "Vigilant", "Non-vigilant", "Vigilant", "Vigilant", "Non-vigilant",
> "Unkown"), `duration(s)` = c(5L, 2L, 2L, 3L, 7L, 2L, 1L, 2L),
>      `observation nr` = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), species =
>      c("red deer", "red deer", "red deer", "red deer", "red deer",
>      "red deer", "red deer", "red deer")), class = "data.frame",
>      row.names = c(NA, -8L))
>
>
> Hope this helps,
>
> Rui Barradas
>
> At 13:57 on 25/01/21, krissievdh wrote:
> > Hi,
> >
> > I have a dataset (d_vigi) with this kind of data:
> >
> > behavior type   duration(s)   observation nr   species
> > Non-vigilant    5             1                red deer
> > Vigilant        2             1                red deer
> > Vigilant        2             1                red deer
> > Non-vigilant    3             1                red deer
> > Vigilant        7             2                red deer
> > Vigilant        2             2                red deer
> > Non-vigilant    1             2                red deer
> > Unkown          2             2                red deer
> >
> > Now I have to calculate the percentage of time spent on vigilant behavior
> > per observation.
> >
> > So eventually I will need to end up with something like this:
> >
> > Observation nr   Species    vigilant(s)   total (s)   percentage of vigilant (%)
> > 1                red deer   4             12          33
> > 2                red deer   9             12          75
> >
> >
> > Now I know how to calculate the total amount of seconds per observation.
> > But I don't know how to get the total seconds of vigilant behavior per
> > observation (red numbers). If I could get there, I would know how to
> > calculate the percentage.
> >
> >
> > I calculated the total duration per observation this way:
> > for(id in d_vigi$Obs.nr){
> >   d_vigi$t.duration[d_vigi$Obs.nr==id]<-sum(d_vigi$'Duration.(s).x'[d_vigi$Obs.nr==id])
> > }
> >
> > This does work and gives me the total (s), but I don't know how to get
> > the sum of the seconds just for the vigilant behavior per observation
> > number. Is there anyone who could help me?
> >
> > Thanks,
> > Krissie

krissievdh

2021-Jan-25 16:29 UTC

[R] make new collumns with conditions

Hi,

So one thing I managed to do was this:

d_vigi$combi <- paste(d_vigi$Behavioral.category, d_vigi$Obs.nr, sep = "-")

This created a new column with a combination of the category and the
observation number.
Afterwards I did this:
for(id in d_vigi$combi){
  d_vigi$durationpercat[d_vigi$combi==id]<-sum(d_vigi$'Duration.(s).x'[d_vigi$combi==id])
}

So this created another new column with the correct duration per category.
So that means that I have this:

Behavioral category   Duration   Obs nr   species    combi            durationpercat
Non-vigilant          5          1        red deer   Non-vigilant-1   8
Vigilant              2          1        red deer   Vigilant-1       4
Vigilant              2          1        red deer   Vigilant-1       4
Non-vigilant          3          1        red deer   Non-vigilant-1   8
Vigilant              7          2        red deer   Vigilant-2       9
Vigilant              2          2        red deer   Vigilant-2       9
Non-vigilant          1          2        red deer   Non-vigilant-2   1
Unknown               2          2        red deer   Unknown-2        2

However, this doesn't work for me further along the line. I need to have
the duration for vigilant behaviour in a separate column, and I really don't
know how to get there.

Hopefully, you understand where my problem lies. I need three columns,
for vigilant, non-vigilant and unknown. That way I could add zeros for
the observations where there wasn't any vigilant behaviour.

Krissie
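
A possible way to get one column per category, with zeros where a category never occurs in an observation, is a cross-tabulation of summed durations with xtabs(). This is only a sketch under assumptions: the column names (behavior, duration, obs_nr) are simplified stand-ins, and the data frame is rebuilt from the sample rows rather than taken from the real database:

```r
# Stand-in for the real data; the column names here are simplified assumptions.
d_vigi <- data.frame(
  behavior = c("Non-vigilant", "Vigilant", "Vigilant", "Non-vigilant",
               "Vigilant", "Vigilant", "Non-vigilant", "Unkown"),
  duration = c(5, 2, 2, 3, 7, 2, 1, 2),
  obs_nr   = c(1, 1, 1, 1, 2, 2, 2, 2),
  species  = "red deer"
)

# Sum the durations for every observation x category combination.
# Combinations that do not occur in the data come out as 0.
tab <- xtabs(duration ~ obs_nr + behavior, data = d_vigi)

# Turn the table into a data frame with one row per observation
# and one column per behaviour category.
wide <- as.data.frame.matrix(tab)
wide$obs_nr <- as.numeric(rownames(wide))

# Attach the per-category sums to every original row.
res <- merge(d_vigi, wide, by = "obs_nr")
```

The merge keeps all original rows and columns, so the percentage can then be computed from the vigilant column and a per-observation total.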

Michael Dewey

2021-Jan-25 17:33 UTC

[R] make new collumns with conditions

Dear Krissie

I think you misunderstood Rui's response. He was generating some fake
data to test the code, not suggesting you rebuild your data frame.

Michael

-- 
Michael
http://www.dewey.myzen.co.uk/home.html
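
To illustrate the point about the structure(list(...)) block: such text is normally produced by dput(), not typed by hand, and it only serves to recreate a small sample on someone else's machine. A short sketch, where the data frame `big` is a hypothetical stand-in for a data set that is already loaded:

```r
# Hypothetical stand-in for a data frame that already exists in the session.
big <- data.frame(obs_nr   = rep(1:3, each = 2),
                  duration = c(5, 2, 7, 2, 1, 2))

# dput() writes out R code that recreates an object; here, the first 4 rows.
sample_code <- capture.output(dput(head(big, 4)))

# Anyone can paste that code into R to rebuild the same sample.
rebuilt <- eval(parse(text = sample_code))
```

The analysis code itself is then run on the full data frame as it is. Nothing has to be retyped.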
