thr3ads.net - R help - [R] Replace NAs in split lists [Jan 2018]

Ek Esawi

2018-Jan-08 16:03 UTC

[R] Replace NAs in split lists

Thank you, Jeff. Your code works perfectly, as usual. I am just
wondering why I get an error message when I put the whole code on one
line:
sdf2 <- lapply( sdf, function(z){z$Value <-ifelse(is.na(z$Value),z$Value[!is.na(z$Value)][1],z$Value)z})
error. unexpected symbol in sdf2

Thanks again

EK


On Mon, Jan 8, 2018 at 3:12 AM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
> Upon closer examination I see that you are not using the split version of
> df1 as I usually would, so here is a reproducible example:
>
> #----
> df1 <- read.table( text=
> "ID ID_2 Firist Value
> 1  a   aa   TRUE     2
> 2  a   ab  FALSE    NA
> 3  a   ac  FALSE    NA
> 4  b   aa   TRUE     5
> 5  b   ab  FALSE    NA
> ", header=TRUE, as.is=TRUE )
>
> sdf <- split( df1, df1$ID )
> # note the extra [ 1 ] in case you have more than one non-NA value per ID
> sdf2 <- lapply( sdf
>               , function( z ) {
>                  z$Value <- ifelse( is.na( z$Value )
>                                   , z$Value[ !is.na( z$Value ) ][ 1 ]
>                                   , z$Value
>                                   )
>                  z
>                 }
>               )
> df2 <- do.call( rbind, sdf2 )
> df2
> #>     ID ID_2 Firist Value
> #> a.1  a   aa   TRUE     2
> #> a.2  a   ab  FALSE     2
> #> a.3  a   ac  FALSE     2
> #> b.4  b   aa   TRUE     5
> #> b.5  b   ab  FALSE     5
>
> # or using tidyverse methods
>
> library(dplyr)
> #>
> #> Attaching package: 'dplyr'
> #> The following objects are masked from 'package:stats':
> #>
> #>     filter, lag
> #> The following objects are masked from 'package:base':
> #>
> #>     intersect, setdiff, setequal, union
> df3 <- (   df1
>        %>% group_by( ID )
>        %>% do({
>               mutate( .
>                     , Value = ifelse( is.na( Value )
>                                     , Value[ !is.na( Value ) ][ 1 ]
>                                     , Value
>                                     )
>                     )
>            })
>        %>% ungroup
>        )
> df3
> #> # A tibble: 5 x 4
> #>   ID    ID_2  Firist Value
> #>   <chr> <chr> <lgl>  <int>
> #> 1 a     aa    T          2
> #> 2 a     ab    F          2
> #> 3 a     ac    F          2
> #> 4 b     aa    T          5
> #> 5 b     ab    F          5
> #----
>
>
> On Sun, 7 Jan 2018, Jeff Newmiller wrote:
>
>> Why do you want to modify df1?
>>
>> Why not just reassemble the parts as a new data frame and use that going
>> forward in your calculations? That is generally the preferred approach in R
>> so you can re-do your calculations easily if you find a mistake later.
>> --
>> Sent from my phone. Please excuse my brevity.
>>
>> On January 7, 2018 7:35:59 PM PST, Ek Esawi <esawiek at gmail.com> wrote:
>>>
>>> I just came up with a solution right after I posted the question, but
>>> I figured there must be a better and shorter one than my solution:
>>> sdf1[[1]][1,4]<-lapplyresults[[1]]
>>> sdf1[[2]][1,4]<-lapplyresults[[2]]
>>>
>>> EK
>>>
>>> On Sun, Jan 7, 2018 at 10:13 PM, Ek Esawi <esawiek at gmail.com> wrote:
>>>>
>>>> Hi all--
>>>>
>>>> I stumbled on this problem online. I did not like the solution given
>>>> there, which was a long UDF. I wondered why split and lapply/sapply
>>>> could not work here. My aim is to split the data frame, use
>>>> lapply/sapply to make changes to the split lists, and combine the
>>>> split lists into a new data frame with the desired changes/output.
>>>>
>>>> The data frame shown below has a column named ID which has 2 values,
>>>> a and b; I want to replace the NAs in the Value column by 2, which is
>>>> the only numeric entry, for ID=a and by 5 for ID=b.
>>>>
>>>> I worked out the solution but could not replace the results in the
>>>> split lists.
>>>>
>>>>
>>>> Original data frame, df1
>>>>   ID ID_2 Firist Value
>>>> 1  a   aa   TRUE     2
>>>> 2  a   ab  FALSE    NA
>>>> 3  a   ac  FALSE    NA
>>>> 4  b   aa   TRUE     5
>>>> 5  b   ab  FALSE    NA
>>>>
>>>> sdf1
>>>> $a
>>>>   ID ID_2 Firist Value
>>>> 1  a   aa   TRUE     2
>>>> 2  a   ab  FALSE    NA
>>>> 3  a   ac  FALSE    NA
>>>>
>>>> $b
>>>>   ID ID_2 Firist Value
>>>> 4  b   aa   TRUE     5
>>>> 5  b   ab  FALSE    NA
>>>>
>>>> Desired results
>>>> $a
>>>>   ID ID_2 Firist Value
>>>> 1  a   aa   TRUE     2
>>>> 2  a   ab  FALSE     2
>>>> 3  a   ac  FALSE     2
>>>>
>>>> $b
>>>>   ID ID_2 Firist Value
>>>> 4  b   aa   TRUE     5
>>>> 5  b   ab  FALSE     5
>>>>
>>>> My code
>>>>
>>>> sdf <- split(df1,df1$ID)
>>>> lapply(sdf, function(z) ifelse(is.na(z$Value),z$Value[!is.na(z$Value)],z$Value))
>>>>
>>>> result:
>>>> $ a: num [1:3] 2 2 2
>>>> $ b: num [1:2] 5 5
>>>>
>>>> How could I put these two lists back in the split data frame, sdf1?
>>>> Then I could use do.call to reassemble a data frame from the split
>>>> lists.
>>>>
>>>> Thanks,
>>>> EK
>>>
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> ---------------------------------------------------------------------------
> Jeff Newmiller                        The     .....       .....  Go Live...
> DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>                                       Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
> ---------------------------------------------------------------------------
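The comment about the extra `[ 1 ]` is easier to see with a group that has more than one non-NA value. The sketch below uses a made-up vector `v` (an assumption for illustration, not data from this thread). `ifelse()` recycles its `yes` argument to the length of the test, so without `[ 1 ]` each NA receives whichever non-NA value happens to line up with its position rather than the first one.

```r
# Hypothetical group with two non-NA values (not from the thread's data)
v <- c(2, NA, 7, NA)

# Without [1]: the non-NA values c(2, 7) are recycled to c(2, 7, 2, 7),
# so each NA is filled with the recycled value at its own position
no_index <- ifelse(is.na(v), v[!is.na(v)], v)

# With [1]: every NA is filled with the first non-NA value
with_index <- ifelse(is.na(v), v[!is.na(v)][1], v)

print(no_index)    # 2 7 7 7
print(with_index)  # 2 2 7 2
```

With only one non-NA value per ID, as in the thread's df1, both forms give the same answer, which is why the original code appeared to work.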

Jeff Newmiller

2018-Jan-08 16:44 UTC

[R] Replace NAs in split lists

I don't know. You seem to be posting in HTML so your code is mangled. Can
you post plain text and use the reprex package to make sure it produces the
error in a clean R session?
-- 
Sent from my phone. Please excuse my brevity.


Ek Esawi

2018-Jan-08 16:55 UTC

[R] Replace NAs in split lists

Oops! Sorry, I did indeed post the code in HTML; I should have known better.

sdf2 <- lapply( sdf, function(z){z$Value <-ifelse(is.na(z$Value),z$Value[!is.na(z$Value)][1],z$Value)z})
error. unexpected symbol in sdf2
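
A likely cause, assuming the one-liner was typed exactly as shown: inside `{ }`, R needs either a newline or a semicolon between two expressions. In the multi-line version the line break separates the assignment from the final `z`; on a single line nothing does, so the parser reports an unexpected symbol. The sketch below rebuilds df1 from this thread, shows that the one-line form fails to parse, and shows that a semicolon before the final `z` is enough to fix it. The names `bad` and `parse_result` are introduced here only for illustration.

```r
# Rebuild the example data from the thread
df1 <- read.table(text = "ID ID_2 Firist Value
1  a   aa   TRUE     2
2  a   ab  FALSE    NA
3  a   ac  FALSE    NA
4  b   aa   TRUE     5
5  b   ab  FALSE    NA
", header = TRUE, as.is = TRUE)
sdf <- split(df1, df1$ID)

# The one-line form fails at parse time: nothing separates `...)` from `z`
bad <- "lapply(sdf, function(z){z$Value <- ifelse(is.na(z$Value), z$Value[!is.na(z$Value)][1], z$Value)z})"
parse_result <- try(parse(text = bad), silent = TRUE)
print(inherits(parse_result, "try-error"))

# A semicolon before the final `z` makes the same code valid on one line
sdf2 <- lapply(sdf, function(z){z$Value <- ifelse(is.na(z$Value), z$Value[!is.na(z$Value)][1], z$Value); z})
df2 <- do.call(rbind, sdf2)
print(df2)
```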

