thr3ads.net - R help - [R] Fill NA values in columns with values of another column [Aug 2024]


Rui Barradas

2024-Aug-28 08:18 UTC

[R] Fill NA values in columns with values of another column

At 11:23 on 27/08/2024, Francesca PANCOTTO via R-help wrote:
> Dear Contributors,
> I have a problem with a database composed of many individuals for many
> periods, for which I need to perform a manipulation of data as follows.
> Here I report the procedure I need to do for the first 32 observations of
> the first period.
> 
> 
> cbind(VB1d[,1],s1id[,1])
>        [,1] [,2]
>   [1,]    6    8
>   [2,]    9    5
>   [3,]   NA    1
>   [4,]    5    6
>   [5,]   NA    7
>   [6,]   NA    2
>   [7,]    4    4
>   [8,]    2    7
>   [9,]    2    7
> [10,]   NA    3
> [11,]   NA    2
> [12,]   NA    4
> [13,]    5    6
> [14,]    9    5
> [15,]   NA    5
> [16,]   NA    6
> [17,]   10    3
> [18,]    7    2
> [19,]    2    1
> [20,]   NA    7
> [21,]    7    2
> [22,]   NA    8
> [23,]   NA    4
> [24,]   NA    5
> [25,]   NA    6
> [26,]    2    1
> [27,]    4    4
> [28,]    6    8
> [29,]   10    3
> [30,]   NA    3
> [31,]   NA    8
> [32,]   NA    1
> 
> 
> In column s1id, I have numbers from 1 to 8, which are the ids of 8 groups,
> randomly mixed in the larger group of 32.
> For each group, I want the value that is reported for only two group
> members to be assigned to all four group members.
> 
> For example, the value 8 in the first row, second column, is group 8. The
> value of the variable VB1d for group 8 is 6. At row 28, again for s1id equal
> to 8, I have 6.
> But in row 22, where the second variable is also 8, VB1d reports NA.
> The pattern is the same in each group: only two members have the correct
> number, the other two are NA.
> I need each group, identified by the values of the variable s1id,
> to correctly report the value of VB1d that is present for just two
> group members.
> 
> I hope my explanation is acceptable.
> The task appears complex to me right now, especially because I will need to
> multiply this procedure for x12x14 similar databases.
> 
> Has anyone ever encountered a similar problem?
> Thanks in advance for any help provided.
> 
> ----------------------------------
> 
> Francesca Pancotto
> 
> Associate Professor Political Economy
> 
> University of Modena, Largo Santa Eufemia, 19, Modena
> 
> Office Phone: +39 0522 523264
> 
> Web: https://sites.google.com/view/francescapancotto/home
> 
>   ----------------------------------
> 
> 
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Hello,

Here is a solution.
Split the 1st column by the 2nd, keep only the non-NA values and unlist
to get a named vector.
Then combine the names and the values with cbind.



mat <- structure(
   c(6L, 9L, NA, 5L, NA, NA, 4L, 2L, 2L, NA, NA, NA, 5L,
     9L, NA, NA, 10L, 7L, 2L, NA, 7L, NA, NA, NA, NA, 2L, 4L, 6L,
     10L, NA, NA, NA, 8L, 5L, 1L, 6L, 7L, 2L, 4L, 7L, 7L, 3L, 2L,
     4L, 6L, 5L, 5L, 6L, 3L, 2L, 1L, 7L, 2L, 8L, 4L, 5L, 6L, 1L, 4L,
     8L, 3L, 3L, 8L, 1L), dim = c(32L, 2L))


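# Split VB1d by group id, drop the NAs within each group, then flatten.
# unlist() names each element "<group id><position>", e.g. "81", "82".
res <- split(mat[, 1L], mat[, 2L]) |>
   lapply(\(x) x[!is.na(x)]) |>
   unlist()
nms <- names(res)
# Recover the group id by stripping the trailing position digit
# (valid here because every group keeps exactly two values).
res <- cbind(
   VB1d = res,
   s1id = substr(nms, 1, nchar(nms) - 1L) |> as.integer()
)
res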
#>    VB1d s1id
#> 11    2    1
#> 12    2    1
#> 21    7    2
#> 22    7    2
#> 31   10    3
#> 32   10    3
#> 41    4    4
#> 42    4    4
#> 51    9    5
#> 52    9    5
#> 61    5    6
#> 62    5    6
#> 71    2    7
#> 72    2    7
#> 81    6    8
#> 82    6    8



Hope this helps,

Rui Barradas
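
The reply above returns one row per non-NA value, grouped by s1id. If the goal is instead to fill the NA entries in place, as the original question describes, one possible sketch uses base R's ave(). This is not part of Rui's reply: the helper name fill_by_group is hypothetical, and the code assumes that every group has a single distinct non-NA value, as in the posted data.

```r
# Same 32 x 2 matrix as in the reply: column 1 is VB1d, column 2 is s1id.
mat <- structure(
  c(6L, 9L, NA, 5L, NA, NA, 4L, 2L, 2L, NA, NA, NA, 5L,
    9L, NA, NA, 10L, 7L, 2L, NA, 7L, NA, NA, NA, NA, 2L, 4L, 6L,
    10L, NA, NA, NA, 8L, 5L, 1L, 6L, 7L, 2L, 4L, 7L, 7L, 3L, 2L,
    4L, 6L, 5L, 5L, 6L, 3L, 2L, 1L, 7L, 2L, 8L, 4L, 5L, 6L, 1L, 4L,
    8L, 3L, 3L, 8L, 1L), dim = c(32L, 2L))

# Hypothetical helper: within each group, give every member the group's
# first non-NA value. Groups with no known value are left unchanged.
fill_by_group <- function(values, groups) {
  ave(values, groups, FUN = function(x) {
    known <- x[!is.na(x)]
    if (length(known) == 0L) x else rep(known[1L], length(x))
  })
}

# Row order is preserved, so the result lines up with the original data.
out <- cbind(VB1d = fill_by_group(mat[, 1L], mat[, 2L]), s1id = mat[, 2L])
out[c(1, 22, 28, 31), ]  # the four members of group 8
```

Because the helper takes two plain vectors, it could in principle be applied to each period's pair of columns in turn, which may suit the many similar databases mentioned in the question.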



Ebert,Timothy Aaron

2024-Aug-28 15:24 UTC

[R] Fill NA values in columns with values of another column

Why not use na.omit() and then go from there? Unless one handles NA differently
in different groups there is no point in processing the data by groups to remove
NA even if later analysis steps do require group information.

Tim
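
One way to read the na.omit() suggestion is sketched below. It is an interpretation, not code from Tim's message: na.omit() drops the incomplete rows, unique() reduces what is left to one row per group, and match() maps that lookup table back onto all 32 rows. The column names are added here for readability, and the sketch assumes each group has exactly one distinct non-NA value.

```r
# Same data as in the thread, with column names added for readability.
mat <- structure(
  c(6L, 9L, NA, 5L, NA, NA, 4L, 2L, 2L, NA, NA, NA, 5L,
    9L, NA, NA, 10L, 7L, 2L, NA, 7L, NA, NA, NA, NA, 2L, 4L, 6L,
    10L, NA, NA, NA, 8L, 5L, 1L, 6L, 7L, 2L, 4L, 7L, 7L, 3L, 2L,
    4L, 6L, 5L, 5L, 6L, 3L, 2L, 1L, 7L, 2L, 8L, 4L, 5L, 6L, 1L, 4L,
    8L, 3L, 3L, 8L, 1L),
  dim = c(32L, 2L), dimnames = list(NULL, c("VB1d", "s1id")))

# Drop every row that contains an NA; no grouping is needed for this step.
complete <- na.omit(mat)

# Collapse the duplicated (value, group) pairs into one row per group.
lookup <- unique(complete)

# Look up each row's group in the table to get a value for all 32 rows.
filled <- lookup[match(mat[, "s1id"], lookup[, "s1id"]), "VB1d"]
out <- cbind(VB1d = filled, s1id = mat[, "s1id"])
```

Note that na.omit() on its own discards the rows with NA rather than filling them, so the match() step is what restores the full set of 32 observations.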

