thr3ads.net - R help - [R] Fill NA values in columns with values of another column [Aug 2024]

If this information is useful, please help other people find it:
Share via:

Petr Pikal

2024-Aug-29 05:53 UTC

[R] Fill NA values in columns with values of another column

Hallo Francesca

If you had an object with correct setting, something like template
> dput(res)structure(list(V1 = c("1", "2", "3",
"4", "5", "6", "7", "8"),
    V2 = c(2, 7, 10, 4, 9, 5, 2, 6)), class = "data.frame", row.names
c("1",
"2", "3", "4", "5", "6",
"7", "8"))

you could merge it with your object where some values are missing
> dput(daf)structure(list(X1 = c(6L, 9L, NA, 5L, NA, NA, 4L, 2L, 2L, NA,
NA, NA, 5L, 9L, NA, NA, 10L, 7L, 2L, NA, 7L, NA, NA, NA, NA,
2L, 4L, 6L, 10L, NA, NA, NA), X2 = c(8L, 5L, 1L, 6L, 7L, 2L,
4L, 7L, 7L, 3L, 2L, 4L, 6L, 5L, 5L, 6L, 3L, 2L, 1L, 7L, 2L, 8L,
4L, 5L, 6L, 1L, 4L, 8L, 3L, 3L, 8L, 1L)), class = "data.frame",
row.names c(NA,
-32L))
> merge(daf, res, by.x="X2", by.y="V1")   X2 X1 V2
1   1 NA  2
2   1 NA  2
3   1  2  2
4   1  2  2
5   2 NA  7
6   2 NA  7
7   2  7  7
8   2  7  7
9   3 10 10
10  3 NA 10
11  3 10 10
12  3 NA 10
13  4  4  4
14  4 NA  4
15  4  4  4
16  4 NA  4
17  5  9  9
18  5 NA  9
19  5 NA  9

Cheers.
Petr




st 28. 8. 2024 v 0:45 odes?latel Francesca PANCOTTO via R-help <
r-help at r-project.org> napsal:
> Dear Contributors,
> I have a problem with a database composed of many individuals for many
> periods, for which I need to perform a manipulation of data as follows.
> Here I report the procedure I need to do for the first 32 observations of
> the first period.
>
>
> cbind(VB1d[,1],s1id[,1])
>       [,1] [,2]
>  [1,]    6    8
>  [2,]    9    5
>  [3,]   NA    1
>  [4,]    5    6
>  [5,]   NA    7
>  [6,]   NA    2
>  [7,]    4    4
>  [8,]    2    7
>  [9,]    2    7
> [10,]   NA    3
> [11,]   NA    2
> [12,]   NA    4
> [13,]    5    6
> [14,]    9    5
> [15,]   NA    5
> [16,]   NA    6
> [17,]   10    3
> [18,]    7    2
> [19,]    2    1
> [20,]   NA    7
> [21,]    7    2
> [22,]   NA    8
> [23,]   NA    4
> [24,]   NA    5
> [25,]   NA    6
> [26,]    2    1
> [27,]    4    4
> [28,]    6    8
> [29,]   10    3
> [30,]   NA    3
> [31,]   NA    8
> [32,]   NA    1
>
>
> In column s1id, I have numbers from 1 to 8, which are the id of 8 groups ,
> randomly mixed in the larger group of 32.
> For each group, I want the value that is reported for only to group
> members, to all the four group members.
>
> For example, value 8 in first row , second column, is group 8. The value
> for group 8 of the variable VB1d is 6. At row 28, again for s1id equal to
> 8, I have 6.
> But in row 22, the value 8 of the second variable, reports a value NA.
> in each group is the same, only two values have the correct number, the
> other two are NA.
> I need that each group, identified by the values of the variable S1id,
> correctly report the number of variable VB1d that is present for just two
> group members.
>
> I hope my explanation is acceptable.
> The task appears complex to me right now, especially because I will need to
> multiply this procedure for x12x14 similar databases.
>
> Anyone has ever encountered a similar problem?
> Thanks in advance for any help provided.
>
> ----------------------------------
>
> Francesca Pancotto
>
> Associate Professor Political Economy
>
> University of Modena, Largo Santa Eufemia, 19, Modena
>
> Office Phone: +39 0522 523264
>
> Web: *https://sites.google.com/view/francescapancotto/home
> <https://sites.google.com/view/francescapancotto/home>*
>
>  ----------------------------------
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> https://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
	[[alternative HTML version deleted]]

Bert Gunter

2024-Aug-29 21:29 UTC

head link

[R] Fill NA values in columns with values of another column

Petr et.al:

I think using merge is a very nice idea! (note that the email omitted the
last rows of the result, though your code of course produced them)

The only minor problem is that the order of the rows in the result is
changed from the original. If the OP needs to preserve the original
ordering, that can be easily done. Here is a complete implementation of
your idea (I think).

## assume that dat is a data frame with the first two columns as in the
OP's post, i.e. the first column is the Value with NA's and the second
is
the Group

 spl <-  dat |> nrow() |> seq_len() |> split(dat[[2]]) |>
unlist() ## for
reordering
 dat[spl, 1:2] <-
    dat[, 1:2] |>
    na.omit() |>  ## remove rows with NA's
    unique() |>  ## remove duplicate rows
    merge(dat[, 1:2], by.x=2, by.y=2) |> _[, 2:1]  ## and now merge()

 Note the final reordering of the first two columns because of the way
merge() works.
I suspect there may be a slicker way to do this using unsplit(), but I
could not figure out how

The result is:> dat   Value Group
1      6     8
2      9     5
3      2     1
4      5     6
5      2     7
6      7     2
7      4     4
8      2     7
9      2     7
10    10     3
11     7     2
12     4     4
13     5     6
14     9     5
15     9     5
16     5     6
17    10     3
18     7     2
19     2     1
20     2     7
21     7     2
22     6     8
23     4     4
24     9     5
25     5     6
26     2     1
27     4     4
28     6     8
29    10     3
30    10     3
31     6     8
32     2     1

Cheers,
Bert




On Wed, Aug 28, 2024 at 10:53?PM Petr Pikal <petr.pik at gmail.com> wrote:
> Hallo Francesca
>
> If you had an object with correct setting, something like template
>
> > dput(res)
> structure(list(V1 = c("1", "2", "3",
"4", "5", "6", "7", "8"),
>     V2 = c(2, 7, 10, 4, 9, 5, 2, 6)), class = "data.frame",
row.names > c("1",
> "2", "3", "4", "5", "6",
"7", "8"))
>
> you could merge it with your object where some values are missing
>
> > dput(daf)
> structure(list(X1 = c(6L, 9L, NA, 5L, NA, NA, 4L, 2L, 2L, NA,
> NA, NA, 5L, 9L, NA, NA, 10L, 7L, 2L, NA, 7L, NA, NA, NA, NA,
> 2L, 4L, 6L, 10L, NA, NA, NA), X2 = c(8L, 5L, 1L, 6L, 7L, 2L,
> 4L, 7L, 7L, 3L, 2L, 4L, 6L, 5L, 5L, 6L, 3L, 2L, 1L, 7L, 2L, 8L,
> 4L, 5L, 6L, 1L, 4L, 8L, 3L, 3L, 8L, 1L)), class = "data.frame",
row.names > c(NA,
> -32L))
>
> > merge(daf, res, by.x="X2", by.y="V1")
>    X2 X1 V2
> 1   1 NA  2
> 2   1 NA  2
> 3   1  2  2
> 4   1  2  2
> 5   2 NA  7
> 6   2 NA  7
> 7   2  7  7
> 8   2  7  7
> 9   3 10 10
> 10  3 NA 10
> 11  3 10 10
> 12  3 NA 10
> 13  4  4  4
> 14  4 NA  4
> 15  4  4  4
> 16  4 NA  4
> 17  5  9  9
> 18  5 NA  9
> 19  5 NA  9
>
> Cheers.
> Petr
>
>
>
>
> st 28. 8. 2024 v 0:45 odes?latel Francesca PANCOTTO via R-help <
> r-help at r-project.org> napsal:
>
> > Dear Contributors,
> > I have a problem with a database composed of many individuals for many
> > periods, for which I need to perform a manipulation of data as
follows.
> > Here I report the procedure I need to do for the first 32 observations
of
> > the first period.
> >
> >
> > cbind(VB1d[,1],s1id[,1])
> >       [,1] [,2]
> >  [1,]    6    8
> >  [2,]    9    5
> >  [3,]   NA    1
> >  [4,]    5    6
> >  [5,]   NA    7
> >  [6,]   NA    2
> >  [7,]    4    4
> >  [8,]    2    7
> >  [9,]    2    7
> > [10,]   NA    3
> > [11,]   NA    2
> > [12,]   NA    4
> > [13,]    5    6
> > [14,]    9    5
> > [15,]   NA    5
> > [16,]   NA    6
> > [17,]   10    3
> > [18,]    7    2
> > [19,]    2    1
> > [20,]   NA    7
> > [21,]    7    2
> > [22,]   NA    8
> > [23,]   NA    4
> > [24,]   NA    5
> > [25,]   NA    6
> > [26,]    2    1
> > [27,]    4    4
> > [28,]    6    8
> > [29,]   10    3
> > [30,]   NA    3
> > [31,]   NA    8
> > [32,]   NA    1
> >
> >
> > In column s1id, I have numbers from 1 to 8, which are the id of 8
groups
> ,
> > randomly mixed in the larger group of 32.
> > For each group, I want the value that is reported for only to group
> > members, to all the four group members.
> >
> > For example, value 8 in first row , second column, is group 8. The
value
> > for group 8 of the variable VB1d is 6. At row 28, again for s1id equal
to
> > 8, I have 6.
> > But in row 22, the value 8 of the second variable, reports a value NA.
> > in each group is the same, only two values have the correct number,
the
> > other two are NA.
> > I need that each group, identified by the values of the variable S1id,
> > correctly report the number of variable VB1d that is present for just
two
> > group members.
> >
> > I hope my explanation is acceptable.
> > The task appears complex to me right now, especially because I will
need
> to
> > multiply this procedure for x12x14 similar databases.
> >
> > Anyone has ever encountered a similar problem?
> > Thanks in advance for any help provided.
> >
> > ----------------------------------
> >
> > Francesca Pancotto
> >
> > Associate Professor Political Economy
> >
> > University of Modena, Largo Santa Eufemia, 19, Modena
> >
> > Office Phone: +39 0522 523264
> >
> > Web: *https://sites.google.com/view/francescapancotto/home
> > <https://sites.google.com/view/francescapancotto/home>*
> >
> >  ----------------------------------
> >
> >         [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > https://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> https://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
	[[alternative HTML version deleted]]

R help - Aug 2024 - Fill NA values in columns with values of another column

[R] Fill NA values in columns with values of another column

[R] Fill NA values in columns with values of another column