Francesca PANCOTTO
2024-Aug-27 10:23 UTC
[R] Fill NA values in columns with values of another column
Dear Contributors,

I have a problem with a database composed of many individuals observed over many periods, for which I need to perform the following data manipulation. Here I report the data for the first 32 observations of the first period.

cbind(VB1d[,1],s1id[,1])
      [,1] [,2]
 [1,]    6    8
 [2,]    9    5
 [3,]   NA    1
 [4,]    5    6
 [5,]   NA    7
 [6,]   NA    2
 [7,]    4    4
 [8,]    2    7
 [9,]    2    7
[10,]   NA    3
[11,]   NA    2
[12,]   NA    4
[13,]    5    6
[14,]    9    5
[15,]   NA    5
[16,]   NA    6
[17,]   10    3
[18,]    7    2
[19,]    2    1
[20,]   NA    7
[21,]    7    2
[22,]   NA    8
[23,]   NA    4
[24,]   NA    5
[25,]   NA    6
[26,]    2    1
[27,]    4    4
[28,]    6    8
[29,]   10    3
[30,]   NA    3
[31,]   NA    8
[32,]   NA    1

In column s1id, I have numbers from 1 to 8, which are the ids of 8 groups, randomly mixed in the larger group of 32. For each group, I want the value that is reported for only two group members to be assigned to all four group members.

For example, the value 8 in the first row, second column, is group 8. The value of the variable VB1d for group 8 is 6. At row 28, again for s1id equal to 8, I have 6. But in row 22, where s1id is also 8, VB1d is NA. The same holds in every group: only two members have the correct number, and the other two are NA. I need each group, identified by the values of the variable s1id, to correctly report the value of VB1d that is present for just two group members.

I hope my explanation is acceptable. The task appears complex to me right now, especially because I will need to repeat this procedure for x12x14 similar databases.

Has anyone ever encountered a similar problem? Thanks in advance for any help provided.

----------------------------------
Francesca Pancotto
Associate Professor Political Economy
University of Modena, Largo Santa Eufemia, 19, Modena
Office Phone: +39 0522 523264
Web: https://sites.google.com/view/francescapancotto/home
----------------------------------
Bert Gunter
2024-Aug-27 23:05 UTC
[R] Fill NA values in columns with values of another column
Sorry, not clear to me. For group 8 in your example, do you want to extract the values in column 1 that are not NA, i.e. one value, 6; or do you want to extract the number of values -- that is, the count -- that are not NA, i.e. 1?

... and for group 5, would it be c(9,9) for the values; or 2 for the count? Or something else entirely if I have completely misunderstood.

Either of the above is easy and quick to do. You can also just remove the NA's via a version of ?na.omit if that's what you want.

Of course, feel free to ignore this and wait for a more helpful response from someone who understands your query better than I.

Cheers,
Bert
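The two readings of the question mentioned above (the non-NA values per group versus their count) can be sketched in base R as follows. This is only an illustration: the plain vectors VB1d and s1id are retyped from the printed example rather than taken from the poster's real objects, and the names vals and counts are made up for the sketch.

```r
# Example data retyped from the original post (first period, 32 rows)
VB1d <- c(6, 9, NA, 5, NA, NA, 4, 2, 2, NA, NA, NA, 5, 9, NA, NA,
          10, 7, 2, NA, 7, NA, NA, NA, NA, 2, 4, 6, 10, NA, NA, NA)
s1id <- c(8, 5, 1, 6, 7, 2, 4, 7, 7, 3, 2, 4, 6, 5, 5, 6,
          3, 2, 1, 7, 2, 8, 4, 5, 6, 1, 4, 8, 3, 3, 8, 1)

# Split the values by group id; the list names are "1" to "8"
by_group <- split(VB1d, s1id)

# Reading 1: the non-NA values reported in each group
vals <- lapply(by_group, function(x) x[!is.na(x)])

# Reading 2: how many non-NA values each group has
counts <- sapply(by_group, function(x) sum(!is.na(x)))

vals[["8"]]
counts
```

Neither reading fills in the missing entries; both only summarise what each group already reports.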
Rui Barradas
2024-Aug-28 08:18 UTC
[R] Fill NA values in columns with values of another column
Hello,

Here is a solution. Split the 1st column by the 2nd, keep only the not-NA values and unlist, to have a named vector. Then put the names and the values together with cbind.

mat <- structure(
  c(6L, 9L, NA, 5L, NA, NA, 4L, 2L, 2L, NA, NA, NA, 5L, 9L, NA, NA,
    10L, 7L, 2L, NA, 7L, NA, NA, NA, NA, 2L, 4L, 6L, 10L, NA, NA, NA,
    8L, 5L, 1L, 6L, 7L, 2L, 4L, 7L, 7L, 3L, 2L, 4L, 6L, 5L, 5L, 6L,
    3L, 2L, 1L, 7L, 2L, 8L, 4L, 5L, 6L, 1L, 4L, 8L, 3L, 3L, 8L, 1L),
  dim = c(32L, 2L))

# values of column 1 grouped by column 2, with the NA's dropped
res <- split(mat[, 1L], mat[, 2L]) |>
  lapply(\(x) x[!is.na(x)]) |>
  unlist()

# unlist() names each element <group id><position>; drop the last digit
nms <- names(res)
res <- cbind(
  VB1d = res,
  s1id = substr(nms, 1, nchar(nms) - 1L) |> as.integer()
)
res
#>    VB1d s1id
#> 11    2    1
#> 12    2    1
#> 21    7    2
#> 22    7    2
#> 31   10    3
#> 32   10    3
#> 41    4    4
#> 42    4    4
#> 51    9    5
#> 52    9    5
#> 61    5    6
#> 62    5    6
#> 71    2    7
#> 72    2    7
#> 81    6    8
#> 82    6    8

Hope this helps,

Rui Barradas
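If the goal is instead to keep all 32 rows in their original order and overwrite each NA with its group's value, one possible base R sketch uses ave(). The helper names fill_by_group and fill_all are hypothetical, the data are retyped from the example, and the multi-period part assumes that VB1d and s1id are matrices with one column per period, which the original post suggests but does not state outright.

```r
# Example data retyped from the original post (first period, 32 rows)
VB1d <- c(6, 9, NA, 5, NA, NA, 4, 2, 2, NA, NA, NA, 5, 9, NA, NA,
          10, 7, 2, NA, 7, NA, NA, NA, NA, 2, 4, 6, 10, NA, NA, NA)
s1id <- c(8, 5, 1, 6, 7, 2, 4, 7, 7, 3, 2, 4, 6, 5, 5, 6,
          3, 2, 1, 7, 2, 8, 4, 5, 6, 1, 4, 8, 3, 3, 8, 1)

# Hypothetical helper: give every member of a group the single
# non-NA value reported within that group, keeping the row order.
fill_by_group <- function(x, g) {
  ave(x, g, FUN = function(v) {
    known <- unique(v[!is.na(v)])
    if (length(known) == 0L) return(v)   # nothing reported: leave the NA's
    if (length(known) > 1L) stop("a group reports more than one distinct value")
    rep(known, length(v))
  })
}

filled <- fill_by_group(VB1d, s1id)
cbind(VB1d = filled, s1id = s1id)

# Assumed multi-period layout: one column per period in both matrices.
fill_all <- function(V, S) {
  sapply(seq_len(ncol(V)), function(j) fill_by_group(V[, j], S[, j]))
}

# Toy two-period input built by repeating the example column
V <- cbind(VB1d, VB1d)
S <- cbind(s1id, s1id)
filled_all <- fill_all(V, S)
```

The stop() call is a deliberate safeguard: if a group ever reported two different values, silently picking one would hide a data problem.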
Petr Pikal
2024-Aug-29 05:53 UTC
[R] Fill NA values in columns with values of another column
Hello Francesca,

If you had an object with the correct value for each group, something like a template

> dput(res)
structure(list(V1 = c("1", "2", "3", "4", "5", "6", "7", "8"),
    V2 = c(2, 7, 10, 4, 9, 5, 2, 6)), class = "data.frame",
    row.names = c("1", "2", "3", "4", "5", "6", "7", "8"))

you could merge it with your object where some values are missing

> dput(daf)
structure(list(X1 = c(6L, 9L, NA, 5L, NA, NA, 4L, 2L, 2L, NA, NA, NA,
    5L, 9L, NA, NA, 10L, 7L, 2L, NA, 7L, NA, NA, NA, NA, 2L, 4L, 6L,
    10L, NA, NA, NA), X2 = c(8L, 5L, 1L, 6L, 7L, 2L, 4L, 7L, 7L, 3L,
    2L, 4L, 6L, 5L, 5L, 6L, 3L, 2L, 1L, 7L, 2L, 8L, 4L, 5L, 6L, 1L,
    4L, 8L, 3L, 3L, 8L, 1L)), class = "data.frame",
    row.names = c(NA, -32L))

> merge(daf, res, by.x="X2", by.y="V1")
   X2 X1 V2
1   1 NA  2
2   1 NA  2
3   1  2  2
4   1  2  2
5   2 NA  7
6   2 NA  7
7   2  7  7
8   2  7  7
9   3 10 10
10  3 NA 10
11  3 10 10
12  3 NA 10
13  4  4  4
14  4 NA  4
15  4  4  4
16  4 NA  4
17  5  9  9
18  5 NA  9
19  5 NA  9

Cheers.
Petr
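The template does not have to be typed by hand. As a rough sketch, it can be derived from the non-NA rows of the data itself and then used as a lookup table with match(), which, unlike merge(), leaves the rows in their original order. The column names X1, X2, V1 and V2 follow the message above; template and X1_filled are names invented for this illustration.

```r
# Data frame as in the message above (values retyped from the example)
daf <- data.frame(
  X1 = c(6L, 9L, NA, 5L, NA, NA, 4L, 2L, 2L, NA, NA, NA, 5L, 9L, NA, NA,
         10L, 7L, 2L, NA, 7L, NA, NA, NA, NA, 2L, 4L, 6L, 10L, NA, NA, NA),
  X2 = c(8L, 5L, 1L, 6L, 7L, 2L, 4L, 7L, 7L, 3L, 2L, 4L, 6L, 5L, 5L, 6L,
         3L, 2L, 1L, 7L, 2L, 8L, 4L, 5L, 6L, 1L, 4L, 8L, 3L, 3L, 8L, 1L)
)

# Build the template: one row per group, taken from the non-NA rows
template <- unique(daf[!is.na(daf$X1), c("X2", "X1")])
names(template) <- c("V1", "V2")
template <- template[order(template$V1), ]

# Each group must appear exactly once, otherwise the lookup is ambiguous
stopifnot(!anyDuplicated(template$V1))

# Look up each row's group value; row order of daf is untouched
daf$X1_filled <- template$V2[match(daf$X2, template$V1)]
head(daf)
```

Writing the result to a new column rather than overwriting X1 makes it easy to compare the filled values against the reported ones.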