Bert Gunter
2024-Aug-27 23:05 UTC
[R] Fill NA values in columns with values of another column
Sorry, not clear to me.

For group 8 in your example, do you want to extract the values in column
1 that are not NA, i.e. one value, 6; or do you want to extract the
number of values -- that is, the count -- that are not NA, i.e. 1?

... and for group 5, would it be c(9,9) for the values; or 2 for the count?

Or something else entirely if I have completely misunderstood.

Either of the above are easy and quick to do. You can also just remove
the NA's via a version of ?na.omit if that's what you want.

Of course, feel free to ignore this and wait for a more helpful
response from someone who understands your query better than I.

Cheers,
Bert

On Tue, Aug 27, 2024 at 3:45 PM Francesca PANCOTTO via R-help
<r-help at r-project.org> wrote:
>
> Dear Contributors,
> I have a problem with a database composed of many individuals for many
> periods, for which I need to perform a manipulation of data as follows.
> Here I report the procedure I need to do for the first 32 observations of
> the first period.
>
> > cbind(VB1d[,1],s1id[,1])
>       [,1] [,2]
>  [1,]    6    8
>  [2,]    9    5
>  [3,]   NA    1
>  [4,]    5    6
>  [5,]   NA    7
>  [6,]   NA    2
>  [7,]    4    4
>  [8,]    2    7
>  [9,]    2    7
> [10,]   NA    3
> [11,]   NA    2
> [12,]   NA    4
> [13,]    5    6
> [14,]    9    5
> [15,]   NA    5
> [16,]   NA    6
> [17,]   10    3
> [18,]    7    2
> [19,]    2    1
> [20,]   NA    7
> [21,]    7    2
> [22,]   NA    8
> [23,]   NA    4
> [24,]   NA    5
> [25,]   NA    6
> [26,]    2    1
> [27,]    4    4
> [28,]    6    8
> [29,]   10    3
> [30,]   NA    3
> [31,]   NA    8
> [32,]   NA    1
>
>
> In column s1id, I have numbers from 1 to 8, which are the id of 8 groups ,
> randomly mixed in the larger group of 32.
> For each group, I want the value that is reported for only to group
> members, to all the four group members.
>
> For example, value 8 in first row , second column, is group 8. The value
> for group 8 of the variable VB1d is 6. At row 28, again for s1id equal to
> 8, I have 6.
> But in row 22, the value 8 of the second variable, reports a value NA.
> in each group is the same, only two values have the correct number, the
> other two are NA.
> I need that each group, identified by the values of the variable S1id,
> correctly report the number of variable VB1d that is present for just two
> group members.
>
> I hope my explanation is acceptable.
> The task appears complex to me right now, especially because I will need to
> multiply this procedure for x12x14 similar databases.
>
> Anyone has ever encountered a similar problem?
> Thanks in advance for any help provided.
>
> ----------------------------------
>
> Francesca Pancotto
>
> Associate Professor Political Economy
>
> University of Modena, Largo Santa Eufemia, 19, Modena
>
> Office Phone: +39 0522 523264
>
> Web: *https://sites.google.com/view/francescapancotto/home
> <https://sites.google.com/view/francescapancotto/home>*
>
> ----------------------------------
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
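The two readings Bert describes (filling in the group's value versus counting the non-NA entries) can both be sketched in base R. This is only an illustration under stated assumptions: the vectors `VB1d_col` and `s1id_col` are hypothetical names holding the 32 rows printed in the question, and the fill step assumes every non-NA entry within a group carries the same value, as the question says.

```r
# The 32 rows printed in the question (vector names are hypothetical)
VB1d_col <- c(6, 9, NA, 5, NA, NA, 4, 2, 2, NA, NA, NA, 5, 9, NA, NA,
              10, 7, 2, NA, 7, NA, NA, NA, NA, 2, 4, 6, 10, NA, NA, NA)
s1id_col <- c(8, 5, 1, 6, 7, 2, 4, 7, 7, 3, 2, 4, 6, 5, 5, 6,
              3, 2, 1, 7, 2, 8, 4, 5, 6, 1, 4, 8, 3, 3, 8, 1)

# Reading 1: copy each group's reported value into that group's NA slots.
# ave() applies FUN within each level of s1id_col and keeps the row order.
filled <- ave(VB1d_col, s1id_col, FUN = function(x) {
  x[is.na(x)] <- x[!is.na(x)][1]  # first non-NA value in the group
  x
})

# Reading 2: count the non-NA entries per group
counts <- tapply(!is.na(VB1d_col), s1id_col, sum)

# Side-by-side view in the same layout as the question
cbind(filled, s1id_col)
```

A group with no reported value at all would simply stay NA in `filled`, since indexing an empty vector with `[1]` yields NA.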
CALUM POLWART
2024-Aug-28 00:07 UTC
[R] Fill NA values in columns with values of another column
Bert,

I thought she meant she wanted to replace the NAs with the 6. But I could be wrong.

It looks like the data is combined with cbind. I'm going to give tidyverse examples because it's (/s) *"always"* (/s) easier.

    library(tidyverse)

    # cbind() gives an unnamed matrix, so build a tibble with named columns
    myData <- tibble(VB1d = VB1d[, 1], s1id = s1id[, 1])

    # reference table: one row per group with its non-NA value
    referenceData <- myData |>
      filter(!is.na(VB1d)) |>
      distinct()

    # impute the missing NAs by joining the reference value back on the group id
    cleanData <- myData |>
      select(s1id) |>
      left_join(referenceData, by = "s1id")

You will notice I've named the columns when building the data, because cbind() returns a matrix without column names and the dplyr verbs need names to work with. I'm typing this on my phone, so it's untested.

If you wanted counts:

    myData |>
      filter(!is.na(VB1d)) |>
      group_by(s1id) |>
      summarise(n = n())

I won't answer the c(9,9) that Bert mentions because that's an extra question of what you do next with the data to know how best to present it.
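The original question also mentions repeating the procedure for many periods. A minimal base R sketch of that step follows, assuming (as the `VB1d[,1]` and `s1id[,1]` indexing suggests, but the thread does not confirm) that `VB1d` and `s1id` are matrices with one column per period. The helper `fill_by_group` and the small toy matrices are made up for illustration and are not the poster's data.

```r
# Hypothetical helper: fill NA entries with the group's first non-NA value
fill_by_group <- function(values, groups) {
  ave(values, groups, FUN = function(x) {
    x[is.na(x)] <- x[!is.na(x)][1]
    x
  })
}

# Toy stand-ins: 8 observations, 2 periods (made-up numbers)
VB1d <- cbind(c(6, NA, 9, NA, NA, 6, 9, NA),
              c(NA, 3, NA, 7, 3, NA, 7, NA))
s1id <- cbind(c(1, 1, 2, 2, 1, 1, 2, 2),
              c(1, 1, 2, 2, 1, 2, 2, 1))

# Apply the fill period by period; sapply() binds the results into a matrix
filled <- sapply(seq_len(ncol(VB1d)), function(j) {
  fill_by_group(VB1d[, j], s1id[, j])
})

filled
```

Looping over column indices rather than over the matrices themselves keeps each period's values paired with that period's group ids, which matters if group membership is reshuffled between periods.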