Awesome, thanks so much!!

Erin Hodgess, PhD
mailto: erinm.hodgess at gmail.com

On Mon, Aug 8, 2022 at 1:38 PM John Fox <jfox at mcmaster.ca> wrote:

> Dear Erin,
>
> The problem is that the data frame gets coerced to a character matrix,
> and the only column with "" entries is the 9th (the second one you
> supplied):
>
> as.matrix(test1.df)
>    X1_1_HZP1 X1_1_HBM1_mon X1_1_HBM1_yr
> 1  "48160"   "December"    "2014"
> 2  "48198"   "June"        "2018"
> 3  "80027"   "August"      "2016"
> 4  "48161"   ""            NA
> 5  NA        ""            NA
> 6  "48911"   "August"      "1985"
> 7  NA        "April"       "2019"
> 8  "48197"   "February"    "1993"
> 9  "48021"   ""            NA
> 10 "11355"   "December"    "1990"
>
> (Here, test1.df only contains the three columns you provided.)
>
> A solution is to use sapply:
>
> > sapply(test1.df, count1a)
>     X1_1_HZP1 X1_1_HBM1_mon  X1_1_HBM1_yr
>             2             3             3
>
> I hope this helps,
> John
>
> On 2022-08-08 1:22 p.m., Erin Hodgess wrote:
> > Hello!
> >
> > I have the following data.frame
> > dput(test1.df[1:10,8:10])
> > structure(list(X1_1_HZP1 = c(48160L, 48198L, 80027L, 48161L,
> > NA, 48911L, NA, 48197L, 48021L, 11355L), X1_1_HBM1_mon = c("December",
> > "June", "August", "", "", "August", "April", "February", "",
> > "December"), X1_1_HBM1_yr = c(2014L, 2018L, 2016L, NA, NA, 1985L,
> > 2019L, 1993L, NA, 1990L)), row.names = c(NA, 10L), class = "data.frame")
> >
> > And the following function:
> > > dput(count1a)
> > function (x)
> > {
> >     if (typeof(x) == "integer")
> >         y <- sum(is.na(x))
> >     if (typeof(x) == "character")
> >         y <- sum(x == "")
> >     return(y)
> > }
> > When I use the apply function with count1a, I get the following:
> > apply(test1.df[1:10,8:10],2,count1a)
> >     X1_1_HZP1 X1_1_HBM1_mon  X1_1_HBM1_yr
> >            NA             3            NA
> > However, when I do use columns 8 and 10, I get the correct response:
> > apply(test1.df[1:10,c(8,10)],2,count1a)
> >    X1_1_HZP1 X1_1_HBM1_yr
> >            2            3
> >
> > I am really baffled. If I use count1a on a single column, it works fine.
> >
> > Any suggestions much appreciated.
> > Thanks,
> > Sincerely,
> > Erin
> >
> > Erin Hodgess, PhD
> > mailto: erinm.hodgess at gmail.com
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> --
> John Fox, Professor Emeritus
> McMaster University
> Hamilton, Ontario, Canada
> web: https://socialsciences.mcmaster.ca/jfox/
>

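
The coercion John Fox describes can be reproduced with the ten rows from the original post. What follows is a minimal sketch in R, assuming only the three posted columns (as in John's reply). The names m, via_apply and via_sapply are illustrative and do not appear in the thread. Because apply() converts the data frame with as.matrix() before looping, the single character column turns the integer columns into character as well, so count1a compares NA to "" and the sum becomes NA.

```r
# Rebuild the three example columns from the dput() output in the post.
test1.df <- data.frame(
  X1_1_HZP1 = c(48160L, 48198L, 80027L, 48161L, NA, 48911L, NA,
                48197L, 48021L, 11355L),
  X1_1_HBM1_mon = c("December", "June", "August", "", "", "August",
                    "April", "February", "", "December"),
  X1_1_HBM1_yr = c(2014L, 2018L, 2016L, NA, NA, 1985L, 2019L, 1993L,
                   NA, 1990L),
  stringsAsFactors = FALSE
)

# count1a exactly as posted.
count1a <- function(x) {
  if (typeof(x) == "integer")
    y <- sum(is.na(x))
  if (typeof(x) == "character")
    y <- sum(x == "")
  return(y)
}

# apply() works on as.matrix(test1.df); a matrix has one type, so the
# character column forces every column to character.
m <- as.matrix(test1.df)
typeof(m)  # "character"

# Column-wise over the coerced matrix: NA == "" is NA, so the sums are NA.
via_apply <- apply(test1.df, 2, count1a)

# Column-wise over the data frame itself: each column keeps its own type.
via_sapply <- sapply(test1.df, count1a)

print(via_apply)   # NA 3 NA, as in the original post
print(via_sapply)  # 2 3 3, as in John's reply
```

Dropping the character column, as in apply(test1.df[1:10,c(8,10)],2,count1a), leaves only integer columns, so as.matrix() returns an integer matrix and the integer branch of count1a runs as intended.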
OK. I'm back again.

So my test1.df is 236x390.

If I put in the following:
> lapply(test1.df,count1a)
Error in FUN(X[[i]], ...) : object 'y' not found
> lapply(test1.df,count1a)
Error in FUN(X[[i]], ...) : object 'y' not found
> sapply(test1.df,count1a)
Error in FUN(X[[i]], ...) : object 'y' not found
>

What am I doing wrong, please?
Thanks,
Erin

Erin Hodgess, PhD
mailto: erinm.hodgess at gmail.com

On Mon, Aug 8, 2022 at 1:41 PM Erin Hodgess <erinm.hodgess at gmail.com> wrote:

> Awesome, thanks so much!!

Nailed it! There were a few "logical" columns in my data.frame.

Thanks for all of the help!

Sincerely,
Erin

Erin Hodgess, PhD
mailto: erinm.hodgess at gmail.com

On Mon, Aug 8, 2022 at 1:51 PM Erin Hodgess <erinm.hodgess at gmail.com> wrote:

> OK. I'm back again.
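
The "object 'y' not found" error follows from how count1a is written: y is assigned only when typeof(x) is "integer" or "character", so a column of any other type (logical or double, for instance) reaches return(y) with no y defined. The sketch below reproduces that and shows one possible variant that assigns a result for every type. The names demo.df, types, err, counts and count_missing are hypothetical and not from the thread, and counting NA in a character column as missing is an assumption that goes beyond the posted function.

```r
# count1a exactly as posted in the thread.
count1a <- function(x) {
  if (typeof(x) == "integer")
    y <- sum(is.na(x))
  if (typeof(x) == "character")
    y <- sum(x == "")
  return(y)
}

# Hypothetical demo data with one column of each common type.
demo.df <- data.frame(
  id    = c(1L, NA, 3L),
  month = c("June", "", NA),
  flag  = c(NA, NA, TRUE),
  score = c(1.5, NA, 2.0),
  stringsAsFactors = FALSE
)

# Listing the storage type of each column shows which ones count1a skips.
types <- sapply(demo.df, typeof)
print(types)  # integer, character, logical, double

# On the logical column neither if-branch runs, so return(y) fails.
err <- tryCatch(count1a(demo.df$flag), error = function(e) e)
print(conditionMessage(err))

# Hypothetical variant: every type gets a result. Character columns count
# both NA and "" (an assumption); all other types count NA only.
count_missing <- function(x) {
  if (is.character(x)) {
    sum(is.na(x) | x == "")
  } else {
    sum(is.na(x))
  }
}

# vapply() checks that each column yields exactly one integer.
counts <- vapply(demo.df, count_missing, integer(1))
print(counts)  # id 1, month 2, flag 2, score 1
```

Running table(sapply(test1.df, typeof)) on the full 236x390 data frame would be one way to see how many columns of each type are present before choosing which cases the function needs to handle.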