Chris Evans
2011-May-29 14:52 UTC
[R] Oddity: I seem to have a variable in a dataframe that doesn't show in colnames() - can anyone advise?
I may be being dopey, I surely am, but I'm baffled by this. I've been working, on and off for a few days, in R version 2.13.0 (2011-04-13) i386-pc-mingw32/i386 (32-bit), working it through ESS.

I've got a dataframe created a couple of days back, during the session:
> dim(AllDat)
[1] 27270    94

I came back this morning and misremembered my variables and thought I had a variable AllDat$PHQ and started using it and everything seemed fine until I realised that I shouldn't have it (!) and that the variable I was thinking of is AllDat$PHQ9 and that's there:
> colnames(AllDat)[grep("PHQ",colnames(AllDat))]
[1] "PHQ9"    "HasPHQ"  "ZeroPHQ"

and, as you can see, there is no AllDat$PHQ. But I can do:

> head(table(AllDat$PHQ))

  0   1   2   3   4   5
731 527 764 845 872 915

Ooops ... so AllDat$PHQ _DOES_ exist. Its contents exactly match AllDat$PHQ9:
> table(abs(AllDat$PHQ - AllDat$PHQ9))

    0
19032

I have searched back through my ESS transcript to the start of the session and I can't see anywhere I've assigned to AllDat$PHQ (and I've never used "attach").

However, I guess that somehow I must have managed to duplicate AllDat in more than one open environment so I check and I have 16 environments (I'm sure that's not right terminology, apologies):
> search()
 [1] ".GlobalEnv"        "package:reshape2"
 [3] "package:Hmisc"     "package:survival"
 [5] "package:splines"   "package:nnet"
 [7] "package:MASS"      "package:gdata"
 [9] "package:stats"     "package:graphics"
[11] "package:grDevices" "package:utils"
[13] "package:datasets"  "package:methods"
[15] "Autoloads"         "package:base"

So I try:
> for (i in 1:16) { print(paste("i =",i,exists("AllDat",i,inherits=FALSE))) }
[1] "i = 1 TRUE"
[1] "i = 2 FALSE"
[1] "i = 3 FALSE"
[1] "i = 4 FALSE"
[1] "i = 5 FALSE"
[1] "i = 6 FALSE"
[1] "i = 7 FALSE"
[1] "i = 8 FALSE"
[1] "i = 9 FALSE"
[1] "i = 10 FALSE"
[1] "i = 11 FALSE"
[1] "i = 12 FALSE"
[1] "i = 13 FALSE"
[1] "i = 14 FALSE"
[1] "i = 15 FALSE"
[1] "i = 16 FALSE"

So I don't think I do have two different AllDat dataframes.
Can anyone throw light on what's going on? I have searched archives etc. but can't think of sensible keywords and so far turned up nothing. Happy to be told RTFM or the equivalent but could someone point me to a specific location? Also happy to try any diagnostics anyone recommends.

Many thanks in advance,

Chris

--
Chris Evans <chris at psyctc.org> Skype: chris-psyctc
Consultant Psychiatrist in Psychotherapy, Notts. PDD network;
Professor, Psychotherapy, Nottingham University
*If I am writing from one of those roles, it will be clear. Otherwise*
*my views are my own and not representative of those institutions*
If you have difficulty Emailing me on this address or getting a reply,
send again but cc to: chris dot evans at nottshc dot nhs dot uk
and to: c dot evans at nottingham dot ac dot uk
Phil Spector
2011-May-29 15:06 UTC
Chris -
   If you check the documentation for the "$" operator, for example by typing

help("$")

you'll find (among a lot of other information):

    name: A literal character string or a name (possibly backtick
          quoted). For extraction, this is normally (see under
          'Environments') partially matched to the 'names' of the
          object.

So when you use the "$" operator (but not "[" or "[["), partial matching is performed. For example:

> x = data.frame(PHQ9=1:10)
> x$PHQ
 [1]  1  2  3  4  5  6  7  8  9 10
> x[,'PHQ']
Error in `[.data.frame`(x, , "PHQ") : undefined columns selected
> x[['PHQ']]
NULL

So if you don't want this "feature", you can use brackets instead of the dollar sign for extraction.

                                        - Phil Spector
                                         Statistical Computing Facility
                                         Department of Statistics
                                         UC Berkeley
                                         spector at stat.berkeley.edu
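For anyone who wants to reproduce the behaviour discussed in this thread, here is a minimal sketch rather than a definitive recipe. The toy data frame `x` and the variable names `partial_hit`, `name_present`, `exact_lookup` and `res` are made up for illustration; only the column names echo the ones in the original post. It shows `$` resolving "PHQ" by partial matching, two ways of doing an exact lookup, and the `warnPartialMatchDollar` option, which asks R to warn when `$` partially matches. The exact wording of that warning may differ between R versions, so the sketch only captures it rather than assuming its text.

```r
# Toy data frame; column names echo those in the thread.
x <- data.frame(PHQ9 = 1:10, HasPHQ = TRUE)

# `$` falls back to partial (prefix) matching when no column is named
# exactly "PHQ". Only "PHQ9" starts with "PHQ" ("HasPHQ" does not),
# so the match is unique and the PHQ9 column is returned.
partial_hit <- identical(x$PHQ, x$PHQ9)

# Exact checks: test the name directly, or use `[[`, which matches
# exactly by default and returns NULL for a missing column.
name_present <- "PHQ" %in% names(x)
exact_lookup <- x[["PHQ"]]

# Ask R to warn whenever `$` resolves a name by partial matching,
# capture the warning message, then restore the previous setting.
old <- options(warnPartialMatchDollar = TRUE)
res <- tryCatch(x$PHQ, warning = function(w) conditionMessage(w))
options(old)

print(partial_hit)
print(name_present)
print(is.null(exact_lookup))
print(res)
```

Checking `"PHQ" %in% names(x)` before using a column is probably the cheapest guard; setting the option at the start of a session is a broader alternative if partial matches should never pass silently.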