Chris Evans
2011-May-29 14:52 UTC
[R] Oddity: I seem to have a variable in a dataframe that doesn't show in colnames() - can anyone advise?
I may be being dopey, I surely am, but I'm baffled by this. I've been working, on and off for a few days, in R version 2.13.0 (2011-04-13) i386-pc-mingw32/i386 (32-bit), running it through ESS.

I've got a dataframe created a couple of days back, during the session:
> dim(AllDat)
[1] 27270    94

I came back this morning, misremembered my variables, thought I had a variable AllDat$PHQ and started using it. Everything seemed fine until I realised that I shouldn't have it (!) and that the variable I was thinking of is AllDat$PHQ9, and that's there:
> colnames(AllDat)[grep("PHQ",colnames(AllDat))]
[1] "PHQ9"    "HasPHQ"  "ZeroPHQ"

and, as you can see, no AllDat$PHQ. But I can do:
> head(table(AllDat$PHQ))
  0   1   2   3   4   5
731 527 764 845 872 915

Ooops ... so AllDat$PHQ _DOES_ exist. Its contents exactly match AllDat$PHQ9:
> table(abs(AllDat$PHQ - AllDat$PHQ9))
    0
19032

I have searched back through my ESS transcript to the start of the session and I can't see anywhere I've assigned to AllDat$PHQ (and I've never used "attach").

However, I guessed that somehow I must have managed to duplicate AllDat in more than one open environment, so I checked and I have 16 environments (I'm sure that's not the right terminology, apologies):
> search()
 [1] ".GlobalEnv"        "package:reshape2"
 [3] "package:Hmisc"     "package:survival"
 [5] "package:splines"   "package:nnet"
 [7] "package:MASS"      "package:gdata"
 [9] "package:stats"     "package:graphics"
[11] "package:grDevices" "package:utils"
[13] "package:datasets"  "package:methods"
[15] "Autoloads"         "package:base"

So I try:
> for (i in 1:16) { print(paste("i =",i,exists("AllDat",i,inherits = FALSE))) }
[1] "i = 1 TRUE"
[1] "i = 2 FALSE"
[1] "i = 3 FALSE"
[1] "i = 4 FALSE"
[1] "i = 5 FALSE"
[1] "i = 6 FALSE"
[1] "i = 7 FALSE"
[1] "i = 8 FALSE"
[1] "i = 9 FALSE"
[1] "i = 10 FALSE"
[1] "i = 11 FALSE"
[1] "i = 12 FALSE"
[1] "i = 13 FALSE"
[1] "i = 14 FALSE"
[1] "i = 15 FALSE"
[1] "i = 16 FALSE"

So I don't think I do have two different AllDat dataframes.
Can anyone throw light on what's going on? I have searched archives etc. but can't think of sensible keywords and so far turned up nothing. Happy to be told RTFM or the equivalent but could someone point me to a specific location? Also happy to try any diagnostics anyone recommends.

Many thanks in advance,

Chris

--
Chris Evans <chris at psyctc.org> Skype: chris-psyctc
Consultant Psychiatrist in Psychotherapy, Notts. PDD network;
Professor, Psychotherapy, Nottingham University
*If I am writing from one of those roles, it will be clear. Otherwise*
*my views are my own and not representative of those institutions*
If you have difficulty Emailing me on this address or getting a reply,
send again but cc to: chris dot evans at nottshc dot nhs dot uk
and to: c dot evans at nottingham dot ac dot uk
Phil Spector
2011-May-29 15:06 UTC
Chris -
If you check the documentation for the "$" operator,
for example by typing
help("$")
you'll find (among a lot of other information):
name: A literal character string or a name (possibly backtick
quoted). For extraction, this is normally (see under
'Environments') partially matched to the 'names' of the
object.
So when you use the "$" operator, partial matching is performed;
"[" and "[[" match names exactly by default. For example:
> x = data.frame(PHQ9=1:10)
> x$PHQ
 [1]  1  2  3  4  5  6  7  8  9 10
> x[,'PHQ']
Error in `[.data.frame`(x, , "PHQ") : undefined columns selected
> x[['PHQ']]
NULL
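It may help to see why only PHQ9 is picked up even though HasPHQ and ZeroPHQ also contain "PHQ": the partial match is on the start of the name, and it only succeeds when exactly one name qualifies. A minimal sketch, assuming a toy data frame d (hypothetical; it reuses the column names from the original post with made-up values):

```r
# Toy data frame reusing three column names from the original post;
# the values are invented purely for illustration.
d <- data.frame(PHQ9    = 1:3,
                HasPHQ  = c(TRUE, TRUE, FALSE),
                ZeroPHQ = c(FALSE, FALSE, TRUE))

# "PHQ" is a prefix of exactly one name, so $ resolves it to PHQ9.
identical(d$PHQ, d$PHQ9)   # TRUE

# pmatch() reports the position of the unique prefix match.
pmatch("PHQ", names(d))    # 1

# Add a second column whose name starts with "PHQ": the prefix is now
# ambiguous, and $ returns NULL rather than guessing.
d$PHQ2 <- 4:6
is.null(d$PHQ)             # TRUE
pmatch("PHQ", names(d))    # NA
```

So a partial name that works today can silently start returning NULL once another similarly named column is added.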
So if you don't want this "feature", you can use brackets instead
of the dollar sign for extraction.
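A short sketch of some ways to guard against this, using the same x as in the example above. The exact argument of "[[" and the warnPartialMatchDollar option are base R features that are not mentioned elsewhere in this thread; treat their availability in a release as old as 2.13.0 as an assumption to check against your own help pages.

```r
x <- data.frame(PHQ9 = 1:10)

# [[ ]] matches names exactly by default, so a wrong name gives NULL.
is.null(x[["PHQ"]])                  # TRUE

# Partial matching with [[ ]] has to be requested explicitly.
identical(x[["PHQ", exact = FALSE]], x$PHQ9)   # TRUE

# Check for the exact name before relying on a column.
"PHQ" %in% names(x)                  # FALSE

# Ask R to warn whenever $ falls back on partial matching.
# tryCatch() turns the warning into its message text so it can be inspected.
old <- options(warnPartialMatchDollar = TRUE)
res <- tryCatch(x$PHQ, warning = function(w) conditionMessage(w))
options(old)                         # restore the previous setting
res
```

Setting the option once at the top of a session (or in a startup file) leaves existing code working while flagging every place that depends on a partial name.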
- Phil Spector
Statistical Computing Facility
Department of Statistics
UC Berkeley
spector at stat.berkeley.edu