I was unaware until recently that partial matching is used to index data frames and lists. This is now causing a great many problems in my code, as I sometimes index a list without knowing what elements it contains, expecting NULL if the column does not exist. However, because partial matching is used, R will sometimes return an object I do not want. My question: is there an easy way of getting around this?

For example:

> a <- NULL
> a$abc <- 5
> a$a
[1] 5
> a$a <- a$a
> a
$abc
[1] 5

$a
[1] 5

Certainly from a coding perspective, one might expect that assigning a$a to itself wouldn't do anything, since either 1) a$a doesn't exist, so nothing happens, or 2) a$a does exist, so it just assigns its value to itself. However, in the above case, it creates an entirely new column because I happen to have another column called a$abc. I do not want this behavior.

The solution I came up with was to create another indexing function that uses subset() (which doesn't partial match), then check for an error, and if there is an error substitute NULL (to mimic the "[" behavior). However, I don't really want to start using another indexing function altogether just to get around this behavior. Is there a better way? Can I turn off partial matching?

Thanks,
Robert

Robert McGehee
Geode Capital Management, LLC
53 State Street, 5th Floor | Boston, MA | 02109
Tel: 617/392-8396 Fax: 617/476-6389
mailto:robert.mcgehee at geodecapital.com

This e-mail, and any attachments hereto, are intended for us...{{dropped}}
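The asymmetry behind the example above can be reproduced in a few lines. This is only a sketch of the behavior as I understand it: extraction with `$` partially matches names, while assignment with `$<-` uses the exact name, so the self-assignment reads from abc and writes to a new element a. The variable `value` is introduced here purely for illustration.

```r
# `$` extraction partially matches names; `$<-` assignment uses the exact name.
a <- list(abc = 5)

value <- a$a   # no element is named "a", so this partially matches "abc"
a$a <- value   # assignment is exact, so a new element "a" is created

names(a)       # "abc" "a"
```

Once the element a exists, a$a matches it exactly and abc is no longer picked up by that expression.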
This has been discussed a few times on this list before, so you might want to dig into the archive...

You might want to check for the existence of the name instead of checking whether the component is NULL:

> x <- list(bc="bc", ab="ab")
> is.null(x$b)
[1] FALSE
> "b" %in% names(x)
[1] FALSE

Andy
______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html
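The name check suggested above can be wrapped in a small accessor. This is a minimal sketch, not something from base R: the function name `get_exact` is hypothetical, and it assumes the desired behavior is to return NULL whenever the exact name is absent.

```r
# Hypothetical helper (not part of base R): return the element whose name
# matches `name` exactly, or NULL when no element has that name.
get_exact <- function(x, name) {
  # Once the name is known to exist, the exact match takes precedence
  # over any partial match, so `[[` is safe here.
  if (name %in% names(x)) x[[name]] else NULL
}

x <- list(bc = "bc", ab = "ab")
get_exact(x, "b")    # NULL, whereas x$b partially matches "bc"
get_exact(x, "ab")   # "ab"
```

The check costs one lookup in names(x) per access, which seems a reasonable price for lists whose contents are not known in advance.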
This came up a few months ago. Check the thread on hashing and partial matching around Nov 18.

The short answer is no, you can't turn it off, because lots of code relies on that behavior.

Reid Huntsinger

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of McGehee, Robert
Sent: Thursday, January 27, 2005 9:34 AM
To: r-help at stat.math.ethz.ch
Subject: [R] Indexing Lists and Partial Matching
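For readers on R releases later than the one discussed in this thread: as far as I know there is still no switch that disables partial matching for `$`, but the `warnPartialMatchDollar` option can flag it, and `[[` matches names exactly unless `exact = FALSE` is passed. The following is a sketch under the assumption of a reasonably recent R; the variables `old` and `res` are illustrative.

```r
a <- list(abc = 5)

old <- options(warnPartialMatchDollar = TRUE)
# With the option set, partial matching through `$` signals a warning,
# which is caught here and turned into a marker string.
res <- tryCatch(a$a, warning = function(w) "partial match warned")
options(old)               # restore the previous setting

a[["a"]]                   # NULL: `[[` matches names exactly by default
a[["a", exact = FALSE]]    # 5: partial matching requested explicitly
```

Turning the warning on during development is one way to find the places where code silently depends on partial matching.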
Thank you both for the references. I had missed the previous discussions before posting.

I am surprised to hear that there is code that relies on this indexing behavior, especially if it is in the base package. I'm not sure how a function could even make use of this feature without first asking R what the names of the list or data frame are, and then intentionally shortening them to something else. It even seems reasonable that if code _does_ rely on this behavior, then it may be subject to other problems anyway, such as the wrong data being returned unintentionally (when NULL or an error should be returned instead). (Although I freely acknowledge my ignorance of the uses of this feature, as I only recently discovered it.)

From the previous posts, it seems the only way in R to code around this is to _always_ check the names of a list before indexing, as anything else could lead to very subtle errors in complex code, unless one can guarantee a priori that the list names are always distinguishable.

Perhaps one easy way to optionally remove this feature without breaking anything would be to have an option/flag in the description or namespace of a package indicating that list-indexing partial matching should not be used for any function within that package. But that might be a bit hackish.

However, for my personal code, the a[[match("abc", names(a))]] construct (from one of the Nov 18th posts) is easy enough to use, so I have no intention of rehashing an already well-discussed topic.

Thanks,
Robert

PS. None of this applies to partial matching of function arguments, as this is certainly widely used.

-----Original Message-----
From: Huntsinger, Reid [mailto:reid_huntsinger at merck.com]
Sent: Thursday, January 27, 2005 10:15 AM
To: 'McGehee, Robert'; r-help at stat.math.ethz.ch
Subject: RE: [R] Indexing Lists and Partial Matching
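The a[[match("abc", names(a))]] construct mentioned above can be tried on a small list. This is a sketch only: the element abd and the variable `idx` are made up for illustration, and checking the index for NA before using it is my own suggestion rather than something stated in the thread.

```r
a <- list(abc = 5, abd = 6)

idx <- match("abc", names(a))   # position of the exact name, or NA if absent
a[[idx]]                        # 5

# "ab" is only a prefix of the existing names, so match() finds nothing
match("ab", names(a))           # NA
```

Because match() compares whole strings, a prefix such as "ab" never selects an element, which is the behavior the original question was after.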