I had trouble with some tests for the survival suite last night that I cannot explain.

Framework: Ubuntu Linux, R 2.11.

For testing survival I have a separate directory and Makefile. I pull everything into the local .RData: no packages, library, or namespace. (It's easier to add test modifications to a routine in a chain of calls.)

A test of survreg + psline would fail because splines is not automatically loaded in this scenario. In my console window I type

    > library(splines)
    > fit <- survreg(Surv(time, status) ~ ph.ecog + spline(age), data=lung)

as the first 2 lines of a new session (the lung data is also saved), and up would pop an error message about having a <2.10 version of survival. And it is true that I had such a thing in my ~/RLib. But there is no mention of either the local or the system survival library when I type search(), and .Autoloaded is NULL.

I'm mystified as to how R decided to use that library, and what it was trying to pull from it. I removed it from ~/Rlib and the message went away, but now I'm not so sure that this was a good idea. Is something now being automatically snatched out of the standard survival library? If so, that compromises the integrity of my testbed. How do I find out?

Terry Therneau
On 09/07/2010 10:55 AM, Therneau, Terry M., Ph.D. wrote:
> I had trouble with some tests for the survival suite last night that I
> cannot explain.
>
> Framework: Ubuntu Linux, R2.11.
> For testing survival I have a separate directory and Makefile. I
> pull everything into the local .RData, no packages, library, or
> namespace. (It's easier to add test modifications to a routine in a
> chain of calls).
>
> A test of survreg + psline would fail because splines is not
> automatically loaded in this scenario. In my console window type
> > library(splines)
> > fit <- survreg(Surv(time, status) ~ ph.ecog + spline(age),
> data=lung)
> as the first 2 lines of a new session (the lung data is also saved)
> and up would pop an error message about having <2.10 version of
> survival. And it is true that I had such a thing in my ~/RLib. But
> there is no mention of either the local or the system survival library
> when I type search(), and .Autoloaded is NULL.
>

Could you show us the result of .libPaths(), search(), searchpaths() and sessionInfo()?

Duncan Murdoch

> I'm mystified as to how R decided to use that library, and what it was
> trying to pull from it. I removed it from ~/Rlib and the message went
> away, but now I'm not so sure that this was a good idea. Is something
> now being automatically snatched out of the standard survival library?
> If so that compromises the integrity of my testbed. How do I find out?
>
> Terry Therneau
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
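The checks requested above can be run in one go. What follows is a minimal sketch, not a diagnosis of the problem: `where_defined()` and `ns_path()` are hypothetical helpers written for this note, not base R functions. The one fact it relies on is that a namespace can be loaded without being attached, in which case it appears in loadedNamespaces() but not in search().

```r
# Library trees R will search for packages, in order.
print(.libPaths())

# What is attached, by name and by the path each entry came from.
print(search())
print(searchpaths())

# Namespaces that are loaded, whether or not they are attached.
print(loadedNamespaces())

# Hypothetical helper: name of the environment a function was defined in.
# A copy restored from a local .RData would typically report "R_GlobalEnv",
# a copy from an installed package reports the package name.
where_defined <- function(fname) {
  f <- get(fname, mode = "function")
  env <- environment(f)              # NULL for primitives such as sum()
  if (is.null(env)) "primitive" else environmentName(env)
}

# Hypothetical helper: directory a loaded namespace was loaded from,
# or NA if that namespace is not loaded in this session.
ns_path <- function(pkg) {
  if (pkg %in% loadedNamespaces()) getNamespaceInfo(pkg, "path") else NA_character_
}

where_defined("paste")   # "base"
ns_path("utils")         # path of the utils package in the system library
```

In the session described above, ns_path("survival") should show which library tree, if any, a loaded survival namespace came from, and where_defined("survreg") should show whether the survreg being called is the testbed copy.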
Dear R-devel,

I appear to see differences in the behavior of unz between Windows and Linux.

    url0514 <- "http://data.dft.gov.uk.s3.amazonaws.com/road-accidents-safety-data/Stats19_Data_2005-2014.zip"
    file0514 <- c("Vehicles0514.csv","Casualties0514.csv","Accidents0514.csv")

    temp <- tempfile()
    download.file(url0514,temp)
    a0514 <<- read.csv(unz(temp, file0514[3]))
    c0514 <<- read.csv(unz(temp, file0514[2]))
    v0514 <<- read.csv(unz(temp, file0514[1]))

Under Windows, I noticed that there are variables ï..Accident_Index in objects [a|c|v]0514, but this is not the case if the zip file contains only one file, i.e.,

    file2015 <- c("Vehicles_2015.csv","Casualties_2015.csv","Accidents_2015.csv")
    url2015 <- "http://data.dft.gov.uk/road-accidents-safety-data/RoadSafetyData_2015.zip"
    download.file(url2015,temp)
    v2015 <<- read.csv(unz(temp, file2015[1]))
    c2015 <<- read.csv(unz(temp, file2015[2]))
    a2015 <<- read.csv(unz(temp, file2015[3]))

so to combine [a|c|v]0514 and [a|c|v]2015, I need to add something like

    names(a0514)[names(a0514)=="ï..Accident_Index"] <- "Accident_Index"
    names(c0514)[names(c0514)=="ï..Accident_Index"] <- "Accident_Index"
    names(v0514)[names(v0514)=="ï..Accident_Index"] <- "Accident_Index"

This is unnecessary under Linux (RHEL), since there the Accident_Index columns have no ï.. prefix.

Am I missing anything here?

Many thanks,

Jing Hua Zhao
The 2015 zip file does contain three files.

________________________________
From: jing hua zhao <jinghuazhao at hotmail.com>
Sent: 10 February 2017 00:13
To: r-devel at r-project.org
Subject: issue with unz()?
If you use check.names=FALSE in your call to read.csv you can see that the first column name starts with the 3 bytes ef bb bf, which is the UTF-8 "byte-order mark" that Microsoft applications like to put at the start of a text file stored in UTF-8.

    > v0514 <- read.csv(unz(temp, file0514[1]), stringsAsFactors=FALSE, check.names=FALSE)
    > names(v0514)[1]
    [1] "???Accident_Index"
    > charToRaw(names(v0514)[1])
    [1] ef bb bf 41 63 63 69 64 65 6e 74 5f 49 6e 64 65 78

I thought that adding fileEncoding="UTF-8-BOM" or perhaps encoding="UTF-8-BOM" would take care of the issue, but it does not do it for me. You can remove them by hand with substring():

    > substring(names(v0514)[1],4)
    [1] "Accident_Index"

Bill Dunlap
TIBCO Software
wdunlap tibco.com
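The by-hand fix can be wrapped so that it only touches names that really start with the bytes ef bb bf. This is a minimal sketch, assuming check.names=FALSE was used so that the raw bytes are still in the column name. `has_utf8_bom()` and `strip_bom()` are hypothetical helpers, not base R functions, and the small CSV is made-up demo data, not the DfT file.

```r
# The UTF-8 byte-order mark as three raw bytes.
bom <- as.raw(c(0xef, 0xbb, 0xbf))

# Hypothetical helper: TRUE if the next three bytes read from a
# binary-mode connection are the UTF-8 BOM.
has_utf8_bom <- function(con) {
  identical(readBin(con, what = "raw", n = 3L), bom)
}

# Hypothetical helper: drop a leading BOM from each string, byte-wise,
# leaving strings without a BOM untouched.
strip_bom <- function(x) {
  vapply(x, function(s) {
    b <- charToRaw(s)
    if (length(b) >= 3L && identical(b[1:3], bom)) {
      out <- rawToChar(b[-(1:3)])
      Encoding(out) <- Encoding(s)   # keep the original encoding mark
      out
    } else {
      s
    }
  }, character(1), USE.NAMES = FALSE)
}

# Demo data (an assumption for illustration): a tiny CSV written with a BOM.
path <- tempfile(fileext = ".csv")
writeBin(c(bom, charToRaw("Accident_Index,Vehicle_Reference\nA1,1\n")), path)

# Check the first three bytes of the file.
con <- file(path, open = "rb")
file_has_bom <- has_utf8_bom(con)
close(con)

# Read without name mangling, then clean the names.
dat <- read.csv(path, check.names = FALSE, stringsAsFactors = FALSE)
names(dat) <- strip_bom(names(dat))
```

Whether read.csv() has already dropped the BOM can depend on the session's locale, so strip_bom() is written to be a no-op when there is nothing to strip. For a member of a zip file, a connection such as unz(temp, file0514[1], open = "rb") should in principle be usable with has_utf8_bom() in the same way, though that was not tried here.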