Hi,

I have the following function:

getDataFromDVFileCustom <- function (file, hasHeader = TRUE, separator = "\t")
{
  DVdatatmp <- as.matrix(read.table(file, sep = "\t", fill = TRUE,
    comment.char = "#", as.is = TRUE, stringsAsFactors = FALSE,
    na.strings = "NA"))
  DVdatatmper <- as.matrix(DVdatatmp[ , c("datetime",
    grep("^_00060_00003", colnames(DVdatatmp)))])
  retval <- as.data.frame(DVdatatmper, colClasses = c("character"),
    fill = TRUE, comment.char = "#", stringsAsFactors = FALSE)
  if (ncol(retval) == 2) {
    names(retval) <- c("dateTime", "value")
  } else if (ncol(retval) == 3) {
    names(retval) <- c("dateTime", "value", "code")
  }
  if (dateFormatCheck(retval$dateTime)) {
    retval$dateTime <- as.Date(retval$dateTime)
  } else {
    retval$dateTime <- as.Date(retval$dateTime, format = "%m/%d/%Y")
  }
  retval$value <- as.numeric(retval$value)
  return(retval)
}

The function gives me this error:

getDataFromDVFileCustom(file)
Error in as.matrix(DVdatatmp[, c("datetime", grep("^_00060_00003", colnames(DVdatatmp)))]) :
  subscript out of bounds

I am trying to select only 3 columns (datetime and then two partial name columns that end in 00060_00003 and 00060_00003_cd). Each file that I will be reading into the function has a different number of columns and a different prefix in front of 00060_00003 and 00060_00003_cd. I have searched online and tried those possible solutions, but they did not work for my function and data.

What is the best way to select those 3 columns only?

Thank-you.

Irucka Embry
Hi,

Maybe this is creating the problem:

set.seed(15)
dat1 <- data.frame(A_00060_00003 = sample(1:10, 5, replace = TRUE),
                   B_00060_00003_cd = sample(20:30, 5, replace = TRUE),
                   C_00060_00003 = sample(1:15, 5, replace = TRUE),
                   D_00060 = sample(1:8, 5, replace = TRUE),
                   datetime = as.POSIXct(paste(rep("6/3/2011", 5),
                     c("0:00", "0:30", "0:35", "0:40", "0:45")),
                     format = "%m/%d/%Y %H:%M"))
dat1[, c("datetime", grep("00060_00003", colnames(dat1)))]
#Error in `[.data.frame`(dat1, , c("datetime", grep("00060_00003", colnames(dat1)))) :
#  undefined columns selected
dat1[, c("datetime", colnames(dat1)[grep("00060_00003", colnames(dat1))])]
#             datetime A_00060_00003 B_00060_00003_cd C_00060_00003
#1 2011-06-03 00:00:00             7               30             2
#2 2011-06-03 00:30:00             2               28            10
#3 2011-06-03 00:35:00            10               22             8
#4 2011-06-03 00:40:00             7               27            11
#5 2011-06-03 00:45:00             4               29            13

A.K.
----- Original Message -----
From: Irucka Embry <iruckaE at mail2world.com>
To: r-help at r-project.org
Cc:
Sent: Wednesday, January 9, 2013 5:44 AM
Subject: [R] select partial name and full name columns

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
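A note on why the first attempt fails: by default grep() returns integer positions, and c() coerces them to character when they are combined with "datetime", so the subscript ends up holding strings such as "2" and "3" that are not column names. The sketch below is a minimal illustration on a made-up data frame (df and its column names are assumptions, not the poster's data). It also shows that a pattern anchored with ^ cannot match a name that carries a prefix, and that value = TRUE is one way to get names back from grep().

```r
# Toy data frame; the column names and values are invented for illustration.
df <- data.frame(
  datetime         = c("6/3/2011", "6/4/2011"),
  A_00060_00003    = c(7, 2),
  A_00060_00003_cd = c("A", "P"),
  B_00010_00001    = c(1.5, 1.7),
  stringsAsFactors = FALSE
)

# grep() returns integer positions by default; c() coerces them to
# character when mixed with "datetime", giving strings that are not names.
mixed <- c("datetime", grep("00060_00003", colnames(df)))
print(mixed)

# "^" anchors the pattern at the start of the name, so a name with any
# prefix in front of "_00060_00003" is never matched.
anchored <- grep("^_00060_00003", colnames(df))
print(length(anchored))

# value = TRUE makes grep() return the matching names themselves.
keep <- c("datetime", grep("00060_00003", colnames(df), value = TRUE))
subset_df <- df[, keep]
print(colnames(subset_df))
```

This is equivalent to the colnames(dat1)[grep(...)] form used in the reply; value = TRUE only saves the second indexing step.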
Hi Arun, thank-you for your suggestion.

I made a mistake previously when I suggested that there was a "prefix" in front of "00060_00003", possibly suggesting that it was a string of characters rather than numbers. The "prefix" in front of "00060_00003" is actually two numbers; see the examples below:

01_00060_00003
01_00060_00003_cd
15_00060_00003
15_00060_00003_cd
02_00060_00003
02_00060_00003_cd

How can the following code be modified to reflect the numerical rather than character prefix?

dat1[,c("datetime",colnames(dat1)[grep("00060_00003",colnames(dat1))])]

Thank-you.

Irucka Embry

<-----Original Message----->
>From: arun [smartpink111@yahoo.com]
>Sent: 1/9/2013 7:13:05 AM
>To: iruckaE@mail2world.com
>Cc: r-help@r-project.org
>Subject: Re: [R] select partial name and full name columns
Hi,

You can use the same code:

set.seed(15)
dat1 <- data.frame(sample(1:10, 5, replace = TRUE),
                   sample(20:30, 5, replace = TRUE),
                   sample(1:15, 5, replace = TRUE),
                   sample(1:8, 5, replace = TRUE),
                   datetime = as.POSIXct(paste(rep("6/3/2011", 5),
                     c("0:00", "0:30", "0:35", "0:40", "0:45")),
                     format = "%m/%d/%Y %H:%M"))
colnames(dat1)[1:4] <- c("01_00060_00003", "01_000060_00003_cd", "15_000060_00003", "15_00060")
dat1
#  01_00060_00003 01_000060_00003_cd 15_000060_00003 15_00060
#1              7                 30               2        7
#2              2                 28              10        4
#3             10                 22               8        8
#4              7                 27              11        2
#5              4                 29              13        7
#             datetime
#1 2011-06-03 00:00:00
#2 2011-06-03 00:30:00
#3 2011-06-03 00:35:00
#4 2011-06-03 00:40:00
#5 2011-06-03 00:45:00

dat1[, c("datetime", colnames(dat1)[grep("00060_00003", colnames(dat1))])]
#             datetime 01_00060_00003 01_000060_00003_cd 15_000060_00003
#1 2011-06-03 00:00:00              7                 30               2
#2 2011-06-03 00:30:00              2                 28              10
#3 2011-06-03 00:35:00             10                 22               8
#4 2011-06-03 00:40:00              7                 27              11
#5 2011-06-03 00:45:00              4                 29              13

A.K.

________________________________
From: Irucka Embry <iruckaE at mail2world.com>
To: smartpink111 at yahoo.com
Cc: r-help at r-project.org
Sent: Wednesday, January 9, 2013 11:36 AM
Subject: Re: [R] select partial name and full name columns
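If the loose pattern ever picks up unwanted columns, one option is to spell out the two-digit prefix and anchor both ends of the name. A minimal sketch, assuming the names look like the examples listed in the thread (the vector nms and the "_max" name are invented for illustration):

```r
# Hypothetical column names modelled on the examples in the thread.
nms <- c("datetime", "01_00060_00003", "01_00060_00003_cd",
         "15_00060", "02_00010_00003", "01_00060_00003_max")

# Unanchored pattern: matches any name containing the fragment,
# including the invented "_max" column.
loose <- grep("00060_00003", nms, value = TRUE)
print(loose)

# Anchored pattern: exactly two digits, an underscore, the parameter and
# statistic codes, and an optional "_cd" suffix, with nothing before or after.
pattern <- "^[0-9]{2}_00060_00003(_cd)?$"
keep <- c("datetime", grep(pattern, nms, value = TRUE))
print(keep)
```

Note that read.table() with its default check.names = TRUE prepends "X" to names that start with a digit, so the anchored pattern assumes the file was read with check.names = FALSE. The unanchored pattern works in either case.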
Hi Arun, thanks again for your assistance.

Previously I did not read the files with the headers, so I could not search for those prefixed names. I corrected my mistake and the code that you suggested does work.

Irucka

<-----Original Message----->
>From: arun [smartpink111@yahoo.com]
>Sent: 1/9/2013 11:09:13 AM
>To: iruckaE@mail2world.com
>Cc: r-help@r-project.org
>Subject: Re: [R] select partial name and full name columns
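The fix reported above, reading the header row, can be sketched as follows. Without header = TRUE, read.table() names the columns V1, V2, and so on, so no pattern based on the real names can match. The sample lines, the agency_cd column, and the values are invented for illustration; check.names = FALSE is an optional choice here that keeps names starting with a digit unchanged.

```r
# A tiny tab-separated sample; all contents are made up for illustration.
lines <- c(
  "# comment line, skipped by comment.char",
  "agency_cd\tdatetime\t01_00060_00003\t01_00060_00003_cd",
  "USGS\t2011-06-03\t7\tA",
  "USGS\t2011-06-04\t2\tA"
)

# Without header = TRUE the first row is treated as data and the
# columns get the default names V1, V2, ...
no_header <- read.table(text = lines, sep = "\t", comment.char = "#",
                        as.is = TRUE)
print(colnames(no_header))

# With header = TRUE the real names are available for matching.
with_header <- read.table(text = lines, sep = "\t", header = TRUE,
                          comment.char = "#", as.is = TRUE,
                          check.names = FALSE)
keep <- c("datetime",
          grep("_00060_00003(_cd)?$", colnames(with_header), value = TRUE))
result <- with_header[, keep]
print(result)
```

Keeping the result as a data frame, rather than passing it through as.matrix(), also preserves the column types, so the value column stays numeric.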