Hi,

Thanks for a great environment for statistical computing :-)

I have some input data in a file ("input_kvpairs.csv") of the form

key1=23, key2=67, key3="hello there"
key1=7, key2=22, key3="how are you"
key1=2, key2=77, key3="nice day, thanks"

Now in my head I wish it was of the form ("input.csv")

#key1, key2, key3
23, 67, "hello there"
7, 22, "how are you"
2, 77, "nice day, thanks"

so I could do

data <- read.csv("input.csv", header=TRUE)

where the header column names are derived from the key names dynamically, and I could access the data using the normal data$key1 or data$key2 mechanism.

I guess I could just preprocess the file first using Python etc. to create a CSV file with column headers derived from the key names and values derived from the key values, but I am interested to see how experienced R folks would handle this inside R.

Thanks,

Frank
Maybe you can use ',=' as separators. (I don't have R to check.) Otherwise, I would clean the file with an editor or a tool like 'sed', replacing the regular expression /key[0-9]=/ with nothing.

On Jan 18, 2013 8:05 AM, "Frank Singleton" <b17flyboy@gmail.com> wrote:
> [...]
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
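For anyone who wants to try the 'sed' idea without leaving R, here is a minimal sketch of the same substitution done with gsub(). It is only an illustration under some assumptions: the sample lines are typed in rather than read from "input_kvpairs.csv", the names `lines`, `csv_text` and `dat` are made up for the example, and the pattern is widened slightly to `\\s*key[0-9]+=` so that the blank after each comma is removed too and every quoted string starts right at its field boundary.

```r
# Sample lines standing in for readLines("input_kvpairs.csv")
lines <- c('key1=23, key2=67, key3="hello there"',
           'key1=7, key2=22, key3="how are you"',
           'key1=2, key2=77, key3="nice day, thanks"')

# Drop every "keyN=" prefix (and any blank in front of it),
# which leaves ordinary CSV text such as: 23,67,"hello there"
csv_text <- gsub("\\s*key[0-9]+=", "", lines)

# The quoted field keeps its embedded comma; columns get the
# default names V1, V2, V3 because there is no header line
dat <- read.csv(text = csv_text, header = FALSE, stringsAsFactors = FALSE)
print(dat)
```

This only strips the keys; it does not yet turn them into column names, and it would also alter any quoted text that happens to contain something like "key1=".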
Hi

One option is regular expressions, but you can also read the data with "=" as the separator.

test <- read.table("input_kvpairs.csv", sep="=", header=FALSE, stringsAsFactors=FALSE)

# use this function to split and extract the numeric parts
extract <- function(x) as.numeric(sapply(strsplit(x, ","), "[", 1))

# and apply the function to the appropriate columns
test1 <- data.frame(sapply(test[,2:3], extract), test[,4])

Now you can assign names to the resulting data frame, see ?names

Regards
Petr

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Frank Singleton
> Sent: Friday, January 18, 2013 5:21 AM
> To: r-help at r-project.org
> Subject: [R] reading multiple key=value pairs per line
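The naming step can also be done from the data itself, since splitting on "=" leaves each key name sitting in front of the value that follows it. The sketch below is one possible way to do that, not necessarily what was intended above; it reads the sample lines through the text= argument instead of the file, and `lines`, `first_row` and `keys` are names invented for the example. It assumes every line carries the same keys in the same order.

```r
lines <- c('key1=23, key2=67, key3="hello there"',
           'key1=7, key2=22, key3="how are you"',
           'key1=2, key2=77, key3="nice day, thanks"')

# Split on "=": V1 is "key1", V2 is "23, key2", V3 is "67, key3", V4 is the text
test <- read.table(text = lines, sep = "=", header = FALSE,
                   stringsAsFactors = FALSE)

# Numeric part is whatever comes before the first comma
extract <- function(x) as.numeric(sapply(strsplit(x, ","), "[", 1))
test1 <- data.frame(sapply(test[, 2:3], extract), test[, 4],
                    stringsAsFactors = FALSE)

# Key names: take the first row and keep what follows the last comma
# ("key1" has no comma and is left unchanged)
first_row <- unlist(test[1, 1:3])
keys <- unname(trimws(sub("^.*,", "", first_row)))
names(test1) <- keys

print(test1$key1)
```

After this, test1$key1 and test1$key2 are numeric and test1$key3 holds the text, including the comma in the third row.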
Hi,

Maybe this helps:

Lines1 <- readLines(textConnection('key1=23, key2=67, key3="hello there"
key1=7, key2=22, key3="how are you"
key1=2, key2=77, key3="nice day, thanks"'))
res <- read.table(text=gsub("key{0,1}\\d","",gsub("[\",]","",Lines1)),sep="=",header=FALSE,stringsAsFactors=FALSE)[-1]
# take the "key" prefix from the first line only, so this works for any number of lines
names(res) <- paste(substr(Lines1[1],1,3),1:3,sep="")
res
#  key1 key2            key3
#1   23   67     hello there
#2    7   22     how are you
#3    2   77 nice day thanks

A.K.

----- Original Message -----
From: Frank Singleton <b17flyboy at gmail.com>
To: r-help at r-project.org
Sent: Thursday, January 17, 2013 11:21 PM
Subject: [R] reading multiple key=value pairs per line
Hi,

Sorry, there was a mistake. I didn't notice the comma in key3.

Lines1 <- readLines(textConnection('key1=23, key2=67, key3="hello there"
key1=7, key2=22, key3="how are you"
key1=2, key2=77, key3="nice day, thanks"'))
res1 <- read.table(text=gsub("key{0,1}\\d","",gsub("[\"]","",Lines1)),sep="=",header=FALSE,stringsAsFactors=FALSE)[-1]
# the first two columns still end in ", "; strip that and convert to numeric
res1[,1:2] <- sapply(res1[,1:2],function(x) as.numeric(gsub("\\, $","",x)))
# take the "key" prefix from the first line only, so this works for any number of lines
names(res1) <- paste(substr(Lines1[1],1,3),1:3,sep="")
res1
#  key1 key2             key3
#1   23   67      hello there
#2    7   22      how are you
#3    2   77 nice day, thanks
str(res1)
#'data.frame':    3 obs. of  3 variables:
# $ key1: num  23 7 2
# $ key2: num  67 22 77
# $ key3: chr  "hello there" "how are you" "nice day, thanks"

A.K.
You could use the strapply function from the gsubfn package to extract the data from the strings. This will return a list that you could use with do.call(rbind, ...).

The stringr package may have something similar or an alternative (but I am less familiar with that package).

--
Gregory (Greg) L. Snow Ph.D.
538280@gmail.com
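The "extract into a list, then do.call(rbind, ...)" idea can be tried in base R as well. The sketch below swaps in regmatches() and gregexpr() for strapply, so it is an analogue of the suggestion rather than the gsubfn code itself. The pattern, the helper `parse_line` and the other variable names are assumptions made for illustration, and the code assumes every line has the same keys in the same order, because rbind() lines the values up by position.

```r
lines <- c('key1=23, key2=67, key3="hello there"',
           'key1=7, key2=22, key3="how are you"',
           'key1=2, key2=77, key3="nice day, thanks"')

# One match per key=value pair; a value is either a double-quoted
# string (which may contain commas) or a run of non-comma characters
pattern <- '[A-Za-z0-9_]+=("[^"]*"|[^,]+)'
pairs <- regmatches(lines, gregexpr(pattern, lines, perl = TRUE))

# Turn c('key1=23', ...) into a character vector of values named by key
parse_line <- function(p) {
  keys <- sub("=.*$", "", p)                        # text before the first "="
  vals <- gsub('^"|"$', "", sub("^[^=]*=", "", p))  # text after it, unquoted
  setNames(vals, keys)
}

rows <- lapply(pairs, parse_line)
mat <- do.call(rbind, rows)   # character matrix, columns named by the keys
dat <- as.data.frame(mat, stringsAsFactors = FALSE)

# Let R pick a type for each column (numbers become numeric, text stays text)
dat[] <- lapply(dat, type.convert, as.is = TRUE)
str(dat)
```

Here the column names come straight from the keys in the data, so nothing about "key1" to "key3" is hard-coded, and dat$key1 or dat$key2 can be used in the usual way.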