Hi,
Thanks for a great environment for statistical computing :-)
I have some input data in a file ("input_kvpairs.csv") of the form
key1=23, key2=67, key3="hello there"
key1=7, key2=22, key3="how are you"
key1=2, key2=77, key3="nice day, thanks"
Now in my head I wish it was of the form ("input.csv")
#key1, key2, key3
23, 67, "hello there"
7, 22, "how are you"
2, 77, "nice day, thanks"
so I could do
data <- read.csv("input.csv", header=TRUE)
where the header column names are derived from the key names dynamically,
and I could access the data using the normal data$key1 or data$key2 mechanism.
I guess I could just preprocess the file first using Python etc. to create
a CSV file with a column header derived from the key names and values derived
from the key values, but I am interested to see how experienced R folks would
handle this inside R.
Thanks,
Frank
Maybe you can use ',=' as separators (I don't have R to check). Otherwise, I would clean the file with an editor or a tool like 'sed', replacing the regular expression /key[0-9]=/ with nothing.
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
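The key-stripping idea above can also be tried without leaving R. What follows is only a sketch under some assumptions: the sample lines stand in for readLines("input_kvpairs.csv"), every line carries the same keys in the same order, and no quoted value contains text that looks like ", word=". The variable names (kv_lines, keys, csv_lines, dat) are made up for illustration.

```r
# Sample lines standing in for readLines("input_kvpairs.csv")
kv_lines <- c('key1=23, key2=67, key3="hello there"',
              'key1=7, key2=22, key3="how are you"',
              'key1=2, key2=77, key3="nice day, thanks"')

# Key names from the first line: each word immediately followed by "="
keys <- regmatches(kv_lines[1],
                   gregexpr("\\w+(?==)", kv_lines[1], perl = TRUE))[[1]]

# Remove the leading "key=" and then every ", key=" (keeping the comma),
# which leaves ordinary CSV rows such as: 23,67,"hello there"
csv_lines <- sub("^\\w+=", "", kv_lines)
csv_lines <- gsub(",\\s*\\w+=", ",", csv_lines)

# Read the cleaned rows, using the extracted keys as column names
dat <- read.csv(text = csv_lines, header = FALSE, col.names = keys,
                stringsAsFactors = FALSE)
```

After this, dat$key1 and dat$key2 should behave like ordinary numeric columns, and the comma inside the quoted key3 value is kept because read.csv honours the double quotes.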
Hi
One option is regular expressions, but you can also read the data with "="
as the separator.
test <- read.table("input_kvpairs.csv", sep = "=",
                   header = FALSE, stringsAsFactors = FALSE)
# use this function to split on "," and extract the numeric part
extract <- function(x)
  as.numeric(sapply(strsplit(x, ","), "[", 1))
# and apply the function to the appropriate columns
test1 <- data.frame(sapply(test[, 2:3], extract), test[, 4])
Now you can assign names to the resulting data frame; see
?names
Regards
Petr
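For the naming step, one possibility is to recover the key names from the same split columns. This is a sketch only, assuming the first line is representative of all lines; it reads from an in-memory sample (kv_lines, a made-up name) instead of the file, and the keys vector is introduced here purely for illustration.

```r
# Sample lines standing in for the contents of "input_kvpairs.csv"
kv_lines <- c('key1=23, key2=67, key3="hello there"',
              'key1=7, key2=22, key3="how are you"',
              'key1=2, key2=77, key3="nice day, thanks"')

# Split on "=": V1 is "key1", V2 is "23, key2", V3 is "67, key3", V4 is the text
test <- read.table(text = kv_lines, sep = "=",
                   header = FALSE, stringsAsFactors = FALSE)

# Numeric part is whatever precedes the comma
extract <- function(x)
  as.numeric(sapply(strsplit(x, ","), "[", 1))
test1 <- data.frame(sapply(test[, 2:3], extract), test[, 4],
                    stringsAsFactors = FALSE)

# First key comes from V1; the others follow the comma in V2 and V3
later_keys <- sapply(strsplit(unlist(test[1, 2:3]), ","), "[", 2)
keys <- c(test[1, 1], trimws(unname(later_keys)))
names(test1) <- keys
```

With the names in place, test1$key1 and test1$key2 give the numeric columns and test1$key3 the text.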
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Frank Singleton
> Sent: Friday, January 18, 2013 5:21 AM
> To: r-help at r-project.org
> Subject: [R] reading multiple key=value pairs per line
Hi,
Maybe this helps:
Lines1 <- readLines(textConnection('key1=23, key2=67, key3="hello there"
key1=7, key2=22, key3="how are you"
key1=2, key2=77, key3="nice day, thanks"'))
# strip quotes and commas, drop the key names, then split on "="
res <- read.table(text = gsub("key{0,1}\\d", "", gsub("[\",]", "", Lines1)), sep = "=", header = FALSE, stringsAsFactors = FALSE)[-1]
names(res) <- paste(substr(Lines1, 1, 3), 1:3, sep = "")
res
#  key1 key2            key3
#1   23   67     hello there
#2    7   22     how are you
#3    2   77 nice day thanks
A.K.
Hi,
Sorry, there was a mistake. I didn't notice the comma in key3.
Lines1 <- readLines(textConnection('key1=23, key2=67, key3="hello there"
key1=7, key2=22, key3="how are you"
key1=2, key2=77, key3="nice day, thanks"'))
# strip only the quotes this time, so the comma inside key3 survives
res1 <- read.table(text = gsub("key{0,1}\\d", "", gsub("[\"]", "", Lines1)), sep = "=", header = FALSE, stringsAsFactors = FALSE)[-1]
# drop the trailing comma left on the first two columns and convert to numeric
res1[, 1:2] <- sapply(res1[, 1:2], function(x) as.numeric(gsub(",\\s*$", "", x)))
names(res1) <- paste(substr(Lines1, 1, 3), 1:3, sep = "")
res1
#  key1 key2             key3
#1   23   67      hello there
#2    7   22      how are you
#3    2   77 nice day, thanks
str(res1)
#'data.frame':    3 obs. of  3 variables:
# $ key1: num  23 7 2
# $ key2: num  67 22 77
# $ key3: chr  "hello there" "how are you" "nice day, thanks"
A.K.
You could use the strapply function from the gsubfn package to extract the data from the strings. This will return a list that you could pass to do.call(rbind, ...). The stringr package may have something similar or an alternative (but I am less familiar with that package).
--
Gregory (Greg) L. Snow Ph.D.
538280@gmail.com
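The same extract-then-rbind pattern can be sketched in base R, with regmatches() and gregexpr() swapped in for strapply so that no extra package is needed. This is not the gsubfn approach itself, only an approximation of the idea. It assumes values are either double-quoted strings or runs of characters without commas; kv_lines, pattern and parse_row are made-up names for illustration.

```r
# Sample lines standing in for readLines("input_kvpairs.csv")
kv_lines <- c('key1=23, key2=67, key3="hello there"',
              'key1=7, key2=22, key3="how are you"',
              'key1=2, key2=77, key3="nice day, thanks"')

# One token per key=value pair; the value is a quoted string or has no comma
pattern <- '\\w+=("[^"]*"|[^,]*)'
tokens <- regmatches(kv_lines, gregexpr(pattern, kv_lines, perl = TRUE))

# Turn the tokens of one line into a character vector named by key
parse_row <- function(tok) {
  vals <- sub("^\\w+=", "", tok)      # drop the "key=" prefix
  vals <- gsub('^"|"$', "", vals)     # drop surrounding double quotes
  names(vals) <- sub("=.*$", "", tok) # text before "=" is the key
  vals
}

# Stack the rows; column names come from the keys of the first line
mat <- do.call(rbind, lapply(tokens, parse_row))
dat <- as.data.frame(mat, stringsAsFactors = FALSE)

# Everything is character at this point; convert columns that look numeric
dat[] <- lapply(dat, type.convert, as.is = TRUE)
```

Because the keys travel with each token, this variant does not rely on key names matching a fixed pattern such as key[0-9], although it still assumes that every line has the same keys in the same order.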