Jeroen Ooms wrote:
> What is the most efficient method of parsing a dataframe-like structure
> that has been json encoded in record-based format rather than vector
> based. For example a structure like this:
>
> [ {"name":"joe", "gender":"male", "age":41},
>   {"name":"anna", "gender":"female", "age":23} ]
>
> RJSONIO parses this as a list of lists, which I would then have to apply
> as.data.frame to and append them to an existing dataframe, which is
> terribly slow.
>
>
unlist is pretty fast. The solution below assumes that you already know the
structure of your data, so it is not very flexible, but it should show you
that the conversion to a data.frame is not the bottleneck.
# json
library(RJSONIO)
# [ {"name":"joe", "gender":"male", "age":41},
#   {"name":"anna", "gender":"female", "age":23} ]
# Build a large test data.frame (2 * n rows) and encode it as JSON
n = 300000
d = data.frame(name = rep(c("joe","anna"), n),
               gender = rep(c("male","female"), n),
               age = rep(c("23","41"), n))
dj = toJSON(d)

# Time the JSON parsing step
system.time(d1 <- fromJSON(dj))
#   user  system elapsed
#   4.06    0.26    4.32

# Time the conversion of the parsed list to a data.frame
system.time(
  dd <- data.frame(
    name = unlist(d1$name),
    gender = unlist(d1$gender),
    age = as.numeric(unlist(d1$age)))
)
#   user  system elapsed
#   1.13    0.05    1.18
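
The timing above reads the parsed result column by column (d1$name and so
on). If the parsed result is instead a list with one named list per record,
as described in the question, the same idea of building each column in one
pass might look roughly like the sketch below. The `records` object is a
hypothetical hand-built stand-in for the fromJSON() output, so that the
example runs without any package, and the field names are assumed to be
known in advance.

```r
# Hypothetical stand-in for a parsed record-based JSON array:
# one named list per record (assumed shape, not actual fromJSON() output).
records <- list(
  list(name = "joe",  gender = "male",   age = 41),
  list(name = "anna", gender = "female", age = 23)
)

# Extract one field from every record at a time. vapply() also checks
# that each extracted value has the expected type and length.
dd <- data.frame(
  name   = vapply(records, function(r) r[["name"]],   character(1)),
  gender = vapply(records, function(r) r[["gender"]], character(1)),
  age    = vapply(records, function(r) r[["age"]],    numeric(1)),
  stringsAsFactors = FALSE
)
```

This builds the data.frame once from three vectors, instead of calling
as.data.frame on every record and appending the rows one at a time. A
record with a missing field would make vapply() stop with an error, so
incomplete records would need to be handled separately.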
--
View this message in context:
http://r.789695.n4.nabble.com/Parsing-JSON-records-to-a-dataframe-tp3178646p3178753.html
Sent from the R help mailing list archive at Nabble.com.