Ted Byers
2010-Jul-16 15:54 UTC
[R] I need help making a data.frame comprised of selected columns of an original data frame.
I must have missed something simple, but still, I don't know what.
I obtained my basic data as follows:
x <- sprintf("SELECT m_id, sale_date, YEAR(sale_date) AS sale_year,
    WEEK(sale_date) AS sale_week, return_type,
    0.0001 + DATEDIFF(return_date, sale_date) AS elapsed_time
  FROM `merchants2`.`risk_input`
  WHERE DATEDIFF(return_date, sale_date) IS NOT NULL")
moreinfo <- dbGetQuery(con, x)
I then made the data frame I want to use as follows:
fun_m_id <- function(df) {
  if (length(df$elapsed_time) > 5) {
    rv = fitdist(df$elapsed_time, "exp")
    rv$mid = df$m_id[1]
    rv
  }
}
aaa <- lapply(split(moreinfo,list(moreinfo$m_id),drop = TRUE), fun_m_id)
m_id_default_res <- do.call(rbind, aaa)
At this point, each row in m_id_default_res corresponds to one data.frame
produced by fitdist. When I print it, I get the output I expected.
However, I need to store only some of it into my DB.
And then, because fitdist produces a data frame that includes a lot of info
I don't need to store in the DB, I tried making a new data.frame containing
only the info I need as follows:
ndf = data.frame()
for (i in 1:length(m_id_default_res[,1])) {
  ndf$mid[i] = m_id_default_res$mid[i]
  ndf$estimate[i] = m_id_default_res$estimate[i]
  ndf$sd[i] = m_id_default_res$sd[i]
  ndf$n[i] = m_id_default_res[i]
  ndf$loglik[i] = m_id_default_res$loglik[i]
  ndf$aic[i] = m_id_default_res$aic[i]
  ndf$bic[i] = m_id_default_res$bic[i]
  ndf$chisq[i] = m_id_default_res$chisq[i]
  ndf$chisqpvalue[i] = m_id_default_res$chisqpvalue[i]
  ndf$chisqdf[i] = m_id_default_res$chisqdf[i]
}
ndf
And I get the following error:
Error in `$<-.data.frame`(`*tmp*`, "n", value =
list(0.114752782316094)) :
replacement has 1 rows, data has 0
I need to either get rid of the columns in m_id_default_res that I don't
need, or copy only the columns I do need to a new data.frame. How
do I do this? Obviously, doing an element-wise copy, at least as I tried to
do it, doesn't work.
Thanks,
Ted
Steve Lianoglou
2010-Jul-16 16:04 UTC
[R] I need help making a data.frame comprised of selected columns of an original data frame.
Hi,

First: it's kind of hard to play along w/o some reproducible data. To that end, you can paste into an email the output of:

dput(moreinfo)

If there are lots of rows in `moreinfo`, just give us the first ~10-20:

dput(head(moreinfo, 20))

Anyway:

<snip>
> At this point, each row in m_id_default_res corresponds to one data.frame
> produced by fitdist. When I print it, I get the output I expected.
> However, I need to store only some of it into my DB.
>
> And then, because fitdist produces a data frame that includes a lot of info
> I don't need to store in the DB, I tried making a new data.frame containing
> only the info I need as follows:
> ndf = data.frame()
> for (i in 1:length(m_id_default_res[,1])) {
>   ndf$mid[i] = m_id_default_res$mid[i]
>   ndf$estimate[i] = m_id_default_res$estimate[i]
>   ndf$sd[i] = m_id_default_res$sd[i]
>   ndf$n[i] = m_id_default_res[i]
>   ndf$loglik[i] = m_id_default_res$loglik[i]
>   ndf$aic[i] = m_id_default_res$aic[i]
>   ndf$bic[i] = m_id_default_res$bic[i]
>   ndf$chisq[i] = m_id_default_res$chisq[i]
>   ndf$chisqpvalue[i] = m_id_default_res$chisqpvalue[i]
>   ndf$chisqdf[i] = m_id_default_res$chisqdf[i]
> }

Forget the for loop. How about:

ndf <- m_id_default_res[, c('mid', 'estimate', 'sd', 'loglik', 'aic', 'bic', 'chisq', 'chisqpvalue', 'chisqdf')]

Having just written that, I see something strange in your for loop. Specifically this line:

>   ndf$n[i] = m_id_default_res[i]

m_id_default_res is a data.frame, right? Why don't you try to see what `m_id_default_res[1]` returns. I'm not sure that that's what your error message is coming from, but I foresee this to be a problem anyway, if I follow your "build up" code correctly.

Hope that helps,

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
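
To make the column-selection suggestion concrete, here is a minimal sketch in base R. The data frame `fits`, its values, the `extra` column, and the list `fit_list` are all made up for illustration; they only stand in for m_id_default_res, whose real contents are not shown in this thread. The second half assumes the fits might be list-like objects rather than data frame rows, in which case pulling out named fields and binding them is one possible route.

```r
# Toy stand-in for m_id_default_res (hypothetical values, not real output).
fits <- data.frame(
  mid      = c(101, 102, 103),
  estimate = c(0.11, 0.25, 0.08),
  sd       = c(0.010, 0.030, 0.007),
  n        = c(120L, 64L, 210L),
  loglik   = c(-380.2, -152.7, -741.9),
  extra    = c("a", "b", "c")   # a column we do not want to store
)

# Select the wanted columns by name in one step, with no loop.
# drop = FALSE keeps the result a data frame even if only one name is given.
keep <- c("mid", "estimate", "sd", "n", "loglik")
ndf <- fits[, keep, drop = FALSE]

# Why the element-wise copy fails: a column assigned with `$<-` must have
# as many values as the data frame has rows, and an empty data frame has zero.
empty <- data.frame()
msg <- tryCatch({
  empty$n[1] <- 120L
  "no error"
}, error = function(e) conditionMessage(e))

# If the fits are list-like objects instead of rows of a data frame,
# one option is to extract the needed fields from each and bind the rows.
fit_list <- list(
  list(mid = 101, estimate = 0.11, loglik = -380.2, unused = "x"),
  list(mid = 102, estimate = 0.25, loglik = -152.7, unused = "y")
)
from_list <- do.call(rbind, lapply(fit_list, function(f) {
  data.frame(mid = f$mid, estimate = f$estimate, loglik = f$loglik)
}))
```

Running str(ndf) and str(from_list) afterwards is a quick way to confirm that each column is an ordinary vector rather than a list, which matters before writing the result to a database.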