thr3ads.net - similar to: "syntax for extending a line in a script??"

Displaying 20 results from an estimated 3000 matches similar to: "syntax for extending a line in a script??"

anyone know why package "RandomForest" na.roughfix is so slow??

2010 Jun 30

Hi all, I am using the package "random forest" for random forest predictions. I like the package. However, I have fairly large data sets, and it can often take *hours* just to go through the "na.roughfix" call, which simply goes through and cleans up any NA values to either the median (numerical data) or the most frequent occurrence (factors). I am going to start
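The cleanup described in this post (median for numeric columns, most frequent value for factors) can be sketched in base R as below. This only illustrates the idea; it is not the randomForest implementation, and `rough_fix_sketch` is a hypothetical name.

```r
# Hypothetical helper: impute NAs column by column.
# Numeric columns get the column median; factor columns get the most frequent level.
rough_fix_sketch <- function(df) {
  for (col in names(df)) {
    x <- df[[col]]
    if (!anyNA(x)) next                      # nothing to fix in this column
    if (is.numeric(x)) {
      x[is.na(x)] <- median(x, na.rm = TRUE)
    } else if (is.factor(x)) {
      counts <- table(x)                     # table() ignores NA by default
      x[is.na(x)] <- names(counts)[which.max(counts)]
    }
    df[[col]] <- x
  }
  df
}

# Small made-up example
df <- data.frame(a = c(1, NA, 3, 10), b = factor(c("x", "y", NA, "y")))
print(rough_fix_sketch(df))
```

Each column is scanned once, so a sketch like this is a reasonable baseline to time against when the packaged call seems slow on large data.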

manipulating the Date & Time classes

2011 Feb 08

Hello, This is mostly to developers, but in case I missed something in my literature search, I am sending this to the broader audience. - Are there any plans in the works to make "time" classes a bit more friendly to the rest of the "R" world? I am not suggesting to allow for fancy functions to manipulate times, per se, or to figure out how to properly

question regarding "varImpPlot" results vs. model$importance data on package "RandomForest"

2010 Jul 13

Hi everyone, I have another "Random Forest" package question: - my (presumably incorrect) understanding of the varImpPlot is that it should plot the "% increase in MSE" and "IncNodePurity" exactly as can be found from the "importance" section of the model results. - However, the plot does not, in fact, match the "importance"

ideas, modeling highly discrete time-series data

2010 Dec 20

Hello all, First of all, thanks to those of you who helped me a week or so ago with managing a time series with varying gaps between the data series in 'R'. (My final preferred solution was to use "its" function & then forecast(Arima( ) ). ) My next question is a general statistical question where I'd like some advice, for those willing / able to proffer any wisdom:

How to get 'R' to talk BACK to other languages / scripts??

2010 Dec 03

Hey everyone, I know that I can call 'R' from other scripts, and that I can make command calls from 'R' (e.g., using system() ). But how can I get 'R' to RETURN values to the script that called it. E.g., I would like to be able to do something like the following (as a simpler example) from a bash script: #!/bin/bash myTest=echo /usr/local/bin/R --no-restore

how to convert "sloppy data" into a time series?

2010 Dec 17

Hi All, First let me state that I did search for a while on r-help, google, and using the "sos" package inside of 'R', without much luck. I want to know how to create a univariate time series from a set of data that will have huge time gaps in it. For instance, here is a snapshot of a piece of data that I would like to analyze: *Row queued_time

odd behavior of "summary" function

2010 Aug 24

Hello All, Using the standard "summary" function in 'R', I ran across some odd behavior that I cannot understand. Easy to reproduce: Typing: summary(c(6,207936)) Yields:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      6   51990  104000  104000  156000  207900
None of these values are correct except for the minimum. If I perform "quantile(c(6,
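The values quoted in this post match the exact quantiles rounded to four significant digits. That suggests a display-rounding effect rather than a miscalculation, though this is an assumption: the truncated post does not confirm the cause, and the default print precision of `summary` may differ between R versions. A minimal sketch using only `quantile` and `signif`:

```r
x <- c(6, 207936)

exact   <- quantile(x)        # exact quartiles: 6, 51988.5, 103971, 155953.5, 207936
rounded <- signif(exact, 4)   # keep four significant digits

print(exact)
print(rounded)                # 6, 51990, 104000, 156000, 207900
print(signif(mean(x), 4))     # 104000
```

The rounded values agree with the figures reported in the post, including the maximum of 207900.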

how can I evaluate a formula passed as a string?

2010 Jun 24

Hey everyone, I've been using 'R' long enough that I should have some idea of what the heck either expression() or eval() are really ever useful for. I come across another instance where I WISH they would be useful, but I cannot get them to work. Here is the crux of what I would like to do: presume df looks like this A B C === === === M 45 0 M

trouble with RODBC -- chopping off part of column names

2010 Oct 01

Hello all, I have a strange / interesting problem that might be 'R' settings themselves, or it might be something with the OS. I am using the RODBC library. I have a script that goes out and, before making a query for a big data set, will first query for the column names of the data set. The column names could sometimes be quite long (e.g., "Time Background Estimation

ggplot2 histograms... a subtle error found

2010 Jul 29

Hello all, I have a peculiar and particular bug that I stumbled across with ggplot2. I cannot seem to replicate it with anything other than my specific data set. Here is the problem: - when I try to plot a histogram, allowing for ggplot2 to decide the binwidths itself, I get the following error: - stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to

garbage collection & memory leaks in 'R', it seems...

2010 Jul 16

Hello developers, I noticed that if I am running 'R', type "rm(list=objects())" and "gc()", 'R' will still be consuming (a lot) more memory than when I then close 'R' and re-open it. In my ignorance, I'm presuming this is something in 'R' where it doesn't really do a great job of garbage collection... at least not nearly as well as

Problem with a if statement inside a function

2011 Jun 09

I have a really long function, and at the end of the function, I am using an if statement to tag certain keywords based on whether they have certain values contained in them. However, the if statement doesn't seem to work. When I had split up the commands into various functions, it worked fine, but I'm not sure what's going on now that it's combined into a single function. myfunc

Data frame from list of lists

2003 Sep 22

This seems to be a simple problem, and I feel that there ought to be a simple answer, but I can't seem to find it. I have a function that returns a number of values as a heterogeneous list - always the same length and same names(), but a number of different data types, including character. I want to apply it to many inputs, resulting in a list of lists. I would like to turn this list of
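One common base-R approach to the situation described here is to turn each inner list into a one-row data frame and stack the rows. The sketch below assumes every inner list has the same names and holds scalar values; `get_info` is a hypothetical stand-in for the poster's function.

```r
# Hypothetical function returning a heterogeneous list with fixed names
get_info <- function(id) {
  list(id = id, label = paste0("item", id), ok = id %% 2 == 0)
}

results <- lapply(1:3, get_info)   # a list of lists

# One single-row data frame per inner list, then bind the rows together
rows <- lapply(results, function(r) as.data.frame(r, stringsAsFactors = FALSE))
df <- do.call(rbind, rows)

str(df)
```

Passing `stringsAsFactors = FALSE` keeps the character fields as character regardless of the R version's default.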

Strange variable names in factor regression

2024 May 09

On converting character variables to ordered factors, regression result has strange names. Is it possible to obtain same variable names with and without intercept? Thanks, Naresh
mydf <- data.frame(date = seq.Date(as.Date("2024-01-01"), as.Date("2024-03-31"), by = 1))
mydf[, "wday"] <- weekdays(mydf$date, abbreviate = TRUE)
mydf.work <- subset(mydf, !(wday

Creating a "shifted" month (one that starts not on the first of each month but on another date)

2011 May 19

Hello! I have a data frame with dates. I need to create a new "month" that starts on the 20th of each month - because I'll need to aggregate my data later by that "shifted" month. I wrote the code below and it works. However, I was wondering if there is some ready-made function in some package - that makes it easier/more elegant? Thanks a lot! # Example data:
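The poster's own code is cut off here, so the following is an independent sketch rather than their solution. It assumes a period that starts on the 20th should carry the label of the calendar month it starts in; `shifted_month` is a hypothetical name. Subtracting 19 days moves the 20th onto the 1st, after which the ordinary year-month label can be used for aggregation.

```r
# Hypothetical helper: label each date with its "shifted" month.
# Assumes start_day is between 1 and 28, so the shift never skips a month.
shifted_month <- function(dates, start_day = 20) {
  format(as.Date(dates) - (start_day - 1), "%Y-%m")
}

d <- as.Date(c("2011-05-19", "2011-05-20", "2011-03-01", "2011-01-05"))
print(data.frame(date = d, shifted = shifted_month(d)))
```

The resulting label can then be used as a grouping variable, for example in `aggregate` or `tapply`.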

Comparing "transform" to "with"

2007 Sep 01

Hi All, I've been successfully using the with function for analyses and the transform function for multiple transformations. Then I thought, why not use "with" for both? I ran into problems & couldn't figure them out from help files or books. So I created a simplified version of what I'm doing:
rm( list=ls() )
x1<-c(1,3,3)
x2<-c(3,2,1)
x3<-c(2,5,2)

Problem with subset() function?

2009 Jan 20

Hi all, Can anyone explain why the following use of the subset() function produces a different outcome than the use of the "[" extractor? The subset() function as used in density(subset(mydf, ht >= 150.0 & wt <= 150.0, select = c(age))) appears to me from documentation to be equivalent to density(mydf[mydf$ht >= 150.0 & mydf$wt <= 150.0, "age"])
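Two differences between these forms can be shown on made-up data: `subset()` with `select` returns a data frame even for a single column, and it drops rows where the condition is `NA`, while `[` with a column name returns a plain vector and keeps `NA` rows. Whether either one explains the poster's result is an assumption, since the rest of the post is cut off.

```r
# Made-up data with one missing height
mydf <- data.frame(ht  = c(160, 140, NA, 170),
                   wt  = c(120, 130, 140, 160),
                   age = c(30, 40, 50, 60))

a <- subset(mydf, ht >= 150.0 & wt <= 150.0, select = c(age))
b <- mydf[mydf$ht >= 150.0 & mydf$wt <= 150.0, "age"]

print(class(a))   # "data.frame": one column, NA-condition row dropped
print(class(b))   # "numeric": a vector, with an NA for the NA-condition row
print(a)
print(b)

# To get a numeric vector from the subset() form, extract the column
age_vec <- a[["age"]]
```

A function that expects a numeric vector would need `a[["age"]]` (or `a$age`) rather than the one-column data frame.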

Surprising Behavior of 'tapply'

2005 Feb 03

Dear all, I wanted to make a two-way-table of two variables with a counting variable stored in another column of a dataframe. In version 1.9.1, the behavior is as expected as shown in the simplified example code.
> sex <- rep(c("F", "M"), 5)
> income <- c(rep("low", 5), rep("high", 5))
> count <- 1:10
> mydf <-

Error: missing values where TRUE/FALSE needed

2011 Jun 09

I'm writing a function and keep getting the following error message. myfunc <- function(lst) { lst <- list(roots = c("car insurance", "auto insurance"), roots2 = c("insurance"), prefix = c("cheap", "budget"), prefix2 = c("low cost"), suffix = c("quote", "quotes"), suffix2 = c("rate",
