Displaying 20 results from an estimated 3000 matches similar to: "syntax for extending a line in a script??"
2010 Jun 30
2
anyone know why package "RandomForest" na.roughfix is so slow??
Hi all,
I am using the "randomForest" package for random forest predictions. I
like the package. However, I have fairly large data sets, and it can often
take *hours* just to go through the "na.roughfix" call, which simply goes
through and cleans up any NA values to either the median (numerical data) or
the most frequent occurrence (factors).
I am going to start
2011 Feb 08
4
manipulating the Date & Time classes
Hello,
This is mostly to developers, but in case I missed something in my
literature search, I am sending this to the broader audience.
- Are there any plans in the works to make "time" classes a bit more
friendly to the rest of the "R" world? I am not suggesting to allow for
fancy functions to manipulate times, per se, or to figure out how to
properly
2010 Jul 13
1
question regarding "varImpPlot" results vs. model$importance data on package "RandomForest"
Hi everyone,
I have another "Random Forest" package question:
- my (presumably incorrect) understanding of the varImpPlot is that it
should plot the "% increase in MSE" and "IncNodePurity" exactly as can be
found from the "importance" section of the model results.
- However, the plot does not, in fact, match the "importance"
2010 Dec 20
1
ideas, modeling highly discrete time-series data
Hello all,
First of all, thanks to those of you who helped me a week or so ago
managing a time series with varying gaps between the data series in 'R'.
(My final preferred solution was to use "its" function & then
forecast(Arima( ) ). )
My next question is a general statistical question where I'd like some
advice, for those willing / able to proffer any wisdom:
2010 Dec 03
2
How to get 'R' to talk BACK to other languages / scripts??
Hey everyone,
I know that I can call 'R' from other scripts, and that I can make
command calls from 'R' (e.g., using system() ). But how can I get 'R' to
RETURN values to the script that called it. E.g., I would like to be able
to do something like the following (as a simpler example) from a bash
script:
#!/bin/bash
myTest=echo /usr/local/bin/R --no-restore
2010 Dec 17
2
how to convert "sloppy data" into a time series?
Hi All,
First let me state that I did search for a while on r-help, google, and
using the "sos" package inside of 'R', without much luck. I want to know
how to create a univariate time series from a set of data that will have
huge time gaps in it. For instance, here is a snapshot of a piece of data
that I would like to analyze:
*Row queued_time
2010 Aug 24
3
odd behavior of "summary" function
Hello All,
Using the standard "summary" function in 'R', I ran across some odd
behavior that I cannot understand. Easy to reproduce:
Typing:
summary(c(6,207936))
Yields:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      6   51990  104000  104000  156000  207900
None of these values are correct except for the minimum. If I perform
"quantile(c(6,
2010 Jun 24
1
how can I evaluate a formula passed as a string?
Hey everyone,
I've been using 'R' long enough that I should have some idea of what the
heck either expression() or eval() are really ever useful for. I come
across another instance where I WISH they would be useful, but I cannot get
them to work.
Here is the crux of what I would like to do:
presume df looks like this
A B C
=== === ===
M 45 0
M
2010 Oct 01
2
trouble with RODBC -- chopping off part of column names
Hello all,
I have a strange / interesting problem that might be 'R' settings
themselves, or it might be something with the OS.
I am using the RODBC library. I have a script that goes out and, before
making a query for a big data set, will first query for the column names of
the data set. The column names could sometimes be quite long (e.g., "Time
Background Estimation
2010 Jul 29
2
ggplot2 histograms... a subtle error found
Hello all,
I have a peculiar and particular bug that I stumbled across with
ggplot2. I cannot seem to replicate it with anything other than my specific
data set.
Here is the problem:
- when I try to plot a histogram, allowing for ggplot2 to decide the
binwidths itself, I get the following error:
- stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to
2010 Jul 16
1
garbage collection & memory leaks in 'R', it seems...
Hello developers,
I noticed that if I am running 'R', type "rm(list=objects())" and
"gc()", 'R' will still be consuming (a lot) more memory than when I then
close 'R' and re-open it. In my ignorance, I'm presuming this is something
in 'R' where it doesn't really do a great job of garbage collection... at
least not nearly as well as
2011 Jun 09
2
Problem with a if statement inside a function
I have a really long function, and at the end of the function, I am using an
if statement to tag certain keywords based on whether they have certain
values contained in them.
However, the if statement doesn't seem to work.
When I had split up the commands into various functions, it worked fine, but
I'm not sure what's going on now that it's combined into a single function.
myfunc
2003 Sep 22
1
Data frame from list of lists
This seems to be a simple problem, and I feel that there ought to be a
simple answer, but I can't seem to find it.
I have a function that returns a number of values as a heterogeneous list -
always the same length and same names(), but a number of different data
types, including character. I want to apply it to many inputs, resulting in
a list of lists.
I would like to turn this list of
2024 May 09
2
Strange variable names in factor regression
On converting character variables to ordered factors, the regression result
has strange names. Is it possible to obtain the same variable names with
and without intercept?
Thanks,
Naresh
mydf <- data.frame(date = seq.Date(as.Date("2024-01-01"),
as.Date("2024-03-31"), by = 1))
mydf[, "wday"] <- weekdays(mydf$date, abbreviate = TRUE)
mydf.work <- subset(mydf, !(wday
2011 May 19
1
Creating a "shifted" month (one that starts not on the first of each month but on another date)
Hello!
I have a data frame with dates. I need to create a new "month" that
starts on the 20th of each month - because I'll need to aggregate my
data later by that "shifted" month.
I wrote the code below and it works. However, I was wondering if there
is some ready-made function in some package - that makes it
easier/more elegant?
Thanks a lot!
# Example data:
2007 Sep 01
2
Comparing "transform" to "with"
Hi All,
I've been successfully using the with function for analyses and the
transform function for multiple transformations. Then I thought, why not
use "with" for both? I ran into problems & couldn't figure them out from
help files or books. So I created a simplified version of what I'm
doing:
rm( list=ls() )
x1<-c(1,3,3)
x2<-c(3,2,1)
x3<-c(2,5,2)
2009 Jan 20
5
Problem with subset() function?
Hi all,
Can anyone explain why the following use of
the subset() function produces a different
outcome than the use of the "[" extractor?
The subset() function as used in
density(subset(mydf, ht >= 150.0 & wt <= 150.0, select = c(age)))
appears to me from documentation to be equivalent to
density(mydf[mydf$ht >= 150.0 & mydf$wt <= 150.0, "age"])
2005 Feb 03
2
Surprising Behavior of 'tapply'
Dear all,
I wanted to make a two-way-table of two variables with a counting
variable stored in another column of a dataframe. In version 1.9.1, the
behavior is as expected as shown in the simplified example code.
> sex <- rep(c("F", "M"), 5)
> income <- c(rep("low", 5), rep("high", 5))
> count <- 1:10
> mydf <-
2011 Jun 09
1
Error: missing values where TRUE/FALSE needed
I'm writing a function and keep getting the following error message.
myfunc <- function(lst) {
lst <- list(roots = c("car insurance", "auto insurance"),
roots2 = c("insurance"), prefix = c("cheap", "budget"),
prefix2 = c("low cost"), suffix = c("quote", "quotes"),
suffix2 = c("rate",