Displaying 20 results from an estimated 8000 matches similar to: "na.omit leaves cases with NA's intact"
2007 Mar 15
2
replacing all NA's in a dataframe with zeros...
I've seen how to replace the NA's in a single column of a data frame:
> mydata$ncigs[is.na(mydata$ncigs)] <- 0
But this is just one column... I have thousands of columns (!) that I need
to do this for, and I can't figure out a way, outside of the dreaded loop, to
replace all NA's in an entire data frame (all vars) without naming each var
separately. Yikes.
I'm racking my
2005 Nov 07
4
R seems to "stall" after several hours on a long series of analyses... where to start?
Not sure where to even start on this.... I'm hoping there's some debugging I
can do...
I have a loop that cycles through several different data sets (same
structure, different info), performing randomForest growth and
predictions... saving out the predictions for later study...
I get about 5 hours in (9%... of the planned iterations.. yikes!) and R just
freezes.
This happens in
2005 Oct 27
1
Repost: Examples of "classwt", "strata", and "sampsize" in randomForest?
"classwt" in the current version of the randomForest package doesn't work
too well. (It's what was in version 3.x of the original Fortran code by
Breiman and Cutler, not the one in the new Fortran code.) I'd advise
against using it.
"sampsize" and "strata" can be used in conjunction. If "strata" is not
specified, the class labels will be used.
2005 Oct 27
1
Repost: Examples of "classwt", "strata", and "sampsize" in randomForest?
Sorry for the repost, but I've really been looking, and can't find any
syntax direction on this issue...
Just browsing the documentation, and searching the list came up short... I
have some unbalanced data and was wondering if, in a "0" v "1"
classification forest, some combo of these options might yield better
predictions when the proportion of one class is low (less
2005 Aug 10
2
Creating new columns inside a loop
Ok, I know R isn't an optimal environment for looping (or so I've heard) but
I have a need to loop through columns of data and create new columns of data
based on calculations within rows...
I'm sure there's a help file, but I'm not sure what search terms to use to
find it! The problem is that these new columns need to have names that I can
later access... Like NewVar1,
2007 Mar 23
1
memory, speed, and assigning results into new v. existing variable
I have a very large data frame, and I'm doing a conversion of all columns
into factors. It takes a while (thanks to folks here, though, for making it
faster!), but I am wondering about optimization from a memory perspective...
Internally, am I better off assigning into a new data frame, or doing one of
these:
dataframe<-someoperation(dataframe)
It would seem that re-assigning into the same data
2011 Aug 26
2
cbind giving NA's?
I have two xts objects, call them "a" and "b", and am trying to merge them...
> class(a)
[1] "xts" "zoo"
> class(b)
[1] "xts" "zoo"
> head(a)
2010-04-01 7.6343
2010-04-02 7.6343
2010-04-03 7.5458
2010-04-04 7.4532
2010-04-05 7.4040
2010-04-06 7.3317
> head(b)
2010-04-01 568.80
2010-04-05 571.01
2010-04-06
2005 Nov 07
1
R seems to "stall" after several hours on a long series of analyses... where to start?
You can test if the problem is accumulation in memory registers, which is
certainly what this sounds like. Just do a loop over a reasonably small
number of iterations and store or print the time between each iteration. If
the cause is memory accumulation, it will run optimally for the first few iterations,
after which the time will increase noticeably (essentially exponentially,
hence it ultimately freezes up). If
2010 Nov 21
1
abline(h=whatever) not working in candleChart() (in quantmod)?
Hello, all--
I am having some fun playing with the graphing in quantmod-- very nice! I am
writing a function to calculate (and hopefully plot) support and resistance
lines, but the usual plot call of "abline(h=value)" does not seem to work.
Here's my code:
require(quantmod)
AAPL<-getYahooData("AAPL")
candleChart(AAPL,subset="last 3
2013 Mar 07
2
xts time series object removing time and leaving just the date
I have an xts time series object that has date and time. I started with 1
minute data and used apply.daily(x, sum) to sum the data to one cumulative
value. This function works just fine; however, it leaves a time for the last
summed value, which looks like this: 2006-07-19 14:58:00. I need to have just
the date and to remove the time value of 14:58:00, leaving only the date
value of 2006-07-19.
2005 Nov 23
2
TryCatch() with read.csv("http://...")
Hi, folks!
I'm trying to pull in data using read.csv("my URL goes here"), and it really
works fantastically. Amazing to pull in live data right off the internet,
into RAM, and get busy...
however...
occasionally there is a server problem, or the data are not up yet, and
instead of pushing through a nice CSV file, the server sends a 404 "Not
Found" page...
Since the
2005 Oct 09
1
Insert value from same column of another row (lag across observations)
I know I've done this before, but it's been a while and I can't find quite
what I need in the help files or archives.
I have a text field in a very large data frame. I'd like to add a column
that represents the value from an existing field, from the next record (the
data are sorted). I'm trying to represent "what happens tomorrow", so the
"today" row would
2013 Mar 18
2
data.frame with NA
I have this little data.frame
http://dl.dropbox.com/u/102669/nanotna.rdata
Two columns contain NA, so the best thing to do is use the na.locf function (with
fromLast = T)
But the na.locf function doesn't work because the NAs in my data.frame are not
recognized as real NA.
Is there a way to substitute the fake NAs with real NA? In that case the na.locf
function should work
Thank you
2005 Sep 13
1
Anyone have any code for importing data from NAMCS?
The National Ambulatory Medical Care Survey is a free data set from the
CDC that I'd like to analyze using the "survey" package in R. Before I dive
in, though, it occurred to me that someone may already have gone to the
trouble of writing code that will bring in the data and assign the variable
names and value labels. This is a big file, so doing it from scratch will
take
2005 May 15
1
Not sure if this is "aggregate" or some other task.
I have data where I've taken some measurements three times... twice in
rapid succession so I could check test-retest reliability of a piece of
equipment, and then a third measurement some time later.
Now I'd like to do an analysis where I have two scores... the first being
the mean of the first two taken the same day, and the second being the one
taken later.
I have a lot of
2013 Mar 16
2
Find NA in xts object
Hi to all, I'm new to R.
I have an xts object.
Can I find:
a) how many NAs are in my object?
b) if possible, where (in which row) they are?
Thank you
2006 May 23
1
Survey proportions... Can I use population as denominator?
Just giving the survey package a spin...
I'm accustomed to Stata, and it seems very similar in many respects. One
thing is throwing me, however.
I've gotten my data in, and specified the design. Looks like the weighting
is right (based on published population estimates from these data), but now
I'd like to check my "marginal means" for proportions against those that
have
2009 Dec 19
1
as.xts convert all my numeric data to character
Hello, all... I've been playing with the TTR package and quantmod, and I'm
loading the Chicago Board Options Exchange put/call ratio data via a simple
read.csv call...
CBOEtotal<-read.csv(file="http://www.cboe.com/publish/ScheduledTask/MktData/datahouse/totalpc.csv",skip=1)
this gives me a data frame with columns....
> names(CBOEtotal)
[1] "Trade_date"
2005 Oct 04
1
"Survey" package and NAMCS data... unsure of specification
Hello, all.
I wanted to use the "survey" package to analyze data from the National
Ambulatory Medical Care Survey, and am having some difficulty translating
the analysis keywords from one package (Stata) to the other (R). The data
were collected using multistage probability sampling, and there are
variables included to identify the sampling units and weights. Documentation
from the
2012 Jul 22
2
Reading many large files causes R to crash - Possible Bug in R 2.15.1 64-bit Ubuntu
I am reading several hundred files. Anywhere from 50k-400k in size. It
appears that when I read these files with R 2.15.1 the process will hang or
seg fault on the scan() call. This does not happen on R 2.14.1.
This is happening on the Precise (12.04) build of Ubuntu.
I have included everything, but the issue appears to be when performing the
scan in the method parseTickData.
Below is the