Displaying 20 results from an estimated 7000 matches similar to: "Avoiding for loops"
2009 Aug 30
3
Sapply
Hi,
I need a bit of guidance with the sapply function. I've read the help
page, but am still a bit unsure how to use it.
I have a large data frame with about 100 columns and 30,000 rows. One
of the columns is "group" of which there are about 2,000 distinct "groups".
I want to normalize (sum to 1) one of my variables per-group.
Normally, I would just write a huge
2009 Sep 07
2
Confused - better empirical results with error in data
Hi,
I have a strange one for the group.
We have a system that predicts probabilities using a fairly standard svm
(e1071). We are looking at probabilities of a binary outcome.
The input data is generated by a perl script that calculates a bunch of
things, fetches data from a database, etc.
We train the system on 30,000 examples and then test the system on an
unseen set of 5,000 records.
2011 Jan 07
2
Stepwise SVM Variable selection
I have a data set with about 30,000 training cases and 103 variables.
I've trained an SVM (using the e1071 package) for a binary classifier
{0,1}. The accuracy isn't great.
I used a grid search over the C and G parameters with an RBF kernel to
find the best settings.
I remember that for least squares, R has a nice stepwise function that
will try combining subsets of variables to find
2011 Mar 31
3
Create Variable names dynamically
Hi,
I want to create variable names from within my code, but can't find any documentation for this.
An example is probably the best way to illustrate. I am reading data in from a file, doing a bunch of stuff, and want to generate variables with my output. (I could make a "list of lists" and name all the elements, but I really want separate variables.)
#################
#This is
2010 Jun 01
4
Plot multiple columns
I'm running a long MCMC chain that is generating samples for 22 variables.
I have each run of the chain as a row in a matrix.
So: Chain[,1] is the column with all the samples for variable one.
Chain[,2] is the column with all the samples for variable 2, etc.
I'd like to fit all 22 on a single page to print a nice summary. It is
OK if the graphs are small, I just need to show the
2012 Nov 29
7
Fast Normalize by Group
Hi,
I have a very large data set (approx. 100,000 rows).
The data comes from around 10,000 "groups" with about 10 entries per group.
The values are in one column, the group ID is an integer in the second column.
I want to normalize the values by group:
for(g in unique(group)){
x[group==g] <- x[group==g] / sum(x[group==g])
}
This works fine in a loop, but is slow. Is there a faster way to do
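One loop-free way to express the per-group normalization asked about above, as a minimal sketch rather than the thread's accepted answer: base R's `ave` applies a function within each group and returns a vector aligned with the input. The vectors `x` and `group` below are made-up sample data, not the poster's.

```r
# Hypothetical sample data: values and their integer group IDs.
x <- c(1, 3, 2, 2, 4)
group <- c(1L, 1L, 2L, 2L, 2L)

# ave() splits x by group, applies FUN to each piece,
# and puts the results back in the original row order.
x_norm <- ave(x, group, FUN = function(v) v / sum(v))

# Each group's normalized values should now sum to 1.
group_sums <- tapply(x_norm, group, sum)
```

Because `ave` makes a single pass over the groups instead of rescanning `group == g` once per group, it tends to scale much better than the loop when there are thousands of groups.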
2009 Sep 11
4
R on Multi Core
Hi,
Our discussions about 64 bit R have led me to another thought.
I have a nice dual core 3.0 chip inside my Linux Box (Running Fedora 11.)
Is there a version of R that would take advantage of BOTH cores??
(Watching my system performance meter now is interesting, Running R will
hold a single core at 100% perfectly, but the other core sits idle.)
Thanks!
--
Noah
2011 Sep 02
2
Avoiding for Loop for moving average
Hello,
I need to calculate a moving average and an exponentially weighted moving average over a fairly large data set (500K rows).
Doing this in a for loop works nicely, but is slow.
data$ewma <- data$value[1]  # seed with the first observation; rows 2..N are overwritten below
N <- dim(data)[1]
for(i in 2:N){
data$ewma[i] <- alpha * data$ewma[i-1] + (1-alpha) * data$value[i]
}
Since the moving average "accumulates" as we move through the data,
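The recurrence in the loop above, ewma[i] = alpha * ewma[i-1] + (1-alpha) * value[i], can be computed without an explicit loop using a recursive filter. The following is a minimal sketch, assuming the series is seeded with its first observation; `value` and `alpha` are made-up inputs.

```r
# Hypothetical inputs: a short series and a smoothing weight.
value <- c(10, 12, 11, 15, 14)
alpha <- 0.8

# Reference: the explicit loop, seeded with the first observation.
ewma_loop <- numeric(length(value))
ewma_loop[1] <- value[1]
for (i in 2:length(value)) {
  ewma_loop[i] <- alpha * ewma_loop[i - 1] + (1 - alpha) * value[i]
}

# Loop-free: a recursive filter computes y[i] = input[i] + alpha * y[i-1],
# with y[0] taken as 0 by default.
input <- (1 - alpha) * value
input[1] <- value[1]  # so that y[1] equals value[1]
ewma_vec <- as.numeric(stats::filter(input, alpha, method = "recursive"))
```

The call is written as `stats::filter` so that it is not masked by a `filter` function from another attached package.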
2010 Feb 28
2
lapply with data frame
I'm a bit confused on how to use lapply with a data.frame.
For example.
lapply(data, function(x) print(x))
WHAT exactly is passed to the function? Is it each ROW in the data
frame, one by one, or each column, or the entire frame in one shot?
What I want to do is apply a function to each row in the data frame. Is
lapply the right way?
A second application is to normalize a column value by
2009 Oct 16
2
Different way of scaling data
Hi,
I have a data.frame that I need to scale.
I've been using the scale function and it works nicely.
Some of the libraries I'm testing won't accept negative values for data,
so I need to find a way to scale the data from 0 to 1.
Any ideas?
Thanks!
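A common way to map each column onto the range 0 to 1 is min-max rescaling. The sketch below is one possible approach, not necessarily what the thread settled on; the helper name `rescale01` and the data frame `df` are hypothetical.

```r
# Hypothetical data frame containing a negative value.
df <- data.frame(a = c(-2, 0, 2), b = c(10, 20, 40))

# Min-max rescaling: the column minimum maps to 0 and the maximum to 1.
rescale01 <- function(v) {
  lo <- min(v, na.rm = TRUE)
  hi <- max(v, na.rm = TRUE)
  (v - lo) / (hi - lo)
}

# A data frame is a list of columns, so lapply() visits each column in turn.
scaled <- as.data.frame(lapply(df, rescale01))
```

A constant column has `hi - lo` equal to 0 and would come out as `NaN`, so such columns need separate handling.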
2011 Jan 10
2
Step command failing for lm function
Hi,
I have a fairly simple linear regression using the lm function. There
are about 100 variables and 30,000 rows of data. It runs fine and
produces a decent looking R2 value. I'm interested in performing a
stepwise variable selection to see if things can be cleaned up a bit.
Calling the step function returns ONE iteration (all the variables) and
then stops. No errors are reported.
2010 Jul 09
3
accessing return variables from a function
Hi,
I am trying to figure out a "short" way to access two values output from
the sort function.
>x <- c(3,4,3,6,78,3,1,2)
>sort(x, index.return=T)
$x
[1] 1 2 3 3 3 4 6 78
$ix
[1] 7 8 1 3 6 2 4 5
It would be great to do something like this (doesn't work.):
c(y, indexes) <- sort(x, index.return=T)
But that doesn't work.
I CAN grab the output of sort in a
2009 Aug 25
1
Clogit or LRM?
Hello
I believe that I'm getting very close in my modeling application.
I've come across a challenge that I am unable to solve and would really
appreciate the group's opinion.
I've been using the val.prob function from the Design library (Thanks
Frank!!) to both evaluate and visualize my model.
From the scores and graph, it appears that my model is very accurate in
2009 Sep 22
2
Pull Coefficients from MCMCpack models
Hi,
I've been testing some models with the MCMCpack library.
I can run the process and get a nice model "object". I can easily see
the summary and even plot it.
I can't seem to figure out how to:
1) Access the final coefficients in the model
2) Turn the coefficients into a model so I can then run predictions
using them.
A summary command will SHOW me the coefficients, but
2011 Dec 10
1
Difficult subset challenge
Hi,
I'm having difficulty coming up with a good way to subset some data to generate statistics.
My data frame has multiple observations by group.
Here is an overly-simplified toy example of the data
==========================
code v1 v2
G1 1.2 2.3
G1 0 2.4
G1 1.4 3.4
G2 2.9 2.3
G2 4.3 4.4
etc..
=========================
I want to normalize the data *by group* for certain variables.
2010 May 15
3
Discretize factors?
Hi,
I'm looking for an easy way to discretize factors in R.
I've noticed that the lm function does this automatically with a nice
result.
If I have
group <- c("A", "B","B","C","C","C")
and run:
lm(result ~ x1 + group)
The lm function has split the group into separate binary variables {0,1}
before performing the
2009 Aug 26
6
Managing output
Hi,
Is there a way to build up a vector, item by item? In perl, we can
"push" an item onto an array. How can we do this in R?
I have a loop that generates values as it goes. I want to end up with a
vector of all the loop results.
In perl it would be:
for(item in list){
result <- 2*item^2 (Or whatever formula, this is just a pseudo example)
Push(@result_list,
2009 Sep 03
2
Easy way to get top 2 items from vector
Hi,
I use the max function often to find the top value from a matrix or
column of a data.frame.
Now I'm looking to find the top 2 (or three) values from my data.
I know that I could sort the list and then access the first two items,
but that seems like the "long way". Is there some way to access "max_2"
or similar?
Thanks!
--
Noah
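For the top two (or three) values, sorting in decreasing order and taking the head is usually short enough in practice. The sketch below uses a made-up vector `x`, and also shows a partial sort, which only guarantees that the requested positions are correct and so can avoid a full sort on long vectors.

```r
# Hypothetical data.
x <- c(3, 4, 3, 6, 78, 3, 1, 2)

# Full sort, largest first, then keep the first two values.
top2 <- sort(x, decreasing = TRUE)[1:2]

# Positions of those two values in the original vector.
top2_idx <- order(x, decreasing = TRUE)[1:2]

# Partial sort: negate so the largest values come first, and ask only for
# positions 1 and 2 to be placed correctly.
top2_partial <- -sort(-x, partial = 1:2)[1:2]
```

Changing `1:2` to `1:3` gives the top three in the same way.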
2010 May 12
3
Summarizing counts by multiple factors
Hi,
An example data set is:
group level color
A 1 "blue"
A 1 "Red"
B 1 "blue"
B 2 "Red"
A 2 "Red"
B 2 "Red"
B 2 "blue"
B 2 "blue"
A 2 "blue"
A 2 "Red"
2009 Aug 02
2
Strange column shifting with read.table
Hi,
I am reading in a dataframe from a CSV file. It has 70 columns. I do
not have any kind of unique "row id".
rawdata <- read.table("r_work/train_data.csv", header=T, sep=",",
na.strings=0)
When training an svm, I keep getting an error
So, as an experiment, I wrote the data back out to a new file so that I
could see what the svm function sees.