Displaying 20 results from an estimated 10000 matches similar to: "applying to dataframe rows"
2008 Sep 17
1
creating horizontal dataframes with column names
Greetings -- in order to write back to SQL databases, one needs to
create a dataframe with values. I can get column names of an existing
table with sqlColumns. Say I have a vector of values (if they're all
the same type), or a list (if different). How do I create a dataframe
with column names given by my sqlColumns? To make it concrete, how do
we make a dataframe
A B C
1 2 3
2009 Jan 26
2
name scoping within dataframe index
Every time I have to prefix a dataframe column inside the indexing
brackets with the dataframe name, e.g.
df[df$colname==value,]
-- I am wondering, why isn't there an R scoping rule under which the name
search starts with the dataframe's own columns, as if we'd said
with(df, df[colname==value,])
-- wouldn't that be a reasonable default to prepend to the name search
path?
Cheers,
Alexy
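A minimal sketch of one existing way to get that scoping, assuming base R's subset(), which evaluates its condition with the dataframe's columns in scope. The dataframe df, its columns, and value below are made up for illustration:

```r
# Hypothetical example data; only the names df, colname, value come from the question
df <- data.frame(colname = c(1, 2, 2), other = c("a", "b", "c"))
value <- 2

# subset() looks up colname inside df first, then in the calling environment
picked <- subset(df, colname == value)

# The explicit form from the question, for comparison
explicit <- df[df$colname == value, ]

print(picked)
```

subset() is meant for interactive use; inside functions the explicit df$colname form is usually the safer choice.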
2009 Feb 24
2
growing dataframes with rbind
I'm growing a large dataframe by composing new rows and then doing
row <- compute.new.row.somehow(...)
d <- rbind(d,row)
Is this a fast/preferred way?
Cheers,
Alexy
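A hedged sketch of a common alternative, since calling rbind() in a loop copies the whole dataframe on every iteration: collect the rows in a list and bind them once at the end. The helper make.row is a hypothetical stand-in for compute.new.row.somehow(...):

```r
# Hypothetical stand-in for compute.new.row.somehow(...)
make.row <- function(i) data.frame(id = i, square = i^2)

# Preallocate a list, fill it, then bind all rows in a single call
rows <- vector("list", 5)
for (i in seq_along(rows)) rows[[i]] <- make.row(i)
d <- do.call(rbind, rows)

print(d)
```

Whether this is noticeably faster depends on the number and size of rows; for a handful of rows the original loop is fine.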
2007 Nov 23
2
printing levels as tuples
I'm running rle() on a long vector, and get a result which looks like
> uc
Run Length Encoding
lengths: int [1:16753] 1 1 1 1 1 1 1 1 1 1 ...
values : int [1:16753] 29462748 22596107 18322820 14323315
12684505 9909036 7296916 6857692 5884755 5883697 ...
I can print uc$lengths or uc$values separately. Is there any way to
print them together as tuples, looking like
(29462748, 1)
2008 Sep 09
2
splitting time vector into days
Greetings -- I have a dataframe a with one element a vector, time, of
POSIXct values. What's a good way to split the data frame into
periods of a$time, e.g. days, and apply a function, e.g. mean, to some
other column of the dataframe, e.g. a$value?
Cheers,
Alexy
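One possible approach, sketched under the assumption that calendar days in a fixed time zone (UTC here) are the periods wanted: derive a day label from a$time and group with tapply(). The sample data is invented for illustration:

```r
# Hypothetical sample data with the column names used in the question
a <- data.frame(
  time  = as.POSIXct(c("2008-09-01 08:00:00", "2008-09-01 20:00:00",
                       "2008-09-02 09:30:00"), tz = "UTC"),
  value = c(1, 3, 10)
)

# Label each row with its calendar day, then average value within each day
day <- format(a$time, "%Y-%m-%d", tz = "UTC")
daily.mean <- tapply(a$value, day, mean)

print(daily.mean)
```

split(a, day) gives the per-day dataframes themselves if more than one column is needed.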
2007 Nov 27
2
exporting a split list
Using wk <- with(d, split(word, kind)), I get the following class table:
wk$`1`
[1] "a" "bra" ... # (*)
wk$`10`
"ca" "dabra" ...
Now I need to export it in the following format:
class num_members examples
1 23 a bra ...
10 4 ca dabra
For each class C such as `1`, I need to print the
2008 Oct 02
4
namespaces
I'd like to control my namespace thoroughly, separated by task. Is
there a way, in an R session, to introduce namespaces for tasks
dynamically and switch them as needed? Or, is there a combination of
load/save workspace steps which can facilitate this?
Cheers,
Alexy
2007 Nov 21
2
uniq -c
Is there an R analog of the Unix command uniq -c:
http://en.wikipedia.org/wiki/Uniq
Given an array x, uniq -c replaces each contiguous subsequence of
identical numbers with a tuple (count, number). E.g.
$ cat > usample
10
10
9
8
8
7
7
7
6
3
1
1
1
0
$ uniq -c usample
2 10
1 9
2 8
3 7
1 6
1 3
3 1
1 0
Cheers,
Alexy
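A minimal sketch using base R's rle() (run-length encoding), which appears to be the closest analog; the vector x reproduces the usample file from the question:

```r
# Same values as the usample file above
x <- c(10, 10, 9, 8, 8, 7, 7, 7, 6, 3, 1, 1, 1, 0)

# rle() collapses each contiguous run into a (length, value) pair
r <- rle(x)

# Arrange the result as count/value columns, like the output of uniq -c
counts <- data.frame(count = r$lengths, value = r$values)
print(counts)
```

inverse.rle(r) reconstructs the original vector from the encoding.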
2008 Oct 29
2
Functional pattern-matching in R
I found there's a very good functional set of operations in R, such as
the apply family, Hadley Wickham's lovely plyr, etc. There's even a
Reduce (a.k.a. fold). Now I wonder how we can do pattern-matching?
E.g., now I split dimensions like this:
m <- dim(V)[1] # R
n <- dim(V)[2] # still R
While even Matlab allows for
[m,n] = size(V) % MATLAB!
Ideally I'd be able to
2011 Mar 23
1
rbind a heterogeneous row
I have a dataframe with many rows like this:
> df
X1 X2 X3 X4 X5 X6 X7 week d
sim1 FALSE TRUE TRUE TRUE TRUE TRUE TRUE 1 0.3064985
sim1 is the rowname, X1..X7,week,d are the column names. X1..X7 are factors, booleans in this case.
I need to add another row, represented by the following list:
list(rep(T,7),5,0.0)
-- i.e., TRUE in all boolean columns,
2007 Nov 21
3
shrink a dataframe for plotting
I get tables with millions of rows. For plotting to a screen-size
jpg, obviously just about 1000 points are enough. Instead of feeding
plot() the original millions of rows, I'd rather shrink the original
dataframe, using something like the following interpolation:
-- split dataframe into chunks of N rows each, e.g. 1000 rows each
-- compute average for each column
-- issue one new row
2008 Oct 09
1
R/OCaml?
Did anyone try to write R extensions in OCaml? What would it entail
to enable it?
Cheers,
Alexy
2008 Sep 05
1
dealing with NAs in time series
Certain timeseries I have had outliers, which I removed by assigning
NA to their positions. Now acf() refuses to go to work. What's the
right way to remove outliers from ts objects, and what are the
standard ways to interpolate NAs in them?
Cheers,
Alexy
2008 Sep 28
1
partitioning vectors of intervals
I have two pairs of time intervals: coarse- and fine-grained. They're
components of their respective dataframes, looking like,
coarse: endtime starttime
1 t1_end t1_start
2 t2_end t2_start
...
fine: is the same, except that its intervals presumably fall into the
coarse's enclosing ones.
The problem is to partition
2009 Feb 27
2
factors to integers preserving value in a dataframe
I want to produce a dataframe with integer columns for elements of
string pairs:
pairs <- c("10 21","23 45")
pairs.split <- lapply(pairs,function(x)strsplit(x," "))
pdf <- as.data.frame(pairs.split)
names(pdf) <- c("p","q")
-- at this point things look good, except the columns are factors, as
I didn't change the default
2007 Nov 07
3
R as a programming language
Greetings -- coming from a Python/Ruby perspective, I'm wondering about
certain features of R as a programming language.
Say I have a huge table t of the form
run ord unit words new
1 1 6939 1013 641
1 2 275 1001 518
1 3 3314 1008 488
1 4 14154 1018 463
1 5 2982 1006 421
Alternatively, it
2007 Nov 24
2
[:]
What are idioms for taking a head or a tail of a vector, either up to
an index, or from an index to the end? Also -- is it necessary to
use length(v) to refer to the last element? E.g., Python has
v[:3] # indices 0,1,2
v[3:] # indices 3,4,...
v[-1] # the last element of v
v[:-1] # all but last
Cheers,
Alexy
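A short sketch of the base R idioms that come closest, assuming head() and tail() with positive and negative counts; note that R indexes from 1, and the vector v is invented for illustration:

```r
v <- c(10, 20, 30, 40, 50)

first3   <- head(v, 3)    # first three elements, like Python v[:3]
rest     <- tail(v, -3)   # everything after the first three, like v[3:]
last     <- tail(v, 1)    # the last element, like v[-1]
but.last <- head(v, -1)   # all but the last element, like v[:-1]
```

Here v[length(v)] also gives the last element, so length(v) is not strictly required but remains the usual explicit form.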
2009 Feb 23
2
1.095e+09 for integers
I've had a very long file written out by R with write.table, with
fields of time values, converted from POSIXlt with as.numeric. Among 2.5
million values, very few had 6 trailing zeroes, and those were output
in scientific notation as in the subject. Is this the default
behavior for long integers, and how can it be turned off (with all
digits for any integer field in write.table)? This
2008 May 07
2
figure margins too large for a barplot in png, pdf ok
I used to have a script with a barplot command in it, preceded by a
png:
png(graph.file,height=H,width=W)
barplot(t,names.arg=breaks[2:(length(t)+1)],tck=gridlines)
-- it worked before R 2.6.2. When I tried it in R 2.6.2, which I have had
for a while but hadn't run with that script, it complained that the
margins were too large, and I've googled the messages from our list where
neither
2009 Feb 27
2
accessing and preserving list names in lapply
Sometimes I'm iterating over a list where names are keys into another
data structure, e.g. a related list. Then I can't use lapply as it
does [[]] and loses the name. Then I do something like this:
do.one <- function(ldf) { # list-dataframe item
key <- names(ldf)
meat <- ldf[[1]]
mydf <- some.df[[key]] # related data structure
r.df <-