Displaying 20 results from an estimated 30000 matches similar to: "Reading name-value data"
2011 Oct 23
2
Summary stats in table
Suppose I have data like this:
A <- sample(letters[1:3], 1000, replace=TRUE)
B <- sample(LETTERS[1:2], 1000, replace=TRUE)
x <- rnorm(1000)
I can get a table of means via
tapply(x, list(A, B), mean)
and I can add the marginal means to this using cbind/rbind:
main <- tapply(x, list(A,B), mean)
Amargin <- tapply(x, list(A), mean)
Bmargin <- tapply(x, list(B), mean)
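The snippet stops before the cbind/rbind step. One possible way to finish it is sketched below; the `set.seed()` call, the name `means_table`, and the "All" labels are assumptions added for illustration, not part of the original post.

```r
set.seed(1)  # assumption: seed added only so the sketch is reproducible
A <- sample(letters[1:3], 1000, replace = TRUE)
B <- sample(LETTERS[1:2], 1000, replace = TRUE)
x <- rnorm(1000)

main    <- tapply(x, list(A, B), mean)  # 3 x 2 table of cell means
Amargin <- tapply(x, list(A), mean)     # marginal means over A (one per row)
Bmargin <- tapply(x, list(B), mean)     # marginal means over B (one per column)

# Attach the A margins as an extra column, then the B margins plus the
# grand mean as an extra row. "All" is a hypothetical label.
means_table <- rbind(cbind(main, All = Amargin),
                     All = c(Bmargin, mean(x)))
means_table
```

The result is a 4 x 3 matrix whose last row and column hold the margins and whose bottom-right cell is the overall mean.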
2008 Nov 29
2
Using grep() to subset lines of text
I have two vectors, a and b. b is a text file. I want to find in b those
elements of a which occur at the beginning of the line in b. I have the
following code, but it only returns a value for the first value in a, but I
want both. Any ideas please.
a = c(2,3)
b = NULL
b[1] = "aaa 2 aaa"
b[2] = "2 aaa"
b[3] = "3 aaa"
b[4] = "aaa 3 aaa"
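The poster's own grep() call is not shown, so the following is only a sketch of one way to match every element of a at the start of a line: build a single anchored pattern with all values as alternatives. It assumes each value is followed by a space, as in the sample data; `pattern` and `hits` are hypothetical names.

```r
a <- c(2, 3)
b <- c("aaa 2 aaa", "2 aaa", "3 aaa", "aaa 3 aaa")

# Build "^(2|3) ": anchored at the line start, any element of `a`,
# then a space (assumption based on the sample lines).
pattern <- paste0("^(", paste(a, collapse = "|"), ") ")

hits <- grep(pattern, b, value = TRUE)
hits  # "2 aaa" "3 aaa"
```

Passing a vector of patterns to grep() uses only the first one, which would explain getting a result for just the first value in a.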
2011 Apr 11
0
plyr: version 1.5
# plyr
plyr is a set of tools for a common set of problems: you need to
__split__ up a big data structure into homogeneous pieces, __apply__ a
function to each piece and then __combine__ all the results back
together. For example, you might want to:
* fit the same model to each patient subset of a data frame
* quickly calculate summary statistics for each group
* perform group-wise
2008 Nov 10
1
Preparing data for display
I have a dataset of about 10^6 rows, each consisting of a timestamp,
several factors, a string, some integers, and some floats.
I'd like to graph this data in various ways, including straightforward
ones (how many events per week over the past year for each of 4 values
of some factor), some less straightforward. I've managed to do this
by brute force, but I'd like to learn how to do
2009 Feb 17
2
cumsum vs. sum
I recently traced a bug of mine to the fact that cumsum(s)[length(s)]
is not always exactly equal to sum(s).
For example,
x<-1/(12:14)
sum(x) - cumsum(x)[3] => 2.8e-17
Floating-point addition is of course not exact, and in particular is
not associative, so there are various possible reasons for this.
Perhaps sum uses clever summing tricks to get more accurate results?
In some
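Whatever the cause of the discrepancy, a common defensive pattern is to compare such results within a tolerance rather than exactly. A small sketch using the same x (the names `s` and `cs` are illustrative):

```r
x <- 1 / (12:14)

s  <- sum(x)
cs <- cumsum(x)[length(x)]

# Exact comparison: may be TRUE or FALSE depending on platform and R version.
exact_equal <- (s == cs)

# Tolerance-based comparison: treats tiny rounding differences as equal.
close_enough <- isTRUE(all.equal(s, cs))
close_enough
```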
2009 Jul 29
3
Object equality for S4 objects
To test two environments for object equality (Lisp EQ), I can use 'identical':
> e1 <- environment(local(function()x))
> e2 <- environment(local(function()x))
> identical(e1,e2) # compares object identity
[1] FALSE
> identical(as.list(e1),as.list(e2)) # compares values as name->value mapping
[1] TRUE # (is there a
2009 Mar 09
3
Errors in recursive default argument references
Tested in: R version 2.8.1 (2008-12-22) / Windows
Recursive default argument references normally give nice clear errors.
In the first set of examples, you get the error:
Error in ... :
promise already under evaluation: recursive default argument
reference or earlier problems?
(function(a = a) a ) ()
(function(a = a) c(a) ) ()
(function(a = a) a[1] ) ()
(function(a = a)
2009 May 20
2
Class for time of day?
What is the recommended class for time of day (independent of calendar
date)?
And what is the recommended way to get the time of day from a POSIXct
object? (Not a string representation, but a computable representation.)
I have looked in the man page for DateTimeClasses, in the Time Series
Analysis Task View and in Spector's Data Manipulation book but haven't found
these. Clearly I can
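One computable representation (not necessarily the recommended one) is the number of seconds since midnight, taken from the POSIXlt components and optionally wrapped in a difftime. The timestamp and variable names below are hypothetical, and UTC is assumed to sidestep daylight-saving effects.

```r
# Hypothetical timestamp; tz = "UTC" is an assumption to keep the example simple.
stamp <- as.POSIXct("2009-05-20 14:30:15", tz = "UTC")

lt <- as.POSIXlt(stamp)  # broken-down time in the timestamp's own time zone
secs_since_midnight <- lt$hour * 3600 + lt$min * 60 + lt$sec

# Optional: carry the units along by storing it as a difftime.
tod <- as.difftime(secs_since_midnight, units = "secs")

secs_since_midnight  # 52215
```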
2009 Feb 10
1
Variable/function namespaces WAS: Bug in subsetting data frame (PR#13515)
On Tue, Feb 10, 2009 at 10:11 AM, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
> Stavros Macrakis wrote:
>> On Tue, Feb 10, 2009 at 8:31 AM, Duncan Murdoch <murdoch at stats.uwo.ca>wrote:
>>> The evaluator recognizes the context of usage and will get the
>>> function for a function call....
>> Can you point me to chapter and verse in the language
2010 Jun 29
2
transposing a data frame from horizontal to vertical (stacking)
Hello, everyone!
I have a very simple task - I have a data frame (see MyData below) and
I need to stack the data (see result below).
I wrote the syntax below - it's very basic and it does what I need.
But I am sure what I am trying to do is a very typical task and there
must be a much shorter/more elegant way of doing it.
Any advice?
Thank you very much!
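The poster's MyData is not included in this excerpt, so the sketch below uses a made-up data frame with an id column and two measurement columns to show how base R's stack() might do this kind of reshaping.

```r
# Hypothetical stand-in for the poster's MyData (the real one is not shown).
MyData <- data.frame(id = 1:3,
                     x1 = c(10, 20, 30),
                     x2 = c(11, 21, 31))

# stack() puts the selected columns into one `values` column and records
# the source column name in `ind`; the id column is repeated to match.
stacked <- data.frame(id = rep(MyData$id, times = 2),
                      stack(MyData[, c("x1", "x2")]))
stacked
```

For wider layouts with several id columns, reshape() in base R covers the same ground with more options.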
2009 Dec 18
2
Vectorized switch
What is the 'idiomatic' way of writing a vectorized switch statement?
That is, I would like to write, e.g.,
vswitch( c('a','x','b','a'),
a= 1:4,
b=11:14,
100 )
=> c(1, 100, 13, 4 )
equivalent to
ifelse( c('a','x','b','a') ==
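A minimal sketch of what such a vswitch could look like, not a claim about the idiomatic answer given in the thread. It assumes named arguments are the cases, a single unnamed argument is the default, and every case is recycled to the length of the key vector.

```r
# Hypothetical helper matching the call shape in the question.
vswitch <- function(keys, ...) {
  cases <- list(...)
  nms <- names(cases)
  if (is.null(nms)) nms <- rep("", length(cases))
  n <- length(keys)

  # The unnamed argument (at most one) supplies the default values.
  default <- cases[nms == ""]
  stopifnot(length(default) <= 1)
  out <- if (length(default) == 1) rep_len(default[[1]], n) else rep(NA, n)

  # For each named case, overwrite the positions whose key matches,
  # taking the value at the same position in the recycled case vector.
  for (nm in nms[nms != ""]) {
    hit <- !is.na(keys) & keys == nm
    out[hit] <- rep_len(cases[[nm]], n)[hit]
  }
  out
}

vswitch(c('a', 'x', 'b', 'a'),
        a = 1:4,
        b = 11:14,
        100)
# 1 100 13 4
```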
2011 Oct 19
2
Speed difference between df$a[1] and df[1,"a"]
I was surprised to find that df$a[1] is an order of magnitude faster than
df[1,"a"]:
> df <- data.frame(a=1:10)
> system.time(replicate(100000, df$a[3]))
user system elapsed
0.36 0.00 0.36
> system.time(replicate(100000, df[3,"a"]))
user system elapsed
4.09 0.00 4.09
A priori, I'd have thought that combining the row and column
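To repeat the comparison locally, something like the following sketch could be used. It also adds the `[[` form as a third variant (an addition, not from the post). No timings are quoted because they depend on the machine and R version.

```r
df <- data.frame(a = 1:10)

# Three ways to read the same cell; each returns the integer 3.
by_dollar <- df$a[3]
by_matrix <- df[3, "a"]
by_list   <- df[["a"]][3]

# Time each form over n repetitions; compare the elapsed values yourself.
n <- 10000
timings <- c(
  dollar = system.time(for (i in seq_len(n)) df$a[3])[["elapsed"]],
  matrix = system.time(for (i in seq_len(n)) df[3, "a"])[["elapsed"]],
  list   = system.time(for (i in seq_len(n)) df[["a"]][3])[["elapsed"]]
)
timings
```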
2009 Apr 20
2
The assign(paste(...,i),...) idiom
Judging from the traffic on this mailing list, a lot of R beginners
are trying to write things like
assign( paste( "myvar", i), ...)
where they really should probably be writing
myvar[i] <- ...
Do we have any idea where this bizarre habit comes from?
-s
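For readers unsure what the indexed alternative looks like in practice, here is a short sketch; the values stored are arbitrary and the name `results` is hypothetical.

```r
# Instead of creating myvar1, myvar2, ... via assign(paste("myvar", i), ...),
# keep everything in one container and index into it.
myvar <- vector("list", 3)
for (i in 1:3) {
  myvar[[i]] <- i^2
}

# The same idea with names rather than positions.
results <- list()
for (nm in c("alpha", "beta")) {
  results[[nm]] <- nchar(nm)
}

myvar[[2]]     # 4
results$alpha  # 5
```

A list works here even when the elements have different types or lengths; a plain vector with `myvar[i] <- ...` is enough when every element is a single value of the same type.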
2009 May 27
1
R Books listing on R-Project
I was wondering what the criteria were for including books on the Books
Related to R page <http://www.r-project.org/doc/bib/R-books.html>. (There is
no maintainer listed on this page.)
In particular, I was wondering why the following two books are not listed:
* Andrew Gelman, Jennifer Hill, *Data Analysis Using Regression and
Multilevel/Hierarchical Models*. (CRAN package 'arm')
*
2008 Dec 08
4
R and Scheme
I've read in many places that R semantics are based on Scheme semantics. As
a long-time Lisp user and implementor, I've tried to make this more precise,
and this is what I've found so far. I've excluded trivial things that
aren't basic semantic issues: support for arbitrary-precision integers;
subscripting; general style; etc. I would appreciate corrections or
additions from
2009 Apr 01
2
Definition of = vs. <-
NOTA BENE: This email is about `=`, the assignment operator (e.g. {a=1}
which is equivalent to { `=`(a,1) } ), not `=` the named-argument syntax
(e.g. f(a=1), which is equivalent to
eval(structure(quote(f(1)),names=c('','a'))).
As far as I can tell from the documentation, assignment with = is precisely
equivalent to assignment with <-. Yet they call different primitives:
>
2011 Apr 04
2
General binary search?
Is there a generic binary search routine in a standard library which
a) works for character vectors
b) runs in O(log(N)) time?
I'm aware of findInterval(x,vec), but it is restricted to numeric vectors.
I'm also aware of various hashing solutions (e.g. new.env(hash=TRUE) and
fastmatch), but I need the greatest-lower-bound match in my application.
findInterval is also slow for
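In the absence of a library routine, a greatest-lower-bound binary search can be written directly in R. The sketch below is a hypothetical helper (`glb_search` is not from any package) for a single key; it assumes the vector is already sorted under the same collation that `<=` uses in the current session, for example via sort().

```r
# Hypothetical helper: index of the greatest element of sorted `vec` that is
# <= `key`, or 0 if every element is greater than `key`.
glb_search <- function(key, vec) {
  lo <- 0L                # everything at or below index lo is <= key
  hi <- length(vec) + 1L  # everything at or above index hi is > key
  while (hi - lo > 1L) {
    mid <- (lo + hi) %/% 2L
    if (vec[mid] <= key) lo <- mid else hi <- mid
  }
  lo
}

vec <- sort(c("apple", "banana", "cherry", "grape"))
glb_search("blueberry", vec)  # 2 ("banana")
glb_search("aardvark", vec)   # 0 (nothing is <= the key)
```

Each call makes O(log N) comparisons, though for many keys the per-call interpreter overhead may matter more than the comparison count.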
2009 Jan 21
1
Handling of factors
I'm rather confused by the semantics of factors.
When applied to factors, some functions (whose results are elements of
the original factor argument) return results of class factor, some
return integer vectors, some return character vectors, some give
errors. I understand some but not all of this. Consider:
Preserve factors: `[`, `[[`, sort, unique, subset, head, tapply, rep, rev, by,