Displaying 20 results from an estimated 40000 matches similar to: "help with "by" command"
2011 Aug 10
4
Clustering Large Applications..sort of
Hello all,
   I am using the clustering functions in R in order to work with large
masses of binary time series data, however the clustering functions do not
seem able to fit this size of practical problem. Library 'hclust' is good
(though it may be sub par for this size of problem, thus doubly poor for
this application) in that I do not want to make assumptions about the number
of
2010 Feb 08
7
data frames; matching/merging
Hi all,
    I'm feeling a little guilty to ask this question, since I've
written a solution using a rather clunky for loop that gets the job
done.  But I'm convinced there must be a faster (and probably more
elegant) way to accomplish what I'm looking to do (perhaps using the
"merge" function?).  I figured somebody out there might've already
figured this out:
I have
2010 Sep 27
1
bwplot superpose panel.points from another dataframe
Hi everybody,
I am using bwplot to produce panel boxplots with 3 dimensions, and
I want to add a mark on each boxplot representing one individual (on all its
dimensions).
So far I have not managed to get the desired result.
I also want to keep the median symbols as a line.
Many thanks for your help
christophe
here is the tested code:
########################
library(lattice)
ex <-
2010 Dec 15
3
Applying function to a TABLE and also "apply, tapply, sapply etc"
Dear R-help forum members,
Suppose I have a data-frame having two variables and single data for each of them, as described below.
variable_1           variable_2
        10                          20
I have written a function, say, 'fun' which uses input 10 and 20 and gives me desired result.
fun = function(X, Y)
         {
         X + Y              #( I am just giving an example of
2010 May 06
2
Data frame "pivoting"
Dear R experts,
I am trying to solve this problem, related to the possibility of
changing the shape of a data frame using a "pivoting-like" function.
I have a dataframe df of observations as follows:
ID		VALIDITY YEAR		PROPERTY	PROPERTY VALUE
A1		2007				P1		V1
A1		2007				P2		V2
A1		2007				P3		V3
A1		2008				P1		V10
A1		2008				P2		V20
A2		2007				P5		V50
A2		2008				P6		V20
A3		2007
2009 Apr 13
3
tapply output as a dataframe
i use tapply and by often, but i always end up banging my head against
the wall with the output.  
is there a simpler way to convert the output of the following tapply to
a dataframe or matrix than what i have here:
# setup data for tapply
dt = data.frame(bucket=rep(1:4,25),val=rnorm(100))
fn = function(x) { 
  ret =
c(unname(quantile(x,probs=seq(.25,.75,.25),na.rm=T)),mean(x,na.rm=T))
}
a =
2008 Oct 01
3
"tapply versus by" in function with more than 1 arguments
Hi. I searched the list and didn't find anything similar to this. I simplified my example like below:
#I need to calculate the correlation (for example) between 2 columns, classified by a third one, in a data.frame, like below:
#number of rows
nr = 10
#the third column is to enforce that I need correlation on two variables only
dataf =
2007 May 31
4
Aggregate to find majority level of a factor
I want to use the aggregate function to summarize data by a factor (my
field plots), but I want the summary to be the majority level of another
factor.
 
For example, given the dataframe:
Plot1     big
Plot1     big
Plot1     small
Plot2     big
Plot2     small    
Plot2     small
Plot3     small
Plot3     small
Plot3     small
My desired result would be:
Plot1 big
Plot2 small
Plot3 small
I
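One possible way to get the majority level per plot, sketched here under assumptions: the column names `plot` and `size` are hypothetical (the post does not name them), and ties are resolved by taking the first level in `table()` order.

```r
# Hypothetical column names; the data values are the ones listed in the post.
plots <- data.frame(
  plot = rep(c("Plot1", "Plot2", "Plot3"), each = 3),
  size = c("big", "big", "small",
           "big", "small", "small",
           "small", "small", "small")
)

# Most frequent value in x; on a tie, which.max() keeps the first one.
majority <- function(x) names(which.max(table(x)))

# One row per plot, holding the majority level of 'size'.
result <- aggregate(plots["size"], by = list(plot = plots$plot), FUN = majority)
print(result)
```

With the data above this gives big for Plot1 and small for Plot2 and Plot3, which matches the desired result in the post.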
2006 Feb 24
3
Summarize by two-column factor, retaining original factors
I am having trouble doing the following.  I have a data.frame like
this, where x and y are variables that I want to do calculations on:
Name Year x y
ab   2001  15 3
ab   2001  10 2
ab   2002  12 8
ab   2003  7 10
dv   2002  10 15
dv   2002  3 2
dv   2003  1 15
Before I do all the other things I need to do with this data, I need
to summarize or collapse the data by name and year.  I've
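A minimal sketch of collapsing by Name and Year with `aggregate()`. The post is cut off before it says which summary is wanted, so `sum` here is only an assumption; any other function (for example `mean`) could be swapped in.

```r
# Data as listed in the post.
d <- data.frame(
  Name = c("ab", "ab", "ab", "ab", "dv", "dv", "dv"),
  Year = c(2001, 2001, 2002, 2003, 2002, 2002, 2003),
  x    = c(15, 10, 12, 7, 10, 3, 1),
  y    = c(3, 2, 8, 10, 15, 2, 15)
)

# One row per Name/Year combination; Name and Year are kept as columns.
# FUN = sum is an assumption, not something stated in the post.
collapsed <- aggregate(cbind(x, y) ~ Name + Year, data = d, FUN = sum)
print(collapsed)
```

The grouping columns stay in the result as ordinary columns, so the collapsed data frame can be used directly in later steps.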
2007 Apr 24
2
problem in tapply command
hello
when I entered the following command, I got NA values for some categories.
 > tapply(slp_jeo2$slp,slp_jeo2$jeo,mean )
              999       Ca      Cka      DCy       Jh      JKi       Kk
14.06665       NA 14.60445       NA       NA       NA       NA       NA
     KTa     KTac       Ku      Kua      Kus       Ky      Kyk      ODe
      NA       NA       NA       NA       NA       NA      
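One common cause of NA group means is a missing value inside the group. The post is truncated, so this may not be the cause here; the sketch below uses made-up data (`slp_demo` is hypothetical) only to show the effect of `na.rm = TRUE`.

```r
# Hypothetical data; the real slp_jeo2 is not shown in the post.
slp_demo <- data.frame(
  jeo = c("Ca", "Ca", "Jh", "Jh", "Kk"),
  slp = c(14.2, 15.0, NA, 9.5, 11.0)
)

# A single NA in a group makes mean() return NA for that group.
with_na <- tapply(slp_demo$slp, slp_demo$jeo, mean)

# Extra arguments to tapply() are passed on to FUN, so na.rm reaches mean().
without_na <- tapply(slp_demo$slp, slp_demo$jeo, mean, na.rm = TRUE)

print(with_na)
print(without_na)
```

If the NAs remain even with `na.rm = TRUE`, the groups may simply contain no observations (for example, unused factor levels).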
2011 Feb 02
2
Finding the maximum in a particular group in a dataframe
Hello
I am trying to find a way to find the max value, for only a subset of a
dataframe, depending on how the data is grouped for example,
How would I find the maximum response, for all the GPR119a conditions below:
I've tried tapply
> tapply(GPR119data$responce, GPR119data$GPR119a, max)
Error in tapply(GPR119data$responce, GPR119data$GPR119a, max) : 
  arguments must have same length
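The error says the two vectors passed to `tapply()` differ in length. One possible cause (an assumption, since the data is not shown) is a misspelled column name: `$` returns NULL for a column that does not exist. A small sketch with made-up data (`GPR119demo` is hypothetical):

```r
# Hypothetical data; column names copied from the call in the post.
GPR119demo <- data.frame(
  GPR119a  = c("low", "low", "high", "high"),
  responce = c(1.2, 3.4, 2.2, 5.6)
)

# Check that both columns exist and have the same length before calling tapply().
stopifnot(length(GPR119demo$responce) == length(GPR119demo$GPR119a))

# Maximum of 'responce' within each level of 'GPR119a'.
group_max <- tapply(GPR119demo$responce, GPR119demo$GPR119a, max)
print(group_max)

# A name that is not a column gives NULL, i.e. length 0.
print(length(GPR119demo$no_such_column))
```

Running `names(GPR119data)` and comparing the two lengths would show whether this is what happens with the real data.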
2002 Sep 29
1
Running R programs from a command line
Hello, I am a new R user, using version 1.5.1.  I am
attempting to run R programs from a dos command line
(in win2000) and am having problems. My goal is to
be able to use R from batch scripts in both windows
and also in Linux eventually later.
 
When I first ran "rcmd BATCH --help" it said that
perl was not found.  After installing perl and
running again, I received the following error:
2015 Apr 23
2
Sample Docker images for Asterisk available
Hello all,
I created a set of Docker images running Asterisk and exposing AMI /
ARI ports that i found to be quite useful for ARI / AMI development
and regression.
As they are based on Docker with whaleware, adding new configuration
files to roll your own dialplan / queues / voicemail etc is pretty
easy. And you can run quite a lot on the same box to simulate
clusters.
There is no SIP / RTP
2005 Sep 30
4
by() processing on a dataframe
I want to calculate a statistic on a number of subgroups of a dataframe, 
then put the results into a dataframe.  (What SAS PROC MEANS does, I 
think, though it's been years since I used it.)
This is possible using by(), but it seems cumbersome and fragile.  Is 
there a more straightforward way than this?
Here's a simple example showing my current strategy:
 > dataset <-
2005 Jan 27
4
self-written function
Dear all,
I've got a simple self-written function to calculate the mean + s.e. 
from arcsine-transformed data:
backsin<-function(x,y,...){
 backtransf<-list()
 backtransf$back<-((sin(x[x!="NA"]))^2)*100
 backtransf$mback<-tapply(backtransf$back,y[x!="NA"],mean)
2012 Feb 13
3
Change dataframe-structure
An embedded text with an undefined character set was scrubbed.
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20120213/d2a5afa6/attachment.pl>
2004 Aug 03
2
attach data from tapply to dataframe
I am working with a longitudinal data set in the long format. This data
set has three observations per grade level per year. Here are the first
10 rows of the data frame:
 
>tenn.dat[1:10,]
 
year  schid type grade gain  se new cohort
6  2001 100005    5     4 33.1 3.5   4      3
7  2002 100005    5     4 33.9 3.9   4      2
8  2003 100005    5     4 32.3 4.2   4      1
10 2001 100005  
2005 Jun 20
6
tapply
hi,
i have another question on tapply:
i have a dataset z like this:
5540 389100307391      2600
5541 389100307391      2600
5542 389100307391      2600
5543 389100307391      2600
5544 389100307391      2600
5546 381300302513        NA
5547 387000307470        NA
5548 387000307470        NA
5549 387000307470        NA
5550 387000307470        NA
5551 387000307470        NA
5552 387000307470      
2011 Feb 03
2
tapply output as a dataframe
On Mon, Apr 13, 2009 at 12:41 PM, Dan Dube <ddube-at-advisen.com> wrote:
> i use tapply and by often, but i always end up banging my head against
> the wall with the output.
The proposed solution of Dan's problem posted on R-help was: 
> do.call(rbind,a)
When I use this 'solution' I get 'ERROR:  second argument must be a list'.  So head on wall continues.
My
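A sketch of one way to avoid the "second argument must be a list" error: build the per-group results with `split()` and `lapply()`, which always return a list, and only then bind them. The data and `fn` follow the 2009 post; the column labels and `set.seed()` are additions for illustration.

```r
set.seed(1)  # added only to make the example reproducible
dt <- data.frame(bucket = rep(1:4, 25), val = rnorm(100))

# Quartiles plus mean, as in the original post.
fn <- function(x) {
  c(unname(quantile(x, probs = seq(0.25, 0.75, 0.25), na.rm = TRUE)),
    mean(x, na.rm = TRUE))
}

# split() gives a list of vectors, lapply() a list of results,
# so do.call() always receives a list as its second argument.
pieces <- lapply(split(dt$val, dt$bucket), fn)
res <- as.data.frame(do.call(rbind, pieces))
names(res) <- c("q25", "q50", "q75", "mean")  # labels chosen for illustration
print(res)
```

The error can appear when the object passed to `do.call()` is a plain vector, for example when the summary function returns a single number per group and `tapply()` simplifies its result.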
2010 Apr 16
3
problem with FUN in Hmisc::summarize
Hi all,
I'd like to use the Hmisc::summarize function, but it uses a function (FUN)
of a single vector argument to create the statistical summaries.
Consider an easy case: I'd like to compute the correlation between two
variables in my dataframe, grouped according to other variables in the same
dataframe.
For example, consider the following dataframe D:
V1  V2   V3
A     1    -1
A     1