Displaying 20 results from an estimated 40000 matches similar to: "help with "by" command"
2011 Aug 10
4
Clustering Large Applications..sort of
Hello all,
I am using the clustering functions in R in order to work with large
masses of binary time series data, however the clustering functions do not
seem able to fit this size of practical problem. Library 'hclust' is good
(though it may be sub par for this size of problem, thus doubly poor for
this application) in that I do not want to make assumptions about the number
of
2010 Feb 08
7
data frames; matching/merging
Hi all,
I feel a little guilty asking this question, since I've
written a solution using a rather clunky for loop that gets the job
done. But I'm convinced there must be a faster (and probably more
elegant) way to accomplish what I'm looking to do (perhaps using the
"merge" function?). I figured somebody out there might've already
figured this out:
I have
2010 Sep 27
1
bwplot superpose panel.points from another dataframe
Hi everybody,
I am using bwplot to produce panel boxplots with 3 dimensions.
I want to add a mark on each boxplot representing one individual (on all its
dimensions).
So far, I have not managed to get the desired result.
I also want to keep the median symbols as a line.
Many thanks for your help
christophe
here is the tested code:
########################
library(lattice)
ex <-
2010 Dec 15
3
Applying function to a TABLE and also "apply, tapply, sapply etc"
Dear R-help forum members,
Suppose I have a data frame with two variables and a single value for each of them, as described below.
variable_1 variable_2
10 20
I have written a function, say, 'fun', which takes the inputs 10 and 20 and gives me the desired result.
fun = function(X, Y)
{
X + Y #( I am just giving an example of
2010 May 06
2
Data frame "pivoting"
Dear R experts,
I am trying to solve this problem, related to the possibility of
changing the shape of a data frame using a "pivoting-like" function.
I have a dataframe df of observations as follows:
ID  VALIDITY YEAR  PROPERTY  PROPERTY VALUE
A1  2007           P1        V1
A1  2007           P2        V2
A1  2007           P3        V3
A1  2008           P1        V10
A1  2008           P2        V20
A2  2007           P5        V50
A2  2008           P6        V20
A3 2007
2009 Apr 13
3
tapply output as a dataframe
i use tapply and by often, but i always end up banging my head against
the wall with the output.
is there a simpler way to convert the output of the following tapply to
a dataframe or matrix than what i have here:
# setup data for tapply
dt = data.frame(bucket=rep(1:4,25),val=rnorm(100))
fn = function(x) {
ret =
c(unname(quantile(x,probs=seq(.25,.75,.25),na.rm=T)),mean(x,na.rm=T))
}
a =
2008 Oct 01
3
"tapply versus by" in function with more than 1 arguments
Hi. I searched the list and didn't find anything similar to this. I simplified my example as below:
#I need to calculate the correlation (for example) between 2 columns, grouped by a third one, in a data.frame, like below:
#number of rows
nr = 10
#the third column is to enforce that I need correlation on two variables only
dataf =
2007 May 31
4
Aggregate to find majority level of a factor
I want to use the aggregate function to summarize data by a factor (my
field plots), but I want the summary to be the majority level of another
factor.
For example, given the dataframe:
Plot1 big
Plot1 big
Plot1 small
Plot2 big
Plot2 small
Plot2 small
Plot3 small
Plot3 small
Plot3 small
My desired result would be:
Plot1 big
Plot2 small
Plot3 small
I
2006 Feb 24
3
Summarize by two-column factor, retaining original factors
I am having trouble doing the following. I have a data.frame like
this, where x and y are variables that I want to do calculations on:
Name Year x y
ab 2001 15 3
ab 2001 10 2
ab 2002 12 8
ab 2003 7 10
dv 2002 10 15
dv 2002 3 2
dv 2003 1 15
Before I do all the other things I need to do with this data, I need
to summarize or collapse the data by name and year. I've
2007 Apr 24
2
problem in tapply command
hello
When I entered the following command, I got NA values for some categories.
> tapply(slp_jeo2$slp,slp_jeo2$jeo,mean )
999 Ca Cka DCy Jh JKi Kk
14.06665 NA 14.60445 NA NA NA NA NA
KTa KTac Ku Kua Kus Ky Kyk ODe
NA NA NA NA NA NA
2011 Feb 02
2
Finding the maximum in a particular group in a dataframe
Hello
I am trying to find the max value for only a subset of a
dataframe, depending on how the data is grouped. For example,
how would I find the maximum response for all the GPR119a conditions below:
I've tried tapply
> tapply(GPR119data$responce, GPR119data$GPR119a, max)
Error in tapply(GPR119data$responce, GPR119data$GPR119a, max) :
arguments must have same length
2002 Sep 29
1
Running R programs from a command line
Hello, I am a new R user, using version 1.5.1. I am
attempting to run R programs from a dos command line
(in win2000) and am having problems. My goal is to
be able to use R from batch scripts in both windows
and also in Linux eventually later.
When I first ran "rcmd BATCH --help" it said that
perl was not found. After installing perl and
running it again, I received the following error:
2015 Apr 23
2
Sample Docker images for Asterisk available
Hello all,
I created a set of Docker images running Asterisk and exposing AMI /
ARI ports that I found to be quite useful for ARI / AMI development
and regression.
As they are based on Docker with whaleware, adding new configuration
files to roll your own dialplan / queues / voicemail etc is pretty
easy. And you can run quite a lot on the same box to simulate
clusters.
There is no SIP / RTP
2005 Sep 30
4
by() processing on a dataframe
I want to calculate a statistic on a number of subgroups of a dataframe,
then put the results into a dataframe. (What SAS PROC MEANS does, I
think, though it's been years since I used it.)
This is possible using by(), but it seems cumbersome and fragile. Is
there a more straightforward way than this?
Here's a simple example showing my current strategy:
> dataset <-
2005 Jan 27
4
self-written function
Dear all,
I've got a simple self-written function to calculate the mean + s.e.
from arcsine-transformed data:
backsin<-function(x,y,...){
backtransf<-list()
backtransf$back<-((sin(x[x!="NA"]))^2)*100
backtransf$mback<-tapply(backtransf$back,y[x!="NA"],mean)
2012 Feb 13
3
Change dataframe-structure
An embedded text with an unspecified character set was scrubbed.
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20120213/d2a5afa6/attachment.pl>
2004 Aug 03
2
attach data from tapply to dataframe
I am working with a longitudinal data set in the long format. This data
set has three observations per grade level per year. Here are the first
10 rows of the data frame:
>tenn.dat[1:10,]
year schid type grade gain se new cohort
6 2001 100005 5 4 33.1 3.5 4 3
7 2002 100005 5 4 33.9 3.9 4 2
8 2003 100005 5 4 32.3 4.2 4 1
10 2001 100005
2005 Jun 20
6
tapply
hi,
i have another question on tapply:
i have a dataset z like this:
5540 389100307391 2600
5541 389100307391 2600
5542 389100307391 2600
5543 389100307391 2600
5544 389100307391 2600
5546 381300302513 NA
5547 387000307470 NA
5548 387000307470 NA
5549 387000307470 NA
5550 387000307470 NA
5551 387000307470 NA
5552 387000307470
2011 Feb 03
2
tapply output as a dataframe
On Mon, Apr 13, 2009 at 12:41 PM, Dan Dube <ddube-at-advisen.com> wrote:
> i use tapply and by often, but i always end up banging my head against
> the wall with the output.
The proposed solution of Dan's problem posted on R-help was:
> do.call(rbind,a)
When I use this 'solution' I get 'ERROR: second argument must be a list'. So head on wall continues.
My
2010 Apr 16
3
problem with FUN in Hmisc::summarize
Hi all,
I'd like to use the Hmisc::summarize function, but it uses a function (FUN)
of a single vector argument to create the statistical summaries.
Consider an easy case: I'd like to compute the correlation between two
variables in my dataframe, grouped according to other variables in the same
dataframe.
For example, consider the following dataframe D:
V1 V2 V3
A 1 -1
A 1