Displaying 20 results from an estimated 40000 matches similar to: ""sequeeze" a data frame"
2008 Oct 15
1
combining same-day lab measurements with 'apply'
Another request for help implementing the 'apply' functions to avoid a
loop structure...
I am working with a data set that includes lab measurements taken at
different dates for the subjects, with some subjects having more
results than others. I would like to average lab results for each
subject that were taken on the same day. I can do this using a for
loop, but would like to know how
2005 Jan 25
2
tapply and names
I have a data frame containing children, with variables 'year' = birth
year, and 'm.id' = mother's id number. Let's assume that all the births of
each mother are represented in the data frame.
Now I want to create a subset of this data frame containing all children,
whose mother's first birth was in the year 1816 or later. This seems to
work:
mid <-
2011 Apr 03
1
zoo:rollapply by multiple grouping factors
# Hi there,
# I am trying to apply a function over a moving-window for a large
number of multivariate time-series that are grouped in a nested set of
factors. I have spent a few days searching for solutions with no luck,
so any suggestions are much appreciated.
# The data I have are for the abundance dynamics of multiple species
observed in multiple fixed plots at multiple sites. (In total I
2004 Aug 03
2
attach data from tapply to dataframe
I am working with a longitudinal data set in the long format. This data
set has three observations per grade level per year. Here are the first
10 rows of the data frame:
>tenn.dat[1:10,]
   year  schid type grade gain  se new cohort
6  2001 100005    5     4 33.1 3.5   4      3
7  2002 100005    5     4 33.9 3.9   4      2
8  2003 100005    5     4 32.3 4.2   4      1
10 2001 100005
2008 May 02
2
Coercing by/tapply to data.frame for more than two indices?
Dear Colleagues,
Apologies for a long email to ask what I feel may be a very simple
question; I figure it's better to overspecify my situation.
I was asked a question, recently, by a colleague in my department
about pre-aggregating variables, i.e., computing the mean of defined subsets
of a data frame. Naturally, I thought of the 'by' and 'tapply' functions, as
2005 May 25
2
weighted.mean and tapply (again)
I read answers to questions including the words "tapply" and
"weighted.mean", but I didn't understand either the problem (data) or the
solution provided.
Here is my question ...
> dat[1:10,]
  GROUP VALUE FREQUENCY
1     2     2        78
2     2     3        40
3     2     4        16
4     2     5         3
5     2     6         1
6     2     8         1
7
2007 Jan 26
1
plotting results from tapply
Hi, there
I'm trying to plot what is returned from a call to tapply, and can't figure
out how to do it. My guess is that it has something to do with the
inclusion of row names when you ask for the values you're interested in,
but if anyone has any ideas on how to get it to work, that would be
stellar. Here's some example code:
y1<-rnorm(40, 2)
x1<-rep(1:2, each=20)
2012 Oct 18
1
mean value calculation
Dear all,
I want to calculate mean values for multiple rows:
structure(list(Name = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L), .Label = c("AKT", "CKT"), class = "factor"), val1 = c(2,
3, 2, 2, 2, 5, 3, 8, 2), val2. = c(4, 5, 4, 8, 4, 8, 4, 7, 4),
val3 = c(5, 6, 5, 9, 5, 9, 5, 9, 5)), .Names = c("Name",
"val1", "val2.",
2007 Sep 27
1
windows device transparency issue
I read in a thread in r-help today that the windows device in 2.6 supports
transparency, so I tried an example and had some issues. The density plots
should be filled with transparent color in the following example (similar to
the points), however the color is "fully" transparent. This works in the
Cairo device, but not in the windows device.
Thanks,
--Matt
Matt Austin
2013 Jan 24
4
sorting/grouping/classification problem?
Hi,
I'm a database admin for a database which manages chromatographic results of products during stability studies.
I use R for the reporting of the results in MS Word through R2wd.
But now I think I need your help:
suppose we have the following data frame:
ID  rrt   Mnd  Result
1   0.45  0    0.10
1   0.48  0    0.30
1   1.24  0    0.50
2   0.45  3    0.20
2   0.48  3    0.60
2   1.22  3    0.40
3
2010 Jun 08
2
how to ignore rows missing arguments of a function when creating a function?
Hi,
I am relatively new to R; when creating functions, I run into problems with
missing values. I would like my functions to ignore rows with missing values
(for arguments of my function) in the analysis (as for example is the case in
STATA). Note that I don't want my function to drop rows if there are missing
arguments elsewhere in a row, i.e. for variables that are not arguments of my
2011 Mar 19
2
persuade tabulate function to count NAs in a data frame
Hi,
I'd like to ask you a question again. It is basically about data frames, NAs and the tabulate function.
I have this data frame. I already used this in one of the previous questions of mine. It intentionally looks this simple, my real 'df' dataframe is much bigger actually and again, I am not willing to annoy anyone with huge databases... So, my database:
id
2007 Dec 07
2
Same regression per sub-group: apply?
Dear helpers,
I've come up with what is probably a simple problem, but I cannot
find the solution. I have a data-set containing survey-data from
several countries. What I want to do is to perform some regression
analyses, for each country separately. The question is, how to do
this nicely (thus without repeating the same syntax with another
`subset' argument).
I thought of the
2010 Jan 26
2
tapply and more than one function, with different arguments
Dear R-users,
I am working with R version 2.10.1.
Say I have a simple function like this:
> my.fun <- function(x, mult) mult*sum(x)
Now, I want to apply this function along with some other (say 'max') to a simple data.frame, like:
> dat <- data.frame(x = 1:4, grp = c("a","a","b","b"))
Ideally, the result would look something like
2005 Nov 10
2
ltext - adding text to each panel from a matrix
Hi all (really probably just Deepayan):
In the plot below I want to add text on either side of each violin plot that
indicates the number of observations that are either positive or negative.
I'm trying to do this with ltext() and I've also monkeyed about with
panel.text(). The code below is generally what I want but my calls to
ltext() are wrong and I'm not sure how to fix them.
2011 Aug 24
3
Creating new variable with maximum visit date by group_id
Dear R users,
I am encountering the following problem: I have a dataset with a 'unique_id' and different 'visit_date' (formatted as.Date, "%d/%m/%Y") per unique_id. I would like to create a new variable with the most recent date of visit per unique_id as shown below.
unique_id  visit_date  last_visit_date
1          01/06/2010  01/06/2011
1          01/01/2011  01/06/2011
1
2006 Jul 06
2
tapply question
I think I understand tapply but I still
can't figure out how to do the following.
I have a dataframe where some of the column names are the same
and I want to make a new dataframe where columns
that have the same name are averaged by row.
so, if the data frame, DF, was
AAA BBB CCC AAA DDD
  1   0   7  11  13
  2   0   8  12  14
  3   0   6   0  15
2005 Apr 19
3
Ranking within a classification variable.
Suppose I have a data frame with two columns ``district'' and
``score'' --- score is numeric; district may be considered
categorical.
I wish to append to this data frame a third column whose entries are
the ranks of ``score'' ***within*** district.
I've tried fiddling about with tapply() and by() but the result is a
list whose i-th component consists of the ranks of
2009 Apr 23
1
ggplot2/aesthetic plotting advice
Consider the following situation:
we have quantified algal concentrations for
a variety of species using many samples at each
of three years. It seems to make sense to generate
a line plot (matplot-like), with each species plotted
as a separate line, with the points connected to emphasize
the temporal pattern.
The problem: lots of overlapping error bars.
The question: from both a
2010 Jan 11
1
apply a function down each column
Hello World,
I have a function that makes pairwise comparisons between two strings. I would like to apply this function to my data (which consists of columns with different strings) in the way that it compares the first with the second entry, and then the third with the fourth, and then the fifth with the sixth, and so on down each column...
So (2x-1) and (2x) would be the different entries to be