thr3ads.net - similar to: '"sequeeze" a data frame'

Displaying 20 results from an estimated 40000 matches similar to: '"sequeeze" a data frame'

combining same-day lab measurements with 'apply'

2008 Oct 15

Another request for help implementing the 'apply' functions to avoid a loop structure... I am working with a data set that includes lab measurements taken at different dates for the subjects, with some subjects having more results than others. I would like to average lab results for each subject that were taken on the same day. I can do this using a for loop, but would like to know how

tapply and names

2005 Jan 25

I have a data frame containing children, with variables 'year' = birth year, and 'm.id' = mother's id number. Let's assume that all the births of each mother are represented in the data frame. Now I want to create a subset of this data frame containing all children whose mother's first birth was in the year 1816 or later. This seems to work: mid <-

zoo:rollapply by multiple grouping factors

2011 Apr 03

Hi there, I am trying to apply a function over a moving window for a large number of multivariate time series that are grouped in a nested set of factors. I have spent a few days searching for solutions with no luck, so any suggestions are much appreciated. The data I have are for the abundance dynamics of multiple species observed in multiple fixed plots at multiple sites. (In total I

attach data from tapply to dataframe

2004 Aug 03

I am working with a longitudinal data set in the long format. This data set has three observations per grade level per year. Here are the first 10 rows of the data frame:

> tenn.dat[1:10,]
   year  schid type grade gain  se new cohort
6  2001 100005    5     4 33.1 3.5   4      3
7  2002 100005    5     4 33.9 3.9   4      2
8  2003 100005    5     4 32.3 4.2   4      1
10 2001 100005

Coercing by/tapply to data.frame for more than two indices?

2008 May 02

Dear Colleagues, Apologies for a long email to ask what I feel may be a very simple question; I figure it's better to overspecify my situation. I was asked a question, recently, by a colleague in my department about pre-aggregating variables, i.e., computing the mean of defined subsets of a data frame. Naturally, I thought of the 'by' and 'tapply' functions, as

weighted.mean and tapply (again)

2005 May 25

I read answers to questions including the words "tapply" and "weighted.mean", but I didn't understand either the problem (data) or the solution provided. Here is my question ...

> dat[1:10,]
  GROUP VALUE FREQUENCY
1     2     2        78
2     2     3        40
3     2     4        16
4     2     5         3
5     2     6         1
6     2     8         1
7

plotting results from tapply

2007 Jan 26

Hi there, I'm trying to plot what is returned from a call to tapply, and can't figure out how to do it. My guess is that it has something to do with the inclusion of row names when you ask for the values you're interested in, but if anyone has any ideas on how to get it to work, that would be stellar. Here's some example code:

y1 <- rnorm(40, 2)
x1 <- rep(1:2, each=20)

mean value calculation

2012 Oct 18

Dear all, I want to calculate mean values for multiple rows: structure(list(Name = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), .Label = c("AKT", "CKT"), class = "factor"), val1 = c(2, 3, 2, 2, 2, 5, 3, 8, 2), val2. = c(4, 5, 4, 8, 4, 8, 4, 7, 4), val3 = c(5, 6, 5, 9, 5, 9, 5, 9, 5)), .Names = c("Name", "val1", "val2.",

windows device transparency issue

2007 Sep 27

I read in a thread in r-help today that the windows device in 2.6 supports transparency, so I tried an example and had some issues. The density plots should be filled with transparent color in the following example (similar to the points), however the color is "fully" transparent. This works in the Cairo device, but not in the windows device. Thanks, --Matt Matt Austin

sorting/grouping/classification problem?

2013 Jan 24

Hi, I'm a database admin for a database which manages chromatographic results of products during stability studies. I use R for the reporting of the results in MS Word through R2wd. But now I think I need your help: suppose we have the following data frame:

ID  rrt Mnd Result
 1 0.45   0   0.10
 1 0.48   0   0.30
 1 1.24   0   0.50
 2 0.45   3   0.20
 2 0.48   3   0.60
 2 1.22   3   0.40
 3

how to ignore rows missing arguments of a function when creating a function?

2010 Jun 08

Hi, I am relatively new to R; when creating functions, I run into problems with missing values. I would like my functions to ignore rows with missing values (for arguments of my function) in the analysis (as for example is the case in STATA). Note that I don't want my function to drop rows if there are missing arguments elsewhere in a row, i.e. for variables that are not arguments of my

persuade tabulate function to count NAs in a data frame

2011 Mar 19

Hi, I'd like to ask you a question again. It is basically about data frames, NAs and tabulate function. I have this data frame. I already used this in one of the previous questions of mine. It intentionally looks this simple, my real 'df' dataframe is much bigger actually and again, I am not willing to annoy anyone with huge databases... So, my database: id

Same regression per sub-group: apply?

2007 Dec 07

Dear helpers, I've come up with what is probably a simple problem, but I cannot find the solution. I have a data-set containing survey-data from several countries. What I want to do is to perform some regression analyses, for each country separately. The question is, how to do this nicely (thus without repeating the same syntax with another `subset' argument). I thought of the

tapply and more than one function, with different arguments

2010 Jan 26

Dear R-users, I am working with R version 2.10.1. Say I have a simple function like this:

> my.fun <- function(x, mult) mult*sum(x)

Now, I want to apply this function along with some other (say 'max') to a simple data.frame, like:

> dat <- data.frame(x = 1:4, grp = c("a","a","b","b"))

Ideally, the result would look something like

ltext - adding text to each panel from a matrix

2005 Nov 10

Hi all (really probably just Deepayan): In the plot below I want to add text on either side of each violin plot that indicates the number of observations that are either positive or negative. I'm trying to do this with ltext() and I've also monkeyed about with panel.text(). The code below is generally what I want but my calls to ltext() are wrong and I'm not sure how to fix them.

Creating new variable with maximum visit date by group_id

2011 Aug 24

Dear R users, I am encountering the following problem: I have a dataset with a 'unique_id' and different 'visit_date' (formatted as.Date, "%d/%m/%Y") per unique_id. I would like to create a new variable with the most recent date of visit per unique_id as shown below.

unique_id visit_date last_visit_date
1         01/06/2010 01/06/2011
1         01/01/2011 01/06/2011
1

tapply question

2006 Jul 06

I think I understand tapply but I still can't figure out how to do the following. I have a dataframe where some of the column names are the same and I want to make a new dataframe where columns that have the same name are averaged by row. So, if the data frame, DF, was

AAA BBB CCC AAA DDD
  1   0   7  11  13
  2   0   8  12  14
  3   0   6   0  15

Ranking within a classification variable.

2005 Apr 19

Suppose I have a data frame with two columns ``district'' and ``score'' --- score is numeric; district may be considered categorical. I wish to append to this data frame a third column whose entries are the ranks of ``score'' ***within*** district. I've tried fiddling about with tapply() and by() but the result is a list whose i-th component consists of the ranks of

ggplot2/aesthetic plotting advice

2009 Apr 23

Consider the following situation: we have quantified algal concentrations for a variety of species using many samples at each of three years. It seems to make sense to generate a line plot (matplot-like), with each species plotted as a separate line, with the points connected to emphasize the temporal pattern. The problem: lots of overlapping error bars. The question: from both a

apply a function down each column

2010 Jan 11

Hello World, I have a function that makes pairwise comparisons between two strings. I would like to apply this function to my data (which consists of columns with different strings) in the way that it compares the first with the second entry, and then the third with the fourth, and then the fifth with the sixth, and so on down each column... So (2x-1) and (2x) would be the different entries to be
