thr3ads.net - similar to: "split-apply question"

Displaying 20 results from an estimated 50000 matches similar to: "split-apply question"

2009 Jan 20

Summing Select Columns of a Data Frame?

Hi, I would like to operate on certain columns in a data frame, but not others. My data looks like this:

x1 x2 x3
1 2 3
4 5 6
7 8 9

I want to create a new column named x4 that is the sum of x1 and x2, but NOT x3. I looked at colSums and apply, but those functions seem to use all the columns in a data frame. How do I use only selected columns? If it helps, in Stata this would be gen x4
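One possible way to approach the question above, as a minimal sketch rather than the answer given in the original thread: subset the data frame to the wanted columns before summing. The data frame name `df` is an assumption; the values are the ones shown in the question.

```r
# Rebuild the example data from the question (the name `df` is hypothetical).
df <- data.frame(x1 = c(1, 4, 7),
                 x2 = c(2, 5, 8),
                 x3 = c(3, 6, 9))

# Select only x1 and x2, then sum across each row; x3 is left out.
df$x4 <- rowSums(df[, c("x1", "x2")])

print(df)
```

For just two columns, `df$x4 <- df$x1 + df$x2` gives the same values; the `rowSums` form scales more easily to a longer list of column names.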

Obtaining the value of x at a given value of y in a smooth.spline object

2009 Aug 12

I have some data fit to a smooth.spline object as follows: (x=vector of data for the predictor variable, y=vector of data for the response variable) fit <- smooth.spline(x,y) Now, given a spline fit point y_new, I want to be able to find out what value of x_new yielded this fit value. How to do so? (This problem is the inverse of the predict.smooth.spline function, which takes x_new as input

Converting a character string into a data frame name and performing assignments to that data frame

2010 Mar 20

Hi, I would like to do the following operations. variable.df is a character string that contains the name of the data frame that I want to operate on:

variable.df <- data.frame();
# I can do the above command using assign( variable.df, data.frame() )

How can I perform the assignment statements below?

colnames(variable.df) = colnames(some.other.df)
variable.df =

Extract data

2011 Jan 06

Dear List, I have a data frame called trait with roughly 800 species in it; each species has 15 columns of information:

Species 1 2 3 etc..
a t y h
b f j u
c r y u
etc..

I then have another data frame called com with the composition of species in each region; there are 506 different communities:

community species
NA1102 a
NA1102 c
NA0402 b
NA0402 c
AT1302 a
AT1302 b
etc..

What

Formatting numeric values in a data frame

2009 Feb 25

Hi R users, I have a data frame that contains 10K obs and 200 variables, and I am trying to format the numeric columns to look like the output table below (format to 2 decimal places), but I am having no luck. Can someone tell me the best way to accomplish this? Thanks in advance for any help!

str(ad.test)
'data.frame': 10,000 obs. of 200 variables:
$ ID : Factor

Using apply to get group means

2009 Mar 31

Hi all, I'm trying to improve my R skills and make my programming more efficient and succinct. I can solve the following question, but wonder if there's a better way to do it: I'm trying to calculate mean by several variables and then put this back into the original data set as a new variable. For example, if I were measuring weight, I might want to have each individual's

applying a function to a pair of components for each row of a list

2009 Jun 28

Hi, I have a set of (x,y) coordinate pairs that are stored as a list

> my_list
$x
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
$y
[1] -8.0866819 -7.3876052 -6.6849311 -5.9837693 -5.2967432 -4.6525466
[7] -4.0999453 -3.6556190 -3.3076102 -3.0360780 -2.8220465 -2.6532085
[13] -2.5192816 -2.4086241 -2.3072977 -2.1969611 -2.0574250 -1.8737694
[19] -1.6357864

Using of LME function in non-replicate data

2006 Mar 16

Hello all R users! I found the following discussion from June 2005 about using the lme function (in the nlme library) to fit non-replicate data. The thread: ANOVA vs REML approach to variance component estimation http://tolstoy.newcastle.edu.au/R/help/05/06/6498.html Someone posed the following problem:

# non-replicate data
y <- c(2.2, -1.4, -0.5, -0.3, -2.1, 1.5, 1.3, -0.3, 0.5, -1.4,

all.names() and all.vars(): sorting order of functions' return vector

2006 Oct 27

Dear list subscribers, in the process of writing a general code snippet to extract coefficients in an expression (in the example below: 0.5 and -0.7), I stumbled over the following peculiar (at least peculiar to me :-) ) sorting behaviour of the function all.names():

> expr1 <- expression(x3 = 0.5 * x1 - 0.7 * x2)
> all.names(expr1)
[1] "-" "*" "x1"

apply with a division

2008 Jul 03

Hi, I'd like to normalize a dataset by dividing each row by the first row. Very simple, right? I tried this:

> expt.fluor
X1 X2 X3
1 124 120 134
2 165 163 174
3 52 51 43
4 179 171 166
5 239 238 235
> first.row <- expt.fluor[1,]
> normed <- apply(expt.fluor, 1, function(r) {r / first.row})
> normed
[[1]]
X1 X2 X3
1 1 1 1
[[2]]
X1 X2 X3
1
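The attempt above appears to return a list because each row vector is divided by a one-row data frame. A hedged sketch of one alternative, not necessarily what the thread recommended: convert to a matrix and use `sweep()` to divide every column by its first-row value. The name `m` is introduced here for illustration.

```r
# Example data as shown in the question.
expt.fluor <- data.frame(X1 = c(124, 165, 52, 179, 239),
                         X2 = c(120, 163, 51, 171, 238),
                         X3 = c(134, 174, 43, 166, 235))

# Work on a numeric matrix so the first row is a plain vector.
m <- as.matrix(expt.fluor)

# Divide each column (MARGIN = 2) by the matching element of the first row.
normed <- sweep(m, 2, m[1, ], "/")

print(normed)
```

The first row of `normed` is all ones by construction, and the result stays a matrix with the original column names instead of a list.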

2010 Nov 09

Question related to combination and the corresponding probability

Dear r users, I have 4 variables x1,x2,x3,x4 and each one has two levels, for example Y and N. For x1: prob(Y)=0.6, prob(N)=0.4; For x2: prob(Y)=0.5, prob(N)=0.5; For x3: prob(Y)=0.8, prob(N)=0.2; For x4: prob(Y)=0.9, prob(N)=0.1; Therefore, the sample space for (x1, x2, x3, x4)={YYYY, YYYN, YYNY,......} (16 possible combination) and the corresponding probabilities are {(0.6)(0.5)(0.8)(0.9),

tapply with multiple arguments that are not part of the same data frame

2009 Oct 22

Hi all, I would like to invoke a function that takes multiple arguments (some of which are specified columns in the data frame, and others that are independent of the data frame) on split parts of a data frame. How do I do this? For example, let's say I have a data frame

> fitness_data
name height weight country
rob 5.8 200 usa
nancy 5.5 140 germany
jen

2011 Oct 22

Does R has a similar way as DATA in SPSS?

Hi there, in SPSS, a small piece of data can be input as follows:

DATA LIST LIST /x1 x2 x3 x4 x5 .
BEGIN DATA
5700 12.8 2500 270 25000
1000 10.9 600 10 10000
3400 8.8 1000 10 9000
3800 13.6 1700 140 25000
4000 12.8 1600 140 25000
8200 8.3 2600 60 12000
1200 11.4 400 10 16000
9100 11.5 3300 60 14000
9900 12.5 3400 180 18000
9600 13.7 3600 390 25000
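A rough R counterpart to inline SPSS data, offered as a sketch and not as the reply from the original thread: `read.table()` accepts a `text` argument, so a small data set can be typed directly into the script. Only the first three rows of the question's data are used here, and the name `dat` is an assumption.

```r
# Inline data: the first line of the text is the header, the rest are rows.
dat <- read.table(header = TRUE, text = "
x1   x2   x3   x4  x5
5700 12.8 2500 270 25000
1000 10.9 600  10  10000
3400 8.8  1000 10  9000
")

str(dat)
```

Columns are separated by any amount of whitespace by default, so the values can be aligned for readability without affecting the result.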

Relative GCV - poisson and negbin GAMs (mgcv)

2007 Apr 08

I am using gam in mgcv (1.3-22) and trying to use gcv to help with model selection. However, I'm a little confused by the process of assessing GCV scores based on their magnitude (or on relative changes in magnitude). Differences in GCV scores often seem "obvious" with my poisson gams but with negative binomial, the decision seems less clear. My data represent a similar pattern as

reading tokens

2009 Nov 03

Greetings, I am not familiar with processing text in R. Can someone tell me how to read each line of words as separate elements in a list? For example, I would like to turn:

word1 word2 word3
word2 word4

into a list of length two with three character elements in the first list and two elements in the second. I know that this should be easy, but I am a little confused by the text functions. Thanks in
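A minimal sketch of one way to get that structure, assuming the words on each line are separated by spaces: `strsplit()` returns a list with one character vector per input line. The variable names `txt` and `tokens` are made up for illustration; with a real file, the lines could come from `readLines()` on a path of your choosing.

```r
# The two example lines from the question, one string per line.
txt <- c("word1 word2 word3", "word2 word4")

# Split each line on runs of spaces; the result is a list, one element per line.
tokens <- strsplit(txt, " +")

str(tokens)
```

Here `tokens[[1]]` holds the three words of the first line and `tokens[[2]]` the two words of the second.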

predictOMatic for regression. Please try and advise me

2012 Apr 20

I'm pasting below a working R file featuring a function I'd like to polish up. I'm teaching regression this semester and every time I come to something that is very difficult to explain in class, I try to simplify it by writing an R function (eventually into my package "rockchalk"). Students have a difficult time with predict and newdata objects, so right now I'm

MLE Constraints

2008 Oct 15

Dears, I'm trying to find the parameters (a,b, ... l) that optimize the function (Model) described below. 1) How can I set some constraints with MLE2 function? I want to set p1>0, p2>0, p3>0, p1>p3. 2) The code is giving the following warning. Warning: optimization did not converge (code 1) How can I solve this problem? Can someone help me? M <- 14 Y = c(0, 1, 0, 0, 0,

replace a for loop with lapply or relative

2010 Feb 04

Dear helpers. I often need to make dichotomous variables out of continuous ones (yes, I realize the problems with throwing away much of the information), but I then like to check the min and max of each category. I have the following simple code to do this that cuts each variable (x1,x2,x3) at the 90th percentile, and then prints the min and max of each category:

Bug in model.matrix.default for higher-order interaction encoding when specific model terms are missing

2017 Oct 12

Hi, I recently ran into an inconsistency in the way model.matrix.default handles factor encoding for higher level interactions with categorical variables when the full hierarchy of effects is not present. Depending on which lower level interactions are specified, the factor encoding changes for a higher level interaction. Consider the following minimal reproducible example: -------------- >

Cropped graph using lattice

2010 Mar 17

I'm fitting data from a mixture experiment, and I'd like to present the results in a ternary graph with contours. I found this code by Walmes Zeviani http://n4.nabble.com/Triangular-filled-contour-plot-td1557386.html which is just what I want--except I would like the axis titles and labels to be proportionately larger than the ternary graph itself, for legibility in publication. When I
