similar to: working with summarized data

Displaying 20 results from an estimated 30000 matches similar to: "working with summarized data"

2003 Sep 10
2
Plot survey data
I am trying to make plots that take into account survey weights. This is a survey of the US population. To start with, I want to explore the data using pairs, plot, coplots, and lattice. Are there specialized methods that handle survey weights for plotting? Any pointers? Anupam.
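A minimal sketch of one approach, assuming the 'survey' package and hypothetical column names (wt for the sampling weight, x and y for two survey variables, mydata for the data frame); svyplot draws weighted scatterplots, and the "bubble" style scales symbol area by weight:

library(survey)

des <- svydesign(ids = ~1, weights = ~wt, data = mydata)
svyplot(y ~ x, design = des, style = "bubble")   # weighted scatterplot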
2002 Feb 06
4
Weighted median
Is there a weighted median function out there similar to weighted.mean(), but for medians? If not, I'll try to implement or port it myself. The need for a weighted median came from the following optimization problem: x* = argmin_x ( a|x| + sum_{k=1}^n |x - b_k| ), where a is a *positive* real scalar, x is a real scalar, n is an integer, and the b_k are negative and positive scalars.
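Since a|x| = a|x - 0|, this objective is minimized by the weighted median of the points {0, b_1, ..., b_n} with weights {a, 1, ..., 1}. A minimal sketch of a weighted median in base R: sort the values, then return the smallest value whose normalized cumulative weight reaches one half:

weighted.median <- function(x, w) {
  o <- order(x)
  x <- x[o]; w <- w[o]
  cw <- cumsum(w) / sum(w)       # normalized cumulative weights
  x[which(cw >= 0.5)[1]]         # first value at or past half the total weight
}

weighted.median(c(1, 2, 3, 10), w = c(1, 1, 1, 5))   # 10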
2008 Oct 07
2
weighted quantiles
I have a set of values and their corresponding weights. I can use the function weighted.mean to calculate the weighted mean; I would like to be able to similarly calculate the weighted median and quantiles. Is there a function in R that can do this? Thanks, Spencer
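One option, assuming the Hmisc package: wtd.quantile is its weighted analogue of quantile, alongside wtd.mean. A minimal sketch with made-up data:

library(Hmisc)

x <- c(2, 5, 7, 11)
w <- c(1, 3, 1, 2)

wtd.mean(x, weights = w)
wtd.quantile(x, weights = w, probs = c(0.25, 0.5, 0.75))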
2004 Apr 12
2
Complex sample variances
Hello, Is there a way to get complex sample variances in the survey package on summary statistics other than means? If not, can they be added to a future version? It would be great to have them on totals, quantiles, ratios, and tables (e.g. row percents, column percents, etc.). Thanks. Fred
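For reference, later versions of the survey package do report design-based standard errors for these statistics. A minimal sketch, with hypothetical design and analysis variables (psu, strata, wt, income, hhsize):

library(survey)

des <- svydesign(ids = ~psu, strata = ~strata, weights = ~wt, data = mydata)

svytotal(~income, des)                       # total with design-based SE
svyquantile(~income, des, quantiles = 0.5)   # median with SE / CI
svyratio(~income, ~hhsize, des)              # ratio with design-based SE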
2006 Sep 11
2
faster way?
Hi, Is there a faster way to do this? It takes forever, even on a moderately sized dataset.

n <- dim(dsn)[1]
dsn2 <- dsn[order(-dsn$xhat), ]
dsn2[1, "cumx"] <- dsn2[1, "xhat"]
for (i in 2:n) {
  dsn2[i, "cumx"] <- dsn2[i - 1, "cumx"] + dsn2[i, "xhat"]
}
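The loop builds a running total, which base R's cumsum() computes in a single vectorized call, avoiding the slow per-row data-frame indexing:

dsn2 <- dsn[order(-dsn$xhat), ]
dsn2$cumx <- cumsum(dsn2$xhat)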
2017 Oct 08
2
Discourage the weights= option of lm with summarized data
Indeed: Using 'weights' is not meant to indicate that the same observation is repeated 'n' times. As I showed, this gives erroneous results. Hence I suggested that it be discouraged rather than encouraged in the Details section of lm in the Reference manual. Arie -----Original Message----- On Sat, 7 Oct 2017, wolfgang.viechtbauer at maastrichtuniversity.nl wrote: Using
2017 Oct 07
1
Discourage the weights= option of lm with summarized data
In the Details section of lm (linear models) in the Reference manual, it is suggested to use the weights= option for summarized data. This must be discouraged rather than encouraged. The motivation for this is as follows. With summarized data the standard errors get smaller with increasing numbers of observations. However, the standard errors in lm do not get smaller when for instance all weights
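A small sketch of the issue, with made-up numbers: fitting the n replicated rows directly versus passing the replication counts as weights. The coefficients agree, but the weighted fit keeps the residual degrees of freedom of the summarized data (here 1 instead of 28), so its standard errors, t values, and p values differ:

x <- c(1, 2, 3); y <- c(1.1, 1.9, 3.2); n <- c(10, 10, 10)

fit.expanded <- lm(rep(y, n) ~ rep(x, n))   # 30 observations
fit.weighted <- lm(y ~ x, weights = n)      # 3 rows, counts as weights

summary(fit.expanded)$coefficients   # SEs reflect 30 observations
summary(fit.weighted)$coefficients   # SEs reflect only 3 rows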
2017 Oct 09
2
Discourage the weights= option of lm with summarized data
Yes. Thank you; I should have quoted it. I suggest removing this text, or adding the word "not" at the beginning. Arie On Sun, Oct 8, 2017 at 4:38 PM, Viechtbauer Wolfgang (SP) <wolfgang.viechtbauer at maastrichtuniversity.nl> wrote: > Ah, I think you are referring to this part from ?lm: > > "(including the case that there are w_i observations equal to y_i and
2017 Dec 03
1
Discourage the weights= option of lm with summarized data
Peter, This is a highly structured text. Just for the discussion, I separate the building blocks, where (D) and (E) and (F) are new: BEGIN OF TEXT -------------------- (A) Non-'NULL' 'weights' can be used to indicate that different observations have different variances (with the values in 'weights' being inversely proportional to the variances); (B) or equivalently, when the elements of
2017 Oct 12
4
Discourage the weights= option of lm with summarized data
OK. We now have three suggestions to repair the text: - remove the text - add "not" at the beginning of the text - add a warning at the end of the text; something like: "Note that in this case the standard errors of the parameter estimates are in general not correct, and hence neither are the t values and the p values. Also the number of degrees of freedom is not correct. (The parameter
2009 Jun 30
2
odd behaviour in quantreg::rq
Hi, I am trying to use quantile regression to perform weighted comparisons of the median across groups. This works most of the time; however, I am seeing some odd output in summary(rq()):

Call: rq(formula = sand ~ method, tau = 0.5, data = x, weights = area_fraction)

Coefficients:
             Value     Std. Error  t value   Pr(>|t|)
(Intercept)  45.44262  3.64706     12.46007
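A possible cross-check, assuming the quantreg package and the poster's variable names: summary.rq supports several standard-error methods, and the bootstrap one can be more stable than the default when weights make the usual asymptotics shaky:

library(quantreg)

fit <- rq(sand ~ method, tau = 0.5, data = x, weights = area_fraction)
summary(fit, se = "boot")   # bootstrap SEs as a cross-check on the default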
2007 Feb 07
3
boxplot statistics in ggplot
I need to make weighted boxplots. I found that ggplot makes them. I would, however, like to label them with the boxplot statistics (the median, Q1, and Q3). With the boxplot function in base R, I could output the boxplot statistics and then use text() to place the labels on the plot. How would one do it with ggplot? Vikas
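A minimal sketch, assuming ggplot2 and Hmisc, with hypothetical names (a data frame df with columns value, grp, and weight w): geom_boxplot accepts a weight aesthetic, and the labels come from weighted quantiles computed separately per group:

library(ggplot2)
library(Hmisc)

stats <- do.call(rbind, lapply(split(df, df$grp), function(d) {
  q <- wtd.quantile(d$value, weights = d$w, probs = c(0.25, 0.5, 0.75))
  data.frame(grp = d$grp[1], y = unname(q))
}))

ggplot(df, aes(grp, value, weight = w)) +
  geom_boxplot() +
  geom_text(data = stats, aes(grp, y, label = round(y, 1)),
            inherit.aes = FALSE, hjust = -0.5)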
2011 Jan 27
2
Extrapolating values from a glm fit
Dear R-help, I have fitted a glm logistic function to dichotomous forced-choice responses varying according to the time interval between two stimuli. The x values are time separations in milliseconds, and the y values are the proportions of responses for one of the stimuli. Now I am trying to find the x values at which the fitted y value (proportion) is .25, .5, and .75. I have tried several predict parameters, and they
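For a logit-link binomial glm this is an inverse-prediction problem: solve qlogis(p) = b0 + b1*x for x. A minimal sketch with hypothetical model terms (resp, gap, dat); MASS::dose.p does the same calculation and also supplies standard errors:

fit <- glm(resp ~ gap, family = binomial, data = dat)

b <- coef(fit)
p <- c(0.25, 0.5, 0.75)
(qlogis(p) - b[1]) / b[2]    # x values at which the fitted proportion is p

library(MASS)
dose.p(fit, p = p)           # same estimates, with standard errors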
2009 Nov 18
2
Median on Aggregated data
Folks, I have the following code, which works fine on smaller data sets. For larger datasets, it runs out of memory and is far too slow, because we are essentially creating large vectors with rep() and then calling median() on them. (I learned this approach from a post on the web.) Below that, I have written the corresponding SAS code. The SAS code is fast because I can just tell the proc
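A sketch that avoids materializing rep(): the median of aggregated data is the value at which the cumulative count first reaches half the total, so only the (value, count) pairs need sorting. The column names here are hypothetical:

median.aggregated <- function(value, count) {
  o <- order(value)
  value <- value[o]; count <- count[o]
  total <- sum(count)
  cc <- cumsum(count)
  i <- which(cc >= total / 2)[1]
  if (total %% 2 == 0 && cc[i] == total / 2)
    (value[i] + value[i + 1]) / 2   # even total: median straddles two values
  else
    value[i]
}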
2003 Oct 02
3
Query: weighting cells in histogram
I have the 'breaks' for the histogram ('hist'), but I want to weight the cells instead of using the actual observation counts. I thought that using freq=FALSE implied that the numbers in 'x' were weights, but this turned out to be wrong. Any help and/or comment is very much appreciated. Regards, Mårten Mårten Bjellerup Doctoral Student in Economics School of Management and Economics
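A minimal base-R sketch: hist() has no weights argument, so one workaround is to bin the data at the chosen breaks with cut(), sum the weights per cell with tapply(), and draw the result with barplot(); x, w, and breaks stand in for the poster's data, weights, and break points:

bins <- cut(x, breaks = breaks, include.lowest = TRUE)
h <- tapply(w, bins, sum)
h[is.na(h)] <- 0          # cells with no observations get weight 0
barplot(h, space = 0)     # adjacent bars mimic a histogram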
2009 Nov 21
4
other descriptive stats packages
I just found the following list and wondered if anybody could add to it, as I have to characterize a large data set and am new to R. The list below was very helpful. Just to forestall confusion amongst those who would like to use one of the functions called "describe"... Hmisc package - describe numeric: name, count of observations, count of missing
2012 Oct 30
6
standard error for quantile
Dear all, I have a question about the standard error of quantiles, partly practical, partly theoretical. I know that

x <- rlnorm(100000, log(200), log(2))
quantile(x, c(.10, .5, .99))

computes quantiles, but I would like to know if there is any function to find the standard error (or any dispersion measure) of these estimated values. And here is the theoretical one: I feel that when I compute the median from given
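On the practical side, a minimal bootstrap sketch: resample the data with replacement, recompute the quantiles, and take the standard deviation across replicates:

set.seed(1)
x <- rlnorm(100000, log(200), log(2))
probs <- c(.10, .5, .99)

B <- 200
qs <- replicate(B, quantile(sample(x, replace = TRUE), probs))
apply(qs, 1, sd)   # bootstrap standard error for each quantile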
2006 Mar 11
1
Quicker quantiles?
Motivated by Deepayan's recent inquiries about the efficiency of the R 'quantile' function: http://tolstoy.newcastle.edu.au/R/devel/05/11/3305.html http://tolstoy.newcastle.edu.au/R/devel/06/03/4358.html I decided to try to revive an old project to implement a version of the Floyd and Rivest (1975) algorithm for finding quantiles with O(n) comparisons. I used
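A related base-R shortcut in the same spirit: sort() with the partial argument only guarantees that the k-th order statistic ends up in position k, which is selection-like work rather than a full O(n log n) sort:

kth.smallest <- function(x, k) sort(x, partial = k)[k]

x <- runif(1e6)
kth.smallest(x, 5e5)   # roughly the median, without fully sorting x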
2010 Mar 30
2
weighted.median function from package R.basic
Dear all, I want to apply a weighted median to a huge dataset, and I remember a function from the package R.basic that could do this using an internal sorting algorithm, qsort. This sped things up quite a bit. Alas, I can't find that package anywhere anymore. There is a weighted.median function in the package limma too, but I haven't used that before. Anybody who knows what happened to
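For anyone finding this later: the matrixStats package, which appears to be from the same author as R.basic, provides a fast weightedMedian():

library(matrixStats)

weightedMedian(c(1, 2, 3, 10), w = c(1, 1, 1, 5))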
2010 Nov 12
3
predict.coxph
Since I read the list in digest form (and was out ill yesterday), I'm late to the discussion. There are 3 steps for predicting survival using a Cox model: 1. Fit the data:

fit <- coxph(Surv(time, status) ~ age + ph.ecog, data = lung)

The biggest question to answer here is which covariates you wish to base the prediction on. There is the usual tradeoff between too few (leave out something
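A minimal sketch of where the remaining steps lead, assuming the survival package: build a data frame of covariate values to predict for, then survfit() on the fitted model produces the predicted survival curve:

library(survival)

fit <- coxph(Surv(time, status) ~ age + ph.ecog, data = lung)

newdat <- data.frame(age = 60, ph.ecog = 1)   # hypothetical target patient
sf <- survfit(fit, newdata = newdat)
plot(sf, xlab = "Days", ylab = "Survival probability")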