thr3ads.net - similar to: "aggregating along bins and bin-quantiles"

Displaying 20 results from an estimated 9000 matches similar to: "aggregating along bins and bin-quantiles"

Help with Hmisc, cut2, split and quantile

2010 Mar 08

Hello, I have a set of data with two columns: "Target" and "Actual". A sample file, Sample_table.txt (http://n4.nabble.com/file/n1584647/Sample_table.txt), is attached, but the data looks like this:

Actual    Target
-0.125    0.016124906
0.135     0.120799865
...       ...

I want to be able to break the data into tables based on quantiles in the "Target" column. I can see (using

cut2 once, bin twice...

2009 Oct 23

Hello, I'm using the Hmisc cut2 function to bin a set of data. It produces bins that I like, with results like this:

[96,270]: 171
[69, 96):  54
[49, 69):  40
[35, 49):  28
[28, 35):  14
[24, 28):   8
(Other) :  48

I would like to take a second set of data and assign it to bins based on the factors defined by my call to cut2. Does anyone know how I can do this? Thank you, -S
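One possible approach, sketched here in base R rather than with Hmisc: fix the break points once, then pass the same breaks to cut() for both data sets so that the factor levels match. The vectors first and second and the break values below are made up for illustration and are not the poster's data. (cut2 itself also accepts a cuts argument for explicit cut points, which may serve the same purpose.)

```r
# Hypothetical data: 'first' defines the bins, 'second' is binned afterwards.
first  <- c(24, 28, 35, 49, 69, 96, 270)
second <- c(10, 30, 50, 100, 300)

# Break points chosen once (assumed values, loosely modelled on the bins above).
# -Inf and Inf catch values of 'second' that fall outside the range of 'first'.
breaks <- c(-Inf, 28, 35, 49, 69, 96, Inf)

# right = FALSE gives left-closed intervals [a, b), the convention cut2 uses.
first_bins  <- cut(first,  breaks = breaks, right = FALSE)
second_bins <- cut(second, breaks = breaks, right = FALSE)

# Both factors share the same levels, so tables of the two line up.
identical(levels(first_bins), levels(second_bins))
table(second_bins)
```

Without the -Inf and Inf end points, any value of the second data set lying outside the original range would come back as NA.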

Oja median

2008 Nov 19

Hi Roger, As we know, the Oja median has a (finite) breakdown point of 2/n, i.e., it is not robust in any reasonable sense, and it is quite expensive to compute. Do we have a better methodology for computing a multivariate median? Rahul Agarwal Analyst Equities Quantitative Research UBS_ISC, Hyderabad On Net: 19 533 6363

warning with cut2 function

2011 Oct 11

Dear R user, please find attached a sample of the dataset I am using to create a crosstable and eventually plot a histogram from the output. I am using the cut2 function to create bins, about 7 of them, using this code after reading the data: cluster <- cut2(cross_val$value, g=7) I get the warning: Warning message: In min(xx[xx > upper]) : no non-missing arguments to min; returning Inf

FW: Reading Data

2008 Oct 07

Rahul Agarwal Analyst Equities Quantitative Research UBS_ISC, Hyderabad On Net: 19 533 6363

Hi, let me explain the problem. We have a database in this format:

Stocks  30-Jan-08  28-Feb-08  31-Mar-08  30-Apr-08
a       1.00       3.00       7.00       3.00
b       2.00       4.00       4.00       7.00
c       3.00       8.00       655.00     3.00
d       4.00       23.00      4.00       5.00
e       5.00       78.00      6.00       5.00

and we have a query

what does cut(data, breaks=n) actually do?

2007 Dec 13

Hello, I'm trying to bin a quantity into 2-3 bins for calculating entropy and mutual information. One of the approaches I'm exploring is the cut() function, which is what the mutualInfo function in binDist uses. When it's called in the format cut(data, breaks=n), it somehow splits the data into n distinct bins. Can anyone tell me how cut() decides where to cut? Thanks, Melissa
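For reference, the base R help page for cut() says that when breaks is a single number, the range of the data is divided into that many pieces of equal length, and the outer limits are then moved away by 0.1% of the range so that the extreme values fall inside the bins. A minimal sketch on made-up data (x is not the poster's data), with a quantile-based alternative for comparison:

```r
# Made-up, right-skewed data.
x <- c(1, 2, 3, 10, 20, 50, 100)

# breaks = 3: three bins of equal WIDTH over range(x), not equal counts.
equal_width <- cut(x, breaks = 3)
table(equal_width)

# The interior edges can be reproduced by hand:
edges <- seq(min(x), max(x), length.out = 4)
edges  # 1, 34, 67, 100 (cut() then nudges the two outer edges outward)

# For bins with roughly equal COUNTS, pass quantiles as explicit breaks.
q_breaks    <- quantile(x, probs = seq(0, 1, length.out = 4))
equal_count <- cut(x, breaks = q_breaks, include.lowest = TRUE)
table(equal_count)
```

With skewed data the equal-width bins can be very unevenly filled, which may matter for entropy or mutual information estimates.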

Two Noobie questions

2009 Jan 06

1. I have a list of lm (linear model) objects. Is it possible to select, through subscripts, a particular element (say, the intercept) from all the models? I've tried something like this: List[[1:length(list)]][1] All members of the list are similar. My goal is to have a list of the intercepts and lists of other estimated parameters. Is it better to convert to a matrix? How to do this? 2.
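For question 1, one common idiom is to map coef() over the list with sapply(). Indexing with [[1:length(list)]] does not loop over the elements, because [[ with a vector index is applied recursively. The sketch below uses the built-in mtcars data and three arbitrary models as stand-ins for the poster's list:

```r
# Stand-in list of fitted models (hypothetical; any list of lm objects works).
models <- list(
  lm(mpg ~ wt,   data = mtcars),
  lm(mpg ~ hp,   data = mtcars),
  lm(mpg ~ disp, data = mtcars)
)

# One intercept per model, returned as a plain numeric vector.
intercepts <- sapply(models, function(m) coef(m)[["(Intercept)"]])
intercepts

# All coefficients at once: a matrix with one column per model.
# This simplification relies on every model having the same number of terms.
coef_matrix <- sapply(models, coef)
coef_matrix
```

If the models have different numbers of terms, lapply(models, coef) keeps the result as a list instead of forcing a matrix.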

survfit using quantiles to group age

2009 Feb 02

I am using the package Design for survival analysis. I want to plot a simple Kaplan-Meier fit of survival vs. age, with age grouped as quantiles. I can do this: survplot(survfit(Surv(time,status) ~ cut(age,3), data=veteran)) but I would like to do something like this: survplot(survfit(Surv(time,status) ~ quantile(age,3), data=veteran)) # will not work. Ideally I would like to superimpose

Oja median

2008 Sep 18

Hi, Can we get the code for calculating the Oja median for multivariate data? Thanks and Regards, Rahul Agarwal Analyst Equities Quantitative Research UBS_ISC, Hyderabad On Net: 19 533 6363

cut2 error

2012 Oct 17

To R users, I am trying to use the cut2 function from the 'Hmisc' library. However, when I try to run the function on the following variable, I get an error message (displayed below). I suspect it is because of the NA, but I have no idea how to address the error. Many thanks for any insights. structure(list(var1 = c(97, 97, 98, 98, 97, 99, 97, 98, 99, 98, 99, 98, 98, 97, 97, 98, 99, 98,

Nested loop and output help

2013 Feb 01

Hello Everyone, My name is Thomas and I have been using R for one week. I recently found your site and have been able to search the archives of posts. This has given me some great information that has allowed me to craft an initial design to an inquiry I would like to make into the breakdown of McNemar's test. I have read an intro to R manual and the posting guides and hope I am not violating

sem with categorical data

2009 May 20

I am trying to run a confirmatory factor analysis using the SEM package. My data are ordinal. I have read http://socserv.mcmaster.ca/jfox/Misc/sem/SEM-paper.pdf. When I apply the hetcor function, I receive the following error: Error in checkmvArgs(lower = lower, upper = upper, mean = mean, corr = corr, : at least one element of 'lower' is larger than 'upper' Example:

Improving efficiency - better table()?

2004 Jul 06

Hi, I've been running some simulations for a while and the performance of R has been great. However, I've recently changed the code to perform a sort of chi-square goodness-of-fit test. To get the observed values for each cell I've been using table() - specifically I've been using cut2 from Hmisc to divide up the range into a specified number of cells and then using

loop of quartile groups

2012 Oct 17

Greetings R users, My goal is to generate quartile groups of each variable in my data set. I would like each experiment to have its designated group added as a subsequent column. I can accomplish this individually with the following code: brks <- with(data_variables, cut2(var2, g=4)) #I don't want the actual numbers, I need a numbered group data$test1=factor(brks,

R D COM Excel Add-Ins

2008 Oct 24

Hello All! I have a question regarding the package RDCOMClient. I want to start an Excel file with R, and it works flawlessly except for the fact that Add-Ins are not loaded. Can someone please explain to me how to load one? Does it work with ex$AddIns$Invoke? Greetings, David

replace with quantile value for a large data frame...

2011 Mar 13

Dear R-Experts, I am sure this looks like a simple question to experts, but it is a problem for me. I have a large data frame with over 1000 variables, each with a different distribution (i.e. different quantiles). I want to create a new grouped data frame, where the new variables indicate whether the value falls in the first (<25%), second (25% to <50%), third (50% to <75%) and fourth quantiles

quantile function

2004 Feb 06

I am trying to `cut' a continuous variable into contiguous classes containing approximately an equal number of observations. I thought quantile() was the appropriate function to use in order to find the breakpoints, but I end up with classes of different sizes - see example below. Does anybody have an explanation for that? And what is the `recommended' way of computing what I am looking
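The poster's example is cut off here, so the cause in that case is unknown. One common reason for uneven classes is tied values: several quantile break points coincide or sit next to a run of equal observations. A minimal sketch on made-up data, with a rank-based workaround that is only one of several options:

```r
# Made-up data with heavy ties at the low end.
x <- c(1, 1, 1, 1, 2, 3, 4, 5)

# Quartile break points; the ties make the first two breaks identical.
q <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1))
q

# cut() rejects duplicated breaks, so drop them; this leaves only 3 classes,
# and the first one absorbs all the tied values.
classes <- cut(x, breaks = unique(q), include.lowest = TRUE)
table(classes)

# Workaround sketch: bin the ranks instead of the values. Ties are broken by
# position, so equal values can land in different groups.
r      <- rank(x, ties.method = "first")
groups <- cut(r, breaks = 4, labels = FALSE)
table(groups)
```

Whether splitting tied values across groups is acceptable depends on the analysis. Hmisc's cut2 with its g argument, mentioned in several of the threads listed here, is another option.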

plot of Bernoulli data

2001 Oct 02

I have some Bernoulli data something like this:

x<-sort(runif(100,1,20))
p<-pnorm(x,10,3)
y<-as.numeric(runif(x)<p)
plot(x,y)
lines(x,p)

This plot is not very satisfactory because the ogive does not visually fit the (0,1) points very well, and also because the points tend to fall on top of one another. The second problem can be eliminated by adding vertical jitter. However I was

cut and re-factor data

2009 Sep 22

Hello R-users, I have a data frame with a factor of ages in 5 year increments, and various count data for each age group. I only have this summary information in R at the moment. I want to create a new factor that aggregates the age factors if the existing factors have insufficient counts. Then I can use aggregate to build a new data set. I figured out I can get the cut values I want using cut2

Hmisc: can not reproduce figure 4 of Statistical Tables and Plots using S and LATEX

2007 Nov 24

Dear R-users: I cannot reproduce figure 4 of *Statistical Tables and Plots using S and LATEX* by Prof. Frank Harrell with the following code:

rm(list=ls())
library(Hmisc)
getHdata(pbc)
attach(pbc)
age.groups <- cut2(age, c(45,60))
g <- function(y) apply(y, 2, quantile, c(.25,.5,.75))
y <- with(pbc, cbind(Chol=chol,Bili=bili))
# You can give new column names that are not legal S names
