Displaying 20 results from an estimated 9000 matches similar to: "aggregating along bins and bin-quantiles"
2010 Mar 08
1
Help with Hmisc, cut2, split and quantile
Hello,
I have a set of data with two columns: "Target" and "Actual". A sample table
(Sample_table.txt, http://n4.nabble.com/file/n1584647/Sample_table.txt) is
attached, but the data looks like this:
Actual Target
-0.125 0.016124906
0.135 0.120799865
... ...
... ...
I want to be able to break the data into tables based on quantiles in the
"Target" column. I can see (using
2009 Oct 23
1
cut2 once, bin twice...
Hello,
I'm using the Hmisc cut2 function to bin a set of data. It produces bins
that I like with results like this:
[96,270]:171
[69, 96): 54
[49, 69): 40
[35, 49): 28
[28, 35): 14
[24, 28): 8
(Other) : 48
I would like to take a second set of data, and assign it to bins based on
factors defined by my call to cut2.
Does anyone know how I can do this?
Thank you,
-S
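One possible way to approach the question above, sketched in base R rather than with cut2 itself: compute the cut points once from the first data set, then pass the same cut points to cut() for both data sets. The data, the variable names (first, second, breaks) and the choice of four quantile groups are assumptions made for illustration only; they are not taken from the original post.

```r
# Hypothetical data standing in for the poster's two data sets.
set.seed(1)
first  <- rexp(200, rate = 1 / 60)
second <- rexp(50,  rate = 1 / 60)

# Derive the cut points once, from the first data set (4 quantile groups).
breaks <- quantile(first, probs = seq(0, 1, length.out = 5), names = FALSE)

# Open up the outer limits so that values of `second` lying outside the
# range of `first` still land in the lowest or highest bin.
breaks[1] <- -Inf
breaks[length(breaks)] <- Inf

# Apply the same cut points to both data sets; right = FALSE gives
# left-closed intervals [a, b), similar in spirit to the cut2 output above.
bins_first  <- cut(first,  breaks = breaks, right = FALSE)
bins_second <- cut(second, breaks = breaks, right = FALSE)

table(bins_second)
```

Because both factors are built from the same breaks, they share identical levels and can be tabulated or compared directly.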
2008 Nov 19
2
Oja median
Hi Roger,
As we know, the Oja median has (finite) breakdown point 2/n, i.e., it
is not robust in any reasonable sense, and it is quite expensive to
compute. Do we have a better methodology for computing a multivariate
median?
Rahul Agarwal
Analyst
Equities Quantitative Research
UBS_ISC, Hyderabad
On Net: 19 533 6363
2011 Oct 11
1
warning with cut2 function
Dear R user,
Please find attached a sample of the dataset I am using to create a crosstable and eventually plot a histogram from the output.
I am using the cut2 function to create bins, about 7 of them, using this code after reading the data:
cluster <- cut2(cross_val$value, g=7)
I get the warning:
Warning message:
In min(xx[xx > upper]) : no non-missing arguments to min; returning Inf
2008 Oct 07
1
FW: Reading Data
Rahul Agarwal
Analyst
Equities Quantitative Research
UBS_ISC, Hyderabad
On Net: 19 533 6363
Hi, let me explain the problem.
We have a database in this format:
Stocks  30-Jan-08  28-Feb-08  31-Mar-08  30-Apr-08
a            1.00       3.00       7.00       3.00
b            2.00       4.00       4.00       7.00
c            3.00       8.00     655.00       3.00
d            4.00      23.00       4.00       5.00
e            5.00      78.00       6.00       5.00
and we have a query
2007 Dec 13
3
what does cut(data, breaks=n) actually do?
Hello,
I'm trying to bin a quantity into 2-3 bins for calculating entropy and
mutual information. One of the approaches I'm exploring is the cut()
function, which is what the mutualInfo function in binDist uses. When it's
called in the format cut(data, breaks=n), it somehow splits the data into n
distinct bins. Can anyone tell me how cut() decides where to cut?
Thanks,
Melissa
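As a hedged illustration of the question above: according to the base R documentation for cut(), when breaks is a single number the range of the data is divided into that many pieces of equal width, and the outer limits are then moved outward by 0.1% of the range so that the extreme values fall inside the intervals. The sketch below rebuilds those break points by hand and checks that they assign the same bins as cut(x, breaks = n). The sample vector x is made up for illustration.

```r
# Made-up data; any numeric vector with a non-zero range would do.
x <- c(1, 2, 3, 5, 8, 13, 21)
n <- 3

# Rebuild the break points: n equal-width pieces over the data range,
# with the two outer limits pushed out by 0.1% of the range.
rx <- range(x)
dx <- diff(rx)
manual_breaks <- seq(rx[1], rx[2], length.out = n + 1)
manual_breaks[c(1, n + 1)] <- c(rx[1] - dx / 1000, rx[2] + dx / 1000)

auto    <- cut(x, breaks = n)              # let cut() choose the breaks
by_hand <- cut(x, breaks = manual_breaks)  # use the reconstructed breaks

# Both calls put every observation in the same bin.
stopifnot(identical(as.integer(auto), as.integer(by_hand)))

# Equal-width bins are not equal-count bins: the counts here are 4, 2, 1.
table(auto)
```

If bins with roughly equal numbers of observations are wanted instead, break points taken from quantile() are the usual alternative.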
2009 Jan 06
3
Two Noobie questions
1. I have a list of lm (linear model) objects. Is it possible to select,
through subscripts, a particular element (say, the intercept) from all the
models? I've tried something like this:
List[[1:length(list)]][1]
All members of the list are similar. My goal is to have a list of the
intercepts and lists of other estimated parameters. Is it better to convert
to a matrix? How to do this?
2.
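For the first question above, a minimal sketch under stated assumptions: the list of models here is hypothetical, fitted on the built-in mtcars data purely for illustration. The idea is to apply coef() to each model with sapply() rather than to index the list with a vector.

```r
# Hypothetical list of lm objects, fitted on the built-in mtcars data.
models <- list(
  lm(mpg ~ wt,   data = mtcars),
  lm(mpg ~ hp,   data = mtcars),
  lm(mpg ~ disp, data = mtcars)
)

# models[[1:3]] does not select three elements: `[[` with a vector index
# performs recursive indexing. Apply a function to each element instead.
intercepts <- sapply(models, function(m) coef(m)[["(Intercept)"]])
slopes     <- sapply(models, function(m) coef(m)[[2]])

# One row per model, which is often easier to work with than separate lists.
estimates <- data.frame(intercept = intercepts, slope = slopes)
estimates
```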
2009 Feb 02
1
survfit using quantiles to group age
I am using the package Design for survival analysis. I want to plot a
simple Kaplan-Meier fit of survival vs. age, with age grouped as
quantiles. I can do this:
survplot(survfit(Surv(time,status) ~ cut(age,3), data=veteran))
but I would like to do something like this:
survplot(survfit(Surv(time,status) ~ quantile(age,3), data=veteran))
#will not work
ideally I would like to superimpose
2008 Sep 18
3
Oja median
Hi,
Can we get the code for calculating the Oja median for multivariate data?
Thanks and Regards
Rahul Agarwal
Analyst
Equities Quantitative Research
UBS_ISC, Hyderabad
On Net: 19 533 6363
2012 Oct 17
2
cut2 error
To R users,
I am trying to use cut2 function from the 'Hmisc' library. However, when I
try and run the function on the following variable, I get an error message
(displayed below). I suspect it is because of the NA but I have no idea
how to address the error. Many thanks to any insights.
structure(list(var1 = c(97, 97, 98, 98, 97, 99, 97,
98, 99, 98, 99, 98, 98, 97, 97, 98, 99, 98,
2013 Feb 01
2
Nested loop and output help
Hello Everyone,
My name is Thomas and I have been using R for one week. I recently found
your site and have been able to search the archives of posts. This has
given me some great information that has allowed me to craft an initial
design to an inquiry I would like to make into the breakdown of McNemar's
test. I have read an intro to R manual and the posting guides and hope I am
not violating
2009 May 20
1
sem with categorical data
I am trying to run a confirmatory factor analysis using the SEM package. My
data are ordinal. I have read
http://socserv.mcmaster.ca/jfox/Misc/sem/SEM-paper.pdf.
When I apply the hetcor function, I receive the following error:
Error in checkmvArgs(lower = lower, upper = upper, mean = mean, corr = corr,
:
at least one element of 'lower' is larger than 'upper'
Example:
2004 Jul 06
3
Improving efficiency - better table()?
Hi,
I've been running some simulations for a while and the performance of R
has been great. However, I've recently changed the code to perform a sort
of chi-square goodness-of-fit test. To get the observed values for each
cell I've been using table() - specifically I've been using cut2 from
Hmisc to divide up the range into a specified number of cells and then
using
2012 Oct 17
2
loop of quartile groups
Greetings R users,
My goal is to generate quartile groups of each variable in my data set. I
would like each experiment to have its designated group added as a
subsequent column. I can accomplish this individually with the following
code:
brks <- with(data_variables,
cut2(var2, g=4))
#I don't want the actual numbers, I need a numbered group
data$test1=factor(brks,
2008 Oct 24
1
R D COM Excel Add-Ins
Hello All!
I have a question regarding the package RDCOMClient. I want to start an
Excel file with R, and it works flawlessly except for the fact that Add-Ins are
not loaded. Can someone please explain to me how to load one? Does it work with
ex$AddIns$Invoke?
Greetings,
David
2011 Mar 13
1
replace with quantile value for a large data frame...
Dear R-Experts,
I am sure this looks like a simple question to experts, but it is a problem
for me. I have a large data frame with over 1000 variables, each with a
different distribution (i.e. different quantiles). I want to create a
new grouped data frame in which the new variables indicate whether a value falls in the
first (<25%), second (25% to <50%), third (50% to <75%) and fourth quantiles
2004 Feb 06
3
quantile function
I am trying to `cut' a continuous variable into contiguous classes
containing approximately an equal number of observations. I thought
quantile() was the appropriate function to use in order to find the
breakpoints, but I end up with classes of different sizes - see
example below. Does anybody have an explanation for that? And what is
the `recommended' way of computing what I am looking
2001 Oct 02
4
plot of Bernoulli data
I have some Bernoulli data something like this:
x<-sort(runif(100,1,20))
p<-pnorm(x,10,3)
y<-as.numeric(runif(x)<p)
plot(x,y)
lines(x,p)
This plot is not very satisfactory because the ogive does not visually
fit the (0,1) points very well, and also because the points tend to fall
on top of one another. The second problem can be eliminated by adding
vertical jitter. However I was
2009 Sep 22
1
cut and re-factor data
Hello R-users,
I have a data frame with a factor of ages in 5 year increments, and various
count data for each age group. I only have this summary information in R at
the moment.
I want to create a new factor that aggregates the age factors if the
existing factors have insufficient counts. Then I can use aggregate to
build a new data set.
I figured out I can get the cut values I want using cut2
2007 Nov 24
1
Hmisc: can not reproduce figure 4 of Statistical Tables and Plots using S and LATEX
Dear R-users:
I cannot reproduce figure 4 of *Statistical Tables and Plots using S and
LATEX* by Prof. Frank Harrell with the following code:
rm(list=ls())
library(Hmisc)
getHdata(pbc)
attach(pbc)
age.groups <- cut2(age, c(45,60))
g <- function(y) apply(y, 2, quantile, c(.25,.5,.75))
y <- with(pbc, cbind(Chol=chol,Bili=bili))
# You can give new column names that are not legal S names