Displaying 20 results from an estimated 10000 matches similar to: "Regression model when dependent variable can only take positive values"
2010 Sep 10
3
(no subject)
Hello,
I'm trying to draw a bar plot where 'sex' will be the category axis,
'occupation' will represent the bars, and the clusters will represent
the mean 'income'.
     sex occupation income
1 female          j     12
2   male          b     34
3   male          j     22
4 female          j     54
5   male          b     33
6
2011 May 13
6
Powerful PC to run R
Dear all,
I'm currently running R on my laptop -- a Lenovo ThinkPad X201 (Intel Core
i7 CPU, M620, 2.67 GHz, 8 GB RAM). The problem is that some of my
calculations run for several days, sometimes even weeks (mainly simulations
over a large parameter space). Depending on the external conditions, my
laptop sometimes shuts down due to overheating.
I'm now thinking about buying a more
2016 Apr 16
1
Social Network Simulation
Dear all,
I am trying to simulate a series of networks that have characteristics
similar to real life social networks. Specifically I am interested in
networks that have (a) a reasonable degree of clustering (as measured by
the transitivity function in igraph) and (b) a reasonable degree of degree
polarization (as measured by the average degree of the top 10% nodes with
highest degree divided by
2009 Sep 04
2
plot positive predictive values
Hi,
I'm trying to fit a smooth line in a plot(y ~ x) graph.
x is a continuous variable
y is a proportion of success in sub-samples, 0 <= y <= 1, from a Monte
Carlo simulation.
For each x there may be several y-values from different runs. Each run
produces several sub-samples, where "0" means no success in any
sub-sample, "0.5" means success in half of the
2011 Apr 12
2
Testing equality of coefficients in coxph model
Dear all,
I'm running a coxph model of the form:
coxph(Surv(Start, End, Death.ID) ~ x1 + x2 + a1 + a2 + a3)
Within this model, I would like to compare the influence of x1 and x2 on the
hazard rate.
Specifically I am interested in testing whether the estimated coefficient
for x1 is equal (or not) to the estimated coefficient for x2.
I was thinking of using a Chow-test for this but the Chow
2010 Sep 01
2
ggplot2 multiple group barchart
Hi there, I have a problem with ggplot2.
Here is my example:
library(ggplot2)
v1 <- c(1, 2, 3, 3, 4)
v2 <- c(4, 3, 1, 1, 9)
v3 <- c(3, 5, 7, 2, 9)
gender <- c("m", "f", "m", "f", "f")
d.data <- data.frame(v1, v2, v3, gender)
d.data
x <- names(d.data[1:3])
y <- colMeans(d.data[1:3])  # column means; mean() on a data frame no longer returns these
pl <- ggplot(data = d.data, aes(x = x, y = y))
pl
2010 Nov 19
2
Question on overdispersion
I have a few questions relating to overdispersion in a sex ratio data set
that I am working with (note that I already have an analysis with GLMMs for
fixed effects, this is just to estimate dispersion). The response variable
is binomial because nestlings can only be male or female. I have samples of
1-5 nestlings from each nest (individuals within a nest are not independent,
so the response
2013 Jan 22
2
Approximating discrete distribution by continuous distribution
Dear all,
I have a discrete distribution showing how age is distributed across a
population using a certain set of bands:
Age <- matrix(c(74045062, 71978405, 122718362, 40489415), ncol=1,
dimnames=list(c("<18", "18-34", "35-64", "65+"),c()))
Age_dist <- Age/sum(Age)
For example I know that 23.94% of all people are between 0-18 years, 23.28%
2011 Mar 26
1
Effect size in multiple regression
Dear all,
is there a convenient way to determine the effect size for a regression
coefficient in a multiple regression model?
I have a model of the form lm(y ~ A*B*C*D) and would like to determine
Cohen's f2 (http://en.wikipedia.org/wiki/Effect_size) for each predictor
without having to do it manually.
Thanks,
Michael
Michael Haenlein
Associate Professor of Marketing
ESCP Europe
Paris,
2010 Nov 11
2
predict.coxph and predict.survreg
Dear all,
I'm struggling with predicting "expected time until death" for a coxph and
survreg model.
I have two datasets. Dataset 1 includes a certain number of people for which
I know a vector of covariates (age, gender, etc.) and their event times
(i.e., I know whether they have died and when if death occurred prior to the
end of the observation period). Dataset 2 includes another
2018 Feb 20
2
Take the maximum of every 12 columns
Don't do this (sorry Thierry)! max() already does this -- see ?max
> x <- data.frame(a =rnorm(10), b = rnorm(10))
> max(x)
[1] 1.799644
> max(sapply(x,max))
[1] 1.799644
Cheers,
Bert
Bert Gunter
"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic
2018 Feb 20
0
Take the maximum of every 12 columns
The maximum over twelve columns is the maximum of the twelve maxima of
each of the columns.
single_col_max <- apply(x, 2, max)
twelve_col_max <- apply(
  matrix(single_col_max, nrow = 12),
  2,
  max
)
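As a rough check of the idea above, the following sketch applies it to a made-up data frame and compares the result with taking the maximum of each block directly. The data, the column names X0..X23 and the object `direct` are assumptions for illustration only, not part of the original thread.

```r
# Hypothetical data: 5 rows, 24 numeric columns named X0..X23 (an assumption)
set.seed(1)
x <- as.data.frame(matrix(rnorm(5 * 24), nrow = 5))
names(x) <- paste0("X", 0:23)

# One maximum per column
single_col_max <- apply(x, 2, max)

# Arrange the column maxima 12 per matrix column, then take the
# maximum of each matrix column: one value per block of 12 columns
twelve_col_max <- apply(matrix(single_col_max, nrow = 12), 2, max)

# Cross-check: maximum of each block of 12 columns taken directly
direct <- c(max(x[, 1:12]), max(x[, 13:24]))
stopifnot(isTRUE(all.equal(twelve_col_max, direct)))
```

Note that this gives one number per block of twelve columns, not one number per row; it also assumes the number of columns is a multiple of 12.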
ir. Thierry Onkelinx
Statisticus / Statistician
Vlaamse Overheid / Government of Flanders
INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE
AND FOREST
Team Biometrie
2010 Jul 28
1
Time-dependent covariates in survreg function
Dear all,
I'm asking this question again as I didn't get a reply last time:
I'm doing a survival analysis with time-dependent covariates. Until now,
I have used a simple Cox model for this, specifically the coxph function
from the survival library. Now, I would like to try out an accelerated
failure time model with a parametric specification as implemented for
example in the survreg
2011 Jul 01
1
Poisson GLM with a logged dependent variable...just asking for trouble?
Dear R-helpers,
I'm using a GLM with poisson errors to model integer count data as a
function of one non-integer covariate.
The model formula is: glm(log(DV) ~ log(IV, 10), family = poisson).
I'm getting a warning because the logged DV is no longer an integer.
I have three questions:
1) Can I ignore the warning, or is logging the DV (resulting in
non-integers) a serious violation of the
2011 Sep 21
2
Cannot allocate vector of size x
Dear all,
I am running a simulation in which I randomly generate a series of vectors
to test whether they fulfill a certain condition. In most cases, there is no
problem. But from time to time, the (randomly) generated vectors are too
large for my system and I get the error message: "Cannot allocate vector of
size x".
The problem is that in those cases my simulation stops and I have to
2018 Feb 20
0
Take the maximum of every 12 columns
Thank you for your kind replies. Maybe I was not clear with my question (I
apologize) or I did not understand...
I would like to take the max for X0...X11 and X12...X24 in my dataset. When
I use pmax with the function byapply as in
byapply(df, 12, pmax)
I get back a list which I cannot convert to a dataframe. Am I missing
something? Thanks again!
Sincerely,
Milu
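If the goal described above is a row-wise maximum within each block of twelve columns, returned as a data frame, one possible base-R sketch is the following. It does not use byapply (which is not a base R function); the data, the column names X0..X23 and the result names are assumptions for illustration.

```r
# Hypothetical data frame with 24 numeric columns X0..X23 (an assumption)
set.seed(1)
df <- as.data.frame(matrix(rnorm(4 * 24), nrow = 4))
names(df) <- paste0("X", 0:23)

# Split the column indices into consecutive blocks of 12
blocks <- split(seq_along(df), ceiling(seq_along(df) / 12))

# pmax() across the columns of a block returns one value per row;
# sapply() collects the per-block vectors as the columns of a matrix
row_max <- as.data.frame(
  sapply(blocks, function(idx) do.call(pmax, df[idx]))
)
names(row_max) <- c("max_X0_X11", "max_X12_X23")
row_max
```

Passing the block to pmax() via do.call() treats every column as a separate argument, which is what makes the comparison run across columns rather than down them.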
2010 Jul 14
1
Printing status updates in while-loop
Dear all,
I'm using a while loop in the context of an iterative optimization
procedure. Within my while loop I have a counter variable that helps me to
determine how long the loop has been running. Before the loop I initialize
it as counter <- 0 and the last condition within my loop is counter <-
counter + 1.
I'd like to print out the current status of "counter" while the
2012 Apr 12
2
Curve fitting, probably splines
Dear all,
This is probably more related to statistics than to [R] but I hope someone
can give me an idea how to solve it nevertheless:
Assume I have a variable y that is a function of x: y=f(x). I know the
average value of y for different intervals of x. For example, I know that
in the interval [0;x1] the average y is y1, in the interval [x1;x2] the
average y is y2 and so forth.
I would like to
2011 Sep 19
1
Binary optimization problem in R
Dear all,
I would like to solve a problem similar to a multiple knapsack problem and
am looking for a function in R that can help me.
Specifically, my situation is as follows: I have a list of n items which I
would like to allocate to m groups with fixed size. Each item has a certain
profit value and this profit depends on the type of group the item is in. My
problem is to allocate the items
2010 Sep 08
1
Aggregating data from two data frames
Dear all,
I'm working with two data frames.
The first frame (agg_data) consists of two columns. agg_data[,1] is a unique
ID for each row and agg_data[,2] contains a continuous variable.
The second data frame (geo_data) consists of several columns. One of these
columns (geo_data$ZCTA) corresponds to the unique ID in the first data
frame. The problem is that only a subset of the unique ID