Displaying 20 results from an estimated 1000 matches similar to: "mean of subset of rows"
2007 Oct 11
2
reference for logistic regression
Dear list, first accept my apologies for asking a non-R question.
Can anyone point me to a good reference on logistic regression? web or
book references would be great. I am interested in the use and
interpretation of dummy variables and prediction models.
I checked the contributed section in the CRAN homepage but could not
find anything (Julian Faraway's "Practical Regression and ANOVA
2008 Mar 11
2
Design's validate() output
Dear list
Is there anywhere I could find further information on how to interpret
the output of validate() from the Design package for a logistic
regression? I tried ?validate and Google but I cannot find information
on what the rows and the columns represent.
Thanks
David
2007 Sep 12
2
k-means clustering
Dear list, first apologies for this is not strictly an R question but
a theoretical one.
I have read that use of k-means clustering assumes sphericity of data
distribution. Can anyone explain to me what this means? My statistical
background is too poor. Is it another kind of distribution, like
Gaussian or binomial? What happens if the distribution is not
spherical? Could you give me an
2008 Feb 08
2
correlation
Dear list
I would like to compare two measurements of disease severity (M1 and
M2), one of them is continuous (M1 ranging from 1 to 10) and the other
is ordinal (M2 takes Low, Medium, high and very high). Do you think it is
OK to use the cor() function to test whether the two agree, i.e., correlate?
I am afraid that if I set M2 to 1,2,3 and 4, the function cor() will
take them as continuous and
2008 Jan 21
2
summary of categorical variables
Dear list,
I have a data.frame with nine categorical variables (0, 1, 2 and NAs),
and I would like to get the number of events for each of them. I can
extract this using summary() for one variable at a time, wrapping it in
as.factor() (otherwise it will give me the mean value):
> summary(as.factor(mydf[,3]))
   0    1    2 NA's
 194   67    4    2
Trying to use apply() to get this for
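One possible way to get such counts for every column at once is sketched below. It assumes the columns are stored as numbers; the small data frame is a hypothetical stand-in for the poster's mydf, and lapply() with table() is a suggested alternative, not necessarily what the thread settled on.

```r
# Hypothetical data frame standing in for the poster's 'mydf'
mydf <- data.frame(
  a = c(0, 1, 2, NA, 0),
  b = c(1, 1, NA, NA, 0),
  c = c(2, 2, 2, 0, 1)
)

# lapply() visits each column as a vector; table() counts each distinct
# value, and useNA = "ifany" adds an NA count only when NAs are present.
counts <- lapply(mydf, table, useNA = "ifany")

# Counts for the first column
counts$a
```

The result is a named list with one table per column, so counts$b or counts[["b"]] gives the counts for a single variable.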
2008 Jan 18
1
histogram with NAs
Dear list,
I have a categorical variable in a data.frame that I would like to
plot using a histogram to show number of events. Values are 0, 1 and
some NAs. I can't make the hist() function
1) include a column with the number of NAs
2) make the x axis categorical; I always get 0, 0.2, 0.4, ... 1
divisions
Can anyone help me?
This is my code. "database" is my data.frame and
2024 May 15
2
Extracting values from Surv function in survival package
OS X
R 4.3.3
Colleagues
I have created objects using the Surv function in the survival package:
> FIT.1
Call: survfit(formula = FORMULA1)
                                       n events median 0.95LCL 0.95UCL
SUBDATA$ARM=1, SUBDATA[, EXP.STRAT]=0 18     13    345     156      NA
SUBDATA$ARM=2, SUBDATA[, EXP.STRAT]=1 13      5     NA     186      NA
SUBDATA$ARM=2, SUBDATA[, EXP.STRAT]=2  5
2024 May 16
1
Extracting values from Surv function in survival package
Hi Dennis,
look at the help page for summary.survfit, the Value n.event.
Göran
On 2024-05-15 22:41, Dennis Fisher wrote:
> OS X
> R 4.3.3
>
> Colleagues
>
> I have created objects using the Surv function in the survival package:
>> FIT.1
> Call: survfit(formula = FORMULA1)
>
> n events median 0.95LCL 0.95UCL
>
2008 May 16
4
reading and analyzing a text file
Dear list,
I have a text file from a scanner that includes 20 lines of text
(scanner settings) before it actually starts showing the readings in a
tabular format (headings are ID, intensity, background and few others).
I am a biologist with some experience using R and my question is if it
is possible to read this file into an R workspace and store the actual
readings in a dataframe,
2007 Aug 01
1
Problem to remove loops in a routine
Dear R-users,
I have written the following code to generate some trellis plots. It
works perfectly fine except that it is quite slow when it is applied to my
typical datasets (several thousand lines). I believe the
problem comes from the loops I am using to subset my data.frame. I read
in the archives that the tapply function is often more efficient than a
loop in R. Unfortunately,
2009 Sep 14
3
Eliminate cases in a subset of a dataframe
Hi folks,
I created a subset of a dataframe (i.e., selected only men):
subdata <- subset(data,data$gender==1)
After a residual diagnostic of a regression analysis, I detected three
outliers:
linmod <- lm(y ~ x, data=subdata)
plot(linmod)
Say, the cases 11,22, and 33 were outliers.
Here comes the problem: When I want to exclude these three cases in a
further regression analysis,
- for
2011 Aug 17
3
How to apply a function to subsets of a data frame *and* obtain a data frame again?
Dear all,
First, let's create some data to play around:
set.seed(1)
(df <- data.frame(Group=rep(c("Group1","Group2","Group3"), each=10),
Value=c(rexp(10, 1), rexp(10, 4), rexp(10, 10)))[sample(1:30,30),])
## Now we need the empirical distribution function:
edf <- function(x) ecdf(x)(x) # empirical distribution function evaluated at x
##
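A minimal sketch of one way to do this with base R's ave(), which applies a function within each group and returns a vector in the original row order, so the result can be stored as a new column. The data and edf() are rebuilt from the snippet above; the column name EDF is an assumption made for illustration, and ave() is a suggestion, not necessarily the approach the poster went on to describe.

```r
set.seed(1)
# Same construction as in the post: three groups of ten, rows shuffled
df <- data.frame(Group = rep(c("Group1", "Group2", "Group3"), each = 10),
                 Value = c(rexp(10, 1), rexp(10, 4), rexp(10, 10)))
df <- df[sample(1:30, 30), ]

# Empirical distribution function evaluated at its own sample points
edf <- function(x) ecdf(x)(x)

# ave() splits Value by Group, applies edf() to each piece, and puts the
# results back in the original row positions ('EDF' is a hypothetical name).
df$EDF <- ave(df$Value, df$Group, FUN = edf)

head(df)
```

Because the output of ave() lines up with the input rows, df stays a data frame throughout and no re-merging step is needed.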
2003 Mar 25
2
Help with data.frame subsets
Hello all,
I'm trying to get a subset of a data frame by taking all rows where the 2nd
column is >= Min and <= Max. I can do that by a 2 step process similar to
the following:
subData <- dataFrame[dataFrame[,2] >= Min,]
subData2 <- subData[subData[,2] <= Max,]
Then I try to graph the results where col 2 is the X var and col 3 is the Y
var. Therefore I do the following:
X
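The two-step subset described above can probably be collapsed into one step by combining both conditions with the element-wise & operator. The sketch below uses a made-up data frame and made-up bounds in place of the poster's dataFrame, Min and Max.

```r
# Hypothetical stand-ins for the poster's 'dataFrame', 'Min' and 'Max'
dataFrame <- data.frame(id = 1:6,
                        x  = c(0.5, 1.2, 2.8, 3.1, 4.7, 5.0),
                        y  = c(10, 12, 9, 15, 11, 14))
Min <- 1
Max <- 4

# '&' combines the two row tests element-wise, so one indexing step suffices
keep <- dataFrame[, 2] >= Min & dataFrame[, 2] <= Max
subData <- dataFrame[keep, ]

# Column 2 as X, column 3 as Y (plotting is left out of this sketch)
X <- subData[, 2]
Y <- subData[, 3]
```

If column 2 can contain NAs, indexing with which(keep) instead of keep avoids rows of NAs in the result.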
2011 Oct 31
1
googleVis motionchart - slow with Date class
Hi,
I am trying to create a googleVis motion chart with monthly data. When formatting the date column as a Date class variable, the plot as presented in the browser becomes considerably slower and very prone to crashing the browser. To illustrate this issue I have modified the WorldBank demo.
### objects from demo("WorldBank", package = "googleVis")
M <-
2008 Jun 19
2
Advanced Filtering problem
http://www.nabble.com/file/p18018170/subdata.csv subdata.csv
I've attached 100 rows of a data frame I am working with.
I have one factor, id, with 27 levels. There are two columns of reference
data, x and y (UTM coordinates), one column "date" in POSIXct format, and
one column "diff" in times format (chron package).
What I am trying to do is as follows:
For each day
2007 Jun 21
2
Overlaying lattice graphs (continued)
Dear R Users,
I recently posted an email on this list about the use of data.frame and
overlaying multiple plots. Deepayan kindly indicated to me the
panel.superposition command which worked perfectly in the context of the
example I gave.
I'd like to go a little bit further on this topic using a more complex
dataset structure (actually the one I want to work on).
>mydata
Plot
2012 Sep 24
0
stop on rows where !is.na(mydata$ti_all)
Dear R experts,
I got help to build a loop but there is a bug inside it that causes
one part of the mechanism to fail.
It should grow once, but it keeps growing on rows where $ti_all is not NA.
Here is a wall of code that very crudely demonstrates the problem;
there are a couple of dim() outputs at the end where you can see how,
the second time around, it keeps adding (2) rows, but this does not
2003 Oct 16
2
returning dynamic variable names from function
Within a function I'm assigning dynamic variable names and values to them
using the "assign" function. I want to pass back the results but am
uncertain how to do this.
Basically, my function reads a number of data files and uses the filename of
each file as the variable name for a list-to-become-dataframe. I want then
to pass all these lists back, but again, the names of the
2008 Mar 03
2
handling big data set in R
Hello R users,
I'm wondering whether it is possible to manage a big data set in R? I
have a data set with 3 million rows and 3 columns (X,Y,Z), where X is
the group id. For each X, I need to run 2 regressions on the submatrix.
I used the function "split":
datamatrix<-read.csv("datas.csv", header=F, sep=",")
dim(datamatrix)
# [1] 2980523 3
2002 Sep 30
0
using step function in functions
Help!
I am a new R user. It has been slow getting up to speed, but definitely
rewarding. I have come up against a problem I can't handle. I would
very much appreciate any help.
I am writing a vector auto regression (VAR) function that utilizes existing
R statistical functions. I would like to use the step function to do
step-wise elimination on each univariate time series model. No