Displaying 20 results from an estimated 20000 matches similar to: "Help with data.frame subsets"
2011 Aug 17
3
How to apply a function to subsets of a data frame *and* obtain a data frame again?
Dear all,
First, let's create some data to play around:
set.seed(1)
(df <- data.frame(Group=rep(c("Group1","Group2","Group3"), each=10),
Value=c(rexp(10, 1), rexp(10, 4), rexp(10, 10)))[sample(1:30,30),])
## Now we need the empirical distribution function:
edf <- function(x) ecdf(x)(x) # empirical distribution function evaluated at x
##
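One common way to get a per-group result back into the data frame is `ave()`. The sketch below is not taken from the thread: it recreates the data from the post and adds a column named `EDF`, which is an illustrative assumption.

```r
# Recreate the example data from the post
set.seed(1)
df <- data.frame(Group = rep(c("Group1", "Group2", "Group3"), each = 10),
                 Value = c(rexp(10, 1), rexp(10, 4), rexp(10, 10)))[sample(1:30, 30), ]

# Empirical distribution function evaluated at its own sample points
edf <- function(x) ecdf(x)(x)

# ave() applies FUN within each Group and returns a vector aligned with the
# original row order, so the result can be stored as a new column.
# The column name "EDF" is a hypothetical choice for this sketch.
df$EDF <- ave(df$Value, df$Group, FUN = edf)

head(df)
```

Because `ave()` keeps the original row order, the result is still a data frame with one row per observation, and no re-merging is needed.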
2009 Sep 14
3
Eliminate cases in a subset of a dataframe
Hi folks,
I created a subset of a dataframe (i.e., selected only men):
subdata <- subset(data,data$gender==1)
After a residual diagnostic of a regression analysis, I detected three
outliers:
linmod <- lm(y ~ x, data=subdata)
plot(linmod)
Say, the cases 11,22, and 33 were outliers.
Here comes the problem: When I want to exclude these three cases in a
further regression analysis,
- for
2007 Aug 01
1
Problem to remove loops in a routine
Dear R-users,
I have written the following code to generate some trellis plots. It
works perfectly fine except that it is quite slow when it is applied to my
typical datasets (over several thousands of lines). I believe the
problem comes from the loops I am using to subset my data.frame. I read
in the archives that the tapply function is often more efficient than a
loop in R. Unfortunately,
2024 May 15
2
Extracting values from Surv function in survival package
OS X
R 4.3.3
Colleagues
I have created objects using the Surv function in the survival package:
> FIT.1
Call: survfit(formula = FORMULA1)
n events median 0.95LCL 0.95UCL
SUBDATA$ARM=1, SUBDATA[, EXP.STRAT]=0 18 13 345 156 NA
SUBDATA$ARM=2, SUBDATA[, EXP.STRAT]=1 13 5 NA 186 NA
SUBDATA$ARM=2, SUBDATA[, EXP.STRAT]=2 5
2024 May 16
1
Extracting values from Surv function in survival package
Hi Dennis,
look at the help page for summary.survfit, the Value n.event.
Göran
On 2024-05-15 22:41, Dennis Fisher wrote:
> OS X
> R 4.3.3
>
> Colleagues
>
> I have created objects using the Surv function in the survival package:
>> FIT.1
> Call: survfit(formula = FORMULA1)
>
> n events median 0.95LCL 0.95UCL
>
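The pointer to `summary.survfit` in the reply above can be tried on a built-in data set. This is a minimal sketch, assuming the `lung` data shipped with the survival package stands in for the poster's `SUBDATA` (which is not available here). The object names `fit`, `events_tbl` and `events_by_stratum` are illustrative.

```r
library(survival)  # recommended package, normally installed with R

# Stand-in model: the poster's FORMULA1 and SUBDATA are not shown in full
fit <- survfit(Surv(time, status) ~ sex, data = lung)
s <- summary(fit)

# Option 1: the per-stratum table behind the printed output
events_tbl <- s$table[, "events"]

# Option 2: sum the n.event component (events at each reported time) by stratum
events_by_stratum <- tapply(s$n.event, s$strata, sum)

events_tbl
events_by_stratum
```

Both routes should give the same event count per stratum. The `table` component also carries the median and confidence limits that appear in the printed `survfit` output.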
2007 Oct 01
3
mean of subset of rows
Dear list,
this must be an easy one:
I have a data.frame of two columns, "ID" with four different levels (A
to D) and numerical "size", and each of the 4 different IDs is
repeated a different number of times. I would like to get the mean size for each
ID as another data.frame. I have tried the following:
>ID= as.character(unique(data[,1])) # I use unique() because
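For a per-ID mean returned as a data frame, `aggregate()` with a formula is one standard approach. The sketch below uses made-up data, since the poster's data frame is not shown. The names `dat` and `means` are assumptions for illustration only.

```r
# Made-up stand-in: four IDs, each repeated a different number of times
set.seed(42)
dat <- data.frame(ID   = rep(c("A", "B", "C", "D"), times = c(3, 5, 2, 4)),
                  size = runif(14))

# One row per ID, with the mean of size in the second column
means <- aggregate(size ~ ID, data = dat, FUN = mean)

means
```

This avoids looping over `unique()` values by hand, because the grouping is done inside `aggregate()`.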
2011 Oct 31
1
googleVis motionchart - slow with Date class
Hi,
I am trying to create a googleVis motion chart with monthly data. When formatting the date column as a Date class variable, the plot as presented in the browser becomes considerably slower and very prone to crashing the browser. To illustrate this issue I have modified the WorldBank demo.
### objects from demo("WorldBank", package = "googleVis")
M <-
2008 Jun 19
2
Advanced Filtering problem
http://www.nabble.com/file/p18018170/subdata.csv subdata.csv
I've attached 100 rows of a data frame I am working with.
I have one factor, id, with 27 levels. There are two columns of reference
data, x and y (UTM coordinates), one column "date" in POSIXct format, and
one column "diff" in times format (chron package).
What I am trying to do is as follows:
For each day
2007 Jun 21
2
Overlaying lattice graphs (continued)
Dear R Users,
I recently posted an email on this list about the use of data.frame and
overlaying multiple plots. Deepayan kindly indicated to me the
panel.superposition command which worked perfectly in the context of the
example I gave.
I'd like to go a little bit further on this topic using a more complex
dataset structure (actually the one I want to work on).
>mydata
Plot
2007 Feb 07
1
Problem with subsets and xyplot
Hello
I have a dataframe that looks like this
MSA CITY HIVEST YEAR YR CAT
1 0200 Albuquerque 0.50 1996 1996 5
2 0520 Atlanta 13.00 1997 1997 5
3 0720 Baltimore 29.10 1994 1994 1
4 0720 Baltimore 13.00 1995 1995 5
5 0720 Baltimore 3.68
2012 Sep 24
0
stop on rows where !is.na(mydata$ti_all)
Dear R experts,
I got help to build a loop but there is a bug inside it that causes
one part of the mechanism to fail.
It should grow once, but it keeps growing on rows where $ti_all is not NA.
Here is a wall of code that very crudely demonstrates the problem,
there are a couple of dim() outputs at the end where you can see how,
the second time around, it keeps adding (2) rows, but this does not
2011 Dec 05
1
Subsetting a data frame
Hi R users,
I really need help with subsetting data frames:
I have a large database of medical records and I want to be able to match
patterns from a list of search terms.
I've used this simplified data frame in a previous example:
db <- structure(list(ind = c("ind1", "ind2", "ind3", "ind4"), test1 = c(1,
2, 1.3, 3), test2 = c(56L, 27L, 58L,
2011 Jun 07
1
extract data features from subsets
I have a large dataset similar to this:
ID time result
A 1 5
A 2 2
A 3 1
A 4 1
A 5 1
A 6 2
A 7 3
A 8 4
B 1 3
B 2 2
B 3 4
B 4 6
B 5 8
I need to extract a number of features for each individual in it (identified by "ID"). These are:
* The lowest result (the nadir)
* The time of the nadir - but if the nadir level is present at >1 time point, I need the minimum and maximum time of nadir
2012 Feb 10
3
problem subsetting data frame with variable instead of constant
Hello,
I've encountered a very weird issue with the method subset(), or maybe this
is something I don't know about said method that when you're subsetting
based on the columns of a data frame you can only use constants (0.1, 2.3,
2.2) instead of variables?
Here's a look at my data frame called 'ea.cad.pwr':
> ea.ca.pwr[1:5,]
MAF OR POWER
1 0.02 0.01 0.9999
2 0.02
2008 Mar 03
2
handling big data set in R
Hello R users,
I'm wondering whether it is possible to manage a big data set in R? I
have a data set with 3 million rows and 3 columns (X,Y,Z), where X is
the group id. For each X, I need to run 2 regressions on the submatrix.
I used the function "split":
datamatrix<-read.csv("datas.csv", header=F, sep=",")
dim(datamatrix)
# [1] 2980523 3
2011 Oct 05
2
Subsetting a data frame with multiple values and exclusions.
Hi all,
I realise that the convention is to provide a working example of my problem
but the data are of a sensitive nature so I'm not able to do that in this
case.
I need to query a database for multiple search terms:
db <- structure(list(ind = c("ind1", "ind2", "ind3", "ind4"), test1 = c(1,
2, 1.3, 3), test2 = c(56L, 27L, 58L, 2L), test3 =
2008 Jun 13
2
subsetting data-frame by vector of characters
Hi,
I have a very simple problem but I can't think how to solve it without
using a for loop and creating a large logical vector. However given the
nature of the problem I am sure there is a "1-liner" that could do the
same thing much more efficiently.
Basically I have a dataframe with characters in, e.g.
>names.and.numbers
(index) Name Fave.Number
1 John 7
2
2007 Aug 10
1
Subsetting by number of observations in a factor
Hi,
I generally do my data preparation externally to R, so
this is a bit unfamiliar to me, but a colleague has asked
me how to do certain data manipulations within R.
Anyway, basically I can get his large file into a dataframe.
One of the columns is a management group code (mg). There may be
varying numbers of observations per management group, and
he would like to subset the dataframe such
2002 Sep 30
0
using step function in functions
Help!
I am a new R user. It has been slow getting up to speed, but definitely
rewarding. I have come up against a problem I can't handle. I would
very much appreciate any help.
I am writing a vector auto regression (VAR) function that utilizes existing
R statistical functions. I would like to use the step function to do
step-wise elimination on each univariate time series model. No
2017 Jan 10
7
[Bug 99354] New: [G71] "Assertion `bkref' failed" reproducible with glmark2
https://bugs.freedesktop.org/show_bug.cgi?id=99354
Bug ID: 99354
Summary: [G71] "Assertion `bkref' failed" reproducible with
glmark2
Product: Mesa
Version: 13.0
Hardware: x86 (IA32)
OS: Linux (All)
Status: NEW
Severity: normal
Priority: medium