Displaying 20 results from an estimated 4000 matches similar to: "Subsetting data systematically"
2007 Jan 21
1
identify selected substances across individuals
An embedded and charset-unspecified text was scrubbed...
Name: not available
Url: https://stat.ethz.ch/pipermail/r-help/attachments/20070121/436ed377/attachment.pl
2003 Feb 12
1
Na/NaN error in subsampling script
R-help readers,
I'm having a problem with an R script (see below), which regularly generates the error message,
Error in start:(start + (sample.length - 1)) :
NA/NaN argument
, for which I am unsure of the cause.
In essence, the script (below) generates the start and end points for random subsamples from along a vector (in reality a transect (of a given length,
2007 Jan 23
3
Query about extracting subsets from a table
Hi
I am trying to process tabular data as follows:
Data in the input file is of the form
genome1 genome2 tree-dist log10escore
Genome1 and genome2 are alphabetic.
Tree-dist and log10escore are numeric.
I wish to extract only those rows from this table
where the log10escore is less than -3.
data <-read.table(filename);
data$log10escore = data$log10escore[ data$log10escore
< -3];
I
2007 Aug 09
2
Systematically biased count data regression model
Dear all,
I am attempting to explain patterns of arthropod family richness
(count data) using a regression model. It seems to be able to do a
pretty good job as an explanatory model (i.e. demonstrating
relationships between dependent and independent variables), but it has
systematic problems as a predictive model: It is biased high at low
observed values of family richness and biased low at
2020 Jul 15
2
Openblas?
Hello,
I thought that I should try openblas when building a CRAN package
containing lots of old (twentieth century) C-code with frequent calls to
blas and lapack routines. I have the following options on my Ubuntu
20.04 machine:
Selection Path Priority Status
------------------------------------------------------------
* 0
2013 Jan 01
1
Order variables automatically
Hi,
I have a dataset with 6 categorical variables. I have used the following code to make the variables u1-u6 ordered factors and this works well.
cat1 cat2 cat3 cat4 cat5 cat6
   0    1    1    0    0    1
   1    1    0    0    0    0
.......
....
############
data <- read.table("example.txt")
data <- as.data.frame(lapply(data, ordered))
############
Now,
2010 Feb 23
3
how to rearrange a dataframe
Hi all,
I'd appreciate if anyone can help me with this...
I have a data frame that looks like this:
1 + name1 1 2 3
2 + name2 5 9 10
2 - name3 56 74 93
1 - name4 65 75 98
I need to rearrange this so that, for the rows with "1" in the
first column and "-" in the second column, columns 4 and 6
switch places. That is, column 6 would now be column 4 and
2011 May 21
2
unbalanced anova with subsampling (Type III SS)
Hello R-users,
I am trying to obtain Type III SS for an ANOVA with subsampling. My design
is slightly unbalanced with either 3 or 4 subsamples per replicate.
The basic aov model would be:
fit <- aov(y~x+Error(subsample))
But this gives Type I SS and not Type III.
But, using the drop1() function:
drop1(fit, test="F")
I get an error message:
"Error in
2011 Mar 21
3
Computing row differences in new columns
Hi
I have the following columns with dates and results, sorted by subject and date. I'd like to compute the differences in dates and results for each patient, based on the previous row. Obviously the last entry for each subject should be NA.
What would be the best way to accomplish that?
I guess questions like that have been already answered a thousand times, so I apologize for
2009 Apr 06
3
how to subsample all possible combinations of n species taken 1:n at a time?
Hello
I apologise for the length of this entry but please bear with me.
In short:
I need a way of subsampling communities from all possible communities of n
taxa taken 1:n at a time without having to calculate all possible
combinations (because this gives me a memory error - using
combn() or expand.grid() at least). Does anyone know of a function? Or can
you help me edit the
combn
or
2010 Oct 31
2
Randomly split a sample in two equal subsamples
Dear all,
I would like to randomly split a sample in two equally large
subsamples. The sample data is stored as a matrix with each row
representing an individual and each column representing some variable
(e.g., name, age, sex, etc.); the first row contains the names of the
variables; the first column contains the individual number (1:n, for n
individuals); the number of individuals is even (so,
2005 Jan 14
5
subsampling
hi,
I would like to subsample the array c(1:200) at random into ten subsamples
v1,v2,...,v10.
I tried to proceed step by step like this:
> x<-c(1:200)
> v1<-sample(x,20)
> y<-x[-v1]
> v2<-sample(y,20)
and then I want to do:
> x<-y[-v2]
Error: subscript out of bounds.
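A likely cause, judging only from the lines quoted above: v2 holds values drawn from y, not positions in y, so y[-v2] does not remove the sampled elements (x[-v1] happened to work only because the values of x equal their positions). A minimal sketch of one way around this, assuming ten disjoint subsamples of 20 are wanted; the names shuffled, groups and subsamples are made up for illustration:

```r
# Sketch only: shuffle once, then cut the permutation into ten blocks of 20.
set.seed(42)                           # arbitrary seed, only for reproducibility
x <- 1:200
shuffled <- sample(x)                  # random permutation of all 200 values
groups <- rep(1:10, each = 20)         # block labels 1..10, twenty of each
subsamples <- split(shuffled, groups)  # list of ten disjoint vectors

v1 <- subsamples[[1]]                  # first subsample
v2 <- subsamples[[2]]                  # second subsample, and so on
```

To keep the step-by-step approach instead, y <- setdiff(x, v1) removes by value rather than by position.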
2013 Jan 17
1
Help with interpolation
hi guys
I need to interpolate values for the zero coupon yield curve. Following data
is given
date days rate
1996 01
2012 Aug 16
1
Big Data reading subsample csv
Hello,
I'm most grateful for your time to read this.
I have a huge 30GB file of 6 million records and 3000 (mostly
categorical data) columns in csv format. I want to bootstrap subsamples for
multinomial regression, but it's proving difficult even with 64GB of RAM
in my machine and a swap file twice that size; the process becomes extremely slow
and halts.
I'm thinking about generating
2012 Nov 05
1
Plot 3 lines in one graph
I'm new to R. I want to plot 3 lines in one graph. This is my data:
print(x)
     V1       V2       V3       V4
1 -4800 25195.73 7415.219 7264.282
2 -2800 15195.73 5415.219 7264.28
I tried using matplot, but I cannot get exactly what I want. This is what I
get, and this is my code:
matplot(x[,1],x[,-1],type='b', xlab = "epsilon_h",
ylab = "Value2", xlim=
2003 Aug 06
1
Standard error of standard deviation: bootstrap or theoretical results?
Dear R users,
This is more a statistical question than an R question. I'd
appreciate it if you can give me some suggestions.
I have a sample of a time series (sample size 500, fat tail in density). I
am trying to calculate the standard error of the standard deviation of a
sub-block sample (sample size 250). I take 100 such sub-block samples
at random. For these 100 subsamples, I
2007 Oct 09
1
pseudo code
Hey there!
I have some pseudocode and don't know how to implement it in R; maybe someone can help me:
Input: A dataset X, kmax: maximum number of clusters, num_subsamples: number of
subsamples.
Output: S(i; k) - a distribution of similarities between partitions into k clusters of a reference
clustering and clustering of subsamples; i = 1 to num_subsamples
Requires: T = cluster(X): A hierarchical
2012 Mar 21
1
fwdmsa package: Error in search.normal(X[samp, ], verbose = FALSE) : At least one item has no variance
I'm using the fwdmsa package to identify deviant cases in a Mokken scale
analysis. I've run into a problem, separate from the one I posted
previously. The problem comes with items that are "easy" by IRT standards. A
good scale should include a range of difficulties; yet when I include "easy"
items in a forward search I continuously run into the problem that these
items
2010 Jul 24
4
Trouble retrieving the second largest value from each row of a data.frame
I have a data frame with a couple million lines and want to retrieve the largest and second largest values in each row, along with the label of the column these values are in. For example
row 1
strongest=-11072
secondstrongest=-11707
strongestantenna=value120
secondstrongantenna=value60
Below is the code I am using and a truncated data.frame. Retrieving the largest value was easy, but I have
2008 Feb 14
1
deleting certain observations in a data frame
Hi, I'm wondering what the fastest way is to delete certain data points (observations) in a data frame.
I have a vector of the indices/row.names I would like to delete. I have tried replacing list by list, but it always complains about different lengths, "replacing list of length a with length b" and so on.
Another way to think of it is that it's a generalization of na.rm I