Displaying 20 results from an estimated 2000 matches similar to: "Subsampling-oversampling from a data frame"
2003 Feb 12
1
NA/NaN error in subsampling script
R-help readers,
I'm having a problem with an R script (see below), which regularly generates the error message
Error in start:(start + (sample.length - 1)) :
NA/NaN argument
for which I am unsure of the cause.
In essence, the script (below) generates the start and end points for random subsamples from along a vector (in reality a transect (of a given length,
2011 May 21
2
unbalanced anova with subsampling (Type III SS)
Hello R-users,
I am trying to obtain Type III SS for an ANOVA with subsampling. My design
is slightly unbalanced with either 3 or 4 subsamples per replicate.
The basic aov model would be:
fit <- aov(y~x+Error(subsample))
But this gives Type I SS and not Type III.
But, using the drop1() function:
drop1(fit, test="F")
I get an error message:
"Error in
2011 Aug 11
1
Subsampling data
Dear R community,
I have two questions on data subsample manipulation. I am starting to use R
again after a long break and feel a bit rusty.
I want to select a subsample of data for males and females separately.
library(foreign)
Datatemp <- read.spss("H:/Skjol/Data/HL/t1and2b.sav", use.value.labels = F)
> table(Datatemp$sex)
1 2
3049 3702
2005 Jan 14
5
subsampling
hi,
I would like to subsample the array c(1:200) at random into ten subsamples
v1,v2,...,v10.
I tried to proceed step by step like this:
> x<-c(1:200)
> v1<-sample(x,20)
> y<-x[-v1]
> v2<-sample(y,20)
and then I want to do:
>x<-y[-v2]
Error: subscript out of bounds.
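A minimal sketch of one way to get ten non-overlapping random subsamples, not necessarily what the original thread settled on. Note that x[-v1] drops elements by position, and it only behaves as intended in the first step because the values of x equal their positions; that no longer holds for y. Shuffling once and splitting avoids repeated removal altogether. The seed and the names v1..v10 on the list are assumptions made for illustration.

```r
# Sketch: partition 1:200 into ten random, non-overlapping subsamples of 20.
set.seed(42)                            # assumption: fixed seed, for reproducibility only
x <- 1:200
shuffled <- sample(x)                   # random permutation of x
groups <- rep(1:10, each = 20)          # group labels 1..10, twenty of each
subsamples <- split(shuffled, groups)   # list of ten vectors of length 20
names(subsamples) <- paste0("v", 1:10)  # hypothetical names v1..v10
```

Each subsample is then available as subsamples$v1, subsamples$v2, and so on.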
2011 Feb 06
2
Subsampling out of site*abundance matrix
Hello,
How can I randomly sample individuals within each site from a site (row) x
species abundance (column) data frame or matrix? As an example, see the matrix
"abund2" made below.
##### (sorry, I'm a newbie and this is the only way I know to get an example
on here)
abund1 <- c(150, 300, 0, 360, 150, 300, 0, 240, 150, 0, 60,
0, 150, 0, 540, 0, 0, 300, 0, 240, 300, 300,
2010 Nov 09
1
subsampling table
G'day R-helpers,
I want to subsample rows of a large table based on the value in its
first column. Of all rows sharing the same value in the first column I
want to RANDOMLY extract only one.
Thanks in advance,
Achim
example input
1 15 34
1 4 66
1 24 65
2 23 47
2 9 36
3 58 9
3 38 64
3 12 64
3 4 15
4 1 88
4 23 90
desired output
1 4 66
2 23 47
3 12 64
4 1 88
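A minimal sketch of one way to draw a single random row per value of the first column, using the example input above. The column names id, a and b are hypothetical (the post gives none), and the seed is only there for reproducibility.

```r
# Sketch: keep one randomly chosen row per value of the first column.
set.seed(1)  # assumption: fixed seed, for reproducibility only
tab <- data.frame(
  id = c(1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4),   # hypothetical column names
  a  = c(15, 4, 24, 23, 9, 58, 38, 12, 4, 1, 23),
  b  = c(34, 66, 65, 47, 36, 9, 64, 64, 15, 88, 90)
)
# Group the row indices by id, then pick one index at random from each group.
idx_by_id <- split(seq_len(nrow(tab)), tab$id)
picked <- vapply(idx_by_id, function(i) i[sample.int(length(i), 1)], integer(1))
result <- tab[picked, ]
```

Indexing with sample.int(length(i), 1) rather than sample(i, 1) sidesteps the case where a group has a single row, since sample() treats a single number n as the range 1:n.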
2011 Oct 31
2
oversampling code
Hi
I have an unbalanced data set where I want to predict a binary variable Y.
I want to do an undersampling by keeping all the 1s and taking just some of
the 0s, so that I'll have 90% 0s and 10% 1s.
Can you help me do that?
Thank you
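A minimal sketch of the undersampling described above, on made-up data: keeping all n rows with Y = 1 and drawing 9 * n rows with Y = 0 gives a 10% / 90% split. The data frame dat, its column x and the class sizes are assumptions for illustration only.

```r
# Sketch: keep every Y == 1 row and sample enough Y == 0 rows for a 10/90 split.
set.seed(1)  # assumption: fixed seed, for reproducibility only
dat <- data.frame(Y = c(rep(1, 20), rep(0, 980)), x = rnorm(1000))  # made-up data
ones  <- dat[dat$Y == 1, ]
zeros <- dat[dat$Y == 0, ]
# Nine 0s per 1 yields 90% 0s; cap at the number of 0s actually available.
n_zero <- min(nrow(zeros), 9 * nrow(ones))
under <- rbind(ones, zeros[sample.int(nrow(zeros), n_zero), ])
```

If fewer than nine 0s per 1 are available, the cap means the resulting split will fall short of 90% 0s.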
2010 Dec 02
1
rpart results - problem after oversampling
Hi all,
I am trying to predict a target variable that takes values 0 or 1 using the rpart command. In my initial dataset I have few positive observations of the target variable; therefore I have oversampled the rare event by a multiple of 6 (i.e. from 762 to 4572).
However, in my results, I end up with a number of positives in one of the terminal nodes that is not divisible by 6. As I have the
2009 May 28
2
ggplot2 legend
Hi:
I need some help with the legend. I got 14 samples (Muestreo) and I
am trying to plot a smooth line for each sample. I am able to accomplish that but the problem is that the legend only displays every other sample. How can I force the legend to show all of my Muestreos? Thanks in advance.
fish_ByMuestreo <- structure(list(data = structure(list(SampleDate = structure(c(3L,
3L, 3L, 3L,
2009 Oct 09
1
Placing text in a ggplot
I am attempting to graph 12 months of temperatures, delineate the months with a vline and place the names of the months at the top of the graph.
So far I have gotten everything to work except the names, despite getting a similar graph to work the day before yesterday with Baptise A's help. Can anyone suggest what I am doing wrong? The data set is below the code.
Thanks.
Code
2017 Jul 22
1
3-day moving average for block maxima
Dear r-users,
I would like to construct a 3-day moving average for a block maxima series.
I tried this:
bmthree <- lapply(split(dt, dt$Year), function(x) max(sapply(1:(nrow(x)-2),
function(i) with(x, mean(Amount[i:(i+2)],na.rm=TRUE)))))
bmthree
and got the following output.
$`1971`
[1] 70.81667
$`1972`
[1] 68.94553
$`1973`
[1] 102.7236
$`1974`
[1] 73.6625
$`1975`
[1]
2011 Jun 24
4
ggplot2 month and year boxplot x axis order problem
Hi
I am very new to R. I am attempting to produce a monthly boxplot with the
following fish thermal telemetry data:
ID Temp Date.Time Month.Year Month Week Shortdate
1 1734 4.4140 04/05/2010 11:56 05,2010 May 19 04/05/2010
2 1734 4.1002 04/05/2010 12:06 05,2010 May 19 04/05/2010
3 1734 3.9433 04/05/2010 12:09 05,2010 May 19 04/05/2010
4 1734 3.6295
2010 Feb 28
1
ggplot 'annotate problem' again.
I had a problem annotating a graph last year (see http://n4.nabble.com/Putting-names-on-a-ggplot-td907158.html#a907158 for the discussion).
Stefan (smu) provided a solution using annotate(). However, I apparently did not update the graph file, and now, when I go back to the thread and try to use Stefan's solution, it does not seem to work, although I am sure that it did then.
The problem
2008 Jul 06
2
lattice question
I'm creating a lattice barchart based off a pretty complicated data
structure. The barchart comes out quite nice ( thanks
to lattice ) but the problem is that the horizontal axis comes out all
scrunched because the barchart doesn't know that the intervals
of Var.1 are really "associated" with the conditioning variable Var.2.
Therefore, all the intervals of Var.1 are put on
2012 May 09
12
Matrix heatmap
I would like to organize my data as follows:
I have a table that contains various data, and the numbers represent a level
of similarity between these data,
e.g. RF00013 has 100% similarity with the data RF00014.
I would like to display my table as a heatmap where darker colors represent higher
similarity, and lighter colors represent lower similarity.
I'm using version 2.11 of R.
these
2011 Jun 13
1
Heatmap in R and/or ggplot2
I have a dataframe df with columns x, y, and height. I want to create a
heatmap-like plot that creates a grid of x by y, and then color codes the
grid depending on the value of height.
Is there a ggplot2 object to do this? I'm able to easily do this in Excel
with pivot tables and conditional formatting so I'm including an image that
is close to the output I want. I want to be able to
2017 Nov 07
1
fill histogram in ggplot
Hi all,
I have the following data and I have a histogram for mms like
ggplot(hist,aes(x=hist$mms))+ geom_histogram(binwidth=1,fill="white",color="black")
and then I want to fill the color of the histogram by probable=1 and probable=0. Could anyone help me with this?
My data:
structure(list(probable = c(1L, 0L, 1L, 1L, 0L, 1L, 0L, 1L, 1L,
0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 1L,
2018 May 16
1
Systemfit Question
I can't get my simultaneous equations to work using systemfit. Please help.
#Reproducible script
Empdata<- read.csv("/Users/ngwinuiazenui/Documents/UPLOADemp.csv")
View(Empdata)
str(Empdata)
Empdata$gnipc<-as.numeric(Empdata$gnipc)
install.packages("systemfit")
library("systemfit")
pdata <- plm.data(Empdata,
2017 Dec 06
2
Odd dates generated in Forecasts
Dear friends,
I have a weekly time series which starts on Jan 4th, 2003 and ends on
December 31st, 2016.
I set up my ts object as follows:
MyTseries <- ts(mydataset, start=2003, end=2016, frequency=52)
MyModel <- auto.arima(MyTseries, d=1, D=1)
MyModelForecast <- forecast (MyModel, h=12)
Since my last observation was on December 31st, 2016 I expected my forecast
date to start on
2011 Jul 01
2
Initiating in BNArray
Hi,
I'm trying to understand some details about an example maintained in [1].
According to that link, I have total.data as a data set (am I right?).
But I don't understand how that table is built.
I saved the dataset in a file, with dput(), and had something like this:
structure(list(df.all = structure(list(V1 = structure(c(1L, 2L,
3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L,