thr3ads.net - similar to: "Rank-based p-value on large dataset"

Displaying 20 results from an estimated 1000 matches similar to: "Rank-based p-value on large dataset"

2008 Jul 01

"Invalid object" error in boxplot

Hi, I'm trying to make a boxplot with the data at the end of the message, and when I try to execute the command >boxplot(Diatoms) (or for any other field instead of "Diatoms") I get the following error message: Error in oldClass(stats) <- cl : adding class "factor" to an invalid object Any advice would be much appreciated. Thanks a lot, Miriam Date

2003 Sep 23

AW: Rank and extract data from a series

Hi, >I would like to rank a time-series of data, extract the top ten data items from this series, determine the >corresponding row numbers for each value in the sample, and take a mean of these *row numbers* (not the data). >I would like to do this in R, rather than pre-process the data on the UNIX command line if possible, as I need to >calculate other statistics for the series.

2012 Jan 11

get the percentage rank of a value based on an empirical data vector

Hi, I have a vector with values: x <- rnorm(1000, 5, 2) and one single value: y <- 6.2 now I would like to know the percent rank of y based on the 'population'-vector x. Is there a convenient function that calculates the percent rank of a y for the given vector x? thanks!
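
One possible answer, as a minimal sketch: the percent rank of y can be taken as the share of values in x that are less than or equal to y. The names pct_rank and pct_rank_ecdf are made up for illustration, and set.seed() is added only so the example is reproducible.

```r
set.seed(1)                   # assumption: fixed seed, only for reproducibility
x <- rnorm(1000, 5, 2)        # the 'population' vector from the question
y <- 6.2                      # the single value to be ranked

# Share of the population at or below y (a value between 0 and 1)
pct_rank <- mean(x <= y)

# Same quantity via the empirical cumulative distribution function
pct_rank_ecdf <- ecdf(x)(y)

pct_rank * 100                # expressed as a percentage
```

Whether ties should count as "below" (<=) or not (<) depends on the definition of percent rank that is wanted; the sketch uses <=, which matches ecdf().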

2003 Sep 23

Rank and extract data from a series

Here's one way. Suppose your "time series" is in a vector called "x".

    top10 <- sort(x, decreasing=TRUE)[1:10]
    mean.index <- mean(which(x %in% top10))

HTH, Andy

> -----Original Message-----
> From: James Brown [mailto:jdb33 at hermes.cam.ac.uk]
> Sent: Tuesday, September 23, 2003 7:51 AM
> To: r-help at stat.math.ethz.ch
> Subject: [R] Rank and

2012 Dec 11

Rprof causing R to crash

I'm trying to use Rprof() to identify bottlenecks and speed up a particularly slow section of code which reads in a portion of a tif file and compares each of the values to values of predictors used for model fitting. I've written up an example that anyone can run. Generally temp would be a section of a tif read into a data.frame and used later for other processing. The first portion

2011 May 05

memory and bootstrapping

hello, the following questions will without doubt reveal some fundamental ignorance, but hopefully you can still help me out. I'd like to bootstrap a coefficient gained on the basis of the coefficients in a logistic regression model (the mean differences in the predicted probabilities between two groups, where each predict() operation uses as the newdata-argument a dataframe of equal size as

2007 Mar 22

netapp/maildir/dovecot performance

We are seeing some poor performance recently that is focused around users with large mailboxes (100,000 message /INBOX, 80,000 message subfolders, etc). The performance problem manifests as very high system% utilization - basically iowait for NFS. There are two imap servers with plenty of horsepower/memory/etc. They are connected to a 3050c cluster via gig-e. Here are the mount options:

2005 Sep 15

Error in vector("double", length) : vector size specified is too large....VLDs

I have what R seems to consider a very large dataset, a 12MB text file of lat, long, and height values, 130,000 rows to be exact. Here's what I get: Thomas Colson North Carolina State University Department of Forestry and Environmental Resources (919) 673 8023 tom_colson at ncsu.edu Calendar: www4.ncsu.edu/~tpcolson

2006 Jul 18

RANDOM

I am pretty much new at this ROR game and had what I think to be a simple question. I have a set of Sponsors, and I would like to be able to select one at random and display it in my HTML. I have already set up the DB, scaffolded, set index controller and all is working smoothly. I know that I can display them all by doing <% for sponsor in @sponsors %> <%= sponsor.name %>

2006 Nov 20

Parallel Building?

I'm trying to index ~130,000 documents [soon to grow to about 500,000 documents] and I'm wondering if it's possible to combine ferret databases or in some other way split up the building process. Normally, indexing 130k documents wouldn't be that painful except that there are different types of links between these documents and they are not absolute (so for example

2004 Mar 30

rank() vs SAS proc rank

SAS proc rank has ties options of high and low that would allow producing ranks of the type found in the sports pages, e.g., rank (c(1,1,2,2,2,2,3)) == 1 1 3 3 3 3 7 Could R support these ties.methods?
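
For reference, current versions of R accept "min" and "max" as values of the ties.method argument of rank(), which correspond to the low and high behaviour described above. A short sketch using the vector from the question (the names low and high are only illustrative):

```r
x <- c(1, 1, 2, 2, 2, 2, 3)

# "Sports page" ranking: tied values all get the lowest rank of the tie group
low <- rank(x, ties.method = "min")
low    # 1 1 3 3 3 3 7

# The counterpart: tied values all get the highest rank of the tie group
high <- rank(x, ties.method = "max")
high   # 2 2 6 6 6 6 7
```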

2001 Jul 26

Installing Smart Suite 97

I get to 53 percent of files copied and I get the message that c:\lotuscomponent/lttsn32.dll is in use by another application and terminates install. Help me? BTW I get a message about contacting Microsoft support about mytab11.c. Well this is a Microsoft clean installation. Took me hours to the point where the installer will load and run. -----= Posted via Newsfeeds.Com, Uncensored Usenet

2012 Feb 22

rank with uniform count for each rank

Hello, What is the best way to get ranks for a vector of values, limit the range of rank values and create equal count in each group? I call this uniform ranking...uniform count/number in each group. Here is an example using three groups: Say I have values: x = c(3, 2, -3, 1, 0, 5, 10, 30, -1, 4) names(x) = letters[1:10] > x a b c d e f g h i j 3 2 -3 1 0 5 10 30 -1 4 I
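
One way this could be approached, as a sketch only: rank the values with ties broken by position, then scale the ranks into k groups. The excerpt is cut off before the poster's desired output, so the exact grouping wanted is an assumption here; k and grp are illustrative names. With 10 values and 3 groups the counts cannot be exactly equal, so the last group takes the extra value.

```r
x <- c(3, 2, -3, 1, 0, 5, 10, 30, -1, 4)
names(x) <- letters[1:10]

k <- 3                                   # assumed number of groups

# Unique ranks 1..n (ties, if any, broken by order of appearance)
r <- rank(x, ties.method = "first")

# Map ranks 1..n onto groups 1..k with near-equal counts
grp <- ceiling(r * k / length(x))

grp
table(grp)                               # group sizes: 3, 3, 4
```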

2010 Sep 27

Sample size estimation for non-inferiority log-rank and Wilcoxon rank-sum tests

Hello Everyone, I'm trying to conduct a couple of power analyses and was hoping someone might be able to help. I want to estimate the sample size that would be necessary to adequately power a couple of non-inferiority tests. The first would be a log-rank test and the second would be a Wilcoxon rank-sum test. I want to be able to determine the sample size that would be necessary to test for a

2020 Aug 05

Re: More parallelism in VDDK driver (was: Re: CFME-5.11.7.3 Perf. Tests)

Nir, BTW what are you using for performance testing? As far as I can tell it's not possible to make qemu-img convert use multi-conn when connecting to the source (which is going to be a problem if we want to use this stuff in virt-v2v). Instead I've hacked up a copy of this program from libnbd: https://github.com/libguestfs/libnbd/blob/master/examples/threaded-reads-and-writes.c so

2020 Aug 05

Re: More parallelism in VDDK driver (was: Re: CFME-5.11.7.3 Perf. Tests)

On Wed, Aug 5, 2020 at 3:47 PM Richard W.M. Jones <rjones@redhat.com> wrote: > > > Here are some results anyway. The command I'm using is: > > $ ./nbdkit -r -U - vddk \ > libdir=/path/to/vmware-vix-disklib-distrib \ > user=root password='***' \ > server='***' thumbprint=aa:bb:cc:... \ > vm=moref=3 \ >

2001 Dec 02

GLM with ranks as response variable

Dear R's, I have a survey where customers rank a set of 5 packages for a product, so the response variable looks like a d b c a c d b d b a c Predictor variables are 4 socio-economic parameters. I have modelled the FIRST choice of each subject as a multinomial model, similar to the housing example in MASS ch7.3, but I would prefer to use the whole rank-set instead. Can someone give me a

2007 Oct 22

Bar plot with error bars

Apologies if this has been asked before. I am having trouble understanding the R mailing list never mind R! I am relatively new to R having migrated from Minitab and SPSS. I have managed to do some more complicated statistics such as hierarchical partitioning of variance on an 80,000 record dataset but have to admit that drawing a simple bar plot I could do by hand is proving extremely

2013 Jun 08

splitting a string column into multiple columns faster

Hello! I have a column in my data frame that I have to split: I have to distill the numbers from the text. Below is my example and my solution.

    x<-data.frame(x=c("aaa1_bbb1_ccc3","aaa2_bbb3_ccc2","aaa3_bbb2_ccc1"))
    x
    library(stringr)
    out<-as.data.frame(str_split_fixed(x$x,"aaa",2))
    out2<-as.data.frame(str_split_fixed(out$V2,"_bbb",2))
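
A possible alternative in base R, offered as a sketch rather than a benchmarked answer: pull every run of digits out of each string in one pass with gregexpr() and regmatches(), instead of splitting on each prefix in turn. The output column names (aaa, bbb, ccc) are assumptions chosen for illustration, and the sketch assumes every string contains exactly three numbers.

```r
x <- data.frame(x = c("aaa1_bbb1_ccc3", "aaa2_bbb3_ccc2", "aaa3_bbb2_ccc1"))

# as.character() guards against the column being a factor (the default in older R)
s <- as.character(x$x)

# One list element per string, each holding the digit runs found in it
digits <- regmatches(s, gregexpr("[0-9]+", s))

# Stack into one row per string and convert to integers
out <- as.data.frame(do.call(rbind, lapply(digits, as.integer)))
names(out) <- c("aaa", "bbb", "ccc")     # hypothetical column names

out
```

Whether this is actually faster than the str_split_fixed() approach would need to be timed on the real data.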

2007 Aug 11

R Weka and cobweb

Hi, I have never used Cobweb before and I'm quite new to this. I have a couple of questions about the Cobweb implementation in R Weka. If you could supply an answer or insight, I would really appreciate it. 1. From Fisher's paper in 1987, it seems that Cobweb only deals with nominal data. In R Weka cobweb, is it allowed to accommodate real/continuous value? 2. My understanding is that Cobweb
