similar to: help sample from large dataset - misleading error?

Displaying 20 results from an estimated 10000 matches similar to: "help sample from large dataset - misleading error?"

2010 Jun 22
1
subset dataset using factor levels instead of factor names
Hi All, I have a factor variable with 52 levels -with long, annoying names. I want to keep only rows with some variables. I can do this using this code: test1 <- subset(nih2009,ic_name %in% c('NATIONAL EYE INSTITUTE','Veterans Affairs')) dim(test1) [1] 2396 38 But this doesn't work: t1 <- subset(nih2009, ic_name %in% c(27,51)) dim(t1) [1] 0 38 I know
2009 Jul 09
3
Stratified data summaries
Hi All, I'm trying to automate a data summary using summary or describe from the Hmisc package. I want to stratify my data set by patient_type. I was hoping to do something like: Describe(myDataFrame ~ patient_type) I can create data subsets and run the describe function one at a time, but there's got to be a better way. Any suggestions? Rachel
2013 Apr 26
4
Help with merge function
Dear all, I'm trying to merge 2 dataframes, but I'm not being entirely successful and I can't understand why. Dataframe x1 State_prov Shape_name bob2009 bob 2010 bob2011 Nova Scotia Annapolis 0 0 1 Nova Scotia Antigonish 0 0 0 Nova Scotia Gly NA NA
2009 Sep 30
1
rcs fits in design package
Hi all, I have a vector of proportions (post_op_prw) such that >summary(amb$post_op_prw) Min. 1st Qu. Median Mean 3rd Qu. Max. NA's 0.0000 0.0000 0.0000 0.3985 0.9134 0.9962 1.0000 > summary(cut2(amb$post_op_prw,0.0001)) [0.0000,0.0001) [0.0001,0.9962] NA's 1904 1672 1
2006 Jan 25
16
Slideshow beta
Ok, I finally got the slideshow code to a state worth showing it off. The site is a very rough cut of a site I'm building for my wife's photography, so ignore the unfinished design for now :) http://rachel.kathihill.com/ To see the ajax version, go to: http://rachel.kathihill.com/?ajax=1 To randomize the order the images show: http://rachel.kathihill.com/?random=1 To change
2012 Mar 07
5
Sampling problems
Hi, I need to randomly sample my dataset 1000 times. Each sample needs to be 80% of the data. I know how to do that; my problem is that I need not only the 80% but also the corresponding 20% each time. Is there any way to do that? Alternatively, I was thinking of something like the setdiff() function to compare my 80% sample to the original dataset and obtain the corresponding 20%, unfortunately
2012 Jun 14
1
Question about sampling
Dear list, I wish to extract from a population genotyped for 10 SNPs a subsample of the same population of size n with similar allele frequencies. Essentially I have a matrix of 200 rows (df) like this: Name,Condition,rs1385699_X,rs6625163_X,rs962458_X,Rs4658627_1, sample01,Case,1,1,1,-1 sample02,Control,1,1,1,1 sample06,Control,1,-1,1,0 sample10,Case,1,1,1,0 sample11,Control,1,1,1,1
2012 May 08
4
glmmADMB
Hi there, I am new to the package glmmadmb, but need it to perform a zero-inflated gzlmm with a binomial error structure. I can't seem to get it to work without getting some strange error messages. I am trying to find out what is affecting the number of seabird calls on an array of recorders placed at 4 sites on 6 islands. I have nightly variables (weather and moonlight), site variables
2006 Jun 04
2
Can anyone help?
Quick question please....A user logs into windowsXP and tries to create a folder/document and the ownership on the new file/folder defaults to nobody:nobody. I have the user set up in samba on the IRIX machine. All other users have no problem. Anyone have any suggestions? Thanks Rachel
2001 Aug 09
2
Pulling columns out of a data.frame
Hi there. Probably a very simple solution to this problem. I have a character vector, e.g. c("name1","name2","name3"), and I want to pull out these columns from a data.frame, also converting each of these columns into factors. Many thanks, Rachel
2003 Jun 18
2
Private: Problem with tapply/lapply and sample (PR#3286)
Full_Name: Peter Gedeck Version: R1.6.2 and R1.7.0 OS: Windows XP Submission from: (NULL) (194.191.169.72) Hello, I marked the bug report Private, as I don't want my email address on the web server. The problem that I found is best explained using an example. index <- 1:6 cluster <- c(1,1,1,2,2,3) tapply(index,cluster,sample) gives $"1" [1] 2 1 3 $"2" [1] 4 5
2009 Apr 28
3
creating a vector of sums
Hi, I am trying to create a function for a goodness-of-fit test for the Pareto Distribution for some loss data that I have. So far I have the following: function(X=OTOL) { n <- length(X)-1 #calculated the number of values (extra as 0 included) i <- 2:640 #values of i j <- 1:639 #values of i-1 Y <- (n-j+1)*((X[i])-(X[j])) #First part of GoF model Y } Where OTOL is the ordered loss
2011 Mar 04
1
a simple problem
Hello R-help, I am working with large data tables that have the occasional label, a particular time point in an experiment. E.g: "Time (min)", "R1 R1", "R2 R1", "R3 R1", "R4 R1" .909, 1.117, 1.225, 1.048, 1.258 3.942, 1.113, 1.230, 1.049, 1.262 3.976, 1.105, 1.226, 1.051, 1.259 4.009, 1.114, 1.231, 1.053, 1.259 4.042, 1.107, 1.230, 1.048, 1.262
2005 Sep 19
2
Problem with tick marks in lines.survfit (package survival)
I have attempted to follow posting guidelines but I have failed to find out what I am doing wrong here. I am trying to use lines.survfit to plot a second curve onto a survival curve produced by plot.survfit. In my case this is to be a progression free survival curve superimposed upon an overall survival curve, but I will illustrate my problem using the example given in the help for
2005 Oct 12
1
Windows XP client changes not being saved via Samba
[global] workgroup = IIG netbios name = TUX server string = Samba Server interfaces = 192.168.0.10/24 update encrypted = Yes logon path = \\TUX\profile\%u logon drive = H: domain logons = Yes os level = 65 preferred master = Yes domain master = Yes wins server = 127.0.0.1 wins support = Yes
2010 Aug 18
6
Once I added this HABTM, one of my 'through' relationships, on a non-habtm model, seems to have broke?
I'm a rails newb and have been Googling about this, but I'm still stumped. Not showing everything here, but basically it should be a pretty common setup so I'm sure others know what I'm doing wrong. - A meter can belong to many meter_groups - A meter_group can have many meters. - A user can 'subscribe' to viewing a meter_group (Subscription)
2009 May 21
3
Problems with sample variance
Dear R users, I am a beginner to R. I generated 1000 samples with 15 data points in each sample. I tried finding the variance for each sample. I used the code: m=1000;n=15 > r<-rnorm(15000) > for(i in 1:m){ x=data[,i] v=var(x)} What I got was just the variance for the last sample, i.e. the 1000th sample, but what I want is 1000 variances. Does anyone know what I did wrong? Thanks, Chloe Smith
2017 Aug 19
4
My very first loop!! I failed. May I have some start-up aid?
Dear all, I have a data similar to this: myframe<- data.frame (ID=c("Ernie", "Ernie","Ernie","Ernie"), Timestamp=c("24.09.2012 08:00", "24.09.2012 09:00", "24.09.2012 10:00", "25.09.2012 10:00"), Longitude=c("8.481","8.482","8.483","8.481"),
2008 Aug 01
5
viewing data in something similar to 'R Data Editor'
Hi, I would like to view matrices I am working with in a clean, easy to read, separate window. A friend showed me how to do something like I want with edit(). I can view the matrix in the 'R Data Editor': For a sample matrix: > mat=matrix(1:15,ncol=3) > mat [,1] [,2] [,3] [1,] 1 6 11 [2,] 2 7 12 [3,] 3 8 13 [4,] 4 9 14 [5,] 5 10 15
2009 Feb 06
2
undesired grid in ps/eps outputs generated by filled.contour or image
Hi! Whenever I save a graphic in ps/eps format generated by filled.contour or image, an undesired grid is added to it (not visible on the X11 screen). For example: postscript("volcano.eps") filled.contour(volcano,col=gray(seq(0,1,,50)),levels=seq(min(volcano),max(volcano),,50)) dev.off() Any idea how to eliminate this grid? Thanks, Rachel