thr3ads.net - similar to: "maximum elements in an ff object?"

Displaying 20 results from an estimated 4000 matches similar to: "maximum elements in an ff object?"

2007 Nov 01

unable to install package ff

Hi all, I've had one of my most miserable R weeks in memory. I'm trying to deal with huge datasets (>1GB each) but am running up against those pesky memory limits. The libraries filehash and g.data are not very suitable for what I need. I haven't gotten into the sql thing yet. Most recently I've been trying to install the new package ff (not yet on the CRAN repository). I

2010 Apr 25

how to make read in a vector of 0s and 1s with no space between them

Hi all, Probably a rudimentary question. I have a flat file that looks like this (the real one has ~10e6 elements): 10110100101001011101011 and I want to pull that into R as a vector, but with each digit being its own element. There are no separators between the digits. How can I accomplish this? Thanks in advance! Matt -- Matthew C Keller Asst. Professor of Psychology University of

2007 Nov 08

skip non-sequential lines using scan?

Hi all, Is there a way to skip non-sequential lines using the "skip" argument in the scan function? E.g., I have a matrix with 100 rows and 1e7 columns. I open a connection and want to read only lines 5, 7, 9, etc [i.e., seq(5,99,2)] It might seem that the syntax to do this would be something like this (if only the "skip" allowed vectors in the same way colClasses does in

2011 May 30

ideas about how to reduce RAM & improve speed in trying to use lapply(strsplit())

hi all, I'm full of questions today :). Thanks in advance for your help! Here's the problem: x <- c('18x.6','12x.9','302x.3') I want to get a vector that is c('18x','12x','302x') This is easily done using this code: unlist(lapply(strsplit(x,".",fixed=TRUE),function(x) x[1])) So far so good. The problem is that x is a vector

2011 May 28

Changing the name of the "R" process in top

Hi all, Perhaps this is more of a unix question, but I'll give it a try here. I am running 9 different R processes at the same time (called from a shell script using R CMD BATCH). When I use the top program to monitor how they are doing, it is impossible to tell which R process is related to which R script. Is there a way to rename a specific instantiation of an R process in top with

2009 May 20

how to get remote ESS graphics to work?

Hi all, My graduate student is logging onto my Mac Pro and running R through ESS in Aquamacs (with M-x ssh and then M-x ess-remote). Everything works fine until we get to graphing. We are trying to give him the ability to look at graphics interactively. The ESS manual is not too helpful: "If you run X11 (See Section 13.3.2 [X11], page 68, X-windows) on both the local and remote machines

2017 Oct 02

fwrite() not found in data.table package

Hi all, I used to use fwrite() function in data.table but I cannot get it to work now. The function is not in the data.table package, even though a help page exists for it. My session info is below. Any ideas on how to get fwrite() to work would be much appreciated. Thanks! > sessionInfo() R version 3.2.0 (2015-04-16) Platform: x86_64-unknown-linux-gnu (64-bit) Running under: Red Hat

2008 Jul 27

64-bit R on Mac OS X 10.5.4

Hi Matt, Your method is the easiest way for me to install the 64-bit R. I followed the directions on your web site and then did the following: R --arch=x86_64 source("http://bioconductor.org/biocLite.R") biocLite(type = "source",lib = "/Library/Frameworks/R.framework/Versions/2.8/Resources/RLib64") I got many errors and warnings which I copied to the attached file.

2008 Jan 15

things that are difficult/impossible to do in SAS or SPSS but simple in R

Hi all, I'm giving a talk in a few days to a group of psychology faculty and grad students re the R statistical language. Most people in my dept. use SAS or SPSS. It occurred to me that it would be nice to have a few concrete examples of things that are fairly straightforward to do in R but that are difficult or impossible to do in SAS or SPSS. However, it has been so long since I have used

2009 Jun 17

how to interpolate time series data with missingness

Hi all, I have a vector, most of which is missing. The data is always increasing, but may do so in jumps. I would like to interpolate the NAs with 'best guesses', using something like filter(), which doesn't work due to the NAs. Here is an example: > x <- c(2,3,NA,NA,NA,3.2,3.5,NA,NA,6,NA) > x [1] 2.0 3.0 NA NA NA 3.2 3.5 NA NA 6.0 NA I would like a function that

2012 Jul 30

how to sort huge (> 2^31 row) dataframes quickly

Hello all, I have some genetic datasets (gzipped) that contain 6 columns and upwards of 10s of billions of rows. The largest dataset is about 16 GB on file, gzipped (!). I need to sort them according to columns 1, 2, and 3. The setkey() function in the data.table package does this quickly, but of course we're limited by R not being able to index vectors with > 2^31 elements, and bringing

2012 Feb 21

tapply for enormous (>2^31 row) matrices

Hi all, SETUP: I have pairwise data on 22 chromosomes. Data matrix X for a given chromosome looks like this: 1 13 58 1.12 6 142 56 1.11 18 307 64 3.13 22 320 58 0.72 Where column 1 is person ID 1, column 2 is person ID 2, column 3 can be ignored, and column 4 is how much chromosomal sharing those two individuals have in some small portion of the chromosome. There are 9000 individual people, and

2007 Oct 21

Input appreciated: R teaching idea + a way to improve R-wiki

Hi all, I will be teaching a graduate-level course on R at CU Boulder next semester. I have a teaching idea that might also help improve the R wiki page... I wanted to know what you all thought of it and wanted to solicit some advice about doing it. During the latter part of the course, students will choose a topic of interest (e.g., hierarchical linear modeling), and show how to achieve it in

2017 Oct 02

fwrite() not found in data.table package

You are asking about (a) a contributed package (b) for a package version that is not in CRAN and (c) an R version that is outdated, which stretches the definition of "on topic" here. Since that function does not appear to have been removed from that package (I am not installing a development version to test if it is broken for your benefit), I will throw out a guess that if you update R

2007 Oct 11

constraining correlations

Hello, I've searched for an answer to no avail. I am wondering if anyone knows how to constrain certain correlations to be equal. I have family data with 2 twins per family plus up to 2 siblings. I would like to somehow constrain all the sibling correlations (twin-sib and sib-sib) to be the same while allowing the twin-twin correlation to be different. Here is some simulated code:

2009 Mar 11

non-positive definite matrix remedies?

Hi all, For computational reasons, I need to estimate an 18x18 polychoric correlation matrix two variables at a time (rather than trying to estimate them all simultaneously using ML). The resulting polychoric correlation matrix I am getting is non-positive definite, which is problematic because I'm using this matrix later on as if it were a legitimately estimated correlation matrix (in order

2010 Feb 06

question about bigmemory: releasing RAM from a big.matrix that isn't used anymore

Hi all, I'm on a Linux server with 48 GB RAM. I did the following: x <- big.matrix(nrow=20000,ncol=500000,type='short',init=0,dimnames=list(1:20000,1:500000)) #Gets around the 2^31 issue - yeah! In Unix, when I run the "top" command, I see R is taking up about 18 GB RAM, even though the object x is 0 bytes in R. That's fine: that's how bigmemory is supposed to

2011 Aug 19

how to merge distance data based on location

Hi all, I have two data frames, two columns each, 1000s of rows. Each row represents a segment of the genome where a deletion has occurred. First column is start position of the deletion in genomic distance, second is end position. So, e.g., first 3 rows of data frame A is: 1003 1023 5932 6120 12348 12689 first 3 rows of data frame B is: 852 5305 1010 1015 8500 9500 10000 13000 I want to merge

2010 Mar 15

[R-SIG-Mac] How to interrupt an R process that hangs

+1--this is the single most-annoying issue with R that I know of. My usual solution, after accomplishing nothing as R spins idly for a couple hours, is to kill the process and lose any un-saved work. save.history() is my friend, but is a big delay when you work with big data sets as I do, so I don't run it after every command. I have cc'd r-help here, however, because I experience this

2011 May 29

why does scan(gzfile("file"), what='integer') import data as mode "character" ?

Hi all, My code: x <- scan(gzfile("file"),what='integer') x is imported, but as mode "character" rather than "integer". I know I can do as.integer() when importing, but am still trying to figure out why the above occurs. When I do summary(as.integer(x)), there are no NAs introduced by coercion, so the vector really is all integer. Also, is the above
