Displaying 20 results from an estimated 1000 matches similar to: "efficiently replacing values in a matrix"
2008 May 29
1
Separator argument in read.table
Hi,
Suppose I have the following tabular data:
1729_at | TRADD | TNFRSF1A-associated via death domain | protein-coding
1773_at | FNTB | farnesyltransferase, CAAX box, beta | protein-coding
177_at | PLD1 | phospholipase D1, phosphatidylcholine-specific | protein-coding
What is the right separator to use with the read.table function?
I tried this:
dat <-
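A hedged sketch of one way to read such a file (the file name genes.txt and the column names are illustrative): pass "|" as the separator and strip the surrounding whitespace.
dat <- read.table("genes.txt", sep = "|", strip.white = TRUE,
                  stringsAsFactors = FALSE,
                  col.names = c("probe", "symbol", "description", "type"))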
2011 Nov 20
1
place values into a matrix efficiently?
This question occurred to me as I was thinking about matrix value updates.
I probably will never need to do this, but wanted to ask if there are
efficient methods to perform the for-loop in the following sequence.
%xymat<-matrix(rep(0,100), nr=10, nc=10) # empty matrix
%x<-1:10
%y<-sample.int(10,10,rep=T)
%for (j in 1:10) xymat[x[j],y[j]] <- some_function(x[j],y[j]) #to create
either
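A vectorized alternative to the loop (a sketch, assuming some_function accepts vectors) is to index the matrix with a two-column coordinate matrix:
xymat <- matrix(0, nrow = 10, ncol = 10)
x <- 1:10
y <- sample.int(10, 10, replace = TRUE)
xymat[cbind(x, y)] <- some_function(x, y)  # matrix indexing replaces the explicit for-loop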
2010 Mar 23
2
Saving tab/csv delimited data with NaN's
Hello,
I am working with multiple simulated data sets with missing values. I would
like to store these data sets in either tab-delimited or .csv
format with missing values marked as NaN's instead of NA's.
I read the import/export document which mentions that write.table
command converts NaN's to NA. Is there any other way I can store the
NaN's? I tried the write syntax
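A hedged workaround: write.table() replaces every missing value (NA and NaN alike) with its na string, so setting na = "NaN" writes all of them out as NaN. A sketch, with dat and sim1.csv as placeholder names:
write.table(dat, file = "sim1.csv", sep = ",", na = "NaN",
            row.names = FALSE, quote = FALSE)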
2012 Nov 08
3
problem with package development and older defs earlier in search order
Hi,
I have a problem with a package I have developed: its functions are not the ones that get used, because older versions of the functions sit in the .GlobalEnv, fetched from .RData files stored from previously saved workspaces. I need to be able to fix this somehow when I load the package. I do not want to mess up the search order to fix the problem.
How I got myself into this mess is that I started
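One possible cleanup sketch (mypkg is a placeholder for the package's name): remove from the workspace any object whose name clashes with one of the package's exports, so the package versions are no longer masked.
masked <- intersect(ls(.GlobalEnv), getNamespaceExports("mypkg"))  # "mypkg" is a placeholder
rm(list = masked, envir = .GlobalEnv)                              # drop the stale copies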
2004 Jul 30
1
Three-way ANOVA?
Hi,
I'm a biologist, so please forgive me if my question sounds absurd! I have 3
parameters x1, x2, x3 and a response variable y. The sample size is 75. I tried
to do the following:
mylm <- lm(y ~ x1 + x2 + x3, data = mydata)
but I can only get stats from anova for the first 2 variables. The third comes
up as NA. The degrees of freedom for the third variable are 0.
Is there
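A hedged diagnostic sketch: a coefficient reported as NA with 0 degrees of freedom usually means that predictor is linearly dependent on the others (perfect collinearity), which alias() will report.
mylm <- lm(y ~ x1 + x2 + x3, data = mydata)
anova(mylm)
alias(mylm)   # lists aliased terms, e.g. x3 expressible exactly from x1 and x2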
2010 Apr 25
4
how to make R read in a vector of 0s and 1s with no space between them
Hi all,
Probably a rudimentary question. I have a flat file that looks like
this (the real one has ~10e6 elements):
10110100101001011101011
and I want to pull that into R as a vector, but with each digit being
its own element. There are no separators between the digits. How can
I accomplish this? Thanks in advance!
Matt
--
Matthew C Keller
Asst. Professor of Psychology
University of
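A minimal sketch, assuming the digits sit on a single line of a file (digits.txt is a placeholder name): read that line as one string and split it into single characters.
line <- readLines("digits.txt", n = 1)
v <- as.integer(strsplit(line, "")[[1]])   # one element per digit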
2007 Nov 08
3
skip non-sequential lines using scan?
Hi all,
Is there a way to skip non-sequential lines using the "skip" argument
in the scan function?
E.g., I have a matrix with 100 rows and 1e7 columns. I open a
connection and want to read only lines 5, 7, 9, etc [i.e.,
seq(5,99,2)]
It might seem that the syntax to do this would be something like this
(if only the "skip" allowed vectors in the same way colClasses does in
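skip only accepts a single number, but scan() on an open connection continues from where the previous read stopped, so one hedged sketch (big.txt is a placeholder) is to skip forward incrementally:
con <- file("big.txt", open = "r")
wanted <- seq(5, 99, 2)
rows <- vector("list", length(wanted))
pos <- 0
for (i in seq_along(wanted)) {
  rows[[i]] <- scan(con, what = numeric(), nlines = 1,
                    skip = wanted[i] - pos - 1, quiet = TRUE)  # skip the unwanted lines in between
  pos <- wanted[i]
}
close(con)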
2011 May 28
3
Changing the name of the "R" process in top
Hi all,
Perhaps this is more of a unix question, but I'll give it a try here.
I am running 9 different R processes at the same time (called from a
shell script using R CMD BATCH). When I use the top program to
monitor how they are doing, it is impossible to tell which R process
is related to which R script. Is there a way to rename a specific
instantiation of an R process in top with
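Renaming the process itself is a shell-level matter, but a small R-side workaround sketch is to have each script record its own PID so top's output can be matched back to the script (the file name is illustrative):
cat(Sys.getpid(), "\n", file = "myscript1.pid")   # written at the top of each batch script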
2010 Dec 02
1
The behaviour of read.csv().
I have recently been bitten by an aspect of the behaviour of
the read.csv() function.
Some lines in a (fairly large) *.csv file that I read in had
too many entries. I would have hoped that this would cause
read.csv() to throw an error, or at least issue a warning,
but it read the file without complaint, putting the extra
entries into an additional line.
This behaviour is illustrated by the toy
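A hedged pre-check (data.csv is a placeholder): count.fields() reports the number of fields on every line, so over-long lines can be spotted before read.csv() silently wraps them.
nf <- count.fields("data.csv", sep = ",")
which(nf != nf[1])    # lines whose field count differs from the first line's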
2011 May 30
3
ideas about how to reduce RAM & improve speed in trying to use lapply(strsplit())
hi all,
I'm full of questions today :). Thanks in advance for your help!
Here's the problem:
x <- c('18x.6','12x.9','302x.3')
I want to get a vector that is c('18x','12x','302x')
This is easily done using this code:
unlist(lapply(strsplit(x,".",fixed=TRUE),function(x) x[1]))
So far so good. The problem is that x is a vector
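A lighter-weight alternative sketch: a single vectorized sub() call avoids the intermediate list that strsplit()/lapply() builds.
sub("\\..*$", "", x)   # drops everything from the dot onward: "18x.6" -> "18x"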
2010 Feb 05
1
maximum elements in an ff object?
Hello all,
I hate to add to the daily queries regarding R's handling of large
datasets ;), but...
I read in an online powerpoint about the ff package something about
the "length of an ff object" needing to be smaller than
.Machine$integer.max. Does anyone know if this means that the # of
elements in an ff object must be < .Machine$integer.max [i.e., that ff
provides no help with
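For reference, the bound being asked about is simply 2^31 - 1; a quick check in R (this says nothing about ff's internals):
.Machine$integer.max               # 2147483647
.Machine$integer.max == 2^31 - 1   # TRUE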
2009 May 20
1
how to get remote ESS graphics to work?
Hi all,
My graduate student is logging onto my macpro and running R through
ESS in Aquamacs (with M-x ssh and then M-x ess-remote). Everything is
working fine until we get to graphing.
We are trying to give him the ability to look at graphics
interactively. The ESS manual is not too helpful: "If you run X11 (See
Section 13.3.2 [X11], page 68, X-windows) on both the local and remote
machines
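Two quick R-side checks that are often useful when X11 forwarding is involved (a diagnostic sketch, not ESS-specific):
capabilities("X11")      # the remote R must have X11 support compiled in
Sys.getenv("DISPLAY")    # should be set (e.g. "localhost:10.0") when ssh X forwarding is active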
2009 Jun 17
1
how to interpolate time series data with missingness
Hi all,
I have a vector, most of which is missing. The data is always
increasing, but may do so in jumps. I would like to interpolate the
NAs with 'best guesses', using something like filter(), which doesn't
work due to the NAs. Here is an example:
> x <- c(2,3,NA,NA,NA,3.2,3.5,NA,NA,6,NA)
> x
[1] 2.0 3.0 NA NA NA 3.2 3.5 NA NA 6.0 NA
I would like a function that
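One base-R sketch (only a best-guess fill, not the thread's answer): linear interpolation over the observed points with approx(), with rule = 2 carrying the last value across the trailing NA.
x <- c(2, 3, NA, NA, NA, 3.2, 3.5, NA, NA, 6, NA)
ok <- !is.na(x)
approx(which(ok), x[ok], xout = seq_along(x), rule = 2)$y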
2007 Nov 01
2
unable to install package ff
Hi all,
I've had one of my most miserable R weeks in memory. I'm trying to
deal with huge datasets (>1GB each) but am running up against those
pesky memory limits. The libraries filehash and g.data are not very
suitable for what I need. I haven't gotten into the sql thing yet.
Most recently I've been trying to install the new package ff (not yet
on the CRAN repository). I
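For a package that is not (yet) on CRAN, a source tarball can be installed directly; a sketch with a hypothetical file name:
install.packages("ff_1.0.tar.gz", repos = NULL, type = "source")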
2008 Feb 27
1
Bug in help(). (PR#10859)
There appears to be a bug in help() when there are multiple packages
attached
containing functions with the same name, and offline=TRUE.
Example:
library(mgcv)
library(gam)
If one simply does:
help(gam) # No ``offline=TRUE''
then the following message appears:
Help on topic 'gam' was found in the following packages:
Package Library
mgcv
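When several attached packages provide the same help topic, the ambiguity can be resolved explicitly with the package argument; a sketch:
help("gam", package = "mgcv")   # or package = "gam" for the other definition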
2017 Oct 02
2
fwrite() not found in data.table package
Hi all,
I used to use the fwrite() function in data.table but I cannot get it to work
now. The function is not in the data.table package, even though a help page
exists for it. My session info is below. Any ideas on how to get fwrite()
to work would be much appreciated. Thanks!
> sessionInfo()
R version 3.2.0 (2015-04-16)
Platform: x86_64-unknown-linux-gnu (64-bit)
Running under: Red Hat
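A hedged check: fwrite() only exists in newer data.table releases, so comparing the installed version and its exports usually pinpoints the problem.
packageVersion("data.table")                     # an old release will predate fwrite()
"fwrite" %in% getNamespaceExports("data.table")  # FALSE means the installed version lacks it
install.packages("data.table")                   # upgrading should make fwrite() available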
2008 Jul 27
1
64-bit R on Mac OS X 10.5.4
Hi Matt
Your method is the easiest way for me to install the 64-bit R. I followed the directions on your web site and then did the following:
R --arch=x86_64
source("http://bioconductor.org/biocLite.R")
biocLite(type = "source",lib = "/Library/Frameworks/R.framework/Versions/2.8/Resources/RLib64")
I got many errors and warnings which I copied to the attached file.
2012 Jul 30
1
how to sort huge (> 2^31 row) dataframes quickly
Hello all,
I have some genetic datasets (gzipped) that contain 6 columns and
upwards of 10s of billions of rows. The largest dataset is about 16 GB
on file, gzipped (!). I need to sort them according to columns 1, 2,
and 3. The setkey() function in the data.table package does this
quickly, but of course we're limited by R not being able to index
vectors with > 2^31 elements, and bringing
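One hedged strategy (file names hypothetical): split the data beforehand into chunks that each stay well under 2^31 rows, sort every chunk with data.table, write the sorted pieces back out, and finish with an external k-way merge.
library(data.table)
files <- sprintf("chunk_%02d.txt", 1:20)           # pre-split pieces of the big file
for (f in files) {
  dt <- fread(f)                                   # each chunk fits in a single table
  setkeyv(dt, c("V1", "V2", "V3"))                 # sorts by the first three columns
  fwrite(dt, sub("\\.txt$", "_sorted.txt", f))
}
# the sorted chunks can then be combined with an external k-way merge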
2012 Feb 21
1
tapply for enormous (>2^31 row) matrices
Hi all,
SETUP:
I have pairwise data on 22 chromosomes. Data matrix X for a given
chromosome looks like this:
1 13 58 1.12
6 142 56 1.11
18 307 64 3.13
22 320 58 0.72
Where column 1 is person ID 1, column 2 is person ID 2, column 3 can
be ignored, and column 4 is how much chromosomal sharing those two
individuals have in some small portion of the chromosome. There are
9000 individual people, and
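A hedged per-chromosome sketch (column names assumed to be fread's defaults V1..V4): aggregating one chromosome at a time keeps each table well under 2^31 rows, and data.table's grouped sum plays the role of tapply().
library(data.table)
X <- fread("chr22_pairs.txt")                        # placeholder per-chromosome file
sharing <- X[, .(total = sum(V4)), by = .(V1, V2)]   # total sharing per pair of person IDs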
2010 Feb 06
2
question about bigmemory: releasing RAM from a big.matrix that isn't used anymore
Hi all,
I'm on a Linux server with 48 GB of RAM. I did the following:
x <- big.matrix(nrow=20000,ncol=500000,type='short',init=0,dimnames=list(1:20000,1:500000))
#Gets around the 2^31 issue - yeah!
In Unix, when I run the "top" command, I see R is taking up about 18 GB of
RAM, even though the object x is 0 bytes in R. That's fine: that's how
bigmemory is supposed to
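A hedged sketch of releasing the memory once the matrix is no longer needed: drop every R reference to the big.matrix and trigger garbage collection so the backing allocation can be freed.
rm(x)    # remove the (only) handle to the big.matrix
gc()     # garbage collection lets the underlying memory be released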