Displaying 20 results from an estimated 7000 matches similar to: "question about differences in behavior with NA subscripts in matrix vs. data.frame"
2009 Nov 12
2
package "tm" fails to remove "the" with remove stopwords
I am using code that previously worked to remove stopwords using package
"tm". Even manually adding "the" to the list does not work to remove "the".
This package has undergone extensive redevelopment with changes to the
function syntax, so perhaps I am just missing something.
Please see my simple example, output, and sessionInfo() below.
Thanks!
Mark
require(tm)
2009 Dec 08
1
problem with split eating giga-bytes of memory
I'm having trouble using split on a very large data-set with ~1400 levels of
the factor to be split. Unfortunately, I can't reproduce it with the simple
self-contained example below. As you can see, splitting the artificial
dataframe of size ~13MB results in a split dataframe of ~144MB, a ~10-fold
increase in memory allocation for the split object. If split scales
linearly, then my
2009 Dec 10
0
R CMD SHLIB requesting makefile. Is a makefile required?
A few years ago I used the following to compile a shared object that I
wanted to call from R and it worked just fine.
R CMD SHLIB -o ~/my_C/R.shared.so/cocite.mat.so cocite.mat.c
Now when it is executed, I receive the following error message:
make: *** No rule to make target `cocite.mat.o', needed by
`/home/mkimpel/my_C/R.shared.so/cocite.mat.so'. Stop.
I've consulted R CMD SHLIB
2009 Aug 28
1
problems with strsplit using a split of ' \\\ ' : a regex problem
I have a vector of gene symbols, some of which have multiple aliases. In the
case of an alias, they are separated by ' \\\ '.
Here is a real world example, which would represent one element of my
vector:
Eif4g2 /// Eif4g2-ps1 /// LOC678831
What I would like to do is input the vector into a function and output a
vector with just the first alias of each element (or, if there are no
aliases,
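The question above is cut off, but the stated goal is to keep only the first alias of each element. A minimal sketch, assuming the separator is the literal string " /// " shown in the example; the vector name `symbols` and its second element are made up for illustration:

```r
# Hypothetical input: one element with aliases, one without.
symbols <- c("Eif4g2 /// Eif4g2-ps1 /// LOC678831", "Actb")

# fixed = TRUE treats " /// " as a literal string, so no regex escaping is needed.
pieces <- strsplit(symbols, " /// ", fixed = TRUE)

# Keep the first piece of every element; elements without aliases pass through unchanged.
first_alias <- vapply(pieces, function(p) p[1], character(1))
print(first_alias)
```

Using a fixed separator sidesteps the escaping problem in the title; a regular expression would work as well but needs more care.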
2008 Sep 24
1
splitting strings efficiently
I have a very long list of strings. Each string actually contains multiple
values separated by a semi-colon. I need to turn each string into a vector
of the values delimited by the semi-colons. I know I can do this very
laboriously by using loops, nchar, and substr, but it is terribly slow. Is
there a basic R function that handles this situation? If not, is there
perhaps a faster way to do it than
2009 Oct 12
3
help with the use of mtext to create main title over multiple plots
I'm trying to use mtext to create a main title over multiple plots. Below is
a simple self-contained example and my sessionInfo (I should note I've also
tried this with R-2.8.1 with the same results). When I execute the code
chunk below, I get the plots, but no title. I've tried this using the screen
driver, pdf, and postscript. I've used different sizes of paper. I suspect I
am
2008 Feb 07
1
help with R rendering engine
I'm doing some work on a potential patch to the Bioconductor package
Rgraphviz and have some questions on code that is contained in engine.c.
In particular, I am developing some custom shapes using polygon and need
to make sure that, with rendering, the line connecting the centers of
two polygons stops at the border of each polygon. The polygons can be
transparent, so the option of just
2008 Oct 17
1
how to list variables enclosed in an environment
I'm having trouble with a Bioconductor package, a variable expected in an
environment does not seem to be there. As part of my investigation of the
problem (most likely on my end) I'd like to list the variables contained in
an environment. If you have an environment loaded, let's call it "pkgEnv",
how does one find what it does contain? Mark
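One common way to inspect an environment is base R's `ls()`. A minimal sketch, assuming an environment built by hand here as a stand-in for the package's "pkgEnv"; the variable names inside it are hypothetical:

```r
# Stand-in for the environment from the question; contents are made up.
pkgEnv <- new.env()
assign("alpha", 1, envir = pkgEnv)
assign(".hidden", 2, envir = pkgEnv)

# Names of ordinary variables in the environment.
visible <- ls(pkgEnv)

# all.names = TRUE also lists names that start with a dot.
everything <- ls(pkgEnv, all.names = TRUE)

# Check for one expected variable without searching enclosing environments.
has_beta <- exists("beta", envir = pkgEnv, inherits = FALSE)

print(visible)
print(everything)
print(has_beta)
```

Setting `inherits = FALSE` matters for this kind of debugging, because otherwise `exists()` may find a same-named object elsewhere on the search path.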
2009 Jul 04
4
help with dealing with integer(0) returns from grep used within a conditional loop
I am using grep to locate colnames to automate a report build and have
run into a problem when a colname is not found. The use of integer(0)
in a conditional statement seems to be a no-no as it has length 0.
Below is a self-contained trivial example. I would like to get
something like "NA" or -1 for the position when it is not found OR
learn a way to use integer(0) or some
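The snippet is truncated, but the request for an NA position when nothing matches can be sketched by testing the length of the `grep` result. The helper `find_col` and the column names below are hypothetical, not taken from the original post:

```r
cols <- c("id", "score", "group")   # hypothetical column names

# Return the first matching position, or NA when grep finds nothing.
find_col <- function(pattern, names) {
  pos <- grep(pattern, names)
  if (length(pos) == 0) NA_integer_ else pos[1]
}

found  <- find_col("^score$", cols)
absent <- find_col("^weight$", cols)

# is.na() yields a length-one logical, so it is safe inside if().
if (is.na(absent)) message("column not found")
```

For exact names rather than patterns, `match("weight", cols)` returns NA directly when there is no match.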
2008 Mar 12
2
subset list based on logical within element flag
I have a very long list that I'd like to subset based on a logical value
within each element. Example below. I'd like to get just those list
elements for further study whose $sig.cor slot is TRUE. In this example,
I'd only want element [[2]].
Should be simple, I know. How can I do this? Thanks, Mark
> gene.pair.tf.lst
[[1]]
[[1]]$gene.pair
[1] "Lgals1:Pxmp2"
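The printed list above is cut off, so the following is only a sketch of one way to keep the elements whose `$sig.cor` is TRUE. The list is rebuilt by hand; the second gene pair and both `sig.cor` values are assumptions chosen to match the description that only element [[2]] is wanted:

```r
# Hand-built stand-in for the list in the question (values are assumed).
gene.pair.tf.lst <- list(
  list(gene.pair = "Lgals1:Pxmp2", sig.cor = FALSE),
  list(gene.pair = "geneA:geneB",  sig.cor = TRUE)
)

# One logical per element; isTRUE() also guards against NA or a missing slot.
is_sig <- vapply(gene.pair.tf.lst, function(el) isTRUE(el$sig.cor), logical(1))

# Logical subsetting with single brackets keeps the result a list.
sig.lst <- gene.pair.tf.lst[is_sig]
print(length(sig.lst))
```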
2009 Jul 21
1
problem with heatmap.2 in package gplots generating non-finite breaks
I have written a wrapper for heatmap.2 called
heatmap.w.row.and.col.clust which auto-generates breaks using
breaks<-round((c(seq(from=(-20 * stddev), to=(20 * stddev))))/20,
digits = 2) #(stddev in this case = 2.5)
This has always worked well in the past but now I am getting an error
that non-finite breaks are being generated. Drilling down, it seems
that my wrapper is generating finite
2008 Mar 13
3
fast way to compare two matrices of combinations
I have a list (length 750), each element containing a vector of unique
strings (unique gene ids), with length up to ~40 (median 15). I want to
compile a matrix of all possible triplets and their frequency within
gene elements. Using combn and a lot of looping, I am accomplishing this
but it is VERY slow.
I've tried to figure out a way to vectorize this, using "match" and
2008 Mar 05
1
connectivity measure for graph nodes
I am doing some work with Rgraphviz, a Bioconductor package, but since my
question is of a more general nature, thought I would send to this list
in hopes that a graph theory expert could answer my question.
I wish to do some statistics on node-node relationships. In particular,
I want to see if two connected nodes share a common property. I believe
that the more "connected" the two
2008 May 06
3
rggobi is crashing R-2.7.0
I am running 64-bit Ubuntu 8.04 and when I invoke rggobi the interactive
graph displays but R crashes. See my sessionInfo() and a short example
below. Ggobi and rggobi installed without complaints. Mark
> sessionInfo()
R version 2.7.0 Patched (2008-05-04 r45620)
x86_64-unknown-linux-gnu
locale:
2008 Oct 07
1
using assign with lists
I am performing many permutations on a data-set with each permutation
producing a variable number of results. I thought that the best way to keep
track of all this in one object would be with a list ('res.lst'). To address
these variable results for each permutation I attempted to construct this
list using 'assign'. There is even more nesting than indicated below, but
this is a
2008 Mar 05
4
vertex labels in igraph from adjacency matrix
I am getting some unexpected results from some functions of igraph and
it is possible that I am misinterpreting the vertex numbers. E.g., the
max betweenness measure seems to be from a vertex that is not connected
to a single other vertex. Below is my code snippet:
require(igraph)
my.graph <- graph.adjacency(adjmatrix = my.adj.matrix, mode=c("undirected"))
most.between.vert <-
2007 Dec 10
2
00LOCK error with site-library
I have identical .Rprofile files and R_HOME directories set up on both my
local machine and a remote linux cluster. To keep my libraries and R
install separate, I use a site-library on both machines.
The first line of my .Rprofile is:
'.libPaths(new= "~/R_HOME/site-library") #tell R where site-library is'
Until R-2.6.0 this was working fine on both machines, but since I have
been
2008 Jan 22
0
calling MPI parallel C code from R
R developers:
I have some parallel C code, written with the MPI library, that I would
like to call from R, but I get the error message below.
cocite.mat.true.parallel.so compiles without complaint and I have MPI in
my path, but R isn't recognizing one of the MPI symbols.
I have a feeling that this is going to be tricky, but I need to get this
to work. Helpful suggestions? I'd be
2010 Mar 07
2
vectorizing ANOVA over a vectorized linear model
Is it possible to vectorize anova over the output of a vectorized lm? I
have a gene expression matrix with each row being a gene and columns for
samples. There are several factors with interactions. I can get p values by
looping over the matrix with lm and anova, but I would like to make this as
computationally efficient as possible. I am able to vectorize the lm
command, but when I try to use
2010 Oct 20
2
ascii or regex code for alt-enter for Excel
I need to write a table that can be opened in Excel or OpenOffice such that
there are newlines embedded within cells.
After much Googling and futzing, I can't figure out how to do this. The way
to do this within Excel is alt-Enter and I've tried '/n', '/n/r', '/r/n' per
some web suggestions without luck.
Anybody know what character or ASCII code to use for this?
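In R the newline escape uses a backslash, "\n" (line feed, ASCII 10), not a forward slash. A minimal sketch, assuming the spreadsheet program treats a line feed inside a quoted CSV field as an in-cell line break; the data frame and its column names are made up:

```r
# Hypothetical table; the first note contains an embedded line feed.
df <- data.frame(
  id   = 1:2,
  note = c("first line\nsecond line", "single line"),
  stringsAsFactors = FALSE
)

# write.csv quotes character fields by default, which keeps the embedded
# newline inside a single field.
out <- tempfile(fileext = ".csv")
write.csv(df, out, row.names = FALSE)

# The file has more physical lines than the table has rows.
n_lines <- length(readLines(out))
print(n_lines)
```

Whether the break is displayed may also depend on the cell's wrap-text setting in the spreadsheet program.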