Displaying 20 results from an estimated 2000 matches similar to: "how to separate char and num within a variable"
2011 Feb 01
3
Matching patients
May I ask a clinical question? For a trial, we have a treatment group of small
size, say 30 patients. We want to select matching control patients from a bigger
group (100 patients) in terms of several clinical variables, such as age, tumor
stage etc. This practice is to select the closest matching set of control cases.
I wonder if R has any routine or package that can help with this problem.
2009 Feb 11
2
error in my previous message
i'm sorry. i had an error in my previous code because i left out a
letter in the rownames.
while fixing that, i also found a solution. so i'm sorry for the
confusion.
below is my fix.
temp2 <- matrix(rnorm(10),nc=1,nrow=10)
rownames(temp2) <-
2012 Jun 06
2
how to remove part of the string
Dear all,
Does anyone know how to remove part of a string?
For example, "LTA4H||Leukotriene A4 hydrolase" is a gene name plus gene description. I hope to remove "||Leukotriene A4 hydrolase". What would be the R code to do that using gsub()? Many thanks!
Bill
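One possible approach, offered as a sketch rather than the answer given in the thread: "|" is a regular-expression metacharacter, so it has to be escaped (or matched with fixed = TRUE) before gsub() treats it literally. The variable names x, gene and gene_fixed are assumptions made for illustration.

```r
# Example string taken from the question; variable names are illustrative.
x <- "LTA4H||Leukotriene A4 hydrolase"

# Escape both pipes and drop everything from "||" to the end of the string.
gene <- gsub("\\|\\|.*$", "", x)

# Alternative: remove one known literal suffix, with no regex interpretation.
gene_fixed <- gsub("||Leukotriene A4 hydrolase", "", x, fixed = TRUE)

print(gene)
print(gene_fixed)
```

The regex form generalises to a whole vector of "name||description" strings, while the fixed = TRUE form only removes that one exact suffix.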
2011 Jul 27
3
Ordinary Least Products regression in R
Dear all,
Does anyone know if any R package or function can do Ordinary Least Products regression? Many thanks!
Bill
2009 Feb 11
2
sorting a matrix by the column
this is a bad question but I can't figure it out and i've tried. if i
sort the 2-column matrix, temp1, by the first column, then things work
as expected. But if I sort the 1-column matrix, temp2, then it gets
coerced to a vector. I realize that I need to use drop=FALSE but i've
put it in a few different places with no success. Thanks.
temp1 <-
2011 Jun 07
1
extract data from a data frame field
Hi all,
I am given a data frame in which one of the columns holds several pieces of
information together; see column 4, peak_loc:
chr start end peak_loc cluster_TC strand peak_TC
1 chr1 564620 564649 chr1:564644..564645,+ 94 + 10
2 chr1 565369 565404 chr1:565371..565372,+ 217 + 8
3 chr1 565463 565541 chr1:565480..565481,+ 1214 + 15
4 chr1
2017 Sep 04
1
Merge by Range in R
Hi,
I have two big data sets.
data_1:
> dim(data_1)
[1] 15820 5
> head(data_1)
   Chromosome  Start    End   Feature GroupA_3
1:       chr1 521369 750000 chr1-0001    0.170
2:       chr1 750001 800000 chr1-0002   -0.086
3:       chr1 800001 850000 chr1-0003    0.006
4:       chr1 850001 900000 chr1-0004
2010 Apr 29
1
Using plyr::dply more (memory) efficiently?
Hi all,
In short:
I'm running ddply on an admittedly (somehow) large data.frame (not
that large). It runs fine until it finishes and gets to the
"collating" part where all subsets of my data.frame have been
summarized and they are being reassembled into the final summary
data.frame (sorry, don't know the correct plyr terminology). During
collation, my R workspace RAM usage goes
2013 Feb 03
1
Adding complex new columns to data frame depending on existing column
Hello
I have a data frame as below
V1 V2 V3 V4 V5 V6
chr1 18884 C CAAAA 2 0
chr1 135419 TATACA T 2 0
chr1 332045 T TTG 0 2
chr1 453838 T TAC 2 0
chr1 567652 T TG 1 0
chr1 602541 TTTA T 2 0
on which I want to perform complex rearrangement such that:
if V3 is a string >1 (i.e line 2) then I
2008 Feb 06
4
inserting text lines in a dat frame
Hi Jim
I am trying to prepare a bed file to load as a custom track on the UCSC genome browser.
I have a data frame that looks like the one below.
> x
V1 V2 V3
1 chr1 11255 55
2 chr1 11320 29
3 chr1 11400 45
4 chr2 21680 35
5 chr2 21750 84
6 chr2 21820 29
7 chr2 31890 46
8 chr3 32100 29
9 chr3 52380 29
10 chr3 66450 46
I would like to insert the following 4 lines at the beginning:
2008 Feb 04
1
counting identical data in a column
Hi Peter
I have the following data frame with chromosome name, start and end positions:
chrN start end
1 chr1 11122333 11122633
2 chr1 11122333 11122633
3 chr3 11122333 11122633
8 chr3 111273334 111273634
7 chr2 12122334 12122634
4 chr1 21122377 21122677
5 chr2 33122355 33122655
6 chr2 33122355 33122655
I would like to count the positions that have the same start and
2012 Mar 04
1
Intersection of two chromosomal ranges
Hi,
I want to merge multiple chromosomal regions based on their common
intersecting regions. I tried couple of things using while and if loops but
did not work out.
I would appreciate if anyone could provide me a small piece of code in R to
get the intersection of following example:
chr1: 100-150
chr1: 79-250
chr1: 100-175
chr1: 300-350
I want the intersection of all four regions as follow:
2011 Jun 08
1
return counts of elements on a table column depending on elements on another column
Hi,
I am given the following table:
> head(hsa_refseq)
chr genome region start stop nu strand nu.1 nu.2
gene_id
1 chr1 hg19_refGene CDS 67000042 67000051 0 + 0 gene_id
NM_032291
2 chr1 hg19_refGene exon 66999825 67000051 0 + . gene_id
NM_032291
3 chr1 hg19_refGene CDS 67091530 67091593 0 + 2 gene_id
NM_032291
4 chr1 hg19_refGene exon
2011 Jan 31
1
how to search to value to another table
Hello,
I'm a new R user.
I have two different dummy tables with the variable name tb1 and tb2.
tb1<
v1 v2 v3 v4
"chr1" 22 23 3
"chr1" 36 37 1
"chr1" 54 55 0
"chr1" 77 78 1
"chr2" 80 81 4
"chr2" 85 86 0
"chr2" 99 100 1
2012 Jul 02
1
apply with multiple conditions
Hello all,
I have written a for loop to act on a dataframe with close to 3 million rows
and 6 columns and I would like to pass it to apply() to speed the process up
(I let the loop run for 2 days before stopping it and it had only gone
through 200,000 rows) but I am really struggling to find a way to pass the
arguments. Below are the loop and the head of the dataframe I am working on.
Any hints
2011 Jun 27
1
create a new data frame after comparing two columns of the previous data frame
Hi everyone,
I am trying to find a way to filter a table; If I am given for example the
following table:
> head(intra)
chr miRNA start end strand ACC hsa_ID
region region_start region_end gene_id transcrip_id
1 chr1 miRNA 1102484 1102578 + ACC="MI0000342"; ID="hsa-mir-200b";
exon 1102484 1102578 NR_029639 NR_029639
2 chr1
2008 Mar 11
2
persp question
someone sent in a question earlier about doing
something in 3D so i took a stab at it purely
for educational purposes (I'm not even sure that I understood the question, actually).
Unfortunately, persp gives me an error that I don't understand: it says "object y not found". I'm sending y in as a parameter to persp, similar to what ?persp shows in one of its examples
2011 Jun 02
1
an efficient way to calculate correlation matrix
Dear all,
I have a problem. I have m variables each of which has n observations. I want to
calculate pairwise correlation among the m variables and store the values in a m
x m matrix. It is extremely slow to use nested 'for' loops if m and n are large.
Is there any efficient alternative to do this? Many thanks for your
suggestions!!
Bill
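A minimal sketch of one common alternative to the nested loops, assuming the m variables are stored as the columns of an n x m numeric matrix (the name X, the sizes, and the simulated data are assumptions, not taken from the post). Base R's cor() computes all pairwise correlations in one vectorised call.

```r
set.seed(42)                      # reproducible simulated data
n <- 200                          # observations per variable (assumed)
m <- 6                            # number of variables (assumed)
X <- matrix(rnorm(n * m), nrow = n, ncol = m)

# cor() on a matrix returns the m x m matrix of pairwise Pearson correlations.
R <- cor(X)

# Spot-check one entry against the two-vector form of cor().
check <- all.equal(R[1, 2], cor(X[, 1], X[, 2]))

print(dim(R))
print(check)
```

If the data contain missing values, the use argument of cor() (for example use = "pairwise.complete.obs") controls how they are handled.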
2016 Apr 05
2
Is that an efficient way to find the overlapped , upstream and downstream ranges for a bunch of ranges
I have a bunch of genes (nearly 50000) from the whole genome, which are read in as genomic ranges.
A range (gene) can be seen as an observation with three columns, chromosome, start and end, like this:
seqnames start end width strand
gene1 chr1 1 5 5 +
gene2 chr1 10 15 6 +
gene3 chr1 12 17 6 +
gene4 chr1 20 25 6 +
gene5
2011 Dec 06
1
warning for inefficiently compressed datasets
Hi,
Recently added to doc/NEWS.Rd:
'R CMD check' now gives a warning rather than a note if it finds
inefficiently compressed datasets. With 'bzip2' and 'xz' compression
having been available since R 2.10.0, there is no excuse for not
using them.
Why isn't a note enough for this?
Generally speaking, warnings are for things that are dangerous,
or unsafe,