Displaying 20 results from an estimated 6000 matches similar to: "Statistical Histograms in R"
2006 Mar 26
1
Newbie clustering/classification question
My laboratory is measuring the abundance of various proteins in the
blood from either healthy individuals or from individuals with various
diseases. I would like to determine which proteins, if any, have
significantly different abundances between the healthy and diseased
individuals. Currently, one of my colleagues is performing an ANOVA on
each protein with MS Excel. I would like to analyze
2024 Apr 16
5
read.csv
Dear R-developers,
I came to a somewhat unexpected behaviour of read.csv() which is trivial but worthwhile to note -- my data involves a protein named "1433E" but to save space I drop the quote so it becomes,
Gene,SNP,prot,log10p
YWHAE,13:62129097_C_T,1433E,7.35
YWHAE,4:72617557_T_TA,1433E,7.73
Both read.csv() and readr::read_csv() consider prot(ein) name as (possibly confused by
2024 Apr 16
1
read.csv
At 11:46 on 16/04/2024, jing hua zhao wrote:
> Dear R-developers,
>
> I came to a somewhat unexpected behaviour of read.csv() which is trivial but worthwhile to note -- my data involves a protein named "1433E" but to save space I drop the quote so it becomes,
>
> Gene,SNP,prot,log10p
> YWHAE,13:62129097_C_T,1433E,7.35
> YWHAE,4:72617557_T_TA,1433E,7.73
>
2009 Jul 10
2
predict.glm -> which class does it predict?
Hi,
I have a question about logistic regression in R.
Suppose I have a small list of proteins P1, P2, P3 that predict a
two-class target T, say cancer/noncancer. Let's further say I know that I
can build a simple logistic regression model in R
model <- glm(T ~ ., data=d.f(Y), family=binomial) (Y is the dataset of
the Proteins).
This works fine. T is a factored vector with levels cancer,
2024 Apr 16
1
read.csv
Gene names being misinterpreted by spreadsheet software (read.csv is
no different) is a classic issue in bioinformatics. It seems like
every practitioner ends up encountering this issue in due time. E.g.
https://pubmed.ncbi.nlm.nih.gov/15214961/
https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-1044-7
https://www.nature.com/articles/d41586-021-02211-4
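As a hedged illustration of the read.csv case discussed in this thread: one way to keep identifiers such as 1433E as text is to declare the column type explicitly. The sketch below assumes the column is called prot, as in the example data from the thread; the names csv_text and d are made up for illustration.

```r
# Example rows copied from the thread; csv_text and d are illustrative names.
csv_text <- "Gene,SNP,prot,log10p
YWHAE,13:62129097_C_T,1433E,7.35
YWHAE,4:72617557_T_TA,1433E,7.73"

# Declare prot as character so it never goes through automatic type conversion.
# Columns not named in colClasses are still converted as usual.
d <- read.csv(text = csv_text, colClasses = c(prot = "character"))

str(d$prot)
```

Setting colClasses = "character" for the whole file and converting the numeric columns afterwards is a stricter variant of the same idea.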
2012 Aug 10
3
Parsing large XML documents in R - how to optimize the speed?
Hello everyone,
I would like to parse very large xml files from MS/MS experiments and
create R objects from their content. (By very large, I mean going up to
5-10Gb, although I am using a 'small' 40M file to test my code.)
My first attempt at parsing the 40M file, using the XML package, took more
than 2200 seconds and left me quite disappointed.
I managed to cut that down to around 40
2009 Sep 16
1
expression
Dear all,
I would be very thankful if you could tell me the right way to write:
mtext(paste(expression("R"^2),round(marco2[1,i],digits=3)," N° of proteins:",marco3[i]),side=4,cex=.6)
in this case the output is:
"R"^2
I tried also in this way:
mtext(paste(expression(paste("R"^2)),round(marco2[1,i],digits=3)," N° of
2009 Aug 21
1
LASSO: glmpath and cv.glmpath
Hi,
perhaps you can help me to find out, how to find the best Lambda in a
LASSO-model.
I have a feature selection problem with 150 proteins potentially
predicting Cancer or Noncancer. With a lasso model
fit.glm <- glmpath(x=as.matrix(X), y=target, family="binomial")
(target is 0, 1 <- Cancer non cancer, X the proteins, numerical in
expression), I get following path (PICTURE
2011 Jun 07
1
ggplot 2: Histogram with bell curve?
I am learning ggplot2 commands specifically qplot for the time being and I
have figured out how to create histograms and normal density curves but I am
not sure how to add a normal bell curve or other dist. as well on top of a
histogram.
Here are the two graphs that I created.
## Histogram
t<-rnorm(500)
w<-qplot(t, main="Normal Random Sample", fill=I("blue"),
2012 Jul 23
3
How to do the same thing for all levels of a column?
Dear all,
I am an R beginner, and I am looking for a way to do the same thing for all
levels of a column in a table.
Basically, I have a bunch of protein sequences composed of different amino
acid residues, and each residue is represented by an uppercase letter. I
want to calculate the ratio of different amino acid residues at each
position of the proteins. Here is an example table:
Proteins
2018 May 03
3
Package for Molecular Properties
All
Is there a package or library that will, given a nucleotide sequence
1. calculate the extinction coefficient at 260 nm for (Beer-Lambert's law)
2. calculate molecular weight
3. return its complementary sequence
I was able to find several packages that can do similar calculations for an amino acid sequence for proteins but none for nucleic acids.
Any pointers, etc. would be
2012 Mar 08
4
Correlation between 2 matrices but with subset of variables
Dear All,
I have two matrices A (40 x 732) and B (40 x 1230) and would like to calculate correlation between them. I can use: cor(A,B, method="pearson") to calculate correlation between all possible pairs. But the issue is that there is one-many specific mappings between A and B and I just need to calculate correlations for those pairs (not all). Some variables in A (proteins, say p1)
2009 Jul 14
2
hi friends, is there any wait function in R
hi,
Is there any wait function in R? I am running an R script that plots
many graphs in a for loop. It shows no error, but it is not
plotting well. I think I can solve this problem with a wait function.
Please help me in this regard. If you need any clarification about the
programme, you can find the script below.
best regards,
Deepak.M.R
Biocomputing Group
University of Bologna.
#!/usr/bin/R
2010 Aug 17
1
ROCR predictions
Hi everybody,
I am having a problem building a ROC curve with my data using the ROCR
package.
I have 10 lists of proteins such as attached (proteinlist.xls). each of the
lists was calculated with a different p-value.
The goal is to find the optimal p-value for the highest number of true
positives as well as lowest number of false positives.
As far as I understood the explanations from the
2003 Jan 25
7
Plotting coloured histograms...
Hi, I am having some trouble trying to plot a histogram in more than one
colour. What I want to do is, plot two vectors in the same histogram, but
with different colours, for instance:
> x <- rnorm(1000,20,4);
> y <- rnorm(1000,10,2);
Then I'd like to have x and y plotted on the same hist (I can do that
already doing w <- c(x,y) then hist(w)) but the bars
2010 May 20
2
Overlap of leaf labels
Hi,
I have tried looking at the archives but haven't found any answer that works
till now (Sorry if I have missed anything)
I am a newbie to R and I am trying to carry out hierarchical clustering
using hclust -> as.dendrogram and then plotting the results as a dendrogram
using the plot function plot(object).
My question is :
In the function "plot", can one decrease the leaf label
2018 May 03
0
Package for Molecular Properties
library(sos)
(mp <- findFn('{molecular properties}'))
** found 7 matches in 4 packages and opened two web pages in my
default browser with (a) the 7 matches and (b) the 4 packages. The first
function was something for amino acids, like you suggested. Two others
returned compound and substance information from PubChem.
Does this help?
Spencer
On
2009 Feb 03
1
pairs() help - colour histograms on diagonal
I'd like to be able to colour histograms along the diagonal using the colours
stored in colnames(d):
> d
black blue brown cyan
1 0.96405751 -0.02964390 -0.060147424 -0.06460070
2 -0.03614607 0.95475444 -0.152382053 -0.07767974
3 -0.07095613 -0.05884884 -0.061289399 -0.06445973
4 -0.03708223 -0.05997624
2007 Apr 16
2
Histograms of lots of variables
Hi R-helpers,
I wish to produce frequency histograms of all of the variables in my
dataframe (except some identifying variables).
I have tried
>hist(dataframe[,3:20])
to produce histograms of the 3rd through 20th variables in my dataframe, but
R doesn't like that.
Could anyone provide a suggestion?
Also, once I produce the histograms, I'd like to save them as graphic files
on my
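A minimal sketch of one possible approach to the question above, not taken from the thread: hist() expects a single numeric vector, so a loop over the columns can draw one histogram per variable and write each to its own file. The data frame dat, its column names, the choice of PDF output, and the use of tempdir() are all assumptions made for illustration.

```r
set.seed(1)

# Hypothetical data: two identifying columns followed by the variables to plot.
dat <- data.frame(
  id     = 1:50,
  site   = rep(c("A", "B"), 25),
  height = rnorm(50, mean = 170, sd = 10),
  weight = rnorm(50, mean = 70, sd = 8)
)

# Columns 3 onwards are the variables of interest (3:20 in the question).
vars <- names(dat)[3:ncol(dat)]

# One output file per variable; tempdir() is only a placeholder location.
out_files <- file.path(tempdir(), paste0("hist_", vars, ".pdf"))

for (i in seq_along(vars)) {
  pdf(out_files[i])                                   # open a file device
  hist(dat[[vars[i]]], main = vars[i], xlab = vars[i])
  dev.off()                                           # close it to write the file
}
```

Swapping pdf() for png() or jpeg() would give bitmap files instead, provided those devices are available in the local R build.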
2006 Jul 11
1
Query about getting averages across a certain parameter in a table
Hi
I have a table that goes
data
cluster_ac clockrate age class
7337 0.9 0.001 alpha_proteins
7888 0.1 0.78 beta proteins
etc
The class column can have 7-8 different unique values
While the clockrate and age columns are floats varying
from 0 to 1.
I wish to get the average clockrate across each of the
classes for this data.
I would appreciate your help
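A hedged sketch of one way to compute such per-class averages with base R's aggregate(). The first two rows are taken from the example table above; the third row and the data frame name d are invented for illustration so that one class has more than one member.

```r
# First two rows follow the example table; the third row is hypothetical.
d <- data.frame(
  cluster_ac = c(7337, 7888, 7999),
  clockrate  = c(0.9, 0.1, 0.5),
  age        = c(0.001, 0.78, 0.3),
  class      = c("alpha_proteins", "beta proteins", "alpha_proteins")
)

# Mean clockrate within each level of class.
avg <- aggregate(clockrate ~ class, data = d, FUN = mean)
print(avg)
```

tapply(d$clockrate, d$class, mean) returns the same averages as a named vector rather than a data frame.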