Displaying 20 results from an estimated 2000 matches similar to: "using the stepfun to plot histogram outline."
2008 May 21
1
problems with data frames, factors and lists
I have a function that creates a list based on some clustered data:
mix <- function(Y, pid) {
hc = gethc(Y,pid)
maxheight = max(hc$height)
noingrp = processhc(hc)
one = noingrp$one
two = noingrp$two
twoisone = "one"
if (two != 1)
twoisone = "more"
out = list(pid = pid,one = noingrp$one, two = noingrp$two, diff = maxheight, noseqs = length(hc$labels), twogrp = twoisone)
2008 Aug 05
1
xyplot key issue - line colors
I have a problem regarding the colors assigned to the lines in the key
to an xy plot. I specify the plot like this:
xyplot(numbers~sqrt(breaks)|moltype+disttype, groups = type, data = alldata,
layout = c(3,2), type = "l" , lwd = 2, col = c("gray", "skyblue"),
key = simpleKey(levels(alldata$type), points = FALSE, lines = TRUE,
columns = 2, lwd = 2,
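One common way to get the key to use the same line colors as the plot is to set the colors through lattice's `par.settings` rather than through `col`, and let `auto.key` build the legend. The sketch below is only an illustration under assumptions: the data are made up, the conditioning variables are dropped for brevity, and the level name "Within species" is invented because the excerpt does not show the second level of `type`.

```r
library(lattice)

# Made-up stand-in for 'alldata'; only the columns needed here.
set.seed(5)
alldata <- data.frame(
  breaks  = rep(1:10, 2),
  numbers = rpois(20, 50),
  type    = factor(rep(c("Between species", "Within species"), each = 10))
)

# Setting superpose.line makes both the panel lines and the key
# draw from the same color and width settings.
p <- xyplot(numbers ~ sqrt(breaks), groups = type, data = alldata,
            type = "l",
            par.settings = list(superpose.line =
                                  list(col = c("gray", "skyblue"), lwd = 2)),
            auto.key = list(points = FALSE, lines = TRUE, columns = 2))
print(p)
```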
2008 Feb 18
3
tabulation on dataframe question
I have a data frame with data similar to this:
NameA GrpA  NameB GrpB  Dist
A     Alpha B     Alpha 0.2
A     Alpha C     Beta  0.2
A     Alpha D     Beta  0.4
B     Alpha C     Beta  0.2
B     Alpha D     Beta  0.1
C     Beta  D     Beta  0.3
Dist is a distance measure between two entities. The table displays
all to all distances, but the
2007 Oct 01
3
"continuous" boxplot?
I have two vectors x and y, which I would like to plot against each
other. I am also displaying other data in this plot. However, I have
about 1 million points to plot, and just plotting them x against y is
not very informative. What I'd like to do is to do sort of a
continuous box plot.
My x values go from -1 to 1 and my y values from 0 to 1, so I'd like
to plot the median and quantiles,
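A possible sketch of such a "continuous box plot": cut x into bins, compute quantiles of y within each bin, and draw them as lines. The data are simulated stand-ins, and the number of bins (20) and the choice of quartiles are assumptions, not something the post specifies.

```r
# Simulated stand-in data; the original vectors are not shown in the post.
set.seed(1)
x <- runif(10000, min = -1, max = 1)
y <- rbeta(10000, 2, 5)            # values between 0 and 1

# Cut x into 20 equal-width bins (the bin count is an arbitrary choice).
breaks <- seq(-1, 1, length.out = 21)
bins <- cut(x, breaks = breaks, include.lowest = TRUE)

# Quartiles of y within each bin: a 3 x 20 matrix (rows: 25%, 50%, 75%).
q <- sapply(split(y, bins), quantile, probs = c(0.25, 0.5, 0.75))

# Median as a solid line, lower and upper quartiles as dashed lines.
mids <- (head(breaks, -1) + tail(breaks, -1)) / 2
matplot(mids, t(q), type = "l", lty = c(2, 1, 2), col = "black",
        xlab = "x", ylab = "y (quartiles per bin)")
```

With about a million points the same approach should still be cheap, since only the per-bin summaries are plotted.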
2009 Apr 15
1
performing function on data frame
Hi!
First, pardon me if this is a FAQ. I think I should be using some sort
of apply, but I am not managing to figure those out.
I have a data frame similar to this:
> d <- data.frame(x = LETTERS[1:5], y = rnorm(5), z = rnorm(5))
> d
x y z
1 A 0.1605464 -0.2719820
2 B -0.9258660 1.2623117
3 C -0.3602656 1.5470351
4 D 1.2621797 1.2996500
5 E 0.6021728 0.5027095
2008 Jun 05
5
vector comparison
I know this is fairly basic, but I must have somehow missed it in the
manuals.
I have two vectors, often of unequal length. I would like to compare
them for identity. The order of elements does not matter, but they should
contain the same elements.
I.e: I want this kind of comparison:
> if (1==1) show("yes") else show("blah")
[1] "yes"
> if (1==2) show("yes") else
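For an order-independent comparison like the one described, base R's `setequal()` is one candidate. Note that it also ignores duplicates; whether that is wanted is not clear from the excerpt, so the sketch below shows a stricter alternative as well. The vectors are made up.

```r
a <- c(3, 1, 2)
b <- c(2, 3, 1, 1)   # different order and length, same distinct values

# TRUE when both vectors contain the same distinct values.
same_set <- setequal(a, b)

# Stricter: same values with the same multiplicities, order ignored.
same_multiset <- identical(sort(a), sort(b))

if (same_set) show("yes") else show("blah")
```

Here `same_set` is TRUE and `same_multiset` is FALSE, because `b` contains the value 1 twice.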
2008 Feb 26
1
combine vector and data frame on field?
I have managed to create a data frame like this:
> tsus_same_mean[1:10,]
PID Grp Dist PercAln PercId
1 12638 Acidobacteria 0.000000000 1.0000000 1.0000000
2 87 Actinobacteria 0.000000000 0.9700000 0.9700000
3 92 Actinobacteria 0.008902000 1.0000000 0.9910000
4 94 Actinobacteria 0.000000000 1.0000000 1.0000000
5 189 Actinobacteria 0.005876733
2011 Mar 29
3
producing histogram-like plot
Hi!
I have a dataset that looks like this:
0.0 14
0.0 3
0.9 12
0.73 15
0.78 2
1.0 15
0.3 2
0.32 8
...and so on.
I.e. a value between 0 and 1, and a number
I would like to plot this in a histogram-like manner. I would like to
have a set of bins, each 0.1 wide, and plot the sum of values in column
2 that fall within each bin. I.e., in this case I would like the first
bin, 0.0, to have the
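One way to sketch this is with `cut()` to assign each value to a bin and `tapply()` to sum the second column per bin. The column names are invented, only the eight rows shown in the post are used, and putting 1.0 into the last bin is an assumption, since the excerpt is cut off before the desired bin edges are fully described.

```r
d <- data.frame(value = c(0.0, 0.0, 0.9, 0.73, 0.78, 1.0, 0.3, 0.32),
                count = c(14, 3, 12, 15, 2, 15, 2, 8))

# (0:10) / 10 rather than seq(0, 1, by = 0.1): the latter can produce
# break points such as 0.30000000000000004, which would misplace 0.3.
breaks <- (0:10) / 10

# Left-closed bins [0, 0.1), [0.1, 0.2), ..., with 1.0 included in the last.
bins <- cut(d$value, breaks = breaks, right = FALSE, include.lowest = TRUE)

# Sum of the second column per bin; empty bins come back as NA.
sums <- tapply(d$count, bins, sum)
sums[is.na(sums)] <- 0

barplot(sums, las = 2, ylab = "sum of column 2")
```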
2008 Mar 10
1
hclust graphics - plotting many points
Hello.
I have a distance matrix with lots of distances that I use hclust to
organise. I then plot the results using the plot method of hclust.
However, the plot itself takes around 20 mins to make due to there
being ~700 things in the matrix that I have distances for. I thus
would like to dump this to some graphics format which will let me
examine this further.
I tried dumping it to postscript:
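As an alternative to postscript, a wide PDF page is one option for a dendrogram with many leaves, since it can be zoomed without redrawing. This is only a sketch: the data are random, and the page size and `cex` value are guesses that would need tuning.

```r
# Random stand-in for ~700 objects; the real distance matrix is not shown.
set.seed(3)
m <- matrix(rnorm(700 * 5), ncol = 5)
hc <- hclust(dist(m))

# A very wide page leaves room for the leaf labels.
out <- file.path(tempdir(), "dendrogram.pdf")
pdf(out, width = 60, height = 10)
plot(hc, cex = 0.4)
dev.off()
```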
2007 Oct 11
1
creating summary functions for data frame
I have a data frame that looks like this:
> gctablechromonly[1:5,]
refseq geometry gccontent X60_origin X60_terminus length kingdom
1 NC_009484 cir 0.6799 1790000 773000 3389227 Bacteria
2 NC_009484 cir 0.6799 1790000 773000 3389227 Bacteria
3 NC_009484 cir 0.6799 1790000 773000 3389227 Bacteria
4 NC_009484 cir 0.6799
2010 Nov 10
1
plotting histograms/density plots in a triangular layout?
Hi!
I have a set of 49 pairwise comparisons that I have done. From this I
would like to plot either histograms or the density plots of the values
I get. Now, I can plot one histogram per comparison, but I have problems
getting the output I want. When plotting like I normally would do:
histogram(~percid | orgA_orgB, data = alldata)
I get the histograms next to each other in a boxlike shape.
2007 Sep 19
2
function on factors - how best to proceed
Sorry about this one being long, and I apologise beforehand if there
is something obvious here that I have missed. I am new to creating my
own functions in R, and I am uncertain of how they work.
I have a data set that I have read into a data frame:
> gctable[1:5,]
refseq geometry X60_origin X60_terminus length kingdom
1 NC_009484 cir 1790000 773000 3389227 Bacteria
2
2008 Apr 22
2
cloud plot has white(transparent?) background
I am using the code example from the R graph gallery to look at a
cloud plot:
require(lattice)
data(iris)
print(cloud(Sepal.Length ~ Petal.Length * Petal.Width, data = iris,
groups = Species, screen = list(z = 20, x = -70),
perspective = FALSE,
key = list(title = "Iris Data", x = .15, y=.85, corner = c(0,1),
border = TRUE,
2008 Jan 22
2
contingency table on data frame
I am sorry if this is a FAQ or tutorial somewhere, but I am unable to
solve this one.
What I am looking for is a count of how many different
categories (numbers in this case) appear for a given factor.
Example:
> l <- c("Yes", "No", "Perhaps")
> x <- factor( sample(l, 10, replace=T), levels=l )
> m <- c(1:5)
> y <- factor( sample(m, 10,
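A possible way to get such a count is to cross-tabulate the two factors and count the non-zero cells per row. The excerpt is truncated, so the `replace = TRUE` and `levels = m` arguments for `y` below are assumptions, as is the seed.

```r
set.seed(42)
l <- c("Yes", "No", "Perhaps")
x <- factor(sample(l, 10, replace = TRUE), levels = l)
m <- 1:5
y <- factor(sample(m, 10, replace = TRUE), levels = m)  # assumed completion

# Rows are levels of x, columns are levels of y.
tab <- table(x, y)

# Number of distinct y categories observed for each level of x.
n_distinct <- rowSums(tab > 0)
n_distinct
```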
2009 Aug 12
1
inserting into data frame gives "invalid factor level, NAs generated"
I am calculating some values that I am inserting into a data frame. From
what I have read, creating the dataframe ahead of time is more efficient,
since rbind (so far the only solution I have found to appending to a data
frame) is not very fast.
What I am doing is the following:
# create data frame
goframe = data.frame(goA = character(10), goB = character(10), value = numeric(10))
goframe[1,] =
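The "invalid factor level" warning typically appears when the character columns have been converted to factors, which was the `data.frame()` default before R 4.0.0. A minimal sketch of one way around it is to request character columns explicitly. The GO identifiers and the value below are placeholders, not taken from the post.

```r
n <- 10

# Keep goA and goB as character vectors instead of factors.
goframe <- data.frame(goA = character(n), goB = character(n),
                      value = numeric(n), stringsAsFactors = FALSE)

# Assigning a list keeps each column's type (placeholder values).
goframe[1, ] <- list("GO:0000001", "GO:0000002", 0.5)

str(goframe)
```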
2008 Jan 25
1
accessing the indices of outliers in a data frame boxplot
I have a data frame containing columns which are factors. I use this
to make boxplots for the data, with one box per factor. I would now
like to get at the data in the data frame which corresponds to the
outliers. I have so far found the $out, which gives "the values of any
data points which lie beyond the extremes of the whiskers", but I
haven't found anything which will let me get
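Besides `$out`, the object returned by `boxplot()` has a `$group` component giving the box each outlier belongs to, and the two together can be matched back to rows. The sketch below uses made-up data and column names, and it relies on exact value matching, which is an assumption that may need adjusting for real data.

```r
# Made-up data: two groups, with one planted outlier in each.
set.seed(7)
d <- data.frame(grp = factor(rep(c("a", "b"), each = 50)),
                val = c(rnorm(50), rnorm(50, mean = 3)))
d$val[c(10, 60)] <- c(8, -6)

# Compute the boxplot statistics without drawing.
b <- boxplot(val ~ grp, data = d, plot = FALSE)

# Flag rows whose value and group match an entry in b$out / b$group.
is_out <- logical(nrow(d))
for (k in seq_along(b$out)) {
  is_out <- is_out | (as.integer(d$grp) == b$group[k] & d$val == b$out[k])
}

outlier_rows <- d[is_out, ]
which(is_out)
```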
2007 Sep 27
1
problem loading hexbin associated package colorspace
I have lots of data that I need to display, and I think hexbin would
be good for it.
However, I cannot load one of the required packages associated with
the hexbin package:
> library(hexbin)
Loading required package: colorspace
Error in loadNamespace(package, c(which.lib.loc, lib.loc), keep.source = keep.source) :
in 'colorspace' methods for export not found: [, coords, plot
2008 Feb 20
1
clustering problem
First I just want to say thanks for all the help I've had from the
list so far..)
I now have what I think is a clustering problem. I have lots of
objects which I have measured a dissimilarity between. Now, this list
only has one entry per pair, so it is not symmetrical.
Example input:
NameA NameB Dist
189_1C2 189_1C1 0
189_1C3 189_1C1 0.017
189_1C3 189_1C2 0.017
189_1C4 189_1C1 0
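A pair list with one entry per pair can be turned into a symmetric matrix by filling both triangles, after which `as.dist()` and `hclust()` become usable. The sketch uses only the four rows visible in the excerpt, so some pairs are missing and stay `NA`; clustering is attempted only when the matrix is complete.

```r
pairs <- data.frame(
  NameA = c("189_1C2", "189_1C3", "189_1C3", "189_1C4"),
  NameB = c("189_1C1", "189_1C1", "189_1C2", "189_1C1"),
  Dist  = c(0, 0.017, 0.017, 0),
  stringsAsFactors = FALSE
)

# Square matrix indexed by object name, with zeros on the diagonal.
ids <- sort(unique(c(pairs$NameA, pairs$NameB)))
m <- matrix(NA_real_, length(ids), length(ids), dimnames = list(ids, ids))
diag(m) <- 0

# Fill both triangles so the matrix is symmetric.
m[cbind(pairs$NameA, pairs$NameB)] <- pairs$Dist
m[cbind(pairs$NameB, pairs$NameA)] <- pairs$Dist

# hclust() cannot handle missing distances, so only run it on complete input.
if (!anyNA(m)) {
  hc <- hclust(as.dist(m))
  plot(hc)
}
```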
2008 Jun 27
1
xyplot and separate abline per plot
Hello list!
I have a set of data like this:
> alldata[1:5,]
breaks numbers disttype moltype type
1 0.0000000 6598 Gapped Distances 5S Between species
2 0.4066667 0 Gapped Distances 5S Between species
3 0.8133333 5228 Gapped Distances 5S Between species
4 1.2200000 0 Gapped Distances 5S Between species
5 1.6266667 9702 Gapped
2007 Sep 25
5
Am I misunderstanding the ifelse construction?
I have a function like this:
changedir <- function(dataframe) {
dir <- dataframe$dir
gc_content <- dataframe$gc_content
d <- ifelse(dir == "-",
gc_content <- -gc_content,gc_content <- gc_content)
return(d)
}
The goal of this function is to be able to input a data frame like this:
> lala
dir gc_content
1 + 0.5
2 - 0.5
3 +
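A likely source of confusion here is that `ifelse()` is a vectorised function that returns a value, so the assignments inside it run on the whole vector as side effects rather than per element. A minimal sketch of the function without the inner assignments follows. The example data frame is illustrative, since the third row is cut off in the excerpt.

```r
changedir <- function(dataframe) {
  # Negate gc_content where dir is "-", keep it unchanged otherwise.
  ifelse(dataframe$dir == "-", -dataframe$gc_content, dataframe$gc_content)
}

# Illustrative input; the values are assumptions, not the poster's data.
lala <- data.frame(dir = c("+", "-", "+"),
                   gc_content = c(0.5, 0.5, 0.5))

changedir(lala)
```

For this input the result is 0.5, -0.5, 0.5.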