thr3ads.net - R help - [R] 2 D density plot interpretation and manipulating the data [Oct 2020]

Ana Marija

2020-Oct-08 20:52 UTC

[R] 2 D density plot interpretation and manipulating the data

Hello,

I have a data frame like this:
> head(SNP)
               mean      var     sd
FQC.10090295 0.0327 0.002678 0.0517
FQC.10119363 0.0220 0.000978 0.0313
FQC.10132112 0.0275 0.002088 0.0457
FQC.10201128 0.0169 0.000289 0.0170
FQC.10208432 0.0443 0.004081 0.0639
FQC.10218466 0.0116 0.000131 0.0115
...

and I am creating a plot like this:

s <- ggplot(SNP, mapping = aes(x = mean, y = var))
s <- s +  geom_density_2d() + geom_point() + my.theme +
ggtitle("SNPs")
s

I am getting the attached plot.

My questions are:
1. How do I interpret inclusion versus exclusion within the elliptical contours?

2. How do I extract from my data frame the points that fall outside the ellipses?

Thanks
Ana

-------------- next part --------------
A non-text attachment was scrubbed...
Name: snps.pdf
Type: application/pdf
Size: 27821 bytes
Desc: not available
URL:
<https://stat.ethz.ch/pipermail/r-help/attachments/20201008/d35a5c66/attachment.pdf>

Ana Marija

2020-Oct-09 01:35 UTC

My understanding is that this represents bivariate normal
approximation of the data which uses the kernel density function to
test for inclusion within a level set. (please correct me)

To exclude the points that fall outside these ellipses/contours, is it
advisable to do something like this:

> SNP$density <- get_density(SNP$mean, SNP$var)
> summary(SNP$density)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      0     383     696     738    1170    1789

where get_density() is the function from here:
https://slowkow.com/notes/ggplot2-color-by-density/

and then do something like this:

a <- SNP[SNP$density > 400, ]

and plot it again:

p <- ggplot(a, mapping = aes(x = mean, y = var))
p <- p + geom_density_2d() + geom_point() + my.theme +
  ggtitle("SNPS_red")
p
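
For readers without access to the linked page, the filtering step described above can be sketched as follows. This is only a minimal sketch, assuming get_density() works by evaluating MASS::kde2d() on a grid and looking up the grid cell for each point. The simulated SNP data frame, the grid size n = 100, and the 10% cutoff are illustrative assumptions, not values from this thread. A quantile cutoff is used here rather than a fixed number such as 400, because the scale of the density depends on the data.

```r
library(MASS)  # provides kde2d()

# Simulated stand-in for the SNP data frame (hypothetical values)
set.seed(1)
SNP <- data.frame(mean = rexp(500, rate = 40))
SNP$var <- SNP$mean^2 * runif(500, min = 0.5, max = 2)

# Estimated density at each point: evaluate a 2-D kernel density estimate
# on an n x n grid, then look up the grid cell each point falls in
get_density <- function(x, y, n = 100) {
  dens <- MASS::kde2d(x, y, n = n)
  ix <- findInterval(x, dens$x)
  iy <- findInterval(y, dens$y)
  dens$z[cbind(ix, iy)]
}

SNP$density <- get_density(SNP$mean, SNP$var)

# Split on the 10th percentile of the per-point densities (assumed cutoff)
cutoff   <- quantile(SNP$density, probs = 0.10)
a        <- SNP[SNP$density > cutoff, ]   # points in the denser region
outliers <- SNP[SNP$density <= cutoff, ]  # points in the sparse region
```

The outliers data frame then holds the rows that would fall outside the corresponding density contour, up to the resolution of the grid.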

Abby Spurdle

2020-Oct-09 07:12 UTC

> My understanding is that this represents bivariate normal
> approximation of the data which uses the kernel density function to
> test for inclusion within a level set. (please correct me)

You can fit a bivariate normal distribution by computing five parameters:
two means, two standard deviations (or two variances), and one
correlation coefficient (or covariance).
The bivariate normal *has* elliptical contours.
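
As an illustration of the five-parameter fit and its elliptical contours, here is a minimal sketch in base R. The simulated data and the 95% coverage level are assumptions made for the example, not part of the original question. It relies on the standard result that, for bivariate normal data, the squared Mahalanobis distance follows a chi-squared distribution with 2 degrees of freedom.

```r
# Simulated stand-in data (hypothetical; not the SNP table from the thread)
set.seed(42)
x <- rnorm(1000, mean = 0.03, sd = 0.01)
y <- 0.5 * x + rnorm(1000, mean = 0, sd = 0.005)
xy <- cbind(x, y)

# The five parameters: two means, two standard deviations, one correlation
mu  <- colMeans(xy)
sds <- apply(xy, 2, sd)
rho <- cor(x, y)

# Covariance matrix rebuilt from the standard deviations and the correlation
Sigma <- diag(sds) %*% matrix(c(1, rho, rho, 1), nrow = 2) %*% diag(sds)

# Squared Mahalanobis distance; points with equal distance lie on one ellipse
d2 <- mahalanobis(xy, center = mu, cov = Sigma)

# Ellipse expected to contain about 95% of the points if the data are normal
inside <- d2 <= qchisq(0.95, df = 2)
mean(inside)            # observed share of points inside the ellipse
outside_pts <- xy[!inside, ]  # rows falling outside the ellipse
```

If the data are far from normal, the observed share can differ noticeably from 95%, which is itself a rough check on whether an elliptical region is a sensible summary.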

A kernel density estimate is usually regarded as an estimate of an
unknown density function.
Often they use a normal (or Gaussian) kernel, but I wouldn't describe
them as normal approximations.
In general, bivariate kernel density estimates do *not* have
elliptical contours.
But in saying that, if the data is close to normality, then contours
will be close to elliptical.

Kernel density estimates do not test for inclusion, as such.
(But technically, there are some exceptions to that).
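
One way to get an inclusion rule out of a kernel density estimate is to pick a single density level and treat the region above it as "inside". The sketch below is one possible approach, not something prescribed in this thread: it uses MASS::kde2d() and chooses the level whose region holds roughly 90% of the estimated density mass on the grid. The simulated two-cluster data, the grid size, and the 90% figure are all assumptions for illustration.

```r
library(MASS)  # provides kde2d()

# Simulated stand-in data (hypothetical; not the SNP table from the thread)
set.seed(7)
x <- c(rnorm(400, mean = 0, sd = 1), rnorm(100, mean = 4, sd = 0.5))
y <- c(rnorm(400, mean = 0, sd = 1), rnorm(100, mean = 3, sd = 0.5))

# Kernel density estimate on a regular 200 x 200 grid
kd <- kde2d(x, y, n = 200)

# Density level whose region holds about 90% of the estimated mass:
# sort grid densities from highest to lowest and accumulate their share
z_sorted <- sort(as.vector(kd$z), decreasing = TRUE)
share    <- cumsum(z_sorted) / sum(z_sorted)
level    <- z_sorted[which(share >= 0.90)[1]]

# Approximate the density at each observation by its grid cell
dens_at_points <- kd$z[cbind(findInterval(x, kd$x), findInterval(y, kd$y))]

# Observations with density below the level lie outside the region
outside <- dens_at_points < level
mean(outside)  # share of observations flagged as outside
```

The same level can be passed to contour(kd, levels = level) to draw the boundary. With clustered data like this, the boundary is generally made of closed curves that are not ellipses.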

I'm not sure what you're trying to achieve here.
