similar to: odd behavior of "summary" function

Displaying 20 results from an estimated 3000 matches similar to: "odd behavior of "summary" function"

2007 Oct 09
3
Summary vs fivenum results for Q3
I've just started using R and am still a neophyte, but I found the following curious result. I'm using the current version of R (2.5.1 (2007-06-27) ). Why are the results for the third quartile different in the output from the summary and fivenum commands? For the following data set 457 514 530 530 538 560 687 745 745 778 786 790 792
2010 Jan 22
2
Quartiles and Inter-Quartile Range
Why am I getting a wrong result for quartiles? here is my code: > cbiomass = c(910, 1058, 929, 1103, 1056, 1022, 1255, 1121, 1111, 1192, > 1074, 1415) > summary(cbiomass) > IQR(cbiomass) The result R gives me is: For the summary > Min. 1st Qu. Median Mean 3rd Qu. Max. 910 1048 1088 1104 1139 1415 For IQR > 91.25 ********* The true Q1 is 1039
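Both figures in this post can be reproduced, which suggests the gap comes from the quartile definition rather than from a miscalculation. A minimal sketch in R, assuming the poster's "true Q1" was computed as the median of the lower half of the sorted data (the snippet does not say how it was obtained):

```r
# Data exactly as given in the post.
cbiomass <- c(910, 1058, 929, 1103, 1056, 1022,
              1255, 1121, 1111, 1192, 1074, 1415)

# summary() and IQR() rely on quantile()'s default, type = 7,
# which interpolates linearly between order statistics.
q1_default  <- unname(quantile(cbiomass, 0.25))   # 1047.5, shown as 1048 by summary()
iqr_default <- IQR(cbiomass)                      # 91.25

# Assumption: the "true" Q1 is the median of the lower half.
# quantile(type = 2) and Tukey's lower hinge from fivenum() both
# follow that convention for this sample.
q1_type2 <- unname(quantile(cbiomass, 0.25, type = 2))   # 1039
q1_hinge <- fivenum(cbiomass)[2]                         # 1039
```

Neither value is wrong; quantile() documents nine definitions through its type argument, and summary() simply uses the default one.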
2007 Nov 18
2
Obtaining x-values from ECDF
Dear Group, I am using the ecdf function as follows: cawa.cdp <- ecdf(cawaocc$LEFF80) summary(cawa.cdp) Empirical CDF: 223 unique values with summary Min. 1st Qu. Median Mean 3rd Qu. Max. 0.07918 1.35700 1.68600 1.61000 1.91200 2.70000 I can see by the summary that the y-value for the 3rd quartile is 1.912. How can I obtain the x-value for a specified y-value (e.g., 0.8)?
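One way to read a value off an ECDF at a given height is to treat it as a quantile lookup. A minimal sketch in R, using made-up data in place of cawaocc$LEFF80 (which is not shown in the post); quantile() with type = 1 is the inverse of the empirical distribution function:

```r
# Hypothetical sample standing in for cawaocc$LEFF80.
x <- c(0.5, 1.2, 1.4, 1.7, 1.9, 2.1, 2.3, 2.7, 3.0, 3.4)
Fn <- ecdf(x)

# Smallest observed value whose cumulative proportion reaches 0.8.
x80 <- unname(quantile(x, probs = 0.8, type = 1))

# Check: evaluating the ECDF at that value gives at least 0.8.
Fn(x80)
```

Other type values interpolate between observations, so they can return a value that is not in the sample.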
2005 Apr 28
3
have to point it out again: a distribution question
Stock returns and other financial data have often been found to be heavy-tailed. Even Cauchy distributions (without even a first absolute moment) have been entertained as models. Your qq function subtracts numbers on the scale of a normal (0,1) distribution from the input data. When the input data are scaled so that they are insignificant compared to 1, say, then you get essentially the
2010 Jul 29
2
ggplot2 histograms... a subtle error found
Hello all, I have a peculiar and particular bug that I stumbled across with ggplot2. I cannot seem to replicate it with anything other than my specific data set. Here is the problem: - when I try to plot a histogram, allowing for ggplot2 to decide the binwidths itself, I get the following error: - stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to
2010 Jun 30
2
anyone know why package "RandomForest" na.roughfix is so slow??
Hi all, I am using the package "random forest" for random forest predictions. I like the package. However, I have fairly large data sets, and it can often take *hours* just to go through the "na.roughfix" call, which simply goes through and cleans up any NA values to either the median (numerical data) or the most frequent occurrence (factors). I am going to start
2011 Feb 08
4
manipulating the Date & Time classes
Hello, This is mostly to developers, but in case I missed something in my literature search, I am sending this to the broader audience. - Are there any plans in the works to make "time" classes a bit more friendly to the rest of the "R" world? I am not suggesting to allow for fancy functions to manipulate times, per se, or to figure out how to properly
2011 Feb 24
1
Boxplot not doing what I think it should
My box plot below is drawing its upper whisker all the way to the last point, instead of showing the point as an outlier. Am I misunderstanding, or is it a bug? Help(boxplot) states for the parameter "range" that "this determines how far the plot whiskers extend out from the box. If range is positive, the whiskers extend to the most extreme data point which is no more than range times the
2010 Jul 13
1
question regarding "varImpPlot" results vs. model$importance data on package "RandomForest"
Hi everyone, I have another "Random Forest" package question: - my (presumably incorrect) understanding of the varImpPlot is that it should plot the "% increase in MSE" and "IncNodePurity" exactly as can be found from the "importance" section of the model results. - However, the plot does not, in fact, match the "importance"
2010 Dec 20
1
ideas, modeling highly discrete time-series data
Hello all, First of all, thanks to those of you who helped me a week or so ago managing a time series with varying gaps between the data series in 'R'. (My final preferred solution was to use "its" function & then forecast(Arima( ) ). ) My next question is a general statistical question where I'd like some advice, for those willing / able to proffer any wisdom:
2010 Dec 03
2
How to get 'R' to talk BACK to other languages / scripts??
Hey everyone, I know that I can call 'R' from other scripts, and that I can make command calls from 'R' (e.g., using system() ). But how can I get 'R' to RETURN values to the script that called it. E.g., I would like to be able to do something like the following (as a simpler example) from a bash script: #!/bin/bash myTest=echo /usr/local/bin/R --no-restore
2005 Oct 04
6
boxplot statistics
I have read and reread the boxplot and the boxplot stats page, and I still cannot understand how and what boxplot shows. I realize that this might be due to me not knowing enough statistics, but anyway... First, how does boxplot determine the size of the box? And is the line inside the box the mean or the median (or something completely different?) And how does it determine how long out the
2010 Dec 17
2
how to convert "sloppy data" into a time series?
Hi All, First let me state that I did search for a while on r-help, google, and using the "sos" package inside of 'R', without much luck. I want to know how to create a univariate time series from a set of data that will have huge time gaps in it. For instance, here is a snapshot of a piece of data that I would like to analyze: *Row queued_time
2007 Feb 22
1
Diagnostic Tests: Jarque-Bera Test / RAMSEY
Hello R-Users, The following questions are not R-technical, but more of general statistical nature. 1. NORMALITY I built a normal linear regression model and now I want to check for the residual normality assumption. If I check the distribution graphically and look at the descriptive characteristics (skewness and kurtosis are below 1), I would confirm that the residuals are normally
2011 Jan 12
2
syntax for extending a line in a script??
Hello, A hopefully simple question. I use 'R' through emacs, but I suspect the following would occur with any manner of text editor: - my editor has a normally quite handy feature where it will automatically indent to the appropriate level when I start a new line. However, this occasionally creates cases where there is no friendly way to break a long line of code into
2012 Oct 17
2
loop of quartile groups
Greetings R users, My goal is to generate quartile groups of each variable in my data set. I would like each experiment to have its designated group added as a subsequent column. I can accomplish this individually with the following code: brks <- with(data_variables, cut2(var2, g=4)) #I don't want the actual numbers, I need a numbered group data$test1=factor(brks,
2010 Oct 26
2
Forcing results from lm into dataframe
Hi I need some help getting results from multiple linear models into a dataframe. Let me explain the problem. I have a dataframe with ejection fraction results measured over a number of quartiles and grouped by base_study. My dataframe (800 different base_studies) looks like > afvtprelvefs basestudy quartile ef ef_std entropy CBP0908020 1 21.6 0.53 3.27
2009 Sep 22
5
use of class variable in r as in Proc means of sas
Hi everyone, I need to calculate quartile values of a variable grouped by another variable, the same as with the aggregate function (where only median, mean, or similar functions are possible, I think). Could you please help me achieve the same for other quartile values (5, 10, 25, 75, 90) as for the median using aggregate? Thanks in advance. data : zip price 60000 567000 60001 478654 60004 485647 60001
2003 Oct 28
4
random number generation
Hi every one, I am trying to generate a normally distributed random variable with the following descriptive statistics, min=1, max=99, variance=125, mean=38.32, 1st quartile=38, median=40, 3rd quartile=40, skewness=-0.274. I know the "rnorm" will allow me to simulate random numbers with mean 38.32 and Sd=11.18(sqrt(125)). But I need to have the above mentioned descriptive