Displaying 20 results from an estimated 1300 matches similar to: "How to get Quartiles when data contains both numeric variables and factors"
2011 Nov 04
2
Reading parameters from dataframe and loading as objects
Hi List,
I want to read several parameters from a data frame and load them as objects
into the R session. Is there any package or function in R for this?
Here is an example:
param <-c("clust_num", "minsamp_size", "maxsamp_size", "min_pct", "max_pct")
value <-c(15, 20000, 200000, 0.001, .999)
data <- data.frame ( cbind(param , value))
data
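One possible approach, sketched under the assumption that every value is numeric: building the data frame with `cbind()` first coerces both columns to character, so the sketch passes the vectors to `data.frame()` directly and then uses base R's `list2env()` to create one object per row. The name `params` is introduced here for illustration only.

```r
param <- c("clust_num", "minsamp_size", "maxsamp_size", "min_pct", "max_pct")
value <- c(15, 20000, 200000, 0.001, 0.999)

# Pass the vectors directly: cbind() would turn `value` into character strings.
params <- data.frame(param, value, stringsAsFactors = FALSE)

# Build a named list (name = parameter, element = value) and copy each
# element into the global environment as its own object.
invisible(list2env(setNames(as.list(params$value), params$param),
                   envir = globalenv()))

clust_num   # 15
max_pct     # 0.999
```

If the parameters should not clutter the workspace, keeping the named list itself (and using `lst$clust_num`) may be the tidier choice.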
2007 Jun 16
2
Visualize quartiles of plot line
Hello,
I'm currently using a simple plot to visualize some mean values. I have
~200 data points on the x-axis, each with 10 records. I'm
currently plotting only the mean value of each data point.
What I need is a way to visualize the quartiles/error/whatever of
these points. I thought about boxplots, but I have too many points on
the x-axis - it would be impossible to see
2010 Jan 22
2
Quartiles and Inter-Quartile Range
Why am I getting a wrong result for quartiles?
here is my code:
> cbiomass = c(910, 1058, 929, 1103, 1056, 1022, 1255, 1121, 1111, 1192,
> 1074, 1415)
> summary(cbiomass)
> IQR(cbiomass)
The result R gives me is:
For the summary
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    910    1048    1088    1104    1139    1415
For IQR
> 91.25
*********
The true Q1 is 1039
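The difference most likely comes from the quantile definition rather than from an error. A short check using only base R: `summary()` and `IQR()` use `quantile()` type 7 by default, while the hand-computed value of 1039 is the median of the lower half of the data, which `fivenum()` (Tukey's hinges) and `quantile(type = 2)` reproduce for this sample.

```r
cbiomass <- c(910, 1058, 929, 1103, 1056, 1022, 1255, 1121, 1111, 1192,
              1074, 1415)

# Default definition (type 7), the one behind summary() and IQR():
quantile(cbiomass, 0.25)            # 1047.5, which summary() displays as 1048

# Median of the lower half of the sorted data:
fivenum(cbiomass)[2]                # 1039 (lower hinge)
quantile(cbiomass, 0.25, type = 2)  # 1039

# IQR() accepts the same `type` argument as quantile():
IQR(cbiomass)                       # 91.25
IQR(cbiomass, type = 2)             # 117.5
```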
2007 Feb 12
6
Boxplot: quartiles/outliers
For boxplot(), is it possible to pass in a parameter to change the default
way that the 1st and 3rd quartiles are computed? (specifically, I'd like to
use type 6 described in the quantile function).
Also, what are the options for how outliers are computed, and how can one
change them?
Thank you
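As far as its documented arguments go, `boxplot()` has no parameter for the quantile type; its box edges are the hinges computed by `boxplot.stats()`. One workaround, sketched here with made-up data, is to compute the statistics without plotting, overwrite the box edges with type 6 quartiles, and draw the result with `bxp()`. The outlier rule is controlled by the `range` argument (default 1.5 times the box length; `range = 0` extends the whiskers to the extremes). In this sketch the whiskers and outliers are still derived from the original hinges.

```r
set.seed(1)
x <- rexp(30)   # illustrative data, not from the post

# Compute the usual boxplot statistics without drawing anything.
b <- boxplot(x, range = 1.5, plot = FALSE)

# Rows of b$stats: lower whisker, lower hinge, median, upper hinge, upper whisker.
# Replace the two hinges with type 6 quartiles.
b$stats[c(2, 4), 1] <- quantile(x, c(0.25, 0.75), type = 6)

bxp(b)          # draw the modified boxplot
```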
2011 Oct 19
1
Subsetting data by eliminating redundant variables
Dear All,
I am new to R and have a question which might be easy.
I have a large data set with more than 250 variables. I am reducing the number of
variables with the redun function, as in the example below:
n <- 100
x1 <- runif(n)
x2 <- runif(n)
x3 <- x1 + x2 + runif(n)/10
x4 <- x1 + x2 + x3 + runif(n)/10
x5 <- factor(sample(c('a','b','c'),n,replace=TRUE))
x6 <-
2001 Jul 10
1
returning quartiles of a list?
Hi, all. I have a list:
process <- c( 5 , 7 , 4 , 1 , 4 , 1)
and I'd like to get each half (or each third or each quartile) of the list:
process.firsthalf would be (5, 7, 4) and
process.secondhalf would be (1, 4, 1).
note that I'm not interested in the numeric quartiles (then I could
use quantile or several other functions).
what is the best way to get this kind of thing? for
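A minimal sketch of one way to split a vector into consecutive, roughly equal chunks by position rather than by value. The helper name `split_chunks` is hypothetical; it only combines base R's `split()`, `seq_along()` and `ceiling()`.

```r
process <- c(5, 7, 4, 1, 4, 1)

# Assign each position to one of k consecutive groups, then split by group.
split_chunks <- function(x, k) {
  group <- ceiling(seq_along(x) * k / length(x))
  unname(split(x, group))
}

halves <- split_chunks(process, 2)
halves[[1]]   # 5 7 4
halves[[2]]   # 1 4 1

thirds <- split_chunks(process, 3)   # (5, 7), (4, 1), (4, 1)
```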
2012 Aug 10
1
Solving binary integer optimization problem
Hi,
I am new to using R for solving optimization problems. I have a set of communication
channels with limited capacity and two types of cost, fixed and variable.
Each channel has an expected gain for a single communication.
I want to determine the optimal number of communications for each channel,
maximizing ROI (return on investment) with the overall budget as a constraint.
The allocated budget is 60000.
2011 Oct 22
1
Data frame manipulation by eliminating rows containing extreme values
Dear All,
I have obtained the limits for removing extreme values for each variable using the
following function.
f=function(x){quantile(x, c(0.25, 0.75),na.rm = TRUE) - matrix(IQR(x,na.rm =
TRUE) * c(1.5), nrow = 1) %*% c(-1, 1)}
#Example:
n <- 100
x1 <- runif(n)
x2 <- runif(n)
x3 <- x1 + x2 + runif(n)/10
x4 <- x1 + x2 + x3 + runif(n)/10
x5 <-
2011 Nov 15
2
Putting directory path as a parameter
Hi List,
I am new to R, so this may be simple.
I want to store a directory path as a parameter, to be used when
reading and writing data from csv files.
How can I use the dir defined in the example below when reading the
csv file?
Example:
dir <- "C:/Users/Desktop" #location of file
temp_data <- read.csv("dir/bs_dev_segment_file.csv")
If I run this
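Inside quotes, `dir` is just the literal text "dir", so the path has to be assembled from the variable. A small sketch using base R's `file.path()`; to keep it runnable, it writes a stand-in file to a temporary directory instead of assuming that `C:/Users/Desktop` exists.

```r
dir <- tempdir()   # stand-in for "C:/Users/Desktop" so the sketch runs anywhere

# Create a small example file to read back (illustrative contents).
write.csv(data.frame(id = 1:3), file.path(dir, "bs_dev_segment_file.csv"),
          row.names = FALSE)

# file.path() joins the directory and the file name with "/".
temp_data <- read.csv(file.path(dir, "bs_dev_segment_file.csv"))

# paste0(dir, "/bs_dev_segment_file.csv") builds the same path.
```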
2008 Apr 03
1
prettyR 25% quartile, 75% quartile
I am using the describe function in prettyR. I would like to add the
25% and 75% quartiles to the summary table.
How do I do this?
I have tried:
describe(x.f, num.desc=c("mean", "median", "sd", "min", "max",
"skewness", "quantile(x.f, na.rm=T, probs=seq(0.25, 0.75))",
"valid.n"))
help
2007 Apr 11
1
Boxplot with quartiles generated from different algorithms
R users:
I am trying to replicate the boxplot output I achieve with Minitab in R.
I realize that R gives the user many more options on the algorithm used
to calculate the IQR than Minitab, so I concentrated on type=6 when using
the quantile() function in R. The problem I am having is setting the
upper and lower limit of the whisker based on the nearest actual data that
should be included.
If
2012 Oct 30
2
Java Exception error while reading large data in R from DB using RJDBC.
Dear List,
I am trying to read a large amount of data from a DB table (Vectorwise), using an RJDBC
connection.
I have tested the connection with a small amount of data and was able to fetch DB
tables using the same connection (conn in my code).
Please suggest where I am going wrong, or an alternative way to solve such
issues while reading large DB
2006 Oct 25
1
Drawing a reference line for a qqplot with reference to Weibull distribution
Hi,
I'm trying to create a qqplot with reference to a Weibull distribution
including a reference line. This is my current code:
lights.data <- scan("lights.dat")
#Generate Weibull quantiles
prob.grid <- ppoints(length(lights.data))
prob.quant <- qweibull(prob.grid , 1.5,4)
#Draw QQ plot
qqplot(prob.quant,lights.data)
#add red reference line
qqline(lights.data,col = 2)
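By default `qqline()` assumes normal theoretical quantiles, so the red line above is computed against the wrong distribution. A sketch of one fix is to pass the Weibull quantile function through the `distribution` argument of `qqline()`. Since `lights.dat` is not available here, the data are simulated, which is an assumption made for illustration only.

```r
set.seed(42)
lights.data <- rweibull(50, shape = 1.5, scale = 4)   # simulated stand-in data

# Theoretical Weibull quantiles, as in the post
prob.grid  <- ppoints(length(lights.data))
prob.quant <- qweibull(prob.grid, shape = 1.5, scale = 4)

# Draw the QQ plot
qqplot(prob.quant, lights.data)

# Red line through the first and third quartiles, using Weibull quantiles
qqline(lights.data,
       distribution = function(p) qweibull(p, shape = 1.5, scale = 4),
       col = 2)
```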
2012 Jan 13
1
Quantiles in boxplot
Hi,
I have a simple question about quartiles in R, especially how they are calculated by boxplot.
The quartiles (.25 and .75) in boxplot are different from those of the summary function and
also don't match any of the 9 types in the quantile function.
See attachment for details.
Can you give me the details on how the boxplot function calculates these values?
Cheers,
Rene Brinkhuis
2011 Oct 20
2
How to remove multiple outliers
Hi All,
I am working on a dataset in which some of the variables have more than
one outlying observation.
I am using the sample script below:
library(outliers)
x1 <- c(10, 10, 11, 12, 13, 14, 14, 10, 11, 13, 12, 13, 10, 19, 18, 17,
10099, 10099, 10098)
outlier_tf1 = outlier(x1,logical=TRUE)
find_outlier1 = which(outlier_tf1==TRUE, arr.ind=TRUE)
beh_input_ro1 =
2000 Dec 11
1
qqline (PR#764)
I think qqline does not do exactly what it is advertised to do ("`qqline'
adds a line to a normal quantile-quantile plot which passes through the
first and third quartiles."). Consider the graph:
tmp <- qnorm(ppoints(10))
qqnorm(tmp)
qqline(tmp)
The line (which I expected to go through all the points) has a slightly
shallower slope than do the points plotted by qqnorm. I think
2011 Nov 04
1
Decision tree model using rpart ( classification
Hi Experts,
I am new to R and am using a decision tree model to derive segmentation rules.
A) Using behavioural data (attributes defining customer behaviour, for example
balances, number of accounts, etc.)
1. Clustering: Cluster the behavioural data into a suitable number of clusters
2. Decision Tree: Use an rpart classification tree to generate rules for
segmentation, with the cluster number (cluster id) as target
2008 Aug 05
5
boxplot with average instead of median
I really like the ease of use of the boxplot command in R. I would
rather have a boxplot that shows the average value and the standard
deviation than the median value and the quartiles.
Is there a way to do this?
Chad Junkermeier, Graduate Student
Dept. of Physics
West Virginia University
2008 Nov 03
1
quantcut
I'm trying to divide x into tertiles, but I end up with integer limits
even though x holds one decimal. The analysis is extremely sensitive to the
limits and I would like to keep them exact. How can that be done?
quartiles <- quantcut( x[x >= 0], q=seq(0,1, by=(1/3)))
> table(quartiles)
quartiles
[180,344] (344,448] (448,644]
16467 16476 16452
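The limits are probably not rounded in the data, only in the printed interval labels. A base R sketch of the same tertile split with `cut()` and `quantile()`, where `dig.lab` raises the number of digits used in the labels; the data are simulated for illustration. `quantcut()` in gtools is documented as passing extra arguments on to `cut()`, so `dig.lab` may work there as well, but that is not verified here.

```r
set.seed(7)
x <- round(runif(1000, min = 180, max = 644), 1)   # simulated values with one decimal

# Tertile limits taken from the data
limits <- quantile(x, probs = seq(0, 1, by = 1/3))

# dig.lab controls how many digits appear in the interval labels
tertiles <- cut(x, breaks = limits, include.lowest = TRUE, dig.lab = 6)
table(tertiles)
```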
2007 May 14
1
Nicely formatted summary table with mean, standard deviation or number and proportion
Dear all,
The incredibly useful Hmisc package provides a method to generate
summary tables that can be typeset in latex. The Alzola and Harrell book
"An introduction to S and the Hmisc and Design libraries" provides an
example that generates mean and quartiles for continuous variables, and
numbers and percentages for count variables: summary() with method =
'reverse'.
I