Hi,

Newbie here. I read the R for Beginners but I still don't get this.

I have the following data (this is just an example) in a CSV file:

    courseid numstudents
         101         209
         141          13
         246         140
         263           8
         321          10
         361          10
         364          28
         365          25
         366          23
         367          34

I load my data using:

    fs<-read.csv(file="C:\\num_students_inallmodules.csv",header=T, sep=',')

I want to get the ecdf. So, I looked at the ?ecdf which says usage:ecdf(x)

So I expected ecdf(fs$numstudents) to work

Instead it just returned:

    Call: ecdf(fs$numstudents)
     x[1:210] =      1,      2,      3,  ...,   3717,   4538

After Googling, got this to work:

    ecdf(fs$numstudents)(unique(fs$numstudents))

But I don't understand why if the ?ecdf says usage is ecdf(x) ... I need to use ecdf(fs$numstudents)(unique(fs$numstudents)) to get this to work?

Can somebody explain this to me?

Regards
Gawesh
On Oct 16, 2011, at 11:31 AM, gj wrote:

> [...]
> But I don't understand why if the ?ecdf says usage is ecdf(x) ... I
> need to use ecdf(fs$numstudents)(unique(fs$numstudents)) to get this
> to work?
>
> Can somebody explain this to me?

ecdf() returns a function rather than a vector. You need to supply arguments to that function to get something that you recognize. Had you passed that function to plot() you would have seen that the information needed to draw the plot is "in there". If you go to the ?stepfun page you will find that the knots() function can recover some of that information for display.

    > plot( ecdf(fs$numstudents) )
    > knots( ecdf(fs$numstudents) )
    [1]   8  10  13  23  25  28  34 140 209

If you count the knots you can deduce the cumulative proportions (the "y-values") at which each step (the "dot-line") starts at those "x-values".

-- 
David Winsemius, MD
West Hartford, CT
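To make the "ecdf() returns a function" point concrete, here is a minimal sketch using the ten example values from the original post. The names numstudents and Fn are illustrative and not from the thread. The data are typed in directly, so the output differs from the 210-row file.

```r
# Example numstudents values from the original post, typed in directly
# (an assumption for illustration; the real file has 210 rows).
numstudents <- c(209, 13, 140, 8, 10, 10, 28, 25, 23, 34)

# ecdf() builds and returns a function. Printing that function only
# shows a summary, which is the "Call: ..." output in the question.
Fn <- ecdf(numstudents)
is.function(Fn)   # TRUE

# Calling the returned function gives cumulative proportions.
Fn(10)            # 0.3, since 3 of the 10 values are <= 10

# Evaluate at every distinct value, in increasing order.
Fn(sort(unique(numstudents)))

# The one-liner from the question does the same thing in one step:
# the first pair of parentheses builds the function, the second calls it.
ecdf(numstudents)(unique(numstudents))
```

Storing the result in a variable such as Fn first can make it easier to see that there are two separate steps: building the step function, then evaluating it at chosen points.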
Hi:

I don't understand what you're attempting to do. Wouldn't courseid be a categorical variable with a numeric label? If that is so, why are you trying to compute an EDF? An EDF computes cumulative relative frequency of a random variable, which by definition is numeric. If we were talking about EDFs for a distribution of student course grades on a numeric point system by course, that would make some sense, but I don't see how the course IDs themselves qualify as being on an interval scale of measurement.

Could you clarify your intent?

Dennis

On Sun, Oct 16, 2011 at 8:31 AM, gj <gawesh at gmail.com> wrote:
> [...]
> I want to get the ecdf. So, I looked at the ?ecdf which says usage:ecdf(x)
>
> So I expected ecdf(fs$numstudents) to work
> [...]
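As a rough illustration of "cumulative relative frequency", the sketch below computes it by hand for the numstudents column (the column the original post passes to ecdf()) and compares it with what the function returned by ecdf() gives. The names counts, by_hand and from_ecdf are illustrative only, and the data are the ten example rows rather than the full file.

```r
# Example numstudents values from the original post (illustrative subset).
numstudents <- c(209, 13, 140, 8, 10, 10, 28, 25, 23, 34)

# Count how often each distinct value occurs; table() orders the
# distinct values from smallest to largest.
counts <- table(numstudents)

# Cumulative relative frequency: running total of the counts divided
# by the number of observations.
by_hand <- cumsum(as.vector(counts)) / length(numstudents)

# The same quantities from the step function that ecdf() returns,
# evaluated at the sorted distinct values.
from_ecdf <- ecdf(numstudents)(sort(unique(numstudents)))

all.equal(by_hand, from_ecdf)   # TRUE
```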