Message-ID: <CADv2QyGHvnnyqFTMcwsydbwBF_JNQAvysPL3vsGiGqJRSvUgQQ@mail.gmail.com>
Date: 2011-10-17T01:16:16Z
From: Dennis Murphy
Subject: ecdf
In-Reply-To: <CAJ7zG7E_rKM2UXDMVJkhH99XPnu8iaHiN0c3Cxi-+aTbzr++Gg@mail.gmail.com>

Thanks for the clarification. I stand corrected.

Dennis

On Sun, Oct 16, 2011 at 5:48 PM, gj <gawesh at gmail.com> wrote:
> David is right. I am looking for the ecdf for fs$numstudents. The
> other column is just an id.
>
> I guess I don't know how to read the R documentation when it comes to functions.
>
> Looking at the documentation, I now notice that it says "Compute an
> empirical cumulative distribution function", and not a vector.
>
> But still I would have assumed that in ecdf(x) ... the x is the argument.
>
> So ecdf(fs$numstudents)(unique(fs$numstudents))
>    ==================== =======================
>          function              arguments
>
> Yes? But I can't read that from the documentation? I suspect it has
> something to do with those dots (...) in the arguments, which I don't
> understand.
>
> Why does it say usage ecdf(x) when that's clearly not the case?
>
> I don't get it.
>
> Gawesh
>
>
> On Sun, Oct 16, 2011 at 11:02 PM, David Winsemius
> <dwinsemius at comcast.net> wrote:
>>
>> On Oct 16, 2011, at 3:53 PM, Dennis Murphy wrote:
>>
>>> Hi:
>>>
>>> I don't understand what you're attempting to do. Wouldn't courseid be
>>> a categorical variable with a numeric label? If that is so, why are
>>> you trying to compute an EDF? An EDF computes cumulative relative
>>> frequency of a random variable, which by definition is numeric. If we
>>> were talking about EDFs for a distribution of student course grades on
>>> a numeric point system by course, that would make some sense, but I
>>> don't see how the course IDs themselves qualify as being on an
>>> interval scale of measurement. Could you clarify your intent?
>>
>> Huh? gawesh asked for ecdf on numstudents (not courseid) ... pretty
>> clearly a numeric value for which an ECDF should make sense.
>>
>> --
>> David.
>>
>> --
>>>
>>> Dennis
>>>
>>> On Sun, Oct 16, 2011 at 8:31 AM, gj <gawesh at gmail.com> wrote:
>>>>
>>>> Hi,
>>>> Newbie here. I read R for Beginners but I still don't get this.
>>>>
>>>> I have the following data (this is just an example) in a CSV file:
>>>>
>>>>    courseid numstudents
>>>>         101         209
>>>>         141          13
>>>>         246         140
>>>>         263           8
>>>>         321          10
>>>>         361          10
>>>>         364          28
>>>>         365          25
>>>>         366          23
>>>>         367          34
>>>>
>>>> I load my data using:
>>>>
>>>> fs <- read.csv(file = "C:\\num_students_inallmodules.csv", header = TRUE, sep = ",")
>>>>
>>>> I want to get the ecdf. So, I looked at the ?ecdf which says
>>>> usage:ecdf(x)
>>>>
>>>> So I expected ecdf(fs$numstudents) to work
>>>>
>>>> Instead it just returned:
>>>> Call: ecdf(fs$numstudents)
>>>>  x[1:210] =      1,      2,      3,  ...,   3717,   4538
>>>>
>>>> After Googling, got this to work:
>>>> ecdf(fs$numstudents)(unique(fs$numstudents))
>>>>
>>>> But I don't understand why if the ?ecdf says usage is ecdf(x) ... I
>>>> need to use ecdf(fs$numstudents)(unique(fs$numstudents)) to get this
>>>> to work?
>>>>
>>>> Can somebody explain this to me?
>>>>
>>>> Regards
>>>> Gawesh
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>
>> David Winsemius, MD
>> Heritage Laboratories
>> West Hartford, CT
>>
>>
>
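
The behaviour discussed in this thread can be sketched in a few lines of R. This is only a minimal illustration, assuming the ten example numstudents values from the original post (not the real 210-value file); the names numstudents, Fn and pts are made up for the example. ecdf(x) does take x as its argument, as the usage line says, but what it returns is a function (an object of class "ecdf") rather than a vector. Printing that object shows only a short summary, and getting cumulative proportions means calling the returned function on the points of interest.

```r
# Example values copied from the original post (illustrative only).
numstudents <- c(209, 13, 140, 8, 10, 10, 28, 25, 23, 34)

# ecdf(x) takes the data as its argument and returns a *function*.
Fn <- ecdf(numstudents)
is.function(Fn)   # TRUE
class(Fn)         # "ecdf" "stepfun" "function"

# Calling the returned function gives the proportion of values <= q.
Fn(10)            # 0.3 (8, 10 and 10 are <= 10, out of 10 values)
Fn(209)           # 1

# Evaluate at each distinct observed value. This is what
# ecdf(x)(unique(x)) does in one step; here the points are sorted first.
pts <- sort(unique(numstudents))
data.frame(numstudents = pts, cum_prop = Fn(pts))
```

So ecdf(fs$numstudents)(unique(fs$numstudents)) is two calls written back to back: the first builds the function, the second evaluates it. As far as I can tell, the dots (...) on the ?ecdf help page belong to the plot, print and summary methods listed in the same Usage section, not to ecdf() itself; plot(Fn) would draw the step function.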