Sorry, I'm new to R and relatively new to statistics too, so I'm still a bit
unclear. The values in my post were only a sample from around 8400 rows. The
labels column contains 1 or 0 (I thought these were the two classes needed),
and each label has a corresponding probability. This is the output of my
logistic regression analysis, but it does not seem to be in the right format
for ROC curve analysis. R also displays the data differently: when I type
ROCR.simple, it is in the format:
$predictions
[1] 0.612547843 0.364270971 0.432136142 ...
$labels
[1] 1 1 0 0 0 1 1 1 1 0 1 0 1 0 0 0 1 1 1 0 0 0 0 ... etc.
whereas mine is in columns, e.g.
ID, labels, probs
8930 0 0.00070
8931 0 0.00036
8932 1 0.00000
8933 1 0.00002
8934 0 0.00001
etc.
That is why I think it is a format issue, but being new to R, I'm not sure
what I need to do to rectify it.
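
My guess is that the two columns would have to be pulled out into a list
shaped like ROCR.simple. Below is a minimal sketch of that idea, using only
the five rows shown above rather than the attached file. The object names
`dat` and `my.simple` are made up for illustration, and I am assuming that
ROCR's prediction() accepts two plain vectors of equal length. The ROCR part
only runs if the package is installed.

```r
# Toy data frame built from the five rows shown above. The column names
# (ID, labels, probs) come from my output; the name `dat` is hypothetical.
dat <- data.frame(
  ID     = c(8930, 8931, 8932, 8933, 8934),
  labels = c(0, 0, 1, 1, 0),
  probs  = c(0.00070, 0.00036, 0.00000, 0.00002, 0.00001)
)

# Same shape as ROCR.simple: a list holding two parallel vectors.
my.simple <- list(predictions = dat$probs, labels = dat$labels)
str(my.simple)

# Guarded so the sketch still runs when ROCR is not installed.
if (requireNamespace("ROCR", quietly = TRUE)) {
  pred <- ROCR::prediction(my.simple$predictions, my.simple$labels)
  perf <- ROCR::performance(pred, measure = "tpr", x.measure = "fpr")
  # plot(perf) would then draw the ROC curve
}
```

If this is right, the columns could presumably also be passed directly,
e.g. dat$probs and dat$labels, without building the list first.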
I have attached the text file in case it helps:
http://r.789695.n4.nabble.com/file/n2328240/prob.txt
Thank you for your time.