thr3ads.net - R help - [R] probabilities from predict.svm [Aug 2010]


Watling,James I

2010-Aug-18 19:09 UTC

[R] probabilities from predict.svm

Dear R Community-

I am a new user of support vector machines for species distribution modeling and
am using package e1071 to run svm() and predict.svm().  Briefly, I want to
create an svm model for classification of a factor response (species presence or
absence) based on climate predictor variables.  I have used a training dataset
to train the model, and tested it against a validation data set with good
results: AUC is high, and the confusion matrix indicates low commission and
omission errors.  The code for the best-fit model is:

svm.model <- svm(as.factor(acutus) ~ p_feb + p_jan + p_mar + p_sep + t_feb + t_july + t_june + t_mar,
                 cost = 10000, gamma = 1, probability = TRUE)

Because ultimately I want to create prediction maps of probabilities of species
occurrence under future climate change, I want to use the results of the
validated model to predict probability of presence using data describing future
conditions.  I have created a data frame (predict.data) with new values for the
same predictor variables used in the original model; each value corresponds to
an observation from a raster grid of the study area.  I enabled the probability
option when creating the original model, and acquire the probabilities using the
predict function:
pred.map <- predict(svm.model, predict.data, probability = TRUE).  However, when I
use probs <- attr(pred.map, "probabilities") to acquire the
probabilities for each grid cell, the spatial signature of the probabilities
does not make sense.  I have extracted the column of probabilities for class = 1
(probability of presence), and the resulting map of the study area is spatially
accurate (it has the right shape), but the probability values are incorrect, or
at least in the wrong place.  I am attaching a pdf (SVM prediction maps) of the
resulting map using probabilities obtained using the code described above (page
1) and a map of what the prediction map should look like given spatial
autocorrelation in climate predictors (page 2, map generated using
openmodeller).  Note that the openmodeller map was created with the same input
data and same svm algorithm (also using code from libsvm) as the model in R,
just run using different software.  I don't know why the prediction map of
probabilities based on the model is so different from what I would expect, and
would appreciate any thoughts from the group.
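
For readers following the workflow described above, here is a minimal sketch of the fit / predict / extract steps on made-up data. It is not a diagnosis of the maps in this thread. The data frames `train` and `grid`, the predictor `t_jan`, and the smaller `cost` value are all hypothetical stand-ins. The sketch illustrates two things that may be worth checking when probabilities seem to land in the wrong grid cells: selecting the presence column by name rather than by position, and re-aligning the rows, since predict.svm() drops rows with missing predictors by default (na.action = na.omit).

```r
library(e1071)

set.seed(1)

# Hypothetical training data: two climate predictors and a
# presence/absence response (names are illustrative only).
n <- 200
train <- data.frame(p_jan = rnorm(n), t_jan = rnorm(n))
train$acutus <- factor(as.integer(train$p_jan + train$t_jan > 0))

fit <- svm(acutus ~ p_jan + t_jan, data = train,
           cost = 10, gamma = 1, probability = TRUE)

# Hypothetical prediction grid with two missing cells, as a raster
# exported to a data frame might contain.
grid <- data.frame(p_jan = rnorm(50), t_jan = rnorm(50))
grid$p_jan[c(3, 10)] <- NA

pred  <- predict(fit, grid, probability = TRUE)
probs <- attr(pred, "probabilities")

# Check 1: pick the presence column by its name ("1"), because the
# column order is not guaranteed to match the order of the factor levels.
p_presence <- probs[, "1"]

# Check 2: rows with NA predictors were dropped, so map the values
# back onto the full grid using the row names kept by predict().
p_full <- rep(NA_real_, nrow(grid))
p_full[match(rownames(probs), rownames(grid))] <- p_presence
```

If nrow(probs) is smaller than nrow(predict.data), writing the probability column straight into the raster would shift every value after the first dropped cell, which would give a map with the right outline but values in the wrong places.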

All the best

James

*******************************************************************************
James I Watling, PhD
Postdoctoral Research Associate
University of Florida
Ft. Lauderdale Research & Education Center
3205 College Avenue
Ft Lauderdale, FL 33314 USA
954.577.6316 (phone)
954.475.4125 (fax)



-------------- next part --------------
A non-text attachment was scrubbed...
Name: SVM prediction maps.pdf
Type: application/pdf
Size: 78297 bytes
Desc: SVM prediction maps.pdf
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20100818/f23e479c/attachment.pdf>

Steve Lianoglou

2010-Aug-19 14:23 UTC

[R] probabilities from predict.svm

Hi James,

I'd like to help you out, but I'm not sure I understand what the problem
is.

Does the problem lie with building a predictive SVM, or getting the
right values (class probabilities) to land in the right place on your
map/plot?

-steve



-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
