Short course: Statistical Learning and Data Mining II:
tools for tall and wide data
Trevor Hastie and Robert Tibshirani, Stanford University
Sheraton Hotel,
Palo Alto, California,
April 3-4, 2006.
This two-day course gives a detailed overview of statistical models for
data mining, inference and prediction. With the rapid developments
in internet technology, genomics, financial risk modeling, and other
high-tech industries, we rely increasingly on data analysis and
statistical models to exploit the vast amounts of data at our
fingertips.
This course is the third in a series and follows our popular past
offerings "Modern Regression and Classification" and "Statistical
Learning and Data Mining".
The two earlier courses are not prerequisites for this new course.
In this course we emphasize the tools useful for tackling modern-day
data analysis problems. We focus on both "tall" data (N>p, where
N=#cases, p=#features) and "wide" data (p>N). The tools include
gradient boosting, SVMs and kernel methods, random forests, lasso and
LARS, ridge regression and GAMs, supervised principal components, and
cross-validation. We also present some interesting case studies in a
variety of application areas. All our examples are developed using the
S language, and most of the procedures we discuss are implemented in
publicly available R packages.
Please visit the site
http://www-stat.stanford.edu/~hastie/sldm.html
for more information and registration details.
-------------------------------------------------------------------
Trevor Hastie hastie@stanford.edu
Professor, Department of Statistics, Stanford University
Phone: (650) 725-2231 (Statistics) Fax: (650) 725-8977
Phone: (650) 498-5233 (Biostatistics) Fax: (650) 725-6951
URL: http://www-stat.stanford.edu/~hastie
address: room 104, Department of Statistics, Sequoia Hall
390 Serra Mall, Stanford University, CA 94305-4065
--------------------------------------------------------------------