Short course: Statistical Learning and Data Mining II:
              tools for tall and wide data

Trevor Hastie and Robert Tibshirani, Stanford University

The Conference Center at Harvard Medical School
Boston, MA, Oct 31-Nov 1, 2005

This is a *new* two-day course on statistical models for data mining,
inference and prediction. It is the third in a series, and follows our
past offerings "Modern Regression and Classification" and "Statistical
Learning and Data Mining".

In this course we emphasize the tools useful for tackling modern-day
data analysis problems. We focus on both "tall" data (N>p, where
N=#cases and p=#features) and "wide" data (p>N). The tools include
gradient boosting, SVMs and kernel methods, random forests, lasso and
LARS, ridge regression and GAMs, supervised principal components, and
cross-validation. We also present some interesting case studies in a
variety of application areas. All our examples are developed using the
S language, and most of the procedures we discuss are implemented in
publicly available R packages.

Please visit the site
http://www-stat.stanford.edu/~hastie/sldm.html
for more information on the course and registration details.

--
--------------------------------------------------------------------
Trevor Hastie                                  hastie at stanford.edu
Professor, Department of Statistics, Stanford University
Phone: (650) 725-2231 (Statistics)             Fax: (650) 725-8977
       (650) 498-5233 (Biostatistics)          Fax: (650) 725-6951
URL: http://www-stat.stanford.edu/~hastie
address: room 104, Department of Statistics, Sequoia Hall
         390 Serra Mall, Stanford University, CA 94305-4065