Short course: Statistical Learning and Data Mining II: tools for tall and wide data

Trevor Hastie and Robert Tibshirani, Stanford University

Sheraton Hotel, Palo Alto, California, April 3-4, 2006.

This two-day course gives a detailed overview of statistical models for data mining, inference and prediction. With the rapid developments in internet technology, genomics, financial risk modeling, and other high-tech industries, we rely increasingly on data analysis and statistical models to exploit the vast amounts of data at our fingertips.

This course is the third in a series, and follows our popular past offerings "Modern Regression and Classification" and "Statistical Learning and Data Mining". The two earlier courses are not prerequisites for this new course.

In this course we emphasize the tools useful for tackling modern-day data analysis problems. We focus on both "tall" data (N > p, where N = #cases and p = #features) and "wide" data (p > N). The tools include gradient boosting, SVMs and kernel methods, random forests, lasso and LARS, ridge regression and GAMs, supervised principal components, and cross-validation. We also present some interesting case studies in a variety of application areas. All our examples are developed using the S language, and most of the procedures we discuss are implemented in publicly available R packages.

Please visit the site http://www-stat.stanford.edu/~hastie/sldm.html for more information and registration details.

-------------------------------------------------------------------
Trevor Hastie                                hastie@stanford.edu
Professor, Department of Statistics, Stanford University
Phone: (650) 725-2231 (Statistics)           Fax: (650) 725-8977
       (650) 498-5233 (Biostatistics)        Fax: (650) 725-6951
URL: http://www-stat.stanford.edu/~hastie
address: room 104, Department of Statistics, Sequoia Hall
         390 Serra Mall, Stanford University, CA 94305-4065
-------------------------------------------------------------------