In the second semester of grad school, I remember sitting in a Statistical Inference class watching a very Russian-sounding instructor fast-forward through an overhead-projected PDF document filled with numbered equations, occasionally making comments like: “Vell, ve take zis eqazion on ze top and ve substitude it on ze butom, and zen it verk out. Do you see zat?” I did not see zat. I don’t think many people saw zat.
In case I come off as an intolerant immigrant hater, let me assure you that as an immigrant from the former Soviet bloc, I have all due respect for the very bright Russian and non-Russian scientists who came to the United States to seek intellectual and other freedoms. But this post is not about immigration, which incidentally is in need of serious reform. This is about an important subject, which on average is not being taught very well.
This is hardly news, but many courses in Statistics are being taught by very talented statisticians who have no aptitude for, or interest in, teaching. But poor instructors are not the only problem. These courses are part of an institution, an institution that is no longer in the business of providing education. Universities predominantly sell accreditation to students, and research to (mostly) the federal government. While I believe that government-sponsored research should be a foundation of modern society, it does not have to be delivered within the confines of a teaching institution. And a university diploma (i.e. accreditation), even from a top school, is at best a proxy for your knowledge and capabilities. For example, if you are a software engineer, Stack Overflow and GitHub provide much more direct evidence of your abilities.
With the cost of higher education skyrocketing, it is reasonable to ask whether the traditional university education is still relevant. I am not sure about medicine, but in statistics, the answer is a resounding ‘No.’ Unless you want to be a professor. But chances are you will not be a professor, even if you get your coveted Ph.D.
So for all of you aspiring Data Geeks, I put together a table outlining Online Classes, Books, and Community and Q&A Sites that completely bypass the traditional channels. And if you really want to go to school, most Universities will allow you to audit classes, so that is always an option. Got Zat?
| | Online Classes | Books | Community / Q&A |
|---|---|---|---|
| Programming | Computer Science Courses at Udacity. Currently Introduction to Computer Science, Logic and Discrete Mathematics (great for preparation for Probability), Programming Languages, Design of Computer Programs, and Algorithms. For a highly interactive experience try Codecademy. | How to Think Like a Computer Scientist (Allen B. Downey); Code Complete (Steve McConnell) | Stack Overflow |
| Foundational Math | Single Variable Calculus Course on Coursera (they are adding others; check that site often); Khan Academy Linear Algebra Series; Khan Academy Calculus Series (including multivariate); Gilbert Strang’s Linear Algebra Course | Intro to Linear Algebra (Gilbert Strang); Calculus, an Intuitive and Physical Approach (Morris Kline) | Math Overflow |
| Intro to Probability and Statistics | Statistics One from Coursera (this course includes an introduction to the R language); Introduction to Statistics from Udacity | Stats: Data and Models (Richard De Veaux) | Cross Validated, which tends to be more advanced |
| Probability and Statistical Theory | It is very lonely here… | Introduction to Probability Models (Sheldon Ross); Statistical Inference (Casella and Berger) | Cross Validated |
| Applied and Computational Statistics | Machine Learning from Coursera; Statistics and Data Analysis curriculum from Coursera | Statistical Sleuth (Ramsey and Schafer); Data Analysis Using Regression and Multilevel Models (Gelman); Pattern Recognition and Machine Learning (Chris Bishop); Elements of Statistical Learning (Hastie, Tibshirani, Friedman) | Stack Overflow, especially under the R tag; New York Open Statistical Programming Meetup (try searching Meetups in your city) |
| Bayesian Statistics | Not to my knowledge, but check the above-mentioned sites. | Bayesian Data Analysis (Gelman); Doing Bayesian Data Analysis (Kruschke) | I don’t know of any specialized sites for this. |
Nice post, Eric! For your section ‘Probability and Statistical Theory’ I would suggest Statistics 110: Probability by Joe Blitzstein, Professor of the Practice in Statistics, Harvard University. http://projects.iq.harvard.edu/stat110
For ‘Bayesian Statistics’ there is a very nice GitHub book, “Probabilistic Programming and Bayesian Methods for Hackers”: https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers
Thanks for the links Alice; especially the one for Bayesian Stats.
For Differential Equations, “Ordinary Differential Equations” by Tenenbaum is really good – even if you are learning on your own. It’s also really inexpensive for a math text! http://www.amazon.com/Ordinary-Differential-Equations-Dover-Mathematics/dp/0486649407/ref=sr_1_1?ie=UTF8&qid=1389798733&sr=8-1&keywords=ordinary+differential+equations
Agree, it’s a nice book. I have not seen many DEs in stats work, but they are helpful in Time Series and certainly if you are doing a lot of work with Stochastic Processes. Thanks for the reference.