(Complete Table of Contents here: http://aka.ms/backyarddatascience)


A good friend of mine asked a question the other day that I’ve been asked before – and I have a rule that if I’m asked the same question by multiple people, I blog it.

The question was this: “Should I learn statistics first and then focus on R or learn statistics along the way?” This follows a pattern of “Should I learn the foundational concepts of some technology or jump in and use the tools, picking up the concepts as I go?”

The answer is – yes.


If you really want to master a concept or skill, you need to understand the foundational concepts first. Golfers will tell you this. Many people who learn to golf simply start playing, and then find themselves frustrated that they are not doing well. They then seek out a professional trainer, who has to help them “un-learn” all the bad habits they’ve developed. A better approach in golf is to learn the proper basic techniques and strategies, and then move on to putting them together to form a good game.

This holds true for Data Science as well. Ideally, you’d devote your full time and attention to the topics you need to learn. By getting a strong foundation in math, statistics, and various data analytics topics, you’ll be a better Data Scientist.

There are a few issues with this approach. The first is time. One could argue that it would take several years to learn the topics I’ve described, much less master them, even at a basic level.

Another issue is establishing a good learning path. It takes a great deal of thought to develop a course of study tailored to each student, one that lays out the right “this first, with this, then that” sequence of learning.

So do you just start golfing, er, Data Science-ing? I think you can. That’s what this series of notebooks is about. We’re on a path that combines basic foundational concepts (you are following the stats posts I gave you, aren’t you?) with the tools that you need to know.

So jump in. Learn some R. Try some Python. Open Azure ML Studio. It’s not a perfect way of learning the topic, but it’s certainly achievable, and you’ll have the added incentive of actually seeing progress.

A word of caution: I do NOT advocate that you open R, run a few stats, and announce to your company that you are now the resident Data Scientist. Part of being a professional is knowing your limits. Always have someone with experience check your work, and leave the heavy lifting to the professionals, especially the important things your company needs to make decisions on.

(But follow them around like a kid brother, asking questions and making a general nuisance of yourself. That’s one of my favorite ways to learn.)

Now go get on that stats homework. And use R to do it. You’ll learn two things at once.