There’s a show on US television called “This Old House”, where a team of craftsmen fix up older homes – like VERY old (at least old in the US sense). In one episode, an entire quarter of the show time was devoted to the amazing lengths the team went to in order to put updated plumbing in a second-story bathroom. Getting the plumbing installed, vented, routed, and up to code was a Herculean task. My wife said, “Seems like you should just lay out the plumbing and wiring and build a house around whatever that becomes.” She’s probably right.
It’s MUCH easier in a home build if you lay in the plumbing, wiring, and other base elements first. And that’s true with Data Science projects as well. So what is the “plumbing and wiring” in our case?
Number one, it’s the data. The data has to be “up to code”, by being reliable, consistent, and available. The main bit of the plumbing, however, is the data path – where the data is coming from, how it travels, and where it ends up. Think hard about these things before you start your project, and break out Visio (or whatever tool you use) and get a diagram going. You’ll find it’s easier to correct a design on paper than once you’ve started building (just like in a house build).
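If you want a feel for what an “up to code” inspection might look like, here is a minimal sketch in Python. Everything in it is made up for illustration: the `inspect` helper, the `CodeInspection` class, and the `sensor_id` and `reading` fields are assumptions, not part of any real project or library. It only checks that data arrived at all (available) and that each required field is present with the expected type (consistent); a real inspection would cover a lot more.

```python
from dataclasses import dataclass, field


@dataclass
class CodeInspection:
    """Hypothetical report: the list of problems found in a batch of records."""
    problems: list = field(default_factory=list)

    @property
    def passed(self):
        # The batch is "up to code" only if no problems were recorded.
        return not self.problems


def inspect(records, required_fields):
    """Hypothetical helper: check a batch of dict records against required fields.

    required_fields maps a field name to the Python type it should have.
    """
    report = CodeInspection()

    # Available: the source actually delivered something.
    if not records:
        report.problems.append("no records received")
        return report

    # Consistent: every record has each required field, with the expected type.
    for i, row in enumerate(records):
        for name, expected_type in required_fields.items():
            if row.get(name) is None:
                report.problems.append(f"row {i}: missing {name}")
            elif not isinstance(row[name], expected_type):
                report.problems.append(
                    f"row {i}: {name} is not {expected_type.__name__}"
                )
    return report


# Made-up sample data: one good row, one missing value, one wrong type.
sample = [
    {"sensor_id": "a1", "reading": 20.5},
    {"sensor_id": "a2", "reading": None},
    {"sensor_id": "a3", "reading": "21.0"},
]

report = inspect(sample, {"sensor_id": str, "reading": float})
print("passed:", report.passed)
for problem in report.problems:
    print(problem)
```

Running a check like this at the point where data enters the project is the rough equivalent of inspecting the pipes before the drywall goes up.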
The wiring in a Data Science project is composed of things like security and automation. Think these things through as well – if you can’t get your design past the security team (which happens way more often than you might expect), your cool algorithm is going nowhere.
So start with the plumbing in mind. Make Norm proud.