SQL Server Big Data Clusters Workshop at SQL Bits

On Thursday, 28 February 2019, I'll be teaching a brand-new course from Microsoft called "Microsoft SQL Server Big Data Clusters Architecture", which I'll be delivering as a one-day workshop at SQL Bits in Manchester in the UK. I wanted to explain how the course will work, since we'll be covering a lot of information in …

Syllabuck: Ignite 2018 Conference

(A "Syllabuck" is like a Syllabus, but more like that second definition, and certainly more random) I recently attended, presented, worked and did interviews at the Microsoft Ignite 2018 Conference in Orlando. If you have never been, you should go sometime. 30,000 people, 2.1 million square feet of space, and ten+ miles of walking per …

DevOps for Data Science – Load Testing and Auto-Scale

In this series on DevOps for Data Science, I’ve explained the concept of a DevOps “Maturity Model” – a list of things you can do, in order, that will set you on the path for implementing DevOps in Data Science. The final step in the DevOps Maturity Model is Load Testing and Auto-Scale. Note that you want to …

DevOps for Data Science – Application Performance Monitoring

In this series on DevOps for Data Science, I’ve explained the concept of a DevOps “Maturity Model” – a list of things you can do, in order, that will set you on the path for implementing DevOps in Data Science. The first thing you can do in your projects is to implement Infrastructure as Code …

The Keys to Effective Data Science Projects – Part 9: Testing and Validation

We’re continuing our series on the Keys to Effective Data Science Projects, this time focusing on Testing and Validating the Model. We're in the general phase in the Team Data Science Process called "Customer Acceptance". "Testing" in the general sense is the same in Data Science projects and any other typical software project - …

The Keys to Effective Data Science Projects – Part 8: Operationalize

We’re in part eight of our journey through the Keys to Effective Data Science Projects series - "Operationalization" - a term only a marketer could love. It really just means "people using your solution". And it's this part of the process that is quite possibly the most complicated, and usually the one done with the …

The Keys to Effective Data Science Projects – Part 7: Create and Train the Model

We’re in part seven of our series on the Keys to Effective Data Science Projects. This is the section that most people think of when they think of "Data Science". It's where we take the question, the source data which has been turned into the proper Features (and potentially Labels), and select an algorithm or two …

The Keys to Effective Data Science Projects – Part 6: Feature Selection

We're in part six of our series on the Keys to Effective Data Science Projects. I won't cover basic Feature Engineering in this article - it's a huge topic and central to working in Machine Learning areas. I do recommend you check out as many articles as you can find on the subject, and once …

The Keys to Effective Data Science Projects – Part 5: Update the Data

In this series on the “Keys to Effective Data Science Projects”, we've seen a process we can use, we've determined what we want to know, and we've ingested the data. In the last step we explored the data, and in a different way than we might be used to when working within a database …

The Keys to Effective Data Science Projects – Part 4: Explore the Data

We’re in a series on the “Keys to Effective Data Science Projects”. We've identified the question we want to solve, and made a preliminary pass at the data we need to answer that question. Next we brought in that data to a central location we can work with. We now want to explore that data. …