Learning Areas for SQL Server Big Data Clusters

If you’re a data professional, you know that it’s important to set aside some time for training when your platform ships a new release or paradigm. In the case of SQL Server 2019 (and later), you’ll want to pay close attention to the Big Data Clusters feature. It’s an exponential knowledge increase, and that’s no exaggeration.

There’s a lot to learn to implement SQL Server’s Big Data Cluster system. I’ll be covering these topics in more depth at various workshops, events, courses, webinars and presentations around the world, and I thought I might show a few of the things the data professional needs to understand to get ready.

Some of these technologies and concepts are not owned or created by Microsoft – the concepts are universal, and a few of the technologies are open-source. I’ve marked those in italics.

I’ve also included links to a few training resources I’ve found useful. I normally use LinkedIn Learning for larger courses, along with edX, DataCamp, and many other platforms for in-depth training. The links here are by no means exhaustive, but they are free and provide a good starting point.

Look for the training announcements I’ll post here on this blog to find out where our team is presenting these topics, and feel free to post comments on resources you have found useful.


Linux: Operating system used in Containers and Container management (Kubernetes)

git: Source control management system

Containers: Encapsulation level for the SQL Server Big Data Cluster architecture

Kubernetes: Management, control plane and security for Containers

Microsoft Azure: Cloud environment for services

Azure Kubernetes Service (AKS): Kubernetes as a Service

Apache HDFS: Scale-out storage subsystem

Apache Spark: In-memory, large-scale, scale-out data processing architecture used by SQL Server

Python, R, Java, SparkML: Programming languages and libraries used for Machine Learning and AI model creation

Azure Data Studio: Tooling for SQL Server, HDFS, Kubernetes cluster management, T-SQL, R, Python, and SparkML languages

SQL Server Machine Learning Services: R, Python and Java extensions for SQL Server

Microsoft Team Data Science Process (TDSP): Project, Development, Control and Management framework

Monitoring and Management: Dashboards, logs, APIs and other constructs to manage and monitor the solution

Security: RBAC, Keys, Secrets, VNETs and Compliance for solutions

If that looks like a lot, it’s because it’s a lot. Stay tuned – I’m with you on the journey. We’ll learn together.
