Big Data is just Data

A few years ago it was all the rage to talk about “Big Data”. Lots of descriptions of “Big Data” popped up, including the “V’s” (Variety, Velocity, Volume, etc.) that proved very helpful. I even have my own definition:

Big Data is any data you can’t process

in the time you want

with the systems you have

Uncle Buck’s Guide to Technology
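If you want that definition as something you can poke at, here is a rough sketch in Python. It is only an illustration: the function name `is_big_data`, its parameters, and the idea of estimating processing time as size divided by a single throughput number are my simplifying assumptions, not a formal test.

```python
def is_big_data(data_bytes: float,
                throughput_bytes_per_sec: float,
                time_budget_sec: float) -> bool:
    """Return True if the data can't be processed in the time you want
    with the systems you have.

    Hypothetical helper: it assumes processing time is simply
    data size / sustained throughput of your current system.
    """
    if throughput_bytes_per_sec <= 0:
        raise ValueError("throughput must be positive")
    estimated_seconds = data_bytes / throughput_bytes_per_sec
    return estimated_seconds > time_budget_sec


# 1 TB at 100 MB/s takes about 10,000 seconds, which blows a one-hour budget.
one_tb_in_an_hour = is_big_data(1e12, 1e8, 3600)

# 1 GB at the same rate takes about 10 seconds, so it is just "data".
one_gb_in_an_hour = is_big_data(1e9, 1e8, 3600)
```

Notice that the same terabyte stops being “big” if you buy a faster system or relax the deadline, which is the whole point of the definition.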

Data professionals focused on learning technologies like Hadoop and Spark to ingest, process, and distribute large sets of data. This set of activities is often now called “Data Engineering”. I still teach this term when I talk about the new SQL Server 2019 Big Data Cluster platform.

We’ve moved on to talk about using all that data in applications of Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL). I teach this as part of the operationalizing function of the SQL Server Big Data Cluster.

(With AI and ML, the usefulness of additional data tends to “top out”, although only at a really high volume. Deep Learning, on the other hand, is always hungry for more data.)

But the “Big Data” moniker has gone largely silent – which means it’s not a thing any more, right?

No, that’s not right at all – Data Scientists were always rather baffled at having to explain “big” data, since the algorithms we use require statistically significant amounts of features and labels to work. For a Data Scientist, it’s always been just “data”. (Or data, if that’s how you pronounce it.)

So the industry is now catching up with those Data Science concepts. The term “Big Data” has died out in the Hype Cycle, but it is baked into the AI, ML, and DL applications.

Long story short: You still have to learn HDFS, Spark, and Distributed Storage / Computing. And we’ll help you do that here: https://github.com/microsoft/sqlworkshops/tree/master/sqlserver2019bigdataclusters

FYI – it’s OK to have catchy terms and ways of describing things at first – new industries always do that. Also, it really helps rake in the money for vendors that use the cool new term. Remember when everything was i-this and e-that? No? Well, everything was. And before that, it was all “electro” or “magna-” something or other. Of course, when I started in computing, we just used cards and teletypes, but that’s another blog.
