Data Science and the Lytro Camera

Since the first practical camera was invented in the 1800s, it has been used as a scientific tool. In essence, it’s a database – albeit one that stores shades of light rather than 0s and 1s, and one that didn’t use a hard drive (at least at first).

In 2012, a new type of camera was introduced, the Lytro. Actually, it’s not technically a camera – it’s a “Light Field Imaging Platform”, because it differs from previous cameras in a very significant way. A conventional camera captures the light from a single moment at a chosen plane of focus. Settings such as shutter speed and aperture control how much light reaches the recording medium and exposes the image. In the past, this medium was light-sensitive chemicals on film, and later it was replaced by the electronic sensors in digital cameras and phones. The interesting part is that the entire camera is designed to tune out most of the information you are looking at, and capture only the parts you care about.

The Lytro is different – it records all of the information it can, completely unfocused and mostly uncompensated. But what use is that? Viewed directly, the information is a mess – it’s too dense. But that’s where the difference comes into play. Using software, you tell the system what you want to focus on, and how to focus on it, after you have all of the information. Essentially you have every picture you could ever want from that one field of view – you just need to process it later into whatever focus and light you care about. You can get hundreds of pictures this way from taking just one shot.

In working with Business Intelligence or Data Warehousing, it’s common to follow an “Extract, Transform and Load” (ETL) process. You find your source data, then decide what formatting, data types, lengths and other tuning that data needs to make it homogeneous. You might load it into staging tables using tools like SQL Server Integration Services (SSIS), or change it as it streams in so that it arrives in its final form. The point is to give the data a shape that is well suited to the reporting and exploration you want to perform on it, because you know in large part the type of queries you want to run on that data.
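To make the ETL idea concrete, here is a minimal sketch in Python. It is not SSIS and not any particular warehouse; the rows, the column names and the `transform` and `total_cents_between` helpers are all made up for illustration. The point it shows is that types are fixed once, before loading, because the report to be run is already known.

```python
from datetime import date, datetime
from decimal import Decimal

# Hypothetical source rows, as they might arrive from an export: all strings.
raw_rows = [
    {"order_id": "1001", "order_date": "2015-03-02", "amount": "19.99"},
    {"order_id": "1002", "order_date": "2015-03-05", "amount": " 5.00 "},
]


def transform(row):
    """Coerce one raw row into the fixed types the reporting table expects."""
    return {
        "order_id": int(row["order_id"]),
        "order_date": datetime.strptime(row["order_date"], "%Y-%m-%d").date(),
        # Store money as whole cents; the original text (and its spacing) is gone.
        "amount_cents": int(Decimal(row["amount"].strip()) * 100),
    }


# "Load" step: in this sketch the staging table is just a list in memory.
staging_table = [transform(r) for r in raw_rows]


def total_cents_between(rows, start, end):
    """The known report: total sales in a date range, with no parsing at query time."""
    return sum(r["amount_cents"] for r in rows if start <= r["order_date"] <= end)


march_total = total_cents_between(staging_table, date(2015, 3, 1), date(2015, 3, 31))
```

The date-range query is cheap because the parsing already happened, but anything `transform` threw away (here, the original strings) is no longer available to later questions.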

In Data Science, it’s quite the opposite. Any change to the data loses fidelity – even normalizing the type of data, such as changing text to numbers, is fraught with peril. In fact, the process changes from ETL to ELT: Extract, Load, and only when you query the data do you Transform it. In Data Science, you want the data to be as pure to the source as possible – because you aren’t sure what you want to ask it yet. You’ll also use the data multiple times, with multiple systems, each of which might have its own type of processing engine or data shape requirements.
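A minimal sketch of the ELT approach, again in plain Python with invented data: the `raw_store` rows and the `as_float`, `mean_reading` and `count_unparseable` helpers are assumptions for illustration, not part of any real system. The raw values are loaded untouched, and each question applies its own transform when it is asked.

```python
# Hypothetical raw store: every value is kept exactly as the source sent it.
raw_store = [
    {"reading": "21.5", "taken": "2015-03-02"},
    {"reading": "n/a", "taken": "03/05/2015"},
    {"reading": "22", "taken": "2015-03-07"},
]


def as_float(text):
    """Query-time transform: return a float, or None when the text is not numeric."""
    try:
        return float(text)
    except ValueError:
        return None


def mean_reading(rows):
    """One question: the average of the readings that parse as numbers."""
    values = [v for v in (as_float(r["reading"]) for r in rows) if v is not None]
    return sum(values) / len(values)


def count_unparseable(rows):
    """A different question that only the untouched data can answer."""
    return sum(1 for r in rows if as_float(r["reading"]) is None)


average = mean_reading(raw_store)
bad_rows = count_unparseable(raw_store)
```

Had the load step silently dropped or zeroed the "n/a" reading, the second question could not be asked at all. The cost is that every query pays for its own parsing.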

So when you’re thinking about the base data you’ll use in your Data Science projects, think Lytro, not Kodak. Not that there’s anything wrong with Kodak, of course – it’s just that the more you leave the data in its original shape, the more systems you can process it with, and the more options you have for working with it. Storage is cheap, so bring it all in. And leave it alone – for now.

5 thoughts on “Data Science and the Lytro Camera”

  1. My system is purpose-built. It does all of the heavy lifting up front. I don’t care to transform string dates to date-time objects on the fly; I want them there in that form, waiting for the user to query a date range (or a number, or a string, or a currency value, or…) Performance trumps paranoia.

    If you can’t trust your transform routines, there’s something wrong with your process, not with the data. Provenance is guaranteed through my process, via rigorous dev and testing.


  2. Nice analogy for those of us like me (in awe at data science and who may be holding the camera backward even).

    I am trying to come up with a witty take on a data lake… and data… and not having shape… and maybe something to do with an ETL boat… or not. I know — ice Query sculptures. meh.

