Velocity, Volume and Variety
The operators’ perception of the value of TV usage data is rapidly changing. Most of them are realizing they may be sitting on a treasure trove of information about viewers’ habits and preferences. The introduction of new consumer technologies means that they are about to witness rapid growth in the velocity, volume, and variety of the data to come.
Long fueled by comfortable subscription fees, operators with TV services historically didn’t face the same pressure that their poorer, younger cousins in the online video business struggled with. The effort of extracting value from datasets simply wasn’t worth it or, worse, would blur the operators’ image as a trusted provider of premium services.
Historic broadcast protocols (i.e., with no return path) wouldn’t allow for much data collection anyway. And even once IP-enabled, the networks, often managed as silos, wouldn’t have much more to offer than inexpressive data collected at the household level (e.g., channel zaps, DVR, catch-up). This is very different from the rich, expressive signals of intent typically offered by online services.
This situation is further complicated by stringent legal or regulatory requirements that protect the privacy of viewers by forbidding the unilateral sharing of personally identifiable viewing records (e.g., the Video Privacy Protection Act or the Cable Communications Policy Act in the United States).
A major shift in the perception of the value of operators’ data is happening right before our eyes, as operators wake up to the opportunities and rewards that a successful data analytics implementation can bring. And despite growing caution about the way their personal information is used, customers, by and large, have proved willing to trade data for an improvement in services. We’ll be writing more about the specific data-driven opportunities for operators, and the journey currently underway, in a forthcoming post.
But first we need to talk about the 3Vs — Velocity, Volume & Variety — and the challenges that the industry is going to face in keeping up with them over the next handful of years.
This (r)evolution is illustrated succinctly in the table below.
Paradigm shift, from “Family TV” to “Immersive TV”
As mentioned above, we are in the first stages of a transition period that is going to result in an astonishing rise in the amount of data captured. Let’s try to illustrate this with figures. For the sake of this exercise, we make simple and conservative assumptions about event collection, notably in terms of:
- data normalization: we assume minimal data redundancy (e.g., no extra fields added to facilitate search queries)
- data minimization: we assume the data collected is limited to only the fields necessary for the product or service being offered (e.g., a “regular” mobile video app doesn’t a priori “need” to collect data from the phone’s gyrometer in real time, while this data may be needed to provide a better Virtual Reality (VR)/360 immersive video service on the same phone)
In practice, operators may have to deal with much bigger volumes.
The legacy paradigm (“Family TV”) measures viewer data captured by devices such as the set-top box (STB) at a household level, logging what is being watched and when. A conservative estimate of 100 events captured per household per day could translate into a data volume of approx. 0.5TB per million households (HH) per month (an order of magnitude, assuming 1 TV/STB per HH).
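To make the arithmetic behind the “Family TV” figure explicit, here is a minimal sketch that works backwards from the numbers above to the average event size they imply. The 30-day month and the use of decimal terabytes (1 TB = 10^12 bytes) are assumptions made for this illustration, not figures from the estimate itself.

```python
# Back-of-the-envelope check of the "Family TV" estimate.
EVENTS_PER_HH_PER_DAY = 100       # from the text
HOUSEHOLDS = 1_000_000            # "per million households"
DAYS_PER_MONTH = 30               # assumption for this sketch
MONTHLY_VOLUME_BYTES = 0.5e12     # 0.5 TB, assuming decimal terabytes

# Total number of events collected across all households in one month.
events_per_month = EVENTS_PER_HH_PER_DAY * HOUSEHOLDS * DAYS_PER_MONTH

# Average event size implied by the 0.5 TB figure.
implied_event_size_bytes = MONTHLY_VOLUME_BYTES / events_per_month

print(f"Events per month: {events_per_month:,}")
print(f"Implied average event size: {implied_event_size_bytes:.0f} bytes")
```

Under these assumptions, the estimate corresponds to roughly 167 bytes per event, which is in line with the small, normalized payloads assumed above.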
With the advent of OTT technologies, we are currently transitioning to the measurement of an individual’s digital footprint (the “Personal TV” paradigm). This folds in the smartphone and tablet and ramps up both the amount of data and the number of fields being captured to include impressions, views, clicks, social media interactions, geolocation data and more (more events being collected, bigger payloads). With an average of four connected devices per household (and rising), this could take the data load up to an estimated 5TB per million households per month.
Again, we need to point out that these figures are based on conservative event-size assumptions; in practice, data volumes can be much higher (see, for example, OTT provider Hulu’s estimates).
This journey from “Family TV” to “Personal TV”, with its increase in data volume by an order of magnitude, is about to be eclipsed by the arrival of “Immersive TV” and the capture of viewers’ physical interactions with their environment, enabled by VR and the Internet of Things. Head-Mounted Displays (HMDs) and wearable devices will result in what amounts to a ‘generational jump’ in the volume, variety and velocity of data that can be collected.
Viaccess-Orca’s VR/360 video analytics platform, showing in red the areas most looked at by virtual reality / 360-degree video users
If we consider volumes only, we estimate that the data streams from multiple sensors, such as the accelerometers and gyrometers built into VR devices, will lead to an explosion in real-time data (i.e., collected each second). Simple assumptions (e.g., 2 HMDs per HH, 2h of viewing per day) can help illustrate the extent of the revolution about to unfold before our eyes: a back-of-the-envelope calculation points to an increase of two orders of magnitude, to a little less than 500TB per million households per month.
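As a rough illustration of that calculation, the sketch below counts the sensor samples implied by the stated assumptions and derives the per-sample payload that a 500TB monthly volume would correspond to. Reading “collected each second” as one sample per HMD per second, using a 30-day month, and rounding the volume up to 500TB (decimal units) are assumptions of this sketch.

```python
# Back-of-the-envelope check of the "Immersive TV" estimate.
HMDS_PER_HH = 2                   # from the text
VIEWING_HOURS_PER_DAY = 2         # from the text
SAMPLES_PER_SECOND = 1            # assumed reading of "collected each second"
HOUSEHOLDS = 1_000_000            # "per million households"
DAYS_PER_MONTH = 30               # assumption for this sketch
MONTHLY_VOLUME_BYTES = 500e12     # "a little less than 500TB", rounded up

# Samples produced by one household in one day.
samples_per_hh_per_day = (
    HMDS_PER_HH * VIEWING_HOURS_PER_DAY * 3600 * SAMPLES_PER_SECOND
)

# Samples produced by all households in one month.
samples_per_month = samples_per_hh_per_day * DAYS_PER_MONTH * HOUSEHOLDS

# Average payload per sample implied by the monthly volume.
implied_sample_size_bytes = MONTHLY_VOLUME_BYTES / samples_per_month

print(f"Samples per household per day: {samples_per_hh_per_day:,}")
print(f"Implied payload per sample: {implied_sample_size_bytes:.0f} bytes")
```

With these inputs, each household produces 14,400 samples a day, and the 500TB figure works out to a little over 1KB per sample. A higher sampling rate or a richer sensor payload would push the total up accordingly.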
And if TB-per-million-household figures are hard to visualize, consider the same explosion at the level of an individual household: an increase from 500KB a month to 500MB a month over the space of the next few years.
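The conversion from the per-million-household figures to a single household is a straightforward division, sketched below for the three paradigms. Decimal units (1 TB = 10^12 bytes), a 30-day month, and rounding the “Immersive TV” volume up to 500TB are assumptions of this sketch.

```python
# Convert "TB per million households per month" into per-household figures.
TB = 10**12   # decimal units assumed
MB = 10**6
KB = 10**3
HOUSEHOLDS = 1_000_000
DAYS_PER_MONTH = 30   # assumption for this sketch

# Monthly volumes quoted in the text, in TB per million households.
TB_PER_MILLION_HH_PER_MONTH = {
    "Family TV": 0.5,
    "Personal TV": 5,
    "Immersive TV": 500,  # "a little less than 500TB", rounded up
}

def monthly_bytes_per_household(tb_per_million_hh):
    """Return the monthly volume for a single household, in bytes."""
    return tb_per_million_hh * TB / HOUSEHOLDS

per_household = {
    name: monthly_bytes_per_household(volume)
    for name, volume in TB_PER_MILLION_HH_PER_MONTH.items()
}

for name, monthly in per_household.items():
    daily = monthly / DAYS_PER_MONTH
    print(f"{name}: {monthly / MB:g} MB per month, {daily / KB:.1f} KB per day")
```

This gives 0.5MB (500KB), 5MB and 500MB per household per month respectively, that is, a factor of 1,000 between the first and the last paradigm.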
Even Bigger Data
It will be precisely the companies that can handle that increase in the Velocity, Volume and Variety of data, those prepared to process and, crucially, understand the 16.5MB of information flooding in from every household every day in the not-so-distant future, that will benefit the most from these paradigm changes.
Big Data is growing into Bigger Data, and Even Bigger Data is on the near horizon. How can broadcasters and operators deal with that tsunami of information? What sort of issues will this movement bring, such as privacy concerns and the growing realization of data as currency? Stay tuned, as we will return to these topics.