“Everything is a remix” – Kirby Ferguson (and probably a bunch of other people, too.)
I’ve produced a lot of content for YouTube since graduating college in 2007. Driver Digital was the first company I worked for, where I produced a few thousand videos for moms and kids. When I worked at Frederator Networks, as the VP of Audience Development, I oversaw all non-animation production and programming for our YouTube channels. At Little Monster, my YouTube production and consulting agency, we’ve produced hundreds of videos for clients.
Until about two years ago, when I began to develop The Taxonomy of Digital Video, I’d find myself in a room with an entire team of 20 or more people all asking, “What should we make?” It looked a lot like this:
My teams and I would sit around the conference table pitching ideas for different shows we could make, and most of the time it was relatively fruitless. People would either pitch stuff that had been done to death, slight variations on what we were already doing, or stuff that might do okay on TV but would never have a chance of success in digital. Occasionally a bolt of lightning would strike – a la 107 Facts on Frederator’s Leaderboard or Cartoon Conspiracy on Frederator’s main channel – and we’d create a show that generated millions of views.
After enough of these meetings – and admittedly a few shows that did not generate millions of views – I realized I needed a framework with which to understand YouTube content. The problem my teams and I had wasn’t that our ideas were bad; it was that we didn’t have a box in which to develop content. There was no structure. No framework. We thoughtfully wandered into good ideas, and that led to the hit-and-miss nature of what we produced.
This is when I began to think about content in the same way that I think about the algorithm: as something that can be analyzed structurally.
I looked on blogs and in bookstores for writings on digital video formats, structures, programming, anything(!), but I couldn’t find any substantive works and nothing at all about how to develop content using this knowledge. And of course I couldn’t find anything. Digital video is a relatively new medium still in its infancy – or at least early childhood.
The Taxonomy
So, I decided to do what I had done in the past with my writings here on Tubefilter — I’d just make the thing I wanted to read my own damn self. What I’ve developed is The Taxonomy of Digital Video.
The Taxonomy is a structure. It’s a way of understanding YouTube content that boils mysterious “X factors” down into easily perceivable, and repeatable, processes or Formats.
This will allow you to go to your creative teams, your companies, your businesses, your studios, etc. with an understanding and way of analyzing what content is currently doing well in your vertical, what’s missing from your vertical, and how the content you make can stand out, feel completely original, and generate millions of views.
Essentially, it’s a guide to developing unique content for YouTube.
Furthermore, I’ll show why the understanding of these core Formats is key to building a long term sustainable audience on YouTube.
[One quick disclaimer before we dive in: Every brand, creator, or show I mention in this presentation is mentioned because I am a big fan of what they’ve done. They are all incredibly talented and creative people and have succeeded for many reasons beyond what I mention here. I’m simply trying to show what lies beneath the surface and demystify a little of why their content is popular.]
Classifications
Let’s start with the more basic stuff first. When thinking about classifying a YouTube channel, series, or video or creating your own, we typically start at the “Vertical” category – as in what is the general area of interest (automotive, beauty, etc.). That’s followed by Format, Style, Length, Personality, and then Topic.
With that in mind, I believe the classification model of a YouTube video might look like this:
Don’t get me wrong, this is not a Matter of Importance chart (e.g. Vertical is no more important than Style). It’s just a systematic way of classifying and developing content. Personally, I think the personalities and characters are the most important thing to any media brand, show, or video. But they’re in this Classification Model because different archetypes or personalities are better suited to, and span, different types of content.
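To make the model concrete, here is a minimal sketch of how a single video could be recorded against the six classification levels. The class name, the field values, and the `describe` helper are all hypothetical illustrations, not part of the Taxonomy itself.

```python
from dataclasses import dataclass

# The six classification levels named in the article, in the order given.
LEVELS = ("vertical", "format", "style", "length", "personality", "topic")


@dataclass(frozen=True)
class VideoClassification:
    """One video described at every level of the classification model."""
    vertical: str
    format: str
    style: str
    length: str
    personality: str
    topic: str

    def describe(self) -> str:
        # Join the levels from broadest (Vertical) to narrowest (Topic).
        return " > ".join(getattr(self, level) for level in LEVELS)


# Hypothetical example values, chosen only for illustration.
example = VideoClassification(
    vertical="Entertainment",
    format="Listicle",
    style="V.O. on top of images / videos",
    length="10 minutes",
    personality="Narrator",
    topic="Movie facts",
)
print(example.describe())
```

The order of the fields is a classification order only. As noted above, it says nothing about which level matters most.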
Most of these elements are pretty self-explanatory. However, the category of Formats is where many YouTube series and popular creators really distinguish themselves and make content that feels fresh and unique.
The 8 Formats
First, let’s establish how we determined these Formats. We strip away all of the stylistic elements of a video and ask what the shared or primary structural characteristics of each video are. For example, the primary structural characteristic of a Listicle video is a list of things. Essentially, we can classify videos in the same way that we classify plants and animals: based on their shared primary characteristics.
These Formats are the Listicle, Explainer, Commentary, Interview, Music Video, Challenge, Reaction, and Narrative. These eight formats comprise the vast majority and potentially all of the popular formats on YouTube.
You may be thinking things along the lines of “vlogging / let’s plays / beauty tutorials aren’t a format?!” and you wouldn’t be wrong for thinking that.
Let’s think about an example though. Is Lilly Singh a vlogger? Are people who do “trying things” videos also vlogging, as they’re often talking directly to the camera and giving commentary on something just like a vlog? I think the answer to both of these questions is no. Vlogging is a combination of the Commentary format and the direct to camera style. Trying videos are typically a Challenge or Reaction video – or a combination of both.
Let’s start with an easy one. Everyone should be familiar with the Listicle format. We’ve all read listicles, whether on Buzzfeed, Cracked, or any of the thousands of other sites that pump them out. This format is as old as time – ever read the Ten Commandments? That’s just a listicle.
The Listicle format is familiar to all of us, and that’s one of the reasons why this format works so well on YouTube. The theory basically goes: if an audience understands what they’re watching from a structure standpoint, they are more likely to enjoy and continue watching that content.
Essentially, if your content meets viewers’ expectations in format, they will be far more likely to be “sticky” and watch for extended periods of time.
For example, if a film bills itself as an action movie, you know that the format will basically be: We’ll start with seeing the hero in their everyday life, an inciting incident will set them out on their “hero journey”, they’ll have to overcome some adversities, and then they’ll take on some bad dudes and ultimately win or not.
If a film bills itself as an action movie and instead you get a romantic drama you would likely walk out of the theater pretty quickly.
Similarly, if an audience clicks on a YouTube video expecting a Listicle, and it’s a basic makeup tutorial, they’re probably going to click away pretty quickly. Some great examples of Listicle videos can be found on the channels WatchMojo, Dark 5, and Matthew Santoro.
However, some content on YouTube disguises the Listicle component. For example, what if I told you Cinema Sins, with its 8.2 million subscribers and over 2.6 billion video views, is just a Listicle? Here, take a look:
They are literally just listing the “sins” of the movie, albeit quite humorously.
Expanding upon each format, these are their definitions and common components:
Listicle Video: A video that lists or ranks items.
Common Types:
Top ### Video
Best ofs
Things you don’t know
Many compilation videos
Common Elements:
Ranking and providing commentary as to why
Usually only 1 – 2 minutes per list item
Reading off of Wikipedia
Playing with or against the audience’s expectations / knowledge
Common Styles:
Direct to camera with over the shoulder images / videos
Cutaways to video
V.O. on top of images / videos
Primary Format Example:
Music Video: A video in which a song or music plays and is the primary purpose of the video.
Common Types:
Official music video
Cover
Parody
Lyric Videos
Common Elements:
Telling a story
Dancing
Over-the-top costumes / situations
Party scenarios
Performance scenarios
Common Styles:
All
Primary Format Example:
Narrative: A video that depicts fiction or fictionalized events.
Common Types:
Clips from film / tv
Parodies / Sketch comedy
Dress Up Play
Web series
Common Elements:
Characters / Props / Sets
Story arc
One-off videos
Humor
Common Styles:
All
Primary Format Example:
Interview: A video where questions are asked of a subject or interviewee.
Common Types:
1 on 1 interview
Answering pre-written questions
Q&A with fans
Common Elements:
Questions
Interviewer / Interviewee
Slower paced
Common Styles:
Multiple camera
Away from camera
Live stream
Primary Format Example:
Explainer: A video that explains or teaches a topic, or in some instances answers a question.
Common Types:
How-to videos
Educational videos
Science experiments
Common Elements:
The video poses a question to the audience, which it then answers
School subjects
Simplifying complex ideas
Common Styles:
Direct to camera
Green screen
Direct to camera with over the shoulder images/videos
Cutaways to video
V.O. on top of images / videos
Overhead of hands
Primary Format Example:
Challenge Video*: A video where one or more subjects are challenged to perform a task, be it physical, mental, or a competition between two or more people.
Common Types:
Try not to laugh
What’s in the box
Eating “gross” things
Debate
I tried XYZ
Common Elements:
Victory / Loss conditions
Sports
Gross-out factor
Things normal people don’t do
Things a category of person doesn’t normally do
Common Styles:
Direct to camera
Away from camera / to another person
Single camera
Multi-camera
Primary Format Example:
Reaction*: A video where the primary purpose is to show reactions to an event.
Common Types:
React videos
Pranks
Trying XYZ
Magic
Fail videos
What’s in the box
Unboxings
Common Elements:
Multiple people
Multiple things being reacted to
Showing onlookers
Shock / Gross-out factor
Table top
Pain
Humor
Common Styles:
Single camera
Direct to camera
To interviewer
Handheld camera
Primary Format Example:
*Most Challenge and Reaction videos these days are hybrids taking elements from both formats.
Commentary: A video that comments or provides opinion on a topic.
Common Types:
Vlogs
Gameplay / Let’s play
Unboxings
Reviews
Conspiracy Theories
Analysis/Commentary of tv / film / sports / books / etc.
Common Elements:
Single person
Scripting
Sitting in a room in a house
Opinion
Common Styles:
Single camera
Direct to camera
Overhead camera
V.O. on top of images / video
Primary Format Example:
Hybrid Formats
If we only follow this model, we’ll just be making the same thing that thousands (millions?) of other people have already made.
To make something that at least feels fresh and unique, we have to create Hybrid Formats.
This is what some of the biggest channels on YouTube have done. They’ve created shows or hybrid formats that feel unique, new, or original and audiences have rewarded them for it. You can apply this same exact formula, the mixing and matching of format elements, on your channels and at your companies.
These are some of the best examples of Hybrid Formats, from extremely popular YouTube channels:
Commentary / Narrative Hybrid
A great example of someone creating a Hybrid Format is Lilly Singh, or Superwoman. Lilly has one of the largest followings on YouTube, and every video she posts does millions upon millions of views. There’s no doubt that she’s incredibly talented and funny. But I’d argue that her true genius, or at least the spark that set her career ablaze, is in the unique Format concept she developed (or at least she was the first to really succeed with it).
From a strictly Format perspective, all she’s done is take the Commentary format in its primary style (direct to camera) and add parts of the Narrative format, specifically sketch comedy, through the characters she portrays and the sketches.
So, when we boil it down to its base elements, we see that while this feels incredibly unique upon a quick view, it’s really just two incredibly popular Formats woven together.
Listicle / Explainer Hybrid
Here’s an example of the blending of an Explainer video and a Listicle from 5 Minute Crafts, one of the most viewed channels in 2018.
Challenge / Reaction Hybrid
Rhett & Link make a lot of content in a lot of different Formats. One of their most popular Formats is a Challenge and Reaction hybrid.
Music Video / Challenge / Reaction Hybrid
One area where we’ve seen little innovation is in music videos. If you go back a few years there was a group called CDZA, which did some really amazing work. In this video they combine the Music Video with a Challenge and Reaction video.
Narrative / Multiple-Formats Hybrid
One of my favorite examples right now is Miranda Sings. Miranda also plays with a lot of different Formats, such as the Explainer format, the Challenge format, and so on, but she always mixes in an element of the Narrative format in the form of sketch comedy through her character.
The examples above are great at showing how various creators and media brands have taken standard base Formats, and added elements from other formats to make a Hybrid Format. These hybrid formats have helped them stand out significantly on YouTube. It makes their content feel fresh and unique, and can help drive more audience.
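The mixing and matching described above can be sketched as plain set membership: a base video uses one Format, a hybrid uses two or more. The `Format` enum below lists the eight Formats from this article, and the `HYBRIDS` table records the combinations named in the examples above. The function names and the alphabetical labeling are assumptions made for illustration.

```python
from enum import Enum


class Format(Enum):
    """The eight base Formats listed in this article."""
    LISTICLE = "Listicle"
    EXPLAINER = "Explainer"
    COMMENTARY = "Commentary"
    INTERVIEW = "Interview"
    MUSIC_VIDEO = "Music Video"
    CHALLENGE = "Challenge"
    REACTION = "Reaction"
    NARRATIVE = "Narrative"


# Format combinations attributed to each creator in the examples above.
HYBRIDS = {
    "Lilly Singh": frozenset({Format.COMMENTARY, Format.NARRATIVE}),
    "5 Minute Crafts": frozenset({Format.LISTICLE, Format.EXPLAINER}),
    "Rhett & Link": frozenset({Format.CHALLENGE, Format.REACTION}),
    "CDZA": frozenset(
        {Format.MUSIC_VIDEO, Format.CHALLENGE, Format.REACTION}
    ),
}


def is_hybrid(formats: frozenset) -> bool:
    """A hybrid draws on more than one base Format."""
    return len(formats) > 1


def label(formats: frozenset) -> str:
    """Build a readable label, with Format names in alphabetical order."""
    names = " / ".join(sorted(f.value for f in formats))
    return f"{names} Hybrid" if is_hybrid(formats) else names


for creator, formats in HYBRIDS.items():
    print(f"{creator}: {label(formats)}")
```

A table like this is also a quick way to audit a vertical: list the combinations competitors already use and look for the pairs nobody has tried.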
Question: Why Do Hybrids Matter?
Beyond not wanting to make content that feels stale before it’s even uploaded, Hybrid Formats can drive huge and sustainable audiences. Let’s take one of the worst performing base Formats in YouTube history, the Interview, as an example.
Many people have tried to make an Interview show on YouTube successful, some pouring millions of dollars into it, some featuring huge celebrities, and some not. But regardless of the various components and budgets, 99 times out of 100 they’ve failed.
However, if we look at what Complex did with Hot Ones and what Condé Nast has done across multiple channels and multiple shows, it’s truly phenomenal. They’ve managed to take one of the oldest and worst performing formats on YouTube – the Interview – and make incredibly successful shows.
First, let’s talk about Hot Ones. Hot Ones is a show on the First We Feast channel. The basic concept is a standard Interview: the interviewer asks questions of the interviewee. However, the brilliance and success of this show lies in the slight adjustment they made to the Format. They added elements of the Challenge format.
During each interview, the guests eat hotter and hotter wings until they get to the hottest one. That part is essentially just a Challenge video.
So two Formats – Interview and Challenge – married together make this successful show.
Let’s put a pin in that for now and come back to it in one second.
Next, let’s look at 73 Qs. 73 Questions marries the Listicle with the Interview. This may have been enough to make this show incredibly successful, but they went three steps beyond.
First, they change the style of the standard interview from two people talking to each other with a static camera or multiple cameras, and instead have the subject speak directly into the camera – which we know is the most successful style on YouTube.
Second, in each video they go on a house or office tour, where the viewer gets to see where the subject lives or works. This is essentially the Format of a classic Commentary video: the “room / dorm / house tour.” Here’s what it looks like when it’s all put together:
Answer: The Algorithm and View Velocity
Other than the fact that these shows married two Formats to make something stale feel very new, what do these shows have in common?
Both of these shows give the AUDIENCE a reason to watch that has absolutely nothing to do with the TOPIC. This is incredibly important for long term sustainable growth because it massively contributes to View Velocity.
View Velocity – the rate at which a NEW UPLOAD gains viewership – is incredibly important for how many views that video will ultimately get, as illustrated in my previous research “Reverse Engineering the YouTube Algorithm” and “Cracking YouTube.” View Velocity is essentially a product of how many impressions your Title and Thumbnail get, the Click Through Rate on those impressions, and how quickly that happens. The greater your View Velocity, the greater the chance you have of YouTube’s algorithm putting your video in front of a broader audience (by way of appearing in YouTube’s Suggested and Recommended Videos sections, search results, and more).
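One way to read that description is as simple arithmetic: impressions times Click Through Rate gives views, and dividing by elapsed time gives a rate. The sketch below is only that reading. YouTube does not publish a View Velocity formula, so the function name, its parameters, and the sample numbers are all assumptions for illustration.

```python
def view_velocity(
    impressions: int,
    click_through_rate: float,
    hours_since_upload: float,
) -> float:
    """Approximate views gained per hour for a new upload.

    Illustrative only: this is not a formula published by YouTube.
    """
    if hours_since_upload <= 0:
        raise ValueError("hours_since_upload must be positive")
    if not 0.0 <= click_through_rate <= 1.0:
        raise ValueError("click_through_rate must be between 0 and 1")
    # Views come from impressions that turn into clicks.
    views = impressions * click_through_rate
    # Velocity is how quickly those views accumulated.
    return views / hours_since_upload


# Hypothetical numbers: same impressions, different CTR and time window.
slow = view_velocity(10_000, click_through_rate=0.04, hours_since_upload=48)
fast = view_velocity(10_000, click_through_rate=0.08, hours_since_upload=24)
print(f"slow upload: {slow:.1f} views/hour")
print(f"fast upload: {fast:.1f} views/hour")
```

Under this reading, doubling the Click Through Rate and halving the time window each double the velocity, which is why the first hours after upload carry so much weight.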
So, if View Velocity is an essential component for the success of my channel, the next rational question is, “How do I get the most View Velocity?” Well, the real questions you’re asking are, “What makes someone make the choice to click and watch a video? What are the reasons?”
Well I think our Taxonomy explains the possible reasons. They either:
Like the Format and/or Style
Are interested in the topic
Like the talent
or a combination of the above.
(I’m using “like” here to mean the viewer gets their desired emotion from watching, be it happiness, sadness, anger, etc. This doesn’t mean the user clicks a heart on the website.)
So let’s go back to our Hybrid Interview Formats. Again, both of these shows give the AUDIENCE a reason to watch that has absolutely nothing to do with the TOPIC.
Not interested in Bill Burr? Well you can still enjoy the video to see what happens when he eats that super spicy hot wing. Couldn’t care less about Kendall Jenner? Well you can still enjoy seeing how a multi-millionaire lives.
The effect of having fans of your show or channel as a whole is incredibly powerful for View Velocity. You’re no longer topic- or talent-dependent for views. You have real fans that will watch every episode, not just the episodes/videos they’re interested in at a topical level.
For example, if you have a channel that talks about a large number of different topics, and there isn’t an underlying talent or format reason for the audience to watch, you will have a segmented audience (like the one on the left in the image below). In this scenario, your channel will have significant trouble growing and may eventually enter a YouTube death spiral, because it will not generate enough View Velocity on any individual video to let YouTube’s algorithm know the video should be shown to a wide audience.
Conversely, if you have fans of a Format (or talent), they’ll watch just about anything you upload (like the circle on the right, which generates far more View Velocity).
In closing, this is the best piece of Algorithm or audience development advice I can give you: Make a hybrid-format, in a style endemic to the platform, with good talent. If you do that, you’ll be way ahead of the vast majority of YouTube channels.
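The difference between a segmented audience and an audience of Format fans can be sketched with a toy model. A viewer watches a video if they like its topic, its Format, or its talent, which are the three reasons listed earlier. Every name, viewer, and video below is hypothetical, and the model ignores everything else that influences a real click.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Viewer:
    """What a (hypothetical) subscriber likes."""
    topics: frozenset = frozenset()
    formats: frozenset = frozenset()
    talent: frozenset = frozenset()


@dataclass(frozen=True)
class Video:
    topic: str
    format: str
    talent: str


def will_watch(viewer: Viewer, video: Video) -> bool:
    # Any one of the three reasons is enough to earn the click.
    return (
        video.topic in viewer.topics
        or video.format in viewer.formats
        or video.talent in viewer.talent
    )


def early_viewers(audience: list, video: Video) -> int:
    """Count subscribers who would watch a new upload."""
    return sum(will_watch(viewer, video) for viewer in audience)


# Three uploads on unrelated topics, all in the same hybrid Format.
uploads = [
    Video(topic, "Interview / Challenge", "Host")
    for topic in ("comedy", "music", "sports")
]

# Segmented audience: each subscriber cares about one topic only.
segmented = [Viewer(topics=frozenset({t})) for t in ("comedy", "music", "sports")]

# Format fans: each subscriber likes the Format, whatever the topic.
format_fans = [Viewer(formats=frozenset({"Interview / Challenge"}))] * 3

for video in uploads:
    print(
        video.topic,
        "segmented:", early_viewers(segmented, video),
        "format fans:", early_viewers(format_fans, video),
    )
```

With the same three subscribers, the segmented channel gets one early viewer per upload while the Format-fan channel gets all three every time.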