The Four Basic Strategies For YouTube Viewership Growth

For 14 or so years, I have been making a living building some of the largest audiences on the YouTube platform. I’ve seen a lot of videos and channels. 

What I always find the most interesting, though, is that even after all of the time I’ve spent, all of the videos I’ve watched, all of the papers I’ve written here on Tubefilter, all of the conversations I’ve had with other strategists, all of the data analyses and spreadsheets and reports that I’ve presented–there are still new things to discover, analyze, dissect, and share.

At my agency Little Monster, we’ve been spending a lot of time recently thinking about macro programming strategies on YouTube. What we’ve boiled it down to is that YouTube channels employ one of four typical growth strategies: 

  1. Mass Upload

  2. Search / Utility

  3. Home Run Game

  4. The Little Monster Method

You can see each of these strategies represented in the top 100 most viewed channels on YouTube each month, and all have pros and cons. 

I’m going to lay out for you the foundation of each strategy, why it works, how and where it can fail, and the best practices required for each to succeed.

STRATEGY #1: Mass Upload

The T-Series Channel

The Mass Upload strategy is exactly what it sounds like (and the antithesis of the Little Monster Method…but more on that later). Within this strategy, a channel is not trying to maximize every upload, they’re trying to maximize upload-ing. In short, the mass-uploader is placing as many bets (videos) on the board as possible. The basic proposition is more videos should equal more views. And it’s not necessarily wrong.

Of the top 10 most viewed channels in the last 30 days as of this publication, according to Gospel Stats, four of them, including the top two most viewed channels, T-Series and SET-India, have more than 15,000 videos uploaded. Twenty-three channels in the top 100 have uploaded more than 10,000 videos, and 43 channels have uploaded more than 1,000 videos (YouTube Shorts channels excluded).

This strategy is largely employed by media companies with large libraries of content, and it would be difficult to imagine an independent creator being able to pursue this strategy in a sustainable way. However, it’s not impossible. 

Take gamers, for instance. A little back-of-the-napkin math would indicate that if a gamer was able to upload 10 clips a day, they would be able to get to 10,000 uploads over the course of nearly three years. 
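To make that napkin math explicit, here is a minimal Python sketch. The function name is my own, and the constant daily pace is an assumption; real upload schedules are rarely that steady.

```python
import math


def days_to_reach(target_uploads: int, uploads_per_day: int) -> int:
    """Whole days needed to hit a library size at a constant daily pace."""
    return math.ceil(target_uploads / uploads_per_day)


# Figures from the gamer example: 10 clips a day, 10,000 uploads as the goal.
days = days_to_reach(10_000, 10)
years = days / 365

print(f"{days} days, about {years:.1f} years")
```

Swap in your own daily pace to see how quickly the timeline stretches: at one upload a day, the same library would take more than 27 years.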

Large media organizations don’t have this problem. They can use the mass upload strategy to generate billions of monthly views, and when they do, there are a few high-level best practices essential to success. 

First and foremost, for this strategy to work, a channel must recognize that what they are really building is a content library. This library will primarily thrive on: 

1.1. Content Quality

The word “quality” is subjective here, but let’s be honest: content farms typically mass-produce garbage videos and rarely succeed in the long run. 

5-Minute Crafts monthly YouTube Views from January 2019 to May 2021.

On the other hand, the channel T-Series and similar brands succeed not just because of the volume of their library, but also the quality of the content therein. Just eyeballing on a cost-per-minute comparison, the videos being distributed on the T-Series channel probably cost 10x your average YouTube video or more. 

The importance of this cannot be overstated. They are not just pumping out hot garbage on a plate. These are often music videos, movie trailers, or clips from TV shows and movies produced straight out of Bollywood that viewers are looking for.

That last bit is vital to the success of the Mass Upload strategy. If viewers are not looking for the type of content being mass produced, there’s a low likelihood of success with mass-uploading. Movieclips works because hundreds of millions of people are looking for these videos on a monthly basis.

Recent Videos From Fandango’s Movieclips

1.2. Never Slow Down

Once a channel has begun to move toward mass uploading, growth is predicated on keeping a mass-upload pace or accelerating. Constantly adding to the library means that the channel will constantly create new opportunities to be recommended to viewers by YouTube. 

Most of these uploads will not generate views in the long run. Some content will get a second wind and spike for one reason or another. But every now and then a mass-uploaded video will crank out millions of views a day, every day, for years. 

This is where the Mass Upload strategy pays off. 

By having many thousands of videos in the channel library, the mass uploader does not need every video to generate thousands or millions of views on a daily basis in perpetuity. They only need a few to break through. 

1.3. Thumbnails and Metadata

Mass Upload channels depend largely on their library to drive viewership (as opposed to the success of each new upload). Despite having nearly 200 million subscribers, a new video on T-Series rarely breaks 100,000 views. The vast majority of their 3 to 4 BILLION monthly views are coming from their library of past uploads (videos older than 30-90 days). 

In order for these videos to do well over time and be consistently recommended in YouTube’s main arteries (suggested, browse/homepage, and search) they need thumbnails that accurately represent the video, and metadata that is clear so that people can find what they are looking for. 

In a recent Creator Insider video, YouTube stated that the recommendation system “finds videos for viewers, not viewers for videos.” 

So, in order for YouTube to find your video for a viewer, the content must be doing a good job of satisfying the viewers who have already clicked on it. Thus, the importance of accuracy. When YouTube sees that viewers are finding what they are looking for in your video, the recommendation system will continue to serve that video to similar viewers, or to viewers looking for that thing. 

Please note that you do not need a master’s in YouTube “SEO” to succeed here. There’s no amount or “secret sauce” of keywords that will help you rank here. Those days are LONG gone. What matters is: 

  1. Viewer satisfaction: Did you deliver on your promise? Did people hang out and keep watching?

  2. Clickthrough rate: You can’t satisfy viewers if they don’t click.

  3. Accuracy: You likely won’t “rank” for a term or topic if the audience and/or the recommendation system believe your video thumbnail or title is misleading.

Mass Upload channels do not need a highly engaged audience to succeed, and may in fact be some of the least engaged with channels given how few views new uploads receive. This strategy is about being everything to everyone (Walmart), instead of being a specific thing to a specific audience (Tiffany’s). The play here is to simply flood the zone with endless uploads as fast as possible. Go fast, and accurately portray the videos in your thumbnails and metadata and you can succeed. 

STRATEGY #2: “Search” & The Utility Channel

A “Utility Channel” is how we at Little Monster refer to a channel whose viewership is largely driven by direct informational value to the intended audience. These are essentially channels that make videos where the viewer is there to receive specific information. 

This type of video is the second most sought after behind entertainment. In the September 2018 whitepaper YouTube Needs: Understanding User’s Motivations to Watch Videos on Mobile Devices, Google reported that of the 432 viewers they studied, 30.7% of the visits were reported as “to obtain information.” 

Huge audiences can be built here, but it’s difficult. Let’s take for example Khan Academy, whose viewership has little to do with how frequently they post, or the quality of any individual video. Khan Academy’s viewership has far more to do with when people are in school and looking for specific information on the classes they are taking. Note how their viewership thrives when class is in session, and plummets over the summer and over the holidays:

Khan Academy’s monthly YouTube views from January 2020 to May 2021.

While this is a pattern that generally seems to be true for similar educational or curriculum-based channels, other types of utility channels experience a strikingly similar need vs. viewership correlation. 

Let’s take a car review channel, Redline Reviews. This channel regularly releases videos reviewing various cars, and their viewership can swing by as much as 300% upload to upload.

These two videos released within a few days of each other: 

Two videos from Redline Reviews

What’s driving (pardon the pun) the viewership here is the need for information on a specific car. There are far more people looking for information on a brand-new model in the price range of a Volkswagen ($32,000) than on a car that has been around for a long time and sits in the price range of a BMW ($45,000). 

Another type of channel that we at Little Monster call a utility channel is the recipe channel. The viewership here is predicated on a potential viewer’s need to know how to make the food item being advertised in the thumbnail and title. Again, we still see sporadic viewership, ranging from a few thousand views to tens of thousands of views. As an example, here’s what the viewership on BuzzFeed’s Tasty looks like over 24 recent uploads: 

Videos from BuzzFeed’s Tasty Channel

What these channels all have in common is that viewership is largely based on a potential viewer’s interest in the topic of the video. The viewer chooses to watch or not watch based on that particular video’s specific relevancy to them. 

I’ve written and spoken extensively about the danger of having viewership be based on topicality previously. That said, in short, what this type of programming strategy leads to is unpredictable viewership, large swaths of your audience becoming less likely to be served your videos over time, and ultimately a mass upload strategy. 

This leads to a mass upload strategy because that is the best way to succeed as a utility channel: the more videos you have that could potentially be relevant to a viewer, the more likely you are to gain viewership. The calculation is simple. If you want to grow as a utility channel, the more topics you cover (videos), the more potential viewership you will have. What is a library other than a utility to find the information you want? 

Therefore, the same best practices that exist for Mass Upload hold true for a utility channel, with a few small tweaks:

2.1. Thumbnails

Your channel’s niche largely determines the thumbnail style that is most likely to get your videos viewed. Recipe channels typically use glamour shots of the finished food product, car review channels use glamour shots of the featured car, and so on. These selections accurately represent the video but, more importantly, adhere to the conventions that audiences have become used to and that successful channels have been conditioned to use. Essentially, those channels that have come before you in a given niche have done the hard work of finding out what viewers are most likely to click on through years of trial and error. 

2.2. Being first

Utility channels can significantly benefit from being first to do something. When the latest tech or gadget comes out, those who are able to secure early access and get videos uploaded quickly tend to outperform others who come after. 

The reason for this is that if your video is one of the first to be uploaded on a given topic, it’s a video that YouTube will already have processed the data around. This means your video, if it has decent metrics, will likely be served at the top of Search, in Homepage feeds, and at the top of Suggested, as interest and viewership spikes on a given topic. Additionally, there’s likely to be a small supply of videos on the specific topic of interest, and therefore your video is more likely to be put in front of viewers rather than others.

This becomes an upward spiral as the increase in views leads to higher clickthrough rates as the video generates more and more social proof in the form of views. This leads to YouTube continuing to serve your video to more and more people in more and more places (assuming other factors remain relatively constant).

2.3. Specific Niche

While mass upload channels can be far more broad (“movies,” “music,” etcetera), Utility channels must find as specific a niche as possible. This allows your audience to flow across your videos more seamlessly, because your videos’ close relation to each other means they are likely to be more relevant to the viewer. 

These are the main differences between the Mass Upload strategy and utility channel strategy in regards to best practices. However, there’s a poison pill in the utility channel strategy. 

The pill is that viewership is based on the viewer’s interest in a topic, which makes it extremely difficult to build a sizable audience. This is true because it’s unlikely that the people you get to watch one of your videos will want to watch many (or any) of your other videos, because they’re only watching based on the value they can get out of the initial video they clicked on, or their interest in the topic of a specific video in the first place. 

The audience is not there for you the creator, the style of the content, or the format of the show or video. 

For example, look at these screen grabs from some of the top search results for “how to make chocolate chip cookies”:

Besides the different logos, these three channels are virtually indistinguishable from one another in terms of personality, style, and format. 

These are the three key areas where a utility channel can distinguish itself. Focusing on developing these three areas can help a utility channel move beyond just providing informational value, and allow it to provide “entertainment” value or “connection” value as well. 

“To connect with others” was the third main reason people came to YouTube in Google’s study. This connection specifically relates to how a personality, talent, or host connects directly to their audience. 

If there were a clear, repeatable science to creating this connection, there would be millions more full-time YouTubers, movie stars, and famous musicians. However, I don’t think this is some mysterious X-factor that you either have or you don’t. It can be honed and it can be leveled up through practice and training. 

A common trope in our industry is “be authentic.” Hell, I’ve said it myself when advising creators at Little Monster. But I’ve never heard someone actually describe what that means in practical terms or at least define it in a way that is universally applicable. 

My advice would be to study the creators on YouTube that you or your audience connect with. What do they do that makes you want to spend time with them? Find what that is, and what that is about you or your talent, and double down on that. Take improv classes. Practice making videos. I’m not saying you’ll be the next PewDiePie, Zendaya, or Cardi B, but improving your skills here can increase your chances of finding success on YouTube.

As it pertains to style, that’s simple and straightforward. Do people watch your videos because they really love the unique style of video you’re creating? Is there something you can do to make your style just a smidge or two better than those in your vertical? If so, this can make you and your content stand out amongst the sea of competitors.

The lovely thing about style is that it’s far less subjective than connection, there are far more resources to make yourself better at the actual art of video creation, and it doesn’t take a lot to really stand out. 

When speaking to Utility channels, the aspect of content creation that Little Monster focuses on most is format. I’ve written extensively about how a channel can distinguish itself by way of format in my Taxonomy article here on Tubefilter. I encourage you to read that article with an open mind, and to keep in mind that what I am proposing in that article is a methodology about how to create a new format, not just ape what is already being done successfully. 

What we are essentially advocating for here is moving a channel from having a Utility Channel approach (Mass Upload) to a Little Monster Method strategy. The reason for this is that by and large, most media companies and independent creators (who, by the way, are media companies whether they recognize it or not) will not be able to compete against the Mass Upload strategy. If you can out-upload the Mass Uploaders, by all means, give it a shot. But you’re more likely to bankrupt yourself or burn out than overtake a channel that already has thousands of videos and is adding more daily.

The vast majority of this type of channel will have to win not through quantity but through quality. Finding your footing in personality/talent, style, or format is the surest way we at Little Monster have found to do that. 

STRATEGY #3: The Home Run Game

Channels that play The Home Run Game upload many different types of videos until one really takes off (home run!). Once you’ve hit a home run, you double down on whatever worked, the “value proposition” of that particular video, be it format, style, personality, topic, or some combination of the four.

We see this often in channels that have been around for a long time before suddenly exploding with growth. A few top examples of this are channels like Cocomelon, PewDiePie, and MrBeast. MrBeast uploaded well over 100 videos and was active on YouTube for six years before he reached 1,000 subscribers.

3.1. Playing The Home Run Game

This strategy can work in a few different scenarios: 

  • You have an understanding of your channel’s core value proposition but can’t really get traction in your niche or vertical. In this scenario, it makes lots of sense to explore different styles, formats, and talent or performance styles.

For example, let’s say you’ve built a decent-sized audience in the home improvement vertical. You know your audience generally wants and expects you to talk about interior design. Your typical video shows you doing some sort of DIY project in your home. 

Instead of doing the same thing in each successive upload, take some swings at the fences. Try reacting to various clichéd themes, critiquing celebrity homes, remaking things in a celebrity’s home, going handheld instead of on a tripod, and so on.

  • You have a large competitive advantage, like a brand. If you’re a media brand, you likely have a library of content. Is there something you can do with this library that others are not? A different format or style of video? If you’re a non-media brand, can you leverage your access to events, celebrities, or products to get an advantage over people in this space? If so, take these swings.

Note that if you do hit a home run here, you’re going to want to move everything else you are creating to another channel or spin this type of content out into a separate channel. YouTube channels are designed for one core value proposition. Two competing shows or types of value propositions on the same channel will hurt both. 

  • You have a new channel and therefore have room to experiment with no consequences. For people starting out on YouTube this is the perfect time to try things that are outside the norm. Every minute, 20+ DAYS worth of content is uploaded to YouTube. Do something unique to stand out in this content tsunami.

In the long run, the Home Run Game strategy should be considered a means to an end, with the end being a solidified, clear value proposition. If a channel always swings for the fences with wildly different programming, the audience will have no idea what to expect when YouTube serves them one of its videos. This will alienate both the audience and the recommendation systems. As a mentor once told me, you want to be dependable, but not predictable.

The home run game is not moneyball. The home run game is trying to become the Yankees on a Norwich Sea Unicorns budget. 

STRATEGY #4: The Little Monster Method

Moneyball, the 2003 book and 2011 film starring Brad Pitt and Jonah Hill, chronicles how Billy Beane took the Oakland A’s to the playoffs in 2002 and 2003, with one of the lowest team salaries in the majors. 

Essentially, Beane built a team by leveraging data to select and utilize players who were statistically highly likely to lead to wins through their combined impact, without relying on individual superstars. 

The Little Monster Method is moneyball for YouTube channels. 

We use data to create and distribute videos for our clients that are highly likely to succeed with our client’s audience and thereby the YouTube recommendation system. Like Beane, our clients’ videos win because they are analytically designed to do so.

Our method is based on the premise that a YouTube channel should serve one audience a single core value proposition. We use the data in YouTube analytics to understand exactly what that audience is most interested in, and then super-serve them what they want. 

This is a 360-degree approach to content creation, analyzing and optimizing the: 

  • Value proposition of the channel

  • Format of the content

  • Style of the content

  • Talent on screen

  • Topics of videos

  • Length of the content

  • Thumbnail design

  • Title structure

The two most important elements in this strategy are the Value Proposition and the format of both the channel and videos. 

4.1. Value Proposition

A value proposition is essentially the answer to the question “Why does someone watch your videos/channel?” It can be a concept (“they will be inspired”), format (it will be a reaction video), style (a Wes Anderson film), personality (a specific person will be in the video), or a topic (videos will always be about surfing).

The value proposition doesn’t need to be universally true for every viewer, but it does need to be universally true for every video. For example, if we asked MrBeast’s fans “Why do you watch MrBeast?” you would get a lot of different answers. However, from MrBeast’s point of view, every video is a larger-than-life spectacle (therefore, that’s the value proposition).

4.2. Format

For MrBeast, the other elements (format, style), while important, are not beholden to this same singularity. For example, he can do a Challenge video format in one video, and in the next a reaction/listicle video format. 

In the video below, MrBeast is utilizing a very classic Challenge video format, where three teams are competing to win a prize. Here, it’s a house:

https://www.youtube.com/watch?v=f0c7pSCoZqE

Utilizing a completely different format (reaction/listicle) MrBeast made a video where he tips waiters and waitresses with real gold bars: 

https://www.youtube.com/watch?v=Rmf6T_Ewt38

As you can see these videos are completely different formats, but still deliver on the larger-than-life spectacle value proposition. 

For MrBeast, format is less of a core necessity than for others. This is actually true for the vast majority of “influencers.” Quite often, once someone achieves influencer status, the core value proposition has become the creator.

MKBHD is a great example of a channel that has made personality its core value proposition. MKBHD’s core value proposition is that you get to hang out with him. He’s cool, transparent, humble, engaged, excited, and unflinchingly himself. He also has some of the slickest and most stylish “tech review” videos on the platform. Unlike other tech review channels, his content doesn’t adhere to a singular format at this point, because it doesn’t need to.

However, format as a core value proposition is key for channels like Complex’s First We Feast, MTV’s Wild ‘N Out, and Binging with Babish.

First We Feast has two primary formats (in this case, named shows)–Hot Ones and The Burger Show. By and large, Hot Ones outperforms The Burger Show.

Recent videos from Complex’s First We Feast Channel

On MTV’s Wild ‘N Out, the rap battles and highlights from the show dominate the other types of content they’ve tried: 

Video thumbnails from MTV’s Wild ‘N Out


And for Binging with Babish, his main show beats out the secondary show Stump Sohla regularly. 

Recent videos from Babish Culinary Universe

The reason these secondary shows all underperform (and there are countless examples) is that they do not fully match the core Value Proposition of the channel, as these are channels that are more reliant on a viewer’s interest in the format. 

You may be asking, why does this matter? What’s the harm in putting multiple value propositions or multiple formats into one channel? 

The problem lies in how YouTube’s recommendation systems are built and how audiences behave. To make recommendations, the YouTube recommendation system uses: 

  • A viewer’s past viewing history (clicks, duration of view)

  • A viewer’s past engagement history (likes/comments/surveys)

  • Contextual information (device, location, time of day, etc.)

  • Collaborative filtering (How other similar viewers have engaged with a video)

  • Video stats (clickthrough rate, average view duration, user feedback)

This system is designed to treat each individual user independently, creating a YouTube that is customized specifically for them. 

If a viewer has historically clicked on your videos regularly, watched them, and given positive feedback (implicitly, like spending a long time watching and then watching a bunch more of your videos, or explicitly, like filling out a survey about your video positively), YouTube will continue to serve them your videos. More importantly, YouTube will use this data and suggest your videos to similar viewers.

Now, let’s say you switch up your formats or introduce a new show concept and YouTube feeds the new format or concept to your audience…and they don’t like it. They don’t watch for very long, click “like,” or leave comments…You’ve changed the value proposition, and YouTube is going to ding you for it. 

This is because YouTube is incapable of distinguishing between two different series on a YouTube channel. The recommendation system is built to simply understand if a viewer clicked, watched, and enjoyed your videos the last few times they were recommended to that viewer. 

As we see with The Burger Show, Quarantine Workout, and Stump Sohla, audiences haven’t responded with the same enthusiasm, so YouTube serves these new concepts/formats to fewer people in their audience.

But it gets worse. Because YouTube is incapable of distinguishing between different shows on your channel, it becomes less likely to serve your audience ANY of your videos in the future, including the type of video the audience initially liked. The introduction of a new value proposition is a big gamble–and from Little Monster’s point of view, a bad bet.

There are billions of videos on YouTube and 500 hours’ worth of content is uploaded every 60 seconds. Why should YouTube waste its time and impressions trying to decipher which specific videos from a channel a particular viewer might be interested in when they could just show them any of the millions of other videos with better metrics from channels the viewer has a greater probability of enjoying? 

In other words, YouTube is not going to subsidize your bad bet. Their value proposition to the audience is that there is nothing in the world as entertaining as YouTube. If your new series doesn’t support that idea as well as your previous series did, YouTube has to shield people from it in order to live up to their value proposition.

In addition, the impressions that could have been going to your high-performing videos have been going to these new concept/format, underperforming videos–a double whammy for you and your content.

All of this causes your viewership to bleed over time, and eventually leads to “channel decay,” possibly to the point of no return.

This is why channels that make the mistake of having multiple value propositions decline in viewership. 

Don’t just take my word for it–here’s a graph of Babish Culinary Universe’s viewership over the last 12 months. The second show, Stump Sohla, first appeared on the channel in September of 2020:

Babish Culinary Universe monthly YouTube views from May 2020 to May 2021

Beyond value proposition and format, the other elements of a programming strategy all adhere to the same principles in the Little Monster Method. 

4.3. Talent/Personality: Are we seeing high CTRs but low relative average view duration? If this is married to other data (like negative comments about the main person onscreen) this might be an indication of a needed shift in performance or talent. If there are multiple different talent onscreen, is there one that gets more positive comments? Are there spikes (good) or click-offs (bad) when a specific talent or character is onscreen? 

4.4. Style: What style of video does your audience most like to watch–handheld multi-camera with quick cuts, or a locked-down single camera and slower pacing? Here we’re looking at the difference in average view duration and average percentage viewed, and similar deep metrics.

4.5. Topics: Are there certain topics within your niche that an audience is most interested in? Are there topics they are clearly not interested in? We find out by labeling every video, averaging the viewership across those different topics, and then doubling down on the topics that generate the most interest. Ideally a channel is not using topics as a primary value proposition. When topics are your primary value proposition, your viewership is tied to things outside your control. For example, look at all of the Five Nights at Freddy’s channels that don’t even upload anymore because interest in the IP has plummeted.
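The topic exercise described above can be sketched in a few lines of Python. This is only an illustration: the topic labels and view counts are invented sample data, and the function name is hypothetical. In practice the rows would come from an export of your own channel analytics.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample data: (topic label, views). Not real channel figures.
videos = [
    ("kitchen remodel", 42_000),
    ("kitchen remodel", 38_000),
    ("tool review", 9_000),
    ("tool review", 11_000),
    ("celebrity homes", 75_000),
]


def average_views_by_topic(rows):
    """Group videos by topic label and rank topics by average views."""
    by_topic = defaultdict(list)
    for topic, views in rows:
        by_topic[topic].append(views)
    averages = {topic: mean(views) for topic, views in by_topic.items()}
    # Highest average first: these are the topics to double down on.
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


ranking = average_views_by_topic(videos)
for topic, avg in ranking:
    print(f"{topic}: {avg:,.0f} average views")
```

A topic with only one or two videos behind it can rank high on a fluke, so it is worth checking how many uploads sit behind each average before acting on it.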

4.6. Tactical Analytics: When we look at more tactical items in The Little Monster Method–titles, thumbnails, video length–our goal is to make the video the most “clickable,” without being misleading. Here we’re going to be analyzing things like the velocity of the video. Essentially, is the video getting more viewership in its first one hour, two hours, or ten hours than our other videos? What is it about the programming choice, thumbnail, and title that is causing this higher relative clickthrough rate? How many faces should we put in a thumbnail (if any)? Should we use text? Are we going for a clear value proposition as to what the video is about, creating an information gap, or teasing the viewer with our thumbnail and title combination? Do videos that fall between 17 minutes and 23 minutes on average generate the best average view duration for more viewership in the long run or is there a different sweet spot in length? 
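As a rough illustration of the velocity question, the sketch below compares a new upload’s cumulative views at one, two, and ten hours against a baseline built from earlier uploads. All numbers are invented, the names are hypothetical, and using the median as the baseline is my assumption for this example, not a description of any specific tool.

```python
from statistics import median

# Hypothetical cumulative views for five earlier uploads at each checkpoint (hours).
baseline_views = {
    1: [1_200, 900, 1_500, 1_100, 1_300],
    2: [2_500, 2_000, 3_100, 2_400, 2_600],
    10: [8_000, 6_500, 9_500, 7_800, 8_200],
}

# Hypothetical cumulative views for the new upload at the same checkpoints.
new_video_views = {1: 1_800, 2: 3_000, 10: 8_000}


def relative_velocity(new_views, baseline):
    """Ratio of the new video's views to the median earlier upload, per checkpoint."""
    return {
        hours: new_views[hours] / median(baseline[hours])
        for hours in sorted(new_views)
    }


ratios = relative_velocity(new_video_views, baseline_views)
for hours, ratio in ratios.items():
    print(f"First {hours}h: {ratio:.2f}x the typical upload")
```

In this made-up case the video starts 50% ahead of a typical upload and falls back to average by hour ten, which would point toward a strong thumbnail and title but ordinary staying power.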

4.7. Timeline & Evolution: We break down content and best practices for a channel over the long run by looking at what worked at the 30-, 60-, 90-, and 365-day mark. Combining analytics over the short- and long-term and teasing out the often subtle differences helps Little Monster create maximum impact and reach quickly and over the lifespan of a video. 

At Little Monster, we’re not looking for “viral” videos or home runs. We’re looking for videos that are more likely to get on base than they are to strike out. We’re using all of the data both on-platform and off-platform to make extremely well-informed decisions as to what combination of factors in strategy and tactics will lead to audience growth. We’re playing moneyball, and we’re winning. You should too.

The Taxonomy of YouTube Videos

taxonomy-of-youtube-videos-original-content-that-works-2.jpg

“Everything is a remix” – Kirby Ferguson (and probably a bunch of other people, too.)

I’ve produced a lot of content for YouTube since graduating college in 2007. Driver Digital was the first company I worked for, where I produced a few thousand videos for moms and kids. When I worked at Frederator Networks, as the VP of Audience Development, I oversaw all non-animation production and programming for our YouTube channels. At Little Monster, my YouTube production and consulting agency, we’ve produced hundreds of videos for clients.

Until about two years ago, when I began to develop The Taxonomy of Digital Video, I’d find myself in a room with an entire team of 20 or more people all asking, “What should we make?” It looked a lot like this:

Live feed of conference rooms where execs are trying to figure out what YouTube videos to make next.

My teams and I would sit around the conference table pitching ideas on different shows we could make and most of the time it was relatively fruitless. People would either pitch stuff that had been done to death, slight variations on what we were already doing, or stuff that might do okay on TV, but would never have a chance of success in digital. Occasionally a bolt of lightning would strike – a la 107 Facts on Frederator’s Leaderboard or Cartoon Conspiracy on Frederator’s main channel – and we’d create a show that generated millions of views.

After enough of these meetings – and admittedly a few shows that did not generate millions of views – I realized I needed a framework with which to understand YouTube content. The problem my teams and I had wasn’t that our ideas were bad, it was that we didn’t have a box in which to develop content. There was no structure. No framework. We thoughtfully wandered into good ideas and that led to the hit-and-miss nature of what we produced.

This is when I began to think about content in the same way that I think about the algorithm: as something that can be analyzed structurally.

I looked on blogs and in bookstores for writings on digital video formats, structures, programming, anything(!), but I couldn’t find any substantive works and nothing at all about how to develop content using this knowledge. And of course I couldn’t find anything. Digital video is a relatively new medium still in its infancy – or at least early childhood.

The Taxonomy

So, I decided to do what I had done in the past with my writings here on Tubefilter — I’d just make the thing I wanted to read my own damn self. What I’ve developed is The Taxonomy of Digital Video.

The Taxonomy is a structure. It’s a way of understanding YouTube content that boils mysterious “X factors” down into easily perceivable, and repeatable, processes or Formats.

This will allow you to go to your creative teams, your companies, your businesses, your studios, etc. with an understanding and way of analyzing what content is currently doing well in your vertical, what’s missing from your vertical, and how the content you make can stand out, feel completely original, and generate millions of views.

Essentially, it’s a guide to developing unique content for YouTube.

Furthermore, I’ll show why the understanding of these core Formats is key to building a long term sustainable audience on YouTube.

[One quick disclaimer before we dive in: Every brand, creator, or show I mention in this presentation is mentioned because I am a big fan of what they’ve done. They are all incredibly talented and creative people and have succeeded for many reasons beyond what I mention here. I’m simply trying to show what lies beneath the surface and demystify a little of why their content is popular.]

Classifications

Let’s start with the more basic stuff first. When thinking about classifying a YouTube channel, series, or video or creating your own, we typically start at the “Vertical” category – as in what is the general area of interest (automotive, beauty, etc.). That’s followed by Format, Style, Length, Personality, and then Topic.

With that in mind, I believe the classification model of a YouTube video might look like this:

PIC4.jpg

Don’t get me wrong, this is not a Matter of Importance chart (e.g. Vertical is no more important than Style). It’s just a systematic way of classifying and developing content. Personally, I think the personalities and characters are the most important thing to any media brand, show, or video. But they’re in this Classification Model because different archetypes or personalities are better suited to, and span, different types of content.

Most of these elements are pretty self-explanatory. However, the category of Formats is where many YouTube series and popular creators really distinguish themselves and make content that feels fresh and unique.

The 8 Formats

First, let’s establish how we determined these Formats. We strip away all of the stylistic elements of a video, and ask what the shared or primary structural characteristics of each video are. For example, the primary structural characteristic of a Listicle video is a list of things. Essentially we can classify videos in the same way that we classify plants and animals based on their shared primary characteristics.

PIC-3-1.jpg

These Formats are the Listicle, Explainer, Commentary, Interview, Music Video, Challenge, Reaction, and Narrative. These eight formats comprise the vast majority and potentially all of the popular formats on YouTube.

You may be thinking things along the lines of “vlogging / let’s plays / beauty tutorials aren’t a format?!” and you wouldn’t be wrong for thinking that.

Let’s think about an example though. Is Lilly Singh a vlogger? Are people who do “trying things” videos also vlogging as they’re often talking directly to the camera and giving commentary on something just like a vlog? I think the answer to both of these questions is no. Vlogging is a combination of the commentary format and the direct to camera style. Trying videos are typically a challenge or reaction video– or a combination of both.

Let’s start with an easy one. Everyone should be familiar with the Listicle format. We’ve all read listicles whether it’s on Buzzfeed, Cracked, or any of the thousands of other sites that pump them out. This format is as old as time– ever read the Ten Commandments? That’s just a listicle.

PIC2.png

The Listicle format is familiar to all of us and that’s one of the reasons why this format works so well on YouTube. The theory basically goes that if an audience understands what they’re watching from a structure standpoint, they are more likely to enjoy and continue watching that content.

Essentially, if your content meets viewers’ expectations in format, it will be far more likely to be “sticky” and hold viewers for extended periods of time.

For example, if a film bills itself as an action movie, you know that the format will basically be: We’ll start with seeing the hero in their everyday life, an inciting incident will set them out on their “hero journey”, they’ll have to overcome some adversities, and then they’ll take on some bad dudes and ultimately win or not.

If a film bills itself as an action movie and instead you get a romantic drama you would likely walk out of the theater pretty quickly.

Similarly, if an audience clicks on a YouTube video expecting a Listicle, and it’s a basic makeup tutorial, they’re probably going to click away pretty quickly. Some great examples of Listicle videos can be found on the channels WatchMojo, Dark 5, and Matthew Santoro.

However, some content on YouTube disguises the Listicle component. For example, what if I told you Cinema Sins, with its 8.2 million subscribers and over 2.6 billion video views, is just a Listicle? Here, take a look:

They are literally just listing the “sins” of the movie, albeit quite humorously.

Expanding upon each format, these are their definitions and common components:

Listicle Video: A video that lists or ranks items.

Common types:

  • Top ### Video

  • Best ofs

  • Things you don’t know

  • Many compilation videos

Common Elements:

  • Ranking and providing commentary as to why

  • Usually only 1 – 2 minutes per list item

  • Reading off of Wikipedia

  • Playing with or against the audience’s expectations / knowledge

Common styles:

  • Direct to camera with over the shoulder images / videos

  • Cutaways to video

  • V.O. on top of images / videos

Primary Format Example:

Music Video: A video where a song or music plays, and it’s also the primary purpose of the video.

Common Types:

  • Official music video

  • Cover

  • Parody

  • Lyric Videos

Common Elements:

  • Telling a story

  • Dancing

  • Over-the-top costumes / situations

  • Party scenarios

  • Performance scenarios

Common Styles:

  • All

Primary Format Example:

Narrative: A video that depicts fiction or fictionalized events.

Common Types:

  • Clips from film / tv

  • Parodies / Sketch comedy

  • Dress Up Play

  • Web series

Common Elements:

  • Characters / Props / Sets

  • Story arc

  • One-off videos

  • Humor

Common Styles:

  • All

Primary Format Example:

Interview: A video where questions are asked of a subject or interviewee.

Common Types:

  • 1 on 1 interview

  • Answering pre-written questions

  • Q&A with fans

Common Elements:

  • Questions

  • Interviewer / Interviewee

  • Slower paced

Common Styles:

  • Multiple camera

  • Away from camera

  • Live stream

Primary Format Example:

Explainer: A video that explains or teaches a topic, or in some instances answers a question.

Common Types:

  • How-to videos

  • Educational videos

  • Science experiments

Common Elements:

  • The video poses a question to the audience it then answers

  • School subjects

  • Simplifying complex ideas

Common Styles:

  • Direct to camera

  • Green screen

  • Direct to camera with over the shoulder images/videos

  • Cutaways to video

  • V.O. on top of images / videos

  • Overhead of hands

Primary Format Example:

Challenge Video*: A video where one or more subjects are challenged to perform a task, be it a physical task, a mental task, or a competition between two or more people.

Common Types:

  • Try not to laugh

  • What’s in the box

  • Eating “gross” things

  • Debate

  • I tried XYZ

Common Elements:

  • Victory / Loss conditions

  • Sports

  • Gross-out factor

  • Things normal people don’t do

  • Things a category of person doesn’t normally do

Common Styles:

  • Direct to camera

  • Away from camera / to another person

  • Single camera

  • Multi-camera

Primary Format Example:

Reaction*: A video where the primary purpose is to show reactions to an event.

Common Types:

  • React videos

  • Pranks

  • Trying XYZ

  • Magic

  • Fail videos

  • What’s in the box

  • Unboxings

Common Elements:

  • Multiple people

  • Multiple things being reacted to

  • Showing onlookers

  • Shock / Gross-out factor

  • Table top

  • Pain

  • Humor

Common Styles:

  • Single camera

  • Direct to camera

  • To interviewer

  • Handheld camera

Primary Format Example:

*Most Challenge and Reaction videos these days are hybrids taking elements from both formats.

Commentary: A video that comments or provides opinion on a topic.

Common Types:

  • Vlogs

  • Gameplay / Let’s play

  • Unboxings

  • Reviews

  • Conspiracy Theories

  • Analysis/Commentary of tv / film / sports / books / etc.

Common Elements:

  • Single person

  • Scripting

  • Sitting in a room in a house

  • Opinion

Common Styles:

  • Single camera

  • Direct to camera

  • Overhead camera

  • V.O. on top of images / video

Primary Format Example:

Hybrid Formats

If we only follow this model, we’ll just be making the same thing that thousands (millions?) of other people have already made.

To make something that at least feels fresh and unique, we have to create Hybrid Formats.

This is what some of the biggest channels on YouTube have done. They’ve created shows or hybrid formats that feel unique, new, or original and audiences have rewarded them for it. You can apply this same exact formula, the mixing and matching of format elements, on your channels and at your companies.

These are some of the best examples of Hybrid Formats, from extremely popular YouTube channels:

Commentary / Narrative Hybrid

A great example of someone creating a Hybrid Format is Lilly Singh, aka Superwoman. Lilly has one of the largest followings on YouTube and every video she posts does millions upon millions of views. There’s no doubt that she’s incredibly talented and funny. But I’d argue that her true genius, or at least the spark that set her career ablaze, is in the unique Format concept she developed (or at least she was the first to really succeed with it).

From a strictly Format perspective, all she’s done is take the Commentary format in its primary style (direct to camera) and added parts of the Narrative format, specifically sketch comedy, through the characters she portrays and the sketches.

So, when we boil it down to its base elements we see that while this feels incredibly unique upon a quick view, it’s really just two incredibly popular Formats woven together.

Listicle / Explainer Hybrid

Here’s an example of the blending of an Explainer video and a Listicle from 5 Minute Crafts, one of the most viewed channels in 2018.

Challenge / Reaction Hybrid

Rhett & Link make a lot of content in a lot of different Formats. One of their most popular Formats is a Challenge and Reaction hybrid.

Music Video / Challenge / Reaction Hybrid

One area where we’ve seen little innovation is in music videos. If you go back a few years there was a group called CDZA, which did some really amazing work. In this video they combine the Music Video with a Challenge and Reaction video.

Narrative / Multiple-Formats Hybrid

One of my favorite examples right now is Miranda Sings. Miranda also plays with a lot of different Formats such as the Explainer format, the Challenge format and so on, but always mixing in an element of the Narrative format in the form of sketch comedy through her character.

The examples above are great at showing how various creators and media brands have taken standard base Formats, and added elements from other formats to make a Hybrid Format. These hybrid formats have helped them stand out significantly on YouTube. It makes their content feel fresh and unique, and can help drive more audience.

Question: Why Do Hybrids Matter?

Beyond not wanting to make content that feels stale before it’s even uploaded, Hybrid Formats can drive huge and sustainable audiences. Let’s take one of the worst performing base Formats in YouTube history, the Interview, as an example.

Many people have tried to make an Interview show on YouTube successful, some pouring millions of dollars into it, some featuring huge celebrities, and some not. But regardless of the various components and budgets, 99 times out of 100 they’ve failed.

However, if we look at what Complex did with Hot Ones and what Condé Nast has done across multiple channels and multiple shows, it’s truly phenomenal. They’ve managed to take one of the oldest and worst performing formats on YouTube – the Interview – and make incredibly successful shows.

First, let’s talk about Hot Ones. Hot Ones is a show on the First We Feast channel. The basic concept is a standard Interview: the interviewer asks questions of the interviewee. However, the brilliance and success of this show lies in the slight adjustment they made to the Format. They added elements of the Challenge format.

During each interview, the guests eat hotter and hotter hot wings, until they get to the hottest one. That part is essentially just a Challenge video.

So two Formats – Interview and Challenge – married together make this successful show.

Let’s put a pin in that for now and come back to it in one second.

Next, let’s look at 73 Questions, which marries the Listicle with the Interview. This may have been enough to make this show incredibly successful, but they went three steps beyond.

First, they change the style of the standard interview from two people talking to each other with a static camera or multiple cameras, and instead have the subject speak directly into the camera – which we know is the most successful style on YouTube.

Second, in each video they go on a house or office tour, where the viewer gets to see where the subject lives or works. This is essentially the Format of a classic Commentary video: the “room / dorm / house tour.” Here’s what it looks like when it’s all put together:

Answer: The Algorithm and View Velocity

Other than the fact that these shows married two Formats to make something stale feel very new, what do these shows have in common?

Both of these shows give the AUDIENCE a reason to watch that has absolutely nothing to do with the TOPIC. This is incredibly important for long term sustainable growth because it massively contributes to View Velocity.

View Velocity – the rate at which a NEW UPLOAD gains viewership – is incredibly important for how many views that video will ultimately get, as illustrated in my previous research “Reverse Engineering the YouTube Algorithm” and “Cracking YouTube.” View Velocity is essentially a product of how many impressions your Title and Thumbnail get, the Click Through Rate on those impressions, and how quickly that happens. The greater your View Velocity, the greater the chance you have of YouTube’s algorithm putting your video in front of a broader audience (by way of appearing in YouTube’s Suggested and Recommended Videos sections, search results, and more).
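As a rough back-of-the-envelope version of that definition, the sketch below treats View Velocity as impressions times clickthrough rate, divided by the hours since upload. This is only an illustration of the relationship described above, not a formula YouTube publishes, and the function name and numbers are assumptions.

```python
def view_velocity(impressions, click_through_rate, hours_since_upload):
    """Views gained per hour: impressions x CTR, divided by elapsed hours."""
    if hours_since_upload <= 0:
        raise ValueError("hours_since_upload must be positive")
    views = impressions * click_through_rate
    return views / hours_since_upload


# Hypothetical numbers: same impressions and time window, different clickthrough rates.
strong = view_velocity(impressions=50_000, click_through_rate=0.08, hours_since_upload=2)
weak = view_velocity(impressions=50_000, click_through_rate=0.02, hours_since_upload=2)
```

With identical impressions, the title and thumbnail that earn an 8% clickthrough rate generate four times the velocity of the pair that earns 2%.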

PiC-5.jpg

So, if View Velocity is an essential component for the success of my channel, the next rational question is, “How do I get the most View Velocity?” Well, the real questions you’re asking are, “What makes someone make the choice to click and watch a video? What are the reasons?”

Well, I think our Taxonomy explains the possible reasons. They either:

  • Like the Format and/or Style

  • Are interested in the topic

  • Like the talent

  • or a combination of the above.

(I’m using “like” here to mean the viewer gets their desired emotion from watching, be it happiness, sadness, anger, etc. This doesn’t mean the user clicks a heart on the website.)

So let’s go back to our Hybrid Interview Formats. Again, both of these shows give the AUDIENCE a reason to watch that has absolutely nothing to do with the TOPIC.

Not interested in Bill Burr? Well you can still enjoy the video to see what happens when he eats that super spicy hot wing. Couldn’t care less about Kendall Jenner? Well you can still enjoy seeing how a multi-millionaire lives.

The effect of having fans of your show or channel as a whole is incredibly powerful for View Velocity. You’re no longer topic- or talent-dependent for views. You have real fans that will watch every episode, not just the episodes/videos they’re interested in at a topical level.

For example, if you have a channel that talks about a large number of different topics, and there isn’t an underlying talent or format reason for the audience to watch, you will have a segmented audience (like the one on the left in the image below). In this scenario, your channel will have significant trouble growing and may eventually enter a YouTube death spiral because it will not generate enough View Velocity on any individual video to let YouTube’s algorithm know the video should be shown to a wide audience.

Conversely, if you have fans of a Format (or talent), they’ll watch just about anything you upload (like the circle on the right, which generates far more View Velocity).

PIC6.jpg

In closing, this is the best piece of Algorithm or audience development advice I can give you: Make a hybrid-format, in a style endemic to the platform, with good talent. If you do that, you’ll be way ahead of the vast majority of YouTube channels.

Cracking YouTube In 2017

cracking-youtube-in-2017.jpg

[This research was initially published on Tubefilter.com in June 2017]

YouTube’s promotional algorithms have changed drastically over the years. In the beginning, the algorithm was largely reliant on metrics that were easily manipulated, like views, clicks, likes, and comments. And the goal was primarily to drive more views.

In 2013 that changed in a big way. YouTube shifted the primary goal of its algorithm to reward “Watch Time”, or time spent on the YouTube platform.

In a previous study here on Tubefilter, we discussed what metrics YouTube considered on the publisher side to calculate Watch Time. We then wrote a follow up article last year in June, “Reverse Engineering The YouTube Algorithm (Part I),” which shared a number of insights we found in determining the metrics that drive Watch Time. Then, in the fall of 2016, Google released a white paper, “Deep Neural Networks for YouTube Recommendations,” which led to Reverse Engineering The YouTube Algorithm (Part II), further shedding light on what powers the YouTube algorithm.

Now, we’re thrilled to share with you our latest research into the YouTube promotional algorithms in our presentation, “Cracking YouTube in 2017.”

Some of the big highlights from the presentation include:

  • Videos that are between 7 and 16 minutes perform up to 50% better than videos that are shorter or longer.

  • Videos with an average view duration of 5 – 8 minutes receive the most views.

  • There is no correlation between views and length of title, number of tags, or length of description.

  • There is a strong correlation between number of tags and number of creator suggested videos in that creator’s suggested video column.

But that’s not all! You can check out the entire presentation, replete with interesting data points and tangible takeaways, right here:

Reverse Engineering The YouTube Algorithm (Part 2)

youtube-algorith-reverse-engineering.jpg

[Editor’s Note: You can read Reverse Engineering the YouTube Algorithm: Part I right here. You don’t need to read it before reading Part II, but you should check it out at some point. It’s excellent.]

[This paper was originally published on Tubefilter.com in February of 2017]

A team of Google researchers presented a paper in Boston, Massachusetts on September 18, 2016 titled Deep Neural Networks for YouTube Recommendations at the 10th annual Association for Computing Machinery conference on Recommender Systems (or, as the cool kids would call it, the ACM’s RecSys ‘16).

This paper was written by Paul Covington (currently a Senior Software Engineer at Google), Jay Adams (currently a Software Engineer at Google), and Emre Sargin (currently a Senior Software Engineer at Google) to show other engineers how YouTube uses Deep Neural Networks for Machine Learning. It gets into some pretty technical, high-level stuff, but what this paper ultimately illustrates is how the entire YouTube recommendation algorithm works(!!!). It gives a careful and prudent reader insight into how YouTube’s Browse, Suggested Videos, and Recommended Videos features actually function.

An Engineering Paper On The YouTube Algorithm For Dummies

While it was not necessarily the intent of the authors, it is our belief the Deep Neural paper can be read and interpreted by and for YouTube video publishers. The below is how we (and when I say we, I mean me and my team at my shiny new company Little Monster Media Co.) interpret this paper as a video publisher.

In a previous post I co-wrote here on Tubefilter, Reverse Engineering The YouTube Algorithm, we focused on the primary driver of the algorithm, Watch Time. We looked at the data from our videos on our channel to try to gain insight into how the YouTube algorithm worked. One of the limiting factors to this approach, however, is that it’s coming from a video publisher’s point of view. In an attempt to gain some insight into the YouTube algorithm we asked ourselves and then answered the question, “Why are our videos successful?” We were doing our best with the information we had, but our initial premise wasn’t ideal. And while I stand by our findings 100%, the problem with our previous approach is primarily twofold:

  1. Looking at an individual set of channel metrics means there’s a massive blind spot in our data, as we don’t have access to competitive metrics, session metrics, and clickthrough rates.

  2. The YouTube algorithm gives very little weight to video publisher-based metrics. It’s far more concerned with audience and individual-video-based metrics. Or, in layman’s terms, the algorithm doesn’t really care about the videos you’re posting, but it cares a LOT about the videos you (and everyone else) are watching.

But at the time we wrote our original paper, there had been nothing released from YouTube or Google in years that would shed any light onto the algorithm in a meaningful way. Again, we did what we could with what we had. Fortunately for us though, the paper recently released by Google gives us a glimpse into exactly how the algorithm works and some of its most important metrics. Hopefully this begins to allow us to answer the more pertinent question, “Why are videos successful?”

Staring Into The Deep Learning Abyss

The big takeaway from the paper’s introduction is that YouTube is using Deep Learning to power its algorithm. This isn’t exactly news, but it’s a confirmation of what many have believed for some time. The authors make the reveal in their intro:

In this paper we will focus on the immense impact deep learning has recently had on the YouTube video recommendations system…. In conjunction with other product areas across Google, YouTube has undergone a fundamental paradigm shift towards using deep learning as a general-purpose solution for nearly all learning problems.

What this means is that, with increasing likelihood, there are going to be no humans actually making algorithmic tweaks, measuring those tweaks, and then implementing those tweaks across the world’s largest video sharing site. The algorithm is ingesting data in real time, ranking videos, and then providing recommendations based on those rankings. So, when YouTube claims they can’t really say why the algorithm does what it does, they probably mean that very literally.

The Two Neural Networks 

The paper begins by laying out the basic structure of the algorithm. This is the authors’ first illustration:

youtube-algorithm-structure.jpg

Essentially there are two large filters, with varying inputs. The authors write:

The system is comprised of two neural networks: one for candidate generation and one for ranking.

These two filters and their inputs essentially decide every video a viewer sees in YouTube’s Suggested Videos, Recommended Videos, and Browse features.

The first filter is Candidate Generation. The paper states this is determined by “the user’s YouTube activity history,” which can be read as the user’s Watch History and Watch Time. Candidate Generation is also determined by what other similar viewers have watched, which the authors refer to as Collaborative Filtering. This algorithm decides who’s a similar viewer through “coarse features such as IDs of video watches, search query tokens, and demographics”.

To boil this down, in order for a video to be one of the “hundreds” of videos that make it through the first filter of Candidate Generation, that video must be relevant to the user’s Watch History and it must also be a video that similar viewers have watched.
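To illustrate the idea of “videos that similar viewers have watched,” here is a deliberately simple sketch. It is not the neural network the paper describes. It swaps in a basic overlap score (Jaccard similarity) between watch histories, and the viewer names, video IDs, and similarity threshold are all hypothetical.

```python
def jaccard(a, b):
    """Overlap between two sets of watched video IDs (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def generate_candidates(user, watch_histories, min_similarity=0.3):
    """Collect videos watched by similar viewers that the user has not seen yet."""
    seen = watch_histories[user]
    candidates = set()
    for other, history in watch_histories.items():
        if other == user:
            continue
        if jaccard(seen, history) >= min_similarity:
            candidates |= history - seen
    return candidates


# Hypothetical viewers and video IDs.
watch_histories = {
    "josh": {"v1", "v2", "v3"},
    "ana": {"v2", "v3", "v4"},
    "lee": {"v8", "v9"},
}
candidates = generate_candidates("josh", watch_histories)
```

Here only "v4" survives: it was watched by a viewer whose history overlaps heavily with the user's, while the videos from the unrelated viewer never make the cut.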

The second filter is the Ranking filter. The paper goes into a lot of depth around the Ranking Filter and cites a few meaningful factors of which it’s composed. The Ranking filter, the authors write, ranks videos by:

…assigning a score to each video according to a desired objective function using a rich set of features describing the video and user. The highest scoring videos are presented to the user, ranked by their score.

Since Watch Time is the top objective of YouTube for viewers, we have to assume it’s the “desired objective function” referenced. Therefore, the score is based on how good a video, given the various user inputs, is going to be at generating Watch Time. But, unfortunately, it’s not quite that simple. The authors reveal there’s a lot more that goes into the algorithm’s calculus.

We typically use hundreds of features in our ranking models.

How the algorithm ranks videos is where the math gets really complex. The paper also isn’t explicit about the hundreds of factors considered in the ranking models, nor how those factors are weighted. It does, however, cite the three elements mentioned in the Candidate Generation filter (which are Watch History, Search History, and Demographic Information) and several others, including “freshness”:

Many hours worth of videos are uploaded each second to YouTube. Recommending this recently uploaded (“fresh”) content is extremely important for YouTube as a product. We consistently observe that users prefer fresh content, though not at the expense of relevance.

One interesting wrinkle the paper notes is that the algorithm isn’t necessarily influenced by the very last thing you watched (unless you have a very limited history). The authors write:

We “rollback” a user’s history by choosing a random watch and only input actions the user took before the held-out label watch.
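A small sketch of what that “rollback” could look like when building a single training example: pick a random watch as the held-out label and keep only the events that came before it as inputs. The function name and the list of video IDs are hypothetical, and the real system works with far richer event data than a simple list.

```python
import random


def rollback_example(watch_events, rng):
    """Pick a random held-out watch as the label; keep only earlier events as input."""
    if len(watch_events) < 2:
        raise ValueError("need at least two events to hold one out")
    # Start at index 1 so there is always at least one earlier event to use as input.
    label_index = rng.randrange(1, len(watch_events))
    return watch_events[:label_index], watch_events[label_index]


# Hypothetical watch history in chronological order.
history = ["v1", "v2", "v3", "v4", "v5"]
inputs, label = rollback_example(history, random.Random(0))
```

Whatever watch gets picked, nothing that happened after it is fed in as an input, so the model is always asked to predict a watch from what came before it.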

In a later section of the paper they discuss clickthrough rates (aka CTR) on video impressions (aka Video Thumbnails and Video Titles). It states:

For example, a user may watch a given video with high probability generally but is unlikely to click on the specific homepage impression due to the choice of thumbnail image….Our final ranking objective is constantly being tuned based on live A/B testing results but is generally a simple function of expected watch time per impression.

It’s not a surprise clickthrough rates are called out here. In order to generate Watch Time a video has to get someone to watch it in the first place, and the most surefire way to do that is with a great thumbnail and a great title. This gives credence to many creators’ claims that clickthrough rates are extremely important to a video’s ranking within the algorithm.
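One simplified way to read “expected watch time per impression” is as the chance an impression gets clicked multiplied by how long the viewer is expected to watch once they do. The sketch below ranks a few made-up candidates on that basis. The class, field names, and numbers are assumptions for illustration, and the paper's actual model is considerably more involved.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    video_id: str
    predicted_ctr: float           # assumed probability the impression is clicked
    expected_watch_minutes: float  # assumed minutes watched if it is clicked


def expected_watch_time_per_impression(candidate):
    """Simplified score: click probability times expected minutes watched."""
    return candidate.predicted_ctr * candidate.expected_watch_minutes


def rank(candidates):
    """Order candidates from highest to lowest expected watch time per impression."""
    return sorted(candidates, key=expected_watch_time_per_impression, reverse=True)


# Hypothetical candidates.
candidates = [
    Candidate("clickbait", predicted_ctr=0.20, expected_watch_minutes=0.5),
    Candidate("solid", predicted_ctr=0.08, expected_watch_minutes=6.0),
    Candidate("niche", predicted_ctr=0.03, expected_watch_minutes=9.0),
]
ranked_ids = [c.video_id for c in rank(candidates)]
```

In this toy ranking the video with the highest clickthrough rate finishes last, because almost nobody who clicks it keeps watching.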

YouTube knows that CTR can be exploited so they provide a counterbalance. This paper acknowledges this when it states the following:

Ranking by click-through rate often promotes deceptive videos that the user does not complete (“clickbait”) whereas watch time better captures engagement [13, 25].

While this might seem encouraging, the authors go on to write:

If a user was recently recommended a video but did not watch it then the model will naturally demote this impression on the next page load.

These statements support the idea that if viewers are not clicking a certain video, the algorithm will stop serving that video to similar viewers. There is evidence in this paper that this happens at the channel level as well. It states (with my added emphasis):

We observe that the most important signals are those that describe a user’s previous interaction with the item itself and other similar items… As an example, consider the user’s past history with the channel that uploaded the video being scored – how many videos has the user watched from this channel? When was the last time the user watched a video on this topic? These continuous features describing past user actions on related items are particularly powerful

In addition, the paper notes all YouTube watch sessions are considered when training the algorithm, including those that are not part of the algorithm’s recommendations:

Training examples are generated from all YouTube watches (even those embedded on other sites) rather than just watches on the recommendations we produce. Otherwise, it would be very difficult for new content to surface and the recommender would be overly biased towards exploitation. If users are discovering videos through means other than our recommendations, we want to be able to quickly propagate this discovery to others via collaborative filtering.

Ultimately though, it all comes back to Watch Time for the algorithm. As we saw at the beginning of the paper when it stated the algorithm is designed to meet a “desired objective function,” the authors conclude with “Our Goal is to predict expected watch time,” and “Our final ranking objective is constantly being tuned based on live A/B testing results but is generally a simple function of expected watch time per impression.”

This confirms, once again, that Watch Time is what all of the factors that go into the algorithm are designed to create and prolong. The algorithm is weighted to encourage the greatest amount of time on site and longer watch sessions.
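To make the contrast between clickthrough rate and watch time concrete, here is a minimal Python sketch. The video names and numbers are invented for illustration, and the formula (total watch minutes divided by impressions) is only one simple reading of “expected watch time per impression,” not YouTube’s actual ranking function.

```python
from dataclasses import dataclass


@dataclass
class VideoStats:
    title: str
    impressions: int      # times the thumbnail was shown
    clicks: int           # times it was clicked
    watch_minutes: float  # total minutes watched across all clicks

    @property
    def ctr(self) -> float:
        return self.clicks / self.impressions

    @property
    def watch_minutes_per_impression(self) -> float:
        return self.watch_minutes / self.impressions


# Hypothetical numbers, invented purely for illustration.
videos = [
    VideoStats("clickbait", impressions=1000, clicks=200, watch_minutes=100.0),
    VideoStats("solid", impressions=1000, clicks=80, watch_minutes=480.0),
]

by_ctr = sorted(videos, key=lambda v: v.ctr, reverse=True)
by_watch_time = sorted(
    videos, key=lambda v: v.watch_minutes_per_impression, reverse=True
)

print([v.title for v in by_ctr])         # "clickbait" ranks first on CTR
print([v.title for v in by_watch_time])  # "solid" ranks first on watch time
```

The same two videos swap places depending on which metric does the sorting, which is the counterbalance the paper describes.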

To Recap

That’s a lot to take in. Let’s quickly review.

  1. YouTube uses three primary viewer factors to choose which videos to promote. These inputs are Watch History, Search History, and Demographic Information.

  2. There are two filters a video must get through in order to be promoted by way of YouTube’s Browse, Suggested Videos, and Recommended Videos features:

    • Candidate Generation Filter

    • Ranking Filter

  3. The Ranking Filter uses the viewer inputs, as well as other factors such as “Freshness” and Clickthrough Rates.

  4. The promotional algorithm is designed to continually increase watch time on site by continually A/B testing videos and then feeding that data back into the neural networks, so that YouTube can promote videos that lead to longer viewing sessions.

Still Confused? Here’s An Example.

To help explain how this works, let’s look at an example of the system in action.

Josh really likes YouTube. He has a YouTube account and everything! He’s already logged into YouTube when he visits the site one day. And when he does, YouTube assigns three “tokens” to Josh’s YouTube browsing sessions. These three tokens are given to Josh behind the scenes. He doesn’t even know about them! They’re his Watch History, Search History, and Demographic Information.

Now is where the Candidate Generation filter comes into play. YouTube takes the value of those “tokens” and combines it with the Watch History of viewers who like to watch the same kind of stuff Josh likes to watch. What’s left over is hundreds of videos that Josh might be interested in viewing, filtered out from the millions and millions of videos on YouTube.

Next, these hundreds of videos are ranked based on their relevancy to Josh. The algorithm asks and answers the following questions in fractions of a second: How likely is it that Josh will watch the video? How likely is it the video will lead to Josh spending a lot of time on YouTube? How fresh is the video? How has Josh recently interacted with YouTube? Plus hundreds of other questions!

The top ranked videos are then served to Josh in YouTube’s Browse, Suggested Videos, and Recommended Videos features. And Josh’s decision on what to watch (and what not to watch) is sent back into the Neural Network so the algorithm can use that data for future viewers. Videos that get clicked, and keep the user watching for long periods of time, continue to be served. Those that don’t get clicks may not make it through the Candidate Generation filter the next time Josh (or a viewer like Josh) visits the site.
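Josh’s journey through the two filters can be sketched in a few lines of Python. This is a toy under loud assumptions: the `Video` fields, the topic-matching rule, and the freshness formula are all hypothetical stand-ins, whereas the real system uses neural networks trained on far richer signals.

```python
from dataclasses import dataclass


@dataclass
class Video:
    title: str
    topic: str
    age_days: int
    expected_watch_minutes: float  # stand-in for a model's prediction


def generate_candidates(catalog, liked_topics, limit=100):
    """Stage 1: narrow the whole catalog to videos on topics the viewer likes."""
    matches = [v for v in catalog if v.topic in liked_topics]
    return matches[:limit]


def rank(candidates, freshness_bonus=1.0):
    """Stage 2: order candidates by predicted watch time, nudged by freshness."""
    def score(v):
        # Newer videos get a small boost that fades with age (toy formula).
        return v.expected_watch_minutes + freshness_bonus / (1 + v.age_days)
    return sorted(candidates, key=score, reverse=True)


# Hypothetical catalog; a real one holds millions of videos.
catalog = [
    Video("Cartoon review", "animation", age_days=1, expected_watch_minutes=6.0),
    Video("Old cartoon essay", "animation", age_days=400, expected_watch_minutes=6.2),
    Video("Cooking basics", "cooking", age_days=2, expected_watch_minutes=9.0),
    Video("Speedrun highlights", "gaming", age_days=0, expected_watch_minutes=4.0),
]

josh_topics = {"animation", "gaming"}  # derived from Josh's history in this toy
recommended = rank(generate_candidates(catalog, josh_topics))
print([v.title for v in recommended])
```

Note that the cooking video never reaches the ranking stage even though it has the highest predicted watch time, and that the fresh cartoon review edges out the older essay once the freshness nudge is applied.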

Conclusion

Deep Neural Networks for YouTube Recommendations is a fascinating read. It’s the first real glimpse into the algorithm, directly from the source(!!!), that we’ve seen in a very long time. I hope we continue to see more papers like it so publishers can make better choices about what content they create for the platform. And that’s ultimately why I write these blogs in the first place. Making content suited for the platform means creators will generate more views, and therefore more revenue, which ultimately means we can make more and better programming and provide more entertainment for the billions of viewers who rack up significant Watch Time on YouTube each and every month.

Reverse Engineering The YouTube Algorithm (Part 1)

  • By Matt Gielen

[Editor’s Note: You can read Reverse Engineering the YouTube Algorithm: Part II right here. You don’t need to read it after reading Part I, but you should check it out at some point. It’s excellent.]

[Originally published on Tubefilter.com in June 2016]

If you’re a creator who makes content for any kind of distribution (whether it be a feature film, a theatrical play, a TV program, or some kind of online video) the success or failure of that content can be dependent upon the mechanics of the distribution mechanism. For example, if you’re making a TV show and you want that show to be successful, you ideally want to know when to put in ad breaks, how to promote the program, which channel your show will appear on, how many homes the channel reaches, and so on, and so forth.

If you’re distributing videos onto YouTube, however, the most valuable knowledge you can have about that distribution point is how the YouTube algorithm works. But, like everything algorithm-related, that’s hard to do.

YouTube doesn’t make the variables that factor into its algorithm public. So, to figure out how it works, we must peer into a very big and very dark black box with very limited data. There are also factors at play that we have absolutely no data for whatsoever. These data points (such as thumbnail and title impressions, user viewing history and behavior, session metrics, etc.) would shed a lot of light on the algorithm. But, alas. They don’t exist.

Despite these limitations, we still have an obligation to try and figure out as much as we can with the data available to us. This is why my former colleague (FYI, I recently left Frederator to explore other opportunities), Jeremy Rosen, and I spent six months examining data from Frederator’s owned and operated channels to learn as much as we could about the YouTube algorithm.

One quick note before we get started. Throughout this post we will refer to the multiple YouTube promotional algorithms (Recommended, Suggested, Related, Search, MetaScore, etc.) simply as “the YouTube algorithm.” There are many differences between them, but generally they share the same principle. They’re all optimized for “Watch Time.”

Watch Time

First things first. “Watch Time” DOES NOT mean minutes watched. As we discussed before, Watch Time is a combination of the following:

  • Views

  • View Duration

  • Session Starts

  • Upload Frequency

  • Session Duration

  • Session Ends

Essentially, each of these items relates to how well and how often your channel and its videos get people to start a Viewing Session and stay on the platform for an extended period of time.

In order to accrue any sort of value in the algorithm, your channel and videos first need to get views. And for a video to be “successful” (success being defined by achieving viewership equal to or greater than 50% of the subscriber base in the first 30 days) you need to get a lot of views in the first minutes, hours, and days of a video’s release. We refer to this as View Velocity.
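As a rough illustration of these two definitions, the Python sketch below computes View Velocity as the share of subscribers who watched in an early window and applies the 50% success threshold described above. The function names and sample numbers are hypothetical, and treating every early view as a subscriber view is a simplifying assumption.

```python
def view_velocity(early_subscriber_views: int, subscribers: int) -> float:
    """Share of the subscriber base that watched in the early window."""
    return early_subscriber_views / subscribers


def is_successful(views_first_30_days: int, subscribers: int) -> bool:
    """Success as defined above: 30-day views of at least 50% of subscribers."""
    return views_first_30_days >= 0.5 * subscribers


subscribers = 100_000  # hypothetical channel size
print(f"{view_velocity(8_000, subscribers):.0%}")  # 8% of subscribers watched early
print(is_successful(60_000, subscribers))          # True
print(is_successful(40_000, subscribers))          # False
```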

Views and View Velocity

When analyzing Frederator’s view velocity, we found that the average life-to-date viewership of a video increased exponentially as the percent of subscribers who watched in the first 48 hours increased:

1-View-Velocity.png

Average viewership for videos that received this percentage of subscriber views in the first 48 hours.

As a result of seeing this, we dug a bit deeper and found that we could predict, with nearly 92% accuracy, whether a video would perform well for us based on its View Velocity. Essentially, there was a direct correlation between the percentage of subscribers who viewed in the first 72 hours and a video’s life-to-date viewership.

View Velocity YouTube

Trendline of viewership as it relates to the percentage of subscribers who watched in the first 72 hours.

These graphs and correlations show that Views and View Velocity have direct and significant impacts on the overall success of a video and a channel. In addition, we found evidence that suggests the reverse is true as well. Poor View Velocity has a negative impact on that video, the following videos, and previous videos.

This graph shows that if Frederator’s previous uploads had poor View Velocity (defined as less than 5% of subscribers) in the first 48 hours, our next uploads would be impacted negatively as well:

3-Negative-View-Veolcity.png

The percentage of subscribers who watched the next video versus the average percentage of subscribers who watched the 2 previous videos.

This data supports Matthew Patrick’s theory outlined in this video, which suggests that if one of your videos is not clicked on by a large number of subscribers, YouTube will not serve your next upload to a significant portion of your subscriber base.

It is possible that since the previous upload did poorly there will be less viewership on the channel, which will lead to fewer viewers passing through organically. But the results are the same regardless of the “why.”

Another significant impact of negative View Velocity on a new upload is that there’s evidence to suggest it also harms the viewership of your library of videos. Below, the first graph shows the seven-day rolling average percentage of subscribers who viewed in the first 48 hours (blue line) versus overall channel viewership. The second graph shows the overall percentage of subscribers who watched a video that day versus overall channel viewership.

The Running 7 Day Average percentage of subscribers who viewed in the first 48 hours versus daily viewership for Channel Frederator.

5-percentage-of-subs-watching-vs-overall-viewership.png

The 7 day rolling average of subscriber views versus total viewership for Channel Frederator.

Essentially what these graphs show is that as the percentage of your subscriber base that view new uploads and/or your library videos goes down, so does overall channel viewership. To us, what this says is that through the algorithm, YouTube actively promotes channels that appeal to that channel’s core audience, while actively punishing channels that do not.
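For readers who want to run the same kind of comparison on their own channel, here is a minimal Python sketch of a seven-day rolling average and a Pearson correlation against daily views. The data is invented for illustration, and the source does not say which correlation measure was used, so Pearson is an assumption here.

```python
from math import sqrt


def rolling_mean(values, window):
    """Trailing mean over `window` points; None until the window is full."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            chunk = values[i + 1 - window : i + 1]
            out.append(sum(chunk) / window)
    return out


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical daily data: % of subscribers viewing, and total channel views.
sub_pct = [4.0, 5.0, 6.0, 5.0, 7.0, 8.0, 7.0, 9.0, 10.0, 9.0]
daily_views = [200, 210, 230, 240, 260, 300, 310, 330, 370, 380]

rolling = rolling_mean(sub_pct, window=7)
# Keep only the days where the seven-day window is full.
pairs = [(r, v) for r, v in zip(rolling, daily_views) if r is not None]
r = pearson([p[0] for p in pairs], [p[1] for p in pairs])
print(f"correlation between rolling subscriber % and daily views: {r:.3f}")
```

A coefficient near 1 on real data would match the pattern in the graphs above, though correlation alone does not show which way the effect runs.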

View Duration

The next biggest metric we found to have a significant impact on the algorithm is View Duration.

View Duration speaks to how long a viewer spends watching an individual video. This metric carries a lot of weight and our data suggest that there’s an obvious tipping point. On Channel Frederator this year, videos with an average View Duration of over eight minutes brought in an average of over 350% more views in the first 30 days than those under five minutes. The following graph shows the average life-to-date views for Channel Frederator’s videos versus the average view duration of those videos.

6-time-per-video-life-to-date-small.png

Average Life Time Views versus aggregated Average Life View Duration. *Note on this graph. We have limited data points on videos with view durations greater than eight minutes.

We also found that videos that were longer in duration performed better, too. This graph shows the average first seven-day views for videos less than five minutes (1), five minutes to 10 minutes (5) and 10 minutes or greater (10):

7-Longer-videos-views.png

7 Day Average Views versus aggregated Average View Duration.

This graph shows the same but with life-to-date views instead.

8-longer-videos-lifetime-views.png

Life To Date Average Views versus aggregated Average View Duration.
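The length buckets used in these two graphs (1, 5, and 10) are easy to reproduce. The Python sketch below groups videos by length and averages their views per bucket; the sample data is hypothetical and only the bucket boundaries come from the text above.

```python
from collections import defaultdict


def length_bucket(minutes: float) -> int:
    """Map a video length to the bucket labels used above: 1, 5, or 10."""
    if minutes < 5:
        return 1
    if minutes < 10:
        return 5
    return 10


def average_views_by_bucket(videos):
    """videos: iterable of (length_minutes, views) pairs."""
    totals = defaultdict(lambda: [0, 0])  # bucket -> [sum of views, count]
    for minutes, views in videos:
        entry = totals[length_bucket(minutes)]
        entry[0] += views
        entry[1] += 1
    return {b: total / count for b, (total, count) in sorted(totals.items())}


# Hypothetical (length in minutes, seven-day views) pairs.
sample = [(3, 10_000), (4, 14_000), (7, 30_000), (12, 50_000), (70, 90_000)]
print(average_views_by_bucket(sample))
```

Swapping seven-day views for life-to-date views in the input gives the second version of the graph.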

Adding to these findings, we have anecdotal evidence to suggest that simply making videos longer will improve viewership performance. A channel that Frederator works with in the kids space was uploading three to four videos per week of varying lengths (three minutes, 10 minutes, 30 minutes and 70 minutes). We noticed that the 70-minute videos were receiving far more viewership in the first two days than the other videos, despite being mainly repurposed library videos. On top of this, the 70-minute videos had the same average view duration as any other video of any length on this channel.

We recommended that they reduce their uploads to just the 70-minute video each week. Since implementing this new strategy the channel’s daily average viewership has increased by 500,000 views, while uploading 75% fewer videos over the last six weeks. Crazy, I know.

Session Starts, Session Duration, and Session Ends

A great deal of this research was based on the research done for my previous post, WTF Is Watch Time?!.

For a quick recap, Session Starts is essentially how many people start their YouTube viewership session with one of your videos. This speaks volumes as to why the first 72 hours of viewership from your subscribers is so important. Subscribers are the people most likely to watch your video on its first days of being live. They are also the most likely to click on one of your thumbnails as they are familiar with your brand.

Session Duration is how long your content keeps people on the platform as they are watching your video, as well as after they’ve watched your video. There’s little to no hard data here other than Average View Duration and Unique Views, which is a shoddy metric at best.

Session Ends relates to how often someone terminates a YouTube session while or after watching one of your videos. This is a negative metric to the algorithm and a metric where there is literally no data available to us.

An Algorithm Theory:

YouTube’s algorithm is designed to PROMOTE CHANNELS, NOT INDIVIDUAL VIDEOS. However, it uses VIDEOS to promote INDIVIDUAL CHANNELS.

The algorithm uses a combination of video specific data and channel aggregate data to determine which videos to promote. However, the end goal is to build that CHANNEL’S audience.

YouTube does this because they want to promote channels that:

  1. Make people come back to the platform often.

  2. Keep them on the platform for an extended period of time.

Here are three graphs that give evidence to this theory.

The first graph is the 48-hour subscriber views % vs. the seven-day viewership for individual videos. It shows us that if you start a lot of sessions your video is going to get a lot of views. If you reach a threshold, it becomes exponential:

9-28-hour-subscribe-views.png

Average 7 Day Views of videos that reached a certain percentage of subscribers in the first 48 hours.

The second graph shows the average daily views vs. rolling five-day % of subs viewership for the channel.

Average Daily Views versus the five day rolling percentage of subscribers who viewed.

This means that if you CONSISTENTLY get a large number of subscribers to start sessions (five-day rolling average) the algorithm increases the daily views it sends to the channel’s entire video library.

The final graph is the average daily views as a percentage of subscribers vs. rolling five-day percentage of subs viewership for the channel.

11-Percentage-of-subs-vs-rolling-five-day.png

Daily viewership as a percentage of Channel Frederator’s total subscriber base versus the five day rolling subscriber viewership percentage.

We believe this shows there is a correlation between a channel’s consistency and exactly how many views, as a percentage of your subscribers, YouTube will drive to your videos.

So, let’s say you’re a gaming channel with 100,000 subs and you upload once daily and get 5% of your subs to watch each video. Your rolling average would be a consistent but modest 5%. This means you would be generating roughly 30% of your subscriber count in views on a given day (or 30,000/day, which is roughly 900,000 over a 30-day month). Now let’s say you have 1 million subs. Those numbers would look more like 300,000 daily views and roughly 9,000,000 monthly views.
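The daily figures in that example are simple to reproduce. Here is a small Python sketch; the function name is hypothetical and the 30% share is the rough estimate quoted above, not a constant of the algorithm.

```python
def daily_views(subscribers: int, daily_share: float = 0.30) -> float:
    """Estimated daily channel views as a share of the subscriber count.

    The 30% default is the rough figure quoted above for a channel with a
    steady 5% rolling average; treat it as an estimate, not a rule.
    """
    return subscribers * daily_share


for subs in (100_000, 1_000_000):
    print(f"{subs:,} subs -> {daily_views(subs):,.0f} views/day")
```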

We think that math checks out pretty well. And essentially this means that YouTube is selecting channels to promote based on certain performance metrics and then driving exactly as many views as its algorithms determine to promote that channel.

But that’s just a theory!

An Algorithm Score

Here we have taken a crack at recreating these algorithms. Using 15 signals and our best estimate of their weights we’ve created an Algorithm Score. Here are the factors we used to figure it out:

12-Algorithm-Score-1.png

And here are the graphs putting our factors into action.

13-Algorithm-Score-2.png

Trendline of correlation between the 3 day rolling average Algorithm Score versus Views.

14-Algorithm-Score-3.png

Trendline of correlation between the Algorithm Score versus Views.

We’ve gotten it pretty close here:

15-Algorithm-Score-4.png

The 3 Day Rolling Average Algorithm Score versus Daily Views

If you’re curious, this is our (very) rough view of how the algorithm is weighted: 

16-Algroithm-Weight-1.png

Algorithm Weighting Factors

16-Algroithm-Weight-2.png

Weighting for Watch Time metrics.

16-Algroithm-Weight-3.png

Algorithm Weighting for non-Watch Time Metrics

However, without more data, we can’t be sure what type of regression to use in the correlation and are only able to say we have strong correlations for most signals. That and we’re still just YouTube Algorithm enthusiasts.
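For anyone who wants to experiment with a score of their own, the general shape is a weighted sum of normalized signals, smoothed with a rolling average. The Python sketch below is only that shape: the three signal names and their weights are hypothetical placeholders, not the 15 signals or the weights shown in the charts above.

```python
# Hypothetical signals and weights, chosen only to illustrate the technique.
# The real signals and weights appear in the charts and are not reproduced here.
WEIGHTS = {
    "view_velocity": 0.5,
    "avg_view_duration": 0.3,
    "upload_frequency": 0.2,
}


def algorithm_score(signals: dict) -> float:
    """Weighted sum of signals, each assumed pre-normalized to the 0-1 range."""
    return sum(WEIGHTS[name] * signals[name] for name in WEIGHTS)


def rolling_average(scores, window=3):
    """Trailing average over the last `window` scores (shorter at the start)."""
    out = []
    for i in range(len(scores)):
        chunk = scores[max(0, i + 1 - window) : i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


# Three hypothetical days of normalized channel signals.
days = [
    {"view_velocity": 0.2, "avg_view_duration": 0.5, "upload_frequency": 1.0},
    {"view_velocity": 0.4, "avg_view_duration": 0.5, "upload_frequency": 1.0},
    {"view_velocity": 0.6, "avg_view_duration": 0.5, "upload_frequency": 1.0},
]
scores = [algorithm_score(d) for d in days]
print([round(s, 2) for s in scores])
print([round(s, 2) for s in rolling_average(scores)])
```

Plotting the rolling score against daily views, as in the graphs above, is then a matter of swapping in real channel data and adjusting the weights until the two lines track each other.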

The Ramifications of YouTube’s (Current) Algorithms

The data we found suggests 6 main takeaways:

  1. YouTube algorithmically determines exactly how many views each video and channel will get.

  2. Successful channels focus on one very specific content type/idea.

  3. Channels should rarely experiment once they’ve established a single successful content type.

  4. High dollar content producers will never be successful on the YouTube platform and therefore never fully embrace it.

  5. Personality driven shows/channels will always be the dominant content type on the platform because they are the “very specific content type” people are watching for.

  6. New channels that have no access to their own audience off of YouTube will struggle for a long time to grow.

In conclusion, it is our view that the algorithm is designed to promote channels that are capable of uploading videos that get and keep a large swath of their niche audience watching. If you want to be successful on YouTube the best advice we can give you is to focus on one very specific niche interest and make as many 10-minute or longer videos as you can about that singular topic.

On a personal note, I’d like to mention that YouTube catches a lot of flak for its algorithms and I hope they don’t interpret this post as a negative look at the algorithm. Throughout this research process, I have gained an even deeper appreciation for YouTube and the engineers who oversee and design the algorithms. They are, after all, trying to entertain a billion people a month across the entire world, with vast and varied interests. When you take a step back and look at it as a whole, it’s an astounding thing of beauty designed unbelievably well to achieve YouTube’s business goals and prevent people from abusing the system. My hat is off to them.