Staying updated on all the tech talk in video engineering is no easy feat. From new video codecs to platform-specific lingo, it can leave your mind feeling a little boggled.
That’s why we’ve put together this video engineering dictionary.
Enjoy!
AAC, or Advanced Audio Coding, is the follow-up to MP3; think of it as MP4 for audio :). It is more efficient than MP3, so you get better-sounding audio at the same bitrate (usually 128 or 192 kbps). It is very widely supported on almost all devices.
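To make that concrete, here’s a minimal sketch of encoding a file to AAC with FFmpeg (covered later in this list), assuming ffmpeg is installed and using hypothetical filenames:

```python
import subprocess

# Hypothetical filenames; assumes ffmpeg is installed and on the PATH.
# -c:a aac selects FFmpeg's built-in AAC encoder; -b:a 128k targets 128 kbps.
subprocess.run(
    ["ffmpeg", "-i", "input.wav", "-c:a", "aac", "-b:a", "128k", "output.m4a"],
    check=True,
)
```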
All of the terms here are different containers. A container is the thing that “contains” the codec and carries the instructions on how to play the video file: for example, the container says where the metadata lives versus the actual video track, versus the audio tracks (stereo, 5.1, etc.). Some of these are intended for studio use only and are made for editing very large files, while others are consumer-facing and you play them on your iDevice every day.
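If you want to see what a container is actually holding, ffprobe (which ships with FFmpeg) will report the container format and every track inside it. A minimal sketch, with a hypothetical filename:

```python
import json
import subprocess

# Hypothetical filename; assumes ffprobe (bundled with FFmpeg) is installed.
# -show_format reports container-level info; -show_streams lists each track.
result = subprocess.run(
    ["ffprobe", "-v", "quiet", "-print_format", "json",
     "-show_format", "-show_streams", "movie.mp4"],
    capture_output=True, text=True, check=True,
)
info = json.loads(result.stdout)
print(info["format"]["format_name"])          # e.g. "mov,mp4,m4a,3gp,3g2,mj2"
for stream in info["streams"]:
    print(stream["codec_type"], stream["codec_name"])  # video/audio tracks
```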
Bitrate is defined as “the number of bits per second that can be transmitted along a digital network,” but what it really means in video is the amount of data that can reliably be sent to you so you can watch the video. If you want to watch a video in HD, it will take a higher bitrate than SD, but a lower bitrate than 4K, because of the amount of information being sent to you. To put it another way, a 1080p stream will have a higher bitrate than a 720p stream because you’re sending much more information per frame. In general, you might see a 3 Mbps bitrate on the 720p stream but 6 Mbps on the 1080p one.
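Those numbers add up quickly. A quick back-of-the-envelope calculation, using the illustrative figures above:

```python
# How much data does a given bitrate add up to over a whole movie?
# Figures are the illustrative ones from the text, not measurements.
bitrate_mbps = 6          # a typical 1080p stream
duration_s = 2 * 60 * 60  # a two-hour movie

total_megabits = bitrate_mbps * duration_s
total_gigabytes = total_megabits / 8 / 1000  # 8 bits per byte, 1000 MB per GB
print(f"{total_gigabytes:.1f} GB")           # 6 Mbps * 2 h ≈ 5.4 GB
```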
DRM is a way for companies to make sure their video is not being pirated. Before DRM, if a video leaked, it could be uploaded and played by anyone. With DRM, content owners can decide who/when/how/what gets played. If they only want you to be able to watch something for a week, they can do that. Not watchable offline? No problem. Can’t watch outside the US? Done. DRM can also be enforced down to the individual user, so the same file can work for one login and not for another. It’s very powerful but expensive, so mostly the valuable content (movies, TV shows, etc.) gets DRMed.
Encoding is the process of taking a video signal (usually from a camera or mixing board) and “encoding” it into a stream of data. Think of it like taking the camera signal and putting it through some very impressive mathematical formulas so we can digitize it. Another goal of encoding is to reduce the amount of data in the signal while still having it look great. For example, we might take a camera signal that’s 1 gigabit per second of data and compress it down to only 10 megabits per second. The encoder is the thing converting that video from analog to digital.
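That example is a bigger squeeze than it might sound. A quick sanity check on the ratio, using the figures above:

```python
# The compression the text describes: a 1 Gbps camera feed squeezed to 10 Mbps.
raw_bps = 1_000_000_000   # ~1 Gbps off the camera
encoded_bps = 10_000_000  # ~10 Mbps after encoding

print(f"compression ratio: {raw_bps / encoded_bps:.0f}:1")  # 100:1
```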
FFmpeg is a piece of free software that does the actual encoding. You type all of the settings that you want into a command-line invocation, and it runs and produces the finished video. It is very powerful but has a steep learning curve. The training-wheels version of FFmpeg is called HandBrake and uses a GUI.
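For flavor, here’s what a typical “encode this for the web” invocation looks like, wrapped in Python. Filenames are hypothetical; libx264 plus the built-in AAC encoder is the stock recipe for a widely compatible MP4:

```python
import subprocess

# Hypothetical filenames; assumes ffmpeg is installed and on the PATH.
# libx264 is FFmpeg's H.264 encoder; -crf 23 is its default quality target,
# and -c:a aac re-encodes the audio track to AAC at 128 kbps.
subprocess.run(
    ["ffmpeg", "-i", "source.mov",
     "-c:v", "libx264", "-crf", "23",
     "-c:a", "aac", "-b:a", "128k",
     "web.mp4"],
    check=True,
)
```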
FPS is a measure of how many video frames are made per second. In the USA we standardized on 30 frames per second, and now 60 for 4K. In Europe they standardized on 25 FPS, and 50 for 4K, so you can see why we needed converters for the US and Europe to watch the same show. The internet has made some of this moot, but it’s still a big deal in movies and TV.
H.264 is a video codec, and is the most widely used one today. Almost every device (TVs, mobile phones, iPads, etc.) can play back H.264 video files efficiently. It has been surpassed by a more efficient technology called H.265, or HEVC, but HEVC has been hampered by patent issues. If you need to encode a video to stream online, this is your best bet to reach as many devices and users as possible.
H.265, or HEVC (High Efficiency Video Coding), is the follow-up to H.264 and is the current “new thing” in codecs. Apple has provided support for it, which should help its adoption and usage, but it has a very complicated patent/license setup that makes it expensive and scary to move your whole video library to. Technically, H.265 can produce twice as good a picture at the same bitrate, or an equivalent picture at half the bitrate, when compared to H.264. That gain comes from added encoding complexity: it takes much more CPU to encode and decode H.265 than H.264.
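A sketch of that half-the-bitrate rule of thumb, again via FFmpeg, with hypothetical filenames and bitrates chosen purely for illustration:

```python
import subprocess

# The same source at 6 Mbps H.264 versus 3 Mbps H.265 should look comparable,
# per the rule of thumb above. Filenames and bitrates are made up.
for codec, bitrate, out in [("libx264", "6M", "avc.mp4"),
                            ("libx265", "3M", "hevc.mp4")]:
    subprocess.run(
        ["ffmpeg", "-i", "source.mov", "-c:v", codec, "-b:v", bitrate, out],
        check=True,
    )
```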
Adaptive streaming means the same video is produced at different quality levels that the player can switch between based on the user’s bandwidth at the moment. If you’re alone in a coffee shop on the wifi, you will probably get the highest-quality stream. If a group of gamers comes in and starts using up all of the bandwidth, your quality level might drop and the video won’t look as good. This way we don’t have to pick a one-size-fits-all approach: you can watch a video on your phone while on the bus, then finish watching it on a big TV when you get home, and both will look good.
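Here’s a toy sketch of the decision an adaptive player makes before fetching each chunk. The bitrate ladder and the headroom factor are made up for illustration:

```python
# A toy rendition picker. Real players are fancier, but the core idea is:
# pick the best quality whose bitrate fits inside the bandwidth you measured.
LADDER = [  # (height, bitrate in bits per second) -- illustrative values
    (1080, 6_000_000),
    (720, 3_000_000),
    (480, 1_500_000),
    (240, 400_000),
]

def pick_rendition(measured_bps: float, headroom: float = 0.8) -> int:
    """Return the tallest rendition whose bitrate fits the bandwidth budget."""
    budget = measured_bps * headroom  # leave margin so playback doesn't stall
    for height, bitrate in LADDER:
        if bitrate <= budget:
            return height
    return LADDER[-1][0]  # worst case: the lowest rung

print(pick_rendition(8_000_000))  # quiet coffee-shop wifi -> 1080
print(pick_rendition(2_000_000))  # the gamers arrive     -> 480
```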
A Linear Livestream is basically like a TV channel: you’re watching whatever is live at the moment, as if you had turned on your TV and just watched whatever channel it was on. You cannot fast-forward or skip the commercials; it is live.
Metadata is the information about a video, such as title, year made, genre, etc. It can be made very rich and can help a lot with searching for videos based on keywords.
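Containers can carry some of this metadata directly in the file. A minimal sketch of stamping fields with FFmpeg, using hypothetical names and values; -c copy rewrites the container without re-encoding the audio or video:

```python
import subprocess

# Hypothetical filenames and values; assumes ffmpeg is installed.
# -metadata writes key=value pairs into the container; -c copy avoids
# re-encoding, so this only rewrites the file's wrapper.
subprocess.run(
    ["ffmpeg", "-i", "movie.mp4",
     "-metadata", "title=Big Buck Bunny",
     "-metadata", "genre=Animation",
     "-c", "copy", "tagged.mp4"],
    check=True,
)
```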
DASH is a container, and contains a codec, usually H.264 or H.265. The benefit of DASH over things like HLS is in the richness of the manifest that is delivered with the video chunks. DASH made some great improvements over HLS, but it is more complicated and therefore hasn’t taken off as much as expected. Also, Apple has kept improving HLS, so it has stayed the dominant format. The largest companies use DASH, though, so about 50% of internet video traffic is in the format.
An MRSS feed is a feed of media data that’s used to syndicate videos from one company to many others. So if MTV wants to send out their videos for other people to watch on their sites, they would publish an MRSS feed, and that’s how people would know where to get the video and all of the other metadata (name, year, etc.).
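Under the hood, MRSS is just RSS plus a media: namespace that carries the video URL and its metadata. A minimal sketch of building one item, with made-up titles and URLs:

```python
import xml.etree.ElementTree as ET

# A minimal MRSS item with made-up values. The media: namespace URI is the
# standard one for MRSS; everything else here is illustrative.
ET.register_namespace("media", "http://search.yahoo.com/mrss/")
M = "{http://search.yahoo.com/mrss/}"

rss = ET.Element("rss", version="2.0")
channel = ET.SubElement(rss, "channel")
item = ET.SubElement(channel, "item")
ET.SubElement(item, "title").text = "Example Music Video"
content = ET.SubElement(item, M + "content",
                        url="https://example.com/video.mp4", type="video/mp4")
ET.SubElement(content, M + "title").text = "Example Music Video"

print(ET.tostring(rss, encoding="unicode"))
```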
OGG is an open-source, license-free alternative to MP3, so you don’t need to pay any licenses to use it. It’s not as good as MP3 or AAC, and while it has some device support, it should be an option and not the only audio option on your project.
OTT, or Over The Top video, is the idea of not using the cable box to send video into the home. It might seem weird in today’s world, but it used to be that if you wanted to watch something at home without a VHS/DVD player, you needed to watch it through your cable box. When the internet got fast enough, we started to see video delivered “over the top,” meaning not through the cable wire going into your home, but through the internet wire instead. This was a big problem for the cable companies, who wanted to be your only link to the video in your home.
Transcoding is the practice of converting a video from one format to another. For example, you have to create a certain file format for use on Apple devices, so a transcoder might take something created for Xbox and convert, or transcode, it for use on Apple devices. You can think of encoding as getting a signal from analog to digital, and transcoding as going from digital to digital.
VP9 is very similar to H.264 but comes from Google (which acquired its creator, On2), and it is used on some YouTube videos because it compresses more efficiently than H.264 at the same quality. It has been superseded by VP10/AV1.
AV1 is based on VP10, and is also the new codec du jour in the industry. It is royalty-free and very efficient, although very CPU-intensive to encode. Expect to see it in products from Google, Facebook, Amazon, etc.
A Video Content Management System is a tool that media companies use to tag and manage their video so they can find things easily and approve/deny things for air in the same system. They are very powerful, and at the enterprise level they can be set up to approve or deny a cable company, or even an individual user, access to content based on date and last payment.
A Video Origin is where a media company keeps their video files. Some companies keep them in a data center that they own and control, and some keep them on the public cloud, but either way it’s the place the CDNs go to retrieve the content from the content owner.
Encoding codecs are the different formats that the analog-to-digital conversion can end up in. Examples are H.264/5, VP9, ProRes, etc. All of these have pros and cons that are a little out of scope here, but generally there is a trade-off between how much CPU it takes to encode/decode and how small the file ends up. For example, a codec might take lots of CPU to encode and decode but produce a very small file. The problem is your cellular phone only has so much battery, so while the picture looks great, it’s eating up all of your juice to watch it. The goal is to find a midpoint between CPU cycles and file size.
VAST is a way for ad servers and video players to read and write from the same playbook. It defines a standard for how ads are delivered, so the ad companies know the video player has all the information it needs and there is no finger-pointing. Before VAST, everything was a custom implementation and things were much slower to get done.
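To show what that playbook looks like, here’s a stripped-down sketch of a VAST 3.0 response (all URLs and IDs are made up) and the two things a player pulls out of it:

```python
import xml.etree.ElementTree as ET

# A minimal VAST 3.0 response with made-up values. The player extracts the
# media file to play and the impression pixel to fire when playback starts.
VAST_XML = """
<VAST version="3.0">
  <Ad id="demo">
    <InLine>
      <AdSystem>ExampleAdServer</AdSystem>
      <AdTitle>Example 30s Spot</AdTitle>
      <Impression>https://ads.example.com/impression?id=demo</Impression>
      <Creatives>
        <Creative>
          <Linear>
            <Duration>00:00:30</Duration>
            <MediaFiles>
              <MediaFile delivery="progressive" type="video/mp4"
                         width="1280" height="720">
                https://ads.example.com/spot.mp4
              </MediaFile>
            </MediaFiles>
          </Linear>
        </Creative>
      </Creatives>
    </InLine>
  </Ad>
</VAST>
"""

root = ET.fromstring(VAST_XML)
print(root.findtext(".//MediaFile").strip())   # what the player plays
print(root.findtext(".//Impression").strip())  # what the player pings
```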