This is the first post in a series of technical blog posts that outline some of the key challenges the OTT industry is currently facing, especially the control functionality needed when delivering TV content in a multi-CDN environment. In this post, we will explore how to choose a streaming source for OTT video sessions using the HLS and MPEG-DASH protocols with server-side mechanisms.
Streaming video over IP is not new. For more than 20 years, high-quality video has been streamed over IP, starting with IPTV services using MPEG-2 TS streams with RTSP as the control protocol.
With IPTV, the server is in control. The client can only request actions (pause, rewind, fast forward etc.), meaning it is up to the server to serve the request in any way it sees fit. Streaming over controlled networks to Set-Top-Boxes (STBs) is where Edgeware started its streaming business.
Video over the Internet started very slowly, evolving from low-quality downloads to streaming. In fact, one of the authors played a role in starting up the RTSP-based 3GPP streaming standard PSS when it first came onto the scene in 2000.
When Apple introduced the iPhone in 2007, video was limited to progressive download of MP4 files; live and streamed video was not even supported. However, new streaming solutions started to appear, and control moved to the client side. In 2009, Apple introduced HLS (HTTP Live Streaming) which gave the client a menu of media segments to choose from, reducing the streaming server to a web server. But Apple was not the first to do this. Move Networks had already introduced segment-based HTTP streaming with adaptive bitrates, which gave even more control to the client and, under good conditions, could deliver high quality TV over the Internet.
Almost all OTT video streaming is now sent over HTTP. It allows the client to adapt to network conditions by choosing different variants of the video based on its estimation of the available bandwidth. By using the same HTTP distribution networks that were already used for static web content, it has made distribution cheap, but at the cost of losing server-side control, monitoring and enhancements.
Over time, video services have grown tremendously, resulting in video traffic dominating the Internet. HLS and MPEG-DASH are the major HTTP streaming protocols in use today. Focusing on these two protocols, this post looks at how the server side can handle and control the initial setup of a streaming media session and choose which server will provide the media to the client.
Before going into the selection mechanisms, we need to understand a little more about how HTTP streaming, in particular HLS and MPEG-DASH, works.
Both HLS and MPEG-DASH – as well as other HTTP streaming techniques such as Microsoft Smooth Streaming – start with the client downloading a manifest describing the media and its media segments, followed by the client downloading a series of these segments.
For VoD (Video on Demand), everything the client needs to know about the asset is known from the start, so there is no need to download any manifest updates. However, for live video where segments are produced one by one as time passes, the client typically requests an updated manifest after reaching the end of the manifest to see if any new segments have been added.
For HLS, the manifest download is normally a two-step process. First, a master playlist is downloaded to find out which media playlists are available. The client then chooses the initial media playlist for each media type (audio, video and subtitles) and downloads them. The media playlists contain lists of available segments, which the client begins to download from the start for VoD and from close to the end for live. Depending on the estimated download speed, the client can choose to switch to a higher or lower bitrate for the next segment. If it is live content, the client downloads the media playlist again when it reaches the end to see if any new segments have been added. When many clients are constantly checking for updates, this polling can become a significant load on the servers.
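The two-step HLS process can be illustrated with a small Python sketch. It is a simplified illustration, not a complete HLS parser: the sample playlist, the URLs and the function names `list_variants` and `pick_variant` are assumptions made for this example, and only the BANDWIDTH attribute is read.

```python
import re
from urllib.parse import urljoin

# Hypothetical master playlist with two video variants.
MASTER_PLAYLIST = """#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360
video_800k/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=3000000,RESOLUTION=1280x720
video_3000k/playlist.m3u8
"""


def list_variants(master_text, master_url):
    """Return (bandwidth, absolute media playlist URL) pairs."""
    variants = []
    pending_bandwidth = None
    for line in master_text.splitlines():
        line = line.strip()
        if line.startswith("#EXT-X-STREAM-INF:"):
            attributes = line.split(":", 1)[1]
            # Match BANDWIDTH only, not AVERAGE-BANDWIDTH.
            match = re.search(r"(?:^|,)BANDWIDTH=(\d+)", attributes)
            pending_bandwidth = int(match.group(1)) if match else None
        elif line and not line.startswith("#") and pending_bandwidth is not None:
            # The URI line following the tag is resolved against the master URL.
            variants.append((pending_bandwidth, urljoin(master_url, line)))
            pending_bandwidth = None
    return variants


def pick_variant(variants, estimated_bps):
    """Pick the highest bitrate that fits the estimate, else the lowest."""
    affordable = [v for v in variants if v[0] <= estimated_bps]
    if affordable:
        return max(affordable, key=lambda v: v[0])
    return min(variants, key=lambda v: v[0])
```

A real player uses more elaborate bandwidth estimation and buffer logic; the point here is only that the choice is made by the client from the menu the server provides.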
For MPEG-DASH, all variants are described in a single manifest called a Media Presentation Description (MPD). The MPD is more compact than an HLS media playlist since it uses templates to describe the segment URLs.
Part of that template can be replaced to generate all segment URLs. To further improve manifest download performance for live, DASH has a mode where the URLs of future segments can be created by simply incrementing a number. Since the client needs to know when a segment will be available, the MPD provides the timing information needed to calculate the availability time of any segment. Therefore, a single manifest download is enough for a long-running DASH live session.
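As a rough illustration of the number-based mode, the sketch below expands a template and computes when a segment becomes available. It assumes fixed-duration segments and a period that starts at the availability start time; the template string, the parameter values and the function names are made up for this example. Format variants such as `$Number%05d$` are not handled.

```python
from datetime import datetime, timedelta, timezone


def segment_url(template, representation_id, number):
    """Expand the $RepresentationID$ and $Number$ template identifiers."""
    return (template
            .replace("$RepresentationID$", representation_id)
            .replace("$Number$", str(number)))


def availability_time(start_time, start_number, number, duration, timescale):
    """A segment is available once it has been completely produced."""
    segment_seconds = duration / timescale
    return start_time + timedelta(
        seconds=(number - start_number + 1) * segment_seconds)


def latest_available_number(start_time, start_number, duration, timescale, now):
    """Return the newest complete segment number, or None if there is none yet."""
    elapsed = (now - start_time).total_seconds()
    complete = int(elapsed // (duration / timescale))
    if complete <= 0:
        return None
    return start_number + complete - 1


# Assumed values: 2-second segments (180000 ticks at a 90 kHz timescale).
START = datetime(2020, 1, 1, tzinfo=timezone.utc)
TEMPLATE = "video/$RepresentationID$/$Number$.m4s"
```

With these values, segment 10 becomes available 20 seconds after the start time, so a client that knows the current time can compute the newest segment URL without fetching the manifest again.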
One way of selecting a streaming source on the server side is to use DNS. By providing different IP addresses to different clients, it is possible to spread clients across multiple servers. Since a DNS lookup only carries the requested domain and the source IP address, the information available for an intelligent selection is limited. There is also no way of forcing the client to do a new lookup once it has resolved a domain, so you cannot control precisely when a change of source takes effect. However, a good thing about DNS is that all requests to that domain will go to the same IP address. Therefore, it works independently of the streaming protocol.
The other major way to select a source on the server side is to redirect the initial request using an HTTP 3xx redirect response.
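To make the limitation concrete, here is a minimal sketch of the decision a DNS-based selector can make. It is not a DNS server; the domain, the networks and the server addresses are hypothetical, and the only inputs are the two pieces of information a lookup carries.

```python
import ipaddress

# Hypothetical policy: map source networks to streaming server addresses.
SERVICE_DOMAIN = "video.example.com"
SERVERS_BY_NETWORK = [
    (ipaddress.ip_network("10.1.0.0/16"), "192.0.2.10"),
    (ipaddress.ip_network("10.2.0.0/16"), "192.0.2.20"),
]
DEFAULT_SERVER = "192.0.2.30"


def resolve(domain, source_ip):
    """Choose a server address from the domain and source IP alone."""
    if domain != SERVICE_DOMAIN:
        return None  # Not a domain this selector is responsible for.
    address = ipaddress.ip_address(source_ip)
    for network, server_ip in SERVERS_BY_NETWORK:
        if address in network:
            return server_ip
    return DEFAULT_SERVER
```

Nothing about the asset, the device or the session is visible at this point, which is why the selection cannot be more specific than per domain and source address.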
A redirect is an HTTP response that contains a URL to another source, instead of the requested content. The URL is carried in a Location header.
host1/asset/manifest.mpd -> Location: host2/asset/manifest.mpd
The client then makes a second request to the new URL and downloads the content from there.
In this case, the server has access not only to the domain, but also to the full URL and the HTTP headers. It therefore knows which asset is being requested and can usually infer the device type, for example from the User-Agent header.
Some systems also choose to insert a session identifier in the URL when doing the initial redirect:
host1/asset/manifest.mpd -> Location: host2/<sessionId>/asset/manifest.mpd
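A redirect of this kind could be built roughly as follows. This is a sketch under stated assumptions: the host names, the selection policy in `choose_host` and the choice of status code 302 are illustrative and do not describe any particular request router.

```python
import uuid
from urllib.parse import urlsplit


def choose_host(path, user_agent):
    """Hypothetical policy: the full path and headers are available here."""
    if "SmartTV" in user_agent:
        return "host3.example"
    return "host2.example"


def build_redirect(request_url, user_agent, session_id=None):
    """Return (status, headers) redirecting to the selected source."""
    session_id = session_id or uuid.uuid4().hex
    parts = urlsplit(request_url)
    host = choose_host(parts.path, user_agent)
    # Insert the session identifier as the first path component.
    location = f"{parts.scheme}://{host}/{session_id}{parts.path}"
    if parts.query:
        location += "?" + parts.query
    return 302, {"Location": location}
```

Because the session identifier is part of the path, it is carried along by every later request that is resolved relative to the redirected manifest URL.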
The server can also add tokens for access control to the CDN. However, many streaming services seem to think that DRM protection is enough, leaving the CDN open to anyone who knows the URL.
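One common way to build such a token is an HMAC over the path and an expiry time, using a secret shared between the redirecting server and the CDN. The sketch below shows the idea only; the parameter names `expires` and `token`, the message layout and the secret are assumptions, and each commercial CDN defines its own token format.

```python
import hashlib
import hmac

# Hypothetical secret shared by the redirecting server and the CDN.
SECRET = b"shared-secret-known-to-router-and-cdn"


def sign_path(path, expires, secret=SECRET):
    """Create a hex token binding the path to an expiry time (epoch seconds)."""
    message = f"{path}|{expires}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def tokenized_url(base, path, expires):
    """Build the URL that the redirect would point to."""
    return f"{base}{path}?expires={expires}&token={sign_path(path, expires)}"


def verify(path, expires, token, now):
    """CDN-side check: reject expired or tampered requests."""
    if now > expires:
        return False
    return hmac.compare_digest(token, sign_path(path, expires))
```

A token like this limits access to clients that went through the initial redirect, independently of any DRM applied to the media itself.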
HTTP redirect gives a number of advantages compared to using DNS. At Edgeware, we call this request routing. It is a service that selects the source, creates session information and can also create access tokens to allow access to 3rd party CDNs.
After the initial redirect, the client will not come back to the initial source for the same URL. However, other resources like media playlists and segments typically have relative URLs, which must also use the redirect URL as their base. This behavior needs to be implemented in the client.
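The expected client behavior corresponds to standard relative URL resolution, which can be shown with Python's `urljoin`. The URLs and the session ID `abc123` below are hypothetical.

```python
from urllib.parse import urljoin

# URL the client first requested, and the URL it was redirected to.
ORIGINAL_URL = "https://host1.example/asset/master.m3u8"
REDIRECT_URL = "https://host2.example/abc123/asset/master.m3u8"


def resolve_reference(effective_manifest_url, reference):
    """Resolve a URL found in a manifest against the manifest's own URL."""
    return urljoin(effective_manifest_url, reference)


# A well-behaved client uses the redirect URL as base, so it stays on host2
# and keeps the session ID in the path.
good = resolve_reference(REDIRECT_URL, "video_800k/playlist.m3u8")

# A client that resolves against the original URL goes back to host1.
bad = resolve_reference(ORIGINAL_URL, "video_800k/playlist.m3u8")
```

Note that an absolute URL in the manifest is returned unchanged by the resolution, so it bypasses the redirect base entirely.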
If the clients do not use the base path of the redirect URL or if the manifest includes absolute URLs, the server needs to rewrite the manifest to use mechanisms that are part of the HLS or DASH format.
For HLS, the main mechanism is to include absolute URLs for the media playlists in the master playlist. Since the master playlist is only fetched once, the absolute URLs, possibly extended with session IDs, are then used for all future media playlist requests, and also serve as base URLs for the relative URLs of media segment requests.
In MPEG-DASH, there are three mechanisms that affect how URLs are resolved. First, all relative URLs are resolved with respect to the manifest URL. Second, this resolution can be modified by `<BaseURL>` elements, which provide a base path for all relative segment URLs in the manifest. By using an absolute `<BaseURL>`, it is possible to steer all segment requests to a specific server.
However, this does not influence the URL for future requests of the live manifest itself. To handle that, the third mechanism, the `<Location>` element, comes to the rescue. This element tells the player that the URL provided inside it should be considered the source URL of the manifest. It not only provides a new base path for the segment URLs, but also serves as the URL for future manifest update requests. As with HTTP redirect, you can also extend the URL with additional parameters, so all flexibility is retained.
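A master playlist rewrite of this kind might look like the following sketch. The function name, the sample base URL and the handling of `URI="..."` attributes (used, for example, by EXT-X-MEDIA tags for audio and subtitles) are assumptions for illustration; a production rewriter would need to cover more tags.

```python
import re
from urllib.parse import urljoin


def rewrite_master_playlist(master_text, base_url):
    """Make all media playlist references absolute under base_url.

    base_url should end with "/" and may include a session ID,
    e.g. "https://host2.example/abc123/asset/".
    """
    rewritten = []
    for line in master_text.splitlines():
        stripped = line.strip()
        if not stripped:
            rewritten.append(line)
        elif stripped.startswith("#"):
            # Tags may carry playlist references in URI="..." attributes.
            rewritten.append(re.sub(
                r'URI="([^"]+)"',
                lambda m: 'URI="%s"' % urljoin(base_url, m.group(1)),
                line))
        else:
            # A non-tag line is the URI of a media playlist.
            rewritten.append(urljoin(base_url, stripped))
    return "\n".join(rewritten) + "\n"
```

The segment URLs inside the media playlists can stay relative, since the client resolves them against the rewritten absolute media playlist URLs.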
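The corresponding MPD rewrite can be sketched with Python's standard XML library. The function name and sample URLs are assumptions; the sketch places the new elements after any ProgramInformation elements, and it does not handle `<BaseURL>` or `<Location>` elements that are already present in the manifest.

```python
import xml.etree.ElementTree as ET

DASH_NS = "urn:mpeg:dash:schema:mpd:2011"
# Serialize with the DASH namespace as the default namespace.
ET.register_namespace("", DASH_NS)


def add_location_and_base_url(mpd_text, manifest_url, base_url):
    """Insert <BaseURL> and <Location> at the top level of an MPD."""
    root = ET.fromstring(mpd_text)
    children = list(root)
    # Skip any leading ProgramInformation elements.
    index = 0
    program_info_tag = f"{{{DASH_NS}}}ProgramInformation"
    while index < len(children) and children[index].tag == program_info_tag:
        index += 1
    base = ET.Element(f"{{{DASH_NS}}}BaseURL")
    base.text = base_url          # Steers segment requests.
    location = ET.Element(f"{{{DASH_NS}}}Location")
    location.text = manifest_url  # Steers future manifest updates.
    root.insert(index, base)
    root.insert(index + 1, location)
    return ET.tostring(root, encoding="unicode")


# Hypothetical minimal live MPD used to try the function.
SAMPLE_MPD = (
    '<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="dynamic">'
    '<Period id="p0"/></MPD>'
)
```

Calling `add_location_and_base_url(SAMPLE_MPD, "https://host2.example/abc123/asset/manifest.mpd", "https://host2.example/abc123/asset/")` yields an MPD whose segment requests and manifest updates both go to the selected host with the session ID in the path.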
With HTTP redirect, one can achieve a more intelligent server-side selection of media streaming sources compared to using DNS. One can furthermore introduce new parameters in the redirect URL such as a session ID. This works in most clients, but in cases where it does not, there are HLS and MPEG-DASH specific mechanisms that can be triggered by rewriting the initial manifest response. As with HTTP redirect, it will determine the source of the complete session.
Edgeware’s TV CDN is based on HTTP redirect in a Request Router (TV Director). It has been further enhanced with the HLS and MPEG-DASH specific manifest rewrite mechanisms, which are triggered by the HTTP User-Agent header so that all clients can be handled.
Torbjörn Einarsson, Expert, Media protocols and codecs, Edgeware
Kalle Henriksson, Founder, Edgeware
In our next post, we will continue the journey and look at how streaming sessions can be monitored from the server side. Read the next post here.
If you would like to find out how Edgeware handles session setup in multi-CDN environments, read more about our StreamPilot session control platform here. If you would like to see a live demo, please contact us here.
There is also an explicit list mode in DASH, but it is normally not used.
The use of templates makes DASH manifests more compact than HLS playlists, but it also makes changing a segment URL a much bigger operation in DASH than in HLS. The multiple periods that such a change requires are often not supported for live. We will come back to that in a later blog post.