This is the second post in our series on the challenges the OTT industry currently faces in controlling the delivery of TV content across multiple CDNs. The previous post explored session setup in multi-CDN environments and is available here. This post focuses on measuring video quality of experience (QoE) when streaming over multiple CDNs.
As the term OTT (Over The Top) indicates, video is sent over an unmanaged network. This is unlike the managed network used for IPTV, where the available capacity from server to client is known. In HTTP streaming, the client decides which bitrate variant of the video to download, while the server is reduced to a delivery mechanism for the available variants.
When delivering high-quality TV services, it is important to have knowledge and control of the end-user experience. This can be provided by integrating specific software in the clients. However, this is often cumbersome, as the number of different clients is constantly increasing, and it usually ends with sampling a subset of clients rather than all of them. For this reason, it would be valuable to monitor and control video streaming on the server side and leave the clients with the simple task of presenting the video. In this post, we look into the monitoring possibilities on the server side; we will come back to the controlling part in a later post.
Let’s start with the client. Today, multiple vendors provide plugins in the clients to deliver quality of experience (QoE) measurements. Measuring QoE in the client makes sense: it is as close to the viewer as you can get, yielding the most accurate measurements of video stuttering, rebuffering and dropped frames. But as the client landscape grows increasingly complex, with different formats such as the popular HLS and MPEG-DASH and an explosion of client devices (SmartTVs, video sticks, apps and browsers), player integration takes more and more effort. For an accurate view across all of these, the clients need to send reports to a server that collects and analyzes the QoE information before it can be presented. And since client integration across all platforms is resource-heavy, only a subset of the clients is monitored. Furthermore, since different client platforms behave differently, expose different metrics and present different views, it is hard to get a unified view of the quality in real time.
With server-side QoE measurement, you get a less accurate view of the playback experience, but also a format and client independent measurement that covers all clients, all of the time. By doing so, you minimize client development. If CDN selection is based on QoE information, all sessions – independent of client implementation – should be covered and the reported latency should be low. This is achievable with a server-side solution.
There is plenty of visibility available on the server side if you know where to look. Sessions can be identified by tokens or identifiers in the URL that are usually introduced during setup and stay for the lifetime of the session. This identification can also be used to track and limit the number of concurrent sessions, depending on what is allowed by the subscription.
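As a sketch of this idea, the snippet below extracts a session token from a segment URL and tracks concurrent sessions per subscriber. The query-parameter name ("session") and the session limit are illustrative assumptions, not from any specific product:

```python
from collections import defaultdict
from urllib.parse import parse_qs, urlparse

def session_id(url):
    """Extract a session token from a segment URL, or None if absent."""
    return parse_qs(urlparse(url).query).get("session", [None])[0]

# subscriber id -> set of active session tokens (illustrative bookkeeping)
active = defaultdict(set)

def on_request(subscriber, url, max_sessions=3):
    """Track the session and report whether it is within the allowed limit."""
    sid = session_id(url)
    if sid is None:
        return False
    active[subscriber].add(sid)
    return len(active[subscriber]) <= max_sessions
```

In practice the token would be minted during session setup and validated cryptographically; the point here is only that the server can key all of its per-session measurements on it.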
There is also user-agent data in the HTTP headers which reveals a lot of information about the client and the client platform. With access to the manifest/playlist, the server can see which media is available. And as a segment is requested by the client, a server can see which video bitrate is being chosen compared to what is offered. Combined, this information can be used to identify the current quality of the video experience.
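As an illustration, a server that has parsed the manifest can map each requested segment path back to a variant in the offered bitrate ladder. The path prefixes and bitrates below are hypothetical:

```python
# Hypothetical bitrate ladder, as it would be parsed from the manifest:
# path prefix -> video bitrate in kbit/s
LADDER = {
    "video_400k/":  400,
    "video_1200k/": 1200,
    "video_2500k/": 2500,
    "video_5000k/": 5000,
}

def chosen_bitrate(segment_path):
    """Resolve a requested segment path to its variant bitrate, or None."""
    for prefix, kbps in LADDER.items():
        if segment_path.startswith(prefix):
            return kbps
    return None

kbps = chosen_bitrate("video_1200k/seg0042.m4s")  # second of four variants
top = max(LADDER.values())                        # highest offered bitrate
```

Comparing `kbps` against `top` over time shows whether a session is stuck on a low variant or enjoying the best quality on offer.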
In addition to supervising the download times for the actual segments, and in comparison to the segment playback durations, the server can see if the CDN is performing well. This gives an indication of the available bandwidth seen from the client. And in combination with the bitrate variant chosen, you can find out how much of the available bandwidth is being used.
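The arithmetic is simple. A minimal sketch, assuming segment size and download time are available from the access log:

```python
def throughput_kbps(segment_bytes, download_seconds):
    """Effective throughput of one segment download, in kbit/s."""
    return segment_bytes * 8 / download_seconds / 1000

def utilisation(chosen_kbps, segment_bytes, download_seconds):
    """Fraction of the estimated available bandwidth the client is using."""
    return chosen_kbps / throughput_kbps(segment_bytes, download_seconds)

# Example: a 1 MB segment of a 2 Mbit/s variant downloaded in 1 s
# suggests ~8 Mbit/s is available, i.e. roughly 25% utilisation.
used = utilisation(2000, 1_000_000, 1.0)
```

A download time approaching the segment's playback duration is the warning sign: it means the CDN is barely keeping up with real time for that session.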
If you look deep enough into the TCP stack and analyze the details, you can even find information about network problems and delays, perhaps even diagnose if the problem is in a network (such as WiFi) that’s outside the control of the service, or if it is the CDN or core network that is causing the problem.
For live video, the server can track how close the download process is to the live edge segment generation as revealed by the manifest. For HLS, the latest available segment is listed in the media playlist, while for DASH, things can get a bit more complex, depending on the timing mode and manifest template being used. The client’s proximity to the live edge can be deduced from information available at the server.
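For HLS, a rough sketch of this lag estimate compares media sequence numbers; the numbers and target duration below are illustrative:

```python
def live_edge_lag(newest_seq, requested_seq, target_duration):
    """Approximate lag in seconds behind the live edge, given the newest
    media sequence number in the playlist, the last sequence number the
    client requested, and the segment target duration in seconds."""
    return max(0, newest_seq - requested_seq) * target_duration

lag = live_edge_lag(1005, 1002, 4.0)  # three 4 s segments behind the edge
```

A session whose lag keeps growing is falling behind real time, which is another server-visible symptom of an underperforming CDN path.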
Another interesting measure is how much of the video is actually being displayed to the user. For live video, this should have a strong correlation with what is downloaded, so once segment requests cease for a specific session, the interval between the first and last segments can be logged as an estimate of the viewing time.
For VoD, viewing time is more difficult to estimate, since clients typically download ahead of the viewing point to build a large local buffer against potential future network issues. This extra buffering is client dependent, but is typically on the order of a minute, so it is still possible to get a useful estimate of how much of the video is actually being consumed.
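A minimal sketch of such a viewing-time estimate, where the fixed ~60 s buffer-ahead allowance for VoD is an assumption (in reality it varies per client):

```python
def viewing_time(first_request_ts, last_request_ts, is_vod,
                 buffer_ahead_s=60.0):
    """Estimate seconds viewed from the span between the first and last
    segment requests of a session. For VoD, subtract an assumed
    buffer-ahead allowance, since clients download ahead of playback."""
    span = last_request_ts - first_request_ts
    return max(0.0, span - buffer_ahead_s) if is_vod else span

live_estimate = viewing_time(0.0, 300.0, is_vod=False)  # 5 min of live
vod_estimate = viewing_time(0.0, 300.0, is_vod=True)    # ~4 min of VoD
```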
There are also cases where the stream fails to start. A player usually needs at least two segments before it starts to display video, so one can check whether two video and two audio segments have been requested and delivered. If that has not happened within a reasonable timeout, one can conclude that the session never started. However, from the server side, one cannot directly see whether this is due to a user jumping between sources or an actual problem.
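This startup check could be sketched as follows; the two-segment thresholds come from the text above, while the timeout value is an assumption:

```python
def session_started(video_delivered, audio_delivered, elapsed_s,
                    timeout_s=10.0):
    """Classify a session's startup: True once enough segments have been
    delivered, False if the timeout has passed, None while still waiting."""
    if video_delivered >= 2 and audio_delivered >= 2:
        return True    # enough media delivered for playback to begin
    if elapsed_s > timeout_s:
        return False   # session never started within the timeout
    return None        # still within the startup window

session_started(2, 2, 3.0)   # started
session_started(1, 0, 15.0)  # failed to start
```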
Another desired measurement is whether an ad has been viewed or not. This is valuable information since the service provider is paid by the number of times ads are being watched, also known as impressions. The most common solution is to use a code in the client which detects views and reports to an impression server for billing. Since money is involved, this chain of ad beaconing needs to be certified and accepted by the industry. Money can be lost if an ad is either reported as viewed but has not been viewed or was viewed but the impression server never got the report – often caused by ad-blockers.
Here, the server side has a weakness, since it cannot see what is actually viewed. However, it can see whether the ad has been downloaded, and at a rate that would correspond to actual viewing. This can be done independently of any client integrations. Even though this may carry lower confidence than a client-based solution, its applicability is greater.
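A hedged sketch of such a heuristic: treat an ad as likely viewed when all of its segments were delivered and the download pace was not wildly faster than real time. The pace tolerance is an assumption:

```python
def ad_likely_viewed(segments_downloaded, segments_total,
                     download_span_s, ad_duration_s,
                     pace_tolerance=0.5):
    """Heuristic: an ad was likely viewed if every segment was delivered
    and the downloads were spread over at least a fraction of the ad's
    real-time duration. Downloading far faster than real time suggests
    prefetching rather than playback."""
    if segments_downloaded < segments_total:
        return False
    return download_span_s >= ad_duration_s * pace_tolerance

# A 30 s ad whose six segments arrived over 25 s looks like real viewing.
viewed = ad_likely_viewed(6, 6, download_span_s=25.0, ad_duration_s=30.0)
```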
Another advantage is that ad blockers, which either block the ad or block the beaconing signal, have no effect on a server-side solution. This is particularly true when the content is stitched on the server to look like one continuous stream.
Client-side monitoring can present more details, especially about user interaction and video playback. But there is plenty of information that can be extracted and interpreted on the server side. Furthermore, server-side monitoring works for all clients without any modification. It is on the server side that information should be gathered for decisions such as CDN selection.
Torbjörn Einarsson, Expert, Media protocols and codecs, Edgeware
Kalle Henriksson, Founder, Edgeware
If you would like to find out more about Edgeware’s solution for measuring and controlling video quality of experience (QoE) in multi-CDN environments, read more about our StreamPilot session control platform. Or contact us here to request a virtual live demo.