In this section we're going to move away just a little bit from our focus on application-level protocols and look instead at application-level distributed infrastructure for implementing a service. The service we're going to take a look at is video streaming, an application we all know and love and probably use a lot. There are some amazing examples of very sophisticated distributed infrastructure implementing video streaming services, so there's a lot to learn. We're going to start off by looking at video as an application. Then, since the internet can introduce variable delays between a sender and a receiver, we'll take a look at the client-side techniques of buffering and adaptive playout that mitigate the effects of those variable delays. Next we'll look at something called DASH, Dynamic Adaptive Streaming over HTTP, and see how it can be used to accommodate changes in the available capacity, the available bandwidth, between a source and a destination. Finally we'll look at an example of DASH in use, looking at CDNs, content distribution networks, and at Netflix as an application. There's a lot to see and a lot to learn here, so let's get going.
Streaming video traffic is a major consumer of internet bandwidth: by some estimates, 80 percent of residential ISP traffic is streaming video. When we think about streaming video, a couple of challenges become apparent. As always, there's the issue of scale: we want to be able to reach tens or hundreds of millions of users. The second challenge is heterogeneity: some users are mobile, some are fixed; some are on high-speed broadband connections, while others are on bandwidth-poor connections. How are we going to cope with that kind of heterogeneity? As we'll see, the answer is a very sophisticated application-level distributed infrastructure. Let's start our discussion of streaming video by looking at the structure of video itself. A video is just a sequence of encoded images, sometimes called frames, taken at, say, 24 or 30 frames per second, and each image is a matrix of pixels. Those pixels are usually encoded to reduce the size of the images, and therefore the size of the video, by exploiting image redundancy.
There's spatial coding, which exploits redundancy within an image. For example, in this image here, rather than storing N repeated purple sky pixel values, we could store the single pixel value (purple) together with the number of repeated instances: two values to encode that part of the image rather than N. That's coding within a frame. We can also code between frames: if the image doesn't change much from one frame to the next, or changes just a bit, we can send only the changes between frames rather than the entire new frame.
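To make that within-frame idea concrete, here's a minimal run-length-encoding sketch in Python; the pixel values and the row itself are made up for illustration, and real video codecs are far more sophisticated, but the redundancy-exploiting principle is the same: store a repeated value once, together with its repeat count.

```python
def run_length_encode(pixels):
    """Collapse runs of identical pixel values into (value, count) pairs."""
    encoded = []
    for p in pixels:
        if encoded and encoded[-1][0] == p:
            encoded[-1][1] += 1      # extend the current run
        else:
            encoded.append([p, 1])   # start a new run
    return encoded

# A toy row of sky pixels: long runs of "purple" compress to just two values each.
row = ["purple"] * 12 + ["white"] * 3 + ["purple"] * 5
print(run_length_encode(row))        # [['purple', 12], ['white', 3], ['purple', 5]]
```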
There are two broad classes of video encoding methods: constant bit rate (CBR) video and variable bit rate (VBR) video. As the names suggest, with constant bit rate encoding the video encoding rate is fixed over time, while with variable bit rate encoding the rate changes over time as the amount of spatial and temporal redundancy changes. We see here a number of encoding standards, with encoding rates ranging from MPEG-1 at 1.5 megabits per second to MPEG-4 (which is what we're recording these videos in, for example), which can run up to 10 megabits per second and higher.
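As a rough back-of-the-envelope illustration (the numbers here are my own, not from the lecture), the encoding rate translates directly into the bandwidth and storage a CBR video needs:

```python
# Approximate size of a constant-bit-rate video: size = encoding rate x duration.
rate_bps = 5_000_000                      # assume a 5 Mbps encoding rate
duration_s = 2 * 60 * 60                  # a two-hour video
size_gigabytes = rate_bps * duration_s / 8 / 1e9
print(f"about {size_gigabytes:.1f} GB")   # about 4.5 GB
```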
Now when we think about the technical challenges associated with streaming stored video, there are two sources of complexity. The first is that the amount of available bandwidth between client and server changes over time: there could be congestion in the home network, in the access network, in the core network, or within the video server complex itself, so the available bandwidth will vary over time and we'll need to adapt to it. Second, as we've seen, the delays between a source and a destination in the internet, between a client and a server, also change over time. It's not as if there's a circuit with a fixed delay and guaranteed bandwidth from source to destination; this is a packet-switched network, so we're going to see variable delays, and we'll need to adapt to those at the client as well. Let's start by taking the big-picture view and looking at the three steps involved in streaming stored video: first, the video is recorded; second, the video is sent by the server; and third, the video is played out at the client.
We'll discuss these steps in the context of the diagram here. On the x-axis we have time moving forward, and on the y-axis we have the cumulative amount of data that has been recorded, sent, or played out. In the first black staircase curve we show video being recorded; let's assume for simplicity that it's constant bit rate video. We see more and more video being recorded over time, with the cumulative amount of data going up at a constant rate, say with each jump representing a new frame's worth of recorded data. This video is stored and then eventually transmitted by a server. In this example the video is being transmitted at the same rate at which it was recorded: if it's an MPEG video recorded at five megabits per second, then it's being sent at five megabits per second. It could be sent faster or even slower, but let's assume for simplicity that it's sent at the recorded rate. After some network delay, shown here, video playout begins at the receiver, again at the same rate at which it was recorded. And now we see why this is called streaming video: if we look at this point in time, we can see that the client is playing out frame 2 while the server is sending frame 10. Rather than downloading the entire video before playing it out, the client begins playout while the server is still sending, that is, still streaming, later frames of the video. You might want to think about the advantages of streaming a video rather than downloading it in its entirety and then playing it out: with streaming, the client can begin playout earlier, and if the client doesn't watch the whole video, we're not wasting a lot of bandwidth transmitting portions of the video that aren't viewed. Now, over on the client side, we're going to have to deal with a constraint known as the continuous playout constraint: the timing of playout at the client has to match the timing with which the video was originally recorded. You're sitting at the client, you're playing out the video, everyone's engaged, and it's time to play out the next piece of video; that piece of video had better have arrived from the server in order to be played out. If not, we're going to see that spinning dial we've all seen at one time or another. The source of the challenge here is the variable delay between the video server and the client, and to mitigate that variability we're going to use buffering. There are other challenges as well, like how to deal with client-side operations such as fast forward and rewind,
and the fact that if packets are lost, they'll be retransmitted (if we're streaming over TCP), resulting in additional delay. So a fundamental challenge we'll need to address is that of variable network delays; let's see how this is done. Let's return to this figure and again assume constant bit rate video being transmitted by the server at a constant rate, as shown in the red staircase curve we've seen before. The difference between this diagram and the previous one is that the network delay experienced by each video frame is now variable. Remember, in the previous diagram with a fixed network delay, the black staircase curve had nice, even steps because the network delay was assumed to be constant. Here the steps are no longer even: sometimes there's a longer horizontal step, like here and here, when the network delay of a frame is significantly longer than that of the previous frame, and sometimes the horizontal steps are quite short, like here and here, when the network delay of a frame is significantly shorter than that of the previous frame. Because of the variable network delay, frames are no longer received with a timing
that matches the timing needed for playout. To accommodate this so-called jitter in network delay, a client uses a buffer to smooth out the delay, as shown in the blue playout curve here. The client will now also wait before beginning playout, but once playout begins, the client plays out video with the timing shown in blue, which matches the original timing shown in red here and in black in our earlier figure. How long should the client wait? Well, that's the tricky, million-dollar question. If the initial client playout delay is too short and frame delays are highly variable, a frame may not arrive in time for its playout; that's called starvation, and it gives rise to that spinning wheel we're used to seeing when videos freeze. If the initial client playout delay is too long, then the user has to wait longer before video playout begins, and users hate to wait.
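To get a feel for that tradeoff, here's a small simulation sketch; the frame rate, the delay distribution, and the candidate initial delays are all invented for illustration. Frames leave the sender at a constant rate, each one experiences a variable network delay, and the client counts how many frames miss their playout deadline for a given initial playout delay: a longer initial delay means fewer freezes, but more waiting before playout begins.

```python
import random

def late_frames(num_frames=1000, frame_period=1/30, initial_delay=0.5, seed=42):
    """Count frames that arrive after their playout deadline for one initial delay."""
    rng = random.Random(seed)
    misses = 0
    for i in range(num_frames):
        send_time = i * frame_period                      # constant-bit-rate sender
        network_delay = 0.05 + rng.expovariate(1 / 0.1)   # variable (jittery) delay
        arrival_time = send_time + network_delay
        deadline = initial_delay + i * frame_period       # continuous playout constraint
        if arrival_time > deadline:
            misses += 1
    return misses

for d in (0.1, 0.3, 0.5, 1.0):
    print(f"initial playout delay {d:.1f}s -> {late_frames(initial_delay=d)} late frames")
```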
Having taken a look at client-side buffering and playout, let's now turn our attention to the challenge of varying amounts of available bandwidth between client and server. Buffering is great for absorbing variable delay, but what happens when the amount of available bandwidth between the client and the server just isn't enough to support the rate at which video is being transmitted from server to client? In this case we're going to need another solution, and that's where DASH, Dynamic Adaptive Streaming over HTTP, comes in. Here's how DASH works; let's start on the server side. The video that's going to be streamed is divided into chunks, and each chunk is encoded at several different encoding rates, at different levels of quality, and stored in separate files. Larger files are associated with chunks of video that are encoded at higher quality, so it takes a higher amount of bandwidth, or a longer time, to download them. These different chunks, each representing a different encoding, are stored at different nodes within a content distribution network. And finally, there's a manifest file: the manifest file tells the client where to pick up each chunk at each particular level of encoding, and here are the server nodes, the CDN nodes, that you can go to.
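Real DASH manifests are XML files (the "media presentation description", or MPD), but just to give the flavor of the information involved, here's a hypothetical, much-simplified manifest written as a Python dictionary; the encoding rates, resolutions, and CDN URLs are all invented:

```python
# A hypothetical, much-simplified manifest: every chunk is available at each
# encoding rate, and the URL template says which CDN node serves which encoding.
manifest = {
    "chunk_duration_s": 4,
    "encodings": [
        {"rate_kbps": 500,  "resolution": "480p",
         "url_template": "http://cdn1.example.com/video/500k/chunk{n}.mp4"},
        {"rate_kbps": 2000, "resolution": "720p",
         "url_template": "http://cdn1.example.com/video/2000k/chunk{n}.mp4"},
        {"rate_kbps": 5000, "resolution": "1080p",
         "url_template": "http://cdn2.example.com/video/5000k/chunk{n}.mp4"},
    ],
}
```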
On the client side, the client does the following: it periodically estimates the amount of server-to-client bandwidth that's available and asks itself, can this path support even more traffic? Can I request the next chunk at a higher fidelity? When the client needs a chunk, it consults the manifest and requests video one chunk at a time, choosing the maximum coding rate it estimates to be sustainable given the currently available bandwidth. It can choose different coding rates at different points in time, depending on the amount of available bandwidth at that time, and it can choose which server to request a chunk from. So we see that in DASH, the intelligence, the control, is really at the client side. The client is given information, the manifest file, that lists its options; the client then monitors performance to determine the encoding rate and the CDN node from which it will make its next request. That puts a lot of intelligence at the client.
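Here's a sketch of that client-side logic, using the hypothetical manifest above: the client re-estimates the available bandwidth from the previous chunk's download and then picks the highest encoding rate it believes the path can sustain. Real players use considerably more careful adaptation algorithms; the safety factor, the toy "network", and the 3 Mbps link are assumptions made purely for illustration.

```python
def choose_encoding(encodings, est_kbps, safety_factor=0.8):
    """Pick the highest encoding rate the estimated bandwidth can sustain."""
    affordable = [e for e in sorted(encodings, key=lambda e: e["rate_kbps"])
                  if e["rate_kbps"] <= est_kbps * safety_factor]
    return affordable[-1] if affordable else min(encodings, key=lambda e: e["rate_kbps"])

def stream(manifest, num_chunks, download):
    """download(url) -> (num_bytes, seconds). Returns the rate chosen for each chunk."""
    est_kbps = 1000.0                                   # conservative initial guess
    chosen = []
    for n in range(num_chunks):
        enc = choose_encoding(manifest["encodings"], est_kbps)
        num_bytes, secs = download(enc["url_template"].format(n=n))
        est_kbps = num_bytes * 8 / 1000 / secs          # re-estimate from the last chunk
        chosen.append(enc["rate_kbps"])
    return chosen

# Toy "network": pretend the server-to-client path can sustain 3 Mbps.
def fake_download(url, link_kbps=3000, chunk_s=4):
    rate_kbps = int(url.split("/")[-2].rstrip("k"))     # e.g. ".../2000k/chunk3.mp4"
    num_bytes = rate_kbps * 1000 / 8 * chunk_s
    return num_bytes, num_bytes * 8 / 1000 / link_kbps

print(stream(manifest, num_chunks=5, download=fake_download))   # [500, 2000, 2000, ...]
```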
So let's step back and ask ourselves a fundamental question: how do we want to structure an application that's going to stream videos to potentially thousands or hundreds of thousands of simultaneous clients, chosen from a catalog that could have millions of videos in it? What are the options for doing this? We might start off by thinking about a mega-server: one massive server that has all of the videos and handles all of the requests coming in from all of the clients. What are the problems with that? Well, hopefully it's obvious to you: it's a single point of failure; there's the potential for congestion in the network and in the video server itself; and finally, there will be long delays between the video server's location and some points on the planet. In short, this solution just doesn't scale. The second option, the approach that's adopted in practice, is to build a large distributed infrastructure to store and serve copies of video chunks at many geographically distributed sites. This is an example of an application-layer content distribution network, a CDN. The servers in this network are loaded with content to serve, and either a manifest file or a CDN DNS server will point a client to the content the client has requested. There are two approaches taken in practice. In the enter-deep approach, the CDN servers are pushed deep into many access networks, at the internet's edge. In 2015, a CDN company known as Akamai, headquartered in Cambridge, had a quarter of a million CDN servers deployed in more than 120 countries, and I'll add that in 2018 one of our faculty members here at UMass, Ramesh Sitaraman, was part of the team that received an ACM SIGCOMM Networking Systems Award for building Akamai's content delivery network.
The second approach is known as the bring-home approach; here, a smaller number of larger server clusters are located in POPs, the points of presence we learned about earlier when we studied Section 1.5. Now let's walk through an example of streaming a video via a CDN. Here's a network setting: we see Netflix central here, and we see copies of content, say including copies of Mad Men, distributed around its CDN nodes. And here's me sitting at home, wanting to watch a particular episode of Mad Men. My Netflix client app sends a request to Netflix central saying, hey, Jim wants to watch this episode. Netflix central then returns a manifest file listing the video chunks and their locations, as we saw earlier. My Netflix client app might then begin retrieving the video from this nearby CDN server, performing buffering and client playout as we saw earlier, and if that path happens to get congested, my Netflix client might choose to get the next chunk from this other server here.
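A minimal sketch of that server-switching idea, with invented server names (the real Netflix client logic is proprietary and considerably more sophisticated): try the CDN servers listed in the manifest in order of preference, and move on to the next one if a request fails or takes too long.

```python
import time
import urllib.request

def fetch_with_fallback(chunk_path, cdn_servers, max_seconds=2.0):
    """Try each CDN server in order; fall back if a fetch fails or is too slow."""
    for server in cdn_servers:
        url = f"http://{server}/{chunk_path}"
        try:
            start = time.time()
            data = urllib.request.urlopen(url, timeout=max_seconds).read()
            if time.time() - start <= max_seconds:
                return data
        except OSError:
            continue          # server congested or unreachable: try the next one
    raise RuntimeError("no CDN server could deliver the chunk in time")

# Hypothetical CDN replicas from the manifest, nearest first.
servers = ["cdn-boston.example.net", "cdn-nyc.example.net", "cdn-chicago.example.net"]
# chunk = fetch_with_fallback("madmen/s01e01/2000k/chunk42.mp4", servers)
```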
Now, if you think about Netflix, it's not an ISP; it's about content, not about the network. But it uses the networks provided by ISPs to deliver content over those ISPs' networks at the application layer. For this reason, a service like Netflix is sometimes called an over-the-top, or OTT, service, since it's an application-level service riding on top of the IP infrastructure. You might recall that in our very first class, when we asked ourselves the question "what is the internet?", we answered it in two ways: we gave a nuts-and-bolts answer describing the pieces of the internet, but we also answered by saying that the internet is a service infrastructure on which amazing applications are being built, and that's precisely the point of view taken here with OTT services. Well, I hope you found this section interesting. We stepped away just a little bit from an emphasis on protocols to look at application structure itself, particularly the case of a large-scale, distributed, application-level infrastructure for streaming stored video. We started off by looking at the characteristics of streaming video; then we looked at buffering techniques and playout strategies at the client; then we looked at chunking, which is where we encountered Dynamic Adaptive Streaming over HTTP, DASH; and then we wrapped up by taking a quick look at content distribution networks and the example of streaming stored video over a content distribution network.