Application Layer (2/3)
Case Study 3) P2P Applications
- While the previous examples such as the Web, E-mail, and DNS all employ client-server architectures, P2P architecture operates differently.
- In a P2P architecture, there is minimal or no reliance on always-on infrastructure servers; instead, pairs of intermittently connected hosts, called peers, communicate directly with each other.
Distribution Time
- Let's consider an example of distributing a large file from a single server to a large number of hosts called peers.
- The distribution time is the time it takes to get a copy of the file to all N peers.
- In the client-server architecture, the distribution time is as shown in the figure above.
- In the P2P architecture, the distribution time is as shown in the figure above.
- A bit sent once by the server may not have to be sent by the server again, as the peers may redistribute the bit among themselves.
- The total upload capacity of the system as a whole is equal to the upload rate of the server plus the upload rates of each of the individual peers.
- As a result, the minimum distribution time in the client-server architecture grows linearly with the number of peers N, whereas in the P2P architecture it grows more slowly than linearly, because every new peer also adds upload capacity. This is what makes P2P self-scaling.
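The scaling argument above can be checked numerically. The sketch below assumes the standard textbook lower bounds, which the figure is presumed to show: D_cs >= max(N*F/u_s, F/d_min) for client-server, and D_p2p >= max(F/u_s, F/d_min, N*F/(u_s + sum of u_i)) for P2P. Here F is the file size, u_s the server upload rate, u_i the upload rate of peer i, and d_min the slowest peer download rate. The function names and the numbers are hypothetical, chosen only for illustration.

```python
def client_server_time(file_bits, server_up, peer_downs):
    """Lower bound on client-server distribution time, in seconds."""
    n = len(peer_downs)
    # The server must upload N copies; the slowest peer must download one copy.
    return max(n * file_bits / server_up, file_bits / min(peer_downs))


def p2p_time(file_bits, server_up, peer_ups, peer_downs):
    """Lower bound on P2P distribution time, in seconds."""
    n = len(peer_downs)
    total_up = server_up + sum(peer_ups)  # server plus all peers
    return max(
        file_bits / server_up,        # server must send each bit at least once
        file_bits / min(peer_downs),  # slowest peer must download one copy
        n * file_bits / total_up,     # N copies over the total upload capacity
    )


# Hypothetical numbers: 1 GB file, identical peers.
F = 8e9       # file size in bits
u_s = 100e6   # server upload rate, bits/s
u = 10e6      # upload rate of every peer, bits/s
d = 50e6      # download rate of every peer, bits/s

for n in (10, 100, 1000):
    cs = client_server_time(F, u_s, [d] * n)
    p2p = p2p_time(F, u_s, [u] * n, [d] * n)
    print(f"N={n:5d}  client-server={cs:9.1f} s  P2P={p2p:7.1f} s")
```

With identical peers, the P2P bound never exceeds F/u (800 s for these numbers), while the client-server bound keeps growing in proportion to N.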
BitTorrent
- BitTorrent is a popular P2P protocol for file distribution.
- In BitTorrent, the collection of all peers participating in the distribution of a particular file is called a Torrent.
- Peers in a torrent download equal-size chunks of the file from one another.
- While a peer downloads chunks, it also uploads chunks to other peers.
- Once a peer has acquired the entire file, it may selfishly leave the torrent, or altruistically remain in the torrent and continue to upload chunks to other peers.
- Also, any peer may leave the torrent at any time with only a subset of chunks, and later rejoin the torrent.
- Each torrent has an infrastructure node called a tracker.
- When a peer joins a torrent, it registers itself with the tracker and periodically informs the tracker that it is still in the torrent.
- The tracker keeps track of the peers that are participating in the torrent.
- When a new peer joins, the tracker randomly selects a subset of the participating peers and sends their IP addresses to the new peer; the peers it then connects to become its neighboring peers.
- At any given time, each peer will have a subset of chunks from the file, with different peers having different subsets.
- A peer requests the chunks that are rarest among its neighbors first (rarest first), so that the rarest chunks get redistributed more quickly.
- Each peer also sends chunks to the top-k neighbors that are currently sending it chunks at the highest rate, and re-evaluates this set every 10 seconds.
- Every 30 seconds, the peer also picks one additional neighbor at random and starts sending it chunks; this is called optimistically unchoking that peer.
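The two decisions a peer makes, which chunk to request and which neighbors to upload to, can be sketched as follows. This is a simplified illustration, not the actual BitTorrent implementation: the function names and data layout are hypothetical, k = 4 is an assumed default (the notes only say top-k), and the 10-second and 30-second timers are left out so that both choices are made in a single call.

```python
import random
from collections import Counter


def rarest_first(my_chunks, neighbor_chunks):
    """Return the missing chunk held by the fewest neighbors, or None.

    my_chunks: set of chunk indexes this peer already has.
    neighbor_chunks: dict mapping neighbor name -> set of chunk indexes.
    """
    counts = Counter()
    for chunks in neighbor_chunks.values():
        counts.update(chunks - my_chunks)  # count only chunks we still need
    if not counts:
        return None
    # Ties are broken by the lowest chunk index to keep the result deterministic.
    return min(counts, key=lambda c: (counts[c], c))


def choose_unchoked(rates, k=4, rng=random):
    """Pick the k neighbors sending to us fastest, plus one random other.

    rates: dict mapping neighbor name -> rate at which it sends us chunks.
    Returns (top_k_list, optimistic_peer_or_None).
    """
    ranked = sorted(rates, key=rates.get, reverse=True)
    top, others = ranked[:k], ranked[k:]
    optimistic = rng.choice(others) if others else None
    return top, optimistic


neighbors = {"A": {0, 1, 2}, "B": {1, 2}, "C": {1, 3}}
print(rarest_first({0}, neighbors))  # chunk 3 is held by only one neighbor

rates = {"A": 50, "B": 10, "C": 30, "D": 5}
print(choose_unchoked(rates, k=2))
```

Passing a seeded `random.Random` instance as `rng` makes the optimistic choice reproducible when experimenting.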
Case Study 4) Video Streaming and Content Distribution Networks (CDN)
- Video traffic is a major consumer of Internet bandwidth.
- For example, Netflix and YouTube consumed a whopping 37% and 16%, respectively, of residential ISP traffic in 2015.
- Different users have different capabilities, so it is important to consider heterogeneity.
- Scaling is also an important factor to consider since there are many users.
- The solution is a distributed, application-level infrastructure.
- An important characteristic of video is that it can be compressed, thereby trading off video quality with bit rate.
- The higher the bit rate, the better the image quality and the better the overall viewing experience.
- Spatial coding exploits redundancy within a frame: instead of sending N values of the same color, it sends only two values, the color value and the number of repetitions (N).
- Temporal coding exploits redundancy between frames: instead of sending the complete frame i+1, it sends only the differences from frame i.
- CBR (Constant Bit Rate) means the video encoding rate is fixed.
- VBR (Variable Bit Rate) means the video encoding rate changes as the amount of spatial and temporal coding changes.
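The two coding ideas above can be illustrated with a toy sketch. It treats a frame as a flat list of color values of equal length, which is an assumption made only for this example; real video codecs use far more elaborate techniques. All function names are hypothetical.

```python
def run_length_encode(pixels):
    """Spatial coding sketch: collapse runs of equal values into (value, count)."""
    runs = []
    for value in pixels:
        if runs and runs[-1][0] == value:
            runs[-1] = (value, runs[-1][1] + 1)  # extend the current run
        else:
            runs.append((value, 1))              # start a new run
    return runs


def frame_delta(prev_frame, next_frame):
    """Temporal coding sketch: keep only (index, new_value) for changed pixels."""
    return [
        (i, new)
        for i, (old, new) in enumerate(zip(prev_frame, next_frame))
        if old != new
    ]


def apply_delta(prev_frame, delta):
    """Rebuild frame i+1 from frame i and the stored differences."""
    frame = list(prev_frame)
    for i, value in delta:
        frame[i] = value
    return frame


frame_i = [7, 7, 7, 7, 2, 2]
frame_next = [7, 7, 9, 7, 2, 2]

print(run_length_encode(frame_i))        # runs of (value, count)
delta = frame_delta(frame_i, frame_next)
print(delta)                             # only the changed pixel is stored
print(apply_delta(frame_i, delta) == frame_next)
```

The more repetition within a frame and the fewer changes between frames, the less data needs to be sent, which is why the encoding rate varies under VBR.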
DASH
- In Dynamic Adaptive Streaming over HTTP (DASH), the video is divided into chunks, and each chunk is encoded into several versions, each with a different bit rate and, correspondingly, a different quality level.
- The client periodically measures the server-to-client bandwidth and, consulting the manifest, requests one chunk at a time.
- It chooses the maximum coding rate sustainable at the current bandwidth, and can choose different coding rates at different points in time, depending on the bandwidth available at that time.
- The client requests each chunk with an HTTP GET request message. The HTTP server holds a manifest file, which provides a URL for each version along with its bit rate.
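The client-side rate selection described above can be sketched as follows. The manifest is modeled as a plain list of dictionaries, which is a simplification for illustration (it is not the real manifest file format), and the URLs, bit rates, and function name are hypothetical. Falling back to the lowest version when nothing fits is also an assumption of this sketch.

```python
def pick_version(manifest, measured_bps):
    """Return the URL of the highest-bit-rate version that fits the bandwidth.

    Falls back to the lowest-bit-rate version if none fits.
    """
    affordable = [v for v in manifest if v["bitrate"] <= measured_bps]
    if affordable:
        best = max(affordable, key=lambda v: v["bitrate"])
    else:
        best = min(manifest, key=lambda v: v["bitrate"])
    return best["url"]


# Hypothetical manifest: one URL per version, with its bit rate in bits/s.
MANIFEST = [
    {"bitrate": 500_000, "url": "http://video.example/movie/500k"},
    {"bitrate": 1_500_000, "url": "http://video.example/movie/1500k"},
    {"bitrate": 4_000_000, "url": "http://video.example/movie/4000k"},
]

# One bandwidth measurement per chunk; the chosen version changes over time.
for chunk_no, bandwidth in enumerate([5_000_000, 2_000_000, 300_000]):
    print(chunk_no, pick_version(MANIFEST, bandwidth))
```

A real client would then issue an HTTP GET for the selected URL and use the observed download time of that chunk as its next bandwidth measurement.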
CDN
- While building a single, large mega-server is an option to handle scalability, it has downsides.
- First, if the client is far from the data center, the end-to-end throughput may fall below the video consumption rate, resulting in annoying freezing delays for the user.
- Second, a popular video will likely be sent many times over the same communication links.
- Lastly, a single data center is a single point of failure: if it goes down, no video streams can be distributed.
- In order to meet the challenge of distributing massive amounts of video data to users distributed around the world, almost all major video-streaming companies make use of Content Distribution Networks (CDNs).
- A CDN manages servers in multiple geographically distributed locations, stores copies of videos in its servers, and attempts to direct each user request to a CDN location that will provide the best user experience.
- Two different server placement philosophies are Enter Deep and Bring Home.
- Enter Deep means entering deep into the access networks of Internet Service Providers by deploying server clusters in access ISPs all over the world.
- Bring Home means bringing the ISPs home by building large clusters at a smaller number of sites, typically placed at Internet Exchange Points (IXPs).
- When a browser in a user's host is instructed to retrieve a specific video, the CDN must intercept the request so that it can determine a suitable CDN server cluster for that client at that time, and redirect the client's request to a server in that cluster.
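The notes do not say how a CDN decides which cluster is suitable. As one possible strategy, assumed here purely for illustration, the sketch below picks the reachable cluster with the lowest measured round-trip time. The cluster names, the measurements, and the function name are all hypothetical.

```python
def pick_cluster(rtt_ms):
    """Return the name of the reachable cluster with the lowest RTT.

    rtt_ms maps cluster name -> measured round-trip time in milliseconds,
    or None when the cluster did not respond.
    """
    reachable = {name: rtt for name, rtt in rtt_ms.items() if rtt is not None}
    if not reachable:
        raise ValueError("no reachable CDN cluster")
    return min(reachable, key=reachable.get)


# Hypothetical measurements for one client at one point in time.
measurements = {"seoul": 12.0, "tokyo": 35.5, "frankfurt": None}
print(pick_cluster(measurements))
```

Because the measurements change over time, the same client may be redirected to a different cluster on a later request.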