Content Distribution

Unit 5: Content Distribution
1. Introduction
2. Architectures
   • Client-server
   • Web proxies
   • Content replication
3. Caching and load balancing
4. A case study
   • The Akamai network

Network Architectures for Content Distribution

1. Introduction

• Objective:
  • Understand the mechanisms and architectures for the efficient and scalable distribution of content on the Internet.
• To that end:
  • We will review the different content distribution architectures on the Internet, analyzing their advantages and disadvantages.
  • We will focus on the concept of the content distribution network (CDN):
    • Definition
    • Distribution, redirection, and management mechanisms
  • We will examine a success story: the Akamai network.

Single-site Single-Server

Advantages:
• Reduced HW/SW cost

Disadvantages:
• Server failure
• HW/SW maintenance while the service is running
• Users experience unequal access delays
• Networking/processing scalability problems

Single-site Multiple-Servers: Server Farm

Advantages:
• Resource load balancing
• Improved error resilience
• HW/SW upgrades without service disruption

Disadvantages:
• Users experience unequal access delays
• Networking scalability problems (?)
• Increased HW/SW cost

Multiple-Sites Single-Server: Mirrors

Advantages:
• Content is closer to users → fast response
• Tolerates network failure at the origin server site

Disadvantages:
• Content must be kept up to date
• The source server site requires redundant network access services
• Increased HW/SW cost (?)

Client-side devices: Web Proxies

• Web/Content Proxies: client-side agents accessing web content
• Caching content:
  • Saves network resources
  • Reduces server load
  • Speeds up web responses

Web Proxies are Intermediaries

• Proxies play both roles
  • A server to the client
  • A client to the server

[Figure: a proxy placed in front of the servers www.google.com and www.cnn.com]

Proxy Caching

• Client #1 requests http://www.foo.com/fun.jpg
  • Client sends "GET fun.jpg" to the proxy
  • Proxy sends "GET fun.jpg" to the server
  • Server sends the response to the proxy
  • Proxy stores the response and forwards it to the client
• Client #2 requests http://www.foo.com/fun.jpg
  • Client sends "GET fun.jpg" to the proxy
  • Proxy sends the response to the client from the cache
• Benefits
  • Faster response time to the clients
  • Lower load on the Web server
  • Reduced bandwidth consumption inside the network

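The two-client exchange on the proxy caching slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `CachingProxy` and `fake_origin` are hypothetical names, the cache is an unbounded in-memory dictionary, and cache expiry is ignored.

```python
from typing import Callable, Dict


class CachingProxy:
    """Toy caching proxy: repeated URLs are served from an in-memory cache."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin      # hypothetical origin fetch function
        self._cache: Dict[str, bytes] = {}
        self.origin_requests = 0             # requests that reached the origin

    def get(self, url: str) -> bytes:
        if url in self._cache:               # cache hit: answer locally
            return self._cache[url]
        self.origin_requests += 1            # cache miss: forward to the origin
        body = self._fetch(url)
        self._cache[url] = body              # store the response before replying
        return body


def fake_origin(url: str) -> bytes:
    # Stand-in for the real Web server (assumption for this sketch).
    return f"contents of {url}".encode()


proxy = CachingProxy(fake_origin)
first = proxy.get("http://www.foo.com/fun.jpg")    # client #1: miss, goes to origin
second = proxy.get("http://www.foo.com/fun.jpg")   # client #2: hit, served from cache
```

Only the first request reaches the origin, which is where the lower server load and reduced bandwidth come from.
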
Getting Requests to the Proxy

• Explicit configuration
  • Browser configured to use a proxy
  • Directs all requests through the proxy
  • Problem → requires user action
• Transparent proxy (or "interception proxy")
  • Proxy lies in the path from the client to the servers
  • Proxy intercepts packets en route to the server
  • ... and interposes itself in the data transfer
  • Benefit → does not require user action

Challenges of Transparent Proxies

• Must ensure all packets pass by the proxy
  • By placing it at the only access point to the Internet
  • E.g., at the border router of a campus or company
• Overhead of reconstructing the requests
  • Must intercept the packets as they fly by
  • ... and reconstruct them into the ordered byte stream
• May be viewed as a violation of user privacy
  • The user does not know the proxy lies in the path
  • Proxy may be keeping logs of the user's requests

Other Functions of Web Proxies

• Anonymization
  • Server sees requests coming from the proxy address
  • ... rather than the individual user IP addresses
• Transcoding
  • Converting data from one form to another
  • E.g., reducing the size of images for cell-phone browsers
• Prefetching
  • Requesting content before the user asks for it
• Filtering
  • Blocking access to sites, based on URL or content

Content Providers/Consumers

• Content providers/consumers are interested in being able to offer/access content
  • Efficiently
  • Reliably
  • Securely
  • Inexpensively
• Providers deploy server farms and replicas...
• Consumers deploy web proxies...
• But there is an alternative solution...

3rd Parties: Content Delivery Networks

Content Distribution Networks (CDN)

• Business model:
  • A content provider such as www.cnn.com or Yahoo pays a CDN company (such as Akamai) to get its content to the requesting users with short delays.
• A CDN provides a mechanism for:
  • Replicating content on multiple servers in the Internet
  • Providing clients with a means to determine the servers that can deliver the content fastest

CDN Terminology

• Content
  • Any publicly accessible combination of text, images, applets, frames, MP3, video, flash, virtual reality objects, etc.
• Content Provider
  • Any individual, organization, or company that has content that it wishes to make available to users.
• Origin Server
  • Content provider's server, where the content is first uploaded.
• Surrogate Server (sometimes called edge server)
  • Content distributor's server, where the replicated content is kept.
• Full/Partial Site Delivery
  • Full: all the contents are delivered by the CDN (including HTML, images, and other objects).
  • Partial: only images, streaming media, and other bandwidth-intensive objects are delivered by the CDN.

CDNs and Content

• Content suitable for CDNs
  • Images
  • Streaming media
  • Java applets
  • Static information
• Content not suitable
  • Dynamic information
  • Personalized information

CDN Players

• Content Provider: Yahoo, MSNBC, CNN
• Content Distributor: Akamai, Digital Island, AT&T
• H/W and S/W Vendor: Cisco, Lucent, Inktomi, CacheFlow
• Hosting Provider: Exodus

CDN: Distribution

• The CDN company places hundreds of CDN servers in Internet hosting centers.
• The CDN replicates its customers' content in the CDN servers. Whenever a customer updates its content (e.g., a web page), the CDN redistributes the fresh content to the CDN servers.
• The CDN provides a mechanism so that when a user requests content, the content is provided by the CDN server that can most rapidly deliver it to the user.
  • This can be the closest CDN server to the user (perhaps in the same ISP as the user) or a CDN server with a congestion-free path to the user.

CDN: Distribution

[Figure: the origin server in North America pushes content to a CDN distribution node, which pushes content to CDN servers in South America, Asia, and Europe]

CDN: Functional Components

• Distribution Service
• Redirection Service
• Accounting and Billing system

CDN: Distribution Service

• The content provider determines which of its objects it wants the CDN to distribute.
• The content provider tags and then pushes this content to a CDN node, which in turn replicates and pushes the content to all its CDN servers.
• When a browser in a user's host is instructed to retrieve a specific object (specified using a URL), how does the browser determine whether it should retrieve the object from the origin server or from one of the CDN servers?
• As an example, suppose the hostname of the content provider is www.cnn.com
• Suppose the hostname of the CDN company is www.akamai.com

CDN: Redirection

• Users get an HTML document from www.cnn.com; this could be index.html.
• The file index.html uses a modified URL for content that has been replicated.
  • Example: if the GIF files are what has been replicated, then
    <img src="http://cnn.com/af/x.gif"> may be modified as follows:
    <img src="http://a73.g.akamaitech.net/7/23/cnn.com/af/x.gif">
• The browser needs to resolve the aXYZ.g.akamaitech.net hostname for replicated content.
• DNS is configured so that all queries about g.akamaitech.net are sent to its authoritative DNS server, referred to as an Akamai DNS server.

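The URL modification in the redirection example can be sketched as a small rewriting function. This is a minimal illustration, not Akamai's tool: the function name `akamaize`, the suffix filter, and the fixed prefix (copied from the slide's example URL) are assumptions.

```python
import re

# Prefix copied from the slide's example; a real CDN would generate it per object.
CDN_PREFIX = "http://a73.g.akamaitech.net/7/23/"


def akamaize(html: str, suffixes=(".gif",)) -> str:
    """Rewrite src URLs of replicated objects so they point at the CDN."""

    def rewrite(match: "re.Match") -> str:
        url = match.group(2)
        # Only absolute http:// URLs of the replicated object types are touched.
        if url.startswith("http://") and url.endswith(tuple(suffixes)):
            url = CDN_PREFIX + url[len("http://"):]
        return f"{match.group(1)}{url}{match.group(3)}"

    return re.sub(r'(src=")([^"]+)(")', rewrite, html)


page = '<img src="http://cnn.com/af/x.gif"><img src="http://cnn.com/story.html">'
rewritten = akamaize(page)
```

The GIF reference is redirected to the CDN hostname while the HTML reference still points at the origin server, which matches partial site delivery.
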
CDN: Redirection

• When the Akamai DNS server receives the query, it extracts the IP address of the requesting browser (in practice, what it sees is the address of the local DNS server that forwards the query).
• Based on that IP address and on information that it has about the Internet (called a map), it returns the IP address of an Akamai server (surrogate server) chosen according to a policy, e.g., select the server that is the fewest hops away.
• The Akamai DNS server IP address is now in the cache of the local DNS server.
  • This implies that it is not always necessary to go to the root DNS server.
  • This is done for performance reasons.
• The TTL associated with the IP address of an Akamai server (surrogate) is relatively small.
• Akamai content distribution servers are caches.

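The policy "select the server that is the fewest hops away" can be sketched as a lookup in a map keyed by client prefix. Everything concrete here is hypothetical: the prefixes, the surrogate addresses, the hop counts, the default server, and the 30-second TTL are made up for illustration.

```python
import ipaddress

# Hypothetical "map": client prefix -> hop count to each surrogate server.
NETWORK_MAP = {
    "193.147.0.0/16": {"10.0.0.1": 3, "10.0.0.2": 9},
    "128.32.0.0/16": {"10.0.0.1": 12, "10.0.0.2": 2},
}
DEFAULT_SURROGATE = "10.0.0.1"   # used when the map has no entry for the client
TTL_SECONDS = 30                 # kept small so clients re-resolve often


def resolve(client_ip: str):
    """Return (surrogate IP, TTL) for the querying address, fewest hops first."""
    addr = ipaddress.ip_address(client_ip)
    for prefix, hops in NETWORK_MAP.items():
        if addr in ipaddress.ip_network(prefix):
            return min(hops, key=hops.get), TTL_SECONDS
    return DEFAULT_SURROGATE, TTL_SECONDS
```

A short TTL means a poor choice is corrected quickly, at the cost of more DNS queries.
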
CDN Redirection

[Figure: the client issues GET www.cnn.com/index.html to CNN.com and receives index.html, which contains <img src="http://www.cdn.com/cnn/images/1.gif">; the client's local DNS server sends the DNS query "cdn.com?" to the authoritative DNS server for cdn.com, and the answer 64.236.24.28 is returned to the client]

CDN Redirection

• What if the content is not there?
  • If the requested content is not found, the surrogate asks other surrogates within a specified region for it.
  • If the requested content is still not found, or is stale, a request is made to the original web site.

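The lookup order on a surrogate miss (local cache, then regional peers, then the origin) can be sketched as below. The function name `serve`, the use of plain dictionaries as caches, and the `fetch_origin` callback are assumptions; staleness checks are left out to keep the sketch short.

```python
def serve(url, local_cache, peer_caches, fetch_origin):
    """Return (body, source) following the lookup order described above."""
    if url in local_cache:
        return local_cache[url], "local"
    for peer in peer_caches:             # other surrogates in the same region
        if url in peer:
            local_cache[url] = peer[url] # keep a copy for later requests
            return peer[url], "peer"
    body = fetch_origin(url)             # last resort: the original web site
    local_cache[url] = body
    return body, "origin"


local = {}
peers = [{"/a.gif": b"A"}]
origin = lambda url: b"origin:" + url.encode()

r1 = serve("/a.gif", local, peers, origin)   # found on a peer
r2 = serve("/a.gif", local, peers, origin)   # now cached locally
r3 = serve("/b.gif", local, peers, origin)   # fetched from the origin
```
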
CDN Selection

• The tricky issue is selecting which local content server to use for a particular request
  • Want to spread load evenly
  • Want minimal impact if a server is added or removed
• In Akamai, each surrogate server sends measurement results to the Network Operations Command Center (NOCC).
  • Measurement results include the number of active TCP connections, HTTP request arrival rate, bandwidth availability, etc.
  • This information is used by the Akamai DNS server.

Accounting Mechanism

• Accounting mechanisms collect and track information related to request routing, distribution, and delivery.
• Information is gathered in real time and put into log files for each CDN component.
• These logs are sent to the Network Operations Command Center (NOCC).

How well do CDNs work?

[Figure: topology of hosting centers, backbone ISPs, exchange points (IX), and edge ISPs, showing the OS in a hosting center, CSs in hosting centers and ISPs, and sites with S and C nodes]

• Recall that the bottleneck links are at the edges.
• Even if CSs are pushed towards the edge, they are still behind the bottleneck link!

Reduced latency improves TCP performance

• DNS round trip
• TCP handshake (2 round trips)
• Slow-start
  • ~8 round trips to fill a DSL pipe
  • 128 Kbytes in total
  • Compare to 56 Kbytes for the cnn.com home page: the download finishes before slow-start completes
• Total: 11 round trips
• UMH - Berkeley University RTT is about 200 ms (RTT measured last night)
• UMH - nearest CDN (Akamai) node RTT ~ 20 ms
• One order of magnitude improvement in RTT!
• 11 RTTs amount to:
  • 200 x 11 = 2200 ms without a CDN
  • 20 x 11 = 220 ms with CDN support, saving about 1980 ms in download response time
• Certainly noticeable

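The round-trip arithmetic on this slide can be checked with a short calculation. The round-trip counts and RTT values come from the slide; the 512-byte segment size and an initial window of one segment are assumptions, chosen because they reproduce the slide's figure of roughly 128 Kbytes after 8 rounds.

```python
def round_trips(dns=1, handshake=2, slow_start=8):
    """Round trips per download, using the counts given on the slide."""
    return dns + handshake + slow_start


def download_time_ms(rtt_ms, trips=None):
    """Latency-bound download time: every round trip costs one RTT."""
    trips = round_trips() if trips is None else trips
    return rtt_ms * trips


def slow_start_bytes(rounds, mss=512):
    """Bytes sent when the window doubles each round: 1, 2, 4, ... segments.

    The 512-byte segment size is an assumption, not stated on the slide.
    """
    return sum(2 ** i for i in range(rounds)) * mss


without_cdn = download_time_ms(200)   # UMH to Berkeley, 200 ms RTT
with_cdn = download_time_ms(20)       # UMH to nearest CDN node, 20 ms RTT
saving = without_cdn - with_cdn
```

Because the transfer is dominated by round trips rather than bandwidth, the download time scales almost linearly with the RTT.
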
3. Caching and load balancing

Some Interesting Observations

• The top 1% of all documents account for 20% to 35% of proxy requests
• The top 10% account for 45% to 55% of requests
• It takes 25% to 40% of all documents to account for 70% of requests
• It takes 70% to 80% of all documents to account for 90% of requests

Web Caching

• As an example, we use the web to illustrate caching and other related issues

[Figure: browsers send requests to a Web proxy with a cache; the proxy forwards requests to Web servers and returns their responses to the browsers]

Web Browser Caching

• Web browsers have their own caches. When a page is downloaded from a site, the web page is put into the browser cache.
• This is especially useful when the back button is pressed.
• If a new copy is needed, a "refresh" can be done.
• No page stays permanently in the cache. There is limited room.
  • A replacement algorithm is needed to determine which cached page should be purged.

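The slide does not name a replacement algorithm. As one common choice, a least-recently-used (LRU) policy can be sketched as follows; the class name `LRUCache` and the capacity measured in pages (rather than bytes) are assumptions for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Bounded page cache that purges the least recently used page."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._pages = OrderedDict()       # oldest entry first

    def get(self, url):
        if url not in self._pages:
            return None                   # miss: caller must download the page
        self._pages.move_to_end(url)      # mark as most recently used
        return self._pages[url]

    def put(self, url, page):
        self._pages[url] = page
        self._pages.move_to_end(url)
        if len(self._pages) > self.capacity:
            self._pages.popitem(last=False)   # purge least recently used


cache = LRUCache(capacity=2)
cache.put("a.html", "page A")
cache.put("b.html", "page B")
cache.get("a.html")               # touch A so that B becomes the oldest
cache.put("c.html", "page C")     # cache is full: B is purged
```
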
Web Browser Caching

• Client pull
  • The server provides the content with instructions on when the client should ask for a refreshed copy of the content, or on whether the content should be cached.
• Server push
  • The server transmits page information to the screen.
  • The browser application displays the information and leaves the connection to the server open.
  • With an open connection, the server can continue to push updated pages for the screen to display on an ongoing basis. The connection can be closed by closing the page.
  • The server is in control.
• Browser caches are different from proxy caches (discussed next).

Web Caching

• Proxy caches (also called proxy servers)
  • Intercept HTTP requests from clients
  • Serve the object if it is in the cache
  • If not, go to the object's home server
    • On behalf of the user, get the object and possibly deposit it in the cache before returning it to the user
  • Usually deployed at the edges of a network
    • Wide-area bandwidth savings
    • Improved response time
    • Increased availability of static web-based objects
• A browser may have to be configured to point to the proxy server.
• Usually a proxy cache is purchased and installed by an ISP.

Push-Based Approach

• Server tracks all proxies that have requested objects
• If a web page is modified, notify each proxy
• Notification types
  • Indicate that the object has changed [invalidate]
  • Send the new version of the object [update]
• How to decide between invalidates and updates?
  • Pros and cons?
  • One approach → send updates for more frequently accessed objects, invalidate for the rest

[Figure: the Web server pushes to the proxy]

Push-Based Approaches

• Advantages
  • Provide tight consistency [minimal stale data]
  • Proxies can be passive
• Disadvantages
  • Need to maintain state at the server
    • Recall that HTTP is stateless
    • Need mechanisms beyond HTTP
  • State may need to be maintained indefinitely
    • Not resilient to server crashes
• These disadvantages are the reason why push-based approaches are not used

Pull-Based Approaches

[Figure: the proxy polls the Web server and receives a response]

• The proxy is entirely responsible for maintaining consistency
• The proxy periodically polls the server to see if the object has changed
  • Use If-Modified-Since HTTP requests: with this header, a proxy tells a remote server to return a copy only if the object has been modified.
• Key question: when should a proxy poll?
  • Server-assigned Time-to-Live (TTL) values
  • No guarantee that the object will not change in the interim

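The If-Modified-Since exchange can be sketched without any networking. This is a simplified model: the function names are hypothetical, and timestamps are plain numbers rather than the HTTP date strings used by the real header. The status codes 200 and 304 (Not Modified) are the ones HTTP defines for this case.

```python
def handle_get(resource, if_modified_since=None):
    """Server side. `resource` is (body, last_modified) with a numeric timestamp."""
    body, last_modified = resource
    if if_modified_since is not None and last_modified <= if_modified_since:
        return 304, None, last_modified       # Not Modified: no body is sent
    return 200, body, last_modified


def revalidate(cached, resource):
    """Proxy side. `cached` is (body, last_modified); returns the fresh entry."""
    status, body, last_modified = handle_get(resource, if_modified_since=cached[1])
    return cached if status == 304 else (body, last_modified)


cached_entry = ("version 1", 100)
unchanged = revalidate(cached_entry, ("version 1", 100))   # server answers 304
updated = revalidate(cached_entry, ("version 2", 200))     # server answers 200
```

A 304 reply carries no body, so a poll for an unchanged object costs one small message exchange instead of a full transfer.
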
Pull-Based Approach

• Proxy can dynamically determine the polling interval
  • Compute it based on past observations
  • Start with a conservative poll interval
  • Increase the interval if the object has not changed between two successive polls
  • Decrease the interval if the object is updated between two polls
  • Adaptive: no prior knowledge of object characteristics needed
• Advantages
  • Server remains stateless
  • Resilient to both server and proxy failures
• Disadvantages
  • Weaker consistency guarantees (objects can change between two polls and the proxy will contain stale data until the next poll)
  • High message overhead

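The adaptive rule (lengthen the interval when nothing changed, shorten it when the object was updated) can be sketched as a single function. The doubling and halving factor and the 60-second and one-day bounds are assumptions; the slide only gives the direction of the adjustment.

```python
def next_interval(interval, changed, minimum=60.0, maximum=86400.0, factor=2.0):
    """Return the next polling interval in seconds.

    `factor`, `minimum`, and `maximum` are illustrative values, not from the slide.
    """
    if changed:
        return max(minimum, interval / factor)   # object is volatile: poll sooner
    return min(maximum, interval * factor)       # object is stable: poll later


interval = 60.0                       # conservative starting point
history = []
for changed in (False, False, True, False):
    interval = next_interval(interval, changed)
    history.append(interval)
```
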
A Hybrid Approach: Leases

• Lease: duration of time for which the server agrees to notify the proxy of modifications
• Issue the lease on first request, send notifications until expiry
  • Need to renew the lease upon expiry
• Smooth tradeoff between state and messages exchanged
  • Zero duration → polling; infinite leases → server push
• Efficiency depends on the lease duration
• Limited use

[Figure: the client reads from the proxy; the proxy sends "Get + lease req" to the server, which answers "Reply + lease" and later sends "Invalidate/update"]

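The server side of a lease scheme can be sketched as a table of expiry times. The class `LeaseServer`, its method names, and the explicit `now` argument (used instead of a real clock so the behaviour is easy to follow) are assumptions for illustration.

```python
class LeaseServer:
    """Tracks which proxies must be notified when an object changes."""

    def __init__(self, lease_duration):
        self.lease_duration = lease_duration
        self._leases = {}                 # (proxy, url) -> expiry time

    def get(self, proxy, url, now):
        """Serve a request and grant (or renew) a lease; returns its expiry."""
        expiry = now + self.lease_duration
        self._leases[(proxy, url)] = expiry
        return expiry

    def modify(self, url, now):
        """Return the proxies to notify; expired leases are simply dropped."""
        live = [p for (p, u), exp in self._leases.items() if u == url and exp > now]
        self._leases = {key: exp for key, exp in self._leases.items() if exp > now}
        return live


server = LeaseServer(lease_duration=10)
server.get("proxy-A", "/x", now=0)     # lease expires at t=10
server.get("proxy-B", "/x", now=5)     # lease expires at t=15
to_notify = server.modify("/x", now=12)

polling_only = LeaseServer(lease_duration=0)   # zero duration: nobody is notified
polling_only.get("proxy-A", "/x", now=0)
nobody = polling_only.modify("/x", now=0)
```

The server only keeps state for unexpired leases, which is the tradeoff between state and messages mentioned above.
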
Cooperative Caching

• Caching infrastructure can have multiple web proxies
  • Proxies can be arranged in a hierarchy or other structures
  • Proxies can cooperate with one another to:
    • Answer client requests
    • Propagate server notifications
• Uses a combination of HTTP and ICP (Internet Cache Protocol).
  • ICP can be used by one cache to quickly ask another cache if it has an object.
  • HTTP is used to actually retrieve the object.

Problems

• Caching proxies do not serve all Internet users.
• Content providers (say, Web servers) cannot rely on the existence and correct implementation of caching proxies.
• Accounting issues with caching proxies:
  • Example: www.cnn.com needs to know the number of hits to the advertisements displayed on the web page.

DNS Query in Web Download

• User types or clicks on a URL
  • E.g., http://www.cnn.com/2006/leadstory.html
• Browser extracts the site name
  • E.g., www.cnn.com
• Browser calls gethostbyname() to learn the IP address
  • Triggers resolver code to query the local DNS server
• Eventually, the resolver gets a reply
  • Resolver returns the IP address to the browser
• Then, the browser contacts the Web server
  • Creates and connects a socket, and sends the HTTP request

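The first steps of the download (extract the site name, then resolve it) can be sketched with the Python standard library. `locate_server` is a hypothetical helper; the resolver is passed in as a parameter so the example can be run without network access, and it defaults to `socket.gethostbyname`, the Python counterpart of the call named on the slide.

```python
import socket
from urllib.parse import urlsplit


def locate_server(url, resolver=socket.gethostbyname):
    """Return (hostname, IP address, port) for the Web server behind `url`."""
    parts = urlsplit(url)
    host = parts.hostname            # e.g. www.cnn.com
    port = parts.port or 80          # default HTTP port when none is given
    return host, resolver(host), port


# Offline usage with a stand-in resolver; 192.0.2.1 is a documentation address.
fake_resolver = lambda hostname: "192.0.2.1"
target = locate_server("http://www.cnn.com/2006/leadstory.html", resolver=fake_resolver)
```

With the default resolver, the returned address and port are what the browser would pass to `connect()` before sending the HTTP request.
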
DNS Resolution

[Figure: steps 1 to 10 of resolving www.cnn.com, involving the user PC, the browser's cache, the local name server, the root/.com/.net servers (InterNIC), and the cnn.com DNS servers]

Multiple DNS Queries

• Often a Web page has embedded objects
  • E.g., HTML file with embedded images
• Each embedded object has its own URL
  • ... and potentially lives on a different Web server
  • E.g., http://www.myimages.com/image1.jpg
• Browser downloads embedded objects
  • Usually done automatically, unless configured otherwise
  • Requires learning the IP address for www.myimages.com

When are DNS Queries Unnecessary?

• Browser is configured to use a proxy
  • E.g., browser sends all HTTP requests through a proxy
  • Then, the proxy takes care of issuing the DNS request
• Requested Web resource is locally cached
  • E.g., cache has http://www.cnn.com/2006/leadstory.html
  • No need to fetch the resource, so no need to query
• Browser recently queried for this host name
  • E.g., user recently visited http://www.cnn.com/
  • So, the browser already called gethostbyname()
  • ... and may be locally caching the resulting IP address

Directing Web Clients to Replicas

• Simple approach: different names
  • www1.cnn.com, www2.cnn.com, www3.cnn.com
  • But this requires users to select specific replicas
• More elegant approach: different IP addresses
  • Single name (e.g., www.cnn.com), multiple addresses
  • E.g., 64.236.16.20, 64.236.16.52, 64.236.16.84, ...
• Authoritative DNS server returns many addresses
  • And the local DNS server selects one address
  • Authoritative server may vary the order of addresses

Clever Load Balancing Schemes

• Selecting the "best" IP address to return
  • Based on server performance
  • Based on geographic proximity
  • Based on network load
  • ...
• Example policies
  • Round-robin scheduling to balance server load
  • U.S. queries get one address, Europe another
  • Tracking the current load on each of the replicas

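The round-robin policy can be sketched as an authoritative server that rotates its address list on every query. The class name `RoundRobinDNS` is hypothetical; the addresses are the ones listed for www.cnn.com on the replica slide.

```python
from collections import deque


class RoundRobinDNS:
    """Returns all replica addresses, rotating the order on each query."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def answer(self):
        reply = list(self._addresses)
        self._addresses.rotate(-1)    # next query starts with the next address
        return reply


dns = RoundRobinDNS(["64.236.16.20", "64.236.16.52", "64.236.16.84"])
answers = [dns.answer() for _ in range(4)]
first_choices = [reply[0] for reply in answers]
```

Clients that pick the first address in the reply are spread evenly over the replicas, although caching at local DNS servers makes the real distribution less uniform.
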
4. The Akamai network

• Started its commercial service in April 1999, with Yahoo! as its first customer.
• Currently offers content delivery services to more than 1200 of the world's leading electronic commerce organizations.
• According to Akamai, between 15% and 20% of all Web traffic is delivered by Akamai servers.
• Akamai's content delivery service is based on caching and replicating content on its servers, which are spread around the world.
• Also supports adaptive bitrate streaming of HD video → Akamai HD Network

Problems with the Centralized Approach

• Slow
  • Content must traverse multiple backbones and long distances
• Unreliable
  • Delivery may be prevented by congestion or backbone peering problems
• Not scalable
  • Usage limited by the bandwidth available at the master site
• Inferior streaming quality
  • Packet loss, congestion, and narrow pipes degrade stream quality

The Akamai Solution

• Multi-site multi-server distributed content approach.
• Caches, replicates, and distributes all forms of content and supports applications
• Monitors the Internet and routes around trouble spots
• Provides feedback on hit counts to content providers

Advantages of the Akamai Solution

• Fast
  • Content is served from locations near to end users
• Reliable
  • No single point of failure
  • Automatic fail-over
• Scalable
  • Master site no longer requires massive available bandwidth

Typical Page Content

Component               Size
Logos                   3,395 bytes
Navigation bar          9,674 bytes
Banner ads              16,174 bytes
GIF links               22,395 bytes
Fresh content           17,118 bytes
Total Akamai served     68,756 bytes
Total page              87,550 bytes

78% of the page is served by Akamai

Network Deployment

• 105,000+ servers
• 1900+ networks
• 78+ countries

Results

Web Site Performance: Typical Improvement with Akamai

[Figure: time series from noon May 15 to noon May 27 comparing a Web object delivered without Akamai and a Web object delivered by Akamai]

Over 1300 Web Sites are Now Akamaized

Akamai CDN: How it works...

HTML title page for www.xyz.com with embedded objects:

<html>
<head>
<title>Welcome to xyz.com!</title>
</head>
<body>
<img src="http://www.xyz.com/logos/logo.gif">
<img src="http://www.xyz.com/jpgs/navbar1.jpg">
<h1>Welcome to our Web site!</h1>
<a href="page2.html">Click here to enter</a>
</body>
</html>

Downloading www.xyz.com - before Akamai

• User enters www.xyz.com
• Browser requests IP address for www.xyz.com
• DNS returns IP address
• Browser requests HTML
• Content provider's web server returns HTML
• Browser obtains IP addresses for hostnames listed in URLs of objects embedded on page
• Browser requests embedded objects
• Content provider's web server returns embedded objects

[Figure: numbered message exchange between the user, the DNS server, and the content provider's web server WWW.XYZ.COM at 10.10.123.8]

Downloading www.xyz.com - the Akamai way

• User enters www.xyz.com
• Browser requests IP address for www.xyz.com
• DNS returns IP address
• Browser requests HTML
• Content provider's web server returns page with Akamaized URLs
• Browser obtains IP address of optimal Akamai server for embedded objects
• Browser obtains objects from optimal Akamai server

[Figure: numbered message exchange between the user, the DNS server, and the content provider's web server WWW.XYZ.COM]

Content Delivery Using Akamai

Embedded URLs are converted to ARLs:

<html>
<head>
<title>Welcome to xyz.com!</title>
</head>
<body>
<img src="http://www.xyz.com/logos/logo.gif">
<img src="http://www.xyz.com/jpgs/navbar1.jpg">
<h1>Welcome to our Web site!</h1>
<a href="page2.html">Click here to enter</a>
</body>
</html>

Akamai caching services

ARL: Akamai Resource Locator
http://a620.g.akamai.net/7/620/16/259bf4ed29de/www.cnn.com/i/22.gif

• Host part: a620.g.akamai.net/
• Akamai control part: /7/620/16/259bf4ed29de/
• Content URL: /www.cnn.com/i/22.gif

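The three-part split of an ARL can be sketched with the standard URL parser. This is only an illustration based on the slide's example: the function name `parse_arl` is hypothetical, and the assumption that the control part always consists of exactly four path segments comes from that single example, not from any Akamai specification.

```python
from urllib.parse import urlsplit

CONTROL_SEGMENTS = 4   # assumption drawn from the example ARL on the slide


def parse_arl(arl):
    """Split an ARL into host part, control part, and content URL."""
    parts = urlsplit(arl)
    segments = parts.path.lstrip("/").split("/")
    control = segments[:CONTROL_SEGMENTS]
    content = segments[CONTROL_SEGMENTS:]
    return {
        "host": parts.hostname,
        "control": "/" + "/".join(control) + "/",
        "content_url": "/" + "/".join(content),
    }


arl = "http://a620.g.akamai.net/7/620/16/259bf4ed29de/www.cnn.com/i/22.gif"
fields = parse_arl(arl)
```

The content URL is what a content server would need in order to fetch the object from the origin on a cache miss.
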
ARL: Akamai Resource Locator (I)
http://a620.g.akamai.net/7/620/16/259bf4ed29de/www.cnn.com/i/22.gif

• The content provider (CP) selects which content will be hosted by Akamai.
• Akamai provides a tool that transforms the CP URL (/www.cnn.com/i/22.gif) into the ARL.

ARL: Akamai Resource Locator (II)

• The host part (a620.g.akamai.net/) in turn causes the client to access Akamai's content server instead of the origin server.

ARL: Akamai Resource Locator (III)

• If Akamai's content server doesn't have the content in its cache, it retrieves it using the content URL (/www.cnn.com/i/22.gif).

ARL Control Part
http://a620.g.akamai.net/7/620/16/259fdbf4ed29de/www.cnn.com/i/22.gif

Control part: /7/620/16/259fdbf4ed29de/

• Customer number (e.g., CNN, Yahoo...)
• Type code (different types will have different contents)
• ???
• Content checksum (may be used for identifying changed content; may also validate content???)

ARL Host Part
http://a620.g.akamai.net/7/620/16/259fdbf4ed29de/www.cnn.com/i/22.gif

• But why such a complex domain name?

ARL Host Part (II)

Hierarchical DNS architecture:
• .net gTLD
  • Points to ~8 akamai.net DNS servers (random ordering, TTL on the order of hours to days)
• akamai.net
  • Attempts to select ~8 g.akamai.net DNS servers near the client (using BGP? TTL on the order of 30 min to 1 hour)
• g.akamai.net
  • Makes a very fine-grained load-balancing decision among local content servers (CS) for a620.g.akamai.net
  • TTL on the order of 30 sec to 1 min

DNS Resolution

User DNS requests for www.xyz.com

[Figure: 16 numbered steps involving the user PC, the browser's cache, the local DNS server, the root/.com/.net DNS servers (InterNIC), xyz.com's DNS server, the Akamai high-level DNS servers, and the Akamai low-level DNS servers. Labels on the figure: "xyz.com?" and 10.10.123.5; a212.g.akamai.net; "Akamai.net?" and 15.15.125.6; "g.akamai.net?" and 20.20.123.55; "a212.g.akamai.net?" and 30.30.123.5; TTL: 30']

Let's look at a study of CDN performance

• Zhang, Krishnamurthy and Wills
  • AT&T Labs
• Traces taken in Sept. 2000 and Jan. 2001
• Compared CDNs with each other
• Compared CDNs against non-CDN sites

Methodology

• Selected a bunch of CDNs
  • Akamai, Speedera, Digital Island
  • Note: most of these are gone now!
• Selected a number of non-CDN sites for which good performance could be expected
  • U.S. and international origin
  • U.S.: Amazon, Bloomberg, CNN, ESPN, MTV, NASA, Playboy, Sony, Yahoo
• Selected a set of images of comparable size for each CDN and non-CDN site
  • Compare apples to apples
• Downloaded images from 24 NIMI machines

Response Time Results (II)
Including DNS Lookup Time

[Figure: cumulative probability of response time, with an annotation at about one second]

• Author conclusion: CDNs generally provide much shorter download time.

Other findings of study

• Each CDN performed best for at least one (NIMI) client
  • Why? Because of proximity?
• The best origin sites were better than the worst CDNs
• CDNs with more servers don't necessarily perform better
  • Note that they don't know the load on servers...
• HTTP 1.1 improvements (parallel download, pipelined download) help a lot
  • Even more so for origin (non-CDN) cases
  • Note: not all origin sites implement pipelining

Another study

• Keynote Systems
  • "A Performance Analysis of 40 e-Business Web Sites"
• Doing measurements since 1997
  • (All from one location, near as I can tell)
• Latest measurement: January 2001

Historical trend: Clear improvement

[Figure: historical trend chart]

Performance breakdown

• Basically says that smaller content leads to shorter download times (duh!)

[Figure: panels for average content sizes of 12 Kbytes, 44 Kbytes, and 99 Kbytes]

Effect of CDN

• Note: non-CDNs can work well (CDN not always better)
