Gregory W. MacPherson
This document provides a technical overview of the distributed server architecture supporting Mjuice®. It covers the architecture itself, definitions of important concepts and features, the detailed objectives of the design, and a frequently asked questions (FAQ) section. The distributed server architecture is the foundation for the Mjuice music delivery system, which is the platform on which the Mjuice Web site is built.
INTRODUCTION
WHAT IS THE DISTRIBUTED SERVER ARCHITECTURE?
Scalability
Speed
Reliability
Security
Recovery
WHY HAVE A DISTRIBUTED SERVER ARCHITECTURE?
IMPORTANT CONCEPTS
LOAD BALANCING
DATABASE
REDIRECT
UNIX
MEDIA CONTENT
ARCHITECTURE
WEB INTERFACE
DATABASE
MEDIA CONTENT
SITE SUPPORT
OBJECTIVES
SCALABILITY
SPEED
RELIABILITY
SECURITY
RECOVERY
This document provides a technical introduction to the Mjuice® distributed server architecture. The distributed server architecture comprises the hardware and software components necessary to support a large global presence on the World Wide Web (WWW). It allows secure access to the Web site, processing of large numbers of concurrent requests, and delivery of media content to a large number of concurrent users. Associated components of the distributed architecture model allow for real-time monitoring and archival statistical information gathering, remote administration, secure remote access, data backup and recovery, redundancy and failover, and disaster recovery. In addition, the distributed server architecture is scalable and cost effective to deploy when contrasted with vendor-specific hardware solutions and with dedicated turnkey system solutions.
The distributed server architecture is a design model depicting how the various hardware and software components of the Mjuice system are connected and interact with each other to produce the Mjuice Web site. The model consists of several independent, integrated hardware components that work together to provide the necessary functionality for Mjuice. The hardware components consist of readily available computers and networking components. The software is a combination of freely distributed source code, proprietary application software, and commercial applications. The distributed server architecture is the foundation for the Mjuice music delivery system.
Mjuice is positioned to be the premier supplier of secure music delivery on the World Wide Web. To accomplish this goal, and to ensure the protection of the assets of business partners and the privacy of customers, the underlying hardware and software components must meet several objectives:
Scalability is the characteristic of a system that allows it to expand in direct proportion to the demand on the system. The system capacity must be able to be expanded quickly, without incurring the costs of expensive programming time, proprietary hardware and software, and without the capacity limitations inherent in single server implementations.
The ability to handle large numbers of concurrent requests and bi-directional Web traffic in a secure manner requires that the functionality of the Mjuice site be distributed across several independent components. The Mjuice Web site can be expected to handle the following functions concurrently:
The ability to provide this functionality for an ever growing number of customers, in a cost effective manner and without placing restrictive upper limits on the number of requests that can be served, is the basis behind the distributed server architecture.
Customer reaction to World Wide Web sites depends directly on the speed with which the site responds to the customers' requests. The ability to serve customer requests for static and dynamic site content quickly has a direct impact on the customers' desire to interact with the Mjuice Web site. Any contention for resources among the various components of the Web site would slow the response time as perceived by the customer, and consequently would decrease the customers' enjoyment of the Mjuice experience.
Multiple independent hardware components allow for the distribution of the various functions of the Mjuice Web site. Providing the necessary hardware and software to handle the separate functions of the site without resource contention enhances response time. In addition, as different components of the site undertake greater loads, capacity enhancements can be made to one portion of the system without impacting the remainder of the site.
Customer access to the Mjuice Web site must be predictable and reliable. A site that cannot be accessed or that responds inconsistently will destroy customer confidence, regardless of desirable content, technical features, or fast response time. Outside of the design considerations of the Web site content, the back end architecture must ensure that the availability and behavior of the site conform to customer expectations.
The distributed architecture embraces redundancy of hardware and software to prepare for inevitable network interruptions in a complex internetworked computing environment. Standardization of software, a concise set of software tools, redundant hardware components, remote system management, and data backup facilities are all designed to minimize the time between a service interruption and restoration of service. System components are monitored 24 hours a day. Mjuice systems administrators receive a page when anomalies are detected.
The Mjuice Web site provides secure delivery of music content. Security is fundamental to the company's mission and to the financial success of the company. As in any community, trust relationships exist because a small proportion of the community will attempt to take advantage of any vulnerability in the system. Because Mjuice is committed to protecting the assets of its business partners and the privacy of its customers, network security is tightly integrated into the distributed architecture model.
The distributed server architecture ensures that access to the Mjuice Web site is unencumbered for legitimate customers and business partners. The content delivery system incorporates a fast, compact cipher mechanism. Sensitive data is stored separately from the Web servers and the music content. All access to the Mjuice servers is monitored. Administrative access to the system is tightly controlled. Remote system administration occurs through secure encrypted channels. Finally, the entire system is housed in a data center that provides 24-hour access control, triple redundant power backup, and fire control mechanisms.
Interaction with the Mjuice Web site produces substantial customer data for transactions. Also, the music content delivered by the Mjuice site is a collection of assets entrusted to Mjuice by the various business partners. Should an event occur that resulted in the loss of customer or business partner data, both customers and business partners would lose confidence in Mjuice. In addition, Mjuice would be liable for the loss of the assets entrusted to it by the customers and the business partners.
The distributed server architecture makes every attempt to ensure that data loss resulting from hardware or software error can be rectified. The relational database used by the Mjuice web site affords two phase commits and transaction rollback capabilities. Redundant hardware components allow for fast restoration of transaction data. Multiple servers and redundant disk storage diminish the probability that every existing copy of the data will be destroyed in the event of a natural disaster. Automated tape backup procedures ensure that systems can be restored to a state close to or exactly mimicking their previous condition.
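The transaction rollback capability mentioned above can be sketched as follows. This is a minimal illustration only: it uses Python's built-in sqlite3 module rather than the Oracle database the Mjuice site actually runs, and the purchases table and its contents are hypothetical.

```python
import sqlite3

# In-memory stand-in database; the real site uses Oracle, not SQLite.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (customer TEXT, song TEXT)")
conn.commit()

try:
    # Used as a context manager, the connection commits on success
    # and rolls back if an exception escapes the block.
    with conn:
        conn.execute("INSERT INTO purchases VALUES ('alice', 'Song A')")
        raise RuntimeError("simulated failure in the middle of a transaction")
except RuntimeError:
    pass

# The half-finished purchase was rolled back, so the table is still empty.
count = conn.execute("SELECT COUNT(*) FROM purchases").fetchone()[0]
print(count)
```

The point of the sketch is that a failure partway through a transaction leaves no partial record behind.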
Mjuice is positioned to be the premier provider of secure music delivery on the World Wide Web. Consequently Mjuice expects to handle enormous amounts of site traffic as the number of customers and business partners increases. Currently the availability of cost effective hardware and software that can manage such an enormous load is very limited. In addition, the desirability of expanding the Mjuice system to afford added functionality, greater customer scope, and additional content dimensions requires that the system be scalable and extensible.
The distributed server architecture can:
The distributed server architecture benefits both Mjuice customers and the Mjuice business partners. Customers achieve reliable, fast, and secure interaction with the Mjuice Web site. Business partners achieve fast, scalable, and cost effective deployment of static and dynamic Web content.
Some concepts and terms used to describe the distributed server architecture are new and some are not. Unfortunately, some definitions of terms are vague, inaccurate, or have multiple definitions. Before going on, it is important to understand how the following concepts and terms are defined in the context of the discussion of the distributed server architecture.
Load balancing is the ability of a networking component to accept multiple requests for service and to evenly distribute those requests among several other network components so that no single component is overloaded. The Mjuice distributed server architecture incorporates several load balancing switches. These switches are hardware components that incorporate load balancing software. The load balancing switches accept the incoming requests for static and dynamic Web content and route the requests to one of a number of servers, which are configured to respond to the requests. The load balancing switches also multiplex the responses from the servers to the requestor.
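As a rough illustration of even distribution, the sketch below hands out requests in strict rotation (round robin). The algorithm actually used by the switch firmware is not stated in this document, so round robin is an assumption, and the server names are hypothetical.

```python
from itertools import cycle

def make_round_robin(servers):
    """Return a function that hands out servers in strict rotation."""
    rotation = cycle(servers)
    return lambda: next(rotation)

# Hypothetical pool of three Web servers behind one switch.
pick_server = make_round_robin(["web1", "web2", "web3"])

# Six incoming requests are spread evenly, two per server.
assignments = [pick_server() for _ in range(6)]
print(assignments)
```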
A database is a software component that acts as a storage repository for data. A database supports complex relationships between the data elements. For example, a request for all of the people with the last name MacPherson who have downloaded songs from the rock music genre can be modeled by a database. Databases also provide mechanisms for data analysis and report generation. The Mjuice distributed server architecture uses a database to store customer account data, account authorization data, and data related to the music available on the Mjuice Web site.
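The MacPherson example above might be expressed as a query like the one sketched here. The table and column names are hypothetical, and Python's built-in sqlite3 module stands in for the Oracle product used by the Mjuice site.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and sample rows, for illustration only.
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, last_name TEXT);
    CREATE TABLE songs (id INTEGER PRIMARY KEY, title TEXT, genre TEXT);
    CREATE TABLE downloads (customer_id INTEGER, song_id INTEGER);
    INSERT INTO customers VALUES (1, 'MacPherson'), (2, 'Smith');
    INSERT INTO songs VALUES (10, 'Song A', 'rock'), (11, 'Song B', 'jazz');
    INSERT INTO downloads VALUES (1, 10), (2, 10), (1, 11);
""")

# Everyone named MacPherson who has downloaded at least one rock song.
rows = conn.execute("""
    SELECT DISTINCT c.id, c.last_name
    FROM customers c
    JOIN downloads d ON d.customer_id = c.id
    JOIN songs s ON s.id = d.song_id
    WHERE c.last_name = 'MacPherson' AND s.genre = 'rock'
""").fetchall()
print(rows)
```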
A redirect is a software mechanism that allows a service requested from one server to be transparently handed off to a different server. A redirect differs from a proxy in that a proxy relays every request and response between the client and the destination server, while a redirect is a one-time transfer of control to a different server. The Mjuice distributed server architecture uses redirects to transfer control of music content delivery from the Web server machines to the music content delivery machines.
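At the protocol level, an HTTP redirect is simply a short response that names the new location. The sketch below builds one by hand; the URL is a made-up example and is not an actual Mjuice address.

```python
def redirect_response(location):
    """Build a minimal HTTP 302 response that sends the client elsewhere."""
    return (
        "HTTP/1.1 302 Found\r\n"
        f"Location: {location}\r\n"
        "Content-Length: 0\r\n"
        "\r\n"
    )

# Hypothetical media server URL, for illustration only.
response = redirect_response("http://media.example.com/songs/123")
print(response)
```

After receiving this response, the client contacts the named server directly, and the server that issued the redirect takes no further part in the transfer.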
UNIX® is a multi-user, multiprocessing computer operating system. Several variants and brands of UNIX are available both commercially and for free distribution. The Mjuice Web site utilizes two brands of UNIX, FreeBSD and Solaris. FreeBSD is a freely distributed UNIX operating system that provides advanced networking, performance, security, and compatibility features. Solaris is a commercially available UNIX operating system for Sun workstations and servers. It is utilized on the Mjuice database servers only. All other servers in the distributed server architecture run FreeBSD.
Media content refers to all of the musical content, songs, song previews, audio files, streaming audio, streaming video, and other multimedia content accessible by customers on the Mjuice Web site. The media content is stored on a number of servers that accept redirects from the Mjuice Web site. The media content servers verify that the client recipient has been authorized to receive the requested content, and then serve the requested content directly to the client.
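One way a content server could verify authorization after a redirect is to check a signed token issued by the Web server. The sketch below is an assumption made for illustration: this document does not describe the actual Mjuice mechanism, and the shared secret, function names, and identifiers are all hypothetical.

```python
import hashlib
import hmac

# Assumption: a secret known to both the Web servers and the media servers.
SECRET = b"hypothetical-shared-secret"

def sign(user_id, song_id):
    """Web server side: issue a token binding one user to one song."""
    message = f"{user_id}:{song_id}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()

def is_authorized(user_id, song_id, token):
    """Media server side: recompute the token and compare in constant time."""
    return hmac.compare_digest(sign(user_id, song_id), token)

token = sign("alice", "song-123")
print(is_authorized("alice", "song-123", token))  # token matches this request
print(is_authorized("bob", "song-123", token))    # token was issued to someone else
```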
This section illustrates the primary components of the distributed server architecture for the Mjuice Web site.
Initial incoming requests to the Mjuice Web site are directed to a load balancing switch. The distributed server architecture utilizes Foundry ServerIron 24-port load balancing switches. The switches are connected together to afford a failover mechanism in the event that one of the switches ceases to function correctly.
The load balancing switch directs the incoming request to one of the Web servers according to an algorithm embedded in the switch firmware. The Web server responds to the request by returning the requested Web page to the customer's browser software. This is illustrated in figure 1 below:
The static and dynamic Web content is stored on each of the web server machines. Each web server machine is a Pentium class server with a single Pentium III processor and three 9-gigabyte hard disks. One hard disk is dedicated to the operating system, while the remaining two are concatenated together to form a single 17 gigabyte virtual disk. The server memory is expandable up to 1 Gigabyte. In addition, the system boards can accept a second Pentium processor. FreeBSD supports symmetric multiprocessing (SMP), so only the addition of the processor and a UNIX kernel build are required to double the server's processing capacity. Each Web server runs the FreeBSD 3.2 operating system and the Apache Web server with the ModPERL extensions. The Apache Web server is a multi-process Web server that can handle multiple simultaneous connections. The ModPERL extensions are compiled into the Apache Web server to support the interactive functionality of the Mjuice Web site.
The Mjuice secure delivery mechanism utilizes a database to store information about the individual user accounts, information necessary to implement the secure delivery mechanism, and additional information associated with the media content.
The database resides on a separate back end network, and is only accessible by the Mjuice Web servers and Media content servers (see next section). Customers and business partners do not interact directly with the database. This is illustrated in figure 2 below:
The database is an Oracle 8.0.5 relational database product. The hardware supporting the database product consists of two Sun E450 Servers and two Rourke Data Redundant Array of Independent Disks (RAID) units. Each Sun E450 server runs the Solaris 2.7 64 bit operating system. In addition, both E450 servers are configured so that the two machines' instances of the Oracle database remain current with each other. In the event of a catastrophic failure of one of the servers, the other server automatically assumes the role of primary database server. In this way the functionality of the Mjuice Web site continues uninterrupted while the Mjuice systems administrators work to restore the failed machine.
The Mjuice Web site is designed to deliver music and other media content to authorized users. Properly authorized users may preview song selections for free, and may download song selections from the site. The Mjuice Web site accomplishes this by issuing a HyperText Transfer Protocol (HTTP) redirect to the load balancing switch. The load balancing switch directs the client request to one of a group of media content servers located behind the switch. The media content server responds to the request by authenticating the client requestor and then initiating the download of the media to the customer's client software. This is illustrated in figure 3 below:
The media content is stored on each of the media content server machines. Each media content server machine is a Pentium class server with a single Pentium III processor and three 9-gigabyte hard disks. The hard disk configuration conforms to the description given in the section describing the Web servers. The server memory is expandable up to 1 Gigabyte. As with the Web servers, the system boards can accept a second Pentium processor. Each media content server runs the FreeBSD 3.2 operating system and the Apache Web server with the ModPERL extensions. The Apache Web server is a multi-process Web server that can handle multiple simultaneous connections. The ModPERL extensions are compiled into the Apache Web server to support the download functionality of the Mjuice Web site.
The Mjuice Web site incorporates numerous support servers in addition to the hardware already described. These servers' functions are required to meet the contractual business obligations of Mjuice to its business partners and to maintain the quality of service for the distributed server architecture.
Customers and business partners have no direct interaction with the support servers. Instead, the support servers enhance the functionality of the Mjuice Web site and protect the site against hardware failure, natural disaster, or external security threats. This is illustrated in figure 4 below:
Each of the supporting server machines is a Pentium class server with a single Pentium III processor and three 9-gigabyte hard disks, configured as previously described. The exceptions are the statistics and monitoring servers, which have dual processors to support their heavier numerical workload. Each server's memory is expandable up to 1 Gigabyte. The system boards can accept a second Pentium processor, except on the statistics and monitoring machines, which already have two. All of the servers run the FreeBSD 3.2 operating system.
The Domain Name System (DNS) servers are responsible for the publication of the assigned names of the Mjuice servers across the Internet. DNS is a distributed name lookup service used by all Internet computers. Two servers are required, a primary and a secondary. Because it is essential that the DNS entries be correct in order for the Mjuice Web Site to function properly, the two DNS servers are located on separate networks in separate physical locations.
The serial console access is achieved through the use of terminal servers connected to the unpublished back end network of the Mjuice Web site. These serial console access points facilitate secure remote access directly to every server in the distributed server architecture. In the event of a server failure, the Mjuice systems administrators can quickly access the machine from a remote location and take steps to bring the server back online or replace it with a spare server.
The mail servers handle all electronic mail (e-mail) duties within the Mjuice distributed server architecture. There are two mail servers located on separate networks in separate physical locations. The mail servers also handle the messages from customers subscribing to and unsubscribing from the Mjuice mailing list.
The tape backup mechanism is a fully automated system that stores both the full and the incremental backups of the Mjuice customer information offline. In the event of a failure of a hardware or software component impacting the customer database information, the customer records can be restored to their state at the time of the previous backup.
The time server is a dedicated Global Positioning System (GPS) receiver. It receives time signals from the GPS constellation of 24 satellites in orbit around the Earth. The time server keeps accurate time, and is referenced periodically by all of the servers in the distributed server architecture. This ensures that all of the servers maintain the correct time, which is necessary both to support time-based authentication mechanisms and to ensure that intruders attempting to reset the time cannot compromise the system.
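The dependence of time-based authentication on synchronized clocks can be sketched as follows. The tolerance value and function name are hypothetical; the actual Mjuice mechanism is not described in this document.

```python
# Assumption: requests carry a timestamp and are rejected if it is too old
# or too far in the future. The 30-second tolerance is a made-up value.
MAX_SKEW_SECONDS = 30

def timestamp_is_fresh(request_time, server_time, max_skew=MAX_SKEW_SECONDS):
    """Accept a request only if its timestamp is close to the server's clock."""
    return abs(server_time - request_time) <= max_skew

# With synchronized clocks, a request stamped 20 seconds ago is accepted.
print(timestamp_is_fresh(request_time=1000, server_time=1020))
# If a server clock has drifted (or been reset) by 100 seconds, it is rejected.
print(timestamp_is_fresh(request_time=1000, server_time=1100))
```

A check of this kind only works if every server agrees on the current time, which is the role the time server plays.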
The statistics and monitoring servers provide real time monitoring of the state of the servers in the Mjuice distributed server architecture. There are two servers located on separate networks in separate geographical locations. The two servers monitor each other, as well as the Mjuice Web site, to ensure that any failure of one is immediately communicated to the systems administrators. In addition, the servers provide mechanisms that allow for the archival storage and retrieval of statistical information for business purposes.
This section details the primary objectives of the distributed server architecture for the Mjuice Web site.
The Mjuice distributed server architecture allows for enormous increases in capacity. Each FreeBSD server is a standard IBM® personal computer clone constructed using readily available hardware components. The cases, system boards, microprocessors, memory and hard disks are all mass-produced components. None of the components are vendor specific or customized. Therefore additional hardware can be ordered and installed in the time required to assemble and deliver the machines. In addition, the impact of hardware failures and systems interruptions is minimized since each machine's components are identical.
The load balancing switches utilized in the Mjuice distributed architecture contain 24 ports. Each of these ports is connected to a 24-port Intel 510 network switch. This creates a total of 576 connection points that can be load balanced by each ServerIron switch. The total capacity of each individual switch's load balancing software is 1024 ports.
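The port arithmetic above can be checked directly. The sketch follows the document's own counting, in which every port on every downstream switch is treated as a connection point.

```python
PORTS_PER_LOAD_BALANCER = 24   # ports on each ServerIron switch
PORTS_PER_NETWORK_SWITCH = 24  # ports on each Intel 510 switch
SOFTWARE_PORT_LIMIT = 1024     # load balancing software capacity per switch

connection_points = PORTS_PER_LOAD_BALANCER * PORTS_PER_NETWORK_SWITCH
headroom = SOFTWARE_PORT_LIMIT - connection_points

print(connection_points)  # 576
print(headroom)           # 448
```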
There are two load balancing switches for the Web servers and two for the media content servers, for a total of four switches. Each pair of switches is ganged together to create a failover mechanism in the event any single switch experiences problems. In addition, the load balancing software of the ServerIron switches can detect when a server fails to respond and will automatically remove that server from its load balancing rotation.
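The automatic removal of unresponsive servers can be sketched as a simple filter over the pool. The probe results below are hypothetical stand-ins; how the ServerIron firmware actually probes servers is not described in this document.

```python
def active_pool(servers, is_responding):
    """Keep only the servers that currently answer a health probe."""
    return [server for server in servers if is_responding(server)]

# Hypothetical probe results; a real probe might attempt a TCP connection.
probe_results = {"media1": True, "media2": False, "media3": True}

pool = active_pool(list(probe_results), probe_results.get)
print(pool)  # media2 is left out until it responds again
```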
The load balancing switches support virtual IP addressing. This allows the physical hardware to service more than one IP address. For example, the Mjuice Web site contains media content servers for the mp3 audio file format. It also contains servers for the Real Audio G2 streaming audio format. Each of these groups of servers can be addressed using individual virtual IP addresses. In the event that new media content servers are added to the distributed server architecture, the new group of servers can be assigned another separate virtual IP address to differentiate them from the existing media content servers.
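Virtual IP addressing amounts to a mapping from one published address to a group of physical servers. In the sketch below the addresses come from the 192.0.2.0/24 documentation range and the server names are invented; neither reflects the real Mjuice configuration.

```python
# Hypothetical virtual IP table, for illustration only.
virtual_ips = {
    "192.0.2.10": ["mp3-1", "mp3-2"],  # mp3 download servers
    "192.0.2.20": ["g2-1", "g2-2"],    # Real Audio G2 streaming servers
}

def servers_for(virtual_ip):
    """Return the server group behind a virtual IP, or an empty list."""
    return virtual_ips.get(virtual_ip, [])

# Adding a new group of media content servers only needs a new entry.
virtual_ips["192.0.2.30"] = ["video-1"]

print(servers_for("192.0.2.20"))
print(servers_for("192.0.2.30"))
```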
Spare machines are configured into the distributed server architecture. One of these machines can be inserted into the network in the event that a production machine experiences problems. The down time for any server is a function of how much configuration the replacement spare will require to allow it to function in place of the inoperative server machine. The uniform configuration of the Web and Media content servers in the distributed server architecture reduces the expected downtime of any of the FreeBSD servers.
Response time to customer requests ultimately will decide the success of the Mjuice Web site. The performance speed of the Web site is a function of several interacting components. These components include the net Internet usage, the capacity of the data center network components, the efficiency of the load balancing switches, and the hardware resources of the servers.
The net usage of the Internet is a factor beyond the control of any single point of presence. Internet traffic and the quality of service of the interconnected tier one providers and their equipment are variables beyond the control of any individual network entity. In the future, multiple points of presence may be necessary in order to circumvent Internet traffic congestion.
Currently the Mjuice Web site is connected to the Internet through two independent 100-megabit Ethernet connections. Together these connections allow the site to handle a maximum of 160 megabits per second of traffic after allowing for network overhead. As the site's traffic grows, the cost of the data center connection will increase and its efficiency will decrease. Our current data center, Frontier GlobalCenter, can expand our capacity to gigabit Ethernet, which will increase the Web site's network capacity by an order of magnitude.
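The 160 megabit figure follows from the two links if roughly 20 percent of the raw capacity is set aside for network overhead. That percentage is inferred from the numbers given here rather than stated in the document, so it is marked as an assumption in the sketch.

```python
LINKS = 2
LINK_MBPS = 100
USABLE_PERCENT = 80  # assumption: inferred from 160 usable out of 200 raw

raw_mbps = LINKS * LINK_MBPS
usable_mbps = raw_mbps * USABLE_PERCENT // 100

# Moving each link from 100 megabit to gigabit Ethernet is a tenfold increase.
GIGABIT_MBPS = 1000
growth_factor = GIGABIT_MBPS // LINK_MBPS

print(raw_mbps, usable_mbps, growth_factor)
```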
The efficiency of the Foundry load balancing switches depends directly on the number and configuration of the servers that they manage. Therefore it is in the best interest of the Mjuice Web site to minimize the number of servers that each load balancing switch manages. The most effective way to minimize the load on the switches is to maximize the efficiency of each individual server.
The FreeBSD servers incorporated into the distributed server architecture are state of the art hardware. The servers contain 100Mb Ethernet network cards, which enable full duplex data transfer at a rate of 100Mbps. The IBM hard disks have a rotational speed of 10,000 RPM and a measured data transfer rate of 38Mbps. The Apache Web servers have been optimized to minimize both their memory usage and their resident time in memory.
The Sun E450 database servers on the back end network utilize high speed fiber optics to connect to the two RAID array units. The RAID units are hardware RAID units, which are markedly faster than software RAID implementations. The E450 servers each contain four microprocessors and can be configured with up to 4GB of memory.
The Oracle database model has been designed and configured with expansion and portability in mind. Both the servers and the database have been configured and tuned by a database professional to optimize their ability to interact with both the Web servers and the media content servers.
The total number of registered users on the Mjuice Web site depends directly on the continuity of service. Therefore every conceivable precaution has been taken to ensure that the Web site will remain available 24 hours a day throughout the year.
The network connectivity at the data center is the primary access point for the Mjuice Web site. The two redundant network connections have been configured to take over for each other in the event that a network outage affects one of them. While the net effect to the Web site will be decreased performance, the site will remain available to customers until the data center can correct the problem.
Behind the data center connections, the physical switching hardware is the next point of connectivity on the Mjuice Web site. Customers never see the switching hardware, yet it is an integral part of the distributed server architecture. Each switch has a second, identically configured spare that will replace the primary component in the event that the primary suffers a hardware failure. The handoff to the spare is brief: connected customers will notice only a momentary decrease in performance, and the Web site will remain online throughout.
The FreeBSD servers are behind the switching hardware. There are several spare servers in place that are already configured and that can be brought online very quickly. Bringing a replacement server online requires the systems administrators to log into the replacement server, rebuild the UNIX kernel, configure the server with the failed server's IP address, remotely distribute the server content to the new machine, and finally restart the machine. The quick turnaround ensures the availability of significant functionality of the Mjuice Web site.
The database servers amass the customer data for the entire Mjuice Web site. In addition, these servers contain the encryption components necessary to perform the secure delivery of music that is the basis of the Mjuice system. Therefore the integrity of the data on these servers is paramount. Fortunately, the Sun E450 servers are connected to the redundant disk storage arrays. The entire disk array appears as a single volume to the database machine. The disk volume can be entirely reconstructed on the fly without any interruption of service to the customers. This ensures that the customer data will not be damaged or lost during normal service.
The security of the information assets of the Mjuice Web site is paramount. Corporate espionage and hackers are a reality on the Internet. The Mjuice distributed server architecture attempts to prevent any unauthorized access to the servers and their data.
Communication with the individual servers is accomplished using the F-Secure implementation of the Secure Shell (SSH) protocol. This protocol provides an encrypted connection between the server and the client accessing it. Unlike TELNET and FTP, which transmit password data in clear text across the connection, a Secure Shell session yields only garbled encrypted data to anyone who attempts to view the content of the logon session or to steal the connection. Secure Shell connections use strong encryption, and are subject to United States export controls at the time of this writing.
Communication between the Web servers and the media content servers is via HTTP redirects. The load balancing switches are the only access points for the media content servers. Even if a direct HTTP connection could be made to the media servers from the outside, the connecting client could only download the media content after providing the proper user and session keys. Even if the connecting client were able to spoof the authentication mechanism and obtain the content, the content files could only be played back on a device or a computer containing the proper decryption key.
Only the Web and media content servers on the front end network of the Mjuice distributed server architecture can communicate with the database machines. The database machines are not accessible to the outside world.
The statistics and monitoring software runs on every server in the distributed server architecture. If an intruder were able to access one of the servers, the server logs would detail the intruder's attempt. Were an intruder able to access the machine and modify the logs, the statistics and monitoring software would notify the systems administrators of a problem with the server.
The servers that constitute the Mjuice Web site are located in a data center. Entry to the data center is restricted to personnel whose names are listed on a restricted access list. Entry to the data center requires signing in and presenting identification to the guard, who is on duty twenty-four hours a day. Entry to the cage containing the Mjuice hardware requires that the guard escort the personnel to the cage. The entire premises are under camera surveillance around the clock. No hardware can be installed or removed in the data center without the express written permission of the guard. The network traffic in and out of the data center is closely monitored twenty-four hours a day.
The reality of man-made technology is that it is imperfect. Network outages occur, people make mistakes, and natural disasters happen. Certain circumstances, such as a massive power outage or the interruption of connectivity by damage to network equipment or a severed cable, would not impact the integrity of the Mjuice Web site. However, other events, such as a hardware failure or a natural disaster, would require that the systems administrators reconstruct the Web site and bring it back online. While no amount of preparation can prevent these events, the distributed server architecture attempts to take every precaution to allow for a quick and thorough recovery from them.
The Mjuice Web site incorporates statistics and monitoring software. This software continuously checks the status of the individual servers in the distributed server architecture. Several performance metrics, including system load, CPU utilization, server response, network bandwidth utilization, and the number of running processes, are examined periodically. When any anomalies are detected, the monitoring software directs a message to the paging system for the systems administrators. The systems administrators can remotely log into the machine to diagnose and correct any problem.
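The monitoring cycle described above (sample the metrics, compare them against limits, page on anomalies) can be sketched as follows. The threshold values, metric keys, and message format are hypothetical; this document names the metrics but not the limits or the monitoring software.

```python
# Hypothetical limits, for illustration only.
THRESHOLDS = {"system_load": 4.0, "cpu_percent": 90.0, "process_count": 300}

def find_anomalies(metrics, thresholds=THRESHOLDS):
    """Return the sorted names of metrics that exceed their limits."""
    return sorted(
        name for name, limit in thresholds.items()
        if metrics.get(name, 0) > limit
    )

def page_message(server, anomalies):
    """Build the text sent to the paging system, or None if all is well."""
    if not anomalies:
        return None
    return f"{server}: " + ", ".join(anomalies)

# One hypothetical sample from a Web server under heavy load.
sample = {"system_load": 6.2, "cpu_percent": 95.0, "process_count": 120}
message = page_message("web1", find_anomalies(sample))
print(message)
```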
To ensure that the customer data remains intact, the entire contents of the database servers are backed up on a daily basis. These backups permit the Mjuice Web site to be restored to a fully operational state with minimal loss of data.
The Mjuice Web site has been modeled on several test servers that reside in a separate geographical location. Although the capacity of the test system is only a fraction of the production site, the test system can be configured to replace the live system temporarily. The Web site would remain available to customers on a limited basis until the production site could be restarted.
Finally, the power for the distributed server architecture is maintained by the data center. The data center provides triple redundant power backup and a complete Halon fire protection system for the entire building.
Q: Where does the Web site live?
A: The actual static and dynamic Web content is located on the Web servers. These servers are FreeBSD machines that are co-located in the Frontier GlobalCenter data center in Sunnyvale, CA.
Q: Does the music live on the same machine as the Web site?
A: No, the music, previews, and all other media content live on the media content servers. These servers are also FreeBSD machines co-located at the Frontier GlobalCenter data center. However, a separate load balancing switch manages the media content machines, so they are actually on different network segments.
Q: Does the music live on the database?
A: No, the music is served off of the media content servers. The database only contains customer account information, information necessary for the Mjuice proprietary encryption software, and peripheral information associated with the songs and artists, such as artist mailing list and Web site information.
Q: How can I tell how busy the Web site is?
A: The Web site statistics are displayed in two locations. Both locations can be viewed at the URL http://barbrady.mjuice.com/. [This site has been removed - ed] This site is only visible from Mjuice, not to other machines on the Internet. The link to the server monitor page will display a chart describing the state of several of the Mjuice servers, including the production Web site servers. The link to the bandwidth utilization page displays a graphical history of the incoming and outgoing bandwidth traffic handled by several of the Mjuice servers, including the production Web site servers.