Responses to Questions from GTE

Comments and questions to gsager@engr.sgi.com

1.0 Introduction

2.0 Questions

2.1 AT&T

2.2 Please describe how streams are generated and the number of streams per title that are supported.

Streams are generated by reading from disk and writing to the OC-3 ATM cards. The program that does this is called the MDS (Media Delivery Service or MPEG Delivery Service). The ATM cards meter cells onto the network at a precise rate, according to the requirements of the media stream. The MDS reads data from the disk and feeds the ATM cards at a rate that keeps their buffers primed. There is one MDS on every server that provides streams.

The MDS is a highly optimized real-time program; it schedules its reads and writes to provide uninterrupted constant-rate streams to large numbers of clients. The MDS uses as many processors as necessary to deal out the number of streams demanded of it. Currently, we get about 100 streams (at ~4Mbps) per 150MHz processor.

The MDS implements "VCR tricks" (pause, fast forward, fast reverse, etc.) for the streams it serves. It does this in a manner that ensures viewers exercising the VCR features do not place a load on the system that interferes with stream delivery to other viewers.

The choice of which MDS (hence, which server, bus and disk(s)) provides a given stream is determined at the time the stream is "opened" by a request to the Mediator. There is one Mediator for every cluster of servers that cooperate to provide one or more application services (e.g., VOD, NVOD, or both). The Mediator is a very simple program that executes very quickly and does not present a bottleneck. In case of failure, a new instance of the Mediator will automatically start up on another server node.

Note that the names "MDS" and "Mediator" will change in release 2.0, to reflect improved and more general implementations based on our favorable experience with the approach followed in releases 1.0 and 1.1.
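As a rough illustration of the buffer-priming discipline described above (a sketch only, with invented constants; this is not the actual MDS implementation), each stream's delivery buffer drains at the stream rate and is scheduled for a disk read before it can run dry:

```python
# Hypothetical sketch of per-stream buffer priming; the rates,
# buffer size, and threshold are illustrative, not MDS internals.

STREAM_RATE_BPS = 4_000_000 // 8      # ~4 Mbps stream, in bytes/second
BUFFER_BYTES = 512 * 1024             # assumed per-stream delivery buffer
REFILL_THRESHOLD = BUFFER_BYTES // 2  # schedule a read when half drained

def bytes_drained(elapsed_s: float) -> int:
    """Bytes the ATM card meters onto the network in elapsed_s seconds."""
    return int(elapsed_s * STREAM_RATE_BPS)

def needs_refill(fill_level: int) -> bool:
    """A stream is scheduled for its next disk read once its buffer
    falls below the refill threshold."""
    return fill_level < REFILL_THRESHOLD

# A freshly filled 512KB buffer at ~4 Mbps holds roughly one second
# of media, so reads must be scheduled well within that window.
seconds_of_media = BUFFER_BYTES / STREAM_RATE_BPS
```

The point of the sketch is the real-time constraint: the read schedule, not the read itself, is what guarantees uninterrupted delivery to every client.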
In release 2.0, these approaches will be applied to how we deal with assets other than constant bit rate streams.

The precise number of streams that can be pushed through an OC-3 depends on details such as overhead due to the encapsulation technique used and "breakage" due to leftover capacity of less than a whole stream. As a rough rule of thumb, figuring 100Mbps payload per OC-3 and dividing by the encapsulated MPEG rate gives a reasonable estimate of OC-3 capacity. We have used very large AAL-5 encapsulations for Orlando, so the encapsulated rate is very nearly the same as the stream rate. However, the AAL-5 encapsulation standard (which we support for Tokyo) represents higher overhead. We assume that the standard is not an issue for GTE, so simply dividing the stream rate into 100 gives a good estimate of streams per OC-3. Given 8 OC-3 cards per server and 3Mbps streams, this yields [100/3]*8 = 264 streams.

In practice, we have served 350 streams from disk onto the OC-3 cards at 3Mbps on a server configured with 11 OC-3 cards. This is not a generally recommended configuration, as it limits the number of SCSI buses that can be plugged into the server. We conduct these experiments to identify potential bottlenecks; so far, our experience shows linear behavior in all aspects of the system, so it appears that the limits of the Challenge server are determined by the number of bus slots available rather than by system bus bandwidth. Further scaling by adding servers is linear, since servers do not contend with each other.

The number of streams from disk subsystems is a much more complex problem. Based on 3.5Mbps streams and 2GB disks, we use these rules of thumb: 6 streams per disk (no stripe), 17 streams per 4-way stripe, and 27 streams per SCSI bus with two 4-way stripes. The recommended stream loading for the 4-way stripes allows for a certain amount of background "system" activity and thus does not represent the maximum number of bytes per second actually possible.
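The rule-of-thumb arithmetic above can be checked with a short calculation (the figures are the rough estimates quoted in the text, not measured limits):

```python
# Rough capacity arithmetic from the rules of thumb above; all
# figures are the estimates quoted in the text, not measured limits.

def streams_per_oc3(stream_mbps: float, payload_mbps: float = 100.0) -> int:
    """Whole streams that fit in one OC-3's usable payload,
    assuming encapsulation overhead is negligible."""
    return int(payload_mbps // stream_mbps)

def streams_per_server(stream_mbps: float, oc3_cards: int) -> int:
    """Network-side capacity: streams per card times number of cards."""
    return streams_per_oc3(stream_mbps) * oc3_cards

# The example from the text: 8 OC-3 cards, 3 Mbps streams.
capacity = streams_per_server(3.0, 8)   # [100/3]*8 = 33*8 = 264
```

Note that this gives the network-side bound only; the disk-side rules of thumb (streams per stripe and per SCSI bus) bound the other half of the configuration.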
We recommend the use of 4-way stripes, with one or two 4-way stripes per SCSI bus, as a reasonable compromise among the wide variety of configuration options possible. This is based on a combination of analysis, experiment and experience; some of this is covered in the attached memo Disk Subsystem Performance [Reference 1]. We highly recommend reading this memo, as it indicates that the approaches being taken by our competition are in many cases suboptimal, and it motivates our current emphasis on providing powerful placement and configuration tools to make the best use of system resources.

Recommended "Rules of Thumb" for 3Mbps streams and 4GB disks:

· 100 streams per 150MHz CPU
· 33 streams per OC-3
· 15 streams per 2-way stripe
· 30 streams per SCSI bus (two 2-way stripes)
· Apply the above until there are no more bus slots

We have not yet had the opportunity to conduct extensive tests with the newer disks; however, we are confident that we can meet or exceed the recommendations above.

2.3 What video server and configuration are you proposing?

See the RFP response.

2.4 AT&T/IDS

2.5 What subscriber applications are being included (such as VOD, NVOD, etc.) and will GTE have the capability to modify them? Please address on an individual application basis.

IDS provides an NVOD application and has a VOD application working for the AT&T DVHT set-top; our plan is to use these with minimal modifications for the GTE deployment. The DVHT implementation is described in Reference 3 and Reference 4. The VOD application leverages much of the NVOD implementation (in fact, NVOD is in many ways more complex than VOD), so our discussion of NVOD applies in large part to VOD. Both NVOD and VOD can be demonstrated in Mountain View today, from both the viewer and operator perspective.

The NVOD and VOD offerings can be "customized" through:

· Screen layout, button labels, and other on-screen text
· Screen backgrounds (full-motion MPEG streams), including a "barker" channel background
· Movie listings (including synopsis)
· Movie "genres" (i.e., the categories into which movie listings are divided)
· Movie schedules (play times)
· Service packages and pricing, on a time-of-day basis

For example, a "basic rental" might be non-recordable and offer no motion control. A "premium rental" might be recordable, have motion control, and allow replaying any time in a 24-hour period. Prices can also be varied based on movie class and time of day. By these means, a service provider can create a unique identity for their service that the viewer will associate with the service provider as well as the content.

For example, the VOD application developed for the NTT Tokyo trial will be used to present cooking shows as well as feature movies. The NTT version of VOD is more elaborate than the service provided by the CableVision VOD, since the NTT set-top is much more capable; however, it serves to illustrate our direction in providing highly customizable applications.

If GTE desires more control over the NVOD and VOD applications than is provided by the customization features, IDS can negotiate NRE for enhancements, or source licensing and support terms that would enable GTE to undertake enhancements.

2.6 What CO operator applications are being provided and will GTE have the capability to modify them?

The NVOD and VOD applications include operator interfaces that permit easy control of the applications. For example, an NVOD schedule is designed using a graphical operator interface; the completed design is then used to drive automatic set-up of the menu screens, server resources, service schedules and network resources necessary to implement the desired presentation of movies.

The servers also define interfaces to a Subscriber Management System (SMS). Our SMS interface consists of schema definitions. We provide a GUI for operator access to the tables we maintain.
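As an illustration of the kind of package and pricing customization described above (the field names, prices, and time-of-day rule here are invented for this sketch, not the actual IDS configuration schema):

```python
# Hypothetical illustration of time-of-day service packages; field
# names, prices, and the off-peak rule are invented for this sketch,
# not the actual IDS customization schema.

from dataclasses import dataclass

@dataclass
class RentalPackage:
    name: str
    recordable: bool
    motion_control: bool       # pause / fast forward / fast reverse
    replay_window_hours: int   # 0 = single play only
    price_cents: int

def price_for(package: RentalPackage, hour_of_day: int) -> int:
    """Example time-of-day rule: a small off-peak discount."""
    off_peak = hour_of_day < 18   # before 6 PM, say
    return package.price_cents - (50 if off_peak else 0)

# The two example packages from the text: basic vs. premium rental.
basic = RentalPackage("basic", False, False, 0, 299)
premium = RentalPackage("premium", True, True, 24, 499)
```

The design point is that these attributes are operator data, not application code: changing a package changes tables, not software.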
If integration with the customer's external SMS is desired, code must be provided on the server to import and export this data, and the SMS must be customized to support the data implied by the schema definitions. By this means, a change to SMS information would be reflected to the viewer in a very short time, and viewer-settable parameters would be reflected back to the SMS in a short time.

We have not provided for customization of these facilities. If GTE desires more control over these features, IDS can negotiate NRE for enhancements or source licensing terms that would enable GTE to undertake enhancements.

XXX AT&T to provide additional information regarding other "operator applications".

2.7 Please describe your systems, methodology and controls for software releases and distribution, including upgrades.

XXX AT&T/IDS

2.8 Please describe how the media server supports an EPG/Navigator function in relation to being downloaded to the set-top box?

There are some possible issues here with interpretation of terms. We use the term EPG (Electronic Program Guide) for non-interactive presentation of program guide information; this feature is often provided through a "barker channel". The term IPG (Interactive Program Guide) is used for an EPG that provides some interactivity, such as viewer control of scrolling; this feature is often provided by streaming data into the set-top at a low rate and caching it so that a complete program guide is (almost) always available. In addition, the scope of EPG and IPG information is limited to what is provided by a service licensed to the cable operator (for example, the Prevue Channel); this means that detailed information regarding other programming (such as VOD or NVOD) is obtained by navigating to the service and making use of its interfaces.

In the following discussion, we assume GTE has the rights to receive and use program guide data from one of the information providers.
Based on the information we have and on our experience, we foresee using one of two possible approaches. After addressing those approaches, we will give some discussion regarding the more encompassing Navigator issues.

1. Our understanding is that the GI box will support an IPG function provided by Starsight. This is typically done by special equipment that injects IPG information into the VBI. The set-top caches the IPG information and has a built-in (ROM or non-volatile store) program that provides a user interface to browse the cached information. Based on this, the built-in function can be used by providing the appropriate Head-End equipment associated with the Starsight service. Note that this equipment also provides for the download or upgrade of the IPG application as well as the IPG data.

2. In the case that the GI set-top cannot be equipped with an IPG function, an EPG function can be provided using a barker channel. This is implemented as an analog broadcast channel.

Note that the solutions above do not require work or changes on the part of IDS; they use "off the shelf" products and services common in the cable industry today.

We do not see integration of IPG/EPG with NVOD (for example, viewing the NVOD schedule using IPG) as a necessity, since the NVOD and VOD applications already present a viewer interface for movie selection based on the available movies and scheduled plays. Thus, it is simply necessary to provide the means for the viewer to navigate to the NVOD or VOD "channel" and then use the application itself to complete the navigation. The application interface already deals with commonly desired features such as parental controls and presentation of pricing options. If presentation of the NVOD schedule as part of IPG is desired, IDS can scope the work with the preferred IPG provider. Note that we already have experience interfacing IPG providers to our systems in Orlando.
Note: see also Section 2.9 and Section 2.10 for other information that augments the above discussion.

2.9 Please describe how applications would be downloaded from the media server to the DCT-1000 to be executed.

There are several preferred approaches to this problem; the choice we make depends on more detailed information regarding the set-top, the provisioning of the network, and the network topology. We understand that the GI set-top hardware and software support a networking protocol that will allow us to use some of our existing download-related services with minimal modification. The amount of set-top memory available for applications is small (we do not yet have the exact numbers), and the network will be provisioned such that the (share of) forward bandwidth available to a set-top for downloading is small, making downloading of large application modules (code, data or images) impractical from a response time standpoint. Given this information and these assumptions, there are several non-exclusive approaches we would take to this problem:

1. Since the number of applications is small, our preferred approach will be to keep executables in non-volatile memory. If this is possible, we still need to worry about initial loading and updating of information in the non-volatile memory.

2. Downloads and updates should be concentrated as much as possible at set-top and television power-up time, so we can focus on one area to "cover" (see below), rather than having to provide cover for a variety of situations.

3. As a means of keeping downloads small, a small scripting language, based on the CableVision implementation, allows us to implement compact set-top applications that can be downloaded quickly. The approach we finally adopt will almost certainly include some form of this option, as it allows the greatest leverage of our CableVision experience. Ideally, the interpreter would reside in non-volatile memory or would be downloaded at power-on.

4. Encapsulate data in an MPEG transport stream. We have experience with this approach in the CableVision system.

See Reference 5 through Reference 8 for information regarding approaches taken with respect to the DVHT. Many of these approaches should apply at the higher levels of the implementation, relatively independent of the details of the underlying networking and protocols.

In any of the situations where downloads (or updates) do occur, a comprehensive scheme will have to include some means to "cover" any delays and prevent viewer frustration that would result in loss of interest or calls to customer service. We have found that close attention to cover can sometimes mask poor response time and improve the viewer's satisfaction.

Once we have had the opportunity to consult more closely with GI on their set-top capabilities and to consult with GTE regarding the detailed approach to network provisioning and topology (especially with respect to issues regarding how bandwidth is shared among multiple set-tops), we can narrow down the above set of approaches to a precise specification of what will be done. Our initial response is based on the judgement that, although we do not have the complete knowledge necessary to give a precise answer, a practical approach that leverages our past experience with several networks and set-tops is possible with a small engineering effort.

Note that the CableVision DVHT has a less capable processor (a 186 at 27MHz), less memory (64KB flash and 64KB RAM for applications), a less capable OS (pSOS) and lower bandwidth Aloha signalling, yet the applications are implemented almost exclusively in C and our scripting language; hence our confidence in leveraging the work. Although the GI set-top is more capable than the DVHT, undertaking a new implementation approach would greatly increase risk and expense without necessarily representing a substantial improvement.

The implementation of a VOD/NVOD client depends upon:

1.
the ability to store a small program in the set-top, to download it very quickly as part of power-on, or to download it very quickly when the viewer navigates to the "channel" representing the VOD or NVOD service. The current DVHT client implementation fits in 64KB in its "idle" state and uses less than 128KB in its "active" state.

2. the ability to update the application if it is stored on the set-top.

3. provisioning of a small number (~8) of low bit rate (1.5Mbps) cyclic MPEG broadcasts to provide full-motion backgrounds and a small amount of data consisting of menu items, icons and scripts to be used by the NVOD/VOD application.

The DVHT VOD/NVOD interface operates by switching the viewer among the low bit rate MPEG broadcasts as the viewer makes navigational choices. Once tuned to an MPEG broadcast, the set-top displays the background movie and picks up the associated data and scripts in the data portion of the stream; thus, every viewer tuned to one of these channels sees the same background movie and gets the same data and scripts. The scripts cause menus and highlights to overlay the background; from that point, the menu presentation and highlighting change in response to viewer inputs. Certain viewer inputs cause navigation to a new MPEG navigational stream or to a selected movie. The contents of the navigational streams are controlled by the service operator. Customization is done by selection of background movie material and by modifications to the scripts. The NVOD operator interface automates the attachment of the proper menu information to the backgrounds and scripts to reflect the feature movie offerings and schedules.

Depending upon customer demand, we will continue to create applications for DVHT-class set-tops. Providing a very rich application set will ultimately require upgrades to the set-top environment. We would like to consult with GTE on the future design, implementation and provisioning of set-tops and networks for richer application sets.
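The stream-switching navigation described above can be sketched as a table mapping viewer choices to navigational streams (the stream and choice names below are invented for illustration; the real mapping is carried in the operator-controlled scripts, not hard-coded):

```python
# Hypothetical sketch of the DVHT-style navigation model: each
# navigational "channel" is a low bit rate cyclic MPEG broadcast
# carrying a background movie plus menu data and scripts; viewer
# choices switch the set-top among streams or start a feature.
# All stream and choice names below are invented for illustration.

NAV_STREAMS = {
    "main_menu":   {"comedy": "comedy_menu", "drama": "drama_menu"},
    "comedy_menu": {"back": "main_menu", "select": "PLAY_FEATURE"},
    "drama_menu":  {"back": "main_menu", "select": "PLAY_FEATURE"},
}

def navigate(current_stream: str, choice: str) -> str:
    """Return the next navigational stream, or PLAY_FEATURE when the
    choice starts a movie; other inputs stay on the current stream
    (the scripts just move the on-screen highlight instead)."""
    return NAV_STREAMS[current_stream].get(choice, current_stream)
```

Because every set-top tuned to a stream sees the same broadcast, per-viewer state lives entirely in the set-top's highlight position and current tuning, which is what keeps the client small.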
We have developed (in conjunction with AT&T and Time Warner Cable system architects) a set of recommendations that should provide an economical way to initially provide services requiring high-end set-tops while ensuring flexible growth in response to viewer acceptance and use patterns.

2.10 Please describe how the media server system supports an EPG/Navigator function from the STB perspective, particularly downloading data to the STB.

This question should be covered by the responses in Section 2.8 and Section 2.9. If applications provided by other parties require access to the same interfaces and services we use, they will be made available.

2.11 AT&T

2.12 AT&T

2.13 There is no question 13.

2.14 SGI

2.15 AT&T

2.16 Please provide and explain the latency numbers and performance characteristics of the media server.

There are several interesting scenarios for the server component of response time characteristics:

1. Open and play a movie: the time for this operation is dominated by the time to fill a new buffer of media for delivery. If there is no queue of requests, this takes approximately 75 milliseconds (assuming a 2-way stripe and 512KB buffers). In the worst case, there could be a number of reads queued up and it could take up to 1.3 seconds to fill the buffer; 650 milliseconds would be the expected time for a loaded system. The time to traverse the file system structure leading to the movie data (i.e., doing the pathname lookup) is less than 100 milliseconds, since the pathname depths and directory sizes are chosen to reduce disk accesses rather than to enhance readability by humans (these names are not something an operator or developer often deals with anyhow). Furthermore, the access to the structural information benefits from file system caching (see #7), so there is a high probability that the access time is much less than 100 milliseconds. Therefore, an estimate of 700 milliseconds for a loaded system is reasonable.

2.
Seek in an open movie: as with opening a movie (see #1), the time is dominated by filling the first buffer; since the movie is already open, 650 milliseconds would be the expected time for a loaded system.

3. Play through a sequence of movies: when a group of movies (e.g., previews) has been installed with the "play through" property, they can be played seamlessly in any sequence; the server must be given lists of all possible sequences, and when playing a sequence it will handle the job of filling buffers across the sequence in time to deliver them back to back. This feature is useful for commercial insertion, as it is possible to create a sequence by inserting one movie inside another (as long as one observes certain boundary constraints imposed by MPEG).

4. Seek to another movie in a sequence of movies: a sequence of movies can be treated as a single movie, and seeks to a specified time in the sequence will occur with essentially the same performance as seeking in an open movie (see #2); this is because movies in a sequence are already "open" but are not buffered. Seeking to the next movie in a sequence is often faster than a general seek, since there is some chance the system is already reading the buffers for the next movie. This type of seek is often performed when the viewer makes a navigational choice.

5. VCR controls: the time for the server to stop one mode of delivery and begin another (say from normal to fast forward, or normal to stop) is the same as a seek in a sequence (see #4), as the media to provide the other mode is treated as though it and the feature represent a "sequence".

6. Play an already opened movie: applications can open a movie without starting play; when a request is later made to play the movie, the play can start in a few tens of milliseconds if the open has had time to complete to the point that the first buffer is ready.
This is useful when the application can anticipate the play; for example, if the viewer is in the process of buying a movie in VOD.

7. Download an executable, graphic, font, etc.: these types of data are treated as UNIX files and therefore have the same access times and characteristics one expects of files. The worst case may well occur on lightly loaded systems, as the file data and the file system structural information regarding the file are less likely to be in memory from a previous access; this situation requires several disk accesses with short reads before the data can begin delivery. A reasonable estimate here would be 100 milliseconds. As the system gets more loaded, commonly used files and file system structure information are more likely to be cached, and response time for them will improve without penalizing requests for non-cached items by queueing up disk reads. For most services, the amount of file system cache required to obtain this effect will be small in relation to the total memory required for all purposes.

8. Restart a movie in case of a failure: if the MDS on the server providing a movie fails, the client will detect a problem in a fraction of a second because the decoder queue will empty. The first action is to ask the server to seek in the movie to the point at which the problem occurred and restart delivery (note that only the application on the set-top knows this point precisely); since the restart point is likely to still be in the server's buffer, the server response time for this request would be a few tens of milliseconds. If the server has actually failed, within a couple of seconds the application client informs the viewer that the problem is being dealt with and retries by re-establishing its connection to the Mediator (in case it, in addition to the MDS server, failed), re-opening the movie, seeking to the point where the viewing was interrupted and restarting the display.
If there is a hard server failure, recovery is possible only if another server has a copy of the movie with enough play capacity to permit a successful open of the movie; in this case, recovery should occur in less than 15 seconds. If there is a soft server failure which is cured by an automatic restart and the only available copy of the movie is on the failed server, recovery will take 5-10 minutes.

Looking purely at server response time, we note that many of the actions required of the servers in this context are like those of a large transaction processing system; since the Challenge servers perform very well on transaction processing benchmarks that stress both capacity and response time, we are confident that we will be able to handle a high transaction rate and can continue to tune the system for better performance as experience is gained.

In many areas of our system, we design for predictability rather than for simple optimality. This approach leads to solutions that tend to be more optimal in a systemic view, because one can (usually) drive a system designed for predictability closer to its limits without incurring a degradation in quality for any customer, while a system that has "optimal" but high-variance behavior must leave greater margins to avoid an unacceptable probability of perceptible quality degradation. As an example of this, we have designed and implemented the system such that the load placed on it by VCR control operations is approximately the same as that of serving the feature movie stream, so we can be more confident that VCR control operations on one stream (or many streams) will not perturb the delivery of other streams.

Our APIs provide, and our applications make use of, several schemes to deal with latency problems in all aspects of the system, from the remote control to the server systems; it is important to do this, since it is often less expensive to reduce the perception of response time than to reduce the actual response time.
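The failure recovery sequence described in #8 above amounts to a retry ladder, from the cheapest action to the most expensive. A simplified sketch (the function names are placeholders for the actual client calls, which also handle the timing and viewer messaging described above):

```python
# Simplified sketch of the client-side recovery ladder from #8;
# method names are placeholders, not the actual client API.

def recover(stream, position):
    """Try the cheapest recovery first, then escalate."""
    # 1. Cheapest: ask the same server to seek back to the
    #    interruption point (likely still in its buffer), which
    #    should complete in a few tens of milliseconds.
    if stream.seek_and_restart(position):
        return "resumed"
    # 2. Server (or Mediator) failure: reconnect to the Mediator,
    #    reopen the movie, and seek. This succeeds only if another
    #    server holds a copy with spare play capacity (< 15 s).
    session = stream.reconnect_mediator()
    if session and session.open_and_seek(stream.movie, position):
        return "failed_over"
    # 3. Otherwise, wait for the failed server's automatic
    #    restart (5-10 minutes in the worst case).
    return "waiting_for_restart"
```

Only the set-top application knows the precise interruption point, which is why `position` flows from the client in every rung of the ladder.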
We feel it is important to take a systemic view of the response time problem, so we will give a brief outline of several design principles we suggest to application developers, then go into more detail on issues related to the "internals" of the server.

1. Provide immediate feedback to the user to indicate the input was received; this must be done locally by the set-top. If the input is a change to a new application, the most appropriate feedback is for the system to blank the screen.

2. Provide timely feedback to the user to indicate that the action being taken is producing the desired result. In the case of a change to a new application, the initial download should provide "cover" to assure the viewer and maintain interest while the larger body of the new application is downloaded.

3. Take care to ensure that all upstream requests that are response time critical can be transmitted in the shortest possible time; for example, in a TDMA upstream scheme, try to make the request fit into a single TDMA slot.

4. If possible, give the servers information to indicate when media "clips" tend to be closely related in access time.

5. If possible, tell the server in advance that a large movie clip will be starting soon.

From a systemic view, it is also important to consider the remote: many remotes require 150-200 milliseconds for the set-top to interpret a button push. The Orlando remote was redesigned to operate in 50-70 milliseconds; better times are probably possible.

3.0 Referenced Documents

Copies of these documents are provided as supporting material in response to the questions.

1. Disk Subsystem Performance. A comprehensive treatment of considerations for configuring disk subsystems for delivery of constant bit rate streams in an interactive multimedia environment.

2. Interactive Community Software Architecture and Unity 8 Specification.
An overview of the software architecture and implementation of Unity 8; although it is based on the Orlando set-top, the discussion of server features is largely relevant to all IDS deployments.

The following documents are early specification and requirements documents for the VOD/NVOD application. The actual implementations differ in minor details, but these documents serve to give a good general overview of the applications and how they are supported in the network. Note, however, that these documents do not present modifications and improvements that would be made for the GTE deployment.

3. EPPV/VOD Client Application. Describes the DVHT side of the NVOD and VOD applications.

4. Design Specifications: Enhanced Pay Per View (EPPV). Covers the server side of NVOD and VOD.

5. DVHT Authoring. Describes the authoring environment for the DVHT resident interpreter.

6. Data Delivery Service Design Specifications. Describes how data is delivered to the DVHT via in-band MPEG streams.

7. PROM Upgrades. Describes how flash-resident DVHT system software and applications are upgraded.

8. DVHT Communications: Messaging and Program Delivery. An overview prepared to serve as a basis for constructing a test plan and tests of the networking components.