Packet telephony is revolutionizing the world’s telecommunications infrastructure, opening the way to previously unavailable communications services. Many of these services are built on a media server or value-adding service platform. But IP-only media servers run the risk of stranded investment if next-generation networks are deployed more slowly than predicted, and PSTN-only media servers are losing out in procurement decisions. This has led to increasing investment in media servers that support both legacy and next-generation networks: the dual-network media server (DNMS). This white paper addresses how a DNMS is most effectively designed to minimize cost and time-to-market while producing a system with the media and service flexibility demanded in times of regulatory, standards, technology, and investment uncertainty.

Dual-Network Media Server

Cost and flexibility depend, of course, on how the system is designed. Low-function telephony middleware can add greatly to the system’s application-development cost; high-function telephony middleware can make the target network transparent to the server application, making DNMS application development no more costly than development for a single network. Fixed-function media-processing resources can add greatly to the system’s hardware cost if PSTN and IP networks are supported by separate resources. But flexible-function media-processing resources, although marginally more costly than fixed-function resources, can hold the incremental cost of dual-network support well below the increase in system value.

Telephony Middleware

High-function telephony middleware products hide the network type from the application by exposing a network-neutral call-control API. The API packages call-control commands into packets that are forwarded to the System Call Router (SCR). The SCR, as shown in the diagram above, routes commands to the proper network-interface resource based on routing rules, such as the destination address. Through a resource-management scheme, the SCR also informs the system’s media-processing resource providers of the network selection so that they can create a resource graph that supports the network choice. All of this is transparent to the application.
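The routing behavior described above can be illustrated with a minimal sketch. It is not the API of any particular middleware product: the class names, the `sip:` prefix rule, the PSTN default, and the `notified` list (standing in for the resource-management notification) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class CallCommand:
    """Network-neutral call-control command issued by the application."""
    action: str        # e.g. "place_call"
    destination: str   # e.g. "sip:alice@example.com" or "+15551234567"


@dataclass(frozen=True)
class RoutingRule:
    """Hypothetical rule: if matches(destination) is true, use `network`."""
    matches: Callable[[str], bool]
    network: str       # "IP" or "PSTN"


class SystemCallRouter:
    """Toy SCR: picks a network per command and records the choice."""

    def __init__(self, rules: List[RoutingRule], default_network: str = "PSTN"):
        self.rules = list(rules)
        self.default_network = default_network
        # Stand-in for informing media-processing resource providers.
        self.notified: List[Tuple[str, str]] = []

    def route(self, command: CallCommand) -> str:
        network = self.default_network
        for rule in self.rules:          # first matching rule wins
            if rule.matches(command.destination):
                network = rule.network
                break
        self.notified.append((command.destination, network))
        return network


# Assumed rule set: SIP URIs go to the IP network, everything else to the PSTN.
rules = [RoutingRule(lambda dest: dest.startswith("sip:"), "IP")]
scr = SystemCallRouter(rules)
```

The application only ever builds a `CallCommand`; which network carries the call is decided entirely by the rule set handed to the router.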

Media-Processing Resources

Media flexibility is based on two fundamental requirements: media-processing resources must be capable of dynamically binding any media-processing technology to any call stream at any time, and the processing resources must be provisioned to achieve the desired level of system service. The usual practice, especially in smaller enterprise systems, is to provision the system for the non-blocking worst case. But higher-capacity systems allow the designer to provision the system’s media-processing MIPS statistically, provided the system middleware supports this approach.
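As a rough illustration of the difference between worst-case and statistical provisioning, the sketch below uses a simple binomial model. The model is an assumption, not something the paper specifies: every port is busy, each call independently needs a high-MIPS function with probability `p_high`, and all MIPS figures used with it are hypothetical.

```python
from math import comb


def worst_case_mips(ports: int, high_mips: float) -> float:
    """Non-blocking worst case: every port runs the most expensive function."""
    return ports * high_mips


def statistical_mips(ports: int, p_high: float, high_mips: float,
                     low_mips: float, overflow_target: float) -> float:
    """Smallest MIPS budget whose probability of being exceeded is at most
    overflow_target, assuming independent calls (binomial model)."""
    tail = 1.0  # becomes P(K > k) after subtracting pmf(k) below
    for k in range(ports + 1):
        pmf = comb(ports, k) * p_high**k * (1 - p_high)**(ports - k)
        tail -= pmf
        if tail <= overflow_target:
            # Budget covers k high-MIPS calls plus (ports - k) low-MIPS calls.
            return k * high_mips + (ports - k) * low_mips
    return ports * high_mips


# Hypothetical figures: 480 ports, 20 MIPS vs 2 MIPS per call, 30% high-MIPS calls.
worst = worst_case_mips(480, 20)
stat = statistical_mips(480, 0.3, 20, 2, overflow_target=0.01)
print(f"worst case: {worst} MIPS, statistical (1% overflow): {stat} MIPS")
```

The gap between the two figures grows with port count, which is consistent with the observation that statistical provisioning pays off mainly in higher-capacity systems.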

Mixed-Network Voice Server

The argument is often advanced that cost-per-port is the overwhelming selection criterion, and that fixed-function resources will always be lower in cost than flexible-function resources. But the question is cost-per-port of what: voice, fax, data, or video? Increasingly, carriers and enterprise network managers are moving to procurements that delay the media and network-support decision to the point of configuration or, as discussed here, all the way to call placement or acceptance. For this to happen, the equipment designer must have control over both the media-processing hardware and software resources. A fixed-function resource is created whenever the equipment developer loses control of the technologies designed into the system to support media processing, or does not implement a system architecture that supports media flexibility.


Voice processing for the PSTN has lower compute-power requirements (MIPS) when compared with packet telephony. But PCM-based systems suffer a cost-to-scale penalty compared with packet-telephony systems. Dual-network systems, therefore, must bear the costs to support this marginal increase in signal- and packet-processing resources for packet telephony, and the PCM-switching resources to support the PSTN.

For example, for a media server to play a voice message over the PSTN, it must source the voice file to be played (or the text message, if text-to-speech is to be employed). The message is then converted to the desired voice-coding (vocoder) format and placed into the proper time-division-multiplexed (TDM) stream and time slot. For the same message to be played over a packet network, the message source is accessed in the same way, but the vocoder used frequently requires greater processing power and must be selected on a per-call basis. The coded voice data are then inserted into packets (packetized) by a network-processing resource. So the TDM hardware required for the PSTN connection is traded for a packet-processing resource and network interface in the case of a packet network.

But that is the only cost penalty that should be tolerated by the designer. As diagrammed in the figure above, capable telephony middleware products allow application-level development to proceed without regard to the specifics of supported networks. The application must determine whether a voice message must be played, but the SCR routing information will cause the system to configure the media resources required to support the destination network.
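The two playback paths described above can be sketched as a resource graph that differs only in its final stages. This is a minimal illustration: the stage names and the function are hypothetical, and the vocoder names in the usage lines are merely examples of a per-call choice.

```python
from typing import List


def playback_graph(network: str, vocoder: str) -> List[str]:
    """Return the ordered processing stages for playing a voice message.

    Stage names are illustrative labels, not real middleware identifiers.
    """
    # Common to both networks: fetch the message, then encode it.
    stages = ["source_message", f"encode:{vocoder}"]
    if network == "PSTN":
        # PSTN: place the coded audio into a TDM stream and time slot.
        stages.append("tdm_timeslot_insert")
    elif network == "IP":
        # Packet network: packetize, then hand off to the network interface.
        stages += ["packetize", "network_interface"]
    else:
        raise ValueError(f"unknown network: {network!r}")
    return stages


# The application asks for the same playback either way; only the
# network chosen by the SCR changes the tail of the graph.
print(playback_graph("PSTN", "G.711"))
print(playback_graph("IP", "G.729"))
```

Because the first two stages are shared, the application-level request is identical for both networks, which is the transparency the middleware is meant to provide.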


The average Fortune 500 company spends $5 million per year sending faxes; the primary benefit of using data networks to send packet-based faxes is to reduce that expense. There are two ITU recommendations for transporting faxes across IP networks: T.37 and T.38. T.37 specifies how a fax image is encapsulated in e-mail and transported to the recipient using a store-and-forward process. T.38 defines a protocol for transmitting a fax across an IP network in real time, requiring no behavioral changes on the part of the user. A gateway designed to offer the carrier or enterprise a transport for corporate fax traffic that is at least as robust and capable as the PSTN must include T.38. A general diagram of a T.38-based fax transport is shown below.

The ITU has also specified a protocol for robust real-time transmission of fax over ATM in I.366.2.

Real-Time Packet-Based Fax

The media server should handle fax in a manner similar to the way voice is handled. The application knows to send a fax, but need not be aware of the type of network that the call will use. The SCR’s routing rules are used to select the network. For example, in an enterprise application, if the destination is reachable over the enterprise network, the IP facility is used; otherwise, the PSTN resources are called into play.

Mixed-Network Fax-Media Server

The fax system service manager uses the ITU T.30 recommendation for fax exchange. But the two networks require completely separate network-interface resources. The PSTN call requires analog fax modems. The packet-network call requires, for IP networks, a software entity that implements the T.38 protocol; instead of relaying the fax data out to analog modems, as shown above for a gateway, this entity sends the data to, or receives it from, the server’s application, as shown below. This implementation of T.38 is called “terminating T.38” to distinguish it from the implementation found in gateways, which is called “fax relay”.

Unlike voice, where packet support frequently increases processing requirements, terminating T.38 is often implemented on scalar (host) processors, since it is a protocol-processing technology, not a signal-processing requirement.
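The fax case can be sketched the same way as voice: a routing decision followed by the choice of network-interface resource that the T.30 session is bound to. Everything named here is an assumption for illustration, including the prefix table used to decide enterprise reachability, the resource labels, and the "signal" label for the modem path.

```python
from dataclasses import dataclass

# Hypothetical: dial prefixes reachable over the enterprise IP network.
ENTERPRISE_PREFIXES = ("+1555",)


@dataclass(frozen=True)
class FaxEndpoint:
    network: str     # "IP" or "PSTN"
    interface: str   # network-interface resource carrying the T.30 exchange
    processor: str   # kind of processor the interface typically runs on


def select_fax_endpoint(destination: str) -> FaxEndpoint:
    """Pick the network and interface resource for an outbound fax."""
    if destination.startswith(ENTERPRISE_PREFIXES):
        # Terminating T.38 is protocol processing, so it can run on the host.
        return FaxEndpoint("IP", "terminating_t38", "host")
    # Analog fax modems are a signal-processing workload (assumed label).
    return FaxEndpoint("PSTN", "analog_fax_modem", "signal")


print(select_fax_endpoint("+15550100"))   # enterprise destination
print(select_fax_endpoint("+442071234"))  # off-net destination
```

In both branches the application simply asks to send a fax; the T.30 service logic stays the same, and only the interface resource beneath it changes.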


For the foreseeable future, dual-network media servers will be required by both carrier and enterprise network designers. But the cost of the DNMS can be held close to that of the single-network design through the informed and judicious choice of value-adding technologies.

The equipment designer can take advantage of media-processing resource hardware that supports over 2,000 low-bit-rate vocoders on a single CompactPCI board. Telephony middleware software products designed to isolate applications from network specifics are available. Today, the system designer can license a high-density integrated-media software framework. And all the media-processing technology components required to support both networks are offered by value-adding technology vendors.