In this month's panel, we talked about media gateways: What are they? How
do they work, what applications do they enable, and how do we use them to
integrate voice with the IP network? Our panel:
Jerry Gavin, Manager - NMS Sales Engineering for the Americas, NMS
Tung Phan, Engineering Manager, Channel Access
John Jainschigg, Editor in Chief, Communications Convergence Magazine
Keith Dawson, Senior Editor, CommWeb
What Do Media Gateways Enable?
Keith Dawson got the ball rolling by asking the panel to lay out some of
the applications that next-gen media gateways enable: "What, precisely,
distinguishes their 'next-gen-ness'?"
Jerry Gavin of NMS replied: Media gateways give applications the access
they need to deliver enhanced services. These applications can be IVR, fax
messaging, conferencing, ASR, TTS, media streaming and transcoding. The
media gateway terminates the traditional PSTN connectivity (ISDN, SS7,
etc.) and routes the call via a media platform (can I say my own board,
the CG6000?) to the IP network, using various IP call control protocols
(H.323, SIP, MGCP, etc.) and IP transport protocols (TCP/IP, RTP, RTCP and
UDP) to communicate with the media servers that run the enhanced services.
What makes them "next-gen" is an open architecture in both hardware and
software. Closed systems and legacy equipment do not have the density or
scalability to offer all the enhanced services that NEPs and SPs need for
their growing customer bases.
And Channel Access' Tung Phan offered this: Enhanced services are the most
important thing that media gateways enable, and they are easier and
cheaper to design, build, and maintain than in the traditional PSTN.
Some of these services could be:
(a) Messaging services. Not only voice mail as in the typical basic PSTN,
but also email and fax. Users can check, compose, forward and reply to
email, voice mail or fax from any telephone using voice, or have email
converted to speech and played to the user (or an authorized user,
verified by voiceprint) over the phone.
(b) Find Me/Follow Me allows users to be accessible at any time and
anywhere.
(c) Voice portals (like the VoiceCaster from Channel Access), voice
browsers and wireless web access which provide the user with access to web
content through voice commands. This service will evolve and can be
integrated with personal computers, PIMs and mobile phones.
This sparked a question from Junior Jones: Tung says: "...it is easier and
cheaper to design, build, and maintain these services than in the
traditional PSTN networks." My question is why? We're still talking about
wrestling with NMS/Dialogic/Brooktrout/etc. boards whether TDM or IP,
right?
Tung Phan replied: The advantages of a converged communication network lie
in its architecture and platform. Some of these advantages are:
1. Migration of functionality away from proprietary central office
switches to commercial off-the-shelf hardware and software.
2. Utilizing open standards at the hardware level (such as CompactPCI for
packaging or PCI for bus architecture) as well as software level (such as
TCP/IP, XML/VoiceXML, SOAP...).
These key advantages translate directly into lower cost, a larger talent
pool, quicker time to market and, among many other things, easier
deployment. The complexities associated with a central office switch are
well known: they are difficult to configure, and adding a new enhanced
service is a nontrivial exercise. On top of that, (enhanced) services must
be co-located in the central office with the switch, which is a very
expensive solution.
As for your concern about wrestling with NMS/Dialogic/etc., the converged
communication network enables a company like Channel Access to bring to
market a product like the Voicecaster. The Voicecaster is a complete
VoiceXML development environment right out of the box, so users don't have
to spend 4-6 weeks wrestling with the hardware, installing the OS and
drivers, and then trying to configure the VoiceXML development
environment. Instead, with Voicecaster you can be putting the finishing
touches on your product after 4-6 weeks, a direct contribution to lower
cost and quicker time to market.
Burt Crepeault offered these insights: Unless we're talking about the
ability to move away from traditional telephony by using packets instead
of TDM, I'm not sure we can talk about next generation media gateways
quite yet, or can we? I'd like to have that put in perspective with
current generation media gateways.
I tend to see media gateways as "dumb" machines, more or less like digital
cross-connects or Ethernet-to-token ring bridges, simply connecting two
different end-points together as instructed by some external intelligence.
To me, the intelligence (like signaling, billing, etc.) resides in the
softswitch while the value-added functions (IVR, transcoding,
voice-messaging, etc.) are in the media server.
It is true that these functions tend to be combined onto "unified"
platforms and that the differences are becoming blurry (perhaps that is
what can be considered next-gen-ness?), but the functions are clearly
distinct. Therefore, I would tentatively say that media gateways enable
little more than connecting two different telephony worlds together from a
voice perspective only, transforming PCM speech to packets and back,
removing echo and compressing when needed.
What's The Deployment Improvement?
One reader wondered if the panel could expand on the claim of quicker
deployment for media gateways. Are there quantifiable results you can cite
in terms of how much faster these things can go in? You mention 4-6 weeks
-- what's that compared to in old-style deployments?
Tung Phan took up the challenge: Actually, the 4-6 weeks I mentioned was
for setting up the development environment and coding the product. During
this period, resources (in many cases, the software developer) must be
allocated for the following tasks:
a. Installing the OS onto the system, ensuring all devices are in working
order and running stable.
b. Installing and configuring the telephony board with the appropriate
configuration / protocols
c. Installing the speech recognition software and its required components,
again going through the configuration steps for each software package and
making sure that they work together nicely.
d. Coding, finally!
In a large corporation, there are specialists for each of the major tasks
listed above, so there is almost no "wrestling", as Mr. Jones put it, but
the cost is high. In a mid-sized company or start-up, the wrestling
becomes obvious when software developers have to put together their own
development environment. Not that they are unable to, but they were hired
for their skill at writing code, not integrating systems. In the end,
development time is long and the cost is also high.
Back to Basics
One person asked to step back to basic principles: Can someone explain for
the benefit of the less technical among us exactly what a media gateway
"gates" between? That is, where does it fit into the schema of an
integrated voice and IP network?
Burt Crepeault offered an extensive answer: There are three main
components in a VoIP network: the softswitch, the media server and the
media gateway.
The softswitch acts as the intelligence in the network, the database that
contains names and numbers, connects parties together and performs the
billing. Wes was right when stating that it resides on open architectures:
as opposed to its traditional counterparts, the PBX and class 5 telephone
switch, it runs on a workstation or a PC rather than on dedicated
hardware. The softswitch resides entirely in the IP domain.
The second component is the media server. This device is generally
responsible for "value-added" services on a voice network, services like
voice mail, multi-port conferencing, IVR, etc. Like the softswitch, the
media server is an IP device.
Finally, the media gateway (MG) is the device that lets you connect the
VoIP world to the traditional telephony world, which uses Time Division
Multiplexing (TDM) rather than packets to transport the voice. In other
words, the MG will take TDM voice streams and convert them to packets and
do the reverse to bridge the gap between these two worlds. Media Gateways
sit right between the IP domain and the TDM domain, allowing services such
as Net2Phone.com to exist.
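For readers who want to see the TDM-to-packet step in concrete terms, here
is a minimal Python sketch. It assumes 8 kHz, 8-bit PCM (one DS-0) and a
20 ms packet interval; those values, the function names and the dictionary
fields are illustrative assumptions, not anything the panel specified. A
real gateway would also build RTP headers, cancel echo and optionally
compress the audio.

```python
SAMPLE_RATE_HZ = 8000   # one DS-0 carries 8000 one-byte samples per second
FRAME_MS = 20           # assumed packetization interval
SAMPLES_PER_FRAME = SAMPLE_RATE_HZ * FRAME_MS // 1000  # 160 bytes per packet


def packetize(pcm: bytes, first_seq: int = 0):
    """Split a continuous PCM byte stream into numbered, timestamped packets.

    A trailing partial frame (fewer than 160 bytes) is left out.
    """
    packets = []
    for i in range(0, len(pcm) - SAMPLES_PER_FRAME + 1, SAMPLES_PER_FRAME):
        packets.append({
            "seq": first_seq + i // SAMPLES_PER_FRAME,  # lets the far end re-order
            "timestamp": i,                             # sample offset of first byte
            "payload": pcm[i:i + SAMPLES_PER_FRAME],
        })
    return packets


def depacketize(packets):
    """Reverse direction: rebuild one PCM stream from packets in any order."""
    ordered = sorted(packets, key=lambda p: p["seq"])
    return b"".join(p["payload"] for p in ordered)
```

Feeding `depacketize` the output of `packetize` in any order returns the
original byte stream, which is the round trip the gateway performs between
the two domains.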
Where it gets confusing is when equipment manufacturers "bundle" two or
more of these concepts together in a single product (e.g. a softswitch +
media gateway, or a media gateway + media server). I guess the boundaries
aren't yet clearly defined and people are experimenting with different
approaches. One thing is sure: it's much less confusing now than it was
3-4 years ago... :)
Then Bob Massad offered a technical analysis: The media gateway is
designed to take voice coming on to or off of a network and "translate" it
between IP packets and TDM, where the digitized voice samples are placed
in reserved time slots for subsequent transport. For a simple example,
going from IP, packets are received at the gateway interface and fed into
a jitter buffer, which dejitters them (creating a smooth and fairly
constant playout by adding sufficient delay, i.e. turning jitter into
delay) and re-orders any that were received out of sequence (quite likely
in an IP network, where packets do not all take the same route to their
destination). The packets can then be handed off to the codec, which
decodes them into audio. Usually the codec will implement a packet loss
concealment (PLC) algorithm to mask the effects of lost voice energy. The
PLC will often add comfort noise, replay the last packet, or interpolate.
Lost packets turn out to be a major problem on an IP network. Packets may
be lost in the network or be the result of jitter buffer discards. Jitter
buffer discards occur when there is excessive delay in receipt of the
packet, i.e. the packet's delay is greater than the delay added by the
jitter buffer.
The salient point is that the gateway does a lot more than interconnect
two methods of voice transport. Codecs, packet loss concealment and jitter
buffer technology are key attributes or parts of the media gateway
function.
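The receive path described above can be sketched in a few lines of Python.
This is a simplified model under stated assumptions: a fixed (not
adaptive) buffer delay, a 20 ms packet interval, and "replay the last
packet" as the only concealment strategy. The function name `play_out`
and the tuple layout are hypothetical, chosen for illustration.

```python
FRAME_MS = 20  # assumed packet interval


def play_out(arrivals, buffer_ms):
    """Model a fixed jitter buffer.

    arrivals: list of (arrival_ms, seq, payload) in arrival order.
    Returns (frames, discarded): frames is a list of
    (seq, payload, concealed) in playout order; discarded lists the
    sequence numbers that arrived after their playout deadline.
    """
    if not arrivals:
        return [], []
    first_arrival, first_seq, _ = arrivals[0]

    def deadline(seq):
        # Playout is delayed by buffer_ms, then advances one frame per packet.
        return first_arrival + buffer_ms + (seq - first_seq) * FRAME_MS

    buffered, discarded = {}, []
    for arrival_ms, seq, payload in arrivals:
        if arrival_ms > deadline(seq):
            discarded.append(seq)      # too late: jitter becomes loss
        else:
            buffered[seq] = payload    # storing by seq also re-orders

    seqs = [seq for _, seq, _ in arrivals]
    frames, last_payload = [], b""
    for seq in range(min(seqs), max(seqs) + 1):
        if seq in buffered:
            last_payload = buffered[seq]
            frames.append((seq, last_payload, False))
        else:
            # Concealment: replay the previous payload (empty if none yet).
            frames.append((seq, last_payload, True))
    return frames, discarded


if __name__ == "__main__":
    trace = [(0, 0, b"A"), (45, 2, b"C"), (50, 1, b"B"),
             (85, 4, b"E"), (130, 3, b"D")]
    print(play_out(trace, buffer_ms=40))
    print(play_out(trace, buffer_ms=80))
```

In the sample trace, packet 3 arrives at 130 ms. With a 40 ms buffer its
deadline is 100 ms, so it is discarded and concealed; with an 80 ms buffer
the deadline is 140 ms and it plays normally. That is the trade the text
describes: more buffer delay, fewer discards.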
Mike Coffee attempted to tackle the question, and earned kudos for his
analysis: Let's take it one step further. So far, the discussion has been
limited to voice. But do gateway vendors tell their carrier customers:
"Hey, tell your subscribers they can't use this thing for modems, so fax
and dial-up must be on other access points." I don't think so. We've heard
about packet-loss concealment and error-recovery techniques for voice, but
they don't work for data. Try concealing a lost packet from a high-speed
modem.
This means the gateway must detect the type of call and apply the
necessary stream processing resources to the call stream to handle it
effectively. Some gateways try to handle modem calls by switching to G.711
(64K full-duplex), but this is bandwidth-hungry and still doesn't handle
congestion well. Data-modem calls get dropped and faxes get messy or fail.
So, just as every voice gateway has voice-specific stream processing that
depends on the characteristics of the human ear and IP networks, the more
capable gateways that offer robust transport of fax and data-modem streams
use fax-specific stream processing for fax calls and modem-specific
processing for data modems.
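As a rough illustration of detecting the call type and choosing the
processing, the Python sketch below uses a Goertzel filter to look for the
1100 Hz fax calling tone and the 2100 Hz answer tone in one 20 ms frame.
The threshold, the labels and the `PROCESSING` table are assumptions made
for illustration. Note that the 2100 Hz answer tone is shared by fax
machines and data modems, so a real gateway needs further analysis to tell
them apart.

```python
import math

SAMPLE_RATE_HZ = 8000
CNG_HZ = 1100  # fax calling tone
ANS_HZ = 2100  # answer tone used by both fax machines and data modems


def make_tone(freq_hz, n=160):
    """Helper for experiments: n samples of a unit-amplitude sine."""
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE_HZ) for i in range(n)]


def tone_strength(samples, freq_hz):
    """Goertzel filter: share of the frame's energy at freq_hz (about 1.0 for a pure tone)."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / SAMPLE_RATE_HZ)
    s1 = s2 = 0.0
    for x in samples:
        s1, s2 = x + coeff * s1 - s2, s1
    power = s1 * s1 + s2 * s2 - coeff * s1 * s2
    energy = sum(x * x for x in samples)
    return 0.0 if energy == 0 else 2.0 * power / (len(samples) * energy)


def classify_frame(samples, threshold=0.5):
    """Very coarse call-type guess from a single frame (threshold is an assumption)."""
    if tone_strength(samples, ANS_HZ) > threshold:
        return "fax_or_modem"
    if tone_strength(samples, CNG_HZ) > threshold:
        return "fax"
    return "voice"


# Hypothetical policy table: which stream processing to apply per call type.
PROCESSING = {
    "voice": "low-bit-rate vocoder, silence suppression, packet loss concealment",
    "fax": "fax relay (T.38)",
    "fax_or_modem": "no vocoder or silence suppression; G.711 until the type is known",
}

if __name__ == "__main__":
    for hz in (300, 1100, 2100):
        kind = classify_frame(make_tone(hz))
        print(hz, "Hz ->", kind, "->", PROCESSING[kind])
```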
These vendors have the ITU T.38 recommendation for handling fax calls, but
V.MoIP, the comparable standard for modems, won't be "determined" until
later this year. So important interoperability standards are not all
there. This is a useful reminder that this is still an industry and
technology in its infancy.
Quality of Service
Tung Phan: The best algorithms in packet loss concealment and/or error
recovery will not do the job effectively, even when the gateway applies
the necessary resources based on the type of call. As you have pointed
out, congestion is one of the major factors affecting the quality of
calls. The point is that data on an IP network is inherently bursty: no
matter how high the capacity, congestion will always occur for short
periods. This is why QoS is critical in a VoIP network. QoS ensures that
high-priority, delay- and jitter-sensitive traffic gets the nod over
lower-priority traffic.
Certain QoS technologies are being developed to address specific areas of
the network (e.g. DiffServ, Differentiated Services, seems to be a
promising technology for the WAN). A combination of these QoS technologies
(e.g. MPLS, DiffServ, IntServ) may solve the congestion issue and allow us
to improve our VoIP networks.
Gregory Majersky: There are also little "tricks" being developed to shave
those precious milliseconds off of packet transmission such as compressed
RTP and disabling of UDP checksums. These may only be stopgap measures and
not reliable solutions to serious delay problems especially when
residential VoIP comes about, but then again every millisecond counts, and
those few milliseconds can add up when applied over a hundred or a
thousand termination points.
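To put rough numbers on those milliseconds, here is a small Python
calculation. It assumes 20 ms packets, standard IPv4/UDP/RTP headers
(20 + 8 + 12 bytes) and compressed RTP headers of 2 bytes when UDP
checksums are disabled or 4 bytes when they are kept; link-layer framing
is ignored, and the 64 kbps access link is just an example figure.

```python
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP, no options
CRTP_HEADER_BYTES = 2                  # compressed header, UDP checksums off
CRTP_CHECKSUM_HEADER_BYTES = 4         # compressed header, UDP checksums kept
FRAME_MS = 20                          # assumed packetization interval


def stream_kbps(payload_bytes, header_bytes):
    """Bandwidth of one direction of a call, ignoring link-layer framing."""
    packets_per_second = 1000 / FRAME_MS
    return (payload_bytes + header_bytes) * 8 * packets_per_second / 1000


def serialization_ms(payload_bytes, header_bytes, link_kbps):
    """Time to clock one packet onto a link of the given speed."""
    return (payload_bytes + header_bytes) * 8 / link_kbps


if __name__ == "__main__":
    for codec, payload in (("G.711", 160), ("G.729", 20)):
        for label, header in (("full headers", IP_UDP_RTP_HEADER_BYTES),
                              ("cRTP", CRTP_HEADER_BYTES)):
            print(codec, label, stream_kbps(payload, header), "kbps,",
                  serialization_ms(payload, header, 64), "ms per packet at 64 kbps")
```

Under these assumptions a G.729 stream drops from 24 kbps to 8.8 kbps, and
each packet takes 2.75 ms instead of 7.5 ms to leave a 64 kbps link. The
saving is per hop on slow links, which is why it is a refinement rather
than a cure for serious delay.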
Bob Massad responded: QoS measures are really about priority and queue
management and have no direct impact on call quality. RSVP, while pretty
complex, does reserve bandwidth but would ultimately be inefficient as it
really is a virtual circuit. Still they are useful tools as would be an
intelligent call admission system - totally lacking today. To know which
tools to use and how to use them, there needs to be a data/knowledgebase
where relevant data from real/problem calls can be collected and analyzed.
Most often, net managers get fixated on delay, jitter and packet loss. The
problem is both simpler and more complex than that. It's simpler because
it turns out that by monitoring packet loss alone you can get a great
picture, as delay and jitter, when excessive, reveal themselves via packet
loss. Note also that the transport protocol used for voice and video is
completely vulnerable to packet loss.
It's more complex because the typical view of packet loss is woefully
deficient. As was pointed out earlier, packet nets are inherently bursty,
so what use is an average loss number in a bursty environment? Moreover,
packet loss can in effect be invisible to traditional management devices
such as probes, analyzers or RTCP reporters. The reason is that packet
loss also occurs, and may predominantly occur, via jitter buffer discards.
Jitter buffer operations are completely invisible to the devices noted. It
is via the jitter buffer that delay and jitter are turned into packet
loss.
If you can't measure burstiness...you have no reasonable idea of call
quality, neither absolute nor relative. And if you have no idea of call
quality, what use are the tools?
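A small Python sketch shows why an average loss figure hides burstiness.
It assumes you can record, at the playout side of the jitter buffer (so
that discards count as losses), one flag per expected packet; the function
name and the returned fields are illustrative, not a standard metric.

```python
from itertools import groupby


def loss_summary(received):
    """received: list of booleans, one per expected packet (False = lost or discarded)."""
    # Each run of consecutive False values is one loss burst.
    bursts = [len(list(run)) for ok, run in groupby(received) if not ok]
    lost = sum(bursts)
    return {
        "loss_rate": lost / len(received) if received else 0.0,
        "bursts": len(bursts),
        "longest_burst": max(bursts, default=0),
        "mean_burst": lost / len(bursts) if bursts else 0.0,
    }


if __name__ == "__main__":
    # Two 100-packet traces with the same 5% average loss.
    scattered = [i % 20 != 19 for i in range(100)]    # five isolated losses
    bursty = [not (40 <= i < 45) for i in range(100)]  # one burst of five
    print(loss_summary(scattered))
    print(loss_summary(bursty))
```

Both traces report the same average loss, yet one is five isolated
single-packet gaps, which concealment can usually mask, and the other is a
100 ms hole in the audio (five consecutive 20 ms packets, assuming that
packet interval).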
Burt Crepeault: I would tend to disagree that QoS is irrelevant to call
quality. From a router perspective, packet loss takes the form of packet
discard in congestion scenarios. By maintaining prioritized buffer queues
in your network routers, not only do you minimize packet loss, but you
also reduce jitter substantially, as higher priority queues will always be
emptied first, thus more regularly.
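The effect of prioritized queues can be sketched with a toy output port in
Python. This is a simplified model, assuming strict priority service and a
single shared buffer that drops the lowest-priority, most recent packet
when full; the class name `PriorityPort` is hypothetical, and real routers
typically use per-class queues and weighted schedulers.

```python
import heapq
from itertools import count


class PriorityPort:
    """Output port with a bounded buffer and strict-priority service (0 = highest)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._heap = []          # entries are (priority, arrival_order, packet)
        self._order = count()    # tie-breaker keeps FIFO order within a priority
        self.dropped = []

    def enqueue(self, priority, packet):
        heapq.heappush(self._heap, (priority, next(self._order), packet))
        if len(self._heap) > self.capacity:
            # Congestion: discard the lowest-priority, most recently queued packet.
            victim = max(self._heap)
            self._heap.remove(victim)
            heapq.heapify(self._heap)
            self.dropped.append(victim[2])

    def dequeue(self):
        """Send the highest-priority packet next, or None if the port is idle."""
        return heapq.heappop(self._heap)[2] if self._heap else None


if __name__ == "__main__":
    port = PriorityPort(capacity=3)
    for prio, pkt in [(1, "data1"), (1, "data2"), (0, "voice1"),
                      (1, "data3"), (0, "voice2")]:
        port.enqueue(prio, pkt)
    print("sent:", [port.dequeue() for _ in range(3)])
    print("dropped:", port.dropped)
```

In the sample run the buffer overflows twice, both voice packets survive
and leave first, and only data packets are discarded.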
Now of course that may sound like utopia, since nobody controls the whole
network except in the enterprise space. It is still, however, an important
aspect of call quality.
Bob Massad's response: I said they have no DIRECT impact on call quality.
They don't because they really indicate priority levels, which in many
cases are only locally significant, and all priority levels can be
oversubscribed. Moreover, multiple different routes may be used, causing
some packets to arrive at the playout buffer too late, irrespective of
priority. So while not flushed by the router, they may still be flushed by
the jitter buffer.
Jerry Gavin offered some perspective: Mike is right about the infancy of
this industry and technology. With the acceptance of 2.5G, 3G and 4G
networks in Asia and soon Europe, the handling of non-voice modems (V.34
and V.90) will become more efficient and use less bandwidth, because these
modem calls will be processed as data through these networks. NMS is
currently investigating these network types and how we can leverage our
product line to address this technology in the media gateway market.
Massimiliano Angelino brought up some new points which sparked some
intense discussion: What's the point in having Modem over IP? Doesn't one
use a modem to transport IP already? Wouldn't a RAS do a better job? For
faxes, wouldn't it be better if the inbound gateway converted the PSTN fax
call into an image (GIF, JPEG, ...), which would then be transmitted to
the outbound gateway, which translates it back into a PSTN fax call?
Burt Crepeault took a stab at answering: I have to agree that forwarding
modem and fax signals is somewhat twisted when you really think about it.
After all, it appears much more logical to demodulate everything as close
to the source as possible and forward the "real" data bytes rather than
the "bytes carrying a digitized modulated analog signal representing a
data byte". Whew.
However, there are scenarios where it is useful, even essential to attempt
to do so. All of this has to do with transparency of service, in other
words where the end users don't know that they're going through a VoIP
segment, which is usually the case in one-stage dialing systems (i.e. no
passcode/end number scheme such as those used by calling card providers).
Let's take the example of a VoIP trunk segment between two central
offices, a model that could be used by long distance carriers to save on
toll charges and offer lower fares. If you happen to be sending a fax or
connecting to an ISP across such a segment, the system has to forward the
information just as if you were using normal, G.711 long-distance trunks
carrying PCM bytes.
Twisted indeed... :) But what you are describing above is also being
implemented by some hardware manufacturers, so you'll always have the
choice.
Mike Coffee also took on those questions: Massimiliano asks why we need
modem-over-IP. After all, don't we now simply dial into our ISP's modem
bank over the PSTN? And for faxes, couldn't we just receive the fax in one
gateway and send the resulting image file to the opposite gateway, which
would resend it to the destination fax terminal? Burt suggests (I think)
that carrying modem calls (either data or fax) across a trunking gateway
would need to use G.711 full duplex for transparency.
Whether a modem call can be routed to an ISP's RAS (or modem bank) without
an intervening gateway depends on the access provider's network topology.
Commetrex (my company) is served by a CLEC with an all-IP access network
(metro Atlanta). We have an on-premises access device that "steals" DS-0s
from our broadband Internet connection for each voice call. So, were we to
dial an ISP the call leaves our premises as G.711 packets. But data modem
calls are unreliable since there is no V.MoIP available. And we have the
same problem with fax since the access device is not equipped with T.38,
causing an inordinate number of faxes to have errors. Overnight fax spam
is clean since there is little network congestion at 2:00 AM; POs faxed to
us during business hours are messy.
We tried to participate in an integration effort with a customer via
analog dial-up and couldn't stay connected for more than a few minutes
during the day. Why? No V.MoIP. V.MoIP would have converted the analog to
data on premises. The transport across our service provider's network
would be the demodulated data payload transported via packets. Then, with
V.MoIP, it would be reconverted to the PCM representation of analog at the
CLEC's point of interconnection with the backbone. All transparently. And
we could stay connected.
The job of T.38 (and V.MoIP when it finally becomes available) is to
render the interposing IP network transparent to the PSTN endpoints. The
endpoint fax or modem is not "aware" that the correspondent gateway is
anything other than transparent. In fact, the gateway, whether access or
trunking, must respond to voice and data calls differently. First, it must
"sniff" the call's media stream to determine the call type, then respond
accordingly. At a minimum, it must make sure that low-bit-rate vocoders
are not used with modems. Of course, silence detection and comfort-noise
generation won't work with modems. So, without the benefit of fax- or
modem-protocol spoofing the gateway will switch to G.711 full duplex,
which requires 128K of channel capacity for the payload data. (T.38 only
requires 14,400-bps...max.)
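The two figures in that comparison follow from simple arithmetic, shown
here in Python. It assumes G.711's 8000 samples per second at 8 bits each
and takes the 14,400 bps fax rate quoted above as given; packet header
overhead is left out on both sides.

```python
G711_BPS_PER_DIRECTION = 8000 * 8               # 8 kHz sampling, 8 bits per sample
G711_FULL_DUPLEX_BPS = 2 * G711_BPS_PER_DIRECTION
FAX_RELAY_MAX_BPS = 14_400                       # figure quoted in the text

ratio = G711_FULL_DUPLEX_BPS / FAX_RELAY_MAX_BPS

if __name__ == "__main__":
    print(G711_FULL_DUPLEX_BPS, "bps vs", FAX_RELAY_MAX_BPS, "bps:",
          round(ratio, 1), "times the payload capacity")
```

On these numbers, falling back to G.711 costs roughly nine times the
payload capacity of relaying the demodulated fax data.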
As for Massimiliano's question regarding receiving the fax and
transporting it as a store-and-forward payload, it's done all the time
using T.37. But it's not transparent. With T.38, the two endpoints aren't
aware that there is a gateway. The sender loads the paper in the fax
machine and it comes out the other side. Success or failure is immediately
reported. But with T.37, where the fax image is sent as an email
attachment, the operator only knows the network consumed the fax, not
whether it was received by the intended party. The network operator would
have to intervene if, for example, the destination number was incorrect,
or the machine was out of paper or turned off. Why bother when T.38
supports real-time fax?
In summary, gateways that provide toll-quality voice have voice-specific
call-stream processing. Gateways that provide transparent toll-quality fax
transport have fax-protocol-specific processing (T.38). And, as our
industry grows up to fill the large shoes we are walking around in it will
provide toll-quality modem transport through transparent
data-modem-specific call-stream processing.
What About ROI?
Finally, Jon Archer asked: Has anyone done any studies or research into
justifying, from an ROI point of view, how well next-gen media gateways
perform? Just curious.
Jerry Gavin said: The ROI on next generation media gateways is based on
cost per port. If you can offer Network Equipment Providers (NEPs) high
density, scalability and reliability for services at a lower cost than
what they currently charge their customers, then the NEPs will make the
investment to change to a "next gen" media gateway platform.