Monday, September 29, 2008

How important is a lab system?

How important is it to have a lab replicating your production (VoIP)
environment?

Conventional wisdom says that everybody has a lab: some people just
host their production users on it.

Having a lab incurs a lot of additional cost and work:

-- You have to buy the lab equipment, and the software.

-- You have to install and integrate the lab system.

-- You have to keep it up to date and secure.

-- To use the lab, you have to do everything twice: first on the lab,
then once it's proven, on the production system.


It's often easier to buy the lab gear in the first place than it is to
do all the work to actually use the lab.

Wednesday, August 13, 2008

This CDR has been brought to you by the letters B, F, and the symbol #

TWICE in the past month, I've bumped into CDRs from SS7 equipment in
Atlanta that include alphabetic characters and pound signs in the
calling party number (ANI) field of the CDR.

One of the calls was from 5176#0B600. (That's a phone number.)

What's going on here?

Voicemail peak-hour oversubscription ratio (48:1)

If I'm a phone company and I have 100 subscribers, and every one of them has voicemail, how many people will be calling into the voicemail system at any one time?

Back in the old days, they'd provision trunks between the Class-5 switch and the voicemail system. They'd have to know how many trunks to provision so that all the subscribers could both receive voicemails and call in to check them.

I don't have any good answers here. But I did a quick study using data from logs on a couple of service providers.

These are new-fangled VoIP service providers. So for them, a media/application server supports:


  • Voice mail
  • Automated Attendant
  • Music on Hold
  • Call Center (hold queues)

However, the server can email voicemails somewhere else. So when people check their voicemail, they may not be checking over the phone. This is different than the old-school voicemail systems of 2001, and affects the utilization.

Just as an example, one service provider I checked had a 48:1 oversubscription ratio: i.e., for every 48 subscribers, there's one RTP stream at peak. They also had about 0.0001 calls-per-second-per-subscriber.

(I say "at peak" because I'm interested in building a network that supports the peak, "busy-hour" load. If I design for the average, I'll have problems when the load goes above that average.)
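As a rough sketch of how these two figures could be used for sizing, the Python below turns a subscriber count into a peak stream count and a peak call rate. The function names and the 10,000-subscriber example are my own illustration; the 48:1 ratio and the 0.0001 calls-per-second-per-subscriber figure come from the one provider above and may not hold for yours.

```python
import math

def peak_streams(subscribers, subscribers_per_stream=48):
    """Concurrent RTP streams to plan for at peak, rounded up to a whole stream."""
    return math.ceil(subscribers / subscribers_per_stream)

def peak_calls_per_second(subscribers, cps_per_subscriber=0.0001):
    """Call attempts per second hitting the media server at peak."""
    return subscribers * cps_per_subscriber

# Hypothetical provider with 10,000 subscribers (an assumed number).
subscribers = 10_000
print(peak_streams(subscribers), "streams at peak")
print(peak_calls_per_second(subscribers), "calls per second at peak")
```

Passing a different `subscribers_per_stream` is the whole point: the ratio is the measured input, and everything else follows from it.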

Does this ratio apply to you? All of these things could affect it:

  • How many of your users actually buy your voicemail service?
  • ...check their voicemail over the phone?
  • ...get all their voicemail via email, and listen to it that way?
  • ...use automated attendant?
  • ...put callers on hold, and get on hold music?
  • ...use call centers? And how long are callers waiting in queue?


As another note, it appears that the average duration of voicemail is in the 25-30 second range. This is true at two service providers, both in NFL cities.

Update -- A residential service provider reports around a 160:1 oversubscription ratio just for basic voicemail. (I.e., one concurrent call for every 160 subscribers.) A few of its subscribers do get voicemail via email.

Someone familiar with voicemail software development told me they expect over 200 subscribers per concurrent call for traditional voicemail.

Monday, August 11, 2008

The cat in Aastra 2.2 SIP Software

The Aastra 57i 2.2 software has this ASCII art embedded:


| ("`-''-/").___..--''"`-._
| Line (`6_ 6 ) `-. ( ).`-.__.`)
| Manager (_Y_.)' ._ ) `._ `. ``-..-'
| _..`--'_..-_/ /--'_.' ,'
| (il),-'' (li),' ((!.-'


Is it a Cat? Or is it a Pig?

Thursday, July 31, 2008

On Having an Opinion: Goodness/Badness vs. Pros/Cons

There are two ways of giving recommendations on technical issues: 


(a) To say that something is "good" or that something is "bad". 

(b) To list Pros and Cons, or else Advantages and Limitations.

The most popular and natural way is just to say what's good, or often, what's best. In my opinion Linux servers are better than Windows servers; I'm ascribing goodness to Linux servers.

But I've noticed that more mature Computer Scientists seem to talk about advantages and limitations. My first memory of this came in my Comp 243 "Distributed Systems" class at UNC-CH. F. Don Smith was the professor. It was 2001. We were talking about network file systems. He introduced the section on the Sprite file system saying that it was one of the most interesting. Then he talked about it, and at the end of the section I asked: "Why is it that you say this is the best?" (*)

He corrected my question: "I didn't say that this was the best -- only that it was the most interesting."

My error took a few years to soak in. I was accustomed to thinking about goodness and badness of technical matters. But Don didn't use those terms. In his classes, he only talked about features, limitations, and therefore advantages.

It's a human tendency to say that something we don't like is bad. In a technical context, what we usually mean is, "I don't think that will work."

There's something funny going on here: goodness and badness, used this way, exist in a total ordering. In my example above, Linux servers are closer to good than Windows servers are. If I had to add Solaris servers to the mix, I might put them somewhere between the two, but certainly closer to Linux servers.

But what if I need features that only the Solaris LVM offers? Is the "goodness" of Linux enough to outweigh my need for the Solaris LVM?

Goodness is only one dimension. In reality, technical decisions should be based on factors like features and constraints. Maybe Solaris's LVM does something I need, but perhaps it requires much more expensive hardware. The cost of the hardware is a constraint, while the LVM is a feature.

The advantages and limitations of Solaris, Linux, and Windows servers for any one application can be listed. But, ultimately, a recommendation has to be made. What server type fits the application the best?

So all the complicated evaluation of advantages and limitations has to be projected onto a single-dimensional line: fitness.

Is fitness the same as goodness?





(*) All quotations are approximated, and based on my memory. They're not guaranteed to be word-for-word exact.

Monday, May 26, 2008

Interop Lab Testing for VoIP Devices (2008 edition)


In an Interoperation (Interop) Lab, devices are made to work together. When they seem to, the vendors of the devices claim that they "interop with" each other. This is necessary, but not sufficient, to know things will work together.

Background

Suppose you make a telephone softswitch or application server, such as BroadSoft BroadWorks, Sylantro, MetaSwitch, or the Alcatel-Lucent Network Gateway (aka Lucent Compact Switch). The customers of your several-hundred-thousand-dollar device want to use VoIP telephones with it, but they've tried a few cheap ones and had trouble. And they had another one they liked, but when they upgraded it, call-hold stopped working. After opening several tickets with you and with the phone vendor, they give up and realize that neither of you did anything wrong.

The phone was always doing something reasonable -- though not quite what they hoped -- and the softswitch was doing something reasonable -- though not quite what they hoped. This is the curse of the VoIP Implementor: everybody can follow the rules, and working together, get nothing done.

After complaints from customers about this difficulty, you (the softswitch vendor) launch an interop testing program. Depending on the sort of technical staff you have, you might do it in-house, or outsource it.

Your goal is to confirm that specific products "work with" your product. You want your customers to be able to choose products confidently. It's a laudable goal.

A common interop lab setup might look like this, for a softswitch vendor's lab, when testing a new VoIP phone:



You've got the new phone (the Device Under Test, or DUT) and your piece of gear, connected with an ordinary Ethernet switch. And you have a second, "gold" phone that you know works with your platform, because phone calls require two phones.

Advantages and Limitations of Interop Testing



Interop testing of this variety can be very useful. SIP VoIP telephony is an evolving body of standards. There is more than one way to do something (such as put a call on hold, or make one phone number appear on two different phones (shared call appearance (SCA))). It is very useful to identify fundamental incompatibilities. Often, there are certain settings required on both sides: to use this device, the "rfc2543_hold" option must be turned on. Somebody has to figure this out: it might as well be the two vendors involved.

This testing is only as good as the test plan. If the test plan doesn't include something that you need to do, then the testing isn't quite complete for you.

In addition, no two devices have all the same features. The vendor may test for T.38, but if the DUT doesn't do T.38, they don't want to mark it as failed for T.38. Instead, the testing engineer is likely just to mark that feature as Not Supported. Depending on policy, or haste, the testing engineer may just mark anything that doesn't work as Not Supported. There is no "one true standard" for interoperability. Ultimately, if the DUT and the vendor's softswitch work together at all, the two are "certified" as interoperable. It is to neither vendor's advantage to anger the other by claiming they're not interoperable.

Finally, Interop lab testing only tests the interoperation between a pair of devices. Devices X and Y work together. But real systems are made of devices A through Z.

Real Integration Testing



A network for a VoIP Service Provider using SIP peering might look like this:

A real network for an SS7-connected CLEC might look more like this:

There are lots of devices. And many of them may affect the success you'll have with the DUT.

Take, for example, a simple SIP phone. It's easily possible that the signaling path for a PSTN call, in a CLEC case, is:
  • SIP Phone
  • Customer Premise ALG
  • Session Border Controller
  • Softswitch
  • PSTN Gateway
  • SS7 STP networks
  • PSTN Class 5 Telephone Switch
  • PBX
  • PBX handset


It's definitely possible that the SIP phone could work with the softswitch, but irritate one of the other components. For example,
  • The SIP phone and the softswitch may rely on a Require: header naming a specific SIP extension, but the SBC doesn't allow it or support it.
  • Or perhaps yet-another form of caller ID is dreamed up, and the PSTN gateway can't support it, so that all calls from this phone appear to have no caller ID.
  • Or maybe the phone signals some sort of ISUP calling party category in SIP that the PSTN gateway passes through, and confuses a downstream PSTN telephone switch.
  • What if the phone needs SIP over TCP, but the ALG doesn't support it properly?
  • Or the DUT signals RFC2833 support properly, but the PSTN gateway expects telephone-event to use a specific RTP payload type number (i.e., tickling a bug that was already present)?
  • Perhaps it works fine if the SBC is configured with a "classic" configuration, but breaks miserably if you switch to the "new" configuration.
  • Maybe it can download configurations from the lab FTP server, but chokes if the FTP server is a little slow.
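To make the RFC2833 item above concrete: the RTP payload type for telephone-event is dynamic and is announced per call in the SDP, so a device has to read it from the offer rather than assume a fixed number. The Python below is a minimal sketch of that lookup. The function name and the example SDP (with its 192.0.2.10 documentation address and payload type 96) are made up for illustration, and this is not taken from any particular gateway.

```python
def telephone_event_payload_types(sdp):
    """Return the RTP payload type numbers that this SDP maps to telephone-event."""
    prefix = "a=rtpmap:"
    payload_types = []
    for line in sdp.splitlines():
        line = line.strip()
        if not line.startswith(prefix):
            continue
        # An rtpmap attribute looks like: a=rtpmap:<payload type> <encoding>/<clock rate>
        number, _, encoding = line[len(prefix):].partition(" ")
        if encoding.lower().startswith("telephone-event/"):
            payload_types.append(int(number))
    return payload_types

# Hypothetical offer from a DUT that picked payload type 96 for DTMF events.
EXAMPLE_SDP = """v=0
o=- 0 0 IN IP4 192.0.2.10
s=-
c=IN IP4 192.0.2.10
t=0 0
m=audio 49170 RTP/AVP 0 96
a=rtpmap:0 PCMU/8000
a=rtpmap:96 telephone-event/8000
a=fmtp:96 0-15
"""

print(telephone_event_payload_types(EXAMPLE_SDP))
```

A gateway that hard-codes one number would miss the DTMF events from this offer even though the phone signaled everything properly.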


My point is that we have to test components in an end-to-end environment to really get confidence that they work. This may change, but only if the complexity here is accidental (a side-effect of today's state of things) and not essential complexity (fundamental to the job). With all the new features that VoIP telephony systems try to support, and the upgradability the protocol designers intend, I'm not sure I can distinguish which it is now.


Show me a "simple SIP phone", and I'll show you a phone that nobody likes because it's not flexible enough to be configured to do crazy things.

Why won't anybody build far-end echo cancellation into their VoIP phones and ATAs?

No VoIP phone or ATA on the market offers far-end echo cancellation for the talker. They should.

Natural Rock Formation that amplifies echo.


Background



Some background: echo is when you hear yourself talking. It's usually caused by the device on the other end of the call, but it's exacerbated in VoIP networks because they have long delays.

Suppose you have a phone call that includes a VoIP device (such as a Polycom SoundPoint IP 650 SIP phone, or a Cisco 7260 SIP phone, or a Cisco/LinkSys/Sipura ATA, or an Aastra 57i).



There's a VoIP phone plugged into some sort of IP network. In many cases, it's an Ethernet network with a DS1 (T1) connection to the VoIP service provider / ISP. There's a VoIP-PSTN gateway connected there, such as a MetaSwitch VP3510, or a Lucent Compact Switch (LCS), or an AudioCodes, or a Cisco AS5400. It has an Ethernet interface and TDM interfaces, such as DS3, DS1. Maybe it does ISUP and has SS7 A-Links, or ISDN PRI. Then there's the PSTN, which includes traditional telephone switches -- and actually a lot of VoIP hidden in there too. Finally, there's a PSTN phone.

When the VoIP phone user says something, he may hear an echo of his own voice coming back.



Cisco has a nice WAV file demonstrating what echo sounds like. (This page is known to some insiders as the Cisco Duck Quack page.)

Why does C3P0 hear his own voice coming back?

The PSTN Phone isn't perfect. Some of the electrical voice signal that enters it may be "reflected" back. This is sometimes called the "Two-wire-to-four-wire" conversion: a normal phone line is a two-wire electrical circuit with both sides of the conversation on it, but the handset itself has two wires for the speaker, and two wires for the microphone. If this isn't built just perfectly, some of the sound signal that's sent to the speaker will be reflected back down the wire. Many phones are not perfect.

The PSTN Phone may be on speakerphone, or pick up other acoustic echo. Normal room walls echo sound back to us. It's there, but we don't normally notice it.

The VoIP Network is long. We won't normally notice echo in a room, unless we're in a big room. Our brain is pretty good at filtering out sound that's echoed back very quickly; if we hear an echo less than 100 [ms] or so after we say it, we don't even notice it normally. So if the room is big enough that sound takes a long time to echo back, then we might notice it.

(How big would a room need to be? Around 55 feet / 17 meters across would work nicely to reflect off the far wall. Sound travels 340 meters / second, and we'll hear an echo if our voice takes around 100 [ms] to get back to us. That means it needs to be 0.1 * 340 meters from my mouth to my ear, or half that from one side of the room to the other because the sound travels that distance twice.)
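The room-size arithmetic above can be checked with a few lines of Python. This is only a sketch of the same calculation; the function and constant names are mine, and the 100 [ms] threshold and 340 meters / second are the figures used in the text.

```python
SPEED_OF_SOUND_M_PER_S = 340.0
METERS_PER_FOOT = 0.3048

def min_room_width_m(echo_threshold_s=0.100, speed=SPEED_OF_SOUND_M_PER_S):
    """Wall distance at which a reflection takes echo_threshold_s to return.

    The sound crosses the room twice (out and back), hence the division by 2.
    """
    return speed * echo_threshold_s / 2

width_m = min_room_width_m()
width_ft = width_m / METERS_PER_FOOT
print(f"{width_m:.1f} m, {width_ft:.1f} ft")
```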

It can take a long time for a sound signal carried via VoIP to get from the talker C3P0 back to his ear. 100 millisecond round-trip-time is easily achievable. Why? Because packets sit in buffers along the way. In traditional non-VoIP TDM networks, digital voice data is rarely "buffered" to any extent. But in VoIP networks, buffering always happens. This "buffering" means that packets are sitting inside devices -- phones, routers, switches -- effectively adding to the delay between the talker and his echo.
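To see how quickly buffering adds up, here is a small Python sketch of a delay budget. Every number in it is an assumed, illustrative value (one 20 ms voice frame per packet, 10 ms of network transit, a 30 ms jitter buffer), not a measurement from any network, and it assumes the return path costs about the same as the outbound path.

```python
# Illustrative one-way delay contributions, in milliseconds.
# These figures are assumptions for the sketch, not measurements.
ONE_WAY_DELAY_MS = {
    "packetization (one 20 ms frame per packet)": 20,
    "network transit and queuing": 10,
    "receiver jitter buffer": 30,
}

def round_trip_ms(one_way_components):
    """Echo path delay: the talker's voice goes out, and the echo comes back."""
    return 2 * sum(one_way_components.values())

echo_delay_ms = round_trip_ms(ONE_WAY_DELAY_MS)
print(echo_delay_ms, "ms round trip; noticeable:", echo_delay_ms > 100)
```

Even with these modest guesses, the echo comes back later than the 100 [ms] or so that our brain filters out.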


Put echo cancellation in the VoIP Phone.




This is all very annoying for the talker who hears himself. VoIP users suffer from this much more than PSTN users, because of this delay. Technically, the PSTN phone is creating the echo, but it seems silly to just point blame. Technically, VoIP networks should be free of jitter (variation in packet transit delay) -- but networks aren't perfect, so VoIP phones have jitter buffers built in. It's standard equipment, like seat belts in cars.

But echo cancellation -- the sort that's really needed, that would cancel out the echo received back across the VoIP network -- just is not put into VoIP phones normally. Some phones, such as Polycom, have limited acoustic echo cancellation capability to prevent the VoIP phone itself from echoing back. That's nice, but that's not the main problem.

It seems VoIP phone vendors are spending their time building in G.722 wideband codec support ("HD Voice"), so conversations within your office building will sound nice. Whoop-ee.

VoIP Echo Cancellation is possible. Vendors are routinely building limited echo-cancellation ability into the PSTN-VoIP gateway device. (E.g., General Bandwidth G6, MetaSwitch). But it can't always be used, and it's not always effective. For example, if you connect to the PSTN through Level(3), then you don't get a gateway with echo cancellation. And on some gateways, using echo cancellation limits the number of calls you can make through the box.

But is echo cancellation in the VoIP gateway sufficient? Apparently not. If it were, I wouldn't hear so many complaints from my client base. And Ditech would have no reason to make a VoIP-only echo cancellation box. But as it is, VoIP carriers have lots of trouble with echo.

And, in general, centralizing the echo-cancellation capability doesn't seem optimal either. Putting lots of work and intelligence in one place tends to make one really-expensive place.

VoIP Phone and ATA Vendors should add echo cancellation to their devices. Polycom, Cisco, Aastra, Linksys, Snom, Adtran, etc. listen up: your device should cancel out the echo received back in the RTP stream from the VoIP network. It can't be that hard! I know there are DSPs that can help with this. Add this feature to your top-of-the-line phone! Charge more for it! Make it a premium add-on license if you must!
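For a sense of what such a feature involves, here is a toy sketch in Python of one textbook approach, a normalized least-mean-squares (NLMS) adaptive filter. It is my own illustration, not how any vendor does it. The phone keeps a history of what it sent, learns how that signal comes back in the received stream, and subtracts the estimate. The function names, the 32-tap filter, the step size, and the simulated 10-sample echo are all assumptions chosen to keep the example small.

```python
import random

def nlms_cancel(sent, received, taps=32, mu=0.5, eps=1e-6):
    """Subtract an adaptive estimate of the echo of `sent` from `received`."""
    weights = [0.0] * taps   # current model of the echo path
    history = [0.0] * taps   # most recent sent samples, newest first
    cleaned = []
    for s, r in zip(sent, received):
        history = [s] + history[:-1]
        estimate = sum(w * x for w, x in zip(weights, history))
        error = r - estimate                 # what is left after cancellation
        energy = sum(x * x for x in history) + eps
        step = mu * error / energy           # NLMS: step normalized by input energy
        weights = [w + step * x for w, x in zip(weights, history)]
        cleaned.append(error)
    return cleaned

def simulate_echo(sent, delay=10, gain=0.5):
    """Toy far end: returns only a delayed, attenuated copy of what was sent."""
    return [gain * sent[i - delay] if i >= delay else 0.0 for i in range(len(sent))]

def power(samples):
    return sum(x * x for x in samples) / len(samples)

rng = random.Random(0)
sent = [rng.uniform(-1.0, 1.0) for _ in range(4000)]  # noise standing in for speech
received = simulate_echo(sent)
cleaned = nlms_cancel(sent, received)

echo_power = power(received[-500:])
residual_power = power(cleaned[-500:])
print(residual_power < echo_power / 1000)
```

A real phone would have a harder job than this toy. At 8,000 samples per second, an echo arriving 100 [ms] late sits 800 samples back, so the filter needs a much longer tail or a separate delay estimate, and it must stop adapting while the far-end party is talking.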

Some people just want a super-cheap VoIP phone. But others are trying to replicate the services of traditional phone systems using VoIP. For them, the echo problem is real, and serious. It can't always be solved with a gateway. And an extra 25 percent in the cost of the CPE might be hard to swallow, but not as hard as having no solution at all.