VoiceXML — Good, Bad & the Ugly
I came across this post while checking my inactive blog on archive.org. Although the article is dated (2009) and some of the technologies, specs, and links may no longer be relevant, the VoiceXML part is as relevant today as it was in 2009, so I thought it was worth sharing. Please read it keeping in mind that it was written in 2009 and is posted here as is.
The original article was an attempt to identify some of the key deficiencies of VoiceXML as a programming language for voice, and to explain why we created VoicePHP in 2008 as a viable replacement for VoiceXML. On its release, VoicePHP was praised by many, with a notable quote coming from none other than Om Malik of GigaOM: “The idea of VoicePHP is Disruptive in its Simplicity”. This article explains why we built it. You can view the original blog post and comments here.
VoiceXML — Good, Bad & the Ugly
While XML (and markup languages in general) has been great for representing data, using it for programming is like using a wrench to hit a nail. All you need to drive a nail is a hammer! A wrench can do a manageable job of hitting a nail, but it’s not elegant, it creates a mess, and it cannot address the problem with precision. The same is the case with XML. It was designed for data representation and data transfer, is best suited for them, and has repeatedly proven its worth there. Trying to use it for programming means adapting it to something it was never meant to do. Let me try to explain this in detail:
Consider a typical “hello world” application in VoiceXML (courtesy, vxml.org).
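The vxml.org listing itself is not reproduced in this copy of the post. As a rough stand-in, a minimal VoiceXML 2.0 “hello world” generally looks something like the sketch below. The prompt text is a placeholder, and the exact markup and line count of the original listing may well differ.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- A form is the basic dialog unit, even when there is nothing to collect -->
  <form>
    <!-- A block holds executable content; here it only speaks a prompt -->
    <block>
      <prompt>Hello World!</prompt>
    </block>
  </form>
</vxml>
```

Even in this trimmed form, three levels of nesting are needed before a single sentence is spoken.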
It took 10 lines of code to write something as simple as that. On top of that, even for a simple hello world application, one has to go looking for more information about the <form> and <block> tags, which seems like overkill.
Shocked? You are not alone. As Dominique Boucher states on his blog:
“from a developer’s perspective, it’s (VoiceXML) like having to program in Cobol! And I only slightly exaggerate.”
The same application in PHP is reduced to barely 2–3 lines of code, which are more readable and intuitive.
See the difference? Don’t take my word for it; here is what veterans and users say about VoiceXML. We will soon look at why they say so.
From the Industry Veterans
In early 2007, Brian OConnor commented on an annoyance in the VXML standard. As he states, he was unable to use <if> within a <prompt> tag. It’s an arbitrary limitation and requires a nasty workaround.
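To make the limitation concrete: in VoiceXML 2.0, <if> is executable content and is not allowed inside <prompt>, so the conditional has to wrap whole prompts instead. The sketch below shows one common shape of that workaround. It is not taken from Brian’s post, and the variable name balance is a hypothetical one chosen only for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- "balance" is a hypothetical variable used only for illustration -->
  <var name="balance" expr="0"/>
  <form>
    <block>
      <!-- <if> cannot go inside <prompt>, so each branch repeats a full prompt -->
      <if cond="balance &gt; 0">
        <prompt>Your balance is <value expr="balance"/> dollars.</prompt>
      <else/>
        <prompt>Your balance is zero.</prompt>
      </if>
    </block>
  </form>
</vxml>
```

The cond attribute on <prompt> is another way to get a similar effect, but either way the shared wording ends up duplicated across prompts rather than varied inline.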
In the article “Is VoiceXML the Right Tool for Your Voice Application?”, Brian Brown identifies very precise weaknesses of VoiceXML. For example, when even basic voice controls (pause, resume, etc.) are not available in VoiceXML, how can it even be considered the language for programming voice? He nailed the problem very well. Look at how VoicePHP addresses it beautifully in a sample application here.
Dannis, in an interesting email written in his unique style, shares the pain of VoiceXML:
“TCL was the most ugly language of the 90-ies. VXML has now taken over. The language appears not to have iteration (while, for) and no recursion. But it DOES have the goto primitive, which was banned by Dijkstra 30 years ago. There is no function abstraction and neither object-oriented constructs.”
He further adds which I will elaborate on later in this post
Even VoiceXML vendors are aware of the limitations, and they have tried to create specific, proprietary enhancements to get around them, for example, CallXML by Voxeo. In fact, Voxeo’s CEO commented on GigaOM’s coverage of VoicePHP:
“As a developer, I do not like VoiceXML. Personally, I find it to be too complicated, painful, and a barrier to entry for new developers as others have said. This is why Voxeo offers many other ways to create voice applications, including CallXML — a very powerful yet simple XML based telephony markup;”
Do I need to say anything more?
The big question — Why do veterans say so?
In my opinion, VoiceXML looks like a creation born of obsession. XML was the new kid on the block and, perhaps because people were impressed or obsessed by it, somehow fitting it to the needs of voice programming became the name of the game. Basic TTS and ASR were made to work. Wow! So far so good!
Now compare this code with the VoicePHP equivalent (demo here):
In contrast, take a look at just about any application at http://code.voicephp.com to see how easily one can take an existing application and move over to VoicePHP with all the programming constructs usually available in most programming languages.
CDATA — Add it to the mess
Consider the code for the first tutorial on Voice Recognition from vxml.org.
I am sure you need a coffee break after reading the above code; it looks verbose, repetitive, and unmanageable. The same application, when written in a commonly used ‘real’ programming language, will have a lot less code and will read much better. Again, refer to any code snippet at http://code.voicephp.com.
This is not me saying it, but vxml.org:
“Coding an application with just straight VoiceXML is just fine and dandy, thankyouverymuch, but the real potential of VoiceXML is harnessed when we add some ASP or JSP into the mix”
Any way you slice it, VoiceXML doesn’t come close to meeting the requirements of real-world applications. Voice applications would do really well if there were an easy way to bring them to life. Developers do not want to use complicated technology to achieve something simple, intuitive, and obvious. I know I won’t.
We are not against VoiceXML. In fact, VoiceXML spearheaded the way for voice programming and took away the complexity that one had to deal with in the early days (remember the nightmare of hardware cards and proprietary drivers?). When it launched, VoiceXML was the “new” way to program voice, and we were completely supportive of it too. We released the world’s first “Adobe Flash-based VoiceXML Platform”.
But it’s about time that VoiceXML realizes its inadequacies and makes way for better alternatives. Alternatives like VoicePHP (or maybe even VoicePERL or VoicePYTHON) could do a better job. The web is evolving, and solutions that can tightly integrate with it will become more and more important. Dedicated solutions built to tackle one specific problem are a thing of the past. Some technologies (e.g. PHP for web programming, Flash for UI and widgets, mobile applications using a data network, etc.) have proven themselves, and it’s about time that we re-use them rather than bind ourselves to technologies that set out with the right attitude to solve a problem but couldn’t really establish themselves due to technical limitations.
Originally published at https://web.archive.org.