Multimedia Futures


As noted, yesterday I had the opportunity to participate in a workshop on the future of multimedia protocols. The consensus regarding current technologies (H.323 and SIP) seemed to be as follows:

  1. End-user devices are too complex.
  2. It's hard to exchange information about capabilities.
  3. Too many codecs! (But none are required.)
  4. NAT and firewall traversal is a major problem.
  5. Interoperability is inconsistent.
  6. Imitating the PSTN (voice calling) is not enough -- we need true multimedia innovation (whiteboarding, collaborative editing, app sharing, etc.).

There is a growing perception that the current crop of second-generation technologies is extremely limited and that SIP is not going to solve these problems. So what is to be done? For future technologies, the consensus is that we need to:

  1. Reduce complexity.
  2. Use a clear, modular architecture.
  3. Have excellent capabilities negotiation.
  4. Make the endpoints intelligent.
  5. Move service logic to the server side.
  6. Define the protocol in terms of basic primitives only.
  7. Use flexibility to create simplicity.
  8. Enable applications beyond voice.
  9. Push services and innovation out to the edges.
  10. Incorporate reliable identity assertion (identity in the clear, location and media exchange encrypted).
  11. Minimize the need for big infrastructure investments.
  12. Seriously simplify deployment and access (they are more important than features).
  13. Specify mainly or only the receiver behavior (analogy: standardizing the decoder, not the encoder).
  14. Minimize the protocol requirements and semantics.
  15. Keep the endpoints dumb in signalling and smart in features.

That's a tall order, and there are many details behind all of those considerations (and more). Several approaches were discussed:

  1. Define something like H.323 or SIP, but better (i.e., keep it simple).
  2. Work out a software-defined client architecture (standardize the virtual machine, not the wire protocols).
  3. Blend the two approaches.
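
To make one of the goals above a bit more concrete: the complaints about "too many codecs, but none are required" and about weak capabilities negotiation are two sides of the same coin. Here is a rough Python sketch of how a mandatory baseline codec guarantees that negotiation always succeeds. This is purely my own illustration, not anything proposed at the workshop or defined in H.323, SIP, or Jingle; the function names and the codec names (including the placeholder "baseline") are all made up.

```python
# Hypothetical sketch: these names are illustrative only and do not come
# from H.323, SIP, Jingle, or any other real protocol.

MANDATORY_CODEC = "baseline"  # assumed mandatory-to-implement codec (placeholder name)


def negotiate(offered, supported):
    """Pick the first codec in the offerer's preference order that the
    receiver also supports. Returns None if there is no overlap."""
    supported_set = set(supported)
    for codec in offered:
        if codec in supported_set:
            return codec
    return None


def with_baseline(codecs):
    """Return the codec list with the mandatory codec appended as the
    last resort if it is not already present."""
    codecs = list(codecs)
    if MANDATORY_CODEC not in codecs:
        codecs.append(MANDATORY_CODEC)
    return codecs


# Without a required codec, two endpoints can end up with nothing in common.
no_overlap = negotiate(["codec-a"], ["codec-c"])

# With a required codec, every pair of compliant endpoints overlaps.
fallback = negotiate(with_baseline(["codec-a"]), with_baseline(["codec-c"]))
```

Here `no_overlap` is `None`, while `fallback` is the baseline codec. The point is only that requiring one codec turns "maybe we can talk" into "we can always talk, possibly at lower quality".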

What does this mean for our efforts in the Jabber/XMPP community? I recognize, as new blogger Jean-Louis Seguineau points out, that we still have a lot of work to do on Jingle. Projects like Asterisk, FreeSwitch, Telepathy, FarSight, Coccinella, Tapioca, Psi, and Kopete are raring to go! It's #1 on my to-do list, but realistically I need to dedicate a whole week or so to updating the various Jingle specs, thinking through the issues, and working to hammer out consensus on the Standards list. I don't pretend to have all the answers, and it would be great to get feedback from people who know more about multimedia protocols than folks in the data-centric Jabber/XMPP community do, so I plan to contact some experts in the coming weeks as well. I'd post more but I got back from the ITU/IMTC workshop at 1:30 this morning and my brain isn't working so well right now...

Peter Saint-Andre > Journal