It all started in 1998 when Jeremie Miller chose XML as the basis for Jabber. Although XML seems old-fashioned now, at the time it was cutting edge, and because XML is built on Unicode, Jabber Identifiers could from the beginning include characters outside the US-ASCII range (unlike addresses in textual protocols of the time such as email or SIP). In 2002, Craig Kaes and I started to codify the Jabber address format in JEP-0029, an effort that was superseded by activity in the IETF's XMPP Working Group, which eventually led to the core specification for XMPP in the form of RFC 3920 in October 2004.
That original definition handled internationalized addresses using a method called Stringprep, which at the time was also used for domain names and other application identifiers. Unfortunately, over time the Internet community discovered some issues with Stringprep, first among them that it was tied to version 3.2 of Unicode (the underlying set of characters for all modern, and some ancient, human languages). These days we're up to Unicode version 7, with further improvements and updates on the way. After the DNS community decided to move beyond Stringprep in 2008, other application protocols (XMPP, LDAP, iSCSI, and the like) concluded that they needed to follow suit.
That was in March of 2010. Fast forward 5 years, and today the IETF's PRECIS Working Group has finally produced a new and better framework for handling internationalized strings in Internet protocols: RFC 7564, which I co-authored with Marc Blanchet. Although the exact reasons why I volunteered to help are lost in the mists of time (probably something about "the good of the Internet"), I ended up learning a lot more about internationalization than I ever thought possible. Unfortunately, internationalization is such an exceedingly complex and messy topic that I still feel like I have only scratched the surface. But at least now we have an internationalization framework that can serve us for the next 10+ years. Or so we hope!
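For the curious, the Unicode 3.2 limitation of Stringprep mentioned above can be seen with Python's standard library, whose `stringprep` module is built on a frozen copy of the Unicode 3.2 database. The following is only a rough sketch: the helper name `unassigned_for_stringprep` is my own invention for illustration, and the Indian rupee sign is just one convenient example of a character added after version 3.2.

```python
import stringprep
import unicodedata


def unassigned_for_stringprep(ch: str) -> bool:
    """Return True if ch is an unassigned code point in Unicode 3.2.

    Hypothetical helper for illustration: it simply wraps the standard
    library's check against RFC 3454 table A.1 (unassigned code points).
    """
    return stringprep.in_table_a1(ch)


RUPEE = "\u20b9"    # INDIAN RUPEE SIGN, added in Unicode 6.0
E_ACUTE = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE, present in Unicode 3.2

# The frozen 3.2 database and the interpreter's current database disagree
# about the rupee sign: 'Cn' means "not assigned".
print(unicodedata.ucd_3_2_0.category(RUPEE))
print(unicodedata.category(RUPEE))

# Stringprep therefore treats the newer character as unassigned.
print(unassigned_for_stringprep(RUPEE))
print(unassigned_for_stringprep(E_ACUTE))
```

A protocol pinned to Stringprep has no way to treat the rupee sign as an ordinary character, no matter how recent the Unicode data on the machine running it.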