Google Wave for Yellow Bleeders using the Socratic Method

If you've watched the Wave presentation by now, you've probably bumped into a few questions about how Wave works. Lucky you, I've been digging into everything they've published so far, and thought I'd share some of what I've been able to glean.

First things first.

It's Google.  So if you use it, you're putting all your stuff in the cloud, right?

Nope.  A Wave Server can exist inside your firewall.  The fun part is: it behaves like an email or an IM gateway -- it's a federating proxy to external Wave servers.  So if your customer also has a Wave server, you can simply drop the customer's contact details on to a Wave, and they're automatically included.  Your server opens communication with their server, and the information starts flowing.

Wait... You can't just hand-wave about that.  HOW!??!

It turns out that Wave uses XMPP as its transport protocol. XMPP (also known as Jabber) is the protocol behind Google Talk. So really, a Wave server is a type of Instant Messaging proxy. And yes, you could build one on top of the Sametime Gateway.

Okay, so that's how the data flows, but how do we know who the user is?

Again, it's just an IM service, so your federating server works just like the Sametime Gateway.  From the IBM documentation on XMPP: "To communicate with the Google Talk community, you must first set up a DNS service (SRV) record and publish it to DNS so that Google Talk users and local Sametime users can discover each other and establish a connection."  Yes, Virginia, it's really that easy.  When you know the user's email address, your gateway can go query it from the XMPP gateway of the target domain, just like an MX server.
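
For the curious, here's a rough sketch of the first half of that discovery step: turning a user's address into the DNS name a federating server would ask about. The `_xmpp-server._tcp` label is the standard XMPP server-to-server SRV name; the function name and the example address are mine, and the sketch only builds the query name rather than hitting DNS.

```python
def xmpp_srv_name(address: str) -> str:
    """Return the SRV record name to query for the server behind user@domain."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not a user@domain address: {address!r}")
    # Standard XMPP server-to-server service label, followed by the user's domain.
    return f"_xmpp-server._tcp.{domain.lower()}"


# Hypothetical customer address, used only for illustration.
print(xmpp_srv_name("customer@example.com"))  # _xmpp-server._tcp.example.com
```

The answer to that SRV query is a host and port, which is the same kind of indirection an MX record gives you for mail.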

But what about security?

Once again, it's just federated IM, so it's as secure inside the firewall as whatever you do to secure IM. In the Lotus world, that would be your standard username/password credentials you maintain via Domino or some other LDAP service. On a transport level, Wave requires that all comms are secured via TLS (SSL v3.1), so information on the network is always encrypted in transit. This also provides a nice mechanism for server-to-server federation, since a) information going outside the firewall is encrypted; and b) there's effectively a cross-certification step for server federation, so administrators can establish trust relationships (or just open things up to root certs).
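
To make the two trust options concrete, here's a minimal sketch using Python's standard `ssl` module. It is not anything from the Wave spec: the function name and the `trust_public_roots` flag are my own, and the TLS 1.2 floor is a modern assumption rather than the version named above.

```python
import ssl


def federation_tls_context(trust_public_roots: bool) -> ssl.SSLContext:
    """Build a client-side TLS context for an outbound server-to-server link."""
    if trust_public_roots:
        # "Just open things up to root certs": trust the system CA store.
        ctx = ssl.create_default_context()
    else:
        # Explicit trust relationships: start with an empty trust store.
        # An admin would then call ctx.load_verify_locations(cafile=...)
        # once per partner certificate they have agreed to trust.
        ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Assumption: refuse anything older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = federation_tls_context(trust_public_roots=False)
print(ctx.verify_mode == ssl.CERT_REQUIRED, ctx.check_hostname)  # True True
```

Either way the peer has to present a certificate that verifies, which is the cross-certification step in practice.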

Is this like Groove?

I don't know anything about Groove other than it was based on peer-to-peer communications.   Wave isn't P2P -- it's client/server, with the servers able to peer directly, just like SMTP does.

Fine. It might be manageable for an enterprise. But how the heck do all those concurrent edits work, anyway? We yellow bleeders have been dealing with distributed federated edits for decades now. Has Google never heard of a replication conflict?

Ah, that's where Wave gets truly interesting. The protocol is based on a concept called Operational Transformation. Yeah, my eyes glazed over at first, too. Basically what it means is that each operation performed against a piece of data can be transformed against the operations that happened concurrently with it, so that operations occurring against a given version of the data can be applied in any order and still arrive at the same outcome.
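
If that's still glazing your eyes, here's a toy version in Python. It handles only text inserts, and the `Insert` class, the `site` tie-breaker and the `transform` function are all my own illustration, not Wave's actual operation set. The point is just to show two edits made against the same version landing in the same place regardless of order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int    # character offset in the version the author was looking at
    text: str
    site: int   # hypothetical tie-breaker for two inserts at the same offset


def apply(doc: str, op: Insert) -> str:
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, against: Insert) -> Insert:
    """Rewrite `op` so it can be applied after `against` has already been applied."""
    lands_first = against.pos < op.pos or (
        against.pos == op.pos and against.site < op.site
    )
    if lands_first:
        # The other insert sits to the left of ours, so shift our offset right.
        return Insert(op.pos + len(against.text), op.text, op.site)
    return op


base = "Wave"
a = Insert(0, "Google ", site=1)   # one user prepends
b = Insert(4, "!", site=2)         # another appends, against the same version

a_then_b = apply(apply(base, a), transform(b, a))
b_then_a = apply(apply(base, b), transform(a, b))
print(a_then_b, "|", b_then_a)  # Google Wave! | Google Wave!
```

Neither server had to wait for the other, and nobody got a replication conflict.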

WTF?  How the heck do you do THAT?

By twisting your brain up real good, and creating mutation semantics in your spec like <antidocumentelementstart>, <documentdeletecharacters> and <documentdeleteantielementstart>

Oh my.  That looks painful.

No doubt.

Is anybody actually going to build that?

Google's designed higher level APIs that are supposed to take care of much of this complexity for you.  So there are much more sensible operations in Java, for instance, like WAVELET_BLIP_REMOVED and DOCUMENT_CHANGED.

Wait... what the heck is a WAVELET?

All the demonstrations in the presentation dealt with Waves, which are basically high-level containers for a single conversation. They are roughly analogous to a Topic in a discussion forum, or a Folder in a QuickPlace. A Wave is made up of unique Wavelets. A Wavelet is another container, one that controls who participates in its conversation. A Wavelet has a root document, and a collection of one or more Documents associated with it. A Document is made up of a set of elements known as Blips. I'm still a bit fuzzy on the scope of a Blip -- I think it can be as small as a single character, or as large as a paragraph. I know that Annotations, which describe formatting attributes and tags, operate on Blips, so they're probably like text runs in CD structures and DXL.
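
Here's that containment hierarchy as a handful of Python dataclasses. It mirrors my reading above, fuzziness included, rather than the published spec, and every field name and the `can_see` helper are my own inventions.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Blip:
    blip_id: str
    text: str = ""
    annotations: dict[str, str] = field(default_factory=dict)  # e.g. formatting, tags


@dataclass
class Document:
    doc_id: str
    version: int = 0
    blips: list[Blip] = field(default_factory=list)


@dataclass
class Wavelet:
    wavelet_id: str
    participants: set[str] = field(default_factory=set)  # who is in this conversation
    root: Optional[Document] = None
    documents: list[Document] = field(default_factory=list)

    def can_see(self, user: str) -> bool:
        return user in self.participants


@dataclass
class Wave:
    wave_id: str
    wavelets: dict[str, Wavelet] = field(default_factory=dict)


# Hypothetical wave with one shared wavelet and one restricted wavelet.
wave = Wave("gamma")
wave.wavelets["main"] = Wavelet("main", {"nathan@example.com", "steve@example.org"})
wave.wavelets["aside"] = Wavelet("aside", {"nathan@example.com"})

visible = sorted(w.wavelet_id for w in wave.wavelets.values()
                 if w.can_see("steve@example.org"))
print(visible)  # ['main']
```

Hanging the participant list on the Wavelet rather than the Wave is what lets one conversation carry sections that only some people can see.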

So how does that relate to all that live co-editing business?

Well, any given message packet in Wave is an XMPP transmission that details a <waveop>, which is basically a semantic for "perform this transformation on Blip X in Document Y (v123) of Wavelet Z in Wave Gamma." That transformation can be to add content, annotate content, remove content, plus a few other housekeeping operations. Things like removals of Blips appear to take place in the protocol by adding an "antiBlip," which is why they are order-independent and are able to be replayed later.
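
Here's a sketch of why "remove by adding an antiBlip" replays so nicely. To be clear, this is a plain tombstone set that I'm swapping in to illustrate the idea, not the mechanism in the Wave spec, and the `WaveOp` fields just follow the addressing in the sentence above. The sketch carries the version number along but does not check it.

```python
from dataclasses import dataclass, field
from itertools import permutations


@dataclass(frozen=True)
class WaveOp:
    wave: str
    wavelet: str
    document: str
    version: int      # carried for addressing only; not validated in this sketch
    blip: str
    kind: str         # "add" or "anti"
    text: str = ""


@dataclass
class DocumentState:
    added: dict = field(default_factory=dict)  # blip id -> text
    anti: set = field(default_factory=set)     # blip ids cancelled by an antiBlip

    def apply(self, op: WaveOp) -> None:
        if op.kind == "add":
            self.added[op.blip] = op.text
        elif op.kind == "anti":
            # Nothing is erased; we only record that the blip is cancelled.
            self.anti.add(op.blip)
        else:
            raise ValueError(f"unknown op kind: {op.kind!r}")

    def visible(self) -> dict:
        return {b: t for b, t in self.added.items() if b not in self.anti}


def replay(ops) -> dict:
    state = DocumentState()
    for op in ops:
        state.apply(op)
    return state.visible()


ops = [
    WaveOp("gamma", "Z", "Y", 123, "X", "add", "hello"),
    WaveOp("gamma", "Z", "Y", 124, "W", "add", "world"),
    WaveOp("gamma", "Z", "Y", 125, "X", "anti"),
]

# Replay the same three ops in all six possible orders.
outcomes = {tuple(sorted(replay(order).items())) for order in permutations(ops)}
print(len(outcomes))  # 1
```

Even when the antiBlip arrives before the Blip it cancels, the end state is the same, because a removal is just one more thing added to the log.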

Are you telling me that the server sends a message that tells another server to change the content of an element in a certain document with a sequence number, that's inside a security-controlled container with a specific ID?

Yup.

That sounds like replication!

*DING!*  Now you're catching on.  To be more precise, it sounds like streaming cluster replication.  But with two notable differences: 1) instead of the replicator telling another server what a Note Item's new value is, it's really telling it what transform to perform on a given Note or Item.  It's as if the stream were a replay of the Transaction Log.  (Which might be how CSR works internally -- I'm far from an expert on it.)  2) Rather than cluster members being defined at a server-level, in Wave, they're defined by the participants in the Wave.  So every Wave defines in itself what the members of its cluster are.  
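
That second difference is easy to show in code. Assuming participants are identified by user@domain addresses, the "cluster" for a given Wave is just the set of domains its participants live in. The function name and addresses below are made up for illustration.

```python
def cluster_members(participants, local_domain):
    """Return the remote domains a server must federate with for one Wave."""
    domains = {p.rpartition("@")[2].lower() for p in participants}
    # Our own server is already in the conversation, so leave it out.
    domains.discard(local_domain.lower())
    return sorted(domains)


participants = [
    "nathan@example.com",
    "steve@example.org",
    "andy@EXAMPLE.net",
    "bob@example.com",
]
print(cluster_members(participants, "example.com"))  # ['example.net', 'example.org']
```

Add a participant from a new domain and the cluster grows by one server, for that Wave only.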

So when are you building Wave for Domino?

Ummmm, if you'd like to hire us for that, I'm certainly up for it.  But it's not on my radar as a pet project at the moment.

Bullshit.

Yeah, you're right.  I'm dying to do it.  But there's too much going on.  And besides, you can't actually implement it solely on Domino.  Domino can't talk TLS, so you can't implement an XMPP server natively on it.  The Sametime Gateway can do TLS-secured XMPP, though.  So if you have the Gateway running, you could build a Wave plugin for it, that uses either WAS/DB2/LDAP for data storage and user management, or works against Domino.

So you've already designed it?

LOL.  You're funny.

Of course, if you see a Sametime Gateway show up in the Bleedyellow environment sometime soon, at least now you'll know why.  

Comments

1 - Thanks for the summary - saves me the other 50 minutes I didn't have time to watch "that" video etc

2 - Oh Steve, you shouldn't take my write-up here to mean that you don't need to watch the second 50 minutes of their presentation. You really are doing yourself a disservice if you don't watch the whole thing. There are some killer reveals that don't happen until even the last 10 minutes.

3 - Great write-up. It's kind of funny, just a couple of minutes into the presentation I knew how it had to work, and based on this I was spot-on.

I'd love to build Wave on Domino. It's close to being possible on just Domino alone -- there's already an enhancement SPR for adding to Domino's TLS capabilities (it can currently do TLS for SMTP, but not HTTP).

I wonder if Google will be open-sourcing their kick-butt RT editor?

4 - I'm confused. Who was asking you these questions?

No...really...that's LOL funny. You missed your calling.

...and an excellent summary to boot!

5 - @3 - Even if Domino implemented TLS natively, I'm fairly sure I'd prefer to let the ST Gateway handle the transport layer, even if I was using Domino as the backend storage.

The Wavelet/Document hierarchy is particularly interesting to me. The threading requirements seem like a natural fit for, believe it or not, RESPONSE documents.

@4 - My imaginary apprentice: Playdough.

6 - Any bets on this discussion going on in Redmond right now:
"If we just added another 4 or 5 servers to the SharePoint stack we could do the same thing, only with our own (better) protocols." Perhaps it's time to WAVE goodbye, Mr. Softie.

7 - Thanks for the info, Nathan. Nicely put and you've saved a lot of us tons of time.
- Andy

8 - WOW Nathan! That was a great summary. Thanks.

9 - Don't worry Nathan - I will watch it all very soon :) Anyway by the time I get around to it - you will have the Domino version up on BleedYellow right?

10 - doesn't the wave protocol also define the storage format (I remember it because it sounded odd to me)? If YES, is it mandatory or optional?
I'd guess that becoming feature- and storage-compatible would be much harder than implementing the communications protocol.

11 - @10 - I can't find anything about storage in the documentation. There's the network protocol and the rules for "this is how you apply transforms in a streaming fashion." But the documentation so far says nothing about how the server has to store the content.

I would think that Google doesn't care about storage, any more than Sametime cares about how a SIP client stores messages, or Domino cares about how some other SMTP server stores MIME content. HTTP doesn't care how you store content, does it?

12 - "Truly you have a dizzying intellect"!
Thanks for the post, it is insightful in a way even us Admins get it.

And you were right, the last 10 minutes blew me away!



13 - Awesome summary! I knew that you would be there to fill in a couple of the blanks for me. The video IS very cool! I'll have to watch it again when I'm not loaded up on cough syrup...

...although I did end up seeing the 'Google Wave' logos actually wave a few times while on the syrup...

Disclaimer 

Welcome to Escape Velocity!

Opinions expressed here by Nathan T. Freeman are not necessarily those of his employer. However, there's a decent chance they are, so check with them if you really want to know.

But really... do you need that kind of validation? Are the opinions expressed here in doubt?
