Friday, March 20, 2009

Idempotent Sequences, Sets and Interactions

Idempotency is a funny sounding word with a simple meaning, but a complex reality.  Perhaps it’s even unfair to say it has a simple meaning because I’ve seen many attempts to define it that turn out rather convoluted.  In the simple terms, I would define idempotency as the property of being able to perform the same action twice or more times in sequence, and end up with the same result as if it was performed once.

Those terms leave a fair bit of wiggle room when you really get down to the details.  For example, those terms say nothing about additional actions occurring in between the first and last occurrence of the idempotent action.  If there is one and only one action that can modify a resource than it’s not necessary to consider those details, but quite often that is not the case.  Even operations like an HTTP PUT are only idempotent when the input is the same.  Every different input for PUT is non-idempotent, with respect to every other input, unless of course If-Match is used and then some inputs are idempotent to other inputs but not to all.

As I’ve found in relation asynchronicity, a simple word, used without context, can cause more confusion than clarity.  The transmission of a message and the receipt of it’s response may be asynchronous from a user’s perspective, but still remain synchronous from a HTTP protocol perspective.  That is, from the user’s perspective clicking a link in the browser is asynchronous, because while the requests are sent, received, processed and rendered, only any modern computer the user can click on another tab, open another program, etc.  But from the HTTP protocol perspective, even if the HTTP library provides a method like BeginGet, EndGet, some infrastructure is required to construct the request, transmit that request, and do nothing else until the response arrives (Even this is blurring the truth a bit since there is the timeout…).

So, back to idempotency, I rarely see the context of idempotency discussed or defined in a more clarified structure.  I’ve occasionally seen references to idempotent sequences, but very rarely, not even enough to feel confident that my definition is the same as everyone else’s.  The HTTP RFC gives this description:

A sequence is idempotent if a single execution of the entire sequence always yields a result that is not changed by a reexecution of all, or part, of that sequence.

I wouldn’t be at all surprised to find people who would believe an idempotent sequence was a sequence, that if performed twice or more in a row was idempotent, but the description above goes farther than that.  In that description, the order of first execution of each action must remain constant, but reexecution can occur anytime after this first condition is met.

I can think of another, stricter set of actions, where the set of actions produces the same result regardless of order entirely, the only requirement to achieve the same result is that every action in a set is executed at least once.  For the lack of knowing anything better to call such a set of action, I’d call them an idempotent set.

I think it’s not uncommon for architects and developers to overreach at times and assume something that is a collection of independently idempotent actions, is an idempotent set.  You need an idempotent set in order to accommodate fully out of order execution, with a deterministic result.  An idempotent sequence works only if you’re willing to accept a slightly non-deterministic result.

Another common circumstance that arises, is a set of actions, that will result in the same outcome as long as the first execution of a “final” action follows the initial execution of all other actions.  This circumstance, which I’m struggling to find a good term for, is quite common.

I would love to hear from others on what definitions, or other common interaction patterns you’ve had experience with that relate to idempotency.

No comments: