All You Need is TCP - EtherSAN and Global Network Storage
It's been a bit difficult to write this month, with a death in the family, a product launch ("Valux to focus on MinutePitch") with a partner (Valux), a paper ("Lessons Learned in Massive Video Production (MVP) for University Alumni Outreach") for ACE2004 in Singapore last month (at the same time as the death, sigh), a wedding, and a funeral. So I'm only now looking at some serious comments regarding my earlier paper ("All You Need is TCP: EtherSAN and Storage Networks") for the global storage workshop.
Greg Pfister of IBM, Mr. "In Search of Clusters", was kind enough to provide some feedback on the paper. His questions were to the point - I needed to explain better 1) "What is an EtherSAN?" and 2) "Why Should I care?" As he said: "Answer those two questions. It can be done in 6 pages. It can be done, in fact, in one page." So here it is, in a page, for everyone who asked me this.
EtherSAN is unique in being a single, comprehensive technology object, used in processors, networks and peripherals. Its expression in each is different, yet each uses the same mechanism from a different perspective. Like Intel's Pentium, the same core can be expressed in strikingly different products, yet remains the same core.
The effect of this is to remove contradictory communications mechanisms that fragment the use of the Internet, replacing them with a fundamentally simpler mechanism that is more effective than any of the prior ones, but only when it is comprehensively deployed. As an example of this, jitter and congestion are mitigated at the point they enter the path, not after they have compounded to impact the transfer. Also, rather than risking data integrity to signal transfer rate back-off, flow control is adjusted to directly constrain dynamic over-capacity.
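To make the last point concrete, here is a toy sketch of my own, not anything taken from the paper. It compares a hop that signals overload by dropping data with a hop that advertises its free buffer space so the sender is constrained before anything is lost. The function name simulate, its parameters and the numbers in the usage line are all assumptions made up for illustration.

```python
def simulate(ticks, offered_per_tick, drain_per_tick, buffer_capacity, use_window):
    """Toy model of one hop. Returns (delivered, dropped) unit counts.

    use_window=True  : the hop advertises its free buffer space and the
                       sender never offers more than that (flow control).
    use_window=False : the sender offers its full rate and the hop drops
                       whatever does not fit (loss as the back-off signal).
    """
    queued = 0
    delivered = 0
    dropped = 0
    for _ in range(ticks):
        free = buffer_capacity - queued
        if use_window:
            # Sender is constrained directly by the advertised window.
            arriving = min(offered_per_tick, free)
        else:
            arriving = offered_per_tick
        accepted = min(arriving, free)
        dropped += arriving - accepted
        queued += accepted
        # The hop forwards at most drain_per_tick units each tick.
        sent = min(queued, drain_per_tick)
        queued -= sent
        delivered += sent
    return delivered, dropped


# Hypothetical numbers: the sender offers more than the hop can drain.
with_window = simulate(10, 5, 3, 8, use_window=True)
with_drops = simulate(10, 5, 3, 8, use_window=False)
```

In this toy both variants forward at the hop's drain rate, but only the drop-based one discards data to get there. A real hop would of course deal with many flows, timers and retransmissions that the sketch ignores.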
The effect is more efficient use of the network by use of a comprehensive "end-to-end" solution that doesn't pay lip service to the principle, but embraces and enforces it stepwise along the communications path.
Prior attempts to justify the end-to-end principle involved reductionism, not minimalism. What we mean by reductionism is that Layer 2 switches and Layer 3 networks need only be limited to those layers in scope. In practice, this is seldom true (for example, RED in routers and flow management in switches), and the limits of the performance of network communications may be traced to the "not good enough" compromises vendors have made here to mitigate the problems - for example, deep buffers and fat packets cause time skewing or high level jitter that affects media synchronization.
What we mean by minimalism is that the simple TCP/IP communications mechanism rules are the definition of what is preserved at every step across the network (while migrating to this, intermediate steps that do not preserve them are compensated for as suboptimal, which limits the effectiveness of that path). Thus we are not minimizing the mechanism used at each step, but the rules used in communication and how we respond locally, in real time, to transient or longer-lived faults.
In this model, recovering from congestion by methods like adaptive inter-hop retransmission is allowed as long as it does not distort the smoothed bandwidth / propagation time of the hop, expressed as an increment of the total path. Interoperability is thus expressed more specifically as a bound on the TCP/IP protocol transformation operations allowed at each step across the path, rather than in just packet formats and protocol exchanges. Like the extensive timing diagrams of circuit switched telephone exchanges, a more rigorous specification of operation is implied, yet the effect is simpler than said exchanges because there is just one derivation of this, and that is the TCP/IP already in use - nothing different.
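One way to picture the bound described above is the rough sketch below. It is my own illustration under stated assumptions, not the EtherSAN specification. Each hop keeps a smoothed time estimate, and a local retransmission is permitted only if the hop's share of the total smoothed path time would shift by no more than a tolerance. The names Hop and retransmit_allowed, the 5% tolerance and the smoothing gain are all hypothetical; the gain of 1/8 is simply borrowed from TCP's smoothed RTT calculation because it is familiar.

```python
from dataclasses import dataclass
from typing import Sequence

ALPHA = 0.125  # assumed smoothing gain, borrowed from TCP's SRTT for familiarity


@dataclass
class Hop:
    name: str
    smoothed_time_ms: float  # smoothed propagation time of this hop

    def after_sample(self, sample_ms: float) -> float:
        """Smoothed time this hop would report after one new sample (EWMA)."""
        return (1 - ALPHA) * self.smoothed_time_ms + ALPHA * sample_ms


def retransmit_allowed(hop: Hop, path: Sequence[Hop],
                       retransmit_cost_ms: float,
                       tolerance: float = 0.05) -> bool:
    """Allow a local retransmission only if the hop's share of the total
    smoothed path time changes by at most `tolerance` (absolute fraction)."""
    total_before = sum(h.smoothed_time_ms for h in path)
    share_before = hop.smoothed_time_ms / total_before

    # Treat the retransmission as one slower sample on this hop only.
    new_time = hop.after_sample(hop.smoothed_time_ms + retransmit_cost_ms)
    total_after = total_before - hop.smoothed_time_ms + new_time
    share_after = new_time / total_after

    return abs(share_after - share_before) <= tolerance


# Hypothetical three-hop path with made-up timings.
path = [Hop("edge", 10.0), Hop("core", 20.0), Hop("storage", 10.0)]
cheap = retransmit_allowed(path[1], path, retransmit_cost_ms=4.0)
costly = retransmit_allowed(path[1], path, retransmit_cost_ms=200.0)
```

The point of the sketch is only that the test is made against the hop's increment of the whole path, not against the hop in isolation. How the bound would actually be specified and measured is exactly the more rigorous specification the paragraph above implies.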
I hope this helps everyone who read the paper and commented. I'd like to make it clear. I owe it to the engineers who have worked so hard on this over the years, and whose voices were drowned out in the bedlam of the bubble, when due diligence was a codeword for "wreck it".