Alex Cannera dropped an interesting paper on my desktop discussing congestion control in grid networks. And its results confirm what I and others have seen over the years; Vint Cerf saw in 1998 that hop-by-hop reliability preserving end-to-end semantics in the routers was the real key to handling this issue. Vint is also a renowned wine expert, and he treated William and me to a wonderful tour of fine wines at the Rubicon Restaurant in San Francisco, where we had a memorable discussion on exactly this issue.
Of course, their terms of art are different from mine, since we all seem to invent new terms. So their “network of queues” is my bucket brigade mechanism. And their test demo is similar to one at InterProphet that we called FlameThrower, devised by Senior Hardware Engineer Todd Lawson and Software Engineer Madan Musuvathi to literally flood the other side with packets and see if it falls over.
Todd and Madan built a wirewrap version of SiliconTCP on a DEC PAM card with a NIC wired on (and that’s exciting with 100 MHz logic). We demo’d this to Microsoft, venture firms, and lots of other companies back in the summer of 1998. I have the wirewrap on my wall alongside a production board.
But the solution presented in this paper, to “back pressure” the plug by selectively disabling TCP congestion control, is where we part company. Herbert and Blanc/Primet quite correctly point out some of the barriers to FAST, HighSpeed TCP, Scalable TCP, and XCP, but then fall back on the old link-layer approach (which we diddle in the stack software). If only it were that simple.
A reliable link layer isn’t enough, and Vint (in looking back) clearly knew this. That’s why he saw SiliconTCP as fitting best here. This was the key reason he joined the board of InterProphet so many years ago.
Many people have made reliable link layers. We’ve done it with the boards we have here right now. But no one else made a reliable network and transport layer that spans many hops, maximizing the capacity of the aggregate network. Our boards also do this. So we did demo Vint’s vision in practice.
It’s only now that people are starting to thrash the problem that Vint saw many years ago. But they lack his insight as to the real nature of the problem. It isn’t turning off congestion control — it’s using it effectively.
So long as engineers think the answer is a simple “stack hack” instead of rethinking how to meet the protocol demands more effectively (not new protocols, not turning off congestion control, not cheating by biasing fairness, but really simply doing our job better), we’ll continue to run into this problem.