Sam Jansen of Wand Network Research Group in New Zealand recently complained of “weirdness” with Windows XP (SP2) and TCP when doing a TCP test of two systems connected over a link (very similar to those demos at InterProphet). After all, what could be simpler than a couple of cans and a string, right?
No, it isn’t simple. He finds that Windows is sending data outside of the receiver’s advertised window, as well as sending “weird sized” packets in what is supposed to be a “simple bulk data transfer (often sending packets with a lot less than an MSS worth of data)”. What’s going on here?
Poor Mr. Jansen is not losing his mind – what he is seeing is real and Microsoft is cheating. We have seen lots of little cheats like this while testing on Windows for SiliconTCP and EtherSAN, from 1998 to the present. In sum, Microsoft does this because they think they “get ahead” with a technique called “oversending”. It thrives because TCP congestion control algorithms are all pessimistic with the send budget. It doesn’t always work, like any cheat, but I guess it makes them feel good.
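For anyone who wants to check a packet trace for the same behavior, here is a minimal sketch of the arithmetic, not Mr. Jansen's actual test setup. The function name `bytes_beyond_window` and its parameters are hypothetical, and it assumes the advertised window has already been scaled to bytes (window scaling is not handled).

```python
MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits

def bytes_beyond_window(last_ack, advertised_window, seq, length):
    """Return how many payload bytes of a segment fall past the right edge
    of the receiver's advertised window (0 if the segment fits).

    last_ack and advertised_window come from the receiver's most recent ACK;
    seq and length describe the data segment the sender put on the wire.
    """
    right_edge = (last_ack + advertised_window) % MOD
    segment_end = (seq + length) % MOD
    # Distance in sequence space, tolerant of 32-bit wrap-around.
    overshoot = (segment_end - right_edge) % MOD
    if overshoot >= MOD // 2:
        return 0  # segment ends before the right edge
    return min(overshoot, length)

# Receiver acknowledged byte 1000 and advertised 4000 bytes of window.
inside = bytes_beyond_window(1000, 4000, seq=3000, length=1460)
outside = bytes_beyond_window(1000, 4000, seq=4000, length=1460)
```

A non-zero result for a segment sent after the ACK was received is the kind of "outside the window" send described above.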
Mediapost published an essay on cyberbullying – “Cyberbullying has suddenly entered into popular consciousness.” So it’s a new phenomenon, right? Nope, it’s been around as long as electronic communications made it possible. It just wasn’t as visible since there were fewer channels of communications, plus if someone acted up you could get them thrown off. Now, in a global Internet, there are plenty of places to hide and plenty of eyeballs for venom – you just gotta know where to look.
I came across a scheduled talk at Stanford Networking Research Center this week on “Cognitive Networks: Implementing Alternate Network Management & Routing with Software Programmable Intelligent Networks” by Shannon Lake, CEO of Omivergent. Unfortunately, I had another seminar to attend at exactly the same time (as usual), but I was curious about this talk and Mr. Lake’s assertions on “IP dogma”. So I went and asked him why we need to “change our views on IP networks”, layer 3 versus layer 4, and the impact of jitter. He most kindly replied.
Well, my Japanese datacenter manager story hit a bit of a nerve, with one reader asking “doesn’t anyone test equipment anymore?” You’re correct. This was the first question in this incident. Didn’t anybody test anything? Yes, they did, as did the datacenter. Here’s the continuing saga of A Tisket, a Tasket, I’ve Lost My TCP Packet direct from that datacenter manager.
A gentleman today wondered if his expensive leased fibre line was causing packet loss, even though he compared it with an ADSL line from the server to the host. As Dennis Rockwell of BBN pointed out, “What you have discovered is that your 2Mbps link is not the bottleneck; that lies elsewhere in your network path. The extra bandwidth of the fiber link cannot help this application”.
Dennis is correct. But how do you know where to look to fix the problem? Here’s a little story from a manager of international datacenters in Japan and the US to illustrate how complicated the issue can become…
In response to my current article Buffer, Buffer, Where is the Buffer? in Byte, Jim S. sent me the following:
Nice article in Byte. It reminds me of the old days
when you could read a good technical piece in the print Byte.
Kind of a rare phenomenon today.
But do you really mean to say that *all* security
problems are buffer problems?
Thank you Jim for your kind words. Could you please tell the editor of Byte as well? That way, more articles like this come the reader’s way. 🙂
No, obviously security isn’t just buffer overflows. But these little bandaids are everywhere, and cause an amazing amount of problems for something so trivial.
For example, on Cnet today another buffer overrun afflicting Windows was announced. “Secunia issued an advisory saying a buffer overrun flaw has been found in Office 2000, and potentially also in Office XP, that could allow hackers to take over a user’s system. The company rated the flaw as ‘highly critical.’” Alas, these bulletins are all too common.
I used the essay to illustrate that a one size fits all solution like a buffer can have larger implications than my “engineer” in the introduction realized, and that his solution may not be a solution at all. There’s a lot of sloppy thinking nowadays, and that doesn’t help in a more competitive global economy. I’d like to see fewer unemployed obsolete engineers and scientists, and more innovation and critical thinking. So I write these essays. I hope it helps. And I hope you continue to enjoy them.
In an off-list discussion in the protocols interest groups, I got involved in a rather deep discussion of packet rate, congestion control, network neutrality, jitter and choices in Internet design, which are actually quite interesting to share.
A little background here – one person asked if it was true (it is) that the cwnd (congestion window) internal stack variable doesn’t have an immediate impact on the network, because TCP updates its actual rate only once per RTT in the congestion avoidance phase, so the cwnd += SMSS*SMSS/cwnd update with each ACK is only an internal calculation. You got that?
Which went on to the question posed to me – “While I now believe that it would actually be ‘legal’ according to the spec. to implement a TCP sender like this (no one seems to say that you MUST saturate your window at all times)…”
Wait partner. Going back into the Internet Wayback Machine and chatting with some of the earlier worker bees, it turned out it actually started out this way, and congestion backoff fell right out of this.
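The per-ACK update from the background above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `congestion_avoidance_rtt` is hypothetical, SMSS is assumed to be 1460 bytes, one ACK is assumed per full-sized segment, and there is no loss. It shows why the many small per-ACK increments amount to roughly one segment of growth per round trip.

```python
SMSS = 1460  # sender maximum segment size in bytes (assumed value)

def congestion_avoidance_rtt(cwnd, smss=SMSS):
    """Apply cwnd += SMSS*SMSS/cwnd once per ACK for one round trip.

    Assumes one ACK arrives per full-sized segment in flight and no loss.
    Returns the congestion window at the end of the round trip.
    """
    acks_this_rtt = int(cwnd // smss)
    for _ in range(acks_this_rtt):
        cwnd += smss * smss / cwnd  # internal bookkeeping on each ACK
    return cwnd

start = 10 * SMSS                      # ten segments in flight
after = congestion_avoidance_rtt(start)
growth_in_segments = (after - start) / SMSS
```

Each ACK adds only a fraction of a segment, so the sender cannot put an extra full-sized segment on the wire until about a round trip's worth of ACKs has accumulated.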
Well, it’s official – Procket is unplugged and sold to Cisco. At $89M, where they invested $300M, and Cisco was an early investor, I’d say they got a bargain. But will they use the technology?
According to one Cisco insider I spoke with, he thinks the technology isn’t the big thing. “I think we regard it as a bargain: purchase 50 high end engineers, fluent in router design, ASIC design and layout, board design, SW, etc. for 80 million. Not a bad deal”. As to the tech, he says simply “We are fragmented enough as it is”. So they’ll find a use for it somewhere but it isn’t urgent.
Alex Cannara loves to push those “why don’t we just turn off that pesky congestion control” papers my way. I think he does it just to annoy me. Which is correct, because I can’t imagine ever getting such a paper approved. But, as Britney likes to say, “Oops, they did it again…”.
It seems like every other CS grad student thinks he can get away with “disabling of TCP’s congestion control” and suddenly he’s solved the problem of congestion. Or, to put it in medical terms – it isn’t the disease, it’s the treatment. Everything is wonderful if you just stop treating the condition – even if the patient dies? Very much like a physics student thinking he’s gotten around energy conservation, when he doesn’t get what total energy of a system means, and wow, he’s invented a perpetual motion machine.
In a walk down memory lane, Craig Partridge and Alex Cannara discussed Craig’s mention of an XCP meeting and Greg Chesson, Alex saying “But, we still have suboptimal network design, insofar as we depend on TCP from the ’80s and a glacial IETF process — all this while now having complete web servers on a chip inside an RJ45 jack! So maybe his ideas for SiProts were something to consider, even if they weren’t right on target?”
For those not in the know, Greg Chesson stepped on a lot of “TOEs” (hee hee) first in the early 1990s by filing a lot of patents with protocol engines (PEI – backed by HP at the time).
I have a slide from a presentation that I did for Intel back in 1997 explaining why he failed — simply put, preprocessing likely conditions based on heuristics always failed in the general case, with the preprocessor commonly falling behind the processor even though it was put there to speed up the processing — so the software stack on average was usually faster.
This same process in FEPs has been repeatedly repatented in network processors — I reviewed several — but they never got the methods that allow for completion of the processing without falling behind (esp. on checksum, but there are also other conditions). I always thought Greg could sue a number of network processor companies for infringement, but since they all fail in the same way, who the hell cares.
Greg made his money in SGI, by the way, and look how that company eventually turned out — lots of “throw code over the fence” to Linux, which undermined their own sales of systems. Very self-destructive company.