15 October 2004

A Tisket, A Tasket, I Lost My TCP Packet Part II

Doesn't Anyone Test Equipment Anymore?

Well, my Japanese datacenter manager story hit a bit of a nerve, with one reader asking "doesn't anyone test equipment anymore?" Fair point. That was the first question asked during the incident: didn't anybody test anything? Yes, they did, and so did the datacenter. Here's the continuing saga of A Tisket, a Tasket, I've Lost My TCP Packet, direct from that datacenter manager.


"Never got a straight answer (this is Japan), but we believe that it appeared fine on initial use, then degraded. The supposition among the datacenter support staff for the company (not the Ariake datacenter staff) was a defect that resulted in a soft error (like perhaps a weak component). Power off / power on - fine for a while. But as soon as we got under serious load, poof.


So it hit the first load day. And thereafter.


In the US, this would be sufficient grounds for replacement (once is enough), but because it appeared sporadic, and because the Ariake staff had already tested it and it was fine, they would not replace it. They presumed we were increasing their costs to force a cancellation of a contract (there was no intent to do this, but you can't get inside people's heads and argue with their fears easily). It's partly a cultural issue - the Japanese are very good on hard error situations, but don't take well to soft / sporadic situations in a damage control society because someone's going to get blamed, and that ruins careers.

In sum, they couldn't perceive it as damaging the quality. In truth, they did not have the correct services in place to move a client from one floor of a datacenter to another (they were expanding). The supplied LAN service was marginal for use on the interior of the 3 tier datacenter (like app server to database, because retransmission didn't really cost you much), but the problem was that they used it on tier 1, and that did make a major difference.


Sometimes you run into social situations where people get absolutely fixed on believing they have an adequate situation. It was failing - but by then everyone was fixated on CYA. Yet the problem was very real. The site was one of the most popular in Japan. But it still was in Japan, and one has to play the cards one is dealt - not the ones one would prefer.


That's when managers like me really earn their pay."
