10 January
2007

Academics versus Developers - Is there a middle ground?

networking research, linux and BSD

Jim Gettys of One Laptop per Child is engaged in a furious discussion on the networking / protocol list as to whether academics should take responsibility for reaching out to the Linux community and maintaining their own work within the Linux code base. His concern is that networking academics, when they do bother to test their pet theories, use such old versions of Linux that it becomes infeasible to integrate and maintain this work in current and later versions. The flippant academic response is usually of the "we write papers, not code" variety (which isn't precisely true, and which calls into question the relevance of said papers and the claimed work that stands behind them).


As Jim says himself, "If you are doing research into systems, an academic exercise using a marginal system can only be justified if you are trying a fundamental change to that system, and must start from scratch. Most systems research does not fall into that category. Doing such work outside the context of a current system invalidates the results as you cannot inter compare the results you get with any sort of 'control'. This is the basis of doing experimental science."


This is an old dispute, and one that has its roots in the creation and demise of Berkeley Unix (BSD) distributions. So perhaps a little perspective is in order.


Berkeley started with Unix through the direct involvement of one of its creators - Ken Thompson. One of the projects begun before the creation of BSD and its research group was a modification Professor Fabry suggested to the scheduler of what was then Version 6 Unix. There was a PDP-11/70 with more than 70 users on it, so any inefficiency in the scheduler was revealed, usually in a ghastly manner. This was a production system, because it was also in use at other universities and in use at Berkeley's own computer center with 6 large time-sharing environments. Originally, v6 Unix simply used a process table with an index. Dr. Fabry wanted it to use a ready-to-run queue sorted on priority. Ken Thompson insisted that the benefit would be small, and that the complexity (which was an issue due to the tiny address space -- the kernel program at that time fit into 64 KBytes, later raised by overlays done by William Jolitz as a CS199 project) would be considerable.
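For readers who want the difference in concrete terms, here is a minimal sketch in Python. It is not the Version 6 code and is not derived from it. The names Proc, pick_by_table_scan and ReadyQueue are hypothetical, the convention that a lower number means a more urgent process is an assumption, and a binary heap stands in for the "queue sorted on priority" purely for brevity.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Proc:
    pid: int
    priority: int          # assumption: lower number means more urgent
    runnable: bool = True


def pick_by_table_scan(table):
    """Walk the whole process table and return the most urgent runnable entry.

    The cost grows with the size of the table, runnable or not.
    """
    best = None
    for proc in table:
        if proc.runnable and (best is None or proc.priority < best.priority):
            best = proc
    return best


class ReadyQueue:
    """Holds only runnable processes, kept ordered by priority."""

    def __init__(self):
        self._heap = []
        self._seq = 0      # tie-breaker: equal priorities leave in arrival order

    def make_ready(self, proc):
        heapq.heappush(self._heap, (proc.priority, self._seq, proc))
        self._seq += 1

    def pick(self):
        """Remove and return the most urgent process, or None if idle."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]
```

The trade-off Ken Thompson was pointing at shows up even here: the scan is a single short loop, while the queue needs extra state and bookkeeping on every transition to runnable, which matters when the whole kernel has to fit in 64 KBytes.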


The change was implemented (I'm not sure by whom -- anyone remember?), and the same machine with the same user base was run over week-long periods and the results compared. The gain was found to be more than Ken had predicted. This empirical method of taking a production system and testing a revision was highly successful in proving Dr. Fabry's point and established a model for the creation and testing of systems.


With the success of this project, Unix at the University of California went from a convenience for hackers to the status of a research tool, and a research group of students (graduate and undergraduate) was later established under Dr. Fabry's supervision.


At this time, universities were in fierce competition for ARPA grants, and proposals were solicited for the modernization of the ARPANET. Elements of the modernization included creating a common platform for researchers (hardware and software), and an operating system that would support the networking protocols (originally CATENET, later INTERNET) from BBN and applications. Berkeley took a beta-level minimalist port to the VAX (done by Tom London) and revised it heavily (Bill Joy: system organization and rearchitecting; Ozalp Babaoglu: virtual memory).


There is a considerable list of networking and operating systems advances integrated and tested in this manner at Berkeley, all of which fell under Dr. Fabry's purview. Among the later successes was the work released in the original NET/1, as well as work integrated into the NET/2 releases. Many papers came out of this academic-developer collaboration.


So why did it eventually end? The model of how an academic operating system worked at Berkeley depended upon the maintenance of a "near-production" version of an operating system continuously on a machine with a group of student systems programmers, later supplemented by staff programmers. Academics would collaborate with that group on an idea, the idea would be implemented by the aforementioned programmers, statistics would be gathered, and a paper would be published. But often, the academic did not do the implementation - instead, a programmer in this research group did.


The problem that arose is that as the research group became less a province of student programmers obtaining a CS299 / CS199 credit and more a province of professionally paid staff programmers, the academics, especially at Berkeley, began to see the lessening of an academic mission. In addition, the staff programmers were seen as "hackers", and no one on tenure track wished to be seen as consorting with hackers, so they disassociated themselves from the implementation work that led to outside success. Meanwhile, many of the programmers (many of whom held degrees) expressed frustration and thwarted academic ambition -- their hard work was not taken as serious academic work. Finally, the people who would appreciate the implementation the most were other "hackers", so their professional circles diverged.


In a sense, an inadvertent class structure was created. While the Berkeley group still published papers, oftentimes these papers were done with outside academic groups, and the number of papers dwindled. The CS department at Berkeley had difficulty reconciling this work with its academic and research mission, and the social divide widened.


After Dr. Fabry left, there was no one within the faculty who could be persuaded to take on the management role long-term (it was handled by faculty on an ad hoc basis), and the group drifted. They survived by effectively taking on development work to pay for the maintenance of the software distributions upon which many other universities now depended, and questionable decisions were made independent of Berkeley's research mission. The academic-developer rift was complete, and this led to the eventual demise of the BSD releases.


But academics are still nostalgic for that brief period of time when networking work was in its "golden years". Many of the early Internet successes came about when this model was fresh, equitable and seriously managed, so it is no surprise they would like to see it reestablished.


But as one can see from Jim's exploration of the matter, the academic-developer rift still remains. And given the real costs of developing and maintaining any new work in this area, it is unlikely to change until there is a real sense that the "hackers" who write the code should be considered the equal partners of academics who think up the ideas, and that perhaps both belong on that paper's authorship list.

Posted by lynne at 15:00
Comments
Re: Academics versus Developers - Is there a middle ground?

On a somewhat related topic, I decided I wanted to go to grad school to study congestion avoidance after learning about Van Jacobson's initial congestion avoidance research. I looked at several research groups, some development-oriented and some analysis-oriented. I was hoping to strike a balance between development and analysis.


When I asked people for advice on which groups were best for me, the analysts told me that the developers had a poor understanding of how networks really worked. OTOH, the developers said the analysts were just a bunch of theoreticians sitting in offices, not actively involved in building systems and studying how to improve them.


For a somewhat humorous essay on the developer vs. analyst split, see Doug Comer's "How To Criticize Computer Scientists".

Posted by: Greg Skinner at January 26, 2007 12:27