Wednesday, March 29, 2006

Andy Tanenbaum talk now!

I'm sitting in the Andy Tanenbaum talk right now. Andy just passed out CDs of his MINIX 3 secure OS, and I got one.

He is now talking about Myhrvold's Laws, which come from Nathan Myhrvold:
- software is a gas, it expands to fill the container

He is talking about software bloat, comparing the lines of source code from Windows NT 3.1 all the way to Windows XP, and how the source code would fill a bookcase if it were printed out and bound as books. Compared with other real-world products, computer OSes are still not reliable or easy to use. As an example, he cited a newspaper article about a PhD in Computer Science who couldn't fix his computer and threw it out instead!

Computers have not reached the Television Model, where you buy a TV, plug it in, and it works perfectly for 10 years. He contrasts that with the Computer Model, where you install different pieces of software, reboot the computer, it still doesn't work, and then you have to reinstall Windows!

Therefore, operating systems need a rethink. Software has gotten too bloated, slow, and buggy. To achieve the TV model, OSes need to be small, simple, modular, reliable, secure, and self-healing. Andy's research aims to achieve these goals.
He created MINIX, which was released in 1987. His point is that OSes need intelligent design. The microkernel for MINIX 3 has 4000 lines of code (LoC) and has 40 kernel bugs compared to 25,000 bugs for Linux. It has low interrupt latency (10 microseconds), and the system is highly modular and runs as multiple user-mode processes.
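
The bug figures above can be sanity-checked with a little arithmetic. This is only a sketch: it assumes both numbers come from the same bugs-per-1000-lines estimate, which is my reading of the slide and not something Andy spelled out. The function and variable names are mine.

```python
def bugs_per_kloc(bugs, lines_of_code):
    """Defect density: bugs per 1000 lines of code."""
    return bugs / (lines_of_code / 1000)

# Figures quoted in the talk for the MINIX 3 microkernel.
minix_density = bugs_per_kloc(bugs=40, lines_of_code=4000)

# ASSUMPTION: Linux is estimated at the same density.
# Under that assumption, 25,000 bugs implies this many lines of kernel code.
implied_linux_loc = 25_000 / minix_density * 1000

print(minix_density)      # bugs per KLoC
print(implied_linux_loc)  # implied Linux kernel size in lines
```

So under that assumption the comparison is really a comparison of code size: a kernel that is hundreds of times smaller is expected to carry hundreds of times fewer bugs.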

This is the architecture of MINIX 3:

Bottom layer: Clock and Sys, microkernel handles interrupts, processes, scheduling, IPC
Next layer: Driver
Next layer: Server
Next layer: User

What Andy is saying about interprocess communication gives me deja vu from the OS course I took at Waterloo almost 10 years ago. Now he is talking about the kernel architecture. The kernel supports processes and IPC using the rendezvous principle (no buffering or extra copying). IPC connects the user processes, server processes, and driver processes, and there are kernel calls for servers and drivers. A file system or driver can only write into the memory designated for it, so there is no competition for memory and no shared memory, unlike Windows, where applications crash because one tries to use memory that another application or process is already using.
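
To make the rendezvous principle concrete, here is a minimal sketch in Python using threads. It is not MINIX code: the `Rendezvous` class, its method names, and the "read block 7" request are all made up for illustration. The point it shows is that `send()` does not return until a receiver has actually taken the message, so nothing is ever queued.

```python
import threading

class Rendezvous:
    """Unbuffered channel: send() blocks until a receiver takes the message."""

    def __init__(self):
        self._send_lock = threading.Lock()   # only one sender at a time
        self._cond = threading.Condition()
        self._slot = None                    # single hand-off slot, not a queue
        self._full = False

    def send(self, msg):
        with self._send_lock:
            with self._cond:
                self._slot = msg
                self._full = True
                self._cond.notify_all()
                # Block here until the receiver has picked the message up.
                while self._full:
                    self._cond.wait()

    def receive(self):
        with self._cond:
            # Block until a sender shows up.
            while not self._full:
                self._cond.wait()
            msg = self._slot
            self._slot = None
            self._full = False
            self._cond.notify_all()          # release the waiting sender
            return msg

def demo():
    chan = Rendezvous()
    log = []

    def driver():
        request = chan.receive()
        log.append(("driver got", request))

    t = threading.Thread(target=driver)
    t.start()
    chan.send("read block 7")   # returns only after the driver received it
    t.join()
    return log

print(demo())
```

Because sender and receiver meet at the hand-off, there is no message queue to size, overflow, or clean up.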

For the process manager, there may be no demand paging in the OS, because we now have 1 GB of memory, unlike in the VAX days. Andy questions whether demand paging is needed anymore given how abundant memory is now. Everything I've heard so far seems like standard OS architecture; what's new is the Reincarnation Server, which is the parent of all the drivers and servers. When a driver or server dies, it runs a shell script that logs what happened and collects the information. This allows recovery when a process crashes, so the crash doesn't take down the OS the way it does on Windows or Linux. To detect failures, the reincarnation server pings the drivers and servers frequently. He said he wants to rename this to the Dick Cheney server. (Audience laughs)
To be able to restore the disk driver, a copy of the disk driver is kept in memory, so that if the disk driver dies it can still be recovered.
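
Here is a toy simulation of the reincarnation idea, just to pin down the ping-and-restart loop. Everything in it is hypothetical (the `Driver` and `ReincarnationServer` classes, the `generation` counter, the log format); the real server manages actual processes and runs a shell script on failure. The `images` dictionary stands in for the copies kept in memory, which matters for the disk driver since it can't be reloaded from a disk whose driver is dead.

```python
class Driver:
    """Stand-in for a user-mode driver or server process."""

    def __init__(self, name, generation=1):
        self.name = name
        self.generation = generation
        self.alive = True

    def ping(self):
        # A healthy driver answers the ping; a crashed one does not.
        return self.alive


class ReincarnationServer:
    """Parent of all drivers: pings them and restarts any that died."""

    def __init__(self):
        self.images = {}     # in-memory copies used to recreate a driver
        self.children = {}   # currently running drivers, by name
        self.log = []

    def start(self, name):
        self.images[name] = Driver            # keep the "binary" in memory
        self.children[name] = self.images[name](name)

    def check_once(self):
        """One round of pings; restart whatever fails to answer."""
        for name, driver in list(self.children.items()):
            if not driver.ping():
                self.log.append(f"{name} died; restarting")
                self.children[name] = self.images[name](
                    name, generation=driver.generation + 1
                )


rs = ReincarnationServer()
rs.start("disk")
rs.start("ethernet")

rs.children["ethernet"].alive = False   # simulate a driver crash
rs.check_once()

print(rs.log)
print(rs.children["ethernet"].generation, rs.children["ethernet"].alive)
```

The rest of the system keeps running through the crash; only the failed driver is replaced.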

If other drivers crash, for example the Ethernet driver, they are simply restarted, because TCP takes care of lost packets. For kernel security and reliability, fewer LoC means fewer kernel bugs and a reduced trusted computing base. Once the kernel is stable, don't change it or add more code, or you will introduce more bugs; the idea is to move the bugs to user space. Separate instruction and data spaces provide protection against buffer overflows. All data structures are static; there is no dynamic memory and no malloc in the kernel, which removes that source of errors.

For IPC reliability and security, MINIX 3 uses fixed-length messages, which prevents buffer overruns. The rendezvous system is simple, and interrupts and messages are unified. There is deadlock prevention that prevents loops. The drivers are assumed to be buggy and untrusted, so they are heavily isolated and can't screw up the system. The drivers cannot touch kernel data structures, and infinite loops are detected and the driver is restarted.
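
A quick sketch of why fixed-length messages help, using Python's `struct` module. The layout here (two integers plus a 24-byte payload, 32 bytes total) is my own invention for illustration and is not the actual MINIX 3 message format. Since every message has the same size, the receiver always knows exactly how many bytes to accept, and anything else is rejected.

```python
import struct

# HYPOTHETICAL layout: source id, message type, 24-byte payload.
PAYLOAD_SIZE = 24
MSG_FORMAT = f"<ii{PAYLOAD_SIZE}s"
MSG_SIZE = struct.calcsize(MSG_FORMAT)   # always the same number of bytes

def pack_message(source, msg_type, payload=b""):
    """Build a fixed-size message; oversized payloads are refused, not truncated."""
    if len(payload) > PAYLOAD_SIZE:
        raise ValueError("payload does not fit in a fixed-length message")
    # Short payloads are padded with zero bytes by the 's' format.
    return struct.pack(MSG_FORMAT, source, msg_type, payload)

def unpack_message(raw):
    """Accept only buffers of exactly MSG_SIZE bytes."""
    if len(raw) != MSG_SIZE:
        raise ValueError("wrong message size")
    source, msg_type, payload = struct.unpack(MSG_FORMAT, raw)
    # Stripping the zero padding is fine for this text-only demo.
    return source, msg_type, payload.rstrip(b"\x00")

raw = pack_message(source=3, msg_type=1, payload=b"read")
print(len(raw), unpack_message(raw))
```

The receiver never has to trust a length field supplied by the sender, which is where many buffer overruns come from.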

For security, each driver and server has a security policy, which is enforced by the kernel. Now he is telling the story of the devil and the nerd, in which the devil offers a Faustian bargain: "I'll give you twice the speed and twice the crashes." The nerd says, "Thank you, Mr. Devil. I'll take it." Nontechnical users would never accept it. CS people are always concerned about system performance. It takes about 4 seconds to build the entire operating system. They use their own compiler to compile the modules. Killing the driver once per second causes a 9% performance loss (from 11 Mbps to 8 Mbps), so you lose some performance but gain reliability. It's a tradeoff, as he said repeatedly.
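
The per-driver security policy mentioned at the start of this section could look roughly like the sketch below. The process names, the call names, and the `kernel_call` function are all hypothetical; I'm only illustrating the idea that the kernel checks every request against a table saying what each driver or server is allowed to do.

```python
class PolicyViolation(Exception):
    """Raised when a process asks for a kernel call it is not allowed to make."""

# HYPOTHETICAL policy table: which kernel calls each process may use.
POLICY = {
    "ethernet_driver": {"io_read", "io_write", "irq_enable"},
    "file_server": {"send", "receive"},
}

def kernel_call(caller, call):
    """Enforce the caller's policy before doing any work on its behalf."""
    allowed = POLICY.get(caller, set())   # unknown processes get nothing
    if call not in allowed:
        raise PolicyViolation(f"{caller} may not use {call}")
    return f"{call} ok for {caller}"

print(kernel_call("ethernet_driver", "io_read"))

try:
    kernel_call("file_server", "io_write")
except PolicyViolation as err:
    print("denied:", err)
```

A buggy or compromised driver can then only misuse the few calls it was granted in the first place.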

The logo for MINIX 3 is a raccoon because it's small, cute, clever, agile, eats bugs, and more likely to visit your house than a penguin. (Audience laughs)




The website is http://www.minix3.org.

He is now giving a commercial for a topmasters program that graduate students can apply to.

I got Andy Tanenbaum to sign my copy of the Computer Networks textbook, Third Edition!

1 comment:

Anonymous said...

Thanks for sharing what sounded like a great talk on the new Minix 3! It sounds like the only OS taking security seriously enough to develop the 'trusted core' kernel security concepts and let everything else ride on top. Hope to dig into it a little bit more.