Lisp-based OSes (linuxfinances.info)
84 points by fogus on Feb 8, 2011 | hide | past | favorite | 35 comments


The Unix-style process model has virtues that the OP doesn't seem to grok. It's sometimes helpful to be able to restart one server from a completely clean memory image without taking the rest of the system down.

Beyond that: the OP sez that "if the whole system is constructed and coded in Lisp, the system is as reliable as the Lisp environment. Typically this is quite safe, as once you get to the standards-compliant layers, they are quite reliable, and don't offer direct pointer access that would allow the system to self-destruct."

But as I write, we've got two Lisp posts on the front page, and the other one[1] is about the performance of code compiled with

  (declaim (optimize (speed 3) (safety 0) (space 0)))
That is --- "omit safety checks, just trust me that my array accesses are all in bounds and I'm getting the types right." Code compiled this way is not inherently safer than C, and has to be coded up with equal care.

So, at the very least, the "quite safe" guarantee applies only to code compiled with full safety checks, which typically come with a very large performance hit.

[1] http://news.ycombinator.com/item?id=2192629


The other half of his point was specially tuned hardware - the author seems to believe that the type-checking, gc, etc. don't cripple performance the way they do on x86.

I don't know if he's right, but that seems to be his point.


The idea is that you would have bits in the hardware dedicated to type checking and garbage collection. For example, the assembly language/machine code may have a single arithmetic '+' operation.

Determining which hardware path to use to add two numbers would be done in the hardware itself: check the type bits of the numbers and feed them into the right ALU path. Compare this to an x86 Lisp, or compiled C, where the 'type' of a 'number' is determined by the assembly instruction that is used on it.

This isn't just a performance improvement, it is also an improvement in the safety of the dynamic language.
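The tag dispatch described above can be sketched in C as a software rendition of what a Lisp machine's ALU would do in hardware. The tag layout and names here are illustrative, not taken from any particular machine:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative tagged word: the low 2 bits hold a type tag,
 * tag 0 = fixnum (small integer stored in the upper bits). */
#define TAG_MASK   0x3u
#define TAG_FIXNUM 0x0u

typedef uintptr_t word;

static word make_fixnum(intptr_t n) {
    return ((uintptr_t)n << 2) | TAG_FIXNUM;
}

static intptr_t fixnum_value(word w) {
    return (intptr_t)w >> 2;
}

/* A generic '+' that checks tags before picking an execution path;
 * on a Lisp machine this check happens in the hardware itself. */
static word generic_add(word a, word b) {
    if ((a & TAG_MASK) == TAG_FIXNUM && (b & TAG_MASK) == TAG_FIXNUM)
        return a + b;  /* both tags are 0, so the sum stays correctly tagged */
    /* other tags would dispatch to float/bignum paths, or trap */
    assert(0 && "type error: operands are not fixnums");
    return 0;          /* unreachable */
}
```

Note that because the fixnum tag is all zero bits, adding two tagged fixnums directly yields a correctly tagged result, which is part of why Lisp implementations favor that layout.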

There are a lot of different things that you could do for garbage collection. You could have in-hardware reference counting, or 'dirty' and 'clean' (or color) bits, for a mark and sweep collector, or 'generational' bits for an ephemeral garbage collector.

The idea is that any time you take something out of software, and put it into specialized hardware, you should get a performance improvement.

This doesn't mean that the lisp on a chip would be faster than C on a comparable x86 chip, it means that the things that make lisp (and other functional languages) safer and easier to use would be supported in hardware-- therefore not slowing things down as noticeably.


> it means that the things that make lisp (and other functional languages) safer and easier to use would be supported in hardware-- therefore not slowing things down as noticeably.

Or, at least, forcing every language implementation on that hardware to use the same safety mechanisms, making some apples-to-apples benchmarks impossible.

It would be interesting to see what a C implementation for that hypothetical modern Lisp machine (CADDR?) would look like.

A close parallel is AMPC, which compiles C to JVM bytecode.[1] The vendors say it's standards-compliant, and I actually am pretty sure it is, but it doesn't do a lot of the nonstandard things C programmers have come to depend on. For example, the 'struct hack', where you pack data of multiple types into a struct and proceed to index into it as if it were an array (usually an unsigned char array), flatly does not work, due entirely to the runtime type checking done by the JVM. This always seems to lead to major debates over whether it's a very good compiler.

[1] http://www.axiomsol.com/


The 'struct hack' is when you leave the size of the last array member of a struct unspecified (effectively making structs variable-sized). This is actually not a problem for runtime type checking, and the flexible array member makes it C99-compliant.
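For reference, a minimal sketch of the C99-compliant form using a flexible array member (the struct and function names here are illustrative):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* The legal 'struct hack': a flexible array member (C99) makes the
 * struct effectively variable-sized. */
struct message {
    size_t len;
    char body[];   /* flexible array member: no size, must be last */
};

/* Allocate the struct plus however many bytes the payload needs. */
struct message *make_message(const char *text) {
    size_t len = strlen(text);
    struct message *m = malloc(sizeof *m + len + 1);
    if (!m) return NULL;
    m->len = len;
    memcpy(m->body, text, len + 1);
    return m;
}
```

The key point for the type-checking discussion is that `body` keeps a real element type (`char`); only its length is open-ended, so a checked runtime has nothing to object to.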

What causes problems is casting pointers to ints and back, and casting all other crap to chars. This is not standards compliant.

Casting ints to pointers will never be type-safe, but one way to get around that is to just ignore the cast, and overload arithmetic operators to work correctly on pointers - the pointers will carry around their type info, and everything should work ok.

Casting other crap to chars will never work because it interferes with the way the other crap has its type encoded. Luckily in most cases this casting is done to perform I/O, where you can also just ignore the cast, and specialize the lowest-level I/O functions to dispatch on the actual types.

The moral of the story is that you should basically ignore all the line noise the programmer produces about types, and look at the actual objects. This is exactly how Java works, btw.

WRT hardware tagging and type checks, there's really no reason to do it on a byte-addressed superscalar processor. If you look at 64-bit Common Lisp implementations today, you'll actually find that they use only about half the available tag bits in each word. The only thing that needs to be boxed is double-floats.


"The Unix-style process model has virtues that the OP doesn't seem to grok. It's sometimes helpful to be able to restart one server from a completely clean memory image without taking the rest of the system down."

Unix processes virtualize real processes by providing indirection to the basic state identifiers of machine code - memory addresses. It's actually much simpler to virtualize a Lisp image - the basic identifiers are symbols. Dynamic binding already lets you do this to parts of the Lisp process. What's needed is an extension to let you do this for packages/modules.

"That is --- "omit safety checks, just trust me that my array accesses are all in bounds and I'm getting the types right." Code compiled this way is not inherently safer than C, and has to be coded up with equal care.

So, at the very least, the "quite safe" guarantee applies only to code compiled with full safety checks, which typically come with a very large performance hit."

With the Common Lisp model of optimization you can do this selectively for parts of your code, and the compiler can ignore your type hints or declarations if it wants to.

Array bounds checking can be lifted out of loops because CL arrays come with information about their dimensions. This is a fundamental and unfixable problem in C.

Type safety checks are only a performance problem in numeric loops, where it's always possible to provide the compiler with type information so that it can produce provably correct code (if you're looping over arrays of floats, you only have to check the type of the array once outside of the loop - again, this is a fundamental problem of C that is unfixable).
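The hoisting argument above can be made concrete with a small C sketch, assuming a hypothetical array type that carries its length the way Common Lisp arrays do (the names here are illustrative):

```c
#include <assert.h>
#include <stddef.h>

/* An array that knows its own length, as CL arrays do. */
typedef struct {
    size_t len;
    double *data;
} darray;

/* Naive version: a bounds check on every single iteration. */
double sum_checked(const darray *a, size_t n) {
    double s = 0.0;
    for (size_t i = 0; i < n; i++) {
        assert(i < a->len);   /* per-element check */
        s += a->data[i];
    }
    return s;
}

/* Hoisted version: because the array carries its length, one check
 * outside the loop proves every access inside it is in bounds. */
double sum_hoisted(const darray *a, size_t n) {
    assert(n <= a->len);      /* single check, lifted out of the loop */
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a->data[i];
    return s;
}
```

In raw C the length lives in a separate variable with no enforced connection to the allocation, which is why a compiler cannot perform this proof for an arbitrary `double *`.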

People have wasted hundreds of man-years trying to come up with automatic verifiers and safe subsets of C, but the language has so many problems and so few upsides that none of that work is worth it if you want to build reliable systems. Everything about C that is broken is actually very easy to fix, as long as you don't mind not using C.


> With the Common Lisp model of optimization you can do this selectively for parts of your code, and the compiler can ignore your type hints or declarations if it wants to.

What if those parts of your code have bugs in them? There's no reason they wouldn't, and whoever's writing file format parsers (fonts, video, JPEG) on your platform is going to turn off safety because it's too slow, even though that code has many attack surfaces.


You're making a lot of assumptions:

  1. Those type checks will be too slow
  2. The compiler will choose to follow declarations for untrusted code
  3. The declarations will lead to bugs on some inputs
  4. Those bugs will be exploitable
  5. Somehow those exploits will be worse than what's currently the case with C
99% of C exploits are string/buffer overflows. Bounds checking is not expensive, especially when your arrays carry around their size information. You can enable it in many current C compilers. W^X is an attempt to put something resembling bounds checking into the hardware.

I honestly don't understand the people that argue bounds checking is too slow. How many times do you have to make the same mistake to realize that what you're doing is wrong? It's just idiotic.


DARPA recently awarded a grant to Olin Shivers, along with members of Northeastern University's and the University of Utah's faculty, to "seek to develop bug-free, secure technology using brand-new programming languages that enable programmers to write large, complex software."[1]

Around campus, it's been described as an opportunity for Shivers et al. to write an operating system built completely with functional languages, from the low-level drivers up to user-space tools and new programming languages.

My personal thought is that it'd be awfully cool to have something like "Emacs as a real OS." Perhaps it is lack of knowledge and self-confidence, or the limited nature of Emacs, but I find it way easier to change the way Emacs works than to change the way the Linux kernel, GNOME, GNU tools, etc. work.

[1] Page 12 of http://www.ccs.neu.edu/news/CCIS-Newsletter-Fall-10.pdf


There are some interesting opportunities for functional languages in modern settings, but they seem to be mostly for things in the ML family. For example, Mirage (http://deeprecursion.com/cloud-computing-on-the-metal and http://research.cens.ucla.edu/events/?event_id=263 )


I think it would be interesting to chop a *macs into an operating system. I would approach it in an iterative fashion with these initial goals:

- Replace elisp with Common Lisp

- Build os-level threading support

- Build a hardware abstraction layer / target a 'bare' machine.

That gets someone a 'ways' towards a traditional Lisp OS.

I think one of the big questions that arises for a modern Lisp system is the design of multiple processes and multiple users.


Personally, I'd rather toss Lisp entirely and go with Scheme, but something definitely needs to be done about elisp. There's a small group of undergrads here at NU hacking on Edwin, a Scheme-based Emacs clone originally developed at MIT. They're (we're, I suppose) porting it to Scheme48.

Those last two would certainly be important as well, but they're areas I have no context on.


Galois (a Haskell shop) has recently released a port of the GHC Haskell runtime to Xen (http://halvm.org/wiki/). This seems like a promising approach to building single-task images with the high-level abstraction of a functional language but low-level access to the machine.


I bought a Xerox 1108 Lisp Machine in 1982 and loved it for the great display, windowing system, and awesome InterLisp-D development tools.

That all said, I prefer the modern world of general purpose operating systems with good commercial (Franz, LispWorks, etc.) and free (SBCL, Clozure, Clojure, etc.) Lisp development environments.


I agree, although most CL and Scheme implementations seem to be stuck in the Lisp box, from which escape is painful. Racket (PLT Scheme), Chicken Scheme, and Lush do a pretty good job of providing good access to the OS, but SBCL and CLisp (neither of which I've poked at in years, so there may be some magic library by now) always felt like they wished they were the OS and pretended there was no Unix, owing to the time when CL was standardized and this was sort of still true.

Smalltalk implementations are pretty similar in this respect, although both language families are still pretty fun. Erlang and Inferno/Limbo are both sort of guilty of this, but for slightly different reasons. But it's pretty inconvenient around the borders, getting into and out of the "real" OS. It's sort of a mixed blessing that nowadays, most developer OSs are (on some level) Unix: there's a baseline, a set of facilities that can be expected to be present and to behave a certain way, so most modern Lisp-like languages are pretty friendly to the OS (or at least provide an FFI to C). The downside is that the baseline is POSIX, a little stiff sometimes and often quirky for historical reasons. (Still waiting for Plan 9 to catch on... Maybe I should stop holding my breath.)


On a whim, have you ever looked into any of the LISP machine emulators?


Yes. I found a simulator for the 1108 bundled with an NLP package and ran it for an hour, then deleted it. Not the same experience as using my old 1108.


Combining the Unix style process separation with mechanisms in the language might be useful.

A problem with such environments might be the availability of widely used applications like Google Chrome or Firefox. A way around this might be to expose the language's Virtual Machine as bytecode or some other intermediate, and target C compilers to that. This way, an entire POSIX environment could be built on top of the Lisp based OS, which would be more comfortable for many users, yet still offer an omnipotent, seamless access to code everything "from the bare metal-up" in Lisp.


I recently found this Scheme to C/JVM/C# compiler: http://www-sop.inria.fr/mimosa/fp/Bigloo/


I've heard people having good results using Gambit to generate C code from scheme to write software for unusual platforms. http://www.iro.umontreal.ca/~gambit/doc/gambit-c.html


Both NestedVM (http://nestedvm.ibex.org/) and Javum (http://sourceforge.net/projects/javum/) do this for the JVM.


The main problem is that the performance of the emulated POSIX layer would likely be terrible.


For one thing, if the VM was JIT-ed, I think your assessment is about 15 years out of date.

For another thing, no one would be building server/back-end infrastructure on top of such a POSIX layer. It would be there for compatibility for end-users. (So everyone could have Firefox, if they happen to need it, which isn't going to be that common anyhow. Most such machines would be headless.) If you want to run things for heavy-duty functionality on POSIX, just put it on another machine in the cluster running Linux. The whole point of such a system would be to code servers and such functionality in Lisp with exposure of the bare-metal as 1st class Lisp entities.

Performance won't be great but for the purpose it would serve, it certainly wouldn't be terrible.


Surprised no one has mentioned MonaOS. It's a small, Scheme-based, x86 OS. It's got a lot of parts written in C and assembly and is occasionally buggy, but is fun. It is largely a one-man show, but it's already fairly functional. There are images at http://monaos.org/ and the github repo is at https://github.com/higepon/mona .


I'm surprised that none of the efforts listed seem to target hypervisors like Xen or KVM.

A few weeks ago, Azul's VM was on HN, and its GC benefits from tight integration with the virtual memory system.


You can get Azul's Linux patches here:

http://www.managedruntime.org/

From what I understand, the main win is that they use nested page tables to let the JVMs handle page faults directly, which is how they implement high-performance read barriers.

I don't know a lot about garbage collection, but read barriers seem to be the essential piece for implementing real-time (which really should be called "non-blocking") GC.

There's a good discussion on LtU about this: http://lambda-the-ultimate.org/node/4165

[edit] I should mention how this relates to Lisp operating systems: if you replace the virtual memory system with a garbage collector (ie push the GC into the kernel), you can get the same effect but without needing nested page tables/VT-x/RVI, even for user-space processes.

It should also be more efficient and waste less memory on fragmentation than going through a dumb VM.
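As a rough illustration of the read-barrier idea, here is a minimal sketch of a Brooks-style forwarding-pointer barrier in C. The structure and names are illustrative, and Azul's actual page-table-based mechanism differs; this only shows why a read barrier lets mutators keep running while the collector moves objects:

```c
#include <assert.h>

/* Brooks-style read barrier: every object carries a forwarding
 * pointer that normally points at the object itself. When the
 * collector relocates the object, the old copy's forwarding pointer
 * is updated, so readers always reach the current copy without
 * stopping the world. */
typedef struct obj {
    struct obj *forward;   /* self, or the relocated copy */
    int value;
} obj;

/* The read barrier itself: one extra indirection on every access. */
static obj *read_barrier(obj *o) {
    return o->forward;
}

/* What the collector does when it evacuates an object. */
static void relocate(obj *from, obj *to) {
    to->value = from->value;
    to->forward = to;      /* new copy points at itself */
    from->forward = to;    /* old copy now forwards to the new one */
}
```

Azul's trick, as described above, is to make this indirection cheap by handling the stale-pointer case through page faults instead of an explicit check on every read.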


That page doesn't seem to have been updated in the several years since I first came across it. As such, any efforts targeting Xen or KVM wouldn't have existed at the time the list was put together.


They do this by effectively bypassing Linux. The same can be said for any operating system that allows applications to do what they want to a greater degree; look at http://pdos.csail.mit.edu/exo or http://en.wikipedia.org/wiki/Midori_(operating_system)

Although I don't know much about Midori.


That list is fairly old: I remember it from way back. I'm pretty sure most of it predates KVM by a long stretch.


While not a Lisp OS, StumpWM (http://www.nongnu.org/stumpwm/index.html) is an interesting project. I run it on Debian, and I like to pretend I'm using a Lisp Machine.


Unix-as-in-POSIX's days are either numbered, or it is going to be perpetually hacked into something it can't do without issue. Distributed computing is becoming more of a norm, with consumers having many devices. For a look at what future operating systems might look like, see Midori or Inferno (which was way ahead of its time) or any other VM-based operating system.


Does anyone know if there's any sort of open source project for a modern Lisp OS that is ongoing?


http://common-lisp.net/project/movitz/

Frode V. Fjeld doesn't seem to have much time to hack on it anymore, but the mailing list is active and you can hack on it today.


If there really was a tool (whether that be hardware or software or a combination) that for a couple of thousand dollars would genuinely make you 10x more productive, you'd be an absolute fool not to run out now and buy it.

Since people aren't willing to do this it goes to show that the claims of the Lisp junkies are just pipe dreams.

If programming on an all Lisp environment really was 10x more productive even a ten thousand dollar price tag would be chicken feed.

Lisp fans like to talk it up about how great it is, but at the end of the day are unwilling to put their money where their mouths are.


This is only the case if productivity is your only concern. There are also considerations such as support, security, and familiarity.



