First, I love programming languages. That is, I love the field, and I love languages that are beautiful, clean and powerful.
Second, I'm a working programmer. I need to build code that is reliable and performs well. And I've always had a pragmatic bent. I also have a family. So I devote little time to languages that are interesting in theory but do not have strong implementations.
Third, I have a specific problem. I built a distributed system at work using Python. The system works fine for small loads but suffers under large ones. It needs to scale better. I've tried a few things, and will try more, but I'm hedging my bets. What have I tried? I threw psyco at it. Psyco compiles Python on the fly, and is delightfully seamless. It can provide significant speedups for some programs, but it didn't help mine much. I redid my communication protocol, switching to binary and even rewriting the deserializer in C++. That helped, but not enough.
I'm guessing the problem is that I'm using threads. The server has about 200 clients, each serviced by its own thread. I believe Python threads are inefficient. For one thing, they compete for a single lock that protects the interpreter data structures, known as the Global Interpreter Lock, or GIL. A single Python process will use only one CPU, regardless of how many are in the box. Furthermore, my application uses a single lock to synchronize the threads. It's a complicated state machine, and the idea of making the locking more fine-grained is daunting.
Python developers will tell you that the solution is to embrace an event-driven or continuation-based style of programming, in which an incoming chunk of data triggers a function call. The Python server Twisted exemplifies this programming style. Your program has only one thread. It receives a chunk of data, calls the associated function, and then waits for the next chunk of data. There is no need for locking, and no context-switching overhead.
I find this style of programming highly unpleasant, for two reasons. First, you must be very careful in your callback function to do nothing that could cause a significant delay: if your callback blocks, your server blocks. Even simple file I/O, like writing a message to a log file, is problematic. The second reason is that control state, conveniently and implicitly stored on the stack in a thread-based architecture, must be managed manually. For example, say that incoming messages have a recursive structure, like a parenthesized expression or XML, and further assume that the message length is unknown. To read such an expression recursively is trivial. But you can't do that in an event-driven architecture: you must encode into a data structure where you are in the simulated recursion when the current chunk of input runs out, so you can resume when the next chunk arrives. This is a real pain.
So if I can't lick my Python performance problems and I don't want to write event-driven code, I must rewrite in a different language. What are my choices?
- C++. This language is part of my workplace's culture, and unfortunately I must use it often for its number-crunching speed. But I don't need that kind of speed for my distributed application. The language's numerous flaws need no elaboration. On the whole, I strive to reduce the total amount of C++ in the world.
- Ruby. A beautiful language, but I don't expect it to perform any better than Python.
- Java. This would be a good choice. I've used Java extensively, and I believe it to be quite a well-designed language. It has a mature, stable and fast VM. I'm confident a thread-based server would scale. But I would still have my locking problems.
- C#. See Java. Also, my system must run on Linux and I don't believe Mono has enough mileage yet.
- Haskell, OCaml, etc. There is a raft of fascinating and elegant modern languages, but their implementations come out of academia and they have little or no real-world mileage. I can't trust such an implementation in a production environment.
- Erlang. It's a language with more than its fair share of warts, but it has two wonderful things going for it: a model of concurrent computation that does away with locks and their attendant problems, and an extremely stable and mature VM.
So I've been slowly rewriting my application in Erlang, a little at a time.