I am Ron Garret (f.k.a. Erann Gat), the author of the followup study mentioned in TFA. I am, frankly, amazed that this work is getting the amount of attention that it is over 20 years after it was published. I have a couple of comments about this latest critique.
> Given that the participants were recruited from internet groups, I doubt that the subjects were monitored in place, which makes me think that the Development time data collected should be taken with a giant grain of salt.
> Based on a small study like Garret’s, I find that the premise that Lisp is so great is unwarranted.
That's a fair criticism. In my defense I will say that I didn't actually claim that Lisp is great, only that it seems plausible that it might offer the development speed and runtime safety advantages of Java and the execution speed of C, and so it is worth paying some attention to it, and maybe doing further studies. I certainly believed (and continue to believe) that Lisp is great, but I certainly concede that my data didn't (and doesn't) prove it. Renato seems to agree:
> However, the data does suggest that Lisp is about as productive as the scripting languages of the day and has a speed comparable to C and C++, which is indeed a remarkable achievement.
FWIW, my study was a throwaway side-project conducted on the spur of the moment in my spare time. It was intended more to get people's attention and make the case that Lisp ought to be considered alongside Java as an alternative to C than to be a rigorous scientific case for Lisp's superiority over Java. At the time, Lisp was quite a bit more mature than Java, and indeed one could have argued (and indeed could argue today) that Java was an inferior re-invention of many of the ideas that Lisp had pioneered, with automatic memory management being at the top of that list. I was frustrated that Java was getting all the attention because it was the shiny new thing and Lisp was being ignored because at the time Java actually sucked pretty badly. (IMHO it sucks pretty badly today too, but that is now a much harder case to make.)
In any case, what astonishes me most is that AFAICT no one has ever done a follow-up to either of these studies that was better designed and used more rigorous methods. Instead, people just seem to keep citing these papers again and again as if squeezing these turnips harder might yield something worthwhile now that could not be extracted from them when they were first published.
This is a very fertile field for further research. Someone should cultivate it.
A random anecdote: back in the early 90's I worked with a mad lad of a self-taught, non-degreed engineer. On an embedded project where he faced memory constraints driven by unit cost, he ported a Lisp interpreter to PIC assembly[1] and then wrote most of the code in Lisp. Performance was about 1/2 to 1/5 that of straight C[2], but the memory footprint was about 50% smaller, which meant it fit into flash. He went on to use that interpreter for a few other products.
[1] Took him three months to port it.
[2] A 50% difference in speed usually doesn't cause firmware to fail performance metrics.
The traditional way to write such a Lisp interpreter is to compile the Lisp code to a somewhat FORTH-like stack machine. Elisp is done this way, for example; Smalltalk-80 is very similar.
A typical Lisp bytecode is more compact than typical FORTH threaded code because most of the operations are one byte, and common operations are assigned several bytecodes. (For example, you might have four bytecodes for function invocation with no arguments, one argument, two arguments, and three or more arguments; the last one takes the argument count from the stack or from a separate byte in the instruction stream.) In ordinary FORTH threaded code, every operation is two bytes (or maybe four on a big computer), but you probably want to implement "token threading" if you're trying to do it on a PIC, which ends up being the same thing.
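To make that concrete, here is a toy Common Lisp sketch of just the decode step for such call instructions (opcode numbers invented for illustration; a real PIC interpreter would of course be hand-written assembly):

    ;; Opcodes 0-2 are CALL0/CALL1/CALL2 and carry the arity implicitly;
    ;; opcode 3 is CALLN and reads the arity from the following byte.
    ;; CODE is a vector of bytes, PC the index of the current instruction.
    (defun decode-call-arity (code pc)
      "Return (values arity next-pc) for the call instruction at PC."
      (let ((op (aref code pc)))
        (case op
          ((0 1 2) (values op (1+ pc)))          ; arity encoded in the opcode
          (3       (values (aref code (1+ pc))   ; arity in the next byte
                           (+ pc 2)))
          (t (error "Not a call opcode: ~a" op)))))

With one-byte opcodes like these, the common zero-, one-, and two-argument calls cost a single byte each, which is where the size win over two-byte threaded code comes from.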
That part of the interpreter, the part that's hard to implement because you're writing it in PIC assembly and want it to run fast while using a space-efficient encoding, almost can't tell whether you're writing FORTH or Lisp, because all it's interpreting is strings of stack bytecode. The big difference is the garbage collector: in FORTH you don't have one, normally. But you can write Lisp without a garbage collector, too, and if you're running on a PIC you probably want to do that.
Sounds like there's a decent chance that one could achieve similar things with something like Modula's M-code.
Also, compressing a large program into a small computer by having part of the program written in compact interpreted code dates back at the very least to the Apollo Guidance Computer's INTERPRETER component.
Compressing a large program into a small computer by having part of the program written in compact interpreted code dates back at least to Short Code on the UNIVAC I in 01950, which was "short" because it could fit six instructions per machine word instead of the usual two.
Of course Turing's original hypothetical Universal Machine from 01936 is "universal" precisely because it is an interpreter that can emulate any computing machine, most importantly including itself, but you could reasonably object that Turing didn't actually build a working Universal Machine in the 01930s, and at any rate he wasn't concerned with the question of how to fit a lot of functionality into a small amount of RAM.
I've seen references to "Turing's 'short code'" and to "ENIAC Short Code" but I am not sure if they are things that really existed or if someone was just getting their references confused.
I'd like to hear your opinion about the following vision of language comparisons:
Lisp as a language is much more malleable to the problem at hand. This means it will attract a small group of smarter people, but produce code which is harder to read and hence harder to hand off to other people. Hence, Lisp programs tend to be smaller, more creative, better quality, and faster to create, but also to stagnate when their creators leave.
Java, more recently Go, and maybe even Cobol, attract a bigger group of more average people. They'll write in a simpler, more standardized style. The programs tend to be more bureaucratic, average, and less optimized, but also much more readable. They are the better choice when you need maintainability over a long time by a varied group of programmers.
I don't remember where this came from, it's certainly not my own idea. It doesn't sound completely bollocks, but I have no idea if it's true or not.
I’m not the person you’re asking, but I’ve spent a bunch of time fixing bugs in other people’s lisp libraries and haven’t found this to be true at all. If anything, because Lisp encourages you to adapt the language to the problem domain, other people’s code tends to be easier to understand, because the code tends to use names and constructs that make sense to someone who understands the problem domain.
Lisp is tremendously powerful, and like all power, it can be abused. But when used properly it can produce some very easy-to-read and maintainable code precisely because of its malleability.
In that code I used an embedded infix parser and macros to allow me to write modular bignum arithmetic expressions as infix. That lets me cut-and-paste the infix versions of elliptic curve point addition algorithms and use them directly without translating them into s-expressions, which eliminates the possibility of transcription errors. The resulting code is much easier to read than if I'd had to actually translate all the modular math into standard Common Lisp.
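To give a flavor of the technique (a toy sketch, not the actual parser from that project: it only knows + and *, with * binding tighter, and no parentheses):

    (defmacro infix-mod ((&rest tokens) modulus)
      "Expand a flat infix token list like (A * B + C) into prefix CL, reduced modulo MODULUS."
      (labels ((split (toks op)
                 ;; Split TOKS on every occurrence of the symbol OP.
                 (let ((parts '()) (cur '()))
                   (dolist (tok toks (nreverse (cons (nreverse cur) parts)))
                     (if (eq tok op)
                         (progn (push (nreverse cur) parts) (setf cur '()))
                         (push tok cur)))))
               (product (toks)
                 (let ((factors (mapcar #'first (split toks '*))))
                   (if (rest factors) `(* ,@factors) (first factors))))
               (sum (toks)
                 (let ((terms (mapcar #'product (split toks '+))))
                   (if (rest terms) `(+ ,@terms) (first terms)))))
        `(mod ,(sum tokens) ,modulus)))

    ;; (macroexpand-1 '(infix-mod (a * b + c) p))
    ;; => (MOD (+ (* A B) C) P)

A real version would also handle subtraction, parentheses and so on, but the principle is the same: the translation happens once, at macro-expansion time, so there is no runtime parsing cost.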
The Java program may be more readable in small snippets but a large Java program is often completely incomprehensible because the overall structure doesn't map well to the problem being solved. A lisp program with comparable functionality can be much smaller and more comprehensible due to the ability to mould the language to the business problem.
Java programs often have so many classes, methods and interfaces that it’s hard to comprehend the overall flow of the application. Terse languages (lisps and MLs come to mind) can often solve the same problem in a single page.
+1. “Let’s distribute all our algorithms in tiny chunks everywhere” seems to be the goal of some code bases. It’s unnecessary; you can organize the data and the algorithms differently in most cases.
I cringe at how many code reviews are essentially: take this 40-line function that does things we don't need anywhere else, and split it into 5 other abstractions.
I mean... the review isn't wrong. You can totally split the functionality in ways that are very logical. But... did it really make THIS program easier?
It should definitely make it easier to read, if the names of the extracted functions are good, the signatures are simple, and you can understand what they do without looking at their implementations.
The main point of abstraction is not reusability, but the ability to understand code without a need to read all of it. Therefore it doesn't really matter if these extracted pieces aren't used anywhere else. But after splitting, now someone has to read and understand only 5 lines instead of 40, which reduces the cognitive load.
I agree, to an extent. It's just that there is basically no world where splitting a forty-line function into five files is less code. And sometimes, having less to hold in your head helps compartmentalize it. So you may make the specific function easier, at the expense of the entire project. :(
Of course it is not less code, but that additional code is mostly declarative documentation in the form of function signatures. You'd have to hold it in your head anyways to understand the original code, but now it is stated explicitly.
The only alternative for providing that information in a single big code blob would be code comments.
And there are chances you wouldn't have to keep the function implementations in your head, if they were extracted reasonably. Do you need to keep the details of how `sort` is implemented in your head to understand what it does?
I used to believe it as well, but I think more often than not it is only a vertical terseness. If you count the number of words, it will be roughly in the same ballpark.
Even this isn't the problem; I generally prefer the style (in Java at least) of splitting things up into small pieces, and pkolaczk below gives some more rationale. It's certainly easier to test, and is often easier to read. But yes, people can go way too far with it, leading to the infamous pejorative "ravioli code". And people can pervert it, e.g. wrecking previously simple and clear functions, in the name of expediency, by editing in functionality the function shouldn't be handling, violating the principle of least surprise and breeding paranoia. Few things hurt as much as getting relaxed in a codebase, able to skim over most function calls that aren't relevant to the problem you're looking at, only to then be tricked by an innocuously named method along the code path doing what it shouldn't, and wondering if you need to start inlining (mentally or otherwise) everything to avoid future trickery.
Terseness that comes from powerful expressions (simple SQL queries can replace lots of code in any language) is great, though terseness just in the form of density shouldn't necessarily be a goal. At my last job the worst Java code was the most dense, with huge methods spanning hundreds of lines and sometimes up to 10 levels of indentation. Sure, it's all there, but it's hard to understand everything it's doing (and more importantly why), impossible to cover with good tests, and very difficult to refactor because of the many implicit dependencies you have to untangle first (and there are higher-priority items). There's no justified pressure against "just adding one more conditional" to the middle of some big method to satisfy some new requirement from somewhere (so long as the tribal knowledge exists about where such code ought to be placed anyway; it'll take anyone new a long time to find out from scratch). And much of that dense code in core infrastructure areas was initially written by a superstar coder over a decade prior -- if he ever left the company, a lot of things would break with no one to fix them in a reasonable amount of time; he has so much institutional knowledge, and no time or pressure to document it all, if such a thing were even possible.
Out-of-control ravioli code, on the other hand, was relatively easy to wrangle back under control -- my favorite was a huge logging framework written by a Principal-level employee that actually wasn't used by anything, not even its supposed log() entry point, so it was just a big no-op, and could be deleted.
The hardest parts of maintaining software aren't the difficulties in understanding a clever piece of code using advanced language features, but just the sheer complexity of computation being expressed. This is part of the essence that has even a few old time Lispers saying things like "language doesn't really matter" -- nevertheless, I think CL pressures reduce complexity and difficulty of understanding more so than Java, and still decades later has better tools to pierce the veil when understanding is difficult. (Monitoring tools, not as much, though well-understood code tends to not need as much monitoring.)
An interesting study might be to first let people 1. code up solutions to a specific requirements list, then 2. randomly assign the participants within each language group to modify someone else's first-round solution to meet a second-round requirements list.
A good domain-specific language is much easier to read than a general-purpose language that forces you to continually restate the basics of the problem. But defining and then learning that DSL are time investments. If you find yourself working in a tarpit that rewrites everything every year or two, you can't trust that the investment will get a chance to pay off before they throw it away.
But I at least want to master the most powerful general-purpose language they’ll let me use, because that investment will keep paying off until I find one that’s better still.
Something I’m struck by in reading such discussions (esp on forums) is how often people’s choice of programming languages (and tools, more generally) is from a defensive stance. I understand that when working on problems where better technology is unlikely to lead to significant advantages (so you’d prefer to cap the potential downsides), but I’m still surprised by the sheer paucity of examples where people are willing to make aggressive technology choices to get leverage on their goals. I wonder whether that’s another manifestation of the blub paradox.
From the article:

> ... Lisp’s performance is comparable to or better than C++ in execution speed ... it also has significantly lower variability, which translates into reduced project risk. Furthermore, development time is significantly lower and less variable than either C++ or Java.
Then, towards the end of the article, the author says:
> Quite unexpectedly, Common Lisp was the best language overall, ...
Which implementation of Lisp did you use, and how did you build/deploy the code? How idiomatic was the code? E.g., was it C-like code but in S-expressions (imperative, for loops, in-place modifications, no high-level abstractions, etc.)? Though I guess that might also count as "idiomatic Lisp" in some circumstances.
Back in the day different submissions used different Lisps. Some used CL, some used Scheme. Unfortunately, I don't recall the details, and I can't find my records (this was over 20 years ago). If I do come across them I'll post a followup here.
I can tell you this: I currently have a consulting gig where I help maintain a tool for designing state-of-the-art ASICs which is written in Allegro Common Lisp. It is definitely competitive with C in terms of run-time speed.
I recently learned Lisp, and wrote some macros to optimize some functions that were getting called billions of times a day. I compared the output for a common case (compiled by SBCL) to hand-written C (which I've been using a very long time). The Lisp was generating assembly equivalent to or better than GCC's. So I'm a fan.
I am not very familiar with either one (I've read about them both but never written any code) so I can't really answer that. Same for C#, F#, Swift, in fact, pretty much anything that has come out in the last 10 years. Old dog, new tricks, something something...
I will say, however, that IMHO just about anything is better than C++, including C.
John Ousterhout (creator of Tcl (mentioned in OP paper)) also wrote an interesting paper about the “scripting” vs “system” language dichotomy, “Scripting: Higher Level Programming for the 21st Century”[0] which some may find interesting.
It was an interesting distinction in the nineties to make whether languages were interpreted or compiled. Then things got complicated when we started compiling things just in time. And then things got even more complicated when we started compiling compiled languages to imaginary instruction sets that we then interpreted at run time only to compile them just in time. To make things even more complicated, Java now has an ahead of time compiler too.
Java was merely one such language and instruction set. If you look at WASM, it's exactly the same vision: a portable instruction set that can be run anywhere. Except this time it's optimized not for just one language but for any language, and the most popular languages on it currently are C++ and Rust, which are compiled languages not originally intended for this kind of thing.
There are other ways to compare languages, but their runtimes are a bit apples and oranges. Type systems have evolved too. Java now has inferred types, and Ruby and Python now have type annotations. The lines have blurred. Purely dynamically typed languages still exist but are increasingly sidelined in favor of at least partially statically typed languages (e.g. Javascript vs Typescript). AssemblyScript takes that to the next level and compiles to WASM instead of Javascript. You see similar evolution in the Ruby ecosystem (e.g. Crystal), and in the Python ecosystem, which has multiple compilers.
The distinction has ceased to be relevant. It's more about what kind of limitations you are willing to impose in exchange for performance. Rust makes a few very clever tradeoffs here where they do a lot of heavy lifting with templates to provide so-called zero cost abstractions. People even attempted REPLs for it. It's a scripting language!
It’s interesting that even a systems programming language like Rust is not actually faster than Java if not written/implemented correctly. People have always assumed that when they need to go faster they should reach for C++/Rust. I guess advanced knowledge of one programming language is better than switching to another one when you need to make your app faster.
I've been asked to whiteboard this or very similar problems in 45-minute job interviews. Based on my experience in those interviews I am not surprised to see it typically takes 3-10 hours to do a proper job of it.
I mean of course Java is slower if you're including JVM start up time. A more modern version of the problem would ask for a service which performs the same function and averages time taken over N requests. I suspect modern java would look pretty close to C++/rust there
This is always an interesting question, about including the JVM startup time or not. In this context doesn't it seem like the JVM startup time should be included?
During JVM startup, one thing it's generally doing is allocating space for the heap from the OS. For the C/C++/Rust application, this could be implemented either as one large arena allocation or as allocations inline as the program runs. That is a programming choice, but the allocation cost is still there. It would be unfair not to include that allocation for Java while always including it (whether inline or upfront) in the native C/C++/Rust versions.
For long lived programs, like server-side things, the JVM startup time probably isn't relevant, but for short lived programs, it seems reasonable to include the startup time as that would be part of the time running the program from the CLI.
> for short lived programs, it seems reasonable to include the startup time
But then one could say: if the whole problem is so trivial that startup time is a significant chunk of the execution, is it really a problem worth optimising in the first place? The only situation where it would matter is if you are putting this in a tight loop on the command line or something like that (which is a thing, but it's a reasonably small thing).
There are actually important cases where the JVM startup time is a huge problem. For example, my team chose Go instead of Java for a microservice solution because, among other reasons, the startup time of a microservice after a scaling event or crash matters a lot for user perception. The difference between a 100ms startup time and a 500ms startup time (or more) is often the difference between a system that seems to 'just work' and a system that seems to 'constantly stutter'.
When my deployments are incremental I barely care about startup time. Even if a service takes ten seconds to come up healthy (which is mostly pubsub and polling backends, not classloading) we can upgrade the entire production tier 1% at a time in under twenty minutes, which is about as fast as I’d be comfortable going with automatic plus manual monitoring. And I don’t really expect sub-second response times and image load times from cluster schedulers like YARN or Apache Mesos or EC2.
I admit that Java is doing a lot of redundant bytecode verification and typechecking work at classload time, but nothing would really change if it were free. I certainly don’t want to give up plugins and AspectJ the way Go does.
I am by no means a devops expert, but it seems strange that 400ms would matter all that much, when the vm itself takes similar amounts of time for a new instance.
Either your application crashes way too much, or the scaling is not implemented properly.
That depends on what you're trying to evaluate. If you want a language that can do CLI tools, you should absolutely include the startup time. If you're doing a long-lived service, you should benchmark a warmed-up JVM. Different environments require different performance characteristics.
With the exception of assembly, all languages have a runtime, even C.
Who do you think calls main(), processes the command-line arguments, runs functions registered via atexit(), emulates floating point when it isn't available on the CPU, processes signals on non-POSIX OSes, and handles CPU faults?
The JVM startup time is discussed, and it mentions that a pre-compiled version using Graal might make for an interesting comparison. It wasn't done, because as mentioned in the blog post, things were getting out of hand with all the extra testing.
Anywhere you see "C/C++", treated as if it were a single language, you know that you will not be getting good information. It is common to claim "almost as fast as C", when it is known that C is now a low bar. For speed, nowadays, you use C++. (Or Rust, for identical reasons.)
The most characteristic quality of any publication about the performance of Lisp, as for anything like it, is exaggeration. In particular, indirect costs of garbage collection are invariably concealed.
Garbage collection is not part of the Common Lisp standard.
DISASSEMBLE is magic, and practically speaking it's easier to write performant code at the machine-language level in CL than in any other language. Yes, of course you can look at the `as` output in your favored language, but it drops you out of your flow.
I agree that "C/C++" isn't a good sign, though it was more forgivable when C++ just meant C++98... For speed, nowadays, you use X. That is, many languages can be fast, especially if they have access to intrinsics, the real question is how much special knowledge and herculean effort do you have to spend? It's still true that for many problems idiomatic modern C++ will give a very nice result that may be hard to beat (though be careful if you forgot cin.sync_with_stdio(false) or you may be slower than Python!). But it's also true that C, Rust, Java, Common Lisp, and even JS all do very well out of the box these days, while languages like Python and Ruby have lagged. For this problem, the author had to spend quite a bit of effort in a followup post to get Rust to match some 20 year old unoptimized CL.
If one wanted to optimize Lisp, a simple place to start is with some type declarations, at least with the SBCL implementation. It can even be kind of fun to have the compiler yell at you because e.g. it couldn't infer a type somewhere and was forced to do a generic add, so you tell it some things are fixnums, see the message go away, and verify if you want with DISASSEMBLE that it's now using LEA instead of a CALL. For an example of going to quite a bit of trouble (relatively) with adding inline hints, removing unneeded array bounds checks, and SIMD instructions, see https://github.com/telekons/42nd-at-threadmill which I believe is quite a bit faster than a similar "blazing fast" Rust project featured on HN not long ago. But my point again isn't so much that CL is the fastest or has some fast projects, just that it can be made as fast as you're probably going to want. This applies to a lot of other languages too, though CL still feels pretty unique in supporting both high level and low level constructs.
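A minimal sketch of that workflow, assuming SBCL (the declarations and the DISASSEMBLE call are standard Common Lisp; the compiler notes and generated code are SBCL-specific):

    ;; With these declarations SBCL can open-code the addition as fixnum
    ;; arithmetic; remove them and (under (speed 3)) the compiler notes
    ;; come back and the disassembly shows a call to a generic add routine.
    (defun sum-upto (n)
      (declare (type fixnum n)
               (optimize (speed 3)))
      (let ((acc 0))
        (declare (type fixnum acc))
        (dotimes (i n acc)
          (setf acc (the fixnum (+ acc i))))))

    ;; (disassemble 'sum-upto) ; inspect the generated machine code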
Your GC comment is weird, since any publication looking at total performance invariably includes the GC overhead; nothing is hidden. The TIME macro for micro-benchmarking even typically reports allocated memory, which can be a proxy for estimating GC pressure, and both SBCL and CCL report the actual time (if any) spent in GC. Why not complain that C++ benchmarks hide the indirect costs of memory fragmentation, which is a real bane for long-running C++ programs like application servers? But I'll admit that the GC can be a big weakness and it's no fun to be forced to code around its limitations, and historically some GCs used by some Lisps were really bad (that is, huge pause times). I've been looking at the herculean GC work being done in the Java ecosystem for years with a jealous eye, and even at the newer Nim with its swappable GCs, for when you want to make certain tradeoffs without having to code them in.
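For instance, a quick sketch of what that looks like at the REPL:

    ;; On SBCL the TIME report includes real/run time, processor cycles,
    ;; bytes consed, and -- when a collection happened -- the GC run time.
    (time (loop repeat 1000000 collect (cons 1 2)))

Nothing about the allocation or collection cost is hidden from that report; whether the numbers are acceptable is of course workload-dependent.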
First, I would be far less skeptical than this author is over the tradeoffs of scripting languages. Given the tools offered by scripting languages, there is one obvious way to do things. The overhead of building a fancy "just right" data structure almost always makes things perform worse than the straightforward solution using dictionaries, arrays and scalars. Given that, people go directly to that solution, leading to less variation in length of code, time and performance. It is always possible to beat a scripting language's performance, but a surprising number of programmers won't.
This is a lesson that generation after generation of programmers have learned in a variety of scripting languages. With the latest winner being Python. But the differences between popular scripting languages generally matter a lot less than the commonalities. (There are exceptions, specialized languages like Lua, R and Julia have very specific strengths and weaknesses despite being aimed at scripting.)
Second, there are two basic reasons why Lisp loses.
The first is the disadvantage of being an image based system. This has a lot of upsides, but the downside is that people lose access to a long toolchain that they are used to for dealing with files. Tools including editors, source control, and so on. Programming expertise is fragile, and that is a pretty big shock.
The second is that Lisp attracts people who want a perfect language. Unfortunately, people's ideas of perfection differ. So we get a large number of related dialects of Lisp which are almost compatible with each other, and fail to get critical mass behind any of them.
That said, how big a deal is achieving critical mass? Well obviously it matters. If you're using a mainstream language you can find packages for things that you need, answers on StackOverflow, and so on. But there are diminishing returns. The size of CPAN didn't keep Ruby from rising, and JavaScript and Python in turn have seen meteoric growth.
But that said, there is a fundamental tradeoff. Working in a popular language matters more if you're planning on leveraging lots of existing work. The more your project is building on what you yourself have built, the less important the easy start of external packages is relative to your own development familiarity. The more specialized the thing that you're building, the more it makes sense to use the perfect language for you/your space, rather than going with what's popular.
Common Lisp implementations typically aren't image based systems, although one can dump images. Code is in files, which are loaded. The typical workflow involves modifying files, compiling/loading the changed files and their dependencies (using asdf), and running tests. Nothing persistent is in the image.
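A minimal sketch of what that looks like in practice (system and file names invented for illustration):

    ;; my-app.asd -- plain text in a file, under ordinary version control.
    (asdf:defsystem "my-app"
      :depends-on ("alexandria")          ; example dependency
      :serial t
      :components ((:file "package")
                   (:file "main")))

    ;; Typical loop at the REPL after editing a source file:
    ;;   (asdf:load-system "my-app")      ; recompiles/reloads what changed
    ;; then re-run the tests. Nothing persists in the image between sessions
    ;; unless you explicitly dump one.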
I've heard lots of people give lots of reasons over the years why Lisp loses. Very few of those reasons were ever based in fact. Even today I hear people complain that Lisp is an interpreted language. Back in the day they used to complain that it used too much memory, with a RAM footprint of about 8 megabytes (that was including the application code).
They also commonly complained that the Common Lisp standard was too large with too many library functions. Nowadays one common criticism is that there aren’t enough libraries.
I started dabbling in CL last month, and I think both of those points are valid, if for different reasons today.
Most languages have a single standard way of doing basic things, and provide libraries for advanced operations that not everyone faces every day, but that still come up more than once for the average person.
The first criticism, that Common Lisp has too large a standard library, collides with the first half of the above paragraph: newcomers like me get confused by 4 different ways to set a variable, all named ever so slightly differently.
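For concreteness, the cluster of look-alikes I mean is roughly this (a quick sketch, not a complete list):

    (defvar *a* 1)         ; define a global special variable; if *A* is
                           ;   already bound, its old value is kept
    (defparameter *b* 2)   ; define a global special variable; always re-set
    (setq *b* 3)           ; plain assignment to an existing variable
    (let ((cell (list 0)))
      (setf (first cell) 4)  ; SETF: generalized assignment to any "place"
      cell)                  ; => (4)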
The second criticism is the lack of libraries, which runs into the second half of that paragraph. For example, I wanted to build a simple webapp to split a PDF into separate pages and convert them to speech. I could write one in Java in a weekend, but there is nothing comparable in Common Lisp, especially in terms of proper documentation and examples. I couldn't get off the ground after the initial interface; I had to scour library source just to understand basic usage.
It is possible I missed something obvious, but as it is, Common Lisp is not a good language to start greenfield projects in.
One can always call external code from Common Lisp - there is no need to re-invent/re-implement everything in CL. Use an FFI to call library functions, call external routines, or call internal routines (ABCL runs directly on the JVM and can call arbitrary JVM code).
Using Common Lisp does not make it necessary to write all code in it.
Not sure why I would re-implement existing PDF features of other tools for a project, unless there is a specific reason for it. I would always first explore re-using existing code.
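For what it's worth, the FFI route is fairly lightweight. A minimal sketch using the CFFI library, binding a libc math function purely as a stand-in for whatever library you actually need (library names may need adjusting per platform):

    (ql:quickload :cffi)                  ; assumes Quicklisp is installed

    (cffi:define-foreign-library libm
      (t (:default "libm")))
    (cffi:use-foreign-library libm)

    ;; Bind C's cos(3) as an ordinary Lisp function.
    (cffi:defcfun ("cos" c-cos) :double
      (x :double))

    ;; (c-cos 0.0d0) => 1.0d0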
Thanks. I was trying to go as native as possible. I did get it to completion by using other tools for everything but the web interface. The reason I was so disappointed was that my prior Lisp experience was with Emacs Lisp, which has probably the best documentation and iterative, interactive development experience of all the programming languages out there. I was spoiled and had probably unreasonable expectations. NixOS doesn't make things easier either. I'm looking to make another attempt at this; hopefully I'll do better then.
I love this quote (from Interlisp-D: Overview and Status):
"Interlisp is a very large software system and large software systems are not easy to construct. Interlisp-D has on the order of 17,000 lines of Lisp code, 6,000 lines of Bcpl, and 4,000 lines of microcode."
The GP might be thinking of Interlisp-D which was image based.
I remember Doug Lenat had an image that he used continuously for years. I wonder how you can depend on your research results after that -- who knows what's in your environment?
Yes, I used Interlisp-D decades ago. It was a different experience. Common Lisp has things like read tables that are designed around a file-based (or at least text-based) view of code; Interlisp did not and natively supported in-image code editing (with Sedit). This meant * in Interlisp was the comment operator, and you had to do multiplication with TIMES.
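For anyone unfamiliar with the readtable point: it is the hook that lets the text-based reader itself be customized. A toy example (not anything Interlisp did):

    ;; Make @foo read as (DEREF FOO) -- purely illustrative.
    (set-macro-character #\@
      (lambda (stream char)
        (declare (ignore char))
        (list 'deref (read stream t nil t))))

After this, the reader turns the text @x into the form (DEREF X) before the compiler ever sees it.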