"After this experience, it was hard to understand that the software engineering community did not recognize the benefits of adopting a high-level, type-safe language instead of C." -- N. Wirth
I suspect that the good programmers did notice the benefits of a strict, statically typed language, but the cost was probably deemed too high at the time. Because C was a "mid-level language" that stayed close to the metal, it could efficiently use the limited resources of pre-LSI (let alone VLSI) computers; in effect, C let you break the rules if you wanted or needed to. These days the performance difference is generally less than a few percent and can largely be ignored.
Every Pascal I ever worked with (with the exception of the Pascal I used in school) had extensions to make it work in the Real World. Unextended Pascal was a toy.
Apple was a Pascal shop in the 80s, but they extended it to the point that I just translated (in my head) the C code that I wanted to write into the variant of Pascal they'd implemented. Pascal had a couple of nice things that C didn't (nested procedures, for one), but they weren't used all that much by us.
Strings were still a pain. Str255 stunk, and everyone knew it, but nobody had anything that was dramatically better.
I disagree. Unextended Pascal is not a toy: it's a tool to teach computer science and structured programming. In a classroom environment in the 1970s, it's not relevant that strings have to have a defined, fixed length. I'm not 100% sure what you mean by I/O being broken, but I'd wager that's not a big deal in a classroom environment, either. In any case, evaluating unextended Pascal as anything other than a teaching tool is like complaining that a hacksaw isn't very good for screwing in screws. Talk about "wrong tool for the job!" :-)
True. But, "high-level, type-safe language" need not refer to unextended Pascal, and I see no evidence that it was intended to. He could have meant Algol W. Depending on when it was written, he could have meant Modula 2, Modula 3, or Oberon.
I've read quite a bit of Pascal's history. There are three main explanations:
* Lack of a complete standard. The language was initially used in academia, so when industry adopted it, numerous mutually incompatible extensions were developed. The standard C library may seem basic today, but it was nice by comparison.
* Virtual machines. Now they're the cool thing. Back then they were a shortcut to get a quick and dirty implementation for new hardware, instead of writing a compiler. The bad performance was associated with the language, not the implementation.
* Some implementations were of the "bondage and discipline" kind.
The Turbo Pascal compiler solved all three problems in the eighties, adding an IDE and a $50 price tag for good measure. But it was too late. Although very popular in Europe and with little shops in general, the big software firms in the USA had already adopted C.
One wonders where we would be if Pascal or one of its offspring, like Oberon, had actually caught on and challenged C. It would be nice to have an alternative systems programming language. C sometimes feels like a local maximum instead of a global one.
From the PDF:
"The clean solution given in Oberon [6] is the concept of type extension, in object-oriented languages called inheritance. Now it became possible to declare a pointer as referencing a given type, and it would be able to point to any type which was an extension of the given type. This made it possible to construct inhomogeneous data structures, and to use them with the security of a reliable type checking system. An implementation must check at run-time, if and only if it is not possible to check at compile-time.
Programs expressed in languages of the 1960s were full of loopholes. They made these programs utterly error-prone. But there was no alternative. The fact that a language like Oberon lets you program entire systems from scratch without use of loopholes (except in the storage manager and device drivers) marks the most significant progress in language design over 40 years."
Pascal was the primary high-level language used for development in the Apple Lisa, and in the early years of the Mac. Microsoft assumed Pascal would become the dominant application programming language, so the early Windows ABI used the Pascal calling conventions. That's why we're stuck with Windows' PASCAL/WINAPI/stdcall and cdecl ABI mess today.
IIRC the first (unusable) Windows version was written in Pascal.
I remember reading somewhere that the Pascal calling convention was more efficient: returning from a function and cleaning the parameters off the stack could be done with a single RET n instruction, while C calls needed both a RET and a separate stack adjustment in the caller, because C allows a variable number of parameters.
I don't know if it's still the case with recent processors.
C requires that variable-argument-list functions be called with a correct prototype in scope, so it is perfectly feasible to use a "callee cleans up" ("Pascal") calling convention for regular C functions and a "caller cleans up" ("cdecl") convention only for varargs functions.
The point is moot, though, because the fact that an operation can be represented with a single assembler mnemonic doesn't mean it's any faster than an alternative that uses several. A case in point is the "REP MOVS" style string functions, which until very recently were actually slower than open-coding the equivalents, since they trapped to microcode.
<quote>
One wonders where we would be if Pascal or one of its offspring, like Oberon, had actually caught on and challenged C. It would be nice to have an alternative systems programming language. C sometimes feels like a local maximum instead of a global one.
</quote>
I think we would be using safer systems, where buffer overruns and pointer exploits would be almost non-existent.
The hegemony of C has cost the industry millions of euros/dollars in software-correctness failures.
I'm always a bit hazy on why people bootstrap languages by writing compilers. I've always imagined it would be easier to write an AST-walking interpreter, write the compiler in the source language, and then interpret the compiler taking its source as input, to produce the first binary.
I guess I've never actually bootstrapped a compiler, but I've always found the interpreters I've written to be easier to reason about than the compilers I've written.
Bootstrapping a compiler is a good way to test how the language feels when writing a big project like a compiler. It also helps to find errors in the language/implementation.
In my proposed way to bootstrap a compiler, you still write the compiler in the language you wish to compile. But instead of writing a second compiler for the source language in some other language that already has a compiler, you write an interpreter in that other language. This means the amount of code you write to get the language up and running is smaller, since interpreters are easier to write than a whole second compiler.
So what you say is true--it is a good way to see how the language feels--but not relevant to the facet of language development I was musing about.
So they had three tools to choose among. One was broken, one was wildly unsuitable, and one was unpopular. They chose the second for (apparently) purely cultural reasons. I think it's not so surprising that they ran into problems.
Don't discard unsexy solutions out of hand; choose the right tool for the job.
Was assembly really the right tool for this either? It seems like the difficulty of translating Fortran to Pascal due to Fortran's lack of many of Pascal's higher-level features would still apply to assembly.
I think so. Wirth called out Fortran's lack of pointers, records, and recursion. You have (or can fake) all of these things in assembly. Recursion is particularly important for building a compiler, since much of it involves manipulating trees.
Also, consider the stated goal of the project. They wanted to write a throwaway compiler so that they could bootstrap a Pascal version. In this case, it's a bad choice to use a language (Fortran) that's so different from Pascal. On the other hand, assembly is very flexible, so you can use whatever idioms you think will apply in the Pascal code you later intend to write. This will make the translation much simpler.
Writing in assembler, it's easy to construct linked records, and recursion works correctly, since you have a stack.
It might take a while, but it doesn't seem like it would be that hard to macro up some higher-level routines for parsing and code generation for a simple PASCAL.
I'd expect that most Beta-machines (what we'd nowadays call virtual machines) would have been written in assembler back then, and there was lots of expertise running compiled Algol, amongst other things.
I think writing it in (perhaps a subset of) Pascal, then translating to assembler either manually or by interpreting the Pascal source by hand would have been the Right Thing To Do (tm). Writing in the high-level language makes it easier to get the algorithms and data structures right. Translating to a lower level language later is easier with that framework in hand -- the Pascal source would serve as a sort of spec for the ASM translation. I know that when I write code in Python and translate to C, it's frequently easier than starting off writing in C, partly for that reason.
Assembly allows you to use pointers and recursion, which were the specific problems with Fortran. You can do anything in assembly; it just takes a lot of lines. Programming languages, especially early ones, can place major limitations on how computations are expressed, even if they are nominally as Turing-complete as any other language.
I'm not a big Lisp fanboy, but I wonder why it wasn't considered an option, at least for bootstrapping the compiler. Even if no Lisp was available on their machines, I think hand-coding one in assembly and then using that would have been much faster than the other options. (Note: it's been 20+ years since my college Programming Languages class, where ironically I built a Lisp-1 in Pascal, so I may be misremembering my programming-language history here.)
Shoulda used Forth. It'd be fairly easy to go from assembler to an ultra bare bones Forth, and that could be built up to work like Pascal, until it looks like Pascal-written-backwards. Then write the compiler in that, and the translation and bootstrap would be trivial.
My first thought as well, but in 1969 Forth was still busy being born. Based on the notes here[1], many of the core concepts of a modern Forth like defining words and the structure of the dictionary were being actively hammered out.
I wonder which "high-level" features Pascal has that C lacks.
When I looked into Pascal, literally everything useful there came from C. All things originating in Pascal seemed useless.
That would have been a rather neat trick, given that Pascal predates C.
Furthermore, C itself evolved rather substantially over the years, so people who mostly experienced the ANSI/ISO C era may not realize what an appallingly messy language it used to be. By the time the original K&R got published, the situation had improved considerably (with the exception of function prototypes), but I found the C code in 6th edition UNIX a truly hair raising read: http://www.tom-yam.or.jp/2238/src/
Pascal has nested functions and supports a limited form of closures.
In Pascal, a function can be passed as an argument to another function, but cannot be stored in a variable or data structure, cannot be returned from a function, and cannot be created without being given a name. However, when a function is passed to another function and later called, it will execute in the lexical context it was defined in, so it is, in some sense, "closed over" that context.
I also liked that I could pass parameters by value or by reference without explicitly using pointers.
Nested routines in classical Pascal support downwards funargs; they are closures lite, i.e. actually more expressive than C functions. But we are talking classical Pascal; every practical commercial Pascal implementation supports function pointers directly, just like C does.
Having arrays indexed by enumerations can be nice.
type
  TFoo = (foOne, foTwo, foThree);
const
  FooStr: array[TFoo] of string = ('One', 'Two', 'Three');
There's no mucking around with implicit or explicit conversions between integers and enumerations. Similarly, you can enumerate through them easily with standard routines:
var i: TFoo;
{ ... }
for i := Low(TFoo) to High(TFoo) do { etc. }
Doing it in C relies on you adding fake enumeration members to stand for counts; and it gets worse in C++, because enumerations are more reluctant to decompose into integers.
Writing scanners (as in, compiler lexers) in Pascal is very convenient with set notation:
{ skip whitespace }
while cp^ in [#1..#32] do Inc(cp);
With a good debugger, it can be nice to use sets instead of bit flags, because you generally get a more reliable symbolic breakdown.
If you write a lot of code that breaks down in a procedural way, having nested routines is very nice. It can limit the scope of functions and procedures to just the code that needs them. In C, one can be tempted to cram it all into a single long function instead.
Practical Pascals like Turbo Pascal and Delphi have a real live module format that works fairly well for independent compilation. Changes to the interface exported by a unit do not necessarily need all dependent units recompiled. The Pascal linker associates versions with all exported symbols (basically, a hash of their signature), and only units whose import symbol versions don't match the export symbol versions need to be recompiled. This also prevents type mismatches that can (albeit rarely in practice) affect C, where if you change the signature of a function or the type of a variable, and fail to recompile clients, you won't get an error from the linker, because C linkers don't usually encode that info.
Pascal's pointer semantics are interesting: because it's possible to detect when a pointer is used before being initialized, the probability of unintentional nil-pointer dereferences is reduced.
Subranges are also a nice feature.
Those, and nested procedures are about all I can come up with.
I'm confused. He starts off by asserting there were only three options: assembly, Fortran, or Algol. But later on, an essential part of the process turns out to be "a syntax-sugared, low-level language, for which a compiler was available". So why wasn't that apparently anonymous language one of the base options?
I think that was the "compiler for a substantial subset of Pascal using Fortran" that they originally planned to translate to Pascal after implementing it in Fortran.