I believe it was already some two decades ago that Bill Gates said the biggest failure in software is the lack of productivity gains in software development.
Just to cherry pick a dramatized example. In the late 90s, one of my first programming experiences was in Delphi. It's a very visual way to program. You drag and drop a UI together from standardized elements. You could data bind things like input fields to a data source. You'd double click a button and it takes you to the already created "onclick" event where you do the actual coding. In a way, this is still similar to how you can "program" Microsoft Access today.
In this same period (I was studying computer science) I was moaning to my teacher about how tricky programming in C was. I made tiny memory management errors that crashed the entire PC, with zero debug output.
His words: "Don't worry. By the time you get to work, you'll just do modeling".
Now I work on the web. It's been 30 years since Delphi and instead of productivity gains, we've taken steps in the opposite direction. There is no standardized UI library. JavaScript has no standard library. There's no single unified architecture. There's no standard tool chain. It's all very low level, on top of which we stitch together fragile semi-abstractions.
I know it's an imperfect comparison, but I hope it gets the point across. It's all far too low level and fragile.
This doesn't just hurt productivity; it also hurts accessibility. It's simply not beginner friendly.
At least from the systems programming perspective, I think productivity gains have been significant. Valgrind had _just_ appeared as I was doing my undergrad, and it allowed me to focus on the high-level math of data structures -- as an example -- compared to my non-Valgrind-using classmates who were struggling with memory bugs (which, of course, I did too, but to a lesser degree). C++ has noticeably improved since then, if you can hew to the newer bits of the language and keep all the arcane bits in your mind. Rust has arrived on the scene, inspired by Cyclone, C++, and SML, and is a serious jump in productivity over C++ because the compiler keeps track of all those arcane bits. If you squint, Go's in the mix for web stuff that doesn't mind GC. It's a shame ATS didn't become more of a thing. Ada has made real strides back from the dead, especially if you can stick to SPARK Ada.
We're even seeing more formal verification of complex bits of software in systems-land, something that seemed basically impossible when I first read about Coq twenty something years ago.
I do admit, though, that front-end seems like a wild west presently, to my ignorant eyes.
> Ada has made real strides back from the dead, especially if you can stick to SPARK Ada. We're even seeing more formal verification of complex bits of software in systems-land, something that seemed basically impossible when I first read about Coq twenty something years ago.
I really agree with this. The situation is a bit crazy: since formal proofs are hard, the alternative usually taken is to give no guarantees at all. That's bad. Static analyzers can get really far. Contracts, even just at runtime, are really useful. I don't want a piece of code to run if the precondition is not met, and I don't want it to return or have side effects if the postcondition is not met. This would rule out a massive class of errors. Of course, bonus points if you can do it at compile time.
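As a rough illustration of what even runtime-only contracts buy you, here is a minimal Python sketch (the `contract` decorator and the `withdraw` example are hypothetical, not any particular library's API):

```python
from functools import wraps

def contract(pre=None, post=None):
    """Hypothetical decorator: refuse to run the body if the precondition
    fails, and refuse to return if the postcondition fails."""
    def wrap(fn):
        @wraps(fn)
        def checked(*args, **kwargs):
            if pre is not None and not pre(*args, **kwargs):
                raise AssertionError(f"precondition violated in {fn.__name__}")
            result = fn(*args, **kwargs)
            if post is not None and not post(result):
                raise AssertionError(f"postcondition violated in {fn.__name__}")
            return result
        return checked
    return wrap

@contract(pre=lambda balance, amount: 0 < amount <= balance,
          post=lambda new_balance: new_balance >= 0)
def withdraw(balance, amount):
    return balance - amount

print(withdraw(100, 30))   # 70
# withdraw(100, 500) raises before any side effect can happen
```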
Ada SPARK, Infer [1] and Dafny [2] show that rigorous engineering is possible and practical. Actually, Facebook (now Meta) argues Infer lets them move faster.
I think the problem is political and economic. It's hard to monetize this at first. I wish someone like Apple took the plunge and rebuilt a simpler version of their stack using formal verification. It's the kind of thing only a large corporation or a government can do, because it might take half a decade to take off. Once it takes off, it's unstoppable. Who would not like to buy, e.g., a phone where most bugs and security breaches are impossible by construction? Verification also has a nice side effect: change for the sake of change and featuritis are pretty incompatible with it.
The great thing about Ada/SPARK is that it has a package manager now (Alire). This means there are existing SPARK packages you can use, and you can share what you make. SPARK is also an Ada subset, so you can use verified components inside your Ada programs.
"It's a very visual way to program. You drag and drop a UI together from standardized elements. You could data bind things like input fields to a data source. You'd double click a button and it takes you to the already created "onclick" event where you do the actual coding. In a way, this is still similar to how you can "program" Microsoft Access today."
Yes! I remember Delphi as my first language in school, too. And later at university I could not understand why doing the same thing in all the other advanced languages like Java, C, and C++ was so hard.
Then I discovered Flash and later Flex with Flex Builder. That was an even more powerful experience.
You could draw up anything - and make a button out of it with a click. Or a sophisticated, reusable component.
And then it died, because of Adobe's unwillingness to open source the technology (and their focus on performance and flashy features rather than security). All the bad Flash applets that were flying around existed not because the technology was so bad, but because it was so easy even for total beginners to make something working fast.
Regarding Java and C#, I am stunned by how good IDEs are now. To be clear, when I say IDE, I am specifically talking about a tool that enhances the edit-compile-debug cycle. Compared to 20 years ago when I wrote C, C++, and Perl, the IDEs were mostly terrible, except Visual Studio for Windows. The majority of my development cycle was all terminal-based: vim to edit, make to build (C/C++ only), gdb/perl -d to debug. Hoi, awful. (And I still enjoy all three of those languages!) As mega-corp computing slowly migrated to Java and C#, my productivity has increased dramatically. I guess at least 2x, if not more. Yes, some of this comes from how forgiving a modern VM can be (NullPointerExceptions, debugging, etc.), but a lot can be ascribed to the incredible ecosystem of IDEs.
Regarding JavaScript, in the context of a modern web browser, high school students (in almost any country in the world) can easily Google a little bit, read some blogs, and then start GUI programming. And this doesn't require expensive, licensed software, like Delphi 30 years ago.
I programmed in Delphi for years and have done some toy things with Lazarus, so I'm painfully aware of the quality and reliability difference.
It is quite bizarre how we programmers, who make hundreds of thousands of dollars per year, are unwilling to pay $1,400 for something that will make us more productive.
> It is quite bizarre how we programmers, who make hundreds of thousands of dollars per year, are unwilling to pay $1,400 for something that will make us more productive.
Not all of us. Here in Latvia that amount would be around a half of one's salary, quite likely more than that.
The further you move in the direction of poorer countries, the less the argument of "developers are expensive, everything else is cheap" actually holds.
I don't doubt that in some places a single decent AWS EC2 instance would be a developer's salary.
Lots of devs are willing to spend on tools but end up in workplaces where they're not allowed to install them, tools don't offer compatible licensing for that scenario, or where the workplace won't purchase a license for them in tandem with their personal copy.
Look at all the copies of IntelliJ (and derivatives) that are sold. I think people are willing when the price point is right and it can get into the workplace. I personally wouldn't spend $1400 unless I could make my life better from it somehow, and that would mean it making me significantly better at earning money.
How about: it is quite bizarre how the people who employ programmers, who make billions of dollars per year, are unwilling to pay $1,400 for something that will make us more productive.
The web, and much of programming really, feels so far behind the curve of what's possible as a programmer. Take a dive into shader design in something like Unreal Engine 5; it sounds exactly like what you're describing in Delphi (mind you, I say this as someone who has never programmed in that language). The ability to drag and drop functions, compose higher levels of abstraction, and really all of the fun programming stuff we do can be accomplished in these engines. I've found it very enlightening, and even somewhat frustrating, to go from drag/dropping some graph nodes that generate a playable world for fun, and then back into the world of javascript for work where I am slinging hundreds of lines of text-based code just to get a form button to operate correctly.
In game engines, and especially in UE where things like Blueprints exist, the low level stuff is just conveniently hidden or abstracted away. But the long textual code that makes the button work, eventually exists somewhere in there. At least when you write it yourself, you know who to blame (and probably where to look) when it doesn't work.
"the low level stuff is just conveniently hidden or abstracted away. "
Yes. This is exactly what I want.
When I write an engine or framework, I will deal with low level stuff. But for basic tasks, I don't want to. But I have to, even where it would not be necessary.
"But the long textual code that makes the button work, eventually exists somewhere in there."
Also there exists even longer textual or even binary code that makes the code for the button work, etc., etc., but I still do not want to deal with it on a daily basis.
When I make a button, it should be as simple as possible.
Where it is positioned, how it is styled, and where the onclick method that handles it lives. Those three things I want to do with simple clicks in the IDE.
But the more abstract you go, the less efficient your editor becomes, and the speed of development goes down. It's not as simple as just adding more abstraction.
Of course not. It is about the right level of abstraction.
Abstracting away all the details I do not need in order to accomplish the task at hand.
If done right, this also will not slow things down, rather the opposite. Imagine programming a button in assembler and OpenGL. There you have all the details (and power). But chances are, you will be way slower and end up with a worse result, because you have to focus on other details and not on the task at hand.
Stop. Abstraction is not a magic pill that absolves you of the need to know wtf you are doing.
At best, it allows you to defer the deeper reading for a time. You might not even have to do the deeper reading for this piece of functionality, but the time will come when you have to figure out where an impedance mismatch is.
This quest for abstraction is the most infantile attitude I strive to squash in every developer I meet. If you aren't reading the code that your code depends on to work, you have no idea what you're actually doing.
"Stop. Abstraction is not a magic pill that absolves you of the need to know wtf you are doing."
Erm, a beginner who wants to place a button that calls his defined method foo, really does not need to know about all the framework details to get the job done.
When I was a beginner, I was happy that I could place a button and link it with whatever with ease and it worked!
I simply did not need to know about event loops, rendering algorithms, or the internals of the framework. Now I happen to know quite a lot about it, because I designed a UI framework from scratch to solve a custom need.
But most UI use cases are to display text and images and to offer text input and buttons. Ordinary programmers should be able to do that without having to learn the graphics stack. That's why we have frameworks. They just could be easier, with better tooling.
I've used UE4 for years and the Blueprints system doesn't save you much. Those abstractions are bricks, made for building brick houses. If you want to build something made out of anything other than brick, you're back to custom HLSL nodes and better off programming your shader somewhere else.
Blueprints are nicer for known problems, they are a better glue for old knowledge, but when you want to do something custom or new, you're out of luck.
This is the same problem with all programming, you can't make something specific and new without doing an engine teardown and rebuilding it. All the parts are terminally interconnected and always will be.
At the end of the day, UE4/5 is a landing pad for all the work you make in other tools. I agree web dev is no fun though.
Ironically, I know everything from ASM to...well many languages, including C, C++, PHP, Python, Ruby, Javascript, etc.
The most productive tools I used prior to learning all of that? HyperCard on the Mac, Visual Basic on Windows. Why? Tooling. Even when you had to get down and dirty, it seemed the tooling helped you out in some way. Shoot, I wrote an application back in the 90s in VB6 that is still functional TODAY, not just on Windows but also on Linux and macOS (thanks to translation layers such as Wine).
The same can NOT be said for the game I made in C++/OpenGL later...
I have personally never regretted using tools that were dismissed as "toys" by the mainstream devs. Including HC, VB, and Arduino. I've got a VB5 program running in the factory that has had virtually zero failures in use for 14 years. I even hand-translated a curve fitting routine from Numerical Recipes into VB.
What angers me in the Windows world is how the WinDev team routinely sabotages VB and C# efforts to get back to that productive flow for graphics programming on the platform.
Managed DirectX and XNA were all killed when their advocates gave up fighting for them.
Nowadays they proudly refer to Unity when one complains about missing .NET bindings to DirectX.
JavaScript's standard library is basically NodeJS plus the browser WebAPIs.
UI libraries/toolkits were really only considered part of any standard library in maybe one language (Java), which is still a stretch, because even Swing and AWT were toolkits.
Standard libraries are small. That's the point. You use primitives to build up the rest.
"No standard library" might be too strong a criticism, but "a dozen or so built-in global variables with various methods" is still probably not the user experience that most desire.
"A dozen or so built-in global variables with various methods" is exactly what a standard library is. (Well, usually it's more than 'a dozen or so'.) Generally those variables are of the "package" type, but no such type exists in JavaScript.
well it depends on what you mean by convenient - java's got a UI toolkit that works out of the box for all supported environments (awt), and on top of that are built a fair number of UI frameworks like swing (which is also part of the java standard library).
The sum language of HTML+JS+CSS, or in other words html files. I haven't seen a better way to write UI in text files; other UI frameworks rely a lot on building the UI in some special editor or aren't easy to use.
HTML as a UI is geared towards making documents and displaying text and graphics in a rigid layout/format.
It's not geared, imho, for a UI with lots of interactivity - e.g., imagine trying to implement blender or an audio processing app (like audacity) in html/css. You'd basically be creating new primitives on top of html, and build your UI on that (or install one of those UI libraries that _renders_ to html).
Reminds me of my time doing WebObjects development, and I fully agree that it seems the frameworks I use today (ReactJS, FeathersJS) aren’t (yet?) as simple to develop with visually.
Great post - the one word missing, "maintenance." Most programming work is done maintaining/enhancing existing code. The greenfield work is a piece of cake by comparison.
You want to do the hard stuff? Maintain existing code you're not familiar with.
The problems around maintaining unfamiliar code are huge, largely unsolved, expensive and risky. There's a little branch of computer science called Program Comprehension and no one pays any attention. Though most programming money is spent on maintenance.
I always tell my clients - the difference between writing code and maintaining it, is the difference between raising your hands and keeping them raised indefinitely.
I always smile when a greenfield project starts and they claim it’s “Clean Code”. No, you won’t know whether it was clean code until years down the line, when the system needs updates. Then and only then can you reflect and see how hard it was to change things in it.
Fully agreed. No matter how "clean code", the next person or team is immediately going to label it "legacy" and complain endlessly about all the choices made by the original author(s).
Much as we denigrate COBOL, that is still its greatest advantage. Yes, it's wordy; yes, it's old; yes, it really needs to be updated. But it's still easier for a new hire to understand old COBOL code than any other old code.
I sincerely don't know why universities don't offer more classes on just that:
- pick any software
- try to change something
- give a precise impact analysis
- generate a few potential implementation paths
- measure how fast you did all that and what failed / worked
A fundamental problem with how we teach programming is that we focus on writing, instead of reading, software. To borrow terminology from the language arts; we don't focus enough on reading comprehension.
At the University I'm associated with, there was a discussion about why more 'practical' classes weren't taught. The answer was 'We're not a trade school.'
I've also wondered if it's time to back off on agile a bit. Or at least "agile" as it is implemented generally, which means we get to make up new requirements every two weeks. In my experience, the hardest maintenance problems occur because we're trying to re-shape code into something that the developers never knew would be coming down the line. Spending lots more time up-front deciding requirements would go a long ways towards more maintainable code.
It's interesting to me that people are so into Agile when it simply doesn't work all by itself. How many books, conferences, certifications, practitioners, evangelists, etc. does it take to make a paradigm, supposedly the paradigm, of working actually work? Every company I have worked for that used Agile basically had broken processes and was not productive. The one job where we did not explicitly use Agile was actually the place where I produced the most useful work.
If the number one answer to we're using Agile but it's not working is "you're doing Agile wrong", then maybe Agile isn't the solution?
The way I worked at the place that did not have an explicit process was one of switching between agile and waterfall methods. Early on in the projects, the process was primarily waterfall: defining things up front, building prototypes, laying out the project, etc. Then once past that stage, projects would enter a more agile or iterative process. Then as new big features came in, back to waterfall, and then switching to agile once that stabilized. This worked quite well.
I think engineering effectiveness comes down to a people and communication problem.
Writing software well is difficult by itself and people are all different in understanding, experience and skill level. Throwing a group of people together and trying to build something that works for every scenario is a miracle given the complexity of software and the complexity of interpersonal relationships and organisations (see Conway's law). I'm impressed by every multiplatform language or tool or software. It's a lot of work!
If everybody did things the same way i.e the agile way and if agile was proven to work and everyone followed it the way it was intended and designed, then maybe we could all be interoperable and easily work together and produce projects that don't fail. That's the fantasy.
To be fair, I've been on six years' worth of agile projects and I was never on a project that failed.
Maybe I'm just an old curmudgeon, but it seems to me there's way too much emphasis on speed of development in general. Time to market seems to be more important than quality, robustness, security, performance, or any other concern.
Another thing that rubs me wrong is the recurring notion that we need to get rid of the text as a representation of code. I've yet to meet a mathematician who wants to get completely rid of formulas because coming up with a proof is slow and cumbersome.
I understand that the incentives are drastically different between academia and business, but perhaps we've gone too far in this particular direction. Perhaps it's okay to admit that programming isn't as easy as we all want it to be and it's okay to take the time to change things carefully instead of moving fast and breaking stuff.
> Another thing that rubs me wrong is the recurring notion that we need to get rid of the text as a representation of code.
I completely agree with this notion. However, I feel like we're sorely missing out on some form of visual exploration. I feel like the majority of my time is spent trying to understand the flow of execution of a program I'm trying to maintain that was written by other teams that are long gone.
It would be so amazing to be able to "zoom out" from the text and get a graph view of execution flow through different files. And then being able to highlight a particular execution flow that you're studying would be great. I know we already have some flavors of this with "find all references", or something like Visual Studio's profiling tools that highlight flows of execution, but none of these have ever felt like they improve the exploration of the codebase very much.
It would be very interesting to see some tools that allow a more fluid exploration of an unknown codebase. More akin to zooming out of a Google map view and tracing flow from point to point, instead of diving headfirst into a million files looking for the correct information.
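As a sketch of the kind of tooling being wished for here, Python's sys.settrace can already record which function calls which across files; everything below (names, output format) is hypothetical and only meant to show how little is needed to get a raw call graph to feed into a "zoomable" viewer:

```python
import sys
from collections import defaultdict

call_graph = defaultdict(set)   # caller -> set of callees

def tracer(frame, event, arg):
    # Record an edge for every Python-level call, keyed by file and function.
    if event == "call" and frame.f_back is not None:
        callee = f"{frame.f_code.co_filename}:{frame.f_code.co_name}"
        caller = f"{frame.f_back.f_code.co_filename}:{frame.f_back.f_code.co_name}"
        call_graph[caller].add(callee)
    return tracer

def run_traced(entry_point):
    sys.settrace(tracer)
    try:
        entry_point()
    finally:
        sys.settrace(None)

# Toy example: two functions standing in for code spread across many files.
def helper():
    return 42

def main():
    return helper()

run_traced(main)
for caller, callees in call_graph.items():
    for callee in sorted(callees):
        print(f"{caller} -> {callee}")
```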
Yes, like crocodile clips for a circuit board, I can insert a probe into a circuit and see what's on the wire.
When your call stack is 50 methods deep, there are 20 layers and marshalling involved, and it's difficult to see what is going on.
I want to mark two pieces of code and see the data structures passing through them, similar to a debugger but more like a log file or a trace in Jaeger or Kibana. But the actual POJO or JSON objects themselves.
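A rough sketch of that "crocodile clip" in Python: a probe decorator you attach to exactly the two functions you care about, logging the actual data flowing through them as JSON lines, trace-style. The probe labels and the toy order functions are made up for illustration:

```python
import json
from functools import wraps

def probe(label):
    """Hypothetical probe: log the arguments and result of a marked function."""
    def wrap(fn):
        @wraps(fn)
        def logged(*args, **kwargs):
            result = fn(*args, **kwargs)
            print(json.dumps({
                "probe": label,
                "function": fn.__name__,
                "args": [repr(a) for a in args],
                "kwargs": {k: repr(v) for k, v in kwargs.items()},
                "result": repr(result),
            }))
            return result
        return logged
    return wrap

@probe("order-intake")
def parse_order(payload):
    return {"sku": payload["sku"], "qty": int(payload["qty"])}

@probe("pricing")
def price_order(order, unit_price):
    return {**order, "total": order["qty"] * unit_price}

price_order(parse_order({"sku": "A-1", "qty": "3"}), unit_price=9.5)
```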
This is a great analogy and captures what I meant very clearly! It would be awesome to pinpoint two pieces of code and ask for the execution flow visualizer for that path :)
I've felt like this for a very long time, and it's disappointing that there aren't really any tools to do this yet. It would make it so much faster to come up to speed on a new codebase, and also to see where some potential problems might lie.
> Another thing that rubs me wrong is the recurring notion that we need to get rid of the text as a representation of code. I've yet to meet a mathematician who wants to get completely rid of formulas because coming up with a proof is slow and cumbersome.
I see what you're saying, if I understand correctly, in terms of the hype around "low or no code" environments, if that's what you mean. However, I do disagree with the notion of equating code with text. It seems myopic in the same fashion of equating tape or punch cards with code.
Also, I think the mathematics example isn't quite apt. Mathematicians are very willing to use visual representations to both illustrate and prove ideas. The way mathematicians work is actually a great example of moving between visual and symbolic representations and also prose.
In many significant ways, text is a highly limiting form of expression. In my opinion, we have nearly tapped out the available expression that can come from purely text-based programming languages. One example of this is the limitations of the innovations in syntax. There's been basically no significant innovation in this area aside from small iterative improvements, and I think that's a fundamental limit we've hit. I feel the future of programming is likely to hinge on a highly hybrid environment. In some sense, Smalltalk and its ilk, Emacs and Common Lisp, TouchDesigner, LabVIEW, and vvvv are precursors for what could come.
The part of Agile a lot of people ignore is the phase when, after your code works, you spend as much time as it takes to make it well structured and readable to others.
If you think of code writing as a process of organizing your thoughts, exploring/understanding a domain, and articulating it iteratively (for code/run/debug loop), then your pithy comment seems perfectly rational.
the problem is that a large amount of the code produced is there to support the accidental complexity of the software (let's call it infrastructure code), while the value generated by it comes from its essential complexity, that is, the application domain, as cited by the article.
while Domain Driven Design and Clean Architecture are a first step in the right direction, languages and frameworks are still limited in supporting these ideas.
It's almost insane to think that once you have a modeled domain you need to replicate it in resolvers, types, queries, etc. (for GraphQL), and resources, responses, endpoints (for REST), etc., etc.
to reinforce the point of the article and the one brought by @SKILNER, I believe that the great transformation that is to come is in maintainability through a focus on the domain
I think, as OP alludes, ultimately explorability is the heart of the problem that stops people from writing code this way.
If you don't care about other people being able to read and explore your code without a lot of preparation, you can go whole hog creating layers of DSLs and metaprogramming, and with enough dedication you can end up with all your real domain-level business rules in one place, separate from the "infrastructure".
But if you do this, you end up with a codebase that is hard for a newcomer to ask simple maintenance questions about like "What are all the places XYZ is called from?" or "What are all the places to write to ABC" etc
So an experienced developer learns to limit their metaprogramming so their code retains easy explorability at the expense of long term sustainability. Golang is kind of an epitome of this kind of thinking. Lisps are kind of the epitome of the opposite I guess.
This is what's behind the paradox where a good dev working alone or maybe with one very like minded person can produce a level of productivity you can't match again as you add developers, till you have way more developers. The dip represents the loss due to having to communicate a common understanding of the codebase, that doesn't get adequately compensated for till you've added a lot more people.
> This is what's behind the paradox where a good dev working alone or maybe with one very like minded person can produce a level of productivity you can't match again as you add developers, till you have way more developers. The dip represents the loss due to having to communicate a common understanding of the codebase, that doesn't get adequately compensated for till you've added a lot more people.
Wow I couldn't agree more with this point. You put it perfectly. I worked alone and later with one other developer on a project and it felt way more productive than my current team of 5 developers!
I find other people's code difficult to follow due to all the abstractions that I wouldn't have inserted.
When I write Clojure code for example, the code just works. I somewhat understand the code I write.
But when I read other people's Clojure, I find it difficult to understand. I think it's my maturity in understanding Clojure, the mental model isn't there yet. Which is funny because I've been on two projects where we used Clojure.
I don't have the same problem with less metaprogramming languages such as C, Java or Python.
I would argue that a huge or maybe even the primary reason why maintenance is so hard, though, is because a lot of software was originally not written "correctly", that is without maintenance and longevity in mind.
The hardest part of maintenance, in my experience, is that programs were developed under needless or incorrect constraints and then expected to be magically maintained. There is a huge downstream effect of decisions made early on in a software system's life.
The story of "just get it working" that evolves into "now that's it's working, don't change it but add these new features" repeats itself over, and over, and over. It's not surprising why maintenance in systems developed like that is hard.
yes. I wish I could somehow get to work on a blend of: a decompiler, debugger, emulator, static analyzer, memory profiler, and so on.
The idea being some kind of a runtime for assembly code which does not actually execute the program but allows one to understand it at different semantic levels, or dunno.. this is a very raw idea. needs a lot of work (and a lot more knowledge) to pin down.
too bad none of the professors that I was able to meet were really interested in this kind of thing
I think dynamic analysis is incredibly powerful and criminally underused in IDEs and other dev tools.
I thought of an idea about 6 months ago that has been fermenting in my mind since then: what if (e.g.) a Python VM had a mode where it recorded the type info of all identifiers as it executed the code and persisted this info in a standard format? Later, when you open a .py file in an IDE, all the type info of the objects defined or named in the file is pulled from the persistent record, crunched by the IDE, and used to present a top-notch dev experience.
The traditional Achilles' heel of static analysis and type systems is Turing completeness. Traditional answers range from trying and giving up (Java's Object or Kotlin's Any?), not bothering to try in the first place (Ruby, Python, etc...), very cleverly restricting what you can say so that you never or rarely run into the embarrassing problems (Haskell, Rust,...), and whatever the fuck C++'s type system is. The type-profiling approach suggests another answer entirely: what if we just execute the damn thing without regard for types, like we already do now for Python and the like, but record everything that happens so that later static analysis can do a whole ton of things it can't do from program text alone. You can have Turing-complete types that way, you just can't have them immediately (as soon as you write the code) or completely (as there are always execution paths that aren't visited, which can change types of things you think you know, e.g. x = 1 ; if VERY_SPECIFIC_RARE_CONDITION : x = "Hello" ).
You can have incredibly specific and fine-grained types, like "Dict[String->int] WHERE 'foo' in Dict and Dict['bar'] == 42", which is a peculiar subset of all string-int dictionaries that satisfy the WHERE clause. All of this would be "profiled" automatically from the runtime; you're already executing the code for free anyway. Essentially, type-checking and inference becomes a never-halting computation amortized over all executions of a program, producing incremental results along the way.
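A minimal sketch of the profiling half of this idea in plain Python, assuming nothing beyond the standard library: record the concrete argument types seen at each call while the program runs normally, and persist them for an IDE to read later. The file name and JSON format are made up; existing tools such as MonkeyType work along broadly similar lines:

```python
import json
import sys
from collections import defaultdict

observed_types = defaultdict(set)   # "file:function" -> {(arg_name, type_name)}

def type_profiler(frame, event, arg):
    if event == "call":
        code = frame.f_code
        key = f"{code.co_filename}:{code.co_name}"
        for name in code.co_varnames[:code.co_argcount]:
            if name in frame.f_locals:
                observed_types[key].add((name, type(frame.f_locals[name]).__name__))

def run_profiled(entry_point):
    sys.setprofile(type_profiler)
    try:
        entry_point()
    finally:
        sys.setprofile(None)
    # Persist in a made-up format that a hypothetical IDE plugin could consume.
    dump = {key: sorted(pairs) for key, pairs in observed_types.items()}
    with open("type_profile.json", "w") as f:
        json.dump(dump, f, indent=2)

def greet(name, times):
    return ", ".join([f"hello {name}"] * times)

run_profiled(lambda: greet("world", 3))
```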
I have ahead of me some opportunity to at least have a go at this idea, but I'm not completely free to pursue it (others can veto the whole thing) and I'm not sure I have all the angles or the prerequisite knowledge necessary to dive in and make something that matters. If any of the good folks at JetBrains or VisualStudio or similar orgs are reading this: please steal this idea and make it far better than I can, or at least pass it to others if you don't have the time.
This is how JavaScript intellisense in Visual Studio used to work, except that the program was executed "behind the scenes" using a trimmed-down VM that could execute without side effects or infinite loops. It was eventually abandoned due to poor performance, predictability, and stability.
The problem is this dilemma: If you have to wait for a "real" execution of a program, then very reasonable expectations like "I can see a local variable I just declared" doesn't work. If you try to fake-execute a program, you have problems like trying to figure out what to do with side-effecting calls, loops, and other control flow problems.
Trying to reconcile a previous type snapshot with an arbitrary set of program edits was tried by an early version of TypeScript and wholly abandoned because it's extremely difficult to get right, and any wrongness quickly propagates and corrupts the entire state. The flow team is still trying this approach and is having a very hard time with it, from what I can tell.
Something very close to this can be done, and in fact it's done already by static analysis. Static analysis doesn't have a Turing-completeness problem as such: it is limited in providing complete answers because of the halting problem, but it can instead provide sound answers. That is, static analysis can provide all possible runtime types for any given (e.g. Python) expression. This can be accomplished by doing Abstract Interpretation, or Constraint Analysis and Dataflow Analysis (kCFA), which is in a way similar to what you suggest of running the program in a profiler, but instead it runs the program in an abstract value domain. With static analysis you will get some imprecision (false positives) but no false negatives, so it can effectively be very useful for a developer in an IDE. The precision (amount of FPs) and performance are mutually dependent, but good performance and reasonable (read: useful) precision can be achieved, although it's a non-trivial engineering problem.
Additionally, some programs cannot be easily and/or quickly executed to cover all possible paths, so parts of the program will remain uncovered by the profiler. That is one place where static analysis becomes very powerful, because you can cover the whole program much faster (in linear time, making largish precision tradeoffs, but the analysis still yields a usable result).
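For intuition, here is a toy abstract interpreter in Python over a three-element type lattice (int, str, Any). It is only an illustration of "running the program in an abstract value domain", not a real analyzer; the tiny expression format is invented for the example:

```python
TOP, INT, STR = "Any", "int", "str"   # a 3-element abstract value lattice

def join(a, b):
    """Least upper bound: identical types stay precise, otherwise widen to Any."""
    return a if a == b else TOP

def abstract_eval(expr, env):
    kind = expr[0]
    if kind == "const":
        return INT if isinstance(expr[1], int) else STR
    if kind == "var":
        return env.get(expr[1], TOP)
    if kind == "add":     # int + int -> int, anything else widens to Any
        left, right = abstract_eval(expr[1], env), abstract_eval(expr[2], env)
        return INT if left == right == INT else TOP
    if kind == "if":      # condition is unknown statically: join both arms
        return join(abstract_eval(expr[2], env), abstract_eval(expr[3], env))
    raise ValueError(f"unknown expression kind: {kind}")

# x = 1; if VERY_SPECIFIC_RARE_CONDITION: x = "Hello"  ==>  x : Any
program = ("if", ("var", "VERY_SPECIFIC_RARE_CONDITION"),
           ("const", "Hello"), ("const", 1))
print(abstract_eval(program, {}))   # prints "Any" (int joined with str)
```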
Indeed, Abstract Interpretation is powerful and beautiful as heck. Type systems can be seen as just a special case of general Abstract Interpretation, where the types are the abstract values computed from each expression. The problem is that it requires ingenuity to devise abstract value systems that don't degrade into useless generality as unknown branches in the code 'execute'. Even types only succeed as much as they do because they require extensive and invasive (language-design-changing as well as program-changing) cooperation from both the language designer(s)/implementer(s) and the developer(s) using the language; without this cooperation you will leave open very wide gaps. Symbolic Execution is another, much more powerful technique but, predictably, also very inefficient in general.
My way of looking at this is just frustration at the misallocation of resources. Dynamic programs already have tons of typing information available, just not at the right time and place. Your brain spends an awful lot of time "executing" the code of a dynamic language while it's writing the code, just like the language runtime, only far more slowly and with a sky-high probability of error. So all I'm saying is: why this immense suffering, when the information your brain is in dire need of is already right there in the interpreter's data structures, just in opaque and execution-temporary forms? It strikes me as incredibly obvious to try to bring that typing information from the runtime into persistent, transparent records available at dev-time.
Abstract Interpretation and Symbolic Execution are (fancy) tools in programming language theory's toolbox that can help us, but the simplest possible thing to try first is to make use of the already-available information that is just lying around elsewhere. To make a somewhat forced analogy: if we have a food crisis at hand, we could try fancy solutions like genetic engineering of super crops or hydroponic farming, but the dumbest possible solution to try first would be to simply bring in food from other places where it's plentiful. Typing info in dynamic languages is very plentiful, it's just not there when we need it.
The problem is what I mentioned at the end of my other comment -- complex programs will get inputs from web APIs, the console, and databases, so it's not possible to have complete runtime coverage for all paths in the interpreter in the general case.
"Principles of Program Analysis" covers the subject very well. It's not an easy or beginner book because it's very formal, but very complete and with plenty of examples.
The idea of runtime type feedback was originally explored by the SELF group for optimization [0], and is currently used by JS runtimes like V8. The obstacle your vision faces today is that program editors are almost all completely separate to the languages they are used for, and thus such communication with the runtime is enormously painful and complex. Time to return to the truly integrated development environments of Smalltalk/Lisp with modern niceties like gradual types and multicore processors, methinks!
I developed a dynamic analysis/IDE tool for old IBM systems. I was told by VC's that IT management never spends significant money on programmer productivity. I think they're mostly right. Maybe because no one knows how to actually measure it.
I think if you have to run a whole program to understand a small part of it, you've already lost. The most valuable tools are a REPL that can execute units of your code, and a language that enforces purity and immutability so that local reasoning is more likely to be sufficient.
> The traditional Achilles' heel of static analysis and type systems is Turing completeness. Traditional answers range from trying and giving up (Java's Object or Kotlin's Any?), not bothering to try in the first place (Ruby, Python, etc...), very cleverly restricting what you can say so that you never or rarely run into the embarrassing problems (Haskell, Rust,...), and whatever the fuck C++'s type system is.
This stanza was maybe not your primary point, but it was beautifully written, and multiple programmer friends of every variety mentioned have been cackling at it.
There have been attempts at what you describe before. I can specifically point to work done in Ruby by my PhD advisor using the exact profiling approach, and then static typing from that: http://www.cs.tufts.edu/~jfoster/papers/cs-tr-4935.pdf
> you're already executing the code for free anyway
Based on my experience working on a similar domain of type systems for Ruby (though not the exact approach you describe), this turns out to be the ultimate bottleneck. If you are instrumenting everything, the code execution is very slow. A practical approach here is to abstract values in the interpreter (like representing all whole numbers as Int). However, this would eliminate the specific cases where you can track "Dict[String->int] WHERE 'foo' in Dict and Dict['bar'] == 42". You could get some mileage out of singleton types, but there are still limitations on running arbitrary queries: how do you record a profile and run queries on open file or network handles later? How do you reconcile side effects between two program execution profiles? It is a tradeoff between how much information you can record in a profile vs. the cost of recording.
There is definitely some scope here that can be explored with longer-term studies that I have not seen yet. Is recording type information (or other facts from profiling) over the longer term enough to cover all paths through the program? If so, as this discussion is about maintaining code long term, does it help developers refactor and maintain code as a code base undergoes bitrot and then gets minor updates? There is a gap between industry, which faces this problem but usually doesn't invest in such studies, and academia, which usually invests in such studies but doesn't have the same changing requirements as an industrial codebase.
https://github.com/instagram/MonkeyType can perform the call logging, and can export a static typing file which is used by mypy, but also e.g. PyCharm. It doesn't expose such fine grained types, but you could build that based on the logged data.
This is only feasible if the program takes no input, or a very limited set. Once you open it up to arbitrary input, no single run (or even a large set of runs) can capture everything the program might be expected to handle.
How does the type profiler know that the variable that only contained values like "123" or "456" was handling identifiers, not numbers?
>This is only feasible if the program takes no input, or a very limited set.
One of the insights that people making tools for dynamic languages discover over and over again is that most use of dynamic features is highly static and constrained. In general, yes, a python program can just do eval(input("enter some python code to execute :) \n>>")), but people mostly don't do this. People use extremely dynamic and extremely flexible constructs and features in very constrained ways. This is like the observation that most utterances of human languages are highly constrained and highly specific; even within syntactically-valid and semantically-meaningful sentences, not all utterances are equally probable, and some are so improbable as to be essentially irrelevant. People who try to memorize the dictionary never learn the language, because the vast majority of the dictionary is useless and mostly unused, and even the words that are used are only used in a subset of their possible meanings.
>Once you open it up to arbitrary input, no single run (or even a large set of runs) can capture everything the program might be expected to handle.
Anything is better than nothing, right? If your program keeps executing with some set of types over and over again (and it will, because no program is infinitely generic; the human brains that wrote the code can't reason over infinity in general), wouldn't it be better to record this and make it available at static write-time?
Human brains are finite, so how do we reason over the "infinite" types that every Python program theoretically deals with? We don't! Like I said, most dynamic features are an illusion; there is a very finite set of uses that we have in mind for them. Here is an experiment you might try: the next time you write in a dynamic language, try to observe yourself thinking about the code. In the vast majority of cases, you will find that your brain already has a very specific type in mind for each variable (or else how could you do anything? Even printing the thing requires assuming it has a __repr__ method that doesn't fail).
>How does the type profiler know that the variable that only contained values like "123" or "456" was handling identifiers, not numbers?
It doesn't. I think you misunderstood the idea a little: the type profiler makes no attempt whatsoever at discerning the "meaning" of the data pointed to by variables, it will only record that your variable held strings during runtime. If the number of string values the variable held was small enough, it might attempt to list them, like "Str WHERE Str in ["123","456"]". If the number of values the variable held was larger than some threshold but some predicate held for it consistently, it can also use that, i.e. "Str WHERE is_numeric(Str)". If a string variable was always tested against a regex before every use, it will notice that and include the regex in the type. Nothing smarter than this is attempted; just the info your VM or interpreter already knows, recorded instead of thrown away after each execution.
The profiler will not and cannot attempt to understand any "meaning" behind the data, nor does it need to in order to be useful; it's just a (dynamic) type system. No current type system, static or otherwise, attempts to say "'123' is a numeric, would you like to make it an int?"; that would be painful, absurd in most cases I can think of, and misguided in general.
> wouldn't it be better to record this and make it available at static write-time
I think you misunderstand my position. It's better for the creator to simply specify it when they write the code - i.e. static typing.
> the type profiler makes no attempt whatsoever at discerning the "meaning" of the data
Your examples are waaay beyond what I need or expect. What I need is for the program to recognize that, for a number, 123 and 456 are valid, but "abc" will never be. Conversely, for an identifier, no matter how many runs use values like 123, someday someone might provide the value "abc" and that's ok. Also, any code that attempts to sum up a collection of identifiers should not be runnable, even if the identifiers in question all happen to be numbers.
This is something that static typing provides, and no amount of profiling will ever be able to divine.
> In the vast majority of cases, you will find that your brain already has a very specific type in mind for each variable
Sure, for the code I write. But I've seen plenty of code written by juniors where, upon inspection, it was completely inscrutable whether a given parameter expects an integer, a string, a brick wall or a banana.
Which is all to say that my default position is unchanged: dynamic typing is unhelpful for anything larger than small, single-purpose scripts.
> No current type system, static or otherwise, attempts to say "'123' is a numeric, would you like to make it an int?"
SQLite's column type affinity will in fact do this: if you tell SQLite that a column is an INTEGER, it will turn '123' into 123 but will happily take a string like "widget" and just store it.
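This is easy to check with Python's built-in sqlite3 module:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (id INTEGER)")
# A text value that looks like an integer is coerced by INTEGER affinity...
con.execute("INSERT INTO t VALUES ('123')")
# ...while a non-numeric string is simply stored as text.
con.execute("INSERT INTO t VALUES ('widget')")
for value, storage_class in con.execute("SELECT id, typeof(id) FROM t"):
    print(value, storage_class)   # 123 integer / widget text
```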
I also wanted to add that your thoughts on this subject are well-stated and align with some work I've been patiently chipping away at, in the intersection of gradual typing and unit testing. I'll have more to say on that subject someday...
It would seem like that's getting the types too late to be useful, but in practice most code goes into production in stages, so you could start getting this probabilistic typing data from code that has rolled out in a limited way, for example features that are hidden to most users.
There has been interesting progress in this space from a different angle of attack. o11y (Akita, Honeycomb, eBPF, prodfiler, ...) and "Learning from Incidents" are two of these angles that have really built on the idea that "understanding what the system does in prod" matters far more than "writing it right the first time".
It is also the thinking we can see supporting a lot of the early devops movement.
A lot of popular languages make it really easy to write code, but hard to maintain it.
It's one of the reasons I like Rust so much, even for relatively high level code. It has one of the strictest type systems and std/library ecosystems of any language in existence, which tends to make code very robust and easy to refactor.
I often wish for a language that has a similarly strict type system/ecosystem but is higher level. Swift is quite close, but very macOS-centric. Haskell has exceptions, which are often used for error handling, making code less robust (plus the ecosystem is small). Other languages like Ada/SPARK or Idris are way too fringe...
> A lot of popular languages make it really easy to write code, but hard to maintain it.
The language you're writing in has next to no meaningful impact on long term maintenance. When you're trying to do software archeology on why a function even exists, its types will not help you figure out that it's there because between 1996 and 2002 there was a tax rebate on capex for rocket development.
You need human understandable documentation, not fancy language features. This is nowhere near as sexy, so no one does documentation, and we're in the middle of a digital dark age.
I disagree. Language does matter for long term maintenance. For instance with a tax rebate.
Static typing - a fancy language feature - can immediately tell you the function deals with money. +1 for maintenance. Static typing with a solid IDE will immediately reveal everywhere the function is used. +1.
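A tiny sketch of that point in Python (the Money type and the rebate function are hypothetical): the signature alone tells a reader, and a checker like mypy, that the function deals with money.

```python
from dataclasses import dataclass
from decimal import Decimal

@dataclass(frozen=True)
class Money:
    amount: Decimal
    currency: str

def capex_rebate(spent: Money, rate: Decimal) -> Money:
    """The types document the domain: this takes and returns money."""
    return Money(spent.amount * rate, spent.currency)

rebate = capex_rebate(Money(Decimal("100000"), "USD"), Decimal("0.1"))
print(rebate)
# capex_rebate(100000.0, 0.1)   # a type checker rejects a bare float here
```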
Languages shepherd you into certain designs or mistakes by making some things easier than others, such as when class inheritance can be easily abused in OO languages, cryptic one liners are easy in Perl, point-free cleverness in Haskell, monkey patching in dynamic languages, etc. +/-1 depending.
Per documentation, languages come with tools and best practices: Javadoc, Godoc, Rustdoc; C# even has a dedicated comment syntax for documentation - a fancy language feature. +1. No one does documentation, until the language tooling forces them to.
That said, BG (Before Git) and widespread use of source control, as in '96 to '02, was a dark age. We know more now.
> The language you're writing in has next to no meaningful impact on long term maintenance.
I disagree.
If you abandon a large block of code for a year, then come back to it, depending on the language, it may be difficult to grok what the code was doing and how it worked.
If it's Java, the tendency to create 100 levels of abstraction make it hard to reason about. If it's Python without type annotations, then your IDE may not be able to determine the types of variables, so it's difficult to determine what can be done with an object.
If it's Perl, then God help you. May He have mercy on your soul.
Also possibly including: install very old OS version, get libs and dependencies to some old version, and maybe update to a slightly newer compatible version, but not the latest. Create mock services of complex back end systems for test cases. Convert https connections to http for testing, or use unsupported TLS versions and protocols for some connections. Then you can get started!
Working on old but popular software is where legends are made. You need to be methodical about changes. How do you make changes in a million line+ codebase without breaking anything for millions of existing users? This challenge is reserved for the finest engineers on the planet. These are the people you want your desk near when you start your professional career as a programmer.
this is simpler than it sounds - the process trumps any engineering stardom. the process is king. the process is love, the process is life, quite literally. getting any change in there isn't so much fine engineering as it is wrestling with layers and layers of process, where every layer has been added due to a monumental f-up in the past. it's an environment where getting any change committed into a repository usually takes weeks, unless one of processes for sidestepping the process is invoked.
A huge opportunity here: ways to assist developers to figure out how existing code base works by generating code flow, class hierarchy, etc.
yes there are a few tools on the market, nothing that really stands out though; maybe it's for AI/ML to innovate in this field.
The Linux kernel is well designed, in that I can add code relatively easily into its subsystems. It's those object-oriented or FP codebases that bother me the most, especially when they're large; oftentimes they just make me feel hopeless. Good tools are desperately needed.
A similar concept is the maintenance of physical-world machinery. It's also often overlooked, but it's a huge industry even compared to manufacturing the machinery itself.
I’ve been programming for over 30 years, and I’ve never been as productive as I am now. Need to load and decode a jpeg/png? I grab stb_image.h. Need to decode an ogg file? libogg. Need to decompress? libz. Need to decode video? libavformat. Need physics? libbullet. Need truetype fonts? freetype. Need a GUI? Qt. Need SSL? libssl.
My day becomes selecting libraries, integrating them, and testing integration. My business code is < 20% of our entire codebase. And I’m much more productive in 2022 than in 1992.
> My day becomes selecting libraries, integrating them, and testing integration.
Funny, being an old, cranky, gray-no-beard, if I wanted to spend my days selecting libraries, integrating them, and testing integration, I would have become a digital electronics engineer.
Your point is well taken; it's just our modern truth. It's just not what attracted me to this world in the first place.
> Writing tests is time-consuming, usually doesn't scale, and it easily creates tight coupling with implementations
After centuries, humans still use double-entry accounting in finance. This is because it's effective. Unsurprisingly, it's tightly coupled to the original transaction recording.
The constant assertion that tight coupling is bad because it creates more work, which can be error prone, is a fundamental problem with how people discuss software testing. Testing is not a panacea, but a guardrail. Guardrails aren't indestructible. That isn't the point. Developers have an obsession with making things as easy as possible (be that simplifying or codifying), and this is not a domain where that makes for a better solution, all other qualities being equal.
I think "double accounting" is a good analogy, and if you look at it that way, some tests that look like the same thing written twice in a different way start to make more sense.
The problem is just with effectivity - it'd be fine if the time spent on test would be the same as time on code, but very often the effort on test is much bigger while not even covering all important cases. I think we need to look for methods to get a good case coverage without spending majority of our time writing tests.
Hit the nail on the head. For the same reason, requirements writing is so difficult: to explain every edge case you end up writing the whole software again, in human language.
Writing software is the process of explaining a set of conditions and conversions from input to output. We should assume the programmer does this in as little code as possible (unless we are using a low level language like C, which adds a lot of accidental complexity like memory management). To repeat all these conditions in test code would be to write the same thing twice, in a different way. I'm not saying this is wrong; it's just important to acknowledge it.
Testing often becomes more like sampling, with a hard-coded or generated set of known inputs/outputs, often taken from a scenario derived from a common user journey. This verifies that this particular journey works, but it is never a guarantee that other edge cases do; to cover those you end up in the write-twice situation.
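A small sketch of that sampling style in Python with pytest (the shipping rule and the sample points are invented for illustration): the test pins a handful of known input/output pairs rather than restating every condition in the code.

```python
import pytest

def shipping_cost(weight_kg: float) -> float:
    # Hypothetical business rule: flat fee plus a per-kilogram rate,
    # free above 30 kg.
    if weight_kg > 30:
        return 0.0
    return 4.99 + 1.50 * weight_kg

# Sampling: a few hard-coded points from a common user journey.
@pytest.mark.parametrize("weight, expected", [
    (1.0, 6.49),
    (10.0, 19.99),
    (31.0, 0.0),
])
def test_shipping_cost_samples(weight, expected):
    assert shipping_cost(weight) == pytest.approx(expected)
```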
The alternative is shipping your own, which most of the time is a) a bad idea, since you don't have the domain-specific knowledge, and b) something the customer is not paying for: they aren't paying you to reimplement the 10th JPEG parsing function.
What is your alternative then? Vendored-in dependencies with their ossified security vulnerabilities? Or figuring out homegrown code of dubious quality for the functionality that is not core business of the company/product?
Nothing so dramatic. The alternative is judicious inclusion of dependencies rooted in thoughtful, experienced engineering.
Robust, comprehensive libraries from reliable vendors can provide a lot of value and are slow to rot. These can be anticipated and added early so that they’re made good use of and can become the first tool to reach for before adding other dependencies. Think React, lodash, Qt, Boost, etc.
Meanwhile, I acknowledge that most other packages and repos are of far more dubious quality than anything my team would write, have no accountability to my team or stakeholders, receive few/no code reviews when added or updated, introduce conflicting style/semantic conventions, and generally expand the surface area for bugs and vulnerabilities by including many lines of code that have no relevance to the project.
There are no strict rules, but these are the sort of considerations that weigh in.
> Robust, comprehensive libraries from reliable vendors
Hah! In all my years the biggest, most difficult vulnerabilities/problems to remediate all came from vendor-made (90% of the time closed source) software. With the free, open source stuff the problems are usually very minor but even if they're BIG we'll pretty much always get fixes and deploy them faster than if we had to wait on a vendor.
The more popular the OSS the safer it is but even niche OSS tools/libraries are easier to work with by definition because we can look at and modify the source to correct the problem even if the original maintainer hasn't gotten around to it yet. This is actually a big reason why Python is preferred where I work: The code is inherently visible and fixable.
What language code is written in has little to do with how "flexible" it is. Or what did you mean? Refactorable? Readable? I don't immediately see what the choice of Python specifically has over other languages.
My experience in engineering tells me that vendors and dependencies come and go and get acquired and all of them develop their own set of problems over the years. I get what you're saying, but you picked projects that are easy to see as popular in hindsight, but not so much in early adoption. Why react over angular? There is a problem with picking vendors outside of just "picking the best" that has multiple elements:
- Age of the company
- Complexity of integration with the vendor
- Stability in early adoption
If you adopt the tech too early, you risk instability at a time when you don't have the size as a business to afford it. Nobody who is a customer of Google is going to walk away from their contract because it was down for an hour. But your new 20-employee startup goes down for an hour? And you just signed your largest customer yet? And it's because you decided to early-adopt some tech simply because it "looks good and efficient"? No thanks, that customer will be looking at alternatives when renewal time comes, if they don't just pay out the contract and leave right away. And good luck getting funding with the blemish that you blew it by chasing cool new trends.
Okay, so being an early adopter at a startup is risky. So maybe you go with something that is "kind of" new and is being used by other large companies. Great, so you start building, and the feature system grows, and the integration grows, and now some new system comes along that is way better. Think about life before React/AngularJS and the people who built systems with plain jQuery. There was a time when that was the hottest thing and nothing like React/Angular even existed. So you pick that, and then 3 years later you have piles of jQuery and AngularJS/React/whatever else comes out, but remember the maturity thing mentioned above. So now you watch it grow over 2 years and now you have 5 years' worth of jQuery, and the amount of work it will take to switch to React/Angular is a year-long project. You can't just stop everything else to work on this, so it's 20% of your time, never mind that it's a moving target as you fix bugs and release new things in jQuery over that year. So now you're maintaining both for a year, and then 3 years later it becomes clear that React is the winner over Angular, and you picked the wrong one! At the time you picked, they both seemed pretty mature (Google with Angular and FB with React). Oh jeez, do we migrate everything again? And from the business perspective, how do you even begin to sell this to the people in charge when it results in no benefits to the end user? And by the time you finish, will this even be the tech you should be using? Or will something else rise up to replace it?
The complexity grows with the integration as the age of the company increases. And adding headcount and expecting that to solve the problem just introduces more problems. Now your team size is 10 instead of 5 and standup is taking an hour every day; do you break it up into other teams? What are those teams called? What if nobody wants to work on the upkeep of the legacy? Why would they, as an IC? So now how do you split the work up fairly?
I'm not saying "don't use vendors", "don't adopt early" or "don't vet your vendors", but what I'm saying is the problem isn't the tech. It's the fact that the "obvious correct solution" is a changing answer over time. And the integration and added complexity combined with team dynamics eventually becomes the hardest problem a company has to solve.
The real answer is to cultivate a team that can work on the project long term using the old tech for significant gain. Your turnover just needs to be slow enough that new developers can be ramped up properly.
I did a gig at a place that brought me (along with like 50 other people) in to transition from a server rendered PHP app to a React front-end. They'd bootstrapped to something like $300M and had just brought in a new CEO who wanted to 10x the company.
It was fucking pandemonium. Every day I saw another message on their slack along the lines of "It's my last day, I had a lot of fun working with you all the last 8 years". My team was completely useless, as far as I could tell our mandate was to basically be nazis and tell all the supposed mouthbreathing pre-historic over the hill PHP greybeards how to write "good" React code (aka unmaintainable garbage).
I have no idea how they're doing now. I don't see them mentioned much anymore. They're probably worth $30M.
If you had to bet on who could 10x your company, would you go with the guys on the ground floor that have literally bootstrapped a company to $300M off the back of their PHP tech? Or would you go with the army of hip Javascript consultants, half of them off shore?
I say this as a specialised React developer who has been doing solely React since 2015.
There's absolutely no reason why you have to roll with the industry to every new fad. I have no intention of going anywhere when the next thing comes along. In fact, I'm waiting for all the noobs to fuck the hell off so I'm free to actually write good React code again. I bet that's probably how the PHP developers felt back when that was the fad of the decade.
If that company had backed their PHP developers in and given them enough shares to stick around, I bet they'd probably be a $3B company today. 90% of that is going to come down to the business side of things, not whether you throw away your entire engineering culture to score 20% better on some fucking render metric on a page transition or whatever.
> My day becomes selecting libraries, integrating them, and testing integration.
This is why I hate modern programming.
When was the last time you had a job where you could actually learn how to program something? The proliferation of libraries means that your bosses and managers will frown heavily upon anyone handrolling a solution instead of using an existing library.
Which means the more time you spend at a regular job, the worse your brain will rot.
Just talk to any programmer at any company and you will notice that almost none of them know how to program anything. They just know how to stitch together libraries with some glue code.
> The proliferation of libraries means that your bosses and managers will frown heavily upon anyone handrolling a solution instead of using an existing library.
They are right, even if you do not agree. Handrolling is incredibly cost ineffective, yet another source of bugs and can be seen as a bad idea when it comes to security.
Huge third party libraries like LibOgg are maintained by gigantic industry players. Chances are, they are more knowledgeable in that specific domain than you are and they can focus more developer time on what is just a small part in your business application.
How many of the people working on any given library are actually the domain experts on the subject matter that you imagine?
This might have been true 20 years ago. But now the cultural shift affects even the people who work on the lower levels.
A sort of famous example is how slow Visual Studio has gotten compared to 20 years ago (when it was running on slower hardware even!) where it was nearly instant.
The refterm saga has demonstrated that people employed by Microsoft and paid very high salaries to work on a terminal emulator have no idea how to make a half decent terminal renderer.
> handrolling is incredibly cost ineffective, yet another source of bugs and can be seen as a bad idea when it comes to security.
Without context this statement is blatantly false.
Many libraries are full of bugs and security problems.
I find this comment amusing because most tasks the GP mentions are so specialized that you wouldn't actually have time to learn them. If your complaint is left pad and similar libraries, that's not what GP comment is talking about.
And still you're typing repetitive text. My dad was punching cards and able to read code from punched tape. I grew up writing code on a home computer's display. And now, 35 years later, it's long overdue that some visual NoCode tool should be the default for mostly repetitive programming jobs. If I want to build a Finder Quick Action to resize an image, for example, I certainly won't open a C editor, I'll just use Automator.
Eh, unfortunately the one programming breakthrough the world actually needs is one that would drastically change, and perhaps harm, most of the people around here.
We need more "Excels." More and better tools that let "regular" people program.
I'm a firm believer in empowering end user automation. The atrocities that folks cobble together with the Excels of the world are a marvel, and more power to them.
But in the end, we all talk about leaky abstractions, and the stark horror is that all of these wonders run on a computer: the world's leakiest abstraction, the world's most stubborn, pig-headed, cantankerous contraption out there.
I think it goes beyond automation and into customization. Giving end users the power to "program the program" just makes it possible to make the thing do what they want in a way that it wasn't set out to do. Without having to pay/find/wait for a developer to do it. Sometimes it's about automation. Sometimes it isn't.
>More and better tools that let "regular" people program.
There's one catch with Excel: the more complicated your spreadsheet gets, the closer it gets to programming. And at some point, it's more cost effective to write it in a proper programming language than to continue maintaining the mess.
However, until this complexity level, Excel is great, and empowers a lot of people who barely know the basic features of Excel to solve their computational problems.
You can already have "regular" people as software engineers if you follow basic concepts like the SOLID principles. Unfortunately many engineers are incredibly smart and they don't see the difference between writing spaghetti code with cryptic names and writing clear, legible code that normal people can follow. I've worked with people that could figure out a program written in binary if they needed to, but that meant that they never learned to write structured, sensible, readable programs.
Or, more realistically, the rapid application development tools need to be better so we're not bogged down dealing with the latest fad or resume-driven development.
There's not much reason to improve things. No one's out here saying "you don't get this job because you chose new technologies when they weren't needed".
Even if you can equate glue code with trash, we seem to have unlimited space or a black hole to throw that trash.
It is not a fantasy for small businesses, who rely on internal tools and processes to operate. These tools solve real pain points for them and that is why they exist and why it is a fast growing market.
It's a lot of hype right now, not necessarily indicative of staying power. We've been through multiple bouts of "no-code/low-code will replace X" and it has never materialized in any meaningful way.
>We need more "Excels." More and better tools that let "regular" people program.
The only reason why more people don't program is because Windows is antithetical to programming _anything_. The second you remove people from a windows environment is the moment they start coding, even by accident.
I was agreeing with you, but as I thought about it, I remembered MS Access. It is used, often ships with Office, but still isn't close to as popular as Excel. Either the model is wrong, or the need isn't there.
In my experience, the main problem is that while Excel's "distributed code versioning" with a whole bunch of copies floating around is awful, Access databases are not only much more prone to corruption and data loss in general (even in single-user, non-simultaneous, ideal use cases, never mind people attempting to multiplayer-edit them over network shares...), it's also often very hard for non-programmers to get any insight into data you threw into them without using Excel in the first place.
So if the data is coming from Excel, and being extracted back to Excel, why use anything besides Excel?
Access used to be pretty standard for a small business trying to improve their situation, but we in the software development world spent so much time trying to work around limitations of Access and upgrading growing businesses out of Access that we took it personally and shouted to everyone not to use Access.
However, it is still taught in IT classes as a useful tool for small businesses, and it is.
My very first paid web gig was connecting a small business' Access based product catalog to their website. I was 14 or 15 at the time. I was able to do it with out any real coding knowledge except for some basic HTML I learned from MySpace and some copy-paste VBScript, haha. The site was a hit and I made $1,200 and immediately spent it all on an electric guitar.
Many more such cases [0], but Excel and spreadsheets in general also save businesses $$$$$, which explains their pervasiveness. Calculations take a long time and are easier for a human to mess up, which is why the initial spreadsheets were so popular as to drive the adoption of the personal computer in general [1]. The gulf between Excel and Python is pretty obvious, but recently there have been few with their sights set on it, as most programmers looking to improve programming do so from their own experienced vantage point.
We can't blame spreadsheets for the cited errors, because they're errors in arithmetic rather than Excel bugs. All businesses should get rid of arithmetic.
The question is not just how much Excel costs, it is how much revenue it helps bring in. And in many places, it brings more than it costs. Nothing is worse than a poorly designed webapp that was supposed to solve the "Excel problem" in a process.
Why would it harm us? We are problem solvers, we can use our intellect somewhere else. Programming is just a tool.
A lot of my colleagues are hybrids of traders + programmers. They do a lot of Excel. Sure, we can automate text programming and let everyone do Excel, and us problem solvers still won't have difficulty finding high-paying jobs.
True, but just using Excel does not lead to decent data models. IMHO, what we would need is for "regular" folks to grasp the basics of relational databases first (at least 1:n and n:1) and then build an easy-to-use, Excel-like tool around that.
I'm not going to argue whether the 'world actually needs' that, or if people mostly, actually want that, or if, since excel is already there if you want it, you probably don't have to write, sell or build any more low code tools.
...because, although I could argue about those points, it's fundamentally unrelated to the OP and the issue that programming is hard, and that, for a very long time, no one has really had any idea how to solve it.
Now, however, AI generated code (like copilot) is, for the first time in a long time, a potential avenue to actually change how software is created at all levels.
I think that's pretty interesting, because it opens up a lot of new opportunities.
Is the future dynamic languages / high level specifications that are AI-transformed into typed verbose languages like C/Rust/whatever and then compiled?
Ancient tools like Rational Rose tried to do this, but the tech was never actually good enough. Maybe... we'll see a real change in this space as the sophistication of the models improves... or maybe, like self-driving cars, it's always going to be 'nearly good enough'.
Hard to say.
...but, hot damn. More excels? Please no. Excel already exists. Don't rebuild that stuff again. Build new, different interesting tools please.
>...but, hot damn. More excels? Please no. Excel already exists. Don't rebuild that stuff again.
I don't think they literally meant another Excel. Just something that is easily accessible and usable by non-programmers to do very 'programmy' things, etc.
This argument also just assumes 'anyone can sit down and use Excel', which is not true; Excel has a learning curve like anything else. Most people learn 'just enough' to do their job; maybe there is a guy in their office who knows a little bit more and they can learn from him. But Excel proficiency acquisition is very similar to programming proficiency acquisition, except maybe the track runs out faster (at some point you reach real diminishing returns and limits to a spreadsheet's capabilities, but I've seen some insane shit written in Excel, raytracers for instance, though no one who has done that thinks it's a good idea).
>But excel proficiency acquisition is very similar to a programming proficiency acquisition
Strong disagree. Maybe if you only consider those extreme cases you referenced (ray-tracing, heavy VBA, etc.), sure. But in general? No shot.
Excel gives immediate feedback with a visual interface. There's no need to learn about variables or syntax or memory or compiling. There are no dependencies and no package managers. It abstracts nearly everything away. Click the place you want the thing to go, type the thing. Want a chart? Click the picture of the chart you like and click and drag over what you want in the chart. Want to change chart colors? Click the color you want. Want to change how the data is displayed visually? Press the bold button, or the color button, or the border button; whatever your heart desires is a click away with the same interface you're used to from Word.
A great comparison is a geographic heatmap. In Excel, you need 2 columns: one with some States/Provinces and the other with some numbers. Then you click the big map button and instantly have your map. I can write the instructions for it on the back of a napkin and someone who has never touched Excel can be making all the maps they want. Now think about how you would explain making the same geographic heatmap in Python (or whatever your 'easy' language of choice is) to someone who has never learned any programming concepts.
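For contrast, here's roughly what that same two-column heatmap looks like in Python. This is only a sketch, assuming the pandas and plotly packages are installed, and it uses made-up data:

    # Rough Python equivalent of Excel's two-column map chart.
    # Assumes pandas and plotly are installed; the data is made up.
    import pandas as pd
    import plotly.express as px

    df = pd.DataFrame({
        "state": ["CA", "TX", "NY", "WA"],   # two-letter state codes
        "value": [120, 95, 130, 60],
    })

    fig = px.choropleth(
        df,
        locations="state",          # column holding the state codes
        locationmode="USA-states",  # interpret them as US states
        color="value",              # column that drives the colouring
        scope="usa",                # zoom the base map to the US
    )
    fig.show()

Even this "easy" version already assumes you know what a package, an import, a DataFrame and a column are before the first map appears, which is exactly the gap being described.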
The amazing thing about Excel is that many (most?) people start using it without EVEN knowing how to make a simple formula.
They literally start using it as a way to lay out tables (usually of data).
Then they make graphs.
And add more tables.
And learn how to use it like a basic pocket calculator (half of the formulas I see that should be sums are actually =B21+B22+B23+B24+B25…)
Then comes SUM. Then VLOOKUP.
Perhaps some IFs or COUNTIFs.
This process can take years for someone to go from first use to anything even slightly programming-like, if you squint at it.
A tiny fraction of these people ever do anything complex enough that it even looks much like low-code or no-code "development". Even fewer write macros.
But you know what? In the first half hour, when all they wanted was to email round a list of team members and their lunch preferences, they already achieved something with Excel that they found useful, and they felt productive ever since. Most Excel users spend almost NO time learning the tool compared to how much time they spend getting (what they see as, and what probably is) value out of it, even when we are sat in sheer terror as we witness the horror they have unleashed on the world.
That is the bar you have to clear to make a "better Excel".
Honestly I tried to do some simple shit in excel today for the first time in years and holy shit, who thought this ribbon interface was a good idea. I'm going back to apple's numbers.
They mean no-code or low-code tools, but those come with a large hidden layer. These tools are known for making the 80% easy and the other 20% impossible. Excel can also become a mess.
>Is the future dynamic languages / high level specifications that are AI-transformed into typed verbose languages like C/Rust/whatever and then compiled?
How is that different from the present? Why are these "AI-transformations" different from what a compiler can do?
It’s fundamentally more sophisticated. This is like asking what is the difference between modern ML translation and the previous 20 years of research on language translation; the former actually works.
The latter basically doesn’t except in very specific circumstances.
Compilers can turn language into instructions only in a limited extremely specific set of circumstances.
We need more interesting tools, as you said. As long as the horizon of innovation remains "something that generates good old code", I'm not very excited about the future. Copilot is an amazing tool, sometimes gimmicky, but a tool of the present nonetheless. Like a compiler, it takes instructions and generates instructions.
We need a future where AI is involved in debugging, and program comprehension, and correctness. More interesting tools than a better RAD.
Yeah, I was using Excel as shorthand for "simple enough yet powerful enough tools such that someone who is perhaps an expert in something else can build themselves something useful."
Glue code is used to piece together libraries. I think it's a sign you used the right libraries and aren't reinventing. It's boring, but glue code essentially describes why your project is not the same as every other app using those libraries.
Boilerplate is a thing that happens when libraries don't have sane defaults. It's one of the more unpleasant things about Docker et al. I wish we had a dev climate where something like Linux Standard Base could take off.
I find frameworks to be perfectly fine, as long as you break out of the "Perfectly fit the code to the vision in my head" model and go for "How do I make this product using only these high level blocks I have, without modifying them".
Frameworks are great as long as you don't fight them. Using a distro like Mint? Don't try to swap out core system services. Using a declarative web framework? Don't try to do some synchronous imperative stuff it was never meant to handle.
I get the feeling a lot of programmers really value creative freedom and don't want to feel like they are playing a rail shooter, just coding the one obvious thing the opinionated tools tell them to do, and then they have a bad time when they use tools that were specifically designed to remove any need for interesting code.
A lot of why we have containers is to stop dependency hell, but having a stable platform to build on is another way to achieve that without containers at all.
Of course it seems like the other reason we use them is to bundle multiple services together, but I'm not exactly a microservice fan to begin with.
In any case, I don't see why we couldn't have an OS that makes debian packages into containers, sharing things that are needed by multiple containers as appropriate while allowing overrides.
Android apps have a pretty nice model. If Android were a bit more open and had a few modifications to better support that use case, it seems like Dart/Android would make a pretty nice backend, assuming the ecosystem was there.
Android proved the benefits of an extremely opinionated inside the box kind of system, making APKs run on Linux and Windows seems like it would have a ton of uses.
I feel like a lot of boilerplate is a choice, that often comes from frameworks being used.
I disagree with the text criticism as well. The reason we spend time thinking and reasoning about code is not because it's in a text format. It's because that's the job. We aren't scribes just copying books. The hard part of programming is figuring out what to do, not writing it out. While I'm sure better visualizations or something could offer minor improvements to comprehending code bases, I doubt they will change things significantly. We will always spend most of our time figuring out what we need to do, not doing it.
Anything that feels repetitive, like writing boilerplate code, is something that can, should, and probably will be automated eventually. This is a constant if you are a programmer: whatever you are doing today that is repetitive, you'll likely be using some better way of doing it in the future. And as Alan Kay says, the best way to predict the future is to invent it. I'm always on the lookout for ways to do the same things with less repetition.
Boiler plate usually results from people using something that was not originally designed to do that thing or not designed that well. It's indicative of some kind of design friction or feature creep. When it works, people just do more of it without really thinking about what they are doing or why. If you see people copy pasting the same blobs of code over and over again, that's a good sign something is wrong.
At some point that becomes the way things are done and people start nit picking each other about doing it properly and then inevitably somebody comes along and does a thing that is vastly simpler and accomplishes the same thing. Happens over and over again. Some people then inevitably resist that new way of doing things because they are really invested in the old way. But the way our industry works is kind of Darwinist; so those things tend to die out quickly once something better comes along.
The two mistakes people in this industry make over and over again are assuming that 1) they know it all and 2) things don't change. Because things actually do change, and usually for good reasons, the former requires work to stay true. Some skills last longer than others, and not all change is great. If you are just coasting and writing the same stupid code over and over again like it's Groundhog Day, it's eventually going to run out on you.
> I feel like a lot of boilerplate is a choice, that often comes from frameworks being used. [...] I disagree with the text criticism as well.
The representation as text files is also part of what's causing the boilerplate.
As an example: How would you reduce the devops and configuration boilerplate in a monorepo?
In your typical JS microservice, a lot of code is not actually JS but yaml, json, terraform, etc. And it is very hard to abstract these away since a lot of tools rely on the existence of actual files.
Of course you can use code generation and other macros to manage the units in your monorepo (nx.js does this). But this is very unstable, since you might need to tweak some file, resulting in your macros not working any more.
My take would be that textual representation is a local maximum. And our needs have outgrown this maximum. We're spending 90% of our time to innovate to push the boulder up the remaining 10% of this very hill. But there are most likely other hills with a much higher peak.
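To make the code-generation escape hatch mentioned above concrete, here is a minimal sketch (service names and file paths are invented) that keeps one source of truth in code and emits the per-service JSON files the tools insist on:

    # Keep one source of truth and generate the per-service config files.
    # Service names and paths are invented for illustration.
    import json
    import pathlib

    SERVICES = {
        "payments": {"port": 8001},
        "invoices": {"port": 8002},
    }

    for name, cfg in SERVICES.items():
        out = pathlib.Path("services") / name / "service.json"
        out.parent.mkdir(parents=True, exist_ok=True)
        out.write_text(json.dumps({"name": name, **cfg}, indent=2))

It works, but as noted above it's fragile: the moment someone hand-tweaks one generated file, the generator and reality drift apart.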
>In your typical JS microservice, a lot of code is not actually JS but yaml, json, terraform, etc. And it is very hard to abstract these away since a lot of tools rely on the existence of actual files.
This is a problem of your own making, though. If you don't do a microservice architecture in JS and do a modularized monolith in Go, as an example, you don't have to write any of this code.
> Maybe you could write tests as queries that would test a whole set of possible programs, not only the current version of your program at the moment.
I think that the future of programming is more sophisticated static analysis. Programmers will write statements like, "every code path that writes to the Payments database must have called validate_user()." Then, the tooling will confirm that rule with every commit.
We kind of have this already (for example, Facebook's Infer tool [0]), but I think it will become much more important in the coming decade.
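To make the idea concrete, a rule like the one above can be approximated with a surprisingly small custom check. The sketch below is purely illustrative (the names validate_user and payments_db are invented) and uses Python's standard ast module to flag functions that call into the payments database without also calling the validator:

    # Toy static check: any function that calls payments_db.* must also
    # call validate_user(). Purely illustrative; real tools such as Infer
    # are vastly more thorough (interprocedural, path-sensitive, etc.).
    import ast
    import sys

    def called_names(func: ast.AST):
        """Yield the dotted names of every call inside a function body."""
        for node in ast.walk(func):
            if isinstance(node, ast.Call):
                yield ast.unparse(node.func)

    def check_file(path: str) -> int:
        violations = 0
        tree = ast.parse(open(path).read(), filename=path)
        for func in ast.walk(tree):
            if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
                continue
            calls = list(called_names(func))
            touches_payments = any(c.startswith("payments_db.") for c in calls)
            if touches_payments and "validate_user" not in calls:
                print(f"{path}:{func.lineno}: {func.name} touches payments_db "
                      "without calling validate_user()")
                violations += 1
        return violations

    if __name__ == "__main__":
        total = sum(check_file(p) for p in sys.argv[1:])
        sys.exit(1 if total else 0)

Hook something like this into CI and the rule is confirmed on every commit, which is the workflow described above.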
To set up an example, Azure has some API management stuff that could let you do this before you even got to your code. Writing a tool to make sure that API management rule exists would be different than static analysis.
I'd agree with you, but not all of the business logic like that is going to live in code in the future.
I would love to see a language that considers tooling a first class problem. Most (all?) languages barely consider complex projects with 100s of dependencies and large multi-step builds. Java shouldn't have to wait for Maven and Gradle to come along, and Python shouldn't be shackled to piss-poor systems like pip.
This is something Rails gets very right. Bundler and Rake make Rails one of the nicest development environments there is.
But we can and should do better. Why is DLL hell still a thing? Imports should specify versions, falling back to a sensible default if none is given. Languages barely even consider versions today; they are almost always hacked on via magic text strings in filenames.
Why do tools like Dependabot and Renovate need to parse a million dependency formats to figure out versions? The language runtime or build system should produce that as a first class output.
Deployment and documentation are other places most languages ignore and delegate to a variety of crappy external tools.
Build Server Protocol is a good step in the right direction, but it barely scratches the surface. It would be great to see a language that prioritizes the parts of the developer experience that aren't writing code.
>But we can and should do better. Why is DLL-hell still a thing? Imports should specify versions, defaulting to a default if none is specified. Languages barely even consider versions today, they are almost always hacked on via magic text strings in filenames.
So suppose you fix the version number when you import the library. Do you then need to copy-paste the version number to the other imports in your code base? Or maybe you mean each import may have a different version? Which is all right until you need to pass some data between two versions of the same library. This wouldn't be so difficult if semantic versioning or an equivalent were respected, but we know it's a lie. It wouldn't be a problem at all if there were some kind of contract in place when using modules/packages/whatever, instead of a silly number based on convention, but how to specify the contracts is not that easy, and it's similar to, say, having static typing. I think improving versioning is doable, but it's very hard.
I'm all about improving tooling; I have been obsessing for a few months over the sad state of program documentation and navigation. Why is the computer doing nothing while I'm thinking so hard while programming? Why is there so little money in this space?
Go and Rust, with the go and cargo commands respectively, have great tooling in this respect. As long as you use one language, everything is very convenient.
As a dyed-in-the-wool Rubyist, I consider Ruby the pinnacle of high-level, abstracted, expressive programming for the contexts I care about (small web applications largely written by solo devs).
What's sad to me is that the modern follow-up to Ruby seemingly doesn't exist. Every hot "language du jour" which has come after Ruby has gone BACKWARDS. Lower-level, more systems programming oriented. Maybe even compiled. Static typing everywhere. It's utterly baffling to me.
"Why are you using Ruby? You should use…Rust! (Go! Zig! Fill-in-the-blank nerd hype!)"
Lol.
What I actually want is a new programming language/environment which makes Ruby look like programming pointer arithmetic in C by comparison. Something so advanced, so high level, that much of the time you're really just describing patterns and flows and data models and extensions, and then letting the computer determine the most efficient way to develop those code paths and execute them.
Unfortunately, I'm a bit cynical on this front. I believe the reason this doesn't exist is because it's at cross-purposes with programmer nerd culture. Many programmers enjoy the nitty-gritty of low-level coding. They fear abstraction. They fear "magic". They fear things like "implicit imports" or "duck typing" or "many ways to express the same method/function/algorithm" etc. because it's all nebulous and fuzzy compared to the safe confines of deterministic math & logic. "If I declare that this variable MUST BE AN INTEGER, then it MUST BE AN INTEGER. The idea you could pass me a string instead? UNACCEPTABLE !!!#%@!"
In other words, I don't have high hopes that great UX for forward-looking developers will come from present-day programming culture. For a quantum leap in DX, we probably need people who aren't die-hard programmers to engage in blue sky thinking. We need to talk to artists, philosophers, linguists, psychologists, and other experts in social & historical cultural dynamics. They can provide the insight we lack. Because for every "this is an integer, damnit!" type out there, there are probably many, many more who would see 123 and "123" and think THAT'S THE SAME THING. :-D
I think the industry has generally recognised that static typing makes for more reliable software.
I started with PHP and have seen its type system evolve. Similarly TypeScript. I have since moved to other statically typed languages. I wouldn't want to go back.
Yeah, I think this is what a lot of people miss. When I only used dynamic languages, I told everyone that they were the greatest thing ever and they were idiots for using static languages. Then I started using static languages and would never go back to dynamic languages. (OK, I still write Emacs Lisp.) All the things I thought were great about dynamic languages were actually a huge waste of my time, but I didn't know because I never really explored the other side of the world.
I have looked at modern Ruby projects, and my assessment is that I absolutely would never want to be responsible for code like that. It's all way too magical, if you see code that says something like "foo.bar", you really have no way of figuring out where the code for "bar" is. "bar" may not even appear as a literal sequence of characters in the application's codebase, or even in the libraries it includes.
If you like that, that's fine, but I wouldn't touch it with a ten foot pole. I don't find it that enjoyable, even if it does save me time today when I'm typing in the code.
In Ruby, `foo.method(:bar).source_location` works like 98% of the time. There are also LSPs like Solargraph now that work pretty well in VSCode, etc.
I will admit there are certain gems which metaprogram the heck out of everything and that can get annoying, but the flipside is usually those are the gems which are hugely flexible because they're so popular and widely used (I'm looking at you Devise)—and if so there's plenty of documentation and community support out there. I'm also glad `binding.irb` is a thing now for immediately jumping into a console in the midst of your running app, and the new debug.rb gem shipping with Ruby 3.1+ is awesome as well (also with a VSCode integration).
Oh, also I just got a hot tip from a buddy to look into Ruby Jard, a visual (but still terminal-based) debugger with a whole slew of features. Haven't tried it yet but it looks really neat-o.
> even if it does save me time today when I'm typing in the code
IMO this is the strength of Ruby/Rails. You can have something up and running incredibly quickly, but for complex applications it can easily become a mess. It's not easy to know where something is defined, what type it is, if it can be nil, if you're unknowingly calling the database one or one hundred times with your fancy one-liner, etc.
Of course you CAN structure it all nicely beforehand, and even use type systems, but that's hard to get right, and at that point, you might as well use something else.
I don't think that the industry has generally accepted that. It seems pretty hotly debated AFAICT.
What is not debated much, AFAICT, is that it's much much easier to work on large/old codebases when you have types. That's really where Rails runs into issues.
But it's not like legacy rails monoliths start throwing NPE or other type errors everywhere. It's just that developer productivity grinds to a halt. Mostly for two reasons:
(1) You have little idea where the code you're modifying is used. Metaprogramming and indirection through send and constants make it hard to find every place it's called.
(2) If you're the 5th function in the chain you have no idea where your parameters are coming from. So you have no idea what type they are and method autocompletion doesn't work. Especially annoying when things get complex and you end up with multiple similar but slightly different representations of things.
That's just because you tend to ignore the many complexities that are actually involved in a form on a web page. It's exactly this bias, which stems from untyped languages, that needs to die. HTTP is a complex beast and we should either replace it with something simpler and more robust or at least acknowledge the complexity in our programming.
Generally speaking, if doing something with a proof of type-safety feels complex, you probably ignored many corner cases before.
If you don't cut corners on your static typing, you're looking at 2-4 single use classes per API endpoint: 1-2 for the web layer, and 1-2 when going into the domain logic layer.
Many people skimp and just re-use the core domain class for everything, which is the worst of both worlds.
I'll take a map, spec, and select-keys over this all day long.
That's if you write it in Java. With a type system like TypeScript's you can write largely the same code as before, with a few annotations to clarify what you mean, and you get the exact same runtime behavior, only now you also get compile errors if a change elsewhere in your program violates the assumptions of that code.
2-4 single use type definitions per endpoint sounds pretty atypical to me, but I agree choice of language helps here, e.g. TypeScript has pretty powerful typing facilities (Pick/Partial etc) that make type reuse far more practical.
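For what it's worth, here is a minimal Python sketch of what "1-2 classes per layer" can look like in practice (all names are invented); the same shape translates directly to TypeScript interfaces:

    # Separate boundary type vs. core domain type. All names are invented.
    from dataclasses import dataclass

    @dataclass
    class CreateUserRequest:      # web layer: exactly what the endpoint accepts
        email: str
        display_name: str

    @dataclass
    class User:                   # domain layer: the richer core model
        id: int
        email: str
        display_name: str
        is_active: bool = True

    def create_user(req: CreateUserRequest) -> User:
        # The boundary type documents that callers never supply 'id' or 'is_active'.
        return User(id=_next_id(), email=req.email, display_name=req.display_name)

    def _next_id() -> int:        # stand-in for real ID generation
        return 42

Whether that reads as pointless bookkeeping or as documentation of the endpoint's contract is essentially the disagreement in this subthread.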
It's bookkeeping yes, but definitely not pointless.
Your tools leave you open to type errors, which you have to de-risk via unit tests. That risk is automatically eliminated by static type systems. Dynamic typing is a reasonable choice for single-developer and/or toy programs, but it isn't appropriate for projects maintained by teams over time.
No, it's not. Initially it's not any more effort and in the long term it's significantly less effort. Having spent equal amounts of decades in static and dynamic typing (and strong/weak) I would absolutely take the former any day for both ease of development and correctness.
> Having spent equal amounts of decades in static and dynamic typing
I hear this, as well as "eliminates whole classes of run time errors" (as if Ruby and other such codebases are just ablaze everywhere on account of this...), but I wonder if people dabbled in this and think they have it all figured out, or if they've used a modern dynamic language like Clojure that has default immutability or been in a modern Ruby codebase that wasn't cowboy-coded.
> as if Ruby and other such codebases are just ablaze everywhere on account of this..
They literally are, I think? Ruby is a notoriously unreliable language, in large part due to its dynamic type system... (also all of the unsafe idioms it promotes, a separate discussion)
Some of this is just personal taste -- programmers are not all cookie-cutter intellectuals that approach all problems the same way. We're also not all working on the same problems.
For me, the idea of dynamically modifying code at runtime is like the first circle of hell. But I've programmed in enough languages to understand the appeal even if it doesn't fit with the way I like to work.
But I will take issue with blanket statements that static typing is more effort than dynamic typing with all things being equal. In my own personal experience, that is just not the case. But I wonder if many people have been so damaged by Java that anything seems better by comparison.
> When you're programming a form on a web page? It's way-ay-ay more effort for dubious gains, with new downsides added into the mix.
If you don't mean just HTML but some server side processing, yes, I would. This is still PHP's bread and butter and being able to bind a form to static types is really convenient.
New downsides such as? And if having to add a few type definitions to your codebase is "way-ay-ay" more effort, I can only assume you really hate typing, as the mental effort is typically very low (compared to figuring out how to efficiently translate business logic into code, how to organize code to make it easy for a team to work with over many years as requirements change, etc.). There are even tools that will generate type definitions for you based on sample data.
This is a wild-ass guess but ... I don't use IDEs.
In fact I use original Bill Joy vi.
I suspect that for most of the people we're talking to, 'I<tab>' will say Int and 'S<tab>' will say String, and that's going to make our experiences very different.
Actually, typing in PHP is not really static. Errors are thrown at runtime, AFAIK. There might be separate tools which check the types before running the program, but they are not PHP itself. That aside, I think part of the reason why static typing brings so much to the table in languages like PHP and JS (in the form of TypeScript) is that they are weakly rather than strongly typed. Weak typing and dynamic typing can make for a terrible DX when bad designs are present, like they are a lot in PHP and JS.
I like static typing personally, but, for example, I have no issue with using a dynamically but strongly typed language, which does not allow me to treat a number as a string or a string as a number or similar. Often such a language at least provides means of making things safer at runtime. For example, one can define structs and functions which only work when they get an instance of that struct. Any other type of value would result in an error. Then of course there are unit tests, which can cover a lot of ground, even if not all that a type system can do in theory.
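As a small sketch of that runtime-enforcement style (the Money type and add() are invented for illustration), a "struct" plus a function that refuses anything else:

    # Runtime enforcement in a dynamically but strongly typed style.
    # The Money type and add() are invented for illustration.
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Money:
        amount: int       # smallest currency unit, e.g. cents
        currency: str

    def add(a: Money, b: Money) -> Money:
        # Reject anything that is not the expected struct, at runtime.
        if not (isinstance(a, Money) and isinstance(b, Money)):
            raise TypeError("add() expects two Money values")
        if a.currency != b.currency:
            raise ValueError("cannot add different currencies")
        return Money(a.amount + b.amount, a.currency)

You pay for the check at runtime instead of compile time, but the error surfaces at the boundary rather than somewhere deep inside the program.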
>Static typing everywhere. It's utterly baffling to me.
I find it far more baffling that people don’t want to offload the trivial parts of the mental work of ensuring the programs they write make a minimum of sense.
I also jumped on the dynamically typed bandwagon after using low value type systems like java’s, but after using haskell, ocaml, typescript, purescript, rust, and especially elm, I realized that they let me focus on the business value over the incidental complexity from having any value potentially being anything…
(Even Agda and Idris have been amazing revelations in this context, but at that point the type theory does start to consume more time than necessary for ordinary development)
I also like statically typed everything. But I have to admit, some of the more adventurous stuff people do with JavaScript (that just works because it is dynamically typed) is pretty ergonomic. Even with TypeScript it is often next to impossible to make it strongly typed.
I guess Ruby or even Ruby on Rails is very similar, with the ability to monkey-patch just about everything.
Some things are just better (left) dynamically typed.
The main problem with more and more abstract forms of describing "what you want" instead of "what the computer should do" is that the program still needs to run on a computer, and those are inherently non-abstract. There is no duck typing on the CPU level and figuring out what exactly you passed as an argument takes up a lot of time.
Don't get me wrong, I love me some Ruby and it makes up the majority of my income. But it would be foolish to ignore that it is indeed 100x slower (or more sometimes!) than some of the modern compiled languages. Small to medium web applications are exactly the sweet spot for Ruby because network latency hides a lot of the language slowness. Nobody cares about 20 extra milliseconds for a request to a server 100 ms away, but a lot of people care about 20 milliseconds delay in frame rendering.
EDIT: The following:
> What I actually want is a new programming language/environment which makes Ruby look like programming pointer arithmetic in C by comparison. Something so advanced, so high level, that much of the time you're really just describing patterns and flows and data models and extensions, and then letting the computer determine the most efficient way to develop those code paths and execute them.
sounds exactly how I feel when programming in Haskell, although that particular language will probably not sit well with you if you absolutely want duck typing.
Pretty sure I speak for a majority of experienced devs when I say: we don't fear abstractions. We fear abstractions made by others, including our former selves, in the all too often occurring context of "business wants it done by yesterday and we have zero idea what the requirements actually are". All the while we're still figuring things out and having 30-minute discussions on the most minuscule things.
I think you're all being too charitable to OP: "Programming would be easy if it weren't for stuck-up snobby programmers gatekeeping it" is up there with "math would be easy if it didn't have all this notation stuff" or "music would be simple if they didn't insist on writing it on those staves with those circles".
IMHO math notation is ugly AF, and music notation was something I hated so much as a teen I almost abandoned my budding professional music career because of it. (Thankfully I found a great niche in early European folk music.) So basically you're right. ;-P
> "music would be simple if they didn't insist on writing it on those staves with those circles"
Yeah. Staves with circles are something optimized for pens, and almost every modern music software uses simpler notation. Does that make music simple? Not quite, but it certainly makes it more accessible.
I get your and the article's drift, but saying it won't come to pass because of grumpy programmers is IMO a bit unfair. I believe it'll come to pass, but it'll take a while because there's not a huge need for it right now.
I believe the whole computing ecosystem is very shallow at the moment. There are lots of tools and lots of programming languages, but they are all shockingly similar. There is just too much monoculture going on. We used to have APL, Lisp, Prolog. Weird stuff. Weird OSes. We need more of that weird stuff again. But there has to be some economic incentive for that to happen and I'm not seeing it at the moment.
Stuff like Smalltalk or even something like Symbolics' machines would be a better fit for people needing a more high-level approach to computing. These tools see very little if any real funding/action and my hunch is that this will change in the future. Not now, but someday. When indeed perhaps the philosophers, artists and psychologists of this world need to do some programming. I'm not a big fan of these kinds of systems, because I am one of those grumpy bastards that don't agree with "123" == 123 but I do see how these paradigms can fit other types of people more readily.
To be honest, I don't think we've even scratched the surface of what is possible.
Edit: OK I can't resist. "012" == ?
This is making me lose sleep. These things are NOT the same. There, I said it.
A monoculture is a sign of maturity. In the past, most programming languages were awful in one way or another. Whether it was C, Fortran, APL, Prolog, COBOL, Basic, etc -- all uniquely terrible. But now every programming language is pretty decent.
Re: your edit, I think the real answer is that the correct translation is fully context dependent. The people who want it strict want to basically force/precompute the context, versus the people who want it loose, who want the context itself to smartly decide what the string means. The problem is that modern paradigms are squishy about this and aren't smart enough to consistently do the right thing, so real systems end up both too strict and brittle for the real-world domain and too loose and ambiguous to be bug free.
I agree with you that "Re-write your Ruby project in Rust" is a terrible suggestion. Pretty much only "Re-write your C++ project in Rust" makes any sense.
But it's not all bad. Elixir is pretty widely considered to be the Ruby successor. Types are replaced almost everywhere by pattern matching, the widely accepted unexpected behavior response is to just not handle it, crash the "process" (green thread) and continue on with life, integers don't overflow, and the macro system is powerful enough to do just about anything you want.
> Elixir is pretty widely considered to be the Ruby successor.
Really? That feels really odd, once you get past the surface level syntax, Elixir doesn't really resemble Ruby to me, unless you've conditioned yourself to write purely functional Ruby?
You can certainly treat it as a nicer interface into the BEAM VM, but it's certainly possible to never write a line of Erlang while being an amateur Elixir developer.
Though I'll have to defer to others if its also possible as a professional.
Just because Ruby acolytes Dave Thomas, Jose Valim and Chris McCord championed Elixir that doesn't make it Ruby's successor. The languages are poles apart beneath the surface syntax. Crystal has more of a genuine claim. I also think Kotlin is a relevant language for re-writing a Ruby codebase.
Something like ruby is great for WRITING greenfield code. But without static typing, diving into and maintaining an existing codebase that you just got hired on to deal with becomes a lot more stressful.
And that's why there's still plenty of legacy Java code, but legacy ruby code over the last decade has been much quicker to get thrown away and rewritten.
I once took over a Java codebase that was clearly created entirely by people who barely knew Java and I was still able to fully understand (and, in the process, refactor it to cut it in half, because half of the code was dead code by the time I encountered it, lol) in about a week. This was not a very large codebase to be honest, but it would have been impractical without static typing and a good IDE that leverages the same.
Dude, the reason programmers fear abstraction is because they've had to maintain legacy codebases. It sounds like you've been lucky enough to mostly be able to do greenfield stuff or maintaining codebases that aren't very old. I am sure you would understand their views more if you had more experience with maintenance.
This all makes sense given "for the contexts I care about (small web applications largely written by solo devs)". Unfortunately for you most software (and therefore what most developers work with) isn't for those contexts, but rather for larger projects written by multiple devs, where being lower level and compiled (runtime speed) and having static typing (more explicit and easily understood by other devs) become much more valued.
You'll probably always be in the minority with your preferences because of this.
I don't fear magic as much as magic*, where the * is all the exceptions that had to be hacked into the magic to make it work on a slightly different use case than what the magician dreamt up.
If 123 and "123" were the same thing, then 123 + 123 would have to be equivalent to "123" + "123", and 123 + "123" could not be made a runtime error. And PHP (the poster child for weak typing, second only to Bash) uses a separate . operator for string concatenation (rather than +), and -> for object field access.
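A quick illustration of that reasoning in Python, where the two values behave very differently:

    # 123 and "123" are not interchangeable: + means different things for each.
    print(123 + 123)          # 246 (integer addition)
    print("123" + "123")      # 123123 (string concatenation)
    try:
        print(123 + "123")    # mixing the two is refused outright
    except TypeError as exc:
        print(f"refused: {exc}")

JavaScript, by contrast, silently coerces 123 + "123" into "123123", which is exactly the ambiguity being objected to here.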
Complaints about static typing boil down to not wanting to be responsible for the schema and data architecture of the program itself, and that desire to offload responsibility for designing the schema is where ORMs like ActiveRecord find a foothold.
Unfortunately for folks and languages with this mindset, if they need to have good performance, eventually they will need to take responsibility for the data model of their database and their codebase.
I totally agree. When given the chance, (and if the goal is productivity) I always use Python because it is the most high-level and most productive language and ecosystem out there. I learned it almost 20 years ago and it was apparent after only spending a night or two with it that it was massively more powerful than Java, C and C++ which were the languages I knew at the time. Since then I have not discovered anything that was a similar improvement as Python was over Java. Concatenative programming comes close but suffers from small ecosystem and there are issues with the syntax.
Automatic memory management became mainstream in the 90's and dynamic typing in the 2000's. But what did the 2010's give us? Github and Stackoverflow? I think we are due for another "programmer productivity" revolution but I have no idea what it will be. Statically typed programming languages that are marginal improvements on existing dominant languages for sure aren't it.
What I would like, as a step forward at least, is something that fills your ruby (or python) niche, but based on Rust, and when in doubt thinks like rust, so that one can migrate from it to full Rust, with a minimum of re-learning. Same libraries, making them as good as Rails and as easy as Ruby in general, garbage collection, etc. But when you flip a switch (or incrementally, in 2-4 levels?) the full compiler checks kick on, garbage collection is off by default, and it lets you specify the details that give rust-like performance, features, and, of course, ease of distribution (a single, smaller binary). With no re-learning of a whole new language -- taken in steps, if/when one is ready to grow.
That way it could be recommended to someone who would otherwise not be willing to take on a harder language, and they have a growth path. Or for a computer science program, to grow into it all incrementally.
And then yes, make whatever improvements in the OP article, from there.
> the reason this doesn't exist is because it's at cross-purposes with programmer nerd culture
If that was the reason programming is "hard", we wouldn't ever have gotten COBOL, Basic, PHP, SalesForce, Windows, UML or any drag-and-drop query tool.
Agreed. Worse for me was starting with VB6 as a kid: a nice IDE, amazing debugging, easy visual components. Entering the professional field using PHP seemed like a strange nightmare. It was like taking a time machine into the past.
Do you think that perhaps functional programming can be a happy compromise between high-level abstractions / declarative code style and strong math / logic foundations? My programming experience is largely in Python and Java, so I can relate to both sides you present - I feel constrained by the aggressive static typing of Java, but I also feel that Python can be a bit too fast and loose with typing, and I often find that dealing with unexpected behaviors takes as much time as it would to just have them written formally and properly into the code to begin with through a strong type system.
> for the contexts I care about (small web applications largely written by solo devs).
Found your source of disagreement. This is, generally, not the domain anyone writing languages cares about; no one writes a language for an individual.
> Instead of emphasizing the what, I want to emphasize the how part: how we feel while programming. That's Ruby's main difference from other language designs. I emphasize the feeling, in particular, how I feel using Ruby. I didn't work hard to make Ruby perfect for everyone, because you feel differently from me. No language can be perfect for everyone. I tried to make Ruby perfect for me, but maybe it's not perfect for you. The perfect language for Guido van Rossum is probably Python.
Can definitely understand that. I’m a big fan of Ruby myself and I use it quite often, and for sure, at first the readability of Lisp syntax seemed like a problem. I will just note that at some point I realised that while Clojure can seem quite freeform, it in fact has a very strict structure, which helps a lot.
I suppose that if at some point you could get into it, you would find a rich toolbox with a lot of heavy lifting done under the hood and a lot of space left for you to think about application design first and foremost, with fair performance.
I love Ruby, I'm not a dyed in the wool Rubyist but I've been involved with some fairly large projects that were written in Ruby and in my experience writing Ruby code can feel somewhat magical and empowering because of how fast you can get ideas down. As much as I love Ruby the meta-programming and the type system can make refactoring a complete nightmare and as projects grow larger they tend to become a really big mess.
Sounds like you want more declarative languages, like SQL. The magic to get those to work well is a lot easier to implement and optimize if the problem space is restricted. So, DSLs.
Have you tried any of the Low-Code/No-Code tools of the recent years? Do you feel they are a big leg up from Ruby when it comes to small webapps done by solo developers?
I can't say that I have specifically tried any, but I feel like if someone had come out with a new "the Rails for X" where X is a programming language even more abstracted than Ruby, I would have heard about it.
Unfortunately I don't use it at work, but Ruby is still my favorite language, by far; it suffers from a lack of solid math/sci libraries like Python has.
Rails + Phoenix dev here who loves both. Elixir is far from having the conciseness of Ruby and Phoenix is super far from having the tooling of Rails (and Rails from having the tooling of Beam and OTP). From my point of view they are two very different beasts, each one being great in its own domain. The similarities are mostly syntactic and superficial.
> We spend endless amounts of time bikeshedding the right syntax, indentation level, tabs vs spaces, or where to put code in the structure of files, but this all feels just pointless - these are all properties of text, but the text is just a tool to manipulate some abstract model of the program.
No we don't spend endless time doing this. We automate as much of that as possible. I haven't cared about syntax for whitespace for years (I mean, I care and I have opinions, but I almost never have to work on it).
One item from the list, which we can't automate, is a really important part of the job: "where to put code in the structure of files". This is an intrinsic part of the solution modelling. Often this is something you want to change as the project expands. What are the modules and objects of your application? What are their responsibilities? Does this new object now have responsibility for that thing over there? These are key questions to answer on a daily basis.
I agree with the complaints against boilerplate. Boilerplate can help decide where a bit of code should live, but it also gets in the way when changing that decision is the best thing to do. We don't teach or celebrate refactoring enough (at least, not in the places and projects I've worked).
For me it is the separation of business logic and data from the programming implementation.
Often the workflow in automating a work process is:
- Worker has intuition on how things work
- Specialist starts automating and running into exceptions, walls
- Worker explains exceptions as they're discovered
- Specialist adds spaghetti to the 'clean' business logic model they started with
- This keeps going until the end result looks like what the worker expects
- The worker then sees a datapoint they think is incorrect, but they can't check it because the business logic is embedded in software, code, database stored procedures, etc.
- The specialist is then called back into debug and resolve the issue.
At my workplace we run into this issue constantly, where the ability to modify/adapt the business logic is lost through automation, and it becomes obscured to the point that the intuitive workers can't trust that the logical model is right.
There needs to be better separation of business logic from models so that programmers don't need to 'dig' into business processes and exceptions, and workers don't need to deal with part of their responsibility being shoved into a black box that they're not sure they totally trust or can adapt easily should a need arise.
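One way to read "separation" (purely as an illustration, not a claim about the commenter's setup) is to keep the rules themselves as plain data the domain expert can inspect and edit, and keep the engine that applies them dumb. A minimal sketch in Python; the rule fields, actions, and the example record are all invented:

    # Rules live as data the domain expert can read; the engine is generic.
    RULES = [
        {"field": "country", "op": "==", "value": "DE",  "action": "add_vat_19"},
        {"field": "total",   "op": ">",  "value": 1000,  "action": "require_approval"},
    ]

    OPS = {"==": lambda a, b: a == b, ">": lambda a, b: a > b}

    def applicable_actions(record):
        # Return every action whose condition holds for this record.
        return [r["action"] for r in RULES
                if OPS[r["op"]](record[r["field"]], r["value"])]

    print(applicable_actions({"country": "DE", "total": 1500}))
    # ['add_vat_19', 'require_approval']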
Wow. I would say the complete opposite. Why are you writing a program for work unless you are helping to solve a problem? The implementation only exists to work with business logic and data.
Excel is an implementation that does not care about business logic and data. Anyone can add their own business logic and data.
That most of the business world lives and breathes Excel is a testament to OP's point. Companies can cram whatever garbage logic they like into a spreadsheet, and no one has to call a developer (or Microsoft) because their pivot table isn't doing The Business Thing they expect - the end users simply have the power to understand and fix the problem themselves.
>Why are you writing a program for work unless you are helping to solve a problem?
I'm not entirely sure what your point is here other than implying that the programs I/we write aren't solving a problem, which is not the case.
> The implementation only exists to work with business logic and data.
This is true. I would say that any implementation of automation is really an implementation of business logic that turns one set of data into another. For example, taking financial transaction data and using business logic to convert this into data that measures company profitability.
> Excel is an implementation that does not care about business logic and data.
Yes and no. Consider the example of transaction data -> profitability measures. If you compute this in excel, you are hand entering data (or using data inputs if you're not a neanderthal) and you're inserting business logic into cell formulas that relate to other cell formulas. Maybe you're even using VBA. Business logic is embedded in the worksheet. This is why you end up with a lot of small companies that have "THE" accounting spreadsheet and "THE" inventory management spreadsheet. It is the extent to which they can automate.
I don't know if you've ever tried to detangle the business logic from an excel spreadsheet, but it is a pain in the ass, and it is very bug-prone. Tracking down bugs and errors in an excel spreadsheet is spiritual torture.
> Anyone can add their own business logic and data.
I guess you may have read my post from the perspective of a small company, where it is entirely feasible to store your company's data and business logic in spreadsheets. I'm coming from the perspective of a large enterprise where having excel spreadsheets on the critical path of any business process is a disaster.
> "There needs to be better separation of business logic from models so that programmers don't need to 'dig' into business processes and exceptions, and workers don't need to deal with part of their responsibility being shoved into a black box that they're not sure they totally trust or can adapt easily should a need arise."
I agree that this is unsatisfying, but I don't see programming tools fixing this, and especially not "separation"; however, I don't know what you mean by separation. I think UX* is the way to approach this, so that developers account for the actual data and decisions in a workflow. I think you convince the business with better observability.
Good luck trying to create something more powerful and versatile than text to create programs.
Text is a seemingly basic, simple and even crude format. Ancient, outdated even?
My opinion is that our perception about the simplicity of text is related to Moravec's Paradox, and is an illusion.
Reading and writing text is something we take for granted, but this is a skill we all have to painfully learn at a young age, and we're so good at it that it seems transparent.
Text is very abstract, the fact that we, humans, are trained to easily parse and decipher it is the reason why it is so powerful for us.
I don't see any replacement in the foreseeable future.
Dion Systems was working on changing the storage format of code from text to an AST that can be rendered as different forms of text as needed, and on the changes and implications this creates for developers.
I think I heard that these developers are now working for Epic on a new programming system replacing the node-based blueprint programming system (think Scratch) for UE5 (probably using an ast based programming system that can be viewed in different text/node forms).
This is true. Especially for programming where you can be asked to do anything at all that is computable. If you massively limit the scope then Excel is a fine interface. Try making blogging software in Excel though.
Do lawyers want to move away from text to something better? No. But for simple rules you can use non-wordy things, like a no-smoking sign or a locked door, to indicate a rule.
Formatting issues can be resolved by using an autoformatter. Tools can be made to visualise the code too.
I don't think so. In fact, I think this perspective is actively harmful.
A program is whatever the human beings who maintain that program operate against. I mean there are many possible representations of a program, and there are different ways to evaluate those models, but the model that most precisely expresses the AST of the logic of the program, or the call sequence of the von Neumann execution of the program, or whatever else? These models are incidental. The thing that matters is whatever humans read and interpret and mentally model. And for the moment that's source code.
Yes, I agree! In fact, the breakthrough we need is to view programming as a human cognitive operation, embrace the text, and treat manipulating text as a main component of coding. For example, I want to code
    foreach item in list_A
        do_something(item)
This text is the native code in the programmer's mind, and we should allow the programmer to just write it. Then, in a second layer, the programmer codes up the transformer that translates it into the actual programming language, adding incidental complexity such as specific syntax and internal language representations, so that the lower-level compiler can verify it, consume it, and give feedback.
The transformer part is super hard if we rely on automatic tools - that is just another version of a compiler. It is super tedious if we rely on manual human work - that is just what programmers do today. But if we view the transformer as part of programming, where the programmer employs tools to mold their program, then it makes sense. The programmer can program the tools to avoid the tedious parts while keeping full flexibility to mold things however they desire. It is still programming, but in a meta frame where text is the target.
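A deliberately tiny sketch of that "transformer" layer, in Python; the shorthand syntax and the names expand_foreach, list_A and do_something are invented for illustration, not a real tool:

    import re

    def expand_foreach(snippet):
        # Rewrite "foreach <var> in <seq>" into Python's "for <var> in <seq>:".
        return re.sub(r"foreach (\w+) in (\w+)", r"for \1 in \2:", snippet)

    shorthand = "foreach item in list_A\n    do_something(item)\n"

    print(expand_foreach(shorthand))
    # for item in list_A:
    #     do_something(item)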
Indeed, in a way the author is saying that the meaning is independent of the text; which, in literary circles, was once hotly debated. I land on the textual side of the line; the medium is as important to the meaning as the abstract concepts that might be components of it. Multiple tellings of the same story carry different meaning, sometimes subtly, when they are each one of oral, text, photo, painting, or film.
I believe the same is true of programming. A solution in C is fundamentally distinct from a solution in ladder logic, or Haskell; even if they solve the same problem.
You don't buy a drill, you buy a way to get a hole in, say, wood. That's the core product.
The core product of 'programming' is a software solution. So far, the best way to approach it is text.
But... text is an incredibly poor model! It's almost literally an accounting system of "Jake spent fifty dollars." It's text which is "actively harmful," and we see it all around us in the absence of maintainable systems. Yes, currently, it's the best way we've found; just as horse and buggy was the best current form of travel 150 years ago. But it takes very little imagination to expect we'll find better.
Programs are ideas that need to be understood, modeled, and manipulated by humans. Text is the best available way to incept ideas into the minds of humans.
So did the medical practice of bleeding out a patient. It's been the only way but being the only book in the library doesn't qualify how good the book is.
The problem with text is that it's not structured and therefore very difficult to manage. If you're going to create data for an accounting system, do you open up a Word document and start typing words like "Jimmy owes A/R five hundred?" Of course not because while you can search that, you can't query it. When was the last time you queried your code? Exactly.
Horrific.
And we can see it in our lives. When was the last time you jumped into someone else's code and could clearly see what was going on? Probably about the same time you last opened the middle of a novel and understood what was going on.
> So did the medical practice of bleeding out a patient. It's been the only way but being the only book in the library doesn't qualify how good the book is.
Bloodletting hasn't been performed for hundreds of years. We still use writing to convey ideas.
> The problem with text is that it's not structured . . .
If that were true, text books wouldn't contain any diagrams. So I think it's fairly uncontroversial to say that there are cases when text is not the best way to incept ideas. So why should we be confined to just text when writing code? Should we not be free to choose the best tool for the job?
True, but IDEs are currently severely limited by the nature of textual sources (e.g. macros are notoriously tricky to deal with). It's much easier to start from a good model for the IDE and derive the textual representation and input method from that.
Programming is an exercise in manipulating semantic objects which is why stuff like blueprints in unreal engine are so easy to use. It will just get better and text will become an obsolete vestige, regardless of whatever idea you have of this being "harmful". I really don't think a character encoding with 30 ways to control the terminal (and now extended to encode emojis) is the be all end all way of writing and editing programs.
Consider that if text had an advantage, it would be that you could cut some string in half as some sort of manipulation that saves time. What do I cut in half? Change >= to = ? There's basically no advantage of text. I don't want half of an if statement, that's what text allows me to do, it's a pointless feature that makes inputting programs harder. The basic element on the screen should be the if statement itself.
There is no advantage to a 1D language. Text is a 1D language. If you have
    if (x) {
        if (y) {
        }
    }
You are just reading it as a 2D language, but it's not. It could be
    if (x) {
        if (y) {
        }}
    }
and mean something else. (Yes I'm sure you can catch this one, but not all in general). The only way to be sure is read it from start until end, which nobody does. This is a security issue too, which I discovered when I was 12 years old reading perl scripts from milworm. By formatting text you are already conceding that 2D is better.
UE blueprints fail spectacularly when you try to do a simple for loop processing some input data. Editing all the vertices on a procedural mesh kicks you back to C++ almost immediately.
Blueprints (an army of interconnected squids) work best on parallel code like GPU shaders. For general branching logic, they become a nightmare quickly.
Image blocks, code icons or blueprints do save you from syntax errors and highlight relationships between code blocks, but they crush customizability and lose the ability to absorb and abstract complexity.
So what happens when you have a 10 deep nested if tree?
Your answer would be to not do it because it doesn't fit nicely in a 2d page. My answer would be to put it in a 1d line and not bake semantics into presentation.
But you can't pretend that the space of programs isn't infinite-dimensional when the space we work in is only 3D. Regardless of how clever you are, using a medium with a dimensionality higher than 1 for semantics will always result in expressions you can't express.
This is why we've stuck with text.
It's 1d, but we can make it 2d to help understand what is going on visually in the simple cases humans can understand intuitively. Yet it is complex enough to represent any possible object.
Just switch back and forth between box representation and graph. It's still far better than text. Text fails far faster. Have you ever used IDA? It shows a graph view of assembly based on jump flow instead of a linear assembly listing. Of course you can't magically understand all programs just with a simple algorithm to format them a certain way, but it's still immediately obvious that this is better. Most code is immediately understandable just using a C#/Java decompiler or gofmt without the programmer needing to format it. I just see variable names and formatting as spam and red herrings. Anyway you can't audit code by taking the programmer's word that he formatted it a certain way, so this whole "text for writing a poem of comprehension" idea is moot and outdated at least 30 years ago anyway.
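To make the "format from the structure, not from the author" point concrete, here is a minimal sketch using Python's standard ast module as a stand-in for the decompiler/gofmt examples above; the sample source is invented, and ast.unparse needs Python 3.9+:

    import ast

    sloppy_source = "def f(x ,y):\n return (x+ y)*2\n"   # the author's formatting is irrelevant

    tree = ast.parse(sloppy_source)   # build the structural model
    print(ast.unparse(tree))          # layout is regenerated from the structure
    # def f(x, y):
    #     return (x + y) * 2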
Semantic objects are non-dimensional. In fact they are probably best understood as simply ideas. Humans have been effectively incepting ideas in each other since the birth of language and writing. Text is, demonstrably, the best way to convey semantic ideas among human beings.
Woah there, it appears as if you are implying that text is demonstrably the best way to convey computer programs. Surely that's not what you're saying?
> A program is whatever the human beings who maintain that program operate against
This neither invalidates the statement that "Program is not a text", nor that "[it is] a model".
> And for the moment that's source code.
How is wanting to change that harmful? Source code can have flaws like ambiguity (see C++'s need for typename).
Deeming a call for improvement on the existing representation of a program harmful is a fine way to stop any progress at all.
We would still be putting in raw numbers into memory with that attitude.
On another note: I think what matters is what the computer actually does. Whatever I, as a human, read and interpret has no meaning at all if the computer decides to "read and interpret" it differently. And the only limit with representing what the computer actually does should be your imagination.
> I think what matters is what the computer actually does. Whatever I, as a human, read and interpret has no meaning at all if the computer decides to "read and interpret" it differently. And the only limit with representing what the computer actually does should be your imagination.
Well, I understand this perspective, but I judge it to be backwards. The computer should be understood as secondary to the human. Mechanical sympathy is necessary for useful software but the goal of software must not be mechanical.
That's a tenable way to view a program until you hit a compiler bug.
Mostly, you don't hit a compiler bug. When you think you've hit a compiler bug, it's not a compiler bug.
There's an easy workaround: write a compiler. It will have bugs.
There are other non-trivial ways that a program isn't its source, but in fact, systems which treat the source code as the ground truth are an important foundation. Many Smalltalk users would paint themselves into a corner by modifying an image to the point where there was no way to restart it, because its important properties were no longer captured in source code.
To add to that, this "programming is a model" view is how we got Smalltalk: a collection of interesting ideas whose peripheral ideas got pillaged into other languages, while the core idea was left to just Smalltalk.
> usually the biggest problem is an enormous amount of CRUD boilerplate that looks very similar in each project, but it's nevertheless different in important details.
Once upon a time, a lot of microprocessor code was written in assembly. Then C came along, and many of the assembly people said: "whoah, hold on, I use different calling conventions in different places for good reason; what do you mean every C function is going to use a whole stack frame even if it's never re-entered?" But now 99+% of programmers don't care about sub-kb stack frames, and probably most of them even consider C too low-level[0] to touch.
I've said elsewhere[1] that much of the ceremony of cloud sw dev reminds me of 1960's mainframes — so what is the equivalent of a "structured programming" toolchain for the cloud, that takes some higher level description of inputs, outputs, and logic[2], and then generates not-entirely-ludicrous boilerplate to mash it all together?
[0] in hindsight, C's big advantage back in the day was that it mixed much better with assembly than purer HLLs.
[2] compare the "environment division" of COBOL with Terraform, and the "data division" with Swagger. I mean, we've definitely progressed on many fronts since then, but certainly not on the axis of "lack of verbosity"[3].
[3] OTOH, Conway's law suggests that multiple-team projects will always wind up with multiple configuration in multiple places.
The interesting part is that no, function calls do not all use an entire stack frame, unless you compile your code for debugging. Compilers are smart enough to reduce the call to mere register shuffling, or even inline the entire thing.
Well, the developers saying that sometimes had a real point in 1978. A lot of developers migrated, some didn't, often for very good reasons. The 99% of developers being OK with C only happened after optimizing compilers were a thing.
Of course, none of that means the entire endeavor was worthless before that. It just means that whatever improvement you create, it won't appeal to a lot of developers, and that's OK. One can only get unanimity if the change has no drawbacks at all, in any context.
There's a lot of complaining and not a lot of suggesting going on in this article, so I don't take much from it.
I feel like the author got a job and realized that it almost exclusively involves writing glue code and tests and looking at text diffs, so how convenient to imagine a world where they didn’t have to do the boring parts of their job.
I think you're being unnecessarily harsh. I get your point about OP presenting problems, not solutions, but on the other hand, they are a programmer & they are clearly trying to automate the repetitive parts of the job. Isn't that basically the essence of programming: what good programmers should be doing is automating as much of their job as possible.
a Shannon-inspired taxonomy (note the indefinite article; ymmv) -- a toy sketch follows the list:
- glue code: given the outputs, you can reconstruct the inputs. this code translates between formats, between systems, etc.
- parsley code: given the outputs, you could reconstruct the inputs. this code "gussies up" its inputs: it takes table rows to JSON, or JSON to framework objects, PDF, etc. so in principle the original data is still there, but once the outputs are sufficiently complex the inputs become extremely awkward to reconstitute.
- crunch code: given the outputs, you have no hope of reconstructing the inputs. this code does real computation, and forgets stuff. Example: You can get a source program given only an executable, but you won't get the source that it was compiled from (nor will you, without a lot of work, even get a source you're willing to read)
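A toy illustration of the three categories, with invented data and function names:

    import json, hashlib

    # glue code: lossless translation between formats -- inputs fully recoverable
    def row_to_pair(row):                 # {"key": "a", "value": 1} -> ("a", 1)
        return (row["key"], row["value"])

    # parsley code: the inputs are still "in there", just gussied up
    def pairs_to_json(pairs):
        return json.dumps([{"key": k, "value": v} for k, v in pairs], indent=2)

    # crunch code: real computation that forgets its inputs
    def digest(pairs):
        return hashlib.sha256(repr(sorted(pairs)).encode()).hexdigest()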
Everytime someone proposes a non-textual representation of code I cringe.
There's a reason no one single alternative has triumphed over text. And many have been proposed!
Standards are hard. Proprietary tools lock you in. Text is universal (OK, restricted to plain ASCII, which is a big deal but has already happened). You can open and modify a text file with whatever editor of your choice, the simplest possible editing tool. Some weirdos (joking) even print source code.
That reading, not modifying, is the primary operation to be done on source code is an unchangeable fact of life. You won't be able to change anything without inspecting it first, and no representation is going to save you from that -- and it's undesirable to wish for this, in fact.
>Standards are hard. Proprietary tools lock you in. Text is universal (OK, restricted to plain ASCII, which is a big deal but has already happened). You can open and modify a text file with whatever editor of your choice, the simplest possible editing tool. Some weirdos (joking) even print source code.
Ideas are impossible to specify precisely, which is why we use words, with all their myriad definitions. However, if you apply enough computing power, you can autoencode your way to a compact expression of ideas as a 384 dimensional vector (like Word2Vec), that is non-standard, but exact for that particular model, with those particular weights. You can then repeat the process in multiple languages, and with parallel texts, language translation becomes something that can be automated.
The abstract syntax tree doesn't have to be something that can be shared to be a very powerful tool. It certainly doesn't have to be standardized. Just as there's no rational choices made in the 384 dimensions generated by word2vec, there doesn't have to be anything other than a binary format the tool can load and save, in order to work.
If you can ingest multiple languages, and regenerate them as output, translation between programming languages becomes almost trivial.
I'm not sure if this is what you're saying, but I think the AST is not meant for "human" consumption (so it'd be outside the scope of this debate, I think), but also, translations that result in alternative and radically different representations between what you see and what I see can be problematic.
"I don't understand your program, it's hard to follow."
"What do you mean? The yellow and black colors alternate quite nicely."
"Colors? What do you mean? I only see triangles arranged in a jarring manner, and they are hard to navigate."
"Navigate? Everything fits in a single board of yellow and black colors, no need to navigate anything."
>I'm not sure if this is what you're saying, but I think the AST is not meant for "human" consumption, but also, translations that result in alternative and radically different representations between what you see and what I see can be problematic.
File systems aren't meant for human consumption either, they're a practical solution to the need to store various bits of data on a disk in a coherent manner. We've made tools for representing the inodes and other structures in human readable form. We also represent them visually.
We can do the same for the AST. It doesn't have to be a dump of the structure, a tool just has to present a view that is close enough for our brains to impedance match, and we're off to the races.
Isn't a text file representation -- what you and I see when we open a text editor -- already an abstraction for human consumption? They are not filesystem-level storage.
You can view an AST, but this is never going to supplant a textual representation of code as a lingua franca between you and me. A tool that represents ASTs is always going to be a boundary between you and me (and also, a more complex tool than a simple text editor).
Actually, I think we have already moved away from pure text a long time ago. Think about it this way:
If "text" (so only the sequence of tokens) was the valuable representation, then we would all program without any formating (no additional spaces and linebreaks).
Now, you might say well formating is important for readability. These spaces and line breaks do matter. But, just adding them randomly where they are syntactically allowed is not helpful either. There is some very specific structure to formating: The AST.
So, in other words we usually structure short snippets of text sequences in a graphical layout by ASCII-arting the structure into it. And, even with that, many people would complain about the missing syntax highlighting, another meta property that pure text does not have ...
Anyway, I think you are right about almost all tooling being built for text, so we are kind of stuck with it ("unchangeable fact of life"). However, that does not mean that it is the ultimate representation.
E.g. there is still one industrial nation stuck using the imperial system, and that is certainly not because it is more practical than the metric system. It is just that the infrastructure, tooling and culture has internalized it.
I actually started writing a blog post along this theme but never got very far with it.
I'm not against code storage and representation as text per-se, merely that typing/editing it (text) is a not particularly efficient way of thinking about code.
A lot of people I'd characterise as having a 'words-per-minute is everything' style of thinking are seeking the most efficient way to convert thoughts into text input. From punch-tape to mechanical keyboards, the ultimate goal of programming language development is a 'faster keyboard'.
I think this focus on text input as foundational is incorrect. Most text isn't meaningful, the subset of meaningful text is severely constrained by the execution environment of the text. Outside of variable names and naming functions text input just isn't that interesting or important. For example unless you have previously declared one, invoking a function named `ttoString` isn't useful.
This is why I think coding outside some form of IDE (how far along that spectrum is up for debate, but I'd say autocomplete at minimum) is somewhat inefficient. Code is text, but it's also a layer on top of the text, the set of possibilities defined both by the text and the compiler/interpreter (or more specifically, as you mention in your comment, the AST). Treating code as text where you might rename a function with find-and-replace is, frankly, daft and backwards. Your tools should rename the concept/symbol/whatever you're targeting specifically, and they should understand your intent and meaning. By focusing monomaniacally on text we're ending up with worse tools.
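A rough sketch of "rename the symbol, not the text", using Python's ast module as a stand-in for what an IDE does; the source and the names ttoString/to_string are invented, and ast.unparse needs Python 3.9+:

    import ast

    class Rename(ast.NodeTransformer):
        def __init__(self, old, new):
            self.old, self.new = old, new
        def visit_FunctionDef(self, node):
            if node.name == self.old:          # rename the definition
                node.name = self.new
            self.generic_visit(node)
            return node
        def visit_Name(self, node):
            if node.id == self.old:            # rename every reference
                node.id = self.new
            return node

    src = "def ttoString(x):\n    return str(x)\n\nprint(ttoString(3))\n"
    print(ast.unparse(Rename("ttoString", "to_string").visit(ast.parse(src))))
    # def to_string(x):
    #     return str(x)
    # print(to_string(3))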
This also scales to text input more generally. For some reason spell-checking doesn't work for me in Firefox so I'll have no doubt made a bunch of spelling mistakes in this post and it feels a considerably worse way of typing. I don't particularly value the ability to write "ocurence" since it's incorrect. I'd rather have tools that assist me communicating efficiently in a manner understoof (sic) by others. Text is fine but tools are great.
This develops into a more general objection to languages like Python/Ruby/JS but that's a whole other flamewar.
To be clear: I don't think about text and text input in terms of efficiency. Well, that too -- almost every other form of input feels clumsier to me, but that may be because I'm more used to text. But I think typing efficiency is a red herring, a fetish of hackers who also worry about mechanical keyboards and keyboard layouts and "you cannot code unless you use three 4K monitors at a minimum". That's a fetish -- time with code is spent thinking about it, not typing words or even clicking with the mouse. Still, many here will fight to the death to claim these are very important things, even crucial; and I'll politely disagree.
I think text reigns supreme because it's more universal, less convoluted, and has zero vendor lock-in. If you want, the tools to read and edit source code come with your operating system!
There have been tons of musings, thoughts and even projects to replace text representation of code. Where is their widespread success?
I think text is the ultimate "worse is better", in the positive sense of that concept.
Thanks for the response. I think we probably ultimately agree. I was taking aim at that second group you outline with my rant.
I had always assumed punch-tape was encoding some other representation of code than text until I did the research for my blog post. I think text is here to stay and I don't have any real problems with that. Where my tools offer designers or visual editors for code/UIs I still prefer to use text.
As an on-ramp to programming nothing quite beat opening up an HTML file in Notepad (or text editor of choice), making some changes and seeing that your header now flashed and was red.
I do think however that much like digging on a beach with a plastic bucket and spade is fun when you are young and carefree; when you're being paid to dig holes you want the biggest shiniest JCB you can get.
I am mainly reacting to the second group who curl their lips with disdain at the idea of doing anything other than coding in Nano or whatever pure text environment. They seem to see programming as a priestly sect dedicated to text like Lindisfarne's monks and people who don't know their Cherry Reds from Browns as fake/noob/impostor programmers.
I think what you call formatting is an essential part of textual representation (alas! It's also the most contentious part). After all, formatting is an essential part of how we write and read novels, articles, etc.
In this sense, I think text with formatting is pure text... for human consumption.
I am arguing it is the ultimate representation: for historical reasons, of course, but also because there is less friction with tooling, less vendor lock in, fewer standards that must be developed and adopted universally, etc.
I think there's always going to be friction for old programmers used to their ways, but why then aren't new programmers more successful with their piñata alternatives to text? ;)
Alternatives have been proposed and they haven't caught on so far (for general usage, I'm not talking about specialized applications where they make sense).
Which one, ASCII, code page 9000, or UTF-8? Are you saying just restrict to ASCII? Then we need custom solutions every time someone wants to talk a different language.
> lock in
I am locked into using retarded insecure terminal emulators because all my tools depend on them to render what you people call "text", which is supposedly "simple" or "universal". Even vim breaks from unicode already. The primitive language of the OS should not be some poorly defined idea of "text", but algebraic data types.
> reading important
Which is better done by graphics than text. Just the fact that, with a structural editor, an if statement spans the amount it appears to span, instead of you having to trust someone to format it correctly, is already enough reason to ditch text.
Please stay polite. I'm not a boomer and my comment wasn't a rant.
> Yes, it's called the status quo.
I didn't mention the "status quo", please don't misquote me in order to make a clever retort.
> Which one, ASCII [...]
I preemptively mentioned which one, which defuses the rest of your sentence. More importantly, you know which one I meant. And I know you know, so please, let's not argue this.
> I am locked into using retarded insecure terminal emulators [...] Even vim breaks from unicode already
None of this has much to do with what I wrote. Again, I must ask you to stay polite and on topic. If you dislike your tools, use better ones.
> The primitive language of the OS should not be some poorly defined idea of "text", but algebraic data types.
This discussion is about representations meant to be universally understandable by programmers, i.e. humans, not by the OS.
> [reading] is better done by graphics than text
Reading is mostly done using a special kind of "graphics" called "text". People with disabilities or who are busy doing something else, like driving a car, choose alternatives; but let's be honest, most people read text.
Until these ideas go mainstream, treesitter has resulted in some interesting improvements to e.g. Neovim. I use it extensively for AST-based selection/navigation, and along with conceal I can hide/transform a lot of noise characters in text source code.
FOAM is a modelling framework that generates cross-language boilerplate for you, but it takes a much broader view of what constitutes boilerplate than most systems. Typically, it can generate between 95-98% of a working cross-language cross-tier system.
FOAM helps you create features for modelled data. Features include things like Java/JavaScript/Swift classes to hold your modelled data, code to marshal to/from JSON/XML/CSV/etc., various GUI views, and support for storing your data in various databases or file formats. However, FOAM models are themselves modelled, meaning they're afforded all of the above benefits as well. This lets you apply the MVC technique of having multiple views work against the same underlying data model concurrently (say a grid and a pie chart in a spreadsheet), so that you can choose the best view or views for your current need. When treated this way, your code is no longer text (but it can be, if that's one of your views), and you can easily view and store it in many different ways and more easily manipulate it programmatically.
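As a rough illustration of model-driven generation (this is not FOAM's actual API; the model, field names, and generated methods are invented), a sketch in Python:

    MODEL = {"name": "User", "fields": [("id", "int"), ("email", "str")]}

    def generate_class(model):
        # Emit a class definition plus a JSON marshalling method from the model.
        lines = [f"class {model['name']}:"]
        args = ", ".join(f"{f}: {t}" for f, t in model["fields"])
        lines.append(f"    def __init__(self, {args}):")
        lines += [f"        self.{f} = {f}" for f, _ in model["fields"]]
        lines.append("    def to_json(self):")
        lines.append("        import json")
        lines.append("        return json.dumps(self.__dict__)")
        return "\n".join(lines)

    exec(generate_class(MODEL))                  # defines class User at runtime
    print(User(1, "a@example.com").to_json())    # {"id": 1, "email": "a@example.com"}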
Programming is encoding: some reduction of reality into a model suited to algorithms doing what we need.
The reduction to model and scope to algorithms are necessary and unavoidable. All of the criticisms made are objections to the selected model or limited set of algorithms.
That includes text as the standard "model". By contrast, IBM VisualAge in the 1990's had a true AST-based model of the code, and showed text as views. It did (some) amazing things, but people hated it because there was no way to make it do other things they wanted.
I would add the disparity between the granularity of commits and comments. Clearly, I should be able to describe specific changes in a file and provide a high-level account of a long-running task without collapsing everything into one history.
Proposal: A single AST model, configurable for multiple languages, with JVM & LLVM back-ends, and automagic support for IDE's via LSP and program transforms like language conversion, slicing, and aspects.
The killer app there would be transforms for parallelism to the processor-specific vector instructions. That would capture more of the value of the oncoming onslaught of processor architectures.
One of the brilliant things about Smalltalk was its change-management. You didn't just store programs, you stored "changes" to existing programs which assumes a context of existing 'classes" into which such changes can be applied.
Contrast that with current VCSes like git, where you don't store changes to the objects you create or have created earlier, but just changes to which files exist in your (git-based virtual) file-system.
Git only knows about things like "files" and "folders", not about objects the user creates. Therefore you can not view a git-repo as a set of changes to your "program", it is only a set of changes to files and folders.
> Currently the design of the Git index (staging area) only permits files to be listed, and nobody competent enough to make the change to allow empty directories has cared enough about this situation to remedy it.
> Directories are added automatically when adding files inside them. That is, directories never have to be added to the repository, and are not tracked on their own.
> You can say "git add <dir>" and it will add the files in there.
> If you really need a directory to exist in checkouts you should create a file in it. .gitignore works well for this purpose (there is also a tool MarkEmptyDirs using the .NET framework which allows you to automate this task); you can leave it empty or fill in the names of files you do not expect to show up in the directory.
- the concept of the network should become tightly integrated - I should not have to jump through hoops to send network requests and evaluate their responses
- handling errors should be extremely standardised, in such a way that the programming language defines how it is done and all programming languages do it the same way
- a programming language like Rust - inherently safe for sophisticated multithreaded low-level coding, guaranteed to be memory safe and guaranteed not to have race conditions, but without the complexity of Rust
- a programming language dramatically better at team coding - the idea of microservices is a terrible way of solving this problem
- an internet that is type safe from top to bottom, where all the type safety is the same typing system or at least compatible - all data and applications should talk through a ubiquitous type-safe way of communicating
- a much more obvious way, perhaps visual, of connecting data/functions/interfaces together - the typing systems I describe above should make it possible to just click code together
The author wants to do away with the program as text source code but doesn't suggest another way to represent a program.
You're going to have to perceive and manipulate your program in some way. Maybe it will be great to get rid of code, but then what will you have instead?
I think you misinterpreted that. Their gripe wasn't with a textual representation of code, but the fact that we interact with it as text, not in any form. And that fundamentally limits the power of the tools we use to interact with it, according to the OP.
Of course everything can have a textual representation eventually.
Your final question, "what will you have instead?", really nails the point of the article, as that's basically what the author is asking (to be invented).
> The author wants to do away with the program as text source code but doesn't suggest another way to represent a program.
Quoting the article:
> I imagine this model as a relational database - you have tables like `structs`, `fields`, `functions`, `arguments` and relationships between them.
I really want something like that, pretty specifically. We still use text to edit data in an SQL database, but you can also use queries to view or manipulate it. That's the idea.
Well, SQL queries are still written in code as text.
But I think you must be referring to the visualizers that many db tools have that, e.g., provide a grid view of a table or view, perhaps with QBE headers (or other means of filtering), etc?
Those kinds of things are nice. They make certain tasks easy to do or certain kinds of questions easy to answer. But they are all limited too -- each is a simplification of the overall problem space (which is why you still need SQL or if not that, some other equally complex query language variant, not to mention a more full-featured language on top of that to do the things SQL doesn't do).
For programming, IDEs are full of this stuff already. Code completion, all kinds of refactoring tools, symbolic searching/browsing. So I'm still not sure what the vision is... I can see that more sophisticated refactoring tools is part of it, but there seems to be a broader idea, but I can't tell what it is.
Programming is more complicated than it needs to be, but it and its tools cannot get simpler than the underlying problems they exist to solve. If we all were working on the same kinds of problems with a bounded, known set of variables, that could all use the same runtime, be hosted in the same kinds of environments, operate on the same kind of hardware, etc., then we could have simple, powerful tools for that (though probably in that case someone would simply solve the general problem, and there wouldn't be any more programmers needed, just sales people).
> Well, SQL queries are still written in code as text.
> But I think you must be referring to the visualizers that many db tools have
No no, I really mean SQL text. Writing SQL is fine, and writing source code too - I don't have a problem with using text to manipulate the program. The problem is when the program is defined by the source text.
Let's say a program would just be a single-file SQLite database and I'd use a text editor and some query console to edit it - that's a rough idea. I can write a function by hand like I do now, but the function is not saved as literal text, it's saved as a few rows in the database. Refactoring is then just running some SQL against that.
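A minimal sketch of that rough idea, using Python's built-in sqlite3; the schema and names are invented, and a real system would need far more (bodies, types, scoping). The point is only that a rename becomes a data update rather than a text search-and-replace:

    import sqlite3

    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE functions (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE calls     (caller INTEGER, callee INTEGER);
        INSERT INTO functions VALUES (1, 'fetch_user'), (2, 'render_page');
        INSERT INTO calls VALUES (2, 1);
    """)

    # The "rename refactor": one update; callers are unaffected because they
    # reference the id, not the spelling.
    db.execute("UPDATE functions SET name = ? WHERE id = ?", ("load_user", 1))

    # A textual view is generated on demand from the model.
    rows = db.execute("""
        SELECT c.name, f.name FROM calls
        JOIN functions c ON c.id = calls.caller
        JOIN functions f ON f.id = calls.callee
    """)
    for caller, callee in rows:
        print(f"{caller}() calls {callee}()")   # render_page() calls load_user()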
IDEs already maintain such a database in memory but they are limited by the textual sources quite a lot (especially when metaprogramming is involved). If you drop textual sources and define the program by the database directly, many IDE features become much simpler to do and accessible to the programmer directly.
I think a really good idea of the direction is the Dion demo somebody posted here. They don't have any scriptability built in, but the idea is very similar.
https://media.handmade-seattle.com/dion-systems/
There are some more advanced refactoring tools now available. These tools enable you to write code to detect bad code patterns and even automatically fix them. You can use them to write one-off transformations of code too. Rust has Dylint [1] and C# has Roslyn Analyzers [2]. Facebook has tooling [3] that helps writing CodeMods, enabling authors to generate changes for thousands of files at a time.
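This is not Dylint or Roslyn, but the same idea can be sketched generically with Python's ast module: walk the tree, flag a pattern, and you are one small step away from a codemod that rewrites it. The sample source is invented:

    import ast

    SOURCE = "def f(x):\n    return x == None\n"

    class NoneComparison(ast.NodeVisitor):
        def visit_Compare(self, node):
            # Flag `== None`, which should usually be `is None`.
            if any(isinstance(op, ast.Eq) for op in node.ops) and \
               any(isinstance(c, ast.Constant) and c.value is None
                   for c in node.comparators):
                print(f"line {node.lineno}: use 'is None' instead of '== None'")
            self.generic_visit(node)

    NoneComparison().visit(ast.parse(SOURCE))
    # line 2: use 'is None' instead of '== None'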
The thing I really would like to see is a smarter CI system. Caching of build outputs, so you don't have to rebuild the world from scratch every time. Distributed execution of tests and compilation, so you are not bottle-necked by one machine. Something that keeps track of which tests are flaky and which are broken on master, so you don't have to diagnose spurious build failures. Something that only runs the test that transitively depend on the code you change. Automatic bisecting of errors to the offending commit.
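A toy sketch of the "cache build outputs by input hash" piece, in Python; the build command, paths, and cache layout are invented, and real systems (such as Bazel, mentioned in a reply below) do this per action, remotely, and with far more care:

    import hashlib, pathlib, subprocess

    def input_digest(paths):
        # Content-address the inputs: same bytes in, same key out.
        h = hashlib.sha256()
        for p in sorted(paths):
            h.update(pathlib.Path(p).read_bytes())
        return h.hexdigest()

    def cached_build(inputs, output, cache_dir="build-cache"):
        cache = pathlib.Path(cache_dir) / input_digest(inputs)
        if cache.exists():                              # inputs unchanged: reuse
            pathlib.Path(output).write_bytes(cache.read_bytes())
            return "cache hit"
        subprocess.run(["make", output], check=True)    # hypothetical build step
        cache.parent.mkdir(parents=True, exist_ok=True)
        cache.write_bytes(pathlib.Path(output).read_bytes())
        return "built and cached"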
> The thing I really would like to see is a smarter CI system. Caching of build outputs, so you don't have to rebuild the world from scratch every time. Distributed execution of tests and compilation, so you are not bottle-necked by one machine.
This is already achievable nowadays using Bazel (https://bazel.build) as a build system. It uses a gRPC based protocol for offloading/caching the actual build on a build cluster (https://github.com/bazelbuild/remote-apis). I am the author of one of the Open Source build cluster implementations (Buildbarn).
> To be more specific - for most web projects I work on, I have a very similar yaml file for CI, Dockerfile,
Very similar, but not actually the same - of course there's always a tradeoff between control and convenience. Maybe you don't like where kubernetes, $CI_TOOL_OF_CHOICE, etc puts that tradeoff, but that's why there's a zillion tools and languages out there, because someone wanted a slightly different tradeoff
> Changing the program is the most common thing we need to do in our work, it's the reason our job even exist. And yet, most of programmer's time is spent reading or planning how to change the code.
I think I fundamentally disagree with the idea that "reading or planning" are second-class activities here. I think that one of the highest purposes of a programming language is to communicate to other humans, unambiguously, what the program is intended to do.
I think most languages have a sweet spot for "good at communicating for $TYPE_OF_PROBLEM" - it'd be a nightmare to read the assembly code for, say, a AAA videogame and try to figure out whether it's an FPS or an RPG, but if the intent of a program is to do a small thing in a super-architecture-specific way, assembly is probably a great choice - exactly because some other human will see _exactly_ what you mean.
If I have to work in a big complex codebase, it's great to use a very structured language like java/rust/etc where all the many, many pieces are (or at least can be) clearly named. I'll have no idea which CPU registers are twiddled when, but I don't care, and neither did the author of that code (presumably why they chose a high-level language).
On the extreme end - a tool like excel which is AWESOME for getting computations done and presenting the results, but is (IMO) not the best for communicating _how_ to do those computations. Do every single one of the cells in this column have the same formula? If not why? Is that intentional?
I think programming languages are awesome exactly because of the extreme, explicit precision and repeatability that they allow us to communicate with, but certainly not all problems [that are currently solved with programming languages] require that kind of rigor.
> It feels like there's an unexplored dimension of abstraction here. Something where generics, interfaces and higher order functions are too static and low level primitives. Something where it's really difficult to pinpoint what exactly is the repeated pattern here and how to exploit it. Instead of generic framework that can do everything, I'd like an efficient way do something specific.
This bit reminds me a lot of a concept Aaron Hsu mentions in his fantastic talk on anti-patterns in the Iversonian languages[0], a paradigm he terms "Idioms over Libraries". The idea being that because of the expressive power of APL's primitives programmers often opt for utilizing easily recalled snippets of symbols over the importing of library code to implement common functionality. This facet of array-lang culture can be seen with sites like APL Cart[1], which serve as indexes of these frequently used snippets.
I think a lot of people miss that this is one of the major advantages induced by the terseness of these languages (it's not just about code-golfing for the implicit joy of laconicism). By allowing for the direct expression of algorithms in the same context they are being used, APL and its kin enable one to tweak and optimize these idioms on a case-by-case basis: precisely the kind of specificity the author of this article is asking for.
I would suggest the author look at functional languages, like Haskell. These are based on the principles of function composition and referential transparency.
The move from testing to static analysis is happening there, in the expressiveness of the type system, and its inferences. Functional languages are on the forefront of experimentation with new type systems.
> the text is just a tool to manipulate some abstract model of the program.
yup.
> When you think about it this way, it becomes clear that using a textual source code is really inefficient way to manipulate this model.
Hard nope. All the attempts at a different approach have been failures so far. Text is the best thing we have because it can be analyzed formally and computed with by computers. This is the result of work that has been going on for at least 200 years in philosophy, logic, and math.
What we really need is to somehow bridge the gap between programming and math as an industry standard, not an academic exercise. But I highly doubt that eliminating text as source code is it.
To add to this, I think text still needs to be the canonical representation of programs. But we're still a ways away from tools to express the ways we want to modify and generate code. I think the late Pieter Hintjens had the right of it: we should generate code more often (I know it's polemic here). Not behind some magic curtains like the overly complex (for me) and difficult-to-inspect C++ templates, but actual code generators.
One example: RecordFlux [0] generates, from a high-level DSL, SPARK code that can be proved to have Absence of Run-Time Errors (AoRTE). It has a model checker for high-level constraints but lays out serialisation/deserialisation code to access fields. They've recently been working on protocol/session-level modeling and it's getting very interesting.
I have a similar tool that I use to generate AORTE serdes code, but once you have that, you can go on to generate fuzzers, proxies to other techs.
We're probably also missing a way to target interesting intermediate representations of programs (either Why3 VCs, cbmc GOTO, but maybe also to lean 4 or whatever temporal logic tool of the day, or other automated/assisted proof or generation environments).
Writing those is a huge undertaking every time, and a lot of duplicated, potentially buggy effort. I look at the amazing work that's been done (and is still being done) to bring up libadalang, a parsing and high-level semantic query tool that changed my day-to-day coding paradigm. One of the missing pieces of the puzzle, to me, is the 'generic parser and semantic specification' that would generate an interpreter and all those proxies for you.
To be clear, I don't think we should get rid of text completely, or use some drag&drop UI or some visual model. It's just the program should be defined by the semantic model first and then viewed/edited however it is practical (often by text again).
And we already do this in many places - many common code changes we do are already automated by IDE (which maintains the model in memory), so we don't edit the text directly and we don't even think about it that way, we think about how we want to transform the model (ie. extract a function), not about how the text moves between files.
> Text so far is the best thing we have because text can be analyzed formally and computed with by computers.
This doesn't make too much sense to me. Using the approach I talk about would actually make that much easier, because tools wouldn't need to parse the text; they could work directly on the semantic representation (i.e. it's much easier to query a database than to extract the information from textual sources).
Software reuse-in-the-small is a solved problem, but reuse-in-the-large seems unsolvable. That’s been the state of things for decades.
Testing is hard because the requirement for correctness is often absolute and inflexible. When software is allowed to be a little incorrect, then yeah testing gets much easier.
Yep. And your "reuse-in-the-large" I'd argue is also a side effect of reality itself. Take any market sector or industry, not just software - I suspect the "easyness" of directly reusing and copying any complex system is vanishingly small. If that were not the case, we would have infinite choices and brands of any complex product ever made!
The whole "program is a model" part makes me think of Smalltalk's image-based system. I never really got used to programming this way myself, but I do think an image-based environment might check some of the author's boxes. With the right tooling, a Lisp might even be a decent choice.
For those wishing to experience what a current image-based development environment feels like, I recommend Pharo. It is probably the most advanced open-source development environment in the Smalltalk tradition available today.
Main site: https://pharo.org/
MOOC: https://mooc.pharo.org/
I tried to go through the Pharo mooc and learn a bit, but dealing with bugs and jank in a complex and unfamiliar UI made me drop the idea pretty quickly. The object orientation aspect of it was interesting, but I just can't get behind a language which is tied so deeply to the use of a poorly maintained UI. At a first glance I'm liking the approach offered by Unison, where you work with an image but everything you need can be done from the terminal, which I think is much easier to maintain and make portable. Creating a nice UI for it can be treated as a separate problem.
These 2 sections highlight exactly the problems I've been trying to solve with Molecule.dev. I agree with the author that software development has somewhat stagnated, and I believe something like Molecule.dev is the future. It isn't perfect and there's still a lot of work to be done, but I'm certain it's headed in the right direction. The codebase is currently in the process of being repackaged (it'll be an MIT licensed monorepo) so that developers can more easily play with it, and so that it can be more easily integrated into existing systems, as starter apps are not a frequent enough problem to build a scalable business from. This repackaging is taking longer than normal because all the investors I've spoken to apparently don't see the value in it (yet), so I've been looking for contract work to stay afloat. (Know anyone?)
> Testing and Correctness
> I want simpler testing
I built something else in early 2020 to address this exact problem as well, TestFront.io. I haven't touched it in a while (so don't sign up) but I may return to it eventually. It's open source. I've tried many testing tools/frameworks and none of them quite do what TestFront.io does. There is a pretty similar tool which someone turned into a very successful business (actually can't remember the name of it now), so there's definitely some value in it, but from what I saw when trying it, it's still not quite up to par with TestFront's granularity and ease-of-use. I'd like to return to TestFront some day, but for now there are bigger fish to fry.
>Also notice how writing the refactoring as a query over the model is actually not that difficult. I can imagine how I would write an SQL query like that in a few lines. On the other hand, writing an automated refactoring system in IntelliJ or VSCode sounds like a lifetime problem, and it's kinda unsolvable.
No, it really isn't unsolvable. I saw the demo for DION[1] after following a comment[2] here, and it was close enough to some ideas that I also thought were unsolvable, until I saw their demo. They made a brilliant choice, use the model to regenerate source code. This transforms the problem into something much easier to work on.
I'm an old Pascal programmer, so I did what I always do, fired up Lazarus (the Free Pascal based IDE) and started writing code.[3] It doesn't do any actual work, except for showing off the main concept. Abstract syntax tree on the left generates almost usable Pascal code on the right. Selecting code then selects the right part of the tree that generated it. I've only got a few hours into it, and it's already doing things I would have thought impossible, trivially. There's no reason you, dear reader, couldn't do the same with your own programming language.
We're at one of those points in history where 5 people invent the same thing. The future is going to be a lot better once we can directly manipulate Abstract Syntax Trees instead of source code.
I was talking about some specific cases - e.g. rename is unsolvable for some languages with macros (like Rust, in the linked article). Of course, DION solves that by coming from the opposite direction, which I think is the key (their demo also inspired me a lot on this point).
I think that a programming breakthrough that we need is better (graphical) formalisms for describing how software systems are intended to work. We can do better than drawing boxes and arrows with no clearly defined semantics. I don't mean formal models of how a program actually does work (i.e. I'm not talking about Petri Nets or CSP, although I'm not a computer scientist and couldn't tell you much about those things): what I'm talking about is a better language, or collection of visual formalisms, for referring to the components that we really _mean_ when we draw software designs using boxes and arrows. So something like UML, but that actually helps describe application/system function, rather than mind-numbingly documenting an OO codebase. There wouldn't necessarily be any connection between this and code (although you could perhaps imagine generating high-level automated tests from a description of how the system should work) -- the aim would simply be to have a more sophisticated way of communicating about how we intend or hope that our designs will work.
There's a hierarchical modeling paradigm/tools called C4 that (while being boxes and lines) helps with the zoom-in/zoom-out nature of understanding systems: https://c4model.com/
> I imagine this model as a relational database - you have tables like structs, fields, functions, arguments and relationships between them. When you think about it this way, it becomes clear that using a textual source code is really inefficient way to manipulate this model. It's very error-prone and requires tons of additional processing.
See Kent Beck shaking with rage at the passing of Smalltalk.
> We spend endless amounts of time bikeshedding the right syntax, indentation level, tabs vs spaces, or where to put code in the structure of files, but this all feels just pointless - these are all properties of text, but the text is just a tool to manipulate some abstract model of the program.
This is why I always use black for Python, cargo fmt, prettier, etc. for my code.
So, so much is misguided about this article. Yet, the opinions expressed are actually quite common, and I think represent some widespread and fundamental misconceptions about software development, so they're worth addressing. Apparently this comment was too long for HN, so I'll address the points individually in child comments to this one.
> Most code I write doesn't do anything interesting, it's either some boilerplate or glue for connecting subsystems together.
That's your fault. There's not some line where "interesting code" lives on one side and "boring glue code" lives on the other. I agree that you don't want boring glue code, but you also don't want highly interesting code either! Every block of code should be "mildly interesting": doing about one interesting thing. Your job is to evenly spread the intrinsic complexity of the problem over code blocks to make this happen. If you're writing boring glue code that doesn't solve part of the problem interspersed with highly interesting code that handles 15 different cases, your code is lumpy!
So many people talk about spending too long on boilerplate glue code, but you're stitching this part and that part of the code together for a reason -- for a reason that can be expressed in domain logic and user stories -- right? (If not, you truly are wasting your time, but that's only because that whole code block doesn't deserve to exist.) So express the reason for stitching that code together in the glue code, and now it's mildly interesting, just like all of your code should be.
It blows my mind that the same people who complain about over-abstraction also complain about spending too long in the weeds of boilerplate code. You're so used to bad abstractions that you've given up, and then you whine "why can't I do this at a higher level!?" Because you've stopped trying, that's why! Learn to write better abstractions, and keep tweaking them when they become awkward. Finding a good abstraction is really hard, but absolutely worth it when it happens. And there are meta strategies you can learn to make it easier.
Look, of course there's lots of unexplored territory in software engineering, and we absolutely should continue to strive for better programming languages and abstractions. And we are! From reading this article, though, this author is looking in entirely the wrong direction for such improvements. It's not going to be some magic visual model that does the hard thinking for us.
One thing we should not expect is that new developments will be easy for us to learn, because we are already steeped in the current way of doing things. Supposedly, lexical scoping (what we're all familiar with) was extremely difficult to understand by early waves of programmers that were used to dynamic scoping (an insane way of doing it). They could have easily complained that this was just some new over-complicated abstraction and language construction that we don't need, but once you get over that hurdle and understand it, life actually becomes much simpler. New breakthroughs will hopefully be simple, but probably not very easy for us [1].
Many of this author's complaints about the current state of programming sound like they just haven't really achieved fluency in their programming language yet, and that they've been burned out on bad abstractions and have stopped trying to create (or just can't recognize) good ones. That's OK, this is all really hard to do! But it doesn't mean that everyone else is doing it wrong.
A fair point, and I can't remember what my point was going to be (perhaps I meant to delete this sentence). But on the other hand, you never talk about the alternative to plain text, and I think every alternative I've ever heard of has been "visual" in some sense. In fact I'm not sure what other options there are other than "visual" and "plain text".
Perhaps what you were saying is that each developer should be able to choose whether they're working visually or in plain-text, with the underlying model being neither (binary? XML?). If you chose to work in LISP for the day, the computer would transpile the underlying model to LISP, and then transpile it back when you're done? I think this is the "magic" part, where some, what, AI does this for you? We're so far away from that being effective, and the benefits are just not there when you're truly fluent in the programming language. Every single instance I've ever seen of "Each developer can pick how it appears on their machine!" has made communication and synchronization between developers worse, not better.
> We have some promising ideas. Some examples include strong type system, fuzzing, snapshot tests and sanitizers.
Okay, look. There's a lot of different times when a software error can be detected and fixed -- a whole spectrum. Let's be real generous about the categories of things "software error" can include here, including inefficiencies or building something the user doesn't actually want. The lower on this list we find errors, the more costly it is to fix them. Our goal is to push up where we detect errors as high as possible in the spectrum. It looks something like this (details and order may vary):
- Initial ideation ["Facebook for dogs? That idea sucks."]
- Requirements analysis ["That's not actually how we calculate that metric."]
- Conceptual system design ["Wait, these parts don't fit together like that."]
- Low-level design ["Crap, a loop won't work here, I need a different flow."]
- Brain-to-fingers typing ["Whoops I almost typed 'elesif'"]
- Immediately post-typing [Red squiggly line under 'elesif']
- Re-reading ["Wait that should be i-1, not i"]
- Compile time ["SYNTAX ERROR: Unknown identifier 'elesif'"]
- Unit test time ["Assertion failed: expected 7, got 8"]
- Code review time ["You didn't handle the case where N is negative"]
- Merge / integration test time ["Oh crap, David's commit broke my commit! How the hell do I merge this?"]
- Internal manual testing time ["You need to tighten up the graphics on level 3"]
- Production ["The users say the application keeps crashing when they run this report! Fix it!"]
- Years later/never ["Turns out the last 10 years of analysis that we used to make business decisions were wrong."]
Testing is actually quite low on the list, and there are lots of ways to move more errors higher up that list, such as strong typing, good abstractions, and really clean code. Testing is also tightly coupled to the implementation itself, so a conceptual error in the code can also exist in the test's assertions. I'd much rather write code that is obviously correct (errors caught at re-read or compile time) than rely on tests; only when you can't do that do you fall back to that lower-level option.
> Some projects try to package this into a single framework, but this approach doesn't always work either, because it necessarily introduces new generic mechanisms and complexity
If the author doesn't want to have to learn "new generic mechanisms and complexity", then they'll certainly hate any huge breakthroughs in programming, because of course such a thing will come with lots of new generic mechanisms and complexity.
The reason single frameworks always fail is because they are single. The combination of technologies and abstractions is as important as (or more important than) the individual technologies themselves, and those are decisions you need to be able to make and change independently. Use small libraries that are focused on doing one thing well, are mildly interesting, and can be tweaked or swapped out without impacting everything else in your code. Don't use "single frameworks".
> A good hint is the recent GitHub Copilot development.
Absolutely not! This is entirely the wrong direction! You still have bloated code that you need to test and maintain, but instead of it being your code, it's the code of some (literally) brainless intern who has an unbounded potential for stupid errors?
All "scaffolded code" is doing what the compiler (or some abstraction) should be doing. Writing code the first time is the easiest part. Why do you want help with the easiest part, at the cost of making the hard part harder?
> And yet, most of programmer's time is spent reading or planning how to change the code.
Is the author really just another person who thinks that programmer productivity is based on the number of times they hit their keyboard? This is so off the wall that it threw me for a loop. You just complained about writing too much boilerplate, and you want people to think less and type more? This is insanity! The two most productive activities, by far, a software developer can do are:
- Discuss the problem space with other programmers and experts with a whiteboard nearby
- Stare out the window with the problem on your mind
> The question is - if most of the work we want to do is about changing existing code, then why is the system not optimized for change by default? The code we write is optimized for reading the source text and its storage.
But it is optimized for change! That's exactly the reason line-by-line textual coding has been so hard to disrupt! Many, many non-text or augmented-text options for code have been tried, and they all fail exactly because they're not optimized for change! It's a pain in the ass to go and change some visual, graph-based program that someone else wrote, and an even bigger pain to keep it in source control. Text is the easiest thing in the world to change.
And I'll repeat the same refrain whenever the idea of non-text-based programming comes up: you're underestimating how long code spends in a non-functional state (cannot be unambiguously parsed into an AST), and how important that intermediate state is. Text handles that state beautifully, and every other option fails at it hard.
> Or even better, I want to specify a goal like "I want this function to not take this parameter" and let the system figure out how to transform the program to achieve this goal.
Or, even better, I want to specify a goal like "Make me a website that does cool shit" and let the system figure out how to achieve this goal. The author's stated goal is extremely ambiguous, and it's exactly our job to resolve that type of ambiguity. What exactly should the system do? If you want someone else to figure that out for you, then you just don't really want to be a programmer.
>I can imagine a system that can combine small transforms into larger ones and use some AI magic to figure out how to compose them to achieve the specified goal.
So this whole article really is just "I'm sick of programming and I want an AI to do it for me". It's OK to be sick of your profession, but that doesn't mean other people are doing it wrong.
> If we instead focused on building the right model, we could better optimize that model for editing and the text could be just a view of that model. If the text is just a view, it doesn't matter how it's written. Let everybody customize it the way they want. I don't care if you put opening brace on a new line, I don't even want to care.
But we do just focus on building the right model. The text is not the hard part -- not by a long shot! That's why discussion and deep thought are where the real programming happens; translating it into working code is a very small fraction of the whole process. "The text is just a view" just kicks the can down the road -- a view into what? Whatever that underlying thing is (some awful clunky graph-based model?), that's now your language. Of course you have a language somewhere.
Again I get the impression the author is just sick of being a programmer.
Agree with most of the points in the article. I've tried to attack this partly by generating code and leaving all other maintenance to the developer, without any framework constraints: https://github.com/sashabaranov/pike
I've often thought about similar ideas: the program is constructed out of nested components, almost like LEGOs. When we play with LEGOs we don't need a "programming language". Or do we?
But what is a "language"? It is a set of "legos" of different types like 'verbs', 'nouns', 'adjectives' which can be combined in many different ways, but only according to the lego-rules of syntax. Some lego-pieces can not be combined together.
The ways LEGOs can be combined is a "syntax definition" the designers at the LEGO headquarters came up with. LEGOS are a language.
So I don't think it is possible to program computers without a programming language. Even configuration needs some kind of language to give you the flexibility you need.
Lego created a visual programming language for their Mindstorms system that is exactly this. It's very similar to Scratch, and while it lets you get things done and you can build some pretty amazing things with tools like that, the... second-order tooling... just isn't there. We have a huge amount of tools around that can deal with code as text, but there are no analogous tools for these visual programming environments.
I've dug into some of these tools in the past and wrote some projects using things in the 'creative coding' space like Quartz Composer, TouchDesigner, vvvv... at a certain point the visual clutter gets you and, at least in my experience, it was just faster to rewrite in a creative coding framework like openFrameworks/C++.
I've thought about this issue so very many times, and I feel that until someone creates a system capable of understanding the underlying purpose of the code, we will not be able to automate this stuff away. I think the issue is that all of the things the author wants a breakthrough for is exactly what makes programming hard. We don't have systems intelligent enough to reason about a contract between two systems so as to create a general-purpose solution for things like glue code. The subtle differences that are there and that require us to do this work over and over again with small variations are precisely why programming requires intelligence.
Part of the "structured programming" revolution was that the compilers said "any color you want, as long as it's black": they were general-purpose glue code generators by declaring everything to use the standard contract.
(later, optimizing, compilers would spend a lot of effort to reason about different contracts, and in recent times, "undefined behavior" would come to surprise people who were used to the rather laxer approach to contracts in earlier days, but these are epicycles.)
There will always be work doing things over and over again with small variations, but I for one am glad to no longer be making variations on decisions like "will this parameter be coming in a register (which?) or on the stack (and if so, with what alignment)?"
Charles Simonyi's Intentional Software attempted systems "capable of understanding the underlying purpose", targeted at both programmers and business people. It was (hidden) parameterization all the way down. They invested several engineer-decades and got nowhere. Microsoft bought them out for their patents.
We can't even automate testing of code (no matter how much "test automation" you have, you have manual testers too). That's orders of magnitude easier than automating the creation of the code in the first place.
In the old days we had compilers linkers and loaders. They all did the wrong thing. Compilers did text translation to semantics and generated code. The linker combined code from several sources and produced a loadable image. The loader checked for compatibility and relocated an instance to an address.
Instead, the compiler should have stopped at semantics. The linker should combine semantic modules into an abstract execution model. The loader should generate code for the target machine at the target offset.
We have some of that in different paradigms today. But still we transmit far too much of lame abstraction in large wads, with all the important semantics erased.
Or, one step up: The linker should combine semantic modules into an abstract set of cloud-deployable images. The loader should instantiate those images in the target cloud, with all their edges configured so the graph nodes connect properly.
(physically connecting logical graph nodes having been exactly why the linker/loader were messing with offset relocation on single machines)
This to me sounds like someone whose job it is to dig holes and fill them back in, asking for an improved shovel, or exercises for his lower back.
There is nothing wrong with improving shovels, exercises are great.
But why are we digging holes? :)
What does using a modern subset of C++ prevent us from achieving, that we really want? Are programming tools the bottleneck of modern society?
Do we really need 100s of computer games released each year, 1000s of funded start-ups trying to become unicorns, a 'metaverse'? Do we need all that? Or would we rather have UBI, land value tax and enough sanity in society so that women choose to have children again?
That's mostly just a glimpse of that talk; perhaps better suited is the one about programming he gave at Dropbox, which in my opinion is more in line with what's being said here.
The breakthrough is probably going to be a GPT-N model. It needs a few tweaks to be good enough - for example, to be able to handle complex projects. Glue code is always going to be an impossible problem from the point of view of language design: you can't standardise it, because every case is different in a different way. Only a human or a language model is flexible enough to adapt to this kind of variance. We're going to love our glue generator and won't be able to understand how we used to write that code by hand.
Excellent. Asimov's "decay of empire" scenario approaches! (Every living human eventually forgets how to reproduce, or even interpret, the complex systems undergirding our entire society, eventually rendering maintenance impossible.)
The #1 thing we need is a paradigm shift w.r.t. complexity. We know that complexity, as the number of things you need to keep in mind to see the big picture, grows with the size of the project, but that growth varies greatly. A bad programmer writes code in such a way that complexity grows quadratically: it quickly crosses the threshold of being understandable by anyone, and then the project is better off redone. A decent programmer keeps the complexity linear by separating it into pieces with clear interfaces and behaviors. A great programmer can keep this growth sublinear, logarithmic maybe, but it's an art - this is what makes huge projects possible (Linux, etc.). The paradigm shift would be a formal way to enforce low complexity growth, so even mediocre programmers can't f-k it up, similar to how syntax correctness is enforced.
The #2 thing is documentation, which is in a very sad state right now: everyone writes boatloads of unstructured text that vaguely and loosely explains something, and newcomers have to either parse this mess or ask those who know for a digestible summary. This happens because text, the way we write it, poorly represents our thought models. We've learned to split text into words, words into paragraphs, and paragraphs into chapters, and we've even invented tables of contents, but all that isn't enough to deal with the enormous complexity of the software we have today.
There's a touch of "let them eat cake" in the author's expectation that these difficult problems could be easily solved, if someone just put a little thought into it.
This is perhaps most clearly revealed here: "Changing the program is the most common thing we need to do in our work, it's the reason our job even exist. And yet, most of programmer's time is spent reading or planning how to change the code." Combined with the disdain for testing those changes, it seems the author wants a magic programming wand.
This. The author lost me there, I just scrolled past the rest to the conclusion.
This guy seems to believe productivity equals the amount of information produced, and fails to grasp the basics of creative work, which is exactly thinking.
I’m for reinvention, and if thinking radically is what it takes to make the more typical and known processes evolve, so be it. I do have skepticism about a model liberated from the text file - most tools fall back heavily on language, our most powerful tool, and to the extent that a visual system like a UI replaces it, that usually imposes equal constraints on what you can do.
But constraints are what make programming languages great and easier to work with than machine code. If this boils down to a hierarchy of constraints, the question that needs to be asked is “why are there so many different ways of constraining a program’s behavior” (via programming languages)? If there were only one way, much of the innovation wished for here, as abstract as it may be, would probably already have been worked on. Instead, the base shifts constantly and the techniques for building change everywhere.
Sometimes I wonder why all of programming can’t boil down to C code. Python would first be translated to C, Rust would be an analysis engine for C, etc. It is elitist that programming has become a meta-language and a field of study that takes dedicated effort just to understand and use effectively, even if that wasn’t done consciously. The lack of standardization, compared to what could be, drives this complexity. You’ll know programming has revolutionized again when an order of magnitude more people can do it without dedicating chunks of their life to it.
I think the biggest waste is the lack of interoperability and usability of libraries and code between languages.
There is SO MUCH CODE out there. To some degree that is a bad thing, but there are genuinely good libraries in other languages that would be nice to more seamlessly utilize in other languages.
Generally, there isn't a "best of breed" movement to actively track which implementations of and approaches to things (JSON serialization/deserialization, HTTP clients, regular expressions, high-speed I/O, algorithms and data structures) are actually the best.
Instead, good approaches move around organically, or are at the whim of the steward of a language recognizing and pushing for some new language, approach, etc. Often good approaches or libraries or interfaces are shut out from another ecosystem over nothing more than dogma or, in the worst case, reactionary hatred.
Instead, if one language gets a good library, you either interface at the OS level or you have to migrate to that language if the library is that critical.
The JVM was close to this, with Jython, JRuby, JavaScript, Groovy, Clojure, and lots of other languages runnable within the JVM. Ultimately that approach was rejected - notably by Ruby, even though the JVM was the fastest runtime for Ruby for a time.
But I think it was fundamentally the right approach, even if the politics got in the way. LLVM may have that potential, but it seems too tied to compiler-land. OS-level interop seems generally a failure, either because of OS balkanization or other concerns.
This is part of a comment I made on The State of State Machines (2021) [1] which is a Google Project Zero post about how complex state machines were exploited.
> I feel like we will continue to run into these kinds of issues to the end of time unless can figure out: 1. How to precisely describe the properties we want the final system to have, 2. How to 'compile' from a higher-level language/layer into a lower-level language while maintaining all those properties. At the least this will require us to: 3. Define the precise semantics of each layer and enforce that every construction in and implementation of that layer adheres to it strictly.
> Layers/languages should verifiably maintain these properties through every translation via paths like: eula/privacy policy, requirements, UX, UI, state machine, source code, AST, IR, ASM, machine code. Every one of these layers should have a well-defined execution model that is isomorphic with other layers, modulo metadata. For example: source code === AST [+ comment/whitespace/filesystem metadata], AST === IR [+ name/desugaring/applied-optimizations metadata]. All undefined behavior at every layer must be exterminated by definition or avoided by construction.
Maybe future programming systems can use an assertive pattern, where software developers only define what the software is supposed to do, and the programming system generates a program that fulfills all assertions.
Then, when the software designer finds that the software does something wrong, he can simply add assertions. It's a little like TDD but without coding.
For GUI applications it would be cool to define assertions via natural language and the programming system drawing pictures to explain what it understood.
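The closest thing we have today is probably property-based testing: you state assertions about what the software is supposed to do and the machine hunts for counterexamples (it checks code rather than synthesizing it, which is the harder half of the idea above). A minimal Python sketch with the hypothesis library; my_sort and its two postconditions are just illustrative:

    # Assertion-first flavour: state the postconditions, let the machine search
    # for inputs that violate them. Run with pytest; hypothesis generates the cases.
    from hypothesis import given, strategies as st

    def my_sort(xs):              # stand-in implementation under test
        return sorted(xs)

    @given(st.lists(st.integers()))
    def test_my_sort(xs):
        out = my_sort(xs)
        assert all(a <= b for a, b in zip(out, out[1:]))  # result is ordered
        assert sorted(out) == sorted(xs)                  # same elements, same counts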
> Maybe future programming systems can use an assertive pattern, where software developers only define what the software is supposed to do, and the programming system generates a program that fulfills all assertions.
Isn't this just programming, but harder and worse?
We've invented programming languages to be less ambiguous about what we want done, that's a feature and not a bug. It allows us to avoid the struggle of lawmakers (who are programming in spoken language) and lets us tell the computer exactly and unambiguously what to do.
Natural language does not.
"Bang on the floor" may mean to slam the floor or to have sex on the floor.
There's a programming joke that goes like a programmer's partner says "Go to the store and buy a loaf of bread, and if they have eggs, buy twelve". The programmer comes back with twelve loaves of bread.
Time flies like an arrow, fruit flies like a banana.
My intuition is that making all the assertions to write correct code with decent performance and scaling is at least as complicated as writing the code.
When I say "what" but not "how", I am trusting the system to be reasonable in the "how" it generates. But "reasonable" depends on my situation. Unless the system knows and understands that, it may easily do what I said, but do something that is anywhere from non-optimal to disastrous.
And if I have to say enough to prevent that - to prevent all the ways that could happen - is that more efficient, or less, compared to just writing the code myself?
The main reason to make the distinction is that most systems that parade themselves as "low-code" today are, IME, indistinguishable from high-code development to your average joe.
Even taking this example, the level of value totally depends on the degree of abstraction you can trust the computer to perform for you. There is a tremendous amount of nuance to human language and while we've come a very long way in that field of study, it's still a hard nut to crack.
Programming has always been making really detailed instructions that tell machine what to do. The article could have been written in 1989, 1995, or 2005 and really been materially the same. Glue code, boilerplate, tests, frameworks, languages, and maintaining existing code all things that developers complain about. All things waiting for the next magical solution...
We largely build software the same way with the same tools (I'm looking at an Emacs window). One line at a time (or one boilerplate paste at a time). It's been decades since I sat down with Turbo C and discovered the IDE, and decades since I drew my first UI on a NeXT Cube. Decades since Java, Python and C#. We've never found a way to program at a higher level - we're still stuck processing strings, parsing text, formatting text... all these little details that make it really hard to work at a higher level of abstraction. And they're all little details where each developer does things just differently enough that we can't be sure what our code really does...
> We've never found a way to program at a higher level
Compare 1950-1970 with 1980-now. We found a way to program at a higher level then; why shouldn't we be able to now?[0]
In the meantime, as a minor win, I'll note that most of my $WORK codebases in the XX were heavy on pointer[1] manipulation, while very few of them are in the XXI.
[0] let's hope the gap between structured programming and its replacement is not going to be as long as that between Euclid and Lobachevsky!
[1] "pointers are the goto of data structures" (bastardized Tony)
1. Less code. Operating systems have become obscenely bloated and restrictive. There is an enormous amount of power in the hardware we run our stuff on these days but you can't even touch it as a programmer anymore.
2. Powerful proof assistants and code synthesis. I love low-level programming but even the most hardcore, brilliant people I've met can easily introduce show-stopping security flaws into software without realizing it. It turns out that programming is hard and without rigorous models and proofs humans are really poor at understanding programs. We naturally hand-wave away details which get us most of the way there but these are discrete systems and details matter a lot.
As for testing... I agree, unit tests are often insufficient but probably the lowest bar you could convince typical industry software developers to hurdle to prove the correctness of their programs. Even then, as the author notes, they will come kicking and screaming with their opinions. There are far better options and strategies but as soon as you mention property testing or formalization you'll lose about 90% of the room so you have to pick your battles.
The trouble is that most startups are motivated to accidentally stumble upon a market niche and in this day and age that often means, "something with computers." They're not in it to write reliable, robust software that is efficient and performs well on a target platform. They're writing code that they can cash out on and never see again. Very different world.
The final breakthrough we need is to fund research labs again. Technology isn't inevitable and civilizations in the past have lost the ability to build and maintain it. We shouldn't let that happen again if we can help it.
Very nice article. Literate programming might have some interesting features in this direction. The not so obvious thing is how it can scale.
A very good example of literate programming can be found in the fantastic book Physically Based Rendering [1]. It also has a preface discussing it [2].
The authors write a full physically-based rendering system in a declarative fashion. This approach has also led to a nicely adapted web-based version of the book [3].
Although we have successful examples like TeX and HTML, it would be very interesting to see this approach explored further by a more modern language or programming methodology.
I tend to write far more testing code than the CuT (code under test) itself. Many of my various widgets have test harnesses that are, in and of themselves, full-fat, App Store-ready apps.
I find test harnesses are generally better (for me), than unit tests[1].
I was just thinking about this, this morning.
I'm what people refer to as a "crusty oldtimer." I wrote my first program (Machine Code) in 1983, so I've seen some changes in the landscape.
These days, the general posture of many shops seems to be "Our programmers suck, and can't be trusted to write code güt." Sadly, this is not necessarily an inaccurate stance.
It tends to result in highly defensive IT infrastructure and process, though, with heavy reliance on languages and strict, rote methodologies.
I used to write a lot of code in C++. Nowadays, people seem to freak out, at the thought of writing in C++. It's entirely possible to write safe, performant code, in C++, but it isn't enforced by the language. The same for PHP, which I've used a lot.
It's also quite possible for bad programmers to write truly nightmarish code in both languages. I know of which I speak, having written quite a bit of said "nightmarish" code. That's one of the reasons I write fairly good code, these days.
The conventional wisdom seems to be that we need to rely on the tools (languages, IDEs, VCS, CI/CD, etc.) to assure good code.
That's fine, but it seems to stem from an unwillingness to invest in the most expensive component in a cubicle; the software engineer.
This is a long, tiresome discussion, with no one willing to change their minds, so I'm not going to follow up on it.
I did notice that this posture seemed to start in the mid-1990s. Before that, companies put ridiculous amounts of trust and investment in us.
I find it interesting that there are no comments about the author's argument that programming today is inefficient, given that the most popular "model" is the relational database: "I imagine this model as a relational database... it becomes clear that using a textual source code is really inefficient way to manipulate this model." It wasn't too long ago that object-oriented programming, ORMs, and NoSQL were in fashion. I believe that was a dead end.
I don't believe relational databases are the be-all and end-all either. What has stood the test of time is the double-entry accounting journal entry. In this model, the journal entry is the system of record, and you can have multiple distributed systems creating these entries at the same time. You collect and aggregate these entries to display the "balance" at a specific point in time. To me, this is what makes Kafka's distributed logs powerful: distributed logs build on a proven concept.
I don't see the problem with boilerplate or editing code. Even using a template with your CI/CRUD boilerplate/deploy scripts/... is no problem if you have version control: you can easily see what comes from the template and what you've changed, and reconcile with a new version of the template.
What is the problem? That not 100% of the code in the file is business logic or only exists in that project?
When I receive a letter, there is plenty of boilerplate there: addresses, greetings, sign offs, business hours... does that really get in the way of my reading the message, once I'm used to the format? The information is there if I need it, easy to ignore if I don't, and easy to copy-paste for the author who will update if needed (e.g. closed exceptionally or something).
Is the author running into a problem or just complaining about perceived inefficiencies that don't actually get in anyone's way?
I think the problem is this: the production of languages (and especially libraries) is far too market-driven; there is too little cooperation and too little time spent polishing and modeling at least some major use cases.
Also, the quality of education is too poor and too focused on time-to-market - it makes it harder for non-seniors to understand e.g. complex type systems (which are required for more usable languages anyway).
A language is released too prematurely -> some companies start to use it -> design problems become clear, but the language is in a compatibility trap at this stage -> e.g. it becomes a swamp or it slowly becomes a monstrosity of design afterthoughts and crutches.
It probably wouldn't even require much more spending from corporations, only some initiative. They already invest considerably in language and toolchain development anyway, but often with wasteful goals.
>IntelliJ helps you with things like "extract parameter" or "inline function", but I wanted to do stuff like...
In the same vein as the developments around LSP, TreeSitter, etc. I had the idea that one could create a scripting language dedicated to refactors. The language would make it simple to talk about the AST of your programming language and provide simple primitives for safely transforming it.
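As a rough sketch of the primitives such a refactoring language would need, here is what "drop a parameter from a function definition" looks like as a transform over Python's standard ast module (the function and parameter names are invented, and a real tool would also have to rewrite every call site):

    # Sketch: remove the `verbose` parameter from load_user by rewriting the AST.
    # Requires Python 3.9+ for ast.unparse. Call sites are deliberately ignored
    # here; handling them is exactly where a semantic model earns its keep.
    import ast, textwrap

    source = textwrap.dedent("""
        def load_user(session, verbose, user_id):
            return session.get(user_id)
    """)

    class DropParam(ast.NodeTransformer):
        def __init__(self, func, param):
            self.func, self.param = func, param

        def visit_FunctionDef(self, node):
            if node.name == self.func:
                node.args.args = [a for a in node.args.args if a.arg != self.param]
            return node

    tree = DropParam("load_user", "verbose").visit(ast.parse(source))
    print(ast.unparse(tree))   # def load_user(session, user_id): ...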
Access and Resource Security - I want to look at a program and be able to tell "this program will never be able to do/write/read X" at one glance. Encapsulation, capability security and metering all the way. Let every dependency be in its own sandbox, with the minimum rights it needs. We need a language that can do all that, without relying on another language to spin off more sandboxes (like WebAssembly which requires Javascript) and without overhead - one that supports "internal encapsulation" instead. And not an incomplete approach like Java either, instead create a VM that ensures a sandbox can never access anything external, unless it is explicitly imported/passed.
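As a small illustration of the "minimum rights, passed explicitly" idea in ordinary code (object-capability style; nothing here is enforced by Python itself, which is exactly the gap a capability-secure language or VM would close):

    # A dependency is handed a capability to one directory rather than ambient
    # filesystem access, so "this code can never read X" is visible at the call
    # site. Python cannot stop the dependency from importing os and bypassing
    # this; a capability-secure runtime would make that impossible.
    # (Assumes ./data/titles.txt exists.)
    from pathlib import Path

    class DirCapability:
        def __init__(self, root: Path):
            self._root = root.resolve()

        def read_text(self, name: str) -> str:
            p = (self._root / name).resolve()
            if p != self._root and self._root not in p.parents:
                raise PermissionError("outside the granted directory")
            return p.read_text()

    def import_titles(storage: DirCapability) -> list[str]:
        # this "third-party" function can only touch what it was handed
        return storage.read_text("titles.txt").splitlines()

    print(import_titles(DirCapability(Path("./data"))))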
I've approached building web apps as data models. It works really well for our team. This was how / why we started to develop Lowdefy [0], to express a web application with a data model. We've been experiencing the benefits of this - I really need to start writing about it.
Writing this app model in JSON or YAML might not be the ideal start, but once you have it, creating an IDE / GUI builder / next thing on top of it becomes a really simple task, while reaping all the benefits of defining a program / app as a data model.
That looks pretty cool, I will look at it. I've already had this idea many times (and I think anybody who has done some web stuff must have had it too), but it always fell apart when I considered all the edge cases you want to handle differently.
Thanks. Yeah, we’ve been building all kinds of things: anything from dashboards, CRUD, public forms, and very advanced forms with complex validation and logic, to full MRP and CRM systems. Of course there is an edge; for now it’s stuff like pub/sub, server OAuth etc. But mostly it seems like these edge cases can be supported as features later.
It seems like the abstraction is working really well thus far, separating the business logic, into operator functions, from the app and backend stuff is really key to this.
Really looking forward to the next phase, where we get to create all the nice programming tools in front of the program model to provide a superior dev ex :)
What we need before all this is a better way of specifying program behavior, both at a high and a low level.
In mobile there are multiple tools that you can use to prototype UI behavior. They allow you to specify navigation, transitions, clickable areas, behaviors when clicked, swipe actions, etc. They're this way so that the UI behaviors can be seen before coded (to an extent - not every prototyping tool does everything).
For code, well, only formal specifications get to that level. I haven't actually seen one of those outside of software engineering books, but I presume they exist in more critical fields. Commercial software doesn't really use them because they're too much work.
I solved this for myself with a custom Visual Studio extension. It allows me to quickly specify a pattern that will create new code or change existing code. The extension shows a so-called Code Lens for each possibility to apply the pattern. This way, for all the tasks I find myself doing over and over again, I design the pattern as a kind of master and then apply it again and again in hundreds of locations across many projects. (Some things I have solved this way: Prisma schema, GraphQL resolvers, frontend code matching my backend API including full typings, ...). It saves me tons of work, but I still have 100% control over the code.
The essential complexity in today's coding is that code is created by humans. Therefore we humans have to understand existing code bases to modify them, and this is non-trivial. We also have to understand "what" needs to be built, that is, the infamous "requirements". I think eventually coding will be automated away with artificial intelligence. I did not think like this for a long time in my career (that programming would be automated, I mean), but neither did I think that computers would be able to create "deep fakes" and do things like understanding and speaking human languages.
This paper might be of interest: David Harel, Guy Katz, Rami Marelly, Assaf Marron. Wise Computing: Towards Endowing System Development with True Wisdom https://arxiv.org/abs/1501.05924
It describes an idea called "wise computing" or "behavioral programming", which I'm not sure I fully understand, but it sounds like the idea is to let IDEs help more, via additional formal reasoning about code in the manner of an expert-system AI.
Something like Clojure inside Glamorous Toolkit (GTK) would make me a happy camper. The discussion about text is a moot point bc GTK makes it easier to create different visual representations of your code. Clojure makes it easier to keep code as either basic data types ([], {}) or functions without reaching for custom DSLs which OO defaults to.
That and that everyone adopts it so there's plenty of libraries and a community :D.
Seems like Clojure will get there with beefed up REPL workflows ala Calva/Portal. Community is amazing rn but tbd how it 10-100x (if ever).
I disagree wholeheartedly with everything written in this post... whoah. I am almost thinking this was written as a sort of provocative summary of what are *not* the software problems to solve.
I'm surprised that the author did not even mention GC; I think it's another programming breakthrough after structured programming.
Currently, GC in compiled languages like Go and D (where it's optional) is considered a disadvantage rather than an advantage for high-performance systems.
Perhaps if we can get AI-enabled GC that performs as well as manual memory management in C++ and Rust, it will be a genuine breakthrough for modern programming languages.
Text does feel like a really slow way to work sometimes.
A while ago in a React app I needed to keep some state around, so I lifted it from component state hooks to a context. Very mechanical work. It feels like we should be able to say "move these to a context consumed by A, B, C" and just have it all done automatically.
Is there anything inherent to React/JS that stops something like IntelliJ implementing that?
> Is there anything inherent to React/JS that stops something like IntelliJ implementing that?
Yes, actually. It's a problem for dynamic and weakly typed languages. The IDE can't assume many properties about the program, so it can't do many of the refactorings reliably. On top of that, React and many web frameworks build on some DSL (like JSX) and on build-system-specific project structure and source transforms that the IDE can't reason about very well. As a result, such refactorings often need to be implemented by the IDE for each specific framework separately. This is exactly the area where starting with a semantic model first would help a lot.
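A tiny example of why this is hard (a hypothetical snippet, in Python rather than JS, but the problem is the same): a purely syntactic "rename" cannot see every reference, so the tool either guesses or refuses:

    # Rename shipping_address -> shipping_addr and the dynamic accesses below
    # break only at runtime: one attribute name is assembled on the fly, the
    # other goes through the instance dict. Without a semantic model (or type
    # information), a refactoring tool cannot prove these refer to the same field.
    class Order:
        def __init__(self):
            self.shipping_address = "221B Baker St"

    order = Order()
    field = "shipping" + "_address"            # name built at runtime
    print(getattr(order, field))               # dynamic attribute access
    print(vars(order)["shipping_address"])     # access via the instance dict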
I can't find the article now, but I remember reading about SecDB and comparing it to programming in traditional languages. The point that stuck with me is that the software industry has spent decades taking other industries' data and lifting it from flat files into databases, yet our own code is still in flat files.
- The programmer's tool should be a tool for manipulating an annotated AST (not text)
- There should be many different types of UX's for different scenarios, each maps to and from an AST in a UX that is optimal for the developer for that scenario
- We must be conscious of human brain limitations and cognitive psychology and work within those constraints
- "Reading" and "Writing" code should have different UX's because they are radically different use cases
- Use RPN. It models the real world. Humans are designed to manipulate their environment incrementally, seeing the result at each step of the way (a toy sketch follows this list). When we have to plan out and write code for an extended period of time, trying to play compiler in our head, we overload our brain unnecessarily and are highly likely to make simple mistakes.
- Testing should be a first class citizen in the developer experience and indeed baked into how we develop at a fundamental level that it seems strange that they are even decoupled to begin with.
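To make the RPN point concrete, here is a toy Python sketch of the "manipulate a little, see the whole state" loop (not a proposal for real syntax):

    # Toy RPN evaluator: after every token the entire state (the stack) is
    # printed, which is the incremental, observe-each-step workflow the list
    # above argues for.
    def rpn(tokens):
        ops = {"+": lambda a, b: a + b,
               "-": lambda a, b: a - b,
               "*": lambda a, b: a * b}
        stack = []
        for tok in tokens:
            if tok in ops:
                b, a = stack.pop(), stack.pop()
                stack.append(ops[tok](a, b))
            else:
                stack.append(float(tok))
            print(f"{tok!r:>5} -> {stack}")    # observe the result of each step
        return stack[-1]

    rpn("3 4 + 2 *".split())                   # ((3 + 4) * 2) == 14.0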
A paradigm that Dave Farley has mentioned could be on the same level as OOP or structured programming is a completely asynchronous programming language. Every operation would have no guarantee of executing immediately. Such a thing may unlock more performance, more cheaply, on modern multicore CPUs.
When we tried async-everything in the 1980's we discovered that there is such a thing as too-fine grained parallelism: the sweet spot (problem dependent!) is where communication and computation are balanced.
I work on this problem everyday, I am at 450+ journal entries of thoughts from concurrency, parallelism, desktop environment behaviour and programmatic expression. (see my profile)
I am working on the expression problem at this time.
> If the text is just a view, it doesn't matter how it's written. Let everybody customize it the way they want. I don't care if you put opening brace on a new line, I don't even want to care.
If an AI can now draw a Rembrandt, then it's not far off that an AI can read this:
Regular users are allowed to edit their own books.
Editors are allowed to edit any books with the same account id.
The public can read about any book that is not in draft status.
I need a domain model where users can write books with the following properties: userId, accountId, title, author, status.
status is an enum supporting draft, published, no longer available
I know this seems like a completely different paradigm, but I think we were wrong about General AI being needed to replace programmers - we just didn't know how far "normal AI" could go without being General AI.
I can also see this extended into the front-end, at least for rudimentary forms. I can imagine someone using transfer learning to make front-ends have a consistent and nice-looking feel that can change pretty easily - perhaps to match the user making the request.
Assuming you mean that to be a rough draft for the spec of a book management app, you wouldn't need an AI to parse that and generate code. A simple DSL is enough and indeed the "spec" you wrote could be implemented in about ten minutes with Rails, devise and pundit. (I imagine most other popular backend webdev languages have similar libraries available)
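To make that concrete without picking a framework, here is the same spec written directly as a minimal Python sketch (the names are lifted from the spec above; persistence, authentication, and everything else a real app needs are ignored):

    # The domain model and the three access rules from the spec, as plain code.
    from dataclasses import dataclass
    from enum import Enum

    class Status(Enum):
        DRAFT = "draft"
        PUBLISHED = "published"
        NO_LONGER_AVAILABLE = "no longer available"

    @dataclass
    class User:
        user_id: int
        account_id: int
        is_editor: bool = False

    @dataclass
    class Book:
        user_id: int
        account_id: int
        title: str
        author: str
        status: Status

    def can_edit(user: User, book: Book) -> bool:
        if user.is_editor:
            return user.account_id == book.account_id   # editors: same account
        return user.user_id == book.user_id             # regular users: own books

    def can_read_publicly(book: Book) -> bool:
        return book.status is not Status.DRAFT          # public: anything not in draft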
I wonder if the author of the blogpost ever tried Rails’ convention-over-configuration approach. It might be exactly what he means when he says he gets to write less boilerplate.
> I think we were wrong about General AI needed to replace programmers
So, the hard part of software is figuring out what you actually need done.
You still need the person writing the system requirements to be precise, accurate and not ambiguous, or you need General AI to figure out if the supplied system requirements meets those requirements.
And when things inevitably goes wrong, or requirements change as they always do, "you just" have to review and rephrase your entire requirement list, or you need General AI to do that for you.
All this stuff exists, and most of it is decades old. The reason we don't make (better) use of it is that IT is a fashion driven field that ignores prior art (as the author of TFA demonstrates.)
No idea who this fella is, and I'm too lazy to look up his huge achievements, but I dislike the way he is thinking. I mean - it's already there. Go and research Smalltalk stuff or something.
This resonates deeply with me (and my research project).
1. Glue code and boilerplate waste
Yes, yes and yes! Glue code is the dark matter of software [1]
My hypothesis is that the reason we have to write so much glue code, and that it's always sufficiently different is that we don't have the right architectural abstractions. Specifically, our programming languages only support (essentially) procedural abstraction (procedures, functions, methods).
With Objective-S [2][3], I've been introducing language support for other architectural styles. It's been a slog, but recently things are starting to really click together, particularly the combination of "stores and streams" is very powerful and eliminates tons of boilerplate, often reducing it to the "connection operator" →
Which not only happens to be very brief, it's also generic/polymorphic.
3. Why don't frameworks work?
Also really very perceptive observations. Frameworks sort of work, but they tend to be limited by having to represent their domain in terms of procedural abstractions.
4. Non-textual code/modeling
I don't think this is the problem. Modeling is a problem, but a large part of that is because the systems/models we need to build are non-procedural, and so our textual code can only be a meta-program or meta-system that then constructs the actual system. Which is not visible/present in the code we write, but only exists in the running system and in our heads. If we close that semantic gap, having that model expressed textually shouldn't be a huge problem.
The problem is not the text, it's the wrong abstractions that our text expresses.
5. Testing
Not sure the idea of generic testing is that useful, because the key to testing (for me) is its concreteness. That said we can reduce test friction a lot, I think. [4]
6. UI construction
You don't need UI mockups and prototypes - you just program the real thing, because it's that simple. If the user doesn't like it, you can completely restructure it easily.
Yes. I've seen a lot of mockup tools, and it always seemed obvious that writing the real UI should be just as easy (or easier). With NeXT's Interface Builder, we had a large part of that, but sadly Apple has had it languish so horribly that nowadays even imperative code is often preferable, and the fluent APIs such as SwiftUI or React, Flutter etc. seem vastly preferable, despite their deep flaws.
Here are a few thoughts on how to make strides towards what this guy wants, but with tools and features that largely exist today.
RE: Boilerplate
A significant amount of the boilerplate he talks about comes from the web stack not being designed for what we use it for. That leads to a need for a complex three tier architecture, ad-hoc app-specific protocols layered on top of HTTP, and then lots of ad-hoc logic to convert that protocol to and from the underlying database protocols.
Consider how much less code you'd have in a two tier architecture that looks like this:
1. An app that runs on the user's machine, which connects directly to:
2. Your RDBMS, which knows about every user of your app and imposes security/privacy ACLs on them in the standard manner using views/row based security/etc.
In this architecture there is no web server, no REST, no JSON, no JWTs, and therefore also entire classes of security bugs are eliminated in one go (XSS, XSRF, SQL injection, non-transactionality triggered race conditions etc). There are also no load balancers because your DB driver already knows how to do client side load balancing, and a variety of other advantages. When you need more than just standard SQL CRUD operations you write server side function plugins for your RDBMS in a language of your choice and let the DB protocol and servers act as an RPC protocol that happens to support batching, transactions, large result streaming/paging all built in. In effect the RDBMS itself becomes the application server.
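A sketch of what "the RDBMS is the application server" can look like, assuming PostgreSQL (the table, policy, and credentials are invented): the client connects directly, and row-level security does the filtering that a web tier would otherwise reimplement:

    # Two-tier sketch: the desktop client talks straight to PostgreSQL and
    # row-level security enforces access, so there is no JSON API and no
    # hand-written ACL glue. The policy below is set up once by the DBA:
    #
    #   ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
    #   CREATE POLICY own_orders ON orders
    #       USING (customer = current_user);  -- each user sees only their rows
    #
    import psycopg2

    conn = psycopg2.connect("dbname=shop user=alice")   # per-user DB credentials
    with conn, conn.cursor() as cur:
        cur.execute("SELECT id, total FROM orders")      # policy filters server-side
        for order_id, total in cur.fetchall():           # only Alice's orders arrive
            print(order_id, total)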
Today we don't build apps like this for a few different reasons, mostly related to the inconvenience of distributing desktop software outside a web browser. But that's solvable! And in fact my current project is a company that makes a tool that makes distributing desktop apps as easy as distributing web apps [1]. With that and a technology like Kotlin Multiplatform + Jetpack Compose, or with JavaFX, you can write a single frontend app that runs on every desktop OS, Android and iOS (where you can write a custom SwiftUI GUI with shared business logic, or re-use the Android UI code if you don't need pixel-perfect native UI).
There are a few other issues that are all also solvable e.g. tunneling DB protocols through proxies, OAuth/SSO integration, streaming code to the client etc. And as a pattern it really benefits from a powerful RDBMS. But once you have the foundation of brain-dead simple app packaging+signing+notarization from your laptop, with smooth auto update to clients, suddenly you have lots of options to massively simplify up and down the stack.
RE: Text vs abstract code models.
Another big win if you go this route is you suddenly have way more language freedom. The author talks about wanting a database to store his code, but then talks about Rust, for which IDE support is limited. If you use a language with really good IDE support like Kotlin or Java, then your IDE is building a database like the one he wants already. He says it's painful to query that DB but that really depends a lot on what you know and what languages you use. IntelliJ has a structural search+replace feature that lets you do example based queries, and it also has a console and plugins that let you do on the fly queries and structural changes by writing code against their PSI API. You don't have to write full blown plugins [2].
I don't want to be too harsh, but I think I strongly disagree with everything the author claims, and having scrolled through the comments I feel the criticism could use a bit more.
1. Writing glue code and boilerplate is a waste / Why not use a framework?
The author complains about boilerplate just to circularly complain about frameworks hiding "important details you care about". This, to me, denotes a failure to recognize some pretty basic trade-offs of any kind of development: you either create your own axioms and start something new or base your work on previous art by others. The more you want to customize, the less prior art you can use. He seems to want to "eat the cake and have it too".
2. Editing code in general doesn't work well / Program is not a text / Program is a model / Do we need a language?
This section makes no sense to me. Everything is a language; whatever format he desires to abstract away all formats will itself need a language to be encoded in. Language is a tool to convey meaning and achieve communication; it does not matter if it's text, pictures, sounds, whatever. It just so happens that written language is the single most ubiquitous communication method ever developed by humanity, and the formalization of syntax and semantics is there to reduce the ambiguity which is natural to natural language. The author seems to believe that we should achieve a single natural language that conveys meaning without ambiguity, or that some other form of communication can better represent the programmer's intention than text. I would love to understand exactly how, because to me it seems like a pointless argument in the way it's presented.
3. Testing/Correctness / I want simpler testing
Again, the author does not seem to understand WHY testing is difficult. To me, it's very obvious that testing is hard simply because 1. proving a complex system correct is hard and often impossible, as we all learn in CS101, and 2. testing itself requires conveying the meaning of the requirement correctly, and as with all communication there are always failures in this process. Once more the author seems to want a magic wand, to ignore the trade-offs and, I don't know, fix all communication problems that have ever existed?
4. What is the vision?
Finally, the author tries to force upon the reader an argument that everyone's perspective is wrong, that some sort of revolution is required - and possible - to make all these desires come true. I think this misses the point by a long shot, like the good old Product Manager who "just wants things to be done and doesn't care how". I may be the one incapable of changing my perspective and thinking outside the box, but I have a feeling that if such a revolution were possible - and, most importantly, unanimous - we'd have had some glimpse of it already.
Note how every single argument of the author's is - to my interpretation - fundamentally tied to a communication problem. All the issues mentioned converge on the difficulty of encoding a message in such a way that it is interpreted correctly by the receiver, be it a machine or a person, in an unambiguous way. This problem is so much bigger than programming, so much bigger than engineering, so much bigger than science. I'd argue that this is THE fundamental open problem of humanity, and believing that there is some sort of final solution is extremely naive.
In my day-to-day JavaScript programming adventures, I typically encounter issues around file paths and how modules are imported through them. So much energy has been devoted to parsing a string and reading the file it (probably) points to, and we just assume that what we're importing from that file is actually what is in it. Maintaining all of my static imports in JavaScript feels like I'm doing something a compiler or bundler should be doing, not a human being.
Go has the right idea: it considers every file in the current directory to be part of the same package. This is really useful because you basically don't have to think about file importing anymore. I love being able to compartmentalize my Go programs without having to worry about updating a bunch of path imports all over the place. It lets me just focus on the code and sort out the organization later on when the project gets bigger. Of course, when the project _does_ get bigger, that's when you start having to grok how packages work and how they're built. This isn't a huge problem, but it's definitely a learning curve you don't necessarily need to overcome when you're first starting out, so it's a bit of a skill jump.
I feel like there should be a programming language where the module system is deeply integrated with the package management system, and they are both baked into the language using syntax. This way, you can optimize those systems without disturbing how programs work, and the programs can specify what stuff they need from each package. Rather than deciding on some kind of folder/file convention for what a "module" is in your language, this one would put that responsibility on the programmer.
A package would be defined by specifying its unique name, and the code that can be imported out of it.
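For example (a purely hypothetical sketch, since no such language exists; I'm borrowing Kotlin-flavored syntax just to show the shape), a package named "my-app" might expose a single entry-point function:

    // Hypothetical "my-app" package: it declares a unique name and the code
    // that can be imported out of it -- no folder/file conventions involved.
    object App {
        // the function referred to below as my-app#App.start()
        fun start() {
            println("hello from my-app")
        }
    }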
You'd run this function on the command line by specifying it as your "entry point" like
lang --run 'my-app#App.start()'
Or, you can compile it into a program with
lang --compile 'my-app#App.start()' --output ./app
I don't know if a "registry" is really needed for these packages, as that might unnecessarily centralize the whole thing. Technically, since the language doesn't care at all about file paths, you can store packages however you want on your system. Literally `wget -qO- "https://github.com/some/pkg/releases/latest" | gunzip` should be enough to start using the package, but of course having a CLI for managing this stuff will also be table stakes.
Fundamental, foundational, unavoidable, overwhelming, invincible but just dirt simple truth: Please sit down for this. And may I have the envelope, please? Drum roll?
===>>> In computing, meaning is crucial. Computer source code doesn't mean anything. <<<===
Computer code obeys strict syntax rules, but those rules do not give meaning.
So far, sorry 'bout this, essentially, to date, there is only one good and significant approach to meaning -- writing in a natural language, e.g., English. Did I mention, sorry 'bout that.
In simple, blunt terms, in well written code, the comments which provide the meaning are more important, especially for maintenance, than the actual code that gets executed. Sorry 'bout that. Beyond that, beyond any doubt, far and away, in well done software, the most important part is the well written documentation. I should put this word documentation in all caps, ultra bold face, with flashing lights -- well written documentation.
Sorry 'bout that.
For my code for my startup, early on I wrote out some documentation in TeX, with the crucial core pure/applied math, data handling, etc. Then I wrote some notes about the code. The code is just awash in documentation, 100,000 lines of typing and 24,000 programming language statements. Right, on average 4 lines of typing for each programming language statement.
I had some external events pull me off the work of my startup, but now I'm returning to it and finding my old documentation just crucial, terrific. Even though it is all my project and my code, I still need all the documentation, and it works great -- lets me understand it again right away.
Don't believe me or argue with me. Instead, learn from D. Knuth who has demonstrated very well that he is really good with software. See what he did, and how and why, with his literate programming -- more like reading a book in English about something in engineering than just source code.
Right, long ago many software projects at IBM and the US DoD went through stacks and stacks of layers of planning, requirements, specifications, ..., before any code was written. Then for any change, they might have to go through the hundreds of pounds of paper of documentation to get it up to date again -- a real pain, bottleneck, boat anchor to progress, etc. Right. Need a better way. But apparently their documentation process was too clumsy or some such. On the other hand, if the software is for an airplane and it is you who is going to fly on that plane, then maybe ....
Net, the meaning is crucial, and the code doesn't mean anything. The meaning is in the documentation written in a natural language, e.g., English. Sorry 'bout that.
One more point: we can try to design a programming language with syntax so expressive that the meaning is automatically obvious and no more documentation is needed, but I see little hope of that soon. Did I mention I see little hope soon? Sorry 'bout that.
My 2 cents: there are thousands of projects trapped in formatting hell because nobody wants to touch git history. We need a standard way to fix formatting without messing up history.
Some code formatters such as Spotless (https://github.com/diffplug/spotless/tree/main/plugin-gradle...) allow you to format code only in files that have changes against some designated branch such as `master`. So, you check out your feature branch, make changes, do some commits, and run spotless. Only the files which have some changes between your workspace and the master branch will be formatted. This allows you to gradually format the project as and when files would be changed anyways.
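For reference, this is roughly what that looks like in a Gradle Kotlin DSL build file; a minimal sketch, and the plugin version and formatter choices here are just examples:

    // build.gradle.kts -- format only files that differ from origin/master
    plugins {
        id("com.diffplug.spotless") version "6.25.0"
    }

    spotless {
        ratchetFrom("origin/master")   // only changed files get formatted
        java {
            googleJavaFormat()
        }
        kotlin {
            ktlint()
        }
    }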