Programs are a prison: Rethinking the building blocks of computing interfaces (djrobstep.com)
225 points by vortex_ape on Nov 7, 2020 | 189 comments



> We need computing environments ... without the concept of applications appearing at all.

Platforms keep trying to enable this, but application vendors want to control the UX and branding, so they're not going to provide these generic reusable building blocks.

Android, for example, lets apps make use of views from other apps and securely delegate a task (e.g. take a photo, pick a file, etc.) to the user's preferred app without needing to request permission for direct access. But nobody does this - apps just request all permissions and do everything themselves.


Author here, yes this is a big problem (the biggest?), as the incentives are all wrong. As I noted in the post: "Often ... apps will have features to integrate with other apps and the wider operating system - but not so much that they become invisible. Instagram still wants you to see its logo, consume its specific content and stay within its ecosystem. Once again, the implementation and architecture are driven by economic imperatives."


Ah, yes. But you seem to be saying in the article that these incentives cause the platforms to lack such features, whereas what I mean is that e.g. Android's intents and activities provide just such a mechanism (which could be used to great effect as indicated in the sibling comment about OpenIntents), but commercial application vendors don't want to use it. They would rather control the user experience than integrate seamlessly into the platform.

In particular, you mention sandboxes imprisoning the code. But Android allows an app in one sandbox to display an activity (essentially a dialog box) from another app running in another sandbox with different privileges in a way that appears seamless to the user. I could have one app with access to bluetooth (but no camera) call upon another app with access to the camera (but no bluetooth) in order to take a photo.

I believe apps can also expose services and data sources (ContentProviders) – e.g. your Images, Tables and Conversations – to other apps and define their own permissions[1] for them.

[1] https://developer.android.com/guide/topics/permissions/overv...


Obviously economic incentives have an impact on what problems get worked on, but economic incentives can’t reduce the complexity of problems. I think what we have here is a problem of irreducible complexity. The reason that abstractions like the “Image” or “Table” you described in your article are few and far between is because it’s really hard to implement these objects in a way that scales to fit a broad enough set of use cases. The design of such objects involves making a set of tradeoffs where the tradeoff space has a very high dimensionality and lots of shallow local optima.


You don't really need standardized implementations, or even fully standardized interfaces. What's missing most is any direct interface at all.


Not that I disagree with your overall assertion, but without economic incentive there is really no incentive in most cases. For someone who does have economic incentive, it's really business risk reduction 101 to not pin your application to a network of invisible and hidden dependencies that you have no control over.


Right, using F-Droid and installing open source apps where the only incentive is the user's benefit... that's a breath of fresh air. Of course it can never give you access to Instagram or any particular commercial platform like that, but you can handle your own data with dignity.


This has resonated with me as I've recently been thinking about service interop. There is a strong push in my org to go full RPC, which will probably win out for not-bad reasons, but I sort of wish we had put in the necessary up-front investment to model it RESTfully.


OpenIntents is trying to standardize and promote such cooperation between apps on Android. Tasker and similar apps are also a way to stitch apps together.

http://www.openintents.org/


How does it address the incentive issues ptx brought up?


are they accepting submissions?


Platforms are completely antagonistic to this goal, as the definition of a "platform" is pretty much "something that lives in its own world, and so doesn't care about following outside standards".


Platforms are very much in favor of this goal within the platform, but antagonistic outside. But then, every app within the platform is very much antagonistic to this goal - it also wants to have full control over the user's data and interactions. So structurally, it's the same problem - but it manifests itself separately at every level of the stack - at least where commercial software is involved.

It's all a matter of culture. Proprietary software wants to take control, so it's opposed to general interoperability. Contrast that with e.g. the mod scenes of games like Kerbal Space Program, where mod authors (i.e. authors of what are essentially apps, running on the platform of the game) go out of their way to be interoperable with everything else, often implementing compatibility features inside their own work that target other popular mods.


Yes, and focusing on the inside compared to the outside makes things exponentially worse for the outside.


I disagree. I think one of the big reasons for this is the limited functionality of the common interfaces that the platform provides.

The WhatsApp client implements its own photo picker instead of using the system one. Why? So it can add "crop" and "comment" functions. It would be objectively worse for me, the user, if they had stuck to platform features.


There is a reason why some apps don't use an external camera app but take pictures themselves: security. You want to be sure (as much as possible) that the picture taken was real and not provided to the intent by some "use any picture as camera output" app.


This is essentially DRM and experience tells us that it's not foolproof (a quick search on a piracy website will surface the latest media even though it's heavily DRM'ed).

It's better to just accept this as being impossible than to lure people into a false sense of security (where a minority that does know how to work around the DRM is then given more leverage since the majority believes it's impossible).


Why? I see very few use cases for this requirement. For example, my banking app lets me take a picture for only one purpose: OCR'ing payment info instead of manually inputting it. But I usually get these as pictures from other people! This heavily reduces the utility of the feature for no good reason.


If the user wants to provide a "non-real" picture, there is no legitimate reason to stop them.


What if I want a user to provide a legitimate photo as much as possible, for instance, ID/license verification for a car sharing app?


Then you can't rely on the user's hardware. If you want to verify an ID or license, a photo is not sufficient anyway. Many security features can't be checked that way.


In macOS, for example, there was the Services menu. Is it still alive? I haven't been on macOS much since 10.5.


Isn't that what app-stores are for, i.e. to police the apps?


I think app stores are likely more about the 30% cut. The policing just makes that easier to sell.


> We need computing environments ... without the concept of applications appearing at all.

GPT-3 :-)


The answer is to push the "application vendors" back into the browser. None of them are really writing anything that needs to be running natively, it's almost all just a front end to some service on the internet.

Real computer applications are almost universally developed by researchers and the open source community and look exactly like this. No "application", just tools you install into your system that are watched by the community for breakage/malware, or worst case offered as a git repo/tarball for free.


Are vim and Emacs not real applications? How about GIMP? Firefox? Thunderbird?

The GNOME people could have used an Emacs or vim window instead of gEdit. But they didn't, because more or less the same incentives that prevent commercial entities from developing components instead of apps exist in the open source community as well.

Hell, even GCC is essentially an app, even though parts of it would have been extremely useful as shareable components, as clang showed.

And I don't really know of researchers producing apps. The closest I can think of is maybe Coq?


Can you explain what definition of application you are using in this post? The only application I use that is just a front end to a web service is some client tracking software I am required to use for State-level reporting purposes at work. All the rest of my software, ranging from development, to digital audio, video, and art, rendering and 3D work, etc., is native and local. Some of it is open source, but much is not.


I see comments like this occasionally and I ask myself, what do people who make them actually do? Genuinely? My three main hats are developer, accountant and 3d designer, and while most of my accounting work these days is done via web-based services, none of my dev work or 3d design work is. Sure, I consume a lot of stuff on the internet, but the vast majority of actual creation is done via native software. Some of it is open source, especially dev tools, but most content creation tools are closed source, and with the exception of Blender there are no, or at least very few, open source content creation tools that are widely used professionally in any field that I'm aware of. (Before the OSS fanatics start whatabouting about GIMP, Krita, OpenSCAD etc., please note the caveats of professional and widely used.)


The nearest I've come to working in an environment like this is a system called Quartz developed at Bank of America. It's a massive set of infrastructure and code based on Python. It consists of a variety of services such as a distributed synchronised hierarchical object store, a set of compute grids, a web server farm, etc. All the code is in a massive code repository and is world readable by any developer in any team, with the configuration for everything in a single config hierarchy based on YAML.

To create a batch process you just commit the code, commit the config for where and when you want it to run and on which host group, and you're done. You can even develop desktop apps running a copy of the runtime locally. Because everything was Python (obviously low-level libraries were done in C or C++) and exposed via Python APIs, and all the Python code was available, it was very easy to develop interoperable applications.
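
To give a flavour, a batch job body might have looked something like this (a purely hypothetical sketch - Quartz is proprietary and every name below is invented):

    # Purely hypothetical sketch: all of these names are invented, and only
    # convey the general shape of "commit code + config and you're done".
    def revalue_book(db):
        """Nightly batch job: re-mark every position in one book."""
        book = db.read("/EquityDesk/Books/Main")   # hierarchical object store
        for position in book.positions():
            position.mark_to_market()
        db.write(book)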

Of course there was huge duplication, but it was an incredibly fun system to work on. I really miss it. The previous iteration of it was actually called Athena and developed at JP Morgan by the same team, who were later hired away by BoA. Even before that they developed the original version at Goldman Sachs, but that wasn't based on Python.


Oh yeah; we have/had this - it is called HTML and none of us got the idea behind it (I certainly didn't); so, instead we re-created the prisons we were, and are, trying to escape.

It also is really hard to profit from ONLY meaningful data, thus the death of things like RSS feeds. You can't shovel advertisements, trackers, and spyware down someone's throat when you're just sending a nicely formatted HTML table that the client decides how to display and use.

Truly good, semantically meaningful and correct HTML tags, used without tons of obfuscating markup to appease some maniac's absurd sense of aesthetics (my own included), would be a pretty sweet API to consume. There's also the subtle reality that nearly every UI designer app worth a shit is, under the hood, using an XML-esque format to describe what you've done... which is mostly akin to "put a table here, with a given convoluted datasource."

Things like Yahoo Pipes come to mind as something that was frankly awesome, and totally failed to find a viable market. Likely because you can't shovel ads down someone's throat and forcefully track their every move while sending them only things they want to consume.

The continued gating of data is only going to exacerbate the problem; I fear that companies like Facebook (though it's far from the only guilty party) advocate for privacy solely to protect their data monopolies. It will soon become impossible for other players to enter the market because it'll be rightfully illegal to collect or mine that data. Yet, companies like Facebook will still have access to it. I have zero reason to believe their intelligence systems are going to "unlearn" from illegally sourced data or that they can even meaningfully remove it (you really gonna go remove data from those tape backups?).

I was reading the other day about folks who want to do link previews, but it's essentially impossible if you're not Facebook because your bot is instantly blocked. So while organizations like Facebook and Google are allowed to freely pilfer the internet of resources for their own bottom lines, and are applauded for it, anyone else is looked at like a scammer and a fraud. But I'm starting to digress and ramble; so I'll end it here :)

Edit: Slight updates for grammar/readability.


> the death of things like RSS feeds

This is really sad[1]. RSS feeds are amazing. Thankfully they are not completely gone!

[1] Actually, the whole state of the Internet is sad.


I share the passion for feeds. IMHO the decision of whether a site exports feeds should not be just up to the owner. There should be a neutral transformation layer that translates any HTML to RSS. Shameless plug [0]. Feel free to support it.

[0] https://github.com/damoeb/rss-proxy


RSS feeds are common, but most readers are mediocre. I'm stuck between some proprietary "pay to remove ads and subscribe for premium" crap, an Emacs mode that sometimes randomly takes up to an hour to open, and Thunderbird, which recently deleted all of my RSS feeds.


I use newsboat on Linux, and Feeder on Android.


NetNewsWire has been reliable.


Just curious, what (viable) state of the internet fits your definition of “happy”, or at least “not sad”?


For me at least, being able to use whatever software adheres to the standard. This means that novice users can use software that specifically targets them, while experienced users can use a swiss-army-knife program or even write their own program to handle it. For RSS specifically, most news sites do not have RSS now, which means that you need to visit them one by one or use an aggregator that you can't modify much.


> Oh yeah; we have/had this

Not quite.

They are still stuck in a browser. The DOM is still largely impossible for users to parse and interact with. Web pages are still separate from each other.

> none of us got the idea behind it (I certainly didn't); so, instead we re-created the prisons we were, and are, trying to escape.

That's true for two reasons.

1. The reason you gave: we made it this way.

2. HTML isn't naturally leading us in a different direction than that.

I would say Emacs is a better implementation, though it still has faults.

Really, there is nothing that truly encapsulates this ideal. It's something that I still have difficulty articulating.

I think the most significant walls we have made are in user interface. We make programs that expect users to interact in a predetermined fashion instead of allowing users to make that decision themselves. In designing these programs, we wall the user away from engineering their own interfaces.

I think the second most significant wall is the floor. Take Emacs for example: Sure, you can alter every variable. You can change the keymap. You can change the fonts. But fundamentally, every variable has a default that the user must confront. There is no Emacs-from-scratch configuration option. This gets especially messy because default variables are organized in a carefully designed structure. They are codependent. It's trivial to make small changes, but significant refactoring requires planned cohesion with what is already there.

This topic is something that is vitally important to software design, and yet we don't even talk about it. We just keep rolling with the status quo until someone breaks down a wall and becomes a hero.


IMO one of the great things about e.g. semantic forms in HTML is that instead of building a GUI, you essentially write a machine-readable description of an API from which a GUI gets generated, in a way that allows for optional styling.

When used correctly it's hard to imagine a better way to provide access to a remote service. Of course it's rarely used correctly when money is involved but I think that's a human thing that happens and the technology can't change it.


>Inherent, ubiquitous programmability: Currently, "doing programming" is a segregated activity from mainstream computing - separate software, command lines, specialist knowledge, clunky text-driven interfaces. This must end. Real expressiveness demands that every entity in the interface is inherently programmable - a table of data shouldn't just be a rendered picture of a table of data - it should be a table. Programming shouldn't be separate at all.

Almost every attempt at such an environment has failed for a reason. The problem with programming is not the syntax and other particularities, but the inherent complexity of explaining the task to a computer.

Natural interfaces like modern voice assistants have a lot more chances to succeed than programming, because a) they imitate normal human communication and b) they are less dependent on unambiguous programming. And they are still limited because their fuzzy nature makes them unreliable and unpredictable.


> Almost every attempt at such an environment has failed for a reason.

BASIC and spreadsheets are notable exceptions. However, they don’t scale to complex programs.


Spreadsheets basically run entire massive companies. What is the standard of complexity which this fails to meet?


Pretty much anywhere that depends heavily on spreadsheets also depends heavily on humans in complex ways. Humans are checking that different versions of the spreadsheet haven't gone out of sync, have been sent to the right people, are only updated when they are supposed to be, etc.


If you want to scare yourself, look up reports of spreadsheets with errors.


> The problem with programming is not the syntax and other particularities, but the inherent complexity of explaining the task to a computer.

Not really. A great deal of the complexity is accidental, not inherent. Take the example from the post of adding up some numbers in a table. The inherent complexity is very low; the complexity comes from all the stuff not inherent to the problem itself.


Programming, in the traditional sense, has to be formal and unambiguous, that's the main issue. The complexity can arise from seemingly nothing. Here's an extreme math counterexample: Fermat's last theorem. Proving it seems easy on the surface because it's so trivially formulated, yet it took centuries for mathematicians to actually do it.

Commoditized programming faces the same problem: many problems seem easy at first, but when you are trying to solve them with your shiny low-entry-barrier interface, you are hitting a fundamental ceiling you never knew existed. There's never been a shortage of unsuspecting newbies trying to parse HTML with regular expressions, as an example.
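
A toy Python illustration of that particular wall (the inputs are made up, but the failure mode is the classic one):

    import re

    # A naive pattern for pulling out link targets...
    ok = '<a href="/a">one</a> <a href="/b">two</a>'
    print(re.findall(r'<a href="(.*?)">', ok))    # ['/a', '/b'] - seems to work

    # ...but real-world markup breaks it immediately: one extra attribute
    # before href, and the "obvious" regex silently finds nothing.
    real = '<a class="nav" href="/c">three</a>'
    print(re.findall(r'<a href="(.*?)">', real))  # []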

There were plenty of attempts to make end users program things. LISP machines and Genera (and Emacs, which is IMO the closest existing thing to the "ubiquitous programmability" you propose). BASIC, which was the main user interface for many early personal/home computers. Visual programming environments. Spreadsheets. Some of them survived and are very useful for simple cases, like programming materials visually in Blender, or automating stuff in IFTTT, but all of them suffer from the same issue: [non-AI] computers are too dumb and expect more or less exact instructions. That leaves the complexity on users' shoulders. Once you go Turing complete (and often even without that), syntax or entry barriers don't matter much - you either train for years to be able to formulate the human-generated problem, encode it for a computer, and change it as the need arises, or you hit a wall with a low entry barrier tool.

Programming is easy to learn, sure, there's no reason it should be hard. But it's fundamentally hard to master. And it's not about syntax or high friction interfaces (which can often be more productive for a trained person solving a hard problem, actually).


> There were plenty of attempts to make end users program things. LISP machines and ...

I would say Lisp machines were designed for professional programmers, not end-users. First of all, they were so expensive that only Bill Gates could afford them :-)


The real world is always complex, ambiguous and changing. So any simple model will in time be found inadequate, and complex models unusable.

The problem of programming isn't so much the interface. Text has been used for decades and remains a robust communication platform among humans. The real problem is defining the problem to be solved, its scope and adaptability to a complex and changing world.

The difficulty is the gap between vague ideas and real world outcomes of automation and human-computer interactions. To codify and implement ideas requires precise understanding, design and adaptability, which are otherwise demanding, understated and neglected.

Notice there is little need to focus on the tools themselves in this realization.

Update: Adding a bunch of numbers in a list is never a real problem, so it is just an artificial construct.


What's vague and hard to define about adding up some numbers in a table?


The question shows that you have a hard time understanding the complexity of it. Maybe the reason is that you are an experienced developer who deals with all the complexity quite naturally and automatically.

So why is this task complex?

Well, what happens if there are no numbers in the table to begin with? Is the sum of no numbers a zero or is it supposed to be some error?

And are all the numbers supposed to be natural numbers, or can there be e.g. irrational and complex numbers in there? If so, how much precision do we need when summing them up - and how important is performance?

Also, what if there are not only numbers but other things (dates, text, ...) by accident?

And in a real world scenario, what happens when numbers are added to the table during the calculation? Should they be considered or ignored - should the table be locked somehow?
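
To make this concrete, here is a rough Python sketch - every branch below encodes one of those decisions, and the policies chosen here are just one possible set of answers:

    from decimal import Decimal, InvalidOperation

    def sum_column(cells):
        total = Decimal(0)             # empty input sums to 0, not an error
        for cell in cells:
            if cell is None or cell == "":
                continue               # blanks are skipped, not treated as 0
            try:
                total += Decimal(str(cell))   # Decimal, not float: precision
            except InvalidOperation:
                raise ValueError(f"not a number: {cell!r}")  # loud, not silent
        return total

    print(sum_column([1, "2.5", None, 3]))   # 6.5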

The problem with these questions is not that they are necessarily hard to answer (often they are), but that people don't even know that they need to be asked in the beginning.

Many developers are so used to it, they often don't understand that they are doing things that are difficult for normal people.

In the same way mathematicians find it easy to do basic algebra as part of some more difficult task - it's just part of their toolkit - whereas most people don't even understand this basic part from the beginning, let alone more complex problems.


> The problem with these questions is not that they are necessarily hard to answer (often they are), but that people don't even know that they need to be asked in the beginning.

Right. And what makes it even more problematic is that answers to these questions affect what answers are possible to other questions. So it is not the sum of the difficulties of the answer to each question but their product, the combinations of different possible answers.


Adding the numbers is not the problem, the problem is what the numbers represent, where do they come from (if they come from only one place...) where and for what are they needed.

Quickly you have to consider different kinds of numeral systems, bases, decimal separators (which may be locale dependent), precision, units, the difference in adding time (or dates), or money in different currencies (with and without different kinds of taxes), etc, etc, etc.

Sure, most of the time you can abstract yourself away from all this, use sensible defaults, whatever, but that complexity is still there and sooner or later it will get you, even if you try to hide it under the rug...
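
Two tiny Python examples of that lurking complexity:

    # Binary floating point: the "obvious" representation gives wrong answers.
    print(0.1 + 0.2 == 0.3)   # False
    from decimal import Decimal
    print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))   # True

    # Locales: "1.234" is about one and a quarter in en_US, but exactly one
    # thousand two hundred and thirty-four in de_DE, where the comma is the
    # decimal separator and the dot groups thousands.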


Exactly this!

The complexity and misunderstandings also creep into system design itself. Seems many are eternally astounded that transactional data is "duplicate data" stored separately from the "same data", etc. Many of these complexities are non-obvious and unintuitive, until you are forced to think it through step by step yourself. Agile is simply the concession that the complexity cannot always be handled up-front, but that solutions must be developed to be adaptable to new observations and realizations.


You're starting from a solution. What exactly is the problem in this case? Generate a balance statement? You will have to divide it into smaller problems until each problem is small enough that it has an obvious solution. The fact that you only need an addition of some numbers in a table is the result of lots of thinking that you just glossed over.


How many jQuery table plugins are there? Sort, filter, pagination, partial display. And yet it is impossible to find a component that meets all requirements.


TL;DR: The problem is being precise enough to be accountable for real-world outcomes. Which is why you need software developers (or evolvers, rather).


I'm working on the first building blocks to try and make this happen. The only public thing I have right now is a landing page but you can leave your email to get updates: https://hupreter.com


A lot of replies are missing the point.

It's not "apps integrate better with each other".

It is "there are no apps".

So what would Adobe sell, if not the "Photoshop app"? It would sell the Photoshop "menu of filters", the "selector toolbox", the "color histogram view" and such. But the workspace where you see the image and apply the selectors or filters would be outside Photoshop itself. It would be a standard part of the system, where the image could come from and go into another organizing system (possibly provided by another vendor). You could mix organizing systems, sharing/versioning systems and filters/selectors/menus/views from various vendors, commercial or free or open source.

This would apply not just to images, but to all kinds of media - movies, documents, including "code".


Microsoft already tried this in the mid '90s. It was called OLE (Object Linking and Embedding). You could embed a Photoshop document live in your Word document, and if you clicked the embedded document, the Photoshop UI for editing would appear.

It turned out to be absolutely horrible. Maybe it was before its time, with computers having only a few megs of memory and being slower than today, but the bigger issues seemed to be things like UI: when you click the embedded document, what should happen? How much of Photoshop's UI should be shown vs. Word's (the outer app)? It was also a nightmare because unless every user had the exact same apps on the exact same versions, nothing worked.

Your embedded photo uses filter XYZ but that's only in version 7+ of whatever app edited it, etc...

https://en.wikipedia.org/wiki/Object_Linking_and_Embedding


OLE was only part of the strategy; the idea behind it was to create a "document centric" operating system.

In other words, the "no apps operating system" is what Microsoft already envisioned with Windows 95.

IBM tried something similar with OS/2 Warp, which was "object centric".

The idea failed not because of OLE (every other technology was bad back then and inherently less secure), but because the world moved back to an app-centric view and rebranded it as people-centric. The modern app stores are the result of that, and the fact that integration comes in the form of "share this", "tweet this", "post this image on Instagram" is why we still have apps at the center of the old desktop metaphor: branding has more monetary value than function (we don't even write anymore, we tweet), even though something functional is better than something with just a brand attached to it.


Sounds terrible, but what you're describing seems to still fall in the category of "integrating apps" rather than "no apps".

I think the best real world reference point for "no apps" is the terminal.

Piping commands together is very powerful and intuitive (in this way they behave more like composable objects than applications). It works well, except for being unfriendly and reliant on low level text streams, which are both surmountable problems.


The terminal is exactly an example of an app-centric workflow and why it is easier to build and/or easier to use than an API focused one. If you want to see a no-apps workflow, something like the Genera LISP OS was probably much closer[0].

People choose to use apps like ls, cat, less, echo, touch, find instead of using the FILE* object directly with readdir(), stat(), read(), write(), open(), creat() etc. All of the apps are designed to have human-readable output first, lots of options for controlling that output to make it as readable as possible etc. However, none of them exposes a rich model for its output that would make it possible to easily integrate it into more complex workflows. Instead, we rely on yet other apps, like sed, grep, xargs and essentially copy-pasting text between these apps (this is all that pipes really are, essentially).
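
A small Python illustration of the difference, with a directory listing as the task (the scraping version is exactly the kind of copy-pasting that pipes institutionalize):

    import os, subprocess

    # App-centric: scrape ls's human-oriented text and guess at its layout.
    # Fragile: filenames with spaces/newlines, locale date formats, etc.
    out = subprocess.run(["ls", "-l"], capture_output=True, text=True).stdout
    sizes = [int(line.split()[4]) for line in out.splitlines()[1:]]

    # Object-centric: ask the OS directly and get structured data back.
    sizes = [e.stat().st_size for e in os.scandir(".") if e.is_file()]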

This becomes even more obvious when you use stuff like gcc or gdb, which have extremely rich and potentially useful layers of representation that they refuse to expose at all, even as APIs - only text is allowed in the interaction.

Hell, the MS Office suite is a better example of a no-apps workflow, since each of the Office UIs has a deep understanding of the data produced by the others, and you can combine these in meaningful ways - much more so than terminal apps (for example, Word can show a portion of a spreadsheet without you having to guess at what contents it might have and how to parse it, like you would if you want to expose a portion of ls's output to a file).

Interfacing code is hard, it requires well-thought-out APIs and much more work even with the best APIs. Interfacing apps with extremely minimalistic APIs (copy/paste, share) is much easier for everyone.

[0] https://www.youtube.com/watch?v=o4-YnLpLgtk&t=6m0s


In the context of terminals, if you want to see a modern API-based instead of app-based CLI experience, check out PowerShell. The underlying principle is that all commands like "ls" or "ps"[0] return their results as .NET objects, not unstructured text. If you just call "ps" in the shell, you'll get the default visual representation you'd expect - but you can also choose a different one (e.g. a list view via Format-List, or a filterable GUI table view via Out-GridView), and you can then filter the objects by properties and call methods on them.

For example, to find and kill all instances of notepad.exe, you'd write:

   Get-Process | ForEach-Object { if($_.ProcessName -eq "notepad") { $_.Kill(); } }
A bit verbose (and that's generally a problem with day-to-day PowerShell usage), but relatively trivial to turn into a cmdlet and alias it to "killall".

Of course, the above example was trivial, but the same principle works for more complex ones - instead of streams of text, you have streams of objects, which you can filter and run methods on, without doing any parsing.

--

[0] - Them being aliases to PowerShell's Get-ChildItem and Get-Process, respectively.


Right, should have mentioned PS as well.

However, it's still important to note that it's easier in some sense, especially for simple tasks, to use Bash than PS, and I think that this is deeply tied to the reason why we fall back to apps rather than rich objects as the fundamental interactions.

I believe that human capacity to massage data together is still very hard to replicate in the kind of formal manner required by programming tools (for now, at least). That is why it is significantly easier for someone to copy data from one web UI to another than it is to write the rules for copying backend-to-backend automatically (up to some amount of data). This is a problem much more fundamental than the economic incentives for app creation. It's similar to the observation that, for small amounts of data, it is easier to run a select * from table and visually search, rather than go through the trouble of specifying which columns and rows you want to see.


The terminal is absolutely full of apps though. It's literally an environment designed around invoking other programs.

Pipelines are indeed very powerful, but mostly for a specific class of tasks. As soon as you need interactivity, the pipeline model loses a lot of its benefits. None of the popular databases, web servers, games or office suites are implemented as terminal pipelines. The other problem with pipelines is that while they scale reasonably well with data size, they don't scale quite as nicely with task complexity. A five-program pipeline with complex command line arguments can already be complex to understand. A 500-program pipeline would be a total nightmare, especially when it starts including components for error reporting and retry behavior.


No, what they're describing is very much "no apps", unlike terminals. OLE is part of COM, which is all about individual classes, hidden behind interfaces, being available globally in the OS as building blocks.

In a COM model, you can imagine, say, a word processor being composed of UI, document model, document store and spellchecker components. In this reality, I could replace the UI with a touch-enabled one, and run the same spell checker on a remote computer, all with a few changes in the Windows registry. Think of it like OS-level dependency injection and microservices, with RPC and orchestration being handled transparently for you.


What is in question is the UX, not the literal code.

Obviously terminal programs are literally programs and COM is literally objects.

However, terminal programs often have a single purpose, along with inputs and outputs that function as a crude interface allowing them to be combined in a reasonably flexible way by the end user, who can use them to construct functionality of their own.


Not even mentioning the insane amount of security issues with OLE components.


They also did Visual Basic for Apps, which inspired the similar Google Apps Script.


The big issue here being that Adobe has zero interest in selling this, and this kind of model would not lead to a $250B market cap business. Adobe wants to tightly control the experience, record how you use the software, display their own branding, try to upsell you on their other products and services, etc etc.

Software businesses care about controlling the UX/branding/etc. tightly, because that's where the money is - not selling "menus of filters". That's why every webpage nowadays is an SPA that hijacks standard browser features like scrolling and copy-paste, and why no one looking to make money was ever interested in the semantic web.

I'm very aligned with the views expressed in the article and your comment, and have been working on some open source approaches to it in my free time for a few years now. I figure that the only way that it can maybe work is to make something for myself that I love using, and maybe some other enthusiasts will like it too and it can grow a bit from there. But there's probably no way it would ever meaningfully compete with Photoshop, because it goes against every economic incentive that software companies have.

In parallel, I also suspect that that's why the design/UX of open source applications tends to be extremely poor in general - great, tight design is expensive and needs strong economic incentives.


That is exactly my thinking. This model goes contrary to the needs of proprietary software, which is to lock you into a whole system you can charge for.

(And FWIW, I don't have the vendetta against proprietary software like a lot of HN seems to, so I'm sympathetic here about breaking the incentives for production of software that legitimately needs paid devs and can't get enough via donations. But if Open Source teams can make it work, I'm all for that!)

OTOH, as a general model ... I can see why it has failed to pick up, at least with typical devs and prevailing software practices. It requires a level of care about interfaces and interoperability that is not very common.

More often, you see APIs broken willy-nilly with no functional need to. Stable standards for exchange of data require a lot of work and are the exception -- even when you have them, there will be warts that reveal abstraction leakage. (Another comment mentions how HTML was quickly broken.[0]) The no-app approach would basically be that fight, but for every external touch point of every module.

I do hope the dev world can make that work ... but also: understand what you're fighting against.

[0] https://news.ycombinator.com/item?id=25021180


Yes, open source sounds like the right place for ideas like this, for the reasons you mention.

Consider Emacs, whose hundreds of extensions give you this mix-and-match setup, or something similar to it.

I think I might really like working with a system that works in this way, with a bit from here, a bit from there -- but I'd want access to the code, because it would take a lot of tweaking to make things nice.


It would likely mean losing lots of integration and special features though. If it's outside of the app, how would you have a modal UI which would change what's in the right-click menu of the canvas? You would also have to wait for your OS to implement 10-bit support for it - I'm pretty sure that e.g. Krita had a 10-bit canvas before MS Paint :)

Also, enjoy teaching students, or even learning to do anything. With big software like Photoshop (or its musical equivalents), you can take the book that comes with it and become fairly proficient at pretty much everything the app allows. How would that work if there is no individual piece of software? What screenshots do you show in the docs? How do you make YouTube videos when most people may have a different setup than yours?


> If it's outside of the app, how would you have a modal UI which would change what's in the right-click menu of the canvas?

Simple. The object that owns the "modal UI" part would own, or at least aggregate, a right-click menu. The canvas shouldn't be busy handling right-click menus anyway, it should provide an interface to access and manipulate the underlying bitmap. The "modal UI" would own the right-click handler, and would populate the context menu with a filtered choice of options, including functionality provided by different objects.

It's more complex than the naive approach where everything is owned by a single object, but fundamentally not that different from the design you'd arrive at if you intended to allow for plugins in your application - except now the "plugin manager" isn't in your app, but in your OS.

> You would also have to wait for your OS to implement 10-bit support for it - I'm pretty sure that e.g. Krita had 10-bit canvas before MS Paint :)

Krita people could provide a 10-bit canvas implementation that offers an 8-bit canvas-compatible interface. Alternatively, your UI could require canvas objects implementing a 10-bit canvas interface, but you - or a third party - would also provide an adapter (polyfill, as the kids say today) that wraps the MS Paint 8-bit canvas behind a 10-bit canvas interface.
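
In entirely hypothetical Python terms (none of these classes exist anywhere - they are just the shape of the adapter idea):

    class EightBitCanvas:
        def set_pixel(self, x, y, rgb8):
            """Channel values 0..255 - what an MS Paint canvas could offer."""

    class TenBitCanvasAdapter:
        """Exposes a 10-bit canvas interface over an 8-bit implementation."""
        def __init__(self, inner):
            self.inner = inner

        def set_pixel(self, x, y, rgb10):
            # Downscale 0..1023 to 0..255: lossy, but interface-compatible.
            self.inner.set_pixel(x, y, tuple(v >> 2 for v in rgb10))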

--

The teaching point you bring up is a strong one, and I don't have a good answer. I think Microsoft's COM, the Emacs ecosystem and every game with a large modding scene (e.g. Kerbal Space Program) all provide evidence that a fully interoperable system turns users into sysadmins. You don't need to code to use such a system, but you'd better be prepared to be aware of all the components, and do a bit of configuring, if you want anything other than the defaults.

(But, as every game with a large modding scene demonstrates, it's not necessarily that hard for regular non-tech-savvy users either.)


Exactly - for example, resizing photos.

Imagine a resizer object, that could be used:

- to implement resizing in various interfaces for image objects

- in a script (think unix pipes but operating over real image objects, not streams of bytes)

- wired up to various events and data sources for automation (when my aunt emails me a picture, resize it and put it in this folder)

The resizer wouldn't be hidden behind some app's implementation - it would be a first class object you can inspect and interact with ("ok, looks like this resizer object takes an image object as input and outputs another image; I'll drag an image onto it right now to try it out").
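
As a very rough sketch of the shape such an object might have (Pillow does the pixel work here; the idea that the environment can introspect the inputs/outputs is the invented part):

    from PIL import Image

    class Resizer:
        inputs = {"image": Image.Image, "width": int, "height": int}
        outputs = {"image": Image.Image}

        def __call__(self, image, width, height):
            return image.resize((width, height))

    resize = Resizer()
    thumb = resize(Image.open("photo.jpg"), 320, 240)   # use it directly...
    # ...or let an automation system read Resizer.inputs and wire it up to
    # an "aunt emailed a picture" event.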


One of the big problems I've seen with multi-vendor systems that co-operate is what happens when something doesn't work. It was never clear where the problem was, because the system incorporated multiple products from multiple vendors. So if your resizer object didn't work for some images after upgrading Photoshop, but it used to in the old version, how do you get that fixed?

In a single-vendor program, it's clearly a bug in the program and you can hand it back to the vendor to fix it. When multiple vendors are involved, it gets way more complicated. Each has a tendency to blame the other, and not take responsibility for the problem. Each will claim that they are coding to the interface specification. And they might be, it's just that the specification isn't tight enough to make all interactions bulletproof.

I don't really have a solution (apart from "write a really, really, tight interface. No, tighter than that"), but I've seen the problem enough to be skeptical that we can have this.

edit: I should add that I wish we could have this.


This. It is not just vendors, but also teams within any company. E.g. operating systems: Windows, Android, iOS/OSX - all are composed of modules which run as isolated processes and communicate over agreed interfaces. Before a system upgrade reaches the market there are countless meetings between teams to sort out bugs and design inconsistencies.


Apple already tried this in the mid '90s. It was called OpenDoc. You could embed a drawing document live in your word processing document, and if you clicked the embedded document, the drawing UI for editing would appear.

It turned out to be absolutely horrible. Maybe it was before its time, with computers having only a few megs of memory and being slower than today, but the bigger issues seemed to be things like UI: when you click the embedded document, what should happen? How much of the drawing's UI should be shown vs. the word processor's? It was also a nightmare because unless every user had OpenDoc and the exact same components on the exact same versions, nothing worked.

https://en.wikipedia.org/wiki/OpenDoc


On the plus side, this could be a way for those who don’t sell apps to compete with the juggernauts.

Yes, people could unite to improve, say, GIMP by writing extensions for it, but if such extensions were usable elsewhere too, maybe more people would write them.


...and you've just turned Photoshop into a slightly worse version of GIMP. Congratulations. Without adjustment layers and other non-destructive editing features, Photoshop becomes a whole lot less useful as a professional tool.


To some degree, Android's Intent system works towards this, in an extensible way.


Adobe has an API for all of their software; in principle you can hook into it with your own code. The main problem is that the document models are really complex, so I doubt lots of people would do that in their free time.


You mean like everything is a file and you could build pipelines of simple commands to modify your data? That’s genius!


Exactly, but imagine real objects instead of text streams, and rich interfaces instead of lo-fi terminals from the 70s.


This doesn't make any sense.

Apps are what allows you to do any work on the data, starting with being able to display the data on a screen (or print it on paper).

Data without apps is completely useless.


That's a bit reductive. Surely there could exist a model where work on data doesn't have to be mediated through "apps", and this is what the article refers to.


"work on data" is the definition of "app".

Unless you're willing to go through your hard drive platters yourself with a magnetized needle? (or whatever the equivalent is for SSDs)

EDIT: Though even in that hypothetical case, you would need to follow an "app" (more commonly known under the term "algorithm") to be able to make sense of the magnetic domains that you're detecting!


The problem this blog post describes, I've been working on for a few years - it's really not a hard problem per se, it's just a lot of meticulous work.

I think that's in general the case - most problems in life are problems of: you know what you ought to do, but is it what you want to do?

Do you want to spend 5 years trying to solve a problem, with no promise of financial, social or personal reward? Do you want to save up for another 5 to give the next 5 a try? Do you want to spend 10 years on a goal with no guarantee, one that most other people will tell you is a bad idea, vs. working at FAANG?

For most people, they are solving for securing a predictable career so that they can have a family and live their lives. What better way to secure your spot, if not by becoming or joining/enabling a monopoly in your little sphere of life? Why is FAANG a term? Because people want to join monopolies/potential monopolies to secure a predictable family future :)

That's why software doesn't talk to one another for the most part - if it isn't enabling someone's potential monopoly - it isn't worth doing given most people's life goals.

I don't think that's ever going to change unless we establish basic income creative people can raise a family on and I don't see why most people who aren't creative, would be in favor of such an arrangement, so we're stuck with what we've got :)


a sort of Conway's law in reverse, applied to society in general


OpenDoc was a mid-‘90s Apple software framework that basically did this. It was also adopted by IBM on OS/2 as part of the technology exchange that also resulted in Apple and Motorola using IBM’s POWER CPU architecture.

Steve Jobs killed OpenDoc when he returned to Apple in 1997 because it wasn’t NeXT software. The IBM side of the project had already died at that point as Windows 95 trounced OS/2.


Windows 95, with OLE, which became ActiveX, which is pretty isomorphic to OpenDoc.


There's also the wider Component Object Model (COM), which does pretty much everything the author of this article wants, and then some (e.g. remote objects with network transparency, deep security), and it all works within Windows to this day - but, for some reason, app developers seem to avoid it like the plague.

I blame COM being a bit annoying to use on the developer side, and the economic incentives mentioned elsewhere in this thread.


I don’t think he killed it “because it wasn’t NeXT software”.

I think he killed it because the market didn’t support it (MS Office showed that an “everything but the kitchen sink” solution could conquer the market, leaving only breadcrumbs for smaller parties) and to focus the company.


But MS Office is sort of the implementation of what the article describes, with it being thoroughly COM-based. You can embed pieces of Office in your software, or pieces of your software in Office (including saving its state in MS Office's documents).


Technically, sort of, yes, but sociologically: no. Office doesn’t (seem to) use it itself. If you can insert an Excel table into a Word document, why does Word have table functionality on its own? Why does Excel have its own text box for styled text? Can I embed a Word table inside another document without getting the entire Word editor? An Excel table that’s just a table and not a sheet to which one can add charts, etc?

Also (and possibly alongside “we sell Word/Excel separately, too, and want it to be everything but the kitchen sink, to prevent others from providing missing parts”) the answer to that may be “because the UI of OLE-embedded parts isn’t as smooth as it could be”. Outside-in activation (which, reading http://preserve.mactech.com/articles/mactech/Vol.10/10.08/Op... doesn’t seem to be required with OLE, but I haven’t seen otherwise) means you have to click multiple times to start editing (click to activate an OLE control, then click again to start editing it).

OpenDoc promised much more, but of course, the market didn’t want to deliver it at the time, and we also don’t know whether its promises really can be fulfilled (Cyberdog was cute, though)


Codebases are, in my mind at least, a virtual space. I believe that one day programs will look like factory floors or cities. They will produce and consume physical analogues for values and types, which you can pick up and examine. Want to debug a function? Strap on your VR headset, teleport into its physically reified room and watch the execution. Tinker with the pipeline in real time.

I’ve been dreaming about this forever, and there have been many attempts, but with remote work and VR going mainstream I think someone will eventually build something usable and scalable.


I mean, play Factorio and you can see it.


Satisfactory (the game) too!


Sounds cool, although a long way away when we're still dealing with text-based development interfaces right now. Definitely need more immersive environments and fully inspectable programs, and actual graphics in our terminals and editors!


In what way is a debugger not already this?

Are you asking for better visualizations?


Yup. Visualizations that harness our brains' spatial memory and reasoning, specifically.


This is a very old-school way of thinking, without any mention of privacy or how to share data safely between different users. As soon as you have multiple users, especially when they don't trust each other, things get much more complicated.

Should you really be able to do anything you like with your bank account or DMV record? And do you really want the people you interact with to download all the photos you share?

Single-user systems are much easier to deal with, but they're just sandboxes that don't do very much.


There's prior art for all of that, including object and method-level security, in Microsoft's Distributed COM (DCOM). It can be made to work.


Sure, but if you're doing RPC and it does a security check, this isn't all that different from filling out a form and getting a response. It's not empowerment since you don't get to do anything more with the data than you could otherwise.

That's not much like playing with your own data in a sandbox.


Seems to me that objects provide a much better way to do security than applications, as they allow permissions to be much more specific and granular.


I call this idea "if you can see it, you can use it", and "right-click use".

So if you see a filename on the CLI, you can right click on the file name and interact with the file with a GUI.

Or you could hover over a <pre> tag in the browser containing some DOT syntax or a table, right click, and click Use. It would run various heuristics over the data to work out what it is and then import it into the right program.


Yes!! Here's a link to an experiment I did some time ago to do just what you're describing: https://twitter.com/juancampa/status/1033495489637961729?s=1...


Like the Acme or Wily editors?


or the lisp machines


I like your thinking!


This is a problem I have personally spent several years thinking about and working on. The trick IMO will be to build it incrementally from what we already have. For anyone interested, here's my take on it: https://membrane.io

The TL;DR is that I've been building an orthogonally persistent, message-based, user-centric, programmable (JS/TS) graph.


Looks pretty cool. Have you looked into Pathom (a Clojure library)? Its creator seems to share your vision of connecting APIs from different sources. Last 5 minutes of this video:

The Maximal Graph by Wilker Silva - https://www.youtube.com/watch?v=IS3i3DTUnAI


Sounds like what Microsoft wanted Windows to be back in the Win 3.1 era with COM and all that.


Can you elaborate a bit on what COM was? (or link a resource) I couldn't find its mention on the Windows 3.1 Wikipedia page.


https://en.wikipedia.org/wiki/Component_Object_Model

which is an elaboration of

https://en.wikipedia.org/wiki/Dynamic_Data_Exchange

A common use of COM was scripting with Visual Basic in the 1990s: for instance, asking Excel what is in cell B7, or dynamically loading a GUI component out of a DLL and scripting it into a Visual Basic application.

This blends the boundaries between applications in that you might have a Word document that has an Excel spreadsheet embedded in it, and it really does boot up Excel and has Excel render itself in a rectangle inside the Word document.
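
That style of scripting still works from other languages today, e.g. Python via the pywin32 package (this assumes Windows with Excel installed, and the file path is made up):

    import win32com.client   # requires the pywin32 package

    excel = win32com.client.Dispatch("Excel.Application")   # late-bound COM
    wb = excel.Workbooks.Open(r"C:\data\report.xlsx")       # hypothetical file
    print(wb.Worksheets(1).Range("B7").Value)               # what's in cell B7?
    wb.Close(SaveChanges=False)
    excel.Quit()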


This use didn't go away after the 1990s - Office still uses Visual Basic which still uses COM.


So basically what Amiga OS had with ARexx in the 1980s.

Edit: I misremembered, that was the 90s as well because it was a later development in the Amiga ecosystem.


Thank you for the links!

> A common use of COM was scripting with Visual Basic in the 1990s ...

This sounds nifty!


It is used for a lot more.

Want to integrate Windows Explorer in your application? COM.

Custom property pages in Windows Explorer? COM.

Custom folder view ala zip folder? COM.

Want Windows Explorer to be able to extract metadata from your custom file format, or Windows Search to search it? COM.

Want to play or manipulate video using the installed codecs? COM.

Want users to be able to drag an attachment from Outlook and drop it into your custom application? COM.

Just some examples. COM is a bit clunky, but it's a great enabler on the desktop.

https://docs.microsoft.com/en-us/windows/win32/shell/intro

https://docs.microsoft.com/en-us/windows/win32/properties/pr...

https://docs.microsoft.com/en-us/windows/win32/directshow/di...

https://docs.microsoft.com/en-us/windows/win32/shell/dragdro...


COM is still used a lot in audio (at least the COM ABI) because it allows sharing objects between programs/shared libraries and managing their destruction. It also has a nice way to add functionality.


I'd also add:

Want to play video games? DirectX is a COM API too.


It was! Then the malware came. Those links opened a million security holes that were effectively untestable.


Nowadays you still interface with COM through PowerShell scripting, and it is pretty nifty.


It's a language-agnostic binary interface. It's kind of hard to explain without getting into the technical details of how it works. For many years it was the only stable ABI on Windows.


It has been the blessed interface for all new Windows APIs since around 2000, I think.



The world is moving in the opposite direction. Reality check:

* it would be nice if we could fix our own devices.

* it would be nice if we could install our own software.

* it would be nice if we could fix our own software.

* it would be nice if we could combine programs.

* it would be nice if data was not tied to applications.

Most users could not utilize these freedoms anyway. UNIX users could create and share programs and glue them together with pipelines; that's beyond "user" level now.


I blame C++. It's not just that it lacks any runtime type information by default, hindering any attempt to interface with compiled code - heck, even interfacing with C++ source is hard. It also comes with the mindset that this is somehow an advantage, and with derision toward other "scripting" languages.

For example, C is better in this regard: it's easier to call library functions without access to source code. And it's equally fast.


This is a great idea that would change computing for the better.

That said, one angle I sense here is blaming. It's easy to dream and blame, it's harder to build and lead by example. It'd be great if this post was a "why XXX" page in the documentation for a new platform which implements the said ideas.


> It's easy to dream and blame, it's harder to build and lead by example. It'd be great if this post was a "why XXX" page in the documentation for a new platform which implements the said ideas.

That's because these grandiose visions are just that – dreams. It's one thing to imagine a user's utopia of infinite possibilities and write a blog post, and a completely different thing to go and implement it. It typically falls apart when confronted with the messy reality of the real world and actual code. That's why you see a lot of those posts, and never anything that goes significantly beyond some toy examples, if at all. Compare the author's misconception that there's little inherent complexity in summing a range of cells in a spreadsheet [1], an idea that won't be entertained for long if you actually implement a general-purpose spreadsheet application.

@djrobstep: Sorry for the harsh words. I've heard too many of those visionary ideas (including my own), and have never seen them amount to anything (see also Alan Kay's vision of software inspired by biological cells and systems – lots of talk, no non-trivial proofs of concept).

I would be delighted to be proven wrong, so please don't let this post bring you down. Start building! If you succeed, feel free to rub my nose in it :-)

[1] https://news.ycombinator.com/item?id=25020363


> (see also Alan Kay's vision of software inspired by biological cells and systems – lots of talk, no non-trivial proofs of concept).

I just wanted to say that Kay's vision actually was implemented in something very non-trivial, on both the hardware side (the Alto) and the software side (the Smalltalk operating system). They were used by real people and exhibited many of the traits this article talks about. And some of their ideas were hugely influential in mainstream computing.

I love the demo of one of the Smalltalk programs in this talk by Alan Kay [0]; it starts at around 40:30.

[0] - https://www.youtube.com/watch?v=p2LZLYcu_JY


In what way does Smalltalk embody the concepts underlying biological cells and systems? Note that objects sending messages to other objects has very little to do with that – it would look more like lots of identical objects emitting lots of identical messages into a shared medium, and lots of objects receiving none, some, or many of those messages (a stochastic process) and reacting in a more or less deterministic manner.

Such a system would be hell to develop, debug, and maintain. Which is why we don't design systems that way. Which is exactly my point.


Agree. I'm usually very frustrated with the app boundaries, poor system wide integration and so much duplication of work.

Another essay about computing without apps is https://humane.computer/killing-apps/


> We must build much higher level shared meaning - Images, Tables, Conversations and beyond, building a common implementation and understanding used by everybody.

Thinking you can build something like this is extremely naive. If you have worked in any company over a certain size, you will know that even inside a single company, people often don't use the exact same vocabulary. For example, what constitutes a product is very different across departments such as sales, production, design, and customer service. Martin Fowler talks a bit about this in his post on bounded contexts [1].

[1] https://martinfowler.com/bliki/BoundedContext.html


But we already do this, with a whole variety of different objects - strings, sockets, integers, floats, URLs for instance.


Strings... which kind? NUL-terminated, Pascal strings, ASCII, UTF-8, UTF-16?

It turns out that strings might not even be the best way to handle text; ropes look better. (A different thread here on HN.)

We got close with COM and Windows... as much as I knock Bill Gates, at least he managed to push the clipboard into everyone's toolkit. Imagine if that hadn't happened.

What might be possible is to tweak the clipboard a bit: let the user set a clipboard boundary in the same manner, but handle the I/O in such a way that the contents become a universally agreed-upon object type that can update and serve as a persistent resource identifier. (Think Ted Nelson's Xanadu.)

Someone has to show this as a working concept in an open source project, and then some other open source project has to integrate it.


Those elements you list all come from the same single domain: computing. And we spent decades in the computing community trying to agree on their definitions. On top of that, all of them are simple, flat, value-based types, meaning they don't really have relationships to other elements.


I think that we actually do already have a lot of the relevant primitives in one place: accessibility APIs. It would be interesting to see experimentation in using this data for more than just screen readers and the like, so that you could do things like slurp a table of numbers and add them up regardless of which program it came from (though PDFs are unlikely to pan out, because the tabular structure is typically just not encoded in the file).

On macOS, there’s AppleScript which can, I believe, achieve some of these sorts of things using accessibility APIs and similar. I’m not familiar with the extent of its capabilities as I don’t use a Mac.
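
A rough sketch of the idea on Windows, using pywinauto's UI Automation backend; the window title and control layout here are hypothetical and depend entirely on the target app:

    # Rough sketch: pulling a table of numbers out of another program via
    # the Windows UI Automation (accessibility) tree, using pywinauto.
    # The window title and control types are hypothetical; real apps vary.
    from pywinauto import Desktop

    win = Desktop(backend="uia").window(title_re=".*Expenses.*")
    cells = win.descendants(control_type="DataItem")

    # Keep whatever cell text parses as a number, then sum it.
    numbers = []
    for cell in cells:
        try:
            numbers.append(float(cell.window_text()))
        except ValueError:
            pass
    print(sum(numbers))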


Sounds like unix to me.


Indeed, it sounds like the author is looking for the UNIX of the 21st century:

* Widely reusable meaning: everything behaves like a file. The types of files we can have are defined by specs: an Image can be described as a PNG file, which every process can understand. A table can be a CSV or an SQLite file. A Conversation can be a maildir folder. We might not have the best descriptions of "things", but we do have something

* Data without borders: if you can read from stdin and write to stdout, you can interact with the data. In fact, joining two tables is a basic task and can be done with join (https://linux.die.net/man/1/join)

* Inherent, ubiquitous programmability: I'm not sure I understand the author's point, but it sounds like the entities in a piece of software are too specific to the program. Again, if every "application", or rather set of utilities, used the filesystem with clearly defined specifications for what the data is, then they could work together

What is not following the UNIX guidelines is definitely the Web and mobile platforms, which the author focuses on. There were some attempts at doing things the UNIX way, like uzbl (https://www.uzbl.org/), where everything is a script away, or ii (https://tools.suckless.org/ii/), which gives a filesystem interface to IRC conversations. Want to parse a message? It's just a string in the filesystem; any script can do it.

There's a reason it didn't work as well as we'd want: in practice it's all clunky and hard to maintain when the alternative is a single, unified application, especially one from a commercial vendor with a lot of cash. The incentives of building FOSS tools that interoperate with each other are not aligned with making money.
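
Going back to the "a table can be a CSV or a SQLite file" bullet above, a sketch of joining two hypothetical CSV files with nothing but the standard library (file and column names are invented, and headers are assumed to be valid SQL identifiers):

    # Sketch: two CSV files treated as tables with shared meaning, joined
    # via SQLite's in-memory engine. File and column names are invented.
    import csv
    import sqlite3

    db = sqlite3.connect(":memory:")
    for name in ("people", "orders"):
        with open(f"{name}.csv", newline="") as f:
            rows = list(csv.reader(f))
        db.execute(f"CREATE TABLE {name} ({', '.join(rows[0])})")
        placeholders = ", ".join("?" * len(rows[0]))
        db.executemany(f"INSERT INTO {name} VALUES ({placeholders})", rows[1:])

    for row in db.execute(
        "SELECT people.name, orders.total FROM people "
        "JOIN orders ON orders.person_id = people.id"
    ):
        print(row)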


A lot of what you are describing exists in Plan 9:

Under Plan 9, UNIX's everything is a file metaphor is extended via a pervasive network-centric filesystem, and the cursor-addressed, terminal-based I/O at the heart of UNIX-like operating systems is replaced by a windowing system and graphical user interface without cursor addressing, although rc, the Plan 9 shell, is text-based.

Source: https://en.wikipedia.org/wiki/Plan_9_from_Bell_Labs

Many of the ideas from Plan 9 were implemented as user space programs on a variety of Unix-like operating systems:

Plan 9 from User Space provides many of the ideas, applications, and services from Plan 9 on Unix-like systems. It runs on FreeBSD (x86, x86-64), Linux (x86, x86-64, PowerPC and ARM), Mac OS X (x86, x86-64, and PowerPC), NetBSD (x86 and PowerPC), OpenBSD (x86 and PowerPC), Dragonfly BSD (x86-64), and SunOS (x86-64 and Sparc).

Source: https://9fans.github.io/plan9port/man/man1/intro.html


Plan 9 has been on my "things I need to try" list for as long as I've known about it, but I've never gotten around to doing it. Maybe it's time to tick some boxes off that list now that a new lockdown has started here.


The "everything is a file" in UNIX is bit of a lie. And plan 9 went on fixing that. UNIX has many files that aren't actually files but special devices that you access with non-standard interface (mainly ioctl). In plan 9 everything truly is a file and all communication happens with read/write system calls.


I might be uninformed, but it seems to me "everything is a file" was more or less true when UNIX was created, and then it evolved outside of the academic garden, especially Linux, which was a hobbyist project and never strove for a cleanly designed, rigorous architecture.

Plan 9 does fix this, and I wish so much it had managed to become a dominant OS. So many things about it seem correct. Maybe it's exactly because it isn't used for real use cases that it can maintain its appearance of good design.


Yes, but it could also be a single program with commands. I think Jef Raskin talked about something like this with Archy. You could also think of the Emacs paradigm extended to non-textual objects (to me, Emacs is like a Lisp machine tuned to work with text).


A list of requirements to help make the OP's vision a reality might be:

1. Separation of concerns (i.e. make it trivial to separate the Data and the View components from the surrounding HTML clutter). Currently it can be a pain to extract data from the DOM.

2. Allow the Data component to express arbitrary data shapes, including at least: 2-D tables, n-dimensional data tables, groups of related tables and schema (for a relational database), sparse data sets, lists, trees, and graphs.

A lisp-like representation of data would provide adaptability to arbitrarily shaped data.

3. Make it easy to use. The webpage coder would write:

    <html>
      <head>
        <data name="data1">...lisp-like-data-structure...</data>
      </head>
      <body>
        <view name="view1" data="data1">...view-code...</view>
      </body>
    </html>

If no view code was provided in the HTML then a default OS/browser View would be used.

4. Ideally HTML would provide native support for the Data and View primitives. Failing that, it could be provided by a script (hosted on a single website, to ensure consistency of interpretation).

5. Ideally the OS would provide support for the Data and View primitives, in the GUI and on the command line. Failing that, these could be provided as user programs.

6. Encourage webpage developers to use the new <data> and <view> primitives by deprecating <table> ;)


I made an analogy in another comment that I think is a really great expression of this problem:

Programs are like houses. They are made of walls.

Traditional UI gives users doorways and windows, but users are not allowed to pass through walls.

Even the most liberal programs that allow users to redecorate or even move walls do not give the user ultimate and immediate freedom.

Say a user wants to make a new room. They can move some walls around and shove the room in the space left over, but where do the doors in that room lead?

In order to make deep refactoring UI changes, the user must undo the careful design that developers gave them.

The ultimate freedom would be for the user to rebuild from scratch, but that's too much work, right?

What if the house was entirely configuration? What if every wall was optional? What if our program was fundamentally just an empty floor with an optional example house built on it?

That's what we almost get with shell commands. That's what we almost get with web browsers. That's what we almost get with Emacs. That's what we almost get with tiling window managers.

I've never seen an ultimate instance of this. I have, however, seen a trend away from it, and that trend is frustrating.

This topic is something that is vitally important to software design, and yet we don't even talk about it. We just keep rolling with the status quo until someone breaks down a wall and becomes a hero.


The two places I’ve seen this working best today:

- Open source data science/scientific computing ecosystems. Notably python, where all the libraries interop seamlessly via numpy/pandas/arrow and Jupyter is the visual coding platform. But also R/tidyverse and Julia.

- Modern “no-code” tools, where the visual coding is Notion/Coda/Bubble, interfaces via Zapier/Integromat/Autocode and data models in Airtable/Sheets. (Many of these tools use the word “block” as part of the UX)

And ofc, we take it for granted but the concept of a “file” is the ultimate building block for applications.

In my experience, commercial disincentives aside, the main trade off for this power/flexibility is the complexity. It is intimidating for new users, and hard to design well for because of the combinatorial explosion of interactions. Users need to be strongly motivated to get over this complexity hump - whereas most users, most of the time want a single happy path. Personally I don’t see this as a negative thing - you are essentially coding best practices into the tool.

As an aside, the instant-feedback coding in Python looks fantastic! It could be a great extension, e.g. for JupyterLab or VS Code.
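
A small taste of that interop, close to the article's "sum the table on the screen" wish: pandas pulling tables straight out of an HTML page (the URL and column name are placeholders; needs pandas plus lxml or beautifulsoup4):

    # Sketch: pandas treating an HTML page as just another table source.
    # The URL and column name below are placeholders.
    import pandas as pd

    tables = pd.read_html("https://example.com/report.html")  # DataFrames
    print(tables[0]["Amount"].sum())

    # The same DataFrame flows unchanged into numpy, matplotlib, sklearn...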


>And ofc, we take it for granted but the concept of a “file” is the ultimate building block for applications.

Yes. I love me some abstraction and building blocks. We constantly think about them because it helps the product be flexible by design in situations we haven't thought about.

Similar abstractions and building blocks or "units of computational thought", although not "perfect": Docker containers, Jupyter notebooks.

>In my experience, commercial disincentives aside, the main trade off for this power/flexibility is the complexity. It is intimidating for new users, and hard to design well for because of the combinatorial explosion of interactions. Users need to be strongly motivated to get over this complexity hump - whereas most users, most of the time want a single happy path. Personally I don’t see this as a negative thing - you are essentially coding best practices into the tool.

I agree. Complexity is not going anywhere; it's just a matter of deciding who inherits it, and at what level. For example, we're building our machine learning platform [0] after years of shipping custom ML products. We have different profiles: some people can move between training a neural network, building custom connectors for esoteric data sources, setting up infrastructure, and running cable. Others live and breathe in a notebook but have trouble setting up a proper environment.

What we do is build a product that handles most things for the latter profile, but gives advanced users the possibility to tweak things. One of the reasons we haven't adopted other products is that they were way too restrictive: point and click, no API, custom abstractions that haven't stood the test of time, etc. Also, in our years of shipping machine learning products to paying enterprise customers, the problems we faced in ML projects were not for lack of snappy CSS or animations. In other words, the products we had seen were solving non-problems for us. We do keep an eye out for products made by people who have actually shipped ML products, though.

That's one of the reasons we built functionality on top of JupyterLab, like near real-time collaboration, scheduling long-running notebooks, and automatic tracking, instead of reinventing the wheel with "Our Way (TM)".

- [0]: https://iko.ai


Computing "prisons", or better call them boundaries, are result of evolution, limited trust and need of control. Agreed, computing systems simulate human social structures.

BTW this article reminds me Windows OLE https://en.m.wikipedia.org/wiki/Object_Linking_and_Embedding




OOP was supposed to help with the issue of programming lock-in, but for the most part it has not advanced to the point where you can interchange program pieces. I think we need to go back to the idea that software can be like constructing a building: there are basic building blocks that can be purchased from many vendors; each vendor decides how to improve the blocks, and the architects and engineers decide which blocks to use. Creating software from lines of code is too slow, and ultimately it leads to lock-in and an inability to upgrade. People think this would slow down technological advancement, but at the moment we aren't really advancing. We keep changing programming languages without much advancement in the ultimate result. It's change for the sake of change, leading nowhere.

If you look at advancements in society, you will see that once defined standards are put in place, technology moves forward.


> we need to go back to the idea that software can be like building a building where there are basic building blocks that can be purchased from many vendors and let each vendor decide how to improve the blocks and let the architects and engineers decide what blocks to use

Isn't that just libraries and APIs? There's friction when interfaces aren't standardized, but it's certainly a good deal of the way there.


Yes, very true. They are a big step forward, but standardization is the real key to innovation. It lets society focus its limited resources, as opposed to going all over the place looking for a way forward.


That's true - but standardization comes at a cost. If the standard is not good, or not good enough (even some time later), then it hinders progress. That's why even easy-to-standardize things such as AC power cables are changed by some companies (Apple, OnePlus) to improve charging speed: the standard isn't sufficient.


> but for the most part it has not advanced to the point where you can interchange program pieces

Hasn't it? I have no trouble replacing, e.g., a hash map or container with another implementation in C++. We live in an era where we have libraries for everything, and it takes seconds (okay, sometimes minutes) to introduce them into a codebase and swap them. What more do you want?


C++ libraries are baked in, broken, or missing. C++ class hierarchies are not objects. To me, objects are like biological cells that can poly-transform and communicate on the fly. Both libraries and class hierarchies are not just jails, but dependency hells.

"Want" and "more" are the problems. What do we need?

Everything including functions can be abstracted as live data. I want to connect objects live, changing on the fly. The "C++ text -> compile -> link -> install / run" pipeline is way too clunky.


> C++ class hierarchies are not objects.

Who cares?

> Both libraries and class hierarchies are not just jails, but dependency hells.

Sounds like you had some severe trauma. I wonder how much better our profession would be if we had code psychiatrists who could help people move past the bad experiences they had with $LANGUAGE.

> Everything including functions can be abstracted as live data. I want to connect objects live, changing on the fly.

Well, you can use Pure Data or Max/MSP for that, and then when your program ends up being too slow and clunky because of the lack of optimizations, you can mail me to rewrite it in C++, at a cost :-)


the product "notion"

it create a new structure,page and block can be combined freely, i always think "notion" probably is a prototype


I do feel like iOS is moving in this direction.

My photos are in the photos app (and they're not "files"). I can tap share, and send them to WhatsApp / Telegram / Email.

This somewhat unifies the "photo selection and sharing" interface of all applications in one place. It's also good for security since I can deny all these apps access to my library.

Also, in any app, when saving an image it also goes into Photos. There's no filesystem; it goes into dedicated "photos" storage, and I can later find it there to use for whatever (also, copy-pasting images in modern OSes is fabulous).

It's quite clear that the proposal from the article aligns well with slowly shifting away from a "files" mentality. Honestly, I'd love to see a "Photos" app on Linux that handles all my image files, so I can stop using the CLI / file manager to treat them as files (i.e., drop one level of abstraction).


I think, to some extent at least, having an open-to-extension capability like what Julia offers (open multiple-argument dispatch) could help with this. It isn't at loggerheads with the idea of an application, but lets such applications share concepts: app2 can extend app1's concepts by defining more specialized methods. So if Instagram exposed an "image" concept (which might be named AbstractImage in Julia), Photoshop would be able to run with the provided capabilities and specialize a custom notion of image as "paint on canvas", or Illustrator could add "vector drawing" capabilities. Then, depending on how the specialization is done, it would be possible for Instagram to post vector drawings. (Simplifying a bit to illustrate.)

This is a common enough pattern between Julia libraries dedicated to doing very different things, and yet they cooperate beautifully.
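
Python only offers single dispatch, but functools.singledispatch gives a rough feel for the extension pattern described above; every name here is invented for illustration:

    # Rough single-dispatch approximation (Julia's dispatch is richer: it
    # considers all argument types). Every name here is invented.
    from functools import singledispatch

    class RasterImage: ...      # "app1's" image concept
    class VectorDrawing: ...    # "app2's" specialization

    @singledispatch
    def post(image):
        raise TypeError(f"don't know how to post {type(image).__name__}")

    @post.register
    def _(image: RasterImage):
        print("uploading pixels...")

    # A second package can extend the generic function without touching
    # the first one's code:
    @post.register
    def _(image: VectorDrawing):
        print("rasterizing, then uploading...")

    post(VectorDrawing())  # rasterizing, then uploading...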


Microsoft's COM shows that you don't need any magic features in your language - just regular late binding for functions. You can interface with COM components - the one mainstream implementation of what this article is describing - in programs written in C. Because underneath, the whole thing works by asking the OS to give you an array of function pointers in exchange for a UUID.


I'm familiar with COM, and I do think the kind of interoperability you get with open multiple dispatch is of a different character altogether. There isn't a theoretical reason why this has to be true yet, but empirically it has been surprisingly powerful in Julia. I guess Common Lispers would relate to it too.


I believe the only way we can push the state of the art at this point is to replace Linux entirely with some new experimental kernels. Linus will never accept a radical departure from his own design.

We definitely also need new ways to talk about these ideas (or I/we just need to learn them!). Object orientation is a philistine way to group concepts that have matrices of complexity. For example, there are distinctly different functions of code that most languages I've seen don't provide a syntactic way to express. How can I explain to someone that this part of the code is one part of an ordered set of operations tied to a variety of states influenced by a variety of functions, while some other code is idempotent and stateless? And can't our compilers take advantage of this to connect the pieces for the developer?


I've read a lot of discussion about "app barriers", but there's a general confusion about how such a "wall breaking" dream would work.

We say that there are barriers between applications because only enterprises have the ability to develop them, and this unbalanced barrier to entry is the source of the World Wide Web's inequity.

Applications exist for the profit of enterprises, so only enterprises will build applications, according to their own needs and profits. This is the core issue of Web 2.0, and also the key to improving the future network.

That means enabling everyone to build applications that suit their own needs and profits, with the help of Internet tools.

So, going back to basics, decentralizing data is not the point; enabling everyone to build value scenarios with the help of the Internet is the real problem facing the next Internet upgrade.

When we have a clear purpose, all discussions and actions make sense!

Only around a shared goal will all the streams join into a big river and run to the sea!

So how can we empower individuals to build value scenarios?

1. A modular front-end UI design

A modular front end can create all the value scenarios in the world.

2. Create a network of power and stakeholders for everyone

All the benefits and administrative rights of an application must belong to its owner. Enterprises naturally have this management structure in the current network, while network power and ownership structures suitable for everyone still need to be designed.

The design is simple: we just need to create a concept of a "scene" on the network, where the scene is the basic beneficiary of all managerial and monetary power.

3. Place the new structure in a scenario where it can empower people

For enterprise applications, the best scenario is the application market; for individuals, the best scenarios are their own life scenes: social scenes, business scenes, residential scenes.


This is one weird article.

What it wants is standards, but the word is never used.

What it wants is Unix-philosophy-like micro-programs that "Do One Thing And Do It Well", but there's no mention of that either, except a drive-by disparaging of the text-driven interfaces that most of these currently use…


The article also made me think about the Unix philosophy of text pipes.

I think PowerShell is a step further in that direction, as it allows you to pipe typed objects between scripts and cmdlets. It's just not clear to me how to extend that idea to a GUI-centric environment like a smartphone.
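
A low-tech approximation of typed pipes outside PowerShell is to pipe structured records, e.g. JSON lines; a minimal sketch (the field names are invented):

    # Sketch: structured records over a plain pipe, JSON-lines style.
    # Field names are invented. Run as:
    #   python records.py emit | python records.py sum
    import json
    import sys

    if sys.argv[1] == "emit":
        for rec in [{"name": "a", "size": 3}, {"name": "b", "size": 7}]:
            print(json.dumps(rec))
    elif sys.argv[1] == "sum":
        print(sum(json.loads(line)["size"] for line in sys.stdin))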


A suite of data format standards would achieve all this without having to resort to a (hand-wavy) half-baked notion of a standardized set of data processing modules (aka "applications"). And we're already doing this; I am certain the OP has also modified data created by one application with an entirely distinct other application. Overall, the OP fails to provide a lucid and compelling reason for speculating about a half-baked notion that supposedly solves some poorly defined problem. (Repeat: interop is optimally done via data translation, with loose coupling allowing for ecosystems of applications coupled via data.)


Programs these days, dating back to the arrival of the PC, are roach-motel silos. When you use two or three programs to work on a problem, each program often maintains its own state; on mobile, often its own data files as well.

In the "old days" you could work on a task with a single directory (or tree) holding source code, documentation, email, etc., and your editor, email program, etc. would work fine. Now my email is in its own tree, Slack messages in theirs, bug reports in their own, and of course source code in a source tree. What a massive regression.


Interesting article, but somewhat of a false dichotomy.

> This absolutely hasn't eventuated - once again, applications are the problem. Photoshop's codebase and Instagram's codebase no doubt have sophisticated Image objects defined, but each only exists within its gated prison.

Photoshop's and Instagram's "Image objects" are not what make them good. What makes them good is their "Image object" in the context of their ecosystem (app/platform/userbase/DESIGN/integrations/etc).

A building is (and has always been) the sum of its parts.


This reminds me of Pink from the 90's: https://en.wikipedia.org/wiki/Taligent


PostgreSQL has a feature called Foreign Data Wrappers. It's certainly not what the OP imagines, but it's still pretty cool to query and join external data sources.
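
For the curious, a minimal sketch of the postgres_fdw setup, driven from Python with psycopg2 (server, credentials, and table names are all placeholders):

    # Sketch: querying a remote Postgres table as if it were local via
    # postgres_fdw. Host, credentials, and table names are placeholders.
    import psycopg2

    conn = psycopg2.connect("dbname=local")
    conn.autocommit = True
    cur = conn.cursor()

    cur.execute("CREATE EXTENSION IF NOT EXISTS postgres_fdw")
    cur.execute("""
        CREATE SERVER IF NOT EXISTS remote_sales
          FOREIGN DATA WRAPPER postgres_fdw
          OPTIONS (host 'sales.example.com', dbname 'sales')
    """)
    cur.execute("""
        CREATE USER MAPPING IF NOT EXISTS FOR CURRENT_USER
          SERVER remote_sales OPTIONS (user 'reader', password 'secret')
    """)
    cur.execute(
        "IMPORT FOREIGN SCHEMA public LIMIT TO (orders) "
        "FROM SERVER remote_sales INTO public"
    )

    # The remote table now joins like any local one.
    cur.execute(
        "SELECT c.name, o.total FROM customers c "
        "JOIN orders o ON o.customer_id = c.id"
    )
    print(cur.fetchall())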


I was not aware of this feature. It is REALLY cool... even if I'm struggling to think of a situation where I would prefer it over doing the "joins" in application code, at least for the drivers I looked at when I searched this space. If "smart" drivers already existed for making sense of schemas across platforms (including performance characteristics that Postgres could leverage), I would drop everything and go all in on this.


The first thing that should be considered is security. I'm writing this from a computer that has been hacked over and over, and could very well be caught right now in a man-in-the-middle scheme. Once you've got a serious security model baked in, you can start to imagine other things.


I'm not entirely sure the author understands what software is or how it works.

Photos are stored in standard formats. Your web browser can probably save images that could then be opened in Photoshop. If you want drag and drop, that could be done (browsers don't allow you to drag out, AFAICT). But filters are software, not data. Someone could try to create a standard way to define filters, but then each program would need to understand it. I like the idea of dragging a filter onto an image in any program and having it apply there, but that means the filter has to be some independent piece of code implementing "the image filter API". Would every app need to know how to apply those, or would the UI toolkit have a means of presenting images that all apps use and that knows how to apply filters?

Not only does this require extremely good design of abstractions, it requires a very open system, which seems to go against commercial interests. In the long term I think commercial software has other, similar issues, but we need an alternative means of paying developers before that can change.
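
To illustrate, a hypothetical sketch of what the simplest possible "image filter API" might look like; nothing here is an existing standard:

    # Hypothetical sketch of a shared "image filter API": filters as
    # independent pieces of code that any host program could apply.
    # None of this is an existing standard.
    from typing import Callable, Dict, List

    Image = List[List[float]]  # toy image: rows of grayscale pixels

    FILTERS: Dict[str, Callable[[Image], Image]] = {}

    def register(name: str):
        """Decorator a filter author uses to publish a filter by name."""
        def wrap(fn: Callable[[Image], Image]) -> Callable[[Image], Image]:
            FILTERS[name] = fn
            return fn
        return wrap

    @register("invert")
    def invert(img: Image) -> Image:
        return [[1.0 - p for p in row] for row in img]

    # A host program applies a filter by name, knowing nothing else:
    print(FILTERS["invert"]([[0.0, 0.25], [1.0, 0.5]]))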


Hello! I'm from China and my English is not good, but I have been thinking about this idea for 5 years, so I want to say something.

About this idea, I want to mention:

Data is not the core; value is the core. All the programs on the Internet have barriers, not because there are barriers to data, but because there are barriers to value.

I've written a lot about the cost of developing for the World Wide Web, and only by lowering the barriers to development and allowing more people to participate in building the web's "value programs" can the vision of the Web truly be realized.

The closure of a program is not the closure of its data; it is the closure of the value of the scene, the closure of the power to build scenes. I have a product prototype design.


There are hundreds of libraries and packages for dealing with images, and they can all be mixed and matched together. So maybe Instagram and Photoshop don't participate in that open ecosystem, but that's their choice. If you want to use the concept of `Image`, it is an import statement away.


I've been thinking about these ideas for years. To me it is strange that this seems to be such a hard concept for many to grasp, since the advantages are huge and obvious.

Today some tasks, like mass renaming of files of a certain type, require an extra tool that a casual user in most cases doesn't have, so the task simply isn't doable. This is a pity and wastes a lot of potential/productivity.
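
For what it's worth, the rename itself is only a few lines once you're on the "programming" side of the wall; a sketch with an invented folder and naming scheme:

    # Sketch: the mass rename that's trivial with a little code but out
    # of reach for casual users. Folder and naming scheme are invented.
    from pathlib import Path

    for i, p in enumerate(sorted(Path("holiday_pics").glob("*.jpg"))):
        p.rename(p.with_name(f"greece-2020-{i:03}.jpg"))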

If programs were things you could easily talk to - and I don't mean by using a programming language - then filtering and renaming some files would be easy.

This kind of mechanism would also allow blurring the line between the traditional desktop, the cloud, and AI (something that has been tried before, but failed because the use cases were not compelling). For example, if Microsoft updated Windows in such a way, every user could have some cloud credits for AI image recognition per month. If for some reason you needed to do a lot of image recognition, you would have to pay extra. Which would be okay, since using "more resources" creates costs somewhere, and we as a society agree that someone has to pay for it -> capitalism. This blending of ecosystems and capabilities is where things should be going, but strangely none of the big tech companies seem to pursue such a path.


This would require standards, and then competitors would eat their nice, fat margins.


Why do articles like this keep being written and upvoted? They just keep raising problems ranging from design and technology to politics and economics. Perhaps the author should try to answer his own questions as an exercise.


Most of these problems are obvious to me. I can't solve them all, and can't think of them all – and sometimes I have a good idea about a problem somebody else has raised. More eyes on a problem means more solutions, and eventually somebody might come up with a good one.


I was not aware of this earlier. I vaguely remember seeing some related articles on this matter, but I didn't think about it much after reading. Yeah, there is something thoughtful here.


Look at Prof. Wirth's Oberon project and Jef Raskin's "Humane Interface".


I have pretty good integration of different text-based applications inside of Emacs.


I like where this is going, I have been thinking about similar problems lately and, inspired by a talk from Alan Kay, started writing down a research direction for a solution. I'm looking for feedback and other people interested in this. You can read the whole thing here: https://hackmd.io/kafpxBeqQua_rcrncP14tQ?view

TLDR: I'm thinking of building, on top of the browser, a universal document standard which allows for interactive documents that contain the application you need to render them, but also to get data from them and link them to other documents. I'm thinking of using IPFS as a storage mechanism to get stable links, iframes to securely compose documents (with a bootstrap JavaScript line as the only requirement for any document format), and a message bus system, using the iframe postMessage API, to connect all documents.

I'm in the pre-design phase, there is no code, just thoughts.


As I recall, TempleOS had this neat feature where every function loaded by the OS was available to all other programs. That sounds so powerful.

This is one of the wonderful features of PowerShell: you have the whole .NET ecosystem there to call into.


> Think about adding up some numbers in a tabular structure. That's straightforward with most programming languages. But what if that same table is in a web page, or a mobile app, or a PDF? It's right there on the screen, it's probably encoded as a table in the markup. So the data is there. And yet, we can't query it.

I mean, it sounds like you're describing command-line tools that fetch the data you want, wherever it might exist (in a table, in an image, on the web, across a network link, whatever), and pipe the data you're interested in on the command line, so you can use a whole ecosystem of tools for filtering/transforming/combining/querying, or even your own code.
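
E.g., a tiny filter that sums one column of CSV arriving on stdin, whatever upstream tool scraped it (the column name is passed as an argument):

    # Sketch: a pipeline filter that sums one CSV column from stdin.
    # Usage: some-extractor | python sum_col.py amount
    import csv
    import sys

    column = sys.argv[1]
    reader = csv.DictReader(sys.stdin)
    print(sum(float(row[column]) for row in reader))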

> Currently, "doing programming" is a segregated activity from mainstream computing - separate software, command lines, specialist knowledge, clunky text-driven interfaces.

What's clunky about a text-based interface like a command line, or code written to process simple data in text form, in files, or in a database? It seems like you have an /integration/ and /ingestion/ problem, not a programming or app-design problem. The issue is that, wherever the data lives, you simply need to get it out and transformed into a format that makes it easy to process with the amazing, existing tools that have been around for decades.

The reasons that apps, websites, etc. exist are either that those interfaces aren't built for programmers to consume (i.e. they're for non-technical users, or business people, or some other purpose), or ignorance of the power of the command line, or the personal preferences of the person who designed them.

> How do you build ubiquitous programmability into interfaces without adding clutter or reducing usability?

You don't. You build integrations and ingestion pipelines to move data from wherever it may currently live into a place that is easy for your system to process.

The reason you can't get ubiquitous programmability is that different users/consumers need different things from interfaces, and that's just a simple fact of life. The closest thing we have to something ubiquitous across so many different types of users (and one of the most powerful) is the spreadsheet. But spreadsheets come with tradeoffs as well: if you just need the information and don't want all the surrounding capability, a spreadsheet is overkill; if you need rigid validation and hugely powerful query capabilities, the data needs to be in a database.

> The realization that the software experience is still built on artifacts of computing from the 80s like text-based command lines is a lot less surprising considered within the context of this ongoing decline.

Software is built on these text-based command lines because they work. They're not clunky once you learn them — they're pretty much the best thing anyone's ever done. They're still around because no one has improved on them to a degree significant enough to replace them.

It sounds like this article is proposing a new way to do things, which will just end up being yet another walled garden. It's absolutely preposterous to think you're going to reinvent 60+ years of advancements in computing when the things that have been working, evolving, and constantly improving for at least the last 20 of those 60 years already work incredibly well.

> Climate change has shown us that mere awareness of the situation we are in isn't enough. Actual liberation from disaster requires a bold change of direction and a acknowledgement of shared, public goals beyond the financial.

Climate change taught us this? Wow. Learn your history. There's /always/ a mix of short-term and long-term research going on, there always will be, and while the mix might change a bit, no one part of it has ever completely dried up. Some people and organizations have short-term goals, some have long-term goals, and lots of orgs fall somewhere in between. Innovation only looks like a really inefficient search and a cobbled-together mess in hindsight, where you can look backwards and see, "If only the people 20 years ago had done X, Y, and Z and not wasted time on A, B, and C, we would have gotten to the present 20 years earlier." The major problem with this kind of thinking is that it's only obvious in hindsight, and nobody has the benefit of predicting the future from the present. Is there waste? Sure. But the idea that old stuff is clunky, or terrible, or poorly designed just because you're judging it by modern standards, from a place where you have access to more information than people in the past, is just silly. You're going to wind up creating yet another attempt at "solving" computing once and for all. This kind of silver-bullet thinking is naive at best. No one has invented a silver bullet because either there isn't one, or so much collective learning needs to happen /before/ it can be found that we just need to keep doing the sometimes boring work of trying things and seeing what works. It doesn't feel glorious in the present, but that's what it takes.

The problem with judging the past is that there's always waste, and you never know which part is the waste and which part is going to lead you to a good solution. The searching (researching) is what gets you there, and it's hard work.

> Rare-but-notable efforts like Xerox PARC suffered similar fates, able to fend off the bean-counters for a while but not indefinitely.

This is romanticism, "good old days" thinking. The past was worse in almost every single way, and holding up Xerox PARC so as to imply that modern research orgs aren't probably better in almost every way is a bit foolish. Is modern research flawed? Sure, but so was it in the past.




