Is there any data that supports this suggestion that users with older devices are actually being discriminated against? (% of users actually using older devices incapable of upgrading to browser versions supported by Cloudflare)
I just find it hard to believe users are actually getting denied access because their devices are old. Surely you can still run new versions of Chrome and Firefox on most things [1].
——————
[1] Don’t get me wrong, I use Safari and I find it inflammatory when a site tells me to use a modern browser because they don’t support Safari (the language more so). But I wouldn’t call it discrimination, seeing as I have the option to run Firefox/Chrome from time to time.
Yeah, my impression was the orthography is pretty consistent compared to English.
From what I understand this isn't the first time they've made some kind of change to orthography; I remember reading something about updating official use of certain kana to reflect more modern pronunciations. It wasn't a dramatic change.
It's interesting that some countries have this centralised influence over something like how their language is written, since they're the main ones speaking it, as opposed to English.
> Yeah, my impression was the orthography is pretty consistent compared to English.
As a native English speaker, I have learned this watching non-natives try to learn English spelling over the years. It is hell! I studied French in middle school and high school. I remember there being a similar level of ambiguity in their orthography (similar to English).
One weird thing that I have noticed when Japanese native speakers write emails in English: Why don't they use basic spell check? I'm talking about stuff as basic as: "teh" -> "the". Spell checkers from the early 1990s could easily correct these issues. To be clear, I rarely have an issue understanding the meaning of their emails (as a native speaker, it is very easy to skip over minor spelling and grammar mistakes), but I wonder: Why not spell check before you send?
> As a native English speaker, I have learned this watching non-natives try to learn English spelling over the years. It is hell! I studied French in middle school and high school. I remember there being a similar level of ambiguity in their orthography (similar to English).
Yes. I think English is even slightly worse than French wrt spelling/sound mismatches, but you can call me biased. Moreover, William the Conqueror, who brought civilization to England, also brought the inconsistencies of French spelling with him.
> I wonder: Why not spell check before you send?
Well, some of my coworkers don't either, even from French to French. And until recently, in most programs it was a bother to switch back and forth between two languages.
But really, it's probably just common laziness; the typos you mention can be caught by proofreading before sending, which can also catch other mistakes like missing words or inconsistent sentences caused by rewrites.
Proofreading right after writing is not the best though, as you tend to skip words because it is "too fresh". I try to introduce some time gap between the two (for instance, proofreading after lunch or the next morning).
Sympathies to the author; it sounds like he's talking about crawlers, although I do write scrapers from time to time. I'm probably not the type of person to scrape his blog, and it sounds like he's gone to lengths to make his API useful; if I've resorted to scraping something it's because I never saw the API, or I saw it and assumed it was locked down and missing a bunch of useful information.
Also, if I'm ingesting something from an API it means writing code specific to that API to ingest it (god forbid I have to get an API token, although in the author's case it doesn't sound like it), whereas with HTML it's often a matter of going to this selector and figuring out which are the landmark headings, which is the body copy and what is noise. That's easier to generalise if I'm consuming content from many sources.
I can only imagine it's no easier for a crawler; they're probably crawling thousands of sites and this guy's website is a pit stop. Maybe an LLM could figure out how to generalise it, but surely a crawler has limited the role of the AI to reading output and deciding which links to explore next. IDK, maybe it is trivial and costless, but the fact it's not already being done suggests it probably requires time and resources to set up, and it might be cheaper to continue to interpret the imperfect HTML.
I have to agree. If it's clear the talk is just someone mindlessly rambling about a topic, it leaves me feeling like my time isn't being valued, and I don't know why I'm spending it listening to this person.
I'm actually using JSDoc types in an app, and I don't mind it.
I chose it because I didn't want to deal with a build step for a smaller project. The project has grown and I am looking at adding a build step for bundling, but I'm still not too worried about using JSDoc over TS.
This might be my config, but one thing that does annoy me is that whenever I define a lambda, I need to add a doc type. I guess if that's disincentivising me from writing lambdas, maybe I should just add a TS compile step lol.
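For illustration, here's a minimal sketch of the kind of annotation I mean (hypothetical code; assumes JS type checking is enabled via // @ts-check or checkJs):

    // Without an annotation the checker treats the lambda's parameters as
    // implicit `any`, so each inline lambda ends up needing a doc comment.

    /** @type {(n: number) => number} */
    const double = (n) => n * 2;

    /**
     * @param {string} name
     * @returns {string}
     */
    const greet = (name) => `hello ${name}`;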
These days you can run TypeScript files as-is (type stripping) for testing in many runtimes, and for bundling, most bundlers support using a ts file as the entry point directly.
yep! However this is a website I made for hosting my own notes in the browser (sorry for the load times; no bundler, so each file is a separate request)
A singleton for me is just making sure you define a single instance of something at the entry point of your app and pass it down via constructor arguments, which is normally what I do.
I understand you can override modules in tests, and you might have a version of your app that runs in a different environment where you replace an HTTP service with a different variant (like an RPC service that normally performs HTTP requests, but in one case you want to talk to a specific process, worker or window using message passing).
I really really feel using constructor args is just a lot simpler in most cases. But I think there's a reason why people don't always do this:
1. Sometimes someone is using a framework that imposes a high level of inversion of control where you don't get to control how your dependencies are initialised.
2. (this was me when I was much younger) Someone dogmatically avoids using classes and thinks they can get away with just data and functions; then they realise there is some value in a shared instance of something, and they end up undermining any functional purity. Don't get me wrong, functional purity is great, but there are parts of your app that just aren't going to be functionally pure. And you don't need to go full OOP when you use classes.
----------------------------------------
Here are some examples of what I mean.
Here's one example, where I have an app defined with web components, and I define the components for each part of the app before passing them into the skeleton. It's a simpler app, and I avoid some shared state by having the different landmark components communicate via events.
Here's another example: this app is built of multiple command-line instructions that sometimes have their own versions of services (singletons) with specific configuration; sometimes they are initialised in child processes.
Because this app does a lot of scraping and I also wanted child worker processes to do their own HTTP, I was going to make a variant of my HTTP service that talked to a daemon process tracking state for open connections to specific domains, to avoid opening too many simultaneously and getting blocked by that host.
Updating the code that uses HTTP will be effortless, because it will continue to conform to the same API. I know this because I've already done this multiple times with HTTP scraping in the app.
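As a rough sketch of what I mean (hypothetical names, not the actual app's code): both HTTP variants expose the same method, and the command just receives whichever instance the entry point constructs.

    // Hypothetical sketch: two interchangeable HTTP services and a command
    // that receives one via its constructor (the "singleton" is just the
    // instance the entry point creates and passes down).

    class DirectHttpService {
      async fetchText(url) {
        const res = await fetch(url);
        return res.text();
      }
    }

    class DaemonHttpService {
      constructor(port) {
        this.port = port; // e.g. a node:worker_threads MessagePort
      }
      fetchText(url) {
        // Forward to a parent/daemon process that rate-limits per domain.
        // (Simplified: no request/response correlation.)
        return new Promise((resolve) => {
          this.port.once('message', (msg) => resolve(msg.body));
          this.port.postMessage({ kind: 'fetch', url });
        });
      }
    }

    class ScrapeCommand {
      constructor(http) {
        this.http = http;
      }
      async run(url) {
        const html = await this.http.fetchText(url);
        return html.length; // ...parse html here...
      }
    }

    // The entry point decides which variant gets wired in:
    // new ScrapeCommand(new DirectHttpService()).run('https://example.com');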
The success so far is really just political, and it has largely consisted of shutting down debate and dismissing calls for some kind of cost analysis of what we risk losing by enforcing this.
Whenever someone brings up this stuff, the politicians take the tone that "we won't let anyone get in the way of protecting children", and this is in response to people who in good faith think this can be done better. Media oligopolists love it because it regulates big tech, so they've been happy to platform supporters of the policy as well.
Third spaces won't reappear, because the planning system in most cities shuts anything down the moment someone files a complaint. They get regulated out of existence the moment police express concern that young people might gather there. The planning system (which in NSW/Sydney is the worst) has only gotten worse since the 80s, after the green bans. It was largely put in place to allow for community say in how cities are shaped, which sounds nice, but it's mostly old people with free time who participate, and they don't value third spaces, even if they might end up liking them. They just want to keep things the same and avoid parking getting overly complicated (and this is a stone's throw away from train stations and the CBD).
Third places can be fixed by reforming planning, which is slowly gaining momentum via YIMBY movements, but this social media ban is just not a serious contribution to changing that. If anything, social media phenomena like Pokémon GO contributed more to these third places lighting up.
Governance in Australia is very paternalistic; it's a more high-functioning version of the UK in that sense. I think it might be in part due to the voting system: a winner-takes-all, single-seat-electorate preferential voting system, which has a median-voter bias toward the least controversial candidates.
As a kid I always felt that being in Australia you missed out on a lot of things people got to do in America. That has slowly changed as media and technology have become less bound by borders, but it looks like that's being undone.
It sounds like his main gripe with vibe coding is that it robs you of the satisfaction, as a programmer, of solving the problem. I don't disagree, but the preferences of programmers are not necessarily the main driving force behind these changes. In many cases it's their boss, and their boss doesn't really care.
It's one thing to program as a hobby or in an institutional environment free of economic pressures like academia (like this educator); it's another thing to exist as a programmer outside that.
My partner was telling me her company is now making all their software engineers use ChatGPT Codex. This isn't a company with a great software engineering culture, but it's probably more representative of the median enterprise/non-SV/non-tech-startup employer than people realise.
"The amount being spent on AI data centres not paying off" is a different statement to "AI is not worth investing in". They're effectively saying the portions people are investing are disproportionately large to what the returns will end up being.
It's a difficult thing to predict, but I think there's almost certainly some wasteful competition here. And some competitors are probably going to lose hard. If models end up being easy to switch between and the better model is significantly better than its competitors, than anything invested in weaker models will effectively be for nothing.
But there's also a lot to gain from investing in the right model, even so it's possible those who invested in the winner may have to wait a long time to see a return on their investment and could still possibly over allocate their capital at the expense of other investment opportunities.
I know "next-generation" is just SEO slop, but I'm going to hyper fixate on this for a moment (so feel free to ignore if you're actually interested in Positron).
I think the future of data science will likely be something else, with the advent of WebGPU[1] (which isn't just a web technology) and the current quality/availability of GPUs in end user devices, and a lot of data computation clearly standing to benefit from this.
The real next generation of data science tools will likely involve tools that are GPU first and try to keep as much work in the GPU as possible. I definitely think we'll see some new languages eventually emerge to abstract much of the overhead of batching work but also forces people to explicitly consider when they write code that simply won't run on the GPU, like sequential operations that are nonlinear, nonassociative/noncommutative (like highly sequential operations like processing an ordered block of text).
I think WebGPU is going to make this a lot easier.
That said, I'd imagine for larger compute workloads people are going to continue to stick with large CUDA clusters, as they have more functionality and handle a larger variety of workloads. But on end-user devices there's an opportunity to create tools that allow data scientists to do this kind of work more trivially when they compute their models and process their datasets.
[1] Other compute APIs existed in the past, but WebGPU might be one of the most successful attempts to provide a portable (and more accessible) way to write general GPU compute code. I've seen people say WebGPU is hard, but having given it a go (without libraries) I don't think this is all that true: compared to OpenGL, there are no longer specialised APIs for loading data into uniforms; everything is just a buffer. I wonder if that reputation has more to do with the non-JS bindings for use outside the browser/Node, or the fact that you're forced to consider the memory layout of anything you're loading into the GPU from the start (something that can be abstracted and generalised). In my experience, after my first attempt at writing a compute shader, it's fairly simple IMO. Stuff that has always been complicated in rendering, like text, is still complicated, but at least it's not a state-based API like WebGL/OpenGL.
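As a rough illustration of the "everything is just a buffer" point, here's a minimal compute sketch (hypothetical example, not from any library; assumes a runtime where navigator.gpu is available and top-level await works, e.g. an ES module in a recent browser):

    // Double every element of an array on the GPU with a WGSL compute shader.
    const adapter = await navigator.gpu.requestAdapter();
    const device = await adapter.requestDevice();

    const input = new Float32Array([1, 2, 3, 4]);

    // Input, output and intermediate data are all plain buffers.
    const storage = device.createBuffer({
      size: input.byteLength,
      usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_SRC | GPUBufferUsage.COPY_DST,
    });
    device.queue.writeBuffer(storage, 0, input);

    const readback = device.createBuffer({
      size: input.byteLength,
      usage: GPUBufferUsage.MAP_READ | GPUBufferUsage.COPY_DST,
    });

    const module = device.createShaderModule({
      code: `
        @group(0) @binding(0) var<storage, read_write> data: array<f32>;
        @compute @workgroup_size(64)
        fn main(@builtin(global_invocation_id) id: vec3<u32>) {
          if (id.x < arrayLength(&data)) {
            data[id.x] = data[id.x] * 2.0;
          }
        }
      `,
    });

    const pipeline = device.createComputePipeline({
      layout: 'auto',
      compute: { module, entryPoint: 'main' },
    });
    const bindGroup = device.createBindGroup({
      layout: pipeline.getBindGroupLayout(0),
      entries: [{ binding: 0, resource: { buffer: storage } }],
    });

    const encoder = device.createCommandEncoder();
    const pass = encoder.beginComputePass();
    pass.setPipeline(pipeline);
    pass.setBindGroup(0, bindGroup);
    pass.dispatchWorkgroups(Math.ceil(input.length / 64));
    pass.end();
    encoder.copyBufferToBuffer(storage, 0, readback, 0, input.byteLength);
    device.queue.submit([encoder.finish()]);

    await readback.mapAsync(GPUMapMode.READ);
    console.log(new Float32Array(readback.getMappedRange())); // [2, 4, 6, 8]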
It's worth considering what next-gen really would be, but VSCode and its forks will probably dominate for the time being. I recall Steve Yegge predicting that the next IDE to beat would be the web browser, and this was around 2008 or so. That's not quite the reality, but it took about 10-15 years for it to more or less happen, even though there were earlier shots at it, like Atom.
Interesting question. I don't know much about WebGPU, but I'd posit (heh!) that the GPU on the client devices doesn't matter too much since folks will likely be working over the network anyways (cloud-based IDE, coding agent connected to cloud-hosted LLM, etc) and we also have innovations like Modal which allow serverless lambdas for GPUs.
As long as silicon is scarce it would make sense to hoard it and rent it out (pricing as a means of managing scarcity); if we end up in a scenario where silicon is plentiful, then everyone would have powerful local GPUs, using local AI models, etc.
I guess in my mind I was thinking of use cases other than AI, like statistical or hierarchical scientific models, simulations, or ETL work. I also don't know if some of the econometricians I know with a less technical background would even know how to get set up with AWS, and more broadly I feel there are enough folks doing data work in non-tech fields who know how to use Python or R or Matlab for their modelling but likely aren't comfortable with cloud infrastructure, yet might have an Apple laptop with Apple silicon that could improve their iteration loop. Folks in AI are probably more comfortable with a cloud solution.
There are aspects of data science that are iterative, where you're repeatedly running similar computations with different inputs; I think there's some value in shaving off time between iterations.
In my case I have a temporal geospatial dataset with 20+ million properties for each month over several years, each with various attributes. It's in a non-professional setting, and the main motivator for most of my decisions is "because I can, I think it would be fun, and I have a decent enough GPU". While I could probably chuck it on a cluster, I'd like to avoid that if I can help it, and an optimisation done on my local machine would still pay off if I did end up setting up a cluster. There's quite a bit of ETL preprocessing work before I load it into the database, and I think there are portions that might be doable on the GPU. But it's more the computations I'd like to run on the dataset before generating visualisations where I think I could reduce the iteration wait time for plots, ideally to the point where iterations become interactive. There are enough linear operations that you could get some wins with a GPU implementation.
I am keen to see how far I'll get, but worst case scenario I learn a lot, and I'm sure those learnings will be transferrable to other GPU experiments.
TBC, I too did not really mean "AI" (as in LLMs and the like) which is often hosted/served with a very convenient interface. I do include more bespoke statistical / mathematical models -- be it hierarchical, Bayesian, whatever.
Since AWS/etc are quite complicated, there are now a swarm of startups trying to make it easier to take baby steps into the cloud (eg. Modal, Runpod, etc) and make it very easy for the user to get a small slice of that GPU pie. These services have drastically simpler server-side APIs and usage patterns, including "serverless" GPUs from Modal, where you can just "submit jobs" from a Python API without really having to manage containers. On the client side, you have LLM coding agents that are the next evolution in UI frontends -- and they're beginning to make it much much easier to write bespoke code to interact with these backends. To make it abundantly clear what target audience I'm referring to: I imagine they are still mostly using sklearn (or equivalents in other languages) and gradient boosting with Jupyter notebooks, still somewhat mystified by modern deep learning and stuff. Or maybe those who are more mathematically sophisticated but not software engg sophisticated (eg: statisticians / econometricians)
To inspire you with a really concrete example: since Modal has a well documented API, it should be quite straightforward ("soon", if not already) for any data scientist to use one of the CLI coding agents and
1. Implement a decent GPU-friendly version of whatever model they want to try (as long as it's not far from the training distribution i.e. not some esoteric idea which is nothing like prior art)
2. Whip up a quick system to interface with some service provider, wrap that model up and submit a job, and fetch (and even interpret) results.
----
In case you haven't tried one of these new-fangled coding agents, I strongly encourage you to try one out (even if it's just something on the free tier eg. gemini-cli). In case you have and they aren't quite good enough to solve your problem, tough luck for now... I anticipate their usability will improve substantially every few months.