
Toyota's hybrids, at least, have valves in the hydraulic system. If everything is working, the driver's pedal is isolated from the physical pistons. Pressing the pedal instead moves a 'stroke simulator' (a cylinder with a spring in it), and the pressure is measured with a transducer. The Brake ECU tries to satisfy as much of the braking demand as possible through regenerative braking, applying the rear brakes to keep the car balanced, and the front brakes if you brake hard enough to demand more braking than the motor can generate or the battery can absorb.

If there's a failure of the electrical supply to the brake ECU, or another fault condition occurs, various valves then revert to their normally-open or normally-closed positions to allow hydraulic pressure from the pedal through to the brake cylinders, and isolate the stroke simulator.

Because the engine isn't constantly running and providing a vacuum that can be used to assist with brake force, the system also includes a 'brake accumulator' and pump to boost the brake pressure.

Reference: https://pmmonline.co.uk/technical/blue-prints-insight-into-t...

I don't know for certain, but I would assume that other hybrids and EVs have similar systems to maximise regenerative braking.


It's a pain in the backside to run on Windows, for two reasons. First, Windows doesn't ship (by default) with a lot of the tools that are preinstalled in most *nix environments. Git for Windows ships half a Cygwin distribution (MSYS2), including Bash, Perl, and Tcl.

Second, Windows doesn't really have a 'fork' API. Creating a new process on Windows is a heavyweight operation compared to *nix, so scripts that repeatedly invoke other commands are sluggish. Converting them to C and calling the plumbing commands in-process has a radical effect on performance.
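
To make the cost difference concrete, here's a minimal sketch (not Git's actual code; the function name and the shell-out via /bin/sh are just for illustration) of spawning one child process on each platform:

  #ifdef _WIN32
  #include <windows.h>

  /* Windows: CreateProcess builds and initialises a whole new process in one
     heavyweight call - there is no cheap fork() equivalent. */
  static int run(char *cmdline)
  {
      STARTUPINFOA si = { sizeof(si) };
      PROCESS_INFORMATION pi;
      if (!CreateProcessA(NULL, cmdline, NULL, NULL, FALSE, 0, NULL, NULL, &si, &pi))
          return -1;
      WaitForSingleObject(pi.hProcess, INFINITE);
      CloseHandle(pi.hThread);
      CloseHandle(pi.hProcess);
      return 0;
  }
  #else
  #include <unistd.h>
  #include <sys/wait.h>

  /* POSIX: fork() is a cheap copy-on-write clone; exec() then replaces the image. */
  static int run(char *cmdline)
  {
      pid_t pid = fork();
      if (pid < 0) return -1;
      if (pid == 0) {
          execl("/bin/sh", "sh", "-c", cmdline, (char *)NULL);
          _exit(127);
      }
      return waitpid(pid, NULL, 0) < 0 ? -1 : 0;
  }
  #endif

  int main(void)
  {
      /* a shell script that shells out repeatedly pays this cost on every command */
      return run("git --version") ? 1 : 0;
  }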

Git for Windows is more of a maintained fork than a real first-class platform.

Also, I believe it's a goal to make it possible to use Git as a library rather than as an executable. That's hard to do if half the logic is in a random scripting language. Library implementations exist - notably libgit2 - but they can never be fully up to date with the original. Search for 'git libification'.

Many IDEs started their Git integration with libgit2, but subsequently fell foul of things that libgit2 can't do or does inconsistently. Therefore they fall back on executing `git` with some fixed-format output.
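
A rough sketch of what that fallback looks like (the function name is made up; popen is the POSIX call, _popen on Windows) - shell out to git and parse one of the output formats it documents as stable, such as `git status --porcelain`:

  #include <stdio.h>

  /* Count files with staged or unstaged modifications by parsing the stable
     porcelain format; the parsing here is deliberately simplistic. */
  int count_modified_files(void)
  {
      FILE *p = popen("git status --porcelain", "r");
      if (!p) return -1;
      char line[4096];
      int modified = 0;
      while (fgets(line, sizeof line, p)) {
          /* the first two characters are the XY status code */
          if (line[0] == 'M' || line[1] == 'M')
              modified++;
      }
      return pclose(p) == -1 ? -1 : modified;
  }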


I don't get why everything needs to be a library. Using the OS to invoke things gets you parallelism and isolation for free. When you need to deal with a complicated combination of parameters to an API, it doesn't end up too different from argument parsing, so you might as well do that instead.

You can still wrap the interface to the executable in a library.


The text that the footnote is attached to is:

"Large Language Models can gall on an aesthetic level because they are IMPish slurries of thought itself, every word ever written dried into weights and vectors and lubricated with the margarine of RLHF." I infer 'IMPish' as meaning 'like Instant Mashed Potato'.

I read that footnote as a somewhat oblique criticism of two LLMs, rather than of the statistic itself - which may indeed have just been fabricated by the LLM, as opposed to being an actual statistic somehow dredged from its training data or pulled from a web search.


Strictly, UK teaspoons are 5 ml and tablespoons 15 ml. The metric tablespoons already used in Europe were probably close enough to half an Imperial fluid ounce for it not to matter for most purposes.

My kids' baby bottles were labelled with measurements in metric (30 ml increments) and in both US and Imperial fluid ounces. The cans of formula were supplied with scoops for measuring the powder, which were also somewhere close to 2 tablespoons/one fluid ounce (use one scoop per 30 ml of water). There are dire warnings about not varying the concentration from the recommended amount, but I assume it's not really about precision to within 1-2% - more about not varying by 10-20%. My kids seem to have survived, anyway.


  Strictly, UK teaspoons are 5 ml and tablespoons 15 ml. 
Well, there's a rabbit hole I wasn't expecting to go down. I knew that Australian tablespoons (20 mL) were significantly different from US tablespoons. I didn't know that UK tablespoons were a whole different beast (14.2 mL), nor did I realize US tablespoons aren't quite 15 mL, even though my tablespoon measures are marked 15 mL. 15 mL is handily 1/16 of a US cup, so it's easy enough to translate to 1/4 cup (4 tbsp) and 1/3 cup (5 tbsp).

https://en.wikipedia.org/wiki/Tablespoon
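
The arithmetic behind those figures, using the standard definitions (US customary cup = 236.59 mL, Imperial fluid ounce = 28.41 mL):

  #include <stdio.h>

  int main(void)
  {
      const double us_cup_ml  = 236.5882365;       /* US customary cup */
      const double us_tbsp_ml = us_cup_ml / 16.0;  /* ~14.79 mL - not quite 15 */
      const double uk_tbsp_ml = 28.4130625 / 2.0;  /* half an Imperial fluid ounce, ~14.21 mL */
      printf("US tbsp %.2f mL, UK tbsp %.2f mL, metric 15 mL, Australian 20 mL\n",
             us_tbsp_ml, uk_tbsp_ml);
      return 0;
  }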


Blame the European regulators who decided that it was no longer necessary to have standard pack sizes.

Pack sizes were regulated in 1975 for volume measures (wine, beer, spirits, vinegar, oils, milk, water, and fruit juice) and in 1980 for weights (butter, cheese, salt, sugar, cereals [flour, pasta, rice, prepared cereals], dried fruits and vegetables, coffee, and a number of other things). In 2007, all of that was repealed - and member states were now forbidden from regulating pack sizes!

I think the rationale was that, since the unit price (price per unit of measurement) was now mandatory to display, consumers would still know which of two different packs on the same shelf was better value. But standard pack sizes don't just make value-for-money comparisons easier, as this article shows.


Ironically, it seems (from memory, I've not researched it deeply) that continental butter has not changed from 250g, whereas the British brands have been the first to move, to 200g. I could understand if they had switched to 225g as essentially a half-pound block, but 200g isn't any closer to a useful Imperial measure than 250g.


It wasn't possible on the 386. Ken Shirriff discusses how the Intel 80386's register file was built at https://www.righto.com/2025/05/intel-386-register-circuitry..... Only four of the registers are built to allow 32-, 16- or 8-bit writes. Reads output the entire register onto the bus and the ALU does the appropriate masking. The twist is the 'high' byte registers (AH, BH, CH, DH), the upper halves of the legacy 16-bit registers - themselves really a legacy of the 8080, and the requirement to be able to directly translate 8080 code opcode-for-opcode. The output of these has to be shifted down 8 bits to be in the right place for the ALU, and then those bits have to be selected.

AMD seem to have decided to regularise the instruction set for 64-bit long mode, making all the registers consistently able to operate as 64-bit, 32-bit, 16-bit, and 8-bit, using the lowest bits of each register. This only occurs if using a REX prefix, usually to select one of the 8 additional architectural registers added for 64-bit mode. To achieve this, the bits that are used to select the 'high' part of the legacy 8086 registers in 32- or 16-bit code (and when not using the REX prefix) are used instead to select the lowest 8 bits of the index and pointer registers.

From the "Intel 64 and IA-32 Architectures Software Developer's Manual":

"In 64-bit mode, there are limitations on accessing byte registers. An instruction cannot reference legacy high-bytes (for example: AH, BH, CH, DH) and one of the new byte registers at the same time (for example: the low byte of the RAX register). However, instructions may reference legacy low-bytes (for example: AL, BL, CL, or DL) and new byte registers at the same time (for example: the low byte of the R8 register, or RBP). The architecture enforces this limitation by changing high-byte references (AH, BH, CH, DH) to low byte references (BPL, SPL, DIL, SIL: the low 8 bits for RBP, RSP, RDI, and RSI) for instructions using a REX prefix."

In 64-bit code there is very little reason at all to be using bits 15:8 of a longer register.
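
For illustration (a sketch, not any particular compiler's output): when 64-bit code does want those bits, the usual idiom is a shift and a narrowing cast, which works for every register, including the REX-only ones.

  #include <stdint.h>

  /* Read what AH used to give you: bits 15:8 of a wider value. */
  static inline uint8_t second_byte(uint64_t x)
  {
      return (uint8_t)(x >> 8);
  }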

This possibly puts another spin on Intel's desire to remove legacy 16- and 32-bit support (termed 'X86S'). It would remove the need to support AH, BH, CH and DH - and therefore some of the complex wiring from the register file to support the shifting. If that's what it currently does.

Actually, looking at Agner Fog's optimisation tables (https://www.agner.org/optimize/instruction_tables.pdf) it appears there is significant extra latency in using AH/BH/CH/DH, which suggests to me that the processor actually implements shifting into and out of the high byte using extra micro-ops.


> In 64-bit code there is very little reason at all to be using bits 15:8 of a longer register.

I disagree: there only exists BSWAP r32 (and, by the 64-bit extension, BSWAP r64): https://www.felixcloutier.com/x86/bswap

No BSWAP r16 exists. Why? In 32-bit mode, it was not needed, because you could simply use

XCHG r/m8, r8

with, say, cl and ch (to swap the endianness of cx).

In 64-bit mode, you can thus only swap the endianness of a 16-bit value in one instruction for the "old" registers ax, cx, dx, bx. If you want to swap the 16-bit part of one of the "new" registers, you at least have to do a 32-bit logical right shift (SHR) after a BSWAP r32 (EDIT: jstarks pointed out that you could also use ROL r/m16, 8 to do this in one instruction on x86-64). By the way: this solution has a pitfall compared to BSWAP: BSWAP preserves the flags register, while SHR does not.
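
In C terms (just a sketch of the operation being discussed, not a claim about what any particular compiler emits), the 16-bit swap is a rotate by 8, while the 32-bit swap is what BSWAP r32 implements:

  #include <stdint.h>

  /* 16-bit byte swap: equivalent to ROL r/m16, 8 (or XCHG on AL/AH when the
     value happens to live in AX). */
  static inline uint16_t swap16(uint16_t x)
  {
      return (uint16_t)((x >> 8) | (x << 8));
  }

  /* 32-bit byte swap: what BSWAP r32 does in one instruction. */
  static inline uint32_t swap32(uint32_t x)
  {
      return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
             ((x << 8) & 0x00FF0000u) | (x << 24);
  }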


What about ROL r/m16, 8?


This would indeed work (and is likely the better solution), but unlike BSWAP and XCHG, it also changes flags.


In the meantime I also read somewhere that this feature is only available in long mode.


No, SDRAM means Synchronous DRAM, where the data is clocked out of the DRAM chips instead of just appearing on the bus some time after the Column Address Strobe is asserted. Clocking it means that the data doesn't appear before the CPU (or other bus master) is ready to receive it, and that it doesn't disappear before the CPU has read it.

Static RAM (SRAM) is a circuit that retains its data as long as the power is supplied to it. Dynamic RAM (DRAM) must be refreshed frequently. It's basically a large array of tiny capacitors which leak their stored charge through imperfect transistor switches, so a charged capacitor must be regularly recharged. You would think that you would need to read the bit and rewrite its value in a second cycle, but it turns out that reading the value is itself a destructive operation and requires the chip to internally recharge the capacitors.

Further, the chip is organised in rows and columns. Generally there is one Sense Amplifier per column; on each read cycle a whole row of cells discharges into the corresponding Sense Amplifiers, which are then used to recharge that row of cells. The column signals select which Sense Amplifier is connected to the output. So you don't need to read every row and column of a chip, just some column on every row. The Sense Amplifier is a circuit that takes the very tiny charge from the cell and brings it up to a stable signal voltage for the output.

So why use DRAM at all if it has this need to be constantly refreshed? Because the Static RAM circuit requires 4-6 transistors per cell, while DRAM only requires 1. You get close to 4-6 times as much storage from the same number of transistors.
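
A toy model of that arrangement (purely conceptual - the row count, the timing, and the names are all made up) showing why one access per row is enough to refresh the whole chip:

  #include <stdbool.h>

  #define ROWS 512
  #define RETENTION_US 64000UL   /* assume a cell's charge survives roughly 64 ms */

  static unsigned long last_opened_us[ROWS];

  /* Opening a row dumps it into the sense amplifiers, which write it straight
     back - so any access refreshes the entire row as a side effect. */
  static void open_row(int row, unsigned long now_us)
  {
      last_opened_us[row] = now_us;
  }

  static bool row_still_valid(int row, unsigned long now_us)
  {
      return (now_us - last_opened_us[row]) < RETENTION_US;
  }

  /* A refresh pass only has to open each row in turn; the column is irrelevant. */
  static void refresh_all(unsigned long now_us)
  {
      for (int r = 0; r < ROWS; r++)
          open_row(r, now_us);
  }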


The Sinclair ZX80 and ZX81 have static RAM internally, which you wouldn't expect for a) a computer that's designed to be as cheap as possible and b) one that uses a Zilog Z80, which has built-in DRAM refresh circuitry.

The reason is that the designers saved a few chips by repurposing the Z80's refresh circuit as a counter/address generator when generating the video signal. Specifically, it uses the instruction fetch cycle to read the character code from RAM, then it uses the refresh cycle to read the actual line of character data from the ROM. The ZX80 nominally clocks the Z80 at 3.25 MHz, but a machine cycle is four clocks (two for fetch, two for refresh), so it's effectively the same speed as a 0.8125 MHz 6502.

I wrote a long section here about how the ZX80 uses the CPU to generate the screen and the extra logic that involves, but it was getting too long :) The ZX81 is basically just a cost-reduced ZX80 where all the discrete logic chips are moved into one semi-custom chip.

Doing this makes external RAM packs more expensive too. You couldn't use the real refresh address coming from the Z80 because the video generator would be hopping around a small range of addresses in the ROM, rather than covering the whole of RAM (or at least each row of the DRAM). The designer has two options:

1. Use static RAM in the external RAM pack, making it substantially more expensive for the RAM itself;

2. Use DRAM in the external RAM pack, and add extra refresh circuitry to refresh the DRAM while the main computer is using the refresh cycle for its video madness.

I think most RAM packs did the second option.


Most 8-bit CPUs didn't even have a hardware multiply instruction. To multiply on a 6502, for example, or a Z80, you have to add repeatedly. You can multiply by a power of 2 by shifting left, so you can build up larger multipliers by interleaving shifts with adds or subtracts. Although, again, on these earlier CPUs you can only shift by one bit at a time, rather than by a variable number of bits.

There's also the difference between multiplying by a hard-coded value, which can be implemented with shifts and adds, and multiplying two variables, which has to be done with an algorithm.
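
A minimal sketch of that shift-and-add algorithm in C (the function name is just for illustration; on a 6502 or Z80 the same structure would be a loop of single-bit shifts and conditional adds):

  #include <stdint.h>

  /* Multiply two 8-bit values without a hardware MUL: examine the multiplier
     one bit at a time, adding the (shifted) multiplicand whenever the bit is set. */
  uint16_t mul8x8(uint8_t a, uint8_t b)
  {
      uint16_t product = 0;
      uint16_t addend = a;       /* multiplicand, widened so shifts don't overflow */
      while (b != 0) {
          if (b & 1)             /* low bit of the multiplier set? */
              product += addend;
          addend <<= 1;          /* move to the next bit position */
          b >>= 1;
      }
      return product;
  }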

The 8086 did have multiply instructions, but they were implemented as a loop in the microcode, adding the multiplicand, or not, once for each bit in the multiplier. More at https://www.righto.com/2023/03/8086-multiplication-microcode.... Multiplying by a fixed value using shifts and adds could be faster.

The prototype ARM1 did not have a multiply instruction. The architecture does have a barrel shifter, which can shift one of the operands by any number of bits. For a fixed multiplication, it's possible to multiply by a power of two, by (a power of two plus 1), or by (a power of two minus 1) in a single instruction. The latter is why ARM has both a SUB (subtract) instruction, computing rd := rs1 - Operand2, and an RSB (Reverse SuBtract) instruction, computing rd := Operand2 - rs1. The second operand goes through the barrel shifter, allowing you to write an instruction like 'RSB R0, R1, R1, LSL #4', meaning 'R0 := (R1 << 4) - R1', or in other words '(R1 * 16) - R1', or R1 * 15.
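
As an illustration (a sketch, not taken from any particular compiler's output), that is exactly the strength reduction a compiler applies when you multiply by a constant like 15:

  #include <stdint.h>

  /* x * 15 as (x << 4) - x: one RSB with a shifted operand on ARM. */
  static inline uint32_t times15(uint32_t x)
  {
      return (x << 4) - x;
  }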

ARMv2 added the MUL and MLA (MuLtiply and Accumulate) instructions. The ARM2 hardware implementation uses a Booth encoder to multiply 2 bits at a time, taking up to 16 cycles for 32 bits. It can exit early if the remaining bits are all 0s.

Later ARM cores implemented an optional wider multiplier (that's the 'M' in 'ARM7TDMI', for example) that could multiply more bits at a time, and therefore execute in fewer cycles. I believe the ARM7TDMI multiplier handled 8 bits per cycle, completing in up to 4 cycles (again, with early exit). Modern ARM cores can do 64-bit multiplies in a single cycle.


The base RISC-V instruction set does not include hardware multiply instructions. Most implementations do include the M (or related) extensions that provide them, but if you are building a processor that doesn't need multiplication, you don't have to include them.


This is, in some ways, reintroducing something that other source control systems forced on you (and you can see it in one of the videos that Scott linked, about using BitKeeper - Ep.4 Bits and Booze, https://www.youtube.com/watch?v=MPFgOnACULU). The previous tools I used (SourceGear Vault, MS Team Foundation Server) required you to have a separate working tree for each branch - the two were directly tied together. That's sometimes useful if you need to have the two versions running concurrently, but for short-lived topic branches or, as you say, working on multiple topics at the same time, it can be very inconvenient.

Initially it was jarring not to get a different working directory for each branch, but I soon got used to it. Working in the same directory for multiple branches means that untracked files stay around - which can be helpful for things like IDE workspace configuration, which is specific to me and the project, but not to the branch.

You can of course have multiple clones of the repository - even clones of clones - but pushing/pulling branches from one to another is a lot more work than just checking out a branch in a different worktree.

My general working practice now is to keep release versions in their own worktrees, and to use the default worktree (where the .git directory lives) for development on the main branch. That means I don't need to keep re-syncing my external dependencies (node_modules, for example) when switching between working on different releases. But I can still see a good overview of my branches, and everything on the remote, from any worktree.
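
For anyone who hasn't tried it, the setup is just a few commands (the paths and branch name here are only examples):

  git worktree add ../myproject-release-1.2 release-1.2   # check out a release branch in its own directory
  git worktree list                                       # show every checkout and which branch it has
  git worktree remove ../myproject-release-1.2            # clean up when the release is retired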

