I said this was my ideal, not that it was a practical way forward for our current software stack.
In my ideal world, your app registers that it needs a “pick weapon #2” button, and a “walk backward” button. Then I can configure my keyboard firmware and/or the low levels of my operating system keyboard handling code to map whatever button I want to those semantic actions.
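To make that concrete, here is a minimal sketch (in Swift, with entirely hypothetical names — no real OS exposes this today) of what the split would look like: the game declares semantic actions, and a separate mapping layer owned by the user decides which physical key triggers each one.

```swift
// Hypothetical API sketch: the app only ever sees semantic actions;
// which physical key fires them is decided elsewhere (user, OS, firmware).
enum GameAction: String {
    case pickWeapon2 = "pick weapon #2"
    case walkBackward = "walk backward"
}

struct InputMapper {
    // Filled in by the user or the OS layer, never hard-coded by the game.
    private var bindings: [String: GameAction] = [:]

    mutating func bind(physicalKey: String, to action: GameAction) {
        bindings[physicalKey] = action
    }

    func action(forPhysicalKey key: String) -> GameAction? {
        bindings[key]
    }
}

var mapper = InputMapper()
mapper.bind(physicalKey: "Digit2", to: .pickWeapon2)
mapper.bind(physicalKey: "KeyS", to: .walkBackward)

if let action = mapper.action(forPhysicalKey: "Digit2") {
    print("dispatching:", action.rawValue)   // "pick weapon #2"
}
```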
The problem with the scheme where the application is programmed to directly look for the “2” key and then pick which semantic meaning to assign based on context is that there is then no way to intercept and disambiguate “put a 2 character in the text box” from “pick weapon #2”.
This isn’t the biggest problem for games, but it’s a huge pain in the ass when, for example, all kinds of desktop applications intercept my standard text-editing shortcuts (either system defaults or ones I have defined myself) and clobber them with their own new commands. Many applications remap command+arrows or option+arrows or similar to mean “switch tab”, but in a text box context I am instead trying to say “move to the beginning of the line” or “move left by one word”, and I am often left with no way to separate the two semantic intentions into separate keystrokes.
The problem is that there is no level at which I can direct a particular button on my keyboard to always mean “move left by one word”... the way things are set up now, I have literally no way to firmly bind a key to that precise semantic meaning. Instead I have to bind the key to an ambiguous keystroke which only sometimes carries that meaning and sometimes means something else entirely.
I can touch-type on a QWERTY keyboard. For the Latin alphabet, I want a QWERTY layout even when I'm typing French or German; even when the keyboard is physically labelled AZERTY or QWERTZ. I type Korean on a 2-Set Hangul layout, but have never used a keyboard that was labelled with that layout. (Aside: typing in this layout does not generate a Unicode codepoint per keypress, but combines two or three letters into a single codepoint).
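To unpack that aside: the 2-Set input method composes the jamo keystrokes into a precomposed syllable using the standard Unicode Hangul arithmetic. A rough Swift sketch of that formula:

```swift
// A composed Hangul syllable is 0xAC00 + (lead * 21 + vowel) * 28 + tail,
// where lead/vowel/tail index the modern jamo tables (tail 0 = no final).
func hangulSyllable(lead: UInt32, vowel: UInt32, tail: UInt32 = 0) -> Character? {
    guard lead < 19, vowel < 21, tail < 28,
          let scalar = Unicode.Scalar(0xAC00 + (lead * 21 + vowel) * 28 + tail)
    else { return nil }
    return Character(scalar)
}

// ㅎ (lead 18) + ㅏ (vowel 0) + ㄴ (tail 4) → 한 (U+D55C):
// three keypresses, one codepoint.
print(hangulSyllable(lead: 18, vowel: 0, tail: 4) ?? "?")
```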
If the keyboard was responsible for deciding all this, I'd need three keyboards and I'd need to carry them around with me whenever I wanted to use somebody else's computer.
> In my ideal world, your app registers that it needs a “pick weapon #2” button, and a “walk backward” button. Then I can configure my keyboard firmware and/or the low levels of my operating system keyboard handling code to map whatever button I want to those semantic actions.
Then the OS / firmware is dealing with questions like, "What button fires the secondary dorsal thrusters?" Does it make sense to handle that kind of question so far from the site where the semantic knowledge is present? No.
Likewise, a web server doesn't provide application-level semantic information in its replies, only protocol-level semantic information. One application might think 301 means "update the bookmark" and 502 means "try again later", but another application might think 502 means "try another server" or that 404 might mean "delete a local file" or "display an error message to the user".
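A tiny sketch of that point (the two client policies here are made up): the status code is protocol-level data, and each application layers its own meaning on top.

```swift
// The protocol only says "301" or "502"; what to *do* about it is the
// application's call. Two hypothetical clients, two policies for the
// same status codes.
enum ClientPolicy { case bookmarkManager, loadBalancingFetcher }

func handle(status: Int, policy: ClientPolicy) -> String {
    switch (policy, status) {
    case (.bookmarkManager, 301):      return "update the stored bookmark"
    case (.bookmarkManager, 404):      return "delete the local entry"
    case (.bookmarkManager, 502):      return "try again later"
    case (.loadBalancingFetcher, 502): return "try another server"
    case (.loadBalancingFetcher, 404): return "display an error message"
    default:                           return "fall back to generic handling"
    }
}

print(handle(status: 502, policy: .bookmarkManager))      // try again later
print(handle(status: 502, policy: .loadBalancingFetcher)) // try another server
```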
Likewise, a game is prepared to deal with buttons, not semantics. The number and layout of buttons are closely tied to design decisions. An FPS gives you WASD, plus QERF for common actions, ZXC for less common actions, 1234 for menus / weapon selections. The design of the game from top to bottom is affected by this. You swap in a controller for a keyboard, and you'll decide to change how weapon selection is presented: maybe spokes around a center so you can use a joystick, instead of items in a row corresponding to numeric buttons. You add auto-aim to compensate for the inaccuracy inherent in gamepad joysticks, but then the vehicle sections become easier. You might even redesign minigames (Mass Effect has a completely different hacking minigame for console and PC versions).
Or look at web browsers. As soon as you hook a touch interface to the web browser, you might want to pop up an on-screen keyboard in response to touch events, but then you need to move the viewport so that you can still see what you're typing.
Input is inherently messy, and you can't pull semantics out of applications because you'll just make the user experience worse.
ON THE OTHER HAND...
> The problem is that there is no level at which I can direct a particular button on my keyboard to always mean “move left by one word”...
This is available through the common UI toolkits. I believe on OS X you can bind a button to mean "move left by one word" in all applications which use the Cocoa toolkit (which is the vast majority of applications on OS X). The way this works is that there are some global preferences which specify a key binding for "moveWordLeft:". The key event, when it is not handled by other handlers, then gets translated into a "moveWordLeft:" method call by NSResponder. The method for configuring these key bindings is relatively obscure; suffice it to say that you can press option+left arrow in almost any application to move left one word, and you can configure the key binding (i.e., choose a different key) across applications on a per-user basis.
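(The per-user file Cocoa consults for these bindings is, if memory serves, ~/Library/KeyBindings/DefaultKeyBinding.dict.) Here's a rough Swift sketch of the receiving side — a hypothetical custom view; NSTextView already does all of this for you — showing how an unhandled key event becomes a semantic moveWordLeft: call:

```swift
import AppKit

// Hypothetical view that opts into Cocoa's key-binding machinery.
final class KeyBindingAwareView: NSView {
    override var acceptsFirstResponder: Bool { true }

    override func keyDown(with event: NSEvent) {
        // Hand the raw event to the key-binding manager, which translates it
        // into a semantic action selector according to the user's bindings.
        interpretKeyEvents([event])
    }

    override func doCommand(by selector: Selector) {
        if selector == NSSelectorFromString("moveWordLeft:") {
            // Whatever key the user bound to moveWordLeft: ends up here.
            print("move left by one word")
        } else {
            super.doCommand(by: selector)
        }
    }
}
```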
"Likewise, a game is prepared to deal with buttons, not semantics. The number and layout of buttons is closely tied with design decisions. A FPS gives you WASD, plus QERF for common actions, ZXC for less common actions, 1234 for menus / weapon selections."
Except on my keyboard layout an FPS should be giving me QSDZ, for the same pattern of movement keys, because I'm French and use AZERTY. Or AOE, because I use Dvorak; ARSW because Colemak...
Not really, but you take my point...
And of course, most games let you redefine your keys anyway, probably largely for this reason. I'm not sure how much this undermines your other points.
I was simplifying; I don't use QWERTY either. Most operating systems provide two ways to identify key presses; let's call them "key codes" and "char codes". So the char codes on a French layout are QDSZ instead of WASD, but the key codes are the same, and the keys are in the same physical location, so it doesn't matter. The only difficult part is figuring out how to present key codes back to the user.
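On macOS, for example, the split looks roughly like this (AppKit; the kVK_ constants come from Carbon's HIToolbox) — a sketch, not production input handling:

```swift
import AppKit
import Carbon.HIToolbox   // for the kVK_* virtual key code constants

enum MoveAction { case forward, left, backward, right }

// NSEvent.keyCode identifies the physical key (same on QWERTY, AZERTY,
// Dvorak...), so a game binds movement to it.
func moveAction(for event: NSEvent) -> MoveAction? {
    switch Int(event.keyCode) {
    case kVK_ANSI_W: return .forward   // the key AZERTY labels Z
    case kVK_ANSI_A: return .left      // the key AZERTY labels Q
    case kVK_ANSI_S: return .backward
    case kVK_ANSI_D: return .right
    default:         return nil
    }
}

// The hard part mentioned above — showing the binding back to the user —
// means asking what the *current* layout says that key produces:
func label(for event: NSEvent) -> String {
    event.charactersIgnoringModifiers?.uppercased() ?? "key \(event.keyCode)"
}
```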