You really shouldn't ever run under the C locale unless you need to support older (pre-locale/unicode) software or data of the same vintage. Doing so defeats the entire purpose of having locale support.
You're confusing issues here: locale support lets you parse text. Your "file" example is not relevant, and would be part of something like gettext(3). Yes, gettext can be configured from the locale, but that is a separate feature from what I was talking about.
The locale support is how you automatically handle lexing the input stream, which is why I brought up the character classes that, unfortunately, most people seem to ignore, resulting in broken text support.
If you properly support locales, your program will automatically support the user with LC_ALL="fr_FR.utf8" typing their floats as 3,14.
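To make the 3,14 case concrete, here is a minimal sketch in Python, whose locale module wraps the same C library machinery. The function names adopt_user_locale and parse_user_float are my own, made up for illustration, and the whole thing assumes the fr_FR locale is actually installed on the machine.

```python
import locale


def adopt_user_locale() -> str:
    """Switch to whatever locale the environment requests."""
    try:
        # "" means: consult LC_ALL, LC_NUMERIC, LANG, ... in the environment.
        return locale.setlocale(locale.LC_ALL, "")
    except locale.Error:
        # The requested locale is not installed here; fall back to "C".
        return locale.setlocale(locale.LC_ALL, "C")


def parse_user_float(text: str) -> float:
    """Parse a number written with the current locale's decimal separator."""
    # locale.atof() rewrites the locale's separators before calling float().
    return locale.atof(text.strip())
```

With LC_ALL="fr_FR.utf8" in the environment, calling adopt_user_locale() once at startup should let parse_user_float("3,14") return 3.14. Under the C locale the same input raises ValueError, which is the breakage I am describing.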
The fact that your program expects a field name to be 'file' is unrelated, but is something the user could learn by 1) reading your man page, 2) reading your text output and copying it (if appropriate), or 3) reading the error messages that you should be generating when you see 'fichier="foo.txt"' when you were expecting 'file="foo.txt"'. Note: if you used [:alpha:] to lex the input, you would automatically be able to extract the incorrect word for your error message when a user in LC_CTYPE="ja_JP" enters 'ファイル="foo.txt"'.
I believe the problem here is related to a confusion of lexing with parsing. The various locale features solve the lexing problem, but you still have a parsing problem no matter the syntax of how the data is serialized. You have to check that /[[:alpha:]]+/ was 'file' not 'fichier' or 'ファイル', just the same as you would have to check the output of your XML parser that the <file> tag was not <fichier> or <ファイル>, just the same as you would have to check that a binary field was 0x0003 or whatever the flag value was.
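A rough sketch of that split, again in Python. Python's re module has no POSIX classes, so [^\W\d_] stands in for [[:alpha:]] here as an approximation (it is Unicode-aware regardless of locale and also lets a few non-decimal numeric characters through). The names FIELD, EXPECTED_NAME, lex_field and parse_field are hypothetical.

```python
import re

# Lexing: a run of letters, then ="value". [^\W\d_] approximates "any letter"
# and is only a stand-in for the POSIX class [[:alpha:]].
FIELD = re.compile(r'([^\W\d_]+)="([^"]*)"')

EXPECTED_NAME = "file"  # hypothetical field name, taken from the example above


def lex_field(text: str) -> tuple[str, str]:
    """Split name="value" into its two tokens without judging the name."""
    match = FIELD.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"malformed input: {text!r}")
    return match.group(1), match.group(2)


def parse_field(text: str) -> str:
    """Check the lexed name against what this program expects."""
    name, value = lex_field(text)
    if name != EXPECTED_NAME:
        # The lexer already isolated the offending word, whatever the script.
        raise ValueError(f"unknown field {name!r}, expected {EXPECTED_NAME!r}")
    return value
```

lex_field accepts 'file', 'fichier' and 'ファイル' alike; only parse_field knows that 'file' is the name wanted, and its error message can quote the word the user actually typed.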
You may be looking for a way to automatically discover the semantics (schema) that another program expects - and that would be a nice feature - but that is generally orthogonal to the syntax used for IPC (much like how XML can be converted to YAML/JSON/BSON/etc.). It is also a far more complicated feature, and I'm not sure it can ever really be solved (halting problem, possibly), but maybe a "good enough" solution could be created.