I tried this a few years ago; http://canonical.org/~kragen/sw/dev3/colors.html has them as foreground colors and http://canonical.org/~kragen/sw/dev3/colors.2.html has them as background colors. I tested 3-letter words as well as 6-letter words, and used 1 as "l" as well as "I", but I didn't try aghasemi's very productive suggestion of using 5 as S. I don't remember if it didn't occur to me or if I tried it and didn't like the results.
Some of them are pretty #bad (#011 doesn't really look much like "oil") and some, though they read quite well, correspond to awful colors; you might even say, #faeca1 colors. Still, I've made my #bed, #0dd as it may be; now I must #11e in it. I think I've #fed you enough #babb1e for today.
I’d golf with you but I think you got a hole in one there. I didn’t spend any time thinking about how to make the sed more compact. I sorta just translated what I’d already written in bash.
Thanks for this. I'd probably call the original the GNU coreutils version. The linked github also has a sed-only version in the comments. It's instructive to see the different versions.
I just added a sed version as well. I'll have to click through and see how closely it resembles what's in the gist.
bash is actually pretty powerful if you don't mind its baroque syntax. Writing it in POSIX would be a bit more challenging. You could use a case statement for the pattern matching, but I'm not sure about the substitution.
This snippet demonstrates how a number of small tools, each doing its narrow job, strung together via the most trivial interface, produce a non-trivial result.
This composability is still unreachable to the vast majority of GUI tools.
The non-trivial part here is actually the source data (the dict file). It is also its pitfall: after adding 5 for S you should see a litany of plurals. Most dict files (for English anyway), however, seem to omit plural nouns. I guess the logic is that in English most plurals are regular, and the naive algorithm for deriving them from the singular forms (correctly most of the time) is quite trivial.
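For what it's worth, here is a minimal Python sketch of the "5 for S" variant. The names `LEET`, `HEX_LIKE` and `hex_words` are made up for illustration, and the sample list stands in for a real dict file, so treat it as a sketch under those assumptions rather than a translation of the original one-liner.

```python
# Letters that have a digit look-alike; S -> 5 is the addition discussed above.
LEET = str.maketrans("OIS", "015")
# Hex digits A-F plus the letters we are willing to swap for digits.
HEX_LIKE = set("ABCDEFOIS")


def hex_words(words, length=6):
    """Yield '#XXXXXX' codes for words made only of hex-like letters."""
    for word in words:
        word = word.strip().upper()
        if len(word) == length and set(word) <= HEX_LIKE:
            yield "#" + word.translate(LEET)


# Hypothetical sample input; a real run would iterate over the dict file.
sample = ["access", "bodies", "seabed", "zebras", "decade"]
print(list(hex_words(sample)))
```

If the word list does include plurals, "bodies" style entries show up as soon as S is allowed, which is the effect described above.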
While it is a neat trick as a one-liner, I would recommend against doing anything like this in any software that requires maintenance. The code is hard or impossible to follow and has no comments. It is brittle, and only a few people can understand what it really does. A better option would be 10 lines of Python or JavaScript with some comments.
I thought it was trivial to understand, though the comment above it helps a lot, and it's maybe an unfair advantage that I'd done the same thing in pretty much the same way four years ago. It probably depends on your background; I wouldn't write it that way for people who didn't know shell, just like I wouldn't write this comment in English for people who speak only Spanish.
I'm not convinced that it's easier to understand in Python (even though I simplified it a bit, in part because one piece of the Python 3 braindamage was moving string.maketrans to bytes):
import re

def main(words):
    for word in words:
        word = word.strip().upper()
        if re.compile(r'[A-FOI]{6}$').match(word):
            print('#' + word.replace('I', '1').replace('O', '0'))

if __name__ == '__main__':
    main(open('/usr/share/dict/words'))
I think the shell version is clearly better for interactive improvisation, though.
I prefer search with an explicit '^' in the pattern to using match. For a throw-away script I'd probably do this:
import re

is_hex_like = re.compile(r"^[a-foi]{6}$", re.I).search
for word in filter(is_hex_like, open("/usr/share/dict/words")):
    hexword = word.upper().replace("O", "0").replace("I", "1").rstrip()
    print(f"#{hexword}")
findall and multiline mode make it even easier, though at the cost of loading the whole file into memory. For that reason your alternative is probably better.
import re

wordlist = open("/usr/share/dict/words").read()
for word in re.findall(r"^[a-foi]{6}$", wordlist, re.IGNORECASE | re.MULTILINE):
    hexword = word.upper().replace("O", "0").replace("I", "1")
    print(f"#{hexword}")
1. I don't have to remember which implicitly anchors to the start of the string and which doesn't.
2. I prefer the explicitness of '^' (maybe that's just another way of stating (1)).
3. I can use re.M to modify '^' to match at the start of each line on multiline strings, whereas match will still only try the front of the string.
4. The asymmetry of anchoring the front but not the end is weird. Python now has fullmatch, but ugh, just use the pattern for that if you need it.
5. Off the top of my head, I can't think of another language that has a regex function that implicitly anchors the front.
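Points 3 and 4 are easy to check in a REPL. A small sketch using only the standard `re` module; the sample strings and variable names are made up for illustration:

```python
import re

text = "zzz\nfacade\n"

# match() only ever tries position 0, even when re.M redefines '^'.
match_multiline = re.match(r"^facade$", text, re.M)
# search() scans forward, so with re.M it finds the second line.
search_multiline = re.search(r"^facade$", text, re.M)

# match() anchors the front but not the end ...
match_prefix = re.match(r"[a-f]{3}", "abcxyz")
# ... while an explicit '^...$' pattern (or fullmatch) anchors both.
search_anchored = re.search(r"^[a-f]{3}$", "abcxyz")
full = re.fullmatch(r"[a-f]{3}", "abcxyz")

print(match_multiline, search_multiline.group(), match_prefix.group(),
      search_anchored, full)
# prints: None facade abc None None
```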
Hmm, I see. Interesting! I think of regexps as state machines, so I think of the implicit loop to find a starting position as extra complexity, which can give rise to for example performance problems, though it's true that in many languages you can't avoid it.
Comments can be added. Understanding it requires learning the tools. Just like understanding python or javascript requires learning python or javascript. It's not impossible to follow.
I look forward to your improved version that tests against the Cartesian product of /usr/dict/words with itself plus the empty string and maybe some slang words like "bod". I suggest you limit to shortish words before the Cartesian product rather than after.
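A rough sketch of that idea in Python, filtering to short hex-like words before taking the product as suggested. The function name `hex_phrases`, the `extra` parameter, and the sample word list are all made up for illustration; a real run would feed it the dict file.

```python
from itertools import product

HEX_LIKE = set("abcdefoi")
LEET = str.maketrans("OI", "01")


def hex_phrases(words, extra=("bod",), length=6):
    """Return sorted '#XXXXXX' codes built from one or two hex-like words."""
    pool = {w.strip().lower() for w in words} | set(extra)
    # Limit to shortish hex-like words BEFORE the product, not after.
    short = {w for w in pool if 0 < len(w) <= length and set(w) <= HEX_LIKE}
    short.add("")  # the empty string lets single full-length words through
    found = set()
    for first, second in product(short, repeat=2):
        if len(first) + len(second) == length:
            found.add("#" + (first + second).upper().translate(LEET))
    return sorted(found)


# Hypothetical sample input standing in for the dict file.
codes = hex_phrases(["bad", "idea", "cafe", "facade", "zebra", "be", "ace"])
print(codes)
```

Collecting into a set matters here because a six-letter word pairs with the empty string in both orders and would otherwise be reported twice.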
I've noticed that for years in embedded (where we use "Intel HEX" formatted files) but I ascribed it to a field full of eccentric loners doing idiosyncratic things, or some kind of DOS 8.3 brain damage.
Does anyone have a link to a guide on how to write Python or node or rust programs that behave well with bash? Ie. Streaming inputs and outputs and other things I probably don’t know about?
And avoid seek. Pipes are not random access. I once tried to use a python library to convert a file from stdin but it failed on a f.seek(0) the library added 'just in case' in the beginning.
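Not a guide, but a minimal sketch of the pattern in Python, under the assumption that the tool is a line-oriented filter. It reads its input one line at a time, never calls seek() or read() on it, and exits quietly if the downstream program closes the pipe. The names `hexify` and `main` are made up for illustration; the BrokenPipeError handling follows the note in the Python documentation on SIGPIPE.

```python
import io
import os
import sys


def hexify(lines, out):
    """Stream filter: one line in, at most one line out; no seek(), no read()."""
    for line in lines:
        word = line.strip().upper()
        if len(word) == 6 and set(word) <= set("ABCDEFOI"):
            out.write("#" + word.replace("O", "0").replace("I", "1") + "\n")


def main():
    try:
        hexify(sys.stdin, sys.stdout)
        sys.stdout.flush()
    except BrokenPipeError:
        # Downstream (e.g. `head`) closed the pipe early. Point stdout at
        # /dev/null so the interpreter does not complain again at shutdown.
        os.dup2(os.open(os.devnull, os.O_WRONLY), sys.stdout.fileno())
        sys.exit(1)


# Demo on an in-memory stream; a real filter would call main() here instead.
demo_out = io.StringIO()
hexify(io.StringIO("facade\nzebra\noffice\n"), demo_out)
print(demo_out.getvalue(), end="")
```

Because `hexify` only iterates over its input, it works the same on a file, a pipe, or a socket wrapped as a text stream.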
My book Data Science at the Command Line has a chapter about this that scratches the surface and lists some resources in case you want to dive deeper [1]. I can also recommend checking out packages such as Rich [2] and Click [3], if only to get an idea of the possibilities when it comes to creating command-line tools with Python.
This is oddly something that some of the earliest Node interfaces do quite well. (I say “oddly” because Node was mostly promoted early on for network/server use cases.) It’s generally not idiomatic in these days of async/await and Web Streams, but streaming IO was a core async primitive from very early on. 0.1.90 for child processes, unspecified for the main process object so possibly from the first release. Granted the interfaces really show their age in terms of incidental complexity, they’re far from being as simple as their shell equivalents. But as far as behaving well, streaming is solid and there’s a wealth of compatibility affordances depending on how portable your script needs to be.
pager code, probably better. "143" = I love you; but 177427*711773 = what time. I don't miss those days. I never had a pager, and I managed to convince all my friends that they shouldn't, either, by pager bombing them. Pagers are still in use, and they're plaintext over the air so if you live near a place that uses pagers (hospitals still use them, for instance), you can get all the messages in real time. It's the frequency. It's in VHF (iirc) so it goes places microwaves cannot; it's also low bandwidth, so the small spectrum carved out for it is usually enough for hundreds of pagers in the area.
And since there's no real place to mention this elsewhere, there's an HTML color bot on fediverse (botsin.space) that periodically posts two colors that work as complements as foreground and background, and vice versa. I haven't seen it in a while, but our little instance has gotten popular so the feed rate is up near a few hundred posts an hour to sift through.
I like this! I usually try to pick a word/set of words that relates to the subject matter I’m testing, or something off the top of my head when that fails. But ABAD1DEA is a great default for exploratory work.
This is also an 8 character string, which I had wrongly inferred from usage in existing code to be restricted to certain APIs, but I looked it up and it’s evidently part of CSS Color Module Level 4 and has wide browser support. The one-liner could trivially be expanded to support 8-character codes. Not sure how trivial multiple words would be, my gut says “reasonably so but won’t feel quite so reasonable on one line”. Alas I’m on mobile so I’m not gonna try it right now.
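For the record, a Python sketch of the length expansion (not the one-liner itself). CSS hex colors come in 3, 4, 6 and 8 digit forms, so the only change is which lengths are accepted. The function name `hex_codes` and the sample input are made up for illustration; "abadidea" is not a dictionary word and would have to come from a multi-word step.

```python
HEX_LIKE = set("abcdefoi")


def hex_codes(words, lengths=(3, 4, 6, 8)):
    """Yield '#...' codes for hex-like words whose length is a valid CSS hex length."""
    for word in words:
        word = word.strip().lower()
        if len(word) in lengths and set(word) <= HEX_LIKE:
            yield "#" + word.upper().replace("O", "0").replace("I", "1")


# Hypothetical sample input.
sample = ["bad", "cafe", "facade", "abadidea", "decaf", "effaceable"]
print(list(hex_codes(sample)))
```

The 4 and 8 digit forms carry an alpha channel in their last digit or pair, so many of the resulting words will render partly transparent.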
Not sure why this is being called "Bash" one-liner. It will work with many shells. It will run noticeably faster in Dash, for example. Test it yourself. Linux chooses Dash for non-interactive use, like this one-line script, because it is faster than Bash.
How many "professional" programmers even know the difference between BRE, ERE and PCRE?
Perhaps this is why use of regex is so controversial amongst a majority of "professional" programmers. They are trying to use PCRE for every pattern matching task, i.e., even ones where it is not necessary, whether it is within their programming language or with command-line utilities. This "Bash one-liner" is a simple example.
I have reviewed a number of books written about regular expressions and for the most part^1 they focus only on regex as implemented in popular programming languages. That almost invariably is PCRE or some form of PCRE-like pattern matching. There is little distinction, let alone acknowledgment, between PCRE/PCRE-like patterns and anything simpler.
Not being a "professional" programmer, I use regex everyday but I never (intentionally) use PCRE.^2 Too complicated for my tastes, not to mention slow if using backtracking.
1. I recall one older book that did include an incomplete table attempting to show which type of regex was used by various UNIX utilities in addition to what regex was used by popular programming languages of the day.
2. For programs that optionally link to a PCRE library, I re-compile them without it.
I wanted a t-shirt that is the color #FAB; and says #FAB; on it. I thought it'd be a fun one for digital artists, then I found out how hard it would be to get a t-shirt that matches it just right.
"...We used to go to lunch at a place called St Michael’s Alley. According to local legend, in the deep dark past, the Grateful Dead used to perform there before they made it big. It was a pretty funky place that was definitely a Grateful Dead Kinda Place. When Jerry died, they even put up a little Buddhist-esque shrine. When we used to go there, we referred to the place as Cafe Dead. Somewhere along the line, it was noticed that this was a HEX number. I was re-vamping some file format code and needed a couple of magic numbers: one for the persistent object file, and one for classes. I used CAFEDEAD for the object file format, and in grepping for 4 character hex words that fit after “CAFE” (it seemed to be a good theme) I hit on BABE and decided to use it. At that time, it didn’t seem terribly important or destined to go anywhere but the trash can of history. So CAFEBABE became the class file format, and CAFEDEAD was the persistent object format. But the persistent object facility went away, and along with it went the use of CAFEDEAD – it was eventually replaced by RMI...."
I had the distinct pleasure of discovering CAFEBABE myself, in high school (not sure what direction this is dating myself in but I'll risk it), when I went on a tear of opening odd things in a hex editor.
I've been using that as my own alternative to DEADBEEF for years, I had no idea it was part of the official Java spec. Maybe it got lodged in my brain subconsciously at some point.
Acacia is green, and fesses (buttocks in French) is pink. Coocoo is the only red in a surrounding of violets, and sobbed is a transparent-y blue like a tear :)
Using 3 you can get more colors with human readable names, and maybe pick the canonical color for any given word based on some criteria of interestingness.
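A sketch of how that could work in Python, with two loud assumptions: I'm reading "using 3" as "3 may stand in for E" (so each E can stay a hex digit or become 3), and I'm using HSV saturation as a stand-in for "interestingness". The names `variants`, `saturation` and `canonical` are made up for illustration.

```python
import colorsys
from itertools import product


def variants(word):
    """All six-digit spellings of a word, letting each E stay E or become 3."""
    choices = []
    for ch in word.upper():
        if ch == "E":
            choices.append("E3")  # assumption: 3 is an optional stand-in for E
        elif ch == "O":
            choices.append("0")
        elif ch == "I":
            choices.append("1")
        else:
            choices.append(ch)
    return ["".join(p) for p in product(*choices)]


def saturation(code):
    """HSV saturation of a six-digit hex color (without the leading '#')."""
    r, g, b = (int(code[i:i + 2], 16) / 255 for i in (0, 2, 4))
    return colorsys.rgb_to_hsv(r, g, b)[1]


def canonical(word):
    """Pick the most saturated spelling as the word's canonical color."""
    return "#" + max(variants(word), key=saturation)


print(variants("coffee"))
print(canonical("coffee"))
```

Any other scoring function (hue distance from neighbours, contrast against the background, and so on) could be dropped in as the `key`.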