The minor modification would be from `$opts` to `$=opts`.
I'd been using ZSH for a year as if it were Bash with better history and much better completions. The difference mentioned above was the only one I knew about. I kept writing my scripts in Bash and was quite happy with the UX.
I remember reading this article back when I was still using Bash. I finally got the history I like only after switching to ZSH.
You can get decent history in ZSH just by setting the right options. If you want great history, you'll need to write a bit of code. Right now my history works as follows.
History is written to disk after every command. Up and Down keys go through the local history (from the same session). Ctrl+Up and Ctrl+Down go through the shared history (from all sessions). Ctrl+R also uses shared history (it's easy to add another binding for local history but I don't have it). Pressing Up/Ctrl+Up after typing something will go over history entries that have the matching prefix. For example, `git<Up>` will show the last command from the current session that starts with `git`. Here's my config: https://old.reddit.com/r/zsh/comments/bsa224/how_to_setup_a_....
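The behavior described above can be sketched with standard zsh options and widgets. This is a minimal sketch, not the actual config (that's in the linked reddit thread); the `up-local`/`down-local` widget names are mine, and the key codes are terminal-dependent (these are common xterm sequences).

```shell
# Share one history file across sessions, written after every command.
setopt SHARE_HISTORY

# Prefix-aware history search widgets shipped with zsh:
# `git<Up>` finds the last command starting with `git`.
autoload -Uz up-line-or-beginning-search down-line-or-beginning-search
zle -N up-line-or-beginning-search
zle -N down-line-or-beginning-search

# Wrappers (hypothetical names) that restrict the search to the
# current session's local history.
function up-local() {
  zle set-local-history 1
  zle up-line-or-beginning-search
  zle set-local-history 0
}
function down-local() {
  zle set-local-history 1
  zle down-line-or-beginning-search
  zle set-local-history 0
}
zle -N up-local
zle -N down-local

bindkey '^[[A'    up-local                       # Up: local history
bindkey '^[[B'    down-local                     # Down: local history
bindkey '^[[1;5A' up-line-or-beginning-search    # Ctrl+Up: shared history
bindkey '^[[1;5B' down-line-or-beginning-search  # Ctrl+Down: shared history
```

The trick is `zle set-local-history`, which toggles whether history widgets see only the current session's entries; wrapping the stock search widgets with it gives the local/shared split without duplicating any search logic.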
In addition, my history is stored in Git. History from the local machine comes first, and from other machines second. This is fairly easy to do in Bash, too.
There is no listed equivalent of RecordIO. What do people use for high-reliability journals?
When I needed something like RecordIO to store market data, I couldn't find anything, so I implemented https://github.com/romkatv/ChunkIO. I later learned of https://github.com/google/riegeli (a work in progress), which could've saved me a lot of time if only I'd found it earlier. I think ChunkIO is better, though.
I suppose you mean "exactly" in a figurative way. Riegeli is definitely inspired by RecordIO and is meant as a successor to it, but it's not RecordIO.
> Is there a reason that doesn't meet your requirements?
I need to store timeseries with fast lookup by timestamp. Riegeli doesn't support this out of the box. If I had discovered it before I built ChunkIO, I probably would've pulled the low-level code out of it and added timeseries support on top. Or maybe not. Reliability is very important to me and it's risky to use work-in-progress software that may or may not have any production footprint (I'm no longer with Google so I don't know if they use it internally.)
I don't understand. RecordIO doesn't support lookup of any kind; it is a linear format. The interface of Riegeli looks to me exactly like the interface to RecordIO. All they've done is remove support for Google's abstract File* storage interface so it can be used by the public.
What you are describing sounds like SSTable. Perhaps you could benefit from LevelDB.
This format looks somewhat underpowered. If one record is corrupted, there is no way to read anything after it. For the same reason there is no lookup/sharding support, such as finding the first record that starts in the second half of the file. If a writer crashes, a new writer cannot append to the existing file without reading its whole content and truncating at the last readable record.
> [...] every element in entries has d_type at offset -1. This can be useful to the callers that need to distinguish between regular files and directories (gitstatusd, in fact, needs this). Note how ListDir() implements this feature at zero cost, as a lucky accident of dirent64_t memory layout.
Yep, every string is potentially an allocation (unless it's short and std::string implements Small String Optimization) plus O(log N) allocations by the vector itself.
C++11 didn't make returning the vector from this function faster because it's written to take advantage of RVO. It did make growing the vector faster, though -- individual strings now get moved instead of copied.
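A sketch of both claims, with a hypothetical `ListNames()` standing in for ListDir's shape and a `Probe` type that counts copies vs moves during vector growth:

```cpp
#include <string>
#include <utility>
#include <vector>

// Stand-in for ListDir's result shape: build a vector locally and
// return it by value. With a single named return object, NRVO
// constructs `names` directly in the caller's slot -- no copy,
// even before C++11 (compilers elided the copy then, too).
std::vector<std::string> ListNames() {
  std::vector<std::string> names;
  names.push_back("alpha");
  names.push_back("beta");
  names.push_back("gamma");
  return names;
}

// Counts how elements travel when a vector reallocates.
struct Probe {
  static int copies;
  static int moves;
  std::string s;
  explicit Probe(std::string v) : s(std::move(v)) {}
  Probe(const Probe& o) : s(o.s) { ++copies; }
  // noexcept matters: vector only moves on reallocation when the
  // move constructor cannot throw.
  Probe(Probe&& o) noexcept : s(std::move(o.s)) { ++moves; }
};
int Probe::copies = 0;
int Probe::moves = 0;
```

Pushing a few `Probe`s into a vector and letting it reallocate shows `moves` climbing while `copies` stays at zero -- the C++11 win the comment describes: element relocation during growth uses the move constructor instead of the copy constructor.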
No, I did the opposite. I made sure the disk caches are warm before each benchmark. Since all versions of ListDir are identical in terms of IO demands, warming up caches is an effective way to reduce benchmark variability and to make performance differences of different code versions easier to detect without changing their order on the performance ladder.
I would agree with this sentence if "optimizing" were replaced with "making marketing claims". Optimization is the process of finding the fastest implementation. In the case of ListDir each successive version is faster than the last on a benchmark with warm IO caches, therefore it'll be faster on the same benchmark with cold IO caches (this is not a general claim; I'm talking specifically about ListDir). Benchmarking with warm IO caches is easier (invalidating caches is generally very difficult) and yields more reliable and more actionable results, hence it's better to benchmark with warm caches. It has nothing to do with the average vs worst case.
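The warm-cache methodology can be sketched as a tiny harness (my sketch, not the benchmark actually used for ListDir): run the workload a few untimed times so IO caches are warm, then report the best of N timed runs to cut variability.

```cpp
#include <algorithm>
#include <chrono>
#include <cstdint>
#include <functional>

// Warm-cache micro-benchmark sketch: `warmup` untimed runs to warm
// caches, then the minimum wall time over `reps` timed runs, in ns.
// Taking the minimum discards runs perturbed by scheduling noise.
int64_t BestOfN(const std::function<void()>& workload, int warmup, int reps) {
  for (int i = 0; i != warmup; ++i) workload();  // warm the caches
  int64_t best = INT64_MAX;
  for (int i = 0; i != reps; ++i) {
    auto t0 = std::chrono::steady_clock::now();
    workload();
    auto t1 = std::chrono::steady_clock::now();
    int64_t ns =
        std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count();
    best = std::min(best, ns);
  }
  return best;
}
```

Since every version of ListDir issues the same IO, warming the caches removes the disk from the measurement entirely, so the remaining differences are the CPU-side ones being optimized.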