In this article I present a new faster full-range function to calculate whether a year is a leap year. This type of calculation is a fundamental date library utility, it is used all over the place: validating and parsing dates, determining the Nth last weekday of a month, handling month overflow/underflow when adjusting the day, etc.
The article outlines the technique, shows related developments before and after its creation, and includes an interactive, human-verifiable "proof".
The first article in this blog post series has a little section talking briefly about this history, and there's a representation of this that I think sheds a lot of light on the original design. See the heading "Side-Note on Month / Day Determination" in the below link [1].
Displaying the months like the following helps see the regularity at a glance. Columns 1, 3 and 5 are the long months, others being shorter:
+-----+-----+-----+-----+-----+
| 31 | 30 | 31 | 30 | 31 |
| I | II | III | IV | V |
| MAR | APR | MAY | JUN | JUL |
|-----+-----+-----+-----+-----+
| 31 | 30 | 31 | 30 | 31 |
| VI |VII |VIII | IX | X |
| AUG | SEP | OCT | NOV | DEC |
+-----+-----+-----+-----+-----+
| 31 |28/29|
| XI |XII |
| Jan | FEB |
+-----+-----+
> To a person who natively thinks in Roman numerals, remembering that the short months are: II, VII, XII, along with IV & IX would be much easier than the way us modern folks have to memorise it.
A write-up of a new Gregorian date conversion algorithm.
It achieves a 30–40% speed improvement on x86-64 and ARM64 (Apple M4 Pro) by reversing the direction of the year count and reducing the operation count (4 multiplications instead of the usual 7+).
Paper-style explanation, benchmarks on multiple architectures, and full open-source C++ implementation.
I was a bit confused initially about what your algorithm actually did, until I got to the pseudo-code. Ideally there would be a high level description of what the algorithm is supposed to do before that.
Something as simple as: “a date algorithm converts a number of days elapsed since the UNIX epoch (1970-01-01) to a Gregorian calendar date consisting of day, month, and year” would help readers understand what they're about to read.
Thanks, that is a good idea. This was originally a blog post series, and the first article gave a bit of an introduction.
When I started the blog series, I expected the first article to be the most noteworthy, with the 2nd and 3rd being lesser supplementary topics.
Now that the 3rd blog post ended up with a much larger result than I was expecting, it stands on its own and could do with some editing as you suggest.
How would this algorithm change on 16-bit or 8-bit devices? Or does some variety of the traditional naïve algorithm turn out to be optimal in that case? There's quite a bit of microcontroller software that might have to do date conversions, where performance might also matter. It's also worth exploring alternative epochs and how they would affect the calculation.
It might also come into play if developing SIMD alternatives for batch date processing, as one can have more lanes with 16-bit. I plan to make a blog post covering SIMD and if 16-bit algorithms have reasonable performance then that will be covered.
I was fortunate enough to be programming on an ARM based device, which meant that the terms (x * 4 + 3) strongly stood out to me as highly inefficient, being 2 cycle prep for the more important division. On x64 computers, those two operations are calculated in only one operation by using the 'LEA' assembly instruction (which I wasn't aware of at the time), and so others using that type of computer might not have felt this step needed any simplification.
I tried everything under the sun to get rid of these steps. The technique noted in the article of using the year 101 BC was for a long time my strongest candidate, you can view the implementation of that attempt at the link below [1].
An epoch of 101 BC still meant that there was extra work required to re-normalise the timeline after the century calculation, but it was only a single addition of 365 in the calculation of `jul`. The explanation of how this works is probably a whole blog post in itself, but now that this algorithm has been discarded it's not worth the time to explain it fully.
I also had the year-modulus-bitshift technique developed at that time, but it couldn't be integrated cleanly with any of my algorithm attempts yet. My plan was to simply document it as an interesting but slower concept.
I don't know what sparked the idea of going backwards other than immersing myself deeply in the problem in my spare time for about a month. It finally came to me one evening, and I thought it was only going to save 1-cycle, but when it also meant the year-modulus-bitshift could be utilised, the entire thing fit together like a glove and the speed collapsed down from 20% time saving to 40%.
This article dives deep into a topic I find very interesting, even if it’s not immediately useful.
If you don't have time to read, you may want to quickly check "Appendix A" for the “Smoitus” chart to see visually how the timezone alignment systems works, and "Section 5: Equation of Time (Smoital System)" to see how well this system tracks solar time.
Thanks, there is some attempt in the code to auto select US/Canada date format if the user's timezone is in the USA or Canada, but I didn't actually test it to see if it is activated correctly. Any US user's able to chime in if it does?
Even if it works, I'll probably take your suggestion add a tip below the date to clarify the date format for the user's first session.
What do you consider the Canadian format? Most people I know prefer YYYY-MM-DD as it's non ambiguous. Anytime you see XX-XX-YYYY (or XX-XX-XX) it could just as easily be be little endian or weird American.
The intention is to present dates the way they often are found in that country (not just the "gold standard"). From my understanding Canadians often encounter dates in all three formats you noted (although often using named months to avoid ambiguity).
All three date formats are used in Weekle when set to Canadian mode, and fully numeric d/m/y or m/d/y dates are only used if the date is >12 to avoid ambiguity.
The initial popup dialogue shows the default locale setting with a "change" button right next to it, and it can be changed at any time on the settings tab.
Thanks, that's a good point that it does not clearly explain the use of that table. I will update the page soon to explain that the table cell represents the last 2 digits of a year, and the left column is the result of the year calculation for that year. The table has only 28 years in it since any year 28 or higher can be simplified by subtracting 28, 56 or 84.
I recommend only attempting to memorise the table after you are already able to calculate the year number using the normal algorithm.
(The link in the submission is incorrect; apologies for the confusion.)