ztorkelson's comments

It’s concatenation.


> Given a long-term token, can I create short-term, auto-expiring tokens?

I suspect the answer is: yes, via attenuation. :)
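A toy sketch of what attenuation can look like (all names are mine, in the spirit of macaroon/Biscuit-style tokens, not any real library's API): each caveat is folded into an HMAC chain, so a holder can add a restriction such as an expiry but can never remove one.

```python
import hashlib
import hmac

def mint(root_key, caveats):
    # Chain each caveat into the signature; the final MAC does not
    # reveal earlier MACs, so caveats cannot be stripped off.
    sig = root_key
    for c in caveats:
        sig = hmac.new(sig, c.encode(), hashlib.sha256).digest()
    return caveats, sig

def attenuate(token, caveat):
    # Anyone holding a token can derive a *more* restricted one offline.
    caveats, sig = token
    new_sig = hmac.new(sig, caveat.encode(), hashlib.sha256).digest()
    return caveats + [caveat], new_sig

def verify(root_key, token, now):
    # The verifier replays the chain from the root key and checks
    # every caveat (here, just "exp:<unix-time>" expiries).
    caveats, sig = token
    expect = root_key
    for c in caveats:
        expect = hmac.new(expect, c.encode(), hashlib.sha256).digest()
        if c.startswith("exp:") and now > int(c[4:]):
            return False
    return hmac.compare_digest(expect, sig)

long_lived = mint(b"root-secret", [])
short_lived = attenuate(long_lived, "exp:1700000000")
```

The short-lived token verifies before its expiry and fails after it, while the long-lived token it was derived from remains valid.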


Is (non-)determinism really the right concern here? I’m aware that most hash tables do not have generally predictable iteration orders, but I nevertheless understood them to be deterministic.


They are deterministic in the sense that, given the same sequence of operations, iteration order will be the same every time.

However, the order can change after an insertion (if the hash bucket count changes). And if the hash function is randomly keyed, iteration order will differ between instances of the hash table even when the same items are inserted in the same order.

Sometimes you want the order to be consistent every time. Be it insertion order, or some natural order of the items.
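To make that concrete, here is a toy open-addressing hash table (an illustrative sketch, not any real implementation) where a resize alone, with the same keys, changes iteration order:

```python
class ToyHashTable:
    """Minimal open-addressing table with linear probing."""

    def __init__(self, capacity=4):
        self.slots = [None] * capacity

    def insert(self, key):
        i = hash(key) % len(self.slots)
        while self.slots[i] is not None:  # linear probing on collision
            i = (i + 1) % len(self.slots)
        self.slots[i] = key

    def resize(self, capacity):
        old = [k for k in self.slots if k is not None]
        self.slots = [None] * capacity
        for k in old:
            self.insert(k)

    def __iter__(self):
        # Iteration order is bucket order, not insertion order.
        return (k for k in self.slots if k is not None)

t = ToyHashTable(4)
t.insert(6)        # 6 % 4 == 2 -> slot 2
t.insert(2)        # 2 % 4 == 2 -> collides, probes to slot 3
before = list(t)   # [6, 2]
t.resize(8)        # 6 % 8 == 6, 2 % 8 == 2
after = list(t)    # [2, 6] -- same items, different order
```

Same keys, same table, deterministic behavior throughout, yet iteration order flips after the resize.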


> Is (non-)determinism really the right concern here?

A massive one - there is a lot of code that implicitly depends on the order of iteration (think configuration/initialization). The issue can be invisible and reproduce only rarely, and only in production. The iteration order depends on the hashes of the keys plus the capacity of the underlying array.

In Java I advise never using the standard HashMap; prefer LinkedHashMap. In cases where data density is important, both are a terrible fit, since they store entries as nodes rather than in a plain object array (with linear probing).


Well, hard/impossible to predict perhaps. Iteration order can depend on the order things were inserted and deleted and may differ from computer to computer (for example in Julia the hash-based dictionary ordering differs between 32 and 64 bit systems, and might change between versions of Julia, etc - you’d see the same thing with C++ unordered maps, etc, etc).


The same thing was a huge issue in Python until around 3.6, when dicts were changed to preserve insertion order.
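A quick check (insertion order is an implementation detail in CPython 3.6 and a language guarantee from 3.7):

```python
# Dict iteration follows insertion order in Python 3.7+.
d = {}
for key in ["banana", "apple", "cherry"]:
    d[key] = len(key)

assert list(d) == ["banana", "apple", "cherry"]
```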


Is this still true? I understood SHA256 to be faster than SHA512 due to hardware acceleration on current CPUs; dedicated instructions exist for the former but not the latter.


It seems like it's not true on a MacBook with the Apple M1 processor; SHA-256 is now significantly faster. It still seems to be true on a couple of Intel machines I have access to.


Googling, I noticed there seems to be a bug in OpenSSL where it does not use the optimized SHA-512 code on the M1 (but does for SHA-256) - https://github.com/openssl/openssl/issues/14897 - so that might be the explanation.

Also, I think the length of the input matters when comparing SHA-256 vs SHA-512.
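An easy way to check on any given machine (a rough sketch using Python's hashlib, which typically calls into OpenSSL; a serious benchmark would need more care with warm-up and input sizes):

```python
import hashlib
import timeit

def throughput(algo, size=1 << 20, number=20):
    # Hash `number` buffers of `size` bytes; return rough MB/s.
    data = b"\x00" * size
    h = getattr(hashlib, algo)
    seconds = timeit.timeit(lambda: h(data).digest(), number=number)
    return (size * number / seconds) / 1e6

# Results vary by CPU and OpenSSL build; try small sizes too,
# since per-call overhead dominates short inputs.
for algo in ("sha256", "sha512"):
    print(algo, round(throughput(algo)), "MB/s")
```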


Items could be placed at any pixel location within the bounds of the container. The game used the same X and Y (integer) variables which otherwise denoted the item’s location in the world.

This gave players more degrees of freedom to arrange their inventory as compared to a one-dimensional slot-based system.


Sequential UUIDs don’t start at 0. They are a 128-bit composite of two integers: a temporal component in the high-order bits and a random component in the low-order bits.
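A sketch of such a layout (modeled loosely on UUIDv7; version/variant bits omitted for clarity, and the function name is my own):

```python
import os
import time
import uuid

def time_ordered_uuid():
    # 48-bit millisecond timestamp in the high-order bits,
    # 80 random bits below.
    ts = int(time.time() * 1000) & ((1 << 48) - 1)
    rand = int.from_bytes(os.urandom(10), "big")
    return uuid.UUID(int=(ts << 80) | rand)

a = time_ordered_uuid()
b = time_ordered_uuid()
# `a` and `b` sort by creation time first, randomness second.
```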


You appear to suggest that programmers with “rudimentary knowledge of transactions” should prefer lower isolation levels which sacrifice correctness for performance. If anything is “grossly irresponsible” here, it’s that.

Such isolation levels are notoriously difficult to reason about—even for experienced practitioners—and their misuse can and does introduce persistent data anomalies that can be costly to remediate.

Generally speaking, performance issues are significantly easier to diagnose and resolve than data anomalies, and they may be addressed in a targeted fashion as the need arises.

There’s no substitute for thinking. But if I had to prescribe general advice, it’d be this:

(1) When given the choice, select a modern database system which supports scalable and efficient serializable and snapshot transaction isolation levels.

(2) Use serializable isolation, by default, for all transactions.

(3) In the event that your transactions are not sufficiently performant, stop and investigate. Profile the system to identify the bottleneck.

(4) If the bottleneck is contention due to the transaction isolation level, stop. Assess whether the contention is inherent or whether it is incidental to the implementation or data model.

(5a) If the contention is incidental, do not lower the isolation level. Instead, refactor to eliminate the contention point. Congratulations; you are now done.

(5b) Otherwise, lower the isolation level—only for one or more of the transaction(s) in question—by a single step. Carefully assess the anomalies you have now introduced and the ramifications on the system as a whole. Look for other transactions which could intersect concurrently in time and space. Implement compensatory controls as necessary to accommodate the new behavior.

(6) Repeat only as necessary to achieve satisfaction.


Clover | Sunnyvale, CA | ONSITE | Full-Time | Senior Software Engineer (Backend) | USD $180k+

Our Team

We are a small team of experienced software engineers tasked with ensuring that Clover’s rapid growth is sustainable over the long term. Our team solves for cross-cutting non-functional requirements like the security, scalability, and fault tolerance of Clover’s backend services. Together we design and develop the core architectural components, libraries, frameworks, tooling, and distributed systems at the heart of our global payment platform.

Our Work

We recently completed a project to horizontally shard our OLTP cluster, which had grown to 10+ TB in size. Next up is building a fully autonomous service for rebalancing merchant data across the shards to distribute load and eliminate hot spots.

We just finished moving our production infrastructure from private data centers to the public cloud in an effort to streamline our global expansion. Now we’re revisiting our architecture, processes, and tooling in order to better take advantage of the cloud environment.

We are actively working on the design, development, and deployment of data pipeline infrastructure to support richer analytics and reporting for our merchants and internal business needs. Our focus is on its security, scalability, reliability, and performance.

We already have a comprehensive suite of functional unit and integration tests, and are now focused on improving our automated stress tests and supporting infrastructure. That involves building the tools to spin up full production-scale environments, synthesize load, perform fault injection, and to collect, analyze, and surface test results to help drive continual improvement of performance and availability.

Our Stack

  - Java for backend services.
  - Python for integration and stress tests.
  - MySQL for OLTP. Snowflake for OLAP.
  - Kafka for stream processing.
  - Memcached for caching (duh).
  - Redis for ephemeral shared data structures.
  - Wavefront and ELK for operational visibility.
  - Google (GCP) as our cloud service provider.
  - Docker for building containers. Kubernetes for running them.
  - Netty for speaking HTTP, behind HAProxy for load balancing.
This is the Clover of today. You can help shape the Clover of tomorrow.

Contact: zac at clover dot com (and mention you saw this in HN!)

More info: https://www.clover.com/job-post?gh_jid=1461732


Right. I was waiting for the author to get to this point, and much to my surprise they never did.

The takeaway shouldn’t be “strings are hard”. Strings are hard, but that’s not the problem here. The problem is using inappropriate data types for the task at hand, and the takeaway should be that representation matters.

Representing an IP endpoint as a string only really makes sense at the human-computer interface. In general, the first thing a program should do with such a string is convert it into a more suitable data type. And as noted elsewhere, more often than not there will already be a library function to do so.

Strings are especially pernicious because they are ubiquitous: just about anything can be represented as a string, but the operations available on a string representation of an object do not generally correspond to the operations one would want to perform on the object itself. This disparity is the source of many bugs, which the article exemplifies (though falls short of directly addressing).

Similarly problematic is the assumption that converting an object to a string and a string to an object (i.e. `format` and `parse` functions) form a bijection. This is not generally the case. (Wise practitioners might choose such a mapping, though, when the opportunity presents itself.)
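In Python, for instance, the standard ipaddress module provides a proper address type, and it also shows why parse/format is not a bijection; the `parse_endpoint` helper below is a hand-rolled sketch for the host:port split, not a stdlib function:

```python
import ipaddress

def parse_endpoint(s):
    # Split "host:port", handling bracketed IPv6 literals like "[::1]:8080".
    if s.startswith("["):
        host, _, port = s[1:].partition("]:")
    else:
        host, _, port = s.rpartition(":")
    return ipaddress.ip_address(host), int(port)

ip, port = parse_endpoint("[::1]:8080")

# Many distinct strings denote the same address, so parsing
# cannot be injective on strings:
assert ipaddress.ip_address("0:0:0:0:0:0:0:1") == ip
```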


Congratulations on the launch! I’ve heard great things about Outlands and have been meaning to log in and check it out.

It’s been a long time since I was involved in the RunUO community, but as I recall, one of the biggest limiting factors on scalability was activated NPC/AI. Range queries and movement were the two big pieces, so I’m sure your improvements would have a big impact there.

We ran load tests on Hybrid with over 10k clients (not just idle, mind you, but moving and talking and whatever else we were able to throw in to the load generators), and the server was able to keep up just fine. That was on mid-2000s-era hardware too, but then again, RunUO wasn’t built to really take full advantage of multiple cores.

There are a lot of things I’d do differently if I had the opportunity to go back and redesign it from the ground up, but the simple single-threaded concurrency model is not something I’d want to change without great care. For all the scalability problems RunUO had, I think the concurrency model contributed significantly to its approachability, and I’d be very cautious of making any changes which would complect game logic with concurrency control.

I’ve heard quite a few stories (and now I’m hearing a few more) of folks whose path into software development started by tinkering around with RunUO. In fact, a few of my closest friends (and some now colleagues) took that same path. I am filled with a weird mix of pride and abject humility whenever I have the opportunity to see how the project has touched people all over the world, often in ways I could never have anticipated.

Please do share your changes back. I’d love to take a look at them, even after so many years.

-krrios


I'll get those patches out soon. Outlands generally is running much more complex AI with much faster response targets, so in conjunction with the far more detailed map it is stressing RunUO much harder than previous shards. But as noted the primary CPU consumer is definitely the map searching. My changes don't entirely change the algorithm (I've adjusted the sector size), but rather take advantage of more recent C# features that are much friendlier to the JIT and shift several allocations to the stack.

I'd also like to move away from timers for mobiles and simply call a function on a subset of them (sector by sector) each tick. This is advantageous because it groups all of the processing for a set of nearby mobiles in game space together in time, so it should greatly improve the CPU cache hit rate during the map searches. That would also require moving RunUO to a constant tick rate, which I also have patches for.
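A toy model of that scheme (a Python sketch, not RunUO's actual C#; all names and the sector size are mine): on each fixed tick, walk the sectors in order and process every mobile in a sector together, so nearby entities are handled adjacently in time.

```python
from collections import defaultdict

SECTOR_SIZE = 16  # assumed for illustration

class Mobile:
    def __init__(self):
        self.ticks = 0

    def on_tick(self):
        self.ticks += 1

class World:
    def __init__(self):
        self.sectors = defaultdict(list)  # (sx, sy) -> list of mobiles

    def add(self, mobile, x, y):
        self.sectors[(x // SECTOR_SIZE, y // SECTOR_SIZE)].append(mobile)

    def tick(self):
        # One pass per fixed tick: all mobiles in a sector run back to
        # back, keeping their map-search working set hot in cache.
        for key in sorted(self.sectors):
            for mobile in self.sectors[key]:
                mobile.on_tick()

w = World()
m1, m2 = Mobile(), Mobile()
w.add(m1, 5, 5)       # sector (0, 0)
w.add(m2, 100, 100)   # sector (6, 6)
w.tick()
```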

If anything, my changes have made RunUO more single threaded (and eliminated some locks in doing so). This has proven to be faster than some of the previously highly parallel code because the contention was so bad. That's not to say that it couldn't be done in a way that did scale well, but I agree with you that it would put the code out of reach of hobbyists entirely. I think the code today strikes the right balance of approachability and performance. Thanks for all of your effort on this project!

