This is where LLMs shine: when you need to dip your toes into a really complex system, but basically just to do one thing with pretty straightforward requirements.
...and apparently waste 3 months doing it wrong thanks to it, without doing anything as basic as "maybe fix your bitrate; it's far higher than any game-streaming site uses, and that's for video games, content with much more motion."
The peak of irony, because you know how these people arrived at their 40 Mbit H264 bitrate, and their ineffective tinkering with it, in the first place: it's guaranteed to be some LLM's expert suggestions. As is often the case, because they had no understanding of the really complex subject matter whatsoever, they were unable to guide the LLM and ended up with... slop. Which then turned into a slop blog post.
God knows what process led them to do video streaming to show their AI agent's work in the first place. Some fool must have put "I want to see video of the agent working" in... and well, the LLM obliged!
>As is often the case, because they had no understanding of the really complex system subject matter whatsoever
Something I want to harp on because people keep saying this:
Video streaming is not complicated. Every youtuber and twitch streamer and influencer can manage it. By this I mean the actual act of tweaking your encoding settings to get good quality for low bitrate.
In 3 months with an LLM, they learned less about video streaming than you can learn from a 12 year old's 10 minute youtube video about how to set up Hypercam2
Millions and millions of literal children figured this out.
Keep this in mind next time anyone says LLMs are good for learning new things!
Video streaming has surprisingly little overlap with video codecs. Once you choose input/output options, there's little to change about the codec. The vast majority of options available to ffmpeg aren't supported in the browser. Streamers don't have options for precisely the same reason OP doesn't have options - you are limited entirely to what the browser supports.
I've built the exact pipeline OP has done - video, over TCP, over WebSockets - precisely because I had to deliver video through a corporate firewall. Wolf, Moonlight and maybe even gstreamer just show they didn't even try to understand what they were doing, and just threw every buzzword into an LLM.
To give you some perspective, 40 Mbps is an incredible amount of bandwidth. Blu-ray is 40 Mbps. This video, in 8K on YouTube, is 20 Mbps: https://www.youtube.com/watch?v=1La4QzGeaaQ
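For concreteness, here's roughly what "fix your bitrate" looks like in practice - a hedged sketch only, assuming ffmpeg with libx264 on an X11 desktop; the specific numbers (30 fps, 4 Mbps cap, veryfast preset) are illustrative knobs, not a recommendation tuned for OP's setup:

    import subprocess

    # Screen capture -> H.264 at a few Mbps instead of 40.
    # Every value here is a placeholder to show which knobs exist.
    cmd = [
        "ffmpeg",
        "-f", "x11grab", "-framerate", "30", "-i", ":0.0",   # grab the desktop
        "-c:v", "libx264", "-preset", "veryfast", "-tune", "zerolatency",
        "-b:v", "4M", "-maxrate", "4M", "-bufsize", "8M",    # cap the bitrate
        "-pix_fmt", "yuv420p", "-g", "60",                   # browser-friendly output
        "-f", "mpegts", "out.ts",
    ]
    subprocess.run(cmd, check=True)

Screen content with mostly static UI compresses extremely well, which is why a single-digit-Mbps cap is usually more than enough for this kind of stream.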
I knew someone who was getting significant public assistance to help get out of homelessness.
The requirements were a nightmare. Your employer had to fill out regular forms, the office administering the program had to fill out regular forms, and when they made mistakes they'd threaten to take away your housing (and the office frequently made mistakes). If you were employed there were perverse incentives... they would reduce your benefits by MORE than you earned, so it only made sense to get a job if the pay would completely disqualify you from the program. It really was torture.
- An error is an event that someone should act on. Not necessarily you. But if it's not an event that ever needs the attention of a person then the severity is less than an error.
Examples: invalid credentials, HTTP 404 Not Found, HTTP 403 Forbidden (all of the HTTP 4xxs, by definition).
It's not my problem as a site owner if one of my users entered the wrong URL or typed their password wrong, but it's somebody's problem.
- A warning is something that A) a person would likely want to know and B) wouldn't necessarily need to act on.
- INFO is for something a person would likely want to know and unlikely needs action.
- DEBUG is for something likely to be helpful.
- TRACE is for just about anything that happens.
- EMERG/CRIT are for significant errors of immediate impact.
- PANIC: the sky is falling, I hope you have good running shoes.
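For reference, Python's stock logging module only ships DEBUG/INFO/WARNING/ERROR/CRITICAL; TRACE and the syslog-style EMERG/PANIC levels aren't built in. A rough sketch of filling out the ladder - the numeric level and the example messages are made up for illustration:

    import logging

    TRACE = 5                              # below DEBUG (10); the number is a choice
    logging.addLevelName(TRACE, "TRACE")
    logging.basicConfig(level=TRACE, format="%(levelname)s %(name)s: %(message)s")

    log = logging.getLogger("app")
    log.log(TRACE, "entering handler with args=%r", {"user": "alice"})
    log.debug("cache miss for key %s", "session:42")
    log.info("user alice logged in")              # someone might want to know
    log.warning("disk usage at %d%%", 85)         # worth knowing, no action needed yet
    log.error("payment provider returned 503")    # someone should act on this
    log.critical("primary database unreachable")  # immediate impact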
If you're logging and reporting on ERRORs for 400s, then your error triage log is going to be full of things like a user entering a password with insufficient complexity or trying to sign up with an email address that already exists in your system.
Some of these things can be ameliorated with well-behaved UI code, but a lot cannot, and if your primary product is the API, then you're just going to have scads of ERRORs to triage where there's literally nothing you can do.
I'd argue that anything that starts with a 4 is an INFO, and if you really wanted to be thorough, you could set up an alert on the frequency of these errors to help you identify if there's a broad problem.
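A minimal sketch of that policy in application code - the helper and thresholds here are illustrative, not any particular framework's API:

    import logging

    logging.basicConfig(level=logging.DEBUG, format="%(levelname)s %(message)s")
    log = logging.getLogger("http")

    def log_response(method: str, path: str, status: int) -> None:
        # 5xx: our fault, always worth a look. 4xx: the caller's fault,
        # only anomalous *rates* matter. Everything else is routine.
        if status >= 500:
            level = logging.ERROR
        elif status >= 400:
            level = logging.INFO
        else:
            level = logging.DEBUG
        log.log(level, "%s %s -> %d", method, path, status)

    log_response("POST", "/login", 401)           # wrong password: INFO, not triage noise
    log_response("GET", "/internal/report", 500)  # this one pages somebody

Alerting then keys off ERROR records individually and off the rate of the INFO-level 4xx lines.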
You already have HTTP logs tracked; you don't need to report these errors twice, once in the HTTP log and once on the backend. You're effectively raising the error to the HTTP server, and its logs are where the errors live. You don't alert on a single HTTP 4xx error because nobody does; you only alert on anomalous numbers of HTTP 4xx errors. You do alert on HTTP 5xx errors because, as "internal" HTTP errors, those are always on you.
In other words, of course you don't alert on errors which are likely somebody else's problem. You put them in the log stream where that makes sense and can be treated accordingly.
The frequency is important and so is the answer to "could we have done something different ourselves to make the request work". For example in credit card processing, if the remote network declines, then at first it seems like not your problem. But then it turns out for many BINs there are multiple choices for processing and you could add dynamic routing when one back end starts declining more than normal. Not a 5xx and not a fault in your process, but a chance to make your customer experience better.
> An error is an event that someone should act on. Not necessarily you.
Personally, I'd further qualify that. It should be logged as an error if the person who reads the logs would be responsible for fixing it.
Suppose you run a photo gallery web site. If a user uploads a corrupt JPEG, and the server detects that it's corrupt and rejects it, then someone needs to do something, but from the point of view of the person who runs the web site, the web site behaved correctly. It can't control whether people's JPEGs are corrupt. So this shouldn't be categorized as an error in the server logs.
But if you let users upload a batch of JPEG files (say a ZIP file full of them), you might produce a log file for the user to view. And in that log file, it's appropriate to categorize it as an error.
4xx is for client side errors, 5xx is for server side errors.
For your situation you'd respond with an HTTP 400 "Bad Request" and not an HTTP 500 "Internal Server Error" because the problem was with the request not with the server.
Counter argument. How do you know the user uploaded a corrupted image and it didn't get corrupted by your internet connection, server hardware, or a bug in your software stack?
You cannot accurately assign responsibility until you understand the problem.
This is just trolling. The JPEG is corrupt if the library that reads it says it is corrupt. You log it as a warning. If you upgrade the library or change your upstream reverse proxy and start getting 1000x the number of warnings, you can still recognize that and take action without personally inspecting each failed upload to be sure you haven't yet stumbled on the one edge case where the JPEG library is out of spec.
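A hedged sketch of exactly that, assuming Pillow as the decoding library (the helper name and wiring are made up for illustration):

    import logging
    from io import BytesIO
    from PIL import Image, UnidentifiedImageError

    log = logging.getLogger("uploads")

    def accept_jpeg(data: bytes, filename: str) -> bool:
        # The site behaved correctly when it rejects a broken file, so this is
        # a WARNING, not an ERROR. What should page someone is a spike in the
        # warning *rate* after a library or proxy change, not any single line.
        try:
            img = Image.open(BytesIO(data))
            img.verify()                  # raises on truncated/corrupt data
            return True
        except (UnidentifiedImageError, OSError, SyntaxError) as exc:
            log.warning("rejected upload %s: %s", filename, exc)
            return False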
Violations of the Computer Fraud and Abuse Act (CFAA) can be either misdemeanors or felonies. It's definitely broad enough that doing so could get you in serious trouble if pursued.
Would it even crash a computer? It would fill up the hard drive, but that would just yield warnings to the user in most operating systems. Chances are they would kill it manually because it would take a long time.
You close the ticket and ping the manager of the nontechnical person submitting the ticket. Then you have a discussion with management about the arrangement and expectations. If it doesn't go well you polish your resume.
1) has the savings such that they aren't a wage slave (applies to any income level)
2) has any dignity
People willingly put themselves in situations where they have no autonomy and no options by living their entire lives paycheck to paycheck or close enough to not make a difference.
That’s a pretty judgmental take. The only people with dignity in your formulation are independently wealthy.
If I stomped out the door as soon as I had to curb my tongue, I would never build the social and reputational capital required to be effective on bigger projects, and those are fun (to me).
We're in tech, and this shit is happening in big tech companies. Yes, there are many of us who are not getting those wages, but everyone who is getting them is independently wealthy.
Regardless, you are not a slave. Have a backbone. If you do not stand up for yourself you make it harder for others to do so. Your actions don't just affect you.
People at ALL wage levels put themselves into wage slavery by setting up their finances in a way where they have very little buffer. Mortgage, car payments, savings levels, and all sorts of things combine so that even being briefly unemployed is a terrible burden.
In that state, you can't say no, you can't stand up for yourself or anyone else, you can't make choices because choices have significant life effects. You don't have to have a $10MM trust fund to escape this. You just have to live far enough under your means that you have the savings and spending profile that allows you "fuck you" privileges.
Everybody living so close to broke all the time makes everything more expensive, particularly the competitive things like housing.
Depends on how big the chilling effect is, no? For example, if a school librarian notices that a colleague in another district loses their job or worse, gets personal threats because of a specific book, they might well remove a book from shelves before it's challenged.
That is not a rebuttal to your point -- I don't have a guess on whether or not the chilling effect is significant. I'm just noting there are follow-on effects to be considered.
My point is we all need to moderate our reactions to things based on actual scale, across the political spectrum rare events are being amplified to make people think they're prevalent disasters and it distorts too many peoples' reality.
There are much worse, much bigger problems, and we need to constantly remind people of how big issues actually are. Book bannings are concerning, but what is the size of the actual impact? I see this issue more as an embarrassment for a handful of schools and boards who are bowing to moralizing fools. People are acting like they're afraid of an escalation to Fahrenheit 451 when we really should be mocking the book banners for their foolishness instead of being afraid of them.
This is far from the only issue suffering from a lack of sense of scale.
It goes far beyond that. The Iowa legislature has already moved to make changes to how libraries work in Iowa as a result of all of the attention these issues are getting here. They're essentially trying to condense the power to the state level instead of at the municipal level, where it belongs. It's a power grab that'll have repercussions that may very well cause the smallest of libraries here to cease existing.
And it all started with people complaining about books in the library.
I don't disagree with the underlying point, I just don't agree that the effects of this particular issue are all that minimal. Mockery only gets you so far when the moralizing fools are, say, serving as Speaker of the House.
Probably also worth asking if this problem is really independent, or if it's a facet of larger, more clearly damaging trends.
Really depends on the size of the school districts. There are districts with more than 100,000 students and there are those with barely 100.
This is like saying "there are only 147 cities and towns who voted for X while 15,000 towns didn't, therefore X is very unpopular" without taking into account how many voters live in those cities/towns.
> (target 80% resource utilization, funny things happen after that — things I don’t quite understand).
The closer you get to 100% resource utilization, the more regular your workload has to become. If you can queue requests and latency isn't a problem, fine, but then you have a batch process and not a live one (obviously not an option for games).
The reason is that live work doesn't come in regular beats; it comes in clusters that scale in a fractal way. If your long-term mean is one request per second, what actually happens is you get five requests in one second, three seconds with one request each, one second with two requests, and five seconds with zero requests (you get my point). "Fractal burstiness."
You have to have free resources to handle the spikes at all scales.
Also, very many systems suffer from the processing time for a single request increasing as overall system load increases. "Queuing latency blowup."
So what happens? You get a spike, get behind, and never ever catch up.
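You don't even need fractal arrivals to see it; plain Poisson arrivals into a single worker already blow up as you approach 100%. A toy simulation, not modeled on any particular system:

    import random

    def avg_wait(utilization: float, n_jobs: int = 200_000, service: float = 1.0) -> float:
        # Single server, Poisson arrivals, fixed service time.
        # Returns the mean time a job waits before service starts.
        random.seed(42)
        arrival_rate = utilization / service
        clock = 0.0        # arrival time of the current job
        free_at = 0.0      # when the server next becomes idle
        total_wait = 0.0
        for _ in range(n_jobs):
            clock += random.expovariate(arrival_rate)
            wait = max(0.0, free_at - clock)
            total_wait += wait
            free_at = max(free_at, clock) + service
        return total_wait / n_jobs

    for u in (0.5, 0.8, 0.9, 0.95, 0.99):
        print(f"{u:.0%} utilization -> mean wait ~ {avg_wait(u):.1f}x the service time")

Waits stay small around 50-80% and explode past 90%; add real-world burstiness on top and the "never catch up" failure mode shows up even earlier.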
Yea. I realize I ought to dig into things more to understand how to push past into 90%-95% utilization territory. Thanks for the resource to read through.
You absolutely do not want 90-95% utilization. At that level of utilization, random variability alone is enough to cause massive whiplash in average queue lengths.
The cycle time impact of variability of a single-server/single-queue system at 95% load is nearly 25x the impact on the same system at 75% load, and there are similar measures for other process queues.
As the other comment notes, you should really work from an assumption that 80% is max loading, just as you'd never aim to have a swap file or swap partition of exactly the amount of memory overcommit you expect.
Man, if there's one idea I wish I could jam into the head of anyone running an organization, it would be queuing theory. So many people can't understand that slack is necessary to have quick turnaround.
I target 80% utilization because I've seen that figure multiple times. I suppose I should rephrase: I'd like to understand the constraints and systems involved that make 80% considered full utilization. There's obviously something that limits an OS; is it tunable?
Questions I imagine a thorough multiplayer solutions engineer would be curious about; the kind of person who's trying to squeeze as much juice out of the hardware specs as possible.
It might not be the OS, but just statistical inevitability. If you're talking about CPU utilization on Linux, for example, it's not all that unlikely that the number you're staring at isn't "time spent by CPU doing things" but "average CPU run queue length". "100%" then doesn't only mean the CPU gets no rest, but "there's always someone waiting for a CPU to become free". It likely pays off to understand where the load numbers in your tooling actually come from.
Even if that weren't the case, lead times for tasks will always increase with more utilization; see e.g. [1]: If you push a system from 80% to 95% utilization, you have to expect a ~4.75x increase in lead time for each task _on average_: (0.95/0.05) / (0.8/0.2)
Note that all except the term containing ρ in the formula are defined by your system/software/clientele, so you can drop them for a purely relative comparison.
Edit: Or, to try to picture the issue more intuitively: if you're on a highway nearing 100% utilization, you're likely standing in a traffic jam. And if that's not (yet) strictly the case, the probability of a small hiccup creating one increases exponentially.
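To make the relative comparison concrete, this is just the ρ/(1-ρ) term evaluated at a few utilization levels (everything else in the formula cancels when you compare the same system against itself):

    # Relative lead-time factor rho / (1 - rho):
    for rho in (0.50, 0.75, 0.80, 0.90, 0.95, 0.99):
        print(f"rho = {rho:.2f}: factor = {rho / (1 - rho):6.1f}")

    # 0.50 -> 1.0, 0.80 -> 4.0, 0.95 -> 19.0, 0.99 -> 99.0
    # and (0.95/0.05) / (0.80/0.20) = 19 / 4 = 4.75x longer lead times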
> I’d like to understand the constraints and systems involved that make 80% considered full utilization. There’s obviously something that limits a OS; is it tunable?
There are OS tunables, and these tunables will have some measure of impact on the overall system performance.
But the things that make high-utilization systems so bad for cycle time are inherent aspects of a queue-based system that you cannot escape through better tuning, because the cycle-time issues they cause were never due to a lack of tuning in the first place.
If you can tune a system so that what previously would have been 95% loading is instead 82% loading that will show significant performance improvements, but you'd erase all those improvements if you just allowed the system to go back up to 95% loaded.
Hmmm makes sense. Sounds like I may have a misunderstood mental model of resource consumption. I ought to reread https://technology.riotgames.com/news/valorants-128-tick-ser... (specifically the section on “Real World Performance” where the engineer describes tuning) now that I have a better appreciation that they’re not trying to make resource utilization % higher, but instead making available more resources through tuning efforts.
That's a great article you link and it basically notes up front what the throughput requirements are in terms of cores per player, which then sets the budget for what the latency can be for a single player's game.
Now, if you imagine for a second that they managed to get it so that the average game will just barely meet their frame time threshold, and try to optimize it so that they are running right at 99% capacity, they have put themselves in an extremely dangerous position in terms of meeting latency requirements.
Any variability in hitting that frame time would cause a player to bleed over into the next player's game, reducing the amount of time the server had to process that other player's game ticks. That would percolate down the line, impacting a great many players' games just because of one tiny little delay in handling one player's game.
In fact it's reasons like this that they started off with a flat 10% fudge adjustment to account for OS/scheduling/software overhead. By doing so they've in principle already baked-in a 5-8% reduction in capacity usage compared to theoretical.
But you'll notice in the chart that they show from recent game sessions in 2020 that the aggregate server frame time didn't hang out at 2.34 ms (their adjusted per-server target), it actually tended to average at 2.0 ms, or about 85% of the already-lowered target.
And that same chart makes clear why that is important, as there was some pretty significant variability in each day's aggregate frame times, with some play sessions even going above 2.34 ms on average. Had they been operating at exactly 2.34 ms they would definitely have needed to add more server capacity.
But because they were in practice aiming at 85% usage (of a 95% usage figure), they had enough slack to absorb the variability they were seeing, and stay within their overall server expectations within ±1%.
Statistical variability is a fact of life, especially when humans and/or networks are involved, and systems don't respond well to variability when they are loaded to maximum capacity, even if it seems like that would be the most cost-effective.
Typically this only works where it's OK to ignore variability of time, such as in batch processing (where cost-effective throughput is more valuable than low-latency).
One way to think about it is 80% IS full utilization.
The engineering time, the risks of decreased performance, and the fragility of pushing the limit at some point become not worth the benefits of reaching some higher metric of utilization. Even if 80% isn't exactly where that optimum trade-off point sits for your system, it exists somewhere.