Sorry for going off topic here but I've had the same experience.
I'm not sure which update improved 4o so greatly but I get better responses from 4o than from o4-mini, o4-mini-high, and even o3.
o4 and o3 have been disappointing lately - they have issues understanding intent, they have issues obeying requests, and it happened multiple times that they forgot the context even though the conversation consisted of only 4 messages without a huge number of tokens.
In terms of chain-of-thought models I prefer DeepSeek over any OpenAI model (4.5 research seems great, but it’s just way too expensive).
It's rather disappointing how OpenAI releases new models that seem incredible, and then, to reduce the cost of running them, they slowly slim these models down until they're just not that good anymore.
No need for the apology, and FYI I broadly agree with everything you say (except about 4.5, which I don't actively disagree with; I just haven't played with it myself).
I don't think things would be very different if Guido were still BDFL. He's second-author of the t-strings PEP, and has been an author of most other major PEPs in recent releases (including the walrus operator, PEG parser, and pattern matching).
It's so fascinating to see so many people, when faced with a simple and useful programming language feature, get all up in arms about how it's the end of the world for that language.
I honestly feel like a lot of people just seem bored and looking for stuff to be mad about.
Someone on reddit [0] mentioned that they updated their device via USB and hadn't encountered any issues.
If that's true, then it might actually have been the previous firmware update that silently bricked the device.
Or maybe Samsung only tests in a controlled lab environment without real-world signal interference.
In any case, it's mind boggling how a multi billion dollar company lacks proper rollout strategies.
I have a pair of Sony WH-1000XM4 headphones, and their app constantly tells me to install the latest firmware update.
After the 20th time I finally agreed - only to be met with the update instructions:
I must perform the update in a place with no other Bluetooth or Wi-Fi devices.
Where on earth would I even have to go to find a place without there being any 2.4GHz signal interference?
I've never been more careful when pressing “Cancel,” making sure I don't accidentally tap “Agree and Continue”.
> Where on earth would I even have to go to find a place without there being any 2.4GHz signal interference?
Unironic answer: most airports. Even small ones will have avionics shops, those avionics shops will have to test Emergency Locator Beacons, and those beacon signals are not meant to escape to the outside world during testing.
Thus, most have Faraday rooms, cages, or just small (2-3 cubic feet) boxes to block signals. I used to work for one of those teeny-tiny companies. Would not recommend working in aviation. That said, knocking on the door and offering to come back with doughnuts if they can help you out when it's not crazy busy feels like less of an insane idea than I'd have expected previously.
Agreed on everything in the first paragraph - the second isn't something I can speak to as a Canadian. Came back to say you forgot the boom-bust cycle and the constant layoffs that come with it. I'd also like to reiterate the stress and (corresponding) responsibility, with, again, the low pay not helping.
Can't say how glad I am to be out of aviation. I will say that it can play well on dating apps though - it can be dressed up to look very nice.
I also have a pair of XM4s. I installed the app briefly when I first got them so I could turn off the voice notifications on connection/mode change, and then immediately uninstalled it and have never needed it again. Why on earth would I want to update the firmware on my perfectly working headphones?
Only the 5th and 6th are to be believed. Every time a manufacturer uses vague descriptions like "security" or "performance" fixes, be wary - they're probably removing perfectly working functionality for "reasons".
If it was something that really added value to the user they would mention it specifically (like on the 5th and 6th items).
I have a Dell laptop that mentioned such vague "improvements". After updating the firmware I couldn't undervolt anymore. Luckily I was able to downgrade.
> Every time a manufacturer uses vague descriptions like "security" or "performance" fixes, be wary - they're probably removing perfectly working functionality for "reasons"
I have a pair of WF-1000XM3s and this is painfully true. ANC was brilliant on these until I naively updated, and whoosh - instantly and grossly degraded ANC. Before the update I could barely hear people talking at a distance, keyboard chatter, city traffic, etc.; now I hear all of it, no matter the app settings.
I wanted to upgrade to the in-ear XM4s, but after this? NEVER again, Sony. At least for portable audio. Instead I got a pair of cheap QCY HT07s (then $28, now ~$20) and was quite surprised by the ANC performance on these: it easily beats the crap out of the XM3s on the latest firmware, and gets close to the previous firmware in audio quality. Which says a lot about Sony "updates".
Actual answer: better ANC. ANC algorithm improvements are one of the more common items I've seen in headphone firmware changelogs. Also, Bluetooth upgrades. I can't remember which, but one of my pairs of headphones gained multipoint support a year or so after release via a software update.
On the Bose 700 headphones there was quite a bit of controversy after many users reported the ANC performance getting worse after an update. This was a few years after the headphones were released, so there were theories of it being intentional degradation to get people to upgrade.
Personally I didn't notice any difference. Bose denied any wrongdoing and seemed to spend real effort on investigating the customer complaints.
Because a version 1.0 of anything predates the power management bugs fixed in 1.28, the massive connection improvement in 1.33, the basic compatibility fix in 1.57, the whole load of problems added in 2.00.00, and the binary signature enforcement added at some point (not real-world examples).
By the way, Sony wearable products make use of their proprietary NN inference library called NNabla, with a free helper GUI app, Neural Network Console for Windows, that can export low-code models as code for Spresense boards. It is apparently used across the brand for tiny and transparent features like on-head detection through accelerometers. Not super related, but just so you know...
The firmware update does fix or cause battery issues, depending on your batch. The WF-1000XM4 changed the battery model (and thus the voltage) it uses, and the firmware was updated to match the new battery model. However, the new firmware did not handle the different battery types correctly and damaged quite a few devices with an incorrect voltage setting. (Some devices were also preloaded with this incorrect config.) There is a firmware update that corrects this setting problem.
How is the audio compression codec[0] negotiated between the phone and the headphones over Bluetooth? IIRC, Sony supports higher quality codecs outside of the standard BT required ones. Is the app required for that negotiation or is it all in the operating system now?
[0] There is no lossless high quality audio over BT, only a bunch of lossy codecs.
IIRC, the app isn't actively involved in bluetooth audio negotiations, but it does allow you to change settings within the headphones around what codecs it will advertise support for and prefer to use. Those settings have reasonable defaults and any changes you make persist on the headphones even if you uninstall the app.
Yeah, I'm not sure why I'd want that on my headphones themselves. I just set it to a neutral EQ during initial setup, and now I change the EQs elsewhere in the audio pipeline (music app, mixer, etc) just like we were all doing before the advent of headphones with their own apps.
None of my headphones have firmware to update. They connect with copper (8000BCE) wires (1830CE) to a 3.5mm jack (1950CE) based on a 1/4" phone plug (1890CE). Some of them use neodymium (1885CE) magnets.
If I want equalization or convolution I apply them upstream shortly after decoding.
The EQ settings should depend on what device you are using to listen - your headphones or your phone's internal speaker - according to their natural response curves.
I don't think major music listening apps will automatically switch your EQ settings based on your listening device. So either you are doing that manually every time you switch devices, or you set your headphone EQ directly.
In any case, the software around this is not clean, and has lots of room for improvement.
My girlfriend had to wear a sleep monitoring device, and the instructions also had stuff to that effect, including putting all phones in airplane mode and unplugging any assistant speaker things you might have. I assume the real purpose of this is to make you actually sleep. But they claimed it was to make the data collect properly...
It’s much more likely just a typical manufacturer trying to avoid liability. It costs them nothing to say "don’t do that", and if it cuts tech support costs by 1%, it's worth it to them.
I love when I last called my cell phone carrier and they asked me to try putting my phone in airplane mode. I said "wouldn't that disconnect the call?", they went "no it will not", and guess what happened when I turned on airplane mode.
yeah, I normally use that trick for other stuff, but I guess I was just especially gullible at that moment. If it really was just some ploy to get me off the line because they just didn't want to talk to me or something, well that's hilarious honestly, I wouldn't even be mad at them for that.
I think it's a ploy. I shared your comment with a friend who works in a call center and agents have been known to resort to shenanigans to get rid of a call (or just take a break). In her specific case they had a combined support/sales line where you get commission for sales but not for support, so if someone is just ranting at you about some issue you have no power to fix, you might be tempted to unplug your phone's Ethernet cable and re-roll for a new call that might be sales instead of support. Could easily be the same thing for cell phone providers, though this was a hotel chain.
The real reason is that Bluetooth is awful for data transmission and the bitrate absolutely plummets when there's crosstalk. I live in an older building with a ton of interference on the 2.4GHz band (WiFi, BT beacons, "smart" appliances) and updating any device over bluetooth is impossible.
The new models actually handle updates much better. The update is way slower (it takes about 1 hour) compared to the old model, but it allows you to continue using the device while it updates. (It probably rate-limits itself?)
Perhaps you could stick the phone and earbuds in a (non-running) microwave. They keep 2.4GHz in just fine, and Faraday cages don't discriminate based on direction.
You might have to line the inner walls with something to prevent the signal from bouncing back? I'm not sure.
Actually doesn't work particularly well. I suspect signal reflections destroy the signal.
You get similar problems in other larger metal boxes, eg caravans. In a caravan, short high data rate packets are transmitted properly, but bigger packets get lost because they interfere with a reflection off an internal wall.
> In any case, it's mind boggling how a multi billion dollar company lacks proper rollout strategies.
Having worked for several billion-dollar companies, I can tell you it's very common. The extremely short answer to why is "silos on silos on silos on silos". Quite often, each team rolls things out however the hell they feel like. And the teams don't have very good people on them. It doesn't have to be this way, but the people at these companies simply don't give a shit about doing it in a better way. Bad leadership ensures it continues.
I don't know why but it makes me happy that vim is getting so much attention from not-so-experienced developers.
Great job OP!
I like how you're not focusing on creating a complete list of commands but instead including only the ones you use and expanding it as you learn more.
Aside: I find it fascinating how you can often deduce how long someone has been using vim based on how they accomplish certain tasks.
It made me smile seeing OP use ggVG - I can think of two reasons why someone would do that: to delete everything ggVGd or to yank everything ggVGy. We (probably) all did that at one point.
And at some point you ask yourself what actually are ex commands? - and that's when you learn about :%d and :%y.
I chose firefox because I don’t want my browser to build an ad network to sell targeted ads.
And I definitely don't want this:
> You give Mozilla the rights necessary to operate Firefox. This includes processing your data as we describe in the Firefox Privacy Notice. It also includes a nonexclusive, royalty-free, worldwide license for the purpose of doing as you request with the content you input in Firefox. This does not give Mozilla any ownership in that content. [0]
There are only two ways to generate revenue: direct and indirect. Nobody will pay for a browser.
I don’t use Firefox and this whole thing is distasteful, but I’m not sure how they’re supposed to cover operating expenses without indirect monetization, or what form of indirect monetization other than ads would work.
Well yeah and I do pay for Kagi but would still say “nobody will pay for a search engine” using “nobody” in the “not enough people to scale a mass market business” sense.
> There are only two ways to generate revenue: direct and indirect. Nobody will pay for a browser.
There's a third way: screw revenue, dump all staff not related to browser development and documentation (MDN) and look for government grants to fund that.
Especially the EU may be a target for a well-written proposal, given the political atmosphere it would make sense to have at least one browser engine that is not fundamentally tied to the US and its plethora of bullshit like NSLs.
It’s a weird term, but I’m not sure how “for the purpose of doing as you request” is terrible. To me that means that when you type a url, they have the right to do a DNS lookup for it.
Is there some interpretation where “for the purpose of doing as you request” means any purpose they want?
The problem is that I'm not requesting Mozilla do anything. Firefox isn't a "service"; it's a web browser. When I input a search query, _I_ am acting on my behalf, not Mozilla.
I don't want any language where they get to insert themselves into that chain of behavior. Curl doesn't need a TOS, why does Firefox?
I agree. The very fact that they added a Terms of Service is weird. I don't want Firefox to be a service in any way. It's a tool.
When I drill a hole in my wall, DeWalt don't tell anyone who I am, how large a hole it is, what material I drilled into or even the fact that I actually drilled anything. They don't know any of that, and neither should Mozilla know when my local copy of the browser makes a DNS, HTTP or any other request.
Very much this. The browser already has all the features to do what I want it to do. Why does Mozilla insist on being a middleman? It's my computer, Firefox code, and someone's server.
I also tried speaking German and translating it to English and when I said "Hallo ich wollte das nur mal ausprobieren" (Hello I just wanted to try this out) it translated it to "Hi, how are you? Do you know anyone who quit smoking?".
After o3 was announced, with the numbers suggesting it was a major breakthrough, I have to say I’m absolutely not impressed with this version.
I think o1 works significantly better, and that makes me think the timing is more than just a coincidence.
Last week Nvidia lost 600 billion because of DeepSeek R1, and now OpenAI comes out with a new release which feels like it has nothing to do with the promises that were being made about o3.
Since I have access to the thinking tokens I can see where it's going wrong and do prompt surgery. But left to its own devices it gets things _stupendously_ wrong about 20% of the time with a huge context blowout. So much so that seeing that happen now tells me I've fundamentally asked the wrong question.
Sonnet doesn't suffer from that and solves the task, but doesn't give you much, if any, help in how to recover from doing the wrong task.
I'd say that for work work Sonnet 3.5 is still the best, for exploratory work with a human in the loop r1 is better.
Or as someone posted here a few days ago: R1 as the architect, Sonnet3.5 as the worker and critic.
This is the mini version, which is not as good as o1 and which I don’t think they demoed in the o3 announcement. I’m hoping the full release will be impressive.
I know this isn't the full o3 release, but I find it odd that they're branding it as o3 when it feels more like an update to o1 mini.
Yes, reasoning has improved, but the overall results haven't advanced as much as one would expect from a major version update.
It's highly unusual for OpenAI to release a milestone version like this - it feels more like a marketing move than a genuine upgrade.
Who knows what's going on behind closed doors?
If I put on my tinfoil hat for a moment, maybe Nvidia made a deal with OpenAI - offering a discount on computing power in exchange for a timely release.
OpenAI needs an enormous amount of computing power these days, and while Nvidia would take a financial hit by offering a discount to one of its biggest (if not the biggest) customers, that's still nowhere near as costly as losing 600 billion.
Looks like Azure is experiencing a major outage, but I cannot find anything about it.
If you look at downdetector.com you'll notice reported outages from OpenAI, Microsoft 365, Xbox Live, Walmart, the list goes on.
>Impact Statement: Starting at 18:44 UTC on 26 Dec 2024, you have been identified as a customer who was impacted by a power incident in South Central US and may experience a degraded experience.
>Current Status: There was a power incident in the South Central US AZ03 which affected multiple services. We have applied mitigation and are actively validating recovery to the impacted services. Further updates will be provided in 60 minutes, or sooner as events warrant.
The times are the same for OpenAI - first notice from 11:00 PST (19:00 UTC)
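For anyone double-checking the timestamps, here is a minimal sketch of the conversion. It assumes the OpenAI notice was posted on the same day (26 Dec 2024) and that PST is a fixed UTC-8 offset; the variable names are mine, not from either status page.

```python
from datetime import datetime, timedelta, timezone

# Assumption: PST as a fixed UTC-8 offset (standard time applies in late December).
PST = timezone(timedelta(hours=-8), "PST")

# Assumption: the OpenAI notice was posted on the same calendar day as the Azure incident.
openai_notice = datetime(2024, 12, 26, 11, 0, tzinfo=PST)
azure_start = datetime(2024, 12, 26, 18, 44, tzinfo=timezone.utc)

# Convert the OpenAI notice to UTC and compare it with Azure's stated start time.
openai_notice_utc = openai_notice.astimezone(timezone.utc)
gap = openai_notice_utc - azure_start

print(openai_notice_utc.strftime("%H:%M UTC"))  # 19:00 UTC
print(gap)                                      # 0:16:00
```

So the OpenAI notice would have landed roughly a quarter of an hour after the start time Azure reported.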
Personally, I don't think this stuff is the work of the devil, not by a long shot.
We don't know what exactly caused the power issue and they might not have had a root cause at the time either. Let's assume that their power redundancy equipment failed, say, due to insufficient maintenance. This is not an active action, it's a passive one (they didn't do their maintenance duties properly and now it blew). So there is nothing to say for point #1 and #2.
There's also the part where they say that the customers they identified as impacted may be experiencing a service degradation. This may sound pedantic, but I think it is not an entirely unreasonable phrasing. Maybe my business isn't actively relying on the resources I have deployed in that datacenter. How would they know (#3)? Should I clean those resources up? Possibly. Depends on my access patterns and other considerations.
It reads like face (and ass) saving legal-esque language. But there's a reason face and ass saving legalese sounds like it does.
It’s the same accountability shirking language as when layoffs “have affected you” instead of “I mismanaged this business and as a result I’m firing you”.
It’s always the same abstract invisible hand that just keeps affecting everyone! Scott Alexander’s Moloch perhaps :)
Power outage seems really odd. Don't datacenters usually have multiple redundant power supplies + on-site backup power generation? Maybe power "incident" means something else?
The switching equipment can fail. Had this happen at a DC where the switching equipment arc flashed when going from mains to diesel generators. The switching equipment detected the arc and then locked out until someone onsite could inspect the equipment and override it. The rack UPSes only lasted like 5 minutes and then everything went dark.
this happened to a dupont fabros facility in northern va in I wanna say ~2012?
derecho storms hammered the area and killed power. the external power lines failed, and the ATS hung or died when switching to the N+1 diesel generators.
since it never got switched to diesel, the UPS systems kept things going for the standard interval (e.g. ~3-5 minutes) and then ran out of power, and then everything went down. AWS died and IIRC it took a lot of stuff with it, most notably reddit, etc.
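The failure mode described in these comments can be sketched as a toy model. This is only an illustration under simplifying assumptions (a single transfer switch, a fixed UPS runtime, generator start-up time ignored); the class and function names are hypothetical and not taken from any real facility's tooling.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PowerChain:
    """Hypothetical model of one hall's power path: mains -> ATS -> generators, with UPS bridging."""
    ups_runtime_min: float  # how long the batteries can carry the load on their own
    ats_works: bool         # whether the automatic transfer switch moves the load to the generators


def minutes_until_dark(chain: PowerChain, outage_min: float) -> Optional[float]:
    """Return how many minutes after mains loss the load drops, or None if it stays up."""
    if chain.ats_works:
        # Generators take over; the UPS only has to bridge the switchover.
        return None
    if outage_min <= chain.ups_runtime_min:
        # Mains came back before the batteries ran out.
        return None
    # ATS stuck and the outage outlasts the batteries: the load drops when the UPS is empty.
    return chain.ups_runtime_min


print(minutes_until_dark(PowerChain(ups_runtime_min=5, ats_works=True), outage_min=120))   # None
print(minutes_until_dark(PowerChain(ups_runtime_min=5, ats_works=False), outage_min=120))  # 5
```

With a 5-minute UPS and a stuck ATS, the racks go dark 5 minutes into any longer outage, which matches the sequence both comments describe.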
I had equipment at a colo facility where a (licensed, bonded, not fly-by-night) electrical tech accidentally dropped a tool into the main bus connecting mains, generators, and batteries.
They are lucky they were able to walk away, but the facility was dark till someone could get in there and give the power equipment the green light.
Back in the 00's there was a power outage in downtown Vancouver. It caused Peer1 to fall back to their generators... that hadn't been tested for ages. They struggled for 5 minutes, gave up and burst into flames, resulting in the colo not being able to go back online even when the main power was restored.
That was an epic mess. Especially considering they positioned themselves as the most technically sophisticated colo in the region. So, yeah, it happens.
Things happen, e.g. the redundant system also failing, the system that should handle the failover failing, a short circuit that causes enough chaos that the redundant supply shuts down rather than feeding power into a potential fault, ...
Tl;Dr fire in one data center hall was put out with water, water leaked into other hall's power generator and battery area. Turns out loads of water and power generation equipment don't mix well, and servers don't like sitting in puddles of water.
If nobody minds a plug: My own product, StatusGator, which was launched here on Hacker News 10 years ago, notifies IT teams about outages before they are acknowledged by official status pages.
- This OpenAI outage[1] we notified 4 minutes before they acknowledged.
- The last AWS outage[2], we notified 28 minutes before they acknowledged
- There is def an Azure outage[3] now yet they have still not updated their status page. We notified 35 minutes ago.
Btw, tried to sign up and got a message that it would send me an email to confirm my login. Instead what I received was an email pointing me to a video demo. Not sure if simply clicking on that link was enough to confirm my email. That’s outside of a normal workflow.
Sounds like you got the onboarding email but not the confirmation email. It should have a subject of "Confirm your Account". Email us at hi@statusgator.com if you still have issues.
The author misses something crucial - for this to work, the right environment must be established first.
Without a supportive and healthy culture, tight deadlines can do more harm than good.
To succeed, teams need an environment where:
- Mistakes are seen as opportunities to learn, not as reasons for punishment. This fosters confidence and creativity.
- People feel recognized and valued, rather than being treated as just another cog in the machine.
- Management is transparent about goals and decisions, building trust and aligning the team's efforts with a shared vision.
Equally important:
After periods of intense deadlines teams need time to:
- Fix hasty decisions made under pressure.
- Reduce technical debt.
- Explore and experiment with new tools or technologies to improve their skills and boost morale.
Without this foundational culture implementing the author's ideas risks creating a toxic work environment.
Tight deadlines in the wrong setting can lead to:
- Extreme stress and burnout, resulting in slower progress or higher turnover.
- Decision paralysis, or as I like to call it, "team freeze", especially if people lack confidence in their abilities or fear making mistakes.
- Fragmented collaboration, where rushed individual contributions fail to integrate into a cohesive whole.
- Misaligned priorities, particularly if management operates from an "ivory tower" and imposes deadlines that seem arbitrary or disconnected from reality. This can create uncertainty. Worst case this may create rumors about financial instability, distracting teams from their work.
- A loss of individual potential, as workers who feel unrecognized are less likely to go above and beyond or contribute unique ideas.
Additionally, if deadlines become the norm without breaks teams lose their drive and motivation. The pressure of constantly sprinting leads to diminishing returns while unresolved technical debt creates long term pain points in the software.
Over time this can result in teams feeling like they're digging themselves into a hole with no opportunity to climb out.