This really resonates. I am in CS academia, and I feel like I am losing "friends" who used to share the same values and social identity as me at an unprecedented speed. I wonder if it is possible to gather people like this and regroup our social circles.
It could, in theory. The model generates a depth image per frame, so each pixel becomes a small 3D point. It also assumes that the 3D scene is static. From this, you can simply register all the frames into one huge 3D point cloud by unprojecting the pixels to 3D, then render it any way you like (using a classical 3D renderer) and it will be consistent.
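A minimal sketch of that unprojection step, in case it helps (plain numpy; it assumes pinhole intrinsics K and a known camera-to-world pose (R, t) per frame, and the names depth_maps, colors, poses are placeholders I made up, not from any specific API):

    import numpy as np

    def unproject_frame(depth, K, R, t):
        # Lift an HxW depth map to world-space 3D points.
        h, w = depth.shape
        u, v = np.meshgrid(np.arange(w), np.arange(h))     # pixel grid
        pix = np.stack([u, v, np.ones_like(u)], axis=-1)   # homogeneous pixel coords
        rays = pix.reshape(-1, 3) @ np.linalg.inv(K).T     # camera-space rays
        pts_cam = rays * depth.reshape(-1, 1)              # scale each ray by its depth
        return pts_cam @ R.T + t                           # camera frame -> world frame

    # Register every frame into one big colored point cloud.
    points, point_colors = [], []
    for depth, rgb, (R, t) in zip(depth_maps, colors, poses):
        points.append(unproject_frame(depth, K, R, t))
        point_colors.append(rgb.reshape(-1, 3))
    cloud = np.concatenate(points)           # (N, 3) world-space points
    cloud_rgb = np.concatenate(point_colors) # (N, 3) per-point colors

Feed cloud into any classical point-cloud renderer and novel views stay consistent by construction.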
Though, one problem is that if the generated video itself has inconsistent information, e.g., an object changes color between frames, then your point cloud will just be "consistently wrong". In practice this leads to blurry artifacts, because you blend different inconsistent colors together. So when you turn around you will still see the same thing, but that thing is uglier and blurrier because it blends the inconsistent colorings.
It will also be difficult to put a virtual object into the generated scene, because you don't have the lighting information, so the virtual object can't blend its color with the environment well.
Overall a cool idea, but obviously there are more interesting problems left to be solved!
It's interesting to see how the attitude towards GenAI on Hacker News has trended over the years. This is totally vibe-based and I don't have numbers to back it up, but back in 2022-2023, the site was dominated by people who mostly treated GenAI as a curious technology without too much attachment, plus a non-trivial number of folks who were very skeptical of the tech. More recently I see a lot more people who see themselves as evangelists and try very hard to boost/advocate the technology (see all the "LLM coding changes my life" posts). It seems the tide has turned back a little bit again, since we now see this kind of post surfacing.
For me, I kind of wish this site would go back to the good old days where people just shared their nerdy niche hacker things instead of filling the front page with the same arguments we see on other parts of the internet over and over again. ; ) But granted, I was attracted by the clickbait title too, so I can't blame others.
If anything, it reminds me of crypto - lots of investment seemed to attract a lot of users to HN who, I highly suspect, had some sort of...let's just call it motivated reasoning.
Crypto always had abstract, hard-to-understand use cases. It became popular because the value was going up.
LLMs are different. There is an endless number of use cases that people can easily understand. Now, just how well it does things is debatable, but there is a very clear value gain.
Hell, I got it to give me a list of recipes for the week based on my preferences and dietary needs, then create a grocery list, all in 2 minutes. Did I need an LLM for this? No, but it made it so much faster, and this is what I am finding with a lot of tasks.
> Hell, I got it to give me a list of recipes for the week based on my preferences and dietary needs, then create a grocery list, all in 2 minutes.
I mean, er, yeah, but that's not a multi-trillion dollar industry, is the thing. "People find ChatGPT mildly useful" is not going to cut it, not at current levels of investment.
This reminds me of the Louis C.K. skit where he talks about people complaining on a plane, completely forgetting how incredible it is to be flying.
Not so long ago, the example I gave about recipes was something you would only see in sci-fi.
Maybe you’re right that there is too much investment, but that's the same whenever a new technology completely unlocks new possibilities.
> More recently I see a lot more people who see themselves as evangelists and try very hard to boost/advocate the technology (see all the "LLM coding changes my life" posts).
It feels very much like the crypto bubble a few years back (the second, larger one, when we were informed that soon everything would be an NFT). This is actually one thing that puts me off AI: on top of a certain amount of scepticism about whether it is genuinely useful, the whole space feels very, very, _very_ grifter-y. In some cases it is literally the same people who were pushing NFTs a while back.
I don't pay much attention to the submissions themselves, but I do care what the fellow HN-ers think, and my own "vibe-based" perspective is that the voices have been predominantly negative for many years now, and have only grown more so.
HN is usually negative, cynical, skeptical, eyerolling, regardless of topic.
Just the other day someone posted the ImageNet 2012 thread (https://news.ycombinator.com/item?id=4611830), which was basically the threshold moment that kickstarted deep learning for computer vision. Commenters claimed it didn't prove anything, that it was sensational, that it was just one challenge with a few teams, etc. Then there is the famous comment, from when Dropbox launched, that it could be replaced by a few shell scripts and an FTP server.
My general rule of thumb now is that if HN takes the time to deride it, then there's probably something to it; if it gets completely ignored, then there's probably not.
The whole endless debate about "stochastic parrots" and "singularity" was already actively ongoing in threads here in 2022, for example. I remember when GPT-4 just dropped and was everywhere in the comments, and all those things you describe were already there.
Curious technology? People were foaming at the mouth about "license concerns" when GitHub Copilot was first announced, saying they were going to boycott Microsoft. But just like all things, over time people realize they're not as good or as bad as initially thought. I noticed this too with media generation: people on Twitter were very mad about it, and now many of them use Photoshop's AI features.
I haven't, although I'm sure there exists a way to do so. The point is that anyone who was on HN reading the AI threads in 2021 would see that the dissent is nothing new.
If the assembly programmers were struggling to correctly optimize loops for performance on several distinct target machines, I would hope that their management would want them to try this new Fortran thing and see how well it worked. (And it did, and it enabled new companies like CDC to win customers from IBM.)
Gemini 2.5 came out just over two weeks ago (25th March) and is a very significant improvement on Gemini 2.0 (5th February), according to a bunch of benchmarks but also the all-important vibes.
Discussed in another thread: https://news.ycombinator.com/item?id=41234415
As someone who has worked on diffusion models, it's a clear reject and not a very interesting architecture. The idea is to train a diffusion model to fit low-dimensional data using two MLPs: one accounts for high-level structure and one accounts for low-level details. This kind of "global-local" architecture is very common in computer vision/graphics (and the paper cites none of the relevant work), so the novelty is low. The experiments also do not clearly show where exactly this "dual" structure brings benefits.
That being said, it's very hard to tell it apart from a normal poorly-written paper at a quick glance. If you told me it was written by a graduate student, I would probably believe it. It is also interesting in the sense that maybe, for low-dimensional signals, there are architecture tweaks we can make to existing diffusion model architectures to make things better, so maybe not 100% BS.
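To make the "global-local" idea concrete, here is a toy sketch of one plausible reading of such a dual-MLP denoiser (my own construction for illustration, not the paper's actual architecture):

    import torch
    import torch.nn as nn

    def mlp(dims):
        # Plain MLP with SiLU activations between hidden layers.
        layers = []
        for i in range(len(dims) - 1):
            layers.append(nn.Linear(dims[i], dims[i + 1]))
            if i < len(dims) - 2:
                layers.append(nn.SiLU())
        return nn.Sequential(*layers)

    class DualMLPDenoiser(nn.Module):
        def __init__(self, data_dim=2, hidden=128):
            super().__init__()
            # "Global" branch: models coarse, high-level structure.
            self.global_net = mlp([data_dim + 1, hidden, hidden, data_dim])
            # "Local" branch: refines fine details, conditioned on the coarse output.
            self.local_net = mlp([2 * data_dim + 1, hidden, data_dim])

        def forward(self, x, t):
            # x: (B, data_dim) noisy sample; t: (B,) diffusion timestep.
            t = t.view(-1, 1).float()
            coarse = self.global_net(torch.cat([x, t], dim=-1))
            detail = self.local_net(torch.cat([x, coarse, t], dim=-1))
            return coarse + detail  # predicted noise (or x0, depending on parameterization)

As the original comment says, this kind of coarse/fine decomposition is standard in vision/graphics, which is exactly why the novelty bar isn't met.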
I am a graphics researcher who has been following your blog posts on your GPU vector graphics engine. I've benefited from your posts, and I think some of the vector graphics renderer designs you discuss in your blog are genuinely novel and worth publishing as academic papers if written up and evaluated properly. Please consider submitting them to a graphics venue like SIGGRAPH, EGSR, High Performance Graphics, or the Journal of Computer Graphics Techniques if you can find the free time. It will greatly benefit the community. ; )