I got access to the preview, here's what it gave me for "A pelican riding a bicy...

yurylifshits · on Dec 17, 2024

There's another important contender in the space: the Hunyuan model from Tencent.

My company (Nim) is hosting the Hunyuan model, so here's a quick test (first attempt) at "pelican riding a bicycle" via Hunyuan on Nim: https://nim.video/explore/OGs4EM3MIpW8

I think it's as good as, if not better than, Sora / Veo.

chrismorgan · on Dec 17, 2024

> A whimsical pelican, adorned in oversized sunglasses and a vibrant, patterned scarf, gracefully balances on a vintage bicycle, its sleek feathers glistening in the sunlight. As it pedals joyfully down a scenic coastal path, colorful wildflowers sway gently in the breeze, and azure waves crash rhythmically against the shore. The pelican occasionally flaps its wings, adding a playful touch to its enchanting ride. In the distance, a serene sunset bathes the landscape in warm hues, while seagulls glide gracefully overhead, celebrating this delightful and lighthearted adventure of a pelican enjoying a carefree day on two wheels.

What does it produce for “A pelican riding a bicycle along a coastal path overlooking a harbor”?

Or, what do Sora and Veo produce for your verbose prompt?

whywhywhywhy · on Dec 17, 2024

If Sora is anything like DALL-E, a prompt like "A pelican riding a bicycle along a coastal path overlooking a harbor" will be extended into something like the longer prompt behind the scenes. OpenAI has been augmenting image prompts from day 1.

sashank_1509 · on Dec 17, 2024

Hard to say about Sora, but the video you shared is most definitely worse than Veo.

The pelican is doing some weird flying motion, motion blur is hiding a lack of detail, the cycle is moving fast so the background is blurred, etc. I would even say Sora is better because I like the slow motion and detail, but it did do something very non-physical.

Veo is clearly the best in this example. It has high detail but also feels the most physically grounded among the examples.

sfjailbird · on Dec 17, 2024

The prompt asks that it flaps its wings. So it's actually really impressive how closely it adheres (including the rest of the little details in the prompt, like the scarf). Definitely the best of the three, in my opinion.

dyauspitr · on Dec 17, 2024

Pretty good except the backwards body and the strange wing movement. The feeling of motion is fantastic though.

arjie · on Dec 17, 2024

I was curious how it would perform with prompt enhancement turned off. Here's a single attempt (no regenerations etc.): https://www.youtube.com/watch?v=730cb2qozcM

If you'd like to replicate, the sign-up process was very easy and I was easily able to run a single generation attempt. Maybe later when I want to generate video I'll use prompt enhancement. Without it, the video appears to have lost a notion of direction. Most image-generation models I'm aware of do prompt-enhancement. I've seen it on Grok+Flow/Aurora and ChatGPT+DallE.

    Prompt
    A pelican riding a bicycle along a coastal path overlooking a harbor
    Seed
    15185546
    Resolution
    720×480

taneq · on Dec 17, 2024

I mean, you didn’t SAY riding forwards…

TZubiri · on Dec 17, 2024

I suppose if you reversed it, it would look okay-ish.

gcr · on Dec 17, 2024

FYI your website shows me a static image on iOS 18.2 Safari. Strangely, the progress bar still appears to “loop,” but the bird isn’t moving at all.

Turning content blockers off does not make a difference.

theWreckluse · on Dec 17, 2024

Fwiw, it is finicky but the video played after a couple seconds (iOS 18.2 Safari).

dr_kiszonka · on Dec 17, 2024

Reddit says it is much better than Sora. Are you hosting the full version of Hunyuan? (Your video looks great.)

echelon · on Dec 17, 2024

HunYuan is also open source / source available unless you have 100M DAU.

Then there's Lightricks LTX-1 model and Genmo's Mochi-1. Even the research CogVideoX is making progress.

Open source video AI is just getting started, but it's off to a strong start.

yurylifshits · on Dec 17, 2024

Our limited tests show that yes, Hunyuan is comparable to or better than Sora on most prompts. Very promising model.

prometheon1 · on Dec 17, 2024

Is it still better if you copy his whole prompt instead of half of it?

c0brac0bra · on Dec 17, 2024

I mean, the pelican's body is backwards...

tim333 · on Dec 17, 2024

Here's one of a penguin paragliding and it's surprisingly realistic https://x.com/Plinz/status/1868885955597549624

0_____0 · on Dec 17, 2024

This is the first GenAI video to produce an "oh shit" reflex in me.

oh, shit!

p1necone · on Dec 16, 2024

As long as at least one option is exactly what you asked for, throwing variations at you that don't conform to 100% of your prompt seems like it could be useful if it gives the model leeway to improve the output in other aspects.

oneshtein · on Dec 17, 2024

Here is my version of a pelican on a bicycle, made with hailuoai:

https://hailuoai.video/share/N9dlRd1L1o0p

nkingsy · on Dec 16, 2024

His little bike helmet is adorable

mckirk · on Dec 16, 2024

The AI safety team was really proud of that one.

AgentME · on Dec 17, 2024

It's funny having looked forward to Sora for a while and then seeing it be superseded so shortly after access to it is finally made public.

grumbel · on Dec 17, 2024

I am surprised that the top/right one still shows a cut and switch to a different scene. I would assume that that's something that could be trivially filtered out of the training data, as those discontinuities don't seem to be useful for either these short 6-second video segments or for getting an understanding of the real world.

jerpint · on Dec 16, 2024

It looks much better than Sora but is still kind of in the uncanny valley.

spaceman_2020 · on Dec 17, 2024

This is the worst it will ever be…

victorbjorklund · on Dec 17, 2024

That is surprisingly good. We are at a point where it seems to be good enough for at least b-roll content replacing stock video clips.

rob74 · on Dec 17, 2024

Well yeah, if you look closely at the example videos on the site, one of them is not quite right either:

> Prompt: The sun rises slowly behind a perfectly plated breakfast scene. Thick, golden maple syrup pours in slow motion over a stack of fluffy pancakes, each one releasing a soft, warm steam cloud. A close-up of crispy bacon sizzles, sending tiny embers of golden grease into the air. [...]

In the video, the bacon is unceremoniously slapped onto the pancakes, while the prompt sounds like it was intended to be a separate shot, with the bacon still in the pan? Or, alternatively, everything described in the prompt should have been on the table at the same time?

So, yet again: AI produces impressive results, but it rarely does exactly what you wanted it to do...

soco · on Dec 17, 2024

Technically speaking, I'd say your expectation is definitely not laid out in the prompt, so anything goes. Believe me, I've had such requirements from users, and I, as a mere human programmer, am never quite sure what they actually want. So I take guesses just like the AI (because simply asking doesn't get you very far, you must always show something) and take it from there. In other words, if AI works like me, I can pack my stuff already.

jillyboel · on Dec 17, 2024

This tech is cute but the only viable outcomes are going to be porn and mass produced slop that'll be uninteresting before it's even created. Why even bother?

andybak · on Dec 17, 2024

There will be both of those things in abundance.

But I'm also seeing some genuinely creative uses of generative video - stuff I could argue has got some genuine creative validity. I am loath to dismiss an entire technique because it is mostly used to create garbage.

We'll have to figure out how to solve the slop problem - it was already an issue before AI so maybe this is just hastening the inevitable.

incrudible · on Dec 18, 2024

The real problem is that trust in legacy media hit rock bottom right as we enter the era where we would need such trust the most. Soon enough, nothing you see on video can be believed, but (perhaps more importantly) nothing must be believed either.

bottled_poe · on Dec 17, 2024

Comments like this one are so predictable and incredulous. As if the current state of the art is the final form of this technology. This is just getting started. Big facepalm.

latentsea · on Dec 17, 2024

Have you already noticed the trend of image search results for porn containing inferior AI slop porn?

I have. It sucks. The world we're headed for maybe isn't one we actually wind up wanting in the end.

I like the idea of increasingly advanced video models as a technologist, but in practice, I'm noticing slop and I don't like it. Having grown up on porn, when video models are in my hands, the addiction steers me in the direction of only using the technology to generate it. That's a slot machine whose addictiveness is akin to the leap from the dirty magazines of old to the world of internet porn I witnessed growing up. So, porn addiction on steroids. I found it eventually damaging enough to my mental health that I sold my 4090. I'm a lot better off now.

The nerd in me absolutely loves Generative models from a technology perspective, but just like the era of social media before it, it's a double edged sword.

ralusek · on Dec 17, 2024

It sounds like you have a personal problem that you’re trying to project onto the rest of society.

latentsea · on Dec 17, 2024

No, I'm providing a personal anecdote that some members of society that do have, or may develop, the same or similar problems are having both the (perceived) good and the bad aspects of those problems seriously magnified by this technology. This can have personal consequences, but also the consequences can affect the lives of others.

Hence, a certain % of the population will be negatively affected by this. I personally think it's worth raising awareness of.

ferguu_ · on Dec 17, 2024

I hope they're right. If the technology improves to such a degree that meaningful content can be produced then it could spell global disaster for a number of reasons.

Also I just don't want to live in a world where the things we watch just aren't real. I want to be able to trust what I see, and see the human-ness in it. I'm aware that these things can co-exist, but I'm also becoming increasingly aware that as long as this technology is available and in development, it will be used for deception.

jnwatson · on Dec 17, 2024

That ship sailed shortly after the invention of photography. Photos were altered for political purposes during the US Civil War.

Now, we have entire TV shows shot on green screen in virtual sets. Replacing all the actors is just the next logical step.

ferguu_ · on Dec 17, 2024

That's exactly what I mean, all of those methods take some human effort, there is a human involved in the process. Now we face a reality that it might take no human effort to do... well, anything. Which is terrifying to me.

I do believe that humans are restless, and even when there is no longer any point to create, and it is far easier to dictate, we still will, just because we are too driven not to.

bratwurst3000 · on Dec 19, 2024

You know that there are still offline art forms like concerts, theater, opera, installations, etc., so I wouldn't see it that negatively. And we have nearly 100 years of music and film we can enjoy. So maybe video is a dying art form for humans to act in, but there is so much more.

vultour · on Dec 17, 2024

The most predictable comment is yours, especially since you completely missed the point of the original comment which had nothing to do with the video quality.

gruez · on Dec 17, 2024

AI generated slop content begets human generated slop comment.

jillyboel · on Dec 17, 2024

So, even better porn?