Hacker News

You could easily host your model on https://beam.cloud (I'm a founder). You just add a decorator to your existing Python code:

    from beam import App, Runtime

    app = App(name="gpu-app", runtime=Runtime(gpu="T4"))

    @app.rest_api()
    def inference():
        print("This is running on a GPU")
        # Return a JSON-serializable payload as the API response
        return {"status": "ok"}
Then run `beam deploy {your-app}.py` and boom, it's running in the cloud.



An A10G at $1,200 per month would ruin me financially


The Beam website should be a lot clearer about how this works[0], but I believe Beam bills you for your actual usage, in a serverless fashion. So unless you're running computations continuously for the entire month, it won't cost $1,200/mo.

If it works the way I think it does, it sounds appealing, but the GPUs also feel a bit small. The A10G only has 24GB of VRAM. They say they're planning to add an A100 option, but... only the 40GB model? Nvidia has offered an 80GB A100 for several years now, which seems far more useful for pushing the limits of today's 70B+ parameter models. Quantization can get a 70B parameter model running in less VRAM, but it's definitely a trade-off, and I'm not sure how the training side of things works with regard to quantized models.
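To put rough numbers on the VRAM point: here's a back-of-envelope estimate of weight memory alone for a 70B-parameter model at different precisions. This ignores KV cache, activations, and framework overhead, so real requirements are higher.

```python
# Weight memory for a model, given parameter count (in billions)
# and bits per parameter. Weights only; runtime needs more.
def weight_gb(params_b: float, bits_per_param: float) -> float:
    bytes_total = params_b * 1e9 * bits_per_param / 8
    return bytes_total / 1024**3

for name, bits in [("fp16", 16), ("int8", 8), ("int4", 4)]:
    print(f"70B @ {name}: ~{weight_gb(70, bits):.0f} GB")
```

Even at 4-bit, the weights alone overflow a single 24GB A10G, which is why the 80GB A100 matters here.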

Beam's focus on Python apps makes a lot of sense, but what if I want to run `llama.cpp`?
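One workaround I can imagine (a sketch, not something Beam documents): since `llama.cpp` is a C++ binary, you could wrap it in a thin Python function that shells out to it. The binary name (`llama-cli`) and flags (`-m`, `-p`, `-n`) match recent `llama.cpp` builds, but check your version; the model path below is made up.

```python
import subprocess

def build_cmd(model_path: str, prompt: str, n_tokens: int = 128) -> list[str]:
    # Assemble the llama.cpp CLI invocation (flags per recent builds)
    return ["llama-cli", "-m", model_path, "-p", prompt, "-n", str(n_tokens)]

def run_llama(model_path: str, prompt: str) -> str:
    # Run the binary and capture its generated text from stdout
    result = subprocess.run(build_cmd(model_path, prompt),
                            capture_output=True, text=True, check=True)
    return result.stdout

if __name__ == "__main__":
    # Hypothetical model path, for illustration only
    print(build_cmd("models/llama-70b-q4.gguf", "Hello"))
```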

Anyways, Beam is obviously a very small team, so they can't solve every problem for every person.

[0]: What is the "time to idle" for serverless functions? Is it instant? "Pay for what you use, down to the second" sounds good in theory, but AWS also uses per-second billing on tons of stuff, and EC2 instances don't stop billing you when they go idle; you have to shut them down manually. So making the lifecycle clearer would be great. Even a quick example of how a deployment would be billed might be helpful.
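To illustrate what such an example could look like, here's the arithmetic under two assumptions of mine: the posted $1,200/month A10G rate, and billing that truly stops the instant the container goes idle.

```python
# Hypothetical per-second serverless billing. Assumes the posted
# $1,200/month A10G rate and that idle time is never billed.
MONTHLY_RATE = 1200.0
SECONDS_PER_MONTH = 30 * 24 * 3600  # 2,592,000

def monthly_cost(busy_seconds_per_day: float) -> float:
    per_second = MONTHLY_RATE / SECONDS_PER_MONTH
    return busy_seconds_per_day * 30 * per_second

print(monthly_cost(2 * 3600))   # 2 busy hours/day
print(monthly_cost(24 * 3600))  # always on: the full monthly rate
```

Under those assumptions, 2 busy hours a day comes to $100/mo rather than $1,200 — which is exactly the lifecycle detail the site should spell out.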


Why did you decide to make such a bad pitch for your product?



