Workers AI: Serverless GPU-powered inference (cloudflare.com)
261 points by jgrahamc on Sept 27, 2023 | 114 comments



I tried serverless for whisper on an existing competitive service.

Cold boot plus inference on an 8-word STT clip took 45 seconds, and warm runs never got under 15 seconds.

This does not work for STT, which needs a much faster turnaround.

Can anyone give feedback on whether Whisper, at any of its model sizes, can work well on serverless?

Do most AI serverless solutions suffer from significant cold boot delays?

The cheapest persistent GPU cloud instance I saw on G was ~$160 a month. Is that roughly the kind of money people need to be prepared to spend to have a model ready to go at all times as a service to another product?


Whisper large is only 1.5B params; why not run it client-side with something like https://github.com/FL33TW00D/whisper-turbo

(Disclaimer: I am the author)


Seems like WebGPU is not supported by mobile Safari yet.

And it possibly has coverage in only 65% of the desktop browser market. [1] Does that roughly match how you understand the current penetration of this browser API?

Presuming coverage for a given user, I don't have a good answer for why you'd consider a remote instance.

It seems like it would be worth testing for WebGPU support and attempting to run on the client if possible, but then having a remote instance available otherwise (rough sketch below).

Does that make sense to you?

Can you tell me another reason why someone would want a remote instance of Whisper, given the 20x realtime potential on the client in your project?

[1] https://caniuse.com/webgpu
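
A minimal sketch of that detection-plus-fallback idea, assuming the standard navigator.gpu entry point; the backend names are placeholders:

    // Hypothetical helper: prefer client-side WebGPU, fall back to a remote API.
    // navigator.gpu is the standard WebGPU entry point; everything else is illustrative.
    async function pickTranscriptionBackend(): Promise<"client" | "remote"> {
      if (!("gpu" in navigator)) return "remote"; // browser ships no WebGPU at all
      try {
        // requestAdapter() can still resolve to null (blocklisted driver, headless, etc.)
        const adapter = await (navigator as any).gpu.requestAdapter();
        return adapter ? "client" : "remote";
      } catch {
        return "remote";
      }
    }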


Indeed, WebGPU is basically only supported on Chromium-based browsers.

This means that the primary use case for whisper-turbo and my upcoming libraries is Electron/Tauri apps. For users who don't have WebGPU support for whatever reason, we will still hit OAI/another server deployment. In the ideal case there should be a 90% cost reduction with the same or improved UX.

Someone will still want a remote instance today, as there is still engineering to be done. I need more aggressive quantization, a better developer experience, and more features to get people off the OAI API.


Got it.

I see WebGPU is available in Safari Tech Preview 92, but it's still an "experimental" feature there.

It looks like WebGPU has been around the block for a while now. I wonder what the holdup is at Firefox and Safari. It would be much preferable to run more ops on the client. (Complete speculation, but I could see this, and the battery-use implications, possibly making Apple hesitant.)

My goal is to provide a browser-based experience first, to reach the largest potential user base with no install friction. So, at least for now, an Electron app is not in the plan.


WebGPU is likely multiple years out for Safari; they've had and removed implementations before.

Firefox has a working implementation, but it lags behind the Chromium one.

Browser-based still works great! Check out the whisper-turbo demo: https://whisper-turbo.com/


Maybe the client is another backend service or serverless function, i.e. one where they'd need to pay for the GPU anyway.


Serverless only works if the cold boot is fast. For context, my company runs a serverless cloud GPU product called https://beam.cloud, which we've optimized for fast cold starts. We see Whisper cold-start in production in under 10s (across model sizes). A lot of our users are running semi-real-time STT, and this seems to be working well for them.


>...this seems to be working well for them.

Is this because the users are streaming audio in a more conversational style?

For example, when you give Siri a command, you state it and then stop speaking.

For most of ChatGPT's life, in OpenAI's iOS app, if you wanted to speak to input text, you would tap the record button and then tap it off, using either the app's own speech-to-text capability or Siri's input-field speech-to-text.

Conversational speech-to-text is more ongoing, though, which would make a 10-second cold start OK: you don't sense as much lag because you keep speaking.

Or perhaps people generally record input longer than 10 seconds, and you are sending the first chunk as soon as possible to get Whisper going.

Then follow-up chunks are handled as warm boots, and the text is reassembled? Is that roughly correct?

Anything you can share about the request and data flow that works with a longer cold boot time, in the context of a single recording versus streaming, and how the audio is broken up, would be helpful. (A rough sketch of the flow I'm imagining is below.)
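
Purely to illustrate that hypothesized flow; the endpoint and the { text } response shape are made up for illustration:

    // Hypothetical client sketch: POST each chunk as soon as it's recorded, so the
    // first request's cold start overlaps with the user still speaking, and later
    // chunks land on an already-warm instance.
    async function transcribeChunks(
      chunks: AsyncIterable<Blob>,
      endpoint: string,
    ): Promise<string> {
      const pending: Promise<string>[] = [];
      for await (const chunk of chunks) {
        pending.push(
          fetch(endpoint, { method: "POST", body: chunk })
            .then((res) => res.json())
            .then((out: { text: string }) => out.text),
        );
      }
      // Reassemble the per-chunk transcripts in recording order.
      return (await Promise.all(pending)).join(" ");
    }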


Cloudflare has almost no cold boot, and I think their ML models are prefetched within the same DC, so loading a model the first time should have no noticeable overhead either.

Correct me if I'm wrong, but that's how I interpreted it when they first started with ML models on their CPUs.


The models in our catalog are all pre-loaded before requests come in.


Makes sense. A bring-your-own-model feature would be awesome, but that would make pre-loading impossible for rarely used models without tying up valuable GPU RAM.


(STT: Speech to Text)


Which service did you look into?


It is a startup and happened in the onboarding demo, I'll avoid naming them in case they sort it out or whatever. If you really want to know, please send me an email.


This is very cool. I'm still trying to understand the pricing. What is a "neuron" in this context? A token? A character?

"Neurons are a way to measure AI output that always scales down to zero (if you get no usage, you will be charged for 0 neurons). To give you a sense of what you can accomplish with a thousand neurons, you can: generate 130 LLM responses, 830 image classifications, or 1,250 embeddings."

130 LLM responses of what length? 1,250 embeddings of what size of text?


It's effectively a unit of time benchmarked to what we can accomplish in that time as of Sept 27, 2023 (launch). The challenge here is that because we're abstracting away the underlying hardware it's not the same as renting a VM for a period of time. We also don't want to create perverse incentives that keep us from making the underlying system faster. It's similar to how AWS standardized EC2 to a standard compute unit. Over time, as we continue to add faster and faster hardware and better optimize models we expect the cost of a neuron will trend down but the amount of AI inference work that you can do with a neuron will remain relatively constant.


Then call it something like Neural Time Unit (NTU) or Computational Time Unit (CTU) because neurons make people think of neural networks. As in, you pay for the size of your model.


Could you give us an example of what that means in practical terms?

The post says that 1000 neurons will give you 130 LLM responses - but of what length?

(LLMs are generally priced by input and output tokens. The more tokens, the longer the compute time. Without an idea of what you mean by a response, it's hard to understand.)

Likewise: 1,250 embeddings – how big is the text in that example?

I'm VERY excited to see you doing this and understand it's early stages, but I can't wrap my head around the pricing without context.


Please rename it, or at least make sure it corresponds to actual neural operations. It's terribly confusing for practitioners.


Really amazing stuff to see this launch with Hugging Face! Hope to see it expand beyond text too.

“Neuron” is a cute name, but there’s too much conceptual overlap with floating-point ops, layers, model parameters, etc., which are time-independent. Should just call them inference credits or something. When a large model runs on multiple GPUs, it’s even more confusing what neurons or dollars per second might mean.


Sounds like 1 Neuron ~= X FLOPS



Those don't explain the relation between neuron cost and length.


Those max tokens seem pretty low


Interesting how the exact same blog URL was used previously, in April 2021.

https://news.ycombinator.com/item?id=26795517


That was our early alpha cooperation with NVIDIA. We've learned a lot since then. Not to mention, the AI ecosystem has grown up a bunch. But you are correct: this is something we've been planning for for a loooooooong time.


I'd love to see a blog post about the knowledge delta and growth progress. That's more interesting to me than the actual announcements.


I tried to spin up a free plan and run the Whisper demo on a new Worker, and it immediately just gives me:

    Error 1102
    Worker exceeded resource limits
Did I mess up the config or is it just not intended to be tried out without being on a paid plan already?


It's 100% designed to let you try it out for free, so something else must be going on. Feel free to message me at pwittig at cloudflare dot com, and I'm happy to help debug.

Also, we're still figuring some things out, but current limits are here: https://developers.cloudflare.com/workers-ai/platform/limits...


I want to love Workers, but I've never had great luck with anything more than a basic CRUD app. Even getting an external DB to connect proved to be more work than it should have, and their docs are outdated and all over the place, often contradicting themselves.


The docs are soooooo bad. This is the same story with any exciting / new product.

Honestly, I'm beginning to think that we need some kind of documentation-first style development. Like TDD, but DDD....

I have integrated Facebook APIs, Instagram (well, same thing), Google APIs, Stripe APIs, Mailchimp APIs, etc. etc.

And the only thing common among all of them? The documentation is _terrible_..., like, _terrible_.

I also run a few products online and I spend so, so much time trying to get the documentation right. It's incredibly boring and tedious, but I really feel that if you want to set yourself apart from the big players, make good documentation. It can't be that hard.


Good documentation is key to PHP's success, along with a bunch of other things that get dismissed as "inferior technology".

The worst documentation is the jargon-filled, abstract, vibes-based kind where the authors basically typed it with one hand on the keyboard. It's like, "OK, you're amazing. Now how do I resolve this error, and what are your command-line flags?"


What do you think is missing in the docs?


I am having quite a few issues getting the reference API code at https://developers.cloudflare.com/workers-ai/models/llm/ to work:

    {'errors': [{'code': 'invalid_union',
                 'unionErrors': [{'issues': [{'code': 'invalid_type', 'expected': 'object', 'received': 'string',
                                              'path': ['body'], 'message': 'Expected object, received string'}],
                                  'name': 'ZodError'},
                                 {'issues': [{'code': 'invalid_type', 'expected': 'object', 'received': 'string',
                                              'path': ['body'], 'message': 'Expected object, received string'}],
                                  'name': 'ZodError'}],
                 'path': ['body'], 'message': 'Invalid input'}],
     'success': False, 'result': {}}
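
For reference, the Zod error suggests the request body went out as a raw string rather than a JSON object. A minimal sketch of what I believe the documented REST call expects (account ID, token, and the model slug are placeholders taken from the launch docs and may differ):

    // Sketch of the REST call the linked docs describe; endpoint path, model slug,
    // and the { prompt } input shape follow the launch docs and may change.
    const ACCOUNT_ID = "<account-id>"; // placeholder
    const API_TOKEN = "<api-token>";   // placeholder

    const resp = await fetch(
      `https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/ai/run/@cf/meta/llama-2-7b-chat-int8`,
      {
        method: "POST",
        headers: {
          Authorization: `Bearer ${API_TOKEN}`,
          "Content-Type": "application/json",
        },
        // Sending the prompt as a bare string is what produces
        // "Expected object, received string"; it must be a JSON object.
        body: JSON.stringify({ prompt: "Tell me a joke about Cloudflare" }),
      },
    );
    console.log(await resp.json());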


Sorry for the trouble. The docs have since been updated, so you should try again.

Also happy to help if you run into any other issues - pwittig at cloudflare dot com.


TL;DR: GPUs all over the Cloudflare global network; working closely with Microsoft, Meta, Hugging Face, Databricks, NVIDIA; new Cloudflare-native vector database; inference embedded in Cloudflare Workers; native support for WebGPU. Live demo: https://ai.cloudflare.com/


Do you actually run the inference in the worker? Or is it like what Fermyon does, where they basically host the models for you and you get an SDK that is automatically connected to the function?


Unlike the first version of Constellation, Workers AI runs inference directly on GPUs that we are (quickly) installing in our global network.


But the code isn't running on the worker? It runs somewhere else on a GPU cluster?


It's a little like how Cloudflare Workers runs: you don't know which CPU it runs on; all you know is that it's a CPU close to your end user. Same goes for this. We are rolling out GPUs everywhere across the globe, so Workers AI will just use a nearby GPU - probably in the same machine as your Workers, or maybe the same data center, or whatever other smart routing decision we make. What we are not doing is running a massive GPU cluster somewhere. This is all distributed, and that's the power of owning your own network.


Since they don’t seem to be able to give a simple answer: the inference does not run in the worker. It connects to external GPUs.


I think the confusion is what is meant by "in the Worker." From a hardware perspective, the GPU may be in the same machine as the CPU that's powering the Worker. Or they may be across different machines in our network. We are not routing requests to some third party. And we will try to run the inference task as close as possible to who/whatever requested it. The whole idea of "serverless" is you shouldn't have to worry about what machine where runs whatever unless you're on the team building the scheduling and routing logic at Cloudflare.


I think his question is more about whether the Worker directly accesses the GPU and thus requires JS tooling to handle the GPU somehow (no), or whether it makes subrequests to a separate GPU service not running the Worker runtime (yes).


Hey John, great work on this! Just a heads-up: small typo on that page under R2: "Build mutli-cloud training architectures with free egress."


Thanks. Getting it fixed.


Any chance you're looking for technical product folks to work on this? I actually worked on a very similar deployment internally at Livepeer (focus was on live video enhancements but also generalized edge compute)!


we always are! email is rita at cloudflare dot com :)


thanks!


I see plans for more models via HF partnership, but can I or will I be able to run a custom fine-tuned version of a supported model?


On top of our hosted and supported catalog of models, and the deploy to CF partnerships like the HF one, you will also be able to bring your own custom model at some point in time.


Awesome. What about compiled model support? Running most of the listed models without compilation only makes sense for hobby projects.


Is CodeLlama somewhere on the roadmap?


Very cool and also very simple as I’d expect from Cloudflare.

But I have a question - why not make inference as easy as the translation? Why do I have to run that in a worker rather than just as a simple API call? That would be much simpler.

Is there a technical reason or is it that people would want to have logic before making the call to llama?


You can do both! All of our models are supported both via the Workers/Pages binding (which makes it really easy to host the rest of the logic) and via the REST API.

docs: https://developers.cloudflare.com/workers-ai/get-started/res...

(llama specific example here too under curl: https://developers.cloudflare.com/workers-ai/models/llm/ )
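
For readers skimming the thread, the binding route from those get-started docs looks roughly like the sketch below (based on the launch-era examples; the @cloudflare/ai package name and model slug are as documented at launch and may change):

    // Rough sketch of a Worker using the AI binding, per the launch-era docs.
    import { Ai } from "@cloudflare/ai";

    export interface Env {
      AI: any; // the [ai] binding configured in wrangler.toml
    }

    export default {
      async fetch(request: Request, env: Env): Promise<Response> {
        const ai = new Ai(env.AI);
        // Run the hosted Llama 2 7B (int8) model with a simple prompt.
        const output = await ai.run("@cf/meta/llama-2-7b-chat-int8", {
          prompt: "What is the origin of the phrase 'Hello, World'?",
        });
        return Response.json(output);
      },
    };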


Awesome! Thanks for that. Cloudflare smashing it on simplicity as per usual.


@jgrahamc

Very curious if you can elaborate on Vectorize. More than the edge GPUs, entering the vector DB marketplace with a CF-proprietary integration is interesting (and a bit scary) to build on.

- Will Vectorize ever get OSS'd?

- If you want to migrate in either direction from some other vector DB (Milvus, Weaviate, Qdrant, Pinecone, etc.), what should you expect in terms of level of effort and features?

- What inherent advantages (latency? features?) would you get exclusively from Vectorize?


This sounds really cool. I already love cloudflare because of how easy they make it to compete with bigger companies for indie devs like myself.

Their pricing for their products always seems so much more affordable than something like AWS or GCP. I'm using R2 for storage for a client project, and from my calculations we would have to pay almost 3 times more if we hosted the files with AWS on S3.

I really hope they keep adding all the latest open-source AI models to their platform. If their pricing is as cheap as they state in this blog post, then I would rather use this service than install models on my computer. To get good inference speed with open-source models on my PC right now, I have to let usage spike up to 100%...


This could become something useful in the future, but right now it appears to be toy models only. I'm assuming Cloudflare will add useful models eventually, but the cold start times are going to be horrible on those. I'm struggling to think of useful applications for this. Maybe one day.


Which models would you find useful?


Depends on the task, but generally, LLaMA 2 finetunes of at least 13B params, 4-bit quantized


Did you see we have Llama 2 7B int8?


Can you provide an example of a task where the 7b model is useful?


It will be interesting to see if they can undercut OpenAI themselves on the cost for running Whisper in the cloud.


Are OpenAI still limiting Whisper to 50 requests per minute (and gpt-3.5-turbo to 3/min)? If so, then they don't need to undercut OpenAI, just provide unlimited requests. It's nearly impossible to provide user-facing AI solutions to customers due to these limits, and it's a very bad user experience (and a security risk) to force users to provide their own OpenAI API keys.

https://replicate.com/ is much better, with an average of 10 requests per second, but I would still very much prefer unlimited (where they can monitor high-volume users to check the traffic is legitimate), or a new pricing model where e.g. 0-10k rpm is at $0.001/sec, 10-100k rpm at $0.010/sec, and 100k+ rpm at $0.100/sec (pricing would of course need to be fine-tuned; just a quick example).



AWS too -> https://aws.amazon.com/blogs/aws/amazon-bedrock-is-now-gener...

> (Coming Soon) The Llama 2 13B and 70B parameter models by Meta will soon be available via Amazon Bedrock’s fully managed API for inference and fine-tuning.


It would be super cool to see SD running on this! Hyped to play around with Llama, since I don't have access to a good GPU.


The biggest one missing is stable diffusion.


stay tuned!


Any chance you're looking for technical product folks to work on this? I actually led a very similar deployment internally at Livepeer (focus was on live video enhancements but also generalized edge compute)!


You can always email jgc@cloudflare.com and I'll route the resume to the right people.


thanks!


At this cost, no one will be running embeddings elsewhere :o

https://twitter.com/eastdakota/status/1707056412575023352?t=...


I've never played with Cloudflare Workers, but I thought they were implemented as JavaScript runtimes that form an edge computing network.

Are the models run in JavaScript/WebAssembly behind the scenes?


You can use JavaScript or Wasm to interface with the AI binding (think of it as the SDK), but the inference task itself runs natively on top of an ML runtime, and the models are loaded onto GPUs.


Given this is a hosted API rather than arbitrary hosting, why choose the word "serverless"? Do you plan to offer arbitrary hosting in the future?

(bias: am Banana CEO)


Embedding cost and model choice make this very compelling. I'm working on leveraging embeddings in https://github.com/discourse/discourse-ai, where they power related topics, semantic search, and tag and category recommendations, among other things.

A cheap offering like this can make it a lot more reasonable for self-hosters.
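
To make the embeddings use case concrete, here is a small sketch of scoring topic relatedness with the bge-base model from the catalog (the { text } input and { data } output shapes follow the launch docs and may differ):

    // Sketch: embed two topic excerpts via the AI binding and compare them with
    // cosine similarity. Input/output shapes are per the launch docs (assumption).
    import { Ai } from "@cloudflare/ai";

    async function relatedScore(env: { AI: any }, a: string, b: string): Promise<number> {
      const ai = new Ai(env.AI);
      const { data } = await ai.run("@cf/baai/bge-base-en-v1.5", { text: [a, b] });
      const [va, vb] = data as number[][];
      // Cosine similarity between the two embedding vectors.
      let dot = 0, na = 0, nb = 0;
      for (let i = 0; i < va.length; i++) {
        dot += va[i] * vb[i];
        na += va[i] * va[i];
        nb += vb[i] * vb[i];
      }
      return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }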


This would be a game changer if they had something for image generation as well. Oh well, maybe it's coming soon, since the page says this is just a small preview. The best part for me personally is that it's available on all plans, and the pricing looks good too.

OT, but if Cloudflare fixes their false positives, showing random captchas when I'm trying to browse the net on a VPN, they will surely be one of my favorite companies.


Image generation is coming.


kinda cool, but rather limited if you can't use custom models.


You will be able to.


Looking forward to it!


What’s the cold boot time? Isn’t that the most important part?


How does Workers AI compare to Replicate.com?


There are a lot of similarities, but here are a few differences:

It's region-less, and runs your inference task on the Cloudflare network, near your end users. Though that's not entirely true yet - we'll be in 100 sites by EOY '23, and nearly everywhere by EOY '24.

It was built to work alongside our new vector database, Vectorize, out of the box.

It's accessible to all developers, regardless of where you deploy (via API), but we wanted to offer a seamless option for developers already building on Cloudflare - Workers, Pages, etc.


Thought they were gonna go with CPU-based inference via llama.cpp/ggml. This project should definitely be made into a programming book.


I think they quickly reached the limits of CPU inference, and that's why they decided to scale up with GPUs.

Because of how they operate, they can't just release in a single DC like other providers. It's a partial "all or nothing" scenario.


Unless I missed something, it seems they didn't share which models by parameter count they're gonna be offering, nor anything at all about cost.

Also, this strikes me as really weird phrasing:

> Llama 2 now available for global usage on Cloudflare’s serverless platform, providing privacy-first, local inference to all

"privacy-first & local inference" would be to run it locally, on your own hardware, isn't that exactly what should be referred by when using "local"? Or has the definitions of words completely gone out the window as of late?


This is just a press release, and not a great source of in-depth info. This blog post from yesterday goes into it: https://blog.cloudflare.com/workers-ai/

>Models you know and love

>We’re launching with a curated set of popular, open source models, that cover a wide range of inference tasks:

>Text generation (large language model): meta/llama-2-7b-chat-int8

>Automatic speech recognition (ASR): openai/whisper

>Translation: meta/m2m100-1.2

>Text classification: huggingface/distilbert-sst-2-int8

>Image classification: microsoft/resnet-50

>Embeddings: baai/bge-base-en-v1.5


I missed that blog post somehow; thanks for sharing. A bit disappointing that it's just the 7B model; it's a good starting point for fine-tuning another small model, but it really isn't useful on its own.

> This is just a press release, and not a great source of in-depth info

Not sure I'd call outright lying/getting the most basic points wrong "not a great source of in-depth info", like the "local-first" part.


> Or has the definitions of words completely gone out the window as of late?

I think it means local as in "nearby" - so if you're in the UK it isn't processed in us-east-1, for example. You can pick one local to you.


Yeah, I think so too, but I guess I'm more upset that they use "local" in a "remote but physically nearby" fashion when it usually refers to "this computer/device".


Yeah - I think it's "local" as in "localisation". It is a little overloaded.


At least you get privacy from Meta.


Considering this is a collaboration, don't you think the agreement between them has something about feeding data back to Meta?


I just think it's loading the models and revenue sharing.

It would be similar to their partnership with Hugging Face, I suppose.

What data would be useful to capture? Outside of a human feedback loop, I don't think there is an actual use-case for it.

Note: not certain


Without any of us actually knowing the details of the collaboration, all we can do is guess, I guess.

> What data would be useful to capture?

Pipe the prompts straight to Meta and I'm sure they'll be able to extract a ton of useful data.


What's the use-case without any human feedback, which I already mentioned?


I don't work at Facebook, so I'm not gonna spend my time doing their work for them or you.

But off the top of my head, classifying the data into various categories and mentions of Facebook/Meta could allow them to derive sentiment about the company based on geographical location, and know where to invest more in changing the sentiment.


Even if this were a valid suggestion (I don't think it is):

This would be incredibly hard to do without IPs.

Cloudflare won't forward them, and it can be set up as an API. Additionally, mobile IPs are reused across users.

That doesn't even sound remotely like a valid indicator for "investments", definitely not when you consider the cost of creating and updating such a model.


100% marketing speak.


Can you use this for _anything_ you like? Business uses, etc.? Not sure what the terms are for Llama. I thought they were restricted in some way?


LLaMa 2 is sadly not open source and not open science, despite what Facebook keeps on claiming:

https://blog.opensource.org/metas-llama-2-license-is-not-ope...

Thus you will have to read the license and judge whether it is compatible with your business or use cases.

One should give them credit for making it available, which is a lot better than plenty of others. However, actually open models are starting to appear, so perhaps we will soon see Facebook and the like making theirs open as well? Who knows.


How well-protected is the "edge" in edge computing? I can see Cloudflare has edge locations in many countries. Can entities with physical access to Cloudflare's edge machines get access to sensitive user data?


With Cloudflare's default settings, a malicious entity can intercept Cloudflare <-> backend connections invisibly to the end user, since the SSL certificates aren't validated. The end user can also be the victim of plain old HTTP MITM on Cloudflare's upstream networks, as happened in 2016: https://news.ycombinator.com/item?id=12091900

It's hard to take Cloudflare's commitment to security seriously when they still ship such terrible default settings.


What do you mean?

You can install certificates issued by Cloudflare, and then the only thing that can connect to your server is Cloudflare.

No one can intercept it then.

If you're talking about Flexible SSL: sure, you can use it purely as an HTTPS proxy for the SEO score of your blog. But securing it is not much effort.

If it's just for a static blog, I'm not sure what you'd be protecting, though.


If you have a valid HTTPS certificate for example.com and then add example.com to Cloudflare, your overall security decreases, because the path from the CF datacenter to your origin is now vulnerable to MITM: the default SSL setting is "Full", which doesn't check certificate validity.

To the less experienced sysadmin everything looks like it's working fine and users also don't notice any difference, which is why it's a terrible default.

Sure, you _can_ configure Cloudflare securely, but it should be secure out of the box. That, however, adds friction when the origin doesn't have a valid SSL certificate, which probably hurts someone's KPIs.



We were building AI https://efn.kr/#ai into https://RTCode.io and ...

Cloudflare drops this! Sweet! Now, we have BYOAI.

Our whole offering runs on their network https://RTEdge.net

Our playground lets you live-code user Workers that deploy to Cloudflare Workers for Platforms!

- https://sw.rt.ht/?io (in-browser)

- https://sw.rt.ht/ (region-Earth)


All well and good, but why is the webpage loading this obfuscated javascript file? https://archive.is/htQgN


CEO here: simply because we want to protect the core components that differentiate our services. Similar to, say, https://www.photopea.com/ and many others you can find if you look behind the scenes.

Once we raise funding and establish a strong market presence, we will revisit this decision and dedicate developer resources to sharing our in-house tech with the world more openly. This will take a full position. If you like what you see and want to work with us, send us an email at work@elefunc.com and we will get in touch once we have open positions!


Looks interesting, but as feedback, the above-the-fold message on your homepage is buzzword salad; it's not really clear what you're offering a potential customer here. "Elefunc is building the real-time web, to empower thought-speed creativity."

By way of comparison, I know exactly what I'm getting as a developer on Replit's homepage: "Make something great. Build software collaboratively with the power of AI, on any device, without spending a second on setup"


Thank you! We are just getting started, and you've offered the first valuable public feedback on the company landing page. I will make sure our above-the-fold becomes just as clear! If you have further suggestions, please send them to support@elefunc.com



