varunkmohan's comments

They did not, Microsoft made the bet that there would be some application in the future that would require distribution to all developers. Turns out that application was Github Copilot.


"Why did Microsoft strive for [market penetration]? Turns out: they want to [utilize market penetration]."

What a novel idea.


Some application in the future that turns out to be AI? That is some level of hammering a square peg into a round hole, as I see it (confirmation bias?). The argument quickly falls apart, since GitHub Copilot is also available as an extension for IntelliJ and is not exclusive to VS Code. VS Code shines with or without GitHub Copilot because it also has an exclusive extension ecosystem that is not open source but very much bait to pull Linux devs into Windows. Their bet on WSL, and exclusive extensions like WSL integration and Remote Containers, literally shows how much they want developers in their ecosystem. VS Code itself is their bet to keep developers wrapped in their tentacles; Copilot, on the other hand, is barely a feature that lets them charge subscribers across multiple IDEs. Copilot is not an exclusive feature that brings devs into their ecosystem, it's just an addition.


I mean, they have already published many plugins tied to Azure and the other services they offer. It's not about Copilot, but they did strike gold with Copilot.


You don't think the popularity of Atom vs. Visual Studio was the main factor?


Yeah, it's pretty clear to me MS sees the web as the future, and they had the big, heavy (expensive) Visual Studio and Notepad and nothing in between.


Hi from the Codeium team. It's awesome to hear you are allowing other code LLMs to be used on the Replit platform (we're big fans)! We'd love to enable our free Chrome extension on Replit.


would love to be able to compare codeium vs ghostwriter inside replit! (or toggle between them based on known strengths or preferences, perhaps by project or by filetype)


Hi, Varun from the Codeium team here. It's a fairly common practice to not support these sorts of domains, to prevent abuse. Also, Copilot downloads a language server binary for its vim extension as well.

Happy to discuss how we could improve.


Hi, thanks for answering! The problem with giving companies one's real e-mail address is that they often sell it to advertisers. I've read your privacy policy, but it's still not clear to me whether you're allowed to do that (there are some phrases suggesting yes and some suggesting no).

What is the binary needed for? Why can't it be distributed as a part of the repository?

I don't use Copilot, so I can't compare it in this way.


We explicitly state in our privacy policy that we don't sell user data to third parties.

The language server is the common binary across IDEs that actually processes local/repo context and communicates with the cloud service to generate completions. For now, it is closed source, similar to other tools like Tabnine and Copilot, but we may change that in the future.


The problem is that what companies say and what they do are not always aligned; things change and accidents happen.

Unfortunately, that means broad brushes are now used to paint impressions, and what companies say in their terms is not trusted.


The privacy policy also states:

> We and/or our third-party marketing partners may use the Personal Information you send to us for our marketing purposes, if this is in accordance with your marketing preferences.

Doesn't this imply that the third-party marketing partners get the user's e-mail address?

What's the reason behind the language server being closed-source? Do you have something to hide?


> Do you have something to hide?

I imagine this is the only reason why closed-source software exists


Sad. This whole thing where companies train chinchilla-optimal models has to stop - I'm always happy to get open-source models, but it would be great if they were useful rather than just a benchmark.


Exactly why our product Codeium (Copilot alternative) supports self-hosting.


OpenAI has done the reasonable thing of not exposing the probability distribution per generated token, so it's very hard to use the API to completely map their models. Ultimately, you still need a very large base model to compete.


"not exposing the probability distribution per generated token"

can you elaborate on what that means?


A language model takes in a sequence of tokens and outputs a probability (0-1) for each token in the vocabulary (the set of all tokens the model knows). Based on this probability distribution, there are various sampling strategies that can be employed to choose which token to actually show to the user.
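The step described above can be sketched in a few lines. This is a toy illustration (not any particular model's actual code): raw model scores (logits) are turned into a probability distribution with softmax, then a token is sampled from it, with temperature as one common sampling knob.

```python
import math
import random

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(vocab, logits, temperature=1.0, rng=random):
    """Pick one token from the vocabulary according to the
    temperature-scaled probability distribution."""
    scaled = [x / temperature for x in logits]
    probs = softmax(scaled)
    r = rng.random()
    cumulative = 0.0
    for token, p in zip(vocab, probs):
        cumulative += p
        if r < cumulative:
            return token
    return vocab[-1]  # guard against floating-point rounding

# Hypothetical 3-token vocabulary and model output for one step
vocab = ["cat", "dog", "fish"]
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)
print(probs)          # "cat" gets the highest probability
print(sample_token(vocab, logits, temperature=0.7))
```

Lower temperatures sharpen the distribution toward the highest-probability token (greedy-like); higher temperatures flatten it, making rarer tokens more likely to be chosen.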


OpenAI's previous completion endpoint for the davinci-003 and older models included a "logprob" return option: https://platform.openai.com/docs/api-reference/completions/c...

Their newer chat-style endpoint for the GPT-3.5-turbo and GPT-4 models no longer supports this: https://platform.openai.com/docs/api-reference/chat
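For context, a legacy completions response with logprobs requested contained a per-token structure roughly like the one below. The field names (`tokens`, `token_logprobs`, `top_logprobs`) follow the legacy completions API; the token and the numeric values here are invented for illustration.

```python
import math

# Hand-written mock of the legacy completions response shape when
# `logprobs` was requested (values are invented for illustration).
response = {
    "choices": [{
        "text": " world",
        "logprobs": {
            "tokens": [" world"],
            "token_logprobs": [-0.31],
            "top_logprobs": [{" world": -0.31, " there": -1.9}],
        },
    }]
}

lp = response["choices"][0]["logprobs"]
for token, logprob in zip(lp["tokens"], lp["token_logprobs"]):
    # exp(logprob) recovers the model's probability for the chosen token
    print(f"{token!r}: p = {math.exp(logprob):.3f}")
```

This is exactly the signal the chat endpoint stopped returning: without per-token logprobs, you only see the sampled text, not how confident the model was in each token or what the runners-up were.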


I was wondering. They do give you an embeddings endpoint. Can’t that theoretically be used to reconstruct the model’s weights?


This looks very cool and is a massive improvement. If only OpenAI would also publish what they are doing to make 32K work.


Curious if anyone has actually used this. It's quite slow for me and feels more like a cute idea rather than a useful product.


Thanks for the feedback. Currently there is not, but we will look into setting up custom keybindings or some other approach.


It isn't zero, that's for sure! Before Codeium, we as a team were building scalable ML infrastructure for some of the world's largest ML workloads, so we have a lot of experience in building infra (especially ML serving infra) that optimizes computation on GPUs/mixed compute resources to drastically reduce serving costs. We will probably talk more about these infra side optimizations in future blog posts!


Not a very elegant way of tip-toeing around the question of "how do you finance this?/what's the catch?"


The catch is the code runs on your machine but the team doesn't tell you.


Yes, but that would actually be an awesome feature! If only it didn't call home so much to send unknown payloads...


What is your monetization strategy? What additional features are you planning that will generate revenue?

