Hacker News | kateklink's comments

For pass@1, HumanEval measures how well the model solves a task from the set when given only one attempt. It's not a perfect metric; there are others like DS-1000 and MBPP (we've included them on the HuggingFace model card). HumanEval is convenient for comparing against other models, since it gives a quick idea of how capable a model is.


> given only one chance to solve it

my understanding is that there are two usages of the pass@{number} syntax. the HumanEval/Codex paper interprets the {number} as the number of attempts[0]. however, language modelers seem to use it to denote the number of few-shot example demonstrations given in the context. these are starkly different, and i wish the syntax weren't overloaded

---

[0] https://arxiv.org/pdf/2107.03374.pdf

> Kulal et al. (2019) evaluate functional correctness using the pass@k metric, where k code samples are generated per problem, a problem is considered solved if any sample passes the unit tests, and the total fraction of problems solved is reported.
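
for concreteness, here's that estimator from the paper transcribed into code (note it reduces to exactly c/n when k=1):

    import numpy as np

    def pass_at_k(n: int, c: int, k: int) -> float:
        """Unbiased pass@k estimator from Chen et al. (2021).

        n: samples generated per problem
        c: samples that passed the unit tests
        k: attempt budget being evaluated
        """
        if n - c < k:
            return 1.0
        # 1 - C(n-c, k) / C(n, k), computed as a numerically stable running product
        return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

    # e.g. 200 samples per problem, 37 passing: pass@1 = 37/200
    print(pass_at_k(200, 37, 1))  # 0.185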


we want to help developers who need either an on-premise or a permissively licensed code assistant; Copilot offers neither. We also wanted to lower the barrier to self-hosting, so the model runs on most GPUs with just 3 GB of RAM, and to make code completions fast and efficient (understanding the entire surrounding context, not just the previous tokens).


We’ve finished training a new code model, Refact LLM, which took us about a month. The main use case is blazing-fast code completion with fill-in-the-middle; additionally, the model can reply to chat prompts.

It performs much better than all code models of similar size, and almost matches StarCoder's HumanEval score while being 10x smaller.

With its small size, it can run on most modern GPUs, requiring just 3 GB of RAM.

You can try self-hosting it in Refact https://github.com/smallcloudai/refact/ to get a fast local Copilot alternative with decent suggestions.

Weights and model card https://huggingface.co/smallcloudai/Refact-1_6B-fim.
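
A minimal completion sketch with transformers, assuming the StarCoder-style FIM tokens shown on the model card (check the card for exact usage):

    from transformers import AutoModelForCausalLM, AutoTokenizer

    checkpoint = "smallcloudai/Refact-1_6B-fim"
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForCausalLM.from_pretrained(checkpoint, trust_remote_code=True)

    # Fill-in-the-middle: the model completes the gap between prefix and suffix.
    prompt = '<fim_prefix>def print_hello_world():\n    """<fim_suffix>\n    print("Hello world!")<fim_middle>'
    inputs = tokenizer.encode(prompt, return_tensors="pt")
    outputs = model.generate(inputs, max_new_tokens=32, temperature=0.2, do_sample=True)
    print(tokenizer.decode(outputs[0]))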

We would love to hear your feedback!


How does it compare to Copilot? A metric I'd like to see is the percentage of proposed completions accepted by a human user. If you had an extension that 50% of the time proposed a Copilot completion and 50% of the time a Refact completion (blind to the user), you could compute a metric like this.
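
A sketch of what that blind A/B harness could look like (every name here is a hypothetical stub, not either product's actual API):

    import random
    from collections import defaultdict

    shown = defaultdict(int)
    accepted = defaultdict(int)

    def propose(request, backends):
        """Pick a completion backend uniformly at random, blind to the user."""
        name = random.choice(sorted(backends))
        shown[name] += 1
        return name, backends[name](request)

    def acceptance_rates():
        return {name: accepted[name] / shown[name] for name in shown}

    # Toy usage with stub backends; bump accepted[name] when the user accepts.
    backends = {"copilot": lambda req: "...", "refact": lambda req: "..."}
    name, suggestion = propose("def fib(n):", backends)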


Does ctransformers (https://github.com/marella/ctransformers#supported-models) support running Refact?

I see the model type "gpt_refact" in https://huggingface.co/smallcloudai/Refact-1_6B-fim/blob/mai...
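
One way to find out is to just try loading it. A hedged sketch: it assumes a GGML-format conversion of the weights exists, and whether "gpt_refact" is recognized depends on your ctransformers version (an unsupported type fails at load time):

    from ctransformers import AutoModelForCausalLM

    llm = AutoModelForCausalLM.from_pretrained(
        "Refact-1_6B-fim-ggml.bin",   # hypothetical converted weights file
        model_type="gpt_refact",      # the type declared in the model config
    )
    print(llm("def print_hello_world():"))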


Is it possible to run it as an LSP so that it can be used in editors other than VSCode and JetBrains? (sorry if this question is completely mad, my understanding of how these things work is extremely limited)


Yes, it's coming up in a couple of weeks.


Great, thanks. I'll keep an eye out.


hi, i'm trying to fine-tune the refact model using evol code alpaca, but the loss always stays above 2. i've tried different hyperparameters, but it doesn't help. can you give me some advice?


> almost reaches the same HumanEval

how can you tell that HumanEval hasn't leaked into your training data in some form?


Hi! We ran LSH filtering over our datasets to remove all code that could be similar to HumanEval samples.
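
For the curious, a minimal sketch of that style of near-duplicate filtering with MinHash LSH via the datasketch library (the threshold and tokenization are illustrative, not necessarily the exact pipeline used):

    from datasketch import MinHash, MinHashLSH

    def minhash(code, num_perm=128):
        m = MinHash(num_perm=num_perm)
        for token in code.split():  # crude tokenization, for illustration
            m.update(token.encode("utf8"))
        return m

    # Toy stand-ins for the real corpora:
    benchmark = {"he_0": "return sum(xs) / len(xs)"}
    training = ["return sum(xs) / len(xs)", "print('hello')"]

    lsh = MinHashLSH(threshold=0.8, num_perm=128)  # Jaccard-similarity cutoff
    for key, code in benchmark.items():
        lsh.insert(key, minhash(code))

    # Keep only training samples that don't collide with any benchmark entry.
    clean = [code for code in training if not lsh.query(minhash(code))]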


so, we have to trust your procedure...


You can check whether the model reproduces the canonical solutions from HumanEval. I understand it's not ideal, but at least you can verify it yourself.
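
A rough version of that check: greedily complete each HumanEval prompt and flag near-verbatim matches with the canonical solutions (a sketch using the hub copy of the benchmark; the similarity cutoff is arbitrary):

    from difflib import SequenceMatcher
    from datasets import load_dataset
    from transformers import AutoModelForCausalLM, AutoTokenizer

    checkpoint = "smallcloudai/Refact-1_6B-fim"
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForCausalLM.from_pretrained(checkpoint, trust_remote_code=True)

    for p in load_dataset("openai_humaneval", split="test"):
        ids = tokenizer.encode(p["prompt"], return_tensors="pt")
        out = model.generate(ids, max_new_tokens=256)
        completion = tokenizer.decode(out[0][ids.shape[1]:])  # strip the prompt
        ratio = SequenceMatcher(None, completion, p["canonical_solution"]).ratio()
        if ratio > 0.95:  # near-verbatim reproduction looks like leakage
            print(p["task_id"], round(ratio, 3))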

There are a bunch of other benchmarks too; check out the page https://huggingface.co/smallcloudai/Refact-1_6B-fim

Also, feel free to run any new benchmarks.


at Refact we have a JetBrains plugin that works with a bunch of local code models https://github.com/smallcloudai/refact/


having more plugin support is in our plans for sure. We're also open to contributions.


we try to eliminate this problem by using code models trained only on permissively licensed code; then you can run them locally without sending code anywhere


we're going in this direction for code models with Refact https://github.com/smallcloudai/refact/ - right now you can self-host code models, fine-tune them on local files, and get the model running locally inside your IDE


try refact.ai - they have a plugin for JetBrains IDEs and support for local LLMs https://github.com/smallcloudai/refact/


there's also Refact (https://github.com/smallcloudai/refact/) with support for several open-source code LLMs and extensions for VS Code and JetBrains


at refact.ai we have different functions for code refactoring: making code shorter, simplifying complexity, etc.

