> 99% of the code in this PR [for llama.cpp] is written by DeepSeek-R1
It's definitely possible for AI to do a large fraction of your coding, and for it to contribute significantly to "improving itself". As an example, aider currently writes about 70% of the new code in each of its releases.
I automatically track and share this stat as a graph [0] with aider's release notes.
Before Sonnet, most releases were less than 20% AI-generated code. With Sonnet, that jumped to >50%. For the last few months, about 70% of the new code in each release has been written by aider. The record is 82%.
Folks often ask which models I use to code aider, so I automatically publish those stats too [1]. I've been shifting more and more of my coding from Sonnet to DeepSeek V3 in recent weeks. I've been experimenting with R1, but the recent API outages have made that difficult.
Thank you so much for linking me to that! I think an `aider stats`-type command would be really cool; it could calculate stats based on activity since the first aider commit, or across all of the repo's commits.
Aider has a command to add files to the prompt. For files that are not added, it uses tree-sitter to extract a high-level summary. So for a `.env` file, it tells the LLM that the file exists, but not what is in it. If the model thinks it needs to see that file, it can request it, at which point you get a prompt asking whether it's okay to make that file available.
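To make the "summary, not contents" idea concrete, here's a rough stand-in sketch. Aider itself does this with tree-sitter so it works across many languages; the snippet below uses Python's built-in ast module instead, and the file path is just an example.

import ast

def outline(path):
    """Return only the top-level signatures of a Python file, not its body."""
    with open(path, encoding="utf-8") as f:
        tree = ast.parse(f.read())
    lines = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = ", ".join(a.arg for a in node.args.args)
            lines.append(f"def {node.name}({args}): ...")
        elif isinstance(node, ast.ClassDef):
            lines.append(f"class {node.name}: ...")
    return lines

# e.g. print("\n".join(outline("aider/models.py")))

A skeleton like that is roughly what the LLM sees for files you haven't added, which is enough for it to decide whether to ask for the full file.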
> 99% of the code in this PR [for llama.cpp] is written by DeepSeek-R1
you're assuming the PR will land:
> Small thing to note here, for this q6_K_q8_K, it is very difficult to get the correct result. To make it work, I asked deepseek to invent a new approach without giving it prior examples. That's why the structure of this function is different from the rest.
This certainly wouldn't fly in my org (even with test coverage/passes).
>> Small thing to note here, for this q6_K_q8_K, it is very difficult to get the correct result. To make it work, I asked deepseek to invent a new approach without giving it prior examples. That's why the structure of this function is different from the rest.
> This certainly wouldn't fly in my org (even with test coverage/passes).
To be fair, this seems expected. A distilled model might struggle more with aggressive quantization (like q6) since you're stacking two forms of quality loss: the distillation loss and the quantization loss. I think the answer would be to just use the higher cost full precision model.
To some extent, yes. I would not run production off of it, even if it can eke out performance gains on the hardware at hand. I'd suggest vLLM or TGI or something similar instead.
I think the secret of DeepSeek is basically using RL to train a model that will generate high quality synthetic data. You then use the synthetic dataset to fine-tune a pretrained model and the result is just amazing: https://open.substack.com/pub/transitions/p/the-laymans-intr...
> It's definitely possible for AI to do a large fraction of your coding, and for it to contribute significantly to "improving itself". As an example, aider currently writes about 70% of the new code in each of its releases.
That number by itself doesn't say much.
Let's say I have an academic article written in Word (yeah, I hear some fields do it like that). I get feedback, change 5 sentences, save the file. Then 20 kB of the new file differs from the old one. But the change I made was only 30 words, maybe 200 bytes. Does that mean Word wrote 99% of that update? Hardly.
Or in C: I write a few functions where my old-school IDE handled the indentation and automatically inserted the closing curly braces. Would I say that the IDE wrote part of the code?
Of course the AI-supplied code goes beyond my two examples, but claiming that some tool wrote 70% "of the code" assumes the utility of code scales linearly with its volume, which doesn't represent reality very well.
Every metric has limitations, but git blame line counts seem pretty uncontroversial.
Typical aider changes are not like autocompleting braces or reformatting code. You tell aider what to do in natural language, like a pair programmer. It then modifies one or more files to accomplish that task.
Here's a recent small aider commit, for flavor.
-# load these from aider/resources/model-settings.yml
-# use the proper packaging way to locate that file
-# ai!
+import importlib.resources
+
+# Load model settings from package resource
 MODEL_SETTINGS = []
+with importlib.resources.open_text("aider.resources", "model-settings.yml") as f:
+    model_settings_list = yaml.safe_load(f)
+    for model_settings_dict in model_settings_list:
+        MODEL_SETTINGS.append(ModelSettings(**model_settings_dict))
The point is that not all lines are equal. The 30% that the tool didn't write is the hard stuff, and not just in line count. Once an approach, an architecture, or a design is clear, implementing it is merely manual labor. Progress is not linear.
You shouldn't judge your software engineers by lines of code either. The people who think through the hard stuff often don't have that many lines checked in, but they are the key to your success.
"The stats are computed by doing something like git blame on the repo, and counting up who wrote all the new lines of code in each release. Only lines in source code files are counted, not documentation or prompt files."
I don't, as I'm not in that ecosystem, but Groq is OpenAI-compatible, so any tool that speaks the OpenAI API (99% do) and lets you set your own base URL should work.
For example, many tools let you use local LLMs. Instead of putting in the URL of the local LLM, you would just plug in the Groq URL and key.
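For instance, with the official openai Python client the swap really is just the base URL. (The Groq endpoint and model name below are examples of what I believe is current; check their docs for up-to-date values.)

from openai import OpenAI

client = OpenAI(
    base_url="https://api.groq.com/openai/v1",  # instead of a local LLM's URL
    api_key="YOUR_GROQ_API_KEY",
)

resp = client.chat.completions.create(
    model="llama-3.3-70b-versatile",  # whichever model Groq serves; check their model list
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.choices[0].message.content)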
Continue.dev is available for JetBrains, though the plugin is not as good as its VS Code counterpart. You can plug in any OpenAI-compatible API. Under the experimental settings, you can also define an applyCode model (and others), which you could set to a faster, cheaper one (e.g. Sonnet).
aider looks amazing - I'm going to give it a try soon. I just had a question on API costs to see if I can afford it. Your FAQ says you used about 850k tokens for Claude, and their API pricing says output tokens are $15/MTok. Does that mean it cost you under $15 for your Claude 3.5 usage, or am I totally off base? (Sorry if this has an obvious answer ... I don't know much about LLM API pricing.)
When I was mostly just using Sonnet I was spending ~$100/month on their API. That included some amount of bulk API use for benchmarking, not just my interactive AI coding.
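To do the arithmetic from the question explicitly: if, worst case, all 850k tokens were billed at Sonnet's $15/MTok output rate, that's under $13; in practice a lot of those tokens are input tokens at the cheaper $3/MTok rate, so that particular usage would cost even less.

tokens = 850_000
output_price_per_mtok = 15.00  # USD per million output tokens (Claude 3.5 Sonnet)
cost = tokens / 1_000_000 * output_price_per_mtok
print(f"${cost:.2f}")  # -> $12.75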
If you're concerned about API costs, the experimental Gemini models with API keys from AI Studio tend to have very generous free quotas. The quality of e.g. Flash 2.0 Experimental is definitely good enough to try out Aider and see if the workflow clicks. (For me, the quality has been good enough that I just stuck with it and haven't gotten around to experimenting with any of the paid models yet.)
In case you are on a 32+GB Mac, you could try deepseek-r1-distill-qwen-32b-mlx in LM Studio. It’s just barely usable speed-wise, but gives useful results most of the time.
When a log line contains {main_model, weak_model, editor_model}, does the existence of main_model mean the person was using Aider in Architect/Editor mode?
Do you usually use that mode and, if so, with which architect?
[0] https://aider.chat/HISTORY.html
[1] https://aider.chat/docs/faq.html#what-llms-do-you-use-to-bui...