
C does not support strings, regexes, or maps natively. External libraries can provide them, but in my experience they are painful to use. String manipulation in general is painful in C, so it is probably not the best language for this demo.

However, if you really want high performance, you can go ahead and write it in C with custom matching code and a custom data structure for keeping your word count. This will most likely result in hundreds of lines of code, and enough performance to max out a PCIe SSD. However, that would be a different exercise.
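To give a rough idea of what the "custom data structure" half could look like, here is a minimal sketch of a fixed-size, open-addressing hash table keyed by word. Everything in it is an assumption made for illustration: the names (wc_table, wc_add, wc_get), the capacity, the 63-byte word limit, and the choice of FNV-1a as the hash. It also assumes words never contain NUL bytes. It is not tuned for speed and is not the article's method.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WC_SLOTS 4096     /* hypothetical fixed capacity; must be a power of two */
#define WC_MAX_WORD 63    /* longer words are truncated in this sketch */

struct wc_entry {
    char word[WC_MAX_WORD + 1];   /* an empty string marks a free slot */
    unsigned long count;
};

struct wc_table {
    struct wc_entry slots[WC_SLOTS];
    size_t used;                  /* number of distinct words stored */
};

/* Resets the table so every slot is free. */
void wc_init(struct wc_table *t)
{
    assert(t != NULL);
    memset(t, 0, sizeof *t);
}

/* 32-bit FNV-1a hash over len bytes. */
static uint32_t fnv1a(const char *s, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= (unsigned char)s[i];
        h *= 16777619u;
    }
    return h;
}

/*
 * Linear probing: returns the slot that already holds the word, or the
 * first free slot on its probe path, or NULL if the table is full.
 */
static struct wc_entry *wc_find(struct wc_table *t, const char *word, size_t len)
{
    size_t i = fnv1a(word, len) & (WC_SLOTS - 1);
    for (size_t probes = 0; probes < WC_SLOTS; probes++) {
        struct wc_entry *e = &t->slots[i];
        if (e->word[0] == '\0')
            return e;
        if (strlen(e->word) == len && memcmp(e->word, word, len) == 0)
            return e;
        i = (i + 1) & (WC_SLOTS - 1);
    }
    return NULL;
}

/* Counts one occurrence of word. Returns 0 on success, -1 on failure. */
int wc_add(struct wc_table *t, const char *word, size_t len)
{
    assert(t != NULL && word != NULL);
    if (len == 0)
        return -1;
    if (len > WC_MAX_WORD)
        len = WC_MAX_WORD;
    struct wc_entry *e = wc_find(t, word, len);
    if (e == NULL)
        return -1;                /* table is full */
    if (e->word[0] == '\0') {     /* first time this word is seen */
        memcpy(e->word, word, len);
        e->word[len] = '\0';
        t->used++;
    }
    e->count++;
    return 0;
}

/* Returns how many times word was added, or 0 if it was never seen. */
unsigned long wc_get(struct wc_table *t, const char *word, size_t len)
{
    assert(t != NULL && word != NULL);
    if (len == 0)
        return 0;
    if (len > WC_MAX_WORD)
        len = WC_MAX_WORD;
    struct wc_entry *e = wc_find(t, word, len);
    return (e != NULL && e->word[0] != '\0') ? e->count : 0;
}
```

The table is roughly 290 KB, so it is better declared static or heap-allocated than placed on a small stack. A real implementation would also need growth, and the matching and I/O code is where most of those hundreds of lines would go.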



First, if Rust's solution gets to pull in crates, then a C program can use regexps, with pcre or cre2.

Second, it's not clear to me at all why this problem asks for regexps. Both examples used a regexp to check a file extension, and then to find word boundaries. Both are trivial† to code directly.

Am I misunderstanding the problem here? It seems like there's hardly any string manipulation in it at all.

† Admittedly, I didn't bother being careful about word boundaries.
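For illustration, coding both checks directly might look something like the sketch below. The function names (has_txt_ext, next_word, count_words) are made up for this example, and it assumes a "word" is just a run of ASCII alphanumeric bytes, which is likely cruder than what the regexps in the article match.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if path ends in ".txt" (case-sensitive), 0 otherwise. */
int has_txt_ext(const char *path)
{
    assert(path != NULL);
    size_t n = strlen(path);
    return n >= 4 && strcmp(path + n - 4, ".txt") == 0;
}

/*
 * Finds the next word (a run of alphanumeric bytes) at or after *cursor.
 * On success, stores the word's length in *len, advances *cursor past
 * the word, and returns a pointer to the word's first byte.
 * Returns NULL when no word is left.
 */
const char *next_word(const char **cursor, size_t *len)
{
    assert(cursor != NULL && *cursor != NULL && len != NULL);
    const char *p = *cursor;

    /* Skip separators. */
    while (*p != '\0' && !isalnum((unsigned char)*p))
        p++;
    if (*p == '\0') {
        *cursor = p;
        return NULL;
    }

    /* Consume the word; isalnum('\0') is false, so this stops at the end. */
    const char *start = p;
    while (isalnum((unsigned char)*p))
        p++;

    *len = (size_t)(p - start);
    *cursor = p;
    return start;
}

/* Counts the words in a NUL-terminated buffer. */
size_t count_words(const char *text)
{
    size_t n = 0, len;
    while (next_word(&text, &len) != NULL)
        n++;
    return n;
}
```

With this definition an apostrophe splits a word, so "It's" counts as two words, which is the kind of boundary sloppiness the footnote admits to.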


It shouldn’t require them. I thought I saw a comment by the author explaining why they chose to use them, but there’s been like six different threads on three different forums and I can’t find it right now...

I know some C++ folks were annoyed it’s using regex due to criticisms of the standard regex libraries that can’t be fixed, or something, so it’s also not like folks agree that the original solution was optimal. I don’t think it was trying to be, so seems fine, but it is what it is.


The idea here is that it is a toy problem. The author is not trying to make the best word counter, he is comparing two languages by implementing the same algorithm on both and giving us his conclusions.

Using a different technique, for example by not using regexps, is a bit like cheating in that context. You are not comparing two languages; you are comparing two different solutions to the problem.

That's what I meant in my previous post. Either you do it as specified in the article, with regexes and maps, which requires pulling in libraries and working in a way that is much less convenient than in the C++ and Rust examples, for no good reason. Or you reimplement it using different techniques, which is not the point of the article.

Or to put it simply, the example in the article is not good for C.


The program is going to look substantially the same if I use pcre to check for ".txt" and split up words. And, if you read the original C++ article, it is not at all about comparing regex interfaces!

So, no, I don't think you're right about this.

More to the point, though: I'm talking about regex libraries because the parent comment is. My point: the Rust example uses 3p dependencies, so what C does "out of the box" is already out the window.


I assumed the original example was a bit of an overkill/lazy coding just to demonstrate that regexes and other conveniences are readily available, so the language would work nicely not just for this trivial example, but for bigger things you may want to write too.

I'm fed up with C dependencies, which, like Makefiles, always seem to be very easy in principle, and then kill you by a thousand cuts (like a regex library that defined a symbol that happened to conflict with a POSIX regex function, which I didn't even use, but which corrupted the memory of a completely different dependency elsewhere).


You definitely can't manage a C program with dependencies the way you would a Rust program with Cargo. You'd ordinarily bring in a couple major deps --- OpenSSL, zlib, pcre, &c --- and then maybe vendor in the small stuff. Most large C projects don't dep in things like logging or error handling the way Rust programs all do.

This particular program needs no third party dependencies at all; in fact, I bet it'd get longer if I added them (like a glib hash table, or pcre).


Thanks. How about Perl? I hear it's good at string manipulation, but I've never used it.



