Clib: Package manager for C (github.com/clibs)
83 points by petercooper on May 18, 2014 | 56 comments



Nice, but can anything challenge the "every language gets its own unique package manager" trend? (And its own unique build system, for that matter.)


There is a natural break between system-level package management and language-level package management. If you install language-level packages with system-level package managers, then each language package needs to support every single system-level package manager. E.g. you write a Ruby package and now you have to support apt-get, yum, Gentoo, Arch, brew, MacPorts, etc. This is a completely ridiculous situation to put language developers in, especially since most language-level packages don't care what system they run on.

Instead, the best practice seems to be to install programming languages with the system-level package manager and install the language's packages with the language-level package manager. That way you only need one system-level package per language, and each language-level package only needs to deal with that language's package manager. The system package manager is responsible for dealing with system-specific things while the language package manager is responsible for language-specific things. It's a natural division.

It is conceivable that there could be a single cross-language, language-level package manager. But what language would it be written in? And how would it deal with the fact that each language uses different technologies and has different approaches for installing and configuring libraries and packages? When you think about it for a bit, I don't think it's so insensible for each language to have its own package manager.


> you write a Ruby package and now you have to support apt-get, yum, Gentoo, Arch, brew, MacPorts, etc

wrong. that's the package maintainer's job.

moreover, sensible package management systems have templates or libraries (eclasses in Gentoo) to properly handle all kinds of packages.

all we ask for is that developers don't actively break package management by bundling everything as is done here.


Distro packages don't solve the needs of software developers. They solve the needs of distros, which are different from (and often conflict with) the needs of software developers. So you will always have an eternal struggle between language-level package managers, which work for developers, and system ones, which work for distros.

If there were some simple way to lob a tarball over the wall and get package inclusion in Debian, Fedora, Gentoo, etc., that would be one thing. In reality, however, the number of random GitHub repos >> the number of package maintainers. I know that Debian uses a variety of technical and social methods to discourage new packages, and most people's software just isn't popular enough to overcome those barriers, even if part of the reason it's not popular is that it's not well-packaged (sort of a catch-22 there). Not to mention that people who install via apt or rpm are perpetually out of date.

Now compare that with pip or rubygems--one package to maintain, installs more-or-less everywhere, accepts packages regardless of popularity, end users are more often up-to-date. Better for software developers' problems in every single dimension.

Sure that's bad for Fedora maintainers, but what are Fedora maintainers doing for me?


> Sure that's bad for Fedora maintainers, but what are Fedora maintainers doing for me?

ensuring a myriad of things that individual developers simply don't seem to care about, ranging from simple testing to inter-package compatibility to licensing to init scripts to old package cleanup to unbundling to...


Your view assumes that all language-level packages are already so mature that each of Debian, Gentoo, Arch, brew, MacPorts, etc. has a person whose responsibility it is to make a system-level package for that language-level package. What about new packages? If language packages are to be distributed on each system using the system package manager, then the way to get a new package into the hands of programmers is to a) write the package and b) before anyone has used it, spend a couple of years convincing a person on each system to create and maintain a system package for this new and completely unused Ruby/Python/Perl package. I guess that's reasonable if you're dead set on only using software that was written ten years ago.


It seems like there is often a breakdown in communication between upstream authors and distro package maintainers. It may also be that authors don't have time to talk with ~3 (Debian, Fedora, ???) distros, each with different packaging processes and rules.


The way this works is that the system-level packages handle setting up the metadata for the language's module system, either through files or install-time scripts, and the language package itself handles the rest.

This is how it works in Python, Perl, Ruby, Guile, and even Emacs on at least Debian (and derivatives such as Ubuntu), Fedora (and derivatives such as RHEL, CentOS, Oracle Linux, and a couple others), Arch, and some others I won't bother to mention.

It doesn't matter whether you're using DPKG/APT, RPM/YUM/DNF, ALPM/Pacman, or whatever; the logic lives in which files the package installs.

In a world where systems have decently-functional package managers of their own, there is no need for a language-specific package manager.

Language-level package managers exist because on major desktop and workstation platforms, system package managers have failed. (OS X and Windows both have several broken, incomplete package managers; none of them are installed by default.)

The fact that this got upvoted so heavily is not surprising, given that judging by the current frontpage of HN, much of the userbase doesn't even know ^R does reverse interactive search in most terminals since the '80s.


System package managers conflate the stuff that is needed to run the system with what the developer builds against. They handle several different versions of a library badly, they add latency of up to two years, and only a subset of packages are available. And when you are done, you have to uninstall all the crap you installed instead of just blowing away the virtualenv (or whatever mechanism the language has).


Of course, you don't actually need to uninstall all the crap, because if you install it only once on the system, you don't need to build thirty virtualenvs with roughly the same packages.

Also, choosing system package archives is simple. If you want two-ish years of stabilization for a feature release, and you want bugfixes for two or three years or longer, you can go with Debian. If you pull most things off of head anyway then go with Arch.

Of course, you're entitled to liking egocentric dependency resolution (or no dependency resolution at all, as practised by most), and I'm not going to stop you. But it really is very convenient and robust to handle high-level language libraries in a system package manager; it helps immensely when you intend to actually ship a piece of software rather than fapping off a script that some poor sap will need to update to patch five-year-old known remote vulnerabilities.


Hey, until recently (about 6 months ago) I didn't even know about <Alt>+<.> either.

It's not until you start using this stuff on a daily basis that you start discovering these things.

Not everyone has been using Linux for long. Frankly, I consider myself pretty much a Linux noob. I'm a spoiled Windows brat who only recently started transitioning to Linux (now Linux is the one spoiling me, but that's beside the point).

This is the primary reason why I come to HN anyway. To learn. :-)


What's alt-.? Some Gnome shortcut?


In many/most shells (bash included) it inserts the last word of the previous command line at the position of the cursor. It's very handy when you want to do two or more things to the same file, because you don't have to retype the name; just hit alt-.
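
For example (file name made up): after running "cp notes.txt /tmp/", typing "vim " and hitting alt-. gives you "vim /tmp/"; pressing alt-. again steps further back through the last arguments of earlier commands.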


Ah, it changes my tile layout in xmonad...

Edit: Well, I tend to use vim keybindings for readline, but I found it in the man page under "default keybindings", I think:

    "M-."  yank-last-arg
    (...)
    "M-_"  yank-last-arg
No default binding for vi mode apparently.


On Windows, I may install pywin32 at system level, but I need a virtualenv with different versions of requests [1] for different projects.

For .NET, NuGet is the only game in town - decent dependency resolution, a good package selection and if you do things right, no binaries stored in source control (except eleventy copies of NuGet.exe but I don't think they can fix that bad decision now...)

Even on Debian I might go with a virtualenv or rvm because I don't necessarily want users to have to apt-get a whole lot of stuff just to run a simple script.

>The fact that this got upvoted so heavily is not surprising, given that judging by the current frontpage of HN, much of the userbase doesn't even know ^R does reverse interactive search in most terminals since the '80s.

That worries me too, but then I remember I don't have a clue about the functional programming articles that turn up here. It all balances out in the end.

There may be a fair few people here who are making the shift from GUIs to CLIs for the first time, so it may not be a bad thing.


The smartest way to do it is to have everyone's special snowflake languages support automake. Because automake's tarballs are so common in the free software world, every package manager has special support to make packaging them easy (cdbs, eclasses, ...).

Then when you need something for yourself, you can go get it from an overlay (layman is the thing I miss most about gentoo), a PPA, or quickly roll up a package yourself.


It seems like a reasonable task to create a package manager which is able to interface with the "standard" package management tools for a particular language to avoid the need to worry about the details of each tool. I'm not sure if I'd use such a tool myself, but I could imagine some would find it useful and it doesn't seem like it would be a huge challenge to create.


I honestly find pkg-config to be the most sane "C package manager".

As long as ./configure && make && make install installs the .pc files, all I need to do is set PKG_CONFIG_PATH and be done with it. I don't think C needs its own package manager, or at least another one.
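
For example (assuming libcurl ships a .pc file), cc app.c $(pkg-config --cflags --libs libcurl) picks up the right -I, -L and -l flags without hardcoding any paths.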


I really like pkg-config, but it has some issues with static linking. I had to create a GCC wrapper script to properly build and link the libimobiledevice suite into 100% static executables. pkg-config .pc files have fields to handle transitive shared linking, but not one for static linking.

P.S. Apologies for the accidental downvote. I compensated by up voting another comment.


Yeah I love pkg-config and static linking is annoying. Generally what I do to get around that is in configure.ac something like:

    libuvdir=`pkg-config --variable=libdir libuv`  # ask pkg-config where libuv's libraries live
    LIBS="$LIBS $libuvdir/libuv.a"                 # link the static archive directly, not -luv
That tends to work well enough for my needs. I gate it behind a --build-static flag so that I can still link against shared libraries when I want to.

I'd rather take pkg-config+autotools over any of these newfangled options. Going to take a lot to sway me over from a tried and true option across osx/linux/solaris/eh fine hpux too but just cause i'm in a good mood.


A while ago James Halliday wrote a package system for C modeled on npm, the Node package manager: https://github.com/substack/dotc#dotc

The machinery behind it is surprisingly simple. You have two pre-processor commands, #require and #exports, and you use them to build up a hierarchy of dependencies.
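
Roughly like this (a hedged sketch; the directive names come from the comment above and the file contents are made up, so check dotc's README for the real grammar):

    /* times2.c */
    #exports times2
    int times2(int n) { return n * 2; }

    /* main.c */
    #require "./times2.c"
    #include <stdio.h>
    int main(void) { printf("%d\n", times2(21)); return 0; }

The idea being that the directives are rewritten away before the C compiler ever runs, so each file declares what it provides and pulls in exactly what it needs.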


Interesting. Not sure how I feel about the requirement to use C++ features for a C package manager though...


It's not too hard to make a meta-manager, is it? Wrap the ten or so package managers people use in a script, and have dependencies like

"clib": { // whatever }, "gem": { // whatever }

So you could do something like install GSL under "clib" and then Ruby GSL under "gem". That saves the trouble of installing things to bind to, and lets each tool manage what it manages best. An install through the meta-manager would first install the relevant package managers, and then you've got it all.


I think you significantly underestimate how difficult a problem correctly installing and configuring software is. Even just specifying and resolving the dependencies and relationships between libraries and packages is quite hard.


One thing I didn't realize until recently is that it's not just "hard" engineering-wise, it's hard theoretically, i.e. NP-hard. With version constraints, package managers actually have to solve a problem equivalent to SAT solving.
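
To make the reduction concrete, here's a toy problem (package names and versions made up) brute-forced in C. Real solvers obviously don't enumerate assignments; the point is that the constraints really are just boolean clauses:

    /* We want A 1.0; A needs B >= 1.0 and C 1.0; B 2.0 conflicts with
       C 1.0; at most one version of B may be installed. Each variable
       means "this exact version is installed". */
    #include <stdio.h>

    int main(void) {
        for (int m = 0; m < 16; m++) {
            int a1 = m & 1, b1 = (m >> 1) & 1, b2 = (m >> 2) & 1, c1 = (m >> 3) & 1;
            int ok = a1                    /* we want A 1.0           */
                  && (!a1 || b1 || b2)     /* A 1.0 -> B 1.0 or 2.0   */
                  && (!a1 || c1)           /* A 1.0 -> C 1.0          */
                  && (!b2 || !c1)          /* B 2.0 conflicts C 1.0   */
                  && (!b1 || !b2);         /* at most one B installed */
            if (ok) printf("solution: A 1.0, B %s, C 1.0\n", b1 ? "1.0" : "2.0");
        }
        return 0;
    }

(The only satisfying assignment here picks B 1.0; add a few more packages and the search space explodes, which is exactly the NP-hardness showing through.)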

All the "practical" package managers rely on relatively terrifying hacks and heuristics to get around this. I have been looking through Debian metadata, and I am very surprised at how unsound it is. The metadata is very tightly coupled to apt-get ("virtual" package handling, etc.) Not to mention that all the algorithms basically only work in one direction (going forward, not backward).

It appears that some academic/niche package managers are incorporating SAT solvers.

"Managing the complexity of large free and open source package-based software distributions"

http://scholar.google.com/scholar?cluster=162491025473640248...

More references:

http://www.mancoosi.org/edos/references/

Anyway the root problem is that C headers and other language-specific interfaces are not always a good way to interface software components... developing more stable binary protocols would greatly ameliorate the versioning problem.


I was also originally surprised to realize that this problem is NP-hard. It makes sense though: it involves both universal (for all requirements & dependencies) and existential (there exist packages & versions) quantifiers.

Julia's package manager [1] actually uses a custom belief propagation algorithm [2] to find an optimal set of package versions to satisfy given requirements. Originally it used linear programming, but belief propagation is much more efficient and solves the problem optimally in all realistic cases we've tested it on. Using a SAT solver would be another approach, but there is still the issue that you don't just want to find one satisfactory solution, but the optimal one with the freshest versions of the least number of packages. Since SAT isn't an optimization problem, it doesn't give you that.

[1] http://docs.julialang.org/en/latest/manual/packages/

[2] http://en.wikipedia.org/wiki/Belief_propagation


Wow! This looks amazing.

I have been working on productionization of R code, i.e. allowing people to ship their prototypes instead of rewriting algorithms in another language, so I am very interested in this problem.

I have also looked a lot at R's metadata format (CRAN), which, like Debian/Ubuntu's, is very successful despite being unsound.

It looks like METADATA.jl is versioned, while R punts on that... the version constraints in the "head" PACKAGES file are mutated regularly and it becomes impossible to install older versions of packages with correct dependencies.

This is not only bad for productionization, but for reproducible science (related: http://arxiv.org/abs/1303.2140).

I noticed in the docs there is a local package dir like: "/Users/stefan/.julia/v0.3". If this is settable for individual Julia processes rather than being a system-wide global, then I will be very happy :)

Pretty much all language-specific package managers started out with this as a global, and then as they evolved to "production quality", there were some truly horrible hacks layered on top: Python's virtualenv, Ruby's bundler, R devtools, etc.

I was a little skeptical about Julia at first (mainly the "one language for everything" philosophy), but I have seen lots of very impressive stuff so far.

This is definitely worthy of more attention/exposition, since it will probably be the first "principled" package manager in wide/"practical" use.


> I noticed in the docs there is a local package dir like: "/Users/stefan/.julia/v0.3". If this is settable for individual Julia processes rather than being a system-wide global, then I will be very happy :)

It's controllable via the JULIA_PKGDIR environment variable, so yes, it can easily be per-process; you can also easily change it while a process is running, which allows you to do operations on multiple different package directories.
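
For instance (path made up), launching julia with JULIA_PKGDIR=/work/project-deps gives that process its own package directory, and reassigning ENV["JULIA_PKGDIR"] inside a session switches it on the fly.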

> I was a little skeptical about Julia at first (mainly the "one language for everything" philosophy), but I have seen lots of very impressive stuff so far.

There's a subtlety to this philosophy: it's intended to be a language in which you can do everything, not in which you must do everything. It's very easy to call C, Fortran and Python from Julia.


Regarding the computational difficulty of dependency resolution, I ran into a pathological case with Bundler a while back [1]. The money quote from the EDOS folk is:

> automatic package installation tools ... live dangerously on the edge of intractability, and must carefully apply heuristics that may be either safe, and hence still not guaranteed to avoid intractability, or unsafe, thus accepting the risk of not always finding a solution when it exists.

In other words, you either risk running into pathological cases (as I did with Bundler) that will come up with a solution only after many hours (or more), or you use methods that terminate quickly but will not find every solution.

[1] http://polycrystal.org/2012/03/07/bundler_searches_an_np_com...



While it would be very nice if one of the existing language package managers stepped up its game by supporting multiple languages and cross-language dependencies, one still has to remember that these fulfill a need - namely, to make it easy to share code within a coding community. This is something distro package systems have failed to achieve - they have too much bureaucratic overhead to make a package available to all, and they require that a library be packaged multiple times to be used in more than one distro.


It's because we need a user-space package manager.

System package managers require root access and tend to provide only a single stable version of each package. While they are good for providing the base system, at the application level it's often desirable to use specific versions of libraries.


> It's because we need a user-space package manager.

that word doesn't mean what you think it means

> System package managers require root access

nope, Gentoo Prefix can be installed by anyone, as long as you have somewhere that's u+w and exec.

> and tend to provide only a single stable version of each package.

nope, Gentoo allows any versions of every package to be mixed and matched, as long as the underlying platform and package supports it.


Is Gentoo Prefix installable on other platforms?

If you want to match the language-specific package managers, you also have to support at least the common POSIX platforms and allow per-project dependency management.


Shouldn't software optimally always use the newest version of all the libraries it depends on? It's a security issue if they don't. And as far as I know, only Linux distro package repositories even have backports, not application package repos.


> Shouldn't software optimally always use the newest version of all the libraries it depends on? It's a security issue if they don't.

Not necessarily. There's great value in having stable "long time support" versions of libraries, that are not the latest version, but often have backported security fixes.

New features introduce new bugs, some (most?) new bugs will be (new) security issues.

[ed: For a new application, tracking upstream is often the best way -- say you assume to have a stable(ish) release of your application in 6 months, you don't want to miss out on new features that'll be available in a supported release of some library you're using. But it doesn't follow that you should always migrate to the latest release of that library.]


There's value in having automatic system updates for the common base.

Once you go down the road of bundling unsupported libraries with your application, more work is involved in upgrades. Even with semantic versioning you depend on the author doing it properly. It's best if upgrades can be controlled, to avoid regressions.


It's a pain, but realistically, the needs of each community are often best served by that community.


Nix (http://nixos.org/nix/) seems to be the solution with the most potential in this area, solving all dependencies at the project level and removing the need for language-specific build environments.


I've actually been exploring nix myself recently. It has some interesting features. I'm finding it quite nice to maintain some C-based packages across both Linux and Mac OS X (where it is somewhat of a competitor to Homebrew).

I've been trying to get it to manage some Ruby applications, complete with gems, private GitHub gems, etc. This has been difficult, in part because some areas of Nix are a bit rough, and in part because Rubygems has incomplete dependency information (i.e. there's no standard way to know that, say, the typhoeus gem depends on the C library libcurl, and I'm still unclear on how/if dynamic loading can be made to load the Nix-packaged libcurl).

Something like Nix is what I had in mind with my original question. I'd like to see new languages at least design their custom package managers with these more generic systems in mind. Rubygems has some quirks, and Bundler just includes everything and the kitchen sink (sourcing gems from local files, GitHub, various other places), which complicates matters.

Nix is good for keeping things more or less isolated and tracking complete dependencies, though an alternate way to accomplish similar is something like Docker and then just scripting all the native package managers...


Also, per-user installs sweeten the deal even further.


It's a little scary to me that the source of available packages is a wiki page editable by anyone with a GitHub account.


Not only that, but there appear to be potential buffer overflows which involve the data pulled from the wiki page[1]. I'm not sure why asprintf wasn't used here (and in other places). The partial re-implementation of libc[2, 3] is also somewhat frightening. As are potential unchecked integer overflows when calculating a size to pass to malloc[4].

C is hard.
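
For the malloc point, the usual guard (helper name made up) is to check the multiplication before it can wrap:

  #include <stdint.h>
  #include <stdlib.h>

  /* Returns NULL instead of a too-small buffer when n * m would overflow. */
  static void *malloc_array(size_t n, size_t m) {
      if (m && n > SIZE_MAX / m) return NULL;
      return malloc(n * m);
  }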

--

[1] https://github.com/clibs/clib/blob/master/src/clib-install.c...

[2] https://github.com/clibs/clib/blob/master/deps/fs/fs.c

[3] https://github.com/clibs/clib/blob/master/deps/str-copy/str-...

[4] https://github.com/clibs/clib/blob/master/deps/http-get/http...


Oh dear... I'm just sight-linting, so this might be wrong, but I'm fairly certain the function executable in src/clib-install.c [1] can easily be overflown by providing a maliciously long package name or version -- neither of which is validated at all [2], despite coming straight from the wiki page, as you point out. The function allocates 256-byte buffers to store a URL and a file name, respectively, but blindly fills them using sprintf with non-validated, user-supplied data, and without checking the resulting number of characters written, e.g.:

  char *file = malloc(256);
  if (NULL == file) goto e1;
  sprintf(file, "%s-%s.tar.gz", pkg->name, pkg->version);
C is unforgiving indeed.
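
The asprintf fix suggested upthread sidesteps the fixed-size buffer entirely; a minimal sketch against the quoted fragment (pkg and the e1 label as in the original function):

  /* at the top of the file: */
  #define _GNU_SOURCE   /* asprintf is a glibc/BSD extension */
  #include <stdio.h>

  /* then, replacing the malloc/sprintf pair: */
  char *file = NULL;    /* asprintf allocates exactly what's needed */
  if (-1 == asprintf(&file, "%s-%s.tar.gz", pkg->name, pkg->version))
    goto e1;            /* allocation failure */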

Also what's up with all those (0 == count) yoda conditionals all over? Is that becoming a thing again?

[1] https://github.com/clibs/clib/blob/master/src/clib-install.c

[2] As far as I can tell, the package info from the wiki is dealt with by functions in deps/clib-package.c that, despite promising names like json_object_get_string_safe, do not ever validate the provided name or version string:

https://github.com/clibs/clib/blob/master/deps/clib-package/...


I think this is missing half of the problem, that is connecting the package with the build system. I know there is no "standard" build system for C, but still. Go for instance does both; once you "go get" a package, it not only downloads and pre-builds it, but makes sure that it is available for transparent usage in your program.

I personally lean towards CMake, so I would appreciate a C package manager which also makes sure that the package is included in my CMakeLists.txt somehow.


What if these are in fact two separate problems? Then perhaps it is simpler to solve them with separate tools instead of conflating them.


I can't see how it's simpler, because then you need to write N*M bridges/connectors/plugins to let N package managers interact with M build systems. And what if the library you download is meant to be built with a build system which is different from the one you chose for your application? What if you then want to globally turn on debugging or change a compilation flag, or a preprocessor define to enable/disable a feature?

I don't know of any language-specific package manager that doesn't also make the package immediately available to the programming environment without further fiddling. Solving half of the problem... well, it's a half solution :)


I love this if only because it highlights that package management is a hard problem, still unsolved in even the most mature languages.

Package management is something that sounds really easy on the surface but once you get into it you are overwhelmed with details, tradeoffs, and drudgery. If we're ever going to solve it, we first need to acknowledge that it's a very difficult problem.


There is some overlap with CCAN (http://ccodearchive.net/).


This is a good idea. I could see this changing C development for the better.


I like this! Is it a sort of staging ground for libraries to eventually graduate to various distributions' official repositories, or is it just random bite-sized utilities for quick weekend projects?


Can be a big time saver.

For any package manager to work painlessly and to scale, the repository must be able to detect name conflicts on the code level.


What ever happened to "apt-get" as the C package manager?


Doesn't work on AS/400, Windows, VxWorks, QNX, ...


And Fedora Linux.


  $ apt-get
  bash: apt-get: command not found



