Do you mean there might be a 'pip' command in the local directory that might get called by accident, or do you mean the real python pip command might load a python module from the local directory by accident?
Windows' cmd.exe is the only shell I'm aware of that (by default) checks the local directory for executables before the actual PATH variable, so I wouldn't consider that a real problem.
The 'local directory' one is the actual concern. For example, our CI system for one of our tools runs commands like `pip install -U setuptools`. If we switched it to `python -m pip install -U setuptools`, it would continue to work fine - unless a developer accidentally committed a file called `pip.py` to the root.
At that point, Python would try to import `pip.py` as `pip` instead of the actual pip module (because the current directory is checked first on the module search path).
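A quick way to see this in action (a throwaway demo; the path and the message are arbitrary):

    mkdir -p /tmp/shadow-demo && cd /tmp/shadow-demo
    echo 'print("this is ./pip.py, not the real pip")' > pip.py
    python3 -m pip install -U setuptools   # runs ./pip.py; the real pip never loads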
Did you reply to the right comment? I was looking for clarification on a comment made when discussing running pip directly, as a full path to the pip executable, and not via "python -m pip ...".
The second thing. Python puts the current working directory first in the module search path. It leads to the above issue and, occasionally, some tricky debugging problems.
You are right! I missed the switch from talking about the `python -m`-style invocation versus the `python <script>`-style invocation.
(Should I double down and attribute it to difficulties with how significant whitespace indentation can make it hard to determine the scope (of the comment threading on mobile)? :P)
I think you mean the package management rather than the ecosystem. For all its failings, the python ecosystem (for machine learning / data science in particular) is unmatched, which is why python is so popular.
As a "professional" python user (who got there in a roundabout way), I'd say the biggest problem with package management is all the conflicting advice and different ways to accomplish the same thing (there are other problems, but they are surmountable). Once you get a workflow down, it ends up being pretty easy to set up and manage an environment for a given project. Not that it's perfect, but I think a lot of the stereotypes about how bad it is come from how bad it appears as you try to converge on a setup that works for you.
I remember when Perl was more popular generally than Python (late 90's and early 00's), and Perl espoused the mantra "There Is More Than One Way To Do It" (TIMTOWTDI).
As a tongue-in-cheek reaction, Python espoused "TOOWTDI", or There's Only One Way To Do It :)
The problem is, when it comes to the Python package management ecosystem, there are SO MANY ways to do it. And they aren't equal and require a deep understanding of each tool to select the correct choice.
It is a problem for Python, although I often regrettably see it dismissed as not a true issue because $latest_package_management_system fixes it.
Python has allowed me to be profoundly productive in many ways, but this is a huge sore spot for the Python ecosystem.
The fact that the suggested solution in Python is to give every Python script a full copy of an entire specific Python runtime (via venv) is a mild annoyance as a design pattern...to me.
Python scripting today requires shipping your development environment. Python is wonderful until you want to run that code on another machine. At that point, the target system has to venv their way into reproducing your environment...often including the specific Python interpreter you picked, and to download (and possibly compile) all modules and their dependencies.
It can work beautifully and many of the large companies I've worked for have put oodles of engineering effort into making it "easy", as long as you follow their happy path and don't deviate.
But, to your point, on my own machine with venv + pip it "just works". The pain comes when trying to venv + pip on another machine.
How confident are you that you could venv + pip your moderately complex Python application on 200k machines without issue? What if it's a mix of Windows, Linux, and macOS? (this is a real scenario I've experienced). From my own experience I can share that it is painful. Your mileage may vary.
>The fact that the suggested solution in Python is to give every Python script a full copy of an entire specific Python runtime (via venv) is a mild annoyance as a design pattern...to me.
It's not just a mild annoyance, it's a sad statement about python as a community of developers that this is not only accepted but recommended.
I write python because some of what I do occurs in the domains where python makes the most sense, but the idea that the python community accepts venv, rather than considering the fact that venv even exists at all to be a source of profound embarrassment, is mystifying to me.
Yeah, I agree with this. I typically dockerize where appropriate to sidestep this a little, but very often rewrite everything into a more friendly language like Go if I am intending to ship software to other machines. The latter is obviously painful if I’m making heavy use of Python specific libraries that do a lot of heavy lifting like Numpy, if those features aren’t in something like GoNum.
I should also say: Unfortunately, Conda’s environment exports only work for similar OS’s. I don’t know why Python’s situation is like this, after so many years.
Docker simultaneously solves and creates problems here.
Step 1: Docker must be installed on all targets. This is not a given and is a new piece of overhead.
Step 2: The Dockerfile hopefully doesn't source from just "ubuntu:latest" and bring in the whole kitchen sink for this SINGLE APPLICATION.
Effectively, if you are using Docker and deploying your script to Windows or macOS, you are saying "Hey, this script requires you to install Linux in a VM to run it. Docker makes it easy. Go download 500mb of Docker and a couple hundred more mb of images to run this 800k script."
Yes, docker solves all these issues (because your app carries all of its environment in the image); however, it is a bit cumbersome if you want to distribute your app to many unknown users. They can't just download something and run it, as Windows users are used to doing.
> The fact that the suggested solution in Python is to give every Python script a full copy of an entire specific Python runtime (via venv) is a mild annoyance as a design pattern...to me.
That bothers me too, a bit. But in a sibling thread people are comparing python unfavorably to ruby, which as far as I know uses exactly the same strategy (except that it's called "rvm" instead of "virtualenv"). Node.js appears to do the same thing.
To my eyes, the solution to this would be static linking, but I don't know how much sense that makes in the python / ruby / js context. Other than that, what's the alternative? There's certainly a lot of convergence on this solution.
PyInstaller. It statically links all the dependencies and the Python interpreter, so the other machine can just run the executable; no need to set up venv+pip.
Of course the binary is huge, but IME it works. Unfortunately it can't cross-build, so you need a VM or multiple machines set up with the dev environment to run pyinstaller for each OS.
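For anyone who hasn't used it, the basic flow is something like this (a sketch; `myscript.py` is a placeholder name):

    pip install pyinstaller               # into the project's venv
    pyinstaller --onefile myscript.py     # bundle script + deps + interpreter
    ./dist/myscript                       # single executable, for this OS/arch only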
Ansible is a special example of this. Some Ansible modules (e.g. the postgresql one) require non-stdlib modules to be installed in the remote python. So either install it globally or figure out how to make Ansible work with environments. Even if the thing you're trying to set up ostensibly doesn't use python.
As I recall, pipx does this; it installs a separate env, but then provides entrypoints (commands) to the global environment.
For example, it would install ansible and the ansible dependencies to a separate environment somewhere, but still put commands for `ansible`, `ansible-playbook`, etc. into /usr/local/bin for example. When you run the ansible command, it loads ansible from the separate environment.
I haven't actually tried this myself since our environments are pretty uniform, but it could be worth looking into. It would be nice if pip itself provided this as an option (or if it became the default), but that could also become extremely complicated in terms of upgrades. I already hate having to download and build scipy, but it would be incredibly irritating to have to do it once for every tool I have that uses it. I'm sure some kind of cache could be made to work, but it's not trivial.
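For reference, the pipx flow for something like ansible looks roughly like this (an untested sketch; as I understand it `--include-deps` is needed because the CLI entry points live in ansible-core, a dependency):

    pipx install --include-deps ansible   # isolated venv + exposed CLI shims
    pipx list                             # shows where pipx put the venv
    ansible-playbook --version            # the shim on PATH runs from that venv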
Big fan of pipx, but I don't think it would work here? Local ansible calls remote python directly and that's where the modules need to be, so a sandboxed {local,remote} ansible would be inert.
There's a workaround but it's not ergonomic and I always forget what it is.
When I statically compile a C++ or Go binary, I end up with something I can copy and execute on any platform it was compiled for.
Single file. Copy and run.
Virtualenvs require every single target to reproduce your dev environment:
(1) have internet access and be able to reach pypi (or artifactory, or whatever you use).
(2) the ability to install the required version of Python if it isn't already installed. That's another 30mb download.
Ever want to run a complex Python script on a bastion host or a host behind a bastion? Well now #1 and #2 above won't work (or haven't in my direct experience working for some of the big cloud companies). So you have to use something like PyInstaller and hope it works. It might. It might not. A statically compiled C++ binary or Go binary probably will.
This thread is full of pain points in python package management. The first step is admitting there is a problem. I don't think you've experienced the pain caused by Python package managers that others outline in this thread.
I mean in the sense that you're bundling your dependencies with your program. This is well established, Python isn't special.
Obviously Python is (usually) an interpreted language, so you're going to need Python on the user's system. If that's a problem, Python might not be for you, or you're going to need to do some extra work.
It can be done. For example the popular visual novel software Ren'Py is written in Python.
Conda. Back in the day, before docker, it was a poor-man's docker combined with a repository of significantly newer and more compatible software than you could get from distro repositories. Python libraries with external dependencies used it heavily for this reason. Conda tended to Just Work in a bunch of cases where pip effectively forced you to go off and debug a bunch of autotools builds on your own. Yuck!
However, it was pushy. It would put its own header/linker paths in front of the system paths (this is how it made "environments," its wannabe containers), which tended to create inadvertent cross dependencies if you didn't understand or didn't remember that the semantics of a conda environment extended beyond python. These dependencies could get baked into binaries and break far down the road, or they could get sucked in as a transitive dependency and trip over the shoelaces of a different build of the same software installed outside conda. Unfortunate.

However, around 2019, the problems started growing beyond mere foot-guns. Conda uses a full SAT solver to provide a highly featured versioning system, and this worked great until the big conda channels grew to the point that it started getting really slow. Installing packages went from taking seconds to minutes to hours to forever. They tried caching, they tried fragmenting channels, but it was all very not-seamless.
Eventually, people started migrating back to pip. It turns out that over the last decade distro repositories had gotten their shit together and now Docker existed to sweep up the last few use cases, so nobody needed conda's "poor-man's docker plus curated 3rd party repos" anymore. Now pip is the tool that Just Works, and it Just Works without any of conda's baggage. Virtualenv environments don't hook your system quite as aggressively, pip never stalls when resolving its version plans, and Docker can be used to reproducibly experiment and find the happy path.
Be glad that you missed out on pre-conda pip and the conda arc.
> I'm relatively new to python. I use venv, pip and requirements.txt. It's dead simple. What am I missing?
Compare this with Node. It always installs locally by default and always installs in node_modules regardless of which package manager you use. This is integrated into Node so you never need to modify the path like in Python. You don't have to guess whether your packages are installed in .venv, env, environ, or whatever someone else decides.
`pip` actually has a lot of issues with regards to deciding which version to use. That's why people moved to pipenv... then pipenv stagnated and people moved to poetry.
Using `requirements.txt` is dead simple but it ignores issues such as version locking. If you just add the packages you need to `requirements.txt`, then every time you install you could get a different set of packages. If you do `pip freeze > requirements.txt` then you don't know what comes from what.
To be honest, I don't see how you can give this advice. A couple of the most popular package management systems for Python are very obviously deficient (and sigh, these are usually the ones I get stuck working with, due to outside constraints).
A classic example is version pinning. Rust has Cargo.lock, Ruby has Gemfile.lock. Python? It depends on which one of the multitude of options you pick. But at least a couple of the most popular ones basically don't do this (pip) or do this in a hacky, ugly way that makes you want to tear your hair out (Conda). As a result, at least in the projects I work on, people tend to skip this.
I hope it should be obvious what bad things can happen if you don't pin your dependencies, but for the uninitiated: I'm talking about things like projects breaking inexplicably after 6 months because you rebuilt some Docker container and something or other got upgraded and is now incompatible.
Beyond the technical issues, there's a human engineering problem of getting everyone on the same page about the right processes, and Python makes that vastly harder by not having a One True Solution that everyone just uses.
"pip freeze" generates a versioned list of packages to install in the same format as requirements.txt -- in this example below i've called it versions.txt. I have a bash wrapper "bin/venv-create" which essentially does this
    python3 -m venv .venv/

    # "pip" below is the venv's pip, invoked via the bin/venv-python wrapper
    # described further down.
    if [[ -f versions.txt ]] && [[ versions.txt -nt requirements.txt ]]; then
        pip install --requirement versions.txt
    else
        pip install --requirement requirements.txt
        pip freeze > versions.txt
    fi
(I place the files in etc/pip/ in my projects (and check them into git), but I've omitted the paths for clarity. One could embellish this by including python version number in the filename, as package requirements can change between python versions.)
I also have a bin/venv-python wrapper which sets PYTHONPATH and PYTHONDONTWRITEBYTECODE before chain-calling .venv/bin/python3 with the arguments, and this is how pip above is called. (Again, omitted above for clarity.)
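Roughly, it amounts to something like this (a simplified sketch; it assumes it's invoked from the project root, and the exact PYTHONPATH value will vary by project):

    #!/usr/bin/env bash
    # Sketch of a bin/venv-python wrapper: pin a couple of env vars, then
    # exec the venv's python3 with whatever arguments were passed in.
    export PYTHONDONTWRITEBYTECODE=1
    export PYTHONPATH="$PWD"            # value is a guess; adjust per project
    exec .venv/bin/python3 "$@"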
This won't cover everyone's usage scenario, but it works for me. YMMV.
`pip freeze` doesn't generate hashes, so you can't be sure that the package contents haven't actually changed but maintained the same version string.
Also, `pip freeze` doesn't include platform-specific dependencies for other platforms. So if you freeze the same environment on Linux and again on macOS, you'll get different results, because `ipython` depends on `appnope` only when running on macOS.
these are legitimate weaknesses -- thank you for highlighting them. On the other hand, if one can accept these weaknesses, keeping separate the root requirements file and the versions file goes a long way to making the process manageable, and not as chaotic as some might have you believe.
The beef I have with so many explanations of pip is that they tell you to source ".venv/bin/activate" and then "just run pip install whatever" without a) separating the root requirements from the effective/complete requirements, or b) suggesting a wrapper for the ".venv/bin/python3" binary so that execution is the same in all environments.
I don't think this practice is widespread, which is exactly why I made a point about human engineering in my original post. But I do appreciate that solutions like this exist, and I should look into driving more of this sort of thing in my projects.
> A classic example is version pinning. Rust has Cargo.lock, Ruby has Gemfile.lock. Python?
Does it matter? I get a different way to define dependencies. But the lock file itself is an implementation detail. You use the package manager the project uses and the lock file can be opaque. It's the same for package-lock.json / yarn.lock in js land.
Python's poetry has poetry.lock; I think it's slowly becoming semi-standard.
But still slowly, and still not fully ready (bugs).
Also, it's odd that pyproject.toml (not poetry.toml, which holds the poetry settings for the repo) is the dependency definition file while poetry.lock is the lock file.
For pip, I usually make a requirements-to-freeze.txt, create a fresh virtualenv, install requirements-to-freeze.txt, and then `pip freeze > requirements.txt`.
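Spelled out, that's roughly:

    python3 -m venv .venv
    .venv/bin/pip install -r requirements-to-freeze.txt   # loose, hand-written deps
    .venv/bin/pip freeze > requirements.txt               # fully pinned snapshot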
poetry does this automatically
> But still slowly and still not fully ready(bugs).
I use poetry every day and so far it's been pretty great. The main complications I have had are with accessing private pypi repos in Azure DevOps pipelines which use short lived tokens but it just looks a little clunky, still works.
> Also its odd that pyproject.toml (not poetry.toml which is the poetry settings for repo) is the dependency definition file and poetry.lock is the lock file
This is because poetry is using Python's PEP 518[1] specification rather than define their own build requirements format. It also isn't limited to just building, you can also include the configuration for other python tools like `pytest`[2].
> This is because poetry is using Python's PEP 518[1] specification rather than define their own build requirements format. It also isn't limited to just building, you can also include the configuration for other python tools like `pytest`[2].
I sort of get this, but it's still a bit odd, especially since I have a small poetry.toml as well.
> I use poetry every day and so far it's been pretty great.
I like poetry; it works a lot better for me than pipenv did (to be fair, that was a few years ago). But I do seem to have to delete the lock file to update it, and sometimes poetry add/install breaks oddly (or at least with awkward error messages). Also, sometimes it interacts with venvs in odd ways. Python really needs to ship with it.
The pyproject.toml is telling you what should work in terms of a range of package versions and the poetry.lock is telling you what my environment is in terms of exactly what is installed. You don't need the lock file, but without it there is no guarantee of what it will install so it may not be completely compatible.
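In command terms, roughly (the package name is just an example):

    poetry add requests      # records a version range in pyproject.toml
    poetry lock              # resolves everything into poetry.lock
    poetry install           # recreates the environment from the lock file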
Yes, exactly. And how many of these projects go to any effort to pin their transitive dependencies?
In my experience, this is basically no one; in fact I've seen many (most?) projects not even pin the versions on their top-level dependencies. (Ask yourself how many times you've seen just "numpy" as a dependency without any version bound. Far too often in my experience.)
And this is exactly why stuff breaks: even when the root dependencies are pinned (and are they?), a transitive dependency could get upgraded and break something (or fail to build entirely).
As someone two months into their first python job - the ecosystem is solid for ML / data science, but there's a lot of places where it's painfully lacking.
I hate the most popular ORM (sqlalchemy) and alembic has a lot of footguns: for instance, if you autogenerate a migration where you change a table name, it will try to drop the old table and create a new one.
In web dev a lot of the OSS community has moved on to more appropriate or exciting languages, so the tools I'd normally reach for frequently are OSS projects that haven't been maintained in seven years.
As for pip: it needs a major version bump that creates a lockfile, or it needs to be intentionally sunset. It's easy for a beginner to start using (part of why it's heavily used), but extremely dangerous because of the lack of guarantees. I wish it didn't have as big of a footprint as it does.
Agreed on the lockfile. We use pip-tools to achieve this, which gives the `pip-compile` command. A Makefile ties together the different development and build workflows we have, such that they can be executed both locally and on our CI/CD build server. The complexity is manageable, but it's roughly comparable to a compiled language if you want locked-down reproducible builds (which I recommend for anything production grade).
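For anyone unfamiliar, the flow is roughly this (file names follow the usual pip-tools convention, not necessarily what our Makefile uses):

    pip-compile requirements.in                     # resolve + pin into requirements.txt
    pip-compile --generate-hashes requirements.in   # optionally also pin hashes
    pip-sync requirements.txt                       # make the venv match the pins exactly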
The reason for the conflicting advice is because people keep making new package managers that still suck overall. Like they fix one major package manager problem and somehow neglect the other 60% of problems.
So you end up with all these different Python package managers that are all good at individual different things but no single package manager that is good at all of the things.
I swear that is a major problem with most libraries or tools (and products overall actually) that people make. It’s like people fix their pet peeve but forget that the whole picture matters way more.
I hope you'll forgive me for adding one additional piece of advice: for many Python packages, the only packaging metadata you need is `pyproject.toml`. You don't even need `setup.py` anymore, so long as you're using a build backend that supports editable installs with `pyproject.toml`.
Here's an example of a Python package that does everything in `pyproject.toml`[1]. You should be able to copy that into any of your projects, edit it to match your metadata, and everything will work exactly as if you have a `setup.cfg` or `setup.py`.
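For a rough idea, here's a minimal sketch with placeholder metadata, written out as shell so the whole thing is copy-pasteable (setuptools>=64 is what enables editable installs without setup.py):

    mkdir -p example_package && touch example_package/__init__.py
    cat > pyproject.toml <<'EOF'
    [build-system]
    requires = ["setuptools>=64"]
    build-backend = "setuptools.build_meta"

    [project]
    name = "example-package"
    version = "0.1.0"
    dependencies = ["requests"]
    EOF
    pip install -e .   # editable install driven entirely by pyproject.toml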
Python's package management is to package managers as C++ is to programming languages: everyone who uses it understands about 10% of it, and no two people understand the same 10%.
Could you please link to your favorite way to install python at a specific version, plus all dependencies, without having to put anything in a global place. Ideally on macOS (since it's what I'm on most). I don't have homebrew or macports, since those always eventually lead to me getting a borked system because they want everything installed globally.
Not the GP, but I became a big fan of using poetry for managing python package dependencies.
For managing python itself and binary libraries I started using Nix package manager.
It allows you to describe all dependencies via code, but with time that code became boilerplate, so I created this: https://github.com/takeda/nix-cde
It works very well for me so far.
You do need to have Nix[1] installed, but hopefully that should be the only thing needed; everything else you can just list in the project.nix file (in the example in README.md you should have access to the `dive` command even if you never installed it on your system), which makes it available only for that project.
Here's also an example of a very simple project: [2]
Not OP, but anecdotally this is what my workflow looks like:
1. Install pyenv to a central location of my choice, e. g. ~/.pyenv (this helps manage all those different Python versions piling up from all the isolated projects I have.)
2. Install pipenv as a stand-alone tool. Doesn’t matter which Python I’m using, I just want to have it in my PATH.
3. Now I’m ready to create fully isolated projects using pipenv. Not only does pipenv create a venv right inside my project directory for me, it also helps me manage and lock dependencies. It also talks to pyenv so it can fetch the per-project Python version for me; this is especially useful when collaborating with others on the same project; they can reliably check out the project without having to manage specific Python versions themselves.
YMMV but I’ve had a stable experience using this setup so far.
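In command form, that works out to roughly this (one possible variant; the version number, package, and project directory are just examples):

    pyenv install 3.11.9                # step 1: a pyenv-managed interpreter
    pip install --user pipenv           # step 2: pipenv as a standalone tool (one option)
    export PIPENV_VENV_IN_PROJECT=1     # put the venv inside the project dir, as described
    cd myproject
    pipenv --python 3.11.9              # step 3: create the project venv on that interpreter
    pipenv install requests             # add + lock a dependency
    pipenv shell                        # drop into the environment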
The problem is that the operating system's own packages often depend on some specific version of a dependency. You can perfectly well pip install something globally, but globally installing a library means everything else on your system is affected too.
There are solutions to this (pipenv being a popular one, though sadly often not explained to Python newcomers) and there are alternatives for that as well. You probably don't actually want to install a package like a package manager, most of the time you want to install an application and the specific requirements for that application, which is exactly what pipenv is for.
Your global package manager and your global Python package manager don't agree on what version is "the latest stable version" of a package so you just can't expect global package installs to work like that. It's like trying to make pacman, apt, and dnf work on the same file tree: you can probably get it done, but the end result will grow unstable or unusable within days.
Yep. Every time I try out a python-based tool, it _never_ runs successfully on the first try. I always look for Go/Rust/C based alternatives, especially when it's a CLI tool.
It's true that people have made a lot of different ways to install Python packages. Each of these people has an axe to grind and they have supporters who are trying to boost their preferred solution and keep people away from the others. That part really is a mess.
The standard virtualenv+pip has always been rock solid for me. Maybe this is because I do all my development inside virtualenvs and I only ever use pip inside a venv. And I run Linux, though I do ship my Python code to a fleet of servers.
Sometimes I think maybe this Python-ecosystem-is-horrible stuff comes from people who are trying to steer people away from Python towards their preferred language.
Python has had built-in virtualenv support since 3.3[1], meaning that you can do this:
python -m venv env/
...on any version of Python released in the last decade and get a reasonable virtual environment. To go one step further, you can also have `venv` automatically bring `pip` and `setuptools` to their latest versions:
python -m venv --upgrade-deps env/
...which you should almost always do, since the versions bundled with your Python distribution are likely to be behind the latest.
As someone who uses environments & containers fairly regularly, to this day I didn't know `virtualenv` & `venv` were actually two different packages. I always used these interchangeably. I saw the difference pointed out in the article.
With node I use nvm. No root needed; I have never run into a problem of something complaining that I don't have node installed at a system level. Is there an equivalent for python?
How am I able to run npm install/nvm install if I haven't already installed Node? I've seen way more machines that come with Python than I've seen machines that come with Node. You need to install and configure nvm, just like you need to install and configure Python.
The difference is that pip comes from a time when package managers were still universally global, while npm and friends only operate on directories by default.
If you want a systemless Python then you'll have to do it manually. I don't think it happens often enough for it to have a tool like nvm. Nothing prevents you from downloading the binaries, stuffing them into your user folder and setting up the right environment variables in your .profile, except for that it's a pain and probably not worth the effort.
Edit: apparently pyenv is a thing. I haven't ever needed it myself, but it seems to do everything nvm does and more.
If you just want to restrict package installs, versions, and configurations to a single project, virtual envs are the way to go.
Almost! `python3-virtualenv` is virtualenv, the third-party tool that ultimately inspired `venv`. `python3-venv` is the Ubuntu/Debian breakout package for `python -m venv`.
TL;DR: Install python3-venv, not python3-virtualenv.
(Ubuntu and Debian's decision to break out core parts of the Python standard library has been an unending source of problems. PEP 668[1] is an in-progress effort to better categorize these kinds of distro changes and improve the user experience around them.)
Edit: I should also qualify: virtualenv is still maintained and useful; venv contains the subset of virtualenv's functionality that 99% of users use.
Pyenv[1] solves the multiple versions of python problem in my experience. You can install the version of python you want and then can set that version globally, per directory, and then use that versions pip or take it further and use a virtualenv/poetry shell.
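For the per-directory case, typical usage looks something like this (the version and directory name are just examples):

    pyenv install 3.10.13
    cd myproject && pyenv local 3.10.13   # writes .python-version for this directory
    python -V                             # now resolves through the pyenv shim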
I mean, this is generally good advice, but this reads like an infomercial where they show someone struggling REALLY hard to boil water to make a pot of spaghetti, when we all know it's really not that hard.
They try to make finding your pip executable sound difficult, and even more difficult to understand which interpreter it's tied to. Except...
> pip -V
> # pip 19.0.3 from /usr/local/lib/python2.7/site-packages/pip (python 2.7)
I'm all for evangelizing best practices, but let's at least be honest about it.
While it’s good advice to use a venv for development work, there are a couple of Python packages that you want to have installed globally, such as pip, pipenv or poetry. So I don’t see a contradiction here after all.
Yeah, if your Linux distro comes with python, python2, python3.7, and python3.8... then you almost certainly have the matching pip, pip2, pip3.7, and pip3.8. If you activate a virtualenv, that will override python and python3, but also pip and pip3.
The only situations where I've encountered breakage is pydoc (because your virtualenv does not necessarily have its own pydoc, contrary to having pip) and calling pip from a Jupyter notebook: the current kernel's virtualenv is not necessarily activated (the solution is the %pip magic or `!{sys.executable} -m pip` since `!python` would have the same issue).
If there's an executable file named "python" in your current directory, typing "python" in your shell won't in general execute that file. You need to add the current directory to your PATH, or to run it explicitly with something like "./python". This is different from the behavior with "python -m modulename".
So this security concern applies when you trust your shell and all the directories in your PATH, but you don't trust the contents of the current directory. That's not the norm, but it's quite a common situation to be in - you downloaded some files but don't intend to execute them.
This is (used to be at least) different on Windows: typing "python" risks executing a file in the current directory called "python.exe", though maybe UAC saves you now.
It may be an ambiguous version, but it'll be the same version as the repl you get when you type `python`, and it'll be the same version that'll run your script when you type `python script.py`.
My two cents: if you have to use `python -m pip` because it might otherwise be the wrong pip, or because you have multiple versions of python and don't know which one you're using, it means you have a mess in your system, and you're probably going to cry sooner or later regardless of whether you use `python -m`. If there is confusion, `which pip` can help.
As for upgrading pip in Windows, that's a silly argument. When I want to upgrade pip, I will use `python -m`, no need to use it all the time.
Yes, it's a pain to keep your Pythons in order. We can use pyenv and the like. I wish there was a better way and it's unfortunate, but at the moment it's what it is, and having to use `python -m` is only a sign that you don't have control over your own system.
Python as a language is a joy to use (for small projects), however pip has soured my experience of Python so drastically that I actively avoid taking up Python projects out of knowledge - not fear, knowledge - that the setup/install process is going to be a humongous pain. In most cases I can get started more easily with Node.js/TypeScript.
in b4 "use some other package manager / pipenv / virtualenv etc" - no. How about pip actually installs things as expected rather than making me jump through 1000 hoops. /rant
The issue is that there are many different ways to setup your environment. Every python dev has their own set of tools that mostly work, but sometimes break in some corner cases... and then the problems start, because searching for a solution will inevitably produce something that is almost, but not completely, applicable to your situation.
I use pyenv+pipenv (@linux) and am mostly happy with it.
pipenv and virtualenv are not "some other package manager". The package manager is still pip, which, by the way, does its job as a package manager fairly well. The problem is when you want multiple environments for multiple projects. A problem that is just nonexistent with, say, apt-get, because you don't need multiple environments.
Probably not down-voted for that, many Python devs agree that pip is basically garbage, down-voted because there are very simple solutions to Python's distribution problems that you refuse to acknowledge.
Depending on your use case, there are various tools, but one easy set of tools that is good for anything from large production deployments down to an ML notebook environment is pyenv+poetry. Those two will, with very few commands, allow you to specify and install your environment for a project very easily and with similar ease as you might find when using nvm (similar tool to pyenv) and npm (similar tool to poetry) for a JS/TS/node/whatever project.
You got down-voted for being willfully ignorant and proud of it, basically.
What is the best way for an author of a Python package to develop and test on multiple versions of Python (e.g. 3.6, 3.7, 3.8, 3.9, 3.10), and be able to switch easily among the various Python versions? Every time I try to research this, I get lost in the chaos of Python packaging and environments.
Currently I do a `pip3 install -e .`, which uses the default Python provided by the OS. Then I hope that my continuous integration matrix catches any problems in the other Python versions.
This, absolutely. If you're running tests for building a package, here's an entire "tox.ini" file for running all your tests on multiple python versions:
    [tox]
    envlist = py27,py36,py38

    [testenv]
    # Install extra packages for testing, separate from what's in setup.py or whatever
    deps =
        pytest
    # Whatever test command your project uses
    commands =
        pytest
That's really it for bare minimum. If you have pytest configuration, they can also go into tox.ini under a [pytest] section to keep them in the same place.
Instead of installing tox globally you can even install it to a separate virtualenv, activate it, then run "tox". It figures out the right thing to do.
If you're not building a package:
    [tox]
    envlist = py27,py36,py38
    # Don't attempt to build a package / pull from setup.py
    skipsdist = True

    [testenv]
    # Pull in project requirements
    deps =
        -r requirements.txt
        pytest
    commands =
        pytest
> ALWAYS use an environment! Don't install into your global interpreter!
...
> When you need to create an environment for a project I personally always reach for venv and virtual environments. It's included in Python's stdlib so it's always available via python -m venv (as long as you are not on Debian/Ubuntu, otherwise you may have to install the python3-venv apt package)
I think this is good advice, and it's pretty easy to do, and you can have named environments (via directory names) for each version you can switch between easily.
Very occasionally some python libraries will use a globally installed system library or a particular version which changes per python version (I think maybe OpenCV has this problem on Windows or something?) and so then you have to reach for containers.
Just want to thank everyone who responded with suggestions. I think it's amusing that there seems to be at least 5 different solutions to this problem: pyenv, nox, tox, asdf, docker.
...which makes pip refuse to install anything unless I'm in an activated virtualenv. That, plus running as a regular user that doesn't have write permission to /usr, goes a long way.
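(Presumably that's pip's require-virtualenv option; for reference, one way to set it:

    export PIP_REQUIRE_VIRTUALENV=true   # or: pip config set global.require-virtualenv true
    pip install requests                 # now refuses to run outside an activated venv
)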
On unixy systems, won't the system Python usually require root/sudo to install packages, whereas the ones in your environments will be owned by a less-privileged user?
And on Windows, you may have a globally installed Python (though there is little reason to have one instead of the Py launcher), but even if you do it's not a system Python that system components are relying on.
Unfortunately the pip developers had the genius idea to add a "user install" mode which pip will default to in this situation. So if you're not root, you won't be able to mess up your system-global Python, but you can still mess up your user-global Python (deleting some ~/.local subfolders takes care of this).
I have many friends who don't work as programmers but treat programming as a side job only. The common theme I observe from them is that they don't care about software engineering practice. That confused me at first, and I think Python actually did a good job of bringing programming to a mass audience.
What's not going so well is that, in Python, software engineering practice isn't treated as a big concern, so end users hardly learn how to do things "correctly".
I haven't used pip directly in years. Lots has changed in the Python ecosystem, and I feel like the writing is on the wall for using pip (directly) for dependency management.
I'm currently fighting with pip and some particularly brutal dependencies related to Kerberos (Windows build tools are exhausting and such a pain in the ass to install) on two different projects. Number one on my Python wishlist is a significant overhaul to pip to make it more like Cargo.
I know we editorialize titles, but the actual title is "why you shouldn't use `python -m pip`". Setting aside the strange capitalization, this still seems like a big change.
It has only recently sunk in for me that, with the latest versions, Pip is not a tool for maintaining an environment and list of dependencies (where you'd have a lockfile, for example). Instead, it is an interface to…
1. Download packages from PyPi (or a different repository that provides the same interface)
2. Read the pyproject.toml file to find the build backend to use, and then install that.
3. Call the build backend to actually do the installation. This can be Poetry, Setuptools, Flit, or something else.
Pipenv is another such interface.
PEP 517 (https://peps.python.org/pep-0517/) created pyproject.toml and defined the API that build backends follow. That API includes a way for the build backend to tell Pip (or Pipenv, etc.) what dependencies to install. For systems that have a lockfile (like Pipenv), that could be a list of packages with explicit versions.
PEP 660 extended PEP 517, defining a standard way to have "editable installs". In other words, a way to support `pip install -e package`, or even `pip install -e .`.
The above is all my understanding, which is not at all authoritative!
As an example, here's a Python package that uses Poetry: https://github.com/globus/globus-timer-cli It works perfectly fine with Pip, because Pip sees (via `pyproject.toml`) that it needs to pull in Poetry, and call it to do the actual install.
Here's an older repo of mine, from when I was just starting to learn about the transition: https://github.com/stanford-rc/mais-apis-python/blob/main/py... In this case, I could delete `pyproject.toml` entirely; Pip would see the `setup.py` file and understand to use Setuptools.
Finally, here's a newer repo of mine, where I've ditched setup.py (and _almost_ ditched setup.cfg) entirely: https://github.com/stanford-rc/globus-group-manager I'm still using Setuptools, but all of the metadata and requirements are included in the `pyproject.toml` file.
It's definitely been a rocky transition, but it's really looking (to me, at least) like we're at (or near) the point where I can just use `pip install …` and it'll work, regardless of the build backend in use!
https://bugs.python.org/issue33053
E.g.: