Structuring Your Project (python-guide.org)
186 points by ai_ja_nai on Dec 7, 2019 | 37 comments


Always one of the good articles to link to when creating a Python repo from scratch.

Should mention that __init__.py is no longer required as of Python 3.3 (I just found that out this week, to my surprise), so for projects where backwards compatibility is not required you can stop creating extra blank files if you don't need them.

https://stackoverflow.com/questions/37139786/is-init-py-not-...
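
A quick way to check this yourself, as a minimal sketch using only the standard library. The names demo_ns_pkg and demo_regular_pkg are made up for the illustration. It builds one directory without __init__.py and one with it, imports both, and reports what each package's __file__ looks like:

```python
import importlib
import sys
import tempfile
from pathlib import Path


def compare_package_kinds():
    """Build one directory without __init__.py and one with it, then import both."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # No __init__.py here: Python 3.3+ treats this as a namespace package.
        (root / "demo_ns_pkg").mkdir()
        (root / "demo_ns_pkg" / "mod.py").write_text("VALUE = 42\n")
        # A regular package, marked by an (empty) __init__.py.
        (root / "demo_regular_pkg").mkdir()
        (root / "demo_regular_pkg" / "__init__.py").write_text("")

        sys.path.insert(0, str(root))
        try:
            importlib.invalidate_caches()
            ns_pkg = importlib.import_module("demo_ns_pkg")
            ns_mod = importlib.import_module("demo_ns_pkg.mod")
            regular_pkg = importlib.import_module("demo_regular_pkg")
            return {
                "ns_value": ns_mod.VALUE,
                # Namespace packages have no usable __file__.
                "ns_file": getattr(ns_pkg, "__file__", None),
                "regular_file": Path(regular_pkg.__file__).name,
            }
        finally:
            sys.path.remove(str(root))


info = compare_package_kinds()
print(info)
```

The import works either way, but the package without __init__.py has no file of its own, so anything that relies on the package's __file__ will behave differently.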


Although, beware that you may need to change `find_packages()` to `find_namespace_packages()` in your setup.py.

I get burned by that one a lot!


That's really not recommended. A lot of behaviors are subtly affected by failing to have init files.


Like what?


Not op but I have personally noticed that doing a ‘python setup.py install’ doesn’t correctly import modules unless you have an __init__.py file. It’s possible I could be doing something wrong however.


Mypy sometimes uses init files to understand internal imports.


When I'm just learning a new programming language or framework, I prefer to have something like create-react-app [0], django-admin startproject [1], cargo new [2], or lein new [3].

Guides like these are just full of (the wrong type of) rabbit holes. Just give new people something that lets them work productively as soon as possible and refer them to language documentation to understand how features of the language or the runtime work - in case they want to dig deeper.

Most people will not want to dig deeper, anyway. They will be happier to just have boilerplate that works and that they can iterate upon.

Look at the golang version of this guide: https://github.com/golang-standards/project-layout

How many programmers new to go are going to want to wade through all of that? Especially when they're probably starting out with projects for which most of those conventions don't even apply!

It's surprising to me that Python, as focused as its philosophy is on providing canonical solutions to problems, doesn't have a "python -m newproject". There must be a good reason. Will try making one in my free time to find out.

[0] https://github.com/facebook/create-react-app

[1] https://docs.djangoproject.com/en/2.2/ref/django-admin/#star...

[2] https://doc.rust-lang.org/cargo/guide/creating-a-new-project...

[3] https://github.com/technomancy/leiningen/blob/master/doc/TUT...


`poetry new`[0] is what you are looking for. Previously I used PyScaffold[1], but I like the pyproject.toml from PEP 518[2] better than all the old stuff. Unfortunately, Poetry does not set up docs like PyScaffold does, though.

[0]: https://poetry.eustace.io/

[1]: https://pyscaffold.org/en/latest/

[2]: https://www.python.org/dev/peps/pep-0518/


I never tried Poetry because of this somewhat worrying part of installation instructions: "The installer installs the poetry tool to Poetry's bin directory ... located at $HOME/.poetry/bin ...This directory will be in your $PATH environment variable".

The directory will be in my PATH? That makes me wonder what else it might take the liberty to reconfigure on my system.

But, reading the installer script, it seems like it at least prompts you for confirmation first, so perhaps I'll try it.


Nice, thank you! Will definitely check out poetry now - have been avoiding it for reasons of inertia so far. :)


PyScaffold has a --pyproject option (or at least an add-on linked in the readme, I don't remember).


poetry new --src projectname

is my preference these days.


Given Python's history, there are just too many "one way"s to do things.

The space of packaging has changed hugely in the last 5 years, and probably will again in the next 5 (both in good ways).

Maybe then we get a standard templater.

Though sure, if poetry is your thing, it does this.

(Me, FWIW, given that I still generally stick to setuptools for better or worse, I have my own templater that I use [0])

[0] https://github.com/Julian/mkpkg


Tooting my own horn, try: https://github.com/mikadosoftware/mkrepo

mkrepo reponame .

will get a fairly decent skeleton created (been using a lot for my own stuff recently)


> It's surprising to me that Python, as focused as its philosophy is on providing canonical solutions to problems, doesn't have a "python -m newproject". There must be a good reason.

The reason is probably related to the perennial python packaging problem; many of the projects offering solutions to that problem have their own project generator bundled.


This is the kind of documentation I wish every language/framework provided. Every time I try to learn a new language/framework, I just see code fragments with no idea how to organize them coherently.

This month I'm working on a side project and learning to use sequelize (a SQL ORM for node). All the examples [1] just seem to assume my entire project lives in a single file.

[1] https://sequelize.org/master/manual/models-definition.html


The most frustrating part of learning a new tech stack is the implicitly assumed tribal knowledge.

E.g., Stack Overflow answers that start with np.yada without the import numpy as np part.


This is especially a problem I see on iOS. I can find lots of examples and tutorials, but each assumes a bare-bones project with a handful of files, and none seem to teach you how to structure your project in a way that would support the size of a real-world application.


Most of the advice here seems okay but this stuck out to me:

    foo += 'ooo'  # This is bad, instead you should do:
    foo = ''.join([foo, 'ooo'])
This seems like such a silly micro-optimization. I also have to question its validity, even in scenarios where you're working with huge strings.


I would agree with you.

The only reason I can see for the advice is that the first line does not handle the case where foo (#) is something other than a string. For example, if foo were [1,2,3], you would silently get [1,2,3,'o','o','o'], whereas the second line would barf on that. But if that were the concern, it is waaaay more readable to test for it explicitly.

so yeah, generally not great advice

(#) Holy moly how many times will autocorrect change foo to too!!! stop it!
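
To make the non-string case concrete, here is a small sketch in plain Python, using nothing beyond the two lines quoted above. Note that += on a list extends it with each character of the string rather than appending the string as one item, while the join form refuses a list outright:

```python
# Case 1: foo is a string, so both forms give the same result.
foo = "f"
foo += "ooo"
joined = "".join(["f", "ooo"])

# Case 2: foo is accidentally a list. += extends the list with the
# string's characters instead of raising an error.
items = [1, 2, 3]
items += "ooo"

# The join form fails loudly because the outer list contains a non-string.
try:
    "".join([[1, 2, 3], "ooo"])
    join_error = None
except TypeError as exc:
    join_error = exc

print(foo, joined, items, type(join_error).__name__)
```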


It's not so much large strings as lots of concatenation; aggregating all the substrings into a list and joining saves allocations, and at least at one point in time it gave a huge speedup on a number of large systems. While Python's overall performance has improved since then, my understanding is that creating lots of strings and incrementally assembling them, allocating and dropping lots of progressively larger items along the way, is still often a significant performance hit.


If you use += in a loop over a sequence, then you should replace it with a single line of join(), which should be more efficient. Otherwise you’re absolutely right.
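
A rough sketch of the two shapes being compared, in case it helps. The function names and the word list are made up for the example, and no particular timing is claimed: whether join is noticeably faster depends on the interpreter and version, so it is worth measuring on your own setup.

```python
import timeit


def concat_in_loop(words):
    """Build the result with repeated += inside a loop."""
    result = ""
    for word in words:
        result += word
    return result


def concat_with_join(words):
    """Build the result with a single join over the whole sequence."""
    return "".join(words)


words = ["spam"] * 1000

# Both approaches must produce the same string.
assert concat_in_loop(words) == concat_with_join(words)

# Timings vary by interpreter and version, so measure rather than assume.
loop_seconds = timeit.timeit(lambda: concat_in_loop(words), number=200)
join_seconds = timeit.timeit(lambda: concat_with_join(words), number=200)
print(f"+= loop: {loop_seconds:.4f}s, join: {join_seconds:.4f}s")
```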


As far as I am aware, str.join instead of str += is just a legacy optimisation that is not really valid when developing for modern Python versions.


Nitpick: I'd rather use the python -m pytest way than modify sys.path (AKA "the django way").

Modifying sys.path is an ugly hack which makes python look like some... idk. matlab?


I don't understand what the author has against requiring a pip install -e .

You're requiring contributors to install dependencies anyway.

If you're using pytest (I think the article assumes you're only using the standard library unittest which I wouldn't personally recommend) there is pytest-pythonpath which at least makes this invisible to the user.


Totally agree with this.

The guide presents a false dichotomy. The main package doesn't have to be in site-packages as the alternative to this sys.path hack. It just has to be on your import path (sys.path). Running tests as a module from the project root puts the current directory, and with it your main package, on that path.
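
Here is a small sketch of that, with assumptions flagged: it uses the standard library's unittest instead of pytest so it runs without extra installs, and the sample package and tests/test_basic.py layout are invented for the demo. It creates a throwaway project and runs the tests with python -m from the project root, with no sys.path edits in the test file:

```python
import subprocess
import sys
import tempfile
from pathlib import Path

TEST_SOURCE = '''\
import unittest

import sample  # resolved via the project root, no sys.path edits here


class AnswerTest(unittest.TestCase):
    def test_answer(self):
        self.assertEqual(sample.answer(), 42)
'''


def run_tests_as_module():
    """Create a throwaway project and run its tests with `python -m unittest`."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "sample").mkdir()
        (root / "sample" / "__init__.py").write_text("def answer():\n    return 42\n")
        (root / "tests").mkdir()
        (root / "tests" / "__init__.py").write_text("")
        (root / "tests" / "test_basic.py").write_text(TEST_SOURCE)
        # cwd=root matters: -m puts the current directory on sys.path.
        return subprocess.run(
            [sys.executable, "-m", "unittest", "tests.test_basic"],
            cwd=root,
            capture_output=True,
            text=True,
        )


result = run_tests_as_module()
print(result.returncode)
```

As far as I know, python -m pytest relies on the same mechanism, since it is the -m switch that adds the current directory to sys.path.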


The best answer in 2020 is to use poetry.

http://poetry.eustace.io


Whenever I start a new Django / Python project I come up with a short and generic(ish) internal name, like “pluto” - then, the repo is structured like this:

    README.md
    runtime.txt
    requirements.txt
    pluto
      wsgi.py
      manage.py
      urls.py
      - conf
        - common.py
        - prod.py
        - local.py
      - static
      - templates
      - apps
        - accounts
          - models.py
          - etc
        - users
        - etc
In a case like this, the repo is named “pluto” in GitHub, but I clone it as “src”; so, the project locally looks like this:

    pluto
     \ src
        \ pluto
I CD to the root (pluto/src) and work from there.

I also use relative imports from within the “apps” directory:

    # e.g., inside pluto/apps/accounts/views.py
    from ..users.models import User
I’ve used this pattern for years now. It feels so much cleaner than any other pattern I’ve ever used, and it also makes my projects much more reusable (if I want to copy my latest project as a skeleton for the next, etc.).


  To give the individual tests import context, create a tests/context.py file:
  import os
  import sys
  sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))
Or you can convert your tests directory into a package by adding an __init__.py. It's pretty amazing that Kenneth Reitz uses this ugly solution.


I find myself coming back to this guide whenever I set up a new project: https://sourcery.ai/blog/python-best-practices/


A general ignore_missing_imports=True in mypy.ini is very bad advice. This will hide lots of real errors. It's surprisingly easy to set up mypy such that it will silently not check types in many files.

Additionally, I can't be the only person who utterly despises pipenv. It is confusing for most people who don't know Python packaging internals, phenomenally slow for everyone else, and besides that it discourages building proper distributables (i.e., wheels).


What do you do instead of pipenv?


Pipenv attempts to replace all other tooling, so the answer is: all other tooling. Specifically, setuptools via either setup.py or setup.cfg. pbr, virtualenvs, tox, pip-tools, pip, etc. are all useful. Anything that does not take 5-10 minutes to recompute dependencies, basically!


That is a big weakness, I agree. But the mental tax of having to remember how to use >5 cli tools vs one is definitely pro pipenv for me.


For all the grief Kenneth Reitz gets... gotta give him credit for his emphasis on usability and professional presentation.


Why does he get grief?




