Hacker News
Pendulum – Python datetimes made easy (github.com/sdispater)
172 points by sdispater on Aug 17, 2016 | 73 comments



Pendulum is a new library for Python that eases datetime, timedelta and timezone manipulation.

It is heavily inspired by [Carbon](http://carbon.nesbot.com) for PHP.

Basically, the Pendulum class is a replacement for the native datetime one with some useful and intuitive methods, the Interval class is intended to be a better time delta class and, finally, the Period class is a datetime-aware timedelta.

Timezones are also easier to deal with: Pendulum will automatically normalize your datetime to handle DST transitions for you.

    import pendulum

    # 2:30 on the 31st of March 2013 does not exist in Paris,
    # so pendulum returns the actual time, which is 3:30+02:00
    pendulum.create(2013, 3, 31, 2, 30, 0, 0, 'Europe/Paris')
    # '2013-03-31T03:30:00+02:00'

    dt = pendulum.create(2013, 3, 31, 1, 59, 59, 999999, 'Europe/Paris')
    # '2013-03-31T01:59:59.999999+01:00'
    dt = dt.add(microseconds=1)
    # '2013-03-31T03:00:00+02:00'
    dt.subtract(microseconds=1)
    # '2013-03-31T01:59:59.999999+01:00'

To those wondering: yes I know [Arrow](http://crsmithdev.com/arrow/) exists but its flaws and strange API (you can throw almost anything at get() and it will do its best to determine what you wanted, for instance) motivated me to start this project. You can check why I think Arrow is flawed here: https://pendulum.eustace.io/faq/#why-not-arrow

Link to the official documentation: https://pendulum.eustace.io/docs/

Link to the github project: https://github.com/sdispater/pendulum


Cool library, I'll definitely be looking into this more closely in the future. The automatic handling of ambiguous time during timezone transitions is curious. As an example, Django punts on automatic handling and requires a user to choose between pre and post transition (I wrote the is_dst handling): https://github.com/django/django/blob/19e20a2a3f763388bba926...

What if you were to begin with a time at 3:30 and then subtract an hour? Would your library properly return 1:30? What if a user had a naive time at 3:30, subtracted an hour, and then converted to an aware time using your library? For what it's worth, I think moving to 3:30 is the correct behaviour in the vast majority of cases. Requiring a user to provide a direction to move and throwing an exception if they don't is dangerous. How often are users going to see this exception, if at all? Just wondering how much you've considered cases, if they exist, that would be better off moving to 1:30.

I'll add one more thing. As soon as I started browsing the page I thought "why would I use this over Arrow?". Great to see you addressed that by default. I didn't know about some of Arrow's shortcomings/bugs, so they were really useful.

Handling dates, times, and timezones in particular is a tricky problem, as evidenced by the large number of libraries in each language trying to get it right. If you haven't already, I'd really recommend reading the blog posts of Jon Skeet regarding Noda Time http://blog.nodatime.org/ and https://codeblog.jonskeet.uk/category/nodatime/ even if they only turn up some corner cases you haven't considered, or validate ones you have.

Thanks!


PEP 495 - Local Time Disambiguation (folding) handles this in Python 3.6 quite elegantly:

https://www.python.org/dev/peps/pep-0495/

It adds an extra attribute 'fold' that can be '0' (the default) for normal times and '1' for the later version with that same local time representation in the case of the clock going backwards.
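
A minimal sketch of the `fold` attribute in action. It assumes Python 3.9+ so that the standard-library `zoneinfo` module (not mentioned in the thread) can supply the Europe/Paris rules; the date is the autumn 2016 transition used elsewhere in this discussion.

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo  # assumption: Python 3.9+ with tz data available

paris = ZoneInfo("Europe/Paris")

# 02:30 on 30 October 2016 occurred twice in Paris (clocks went back at 03:00).
first = datetime(2016, 10, 30, 2, 30, tzinfo=paris)           # fold=0, the default
second = datetime(2016, 10, 30, 2, 30, fold=1, tzinfo=paris)  # fold=1, the later one

print(first.isoformat())   # 2016-10-30T02:30:00+02:00
print(second.isoformat())  # 2016-10-30T02:30:00+01:00
```

The two objects carry the same wall-clock fields and differ only in `fold`, which is what lets the tzinfo pick the right UTC offset.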


I can't decide if auto-normalizing my times to DST is a feature or anti-feature. If I didn't know it was there, I would consider it a surprising default.


I see what you mean but as soon as you deal with timezone-aware datetimes you don't really have a choice, if an hour has been skipped, it simply doesn't exist.

What is important here, I think, is that any time arithmetic is not affected by this normalization, so if you add an hour the difference will be an hour, so you don't really have to think about it.
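
For contrast, here is a rough sketch of what the same one-microsecond step looks like with only the standard library. It assumes Python 3.9+ and `zoneinfo` (neither is part of the thread); it is not Pendulum's implementation, just an illustration of wall-clock arithmetic versus arithmetic done on the underlying instant.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # assumption: Python 3.9+ with tz data available

paris = ZoneInfo("Europe/Paris")
dt = datetime(2013, 3, 31, 1, 59, 59, 999999, tzinfo=paris)
step = timedelta(microseconds=1)

# Plain aware arithmetic moves the wall clock and lands inside the skipped hour.
wall = dt + step

# Doing the arithmetic in UTC and converting back yields the real next instant.
absolute = (dt.astimezone(timezone.utc) + step).astimezone(paris)

print(wall.replace(tzinfo=None))  # 2013-03-31 02:00:00 (a time that never existed in Paris)
print(absolute.isoformat())       # 2013-03-31T03:00:00+02:00
```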


> I see what you mean but as soon as you deal with timezone-aware datetimes you don't really have a choice, if an hour has been skipped, it simply doesn't exist.

So somebody clearly made a mistake (either the user or the programmer not checking the input) which you can't automagically fix. Well, at least, it's a terrible idea.

I understand the intent of helping them, but you do more harm this way.

It's also against the Zen of Python.


I tend to agree.

Obviously automatic normalisation is the correct thing to do when doing arithmetic that crosses DST-rollover boundaries (this is what the Python stdlib gets wrong), but I don't think it should be done (by default) upon creation with a location-based TZ specification:

pendulum.create(2016, 10, 30, 2, 30, 00, 0, 'Europe/Paris') is ambiguous, and pendulum.create(2016, 3, 27, 2, 30, 00, 0, 'Europe/Paris') is non-existent. pytz raises AmbiguousTimeError or NonExistentTimeError respectively in cases where you try to construct local times like that and specify is_dst=None: http://pytz.sourceforge.net/#problems-with-localtime

With pendulum I don't see a possibility to enforce a "strict mode" that turns off those automatic assumptions.
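
As a sketch of what such a strict check could look like, the helper below classifies a naive wall-clock time before it is attached to a zone. `classify_local_time` is a hypothetical name, not part of Pendulum or pytz, and the code assumes Python 3.9+ with `zoneinfo` and the PEP 495 `fold` attribute.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # assumption: Python 3.9+ with tz data available


def classify_local_time(naive, tz):
    """Hypothetical helper: return 'ok', 'ambiguous' or 'nonexistent'."""
    early = naive.replace(tzinfo=tz, fold=0)
    late = naive.replace(tzinfo=tz, fold=1)
    # Outside a transition, fold makes no difference to the UTC offset.
    if early.utcoffset() == late.utcoffset():
        return "ok"
    # A skipped time does not survive a round trip through UTC.
    roundtrip = early.astimezone(timezone.utc).astimezone(tz)
    if roundtrip.replace(tzinfo=None) != naive:
        return "nonexistent"
    return "ambiguous"


paris = ZoneInfo("Europe/Paris")
print(classify_local_time(datetime(2016, 10, 30, 2, 30), paris))  # ambiguous
print(classify_local_time(datetime(2016, 3, 27, 2, 30), paris))   # nonexistent
print(classify_local_time(datetime(2016, 8, 17, 8, 0), paris))    # ok
```

A caller wanting strict behaviour could raise on anything other than `'ok'`, while a lenient caller could pick a side explicitly.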


I am not too keen on raising an error by default (which pytz does not do by default either; it returns the pre-DST transition time instead).

I think the best thing to do here is keep the current behavior (post-DST) but with an option to choose what you want exactly (post-DST, pre-DST)


If the author is still reading this, I agree.


A cursory glance at Django's timezone docs says pytz (which is optional but recommended) raises an exception when a time doesn't exist due to DST.

> Unfortunately, during DST transitions, some datetimes don’t exist or are ambiguous. In such situations, pytz raises an exception. Other tzinfo implementations, such as the local time zone used as a fallback when pytz isn’t installed, may raise an exception or return inaccurate results. That’s why you should always create aware datetime objects when time zone support is enabled.

https://docs.djangoproject.com/en/1.10/topics/i18n/timezones...


I already faced some of the mentioned problems using Arrow, I'll give pendulum a try tomorrow.


Good, because arrow is effectively dead at this point


Why not submit fixes for the existing library instead?


Can't talk about the specific complaints the author has, but there are a bunch of datetime libraries for Python with often intentionally different behaviors. Submitting "fixes" requires the other libraries' authors to see what you want to fix as a defect and not as a design decision, a feature, or something too irrelevant to justify breaking changes.

At least that was my impression from trying to find a combination of libraries I like and reading various bug trackers in the process. In the end I pick and choose the pieces that do what I like for each part of the process, luckily the datatypes are mostly compatible or easily adapted.


What mysql lib would you recommend?

I'm doing a `pip install https://dev.mysql.com/get/Downloads/Connector-Python/mysql-c...` on older CentOS boxes, because I found `pip install mysql-connector-python` seems to fail on older boxes.

I also wonder if I should start to look at SQLAlchemy. Not so much the ORM, but rather the simple DBAPI[0] interface.

I'm already using Postgres FDW to integrate PostGIS into our MySQL DBs, so I am going to drag a Postgres Python lib around in the near future. Any recommendations?

PS: I believe I just convinced myself to go invest some time into learning SQLAlchemy.

http://aosabook.org/en/sqlalchemy.html


Simply because I do not really like the API of the existing libraries (Arrow or Delorean) and I just can't fix the design decisions made by the authors.


This area is prone to severe bikeshedding. Back in 2012, I filed a Python bug, "datetime: add ability to parse RFC 3339 dates and times"[1]. RFC 3339 timestamps appear in email, RSS feeds, etc. The datetime library could output them, but not parse them. There are at least seven parsing functions in PyPI for them, and each has some major problem.

There have been steady discussions of this issue for almost four years now. I dropped out years ago, but the arguments go on.

[1] https://bugs.python.org/issue15873
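
For readers on a newer interpreter, a rough sketch of parsing the common RFC 3339 shapes with the standard library. It assumes Python 3.7+, where `datetime.fromisoformat` exists; `parse_rfc3339` is a hypothetical helper and not a full validator (on versions before 3.11, fractional seconds must have exactly 3 or 6 digits).

```python
from datetime import datetime, timedelta, timezone


def parse_rfc3339(text):
    """Hypothetical helper: parse common RFC 3339 timestamps, not a strict validator."""
    # Older fromisoformat versions reject the "Z" suffix, so spell it as an offset.
    if text[-1] in "Zz":
        text = text[:-1] + "+00:00"
    return datetime.fromisoformat(text)


print(parse_rfc3339("2016-08-17T08:01:07Z").isoformat())
# 2016-08-17T08:01:07+00:00
print(parse_rfc3339("2016-08-17T08:01:07.576+02:00").isoformat())
# 2016-08-17T08:01:07.576000+02:00
```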


> RFC 3339 timestamps appear in email

Where exactly in email do they appear? In the header, it's been my experience they all conform to the RFC 2822 spec, and could be parsed with the standard library function email.utils.parsedate_tz[1].

[1] https://docs.python.org/2/library/email.util.html#email.util...
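
A short sketch of that standard-library route. The header value is a made-up example; `parsedate_to_datetime` is the Python 3.3+ companion to `parsedate_tz` that returns an aware datetime directly.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime, parsedate_tz

header = "Wed, 17 Aug 2016 08:01:07 +0200"  # illustrative RFC 2822 date header

# parsedate_tz returns a 10-tuple; the last item is the UTC offset in seconds.
fields = parsedate_tz(header)
print(fields[:6], fields[9])  # (2016, 8, 17, 8, 1, 7) 7200

# parsedate_to_datetime builds an aware datetime from the same header.
parsed = parsedate_to_datetime(header)
print(parsed.isoformat())     # 2016-08-17T08:01:07+02:00
```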


Past bikeshedding, there's also the long list of myths programmers believe about time[1] and the crowd-sourced followup[2]. Pendulum inherits from datetime in the stdlib, but I'm unsure how well either of those address the issues raised (or even if it's possible - some need to be addressed by the code that uses pendulum/datetime).

[1] http://infiniteundo.com/post/25326999628/falsehoods-programm... [2] http://infiniteundo.com/post/25509354022/more-falsehoods-pro...


Nobody has a good solution to leap seconds. 86400 seconds = 1 day is nailed into too much software. The problem is serious enough that some operations, including high frequency trading, are stopped around a leap second.
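
The standard library is an example of that assumption. A small sketch, using the leap second scheduled for the end of 2016-12-31 UTC: `datetime` cannot represent 23:59:60 at all, and it counts every day as exactly 86400 seconds.

```python
from datetime import datetime

# The leap second itself (23:59:60) is rejected by the constructor.
try:
    datetime(2016, 12, 31, 23, 59, 60)
    representable = True
except ValueError:
    representable = False

# The day containing the leap second is still counted as 86400 seconds.
elapsed = datetime(2017, 1, 1) - datetime(2016, 12, 31)

print(representable)            # False
print(elapsed.total_seconds())  # 86400.0
```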


I know pandas is a bit meaty for a date time library if you don't already use it but their Timestamp class is awesome. String parsing is a breeze, offsets and timezones are easy and then there's a ton of support for time series.

    In [34]: pd.Timestamp('2016-08') == pd.Timestamp('2016.08') == pd.Timestamp('2016/08') == pd.Timestamp('08/2016')
    Out[34]: True
    
    In [38]: pd.Timestamp('2016') == pd.Timestamp(datetime.datetime(2016,1,1))
    Out[38]: True
    
    In [49]: pd.Timestamp('2016') + pd.offsets.MonthOffset(months=7) == pd.Timestamp('2016-08')
    Out[49]: True
    
    In [52]: pd.Timestamp.now()
    Out[52]: Timestamp('2016-08-17 08:01:07.576323')
    
    In [53]: pd.Timestamp.now() + pd.offsets.MonthBegin(normalize=True)
    Out[53]: Timestamp('2016-09-01 00:00:00')
    
see http://pandas.pydata.org/pandas-docs/stable/timeseries.html for more examples


I also really like the Timestamp class, are there obvious reasons not to use it?


Having pandas as a dependency?


That doesn't seem like a terrible dependency, is that the only reason?


I don't understand why this library exists. I am (literally) manipulating datetimes for a living in a Django app this whole year and we don't even have Arrow installed. Pytz and maybe dateutil is all you need.

Also I really hate it when someone fragments the energy and their time making a new, inferior library instead of fixing and patching the existing ones for basic things like this. This way we will have a bunch of incomplete, inferior libraries which all have quirks instead of only one really good one which everybody could use...


I don't see what the big deal is. These people are spending their own time making libraries, and not using up your time. Are you to judge how people should make use of their time?

Second, I think it's great that people are making alternatives - it promotes a healthier ecosystem where people can share good ideas as well.

> instead of fixing and patching the existings

Finally, by your logic, the popular requests library shouldn't have existed and Kenneth would have been patching the more conventional urllib library.


> Are you to judge how people should make use of their time?

This library was not born out of fun, but out of frustration, to solve a (non-existent) problem. I would not have said one word if he was doing this for fun or just learning or something like that.


What defines a problem as non-existing/existing? It's very subjective. While it may not be for you, the author definitely saw an issue that needed to be fixed.

Again, I'm using the example of requests: was Kenneth wrong to implement something new when there was already a standard lib?


Django provides django.utils.timezone to make handling dates and times easier. Not every python project has access to django.utils.timezone. pytz is mostly what you need to get timezones right, but there's still room for something like this.


good point, but still, there are at least 3-4 libraries which do exactly what this library is trying to do and do a better job at it


What libraries? I'm honestly interested to know which ones you refer to.



So, basically, you haven't read any of my comments. Arrow is really flawed, see https://pendulum.eustace.io/faq/#why-not-arrow, and Delorean is not as complete feature-wise as Pendulum. Neither of these libraries properly handles shifting time around DST transition times, which is a big flaw in itself because it's not accurate.


Also manipulating datetimes for a living (telemetry + billing). The reason I stick with Arrow despite its slowing maintainership? Its ability to round up or down time along a granularity: http://crsmithdev.com/arrow/#arrow.arrow.Arrow.floor
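
For reference, flooring to a granularity can be approximated with plain `datetime.replace`. This is only a sketch: `floor_to` is a hypothetical helper, not Arrow's `floor`, and it covers just three units.

```python
from datetime import datetime


def floor_to(dt, unit):
    """Hypothetical helper: truncate dt to the start of the given unit."""
    if unit == "day":
        return dt.replace(hour=0, minute=0, second=0, microsecond=0)
    if unit == "hour":
        return dt.replace(minute=0, second=0, microsecond=0)
    if unit == "minute":
        return dt.replace(second=0, microsecond=0)
    raise ValueError(f"unsupported unit: {unit}")


stamp = datetime(2016, 8, 17, 8, 1, 7, 576323)
print(floor_to(stamp, "hour"))  # 2016-08-17 08:00:00
print(floor_to(stamp, "day"))   # 2016-08-17 00:00:00
```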


Looks like a nice polished interface. I am wondering what is Pendulum's policy on invalid input. The examples illustrate the inconsistent approach to invalid inputs:

In some cases it guesses what you meant:

    >>> pendulum.create(2013, 3, 31, 2, 30, 0, 0, 'Europe/Paris')
    '2013-03-31T03:30:00+02:00' # 2:30 does not exist (Skipped time)

In other cases, it raises exceptions:

    >>> pendulum.parse('2016-06-31')
    # ValueError: day is out of range for month


+1, i'd expect a ValueError in the first example


Python stdlib datetime/time/calendar libs are junk. That one constantly has to read obscure function signatures in the docs to do rather obvious things is just awful.

On the other hand, if you've used e.g. Ruby/Rails datetime handling then you get used to reasonable things working (such as Time.now + 1.day) that Arrow doesn't handle well. As a matter of fact Arrow got rid of DT deltas and seemingly made the situation worse.

I've only looked at the examples in the docs; but to be serious, Python should just scrap their DT/time/calendar libs and copy Ruby verbatim. This NIH thing needs to stop.


Does Time.now + 1.day return the time 24 hours into the future or does it return the same time (hour:minutes) but the next date? (It may be more or less than 24 hours from now if the UTC offset has changed for any reason, e.g. due to a DST transition.) How do you express these different cases in Ruby explicitly?

Related: http://stackoverflow.com/questions/441147/how-can-i-subtract...
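
The two meanings can be spelled out explicitly in Python. The sketch below assumes Python 3.9+ and `zoneinfo` (not mentioned in the thread) and uses the day before the 2016 spring-forward in Paris.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # assumption: Python 3.9+ with tz data available

paris = ZoneInfo("Europe/Paris")
start = datetime(2016, 3, 26, 12, 0, tzinfo=paris)  # noon, the day before DST starts

# "Same wall-clock time tomorrow": plain aware arithmetic keeps hour and minute.
same_wall_time = start + timedelta(days=1)

# "Exactly 24 hours later": do the arithmetic on the instant, in UTC.
exactly_24h = (start.astimezone(timezone.utc) + timedelta(hours=24)).astimezone(paris)

# Real elapsed time for the wall-clock version is only 23 hours.
elapsed = same_wall_time.astimezone(timezone.utc) - start.astimezone(timezone.utc)

print(same_wall_time.isoformat())  # 2016-03-27T12:00:00+02:00
print(exactly_24h.isoformat())     # 2016-03-27T13:00:00+02:00
print(elapsed)                     # 23:00:00
```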


1.day? I confess to not being a Rubyist, but does that require monkeypatching the base int class?

I don't see anything unreasonable about Pendulum's interface. Let's let Python be Python and Ruby be Ruby.


I have been a Rubyist for over a decade now, and while I don't have a problem with the monkeypatching per se, I don't think anyone should be holding up ActiveSupport's time zone support as a paragon of good API design. Yes, on the surface it looks pretty nice, but because of the weird mix of different classes and extensions you get a Frankenstein API that is a very leaky abstraction. I could dig up a raft of examples, but just off the top of my head... Date.today respects the global setting of Time.zone, but Date.yesterday always gives you the UTC date. The inconsistencies and permutations of Date, DateTime, Time, and TimeWithZone, combined with machine clock, Time.zone global, and UTC lead to so much confusion that the only way to ensure correctness is to declare a subset of the API which you always use, and reject everything else just so your team gets used to reasoning about it.

Sorry for the rant, but I've spent many years as the only California developer for a team based in the UK, suffering the tyranny of developers who spend half their year blissfully living in UTC and unknowingly foisting off their dirty time zone bugs on me.


Similar idea to Ruby's .times() method [0].

[0]: http://ruby-doc.org/core-1.9.3/Integer.html#method-i-times


Well, for a long time, letting "Python be Python" required dealing with absolutely terrible time zone support. As an example, Python's strptime only got timezone support (in %z) in version 3.2 in 2011.
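
On Python 3.2 or later, that `%z` support looks like this (the timestamp string is just an illustrative value):

```python
from datetime import datetime, timedelta

# %z parses a numeric UTC offset and produces an aware datetime.
parsed = datetime.strptime("2016-08-17 08:01:07 +0200", "%Y-%m-%d %H:%M:%S %z")

print(parsed.isoformat())  # 2016-08-17T08:01:07+02:00
```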


No, don't think so. I think this is a native syntax construct in Ruby.


1.day is not a native Ruby concept, it is a method monkeypatched into the Numeric class by Rails' ActiveSupport. See http://api.rubyonrails.org/classes/Numeric.html#method-i-day and https://ruby-doc.org/core-2.2.0/Numeric.html.


I can see how people think it is native Ruby. Very few people use Ruby without Rails.


I just added a feature proposal to this library with a similar syntax to this but in a Pythonic way.

    import pendulum as pm

    now_in_paris = pm.now('Europe/Paris')
    tomorrow_in_paris = now_in_paris + pm.day
    next_week_in_paris = now_in_paris + 7*pm.day

https://github.com/sdispater/pendulum/issues/17


    >>> from datetime import datetime, timedelta
    >>> datetime.now() - (datetime.now() + timedelta(days=1))
    datetime.timedelta(-2, 86399, 999969)


I’m not entirely sure what you were trying to demonstrate, but clearly the result of datetime.now() changes between the two invocations (it has microsecond precision).

Try:

    >>> from datetime import datetime, timedelta
    >>> now = datetime.now()
    >>> now - (now + timedelta(days=1))
    datetime.timedelta(-1)


datetime.timedelta internally uses a triple of (days, seconds, microseconds) to represent its data and exposes that to users of the class.

That a timedelta can be represented in multiple ways is alone quite surprising but some of the representations that happen can be very confusing. I think the example I gave which represents "-1 day and a few microseconds" as "-2 days + 86399 seconds + 999969 microseconds" is very surprising and it commonly happens in practice.

For comparison, here's what postgres does:

    postgres=> select now() - (now() + '12 milliseconds'::interval + '1 day'::interval);
           ?column?        
    -----------------------
     -1 days -00:00:00.012
    (1 row)
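
The normalization described above can be reproduced deterministically, without relying on two `now()` calls. A minimal sketch with the standard library only:

```python
from datetime import timedelta

# "-1 day and 31 microseconds", built directly instead of from two now() calls.
delta = timedelta(days=-1, microseconds=-31)

# Only `days` may be negative; seconds and microseconds are always non-negative.
print((delta.days, delta.seconds, delta.microseconds))  # (-2, 86399, 999969)
print(str(delta))                                       # -2 days, 23:59:59.999969

# total_seconds() gives the signed magnitude in one number.
total = delta.total_seconds()
```

`total_seconds()` is usually the easier value to reason about when the sign matters.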


That's a problem Pendulum is trying to solve

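    from datetime import datetime

    import pytz
    from pendulum import Pendulum  # Pendulum class as used in this thread

    # Standard library: a negative difference is normalized to -1 day + 75600 s
    d1 = datetime(2012, 1, 1, 1, 2, 3, tzinfo=pytz.UTC)
    d2 = datetime(2011, 12, 31, 22, 2, 3, tzinfo=pytz.UTC)
    delta = d2 - d1
    delta.days     # -1
    delta.seconds  # 75600

    # Pendulum: the same difference reads as 0 days and -3 hours
    d1 = Pendulum(2012, 1, 1, 1, 2, 3)
    d2 = Pendulum(2011, 12, 31, 22, 2, 3)
    delta = d2 - d1
    delta.days     # 0
    delta.hours    # -3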
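
To illustrate the idea (not Pendulum's actual implementation), a same-signed breakdown can be derived from a plain `timedelta`. `signed_components` is a hypothetical helper written for this sketch.

```python
from datetime import datetime, timedelta


def signed_components(delta):
    """Hypothetical helper: split a timedelta into same-signed (days, hours, minutes, seconds)."""
    sign = -1 if delta < timedelta(0) else 1
    # Work on the magnitude in whole seconds, then re-apply the sign to each part.
    total = abs(delta) // timedelta(seconds=1)
    days, rest = divmod(total, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)
    return tuple(sign * part for part in (days, hours, minutes, seconds))


d1 = datetime(2012, 1, 1, 1, 2, 3)
d2 = datetime(2011, 12, 31, 22, 2, 3)
print(signed_components(d2 - d1))  # (0, -3, 0, 0)
```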


What benefits does this have over [Delorean](http://delorean.readthedocs.io/en/latest/)?

The README for Pendulum seems to show me one feature Delorean doesn't explicitly have -- `is_weekend()` -- otherwise these libraries are conceptually very similar.

I do agree that this (and Delorean) is a usability improvement over `datetime` and perhaps even Arrow (though I'm not tremendously familiar with Arrow).


Mainly, its API, which tries to be as close as possible to the one provided by the standard datetime library while providing a bunch of helpful methods.

Also, Pendulum is not just about datetimes but also provides both the Interval class and Period class which are improvements over the native timedelta class.

And finally, it properly handles time shifting around DST transition times and normalization of datetimes when creating them, which neither Delorean nor Arrow does well.


So far I haven't seen either of these libraries do substantially better than datetime+dateutil+pytz at anything they claim to be good at.


I'm happy to not have to import 3 libraries anymore (2 that aren't part of the standard library) to see if one date is greater than another or to add a few days to a date.


They both import all three under the hood.


arrow.get(value).humanize()

How can I do this with datetime+dateutil+pytz?


I was talking about delorean and pendulum (I think josefdlange edited his comment). I agree that arrow's multi-locale natural language renderer does add a lot of value. (Although I'd argue it should be unbundled into its own module, instead of Arrow trying to replace a ton of datetime/dateutil functionality.)


> I agree that arrow's multi-locale natural language renderer does add a lot of value. Although I'd argue it should be unbundled into its own module

http://babel.pocoo.org/en/latest/api/dates.html

humanize is apparently a human-readable directioned delta, so that'd be format_timedelta(delta, add_direction=True)

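    >>> from datetime import datetime, timedelta
    >>> from babel.dates import format_timedelta
    >>> ref = datetime.now() - timedelta(hours=1)  # assumed reference time, one hour ago
    >>> print(format_timedelta(ref - datetime.now(), add_direction=True))
    1 hour ago
    >>> print(format_timedelta(ref - datetime.now(), add_direction=True, locale='zh'))
    1小时前
    >>> print(format_timedelta(ref - datetime.now(), add_direction=True, locale='ru'))
    1 час назад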


No editing here but that's okay. I only just came back to this conversation now :D

I think Delorean, Pendulum, Arrow, etc, aim to make dealing with dates more understandable and less painful. It's perfectly possible to fully grasp all the technicalities of pytz/dateutil/datetime -- and I recommend that anybody who has to deal with dates and times in Python do so -- but if you'd rather not think about it all and are okay with deferring some control to the opinions of one of these libraries, that's when you probably pull them off the shelf instead of the aforementioned trio.


I've been happily using django.utils.timezone for a while now and doing everything on the backend in UTC. For any user-facing timestamps, to some degree I'd rather keep that in the front end and a separate concern from my API.

Not saying storing user timezone and converting on the backend is bad; but this is simpler when localized timestamps aren't a core part of my app.


Yes, always work in UTC. But you could use this library to convert user provided naive datetimes to UTC, and datetimes coming from the database into localised datetimes. Converting from/to timezones other than UTC in Django isn't as nice as it could be.


No, don't always work in UTC.

If, on January 1st 2011, a Samoan user told you to record an event for January 1st 2012, and you stored the date in UTC, you would have reminded them on January 2nd 2012 (all local).

Because in May 2011 Samoa announced they were going to skip a local day and move across the international date line. So 2011-12-30T09:00:00 UTC was 2011-12-29T23:00:00 Pacific/Apia, but 2011-12-30T10:00:00 UTC was 2011-12-31T00:00:00 Pacific/Apia.
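
The two conversions quoted above can be checked against the tz database, and the usual workaround is to store the wall-clock time plus the zone name for future events. A sketch assuming Python 3.9+ and `zoneinfo`; the 09:00 event time and the `stored` tuple are illustrative, not from the thread.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # assumption: Python 3.9+ with tz data available

apia = ZoneInfo("Pacific/Apia")

# One hour apart in UTC, but two calendar days apart in Samoa.
before = datetime(2011, 12, 30, 9, 0, tzinfo=timezone.utc).astimezone(apia)
after = datetime(2011, 12, 30, 10, 0, tzinfo=timezone.utc).astimezone(apia)
print(before.isoformat())  # 2011-12-29T23:00:00-10:00
print(after.isoformat())   # 2011-12-31T00:00:00+14:00

# Storing local wall time plus zone name defers the UTC conversion
# until the rules in force at that date are known.
stored = ("2012-01-01T09:00:00", "Pacific/Apia")  # hypothetical stored record
event = datetime.fromisoformat(stored[0]).replace(tzinfo=ZoneInfo(stored[1]))
print(event.astimezone(timezone.utc).isoformat())  # 2011-12-31T19:00:00+00:00
```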


Wow, that's a really interesting case. I guess the same thing would happen whenever an offset was adjusted after you store a value in UTC intended for another timezone. I don't think this is even supported by Django - all times are converted to UTC on entry to the database.


> Wow, that's a really interesting case. I guess the same thing would happen whenever an offset was adjusted after you store a value in UTC intended for another timezone.

Yep, or if a DST change was added or rescinded (which can happen on surprisingly short notice, many governments love fucking up with DST, this year 2016 we got a cancellation of Egyptian DST with 3 days lead time).


That's very interesting. I wonder if the popular datetime libraries issued a patch for this case?


tz libraries shipped a new version for the tzdata update, as they do more or less every time the database is updated (monthly or so). 9 months lead time is actually pretty good, consider: on April 29th the Egyptian government announced they'd go on DST on July 7th, with no further information until June 27th when Parliament proposed to abolish DST and passed an (apparently non-binding) vote for that on June 28th, following which the Egyptian government announced on July 4th that they wouldn't go on DST after all.


It sounds like I should be updating my tz dependencies more often :o


What about precision? There is the Date object with day-precision and the Datetime object with microsecond precision ... but nothing else. There's no canonical way in Python with any library that I know of to say "July, 2013" or even "12:15pm". The former will simply put in July 1st and the latter will put in seconds and microseconds implying precision that doesn't exist.

Anybody know of an elegant solution/library or even a library that would be open to including such a concept?
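
One way to model this without any library is a small value type that only records the fields that are actually known. `PartialDate` below is a hypothetical sketch, not an existing API, and it handles dates only (not a bare "12:15pm").

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PartialDate:
    """Hypothetical value type: a date that keeps only the fields actually known."""
    year: int
    month: Optional[int] = None
    day: Optional[int] = None

    def __str__(self):
        # Render only as much precision as was supplied.
        parts = [f"{self.year:04d}"]
        if self.month is not None:
            parts.append(f"{self.month:02d}")
            if self.day is not None:
                parts.append(f"{self.day:02d}")
        return "-".join(parts)


print(PartialDate(2013, 7))      # 2013-07
print(PartialDate(2013, 7, 14))  # 2013-07-14
```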


numpy does this, with np.datetime64('2013-07') or np.datetime64('2013-07-01 12:15'). I wouldn't recommend using it if you're not otherwise using numpy, though.


My experience over the last few years has been datetime > dateutil > pytz > arrow > back to dateutil. Maybe I'll try Pendulum next. ;)


Good to see we're starting to come close to getting dates and times handled according to the rules (I won't say sensibly, because the rules themselves aren't sensible). But then we start over-reaching and trying to handle social constructs built on top of dates and times, which is even more of a mess D:

Eg, spot the error:

    _weekend_days = [SATURDAY, SUNDAY]
    date.is_weekend = date.day in _weekend_days
Hint: https://en.wikipedia.org/wiki/Workweek_and_weekend#Around_th...

(TIL one country doesn't even have consecutive weekend days)


I agree but this is an optional feature which might cover most developers' needs. But, if not, it's configurable:

    pendulum.set_weekend_days([pendulum.SUNDAY])
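
The same idea in standard-library terms, as a sketch: the weekend is a parameter rather than a constant. `is_weekend` and the day constants here are hypothetical names for illustration, not Pendulum's code.

```python
from datetime import date

# date.weekday() numbering: Monday is 0, Sunday is 6.
FRIDAY, SATURDAY, SUNDAY = 4, 5, 6


def is_weekend(day, weekend_days=frozenset({SATURDAY, SUNDAY})):
    """Hypothetical helper: weekend membership with a configurable set of days."""
    return day.weekday() in weekend_days


friday = date(2016, 8, 19)
print(is_weekend(friday))                                   # False
print(is_weekend(friday, weekend_days={FRIDAY, SATURDAY}))  # True
```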


Looks clean.

I think the `in_words()` function should probably be delimited by commas though, good to have your strings easily deconstructed into their constituent parts.



