The questions do not make sense so stop trying to ask them. It’s like asking if 2022-01-01 is an orange.
Generally speaking you want the context of what the date or datetime is being used for. If you want to know when Christmas takes place you will use a date: it does not take place on 2022-12-25 00:00:00 through 2022-12-25 23:59:59 because this would require a timezone and Christmas takes place at different UTC times around the globe. But you can reasonably say that Christmas takes place on 2022-12-25 and leave it at that to let the implementation of whatever program figure out if it is or is not currently Christmas based on the information it has about time and timezone.
This is way overstated. It's more like asking if the integer `1L` is between the float `1.0D` and `2.0D`. It requires an implicit cast. Decide whether operations involving LocalDate[0] and LocalDateTime should use the former or the latter as the working type (the latter, IMO), do the cast (LocalDate becomes a LocalDateTime with time component 00:00:00, or LocalDateTime becomes a LocalDate by dropping its time component), then do the operation.
If the types you're dealing with are LocalDate and Instant, then you'd need a contextual timezone to perform the conversion in either direction.
I don't think this is a hard problem, it's just one that requires more specificity than one might think at first glance. Users should be encouraged to read the spec of any language that does these sorts of implicit casts, in the same way that C programmers should be aware of the implicit typecasting rules in that language.
0. I'm using the java.time/org.joda.time type names here. Their equivalents should exist in any good time library. However, I think in reality many libraries fall short of the types you really need to express date & time concepts, which is what leads to so much confusion around them. Once you're familiar with LocalDate, LocalTime, LocalDateTime, OffsetDateTime, ZonedDateTime, Instant, Duration, and Period, (whew), it's possible to be clear about what you're doing.
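A rough Python analogue of the two casts described above, as a sketch rather than anything definitive. The standard library's `date` stands in for LocalDate, a naive `datetime` for LocalDateTime, and an aware `datetime` for Instant; the helper names `widen`, `narrow` and `instant_to_date` are made up for illustration.

```python
from datetime import date, datetime, time, timedelta, timezone, tzinfo

def widen(d: date) -> datetime:
    """LocalDate -> LocalDateTime with time component 00:00:00 (no zone)."""
    return datetime.combine(d, time.min)

def narrow(ldt: datetime) -> date:
    """LocalDateTime -> LocalDate by dropping the time component."""
    return ldt.date()

def instant_to_date(instant: datetime, zone: tzinfo) -> date:
    """Instant -> LocalDate; only possible once a contextual zone is supplied."""
    return instant.astimezone(zone).date()

d = date(2022, 6, 5)
ldt = datetime(2022, 6, 5, 20, 5)

# Pick a working type first, cast, then compare.
later_as_datetimes = ldt > widen(d)    # working type: LocalDateTime
same_as_dates = narrow(ldt) == d       # working type: LocalDate

# The same instant lands on different dates depending on the zone.
instant = datetime(2022, 6, 5, 2, 0, tzinfo=timezone.utc)
in_utc = instant_to_date(instant, timezone.utc)
in_utc_minus_7 = instant_to_date(instant, timezone(timedelta(hours=-7)))
```

The last two values differ by a day, which is the reason the Instant conversion cannot be implicit.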
I think the point is that the nature of software is that these nonsensical questions are “asked” by the code all the time. Data migration, different teams using different formats, user interfaces with imperfect translations, and other scenarios all result in these silly questions popping up. And in the middle sits an engineer who needs to decide how it will work.
And my point is that if you don’t understand what the data field represents, don’t make comparisons against it. You must understand the context and then use the data. Otherwise you’ll ask nonsensical questions and get nonsensical answers and then be surprised.
Actually, in Germany Christmas already starts on 12/24, so even this seemingly simplistic example is more complex if you support the entire world. Dates are hard.
Added to that, in some (but not all) Orthodox majority countries, most notably Russia, Christmas is on 25 December by the old Julian calendar, which is on 7 January by the current Gregorian calendar (which everyone uses, Russia included, for civil/commercial/everyday use). Its Gregorian date moves forward by one day every century (except for every fourth century, when it doesn’t).
Some of the other Orthodox Churches (such as the Greeks) technically celebrate Christmas, not using the Gregorian calendar, but rather the “Revised Julian” - which happens to be identical to the Gregorian until 2800. I wonder if, come 2800, they’ll remember to move the date of Christmas, or if they’ll think “there’s no point to it, let’s not” (assuming of course that both they, and humanity as a whole, are still around in 2800)
Sure. We can be more specific about which Christmas we talk about. I could have used the USA Independence Day instead (July 4th) as a less universal but still valid example. My point wasn’t about definition of a holiday but about date locality.
The answer is no/no/no because you cannot define a general comparison operator between dates and times without additional context. It's possible that you can define a contextual comparison operator that works for your domain and the question you want to ask, but without knowing the application for this comparison it's pointless to try to make a general statement.
My mental model for these kinds of things is that Times are instants and Dates are either:
1. Ranges (with the start and end Time depending on TZ and possibly other context)
2. Discrete cells of a calendar (which are mostly TZ independent - July 5th doesn't happen at the same time everywhere, but it is well-defined everywhere)
Also, I've been writing code for 25 years and I still have no idea what a DateTime is.
Those two things are called the time zone - a geographical area that has and had the same time all the time - and the zone time within that time zone, which might change from time to time, for example regularly because of daylight saving time or somewhat randomly because people change their mind about which time they want to use.
The problem isn't that you can't define an operator that will perform some boolean operation akin to equality on the values, the problem is that you can't define a universally agreed upon one. There are many sensible options (round to nearest, truncate, refuse entirely, and this is the start of the list not the end). The problem isn't a lack of options but an embarrassing surfeit of them. It's not that there aren't any answers, it's that there isn't an answer.
Moreover, there isn't anything intrinsically wrong with that. To go one "why" deeper, the problem is in the human brains that insist that such an operator simply must exist. How can we not do equality on times and dates? To which the answer is that your incredulity isn't really relevant and certainly doesn't create a universal answer.
Perhaps one "why" below that is a shallow understanding of operators in general, as Spivak references in a, err, nephew post of this one. If your concept of equality is that it was handed down from on high by your fourth grade teacher, you may have trouble dealing with the fact that there's really nothing special about it from a programming perspective, and programming routinely uses multiple concepts of "equality" even before we get into fundamentally ambiguous type comparisons! For instance, == vs. === in JavaScript is two right there. Arguing about which is "the" equality operator is fundamentally flawed on the grounds that there really isn't such a privileged position of "the" equality operator to be occupied in the first place. We define many equality-like operators all the time, suitable for whatever local needs we have, and while there are certainly some that are more "equality-like" than others, there are even so quite a few to choose from.
The ambiguity must be embraced and dealt with... or you will experience the implacable, impersonal engineering consequences of failing to do so, whatever those consequences may be for you in your particular situation.
Why define an operator? That is just asking for trouble. If you absolutely must compare two such values, insist that the person do the conversion manually at the comparison site, so that the intent and local use case is made explicit. Because you may wind up in a situation where someone else in the same codebase needs a different "method of coercion", and operator creation is fairly global-ish in most PLs.
If the date is, for example, the starting date of a contract, then the date is most likely just a shorthand for 00:00:00 on that date. With that there is no problem at all comparing the date with an instant and deciding whether the instant is before or after the contract went into effect. This is not arbitrary, this is making explicit use of otherwise implicit domain knowledge.
All comparators are arbitrary, even among same-typed objects.
You can define a poset on the natural numbers ordered by divisibility that’s well defined and different from the usual < relation.
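For anyone who wants to poke at the divisibility ordering, here is a tiny sketch. The function names are invented for illustration, and it assumes positive integers only.

```python
def divides(a: int, b: int) -> bool:
    """a precedes-or-equals b in the divisibility order when a divides b."""
    return b % a == 0

def comparable(a: int, b: int) -> bool:
    """In a partial order, some pairs are simply not related either way."""
    return divides(a, b) or divides(b, a)

# 2 | 4, so 2 comes before 4 in both orderings.
two_before_four = divides(2, 4)

# 2 < 3 in the usual order, but neither divides the other.
two_three_comparable = comparable(2, 3)
```

Both orderings are well defined on the same values; they just answer different questions, and the second one is allowed to say "neither".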
Pulling an example from maths, comparisons between integers and rationals are well defined despite them not being the same type. The obvious ordering uses the n <-> n/1 mapping but it doesn’t have to.
I would argue that both dates and datetimes dimensionally signify ranges of time with different lengths. A date being a range of ~24 hours and a time being a range of a minute/second/microsecond depending on the definition. It's rarely truly meant to signify an instantaneous point in time.
Looked at as ranges, equality can be defined as whether they overlap, and that works for most use cases but not all.
The only right model of dates and times is Java Time API. This is an example of great API design that everyone must be familiar with.
For example, they have different representations of time there that are context-specific. Time is not always an instant and does not always require a time zone: when you specify opening hours of a shop, they are defined as a LocalTime.
ZonedDateTime is not the same as Instant in the context of calculations. Adding a day to an Instant always adds 24 hours. Adding a day to a ZonedDateTime in a time zone observing DST may result in adding 23 or 25 hours.
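The 23-hour case can be reproduced outside Java too. This is a sketch in Python, assuming the standard `zoneinfo` module has a tz database available, and using the 2022 spring-forward in New York (13 March) as the example; the `elapsed` helper is a made-up name.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

NEW_YORK = ZoneInfo("America/New_York")

def elapsed(a: datetime, b: datetime) -> timedelta:
    """Real elapsed time between two aware datetimes, measured in UTC."""
    return b.astimezone(timezone.utc) - a.astimezone(timezone.utc)

# Noon on the day before clocks spring forward.
start = datetime(2022, 3, 12, 12, 0, tzinfo=NEW_YORK)

# Zoned-style "add a day": same wall-clock time on the next calendar day.
next_day_wall = start + timedelta(days=1)

# Instant-style "add a day": exactly 24 hours later on the UTC timeline.
next_day_instant = (
    start.astimezone(timezone.utc) + timedelta(hours=24)
).astimezone(NEW_YORK)

wall_gap = elapsed(start, next_day_wall)        # 23 hours
instant_gap = elapsed(start, next_day_instant)  # 24 hours
```

Note that Python's `+ timedelta(days=1)` on an aware datetime is wall-clock arithmetic, so the elapsed time has to be measured after converting both ends to UTC.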
Time is not hard, it’s just you always need enough context to pick the right representation.
> I’d argue though, that without context, a date has the time 00:00 in your local time zone
I'd argue that without context a Date cannot have any time inferred. In the programming context it should be treated as a programming error and comparisons should fail to compile unless additional context is given.
A lot of the confusion here seems to stem from how we traditionally store dates (seconds/millis after a reference point in time), and over time we've confused the predominant way in which the data is represented with what the data actually is intended to mean.
let date1 = Date("06/05/2022")
let date2 = DateTime("06/05/2022 20:05 UTC")
if date2 > date1 {
    // SHOULD NOT COMPILE
}
let date3 = Date("06/05/2022").withTime("05:00 UTC")
if date2 > date3 {
    // DOES COMPILE
}
Injecting context automatically without the intentional action/declaration of the programmer is where a billion bugs are born.
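For what it's worth, Python's standard library already behaves roughly like the pseudocode above, except that the refusal happens at runtime rather than compile time. A small sketch (the two function names are invented for illustration):

```python
from datetime import date, datetime, time, timezone

d = date(2022, 6, 5)
dt = datetime(2022, 6, 5, 20, 5, tzinfo=timezone.utc)

def compare_implicitly() -> bool:
    # Ordering a datetime against a plain date raises TypeError:
    # the library refuses to guess which time the date stands for.
    return dt > d

def compare_explicitly() -> bool:
    # The programmer states the intent: "this date means 05:00 UTC that day".
    d_with_time = datetime.combine(d, time(5, 0), tzinfo=timezone.utc)
    return dt > d_with_time
```

Calling `compare_implicitly()` fails with a `TypeError`, while `compare_explicitly()` returns a normal boolean because both sides now carry the same kind of information.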
But that would have been fine, they know the UI uses dates only. Even if you store it as a timestamp you can pull out just the year, month, and day, then compare as normal.
These guys are comparing something that they know is a date only with something that needs a timestamp.
I am a little shocked that neither the blog post nor the twitter discussion nor any of the discussion here clearly identifies the missteps of this approach beyond alluding to a "category" or "type" error (which is correct, but not particularly informative.)
The issue here is around the semantics of the mathematical operators. It isn't even really about the types to which they apply; there are systems where `=` is well defined on heterogenous types.
The reason the answer to all these questions is a clear "no" is that they do not satisfy the core properties of the operators. For example, take equality. Equality in almost all mathematical constructs means that it supports substitution, is symmetric, transitive and reflexive. There are also well defined properties for the concepts of `greater than` and `less than`.
So, no, the OP's conclusion "Literally we’re all just making it up" is incorrect. You cannot use the operators `=`, `<`, and `>` between dates and times because they do not satisfy the core properties that define those operators. (I guess you could try to document an alternate definition of equality without the symmetric property in your documentation but... good luck with that not leading to massive confusion.)
Where you can just make it up is to define new operators as you actually want them to be. It's not `=`, it's `myDateTime=()` and then you're free to write the definition of that yourself. As long as you're consistent in the UI of how you present it (don't pretend it's vanilla `=` to the user!) you will at least be telling the truth. It may not solve all your problems but at least you won't be feeding any more to the fires of confusion, which you will as long as you keep pretending it's possible to make `=` mean something that's not reflexive.
> If I define a date MM/DD/YYYY as equal to the time MM/DD/YYYY 00:00 GMT
That's the exact problem - if you contrived a date into an int then yes, it is compatible with the =, >, and < operators.
The point of the blog post is to assert that that contrivance is not appropriate in nearly all use cases, despite how popular it seems to be. In the vast majority of use cases contriving MM/DD/YYYY into MM/DD/YYYY 00:00 GMT is not reasonable and can mask a vast amount of unintended behavior.
For example, a person in California who enters a date of 06/05/2022 in reference to their local time zone will suddenly wind up the day before - because 06/05/2022 00:00 GMT is actually 06/04/2022 17:00 PDT.
If the developer wants to contrive the date into a "midnight instant", they are welcome to do so and there should be plenty of convenience functions to allow them to do that, but implicitly performing that contrivance is dangerous.
Well it depends on if you have a type system that distinguishes between "date" types and "instant" types.
If you only have instants, then sure, you could do what you're saying, but the context in which the question was posed seems to imply that "dates" and "times" are separate.
And if you do have separate "date" and "instant" types, you lose the property of substitution: f(x) = f(y) if x = y for any arbitrary function f.
For example, non-mathematicians/non-programmers will specify a date range as 2022-01-01 to 2022-01-31.
To map this to instants you need a special mapping: the upper bound has to become 2022-02-01 00:00, exclusive. Anything else is probably wrong.
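A minimal sketch of that mapping, assuming the zone is known and passed in; `to_half_open` is a hypothetical helper name, not something from a library.

```python
from datetime import date, datetime, time, timedelta, timezone, tzinfo

def to_half_open(first_day: date, last_day: date, tz: tzinfo):
    """Map an inclusive date range to a half-open [start, end) pair of instants."""
    start = datetime.combine(first_day, time.min, tzinfo=tz)
    # The upper bound is midnight at the start of the day AFTER the last day.
    end = datetime.combine(last_day + timedelta(days=1), time.min, tzinfo=tz)
    return start, end

start, end = to_half_open(date(2022, 1, 1), date(2022, 1, 31), timezone.utc)

# The very last microsecond of January is still inside the range.
last_tick = datetime(2022, 1, 31, 23, 59, 59, 999999, tzinfo=timezone.utc)
inside = start <= last_tick < end
```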
I deal with this on a regular basis, and my approach is straightforward: I don't compare them. If I have a date on one side and a date-time on the other, then I am missing the data necessary to make a comparison. The question seems like a non-sequitur because the implied question is "how do you compare two variables when one of them does not have the data needed for comparison?"
The only question in a circumstance like this is whether my requirements allow me, or can reasonably be modified to allow me, to strip off the time and only compare dates.
If time is an inherent requirement to the project then the response I give is simple: "Then begin collecting time data."
If you want to ask "Does a given date contain a time?", you find the start and end of that date in the relevant timezone and check if the time value falls in that range.
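Spelled out as code, that check might look like the sketch below. `date_contains` is an invented name, and fixed UTC offsets stand in for real time zones to keep it self-contained.

```python
from datetime import date, datetime, time, timedelta, timezone, tzinfo

def date_contains(d: date, instant: datetime, tz: tzinfo) -> bool:
    """True if the aware datetime `instant` falls within date `d` in zone `tz`."""
    start = datetime.combine(d, time.min, tzinfo=tz)
    end = datetime.combine(d + timedelta(days=1), time.min, tzinfo=tz)
    return start <= instant < end  # half-open: the next midnight is excluded

instant = datetime(2022, 6, 5, 2, 0, tzinfo=timezone.utc)
utc_minus_7 = timezone(timedelta(hours=-7))

in_utc_day = date_contains(date(2022, 6, 5), instant, timezone.utc)
in_local_day = date_contains(date(2022, 6, 5), instant, utc_minus_7)
in_local_previous_day = date_contains(date(2022, 6, 4), instant, utc_minus_7)
```

The same instant is inside 5 June in UTC but inside 4 June at UTC-7, so the answer depends entirely on which zone is "relevant".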
I hate to be "that guy" but I think everyone is looking at this wrong.
It's not a programming issue, a type casting issue or a CS issue.
It's a UI issue. At the point you got the date and the datetime values, there was a human being (either a user, an admin or a programmer) sitting in front of the screen and they were asked a question in some way.
The precise way that question was presented affected their intent. It's their intent you are asking about here. So you need to think about what they were asked and how they were asked it. If that was consistent then you can make a reasonable call here. If it's inconsistent at different times and places then the meaning of that date is different and you can't solve this in any reasonable way.
Often it is first and foremost a type casting issue.
Whether you are comparing a date to a datetime or saving a date as a datetime, almost every language turns '2022-01-02' into something like '2022-01-02T00:00:00+0000' i.e. the first second of the day.
In practice it is usually safer to convert to '2022-01-02T12:00:00+0000', centering the time in the day and minimizing the chance that timezone related or other errors will move you into a whole new day.
The time component is always a lie, but noon is the safest lie.
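A quick way to see why noon is the safer lie, sketched with fixed UTC offsets. `local_date` is a made-up helper that stamps a date at a given UTC hour and reads back the calendar date a viewer at some offset would see.

```python
from datetime import date, datetime, time, timedelta, timezone

def local_date(d: date, hour_utc: int, viewer_offset_hours: int) -> date:
    """Stamp `d` at `hour_utc` UTC, then return the date seen at the viewer's offset."""
    stamped = datetime.combine(d, time(hour_utc), tzinfo=timezone.utc)
    viewer = timezone(timedelta(hours=viewer_offset_hours))
    return stamped.astimezone(viewer).date()

d = date(2022, 1, 2)

midnight_seen_from_utc_minus_8 = local_date(d, 0, -8)   # slips to the previous day
noon_seen_from_utc_minus_8 = local_date(d, 12, -8)      # stays on the same day
noon_seen_from_utc_plus_13 = local_date(d, 12, 13)      # still slips, to the next day
```

Noon only protects viewers whose offset is between UTC-12 and just under UTC+12; the handful of zones at UTC+12 and beyond still land on the next day, so it is a mitigation, not a fix.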
That seems to be completely counter to the point I was trying to make. It's nothing to do with type casting - you need to consider the intent of the human being who created that data. What did they mean?
My favourite is battling with people who use 23:59:59.
Where I work (banks and other financial institutions) there is frequently a need to check whether something happened within a particular day. Or maybe select records from the database for a day, or do some other kind of logic or filtering.
For some unknown reason, most people decide that best way to do this is to take start date as the beginning of the period and then add to it 23 hours, 59 minutes and 59 seconds and use that as the end of the period.
Explanations that they are missing a whole second do not seem to be working. People are absolutely convinced they are doing this correctly.
I thought about this for a long time and I arrived at an explanation.
It seems that some people use the time as a label for a span of time of unit length. March 21st is a label for an entire day. 12 pm is a label for an entire hour that starts at 12pm and lasts an hour, etc. 2021 is a whole year.
And in some contexts it makes sense. We say something happened on the 12th of January -- we use Jan 12th as a label for an entire day that we would otherwise have to denote with two timestamps. But in some contexts what we need is an exact point in time, a timestamp. And here is where a lot of people just don't think about / can't recognise the difference between timestamps and labels for a span of time.
If you use a wrong model then yes, the day starts with a second labeled 00:00:00 and ends with a second labeled 23:59:59.
Except that's not how most of the underlying software works. Most software in this case expects two exact timestamps to denote the two ends of the span of time. And 23:59:59 is just 1 second shy of the actual end of day, which means that, even if we are missing an entire second of the day, most of the time everything seems to work fine.
Unless you are a large bank and you have millions of transactions around the clock that have to be accumulated exactly. Then yes, it makes a lot of difference.
Another explanation is that people don't seem to be comfortable with the concept of selecting items from between 00:00:00 of one day and 00:00:00 of the next because they are seeing another date.
This is why experienced programmers will choose closed-open intervals by default -- they sidestep many of these pitfalls. For your example, you would do
... and date >= '2020-02-13' and date < '2020-02-14' -- notice: >= and <
Since this also applies to real numbers, maybe it is easier to keep in mind with an example in that realm. You would never check for a real number in a given range with:
... and X >= 3.0 and X <= 3.9 // or 3.99, or 3.999999
And the scales just fell from my eyes. I'm guilty of this in the system I develop at work. Whenever I want to show the user all widgets that were made on the date they selected, I will just do something like
DATE(widget_made) = :user_date
Which does what I want and avoids the problem you brought up. But sometimes I have some more complicated logic that falls down if I try to compare datewise. So I manually create the DateTime ranges, and use 23:59:59 as the end of the range. So something like
widget_made BETWEEN :user_date '00:00:00' AND :user_date '23:59:59'
Anything that came off the line at 23:59:59.3421 will be excluded. Fortunately for me, I don't think that has ever actually happened. But now I know to be on the lookout, and use proper date-handling tools to ensure correctness.
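The same failure is easy to reproduce outside SQL. Below is a sketch in Python with made-up function names and sample timestamps, comparing the closed 00:00:00 to 23:59:59 filter with a half-open one.

```python
from datetime import date, datetime, time, timedelta

def made_on_closed(stamps, day: date):
    """The BETWEEN '00:00:00' AND '23:59:59' approach: closed on both ends."""
    lo = datetime.combine(day, time(0, 0, 0))
    hi = datetime.combine(day, time(23, 59, 59))
    return [s for s in stamps if lo <= s <= hi]

def made_on_half_open(stamps, day: date):
    """The >= day AND < next day approach: closed start, open end."""
    lo = datetime.combine(day, time.min)
    hi = datetime.combine(day + timedelta(days=1), time.min)
    return [s for s in stamps if lo <= s < hi]

stamps = [
    datetime(2020, 2, 13, 0, 0, 0),
    datetime(2020, 2, 13, 23, 59, 59, 342100),  # inside the final second
    datetime(2020, 2, 14, 0, 0, 0),             # belongs to the next day
]

closed = made_on_closed(stamps, date(2020, 2, 13))
half_open = made_on_half_open(stamps, date(2020, 2, 13))
```

The closed filter silently drops the 23:59:59.3421 widget; the half-open filter keeps it and still excludes midnight of the 14th, regardless of how fine the timestamp resolution is.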
Right. The best way to think about it is that any implicit values which are required to univocally define a timestamptz should be considered as NULL. So 2022-02-12 should mean “some time during that day in some time zone”.
No/No/No definitively. Dates without times are ranges. If one person is born on 2022-06-01 12:00:00 and another person was born on 2022-06-01 13:00:00 then they have the same birthday 2022-06-01. It follows that if you know that 2 people are born on the same day such as 2022-06-01, then it is unknown if they were born at the same time. So adding a default time to a day (such as 00:00:00) is nonsensical.
> Dates without times are... Dates? Aren't we talking about two distinct types here?
I assume this is a response to specifically this part:
> Dates without times are ranges.
If we're talking conceptually about general date/time/datetime comparisons, not pegged to a specific programming language or type system, I agree with it: Dates are a 24-hour range.
If you consider that dates are ranges, they’re not 24h long but 50h (currently; this can change), since a date has no zone information: a given date starts at 00:00 in Kiribati (UTC+14:00) and ends at 24:00 on Baker Island (UTC-12:00).
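The 50-hour figure can be checked with a few lines. This sketch hard-codes UTC+14 and UTC-12 as the current extreme offsets (an assumption that, as noted, can change), and `global_extent` is an invented name.

```python
from datetime import date, datetime, time, timedelta, timezone

EARLIEST = timezone(timedelta(hours=14))   # first place the date begins
LATEST = timezone(timedelta(hours=-12))    # last place the date ends

def global_extent(d: date):
    """Span from the first moment `d` exists anywhere to the last moment it does."""
    first_start = datetime.combine(d, time.min, tzinfo=EARLIEST)
    last_end = datetime.combine(d + timedelta(days=1), time.min, tzinfo=LATEST)
    return first_start, last_end

start, end = global_extent(date(2022, 6, 5))
length = end - start  # aware datetimes in different zones subtract on the UTC timeline
```

That works out to 24 hours of the day itself plus the 26-hour spread between the two offsets.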
If you're treating dates as ranges, then even ignoring the timezones and daylight savings it still doesn't mean that a date implies a range from 00:00 to 24:00 - the mapping is inherently domain-specific and thus depends on the particular interpretation of that date field (thus it's nonsensical to expect a generic answer/comparison rule for "dates" i.e. all dates).
For example, in the domain of financial settlement, a future date of 2022-03-03 would imply that the event must happen by the end of the business day of 2022-03-03 (and thus an event at 23:00 of 2022-03-03 would be too late and would map to 2022-03-04 instead); and in a similar manner, the appropriate date for any event happening in the middle of a Sunday would be either Monday or Friday depending on what your rules are, as the effective date which you would be using to calculate the number of days between two events (e.g. for interest) in many jurisdictions has to be a business day; so two events timestamped five minutes apart might need to be treated as if they are on the same day, different days, or in some cases many days apart (e.g. with a combination of Christmas + weekend).
Different domains will have different rules; the generic concept of "date" is too vague to define a universal comparison operator and you have to look at the meaning of each specific date field/variable and expect different date fields/variables to need different semantics.
Wouldn't dates with times also be ranges? The ranges are just tighter. If something happened at 11:02AM, did it really happen at 11:02AM? Or did it really happen somewhere between 11:02.0AM and 11:02.99999AM?
Ultimately, this is the same problem as “is ‘04’ > 3”? That is to say, you can only properly compare two things of the same types, and you can either implicitly or explicitly cast.
The easiest option is that comparators should only work on the same data type, to avoid any ambiguity, leaving it up to the user to do explicit casting, and throwing errors if they don’t.
Of course, like integers and decimals, maybe it makes sense to have implicit casting, but it’s unintuitive to me if “‘2020-01-01’ > ‘2020-01-01 12:00’” should be cast to two dates, two timestamps, or two timestamptzs. Even if a language allows implicit casting, it’s probably an area where not doing it explicitly as an author is a smell.
Over time I’ve come to firmly believe that code should rarely take shortcuts for me. Make me explain what I intend and error if you ever find yourself having to make a guess.
Is today equal to lunchtime? No, but also yes. "Today's lunch happens today" is a true statement, but "Today happens at lunch" is nonsensical.
Is today between lunch and a few months from now? No, but also yes. Parts of today are, but the entire "today" isn't within that window.
Is today's lunch after today? No! It's during today, not tomorrow or some other future day. But also yes! Lunch happens after the calendar has flipped to today.
So we got Instants, Calendar Objects ("days", "weeks", etc), Durations, etc. Depending on context you could want some set operators to determine unions and intersections, or maybe you want simple numeric operators.
To me the main point of the tweet seems to be that either:
- what is assumed as "today" was a range, and the fact that the question is asked is a sign something went really wrong down the chain. So the question has no answer.
- it is assumed that "today" is a time. If the exact time value existed but was cut off at some point, we're again in a "no answer" situation. The only way this can be answered is if the date is implicitly a time at midnight.
To me too, the only proper answer is "no,no,yes" and any other context is a "what happened?"
My intuition is that a "date" is a bounded range of times 24 hours long. So we should speak about dates containing times, or dates subsetting, supersetting, or intersecting other time ranges.
I agree with you except on point 3. 4.5 is not beyond 4 because to me 4 represents the complete window of "4-5". 4.5 is neither before nor after it, it's within.
How about this: a date, or indeed even a time, is either of the shortest unit used in calculation at some moment (say by convention, the start of the date or time expressed perhaps down to nanoseconds); or it is a range, from the start of the date or time (expressed in terms of the shortest unit used in calculation, unless you want to fiddle with going down to Planck times to allow for future increase in precision) up to but not including the start of the next increment at the stated precision (for a time in seconds with a minimum resolution of nanoseconds, the billion nanoseconds starting at the beginning of the second).
So, an hour range starting at a half hour boundary might fall entirely within a day, entirely outside that day, or half in that day and half before or after it.
All of that neglects time zones, or assumes they're at most recorded for local convenience and/or historical purposes but converted for calculation into UTC or the like.
So, for instants, either they're equal or they're not; but for ranges, they could be equal, or a smaller range entirely within a larger, or partly overlapping, or disjoint. Pick whichever model (instant or range) works best, and be consistent thereafter, and at least the surprises shouldn't be too surprising.
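To make the range model concrete, here is a sketch that classifies two half-open ranges. The `relation` function and its labels are invented for illustration, and it assumes each range is non-empty with start before end.

```python
from datetime import datetime

def relation(a_start, a_end, b_start, b_end) -> str:
    """Classify half-open ranges [a_start, a_end) and [b_start, b_end)."""
    if (a_start, a_end) == (b_start, b_end):
        return "equal"
    if a_end <= b_start or b_end <= a_start:
        return "disjoint"
    if b_start <= a_start and a_end <= b_end:
        return "a within b"
    if a_start <= b_start and b_end <= a_end:
        return "b within a"
    return "overlapping"

# The day of 2022-06-05 as a range, ignoring time zones as the comment does.
day = (datetime(2022, 6, 5), datetime(2022, 6, 6))

midday_hour = (datetime(2022, 6, 5, 12, 30), datetime(2022, 6, 5, 13, 30))
late_hour = (datetime(2022, 6, 5, 23, 30), datetime(2022, 6, 6, 0, 30))
next_day_hour = (datetime(2022, 6, 6, 0, 30), datetime(2022, 6, 6, 1, 30))

results = [relation(*h, *day) for h in (midday_hour, late_hour, next_day_hour)]
```

The three hour-long ranges come out as within, overlapping and disjoint respectively, matching the half-hour-boundary cases described above.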
The specific question is strangely put. Why 12:00?
In the generic case where one has conflicting data types to work with, including all sorts of lost precision, the boring answer is that it is context dependent.
Direct user interfacing applications should follow the principle of least surprise, which for an interval search would be to always treat the value as inclusive when in doubt.
The root of most date/time problems stems from the fact that we use/have insufficient vocabulary to define what we actually want.
For example, "time" can be used as "time of day" or "duration". In my own work we have to measure "from this time, to that time" and the result should be a duration. [1]
We also use the phrase Time when we mean Timestamp (a date /time combination). And we use "local timezone" implicitly almost everywhere, where we should be storing everything as UTC and then displaying as desired. (this would make comparisons trivial.)
Daylight savings is an abomination since it means time is not contiguous, and the sooner it goes away the better. This is somewhat solved by using UTC when time-stamping.
Overall I prefer storing everything as UTC, then displaying that as preferred by each user (ie as their local time).
[1] should be, but is instead stored as a time field for historical design reasons.
This site finds the midpoint between date/times as well as the difference between them. Also, time units can be added or subtracted from one date/time to get another. All of this is done through the JavaScript Date object, which leaves no trace of the original time zone in arriving at the offset in milliseconds since 1970.
If doing the difference between dates in days, round the difference to the nearest day to avoid problems with daylight savings time not being in effect for one of them.
https://wjporter.com/misc/MidPoint.htm
A `date` is a time interval (from midnight to midnight in a given time zone). One could argue that `time` values in computer programs also cover a span, even if it is infinitesimally small for most purposes.
The correct operation to compare intervals of varying lengths is not equality, it is either containment or overlap.
One could argue that a date is a region in spacetime. It starts at the dateline, spreads eastward, and ends at the dateline some 48 hours later. (Edit: 50 hours is closer to the truth. And one could quibble about the definition of “dateline”, as Samoa moved across it once.)
Interesting take from 1999:
Erik Naggum: The Long, Painful History of Time
Isn't this just a matter of being clear of inclusive or exclusive ranges? Perhaps this is merely a provocation that the general perception of exclusivity or inclusivity rests in the casual programmer's perception (rather than the specification of which I confess ignorance)?
Or is the provocation in measuring the contour of a coast? Can it be measured more precisely (surely it can)? In which case is the provocation one of mere precision?
Or is the provocation one which asks how many fairies may fit on the head of a pin?
No. No. No. Dates are to Integers as Date+Time is to Floating Point. You (we) -want- to compare them, but you need something akin to a language level specification for these. Otherwise, it is all what you are trying to do/what problem you are solving. Excel (where I suspect 60% or more of this nonsense lives) at least has a clear definition (both are floats, so you can do the comparison). Otherwise, "Down the Wat-hole" for what to expect.
I never understood why programmers have such a hard time with time. There are two and only two pieces of information you need to work with time: seconds past the epoch and location. Everything else is applying legal and social context to the actual data.
Here both of those pieces of information are missing on all three questions, so no answers can be given. Any answer to these questions makes assumptions or deductions about those pieces of information.
I am with you but you missed a key issue of comparison: granularity.
Unix Epoch is by definition at granularity of seconds.
How do you compare two epoch values where one is in seconds and the other is in milliseconds? We sort of end up in the same comparison game.
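A small sketch of why it is the same game: the two obvious ways of lining up a seconds value with a milliseconds value give different answers. All names and the sample values here are illustrative.

```python
def cmp(x: int, y: int) -> int:
    """Three-way comparison: -1, 0 or 1."""
    return (x > y) - (x < y)

def compare_widen(sec: int, ms: int) -> int:
    """Treat the seconds value as the first millisecond of that second."""
    return cmp(sec * 1000, ms)

def compare_truncate(sec: int, ms: int) -> int:
    """Drop the sub-second part of the milliseconds value instead."""
    return cmp(sec, ms // 1000)

sec = 1654387200         # some second
ms = 1654387200500       # half a second into that same second

widened = compare_widen(sec, ms)       # seconds value sorts earlier
truncated = compare_truncate(sec, ms)  # the two compare as equal
```

Neither policy is wrong; like date versus datetime, the coarser value is really a range and the code has to pick what it stands for.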
> There are two and only two pieces of information you need to work with time: seconds past the epoch and location. Everything else is applying legal and social context to the actual data.
Date-of-birth?
Calendar-quarters?
Computer-local time? (No location, only the computer's UTC offset).
My (naive) answer would be to localize any DateTime to account for any changes in Date due to timezone, strip the time and compare the Dates. I'd reckon a Date means any time in that date and is of lower specificity than a DateTime, rather than midnight on that Date. Is there any obvious issue with this in a situation where you might be (inadvisably) comparing a Date to a DateTime?
I was quick to say that a date equals a datetime at 0 hours and 0 minutes, but after reading and thinking about it, I agree that the question is wrong, but a little rephrasing can probably help solve the problem at hand:
- Does this datetime equal this date? No, no it doesn't and never will.
- Does this datetime fall on this date? Yes, it might.
- Is this date, the date of this datetime? Yes, it might be.
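The rephrased questions translate fairly directly into code. A sketch in Python, where `falls_on` is an invented helper and the zone has to be supplied by the caller:

```python
from datetime import date, datetime, timedelta, timezone, tzinfo

def falls_on(dt: datetime, d: date, tz: tzinfo) -> bool:
    """Does this aware datetime fall on this date, as seen in zone `tz`?"""
    return dt.astimezone(tz).date() == d

dt = datetime(2022, 6, 5, 2, 0, tzinfo=timezone.utc)
d = date(2022, 6, 5)

equal = (dt == d)                       # a datetime never equals a plain date
on_date_utc = falls_on(dt, d, timezone.utc)
on_date_utc_minus_7 = falls_on(dt, d, timezone(timedelta(hours=-7)))
```

In Python the first comparison is simply `False`, while the "falls on" question gets a yes or no that depends on the zone, which is why it "might" rather than "does".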
First, are we talking about what _is_ or what _should be_?
Second, what is the business logic behind the choice of Date versus Time to represent data (a data point or a segment) in the timeline? And is that logic consistent with the original design, and therefore with how the data have been collected?
There is more, of course. Lots more, some touched upon in the article and comments.
If I were to design a new programming language then x > y or x == y when x is the float 1.2 and y the integer 1 would throw an exception. Mathematically that is wrong because obviously 1.2 > 1, but in my experience, trying to compare values of different types leads to much pain and many annoying bugs.
Without more context, they are incomparable. The author only says they came from a UI and a database - but what do they represent? And what UI did they come from (i.e. what information did the user have when choosing the date[time] components, and how were those components encoded into a date[time]?). How are they being used now? I suspect the author is withholding some of these details to make a more interesting conversation (or to describe a "general case", but unfortunately there is no such thing as a general case for date[time]s).
Right. In a general case of a library casting types, the only thing that makes sense is to set the time to 00:00:00.
In practical usage, you would probably want to slice off the times and only compare the dates, or use a fuzzier comparison, or just conclude that your input data is crap.
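The two options (widen the date to midnight, or slice the time off the datetime) give different answers for the same inputs. A small sketch with Python's standard library; `as_midnight` and `truncate` are illustrative names.

```python
from datetime import date, datetime, time


def as_midnight(d: date) -> datetime:
    """Widen a date to a datetime at 00:00:00 (the 'library cast' option)."""
    return datetime.combine(d, time.min)


def truncate(dt: datetime) -> date:
    """Drop the time component (the 'slice off the times' option)."""
    return dt.date()


d = date(2022, 1, 1)
dt = datetime(2022, 1, 1, 15, 30)

print(as_midnight(d) == dt)  # False: midnight is not 15:30
print(d == truncate(dt))     # True: same calendar day
```

Neither result is wrong. They answer different questions, which is why the direction of the cast has to be a deliberate choice.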
It depends on the implementation and definition of “datetime” in whatever you’re working with. Taking Postgres as an example:
date: the concept of a calendar date (Your birthday)
timestamp: the concept of a calendar datetime (You should get a New Year's kiss at 2022-01-01 00:00)
timestamptz: the concept of a precise moment in history (This comment was written at xxx time UTC)
You can design a system where timestamps/datetimes are considered to be precise moments in time, in UTC, but that's a matter of the implementation you're dealing with. Again, Postgres does not assume timestamps are in UTC (which has messed me up on more than one occasion).
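As a loose analogy only (this is Python, not Postgres, and the mapping is my assumption): a naive `datetime` behaves like a wall-clock reading, an aware `datetime` behaves like a precise moment, and the standard library refuses to order one against the other. `can_order` is a made-up helper.

```python
from datetime import datetime, timezone

# Naive: a wall-clock reading with no zone attached.
wall_clock = datetime(2022, 1, 1, 0, 0)
# Aware: a precise moment on the timeline.
moment = datetime(2022, 1, 1, 0, 0, tzinfo=timezone.utc)


def can_order(a: datetime, b: datetime) -> bool:
    """Report whether Python is willing to order the two values."""
    try:
        a < b
        return True
    except TypeError:
        return False


print(can_order(wall_clock, moment))  # False: naive vs aware cannot be ordered
print(wall_clock == moment)           # False: they are never considered equal
```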
I just had a long-standing bug come to the fore because I shifted a view to a materialized view, and embedded deep down in it was a date cast from a timestamp with time zone.
When running as a view, everything was done in the user's time zone (because it's set per connection). When it's a materialized view, it's the refresher's time zone.
This led to some inconsistencies, as the server is properly but inconveniently in UTC.
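The effect is easy to reproduce outside the database. A sketch in Python, where the two "session" zones are assumed fixed offsets and `date_in_session_zone` is a made-up name standing in for the cast to date:

```python
from datetime import date, datetime, timedelta, timezone, tzinfo


def date_in_session_zone(moment: datetime, session_tz: tzinfo) -> date:
    """Cast a precise moment to a date, as seen from one session's zone."""
    return moment.astimezone(session_tz).date()


stored = datetime(2022, 3, 1, 3, 0, tzinfo=timezone.utc)

user_session = timezone(timedelta(hours=-8))  # assumed user zone
refresher_session = timezone.utc              # server running in UTC

print(date_in_session_zone(stored, user_session))       # 2022-02-28
print(date_in_session_zone(stored, refresher_session))  # 2022-03-01
```

Same stored value, two different dates, depending only on who runs the cast.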
Not if you are working with timestamps in the future, no. Someone may want a meeting to happen when the wall clock displays 4pm at some day in the future, no matter what unknown time zone rule changes will be introduced in the meantime. In those cases you will need to store the local time and a timezone identifier (or geographical identifier).
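A minimal sketch of that storage choice. Everything here is hypothetical: the `FutureMeeting` class, the `Example/City` zone id, and the two offset tables standing in for "the rules as known today" and "the rules after a change". A real system would look up offsets in a timezone database.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class FutureMeeting:
    wall_time: datetime  # naive: what the wall clock should display
    zone_id: str         # identifier, resolved to an offset as late as possible


def resolve_utc(meeting: FutureMeeting, offsets: dict[str, timedelta]) -> datetime:
    """Turn the stored wall time into a UTC moment using the current rules."""
    local = meeting.wall_time.replace(tzinfo=timezone(offsets[meeting.zone_id]))
    return local.astimezone(timezone.utc)


meeting = FutureMeeting(datetime(2030, 6, 3, 16, 0), "Example/City")

rules_today = {"Example/City": timedelta(hours=2)}         # hypothetical
rules_after_change = {"Example/City": timedelta(hours=1)}  # hypothetical

print(resolve_utc(meeting, rules_today))         # 2030-06-03 14:00:00+00:00
print(resolve_utc(meeting, rules_after_change))  # 2030-06-03 15:00:00+00:00
```

Had the meeting been stored as 14:00 UTC up front, the rule change would have silently moved it to 3pm on the wall clock.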
It depends on whether the local time is meaningful.
Say you are recording logins from a user. A suspicious login may be outside of work hours in local time. Without knowing the local time, you cannot apply this rule.
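A sketch of that rule, assuming 09:00 to 17:00 working hours and a known fixed UTC offset per user. Both are assumptions, and the sketch ignores weekends and daylight saving.

```python
from datetime import datetime, time, timedelta, timezone

WORK_START = time(9, 0)   # assumed working hours
WORK_END = time(17, 0)


def is_outside_work_hours(login_utc: datetime, user_offset: timedelta) -> bool:
    """Flag a login whose local wall-clock time is outside working hours."""
    local = login_utc.astimezone(timezone(user_offset))
    return not (WORK_START <= local.time() < WORK_END)


login = datetime(2022, 1, 10, 14, 0, tzinfo=timezone.utc)

print(is_outside_work_hours(login, timedelta(hours=0)))  # False: 14:00 local
print(is_outside_work_hours(login, timedelta(hours=9)))  # True: 23:00 local
```

The UTC moment is identical in both calls. Only the stored offset makes the rule evaluable.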
This seems like a Kobayashi Maru (a.k.a. a famous no-win situation test for budding starship captains in the Star Trek universe), but this isn't a simulator, this is real life. You'll never be in this situation. You'll always have more information about the sources for these dates/times. And that may give you enough context to make a decision, maybe not. But trying to solve the problem in the general sense is impossible.
The article indicates the opposite: this is a real situation (it's explained why they are asking) and like Zach Holman states (somewhat exaggerated, true): unless your software has no users and is only one hour old, you'll find yourself in this situation.
No, the article doesn't reveal anything about where this data comes from, other than the vague "user input vs database" which is useless.
So their problem is real-world, and has a solution. We're supposed to solve the problem without the benefit of the context that they have, which makes it a contrived academic problem, which was my point.
Generally speaking you want context of what the date or datetime is being used for. If you want to know when Christmas takes place you will use a date: it does not take place on 2022-12-25 00:00:00 through 2022-12-25 23:59:59, because this would require a timezone and Christmas takes place at different UTC times around the globe. But you can reasonably say that Christmas takes place on 2022-12-25 and leave it at that, letting whatever program implements it figure out whether it is currently Christmas based on the information it has about time and timezone.