How to do arithmetic with dates and times
As a programmer, you quickly learn that there are two things you should never do yourself: cryptography and dealing with dates and times. The former because it is easy to mess up security. For the latter, you would have to consider leap years, leap seconds, daylight saving time, timezones, and a bunch of other exceptions. As Tom Scott put it: that way lies madness. Well, let's look into it anyway.
Absolute time
Unix time is a number that counts the seconds since 1970-01-01T00:00:00Z. I have no clue how "1970-01-01T00:00:00Z" is defined exactly or how we can measure seconds (something about atomic clocks?), but let's just assume that it works.
This gives us a solid base that we can build on. This number is the same, no matter where we are or which language we speak. Crucially, it allows us to measure durations as the difference between two points in time.
There is a small caveat: Unix time ignores leap seconds. We still get the same number everywhere, but durations that span a leap second come out one second too short. All libraries that I have looked into ignore this detail, which seems to be fine for most common use cases.
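As a minimal sketch in Python (the variable names are only illustrative), a duration is the difference between two Unix timestamps. The two instants below straddle the leap second that was inserted at the end of 2016, so the computed duration is one second even though two seconds passed on an atomic clock.

```python
from datetime import datetime, timezone

# Two points in time, expressed as Unix timestamps (seconds since 1970-01-01T00:00:00Z).
before = datetime(2016, 12, 31, 23, 59, 59, tzinfo=timezone.utc).timestamp()
after = datetime(2017, 1, 1, 0, 0, 0, tzinfo=timezone.utc).timestamp()

# A duration is simply the difference between two timestamps.
duration = after - before
print(duration)  # 1.0, the leap second 23:59:60 is not counted
```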
Absolute date
Similar to how Unix time counts the seconds since 1970-01-01, we can also count the days since 0001-01-01. I am not sure if there is a catchy name for that concept. Python's datetime calls it "ordinal", but that term is used differently in other places. Rust's chrono calls it num_days_from_ce.
Whatever we call it, this gives us an absolute measure for dates. Here again, we can measure distances between dates, with all the benefits that brings.
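A short sketch of the day count using Python's datetime module. Note that Python starts counting at 1 for 0001-01-01 and extends the Gregorian calendar backwards to that date; other libraries may pick a different starting value.

```python
from datetime import date

first_day = date(1, 1, 1).toordinal()   # 1 (not 0)
epoch = date(1970, 1, 1).toordinal()    # 719163

# Distances between dates are plain integer differences.
days_between = date(1971, 3, 4).toordinal() - epoch
print(days_between)  # 427

# The mapping also works in the other direction.
print(date.fromordinal(epoch))  # 1970-01-01
```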
Compared to time, dates have much less physical grounding. They are somewhat related to the rotation of the earth. But since the date changes at night, and night falls at different times in different places, the date is not the same everywhere. Still, a measure for absolute dates is useful for cultural reasons.
Nominal date and time
When we talk about date and time, we don't usually use huge integers. I have grown up with a system based on the Gregorian calendar that has 60-second minutes, 60-minute hours, 24-hour days, 7-day weeks, irregular months, and 12-month years. This system has very little physics and is mostly based on culture. So it has all of that messy human stuff.
For each absolute date, we can get the corresponding nominal date in a given system (don't ask me how, but I am sure these mappings are well defined). Similarly, we can map Unix time to nominal datetimes for a given system and timezone. For example, the timestamp 1725022800 can be converted to 2024-08-30 15:00 Europe/Berlin.
Timezones define an offset from UTC. However, that offset often changes (typically twice a year for daylight saving time). The list of all timezones and their offsets at different times is maintained in the IANA timezone database.
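The conversion from the example and the changing offset can be sketched with Python's zoneinfo module. This assumes that the IANA timezone data is available on the system (on some platforms it has to be installed separately via the tzdata package).

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # reads the IANA timezone database

berlin = ZoneInfo("Europe/Berlin")

# Map an absolute timestamp to a nominal datetime in a given timezone.
local = datetime.fromtimestamp(1725022800, tz=berlin)
print(local.isoformat())  # 2024-08-30T15:00:00+02:00

# The offset from UTC depends on the point in time.
winter = datetime(2024, 1, 15, 12, 0, tzinfo=berlin)
print(winter.utcoffset())  # 1:00:00
print(local.utcoffset())   # 2:00:00
```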
The mapping back from nominal dates and times to absolute ones is a bit more complicated. With daylight saving time, many timezones regularly skip or repeat hours. So an innocent-looking nominal value might have zero, one, or even two corresponding absolute values.
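Here is a rough sketch of how the zero, one, or two candidates could be enumerated in Python. The helper absolute_candidates is hypothetical, not part of any library. It relies on the fold attribute of datetime and checks whether a wall-clock time survives a round trip through UTC.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def absolute_candidates(naive, tz):
    """Hypothetical helper: all Unix timestamps that display as `naive` in `tz`."""
    found = set()
    for fold in (0, 1):
        aware = naive.replace(tzinfo=tz, fold=fold)
        # A skipped wall-clock time does not survive a round trip through UTC.
        round_trip = aware.astimezone(timezone.utc).astimezone(tz)
        if round_trip.replace(tzinfo=None) == naive:
            found.add(int(aware.timestamp()))
    return sorted(found)

berlin = ZoneInfo("Europe/Berlin")
skipped = absolute_candidates(datetime(2024, 3, 31, 2, 30), berlin)
regular = absolute_candidates(datetime(2024, 8, 30, 15, 0), berlin)
repeated = absolute_candidates(datetime(2024, 10, 27, 2, 30), berlin)

print(len(skipped))   # 0, this hour is skipped when the clocks go forward
print(regular)        # [1725022800]
print(len(repeated))  # 2, this hour happens twice when the clocks go back
```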
Nominal deltas
Now this is the part I had the most fun with, so I am going to go into some detail.
As I explained before, for absolute dates and times it is pretty simple to define a measure of duration. For nominal dates and times, this is much less straightforward. What does it mean to add a month to a date? What is the duration between two datetimes? Let's look at some examples:
Adding a delta
Say I want to add one month to 1970-01-30. This is an issue because 1970-02-30 does not exist. There are several options that come to mind:
- no API: simply do not offer any API that would allow arithmetic with such a squishy concept as "months"
- error: raise an exception if the resulting date does not exist
- overflow: normalize the result to 1970-03-02
- clip: clip the result to 1970-02-28
- count from end: since 1970-01-30 is the second-to-last day of January, one month later should be the second-to-last day of February, 1970-02-27
This is mostly an issue for months due to their irregularity. But similar issues can also happen in other places, e.g. when naively adding a day would fall into an hour that is skipped due to daylight saving time.
Most libraries I have seen clip the result. JavaScript's Date object uses overflow, and Python's datetime library does not provide an API for adding months.
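The options listed above (except "no API") can be sketched as a single Python function. The name add_months and its policy parameter are made up for illustration. No library mentioned here exposes exactly this interface.

```python
import calendar
from datetime import date, timedelta

def add_months(d, months, policy="clip"):
    """Hypothetical helper: add months to a date with a selectable policy."""
    # Find the target year and month, ignoring the day for now.
    year, month0 = divmod(d.year * 12 + d.month - 1 + months, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]

    if policy == "from_end":
        # Keep the distance to the end of the month instead of the day number.
        days_before_end = calendar.monthrange(d.year, d.month)[1] - d.day
        return date(year, month, max(1, last_day - days_before_end))
    if d.day <= last_day:
        return date(year, month, d.day)  # the naive result exists
    if policy == "error":
        raise ValueError(f"{year:04}-{month:02}-{d.day:02} does not exist")
    if policy == "overflow":
        return date(year, month, last_day) + timedelta(days=d.day - last_day)
    if policy == "clip":
        return date(year, month, last_day)
    raise ValueError(f"unknown policy: {policy}")

start = date(1970, 1, 30)
print(add_months(start, 1, "overflow"))  # 1970-03-02
print(add_months(start, 1, "clip"))      # 1970-02-28
print(add_months(start, 1, "from_end"))  # 1970-02-27
```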
Adding a mixed delta
How do you add one month and one day to 1970-04-30? Do you first add the month (1970-05-30) and then the day (1970-05-31)? Or do you first add the day (1970-05-01) and then the month (1970-06-01)?
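Both orders can be tried out in a few lines of Python. The helper add_months_clipped is a hypothetical stand-in for "add months, then clip to the end of the month".

```python
import calendar
from datetime import date, timedelta

def add_months_clipped(d, months):
    """Hypothetical helper: add months, clipping to the last day of the month."""
    year, month0 = divmod(d.year * 12 + d.month - 1 + months, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))

start = date(1970, 4, 30)
one_day = timedelta(days=1)

month_first = add_months_clipped(start, 1) + one_day
day_first = add_months_clipped(start + one_day, 1)

print(month_first)  # 1970-05-31
print(day_first)    # 1970-06-01
```

The operation is not commutative, so an API has to pick an order and document it.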
Comparing deltas
Which duration is longer, 2 months or 60 days? In most cases, 2 months will have 61 days, so they will be longer. But when February is involved, the two months can be as short as 59 days. Again, we have different options.
JavaScript's Temporal allows comparing these deltas, but only if you provide a reference point that both deltas can be added to (which brings us back to the previous issues). However, most libraries simply don't allow comparing deltas.
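The idea of comparing relative to a reference point can be sketched in Python. This is not Temporal's API. The functions compare_at and add_months_clipped are hypothetical, and the sketch assumes the clipping strategy for adding months.

```python
import calendar
from datetime import date, timedelta

def add_months_clipped(d, months):
    """Hypothetical helper: add months, clipping to the last day of the month."""
    year, month0 = divmod(d.year * 12 + d.month - 1 + months, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))

def compare_at(reference, months, days):
    """Return -1, 0 or 1 if `months` is shorter, equal or longer than `days`."""
    end_months = add_months_clipped(reference, months)
    end_days = reference + timedelta(days=days)
    return (end_months > end_days) - (end_months < end_days)

# The answer depends on where we start counting.
print(compare_at(date(1970, 1, 1), 2, 60))  # -1: January + February = 59 days
print(compare_at(date(1972, 1, 1), 2, 60))  # 0: leap year, 60 days
print(compare_at(date(1970, 7, 1), 2, 60))  # 1: July + August = 62 days
```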
Difference between dates/times
What is the difference between 1970-01-01 and 1971-03-04? Again we have options:
- no API: as before
- top-heavy: add as many years as possible without overshooting, then iterate with the next smaller unit until you have reached the target. In this case this results in 1 year, 2 months, and 3 days
- absolute values: convert the nominal values to absolute ones and use the absolute difference: 427 days
- explicit: let the user choose a unit. If the difference cannot be fully expressed in that unit, return either a fractional part (1.17 years) or truncate (1 year)
Python-dateutil's relativedelta uses the top-heavy approach. JavaScript's Temporal also uses that approach, but additionally lets you choose a largestUnit. Python's timedelta uses absolute values. moment.js uses the explicit approach with fractions. Rust's chrono mostly uses the absolute values approach, but also has a years_since() method.
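To make the three computing strategies concrete, here is a sketch in plain Python. The functions add_months_clipped and top_heavy_diff are hypothetical. The sketch assumes clipping when adding months, considers only years, months, and days, and uses a 365-day year for the fractional result. Real libraries may differ in all of these details.

```python
import calendar
from datetime import date

def add_months_clipped(d, months):
    """Hypothetical helper: add months, clipping to the last day of the month."""
    year, month0 = divmod(d.year * 12 + d.month - 1 + months, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))

def top_heavy_diff(start, end):
    """Greedy (years, months, days) difference; assumes start <= end."""
    total_months = (end.year - start.year) * 12 + (end.month - start.month)
    if add_months_clipped(start, total_months) > end:
        total_months -= 1  # the last month would overshoot the target
    remaining_days = (end - add_months_clipped(start, total_months)).days
    years, months = divmod(total_months, 12)
    return years, months, remaining_days

start, end = date(1970, 1, 1), date(1971, 3, 4)

print(top_heavy_diff(start, end))   # (1, 2, 3)

absolute_days = (end - start).days  # absolute values
print(absolute_days)                # 427

fractional_years = absolute_days / 365  # explicit unit, assuming 365-day years
print(round(fractional_years, 2))       # 1.17
print(int(fractional_years))            # 1 (truncated)
```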
Other calendars
I know that there are other systems than the one described above. As far as I understand, the Chinese calendar has eras and the Ethiopian calendar has 12 months, but also some days that do not belong to any month. These systems seem so different from each other that I think we should not try to shoehorn them into a single implementation.
Each system can have its own nominal data types, solving its own unique challenges. You can convert from one system to another by converting to and from absolute values.
Why is this relevant?
If you look at all of these ambiguities, it is easy to give up and just live without arithmetic for nominal dates and times.
But still. In our human conversations it is completely normal to talk about "two years ago" or "in two hours". We should be able to express these same ideas in software.
There is a silver lining: all of these ambiguities are only relevant in a few edge cases. The naive way to add a month does work in a lot of cases. For the rest, we need APIs that are clear, simple, and avoid unexpected results.
I am not sure whether any of the libraries I mentioned have already nailed that part. Some get close. Here is my personal wish list:
- Make a clear distinction between absolute and nominal values
- Make a clear distinction between dates and times
- Clearly document how ambiguities are resolved
- Do not assume that every day has 24 hours (most libraries get this wrong)
- For adding deltas, most libraries seem to go with the clipping approach. It might also be a good idea to offer different options to users.
- I don't really care about comparing nominal deltas. Instead, users can either compare absolute deltas or use a reference point.
- For differences between dates and times, Temporal's top-heavy approach with largestUnit looks pretty flexible. This or something like it is probably the way to go.
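As an illustration of the "not every day has 24 hours" item, here is a small Python sketch, again assuming that IANA timezone data is installed. Subtracting two aware datetimes that share the same tzinfo compares wall-clock values only, so the elapsed time has to be computed via UTC.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

berlin = ZoneInfo("Europe/Berlin")

# Midnight to midnight on the day the clocks go forward.
start = datetime(2024, 3, 31, tzinfo=berlin)
end = datetime(2024, 4, 1, tzinfo=berlin)

# Same tzinfo on both sides: Python ignores the offsets here.
wall_clock = end - start
print(wall_clock)  # 1 day, 0:00:00

# Converting to UTC first gives the time that actually elapsed.
elapsed = end.astimezone(timezone.utc) - start.astimezone(timezone.utc)
print(elapsed)  # 23:00:00
```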