Good Design is Imperfect Design, Part 1: Honest Names (domainlanguage.com)
292 points by dang on Aug 4, 2021 | 126 comments


The problem is not in the `plus` method, as it's working as expected when the unit is in days.

It's the `month` unit that is misleading, since `1 month` looks like a constant value but really isn't: its behaviour depends on the other operand, breaking associativity, or really any expectation about how addition of two constants works.

It would be less confusing if the API removed `month` as a period, and as a programmer you would have to write:

    periodLength = t.advanceMonths(1)
    t' = t.plus(periodLength days)
This way you can only `plus` in constant units (days), and `advanceMonths` is free to follow any heuristic and return anything depending on the value of `t` without breaking the expectations of `plus`, or you can have any number of functions with different definitions for "advance N months".
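
A minimal Python sketch of that idea, assuming `advance_months` uses the "same day, capped at month end" heuristic (the function name and the heuristic are illustrative choices, not part of any real library):

```python
import calendar
from datetime import date, timedelta


def advance_months(t: date, n: int) -> int:
    """Return the number of days from t to 'n months later' (hypothetical helper).

    Assumed heuristic: same day of the month, capped at the last day
    of the target month.
    """
    year, month0 = divmod(t.year * 12 + (t.month - 1) + n, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    target = date(year, month, min(t.day, last_day))
    return (target - t).days


t = date(2021, 1, 31)
period_length = advance_months(t, 1)       # a plain number of days
t2 = t + timedelta(days=period_length)     # "plus" only ever sees days
```

Here the addition itself stays ordinary day arithmetic; all the calendar-specific judgment lives in `advance_months`.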


As others have pointed out, `plus` has odd behaviors for days, hours, and minutes too (DST, leap seconds). That's because dates are weird.

Absolutely anyone who works with dates needs to understand this. There's no point in calling the method `plusWeird`, that's redundant.

The good news is that most people actually have an intuitive understanding of dates (because we have spent our lifetimes looking at calendars) and when you are doing exotic date math ("move this to the same date next month") the default behavior of `plus` is what you want.

I spent a lot of time working with Joda Time (and now java.time). I use date math functions on the regular, professionally. I found this API intuitive and ergonomic from the get-go, and well matched to the problem domain. What I see in this thread is a bunch of people with a relatively poor grasp of the problem domain trying to rethink the API.


I was going to post exactly this: the problem is "month", which is not a fixed value. "Year" has the same problem due to leap years. I suppose that leap seconds could pose problems too.

I've recommended that clients specify time periods in days or weeks and avoid "months" entirely (ie, 90 days or 26 weeks, but not 3 months) or to specify an end date explicitly.


None of day†‡, hour‡, or minute‡ are fixed duration, either, depending on use cases & circumstance.

An absolute delta type is good to have, but it's also useful to have a "relative delta" type (as my language calls it). Each has its use cases.

† ±1–2h on DST enter/exits

‡ ±1s during leap seconds


And of course if day isn’t a fixed duration, then no intuitive definition of week would be a fixed duration either.


The month itself is unnecessarily complex because long ago, someone decided to break a perfectly good calendar.

Ever wondered why the variable-length month is second in the year, instead of last? English names of months offer a hint: "September", "October", "November", "December" sound like the Latin "septem", "octo", "novem", "decem", i.e. 7, 8, 9 and 10. So as it turns out, February used to come last, as reason would demand, but then the calendar got rotated right by two months.

If we imagine the calendar as it was, with February being the last, then at least the mapping between "day of year" and "month, day of that month" would be a function of just "day of year", instead of being a function of both "day of year" and "what year is it".


Doesn't really change that adding n months to a date is an ambiguous operation. You would need to give all months the same length to avoid the ambiguity, but then months would not add up to a year, which has its own set of problems.


As far as I can tell, you agree with the article's main point. `plus` is an honest name for adding days to a date but dishonest when it comes to adding months. `advanceMonths` is an awkward but more honest name for adding months. An even more honest name would be `addMonthSameDayButCapIfShorter()` or something like that, which reflects the gnarly behavior. A weird operation deserves a weird name.

I don't like `advanceMonths` returning an interval though, since this indicates that the interval is a meaningful unit separately from the initial date. It should return the new date, not an interval.


This is just hindsight, but I'd make two kinds of time period: an "exact" period and a "flexible" period. Define these as two different types, which can be converted to each other but only explicitly. Functions like plus() can take both types (via method overload), but the user must be aware that their behavior is different.
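
Roughly what I have in mind, as a Python sketch. `ExactPeriod`, `FlexiblePeriod`, `to_exact` and `plus` are made-up names, and the capping rule for months is an assumption:

```python
import calendar
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class ExactPeriod:
    """A fixed number of days (hypothetical type)."""
    days: int


@dataclass(frozen=True)
class FlexiblePeriod:
    """A number of calendar months; has no length until applied to a date."""
    months: int

    def to_exact(self, start: date) -> ExactPeriod:
        # Explicit conversion: only possible relative to a start date.
        return ExactPeriod((plus(start, self) - start).days)


def plus(d: date, period) -> date:
    """Accepts both period types, but they behave differently."""
    if isinstance(period, ExactPeriod):
        return d + timedelta(days=period.days)
    if isinstance(period, FlexiblePeriod):
        year, month0 = divmod(d.year * 12 + (d.month - 1) + period.months, 12)
        last_day = calendar.monthrange(year, month0 + 1)[1]
        # Assumed rule: keep the day of month, capped at the month's end.
        return date(year, month0 + 1, min(d.day, last_day))
    raise TypeError(f"unsupported period type: {type(period).__name__}")


start = date(2021, 1, 31)
by_days = plus(start, ExactPeriod(30))        # plain day arithmetic
by_month = plus(start, FlexiblePeriod(1))     # calendar arithmetic
```

The type at the call site tells the reader which kind of arithmetic is in play.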


I think it's possible to be both honest and readable.

One way to look at the issue is that the confusion stems from giving the same name to operators of different types, namely the "Instant plus Period" operator and the "Period plus Period" operator.

Period could be implemented as a vector with independent components for days, months, and so on. (I don't know if that's how JodaTime does it, but that's what I would do if I wanted it to have a "plus" operation.) Then it could have a well-defined operator appropriately named "plus", which is both commutative and associative as one would expect.

"Instant plus Period", however, is asymmetric, and cannot satisfy the identities we associate with "plus". So let's give it a name that is also asymmetric. How about "advanceBy"?

2021-01-30.advanceBy(1 month) = 2021-02-28

2021-01-30.advanceBy(1 month).advanceBy(1 month) = 2021-03-28

2021-01-30.advanceBy(1 month plus 1 month) = 2021-03-30

2021-01-30.advanceBy(2 months) = 2021-03-30

That seems much less mysterious to me. Sure, a casual reader wouldn't be immediately confident about what advanceBy returns in all cases, but giving it a name that conveys its asymmetry helps a lot.
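
Here is a small Python sketch of that split, to make the four lines above concrete. `Period` and `advance_by` are hypothetical names, and applying months before days with end-of-month capping is my assumption:

```python
import calendar
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Period:
    """A vector with independent month and day components (hypothetical type)."""
    months: int = 0
    days: int = 0

    def plus(self, other: "Period") -> "Period":
        # Component-wise sum: commutative and associative.
        return Period(self.months + other.months, self.days + other.days)


def advance_by(d: date, p: Period) -> date:
    """Asymmetric 'Instant plus Period'.

    Assumption: months are applied first (capped at month end), then days.
    """
    year, month0 = divmod(d.year * 12 + (d.month - 1) + p.months, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    shifted = date(year, month0 + 1, min(d.day, last_day))
    return shifted + timedelta(days=p.days)


start = date(2021, 1, 30)
one_month = Period(months=1)

once = advance_by(start, one_month)
twice = advance_by(once, one_month)
summed = advance_by(start, one_month.plus(one_month))
```

Only `Period.plus` needs to obey the usual laws of addition; `advance_by` makes no such promise.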


Okay, I almost wish I hadn't read this comment because it is so similar to what I have in part 2! So please don't be annoyed when you see it in a couple of weeks ;-)

I actually separated it into a separate part because it undermines my primary point. Sure, we all love it when we have a better decomposition, better names, and everything falls into place. But it doesn't always. Not in the time we have. So then we need ways to deal with the flaws. I decided that if I had ended with this it would have communicated: Aw, just keep trying and you'll get something nice! That can be very risky.


Heh. You didn’t have a perfect example but you went with what you had and shipped.


The only thing advanceBy has going for it is that when the programmer inevitably tries to type add (after wondering why plus turns up nothing), the IDE _might_ show it in a dropdown.


Ok, but that’s when you read the docs. Also, in many languages, those expected methods could be mocked out to raise compiler errors, redirecting the developer to the correct methods.


I’d argue it should be shiftBy rather than advanceBy to deal with negative periods, but yes. The issue here is the conflation of summing periods (commutative and associative) with shifting a date by a certain period.

None of the proposed names in the post actually clarify anything - they merely make the behavior seem more confusing.


IMO, it should be advanceBy, if going backwards takes a negative interval. With "shift by 10d" it isn't completely clear in which direction it would shift.


Being honest with naming things is also a great roundabout way to ensure you write maintainable, readable code. If the name is honest and it feels awkward, it's a good red flag that there might be a problem with the approach you're taking. I think code golf languages (à la [0]) are a good example of this approach as well: when your language is as terse as possible, giving very deep consideration to what the language actually does is crucial.

[0] https://github.com/DennisMitchell/jellylanguage/wiki/Quicks


I think I missed something here, does this article really suggest that plusExcuseExcuse() would be better? The domain is dates, we all know months have different numbers of days so _something_ must be done and this is explicit from the domain.

Changing PI to PI_ISH because numbers in a language are limited does not make code more readable.

Almost everything in a computer is an imperfect model.

i++

is not improved as

incrementUnlessItOverflows(i)

You are not going to change how increment works so avoid making it awkward.

You can't change the fact that the number of days in a month varies.

I find this as annoying as "mistakes programmers make about time" articles. The reality is programmers understand that all time in all computers is just a model to be played with.

Recently I saw that people are fretting because there may _have to be_ a negative leap second. Leap seconds are a man-made concept. You don't _have to_ have a negative leap second; you just have to accept that the earth's spin is not constant. UTC is a model, and the model is not worse if it's out by >1 second: it was never perfect, it exists to be convenient. UTC ignores being out by >1 millisecond, so why is there a problem if the denomination is 1000ms?

The world keeps turning and the model would be simpler if _all_ leap seconds were ignored. It will take a while until midday over Greenwich is affected, and it will not matter when it is. It was an arbitrary location in the first place.

It seems to me JodaTime is correct: if you want to add 30 days, use plus(30 days); if you accept that the month model is imperfect, use plus(1 month) and see what you get.

++ is concise and precise, the domain is a programming language.


I don't really see how either the article or my comment suggests anything of that kind; if anything, the opposite. Feels like you're fighting windmills.


Without quibbling about the particular (adding a month) example, I can vouch that this is a great way to help the people for whom the software is being built refine their thinking. I help clients often with inherited/legacy/hacked-together code, often written by non-expert programmers for "odd" platforms (think: Excel VBA macros, which use the spreadsheet itself as the sole data structure). Identifying the awkward implicit concepts in this code, and giving them corresponding awkward names, can help the client recognize places where their tool doesn't map onto their domain knowledge the way they thought it did. As a recent example: "run of same-named tasks when ordered by date, then name" isn't exactly the same concept as "block of time dedicated to a specific task"!


The problem is not the "plus" operator but instead lies in the "month" duration.

Not all months last the same number of days, and while we are at it not all days have 24 hours!

I'd rather move the explicit (and complicated) choice of what kind of month you are talking about (and thus also the relevant discussion about naming) into the operator that constructs a time interval.


This is definitely the sort of idea that I like when exploring and trying to find a better model. The point I was trying to make in the article is that there are times when, however you try, you don't find a satisfying answer in the time you have.


Your point was good. But perhaps your unfortunate example is in turn a good example that if you can't find a good name for something perhaps there is a different way to look at things that produces a less contrived (but still explicit and honest) naming scheme.

Looking forward to reading about your explorations on that front.


That's not really always possible/practical. It's pretty common to say "in three months from now" in which case defining a month to be 31, 30, 29, or 28 days is not correct. The plain English meaning is to be on the same day of the month, but add 3 to the month value, probably rounding down if necessary. Therefore, 3 months from January 31st 2021 would be April 30th despite going through February which has 28 days.


I think the point of the person you're replying to is to move the ambiguity onto the month object and off the plus operator. So today.plus(1 monthish) rather than today.plusish(1 month).


It may be common and still completely ambiguous what “3 months from now” means — ask people around and you’ll get different answers.

- Same calendar day plus 3 calendar months, rounded down if past the end of the month

- Just plus 3 calendar months, carrying the remaining days over if past the end of the month

- 90 days from now
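
To make the three readings concrete, here is a Python sketch; the function names are invented for illustration and the start date is just one where all three disagree:

```python
import calendar
from datetime import date, timedelta


def _shift_month(d: date, n: int) -> tuple:
    """Return (year, month) n calendar months after d's month."""
    year, month0 = divmod(d.year * 12 + (d.month - 1) + n, 12)
    return year, month0 + 1


def same_day_round_down(d: date, n: int) -> date:
    """Reading 1: same day of month, rounded down to the month's last day."""
    year, month = _shift_month(d, n)
    return date(year, month, min(d.day, calendar.monthrange(year, month)[1]))


def same_day_carry_over(d: date, n: int) -> date:
    """Reading 2: same day of month, excess days spill into the next month."""
    year, month = _shift_month(d, n)
    return date(year, month, 1) + timedelta(days=d.day - 1)


def ninety_days(d: date) -> date:
    """Reading 3: a month is just 30 days."""
    return d + timedelta(days=90)


start = date(2021, 8, 31)
answers = (
    same_day_round_down(start, 3),
    same_day_carry_over(start, 3),
    ninety_days(start),
)
```

For most start dates the first two readings agree, which is part of why the ambiguity goes unnoticed.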


But this breaks down when the destination day does not exist in the target month. For example, what is 3 months from Nov 30th? Should it be Feb 28th (or, if it's a leap year, should it have been 29th?), or should it be Mar 2nd?


I came across this problem once before and came to the same conclusion. “A month” is an ill-defined unit, so I just don’t allow it to be representable in datetime arithmetic. You make the point that “1 day” would also fall under this category, but it’s less variable and so it being equal to “24 hours” is good enough.


One can argue that when something works most of the time and only breaks in some edge cases (twice a year), it's actually more dangerous, since you're unlikely to encounter those issues until they hit all of your customers at once.


I disagree about "plus" in the JodaTime example. The month addition with corner cases did exactly what I expected because the whole library has been polished to "do what a normal human would do, most of the time". A normal human would not suspect Jan + 1 month, is any month other than Feb. However I suspect, not even reading the documentation, if it were fed 30 or 31 days, rather than '1 month', it would also do exactly that, and mostly give dates in March.


What is 2021-02-28 + 1 month? Is that 2021-03-28 or 2021-03-31? It is far from obvious what a normal human would do most of the time here.

The lack of associativity is still a problem. If 2021-02-28 + 1 month = 2021-03-28, then (2021-02-28 + 1 month) + 1 month = 2021-03-28 + 1 month = 2021-04-28. While if I ask what 2021-02-28 + 2 months is (given 2021-02-28 + 1 month = 2021-03-31), most people would say 2021-04-30.

While I am not entirely sold on using awkward/"honest" names by default, the author does raise a good point: sometimes concepts are inherently messy or full of important edge cases, and we shouldn't just brush that aside.

I recently ran into a similar problem. I am implementing a distributed lock (https://www.joyfulbikeshedding.com/blog/2021-05-19-robust-di...) -- like a mutex, but works across processes and machines. I try to mimic the language's standard Mutex API as much as possible.

A normal Mutex has a query method named "owned?" to check whether the calling thread owns the mutex. When I tried implementing this method for my distributed lock, it raised a question: owned according to who? Owned according to the local state that represents the lock, or according to the state that lives in the server? Because they can differ (e.g. due to bugs in other clients or because an admin manually messed with the state). So I opted for "honest names" here too and implemented two methods: "owned_according_to_local_state?" and "owned_according_to_server_state?"


The problem is that "What is one month after 2021-02-28?" doesn't have a single meaning in plain human language.

It could mean 30 days in the future. Or the same day of the week 4 weeks in the future (i.e., 28 days in the future). Or the day of the same cardinality in next month. Or any date in the next month. And those are all equally correct.

It's simply not a precise measure of time when spoken from one human to another human in plain language. Indeed, I think we inherently understand it to be an imprecise measure of time just as much as "tomorrow" doesn't mean "exactly 86,400 seconds from this moment". "Next week" doesn't necessarily mean "7 days from now", either. That's why computers don't typically use imprecise terms. They provide feedback and say "this will occur at this time and date".

You always have to check what the operators actually do and what the requirements actually mean when you're working with times and dates.


Plus/Increment/Add 1 month, does have one single meaning, it's that the outcome may be invalid that is the issue.

"2021-01-31".plus(unit="Month", size=1) => "2021-02-31"

But nobody really wants that, because it's not a valid date. So implicitly the library is deciding to return a valid date.

A library could be written to just provide invalid dates, and let the end user handle any errors. That library could also include an explicit validation method that takes a date and returns a valid one.

"2021-02-31".coerceToValid() => "2021-02-28" // Overflow == Max

"2021-02-31".coerceToValid(asDays=True) => "2021-03-03" // Overflow Carries (to the right)

In fact, a library that provides an ignorant response and no contract on validity would hold to the associative property; it just wouldn't be as ergonomic.
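
A rough Python sketch of such a library, where `RawDate`, `add_months` and `coerce_to_valid` are hypothetical names mirroring the pseudocode above (it assumes the day is at least 1):

```python
import calendar
from datetime import date, timedelta
from typing import NamedTuple


class RawDate(NamedTuple):
    """Year/month/day triple that is allowed to be invalid, e.g. Feb 31."""
    year: int
    month: int
    day: int


def add_months(d: RawDate, n: int) -> RawDate:
    """Increment only the month field; the day is never touched."""
    year, month0 = divmod(d.year * 12 + (d.month - 1) + n, 12)
    return RawDate(year, month0 + 1, d.day)


def coerce_to_valid(d: RawDate, as_days: bool = False) -> date:
    """Turn a possibly invalid RawDate into a real date.

    as_days=False clamps an overflowing day to the month's last day;
    as_days=True carries the excess days into the following month.
    """
    last_day = calendar.monthrange(d.year, d.month)[1]
    if d.day <= last_day:
        return date(d.year, d.month, d.day)
    end_of_month = date(d.year, d.month, last_day)
    if as_days:
        return end_of_month + timedelta(days=d.day - last_day)
    return end_of_month


jan31 = RawDate(2021, 1, 31)
feb31 = add_months(jan31, 1)                       # invalid, and that's fine
clamped = coerce_to_valid(feb31)
carried = coerce_to_valid(feb31, as_days=True)
```

Because `add_months` never looks at the day, adding one month twice is the same as adding two months once.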


FWIW, I believe there are only two ways to build this library correctly: the way you described here (where intermediate date values are allowed to be invalid) and simply making "one month after 2021-01-31" throw an exception.


I agree those are the technically correct way to do it.

But if the reason people want to use the library is that they want something to handle the complexity for them, coercing the data is good for simplification.

There is actually another option, that provides idempotent/associative consistency and implicit coercion to valid values.

With this option, which discards some use cases (days beyond 28, when manipulating months), you coerce all day values 29..31 to 28. This isn't as technically correct as the original option, but it removes the inconsistency and holds to the simplification contract with users.
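
As a quick Python sketch (the name `add_months_capped_28` is invented here):

```python
from datetime import date


def add_months_capped_28(d: date, n: int) -> date:
    """Add n months, coercing any day of month above 28 down to 28."""
    year, month0 = divmod(d.year * 12 + (d.month - 1) + n, 12)
    # Day 28 exists in every month, so the result is always valid.
    return date(year, month0 + 1, min(d.day, 28))


jan31 = date(2021, 1, 31)
stepwise = add_months_capped_28(add_months_capped_28(jan31, 1), 1)
at_once = add_months_capped_28(jan31, 2)
```

The price is that month-end dates can no longer be represented in month arithmetic at all.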


> Plus/Increment/Add 1 month, does have one single meaning

No, it doesn't.

You walk in to the doctor's office on February 28. At the desk, you see another patient about to leave. They turn to the desk attendant and say, "I'll see you in a month for my follow-up."

What date is the other patient's next appointment? What if the date you walked in had been January 31?

Also, for what it's worth, in C#:

  DateTime x = new DateTime(2021, 1, 31);
  x.AddMonths(1); // Feb 28
  x.AddMonths(2); // March 31
  x.AddMonths(1).AddMonths(1); // March 28


I explained it pretty clearly what you get when you add a month on January 31st. You get February 31st.

A month is a discrete unit of measure. It is not decomposable into any number of days.

When you increment a month, you get YYYY - (MM+1). Any higher significance is maintained, but irrelevant to the operation. (This applies to the hypothetical statement in the doctor's office: the specific day is indeterminate, but can be assumed to be the same as the current day, next month.)

The fact that not all possible days exist is orthogonal to the singular meaning of the operation. It's obviously not greatly valuable to an end-user, but the method of addressing the ambiguity involves a second operation that ensures validity.

End-users want a method that does both the addition and coercion, but you can create consistency if you follow the simple path I laid out in GP.

I'll use your syntax but with the strictly correct definition of the operation.

  DateTime x = new DateTime(2021, 1, 31);
  x.AddMonths(1); // DateTime(2021, 2, 31)
  x.AddMonths(2); // DateTime(2021, 3, 31)
  x.AddMonths(1).AddMonths(1); //DateTime(2021, 3, 31)

  // Ensure Valid, using a coerce to clamp overflows
  DateTime(2021, 2, 31).EnsureValid(); // Feb 28
  DateTime(2021, 3, 31).EnsureValid(); // Mar 31
  DateTime(2021, 4, 31).EnsureValid(); // Apr 30


> The problem is that "What is one month after 2021-02-28?" doesn't have a single meaning in plain human language.

Thank you. That's exactly it. Crazy to see how many developers don't seem to grasp it here. I guess the "trap" is that we got used to dates and times long before we became developers, so we have to actually relearn this stuff to get the idea.


I think if you try and confuse folks with this, that is easy to do. But, this isn't special. What happens if you add a meter to a kilometer? For most measurements, you get a kilometer. We teach this with significant digits, but then typically ignore it and assume all sorts of conventions that are not spelled out.

If it is important in the domain you are working in, take extra care to understand the maths that you are built on. And don't be surprised to find special cases everywhere.


You seem to be stuck on: 'what is the value of 1 month' and thinking in terms of days, because that appears to be the precision of the left value.

Programmers exist in a world where things such as leap seconds matter. Normally if you have a timestamp that is just before a leap second, then add exactly a day's worth of seconds, you'd slide back a little in time. This might matter in another context, such as defining the limits of neighboring ranges properly. Also, who's to say the underlying precision is a second?

The intent of the library in question is to behave the way most people would. With imperfect buckets and idealized answers; yet also precision where someone makes the attempt to be specific.

None of the examples in the article use a more vague syntax, such as "0 days before the end of the month". They start with what a human might, a rounded but full date; then apply an interval. So a more clear contrived example might be.

  Jan 31 .plus(1 months) => Feb 28
  Jan 31 .plus(2 months) => Mar 31
  Jan 31 .plus(3 months) => Apr 30
  Jan 31 .plus(4 months) => May 31
  Jan 1 .plus(1 months) => Feb 1
  Jan 1 .plus(2 months) => Mar 1
  Jan 1 .plus(3 months) => Apr 1
  Jan 1 .plus(4 months) => May 1
Note how in the second half there are still variably sized months, but the result is what a human would want.


You’ve ignored the critique and answered only the more obvious cases.

So I’ll repeat:

What’s Feb 28th + 1 month?

Does the human expect the last day of March? Or the 28th day of March?

The API doesn’t make that clear - I think you could reasonably argue for either.


If you're adding 1 month, you're working on the month's location, e.g. 2021 - (02) - 28.

When you increment the month, the result would be 2021-03-28.

The only time you'd modify the day, is if the day became invalid due to an overflow, during that increment. If, when, you overflow the days you'd set the value of days to the maximum in that month.

If I tell someone, I'll get to that in a month, they expect by this day in the next month, the next calendar page, not 30/31 days.


You're setting the same trap as those viral math problems people share on Facebook:

6÷2(1+2) = ?

We could argue about what the _right_ answer is to that equation, but I call it a trap because it's intentionally confusing and devoid of any context (or the ability to ask a follow up question). There isn't really a situation where you would see that equation and not know the intended way to interpret it... just like your question.

There are a couple of ways that context _could_ be provided though:

I'm writing an automated task that should run once per month, I don't necessarily care what day of the month it runs though since it just cleans up some temp files. If today happens to be Feb 28th, and I say run today, then every month after, I would expect it to run Feb 28th, March 28th, April 28th...

I'm writing an 'end of month' task that needs to run at the end of every month for some bookkeeping reason. If today happens to be Feb 28th, and I say run today, then every month after, I would expect it to run Feb 28th, March 31st, April 30th...

In both of these situations I would program accordingly. Computers don't understand context, that's the job of the human programming it.


The human expects March 28th. Ask any normal person and they’ll answer March 28th.


The article is about naming the method. The fact that dates are messy means a clean API is difficult/impossible.

I think plus() is a name that is good enough. I can't think of a better name that will help the user understand what will happen in the 2/28 + 1 month case. That's asking too much of a method name. That's what docs are for.


Perhaps .dateAfter(1 month) would be more appropriate? I sympathise with the author in finding the violation of associativity of “plus” a bit jarring.


I think that's the easier case: most people would expect Feb 28th + 1 month = Mar 28th. The trickier one is Jan 31 + 1 month, because there is no Feb 31st. I don't think there is a "correct" answer to that.


If I pay you Jan 31, Feb 28, March 31, April 30, etc., don't I pay you monthly? Shouldn't I be able to express this as a repeated addition of a month?

At any rate, these problems have been solved in finance, with proper date and schedule libraries.


No, you should be able to express it the way I mentioned in other examples: Base .plus( N months ) where N is whichever month after the reference you want.

.plus .plus .plus isn't correct because "x months" doesn't have a fixed size. You are NOT saying Base .plus(30 days), NOR are you saying Base .plus(4 weeks) ((which BTW, I'd expect to stay on the same weekday)). You're incrementing by an unstable value.
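
A Python sketch of the difference, assuming the usual "cap at the end of the month" rule (`add_months_clamped` is a hypothetical helper, not a library function):

```python
import calendar
from datetime import date


def add_months_clamped(d: date, n: int) -> date:
    """Same day of month n months later, capped at the month's last day."""
    year, month0 = divmod(d.year * 12 + (d.month - 1) + n, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))


base = date(2021, 1, 31)

# Base .plus(N months): every date is computed from the same reference.
from_base = [add_months_clamped(base, n) for n in range(4)]

# .plus .plus .plus: each step starts from the previous, already-capped result.
chained = [base]
for _ in range(3):
    chained.append(add_months_clamped(chained[-1], 1))
```

Once the chained version hits February it has forgotten that it started on the 31st.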


This doesn't work either as it wouldn't work correctly if started in a month with less than 31 days. To represent "the last day of the month" you really need to be subtracting one day from the first day of the next month, full stop: any solution working off of these weird rounding rules is going to be unreliable.


> Shouldn't I be able to express this as a repeated addition of a month?

I don’t think so, no—more likely, it should be Pattern(Month)… like how would you express “second Tuesday” as a series of additions?


To express this in a sane way you want to express "the last day of the month". You can't claim it is "just keep adding to the hire date one month times the current month minus the hire month" as if I hire you in February that won't work. You really need to be calculating "take the current time, replace the day with 1, add one month, subtract one day" to get the next pay date.
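
In Python that recipe might look like the sketch below; `last_day_of_month` is a name I made up, and it works on dates rather than full timestamps:

```python
from datetime import date, timedelta


def last_day_of_month(d: date) -> date:
    """Replace the day with 1, add one month, subtract one day."""
    if d.month == 12:
        first_of_next = date(d.year + 1, 1, 1)
    else:
        first_of_next = date(d.year, d.month + 1, 1)
    return first_of_next - timedelta(days=1)


hired = date(2021, 2, 10)                 # hired in February
first_pay = last_day_of_month(hired)
second_pay = last_day_of_month(first_pay + timedelta(days=1))
```

No rounding rule is involved, so it does not matter which month you start in.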


I did cover that:

""" None of the examples in the article use a more vague syntax, such as "0 days before the end of the month". """

As another reply points out, it's incrementing the Month set of buckets. I'll also extend with other results I expect:

  2020-12-31 .plus(1 months) => 2021-01-31
  2020-12-31 .plus(2 months) => 2021-02-28
  2020-12-31 .plus(3 months) => 2021-03-31
  2020-12-31 .plus(4 months) => 2021-04-30
A normal human has several options, and truncating to stay within the month makes the most sense to the most people most of the time. It's perfectly reasonable to take that step when resolving the indicated date to a representable value.

I'll go further: JodaTime probably isn't focused on Precision Date Calculations; it behaves very much the way I expect someone working with forms and fields, general CRUD enterprisy software stuff, would want auto-filled dates to work.


Maybe I'm just being thick (it's usually the case), but for the life of me I still don't know what your answer to the parent's question is, and I can't tell how your quoted part is supposed to answer it.


My post two levels up included an alternate phrasing of the test case they specified:

""" None of the examples in the article use a more vague syntax, such as "0 days before the end of the month". """

---

They asked:

""" What’s Feb 28th + 1 month?

Does the human expect the last day of March? Or the 28th day of March? """

---

It's implicit, the human only expects the month to change, because the input isn't a descriptive phrase "the end of the month" adjusted or not, it's a literal date. That's why my other test cases show the same behavior for the end of the month.


It should be the 28th and there should be a ceiling method.


I still think plus is fine in this case. And really, my expectation is that anybody who thinks about adding a month knows their own intention. Because I suspect just adding the month is not what most people are actually doing, I guess in 90% of the cases most developers will add another step: rounding to the last (or first) day, or to the next same weekday, whatever.

The one thing nobody wants, is to add a month and land in the month one over.

So even if the semantics differ between libraries for adding a month, it actually doesn't matter that much. Because for all the other cases one could imagine, most people will add days or weeks, if staying on the same day matters.


I found the argument about lack of associativity convincing: it's very counterintuitive that date + 1 month + 1 month is different from date + 2 months


One way I suggested that we solve in a different domain was a small name change that really helped. To me "plus" is the _operator_ and "add" is the _operation_. So to me date.add(1 month).add(1 month) is actually different than date.add(2 month) because I can read that as taking two one-month steps which _may_ be different.


The unit of 'month' is not fixed. It doesn't even have a value until it's applied to a date.

(It is almost like a quantum state. It can be between 28 and 31 days, depending on what it's being applied to. But as soon as it's applied to an absolute date the ambiguity disappears).

If you expand out the shorthand to 2000-02-02 + (1 month forward from February) + (1 month forward from March), then we can see that expecting associativity is nonsensical.


If I tell someone on January 31st “call me a month from today” then we have a call on February 28th and I tell them “call me one month from today” I would expect them to call on March 28th.

In contrast if on January 31st I told them “call me two months from today” I’d expect them to call on March 31st.

It’s very intuitive.


Disagree.

Jan + 1 month = February, sure.

But, as soon as you add the day, it falls apart for me.

For most dates, if I add a month, in my mental model, the answer is the next month with the same date.

Jan 15 + 1 month = Feb 15, etc

But, at the edges, it gets odd quickly.

Jan 31 + 1 month = ??? Not sure, maybe Feb 28, maybe Feb 29, maybe Mar 2, maybe Mar 3. Depends on the year and who's asking me to solve the problem.

I would expect any reasonable software to fail gracefully when asked to solve this problem. And by fail gracefully, I mean ask for clarification. Or prevent me from asking silly questions in the first place.


The point of the article is that it doesn't have to fail gracefully.

Consider the API:

    data Date = Date { getYear :: Int, getMonth :: Int, getDay :: Int }
        deriving (Eq, Ord, Show)

    addDays :: Date -> Int -> Date
    addMonthsRounded :: Date -> Int -> Date
Someone who does

    d `addMonthsRounded` 3
immediately has a contextual clue that there might be something fishy going on, and has a string they can google to get to the docs to find out that this "Rounded" business is all about "hey, the code

    let y = (x `addMonthsRounded` 1) `addMonthsRounded` (-1)
      in y == x
might sometimes return False because it truncates if your day doesn't fit in the given month."
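
The same round trip as a Python sketch, assuming "Rounded" means capping at the end of the month (`add_months_rounded` here is an illustrative stand-in for the Haskell signature above):

```python
import calendar
from datetime import date


def add_months_rounded(d: date, n: int) -> date:
    """Add n months, truncating the day if it doesn't fit in the target month."""
    year, month0 = divmod(d.year * 12 + (d.month - 1) + n, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))


x = date(2021, 1, 31)
y = add_months_rounded(add_months_rounded(x, 1), -1)   # Jan 31 -> Feb 28 -> Jan 28

mid_month = date(2021, 1, 15)
round_trip = add_months_rounded(add_months_rounded(mid_month, 1), -1)
```

The round trip is lossless for mid-month dates and lossy at the end of long months.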


You expected 2021-03-31 + 1 month + 1 month to be different from 2021-03-31 + (1 month + 1 month)? I find this behavior understandable, but not apparent at first glance.

Also, FYI, this is not how GNU date works:

  $ date -d "Jan 28 next month" 
  Sun Feb 28 00:00:00 CST 2021
  $ date -d "Jan 29 next month" 
  Mon Mar 1 00:00:00 CST 2021
I could see confusion from this, depending on what libraries you are used to.


The inconsistency with GNU date is more visible from Feb:

  $ date -d "jan 31 next month"
  Wed Mar  3 00:00:00 PST 2021
  $ date -d "jan 30 next month"
  Tue Mar  2 00:00:00 PST 2021
  $ date -d "jan 1 next month"
  Mon Feb  1 00:00:00 PST 2021
  $ date -d "feb 28 next month"
  Sun Mar 28 00:00:00 PDT 2021
Edit: This was on my unconscious mind for a bit and I came up with an additional test case to confirm a suspicion I realized.

  $ date -d "2016-1-31 next month"
  Wed Mar  2 00:00:00 PST 2016
  $ date -d "2016-2-1 next month"
  Tue Mar  1 00:00:00 PST 2016
  $ date -d "2016-2-1 next year"
  Wed Feb  1 00:00:00 PST 2017
  $ date -d "2016-3-1 next year"
  Wed Mar  1 00:00:00 PST 2017
GNU date will add the duration of the CURRENT interval (ignoring already occurred deviations, like leap years) relative to the specified base date.

The oddity in behavior I observed above is adding the length of the month of Jan to dates in Jan. I suspect only a programmer would find that inference remotely correct.
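The outputs above are also consistent with a simpler reading: bump the month field by one and let an out-of-range day spill into the following month. A Python sketch under that assumption (the name `next_month_overflow` is made up for illustration, and this is not GNU date's actual code):

```python
from datetime import date, timedelta

def next_month_overflow(d: date) -> date:
    """Bump the month by one; a day past the end of that month spills over."""
    # d.year * 12 + d.month is the zero-based index of the following month.
    year, month0 = divmod(d.year * 12 + d.month, 12)
    first_of_next = date(year, month0 + 1, 1)
    return first_of_next + timedelta(days=d.day - 1)

for d in (date(2021, 1, 31), date(2021, 1, 30), date(2016, 1, 31)):
    print(d, "->", next_month_overflow(d))
```

For the dates tried in this thread, the sketch lands on the same days that `date -d` printed.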


I suspected this when I was playing with it, but good to have the confirmation. Thanks!


> A normal human would not suspect Jan + 1 month, is any month other than Feb.

Because humans intentionally reduce the precision of their computations to make them easier.

  Today = 2021-08-04

  Next year = 2022
  The exact month, day, hour are all unknown.

  Next month = 2021-09
  The exact day and hour are unknown.

  Tomorrow = 2021-08-05
  The hours and minutes are unknown.
If humans used the same level of precision as computers, they'd run into the same problems. Probable date of birth calculation is an example.


I disagree with the author on this one. While it isn't a perfect analogue to mathematical addition, neither is floating point math. The bigger issue is that calendar math is tricky and durations cannot always be converted between one another.

The correct solution in my opinion would be to have distinct types for this sort of thing to help clear up the ambiguity. Date and time are really tricky concepts to model though and it is a difficult balance between honesty/precision and intuitiveness for such APIs. One such clunkiness I've seen with KeepassXC is that it stores password expirations exclusively as timestamps. This is actually not correct for the common use case of passwords expiring on a particular day (since you don't know the time) because what if you are in a different timezone? This I think shows that you can't just convert dates into datetimes without problems.
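To illustrate the date-versus-timestamp problem, here is a small Python sketch. The "end of day" rule and the two fixed UTC offsets are assumptions chosen for the example, not anything KeepassXC does:

```python
from datetime import date, datetime, time, timedelta, timezone

expires_on = date(2021, 8, 4)  # "expires on this day": no time, no zone

def end_of_day(d: date, tz: timezone) -> datetime:
    """One possible reading of a bare date: the last second of that day in tz."""
    return datetime.combine(d, time(23, 59, 59), tzinfo=tz)

utc_plus_12 = timezone(timedelta(hours=12))   # illustrative fixed offsets
utc_minus_7 = timezone(timedelta(hours=-7))

a = end_of_day(expires_on, utc_plus_12)
p = end_of_day(expires_on, utc_minus_7)
print(p - a)  # 19:00:00
```

The same calendar date maps to instants 19 hours apart, which is the information a date-only type keeps honest and a timestamp has to invent.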


If I may elaborate, the comparison to IEEE floats is apt because addition is not associative there either:

    julia> (1e16 + 1) + 1
    1.0e16

    julia> 1e16 + (1 + 1)
    1.0000000000000002e16

I think the fundamental issue with dates, however, is deeper than "addition is non-associative". It is that "1 month" is a context-sensitive duration (so is "1 day", due to leap seconds).

I am curious about using intervals. If "{Year} January (no day)" had a representation as "Jan. 1 -- Jan. 31", then "Jan. 1 -- Jan. 31" + "1 month" = "{Year} Feb. 1 -- Feb. 28" (or 29 if {Year} is a leap year).
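A rough Python sketch of the interval idea. `MonthInterval`, `month_interval`, and `shift_interval` are hypothetical names invented for this example:

```python
import calendar
from datetime import date
from typing import NamedTuple

class MonthInterval(NamedTuple):
    first: date  # first day of the month
    last: date   # last day of the month

def month_interval(year: int, month: int) -> MonthInterval:
    last_day = calendar.monthrange(year, month)[1]
    return MonthInterval(date(year, month, 1), date(year, month, last_day))

def shift_interval(iv: MonthInterval, months: int) -> MonthInterval:
    """Move a whole-month interval by a number of months."""
    year, month0 = divmod(iv.first.year * 12 + (iv.first.month - 1) + months, 12)
    return month_interval(year, month0 + 1)

jan = month_interval(2021, 1)
print(shift_interval(jan, 1))
```

Because the interval carries no day-of-month to clamp, shifting by one month twice gives the same result as shifting by two.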


I use the expiration date in KeepassXC and here's my thought on it in particular: I want the KP date-time field to err on the side of simplicity (implementation and interface).

Seems to me best practice of renewing accounts is to always apply padding (TODO: renew Foo today because it expires in roughly n days); if you make as a minimum n >= 2 (or how about >=3 as a safety factor) then you don't have to consider how the account provider administers their cut-off (expires at 00:00 vs 23:59, is timezone a factor?, etc).

With this approach the difference between 'instantaneous time' and 'calendar time' is, for practical purposes, a moot point.

So the date I enter into the KeepassXC field has the padding included. The expiry date as described by the provider I can put in the note field for additional reference.


Date arithmetic seems to be a complicated matter.

Around 2.5 years ago I sent the following feedback to the Wolfram|Alpha Feedback Team and never heard back from them.

    Message: When I compute
    "2019-01-31 to 2016-04-04" I get "2 years 9 months 26 days"
    and when I compute the reversed input
    "2016-04-04 to 2019-01-31" I get "2 years 9 months 27 days"
    
    But when I compute
    "2019-01-31 to 2015-10-21" I get "3 years 3 months 10 days"
    and when I compute the reversed input
    "2015-10-21 to 2019-01-31" I get "3 years 3 months 10 days"
    
    Shouldn't the very first one ( "2019-01-31 to 2016-04-04" ) also return "2 years 9 months 27 days"?


Inverting the date range probably confused the logic involved in accounting for leap dates. Possibly it fails to account for swapping the order of the dates to achieve an always positive result?


It's not (only?) related to leap years. It appears to be related to the month of April and some other weird stuff.

    -- leap year 2012:
    2019-01-31 to 2012-01-30 --> 7 years 1 day
    2012-01-30 to 2019-01-31 --> 7 years 1 day

    2019-01-31 to 2012-02-29 --> 6 years 11 months         <---- Weird stuff
    2012-02-29 to 2019-01-31 --> 6 years 11 months 3 days  <---- Weird stuff

    2019-01-31 to 2012-02-30 --> 6 years 10 months 30 days (2012-02-30 does not exist)
    2012-02-30 to 2019-01-31 --> 6 years 10 months 30 days (2012-02-30 does not exist)

    2019-01-31 to 2012-03-30 --> 6 years 10 months 1 day
    2012-03-30 to 2019-01-31 --> 6 years 10 months 1 day

    2019-01-31 to 2012-04-30 --> 6 years 9 months          <---- April (any day in April)
    2012-04-30 to 2019-01-31 --> 6 years 9 months 1 day

    2019-01-31 to 2012-05-01 --> 6 years 8 months 30 days
    2012-05-01 to 2019-01-31 --> 6 years 8 months 30 days

    -- non-leap year 2013:
    2019-01-31 to 2013-04-30 --> 5 years 9 months          <---- April (any day in April)
    2013-04-30 to 2019-01-31 --> 5 years 9 months 1 day

I stumbled upon it while testing some JavaScript time and date frameworks and wanted to use Wolfram|Alpha because I was somewhat confused with the correct interval between two dates.
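One guess at where an asymmetry like this could come from: counting whole months forward from the earlier date versus backward from the later date, with the day clamped to the month length in both cases. The Python sketch below is only an illustration of that guess (all function names are invented here), not a description of what Wolfram|Alpha actually does:

```python
import calendar
from datetime import date

def add_months_clamped(d: date, n: int) -> date:
    year, month0 = divmod(d.year * 12 + (d.month - 1) + n, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))

def months_between(start: date, end: date) -> int:
    return (end.year - start.year) * 12 + (end.month - start.month)

def breakdown_from_start(start: date, end: date):
    """(years, months, days), stepping months forward from start. Assumes start <= end."""
    n = months_between(start, end)
    if add_months_clamped(start, n) > end:
        n -= 1
    return (n // 12, n % 12, (end - add_months_clamped(start, n)).days)

def breakdown_from_end(start: date, end: date):
    """(years, months, days), stepping months backward from end. Assumes start <= end."""
    n = months_between(start, end)
    if add_months_clamped(end, -n) < start:
        n -= 1
    return (n // 12, n % 12, (add_months_clamped(end, -n) - start).days)

end = date(2019, 1, 31)
for start in (date(2016, 4, 4), date(2015, 10, 21), date(2012, 4, 30)):
    print(start, breakdown_from_start(start, end), breakdown_from_end(start, end))
```

Under this guess the April cases differ by one day between the two directions while the October case agrees, matching the reports above. It gives 2 days rather than 3 for the forward 2012-02-29 case, so it is at best a partial explanation.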


Try your test cases in other languages, e.g. python3:

    import datetime
    datetime.datetime.strptime("2019-01-31", "%Y-%m-%d") - datetime.datetime.strptime("2012-02-29", "%Y-%m-%d")

The result, which reminds me of one of the other reasons I avoid python for trivial projects, is the duration / interval answer of days=2528.

Part of the bug is surely in Wolfram|Alpha returning the interval broken out in human durations; but importantly those durations __don't use fixed time values__. The precise length of a //year// and of a //month// are variable.

I still suspect there's double or single counting of leap-days in those durations as the reverse ones clash with converting a duration back to a whole number.


Strong disagree - 'plus' is quite a good name for Joda Instant, and his alternatives are atrocious.

Certain problem domains require baseline familiarity with the subject. Far more people can recite the old "30 days has September, April, June, and November" rhyme than can explain what the words commutative and associative mean. Date math may annoy pure mathematicians but normal humans are used to working with calendars.

In the problem domain of dates, 'plus' is analogous to (but not exactly) its mathematical counterpart, and Joda's month math is almost always exactly what you want. Furthermore, plusIshRoundCeiling doesn't really explain anything; ceiling of what? The OP suggests that the cognitive dissonance is beneficial to the user. In which case it might as well be plusAsterisk or plusDontForgetToReadTheDocumentation.

The problem with plusGoReadTheDocs et al is that all problem domains have little edge cases like this. Excepting pure math, every single plus method is going to have notes. It'll be worse than those useless Prop 65 warnings in California.

Joda did this one right. Date math is simply not associative or commutative. Thankfully, most people are familiar with calendars and have some intuitive sense of this already. Littering the API with special hints doesn't help.


Exactly right.

Although I think your comment here rather underestimates mathematicians :) Regular people are the ones who are only used to thinking in real numbers or integers. While some may be familiar in a practical way with how dates and times work, they would probably struggle to rigorously define the algebra of time math where associativity and commutativity don't hold. Mathematicians will be familiar with areas like abstract algebra and group theory and very capable of understanding the concept that date math is not normal integer arithmetic.

Either way though, I agree the plus operator works great here given the inherent weirdness of how we have structured human time, and everything the author is proposing is worse. Joda handles the trickiness of dealing with time far better and in a much less error-prone way than any other library I've seen.


It is interesting that many of the comments have suggested that JodaTime does what a "normal person" would expect in most of these odd cases, whereas you are pointing out that advanced mathematical concepts can define an algebra where associativity doesn't apply. Almost opposite points! But well taken.

One thing though: I think I was clear that I like Joda Time. It does handle these things better than most libraries. That is what makes it interesting to discuss. I could write a fun article picking apart some awful library, such as the old Java default library, but what would be the point.


Yeah ironically I think STEM non-mathematicians (STE?) in these comments seem to have the most difficulty with the concept of “month” because they try to conceive of it as a numerical quantity of days rather than an abstract concept in its own right as a pure mathematician would recognize.


I like the point on accepting awkward names a lot.

It comes close to the “systems should be as complex as they need to be” (some quote I am completely butchering) idea. There’s a point where trying to hide messiness isn’t helpful.

As a side point to the “plus” argument, I think having a name that covers 99% of the use cases but will fail unpredictably on edge cases should be fine for something that is central to the library.

People for whom exact behavior matters will have looked at the doc or tried these use cases by themselves, or there will be enough blog posts like this one to motivate them into checking what their lib does beforehand. Well, anyone working with times and dates will have learned to not trust clean abstractions at this point.

For people who are just writing convenience applications and want a lib that makes them feel they can use it without hassle, “plus” is a very good, memorable and easy to use name.


That

    (some date + 1 month) + 1 month ≠ some date + 2 months
has always been true for floating-point numbers. This is exactly why you never trust fast matrix multiplications: they rely on cancellations of the form (a + b) - b = a + (b - b) = a.

I would argue that 1 month is just having too few significant digits. You can even implement it as returning 30 with probability 60% and returning 31 with probability 40%. Then on average you would have

    1 year ≈ sum([1 month] * 12)


Sure, but I'd guess that neither developers nor "normal people" immediately think "floating point" when they think about a month.


For what it’s worth, SAS can add months to time points with the aid of an alignment parameter, so 01JAN2021 + 1 month would be 01FEB2021 with beginning alignment and 28FEB2021 with end alignment. There is also middle and sameday alignment.

This is the snappily named intnx function. I guess that’s an honest name in the sense that it is very suggestive of needing to read the documentation before using it.
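For readers who have not used SAS, the alignment idea can be sketched in Python roughly as follows. `shift_months` and its `alignment` strings are invented for this example and are not the intnx interface; middle alignment is left out, and the clamping used for "sameday" is an assumption of the sketch:

```python
import calendar
from datetime import date

def shift_months(d: date, n: int, alignment: str = "sameday") -> date:
    """Move d by n months, then pick a day in the target month per alignment."""
    year, month0 = divmod(d.year * 12 + (d.month - 1) + n, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    if alignment == "beginning":
        return date(year, month, 1)
    if alignment == "end":
        return date(year, month, last_day)
    if alignment == "sameday":
        # Assumption: clamp when the day does not exist in the target month.
        return date(year, month, min(d.day, last_day))
    raise ValueError(f"unknown alignment: {alignment!r}")

start = date(2021, 1, 1)
print(shift_months(start, 1, "beginning"), shift_months(start, 1, "end"))
```

Making the caller name the alignment is one way to surface the ambiguity at the call site instead of in the docs.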


I was expecting the author to actually give a supposedly better name other than 'plus' at the end of the post. Bummer. Perhaps there is no better name and actually 'plus' is the perfect choice?


I think this is exactly the opposite of the takeaway. If the underlying concept isn't simple, don't pretend it's simple with a simple name.


Yes, that was my point. Avoid overly simple names and also overly elegant names, unless you've actually found a simple, elegant concept.

I do have thoughts on better names. I'll write a follow up about that. But I didn't want to include it in this article because my most important point was that we need ways to curb our perfectionist tendencies, and not by hiding the rough spots. And if I had ended on that up note, it would have been the usual cheesy ending: Look! I'm so good that I always end up with a beautiful design. Which, as you say, was the opposite of the takeaway I wanted.


To go a step further, anyone using time variables based on calendar periods should know better than to think the relative time of +1 month would be simple. I'm not sure it's the fault of using a simple name in this case.

I'm sure there could have been a lot better options to drive that point home, but the author picked one that I'm sure everyone has an opinion about. That seems like an effective way to get people talking about it, but it could be divisive in bad ways.

I'm a little put off by the author saying "a clean name shuts down our thinking", so maybe I'm just responding in disgust.


What would be a better example to make the point, do you think? I always use the best example I can think of. Obviously there must be better ones.


Classic 'one of the hardest things in Computer Science: naming things'


I completely agree with this. And reminds me of how Oracle brushed all complexity from the word "filter" and simply went and named the function as is. A Filter implies a flow direction, it also implies two segments, the desired and the unwanted portion of the filtering operation. Which one is the one staying? which is the one that passes through?

I've seen people argue that one side should be called "sieved" while others say it should be named "selected"... To add to this, let's add even more complexity: "selected" implies agency, while a filter performs a passive _selection_, in which case it should not be considered a selection at all.

It seems easy, but in reality some concepts (if not all) are inherently messy, especially in the English language, since it seems the most abstract of all languages.


Everyone in computing knows calendars and time are tricky concepts. In all honesty, there is no point in making the API less terse: the unaware will ignore it and use it anyway. The educated will use it with thought, even if it is terse.

But the general point holds: honesty in an API is important.


Also

    .plus(1 month)

The unknown behavior of 1-30 + 1 month is a red flag that maybe you shouldn't be using "month" as a unit of time because it isn't.

Months suck. Bill for services every 10 days or every 25 days or every 50 days instead of every month.


But sometimes you need to bill at a certain point in the month. I don't pay my rent every X days, I pay it on the first of the month.

The best solution for that may be something like establishing it as a frequency rather than accumulative addition. But that won't work in every situation. Dates are complex and require lots of thinking to do correctly in some circumstances.
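A small Python sketch of the difference between the two approaches, assuming a month shift that clamps to the end of the month. `add_months_clamped` is a helper written for this example, not a library function:

```python
import calendar
from datetime import date

def add_months_clamped(d: date, n: int) -> date:
    year, month0 = divmod(d.year * 12 + (d.month - 1) + n, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))

anchor = date(2021, 1, 31)

# Accumulative: each bill date is the previous one plus a month.
accumulated = [anchor]
for _ in range(3):
    accumulated.append(add_months_clamped(accumulated[-1], 1))

# Frequency: the k-th bill date is always computed from the anchor.
scheduled = [add_months_clamped(anchor, k) for k in range(4)]

print([d.isoformat() for d in accumulated])
# ['2021-01-31', '2021-02-28', '2021-03-28', '2021-04-28']
print([d.isoformat() for d in scheduled])
# ['2021-01-31', '2021-02-28', '2021-03-31', '2021-04-30']
```

The accumulative version drifts to the 28th after February and never recovers, while the anchored version returns to the end of the month.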


> I don't pay my rent every X days, I pay it on the first of the month.

I don't want to pay rent that way. It's a good way to get scammed because they're charging you a higher per-day rate in February than January.

I'd much rather pay rent per day, if possible. If Amazon can take over my property manager I'm sure it'll be possible to bill it with the same flawless consistency of AWS billing, and have a concept of discounted "reserved instances" and "spot instances" for real estate.


> I don't want to pay rent that way. It's a good way to get scammed because they're charging you a higher per-day rate in February than January.

Are you serious? They're not scamming you. They're giving you a discount 11 months out of the year. Feb is the only month they're charging you full rate for!


Per-day still has errors when leap-seconds and daylight savings get involved. It's just a smaller error. And unless you can convince landlords nationwide to change how billing works, a company creating a monthly billing service is just going to find another developer that makes what they want.

edit: Also this may not be typical in every lease, but all of my leases have established the rate as both a yearly and monthly amount. I'm sure my landlord wouldn't complain if I paid all 12 months up front. That's the only truly "fair" way.


They should give you a discount for committing to 12 months, and a further discount for paying 12 months upfront. AWS does.


In New Zealand we pay rent every 2 weeks instead of monthly and I’ve seen someone from the US come here and make the opposite argument.

“They are scamming you because you have to pay an extra month's rent every year”


It's a mixed bag in Aus.

Rent is listed per week prices but you'll see some landlords calculate it out by the day rate if you want to pay monthly. Others use the 52 week rate divided by 12.
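The two conversions give different monthly figures. A quick Python check, using a made-up weekly rent of 500 and exact fractions to avoid rounding noise:

```python
from fractions import Fraction

weekly = Fraction(500)  # hypothetical weekly rent

# Method 1: 52 weeks of rent spread evenly over 12 months.
flat_monthly = weekly * 52 / 12

# Method 2: a day rate multiplied by the number of days in the month.
def by_day_rate(days_in_month: int) -> Fraction:
    return weekly / 7 * days_in_month

print(round(float(flat_monthly), 2))     # 2166.67
print(round(float(by_day_rate(28)), 2))  # 2000.0
print(round(float(by_day_rate(31)), 2))  # 2214.29
```

Over a 365-day year the day-rate method also totals slightly more than 52 weeks of rent, because 365 days is 52 weeks plus one day.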


Sure, programming would be infinitely easier if I could ignore reality and substitute it with my own.


> Bill for services every 10 days or every 25 days or every 50 days instead of every month.

Isn't that just prioritizing one's laziness as a developer over what makes sense to the user? My guess is most users would rather see the bill come out on the same day every month.


I want my bills to be nice and predictable. 2 week billing means my bill is the same amount. It means it is in sync with everything else here as well (New Zealand).

I can set up an ap, and it is done. No tweaking needed.


> most users

As a user, I would not. I hate months. They're inconsistent. I hate them.


That is a good point. The awkward thing here isn't really "plus" it's "month".


Product: We need to support doing x on a monthly cycle.

Dev: Actually, months are a really inaccurate way of measuring time and we should convince all our clients not to use them.

CEO: You're fired.

Programming is the art of turning messy, real world, human problems into tools that work in the best way possible.

If you abdicate that responsibility, quite frankly, what use are you?


Please don't argue in the flamewar style on HN. You can make your substantive points without that.


What is your definition of 'unit of time'? Neither 'month' nor 'day' have fixed durations (the latter due to leap seconds). So why should we consider 'day' a unit of time but not 'month'?


Or bill on the X day of the month, where X <= 28.


This is the sort of mistake I see junior devs in my company making often. Any recommendations for an article that teaches a programmer how to give honest names to variables and functions, ideally with lots of examples?


Nope. The same philosophy of imperfect design can be held to imperfect naming as well. When will this unexpectedly-intelligent behaviour that belies its name ever bite a user in the ass? Hardly ever? Then it's fine to have a slightly dishonest name if it means a more straightforward API.

To be clear, it's still a tradeoff. But I think JodaTime made the right call by settling for the simpler name in this situation.


Please don’t have a method named plusIshRoundCeiling() in a library that “might” be used for years in lots of projects :) The plus() one is almost there and you can easily explain the ugly bits in javadocs (with lots of warnings around it so that it sticks out and is addressed hopefully in some next convenient cycle)


I was trying to be funny here! There is lots of room between "plus", which I think is misleading, and the kitchen sink name you quote. Although it depends on how serious the problems were, in this case I'd pick something shorter that still suggested rough edges.


Makes perfect sense, I thought so as well. It just made me hope nobody assumes this would be the best name in this case :)


plusIshRoundCeiling() forces the reader to look at the javadocs to understand the edge cases, and in fact it serves as a reminder that there even are any edge cases.


> (with lots of warnings around it so that it sticks out and is addressed hopefully in some next convenient cycle)

How do you expect to address it when there isn't a single obvious "right" that everyone will have the same intuition on?


For instance, as in the article, provide examples so that users get more of a heads-up about what is actually happening.

Long term, you might deprecate it and solve the problem with a more intuitive abstraction.


Can you explain why you see this name as unacceptable?


I didn’t quite intend to sound as if I think it would be unacceptable, but the method name sounded too comical to me, to be honest. Almost as if the intention is to say this method is a bit of a joke, don’t use it please.

Maybe move() might work better here or moveToCalendarNearest() or something to flag up that in certain cases you need to be extra careful as mentioned in the article.

It is really hard to get this right and personally I would be flagging it up in javadocs.


The behavior is exactly what I would expect it to do. Arguably I’d use “plus” for adding timespans together (which should be commutative and associative) and “shift” for shifting a date by a timespan, but I don’t think the behavior here needs much elaborating as long as the method is accurately documented.


If you want to be real nitpicky you shouldn't be using '+' for non-commutative operations either. Ideally you'd also have some kind of distributive law, but if needs be any commutative operation will naturally result in a module over the integers, which I suppose works well enough.


I don't normally comment on sites' styling, but this one really needs some horizontal padding or margin


That there is no ISO 8601 duration in Java has always bugged me. Duration() is just a wrapper around seconds and a nanosecond part, so you can’t use it for a calendar duration. Even the concept of an “hour” or “minute” doesn’t work because of leap seconds.


I think a lot of people here are getting hung up on the associativity, or that the name is not fully descriptive.

But remember that this functionality exists in a library for operating on dates. It is clear to the user that values like "2 days" or "1 month" are intermediate values that cannot or should not be used as output. They need to be applied to an absolute date to become resolved and be useful.

The context, and real world experience with dates, makes this distinction obvious.

Side note: I created a similar library in the past. I struggled more with deciding if the clipping behaviour was even desirable than worrying about the naming, but that was merely due to the API I used. My function signature was `addMonths(v, 2)`, which eliminated ambiguity.


The flaw in this argument is over focusing on edge cases. Having surprising edge cases is ok, making the 90% case easy is usually more important. Software is built on lies.


Naming models and subsequent database tables is hard. Or in other words, changing it later is the hard part.


Advertisements on your page ruin the experience. Anyway, great post!


Why not call date.plus date.after?



