Skip to content

Conversation

@jbrockmendel
Copy link
Member

Waiting to update expected's until after #62801, so this won't pass yet.

@jorisvandenbossche
Copy link
Member

Should we only do this for pd.date_range for now, and not yet for pd.timedelta_range, since in general timedelta still defaults to nanoseconds? (eg when converting strings or integers in pd.to_timedelta)

@jbrockmendel
Copy link
Member Author

I think best case would be to do it all at once (xref #63018 gets everything but the string-parsing inference for td64). But if it'll make a difference in getting a release out soon, I'll happily revert.

@jorisvandenbossche
Copy link
Member

Ah, I hadn't seen #63018. Yes, my preference would also be to do it all at once. Will try to take a look at that one then.

Copy link
Member

@rhshadrach rhshadrach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have tests for the inference itself? I could have missed it.

unit : {'s', 'ms', 'us', 'ns'}, default 'ns'
unit : {'s', 'ms', 'us', 'ns', None}, default None
Specify the desired resolution of the result.
If not specified, this is inferred from the 'start', 'end', and 'freq'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to given an indication of how the inference works?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What did you have in mind? I can't think of an explanation that is meaningfully simpler than "read the code"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accidentally started a separate chain here: #63146 (comment)

unit : {'s', 'ms', 'us', 'ns'}, default 'ns'
unit : {'s', 'ms', 'us', 'ns', None}, default None
Specify the desired resolution of the result.
If not specified, this is inferred from the 'start', 'end', and 'freq'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

@jbrockmendel
Copy link
Member Author

Do we have tests for the inference itself? I could have missed it.

I didn't add dedicated tests bc existing tests hit all the cases, but will do so now.

Copy link
Member

@rhshadrach rhshadrach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

unit : {'s', 'ms', 'us', 'ns'}, default 'ns'
unit : {'s', 'ms', 'us', 'ns', None}, default None
Specify the desired resolution of the result.
If not specified, this is inferred from the 'start', 'end', and 'freq'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized writing this that we also don't document Timestamp inference (at least, in the API docs). So this is perhaps a little incomplete unless that is done, but as long as there is no opposition to doing so I can open an issue for that.

Suggested change
If not specified, this is inferred from the 'start', 'end', and 'freq'
If not specified, this is inferred from the 'start', 'end', and 'freq' using the same inference as :class:`Timestamp` taking the highest resolution of the three that are provided.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's track as a separate issue indeed (your above suggestion does not make it much more clear, right now, IMO)

@rhshadrach rhshadrach added Timedelta Timedelta data type Datetime Datetime data dtype API - Consistency Internal Consistency of API/Behavior labels Nov 25, 2025
@rhshadrach rhshadrach added this to the 3.0 milestone Nov 25, 2025
Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actual change looks good to me!

Personally, I would do less of unit="ns" in the tests (to ensure we test the default unit more as well, and it also just makes them a bit less verbose), but it that requires more updates in the tests to get them passing, that's certainly fine for a follow-up issue.

Comment on lines 656 to 664
df = DataFrame(
{"dt": date_range("2015-01-01", periods=3, tz="Europe/Brussels", unit="ns")}
)
result = df.apply(lambda x: x)
tm.assert_frame_equal(result, df)

result = df.apply(lambda x: x + pd.Timedelta("1day"))
expected = DataFrame(
{"dt": date_range("2015-01-02", periods=3, tz="Europe/Brussels")}
{"dt": date_range("2015-01-02", periods=3, tz="Europe/Brussels", unit="ns")}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't (yet) work without specifying the unit in both cases?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ATM the Timedelta("1day") above has "ns" unit, so we need it in the expected line. In order to be robust to the Timedelta PR, need it in the df line too. Can remove both if the Timedelta PR goes in.

"B": [1.0, 2.0, 3.0],
"C": ["foo", "bar", "baz"],
"D": date_range("20130101", periods=3),
"D": date_range("20130101", periods=3, unit="ns"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"D": date_range("20130101", periods=3, unit="ns"),
"D": date_range("20130101", periods=3),

and then remove the .as_unit("ns") below?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

Comment on lines 1850 to 1853
start = Timestamp("2025-11-25").as_unit("ms")
end = Timestamp("2025-11-26").as_unit("s")
dti = date_range(start, end, freq=off)
assert dti.unit == "us"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
start = Timestamp("2025-11-25").as_unit("ms")
end = Timestamp("2025-11-26").as_unit("s")
dti = date_range(start, end, freq=off)
assert dti.unit == "us"
start = Timestamp("2025-11-25 09:00:00").as_unit("s")
end = Timestamp("2025-11-25 09:00:02").as_unit("s")
dti = date_range(start, end, freq=off)
assert dti.unit == "us"
off = DateOffset(milleconds=2)
dti = date_range(start, end, freq=off)
assert dti.unit == "ms"
off = DateOffset(nanoseconds=2)
dti = date_range(start, end, freq=off)
assert dti.unit == "ns"

to also cover the two other cases? (unless that is already covered elsewhere?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, though i'll change the nano case to make a smaller array

@jbrockmendel
Copy link
Member Author

Personally, I would do less of unit="ns" in the tests (to ensure we test the default unit more as well, and it also just makes them a bit less verbose), but it that requires more updates in the tests to get them passing, that's certainly fine for a follow-up issue.

I'm of mixed opinion on this. Ended up deciding that outside of tests targeted at date_range, I prefer to be a little bit more verbose in order to be robust to e.g. decisions on other PRs

@mroeschke mroeschke mentioned this pull request Nov 26, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

API - Consistency Internal Consistency of API/Behavior Datetime Datetime data dtype Timedelta Timedelta data type

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ENH/BUG: pd.date_range() still defaults to nanosecond resolution

3 participants