ENH: Add `unit` argument to `to_datetime` and `to_timedelta` to avoid value-dependent parsing #63270

### Feature Type

- [x] Adding new functionality to pandas

- [ ] Changing existing functionality in pandas

- [ ] Removing existing functionality in pandas


### Problem Description

Over in https://github.com/dask/dask/issues/12178#issuecomment-3604828151, we're discussing how dask should adapt to the new datetime / timedelta resolution inference.

The new behavior is value-dependent: you don't know what dtype the result will have until you run it on the values. This is a challenge for dask, which may process subsets of the data in parallel but wants every partition of a column to have the same dtype:

```python
In [9]: s = pd.Series(["1", "2", "1 day 2 hours"])

In [10]: pd.to_timedelta(s[:2])
Out[10]: 
0   0 days 00:00:00.000000001
1   0 days 00:00:00.000000002
dtype: timedelta64[ns]

In [11]: pd.to_timedelta(s[2:])
Out[11]: 
2   1 days 02:00:00
dtype: timedelta64[us]
```

The first partition (`s[:2]`) is inferred to be `timedelta64[ns]`, while the second partition (`s[2:]`) is inferred to be `timedelta64[us]`.
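As a rough illustration of the workaround available today, each partition can be cast to one agreed unit after parsing. This is only a sketch: `TARGET_UNIT` is a name made up for this example, it assumes pandas 2.0 or later (where `Series.dt.as_unit` is available), and the inferred dtypes printed in the middle depend on the pandas version.

```python
import pandas as pd

s = pd.Series(["1", "2", "1 day 2 hours"])

# Parse each "partition" independently, as a parallel framework might.
parts = [pd.to_timedelta(s[:2]), pd.to_timedelta(s[2:])]

# The inferred units may or may not agree, depending on the pandas version.
print([str(p.dtype) for p in parts])

# Casting every partition to one agreed unit makes the dtype predictable.
TARGET_UNIT = "ns"  # assumption: chosen up front by the caller
normalized = [p.dt.as_unit(TARGET_UNIT) for p in parts]
combined = pd.concat(normalized)
print(combined.dtype)
```

The cast is an extra step that every caller has to remember, which is the gap a keyword on `to_timedelta` itself would close.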


### Feature Description

Add a `dtype` or `resolution` parameter to `to_datetime` and `to_timedelta`:

```python
pd.to_datetime(
    arg: 'DatetimeScalarOrArrayConvertible | DictConvertible',
    errors: 'DateTimeErrorChoices' = 'raise',
    dayfirst: 'bool' = False,
    yearfirst: 'bool' = False,
    utc: 'bool' = False,
    format: 'str | None' = None,
    exact: 'bool | lib.NoDefault' = <no_default>,
    unit: 'str | None' = None,
    infer_datetime_format: 'lib.NoDefault | bool' = <no_default>,
    origin: 'str' = 'unix',
    cache: 'bool' = True,
    resolution: 'Resolution | None' = None,
) -> 'DatetimeIndex | Series | DatetimeScalar | NaTType | None'
"""
...
resolution: Resolution, optional
    Controls the resolution of the result.
"""
```

This would ideally be implemented as `pd.to_datetime(...).as_unit(resolution)`.
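A minimal sketch of that equivalence, written as a user-level wrapper rather than a change to pandas: `to_datetime_with_resolution` is a hypothetical helper name, not an existing pandas function, and the sketch assumes pandas 2.0 or later. It also shows one wrinkle, namely that a `Series` result exposes `as_unit` through the `.dt` accessor while `DatetimeIndex` and `Timestamp` expose it directly.

```python
import pandas as pd


def to_datetime_with_resolution(arg, resolution="ns", **kwargs):
    """Hypothetical helper: parse with pd.to_datetime, then cast to a fixed unit."""
    result = pd.to_datetime(arg, **kwargs)
    if isinstance(result, pd.Series):
        # Series exposes as_unit through the .dt accessor.
        return result.dt.as_unit(resolution)
    # DatetimeIndex and Timestamp expose as_unit directly.
    return result.as_unit(resolution)


idx = to_datetime_with_resolution(["2024-01-01", "2024-01-02"], resolution="ms")
ser = to_datetime_with_resolution(pd.Series(["2024-01-01"]), resolution="s")
print(idx.dtype, ser.dtype)
```

Because the cast happens after parsing, the result dtype no longer depends on the values in any one partition.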

### Alternative Solutions

Dask could just go on its own here and add that `resolution` keyword. But I suspect other workloads might benefit from knowing exactly what range they'll get out.

### Additional Context

There's some complexity in how the proposed `resolution` keyword would interact with existing keywords, in particular `unit` (how numeric values are interpreted) and `errors` (what happens if a value is out of bounds for the `resolution` you provide?). I'd be curious to hear whether those downsides outweigh the benefits.
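To make the out-of-bounds question concrete, here is a small sketch of what casting does today, assuming pandas 2.0 or later. It only demonstrates the existing `as_unit` behavior; how a `resolution` keyword should combine with `errors` is left open.

```python
import pandas as pd

# A date outside the range that nanosecond resolution can represent
# (roughly the years 1677 to 2262).
ts = pd.Timestamp("1500-01-01")

try:
    ts.as_unit("ns")
    overflowed = False
except pd.errors.OutOfBoundsDatetime:
    # A `resolution` keyword would need to define how `errors` treats this case.
    overflowed = True

# A coarser unit holds the same value without trouble.
ts_seconds = ts.as_unit("s")
print(overflowed, ts_seconds.unit)
```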
