doc/source/whatsnew/v3.0.0.rst
+60 -11 (60 additions & 11 deletions)
@@ -117,6 +117,9 @@ process in more detail.
117
117
118
118
`PDEP-7: Consistent copy/view semantics in pandas with Copy-on-Write <https://pandas.pydata.org/pdeps/0007-copy-on-write.html>`__
119
119
120
+
Setting the option ``mode.copy_on_write`` no longer has any impact. The option is deprecated
121
+
and will be removed in pandas 4.0.
122
+
120
123
.. _whatsnew_300.enhancements.col:
121
124
122
125
``pd.col`` syntax can now be used in :meth:`DataFrame.assign` and :meth:`DataFrame.loc`
@@ -382,6 +385,8 @@ In cases with mixed-resolution inputs, the highest resolution is used:
382
385
383
386
.. warning:: Many users will now get "M8[us]" dtype data in cases when they used to get "M8[ns]". For most use cases they should not notice a difference. One big exception is converting to integers, which will give integers 1000x smaller.
384
387
388
+
Similarly, the :class:`Timedelta` constructor and :func:`to_timedelta` with a string input now default to a microsecond unit, using a nanosecond unit only when the input actually has nanosecond precision.
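The integer-conversion caveat in the warning above can be illustrated with plain NumPy, independently of the installed pandas version. This is only a minimal sketch: the timestamp is an arbitrary value chosen for illustration, and the variable names are not taken from the pandas documentation.

```python
import numpy as np

# One instant stored at two resolutions (the date is an arbitrary example).
stamp_us = np.datetime64("2024-01-01T00:00:00", "us")
stamp_ns = np.datetime64("2024-01-01T00:00:00", "ns")

# Casting to int64 yields the number of units elapsed since the epoch.
int_us = int(stamp_us.astype("int64"))
int_ns = int(stamp_ns.astype("int64"))

# Same instant, but the microsecond count is 1000x smaller.
ratio = int_ns // int_us
print(int_us, int_ns, ratio)
```

Code that depends on the integer representation can pin the resolution explicitly, for example by casting to ``"M8[ns]"`` before converting to integers.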
Previously, when dealing with a nullable dtype (e.g. ``Float64Dtype`` or ``int64[pyarrow]``), ``NaN`` was treated as interchangeable with :class:`NA` in some circumstances but not others. This was done to make adoption easier, but caused some confusion (:issue:`32265`). In 3.0, an option ``"mode.nan_is_na"`` (default ``True``) controls whether to treat ``NaN`` as equivalent to :class:`NA`.
556
+
Previously, when dealing with a nullable dtype (e.g. ``Float64Dtype`` or ``int64[pyarrow]``),
557
+
``NaN`` was treated as interchangeable with :class:`NA` in some circumstances but not others.
558
+
This was done to make adoption easier, but caused some confusion (:issue:`32265`).
559
+
In 3.0, this behaviour is made consistent: by default, ``NaN`` is treated as equivalent
560
+
to :class:`NA` in all cases.
552
561
553
-
With ``pd.set_option("mode.nan_is_na", True)`` (again, this is the default), ``NaN`` can be passed to constructors, ``__setitem__``, ``__contains__`` and be treated the same as :class:`NA`. The only change users will see is that arithmetic and ``np.ufunc`` operations that previously introduced ``NaN`` entries produce :class:`NA` entries instead:
562
+
By default, ``NaN`` can be passed to constructors, ``__setitem__``, ``__contains__``
563
+
and will be treated the same as :class:`NA`. The only change users will see is
564
+
that arithmetic and ``np.ufunc`` operations that previously introduced ``NaN``
565
+
entries produce :class:`NA` entries instead.
554
566
555
567
*Old behavior:*
556
568
557
569
.. code-block:: ipython
558
570
559
-
In [2]: ser = pd.Series([0, None], dtype=pd.Float64Dtype())
571
+
# NaN in input gets converted to NA
572
+
In [1]: ser = pd.Series([0, np.nan], dtype=pd.Float64Dtype())
573
+
In [2]: ser
574
+
Out[2]:
575
+
0 0.0
576
+
1 <NA>
577
+
dtype: Float64
578
+
# NaN produced by arithmetic (0/0) remained NaN
560
579
In [3]: ser / 0
561
580
Out[3]:
562
581
0 NaN
563
582
1 <NA>
564
583
dtype: Float64
584
+
# the NaN value is not considered as missing
585
+
In [4]: (ser / 0).isna()
586
+
Out[4]:
587
+
0 False
588
+
1 True
589
+
dtype: bool
565
590
566
591
*New behavior:*
567
592
568
593
.. ipython:: python
569
594
570
-
ser = pd.Series([0, None], dtype=pd.Float64Dtype())
595
+
ser = pd.Series([0, np.nan], dtype=pd.Float64Dtype())
596
+
ser
571
597
ser / 0
598
+
(ser / 0).isna()
572
599
573
-
By contrast, with ``pd.set_option("mode.nan_is_na", False)``, ``NaN`` is always considered distinct and specifically as a floating-point value, so cannot be used with integer dtypes:
600
+
In the future, the intention is to consider ``NaN`` and :class:`NA` as distinct
601
+
values, and an option to control this behaviour is added in 3.0 through
602
+
``pd.options.future.distinguish_nan_and_na``. When enabled, ``NaN`` is always
603
+
considered distinct and specifically as a floating-point value. As a consequence,
604
+
it cannot be used with integer dtypes.
574
605
575
606
*Old behavior:*
576
607
@@ -584,13 +615,21 @@ By contrast, with ``pd.set_option("mode.nan_is_na", False)``, ``NaN`` is always
584
615
585
616
.. ipython:: python
586
617
587
-
pd.set_option("mode.nan_is_na", False)
588
-
ser = pd.Series([1, np.nan], dtype=pd.Float64Dtype())
589
-
ser[1]
618
+
with pd.option_context("future.distinguish_nan_and_na", True):
619
+
ser = pd.Series([1, np.nan], dtype=pd.Float64Dtype())
620
+
print(ser[1])
621
+
622
+
If we had passed ``pd.Int64Dtype()`` or ``"int64[pyarrow]"`` for the dtype in
623
+
the latter example, this would raise, as a float ``NaN`` cannot be held by an
624
+
integer dtype.
590
625
591
-
If we had passed ``pd.Int64Dtype()`` or ``"int64[pyarrow]"`` for the dtype in the latter example, this would raise, as a float ``NaN`` cannot be held by an integer dtype.
626
+
With ``"future.distinguish_nan_and_na"`` enabled, ``ser.to_numpy()`` (and
627
+
``frame.values`` and ``np.asarray(obj)``) will convert to ``object`` dtype if
628
+
:class:`NA` entries are present, where before they would coerce to
629
+
``NaN``. To retain a float numpy dtype, explicitly pass ``na_value=np.nan``
630
+
to :meth:`Series.to_numpy`.
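As a hedged sketch of the ``na_value`` advice above, the conversion below requests a float NumPy array explicitly. Passing ``dtype="float64"`` alongside ``na_value`` is an assumption made here so that the result does not depend on the option setting or on the pandas version; the sample values are illustrative only.

```python
import numpy as np
import pandas as pd

# A nullable float Series with one missing entry (illustrative data).
ser = pd.Series([1.5, None], dtype=pd.Float64Dtype())

# Ask for a float array and map missing entries to NaN explicitly,
# rather than relying on the default conversion.
arr = ser.to_numpy(dtype="float64", na_value=np.nan)
print(arr.dtype, arr)
```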
592
631
593
-
With ``"mode.nan_is_na"`` set to ``False``, ``ser.to_numpy()`` (and ``frame.values`` and ``np.asarray(obj)``) will convert to ``object`` dtype if :class:`NA` entries are present, where before they would coerce to ``NaN``. To retain a float numpy dtype, explicitly pass ``na_value=np.nan`` to :meth:`Series.to_numpy`.
632
+
Note that the option is experimental and subject to change in future releases.
594
633
595
634
The ``__module__`` attribute now points to public modules
- :class:`IncompatibleFrequency` now subclasses ``TypeError`` instead of ``ValueError``. As a result, joins with mismatched frequencies now cast to object like other non-comparable joins, and arithmetic with indexes with mismatched frequencies align (:issue:`55782`)
751
790
- :class:`Series` "flex" methods like :meth:`Series.add` no longer allow passing a :class:`DataFrame` for ``other``; use the DataFrame reversed method instead (:issue:`46179`)
791
+
- :func:`date_range` and :func:`timedelta_range` no longer default to ``unit="ns"``; instead, they infer a unit from the ``start``, ``end``, and ``freq`` parameters. Explicitly specify the desired ``unit`` to override the inferred one (:issue:`59031`)
752
792
- :meth:`CategoricalIndex.append` no longer attempts to cast different-dtype indexes to the caller's dtype (:issue:`41626`)
753
793
- :meth:`ExtensionDtype.construct_array_type` is now a regular method instead of a ``classmethod`` (:issue:`58663`)
754
794
- Arithmetic operations between a :class:`Series`, :class:`Index`, or :class:`ExtensionArray` with a ``list`` now consistently wrap that list with an array equivalent to ``Series(my_list).array``. To do any other kind of type inference or casting, do so explicitly before operating (:issue:`62552`)
755
795
- Comparison operations between :class:`Index` and :class:`Series` now consistently return :class:`Series` regardless of which object is on the left or right (:issue:`36759`)
756
796
- Numpy functions like ``np.isinf`` that return a bool dtype when called on a :class:`Index` object now return a bool-dtype :class:`Index` instead of ``np.ndarray`` (:issue:`52676`)
797
+
- Methods that can operate in-place (:meth:`~DataFrame.replace`, :meth:`~DataFrame.fillna`,
- Bug in :class:`DataFrame` and :class:`Series` ``repr`` of :py:class:`collections.abc.Mapping` elements. (:issue:`57915`)
1230
+
- Bug in :meth:`DataFrame.to_hdf` and :func:`read_hdf` with ``timedelta64`` dtypes with non-nanosecond resolution failing to round-trip correctly (:issue:`63239`)
1186
1231
- Fix bug in ``on_bad_lines`` callable when returning too many fields: now emits
1187
1232
``ParserWarning`` and truncates extra fields regardless of ``index_col`` (:issue:`61837`)
1188
1233
- Bug in :func:`pandas.json_normalize` inconsistently handling non-dict items in ``data`` when ``max_level`` was set. The function will now raise a ``TypeError`` if ``data`` is a list containing non-dict items (:issue:`62829`)
1234
+
- Bug in :func:`pandas.json_normalize` raising ``TypeError`` when ``meta`` contained a non-string key (e.g., ``int``) and ``record_path`` was specified, which was inconsistent with the behavior when ``record_path`` was ``None`` (:issue:`63019`)
1189
1235
- Bug in :meth:`.DataFrame.to_json` when ``"index"`` was a value in the :attr:`DataFrame.columns` and :attr:`Index.name` was ``None``. Now, this will fail with a ``ValueError`` (:issue:`58925`)
1190
1236
- Bug in :meth:`.io.common.is_fsspec_url` not recognizing chained fsspec URLs (:issue:`48978`)
1191
1237
- Bug in :meth:`DataFrame._repr_html_` which ignored the ``"display.float_format"`` option (:issue:`59876`)
@@ -1239,6 +1285,7 @@ Plotting
1239
1285
- Bug in :meth:`Series.plot` preventing a line and bar from being aligned on the same plot (:issue:`61161`)
1240
1286
- Bug in :meth:`Series.plot` preventing a line and scatter plot from being aligned (:issue:`61005`)
1241
1287
- Bug in :meth:`Series.plot` with ``kind="pie"`` with :class:`ArrowDtype` (:issue:`59192`)
1288
+
- Bug in plotting with a :class:`TimedeltaIndex` with non-nanosecond resolution displaying incorrect labels (:issue:`63237`)
1242
1289
1243
1290
Groupby/resample/rolling
1244
1291
^^^^^^^^^^^^^^^^^^^^^^^^
@@ -1269,7 +1316,8 @@ Groupby/resample/rolling
1269
1316
- Bug in :meth:`Rolling.apply` for ``method="table"`` where column order was not being respected due to the columns getting sorted by default. (:issue:`59666`)
1270
1317
- Bug in :meth:`Rolling.apply` where the applied function could be called on fewer than ``min_periods`` periods if ``method="table"``. (:issue:`58868`)
1271
1318
- Bug in :meth:`Rolling.sem` computing incorrect results because it divided by ``sqrt((n - 1) * (n - ddof))`` instead of ``sqrt(n * (n - ddof))``. (:issue:`63180`)
1272
-
- Bug in :meth:`Rolling.skew` incorrectly computing skewness for windows following outliers due to numerical instability. The calculation now properly handles catastrophic cancellation by recomputing affected windows (:issue:`47461`)
1319
+
- Bug in :meth:`Rolling.skew` and in :meth:`Rolling.kurt` incorrectly computing skewness and kurtosis, respectively, for windows following outliers due to numerical instability. The calculation now properly handles catastrophic cancellation by recomputing affected windows (:issue:`47461`, :issue:`61416`)
1320
+
- Bug in :meth:`Rolling.skew` and in :meth:`Rolling.kurt` where results varied with input length despite identical data and window contents (:issue:`54380`)
1273
1321
- Bug in :meth:`Series.resample` where it could raise when the date range ended shortly before a non-existent time. (:issue:`58380`)
1274
1322
- Bug in :meth:`Series.resample` raising error when resampling non-nanosecond resolutions out of bounds for nanosecond precision (:issue:`57427`)
1275
1323
- Bug in :meth:`Series.rolling.var` and :meth:`Series.rolling.std` computing incorrect results due to numerical instability. (:issue:`47721`, :issue:`52407`, :issue:`54518`, :issue:`55343`)
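The corrected divisor in the ``Rolling.sem`` entry above is consistent with the definition of the standard error of the mean: the standard deviation computed with ``ddof``, divided by ``sqrt(n)``. The pure-Python sketch below checks that identity for a single window, assuming the quantity being divided is the square root of the sum of squared deviations. The helper name ``window_sem`` and the sample values are hypothetical and do not reflect pandas internals.

```python
import math
import statistics


def window_sem(values, ddof=1):
    """Standard error of the mean for one window (illustrative helper)."""
    n = len(values)
    mean = sum(values) / n
    # Sum of squared deviations from the window mean.
    ss = sum((v - mean) ** 2 for v in values)
    # std / sqrt(n) == sqrt(ss / (n - ddof)) / sqrt(n)
    #               == sqrt(ss) / sqrt(n * (n - ddof))
    return math.sqrt(ss) / math.sqrt(n * (n - ddof))


window = [2.0, 4.0, 4.0, 5.0]
sem = window_sem(window)

# Reference value: sample standard deviation (ddof=1) over sqrt(n).
reference = statistics.stdev(window) / math.sqrt(len(window))
print(sem, reference)
```

Using ``sqrt((n - 1) * (n - ddof))`` as the divisor instead would overstate the result for every window, since the divisor would be too small.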
@@ -1307,6 +1355,7 @@ Sparse
1307
1355
- Bug in :class:`SparseDtype` equality comparison with an NA fill value. (:issue:`54770`)
1308
1356
- Bug in :meth:`DataFrame.sparse.from_spmatrix` which hard coded an invalid ``fill_value`` for certain subtypes. (:issue:`59063`)
1309
1357
- Bug in :meth:`DataFrame.sparse.to_dense` which ignored subclassing and always returned an instance of :class:`DataFrame` (:issue:`59913`)
1358
+
- Bug in :meth:`SparseArray.cumsum` for integer arrays, where calling it raised a maximum recursion depth error (:issue:`62669`)