__setitem__ should accept sequences containing None? #1538

@anentropic

Describe the bug
I have code a bit like:

In [1]: import pandas as pd
   ...:
   ...: df = pd.DataFrame({"flag": [True, False], "count": [1, 0]})
   ...:
   ...: df["result"] = [
   ...:     1 if flag and count > 0 else None
   ...:     for flag, count in zip(df["flag"], df["count"], strict=True)
   ...: ]

In [2]: df['result']
Out[2]:
0    1.0
1    NaN
Name: result, dtype: float64

The code works as intended.

Pylance complains:

No overloads for "__setitem__" match the provided arguments Pylance reportCallIssue
frame.pyi(849, 9): Overload 2 is the closest match
Argument of type "list[Any | None]" cannot be assigned to parameter "value" of type "Scalar | ArrayLike | NAType | NaTType | IndexOpsMixin[Any, Any] | Sequence[Scalar] | Sequence[Sequence[Scalar]] | DataFrame | Mapping[Hashable, Scalar | NAType | NaTType] | None" in function "__setitem__"
Type "list[Any | None]" is not assignable to type "Scalar | ArrayLike | NAType | NaTType | IndexOpsMixin[Any, Any] | Sequence[Scalar] | Sequence[Sequence[Scalar]] | DataFrame | Mapping[Hashable, Scalar | NAType | NaTType] | None"
"list[Any | None]" is not assignable to "str"
"list[Any | None]" is not assignable to "bytes"
"list[Any | None]" is not assignable to "date"
"list[Any | None]" is not assignable to "datetime"
"list[Any | None]" is not assignable to "timedelta"
"list[Any | None]" is not assignable to "datetime64[date | int | None]"
"list[Any | None]" is not assignable to "timedelta64[timedelta | int | None]"

Looking at the types in frame.pyi we have something like:

    @overload
    def __setitem__(
        self,
        idx: (
            int
            | IndexType
            | tuple[int, int]
            | tuple[IndexType, int]
            | tuple[IndexType, IndexType]
            | tuple[int, IndexType]
        ),
        value: (
            Scalar
            | IndexOpsMixin
            | Sequence[Scalar]
            | DataFrame
            | np_ndarray
            | NAType
            | NaTType
            | Mapping[Hashable, Scalar | NAType | NaTType]
            | None
        ),
    ) -> None: ...

I'm guessing that if we have Mapping[Hashable, Scalar | NAType | NaTType], then Sequence[Scalar | NAType | NaTType] should also be allowed?

But it seems like both of these should also accept None?

Or should None be part of Scalar?

Certainly Sequence[Scalar] seems too restrictive.
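To illustrate the point, here is a small typing-only sketch. `ScalarSketch`, `takes_strict` and `takes_optional` are hypothetical names invented for this example; `ScalarSketch` is a cut-down stand-in for the stubs' `Scalar` alias, not the real definition. It shows why a list that may contain `None` fails against `Sequence[Scalar]` and would pass if the element type also allowed `None`.

```python
import datetime as dt
from typing import Optional, Sequence, Union

# Hypothetical, simplified stand-in for the stubs' Scalar alias.
ScalarSketch = Union[str, bytes, dt.date, dt.timedelta, complex, float, int, bool]


def takes_strict(value: Sequence[ScalarSketch]) -> int:
    """Mirrors the current annotation: elements may not be None."""
    return len(value)


def takes_optional(value: Sequence[Optional[ScalarSketch]]) -> int:
    """Mirrors a possible widening: elements may be None. Counts the Nones."""
    return sum(v is None for v in value)


values: list[Optional[int]] = [1, None]

# Fine for a type checker: every element is int or None.
none_count = takes_optional(values)

# A type checker should reject the next line, because None is not
# part of ScalarSketch (left commented out so the file stays clean):
# takes_strict(values)
```

Nothing here changes at runtime; the difference is only in what a checker such as pyright will accept for the `value` argument.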

I had a bit of a look but I have no idea if/where such behaviour is documented in pandas. The closest I could find is https://pandas.pydata.org/docs/user_guide/10min.html#setting which is far from a rigorous description of what types are accepted on the RHS of assignment.

To Reproduce

  1. Run the code above.
  2. Type-check it with pyright.
  3. The error message is shown above.

Please complete the following information:

  • OS: macOS
  • OS version: 14.6.1
  • Python version: 3.11
  • Version of type checker: Pyright 1.1.407
  • Version of installed pandas-stubs: 2.3.3.251201
