Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion requirements.txt
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
# This is used to build the CI test environment and so Github can track dependencies.
xarray==2024.3.0
xarray==2025.1.2
pandas==2.2.3
numpy==1.26.4
matplotlib==3.10.1
Expand All @@ -15,3 +15,4 @@ h5netcdf==1.6.1
impactlab-tools==0.6.0
p_tqdm==1.4.2
pyarrow==19.0.1
numcodecs==0.15.1
3 changes: 3 additions & 0 deletions src/dscim/preprocessing/input_damages.py
Original file line number Diff line number Diff line change
Expand Up @@ -715,6 +715,7 @@ def prep_mortality_damages(

# longest-string gcm has to be processed first so the coordinate is the right str length
gcms = sorted(gcms, key=len, reverse=True)
max_gcm_len = len(gcms[0])

if mortality_version == 0:
scaling_deaths = "epa_scaled"
Expand Down Expand Up @@ -808,6 +809,8 @@ def prep(
if damages[v].dtype == object:
damages[v] = damages[v].astype("unicode")

damages["gcm"] = damages["gcm"].astype("U" + str(max_gcm_len))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like the xarray previously automatically updated the datatype when appending additional GCMs. With the update to xarray 2025.1.2 this broke and the code was throwing an error when trying to append a GCM with shorter name. This change ensures that all GCMs use the same datatype.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

astype("unicode") is pretty old (like, python 2) way of handling this because python strings at the time were not natively unicode. Would swapping all these over to astype("str") fix this and hopefully not create trouble elsewhere? I think this would use a variable-length string representation -- meaning you don't need all this manual checking for max string length in a GCM name.

if i == 0:
damages.to_zarr(
f"{outpath}/impacts-darwin-montecarlo-damages-v{mortality_version}.zarr",
Expand Down
4 changes: 2 additions & 2 deletions tests/test_main_recipe.py
Original file line number Diff line number Diff line change
Expand Up @@ -258,11 +258,11 @@ def test_weitzman_min(menu_instance):
diff = censored_values_slower[0:index] - censored_values[0:index]
assert all(x > 0 for x in diff)
as_share = menu_instance.weitzman_min(
[20, 50, 100], [5, 30, 60], as_share
np.array([20, 50, 100]), np.array([5, 30, 60]), as_share
) # the first value only should be censored
assert as_share[0] != 5 and [as_share[1], as_share[2]] == [30, 60]
all_cons = menu_instance.weitzman_min(
[20, 50, 100], [5, 30, 60], all_cons
np.array([20, 50, 100]), np.array([5, 30, 60]), all_cons
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A test was failing because of the default python lists. Nowhere in the code do we pass the weitzman_min function with anything except an xarray array, so I thought it would be okay to "lose" the functionality. And in fact the docstring of the weitzman_min function says it expects an xarray array anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems good enough, especially if that's how this is documented. A more involved workaround may be another issue/PR. What was it erroring on? Do you have a copy of the error anywhere?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's the error:

>       as_share = menu_instance.weitzman_min(
            [20, 50, 100], [5, 30, 60], as_share
        )  # the first value only should be censored

tests/test_main_recipe.py:260:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/home/jonahmgilbert/anaconda3/envs/dscim-testing/lib/python3.12/site-packages/dscim/menu/main_recipe.py:910: in weitzman_min
    clipped_cons = xr.where(
/home/jonahmgilbert/anaconda3/envs/dscim-testing/lib/python3.12/site-packages/xarray/computation/computation.py:737: in where
    result = apply_ufunc(
/home/jonahmgilbert/anaconda3/envs/dscim-testing/lib/python3.12/site-packages/xarray/computation/apply_ufunc.py:1280: in apply_ufunc
    return apply_array_ufunc(func, *args, dask=dask)
/home/jonahmgilbert/anaconda3/envs/dscim-testing/lib/python3.12/site-packages/xarray/computation/apply_ufunc.py:891: in apply_array_ufunc
    return func(*args)
/home/jonahmgilbert/anaconda3/envs/dscim-testing/lib/python3.12/site-packages/xarray/core/duck_array_ops.py:387: in where
    return xp.where(condition, *as_shared_dtype([x, y], xp=xp))
/home/jonahmgilbert/anaconda3/envs/dscim-testing/lib/python3.12/site-packages/xarray/core/duck_array_ops.py:281: in as_shared_dtype
    dtype = dtypes.result_type(*scalars_or_arrays, xp=xp)
/home/jonahmgilbert/anaconda3/envs/dscim-testing/lib/python3.12/site-packages/xarray/core/dtypes.py:267: in result_type
    array_api_compat.result_type(preprocess_types(t), xp=xp)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

xp = <module 'numpy' from '/home/jonahmgilbert/anaconda3/envs/dscim-testing/lib/python3.12/site-packages/numpy/__init__.py'>
arrays_and_dtypes = ([5, 30, 60],)

    def result_type(*arrays_and_dtypes, xp) -> np.dtype:
        if xp is np or any(
            isinstance(getattr(t, "dtype", t), np.dtype) for t in arrays_and_dtypes
        ):
>           return xp.result_type(*arrays_and_dtypes)
E           TypeError: Field elements must be 2- or 3-tuples, got '5'

/home/jonahmgilbert/anaconda3/envs/dscim-testing/lib/python3.12/site-packages/xarray/compat/array_api_compat.py:44: TypeError

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool. Thanks.

I think the original fix is fine.

The other aggressive approach might be to cast no_cc_consumption and cc_consumption to xr.DataArray at the start of the function like:

no_cc_consumption = xr.DataArray(no_cc_consumption)
cc_consumption = xr.DataArray(cc_consumption)

That might have side-effects for edge cases and it feels kinda magical for an internal-ish function. Could be a solution if this PR's fix accidentally breaks things.

) # all values should be changed
assert all([x != 0 for x in all_cons - [5, 30, 60]])

Expand Down