Add several subclasses of Step/Generic - Make interactive usage succinct #30

@PeterDSteinberg

I made this notebook in elm with Landsat data. It shows usage patterns of Step from xarray_filters and some improvements we could use.

Implement and test the following ideas:

Step using a "for_each_array" decorated callable

class ForEachStep(Step):
    keep_attrs = True
    func = None
    pass_attrs = False

    def transform(self, X, y=None, **kw):
        kw = kw.copy()
        kw.update(self.get_params(deep=True).copy())
        # TODO here should we filter args and kwargs (see func_signatures.py in xarray_filters)
        # Pop the control flags before calling func so they are not forwarded to it.
        keep_attrs = kw.pop('keep_attrs', True)
        if kw.pop('pass_attrs', False):
            kw['attrs'] = X.attrs
        dset = kw.pop('func')(X, **kw)
        if keep_attrs:
            dset.attrs.update(X.attrs)
        return dset

Would be used with a function like this:

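import numpy as np

@for_each_array
def set_nans(arr):
    # Work on a float copy so the input array is left untouched.
    arr = arr.copy(deep=True)
    arr.values = arr.values.astype(np.float32)
    # np.nan is the portable spelling; the np.NaN alias was removed in NumPy 2.0.
    arr.values[arr.values <= 1] = np.nan
    arr.values[arr.values == 2**16] = np.nan
    return arr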
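
To make the per-array idea concrete without depending on xarray, here is a minimal sketch of what a decorator in the spirit of for_each_array could do. The names for_each_value and mask_invalid are hypothetical, plain dicts and lists stand in for a dataset and its arrays, and None stands in for NaN. This is not the xarray_filters implementation.

```python
import functools


def for_each_value(func):
    """Hypothetical stand-in for for_each_array: apply func to every value of a mapping."""
    @functools.wraps(func)
    def wrapper(mapping, **kwargs):
        # Call the wrapped function once per named array and rebuild the mapping.
        return {name: func(values, **kwargs) for name, values in mapping.items()}
    return wrapper


@for_each_value
def mask_invalid(values, low=1, high=2**16):
    # Replace samples that are <= low or == high with None (standing in for NaN).
    return [None if v <= low or v == high else float(v) for v in values]


bands = {"layer_1": [0, 5, 65536], "layer_2": [2, 1, 7]}
masked = mask_invalid(bands)
```

The decorated function is written against a single array, and the decorator takes care of looping over the data variables.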

And then using ForEachStep as a base class with parameters:

class SetNaNs(ForEachStep):
    func = set_nans  # <-- currently fails

Or by calling the ForEachStep constructor directly:

ForEachStep(func=set_nans).fit_transform(dset)
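
One general Python behavior worth keeping in mind when a function is stored as a class attribute: a plain function becomes a bound method when it is looked up on an instance, so it receives the instance as an extra first argument. Whether this contributes to the failure above is not established here; the sketch below only illustrates the language behavior, using hypothetical names (double, PlainAttr, StaticAttr).

```python
def double(values):
    return [2 * v for v in values]


class PlainAttr:
    func = double  # plain function: bound to the instance on attribute access


class StaticAttr:
    func = staticmethod(double)  # stored without binding


data = [1, 2, 3]

try:
    PlainAttr().func(data)  # effectively double(instance, data)
    outcome = "ok"
except TypeError:
    outcome = "TypeError"

static_result = StaticAttr().func(data)
```

Reading the attribute from the class dictionary (for example via vars(cls)) is another way to get the function back without binding.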

Step using a "data_vars_func" decorated callable

Make something like this:

class DataVarsStep(Step):
    # func should have a signature of **data_vars (expecting the data_vars of "X")
    func = None

    def transform(self, X, y=None, **kw):
        kw = kw.copy()
        kw.update(self.get_params(deep=True).copy())
        # TODO here should we filter args and kwargs (see func_signatures.py in xarray_filters)
        return kw.pop('func')(dset=X)

To be used with functions like normalized_diffs:

def normed_diff(a, b):
    return (a - b) / (a + b)

@data_vars_func
def normalized_diffs(**dset):
    print('Called with ', dset.keys())
    dset['ndwi'] = normed_diff(dset['layer_4'], dset['layer_5'])
    dset['ndvi'] = normed_diff(dset['layer_5'], dset['layer_4'])
    dset['ndsi'] = normed_diff(dset['layer_2'], dset['layer_6'])
    dset['nbr']  = normed_diff(dset['layer_4'], dset['layer_7'])
    return dset

Use as follows:

class NormedDiff(DataVarsStep):
    func = normalized_diffs  # currently fails for the same reason as SetNaNs above

Or like this with constructor:

DataVarsStep(func=normalized_diffs).fit_transform(dset)
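
As a rough illustration of the **data_vars calling convention, the sketch below uses a hypothetical unpack_vars decorator that unpacks a mapping into keyword arguments and returns the result as a new mapping. Plain floats stand in for arrays, and only one index is computed. It is an assumption about the general shape of such a decorator, not the data_vars_func implementation from xarray_filters.

```python
import functools


def unpack_vars(func):
    """Hypothetical stand-in for data_vars_func: call func(**dset) and return a dict."""
    @functools.wraps(func)
    def wrapper(dset):
        # Unpack the named variables as keyword arguments, then repack the result.
        return dict(func(**dset))
    return wrapper


def normed_diff(a, b):
    return (a - b) / (a + b)


@unpack_vars
def add_ndvi(**dset):
    # dset is a fresh dict of keyword arguments, so adding a key is safe.
    dset["ndvi"] = normed_diff(dset["layer_5"], dset["layer_4"])
    return dset


source = {"layer_4": 2.0, "layer_5": 6.0}
result = add_ndvi(dset=source)
```

Because keyword unpacking creates a new dict, the caller's mapping is not modified by the added key.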

Fix the descriptor pattern in Generic/Step

  • First, consider Generic vs. Step in xarray.pipeline.py: do we need both? If so, how should their differences be explained to the end user?
  • Make sure Generic / Step can use any type of data in the descriptor pattern that builds the parameter list for the step. For example, the snippet below fails because layers is a list and func is a callable, but runs fine if both are set to None. Allowing any data type in the descriptor pattern would make things easier, because the user would not have to pass func=something, layers=something on initialization (ChooseBands is just an example specific to a Landsat notebook).
class ChooseBands(Generic):                   # TODO - this section should work but currently doesn't
    include_normed_diffs = True
    layers = DEFAULT_LAYERS
    func = choose_bands
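
One possible way to let class attributes of any type act as step parameters is to read them from the raw class dictionaries and overlay constructor keyword arguments. The sketch below is a self-contained illustration under that assumption: ParamStep is a hypothetical stand-in for Generic/Step, the choose_bands signature is invented for the example, and plain floats stand in for arrays. It is not how elm currently builds its parameter list.

```python
import copy


class ParamStep:
    """Hypothetical base class: public class attributes of subclasses are parameters."""

    def __init__(self, **params):
        unknown = set(params) - set(self._class_defaults())
        if unknown:
            raise TypeError(f"unexpected parameters: {sorted(unknown)}")
        self._overrides = params

    @classmethod
    def _class_defaults(cls):
        defaults = {}
        # Walk from the most generic class to the most specific one so that
        # subclasses override their parents.
        for klass in reversed(cls.__mro__):
            if klass in (object, ParamStep):
                continue
            for name, value in vars(klass).items():
                # Skip private names and anything that overrides the base API.
                if name.startswith("_") or hasattr(ParamStep, name):
                    continue
                defaults[name] = value  # raw value: functions are not bound
        return defaults

    def get_params(self):
        # Deep-copy so mutable defaults (lists, dicts) are not shared.
        params = copy.deepcopy(self._class_defaults())
        params.update(self._overrides)
        return params

    def transform(self, X):
        params = self.get_params()
        func = params.pop("func")
        return func(X, **params)


DEFAULT_LAYERS = ["layer_4", "layer_5"]


def choose_bands(data, layers=None, include_normed_diffs=True):
    # Hypothetical signature, for illustration only.
    chosen = {name: data[name] for name in layers}
    if include_normed_diffs:
        a, b = (data[name] for name in layers[:2])
        chosen["diff"] = (a - b) / (a + b)
    return chosen


class ChooseBands(ParamStep):
    include_normed_diffs = True
    layers = DEFAULT_LAYERS
    func = choose_bands


sample = {"layer_4": 1.0, "layer_5": 3.0, "layer_6": 5.0}
result = ChooseBands().transform(sample)
```

With this approach, ChooseBands() works without arguments, and ChooseBands(layers=[...]) still overrides a default for a single instance.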
