Whitening source step produces matrix with complex-valued entries  #4

@xemcerk

Thank you for producing such inspiring work. I'm trying to apply the CORAL algorithm to NIR (near-infrared) calibration, but I've run into an unexpected issue. After the 'whitening source' step, the resulting matrix has complex-valued entries. I need to train a regressor such as PLS on the result, and it does not support complex numbers. Presumably some factors that differ from your experimental setting account for the problem:

  1. The distribution of NIR calibration data may differ from that of vision data. Each sample corresponds to a real-life object, mostly fruit, minerals, etc. The data is produced by an instrument that measures how much light the object absorbs at different wavelengths. So it is still in (n samples × m features) format, where the m features are wavelengths sampled from a certain range.
  2. The covariance matrix of the NIR calibration data has no negative values.
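
Regarding point 2: whether the entries of a covariance matrix are non-negative is a separate question from whether its eigenvalues are positive, and a real matrix square root depends on the latter. The following is a minimal diagnostic sketch, not part of the original code. It uses synthetic, strongly correlated, non-negative data as a hypothetical stand-in for NIR spectra (the array `X` and its shape are assumptions) and checks the entries, the symmetry, and the smallest eigenvalue of the regularized covariance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for NIR spectra: 80 samples x 200 features,
# all non-negative and strongly correlated across features.
base = rng.random((80, 1))
X = base + 0.01 * rng.random((80, 200))

# Same construction as in the issue: covariance plus identity.
cov = np.cov(X, rowvar=False) + np.eye(X.shape[1])

# eigvalsh assumes a symmetric matrix and returns real eigenvalues.
eigvals = np.linalg.eigvalsh(cov)

print("all entries non-negative:", bool((cov >= 0).all()))
print("symmetric:", bool(np.allclose(cov, cov.T)))
print("smallest eigenvalue:", eigvals.min())
```

In exact arithmetic, adding the identity to a covariance matrix makes every eigenvalue at least 1, so real square roots and inverse square roots exist. If the same check passes on the real data, any imaginary parts are presumably an artifact of the numerical routine rather than a property of the data.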

I'd really appreciate any suggestions regarding the math or the code. Thanks a lot!

Here's the code I use, implemented in Python.

import numpy as np
import scipy.io
import scipy.linalg
from sklearn.metrics import mean_squared_error
import sklearn.neighbors
from sklearn.cross_decomposition import PLSRegression
import matplotlib.pyplot as plt


class CORAL:
    def __init__(self):
        super(CORAL, self).__init__()

    def fit(self, Xs, Xt):
        '''
        Perform CORAL on the source domain features
        :param Xs: ns * n_feature, source feature
        :param Xt: nt * n_feature, target feature
        :return: New source domain features
        '''
        cov_src = np.cov(Xs.T) + np.eye(Xs.shape[1])
        cov_tar = np.cov(Xt.T) + np.eye(Xt.shape[1])
        A_coral = np.dot(scipy.linalg.fractional_matrix_power(cov_src, -0.5),
                         scipy.linalg.fractional_matrix_power(cov_tar, 0.5))
        Xs_new = np.dot(Xs, A_coral)
        return Xs_new

    def fit_predict(self, Xs, Ys, Xt, Yt):
        '''
        Perform CORAL, then fit a PLS regressor on the aligned source features
        and predict on the target features
        :param Xs: ns * n_feature, source feature
        :param Ys: ns * 1, source label
        :param Xt: nt * n_feature, target feature
        :param Yt: nt * 1, target label
        :return: root mean squared error and predicted values
        '''
        Xs_new = self.fit(Xs, Xt)
        pls = PLSRegression()
        pls.fit(Xs_new, Ys)
        y_pred = pls.predict(Xt)
        mse = mean_squared_error(Yt, y_pred)
        rmse = np.sqrt(mse)
        return rmse, y_pred

if __name__ == '__main__':
    corn = scipy.io.loadmat('../input/corn.mat')
    Xs = corn['mp5spec']['data'][0][0]
    Xt = corn['m5spec']['data'][0][0]
    y = corn['propvals']['data'][0][0]
    coral = CORAL()
    rmse, y_pred = coral.fit_predict(Xs, y, Xt, y)

And the dataset can be downloaded HERE
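
One possible way to keep the transform real-valued is sketched below. It is an assumption-laden alternative, not the method from the original CORAL code: since the regularized covariance matrices are symmetric positive definite, their fractional powers can be computed from a symmetric eigendecomposition (`np.linalg.eigh`) instead of `scipy.linalg.fractional_matrix_power`. The helper names `sym_matrix_power` and `coral_align`, the `reg` parameter, and the synthetic data are all hypothetical.

```python
import numpy as np


def sym_matrix_power(C, p):
    """Real power of a symmetric positive definite matrix via eigh.

    Assumes C is symmetric positive definite; non-positive eigenvalues
    would produce NaN or inf for fractional or negative p.
    """
    C = (C + C.T) / 2.0            # remove tiny asymmetries from rounding
    w, V = np.linalg.eigh(C)       # real eigenvalues, orthonormal eigenvectors
    return (V * w**p) @ V.T        # V @ diag(w**p) @ V.T


def coral_align(Xs, Xt, reg=1.0):
    """Whiten the source features, then re-color them with the target covariance."""
    d = Xs.shape[1]
    cov_src = np.cov(Xs, rowvar=False) + reg * np.eye(d)
    cov_tar = np.cov(Xt, rowvar=False) + reg * np.eye(d)
    A = sym_matrix_power(cov_src, -0.5) @ sym_matrix_power(cov_tar, 0.5)
    return Xs @ A


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Xs = rng.random((30, 50))        # synthetic source features
    Xt = 2.0 * rng.random((40, 50))  # synthetic target features
    Xs_new = coral_align(Xs, Xt)
    print(Xs_new.dtype, Xs_new.shape)
```

With real floating-point inputs, every operation here stays in real arithmetic, so the output can be passed directly to a regressor such as `PLSRegression`. Whether the result matches the original implementation closely enough for a given dataset would need to be checked separately.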
