Comments (15)

christianbrodbeck avatar christianbrodbeck commented on September 23, 2024

from eelbrain.

iranroman avatar iranroman commented on September 23, 2024

Hello @christianbrodbeck. Thanks a lot for your response. I made the change you suggested, but I still cannot get the assertion to pass. What else could I be doing wrong?

# Author: Christian Brodbeck <[email protected]>
# sphinx_gallery_thumbnail_number = 4
import os

from scipy.io import loadmat
import mne
from eelbrain import *
import numpy as np

# Load the mTRF speech dataset and convert data to NDVars
root = mne.datasets.mtrf.data_path()
speech_path = os.path.join(root, 'speech_data.mat')
mdata = loadmat(speech_path)

# Time axis
tstep = 1. / mdata['Fs'][0, 0]
n_times = mdata['envelope'].shape[0]
time = UTS(0, tstep, n_times)
# Load the EEG sensor coordinates (drop fiducials coordinates, which are stored
# after sensor 128)
sensor = Sensor.from_montage('biosemi128')[:128]
# Frequency dimension for the spectrogram
band = Scalar('frequency', range(16))
# Create variables
envelope = NDVar(mdata['envelope'][:, 0], (time,), name='envelope')
eeg = NDVar(mdata['EEG'], (time, sensor), name='EEG', info={'unit': 'µV'})
spectrogram = NDVar(mdata['spectrogram'], (time, band), name='spectrogram')
# Exclude a bad channel
eeg = eeg[sensor.index(exclude='A13')]

# model eeg by boosting the speech envelope
res = boosting(eeg, envelope, -0.100, 0.400, basis=0.100, partitions=4)

# convolve the filters with the envelope to obtain the model prediction
prediction = convolve(res.h_scaled, envelope)

# calculate the residual error "by hand"
res_error_hand = np.sum((prediction.get_data(['time', 'sensor']) - eeg.get_data(['time', 'sensor'])) ** 2, axis=0)

# assert that res.residual and the residual obtained by hand are the same
assert np.allclose(res_error_hand, res.residual.get_data(['sensor']))

christianbrodbeck avatar christianbrodbeck commented on September 23, 2024

The residual (as all fit metrics) is calculated after normalization, so you'd have to take that normalization into account.
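To see why this matters, here is a minimal numpy sketch (synthetic data, not the eelbrain internals): a sum-of-squares residual computed on z-scored data differs from one computed on the raw scale by exactly the squared scale factor.

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(loc=2.0, scale=5.0, size=1000)   # raw target signal
y_pred = y + rng.normal(scale=0.5, size=1000)   # hypothetical prediction

y_mean, y_scale = y.mean(), y.std()

# residual on the raw scale vs. on the normalized (z-scored) scale
resid_raw = np.sum((y - y_pred) ** 2)
resid_norm = np.sum(((y - y_mean) / y_scale - (y_pred - y_mean) / y_scale) ** 2)

# the two differ by exactly the squared scale factor
assert np.allclose(resid_raw, resid_norm * y_scale ** 2)
```

So a residual computed "by hand" on raw µV data cannot match a residual that was computed on normalized data unless the scaling is undone.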

iranroman avatar iranroman commented on September 23, 2024

Hello @christianbrodbeck, thanks for the reply.

What should be normalized, and at what stage? Along which dimensions/features? Thanks for clarifying.

christianbrodbeck avatar christianbrodbeck commented on September 23, 2024

See the scale_data boosting parameter. I'll add this; will this be clear:

scale_data : bool | 'inplace'
    Scale ``y`` and ``x`` before boosting: subtract the mean and divide by
    the standard deviation (when ``error='l2'``) or the mean absolute
    value (when ``error='l1'``). Use ``'inplace'`` to save memory by scaling
    the original objects specified as ``y`` and ``x`` instead of making a
    copy. Scale is stored in the :class:`BoostingResult` attributes
    :attr:`.y_mean`, :attr:`.y_scale`, :attr:`.x_mean`, and :attr:`.x_scale`.
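As a rough numpy illustration of what that standardization means (the exact order of operations inside eelbrain is an assumption here, in particular whether the mean absolute value is taken before or after demeaning):

```python
import numpy as np

x = np.array([1.0, -2.0, 3.0, -4.0])

# error='l2': subtract the mean, divide by the standard deviation
x_l2 = (x - x.mean()) / x.std()

# error='l1': subtract the mean, divide by the mean absolute value
# (taking it after demeaning is an assumption about the implementation)
x_c = x - x.mean()
x_l1 = x_c / np.abs(x_c).mean()
```

After the l2 scaling the data has unit standard deviation; after the l1 scaling it has unit mean absolute value.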

iranroman avatar iranroman commented on September 23, 2024

Thank you @christianbrodbeck. I think I now better understand what is going on. I assume that the whole point of getting both h and h_scaled is so that you can apply h_scaled directly to the original x; in contrast, the scale would have to be applied manually to y. I tried just that, but the numbers are still off. Maybe I'm still doing something wrong.

# Author: Christian Brodbeck <[email protected]>
# sphinx_gallery_thumbnail_number = 4
import os

from scipy.io import loadmat
import mne
from eelbrain import *
import numpy as np

# Load the mTRF speech dataset and convert data to NDVars
root = mne.datasets.mtrf.data_path()
speech_path = os.path.join(root, 'speech_data.mat')
mdata = loadmat(speech_path)

# Time axis
tstep = 1. / mdata['Fs'][0, 0]
n_times = mdata['envelope'].shape[0]
time = UTS(0, tstep, n_times)
# Load the EEG sensor coordinates (drop fiducials coordinates, which are stored
# after sensor 128)
sensor = Sensor.from_montage('biosemi128')[:128]
# Frequency dimension for the spectrogram
band = Scalar('frequency', range(16))
# Create variables
envelope = NDVar(mdata['envelope'][:, 0], (time,), name='envelope')
eeg = NDVar(mdata['EEG'], (time, sensor), name='EEG', info={'unit': 'µV'})
spectrogram = NDVar(mdata['spectrogram'], (time, band), name='spectrogram')
# Exclude a bad channel
eeg = eeg[sensor.index(exclude='A13')]

# model eeg by boosting the speech envelope
res = boosting(eeg, envelope, -0.100, 0.400, basis=0.100, partitions=4)

# convolve the filters with the envelope to obtain the model prediction
prediction = convolve(res.h_scaled, envelope)

# calculate the residual error "by hand"
res_error_hand = np.sum(
    (prediction.get_data(['time', 'sensor'])
     - (eeg.get_data(['time', 'sensor']) - res.y_mean.get_data(['sensor']))
     / res.y_scale.get_data(['sensor'])) ** 2,
    axis=0)

# assert that res.residual and the residual obtained by hand are the same
assert np.allclose(res_error_hand, res.residual.get_data(['sensor']))

iranroman avatar iranroman commented on September 23, 2024

Also, given what we have discussed so far, I would have expected these to give the same result, but they don't.

# convolve the filters with the envelope to obtain the model prediction
prediction_hand = convolve(res.h, (envelope-res.x_mean)/res.x_scale)
prediction = convolve(res.h_scaled, envelope)

assert np.allclose(prediction_hand.get_data(['sensor', 'time']), prediction.get_data(['sensor', 'time']) * 1000)

I feel like there's something very fundamental that I'm missing.

proloyd avatar proloyd commented on September 23, 2024

Hi @iranroman,

Prediction

according to my understanding, the prediction should be:

prediction = convolve(res.h_scaled, (envelope - res.x_mean)) + res.y_mean

and that matches with hand calculated prediction:

prediction_hand = convolve(res.h, (envelope - res.x_mean) / res.x_scale) * res.y_scale + res.y_mean
assert np.allclose(prediction, prediction)

Residual

However, according to documentation, the residual should be:

error = ((eeg - res.y_mean) - convolve(res.h_scaled, (envelope - res.x_mean))) / res.y_scale
residual = (error ** 2).sum('time')

but this does not match res.residual. Maybe this residual is the one from the data-fitting phase (evaluated separately on the partitions and averaged) @christianbrodbeck?

iranroman avatar iranroman commented on September 23, 2024

Hello, thank you @proloyd for your response.

This line seems rather complicated to me:

prediction_hand = convolve(res.h, (envelope - res.x_mean) / res.x_scale) * res.y_scale + res.y_mean

Let me try to break this down:

convolve(res.h, (envelope - res.x_mean) / res.x_scale)

This expression convolves the unscaled TRF filters with the standardized envelope (?). I'm assuming here that res.x_scale is the envelope's standard deviation, but res.x_scale==0.0114 while np.std(np.array(envelope))==0.0123, so I'm not sure whether I'm misunderstanding something there (please answer at least about this bit if you have time).

This is then multiplied by res.y_scale, and res.y_mean is added, in order to "unstandardize" the prediction and bring it back to the units of the original target (here, the variable called eeg; by the way, both np.mean(eeg.get_data(['time','sensor']), axis=0)==res.y_mean.get_data(['sensor'])
and np.std(eeg.get_data(['time','sensor']), axis=0)==res.y_scale.get_data(['sensor']) are True).

With all of this said, I would have expected a way to say something along the lines of convolve(res.h, envelope) to get the model's predictions in an "inference mode" kind of fashion. However, I can live with this.

P.S.: I think in the second line of code you meant assert np.allclose(prediction, prediction_hand) instead of assert np.allclose(prediction, prediction)

iranroman avatar iranroman commented on September 23, 2024

@christianbrodbeck, could you please provide an example of how to use BoostingResult together with convolve to apply a fitted model to a time-series signal and obtain a prediction that I can directly compare with EEG data? Thank you!

I'm trying to incorporate your boosting code as an evaluation metric in the optimization routine of a model that predicts speech envelopes. Cheers!

iranroman avatar iranroman commented on September 23, 2024

would it be using

convolve(res.h_scaled, (envelope - res.x_mean))*res.y_scale + res.y_mean

?

That's what would make sense to me, but without being able to replicate the residuals, I cannot corroborate the operations I'm trying to do "by hand". Thanks!

proloyd avatar proloyd commented on September 23, 2024

would it be using

convolve(res.h_scaled, (envelope - res.x_mean))*res.y_scale + res.y_mean

?

That's what would make sense to me, but without being able to replicate the residuals, I cannot corroborate the operations I'm trying to do "by hand". Thanks!

What you want is:
prediction = convolve(res.h_scaled, (envelope - res.x_mean)) + res.y_mean

This aligns with the fact that res.h_scaled = res.h * res.y_scale / res.x_scale.
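That identity can be checked numerically with plain numpy (a synthetic kernel and input stand in for res.h and the envelope, and np.convolve stands in for eelbrain's convolve): applying the normalized kernel to the standardized input and rescaling gives the same result as applying the scaled kernel to the demeaned input.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=200)          # stand-in for the envelope
h = rng.normal(size=20)           # stand-in for res.h (kernel fit on normalized data)
x_mean, x_scale = x.mean(), x.std()
y_mean, y_scale = 3.0, 7.0        # hypothetical res.y_mean / res.y_scale

h_scaled = h * y_scale / x_scale  # the res.h_scaled relation quoted above

# route 1: normalized kernel on the standardized input, then rescaled
pred_a = np.convolve((x - x_mean) / x_scale, h, mode='full') * y_scale + y_mean
# route 2: scaled kernel on the demeaned input, plus the y mean
pred_b = np.convolve(x - x_mean, h_scaled, mode='full') + y_mean

assert np.allclose(pred_a, pred_b)
```

The equivalence is just linearity of convolution: the 1/x_scale and y_scale factors can be folded into the kernel.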

christianbrodbeck avatar christianbrodbeck commented on September 23, 2024

How big is your difference? Could it be due to numerical precision?

For example, you're calculating it at a different scale than the actual code does, and the sum of squares is computed by eelbrain._trf._boosting_opt.l2
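A quick numpy check of how much pure floating-point rounding contributes when the same sum of squares is evaluated at different scales (synthetic numbers, unrelated to the dataset): the two results are mathematically identical but need not be bit-identical, so a tolerance-based comparison is appropriate.

```python
import numpy as np

rng = np.random.default_rng(2)
err = rng.normal(size=100_000)
scale = 1e-3  # e.g. data stored in volts rather than millivolts

ss_norm = np.sum(err ** 2)                        # on the normalized scale
ss_raw = np.sum((err * scale) ** 2) / scale ** 2  # at the raw scale, rescaled

# rounding can leave a tiny gap, so compare with a relative tolerance
assert np.allclose(ss_raw, ss_norm)
```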

For reference, the actual calculation is here:

convolve(h[i_y], self.data.x, self.data.x_pads, self._i_start, segments, y_pred_i)

The L2 evaluator is here:

class L2(Evaluator):

iranroman avatar iranroman commented on September 23, 2024

[image: the predicted time course plotted against the original EEG]

I tried prediction = convolve(res.h_scaled, (envelope - res.x_mean)) + res.y_mean and this is how the prediction compares visually to the original EEG. I could be wrong, but the plots look like they are off by an order of magnitude. Or does this look right to you @christianbrodbeck @proloyd? Thanks again!

christianbrodbeck avatar christianbrodbeck commented on September 23, 2024

Hard to say from this; boosting is pretty conservative. Notice that the visible features of the response are not in the prediction, so they are probably noise, not signal.
