Comments (15)
from eelbrain.
Hello @christianbrodbeck. Thanks a lot for your response. I made the change you suggested, but still cannot get the assertion to be true. What else could I be doing wrong?
# Author: Christian Brodbeck <[email protected]>
# sphinx_gallery_thumbnail_number = 4
import os
from scipy.io import loadmat
import mne
from eelbrain import *
import numpy as np
# Load the mTRF speech dataset and convert data to NDVars
root = mne.datasets.mtrf.data_path()
speech_path = os.path.join(root, 'speech_data.mat')
mdata = loadmat(speech_path)
# Time axis
tstep = 1. / mdata['Fs'][0, 0]
n_times = mdata['envelope'].shape[0]
time = UTS(0, tstep, n_times)
# Load the EEG sensor coordinates (drop fiducials coordinates, which are stored
# after sensor 128)
sensor = Sensor.from_montage('biosemi128')[:128]
# Frequency dimension for the spectrogram
band = Scalar('frequency', range(16))
# Create variables
envelope = NDVar(mdata['envelope'][:, 0], (time,), name='envelope')
eeg = NDVar(mdata['EEG'], (time, sensor), name='EEG', info={'unit': 'µV'})
spectrogram = NDVar(mdata['spectrogram'], (time, band), name='spectrogram')
# Exclude a bad channel
eeg = eeg[sensor.index(exclude='A13')]
# model eeg by boosting the speech envelope
res = boosting(eeg, envelope, -0.100, 0.400, basis=0.100, partitions=4)
# convolve the filters with the envelope to obtain the model prediction
prediction = convolve(res.h_scaled, envelope)
# calculate the residual error "by hand"
res_error_hand = np.sum((prediction.get_data(['time', 'sensor']) - eeg.get_data(['time', 'sensor'])) ** 2, axis=0)
# assert that res.residual and the residual obtained by hand are the same
assert np.allclose(res_error_hand, res.residual.get_data(['sensor']))
The residual (as all fit metrics) is calculated after normalization, so you'd have to take that normalization into account.
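To make "after normalization" concrete, here is a plain-NumPy sketch with synthetic data (not the actual eelbrain code): with error='l2', y is z-scored before fitting, so the residual is a sum of squared errors in standard-deviation units, and differs from the raw-unit residual by a factor of y.std() ** 2.

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(50.0, 10.0, size=1000)            # raw signal, arbitrary units
prediction_raw = y + rng.normal(0.0, 1.0, 1000)  # a fake prediction in raw units

# z-score both, the way scale_data does for error='l2'
y_norm = (y - y.mean()) / y.std()
pred_norm = (prediction_raw - y.mean()) / y.std()

# residual in normalized units vs. residual in raw units
res_norm = np.sum((y_norm - pred_norm) ** 2)
res_raw = np.sum((y - prediction_raw) ** 2)

# they differ exactly by the squared scale factor
assert np.allclose(res_norm * y.std() ** 2, res_raw)
```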
Hello @christianbrodbeck, thanks for the reply.
What should be normalized, and at what stage? Along which dimensions/features? Thanks for clarifying!
from eelbrain.
See the scale_data boosting parameter. I'll add this; will this be clear:
scale_data : bool | 'inplace'
Scale ``y`` and ``x`` before boosting: subtract the mean and divide by
the standard deviation (when ``error='l2'``) or the mean absolute
value (when ``error='l1'``). Use ``'inplace'`` to save memory by scaling
the original objects specified as ``y`` and ``x`` instead of making a
copy. Scale is stored in the :class:`BoostingResult: :attr:`.y_mean``,
:attr:`.y_scale`, :attr:`.x_mean`, and :attr:`.x_scale` attributes.
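For concreteness, the two scaling variants described in the docstring amount to the following (a NumPy sketch of the documented behavior, not the actual eelbrain code):

```python
import numpy as np

x = np.random.default_rng(1).normal(2.0, 3.0, size=500)

# error='l2': subtract the mean, divide by the standard deviation
x_l2 = (x - x.mean()) / x.std()

# error='l1': subtract the mean, divide by the mean absolute value
x_l1 = (x - x.mean()) / np.abs(x).mean()
```

After l2 scaling the data has zero mean and unit standard deviation; per the docstring, the means and scales used here are what end up stored in the .y_mean/.y_scale and .x_mean/.x_scale attributes.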
Thank you @christianbrodbeck. I think I now better understand what is going on. I assume the whole point of getting both h and h_scaled is so that you can directly apply h_scaled to the original x. In contrast, the scale would have to be manually applied to y. I tried just that, but the numbers are still off. Maybe I'm still doing something wrong.
# Author: Christian Brodbeck <[email protected]>
# sphinx_gallery_thumbnail_number = 4
import os
from scipy.io import loadmat
import mne
from eelbrain import *
import numpy as np
# Load the mTRF speech dataset and convert data to NDVars
root = mne.datasets.mtrf.data_path()
speech_path = os.path.join(root, 'speech_data.mat')
mdata = loadmat(speech_path)
# Time axis
tstep = 1. / mdata['Fs'][0, 0]
n_times = mdata['envelope'].shape[0]
time = UTS(0, tstep, n_times)
# Load the EEG sensor coordinates (drop fiducials coordinates, which are stored
# after sensor 128)
sensor = Sensor.from_montage('biosemi128')[:128]
# Frequency dimension for the spectrogram
band = Scalar('frequency', range(16))
# Create variables
envelope = NDVar(mdata['envelope'][:, 0], (time,), name='envelope')
eeg = NDVar(mdata['EEG'], (time, sensor), name='EEG', info={'unit': 'µV'})
spectrogram = NDVar(mdata['spectrogram'], (time, band), name='spectrogram')
# Exclude a bad channel
eeg = eeg[sensor.index(exclude='A13')]
# model eeg by boosting the speech envelope
res = boosting(eeg, envelope, -0.100, 0.400, basis=0.100, partitions=4)
# convolve the filters with the envelope to obtain the model prediction
prediction = convolve(res.h_scaled, envelope)
# calculate the residual error "by hand"
res_error_hand = np.sum((prediction.get_data(['time', 'sensor'])
                         - (eeg.get_data(['time', 'sensor']) - res.y_mean.get_data(['sensor']))
                         / res.y_scale.get_data(['sensor'])) ** 2, axis=0)
# assert that res.residual and the residual obtained by hand are the same
assert np.allclose(res_error_hand, res.residual.get_data(['sensor']))
Also, given what we have discussed so far, I would have expected these to give the same result, but they don't.
# convolve the filters with the envelope to obtain the model prediction
prediction_hand = convolve(res.h, (envelope - res.x_mean) / res.x_scale)
prediction = convolve(res.h_scaled, envelope)
assert np.allclose(prediction_hand.get_data(['sensor', 'time']), prediction.get_data(['sensor', 'time']) * 1000)
I feel like there's something very fundamental that I'm missing.
Hi @iranroman,
Prediction
according to my understanding, the prediction should be:
prediction = convolve(res.h_scaled, (envelope - res.x_mean)) + res.y_mean
and that matches the hand-calculated prediction:
prediction_hand = convolve(res.h, (envelope - res.x_mean) / res.x_scale) * res.y_scale + res.y_mean
assert np.allclose(prediction, prediction)
Residual
However, according to the documentation, the residual should be:
error = ((eeg - res.y_mean) - convolve(res.h_scaled, (envelope - res.x_mean))) / res.y_scale
residual = (error ** 2).sum('time')
but this does not match res.residual. Maybe this residual is the residual from the data-fitting phase (evaluated separately on the partitions and averaged) @christianbrodbeck?
Hello, thank you @proloyd for your response.
This line seems rather complicated to me:
prediction_hand = convolve(res.h, (envelope - res.x_mean) / res.x_scale) * res.y_scale + res.y_mean
Let me try to break this down:
convolve(res.h, (envelope - res.x_mean) / res.x_scale)
This expression convolves the unscaled TRF filters with the standardized envelope (?). I'm assuming here that res.x_scale is the envelope's standard deviation, but res.x_scale == 0.0114 and np.std(np.array(envelope)) == 0.0123, so I'm not sure whether I'm misunderstanding something there (please answer at least about this bit if you have time).
This is then multiplied by res.y_scale, and res.y_mean is added, in order to "unstandardize" the prediction and bring it back to the units of the original target (in this case the variable called eeg; by the way, here both np.mean(eeg.get_data(['time', 'sensor']), axis=0) == res.y_mean.get_data(['sensor']) and np.std(eeg.get_data(['time', 'sensor']), axis=0) == res.y_scale.get_data(['sensor']) are True).
With all of this said, I would have expected a way to say something along the lines of convolve(res.h, envelope) to get the model's predictions in an "inference mode" kind of fashion. However, I can live with this.
P.S.: I think in the second line of code you meant assert np.allclose(prediction, prediction_hand) instead of assert np.allclose(prediction, prediction).
@christianbrodbeck , could you please provide me with an example for how to use BoostingResult
in combination with convolve
to apply it to a time-series signal and obtain a prediction that I can directly compare with EEG data? Thank you!
I'm trying to incorporate your boosting code as evaluation metric in the optimization routine of a model that predicts speech envelopes. Cheers!
would it be using
convolve(res.h_scaled, (envelope - res.x_mean))*res.y_scale + res.y_mean
?
That's what would make sense to me, but without being able to replicate the residuals, I cannot corroborate the operations I'm trying to do "by hand". Thanks!
> Would it be using convolve(res.h_scaled, (envelope - res.x_mean)) * res.y_scale + res.y_mean? That's what would make sense to me, but without being able to replicate the residuals, I cannot corroborate the operations I'm trying to do "by hand". Thanks!
What you want is:
prediction = convolve(res.h_scaled, (envelope - res.x_mean)) + res.y_mean
This aligns with the fact that res.h_scaled = res.h * res.y_scale / res.x_scale.
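That identity can be checked with synthetic numbers. In this sketch, np.convolve stands in for eelbrain's convolve, and the kernel, signal, and scale values are all made up. Because convolution is linear, dividing the input by x_scale and multiplying the output by y_scale is the same as folding the factor y_scale / x_scale into the kernel:

```python
import numpy as np

rng = np.random.default_rng(2)
envelope = rng.normal(0.1, 0.01, size=2000)   # fake stimulus
h = rng.normal(size=50)                       # fake TRF in normalized units
x_mean, x_scale = envelope.mean(), envelope.std()
y_mean, y_scale = 12.0, 7.5                   # stand-ins for res.y_mean, res.y_scale

h_scaled = h * y_scale / x_scale              # the identity from the comment above

# prediction from the normalized kernel on the standardized input, rescaled to raw units ...
pred_hand = np.convolve((envelope - x_mean) / x_scale, h, 'same') * y_scale + y_mean
# ... equals the scaled kernel applied to the merely centered input, plus the mean
pred = np.convolve(envelope - x_mean, h_scaled, 'same') + y_mean
assert np.allclose(pred_hand, pred)
```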
How big is your difference? Could it be due to numerical precision?
For example, you may be calculating it at a different scale than the actual code, and the SS is calculated using eelbrain._trf._boosting_opt.l2.
For reference, the actual calculation is here:
Eelbrain/eelbrain/_trf/_boosting.py
Line 761 in e81c439
The L2 evaluator is here:
Eelbrain/eelbrain/_trf/_fit_metrics.py
Line 74 in e81c439
I tried prediction = convolve(res.h_scaled, (envelope - res.x_mean)) + res.y_mean
and this is how the prediction compares visually to the original EEG. I could be wrong, but the plots look like they are off by an order of magnitude. Or does this look right to you @christianbrodbeck @proloyd? Thanks again!
Hard to say from this; boosting is pretty conservative. Notice that the visible features of the response are not in the prediction, so they are probably noise, not signal.