skorch-pytorch-wrapper's Issues

Scalar type Runtime Error for Loss function?

Hey Fernando,

thank you very much for this interesting and helpful wrapper around PyTorch and skorch.
I am running into some difficulties with the GridSearchCV and Pipeline settings, since I am new to machine learning.

I tried to change your code so that it fits my data.
I am using your grid_search_pipeline_training method. Now I get the following RuntimeError:
```
Traceback (most recent call last):
  File "C:\Users\Florian\anaconda3\envs\jlhiwi\lib\site-packages\sklearn\model_selection\_validation.py", line 681, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "C:\Users\Florian\anaconda3\envs\jlhiwi\lib\site-packages\sklearn\pipeline.py", line 394, in fit
    self._final_estimator.fit(Xt, y, **fit_params_last_step)
  File "C:\Users\Florian\anaconda3\envs\jlhiwi\lib\site-packages\skorch\classifier.py", line 142, in fit
    return super(NeuralNetClassifier, self).fit(X, y, **fit_params)
  File "C:\Users\Florian\anaconda3\envs\jlhiwi\lib\site-packages\skorch\net.py", line 917, in fit
    self.partial_fit(X, y, **fit_params)
  File "C:\Users\Florian\anaconda3\envs\jlhiwi\lib\site-packages\skorch\net.py", line 876, in partial_fit
    self.fit_loop(X, y, **fit_params)
  File "C:\Users\Florian\anaconda3\envs\jlhiwi\lib\site-packages\skorch\net.py", line 789, in fit_loop
    self.run_single_epoch(dataset_train, training=True, prefix="train",
  File "C:\Users\Florian\anaconda3\envs\jlhiwi\lib\site-packages\skorch\net.py", line 826, in run_single_epoch
    step = step_fn(Xi, yi, **fit_params)
  File "C:\Users\Florian\anaconda3\envs\jlhiwi\lib\site-packages\skorch\net.py", line 723, in train_step
    self.optimizer_.step(step_fn)
  File "C:\Users\Florian\anaconda3\envs\jlhiwi\lib\site-packages\torch\optim\optimizer.py", line 88, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\Florian\anaconda3\envs\jlhiwi\lib\site-packages\torch\autograd\grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\Florian\anaconda3\envs\jlhiwi\lib\site-packages\torch\optim\rmsprop.py", line 96, in step
    loss = closure()
  File "C:\Users\Florian\anaconda3\envs\jlhiwi\lib\site-packages\skorch\net.py", line 719, in step_fn
    step = self.train_step_single(Xi, yi, **fit_params)
  File "C:\Users\Florian\anaconda3\envs\jlhiwi\lib\site-packages\skorch\net.py", line 660, in train_step_single
    loss = self.get_loss(y_pred, yi, X=Xi, training=True)
  File "C:\Users\Florian\anaconda3\envs\jlhiwi\lib\site-packages\skorch\classifier.py", line 127, in get_loss
    return super().get_loss(y_pred, y_true, *args, **kwargs)
  File "C:\Users\Florian\anaconda3\envs\jlhiwi\lib\site-packages\skorch\net.py", line 1210, in get_loss
    return self.criterion_(y_pred, y_true)
  File "C:\Users\Florian\anaconda3\envs\jlhiwi\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Florian\anaconda3\envs\jlhiwi\lib\site-packages\torch\nn\modules\loss.py", line 211, in forward
    return F.nll_loss(input, target, weight=self.weight, ignore_index=self.ignore_index, reduction=self.reduction)
  File "C:\Users\Florian\anaconda3\envs\jlhiwi\lib\site-packages\torch\nn\functional.py", line 2532, in nll_loss
    return torch._C._nn.nll_loss_nd(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: expected scalar type Long but found Float
```
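If I read the traceback correctly, the default criterion of NeuralNetClassifier (nn.NLLLoss, as far as I can tell) ends up in F.nll_loss, which only accepts integer (Long) class indices as targets. A minimal sketch outside of skorch reproduces the same error for me:

```python
import torch
import torch.nn as nn

criterion = nn.NLLLoss()  # the criterion that shows up in the traceback
log_probs = torch.log_softmax(torch.randn(4, 2), dim=1)  # dummy log-probabilities for 2 classes

y_float = torch.tensor([0.0, 1.0, 0.0, 1.0])  # float32 targets, like my y
# criterion(log_probs, y_float)  # -> RuntimeError: expected scalar type Long but found Float

y_long = torch.tensor([0, 1, 0, 1])  # int64 (Long) class indices
loss = criterion(log_probs, y_long)  # this works
print(loss.item())
```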

I made some changes to the load_data function, after saving my own data to pickle files beforehand.
The size of x is [31500, 2] and of y it is [31500, 1].
Also note that I cast both x AND y to float32, because my data is already normalized between 0 and 1.
```python
import pickle

import numpy as np


def load_data(x_path=r'data\dan_raw_x', y_path=r'data\dan_raw_y'):
    """
    By default, this function loads the pickle files for the raw data from Danimir.

    Parameters
    ----------
    x_path : string
        Exact path to the x data, including the file name itself (for a pickle file, the file extension is not necessary).
    y_path : string
        Exact path to the y data, including the file name itself (for a pickle file, the file extension is not necessary).
    """
    # Load Danimir's data from the pickle files
    with open(x_path, "rb") as f:
        x = pickle.load(f)

    with open(y_path, "rb") as f:
        y = pickle.load(f)

    # Shuffle x and y with the same permutation so the (x, y) pairs stay aligned
    rng = np.random.RandomState(456)
    perm = rng.permutation(len(x))
    x = x[perm]
    y = y[perm]

    # Squeeze y from shape (N, 1) to (N,)
    y = np.squeeze(y)

    x = x.astype(np.float32)
    y = y.astype(np.float32)

    return x, y
```
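For completeness, this is the quick sanity check I run on the loaded data; it just confirms the shapes and dtypes mentioned above:

```python
# Quick sanity check of the loaded data (uses the load_data function shown above)
x, y = load_data()
print(x.shape, x.dtype)  # expected: (31500, 2) float32
print(y.shape, y.dtype)  # expected: (31500,) float32 after the squeeze
```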

I changed the neural network to the following, with hl_sizes indicating the number and size of the hidden layers. Later on, this variable should be tunable through GridSearchCV as well, if possible (see the sketch after the class).
```python
import torch
import torch.nn as nn


class NeuralNet(nn.Module):

    def __init__(self, num_variables):
        """
        hl_sizes : list defining the size of every hidden layer (number of neurons per layer)
        num_variables : int defining the number of input variables to the model
        """
        hl_sizes = [10, 10]
        torch.set_default_dtype(torch.float32)
        super(NeuralNet, self).__init__()
        # An affine operation: y = Wx + b
        # Syntax: torch.nn.Linear(in_features, out_features, bias=True, device=None, dtype=None)
        self.Layers = nn.ModuleList()
        self.Layers.append(nn.Linear(num_variables, hl_sizes[0], bias=False))  # input layer
        for i in range(len(hl_sizes) - 1):
            self.Layers.append(nn.Linear(hl_sizes[i], hl_sizes[i + 1], bias=False))
        self.Layers.append(nn.Linear(hl_sizes[-1], 1, bias=False))  # output layer

    def forward(self, x):
        # tanh activation on every layer except the last one
        for layer in self.Layers[:-1]:
            x = layer(x)
            x = torch.tanh(x)
        x = self.Layers[-1](x)
        return x
```
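Regarding the plan to tune hl_sizes: if I understand skorch correctly, any parameter prefixed with nn__module__ is passed on to the module's constructor, so a variant like the following should make the layer sizes searchable. This is just a sketch with the same architecture as above; NeuralNetTunable is only a placeholder name:

```python
class NeuralNetTunable(nn.Module):
    """Same network as above, but hl_sizes is a constructor argument so GridSearchCV can vary it."""

    def __init__(self, num_variables, hl_sizes=(10, 10)):
        super().__init__()
        self.Layers = nn.ModuleList()
        self.Layers.append(nn.Linear(num_variables, hl_sizes[0], bias=False))
        for i in range(len(hl_sizes) - 1):
            self.Layers.append(nn.Linear(hl_sizes[i], hl_sizes[i + 1], bias=False))
        self.Layers.append(nn.Linear(hl_sizes[-1], 1, bias=False))

    def forward(self, x):
        for layer in self.Layers[:-1]:
            x = torch.tanh(layer(x))
        return self.Layers[-1](x)


# The grid below could then contain, for example:
# 'nn__module__hl_sizes': [(10, 10), (20, 20, 20)]
```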

Lastly, my main file looks like this; it is basically unchanged from your version, I think.
```python
import torch.optim as optim
from skorch import NeuralNetClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# NeuralNet and load_data are the class and function shown above (imported from my own modules)


class Run:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def grid_search_pipeline_training(self):
        # Through a grid search, the optimal hyperparameters are found.
        # A pipeline is used in order to scale the data and train the neural net.
        # The grid search module from scikit-learn wraps the pipeline.

        # The neural net is instantiated; no hyperparameters are provided here.
        # By default this classifier splits the data into training and validation data
        # (stratified k-fold style, 80% for training and 20% for validation),
        # but train_split=False disables this internal split.
        nn = NeuralNetClassifier(NeuralNet, verbose=0, train_split=False)

        # The pipeline is instantiated; it wraps the scaling and training phases.
        # If more steps are required, they can be added to the pipeline.
        pipeline = Pipeline([('scale', StandardScaler()), ('nn', nn)])

        # The parameters for the grid search are defined.
        # The prefix "nn__" must be used when setting hyperparameters for the training phase.
        # The prefix "nn__module__" must be used when setting hyperparameters for the neural net.
        params = {
            'nn__max_epochs': [10, 20],
            'nn__lr': [0.01],
            'nn__module__num_variables': [3],
            'nn__optimizer': [optim.Adam, optim.SGD, optim.RMSprop],
        }

        # The grid search module is instantiated
        gs = GridSearchCV(pipeline, params, refit=False, cv=3,
                          scoring='balanced_accuracy', verbose=1)
        # Run the grid search
        gs.fit(self.x, self.y)


if __name__ == "__main__":
    x, y = load_data()

    run = Run(x, y)

    run.grid_search_pipeline_training()
```

Right now, I am guessing that the problem is that the loss function expects targets of dtype Long, whereas my y values are of type float32. I cannot simply change the type of the y values, though, because they are already normalized.
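Two directions I can think of, sketched below (I may well be misunderstanding something): either treat this as a classification problem and cast y to int64 class labels (with the network outputting log-probabilities per class), or treat it as a regression problem and use skorch's NeuralNetRegressor, which defaults to MSELoss and accepts float32 targets:

```python
# Option A (classification): integer class labels plus log-probabilities over the classes.
# Only valid if y really contains discrete classes encoded as 0/1.
y_cls = y.astype(np.int64)  # NLLLoss expects Long class indices
# ...and the last layer would have to output one value per class,
# e.g. passed through torch.log_softmax(x, dim=-1), instead of a single unit.

# Option B (regression): keep y as float32 and swap the classifier for a regressor.
from skorch import NeuralNetRegressor

nn_reg = NeuralNetRegressor(NeuralNet, verbose=0, train_split=False)  # uses MSELoss by default
y_reg = y.reshape(-1, 1).astype(np.float32)  # NeuralNetRegressor expects a 2D float target
```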

Any ideas?
