padertorch's People

Contributors

alexanderwerning, boeddeker, gburrek, janekebb, jensheit, lukasdrude, michael-kuhlmann, tcord, tglarner, thequilo

padertorch's Issues

Drop Python 3.6?

Python 3.6 reached end of life last year (23 Dec 2021).
Should we drop the tests for it?

Why:
The tests in #138 fail because Python 3.6 represents annotations differently.
Finding a workaround for the doctest is annoying, and since Python 3.6 is EOL,
I thought we could drop it.

Support dataclass default_factory

default_factory for dataclasses is not supported. Finding the default args for the class fails.

import padertorch as pt

from dataclasses import dataclass, field
@dataclass
class A(pt.Configurable):
    a: dict = field(default_factory=dict)

A.get_config()

gives

Traceback (most recent call last):
...
TypeError: Object of type _HAS_DEFAULT_FACTORY_CLASS is not JSON serializable
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
...
ValueError: Invalid config.
See above exception msg from json.dumps and below the sub config:
{'factory': 'A', 'a': <factory>}
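The `<factory>` entry in the sub config is the sentinel that dataclasses place in the generated `__init__` signature when a field uses `default_factory`. A minimal sketch of one possible workaround, assuming the defaults are read from `dataclasses.fields` instead of the signature (the helper name `dataclass_defaults` is hypothetical and not part of padertorch):

```python
import dataclasses
from dataclasses import dataclass, field


def dataclass_defaults(cls):
    """Collect default values of a dataclass, resolving default_factory.

    Hypothetical helper for illustration; not the padertorch implementation.
    """
    defaults = {}
    for f in dataclasses.fields(cls):
        if f.default is not dataclasses.MISSING:
            # Plain default value
            defaults[f.name] = f.default
        elif f.default_factory is not dataclasses.MISSING:
            # Call the factory instead of using the signature sentinel
            defaults[f.name] = f.default_factory()
        # Fields without any default are skipped
    return defaults


@dataclass
class A:
    a: dict = field(default_factory=dict)
    b: int = 3


defaults = dataclass_defaults(A)
```

The resulting dictionary contains only plain Python objects, so it can be passed to `json.dumps` as long as the factory itself returns something serializable.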

Segmentation Fault when Training Model

I ran into an issue while training a model with padertorch (Ham radio: https://github.com/fgnt/ham_radio/issues/1). I get a segmentation fault in the training loop function. When I place a pdb breakpoint in train_step, I get the loss, summary, etc., but the segmentation fault occurs upon return. Location:

  • padertorch>train>trainer.py>Trainer>step, in the if block with len(device) == 1

I am using a single RTX 4090 on Python 3.10 with PyTorch 2.1.0.

Thank you.

Show example json to run ORPIT

Hi,
First, thanks for your work on the ORPIT example, which is the only one I can find on GitHub.
The code is complex and I am in a hurry to run it. Would it be possible to show a few lines of the dataset json files? And how do I run the command so that it includes both the 2 and 3 speaker conditions?

restructure and update the contrib examples

Split example directory into toy examples and task specific examples.
Update the example training scripts so that the experiments use similar directory structures.
Add evaluation scripts if missing and update the evaluation directory structure, so that the evaluation files are written in a new directory inside the training directory.

  • create io.py and add function for default training directory definition @boeddeker
  • move to toy examples and update mnist, multi_gpu, configurable @boeddeker
  • remove the acoustic model example @boeddeker
  • rename and update the simple train script for audio tagging, remove the advanced training script @JanekEbb
  • add evaluation script for audio tagging @JanekEbb
  • move, rename and update the training script for mask estimator @jensheit
  • add evaluation script for mask estimator @jensheit
  • move, rename and update the or_pit and tasnet example @thequilo @jensheit
  • move and update the pit example @TCord
  • update the speaker classification example @michael-kuhlmann @JanekEbb
  • add evaluation script for speaker classification @michael-kuhlmann
  • move and update the wavenet example @JanekEbb

Move backwards step into model

In the case of multiple chained models, for example source separation + speech recognition, it might be necessary to do intermediate backward steps to reduce the GPU memory required during training.
The user could be enabled to use multiple backward steps by moving the backward step and the train_step into the model.

However, we have to consider the implications for the Hook post_step, which is at the moment called after train_step but before the backwards step.

Another open question is how to handle the timer information.

Error in tbx_utils.py

/padertorch/padertorch/summary/tbx_utils.py, line 145, in audio
signal *= 0.95
ValueError: output array is read-only
I think it happens when the signal is all zeros.

The current code looks like this:

    if normalize:
        denominator = np.max(np.abs(signal))
        if denominator > 0:
            signal = signal / denominator
        signal *= 0.95

I think it should be:

    if normalize:
        denominator = np.max(np.abs(signal))
        if denominator > 0:
            signal = signal / denominator
            signal *= 0.95

@boeddeker could you check it?
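The failure mode can be reproduced outside of padertorch. The following is a minimal sketch, assuming the input is a read-only NumPy array and using a hypothetical function name `normalize_audio`; it is not the code from tbx_utils.py. The division allocates a new, writable array, so scaling only inside the `if` branch never writes to the read-only input:

```python
import numpy as np


def normalize_audio(signal, scale=0.95):
    """Scale a signal to a peak of `scale` without modifying the input.

    Hypothetical helper for illustration only.
    """
    signal = np.asarray(signal)
    denominator = np.max(np.abs(signal))
    if denominator > 0:
        # The division creates a new array, so the input stays untouched
        signal = signal / denominator * scale
    # An all-zero signal is returned unchanged
    return signal


silent = np.zeros(4)
silent.setflags(write=False)  # emulate a read-only input array
silent_out = normalize_audio(silent)

loud = np.array([0.5, -2.0])
loud.setflags(write=False)
loud_out = normalize_audio(loud)
```

With the original indentation, the in-place `signal *= 0.95` would be applied to the read-only all-zero input, which is where the ValueError comes from.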

Add review visualization utilities

Issue:

I trained a model and want to visualize it in a Jupyter notebook.
At the moment, my workflow is to execute the model and manually call some plotting functions,
because the review is designed for tensorboard (for the images in particular, it is not obvious how to display them).

Note: Manually calling the plotting functions gives better results than the tensorboard visualization,
because tensorboard doesn't know about axis labels and ticks or how a proper title is formatted.
However, simply visualizing the review is faster, because the code is already written.

Suggestion:

Add some utilities to visualize entries of the review.
e.g.

for k, (data, sample_rate) in review.get('audios', {}).items():
    pb.io.play(data, sample_rate=sample_rate, normalize=False, name=k)

for audios and something like

with pb.visualization.axes_context(columns=4) as axes:
    for k, image in review['images'].items():
        axes.new
        image = np.einsum('chw->hwc', image)[::-1]
        plt.imshow(image, origin='lower')
        plt.title(k)
        plt.grid(False)

for images.

Two proposals for high level functions:

class VisualizeReview:
    def __init__(self, review, trainer=None):
        if trainer is not None:
            # Ensure that the loss is in the review and add the loss to the scalars
            _, review = trainer._review_to_loss_and_summary(review)
        else:
            review.setdefault('scalars', {})['loss'] = review['loss']
        # Store the review only after the loss has been added
        self.review = review
    
    def __call__(self):
        self.scalars()
        self.audios()
        self.images()
    
    def scalars(self):
        display(pd.Series({
            k: pt.utils.to_numpy(v, detach=True)
            for k, v in self.review['scalars'].items()
        }))
    
    def audios(self):
        for k, (data, sample_rate) in self.review.get('audios', {}).items():
            play(data, sample_rate=sample_rate, normalize=False, name=k)
    
    def images(self, columns=4):
        with pb.visualization.axes_context(columns=columns) as axes:
            for k, image in self.review['images'].items():
                axes.new
                image = np.einsum('chw->hwc', image)
                plt.imshow(
                    image,
                    origin='lower',
                )
                plt.title(k)
                plt.grid(False)

VisualizeReview(model_review)()

def visualize_review(
        review,
        trainer=None,
        axes_context_kwargs=dict(columns=4)
):
    from IPython.display import display

    display(pd.Series({
        k: pt.utils.to_numpy(v, detach=True)
        for k, v in review['scalars'].items()
    }))
    
    for k, (data, sample_rate) in review.get('audios', {}).items():
        play(data, sample_rate=sample_rate, normalize=False, name=k)
        
    with pb.visualization.axes_context(**axes_context_kwargs) as axes:
        for k, image in review['images'].items():
            axes.new
            image = np.einsum('chw->hwc', image)
            image = image[::-1]
            plt.imshow(
                image,
                origin='lower',
            )
            plt.title(k)
            plt.grid(False)

visualize_review(model_review)

Configurable: Support positional only arguments

At the moment, any callable is a configurable as long as it accepts keyword arguments.
While this is verbose, it supports most factories and classes; there are only a few that we don't support.

Maybe the most relevant example for padertorch is torch.nn.Sequential(*args).
To use this class at the moment, we have to put a wrapper around it.
Native support would be nice.

We discussed this offline in a small group, but haven't found a solution that we want to implement.

The first priority is that the implementation is not a breaking change, so none of the examples that follow would break current configs.

Here are some examples of how it could be realized:

First, let's assume we have these factories:

def foo(*numbers):
    ...
def bar(a, b, /):  # positional only, like operator.add
    ...

and we want to call them as

foo(1, 2)
bar(1, 2)

Here are some ideas for what the config could look like:

1. Reserved keyword 'args', 'factory_args' or '*'

{'factory': 'foo', 'args': [1, 2]}
{'factory': 'bar', 'args': [1, 2]}
{'factory': 'foo', '*': [1, 2]}
{'factory': 'bar', '*': [1, 2]}
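To make the first idea concrete, here is a minimal sketch of how a reserved key could be resolved. The `REGISTRY` lookup, the `instantiate` function and the `args_key` parameter are hypothetical stand-ins for the factory resolution in padertorch, which this sketch does not reproduce:

```python
def foo(*numbers):
    return sum(numbers)


def bar(a, b, /):  # positional only, requires Python 3.8+
    return a + b


# Hypothetical name-to-callable lookup, for illustration only
REGISTRY = {'foo': foo, 'bar': bar}


def instantiate(config, args_key='args'):
    """Call the factory, unpacking the reserved key as positional arguments."""
    kwargs = dict(config)  # copy, so the config itself is not modified
    factory = REGISTRY[kwargs.pop('factory')]
    args = kwargs.pop(args_key, ())
    return factory(*args, **kwargs)


result_foo = instantiate({'factory': 'foo', 'args': [1, 2]})
result_bar = instantiate({'factory': 'bar', '*': [1, 2]}, args_key='*')
```

Configs without the reserved key behave as before, since an empty tuple is unpacked in that case.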

2. Signature check with "assignment"

Ignore in the config that the argument is positional-only and simply use an assignment style.
In the implementation, we then do the mapping to the positional arguments.

{'factory': 'foo', 'numbers': [1, 2]}
{'factory': 'bar', 'a': 1, 'b': 2}
{'factory': 'foo', '*numbers': [1, 2]}
{'factory': 'bar', 'a': 1, 'b': 2}
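The mapping for the second idea could be derived from the signature. The following is a rough sketch using `inspect.signature`; the function name `call_with_signature` is hypothetical, and the sketch ignores edge cases such as regular parameters that precede `*args` and are passed by keyword:

```python
import inspect


def foo(*numbers):
    return numbers


def bar(a, b, /):  # positional only, requires Python 3.8+
    return (a, b)


def call_with_signature(factory, kwargs):
    """Map config entries to positional arguments based on the signature."""
    remaining = dict(kwargs)
    args = []
    for name, param in inspect.signature(factory).parameters.items():
        if param.kind is inspect.Parameter.POSITIONAL_ONLY:
            if name in remaining:
                args.append(remaining.pop(name))
        elif param.kind is inspect.Parameter.VAR_POSITIONAL:
            # Accept both spellings from the examples: 'numbers' and '*numbers'
            for key in (name, '*' + name):
                if key in remaining:
                    args.extend(remaining.pop(key))
    return factory(*args, **remaining)


out_plain = call_with_signature(foo, {'numbers': [1, 2]})
out_star = call_with_signature(foo, {'*numbers': [1, 2]})
out_bar = call_with_signature(bar, {'a': 1, 'b': 2})
```

Because the config keys are the parameter names, renaming `numbers` in the factory would invalidate existing configs.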

3. Lisp style

The factory can be a list, the first argument is then the function, while the others are the arguments:

{'factory': ['foo', 1, 2]}
{'factory': ['bar', 1, 2]}
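A sketch of how the list form could be resolved, again with a hypothetical `REGISTRY` and function name rather than the padertorch implementation. A plain string factory keeps working, so existing configs are unaffected:

```python
def foo(*numbers):
    return sum(numbers)


def bar(a, b, /):  # positional only, requires Python 3.8+
    return a - b


# Hypothetical name-to-callable lookup, for illustration only
REGISTRY = {'foo': foo, 'bar': bar}


def instantiate_lisp(config):
    """Resolve a factory given either as a name or as [name, *args]."""
    factory = config['factory']
    kwargs = {k: v for k, v in config.items() if k != 'factory'}
    if isinstance(factory, (list, tuple)):
        # First entry is the function, the rest are positional arguments
        name, *args = factory
    else:
        name, args = factory, []
    return REGISTRY[name](*args, **kwargs)


lisp_foo = instantiate_lisp({'factory': ['foo', 1, 2]})
lisp_bar = instantiate_lisp({'factory': ['bar', 1, 2]})
plain_foo = instantiate_lisp({'factory': 'foo'})
```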

My opinion

Option 2 is nice for positional-only arguments like those of operator.add.
But this rarely happens in practice, because positional-only parameters were only available to C and C++ functions up to Python 3.7 (PEP 570 added the syntax in Python 3.8).
For *args I don't like it. It looks strange and is not robust against renaming.

For option 1, my favorite key would be args, but it could happen that a factory uses args as a normal keyword:

threading.Thread(group=None, target=None, name=None, args=(), kwargs={}, *, daemon=None)
scipy.optimize.minimize_scalar(fun, bracket=None, bounds=None, args=(), method='brent', tol=None, options=None)

We didn't find a relevant example, but it would be better to prevent the conflict.

At the moment, I am unsure whether I like {'factory': ['foo', 1, 2]} or {'factory': 'foo', '*': [1, 2]} more.

STFT inverse, stacked representation

Hi,
using complex_representation='stacked' for inverse STFT leads to an error:

import torch
from padertorch.ops import STFT

stft_signal = torch.rand((2, 4, 10, 257, 2))
torch_stft = STFT(512, 20, window_length=40,
                  complex_representation='stacked')
torch_signal = torch_stft.inverse(stft_signal)

Traceback (most recent call last):
  File "bug.py", line 7, in <module>
    torch_signal = torch_stft.inverse(stft_signal)
  File "/mnt/matylda6/izmolikova/JSALT2020/sse/tools/padertorch/padertorch/ops/_stft.py", line 215, in inverse
    stride=self.shift)
RuntimeError: Expected 3-dimensional input for 3-dimensional weight [512, 1, 40], but got 5-dimensional input of size [2, 6, 10, 257, 1] instead

The problem already starts at

signal_real, signal_imag = torch.chunk(stft_signal, 2, dim=-1)

which leads to a different shape of signal_real and signal_imag than in the concat case.

A quick fix is to unify the representation at the beginning

if self.complex_representation == 'stacked':
    stft_signal = torch.cat((stft_signal[..., 0], stft_signal[..., 1]),
                            dim=-1)

and then treat both representations as concat.
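The shape mismatch can be illustrated without padertorch. The sketch below uses NumPy as a stand-in for torch (`np.split` in place of `torch.chunk`, `np.concatenate` in place of `torch.cat`) and assumes that the concat representation stores the real and imaginary parts as two halves of the last axis:

```python
import numpy as np

frames, bins = 10, 257
stacked = np.random.rand(2, 4, frames, bins, 2)  # real/imag on an extra last axis
concat = np.random.rand(2, 4, frames, 2 * bins)  # real/imag as halves of the last axis

# Splitting the concat representation yields the expected (..., bins) shape
real_c, imag_c = np.split(concat, 2, axis=-1)

# Splitting the stacked representation keeps a trailing axis of size 1
real_s, imag_s = np.split(stacked, 2, axis=-1)

# Quick fix from above: convert stacked to concat first, then split
unified = np.concatenate((stacked[..., 0], stacked[..., 1]), axis=-1)
real_u, imag_u = np.split(unified, 2, axis=-1)

print(real_c.shape, real_s.shape, real_u.shape)
```

After the conversion, the real and imaginary parts have the same shape as in the concat case, so the rest of the inverse can stay unchanged.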
