Comments (9)
Ah yeah, I was worried about that parse there, but didn't yet see how it would go wrong :)
It turns out there are now module-level "properties" (not really properties, but still pretty awesome; I didn't know about this until just now!), so I would go with that. See here: https://www.python.org/dev/peps/pep-0562/
So for config.py, replace the args, ... line with something like:
def __getattr__(name):
    """Only trigger parsing when it's needed."""
    if name == 'args':
        args, unknown_args = parser.parse_known_args()
        return args
    raise AttributeError(f"module '{__name__}' has no attribute '{name}'")
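To see the lazy behaviour in action, here is a small self-contained demo of PEP 562 module-level __getattr__ (the module body and its CALLS list are made up for illustration; the string 'parsed-args' stands in for the real parse_known_args() result):

```python
import importlib.util
import os
import tempfile
import textwrap

# Hypothetical minimal module: __getattr__ at module level (PEP 562,
# Python 3.7+) defers the "expensive" work until first attribute access.
source = textwrap.dedent("""\
    CALLS = []

    def __getattr__(name):
        if name == 'args':
            CALLS.append(name)    # the real parsing would run here
            return 'parsed-args'  # stand-in for parser.parse_known_args()[0]
        raise AttributeError(f"module '{__name__}' has no attribute '{name}'")
""")

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "lazyconf.py")
    with open(path, "w") as fh:
        fh.write(source)
    spec = importlib.util.spec_from_file_location("lazyconf", path)
    lazyconf = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(lazyconf)

calls_at_import = list(lazyconf.CALLS)    # empty: nothing ran at import time
value = lazyconf.args                     # first access triggers __getattr__
calls_after_access = list(lazyconf.CALLS)
```

Note that lazyconf.CALLS does not trigger __getattr__, because the module-level hook only fires when the normal attribute lookup fails.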
I tried that, but __getattr__ is called before my main script adds its own arguments.
Ah yeah, that would happen when another module (probably one that you import from your script) uses config.args, or generally if config.args is accessed anywhere before all arguments are added to the parser... Can you find out where that happens and whether we can avoid that access?
It is in dataset.py. We might be able to avoid the issue here, but I feel like we might run into this issue again with the current setup. I was wondering whether an alternative could be to use two different parsers: one for general-level config, such as the dataset path, which wouldn't have help associated with it (we could document those parameters somewhere else), and another one for the script-specific parameters that would appear in the help message.
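The two-parser idea could look roughly like this (a sketch using plain argparse rather than configargparse so it stands alone; the option names are made up):

```python
import argparse

# Hypothetical split: a "general" parser for shared config that does not
# own -h/--help, and a per-script parser that inherits it and owns help.
general = argparse.ArgumentParser(add_help=False)
general.add_argument('--data_root', default='/data/flickr8k/')

# The experiment script builds its own parser on top of the general one.
script = argparse.ArgumentParser(parents=[general])
script.add_argument('--epochs', type=int, default=32)

args = script.parse_args(['--epochs', '5'])
```

With parents=[general], `script -h` lists both the shared and the script-specific options, while a script that never builds the second parser simply would not show the shared ones.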
Hm, then it wouldn't give any help message if you run a script that doesn't add the second parser.
Another option would be to just put all parameters in one file, even though we only use them in specific runs. We can indicate these experiment-specific parameters in the help section using groups: https://docs.python.org/3/library/argparse.html#argument-groups
So for instance, --epochs, which I added to basic-stack/run.py, should then go in this experiment-parameter group in __init__.py (or maybe we can move it to config.py; since we don't need the named singleton parser anymore, we can just make it a simple member of config):
parser = configargparse.get_argument_parser(name='platalea')
# [... current options there ...]

experiment_arguments = parser.add_argument_group(
    'experiment arguments',
    "Some experiment run scripts use the following optional parameters.")
experiment_arguments.add_argument(
    '--epochs', action='store', default=32, dest='epochs', type=int,
    help='number of epochs after which to stop training (default: 32)')
# [... etc ...]
The help message is then:
python experiments/basic-stack/run.py -h
usage: run.py [-h] [--data_root DATA_ROOT] [--meta META] [--audio_features_fn AUDIO_FEATURES_FN]
[--audio_subdir AUDIO_SUBDIR] [--image_subdir IMAGE_SUBDIR] [--epochs EPOCHS]
Args that start with '--' (eg. --data_root) can also be set in a config file (config.ini or config.yml). Config file
syntax allows: key=value, flag=true, stuff=[a,b,c] (for details, see syntax at https://goo.gl/R74nmi). If an arg is
specified in more than one place, then commandline values override environment variables which override config file
values which override defaults.
optional arguments:
-h, --help show this help message and exit
--data_root DATA_ROOT
location of the flickr8k (or similar) dataset [env var: PLATALEA_DATA_ROOT] (default:
/roaming/gchrupal/datasets/flickr8k/)
--meta META filename of the metadata file (dataset.json or similar) relative to the dataset location [env
var: PLATALEA_METADATA_JSON] (default: dataset_multilingual.json)
--audio_features_fn AUDIO_FEATURES_FN
filename of the MFCC audio features file relative to the dataset location [env var:
PLATALEA_AUDIO_FEATURES_FN] (default: mfcc_delta_features.pt)
--audio_subdir AUDIO_SUBDIR
directory containing the flickr8k wav files, relative to the dataset location [env var:
PLATALEA_AUDIO_SUBDIR] (default: flickr_audio/wavs/)
--image_subdir IMAGE_SUBDIR
directory containing the flickr8k image files, relative to the dataset location [env var:
PLATALEA_IMAGE_SUBDIR] (default: Flickr8k_Dataset/Flicker8k_Dataset/)
experiment arguments:
Some experiment run scripts use the following optional parameters.
--epochs EPOCHS number of epochs after which to stop training (default: 32) (default: 32)
A downside of this approach is that we may at some point end up with a huge bunch of experiment arguments, but we can always reevaluate once this happens. Also, we may need slightly different variants/interpretations for different experiments. However, if we document this properly, it shouldn't be a big issue.
What do you think?
Btw, this approach also means we can get rid of the perhaps slightly overly complicated __getattr__ trick again :P I love it, but sometimes you have to kill your darlings ;)
I am not really fond of adding all parameters in one file, for two main reasons:
- it will quickly make the help unreadable
- while some parameters (e.g. number of epochs) apply to most cases, some are specific to one script. I think things will get very confusing because of this.
I would rather go for one of the following options:
- have only the important parameters for each script in the help, with general configuration elements visible somewhere else (e.g. through python config.py -h)
- keep the __getattr__ trick and remove the accesses to the parameters in function definitions, so that the script can add its own parameters before help is displayed
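For the second option, "accesses in function definitions" mostly means config.args used as a default parameter value, which is evaluated at import time; moving the access into the function body defers it to call time. A hypothetical sketch (the _Config class and load_dataset are stand-ins, not actual platalea code):

```python
import argparse

class _Config:
    """Stand-in for the lazy config module with a __getattr__-style hook."""
    def __init__(self):
        self.parser = argparse.ArgumentParser()
        self.parser.add_argument('--data_root', default='/data/flickr8k/')
        self._args = None

    @property
    def args(self):
        if self._args is None:  # parse only on first access
            self._args, _ = self.parser.parse_known_args([])
        return self._args

config = _Config()

# BAD: a default like root=config.args.data_root is evaluated when the
# function is *defined* (at import), forcing an early parse before the
# script can add its own arguments.

# GOOD: access config.args inside the body, at call time.
def load_dataset(root=None):
    if root is None:
        root = config.args.data_root
    return root
```

As long as every module defers the access like this, the main script keeps full control over when parsing happens.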
After discussion, I think we came up with the following idea, please correct me if I misunderstood:
We create a new class that internally uses the ConfigArgParse parser but by default does not parse; it only parses when the user asks it to. Moreover, it exposes the default parameters, so that we can still use those when parsing has not taken place. When parsing is activated, the class updates the parameters from their default values to the values from the parser. The parser has a set of default options included (those that are used inside the main platalea module, like data_root and the filenames for feature files), but specific options can be added for each experiment.
The interface and use will be something like this:
class PlataleaConfig:
    def __init__(self):
        # set up the ConfigArgParse parser and a dict of default parameters
        ...

    def add_argument(self, *args, **kwargs):
        self._parser.add_argument(*args, **kwargs)

    def parse(self):
        args, _ = self._parser.parse_known_args()
        # update the parameter dict(s) using the parsed args
        ...

    def __getattr__(self, arg_name):
        return self._stuff[arg_name]
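A runnable version of that sketch might look like this (using plain argparse instead of configargparse so it is self-contained; the internal attribute names are guesses, not the final interface):

```python
import argparse

class PlataleaConfig:
    """Lazy config: exposes defaults immediately, parsed values after parse()."""

    def __init__(self):
        self._parser = argparse.ArgumentParser()
        self._values = {}  # current values, seeded with each option's default

    def add_argument(self, *args, **kwargs):
        action = self._parser.add_argument(*args, **kwargs)
        self._values[action.dest] = action.default  # usable before parse()

    def parse(self, argv=None):
        args, _ = self._parser.parse_known_args(argv)
        self._values.update(vars(args))  # defaults -> parsed values

    def __getattr__(self, name):
        # only called when normal attribute lookup fails, so _parser and
        # _values themselves are found the usual way
        try:
            return self._values[name]
        except KeyError:
            raise AttributeError(name)

config = PlataleaConfig()
config.add_argument('--epochs', type=int, default=32)

default_epochs = config.epochs   # default is available, nothing parsed yet
config.parse(['--epochs', '5'])  # the experiment script triggers parsing
parsed_epochs = config.epochs    # now the parsed value
```

parse_known_args keeps the tolerant behaviour of the earlier __getattr__ trick: unknown options on the command line are ignored rather than causing an error.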
Yes, that's it.