import json
import luqum.elasticsearch

# Load the index mapping exported from Elasticsearch.
with open('MAPPING_FILE.json') as f:
    m2 = json.load(f)
m3 = luqum.elasticsearch.SchemaAnalyzer({"mappings" : m2['sra_experiment_joined2']['mappings']['doc']['properties']})
m3.nested_fields()
Out[900]:
{'attributes': {'tag': {}, 'value': {}},
 'identifiers': {'id': {}, 'namespace': {}, 'uuid': {}},
 'reads': {'base_coord': {},
           'read_class': {},
           'read_index': {},
           'read_type': {}},
 'xrefs': {'db': {}, 'id': {}}}
In this schema, the problem arises because the nested field names are reported without their parent, so names that occur under several parents are repeated. The resulting list is too short and its paths are incomplete.
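To illustrate what complete paths would look like, here is a minimal sketch that walks a mapping and records each nested field under its full dotted path. It is not luqum's implementation: the helper name nested_paths and the toy mapping (including the hypothetical "sample" parent) are assumptions made only for this example.

```python
def nested_paths(properties, prefix=""):
    """Yield the full dotted path of every field whose type is 'nested'."""
    for name, spec in properties.items():
        path = f"{prefix}{name}"
        if spec.get("type") == "nested":
            yield path
        # Recurse into object and nested fields alike, carrying the parent path.
        yield from nested_paths(spec.get("properties", {}), prefix=path + ".")


# Toy mapping (hypothetical): 'attributes' exists at the top level
# and again under a 'sample' object.
toy_properties = {
    "attributes": {"type": "nested", "properties": {"tag": {"type": "text"}}},
    "sample": {
        "properties": {
            "attributes": {"type": "nested", "properties": {"tag": {"type": "text"}}},
        }
    },
}

paths = sorted(nested_paths(toy_properties))
print(paths)  # ['attributes', 'sample.attributes']
```

Because each path keeps its parent, the two "attributes" fields stay distinct instead of appearing as one entry.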
list(m3.sub_fields())
Out[908]:
['tag.keyword',
'value.keyword',
'id.keyword',
'namespace.keyword',
'uuid.keyword',
'Status.keyword',
'accession.keyword',
'alias.keyword',
'attributes.tag.keyword',
'attributes.value.keyword',
'broker_name.keyword',
'center_name.keyword',
'experiment_accession.keyword',
'identifiers.id.keyword',
'identifiers.namespace.keyword',
'identifiers.uuid.keyword',
'reads.read_class.keyword',
'reads.read_type.keyword',
'run_accession.keyword',
'run_center.keyword',
'BioSample.keyword',
'GEO.keyword',
'Status.keyword',
'accession.keyword',
'alias.keyword',
'attributes.tag.keyword',
'attributes.value.keyword',
'broker_name.keyword',
'center_name.keyword',
'description.keyword',
'identifiers.id.keyword',
'identifiers.namespace.keyword',
'numeric_properties.property_id.keyword',
'numeric_properties.unit_id.keyword',
'ontology_terms.keyword',
'organism.keyword',
'sample_type.keyword',
'title.keyword',
'xrefs.db.keyword',
'xrefs.id.keyword',
'BioProject.keyword',
'GEO.keyword',
'Status.keyword',
'abstract.keyword',
'accession.keyword',
'alias.keyword',
'attributes.tag.keyword',
'attributes.value.keyword',
'broker_name.keyword',
'center_name.keyword',
'description.keyword',
'identifiers.id.keyword',
'identifiers.namespace.keyword',
'study_accession.keyword',
'study_type.keyword',
'title.keyword',
'xrefs.db.keyword',
'xrefs.id.keyword',
'db.keyword',
'id.keyword']
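Again, note that the field names are missing their parent. For example, 'attributes.tag.keyword' and 'Status.keyword' each appear three times, and 'tag.keyword' appears with no parent at all.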
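A quick way to confirm the repetition is to count the returned names. The sketch below uses only a handful of entries copied from the sub_fields() output above rather than the full list, and the variable names are illustrative.

```python
from collections import Counter

# A few entries copied from the sub_fields() output above (not the full list).
sub_fields = [
    "tag.keyword",
    "Status.keyword",
    "attributes.tag.keyword",
    "Status.keyword",
    "attributes.tag.keyword",
    "Status.keyword",
    "attributes.tag.keyword",
]

# Any name counted more than once has lost the parent that would distinguish it.
counts = Counter(sub_fields)
duplicates = {name: n for name, n in counts.items() if n > 1}
print(duplicates)
```

With the real analyzer, the same check could be run on list(m3.sub_fields()).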