When pygeometa generates the XML metadata files, it creates a …


Duplicate values for <gmd:distributionFormat>, about geopython/pygeometa

Comments (5)

tomkralidis commented on July 28, 2024

@RousseauLambertLP can you send a test case/data?


RousseauLambertLP commented on July 28, 2024

The duplicated distribution info sections are exactly the same, but they appear two or four times. I tried modifying an XML file by removing the duplicate tags, and it imported fine into ECDC.

This is what we have for the moment:

<gmd:distributionInfo>
    <gmd:MD_Distribution>
        <gmd:distributionFormat>
            *GRIB2*
        </gmd:distributionFormat>
        <gmd:distributionFormat>
            *WMS*
        </gmd:distributionFormat>
        <gmd:distributionFormat>
            *GRIB2*
        </gmd:distributionFormat>
        <gmd:distributionFormat>
            *WMS*
        </gmd:distributionFormat>
        <gmd:distributionFormat>
            *GRIB2*
        </gmd:distributionFormat>
        <gmd:distributionFormat>
            *GRIB2*
        </gmd:distributionFormat>
    </gmd:MD_Distribution>
 </gmd:distributionInfo>

We only need to have it once for each distribution format:

<gmd:distributionInfo>
    <gmd:MD_Distribution>
        <gmd:distributionFormat>
            *GRIB2*
        </gmd:distributionFormat>
        <gmd:distributionFormat>
            *WMS*
        </gmd:distributionFormat>
    </gmd:MD_Distribution>
 </gmd:distributionInfo>
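
For anyone who wants to check a generated file for this problem, here is a minimal sketch that counts repeated distribution formats. It assumes the standard ISO 19139 gmd namespace URI and compares the text content of each gmd:distributionFormat element; the function names and the sample document are hypothetical and not part of pygeometa.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Standard ISO 19139 namespace for the gmd prefix
GMD = 'http://www.isotc211.org/2005/gmd'


def count_distribution_formats(xml_text):
    """Count gmd:distributionFormat elements by their normalized text."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for element in root.iter('{%s}distributionFormat' % GMD):
        # Join all nested text and collapse whitespace so that
        # indentation differences do not hide duplicates
        key = ' '.join(''.join(element.itertext()).split())
        counts[key] += 1
    return counts


def duplicated_formats(xml_text):
    """Return the sorted format strings that occur more than once."""
    counts = count_distribution_formats(xml_text)
    return sorted(key for key, n in counts.items() if n > 1)


# Hypothetical, simplified sample; real output nests gmd:MD_Format inside
SAMPLE = """<gmd:distributionInfo xmlns:gmd="http://www.isotc211.org/2005/gmd">
  <gmd:MD_Distribution>
    <gmd:distributionFormat>GRIB2</gmd:distributionFormat>
    <gmd:distributionFormat>WMS</gmd:distributionFormat>
    <gmd:distributionFormat>GRIB2</gmd:distributionFormat>
  </gmd:MD_Distribution>
</gmd:distributionInfo>"""

print(duplicated_formats(SAMPLE))  # prints ['GRIB2']
```

An empty list means every distribution format appears only once.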


RousseauLambertLP commented on July 28, 2024

I have worked on a solution for the duplicated distribution formats in the iso19139-hnap XML.

I've added a function in core.py called get_unique_distribution_format. It returns a list of dictionaries containing the unique distribution formats, and I call it in the main.j2 file of iso19139-hnap. I've tested the output by importing and publishing it in ECDC and it works fine: we now have only one tag per distribution format. Once my first merge request is accepted, I will submit this one.

@tomkralidis any thoughts or comments on the code?

diff --git a/pygeometa/core.py b/pygeometa/core.py
index b23fd41..16a76bf 100644
--- a/pygeometa/core.py
+++ b/pygeometa/core.py
@@ -260,9 +260,11 @@ def render_template(mcf, schema=None, schema_local=None):
     env.filters['normalize_datestring'] = normalize_datestring
     env.filters['get_distribution_language'] = get_distribution_language
     env.filters['get_charstring'] = get_charstring
+    env.filters['get_unique_distribution_format'] = get_unique_distribution_format
     env.globals.update(zip=zip)
     env.globals.update(get_charstring=get_charstring)
     env.globals.update(normalize_datestring=normalize_datestring)
+    env.globals.update(get_unique_distribution_format=get_unique_distribution_format)
 
     try:
         LOGGER.debug('Loading template')
@@ -294,6 +296,17 @@ def get_abspath(mcf, filepath):
     return os.path.join(abspath, filepath)
 
 
+def get_unique_distribution_format(items):
+    """returns a list of dictionaries of the unique distribution formats"""
+
+    unique_distribution_list = []
+    for k, v in items:
+        if v['format_en'] and v['format_fr'] and v['format_version']:
+            if {'format_en': v['format_en'], 'format_fr': v['format_fr'], 'format_version': v['format_version']} not in unique_distribution_list:
+                unique_distribution_list.append({'format_en': v['format_en'], 'format_fr': v['format_fr'], 'format_version': v['format_version']})
+    return unique_distribution_list
+
+
 class MCFReadError(Exception):
     """Exception stub for format reading errors"""
     pass
diff --git a/pygeometa/templates/iso19139-hnap/main.j2 b/pygeometa/templates/iso19139-hnap/main.j2
index 0058017..1ccdd47 100644
--- a/pygeometa/templates/iso19139-hnap/main.j2
+++ b/pygeometa/templates/iso19139-hnap/main.j2
@@ -476,8 +476,7 @@
   </gmd:identificationInfo>
   <gmd:distributionInfo>
     <gmd:MD_Distribution>
-      {% for k, v in record['distribution'].items() %}
-      {% if v['format_en'] and v['format_fr'] and v['format_version'] %}
+      {% for v in get_unique_distribution_format(record['distribution'].items()) %}
       <gmd:distributionFormat>
         <gmd:MD_Format>
           {{ cs.get_freetext('name', 'fra', get_charstring('format', v, 'en', 'fr')) }}
@@ -486,7 +485,6 @@
           </gmd:version>
         </gmd:MD_Format>
       </gmd:distributionFormat>
-      {% endif %}
       {% endfor %}
       <gmd:distributor>
         <gmd:MD_Distributor>
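
One thing to watch when moving the check from the template into Python: in Jinja2, v['format_en'] on a distribution entry without that key evaluates to an undefined (falsy) value, whereas the same lookup in plain Python raises KeyError. Below is a minimal standalone sketch of the same idea using dict.get so that entries without format keys are skipped. The FORMAT_KEYS constant and the sample data are assumptions for illustration, not pygeometa code.

```python
# Keys compared by the patch above; hard-coded here as in the diff
FORMAT_KEYS = ('format_en', 'format_fr', 'format_version')


def get_unique_distribution_format(items):
    """Return unique, fully populated format entries in first-seen order."""
    unique = []
    for _name, value in items:
        # .get returns None for a missing key instead of raising KeyError
        entry = {key: value.get(key) for key in FORMAT_KEYS}
        if not all(entry.values()):
            continue  # skip entries that lack any of the three values
        if entry not in unique:
            unique.append(entry)
    return unique


# Hypothetical distribution section of an MCF, already parsed to a dict
distribution = {
    'wms': {'url': 'https://example.org/wms', 'format_en': 'WMS',
            'format_fr': 'WMS', 'format_version': '1.3.0'},
    'grib2': {'url': 'https://example.org/a', 'format_en': 'GRIB2',
              'format_fr': 'GRIB2', 'format_version': '2'},
    'grib2_mirror': {'url': 'https://example.org/b', 'format_en': 'GRIB2',
                     'format_fr': 'GRIB2', 'format_version': '2'},
    'docs': {'url': 'https://example.org/docs'},  # no format keys at all
}

for entry in get_unique_distribution_format(distribution.items()):
    print(entry['format_en'], entry['format_version'])
```

With this sample the loop prints one line for WMS and one for GRIB2; the 'docs' entry is ignored rather than raising an error.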


tomkralidis commented on July 28, 2024

@RousseauLambertLP good suggestion. Comments:

We should apply this to all ISO-based output templates.
The code snippet above only takes into account format_en, format_fr and format_version. What happens if/when other language profiles are used?
We should add a test case to the unit tests.

Sample code below:

index 27ecd5d..f8faa05 100644
--- a/pygeometa/core.py
+++ b/pygeometa/core.py
@@ -145,8 +145,33 @@ def normalize_datestring(datestring, format_='default'):
     return datestring
 
 
+def prune_distribution_formats(formats):
+    """derive a unique list of distribution formats"""
+
+    counter = 0
+    formats_ = []
+
+    for k1, v1 in formats.items():
+        row = {}
+        for k2, v2 in v1.items():
+            if k2.startswith('format'):
+                row[k2] = v2
+        formats_.append(row)
+
+    num_elements = len(formats)
+
+    for f in formats_:
+        counter += 1
+        if num_elements == counter:
+            break
+        if f == formats_[counter]:
+            formats_.pop(counter)
+
+    return formats_
+
+
 def read_mcf(mcf):
-    """returns dict of YAML file from filepath"""
+    """returns dict of YAML file from filepath, string or dict"""
 
     mcf_dict = {}
 
@@ -245,12 +270,14 @@ def render_template(mcf, schema=None, schema_local=None):
 
     LOGGER.debug('Setting up template environment {}'.format(abspath))
     env = Environment(loader=FileSystemLoader([abspath, TEMPLATES]))
-    env.filters['normalize_datestring'] = normalize_datestring
-    env.filters['get_distribution_language'] = get_distribution_language
     env.filters['get_charstring'] = get_charstring
-    env.globals.update(zip=zip)
+    env.filters['get_distribution_language'] = get_distribution_language
+    env.filters['prune_distribution_formats'] = prune_distribution_formats
+    env.filters['normalize_datestring'] = normalize_datestring
     env.globals.update(get_charstring=get_charstring)
     env.globals.update(normalize_datestring=normalize_datestring)
+    env.globals.update(prune_distribution_formats=prune_distribution_formats)
+    env.globals.update(zip=zip)
 
     try:
         LOGGER.debug('Loading template')
diff --git a/pygeometa/templates/iso19139-hnap/main.j2 b/pygeometa/templates/iso19139-hnap/main.j2
index 9aa8ec7..c0aa3f3 100644
--- a/pygeometa/templates/iso19139-hnap/main.j2
+++ b/pygeometa/templates/iso19139-hnap/main.j2
@@ -476,7 +476,8 @@
   </gmd:identificationInfo>
   <gmd:distributionInfo>
     <gmd:MD_Distribution>
-      {% for k, v in record['distribution'].items() %}
+      {% set formats = prune_distribution_formats(record['distribution']) %}
+      {% for v in formats %}
       {% if v['format_en'] and v['format_fr'] and v['format_version'] %}
       <gmd:distributionFormat>
         <gmd:MD_Format>
diff --git a/pygeometa/templates/iso19139/main.j2 b/pygeometa/templates/iso19139/main.j2
index 6b3d73a..1493638 100644
--- a/pygeometa/templates/iso19139/main.j2
+++ b/pygeometa/templates/iso19139/main.j2
@@ -292,7 +292,8 @@
       </gmd:distributor>
       <gmd:transferOptions>
         <gmd:MD_DigitalTransferOptions>
-        {% for k, v in record['distribution'].items() %}
+        {% set formats = prune_distribution_formats(record['distribution']) %}
+        {% for v in formats %}
           <gmd:onLine>
             <gmd:CI_OnlineResource>
               <gmd:linkage>
diff --git a/pygeometa/templates/wmo-cmp/main.j2 b/pygeometa/templates/wmo-cmp/main.j2
index ea63535..04a2ddf 100644
--- a/pygeometa/templates/wmo-cmp/main.j2
+++ b/pygeometa/templates/wmo-cmp/main.j2
@@ -336,7 +336,8 @@
       </gmd:distributor>
       <gmd:transferOptions>
         <gmd:MD_DigitalTransferOptions>
-          {% for k, v in record['distribution'].items() %} 
+        {% set formats = prune_distribution_formats(record['distribution']) %}
+        {% for v in formats %}
           <gmd:onLine>
             <gmd:CI_OnlineResource>
               <gmd:linkage>
diff --git a/tests/run_tests.py b/tests/run_tests.py
index 29b64bb..128d3bb 100644
--- a/tests/run_tests.py
+++ b/tests/run_tests.py
@@ -53,7 +53,8 @@ from six import text_type
 import yaml
 
 from pygeometa.core import (read_mcf, pretty_print, render_template,
-                            get_charstring, get_supported_schemas)
+                            get_charstring, get_supported_schemas,
+                            prune_distribution_formats)
 
 THISDIR = os.path.dirname(os.path.realpath(__file__))
 
@@ -153,6 +154,23 @@ class PygeometaTest(unittest.TestCase):
                                 {'title_fr': 'foo', 'title_en': 'bar'}, 'fr')
         self.assertEqual(values, [None, None], 'Expected specific values')
 
+    def test_prune_distribution_formats(self):
+        """Test deriving unique distribution formats"""
+
+        formats = {
+            'wms': {
+                'format_en': 'image', 'format_fr': 'image', 'format_version': 2
+            },
+            'wfs': {
+                'format_en': 'GRIB2', 'format_fr': 'GRIB2', 'format_version': 2
+            },
+            'wcs': {
+                'format_en': 'GRIB2', 'format_fr': 'GRIB2', 'format_version': 2
+            }
+        }
+
+        new_formats = prune_distribution_formats(formats)
+
+        self.assertEqual(len(new_formats), 2,
+                         'Expected 2 unique distribution formats')
+
     def test_get_supported_schemas(self):
         """Test supported schemas"""
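
The pruning loop in the sample diff compares each row with its neighbour while removing items from the list it is iterating over, so duplicates that are not adjacent can survive. As a hedged alternative, here is a small sketch that keeps every key starting with 'format' (so any language profile works) and checks each row against all rows seen so far. The name unique_distribution_formats is hypothetical, and skipping resources with no format keys is an assumption that differs from the diff above.

```python
def unique_distribution_formats(formats):
    """Derive a unique list of distribution formats for any language."""
    unique = []
    for resource in formats.values():
        # Keep format_en, format_fr, format_de, format_version, ...
        row = {k: v for k, v in resource.items() if k.startswith('format')}
        # Assumption: resources without any format key are dropped
        if row and row not in unique:
            unique.append(row)
    return unique


# Hypothetical input: the two GRIB2 entries are not adjacent
formats = {
    'wfs': {'format_en': 'GRIB2', 'format_de': 'GRIB2', 'format_version': 2},
    'wms': {'format_en': 'image', 'format_de': 'Bild', 'format_version': 2},
    'wcs': {'format_en': 'GRIB2', 'format_de': 'GRIB2', 'format_version': 2},
    'docs': {'url': 'https://example.org/docs'},
}

print(len(unique_distribution_formats(formats)))  # prints 2
```

The membership test is quadratic in the number of resources, which should be fine for the handful of distribution entries a typical record has.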


RousseauLambertLP commented on July 28, 2024

Issue solved, closing ticket.
