bcgov / mfin-data-catalogue

The Finance Data Catalogue enables users to discover data holdings at the BC Ministry of Finance and offers information and functionality that benefits consumers of data for business purposes. The product is built using Drupal and adheres to the Government of BC's Core Metadata Standard.

License: Other

PHP 77.88% Makefile 0.13% CSS 0.18% Twig 9.02% SCSS 11.24% JavaScript 1.55%
data dataaccess dataliteracy datamanagement drupal-10 finance lineage metadata

mfin-data-catalogue's Introduction

MFIN-Data-Catalogue

Lifecycle:Experimental

Data Catalogue for BC Ministry of Finance.

Theme notes

See html/themes/custom/dc_theme/watchsass.sh for a script to compile sass. Instructions are in the comments. Using this script ensures that all developers are compiling SASS the same way.

mfin-data-catalogue's People

Contributors

chrislaick, craigclark, danhgov, gurjinder12, joel-osc, kardamk, lkmorlan, nicoledegreef, sylus


mfin-data-catalogue's Issues

Bulk re-assign data custodian process

OP timer

https://openplus.monday.com/boards/4092908516/pulses/4608689391


NOTE

Somebody mentioned in the workshop that they would like to see the lineage of data custodians.
Is this something we want to do?

Each node in Drupal has an author. We are using this author field as the data custodian. We will have to make sure that we change the built-in author label as necessary, and make sure anybody working on the admin side knows that author and data custodian are the same thing when it comes to the dataset content type.
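A minimal sketch of how that relabelling might be done, assuming a custom module named mfin_dc and a content type machine name of data_set (both names are assumptions):

```php
<?php

use Drupal\Core\Form\FormStateInterface;

/**
 * Implements hook_form_alter().
 *
 * Relabels the built-in author field as "Data custodian" on the data set
 * add and edit forms, so admin users see the catalogue's terminology.
 */
function mfin_dc_form_alter(array &$form, FormStateInterface $form_state, $form_id) {
  $data_set_forms = ['node_data_set_form', 'node_data_set_edit_form'];
  if (in_array($form_id, $data_set_forms, TRUE) && isset($form['uid']['widget'][0]['target_id'])) {
    $form['uid']['widget'][0]['target_id']['#title'] = t('Data custodian');
  }
}
```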

User story

As a data catalogue administrator, I want to be able to re-assign data custodians to data sets so that as staff come and go, each dataset always has someone responsible for it.

This feature supports the following requirements

alpha release

Additional information

The Data custodian is the node author.

Proposed solution

  • verify that vbo is available, see bcgov/bcbb#5
  • add vbo to the content overview page
  • make sure the ability to change authors is a vbo option (a rough plugin sketch follows this list)
  • make sure data catalogue administrator role can access content page and run VBO actions
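Drupal core already ships a "Change the author of content" action that VBO can expose on the content overview view. If a custom variant were ever needed, a rough sketch of an action plugin (module name, plugin ID and hard-coded uid are placeholders) might look like:

```php
<?php

namespace Drupal\mfin_dc\Plugin\Action;

use Drupal\Core\Action\ActionBase;
use Drupal\Core\Session\AccountInterface;

/**
 * Re-assigns a data set node to a new data custodian (node author).
 *
 * @Action(
 *   id = "mfin_dc_reassign_custodian",
 *   label = @Translation("Re-assign data custodian"),
 *   type = "node"
 * )
 */
class ReassignCustodian extends ActionBase {

  /**
   * {@inheritdoc}
   */
  public function execute($node = NULL) {
    // In a real implementation the target uid would come from a VBO
    // configuration form rather than being hard-coded.
    $node->setOwnerId(1);
    $node->save();
  }

  /**
   * {@inheritdoc}
   */
  public function access($node, AccountInterface $account = NULL, $return_as_object = FALSE) {
    $result = $node->access('update', $account, TRUE);
    return $return_as_object ? $result : $result->isAllowed();
  }

}
```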

Estimated level of effort

  • 1 day
  • 2 days
  • 3 days
  • 4 days
  • 5 days

Definition of done (DoD)

  • VBO dropdown is available on the content overview page
  • Data catalogue managers can use the VBO to re-assign the node author

Testing

Automated functional tests

  • I have written functional tests for this feature
  • I have run the functional tests and they pass

Automated site tests

  • I have written site tests for this feature
  • I have run the site tests and they pass

This feature requires manual testing

  1. first test step …
  2. second test step …
  3. etc …

Search results can be downloaded as `.csv`

Base-build candidate This feature should be evaluated as a potential asset to the base-build

OP timer

https://openplus.monday.com/boards/4092908516/pulses/4606244178


User story

As a user of the data catalogue, I want to be able to download the results of my searches so that I can use the results for analysis and reporting.

This feature supports the following requirements

alpha release

Additional context

This feature currently exists on Evri. See dv10 for an example.

  1. a user does a search
  2. there is a checkbox beside each result
  3. user has option to select all results
  4. there is an option to download csv
  5. all fields of the data-set content type are in the csv

Verify that there are no HTML tags in the csv output. This has been an issue on previous implementations of this feature.

Screenshot

This example is from dv10


Proposed solution

Use the same modules and config as dv10, customized as needed for data-catalogue.

Dev workflow

  • I have written functional tests against the UAT test cases
  • I have run the functional tests and they pass
  • I have written integration tests on areas that may be affected by this feature
  • I have written integration tests under conditions which the feature should not work
  • I have run the Integration tests and they pass
  • I have added a comment to this issue with steps the reviewer needs to follow when doing QA
  • I have verified that the feature is ready for QA in the development environment.

Definition of done (DoD)

As a regular user:

  • on the search page I have checkboxes beside each result
  • I can select all results
  • I can select all results, then individually select results I do not want
  • I can download a csv
  • the csv rows match what I selected on the search page
  • the csv columns match the columns in the data-set content type
  • There are no HTML tags in the CSV output

Manual testing

The automated tests do not do any tests on the content of the downloaded file.

Saved search

OP timer

https://openplus.monday.com/boards/4092908516/pulses/4728256323


User story

As an authenticated user (not data custodian), I want to search for certain keywords with chosen facets and save this exact search to my dashboard so that I can click on it to get to my result set more quickly.

See dashboard issue for requirements on how these should work on the user's dashboard.

This feature supports the following requirements

Additional context

Proposed solution

Estimated level of effort

  • 1 day
  • 2 days
  • 3 days
  • 4 days
  • 5 days

Definition of done (DoD)

  • criteria 1
  • criteria 2
  • criteria 3

Testing

Automated functional tests

  • I have written functional tests for this feature
  • I have run the functional tests and they pass

Automated site tests

  • I have written site tests for this feature
  • I have run the site tests and they pass

This feature requires manual testing

  1. first test step …
  2. second test step …
  3. etc …

Theme node view for data set content type

OP timer

https://openplus.monday.com/boards/4092908516/pulses/4523904372


This feature supports the following requirements

mvp build

Additional context

The purpose of the initial theme for the data set node view is to have something presentable for the MVP launch. It is anticipated that this will evolve as the team works through the agile process.

Proposed solution

Using only classes available in Bootstrap 5, theme the data set node view to resemble the current BC Data Catalogue. See example

Notes

  • make sure items that should link do link

Dev workflow

  • I have run the Integration tests and they pass
  • I have added a comment to this issue with steps the reviewer needs to follow when doing QA
  • I have verified that the feature is ready for QA in the development environment.

Definition of done (DoD)

  • There is a legible view of a data set that approximates the current BC data catalogue

Documentation for Deployment and DevOps

Overview

The current documentation surrounding the Deployment and DevOps can be found in the private tenant-gitops-ea352d repository which houses the application manifests.

Everything is stored in markdown in the docs folder but also published to the wiki section in the github repository.

This allows for the following benefits:

  • track all documentation changes in code
  • single source of truth for both wiki pages and generated PDF
  • enable peer review for documentation changes

Note: The tenant-gitops-ea352d repository was chosen over mfin-data-catalogue as some information is assumed to be private; a private repository is therefore better suited for the documentation and the wiki pages, which would otherwise be public.

Blocker(s)

  • Grant access to tenant-gitops-ea352d for @smulvih2

Right now only BCGov Employees and 2 external collaborators are allowed to access the tenant-gitops-ea352d repository.

I would like @smulvih2 to have access to the tenant-gitops-ea352d repository.

Currently only @lkmorlan and I have access to it.

Work

Markdown files in the docs folder will be generated automatically into corresponding wiki pages:

Markdown Files

Wiki Pages

In addition, on a stable release tag a PDF will be generated using pandoc and attached to the release.

Note: This can also be done manually simply by running the scripts/build.sh file.

Date range picker for SearchAPI Facets

Idea Park your ideas here, they may become features!

https://openplus.monday.com/boards/4092908516/pulses/4688497276


User story

As an anonymous user I want to select a date range for my last modified facet so that I can find all data sets that were modified, say, between January 1, 2020 and January 1, 2021 (for example).

Describe the idea

Use the date range slider module maybe? These might not do what we want, but there are some possible modules to investigate:

https://www.drupal.org/project/facets_date_range_picker
https://www.drupal.org/project/facets_date_range
https://www.drupal.org/project/facets_range_input

and look at these issues:
https://www.drupal.org/project/facets/issues/3187163
https://www.drupal.org/project/facets/issues/3195236
https://www.drupal.org/project/facets/issues/3216756

It might be that Facets already supports a facet range slider (which I had already seen when I was configuring it), so this might not even need an extra module.

What we're looking for is this:
DSpace/DSpace#8418

Actions

  • Hooray! I'm going to be a feature!

Configure breadcrumbs

OP timer

https://openplus.monday.com/boards/4092908516/pulses/4450294707


This feature supports the following requirements

MVP build

Proposed solution

Configure breadcrumbs so that a user can easily get to their dashboard.

Dev workflow

  • I have written functional tests against the UAT test cases
  • I have run the functional tests and they pass
  • I have written integration tests on areas that may be affected by this feature
  • I have written integration tests under conditions which the feature should not work
  • I have run the Integration tests and they pass
  • I have added a comment to this issue with steps the reviewer needs to follow when doing QA
  • I have verified that the feature is ready for QA in the development environment.

Definition of done (DoD)

  • I see my dashboard in the breadcrumb trail
  • I can click the dashboard in the breadcrumb trail and get to the dashboard at data-set

Deployment of Solr for MFIN Data Catalogue

Overview

MFIN Data Catalogue will be using the Search API Solr Backend instead of the Search API DB Backend.

Solr is much more performant against large datasets, and provides many additional features.

  • Improved file contents indexing
  • Superior performance especially when dealing with numerous facets
  • Better keywords handling
  • Complete features like stemming, highlighting, spellchecking and phrase queries

There are two ways we can launch Solr: the Solr Operator or a plain Helm Chart.

I have a fair bit of experience with both but Solr Operator is superior due to:

  • Managing the full lifecycle of Solr
  • Providing a default secure configuration of Solr
  • Solr being run in Solr Cloud mode in a distributed fashion under Zookeeper
  • Created and maintained by Apache and Lucidworks so is officially supported

Blocker(s)

  • BC Gov Platform Team Install Solr Operator
  • Quota increase approved from storage-16 to storage-32

The BCGov Platform team would need to install the Solr Operator, and I am not sure of their desire to do that, although the Solr Operator does align with their model and they already install a great many other operators and provide them to the users of the BC Gov Platform.

If the platform team signals low interest or the timeline for such a feature is prohibitive, then Solr can just be installed via the Helm Chart method and run as a standalone container. However, the client should still be consulted and informed of the direction and some of the downsides.

Stack Overflow

I have created an issue asking about this over at BC Gov Stack Overflow:

The BCGov Platform team is assessing the request on June 29th and will report back.

In the meantime I have prepared Solr as a Helm Chart and am just waiting for a quota increase / approval before I can successfully deploy.

Theme build page

OP timer

https://openplus.monday.com/boards/4092908516/pulses/4450507107


This feature supports the following requirements

mvp build

Proposed solution

Lay out the build page so that:

  1. There is a section for each view mode
  2. Each of these sections shows the fields in that view mode. This should be the label and the value of the field. If there is no value, it should be the label with "optional" or "required" beside it
  3. Each section should have an edit link that looks like a button
  4. Clicking edit takes me to the form mode for that section

This will be satisfied by:

Argo CD and Helm DX Improvements

Overview

The MFIN Data Catalogue is currently deployed via Helm and Argo CD and is working great.

However there are two key DX improvements I am trying to move forward with the platform team.

Background

The two key DX improvements are as follows:

a) I would like to host the Helm Chart in Artifactory and only have the values-.yaml override file be present in the GitOps Tenant repo, so as to follow the DRY principle and, when multiple Drupal projects are set up, remove the need to copy the whole Helm Chart into each GitOps tenant repo.

b) I would like Argo CD to watch the GitOps Tenant repo so the "kind: Application" for Argo CD can be hosted in the GitOps Tenant repo, so that the whole state of the "kind: Application" is also tracked in version control rather than only being managed in the UI.

Blocker(s)

  • Host Helm Chart in Artifactory and allow Argo CD to pull from it
  • Have Argo CD watch the GitOps Tenant Repo automatically without needing to use the UI at first

I am hopeful one or both of these issues can be solved by the Platform Team, but am documenting them here for posterity.

Stack Overflow

I have asked my questions over in the BC Gov Stack Overflow.

Improve CI to run drupal-test-traits etc

OP timer

https://openplus.monday.com/boards/4092908516/pulses/6509122573


Overview

Right now the CI just performs a standard Drupal site install along with phpcs before it builds the containers and then pushes them to Artifactory.

We instead should be grabbing a minimized test DB with limited data so we can run drupal-test-traits etc in our CI and ensure all of our test cases pass in an automated fashion.

We weren't able to do this for RSAMS due to the sheer size of the DB but for MFIN Data Catalogue this should be doable.

Blocker(s)

  • Waiting for a DB to perform tests against @lkmorlan

Work

  • Retrieve a minimized / cleansed DB @sylus
  • Establish automated process to get newer minimized DB @sylus
  • Need to update phpunit.xml which references DV21 @lkmorlan

The problem becomes where to host this DB so GitHub Actions can pull it down. As a stopgap we could grab these databases from an Azure Storage Account, but something official from BCGov would be ideal.

Not sure how pressing an issue this is, but nonetheless I am documenting it.

set up subtheme

OP timer

https://openplus.monday.com/boards/4092908516/pulses/4450574939


This feature supports the following requirements

mvp build

Additional context

We need a sub-theme to work with. Even though the styles aren't in for the base theme yet, the sub-theme will make Bootstrap 5 styles available. BS5 will be used for layout. At this time there is no plan to add layout classes to bcbb_subtheme.

Proposed solution

Create a subtheme of bcbb_theme specifically for use with this project. The name of the subtheme is to be dc_theme.

Dev workflow

  • I have written functional tests against the UAT test cases
  • I have run the functional tests and they pass
  • I have written integration tests on areas that may be affected by this feature
  • I have written integration tests under conditions which the feature should not work
  • I have run the Integration tests and they pass
  • I have added a comment to this issue with steps the reviewer needs to follow when doing QA
  • I have verified that the feature is ready for QA in the development environment.

Definition of done (DoD)

  • I can see the project specific sub-theme in the appearance admin page of Drupal.

Notifications when Bookmarked datasets change

stuck Something is preventing further work on this issue pending GC Notify config #111


OP timer

https://openplus.monday.com/boards/4092908516/pulses/4608593453


depends on: #111


User story

As a user of the data catalogue, I want to know when a dataset changes so that I can make downstream updates as needed.

NOTE

We need to discuss what constitutes an update. Do I get notified for every change? This could be annoying if I'm alerted to typo fixes for example. Maybe send the email when the last updated date changes?

Based on field_modified_date as a trigger for notification we want to send an email to the subscriber and put an updated flag/icon on their dashboard where the metadata record appears. The flag/icon appearance is based on field_modified_date being more recent than last_viewed_date.

Note: field_modified_date is updated not by the system but by the metadata author, who must decide whether the changes are significant enough (possibly impacting consumers' business) that people interested in their metadata should be notified.

Email notification draft
subject: Dataset you've bookmarked has been updated
body: <Title of data set> [link] has been updated
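A minimal sketch of the trigger, assuming a custom module named mfin_dc and a hypothetical mfin_dc_subscriber_emails() helper that looks up subscribers (e.g. via flags); none of these names are final:

```php
<?php

use Drupal\node\NodeInterface;

/**
 * Implements hook_ENTITY_TYPE_update() for node entities.
 *
 * Sends the notification email when the author bumps field_modified_date.
 */
function mfin_dc_node_update(NodeInterface $node) {
  if ($node->bundle() !== 'data_set' || !$node->hasField('field_modified_date')) {
    return;
  }
  $previous = $node->original->get('field_modified_date')->value;
  $current = $node->get('field_modified_date')->value;
  if ($previous === $current) {
    return;
  }
  $mailer = \Drupal::service('plugin.manager.mail');
  // mfin_dc_subscriber_emails() is a hypothetical helper that would read the
  // subscription flags for this node and return email addresses.
  foreach (mfin_dc_subscriber_emails($node) as $email) {
    $mailer->mail('mfin_dc', 'dataset_updated', $email, 'en', ['node' => $node]);
  }
}
```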

Example reasons for a metadata author to update the field_modified_date and trigger notifications to users that have subscribed.

  • Metadata Record level:
    • Data Classification
  • Data Dictionary
    • attribute names changing
    • attributes being removed
    • data type changes

This feature supports the following requirements

alpha release

Additional context

Users need to have a way to subscribe to data-sets. A flag could work here; it should go in the UI near bookmarks. Subscribing sends an email when the data-set is updated. The same UI widget should allow people to unsubscribe. There should also be a dashboard where users can see a list of updated data-sets; from the dashboard users can stop subscribing to a data-set or dismiss the notification that it has been updated.

On the email implementation end of things, please look at this common component:
https://digital.gov.bc.ca/bcgov-common-components/notify/

Proposed solution

Any ideas you have on how the feature could be implemented.

Estimated level of effort

  • 1 day

Definition of done (DoD)

  • user receives email when a bookmarked dataset has been updated
  • indicator in dashboard when there has been an update and the last viewed date is older than the last modified date
  • indicator goes away once the item is viewed and the last_viewed is newer than last_modified

Testing

Automated functional tests

  • I have written functional tests for this feature
  • I have run the functional tests and they pass

Vault Access for Secret Handling

Overview

Anticipating the approval and installation of the ArgoCD Vault Plugin for secret handling, I wanted to create in advance the secrets in Hashicorp Vault that will eventually get interpolated via Argo CD on sync.

However, it seems I don't have access to log in, so I asked over in Rocket Chat whether I am listed as the secondary technical contact.

Background

The secrets in Hashicorp Vault can be referenced in a Helm Chart specifying the Vault along with its corresponding keys which map to secured passwords.

Argo CD through the Argo CD Vault Plugin will dynamically inject the secrets during the sync phase which allows for the Application Manifests which are stored in Git to not house any secrets.

Blocker(s)

  • Need access to Hashicorp Vault

I need access to Hashicorp Vault in order to create the secrets that will eventually be used as part of the Helm Chart deployed via Argo CD.

Rocket Chat

I have asked whether I am listed as the secondary technical contact over in RocketChat in the #devops-vault channel.

It is possible I am and it is just an issue with a missing preferred_username in my Azure AD account.


Documentation for local development environment

OP timer

https://openplus.monday.com/boards/4092908516/pulses/4708101094


This feature supports the following requirements

IMB training and support

Additional context

Prior to #46, prepare documentation on using ddev locally to manage local development for the data catalogue. Documentation will include:

  1. customization to ddev required to emulate the production environment.
  2. instructions on how to set up the DC project locally
  3. how to properly commit local dev work to the repo

Definition of done (DoD)

  • ddev documentation is included in the BC data catalogue wiki
  • any recipes for working locally

Flag issue or question with dataset to dataset custodian

OP timer

https://openplus.monday.com/boards/4092908516/pulses/4608677933


User story

As a user of a specific data-set, I want to be able to raise issues, or ask questions about the data-set, so that I can be sure the data I'm using is up to date, accurate and timely.

NOTES

The immediate solution to this is to have the author's email address readily available. There is a plan to do this in the first iteration of the dev release. We could have the NID in the email subject, though that is easily overridden by the sender.
Do we want something more advanced? As a team we need to characterise what this looks like. There should be an obvious way for users to submit a comment. Possibly a form. Custodians need to be alerted via email when there is an incoming question.
Do we want to keep a record with the dataset of incoming questions?
If so, how long to keep them?
What does the UI look like for the user?
What does the UI look like for the custodian?
What about back and forth communication (dialogue)?

This feature supports the following requirements

alpha release

Additional context

Add any other context or screenshots about the feature request here.

Proposed solution

Any ideas you have on how the feature could be implemented.

Estimated level of effort

  • 1 day
  • 2 days
  • 3 days
  • 4 days
  • 5 days

Definition of done (DoD)

  • criteria 1
  • criteria 2
  • criteria 3

Testing

Automated functional tests

  • I have written functional tests for this feature
  • I have run the functional tests and they pass

Automated site tests

  • I have written site tests for this feature
  • I have run the site tests and they pass

This feature requires manual testing

  1. first test step …
  2. second test step …
  3. etc …

Improve Docker Scaffold against internal .docker provided by BCGov

Overview

Currently we have container images deployed to Artifactory using docker-scaffold and launched in OpenShift via Argo CD.

However I'd still like some time to compare against BCGov's provided .docker folder which was part of their original deployment. We mentioned we would take any novel features and incorporate them into our standard approach. I just wanted to perform some due diligence in this area in case there is anything we can learn from.

Blocker(s)

N/A

Work

TBD

Bulk upload for Data Dictionary data

OP timer

https://openplus.monday.com/boards/4092908516/pulses/4606147836


User story

As a metadata provider, I want to have a way to bulk upload Data Dictionary data (for columns in my dataset), so that I save time on data entry and reduce the risk of human error during input.

This feature supports the following requirements

alpha release

Additional context

When creating a data-set, there is potential for there to be more columns in the data dictionary than is reasonable to enter manually. A way to manage columns in bulk is necessary.

On the data-set content type, there is a field called field_columns that accepts unlimited values. This is an entity reference to the paragraph type data_column which has several fields where a user can enter values. There are several different field formats available in a data column.

We are looking for a way to bulk import data into these columns.

Proposed solution

TBD by developer.

Note that we do something similar on RSAMS. Jose has experience doing some advanced work with csv.
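As a rough illustration only — the CSV layout and the field names on the data_column paragraph type below are assumptions — an import could create one paragraph per CSV row and attach it to field_columns:

```php
<?php

use Drupal\node\Entity\Node;
use Drupal\paragraphs\Entity\Paragraph;

/**
 * Creates one data_column paragraph per CSV row and attaches it to
 * field_columns on the given data set node.
 */
function mfin_dc_import_columns(Node $data_set, string $csv_path): void {
  $handle = fopen($csv_path, 'r');
  fgetcsv($handle); // Skip the header row.
  while (($row = fgetcsv($handle)) !== FALSE) {
    [$name, $type, $description] = $row;
    $paragraph = Paragraph::create([
      'type' => 'data_column',
      // Field names below are illustrative; map them to the real
      // data_column paragraph fields.
      'field_column_name' => $name,
      'field_metadata_type' => $type,
      'field_column_description' => $description,
    ]);
    $paragraph->save();
    $data_set->get('field_columns')->appendItem($paragraph);
  }
  fclose($handle);
  $data_set->save();
}
```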

Definition of done (DoD)

  • csv file can be used to import information into data-set
  • csv columns map correctly to fields in dataset columns

Automated functional tests

  • I have written functional tests for this feature
  • I have run the functional tests and they pass

Automated site tests

  • I have written site tests for this feature
  • I have run the site tests and they pass

This feature requires manual testing

No

Update interval

OP timer

https://openplus.monday.com/boards/4092908516/pulses/4608608231


depends on: #111


User story

As a data custodian, I want to be alerted when it is time to update a dataset so that I can make sure information is up to date and accurate for users of the system.

As a data catalogue editor, I want to know when a dataset is supposed to be updated so I know the information is timely.

Idea:
When editing a metadata record, include an unchecked checkbox for the author: "Check this if this is a complete review." The checkbox does not set a value; it is just a trigger for other things.

data_review_date field will be updated based on the checkbox activity.

duration field; defaults to 1 year but can be changed to 3 months, 6 months, 1 year, etc. (can be a taxonomy with number, units)

Send a notification email to the author 15 days ahead of the date so they have time to action it. Please make the number of days configurable by a Drupal admin user.

Visual flag in dashboard as well.
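A minimal cron sketch under these assumptions: a custom module mfin_dc, a field_data_review_date date field, and an mfin_dc.settings config object holding the configurable number of days (all names are placeholders). A real implementation would also need to avoid re-sending the same reminder on every cron run.

```php
<?php

use Drupal\node\Entity\Node;

/**
 * Implements hook_cron().
 *
 * Emails authors whose data sets are due for review within the configured
 * notice period.
 */
function mfin_dc_cron() {
  $days = \Drupal::config('mfin_dc.settings')->get('review_notice_days') ?: 15;
  $threshold = date('Y-m-d', strtotime("+{$days} days"));

  $nids = \Drupal::entityQuery('node')
    ->condition('type', 'data_set')
    ->condition('status', 1)
    ->condition('field_data_review_date', $threshold, '<=')
    ->accessCheck(FALSE)
    ->execute();

  $mailer = \Drupal::service('plugin.manager.mail');
  foreach (Node::loadMultiple($nids) as $node) {
    $owner = $node->getOwner();
    $mailer->mail('mfin_dc', 'review_due', $owner->getEmail(), $owner->getPreferredLangcode(), ['node' => $node]);
  }
}
```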

This feature supports the following requirements

alpha release

2 days

Amendment

This feature assumes that there will never be a high volume of alerts being sent out. Our assumption is less than 1000 every week. If it turns out that there is a high volume of alerts being emailed, a queue worker will need to be implemented. Documentation about this potential issue needs to be added to the Features page of the Wiki. Include the symptoms that will occur if there are more requests than the system can handle and what will need to be done to fix the situation.

Definition of done (DoD)

  • editors can select this is a full review
  • editors can set an update interval in months
  • when a record is updated, I see a label in my dashboard that it's been updated if I have it bookmarked.
  • When I look at a record after it has been updated, the update label goes away
  • an alert appears on a record when a review is pending
  • an alert appears on a record when an update is overdue
  • in the dashboard, a badge tells me an update is pending
  • in the dashboard, a badge tells me the update is overdue
  • Every Sunday I receive an email telling me what needs to be reviewed within the next 30 days and what is overdue
  • There is documentation in the wiki explaining what will happen if cron sends too many alerts and how it needs to be remediated if this happens

Testing

Automated functional tests

  • I have written functional tests for this feature
  • I have run the functional tests and they pass

Iterate on the MFIN Catalogue Content Architecture document

The MFIN Catalogue Content Architecture document could be updated.

There is more info available about Unique Identifier, for example, and I think we could supply better taxonomic example values in some cases.

Note: one cannot edit this document (including commenting) unless logged in with a Google account.


User story

As a Product Owner I want to evolve the MFIN Catalogue Content Architecture document so that it serves as a solid point of reference for all parties involved in the product's development, and so that we can implement a baseline in the MVP and iterate from there.

Actions

  • Meeting (internal or internal + OpenPlus?)

Subscribe to search criteria with notifications when search changes

OP timer

https://openplus.monday.com/boards/4092908516/pulses/4608223347


User story

As a user of the data catalogue, I want to be notified when data sets relevant to my work change, are removed or added so that I can be sure the data I'm working with is up to date and accurate.

This feature supports the following requirements

alpha release

Additional context

The user needs to be notified on dashboard and via email when a search changes. The user also needs a way to unsubscribe from searches.

Proposed solution

On the dashboard, there should be a section for saved searches. There should also be an indicator that the search results have changed, a way to stop receiving alerts and a way to dismiss the change alert.

Saved search is done (#56, #92)

Reference: EVRI (DV10) does something similar.

NOTES

Instead of dismissing alerts, maybe show the date the search last changed? Changed since last visit? Need ideas for how this functions.
Is it possible to get some sort of diff of what the change was? I.e. ABC has been updated, DEF has been removed, GHI has been added.

Dev workflow

  • I have written functional tests against the UAT test cases
  • I have run the functional tests and they pass
  • I have written integration tests on areas that may be affected by this feature
  • I have written integration tests under conditions which the feature should not work
  • I have run the Integration tests and they pass
  • I have added a comment to this issue with steps the reviewer needs to follow when doing QA
  • I have verified that the feature is ready for QA in the development environment.

Definition of done (DoD)

TBD

  • TBD

Dataset node view functionality

OP timer

https://openplus.monday.com/boards/4092908516/pulses/4614497895


This feature supports the following requirements

alpha release

Additional context

See the DoD list for expected behaviours on the node view for a dataset.

Definition of done (DoD)

  • Name of data custodian (author) links to search results page of all datasets maintained by that person
  • field_primary_responsibility_org links to search results page of all datasets for the Office of Primary Responsibility
  • field_series links to search results page of all datasets in the series
  • field_source_system links to search results page of all datasets in the source system

Add Solr search

OP timer


User story

As an anonymous user of the system, I want to have a fast, feature rich search experience so that I can quickly find data sets most relevant to my needs.

This feature supports the following requirements

alpha release

Additional context

It was decided that since this will be a data-heavy application, Solr is the best tool to handle search. This requires additional config so the Search API module can work with Solr.

Note: related issue #42

Proposed solution

Install Search API Solr and configure as necessary. This should be included in the base-build.

Security considerations

Drupal will need to store authentication keys to access the solr core(s).

  1. make sure no keys are stored in the db that could get exported during drush cex
  2. on OpenShift dev and prod environments, keys should be stored in a vault in the environment and accessed by Drupal
  3. on the OpenPlus dv server, store the keys in settings.php (see the sketch after this list)
  4. if you have a local install of an app using solr, ddev for example, be sure keys are not stored in the db. Use settings.php
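For items 3 and 4, a settings.php override keeps the credentials out of the database and out of exported config; the server machine name and connector keys below are assumptions and must match the actual Search API server configuration.

```php
// Appended to settings.php so Solr credentials never land in the database
// or in exported configuration (drush cex). The server machine name ('solr')
// and connector keys are assumptions; match them to the real Search API
// server config.
$config['search_api.server.solr']['backend_config']['connector_config']['username'] = getenv('SOLR_USERNAME');
$config['search_api.server.solr']['backend_config']['connector_config']['password'] = getenv('SOLR_PASSWORD');
```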

Definition of done (DoD)

  • Solr module is installed
  • Solr module successfully connects to a solr core on dv21
  • Solr module successfully connects to a solr core on dev
  • Solr module successfully connects to a solr core on prod

Testing

Automated functional tests

  • I have written functional tests for this feature
  • I have run the functional tests and they pass

Automated site tests

  • I have written site tests for this feature
  • I have run the site tests and they pass

This feature requires manual testing

  1. first test step …
  2. second test step …
  3. etc …

ArgoCD Vault Plugin for Secret Handling

Overview

I am just filing this issue so we can track the installation of the Argo CD Vault Plugin configured with an associated Argo CD instance.

The Platform team uses the gitops operator and it looks like this is a solved problem for installing the Argo CD Vault Plugin.

We don't want to have secrets in the tenant-gitops-***** repo; instead the values are stored in Hashicorp Vault and only get interpolated during the sync / reconciliation phase of Argo CD. This keeps secrets out of our Application Manifests repo and is considered a best practice.

Background

In my job as a senior Technical Architect on the Platform Team at Statistics Canada, we use both Argo CD and the Argo CD Vault Plugin, connected to Azure Key Vault and also to Hashicorp Vault (which BCGov uses), to great effect.

While the issue I posted above with the gitops operator describes what needs to be done, I also posted the spec we used in the comments below in case there is any difference.

Blocker(s)

  • BC Gov Platform Team Install Argo CD Vault Plugin

The BCGov Platform team would need to install the Argo CD Vault Plugin and configure it to be linked with the Argo CD instance dedicated to a project.

I was talking with Ian Watts who mentioned the following:

"When we first introduced the shared instance of ArgoCD for teams to use, it wasn't possible to use the Vault plugin. I'll check my notes on that, though. Now that we have migrated to the GitOps Operator, it's worth another look."

Stack Overflow

I have created an issue asking about this over at BC Gov Stack Overflow:

Set up dashboard

OP timer

https://openplus.monday.com/boards/4092908516/pulses/4450022523


This feature supports the following requirements

Additional context

The dashboard is where most users will be working with the system. For the MVP launch, we need to have something presentable. Note that this is likely to change once the UX work is done. That being said, it needs to be something that makes sense to users from the first iteration.

Proposed solution

The dashboard should reside at /data-set and contain the following:

A way to add a new data-set

Have a "Add new data set" link that goes to the a form mode on node/add.

Data sets being worked on

There needs to be a table of data sets that belong to the user. It should only show data sets currently being worked on.

TBD: should the table show data sets the user is working on, or should it be data-sets their division is working on?

Initially the table should show the data set title (not clickable) and an actions column that has view and build links. These links should appear as buttons, but not be buttons.

Additional fields in the dashboard TBD

Bookmarks

Give users a way to bookmark data sets. There should be a way for users to flag data sets to bookmark. Bookmarked data-sets should show up in a block on the dashboard.

TBD - what happens to layout if a user has a lot of bookmarks?

Theme

Currently the theme is not implemented. However, we can use bootstrap for initial layout.
See the doc

For reference, use rsams.

screenshot of RSAMS dashboard

Dev workflow

  • I have written functional tests against the UAT test cases
  • I have run the functional tests and they pass
  • I have written integration tests on areas that may be affected by this feature
  • I have written integration tests under conditions which the feature should not work
  • I have run the Integration tests and they pass
  • I have added a comment to this issue with steps the reviewer needs to follow when doing QA
  • I have verified that the feature is ready for QA in the development environment.

Definition of done (DoD)

  • user can create a new data-set
  • user can bookmark a data set
  • user can see the bookmarks on the dashboard
  • user sees a table of data sets being worked on
  • user can use links in the action column to view or build the data set

Configure path auto

OP timer

https://openplus.monday.com/boards/4092908516/pulses/4450399894


This feature supports the following requirements

mvp

Proposed solution

Configure pathauto. For now, use content type (data-set) and title for data sets and title for pages.

Dev workflow

  • I have written functional tests against the UAT test cases
  • I have run the functional tests and they pass
  • I have written integration tests on areas that may be affected by this feature
  • I have written integration tests under conditions which the feature should not work
  • I have run the Integration tests and they pass
  • I have added a comment to this issue with steps the reviewer needs to follow when doing QA
  • I have verified that the feature is ready for QA in the development environment.

Definition of done (DoD)

  • data set paths look like /data-set/title-of-my-data-set
  • page paths look like /title-of-my-page

Update custom theme

OP timer


This feature supports the following requirements

mvp build

Additional context

A theme info file was added, but it is not inheriting from parent themes properly.

Proposed solution

Rebuild the custom theme using the sub-theme builder provided by BS5

Dev workflow

  • I have verified that the feature is ready for QA in the development environment.

Test steps

  • Ensure dc_theme is the active theme
  • Go to the home page of the site
  • make sure browser is in light mode
  • see that the background is light and the text is dark
  • switch browser to darkmode
  • see that the background is dark and the text is light

NOTE:
This test works because the base-build theme uses a media query to activate dark mode; none of the other themes currently do.
You will see some text that is not properly coloured on a dark background. This is because the base theme is a work in progress.

Definition of done (DoD)

  • dc_theme is enabled and set as default theme in Drupal
  • on visual inspection, sub-theme is inheriting styles from parent

Dashboard for non-Metadata Authors

OP timer
https://openplus.monday.com/boards/4092908516/pulses/4728242416


User story

As an authenticated user (not a Metadata Author), I want to view and manage my bookmarks so that I can unbookmark, sort and filter my bookmarked datasets.

Note: bookmarks are recreated as a separate task here: #91

As an authenticated user (not a Metadata Author), I want to subscribe to data sets via my bookmarked datasets so that I am notified on my dashboard when a dataset has been modified.

Note: this is a duplicate of #26

As an authenticated user (not a Metadata Author), I want to also be emailed when any of my subscribed datasets has been modified.

Note: Email notifications are part of #26

As an authenticated user (not a Metadata Author), I want to view a list of my "Saved searches" so that I can manage them, including deleting them.
Satisfied by #56 , #92

This feature supports the following requirements

Additional context

Note: Some of this work has been done and needs to be characterized here. Also, no UI for the dashboard has been developed yet. Maybe stick bookmarks in a table for now.

Proposed solution

Estimated level of effort

  • 1 day
  • 2 days
  • 3 days
  • 4 days
  • 5 days

Definition of done (DoD)

  • criteria 1
  • criteria 2
  • criteria 3

Testing

Automated functional tests

  • I have written functional tests for this feature
  • I have run the functional tests and they pass

Automated site tests

  • I have written site tests for this feature
  • I have run the site tests and they pass

This feature requires manual testing

  1. first test step …
  2. second test step …
  3. etc …

Configure date formats

OP timer

https://openplus.monday.com/boards/4092908516/pulses/4450344874


This feature supports the following requirements

MVP build

Proposed solution

Configure date formats to YYYY-MM-DD ('Y-m-d').
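This is normally done through the admin UI or exported config, but for reference a minimal programmatic sketch (the format ID and label are arbitrary) would be:

```php
<?php

use Drupal\Core\Datetime\Entity\DateFormat;

// Creates a reusable date format whose pattern renders as YYYY-MM-DD,
// e.g. 2023-05-10. The id and label are arbitrary placeholders.
DateFormat::create([
  'id' => 'mfin_iso_date',
  'label' => 'ISO date (YYYY-MM-DD)',
  'pattern' => 'Y-m-d',
])->save();
```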

Dev workflow

  • I have written functional tests against the UAT test cases
  • I have run the functional tests and they pass
  • I have written integration tests on areas that may be affected by this feature
  • I have written integration tests under conditions which the feature should not work
  • I have run the Integration tests and they pass
  • I have added a comment to this issue with steps the reviewer needs to follow when doing QA
  • I have verified that the feature is ready for QA in the development environment.

Definition of done (DoD)

  • Dates appear as YYYY-MM-DD, i.e. 2023-05-10

Add modified_date to search facets

OP timer

https://openplus.monday.com/boards/4092908516/pulses/4658082673


Add last updated date as a facet to the search page. This should display as a list of years (YYYY).
Clicking on a year exposes a list of months followed by the year, using the full month name, e.g. January 2023, February 2023, etc. We do not need day granularity. Show a results count beside each facet.

RSAMS search is a reference for how this should work.

Definition of done (DoD)

  • I see date facets on search page for last updated date
  • Clicking a year, I can refine to the month level
  • I can remove facets I've selected

Testing

Automated functional tests

  • I have written functional tests for this feature
  • I have run the functional tests and they pass

Automated site tests

  • I have written site tests for this feature
  • I have run the site tests and they pass

This feature requires manual testing

  1. first test step …
  2. second test step …
  3. etc …

Search facets are available on the right of the page

OP timer

https://openplus.monday.com/boards/4092908516/pulses/4614523951


This feature supports the following requirements

alpha release

Definition of done (DoD)

  • Users should be able to turn facets on and off.
  • Each set of facets appears in a block with the appropriate heading
  • The title "Current search filters" is not visible unless a facet is selected

The following need to be available as facets:

  • node author
  • field_primary_responsibility_org
  • field_series
  • field_source_system
  • field_used_in_products
  • field_metadata_type (this is in field_columns, a paragraph)

Testing

Automated functional tests

  • I have written functional tests for this feature
  • I have run the functional tests and they pass

Automated site tests

  • I have written site tests for this feature
  • I have run the site tests and they pass

This feature requires manual testing

  1. first test step …
  2. second test step …
  3. etc …

set up initial role and permissions

OP timer

https://openplus.monday.com/boards/4092908516/pulses/4450758421


--

This feature supports the following requirements

mvp build

Proposed solution

Taxonomy based access control

  1. Install tac_lite
  2. D10 is currently not supported, but there is a patch for that

Roles

Create roles:

  1. Data administrator
  2. Data custodian
  3. Data catalogue user

auto assign roles

Use the registration role module and assign all new users the role Data catalogue user.

Note that this module does not have a D10 release; we may need to solve this.
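If the registration_role module turns out not to be usable on D10, a small fallback hook could assign the role at registration time (the module name and role machine name below are assumptions):

```php
<?php

use Drupal\user\UserInterface;

/**
 * Implements hook_ENTITY_TYPE_presave() for user entities.
 *
 * Gives every newly registered account the default catalogue role.
 */
function mfin_dc_user_presave(UserInterface $account) {
  if ($account->isNew() && !$account->hasRole('data_catalogue_user')) {
    $account->addRole('data_catalogue_user');
  }
}
```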

Dev workflow

  • I have written functional tests against the UAT test cases
  • I have run the functional tests and they pass
  • I have written integration tests on areas that may be affected by this feature
  • I have written integration tests under conditions which the feature should not work
  • I have run the Integration tests and they pass
  • I have added a comment to this issue with steps the reviewer needs to follow when doing QA
  • I have verified that the feature is ready for QA in the development environment.

Definition of done (DoD)

  • When a new user is created, they are automatically assigned the role Data catalogue user

Meeting to discuss review of Standard Operating Procedures document

This feature supports the following requirements

Meeting to review in-progress Standard Operating Procedures (SOP) document.

Additional context

A Standard Operating Procedures document is under development to ensure that the CabOps & FINtranet Drupal apps are regularly backed up and restorable through processes outlined in the document. This ensures that anyone responsible for such operational tasks has the standard document to guide them when needed.

Definition of done (DoD)

  • SOP drafted
  • SOP circulated for review
  • meeting to discuss the SOP

@eloynav @CraigClark @lkmorlan @sylus @sangtrinh

Subscribe to dataset

OP timer

https://openplus.monday.com/boards/4092908516/pulses/4608587659


User story

As a person responsible for

I want to …

So that …

This feature supports the following requirements

Describe or link to the requirement

Additional context

Add any other context or screenshots about the feature request here.

Proposed solution

Any ideas you have on how the feature could be implemented.

Estimated level of effort

  • 1 day
  • 2 days
  • 3 days
  • 4 days
  • 5 days

Definition of done (DoD)

Testing

Automated functional tests

  • I have written functional tests for this feature
  • I have run the functional tests and they pass

Automated site tests

  • I have written site tests for this feature
  • I have run the site tests and they pass

This feature requires manual testing

  1. first test step …
  2. second test step …
  3. etc …

Data lineage maps

OP timer

https://openplus.monday.com/boards/4092908516/pulses/4606013433


User story

As a user of the data-catalogue, I want to know how data is being used so that I know what products may be affected by changes to the data source.

This screenshot is from the mural board used in the co-creation workshop and relates to this feature.

This feature supports the following requirements

alpha release

Additional context

There is a field called Used in products (field_used_in_products) that can be used in the implementation of this feature. The value(s) for this field show where this data has been used. The data dictionary is the data model. One of the columns has a field called Column transformations (field_column_transformations) and the data set itself has fields for Data set description (body) and Data set historical change (field_data_set_historical_change). With all that taken together, viewing the product will give a list of data-sets used in the product, viewing one of those data sets provides the data model and transformations.

Proposed solution

On the node view of a data set, link Used in products to a search results page that displays the facets for the products it is used in.

This can go the other way as well, by having products as facets, users can click on a product and see all the data sets used in it.


UPDATE SEPTEMBER 15 2023

This was discussed in a meeting between Craig and Nicole on September 15.

Currently, Used in products is a taxonomy. In the sample data we were sent, the product was a dataset. For example, we have the following data sets:

  1. BCA Folio Address
  2. BCA Values
  3. BCA folio description
  4. Home Owners Grant Dashboard

Items 1 to 3 are used by 4 (Home Owners Grant Dashboard). The Home Owners Grant Dashboard doesn't work as both a taxonomy term and a dataset.

We could get rid of vocabulary data_product. Then the field field_used_in_products on the data set becomes an entity reference to a view of data sets. This would be an auto-complete list. The node you are currently on would be excluded from the view. Then a data set could also be used as a product.

We would need to find a way to use the view as a facet. The view would need to show all nodes that are used in field_used_in_products, reducing duplicates.

On the node view, the value for field_used_in_products would need to link to a search results page that returns the search results based on what is in the field.


NOTE
TBD: currently we have products as a facet. If there are a lot of products, we will need another feature to sort out a UI, as we can't have long columns for facet lists.

Amendment October 2, 2023

We have implemented a two-way reference. When a user creates a data set that uses other data sets, they have an auto-complete box to select the used data sets. This will list the data sets used on the data set being created. A view on the data set that is referenced shows a list of data sets that use it.

A data set could potentially have two fields. A list of data sets that it uses, and a list of data sets that use it.

Definition of done (DoD)

  • as an editor, I can start typing the title of a data set in a Uses data sets field. Based on what I type, I can select a data set
  • As an editor, I can add as many data sets as I need in the Uses data sets field.
  • As a data catalogue user, on the node view I see a list of data sets this data set uses and/or I can see a list of data sets used by this data set
  • all referenced data sets are links

Testing

Automated functional tests

  • I have written functional tests for this feature
  • I have run the functional tests and they pass

Active directory should be used to control access / levels

QUESTIONS:

  1. Is Active Directory the right technology, or should this be Office 365?
  2. Is the OpenShift stack able to connect to an Azure AD / Office 365 endpoint for authentication?
  3. Do we have enough details in the response to place users into specific organizations and to put the correct access controls in place?

OP timer


User story

As a …

I want to …

So that …

This feature supports the following requirements

Additional context

Proposed solution

Estimated level of effort

  • 1 day
  • 2 days
  • 3 days
  • 4 days
  • 5 days

Definition of done (DoD)

  • criteria 1
  • criteria 2
  • criteria 3

Testing

Automated functional tests

  • I have written functional tests for this feature
  • I have run the functional tests and they pass

Automated site tests

  • I have written site tests for this feature
  • I have run the site tests and they pass

This feature requires manual testing

  1. first test step …
  2. second test step …
  3. etc …

Add age of metadata records to dashboard

OP timer

https://openplus.monday.com/boards/4092908516/pulses/4608608231


User story

As a Metadata Author, I want to have an overview of the metadata records I'm responsible for
so that I can be aware of records that are outdated.

This feature supports the following requirements

alpha release

Proposed solution

Create a view on the dashboard at /user of the data sets where author matches the logged in user. The view should initially have the following:

  • title, plain text, no link
  • last_updated_date (oldest to newest); last updated is set manually whereas Drupal has last_modified_date which is updated every time you make a minor change; the date and the relative date (calculated - e.g. 2 years 3 months ago)
  • under a column called actions, have links to view or build the data set

Call the view 'My published metadata'

put it under the existing 'My unpublished metadata'

Estimated level of effort

  • 1 day
  • 2 days
  • 3 days
  • 4 days
  • 5 days

Definition of done (DoD)

  • there is a dashboard view at /user that shows the data sets I have published, with the last_updated_date

Testing

Automated functional tests

  • I have written functional tests for this feature
  • I have run the functional tests and they pass

Automated site tests

  • I have written site tests for this feature
  • I have run the site tests and they pass

This feature requires manual testing

  1. first test step …
  2. second test step …
  3. etc …

Configure search

Base-build candidate This feature should be evaluated as a potential asset to the base-build

OP timer

https://openplus.monday.com/boards/4092908516/pulses/4580898765


This feature supports the following requirements

mvp build

Proposed solution

Set up search using RSAMS search as a guide.

This will require:

  • Search API
  • View
  • Page mode
  • Facets
  • .csv data export

See gc_ext search module as reference.

Dev workflow

  • I have written functional tests against the UAT test cases
  • I have run the functional tests and they pass
  • I have written integration tests on areas that may be affected by this feature
  • I have written integration tests under conditions which the feature should not work
  • I have run the Integration tests and they pass
  • I have added a comment to this issue with steps the reviewer needs to follow when doing QA
  • I have verified that the feature is ready for QA in the development environment.

Definition of done (DoD)

  • There is a search page at /search/site
  • #36
  • Results are filtered to the terms entered in the search box
  • #22
  • #33
  • #39
  • There is a reset button that visually indicates I'll lose my results if I select it

SSO with IDIR for metadata providers and other users

OP timer

https://openplus.monday.com/boards/4092908516/pulses/4608683215


User story

As a user of the Finance Data Catalogue, I want to login with my common Government account (IDIR) credentials so that the experience is consistent with other Government systems.

As a Product Owner for the Finance Data Catalogue, I want users to be able to log in to the product so that they can actively test what is being delivered, so that the feedback loop is short and we can pivot if needed.

This feature supports the following requirements

alpha release

Additional context

The o365 module has been installed.

The development site is at http://mfin-data-catalogue.apps.silver.devops.gov.bc.ca

ToDo

  • install modules
  • get client id
  • get client secret
  • get tenant ID
  • configure to work on azure cloud
  • tests

Estimated level of effort

  • 2 days

Definition of done (DoD)

  • MFIN staff can log into site using their o365 credentials

Testing

Automated functional tests

  • I have written functional tests for this feature
  • I have run the functional tests and they pass

Automated site tests

  • I have written site tests for this feature
  • I have run the site tests and they pass

This feature requires manual testing

  1. first test step …
  2. second test step …
  3. etc …

Scan container images for vulnerabilities

Overview

While the containers we build as part of docker-scaffold are run in restricted and custom security contexts under OpenShift, we still need to ensure that critical and high vulnerabilities don't make their way into the container images.

Note: We only extend the official Drupal container images provided by Docker and only reference major.minor specificity for our base images.

Blocker(s)

  • XRay Scanning issues

Currently when I am logged into Artifactory it looks like XRay is not functioning and not performing vulnerability scans.

Work

I think the plan is to use Trivy in CI, which will attach vulnerability assessments to builds; however, we will also trigger XRay to scan the container images so that the BCGov Platform team can also be made aware of any security events related to them.

Implement docker scaffold

Overview

Provide a baseline CI that builds the MFIN Data Catalogue Containers.

Work

  • Setup with Docker Scaffold 10.1.x
  • Added Script Handler from Drupal Project
  • Added load.environment.php from Drupal Project
  • Symlinked the Docker Scaffold files
  • Created the Makefile + .env file
  • Setup basic CI (standard composer install, and drush si minimal)
  • Initial build is green and passing

Note: I built against PHP 8.2, so please check that composer.lock looks good and works for your local dev server.

Versioning of metadata entries

OP timer

https://openplus.monday.com/boards/4092908516/pulses/4606168984


User story

As a data custodian I want to see previous versions of the data-set so that I can compare changes over time. I also want to be able to restore to a previous version of the data-set.

This feature supports the following requirement

alpha release

Additional context

Dev workflow

  • I have written functional tests against the UAT test cases
  • I have run the functional tests and they pass
  • I have written integration tests on areas that may be affected by this feature
  • I have written integration tests under conditions which the feature should not work
  • I have run the Integration tests and they pass
  • I have added a comment to this issue with steps the reviewer needs to follow when doing QA
  • I have verified that the feature is ready for QA in the development environment.

Definition of done (DoD)

  • saving a change to a data-set creates a new revision under the revisions tab
  • clicking a date of a revision shows the how the data-set looked at the time of revision
  • custodians can compare two versions of the data-set to see how they differ
  • custodians can revert to a previous version of the data-set

Capture level of effort on tickets

Idea Park your ideas here, they may become features!


User story

As a project manager, I want to know how long features will take to implement so that I can help advance team efforts by knowing how many tasks I can fit in a sprint.

Describe the idea

We would have to discuss how this would work. It can be hard for developers to estimate specifically how much time a task will take to complete. Maybe some sort of rating system: trivial | easy | moderate | difficult.

Actions

  • Hooray! I'm going to be a feature!

Gather sample data for testing

This feature supports the following requirements

Enable realistic testing of addition/edit/presentation of content in the UI, etc.

Additional context

Work with business reps to create a sample data file representing the breadth of the kinds of things we might encounter in the real world. Build it with relatively correct lengths for fields, etc., not junk test data.

Proposed solution

Gather and submit the test data to the dev team.

Definition of done (DoD)

  • file attached to ticket
