
daf-dataportal's Introduction

DAF Dataportal for Piattaforma Digitale Nazionale Dati (PDND), previously DAF

DAF PDND LOGO

Build Status

The DAF Dataportal is the front-end project of the PDND portal, available at this link.

All the documentation and the user manual can be found on this page of the Docs Italia portal.

What is the PDND (ex DAF)?

PDND stands for "Piattaforma Digitale Nazionale Dati" (the Italian Digital Data Platform), previously known as Data & Analytics Framework (DAF).

You can find more information about the PDND on the official Digital Transformation Team website.

What is DAF Dataportal?

DAF Dataportal is the user interface of the PDND, the Italian Digital Data Platform.

The project combines the public site and the private dashboard. Within the public site it is possible to access and navigate the OpenData and public content created and published through the PDND.

Public site page

The private section lets users contribute by creating new datasets (if the user belongs to one of the Public Organizations participating in the platform) or by writing new datastories.

Private site page

DAF Dataportal is the entry point of the Piattaforma Digitale Nazionale Dati: the place to consult dataset information and use cases, and to create new content (datastories and datasets).

The master branch contains the code for the production release (reachable here). All development starts from the dev branch: work happens on feature branches, which are merged first into dev and then, after testing and review, into master.
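
In practice the branch flow looks like this (a sketch; the feature branch name is illustrative):

git checkout dev
git checkout -b feature/my-feature
# work and commit, then open a pull request targeting dev;
# after testing and review, dev is merged into master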

Project components

This project depends on the following components.

  • React version 16.8.6, available here.

  • Redux version 4.0.1, available here.

  • Webpack version 4.29.6, available here.

Related PDND Services

  • DAF Catalog Manager available here
  • DAF Security Manager available here
  • DAF Dataportal Backend available here

How to install and use DAF Dataportal

Clone the project

git clone https://github.com/italia/daf-dataportal.git
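
Then move into the project directory:

cd daf-dataportal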

Configure your local environment

To make the magic happen, the following are required:

  • Node.js
  • npm

You can install them by following this guide.

Add the following hostnames to your hosts file:

localhost.dataportal.daf.teamdigitale.it

localhost.dataportal.daf.teamdigitale.test
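
These hostnames must resolve to your own machine; a sketch of the resulting hosts-file entries, assuming the dev server runs locally (the file is /etc/hosts on Linux/macOS, C:\Windows\System32\drivers\etc\hosts on Windows):

127.0.0.1   localhost.dataportal.daf.teamdigitale.it
127.0.0.1   localhost.dataportal.daf.teamdigitale.test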

Install all packages and dependencies

npm install

Run the app

npm start

If you are on the master branch, point your browser at localhost.dataportal.daf.teamdigitale.it; otherwise, localhost.dataportal.daf.teamdigitale.test will work for you.

How to build and test DAF Dataportal


There are two ways to build the project. The first (and simpler one) is to run the following command in a terminal:

npm run build  

The second (and geekier) is based on Docker (you can find more information here):

sudo docker build --no-cache -t <YOUR_DOCKER_IMAGE_NAME> .
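
Once built, the image can be run locally; a sketch, assuming the container serves the app on port 80 (check the project's Dockerfile for the actual exposed port):

sudo docker run --rm -p 8080:80 <YOUR_DOCKER_IMAGE_NAME>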

NOTE: to work properly, the application needs a running instance of the related PDND services (see the related paragraph above). The endpoints for all services and APIs are defined in src/config/serviceurl.js and can be edited to target the correct endpoints of the services above; otherwise the application will target the default production endpoints.
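
As a hypothetical sketch, the endpoint map in src/config/serviceurl.js might look like this (keys and URLs are illustrative only, not the actual file contents; check the real file for the actual names):

export default {
  // point these at your local instances of the related PDND services
  // instead of the default production endpoints
  catalogManagerUrl: 'http://localhost:9001/catalog-manager',
  securityManagerUrl: 'http://localhost:9002/security-manager',
  dataportalBackendUrl: 'http://localhost:9003/dataportal-backend'
}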

How to contribute

Contributions are welcome. Feel free to open issues and submit a pull request at any time, but please read our handbook first.

License

Copyright (c) 2019 Presidenza del Consiglio dei Ministri

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License along with this program. If not, see https://www.gnu.org/licenses/.

daf-dataportal's People

Contributors

aijanai, alranel, and0111, atrosino, bb64bit, crismon-01, cristofani, giux78, luca-pic, lucarducci, mariaclaudia, raippl, ruphy


daf-dataportal's Issues

[Feature]: Specify which information is mandatory

Feature Request

I suggest specifying which fields must necessarily be filled in, and which have no effect so far. This would simplify the process the user has to go through.

In general, it might be useful to give a short definition of each field, with particular attention to the definitions of Dominio and Sottodominio.

[Data Stories]: the page "Le mie storie" contains stories that are not mine

Opening "Storie" → "Le mie storie" ("My stories") shows stories that were not created by the logged-in user; it seems all existing data stories are listed.

Subject of the issue

There is a page titled "Le mie storie", but the stories shown there are not mine.

Your environment

  • Google Chrome - Version 64.0.3282.167 (Official Build) (64-bit)

Steps to reproduce

  1. Choose Storie in the side menu
  2. Look at the page

Expected behaviour

From the title one infers that the page gathers the stories created by the user or by the organization they belong to.

Actual behaviour

The user sees stories that don't belong to them.

Insert a filter to search slices of superset

When you are creating a new dashboard in DAF, it would be very useful to be able to filter the slices created by Superset. Currently the system displays all slices created in DAF, which is very slow and CPU-intensive. I suggest adding a filter form that generates a list with these fields (see the sketch below):

  • owner of the slice
  • title of the slice
  • type of the slice
  • screenshot of the slice
  • link to a preview of the slice
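A minimal sketch of the client-side filtering this request describes (the slice field names below are illustrative, not the actual Superset schema):

const filterSlices = (slices, query) =>
  slices.filter(s =>
    (!query.owner || s.owner === query.owner) &&
    (!query.title || s.title.toLowerCase().includes(query.title.toLowerCase())) &&
    (!query.type || s.vizType === query.type))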

Create a "Medium"-style editor for stories

The story-creation form can be improved; it would be nice if it adopted a more modern look. One inspiration in this regard is the medium.com editor: play with it and take a look at what you can do.

Links for inspiration:

Error in the STORIE description on the home page (HP)

The home page description of STORIE reads: "Le storie sono articoli scritti da noi e dalla nostra community di esperti: partendo dai dati, interpretiamo il mondo e aiutiamo la socità a prendere decisioni basate sui fatti" ("Stories are articles written by us and by our community of experts: starting from data, we interpret the world and help society make fact-based decisions"). Note the typo "socità", which should be "società".

The system duplicates content boxes during dashboard editing

This anomaly unfortunately does not always happen with the same steps, so it is difficult to reproduce the exact sequence. However, the problem usually occurs during these activities:

  • inserting a new box in a DAF dashboard;
  • dragging a slice from one side of the dashboard to another.

[Ingestion form]: I set the type of a date column to timestamp; if Kylo doesn't recognize the format, the file is not ingested

Subject of the issue

While filling in the ingestion form, one can change the type inferred by Kylo, in particular for a datetime column. If Kylo doesn't recognize the date format, the feed is not created.

Your environment

  • Google Chrome - Version 64.0.3282.167 (Official Build) (64-bit)

Steps to reproduce

This is the sample (sample_incidenti.txt) I used. In the ingestion form, I changed the type of the date column from string to timestamp.

Expected behaviour

I expected to be able to load data on edge1.

Actual behaviour

The data are not loaded (on edge1). Talking with Fabian, it turns out that Kylo doesn't recognise the column the user defined as timestamp: the format is not the one Kylo uses.

As we discussed, in order to avoid the feed being stopped, the easiest fix is to not allow the user to change the type of a column to timestamp when that type was not inferred by Kylo.

[Ingestion form]: the 'go back' button brings you to the initial page

Subject of the issue

The 'Go back' button does not seem to work properly.

Your environment

  • Google Chrome - Version 64.0.3282.186 (Official Build) (64-bit)

Steps to reproduce

Start an ingestion form; once you are at page 3, press 'Go back'.

Expected behaviour

I expect to be redirected to the previous page (in this case page 2).

Actual behaviour

I'm taken back to the initial page.

screen shot 2018-03-10 at 11 27 09

Note

It worked well with the previous release.

[Ingestion Form]: when filling in the form, going back to a previous step I'm asked to refill some fields

Subject of the issue

While a user fills in the form, if they need to go back to a previous stage they are asked to fill in again:

  1. Concetto semantico
  2. Tag

Your environment

  • Google Chrome - Version 64.0.3282.167 (Official Build) (64-bit)

Steps to reproduce

Just try to go back after having completed the form.

Expected behaviour

I don't expect to have to re-enter the info, which can relate to a great many columns.

Actual behaviour

I have to enter the info again.

[Dashboard]: No filter in the dashboard

Subject of the issue

There are no filters in the dashboards created on the dataportal.

Actual behaviour

Apparently the widgets are independent, so it's not possible to insert filters. For complete dashboards this feature is essential.

[Datasets]: Hard-coded dataset update date

Subject of the issue

The update date of every dataset is hard-coded.

Your environment

Chrome/Safari

Steps to reproduce

  1. select "Dataset" from the menu on the left side
  2. click on a dataset that has been recently created
  3. check the text field: "Creato il " ("Created on")

Expected behaviour

The real creation date should be displayed.

Actual behaviour

The date is hard-coded and is the same for every dataset.

[Feature] Reset and Update Password

As a user
I want to be able to reset or update my password
since I may forget it or want to change it.

We need a complete design of the feature taking into account:

  1. approaches
  2. behaviour of the feature
  3. ui design and flow
  4. frontend implementation
  5. backend implementation

[Datasets]: Filter by format doesn't work


Subject of the issue

Datasets: filter by format doesn't work.

Your environment

  • Google Chrome - Version 64.0.3282.167 (Official Build) (64-bit)

Steps to reproduce

  1. Log in
  2. Go to Datasets

Then I tested these three options:
  1. filter by format only
  2. filter by organization and format
  3. filter by category and format

Expected behaviour

  1. I expect to see all the datasets with the format specified in the filter
  2. I expect to see all the datasets of the chosen organization whose format is the selected one
  3. I expect to see all the datasets of the chosen category whose format is the selected one

Actual behaviour

In all three cases the system returns 0 results.

[Ingestion Form] Batch ingestion of compressed files

Subject of the issue

In the ingestion form, add a property to specify that the CSV/JSON files for a feed will be uploaded compressed (ZIP) via the SFTP channel. A zip archive will not contain multiple csv/json files.

Expected behaviour

The new feed will read zip files and decompress them before loading them on HDFS.

Actual behaviour

Currently it's not possible to provide this information, although a feed can be manually changed (via the administration console) to support zip files.

[Datasets] Search function doesn't work on specific datasets

Subject of the issue

It's not possible to retrieve an existing dataset via the search function.

Your environment

Safari/Chrome

Steps to reproduce

  1. Log in to https://dataportal-private.daf.teamdigitale.it
  2. Type "strutturericettivepiemonte" into the search box (the name of an existing dataset with access permission)
  3. Press "Cerca"

Expected behaviour

The dataset should be listed in the search results.

Actual behaviour

The dataset is not displayed.

Note

This behaviour is not consistent: for some datasets the search works properly.
The dataset exists and its table name is: REGI__europa.test_ingestion_o_test_strutturericettivepiemonte
A Superset dashboard is available: test_piemonte_strutture_ricettive

Permission denied in Jupyter

Subject of the issue

Jupyter gives a "Permission Denied" message when trying to access a private dataset uploaded by the user, or a private dataset that the user is allowed to access.

Your environment

The error appears to be independent of the browser used; for reference, it was seen with Chrome 64.0.3282.186 (official build).

Steps to reproduce

  1. log in to the dataportal
  2. choose a private dataset you can explore with other tools (e.g. Superset)
  3. fetch the code snippet to load it in Jupyter
  4. start Jupyter with the PySpark3 kernel and initialise the endpoint and session as per the user manual
  5. run the code snippet from step 3 and get "Permission Denied"

Notes:

  • this problem is not seen with open data.
  • by manually adding typos to the path found in step 3, the message changes to “Path does not exist”, confirming that the original target path existed but access to it was denied.

Expected behaviour

A pyspark DataFrame should be created.

Actual behaviour

One receives the following message (the trace shows user d_mc being denied EXECUTE on /daf/ordinary/test_ingestion, whose mode drwxrwx--- grants no access outside the daf user and group):

An error occurred while calling o47.load.
: org.apache.hadoop.security.AccessControlException: Permission denied: user=d_mc, access=EXECUTE, inode="/daf/ordinary/test_ingestion":daf:daf:drwxrwx---
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkAccessAcl(DefaultAuthorizationProvider.java:363)

Registration does not work

I tried to sign up and received the mail with the activation link, but I get "Errore durante l'attivazione riprovare più tardi" ("Error during activation, try again later").

schermata 2017-10-13 alle 09 39 39

[Superset]: 500 - Internal Server Error from List Slicer windows

Subject of the issue

Error 500 when you click on the column names (Slice | Visualization Type | Datasource | Creator) to sort by them.

Your environment

Chrome

Steps to reproduce

  1. Go to Superset
  2. Click on "Slicer" in the top bar
  3. Click on the column names (Slice | Visualization Type | Datasource | Creator) to sort by the selected field

screen shot 2018-03-12 at 17 22 01

Expected behaviour

The list should be ordered by the selected field.

Actual behaviour

500 - Internal Server Error
Please see the screenshot attached.

[Datasets]: Error loading dataset csv sample OpenCup Report Incentivi

Subject of the issue

"Errore durante il caricamento. Si prega di riprovare più tardi."
It's not possible to load the csv sample attached

Original dataset:
http://opencup.gov.it/documents/21195/100257/Report+Incentivi+CSV/76d83867-b82c-4e48-9041-d952a19b67a9

Your environment

Chrome Version 64.0.3282.186 (Official Build) (64-bit)

Steps to reproduce

  1. download the new dataset from OpenCup
  2. create a csv sample for the metadata process (ex. first 10 lines)
  3. start the metadata process
  4. load the csv sample at the first step: "Passo 1: Carica file e descrivi le colonne" ("Step 1: Upload the file and describe the columns")

Expected behaviour

The sample should be loaded and the fields displayed in order to proceed with the metadata information.

The sample is parsed correctly in Python with pandas (yielding 79 columns) using the code:

import pandas as pd

# the sample uses '|' as the field separator
data = pd.read_csv("incentivisample.csv", sep="|")

Actual behaviour

The error message is displayed: "Errore durante il caricamento. Si prega di riprovare più tardi."

screen shot 2018-03-06 at 12 15 15

Verify dashboard auto-saving

In some cases, after a Chrome crash, the dashboard ended up corrupted (almost all blocks were lost), forcing me to start the work over.

Increase information in the dataset description

It could be useful to add information about how to access the dataset using Jupyter.

schermata 2017-10-12 alle 16 22 12

For example, we can add the following information:

# `spark` is the SparkSession provided by the PySpark3 kernel
path_dataset = "/daf/opendata/alsia_o_atti_d_di_d_concessione1_0"
df = (spark.read.format("parquet")
      .option("inferSchema", "true")
      .option("header", "true")
      .option("sep", "|")
      .load(path_dataset))
