moj-analytical-services / platform_user_guidance


**DEPRECATED** See https://github.com/moj-analytical-services/user-guidance

R 14.57% Makefile 20.79% HTML 64.64%

guidance analytical-platform


Platform User Guidance

View the user guidance here: https://moj-analytical-services.github.io/platform_user_guidance/

To update

Make changes to the ##-chapter.Rmd files in a branch, then open a pull request into master to publish.

CI: CircleCI checks each commit by automatically rendering the bookdown.

CD: CircleCI deploys commits to the master branch by pushing the rendered book to the gh-pages branch, which is what is displayed at https://moj-analytical-services.github.io/platform_user_guidance/.

To render the guidance locally

In R, run:

bookdown::render_book('index.Rmd', 'bookdown::gitbook')

or, from a shell/terminal (requires GNU Make):

make

Rendered bookdown content will be in /docs/.


platform_user_guidance's Issues

Deployed guidance does not have a favicon

The deployed guidance does not have a favicon. This is inconsistent with other Analytical Platform tools and services, such as the control panel. Having the same favicon would provide consistency and help develop the Analytical Platform brand.

Add mention of S3 buckets to section on shiny deployment

As discussed with @billster45 and @samtazzyman, it would be helpful to have a note on this in the step-by-step guidance: admins need to create a different type of bucket, add users, and link the bucket to the Shiny app.

https://moj-analytical-services.github.io/platform_user_guidance/deploying-a-shiny-app.html#step-by-step-guide-to-deploying-an-app

3.2.2.2 Manipulating data

This section suggests that renaming a team in GitHub will rename the relevant S3 bucket, but this is not the case.

IP whitelisting is not presented with sufficient caveats

ST:

5.1.7 should be amended to make it clear that IP whitelisting is not strong security
it will prevent accidental public access
but it wouldn’t stop anyone who wanted to gain access from doing so

DR:

'Security by IP address' is a very low bar: it is only used for protecting apps that we've not taken the trouble to make scalable for public use. This level of security is fine for open data (or FOI-able data) but not suitable for apps with access to sensitive data, or to data that would raise concerns if leaked.

Add description of how to use conda with jupyter lab

Essentially using the notes I created here:
https://github.com/RobinL/cheatsheets_etc/blob/master/jupyter_conda.md

Guidance on information governance is unclear

The section on information governance is unclear. This should be updated with more detail and links to other relevant MoJ forms and guidance.

Add dbtools section to guidance

Go do

Guidance does not use GOV.UK styling

It would help to build consistency across the Analytical Platform if the guidance used GOV.UK styling. A couple of possible options are:

AUP is not for all users?

The Acceptable Use Policy says it is for all users of AP:
https://moj-analytical-services.github.io/platform_user_guidance/acceptable-use-policy.html#acceptable-use-policy
However, I don't think it is suitable for app users. Certainly app users (beyond MoJ) don't even see the AUP on most occasions. So I think we should change the wording.

Also, maybe there are bits from the AUP relevant to app users that we should pull out into a separate doc? Or maybe we just let each app be responsible for being clear about acceptable use?

I'm flagging this for @calumabarnett @samtazzyman to ponder and bear in mind.

'Getting started' guidance

We need some 'getting started' guidance, covering setting up Git, ssh keys etc.

Description of how to install github only r packages using conda

You can't use conda install because a GitHub-only package isn't available on conda.

But it's simple to do - just use an install.packages after the conda build in the Dockerfile.

Add issue on airflow docker builds on AWS_REGION

Should add a note somewhere in the data pipeline section that you may want to grab larger data for your build from S3 instead of uploading it to Github.

e.g.

RUN aws s3 cp s3://mojap-non-sensitive-files-for-docker-builds/large_install.zip my_files/large_install.zip

To do this you either need to install the AWS CLI:

RUN conda install -c conda-forge awscli
# OR
RUN pip install awscli --upgrade --user

Or you can download the data using a Python script with boto3:

RUN python download_my_files.py

Note that when using boto3 you must set the environment variable AWS_REGION to eu-west-1. Otherwise boto3 will throw errors.
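A minimal sketch of what a script like download_my_files.py might contain, assuming boto3 is installed in the image. The helper names (resolve_region, split_s3_uri, download) are hypothetical and not taken from the guidance. The sketch reads AWS_REGION and passes it to boto3 explicitly as region_name, so it does not depend on how boto3 resolves the region on its own.

```python
import os

DEFAULT_REGION = "eu-west-1"  # the region named in the note above


def resolve_region(env=None):
    """Return AWS_REGION from the environment, falling back to eu-west-1."""
    env = os.environ if env is None else env
    return env.get("AWS_REGION") or DEFAULT_REGION


def split_s3_uri(uri):
    """Split 's3://bucket/path/to/key' into (bucket, key)."""
    prefix = "s3://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an S3 URI: {uri!r}")
    bucket, _, key = uri[len(prefix):].partition("/")
    if not bucket or not key:
        raise ValueError(f"S3 URI needs both a bucket and a key: {uri!r}")
    return bucket, key


def download(uri, dest, env=None):
    """Download one S3 object to a local path (hypothetical helper)."""
    import boto3  # imported lazily; assumed to be installed in the image

    bucket, key = split_s3_uri(uri)
    s3 = boto3.client("s3", region_name=resolve_region(env))
    # Create the target directory if the destination path includes one.
    os.makedirs(os.path.dirname(dest) or ".", exist_ok=True)
    s3.download_file(bucket, key, dest)
```

With the example path from this issue, the script would call download("s3://mojap-non-sensitive-files-for-docker-builds/large_install.zip", "my_files/large_install.zip"). Credentials are assumed to be available to the build and are not handled here.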

Guidance needed on committing Jupyter notebooks without outputs

Unless and until something like nbstripout is applied by default, users should be warned about the potential for accidentally pushing data to GitHub in their notebook outputs.
If I were more familiar with it myself, I would write some simple instructions for installing and applying nbstripout appropriately to avoid the issue.
This might also be a good place to highlight the steps to take if data has been pushed to GitHub, so there is no excuse for people to ignore or hide potential breaches.
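To illustrate what stripping outputs means, here is a minimal sketch using only the Python standard library. It is not nbstripout itself and is no substitute for it; it only shows the idea of emptying each code cell's outputs and execution count in the notebook's JSON before committing. The function names are hypothetical.

```python
import json


def strip_outputs(notebook):
    """Return a copy of a notebook dict with code-cell outputs removed."""
    # Round-trip through JSON as a cheap deep copy of JSON-compatible data.
    cleaned = json.loads(json.dumps(notebook))
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drop rendered data, tables, plots
            cell["execution_count"] = None  # drop the In [n] counter
    return cleaned


def strip_file(path):
    """Rewrite an .ipynb file in place without its outputs."""
    with open(path, encoding="utf-8") as f:
        notebook = json.load(f)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(strip_outputs(notebook), f, indent=1)
        f.write("\n")
```

Calling strip_file on a notebook before git add would remove any data shown in its outputs. It does not touch notebook metadata or anything already pushed to GitHub.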

Add 'ssh keys not deployed' to list of common problems and solutions

R Studio guidance

This should predominantly link off to other sites, but we will want some text on use of the home directory, packrat, and integration with GitHub.

How to sync github repo in Terminal (for Jupyter users)

RStudio users are shown how to:

1. create a new project in RStudio by cloning from GitHub
2. commit new changes
3. push to GitHub

The subsequent section for Jupyter users directs them to the Command-line section, which only covers steps 2 and 3 above (mistakenly referred to as steps 3 and 4). I only spotted this when a new starter skipped straight to git add without having run git clone.

Link to DPIA guidance is broken

The link in section 9.2.1.5 Data protection and privacy: Further information on DPIAs and PIAs can be found here (https://intranet.justice.gov.uk/guidance/knowledge-information/protecting-information/data-protection-impact-assessments-dpias/) returns a page not found error.

Update caveats about platform stability

When rolling out the platform we should ensure that we update references such as this, which state that the platform should not be used for time-critical tasks and does not have guaranteed stability. Such statements might not inspire confidence in the platform as a replacement for existing tools.

Simple Git workflow for beginners

We need some notes on a simple Git workflow for beginners, probably just covering add, commit, push and pull.

guidance on Jupyterlab linters

It would be useful to have a section on the pep8 linters in Jupyterlab, Flake8 and Black, as they could be stumbling blocks for new users (or folks who get new builds). Some things to cover:

What they are (intro to pep8)
Difference between flake8 and black (highlighting issues vs changing code to conform to pep8)
How to configure them (it's not clear if you can configure flake8 beyond the three options in the menu. In addition, flake8's most annoying warning - that a line is "too long" at >79 characters - is different from Black's default 88)
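The gap between the two defaults can be sketched as follows. The limits are flake8's default of 79 characters (warning E501) and Black's default of 88; the function name and the category labels are hypothetical. Outside JupyterLab, flake8's max-line-length setting can be raised to 88 to match Black, but whether the JupyterLab extension honors that setting is not confirmed here.

```python
FLAKE8_MAX = 79  # flake8's default max-line-length (E501 fires above this)
BLACK_MAX = 88   # Black's default line length


def classify_line(line):
    """Say how each tool's default limit treats a line of this length."""
    length = len(line)
    if length <= FLAKE8_MAX:
        return "within both limits"
    if length <= BLACK_MAX:
        # Black treats this as short enough, but flake8 still reports E501.
        return "flagged by flake8 only"
    return "over both limits"
```

Lines of 80 to 88 characters are the awkward band: code that Black has already formatted can still trigger flake8's "too long" warning.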

I've found that flake8 is often too visible - that is, it renders both highlights and error messages by default. The highlights are slightly annoying as the yellow makes it hard to see with dark themes, while the error messages don't adequately distinguish themselves from code, so it's a bit jarring when they appear for the first time. I'd prefer it to be off by default, and turn it on only when I want to spend some time specifically doing cleanup.

Conversely, Black is not visible enough - it's not obvious that it's there, and it's even less obvious what it does when applied (at least until you commit and see the diffs).

broken link to guidance on using github with Analytical Platform

The link on this line results in a 404:

https://github.com/moj-analytical-services/platform_user_guidance/blob/master/02-start.Rmd#L156

User guidance needs a "how to get support" section

It would be useful to clarify the roles of the MoJ digitech team, the data science team and various training groups so that users know where to go when there are issues.

A general support guide would also be useful: where to go for help with R, GitHub, the platform, etc.

It could also cover how to google things, make reproducible examples, link to lines of code, and flag issues properly.

Small change to correct guidance on creating GitHub branch from RStudio

checkout -b my_branch_name should be git checkout -b my_branch_name in this section of the Platform Guidance: https://moj-analytical-services.github.io/platform_user_guidance/using-github-with-r-studio.html#step-2-create-a-new-branch-in-r-studio-and-tell-github-about-its-existence

User administration section is incomplete

The user administration section contains holding text and should be updated to reflect the current process.

Static apps guidance refers to Jenkins

The guidance on how to deploy a static app is very out of date and still contains references and links to Jenkins.

Guidance on deploying a Shiny app is difficult to follow

Several users have encountered difficulties when trying to deploy new Shiny apps. The guidance seems to be rather difficult to follow and misses out steps, such as getting an admin to create an app data source and the app itself.

Convert guidance into bookdown

Home directory and file management guidance

Authy vs Google authenticator

In section 2.1.2 (Enable two-factor authentication on your Github account) the guidance suggests Authy or Google Authenticator. However, Authy seems better for when the user's device changes. It might be worth suggesting Authy over Google Authenticator. Alternatively, if there is a way for users to reset their credentials (now or in the future), guidance on how to do so could be added to this section.