metakgp-wiki's Introduction

MetaKGP Wiki

Dockerized for fun and profit.
Wiki · Report Bug / Request Feature

About

This is the dockerized source for the MetaKGP Wiki deployed at https://wiki.metakgp.org. The wiki is a Mediawiki instance with some extensions and services that take backups and update certain pages.

It is hosted on a DigitalOcean droplet with 2GB RAM and a single CPU. See MetaPloy for the deployment architecture.

Architecture

Getting Started

See also: The Runbook for a quick reference to processes needed to manage a production wiki.

Prerequisites

Docker and docker compose are the only required dependencies. You can either install Docker Desktop or the Docker Engine. For minimal installations and production use cases, Docker Engine is recommended.

Deployment

NOTE: See the #Production section for production deployment. DO NOT follow the development instructions in a production environment.

Development

  1. Set up MetaPloy.
  2. Clone this repository.
  3. Copy the contents of the .env.template file into the .env file. Create the file if it doesn't exist.
  4. Set the necessary environment variables.
  5. Run docker compose up to start the wiki. The wiki will be accessible on localhost:8080 or whichever port MetaPloy is set to use.

Production

  1. Set up MetaPloy for production.
  2. Clone this repository at a convenient location such as /deployments.
  3. Set the appropriate production environment variables in the .env file.
  4. Run docker compose -f docker-compose.prod.yml up to start the wiki. This enables the jobs service which includes backups, log rotation, and other periodic jobs.
  5. Optionally set up a Systemd service to start the wiki on startup.

Environment Variables

Environment variables can be set using a .env file (use the .env.template file for reference). The following variables are used:

  • DEV: When set to true, Mediawiki PHP stack-trace is shown with error messages. (Default: false)
  • MYSQL_PASSWORD: A secret password for the MySQL database.
  • SERVER_PORT: Port on which the wiki server is exposed to the host. (Default: 8080)
  • SERVER_NAME: Base URL of the wiki (eg: https://wiki.metakgp.org).
  • MAILGUN_EMAIL: The email ID used for sending emails via Mailgun. (eg: [email protected])
  • MAILGUN_PASSWORD: Mailgun SMTP password for sending official mails from the wiki.
  • WG_SECRET_KEY: Secret key used for encryption by mediawiki. Make it a long, random, secret string (Reference).
  • Dropbox related variables (used for storing backups) (See this section for details):
    • DROPBOX_APP_KEY: Dropbox app key (can be found at Dropbox App Console).
    • DROPBOX_APP_SECRET: Dropbox app secret (can be found at Dropbox App Console).
    • DROPBOX_ACCESS_TOKEN: Dropbox API access token (generated using /scripts/get_dropbox_tokens.py)
    • DROPBOX_REFRESH_TOKEN: Dropbox API refresh token (generated using /scripts/get_dropbox_tokens.py) used to refresh the access token.
  • SLACK_CHANGES_WH_URL: URL to the Slack webhook used to send updates about wiki changes. (See this section for more details)
  • SLACK_INCIDENTS_WH_URL: URL to the Slack webhook used to send incident reports and errors (like Dropbox backup failure). (See this section for more details)
  • BATMAN_BOT_PASSWORD: A generated password for the Batman bot user account on the wiki (Mediawiki documentation to generate bot passwords can be found here).
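
As a rough aid for step 4 of the deployment instructions, the sketch below parses .env-style text and reports variables that are missing or empty. It is not part of the repository. The function names are hypothetical, the parser is a simplification of what docker compose actually accepts, and treating every variable without a documented default as required is an assumption.

```python
# Variables from the list above that have no documented default.
# Treating all of them as required is an assumption of this sketch.
REQUIRED_VARS = [
    "MYSQL_PASSWORD",
    "SERVER_NAME",
    "MAILGUN_EMAIL",
    "MAILGUN_PASSWORD",
    "WG_SECRET_KEY",
    "DROPBOX_APP_KEY",
    "DROPBOX_APP_SECRET",
    "DROPBOX_ACCESS_TOKEN",
    "DROPBOX_REFRESH_TOKEN",
    "SLACK_CHANGES_WH_URL",
    "SLACK_INCIDENTS_WH_URL",
    "BATMAN_BOT_PASSWORD",
]


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip optional surrounding quotes from the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_vars(values, required=REQUIRED_VARS):
    """Return the required variable names that are absent or empty."""
    return [name for name in required if not values.get(name)]
```

Reading the .env file into a string and passing it through `missing_vars(parse_env(text))` gives a quick list of what still needs to be filled in before running docker compose.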

Setting Up Secondary Services

Dropbox Backups

The jobs service runs periodic local backups (see /jobs/backups) and stores the last 30 days of backups on Dropbox. To set this up, a Dropbox app has to be created, and access tokens need to be generated:

  1. Create an app on the Dropbox App Console.
  2. Copy the app key and app secret and set the corresponding environment variables.
  3. Run the script /scripts/get_dropbox_tokens.py and when prompted, enter the app key and app secret.
  4. Set the generated API access token and refresh tokens in the environment variables.

Slack Notifications

The Slack notifications are sent via webhooks. Two webhooks are used by the wiki: Recent Changes webhook and Incidents webhook (See environment variables). The recent changes webhook logs recent changes to the wiki (page edits, user creation, etc.) and the incidents webhook notifies about server incidents such as backup failures.

  1. Create a Slack app.
  2. Enable "Incoming Webhooks".
  3. Copy the webhook URL and set the appropriate environment variables.
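
To check that a webhook URL works before wiring it into the wiki, a message can be posted to it by hand. The sketch below is an illustration, not the code the wiki uses. It relies on the fact that Slack incoming webhooks accept a JSON body with a `text` field; the function names are hypothetical.

```python
import json
import urllib.request


def build_slack_request(webhook_url, text):
    """Build (but do not send) a POST request for a Slack incoming webhook."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_slack_message(webhook_url, text, timeout=10):
    """Send the message and return the HTTP status code."""
    request = build_slack_request(webhook_url, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

For example, `send_slack_message(os.environ["SLACK_INCIDENTS_WH_URL"], "Test incident")` should make a message appear in the channel the webhook is bound to.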

Mailgun

Mailgun is used by the wiki as a mailing service for sending various emails to the users such as account verification and notifications.

  1. Add a new domain in the "Sending" section on Mailgun.
  2. Copy the SMTP password and set the appropriate environment variables.

PyWikiBot (Batman)

PyWikiBot is a Python library that interfaces with the wiki as a bot (called "Batman") and is used to run various jobs such as updating the trending pages list. See /jobs/pywikibot for a list of scripts.

  1. Create a bot account on the Wiki.
  2. Add the bot's password to the BATMAN_BOT_PASSWORD variable in the environment variables.

Google Analytics

The legacy Google Analytics features used here are now deprecated. This needs to be reworked.

Maintainer(s)

Contact

Metakgp: Slack invite, email, Facebook, LinkedIn, Twitter, Instagram

Additional documentation

metakgp-wiki's People

Contributors

amrav, cdhowie, defcon-007, godzilla5111, grapheo12, hargup, harshkhandeparkar, icyflame, j-tesla, jacksga, kulttuuri, meneth, nishnik, proffapt, rajivharlalka, renovate-bot, renovate[bot], shikharish, thealphadollar, themousepotato

metakgp-wiki's Issues

Add VisualEditor

Visual editor probably (anecdotal) increases the number of edits by the long tail of infrequent contributors by making it very simple to make small changes/updates. This helps reduce the entry barrier to becoming a more frequent editor, and is also essential for keeping the wiki up to date.

Add a parsoid container for running parsoid (and optionally restbase), and add visual editor to the list of installed extensions.

Enable error logging for the MySQL container

Currently, the MySQL container is not writing any query logs at all. Erroneous query logs are written to stderr.

# Last lines of the mysql error log
icyflame@metakgp-blr:~$ docker exec -it metakgp-wiki_mysql_1 tail /var/log/mysql/error.log
2018-02-26T23:11:59.997694Z 0 [Note] InnoDB: Buffer pool(s) dump completed at 180226 23:11:59
2018-02-26T23:12:01.707688Z 0 [Note] InnoDB: Shutdown completed; log sequence number 2551387
2018-02-26T23:12:01.709698Z 0 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1"
2018-02-26T23:12:01.709711Z 0 [Note] Shutting down plugin 'MEMORY'
2018-02-26T23:12:01.709716Z 0 [Note] Shutting down plugin 'CSV'
2018-02-26T23:12:01.709720Z 0 [Note] Shutting down plugin 'sha256_password'
2018-02-26T23:12:01.709723Z 0 [Note] Shutting down plugin 'mysql_native_password'
2018-02-26T23:12:01.709892Z 0 [Note] Shutting down plugin 'binlog'
2018-02-26T23:12:01.711473Z 0 [Note] mysqld: Shutdown complete

These are the last few lines in the Error log and they were written back in February 2018. I checked the Global variables that enable logging, and they are not set properly.

mysql> SELECT @@global.general_log;
+----------------------+
| @@global.general_log |
+----------------------+
|                    0 |
+----------------------+
1 row in set (0.00 sec)

mysql> SELECT @@global.general_log_file;
+---------------------------------+
| @@global.general_log_file       |
+---------------------------------+
| /var/lib/mysql/4efb1f830bf6.log |
+---------------------------------+
1 row in set (0.00 sec)

mysql> SELECT @@global.log_output;
+---------------------+
| @@global.log_output |
+---------------------+
| FILE                |
+---------------------+
1 row in set (0.00 sec)

We have to set general_log = 1 in the MySQL configuration file. We are not currently using one in this repository. We have to start using it when we build the mysql Docker image.

The configuration file has to be written with some basic options and then inserted into the correct path. More documentation on this can be found on the webpage for the official mysql docker image under the Using a custom MySQL configuration file section.
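
One possible shape for such a configuration file is sketched below by rendering it from Python. `general_log`, `general_log_file` and `log_output` are real MySQL server options, but the log file path and the function name are assumptions, and the repository does not contain this file today.

```python
import configparser
import io


def render_mysql_config(log_file="/var/log/mysql/general.log"):
    """Render a minimal [mysqld] section that turns on the general query log."""
    config = configparser.ConfigParser()
    config["mysqld"] = {
        "general_log": "1",
        "general_log_file": log_file,  # assumed path, adjust as needed
        "log_output": "FILE",
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


print(render_mysql_config())
```

The rendered text could be saved as a .cnf file and copied into the image at the location the official image documentation gives for custom configuration files.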

Remove the 10 second rate limiting on the wiki for bots.

Under the current rate limit, bots wait 10 seconds after every edited page, so updating many pages via bots is time-consuming.

For example, the blackjack bot, which updates course pages with the new grade distribution, updates approximately 1000 courses, so the 10-second gap stretches the process to nearly 3 hours.
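
The estimate can be checked with a few lines of arithmetic. Both inputs are the approximate figures quoted in this issue, not measurements.

```python
PAGES_UPDATED = 1000   # approximate number of course pages, from the issue
SECONDS_PER_EDIT = 10  # current delay between bot edits

total_seconds = PAGES_UPDATED * SECONDS_PER_EDIT
total_hours = total_seconds / 3600

print(f"{total_seconds} s is about {total_hours:.2f} h")
# prints "10000 s is about 2.78 h"
```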

Update Documentation

The following documentation needs to be added/updated in the README:

  • Getting started (#129)
  • How to set up secondary services (#129)
  • All environment variables (#121)
  • Production deployment instructions (#129)
  • An overview of the wiki's architecture, written on a piece of paper or drawn on a blackboard

The following documentation needs to be updated in the RUNBOOK:

  • Backup restoration with the new script (#122)

Run Backups in the `jobs` Container

The jobs container is used for running recurrent jobs like updating the trending pages and the spam IP list. Backup is also a recurrent job, so there is no need to make a separate container for it.

TODO

  • Fix the backup script.
  • Make the backup script delete backups older than one month (excluding the last backup that has images).
  • Merge the backup script into the jobs container.
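
The retention rule in the second item could look roughly like the sketch below. It is only an illustration: the `Backup` record, its fields and the 30-day default are assumptions, and the real script would need to derive this information from the backup files themselves.

```python
from datetime import date, timedelta
from typing import NamedTuple


class Backup(NamedTuple):
    name: str         # hypothetical backup file name
    taken_on: date    # day the backup was taken
    has_images: bool  # whether the backup includes images


def backups_to_delete(backups, today, max_age_days=30):
    """Return backups older than max_age_days, sparing the newest one with images."""
    cutoff = today - timedelta(days=max_age_days)
    with_images = [b for b in backups if b.has_images]
    keep = max(with_images, key=lambda b: b.taken_on) if with_images else None
    return [b for b in backups if b.taken_on < cutoff and b != keep]
```

Returning the list of candidates instead of deleting directly keeps the rule easy to test and leaves the actual removal to the caller.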

[BUG] Update top trending job is failing

The failure is because of the lack of the python module oauth2client: https://github.com/googleapis/oauth2client

It looks like this module has been deprecated recently. The usage of google-auth is recommended on the repository page. This is the particular module inside google-auth that we would be using: https://google-auth.readthedocs.io/en/latest/reference/google.oauth2.service_account.html#module-google.oauth2.service_account

Failure log

icyflame@metakgp-blr:~$ docker exec metakgp-wiki_jobs_1 /root/update_top_trending.sh
Updating Top and Trending Pages
+ echo 'Updating Top and Trending Pages'
+ cd /root/pywikibot
+ export METAKGP_BOT_NAME=batman
+ METAKGP_BOT_NAME=batman
+ timeout 10s python pwb.py login
Logging in to metakgp:en as Batman@update-statistics
Logged in on metakgp:en as Batman.
+ timeout 30s python pwb.py updatestatistics
Traceback (most recent call last):
  File "pwb.py", line 264, in <module>
    if not main():
  File "pwb.py", line 257, in main
    run_python_file(filename, [filename] + args, argvu, file_package)
  File "pwb.py", line 120, in run_python_file
    main_mod.__dict__)
  File "./scripts/updatestatistics.py", line 7, in <module>
    from oauth2client.service_account import ServiceAccountCredentials
ImportError: No module named oauth2client.service_account
CRITICAL: Closing network session.
<type 'exceptions.ImportError'>

Restore Images in Restore Script

The restore-from-backup.sh script only restores the database backup. Add the following:

  1. A condition to check if the backup includes an images/ folder (it may not)
  2. If the files exist copy them inside the mediawiki container's /srv/static/
  3. Execute chown -LR www-data:www-data /srv/static/images inside the container
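
The three steps above can be sketched as a small helper that decides which docker commands to run. This is an illustration rather than the restore script itself: the container name is hypothetical, and the commands are only built here, not executed.

```python
from pathlib import Path

# Hypothetical container name; the real one depends on the compose project.
MEDIAWIKI_CONTAINER = "metakgp-wiki_mediawiki_1"


def image_restore_commands(backup_dir, container=MEDIAWIKI_CONTAINER):
    """Return the docker commands that restore images, or [] if there are none."""
    images_dir = Path(backup_dir) / "images"
    if not images_dir.is_dir():
        return []  # the backup may not include an images/ folder
    return [
        # Copy the images/ folder into /srv/static/ inside the container.
        ["docker", "cp", str(images_dir), f"{container}:/srv/static/"],
        # Fix ownership so the web server can read the restored files.
        ["docker", "exec", container,
         "chown", "-LR", "www-data:www-data", "/srv/static/images"],
    ]
```

Each returned command is an argument list that could be passed to `subprocess.run(cmd, check=True)`.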

Documentation: steps to migrate to a new server

We will need to add more documentation in case of backup, especially the steps needed to be taken in case of a migration. The following steps are to be highlighted:

  • Backup of static-volume (Currently, only the peqp/ folder.)
  • DROP DATABASE metakgp_wiki_db; followed by CREATE DATABASE metakgp_wiki_db; in the mysql shell inside the mysql container; followed by ./scripts/restore-from-backup.sh. This ensures that any existing database is empty before restoring the latest backup.
  • Steps to put the old wiki in Read Only mode
  • Steps to be taken in the overall process

Add batman

Steps:

  • Set up redis container
  • Set up a batman container with Python
  • Add environment variables for Slack access
  • Modify docker-compose.yml to link everything together, and pass in env
  • Run batman inside its container as a daemon, perhaps by using some kind of init system

Set Up CI

It would be good to have a CI workflow that builds the wiki containers and runs various tests on every commit and pull request.

  • #118
  • Tests:
    • Make sure container builds
    • Make sure wiki starts and runs
    • #14
    • #63

Deduplicate Top and Trending pages

If a page is in "Popular", then it's better to deduplicate it from "Trending" so that we can showcase as many articles as possible. This involves changing the pywikibot script to fetch 20 trending items, dedup, and post the top 10.
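
A minimal sketch of the deduplication step follows, assuming the script already has the popular and trending titles as ordered lists. The function name and the example titles are hypothetical.

```python
def dedup_trending(popular, trending, limit=10):
    """Drop trending pages already listed as popular, keep order, cap the list."""
    popular_set = set(popular)
    seen = set()
    result = []
    for title in trending:
        if title in popular_set or title in seen:
            continue  # already shown under "Popular" or repeated
        seen.add(title)
        result.append(title)
        if len(result) == limit:
            break
    return result


popular = ["Main Page", "CS101"]
trending = ["CS101", "Hall of Fame", "Main Page", "Spring Fest"]
print(dedup_trending(popular, trending))
# prints ['Hall of Fame', 'Spring Fest']
```

Fetching 20 trending items, as suggested above, leaves enough headroom for the result to still reach 10 after the popular pages are removed.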

Dependency Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

Detected dependencies

docker-compose
docker-compose.override.yml
docker-compose.prod.yml
docker-compose.yml
  • mysql 5.7
dockerfile
backup/Dockerfile
  • python 2-jessie
jobs/Dockerfile
  • python 2
mediawiki/Dockerfile
  • php 7.3-fpm-buster
mysql/Dockerfile
  • mysql 5.7
nginx/Dockerfile
parsoid/Dockerfile
  • node 10-jessie
php/Dockerfile
  • php 7.3-fpm-buster
pip_requirements
backup/requirements.txt
  • dropbox ==9.4.0
jobs/requirements.txt
  • oauth2client ==4.1.3

Investigate if Scribunto configuration lines are required anymore

#78 (comment)

Scribunto is shipped with mediawiki-core starting with Mediawiki 1.34. We have some configuration on our side (specifically, two lines that set some variables, and a step that makes the Lua binary executable using chmod a+x while installing other extensions).

Investigate whether these lines are required. If they aren't required anymore, remove them.

Update DNS Denylist

Users are unable to create accounts on the wiki because their IPs are blocked by the DNS blacklist.
The temporary fix is to disable the DNSBL option (PR #130).

The blacklist currently used is sbl.spamhaus.org. We should use another blacklist or find an alternative solution.

Find a New Way to Update Popular and Trending Pages

  • The UA-XXXXXX-XX Google Analytics ID is deprecated and hence the googleAnalytics extension no longer works.
  • We need to find a way to use the new "head" tags of Google Analytics and adapt the jobs script accordingly.

Add more integration tests

Possible tests:

  • Main page should render without errors on desktop and mobile browsers
  • Create user (should fail because of captcha)
  • Upload image (check that uploads are configured correctly)
  • View image (check that thumbnailing works)
  • Try to access all the images, or wiki config files (check that nginx disallows file access)

Look into using something like PhantomJS or Selenium.

Static file links

The Nginx Dockerfile doesn't copy the peqp dump to /srv/static by default. This has to be done by hand.
Recently, the server had to be restarted, which led to the deletion of the static files.
Maybe we should add some code so that this copy happens automatically with new builds.

@icyflame @amrav @thealphadollar

Investigate and fix problems with ArticleFeedbackV5

Recently, we started facing problems with this extension. In particular, on the Feedback page, users are able to see the feedback that was submitted, but administrators are unable to mark the feedback as Resolved, Useful, etc.

Use Python 3 for the backup container

The backup container currently uses the Docker image python:2-jessie. Python 2
reached End-of-Life recently, so we should stop using it.
