hawki's Introduction

HAWKI

About

HAWKI is a didactic interface for universities based on the OpenAI API. Users do not need to create an account; the university ID is sufficient for login, and no user-related data is stored.

The service was developed by Jonas Trippler, Vincent Timm and Stefan Woelwer at the Interaction Design Lab at the HAWK University of Applied Sciences and Arts. It aims to give all members of the university the opportunity to integrate artificial intelligence into their work processes, and to provide a meeting place where new ways of working may emerge and an internal university discussion about the use of AI can take place. The interface is currently divided into three areas:

Conversation: A chat area similar to ChatGPT, for a quick start to any task.

Virtual office: Conversations with fictional experts as a mental model to familiarise yourself with non-technical areas and to make more targeted enquiries to real university experts.

Learning Space: The learning spaces are designed to help you understand the different support options and learn what makes an effective prompt.

We welcome constructive feedback to further develop this project based on your needs and insights.

[Image: HAWKI login screen]

[Image: HAWKI dashboard]

[Image: HAWKI settings panel]

Changelog – HAWKI V1.

Functionality

Shibboleth connection as an additional authentication option. (Thanks to Marvin Mundry from the University of Hamburg)

Multi-language with translated texts for English, Italian, French and Spanish.

Display of mathematical formulas, LaTeX and improvement of syntax highlighting.

Quality of Life

Dark Mode for our night owls.

System prompts can now be viewed transparently.

Security updates

We have made HAWKI more secure in some areas and updated the code structure.

We would like to thank Thorger Jansen (discovery, analysis, coordination) from SEC Consult Vulnerability Lab for responsibly reporting the identified issues and working with us to fix them.

Getting started

Prerequisites

LDAP

HAWKI uses LDAP under the hood to authenticate users. Make sure LDAP is set up first and that it is accessible from your HAWKI instance. Provide your LDAP config as described in the Configuration chapter. You can find more information on how to use LDAP on the official website https://ldap.com

Testing without LDAP: You can try out HAWKI without an LDAP server. To do so, set TESTUSER and TESTPASSWORD in the configuration file (see Configuration), for example to tester and superlangespasswort123, and sign in with those credentials.

OpenID Connect

As an alternative to LDAP, OpenID Connect can also be used to authenticate users. It requires the jumbojett/openid-connect-php library (https://github.com/jumbojett/OpenID-Connect-PHP) to be installed with Composer.

Shibboleth

The new version also supports Shibboleth for user authentication. Define your Shibboleth URL and login page in the environment file (see Configuration).

OpenAI Access

To generate answers, HAWKI uses the OpenAI API. Follow the instructions on https://platform.openai.com/docs/introduction to generate an API key and paste it into the configuration file as described in the Configuration chapter.

Configuration

To get started you need to add a configuration file to the project first. Copy the file ".env.example" from the root directory and rename it to ".env". Replace the example values in it with your own configuration. A detailed description of all values is listed below.

| Value | Type | Example | Description |
|---|---|---|---|
| Authentication | string | 'LDAP' / 'OIDC' / 'Shibboleth' | Authentication method: LDAP, OpenID Connect or Shibboleth. |
| LDAP_HOST | string | "ldaps://...de" | The URL of your LDAP server. |
| LDAP_BIND_PW | string | secretpassword | Password of the user that is trying to bind to the LDAP server. |
| LDAP_BASE_DN | string | "cn=...,ou=...,dc=..." | Distinguished name that is used to initially bind to your LDAP server. |
| LDAP_SEARCH_DN | string | "ou=...,dc=..." | Distinguished name that is used for authenticating users. |
| LDAP_PORT | string | "..." | The LDAP port. |
| LDAP_FILTER | string | "..." | LDAP filter. Choose the filter based on your LDAP configuration. See .env.example for more details. |
| SHIBBOLET_LOGIN_PATH | string | "..." | Path to the Shibboleth login page. |
| SHIBBOLET_LOGIN_PAGE | string | "..." | Shibboleth login page. |
| OIDC_IDP | string | "https://...." | URL of the identity provider supporting OpenID Connect. |
| OIDC_CLIENT_ID | string | "..." | Client ID for this application in the identity provider. |
| OIDC_CLIENT_SECRET | string | "..." | Secret key for OpenID Connect. |
| OIDC_LOGOUT_URI | string | "https://...." | URL to log out from the identity provider. |
| OPENAI_API_URL | string | "https://api.openai.com/v1/chat/completions" | OpenAI URL. |
| OPENAI_API_KEY | string | sk-... | OpenAI API key. |
| IMPRINT_LOCATION | string | https://your-university/imprint | A link to your imprint. Alternatively, you can replace the file index.php under /impressum with your own HTML/PHP imprint. |
| PRIVACY_LOCATION | string | https://your-university/privacy-policy | A link to your privacy policy. Alternatively, you can replace the file index.php under /datenschutz with your own HTML/PHP privacy policy. |
| TESTUSER | string | "tester" | Set value for testing purposes. Leave TESTUSER and TESTPASSWORD empty or comment them out to disable the test user. |
| TESTPASSWORD | string | "superlangespasswort123" | Set value for testing purposes. Leave TESTUSER and TESTPASSWORD empty or comment them out to disable the test user. |
| FAVICON_URI | string | "https://...." | Link to the favicon. |
| DEFAULT_LANGUAGE | string | "de_DE" / "en_US" / "es_ES" / "fr_FR" / "it_IT" | Default website language. Only applicable if the user has not previously changed the language or their browser language is not one of the supported languages. Currently supported languages: 'de_DE', 'en_US', 'es_ES', 'fr_FR', 'it_IT'. |
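
To make the format of these values concrete, here is a minimal sketch of reading KEY="value" lines into a PHP array. It is not HAWKI's actual loader: the function name parseEnvText is hypothetical, and the sketch assumes that lines starting with ; or # are comments, as the quoted configuration snippets on this page suggest.

```php
<?php
/**
 * Parse KEY="value" lines into an associative array.
 * Hypothetical helper for illustration; not HAWKI's actual loader.
 */
function parseEnvText(string $text): array
{
    $values = [];
    foreach (preg_split('/\R/', $text) as $line) {
        $line = trim($line);
        // Skip blank lines and comment lines starting with ';' or '#'.
        if ($line === '' || $line[0] === ';' || $line[0] === '#') {
            continue;
        }
        $pos = strpos($line, '=');
        if ($pos === false) {
            continue; // not a KEY=value line
        }
        $key = trim(substr($line, 0, $pos));
        $value = trim(substr($line, $pos + 1));
        // Remove one pair of matching surrounding quotes, if present.
        if (strlen($value) >= 2
            && ($value[0] === '"' || $value[0] === "'")
            && $value[0] === $value[strlen($value) - 1]) {
            $value = substr($value, 1, -1);
        }
        $values[$key] = $value;
    }
    return $values;
}

// Example input; the host name is a placeholder, not a real server.
$env = parseEnvText(<<<'ENV'
Authentication="LDAP"
; comment
#LDAP_HOST="ldaps://example.invalid"
DEFAULT_LANGUAGE="en_US"
TESTUSER=""
ENV);
```

With this input, commented-out keys are absent from the result, and an empty TESTUSER stays an empty string, which the table above describes as disabling the test user.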

Web Server Configuration

There are a few things to keep in mind when publishing your HAWKI instance on a webserver.

First and foremost your webserver needs PHP support.

Also, make sure that you disable output_buffering in the PHP configuration of your webserver; otherwise you might run into issues when receiving the response stream from OpenAI.
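
One way to check this setting at runtime is sketched below. The helper outputBufferingEnabled is a hypothetical name, and the list of values it treats as "off" is an assumption about how the ini value may be reported, not something taken from the HAWKI code.

```php
<?php
/**
 * Interpret a raw output_buffering ini value.
 * Hypothetical helper: treats "", "0", "off", "false" and "no" as disabled
 * and anything else (for example "On" or a byte count like "4096") as enabled.
 */
function outputBufferingEnabled($iniValue): bool
{
    if ($iniValue === false || $iniValue === null) {
        return false; // setting not available
    }
    $value = strtolower(trim((string) $iniValue));
    return !in_array($value, ['', '0', 'off', 'false', 'no'], true);
}

// Log a warning when the current PHP configuration buffers output.
if (outputBufferingEnabled(ini_get('output_buffering'))) {
    error_log('output_buffering is enabled; streamed responses may arrive in one block.');
}
```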

If you are setting up a new server, make sure that you install the cURL library. https://www.php.net/manual/de/book.curl.php

IMPORTANT: Keep the .env configuration file secret. Make sure your webserver does not allow directory listing and it blocks access to this configuration file. By default the .env file is located in the private folder with restricted access on apache. Double check that it can not be queried with a simple GET request via http://your-hawki-domain/private/.env

Branding

To swap out the HAWK logo for your own, replace the logo.svg file inside the img folder. Make sure to either keep the format as svg or replace all references to logo.svg with your respective filetype.

Of course, you can modify stylesheets and html files and adjust them to your liking.

Third-Party Libraries

This project utilizes the following third-party libraries:

KaTeX - A fast, easy-to-use JavaScript library for TeX math rendering.

  • License: MIT. See here for details.

Highlight JS - Syntax highlighting for the Web.

  • License: MIT. See here for details.

jQuery - A fast, small, and feature-rich JavaScript library.

  • License: MIT. See here for details.

Contact & License

This project is licensed under the terms of the MIT license. If you have any questions, feel free to get in touch via Email

hawki's People

Contributors

ariansdf, hawk-digital-environments, itsnotyou, jonesvan, kborm, lksmsr, mixcolumns, ottnorml, pandorasactorms, thk-tmueller

hawki's Issues

Make GPT-Model configurable

Hi there,

it would be nice to be able to configure the GPT model.
Also, it would be helpful to have the error messages of the OpenAI API displayed in the logs if the status code is not 200. I had the problem that the standard model was not activated in my account, which is why the API only returned a 403 error. After printing the output of the API I could see that the error message was that the selected model is not allowed.
There are also two errors with bootstrap and ob_flush(), which I have removed.

My patch for setting the GPT model is quite ugly, but it worked for me. Maybe you have a better idea of how to do it.

--- private/app/php/stream-api.php.orig 2024-04-25 16:51:11.169003708 +0000
+++ private/app/php/stream-api.php      2024-04-25 16:52:22.938023452 +0000
@@ -1,5 +1,4 @@
 <?php
-define('BOOTSTRAP_PATH',  '../../bootstrap.php');
 require_once BOOTSTRAP_PATH;

 session_start();
@@ -31,6 +30,11 @@

 // Read the request payload from the client
 $requestPayload = file_get_contents('php://input');
+if (isset($env) && isset($env['GPT_MODEL'])) {
+       $dataRequest = json_decode($requestPayload, true);
+       $dataRequest["model"] = $env['GPT_MODEL'];
+       $requestPayload = json_encode($dataRequest);
+}

 $ch = curl_init();
 curl_setopt($ch, CURLOPT_URL, $apiUrl);
@@ -45,7 +49,9 @@
 ]);
 curl_setopt($ch, CURLOPT_WRITEFUNCTION, function($ch, $data) {
        echo $data;
-       ob_flush();
+        if (curl_getinfo($ch, CURLINFO_HTTP_CODE) != 200) {
+               error_log($data);
+        }
        flush();
        return strlen($data);
 });

Then you should add the GPT_MODEL option to the .env.example

;GPT model to be used
;GPT_MODEL="gpt-4-turbo-preview"
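
As a tidier variant of the patch above, the override could live in a small function. This is only a sketch: applyModelOverride is a hypothetical name, and it assumes $env is the array of configuration values that the patch already reads.

```php
<?php
/**
 * Return the request payload with "model" replaced by $env['GPT_MODEL'], if set.
 * Hypothetical helper; the payload is returned unchanged when no override is
 * configured or when the payload does not decode to a JSON object or array.
 */
function applyModelOverride(string $requestPayload, array $env): string
{
    $model = $env['GPT_MODEL'] ?? '';
    if (!is_string($model) || $model === '') {
        return $requestPayload;
    }
    $data = json_decode($requestPayload, true);
    if (!is_array($data)) {
        return $requestPayload; // invalid JSON: let the API report the error
    }
    $data['model'] = $model;
    $encoded = json_encode($data);
    return $encoded === false ? $requestPayload : $encoded;
}
```

Keeping the override in one function means the streaming script only needs a single call right after reading php://input.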

Kind regards
Felix

Separate out files for deployment by using a different directory

Currently the .htaccess file aims to prevent accesses to files not intended for HAWKI users. It does not contain an entry for composer_install.sh, for example.

In the long run, managing all such files through web server rules (and having to always remember these rules whenever a file is added to the repo) is a big burden. HAWKI deployments could benefit from moving all files for the web server into a subdirectory within the repository. The README file could explain that only that directory has to be put into the public web tree.

An even further step, which might be beneficial for future development, could be addition of a smallish build system that copies files around to a dist directory or similar.

What do you think?

session_start(): Session cannot be started after headers have already been sent in /var/www/html/interface.php on line 9

Hi there,

I've found an issue in the interface.php.
After logging in, I get the following error in my browser.

Warning: session_start(): Session cannot be started after headers have already been sent in /var/www/html/interface.php on line 9

Warning: Cannot modify header information - headers already sent by (output started at /var/www/html/interface.php:1) in /var/www/html/interface.php on line 11

The problem is that interface.php starts sending HTML output on line 1 and only calls session_start() on line 9. Once output has started, the headers have already been sent, so session_start() has to be called first. Moving the PHP code to the top of that file solved the issue for me.

Before:

<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.8.0/styles/vs.min.css">
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.8.0/highlight.min.js"></script>

<!-- and it's easy to individually load additional languages -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.8.0/languages/go.min.js"></script>

<?php
        session_start();
        if (!isset($_SESSION['username'])) {
                header("Location: login.php");
                exit;
        }
?>

<link rel="stylesheet" href="app.css">

After:

<?php
        session_start();
        if (!isset($_SESSION['username'])) {
                header("Location: login.php");
                exit;
        }
?>

<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.8.0/styles/vs.min.css">
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.8.0/highlight.min.js"></script>

<!-- and it's easy to individually load additional languages -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.8.0/languages/go.min.js"></script>

<link rel="stylesheet" href="app.css">

Kind Regards
Felix

Double Asterisk deletes content from response

If ChatGPT surrounds part of a response with double asterisks (**), the content between the asterisks is deleted from the response.

Prompt to ChatGPT: Please write "Test: **with asterisk** without asterisk"
Response: Test: without asterisk
Expected: Response: Test: **with asterisk** without asterisk

(Seen in Implementation at Uni Hildesheim)
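
For comparison, a renderer that keeps the content would turn the marked span into bold text instead of dropping it. The following is a minimal sketch under that assumption; renderBold is a hypothetical helper and not HAWKI's actual rendering code.

```php
<?php
/**
 * Convert **text** to <strong>text</strong> after HTML-escaping the input.
 * Hypothetical helper for illustration, not HAWKI's renderer.
 */
function renderBold(string $text): string
{
    $escaped = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    // Non-greedy match so that two bold spans on one line stay separate.
    $html = preg_replace('/\*\*(.+?)\*\*/s', '<strong>$1</strong>', $escaped);
    return $html ?? $escaped; // fall back to the escaped text on a regex error
}

echo renderBold('Test: **with asterisk** without asterisk'), PHP_EOL;
// prints: Test: <strong>with asterisk</strong> without asterisk
```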

Problems with Testuser

I wanted to try HAWKI in a simple local setup with a test user.

In .env I therefore specified:

Authentication="LDAP"
# LDAP config
#LDAP_HOST="ldaps://..."
#LDAP_BASE_DN="cn=...,ou=...,dc=..."
#LDAP_BIND_PW="..."
#LDAP_SEARCH_DN="ou=...,dc=..."
#LDAP_PORT=""
#; Choose the filter based on your LDAP configuration. the value "username" is used as a placeholder and will be replaced with the actual #username in authentication function.
#; EXAMPLES:
#; (|(sAMAccountName=username)(mail=username))
#; (|(uid=username)(mail=username))
#LDAP_FILTER="(|(sAMAccountName=username)(mail=username))"

# Testuser accout
TESTUSER="test"
TESTPASSWORD="test"


When submitting the login and the password the following error appears on the web-server:

[::1]:58325 [500]: POST /login - Uncaught Error: Call to undefined function ldap_escape() in C:....\HAWKI\private\app\php\auth.php:138
Stack trace:
#0 C:...\HAWKI\private\pages\login.php(17): require_once()
#1 C:....\HAWKI\index.php(18): include_once('...')
#2 {main}
thrown in C:...\HAWKI\private\app\php\auth.php on line 138

Any idea what's going wrong?

Thanks a lot for a hint!!
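
The error says that ldap_escape() is undefined, which is what PHP reports when its LDAP extension is not loaded. A login path that checks the test account before touching any LDAP function would avoid this in a local setup. The sketch below shows the idea; isTestUserLogin and ldapAvailable are hypothetical names, and the code is not taken from auth.php.

```php
<?php
/**
 * Check credentials against the optional test account.
 * Hypothetical helper; returns false when TESTUSER or TESTPASSWORD is
 * unset or empty, which the README describes as disabling the test user.
 */
function isTestUserLogin(string $username, string $password, array $env): bool
{
    $testUser = (string) ($env['TESTUSER'] ?? '');
    $testPassword = (string) ($env['TESTPASSWORD'] ?? '');
    if ($testUser === '' || $testPassword === '') {
        return false; // test account disabled
    }
    // hash_equals avoids leaking match position through timing.
    return hash_equals($testUser, $username) && hash_equals($testPassword, $password);
}

/** Report whether PHP's LDAP extension, which provides ldap_escape(), is loaded. */
function ldapAvailable(): bool
{
    return extension_loaded('ldap') && function_exists('ldap_escape');
}
```

A caller could try isTestUserLogin first and only fall through to the LDAP code when ldapAvailable() returns true, reporting a clear configuration error otherwise.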

Testuser does not work as stated in the readme

Hi,

the readme states that you have to configure 'testuser=true' and use "superlangespasswort123".

In the actual code, the test login is commented out, with a note to activate it if needed. The login PHP code also uses a different password from the one stated in the readme.

Best regards
sehomer

Admin Panel

Please add an admin panel. The basic functionality could be as follows:

  • edit landing page newsfeed
  • add and remove sections/bots in the lefthand menu
  • edit chatbot system prompts for both the team and learning area

At the moment it's not possible to do this without direct access to the server. With an admin panel, it would be possible for authorised users to change text and basic chatbot functionality.

LaTeX renderer

One of our employees noticed that responses containing LaTeX are not rendered as expected.
He states that ChatGPT renders LaTeX properly.

This feature should/could be added.

Image generation / model selection

Please add the possibility to choose a different model like dall-e or whisper.
This could be managed with special chatbots next to the chat or team area to differentiate from the standard chatbot.

OIDC not working after recent update

Hi there,

after today's update, OIDC is not working. Here is a patch:

diff -uNr orig/index.php html/index.php
--- orig/index.php      2024-04-18 13:29:53.808181000 +0000
+++ html/index.php      2024-04-18 13:17:44.319051135 +0000
@@ -26,6 +26,14 @@
         include_once LOGOUT_PAGE_PATH;
         exit();

+    case('/oidc_login'):
+        include_once OIDC_LOGIN_PAGE_PATH;
+        exit();
+
+    case('/oidc_logout'):
+        include_once OIDC_LOGOUT_PAGE_PATH;
+        exit();
+
     case('/impressum'):
         $imprintLocation = isset($env) ? $env["IMPRINT_LOCATION"] : getenv("IMPRINT_LOCATION");
         header("Location: $imprintLocation");
diff -uNr orig/private/bootstrap.php html/private/bootstrap.php
--- orig/private/bootstrap.php  2024-04-18 13:29:53.838180594 +0000
+++ html/private/bootstrap.php  2024-04-18 13:18:06.399752381 +0000
@@ -11,6 +11,8 @@
     define('LOGIN_PAGE_PATH', PAGES_PATH . '/login.php');
     define('INTERFACE_PAGE_PATH', PAGES_PATH . '/interface.php');
     define('LOGOUT_PAGE_PATH', PAGES_PATH . '/logout.php');
+    define('OIDC_LOGIN_PAGE_PATH', PAGES_PATH . '/oidc_login.php');
+    define('OIDC_LOGOUT_PAGE_PATH', PAGES_PATH . '/oidc_logout.php');
     define('VIEWS_PATH', PRIVATE_PATH . '/' . 'views/' );
     define('LAGNUAGE_CONTROLLER_PATH', LIBRARY_PATH . 'language_controller.php');
     define('LANGUAGE_PATH', RESOURCES_PATH . 'language/');
diff -uNr orig/private/pages/oidc_login.php html/private/pages/oidc_login.php
--- orig/private/pages/oidc_login.php   2024-04-18 13:29:53.808181000 +0000
+++ html/private/pages/oidc_login.php   2024-04-18 13:24:35.491487893 +0000
@@ -1,9 +1,6 @@
 <?php
-define('BOOTSTRAP_PATH',  '../bootstrap.php');
-require_once BOOTSTRAP_PATH;
-
 // use library for dealing with OpenID connect
-require __DIR__ . '/vendor/autoload.php';
+require PRIVATE_PATH . '/vendor/autoload.php';

 use Jumbojett\OpenIDConnectClient;

Kind regards
Felix

Chat history

Hey,

is it possible to add a chat history like in the Interface provided by ChatGPT?

Edit Button

[Screenshot 2024-02-22, 13:07:59]
[Screenshot 2024-02-22, 13:08:23]
[Screenshot 2024-02-22, 13:13:17]

Please add the ability to edit user content and create multiple forks in conversations. Revising the user input is a useful method for generating better model output. In this way, it's possible to create more meaningful conversations that contain only high quality model output. Errors generated by the model will most often corrupt the context and cause errors to creep into subsequent model output. It's important for good model output that the user is able to control the entire content of the context.

Code formatting

When you request code from GPT, it returns the code in Markdown format, which is then formatted and styled with CSS.
The problem here is that when copying and pasting code in HAWKI, the first line indicating the programming language is always copied to the clipboard. In the ChatGPT UI this line is excluded from the clipboard.
I got feedback from users in tech that this is annoying because you can't paste the clipboard content directly into terminal apps etc. and need to remove this first line manually.

My suggestion is to modify the highlight function to exclude the first line. As seen in the ChatGPT UI, the first line is also excluded from the code:
[Screenshot 2024-04-18, 14:02:36]

The following is the plain text output from OpenAI Playground (I had to include the > signs to prevent the editor from generating a markdown formatting error):

Creating a "Hello World" message in Python is very simple. Here is an example of how to do it:

>```python
>print("Hello World")
>```

This code uses the `print()` function to output the text "Hello World" to the console or terminal. You can run this code in any Python development environment or directly in the Python interpreter.

The screenshot shows how it's rendered in HAWKI, the pasted clipboard content is highlighted in the input field.

[Screenshot 2024-05-08, 14:52:36]
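
The suggestion above amounts to separating the fence lines and language label from the code before it reaches the clipboard. A minimal sketch of that separation follows; codeWithoutFence is a hypothetical helper, and HAWKI's actual highlight function may be structured differently.

```php
<?php
/**
 * Return only the code inside a fenced Markdown block, dropping the fence
 * lines and the language label. Hypothetical helper for illustration.
 */
function codeWithoutFence(string $block): string
{
    $lines = preg_split('/\R/', trim($block));
    // Drop the opening fence line, which may carry a language label.
    if ($lines !== [] && strncmp($lines[0], '```', 3) === 0) {
        array_shift($lines);
    }
    // Drop the closing fence line.
    if ($lines !== [] && trim(end($lines)) === '```') {
        array_pop($lines);
    }
    return implode("\n", $lines);
}

echo codeWithoutFence("```python\nprint(\"Hello World\")\n```"), PHP_EOL;
// prints: print("Hello World")
```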

Disable Directory Listing for "/feedback"

Currently, the project website allows directory listing for "/feedback". This means that when users navigate to "/feedback," they can see a list of files and folders contained within the directory.

I recommend disabling directory listing for that directory: https://ai.hawk.de/feedback/ containing the feedback information (it seems it should only be visible to logged-in users, no?)

Perhaps, it would also be a good idea to mask the server version information.

LDAP escaping call missing in recent version

Commit 8038957 has removed the call to ldap_filter introduced in commit 4249636. Is this an oversight (e.g. due to the somewhat large commit) or is there a specific reason for it? It seems to me that there is no reason not to escape the input.
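
For reference, escaping user input before it is placed into a search filter could look like the sketch below. It assumes the filter template uses the literal placeholder username, as in the .env example; escapeFilterValue and buildLdapFilter are hypothetical names, and the fallback branch is only meant for environments without the LDAP extension.

```php
<?php
/**
 * Escape a value for use inside an LDAP search filter.
 * Uses ldap_escape() when the LDAP extension is loaded; otherwise a small
 * fallback escapes the filter special characters \ * ( ) and NUL.
 */
function escapeFilterValue(string $value): string
{
    if (function_exists('ldap_escape')) {
        return ldap_escape($value, '', LDAP_ESCAPE_FILTER);
    }
    return strtr($value, [
        '\\' => '\\5c',
        '*'  => '\\2a',
        '('  => '\\28',
        ')'  => '\\29',
        "\0" => '\\00',
    ]);
}

/** Replace the "username" placeholder in a filter template with the escaped input. */
function buildLdapFilter(string $template, string $username): string
{
    return str_replace('username', escapeFilterValue($username), $template);
}

echo buildLdapFilter('(|(uid=username)(mail=username))', 'jdoe'), PHP_EOL;
// prints: (|(uid=jdoe)(mail=jdoe))
```

Without escaping, input such as *)(uid=* would change the structure of the filter instead of being matched as a literal value.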

Self-contained installation without use of CDNs should be possible

Thanks for HAWKI; one of our teams at the Regensburg University Computer Centre is evaluating it and working on a test deployment.

We have found that HAWKI by default makes the clients pull data from Cloudflare and Google CDNs (hosts fonts.googleapis.com and cdnjs.cloudflare.com). This is not compatible with our privacy standards and is annoying to handle in the context of EU privacy laws (e.g. we need to mention it in the privacy notice and obtain user consent).

I take it for granted that you would welcome pull requests covering these problems. As HAWKI code in its current state is meant to be deployed directly, and does not include a “prepare” or “build” step (see #36 for more on this), I would like to discuss the best way to go forward first so that we do not invest resources on this and risk that the specific implementation suggested by us does not gain approval:

  • One option is that some build system and preparatory step is added to HAWKI (→#36) and that the affected files could be pulled from somewhere and included in the local deployment. But users could also manually configure the file sources; they should have a few words of documentation about how to do this.
  • In case of some automation: would you prefer that the files are always provided by the web server on which HAWKI is running (my current personal preference), or would you like a switch which selects CDN/own server? (Maybe even allows CDN selection? – e.g. Fira Sans could be taken from Google or Mozilla CDN.)
  • If there will be such a switch, what should the defaults be?
  • Expected additions to the documentation?
  • Further thoughts about this?

LDAP DisplayName implementation leads to empty username

First: I can submit a pull request for the change if we agree on the goal.

Short form

The LDAP code expects the displayname in a certain format and returns empty results if the format is not fulfilled. This at least blocks the submission of feedback. I recommend an optional switch in the .env config file that sets default initials.

Long form

The problematic code

In the LDAP part of auth.php, the code tries to generate username initials from the LDAP displayname field:

            // Extract initials from user's display name
            $name = $info[0]["displayname"][0];
            $parts = explode(", ", $name);
            $initials = substr($parts[1], 0, 1) . substr($parts[0], 0, 1);

            // Set session variables
            $_SESSION['username'] = $initials;

The code expects the displayname to be in the format Mustermann, Max. If the displayname looks like Max Mustermann, the code fails and returns an empty username.

The effect

At the very least, this means that no feedback can be sent, as the feedback code terminates with an error message if messages and usernames are empty.

General conditions

The expectation that the display name is formatted with a comma is not supported by the RFC.

At our institution, the displayname may have the following formats:

  • Max Mustermann
  • Mustermann, Max
  • Max
  • Prof. Dr. Max Mustermann
  • Prof. Dr. Mustermann, Max
  • Prof. Dr. Mustermann

Proposed solution

We don't need the initials to work with HAWKI. For data protection reasons, we are even happy if no initials are displayed. Therefore, a switch that sets all initials to the abbreviation of the university (in our case UP for University of Potsdam) would be practical for us. The option could be called ldap_default_initials.
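
A sketch of how such a switch could work is shown below, together with a more tolerant fallback for the display name formats listed above. initialsFromDisplayName is a hypothetical helper, and treating every word that ends with a dot as a title is an assumption made for this example.

```php
<?php
/**
 * Derive initials from an LDAP displayName, or return a configured default.
 * Hypothetical helper; $defaultInitials models the proposed
 * ldap_default_initials option.
 */
function initialsFromDisplayName(string $displayName, string $defaultInitials = ''): string
{
    if ($defaultInitials !== '') {
        return $defaultInitials;
    }
    // Turn "Surname, Given" into "Given Surname"; leave other formats as they are.
    $parts = array_map('trim', explode(',', $displayName, 2));
    $name = count($parts) === 2 ? $parts[1] . ' ' . $parts[0] : $parts[0];

    // Split on whitespace and drop title-like words that end with a dot.
    $words = preg_split('/\s+/', trim($name), -1, PREG_SPLIT_NO_EMPTY);
    $words = array_values(array_filter($words, function ($word) {
        return substr($word, -1) !== '.';
    }));
    if ($words === []) {
        return '';
    }

    // Take the first character of the first and last word (UTF-8 aware).
    $firstChar = function (string $word): string {
        return preg_match('/^./us', $word, $m) === 1 ? $m[0] : '';
    };
    $initials = $firstChar($words[0]);
    if (count($words) > 1) {
        $initials .= $firstChar($words[count($words) - 1]);
    }
    return $initials;
}
```

With the default set to UP, every user gets the same initials; with an empty default, a single-word name such as Max yields one letter instead of an empty username.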

Docker

Could it be that some files related to the Docker environment are missing (docker-compose.yml e.g.)?
