natlibfi / fintoai

Finto AI suggests subjects for a given text. It's based on Annif, a tool for automated subject indexing.

HTML 4.59% CSS 4.66% JavaScript 90.75%

subject-indexing code4lib rest-api classification multilabel-classification text-classification annif glam

fintoai's Introduction

Finto AI suggests subjects for a given text. It's based on Annif, a tool for automated subject indexing. Finto AI is also an API service that can be integrated into other systems.

fintoai's Issues

Detect language of text

A possible enhancement for the Finto AI user interface would be to detect the language of input text. This would be easy to implement if the corresponding functionality is added to the Annif REST API: see NatLibFi/Annif#631

How best to add this feature to the UI needs a bit more thought. I can think of two approaches:

- Make this functionality completely separate from the subject indexing. Basically, a button (or other widget) that the user clicks on and then the UI shows the detected language.
- Integrate with the subject indexing functionality. For example, first detect the language, then narrow down the available projects based on the language (example: if the text is detected as English, show only projects intended for English language).
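
The second approach could be sketched roughly as follows. This is only an illustration, not existing Finto AI code: it assumes that language detection returns a list of candidates with `language` and `score` fields, and that each project object carries a `language` field as in the Annif project listing. The function names, the `minScore` threshold and the sample project data are hypothetical.

```javascript
// Pick the most likely language from the detection candidates,
// or null when no candidate reaches the (arbitrary) minimum score.
function pickLanguage(candidates, minScore = 0.5) {
  const best = candidates
    .filter((c) => c.language !== null && c.score >= minScore)
    .sort((a, b) => b.score - a.score)[0];
  return best ? best.language : null;
}

// Narrow the project list to the detected language.
// Falls back to the full list when nothing was detected or nothing matches,
// so the user is never left with an empty project menu.
function filterProjectsByLanguage(projects, language) {
  if (language === null) return projects;
  const matching = projects.filter((p) => p.language === language);
  return matching.length > 0 ? matching : projects;
}

// Example data (made up for illustration).
const exampleProjects = [
  { project_id: 'example-fi', language: 'fi' },
  { project_id: 'example-en', language: 'en' },
];
const detected = pickLanguage([
  { language: 'fi', score: 0.1 },
  { language: 'en', score: 0.9 },
]);
const visibleProjects = filterProjectsByLanguage(exampleProjects, detected);
```

Falling back to the full list keeps the UI usable when detection is uncertain, which also leaves room for the first approach (a separate "detect language" widget) on top of the same helper.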

Rich preview for Finto AI website

We should have HTML metadata that allows a rich preview (with logo, title and description) when the URL of Finto AI is shared on social media, returned in search engine results etc.
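
As a rough sketch of what such metadata could look like, the snippet below generates Open Graph tags plus a plain description tag and a Twitter card tag. It is not taken from the annif.org implementation; the function names are hypothetical, and the URL and logo path in the example are placeholders that would need to be replaced with the real ones.

```javascript
// Escape a value so it is safe inside a double-quoted HTML attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Build the <meta> tags for a rich link preview.
function richPreviewTags({ title, description, url, image }) {
  const og = {
    'og:type': 'website',
    'og:title': title,
    'og:description': description,
    'og:url': url,
    'og:image': image,
  };
  const tags = Object.entries(og).map(
    ([property, content]) =>
      `<meta property="${property}" content="${escapeAttr(content)}">`
  );
  tags.push(`<meta name="description" content="${escapeAttr(description)}">`);
  tags.push('<meta name="twitter:card" content="summary">');
  return tags.join('\n');
}

// Placeholder values: the URL and image path are assumptions.
const previewHtml = richPreviewTags({
  title: 'Finto AI',
  description: 'Finto AI suggests subjects for a given text.',
  url: 'https://example.org/',
  image: 'https://example.org/logo.png',
});
```

The resulting tags would go into the `<head>` of the page; since they are static, they could equally well be written by hand into the HTML template.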

Here's how this was implemented for annif.org:

Downloading suggestion results as CSV file

There could be an option to download the suggestions as a CSV (or TSV) file. After implementing #4, the file should only consist of those suggestions that have been selected.

The Arkkiivi service has this kind of CSV-file download functionality.
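
A minimal sketch of the CSV generation, assuming each suggestion object has `label`, `uri` and `score` fields and a hypothetical `selected` flag set by the selection checkboxes. The column choice and function names are illustrative only.

```javascript
// Quote a field when it contains a comma, a double quote or a line break,
// doubling any embedded double quotes (the usual CSV convention).
function csvField(value) {
  const text = String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Serialize only the selected suggestions, with a header row.
function suggestionsToCsv(suggestions) {
  const header = ['label', 'uri', 'score'];
  const rows = suggestions
    .filter((s) => s.selected)
    .map((s) => [s.label, s.uri, s.score].map(csvField).join(','));
  return [header.join(','), ...rows].join('\r\n') + '\r\n';
}

// Example data: the second row is not selected and is therefore left out.
const exampleCsv = suggestionsToCsv([
  { label: 'kimalaiset', uri: 'http://www.yso.fi/onto/yso/p11119', score: 0.5, selected: true },
  { label: 'not selected', uri: 'http://example.org/x', score: 0.1, selected: false },
]);
```

In the browser, the string could then be wrapped in a `Blob` with type `text/csv` and offered through a temporary link created with `URL.createObjectURL`. For TSV output the same approach works with a tab as the delimiter.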

Uploading PDF files via webpage

Currently, the text to be indexed can be entered in the form on the webpage or sent directly to the API's suggest method. However, for many users the original document is a PDF file, from which the user must first extract the text (for a webpage user, this means manually copying it to the clipboard). This issue is for adding a feature to upload a PDF file via the webpage.

Needs:

- file upload functionality to the webpage
- extraction of text from the uploaded PDF file
- passing the extracted text to Annif API
- possibly error handling in upload and/or text extraction
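
The steps listed above could be tied together roughly like this. The sketch deliberately leaves out the actual PDF parsing and the HTTP call: `extractText` and `suggest` are hypothetical functions passed in by the caller (for example a wrapper around a PDF library and a wrapper around the API's suggest method), so that the flow and its error handling can be shown on their own.

```javascript
// Run the upload pipeline for one file and report either the
// suggestions or a user-facing error message.
async function indexPdf(file, { extractText, suggest }) {
  // Step 1: reject anything that is not a PDF (File/Blob objects expose `type`).
  if (file.type !== 'application/pdf') {
    return { ok: false, error: 'Only PDF files are supported.' };
  }
  // Step 2: extract the text; extraction may fail for damaged files.
  let text;
  try {
    text = (await extractText(file)).trim();
  } catch (err) {
    return { ok: false, error: `Text extraction failed: ${err.message}` };
  }
  // Scanned PDFs without a text layer typically yield no text at all.
  if (text === '') {
    return { ok: false, error: 'No text could be extracted from the PDF.' };
  }
  // Step 3: pass the extracted text on to the API.
  try {
    return { ok: true, text, suggestions: await suggest(text) };
  } catch (err) {
    return { ok: false, error: `Suggest request failed: ${err.message}` };
  }
}
```

Returning the extracted `text` alongside the suggestions keeps the open question of whether to show it in the textarea a pure UI decision.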

Some examples:

Some open questions:

Layout possibilities:
- "upload file" button to top-right corner of text-box (above clear button) and drag-and-drop functionality to textarea
- dedicated tabs: one for raw-text input as now: "Enter text to be indexed"; another for the PDF upload: "Upload PDF to be indexed"
After upload and text extraction, should the extracted text be shown on the webpage (in textarea)?

Add license

There should be a license in this repository.

Show Annif version information

Finto AI uses Annif for its main functionality, so it would be useful to clearly show the current version number of Annif that Finto AI is using.

Ideas - for now - include showing the Annif version number on Finto AI's main page, for example in the footer with legal info & Annif logo. It could be written down or shown on mouse-over when hovering over the Annif logo.

Copying multiple suggestions at once

Adding a way to copy multiple suggestions at once to clipboard could speed up using Finto AI with other systems, especially Melinda.

On each row of the suggestions list there could be a checkbox that would be used to select the suggestion for copying, and on top of the checkbox column one checkbox for selecting/deselecting all suggestions. (Should initially all suggestions be selected or deselected?) For the actual copying there could be three copy buttons like now, which would copy either labels, URIs, or both with language code for Melinda. (Where should the buttons be placed?)

This copy-many feature would be mostly useful for Melinda. It may not be very useful for just labels and URIs. (What should the format of the list of labels/URIs be, comma-separated?) So, it could be better to have the feature only for copying Melinda-formatted suggestions, and the current copy buttons for labels and URIs could be left as they are.

Also, this feature should be coordinated with Melinda to agree on the right format for the list of suggestions. Melinda's migration to a new system (from Aleph) should be kept in mind as well. (Could there be an even better way to connect Finto AI and the new Melinda system than via clipboard?)

Button(s) for copying concept information into Alma

We already have a button in the UI for copying individual subject suggestions in a format that can be pasted directly to Aleph/Melinda.

The same could be done with Ex Libris Alma, which is used by many university libraries in Finland (e.g. University of Helsinki and Åbo Akademi). Alma can be configured to use either ‡ or $$ as the subfield separator, so we may need two variants of the same button, unless there is some sort of agreement between Alma users to standardize on either symbol.

Here are examples of strings that can be pasted into Alma for both separator variants:

‡a kimalaiset. ‡0 http://www.yso.fi/onto/yso/p11119 ‡2 yso/fin
$$a kimalaiset. $$0 http://www.yso.fi/onto/yso/p11119 $$2 yso/fin
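
A small sketch of how both variants could be produced from one helper. The function name and parameter layout are hypothetical; the output format follows the two example strings above, with the source code `yso/fin` taken from those examples as the default.

```javascript
// Format one suggestion as an Alma subject string.
// `separator` is the subfield separator configured in Alma: '‡' or '$$'.
function toAlmaString({ label, uri }, separator = '‡', source = 'yso/fin') {
  return `${separator}a ${label}. ${separator}0 ${uri} ${separator}2 ${source}`;
}

const bumblebees = { label: 'kimalaiset', uri: 'http://www.yso.fi/onto/yso/p11119' };
const almaDagger = toAlmaString(bumblebees);        // '‡' variant
const almaDollar = toAlmaString(bumblebees, '$$');  // '$$' variant
```

With the separator as a parameter, the UI could offer either two buttons or a single button plus a user setting, depending on what Alma users agree on.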

Simplify translation of terms

Currently Finto AI allows the user to select the language of terms. If the user selects a language that is different from the project/vocabulary language, the labels are retrieved from the Finto REST API with extra API calls (one call per suggested subject).

Nowadays Annif vocabularies are multilingual and in the upcoming 0.60 release the suggest method in the REST API supports a language parameter, which can be used to request the terms in a non-default language. Finto AI could switch to this, which would simplify the functionality, reduce the maintenance burden and perhaps prevent errors caused by problems accessing the Finto API.
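
As a sketch of the simplified flow, the helper below builds a single suggest request that already asks for the labels in the desired language, replacing the per-subject label lookups. The `language` parameter is the one described above; the function name, the form-encoded request body, the `limit` default and the base URL and project id in the example are assumptions for illustration.

```javascript
// Build the URL and fetch() options for one suggest call.
// When `language` is omitted, the project's default language is used.
function buildSuggestRequest(baseUrl, projectId, text, { language, limit = 10 } = {}) {
  const body = new URLSearchParams({ text, limit: String(limit) });
  if (language) body.set('language', language);
  return {
    url: `${baseUrl}/projects/${encodeURIComponent(projectId)}/suggest`,
    // A URLSearchParams body is sent as application/x-www-form-urlencoded.
    options: { method: 'POST', body },
  };
}

// Placeholder base URL and project id.
const suggestRequest = buildSuggestRequest(
  'https://example.org/v1', 'example-fi', 'Some text to index', { language: 'en' }
);
// Usage: fetch(suggestRequest.url, suggestRequest.options).then((r) => r.json())
```

Compared with the current approach, this needs one request in total instead of one extra request per suggested subject, and it removes the runtime dependency on the Finto API for label translation.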

Show class/notation code for classification suggestions and add copy button for it

Testing YKL models in ai.dev.finto.fi raised a request to show the class/notation codes in addition to the class names in the suggestions.
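
A minimal sketch of the display logic, assuming each suggestion object may carry a `notation` field that is empty or missing for vocabularies without notation codes. The function name and the sample values are made up for illustration.

```javascript
// Prefix the label with the notation code when one is available.
function displayLabel(suggestion) {
  return suggestion.notation
    ? `${suggestion.notation} ${suggestion.label}`
    : suggestion.label;
}

// Illustrative data only, not real classification entries.
const withNotation = displayLabel({ notation: '12.3', label: 'Example class' });
const withoutNotation = displayLabel({ notation: null, label: 'example subject' });
```

A copy button for the code could then call `navigator.clipboard.writeText(suggestion.notation)` and be hidden whenever the notation is missing.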

Uploading multiple files

In some use cases it could be desirable to be able to upload multiple files in one go. Annif nowadays has the suggest-batch REST method (NatLibFi/Annif#664) which could be utilized for this.

The output suggestion sets (and probably also the raw input texts) should be somehow separated per input file in the web page, which is probably the hardest part of this.

The Arkkiivi service also has this functionality.
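
A rough sketch of the data handling for a batch, leaving out the HTTP call itself. The shapes used here are assumptions and should be checked against the Annif API documentation: the request is taken to be an object with a `documents` list of `{ document_id, text }` entries, and the response a list of `{ document_id, results }` entries. The function names are hypothetical.

```javascript
// Build the batch request body. The array index is used as document_id
// so that two files with the same name cannot be mixed up.
function buildBatchPayload(files) {
  return {
    documents: files.map((f, i) => ({ document_id: String(i), text: f.text })),
  };
}

// Attach each result set back to the file it belongs to, in upload order.
// Files without a matching entry in the response get an empty list.
function groupResultsByFile(files, batchResponse) {
  const byId = new Map(batchResponse.map((d) => [d.document_id, d.results]));
  return files.map((f, i) => ({
    name: f.name,
    suggestions: byId.get(String(i)) || [],
  }));
}

// Example with made-up file names, texts and results.
const uploaded = [
  { name: 'a.pdf', text: 'first text' },
  { name: 'b.pdf', text: 'second text' },
];
const batchPayload = buildBatchPayload(uploaded);
const grouped = groupResultsByFile(uploaded, [
  { document_id: '1', results: [{ label: 'for b' }] },
  { document_id: '0', results: [{ label: 'for a' }] },
]);
```

Keeping the per-file grouping in one place gives the UI a simple list to render, for example as one collapsible section per uploaded file.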
