fintoai's Introduction

Finto AI suggests subjects for a given text. It is based on Annif, a tool for automated subject indexing. Finto AI is also available as an API service that can be integrated into other systems.

fintoai's People

Contributors

juhoinkinen, monalehtinen, osma, unnikohonen

fintoai's Issues

Detect language of text

A possible enhancement for the Finto AI user interface would be to detect the language of input text. This would be easy to implement if the corresponding functionality is added to the Annif REST API: see NatLibFi/Annif#631

How best to add this feature to the UI needs a bit more thought. I can think of two approaches:

  1. Make this functionality completely separate from the subject indexing. Basically, a button (or other widget) that the user clicks on and then the UI shows the detected language.
  2. Integrate with the subject indexing functionality. For example, first detect the language, then narrow down the available projects based on the language (example: if the text is detected as English, show only projects intended for English language).
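The second approach could be sketched roughly as follows. This is only an illustration: it assumes that each project entry carries a `language` field (as in the project list returned by the Annif REST API) and that the detected language is available as a two-letter code. The function name and the sample project IDs are made up for the example.

```python
from typing import Dict, List


def projects_for_language(projects: List[Dict[str, str]], detected: str) -> List[Dict[str, str]]:
    """Keep only projects whose language matches the detected language.

    Falls back to the full list when nothing matches, so the user is never
    left with an empty project selector.
    """
    matching = [p for p in projects if p.get("language") == detected]
    return matching or list(projects)


# Illustrative project list; the IDs are placeholders.
projects = [
    {"project_id": "yso-fi", "language": "fi"},
    {"project_id": "yso-sv", "language": "sv"},
    {"project_id": "yso-en", "language": "en"},
]

print([p["project_id"] for p in projects_for_language(projects, "en")])
```

Falling back to all projects when no project matches is a design choice: a wrong or unsupported detection result should not block the user from indexing.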

Rich preview for Finto AI website

We should have HTML metadata that allows a rich preview (with logo, title and description) when the URL of Finto AI is shared on social media, returned in search engine results etc.

Here's how this was implemented for annif.org:

[Image: rich preview of annif.org]
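One common way to get such previews is Open Graph metadata in the page head. The sketch below assumes that approach (the issue does not say which tags annif.org uses); the function name, URL and image path are placeholders.

```python
from html import escape


def rich_preview_tags(title: str, description: str, url: str, image_url: str) -> str:
    """Render Open Graph meta tags, plus a Twitter card tag, for a page head."""
    properties = {
        "og:type": "website",
        "og:title": title,
        "og:description": description,
        "og:url": url,
        "og:image": image_url,
    }
    lines = [
        f'<meta property="{escape(name)}" content="{escape(value)}">'
        for name, value in properties.items()
    ]
    lines.append('<meta name="twitter:card" content="summary">')
    return "\n".join(lines)


# Placeholder values; the real URL and logo path would come from the site.
print(rich_preview_tags(
    title="Finto AI",
    description="Finto AI suggests subjects for a given text.",
    url="https://example.org/",
    image_url="https://example.org/logo.png",
))
```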

Uploading PDF files via webpage

Currently the text to be indexed can be entered in the form on the webpage or sent directly to the API's suggest method. However, for many users the original document is a PDF file, from which the user first needs to extract the text (for a webpage user this means manually copying it to the clipboard). This issue is for adding a feature to upload a PDF file via the webpage.

Needs:

  • file upload functionality to the webpage
  • extraction of text from the uploaded PDF file
  • passing the extracted text to Annif API
  • possibly error handling in upload and/or text extraction
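The needs listed above could fit together roughly like this. It is a sketch under assumptions: the actual text extraction is left to a caller-supplied function (`extract_text` is hypothetical and would wrap whatever PDF library is chosen), and the size limit is an arbitrary example value.

```python
from typing import Callable

MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # assumed limit, not taken from the issue


class UploadError(Exception):
    """Raised when an uploaded file cannot be turned into indexable text."""


def text_from_upload(data: bytes, extract_text: Callable[[bytes], str]) -> str:
    """Validate an uploaded file and return the text to pass on to the Annif API."""
    if len(data) > MAX_UPLOAD_BYTES:
        raise UploadError("file is too large")
    # Cheap sanity check: PDF files normally begin with the "%PDF-" marker.
    if not data.startswith(b"%PDF-"):
        raise UploadError("file does not look like a PDF")
    try:
        text = extract_text(data)
    except Exception as exc:
        raise UploadError(f"text extraction failed: {exc}") from exc
    if not text.strip():
        raise UploadError("no extractable text found")
    return text
```

Keeping the extractor as a parameter means the error handling can be tested without a real PDF, and the PDF library can be swapped later without touching the upload logic.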

Some examples:

Some open questions:

  • Layout possibilities:
    • "upload file" button in the top-right corner of the text box (above the clear button) and drag-and-drop functionality for the textarea
    • dedicated tabs: one for raw-text input as now: "Enter text to be indexed"; another for the PDF upload: "Upload PDF to be indexed"
  • After upload and text extraction, should the extracted text be shown on the webpage (in textarea)?

Show Annif version information

Finto AI uses Annif for its main functionality, so it would be useful to clearly show the version number of Annif that Finto AI is currently using.

Ideas so far include showing the Annif version number on Finto AI's main page, for example in the footer with the legal info and the Annif logo. It could be written out or shown on mouse-over when hovering over the Annif logo.
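As a sketch of the footer text, assuming the version is read from a JSON info payload with a `version` field (to my knowledge the root of the Annif REST API returns one, but that should be verified against the deployed version). The function name and the sample version number are placeholders.

```python
def annif_version_label(api_info: dict) -> str:
    """Build the footer or tooltip text from an API info payload."""
    version = api_info.get("version")
    return f"Annif {version}" if version else "Annif (version unknown)"


# Sample payload; the version number is a placeholder.
print(annif_version_label({"title": "Annif REST API", "version": "1.2.3"}))
```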

Copying multiple suggestions at once

Adding a way to copy multiple suggestions at once to clipboard could speed up using Finto AI with other systems, especially Melinda.

Each row of the suggestions list could have a checkbox for selecting the suggestion for copying, and at the top of the checkbox column there could be one checkbox for selecting or deselecting all suggestions. (Should all suggestions initially be selected or deselected?) For the actual copying there could be three copy buttons as now, which would copy either labels, URIs, or both with a language code for Melinda. (Where should the buttons be placed?)

This copy-many feature would be mostly useful for Melinda. It may not be very useful for just labels and URIs. (What should the format of the list of labels/URIs be: comma-separated?) So it could be better to offer the feature only for copying Melinda-formatted suggestions and to leave the current copy buttons for labels and URIs as they are.
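For plain labels or URIs, the selection logic could look like the sketch below. The separator is left as a parameter because the list format is still an open question, and the Melinda format is deliberately not covered since it has to be agreed on first. The function name is hypothetical, and the second sample suggestion is a placeholder.

```python
from typing import Dict, List


def clipboard_text(
    suggestions: List[Dict[str, str]],
    selected: List[bool],
    field: str = "label",
    separator: str = "\n",
) -> str:
    """Join one field of the checked suggestions into a single clipboard string."""
    return separator.join(
        suggestion[field]
        for suggestion, checked in zip(suggestions, selected)
        if checked
    )


suggestions = [
    {"label": "kimalaiset", "uri": "http://www.yso.fi/onto/yso/p11119"},
    {"label": "placeholder label", "uri": "http://example.org/concept/2"},
]

# Copy the labels of all checked rows as a comma-separated list.
print(clipboard_text(suggestions, [True, True], field="label", separator=", "))
```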

This feature should also be coordinated with Melinda to agree on the right format for the list of suggestions. Melinda's migration from Aleph to a new system should be kept in mind as well. (Could there be an even better way to connect Finto AI and the new Melinda system than via the clipboard?)

Button(s) for copying concept information into Alma

We already have a button in the UI for copying individual subject suggestions in a format that can be pasted directly to Aleph/Melinda.

The same could be done for Ex Libris Alma, which is used by many university libraries in Finland (e.g. University of Helsinki and Åbo Akademi). Alma can be configured to use either ‡ or $$ as the subfield separator, so we may need two variants of the same button, unless there is some sort of agreement between Alma users to standardize on either symbol.

Here are examples of strings that can be pasted into Alma for both separator variants:

‡a kimalaiset. ‡0 http://www.yso.fi/onto/yso/p11119 ‡2 yso/fin
$$a kimalaiset. $$0 http://www.yso.fi/onto/yso/p11119 $$2 yso/fin
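Both variants could come from one function with the separator as a parameter. A minimal sketch that reproduces the two example strings above (the function and parameter names are made up here):

```python
def alma_subject_string(label: str, uri: str, source: str, separator: str = "‡") -> str:
    """Format one suggestion as subfields a (label), 0 (URI) and 2 (source)."""
    return f"{separator}a {label}. {separator}0 {uri} {separator}2 {source}"


for sep in ("‡", "$$"):
    print(alma_subject_string(
        "kimalaiset", "http://www.yso.fi/onto/yso/p11119", "yso/fin", separator=sep
    ))
```

With this shape, the UI could either offer two buttons or a single button plus a stored user preference for the separator.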

Simplify translation of terms

Currently Finto AI allows the user to select the language of terms. If the user selects a language that is different from the project/vocabulary language, the labels are retrieved from the Finto REST API with extra API calls (one call per suggested subject).

Nowadays Annif vocabularies are multilingual and in the upcoming 0.60 release the suggest method in the REST API supports a language parameter, which can be used to request the terms in a non-default language. Finto AI could switch to this, which would simplify the functionality, reduce the maintenance burden and perhaps prevent errors caused by problems accessing the Finto API.
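A rough sketch of such a request, assuming the suggest method accepts form-encoded `text`, `limit` and `language` fields at `/projects/{project_id}/suggest`. The base URL and project ID are placeholders, and the request is only built here, not sent.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request


def build_suggest_request(
    api_base: str,
    project_id: str,
    text: str,
    language: Optional[str] = None,
    limit: int = 10,
) -> Request:
    """Build a POST request for the suggest method, optionally asking for labels in another language."""
    fields = {"text": text, "limit": str(limit)}
    if language:
        # Only sent when the user picked a language other than the project default.
        fields["language"] = language
    return Request(
        f"{api_base}/projects/{project_id}/suggest",
        data=urlencode(fields).encode("utf-8"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Placeholder base URL and project ID.
request = build_suggest_request("https://annif.example.org/v1", "yso-fi", "kimalaiset", language="sv")
print(request.full_url)
print(request.data.decode("utf-8"))
```

A single call like this would replace the per-suggestion label lookups against the Finto REST API.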

Uploading multiple files

In some use cases it could be desirable to upload multiple files in one go. Annif nowadays has the suggest-batch REST method (NatLibFi/Annif#664), which could be utilized for this.

The output suggestion sets (and probably also the raw input texts) should somehow be separated per input file on the web page, which is probably the hardest part of this.

The Arkkiivi service also has this functionality.
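One way to keep results separated per file is to use the file name as the document identifier. The sketch below assumes that the batch request body is a `documents` list of `document_id`/`text` pairs and that the response is a list of items with `document_id` and `results`. Both shapes are assumptions and should be checked against the Annif API description; the function names are made up.

```python
from typing import Dict, List


def build_batch_payload(files: Dict[str, str]) -> dict:
    """Turn a mapping of file name -> extracted text into a batch request body."""
    return {
        "documents": [
            {"document_id": name, "text": text} for name, text in files.items()
        ]
    }


def labels_by_file(batch_response: List[dict]) -> Dict[str, List[str]]:
    """Group suggested labels by the file they belong to, for display per file."""
    return {
        item["document_id"]: [result["label"] for result in item["results"]]
        for item in batch_response
    }


payload = build_batch_payload({"a.pdf": "text of a", "b.pdf": "text of b"})

# Hand-written stand-in for a server response, for illustration only.
fake_response = [
    {"document_id": "a.pdf", "results": [{"label": "kimalaiset"}]},
    {"document_id": "b.pdf", "results": []},
]
print(labels_by_file(fake_response))
```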
