LSA-Bot

License: GPLv3

Copyright (C) 2004 Salvatore La Bua slabua(at)gmail.com
http://www.slblabs.com/projects/lsabot
http://github.com/slabua/lsabot

IMPORTANT NOTICE

  1. The software is by no means complete and is still at a very early stage.
  2. I am in the process of cleaning up the code, removing unnecessary files, and finding a way to host the large raw data files the software needs.
  3. The preliminary scripts needed to preprocess the text are included in the repository, but there is not yet a clear set of instructions for running them.
  4. The data folder, because of its large size, has been compressed and must be extracted before LSA-Bot can be used.
    The archive is also hosted at: https://bitbucket.org/slabua/lsabot/downloads/data.tar.bz2

Introduction to the Project

  • LSA-Bot is a new, powerful kind of chat-bot centred on Latent Semantic Analysis (LSA).
  • LSA makes it possible to relate words to one another through their vectors, which allows building an intelligent chat-bot that can interpret human language and reply accordingly.

Some information about LSA-Bot

  • I started developing LSA-Bot at university on 12 September 2004 (the creation date of the first class).
  • LSA-Bot is written in Java and works by applying Latent Semantic Analysis (LSA) to a large collection of text documents (a corpus). Many chat-bot systems exist, and most of them use the AIML language to recognize users’ questions and produce answers, but with AIML the botmaster has to anticipate every kind of question a user might ask.
  • Using LSA gives the chat-bot a degree of intelligence: it can ignore, for instance, misspelled words, stop-words, and anything else that does not contribute to the meaning of a sentence.
  • LSA-Bot uses the vectors associated with every word found in the corpus to compute the ‘distance’ between the user’s question and all possible answers, which can be simple sentences, small documents, or whatever the programmer chooses. The word vectors are obtained by applying Singular Value Decomposition (SVD) to the matrix of word occurrences in the documents, using Matlab or any other software that can perform an SVD. From these vectors, LSA-Bot builds a vector for every word and for every question a user asks. The distance between the question and the plausible answers can be computed with cosine distance, rejection over projection, Tanimoto… The answer whose vector has the minimum distance is shown to the user.
  • Another feature is that the knowledge base of LSA-Bot can be extended (learn mode) by specifying a new sentence for the bot to learn; a new vector representing it is computed and added to the others.
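
The matching and learn-mode steps described above can be illustrated with a small Java sketch. It is not taken from the LSA-Bot sources: the class name `LsaSketch` and all of its methods are hypothetical, the word vectors are assumed to have been computed beforehand by an SVD step, and building a sentence vector by summing its word vectors is an assumption (the text above does not say how sentence vectors are combined). Only cosine similarity is shown; picking the highest similarity is equivalent to picking the minimum cosine distance.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch; none of these names come from the LSA-Bot sources. */
final class LsaSketch {
    // Word -> vector in the reduced LSA space (assumed output of a prior SVD step).
    private final Map<String, double[]> wordVectors;
    private final int dims;
    // Known answers and their sentence vectors, kept in parallel lists.
    private final List<String> answers = new ArrayList<>();
    private final List<double[]> answerVectors = new ArrayList<>();

    LsaSketch(Map<String, double[]> wordVectors, int dims) {
        this.wordVectors = wordVectors;
        this.dims = dims;
    }

    /** Sums the vectors of all known words; words without a vector are skipped. */
    double[] sentenceVector(String sentence) {
        double[] sum = new double[dims];
        String[] tokens = sentence.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+");
        for (String token : tokens) {
            double[] v = wordVectors.get(token);
            if (v == null) {
                continue; // unknown or misspelled word: it contributes nothing
            }
            for (int i = 0; i < dims; i++) {
                sum[i] += v[i];
            }
        }
        return sum;
    }

    /** Cosine of the angle between a and b; 0.0 if either vector is all zeros. */
    static double cosineSimilarity(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / Math.sqrt(normA * normB);
    }

    /** Learn mode: stores a sentence together with its vector. */
    void learn(String sentence) {
        answers.add(sentence);
        answerVectors.add(sentenceVector(sentence));
    }

    /** Returns the stored sentence most similar to the question, or null if none is stored. */
    String answer(String question) {
        double[] q = sentenceVector(question);
        String best = null;
        double bestSimilarity = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < answers.size(); i++) {
            double similarity = cosineSimilarity(q, answerVectors.get(i));
            if (similarity > bestSimilarity) { // on ties the earliest learned sentence wins
                bestSimilarity = similarity;
                best = answers.get(i);
            }
        }
        return best;
    }
}
```

In this sketch, a word that has no vector is simply skipped, which mirrors the idea of ignoring words that do not contribute to the meaning of a sentence. Calling `learn` with a new sentence adds one more candidate vector without recomputing the existing ones.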

Resources

  1. ResearchGate Thesis publication

LICENSE

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.

Screenshots

LSA-Bot main interface
