AQUA

An Advanced QUery Architecture for the SPARC Portal

Introduction

In today's world, where information retrieval is expected in a matter of milliseconds, a strong search engine is essential. Take Google and Facebook, two of the most widely used search tools in the world, for example. Every day they handle millions or even billions of queries, many of which contain typos or only approximately match what the user is looking for. Facebook returns suggestions for a profile name we search for fuzzily without knowing the exact spelling. Google returns not only what we asked for, but also a great deal of related information. In the medical research world, unless published datasets are easy to find, their potential for reuse and collaboration will be limited. In that sense, a search engine serves as the middleman between scientific discovery and scientific progress. The stronger and smarter the search engine is, the faster the progress.

About us

AQUA is an application that aims to improve the search capabilities of the SPARC Portal. In particular, we want to make the search engine smarter at reading and understanding the keywords that users enter. We also enhance the result display of the SPARC Portal by making it more user-friendly and by providing users with more sophisticated options for filtering and sorting results. Our end goal is to greatly improve the visibility of the SPARC datasets. This in turn will benefit the SPARC community as a whole, since its datasets will be more discoverable for reuse and subsequent collaboration. This project was created during the 2021 SPARC FAIR Codeathon.

Problem

Currently, the search feature of the SPARC Portal is very limited:

  1. It does not account for near-matches, e.g. when a word is misspelled.

  2. The result display is basic, e.g. result filtering is limited and matched text is not highlighted in the results.
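
To illustrate the first limitation, here is a minimal sketch of the difference between exact keyword matching and fuzzy matching. It is not AQUA's implementation: the vocabulary, the function names, and the cutoff value are assumptions made for illustration, and the standard-library `difflib` module stands in for whatever matching technique a real search backend would use.

```python
from difflib import get_close_matches

# Hypothetical vocabulary of indexed terms (not taken from the SPARC Portal).
VOCABULARY = ["vagus", "nerve", "stimulation", "heart", "colon", "stomach"]


def exact_match(term, vocabulary=VOCABULARY):
    """Return vocabulary entries that equal the term exactly (case-insensitive)."""
    term = term.lower()
    return [word for word in vocabulary if word == term]


def fuzzy_match(term, vocabulary=VOCABULARY, cutoff=0.75):
    """Return up to three vocabulary entries that closely resemble the term.

    The cutoff of 0.75 is an assumed similarity threshold, not a tuned value.
    """
    return get_close_matches(term.lower(), vocabulary, n=3, cutoff=cutoff)


if __name__ == "__main__":
    query = "vagis"  # misspelling of "vagus"
    print("exact:", exact_match(query))
    print("fuzzy:", fuzzy_match(query))
```

With exact matching, a single wrong letter means the query finds nothing. A similarity-based lookup can still recover the intended term.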

AQUA solution

  1. Apply Artificial Intelligence tools (Natural Language Processing) to process users’ search keywords and to implement predictive typing (suggestion-based typing).
  • In detail, in addition to lemmatization, other NIH tools (e.g. NIF Ontology) will be used to derive the origins of words and make suggestions for users as they type. This will help AQUA standardize varied user input and return as many datasets as possible that match the search keywords.
  2. Enhance the current result display by:
  • Bolding/highlighting matched text in results for easy lookup.

  • Adding a more sophisticated dataset results filter (based on relevance, date of publication, and other customized filtering) to the current portal.

  • Adding a “Notify me when related datasets are published” option when “No results” is returned. This will allow users to enter their email address, which the SPARC Portal stores for future alerts.
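
The ideas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not AQUA's code: the function names (`suggest`, `highlight`, `sort_results`), the result fields (`score`, `published`), and the `<b>` tags are all hypothetical, and simple prefix matching stands in for the lemmatization and ontology lookup described in the list.

```python
import re
from datetime import date


def suggest(prefix, vocabulary, limit=5):
    """Predictive typing: return vocabulary terms that start with the typed prefix."""
    prefix = prefix.lower()
    return sorted(w for w in vocabulary if w.lower().startswith(prefix))[:limit]


def highlight(text, keywords, open_tag="<b>", close_tag="</b>"):
    """Wrap each whole-word, case-insensitive keyword match in the given tags."""
    keywords = [k for k in keywords if k]  # ignore empty keywords
    if not keywords:
        return text
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keywords) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: f"{open_tag}{m.group(0)}{close_tag}", text)


def sort_results(results, by="relevance"):
    """Sort results by relevance score or publication date, highest or newest first."""
    keys = {
        "relevance": lambda r: r["score"],
        "date": lambda r: r["published"],
    }
    return sorted(results, key=keys[by], reverse=True)


if __name__ == "__main__":
    # Hypothetical search results; the fields are assumptions for this sketch.
    results = [
        {"title": "Vagus nerve stimulation", "score": 0.9, "published": date(2020, 1, 1)},
        {"title": "Colon motility mapping", "score": 0.2, "published": date(2021, 7, 1)},
    ]
    print(suggest("st", ["stomach", "stimulation", "heart"]))
    print(highlight(results[0]["title"], ["vagus"]))
    print([r["title"] for r in sort_results(results, by="date")])
```

Keeping suggestion, highlighting, and sorting as separate small functions lets each be swapped for a more capable version (for example, an ontology-backed suggester) without touching the others.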

How it works

[Figure: AQUA interface]

Documentation

For detailed documentation of our application, please visit here.

Developers

Contributors

tramngo1603, nickerso
