php-nlgen's Introduction

NLGen: a library for creating recursive-descent natural language generators

These are pure PHP helper classes to implement recursive-descent natural language generators [1]. The classes provided are an abstract generator, an ontology container and a lexicon container.

These classes should help build generators of simple to medium complexity. The emphasis is on keeping more advanced features out of the way in simpler cases (i.e., if there is no need for the ontology or the lexicon, they can be skipped).

The generator keeps track of semantic annotations on the generated text, so as to enable further generation functions to reason about the text. A global context blackboard is also available.

For details on the multilingual example see the Make Web Not War talk. [2]

This is a work in progress; see the ROADMAP for some insight into future development.

Available Generation Grammars

NLGen ships with a ready-to-use generation grammar that constructs text descriptions of weekly schedules. The grammar is accessible by importing \NLGen\Grammars\Availability\AvailabilityGenerator.

The method generateAvailability receives a list of "busy times", each in the form

[ day-of-week, [ start hour, start minute ], [ end hour, end minute ] ]

a list of ranges indicating when each scheduled day starts and ends (in the form day-of-week => [ [ start hour, start minute ], [ end hour, end minute ] ]), and a constant indicating how coarse the text should be (a one-line summary or a very detailed description).

See examples/availability and tests/Availability/AvailabilityTest.

Example:

use NLGen\Grammars\Availability\AvailabilityGenerator;

$gen = new AvailabilityGenerator();
$busyList = [
  [3, [16, 30], [17, 30] ],
  [6, [ 6, 55], [11, 41] ],
  [6, [14, 32], [22,  5] ]
];
$fullRanges = [];
foreach(range(0, 6) as $dow) {
 $fullRanges[$dow] = [ [6, 0], [24, 0] ];
}
echo $gen->generateAvailability($busyList, $fullRanges, AvailabilityGenerator::BASE, null);

This produces: All week is mostly free all day. Sunday is busy from late 6 AM to late 11 AM, and from half past 14 PM to 22 PM; the rest is free.
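When busy times arrive as strings such as "16:30", a small helper can build entries in the format described above. This is only a sketch: busyEntry is a hypothetical function name, not part of NLGen, and it assumes times are written as "HH:MM".

```php
<?php
// Hypothetical helper, not part of NLGen: builds one busy-time entry of the
// form [day-of-week, [start hour, start minute], [end hour, end minute]].
function busyEntry(int $dayOfWeek, string $start, string $end): array
{
    $parse = static function (string $time): array {
        // "06:55" becomes [6, 55]; intval() sidesteps octal-looking literals
        // such as 05. A missing minute part is treated as 0.
        $parts = array_pad(explode(':', $time, 2), 2, '0');
        return array_map('intval', $parts);
    };

    return [$dayOfWeek, $parse($start), $parse($end)];
}

// The same busy list as in the example above.
$busyList = [
    busyEntry(3, '16:30', '17:30'),
    busyEntry(6, '06:55', '11:41'),
    busyEntry(6, '14:32', '22:05'),
];
```

The resulting $busyList can be passed to generateAvailability exactly as in the example.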

Using it in your own projects

Look at the examples/ folder for details. In a nutshell: subclass the NLGen\Generator class and implement a function named top. This function can return either a string or an array with a text entry and a sem entry, where sem holds semantic annotations on the returned text.

If you want to use other functions to assemble the text, call them with $this->gen('name_of_the_function', $data_array_input_to_the_function) instead of $this->name_of_the_function($data_array_input_to_the_function). Alternatively, define your functions as protected and use function interposition, described below. The abstract generator class keeps track of the semantic annotations for you, among other goodies.

If the functions that implement the grammar are protected, a dynamic class can be created with the NewSealed class method. This dynamic class intercepts function calls, so you can call $this->name_of_function as usual and $this->gen will actually be called.
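To illustrate the idea of interposition, here is a minimal, self-contained sketch. It is not NLGen's implementation: SketchGenerator, Greeter, SealedGreeter, the original method and the trace property are all hypothetical names, and the "sealed" subclass is written by hand here, whereas NewSealed creates its class dynamically.

```php
<?php
// Hypothetical sketch of function interposition; not NLGen's actual code.

abstract class SketchGenerator
{
    /** Names of the grammar functions dispatched through gen(), in call order. */
    public array $trace = [];

    /** Central dispatch point: record the call, then run the real function. */
    protected function gen(string $name, array $args): string
    {
        $this->trace[] = $name;
        return $this->original($name, $args);
    }

    /** Runs the un-intercepted implementation of a grammar function. */
    protected function original(string $name, array $args): string
    {
        return $this->$name(...$args);
    }

    public function generate(array $data): string
    {
        return $this->gen('top', [$data]);
    }
}

class Greeter extends SketchGenerator
{
    protected function top(array $data): string
    {
        // Plain method calls; a sealed subclass reroutes them through gen().
        return $this->greeting() . ', ' . $this->name($data[0]) . '!';
    }

    protected function greeting(): string { return 'Hello'; }
    protected function name(string $who): string { return ucfirst($who); }
}

// Hand-written stand-in for a dynamically created class: every protected
// grammar function is overridden to go through gen().
final class SealedGreeter extends Greeter
{
    protected function greeting(): string { return $this->gen('greeting', []); }
    protected function name(string $who): string { return $this->gen('name', [$who]); }

    protected function original(string $name, array $args): string
    {
        // parent:: bypasses the overrides above and reaches Greeter's code.
        return parent::$name(...$args);
    }
}

$gen = new SealedGreeter();
echo $gen->generate(['world']), "\n";    // Hello, World!
echo implode(' > ', $gen->trace), "\n";  // top > greeting > name
```

Because every call passes through one dispatch point, the base class gets a single place to do bookkeeping such as tracking semantic annotations, while the grammar code keeps using ordinary method calls.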

Whichever approach you use, if your instantiated subclass is $my_gen, then $my_gen->generate($input_data_as_an_array) returns the generated text. To access the semantic annotations, call $my_gen->semantics() afterward.

For different use cases, see the examples/ folder.

Most basic example

This example is taken from the examples/basic folder. Invoke it from the command line with php basic.php 0 0 0 0 (it produces Juan started working on Component ABC).

use NLGen\Generator;

class BasicGenerator extends Generator {

  var $agents = array('Juan','Pedro','The helpdesk operator');
  var $events = array('started','is','finished');
  var $actions = array('working on','coding','doing QA on');
  var $themes = array('Component ABC','Item 25','the delivery subsystem');

  protected function top($data){
    return
      $this->person($data[0]). " " .
      $this->action($data[1], $data[2]). " " .
      $this->item($data[3]);
  }

  protected function person($agt){ return $this->agents[$agt]; }
  protected function action($evt, $act){ return $this->events[$evt]." ".$this->actions[$act]; }
  protected function item($thm) { return $this->themes[$thm];  }
}

global $argv,$argc;
$gen = BasicGenerator::NewSealed();
print $gen->generate(array_splice($argv,1) /*,array("debug"=>1)*/)."\n";

Learning more about NLG

I highly recommend Building Natural Language Generation Systems (2000) by Reiter and Dale.

The SIGGEN site [2] has plenty of good resources. You might also want to look at the NLG portal at the Association for Computational Linguistics wiki [3].

Last but not least, you might be interested in the author's blog [4] and the class notes of his recent NLG course [5].

Integrations

Sponsorship

Work on NLGen is sponsored by Textualization Software Ltd.

License

This library is licensed under the MIT License - See the LICENSE file for details.

php-nlgen's People

Contributors

baraka24, dependabot[bot], drdub, marclaporte, renoirb

php-nlgen's Issues

Endless loop/memory limit issue when ranges are invalid

If you run NLGen\Grammars\Availability\AvailabilityGenerator::generateAvailability with a range whose start time is later than its end time, e.g. [[16, 0], [8, 0]], the generator enters an endless loop, eventually exhausts all available memory, and dies with a fatal error. This is a user error, but it would be much better for the generator to sanitize the input and return a textual error so the user can fix their ranges.
It might also be worth guarding against the endless loop regardless of the input sent to the generator.
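A caller-side check along the following lines would keep such ranges from reaching the generator. This is a sketch under stated assumptions, not part of NLGen: toMinutes and isValidRange are hypothetical helper names, hours run from 0 to 24 with 24:00 accepted only as an end-of-day marker (as in the [24, 0] used in the README example), and zero-length ranges are treated as invalid.

```php
<?php
// Hypothetical helpers, not part of NLGen.

/** Converts an [hour, minute] pair to minutes since midnight, or null if malformed. */
function toMinutes($time): ?int
{
    if (!is_array($time) || count($time) !== 2) {
        return null;
    }
    [$hour, $minute] = array_values($time);
    if (!is_int($hour) || !is_int($minute)) {
        return null;
    }
    // 24:00 is accepted as an end-of-day marker; 24:01 and beyond are not.
    $inRange = $hour >= 0 && $minute >= 0 && $minute <= 59
        && ($hour <= 23 || ($hour === 24 && $minute === 0));

    return $inRange ? $hour * 60 + $minute : null;
}

/** True when both times are well formed and the start is strictly before the end. */
function isValidRange($start, $end): bool
{
    $from = toMinutes($start);
    $to = toMinutes($end);

    return $from !== null && $to !== null && $from < $to;
}

$fullRanges = [
    0 => [[6, 0], [24, 0]],
    1 => [[16, 0], [8, 0]], // start after end: the case reported above
];

// Collect the days whose range should be rejected before generating text.
$badDays = [];
foreach ($fullRanges as $dow => [$start, $end]) {
    if (!isValidRange($start, $end)) {
        $badDays[] = $dow;
    }
}

echo $badDays
    ? 'Invalid ranges for days: ' . implode(', ', $badDays)
    : 'All ranges are valid', "\n";
```

The same isValidRange check can be applied to the start and end times of each busy-time entry.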

Date verbalization example

This example could implement functionality similar to JSReal and be shipped as part of the standard distribution.
