Code Monkey home page Code Monkey logo

text-diff's Introduction

TextDiff

JavaScript diff library with support for visual, HTML-formatted output

This repository contains the diff functionality of the google-diff-match-patch library by Neil Fraser, turned into a node module which is suitable for requireing into projects.

Example usage

var Diff = require('text-diff');

var diff = new Diff(); // options may be passed to constructor; see below
var textDiff = diff.main('text1', 'text2'); // produces diff array
diff.prettyHtml(textDiff); // produces a formatted HTML string

Initialization options

Arguments may be passed into the Diff constructor in the form of an object:

  • timeout: Number of seconds to map a diff before giving up (0 for infinity).
  • editCost: Cost of an empty edit operation in terms of edit characters.

Example initialization with arguments: var diff = new Diff({ timeout: 2, editCost: 6 });

Documentation

The API documentation below has been modified from the original API documentation.

Initialization

The first step is to create a new diff object (see example above). This object contains various properties which set the behaviour of the algorithms, as well as the following methods/functions:

main(text1, text2) => diffs

An array of differences is computed which describe the transformation of text1 into text2. Each difference is an array. The first element specifies if it is an insertion (1), a deletion (-1) or an equality (0). The second element specifies the affected text.

main("Good dog", "Bad dog") => [(-1, "Goo"), (1, "Ba"), (0, "d dog")]

Despite the large number of optimisations used in this function, diff can take a while to compute. The timeout setting is available to set how many seconds any diff's exploration phase may take (see "Initialization options" section above). The default value is 1.0. A value of 0 disables the timeout and lets diff run until completion. Should diff time out, the return value will still be a valid difference, though probably non-optimal.

cleanupSemantic(diffs) => null

A diff of two unrelated texts can be filled with coincidental matches. For example, the diff of "mouse" and "sofas" is [(-1, "m"), (1, "s"), (0, "o"), (-1, "u"), (1, "fa"), (0, "s"), (-1, "e")]. While this is the optimum diff, it is difficult for humans to understand. Semantic cleanup rewrites the diff, expanding it into a more intelligible format. The above example would become: [(-1, "mouse"), (1, "sofas")]. If a diff is to be human-readable, it should be passed to cleanupSemantic.

cleanupEfficiency(diffs) => null

This function is similar to cleanupSemantic, except that instead of optimising a diff to be human-readable, it optimises the diff to be efficient for machine processing. The results of both cleanup types are often the same.

The efficiency cleanup is based on the observation that a diff made up of large numbers of small diffs edits may take longer to process (in downstream applications) or take more capacity to store or transmit than a smaller number of larger diffs. The diff.EditCost property sets what the cost of handling a new edit is in terms of handling extra characters in an existing edit. The default value is 4, which means if expanding the length of a diff by three characters can eliminate one edit, then that optimisation will reduce the total costs.

levenshtein(diffs) => int

Given a diff, measure its Levenshtein distance in terms of the number of inserted, deleted or substituted characters. The minimum distance is 0 which means equality, the maximum distance is the length of the longer string.

prettyHtml(diffs) => html

Takes a diff array and returns a string of pretty HTML. Deletions are wrapped in <del></del> tags, and insertions are wrapped in <ins></ins> tags. Use CSS to apply styling to these tags.

Tests

Tests have not been ported to this library fork, however tests are available in the original library. If you would like to port tests over, you will need to do some function call renaming (viz. the diff_ prefix has been removed from functions in the fork) and remove tests specific to the "patch" and "match" functionalities of the original library.

text-diff's People

Contributors

liddiard avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar

text-diff's Issues

Typescript typings required

Can you please add typescript support by adding types to your library? It's a very useful library, and adding types would make it more awesome :)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.