seanshou / google-diff-match-patch
Automatically exported from code.google.com/p/google-diff-match-patch
What steps will reproduce the problem?
1. Look at the source code
What is the expected output? What do you see instead?
Given the datestamp in the filename, I would assume that the copyright
notice would be dated 2008 instead of 2006.
Note that the JavaScript file hasn't got any copyright notice at all.
What version of the product are you using? On what operating system?
diff_match_patch_20080426.zip
Please provide any additional information below.
This is obviously a purely cosmetic issue.
Original issue reported on code.google.com by [email protected]
on 29 Apr 2008 at 7:56
Please provide any additional information below.
First of all, thank you for your great work on this library.
I would like to be able to import the Python implementation as a module so
that I can keep a checkout of the python folder in our common directory
(which is on every system's Python path). If it functioned as a module,
then I wouldn't have to modify all Python paths and treat diff_match_patch
as an edge-case (it could just be checked out directly in the common, or
any folder on the path, and it will Just Work). This will also work for
anyone else who has a common directory already on their path and wants to
use diff_match_patch without modification of their path or creation of
symlinks.
Fix:
Create __init__.py in the python directory, with this inside:
from .diff_match_patch import diff_match_patch, patch_obj
As I understand it (via http://www.python.org/dev/peps/pep-0328/), this
relative import syntax should be compatible with 2.4 or greater. Thanks!
Original issue reported on code.google.com by [email protected]
on 28 Aug 2009 at 4:29
What steps will reproduce the problem?
1. download diff_match_patch.java
2. compile it
What is the expected output? What do you see instead?
expected: compiled
I see: no Diff class
Original issue reported on code.google.com by [email protected]
on 13 Jun 2007 at 3:34
Awesome piece of implementation.
I would love to create the C# version myself but being ignorant about the
algorithms I could make silly mistakes.
Would you consider creating a C# version?
Original issue reported on code.google.com by [email protected]
on 6 Feb 2008 at 4:49
What steps will reproduce the problem?
1. Download the attached sample files (two whitespace-simplified versions
of a generated source file in the Hadoop project).
2. Use the library to compute the diff. The diff timeout would have to be
increased significantly or set to 0.0f.
What is the expected output? What do you see instead?
GNU diff computes a diff with about 1260 edit steps in 0.125s on my
machine. diff-match-patch with the diff timeout removed fails to terminate
in both its C++ and Java versions, consuming all available system memory.
What version of the product are you using? On what operating system?
Latest svn trunk on GNU/Linux amd64.
Please provide any additional information below.
The attached files are just an extreme example; I have also found it
infeasible to compute a diff between two files of about 2000 lines each in
Java with a 2G heap. The timeout prevents all memory from being used, but
results in a trivial "delete A, insert B" diff, which is not useful.
Original issue reported on code.google.com by [email protected]
on 20 Jan 2010 at 12:03
Attachments:
I may be interested in helping.
Original issue reported on code.google.com by [email protected]
on 8 Apr 2008 at 8:06
[code]
# -*- coding: utf-8 -*-
from diff_match_patch import diff_match_patch
dmp = diff_match_patch()
str1 = """Привет!"""
str2 = """Привет and Welcome!"""
patches = dmp.patch_make(str1, str2)
#print dmp.patch_toText(patches)
print dmp.patch_apply(patches, str1)[0]
[\code]
$ python dmp.py
Traceback (most recent call last):
File "dmp.py", line 14, in <module>
print dmp.patch_apply(patches, str1)[0]
File "/data/Coding/Python/diff_match_patch.py", line 1401, in patch_apply
text = nullPadding + text + nullPadding
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 0:
ordinal not in range(128)
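The traceback is consistent with a Python 2 byte string being concatenated with a unicode string, which triggers an implicit ASCII decode. A minimal sketch of the usual remedy, written for Python 3 and assuming the input arrives as UTF-8 bytes: decode once at the boundary and work with text from then on. The helper name `to_text` and the `padding` value are hypothetical and are not taken from diff_match_patch.

```python
def to_text(value, encoding="utf-8"):
    """Return value as str, decoding it first if it is a bytes object."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

# Bytes as they might arrive from a file or socket (assumption for this sketch).
raw1 = "Привет!".encode("utf-8")
raw2 = "Привет and Welcome!".encode("utf-8")

text1 = to_text(raw1)
text2 = to_text(raw2)

# Both inputs are now str, so joining them with a str padding is safe.
padding = "\x01\x02\x03\x04"  # illustrative padding, not the library's value
padded = padding + text1 + padding
```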
Original issue reported on code.google.com by [email protected]
on 15 May 2008 at 6:44
What steps will reproduce the problem?
var dmp = new diff_match_patch();
var last = 'abcdefghij , h : 1 , t : 1 abcdefghij , h : 1 , t : 1 abcdefghij , h : 0 , t : 1';
var current = 'abcdefghij , h : 0 , t : 1 abcdefghij , h : 0 , t : 1 abcdefghij , h : 0 , t : 1';
var patches = dmp.patch_make(current, last);
var mod_current = 'abcdefghij , h : 0 , t : 1 abcdefghij , h : 1 , t : 1 abcdefghij , h : 0 , t : 1';
var res = dmp.patch_apply(patches, mod_current);
What is the expected output? What do you see instead?
Expect patch to succeed or fail. Actually it throws an error "Pattern too
long for this browser". Only affects JavaScript version (the bug is also
in the Python version but is never expressed).
Fathei Ali reported this error and I tracked it down to an indexing bug in
patch_splitMax. Most of the time this would have no effect, but very
occasionally it fails to split a long patch.
A new version has been pushed which corrects this bug and unit tests have
been added in all languages for verification.
Original issue reported on code.google.com by [email protected]
on 4 Dec 2008 at 5:50
Usually diff_cleanupSemantic does an excellent job :-).
However, in this case it appears to have a problem.
What steps will reproduce the problem?
In JavaScript (with default settings):
var a='\tS += "</table><pre style=\'display:none\'>";\n'+
'\tS += text.replace(/>/g, \'>\');\n'+
'\tS += "</pre></li></ul></div>\\n";';
var b='\n'+'\tt = lines.join(\'\\n\').replace(/>/g, \'>\');\n'+
'\tS += "</table><pre style=\'display:none\'>".concat(t, "</pre></li></ul></div>\\n");';
var DMP = new diff_match_patch;
var d=DMP.diff_main(a,b);
DMP.diff_cleanupSemantic(d);
for (var i in d) print('{',d[i][0],', "',d[i][1],'"}');
The output is as follows (which is basically the same as with no cleanup):
{ -1 , " S "}
{ 1 , "
t "}
{ 0 , " "}
{ -1 , " + "}
{ 0 , " = "}
{ -1 , " "</tab "}
{ 0 , " l "}
{ 1 , " in "}
{ 0 , " e "}
{ -1 , " ><pre "}
{ 0 , " s "}
{ -1 , " tyle='d "}
{ 1 , " .jo "}
{ 0 , " i "}
{ -1 , " splay: "}
{ 0 , " n "}
{ -1 , " o "}
{ 1 , " ('\ "}
{ 0 , " n "}
{ -1 , " e "}
{ 0 , " ' "}
{ -1 , " >";
S += text "}
{ 1 , " ) "}
{ 0 , " .replace(/>/g, '>');
S += "}
{ 1 , " "</table><pre style='display:none'>".concat(t, "}
{ 0 , " "</pre></li></ul></div>\n" "}
{ 1 , " ) "}
{ 0 , " ; "}
What version of the product are you using? On what operating system?
diff_match_patch_20080520.zip on Mac OS X 10.4.7
Please provide any additional information below.
Deleting the '\n' at the beginning of var b reduces the diff from 31 to 9
elements. As follows:
{ 0 , " "}
{ -1 , " S += "</table><pre style='display:none'>";
S += text.replace(/>/g, '>');
S += "}
{ 1 , " t = lines.join('\n').replace(/>/g, '>');
S += "</table><pre style='display:none'>".concat(t, "}
{ 0 , " "</pre></li></ul></div>\n" "}
{ 1 , " ) "}
{ 0 , " ; "}
Original issue reported on code.google.com by [email protected]
on 25 May 2008 at 12:42
Hello,
From what I can see, the library is very good.
I need one more piece of functionality in diff: reporting CHANGED lines.
Original issue reported on code.google.com by [email protected]
on 21 Oct 2008 at 10:27
Hello,
I like it and would like to use it in Flex. Could you write an ActionScript
implementation?
Original issue reported on code.google.com by [email protected]
on 11 Feb 2008 at 2:16
* What steps will reproduce the problem?
I have attached a demonstration of the issue. This creates a patch between
two strings. It then tries to apply the patch to the first string to get
the second string.
* What is the expected output? What do you see instead?
When applying the patch, an IllegalArgumentException is thrown. The patch
created has the following invalid header on the second chunk:
@@ --2,32 +9,36 @@
* What version of the product are you using? On what operating system?
Latest version (20090202) on OS/X, Java 6 - but also occurs on Linux.
* Please provide any additional information below.
None.
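For illustration, a header of this shape can be checked with a regular expression. The pattern below is an assumption modeled on the unified-diff style `@@ -start,length +start,length @@` header; it is not copied from the library source. It shows why the doubled minus sign in the reported header cannot be parsed.

```python
import re

# Assumed header shape: "@@ -start[,length] +start[,length] @@".
PATCH_HEADER = re.compile(r"^@@ -(\d+),?(\d*) \+(\d+),?(\d*) @@$")

def parse_header(line):
    """Return (start1, length1, start2, length2) as strings, or None if malformed."""
    m = PATCH_HEADER.match(line)
    return m.groups() if m else None

valid = parse_header("@@ -2,32 +9,36 @@")
invalid = parse_header("@@ --2,32 +9,36 @@")  # the header from this report
```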
Original issue reported on code.google.com by [email protected]
on 24 Mar 2009 at 6:58
Attachments:
What steps will reproduce the problem?
1.
2.
3.
What is the expected output? What do you see instead?
What version of the product are you using? On what operating system?
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 28 Jun 2007 at 11:53
What steps will reproduce the problem?
1. Use the library to diff HTML (with tags).
What is the expected output? What do you see instead?
A comparison like the one Knol shows.
What version of the product are you using? On what operating system?
diff_match_patch_20090202
Please provide any additional information below.
Any suggestions?
Original issue reported on code.google.com by [email protected]
on 5 Mar 2009 at 5:10
Compiling project generates a warning for line 221:
linearray = (ArrayList<String>) b[2];
results in:
Type safety: Unchecked cast from Object to ArrayList<String>
This can be worked around by ignoring warnings of course, so I'm not clear
on whether this is important or not (a relative noob at Java). My apologies
if this is normal behaviour.
What version of the product are you using? On what operating system?
Version 20080624
Built in Eclipse with java 1.6.0.10 on Ubuntu 8.10
Original issue reported on code.google.com by [email protected]
on 8 Nov 2008 at 6:23
Google Code does not currently send any notification to me when a new issue
is added to this list. Since there are virtually no issues with my code
(gloat), I don't visit this page often.
So if you don't want to be ignored, send me an email to let me know that
you've filed a new bug:
http://neil.fraser.name
Thanks!
Original issue reported on code.google.com by [email protected]
on 29 Jun 2007 at 1:32
This is a great piece of code. I have one concern about the comparison. Is it
possible to change the logic in such a way that it does a word-by-word
comparison?
I have written the following test code in Java:
========================================================================
String text1="Hello how are you. <br/> My name is Prathyusha. Shravanthi is my friend.";
String text2="Hello. <br/>My name is Shravanthi. Prathyusha was my friend.";
diff_match_patch diff = new diff_match_patch();
LinkedList<diff_match_patch.Diff> diffs = diff.diff_main(text1, text2);
diff.diff_cleanupSemantic(diffs);
String result = diff.diff_prettyHtml(diffs);
System.out.println(result);
=====================================================================
I see the following output
--------------------------------------------------------
Hello<DEL> how are you</DEL>. <br/><DEL> </DEL>My name is <DEL>Prathyusha.
Shravanthi i</DEL><INS>Shravanthi. Prathyusha wa</INS>s my friend.
--------------------------------------------------------
As the output shows, the logic doesn't break the differences into logical
words. Rather, it compares chunks of the string. A word-by-word
comparison would help in getting a precise count of the newly added words
and the deleted words. Additionally, if we check the deleted
<DEL>Prathyusha. Shravanthi i</DEL> text and the inserted <INS>Shravanthi.
Prathyusha wa</INS> text, the middle chars ' ' and '.' are common. They
should not be considered as deleted and inserted. This problem wouldn't
have arisen with a word-by-word comparison. The overall word count is
increased by 4, considering ' ' and '.' as 2 words.
Is it possible to do a word by word comparison?
Regards,
Pratap
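One way to approximate a word-by-word comparison is to split both texts into tokens and diff the token sequences instead of the characters. The sketch below uses Python's standard difflib rather than diff_match_patch, so it is a swapped-in technique and not the library's own method; the `tokenize` and `word_diff` names are hypothetical.

```python
import difflib
import re

def tokenize(text):
    """Split text into words, whitespace runs, and single punctuation marks."""
    return re.findall(r"\w+|\s+|[^\w\s]", text)

def word_diff(text1, text2):
    """Return (op, text) pairs with op in {-1, 0, 1}, computed on whole tokens."""
    a, b = tokenize(text1), tokenize(text2)
    result = []
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            result.append((0, "".join(a[i1:i2])))
        if tag in ("delete", "replace"):
            result.append((-1, "".join(a[i1:i2])))
        if tag in ("insert", "replace"):
            result.append((1, "".join(b[j1:j2])))
    return result

diffs = word_diff("My name is Prathyusha.", "My name is Shravanthi.")
```

Because every edit covers whole tokens, shared characters such as ' ' and '.' stay in the equal segments, and counting inserted or deleted words reduces to counting tokens.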
Original issue reported on code.google.com by [email protected]
on 4 Jul 2009 at 1:51
I want to use Cyrillic characters with diff_match_patch (Python version,
release), but I get errors like:
"UnicodeDecodeError: 'utf8' codec can't decode byte 0xd0 in position 0:
unexpected end of data"
Appending ".decode("utf-8").encode("utf-8")" to strings in some places
seems to solve the problem, but I guess not 100%.
See the attached patch (and, just in case, the new file).
Alexandr.
Original issue reported on code.google.com by [email protected]
on 11 May 2008 at 1:12
Attachments:
What steps will reproduce the problem?
1. Use the Java API: diff_main(String text1, String text2).
text1 = "this is a test";
text2 = "this is a test A";
2. LinkedList<Diff> diff = diff_main(text1, text2);
3. In the resulting "LinkedList<Diff> diff", I can't get a usable "index" from the Diff class.
What is the expected output? What do you see instead?
Expected output: index = 16
Actual output: index always equals -1
What version of the product are you using? On what operating system?
version:20071106, java
OS: windows XP
Please provide any additional information below.
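If the goal is to locate each change, positions can be derived by walking the diff list and accumulating lengths. A minimal sketch in Python, assuming the diff is a list of (op, text) pairs with -1 for delete, 0 for equal, and 1 for insert. The diff below is written by hand for the two texts in this report rather than produced by the library, and `diff_offsets` is a hypothetical helper.

```python
DIFF_DELETE, DIFF_EQUAL, DIFF_INSERT = -1, 0, 1

def diff_offsets(diffs):
    """Yield (op, text, pos1, pos2), where pos1/pos2 are offsets into text1/text2."""
    pos1 = pos2 = 0
    for op, text in diffs:
        yield op, text, pos1, pos2
        if op != DIFF_INSERT:   # equal and delete segments consume text1
            pos1 += len(text)
        if op != DIFF_DELETE:   # equal and insert segments consume text2
            pos2 += len(text)

# Hand-written diff for "this is a test" -> "this is a test A".
diffs = [(DIFF_EQUAL, "this is a test"), (DIFF_INSERT, " A")]
located = list(diff_offsets(diffs))
```

Here the insertion starts at offset 14 in both texts and ends at offset 16 in text2.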
Original issue reported on code.google.com by [email protected]
on 29 Dec 2007 at 2:55
While it *does* report differences, it tends to pick the largest diff that
it can (rather than more, smaller, diffs).
Choose a large (10MB or so) text file with lots of carriage returns -- an
XML file formatted for readability will do -- and permute it with the
following algorithm (this is Ruby code; $. is the current line in the
input, and gsub! alters the current string):
for x in STDIN
  if $. % 1000 == 0
    deleted = x
  elsif $. % 100 == 0
    puts deleted if deleted
  elsif $. % 10 == 0
    x.gsub!(/>.*</, ">XY<")
    puts x
  else
    puts x
  end
end
This deletes every 1000'th line, inserts the previously deleted line every
100'th line, and alters every 10'th line.
diff_match_patch reported the following changes:
Operation String size
EQUAL 339
DELETE 13934676
INSERT 13860601
EQUAL 92
So, basically, it reported that the whole file had changed. The same two
files run through GNU diff resulted in 36766 differences. A corresponding
patch made from diff_match_patch would have resulted in a patch file almost
as big as the original file (nearly 14MB); the patch file from GNU diff was
around 3MB, or 1/4 the size. This means that the algorithm used by
diff_match_patch produces extremely inefficient patches.
Incidentally, setting the "checklines" flag to false actually results in a
*faster*, not slower, diff, although the resulting reported differences
don't vary greatly.
This is with 20090501 with Java 1.6.0 on Ubuntu Intrepid.
Original issue reported on code.google.com by seanerussell
on 7 Jun 2009 at 12:21
Consider the left text :
AAA
BBB EEE
and the right text
AAA
BBB DDD
BBB EEE
To the human eye, it's obvious that the middle line was added.
However, gdmp (and other character-level diff algorithms I tried) see it
differently, spanning the added string ("DDD\nBBB") over two lines.
Of course it makes sense from a computational standpoint, but that should at
least be cleaned up by the cleanup functions, which should try to minimize
the number of lines.
What do you think?
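For comparison, a line-level diff of the same two texts reports a single inserted line. The sketch below uses Python's standard difflib on lists of lines; it is not how diff_match_patch works internally and only illustrates the result this report considers natural.

```python
import difflib

left = "AAA\nBBB EEE\n".splitlines(keepends=True)
right = "AAA\nBBB DDD\nBBB EEE\n".splitlines(keepends=True)

matcher = difflib.SequenceMatcher(None, left, right, autojunk=False)

# Keep only the non-equal operations, together with the lines they touch.
changes = [
    (tag, left[i1:i2], right[j1:j2])
    for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    if tag != "equal"
]
```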
Original issue reported on code.google.com by [email protected]
on 20 Jun 2008 at 9:35
diff_match_patch.py:430: SyntaxWarning: assertion is always true, perhaps
remove parentheses?
assert (text1[x] == text2[y],
diff_match_patch.py:475: SyntaxWarning: assertion is always true, perhaps
remove parentheses?
assert (text1[-x - 1] == text2[-y - 1],
diff_match_patch.py:1158: SyntaxWarning: assertion is always true, perhaps
remove parentheses?
assert (self.Match_MaxBits == 0 or len(pattern) <= self.Match_MaxBits,
SyntaxWarning: assertion is always true, perhaps remove parentheses?
assert (False, "Unknown call format to patch_make.")
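The warning arises because `assert (condition, message)` asserts a two-element tuple, and a non-empty tuple is always true, so the check can never fail. A short sketch of the difference; the function names are illustrative only.

```python
def parenthesized_form(condition, message):
    """Mimics `assert (condition, message)`: the tuple itself is what gets tested."""
    checked = (condition, message)  # non-empty tuple, always truthy
    assert checked
    return "passed"

def corrected_form(condition, message):
    """The intended form: condition and message separated by a bare comma."""
    assert condition, message
    return "passed"

# The parenthesized form passes even though the condition is False.
always_passes = parenthesized_form(False, "Unknown call format to patch_make.")

# The corrected form raises AssertionError carrying the message.
try:
    corrected_form(False, "Unknown call format to patch_make.")
    raised = False
except AssertionError as exc:
    raised = str(exc) == "Unknown call format to patch_make."
```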
Original issue reported on code.google.com by [email protected]
on 2 May 2009 at 12:54
Problem:
Unchecked cast from Object to ArrayList<String> in diff_match_patch.java on
line 229 in function diff_compute.
What is the expected output? What do you see instead?
No warnings should be produced
What version of the product are you using?
diff_match_patch_20090615.zip
Solution:
Create a new class diff_linesToChars_result with the following public
fields:
public String chars1;
public String chars2;
public List<String> lineArray;
See attachment
Original issue reported on code.google.com by [email protected]
on 17 Jul 2009 at 7:12
Attachments:
Why is the Qt library required for C++? That's a big chunk of
code just to use QList and QString (maybe there is more I'm not seeing).
Why not use the C++ Standard Library?
Original issue reported on code.google.com by bradleelandis
on 8 Sep 2009 at 8:42
When I compare the following two texts on the site,
"ZHEJIANG JIANGLONG TEXTILE PRINTING" and
"ZHEJIANG JIANGLIMEI KNITTING CLOTH",
with Match balance = 0.6 and Match threshold = 0.4,
I get two different results:
First, using the Match demo on the site (JavaScript), the result is
no match.
Second, using the Java code in my application, the result is
a match (found).
What is the difference?
Original issue reported on code.google.com by [email protected]
on 9 Sep 2008 at 12:12
What steps will reproduce the problem?
I attached a JS file which reproduces the problem.
What is the expected output? What do you see instead?
Error: Pattern too long for this browser.
What version of the product are you using? On what operating system?
javascript running on Rhino, java 1.5
Please provide any additional information below.
The patch causing the problem is created by patch_fromText() and a
subsequent reversal of the patch (changing DELETES into INSERTS, which is
why the first status field says "-0").
Original issue reported on code.google.com by [email protected]
on 7 Sep 2009 at 11:26
Attachments: