Comments (5)
Nice catch! It looks like that bug has been in the code since the initial Java
version in 2007, and was later faithfully ported to C++. This function has
comprehensive unit testing, but no amount of testing would have caught that.
While in that function I also noticed that we are rebuilding [Equality,
Insertion, Equality] diffs for no reason. So I've added a shortcut for
single-edits sandwiched between two equalities.
The patch has gone out for overnight review.
Original comment by [email protected]
on 5 Nov 2010 at 7:38
- Changed state: Started
- Added labels: Performance
from google-diff-match-patch.
The changes were committed to Subversion a few hours ago.
http://code.google.com/p/google-diff-match-patch/source/detail?r=72
Thanks for catching this.
There's another small optimization coming on Monday for C# and JavaScript, so I
won't bother updating the download package until then.
Original comment by [email protected]
on 6 Nov 2010 at 6:21
- Changed state: Fixed
from google-diff-match-patch.
Neil,
Unfortunately your change in r72 appears to cause test "diff_main: Overlap #1."
to fail.
It's the 'if (count_delete + count_insert > 1)' which is not equivalent to
the old 'if (count_delete != 0 || count_insert != 0)'.
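To make the difference between the two conditions concrete, here is a minimal sketch in Python. The function names are hypothetical and are not taken from diff-match-patch; only the two boolean expressions come from the comment above. It lists the counter combinations for which the old and new tests disagree.

```python
def old_condition(count_delete, count_insert):
    # The test quoted as being used before r72.
    return count_delete != 0 or count_insert != 0


def new_condition(count_delete, count_insert):
    # The test quoted as being introduced in r72.
    return count_delete + count_insert > 1


def differing_cases(limit=3):
    # Enumerate small counter values and keep those where the tests disagree.
    return [
        (d, i)
        for d in range(limit)
        for i in range(limit)
        if old_condition(d, i) != new_condition(d, i)
    ]


print(differing_cases())  # [(0, 1), (1, 0)]
```

The two tests disagree only when exactly one deletion or one insertion has been counted, which is the single-edit case the r72 shortcut skips.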
Original comment by [email protected]
on 6 Nov 2010 at 6:01
from google-diff-match-patch.
I cannot replicate the error you report. All unit tests pass in all languages:
diff_main: Overlap #1. OK
diff_main: Overlap #2. OK
diff_main: Overlap #3. OK
[...]
All tests passed.
Total time: 1297 ms
Done.
Could you double-check that the error is real?
Original comment by [email protected]
on 8 Nov 2010 at 8:36
from google-diff-match-patch.
Neil,
I'm terribly sorry. You are absolutely correct - r72 does NOT cause a test
failure.
Let me explain: I'd made a change that asserted that `text_delete` &
`text_insert` were already empty at the bottom of the `case` statement & so
there was no need for `text_delete = ""; text_insert = "";`. This was true
before r72. Unfortunately I neglected to test my debug build with that change
from r72 so the asserts weren't triggered.
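The interaction described above can be illustrated with a toy model in Python. This is not the library's merge code: the function `leftover_buffers`, the `reset_at_bottom` flag, and the assumption that the rebuild branch leaves both buffers empty are all hypothetical simplifications. It only shows how a buffer can stay non-empty when a single edit skips the rebuild and nothing resets it afterwards.

```python
INSERT, DELETE, EQUAL = 1, -1, 0


def leftover_buffers(diffs, reset_at_bottom):
    """Return (text_delete, text_insert) as seen at the bottom of each EQUAL case."""
    count_delete = count_insert = 0
    text_delete = text_insert = ""
    seen = []
    for op, text in diffs:
        if op == INSERT:
            count_insert += 1
            text_insert += text
        elif op == DELETE:
            count_delete += 1
            text_delete += text
        else:  # EQUAL
            if count_delete + count_insert > 1:
                # Assumed model of the rebuild branch: it consumes both buffers.
                text_delete = text_insert = ""
            # With a single edit the shortcut applies and the buffers are untouched.
            seen.append((text_delete, text_insert))
            count_delete = count_insert = 0
            if reset_at_bottom:
                text_delete = text_insert = ""
    return seen


diffs = [(EQUAL, "a"), (INSERT, "b"), (EQUAL, "c"), (DELETE, "d"), (EQUAL, "e")]
print(leftover_buffers(diffs, reset_at_bottom=True))
# [('', ''), ('', 'b'), ('d', '')]
print(leftover_buffers(diffs, reset_at_bottom=False))
# [('', ''), ('', 'b'), ('d', 'b')]
```

In this model, dropping the reset lets the stale insertion text "b" leak into the next group of edits, which is the kind of state an assert on empty buffers would flag.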
Why did I make that change & why did I not suspect it?
Well, it's because I don't have/can't use Qt & I've ported the C++
diff-match-patch to use the standard C++ library, so it is just one of many
changes. I hope to submit my C++ version (cpp-std) to you soon for possible
inclusion in diff-match-patch.
In fact I'm quite pleased that that change in r72 does work as it provides a
nice performance improvement.
I've spent quite some time optimising my C++ conversion & it is now up to 4
times as fast as the initial conversion.
And, of course it passes all the tests. I've also added a couple of switches
to the test program to provide some performance metrics.
Additionally, I've added another switch (currently not for submission) that
processes a diff/patch file to produce HTML. It's based on the JavaScript
version in svnX <svnx.googlecode.com> .
[It converts a 91,500 line diff (from a user) to 29.2MB of HTML in as little as
14 secs, even on my ageing machine.]
My conversion is pretty much complete; I just need to do some tidying up & test
it with one more compiler.
I'll send you more details when it's ready.
Thanks for a very useful resource.
Regards,
Chris
Original comment by [email protected]
on 9 Nov 2010 at 6:50
from google-diff-match-patch.