timsort's Introduction

timsort

timsort is a Go implementation of Tim Peters' mergesort sorting algorithm. It's stable and runs in O(n) time for presorted inputs and O(n log n) otherwise.
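The O(n) bound for presorted input comes from timsort's first pass, which scans for runs that are already ordered. Below is a minimal sketch of that scan, assuming a plain []int; countRun is a hypothetical name for illustration and is not this package's API.

```go
package main

import "fmt"

// countRun returns the length of the run starting at a[0].
// A run is either non-descending or strictly descending. A strictly
// descending run is reversed in place, so the prefix a[:n] always ends
// up ascending. (Hypothetical helper, not part of the timsort package.)
func countRun(a []int) int {
	if len(a) < 2 {
		return len(a)
	}
	n := 2
	if a[1] < a[0] {
		// Strictly descending: requiring "<" means equal elements are
		// never reversed, which keeps the overall sort stable.
		for n < len(a) && a[n] < a[n-1] {
			n++
		}
		for i, j := 0, n-1; i < j; i, j = i+1, j-1 {
			a[i], a[j] = a[j], a[i]
		}
	} else {
		// Non-descending: extend while each element is >= the previous.
		for n < len(a) && a[n] >= a[n-1] {
			n++
		}
	}
	return n
}

func main() {
	sorted := []int{1, 2, 3, 4, 5}
	fmt.Println(countRun(sorted)) // the whole slice is a single run

	reversed := []int{5, 4, 3, 2, 1}
	fmt.Println(countRun(reversed), reversed) // one run, reversed in place
}
```

When the whole input is one run, as in both cases above, there is nothing left to merge, so the work is a single linear scan.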

For many input types it is 2-3 times faster than Go's built-in sorting.

The main drawback of this sort method is that it is not in-place (like any mergesort) and may put extra strain on the garbage collector.

This implementation was ported to Go by Mike Kroutikov and derived from Java's TimSort object by Josh Bloch, which, in turn, was based on the original code by Tim Peters.

Installation

$ go get -u github.com/psilva261/timsort/v2

Testing

Inside the source directory, type

go test

to run the test harness.

Benchmarking

Inside the source directory, type

go test -test.bench=.*

to run benchmarks. Each combination of input type/size is presented to timsort, and, for comparison, to the standard Go sort (sort.Sort for ints or sort.Stable otherwise). See BENCHMARKS.md for more info and some benchmarking results.

Examples

As a drop-in replacement for sort.Sort

package main

import (
	"fmt"
	"sort"

	"github.com/psilva261/timsort/v2"
)

func main() {
	l := []string{"c", "a", "b"}
	// sort.StringSlice adapts []string to sort.Interface
	timsort.TimSort(sort.StringSlice(l))
	fmt.Printf("sorted array: %+v\n", l)
}

Explicit "less" function

package main

import (
	"github.com/psilva261/timsort/v2"
	"fmt"
)

type Record struct {
	ssn  int
	name string
}

func BySsn(a, b interface{}) bool {
	return a.(Record).ssn < b.(Record).ssn
}

func ByName(a, b interface{}) bool {
	return a.(Record).name < b.(Record).name
}

func main() {
	db := make([]interface{}, 3)
	db[0] = Record{123456789, "joe"}
	db[1] = Record{101765430, "sue"}
	db[2] = Record{345623452, "mary"}

	// sorts array by ssn (ascending)
	timsort.Sort(db, BySsn)
	fmt.Printf("sorted by ssn: %v\n", db)

	// now re-sort same array by name (ascending)
	timsort.Sort(db, ByName)
	fmt.Printf("sorted by name: %v\n", db)
}

timsort's People

Contributors

dsamarin, marz619, pgmmpk, psilva261


timsort's Issues

Broken algorithm?

After a quick review of the code, I've noticed a few suspects.

1) Binary search integer overflow

mid := (left + right) / 2

This fails for large values of left and right: if the sum overflows to a negative value, mid becomes negative and indexing with it fails with an out-of-bounds error. There are two ways to fix this.

mid := left + (right - left) / 2
mid := int(uint(left + right) >> 1) // Used in Go package sort.Search

See: Extra, Extra - Read All About It: Nearly All Binary Searches and Mergesorts are Broken
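To make the overflow concrete, here is a small sketch comparing the three midpoint formulas. The function names are made up for this example, and the index values sit near the top of the int range purely to trigger the wraparound.

```go
package main

import (
	"fmt"
	"math"
)

// midNaive is the formula from the report; left+right can wrap around.
func midNaive(left, right int) int { return (left + right) / 2 }

// midSafe never forms the sum, so it cannot overflow when left <= right.
func midSafe(left, right int) int { return left + (right-left)/2 }

// midUnsigned lets the sum wrap, then reinterprets it as unsigned before
// halving. This is valid for non-negative left and right.
func midUnsigned(left, right int) int { return int(uint(left+right) >> 1) }

func main() {
	left, right := math.MaxInt-2, math.MaxInt

	fmt.Println(midNaive(left, right) < 0)                 // true: the sum wrapped negative
	fmt.Println(midSafe(left, right) == math.MaxInt-1)     // true
	fmt.Println(midUnsigned(left, right) == math.MaxInt-1) // true
}
```

Signed integer overflow wraps silently in Go rather than panicking, so the naive version only fails later, when the negative midpoint is used as an index.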

2) Timsort invariant violation

I noticed that the implementation of mergeCollapse is similar to the one in an old version of CPython.

timsort/timsort.go

Lines 434 to 440 in 4537dc9

func (self *timSortHandler) mergeCollapse() (err error) {
	for self.stackSize > 1 {
		n := self.stackSize - 2
		if n > 0 && self.runLen[n-1] <= self.runLen[n]+self.runLen[n+1] {
			if self.runLen[n-1] < self.runLen[n+1] {
				n--
			}

See: Proving that Android’s, Java’s and Python’s sorting algorithm is broken (and showing how to fix it)
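The effect can be reproduced on run lengths alone, without sorting anything. The sketch below is a simulation and not this package's code: collapse, pushAll and invariantHolds are hypothetical names, a merge is modelled as adding two lengths, and the part of the loop not shown in the excerpt above is assumed to match the older Java/CPython versions. The patched branch adds the extra check, one run deeper in the stack, that the linked article proposes.

```go
package main

import "fmt"

// collapse simulates mergeCollapse on a stack of run lengths only.
// With patched=false it mirrors the older condition quoted above;
// with patched=true it also checks one run deeper in the stack.
func collapse(runs []int, patched bool) []int {
	for len(runs) > 1 {
		n := len(runs) - 2
		violates := n > 0 && runs[n-1] <= runs[n]+runs[n+1]
		if patched {
			violates = violates || (n > 1 && runs[n-2] <= runs[n-1]+runs[n])
		}
		if violates {
			if runs[n-1] < runs[n+1] {
				n--
			}
		} else if runs[n] > runs[n+1] {
			break // the loop assumes the invariant now holds
		}
		// "Merge" runs n and n+1 by summing their lengths.
		runs[n] += runs[n+1]
		runs = append(runs[:n+1], runs[n+2:]...)
	}
	return runs
}

// pushAll pushes each run length and collapses after every push,
// as the main sort loop would.
func pushAll(lengths []int, patched bool) []int {
	var stack []int
	for _, l := range lengths {
		stack = append(stack, l)
		stack = collapse(stack, patched)
	}
	return stack
}

// invariantHolds checks runs[i] > runs[i+1] and
// runs[i] > runs[i+1]+runs[i+2] for every valid i.
func invariantHolds(runs []int) bool {
	for i := 0; i+1 < len(runs); i++ {
		if runs[i] <= runs[i+1] {
			return false
		}
		if i+2 < len(runs) && runs[i] <= runs[i+1]+runs[i+2] {
			return false
		}
	}
	return true
}

func main() {
	lengths := []int{120, 80, 25, 20, 30}

	old := pushAll(lengths, false)
	fmt.Println(old, invariantHolds(old)) // [120 80 45 30] false

	fixed := pushAll(lengths, true)
	fmt.Println(fixed, invariantHolds(fixed)) // [275] true
}
```

With the older condition the stack ends at 120, 80, 45, 30, where 120 is not greater than 80 + 45. The loop only inspects the top three entries, so it never notices. The run stack is sized on the assumption that the invariant always holds, which is why this matters for very large inputs.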

Missing return statement

Hi,

First of all, thank you very much for this work! While porting timsort to Zig, I noticed that v2/timsortint.go is missing a return statement after line 110. I am not that familiar with the codebase, but it could be missing elsewhere too.
Not a big deal, as it only affects inputs below minMerge, but it should be fixed anyway.

Thanks,

why comparing with quicksort?

It seems that the benchmark compares against Go's implementation of quicksort (the standard sort in Go):

sort.Sort(&v)

Given that timsort is a stable sort, wouldn't it be fairer to compare with sort.Stable() instead?

I got much worse results with that:

RevSorted100: 1500          (7806)        1582          (7759)        1469          (7773)        
Xor100:       4838          (5802)        4828          (5845)        4819          (5834)        
Random100:    5244          (7891)        5250          (7887)        5242          (7911)        

Sorted1K:     5597          (4837)        5584          (4838)        5633          (4846)        
RevSorted1K:  6588          (87094)       6523          (85785)       6562          (88508)       
Xor1K:        45714         (114406)      45331         (113812)      46041         (114143)      
Random1K:     114125        (202941)      132139        (203927)      114355        (202147)      

Sorted1M:     4129627       (8783352)     4129406       (8771254)     4148070       (8678536)     
RevSorted1M:  6239082       (114131971)   6189573       (115857436)   6199218       (114664717)   
Xor1M:        99852677      (224848697)   98523482      (228193397)   100662027     (222984886)   
Random1M:     362650134     (665379425)   348801121     (669130321)   340476927     (670888160)
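The reason sort.Stable is the like-for-like baseline is that, like timsort, it guarantees elements with equal keys keep their input order, which sort.Sort does not promise. Here is a small sketch using only the standard library; the entry and byKey types and the stableSorted helper are made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// entry pairs a sort key with a tag recording the original position.
type entry struct {
	key int
	tag string
}

// byKey orders entries by key only, so equal keys compare as ties.
type byKey []entry

func (s byKey) Len() int           { return len(s) }
func (s byKey) Less(i, j int) bool { return s[i].key < s[j].key }
func (s byKey) Swap(i, j int)      { s[i], s[j] = s[j], s[i] }

// stableSorted returns a sorted copy, leaving the input untouched.
func stableSorted(in []entry) []entry {
	out := append([]entry(nil), in...)
	sort.Stable(byKey(out))
	return out
}

func main() {
	in := []entry{{2, "a"}, {1, "b"}, {2, "c"}, {1, "d"}}
	fmt.Println(stableSorted(in)) // [{1 b} {1 d} {2 a} {2 c}]
}
```

Within each key the tags stay in input order (b before d, a before c). A benchmark against sort.Sort measures an algorithm that is free to skip that guarantee.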
