kth-competitive-programming / kactl
KTH Algorithm Competition Template Library (... or KTH's AC-producing cheat sheet)
Requiring each to be <2^31 can be annoying. Maybe CRT for many small numbers simultaneously would help?
I don't really understand why we have mod_mul. If we're not supporting 32-bit (i.e. CF), can't we just use __int128_t?
Alternately, we can use the equivalent of this: https://github.com/RamchandraApte/OmniTemplate/blob/master/modulo.h#L23
The current KACTL one is slower (by 5-20x), has that annoying bit thing, and is longer.
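For concreteness, a minimal sketch of the __int128_t approach (assuming a 64-bit target where the compiler provides 128-bit integers; the name mod_mul just mirrors the discussion above):

```cpp
#include <cstdint>
typedef uint64_t ull;

// Modular multiplication via a 128-bit intermediate. The unsigned 128-bit
// product of two 64-bit values never overflows, so a single % suffices.
ull mod_mul(ull a, ull b, ull M) {
	return (ull)((unsigned __int128)a * b % M);
}
```

This is only a sketch; it trades the current bit-twiddling for a hard dependency on __int128 support.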
Describe other approaches than hill-climbing, in case that ever falls down (e.g. if it's too slow, search space is narrow in diagonal direction, or there are too many dimensions).
tinyKACTL tersely describes the Conjugate Gradient Method, which sounds like it might be relevant:
The search direction in step i is $d_i = g_i + \beta_i d_{i-1}$, where $g_i$ is the gradient in step $i$ and $\beta_i = |g_i|^2 / |g_{i-1}|^2$ ($d_1 = g_1$).
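As a sanity check of those formulas, here is a self-contained sketch (function names are my own) applying the recurrence to a quadratic f(x) = x^T A x / 2 - b^T x with exact line search, where the gradient updates simplify to the classic linear CG iteration:

```cpp
#include <vector>
#include <cmath>
using namespace std;
typedef vector<double> vd;

vd matvec(const vector<vd>& A, const vd& x) {
	vd y(A.size(), 0);
	for (size_t i = 0; i < A.size(); i++)
		for (size_t j = 0; j < x.size(); j++)
			y[i] += A[i][j] * x[j];
	return y;
}

double dot(const vd& a, const vd& b) {
	double s = 0;
	for (size_t i = 0; i < a.size(); i++) s += a[i] * b[i];
	return s;
}

// Conjugate gradient on f(x) = x^T A x / 2 - b^T x for SPD A, i.e. solving
// A x = b. Uses d_i = g_i + beta_i d_{i-1}, beta_i = |g_i|^2 / |g_{i-1}|^2,
// where g is the negative gradient (the residual b - A x).
vd conjugateGradient(const vector<vd>& A, const vd& b) {
	int n = (int)b.size();
	vd x(n, 0), g = b, d = g;
	double gg = dot(g, g);
	for (int it = 0; it < n && gg > 1e-18; it++) {
		vd Ad = matvec(A, d);
		double alpha = gg / dot(d, Ad); // exact line search along d
		for (int i = 0; i < n; i++) {
			x[i] += alpha * d[i];
			g[i] -= alpha * Ad[i];
		}
		double ngg = dot(g, g), beta = ngg / gg;
		for (int i = 0; i < n; i++) d[i] = g[i] + beta * d[i];
		gg = ngg;
	}
	return x;
}
```

For general (non-quadratic) objectives the line search would have to be numeric, but the direction update is the same.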
https://github.com/TimonKnigge/TCR/tree/master/snippets/graphs seems to have a very short one. Also http://codeforces.com/blog/entry/22072.
The following from e-maxx might work (a bit long, so doesn't fit in KACTL at the moment, and also it seems slower):
struct SuffixTree {
	// Relies on KACTL's macros/typedefs: sz(x), trav(a, x), pii.
	enum { MAXN = 100 };
	string s;
	int n, sz = 1;
	struct Node {
		int l, r, par, link;
		map<char, int> next;
		Node(int l = 0, int r = 0, int par = -1)
			: l(l), r(r), par(par), link(-1) {}
		int len() { return r - l; }
		int& get(char c) {
			if (!next.count(c)) next[c] = -1;
			return next[c];
		}
	} t[MAXN * 2 + 5];
	struct State { int v, pos; } ptr{0, 0};

	State go(State st, int l, int r) {
		while (l < r)
			if (st.pos == t[st.v].len()) {
				st = {t[st.v].get(s[l]), 0};
				if (st.v == -1) return st;
			} else {
				if (s[t[st.v].l + st.pos] != s[l])
					return {-1, -1};
				if (r - l < t[st.v].len() - st.pos)
					return {st.v, st.pos + r - l};
				l += t[st.v].len() - st.pos;
				st.pos = t[st.v].len();
			}
		return st;
	}

	int split(State st) {
		if (st.pos == t[st.v].len()) return st.v;
		if (st.pos == 0) return t[st.v].par;
		Node v = t[st.v];
		int id = sz++;
		t[id] = Node(v.l, v.l + st.pos, v.par);
		t[v.par].get(s[v.l]) = id;
		t[id].get(s[v.l + st.pos]) = st.v;
		t[st.v].par = id;
		t[st.v].l += st.pos;
		return id;
	}

	int link(int v) {
		if (t[v].link != -1) return t[v].link;
		if (t[v].par == -1) return 0;
		int to = link(t[v].par);
		return t[v].link = split(go({to, t[to].len()}, t[v].l + !t[v].par, t[v].r));
	}

	void extend(int pos) {
		for (;;) {
			State nptr = go(ptr, pos, pos + 1);
			if (nptr.v != -1) {
				ptr = nptr;
				return;
			}
			int mid = split(ptr), leaf = sz++;
			t[leaf] = Node(pos, n, mid);
			t[mid].get(s[pos]) = leaf;
			ptr.v = link(mid);
			ptr.pos = t[ptr.v].len();
			if (!mid) break;
		}
	}

	SuffixTree(int n) : n(n) {}
	SuffixTree(string s) : n(sz(s)) { trav(c, s) append(c); }
	void append(char c) { s += c; extend(sz(s) - 1); }

	// example: find longest common substring
	pii best;
	int lcs(Node& node, int i1, int i2, int olen) {
		if (node.l <= i1 && i1 < node.r) return 1;
		if (node.l <= i2 && i2 < node.r) return 2;
		int mask = 0, len = olen + (node.r - node.l);
		trav(pa, node.next)
			mask |= lcs(t[pa.second], i1, i2, len);
		if (mask == 3)
			best = max(best, {len, node.r - len});
		return mask;
	}
	static pii LCS(string s, string t) {
		SuffixTree st(s + '\1' + t + '\2');
		st.lcs(st.t[0], sz(s), sz(s) + 1 + sz(t), 0);
		return st.best;
	}
};
Perhaps we should just try to provide better documentation about how to use it.
Would have been nice during GCJ (Dice Straight) not to have to reinvent incremental matching from scratch (where nodes can be added/removed from the LHS). And incremental maxflow comes up all the time, and we don't really have a good answer for it in KACTL.
If every vertex has degree <= d, a (d+1)-edge-coloring can be found in O(NM) time using https://en.wikipedia.org/wiki/Misra_%26_Gries_edge_coloring_algorithm.
https://github.com/ludopulles/tcr/blob/master/code/graphs/MisraGries.cpp
$\sum_{k=0}^n \binom{k}{m} = \binom{n+1}{m+1}$ (possibly)
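For reference, this is the hockey-stick identity; it follows by telescoping Pascal's rule $\binom{n+1}{m+1} = \binom{n}{m} + \binom{n}{m+1}$:

```latex
\binom{n+1}{m+1}
  = \binom{n}{m} + \binom{n}{m+1}
  = \binom{n}{m} + \binom{n-1}{m} + \binom{n-1}{m+1}
  = \dots
  = \sum_{k=0}^{n} \binom{k}{m} + \binom{0}{m+1}
  = \sum_{k=0}^{n} \binom{k}{m},
```

since $\binom{0}{m+1} = 0$ for $m \ge 0$.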
I de-bloated the Combinatorial chapter a bit by removing some of the more pointless number sequences (could probably remove even more). But maybe a large collection of sequences for pattern matching is useful if it comes in the form of short one-liners? Among things I just threw out:
https://codeforces.com/blog/entry/56773
https://vlecomte.github.io/cp-geo.pdf
My main goal in doing this would be the following:
For the sake of consistency (and because mostly there's not much of a difference...) I will be using the current KACTL Point class for everything.
lineSqDist (so that it was exact for integers)
Potential new implementations that KACTL doesn't have (may move to separate issue):
O(log n) time
O(log n) time
We have that for the standard 1D case, but we should perhaps include something about a higher number of dimensions. From previous notes:
from what I could tell, multivariate polynomial interpolation pretty much requires solveLinear, e.g. since not all sets of interpolation points work.
However, empirically (deg+1)^vars grids work ("are unisolvent"), even if randomly pruned in size to the number of included monomials.
Also, maybe include Neville's algorithm?
def neville(x, y, t):  # evaluate f at point t, given that f(x[i]) = y[i]
    n = len(x)
    for j in range(1, n):
        for i in range(n - j):
            y[i] = ((t - x[i+j]) * y[i] - (t - x[i]) * y[i+1]) / (x[i] - x[i+j])
    return y[0]
Not all that useful without custom standard library, and when you can debug on paper, but...
clang++ -std=c++11 -g -fsanitize=memory -fsanitize-memory-track-origins
#include <sanitizer/msan_interface.h>
int read() { int x; cin >> x; __msan_unpoison(&x, sizeof(x)); return x; }
or
__attribute__((no_sanitize("memory")))
int read() { int x; cin >> x; return x; }
Maybe this will mature at some point, or maybe it will at least be usable outside of competitions.
It's a one-liner, and avoids having to write a DFS from scratch:
bool is_cut(int u, int v) {
	return dist[u] >= n && dist[v] < n;
}
Haven't checked if this works with our implementation or if we do something terrible to dist values.
There are currently 4 flow/matching implementations that (more or less) do similar things: HopcroftKarp, EdmondsKarp, PushRelabel, and DFSMatching. There's also a 5th commonly used algorithm (that KACTL doesn't include): Dinic's, although there is an issue for it (#19).
I've done a lot of benchmarking of flow algorithms, so let's break down these implementations.
DFSMatching: runs in O(VE) time. Very short. Honestly, I think this is pretty useless nowadays. Maybe it used to be useful in the past, but nowadays Hopcroft-Karp is essentially expected for most matching problems.
HopcroftKarp: only works on bipartite graphs, runs in O(E sqrt(V)) time. Typically runs very fast (my guess is about 3x faster than Dinic's) on those bipartite graphs. Decently long.
EdmondsKarp: runs in O(VE^2) time. Strictly inferior to Dinic's in runtime. Fairly short.
Dinic's: runs in O(V^2 E) time, O(VE log U) with scaling. Has some special guarantees: O(E sqrt(V)) on bipartite graphs, and O(E * min(sqrt(E), V^(2/3))) on unit-capacity graphs.
HLPP: runs in O(V^2 sqrt(E)) time. Runs the fastest in practice on most graphs. The current KACTL version is missing the global relabeling optimization, which makes a huge difference (~2x on the Chinese flow problem, and allows for 0.15 s on SPOJ MATCHING).
If we look only at complexities, we see that a combination of Dinic's + HLPP dominates everything else. Provable guarantees are nice, but flow problems are also notoriously difficult to come up with worst cases for. I have an HLPP implementation that beats out my Dinic's on all the test cases I've tried it on (and also holds, or has tied for, the #1 spot on SPOJ FASTFLOW, VNSPOJ FASTFLOW, and the Chinese flow problem). It also performs about equal to KACTL's Hopcroft-Karp on MATCHING. So it seems that, in practice, that HLPP implementation would probably perform the best.
Thus, I propose that we do a couple things:
This leaves us with 2 implementations: HLPP to be insanely fast in practice, and Dinic's to have good guarantees on certain kinds of graphs.
See https://codeforces.com/blog/entry/66006?#comment-500365 for some discussion about HLPP optimizations.
I had to derive the following during a contest, which took a bit of time...
Angle diff(Angle a, Angle b) {
	int tu = b.t - a.t;
	a.t = b.t = 0;
	return Angle(a.x * b.x + a.y * b.y, a.x * b.y - a.y * b.x, tu - (b < a));
}
https://en.wikipedia.org/wiki/Revised_simplex_method ? Seems kind of complicated... Annoying details according to Wikipedia include a presolve step for getting rid of redundant constraints (which can take O(mn^2) time!) and needing periodic refactorizations.
https://github.com/pakwah/Revised-Simplex-Method/blob/master/RevisedSimplex/main.cpp
https://github.com/YimingYAN/cppipm/blob/master/src/cppipm.cpp
are two not overly long implementations.
Could also maybe do something with the Big M method for when a viable solution is known in advance - it costs some numerical stability but won a factor 2 in one case. E.g.:
T solveWithGuess(vvd& A, vd& b, vd& c, vd& x) {
	int n = sz(c), m = sz(A);
	rep(i,0,m) {
		vd& row = A[i];
		row.push_back(0);
		T &la = row[n], &co = b[i];
		rep(j,0,n)
			la -= row[j] * x[j], co -= row[j] * x[j];
	}
	const T M = 1e7;
	c.push_back(M);
	vd x2(n+1, 0);
	x2[n] = 1;
	A.push_back(x2);
	b.push_back(1);
	T res = LPSolver(A, b, c).solve(x2);
	if (abs(res) == inf) return res;
	assert(abs(x2[n] - 1) < eps);
	x = x2;
	x.pop_back();
	return res - M;
}
https://github.com/dacin21/dacin21_codebook/blob/master/numerical/simplex_lp.cpp claims a 5x win via "steepest edge pricing", should be investigated.
Division, modulo and multipoint evaluation seem to be pretty standard nowadays. See e.g. https://github.com/ecnerwala/icpc-book/blob/master/content/numerical/fft.cpp
Think "2D mountain range". Johan said he wanted to have this.
Ours isn't that bad, but something based on bit reversal ought to be faster.
E.g. https://github.com/jaehyunp/stanfordacm/blob/master/code/FFT.cc +
// https://arxiv.org/pdf/1708.01873.pdf
void BitReverse(int n, int L, vector<T>& vec) { // n = 2^L
	int rev = 0;
	rep(i,0,n) {
		if (i < rev) swap(vec[i], vec[rev]);
		int tail = i ^ (i + 1), shift = 32 - __builtin_clz(tail);
		rev ^= tail << (L - shift);
	}
}
(although Stanford's code isn't numerically stable; it should really use precomputed roots of unity)
Also, worth mentioning that multiplication of complex<double>
is slow, and maybe do something about denormals?
In fact, since it appears so seldom maybe one could just replace it with a short text description.
Not worth having in the pdf, but it comes up from time to time when trying to optimize stuff.
https://gist.github.com/simonlindholm/2216d26945ca5a9579cb9e2142f7d2a6
https://github.com/ejmahler/strength_reduce/blob/master/src/lib.rs (slightly faster I think?)
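For concreteness, the core trick in both links is Barrett-style strength reduction: once the divisor is fixed at runtime, x % d becomes a multiply plus a shift. A sketch assuming unsigned __int128 is available (FastMod/reduce are illustrative names, not taken from either link):

```cpp
#include <cstdint>

// Barrett-style reduction: precompute m ~= 2^64 / b once, then
// reduce(a) computes a % b with a multiply-high instead of a division.
struct FastMod {
	uint64_t b, m;
	FastMod(uint64_t b) : b(b), m(~0ULL / b) {}
	uint64_t reduce(uint64_t a) {
		uint64_t q = (uint64_t)(((unsigned __int128)m * a) >> 64); // ~ a / b
		uint64_t r = a - q * b; // in [0, 2b), one conditional subtract fixes it
		return r >= b ? r - b : r;
	}
};
```

The win comes from amortizing the expensive division into the constructor; inner loops then only multiply.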
If you forget that a variable is a H, you'll get hard-to-debug errors. Remove H and use K everywhere? Or add an operator unsigned long long?
Comes up fairly frequently. Even the code which scales up one of the lines and calls lineSegment - lineSegment intersection would be useful. (edit: but computing line-line intersection and checking sideOf seems more sensible?)
We might at the same time want to fix the quirk where there is a bad Line with p = -inf at the start, to make it more modifiable.
It seems weird that troubleshoot.txt is in English, but techniques.txt is not.
Also, this would make techniques.txt useful for non-Swedish participants.
Not sure how useful it is, but it seems like the code is rather short, and it would be hard to derive by hand.
O(1) LCA is so long to type in; it would be nice to have a ready-made one using power-of-two jumps. Should just be a couple of lines.
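A possible shape for the power-of-two-jumps (binary lifting) version — a sketch, assuming the caller supplies parent and depth arrays for a rooted tree with par[root] = root; the struct and member names are my own:

```cpp
#include <vector>
using namespace std;

// <O(N log N), O(log N)> LCA via binary lifting.
// up[j][v] = the 2^j-th ancestor of v.
struct LCA {
	int LOG;
	vector<vector<int>> up;
	vector<int> depth;
	LCA(const vector<int>& par, const vector<int>& dep) : depth(dep) {
		int n = (int)par.size();
		LOG = 1;
		while ((1 << LOG) < n) LOG++;
		up.assign(LOG + 1, vector<int>(n));
		up[0] = par;
		for (int j = 1; j <= LOG; j++)
			for (int v = 0; v < n; v++)
				up[j][v] = up[j-1][up[j-1][v]];
	}
	int jump(int v, int k) { // k-th ancestor of v
		for (int j = 0; k; j++, k >>= 1)
			if (k & 1) v = up[j][v];
		return v;
	}
	int lca(int a, int b) {
		if (depth[a] < depth[b]) swap(a, b);
		a = jump(a, depth[a] - depth[b]);
		if (a == b) return a;
		for (int j = LOG; j >= 0; j--)
			if (up[j][a] != up[j][b]) { a = up[j][a]; b = up[j][b]; }
		return up[0][a];
	}
};
```

Queries are O(log N) rather than O(1), but the total code is far shorter than the Euler-tour + sparse-table construction.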
We should never use both i and j, for instance - it's easy to slip, and impossible to spot afterwards. Different number of letters is good.
In the geometry chapter, especially.
Main goals:
contents
typedef Point P mess conflicting with each other.
Don't work on my version of clang, however. :(
b - a is smaller than 1/k times the original interval size.
Extended to hashes, with complexity O(log^2 N)? Or is this made completely redundant by the suffix tree?
Replaced/duplicate implementations
Different Algorithms for Existing Implementations
Math/description additions:
Completely New Additions (roughly ordered from what I've seen of other notebooks)
???
Misc:
Crossed out means that we decided to not include it.
I.e. don't include code, but give a hint about what to do if they, against all probability, do appear. tinyKACTL does this for Suurballe's algorithm, like this:
Shortest tour from A to B and back to A again, not using any edge twice, in an undirected graph: convert the graph to a directed graph. Take the shortest path from A to B. Remove the edges used from A to B, and negate the lengths of the reverse edges. Take the shortest path again from A to B, using an algorithm that can handle negative-weight edges, such as Bellman-Ford. Note that there are no negative-weight cycles. The shortest tour has the length of the two shortest paths combined.
and Directed Minimum Spanning Tree:
Chu-Liu/Edmonds Algorithm:
- Discard the arcs entering the root if any; For each node other than the root, select the entering arc with the smallest cost; Let the selected n − 1 arcs be the set S.
- If no cycle formed, G(N, S) is a MST. Otherwise, continue.
- For each cycle formed, contract the nodes in the cycle into a pseudo-node (k), and modify the cost of each arc which enters a node (j) in the cycle from some node (i) outside the cycle according to the following equation:
c(i, k) = c(i, j) − (c(x(j), j) − min_j c(x(j), j)), where c(x(j), j) is the cost of the arc in the cycle which enters j.
- For each pseudo-node, select the entering arc which has the smallest modified cost; replace the arc which enters the same real node in S by the newly selected arc.
- Go to step 2 with the contracted graph.
But we could make that more compact.
E.g. Newton-Raphson, but preferably something that works always and finds all roots.
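A minimal Newton-Raphson sketch (caller supplies f, its derivative, and a starting point close enough to converge; finding all roots robustly would need more, e.g. deflation or a bisection fallback):

```cpp
#include <cmath>

// Newton-Raphson for a single real root: x <- x - f(x) / f'(x).
// Converges quadratically near a simple root; no global guarantees.
template<class F, class DF>
double newton(F f, DF df, double x) {
	for (int it = 0; it < 100; it++) {
		double fx = f(x), d = df(x);
		if (fabs(fx) < 1e-12 || d == 0) break;
		x -= fx / d;
	}
	return x;
}
```

E.g. newton with f(x) = x^2 - 2 and x0 = 1 converges to sqrt(2) in a handful of iterations.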
Could replace Manacher, maybe? (especially since hashing also exists)
https://medium.com/@alessiopiergiacomi/eertree-or-palindromic-tree-82453e75025b
https://github.com/ngthanhtrung23/ACM_Notebook_new/blob/master/String/eertree.cpp
https://github.com/TimonKnigge/TCR/blob/master/snippets/flowalgorithms/dinic_ragnar.cpp is short, and might be worth including in case push-relabel falls down on some case.
See #63 (comment)
I don't think that hashing sections is worth it. MIT does hashing in 8 snippets: LCT, LinearRecurrence, Simplex.h, Polynomial, CycleCounting, GraphDominator, and both suffix arrays.
I would split that into
"Should be split into different sections": Polynomial, CycleCounting
"trying to avoid hashing the typedef": LinearRecurrence
"Has parts that you don't always want": Both suffix arrays (ie: don't always need LCP)
"Not sure": LCT, Simplex, and GraphDominator (I don't know enough about the algorithms to understand whether you pretty much always want all functions)
That's a maximum of 4 snippets where it might be advantageous to have section-wise hashing.
The other argument for hashing sections is that if the hash fails, then you need to look at less of your code. I haven't done many offline contests with a TCR, but from my experience, knowing that you have a mistype in 50 lines of code is only marginally better than knowing you have a mistype in 100 lines of code. Both of these are massively better than not knowing whether you have a mistype or a logic error.
If we were to hash by section, I would propose having some kind of lightweight syntax (like a //<--) to demarcate sections, and then putting the hashes (truncated to 5 characters) in the header.
Another question with hashing is how we deal with things like typedefs, especially if they're typedefs that are likely to be typed multiple times (for example, typedef vector<ll> Poly). I think it's not too big of a deal; I would suggest to just get used to typing them in for the purpose of hashing.
My biggest problem with avoiding them automatically is ambiguity with what hashes represent. "We hash everything that's printed" is obvious. "We hash everything after the typedefs" is less obvious.
I suspect that the current Pollard Rho implementation is a lot longer than it needs to be. I've seen some significantly shorter implementations elsewhere.
For example, take this snippet from my coach's notebook: https://github.com/FTRobbin/Dreadnought-Standard-Code-Library/blob/11c861e55a73be9c32a799bfe4398e0c62c2da15/src/%E5%90%AF%E5%8F%91%E5%BC%8F%E5%88%86%E8%A7%A3.cpp
Different API of course, and haven't compared performance. But it may be worth looking at.
Shouldn't need to pad to a power of two, and it's possible to make it non-recursive (see e.g. https://github.com/bicsi/kactl/blob/master/content/data-structures/SegmentTree.h ).
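For illustration, the standard bottom-up formulation (as in the linked bicsi version) needs no power-of-two padding at all — a sketch for sum queries; the code stays this short for any commutative operation:

```cpp
#include <vector>
using namespace std;

// Non-recursive segment tree over n values, half-open queries [l, r).
// Leaves live at s[n..2n); internal node p covers s[2p] and s[2p+1].
struct Tree {
	int n;
	vector<long long> s;
	Tree(int n) : n(n), s(2 * n) {}
	void update(int pos, long long val) { // set a[pos] = val
		for (s[pos += n] = val; pos /= 2; )
			s[pos] = s[2 * pos] + s[2 * pos + 1];
	}
	long long query(int l, int r) { // sum over [l, r)
		long long res = 0;
		for (l += n, r += n; l < r; l /= 2, r /= 2) {
			if (l & 1) res += s[l++];
			if (r & 1) res += s[--r];
		}
		return res;
	}
};
```

Non-commutative operations need a slightly more careful query that accumulates left and right parts separately.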
Tested on kattis:shortestpath3, but I'm having nagging doubts about overflows...
Can point A see point B, or is the way blocked by the polygon? Sight lines along edges are allowed; other than that, it's simple.