Given a root directory in a Unix file system, traversed the directory tree and (i) extracted the words from each text file, (ii) computed the document frequency (DF) of each word, i.e. the number of documents (files) in which the word occurs, and (iii) determined the K words with the highest DF.
Implemented the parallel algorithm using OpenMP in C++.
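The three steps can be illustrated with a minimal sketch. It is not the original implementation: it assumes the file contents have already been read into memory as strings (the paths could be collected beforehand, for example with `std::filesystem::recursive_directory_iterator`), it treats a word as a maximal run of ASCII letters compared case-insensitively, and the names `unique_words`, `document_frequency`, and `top_k` are hypothetical. Each thread counts into a private map that is merged under a critical section, which is one common OpenMP pattern for this kind of reduction.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>
#include <vector>

using DfMap = std::unordered_map<std::string, std::size_t>;

// Step (i): extract the distinct words of one document.
// Assumption: a word is a run of ASCII letters, lowercased.
std::unordered_set<std::string> unique_words(const std::string& text) {
    std::unordered_set<std::string> words;
    std::string current;
    for (unsigned char c : text) {
        if (std::isalpha(c)) {
            current.push_back(static_cast<char>(std::tolower(c)));
        } else if (!current.empty()) {
            words.insert(current);
            current.clear();
        }
    }
    if (!current.empty()) words.insert(current);
    return words;
}

// Step (ii): count, for each word, how many documents contain it.
// Documents are distributed across threads; each thread fills a private
// map, and the private maps are merged one thread at a time.
DfMap document_frequency(const std::vector<std::string>& docs) {
    DfMap global;
    #pragma omp parallel
    {
        DfMap local;  // private to this thread, so no locking while counting
        #pragma omp for schedule(dynamic) nowait
        for (long i = 0; i < static_cast<long>(docs.size()); ++i) {
            for (const auto& word : unique_words(docs[i])) {
                ++local[word];  // each document contributes at most 1 per word
            }
        }
        #pragma omp critical
        {
            for (const auto& entry : local) {
                global[entry.first] += entry.second;
            }
        }
    }
    return global;
}

// Step (iii): return the K entries with the highest DF, highest first.
// Ties are broken alphabetically so that the result is deterministic.
std::vector<std::pair<std::string, std::size_t>> top_k(const DfMap& df,
                                                       std::size_t k) {
    std::vector<std::pair<std::string, std::size_t>> items(df.begin(), df.end());
    k = std::min(k, items.size());
    auto by_df_desc = [](const std::pair<std::string, std::size_t>& a,
                         const std::pair<std::string, std::size_t>& b) {
        return a.second != b.second ? a.second > b.second : a.first < b.first;
    };
    std::partial_sort(items.begin(),
                      items.begin() + static_cast<std::ptrdiff_t>(k),
                      items.end(), by_df_desc);
    items.resize(k);
    return items;
}
```

Compiled with OpenMP enabled (for example `-fopenmp` with GCC or Clang), the loop over documents runs in parallel; without that flag the pragmas are ignored and the same code runs serially with identical results. A call such as `top_k(document_frequency(docs), K)` then yields the K words with the highest DF.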