Abstract
This thesis describes how a text can be represented as a network, in accordance with graph theory, and examines which practical applications such a representation can support. It begins with definitions from set theory, matrix theory and graph theory. Once these basic elements, which aid comprehension of what follows, are laid out, the foundations are in place for transforming written language into graph form, on which the rest of the thesis is built. Tools from graph theory and from other relevant fields that are suited to extracting information about a text are then described. Before any research on applications is presented, the thesis describes the results that computational linguistics has obtained by studying language from a network perspective. Building on all of the above, the second part of the thesis takes the form of a survey of modern research on the practical value of graph-based representations of text. The novelty of applying graphs in several natural language processing and information retrieval applications lies in the fact that doing so abandons the assumption of word independence. The main body of the thesis surveys three distinct but closely linked processes and their applications. The first is keyphrase extraction from texts, which is also the richest in research material. The second is automatic summarization, specifically extractive summarization of single and multiple documents. Finally, information retrieval is described in detail along with its applications. Research results are reported for each of these processes, and an interpretive comparison is attempted.
Keywords: Text Graphs, PageRank, Information Retrieval, Keyphrase Extraction, Automatic Summarization, Survey