netconstructor / lapdftext-1
This project forked from gullyapcburns/lapdftext
LA-PDFText has been developed by members of the Biomedical Knowledge Engineering group at the Information Sciences Institute. It is intended for use by both scientists and NLP engineers who want access to the text within specific sections of research articles. The system is open source and provides a simple baseline function for extracting text from primary research articles using rules that developers can customize. This means that the system works quite well for most applications, although it may occasionally make mistakes and extract the wrong text; when it does, it is always possible to 'hack' your own rules and improve performance.
License: GNU General Public License v3.0