The authors do not document the steps or specs needed for a successful build, and they give no guidance in this area. The notes below record a setup that worked.
-
Ubuntu server https://www.ubuntu.com/download/server - Ubuntu 18.10
-
download ISO - http://releases.ubuntu.com/18.10/ubuntu-18.10-live-server-amd64.iso
-
VMware Workstation 14 Pro
-
Install Ubuntu, log in as a user, and check the current directory.
Environment details and pre-installation commands:
ls
sudo add-apt-repository ppa:webupd8team/java
sudo apt update
sudo apt install oracle-java8-installer
javac -version
sudo apt install oracle-java8-set-default
mvn
apt-cache search maven
sudo apt-get install maven
mvn
mvn -version
sudo apt update
sudo apt install tesseract-ocr
git
ls
git clone https://github.com/ICIJ/extract
ls
cd extract/
ls
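Before editing the pom.xml files, it can help to confirm that the tools installed above are actually on the PATH. The following is a minimal sketch, not part of the original steps; the function name missing_tools is hypothetical, and the tool list is simply the set of commands used in these notes.

```shell
#!/bin/sh
# Print the name of every tool from the argument list that is not on PATH.
# missing_tools is a hypothetical helper name, not part of extract.
missing_tools() {
    for tool in "$@"; do
        # command -v exits non-zero when the name cannot be resolved
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tools installed by the commands above; no output means all were found
missing_tools java javac mvn git tesseract
```

If a name is printed, rerun the matching apt install command before starting the Maven build.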
NOTE: open the pom.xml in the extract folder in a text editor and modify the maven-gpg-plugin entry so that it includes <skip>true</skip>, as shown below
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-gpg-plugin</artifactId>
<version>1.5</version>
<executions>
<execution>
<id>sign-artifacts</id>
<phase>verify</phase>
<goals>
<goal>sign</goal>
</goals>
</execution>
</executions>
<configuration>
<skip>true</skip>
</configuration>
</plugin>
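To check that the edit was saved, the pom.xml can be scanned for the skip flag inside the maven-gpg-plugin entry. This is a rough sketch that assumes each tag sits on its own line, as in the snippet above; it is a line-based check, not a real XML parser, and gpg_skip_enabled is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed only if the maven-gpg-plugin entry in the given pom.xml
# contains <skip>true</skip>. Assumes one tag per line.
gpg_skip_enabled() {
    awk '
        /<artifactId>maven-gpg-plugin<\/artifactId>/ { in_plugin = 1 }
        in_plugin && /<skip>true<\/skip>/            { found = 1 }
        in_plugin && /<\/plugin>/                    { in_plugin = 0 }
        END { exit(found ? 0 : 1) }
    ' "$1"
}

# Run from the extract folder; does nothing if pom.xml is absent
if [ -f pom.xml ]; then
    if gpg_skip_enabled pom.xml; then
        echo "gpg signing is skipped"
    else
        echo "gpg signing is still enabled"
    fi
fi
```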
NOTE 2: go to the dir /home/userx/extract/extract-cli/, open the pom.xml file, and add the following dependency
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-simple</artifactId>
<version>1.7.25</version>
</dependency>
download the slf4j-simple-1.7.25.jar and put it in the /home/user/.m2/repository/org/slf4j/slf4j-api/1.7.25/ folder
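For reference, Maven's local repository normally stores a jar under a path built from its coordinates: groupId (dots turned into directories), then artifactId, then version. The sketch below prints that conventional path for the slf4j-simple dependency. The helper name m2_jar_path is hypothetical. Note that the conventional folder is named after the artifactId (slf4j-simple), which differs from the slf4j-api folder used in these notes; which of the two this build actually needs is not confirmed here.

```shell
#!/bin/sh
# Print the conventional local-repository path of a jar for the given
# Maven coordinates. m2_jar_path is a hypothetical helper name.
m2_jar_path() {
    group_id=$1
    artifact_id=$2
    version=$3
    # Dots in the groupId become directory separators
    group_dir=$(printf '%s' "$group_id" | tr '.' '/')
    printf '%s/.m2/repository/%s/%s/%s/%s-%s.jar\n' \
        "$HOME" "$group_dir" "$artifact_id" "$version" "$artifact_id" "$version"
}

m2_jar_path org.slf4j slf4j-simple 1.7.25
```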
mvn install -DskipTests -Dgpg.skip
OR
mvn package -DskipTests -Dgpg.skip
echo 'export JAVA_OPTS="-Xms512m -Xmx1024m"' >> ~/.bashrc
source ~/.bashrc
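Running an echo ... >> ~/.bashrc command more than once leaves duplicate lines in the file. A small guard avoids that. This is a sketch only; append_once is a hypothetical helper name, and the JAVA_OPTS value is the one used in these notes, wrapped in single quotes so the inner double quotes reach the file intact.

```shell
#!/bin/sh
# Append a line to a file only if that exact line is not already present.
# append_once is a hypothetical helper name.
append_once() {
    line=$1
    file=$2
    touch "$file"
    # -x matches whole lines, -F treats the text as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example call (left commented out so the block does not edit ~/.bashrc when pasted):
# append_once 'export JAVA_OPTS="-Xms512m -Xmx1024m"' ~/.bashrc
```

After the call, source ~/.bashrc picks up the variable in the current shell.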
cd /home/userx/extract/extract-cli/
sudo apt-get install libxtst6:i386
sudo apt-get update
sudo apt-get install libxtst6
sudo updatedb
locate libXtst
sudo apt install libxext6
sudo apt-get install libxrender1 libxtst6 libxi6
java -jar extract-cli.jar
result
usage: extract [command] [options]
usage: extract help
usage: extract version
A cross-platform tool for distributed content-extraction by the data team
at the International Consortium of Investigative Journalists.
Commands
load-report
rollback
wipe-report
spew-dump
clean-report
view-report
inspect-dump
commit
load-queue
rehash
wipe-queue
delete
version
help
dump-queue
spew
copy
tag
queue
dump-report
Additional Image Formats
jpg
bmp
gif
wbmp
png
jpeg
jbig2
Extract will use up to 1 GB of memory on this machine.
Please report issues at: https://github.com/ICIJ/extract/issues.
result of the version and library checks
javac 1.8.0_191
Apache Maven 3.5.4
Maven home: /usr/share/maven
Java version: 1.8.0_191, vendor: Oracle Corporation, runtime: /usr/lib/jvm/java-8-oracle/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "4.18.0-10-generic", arch: "amd64", family: "unix"
tesseract 4.0.0-beta.3-249-g607e
leptonica-1.76.0
libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.2) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
Found AVX2
Found AVX
Found SSE
/usr/lib/x86_64-linux-gnu/libXtst.so.6
/usr/lib/x86_64-linux-gnu/libXtst.so.6.1.0
Ruby setup, following https://gorails.com/setup/ubuntu/18.10
ruby --version
curl -sL https://deb.nodesource.com/setup_8.x | sudo -E bash -
cd ..
curl -sS https://dl.yarnpkg.com/debian/pubkey.gpg | sudo apt-key add -
echo "deb https://dl.yarnpkg.com/debian/ stable main" | sudo tee /etc/apt/sources.list.d/yarn.list
sudo apt-get update
sudo apt-get install git-core curl zlib1g-dev build-essential libssl-dev libreadline-dev libyaml-dev libsqlite3-dev sqlite3 libxml2-dev libxslt1-dev libcurl4-openssl-dev software-properties-common libffi-dev nodejs yarn
cd
git clone https://github.com/rbenv/rbenv.git ~/.rbenv
echo 'export PATH="$HOME/.rbenv/bin:$PATH"' >> ~/.bashrc
echo 'eval "$(rbenv init -)"' >> ~/.bashrc
exec $SHELL
git clone https://github.com/rbenv/ruby-build.git ~/.rbenv/plugins/ruby-build
echo 'export PATH="$HOME/.rbenv/plugins/ruby-build/bin:$PATH"' >> ~/.bashrc
exec $SHELL