Analyzes the quality of images in PDFs.
- python (from Python 3.7)
- matlab (from R2019b)
- poppler pdfimages (0.68.0)
- exif==1.3.5
- haishoku==1.1.8
- layoutparser[effdet]==0.3.4
- numpy==1.21.6
- opencv_python==4.6.0.66
- Pillow==9.3.0
- protobuf==4.21.12
- pyenchant==3.2.2
- PyMuPDF==1.20.2
- pytesseract==0.3.10
- reportlab==3.6.11
- wcag_contrast_ratio==0.9
- importlib-metadata==5.2.0
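A minimal sketch for checking that the installed packages match the pins listed above. The `PINS` dictionary and the `check_pins` helper are illustrative names, not part of this repository, and only a subset of the pins is included; extend it as needed.

```python
# Sketch: compare installed package versions against pinned versions.
try:
    from importlib.metadata import version, PackageNotFoundError  # Python 3.8+
except ImportError:
    # Python 3.7: fall back to the importlib-metadata backport
    from importlib_metadata import version, PackageNotFoundError

# Illustrative subset of the pins from the requirements list.
PINS = {
    "numpy": "1.21.6",
    "Pillow": "9.3.0",
    "PyMuPDF": "1.20.2",
    "pytesseract": "0.3.10",
}


def check_pins(pins, get_version=version):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = get_version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in check_pins(PINS).items():
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

An empty result means every checked package is installed at its pinned version. MATLAB and poppler are not Python packages and are not covered by this check.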
Tested with Windows 10, Python 3.10 and MATLAB R2022b
The dictionary files 'de_DE_frami.aff' and 'de_DE_frami.dic' need to be moved to the pyenchant folder [...]\enchant\data\mingw64\share\enchant\hunspell
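The dictionary step can be sketched in Python as follows. This is only an illustration: `install_dictionaries` is a hypothetical helper, it copies the files instead of moving them, and the location of the installed `enchant` package folder must be supplied by the caller, since the sketch does not guess the abbreviated part of the path.

```python
import shutil
from pathlib import Path

# The two dictionary files named in the instructions.
DICT_FILES = ("de_DE_frami.aff", "de_DE_frami.dic")


def install_dictionaries(src_dir, enchant_dir):
    """Copy the dictionary files from src_dir into the hunspell folder
    below enchant_dir (the installed enchant package folder)."""
    target = Path(enchant_dir) / "data" / "mingw64" / "share" / "enchant" / "hunspell"
    if not target.is_dir():
        raise FileNotFoundError(f"hunspell folder not found: {target}")
    copied = []
    for name in DICT_FILES:
        # copy2 raises FileNotFoundError if a source file is missing
        shutil.copy2(Path(src_dir) / name, target / name)
        copied.append(target / name)
    return copied
```

The function returns the paths of the copied files, so a caller can print or verify them afterwards.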
To analyse a PDF, run main.py with the path of the PDF file as parameter.
Example: python main.py myTestPdf.pdf
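To analyse several PDFs in one go, the call above can be wrapped in a small loop. This is a sketch under assumptions: `build_commands` and `analyse_all` are hypothetical helpers, and it is assumed (not confirmed by this README) that main.py exits with a non-zero status when the analysis fails.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(pdf_dir, script="main.py"):
    """Return one command list per PDF in pdf_dir, sorted by file name."""
    pdfs = sorted(p for p in Path(pdf_dir).iterdir() if p.suffix.lower() == ".pdf")
    return [[sys.executable, script, str(pdf)] for pdf in pdfs]


def analyse_all(pdf_dir, script="main.py"):
    """Run the script once per PDF; return the PDFs whose run exited non-zero."""
    failed = []
    for cmd in build_commands(pdf_dir, script):
        result = subprocess.run(cmd)
        if result.returncode != 0:  # assumption: non-zero means failure
            failed.append(cmd[-1])
    return failed
```

Run it from the folder that contains main.py, for example `analyse_all("pdfs")` with a folder name of your choice. If the report is always written under the same file name, each run may overwrite the previous report, so copy it away between runs if needed.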
Tested with Linux, Python 3.7 and MATLAB R2019b
From the repository root, build the Docker image:
docker build -t schall/image_analyzer:git -f docker/Dockerfile .
Run it while mounting the host's MATLAB installation into the container (the mount must not be read-only because the MATLAB Python engine has to be built inside it):
docker run -it --rm -v /usr/local/MATLAB/:/usr/local/MATLAB -v /path/to/input_dir:/input:ro -v /path/to/output_dir:/output schall/image_analyzer:git
Now navigate to main/src, run main.py with the path of the PDF file as parameter, and finally export the output:
cd main/src
python3 main.py /input/some.pdf
cp ../../output/bildanalyse_report.pdf /output/some_report.pdf