cildataconverter.py
Chris Churas edited this page Jan 25, 2018 · 3 revisions
This script takes a directory of downloaded image and video datasets created by cildatadownloader.py and converts the data files as defined in the requirements document.
Output when passing --help on the command line:
usage: cildataconverter.py [-h] [--log {DEBUG,INFO,WARNING,ERROR,CRITICAL}]
                           [--id ID] [--onlycheckzipfiles]
                           [--skipifrawmissing] [--version]
                           downloaddir
Version 0.1.0
Given a directory of images and videos downloaded by
cildatadownloader.py, this script performs a set of
file renames and conversions on each dataset found within.
The only required argument is the output directory
generated by cildatadownloader.py which should contain
two subdirectories images/ and videos/.
IMAGES:
For datasets under images/ directory the following
conversions are performed:
<ID>.raw is examined and verified to be a valid zip file.
Any files found in <ID>.raw are extracted and written
to <ID> directory in format <ID>_orig.<FORMAT> where
<FORMAT> is the extension found in the <ID>.zip file.
<ID>.zip is created with the following structure:
    <ID>/
        <ID>_orig.<FORMAT>
        <ID>_orig.<FORMAT>
If the zip file looks to exceed 4 GB, 64-bit extensions
are enabled.
The json file is updated to have correct md5 checksums,
file sizes, and mime_types as determined by headers
when downloading and/or by file extension.
VIDEOS:
For datasets under videos/ directory the following
conversions are performed:
<ID>.raw is renamed based on its mimetype obtained
from http headers when downloaded and named
<ID>.<MIMETYPE SUFFIX>.
<ID>.zip file is created with following structure
    <ID>/
        <ID>.<MIMETYPE SUFFIX>
The json file is updated to have correct md5 checksums,
file sizes, and mime_types as determined by headers
when downloading and/or by file extension.
For more information visit:
https://github.com/slash-segmentation/cildata_util/wiki
positional arguments:
  downloaddir           Directory where images and videos reside
optional arguments:
  -h, --help            show this help message and exit
  --log {DEBUG,INFO,WARNING,ERROR,CRITICAL}
                        Set the logging level (default WARNING)
  --id ID               Only convert data with id passed in.
  --onlycheckzipfiles   If set examines all the zip files in images and reports
                        number of files and file names found.
  --skipifrawmissing    If set skip any ids where no raw file is found in
                        directory.
  --version             show program's version number and exit
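The image conversion described in the help text above can be sketched roughly as follows. This is a minimal illustration written for Python 3, not the code in cildataconverter.py: the function names `md5_of_file`, `convert_image_raw`, and `file_stats`, the dictionary keys, and the exact output locations are all assumptions made for the example. It also always permits 64-bit zip extensions instead of first estimating whether the archive will exceed 4 GB.

```python
import hashlib
import os
import shutil
import zipfile


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def convert_image_raw(raw_path, dataset_id, out_dir):
    """Extract <ID>.raw and repack it as <ID>.zip (hypothetical helper).

    Returns the path of the zip file that was written.
    """
    # Step 1: verify that the .raw file is really a zip archive.
    if not zipfile.is_zipfile(raw_path):
        raise ValueError(raw_path + " is not a valid zip file")

    id_dir = os.path.join(out_dir, dataset_id)
    os.makedirs(id_dir, exist_ok=True)

    # Step 2: write each member out as <ID>/<ID>_orig.<FORMAT>.
    extracted = []
    with zipfile.ZipFile(raw_path) as raw_zip:
        for info in raw_zip.infolist():
            if info.is_dir():
                continue
            ext = os.path.splitext(info.filename)[1]
            dest = os.path.join(id_dir, dataset_id + "_orig" + ext)
            if dest in extracted:
                # Assumption of this sketch: one file per extension.
                raise ValueError("duplicate extension in " + raw_path)
            with raw_zip.open(info) as src, open(dest, "wb") as dst:
                shutil.copyfileobj(src, dst)
            extracted.append(dest)

    # Step 3: build <ID>.zip with everything under an <ID>/ folder.
    # allowZip64=True lets zipfile switch to 64-bit extensions if needed.
    zip_path = os.path.join(out_dir, dataset_id + ".zip")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED,
                         allowZip64=True) as out_zip:
        for path in extracted:
            arcname = dataset_id + "/" + os.path.basename(path)
            out_zip.write(path, arcname=arcname)
    return zip_path


def file_stats(path):
    """Collect the checksum and size that would go into the json file.

    The key names here are placeholders, not the real json schema.
    """
    return {"md5": md5_of_file(path), "size": os.path.getsize(path)}
```

Calling `convert_image_raw("images/123/123.raw", "123", "images/123")` on a hypothetical dataset with id 123 would write `123.zip` next to the raw file, and `file_stats` on that zip gives the values a json update step would need. The mime type is left out because the help text says it comes from the download headers, which this sketch does not have access to.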