v5.0.1 #751

Merged 26 commits on Dec 19, 2024
Changes from 25 commits
490bf48
commands.init:cmd_init - more consistent version logging, save Docume…
MatteoCampinoti94 Dec 18, 2024
c6d365d
commands.init - add support for pre-acacore database in import
MatteoCampinoti94 Dec 18, 2024
beb7bae
readme - update
MatteoCampinoti94 Dec 18, 2024
9db8940
commands.init:check_import_db - safer check for metadata table
MatteoCampinoti94 Dec 18, 2024
f0247af
commands.init:import_files - log skipped files that are not in Origin…
MatteoCampinoti94 Dec 18, 2024
225e5c4
commands.init:import_files - support Windows-formatted paths
MatteoCampinoti94 Dec 18, 2024
4bbbcc2
commands.init:import_files - fix support for Windows-formatted paths
MatteoCampinoti94 Dec 18, 2024
a76fab3
commands.init:import_files - fix splitting of file path
MatteoCampinoti94 Dec 18, 2024
f06f57c
poetry - use acacore 4.1.0
MatteoCampinoti94 Dec 18, 2024
153e228
tests.avid - add converted master files
MatteoCampinoti94 Dec 18, 2024
536b633
tests.avid - add converted statutory files
MatteoCampinoti94 Dec 18, 2024
7eaf95a
tests.avid - upgrade to version 4.1.0
MatteoCampinoti94 Dec 18, 2024
4172c71
poetry - use acacore 4.1.1
MatteoCampinoti94 Dec 18, 2024
f6b46ff
tests.avid - upgrade to version 4.1.1
MatteoCampinoti94 Dec 18, 2024
0257d57
tests.identify:identify_original - do not check processed status
MatteoCampinoti94 Dec 18, 2024
89355fb
tests.identify:identify_master - fix incorrect properties
MatteoCampinoti94 Dec 18, 2024
87acb28
commands.edit.common:edit_file_value - allow to use a function as pro…
MatteoCampinoti94 Dec 18, 2024
f21b89b
commands.edit.processed:cmd_processed_master - support setting access…
MatteoCampinoti94 Dec 18, 2024
b12794d
version - patch 5.0.0 > 5.0.1
MatteoCampinoti94 Dec 18, 2024
cd9d8c4
changelog:5.0.1 - add changes
MatteoCampinoti94 Dec 18, 2024
d584bda
readme - update
MatteoCampinoti94 Dec 18, 2024
7e60318
tests.edit:edit_original_processed - use a non-processed file as test…
MatteoCampinoti94 Dec 19, 2024
28ab764
tests.edit:edit_original_processed - test no change
MatteoCampinoti94 Dec 19, 2024
b127cc4
commands.edit.rollback:rollback - continue to next event instead of b…
MatteoCampinoti94 Dec 19, 2024
1ce1c21
tests.rollback - do not use random sorting, use all available files w…
MatteoCampinoti94 Dec 19, 2024
25c32ec
changelog:5.0.1 - add fixes
MatteoCampinoti94 Dec 19, 2024
7 changes: 7 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,12 @@
# Changelog

## v5.0.1

### Changes

* Use acacore 4.1.1
* `edit master processed` can now set processed status of access and statutory targets separately

## v5.0.0

Complete overhaul of digiarch to work with the entire AVID folder and handle files across document types (original,
13 changes: 8 additions & 5 deletions README.md
@@ -72,9 +72,10 @@ Usage: digiarch init [OPTIONS] AVID_DIR
The directory is checked to make sure it has the structure expected of an
AVID archive.

The --import option allows to import OriginalFiles from a files.db database
generated by version v4.1.12 of digiarch (acacore v3.3.3). MasterFiles are
added as well, if present in the database.
The --import option allows to import original and master files from a
files.db database generated by version v4.1.12 of digiarch (acacore v3.3.3).
A pre-acacore version of the database can also be used if it contains a
'Files' table with a 'path' column, but some master files may be missing.

Options:
--import FILE Import an existing files.db
@@ -614,9 +615,11 @@ Options:
##### digiarch edit master processed

```
Usage: digiarch edit master processed [OPTIONS] QUERY REASON
Usage: digiarch edit master processed [OPTIONS] QUERY {access|statutory}
REASON

Set master files matching the QUERY argument as processed.
Set master files matching the QUERY argument as processed for the relevant
target (access or statutory).

To set files as unprocessed, use the --unprocessed option.

2 changes: 1 addition & 1 deletion digiarch/__version__.py
@@ -1 +1 @@
__version__ = "5.0.0"
__version__ = "5.0.1"
10 changes: 6 additions & 4 deletions digiarch/commands/edit/common.py
@@ -1,6 +1,7 @@
from logging import INFO
from logging import Logger
from typing import Any
from typing import Callable
from typing import Literal

from acacore.database import FilesDB
@@ -23,12 +24,13 @@ def edit_file_value(
reason: str,
file_type: Literal["original", "master", "access", "statutory"],
property_name: str,
property_value: Any, # noqa: ANN401
property_value: Callable[[Any], Any] | Any, # noqa: ANN401
dry_run: bool,
*loggers: Logger,
):
for file in query_table(table, query, [("lower(relative_path)", "asc")]):
if getattr(file, property_name) == property_value:
value = property_value(getattr(file, property_name)) if callable(property_value) else property_value
if getattr(file, property_name) == value:
Event.from_command(ctx, "skip", (file.uuid, file_type), reason="No Changes").log(
INFO,
*loggers,
@@ -39,11 +41,11 @@ def edit_file_value(
ctx,
"edit",
(file.uuid, file_type),
[getattr(file, property_name), property_value],
[getattr(file, property_name), value],
reason,
)
if not dry_run:
setattr(file, property_name, property_value)
setattr(file, property_name, value)
table.update(file)
database.log.insert(event)
event.log(INFO, *loggers, show_args=["uuid", "data"], path=file.relative_path)
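Reviewer note: the change to `edit_file_value` lets `property_value` be either a plain value or a function of the current value. The following is a minimal sketch of that dispatch, not the project's code; `resolve_value` is a hypothetical helper name used only for illustration.

```python
from __future__ import annotations

from typing import Any, Callable


def resolve_value(current: Any, new: Callable[[Any], Any] | Any) -> Any:
    """Return the value to store (hypothetical helper, not part of digiarch).

    If `new` is callable it is applied to the current value,
    otherwise it is used as-is.
    """
    return new(current) if callable(new) else new


# A plain value replaces the current one.
print(resolve_value(1, 3))
# A function derives the new value from the current one.
print(resolve_value(0b01, lambda p: p | 0b10))
```

One consequence of dispatching on `callable(...)` is that a property whose intended value is itself a callable object cannot be set directly; it would have to be wrapped in a function that returns it.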
25 changes: 22 additions & 3 deletions digiarch/commands/edit/processed.py
@@ -2,6 +2,7 @@
from acacore.utils.click import start_program
from acacore.utils.helpers import ExceptionManager
from click import argument
from click import Choice
from click import command
from click import Context
from click import option
@@ -69,6 +70,7 @@ def cmd_processed_original(ctx: Context, query: TQuery, reason: str, processed:
@rollback("edit", rollback_file_value("processed"))
@command("processed", no_args_is_help=True, short_help="Set master files as processed.", cls=CommandWithRollback)
@argument_query(True, "uuid", ["uuid", "checksum", "puid", "relative_path", "action", "warning", "processed"])
@argument("processed_type", type=Choice(["access", "statutory"]), nargs=1, required=True)
@argument("reason", nargs=1, type=str, required=True)
@option(
"--processed/--unprocessed",
@@ -79,9 +81,16 @@ def cmd_processed_original(ctx: Context, query: TQuery, reason: str, processed:
)
@option_dry_run()
@pass_context
def cmd_processed_master(ctx: Context, query: TQuery, reason: str, processed: bool, dry_run: bool):
def cmd_processed_master(
ctx: Context,
query: TQuery,
processed_type: str,
reason: str,
processed: bool,
dry_run: bool,
):
"""
Set master files matching the QUERY argument as processed.
Set master files matching the QUERY argument as processed for the relevant target (access or statutory).

To set files as unprocessed, use the --unprocessed option.

@@ -91,6 +100,16 @@ def cmd_processed_master(ctx: Context, query: TQuery, reason: str, processed: bo
"""
avid = get_avid(ctx)

mask: int = 0

if processed_type == "access":
mask += 0b01
if processed_type == "statutory":
mask += 0b10

if not processed:
mask ^= 0b11

with open_database(ctx, avid) as database:
log_file, log_stdout, _ = start_program(ctx, database, __version__, None, True, True, dry_run)

@@ -103,7 +122,7 @@ def cmd_processed_master(ctx: Context, query: TQuery, reason: str, processed: bo
reason,
"master",
"processed",
processed,
(lambda p: p | mask) if processed else (lambda p: p & mask),
dry_run,
log_stdout,
)
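Reviewer note: the mask arithmetic in `cmd_processed_master` can be read as a two-bit flag update. The sketch below is an illustration under assumptions, not the project's implementation: `ACCESS`, `STATUTORY`, and `set_processed` are hypothetical names, and it assumes, as the diff suggests, that the master file's `processed` value is an integer with bit 0 for the access target and bit 1 for the statutory target.

```python
ACCESS = 0b01      # bit 0: access target (assumed from the diff)
STATUTORY = 0b10   # bit 1: statutory target (assumed from the diff)


def set_processed(current: int, target: str, processed: bool) -> int:
    """Set or clear the bit for one target, leaving the other bit untouched."""
    bit = ACCESS if target == "access" else STATUTORY
    if processed:
        # Setting: OR the target bit in.
        return current | bit
    # Clearing: AND with the complement of the target bit within two bits.
    # The diff obtains the same complement with `mask ^= 0b11`.
    return current & (bit ^ 0b11)


print(bin(set_processed(0b00, "access", True)))
print(bin(set_processed(0b11, "access", False)))
```

Because each call touches only one bit, marking a file as processed for access does not alter its statutory status, which matches the changelog entry for v5.0.1.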
9 changes: 1 addition & 8 deletions digiarch/commands/edit/rollback.py
@@ -134,14 +134,7 @@ def rollback(
if not event.file_type:
continue
if not (handler := handlers.get(event.operation)):
Event.from_command(ctx, "skip", (event.file_uuid, event.file_type)).log(
INFO,
*loggers,
run=f"{run_start.time:%Y-%m-%dT%T}",
event=f"{event.time:%Y-%m-%dT%T} {event.operation}",
reason="No handler found",
)
break
continue

file = get_file(database, event.file_type, event.file_uuid)
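Reviewer note: replacing `break` with `continue` means an event without a handler no longer stops the rollback of the events that follow it. A minimal sketch of that control flow, using hypothetical names (`rollback_events`, plain tuples for events) rather than the project's `Event` objects:

```python
from typing import Any, Callable


def rollback_events(
    events: list[tuple[str, Any]],
    handlers: dict[str, Callable[[Any], Any]],
) -> list[Any]:
    """Apply the matching handler to each event, skipping unknown operations."""
    results: list[Any] = []
    for operation, payload in events:
        handler = handlers.get(operation)
        if handler is None:
            # With `break` here, every later event would be left untouched.
            continue
        results.append(handler(payload))
    return results


events = [("edit", 1), ("unknown", 2), ("edit", 3)]
print(rollback_events(events, {"edit": lambda value: value * 10}))
```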

118 changes: 104 additions & 14 deletions digiarch/commands/init.py
@@ -1,11 +1,14 @@
from logging import ERROR
from logging import INFO
from logging import Logger
from logging import WARNING
from os import PathLike
from pathlib import Path
from pathlib import PureWindowsPath
from sqlite3 import connect
from sqlite3 import Connection
from sqlite3 import Row
from typing import Literal

from acacore.database import FilesDB
from acacore.database.upgrade import is_latest
@@ -36,7 +39,7 @@ def root_callback(ctx: Context, param: Parameter, value: str) -> AVID:
return AVID(value)


def import_original_file(avid: AVID, file: Row) -> tuple[OriginalFile, list[MasterFile], list[str]]:
def import_acacore_original_file(avid: AVID, file: Row) -> tuple[OriginalFile, list[MasterFile], list[str]]:
original_file: OriginalFile = OriginalFile.from_file(
avid.dirs.original_documents.joinpath(file["relative_path"]),
avid.path,
@@ -66,7 +69,7 @@ def import_original_file(avid: AVID, file: Row) -> tuple[OriginalFile, list[Mast
return original_file, master_files, missing_master_files


def import_original_files(
def import_acacore_files(
ctx: Context,
avid: AVID,
db: FilesDB,
@@ -81,7 +84,7 @@ def import_original_files(
total_missing_master_files: int = 0

for original_file_row in original_files_cur:
original_file, master_files, missing_master_files = import_original_file(avid, original_file_row)
original_file, master_files, missing_master_files = import_acacore_original_file(avid, original_file_row)
db.original_files.insert(original_file)
if master_files:
db.master_files.insert(*master_files)
@@ -109,17 +112,89 @@ def import_original_files(
return total_imported_original_files, total_imported_master_files, total_missing_master_files


def import_files(ctx: Context, avid: AVID, db: FilesDB, db_old: Connection, *loggers: Logger) -> tuple[int, int, int]:
# noinspection SqlResolve
paths_cursor = db_old.execute("select path from Files")

total_imported_original_files: int = 0
total_imported_master_files: int = 0

path_str: str
for [path_str] in paths_cursor:
path = Path(PureWindowsPath(path_str)) if "\\" in path_str else Path(path_str)
if "originaldocuments" not in (path_parts := [p.lower() for p in path.parts]):
Event.from_command(ctx, "skip").log(
WARNING,
*loggers,
path=path,
reason="File is not in OriginalDocuments",
)
continue
path = avid.dirs.original_documents.joinpath(*path.parts[path_parts.index("originaldocuments") + 1 :])
if not path.is_file():
Event.from_command(ctx, "skip").log(
WARNING,
*loggers,
path=path.relative_to(avid.path),
reason="File not found",
)
continue

original_file = OriginalFile.from_file(path, avid.path)
db.original_files.insert(original_file)
Event.from_command(ctx, "imported", (original_file.uuid, "original")).log(
INFO,
*loggers,
path=original_file.relative_path,
)
total_imported_original_files += 1

master_files_dir: Path = avid.dirs.master_documents.joinpath(
path.parent.relative_to(avid.dirs.original_documents)
)
master_files: list[MasterFile] = [
MasterFile.from_file(f, avid.path, original_file.uuid)
for f in master_files_dir.iterdir()
if f.is_file() and f.stem == path.stem
]
db.master_files.insert(*master_files)
for master_file in master_files:
Event.from_command(ctx, "imported", (master_file.uuid, "master")).log(
INFO,
*loggers,
path=master_file.relative_path,
)
total_imported_master_files += len(master_files)

return total_imported_original_files, total_imported_master_files, 0


def import_db(
ctx: Context,
avid: AVID,
db: FilesDB,
import_db_path: str | PathLike,
import_mode: Literal["acacore", "files"],
*loggers: Logger,
):
db_old = connect(import_db_path)
new_original_files: int = 0
new_master_files: int = 0
missing_master_files: int = 0

Event.from_command(ctx, "import:start").log(INFO, *loggers, type=import_mode)

if import_mode == "acacore":
new_original_files, new_master_files, missing_master_files = import_acacore_files(
ctx,
avid,
db,
db_old,
*loggers,
)
elif import_mode == "files":
new_original_files, new_master_files, missing_master_files = import_files(ctx, avid, db, db_old, *loggers)

Event.from_command(ctx, "import:start").log(INFO, *loggers)
new_original_files, new_master_files, missing_master_files = import_original_files(ctx, avid, db, db_old, *loggers)
Event.from_command(ctx, "import:end").log(
INFO,
*loggers,
@@ -134,6 +209,7 @@ def import_db(
"import",
None,
{
"mode": import_mode,
"original_files": new_original_files,
"master_files": new_master_files,
"missing_master_files": missing_master_files,
@@ -143,21 +219,31 @@ def import_db(
db.commit()


def check_import_db(ctx: Context, import_db_path: str | PathLike, import_param_name: str):
def check_import_db(
ctx: Context,
import_db_path: str | PathLike,
import_param_name: str,
) -> Literal["acacore", "files"]:
db_old: Connection | None = None

try:
db_old = connect(import_db_path)

tables: list[str] = [t.lower() for [t] in db_old.execute("select name from sqlite_master where type = 'table'")]
if "files" not in tables or "metadata" not in tables:
if "files" not in tables:
raise BadParameter("Invalid database schema.", ctx, ctx_params(ctx)[import_param_name])
if "metadata" not in tables or (
{c.lower() for [_, c, *_] in db_old.execute("pragma table_info(metadata)")} != {"key", "value"}
):
return "files"

version = db_old.execute("select value from Metadata where key = 'version'").fetchone()
if not version:
raise BadParameter("No version information.", ctx, ctx_params(ctx)[import_param_name])
if version[0] != "3.3.3":
raise BadParameter(f"Invalid version {version[0]}, must be 3.3.3.", ctx, ctx_params(ctx)[import_param_name])

return "acacore"
finally:
if db_old:
db_old.close()
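Reviewer note: `check_import_db` now chooses between the two import modes by inspecting the schema. The sketch below reproduces only that decision with the standard `sqlite3` module. It is an illustration, not the project's code: `detect_import_mode` is a hypothetical name, it raises `ValueError` instead of click's `BadParameter`, and it leaves out the acacore version check.

```python
import sqlite3


def detect_import_mode(conn: sqlite3.Connection) -> str:
    """Return "acacore" or "files" depending on the tables found."""
    tables = {
        name.lower()
        for (name,) in conn.execute("select name from sqlite_master where type = 'table'")
    }
    if "files" not in tables:
        raise ValueError("Invalid database schema.")
    if "metadata" not in tables:
        return "files"
    # The second field of each `pragma table_info` row is the column name.
    columns = {row[1].lower() for row in conn.execute("pragma table_info(metadata)")}
    return "acacore" if columns == {"key", "value"} else "files"


conn = sqlite3.connect(":memory:")
conn.execute("create table Files (path text)")
print(detect_import_mode(conn))  # only a Files table is present

conn.execute("create table Metadata (key text, value text)")
print(detect_import_mode(conn))  # Metadata with key/value columns is present
conn.close()
```

Checking the column set of `Metadata`, and not only its presence, is what the commit message calls a "safer check": a pre-acacore database that happens to contain an unrelated `metadata` table still falls back to the `files` mode.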
@@ -187,13 +273,15 @@ def cmd_init(ctx: Context, avid: AVID, import_db_path: str | None):

The directory is checked to make sure it has the structure expected of an AVID archive.

The --import option allows to import OriginalFiles from a files.db database generated by version v4.1.12 of
digiarch (acacore v3.3.3). MasterFiles are added as well, if present in the database.
The --import option allows to import original and master files from a files.db database generated by version
v4.1.12 of digiarch (acacore v3.3.3). A pre-acacore version of the database can also be used if it contains a
'Files' table with a 'path' column, but some master files may be missing.
"""
avid.database_path.parent.mkdir(parents=True, exist_ok=True)
import_mode: Literal["acacore", "files"] | None = None

if import_db_path:
check_import_db(ctx, import_db_path, "import_db_path")
import_mode = check_import_db(ctx, import_db_path, "import_db_path")

with FilesDB(avid.database_path, check_initialisation=False, check_version=False) as db:
_, log_stdout, event_start = start_program(ctx, db, __version__, None, False, True, True)
@@ -202,23 +290,25 @@ def cmd_init(ctx: Context, avid: AVID, import_db_path: str | None):
with ExceptionManager(BaseException) as exception:
if db.is_initialised():
is_latest(db.connection, raise_on_difference=True)
Event.from_command(ctx, "initialized", data=db.version()).log(INFO, log_stdout)
Event.from_command(ctx, "initialized").log(INFO, log_stdout, version=db.version())
else:
db.init()
db.log.insert(event_start)
db.commit()

if avid.dirs.documents.exists() and not avid.dirs.original_documents.exists():
avid.dirs.documents.rename(avid.dirs.original_documents)
Event.from_command(ctx, "rename", data=["Documents", "OriginalDocuments"]).log(INFO, log_stdout)
event = Event.from_command(ctx, "rename", data=["Documents", "OriginalDocuments"])
db.log.insert(event)
event.log(INFO, log_stdout)

initialized = True
event = Event.from_command(ctx, "initialized", data=(v := db.version()))
db.log.insert(event)
event.log(INFO, log_stdout, show_args=False, version=v)

if initialized and import_db_path:
import_db(ctx, avid, db, import_db_path, log_stdout)
if initialized and import_db_path and import_mode is not None:
import_db(ctx, avid, db, import_db_path, import_mode, log_stdout)

end_program(ctx, db, exception, not initialized, log_stdout)
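Reviewer note: the path handling in `import_files` (Windows-formatted paths, case-insensitive search for `OriginalDocuments`) can be sketched in isolation as follows. This is an illustration under assumptions: `parts_after_original_documents` is a hypothetical name, the example paths are made up, and the sketch uses `PurePosixPath` for non-Windows input so that it behaves the same on every platform, whereas the diff uses `Path`.

```python
from __future__ import annotations

from pathlib import PurePosixPath, PureWindowsPath


def parts_after_original_documents(path_str: str) -> tuple[str, ...] | None:
    """Return the path components below OriginalDocuments, or None if absent."""
    # A backslash is taken as the sign of a Windows-formatted path.
    pure = PureWindowsPath(path_str) if "\\" in path_str else PurePosixPath(path_str)
    lowered = [part.lower() for part in pure.parts]
    if "originaldocuments" not in lowered:
        return None
    return pure.parts[lowered.index("originaldocuments") + 1 :]


print(parts_after_original_documents(r"C:\archive\OriginalDocuments\docCollection1\1\file.docx"))
print(parts_after_original_documents("archive/originaldocuments/docCollection1/1/file.docx"))
print(parts_after_original_documents("archive/other/file.docx"))
```

The returned components can then be joined onto the archive's own `OriginalDocuments` directory, which is how the diff rebuilds each path before checking that the file exists.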
