From 0ca58e98d760bda79cc9937b2e379b5cc8b8a859 Mon Sep 17 00:00:00 2001
From: Spyros
Date: Tue, 16 Jan 2024 16:18:52 +0000
Subject: [PATCH] progress

---
 docs/designs/importing-v2.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/docs/designs/importing-v2.md b/docs/designs/importing-v2.md
index be00b34a..48799e9a 100644
--- a/docs/designs/importing-v2.md
+++ b/docs/designs/importing-v2.md
@@ -24,14 +24,16 @@ Importing needs to be heavily refactored in order to support fractionality and p
 Specifically the following changes are required:
 
 1. change method `application.cmd.cre_main.parse_standards_from_spreadsheeet` to recognise our core CSV and first retrieve a list of included resources (be it CRE structure or external resources)
+
 2. then change the same method to loop over resources and import each resource other than CREs in parallel.
+
 3. change method `parse_hierarchical_export_format` to return a dict mapping the name of the resource to the documents of the resource so each resource can be split for importing to different workers
 4. change method `application.cmd.cre_main.parse_standards_from_spreadsheeet` to prioritise largest standards for importing first.
 5. change method `parse_hierarchical_export_format` to call both `register_cre` and `register_node`
 6. change methods `register_cre/node` to also optionally generate embeddings, update neo4j and optionally precalculate two way gap analysis for each resource imported
 7. change embedding generation to make it optional to calculate embeddings on singleton instantiation
 8. write tests that allow of resource updating
-9. create a method that allows for "forgetting" a resource, useful for when our core structure changes This includes
+9. create a method that allows for "forgetting" a resource, useful for when our core structure changes This includes
 * remove all links between the target resource and everything else
 * remove all embeddings of the target resource
 * remove all gap analysis of the target resource
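
Note on steps 2 to 4 of the list above: a minimal sketch of how grouping documents per resource, ordering the largest resources first, and importing everything other than CREs in parallel could fit together. None of this is taken from the codebase. `group_by_resource`, `largest_first`, `import_all`, the `import_resource` callback, the `"CRE"` key and the `Document` alias are hypothetical names used for illustration only, and handling the CRE structure before the parallel phase is an assumption rather than something the design states.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical stand-in for whatever object a parsed document really is.
Document = Dict[str, str]


def group_by_resource(
    rows: Iterable[Tuple[str, Document]]
) -> Dict[str, List[Document]]:
    """Map each resource name to the documents that belong to it (step 3)."""
    grouped: Dict[str, List[Document]] = defaultdict(list)
    for resource_name, document in rows:
        grouped[resource_name].append(document)
    return dict(grouped)


def largest_first(grouped: Dict[str, List[Document]]) -> List[str]:
    """Order resource names by document count, largest first (step 4)."""
    return sorted(grouped, key=lambda name: len(grouped[name]), reverse=True)


def import_all(
    grouped: Dict[str, List[Document]],
    import_resource: Callable[[str, List[Document]], int],
    cre_name: str = "CRE",  # assumed key for the core CRE structure
    max_workers: int = 4,
) -> Dict[str, int]:
    """Import CREs first (an assumption), then the rest in parallel (step 2)."""
    results: Dict[str, int] = {}
    if cre_name in grouped:
        results[cre_name] = import_resource(cre_name, grouped[cre_name])

    others = [name for name in largest_first(grouped) if name != cre_name]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit in largest-first order so the big resources start earliest.
        futures = {
            name: pool.submit(import_resource, name, grouped[name])
            for name in others
        }
        for name, future in futures.items():
            results[name] = future.result()
    return results
```

Threads are used here only to keep the sketch self-contained. If each resource is meant to go to a separate worker process or a job queue, the grouping and ordering stay the same and only the executor changes.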