#3742 Merge multiple project files into multiple project outputs (MIMO) (#3800)

* add mimo merge functionality

* add file suggestions by Levenshtein distance less than 3

* wip

* remove logs and add test files

* add drop down list with project suggestions

* add unit tests

* wip

* wip

* add unit tests

* add unit tests

* add unit tests

* proceed ktLintFormat

* add more unit tests

* add unit tests about invalid files with mimo merge

* change the test file

* feat(analysis): add parser dialog for mimo, cleanup workflow (#3742)

* feat(analysis): fix dialog tests, add compression question to mimo (#3742)

* feat(analysis): refactor mimo workflow, reduce duplication by extraction in main (#3742)

Add output project name question to mimo workflow. Restructure mimo workflow to fix a potential bug that occurs when one project name is misspelled and two are spelled correctly.

* feat(analysis): extract mimo functions to separate class (#3742)

* feat(analysis): add tests for mimo mode (#3742)

* feat(analysis): move mockk one layer down to test questions more directly (#3742)

* feat(analysis): format code (#3742)

* feat(analysis): update readme and gh-pages (#3742)

---------

Co-authored-by: VictoriaG
Co-authored-by: Sebastian Wolf
Co-authored-by: Christian Hühn
4 people authored Nov 28, 2024
1 parent 50fa1da commit 208787f
Showing 13 changed files with 1,702 additions and 73 deletions.
37 changes: 26 additions & 11 deletions analysis/filter/MergeFilter/README.md
@@ -2,7 +2,7 @@

**Category**: Filter (takes in multiple cc.json files and outputs single cc.json)

Reads the specified files and merges visualisation data.
Reads the specified files and merges visualization data.

The first file with visualisation data is used as reference for the merging strategy and a base for the output. The visualisation data in the additional json files (given they have the same API version) are fitted into this reference structure according to a specific strategy. Currently, there are two main strategies:

@@ -14,16 +14,19 @@ Both strategies will merge the unique list entries for `attributeTypes` and `bla

## Usage and Parameters

| Parameters | Description |
| ------------------------------- | ---------------------------------------- |
| `FILE` | files to merge |
| `-a, --add-missing` | enable adding missing nodes to reference |
| `-h, --help` | displays help and exits |
| `--ignore-case` | ignores case when checking node names |
| `--leaf` | use leaf merging strategy |
| `-nc, --not-compressed` | save uncompressed output File |
| `-o, --outputFile=<outputFile>` | output File (or empty for stdout) |
| `--recursive` | use recursive merging strategy (default) |
| Parameters | Description |
|---------------------------------|----------------------------------------------------------------------|
| `FILE` | files to merge |
| `-a, --add-missing` | [Leaf Merging Strategy] enable adding missing nodes to reference |
| `-h, --help` | displays help and exits |
| `--ignore-case` | ignores case when checking node names |
| `--leaf` | use leaf merging strategy |
| `-nc, --not-compressed` | save uncompressed output File |
| `-o, --output-file=<outputFile>` | output File (or empty for stdout; ignored in [MIMO mode]) |
| `--recursive` | use recursive merging strategy (default) |
| `--mimo` | merge multiple files with the same prefix into multiple output files |
| `-ld, --levenshtein-distance` | [MIMO mode] Levenshtein distance for name match suggestions (default: 3; 0 for no suggestions) |
| `-f` | force merge non-overlapping modules at the top-level structure |

```
Usage: ccsh merge [-ah] [--ignore-case] [--leaf] [-nc] [--recursive]
@@ -45,3 +48,15 @@ ccsh merge file1.cc.json ../foo/ -o=test.cc.json
```

This last example inputs the folder foo, which will result in all project files in that folder being merged with the reference file (file1.cc.json).

```
ccsh merge myProjectFolder/ --mimo -ld 0 -f
```

## MIMO - Multiple Inputs Multiple Outputs

Groups multiple `cc.json` files by their project prefix (e.g. **myProject**.git.cc.json) and merges each group into its own output file. Project names that differ only by small typos are suggested as matches, and you are asked which of them to include in the output.
In a CI/CD pipeline, specify `-ld 0` and `-f` so that the merge runs without prompting for user input.
Output files are named after the following schema: `myProject.merge.cc.json`.

> IMPORTANT: Output files are always written to the current working directory.
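
As a rough illustration of the matching described above, the following sketch groups file names by prefix and uses a Levenshtein distance to find prefixes that are probably typos of each other. It is not the `Mimo` class from this commit: `prefixOf`, `levenshtein`, `groupByPrefix` and `similarPrefixes` are hypothetical names, and treating everything before the first dot as the project prefix is an assumption.

```kotlin
// Hypothetical sketch; not the actual Mimo implementation.

// Assumption: the project prefix is everything before the first dot.
fun prefixOf(fileName: String): String = fileName.substringBefore('.')

// Dynamic-programming Levenshtein distance (insertions, deletions, substitutions).
fun levenshtein(a: String, b: String): Int {
    var previous = IntArray(b.length + 1) { it }
    for (i in 1..a.length) {
        val current = IntArray(b.length + 1)
        current[0] = i
        for (j in 1..b.length) {
            val cost = if (a[i - 1] == b[j - 1]) 0 else 1
            current[j] = minOf(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost)
        }
        previous = current
    }
    return previous[b.length]
}

// Groups file names that share exactly the same prefix.
fun groupByPrefix(fileNames: List<String>): Map<String, List<String>> =
    fileNames.groupBy(::prefixOf)

// Pairs of distinct prefixes whose distance is at most maxDistance.
// With maxDistance = 0 the range 1..0 is empty, so nothing is suggested.
fun similarPrefixes(prefixes: Collection<String>, maxDistance: Int): List<Pair<String, String>> {
    val sorted = prefixes.distinct().sorted()
    return sorted.flatMapIndexed { i, first ->
        sorted.drop(i + 1)
            .filter { second -> levenshtein(first, second) in 1..maxDistance }
            .map { second -> first to second }
    }
}

fun main() {
    val files = listOf(
        "myProject.git.cc.json",
        "myProject.sonar.cc.json",
        "myProjct.tokei.cc.json",
        "other.git.cc.json"
    )
    val groups = groupByPrefix(files)
    println(groups.keys)
    println(similarPrefixes(groups.keys, maxDistance = 3))
}
```

In this sketch, a distance of 0 disables suggestions entirely, which corresponds to the non-interactive use of `-ld 0` mentioned above.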
@@ -1,5 +1,6 @@
package de.maibornwolff.codecharta.filter.mergefilter

import de.maibornwolff.codecharta.filter.mergefilter.mimo.Mimo
import de.maibornwolff.codecharta.model.Project
import de.maibornwolff.codecharta.serialization.ProjectDeserializer
import de.maibornwolff.codecharta.serialization.ProjectSerializer
@@ -27,16 +28,16 @@ class MergeFilter(
@CommandLine.Parameters(arity = "1..*", paramLabel = "FILE or FOLDER", description = ["files to merge"])
private var sources: Array<File> = arrayOf()

@CommandLine.Option(names = ["-a", "--add-missing"], description = ["enable adding missing nodes to reference"])
@CommandLine.Option(names = ["-a", "--add-missing"], description = ["[Leaf Merging Strategy] enable adding missing nodes to reference"])
private var addMissingNodes = false

@CommandLine.Option(names = ["--recursive"], description = ["recursive merging strategy (default)"])
@CommandLine.Option(names = ["--recursive"], description = ["Recursive Merging Strategy (default)"])
private var recursiveStrategySet = true

@CommandLine.Option(names = ["--leaf"], description = ["leaf merging strategy"])
@CommandLine.Option(names = ["--leaf"], description = ["Leaf Merging Strategy"])
private var leafStrategySet = false

@CommandLine.Option(names = ["-o", "--output-file"], description = ["output File (or empty for stdout)"])
@CommandLine.Option(names = ["-o", "--output-file"], description = ["output File (or empty for stdout; ignored in MIMO mode)"])
private var outputFile: String? = null

@CommandLine.Option(names = ["-nc", "--not-compressed"], description = ["save uncompressed output File"])
@@ -48,6 +49,15 @@ class MergeFilter(
@CommandLine.Option(names = ["-f"], description = ["force merge non-overlapping modules at the top-level structure"])
private var mergeModules = false

@CommandLine.Option(names = ["--mimo"], description = ["merge multiple files with the same prefix into multiple output files"])
private var mimo = false

@CommandLine.Option(
names = ["-ld", "--levenshtein-distance"],
description = ["[MIMO mode] levenshtein distance for name match suggestions (default: 3; 0 for no suggestions)"]
)
private var levenshteinDistance = 3

override val name = NAME
override val description = DESCRIPTION

@@ -73,41 +83,41 @@ class MergeFilter(
val nodeMergerStrategy = when {
leafStrategySet -> LeafNodeMergerStrategy(addMissingNodes, ignoreCase)
recursiveStrategySet && !leafStrategySet -> RecursiveNodeMergerStrategy(ignoreCase)
else -> throw IllegalArgumentException("Only one merging strategy must be set")
else -> throw IllegalArgumentException("At least one merging strategy must be set")
}

if (!InputHelper.isInputValid(sources, canInputContainFolders = true)) {
throw IllegalArgumentException("Input invalid files/folders for MergeFilter, stopping execution...")
}

val sourceFiles = InputHelper.getFileListFromValidatedResourceArray(sources)

val rootChildrenNodes = sourceFiles.mapNotNull {
val input = it.inputStream()
try {
ProjectDeserializer.deserializeProject(input)
} catch (e: Exception) {
Logger.warn { "${it.name} is not a valid project file and will be skipped." }
null
}
if (mimo) {
processMimoMerge(sourceFiles, nodeMergerStrategy)
} else {
val projects = readInputFiles(sourceFiles)
if (!continueIfIncompatibleProjects(projects)) return null

val mergedProject = ProjectMerger(projects, nodeMergerStrategy).merge()
ProjectSerializer.serializeToFileOrStream(mergedProject, outputFile, output, compress)
}

if (!mergeModules) {
if (!hasTopLevelOverlap(rootChildrenNodes)) {
printOverlapError(rootChildrenNodes)
return null
}

val continueMerge = ParserDialog.askForceMerge()
private fun continueIfIncompatibleProjects(projects: List<Project>): Boolean {
if (mergeModules) return true
if (!hasTopLevelOverlap(projects)) {
printOverlapError(projects)

if (!continueMerge) {
Logger.info { "Merge cancelled by the user." }
return null
}
val continueMerge = ParserDialog.askForceMerge()

if (!continueMerge) {
Logger.info { "Merge cancelled by the user." }
return false
}
}

val mergedProject = ProjectMerger(rootChildrenNodes, nodeMergerStrategy).merge()
ProjectSerializer.serializeToFileOrStream(mergedProject, outputFile, output, compress)

return null
return true
}

private fun hasTopLevelOverlap(projects: List<Project>): Boolean {
@@ -130,4 +140,48 @@ class MergeFilter(
override fun isApplicable(resourceToBeParsed: String): Boolean {
return false
}

private fun processMimoMerge(sourceFiles: List<File>, nodeMergerStrategy: NodeMergerStrategy) {
val groupedFiles: List<Pair<Boolean, List<File>>> = Mimo.generateProjectGroups(sourceFiles, levenshteinDistance)

groupedFiles.forEach { (exactMatch, files) ->
val confirmedFileList = if (exactMatch) {
files
} else {
ParserDialog.requestMimoFileSelection(files)
}
if (confirmedFileList.size <= 1) {
Logger.info { "Continuing with next group because one or fewer files were selected" }
return@forEach
}

val projects = readInputFiles(confirmedFileList)
if (projects.size <= 1) {
Logger.warn { "After deserializing, one or fewer projects remained. Continuing with next group" }
return@forEach
}

if (!continueIfIncompatibleProjects(projects)) return@forEach

val mergedProject = ProjectMerger(projects, nodeMergerStrategy).merge()
val outputFilePrefix = Mimo.retrieveGroupName(confirmedFileList)
ProjectSerializer.serializeToFileOrStream(mergedProject, "$outputFilePrefix.merge.cc.json", output, compress)
Logger.info {
"Merged files with prefix '$outputFilePrefix' into" +
" '$outputFilePrefix.merge.cc.json${if (compress) ".gz" else ""}'"
}
}
}

private fun readInputFiles(files: List<File>): List<Project> {
return files.mapNotNull {
val input = it.inputStream()
try {
ProjectDeserializer.deserializeProject(input)
} catch (e: Exception) {
Logger.warn { "${it.name} is not a valid project file and will be skipped." }
null
}
}
}
}
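
For illustration, the naming rule that `processMimoMerge` logs can be written as a small pure function. This is a sketch, not code from the commit: `mimoOutputName` is a hypothetical helper, and it assumes that compressed output receives a `.gz` suffix, as the log message suggests.

```kotlin
// Hypothetical helper; mirrors the file name built in the log message of processMimoMerge.
fun mimoOutputName(prefix: String, compress: Boolean): String {
    val base = "$prefix.merge.cc.json"
    // Assumption: compressed output gets a ".gz" suffix, as the log message implies.
    return if (compress) "$base.gz" else base
}

fun main() {
    println(mimoOutputName("myProject", compress = true))
    println(mimoOutputName("myProject", compress = false))
}
```

Keeping the name in one place like this would avoid building the same string twice, once for the serializer call and once for the log message.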
@@ -1,8 +1,12 @@
package de.maibornwolff.codecharta.filter.mergefilter

import com.github.kinquirer.KInquirer
import com.github.kinquirer.components.promptCheckboxObject
import com.github.kinquirer.components.promptConfirm
import com.github.kinquirer.components.promptInput
import com.github.kinquirer.components.promptInputNumber
import com.github.kinquirer.components.promptList
import com.github.kinquirer.core.Choice
import de.maibornwolff.codecharta.tools.interactiveparser.InputType
import de.maibornwolff.codecharta.tools.interactiveparser.ParserDialogInterface
import de.maibornwolff.codecharta.util.InputHelper
@@ -16,41 +20,79 @@ class ParserDialog {
inputFolderName = getInputFileName("cc.json", InputType.FOLDER)
} while (!InputHelper.isInputValidAndNotNull(arrayOf(File(inputFolderName)), canInputContainFolders = true))

val outputFileName: String =
KInquirer.promptInput(
message = "What is the name of the output file?"
val isMimoMode = KInquirer.promptConfirm(
message = "Do you want to use MIMO mode? (multiple inputs multiple outputs)",
default = false
)

var outputFileName = ""
val isCompressed: Boolean
var levenshteinDistance = 0
if (isMimoMode) {
levenshteinDistance = KInquirer.promptInputNumber(
message = "Select Levenshtein Distance for name match suggestions (0 for no suggestions)",
default = "3"
).toInt()

isCompressed = KInquirer.promptConfirm(
message = "Do you want to compress the output file(s)?",
default = true
)
} else {
outputFileName =
KInquirer.promptInput(
message = "What is the name of the output file?"
)

val isCompressed =
(outputFileName.isEmpty()) ||
isCompressed =
(outputFileName.isEmpty()) ||
KInquirer.promptConfirm(
message = "Do you want to compress the output file?",
default = true
)
}

val addMissing: Boolean =
KInquirer.promptConfirm(message = "Do you want to add missing nodes to reference?", default = false)

val recursive: Boolean =
KInquirer.promptConfirm(message = "Do you want to use recursive merge strategy?", default = true)
val leafMergingStrategy = "Leaf Merging Strategy"
val recursiveMergingStrategy = "Recursive Merging Strategy"
val strategy = KInquirer.promptList(
message = "Which merging strategy should be used?",
choices = listOf(recursiveMergingStrategy, leafMergingStrategy),
hint = "Default is 'Recursive Merging Strategy'"
)

val leaf: Boolean =
KInquirer.promptConfirm(message = "Do you want to use leaf merging strategy?", default = false)
var leafFlag = false
var addMissing = false
if (strategy == leafMergingStrategy) {
leafFlag = true
addMissing = KInquirer.promptConfirm(
message = "Do you want to add missing nodes to reference?",
default = false
)
}

val ignoreCase: Boolean =
KInquirer.promptConfirm(
message = "Do you want to ignore case when checking node names?",
default = false
)

return listOf(
val basicMergeConfig = listOf(
inputFolderName,
"--output-file=$outputFileName",
"--not-compressed=$isCompressed",
"--add-missing=$addMissing",
"--recursive=$recursive",
"--leaf=$leaf",
"--ignore-case=$ignoreCase"
"--recursive=${!leafFlag}",
"--leaf=$leafFlag",
"--ignore-case=$ignoreCase",
"--not-compressed=$isCompressed"
)

if (isMimoMode) {
return basicMergeConfig + listOf(
"--mimo=true",
"--levenshtein-distance=$levenshteinDistance"
)
}
return basicMergeConfig + listOf(
"--output-file=$outputFileName"
)
}

@@ -60,5 +102,22 @@ class ParserDialog {
default = false
)
}

fun askForMimoPrefix(prefixOptions: Set<String>): String {
return KInquirer.promptList(
message = "Which prefix should be used for the output file?",
choices = prefixOptions.toList()
)
}

fun requestMimoFileSelection(files: List<File>): List<File> {
val choiceList = files.map { Choice(it.name, it) }
return KInquirer.promptCheckboxObject(
message = "Which files should be merged? [Enter = Confirm, Space = Select]",
choices = choiceList,
hint = "Not selected files will not get merged",
pageSize = 10
)
}
}
}
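
The argument assembly at the end of the dialog can be looked at in isolation from the prompts. The following is a minimal sketch under stated assumptions: `MergeAnswers` and `buildMergeArgs` are hypothetical names that do not exist in the commit, and the answers are passed in as plain values instead of being collected through KInquirer.

```kotlin
// Hypothetical, prompt-free version of the argument assembly shown in the dialog above.
data class MergeAnswers(
    val input: String,
    val useLeaf: Boolean,
    val addMissing: Boolean,
    val ignoreCase: Boolean,
    val notCompressed: Boolean,
    val mimo: Boolean,
    val levenshteinDistance: Int = 3,
    val outputFile: String = ""
)

fun buildMergeArgs(answers: MergeAnswers): List<String> {
    // Arguments shared by both modes.
    val common = listOf(
        answers.input,
        "--add-missing=${answers.addMissing}",
        "--recursive=${!answers.useLeaf}",
        "--leaf=${answers.useLeaf}",
        "--ignore-case=${answers.ignoreCase}",
        "--not-compressed=${answers.notCompressed}"
    )
    // MIMO mode derives its output names itself, so no --output-file is passed.
    return if (answers.mimo) {
        common + listOf("--mimo=true", "--levenshtein-distance=${answers.levenshteinDistance}")
    } else {
        common + "--output-file=${answers.outputFile}"
    }
}

fun main() {
    val answers = MergeAnswers(
        input = "myProjectFolder/",
        useLeaf = false,
        addMissing = false,
        ignoreCase = false,
        notCompressed = false,
        mimo = true,
        levenshteinDistance = 0
    )
    println(buildMergeArgs(answers).joinToString(" "))
}
```

Separating the assembly from the prompts in this way makes the two branches (MIMO and single-output) testable without mocking the interactive library.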