Add BM25 and TFIDF Scoring to the text index #1688

Open
wants to merge 71 commits into
base: master

Flixtastic
Contributor

When building the text index, one can specify which scoring metric to use. The chosen metric determines how each score is calculated during index building. During retrieval, the calculated scores are returned and can be used to rank the documents containing the search words by relevance.
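For readers unfamiliar with the two metrics, here is a minimal sketch of one common textbook formulation of TF-IDF and BM25. It is not the code from this PR: the names `idf`, `tfIdf`, `bm25` and `Bm25Params`, the base-2 logarithm, and the defaults `k = 1.2` and `b = 0.75` are all assumptions chosen for illustration, and the PR may use a different IDF variant or different parameters.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical parameter struct (not from the PR). The defaults are
// commonly cited BM25 values, assumed here for illustration only.
struct Bm25Params {
  double k = 1.2;   // term-frequency saturation
  double b = 0.75;  // strength of the document-length normalization
};

// Inverse document frequency: words that occur in few documents score higher.
// `docFreq` is the number of documents that contain the word.
double idf(std::size_t numDocs, std::size_t docFreq) {
  assert(docFreq > 0 && docFreq <= numDocs);
  return std::log2(static_cast<double>(numDocs) /
                   static_cast<double>(docFreq));
}

// TF-IDF: the term frequency in the document, weighted by the IDF.
double tfIdf(double termFreq, std::size_t numDocs, std::size_t docFreq) {
  return termFreq * idf(numDocs, docFreq);
}

// BM25: like TF-IDF, but the term frequency saturates (controlled by k) and
// documents longer than the average are penalized (controlled by b).
double bm25(double termFreq, double docLength, double avgDocLength,
            std::size_t numDocs, std::size_t docFreq,
            Bm25Params params = {}) {
  assert(avgDocLength > 0.0);
  const double lengthNorm =
      1.0 - params.b + params.b * (docLength / avgDocLength);
  return idf(numDocs, docFreq) * termFreq * (params.k + 1.0) /
         (termFreq + params.k * lengthNorm);
}
```

In this sketch, a word that occurs once in a document of average length gets a BM25 score equal to its IDF, and the same word in a longer document gets a lower score. Sorting the retrieved documents by such a score in descending order gives the relevance ranking described above.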

Flixtastic and others added 30 commits July 12, 2024 03:12
Commit doesn't contain all changes necessary for pull request yet.
…x. This is done by passing the words file and docs file as strings and then building the text index as normal. A basic test exists (TODO: add more edge-case tests) and e2e testing is fixed.
…re still unstable because of the way nofContexts are counted. Implemented new, more refined tests.
…o the wordsFileContent and docsFileContent strings. Now you can clearly see which lines are added, and writing tests is cleaner
…in the wordsFileContent and docsFileContent as pair contentsOfWordsFileAndDocsFile
Flixtastic and others added 28 commits January 2, 2025 11:22
… breaking changes. Also finished outsourcing the Scoring
…Also added the TextScoringEnum files to get the text scoring metric enum from a string and the other way around.
…sts in WordsAndDocsFileParserTest.cpp. Renamed methods in WordsAndDocsFileLineCreator.h to reduce ambiguity. Incorporated the small changes requested in the PR review.
Signed-off-by: Johannes Kalmbach
This reverts commit dfff837, reversing
changes made to a4e9509.
@sparql-conformance
2 participants