v0.1.28
This release switches doc_ids from UUIDs to NanoIDs, adds heuristics for deriving document titles from section headers, and improves stability and performance.
What's Changed
- adding one shot prompting along with multimodal request by @Soeb-aryn in #1023
- Fix query-ui dependency on boto3 and re-lock. by @mdwelsh in #1028
- Updated NTSB queries and ground truth for CIDR-25 paper. by @mdwelsh in #1026
- Add streaming support and tests for query-server. by @mdwelsh in #1027
- Supply element types in output from MarkedMerger. by @alexaryn in #1031
- Fix SummarizeData so that downstream .materialize operations will work. by @mdwelsh in #1030
- add nanoid by @HenryL27 in #1034
- Removed duplicate code in query execution. by @akarshgupta7 in #1035
- Convert docids from UUID to NanoID. by @alexaryn in #1032
- Use NanoIDs in file_scan. by @alexaryn in #1036
- extract table properties prompt & bug fix by @Soeb-aryn in #1037
- Convert DocIDs to UUIDs for Qdrant & Weaviate; unit tests. by @alexaryn in #1038
- heuristics to get title from section headers by @Soeb-aryn in #1033
- updating function in pdf_miner class by @Soeb-aryn in #1041
- Added ragas to compute string metrics for evaluation. by @akarshgupta7 in #1039
- Fix sort so that it works with an unspecified or None default_value. by @eric-anderson in #1040
- Added correctness score to the metrics. by @akarshgupta7 in #1043
- Query planner improvements by @baitsguy in #1046
- Fix materialize to tolerate an empty input directory in ray mode by @eric-anderson in #1045
- PR fix by @baitsguy in #1047
- disable vectorsearch rerank by default in query by @baitsguy in #1048
- vectorsearch planner prompt changes by @baitsguy in #1049
- Make OpenAIEmbedder serializable after client has been initialized. by @bsowell in #1050
- Rename Embedding in ElasticSearch Notebook by @karanataryn in #1051
- Add deformable table extractor by @HenryL27 in #1053
- Add helper for thread local variables that can be used to add metadata to the output stream by @eric-anderson in #1052
- Propagate element level llm_filter output to doc.properties by @baitsguy in #1054
- Handle military clock time (0800) in time standardizer. by @alexaryn in #1056
- Fix incorrect docstring for promote-certain-elements-to-title feature by @MarkLindblad in #1057
- adding parameter for API in sdk and remote_partitioner by @Soeb-aryn in #1042
- bump sycamore version to 0.1.28 by @HenryL27 in #1058
- bump aryn sdk version to 0.1.10 by @HenryL27 in #1059
- don't die if box is None in try_draw_boxes by @HenryL27 in #1060
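
To illustrate the military clock time entry in the list above (#1056), here is a minimal sketch of reading a four-digit 24-hour string such as 0800 as a time of day. It is not Sycamore's time standardizer: the function name `parse_military_time` and the pattern `_MILITARY` are hypothetical, and the sketch assumes the input is exactly four digits with no separator.

```python
import re
from datetime import time

# Hypothetical helper for illustration only; this is NOT Sycamore's
# time standardizer, just one way to read "0800"-style clock strings.
# Assumption: input is exactly four digits, HHMM, on a 24-hour clock.
_MILITARY = re.compile(r"^([01]\d|2[0-3])([0-5]\d)$")


def parse_military_time(text: str) -> time:
    """Parse a four-digit 24-hour clock string such as '0800' into a time."""
    match = _MILITARY.match(text.strip())
    if match is None:
        raise ValueError(f"not a four-digit 24-hour time: {text!r}")
    hour, minute = (int(part) for part in match.groups())
    return time(hour=hour, minute=minute)


print(parse_military_time("0800").isoformat(timespec="minutes"))  # 08:00
print(parse_military_time("2359").isoformat(timespec="minutes"))  # 23:59
```

Rejecting out-of-range values such as 2460 with a `ValueError`, rather than guessing, keeps malformed input from turning silently into a wrong timestamp.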
New Contributors
- @akarshgupta7 made their first contribution in #1035
Full Changelog: v0.1.27...v0.1.28