v0.1.9

github-actions released this 08 Dec 01:04
· 970 commits to main since this release

This Sycamore release improves the heuristics for partitioning documents and adds a new method for automatically inferring which entities to extract from unstructured documents. It also includes incremental features and bug fixes.

What's Changed

  • Change the default merge size to 256. by @eric-anderson in #178
  • Simplify running the http crawler. by @eric-anderson in #180
  • Fix text chunking for html importing to improve result quality. by @eric-anderson in #185
  • Remove docker_compose and opensearch files. They were moved to quickstart. by @eric-anderson in #183
  • Change simple_ingest and s3_ingest to use GTE-small embedding model. by @alexaryn in #169
  • Remove unneeded mapping in OpenSearch index settings. by @alexaryn in #186
  • Added HTML ingest example. Fixed order in S3 ingester. by @alexaryn in #188
  • Simple transform to perform regex replacement on Elements. by @alexaryn in #187
  • Update README.md by @jonfritz in #179
  • Entity Extraction by @mkyl in #161
  • Merging/breaking elements based on heuristics including bbox by @alexaryn in #171
  • Update aiohttp and cryptography to address dependabot alerts. by @bsowell in #192
  • Bump version to v0.1.9. by @bsowell in #191

Full Changelog: v0.1.8...v0.1.9