Releases: aryn-ai/sycamore

v0.1.20

06 Sep 18:39
19a1520

This release refactors Sycamore’s dependencies to use extras, so that the dependencies for connectors and local inference (e.g. creating vector embeddings) are installed only when requested. For example, to use the OpenSearch connector, run: pip install sycamore-ai[opensearch]. To run a local vector embedding model, run: pip install sycamore-ai[local-inference]. To do both, run: pip install sycamore-ai[opensearch,local-inference].
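
Because these dependencies are now optional, code that relies on one can check for it up front and point to the matching extra. The sketch below is illustrative and not part of Sycamore: the helper names missing_extras and install_hint are hypothetical, and the import names opensearchpy and sentence_transformers are assumptions about what the opensearch and local-inference extras install.

```python
import importlib.util

# Assumed mapping from extra name to a module that extra is expected to
# provide. The module names are guesses for illustration, not taken from
# Sycamore's packaging metadata.
ASSUMED_EXTRA_MODULES = {
    "opensearch": "opensearchpy",
    "local-inference": "sentence_transformers",
}


def missing_extras(extra_modules=ASSUMED_EXTRA_MODULES):
    """Return the extras whose probe module cannot be found, sorted by name."""
    return sorted(
        extra
        for extra, module in extra_modules.items()
        if importlib.util.find_spec(module) is None
    )


def install_hint(extras):
    """Build the pip command that would install the given extras."""
    if not extras:
        return None
    # Single quotes keep shells such as zsh from globbing the brackets.
    return f"pip install 'sycamore-ai[{','.join(extras)}]'"


if __name__ == "__main__":
    hint = install_hint(missing_extras())
    print(hint or "All probed extras appear to be installed.")
```

The check uses importlib.util.find_spec, which looks a module up without importing it, so probing for a heavy dependency stays cheap.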

This release also includes performance and stability improvements.

Full Changelog: v0.1.19...v0.1.20

v0.1.19

27 Aug 19:02
b7b6916

This release adds a materialize operation and enhanced query functionality, along with stability and performance improvements. It also includes an experimental Neo4j writer.

v0.1.18

30 Jul 15:11
202fb0e

This Sycamore release contains a variety of new features, including interfaces for reading from and writing to vector stores, with implementations for OpenSearch, DuckDB, Elasticsearch, Pinecone, and Weaviate. This release also contains performance enhancements, dependency upgrades, and bug fixes.

This release coincides with the launch of the Aryn Partitioning Service, which provides an endpoint for partitioning PDFs. This service is integrated with Sycamore and free to try at https://www.aryn.ai/get-started.

v0.1.17

06 Jun 18:21
a217c92

This Sycamore release contains new writers for the Weaviate and Pinecone vector databases, enhancements to the demo UI, and numerous small features and bug fixes.

Full Changelog: v0.1.16...v0.1.17

v0.1.16

07 May 04:41
f1e4ed0

This release adds support to the SycamorePartitioner for extracting table structure and images, as well as a new transform for summarizing images. It also includes a number of bug fixes and enhancements.

What's Changed

  • fix ui error when no title is extracted and we're not in ntsb setting by @HenryL27 in #352
  • Fix almost all the pyproject.toml and poetry.lock files to have consistent requirements on python dependencies. by @eric-anderson in #345
  • Bind mount to convey SSL cert/key to Jupyter & UI by @alexaryn in #349
  • Use real SSL certificate for OpenSearch HTTP. by @alexaryn in #353
  • copy lib/poetry-lock into containers to make poetry happy by @HenryL27 in #354
  • copy lib/poetry-lock into remote-processor-service too. by @HenryL27 in #355
  • copy in all of poetry-lock, not just the pyproject files by @HenryL27 in #356
  • Update data model for table structure recognition. by @bsowell in #357
  • Put token-protected HTTPS proxy in front of UI proxy. by @alexaryn in #359
  • Arxiv switched to HTTP for these PDFs; make it work. by @alexaryn in #360
  • Add apt update to UI Dockerfiles. by @alexaryn in #361
  • Use chown in our copy commands to make sure all files are owned by app by @eric-anderson in #362
  • Add TableStructureExtractor interface and TableTransformer impl. by @bsowell in #358
  • fix zsh path by @eric-anderson in #367
  • Jupyter container improvements by @eric-anderson in #369
  • Don't say localhost if it's not going to work. by @alexaryn in #366
  • bump deploy timeout for reranking model from 60 to 120 by @HenryL27 in #363
  • ingest all ntsb docs, automatically detect docker v not, spread path … by @HenryL27 in #368
  • Fix typos in README by @hsm207 in #370
  • Fix default prep script when given an empty directory to import by @HenryL27 in #371
  • fix typo by @HenryL27 in #372
  • Add the ability to summarize images to partitioned docsets. by @bsowell in #365
  • Store element bbox as a tuple rather than BoundingBox. by @bsowell in #374
  • Jonfritz patch 1 partition update by @jonfritz in #376
  • FIX: Error on initiate conversation without a conversation id by @sohamkasar19 in #375
  • Add API docs for the SycamorePartitioner and table extraction. by @bsowell in #373
  • Fix malformed text from beautiful soup. by @bohou-aryn in #351
  • Handle deserializing JSON documents when elements is None. by @bsowell in #377
  • Bump sycamore version to 0.1.16 by @bsowell in #378

Full Changelog: v0.1.15...v0.1.16

v0.1.15

11 Apr 23:58

This release adds support for writing DocSets to JSONL files, as well as other incremental features and bug fixes.
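
JSONL stores one JSON object per line, which makes large document collections easy to stream and append to. The sketch below shows the format itself using only the standard library. It is not Sycamore's writer, and the function names write_jsonl and read_jsonl are hypothetical.

```python
import json
from pathlib import Path


def write_jsonl(records, path):
    """Write an iterable of JSON-serializable dicts, one object per line."""
    with Path(path).open("w", encoding="utf-8") as f:
        for record in records:
            # json.dumps escapes embedded newlines, so each record stays on one line.
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Yield one dict per non-empty line of a JSONL file."""
    with Path(path).open("r", encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)
```

Reading line by line means a consumer never has to hold the whole file in memory, which is the main reason to prefer JSONL over a single JSON array for document sets.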

Full Changelog: v0.1.14...v0.1.15

v0.1.14

02 Apr 19:38
8b7190b

This release includes CPU support and OCR in the Sycamore Partitioner, caching for better performance and lower cost when using Textract for table extraction, an upgraded version of Ray (2.10), and more.

Full Changelog: v0.1.13...v0.1.14

v0.1.13

15 Mar 21:28
88c691b

This release upgrades the Sycamore docker containers to use OpenSearch 2.12 and adds support for SSL. It also includes significant additions to the Sycamore documentation (https://sycamore.readthedocs.io/), and a number of other features and bug fixes.

Full Changelog: v0.1.12...v0.1.13

v0.1.12

09 Feb 01:10

This release adds components to Sycamore to enable search and analytics use cases, beyond data preparation. Sycamore can now be deployed using Docker containers, and you can also download the Python libraries for data preparation. The documentation has also been updated to reflect this change in scope.

This release also has other features and bug fixes.

Full Changelog: v0.1.11...v0.1.12

v0.1.11

03 Jan 20:12

This release removes support for OpenAI's text-davinci-003 model, which OpenAI will shut down on 1/4/24, and replaces it with gpt-3.5-turbo-instruct. All users of Sycamore should upgrade.

What's Changed

  • Migrate from text-davinci-003 to gpt-3.5-turbo-instruct. by @bsowell in #202
  • Bump version to v0.1.11. by @bsowell in #203

Full Changelog: v0.1.10...v0.1.11