v0.1.19
This release adds a materialize operation and enhanced query functionality, along with stability and performance improvements.
It also adds an experimental Neo4j writer.
What's Changed
- Add comment to MetadataDocument superclass call by @eric-anderson in #607
- Update Copyright Year by @karanataryn in #617
- Merge elements in ntsb test ingest by @baitsguy in #619
- Add github ref name to Helicone logs. by @mdwelsh in #618
- Add local (no-ray) execution mode to speed up lineage development by @eric-anderson in #616
- Integrate LLM Extract logic into Sycamore Transforms by @tranade in #608
- parse html tables better by @HenryL27 in #621
- Change Aryn-SDK Error Message by @karanataryn in #622
- Small update to `field_to_value` by @tranade in #620
- Jonfritz patch docsupdate by @jonfritz in #624
- Avoid repeat take_all in Eval Pipeline by @aanya-p in #611
- Add Evaluate as Transform by @Soeb-aryn in #487
- Checking in notebook that calls APS to analyze financial document (10k). by @AbhijitP-009 in #626
- Refactor LogicalOperators to use pydantic. by @mdwelsh in #610
- Added Entity Extractor + HierarchicalDocument by @RitxmSaha in #601
- Rename SycamorePartitionerExample.ipynb to ArynPartitionerExample.ipynb by @jonfritz in #628
- Add LLMFilter as a DocSet Transform by @tranade in #623
- Create `count_distinct` for DocSet by @tranade in #625
- Jonfritz patch 3 update readme by @jonfritz in #629
- Update get_hash_context_file func by @pparmar30 in #603
- Change PDFMiner cache to $HOME/.sycamore/PDFMinerCache. by @mdwelsh in #634
- Add Context.config by @baitsguy in #627
- Fixup git repo from accidental pushes via overrides by @eric-anderson in #636
- Include match and range filter functions by @tranade in #630
- uncap python version for aryn-sdk by @HenryL27 in #638
- Add support to materialize to write documents out to files. by @eric-anderson in #640
- Refactor OpenSearchSchema to be more robust. by @mdwelsh in #639
- reading env variable as suggested and cosmetic changes by @Soeb-aryn in #609
- Fix code execution and trace display in Query UI by @tranade in #646
- Added OpenAI Async client by @RitxmSaha in #632
- A couple of small tweaks to make Sycamore more robust to missing or bogus data. by @mdwelsh in #649
- Add generic traverse by @eric-anderson in #648
- Shift more operations by @tranade in #631
- Refactor Context and support args in Map by @baitsguy in #637
- Changed OpenAI Cache integration test by @RitxmSaha in #651
- Run poetry-lock-all until the dependencies became consistent. by @eric-anderson in #652
- Fix range filter problem by @tranade in #654
- Fix codegen syntax/formatting by @baitsguy in #655
- fix table html parsing edge case by @HenryL27 in #656
- Code executor by @baitsguy in #657
- Switch Luna tracing to use materialize. by @mdwelsh in #653
- Add documentation on output of Aryn Partitioning Service by @MarkLindblad in #633
- Revamp Sycamore Query demo UI. by @mdwelsh in #659
- Adding new docs for Aryn Partitioning Service. Added a gentle introduction to APS docs and rearranged some of the existing APS docs. by @AbhijitP-009 in #660
- Bugfix for query dry-run mode by @baitsguy in #661
- Codegen with traces in UI by @tranade in #658
- Neo4j Writer by @RitxmSaha in #650
- Fixing the title for introduction page and making the 'specifying options' section its own page for APS docs by @AbhijitP-009 in #662
- Jonfritz patch 3 docs update by @jonfritz in #644
- Rename gentle_introduction.md to get_started.md by @jonfritz in #664
- Fixing APS docs to link to right doc also, making specifying options its own page by @AbhijitP-009 in #666
- Initial cut at chat UI. by @mdwelsh in #663
- Changing main title and reordering left pane documentation by @AbhijitP-009 in #669
- updated openai dependency by @RitxmSaha in #665
- Add support for subtasks by @aanya-p in #587
- Jonfritz sycamoredocsupdate by @jonfritz in #671
- Support arbitrary conversion to binary in materialize by @eric-anderson in #672
- implemented boilerplate transforms of documents. by @RitxmSaha in #668
- Add convert_file_to_pdf helper using libreoffice by @baitsguy in #670
- Origin/jonfritz patch 4 docs by @jonfritz in #673
- adding transforms and updating Ntsb demo notebooks by @Soeb-aryn in #645
- Making minor edits to the docs by @AbhijitP-009 in #679
- Fix Docs by @karanataryn in #675
- Add filter docs. by @bsowell in #682
- Change the file writer to create the output directory. by @bsowell in #681
- Add ssl_verify param to aryn-sdk by @HenryL27 in #684
- Add AutoMaterialize by @eric-anderson in #680
- Update Github integ test runner. by @bsowell in #683
- Update tutorial and remove old tutorial from ToC by @jonfritz in #676
- Fix flaky test. by @eric-anderson in #688
- import sycamore does not import ray by @eric-anderson in #687
- Update specifying_options.rst default threshold by @sohamkasar19 in #689
- Fix `ArynPartitioner` integration test by @MarkLindblad in #691
- Rename lineage files to materialize. by @eric-anderson in #693
- Defer model initialization. by @bsowell in #692
- Fixed a problem in detr_partitioner.py by @afriedman412 in #694
- Fixed an error in file_writer_ray.py by @afriedman412 in #696
- Fill in "gaps" in non-contiguous rows and columns from TATR. by @bsowell in #697
- Updated Entity Extractor + Infrastructure Changes by @RitxmSaha in #677
- corrected choosing beta client for openai by @RitxmSaha in #698
- added extract document structure by @RitxmSaha in #699
- Return an empty table when table transformers don't find a table. by @bsowell in #701
- Fix Override Text Bug by @karanataryn in #690
- SimplePrompt class by @baitsguy in #702
- Add IF_PRESENT reading mode to materialize by @eric-anderson in #703
- More dependencies at runtime. ~2x speedup on import sycamore test by @eric-anderson in #706
- implemented resolve graph entities by @RitxmSaha in #700
- add convert_image function by @HenryL27 in #708
- bump sdk to 0.1.3 by @HenryL27 in #709
- Fix bugs in json-encoding of documents by @eric-anderson in #713
- make sure writers finalize by @HenryL27 in #711
- Remove Override_Text by @karanataryn in #714
- Add Pinecone Source Tag by @karanataryn in #715
- Chunker by @dtecuci in #571
- fix import by @HenryL27 in #716
- Add LLM Query Docs and Notebook links by @karanataryn in #717
- implement extract_graph_relationships + tests + refactor by @RitxmSaha in #705
- Fixing the colab links to point to notebooks that are authored by [email protected] by @AbhijitP-009 in #722
- Fix Type Checking Bug by @karanataryn in #720
- Add Echo Guidance Flag to suppress IPython HTML Outputs by @karanataryn in #721
- represent greedysectionmerger in docs by @HenryL27 in #723
- bump sycamore version from 0.1.18 to 0.1.19 by @HenryL27 in #719
- Add source_mode documentation for materialize by @eric-anderson in #725
- Fix Nonetype Error by @dhruvkaliraman7 in #727
- add merge docs by @HenryL27 in #724
- mark 'experimental' features in docs by @HenryL27 in #729
- Fix Broken Docs by @karanataryn in #726
- More materialize documentation by @eric-anderson in #730
- Add Local Mode Support for Composite Transform and Binary Read by @karanataryn in #712
New Contributors
- @karanataryn made their first contribution in #617
- @AbhijitP-009 made their first contribution in #626
- @afriedman412 made their first contribution in #694
- @dtecuci made their first contribution in #571
- @dhruvkaliraman7 made their first contribution in #727
Full Changelog: v0.1.18...v0.1.19