docs: add Docling loader docs #29104

Merged (3 commits) on Jan 9, 2025

vagenas (Contributor) commented Jan 8, 2025

Description

This adds the docs for the Docling document loader.
Docling parses PDF, DOCX, PPTX, HTML, and other formats into a rich, unified representation that includes document layout, tables, and more, making them ready for generative AI workflows such as RAG.

Some references:

The introduced DoclingLoader enables users to:

  • use various document types in their LLM applications with ease and speed, and
  • leverage Docling's rich representation for advanced, document-native grounding.

Issue

Replacing PR #27987 as discussed with @efriis here.

Dependencies

None

@vagenas vagenas requested a review from efriis as a code owner January 8, 2025 21:47
@dosubot dosubot bot added the size:XL and 🤖:docs labels Jan 8, 2025
Signed-off-by: Panos Vagenas <[email protected]>
"from tempfile import mkdtemp\n",
"\n",
"from langchain_huggingface.embeddings import HuggingFaceEmbeddings\n",
"from langchain_milvus import Milvus\n",
Collaborator

Could demonstrate with InMemoryVectorStore as well, which is not performant but requires no additional dependencies or auth.

Contributor Author

Good to know!

@dosubot dosubot bot added the lgtm label Jan 9, 2025
@ccurme ccurme merged commit 858f655 into langchain-ai:master Jan 9, 2025
13 checks passed