An automated document analyzer for Paperless-ngx that uses the OpenAI API and Ollama (Mistral, Llama, Phi 3, Gemma 2) to automatically analyze and tag your documents.
It features an automatic mode, a manual mode, support for both Ollama and OpenAI, a chat function to query your documents with AI, and a modern, intuitive web interface.
paperless-ai makes changes to the documents in your production Paperless-ngx instance that cannot be easily undone.
Configure it carefully and think twice.
Test the results first in a separate development environment, and back up your documents and metadata before you start.
💚 Thank you for all your support, bug reports, and feature requests 💚
Automated Document Management
Automatic Scanning : Identifies and processes new documents within Paperless-ngx.
AI-Powered Analysis : Leverages OpenAI API and Ollama (Mistral, Llama, Phi 3, Gemma 2) for precise document analysis.
Metadata Assignment : Automatically assigns titles, tags, and correspondent details.
Advanced Customization Options
Predefined Processing Rules : Specify which documents to process based on existing tags. (Optional) 🆕
Selective Tag Assignment : Use only selected tags for processing. (Disables the prompt dialog) 🆕
Custom Tagging : Assign a specific tag (of your choice) to AI-processed documents for easy identification. 🆕
AI-Assisted Analysis : Manually analyze documents with AI support in a modern web interface. (Accessible via the /manual endpoint) 🆕
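The tag-based options above can be pictured with a small sketch. This is not paperless-ai's actual code: the `Document` class, the function names, and the `ai-processed` tag are hypothetical names chosen for illustration, and the real application works against the Paperless-ngx API rather than in-memory objects.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical, simplified stand-in for a Paperless-ngx document."""
    id: int
    title: str
    tags: set = field(default_factory=set)


def select_for_processing(documents, required_tags):
    """Keep documents that carry at least one of the required tags.

    An empty required_tags list selects every document, mirroring the
    fact that the processing rule is optional.
    """
    required = set(required_tags)
    if not required:
        return list(documents)
    return [doc for doc in documents if required & doc.tags]


def mark_ai_processed(document, marker_tag):
    """Attach the user-chosen marker tag so processed documents are easy to find."""
    document.tags.add(marker_tag)
    return document


docs = [
    Document(1, "Invoice March", {"inbox"}),
    Document(2, "Holiday photo", {"private"}),
]
for doc in select_for_processing(docs, ["inbox"]):
    mark_ai_processed(doc, "ai-processed")  # assumed tag name
    print(doc.id, sorted(doc.tags))
```

Only the first document matches the `inbox` rule here, so it is the only one that receives the marker tag.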
Interactive Chat Functionality
Document Querying : Ask questions about your documents and receive accurate, AI-generated answers. 🆕
Streamlined Configuration : Easy-to-use setup interface available at /setup.
Dashboard Overview : A clean and organized dashboard for monitoring and managing document processing.
Error Handling : Automatic restarts and health monitoring for improved stability.
Health Checks : Ensures system integrity and alerts on potential issues.
Docker Integration : Full Docker support, including health checks, resource management, and persistent data storage.
Docker and Docker Compose
Access to a Paperless-ngx installation
OpenAI API key or your own Ollama instance with your chosen model running and reachable.
Basic understanding of cron syntax (for scan interval configuration)
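As a quick refresher on cron syntax, the sketch below splits an expression into its fields. It assumes the standard five-field form (minute, hour, day of month, month, day of week); whether the scan interval setting also accepts other variants is not stated here, and `describe_cron` is a hypothetical helper, not part of the application.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expression):
    """Map each field of a five-field cron expression to its name."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected {len(CRON_FIELDS)} fields, got {len(parts)}")
    return dict(zip(CRON_FIELDS, parts))


# "*/30 * * * *" means "every 30 minutes".
print(describe_cron("*/30 * * * *"))
```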
Document Discovery
Periodically scans Paperless-ngx for new documents
Tracks processed documents in a local SQLite database
AI Analysis
Sends document content to OpenAI API or Ollama for analysis
Extracts relevant tags and correspondent information
Uses GPT-4o-mini or your custom Ollama model for accurate document understanding
Automatic Organization
Creates new tags if they don't exist
Creates new correspondents if they don't exist
Updates documents with analyzed information
Marks documents as processed to avoid duplicate analysis
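The tracking step can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration of the idea, not the project's schema: the table name `processed_documents`, its columns, and the function names are all assumptions.

```python
import sqlite3


def open_tracking_db(path=":memory:"):
    """Open (or create) a database with a hypothetical tracking table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_documents ("
        "document_id INTEGER PRIMARY KEY, "
        "processed_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn


def filter_unprocessed(conn, document_ids):
    """Return the IDs that have not been recorded as processed yet."""
    rows = conn.execute("SELECT document_id FROM processed_documents")
    seen = {row[0] for row in rows}
    return [doc_id for doc_id in document_ids if doc_id not in seen]


def mark_processed(conn, document_id):
    """Record a document as processed; repeated calls are harmless."""
    conn.execute(
        "INSERT OR IGNORE INTO processed_documents (document_id) VALUES (?)",
        (document_id,),
    )
    conn.commit()


conn = open_tracking_db()
mark_processed(conn, 1)
print(filter_unprocessed(conn, [1, 2, 3]))  # [2, 3]
```

Using the document ID as the primary key together with `INSERT OR IGNORE` keeps the marking step idempotent, which is what prevents duplicate analysis across scans.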
You can now analyze your files manually with the help of AI in the web interface.
It is reachable via the /manual endpoint.
The application is configured through the web interface on the /setup route.
You do not need to, and cannot, set the environment variables through Docker.
The application comes with full Docker support:
Automatic container restart on failure
Health monitoring
Volume persistence for database
Resource management
Graceful shutdown handling
# Start the container
docker-compose up -d
# View logs
docker-compose logs -f
# Restart container
docker-compose restart
# Stop container
docker-compose down
# Rebuild and start
docker-compose up -d --build
The application provides a health check endpoint at /health that returns:
# Healthy system
{
  "status": "healthy"
}
# System not configured
{
  "status": "not_configured",
  "message": "Application setup not completed"
}
# Database error
{
  "status": "database_error",
  "message": "Database check failed"
}
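A monitoring script could interpret these responses roughly as follows. This is a sketch under assumptions: the function names are hypothetical, port 3000 is taken from the debug URL shown elsewhere in this README, and it is not documented here whether unhealthy states come back with a non-2xx HTTP status, so both cases are handled.

```python
import json
import urllib.error
import urllib.request


def interpret_health(payload):
    """Turn a /health JSON payload into (is_healthy, detail)."""
    status = payload.get("status")
    if status == "healthy":
        return True, "ok"
    return False, payload.get("message", f"unexpected status: {status!r}")


def check_health(base_url, timeout=5.0):
    """Fetch <base_url>/health and interpret the JSON body."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return interpret_health(json.load(response))
    except urllib.error.HTTPError as error:
        # Assumption: an error response may still carry the JSON body.
        return interpret_health(json.load(error))
```

For example, `check_health("http://your-instance:3000")` would return `(True, "ok")` for a healthy system and `(False, "Application setup not completed")` before setup is finished.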
The application includes a debug interface, accessible via /debug, that helps administrators monitor and troubleshoot the system's data:
🔍 View all system tags
📄 Inspect processed documents
👥 Review correspondent information
Accessing the Debug Interface
Navigate to:
http://your-instance:3000/debug
The interface provides:
Interactive dropdown to select data category
Tree view visualization of JSON responses
Color-coded data representation
Collapsible/expandable data nodes
Available Debug Endpoints
| Endpoint | Description |
| --- | --- |
| /debug/tags | Lists all tags in the system |
| /debug/documents | Shows processed document information |
| /debug/correspondents | Displays correspondent data |
The debug interface also integrates with the health check system, showing a configuration warning if the system is not properly set up.
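The debug endpoints can also be queried from a script. The sketch below builds the endpoint URLs and prints a JSON response as an indented tree, loosely imitating the tree view. The helper names are hypothetical, and the shape of the JSON returned by each endpoint is not documented here, so the renderer accepts any nesting of objects and arrays.

```python
import json
import urllib.request

# Paths taken from the endpoint list above.
DEBUG_ENDPOINTS = {
    "tags": "/debug/tags",
    "documents": "/debug/documents",
    "correspondents": "/debug/correspondents",
}


def debug_url(base_url, category):
    """Build the full URL for one debug category."""
    return base_url.rstrip("/") + DEBUG_ENDPOINTS[category]


def render_tree(value, indent=0):
    """Return indented text lines for arbitrary parsed JSON."""
    pad = "  " * indent
    if isinstance(value, dict):
        items = [(f"{key}:", f"{key}: ", child) for key, child in value.items()]
    elif isinstance(value, list):
        items = [(f"[{i}]", f"[{i}] ", child) for i, child in enumerate(value)]
    else:
        return [f"{pad}{value!r}"]
    lines = []
    for header, prefix, child in items:
        if isinstance(child, (dict, list)):
            lines.append(pad + header)
            lines.extend(render_tree(child, indent + 1))
        else:
            lines.append(f"{pad}{prefix}{child!r}")
    return lines


def fetch_debug(base_url, category, timeout=5.0):
    """Fetch and parse one debug endpoint (requires a running instance)."""
    with urllib.request.urlopen(debug_url(base_url, category), timeout=timeout) as response:
        return json.load(response)
```

With a running instance, `print("\n".join(render_tree(fetch_debug("http://your-instance:3000", "tags"))))` would print the tag data as a nested outline.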
To run the application locally without Docker:
Install dependencies:
Start the development server:
Fork the repository
Create your feature branch (git checkout -b feature/AmazingFeature)
Commit your changes (git commit -m 'Add some AmazingFeature')
Push to the branch (git push origin feature/AmazingFeature)
Open a Pull Request
Store API keys securely
Restrict container access
Monitor API usage
Regularly update dependencies
Back up your database
This project is licensed under the MIT License - see the LICENSE file for details.
Paperless-ngx for the amazing document management system
OpenAI API
The Express.js and Node.js communities for their excellent tools
If you encounter any issues or have questions:
Check the Issues section
Create a new issue if yours isn't already listed
Provide detailed information about your setup and the problem