This code will get you started ingesting files from a local directory and from SharePoint.
There are two local processors: one that uses the automated ingestion connector (local connector), and one that uses the partition function to ingest local documents with more control (local partition).
There is only one SharePoint processor, which uses the automated ingestion connector (SharePoint connector).
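To illustrate what "more control" means for the local partition approach, here is a minimal sketch of a loop that calls a partition function on each file in a directory. It is not the code from `main_example.py`: `partition_directory` and its parameters are hypothetical names, and the partition function is passed in as an argument so the loop itself does not depend on the library.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable, List


def partition_directory(
    input_dir: str,
    partition_fn: Callable[..., Iterable],
) -> Dict[str, List[str]]:
    """Partition every file directly inside input_dir (hypothetical helper).

    partition_fn is called as partition_fn(filename=...), which is how
    unstructured.partition.auto.partition accepts a file path.
    """
    results: Dict[str, List[str]] = {}
    for path in sorted(Path(input_dir).iterdir()):
        if not path.is_file():
            continue  # this simple sketch does not descend into subdirectories
        elements = partition_fn(filename=str(path))
        # str() of an Unstructured element returns its text content.
        results[path.name] = [str(element) for element in elements]
    return results


# Usage with Unstructured installed ("my_docs" is a placeholder directory):
#   from unstructured.partition.auto import partition
#   results = partition_directory("my_docs", partition)
```

Because you own the loop, you can filter files, choose per-file options, or post-process elements before writing anything out, which the automated connector handles for you instead.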
- Python 3.8+
- pip
- Unstructured 0.10.24+
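If you want to confirm these requirements before running anything, a small check along the following lines may help. This is a sketch under assumptions: the function names are hypothetical, the version parsing is deliberately simple (it compares only the leading numeric parts), and it assumes the package is installed under the distribution name `unstructured`.

```python
import sys
from typing import List, Optional, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '0.10.24' into (0, 10, 24), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    # importlib.metadata is part of the standard library from Python 3.8.
    from importlib import metadata

    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_prerequisites() -> List[str]:
    """Return a list of human-readable problems (empty if everything is fine)."""
    problems = []
    if sys.version_info < (3, 8):
        problems.append("Python 3.8+ is required")
        return problems
    found = installed_version("unstructured")
    if found is None:
        problems.append("unstructured is not installed")
    elif parse_version(found) < parse_version("0.10.24"):
        problems.append(f"unstructured {found} is older than 0.10.24")
    return problems
```

Calling `check_prerequisites()` and printing the result gives a quick summary of anything that still needs to be installed or upgraded.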
Follow the setup guide in the official Unstructured documentation to set up Unstructured on your machine: https://unstructured-io.github.io/unstructured/installation/full_installation.html
This app is set to run locally (not using Docker).
All of the code for this app is located in the `main_example.py` file.
- To start, determine the type of processor you will use (local or SharePoint).
- In the code, fill in all of the required variables for the type of processor you are using (NOTE: for local processing, the directories can be relative or full paths).
- Open a terminal and navigate to the directory that contains the `main_example.py` file.
- Run the following command to run the script (NOTE: you may need to use `python3` in the command instead of `python`, depending on how your Python is set up): `python main_example.py`
- The script should run and display the output for the processor. Outputs that are written to files should appear in the same directory as the script.
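
The notes above about relative versus full paths and about where output files land can be made concrete with a short sketch. It is illustrative only and not taken from `main_example.py`: `resolve_input_dir` and `write_output` are hypothetical helpers, and the sketch assumes relative paths are resolved against the current working directory, which is the script's directory if you follow the steps above.

```python
import json
from pathlib import Path
from typing import List


def resolve_input_dir(value: str) -> Path:
    """Accept a relative or full path and return an absolute Path.

    Relative values are resolved against the current working directory.
    """
    path = Path(value).expanduser().resolve()
    if not path.is_dir():
        raise NotADirectoryError(f"Input directory not found: {path}")
    return path


def write_output(texts: List[str], output_name: str, base_dir: Path) -> Path:
    """Write extracted text as JSON into base_dir and return the file path."""
    out_path = base_dir / output_name
    out_path.write_text(json.dumps(texts, indent=2), encoding="utf-8")
    return out_path


# Example (names are placeholders): write next to the running script.
#   script_dir = Path(__file__).resolve().parent
#   write_output(["first element", "second element"], "output.json", script_dir)
```

Resolving the directory up front means a mistyped path fails immediately with a clear message rather than partway through processing.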