This repository demonstrates a set of resources that let you use Azure AI Document Intelligence for high-throughput text extraction from documents stored in Azure Blob Storage. It then uses Semantic Kernel, Azure OpenAI, and Azure AI Search to index the contents of these documents. The solution can process documents in a variety of formats, including Office documents, PDF, PNG, and JPEG.
IMPORTANT! In addition to using the solution below with multiple Document Intelligence instances, it is beneficial to request a transaction limit increase for your Document Intelligence accounts. Instructions for how to do this can be found in the Azure AI Document Intelligence documentation.
This solution leverages the following Azure services:
- Azure AI Document Intelligence - the Azure AI service API that performs the document intelligence, extraction, and processing.
- Microsoft Foundry - the Azure AI service API that performs the semantic embedding calculations of the extracted text.
- Azure AI Search - the Azure AI service that indexes the extracted text for search and analysis.
- Azure Blob Storage with three containers
  - `documents` - starting location for your bulk upload of documents to be processed
  - `processresults` - the extracted text output from the Document Intelligence service
  - `completed` - location where the original documents are moved once successfully processed by Document Intelligence
- Azure Service Bus with four queues
  - `docqueue` - contains the messages for the files that need to be processed by the Document Intelligence service
  - `customfieldqueue` - contains the messages for the files that need to have custom field extraction completed
  - `toindexqueue` - contains the messages for the files that have been processed by the Document Intelligence service and whose results are ready to be indexed by Azure AI Search
  - `movequeue` - contains the messages for the files that have been processed by the Document Intelligence service and are ready to be moved to the `completed` blob container
- Six apps
  - `DocumentQueueing` - identifies the files in the `documents` blob container and sends a claim check message (containing the file name) to the `docqueue` queue. This app is triggered by an HTTP call, but could also be modified to use a Blob Trigger.
  - `DocumentIntelligence` - sends each message in `docqueue` to Document Intelligence for processing, then updates the Blob metadata to "processed" and creates a new message in `customfieldqueue`. This app employs scale limiting and Polly retries with back-off on Document Intelligence "too many requests" replies, balancing maximum throughput against overloading the API endpoint.
  - `CustomFieldExtraction` - processes messages in the `customfieldqueue`, using Azure OpenAI to extract custom fields based on the `AzureUtilities/Prompts/ExtractCustomFields.yaml` prompt description. Once complete, it creates a new message in `toindexqueue`.
  - `AiSearchIndexing` - processes messages in the `toindexqueue`, getting embeddings of the extracted text from Azure OpenAI and saving those embeddings to Azure AI Search. Once complete, it creates a new message in `movequeue`.
  - `FileMover` - processes messages in the `movequeue` to move files from the `documents` to the `completed` blob container.
  - `AskQuestions` - simple HTTP app that demonstrates RAG retrieval by allowing you to ask questions about the indexed documents.
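
The retry-with-back-off behavior described for the `DocumentIntelligence` app can be illustrated with a minimal Python sketch. This is not the repository's Polly configuration: `TooManyRequestsError`, `call_with_backoff`, and the default attempt count and delays are hypothetical names and values chosen for illustration only.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TooManyRequestsError(Exception):
    """Hypothetical stand-in for a 'too many requests' (HTTP 429) reply."""


def call_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, waiting exponentially longer after each throttled attempt."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except TooManyRequestsError:
            if attempt == max_attempts - 1:
                raise  # retries exhausted; surface the error to the caller
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * (2 ** attempt))
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the sketch easy to exercise without real waiting. Production retry policies commonly add random jitter to the delay so that many workers do not retry at the same instant.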
To further allow for high throughput, the `DocumentIntelligence` app can distribute processing across 1-10 separate Document Intelligence accounts. This is managed by the queueing function, which automatically adds a `RecognizerIndex` value of 0-9 to each message when queueing the files for processing.
The `DocumentIntelligence` app then distributes the files to the appropriate account, regardless of the number of Document Intelligence accounts actually provisioned.
To configure multiple Document Intelligence accounts with the script below, set `-docIntelligenceInstanceCount` to a value between 1 and 10 (default is 1). To configure manually, add all of the Document Intelligence account keys, pipe separated, to the Azure Key Vault's `DOCUMENT-INTELLIGENCE-KEY` secret.
Assumption: all instances of Document Intelligence share the same URL (such as: https://eastus.api.cognitive.microsoft.com/).
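
One way a `RecognizerIndex` of 0-9 could map onto however many accounts are provisioned is a modulo over the list of keys. The Python sketch below shows that idea under stated assumptions: round-robin index assignment and modulo selection are guesses at the mechanism, not confirmed repository behavior, and all function names are hypothetical.

```python
def parse_keys(secret_value: str) -> list[str]:
    """Split a pipe-separated secret value into individual account keys."""
    keys = [key.strip() for key in secret_value.split("|") if key.strip()]
    if not keys:
        raise ValueError("secret contains no keys")
    return keys


def assign_recognizer_indexes(file_names: list[str]) -> list[tuple[str, int]]:
    """Assumed queueing behavior: cycle through index values 0-9."""
    return [(name, position % 10) for position, name in enumerate(file_names)]


def select_key(recognizer_index: int, keys: list[str]) -> str:
    """Assumed mapping: wrap the index around the number of provisioned keys."""
    return keys[recognizer_index % len(keys)]
```

With three keys, for example, indexes 0, 3, 6, and 9 would all resolve to the first key, so every index value lands on a valid account even when fewer than ten are provisioned.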
As with Document Intelligence, you can deploy multiple Azure OpenAI accounts to ensure high throughput. To assist in load balancing, the accounts are fronted by Azure API Management, which handles load balancing and circuit breaking should an instance become overloaded.
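
To make the load-balancing and circuit-breaker idea concrete, here is a small conceptual Python sketch. It does not reproduce how Azure API Management is configured or implemented: the `Backend` and `PriorityBalancer` classes, the failure threshold, and the cooldown are hypothetical, and the rule that a lower `priority` number is preferred is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    priority: int             # assumption: lower number is preferred
    failures: int = 0         # consecutive failures seen so far
    open_until: float = 0.0   # time until which the circuit stays open


class PriorityBalancer:
    def __init__(self, backends, failure_threshold=3, cooldown_seconds=30.0):
        self.backends = list(backends)
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds

    def pick(self, now: float) -> Backend:
        """Return the most preferred backend whose circuit is closed."""
        available = [b for b in self.backends if b.open_until <= now]
        if not available:
            raise RuntimeError("all backends are temporarily unavailable")
        return min(available, key=lambda b: b.priority)

    def record_failure(self, backend: Backend, now: float) -> None:
        """Count a failure and open the circuit once the threshold is hit."""
        backend.failures += 1
        if backend.failures >= self.failure_threshold:
            backend.open_until = now + self.cooldown_seconds
            backend.failures = 0

    def record_success(self, backend: Backend) -> None:
        """A successful call resets the consecutive failure count."""
        backend.failures = 0
```

In this sketch, traffic falls over to the next priority while the preferred instance is cooling down and returns to it once the cooldown has elapsed.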
To try out the sample end-to-end process, you will need:
- An Azure subscription in which you have privileges to create resources.
- PowerShell, the Azure CLI, and the Azure Developer CLI, which are used to automate the deployment. These can be easily installed on a Windows machine using `winget`:

      winget install --id "Microsoft.AzureCLI" --silent --accept-package-agreements --accept-source-agreements
      winget install --id "Microsoft.Azd" --silent --accept-package-agreements --accept-source-agreements
- IMPORTANT: Open and edit the `main.parameters.json` file found in the `infra` folder. This file contains the information needed to properly deploy the API Management and Azure OpenAI accounts:
  - APIM settings
    - `apiManagementPublisherEmail` - defaults to the current user's email. Remove the environment variable reference to set it manually.
    - `apiManagementPublisherName` - defaults to the current user's name. Remove the environment variable reference to set it manually.
  - Azure OpenAI model settings
    - `azureOpenAIEmbeddingModel` - the embedding model used to generate the embeddings
    - `azureOpenAIChatModel` - the chat/completions model to use
  - Azure OpenAI deployment settings

    For each deployment you want to create, add an object to `configs` as per the example below (note that `name` is optional).

        "openAiConfigs": {
          "value": {
            "embeddingModel": "text-embedding-ada-002",
            "embeddingMaxTokens": 8191,
            "completionModel": "gpt-4o",
            "configs": [
              {
                "name": "",
                "location": "eastus2",
                "suffix": "eastus2",
                "priority": 1,
                "embedding": {
                  "capacity": 100
                },
                "completion": {
                  "capacity": 100,
                  "sku": "GlobalStandard"
                }
              },
              {
                "name": "",
                "location": "westus",
                "suffix": "westus",
                "priority": 2,
                "embedding": {
                  "capacity": 100
                },
                "completion": {
                  "capacity": 100,
                  "sku": "GlobalStandard"
                }
              }
            ]
          }
        }
- Log in to the Azure Developer CLI: `azd auth login` (note: if you have access to multiple Entra tenants, you may need to add the flag `--tenant-id` with the GUID value for the desired tenant)
- Run the azd command `azd up`

The first time you run this, you will be prompted for several values. This command will create all of the Azure resources and RBAC role assignments needed for the demonstration.
To exercise the code and run the demo, follow these steps:
- Upload sample files to the storage account's `documents` container. To help with this, you can try the supplied PowerShell script `BulkUploadAndDuplicate.ps1`. This script takes a directory of local files and uploads them to the storage container. Then, based on your settings, it duplicates them to help you easily create a large library of files to process.

      .\BulkUploadAndDuplicate.ps1 -path "<path to dir with sample file>" -storageAccountName "<storage account name>" -containerName "documents" -counterStart 0 -duplicateCount 10

  The sample script above would upload all of the files found in the `-path` directory, then create copies of them prefixed with 000000 through 000010. You can of course upload the files any way you see fit.
- In the Azure portal, navigate to the resource group that was created and locate the app with `Queueing` in the name. Click on the Application URL, then in the new browser window add `/queue` to the end and hit return. This will kick off the queueing process for all of the files in the `documents` storage container. The output will be the number of files that were queued.
- Once messages start getting queued, the `DocumentIntelligence` app will start picking up the messages and begin processing. You should see the number of messages in the `docqueue` queue go down as they are successfully processed. You will also see new files getting created in the `processresults` container.
- Simultaneously, as the `DocumentIntelligence` app completes its processing, it queues messages in the `customfieldqueue`. After custom field extraction, messages arrive in the `toindexqueue`, where the `AiSearchIndexing` app picks them up and sends the extracted text in the `processresults` container to Azure OpenAI for embedding calculation and then to Azure AI Search for indexing. The `FileMover` app will then begin picking up messages from the `movequeue` and moving the processed files from the `documents` container into the `completed` container.
- You can review the execution and timings of the end-to-end process.
- Use the `AskQuestions` app to demonstrate RAG retrieval over the indexed documents (add `api/AskQuestions?filename=<file you uploaded>&question=<question you want to ask>` to the end of the URL and hit enter).
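
If you would rather script the two browser steps above than type the URLs by hand, the following Python sketch builds them with proper query-string encoding. The helper names and the `example.invalid` host are hypothetical; substitute the Application URLs shown in the Azure portal for your own deployment.

```python
from urllib.parse import urlencode


def queue_url(app_url: str) -> str:
    """Append /queue to the queueing app's Application URL."""
    return app_url.rstrip("/") + "/queue"


def ask_questions_url(app_url: str, filename: str, question: str) -> str:
    """Build the AskQuestions URL, encoding spaces and punctuation."""
    query = urlencode({"filename": filename, "question": question})
    return f"{app_url.rstrip('/')}/api/AskQuestions?{query}"


if __name__ == "__main__":
    # Placeholder host; replace with the real Application URLs.
    print(queue_url("https://queueing.example.invalid/"))
    print(ask_questions_url(
        "https://askquestions.example.invalid",
        "report.pdf",
        "What is the total?",
    ))
```

Encoding the question matters because characters such as spaces, `?`, and `&` would otherwise be misread as part of the URL structure. The resulting strings can be opened in a browser or fetched with any HTTP client.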
