🧾 Insurance Template Filler – Web App

This web app lets users upload insurance photo report PDFs and a .docx template. The system uses OCR and LLM-based retrieval to extract the relevant information from the reports and fills the template automatically. The filled document can be previewed in the browser or downloaded as a PDF.

📁 Project Structure

.
├── insurance_pipeline/     # Core pipeline (OCR, extraction, LLMs, etc.)
├── sample/                 # Sample input/output files
├── app.py                  # Streamlit app for UI interaction
├── .env                    # API keys
├── requirements.txt        # Dependencies list
└── README.md               # Project documentation

🚀 Setup Instructions

Create & Activate Virtual Environment:

python3.9 -m venv task_3
source task_3/bin/activate   # macOS/Linux
task_3\Scripts\activate      # Windows

Install PaddleOCR:

If you have a GPU and CUDA 11.8:

python -m pip install paddlepaddle-gpu==3.1.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu118/

If not, use the CPU version:

pip install paddlepaddle

More installation details: PaddlePaddle Installation Guide.

Install Other Dependencies:

pip install -r requirements.txt

Add API Keys to .env File:

Make sure your .env file includes:

OPENROUTER_API_KEY = "openrouter_api_key"
GOOGLE_API_KEY = "google_api_key"
PINECONE_API_KEY = "pinecone_api_key"
COHERE_API_KEY = "cohere_api_key"
GROQ_API_KEY = "groq_api_key"
CONVERTAPI_API_KEY = "convertapi_api_key"
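
A missing key usually only surfaces when the corresponding pipeline stage is reached, so it can help to check all six up front. The following is a minimal sketch, not part of the repository: it assumes the values from .env have already been loaded into the process environment (for example by a loader such as python-dotenv, which is an assumption here), and the helper name missing_keys is hypothetical.

```python
import os

# Key names taken from the .env listing above.
REQUIRED_KEYS = [
    "OPENROUTER_API_KEY",
    "GOOGLE_API_KEY",
    "PINECONE_API_KEY",
    "COHERE_API_KEY",
    "GROQ_API_KEY",
    "CONVERTAPI_API_KEY",
]


def missing_keys(env=None):
    """Return the required key names that are absent or blank in env.

    Hypothetical helper: env defaults to os.environ, but any mapping works,
    which keeps the check easy to test.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing API keys:", ", ".join(missing))
    else:
        print("All API keys are set.")
```

Running this before `streamlit run app.py` gives a single list of what still needs to be filled in.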

Run the Application:

streamlit run app.py

A local server will start and open the app in your default browser.

🧠 Pipeline Overview

┌────────────────────────────┐
│        Upload Inputs       │
│ ┌────────────────────────┐ │
│ │       Report PDFs      │ │
│ │     .docx Template     │ │
│ └────────────────────────┘ │
└────────────┬───────────────┘
             │
             ▼
┌────────────────────────────┐
│    OCR + Text Chunking     │
│ - OCR PDFs                 │
│ - Split into text chunks   │
└────────────┬───────────────┘
             │
             ▼
┌────────────────────────────┐
│  Embedding + Pinecone DB   │
│ - Convert chunks to vectors│
│ - Store in Pinecone index  │
└────────────┬───────────────┘
             │
             ▼
┌──────────────────────────────────────┐
│   Field Meaning Extraction (LLM)     │
│ - Extract placeholders from .docx    │
│ - Understand meaning (OpenRouter LLM)│
└────────────┬─────────────────────────┘
             │
             ▼
┌──────────────────────────────────────┐
│     Semantic Retrieval + QA          │
│ - Similarity search (Pinecone)       │
│ - Rerank with Cohere                 │
│ - Final answer via GROQ LLM          │
└────────────┬─────────────────────────┘
             │
             ▼
┌────────────────────────────┐
│    Fill Template Fields    │
│ - Replace placeholders     │
└────────────┬───────────────┘
             │
             ▼
┌────────────────────────────┐
│      Convert to PDF        │
│ - Use ConvertAPI           │
└────────────┬───────────────┘
             │
             ▼
┌────────────────────────────┐
│    Preview & Download PDF  │
│ - View PDF in browser      │
│ - Download final PDF       │
└────────────────────────────┘
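
Two of the stages in the diagram are purely local text operations and can be illustrated without any API calls: splitting OCR text into chunks, and replacing placeholders in the template. The sketch below is illustrative only and does not reproduce the code in insurance_pipeline/. The chunk size, the overlap, the `{{field}}` placeholder syntax, and all function names are assumptions, and the template is treated as plain text rather than as a .docx file.

```python
import re


def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into fixed-size character chunks that overlap slightly.

    The overlap keeps a sentence that straddles a boundary visible in
    both neighbouring chunks. Sizes here are illustrative defaults.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


# Assumed placeholder syntax: {{field_name}}, optional spaces inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def find_placeholders(template_text):
    """Return placeholder names in order of first appearance, without duplicates."""
    return list(dict.fromkeys(PLACEHOLDER.findall(template_text)))


def fill_placeholders(template_text, values):
    """Replace each placeholder that has a value; leave the others untouched."""
    return PLACEHOLDER.sub(
        lambda match: str(values.get(match.group(1), match.group(0))),
        template_text,
    )
```

Leaving unknown placeholders in place, rather than blanking them, makes it easy to spot fields for which the retrieval step found no answer.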

⏱️ Performance Note

To manage LLM API usage and stay within rate limits, the pipeline pauses between field queries. To change the delay, edit the time.sleep call in extract_all_fields in insurance_pipeline/qa_utils.py:

def extract_all_fields(...):
    ...
    time.sleep(5)  # Delay between LLM requests
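
If the delay needs to change often, one option is to pass it in as a parameter instead of hard-coding it. The following is a minimal sketch of that idea under stated assumptions, not the actual body of extract_all_fields: the function name query_fields, the ask callback, and the injectable sleep argument are all hypothetical.

```python
import time


def query_fields(fields, ask, delay_seconds=5.0, sleep=time.sleep):
    """Call ask(field) for each field, pausing between consecutive calls.

    ask stands in for whatever function sends one field query to the LLM.
    sleep is injectable so the pacing can be tested without real waiting.
    """
    answers = {}
    for index, field in enumerate(fields):
        if index > 0:
            sleep(delay_seconds)  # pause only between requests, not before the first
        answers[field] = ask(field)
    return answers
```

With this shape, a lower delay can be tried for a provider with a more generous rate limit without editing the function itself.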

📸 Sample Files

You can find sample .docx templates and insurance report PDFs in the sample/ directory for testing.
