This contrib module provides a Python implementation of Apache Solr's Data Import Handler (DIH), aligned to the original Java XML model (data-config.xml) and to the behavior of the original Java DIH shipped in the Solr 8.6 contrib tree.
The project is an AI-assisted rewrite: implementation and parity work were driven by the rewrite plan and sourced from the original Java code under:
src/java/org/apache/solr/handler/dataimport/
- Drop-in replacement for DIH-style data-config.xml imports (same XML vocabulary, same ${...} variable model).
- Stay compatible with future Solr releases by evolving the Python runtime instead of being constrained by the original Java maintenance lifecycle.
- Add new capabilities where needed (for example: additional sinks and declarative cleanup flows).
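As a rough illustration of the ${...} variable model mentioned in the goals above, the sketch below parses a small data-config.xml fragment and substitutes a parent row value into a nested entity's query. It uses only the Python standard library. The `resolve` helper, the `DATA_CONFIG` sample, and the entity and column names are hypothetical and are not this module's API; returning an empty string for unresolved placeholders is a choice made for this sketch.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample config: a parent entity with one nested child entity.
DATA_CONFIG = """
<dataConfig>
  <document>
    <entity name="item" query="select * from item">
      <field column="NAME" name="name"/>
      <entity name="feature"
              query="select description from feature where item_id='${item.ID}'"/>
    </entity>
  </document>
</dataConfig>
"""

# Matches placeholders such as ${item.ID} and captures the dotted path.
_PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def resolve(template, context):
    """Replace each ${a.b.c} in template by walking nested dicts in context.

    Unresolved placeholders become an empty string (an assumption of this
    sketch, not a statement about the module's behavior).
    """
    def lookup(match):
        node = context
        for part in match.group(1).split("."):
            if not isinstance(node, dict) or part not in node:
                return ""
            node = node[part]
        return str(node)

    return _PLACEHOLDER.sub(lookup, template)


root = ET.fromstring(DATA_CONFIG)
child = root.find("./document/entity/entity")

# Simulate one row produced by the parent "item" entity.
parent_row = {"item": {"ID": 42}}
child_query = resolve(child.get("query"), parent_row)
print(child_query)
# prints: select description from feature where item_id='42'
```

Keeping resolution as a pure function over a nested dictionary makes it straightforward to test against expected strings; a real runtime would also need to handle namespaces, functions, and escaping, which this sketch leaves out.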
Apache Solr's DIH has moved toward maintenance, and the widely used community-maintained alternative (SearchScale/dataimporthandler) is oriented around keeping up with Solr releases rather than adding new functionality.
This rewrite focuses on being a future-proof DIH runtime that continues to implement Solr DIH concepts while enabling new features over time.
The documentation set is organized under docs/:
- docs/README.md (hub)
- docs/functionality-checklist.md (Java parity evidence + provenance)
- docs/parity-java.md (high-level parity + caveats)
- docs/new-features.md (Python rewrite additions)
This code is distributed under the Apache License 2.0.