Feature/ted9 47 enhance the e forms eligibility checking component #589

Open
kaleanych wants to merge 343 commits into OP-TED:main from
meaningfy-ws:feature/TED9-47-Enhance-the-eForms-eligibility-checking-component

Conversation

@kaleanych
Contributor

No description provided.

schivmeister and others added 17 commits March 12, 2026 22:43
Use the one-stop MSSDK service to detect, convert and load packages of
any given version, normalizing to a unified "v3". This yields support
also for v3L, aka lightweight, as v3 is a superset (the lightweight
variant excludes all data except bare transformation necessities).
Standard Forms and eForms are henceforth "v1" and "v2", respectively.

The pipeline native model is now an MSSDK v3-extended one, with the
JSONLD being the canonical metadata model. Not only is there no
equivalent in older models for this and the accompanying
`context.jsonld`, the datetime datatype also needs special
handling/conversion when used in legacy contexts.

A key distinguishing feature of the new unified package is the complete
refactor of the constraints model, removing one level of nesting but
also adding more structure and possibilities with one model (like a
range of document schema versions as seen in v1 or a list of such as
seen in v2). Repurposing these constraints for legacy contexts therefore
needs extra care, if not refactored completely.

Recap of model differences from v1/v2 to v3:

- `identifier` -> `id`
- `issue_date (str)` -> `created_at (datetime)`
- `ontology_version` -> `model_version`
- `metadata_constraints.constraints` -> `applicability_constraints`
- `eforms_subtype` -> `document_type_list`
- `start_date/end_date` -> `document_time_interval.start/end`
- `min/max_xsd_versions` -> `document_version_range.min/max`
- `eforms_sdk_versions` -> `document_schema_version_list`

Note that _applicability constraints_ is a package perspective -- the
same constraints are to be interpreted by the pipeline as _eligibility
constraints_ for a notice.
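
The renames in the recap above can be sketched as a small conversion function. This is a minimal illustration only, not the MSSDK implementation: the function name `to_v3`, the assumption that all legacy constraint fields sit under `metadata_constraints.constraints`, the singular key names `min_xsd_version`/`max_xsd_version`, and the ISO 8601 format of `issue_date` are all assumptions made for the example.

```python
from datetime import datetime
from typing import Any


def to_v3(legacy: dict[str, Any]) -> dict[str, Any]:
    """Rename legacy (v1/v2) metadata keys to their v3 names (hypothetical helper)."""
    # Assumption: legacy constraints are nested one level deeper than in v3.
    old = legacy.get("metadata_constraints", {}).get("constraints", {})
    constraints: dict[str, Any] = {}

    if "eforms_subtype" in old:
        constraints["document_type_list"] = old["eforms_subtype"]
    if "eforms_sdk_versions" in old:
        constraints["document_schema_version_list"] = old["eforms_sdk_versions"]
    if "start_date" in old or "end_date" in old:
        constraints["document_time_interval"] = {
            "start": old.get("start_date"),
            "end": old.get("end_date"),
        }
    if "min_xsd_version" in old or "max_xsd_version" in old:
        constraints["document_version_range"] = {
            "min": old.get("min_xsd_version"),
            "max": old.get("max_xsd_version"),
        }

    return {
        "id": legacy["identifier"],
        # Assumption: issue_date is an ISO 8601 string; str becomes datetime.
        "created_at": datetime.fromisoformat(legacy["issue_date"]),
        "model_version": legacy["ontology_version"],
        "applicability_constraints": constraints,
    }
```

Going the other way (v3 back to a legacy context) would need the reverse datetime-to-string conversion, which is the special handling mentioned earlier.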

There is an additional transitional field `project_identifier`, which
stands in for the `mapping_type`, but only barely. This interpretation
may be deprecated at any point, but not before support is added for
alternative detection mechanisms.
- sparql_test_suites <-> test_suites_sparql
- shacl_test_suites <-> test_suites_shacl

This also fixes some tests that rely on these prerequisite validation data.
Fix the test assumptions by reducing the number of test data to match.
In the case of the Standard Forms (v1) package `package_F03_test`, there
are 105 test data files, of which 82 are unique. However, only 81 of
them are to be found within folders under `test_data`, whereas the rest,
including a unique one `example.xml`, are not contained within a folder.
Pass the type along as it does not matter as much anymore since we
normalize to MSSDK v3 and the native pipeline model is an extension of
it.
We now delegate to the MSSDK for validation, which is carried out during
the package parsing/loading.
If an error occurs in the package loading, due to validation or other
failures, simply forward the error and continue loading the other
packages.
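
The "forward the error and continue" behaviour described above might look roughly like the following. It is a sketch under assumptions: `load_all` and the `load_one` callable are hypothetical names standing in for whatever the pipeline and MSSDK actually expose, and catching `Exception` is a simplification.

```python
from typing import Any, Callable, Iterable


def load_all(
    paths: Iterable[str],
    load_one: Callable[[str], Any],
) -> tuple[list[Any], dict[str, Exception]]:
    """Load every package, collecting failures instead of aborting the batch."""
    loaded: list[Any] = []
    errors: dict[str, Exception] = {}
    for path in paths:
        try:
            # Validation is assumed to happen inside load_one (the loader).
            loaded.append(load_one(path))
        except Exception as exc:
            # Forward the error to the caller and move on to the next package.
            errors[path] = exc
    return loaded, errors
```

Returning the errors alongside the loaded packages lets the caller decide whether a partial load is acceptable.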
…ub-download-packages

[TEDSWS-232] Breaking: Transition to MSSDK for package loading/saving
- pass MongoDB client to normalise_notice function
- reparse MSSDK CSV list object w/ Pandas to reinterpret numbers
- update tests
Tests were failing with ModelNotFoundError because:
- Notice fixtures didn't set mapping_package_identifier
- Mapping suite/package weren't loaded into test MongoDB instances
- normalise_notice() calls didn't pass mongodb_client parameter

Changes:
- Add load_mapping_suite_and_package fixture to features/conftest.py
- Update notice fixtures to set mapping_package_identifier
- Pass mongodb_client to normalise_notice() in test steps
- Add load_mapping_suite_and_package_fake for e2e tests using mongomock
- Update e2e fixtures to link GitHub-loaded packages to local mapping suite

Fixes 30+ e2e/feature tests that were failing after metadata resource
refactoring with dynamic MS Config loading via MSSDK.
There was a hidden circular dependency in the metadata resource
migration to MS Config via MSSDK.

The previous design required a notice with `mapping_package_identifier`
to load resources, but this created a circular dependency: normalisation
needs resources, yet eligibility checking (which returns a package
identifier but does not set one on the notice) needs normalised
metadata.

Initial assumptions may have been anchored on the resources being
project-specific. However, this is problematic as not all projects may
be updated with the mapping suite configuration. Therefore, resource
files (country.json, languages.json, etc.) can be interpreted to be
global for now during the transition period.

Once all currently known production projects are updated with the
configuration, a more dynamic method to select the mapping suite can be
implemented, e.g. via the `document_probing` conditions specified in
the config, which define what XPaths must and must not be available to
be compatible with the project.
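
As a rough illustration of such probing, the check below tests a notice against two XPath lists. The function name, the parameter names `must_exist` and `must_not_exist`, and the namespace-free sample paths are assumptions for the example; the real `document_probing` schema is not shown in this PR, and `xml.etree.ElementTree` supports only a limited XPath subset.

```python
import xml.etree.ElementTree as ET


def is_compatible(
    xml_text: str,
    must_exist: list[str],
    must_not_exist: list[str],
) -> bool:
    """Return True if all required XPaths match and no forbidden XPath matches."""
    root = ET.fromstring(xml_text)
    # Compare against None explicitly: an element without children is falsy.
    has_all = all(root.find(xpath) is not None for xpath in must_exist)
    has_none = all(root.find(xpath) is None for xpath in must_not_exist)
    return has_all and has_none
```

A mapping suite would then be selected when its probing conditions hold for the notice at hand, rather than by an identifier set on the notice.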

Changes:
- MappingFilesRegistry now loads resources from any available MappingSuite
- Removed notice parameter from DefaultNoticeMetadataNormaliser and
  EformsNoticeMetadataNormaliser constructors
- Updated find_metadata_normaliser_based_on_xml_manifestation() and
  extract_and_normalise_notice_metadata() to not require notice
- Added MappingSuiteConfigError for when no MappingSuite is available
- Updated all test fixtures to use the new API
- Remove all traces of, and dependence on, a Notice
  mapping_package_identifier

TODO: The mapping suite must be made mandatory and be fetched from a
default known project with the configuration if not given.
@schivmeister schivmeister force-pushed the feature/TED9-47-Enhance-the-eForms-eligibility-checking-component branch from c04eab2 to 9e12e0c Compare March 12, 2026 17:22
The actual fetch of the github repo would get no MS config, and the fake
would be adding one. There appears to be an inconsistency in this test
passing locally but failing on the server, so let us remove the MS config part.
This is required for passing the mongodb client to the
MappingFilesRegistry, which picks up mapping metadata resource files
from the MS config. Without this there is a mismatch in the mongodb
client in tests, whose first entrypoint usually gets a mock, but in this
case, the normalisation would've defaulted to a real one retrieved from
the environment.
@schivmeister schivmeister force-pushed the feature/TED9-47-Enhance-the-eForms-eligibility-checking-component branch from 9e12e0c to 8f6f564 Compare March 12, 2026 17:31
schivmeister and others added 5 commits March 13, 2026 00:59
If the mapping suite `config` folder is not found in the repository and
branch specified, the package loading will fail. We now allow an
additional, optional parameter to specify a second branch from which the
config is available. This is useful for cases where a specific tag or
release needs to be loaded but the config is in a later
tag/release/commit.
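
The fallback described above could be sketched as follows. All names here (`resolve_config`, `fetch_config`, `config_branch`, `ConfigNotFoundError`) are hypothetical and chosen for illustration; the actual parameter name and fetch mechanism in the pipeline may differ.

```python
from typing import Callable, Optional


class ConfigNotFoundError(Exception):
    """Raised when no mapping suite config is found on any given branch."""


def resolve_config(
    fetch_config: Callable[[str], Optional[dict]],
    branch: str,
    config_branch: Optional[str] = None,
) -> dict:
    """Fetch the config from `branch`, falling back to `config_branch` if given."""
    config = fetch_config(branch)
    if config is None and config_branch is not None:
        # The package branch has no config folder: try the second branch.
        config = fetch_config(config_branch)
    if config is None:
        raise ConfigNotFoundError(
            f"no config found on {branch!r} or fallback {config_branch!r}"
        )
    return config
```

With this shape, packages can still be pinned to an older tag while the config is read from a later commit.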
Without the needed resource files, the copied config is useless.
…-msconfig

Make MS Config mandatory, add Airflow parameter
@schivmeister schivmeister deleted the feature/TED9-47-Enhance-the-eForms-eligibility-checking-component branch March 12, 2026 20:34
@schivmeister schivmeister restored the feature/TED9-47-Enhance-the-eForms-eligibility-checking-component branch March 12, 2026 20:35
@schivmeister schivmeister deleted the feature/TED9-47-Enhance-the-eForms-eligibility-checking-component branch March 12, 2026 21:09

6 participants