Commit 221064d

preetha-intel, jatinwadhwa921, vthaniel, TejalKhade28, and Jaswanth51 authored

Documentation update for ORT 1.24 (#27546)

### Description

Updates the supported versions along with deprecation notices.

Co-authored-by: sfatimar <jatin.wadhwa@intel.com>
Co-authored-by: vthaniel <vishnudas.thaniel.s@intel.com>
Co-authored-by: TejalKhade28 <tejal.khade@intel.com>
Co-authored-by: Jaswanth51 <jaswanth.gannamaneni@intel.com>

1 parent 112668f commit 221064d
File tree

1 file changed: +55 -42 lines changed


docs/execution-providers/OpenVINO-ExecutionProvider.md

Lines changed: 55 additions & 42 deletions
````diff
@@ -30,9 +30,9 @@ ONNX Runtime OpenVINO™ Execution Provider is compatible with three latest rele
 
 |ONNX Runtime|OpenVINO™|Notes|
 |---|---|---|
+|1.24.1|2025.4.1|[Details](https://github.com/intel/onnxruntime/releases/tag/v5.9)|
 |1.23.0|2025.3|[Details](https://github.com/intel/onnxruntime/releases/tag/v5.8)|
 |1.22.0|2025.1|[Details](https://github.com/intel/onnxruntime/releases/tag/v5.7)|
-|1.21.0|2025.0|[Details](https://github.com/intel/onnxruntime/releases/tag/v5.6)|
 
 ## Build
 
````
````diff
@@ -79,15 +79,15 @@ Runtime parameters set during OpenVINO Execution Provider initialization to cont
 | [**num_of_threads**](#num_of_threads--num_streams) | string | Any positive integer > 0 | size_t | Control number of inference threads |
 | [**num_streams**](#num_of_threads--num_streams) | string | Any positive integer > 0 | size_t | Set parallel execution streams for throughput |
 | [**cache_dir**](#cache_dir) | string | Valid filesystem path | string | Enable openvino model caching for improved latency |
-| [**load_config**](#load_config) | string | JSON file path | string | Load and set custom/HW specific OpenVINO properties from JSON |
+| [**load_config**](#load_config) | string | JSON string | string | Load and set custom/HW specific OpenVINO properties from JSON |
 | [**enable_qdq_optimizer**](#enable_qdq_optimizer) | string | True/False | boolean | Enable QDQ optimization for NPU |
 | [**disable_dynamic_shapes**](#disable_dynamic_shapes) | string | True/False | boolean | Convert dynamic models to static shapes |
 | [**reshape_input**](#reshape_input) | string | input_name[shape_bounds] | string | Specify upper and lower bound for dynamic shaped inputs for improved performance with NPU |
 | [**layout**](#layout) | string | input_name[layout_format] | string | Specify input/output tensor layout format |
 
 **Deprecation Notice**
 
-The following provider options are **deprecated** and should be migrated to `load_config` for better compatibility with future releases.
+The following provider options are **deprecated since ORT 1.23** and should be migrated to `load_config` for better compatibility with future releases.
 
 | Deprecated Provider Option | `load_config` Equivalent | Recommended Migration |
 |---------------------------|------------------------|----------------------|
````
````diff
@@ -147,7 +147,7 @@ Runs the same model on multiple devices in parallel to improve device utilizatio
 ---
 
 ### `precision`
-**DEPRECATED:** This option is deprecated and can be set via `load_config` using the `INFERENCE_PRECISION_HINT` property.
+**DEPRECATED:** This option is deprecated since OpenVINO 2025.3/ORT 1.23 and can be set via `load_config` using the `INFERENCE_PRECISION_HINT` property.
 - Controls numerical precision during inference, balancing **performance** and **accuracy**.
 
 **Precision Support on Devices:**
````
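The deprecated `precision` option in this hunk maps directly onto `load_config`. A minimal migration sketch, assuming a GPU target that previously requested FP16 inference (the old-style option values and the model path are illustrative, not taken from this diff):

```python
import json

# Deprecated style, pre-ORT 1.23 (for comparison only):
#   options = {"device_type": "GPU", "precision": "FP16"}
# load_config equivalent via the INFERENCE_PRECISION_HINT property:
config = {"GPU": {"INFERENCE_PRECISION_HINT": "f16"}}
options = {"device_type": "GPU", "load_config": json.dumps(config)}

# `options` would then be passed as OpenVINOExecutionProvider
# provider options when creating an InferenceSession, e.g.:
#   session = ort.InferenceSession("model.onnx",
#       providers=[("OpenVINOExecutionProvider", options)])
```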
````diff
@@ -167,7 +167,7 @@ Runs the same model on multiple devices in parallel to improve device utilizatio
 ---
 ### `num_of_threads` & `num_streams`
 
-**DEPRECATED:** These options are deprecated and can be set via `load_config` using the `INFERENCE_NUM_THREADS` and `NUM_STREAMS` properties respectively.
+**DEPRECATED:** These options are deprecated since OpenVINO 2025.3/ORT 1.23 and can be set via `load_config` using the `INFERENCE_NUM_THREADS` and `NUM_STREAMS` properties respectively.
 
 **Multi-Threading**
 
````
````diff
@@ -185,31 +185,33 @@ Manages parallel inference streams for throughput optimization (default: `1` for
 
 ### `cache_dir`
 
-**DEPRECATED:** This option is deprecated and can be set via `load_config` using the `CACHE_DIR` property.
+**DEPRECATED:** This option is deprecated since OpenVINO 2025.3/ORT 1.23 and can be set via `load_config` using the `CACHE_DIR` property.
 
-Enables model caching to significantly reduce subsequent load times. Supports CPU, NPU, and GPU devices with kernel caching on iGPU/dGPU.
+
+Enables model caching to significantly reduce subsequent load times. Supports CPU, NPU, and GPU devices with kernel caching on iGPU/dGPU.
 
 **Benefits**
-- Saves compiled models and `cl_cache` files for dynamic shapes
+- Saves compiled models for faster subsequent loading
 - Eliminates recompilation overhead on subsequent runs
-- Particularly useful for complex models and frequent application restarts
-
+- Particularly useful for optimizing application startup latencies, especially for complex models
 
 ---
 
 ### `load_config`
 
-**Recommended Configuration Method** for setting OpenVINO runtime properties. Provides direct access to OpenVINO properties through a JSON configuration file during runtime.
+**Recommended Configuration Method** for setting OpenVINO runtime properties. Provides direct access to OpenVINO properties through a JSON String during runtime.
 
 #### Overview
 
-`load_config` enables fine-grained control over OpenVINO inference behavior by loading properties from a JSON file. This is the **preferred method** for configuring advanced OpenVINO features, offering:
+`load_config` enables fine-grained control over OpenVINO inference behavior by loading properties from a JSON String. This is the **preferred method** for configuring advanced OpenVINO features, offering:
 
 - Direct access to OpenVINO runtime properties
 - Device-specific configuration
 - Better compatibility with future OpenVINO releases
 - No property name translation required
 
+
+
 #### JSON Configuration Format
 ```json
 {
````
````diff
@@ -219,6 +221,33 @@ Enables model caching to significantly reduce subsequent load times. Supports CP
 }
 ```
 
+`load_config` now supports nested JSON objects up to **8 levels deep** for complex device configurations.
+
+**Maximum Nesting:** 8 levels deep.
+
+**Example: Multi-Level Nested Configuration**
+```python
+import onnxruntime as ort
+import json
+
+# Complex nested configuration for AUTO device
+config = {
+    "AUTO": {
+        "PERFORMANCE_HINT": "THROUGHPUT",
+        "DEVICE_PROPERTIES": {
+            "CPU": {
+                "PERFORMANCE_HINT": "LATENCY",
+                "NUM_STREAMS": "3"
+            },
+            "GPU": {
+                "EXECUTION_MODE_HINT": "ACCURACY",
+                "PERFORMANCE_HINT": "LATENCY"
+            }
+        }
+    }
+}
+```
+
 **Supported Device Names:**
 - `"CPU"` - Intel CPU
 - `"GPU"` - Intel integrated/discrete GPU
````
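The nested configuration added in this hunk is an ordinary Python dict serialized with `json.dumps`. A small sketch that builds the same config and checks it stays within the documented 8-level nesting limit (the `depth` helper is illustrative, not part of any API):

```python
import json

# Same nested AUTO configuration as in the diff above.
config = {
    "AUTO": {
        "PERFORMANCE_HINT": "THROUGHPUT",
        "DEVICE_PROPERTIES": {
            "CPU": {"PERFORMANCE_HINT": "LATENCY", "NUM_STREAMS": "3"},
            "GPU": {"EXECUTION_MODE_HINT": "ACCURACY", "PERFORMANCE_HINT": "LATENCY"},
        },
    }
}

def depth(node):
    # Nesting depth of nested dicts; must stay within the 8-level limit.
    if not isinstance(node, dict) or not node:
        return 0
    return 1 + max(depth(v) for v in node.values())

# This config nests 4 levels deep, well under the 8-level limit.
load_config = json.dumps(config)
```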
````diff
@@ -327,7 +356,7 @@ Property keys used in `load_config` JSON must match the string literal defined i
 
 ### `enable_qdq_optimizer`
 
-**DEPRECATED:** This option is deprecated and can be set via `load_config` using the `NPU_QDQ_OPTIMIZATION` property.
+**DEPRECATED:** This option is deprecated since OpenVINO 2025.3/ORT 1.23 and can be set via `load_config` using the `NPU_QDQ_OPTIMIZATION` property.
 
 NPU-specific optimization for Quantize-Dequantize (QDQ) operations in the inference graph. This optimizer enhances ORT quantized models by:
 
````
````diff
@@ -362,7 +391,7 @@ This configuration is required for optimal NPU memory allocation and management.
 
 ### `model_priority`
 
-**DEPRECATED:** This option is deprecated and can be set via `load_config` using the `MODEL_PRIORITY` property.
+**DEPRECATED:** This option is deprecated since OpenVINO 2025.3/ORT 1.23 and can be set via `load_config` using the `MODEL_PRIORITY` property.
 
 Configures resource allocation priority for multi-model deployment scenarios.
 
````
````diff
@@ -401,31 +430,25 @@ Configures resource allocation priority for multi-model deployment scenarios.
 
 `input_image[NCHW],output_tensor[NC]`
 
+
 ---
 
 ## Examples
-
 ### Python
-
-#### Using load_config with JSON file
+#### Using load_config with JSON string
 ```python
 import onnxruntime as ort
 import json
 
-# Create config file
+# Create config
 config = {
     "AUTO": {
         "PERFORMANCE_HINT": "THROUGHPUT",
-        "PERF_COUNT": "NO",
-        "DEVICE_PROPERTIES": "{CPU:{INFERENCE_PRECISION_HINT:f32,NUM_STREAMS:3},GPU:{INFERENCE_PRECISION_HINT:f32,NUM_STREAMS:5}}"
+        "DEVICE_PROPERTIES": "{GPU:{EXECUTION_MODE_HINT:ACCURACY,PERFORMANCE_HINT:LATENCY}}"
     }
 }
-
-with open("ov_config.json", "w") as f:
-    json.dump(config, f)
-
 # Use config with session
-options = {"device_type": "AUTO", "load_config": "ov_config.json"}
+options = {"device_type": "AUTO", "load_config": json.dumps(config)}
 session = ort.InferenceSession("model.onnx",
                                providers=[("OpenVINOExecutionProvider", options)])
 ```
````
````diff
@@ -438,20 +461,14 @@ import json
 # Create CPU config
 config = {
     "CPU": {
-        "INFERENCE_PRECISION_HINT": "f32",
-        "NUM_STREAMS": "3",
-        "INFERENCE_NUM_THREADS": "8"
+        "PERFORMANCE_HINT": "LATENCY",
+        "NUM_STREAMS": "1"
     }
 }
-
-with open("cpu_config.json", "w") as f:
-    json.dump(config, f)
-
-options = {"device_type": "CPU", "load_config": "cpu_config.json"}
+options = {"device_type": "CPU", "load_config": json.dumps(config)}
 session = ort.InferenceSession("model.onnx",
                                providers=[("OpenVINOExecutionProvider", options)])
 ```
-
 #### Using load_config for GPU
 ```python
 import onnxruntime as ort
````
````diff
@@ -460,20 +477,16 @@ import json
 # Create GPU config with caching
 config = {
     "GPU": {
-        "INFERENCE_PRECISION_HINT": "f16",
+        "EXECUTION_MODE_HINT": "ACCURACY",
         "CACHE_DIR": "./model_cache",
        "PERFORMANCE_HINT": "LATENCY"
     }
 }
-
-with open("gpu_config.json", "w") as f:
-    json.dump(config, f)
-
-options = {"device_type": "GPU", "load_config": "gpu_config.json"}
+options = {"device_type": "GPU", "load_config": json.dumps(config)}
 session = ort.InferenceSession("model.onnx",
                                providers=[("OpenVINOExecutionProvider", options)])
-```
 
+```
 
 ---
 ### Python API
````
````diff
@@ -819,4 +832,4 @@ In order to showcase what you can do with the OpenVINO™ Execution Provider for
 
 [Tutorial: Using OpenVINO™ Execution Provider for ONNX Runtime Python Wheel Packages](https://www.intel.com/content/www/us/en/artificial-intelligence/posts/openvino-execution-provider-for-onnx-runtime.html)
 
----
+---
````
