AI Deployment on Intel Movidius / OpenVINO
Intel OpenVINO — toolkit for optimizing and deploying ML models on Intel hardware: CPU (x86), integrated/discrete GPU (Intel Iris Xe/Arc), NPU (the Neural Processing Unit in Core Ultra), and VPU (Intel Movidius). Roughly Intel's counterpart to NVIDIA's TensorRT.
OpenVINO Toolkit
Model conversion → IR (Intermediate Representation): models from TensorFlow, PyTorch (via ONNX or direct conversion), ONNX, and PaddlePaddle are converted to OpenVINO IR (.xml topology + .bin weights). The legacy Model Optimizer (mo) is deprecated in favor of openvino.convert_model / ovc. INT8 calibration via NNCF (the older Post-Training Optimization Tool, POT, is deprecated in favor of NNCF).
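Conceptually, the INT8 calibration step boils down to choosing a scale per tensor from calibration data. A minimal NumPy sketch of symmetric post-training quantization (illustrative of the math only, not the NNCF API):

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor INT8 quantization: scale derived from the calibration max."""
    scale = np.abs(x).max() / 127.0          # map observed range onto [-127, 127]
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale      # recover approximate FP32 values

x = np.array([0.5, -1.27, 0.254], dtype=np.float32)   # stand-in for calibration activations
q, scale = quantize_int8(x)
x_hat = dequantize(q, scale)
# per-element quantization error is bounded by ~scale/2
```

Real calibration (POT/NNCF) refines this with per-channel scales, percentile-based range clipping, and accuracy-aware tuning, but the scale/round/clip core is the same.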
Inference Engine (OpenVINO Runtime):
from openvino.runtime import Core
import numpy as np
core = Core()
compiled = core.compile_model("model.xml", "NPU")  # device: "CPU", "GPU", "NPU", or "AUTO"
result = compiled(np.zeros((1, 3, 224, 224), dtype=np.float32))  # input shape is model-specific
Intel Neural Processing Unit (NPU)
Intel Core Ultra (Meteor Lake, Arrow Lake) contains embedded NPU:
- Core Ultra 5/7 125H: ~10 TOPS NPU
- Core Ultra 9 185H: ~11 TOPS NPU
- Core Ultra 200V: ~48 TOPS NPU
Ideal for: always-on AI tasks (face detection, keyword spotting) with minimal power.
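These TOPS figures give a rough capacity ceiling: given a model's per-inference compute in GOPs, the rating bounds throughput from above. A back-of-envelope sketch (the 20% utilization and 8 GOP model size below are illustrative assumptions, not measured numbers):

```python
def max_inferences_per_sec(npu_tops: float, model_gops: float, utilization: float = 0.2) -> float:
    """Theoretical throughput ceiling: sustained ops/s divided by ops per inference."""
    effective_ops = npu_tops * 1e12 * utilization   # sustained ops/s at assumed utilization
    return effective_ops / (model_gops * 1e9)

# e.g. a hypothetical ~8 GOP detection model on a ~10 TOPS NPU at 20% utilization
print(round(max_inferences_per_sec(10, 8)))  # -> 250
```

Real throughput depends heavily on memory bandwidth, operator coverage on the NPU, and precision, so treat this only as an upper bound for feasibility checks.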
Intel Movidius VPU
Myriad X (in the Intel Neural Compute Stick 2, now discontinued): ~4 TOPS total compute (roughly 1 TOPS dedicated to DNN inference), USB-connected. Competitor to the Google Coral USB Accelerator. Note: OpenVINO 2022.3 LTS was the last release line with Myriad/NCS2 support.
Application
Edge servers on Intel Xeon, industrial PCs on Core i5/i7, edge gateways with Intel Atom. OpenVINO Model Server (OVMS) for production serving, exposing gRPC and REST APIs compatible with the TensorFlow Serving and KServe protocols.
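OVMS's REST endpoint accepts KServe v2-style JSON. A small helper that builds such a request body (the input name, tensor shape, and endpoint URL below are placeholders; the real layout depends on the deployed model):

```python
import json

def build_kserve_request(input_name: str, data: list, shape, dtype: str = "FP32") -> str:
    """Build a KServe v2 inference request body, as accepted by OVMS's REST API."""
    payload = {
        "inputs": [{
            "name": input_name,      # must match the model's input tensor name
            "shape": list(shape),
            "datatype": dtype,       # KServe v2 datatype string, e.g. "FP32", "INT8"
            "data": data,            # flattened tensor values
        }]
    }
    return json.dumps(payload)

# POST this to http://<ovms-host>:8000/v2/models/<model_name>/infer (placeholder URL)
body = build_kserve_request("input0", [0.1, 0.2, 0.3, 0.4], (1, 4))
```

For higher-throughput clients, the gRPC API with the same KServe v2 protocol avoids the JSON serialization overhead.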