Inference

Introduction

Spox attempts to perform inference on operators immediately as they are constructed in Python. This includes two main mechanisms: type (and shape) inference, and value propagation.

Both are done on a best-effort basis and rely primarily on the ONNX implementations. Some type inference and value propagation routines may be patched in the generated opset; such a patch is a Python implementation within Spox. Patches attempt to follow the standard, but may be imperfect and have bugs (as can the standard ONNX implementations).

For inference to be useful throughout graph construction, Spox expects type information to be carried in Vars through the entire graph as it is built. This enables raising Python exceptions as early as possible when type inference fails, makes debugging easier, and reduces the need to specify redundant type information in Python.

The general mechanism is the following: a single standard node is built into a singleton (single-node) model, represented as an onnx.ModelProto. This model is then passed into the onnx routines. Afterwards, the resulting information is extracted and converted back into Spox types and values.

[1]:

import warnings
import numpy as np
import spox
import spox.opset.ai.onnx.v17 as op

Type inference

Type and shape inference is run via onnx.shape_inference.infer_shapes on the singleton model. Types are converted to and from the ONNX representation internally. Some operators may have missing or incomplete type inference implementations (especially in ML operators of ai.onnx.ml), and may have a patch implemented in Spox.

A Var’s type can be accessed with the Var.type: Optional[Type] attribute. It is however recommended to use the checked equivalents: Var.unwrap_type() -> Type or Var.unwrap_tensor() -> Tensor, which work better with type checkers.

Patches can currently be found as an infer_output_types implementation in the respective Node class.

[2]:

x = spox.argument(spox.Tensor(float, ('N',)))
y = spox.argument(spox.Tensor(float, ()))
z = spox.argument(spox.Tensor(int, ('N', 'M')))
w = spox.argument(spox.Tensor(int, (None,)))

[3]:

# Broadcasting of (N) and () into (N)
op.add(x, y)

[3]:

<Var from ai.onnx@14::Add->C of float64[N]>

[4]:

# Casting element type with a Cast
op.cast(z, to=str)

[4]:

<Var from ai.onnx@13::Cast->output of str[N][M]>

[5]:

# Reshape of a matrix into a vector
op.reshape(z, op.constant(value_ints=[-1]))

[5]:

<Var from ai.onnx@14::Reshape->reshaped of int64[?]>

[6]:

# Using a broadcast of (1, N) and (N, 1) into (N, N)
op.add(
    op.unsqueeze(x, op.constant(value_ints=[0])),
    op.unsqueeze(x, op.constant(value_ints=[1]))
)

[6]:

<Var from ai.onnx@14::Add->C of float64[N][N]>

[7]:

# Failing shape inference raises an exception
try:
    print(op.add(x, z))  # mismatched types: float64, int64
except Exception as e:
    print(f"{type(e).__name__}: {e}")
else:
    assert False

InferenceError: [ShapeInferenceError] (op_type:Add, node name: _this_): B has inconsistent type tensor(int64) -- for Add: inputs [A: float64[N], B: int64[N][M]]

[8]:

# Missing or unresolvable shape inference warns
# w is a dynamic-size vector => can't determine reshaped rank
with warnings.catch_warnings():
    # Turn warnings into exceptions only within this block
    warnings.simplefilter("error")
    try:
        print(op.reshape(x, w))
    except Exception as e:
        print(f"{type(e).__name__}: {e}")
    else:
        assert False

InferenceWarning: Output type for variable reshaped of ai.onnx@14::Reshape was not concrete - ValueError: Tensor float64[...] does not specify the shape -- in ?.

[9]:

# Access the type directly (might be None if type inference failed)
op.add(x, y).type

[9]:

Tensor(dtype=float64, shape=('N',))

[10]:

# Access the type, asserting it must be a tensor
op.add(x, y).unwrap_tensor()

[10]:

Tensor(dtype=float64, shape=('N',))

Spox does not have a facility for extra type information in Var type hints. The ONNX type system, and particularly tensor shapes, is not really expressible in type hints. This may be reconsidered in the future if libraries like numpy start supporting similar features and the Python/ONNX ecosystem develops.

Value propagation

Value propagation in Spox is run via the onnx.reference module (added in ONNX 1.13) - the reference runtime implementation in Python. It replicates the partial data propagation mechanism of type inference in ONNX, which is essentially constant folding.

In Spox, a Var may have a constant value associated with it. If all input variables of a standard operator have a value, propagation will be attempted by running the singleton model through the reference implementation.

The most common instance of value propagation is in the Reshape operator, where a constant target shape allows determining the resulting shape. If the target shape were not known, even the rank of the output shape could not be determined.

Value propagation can also be thought of as eager execution mode within Spox, and is well-suited for experimenting with (standard) operators.

Currently, there isn’t a standard way of accessing the propagated value, but it is shown when a Var is printed. Value propagation usually isn’t patched, as in most cases it is not critical to type inference. Where a patch is needed, it is implemented by overriding the propagate_values method of the Node class.

[11]:

# Trivial reshape example
op.reshape(x, op.constant(value_ints=[1, 2, 3]))

[11]:

<Var from ai.onnx@14::Reshape->reshaped of float64[1][2][3]>

[12]:

s = op.add(
    op.mul(op.constant(value_ints=[1, 2]), op.constant(value_int=2)),
    op.constant(value_int=1)
)  # [1, 2] * 2 + 1 = [3, 5]
# Reshape with a basic constant fold
op.reshape(x, s)

[12]:

<Var from ai.onnx@14::Reshape->reshaped of float64[3][5]>

Constant variable values can also be seen in the string representation. Currently, there isn’t a stable way of accessing them programmatically - the internal field is _value and it can be converted to an ORT-like format with _get_value(). The representation isn’t currently publicly specified.

[13]:

def const(xs):
    return op.constant(value=np.array(xs))

[14]:

# Trivial add
op.add(
    const(1),
    const([1, 2, 3])
)

[14]:

<Var from ai.onnx@14::Add->C of int64[3] = [2 3 4]>

[15]:

# Reshape
mat = op.reshape(
    const([1., 2., 3., 4.]),
    const([2, 2])
)
mat

[15]:

<Var from ai.onnx@14::Reshape->reshaped of float64[2][2] = [[1. 2.]
 [3. 4.]]>

[16]:

# Composing value propagation
op.matmul(mat, mat)

[16]:

<Var from ai.onnx@13::MatMul->Y of float64[2][2] = [[ 7. 10.]
 [15. 22.]]>

[17]:

# Unstable! Programmatic access
v = op.add(
    const(1),
    const([1, 2, 3])
)
np.testing.assert_allclose(v._get_value(), np.array([2, 3, 4]))
v

[17]:

<Var from ai.onnx@14::Add->C of int64[3] = [2 3 4]>

Testing an ML operator

ML operators (ai.onnx.ml) can be used similarly to standard ones (ai.onnx). Spox ships with pre-generated ML domain opset modules which you can find under spox.opset.ai.onnx.ml.v3.

With value propagation and a supporting backend, you can test an ML operator and run it on input without having to leave Spox:

[18]:

import spox.opset.ai.onnx.ml.v3 as ml
import spox._future
# Currently, you need ORT to run ML operators
spox._future.set_value_prop_backend(spox._future.ValuePropBackend.ONNXRUNTIME)

[19]:

ml.linear_regressor(
    const(np.array([[1], [2], [3]], dtype=np.float32)),
    coefficients=[3],
    intercepts=[1]
)

[19]:

<Var from ai.onnx.ml@1::LinearRegressor->Y of float32[3][1] = [[ 4.]
 [ 7.]
 [10.]]>

[20]:

ml.label_encoder(
    const([0, 1, 2]),
    keys_int64s=[1, 2],
    values_strings=["one", "two"],
    default_string="?"
)

[20]:

<Var from ai.onnx.ml@2::LabelEncoder->Y of str[3] = ['?' 'one' 'two']>