{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Converters"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Spox does not directly offer any _ONNX converters_ (utilities for translating ML models into ONNX), but it can easily be used to implement a _converter protocol_.\n",
    "This notebook walks through one way of doing so.\n",
    "In general, operations from libraries like `numpy` or deep learning frameworks are the easiest to convert, since ONNX follows similar principles."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "from typing import Dict\n",
    "import onnx\n",
    "import onnxruntime\n",
    "import numpy as np\n",
    "from spox import argument, build, Tensor, Var\n",
    "import spox.opset.ai.onnx.v17 as op\n",
    "\n",
    "\n",
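    "def run(model: onnx.ModelProto, **kwargs) -> list[np.ndarray]:\n",
    "    \"\"\"Run ``model`` in onnxruntime, feeding ``kwargs`` as named inputs.\"\"\"\n",
    "    return onnxruntime.InferenceSession(model.SerializeToString()).run(\n",
    "        None,  # None requests all model outputs\n",
    "        {k: np.array(v) for k, v in kwargs.items()}\n",
    "    )"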
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Functions"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We'll start with a simple conversion of Python functions on `numpy.ndarray`s into Spox equivalents on `Var`s (of tensors).\n",
    "\n",
    "Let's define functions computing means on a pair of tensors:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "def arithmetic_mean(a: np.ndarray, b: np.ndarray) -> np.ndarray:\n",
    "    return np.divide(np.add(a, b), 2)\n",
    "\n",
    "\n",
    "def geometric_mean(a: np.ndarray, b: np.ndarray) -> np.ndarray:\n",
    "    return np.sqrt(np.multiply(a, b))\n",
    "\n",
    "\n",
    "def harmonic_mean(a: np.ndarray, b: np.ndarray) -> np.ndarray:\n",
    "    return np.divide(2., np.add(\n",
    "        np.reciprocal(a),\n",
    "        np.reciprocal(b)\n",
    "    ))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can now define equivalents in Spox. We'll follow a _contract_ stating that every argument and result of type `numpy.ndarray` becomes a `Var`, which is expected to hold a tensor."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "def spox_arithmetic_mean(a: Var, b: Var) -> Var:\n",
    "    return op.div(op.add(a, b), op.constant(value_float=2.))\n",
    "\n",
    "\n",
    "def spox_geometric_mean(a: Var, b: Var) -> Var:\n",
    "    return op.sqrt(op.mul(a, b))\n",
    "\n",
    "\n",
    "def spox_harmonic_mean(a: Var, b: Var) -> Var:\n",
    "    return op.div(op.constant(value_float=2.), op.add(\n",
    "        op.reciprocal(a),\n",
    "        op.reciprocal(b)\n",
    "    ))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Estimators"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's also consider an `sklearn`-like estimator on 'dataframes' (dictionaries of arrays)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "class PairwiseMeans:\n",
    "    kind: str  # 'arithmetic', 'geometric', or 'harmonic'\n",
    "    first: str\n",
    "    second: str  # name of first and second 'column' to find the mean of\n",
    "\n",
    "    def __init__(self, kind: str, first: str, second: str):\n",
    "        self.kind = kind\n",
    "        self.first = first\n",
    "        self.second = second\n",
    "\n",
    "    def predict(self, data: Dict[str, np.ndarray]) -> np.ndarray:\n",
    "        means = {\n",
    "            'arithmetic': arithmetic_mean,\n",
    "            'geometric': geometric_mean,\n",
    "            'harmonic': harmonic_mean,\n",
    "        }\n",
    "        return means[self.kind](data[self.first], data[self.second])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The equivalent in Spox could be a class wrapping a `PairwiseMeans` instance: it consumes the estimator and implements the same interface, but operates on `Var`s instead of `numpy.ndarray`s."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "class SpoxPairwiseMeans:\n",
    "    estimator: PairwiseMeans\n",
    "\n",
    "    def __init__(self, estimator: PairwiseMeans):\n",
    "        self.estimator = estimator\n",
    "\n",
    "    def predict(self, data: Dict[str, Var]) -> Var:\n",
    "        means = {\n",
    "            'arithmetic': spox_arithmetic_mean,\n",
    "            'geometric': spox_geometric_mean,\n",
    "            'harmonic': spox_harmonic_mean,\n",
    "        }\n",
    "        return means[self.estimator.kind](data[self.estimator.first], data[self.estimator.second])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Converter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To provide a simple API for conversion, we can define a `convert` function that handles every supported conversion. The mapping could instead be stored in, for example, a dictionary, which would make it easier to extend at runtime."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "def convert(obj):\n",
    "    if obj is arithmetic_mean:\n",
    "        return spox_arithmetic_mean\n",
    "    elif obj is geometric_mean:\n",
    "        return spox_geometric_mean\n",
    "    elif obj is harmonic_mean:\n",
    "        return spox_harmonic_mean\n",
    "    elif type(obj) is PairwiseMeans:\n",
    "        return SpoxPairwiseMeans(obj)\n",
    "    raise ValueError(f\"No converter for: {obj}\")"
   ]
  },
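  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a minimal sketch of the dictionary-based alternative, the same dispatch could be written with two lookup tables. The names `FUNCTION_CONVERTERS`, `ESTIMATOR_CONVERTERS` and `convert_with_registry` are hypothetical and not part of Spox; the rest of this notebook keeps using `convert`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Hypothetical registries (illustrative names, not a Spox API).\n",
    "# Functions are looked up by identity, estimators by their exact type.\n",
    "FUNCTION_CONVERTERS = {\n",
    "    arithmetic_mean: spox_arithmetic_mean,\n",
    "    geometric_mean: spox_geometric_mean,\n",
    "    harmonic_mean: spox_harmonic_mean,\n",
    "}\n",
    "ESTIMATOR_CONVERTERS = {PairwiseMeans: SpoxPairwiseMeans}\n",
    "\n",
    "\n",
    "def convert_with_registry(obj):\n",
    "    # Estimator instances are wrapped by the class registered for their type.\n",
    "    wrapper = ESTIMATOR_CONVERTERS.get(type(obj))\n",
    "    if wrapper is not None:\n",
    "        return wrapper(obj)\n",
    "    try:\n",
    "        return FUNCTION_CONVERTERS[obj]\n",
    "    except (KeyError, TypeError):  # unknown or unhashable object\n",
    "        raise ValueError(f\"No converter for: {obj}\") from None\n",
    "\n",
    "\n",
    "# Supporting a new function or estimator only requires a new dictionary entry.\n",
    "assert convert_with_registry(harmonic_mean) is spox_harmonic_mean\n",
    "assert isinstance(\n",
    "    convert_with_registry(PairwiseMeans('harmonic', 'x', 'z')), SpoxPairwiseMeans\n",
    ")"
   ]
  },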
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To build a model, we have to construct the arguments and pass them, together with the result, to `spox.build`. This step could be abstracted away by using `inspect.signature` and by extracting the input types from example input data, but we won't cover that here."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "pairwise_means = PairwiseMeans('harmonic', 'x', 'z')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "vec = Tensor(np.float32, ('N',))\n",
    "x, y, z = argument(vec), argument(vec), argument(vec)\n",
    "\n",
    "\n",
    "def simple_convert_build(fun):\n",
    "    return build({'x': x, 'y': y}, {'r': convert(fun)(x, y)})\n",
    "\n",
    "\n",
    "arithmetic_mean_model = simple_convert_build(arithmetic_mean)\n",
    "geometric_mean_model = simple_convert_build(geometric_mean)\n",
    "harmonic_mean_model = simple_convert_build(harmonic_mean)\n",
    "pairwise_means_model = build(\n",
    "    {'x': x, 'y': y, 'z': z},\n",
    "    {'r': convert(pairwise_means).predict({'x': x, 'y': y, 'z': z})}\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Checking equivalence"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can now test equivalence by running the models in `onnxruntime` with the previously defined `run` utility."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "x0 = np.array([1, 2, 3], dtype=np.float32)\n",
    "y0 = np.array([4, 6, 5], dtype=np.float32)\n",
    "z0 = np.array([-2, -1, -0.5], dtype=np.float32)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "An example run looks like this. Note that it does not go through Spox, since at this point `arithmetic_mean_model` is an `onnx.ModelProto`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    },
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(array([2.5, 4. , 4. ], dtype=float32), array([2.5, 4. , 4. ], dtype=float32))"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arithmetic_mean(x0, y0), run(arithmetic_mean_model, x=x0, y=y0)[0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    },
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[2.5 4.  4. ] [2.5 4.  4. ]\n",
      "[2.        3.4641016 3.8729835] [2.        3.4641016 3.8729835]\n",
      "[1.6       3.        3.7499998] [1.6       3.        3.7499998]\n"
     ]
    }
   ],
   "source": [
    "tests = [\n",
    "    (arithmetic_mean, arithmetic_mean_model),\n",
    "    (geometric_mean, geometric_mean_model),\n",
    "    (harmonic_mean, harmonic_mean_model),\n",
    "]\n",
    "for py_function, onnx_model in tests:\n",
    "    actual = run(onnx_model, x=x0, y=y0)[0]\n",
    "    desired = py_function(x0, y0)\n",
    "    print(actual, desired)\n",
    "    np.testing.assert_allclose(actual, desired)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    },
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[ 4.  -4.  -1.2] [ 4.  -4.  -1.2]\n"
     ]
    }
   ],
   "source": [
    "actual = run(pairwise_means_model, x=x0, y=y0, z=z0)[0]\n",
    "desired = pairwise_means.predict({'x': x0, 'y': y0, 'z': z0})\n",
    "print(actual, desired)\n",
    "np.testing.assert_allclose(actual, desired)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}