{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Hierarchical optimization with relative data\n",
    "\n",
    "In this notebook we illustrate how to do hierarchical optimization in pyPESTO.\n",
    "\n",
    "A frequent problem occuring in parameter estimation for dynamical systems is that the objective function takes a form\n",
    "\n",
    "$$ J(\\theta, s, b, \\sigma^2) = \\sum_i \\left[\\log(2\\pi\\sigma_i^2) + \\frac{(\\bar y_i - (s_iy_i(\\theta) + b_i))^2}{\\sigma_i^2}\\right] $$\n",
    "\n",
    "with data $\\bar y_i$, parameters $\\eta = (\\theta,s,b,\\sigma^2)$, and ODE simulations $y(\\theta)$. Here, we consider a Gaussian noise model, but also others (e.g. Laplace) are possible. Here we want to leverage that the parameter vector $\\eta$ can be split into \"dynamic\" parameters $\\theta$ that are required for simulating the ODE, and \"static\" parameters $s,b,\\sigma^2$ that are only required to scale the simulations and formulate the objective function. As simulating the ODE usually dominates the computational complexity, one can exploit this separation of parameters by formulating an outer optimization problem in which $\\theta$ is optimized, and an inner optimization problem in which $s,b,\\sigma^2$ are optimized conditioned on $\\theta$. This approach has shown to have superior performance to the classic approach of jointly optimizing $\\eta$.\n",
    "\n",
    "In pyPESTO, we have implemented the algorithms developed in [Loos et al.; Hierarchical optimization for the efficient parametrization of ODE models; Bioinformatics 2018](https://academic.oup.com/bioinformatics/article/34/24/4266/5053308) (covering Gaussian and Laplace noise models with gradients computed via forward sensitivity analysis) and [Schmiester et al.; Efficient parameterization of large-scale dynamic models based on relative measurements; Bioinformatics 2019](https://academic.oup.com/bioinformatics/article/36/2/594/5538985) (extending to offset parameters and adjoint sensitivity analysis).\n",
    "\n",
    "However, the current implementation only supports:\n",
    "- Gaussian (normal) noise distributions\n",
    "- unbounded inner noise parameters $\\sigma$\n",
    "- linearly-scaled inner parameters $\\eta$\n",
    "\n",
    "Scaling $s$ and offset $b$ parameters can be bounded arbitrarily.\n",
    "\n",
    "In the following we will demonstrate how to use hierarchical optimization, and we will compare optimization results for the following scenarios:\n",
    "\n",
    "* Standard non-hierarchical gradient-based optimization with adjoint sensitivity analysis\n",
    "* Hierarchical gradient-based optimization with analytical inner solver and adjoint sensitivity analysis\n",
    "* Hierarchical gradient-based optimization with analytical inner solver and forward sensitivity analysis\n",
    "* Hierarchical gradient-based optimization with numerical inner solver and adjoint sensitivity analysis"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import time\n",
    "\n",
    "import fides\n",
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "from amici.sim.sundials import SensitivityMethod\n",
    "from matplotlib.colors import to_rgba\n",
    "\n",
    "import pypesto\n",
    "from pypesto.hierarchical.relative import (\n",
    "    AnalyticalInnerSolver,\n",
    "    NumericalInnerSolver,\n",
    ")\n",
    "from pypesto.optimize.options import OptimizeOptions\n",
    "from pypesto.petab import PetabImporter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Note on inclusion of additional data types:\n",
    "It is possible to include observables with different types of data to the same `petab_problem`. Refer to the notebooks on using [semiquantitative data](semiquantitative_data.ipynb), [ordinal data](ordinal_data.ipynb) and [censored data](censored_data.ipynb) for details on integration of other data types."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Problem specification\n",
    "\n",
    "We consider a version of the [Boehm et al.; Journal of Proeome Research 2014] model, modified to include scalings $s$, offsets $b$, and noise parameters $\\sigma^2$.\n",
    "\n",
    "NB: We load two PEtab problems here, because the employed standard and hierarchical optimization methods have different expectations for parameter bounds. The difference between the two PEtab problems is that the `corrected_bounds` version estimates inner noise parameters ($\\sigma$) in $[0, \\infty]$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# get the PEtab problem\n",
    "# requires installation of\n",
    "from pypesto.testing.examples import (\n",
    "    get_Boehm_JProteomeRes2014_hierarchical_petab,\n",
    "    get_Boehm_JProteomeRes2014_hierarchical_petab_corrected_bounds,\n",
    ")\n",
    "\n",
    "petab_problem = get_Boehm_JProteomeRes2014_hierarchical_petab()\n",
    "petab_problem_hierarchical = (\n",
    "    get_Boehm_JProteomeRes2014_hierarchical_petab_corrected_bounds()\n",
    ")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To convert a non-hierarchical PEtab model to a hierarchical one, the observable_df, the measurement_df and the parameter_df have to be changed accordingly.\n",
    "The PEtab **observable table contains** placeholders for scaling parameters $s$ (`observableParameter1_{pSTAT5A_rel,pSTAT5B_rel,rSTAT5A_rel}`), offsets $b$ (`observableParameter2_{pSTAT5A_rel,pSTAT5B_rel,rSTAT5A_rel}`), and noise parameters $\\sigma^2$ (`noiseParameter1_{pSTAT5A_rel,pSTAT5B_rel,rSTAT5A_rel}`) that are overridden by the `{observable,noise}Parameters` column in the **measurement table**.\n",
    "\n",
    "N.B.: in general, the inner parameters can appear in observable formulae directly. For example, the first observable formula in this table could be changed from `observableParameter2_pSTAT5A_rel + observableParameter1_pSTAT5A_rel * (100 * pApB + 200 * pApA * specC17) / (pApB + STAT5A * specC17 + 2 * pApA * specC17)` to `offset_pSTAT5A_rel + scaling_pSTAT5A_rel * (100 * pApB + 200 * pApA * specC17) / (pApB + STAT5A * specC17 + 2 * pApA * specC17)`."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Observable DF:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pandas import option_context\n",
    "\n",
    "with option_context(\"display.max_colwidth\", 400):\n",
    "    display(petab_problem_hierarchical.observable_df)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Measurement DF:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pandas import option_context\n",
    "\n",
    "with option_context(\"display.max_colwidth\", 400):\n",
    "    display(petab_problem_hierarchical.measurement_df)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Parameters to be optimized in the inner problem are specified via the PEtab parameter table by setting a value in the non-standard column `parameterType` (`offset` for offset parameters, `scaling` for scaling parameters, and `sigma` for sigma parameters). When using hierarchical optimization, the nine overriding parameters {offset,scaling,sd}_{pSTAT5A_rel,pSTAT5B_rel,rSTAT5A_rel} are to estimated in the inner problem."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "petab_problem_hierarchical.parameter_df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create a pypesto problem with hierarchical optimization (`problem`)\n",
    "importer = PetabImporter(\n",
    "    petab_problem_hierarchical,\n",
    "    hierarchical=True,\n",
    "    model_name=\"Boehm_JProteomeRes2014_hierarchical\",\n",
    ")\n",
    "problem = importer.create_problem(verbose=False)\n",
    "# set option to compute objective function gradients using adjoint sensitivity analysis\n",
    "problem.objective.amici_solver.set_sensitivity_method(\n",
    "    SensitivityMethod.adjoint\n",
    ")\n",
    "\n",
    "# ... and create another pypesto problem without hierarchical optimization (`problem2`)\n",
    "importer2 = PetabImporter(\n",
    "    petab_problem,\n",
    "    hierarchical=False,\n",
    "    model_name=\"Boehm_JProteomeRes2014_hierarchical\",\n",
    ")\n",
    "problem2 = importer2.create_problem(verbose=False)\n",
    "problem2.objective.amici_solver.set_sensitivity_method(\n",
    "    SensitivityMethod.adjoint\n",
    ")\n",
    "\n",
    "# Options for multi-start optimization\n",
    "minimize_kwargs = {\n",
    "    # number of starts for multi-start optimization\n",
    "    \"n_starts\": 3,\n",
    "    # number of processes for parallel multi-start optimization\n",
    "    \"engine\": pypesto.engine.MultiProcessEngine(n_procs=3),\n",
    "    # raise in case of failures\n",
    "    \"options\": OptimizeOptions(allow_failed_starts=False),\n",
    "    # use the Fides optimizer\n",
    "    \"optimizer\": pypesto.optimize.FidesOptimizer(\n",
    "        verbose=0, hessian_update=fides.BFGS()\n",
    "    ),\n",
    "}\n",
    "# Set the same starting points for the hierarchical and non-hierarchical problem\n",
    "startpoints = pypesto.startpoint.latin_hypercube(\n",
    "    n_starts=minimize_kwargs[\"n_starts\"],\n",
    "    lb=problem2.lb_full,\n",
    "    ub=problem2.ub_full,\n",
    ")\n",
    "# for the hierarchical problem, we only specify the outer parameters\n",
    "outer_indices = [problem2.x_names.index(x_id) for x_id in problem.x_names]\n",
    "problem.set_x_guesses(startpoints[:, outer_indices])\n",
    "problem2.set_x_guesses(startpoints)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Hierarchical optimization using analytical or numerical inner solver"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Run hierarchical optimization using NumericalInnerSolver\n",
    "start_time = time.time()\n",
    "problem.objective.calculator.inner_solver = NumericalInnerSolver(\n",
    "    minimize_kwargs={\"n_starts\": 1}\n",
    ")\n",
    "result_num = pypesto.optimize.minimize(problem, **minimize_kwargs)\n",
    "print(f\"{result_num.optimize_result.get_for_key('fval')=}\")\n",
    "time_num = time.time() - start_time\n",
    "print(f\"{time_num=}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Run hierarchical optimization using AnalyticalInnerSolver\n",
    "start_time = time.time()\n",
    "problem.objective.calculator.inner_solver = AnalyticalInnerSolver()\n",
    "result_ana = pypesto.optimize.minimize(problem, **minimize_kwargs)\n",
    "print(f\"{result_ana.optimize_result.get_for_key('fval')=}\")\n",
    "time_ana = time.time() - start_time\n",
    "print(f\"{time_ana=}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can visualize the estimated linear observable mapping of both the analytical and numerical approach using the `pypesto.visualize.observable_mapping.visualize_estimated_observable_mapping` routine. The routine plots all estimated linear or spline observable mappings of relative or semiquantitative observables, respectively."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "pypesto.visualize.observable_mapping.visualize_estimated_observable_mapping(\n",
    "    pypesto_result=result_num,\n",
    "    pypesto_problem=problem,\n",
    "    figsize=(7, 7),\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "pypesto.visualize.observable_mapping.visualize_estimated_observable_mapping(\n",
    "    pypesto_result=result_ana,\n",
    "    pypesto_problem=problem,\n",
    "    figsize=(7, 7),\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Waterfall plot - analytical vs numerical inner solver\n",
    "pypesto.visualize.waterfall(\n",
    "    [result_num, result_ana],\n",
    "    legends=[\"Numerical-Hierarchical\", \"Analytical-Hierarchical\"],\n",
    "    size=(15, 6),\n",
    "    order_by_id=True,\n",
    "    colors=np.array(list(map(to_rgba, (\"green\", \"purple\")))),\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Time comparison - analytical vs numerical inner solver\n",
    "ax = plt.bar(x=[0, 1], height=[time_ana, time_num], color=[\"purple\", \"green\"])\n",
    "ax = plt.gca()\n",
    "ax.set_xticks([0, 1])\n",
    "ax.set_xticklabels([\"Analytical-Hierarchical\", \"Numerical-Hierarchical\"])\n",
    "ax.set_ylabel(\"Computation time [s]\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Comparison of hierarchical and non-hierarchical optimization"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Run standard optimization\n",
    "start_time = time.time()\n",
    "result_ord = pypesto.optimize.minimize(problem2, **minimize_kwargs)\n",
    "print(f\"{result_ord.optimize_result.get_for_key('fval')=}\")\n",
    "time_ord = time.time() - start_time\n",
    "print(f\"{time_ord=}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Waterfall plot - hierarchical optimization with analytical inner solver vs standard optimization\n",
    "pypesto.visualize.waterfall(\n",
    "    [result_ana, result_ord],\n",
    "    legends=[\"Analytical-Hierarchical\", \"Non-Hierarchical\"],\n",
    "    order_by_id=True,\n",
    "    colors=np.array(list(map(to_rgba, (\"purple\", \"orange\")))),\n",
    "    size=(15, 6),\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Time comparison - hierarchical optimization with analytical inner solver vs standard optimization\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "ax = plt.bar(x=[0, 1], height=[time_ana, time_ord], color=[\"purple\", \"orange\"])\n",
    "ax = plt.gca()\n",
    "ax.set_xticks([0, 1])\n",
    "ax.set_xticklabels([\"Analytical-Hierarchical\", \"Non-Hierarchical\"])\n",
    "ax.set_ylabel(\"Computation time [s]\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Comparison of hierarchical and non-hierarchical optimization with adjoint and forward sensitivities"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Run hierarchical optimization with analytical inner solver and forward sensitivities\n",
    "start_time = time.time()\n",
    "problem.objective.calculator.inner_solver = AnalyticalInnerSolver()\n",
    "problem.objective.amici_solver.set_sensitivity_method(\n",
    "    SensitivityMethod.forward\n",
    ")\n",
    "result_ana_fw = pypesto.optimize.minimize(problem, **minimize_kwargs)\n",
    "print(f\"{result_ana_fw.optimize_result.get_for_key('fval')=}\")\n",
    "time_ana_fw = time.time() - start_time\n",
    "print(f\"{time_ana_fw=}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Waterfall plot - compare all scenarios\n",
    "pypesto.visualize.waterfall(\n",
    "    [result_ana, result_ana_fw, result_num, result_ord],\n",
    "    legends=[\n",
    "        \"Analytical-Hierarchical (adjoint)\",\n",
    "        \"Analytical-Hierarchical (forward)\",\n",
    "        \"Numerical-Hierarchical\",\n",
    "        \"Non-Hierarchical\",\n",
    "    ],\n",
    "    colors=np.array(list(map(to_rgba, (\"purple\", \"blue\", \"green\", \"orange\")))),\n",
    "    order_by_id=True,\n",
    "    size=(15, 6),\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Time comparison of all scenarios\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "ax = plt.bar(\n",
    "    x=[0, 1, 2, 3],\n",
    "    height=[time_ana, time_ana_fw, time_num, time_ord],\n",
    "    color=[\"purple\", \"blue\", \"green\", \"orange\"],\n",
    ")\n",
    "ax = plt.gca()\n",
    "ax.set_xticks([0, 1, 2, 3])\n",
    "ax.set_xticklabels(\n",
    "    [\n",
    "        \"Analytical-Hierarchical (adjoint)\",\n",
    "        \"Analytical-Hierarchical (forward)\",\n",
    "        \"Numerical-Hierarchical\",\n",
    "        \"Non-Hierarchical\",\n",
    "    ]\n",
    ")\n",
    "plt.setp(ax.get_xticklabels(), fontsize=10, rotation=75)\n",
    "ax.set_ylabel(\"Computation time [s]\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.3"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}