{ "cells": [ { "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%% md\n" } }, "source": [ "# Custom Objective Function" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%% md\n" } }, "source": [ "pyPESTO can not only do parameter estimation for PEtab and AMICI models, but is able to do so on any provided function.\n", "This is done by providing the objective with the function as well as possibly gradient and hessian.\n", "In this notebook, we will show a few different ways on how to do this. As sometimes manually providing the gradient and hessian is tedious, we will try to emphasize on the importance of those two." ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%% md\n" } }, "source": [ "After this notebook, you should ...\n", "\n", "* ... be able to create an objective from a given function.\n", "* ... be able to potentially add a gradient and hessian to the objective.\n", "* ... be able to run parameter estimation on the objective.\n", "* ... know the importance of gradient and hessian in terms of optimization speed and accuracy." ] }, { "cell_type": "code", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%%\n" } }, "source": [ "# install if not done yet\n", "# %pip install pypesto --quiet" ], "outputs": [], "execution_count": null }, { "cell_type": "code", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%%\n" } }, "source": [ "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import scipy as sp\n", "from IPython.display import Markdown, display\n", "\n", "import pypesto\n", "import pypesto.optimize as optimize\n", "import pypesto.profile as profile\n", "import pypesto.visualize as visualize\n", "\n", "# plotting style\n", "figsize = (8, 6)\n", "plt.rcParams.update(\n", " {\n", " \"figure.dpi\": 100,\n", " \"figure.figsize\": figsize,\n", " \"font.size\": 14,\n", " \"lines.linewidth\": 2,\n", " }\n", ")\n", "\n", "# set a random seed\n", "np.random.seed(1912)\n", "\n", "%matplotlib inline" ], "outputs": [], "execution_count": null }, { "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%% md\n" } }, "source": [ "## 1. Objective + Problem Definition" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%% md\n" } }, "source": [ "In the following we will use the Rosenbrock Banana function, which we can directly get from scipy." ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%% md\n" } }, "source": [ "The first creation of the objective function is rather straightforward:\n", "We create it by providing a function, as well as gradient and hessian." ] }, { "cell_type": "code", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%%\n" } }, "source": [ "# first type of objective defined through callables\n", "objective1 = pypesto.Objective(\n", " fun=sp.optimize.rosen,\n", " grad=sp.optimize.rosen_der,\n", " hess=sp.optimize.rosen_hess,\n", ")" ], "outputs": [], "execution_count": null }, { "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%% md\n" } }, "source": [ "The second option is to provide a function that returns objective value, gradient and hessian (last two optional) all as a tuple.\n", "In this case, we just need to notify the pyPESTO objective of this through boolean values." ] }, { "cell_type": "code", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%%\n" } }, "source": [ "# second type of objective\n", "def rosen2(x):\n", " return (\n", " sp.optimize.rosen(x),\n", " sp.optimize.rosen_der(x),\n", " sp.optimize.rosen_hess(x),\n", " )\n", "\n", "\n", "objective2 = pypesto.Objective(fun=rosen2, grad=True, hess=True)" ], "outputs": [], "execution_count": null }, { "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%% md\n" } }, "source": [ "For later comparisons, we create two other objectives. One that is only provided with function and gradient, and one that only has the functional value, forcing it to rely on finite differences in optimization." ] }, { "cell_type": "code", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%%\n" } }, "source": [ "# no hessian objective\n", "objective3 = pypesto.Objective(\n", " fun=sp.optimize.rosen,\n", " grad=sp.optimize.rosen_der,\n", ")\n", "\n", "# neither hessian nor gradient objective\n", "objective4 = pypesto.Objective(\n", " fun=sp.optimize.rosen,\n", ")" ], "outputs": [], "execution_count": null }, { "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%% md\n" } }, "source": [ "To get from objective to a usable parameter estimation problem, we need to additionally provide the bounds of our parameters." ] }, { "cell_type": "code", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%%\n" } }, "source": [ "dim_full = 15\n", "lb = -5 * np.ones((dim_full, 1))\n", "ub = 5 * np.ones((dim_full, 1))\n", "\n", "# for the sake of comparison, we create 20 starts within these bounds\n", "x_guesses = np.random.uniform(-5, 5, (20, dim_full))\n", "\n", "problem1 = pypesto.Problem(\n", " objective=objective1, lb=lb, ub=ub, x_guesses=x_guesses\n", ")\n", "problem2 = pypesto.Problem(\n", " objective=objective2, lb=lb, ub=ub, x_guesses=x_guesses\n", ")\n", "problem3 = pypesto.Problem(\n", " objective=objective3, lb=lb, ub=ub, x_guesses=x_guesses\n", ")\n", "problem4 = pypesto.Problem(\n", " objective=objective4, lb=lb, ub=ub, x_guesses=x_guesses\n", ")" ], "outputs": [], "execution_count": null }, { "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%% md\n" } }, "source": [ "### Illustration\n", "\n", "The Rosenbrock function is a function, which is often used to test optimization algorithms. It is defined as\n", "$$\n", "f(x) = \\sum_{i=1}^{n-1} 100 (x_{i+1} - x_i^2)^2 + (1 - x_i)^2, \\quad x \\in \\mathbb{R}^n\n", "$$\n", "This function has a global minimum at $x^* = (1, \\dots, 1)$ with $f(x^*) = 0$. If $n\\geq 4$, the function has an additional local minimum at $x^* = (-1, 1, \\dots, 1)$ with $f(x^*) = 4$.\n", "\n", "We will illustrate the function for $n=2$." ] }, { "cell_type": "code", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%%\n" } }, "source": [ "x = np.arange(-2, 2, 0.1)\n", "y = np.arange(-2, 2, 0.1)\n", "x, y = np.meshgrid(x, y)\n", "z = np.zeros_like(x)\n", "for j in range(0, x.shape[0]):\n", " for k in range(0, x.shape[1]):\n", " z[j, k] = objective1([x[j, k], y[j, k]], (0,))" ], "outputs": [], "execution_count": null }, { "cell_type": "code", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%%\n" } }, "source": [ "fig = plt.figure()\n", "fig.set_size_inches(*figsize)\n", "ax = plt.axes(projection=\"3d\")\n", "ax.plot_surface(X=x, Y=y, Z=z)\n", "plt.xlabel(\"x axis\")\n", "plt.ylabel(\"y axis\")\n", "ax.set_title(\"cost function values\");" ], "outputs": [], "execution_count": null }, { "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%% md\n" } }, "source": [ "And a contour plot:" ] }, { "cell_type": "code", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%%\n" } }, "source": [ "fig = plt.figure()\n", "fig.set_size_inches(*figsize)\n", "ax = plt.axes()\n", "ax.contourf(x, y, z, 100, norm=\"log\");" ], "outputs": [], "execution_count": null }, { "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%% md\n" } }, "source": [ "## 2. Optimization" ] }, { "cell_type": "code", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%%\n" } }, "source": [ "# optimizer\n", "optimizer = optimize.ScipyOptimizer()\n", "# engine\n", "# In this notebook, it is faster to use a single core engine, due to the\n", "# overhead of multiprocessing. But in general with more expensive problems\n", "# it is recommended to use the MultiProcessEngine.\n", "engine = pypesto.engine.SingleCoreEngine()\n", "# starts\n", "n_starts = 20" ], "outputs": [], "execution_count": null }, { "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%% md\n" } }, "source": [ "As a first comparison, we time each optimization. We also use the same optimizer and engine for all optimizations." ] }, { "cell_type": "code", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%%\n" } }, "source": [ "%%time\n", "# run optimization of problem 1\n", "result1 = optimize.minimize(\n", " problem=problem1, optimizer=optimizer, n_starts=n_starts, engine=engine\n", ")\n", "# run optimization of problem 2\n", "result2 = optimize.minimize(\n", " problem=problem2, optimizer=optimizer, n_starts=n_starts, engine=engine\n", ")\n", "# run optimization of problem 3\n", "result3 = optimize.minimize(\n", " problem=problem3, optimizer=optimizer, n_starts=n_starts, engine=engine\n", ")\n", "# run optimization of problem 4\n", "result4 = optimize.minimize(\n", " problem=problem4, optimizer=optimizer, n_starts=n_starts, engine=engine\n", ")" ], "outputs": [], "execution_count": null }, { "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%% md\n" } }, "source": [ "As a first step, let us take a look at the different result summaries:" ] }, { "cell_type": "code", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%%\n" } }, "source": [ "display(\n", " Markdown(\"# Result 1\\n\" + result1.optimize_result.summary(disp_best=False))\n", ")\n", "display(\n", " Markdown(\"# Result 2\\n\" + result2.optimize_result.summary(disp_best=False))\n", ")\n", "display(\n", " Markdown(\"# Result 3\\n\" + result3.optimize_result.summary(disp_best=False))\n", ")\n", "display(\n", " Markdown(\"# Result 4\\n\" + result4.optimize_result.summary(disp_best=False))\n", ")" ], "outputs": [], "execution_count": null }, { "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%% md\n" } }, "source": [ "Here we can see already a big difference between the first three and the fourth. The one without gradients took approximately five times as long to finish the optimization as the other three. The best value found is also not the same as for the others. But the biggest difference is probably the fact that while the first three all converged in all their starts, the one without gradient reach the maximum number of evaluations in most to all cases. Keep in mind: **All starts were the same for all problems**.\n", "\n", "A small detail here:\n", "When comparing the first three results, one may notice two things. For one, leaving out the hessian seems not make any difference. And additionally, while they are rather close in speed compared to the fourth one, the second one sticks out to be somewhat slower.\n", "This is mainly due to the fact that we chose the scipy optimizer \"l-bfgs-b\", which does not support the usage of a hessian, for the sake of comparing all four of them. This also explains why the second one is slower, as (by construction), whether needed or not, problem2 will evaluate the hessian." ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%% md\n" } }, "source": [ "### Visualization" ] }, { "cell_type": "code", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%%\n" } }, "source": [ "# waterfalls\n", "visualize.waterfall(result1, size=figsize)\n", "visualize.waterfall(result2, size=figsize)\n", "visualize.waterfall(result3, size=figsize)\n", "visualize.waterfall(result4, size=figsize);" ], "outputs": [], "execution_count": null }, { "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%% md\n" } }, "source": [ "We can see here already the stark difference between the problem definitions. In order to compare things easier, we can also plot all results in one waterfall plot:" ] }, { "cell_type": "code", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%%\n" } }, "source": [ "# plot one list of waterfalls\n", "visualize.waterfall(\n", " [result1, result2, result3, result4],\n", " legends=[\"Problem 1\", \"Problem 2\", \"Problem 3\", \"Problem 4\"],\n", " size=figsize,\n", ");" ], "outputs": [], "execution_count": null }, { "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%% md\n" } }, "source": [ "We can also take a look at the parameters, that each optimizer found:" ] }, { "cell_type": "code", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%%\n" } }, "source": [ "# plot parameters\n", "ax = visualize.parameters(result1, size=figsize)\n", "ax.set_title(\"Estimated parameters problem 1\")\n", "ax2 = visualize.parameters(\n", " [result4, result1, result2, result3],\n", " legends=[\"Problem 4\", \"Problem 1\", \"Problem 2\", \"Problem 3\"],\n", " size=figsize,\n", ")\n", "ax2.set_title(\"Estimated parameters all problems\");" ], "outputs": [], "execution_count": null }, { "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%% md\n" } }, "source": [ "## 3. Profiling" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%% md\n" } }, "source": [ "We want to create profiles for our parameters to know more about their uncertainties. We shall create profiles for problem 1 and 4. For problem 1 specifically, we create two profiles, as we have seen, that there are two distinct optima, so we want to start our profile likelihood from both optima.\n", "\n", "Note that when computing profiles, the cost function is interpreted as a negative log-likelihood function. This has some implications on the stopping criteria of the algorithm and on the confidence intervals. Therefore, a meaningful statistical interpretation of the results is only possible if the cost function is indeed a negative log-likelihood function or a negative log-posterior function. The same holds for sampling." ] }, { "cell_type": "code", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%%\n" } }, "source": [ "# we create references for each \"best point\":\n", "ref = {\n", " \"x\": result1.optimize_result.x[0],\n", " \"fval\": result1.optimize_result.fval[0],\n", " \"color\": [0.2, 0.4, 1.0, 1.0],\n", " \"legend\": \"First optimum problem 1\",\n", "}\n", "ref = visualize.create_references(ref)[0]\n", "# we create references for each \"best point\":\n", "ref2 = {\n", " \"x\": result1.optimize_result.x[-1],\n", " \"fval\": result1.optimize_result.fval[-1],\n", " \"color\": [0.4, 1.0, 0.2, 1.0],\n", " \"legend\": \"Second optimum problem 1\",\n", "}\n", "ref2 = visualize.create_references(ref2)[0]\n", "# we create references for each \"best point\":\n", "ref4 = {\n", " \"x\": result4.optimize_result.x[0],\n", " \"fval\": result4.optimize_result.fval[0],\n", " \"color\": [0.2, 0.4, 1.0, 1.0],\n", " \"legend\": \"First optimum problem 4\",\n", "}\n", "ref4 = visualize.create_references(ref4)[0];" ], "outputs": [], "execution_count": null }, { "cell_type": "code", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%%\n" } }, "source": [ "%%time\n", "# compute profiles\n", "profile_options = profile.ProfileOptions(whole_path=True)\n", "\n", "result1 = profile.parameter_profile(\n", " problem=problem1,\n", " result=result1,\n", " optimizer=optimizer,\n", " profile_index=np.array([0, 3]),\n", " result_index=0,\n", " profile_options=profile_options,\n", " filename=None,\n", ")\n", "\n", "# compute profiles from second optimum\n", "result1 = profile.parameter_profile(\n", " problem=problem1,\n", " result=result1,\n", " optimizer=optimizer,\n", " profile_index=np.array([0, 3]),\n", " result_index=-1,\n", " profile_options=profile_options,\n", " filename=None,\n", ")\n", "result4 = profile.parameter_profile(\n", " problem=problem4,\n", " result=result4,\n", " optimizer=optimizer,\n", " profile_index=np.array([0, 3]),\n", " result_index=0,\n", " profile_options=profile_options,\n", " filename=None,\n", ")" ], "outputs": [], "execution_count": null }, { "cell_type": "code", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%%\n" } }, "source": [ "# specify the parameters, for which profiles should be computed\n", "visualize.profiles(\n", " result1,\n", " profile_indices=[0, 3],\n", " reference=[ref, ref2],\n", " profile_list_ids=[0, 1],\n", " size=(12, 5),\n", ");" ], "outputs": [], "execution_count": null }, { "cell_type": "code", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%%\n" } }, "source": [ "visualize.profiles(\n", " result4,\n", " profile_indices=[0, 3],\n", " reference=[ref4],\n", " size=(12, 5),\n", ");" ], "outputs": [], "execution_count": null }, { "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%% md\n" } }, "source": "As we can see here, the second optimum disagrees only in the very first value. Here it becomes apparent that it is only a local optimum, as the profile for the second optimum is very flat. Comparing the profiles of problem 1 and 4, we can see, that while the convergence of problem 4 was quite bad, the profiles look really similar." }, { "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%% md\n" } }, "source": [ "### Profile approximation" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%% md\n" } }, "source": [ "When computing the profiles is computationally too demanding, it is possible to employ to at least consider a normal approximation with covariance matrix given by the Hessian or FIM at the optimal parameters." ] }, { "cell_type": "code", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%%\n" } }, "source": [ "result1 = profile.approximate_parameter_profile(\n", " problem=problem1,\n", " result=result1,\n", " profile_index=np.array([0, 1, 3, 5]),\n", " result_index=0,\n", " n_steps=1000,\n", ")" ], "outputs": [], "execution_count": null }, { "cell_type": "markdown", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%% md\n" } }, "source": [ "These approximate profiles require at most one additional function evaluation, can however yield substantial approximation errors:" ] }, { "cell_type": "code", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%%\n" } }, "source": [ "axes = visualize.profiles(\n", " result1,\n", " profile_indices=[0, 3],\n", " profile_list_ids=[0, 2],\n", " ratio_min=0.01,\n", " colors=[(1, 0, 0, 1), (0, 0, 1, 1)],\n", " legends=[\n", " \"Optimization-based profile\",\n", " \"Local profile approximation\",\n", " ],\n", " size=(12, 5),\n", ");" ], "outputs": [], "execution_count": null }, { "cell_type": "code", "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%%\n" } }, "source": [ "visualize.profile_cis(\n", " result1, confidence_level=0.95, profile_list=2, show_bounds=True\n", ");" ], "outputs": [], "execution_count": null } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.3" } }, "nbformat": 4, "nbformat_minor": 4 }