{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "source": [
    "# Optimizer Convergence and Comparison"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "source": [
    "This notebook is dedicated to additional insights into the optimization process, be it convergence properties or comparisons of two different results."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "source": [
    "**After this notebook you can...**\n",
    "\n",
    "- Use visualization methods to compare optimizers in terms of performance\n",
    "- check convergence of a multistart optimization\n",
    "- get insights into the convergence history\n",
    "- restart an optimization from a history file"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "source": [
    "## Imports and constants"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import copy\n",
    "import logging\n",
    "import tempfile\n",
    "\n",
    "import numpy as np\n",
    "import scipy as sp\n",
    "from IPython.display import Markdown, display\n",
    "\n",
    "import pypesto\n",
    "import pypesto.optimize as optimize\n",
    "import pypesto.visualize as visualize\n",
    "\n",
    "np.random.seed(3142)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Creating optimization results"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For the sake of this notebook, we create two optimization results done with different optimizers."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%capture\n",
    "# create problem\n",
    "objective = pypesto.Objective(\n",
    "    fun=sp.optimize.rosen,\n",
    "    grad=sp.optimize.rosen_der,\n",
    "    hess=sp.optimize.rosen_hess,\n",
    ")\n",
    "dim_full = 15\n",
    "lb = -5 * np.ones((dim_full, 1))\n",
    "ub = 5 * np.ones((dim_full, 1))\n",
    "\n",
    "problem = pypesto.Problem(objective=objective, lb=lb, ub=ub)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# optimizer\n",
    "scipy_optimizer = optimize.ScipyOptimizer()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# create temporary storagefile...\n",
    "f_scipy = tempfile.NamedTemporaryFile(suffix=\".hdf5\", delete=False)\n",
    "fn_scipy = f_scipy.name\n",
    "\n",
    "# ... and corresponding history option\n",
    "history_options_scipy = pypesto.HistoryOptions(\n",
    "    trace_record=True, storage_file=fn_scipy\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Note to the above:**\n",
    "\n",
    "In practice you should not use a temporary file, as it is removed after the run, while still creating overhead. There are two options you might choose from instead:\n",
    "\n",
    "- If you do not plan to save the optimization result, you can use a `MemoryHistory` by removing the `storage_file`-argument. This creates no overhead but is more demanding on the memory.\n",
    "- If you want to save your results (**recommended**) for any form of reusability, you can remove `f_$optimizer` and replace the `fn_$optimizer`assignment with `fn_$optimizer = \"filename_of_choice.hdf5\"`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "n_starts = 10\n",
    "\n",
    "# run optimization\n",
    "result_scipy = optimize.minimize(\n",
    "    problem=problem,\n",
    "    optimizer=scipy_optimizer,\n",
    "    n_starts=n_starts,\n",
    "    history_options=history_options_scipy,\n",
    "    filename=fn_scipy,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In a first step we compare the optimizers in terms of final objective function and robustness through a waterfall plot."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Optimizer convergence"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First we want to check convergence of a single result. For this a summary and general visualizations such as waterfall-plots can be helpfull, but also specific optimizer_convergence visualization as well as history tracing."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "display(Markdown(result_scipy.summary()))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# waterfall plot\n",
    "visualize.waterfall(result_scipy);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The waterfall plot is an overview of the final objective function values. They are ordered from best to worst. Similar colors indicate similar function values and potential local optima/mannifolds. In the best case scenario all values are assigned to a plateau indicating local optima and the best value is found more then once. Additionally we might want to check whether the gradients converged as well and whether we can find a pattern in specific reasons to stop:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "visualize.optimizer_convergence(result_scipy);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We usually want the gradients to be very low, in order to actually ensure we are in a local optimum. If the results do not seem entirely promising, we might want to switch optimizers altogether, as different optimizers sometimes perform better for other problems. Additionally one can try changing the optimizer."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# switch to fides optimizer\n",
    "fides_optimizer = optimize.FidesOptimizer(verbose=logging.WARN)\n",
    "f_fides = tempfile.NamedTemporaryFile(suffix=\".hdf5\", delete=False)\n",
    "fn_fides = f_fides.name\n",
    "history_options_fides = pypesto.HistoryOptions(\n",
    "    trace_record=True, storage_file=fn_fides\n",
    ")\n",
    "result_fides = optimize.minimize(\n",
    "    problem=problem,\n",
    "    optimizer=fides_optimizer,\n",
    "    n_starts=n_starts,\n",
    "    history_options=history_options_fides,\n",
    "    filename=fn_fides,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "visualize.waterfall(\n",
    "    [result_fides, result_scipy],\n",
    "    legends=[\"Fides Optimizer\", \"Scipy Optimizer\"],\n",
    ");"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can also compare various metrics of the results, such as time and number of evaluations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "visualize.optimization_run_properties_per_multistart(\n",
    "    [result_fides, result_scipy],\n",
    "    properties_to_plot=[\"time\", \"n_fval\"],\n",
    "    legends=[\"Fides Optimizer\", \"Scipy Optimizer\"],\n",
    ");"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We might want to check how close the estimated guesses are together, for this we can employ the parameter visualization."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "visualize.parameters(result_fides);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can also check how the optimization trajectory looks like during the different runs, getting other reasons such as very flat landscapes, that can be additonal reasons for the optimization to stop. For this the we use the history:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "visualize.optimizer_history(result_fides, trace_y=\"fval\")\n",
    "visualize.optimizer_history(result_fides, trace_y=\"gradnorm\");"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As we can see the function values are not monotonic. This is due to the optimization tracing line search evaluations as well. This allows us to investigate potential problems. Recurring patterns in the gradient norm together with miniscule to no changes in the function values indicate the optimizer to not be able to really find another next point or taking spiraling steps. In both cases the actual optimum is very hard to pinpoint.\n",
    "Lowering tolerances, increasing startpoints (up to a certain point), switching optimizers are all valid strategies in trying to overcome such issues. However, there is no recipe for all models and thus it is always important to investigate the optimization in terms of convergence, termination reasons and function evaluations, to get ideas on what to do next."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Reloading from History\n",
    "\n",
    "Especially when running large models on clusters, the optimization sometimes may stop due to unfortunate reasons (e.g., timeouts). In these cases, the history serves yet another purpose: retrieving finished and unfinished optimizations. Sometimes out of 100 starts, 80 might have already been terminated. Investigating those 80 might already yield good results. In other cases, we might want to continue optimization from where we left off.\n",
    "\n",
    "### Loading Results from History\n",
    "\n",
    "To load the results from history, we use the `optimization_result_from_history` function from the `optimize` module. This function allows us to retrieve the state of the optimization process stored in **one** HDF5 file.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Load result from history\n",
    "result_from_history = optimize.optimization_result_from_history(\n",
    "    problem=problem, filename=fn_fides\n",
    ")\n",
    "result_from_history.problem = problem"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Verifying the Loaded Results\n",
    "\n",
    "We can check that the visualization of the loaded results matches the original results. This ensures that the history has been loaded correctly."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "# Visualize the optimization history\n",
    "visualize.optimizer_history(result_from_history, trace_y=\"fval\")\n",
    "visualize.parameters(\n",
    "    [result_fides, result_from_history],\n",
    "    colors=[[0.8, 0.2, 0.2, 0.5], [0.2, 0.2, 0.8, 0.2]],\n",
    ");"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "source": [
    "We compare the function value trace of the loaded results with the original results to ensure consistency."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "result_from_history.optimize_result[\n",
    "    0\n",
    "].history.get_fval_trace() == result_fides.optimize_result[\n",
    "    0\n",
    "].history.get_fval_trace()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Continuing Optimization\n",
    "\n",
    "It's important to note that we did not use the result saved in HDF5 to fill in the loaded result, but solely the history. This means that if you use an Hdf5History during optimization and the optimization is interrupted, you can still load the optimization up to that point. From there, you can, for example, restart the unfinished (e.g., the last 3) starts and finish optimizing them:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# we set x_guesses within the problem, to tell the optimization the startpoints of optimization\n",
    "continued_problem = copy.deepcopy(problem)\n",
    "continued_problem.set_x_guesses(result_from_history.optimize_result.x[-3:])\n",
    "\n",
    "continued_result = optimize.minimize(\n",
    "    problem=continued_problem,\n",
    "    optimizer=fides_optimizer,\n",
    "    n_starts=3,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Notes on Hdf5History with MultiProcessEngine\n",
    "\n",
    "If you use both `Hdf5History` and `MultiProcessEngine` together, the history works as follows:\n",
    "\n",
    "- You specify a history file `fn_history` to store the results in.\n",
    "- A folder with the name `fn_history` is created at the start of optimization.\n",
    "- Each optimization gets its own separate HDF5 file within the folder named `fn_history_{id}`.\n",
    "- **After** optimization, the `fn_history` file is created, linking each result within this file. This allows you to work easily with the `fn_history` file, which pools together all other histories.\n",
    "\n",
    "\n",
    "However, if your optimization is interrupted, there is **no** `fn_history` file for pooling. In this case, the `optimization_result_from_history` does not work nicely for all of them together. Instead, you can load each result separately through the `optimization_result_from_history` and append the `optimize_results` together.\n",
    "\n",
    "A Pseudo-Code for this would look like this:\n",
    "\n",
    "```python\n",
    "import os\n",
    "from optimize import optimization_result_from_history\n",
    "from pypesto import Result\n",
    "\n",
    "# The history directory is a folder containing all individual history files, in the same place as the fn_history file would have been\n",
    "history_dir = \"fn_history\"\n",
    "\n",
    "# Get a list of all individual history files\n",
    "history_files = [os.path.join(history_dir, f) for f in os.listdir(history_dir) if f.endswith('.h5')]\n",
    "\n",
    "# Each file will have exactly one optimization result\n",
    "result = Result(problem=problem)\n",
    "result.optimize_result.list = [optimization_result_from_history(problem=problem, filename=history_file).optimize_result[0] for history_file in history_files]\n",
    "result.optimize_result.sort()\n",
    "```"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}