{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Parameter estimation using censored data" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "This notebook explains the use of censored data for parameter estimation of ODE models.\n", "An example model is provided in `pypesto/doc/example/example_censored`. \n", "The implementation supports all three censoring types of measurements:\n", "\n", "- Left censored: a datapoint is below a certain known value.\n", "- Right censored: a datapoint is above a certain known value.\n", "- Interval censored: a datapoint lies in an interval between two known values.\n", "\n", "In all three cases, the exact numerical value of the datapoint is unknown. To integrate censored measurements, we employ the optimal scaling approach. In this approach, each datapoint is represented by a variable surrogate datapoint that is constrained to lie in its respective category. Categories can be thought of as intervals with specific bounds. \n", "\n", "For censored data, the category bounds are already known from the censoring value:\n", "\n", "- for left censored data the category bounds are (0, censoring value),\n", "- for right censored data the category bounds are (censoring value, infinity),\n", "- for interval censored data the category bounds are given by the interval bounds.\n", "\n", "This makes identifying the surrogate data simple, as they can be calculated directly (analytically).\n", "\n", "Details on the optimal scaling approach can be found in Shepard, 1962 (https://doi.org/10.1007/BF02289621).\n", "Details on the application of the gradient-based optimal scaling approach to mechanistic modeling\n", "with ordinal data can be found in Schmiester et al. 2020 (https://doi.org/10.1007/s00285-020-01522-w)\n", "and Schmiester et al. 2021 (https://doi.org/10.1093/bioinformatics/btab512)." 
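] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "As a minimal sketch of the analytic surrogate calculation described above (an illustration only, not pyPESTO's internal implementation): assuming a least-squares inner objective and fixed category bounds, the optimal surrogate datapoint is the model simulation projected onto its category interval. The function name `censored_surrogate` and all numerical values below are hypothetical.\n", "\n", "A surrogate equal to the simulation (residual 0) means the simulation already satisfies the censoring constraint; otherwise the surrogate sits on the violated bound." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "\n", "\n", "def censored_surrogate(simulation, lower, upper):\n", "    # Hypothetical helper: project each simulated value onto its\n", "    # category interval [lower, upper] (least-squares optimum).\n", "    return np.clip(simulation, lower, upper)\n", "\n", "\n", "# Made-up simulated observable values for three censored datapoints.\n", "sim = np.array([0.5, 3.0, 7.0])\n", "# Category bounds: left censored at 1, right censored at 4, interval [2, 5].\n", "lower = np.array([0.0, 4.0, 2.0])\n", "upper = np.array([1.0, np.inf, 5.0])\n", "\n", "surrogate = censored_surrogate(sim, lower, upper)\n", "residuals = surrogate - sim\n", "print(surrogate)\n", "print(residuals)"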
] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Import model from the petab_problem" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import petab\n", "\n", "import pypesto\n", "import pypesto.logging\n", "import pypesto.optimize as optimize\n", "from pypesto.petab import PetabImporter\n", "from pypesto.visualize import plot_categories_from_pypesto_result" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "pyPESTO integrates censored data into parameter estimation using the optimal scaling approach. Because this approach is implemented hierarchically, we have to specify `hierarchical=True` when importing the `petab_problem`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "petab_folder = \"./example_censored/\"\n", "yaml_file = \"example_censored.yaml\"\n", "\n", "petab_problem = petab.Problem.from_yaml(petab_folder + yaml_file)\n", "\n", "importer = PetabImporter(petab_problem, hierarchical=True)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "The `petab_problem` has to be specified in the usual PEtab format. Censored measurements are marked in the `measurement.tsv` file by setting the censoring type in the `measurementType` column, where the censoring type can be:\n", "\n", "- `left-censored`,\n", "- `right-censored`,\n", "- or `interval-censored`.\n", "\n", "If the censoring type is not specified, the measurement is treated as quantitative. For censored measurements, the censoring bound additionally has to be specified in the `censoringBounds` column. 
For interval censored measurements, the two bounds must be separated by a semicolon:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from pandas import option_context\n", "\n", "with option_context(\"display.max_colwidth\", 400):\n", "    display(petab_problem.measurement_df)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "For censored measurements, the `measurement` column is ignored. For the `Ybar` observable we did not specify a measurement type, so its measurements are treated as quantitative." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Note on inclusion of additional data types:\n", "It is possible to include observables with different types of data in the same `petab_problem`. Refer to the notebooks on using [semiquantitative data](semiquantitative_data.ipynb), [relative data](relative_data.ipynb) and [ordinal data](ordinal_data.ipynb) for details on integration of other data types. If the `measurementType` column is left empty for all measurements of an observable, the observable will be treated as quantitative." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Construct the objective and pyPESTO problem" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "When we construct the `objective`, it will construct all objects of the optimal scaling inner optimization:\n", "\n", "- `OrdinalInnerSolver`\n", "- `OrdinalCalculator`\n", "- `OrdinalProblem`\n", "\n", "As there are no inner options specific to censored data, we pass none to the constructor." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Now let's construct the pyPESTO problem and optimizer. 
We use a gradient-based optimizer for faster optimization, but gradient-free optimizers can be used in the same way:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "problem = importer.create_problem()\n", "\n", "engine = pypesto.engine.MultiProcessEngine(n_procs=3)\n", "\n", "optimizer = optimize.ScipyOptimizer(\n", "    method=\"L-BFGS-B\",\n", "    options={\"disp\": None, \"ftol\": 2.220446049250313e-09, \"gtol\": 1e-5},\n", ")\n", "n_starts = 3\n", "np.random.seed(n_starts)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Run optimization using optimal scaling approach" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "np.random.seed(n_starts)\n", "\n", "res = optimize.minimize(\n", "    problem, n_starts=n_starts, optimizer=optimizer, engine=engine\n", ")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Visualizing the result" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from pypesto.visualize import waterfall\n", "\n", "waterfall(res)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "We can plot the censoring categories using the `plot_categories_from_pypesto_result` plotting function." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plot_categories_from_pypesto_result(res, figsize=(15, 10))\n", "plt.show()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.3" }, "vscode": { "interpreter": { "hash": "b4f64b1cfeae9987d9a74471fe6faf49d769577c41c664ee1b5af662a144b184" } } }, "nbformat": 4, "nbformat_minor": 4 }