Internals

Plotting

Binary classification

calibration(truth, output, *[, highlight, axis])

Plots the calibration curve for the model.

compare_series(plotdata, cohort_col, ...[, ...])

Creates a line plot of the data using cohorts as hue.

cohort_evaluation_vs_threshold(stats, ...[, ...])

Creates a 2x3 grid of individual performance metrics across cohorts.

cohorts_overlay(data, plot_func[, axis, ...])

Uses a passed plotting function to plot a line per given split.

cohorts_vertical(df, plot_func[, gs, ...])

Uses a passed plotting function to plot a line per given split.

evaluation(stats, *, ci_data[, truth, ...])

Generates a 2x3 plot showing the performance of a model.

histogram_stacked(y_label, output, *[, ...])

Plots a stacked histogram of the model output by class.

leadtime_violin(data, x_col, y_col, *[, ...])

Violin plot of leadtime across cohorts.

metric_vs_threshold(stats, metric, *[, ...])

Plots a metric vs threshold curve.

performance_metrics(stats, *[, conf, ...])

Single plot of sensitivity, specificity, and PPV.

ppv_vs_sensitivity(ppv, sensitivity, ...[, ...])

Plots the PPV vs Sensitivity (precision-recall curve).

recall_condition(ppcr, recall, thresholds, ...)

Plots the recall of a model against the predicted condition rate.

singleROC(tpr, fpr, thresholds[, ...])

Creates an ROC plot.

Utility Functions

decorators.render_as_svg(plot_fn)

Given a plot function that returns a figure, render it to SVG and close the figure.

Data Manipulation

Cohorts

cohorts.find_bin_edges(series[, thresholds])

Creates list of bin edges from a series of continuous numeric data and list of inner thresholds.
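
The edge construction can be sketched as follows; `find_bin_edges_sketch` is a hypothetical re-implementation for illustration, not the library's actual code:

```python
import pandas as pd

def find_bin_edges_sketch(series: pd.Series, thresholds=None) -> list:
    """Bin edges: the series min, the sorted inner thresholds, the series max."""
    inner = sorted(thresholds) if thresholds is not None else []
    return [series.min(), *inner, series.max()]

ages = pd.Series([5, 18, 42, 67, 90])
edges = find_bin_edges_sketch(ages, thresholds=[20, 65])
# edges == [5, 20, 65, 90]
```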

cohorts.get_cohort_data(df, cohort_feature, ...)

Convenience function to create and format data for use in the cohort plots.

cohorts.get_cohort_performance_data(df, ...)

Generates a dataframe with particular performance metrics (accuracy, sensitivity, specificity, ppv, npv, and flag rate (predicted positive condition rate)) for particular threshold values and cohort.

cohorts.has_good_binning(bin_ixs, bin_edges)

Verifies that the binning is sound by checking that the lists are of equal length.

cohorts.label_cohorts_categorical(series[, ...])

Bin a categorical series of data, reduced to a set of category values.

cohorts.label_cohorts_numeric(series[, splits])

Bin a continuous numeric series of data, based on thresholds of inner bin edges.
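
Binning of this kind is typically a `pandas.cut` over the edges; the sketch below assumes left-closed intervals, which may differ from the library's actual closure convention:

```python
import pandas as pd

scores = pd.Series([0.05, 0.35, 0.72, 0.91])
# Inner edges 0.3 and 0.7 split the data into three left-closed cohorts.
cohorts = pd.cut(scores, bins=[0.0, 0.3, 0.7, 1.0], right=False)
print(cohorts.value_counts(sort=False).to_dict())
```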

cohorts.resolve_cohorts(series[, splits])

Bin a series of data based on the given splits, if any.

cohorts.resolve_col_data(df, feature)

Handles resolving feature from either being a series or specifying a series in the dataframe.

Pandas Helpers

pandas_helpers.event_score(merged_frame, ...)

Reduces a dataframe of all predictions to a single row of significance, such as the max or most recent value for an entity.

pandas_helpers.event_time(event)

Converts an event name into the time column name.

pandas_helpers.event_value(event)

Converts an event name into the value column name.
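
These two helpers likely map an event name onto column-name conventions; the suffixes below are assumptions for illustration, not the library's confirmed naming:

```python
def event_time_sketch(event: str) -> str:
    # Assumed "<event>_Time" convention; the actual suffix may differ.
    return f"{event}_Time"

def event_value_sketch(event: str) -> str:
    # Assumed "<event>_Value" convention; the actual suffix may differ.
    return f"{event}_Value"
```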

pandas_helpers.get_model_scores(dataframe, ...)

Reduces a dataframe of all predictions to a single row of significance, such as the max or most recent value for an entity.

pandas_helpers.post_process_event(dataframe, ...)

Infers and casts events.

pandas_helpers.merge_windowed_event(...[, ...])

Merges a single windowed event into a predictions dataframe.

pandas_helpers.is_valid_event(dataframe, ...)

Creates a mask excluding rows (False) where the event occurs before the reference time.
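
The masking logic amounts to a vectorized comparison of event time against reference time; the column names below are hypothetical, chosen only to illustrate the shape of the check:

```python
import pandas as pd

frame = pd.DataFrame({
    "PredictTime": pd.to_datetime(["2024-01-01", "2024-01-02"]),
    "Event_Time": pd.to_datetime(["2023-12-31", "2024-01-05"]),
})
# A row is valid (True) only when the event does not occur before the reference time.
mask = frame["Event_Time"] >= frame["PredictTime"]
print(mask.tolist())  # [False, True]
```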

pandas_helpers.try_casting(dataframe, ...)

Attempts to cast a column to a specified data type inplace.

Performance

performance.as_percentages(proba)

Converts a probability in the 0-1 range to a percentage in the 0-100 range.

performance.as_probabilities(perc)

Converts a percentage in the 0-100 range to a probability in the 0-1 range.
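
The two conversions are simple inverse scalings; a minimal sketch:

```python
import numpy as np

def as_percentages_sketch(proba):
    """Map 0-1 probabilities to 0-100 percentages."""
    return np.asarray(proba, dtype=float) * 100.0

def as_probabilities_sketch(perc):
    """Map 0-100 percentages back to 0-1 probabilities."""
    return np.asarray(perc, dtype=float) / 100.0

# The two conversions round-trip.
roundtrip = as_probabilities_sketch(as_percentages_sketch([0.25, 0.5]))
```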

performance.assert_valid_performance_metrics_df(df)

Determines whether a passed dataframe has either all or a subset of columns that likely indicate it was generated by calculate_bin_stats.

performance.calculate_bin_stats([y_true, ...])

Calculate summary statistics from y_true and y_pred (y_proba[:,1] for binary classification) arrays.
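
The kind of computation involved can be sketched as a sweep of thresholds over confusion-matrix counts; `bin_stats_sketch` and its column names are illustrative, not the library's actual output schema:

```python
import numpy as np
import pandas as pd

def bin_stats_sketch(y_true, y_proba, thresholds=None):
    """Per-threshold confusion counts and derived metrics."""
    y_true = np.asarray(y_true)
    y_proba = np.asarray(y_proba)
    if thresholds is None:
        thresholds = np.linspace(0.0, 1.0, 11)
    rows = []
    for t in thresholds:
        y_pred = y_proba >= t
        tp = np.sum(y_pred & (y_true == 1))
        fp = np.sum(y_pred & (y_true == 0))
        fn = np.sum(~y_pred & (y_true == 1))
        tn = np.sum(~y_pred & (y_true == 0))
        rows.append({
            "Threshold": t,
            "Sensitivity": tp / (tp + fn) if tp + fn else np.nan,
            "Specificity": tn / (tn + fp) if tn + fp else np.nan,
            "PPV": tp / (tp + fp) if tp + fp else np.nan,
        })
    return pd.DataFrame(rows)
```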

performance.calculate_eval_ci(stats, truth, ...)

Calculate confidence intervals for ROC, PR, and other performance metrics from a stats frame.

performance.calculate_nnt(arr[, rho])

Calculates NNT (Number Needed to Treat) from the relative risk reduction, rho, and a perfect ARR (absolute risk reduction), i.e., PPV.
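
The arithmetic follows from NNT = 1 / ARR with ARR approximated as rho times PPV; `calculate_nnt_sketch` is an illustrative re-derivation, not the library's code:

```python
import numpy as np

def calculate_nnt_sketch(ppv, rho):
    """NNT = 1 / ARR, where ARR is approximated as rho * PPV."""
    arr = rho * np.asarray(ppv, dtype=float)
    return 1.0 / arr

# A PPV of 0.5 with a relative risk reduction of 0.5 gives an NNT of 4.
nnt = float(calculate_nnt_sketch(0.5, rho=0.5))
```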

Seismogram Loaders

loader.ConfigOnlyHook

TypeAlias for a callable taking a ConfigProvider, which returns a DataFrame.

loader.ConfigFrameHook

TypeAlias for a callable taking a ConfigProvider and a DataFrame, which returns a DataFrame.

loader.MergeFramesHook

TypeAlias for a callable taking a ConfigProvider and two DataFrames, which returns a DataFrame.

loader.SeismogramLoader(config[, ...])

A data loading pipeline using three types of hooks:

loader.SeismogramLoader.load_data([...])

Entry point for loading data for a Seismogram session.

loader.loader_factory(config[, post_load_fn])

Construct a SeismogramLoader from the provided configuration.

loader.event.parquet_loader(config)

Loads the events frame from a parquet file based on config.event_path.

loader.event.post_transform_fn(config, events)

Converts the Time column in events to a datetime64[ns] type, to be compatible with other operations.

loader.event.merge_onto_predictions(config, ...)

Merges each configured event onto the predictions dataframe.

loader.prediction.parquet_loader(config)

Load the predictions frame from a parquet file based on config.prediction_path.

loader.prediction.assumed_types(config, ...)

Convert the loaded predictions dataframe to the expected types.

loader.prediction.dictionary_types(config, ...)

Convert the loaded predictions dataframe to the expected types.

Summaries

summaries.default_cohort_summaries(...)

Generate a dataframe of summary counts from the input dataframe.

summaries.score_target_cohort_summaries(...)

Generate a dataframe of summary counts from the input dataframe.

OpenTelemetry output

get_metric_creator(metric_name, meter)

Takes in the name of a metric and determines the OTel function which creates the corresponding instrument.

deactivate_exports()

Make all metric emission calls no-ops.

activate_exports()

Revert to the default state of allowing metric emission.

class seismometer.data.metric_apis.RealOpenTelemetryRecorder(metric_names, name='Seismo-meter')

Bases: object

Parameters:
  • metric_names (List[str])

  • name (str)

log_by_cohort(dataframe, base_attributes, cohorts, intersecting=False, metric_maker=None)

Take data from a dataframe and log it, selecting by all cohorts provided.

Parameters:
  • base_attributes (Dict[str, Any]) – The information we want to store with every individual metric log. This might be, for example, every parameter used to generate this data.

  • dataframe (pd.DataFrame) – The actual dataframe containing the data we want to log.

  • cohorts (Dict[str, List[str]]) – Which cohorts we want to select on. For instance: {"Age": ["[10-20)", "70+"], "Race": ["AfricanAmerican", "Caucasian"]}

  • intersecting (bool) – Whether to log each combination of cohorts across columns. Given the example cohorts above: intersecting=False logs data separately for Age=[10-20), Age=70+, Race=AfricanAmerican, and Race=Caucasian; intersecting=True logs each intersection, e.g. Age=[10-20) and Race=AfricanAmerican, Age=70+ and Race=Caucasian, etc.

  • metric_maker (Callable) – Produces the metric to log from the Series we create; for example, in plot_cohort_hist it is the length of each dataframe. If not provided, each row is logged separately.
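
The effect of intersecting can be sketched with itertools, using the example cohorts above; this is an illustration of the selection logic, not the method's implementation:

```python
from itertools import product

cohorts = {"Age": ["[10-20)", "70+"], "Race": ["AfricanAmerican", "Caucasian"]}

# intersecting=False: one selection per individual cohort value (4 here).
separate = [(col, val) for col, vals in cohorts.items() for val in vals]

# intersecting=True: one selection per cross-column combination (2 x 2 = 4 here).
combined = list(product(cohorts["Age"], cohorts["Race"]))
```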

log_by_column(df, col_name, cohorts, base_attributes, col_values=[])

Log with a particular column as the index, only if each metric is set to log_all.

Parameters:
  • df (pd.DataFrame) – The dataframe the data resides in.

  • col_name (str) – Which column we want the index to be.

  • cohorts (dict) – Which cohorts we want to extract data from when it’s time.

  • base_attributes (dict) – The attributes we want to associate to all of the logs from this.

  • col_values (list) – If log_all is not set for a particular metric, we will only log using the particular values provided. An example use case is a set of thresholds.

populate_metrics(attributes, metrics)

Populate the OpenTelemetry instruments with data from the model.

Parameters:
  • attributes (dict[str, Union[str, int]]) –

    All information associated with this metric. For instance,
    • what cohort is this a part of?

    • what metric is this actually?

    • what are the score and fairness thresholds?

    • etc.

  • metrics (dict[str, Any]) – The actual data we are populating.

store_call_parameters([func, name, ...])

Decorator logic for storing call parameters.

initialize_otel_config(config)

Read all metric exporting and automation info.

export_automated_metrics()

Performs automated metric exporting for everything specified in Seismogram.

class seismometer.core.autometrics.AutomationManager(*args, **kwargs)

Bases: object

automation_file_path: Path

Where we are reading or dumping automation data.

export_config(overwrite_existing=False)

Produce a configuration file specifying which metrics to export, based on which functions have been run in the notebook.

This counts all runs of each function, but does not accommodate cells being deleted, because this would require some more in-depth access to the Jupyter frontend.

Parameters:

overwrite_existing (bool) – Whether to overwrite an existing automation file at the same path.

get_function_from_export_name(fn_name)

Get the actual function used to export metrics, from its name. This is not necessarily the function name itself: e.g., we may use plot_xyz instead of _plot_xyz.

Parameters:

fn_name (str) – The name of the function itself.

Returns:

Which function we should call when we see this in automation.

Return type:

Callable

get_metric_config(metric_name)

Get the settings from otel_metric_override for a given metric.

Parameters:

metric_name (str) – The metric.

Returns:

The configuration described in RFC #4, as a dictionary, e.g. {"output_metrics": True}.

Return type:

dict

is_allowed_export_function(fn_name)

Whether or not a function is an allowed export.

Parameters:

fn_name (str) – The name of the function.

Return type:

bool

load_automation_config(config_provider)

Copy in the metric automation config.

Parameters:

config_provider (ConfigProvider)

Return type:

None

load_metric_config(config_provider)

Copy in the metric config (how many quantiles, etc.).

Parameters:

config_provider (ConfigProvider)

Return type:

None

preview_automation()

store_call_params(fn_name, fn, args, kwargs)

Core logic for storing call parameters. We associate with each function the list of arguments (always keyword-indexed) and any extra info we might want to store.

Parameters:
  • fn_name (str) – The name of the function that has been called. By default it is the actual name of the function in the code, but if needed it can be set to a more readable or expected name (e.g. _plot_cohort_hist is set to plot_cohort_hist without an underscore).

  • fn (Callable) – The actual function itself.

  • args (list) – The arguments the function was called with.

  • kwargs (dict) – The keyword arguments the function was called with.

Low-level patterns

Patterns

Singleton

Metaclass for implementing the single instance pattern.
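
The standard shape of such a metaclass is sketched below; the library's implementation may differ in detail:

```python
class Singleton(type):
    """Metaclass: every class that uses it yields one shared instance."""
    _instances: dict = {}

    def __call__(cls, *args, **kwargs):
        # Construct the instance only on first call; reuse it afterward.
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]

class Registry(metaclass=Singleton):
    def __init__(self):
        self.items = []

# Repeated construction returns the same object.
assert Registry() is Registry()
```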

Decorators

DiskCachedFunction(cache_name, save_fn, load_fn)
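
A minimal sketch of a disk-caching decorator in this spirit, with pickle standing in for the save_fn/load_fn pair; `disk_cached_sketch` is hypothetical and not the library's implementation:

```python
import hashlib
import json
import pickle
from pathlib import Path

def disk_cached_sketch(cache_dir, save_fn=pickle.dump, load_fn=pickle.load):
    """Decorator: cache a function's result on disk, keyed by its arguments."""
    def wrap(fn):
        def inner(*args, **kwargs):
            # Key the cache file on a stable hash of the call arguments.
            key = hashlib.sha256(
                json.dumps([args, kwargs], sort_keys=True, default=str).encode()
            ).hexdigest()
            path = Path(cache_dir) / f"{fn.__name__}-{key}.pkl"
            if path.exists():
                with path.open("rb") as f:
                    return load_fn(f)
            result = fn(*args, **kwargs)
            path.parent.mkdir(parents=True, exist_ok=True)
            with path.open("wb") as f:
                save_fn(result, f)
            return result
        return inner
    return wrap
```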