Internals
Plotting
Binary classification
|
Plots the calibration curve for the model. |
|
Creates a line plot of the data using cohorts as hue. |
|
Creates a 2x3 grid of individual performance metrics across cohorts. |
|
Uses a passed plotting function to plot a line per given split. |
|
Uses a passed plotting function to plot a line per given split. |
|
Generates a 2x3 plot showing the performance of a model. |
|
Plots a stacked histogram of the model output by class. |
|
Violin plot of leadtime across cohorts. |
|
Plots a metric vs threshold curve. |
|
Single plot of sensitivity, specificity, and PPV. |
|
Plots the PPV vs Sensitivity (precision-recall curve). |
|
Plots the recall of a model against the predicted condition rate. |
|
Creates an ROC plot. |
Utility Functions
|
Given a plot function that returns a figure, renders it to SVG and closes the figure. |
Data Manipulation
Cohorts
|
Creates list of bin edges from a series of continuous numeric data and list of inner thresholds. |
|
Convenience function to create and format data for use in the cohort plots. |
|
Generates a dataframe with particular performance metrics (accuracy, sensitivity, specificity, ppv, npv, and flag rate (predicted positive condition rate)) for particular threshold values and cohort. |
|
Verifies that the binning is sound by making sure lists are equal length. |
|
Bin a categorical series of data, reduced to a set of category values. |
|
Bin a continuous numeric series of data, based on thresholds of inner bin edges. |
|
Bin a series of data based on the splits, if any are defined. |
|
Handles resolving feature from either being a series or specifying a series in the dataframe. |
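The binning entries above describe building bin edges from inner thresholds and then assigning each value to a bin. A minimal sketch of that idea follows, using plain Python lists in place of a pandas Series. The function names `make_bin_edges` and `bin_continuous`, the left-closed bin convention, and the label format are all assumptions for illustration, not the library's API.

```python
from bisect import bisect_right


def make_bin_edges(values, inner_thresholds):
    """Hypothetical helper: data minimum, in-range inner thresholds, data maximum."""
    lo, hi = min(values), max(values)
    # Assumption: thresholds outside the observed range are dropped.
    inner = sorted(t for t in set(inner_thresholds) if lo < t < hi)
    return [lo, *inner, hi]


def bin_continuous(values, inner_thresholds):
    """Hypothetical helper: label each value by the left-closed bin it falls in."""
    inner = make_bin_edges(values, inner_thresholds)[1:-1]
    if not inner:
        return ["all"] * len(values)
    labels = (
        [f"<{inner[0]}"]
        + [f"{a}-{b}" for a, b in zip(inner, inner[1:])]
        + [f">={inner[-1]}"]
    )
    # bisect_right counts the thresholds that are <= value, which is the bin index.
    return [labels[bisect_right(inner, v)] for v in values]


ages = [18, 25, 30, 47, 50, 82]
edges = make_bin_edges(ages, [30, 50])
binned = bin_continuous(ages, [30, 50])
```

A value equal to a threshold lands in the upper bin here; a real implementation may choose the opposite convention.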
Pandas Helpers
|
Reduces a dataframe of all predictions to a single row of significance per entity, such as the max or most recent value. |
|
Converts an event name into the time column name. |
|
Converts an event name into the value column name. |
|
Infers and casts events. |
|
Merges a single windowed event into a predictions dataframe. |
|
Creates a mask excluding rows (False) where the event occurs before the reference time. |
|
Attempts to cast a column to a specified data type inplace. |
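The masking helper listed above marks rows as False when the event occurs before the reference time. One way this could look in pandas is sketched below. The function name, the column names, and the choice to keep rows that have no event time are assumptions made for illustration only.

```python
import pandas as pd


def mask_event_not_before_reference(df, event_time_col, ref_time_col):
    """Hypothetical helper: True keeps the row, False excludes it.

    Assumption: rows with a missing event time (NaT) are kept, because a
    comparison against NaT evaluates to False before the negation.
    """
    return ~(df[event_time_col] < df[ref_time_col])


frame = pd.DataFrame(
    {
        "PredictTime": pd.to_datetime(["2024-01-01 08:00"] * 3),
        "Event_Time": pd.to_datetime(["2024-01-01 06:00", "2024-01-01 12:00", None]),
    }
)
mask = mask_event_not_before_reference(frame, "Event_Time", "PredictTime")
kept = frame[mask]
```

Only the first row is excluded: its event precedes the prediction time.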
Performance
|
Converts a probability in the 0-1 range to a percentage in the 0-100 range. |
Converts a percentage in the 0-100 range to a probability in the 0-1 range. |
|
Determines whether a passed dataframe has either all or a subset of columns that likely indicate it was generated by calculate_bin_stats. |
|
|
Calculate summary statistics from y_true and y_pred (y_proba[:,1] for binary classification) arrays. |
|
Calculate confidence intervals for ROC, PR, and other performance metrics from a stats frame. |
|
Calculates NNT (Number Needed to Treat) for the relative risk reduction, rho, and a perfect ARR (absolute risk reduction), i.e. PPV. |
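The performance entries above mention threshold-based metrics and an NNT derived from PPV and rho. The sketch below computes the listed metrics at a single threshold and then an NNT under the assumption that the absolute risk reduction is `rho * ppv`, so that NNT = 1 / (rho * PPV). The function names and that formula are illustrative assumptions, not confirmed library behavior.

```python
def threshold_metrics(y_true, y_prob, threshold):
    """Hypothetical helper: confusion-matrix metrics at one probability threshold."""
    tp = fp = tn = fn = 0
    for truth, prob in zip(y_true, y_prob):
        flagged = prob >= threshold
        if flagged and truth:
            tp += 1
        elif flagged:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1

    def ratio(num, den):
        # Undefined ratios are reported as NaN rather than raising.
        return num / den if den else float("nan")

    total = tp + fp + tn + fn
    return {
        "accuracy": ratio(tp + tn, total),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "flag_rate": ratio(tp + fp, total),
    }


def number_needed_to_treat(ppv, rho):
    """Assumed formula: ARR = rho * ppv, NNT = 1 / ARR."""
    arr = rho * ppv
    return 1 / arr if arr > 0 else float("inf")


y_true = [1, 1, 0, 0, 1, 0, 0, 0]
y_prob = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.6]
metrics = threshold_metrics(y_true, y_prob, threshold=0.5)
nnt = number_needed_to_treat(metrics["ppv"], rho=0.25)
```

With this toy data, four of eight rows are flagged and half of those are true positives, so PPV is 0.5 and the assumed NNT is 8.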
Seismogram Loaders
TypeAlias for a callable taking a ConfigProvider, which returns a DataFrame. |
|
TypeAlias for a callable taking a ConfigProvider and a DataFrame, which returns a DataFrame. |
|
TypeAlias for a callable taking a ConfigProvider and two DataFrames, which returns a DataFrame. |
|
|
A data loading pipeline using three types of hooks: |
Entry point for loading data for a Seismogram session. |
|
|
Construct a SeismogramLoader from the provided configuration. |
|
Loads the events frame from a parquet file based on config.event_path. |
|
Converts the Time column in events to a datetime64[ns] type, to be compatible with other operations. |
|
Merges each configured event onto the predictions dataframe. |
|
Load the predictions frame from a parquet file based on config.prediction_path. |
|
Convert the loaded predictions dataframe to the expected types. |
|
Convert the loaded predictions dataframe to the expected types. |
Summaries
Generate a dataframe of summary counts from the input dataframe. |
|
Generate a dataframe of summary counts from the input dataframe. |
Low-level patterns
Patterns
Metaclass for implementing the single instance pattern. |
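For readers unfamiliar with the single instance pattern, a common way to implement it with a metaclass is sketched below. The class names `Singleton` and `Session` are hypothetical and chosen for illustration; the library's own metaclass may differ in name and detail.

```python
class Singleton(type):
    """Hypothetical metaclass: each class using it gets at most one instance."""

    _instances = {}

    def __call__(cls, *args, **kwargs):
        # Construct the instance on first call only; later calls return it as-is.
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]


class Session(metaclass=Singleton):
    def __init__(self, name="default"):
        self.name = name


first = Session("first")
second = Session("second")  # arguments are ignored; the existing instance is returned
```

Because `__init__` runs only once, arguments passed on later calls have no effect, which is worth keeping in mind when a singleton holds configuration.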
Decorators
|