seismometer.data.performance.calculate_bin_stats

seismometer.data.performance.calculate_bin_stats(y_true=None, y_pred=None, keep_score_values=False, not_point_thresholds=False, rho=None)

Calculate summary statistics from y_true and y_pred arrays, where y_pred holds the positive-class probabilities (y_proba[:, 1] for binary classification). y_true and y_pred can be supplied as individual Series-like objects or as a DataFrame with true and proba columns.

Parameters:
  • y_true (Optional[pd.Series], optional) – Series-like of binary labels.

  • y_pred (Optional[pd.Series], optional) – Series-like of probabilities for the positive class.

  • keep_score_values (bool, optional) – If True, prevents attempts to convert scores to percentages (0-100), by default False.

  • not_point_thresholds (bool, optional) – If True, does not use point thresholds, by default False; uses 0-100.

  • rho (float, optional) – The relative risk reduction for NNT calculation, by default DEFAULT_RHO.

Return type:

pd.DataFrame of stats, with one row for each threshold value between 0 and 100 and columns for basic statistics.
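
To make the shape of the result concrete, the following is a minimal pure-Python sketch of a per-threshold statistics table. It is an illustrative stand-in, not seismometer's implementation: the function name sketch_bin_stats, the chosen columns, the default rho of 0.5 (a placeholder, not the value of DEFAULT_RHO), and the formula NNT = 1 / (rho * PPV) are all assumptions made for this example.

```python
from typing import Dict, List, Sequence


def sketch_bin_stats(
    y_true: Sequence[int],
    y_pred: Sequence[float],
    rho: float = 0.5,  # placeholder value, NOT seismometer's DEFAULT_RHO
) -> List[Dict[str, float]]:
    """Hypothetical stand-in: one row of basic stats per threshold 0..100."""
    nan = float("nan")
    rows = []
    for threshold in range(0, 101):
        cutoff = threshold / 100.0  # scores are probabilities in [0, 1]
        tp = fp = tn = fn = 0
        for label, score in zip(y_true, y_pred):
            flagged = score >= cutoff
            if flagged and label == 1:
                tp += 1
            elif flagged:
                fp += 1
            elif label == 1:
                fn += 1
            else:
                tn += 1
        sensitivity = tp / (tp + fn) if (tp + fn) else nan
        specificity = tn / (tn + fp) if (tn + fp) else nan
        ppv = tp / (tp + fp) if (tp + fp) else nan
        # Assumed relationship: NNT = 1 / (rho * PPV); undefined when PPV is 0 or NaN.
        nnt = 1.0 / (rho * ppv) if ppv > 0 else nan
        rows.append(
            {
                "Threshold": threshold,
                "TP": tp,
                "FP": fp,
                "TN": tn,
                "FN": fn,
                "Sensitivity": sensitivity,
                "Specificity": specificity,
                "PPV": ppv,
                "NNT": nnt,
            }
        )
    return rows


# Tiny usage example with made-up labels and scores.
rows = sketch_bin_stats([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(len(rows))
print(rows[50])
```

The sketch returns a list of dictionaries rather than a pd.DataFrame to stay dependency-free; passing that list to pd.DataFrame would give a table with one row per threshold, similar in shape to what is described above.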