seismometer.data.performance.calculate_bin_stats
- seismometer.data.performance.calculate_bin_stats(y_true=None, y_pred=None, keep_score_values=False, not_point_thresholds=False, rho=None, threshold_precision=0)
- Calculate summary statistics from y_true and y_pred arrays (for binary classification, y_pred is y_proba[:,1]). Supports y_true and y_pred as individual series-likes or as a dataframe with true and proba columns.
- Parameters:
- y_true (Optional[pd.Series], optional) – Series-like of binary labels.
- y_pred (Optional[pd.Series], optional) – Series-like of probabilities for the positive class.
- keep_score_values (bool, optional) – If True, scores are not converted to percentages (0-100), by default False.
- not_point_thresholds (bool, optional) – If True, does not use point thresholds. By default False, which uses thresholds 0-100.
- rho (float, optional) – The relative risk reduction for NNT calculation, by default DEFAULT_RHO. 
- threshold_precision (int, optional) – Number of decimal places to use when generating thresholds as percentages, by default 0. Higher values improve AUC approximation but increase computation cost.
  - threshold_precision=0 yields thresholds like 0, 1, …, 100 (coarse).
  - threshold_precision=2 yields 0.00, 0.01, …, 100.00 (fine-grained).
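
To illustrate how threshold_precision controls the size of the threshold grid, here is a minimal sketch. The helper name `percent_thresholds` is hypothetical and is not part of seismometer; the library's internal threshold generation may differ.

```python
import numpy as np


def percent_thresholds(threshold_precision: int = 0) -> np.ndarray:
    """Hypothetical helper: percentage thresholds from 0 to 100.

    Assumption: the grid is evenly spaced with a step of
    10 ** -threshold_precision, as the parameter description suggests.
    """
    scale = 10 ** threshold_precision
    # Build the grid from integers and divide once, so float error
    # does not accumulate across steps.
    grid = np.arange(100 * scale + 1) / scale
    return np.round(grid, threshold_precision)


coarse = percent_thresholds(0)  # 0, 1, ..., 100
fine = percent_thresholds(2)    # 0.00, 0.01, ..., 100.00
print(len(coarse), len(fine))
```

Each extra decimal place multiplies the number of thresholds, and therefore the number of rows to compute, by roughly ten.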
 
- Return type:
- pd.DataFrame of stats, with one row for each threshold value between 0 and 100 and columns for basic statistics.
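
The shape of the returned frame, one row per threshold, can be sketched with plain NumPy and pandas. This is an illustration only, not the seismometer implementation: the function name `sketch_bin_stats`, the column names, and the rule that a score at or above the threshold counts as a positive prediction are all assumptions.

```python
import numpy as np
import pandas as pd


def sketch_bin_stats(y_true, y_pred, threshold_precision=0):
    """Hypothetical stand-in that mimics a per-threshold stats table."""
    labels = np.asarray(y_true).astype(bool)
    # Assumption: y_pred holds probabilities in [0, 1]; convert to 0-100.
    scores = np.asarray(y_pred, dtype=float) * 100
    scale = 10 ** threshold_precision
    thresholds = np.arange(100 * scale + 1) / scale

    rows = []
    for t in thresholds:
        flagged = scores >= t  # assumed decision rule
        rows.append(
            {
                "Threshold": t,
                "TP": int(np.sum(flagged & labels)),
                "FP": int(np.sum(flagged & ~labels)),
                "TN": int(np.sum(~flagged & ~labels)),
                "FN": int(np.sum(~flagged & labels)),
            }
        )

    stats = pd.DataFrame(rows)
    # Assumes both classes are present, so the denominators are nonzero.
    stats["Sensitivity"] = stats["TP"] / (stats["TP"] + stats["FN"])
    stats["Specificity"] = stats["TN"] / (stats["TN"] + stats["FP"])
    return stats


y_true = pd.Series([0, 0, 1, 1])
y_pred = pd.Series([0.1, 0.4, 0.35, 0.8])
stats = sketch_bin_stats(y_true, y_pred)
print(stats.loc[[0, 50, 100]])
```

With the default precision the sketch yields 101 rows. The actual function also takes rho for the NNT calculation, which this sketch leaves out.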