seismometer.seismogram.Seismogram

class seismometer.seismogram.Seismogram(*args, **kwargs)

Seismogram is the main orchestrator for the seismometer package. It loads the model data and configuration metadata. It is a singleton, so only one instance can be created.

Seismogram is responsible for:
  1. Loading static data so that it can be treated as immutable.
    1. Model Configuration
    2. Source Data
  2. Holding the current dynamic configuration that is shared state.
    1. Cohort Selection
  3. Providing data access for other objects without compromising the source data.
    1. Merge Event data with Prediction data for Label generation
    2. Cohort-based data selection
    3. Model configuration help texts
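The idea of serving cohort selections without compromising the source data can be illustrated with a minimal, standard-library sketch. DemoDataHolder, its select method, and the sample records are hypothetical names invented for this illustration; they are not part of the seismometer API, and the package itself may protect its data differently.

```python
import copy


class DemoDataHolder:
    """Hypothetical holder that protects its source records (not seismometer's API)."""

    def __init__(self, records):
        # Keep a private deep copy so later changes by the caller do not leak in.
        self._source = copy.deepcopy(records)

    def select(self, **criteria):
        # Return copies of the matching rows; callers can mutate them freely.
        return [
            copy.deepcopy(row)
            for row in self._source
            if all(row.get(key) == value for key, value in criteria.items())
        ]


holder = DemoDataHolder(
    [
        {"id": 1, "age_group": "70+", "score": 0.82},
        {"id": 2, "age_group": "<70", "score": 0.35},
    ]
)

subset = holder.select(age_group="70+")
subset[0]["score"] = 0.0  # mutate the returned copy only

# The stored source record is unchanged by the mutation above.
assert holder.select(age_group="70+")[0]["score"] == 0.82
```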

Because Seismogram is a singleton, the data is loaded from the configuration only the first time it is instantiated. To refresh the instance, either restart the kernel or call Seismogram.kill().
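The singleton lifecycle described here can be sketched with a generic metaclass. This is an assumption-laden illustration of the pattern, not the package's implementation: SingletonMeta and DemoSeismogram are hypothetical names, and only the kill() name mirrors the method mentioned above.

```python
class SingletonMeta(type):
    """Metaclass that caches the first instance of each class (illustrative only)."""

    _instances = {}

    def __call__(cls, *args, **kwargs):
        # Only the first call constructs; later calls ignore their arguments.
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]

    def kill(cls):
        # Drop the cached instance so the next call constructs a fresh one.
        cls._instances.pop(cls, None)


class DemoSeismogram(metaclass=SingletonMeta):
    """Hypothetical stand-in for Seismogram; not the package's implementation."""

    def __init__(self, config=None):
        self.config = config


first = DemoSeismogram(config="a")
second = DemoSeismogram(config="b")  # arguments ignored; same instance returned
assert second is first and second.config == "a"

DemoSeismogram.kill()  # analogous in spirit to Seismogram.kill()
third = DemoSeismogram(config="b")
assert third is not first and third.config == "b"
```

In this sketch a new configuration takes effect only after kill(), which matches the refresh behavior described above.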

__init__(config, dataloader)

Constructor for Seismogram, which can only be instantiated once.

Subsequent calls return the initial instance.

Parameters:
  • config (ConfigProvider) – The configuration provider instance.

  • dataloader (SeismogramLoader, optional) – A loader instance for defining the data loading pipeline.

Methods

__init__(config, dataloader)

Constructor for Seismogram, which can only be instantiated once.

copy_config_metadata()

Loads the base configuration and alerting configuration.

create_cohorts()

Creates data columns for each cohort defined in configuration.

event_aggregation_method(event_col)

Gets the strategy for aggregating scores with respect to the specified event.

event_aggregation_window_hours(event_col)

Gets the window in hours for aggregating scores with respect to the specified event.

event_merge_strategy(event_col)

Gets the strategy for merging scores with respect to the specified event.

load_data(*[, predictions, events, reset])

Loads the seismogram data.

score_bins()

Updates the active values for notebook-scoped selections.

Attributes

censor_threshold

Minimum number of observations for a cohort to be included.

cohort_attribute_count

Number of unique cohort attributes by usage definition.

comparison_time

Column name specifying which time to use as reference when analyzing interventions and outcomes.

dataframe

The working dataframe core to the analyses.

end_time

Latest prediction time.

entity_count

Number of unique prediction entities in the data.

event_types_count

Number of unique outcome KPIs in the data.

events

DataFrame of events, including the target event and other outcomes.

events_columns

Event descriptions.

feature_count

Number of features in the data.

intervention

Name of the first event in the configuration with usage 'intervention'.

outcome

Name of the first event in the configuration with usage 'outcome'.

output

The first configured model output (score).

output_path

The location to write output files.

prediction_count

Number of predictions in the data.

start_time

Earliest prediction time.

target

Name of the target.

target_cols

List of target descriptors (the display name of each event type) for the targets.

target_event

Name of the target event.

time_zero

The time associated with the primary target event.

entity_keys

The one or two columns used as identifiers for data.

predict_time

The column name for the evaluation timestamp.

output_list

The list of columns representing model outputs.
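To make the censor_threshold attribute concrete, the following minimal sketch drops cohorts with too few observations. The function censor_small_cohorts and the sample labels are hypothetical, and the inclusive comparison (a cohort with exactly the threshold count is kept) is an assumption based on the phrase "minimum number of observations"; the package's own boundary handling may differ.

```python
from collections import Counter


def censor_small_cohorts(cohort_labels, censor_threshold):
    """Return observation counts for cohorts that meet the minimum size.

    Hypothetical helper for illustration; not part of the seismometer API.
    """
    counts = Counter(cohort_labels)
    # Assumption: a cohort with exactly `censor_threshold` observations is kept.
    return {cohort: n for cohort, n in counts.items() if n >= censor_threshold}


labels = ["A"] * 12 + ["B"] * 3 + ["C"] * 10
kept = censor_small_cohorts(labels, censor_threshold=10)
assert kept == {"A": 12, "C": 10}  # cohort "B" is censored
```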