seismometer.seismogram.Seismogram

class seismometer.seismogram.Seismogram(*args, **kwargs)

Seismogram is the main orchestrator for the seismometer package. It loads the model data and configuration metadata. It is a singleton, so only one instance can be created.

Seismogram is responsible for:

Loading static data so that it can be treated as immutable:
    1. Model configuration
    2. Source data
Holding the current dynamic configuration that is shared state:
    1. Cohort selection
Providing data access for other objects without compromising the source data:
    1. Merging event data with prediction data for label generation
    2. Cohort-based data selection
    3. Model configuration help texts

Because Seismogram is a single instance, it loads the data from the configuration the first time it is created. To refresh the instance, either restart the kernel or call Seismogram.kill().
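
The single-instance behavior can be sketched with a generic metaclass-based singleton. This is an illustration of the pattern only, not the package's code: `SingletonMeta` and `Orchestrator` are hypothetical stand-ins, and the sketch assumes that arguments passed to later constructor calls are ignored.

```python
class SingletonMeta(type):
    """Hypothetical metaclass that caches one instance per class."""

    _instances = {}

    def __call__(cls, *args, **kwargs):
        # Build the instance only on the first call; later calls reuse it.
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]

    def kill(cls):
        # Drop the cached instance so the next call builds a fresh one.
        cls._instances.pop(cls, None)


class Orchestrator(metaclass=SingletonMeta):
    """Stand-in for Seismogram; not the real class."""

    def __init__(self, config=None):
        self.config = config


first = Orchestrator(config={"name": "initial"})
second = Orchestrator(config={"name": "ignored"})
same_instance = first is second      # both names refer to one object
kept_config = second.config["name"]  # configuration from the first call

Orchestrator.kill()
third = Orchestrator(config={"name": "refreshed"})
is_fresh = third is not first        # a new instance after kill()
```

In this sketch, a changed configuration only takes effect after `kill()`, which mirrors the refresh requirement described above.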

__init__(config=None, dataloader=None)

Constructor for Seismogram, which can only be instantiated once.

Subsequent calls will get the initial instance.

Parameters:

config (ConfigProvider, optional) – The configuration provider instance.
dataloader (SeismogramLoader, optional) – A loader instance for defining the data loading pipeline.

Methods

`__init__`([config, dataloader])	Constructor for Seismogram, which can only be instantiated once.
`copy_config_metadata`()	Loads the base configuration and alerting configuration.
`create_cohorts`()	Creates data columns for each cohort defined in configuration.
`event_aggregation_method`(event_col)	Gets the strategy for aggregating scores with respect to the specified event.
`event_aggregation_window_hours`(event_col)	Gets the window in hours for aggregating scores with respect to the specified event.
`event_merge_strategy`(event_col)	Gets the strategy for merging scores with respect to the specified event.
`get_binary_targets`()
`get_ordinal_categorical_groups`(max_cat_size)
`get_ordinal_categorical_metrics`(max_cat_size)
`load_data`(*[, predictions, events, reset])	Loads the seismogram data.
`score_bins`()	Updates the active values for notebook-scoped selections.

Attributes

`censor_threshold`	Minimum number of observations for a cohort to be included.
`cohort_attribute_count`	Number of unique cohort attributes by usage definition.
`comparison_time`	Column name specifying which time to use as reference when analyzing interventions and outcomes.
`dataframe`	The working dataframe core to the analyses.
`end_time`	Latest prediction time.
`entity_count`	Number of unique prediction entities in the data.
`event_types_count`	Number of unique outcome KPIs in the data.
`events`	DataFrame of events which include the target event and other outcomes.
`events_columns`	Event descriptions.
`feature_count`	Number of features in the data.
`intervention`	First event's name in configuration with usage 'intervention'.
`outcome`	First event's name in configuration with usage 'outcome'.
`output`	The first configured model output (score).
`output_path`	The location to write output files.
`prediction_count`	Number of predictions in the data.
`start_time`	Earliest prediction time.
`target`	Name of the target.
`target_cols`	List of target descriptors (the display names of the event types) for targets.
`target_event`	Name of the target event.
`time_zero`	The time associated with the primary target event.
`entity_keys`	The one or two columns used as identifiers for data.
`predict_time`	The column name for evaluation timestamp.
`output_list`	The list of columns representing model outputs.
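
As a rough illustration of what `censor_threshold` describes, the sketch below drops cohorts with too few observations. It assumes that "minimum" means a cohort is kept when it has at least `threshold` observations; the `censor_small_cohorts` helper and the sample labels are hypothetical and are not part of the package.

```python
from collections import Counter


def censor_small_cohorts(labels, threshold):
    """Return {cohort: count} for cohorts with at least `threshold` observations.

    Assumption: the threshold is inclusive; the package may treat the
    boundary differently.
    """
    counts = Counter(labels)
    return {cohort: n for cohort, n in counts.items() if n >= threshold}


# Hypothetical cohort labels, one per observation.
observations = ["A"] * 12 + ["B"] * 3 + ["C"] * 10
kept = censor_small_cohorts(observations, threshold=10)
```

Here cohort "B" is excluded because it has only 3 observations, while "A" and "C" meet the threshold.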