seismometer.configuration.model.DataUsage

class seismometer.configuration.model.DataUsage(*, entity_id='Id', context_id=None, primary_output='Score', primary_target='Target', predict_time='Time', comparison_time='', event_table=EventTableMap(type='Type', time='EventTime', value='Value'), outputs=[], cohorts=[], features=[], events=[], cohort_hierarchies=[], metrics=[], load_time_filters=[], censor_min_count=10)

The definitions of data to use in a notebook run.

This structure defines what data to load and how to use it. The entity_id and context_id are the possible keys for joining events and predictions, and are also used to summarize predictions to a single entity. Primary output and target are the score and target used in default performance analysis.

The features and outputs lists, when defined, limit the data loaded from the predictions file to only those inputs and outputs (plus primary_output and the cohort attributes). The events list similarly limits the event types that are merged into the working dataframe and made available to analyses.
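The column-limiting rule described above can be pictured with a small stand-alone sketch. It does not call seismometer: `columns_to_load` and its arguments are hypothetical stand-ins, and treating empty lists as "no restriction" is an assumption inferred from the phrase "when defined". Identifier and time columns are left out because the text does not say how they are handled.

```python
def columns_to_load(available, features, outputs, primary_output, cohort_columns):
    """Return the subset of `available` columns to keep, preserving order."""
    if not features and not outputs:
        # Assumption: with neither list defined, nothing is filtered out.
        return list(available)
    # Listed inputs and outputs, plus the primary output and cohort attributes.
    wanted = set(features) | set(outputs) | {primary_output} | set(cohort_columns)
    return [col for col in available if col in wanted]


# Made-up column names for illustration only.
available = ["Id", "Time", "Score", "Score2", "age", "bmi", "site"]
selected = columns_to_load(
    available,
    features=["age"],
    outputs=["Score2"],
    primary_output="Score",
    cohort_columns=["site"],
)
```

Here `selected` keeps only the primary output, the extra output, the one listed feature, and the cohort column; `bmi` is dropped because it was not listed.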

Parameters:
  • entity_id (str)

  • context_id (str | None)

  • primary_output (str)

  • primary_target (str)

  • predict_time (str)

  • comparison_time (str)

  • event_table (EventTableMap)

  • outputs (list[str])

  • cohorts (list[Cohort])

  • features (list[str])

  • events (list[Event])

  • cohort_hierarchies (List[CohortHierarchy])

  • metrics (list[Metric])

  • load_time_filters (list[FilterConfig])

  • censor_min_count (Annotated[int, Ge(ge=10)])
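The `Annotated[int, Ge(ge=10)]` annotation on censor_min_count means that values below 10 are rejected when the model is validated. The sketch below mimics that lower bound with a plain dataclass instead of pydantic, so it runs without seismometer installed; `UsageSketch` is a hypothetical stand-in that mirrors only three of the defaults from the signature, and it raises a plain ValueError rather than pydantic's ValidationError.

```python
from dataclasses import dataclass


@dataclass
class UsageSketch:
    """Hypothetical stand-in mirroring a few DataUsage defaults."""

    entity_id: str = "Id"
    primary_output: str = "Score"
    censor_min_count: int = 10

    def __post_init__(self):
        # Mirrors the Ge(ge=10) constraint: the value must be at least 10.
        if self.censor_min_count < 10:
            raise ValueError("censor_min_count must be greater than or equal to 10")


ok = UsageSketch(censor_min_count=25)

try:
    UsageSketch(censor_min_count=5)
    rejected = False
except ValueError:
    rejected = True
```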

__init__(**data)

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

data (Any)

Return type:

None

Methods

__init__(**data)

Create a new model by parsing and validating input data from keyword arguments.

construct([_fields_set])

copy(*[, include, exclude, update, deep])

Returns a copy of the model.

default_comparison(comparison_time, values)

Return the default comparison_time.

dict(*[, include, exclude, by_alias, ...])

from_orm(obj)

json(*[, include, exclude, by_alias, ...])

model_construct([_fields_set])

Creates a new instance of the model from trusted or pre-validated data, without running validation.

model_copy(*[, update, deep])

Returns a copy of the model.

model_dump(*[, mode, include, exclude, ...])

Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.

model_dump_json(*[, indent, ensure_ascii, ...])

Generates a JSON representation of the model using Pydantic's to_json method.

model_json_schema([by_alias, ref_template, ...])

Generates a JSON schema for a model class.

model_parametrized_name(params)

Compute the class name for parametrizations of generic classes.

model_post_init(context, /)

Override this method to perform additional initialization after __init__ and model_construct.

model_rebuild(*[, force, raise_errors, ...])

Try to rebuild the pydantic-core schema for the model.

model_validate(obj, *[, strict, extra, ...])

Validate a pydantic model instance.

model_validate_json(json_data, *[, strict, ...])

Validate the given JSON data against the Pydantic model.

model_validate_strings(obj, *[, strict, ...])

Validate the given object with string data against the Pydantic model.

parse_file(path, *[, content_type, ...])

parse_obj(obj)

parse_raw(b, *[, content_type, encoding, ...])

reduce_cohorts_to_unique_names(cohorts, values)

Reduces the list of cohorts to unique names.

reduce_events_to_unique_names(events, values)

Reduces the list of events to unique names.

schema([by_alias, ref_template])

schema_json(*[, by_alias, ref_template])

update_forward_refs(**localns)

validate(value)

validate_hierarchies_disjoint(hierarchies, info)

Validates that all cohort hierarchies are disjoint (no column used in more than one hierarchy).
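The uniqueness and disjointness checks named in the method list above can be pictured with plain functions. This is a stand-alone sketch, not seismometer's code: the function names are hypothetical, cohorts and events are represented by bare name strings, each hierarchy is a list of column names, and keeping the first occurrence of a duplicated name is an assumption.

```python
from collections import Counter


def unique_by_name(names):
    """Keep the first occurrence of each name, preserving order."""
    seen = set()
    kept = []
    for name in names:
        if name not in seen:
            seen.add(name)
            kept.append(name)
    return kept


def overlapping_columns(hierarchies):
    """Return the columns that appear in more than one hierarchy, sorted."""
    # set(columns) ensures a column repeated inside one hierarchy counts once.
    counts = Counter(col for columns in hierarchies for col in set(columns))
    return sorted(col for col, n in counts.items() if n > 1)


cohort_names = unique_by_name(["age", "site", "age"])
shared = overlapping_columns([["region", "site"], ["site", "unit"]])
disjoint = overlapping_columns([["region", "site"], ["service", "unit"]])
```

A validator built on `overlapping_columns` would reject the configuration whenever the returned list is non-empty, which matches the "no column used in more than one hierarchy" rule.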

Attributes

model_computed_fields

model_config

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_extra

Get extra fields set during validation.

model_fields

model_fields_set

Returns the set of fields that have been explicitly set on this model instance.

entity_id

The identifier of the entity.

context_id

A secondary identifier used to group an entity_id.

primary_output

Column name of the primary output of the model.

primary_target

The display_name of the primary target event.

predict_time

Column name of the timestamp for each prediction row.

comparison_time

The timestamp to use as reference for comparison.

event_table

Mapping of the non-id columns in events data.

outputs

A list of all columns to consider outputs; does not need to include primary_output.

cohorts

A list of all cohort attributes to make available in selections.

features

A list of all features to load into predictions.

events

A list of all events to load.

cohort_hierarchies

A list of ordered cohort source groups defining hierarchical dependencies.

metrics

A list of all metrics to load.

load_time_filters

A list of filters to apply at load time to reduce the working dataset.

censor_min_count

The minimum size of a cohort to be considered displayable.
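A short sketch of what a minimum displayable cohort size implies in practice. `displayable_cohorts` and the counts are made up for illustration; how seismometer applies the threshold internally is not described here, and treating a count equal to the threshold as displayable is an assumption based on the word "minimum".

```python
def displayable_cohorts(counts, censor_min_count=10):
    """Return only the cohorts with at least `censor_min_count` rows."""
    return {label: n for label, n in counts.items() if n >= censor_min_count}


# Hypothetical cohort sizes for an age attribute.
counts = {"<40": 250, "40-65": 9, "65+": 10}
shown = displayable_cohorts(counts)
```

With the default threshold of 10, the "40-65" group (9 rows) is censored while the other two remain.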