ceteris_paribus package¶

Subpackages¶

ceteris_paribus.plots package

Submodules¶

ceteris_paribus.explainer module¶

class ceteris_paribus.explainer.Explainer(model, var_names, data, y, predict_fun, label)¶

data¶: Alias for field number 2

label¶: Alias for field number 5

model¶: Alias for field number 0

predict_fun¶: Alias for field number 4

var_names¶: Alias for field number 1

y¶: Alias for field number 3

ceteris_paribus.explainer.explain(model, variable_names=None, data=None, y=None, predict_function=None, label=None)¶

Create a unified representation of a model that can be further processed by various explainers.

Parameters:

model – a model to be explained
variable_names – names of variables; if not supplied, they are derived from data
data – data that was used for fitting
y – labels for the data
predict_function – function that takes the data and returns predictions
label – label of the model; if not supplied, the function will try to infer it from the model object, otherwise it is left unset

Returns:

Explainer object
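The "Alias for field number" entries above indicate that Explainer is a named tuple. The following is a minimal sketch of such a unified representation, not the library's implementation. The names ExplainerSketch, explain_sketch and MeanModel are hypothetical, and the fallback rules (using model.predict, reading names from dict keys, labelling by class name) are assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in with the same field order as the documented Explainer.
ExplainerSketch = namedtuple(
    "ExplainerSketch", ["model", "var_names", "data", "y", "predict_fun", "label"]
)


def explain_sketch(model, variable_names=None, data=None, y=None,
                   predict_function=None, label=None):
    """Bundle a model and its context into one tuple (illustrative only)."""
    # Assumption: fall back to the model's own predict method.
    if predict_function is None and hasattr(model, "predict"):
        predict_function = model.predict
    # Assumption: rows are dicts, so names can be read from the first row.
    if variable_names is None and data:
        variable_names = list(data[0].keys())
    # Assumption: infer the label from the model's class name.
    if label is None:
        label = type(model).__name__
    return ExplainerSketch(model, variable_names, data, y, predict_function, label)


class MeanModel:
    """Toy model that predicts the mean of each row's values."""

    def predict(self, rows):
        return [sum(r.values()) / len(r) for r in rows]


rows = [{"age": 30, "bmi": 22.0}, {"age": 50, "bmi": 28.0}]
explainer = explain_sketch(MeanModel(), data=rows, y=[0, 1])
```

Because the result is a tuple, explainer.label and explainer[5] refer to the same value, which matches the "Alias for field number 5" wording in the class description.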

ceteris_paribus.gower module¶

Gower distance is a distance measure that can be used to calculate the similarity between two observations with both categorical and numerical values. It also permits missing values in categorical variables, so it can be applied to any dataset. Here it is used as the default function for finding the observations closest to a given one.

The original paper describing the idea might be found here.

This module calculates Gower's distance (dissimilarity).

ceteris_paribus.gower.gower_distances(data, observation)¶

Return an array of distances between all observations and a chosen one.

Based on:

https://sourceforge.net/projects/gower-distance-4python
https://beta.vu.nl/nl/Images/stageverslag-hoven_tcm235-777817.pdf
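To make the idea concrete, here is a small pure-Python sketch of the usual Gower definition: numerical differences are scaled by the variable's range, categorical variables contribute 0 on a match and 1 otherwise, and the result is averaged over the variables that could be compared. It is not the module's code. The function name, the categorical argument and the choice to skip any variable with a missing value are assumptions.

```python
def gower_distances_sketch(data, observation, categorical):
    """Return Gower distances from each row of `data` to `observation`.

    `data` is a list of equal-length rows, `categorical` is a set of column
    indices treated as categorical, and None marks a missing value.
    """
    n_cols = len(observation)
    # Range of each numerical column, used to scale absolute differences.
    ranges = {}
    for j in range(n_cols):
        if j in categorical:
            continue
        values = [row[j] for row in data + [observation] if row[j] is not None]
        ranges[j] = (max(values) - min(values)) if values else 0.0

    distances = []
    for row in data:
        total, used = 0.0, 0
        for j in range(n_cols):
            a, b = row[j], observation[j]
            if a is None or b is None:
                continue  # assumption: skip variables with a missing value
            if j in categorical:
                total += 0.0 if a == b else 1.0
            elif ranges[j] > 0:
                total += abs(a - b) / ranges[j]
            used += 1
        distances.append(total / used if used else float("nan"))
    return distances


data = [[1.0, "a"], [3.0, "b"], [5.0, None]]
observation = [1.0, "a"]
dists = gower_distances_sketch(data, observation, categorical={1})
# dists == [0.0, 0.75, 1.0]
```

The third row has a missing category, so its distance is based on the numerical column alone.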

ceteris_paribus.profiles module¶

class ceteris_paribus.profiles.CeterisParibus(explainer, new_observation, y, selected_variables, grid_points, variable_splits)¶

Bases: object

print_profile()¶

set_label(label)¶

split_by(column)¶

Split the ceteris paribus profile data frame by the values of a given column.

Returns:	sorted mapping of values to dataframes
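The grouping described here can be sketched without pandas as follows. This only illustrates a "sorted mapping of values to groups" and is not the method's implementation. The function name, the row layout and the column names "variable", "value" and "prediction" are hypothetical.

```python
from collections import defaultdict


def split_by_sketch(rows, column):
    """Group rows by the value in `column`; keys are returned in sorted order."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(row)
    # Dicts preserve insertion order, so inserting sorted keys gives a sorted mapping.
    return {key: groups[key] for key in sorted(groups)}


profile_rows = [
    {"variable": "bmi", "value": 22.0, "prediction": 0.30},
    {"variable": "age", "value": 30.0, "prediction": 0.25},
    {"variable": "age", "value": 40.0, "prediction": 0.35},
]
parts = split_by_sketch(profile_rows, "variable")
```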

ceteris_paribus.profiles.individual_variable_profile(explainer, new_observation, y=None, variables=None, grid_points=101, variable_splits=None)¶

Calculate ceteris paribus profiles for a new observation.

Parameters:

explainer – an Explainer object wrapping the model to be explained, as returned by explain()
new_observation – a new observation for which the profiles are calculated
y – true labels for new_observation. If specified, they are added to the ceteris paribus plots
variables – collection of variables selected for calculating profiles
grid_points – number of points for profile
variable_splits – dictionary of splits for variables, in most cases created with _calculate_variable_splits(). If None, it is calculated from the validation data available in the explainer.

Returns:

instance of CeterisParibus class
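The core idea of a ceteris paribus profile is to vary one variable over a grid while holding all others at the values of the new observation, recording the prediction at each grid point. The sketch below shows that idea only. It does not use the library, and the function names, the dict-based observation, the toy model and the evenly spaced grid (in place of splits derived from data) are assumptions.

```python
def profile_sketch(predict, observation, variable, low, high, grid_points=101):
    """Return (value, prediction) pairs for one variable, others held fixed.

    Requires grid_points >= 2.
    """
    step = (high - low) / (grid_points - 1)
    profile = []
    for i in range(grid_points):
        value = low + i * step
        # Copy the observation and change only the selected variable.
        modified = dict(observation, **{variable: value})
        profile.append((value, predict(modified)))
    return profile


def toy_predict(row):
    # Hypothetical model: linear in age, constant offset for smokers.
    return 2.0 * row["age"] + (10.0 if row["smoker"] else 0.0)


obs = {"age": 40.0, "smoker": True}
profile = profile_sketch(toy_predict, obs, "age", low=20.0, high=60.0, grid_points=5)
# profile == [(20.0, 50.0), (30.0, 70.0), (40.0, 90.0), (50.0, 110.0), (60.0, 130.0)]
```

The original observation is left unchanged, so the same obs can be reused for profiles of other variables.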

ceteris_paribus.select_data module¶

ceteris_paribus.select_data.select_neighbours(data, observation, y=None, variable_names=None, selected_variables=None, dist_fun='gower', n=20)¶

Select observations from the dataset that are similar to a given observation.

Parameters:

data – array or DataFrame with observations
observation – reference observation for neighbours selection
y – labels for observations
variable_names – names of variables
selected_variables – selected variables; requires supplying variable names along with the data
dist_fun – ‘gower’ or a distance function with the same interface as the pairwise distances in sklearn; ‘gower’ works with missing data
n – size of the sample

Returns:

DataFrame with selected observations and pandas Series with corresponding labels if provided
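Neighbour selection amounts to ranking observations by their distance to the reference and keeping the n closest. The following sketch shows this with plain lists and a pluggable distance function. It is not the library's code: the function names are hypothetical, and Euclidean distance stands in for the documented 'gower' default purely to keep the example short.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric rows."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_neighbours_sketch(data, observation, y=None, dist_fun=euclidean, n=20):
    """Return the n rows closest to `observation`, with their labels if given."""
    # Rank row indices by distance to the reference observation.
    order = sorted(range(len(data)), key=lambda i: dist_fun(data[i], observation))[:n]
    rows = [data[i] for i in order]
    labels = [y[i] for i in order] if y is not None else None
    return rows, labels


data = [[0.0, 0.0], [5.0, 5.0], [1.0, 1.0], [9.0, 9.0]]
y = ["a", "b", "c", "d"]
rows, labels = select_neighbours_sketch(data, [0.9, 0.9], y=y, n=2)
# rows == [[1.0, 1.0], [0.0, 0.0]], labels == ["c", "a"]
```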

ceteris_paribus.select_data.select_sample(data, y=None, n=15, seed=42)¶

Select sample from dataset.

Parameters:

data – array or DataFrame with observations
y – labels for observations
n – size of the sample
seed – seed for random number generator

Returns:

selected observations and corresponding labels if provided
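A seeded sample can be sketched with the standard library as shown below. This is an illustration under assumptions rather than the function's implementation: the name select_sample_sketch is hypothetical, and sampling without replacement and capping n at the dataset size are choices made for this example.

```python
import random


def select_sample_sketch(data, y=None, n=15, seed=42):
    """Return n randomly chosen rows and, if given, their labels."""
    rng = random.Random(seed)  # local generator, so global random state is untouched
    n = min(n, len(data))      # assumption: never request more rows than exist
    idx = rng.sample(range(len(data)), n)  # sampling without replacement
    rows = [data[i] for i in idx]
    labels = [y[i] for i in idx] if y is not None else None
    return rows, labels


data = [[i, i * i] for i in range(10)]
y = [i % 2 for i in range(10)]
rows, labels = select_sample_sketch(data, y, n=4)
```

With a fixed seed, repeated calls on the same data return the same rows, which keeps explanations reproducible between runs.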