pycalf package

Submodules

pycalf.metrics module

class pycalf.metrics.AttributeEffect

Bases: object

Estimate the effect of the intervention by attribute.

fit(X: pandas.DataFrame, treatment: pandas.Series, y: pandas.Series, weight: numpy.ndarray | None = None) -> None

Fit the model with X, treatment, y and weight.

Parameters

X : pd.DataFrame

Covariates for propensity score.

treatment : pd.Series

Flags indicating whether each sample received the intervention.

y : pd.Series

Outcome variables.

weight : np.ndarray, optional

The weight of each sample.

Returns

None

transform() -> pandas.DataFrame

Estimate the effect of the intervention by attribute.

Returns

pd.DataFrame

class pycalf.metrics.EffectSize

Bases: object

Calculate the effect size d.

Examples

from pycalf import metrics

# model is assumed to be a fitted propensity model such as pycalf.propensity.IPW
ate_weight = model.get_weight(treatment, mode='ate')
es = metrics.EffectSize()
es.fit(X, treatment, weight=ate_weight)
es.transform()  # returns a dict with 'effect_name' and 'effect_size'

fit(X: pandas.DataFrame, treatment: numpy.ndarray, weight: numpy.ndarray | None = None) -> None

Fit the model with X.

Parameters

X : pd.DataFrame

Covariates for propensity score.

treatment : numpy.ndarray

Flags indicating whether each sample received the intervention.

weight : np.ndarray, optional

The weight of each sample.

Returns

None

fit_transform(X: pandas.DataFrame, treatment: numpy.ndarray, weight: numpy.ndarray | None = None) -> Dict[str, numpy.ndarray]

Fit the model with X and calculate the effect size d.

Parameters

X : pd.DataFrame

Covariates for propensity score.

treatment : numpy.ndarray

Flags indicating whether each sample received the intervention.

weight : np.ndarray, optional

The weight of each sample.

Returns

Dict[str, np.ndarray]

transform() -> Dict[str, numpy.ndarray]

Calculate the effect size d.

Returns

Dict[str, np.ndarray]

Dictionary containing 'effect_name' and 'effect_size' arrays.
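To illustrate what an effect size d measures, the sketch below computes a weighted standardized mean difference for a single covariate in plain Python. The helper names (weighted_mean, weighted_var, effect_size_d) and the pooled-standard-deviation formula are assumptions made for this example. They are not taken from pycalf, whose exact formula may differ.

```python
import math


def weighted_mean(values, weights):
    """Weighted arithmetic mean."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_var(values, weights):
    """Weighted (population) variance around the weighted mean."""
    m = weighted_mean(values, weights)
    return sum(w * (v - m) ** 2 for v, w in zip(values, weights)) / sum(weights)


def effect_size_d(x, treatment, weight=None):
    """Absolute standardized mean difference of one covariate between groups.

    Assumed definition: |mean_t - mean_c| / sqrt((var_t + var_c) / 2).
    """
    if weight is None:
        weight = [1.0] * len(x)
    xt = [v for v, t in zip(x, treatment) if t]
    wt = [w for w, t in zip(weight, treatment) if t]
    xc = [v for v, t in zip(x, treatment) if not t]
    wc = [w for w, t in zip(weight, treatment) if not t]
    pooled_sd = math.sqrt((weighted_var(xt, wt) + weighted_var(xc, wc)) / 2)
    return abs(weighted_mean(xt, wt) - weighted_mean(xc, wc)) / pooled_sd


# Hypothetical covariate: age of three treated and three control samples.
age = [30, 40, 50, 20, 30, 40]
treated = [True, True, True, False, False, False]
d = effect_size_d(age, treated)
```

A value of d close to zero suggests the covariate is balanced between the two groups; passing sample weights shows how balance changes after adjustment.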

class pycalf.metrics.VIF

Bases: object

Variance Inflation Factor (VIF).

fit(data: pandas.DataFrame) -> None

Fit the model with data.

Parameters

data : pd.DataFrame

Returns

None

fit_transform(data: pandas.DataFrame, **kwargs) -> pandas.DataFrame

Fit the model with data and calculate the VIF.

Parameters

data : pd.DataFrame

Returns

result : pd.DataFrame

transform() -> pandas.DataFrame

Calculate the VIF.

Returns

result : pd.DataFrame
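For covariate j, the variance inflation factor is commonly defined as 1 / (1 - R_j^2), where R_j^2 is obtained by regressing covariate j on the remaining covariates. With only two covariates, R_j^2 reduces to the squared Pearson correlation, which keeps the following sketch dependency-free. The function names are hypothetical, and this is not pycalf's implementation.

```python
import math


def pearson_r(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def vif_two_covariates(a, b):
    """VIF shared by both covariates when there are exactly two of them."""
    r = pearson_r(a, b)
    return 1.0 / (1.0 - r ** 2)


# Two hypothetical covariates with correlation 0.8.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
vif = vif_two_covariates(x1, x2)
```

A VIF of 1 means no collinearity; larger values indicate that a covariate is increasingly well explained by the others.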

pycalf.metrics.f1_score(y_true: numpy.ndarray, y_score: numpy.ndarray, threshold: float = 0.5, is_auto: bool = True) -> float

Calculate the F1 score.

Parameters

y_true : numpy.ndarray

The target vector.

y_score : numpy.ndarray

The score vector.

threshold : float

Threshold on the decision function used to compute precision and recall. Default is 0.5.

is_auto : bool

If True, automatically find the optimal threshold. Default is True.

Returns

score : float

F1 score.
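The relationship between a fixed threshold and an automatically chosen one can be sketched as follows. This is a minimal illustration under stated assumptions: scores greater than or equal to the threshold count as positive, and the "optimal" threshold is found by scanning the observed scores. How pycalf actually searches for the threshold is not documented here, and the function names are hypothetical.

```python
def f1_at_threshold(y_true, y_score, threshold=0.5):
    """F1 score after binarizing y_score at the given threshold."""
    y_pred = [s >= threshold for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp == 0:
        return 0.0  # precision and recall are both zero (or undefined)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_f1(y_true, y_score):
    """Largest F1 score over all thresholds taken from the observed scores."""
    return max(f1_at_threshold(y_true, y_score, thr) for thr in set(y_score))


y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
fixed = f1_at_threshold(y_true, y_score, threshold=0.5)
best = best_f1(y_true, y_score)
```

In this toy data the automatically selected threshold gives a higher F1 score than the fixed threshold of 0.5.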

pycalf.propensity module

class pycalf.propensity.DoubleRobust(learner, second_learner)

Bases: IPW

estimate_effect(treatment: numpy.ndarray, mode: str = 'ate') -> tuple[float, float, float]

Calculate the treatment effect using the doubly robust method.

Parameters

treatment : numpy.ndarray[bool]

Flags indicating whether each sample received the intervention.

mode : str, default="ate"

Adjustment method. Must be 'raw', 'ate', 'att' or 'atu'.

Returns

tuple

A tuple containing (avg_y_control, avg_y_treat, effect_size)

Raises

ValueError

If model is not fitted or mode is invalid.
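As background, a doubly robust estimate combines propensity scores with outcome-model predictions. The sketch below uses the textbook augmented inverse probability weighting (AIPW) formula for the average treatment effect. It is an assumption that pycalf follows this form; the function name and the arguments mu_treat and mu_control (outcome predictions under treatment and under control) are hypothetical.

```python
def doubly_robust_ate(treatment, y, p_score, mu_treat, mu_control):
    """AIPW estimate returned as (avg_y_control, avg_y_treat, effect_size)."""
    n = len(y)
    sum_treat = 0.0
    sum_control = 0.0
    for t, yi, e, m1, m0 in zip(treatment, y, p_score, mu_treat, mu_control):
        t = 1.0 if t else 0.0
        # Weighted observed outcome plus a model-based correction term.
        sum_treat += t * yi / e + (1.0 - t / e) * m1
        sum_control += (1.0 - t) * yi / (1.0 - e) + (1.0 - (1.0 - t) / (1.0 - e)) * m0
    avg_y_treat = sum_treat / n
    avg_y_control = sum_control / n
    return avg_y_control, avg_y_treat, avg_y_treat - avg_y_control


# Hypothetical data: two treated and two control samples.
treatment = [True, True, False, False]
y = [3.0, 5.0, 1.0, 2.0]
p_score = [0.5, 0.5, 0.5, 0.5]
mu_treat = [3.0, 5.0, 4.0, 4.0]    # predicted outcome if treated
mu_control = [1.0, 2.0, 1.0, 2.0]  # predicted outcome if not treated
avg_y_control, avg_y_treat, effect = doubly_robust_ate(
    treatment, y, p_score, mu_treat, mu_control
)
```

The estimate stays consistent if either the propensity model or the outcome model is correctly specified, which is the property the name refers to.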

fit(X: pandas.DataFrame, treatment: numpy.ndarray, y: numpy.ndarray, eps: float = 1e-08) -> None

Fit the learner and estimate the propensity score.

Parameters

X : pd.DataFrame

Covariates for propensity score.

treatment : numpy.ndarray[bool]

Flags indicating whether each sample received the intervention.

y : numpy.ndarray

Outcome variables. Can be a 1D or 2D array.

eps : float, default=1e-8

Value used to round extreme propensity scores.

Raises

ValueError

If eps is not in range [0, 1).

class pycalf.propensity.IPW(learner)

Bases: object

Inverse Probability Weighting Method.

estimate_effect(treatment: numpy.ndarray, y: numpy.ndarray, mode: str = 'ate') -> tuple[float, float, float]

Calculate the treatment effect using inverse probability weighting.

Parameters

treatment : numpy.ndarray[bool]

Flags indicating whether each sample received the intervention.

y : numpy.ndarray

Outcome variables.

mode : str

Adjustment method. Must be 'raw', 'ate', 'att' or 'atu'.

Returns

tuple

A tuple containing (avg_y_control, avg_y_treat, effect_size)
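The returned tuple can be understood as two weighted group means and their difference. The sketch below assumes that form and uses standard ATE weights (1 / e for treated samples, 1 / (1 - e) for control samples); the function name weighted_effect is hypothetical and the code does not call pycalf.

```python
def weighted_effect(treatment, y, weight):
    """Weighted group means, returned as (avg_y_control, avg_y_treat, effect_size)."""
    treat_num = sum(w * yi for t, yi, w in zip(treatment, y, weight) if t)
    treat_den = sum(w for t, w in zip(treatment, weight) if t)
    control_num = sum(w * yi for t, yi, w in zip(treatment, y, weight) if not t)
    control_den = sum(w for t, w in zip(treatment, weight) if not t)
    avg_y_treat = treat_num / treat_den
    avg_y_control = control_num / control_den
    return avg_y_control, avg_y_treat, avg_y_treat - avg_y_control


# Hypothetical data with propensity scores assumed to be already estimated.
treatment = [True, True, False, False]
y = [10.0, 20.0, 5.0, 15.0]
p_score = [0.8, 0.5, 0.5, 0.2]

# ATE weights: inverse probability of the group each sample actually belongs to.
ate_weight = [1.0 / e if t else 1.0 / (1.0 - e) for t, e in zip(treatment, p_score)]
avg_y_control, avg_y_treat, effect = weighted_effect(treatment, y, ate_weight)
```

Passing a weight of 1.0 for every sample reproduces the unadjusted difference in group means.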

fit(X: pandas.DataFrame, treatment: numpy.ndarray, y: numpy.ndarray | None = None, eps: float = 1e-08) -> None

Fit the learner and estimate the propensity score.

Parameters

X : pd.DataFrame

Covariates for propensity score.

treatment : numpy.ndarray[bool]

Flags indicating whether each sample received the intervention.

y : numpy.ndarray, optional

Outcome variables.

eps : float, default=1e-8

Value used to round extreme propensity scores.

Raises

ValueError

If eps is not in range [0, 1).
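One plausible reading of eps is that propensity scores are kept inside [eps, 1 - eps] so that inverse-probability weights stay finite. The sketch below illustrates that reading only; it is an assumption rather than a description of pycalf's code, and clip_scores is a hypothetical helper.

```python
def clip_scores(p_score, eps=1e-8):
    """Keep every propensity score inside [eps, 1 - eps]."""
    if not 0 <= eps < 1:
        raise ValueError("eps must be in the range [0, 1)")
    return [min(max(p, eps), 1.0 - eps) for p in p_score]


# Scores of exactly 0 or 1 would otherwise produce division by zero
# when computing weights such as 1 / p or 1 / (1 - p).
clipped = clip_scores([0.0, 0.3, 1.0], eps=0.01)
```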

get_score() -> numpy.ndarray

Return propensity score.

Returns

p_score : numpy.ndarray

Propensity score for each sample.

Raises

ValueError

If model is not fitted.

get_weight(treatment: numpy.ndarray, mode: str = 'ate') -> numpy.ndarray

Return the sample weights for the given adjustment method.

Parameters

treatment : numpy.ndarray[bool]

Flags indicating whether each sample received the intervention.

mode : str, default="ate"

Adjustment method. Must be 'raw', 'ate', 'att' or 'atu'.

Returns

sample_weight : numpy.ndarray

Sample weights.

Raises

ValueError

If mode is not 'raw', 'ate', 'att' or 'atu'.
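The four modes correspond to common weighting schemes in the propensity score literature. The sketch below uses the textbook formulas, with e denoting the propensity score; whether pycalf applies exactly these formulas is an assumption, and ipw_weight is a hypothetical function name.

```python
def ipw_weight(treatment, p_score, mode="ate"):
    """Per-sample weights for the 'raw', 'ate', 'att' and 'atu' modes."""
    if mode not in ("raw", "ate", "att", "atu"):
        raise ValueError("mode must be 'raw', 'ate', 'att' or 'atu'")
    weights = []
    for t, e in zip(treatment, p_score):
        if mode == "raw":
            w = 1.0                                   # no adjustment
        elif mode == "ate":
            w = 1.0 / e if t else 1.0 / (1.0 - e)     # whole population
        elif mode == "att":
            w = 1.0 if t else e / (1.0 - e)           # treated population
        else:  # "atu"
            w = (1.0 - e) / e if t else 1.0           # untreated population
        weights.append(w)
    return weights


# One treated and one control sample, both with propensity score 0.25.
treatment = [True, False]
p_score = [0.25, 0.25]
w_ate = ipw_weight(treatment, p_score, mode="ate")
w_att = ipw_weight(treatment, p_score, mode="att")
w_atu = ipw_weight(treatment, p_score, mode="atu")
```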

class pycalf.propensity.Matching(learner, min_match_dist=0.01)

Bases: object

Matching with propensity score.

Attributes

p_score : numpy.ndarray

Propensity score.

estimate_effect(treatment: numpy.ndarray, y: numpy.ndarray, mode: str = 'ate') -> tuple[float, float, float]

Estimate the treatment effect using propensity score matching.

Parameters

treatment : numpy.ndarray[bool]

Flags indicating whether each sample received the intervention.

y : numpy.ndarray

Outcome variables.

mode : str

Adjustment method. Must be 'raw' or 'ate'.

Returns

tuple

A tuple containing (avg_y_control, avg_y_treat, effect_size)
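To make the idea of matching concrete, the sketch below pairs each treated sample with the nearest unused control sample whose propensity score lies within a maximum distance, then compares mean outcomes over the matched pairs. Treating min_match_dist as such a caliper, and using greedy one-to-one matching, are assumptions of this example; pycalf's matching procedure may differ.

```python
def match_pairs(treatment, p_score, min_match_dist=0.01):
    """Greedy 1:1 nearest-neighbour matching on the propensity score."""
    treated = [i for i, t in enumerate(treatment) if t]
    available = [i for i, t in enumerate(treatment) if not t]
    pairs = []
    for i in treated:
        if not available:
            break
        # Closest remaining control sample in propensity-score distance.
        j = min(available, key=lambda k: abs(p_score[k] - p_score[i]))
        if abs(p_score[j] - p_score[i]) <= min_match_dist:
            pairs.append((i, j))
            available.remove(j)  # each control sample is used at most once
    return pairs


def matched_effect(pairs, y):
    """Mean outcomes over matched pairs: (avg_y_control, avg_y_treat, effect_size)."""
    avg_y_treat = sum(y[i] for i, _ in pairs) / len(pairs)
    avg_y_control = sum(y[j] for _, j in pairs) / len(pairs)
    return avg_y_control, avg_y_treat, avg_y_treat - avg_y_control


# Hypothetical data: two treated samples followed by three control samples.
treatment = [True, True, False, False, False]
p_score = [0.70, 0.40, 0.705, 0.20, 0.395]
y = [10.0, 8.0, 7.0, 3.0, 6.0]
pairs = match_pairs(treatment, p_score, min_match_dist=0.01)
avg_y_control, avg_y_treat, effect = matched_effect(pairs, y)
```

The control sample with score 0.20 has no treated counterpart within the caliper, so it is left out of the comparison.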

fit(X: pandas.DataFrame, treatment: numpy.ndarray, y: numpy.ndarray | None = None) -> None

Fit the learner and estimate the propensity score.

Parameters

X : pd.DataFrame

Covariates for propensity score.

treatment : numpy.ndarray[bool]

Flags indicating whether each sample received the intervention.

y : numpy.ndarray, optional

Outcome variables.

get_score() -> numpy.ndarray

Return propensity score.

Returns

numpy.ndarray

Propensity score.

get_weight(treatment: numpy.ndarray, mode: str = 'ate') -> numpy.ndarray

Return sample weight representing matching.

Parameters

treatment : numpy.ndarray[bool]

Flags indicating whether each sample received the intervention.

mode : str

Adjustment method. Must be 'raw' or 'ate'.

Returns

numpy.ndarray

Sample weight.

pycalf.uplift module

class pycalf.uplift.UpliftModel(learner_treat, learner_control)

Bases: object

Uplift modeling class.

estimate_uplift_score(X: numpy.ndarray) -> numpy.ndarray

Estimate uplift scores.

Parameters

X : numpy.ndarray

Features used to predict the treatment and control probabilities.

Returns

uplift_score : np.ndarray

Uplift Score.
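In a two-model approach, one learner predicts the outcome probability under treatment and another predicts it under control, and the uplift score compares the two. The sketch below assumes the score is the difference of the two probabilities; some implementations use a ratio instead, and which one pycalf uses is not stated here. The function name and the probability values are hypothetical.

```python
def uplift_scores(proba_treat, proba_control):
    """Per-sample uplift: predicted probability if treated minus if not treated."""
    return [pt - pc for pt, pc in zip(proba_treat, proba_control)]


# Hypothetical predictions from the treatment and control learners.
proba_treat = [0.30, 0.10, 0.50]
proba_control = [0.10, 0.10, 0.60]
scores = uplift_scores(proba_treat, proba_control)

# Rank samples from the most to the least promising target for intervention.
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

A negative score, as for the last sample, suggests the intervention is predicted to lower the outcome probability for that sample.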

fit(X_treat: numpy.ndarray, y_treat: numpy.ndarray, X_control: numpy.ndarray, y_control: numpy.ndarray, weight_treat: numpy.ndarray | None = None, weight_control: numpy.ndarray | None = None) -> None

Parameters

X_treat : numpy.ndarray

Features for learner_treat.

y_treat : numpy.ndarray

Labels for learner_treat.

X_control : numpy.ndarray

Features for learner_control.

y_control : numpy.ndarray

Labels for learner_control.

weight_treat : numpy.ndarray or None

Weights for learner_treat.

weight_control : numpy.ndarray or None

Weights for learner_control.

Returns

None

get_auuc(lift: numpy.ndarray) -> float

Parameters

lift : numpy.ndarray

Array of lift values (treatment effect).

Returns

auuc : float

AUUC score.

get_baseline(lift: numpy.ndarray) -> numpy.ndarray

Parameters

lift : numpy.ndarray

Array of lift values (treatment effect).

Returns

base_line : numpy.ndarray

Array of treatment effects under random assignment (the baseline).
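One common way to relate lift, baseline and AUUC is sketched below: the baseline is a straight line from zero to the final lift value (the gain expected when targeting at random), and AUUC is the average gap between the lift curve and that line. Both definitions are assumptions made for illustration and may not match pycalf's formulas; the function names are hypothetical.

```python
def baseline_from_lift(lift):
    """Straight line from 0 to the final lift value, same length as lift."""
    n = len(lift)
    if n == 1:
        return [lift[0]]
    return [lift[-1] * i / (n - 1) for i in range(n)]


def auuc_from_lift(lift):
    """Average difference between the lift curve and its random baseline."""
    baseline = baseline_from_lift(lift)
    return sum(l - b for l, b in zip(lift, baseline)) / len(lift)


# Hypothetical cumulative lift, ordered by descending uplift score.
lift = [0.0, 3.0, 5.0, 6.0, 6.0]
baseline = baseline_from_lift(lift)
auuc = auuc_from_lift(lift)
```

A positive AUUC under this definition means that ranking by uplift score gains more than random targeting would.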

predict(X: numpy.ndarray, treatment: numpy.ndarray, y: numpy.ndarray) -> Tuple[numpy.ndarray, numpy.ndarray]

Parameters

X : numpy.ndarray

Features used to predict the treatment and control probabilities.

treatment : numpy.ndarray[bool]

Flags indicating whether each sample received the intervention.

y : numpy.ndarray

Outcome variables.

Returns

(uplift_score, lift) : tuple

Uplift score and lift values.

pycalf.visualize module

pycalf.visualize.plot_auuc(uplift_score: numpy.ndarray, lift: numpy.ndarray, baseline: numpy.ndarray, auuc: float | None = None, ax: matplotlib.axes.Axes | None = None) -> matplotlib.axes.Axes

Plot Area Under the Uplift Curve (AUUC).

Parameters

uplift_score : numpy.ndarray

Array of uplift scores.

lift : numpy.ndarray

Array of lift values (treatment effect).

baseline : numpy.ndarray

Array of treatment effects under random assignment (the baseline).

auuc : float, optional

AUUC score. Default is None.

ax : matplotlib.axes.Axes, optional

The axes to plot on. If None, a new figure and axes will be created.

Returns

matplotlib.axes.Axes

The axes containing the plot.

pycalf.visualize.plot_effect_size(X: pandas.DataFrame, treatment: numpy.ndarray, weight: numpy.ndarray | None = None, ascending: bool = False, sortbyraw: bool = True, figsize: Tuple[float, float] = (12, 6), threshold: float = 0.1, ax: matplotlib.axes.Axes | None = None) -> matplotlib.axes.Axes

Plot the effect sizes of the covariates.

Parameters

X : pd.DataFrame

Covariates for propensity score.

treatment : numpy.ndarray

Flags indicating whether each sample received the intervention.

weight : numpy.ndarray, optional

The weight of each sample. Default is None.

ascending : bool

If True, sort in ascending order.

sortbyraw : bool

Whether to sort by the raw data (True) or by the weighted data (False).

figsize : tuple

Figure dimension (width, height) in inches.

threshold : float

Threshold value for effect size.

ax : matplotlib.axes.Axes, optional

The axes to plot on. If None, a new figure and axes will be created.

Returns

matplotlib.axes.Axes

The axes containing the plot.

pycalf.visualize.plot_lift_values(labels: List[str], values: List[float | int], figsize: Tuple[float, float] = (12, 6), ax: matplotlib.axes.Axes | None = None) -> matplotlib.axes.Axes

Plot the lift values.

Parameters

labels : List[str]

Labels for x-axis.

values : List[float or int]

Values for y-axis.

figsize : tuple

Figure dimension (width, height) in inches. Default is (12, 6).

ax : matplotlib.axes.Axes, optional

The axes to plot on. If None, a new figure and axes will be created.

Returns

matplotlib.axes.Axes

The axes containing the plot.

pycalf.visualize.plot_probability_distribution(y_true: numpy.ndarray, y_score: numpy.ndarray, figsize: Tuple[float, float] = (12, 6), ax: matplotlib.axes.Axes | None = None) -> matplotlib.axes.Axes

Plot propensity scores, color-coded by the presence or absence of intervention.

Parameters

y_true : numpy.ndarray

The target vector.

y_score : numpy.ndarray

The score vector.

figsize : tuple

Figure dimension (width, height) in inches.

ax : matplotlib.axes.Axes, optional

The axes to plot on. If None, a new figure and axes will be created.

Returns

matplotlib.axes.Axes

The axes containing the plot.

pycalf.visualize.plot_roc_curve(y_true: numpy.ndarray, y_score: numpy.ndarray, figsize: Tuple[float, float] = (7, 6), ax: matplotlib.axes.Axes | None = None) -> matplotlib.axes.Axes

Plot the ROC curve.

Parameters

y_true : numpy.ndarray

The target vector.

y_score : numpy.ndarray

The score vector.

figsize : tuple

Figure dimension (width, height) in inches.

ax : matplotlib.axes.Axes, optional

The axes to plot on. If None, a new figure and axes will be created.

Returns

matplotlib.axes.Axes

The axes containing the plot.

pycalf.visualize.plot_treatment_effect(outcome_name: str, control_effect: float | int, treat_effect: float | int, effect_size: float | int, figsize: Tuple[float, float] | None = None, fontsize: int = 12, ax: matplotlib.axes.Axes | None = None) -> matplotlib.axes.Axes

Plot the effects of the intervention.

Parameters

outcome_name : str

Outcome name, used as the figure title.

control_effect : float or int

Average effect size of the control group.

treat_effect : float or int

Average effect size of the treatment group.

effect_size : float or int

Treatment effect size.

figsize : tuple, optional

Figure dimension (width, height) in inches. Default is None.

fontsize : int

The font size of the text. See matplotlib.text.Text.set_size for possible values.

ax : matplotlib.axes.Axes, optional

The axes to plot on. If None, a new figure and axes will be created.

Returns

matplotlib.axes.Axes

The axes containing the plot.