pycalf package¶

Submodules¶

pycalf.metrics module¶

class pycalf.metrics.AttributeEffect[source]¶

Bases: object

Estimate the effect of the intervention for each attribute.

fit(X: pandas.DataFrame, treatment: pandas.Series, y: pandas.Series, weight: numpy.ndarray | None = None) → None[source]¶

Fit the model with X, treatment, y and weight.

Parameters¶

X (pd.DataFrame): Covariates for propensity score.
treatment (pd.Series): Flags indicating the presence or absence of intervention.
y (pd.Series): Outcome variables.
weight (np.ndarray, optional): The weight of each sample. Default is None.

Returns¶

None

transform() → pandas.DataFrame[source]¶

Estimate the effect of the intervention by attribute.

Returns¶

pd.DataFrame

class pycalf.metrics.EffectSize[source]¶

Bases: object

Calculate the effect size d.

Examples¶

    ate_weight = model.get_weight(treatment, mode='ate')
    es = metrics.EffectSize()
    es.fit(X, treatment, weight=ate_weight)
    es.transform()  # returns a dict with 'effect_name' and 'effect_size'

fit(X: pandas.DataFrame, treatment: numpy.ndarray, weight: numpy.ndarray | None = None) → None[source]¶

Fit the model with X.

Parameters¶

X (pd.DataFrame): Covariates for propensity score.
treatment (np.ndarray): Flags indicating the presence or absence of intervention.
weight (np.ndarray, optional): The weight of each sample. Default is None.

Returns¶

None

fit_transform(X: pandas.DataFrame, treatment: numpy.ndarray, weight: numpy.ndarray | None = None) → Dict[str, numpy.ndarray][source]¶

Fit the model with X and calculate the effect size d.

Parameters¶

X (pd.DataFrame): Covariates for propensity score.
treatment (np.ndarray): Flags indicating the presence or absence of intervention.
weight (np.ndarray, optional): The weight of each sample. Default is None.

Returns¶

Dict[str, np.ndarray]: Dictionary containing ‘effect_name’ and ‘effect_size’ arrays.

transform() → Dict[str, numpy.ndarray][source]¶

Calculate the effect size d.

Returns¶

Dict[str, np.ndarray]: Dictionary containing ‘effect_name’ and ‘effect_size’ arrays.
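To make the quantity concrete, the following is a minimal NumPy sketch of a weighted standardized mean difference for a single covariate. It assumes d is the absolute difference of group means divided by a pooled standard deviation; the function name `standardized_mean_difference` is hypothetical, and pycalf's own formula may differ in detail.

```python
import numpy as np


def standardized_mean_difference(x, treatment, weight=None):
    """Weighted standardized mean difference of one covariate (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    treatment = np.asarray(treatment, dtype=bool)
    w = np.ones_like(x) if weight is None else np.asarray(weight, dtype=float)

    def weighted_mean_var(values, weights):
        # Weighted mean and (population) variance of one group.
        mean = np.average(values, weights=weights)
        var = np.average((values - mean) ** 2, weights=weights)
        return mean, var

    mean_t, var_t = weighted_mean_var(x[treatment], w[treatment])
    mean_c, var_c = weighted_mean_var(x[~treatment], w[~treatment])
    # Assumption: pool the two group variances with equal weight.
    pooled_sd = np.sqrt((var_t + var_c) / 2.0)
    return abs(mean_t - mean_c) / pooled_sd


x = np.array([1.0, 2.0, 3.0, 2.0, 3.0, 4.0])
treatment = np.array([0, 0, 0, 1, 1, 1], dtype=bool)
d = standardized_mean_difference(x, treatment)
```

Passing balancing weights as `weight` should shrink d for covariates that the weighting balances well.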

class pycalf.metrics.VIF[source]¶

Bases: object

Variance Inflation Factor (VIF).

fit(data: pandas.DataFrame) → None[source]¶

Fit the model with data.

Parameters¶

data : pd.DataFrame

Returns¶

None

fit_transform(data: pandas.DataFrame, **kwargs) → pandas.DataFrame[source]¶

Fit the model with data and calculate the VIF.

Parameters¶

data : pd.DataFrame

Returns¶

result : pd.DataFrame

transform() → pandas.DataFrame[source]¶

Calculate the VIF.

Returns¶

result : pd.DataFrame
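For reference, the variance inflation factor of a column is conventionally defined as 1 / (1 - R²), where R² comes from regressing that column on all the others. The sketch below computes it with plain NumPy least squares; `variance_inflation_factors` is a hypothetical helper and is not claimed to match how the VIF class is implemented.

```python
import numpy as np


def variance_inflation_factors(data):
    """VIF of each column: 1 / (1 - R^2) from regressing it on the other columns."""
    X = np.asarray(data, dtype=float)
    n_samples, n_features = X.shape
    vifs = np.empty(n_features)
    for j in range(n_features):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        # Add an intercept column so R^2 is measured around the mean.
        design = np.column_stack([np.ones(n_samples), others])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        residual = target - design @ coef
        r_squared = 1.0 - residual.var() / target.var()
        vifs[j] = 1.0 / (1.0 - r_squared)
    return vifs


rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
c = a + 0.1 * rng.normal(size=200)  # nearly collinear with a
vifs = variance_inflation_factors(np.column_stack([a, b, c]))
```

Here the first and third columns should receive large values, while the independent second column should stay close to 1.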

pycalf.metrics.f1_score(y_true: numpy.ndarray, y_score: numpy.ndarray, threshold: float = 0.5, is_auto: bool = True) → float[source]¶

Calculate the F1 score.

Parameters¶

y_true (numpy.ndarray): The target vector.
y_score (numpy.ndarray): The score vector.
threshold (float): Threshold on the decision function used to compute precision and recall. Default is 0.5.
is_auto (bool): If True, automatically find the optimal threshold. Default is True.

Returns¶

score (float): F1 score.
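The interplay of threshold and is_auto can be illustrated with a small NumPy sketch. It assumes that the automatic mode simply picks the candidate threshold with the highest F1; how pycalf actually searches is not stated here, and both helper names are hypothetical.

```python
import numpy as np


def f1_at_threshold(y_true, y_score, threshold=0.5):
    """F1 score after binarizing the scores at a fixed threshold."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_score) >= threshold
    tp = np.sum(y_true & y_pred)
    fp = np.sum(~y_true & y_pred)
    fn = np.sum(y_true & ~y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom > 0 else 0.0


def best_f1(y_true, y_score):
    """Assumed 'auto' behaviour: try every observed score as a threshold."""
    candidates = np.unique(y_score)
    return max(f1_at_threshold(y_true, y_score, t) for t in candidates)


y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])
fixed = f1_at_threshold(y_true, y_score, threshold=0.5)
tuned = best_f1(y_true, y_score)
```

On this toy data the tuned score is higher than the fixed-threshold score, which is the motivation for an automatic mode.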

pycalf.propensity module¶

class pycalf.propensity.DoubleRobust(learner, second_learner)[source]¶

Bases: IPW

estimate_effect(treatment: numpy.ndarray, mode: str = 'ate') → tuple[float, float, float][source]¶

Calculate the treatment effect using the doubly robust method.

Parameters¶

treatment (numpy.ndarray[bool]): Flags indicating the presence or absence of intervention.
mode (str, default="ate"): Adjustment method. Must be 'raw', 'ate', 'att' or 'atu'.

Returns¶

tuple: A tuple containing (avg_y_control, avg_y_treat, effect_size)

Raises¶

ValueError: If model is not fitted or mode is invalid.

fit(X: pandas.DataFrame, treatment: numpy.ndarray, y: numpy.ndarray, eps: float = 1e-08) → None[source]¶

Fit the learner and estimate the propensity score.

Parameters¶

X (pd.DataFrame): Covariates for propensity score.
treatment (numpy.ndarray[bool]): Flags indicating the presence or absence of intervention.
y (numpy.ndarray): Outcome variables. Can be a 1D or 2D array.
eps (float, default=1e-8): Rounding value for extreme propensity scores.

Raises¶

ValueError: If eps is not in range [0, 1).
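As background, a doubly robust estimate combines a propensity model with outcome models, so that it stays consistent if either one is correctly specified. The sketch below shows the textbook augmented IPW form for the ATE. It is an assumption that DoubleRobust follows this form; `doubly_robust_ate`, `mu_treat` and `mu_control` are hypothetical names for the function and for the outcome predictions under treatment and control.

```python
import numpy as np


def doubly_robust_ate(y, treatment, p_score, mu_treat, mu_control):
    """Augmented IPW estimate of (avg_y_control, avg_y_treat, effect)."""
    t = np.asarray(treatment, dtype=float)
    y = np.asarray(y, dtype=float)
    p = np.asarray(p_score, dtype=float)
    # Outcome-model prediction plus an inverse-probability-weighted residual.
    avg_y_treat = np.mean(mu_treat + t * (y - mu_treat) / p)
    avg_y_control = np.mean(mu_control + (1.0 - t) * (y - mu_control) / (1.0 - p))
    return avg_y_control, avg_y_treat, avg_y_treat - avg_y_control


# Toy data in which the outcome model is exactly right: y = 1 + 2 * treatment.
treatment = np.array([1, 0, 1, 0])
y = np.array([3.0, 1.0, 3.0, 1.0])
p_score = np.array([0.6, 0.4, 0.7, 0.3])
mu_treat = np.full(4, 3.0)
mu_control = np.full(4, 1.0)

avg_c, avg_t, effect = doubly_robust_ate(y, treatment, p_score, mu_treat, mu_control)
```

Because the outcome predictions are exact in this example, the residual terms vanish and the estimate does not depend on the propensity scores.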

class pycalf.propensity.IPW(learner)[source]¶

Bases: object

Inverse Probability Weighting Method.

estimate_effect(treatment: numpy.ndarray, y: numpy.ndarray, mode: str = 'ate') → tuple[float, float, float][source]¶

Calculate treatment effect using inverse probability weighting.

Parameters¶

treatment (numpy.ndarray[bool]): Flags indicating the presence or absence of intervention.
y (numpy.ndarray): Outcome variables.
mode (str): Adjustment method. Must be 'raw', 'ate', 'att' or 'atu'.

Returns¶

tuple: A tuple containing (avg_y_control, avg_y_treat, effect_size)
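The returned tuple can be pictured with a small NumPy sketch of a normalized inverse-probability-weighted estimate for the 'ate' case. It assumes that eps is used to clip propensity scores into [eps, 1 - eps] and that group means are weight-normalized; `ipw_effect` is a hypothetical helper, not the library's code.

```python
import numpy as np


def ipw_effect(treatment, y, p_score, eps=1e-8):
    """Weighted group means and their difference (ATE-style weights)."""
    t = np.asarray(treatment, dtype=bool)
    y = np.asarray(y, dtype=float)
    # Assumed role of eps: keep scores away from 0 and 1 so weights stay finite.
    p = np.clip(np.asarray(p_score, dtype=float), eps, 1.0 - eps)
    avg_y_treat = np.average(y[t], weights=1.0 / p[t])
    avg_y_control = np.average(y[~t], weights=1.0 / (1.0 - p[~t]))
    return avg_y_control, avg_y_treat, avg_y_treat - avg_y_control


treatment = np.array([1, 1, 0, 0])
y = np.array([4.0, 2.0, 1.0, 3.0])
p_score = np.array([0.5, 0.25, 0.5, 0.75])
avg_y_control, avg_y_treat, effect_size = ipw_effect(treatment, y, p_score)
```

The order of the returned values mirrors the documented tuple (avg_y_control, avg_y_treat, effect_size).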

fit(X: pandas.DataFrame, treatment: numpy.ndarray, y: numpy.ndarray | None = None, eps: float = 1e-08) → None[source]¶

Fit the learner and estimate the propensity score.

Parameters¶

X (pd.DataFrame): Covariates for propensity score.
treatment (numpy.ndarray[bool]): Flags indicating the presence or absence of intervention.
y (numpy.ndarray, optional): Outcome variables. Default is None.
eps (float, default=1e-8): Rounding value for extreme propensity scores.

Raises¶

ValueError: If eps is not in range [0, 1).

get_score() → numpy.ndarray[source]¶

Return propensity score.

Returns¶

p_score (numpy.ndarray): Propensity score for each sample.

Raises¶

ValueError: If model is not fitted.

get_weight(treatment: numpy.ndarray, mode: str = 'ate') → numpy.ndarray[source]¶

Return sample weight representing matching.

Parameters¶

treatment (numpy.ndarray[bool]): Flags indicating the presence or absence of intervention.
mode (str, default="ate"): Adjustment method. Must be 'raw', 'ate', 'att' or 'atu'.

Returns¶

sample_weight (numpy.ndarray): Sample weights.

Raises¶

ValueError: If mode is not ‘raw’, ‘ate’, ‘att’ or ‘atu’.
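The four modes correspond to different target populations. The sketch below uses the standard textbook weights for each mode; it is an assumption that pycalf uses these exact formulas (it may, for example, normalize them), and `ipw_weights` is a hypothetical name.

```python
import numpy as np


def ipw_weights(treatment, p_score, mode="ate"):
    """Textbook inverse probability weights for each adjustment mode."""
    t = np.asarray(treatment, dtype=bool)
    p = np.asarray(p_score, dtype=float)
    if mode == "raw":
        # No adjustment: every sample counts once.
        return np.ones_like(p)
    if mode == "ate":
        # Reweight both groups to the whole population.
        return np.where(t, 1.0 / p, 1.0 / (1.0 - p))
    if mode == "att":
        # Reweight controls to resemble the treated group.
        return np.where(t, 1.0, p / (1.0 - p))
    if mode == "atu":
        # Reweight treated samples to resemble the untreated group.
        return np.where(t, (1.0 - p) / p, 1.0)
    raise ValueError("mode must be 'raw', 'ate', 'att' or 'atu'")


treatment = np.array([True, False])
p_score = np.array([0.25, 0.5])
w_ate = ipw_weights(treatment, p_score, mode="ate")
w_att = ipw_weights(treatment, p_score, mode="att")
w_atu = ipw_weights(treatment, p_score, mode="atu")
```

An unknown mode raises ValueError, matching the behaviour documented above.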

class pycalf.propensity.Matching(learner, min_match_dist=0.01)[source]¶

Bases: object

Matching with propensity score.

Attributes¶

p_score (numpy.ndarray): Propensity score.

estimate_effect(treatment: numpy.ndarray, y: numpy.ndarray, mode: str = 'ate') → tuple[float, float, float][source]¶

Match using the propensity score and estimate the treatment effect.

Parameters¶

treatment (numpy.ndarray[bool]): Flags indicating the presence or absence of intervention.
y (numpy.ndarray): Outcome variables.
mode (str): Adjustment method. Must be 'raw' or 'ate'.

Returns¶

tuple: A tuple containing (avg_y_control, avg_y_treat, effect_size)

fit(X: pandas.DataFrame, treatment: numpy.ndarray, y: numpy.ndarray | None = None) → None[source]¶

Fit the learner and estimate the propensity score.

Parameters¶

X (pd.DataFrame): Covariates for propensity score.
treatment (numpy.ndarray[bool]): Flags indicating the presence or absence of intervention.
y (numpy.ndarray, optional): Outcome variables. Default is None.

get_score() → numpy.ndarray[source]¶

Return propensity score.

Returns¶

numpy.ndarray: Propensity score.

get_weight(treatment: numpy.ndarray, mode: str = 'ate') → numpy.ndarray[source]¶

Return sample weight representing matching.

Parameters¶

treatment (numpy.ndarray[bool]): Flags indicating the presence or absence of intervention.
mode (str): Adjustment method. Must be 'raw' or 'ate'.

Returns¶

numpy.ndarray: Sample weight.
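The idea behind propensity score matching can be sketched as greedy one-to-one nearest-neighbour pairing. This is only an illustration: it assumes that min_match_dist acts as the largest propensity-score gap allowed for a pair, which this reference does not state, and `match_on_propensity` is a hypothetical helper rather than the class's actual algorithm.

```python
import numpy as np


def match_on_propensity(treatment, p_score, min_match_dist=0.01):
    """Greedy 1:1 nearest-neighbour matching; returns (treated, control) index pairs."""
    t = np.asarray(treatment, dtype=bool)
    p = np.asarray(p_score, dtype=float)
    control_idx = list(np.flatnonzero(~t))
    pairs = []
    for i in np.flatnonzero(t):
        if not control_idx:
            break  # no unmatched controls left
        dists = np.abs(p[control_idx] - p[i])
        k = int(np.argmin(dists))
        # Assumed rule: accept the pair only if the scores are close enough.
        if dists[k] <= min_match_dist:
            pairs.append((int(i), int(control_idx.pop(k))))
    return pairs


treatment = np.array([1, 1, 0, 0, 0], dtype=bool)
p_score = np.array([0.30, 0.80, 0.305, 0.50, 0.795])
pairs = match_on_propensity(treatment, p_score, min_match_dist=0.01)
strict_pairs = match_on_propensity(treatment, p_score, min_match_dist=0.001)
```

Samples that end up in a pair would receive a non-zero sample weight, and unmatched samples would be dropped from the comparison.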

pycalf.uplift module¶

class pycalf.uplift.UpliftModel(learner_treat, learner_control)[source]¶

Bases: object

Class of Uplift Modeling.

estimate_uplift_score(X: numpy.ndarray) → numpy.ndarray[source]¶

Estimate uplift scores.

Parameters¶

X (numpy.ndarray): Features used to predict the treatment and control probabilities.

Returns¶

uplift_score (np.ndarray): Uplift score.
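With two learners, one trained on treated samples and one on control samples, an uplift score compares their predicted response probabilities for the same features. The sketch below uses the difference of the two probabilities, which is one common definition; the library may define the score differently (for example as a ratio), and `two_model_uplift` and the probability arrays are hypothetical.

```python
import numpy as np


def two_model_uplift(proba_treat, proba_control):
    """Uplift as the difference of two predicted response probabilities."""
    return np.asarray(proba_treat, dtype=float) - np.asarray(proba_control, dtype=float)


# Hypothetical predicted probabilities for four samples.
proba_treat = np.array([0.50, 0.25, 0.75, 0.25])
proba_control = np.array([0.25, 0.25, 0.25, 0.50])

uplift_score = two_model_uplift(proba_treat, proba_control)
# Sample indices ordered from the highest to the lowest uplift.
ranking = np.argsort(-uplift_score)
```

A negative score marks samples for which the intervention is predicted to lower the response probability.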

fit(X_treat: numpy.ndarray, y_treat: numpy.ndarray, X_control: numpy.ndarray, y_control: numpy.ndarray, weight_treat: numpy.ndarray | None = None, weight_control: numpy.ndarray | None = None) → None[source]¶

Parameters¶

X_treat (numpy.ndarray): Features for learner_treat.
y_treat (numpy.ndarray): Labels for learner_treat.
X_control (numpy.ndarray): Features for learner_control.
y_control (numpy.ndarray): Labels for learner_control.
weight_treat (numpy.ndarray or None): Weights for learner_treat.
weight_control (numpy.ndarray or None): Weights for learner_control.

Returns¶

None

get_auuc(lift: numpy.ndarray) → float[source]¶

Parameters¶

lift (numpy.ndarray): Array of lift values (treatment effect).

Returns¶

auuc (float): AUUC score.

get_baseline(lift: numpy.ndarray) → numpy.ndarray[source]¶

Parameters¶

lift (numpy.ndarray): Array of lift values (treatment effect).

Returns¶

base_line (numpy.ndarray): Array of the random treatment effect.
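One way to read these two methods together: the baseline is the lift expected from random targeting, and the AUUC measures how far the model's lift curve sits above it. The sketch below assumes a straight-line baseline from 0 to the final lift value and an AUUC equal to the mean gap between curve and baseline; both definitions and both function names are assumptions, not taken from the library.

```python
import numpy as np


def random_baseline(lift):
    """Assumed baseline: a straight line from 0 to the final lift value."""
    lift = np.asarray(lift, dtype=float)
    return np.linspace(0.0, lift[-1], len(lift))


def auuc_score(lift):
    """Assumed AUUC: mean gap between the lift curve and the baseline."""
    lift = np.asarray(lift, dtype=float)
    return float(np.mean(lift - random_baseline(lift)))


# Hypothetical cumulative lift, with samples sorted by descending uplift score.
lift = np.array([0.0, 3.0, 5.0, 6.0, 6.0])
base_line = random_baseline(lift)
auuc = auuc_score(lift)
```

A curve that bows above the baseline gives a positive score, and a model no better than random targeting gives a score near zero.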

predict(X: numpy.ndarray, treatment: numpy.ndarray, y: numpy.ndarray) → Tuple[numpy.ndarray, numpy.ndarray][source]¶

Parameters¶

X (numpy.ndarray): Features used to predict the treatment and control probabilities.
treatment (numpy.ndarray[bool]): Flags indicating the presence or absence of intervention.
y (numpy.ndarray): Outcome variables.

Returns¶

tuple: (uplift_score, lift), the uplift scores and lift values.

pycalf.visualize module¶

pycalf.visualize.plot_auuc(uplift_score: numpy.ndarray, lift: numpy.ndarray, baseline: numpy.ndarray, auuc: float | None = None, ax: matplotlib.axes.Axes | None = None) → matplotlib.axes.Axes[source]¶

Plot Area Under the Uplift Curve (AUUC).

Parameters¶

uplift_score (numpy.ndarray): Array of uplift scores.
lift (numpy.ndarray): Array of lift values (treatment effect).
baseline (numpy.ndarray): Array of the random treatment effect.
auuc (float, optional): AUUC score. Default is None.
ax (matplotlib.axes.Axes, optional): The axes to plot on. If None, a new figure and axes will be created.

Returns¶

matplotlib.axes.Axes: The axes containing the plot.

pycalf.visualize.plot_effect_size(X: pandas.DataFrame, treatment: numpy.ndarray, weight: numpy.ndarray | None = None, ascending: bool = False, sortbyraw: bool = True, figsize: Tuple[float, float] = (12, 6), threshold: float = 0.1, ax: matplotlib.axes.Axes | None = None) → matplotlib.axes.Axes[source]¶

Plot the effects of the intervention.

Parameters¶

X (pd.DataFrame): Covariates for propensity score.
treatment (numpy.ndarray): Flags indicating the presence or absence of intervention.
weight (numpy.ndarray, optional): The weight of each sample. Default is None.
ascending (bool): Sort in ascending order.
sortbyraw (bool): If True, sort by the raw data; otherwise sort by the weighted data.
figsize (tuple): Figure dimension (width, height) in inches.
threshold (float): Threshold value for effect size.
ax (matplotlib.axes.Axes, optional): The axes to plot on. If None, a new figure and axes will be created.

Returns¶

matplotlib.axes.Axes: The axes containing the plot.

pycalf.visualize.plot_lift_values(labels: List[str], values: List[float | int], figsize: Tuple[float, float] = (12, 6), ax: matplotlib.axes.Axes | None = None) → matplotlib.axes.Axes[source]¶

Plot the lift values.

Parameters¶

labels (List[str]): Labels for x-axis.
values (List[float or int]): Values for y-axis.
figsize (tuple): Figure dimension (width, height) in inches. Default is (12, 6).
ax (matplotlib.axes.Axes, optional): The axes to plot on. If None, a new figure and axes will be created.

Returns¶

matplotlib.axes.Axes: The axes containing the plot.

pycalf.visualize.plot_probability_distribution(y_true: numpy.ndarray, y_score: numpy.ndarray, figsize: Tuple[float, float] = (12, 6), ax: matplotlib.axes.Axes | None = None) → matplotlib.axes.Axes[source]¶

Plot propensity scores, color-coded by the presence or absence of intervention.

Parameters¶

y_true (numpy.ndarray): The target vector.
y_score (numpy.ndarray): The score vector.
figsize (tuple): Figure dimension (width, height) in inches.
ax (matplotlib.axes.Axes, optional): The axes to plot on. If None, a new figure and axes will be created.

Returns¶

matplotlib.axes.Axes: The axes containing the plot.

pycalf.visualize.plot_roc_curve(y_true: numpy.ndarray, y_score: numpy.ndarray, figsize: Tuple[float, float] = (7, 6), ax: matplotlib.axes.Axes | None = None) → matplotlib.axes.Axes[source]¶

Plot the ROC curve.

Parameters¶

y_true (numpy.ndarray): The target vector.
y_score (numpy.ndarray): The score vector.
figsize (tuple): Figure dimension (width, height) in inches.
ax (matplotlib.axes.Axes, optional): The axes to plot on. If None, a new figure and axes will be created.

Returns¶

matplotlib.axes.Axes: The axes containing the plot.

pycalf.visualize.plot_treatment_effect(outcome_name: str, control_effect: float | int, treat_effect: float | int, effect_size: float | int, figsize: Tuple[float, float] | None = None, fontsize: int = 12, ax: matplotlib.axes.Axes | None = None) → matplotlib.axes.Axes[source]¶

Plot the effects of the intervention.

Parameters¶

outcome_name (str): Outcome name, used as the figure title.
control_effect (float or int): Average effect size of the control group.
treat_effect (float or int): Average effect size of the treatment group.
effect_size (float or int): Treatment effect size.
figsize (tuple, optional): Figure dimension (width, height) in inches. Default is None.
fontsize (int): The font size of the text. See matplotlib.text.Text.set_size for possible values.
ax (matplotlib.axes.Axes, optional): The axes to plot on. If None, a new figure and axes will be created.

Returns¶

matplotlib.axes.Axes: The axes containing the plot.