Primitives

Primitives that depend on the underlying distribution

class pydimple.E(P, dep, indep_vars=None, fixed_vars=None, name=None)

Bases: Operator

Marginal or conditional mean of draws from a univariate distribution.

Parameters:
  • P (Distribution) – A distribution.

  • dep (str or Node or PointwiseFunction) – Either a string containing the name of a column in the dataset in P, or a Node/PointwiseFunction.

  • indep_vars (list of str or None, optional) – Variables to regress against.

  • fixed_vars (set of str or None, optional) – A set of conditions expressed as strings, e.g., {'A1==1', 'A2==A1', 'A3*A2<A1+A4'}. Each condition should be a valid Python expression that involves only columns present in P['data'].

  • name (str, optional) – The name of the operator. Defaults to 'E/{E.count}'.

Returns:

A function mapping the variables in indep_vars to the conditional mean of dep given these variables. If fixed_vars is not None, the mean is additionally conditioned on every condition in fixed_vars being True.

Return type:

L2
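The sketch below illustrates what the indep_vars and fixed_vars arguments mean, using plain pandas on a small discrete dataset. It is not pydimple's implementation: the helper empirical_conditional_mean and the toy DataFrame are hypothetical, and the use of DataFrame.eval to interpret the condition strings is an assumption made for illustration.

```python
import pandas as pd


def empirical_conditional_mean(data, dep, indep_vars=None, fixed_vars=None):
    """Empirical conditional mean on discrete data (hypothetical helper)."""
    if fixed_vars:
        # Keep only the rows where every condition string evaluates to True.
        for cond in sorted(fixed_vars):
            data = data[data.eval(cond)]
    if not indep_vars:
        # No independent variables: return the marginal mean.
        return data[dep].mean()
    # One mean per observed combination of the independent variables.
    return data.groupby(indep_vars)[dep].mean()


df = pd.DataFrame({
    "A1": [1, 1, 1, 0, 0, 1],
    "X": [0, 0, 1, 1, 0, 1],
    "Y": [2.0, 4.0, 6.0, 8.0, 10.0, 10.0],
})

# Marginal mean of Y over all rows.
marginal = empirical_conditional_mean(df, "Y")

# Mean of Y given X, restricted to rows where A1 == 1.
by_x = empirical_conditional_mean(df, "Y", indep_vars=["X"], fixed_vars={"A1==1"})
print(by_x)
```

With continuous independent variables an exact group-by is no longer meaningful, which is why indep_vars is described as the set of variables to regress against.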

pydimple.Var(P, dep, indep_vars=None, fixed_vars=None)

Marginal or conditional variance operator, defined as a wrapper for E.

Parameters:
  • P (Distribution) – A distribution.

  • dep (str or Node or PointwiseFunction) – Either a string containing the name of a column in the dataset in P, or a Node/PointwiseFunction.

  • indep_vars (list of str or None, optional) – Variables to regress against.

  • fixed_vars (set of str or None, optional) – A set of conditions expressed as strings, e.g., {'A1==1', 'A2==A1', 'A3*A2<A1+A4'}. Each condition should be a valid Python expression that involves only columns present in P['data'].

Returns:

The marginal or conditional variance.

Return type:

Operator
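A variance can be written entirely in terms of means through the standard identity Var(Y | X) = E[Y^2 | X] - (E[Y | X])^2. Whether pydimple's wrapper uses exactly this form is an assumption; the sketch below only checks the identity numerically with pandas, and the helper cond_mean is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"X": [0, 0, 1, 1], "Y": [1.0, 3.0, 2.0, 6.0]})


def cond_mean(data, values, by):
    """Empirical mean of `values` within each level of column `by`."""
    return values.groupby(data[by]).mean()


# Two conditional means are enough to recover the conditional variance.
mean_y = cond_mean(df, df["Y"], "X")
mean_y2 = cond_mean(df, df["Y"] ** 2, "X")
cond_var = mean_y2 - mean_y ** 2

# Cross-check against the population variance (ddof=0) within each group.
reference = df.groupby("X")["Y"].var(ddof=0)
print(cond_var)
```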

class pydimple.Density(P, dep_vars, indep_vars=None, indep_type=None, name=None, verbose=True)

Bases: Operator

Marginal or conditional density function of a real-valued dependent variable.

Parameters:
  • P (Distribution) – A distribution.

  • dep_vars (list of str) – Dependent variables.

  • indep_vars (list of str or None, optional) – Independent variables.

  • indep_type (str or None, optional) – Either None or a string with one character per entry of indep_vars, giving the type of each variable (c: continuous, u: unordered, o: ordered), e.g., indep_type='ccuo'. If not provided, every variable is assumed to be continuous.

  • fixed_vars (set of str or None, optional) – A set of conditions expressed as strings, e.g., {'A1==1', 'A2==A1', 'A3*A2<A1+A4'}. Each condition should be a valid Python expression that involves only columns present in P['data'].

  • name (str, optional) – The name of the operator. Defaults to 'density/{Density.count}'.

  • verbose (bool, optional) – If True, warnings are issued.

Returns:

A function mapping the variables in indep_vars to the conditional density of dep_vars given these variables. If fixed_vars is not None, the density is additionally conditioned on every condition in fixed_vars being True.

Return type:

L2
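To make the indep_type encoding concrete, the sketch below pairs each character of the string with the corresponding entry of indep_vars. The helper decode_indep_type and the variable names are hypothetical and are not part of pydimple; the validation rules are an assumption based on the parameter description above.

```python
TYPE_CODES = {"c": "continuous", "u": "unordered", "o": "ordered"}


def decode_indep_type(indep_vars, indep_type=None):
    """Map each independent variable to its declared type (hypothetical helper)."""
    if indep_type is None:
        # Default described above: every variable is treated as continuous.
        indep_type = "c" * len(indep_vars)
    if len(indep_type) != len(indep_vars):
        raise ValueError("indep_type needs one character per independent variable")
    unknown = set(indep_type) - set(TYPE_CODES)
    if unknown:
        raise ValueError(f"unknown type codes: {sorted(unknown)}")
    return {var: TYPE_CODES[code] for var, code in zip(indep_vars, indep_type)}


types = decode_indep_type(["age", "income", "region", "grade"], "ccuo")
defaults = decode_indep_type(["age", "income"])
print(types)
```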

class pydimple.embed(f, P, name=None)

Bases: Operator

Embed an element f of L^2(Q) into L^2(P).

Parameters:
  • f (L2) – The function to be embedded, an element of L^2(Q).

  • P (Distribution) – The distribution representing the target L^2 space.

  • name (str, optional) – The name of the embedding operator. Defaults to 'embed/{embed.count}'.

Returns:

The embedding of f into L^2(P).

Return type:

L2

Pointwise operations

class pydimple.Node

Bases: object

__add__(other)
Parameters:
  • self (Node) – First argument to binary operation

  • other (Node, float, int, or PointwiseFunction) – Second argument to binary operation

Returns:

self+other (pointwise operation)

Return type:

Node

__mul__(other)
Parameters:
  • self (Node) – First argument to binary operation

  • other (Node, float, int, or PointwiseFunction) – Second argument to binary operation

Returns:

self*other (pointwise operation)

Return type:

Node

__neg__()
Parameters:

self (Node) – Node to negate

Returns:

-self (pointwise operation)

Return type:

Node

__pow__(other)
Parameters:
  • self (Node) – First argument to binary operation

  • other (Node, float, int, or PointwiseFunction) – Second argument to binary operation

Returns:

self**other (pointwise operation)

Return type:

Node

__radd__(other)
Parameters:
  • self (Node) – First argument to binary operation

  • other (Node, float, int, or PointwiseFunction) – Second argument to binary operation

Returns:

other+self (pointwise operation)

Return type:

Node

__rmul__(other)
Parameters:
  • self (Node) – First argument to binary operation

  • other (Node, float, int, or PointwiseFunction) – Second argument to binary operation

Returns:

other*self (pointwise operation)

Return type:

Node

__rpow__(other)
Parameters:
  • self (Node) – First argument to binary operation

  • other (Node, float, int, or PointwiseFunction) – Second argument to binary operation

Returns:

other**self (pointwise operation)

Return type:

Node

__rtruediv__(other)
Parameters:
  • self (Node) – First argument to binary operation

  • other (Node, float, int, or PointwiseFunction) – Second argument to binary operation

Returns:

other/self (pointwise operation)

Return type:

Node

__sub__(other)
Parameters:
  • self (Node) – First argument to binary operation

  • other (Node, float, int, or PointwiseFunction) – Second argument to binary operation

Returns:

self-other (pointwise operation)

Return type:

Node

__truediv__(other)
Parameters:
  • self (Node) – First argument to binary operation

  • other (Node, float, int, or PointwiseFunction) – Second argument to binary operation

Returns:

self/other (pointwise operation)

Return type:

Node
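The operator methods listed above follow Python's usual pattern for operator overloading: each one returns a new object representing the pointwise result, and the reflected variants (__radd__, __rmul__, and so on) handle the case where a plain number appears on the left. The sketch below shows that pattern with a toy class. ToyNode is hypothetical and is not pydimple's Node; it evaluates eagerly with NumPy and makes no claim about how pydimple represents or evaluates expressions.

```python
import numpy as np


class ToyNode:
    """Hypothetical stand-in for a node: wraps a function of a data dict."""

    def __init__(self, fn):
        self.fn = fn

    def __call__(self, data):
        return self.fn(data)

    @staticmethod
    def _lift(other):
        # Wrap plain numbers so they can be combined with nodes.
        if isinstance(other, ToyNode):
            return other
        return ToyNode(lambda data: other)

    def _binary(self, other, op):
        # Build a new node that applies `op` pointwise when evaluated.
        other = ToyNode._lift(other)
        return ToyNode(lambda data: op(self(data), other(data)))

    def __add__(self, other):
        return self._binary(other, lambda a, b: a + b)

    def __radd__(self, other):
        return self._binary(other, lambda a, b: b + a)

    def __sub__(self, other):
        return self._binary(other, lambda a, b: a - b)

    def __mul__(self, other):
        return self._binary(other, lambda a, b: a * b)

    def __rmul__(self, other):
        return self._binary(other, lambda a, b: b * a)

    def __truediv__(self, other):
        return self._binary(other, lambda a, b: a / b)

    def __rtruediv__(self, other):
        return self._binary(other, lambda a, b: b / a)

    def __pow__(self, other):
        return self._binary(other, lambda a, b: a ** b)

    def __rpow__(self, other):
        return self._binary(other, lambda a, b: b ** a)

    def __neg__(self):
        return ToyNode(lambda data: -self(data))


x = ToyNode(lambda data: np.asarray(data["X"], dtype=float))
y = ToyNode(lambda data: np.asarray(data["Y"], dtype=float))

# Mixes __rmul__, __add__, __pow__, __sub__ and __rtruediv__ in one expression.
expr = 2 * x + y ** 2 - 1 / x
data = {"X": [1.0, 2.0, 4.0], "Y": [3.0, 0.0, 1.0]}
result = expr(data)
print(result)
```

Nothing is computed until expr is called on data; combining nodes only builds up a new callable.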

Other primitives

pydimple.RV(rv_str)

Create a PointwiseFunction that extracts a random variable from a DataFrame.

Parameters:

rv_str (str) – The name of the random variable (column) to extract.

Returns:

A PointwiseFunction that takes a DataFrame as input and returns the column specified by rv_str.

Return type:

PointwiseFunction

Example:
>>> import pandas as pd
>>> from pydimple import RV
>>> df = pd.DataFrame({'X': [1, 2, 3], 'Y': [4, 5, 6]})
>>> X = RV('X')
>>> X(df)
0    1
1    2
2    3
Name: X, dtype: int64