Control

The control module provides utilities for discovery of control targets: nodes whose perturbation is most likely to push a chosen output in a chosen direction. The approach is described in Lee & Cho, Scientific Reports 2019, 9:14289.

Influence matrix

sfa.control.compute_influence estimates the partial-derivative-based influence of every source node on every target, using only the network topology.

Given the propagation update

\[ x(t+1) = \alpha W x(t) + (1 - \alpha) b, \]

the influence matrix \(S\) satisfies

\[ S_{ij} = \frac{\partial x_i}{\partial x_j} = \big(I + \alpha W + \alpha^2 W^2 + \cdots\big)_{ij}, \]

which is approximated by truncating the series. SFA uses the iteration

\[ S(t+1) = \alpha W S(t) + I, \qquad S(0) = I, \]

stops when \(\lVert S(t+1) - S(t) \rVert \le \mathrm{tol}\) or when max_iter is reached, and scales the resulting matrix by \(\beta\).
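The fixed point of this iteration can be checked on a toy network. The following is a minimal NumPy sketch, not SFA's implementation: the function name influence_by_iteration, the 3-node matrix, and the convention that W[i, j] weights the link from node j to node i are assumptions made for illustration. When the spectral radius of \(\alpha W\) is below 1, the iterates converge to \((I - \alpha W)^{-1}\), the sum of the series above, regardless of the starting matrix. Scaling by beta is left out of the sketch.

```python
import numpy as np


def influence_by_iteration(W, alpha=0.9, S0=None, tol=1e-7, max_iter=1000):
    """Iterate S <- alpha * W @ S + I until successive iterates differ by <= tol."""
    n = W.shape[0]
    eye = np.eye(n)
    # Hypothetical default: start from the identity unless S0 is given.
    S = eye.copy() if S0 is None else np.array(S0, dtype=float)
    for _ in range(max_iter):
        S_next = alpha * (W @ S) + eye
        if np.linalg.norm(S_next - S) <= tol:
            return S_next
        S = S_next
    return S


# Toy chain 0 -> 1 -> 2 with a negative feedback link 2 -> 0.
# Assumed convention: W[i, j] is the weight of the link from node j to node i.
W = np.array([
    [0.0, 0.0, -0.5],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
])
alpha = 0.9

S_iter = influence_by_iteration(W, alpha=alpha)
S_other = influence_by_iteration(W, alpha=alpha, S0=0.1 * np.eye(3))
S_exact = np.linalg.inv(np.eye(3) - alpha * W)  # closed form of the series

print(np.allclose(S_iter, S_exact, atol=1e-5))
print(np.allclose(S_other, S_exact, atol=1e-5))
```

Both starting matrices reach the same limit here, which is why the choice of S(0) affects only the number of iterations and not the converged influence values.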

import sfa
import numpy as np
import pandas as pd
from sfa.control import compute_influence

data = sfa.get_avalue(sfa.DataSet().create('BORISOV_2009'))
alg = sfa.AlgorithmSet().create('SP')
alg.params.apply_weight_norm = True
alg.data = data
alg.initialize()

df_inf = compute_influence(
    alg.W,
    alpha=0.9,
    beta=0.1,
    rtype='df',
    outputs=['ERK', 'AKT'],
    n2i=data.n2i,
)
df_inf = df_inf.apply(pd.to_numeric, errors='coerce')

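The returned DataFrame holds np.inf wherever a node is paired with itself (each node's influence on itself), so its columns have object dtype, mixing those infinities with floats. Applying pd.to_numeric with errors='coerce' casts it to a numeric frame before sorting or arithmetic. The infinities remain as floating-point inf after the cast, so exclude them explicitly when ranking sources.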
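The effect of the cast can be seen on a hand-built frame. This is a small pandas-only sketch: the node names and numbers are made up, and the frame only imitates the shape described above rather than coming from compute_influence.

```python
import numpy as np
import pandas as pd

# Made-up object-dtype frame: rows are sources, columns are outputs,
# with np.inf where a node is paired with itself.
df_obj = pd.DataFrame(
    {"ERK": [np.inf, 0.12, -0.03], "AKT": [0.05, np.inf, 0.2]},
    index=["ERK", "AKT", "RAF"],
    dtype=object,
)

# Column-wise cast to a numeric dtype; inf is a valid float and is kept.
df_num = df_obj.apply(pd.to_numeric, errors="coerce")
print(df_num.dtypes)

# Drop the self-influence entry before ranking sources for ERK.
ranked = (
    df_num["ERK"]
    .replace(np.inf, np.nan)
    .dropna()
    .sort_values(ascending=False)
)
print(ranked)
```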

| Parameter | Default | Description |
|---|---|---|
| W | | Weight matrix, e.g. alg.W after alg.initialize(). |
| alpha | 0.9 | Signal-flow contribution. |
| beta | 0.1 | Basal-activity contribution; scales the final S. |
| S | None | Initial influence matrix; defaults to identity. |
| rtype | 'df' | 'df' for pandas.DataFrame, 'array' for numpy.ndarray. |
| outputs | None | Required when rtype='df'; output node names. |
| n2i | None | Required when rtype='df'; the data's name-to-index map. |
| max_iter | 1000 | Iteration cap. |
| tol | 1e-7 | Tolerance for the stopping criterion. |
| device | 'cpu' | 'cpu' or 'gpu:<id>' (requires CuPy). |
| sparse | False | Use SciPy sparse matrices for the CPU path. |

Shortest path length to output (SPLO)

sfa.splo computes, for each (source, output) pair, the shortest path length in the directed network. This is used to bucket candidate sources by how "close" they are to the output.

df_splo = sfa.splo(
    nxdg=data.dg,
    sources=list(data.n2i),
    outputs=['ERK', 'AKT'],
    rtype='df',
)

sfa.max_spl(nxdg) is also available to inspect the diameter-like quantity (the maximum shortest path length in the network) when choosing SPLO bounds.
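To make these quantities concrete, here is a small sketch that computes the same kind of table with NetworkX directly. It is an illustration under assumptions, not the code behind sfa.splo or sfa.max_spl: the toy graph, the helper names splo_table and max_shortest_path_length, and the choice to mark unreachable pairs as NaN are all hypothetical.

```python
import networkx as nx
import pandas as pd

# Toy directed network with two outputs, ERK and AKT.
dg = nx.DiGraph([
    ("EGFR", "RAS"), ("RAS", "RAF"), ("RAF", "MEK"), ("MEK", "ERK"),
    ("EGFR", "PI3K"), ("PI3K", "AKT"),
])


def splo_table(dg, sources, outputs):
    """Shortest path length from each source to each output (NaN if unreachable)."""
    table = pd.DataFrame(index=sources, columns=outputs, dtype=float)
    for out in outputs:
        # With only `target` given, the result is keyed by source node.
        dist = dict(nx.shortest_path_length(dg, target=out))
        for src in sources:
            table.loc[src, out] = dist.get(src, float("nan"))
    return table


def max_shortest_path_length(dg):
    """Largest finite shortest path length over all ordered node pairs."""
    all_pairs = dict(nx.all_pairs_shortest_path_length(dg))
    return max(
        length
        for lengths in all_pairs.values()
        for length in lengths.values()
    )


toy_splo = splo_table(dg, sources=list(dg.nodes), outputs=["ERK", "AKT"])
print(toy_splo)
print(max_shortest_path_length(dg))
```

In this toy graph, EGFR is four steps from ERK but only two from AKT, so the same source can fall into different SPLO buckets depending on the output.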

Prioritizing control candidates

Once you have both influence and SPLO, sfa.control.prioritize groups candidates by SPLO and selects, within each group, the top-ranked sources whose influence on the output has the requested sign (dac):

  • dac=+1: sources with positive influence on the output. Inhibiting them (a negative perturbation) drives the output negative; activating them drives it positive.
  • dac=-1: sources with negative influence on the output. Activating them drives the output negative; inhibiting them drives it positive.

from sfa.control import prioritize

targets = prioritize(
    df_splo=df_splo['ERK'],   # SPLO series for the output of interest.
    df_inf=df_inf,
    output='ERK',
    dac=+1,                   # Inhibit these to suppress ERK.
    thr_rank=3,               # Top-3 per SPLO group; or a fraction in (0, 1).
    min_group_size=0,
    thr_inf=1e-10,
)

sfa.control.arrange_si is the lower-level helper used by prioritize; call it directly when you need the grouped SPLO–Influence DataFrames rather than just the target list. See Discovery of control targets for a worked example that reproduces the dual-output (ERK + AKT) finding from the 2019 paper.
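The group-then-rank idea can be sketched in plain pandas. This is one plausible reading of the description above and not the library's implementation: the function pick_candidates, its filtering order, the ranking by absolute influence, and the sample values are all assumptions, and fractional thr_rank and min_group_size are not handled.

```python
import pandas as pd


def pick_candidates(splo, inf, dac=+1, thr_rank=3, thr_inf=1e-10):
    """Hypothetical stand-in: top sources per SPLO group with the requested sign."""
    df = pd.DataFrame({"splo": splo, "inf": inf}).dropna()
    df = df[df["inf"].abs() > thr_inf]  # drop negligible influence
    df = df[df["inf"] * dac > 0]        # keep only the requested sign
    picked = []
    for _, group in df.groupby("splo"):  # groups come in ascending SPLO order
        ranked = group["inf"].abs().sort_values(ascending=False)
        picked.extend(ranked.index[:thr_rank])
    return picked


# Made-up SPLO and influence values for a single output.
splo = pd.Series({"MEK": 1, "PTP": 1, "RAF": 2, "RAS": 3, "SOS": 3, "GAB1": 3})
inf = pd.Series({"MEK": 0.8, "PTP": -0.4, "RAF": 0.5,
                 "RAS": 0.3, "SOS": 0.6, "GAB1": 1e-12})

positive = pick_candidates(splo, inf, dac=+1, thr_rank=1)
negative = pick_candidates(splo, inf, dac=-1, thr_rank=1)
print(positive)
print(negative)
```

Grouping by SPLO before ranking keeps a distant but influential source from being crowded out by nodes that sit right next to the output.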

Visualizing SPLO–Influence

sfa.plot.siplot draws a grid of horizontal bar charts, one panel per SPLO bucket, with the candidate sources sorted by influence on the output. This makes the relative ranking inside each SPLO group easy to read.

import matplotlib.pyplot as plt
from sfa.plot import siplot

fig = siplot(df_splo['ERK'], df_inf, output='ERK', designated=targets)
plt.show()

The designated argument highlights the names returned by prioritize, so you can confirm the selection visually before applying the perturbations.