Control
The control module provides utilities for discovering control targets: nodes whose perturbation is most likely to push a chosen output in a chosen direction. The approach is described in Lee & Cho, Scientific Reports 2019, 9:14289.
Influence matrix
sfa.control.compute_influence estimates the partial-derivative-based
influence of every source node on every target, using only the network
topology.
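Given the propagation update

\[ x(t+1) = \alpha W x(t) + \beta b, \]

the influence matrix \(S\) satisfies

\[ S = I + \alpha W + \alpha^2 W^2 + \cdots = \sum_{k=0}^{\infty} (\alpha W)^k, \]

which is approximated by truncating the series. SFA uses the iteration

\[ S(t+1) = \alpha W S(t) + I, \qquad S(0) = I, \]

and the iteration stops when \(\lVert S(t+1) - S(t) \rVert \le \mathrm{tol}\) or when max_iter is reached. The result is then scaled by \(\beta\).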
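The truncated-series iteration can be illustrated with a small dependency-free sketch. This is not the code inside sfa.control.compute_influence: the helper names `matmul` and `influence_sketch`, the Frobenius norm as the stopping norm, the left-multiplication order, and the convention that `W[i][j]` holds the weight of the link from node j to node i are all assumptions made for illustration.

```python
def matmul(A, B):
    """Multiply two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def influence_sketch(W, alpha=0.9, beta=0.1, max_iter=1000, tol=1e-7):
    """Approximate beta * sum_k (alpha * W)^k by fixed-point iteration."""
    n = len(W)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    aW = [[alpha * w for w in row] for row in W]
    S = [row[:] for row in identity]  # S(0) = I
    for _ in range(max_iter):
        prod = matmul(aW, S)
        S_next = [[prod[i][j] + identity[i][j] for j in range(n)]
                  for i in range(n)]
        # Frobenius norm of the change between successive iterates.
        diff = sum((S_next[i][j] - S[i][j]) ** 2
                   for i in range(n) for j in range(n)) ** 0.5
        S = S_next
        if diff <= tol:
            break
    # Scale the converged series by beta.
    return [[beta * s for s in row] for row in S]


# Hypothetical toy chain A -> B -> C (assumed convention: W[i][j] is j -> i).
W = [
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
]
S = influence_sketch(W, alpha=0.9, beta=0.1)
```

For this acyclic chain the series terminates after the second power of W, so `S[2][0]` (the influence of A on C) equals beta times alpha squared, and every entry above the diagonal stays zero because no path runs upstream.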
```python
import sfa
import numpy as np
import pandas as pd
from sfa.control import compute_influence

# Load the dataset and build the signal-propagation algorithm.
data = sfa.get_avalue(sfa.DataSet().create('BORISOV_2009'))
alg = sfa.AlgorithmSet().create('SP')
alg.params.apply_weight_norm = True
alg.data = data
alg.initialize()

# Influence of every source node on the ERK and AKT outputs.
df_inf = compute_influence(
    alg.W,
    alpha=0.9,
    beta=0.1,
    rtype='df',
    outputs=['ERK', 'AKT'],
    n2i=data.n2i,
)

# Cast the object-dtype columns to numeric values.
df_inf = df_inf.apply(pd.to_numeric, errors='coerce')
```
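The returned DataFrame holds np.inf wherever a source node is itself
the output (each node's influence on itself), which leaves it with an
object dtype mixing those infinities with floats. Applying
pd.to_numeric with errors='coerce' to each column casts it to a clean
numeric frame before sorting or arithmetic. The infinities survive the
cast as floating-point inf values.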
| Parameter | Default | Description |
|---|---|---|
| `W` | required | Weight matrix, for example `alg.W`. |
| `alpha` | `0.9` | Signal-flow contribution. |
| `beta` | `0.1` | Basal-activity contribution; scales the final `S`. |
| `S` | `None` | Initial influence matrix; defaults to identity. |
| `rtype` | `'df'` | `'df'` for `pandas.DataFrame`, `'array'` for `numpy.ndarray`. |
| `outputs` | `None` | Required when `rtype='df'`; output node names. |
| `n2i` | `None` | Required when `rtype='df'`; the data's name-to-index map. |
| `max_iter` | `1000` | Iteration cap. |
| `tol` | `1e-7` | Tolerance for the stopping criterion. |
| `device` | `'cpu'` | `'cpu'` or `'gpu:<id>'` (requires CuPy). |
| `sparse` | `False` | Use SciPy sparse matrices for the CPU path. |
Shortest path length to output (SPLO)
sfa.splo computes, for each (source, output) pair, the shortest path
length in the directed network. This is used to bucket candidate sources by
how "close" they are to the output.
sfa.max_spl(nxdg) is also available to inspect the diameter-like
quantity (the maximum shortest path length in the network) when choosing
SPLO bounds.
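To make the quantity concrete, the sketch below computes hop-count shortest path lengths to one output with a plain breadth-first search. It does not call sfa.splo, whose signature is not shown here. The function name `shortest_path_lengths_to` and the toy edge list are hypothetical, and treating SPLO as an unweighted hop count is an assumption.

```python
from collections import deque


def shortest_path_lengths_to(output, edges):
    """Return {source: hops} for every node that can reach `output`.

    The search runs backwards from the output over reversed edges, so a
    single traversal covers all candidate sources.
    """
    predecessors = {}
    for src, dst in edges:
        predecessors.setdefault(dst, []).append(src)

    dist = {output: 0}
    queue = deque([output])
    while queue:
        node = queue.popleft()
        for pred in predecessors.get(node, []):
            if pred not in dist:  # first visit is the shortest in BFS
                dist[pred] = dist[node] + 1
                queue.append(pred)
    del dist[output]  # the output itself is not a candidate source
    return dist


# Hypothetical toy edges, not the BORISOV_2009 network.
edges = [
    ("EGFR", "RAS"),
    ("RAS", "RAF"),
    ("RAF", "MEK"),
    ("MEK", "ERK"),
    ("PI3K", "RAS"),
]
splo_erk = shortest_path_lengths_to("ERK", edges)
```

Sources that share a value in `splo_erk` (here EGFR and PI3K, both four hops from ERK) would fall into the same SPLO bucket.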
Prioritizing control candidates
Once you have both influence and SPLO, sfa.control.prioritize groups
candidates by SPLO and selects, within each group, the top-ranked
sources whose influence on the output has the requested sign (dac):
- dac=+1: sources with positive influence on the output. Their inhibition (negative perturbation) drives the output negative; their activation drives it positive.
- dac=-1: sources with negative influence on the output. Their activation drives the output negative; their inhibition drives it positive.
```python
from sfa.control import prioritize

targets = prioritize(
    df_splo=df_splo['ERK'],  # SPLO series for the output of interest.
    df_inf=df_inf,
    output='ERK',
    dac=+1,                  # Inhibit these to suppress ERK.
    thr_rank=3,              # Top-3 per SPLO group; or a fraction in (0, 1).
    min_group_size=0,
    thr_inf=1e-10,
)
```
sfa.control.arrange_si is the lower-level helper used by prioritize;
call it directly when you need the grouped SPLO–Influence DataFrames
rather than just the target list. See
Discovery of control targets for a worked example
that reproduces the dual-output (ERK + AKT) finding from the 2019
paper.
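The grouping-and-ranking idea can be sketched without the library. The following is an illustrative approximation, not the implementation of sfa.control.prioritize: the function `prioritize_sketch` is hypothetical, it ranks by absolute influence, it treats thr_inf as a minimum magnitude, and it ignores the fractional form of thr_rank and the min_group_size filter. The SPLO and influence values are invented for the example.

```python
from collections import defaultdict


def prioritize_sketch(splo, influence, dac=+1, thr_rank=3, thr_inf=1e-10):
    """Pick the top `thr_rank` sources per SPLO group with influence sign `dac`."""
    groups = defaultdict(list)
    for source, hops in splo.items():
        value = influence.get(source)
        if value is None or abs(value) < thr_inf:
            continue  # skip missing or negligible influence
        if value * dac > 0:  # keep only the requested sign
            groups[hops].append(source)

    selected = []
    for hops in sorted(groups):  # walk groups from nearest to farthest
        ranked = sorted(groups[hops], key=lambda s: abs(influence[s]),
                        reverse=True)
        selected.extend(ranked[:thr_rank])
    return selected


# Hypothetical SPLO and influence values for a single output.
splo = {"MEK": 1, "RAF": 2, "RAS": 3, "EGFR": 4, "PI3K": 4, "PTEN": 4}
influence = {"MEK": 0.09, "RAF": 0.081, "RAS": 0.07,
             "EGFR": 0.03, "PI3K": 0.05, "PTEN": -0.04}

positive_sources = prioritize_sketch(splo, influence, dac=+1, thr_rank=1)
negative_sources = prioritize_sketch(splo, influence, dac=-1, thr_rank=1)
```

With thr_rank=1, the four-hop group contributes only PI3K to `positive_sources` because its influence magnitude exceeds that of EGFR, while PTEN is the sole candidate for dac=-1.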
Visualizing SPLO–Influence
sfa.plot.siplot draws a grid of horizontal bar charts, one panel per
SPLO bucket, with the candidate sources sorted by influence on the
output. This makes the relative ranking inside each SPLO group easy to
read.
```python
import matplotlib.pyplot as plt
from sfa.plot import siplot

# One bar-chart panel per SPLO bucket; prioritized targets are highlighted.
fig = siplot(df_splo['ERK'], df_inf, output='ERK', designated=targets)
plt.show()
```
The designated argument highlights the names returned by
prioritize, so you can confirm the selection visually before applying
the perturbations.