Discovery of control targets
This tutorial reproduces the main finding of Lee & Cho, Scientific Reports 2019, 9:14289 using SFA. When both ERK and AKT must be suppressed in an EGF/insulin co-stimulated network, inhibition of GAB1 or IRS is predicted to work, while inhibition of GS or PDK1 fails. The reason is that GS and PDK1 have opposite-sign influences on ERK and AKT, so a single perturbation cannot move both outputs in the same direction.
The bundled BORISOV_2009 dataset has the same EGFR + IR signaling
topology used in the paper (22 nodes, 46 interactions), so we can
reproduce the result in a few dozen lines of Python.
Setup
import numpy as np
import pandas as pd
import sfa
from sfa.control import compute_influence, prioritize
algs = sfa.AlgorithmSet()
ds = sfa.DataSet()
mdata = ds.create('BORISOV_2009')
data = sfa.get_avalue(mdata) # Any condition; we only need the topology.
alg = algs.create('SP')
alg.params.apply_weight_norm = True
alg.data = data
alg.initialize()
After initialize(), alg.W is the normalized weight matrix that the
influence computation operates on.
Compute the influence matrix
Following the paper, we use \(\alpha = 0.9\) and \(\beta = 0.1\) for the influence computation:
df_inf = compute_influence(
    alg.W,
    alpha=0.9,
    beta=0.1,
    rtype='df',
    outputs=['ERK', 'AKT'],
    n2i=data.n2i,
)
df_inf = df_inf.apply(pd.to_numeric, errors='coerce')
compute_influence marks self-loops with np.inf and returns an
object-dtype frame; applying pd.to_numeric column by column with
errors='coerce' converts it to a numeric frame, replacing any entry
that cannot be parsed as a number with NaN.
Inspect the candidates
The paper's key observation is best seen by reading off the influence sign on both outputs:
- GAB1, IRS: positive influence on both ERK and AKT. → Inhibiting them flips both outputs negative, suppressing both simultaneously.
- GS: positive on ERK, negative on AKT. → Inhibition would suppress ERK but up-regulate AKT, moving the outputs in opposite directions.
- PDK1: negative on ERK, positive on AKT. → Inhibition would up-regulate ERK while suppressing AKT, again opposite directions.
This reproduces the paper's conclusion that GS and PDK1 fail to suppress the two outputs simultaneously.
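The sign reading above can be written as a small check. This is a minimal sketch in plain Python: the influence magnitudes are made-up placeholders (only their signs follow the list above), and `classify` is a hypothetical helper, not part of sfa.

```python
# Hypothetical influence values on (ERK, AKT). Only the signs mirror the
# list above; the magnitudes are placeholders, not results from SFA.
influence = {
    "GAB1": (0.12, 0.08),
    "IRS": (0.05, 0.10),
    "GS": (0.09, -0.03),
    "PDK1": (-0.02, 0.11),
}


def classify(inf_erk, inf_akt):
    """Describe what inhibiting a source is expected to do to both outputs.

    Inhibition is a negative perturbation, so the expected change of an
    output has the opposite sign of the source's influence on it.
    """
    d_erk, d_akt = -inf_erk, -inf_akt
    if d_erk < 0 and d_akt < 0:
        return "suppresses both"
    if d_erk > 0 and d_akt > 0:
        return "activates both"
    return "mixed or no effect"


labels = {name: classify(*vals) for name, vals in influence.items()}
for name, label in labels.items():
    print(f"{name:<5s} {label}")
```

With real data, the two values per source would be taken from the ERK and AKT entries of df_inf rather than typed in by hand.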
Find dual-output targets programmatically
For a single output, prioritize groups candidates by SPLO (shortest
path length to the output) and returns the top sources whose
influence has the requested sign. To find candidates whose inhibition
suppresses an output, ask for positive influence (dac=+1): a negative
perturbation of such a source then drives the output negative.
df_splo = sfa.splo(
    nxdg=data.dg,
    sources=list(data.n2i),
    outputs=['ERK', 'AKT'],
    rtype='df',
)
targets_erk = prioritize(
    df_splo['ERK'], df_inf, output='ERK', dac=+1,
    thr_rank=0.5, min_group_size=0, thr_inf=1e-10,
)
targets_akt = prioritize(
    df_splo['AKT'], df_inf, output='AKT', dac=+1,
    thr_rank=0.5, min_group_size=0, thr_inf=1e-10,
)
dual_targets = sorted(set(targets_erk) & set(targets_akt))
print('Inhibition candidates for both ERK and AKT:', dual_targets)
thr_rank can be an integer (top-N per SPLO bucket) or a fraction in
\((0, 1)\) (top fraction). Taking the intersection of the per-output
shortlists yields the nodes that can suppress both outputs with a
single perturbation. GAB1 and IRS appear in the intersection; GS and
PDK1 are filtered out — exactly the paper's finding.
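To make the two readings of thr_rank and the intersection step concrete, here is a minimal sketch in plain Python. It illustrates the described semantics only and is not sfa's implementation: `shortlist_size` and `top_positive` are hypothetical helpers, the node names and scores are placeholders, and rounding the fraction up is an assumption.

```python
import math


def shortlist_size(n_candidates, thr_rank):
    """Translate thr_rank into a number of candidates to keep (hypothetical).

    An int is read as top-N; a float in (0, 1) as a fraction of the bucket.
    Rounding the fraction up is an assumption made for this sketch.
    """
    if isinstance(thr_rank, int):
        return min(thr_rank, n_candidates)
    if 0 < thr_rank < 1:
        return min(n_candidates, math.ceil(thr_rank * n_candidates))
    raise ValueError("thr_rank must be an int or a fraction in (0, 1)")


def top_positive(scores, thr_rank):
    """Keep the highest-ranked sources with positive influence."""
    positive = sorted(
        (name for name, value in scores.items() if value > 0),
        key=lambda name: scores[name],
        reverse=True,
    )
    return positive[:shortlist_size(len(positive), thr_rank)]


# Placeholder influence values for one SPLO bucket; not SFA output.
erk_scores = {"A": 0.30, "B": 0.20, "C": 0.10, "D": -0.05}
akt_scores = {"A": 0.25, "B": -0.10, "C": 0.15, "D": 0.40}

erk_short = top_positive(erk_scores, thr_rank=2)    # integer: top 2
akt_short = top_positive(akt_scores, thr_rank=0.5)  # fraction: top half
dual = sorted(set(erk_short) & set(akt_short))
print(dual)
```

Only sources that survive both per-output shortlists remain in `dual`, which is the same set intersection used for dual_targets above.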
Validate by perturbation
We confirm the prediction by simulating each candidate's inhibition and measuring the change at ERK and AKT.
N = data.A.shape[0]
b0 = np.zeros((N,), dtype=np.float64)
b0[data.n2i['EGF']] = 1
b0[data.n2i['I']] = 1 # EGF + insulin co-stimulation as in the paper.
x_ctrl = alg.compute(b0)
for tgt in ['GAB1', 'IRS', 'GS', 'PDK1']:
    b = b0.copy()
    b[data.n2i[tgt]] = -1  # Inhibit the candidate.
    x = alg.compute(b)
    # Change of each output relative to the unperturbed control.
    d_erk = x[data.n2i['ERK']] - x_ctrl[data.n2i['ERK']]
    d_akt = x[data.n2i['AKT']] - x_ctrl[data.n2i['AKT']]
    print(f"{tgt:<5s} ΔERK={d_erk:+.4f} ΔAKT={d_akt:+.4f}")
The signs of ΔERK and ΔAKT match the prediction read off the
influence table: both negative for GAB1 and IRS, mixed for GS and
PDK1.
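The agreement between the influence signs and the simulated changes can also be checked mechanically rather than by eye. The sketch below is plain Python with placeholder numbers that only follow the sign pattern described above; `sign` and `matches_prediction` are hypothetical helpers, not sfa functions.

```python
def sign(x, tol=1e-12):
    """Return -1, 0 or +1, treating values within tol of zero as zero."""
    return (x > tol) - (x < -tol)


def matches_prediction(influence, delta):
    """Check that inhibition moved the output opposite to the influence sign."""
    return sign(delta) == -sign(influence)


# (influence, simulated delta) per output; placeholder values, not SFA output.
example = {
    "GAB1": {"ERK": (0.12, -0.010), "AKT": (0.08, -0.020)},
    "GS": {"ERK": (0.09, -0.015), "AKT": (-0.03, 0.004)},
}

all_match = all(
    matches_prediction(inf, delta)
    for outputs in example.values()
    for inf, delta in outputs.values()
)
print(all_match)
```

In practice the influence values would come from df_inf and the deltas from the d_erk and d_akt values computed in the loop above.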
Visualizing the SPLO–Influence layout
sfa.plot.siplot draws a panel per SPLO bucket with sources sorted by
influence on the output of interest. The designated argument
highlights the names returned by prioritize, making the selection
easy to confirm visually.
import matplotlib.pyplot as plt
from sfa.plot import siplot
fig = siplot(df_splo['ERK'], df_inf, output='ERK', designated=dual_targets)
plt.show()
For the original analysis — including the mutation-context cases
(SFK for constitutively active RAS, PIP3 for activated PI3K) — see
the paper.