Main

Understanding biological function arising from molecular interactions is central to systems biology1. These systems are often represented as networks, where nodes denote entities (for example, genes or proteins) and edges denote interactions (for example, binding or activation)2,3. However, interactions are context dependent, varying across cell types and conditions. This motivates the construction of context-specific biological networks, reflecting functional interactions in specific scenarios through network inference approaches4,5.

One approach to network inference relies on purely data-driven statistical and machine learning methods6. While powerful, these typically require large datasets and multiple data modalities, limiting their application mainly to areas like gene regulation, where extensive single-cell transcriptomics data are available7,8. Conversely, for many other data modalities—including bulk transcriptomics, (phospho)proteomics and metabolomics—such large-scale datasets are generally scarce, restricting inference to methods such as correlation networks9. While straightforward to compute, correlation-based methods capture only statistical associations and offer no causal interpretability. For example, two proteins may be co-expressed across multiple conditions, but this does not establish whether one regulates the other directly or whether both are controlled by a shared upstream regulator10,11.

Integrating experimental data with biological prior knowledge is an effective alternative, particularly when samples are limited12. Prior knowledge networks (PKNs)—structured repositories of known interactions—help to address data scarcity by providing a useful inductive bias, guiding methods towards biologically plausible hypotheses. Network inference then involves identifying a context-specific subnetwork within a PKN, constraining potential interactions to those with established relevance and thereby improving both accuracy and interpretability.
Numerous methods use PKNs to infer protein–protein interaction (PPI)13, signalling14 and metabolic networks15, typically by mapping data onto a PKN and extracting relevant subnetworks.

Despite these advances, the field remains fragmented: inference methods are often tailored to specific data types, rely on ad hoc assumptions or are limited to particular network structures (for example, signalling networks14 or metabolic networks16). This heterogeneity complicates generalization, systematic benchmarking and the identification of unifying principles. Furthermore, although many studies contain multiple samples (for example, replicates, conditions and timepoints), most PKN-based approaches analyse them independently or in a pairwise manner. Methods analysing samples individually have limited ability to distinguish context-dependent interactions shared across conditions from sample-specific variation17. Non-identifiability—where multiple networks can equally explain the data—further complicates network inference18. More mechanistic approaches, such as Boolean or differential-equation-based models19,20,21, can integrate information from multiple samples. However, they typically require detailed prior knowledge of the network structure, which is often unavailable, and also present scalability issues with larger networks, limiting their applicability.

To address these challenges, we introduce CORNETO (constrained optimization for the recovery of networks from omics), a general, extensible framework for knowledge-driven network inference. CORNETO integrates data across multiple samples, supports various PKN structures (for example, graphs and hypergraphs) and models diverse biological problems (for example, signalling and metabolism) using a single constrained mixed-integer optimization formulation22.
By casting network inference as a joint optimization problem governed by structured sparsity and network flow principles23,24, CORNETO reconciles the strengths of prior knowledge with the robustness of multi-sample modelling. Unlike traditional methods, CORNETO performs inference jointly across samples, enabling information sharing to improve the inference while identifying sample-specific subnetworks. This unified framework demonstrates that seemingly diverse methods can be viewed as special cases of a common optimization problem, thereby enabling systematic extension, integration and benchmarking. CORNETO bridges data-driven methods25,26 and traditional PKN-based approaches, which typically focus on individual samples, such as flux balance analysis (FBA), sparse FBA (sFBA) and iMAT (Integrative Metabolic Analysis Tool) for metabolic networks27, CARNIVAL (CAusal Reasoning for Network identification using Integer VALue programming) for signalling28 or prize-collecting Steiner trees (PCSTs) for protein–protein interactions29. We demonstrate how methods for signalling, metabolic and PPI networks can be extended for joint analysis using CORNETO and evaluate performance on both simulated and real datasets. Finally, we show how CORNETO links network inference with machine learning by enabling the construction of optimized, biologically informed neural networks. The open-source Python package implementing this framework is available via GitHub at https://github.com/saezlab/corneto.

Results

Overview

CORNETO provides a unified framework and Python implementation for prior-knowledge-driven network inference from omics data (Fig. 1). It reformulates network inference as a general constrained optimization problem, enabling the joint inference of multiple context-specific subnetworks from a PKN.
By expressing diverse methods through a common mathematical programming formulation, CORNETO offers flexibility, precise control over biological assumptions, and compatibility with powerful solvers. This approach supports the integration of multiple omics modalities and the inference of a wide range of biologically meaningful substructures, such as paths, trees, directed acyclic graphs or subgraphs, that best explain the data in light of prior knowledge.

Fig. 1: Overview of CORNETO’s framework and example applications.

CORNETO is a flexible framework that allows users to implement and extend diverse network inference methods by reformulating them as constrained optimization problems. For any method, the process begins by mapping omics data (for example, transcriptomics or proteomics) onto PKNs, which can vary in complexity from simple interaction graphs to detailed metabolic networks. CORNETO then reformulates the method into a specific optimization problem, ensuring that it can be optimally solved using mathematical programming techniques. A key contribution of CORNETO is its ability to enable joint inference across multiple samples, leveraging shared patterns to improve network inference. Using this framework, we have extended several methods and demonstrated their effectiveness in diverse biological use cases. Figure adapted from ref. 14 (Fig. 2) under a Creative Commons license CC-BY 4.0.

All methods in CORNETO follow a shared structure: importing a PKN, mapping omics data onto the network, and defining method-specific constraints and objectives. These are encoded as optimization variables and mathematical expressions that determine which substructures are selected.
Because this process is expressed generically in terms of mathematical programming, it is fully decoupled from solver selection and supports rapid prototyping, customization and symbolic manipulation of variables and constraints.

This unified architecture reveals that many widely used network inference methods are special cases of the general optimization framework implemented by CORNETO (Table 1), making it straightforward to extend or improve a wide range of methods using the same underlying optimization principles. Once implemented in CORNETO, any method can be directly extended to multi-sample inference using a structured sparsity-inducing penalty. This allows the joint inference of networks across samples, improving robustness, reducing false positives and capturing both shared and condition-specific features in a principled way.

Table 1 Overview of the novel methods implemented and evaluated in CORNETO, spanning five different applications for network modelling and inference

We illustrate CORNETO’s versatility through applications involving method extension or adaptation: (1) extending FBA to multi-sample scenarios; (2) integrating multi-sample omics data with FBA for context-specific metabolic networks; and (3) using multi-sample and multi-omics data for joint inference of signalling networks. The main text focuses on these primary applications, while the Supplementary Information showcases how CORNETO can also be used to implement (4) Steiner tree-based methods for multi-sample protein interaction modules and kinase–substrate interactions in various contexts, and (5) the automatic construction of optimized, biologically informed neural network architectures trained on single-cell transcriptomics.
Together, these examples demonstrate CORNETO’s support for diverse network-based methods across a range of omics modalities (Table 1).

Unified framework for joint network inference from prior knowledge

CORNETO decomposes network inference into four components: a mapping function ϕ (Fig. 2a), a graph transformation ψ (Fig. 2b), an objective function f, and a set of variables and constraints (Fig. 2c), and casts the entire problem as a mixed-integer network-flow formulation. Network flows provide a mathematical framework for assigning quantities to edges under capacity, directionality and conservation constraints. This abstraction lets us capture a wide variety of inference tasks, such as finding shortest paths, extracting minimal subnetworks, balancing metabolic fluxes, inferring signal propagation pathways and more, by defining appropriate costs or scores on edges and then optimizing the flows accordingly.

Fig. 2: CORNETO unifies network inference methods from prior knowledge and omics, enabling multi-sample network inference through the lens of network flows and mixed-integer optimization.

The framework operates on a context-free PKN \({\mathcal{H}}=({\mathcal{V}},{\mathcal{E}})\) that contains known biological interactions and additional knowledge such as stoichiometric information, gene–protein reaction rules, type of interaction and so on, depending on the type of prior knowledge used. The framework supports multiple types of prior knowledge, including undirected, directed and signed directed graphs and hypergraphs. These graphs can be obtained from databases or be defined by the user. Every network inference method in CORNETO is implemented in terms of data mapping, graph transformations and conversion into a constrained optimization problem using mathematical programming. a, Omics data are transformed and mapped onto the context-free PKN, resulting in n annotated graphs, one per sample or condition.
b, Graphs are transformed by the method to optimize the structure based on the input data and to adapt the graph to be suitable for converting it into an optimization problem using network flows. c, The network inference method is formulated as a constrained optimization problem, using flow and indicator variables created by CORNETO on top of the transformed graphs. The regularization term (orange) is added to the objective function to promote structured sparsity across all samples, enabling joint inference. d, The optimization problem is solved by one of the many mathematical solvers supported by CORNETO. This results in n optimal inferred networks, with proofs of optimality provided by the solvers. e, An example of a signalling network with two samples, where we use only information about receptors (top vertices) and TFs (bottom vertices) to infer the networks propagating the signal. Red colour indicates activation and blue inhibition. Single-sample optimization leads to two networks not sharing any edge, where the size (number of edges) of the union is six, whereas the joint (multi-sample) optimization finds two networks that share one edge, leading to a solution with only five edges from the PKN that correctly explains the observed activations and inhibitions.

Starting from omics data \({\bf{D}}\in {{\mathbb{R}}}^{m\times n}\) and a hypergraph (PKN) \({\mathcal{H}}=({\mathcal{V}},{\mathcal{E}})\), the mapping \(\phi :({\mathcal{H}},{\bf{D}})\to {{\mathcal{H}}}^{{\prime} }\) projects measurements onto \({\mathcal{H}}\), producing an annotated hypergraph \({{\mathcal{H}}}^{{\prime} }\). We use \({\mathcal{H}}\) for general hypergraphs (hyperedges may join multiple vertices) and \({\mathcal{G}}\) for ordinary graphs (edges join exactly two vertices), which can be viewed as a special case of hypergraphs.
This distinction helps to indicate when certain methods do not naturally extend to the hypergraph setting, even though they are implemented uniformly within the framework. Transformations are accordingly written as \({{\mathcal{H}}}^{{\prime} }\to {{\mathcal{H}}}^{{\prime\prime} }\) or \({{\mathcal{G}}}^{{\prime} }\to {{\mathcal{G}}}^{{\prime\prime} }\).

Next, \(\psi :{{\mathcal{H}}}^{{\prime} }\to {{\mathcal{H}}}^{{\prime\prime} }\) applies preprocessing strategies such as pruning unreachable vertices or irrelevant edges and, most importantly, inserting source and sink edges to enable the injection and extraction of flows in a method-specific manner. This transformation constitutes the first step in reformulating a given method into a flow-based problem. In metabolic networks, these edges may already be present in the PKN (for example, importer and exporter reactions) and, thus, may not be required to guarantee that flows can traverse the network.

Applying ϕ and ψ to each sample yields a collection of transformed hypergraphs, which we merge into a union hypergraph \({{\mathcal{H}}}_{u}\). On \({{\mathcal{H}}}_{u}\), the flow for one sample is a vector \({\bf{x}}\in {{\mathbb{R}}}^{| {{\mathcal{E}}}_{u}| }\), whose entries specify the flow of the edges under capacity, directionality and conservation constraints at each vertex; collecting these gives \({\bf{X}}\in {{\mathbb{R}}}^{| {{\mathcal{E}}}_{u}| \times n}\). Binary indicators \({\bf{Y}}\in {\{0,1\}}^{| {{\mathcal{E}}}_{u}| \times n}\) specify which edges carry non-zero flow per sample, and they are used to enforce sparsity across samples.

Flows and indicator variables model the selection of graph structures (paths, trees, directed acyclic graphs or other combinatorial objects), with flows X and edge indicators Y determined automatically by optimization, although any known presence or absence of specific edges can be manually enforced in Y.
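To make the role of flows and edge indicators concrete, the following toy sketch casts a weighted shortest path as a minimum-cost unit flow with binary edge variables. It uses SciPy's generic mixed-integer solver rather than CORNETO's API, and the graph, weights and variable names are illustrative assumptions, not part of the framework itself:

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

# Toy directed graph: vertices 0 (source) to 3 (target); edges as (u, v, weight).
edges = [(0, 1, 1.0), (0, 2, 4.0), (1, 2, 1.0), (1, 3, 5.0), (2, 3, 1.0)]
n_v, n_e = 4, len(edges)

# Vertex-edge incidence matrix: -1 where an edge leaves a vertex, +1 where it enters.
A = np.zeros((n_v, n_e))
for j, (u, v, _) in enumerate(edges):
    A[u, j], A[v, j] = -1.0, 1.0

# Route one unit of flow from vertex 0 to vertex 3; with unit flows, the binary
# edge indicators y double as the flow variables x.
demand = np.array([-1.0, 0.0, 0.0, 1.0])
weights = np.array([w for *_, w in edges])

res = milp(
    c=weights,                                        # minimize total edge weight
    constraints=LinearConstraint(A, demand, demand),  # flow conservation per vertex
    integrality=np.ones(n_e),                         # y is integer, and
    bounds=Bounds(0, 1),                              # bounded to {0, 1}
)
selected = [edges[j][:2] for j in range(n_e) if res.x[j] > 0.5]
print(selected, res.fun)  # path 0 -> 1 -> 2 -> 3 with total weight 3.0
```

Because one unit of flow must travel from source to target, the optimal indicators trace exactly one path; richer structures (trees, directed acyclic graphs) arise from the same machinery under additional constraints.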
CORNETO builds on these variables and constraints and offers a Python application programming interface (API) for defining linear inequalities gi(X, Y, …) ≤ 0 and equalities hj(X, Y, …) = 0, which together define the feasible set Ω (for example, all valid trees from a PKN). Adding an objective \(f:\varOmega \to {\mathbb{R}}\) yields the constrained optimization problem \({\mathcal{P}}=(\,f,\varOmega )\), whose solution via an exact solver gives the optimal X and Y and, thus, the inferred network(s) (Fig. 2d). The generality of the framework allows many methods to be expressed as special cases of constrained optimization. As an illustration, Supplementary Fig. 2 shows how the simple weighted shortest-path algorithm can be formulated using CORNETO’s framework. Once implemented in the framework, such methods automatically extend to multi-sample settings without additional effort.

This extension to multi-sample inference is achieved by introducing a structured sparsity regularizer \(\lambda \| {\bf{Y}}{{\bf{1}}}_{n}{\| }_{0}\), which promotes consistent edge selection of context-specific interactions by jointly penalizing entire edge groups across samples. This results in sparser, more interpretable networks and enhances the detection of shared interactions, because the space of potential interactions that can be selected from the PKN is reduced (Fig. 2e). As with any regularizer, λ is problem specific and should be chosen via hyperparameter tuning. We provide more details about the framework in the Methods and Supplementary Information.

In the following sections, we introduce new multi-sample methods in CORNETO and benchmark them against single-sample approaches on simulated and real data.

CORNETO improves FBA modelling

We first use CORNETO to enhance the capabilities of FBA simulation and modelling for multi-sample settings.
FBA is widely used to predict metabolic fluxes, assess the impact of genetic or environmental changes on metabolism, and guide metabolic engineering strategies.

We demonstrate the applicability of CORNETO across different FBA approaches:

(1) Simulation: We extend the sFBA method to identify minimal metabolic networks that retain essential functional capabilities, such as biomass production. This allows simulation of metabolic states under hypothetical or constrained conditions.

(2) Prediction: We estimate metabolic fluxes from sparse and noisy measurements in multiple samples. By integrating experimental data, this approach constrains flux distributions to biologically feasible solutions, improving the accuracy of flux predictions under sample-limited conditions.

Multi-sample sFBA

sFBA30,31 identifies core metabolic networks sustaining specific functions. Instead of maximizing a target flux (such as biomass), sFBA makes this a constraint and minimizes the number of active reactions, selecting the smallest reaction set fulfilling the metabolic constraints.

However, standard sFBA cannot consider multiple samples simultaneously, analysing each independently32. This overlooks shared pathways used across conditions. Furthermore, the underdetermined nature of FBA yields multiple equivalent optimal solutions33,34, obscuring shared strategies and potentially inflating differences in downstream analyses.

CORNETO’s multi-sample sFBA simultaneously optimizes networks across all samples, minimizing the size of the union of the inferred networks while satisfying sample-specific constraints (for example, nutrients and bounds). Unlike standard FBA, each sample has its own flux vector. Sparsity is applied globally, minimizing the total number of distinct reactions used across all samples, rather than per sample. This avoids overestimating differences and implicitly captures shared pathways.
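The principle of penalizing the union rather than each sample can be sketched as a small mixed-integer problem. This is an illustrative toy in the same SciPy style, not the actual multi-sample sFBA implementation: each sample carries its own flow, per-sample indicators are linked to shared union indicators, and only the union size is minimized, so the samples are pushed towards reusing the same edges:

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

# Shared toy PKN: two sources (0, 1), two intermediates (2, 3), one sink (4).
edges = [(0, 2), (0, 3), (1, 2), (1, 3), (2, 4), (3, 4)]
n_v, n_e, n_s = 5, len(edges), 2                # vertices, edges, samples

A = np.zeros((n_v, n_e))                        # vertex-edge incidence matrix
for j, (u, v) in enumerate(edges):
    A[u, j], A[v, j] = -1.0, 1.0

# Each sample routes one unit of flow from its own source to vertex 4.
demands = [np.array([-1, 0, 0, 0, 1.0]), np.array([0, -1, 0, 0, 1.0])]

# Variables: per-sample indicators y[e, s] (flattened), then union indicators z[e].
n_var = n_e * n_s + n_e
cons = []
for s, d in enumerate(demands):                 # flow conservation per sample
    Ablock = np.zeros((n_v, n_var))
    Ablock[:, s * n_e:(s + 1) * n_e] = A
    cons.append(LinearConstraint(Ablock, d, d))
for s in range(n_s):                            # linking: y[e, s] <= z[e]
    L = np.zeros((n_e, n_var))
    L[:, s * n_e:(s + 1) * n_e] = np.eye(n_e)
    L[:, n_e * n_s:] = -np.eye(n_e)
    cons.append(LinearConstraint(L, -np.inf, 0))

c = np.concatenate([np.zeros(n_e * n_s), np.ones(n_e)])  # minimize |union of edges|
res = milp(c=c, constraints=cons, integrality=np.ones(n_var), bounds=Bounds(0, 1))

union_size = int(round(res.fun))
print(union_size)  # 3: the two samples converge on the same intermediate vertex
```

Solved independently, each sample is free to pick either intermediate, so the union may contain four edges; penalizing the union indicators z makes both samples share one intermediate and its outgoing edge, giving a union of three edges.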
As flux optimization is a constraint, the distinction is between λ = 0 (no sparsity) and λ > 0 (sparsity applied).

To test performance, we used the MitoCore metabolic network35 (555 reactions, 441 metabolites), simulating single reaction knockouts as different samples (Supplementary Information). We compared single-sample sFBA and CORNETO multi-sample sFBA, enforcing biomass production at ≥10%, 50% and 90% of the optimal level for each knockout (Fig. 3a).

Fig. 3: Simulation study using CORNETO’s capabilities for multi-sample FBA modelling.

a, The number of different reactions selected using the single-sample sFBA (orange) versus CORNETO’s multi-sample sFBA (blue), constraining the optimal production of biomass to be a minimum (min.) of 90% (top), 50% (middle) and 10% (bottom) of the optimal attainable biomass for each sample, and for four different scenarios considering 2, 4, 8 and 16 samples. Each sample represents a single, simulated knockout (condition). For each scenario, we generated ten independent subsets of the specified size by randomly sampling conditions from the full pool of knockouts (n = 10 per boxplot). b, The average selection proportion of each reaction across different sample sizes (2, 4, 8 and 16 samples), where 1 indicates selection of the reaction in every case. Results are shown for the top 30 reactions with the greatest differences in selection frequency between the multi-sFBA and single-sFBA methods, under the minimum 90% biomass constraint setting. c, The performance of the sparse flux adjustment method leaving out different percentages of metabolic fluxes across samples, and under different Gaussian noise, for different regularization strengths (λ). The vertical dashed line indicates the baseline results for λ = 0.

CORNETO’s multi-sample sFBA consistently produces networks with fewer unique reactions than standard sFBA. The median network size reduction was 6.02% (at ≥90% optimal biomass), 7.21% (≥50%) and up to 30% (at ≥10%).
These differences reflect how the flexibility of flux distributions decreases as the biomass optimality threshold increases: requiring solutions to approach the theoretical optimum tightly constrains the feasible space, leaving fewer alternative pathways and, therefore, fewer differences between the single- and multi-sample sFBA solutions.

At the ≥90% biomass threshold, multi-sample sFBA more frequently selects key glycolysis reactions, for example, hexokinase 1 (HEX1), glucose-6-phosphate isomerase (PGI) and phosphofructokinase (PFK) (Fig. 3b). This preference for canonical glycolytic reactions aligns with the minimal pathway necessary for biomass production36. While single-sample sFBA might identify alternative pathways, multi-sample sFBA consistently selects more canonical glycolytic reactions across knockouts. This suggests that glycolysis is a more robust minimal route than indicated by single-sample analysis, which has a higher risk of selecting alternative pathways and exaggerating intersample differences.

In summary, CORNETO supports extending FBA-based methods, as demonstrated by implementing multi-sample sFBA. This method provides an optimal sFBA extension for joint multi-sample analysis: it is optimal in the sense that it guarantees finding the smallest network fulfilling all constraints across samples, if one exists.

Multi-sample metabolic flux adjustment

Next, we tested whether CORNETO can be extended to predict unobserved metabolic fluxes from sparse and noisy measurements. We developed a multi-sample metabolic flux adjustment method analogous to sFBA but based on ℓ2-regularized flux adjustment, subject to the FBA constraints and the same global structured regularization penalty controlled by λ.
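The core of such a flux adjustment step can be sketched in a stripped-down form: a single sample, no noise and no structured penalty, with a toy stoichiometric matrix and observations invented for illustration. The steady-state constraint S v = 0 restricts fluxes to the nullspace of S, and fitting the nullspace coordinates to the few observed fluxes recovers the missing ones:

```python
import numpy as np
from scipy.linalg import null_space

# Toy stoichiometric matrix S (rows: metabolites A, B, C; columns: reactions
# r1: ->A, r2: A->B, r3: A->C, r4: B->, r5: C->).
S = np.array([
    [1.0, -1.0, -1.0,  0.0,  0.0],   # A
    [0.0,  1.0,  0.0, -1.0,  0.0],   # B
    [0.0,  0.0,  1.0,  0.0, -1.0],   # C
])

# Only two fluxes were "measured"; the other three are missing.
observed = {0: 3.0, 1: 1.0}          # v1 = 3, v2 = 1

# Every steady-state flux vector lies in the nullspace of S: v = N @ t.
N = null_space(S)

# Least-squares fit of the free coordinates t to the observed fluxes only.
rows = list(observed)
m = np.array([observed[i] for i in rows])
t, *_ = np.linalg.lstsq(N[rows, :], m, rcond=None)
v = N @ t

print(np.round(v, 6))  # recovers v3 = 2, v4 = 1, v5 = 2 from two observations
```

Here mass balance alone (v1 = v2 + v3, v4 = v2, v5 = v3) pins down the three unobserved fluxes; with noisy or insufficient observations, the ℓ2 and structured sparsity terms described above regularize this otherwise underdetermined fit.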
To evaluate our approach, we sampled steady-state metabolic fluxes from the MitoCore metabolic network and systematically introduced missing values (Supplementary Information). In the noiseless setting with only 20% of missing values, the test root mean square error (RMSE) remains extremely low, around 1 × 10−8, indicating near-perfect reconstruction of the metabolic fluxes (Fig. 3c). The gap between the regularized and unregularized models increases drastically as more missing values are introduced, but the model maintains a test RMSE of around 1 × 10−8 for λ > 0 in the noiseless setting, suggesting that structured regularization effectively constrains the solution space even with substantial data loss. Adding noise decreases the accuracy of flux predictions, leading to higher RMSE values; however, there is a consistent advantage in using structured regularization, with test error still lower for noise levels up to 1.0. This advantage is particularly evident when 50% or more of the flux measurements are missing, where the global regularization term helps to stabilize predictions and prevent overfitting to noisy observations (Supplementary Fig. 3).

These results highlight that leveraging structured regularization in a targeted manner enables more accurate flux predictions with reduced data requirements, reinforcing the notion that sparser models can outperform large, comprehensive networks in certain contexts, as has been demonstrated by other methods37,38,39.

CORNETO enables multi-sample inference of context-specific metabolic networks from omics data

Integrating omics data enhances inference of context-specific metabolic networks. Techniques like iMAT27, FastCORE40 or GIMME41 map omics measurements onto metabolic networks and optimize a score function maximizing selection of context-specific reactions, subject to metabolic constraints42.
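The flavour of this score-based selection can be sketched as a small mixed-integer problem. This is an illustrative toy rather than iMAT's exact formulation: the network, scores, big-M constant and activity threshold are all invented for the example. Binary activity indicators are tied to continuous fluxes through big-M and minimum-activity constraints, and the total score of active reactions is maximized subject to steady state:

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

# Toy network: r1: ->A, r2: A->B, r3: A->C, r4: B->, r5: C->.
S = np.array([
    [1.0, -1.0, -1.0,  0.0,  0.0],
    [0.0,  1.0,  0.0, -1.0,  0.0],
    [0.0,  0.0,  1.0,  0.0, -1.0],
])
n_r = S.shape[1]
scores = np.array([0.0, 1.0, -1.0, 0.0, 0.0])  # r2 "highly", r3 "lowly" expressed
M, eps = 10.0, 1e-3                            # big-M bound and activity threshold

# Variables: fluxes x (continuous), then activity indicators y (binary).
Z = np.zeros((S.shape[0], n_r))
cons = [
    LinearConstraint(np.hstack([S, Z]), 0, 0),                        # S x = 0
    LinearConstraint(np.hstack([np.eye(n_r), -M * np.eye(n_r)]),
                     -np.inf, 0),                                     # x <= M y
    LinearConstraint(np.hstack([-np.eye(n_r), eps * np.eye(n_r)]),
                     -np.inf, 0),                                     # x >= eps y
]
lb = np.concatenate([[1.0], np.zeros(n_r - 1), np.zeros(n_r)])        # import flux >= 1
ub = np.concatenate([M * np.ones(n_r), np.ones(n_r)])

res = milp(
    c=np.concatenate([np.zeros(n_r), -scores]),  # maximize score of active reactions
    constraints=cons,
    integrality=np.concatenate([np.zeros(n_r), np.ones(n_r)]),
    bounds=Bounds(lb, ub),
)
active = res.x[n_r:] > 0.5
print(active)  # flux is routed through the high-score branch (r1, r2, r4 active)
```

The optimizer routes all flux through the positively scored branch A→B and silences the negatively scored branch A→C, mirroring how iMAT-style objectives trade off inclusion of high-score reactions against exclusion of low-score ones under flux feasibility.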
Similar to sFBA and other FBA-based methodologies, these methods infer networks using single samples. We used CORNETO to develop a multi-sample context-specific metabolic network reconstruction method from omics data, using an objective function similar to iMAT. In iMAT, reaction scores from transcriptomics data guide optimal subnetwork selection by balancing inclusion of positive-score reactions against exclusion of negative-score ones. A threshold classifies enzyme-coding genes as highly or lowly expressed.

We applied it to a yeast gene deletion microarray dataset43. To ensure sufficient signal, we initially excluded gene knockouts not encoding enzymes. We focused on knockouts causing significant expression changes in other enzyme-encoding genes (Supplementary Information). We identified 12 enzyme-encoding gene knockouts suggesting potential metabolic rewiring (Supplementary Fig. 4).

To determine the optimal λ for multi-sample iMAT, we performed fivefold cross-validation27. In each fold, 20% of reaction scores were held out for validation and 80% were used for inference. We repeated this for different λ values to estimate generalization errors and identify the best parameter. Based on the F1 score, the best performance is obtained with λ = 1.5, yielding a 6.62% average F1-score improvement (Fig. 4a) compared with the baseline (λ = 0, original single-condition iMAT). This translates to a 29.62% precision increase (Fig. 4b), at the cost of an 8.62% recall reduction (Fig. 4c). The network size decreases from ~2,000 different reactions to 860 distinct reactions on average (Fig. 4d), and total false positives decrease by 38.80% (Supplementary Fig. 5a).

Fig. 4: Evaluation and comparison of single- and multi-sample metabolic context-specific network inference.

a, The average percentage improvement in F1 score over iMAT (single sample) for different λ values in a fivefold cross-validation setting. The vertical dashed line is the optimal λ selected using the F1 score.
Error bands indicate ±1σ (standard deviation) over ten independent runs with different random seeds. b, The improvement in precision relative to iMAT. c, The decrease in recall relative to iMAT. d, The average number of different reactions selected by each method. e, The distribution across metabolic pathways of reactions common to all samples, as selected by single-sample iMAT (λ = 0) and CORNETO’s multi-sample iMAT (λ = 1.5). f, A heatmap of the difference in proportion of selected reactions per pathway between multi-sample iMAT and single-sample iMAT (red, more reactions selected by multi-sample iMAT; blue, more reactions selected by single-sample iMAT).

We compared the reactions selected using the original iMAT and CORNETO’s multi-sample method (λ = 1.5). The multi-sample approach reduced differences across samples by selecting more unique reactions in fatty acid degradation and fewer in glycerolipid metabolism (Fig. 4e). These pathways contain a high proportion of reactions lacking gene annotations, resulting in network regions that are unconstrained by transcriptomic data. This allows greater flexibility in the selection of alternative flux routes during optimization, as these reactions can be freely included or excluded without affecting the fit to gene-expression-derived scores.

CORNETO’s multi-sample iMAT method includes more fatty acid degradation reactions under specific knockouts (of the PFK2, ACO1, PLC1, IPK1, RNR4 and ARG82 genes) (Fig. 4f). A closer examination of the input data reveals that, for these knockouts, the corresponding enzyme-coding genes were not differentially regulated (score of 0), whereas in other samples they predominantly were (positive scores; Supplementary Fig. 5b). This observation highlights that the multi-sample approach leverages all available data when deciding which reactions to include on a per-sample basis.
Here, the method recognizes that flux can be rerouted through the fatty acid degradation pathway without compromising the fitting error for these samples and, given the uncertain activity of these reactions, it favours their inclusion to predict a flux distribution similar to that in the other samples. Our analysis demonstrates that the multi-sample method substantially enhances precision by reducing false positives in context-specific metabolic models.

Multi-patient signalling networks identify a higher proportion of interactions driven by shared deregulated kinases

Using CORNETO, we reimplemented and extended CARNIVAL28 to handle multi-sample settings. CARNIVAL is a method that infers signalling networks by analysing transcription factor (TF) activity, estimated from transcriptomics data and known TF targets44, and reconstructs networks from known inputs (for example, drug targets or receptors) to TFs. This links input perturbations to TFs via the PKN, accounting for interaction directionality and sign (activation or inhibition), allowing the study of signal flow. If perturbations are unknown, all potential upstream vertices in the PKN with no input edges are selected as potential sources for signal propagation—an approach we refer to as inverse CARNIVAL.

We first assessed performance using synthetic ground-truth networks generated from transcriptomics data of drug-treated cell lines under different regularization settings, enforcing varying degrees of similarity in the inferred signalling networks. As expected, precision increases when the true networks exhibit greater similarity across samples (Supplementary Fig.
6).

We then applied CORNETO to infer patient-specific intracellular networks from transcriptomic data from patients with lung adenocarcinoma profiled by the Clinical Proteomic Tumor Analysis Consortium (CPTAC)45,46, leveraging phosphoproteomics data for validation (Supplementary Information). For validation, we defined shared deregulated kinases as those with absolute activity scores in the top quartile for at least 50% of the patients. We then compared the performance of the single-sample and multi-sample approaches by assessing both the total number of network edges (the union across patients) and the intersecting edges shared among patient networks (Fig. 5b). For λ > 0.5, the intersection of edges across patients increased while the overall number of selected edges remained comparable, suggesting that multi-sample regularization enhances the consistency of edge detection without substantially altering network complexity.

Fig. 5: Evaluation and comparison of the single- and multi-sample intracellular signalling network inference on CPTAC.

a, An overview of the approach. Using transcriptomics from patients, we estimated TF activities with Decoupler, which were used as input for single-sample and multi-sample CARNIVAL. In both cases, we assume that the inputs (receptors) are unknown, and they are automatically selected during optimization (inverse CARNIVAL). For validation, we used phosphoproteomics data for the same patients, from which kinase activities were estimated. b, Results for the single- and multi-sample CARNIVAL, showing the average number of selected edges from the PKN for all patients (top left), the same for the vertices (top right), the intersection of edges across patients (bottom left) and the mean proportion of correctly inferred interactions involving deregulated kinases, used as a validation set. Error bands indicate ±1σ over ten independent runs (with different random seeds) at each value of λ. c, Interactions that appear in the intersection of the inferred networks for the optimal λ = 0.8.
To account for variability, the multi-sample inference was repeated 30 times with different seeds, and we kept the interactions of the intersection if they appeared in at least 50% of the alternative solutions. The colours correspond to the average TF activity across patients.

The intersection of edges across patient networks was larger in the multi-sample setting, and a greater proportion of these edges were associated with the shared deregulated kinases; in particular, a higher proportion of edges within this intersection originated from the kinases used as the validation set. This finding suggests an improved precision of the multi-sample approach in capturing biologically relevant kinase interactions, as these kinases were identified independently from the data used for inferring the networks. The intersection of patient-specific networks reveals shared signalling mechanisms, with key deregulated kinases such as GSK3B, AKT1 and SRC consistently influencing TF activity across patients (Fig. 5c).

Discussion

This Article presents CORNETO, a unified framework for network inference enabling multi-sample learning. By reformulating methods as joint optimization problems using mixed-integer optimization and network flows, CORNETO captures shared and context-specific features across samples. Its versatility makes CORNETO applicable to PPI, metabolic and signalling networks. We demonstrated its effectiveness across simulations and studies using real omics data.

To leverage multi-sample information, we introduced a structured sparsity-inducing penalty controlled by a λ parameter. This approach assumes that samples share similar network structures. For example, when studying multiple perturbations within the same organism or related subjects under the same perturbation (for example, different cell lines responding to the same treatment), network structures probably have shared edges.
Larger λ values indirectly promote increased similarity, reducing variance and simplifying network structure. We demonstrated this approach’s effectiveness across different case studies, showing that the implemented methods infer smaller, more consistent networks.

Using CORNETO, we implemented and extended multiple network inference methods based on mixed-integer optimization and network flows. Implemented methods included FBA, sFBA and iMAT (metabolism), CARNIVAL (intracellular signalling) and various Steiner trees and PCSTs for PPI networks. We illustrate CORNETO’s versatility through multiple applications involving method extension or adaptation. Specifically, we applied CORNETO to (1) extend FBA to multi-sample scenarios; (2) integrate multi-sample omics data with FBA to derive flux-consistent, context-specific metabolic networks; and (3) jointly infer intracellular signalling networks from multi-sample transcriptomics data. In addition, we showcase CORNETO’s broader applicability by adapting Steiner tree-based approaches for multi-sample protein interaction and kinase–substrate networks, and by optimizing biologically informed neural network architectures trained on single-cell transcriptomics, detailed in the Supplementary Information (Table 1).

A limitation of CORNETO is its scalability and computational complexity, because many methods involve non-deterministic polynomial-time hard (NP-hard) combinatorial problems. These demands grow exponentially as sample sizes or PKNs increase. Heuristic approaches, such as the Steiner-tree implementation in NetworkX or Fast PCST47, provide faster solutions but generally target undirected networks, lack support for multiple samples or extra constraints, and are suboptimal. Alternatively, heuristics could be used to find high-quality feasible solutions to initialize CORNETO’s solvers for faster convergence.
Exploring strategies such as the Alternating Direction Method of Multipliers (ADMM) or relaxed formulations48 could also enable efficient parallelization and distributed computing. As CORNETO is solver-agnostic (it decouples the problem formulation from the optimization), it can in principle be adapted to leverage new optimization tools and techniques as they emerge.

Moreover, while CORNETO currently provides a flow-conservation-based framework for steady-state scenarios, extending it to dynamic or equilibrium settings is a promising future direction. Such extensions often involve time discretization49. Introducing dynamics brings challenges (increased complexity, uncertainty quantification, parameter fitting and sensitivity analysis) that have been addressed in the computational modelling literature50,51. Bridging steady-state and dynamic models remains an area for future research.

Another promising direction is extending methods to multi-layer network inference, integrating various PKNs and biological assumptions into a unified joint optimization approach. CORNETO’s ability to import metabolic networks supported by COBRA (COnstraint-Based Reconstruction and Analysis Toolbox) makes it an ideal bridge for integrating metabolism with other cellular processes, developing approaches that incorporate diverse prior knowledge and data within a unified modelling framework. This could lead to models capturing interactions between biological processes, such as combining FBA with gene regulation52,53,54, further integrating FBA with signalling and metabolic networks, extending beyond current approaches to causal signal–metabolic network integration55 or enhancing models of cell–cell communication in spatial contexts56.
These capabilities could aid in the development of digital twins57 or virtual cell models58,59 by leveraging different types of prior knowledge.

Beyond network inference, CORNETO’s ability to flexibly model diverse biological mechanisms and incorporate additional constraints could make it well suited to inform experimental design. By simulating in silico perturbations in the network, or measuring variability under different constraints, the framework may help to prioritize interventions that most effectively constrain the solution space.

Another framework extension involves using CORNETO to derive biologically inspired artificial neural network architectures. We demonstrated that this is possible, showing how CORNETO can automatically extract context-specific networks from a PKN and convert them into a biologically inspired machine learning model, improving upon ref. 60. This approach could be extended to support constructing more advanced architectures, such as LEMBAS (Large-scale knowledge-EMBedded Artificial Signaling-networks) for intracellular signalling61 or metabolism62, in an automated and optimal fashion. CORNETO provides an automated way to generate context-specific networks that account for multiple samples, reducing the need for manual curation. Future work could explore integrating more complex architectures to enhance biological interpretability and predictive power. General frameworks such as Networks Commons63 that allow swapping and combining network-inference algorithms, and linking them to diverse knowledge and data sources, will catalyse such developments.

In summary, CORNETO offers a unified framework for network inference, tailored for multi-sample analyses. By reformulating network inference problems as joint optimization tasks, CORNETO reduces variability across samples, increases precision, decreases the overall size of the inferred networks and improves the detection of shared biological interactions.
Its flexibility in integrating diverse PKNs makes it applicable across a wide range of biological contexts. The Python package makes these tools accessible, encouraging broader use and collaboration within the research community, paving the way for future innovations in network inference.

Methods

Mathematical background

Mixed-integer optimization

Mixed-integer programming is a type of constrained optimization problem that involves decision variables that can be both integer and continuous. A mixed-integer programming problem can be defined as follows:$$\begin{array}{ll}\mathop{\min }\limits_{{\bf{x}}\in {{\mathbb{R}}}^{n},\,{\bf{y}}\in {{\mathbb{Z}}}^{m}}&f({\bf{x}},{\bf{y}})\\ \,\text{subject to}\,&{g}_{i}({\bf{x}},{\bf{y}})\le 0,\quad i=1,2,\ldots ,k,\\ &{h}_{j}({\bf{x}},{\bf{y}})=0,\quad j=1,2,\ldots ,l.\end{array}$$where \(f:{{\mathbb{R}}}^{n}\times {{\mathbb{Z}}}^{m}\to {\mathbb{R}}\) is the objective function, which involves both real and integer decision variables. The goal is to minimize (or maximize) this function, subject to inequality constraints gi(x, y) ≤ 0, i = 1, 2, …, k, where \({g}_{i}:{{\mathbb{R}}}^{n}\times {{\mathbb{Z}}}^{m}\to {\mathbb{R}}\), and equality constraints hj(x, y) = 0, j = 1, 2, …, l, where \({h}_{j}:{{\mathbb{R}}}^{n}\times {{\mathbb{Z}}}^{m}\to {\mathbb{R}}\). If the functions f, gi and hj are linear, the problem is a mixed-integer linear programming (MILP) problem. If all variables are continuous, the problem is a linear programming (LP) problem. Problems with integer variables are harder to solve as they are combinatorial in nature. For these classes of problems, there exist many mathematical solvers that provide exact solutions with optimality guarantees, such as CPLEX, Gurobi, CBC, HiGHS and SCIP.
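To make the definition concrete, the toy sketch below solves a tiny made-up mixed-integer problem by brute force, discretizing the continuous variable on a grid; the problem, objective and bounds are invented for illustration, and in practice a branch-and-bound solver such as those listed above would be used instead of enumeration.

```python
from itertools import product

# Toy mixed-integer problem (illustrative only):
#   minimize 2*x + 3*y1 + y2
#   subject to x + y1 + y2 >= 2, 0 <= x <= 1, y1, y2 in {0, 1}.
def solve_toy_milp(steps=100):
    best = None
    for y1, y2 in product((0, 1), repeat=2):        # enumerate integer variables
        for i in range(steps + 1):                  # discretize continuous x
            x = i / steps
            if x + y1 + y2 >= 2:                    # inequality constraint g(x, y) <= 0
                obj = 2 * x + 3 * y1 + y2
                if best is None or obj < best[0]:
                    best = (obj, x, y1, y2)
    return best

obj, x, y1, y2 = solve_toy_milp()
print(obj, x, y1, y2)  # 3.0 1.0 0 1: set y2 = 1 and push x to its upper bound
```

Even on this tiny instance, the search is combinatorial in the binary variables, which is why dedicated MILP solvers rely on branch-and-bound with LP relaxations rather than enumeration.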
We exploit the MILP representability of many network inference problems, allowing us to use these advanced solvers to find optimal solutions accurately.

Network flows

Network flows23,24 are a well-studied and general class of problems that can be used to solve a wide variety of scientific and engineering problems. They model the distribution of quantities across a network, subject to constraints such as flow conservation and capacity constraints. Given their versatility, many types of problems can be reduced to or formulated using network flows64,65,66, for which efficient optimization methods have been developed. We build upon this idea to unify diverse network inference problems by reducing them to network flow-based problems. By leveraging this framework, we can extend existing methods to account for joint inference in multi-sample scenarios.

We first introduce the basic network flow problem on a directed graph \({\mathcal{G}}=({\mathcal{V}},{\mathcal{E}})\), where \({\mathcal{V}}\) is the set of vertices and \({\mathcal{E}}\) is the set of directed edges. Let \({\bf{x}}\in {{\mathbb{R}}}^{| {\mathcal{E}}| }\) be the vector of flows on the edges, \({\bf{b}}\in {{\mathbb{R}}}^{| {\mathcal{V}}| }\) be the vector of net flows at the vertices and \({{\bf{x}}}_{max}\in {{\mathbb{R}}}^{| {\mathcal{E}}| }\) be the vector of upper bounds (capacities) for the flows on the edges.
Let \(A\in {{\mathbb{R}}}^{| {\mathcal{V}}| \times | {\mathcal{E}}| }\) be the vertex–edge incidence matrix of the graph \({\mathcal{G}}\) defined as$${A}_{ij}=\left\{\begin{array}{ll}-1\quad &\,\text{if}\,\,{v}_{i}={\mathrm{tail}}({e}_{j}),\\ 1\quad &\,\text{if}\,\,{v}_{i}={\mathrm{head}}({e}_{j}),\\ 0\quad &\,\text{otherwise}\,.\end{array}\right.$$(1)where head and tail are functions that, given edge \(e=({v}_{t},{v}_{h})\in {\mathcal{E}}\), return the head (vh) and the tail (vt) of e.The network flow problem aims to find the vector of flows x that satisfies the flow capacity constraints and flow conservation constraints. This can be expressed as a linear system$$A{\bf{x}}={\bf{b}},\quad 0\le {\bf{x}}\le {{\bf{x}}}_{{\mathrm{max}}},$$(2)where the vector b is defined as$${{\bf{b}}}_{i}=\left\{\begin{array}{ll}-B\quad &\,\text{if}\,\,{v}_{i}=s,\\ B\quad &\,\text{if}\,\,{v}_{i}=t,\\ 0\quad &\,\text{otherwise},\end{array}\right.$$where s is the source vertex, t is the sink vertex and B is the total flow from the source to the sink. The vector b represents the net flow entering or leaving each vertex, and it is non-zero only for source and sink vertices, with sources having negative values (net outflow) and sinks having positive values (net inflow).In typical network flow problems, such as the minimum-cost flow problem, we are interested in finding an optimal point of this set Ω with respect to a linear objective function$${{\bf{x}}}^{* }=\arg \mathop{\min }\limits_{{\bf{x}}\in \Omega }f({\bf{x}}),$$where f is frequently defined as f(x) = cTx, with c representing the cost per unit flow on each edge, and Ω is the feasible set of valid flows in the network, defined as$$\varOmega =\{{\bf{x}}\in {{\mathbb{R}}}^{| {\mathcal{E}}| }| A{\bf{x}}={\bf{b}},\,0\le {\bf{x}}\le {{\bf{x}}}_{{\mathrm{max}}}\}.$$As the flows are continuous values and the objective function and constraints are linear, the problem is an LP problem. 
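A minimal Python sketch (illustrative only, not CORNETO's API) of the incidence matrix in equation (1) and the conservation constraint Ax = b, for a unit flow routed s → u → t on a three-vertex graph:

```python
# Build the vertex-edge incidence matrix A of a small directed graph and
# check flow conservation A @ x == b for a unit flow along s -> u -> t.
vertices = ["s", "u", "t"]
edges = [("s", "u"), ("u", "t"), ("s", "t")]

def incidence_matrix(vertices, edges):
    # A[i][j] = -1 if v_i is the tail of e_j, +1 if it is the head, 0 otherwise
    A = [[0] * len(edges) for _ in vertices]
    for j, (tail, head) in enumerate(edges):
        A[vertices.index(tail)][j] = -1
        A[vertices.index(head)][j] = 1
    return A

A = incidence_matrix(vertices, edges)
x = [1, 1, 0]   # one unit of flow on s->u and u->t, none on the direct edge
b = [-1, 0, 1]  # net outflow at the source s, net inflow at the sink t
Ax = [sum(A[i][j] * x[j] for j in range(len(edges))) for i in range(len(vertices))]
print(Ax == b)  # True: the flow is conserved at every vertex
```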
LP problems are efficiently solved using algorithms such as the Simplex and Interior Point methods67, leveraging duality theory to provide bounds on the optimal value. Many mathematical solvers incorporate these algorithms to provide accurate solutions even for large-scale problems. The canonical form of the network flow problem as an LP can be expressed as the minimization of a linear function subject to (s.t.) linear constraints$$\begin{array}{ll}\,\text{min.}\,&{{\bf{c}}}^{T}{\bf{x}},\\ \,\text{s.t.}\,&A{\bf{x}}={\bf{b}},\\ &0\le {\bf{x}}\le {{\bf{x}}}_{{\mathrm{max}}}.\end{array}$$More complex network problems can be modelled by extending this simple network flow model. In particular, network inference problems, in which the goal is not only to find the values of certain parameters but also to determine the structure of the network, can be modelled by incorporating binary indicator variables to decide whether a certain edge is included in the solution. For example, let \({\bf{y}}\in {\{0,1\}}^{| {\mathcal{E}}| }\) be a binary vector where yj = 1 if edge ej is included in the network and yj = 0 otherwise. This extension transforms the problem into a MILP, which can be formulated as follows:$$\begin{array}{ll}\;\qquad{\text{min.}}&{{\bf{c}}}^{T}{\bf{x}}+{{\bf{d}}}^{T}{\bf{y}},\\ \,{\text{subject to}}\,&A{\bf{x}}={\bf{b}},\\ &0\le {\bf{x}}\le {{\bf{x}}}_{\rm{max}}\odot {\bf{y}},\\ &{\bf{y}}\in {\{0,1\}}^{| {\mathcal{E}}| },\end{array}$$where d represents the fixed costs associated with including each edge, and ⊙ denotes the Hadamard (element-wise) product. The constraint 0 ≤ x ≤ xmax ⊙ y ensures that flow can only occur on edges that are included in the network. The optimization problem represented by this MILP formulation is harder to solve than the standard LP problem owing to the presence of binary decision variables.
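The effect of the coupling constraint 0 ≤ x ≤ xmax ⊙ y can be checked directly. The snippet below is a hedged sketch with made-up flow values: it only verifies feasibility of candidate (x, y) pairs, which is the role this constraint plays inside the MILP.

```python
# Check the indicator coupling 0 <= x_j <= xmax_j * y_j per edge:
# a deselected edge (y_j = 0) is forced to carry zero flow.
def coupling_ok(x, y, x_max):
    return all(0 <= xj <= xmj * yj for xj, yj, xmj in zip(x, y, x_max))

x_max = [10, 10, 10]
print(coupling_ok([2, 0, 5], [1, 0, 1], x_max))  # True: flow only on selected edges
print(coupling_ok([2, 1, 5], [1, 0, 1], x_max))  # False: edge 2 carries flow but y = 0
```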
However, it is more flexible and allows us to model a wide variety of problems that involve searching for the right combinatorial structures, such as paths, trees, acyclic graphs or other types of structures in an exact way. Moreover, MILP solvers are able to provide solutions along with certificates of optimality, ensuring that the solutions meet the user-defined criteria.

Model

In this section, we present the core model behind CORNETO. Our framework redefines network inference by transforming it into a network flow-based problem, leveraging the mixed-integer optimization framework commonly used in such problems. This unified approach enables joint inference by utilizing multiple samples simultaneously, allowing the model to improve network inference by borrowing information across all samples. In addition, our framework offers several key advantages: (1) all methods are implemented consistently with a common vocabulary and API; (2) we identify and reuse common components across different methods, streamlining development; and (3) our framework provides exact algorithm implementations, allowing solvers to find optimal or suboptimal solutions with certificates of optimality, giving users control over the solution quality.

Hypergraphs

To support a wide variety of methods and PKNs, CORNETO operates on directed hypergraphs. A directed hypergraph \({\mathcal{H}}\) is defined as \({\mathcal{H}}=({\mathcal{V}},{\mathcal{E}})\), where \({\mathcal{V}}\) is a set of vertices, and \({\mathcal{E}}\) is a set of directed hyperedges.
Each hyperedge \({e}_{i}\in {\mathcal{E}}\) is an ordered pair (Ti, Hi), with \({T}_{i}\subseteq {\mathcal{V}}\) as the tail and \({H}_{i}\subseteq {\mathcal{V}}\) as the head of the hyperedge. To accurately model diverse network interactions, particularly those requiring signed coefficients such as stoichiometry in metabolic networks, the vertex–edge incidence matrix for hypergraphs is defined on the basis of these signed coefficients. Let \({c}_{ij}\in {\mathbb{R}}\) be a coefficient that quantifies the specific, signed involvement of vertex vi in hyperedge ej. The sign of cij indicates the nature of the participation of vi in ej (for example, as a reactant or product, source or target of influence), while its magnitude represents the extent of this participation. The vertex–edge incidence matrix \({\bf{A}}\in {{\mathbb{R}}}^{| {\mathcal{V}}| \times | {\mathcal{E}}| }\) is defined as$${{\bf{A}}}_{ij}=\left\{\begin{array}{ll}{c}_{ij}\quad &\,\text{if vertex}\,\,{v}_{i}\,\,\text{is in hyperedge}\,\,{e}_{j},\\ 0\quad &\,\text{otherwise}\,.\end{array}\right.$$(3)For a hyperedge ej = (Tj, Hj), the sets Tj (tail) and Hj (head) define the structural components of the hyperedge. When considering a ‘forward’ flow or process through ej (that is, when the corresponding flow variable xj > 0), vertices vi ∈ Tj typically correspond to inputs or reactants, and their coefficients cij would often be negative (cij < 0), whereas vertices vi ∈ Hj typically correspond to outputs or products, with positive coefficients (cij > 0). For example, for a simple directed-graph edge ej = (vt, vh) representing flow from vt to vh, the typical coefficients would be \({c}_{{v}_{t},\,j}=-1\) and \({c}_{{v}_{h},\,j}=1\).
All other \({c}_{{v}_{k},\,j}=0\) for this edge. For a metabolic reaction ej such as M1 + 2M2 → M3, where Tj = {M1, M2} and Hj = {M3}, the coefficients would be \({c}_{{M}_{1},\,j}=-1\), \({c}_{{M}_{2},\,j}=-2\) and \({c}_{{M}_{3},\,j}=1\). This definition allows the flow-conservation constraint Ax = 0 (where x is the vector of flows through hyperedges) to correctly represent complex balances, such as stoichiometric steady states in metabolic networks. A negative flow xj for a hyperedge ej would then signify that the process occurs in the reverse direction relative to the signs defined by the cij coefficients, as is commonly done in FBA. In this framework, a column of A (representing one hyperedge) can contain multiple non-zero entries, with varying signs and magnitudes. This directly reflects hyperedges with multiple vertices in their tail and head sets, enabling the representation of complex relationships such as biochemical reactions (for example, A + B ⇌ C + D) as well as other multi-component processes. This differs from standard graphs, where each column in A (representing an edge) typically has exactly one −1 and one +1 entry (or is all zeros).
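The stoichiometric example above can be sketched directly; the snippet is illustrative (not CORNETO code) and adds exchange hyperedges with empty tails or heads so that a non-trivial steady-state flux with Ax = 0 exists.

```python
# Incidence (stoichiometric) matrix for the reaction M1 + 2 M2 -> M3, plus
# exchange hyperedges that import substrates and export the product.
metabolites = ["M1", "M2", "M3"]
reactions = {
    "R":      {"M1": -1, "M2": -2, "M3": 1},  # M1 + 2 M2 -> M3
    "in_M1":  {"M1": 1},                      # ({}, {M1}): injects M1
    "in_M2":  {"M2": 1},                      # ({}, {M2}): injects M2
    "out_M3": {"M3": -1},                     # ({M3}, {}): extracts M3
}
A = [[reactions[r].get(m, 0) for r in reactions] for m in metabolites]
x = [1, 1, 2, 1]  # unit flux through R consumes 1 M1 and 2 M2, produces 1 M3
Ax = [sum(row[j] * x[j] for j in range(len(x))) for row in A]
print(Ax)  # [0, 0, 0]: every metabolite is balanced at steady state
```

Note how the column for reaction R has three non-zero entries of mixed sign and magnitude, which a standard graph incidence matrix cannot express.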
In this way, the incidence matrix A plays a role directly analogous to the stoichiometric matrix in FBA when applied to metabolic networks. Because a standard directed graph can be viewed as a particular instance of a directed hypergraph, in the following we will use \({\mathcal{H}}\) to denote hypergraphs (including graphs as a special case) and \({\mathcal{G}}\) to denote only graphs, when specific methods are not designed to operate on hypergraphs.

Data mapping function

Given a dataset \({\bf{D}}\in {{\mathbb{R}}}^{m\times n}\) (m features and n samples, such as genes and cells), and a PKN \({\mathcal{H}}=({\mathcal{V}},{\mathcal{E}})\), the function \(\phi :({\bf{D}},{\mathcal{H}})\to {{\mathcal{H}}}^{{\prime} }\) is defined as$$\phi :({\bf{D}},{\mathcal{H}})\to {{\mathcal{H}}}^{{\prime} }=({{\mathcal{V}}}^{{\prime} },{{\mathcal{E}}}^{{\prime} }),$$(4)where \({{\mathcal{V}}}^{{\prime} }=\{{v}^{{\prime} }\,| \,{v}^{{\prime} }=(v,{{\bf{d}}}_{v})\}\) is the new set of vertices. Each vertex \({v}^{{\prime} }=(v,{{\bf{d}}}_{v})\) is associated with a vector \({{\bf{d}}}_{v}\in {{\mathbb{R}}}^{n}\), which contains the computed data for that vertex across all samples in the dataset. Specifically,$${{\bf{d}}}_{v}=\left[{d}_{v}^{(1)},{d}_{v}^{(2)},\ldots ,{d}_{v}^{(n)}\right],$$where \({d}_{v}^{(i)}\in {\mathbb{R}}\) is the computed feature value for vertex v in the ith sample. Similarly, \({{\mathcal{E}}}^{{\prime} }=\{{e}^{{\prime} }\,| \,{e}^{{\prime} }=(e,{{\bf{d}}}_{e})\}\) is the new set of edges. Each edge \({e}^{{\prime} }=(e,{{\bf{d}}}_{e})\) is associated with a vector \({{\bf{d}}}_{e}\in {{\mathbb{R}}}^{n}\), which contains the computed values related to the interaction between the two vertices connected by edge e across all samples.
Specifically,$${{\bf{d}}}_{e}=\left[{d}_{e}^{(1)},{d}_{e}^{(2)},\ldots ,{d}_{e}^{(n)}\right],$$where \({d}_{e}^{(i)}\in {\mathbb{R}}\) represents the computed value associated with the edge e for the ith sample. The data mapping function generalizes the process of transforming omics data into network features by applying specific rules or transformations that align with a PKN. For example, in metabolic networks, gene–protein reaction rules are used to calculate reaction values (edges) based on the expression levels of specific genes from the dataset. Other operations might involve simpler transformations, such as discretization of values, where, for example, gene expression or protein levels are converted into binary states to represent activation or inhibition.

Graph transformation function

This function takes an annotated graph \({{\mathcal{H}}}^{{\prime} }\) as input and returns a new graph by modifying its structure, such as by adding or removing vertices or edges. Given an annotated graph \({{\mathcal{H}}}^{{\prime} }=({{\mathcal{V}}}^{{\prime} },{{\mathcal{E}}}^{{\prime} })\), we define the graph transformation function \(\psi :{{\mathcal{H}}}^{{\prime} }\to {{\mathcal{H}}}^{{\prime\prime} }\) as$$\psi ({{\mathcal{H}}}^{{\prime} })={{\mathcal{H}}}^{{\prime\prime} }=\left({{\mathcal{V}}}^{{\prime\prime} }=({{\mathcal{V}}}^{{\prime} }\setminus {{\mathcal{V}}}_{r})\cup {{\mathcal{V}}}_{a},{{\mathcal{E}}}^{{\prime\prime} }=({{\mathcal{E}}}^{{\prime} }\setminus {{\mathcal{E}}}_{r})\cup {{\mathcal{E}}}_{a}\right).$$The function ψ modifies \({{\mathcal{H}}}^{{\prime} }\) by removing vertices in the set \({{\mathcal{V}}}_{r}\) and edges in \({{\mathcal{E}}}_{r}\), and by adding vertices from \({{\mathcal{V}}}_{a}\) and edges from \({{\mathcal{E}}}_{a}\), resulting in the new vertex set \({{\mathcal{V}}}^{{\prime\prime} }\) and edge set \({{\mathcal{E}}}^{{\prime\prime} }\).
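The two functions above can be sketched together in a few lines. All names, genes and values below are hypothetical, and the transformation chosen (pruning vertices without data together with their incident edges) is just one illustrative example of a ψ, not CORNETO's actual implementation.

```python
# phi: annotate PKN vertices with per-sample measurements.
def phi(D, vertices):
    # D: {feature: [value per sample]} -> annotated vertex set {v: d_v or None}
    return {v: D.get(v) for v in vertices}

# psi: remove vertices without data (V_r) and their incident edges (E_r).
def psi(annotated, edges):
    removed = {v for v, d in annotated.items() if d is None}
    kept_vertices = {v: d for v, d in annotated.items() if d is not None}
    kept_edges = [(u, v) for u, v in edges if u not in removed and v not in removed]
    return kept_vertices, kept_edges

V = ["EGFR", "AKT1", "TP53"]
E = [("EGFR", "AKT1"), ("AKT1", "TP53")]
D = {"EGFR": [2.1, 0.3], "TP53": [-1.5, -0.9]}  # two samples; AKT1 unmeasured
V2, E2 = psi(phi(D, V), E)
print(sorted(V2), E2)  # ['EGFR', 'TP53'] []: AKT1 and both of its edges removed
```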
These transformations are used to adapt the structure of a given graph to be compatible with network flow problems and to implement optimization steps that reduce the complexity of the methods. Each method in CORNETO provides a specific implementation of ψ by specifying how vertices and edges are added or removed to align with its reformulation based on network flows.

Network flows on bidirectional hypergraphs

We extend the basic network flow problem (equation (2)) to support the modelling of network inference methods on hypergraphs. This extension allows us to incorporate complex biological prior knowledge that is not compatible with simple graph representations68. Instead of using special types of vertices s and t to maintain flow balance, as in the traditional source–sink model, we extend the concept by adding hyperedges with no tail to inject flow and no head to extract flow. Specifically, to inject a certain amount of flow into a vertex v, it suffices to add a hyperedge \({e}_{{\mathrm{source}}}=({{\emptyset}},\{v\})\), which has an empty tail and the vertex v in its head, with the flow value set to the desired amount (for example, 1 unit). Similarly, to extract flow from a vertex v, a hyperedge \({e}_{{\mathrm{sink}}}=(\{v\},{{\emptyset}})\) can be added to the graph, which has the vertex v in its tail and an empty head. Multiple edges to inject and remove flow can be added to the network. The feasible set of flows is defined by the constraints on flow conservation and capacity limits, given by$$\varOmega =\{{\bf{x}}\in {{\mathbb{R}}}^{| {\mathcal{E}}| }| {\bf{A}}{\bf{x}}=0,\,{{\bf{x}}}_{{\mathrm{min}}}\le {\bf{x}}\le {{\bf{x}}}_{{\mathrm{max}}}\},$$(5)where xmin represents the lower bounds on the flow values, and xmax represents the upper bounds. If xmin contains negative values, it indicates the possibility of negative flows, meaning that the flow can traverse the edge in the opposite direction.
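The source and sink hyperedges just described can be made concrete with a small sketch: we add columns for ({}, {s}) and ({t}, {}) to the incidence matrix, fix one unit of injected and extracted flow, and pick the cheapest feasible flow by brute-force enumeration (a stand-in for the LP/MILP solver; graph and costs are invented for illustration).

```python
from itertools import product

vertices = ["s", "a", "b", "t"]
#         s->a       a->t       s->b       b->t       ({},{s})     ({t},{})
edges = [("s", "a"), ("a", "t"), ("s", "b"), ("b", "t"), (None, "s"), ("t", None)]
cost  = [1, 1, 3, 4, 0, 0]

# Incidence matrix; source/sink hyperedges have a single non-zero entry.
A = [[0] * len(edges) for _ in vertices]
for j, (tail, head) in enumerate(edges):
    if tail is not None:
        A[vertices.index(tail)][j] = -1
    if head is not None:
        A[vertices.index(head)][j] = 1

best = None
for flows in product((0, 1), repeat=4):      # binary flows on the 4 graph edges
    x = list(flows) + [1, 1]                 # unit flow injected at s, extracted at t
    if all(sum(A[i][j] * x[j] for j in range(6)) == 0 for i in range(4)):
        c = sum(ci * xi for ci, xi in zip(cost, x))
        if best is None or c < best[0]:
            best = (c, x)

print(best)  # (2, [1, 1, 0, 0, 1, 1]): the cheaper route s -> a -> t
```

Because the injection and extraction are encoded as hyperedge columns, the balance constraint is the homogeneous system Ax = 0 at every vertex, with no separate b vector.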
This modification allows bidirectional flows on hyperedges, controlled by the specified upper and lower bounds. In this formulation, the vector b (which represents external flow demands) is implicitly incorporated into the structure of the hyperedges. By embedding flow injection and extraction at a vertex v through the special hyperedges esource and esink, the external demands b are integrated into the incidence matrix A of the extended hypergraph. This transformation allows the flow problem to be represented as a homogeneous system Ax = 0, where the flow balance constraints are maintained naturally through the hypergraph structure. The reformulation of methods in CORNETO’s framework involves the use of graph transformations to manipulate the structure of the graph and to convert the original problem into a network flow-based problem. As a simple example, the shortest path problem from vertex s to vertex t can be reduced to a min-flow problem69 by adding an edge to inject flow through s and extracting the same amount of flow through t, and then minimizing the sum of the cost of each edge multiplied by its flow (Supplementary Fig. 2).

Multi-flows on bidirectional hypergraphs

A key feature of CORNETO is its ability to model problems involving multiple samples. This capability requires not only learning individual networks for each sample but also linking flows across these networks to support joint inference. Building upon the previously defined concepts for single network flows, we now extend the notion to multi-flows on bidirectional hypergraphs. This approach allows us to handle multiple types of flow simultaneously, modelling different samples within the same framework. Similar to multi-commodity network flows70, which involve multiple resources (commodities) moving through a network simultaneously with shared constraints, we extend the formulation to support multiple simultaneous flows.
In the context of the framework, each sample or condition is associated with a flow, and each flow can be subjected to different constraints.Let n be the number of samples (commodities). The flow vector x previously defined is now extended to a flow matrix \({\bf{X}}\in {{\mathbb{R}}}^{| {\mathcal{E}}| \times n}\), where xij represents the flow of sample j along hyperedge ei. Similarly, we define \({{\bf{X}}}_{{\mathrm{min}}},{{\bf{X}}}_{{\mathrm{max}}}\in {{\mathbb{R}}}^{| {\mathcal{E}}| \times n}\) to be the lower and upper bounds for the flow through the edges, for each sample.Given X, Xmin and Xmax, the multi-flow problem on bidirected hypergraphs is given by$$\begin{array}{rcl}&&{\bf{A}}{\bf{X}}=0\\ &&{{\bf{X}}}_{{\mathrm{min}}}\le {\bf{X}}\le {{\bf{X}}}_{{\mathrm{max}}}.\\ \end{array}$$This formulation, however, treats each sample as an independent network flow problem. We will use this as a building block to define other problems, starting from this multi-flow problem. To model interconnected flows, we can introduce additional transformations and constraints on X.As a simple example, in a multi-commodity problem, instead of having independent edge capacities per commodity, a single capacity constraint for each edge is imposed. The sum of absolute flows across all commodities cannot exceed these shared bounds. Given shared bounds xmin and xmax for all commodities, we transform the multi-flow problem into a multi-commodity flow problem by summing the total flows for each edge:$$\begin{array}{l}{\bf{A}}{\bf{X}}=0\\ {{\bf{x}}}_{{\mathrm{min}}}\le {\bf{| X| }}{{\bf{1}}}_{n}\le {{\bf{x}}}_{{\mathrm{max}}}.\\ \end{array}$$Here, the term ∣X∣ represents the absolute value of the flow matrix, ensuring that the sum of flows accounts for any potential negative values. The vector 1n is a column vector of n ones. 
The multiplication computes the sum of flows across all commodities. To handle the nonlinearity introduced by the absolute value ∣X∣, it is possible to linearize it by introducing auxiliary variables for each xij in X. This involves defining two non-negative variables, \({x}_{ij}^{+}\) and \({x}_{ij}^{-}\), such that \({x}_{ij}={x}_{ij}^{+}-{x}_{ij}^{-}\) and \(| {x}_{ij}| ={x}_{ij}^{+}+{x}_{ij}^{-}\). This transformation replaces the absolute value constraint with linear constraints, making the problem solvable using standard (mixed-integer) linear programming techniques.

Constrained optimization problems

In CORNETO, network inference methods are implemented as constrained optimization problems. An optimization problem is defined as a tuple:$${\mathcal{P}}=(\,f,\varOmega ),$$(6)where \(f:{\mathcal{X}}\to {\mathbb{R}}\) is the objective function to be minimized, which operates on the decision variable space \({\mathcal{X}}\), and Ω is the feasible set, defined by equalities and inequalities on the variables$$\varOmega =\left\{x\in {\mathcal{X}}\,\left\vert \,\begin{array}{l}{g}_{i}(x)\le 0,\quad i=1,\ldots ,k,\\ {h}_{j}(x)=0,\quad j=1,\ldots ,l\end{array}\right.\right\}.$$We also define a trivial objective function as$${f}_{0}:{\mathcal{X}}\to {\mathbb{R}},\quad {f}_{0}(x)=0\quad \forall x\in {\mathcal{X}}.$$(7)In this case, (f0, Ω) simply involves finding any feasible point in Ω. All methods implemented in the framework use linear functions for f, g and h, over continuous and binary variables (with the exception of the multi-sample metabolic flux adjustment method, where the objective function is quadratic instead of linear).

Problem composition

Problems can be composed to define and reuse subproblems, thereby creating variations of existing methods without rewriting common pieces of code. Let us consider two constrained optimization problems$${{\mathcal{P}}}_{1}\,=\,\left(\,{f}_{1},\,{\varOmega }_{1}\right),\qquad {{\mathcal{P}}}_{2}\,=\,\left(\,{f}_{2},\,{\varOmega }_{2}\right),$$where f1 and f2 are
objective functions to be minimized, and Ω1 and Ω2 are their respective feasible sets (each expressed in terms of the variables that appear in the corresponding problem). We build the composed problem \({{\mathcal{P}}}_{c}\,=\,\left({\,f}_{c},\,{\Omega }_{c}\right)\) as follows. Combined objective: choose non-negative weights α and β and set$${f}_{c}\,=\,\alpha \,{f}_{1}+\beta \,{f}_{2}.$$Combined feasible set:$${\varOmega }_{c}\,=\,{\varOmega }_{1}\,\cap \,{\varOmega }_{2},$$that is, all constraints from \({{\mathcal{P}}}_{1}\) and \({{\mathcal{P}}}_{2}\) must hold simultaneously. Variables that appear in both original problems are implicitly identified; variables that appear in only one remain independent. The composed optimization problem therefore reads$$\mathop{\min }\limits_{x\in {\varOmega }_{c}}\,{f}_{c}(x).$$To model, compose and solve constrained optimization problems in Python, CORNETO implements a multi-backend strategy that decouples the high-level modelling of network inference problems from the low-level canonical forms used by solvers. It currently supports CVXPY71 and PICOS72. These backends also support sparse matrices and vectorization, which CORNETO leverages for efficient implementation of methods.

Building blocks

CORNETO defines a series of building blocks, each composed of a set of variables and constraints, that can be used to construct subproblems. These subproblems can be combined to compose the feasible set of a given network inference method, representing the valid space of solutions with specific properties. By combining these building blocks, it is possible to develop network inference methods without the need to reimplement common constraints, such as enforcing acyclicity on a graph. For example, Steiner tree problems explore the space of acyclic trees, and as such, we can easily build tree-based problems by imposing acyclicity on flows, which serves as the foundation for all methods that infer tree-like networks.
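The composition scheme above (weighted sum of objectives, intersection of feasible sets) can be sketched without any solver library; here problems are plain (objective, constraints) tuples with invented one-dimensional examples, and the minimization is done over a coarse grid rather than via CVXPY or PICOS.

```python
# Compose P1 = (f1, Omega1) and P2 = (f2, Omega2): weighted objective sum and
# intersection of feasible sets (each Omega given as a list of predicates).
def compose(p1, p2, alpha=1.0, beta=1.0):
    f1, omega1 = p1
    f2, omega2 = p2
    def fc(x):
        return alpha * f1(x) + beta * f2(x)
    return fc, omega1 + omega2  # all constraints must hold simultaneously

p1 = (lambda x: x * x, [lambda x: x >= 1])  # min x^2  s.t. x >= 1
p2 = (lambda x: -x, [lambda x: x <= 3])     # min -x   s.t. x <= 3
fc, omega_c = compose(p1, p2)

# Grid search over [-2, 4] as a stand-in for a solver.
feasible = [i * 0.5 for i in range(-4, 9) if all(g(i * 0.5) for g in omega_c)]
x_best = min(feasible, key=fc)
print(x_best)  # 1.0: minimiser of x^2 - x over the feasible grid points in [1, 3]
```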
This approach not only simplifies the process of constructing new algorithms but also ensures that any improvements or extensions to the building blocks can be easily reused across multiple methods, allowing broader and more efficient enhancements without the need for redundant implementations.

Flow

Given a directed (hyper)graph \({\mathcal{H}}=({\mathcal{V}},{\mathcal{E}})\), and upper and lower bounds for each edge in each sample \({{\bf{X}}}_{{\mathrm{min}}},{{\bf{X}}}_{{\mathrm{max}}}\in {{\mathbb{R}}}^{| {\mathcal{E}}| \times n}\),$${\varOmega }_{{\mathrm{Flow}}}({\mathcal{H}})=\{{\bf{X}}| {\bf{A}}{\bf{X}}=0,\,{{\bf{X}}}_{{\mathrm{min}}}\le {\bf{X}}\le {{\bf{X}}}_{{\mathrm{max}}}\},$$(8)where \({\bf{X}}\in {{\mathbb{R}}}^{| {\mathcal{E}}| \times n}\) is the variable representing the flows for each edge and sample, and A is the vertex–edge incidence matrix of \({\mathcal{H}}\) (equation (1)).

Free variable indicator constraints

Given \({\bf{X}}\in {{\mathbb{R}}}^{m\times n}\) such that X ∈ [Xmin, Xmax],$${\varOmega }_{{\mathrm{Free}}}({\bf{X}})=\{{\bf{Y}}| \,{{\bf{X}}}_{{\mathrm{min}}}\odot {\bf{Y}}\le {\bf{X}}\le {{\bf{X}}}_{{\mathrm{max}}}\odot {\bf{Y}}\},$$(9)where Y ∈ {0, 1}m×n is a binary variable such that Yi, j = 0 ⇒ Xi, j = 0.

Non-zero indicator constraints

Given \({\bf{X}}\in {{\mathbb{R}}}^{m\times n}\), and X ∈ [Xmin, Xmax],$${\varOmega }_{\ne 0}({\bf{X}})=\left\{{\bf{Y}}\in {\{0,1\}}^{m\times n}\,| \,{\bf{Y}}\,\text{satisfies}\,({\rm{C}}1)-({\rm{C}}3)\right\},$$(10)where the constraints are defined as$${\bf{Y}}={{\bf{Y}}}^{+}+{{\bf{Y}}}^{-}$$(C1)$${\bf{Y}}\le 1$$(C2)$${{\bf{X}}}_{\min }\odot {{\bf{Y}}}^{-}+\epsilon {{\bf{Y}}}^{+}\le {\bf{X}}\le {{\bf{X}}}_{\max }\odot {{\bf{Y}}}^{+}-\epsilon {{\bf{Y}}}^{-}.$$(C3)Here, Y+, Y− ∈ {0, 1}m×n are binary variables introduced to indicate the sign of the entries of X, such that \({{\bf{Y}}}_{i,\,j}^{+}=1\ \iff \ {{\bf{X}}}_{i,\,j}\ge \epsilon\) (indicating that X is active in the positive direction if and only if
\({{\bf{Y}}}_{i,\,j}^{+}=1\)), and \({{\bf{Y}}}_{i,\,j}^{-}=1\ \iff \ {{\bf{X}}}_{i,\,j}\le -\epsilon\) (indicating that X is active in the negative direction if and only if \({{\bf{Y}}}_{i,\,j}^{-}=1\)).
Acyclic
Given a graph \({\mathcal{G}}=({\mathcal{V}},{\mathcal{E}})\), and the associated X flow variables,$${\varOmega }_{{\rm{Acyc}}}({\mathcal{G}})=\left\{{\bf{Y}}\in {\varOmega }_{\ne 0}({\bf{X}})\,\left\vert \,{\bf{Y}}\,\text{satisfies}\,({\rm{C}}4)-({\rm{C}}6)\right.\right\},$$(11)where the constraints are defined as$$1\le {{\bf{L}}}_{v,:}\le | {\mathcal{V}}| ,\,\forall v\in {\mathcal{V}}$$(C4)$${{\bf{L}}}_{v,:}-{{\bf{L}}}_{u,:}\ge {{\bf{Y}}}_{e,:}^{+}+(1-| {\mathcal{V}}| )(1-{{\bf{Y}}}_{e,:}^{+})$$(C5)$${{\bf{L}}}_{u,:}-{{\bf{L}}}_{v,:}\ge {{\bf{Y}}}_{e,:}^{-}+(1-| {\mathcal{V}}| )(1-{{\bf{Y}}}_{e,:}^{-})$$(C6)for all edges \(e=(u,v)\in {\mathcal{E}}\). Here, \({\bf{L}}\in {{\mathbb{R}}}^{| {\mathcal{V}}| \times n}\) is a continuous variable matrix that encodes a valid topological ordering of the vertices in the graph for each of the n samples. \({\bf{Y}}\in {\{0,1\}}^{| {\mathcal{E}}| \times n}\) is a non-zero binary indicator for the flows (equation (10)). We use u and v to refer to the indices of the source and target vertices of an edge in L. The constraints are enforced for all columns (samples) in the matrices. Constraint (C4) ensures that, for each vertex v, its assigned order Lv,: is between 1 and \(| {\mathcal{V}}|\). Constraints (C5) and (C6) enforce that, if an edge \(e=(u,v)\in {\mathcal{E}}\) is directed from u to v (that is, \({{\bf{Y}}}_{e,:}^{+}=1\)), then the topological order of u must be less than that of v across all samples; specifically, Lv,: − Lu,: ≥ 1. Similarly, if the edge is directed from v to u (that is, \({{\bf{Y}}}_{e,:}^{-}=1\)), then the order of v must be less than that of u; that is, Lu,: − Lv,: ≥ 1.
These conditions guarantee that the resulting topological ordering is consistent with an acyclic graph structure for all samples, depending on the directionality of the edges as defined by Y+ and Y−. Note that the acyclicity constraint is specifically defined for graphs and not for hypergraphs. It is used in methods that aim to find directed acyclic graphs or tree structures within a PKN.
Tree
A tree is a directed acyclic graph where each vertex has at most one incoming edge. The tree constraint can be enforced by building on the acyclic structure \({\varOmega }_{{\rm{Acyc}}}({\mathcal{G}})\) and adding a constraint that ensures each vertex \(v\in {\mathcal{V}}\) has no more than one incoming edge across all samples. The constraint is formulated as follows:$${\varOmega }_{{\rm{Tree}}}({\mathcal{G}})=\left\{{\bf{Y}}\in {\varOmega }_{{\rm{Acyc}}}({\mathcal{G}})\,\left\vert \,\sum _{e\in \,\text{In}\,(v)}{{\bf{Y}}}_{e,:}\le 1,\,\forall v\in {\mathcal{V}}\right.\right\},$$(12)where In(v) denotes the set of incoming edges to vertex v.
CORNETO’s generalized joint network inference problem from prior knowledge
We are now ready to introduce the general optimization problem that CORNETO proposes, which generalizes network inference methods from prior knowledge to a joint optimization with a common structure for all methods within the framework. The objective of this joint optimization is to estimate the network structure for each sample by leveraging information from all samples, thereby improving accuracy and reducing variance in the predicted networks.
CORNETO formulates this problem as a constrained optimization task, incorporating a structured sparsity-inducing penalty through network flows. Given the union (annotated) graph \({{\mathcal{H}}}_{u}\), the general optimization problem is defined as$$\begin{array}{ll}\mathop{\min }\limits_{{\bf{X}},{\bf{Y}}}&\frac{1}{n}\mathop{\sum }\limits_{i=1}^{n}f({{\bf{X}}}_{:,i},{{\bf{Y}}}_{:,i};{{\mathcal{H}}}_{u})+\lambda \parallel {\bf{Y}}{{\bf{1}}}_{n}{\parallel }_{0}\\ \,\text{s.t.}\,&{\bf{X}}\in {\varOmega }_{{\mathrm{Flow}}}({{\mathcal{H}}}_{u})\\ &{\bf{Y}}\in {\varOmega }_{{\mathrm{Free}}}({\bf{X}})\\ &({\bf{X}},{\bf{Y}})\in {\varOmega }_{{\mathcal{M}}}({\bf{X}},{\bf{Y}}),\end{array}$$(13)where:
\(f({{\bf{X}}}_{:,i},{{\bf{Y}}}_{:,i};{{\mathcal{H}}}_{u})\) is the linear objective function term for sample i. The inclusion of \({{\mathcal{H}}}_{u}\) indicates that the specific values for the variables of the problem are derived from data corresponding to sample i within the annotations of the union graph \({{\mathcal{H}}}_{u}\). For example, if f involves edge costs for sample i, these costs are extracted from the ith component of the edge data vectors de (where \(e\in {{\mathcal{E}}}_{u}\)) that were established by the data mapping function ϕ.
\({\bf{X}}\in {{\mathbb{R}}}^{| {{\mathcal{E}}}_{u}| \times n}\) represents the network flows variable, where each row corresponds to the flows of an edge and each column to the flows of a sample.
\({\bf{Y}}\in {\{0,1\}}^{| {{\mathcal{E}}}_{u}| \times n}\) is a binary indicator matrix, such that Yi, j = 0 ⇒ Xi, j = 0.
This variable can be used to block flows through specific edges in the network.
\({\varOmega }_{{\mathcal{M}}}\) are method-specific constraints involving X and Y.
λ is a regularization parameter controlling sparsity across samples.
1n is a vector of ones of length n.
The structured sparsity-inducing penalty λ∥Y1n∥0, where λ is a regularization parameter controlling the trade-off between sparsity and fit, enables joint inference by minimizing the number of active edges (non-zero entries in Y, which are edges potentially carrying flow) that are present in any sample or condition. The ℓ0 regularization can be implemented in a MILP problem by linearizing the expression (Supplementary Information).
CORNETO’s network inference method definition
Based on the previously defined components, a network inference method in this framework is defined as a tuple \({\mathcal{M}}=(\phi ,\psi ,{{\mathcal{P}}}_{{\mathcal{M}}})\), with \({{\mathcal{P}}}_{{\mathcal{M}}}=(\,f,{\varOmega }_{{\mathcal{M}}})\). Given a method \({\mathcal{M}}\), a PKN \({\mathcal{H}}\) and a data matrix \({\bf{D}}\in {{\mathbb{R}}}^{m\times n}\), the general optimization problem is constructed by applying the following steps:
(1) \({{\mathcal{H}}}^{{\prime} }=\phi ({\mathcal{H}},{\bf{D}})\)
(2) \({{\mathcal{H}}}_{i}^{{\prime\prime} }=\psi ({{\mathcal{H}}}^{{\prime} }),\quad \forall i\in \{1,2,\ldots ,n\}\)
(3) \({{\mathcal{H}}}_{u}=\left(\mathop{\bigcup }\nolimits_{i = 1}^{n}{{\mathcal{V}}}_{i}^{{\prime\prime} },\mathop{\bigcup }\nolimits_{i = 1}^{n}{{\mathcal{E}}}_{i}^{{\prime\prime} }\right)\)
(4) Build \({\mathcal{P}}\) by plugging \({{\mathcal{P}}}_{{\mathcal{M}}}\) into the global problem template (equation (13)).
Shortest paths
We begin by illustrating how the simple shortest path problem can be implemented in this framework, and extended to multi-sample settings. Given a directed graph \({\mathcal{G}}=({\mathcal{V}},{\mathcal{E}})\), with \({\bf{d}}\in {{\mathbb{R}}}_{+}^{| {\mathcal{E}}| }\) representing the
non-negative costs associated with the edges, and two vertices \(s,t\in {\mathcal{V}}\), the shortest path problem can be reformulated as a flow optimization problem73. To do this, we need to transform the original graph into \({{\mathcal{G}}}^{{\prime} }\) by adding two edges \(({{\emptyset}},\{s\})\) and \((\{t\},{{\emptyset}})\) to inject or extract flow through the s–t vertices, and then we can use \({\varOmega }_{{\mathrm{Flow}}}({{\mathcal{G}}}^{{\prime} })\), defined previously:$$\begin{array}{ll}\mathop{\min }\limits_{{\bf{x}}}&{{\bf{d}}}^{\top }{\bf{x}}\\ \,\text{s.t.}\,&{\bf{x}}\in {\varOmega }_{{\mathrm{Flow}}}({{\mathcal{G}}}^{{\prime} })\\ &{{\bf{x}}}_{({{\emptyset}},s)}=1\\ &{{\bf{x}}}_{(t,{{\emptyset}})}=1.\end{array}$$(14)The objective is to minimize the total cost of flow, where x represents the flow along the edges, constrained by xmin = 0 and xmax = 1. The constraints \({{\bf{x}}}_{({{\emptyset}},s)}=1\) and \({{\bf{x}}}_{(t,{{\emptyset}})}=1\) inject and extract one unit of flow at the source s and target t, respectively, and also guarantee that the solution is not the trivial zero flow. These constraints are specific to the shortest path method.
Multi-sample shortest paths
Given that we can reformulate the shortest path problem in CORNETO using flows, the method can be extended to handle multiple samples simultaneously, where each sample has its own source, target and edge costs. In this case, instead of a single cost vector d, we have a matrix \({\bf{D}}\in {{\mathbb{R}}}_{+}^{| {\mathcal{E}}| \times n}\), where each column corresponds to the costs for a different sample, as well as a list of source vertices {s1, s2, …, sn} and a list of target vertices {t1, t2, …, tn} for the n samples. The goal is to solve the shortest path problem for all n samples simultaneously, while also introducing a regularization term to promote edge sparsity across samples.
This is formulated as a joint optimization problem:$$\begin{array}{ll}\mathop{\min }\limits_{{\bf{X}}}&\frac{1}{n}\mathop{\sum }\limits_{i=1}^{n}{{\bf{D}}}_{:,i}^{\top }{{\bf{X}}}_{:,i}+\lambda \parallel {\bf{Y}}{{\bf{1}}}_{n}{\parallel }_{0}\\ \,\text{s.t.}\,&{\bf{X}}\in {\varOmega }_{{\mathrm{Flow}}}({\mathcal{G}})\\ &{\bf{Y}}\in {\varOmega }_{{\mathrm{Free}}}({\bf{X}})\\ &{{\bf{X}}}_{i,\,j}=1\quad \forall i| {e}_{i}=({{\emptyset}},\{{s}_{j}\}),\,j=1,\ldots ,n\\ &{{\bf{X}}}_{i,\,j}=0\quad \forall i| {e}_{i}=({{\emptyset}},\{{s}_{k}\}),\,\forall k\ne j,\,j=1,\ldots ,n\\ &{{\bf{X}}}_{i,\,j}=1\quad \forall i| {e}_{i}=(\{{t}_{j}\},{{\emptyset}}),\,j=1,\ldots ,n\\ &{{\bf{X}}}_{i,\,j}=0\quad \forall i| {e}_{i}=(\{{t}_{k}\},{{\emptyset}}),\,\forall k\ne j,\,j=1,\ldots ,n,\end{array}$$(15)where s1, s2, …, sn are the input vertices for the shortest path associated with each sample, and t1, t2, …, tn are the target or destination vertices.The last four constraints involving the flow matrix X ensure that, for each sample j, flow is injected into the graph only through the edges connected to the specific source node sj and is extracted only through the edges connected to the specific target node tj. All other edges involving source or target vertices not associated with sample j have no flow. This guarantees that the flow is restricted to the relevant edges for each sample, maintaining the integrity of the source–target pair for the shortest path. 
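A minimal sketch of this joint formulation, using scipy.optimize.milp instead of CORNETO's backends: the toy PKN, per-sample costs and λ are invented, and the ℓ0 term ∥Y1n∥0 is modelled here with one shared binary ye per edge that upper-bounds the flows of all samples (one common way to linearize this penalty):

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

# Toy PKN with two s-t paths, plus the flow injection/extraction edges
# (∅,s) and (t,∅) described in the text. Names and costs are invented.
# edges: 0:(s,a) 1:(a,t) 2:(s,b) 3:(b,t) 4:(∅,s) 5:(t,∅)
A = np.array([               # vertex-edge incidence (rows: s, a, b, t)
    [-1,  0, -1,  0,  1,  0],
    [ 1, -1,  0,  0,  0,  0],
    [ 0,  0,  1, -1,  0,  0],
    [ 0,  1,  0,  1,  0, -1]], dtype=float)
n_v, n_e = A.shape
n, lam = 2, 0.1              # two samples, sparsity weight
D = np.array([[1.0, 1.05],   # per-edge, per-sample costs: sample 1
              [1.0, 1.05],   # slightly prefers the path via a,
              [1.05, 1.0],   # sample 2 the path via b
              [1.05, 1.0],
              [0.0, 0.0],    # auxiliary edges are free
              [0.0, 0.0]])

# variables z = [X[:, sample 1], X[:, sample 2], y]
n_var = n_e * n + n_e
c = np.concatenate([D[:, 0] / n, D[:, 1] / n,
                    lam * np.array([1, 1, 1, 1, 0, 0])])
cons = []
for j in range(n):
    # flow conservation A @ X[:, j] = 0 for each sample
    Aj = np.zeros((n_v, n_var))
    Aj[:, j * n_e:(j + 1) * n_e] = A
    cons.append(LinearConstraint(Aj, 0, 0))
    # coupling X[e, j] <= y_e (Omega_Free with X_max = 1)
    Cj = np.zeros((n_e, n_var))
    Cj[:, j * n_e:(j + 1) * n_e] = np.eye(n_e)
    Cj[:, n * n_e:] = -np.eye(n_e)
    cons.append(LinearConstraint(Cj, -np.inf, 0))

lb, ub = np.zeros(n_var), np.ones(n_var)
for j in range(n):           # inject/extract one unit per sample
    lb[j * n_e + 4] = lb[j * n_e + 5] = 1.0
integrality = np.concatenate([np.zeros(n_e * n), np.ones(n_e)])

res = milp(c=c, constraints=cons, bounds=Bounds(lb, ub),
           integrality=integrality)
y = res.x[n * n_e:]
print(res.fun, y[:4].round())
```

With λ = 0 the two samples pick their individually cheapest (disjoint) paths; with the invented λ = 0.1 the penalty on the number of edges used in any sample makes both samples share a single path through the PKN, illustrating the structured sparsity of the joint problem. Setting n = 1 and λ = 0 recovers equation (14).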
It should be noted that, when there is one single sample, these constraints reduce to the ones in the single shortest path (equation (14)). The single- and multi-sample shortest path problem can then be defined in CORNETO as \({{\mathcal{M}}}_{SP}=(\phi ,\psi ,{{\mathcal{P}}}_{SP})\) and \({{\mathcal{P}}}_{SP}=(\,f,{\varOmega }_{SP})\), where:
\(\phi ({\bf{D}},{\mathcal{G}})=(\{{v}^{{\prime} }\,| \,{v}^{{\prime} }=(v,{\bf{0}})\},\{{e}^{{\prime} }\,| \,{e}^{{\prime} }=(e,{{\bf{d}}}_{e})\})\), that is, data mapping trivially maps the data (weights or costs of the edges) to the edges of the graph, and vertices do not have any cost associated with them.
\(\psi ({{\mathcal{G}}}^{{\prime} })=\left({{\mathcal{V}}}^{{\prime} },{{\mathcal{E}}}^{{\prime} }\cup \{({e}_{s},{\bf{0}}),({e}_{t},{\bf{0}})\}\right)\), where \({e}_{s}=({{\emptyset}},\{s\})\) and \({e}_{t}=(\{t\},{{\emptyset}})\), and \(s,t\in {{\mathcal{V}}}^{{\prime} }\) are the provided source and target vertices.
f(x) = d⊤x, where d is a column vector with the costs of the edges for a given sample, and x is the variable vector of flows (for a given sample) provided by CORNETO in the general problem (equation (13)).
ΩSP corresponds to the last four constraints of equation (15) involving the injection or extraction of a unit of flow, which is the method-specific set of constraints to formulate the shortest path as a flow problem.
By plugging f and ΩSP into equation (13), we obtain the multi-sample version of the shortest path problem in CORNETO. Note that, if there is only one sample and λ = 0, the problem is equivalent to a standard shortest path. If there is only one sample and λ > 0, then the single shortest path is regularized so that, among all the paths with the minimum cost (if there is more than one solution), the one with the fewest edges is returned. If there are n samples and λ = 0, then the problem is equivalent to solving each shortest path independently. Supplementary Fig.
2 shows an example of all the components of the framework for the single shortest path problem.
Steiner trees
The Steiner tree problem can be defined as follows. Given an undirected graph \({\mathcal{G}}=({\mathcal{V}},{\mathcal{E}})\), and a set of terminal vertices \(T\subseteq {\mathcal{V}}\) with ∣T∣ ≥ 2, the goal is to find a tree \(S=({{\mathcal{V}}}_{S},{{\mathcal{E}}}_{S})\) within the graph \({\mathcal{G}}\) that includes all the terminal vertices, that is, \(T\subseteq {{\mathcal{V}}}_{S}\), while minimizing the total weight of the edges in S. Steiner trees have two basic constraints: (1) all terminals must be selected as part of the solution; and (2) the solution is a tree. The reformulation of Steiner trees in CORNETO thus requires expressing them using the concepts defined in the framework. First, for the first constraint, the initial graph is transformed such that a new edge is added to every terminal vertex. One vertex is chosen as the root, and an edge to inject flow is added to it; for each of the remaining terminals, we add one edge to extract flow. If no root is provided, the first terminal is selected as the root. It should be noted that, in an undirected Steiner tree, any terminal vertex could be the root, because all the terminals have to be connected. A constraint is added such that the input edge injects u ≥ 1 units of flow, and every other added edge extracts \(\frac{u}{| T| -1}\) of the injected flow.
This constrains the space of feasible solutions to the space of subgraphs where the terminals are connected to the root through any path from the PKN.
The method SteinerTree, which uses a directed graph \({\mathcal{G}}\), can be implemented as follows:
\({{\mathcal{G}}}^{{\prime} }=\phi ({\bf{D}},{\mathcal{G}})=(\{{v}^{{\prime} }\,| \,{v}^{{\prime} }=(v,{\bf{0}})\},\{{e}^{{\prime} }\,| \,{e}^{{\prime} }=(e,{{\bf{d}}}_{e})\})\)
\({{\mathcal{G}}}^{{\prime\prime} }=\psi ({{\mathcal{G}}}^{{\prime} })=\left({{\mathcal{V}}}^{{\prime} },{{\mathcal{E}}}^{{\prime} }\cup \{(({{\emptyset}},{r}_{j}),{\bf{0}})\}\cup \{((t,{{\emptyset}}),{\bf{0}})\,|\,t\in T\setminus \{{r}_{j}\}\}\right),\,\forall j\in \{1,2,\ldots ,n\}\), that is, an edge to inject flow is added to the root vertex, and one edge to extract flow is added to every other terminal that is not the root. These edges have no cost (de = 0). There can be a different root and set of terminals per sample, indexed by j.
ΩST further restricts the variables of the main problem by imposing:
– \({\bf{Y}}\in {\varOmega }_{{\mathrm{Tree}}}({{\mathcal{G}}}_{u})\), where \({{\mathcal{G}}}_{u}\) is the union graph of the n modified graphs \({{\mathcal{G}}}^{{\prime\prime} }\).
– For each sample j, the flow injected into the root terminal rj through the edge \(({{\emptyset}},{r}_{j})\) must be equal to u ≥ 1:$${{\bf{X}}}_{({{\emptyset}},{r}_{j}),\,j}=u\quad \forall j\in \{1,2,\ldots ,n\}.$$
– For each sample j, the flow extracted from each non-root terminal t ∈ Tj⧹{rj} through the edge \((t,{{\emptyset}})\) must be equal to \(\frac{u}{| {T}_{j}| -1}\):$${{\bf{X}}}_{(t,{{\emptyset}}),\,j}=\frac{u}{| {T}_{j}| -1}\quad \forall j,\,\forall t\in {T}_{j}\setminus \{{r}_{j}\}.$$
– Each edge that injects flow through a root defined for a different sample cannot inject flow in the current sample:$${{\bf{X}}}_{({{\emptyset}},r),\,j}=0\quad \forall j,\,\forall r\ne {r}_{j}.$$
– Similarly, for the
edges to extract flow through the terminals,$${{\bf{X}}}_{(t,{{\emptyset}}),\,j}=0\quad \forall j,\,\forall t\notin {T}_{j}.$$
f(y) = d⊤y, which minimizes the selection of edges in the tree.
For directed Steiner trees, the root node must be provided, because the reachability of vertices changes depending on the directions of the edges. Directionality of flows is controlled by lower and upper bounds on the edges. If an edge has a negative lower bound and a positive upper bound, the edge behaves as an undirected edge. If the lower bound is 0, the edge is directed.
PCST
The PCST problem can be defined as follows. Given a graph \({\mathcal{G}}=({\mathcal{V}},{\mathcal{E}})\) and a non-negative prize function \(p:V\to {{\mathbb{R}}}^{+}\) that assigns a prize to each vertex, along with a weight function \(w:E\to {{\mathbb{R}}}^{+}\) that assigns a cost to each edge, the goal is to find a tree \(S=({{\mathcal{V}}}_{S},{{\mathcal{E}}}_{S})\) within the graph \({\mathcal{G}}\) that minimizes the total cost of the edges minus the total prize of the vertices included in the tree, that is, minimizing$$\sum _{e\in {E}_{S}}w(e)-\sum _{v\in {V}_{S}}p(v).$$The solution to the PCST problem balances the cost of connecting the vertices (using the edges) against the benefits (prizes) obtained by including vertices in the tree. PCST in CORNETO is implemented using the Steiner tree implementation, with minimal changes. First, because terminals are not forced to be selected but are instead part of the optimization, flow is not forced through them. Instead, the selection of prized terminals is moved to the objective function.
Thus, the objective function is defined as$$f({\bf{y}})={{\bf{d}}}^{\top }{\bf{y}}-{{\bf{p}}}^{\top }{\bf{y}},$$where \({\bf{d}}\in {{\mathbb{R}}}^{| {{\mathcal{E}}}_{u}| }\) is the vector containing the edge costs for a given sample, and \({\bf{p}}\in {{\mathbb{R}}}^{| {{\mathcal{E}}}_{u}| }\) is the vector with prizes (associated with the auxiliary edges that extract flow through the prized terminals) taken from the annotated union graph \({{\mathcal{G}}}_{u}\). It should be noted that the vectors d and p are mutually exclusive, that is, for any index i, if di ≠ 0, then pi = 0, and if pi ≠ 0, then di = 0. This condition ensures that an edge either contributes to the edge cost (through d) or is associated with a prize (through p), but not both, making the problem equivalent to the original definition of the PCST.
FBA
FBA16 is a mathematical approach used to analyse the flow of metabolites through a metabolic network. Given a metabolic network represented by a stoichiometric matrix S, where each element Si,j indicates the stoichiometric coefficient of metabolite i in reaction j, FBA aims to find the distribution of fluxes \({\bf{v}}\in {{\mathbb{R}}}^{n}\) through the network that maximizes or minimizes a particular objective function, such as the biomass production of a given organism. The key assumptions and components of FBA are as follows:
Steady-state assumption: the system is assumed to be in a steady state, meaning that the concentration of each metabolite does not change over time.
The flux represents the rate at which each reaction occurs, typically expressed in units of millimoles per gram of dry weight per hour.
Constraints: the fluxes are subject to constraints based on reaction capacities.
Objective function: a linear objective function of the fluxes, such as maximizing biomass yield (for example, the output flux from an artificial biomass reaction in the metabolic network).
A genome-scale metabolic network can be modelled in CORNETO using a hypergraph \({\mathcal{H}}=({\mathcal{V}},{\mathcal{E}})\), by setting the coefficients of the incidence matrix A to the stoichiometric coefficients of the metabolic network. Then, the stoichiometric matrix S is equivalent to the incidence matrix of the hypergraph. By using the lower and upper bounds on reaction fluxes, the set of feasible metabolic fluxes is exactly the set of feasible flows in the hypergraph \({\varOmega }_{{\mathrm{Flow}}}({\mathcal{H}})\) (equation (8)). Thus, FBA is a particular instance of a single-sample (hyper)network flow in CORNETO with λ = 0. The objective function is defined as$$f({\bf{x}})=-{{\bf{c}}}^{\top }{\bf{x}},$$where \({\bf{c}}\in {\{0,1\}}^{| {{\mathcal{E}}}_{u}| }\) is a provided constant vector indicating the reaction fluxes to be maximized. The FBA method does not need data mapping or graph transformations, because no omics data are used, and the PKN typically contains import and export reactions such that the space of feasible solutions contains non-zero flux (flow) solutions. CORNETO provides a method that uses COBRApy74 to import any genome-scale metabolic network as a CORNETO hypergraph.
sFBA
sFBA31,75 is a variant of FBA that seeks to find an optimal flux distribution with the additional constraint of minimizing the number of active reactions (reactions carrying non-zero flux) in the metabolic network. The goal is to identify the sparsest set of reactions that can support a given metabolic objective (for example, maximizing biomass production).
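The FBA linear program and its sparse variant can be sketched on an invented three-metabolite toy network with scipy.optimize (linprog for FBA, milp for the sparse variant) rather than CORNETO's hypergraph flow blocks; the reactions, bounds and the 95% biomass threshold are assumptions for illustration:

```python
import numpy as np
from scipy.optimize import linprog, milp, LinearConstraint, Bounds

# Invented toy network (rows of S: metabolites A, B, C):
# R1: -> A | R2: A -> B | R3: B -> (biomass) | R4: A -> C | R5: C -> B
S = np.array([[1, -1,  0, -1,  0],
              [0,  1, -1,  0,  1],
              [0,  0,  0,  1, -1]], dtype=float)
n_r = S.shape[1]
v_max = 10.0

# FBA: maximize the biomass flux v3 subject to S v = 0, 0 <= v <= v_max.
c = np.zeros(n_r)
c[2] = -1.0                              # linprog minimizes, so use -v3
fba = linprog(c, A_eq=S, b_eq=np.zeros(3), bounds=(0, v_max))
v_biomass = -fba.fun

# Sparse variant: minimize the number of active reactions while keeping
# at least 95% of the optimal biomass. Variables z = [v, y], y binary,
# with the indicator constraint v_j <= v_max * y_j.
n_var = 2 * n_r
obj = np.concatenate([np.zeros(n_r), np.ones(n_r)])
steady = LinearConstraint(np.hstack([S, np.zeros_like(S)]), 0, 0)
link = LinearConstraint(np.hstack([np.eye(n_r), -v_max * np.eye(n_r)]),
                        -np.inf, 0)
bio = np.zeros(n_var)
bio[2] = 1.0
biomass = LinearConstraint(bio, 0.95 * v_biomass, np.inf)
res = milp(c=obj, constraints=[steady, link, biomass],
           bounds=Bounds(np.zeros(n_var),
                         np.concatenate([v_max * np.ones(n_r),
                                         np.ones(n_r)])),
           integrality=np.concatenate([np.zeros(n_r), np.ones(n_r)]))
print(v_biomass, res.fun)
```

In this toy model the redundant two-step route R4 + R5 duplicates R2, so the sparse solution drops it and keeps only three active reactions, mirroring how the ℓ0 penalty prunes alternative pathways that are not needed to sustain the objective.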
This method is particularly useful for identifying core metabolic functions or simplifying metabolic networks for analysis. To find the smallest set of active reactions (that is, those with non-zero fluxes) in a metabolic network that maximizes a given linear objective function, sFBA can be exactly implemented by activating the ℓ0 penalty on the selected edges y (λ > 0), and adding a constraint on the fluxes, for example, \({{\bf{x}}}_{{\mathrm{biomass}}}\ge 0.95\times {{\bf{x}}}_{{\mathrm{biomass}}}^{* }\), where \({{\bf{x}}}_{{\mathrm{biomass}}}^{* }\) is the optimal value of biomass flux in the standard FBA solution. This constraint ensures that the sparse solution retains at least 95% of the maximum possible biomass production, while minimizing the number of active reactions. The multi-sample sFBA method implemented with CORNETO extends this definition to find a metabolic network per condition such that the union of the networks is minimal, while still satisfying all the metabolic constraints of each condition, such as different cell substrates, different reaction flux bounds or any other type of constraint typically used in FBA modelling. This is achieved by reformulating the sFBA problem using CORNETO’s framework, which creates n simultaneous flow problems that are equivalent to n coupled FBA problems. The metabolic fluxes for each problem are captured in the \({\bf{X}}\in {{\mathbb{R}}}^{| {{\mathcal{E}}}_{u}| \times n}\) variable created by CORNETO, together with the flux indicator variable \({\bf{Y}}\in {\{0,1\}}^{| {{\mathcal{E}}}_{u}| \times n}\), whose entry (i, j) indicates whether reaction i can carry metabolic flux in condition j or is blocked. Instead of minimizing the selection of reactions with non-zero flux within each individual condition, we aim to minimize it across all samples simultaneously, defining sparsity in terms of the number of distinct reactions selected across samples.
Thus, maximizing sparsity across samples is equivalent to minimizing the total number of unique reactions in the union of the n different inferred metabolic networks. This definition of structured sparsity is exactly what CORNETO captures with the regularization term λ∥Y1n∥0. It should be noted that, in sFBA, the only term in the objective function is the minimization of active reactions. As a result, increasing λ does not have a progressive effect; it only distinguishes between λ = 0 (no sparsity) and λ > 0 (sparsity applied).
Multi-sample sFBA for flux adjustment
We extended the idea of sFBA to support flux adjustment by incorporating an additional ℓ2 regularization on the flux values (with parameter β) to control their magnitude, while maintaining global sparsity through a structured ℓ0 penalty (with parameter λ). The method minimizes the discrepancy between the predicted and observed fluxes, subject to the same metabolic constraints as standard sFBA. The objective function is$$\mathop{\min }\limits_{\mathbf{X},\mathbf{Y}}\,\parallel \mathbf{M}\,\odot\,(\mathbf{X}-\hat{\mathbf{X}})\parallel_{2,1}+\beta\,\parallel \mathbf{X}\parallel_{2,1}+\lambda\,\parallel \mathbf{Y}\parallel_{0},$$where \({\bf{M}}\in {\{0,1\}}^{| {{\mathcal{E}}}| \times n}\) masks the unmeasured fluxes, and \({\parallel {\bf{X}}\parallel }_{2,1}:=\mathop{\sum }\nolimits_{i=1}^{n}{\parallel {{\bf{X}}}_{:,i}\parallel }_{2}\) denotes the column-wise mixed ℓ2,1 norm. It is important to note that the use of ℓ2 regularization introduces a quadratic term in the objective function, converting the problem into a mixed-integer quadratic program (MIQP), also supported in CORNETO.
Metabolic network inference from omics (iMAT)
The iMAT method27 is an FBA-based method used to reconstruct context-specific metabolic networks by integrating omics data with a genome-scale metabolic model.
The method identifies a subset of reactions that are consistent with the omics data while ensuring that the resulting network satisfies the constraints imposed by FBA. Given a metabolic network, and an omics-based classification of reactions into three categories:
\({C}_{{\mathrm{H}}}\subseteq \{1,\ldots ,| {{\mathcal{E}}}_{u}| \}\): high-confidence reactions, which are strongly supported by omics data.
\({C}_{{\mathrm{L}}}\subseteq \{1,\ldots ,| {{\mathcal{E}}}_{u}| \}\): low-confidence reactions, which are weakly supported by omics data.
\({C}_{{\mathrm{U}}}=\{1,\ldots ,| {{\mathcal{E}}}_{u}| \}\setminus ({C}_{{\mathrm{H}}}\cup {C}_{{\mathrm{L}}})\): uncertain reactions, with ambiguous or no supporting data.
The iMAT method solves the following optimization problem:$$\mathop{\max }\limits_{{\bf{y}}\in {\{0,1\}}^{n}}\sum _{j\in {C}_{{\mathrm{H}}}}{{\bf{y}}}_{j}-\sum _{j\in {C}_{{\mathrm{L}}}}{{\bf{y}}}_{j},$$subject to the FBA constraints. The binary variable yj ∈ {0, 1} ensures that reaction j can carry flux only if it is included in the network (that is, yj = 1). In CORNETO, this method is implemented similarly to the sFBA problem, but using the non-zero indicator Y ∈ Ω≠0(X) instead of ΩFree. Omics data are mapped using the gene–protein reaction rules in the metabolic network, with a specific \(\phi ({\bf{D}},{\mathcal{H}})\) (Supplementary Information). The objective function is then$$f({\bf{y}})={{\bf{d}}}_{h}^{\top }(1-{\bf{y}})+{{\bf{d}}}_{l}^{\top }{\bf{y}},$$where \({{\bf{d}}}_{h}={{\bf{1}}}_{{C}_{{\mathrm{H}}}}\in {\{0,1\}}^{n}\) and \({{\bf{d}}}_{l}={{\bf{1}}}_{{C}_{{\mathrm{L}}}}\in {\{0,1\}}^{n}\), with 1S being the indicator function.
Sign-consistent intracellular signalling network (CARNIVAL)
We implemented the CARNIVAL method using the flow-based formulation for multi-sample signalling network inference from transcriptomics.
To compare with the original CARNIVAL R implementation for single samples, we implemented a more flexible and efficient version of it (Supplementary Information). Given a directed (signed) PKN \({\mathcal{G}}\), where \({\bf{s}}\in {\{-1,1\}}^{| {\mathcal{E}}| }\) is the vector of edge interaction signs (−1, inhibition; 1, activation) from the PKN, a matrix \({\bf{D}}\in {{\mathbb{R}}}^{| {\mathcal{V}}| \times n}\) with TF activities (where negative values indicate inhibition and positive values indicate activation), and a matrix \({\bf{P}}\in {\{-1,0,1\}}^{| {{\mathcal{V}}}_{u}| \times n}\) indicating which vertices from the PKN are perturbed in each sample:
\(\phi ({\bf{D}},{\mathcal{G}})=(\{{v}^{{\prime} }\,| \,{v}^{{\prime} }=(v,{{\bf{d}}}_{v})\},\{{e}^{{\prime} }\,| \,{e}^{{\prime} }=(e,{\bf{0}})\})\).
\(\psi ({{\mathcal{G}}}^{{\prime} })=\left({{\mathcal{V}}}^{{\prime} },{{\mathcal{E}}}^{{\prime} }\cup \left\{({{\emptyset}},\{{v}_{i}\})\,| \,{{\bf{p}}}_{i}\ne 0\right\}\cup \left\{(\{{v}_{k}\},{{\emptyset}})\,| \,{{\bf{d}}}_{k}\ne 0\right\}\right)\), where:
– pi represents the ith entry of a given column j of P (that is, pi = Pi,j), which corresponds to the perturbation status of vertex vi in sample j (with pi ≠ 0 indicating that vertex vi is perturbed in this sample),
– dk represents the kth entry of the same column j of D (that is, dk = Dk,j), which corresponds to the TF activity of vertex vk in sample j (with dk ≠ 0 indicating that TF vk is active or inhibited in this sample).
ΩCarnival has the following constraints and variables:$$\begin{array}{l}{\bf{Y}}\in {\varOmega }_{{\mathrm{Tree}}}({{\mathcal{G}}}_{u})\\ {{\bf{E}}}^{{\rm{act}}}+{{\bf{E}}}^{{\rm{inh}}}\le {\bf{Y}}\\ {{\bf{V}}}^{{\rm{act}}}+{{\bf{V}}}^{{\rm{inh}}}\le {\bf{1}}\\ {{\bf{E}}}^{{\rm{act}}}\le ({{\bf{H}}}^{\top }{{\bf{V}}}^{{\rm{act}}})\odot {{\bf{1}}}_{{\bf{s}}\ > \ 0}+({{\bf{H}}}^{\top }{{\bf{V}}}^{{\rm{inh}}})\odot {{\bf{1}}}_{{\bf{s}} < 0}\\ {{\bf{E}}}^{{\rm{inh}}}\le ({{\bf{H}}}^{\top }{{\bf{V}}}^{{\rm{act}}})\odot {{\bf{1}}}_{{\bf{s}} < 0}+({{\bf{H}}}^{\top }{{\bf{V}}}^{{\rm{inh}}})\odot {{\bf{1}}}_{{\bf{s}}\ > \ 0}\\ {{\bf{V}}}_{i,\,j}^{{\rm{act}}}-{{\bf{V}}}_{i,\,j}^{{\rm{inh}}}={{\bf{P}}}_{i,\,j}\quad \,\text{if}\,\quad {{\bf{P}}}_{i,\,j}\ne 0,\end{array}$$where \({\bf{H}}\in {\{0,1\}}^{| {{\mathcal{V}}}_{u}| \times | {{\mathcal{E}}}_{u}| }\) is defined as$${\bf{H}}=\max (-{\bf{A}},0)=\left\{\begin{array}{ll}1\quad &\,\text{if vertex}\,v\,\text{is the tail of edge}\,e,\\ 0\quad &\,\text{otherwise}\,.\end{array}\right.$$\(f({\bf{v}})={\left({\bf{1}}-{\bf{v}}\odot \mathrm{sign}\,({\bf{d}})\right)}^{\top }| {\bf{d}}|\), where \({\bf{v}}={{\bf{V}}}_{:,\,j}^{{\rm{act}}}-{{\bf{V}}}_{:,\,j}^{{\rm{inh}}}\) and d = D:,j, for a given sample j.
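To make the sign-propagation logic behind these constraints concrete, the following pure-Python sketch (not CORNETO's MILP encoding) propagates perturbation signs along the signed edges of a candidate network and checks consistency with measured TF signs; the network, node names and measurements are invented for illustration:

```python
# Each selected signed edge (u, v, s) with s in {-1, +1} propagates the
# activity of u to v as v_state = u_state * s; the linearized E^act/E^inh
# constraints in the MILP formulation play the same role.
def propagate(edges, perturbation):
    """Propagate node activities (+1/-1) from the perturbed nodes."""
    state = dict(perturbation)
    changed = True
    while changed:
        changed = False
        for u, v, s in edges:
            if u in state and v not in state:
                state[v] = state[u] * s
                changed = True
    return state

# Invented toy network: activated kinase K activates P1, P1 inhibits TF1,
# K inhibits P2, and P2 activates TF2.
edges = [("K", "P1", +1), ("P1", "TF1", -1),
         ("K", "P2", -1), ("P2", "TF2", +1)]
state = propagate(edges, {"K": +1})

# Invented measured TF activities: both TFs are inhibited.
measured = {"TF1": -1, "TF2": -1}
consistent = all(state.get(tf) == sign for tf, sign in measured.items())
print(state, consistent)
```

A candidate subnetwork that yields propagated TF signs matching the measured activities is sign consistent; the MILP objective f(v) above penalizes exactly the mismatches between the inferred node states and the measured TF activities, weighted by their magnitudes.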