Network inference algorithms are valuable tools for the study of large-scale neuroimaging datasets. Multivariate transfer entropy is well suited for this task, being a model-free measure that captures nonlinear and lagged dependencies between time series to infer a minimal directed network model. Greedy algorithms have been proposed to efficiently deal with high-dimensional datasets while avoiding redundant inferences and capturing synergistic effects. However, multiple statistical comparisons may inflate the false positive rate and are computationally demanding, which limited the size of previous validation studies. The algorithm we present, as implemented in the IDTxl open-source software, addresses these challenges by employing hierarchical statistical tests to control the family-wise error rate and to allow for efficient parallelization. The method was validated on synthetic datasets involving random networks of increasing size (up to 100 nodes), for both linear and nonlinear dynamics. The performance increased with the length of the time series, reaching consistently high precision, recall, and specificity (>98% on average) for 10,000 time samples. Varying the statistical significance threshold showed a more favorable precision-recall trade-off for longer time series. Both the network size and the sample size are one order of magnitude larger than previously demonstrated, showing feasibility for typical EEG and magnetoencephalography experiments.

The general approach to network model construction can be outlined as follows: for any target process (element) in the system, the inference algorithm selects the minimal set of processes that collectively contribute to the computation of the target's next state. Every process can be separately studied as a target, and the results can be combined into a directed network describing the information flows in the system. This approach faces several challenges. First, the state space of the possible network models grows faster than exponentially with respect to the size of the network. Second, information-theoretic estimators suffer from the "curse of dimensionality" for large sets of variables (Paninski, 2003; Roulston, 1999). Third, in a network setting, statistical significance testing requires multiple comparisons; this results in a high false positive rate (type I errors) without adequate family-wise error rate controls (Dickhaus, 2014) or a high false negative rate (type II errors) with naive control procedures. Finally, nonparametric statistical testing based on shuffled surrogate time series is computationally demanding but currently necessary when using general information-theoretic estimators (Bossomaier et al., 2016; Lindner et al., 2011).

Several previous studies (Faes et al., 2011; Lizier & Rubinov, 2012; Sun et al., 2015; Vlachos & Kugiumtzis, 2010) proposed greedy algorithms to tackle the first two challenges outlined above (see a summary by Bossomaier et al., 2016, sec. 7.2). These algorithms mitigate the curse of dimensionality by greedily selecting the random variables that iteratively reduce the uncertainty about the present state of the target. The reduction of uncertainty is rigorously quantified by the information-theoretic measure of conditional mutual information (CMI), which can also be interpreted as a measure of conditional independence (Cover & Thomas, 2005). In particular, these previous studies employed multivariate forms of the transfer entropy, that is, conditional and collective forms (Lizier et al., 2008, 2010).
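The greedy selection principle described here can be sketched in a few lines. The sketch below is illustrative only, not IDTxl's implementation: it assumes jointly Gaussian variables (so CMI can be computed from covariance determinants) and uses a fixed CMI threshold (`min_cmi`) as a simplified stand-in for the hierarchical statistical tests used in practice. The names `gaussian_cmi` and `greedy_select` are made up for this example.

```python
import numpy as np

def gaussian_cmi(x, y, z):
    """Estimate I(X;Y|Z) in nats under a jointly Gaussian assumption,
    from log-determinants of sample covariance matrices.
    x, y: 1-D arrays of samples; z: 2-D array (samples x conditioning vars)."""
    def logdet(cols):
        cov = np.atleast_2d(np.cov(np.column_stack(cols), rowvar=False))
        return np.linalg.slogdet(cov)[1]
    if z.shape[1] == 0:
        # Unconditional MI: I(X;Y) = 0.5 * [log det Cxx + log det Cyy - log det Cxy]
        return 0.5 * (logdet([x]) + logdet([y]) - logdet([x, y]))
    return 0.5 * (logdet([x, z]) + logdet([y, z])
                  - logdet([z]) - logdet([x, y, z]))

def greedy_select(target, candidates, min_cmi=0.01):
    """Greedily pick the candidate variables that iteratively reduce the
    uncertainty about the target, measured by CMI conditioned on the
    already selected set. Stops when the best remaining gain falls below
    min_cmi (a crude stand-in for a significance test)."""
    n = len(target)
    selected, remaining = [], list(range(len(candidates)))
    while remaining:
        z = (np.column_stack([candidates[i] for i in selected])
             if selected else np.empty((n, 0)))
        gains = [gaussian_cmi(candidates[i], target, z) for i in remaining]
        best = int(np.argmax(gains))
        if gains[best] < min_cmi:
            break
        selected.append(remaining.pop(best))
    return selected
```

Conditioning each new candidate on the already selected set is what avoids redundant inferences: a variable that merely duplicates information already captured contributes near-zero CMI and is rejected.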
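The shuffled-surrogate testing mentioned among the challenges can be sketched as a generic permutation test: the dependency statistic is recomputed on shuffled copies of the source to build an empirical null distribution. This is a minimal sketch under simplifying assumptions, not the hierarchical procedure the paper employs; the `stat` callable and parameter names are illustrative, and for autocorrelated time series circular shifts of the source would be preferable to full permutations.

```python
import numpy as np

def surrogate_test(source, target, stat, n_surrogates=200, seed=0):
    """Nonparametric significance test: compare the observed dependency
    statistic stat(source, target) against a null distribution obtained
    by shuffling the source. Shuffling destroys the source-target
    relationship while preserving each series' marginal distribution."""
    rng = np.random.default_rng(seed)
    observed = stat(source, target)
    null = np.empty(n_surrogates)
    for k in range(n_surrogates):
        null[k] = stat(rng.permutation(source), target)
    # p-value with the +1 correction, so p is never exactly zero
    p_value = (np.sum(null >= observed) + 1) / (n_surrogates + 1)
    return observed, p_value
```

The loop over surrogates is what makes this approach computationally demanding: every candidate link requires hundreds of re-estimations of the statistic, which is also why efficient parallelization matters at network scale.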