Training network samples

Robustness predictions for user-supplied networks are based on a set of training samples that we studied extensively. These samples include subnetworks extracted with GeneNetWeaver from the transcriptional regulatory network of baker's yeast. Subnetworks extracted this way are known to retain network modularity.

Feature importances

The predictions in this tool are made using random forest regression models. Each model captures feature information for one network size at one loss percentage. With 5 network sizes and 7 loss percentages, there are 35 regression models in total, and each heatmap compares the samples of one network size across the 7 loss percentages. Feature importance is a measure of how much each feature contributes to the predictions of a random forest regression model.
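As a rough illustration of how a random forest regressor exposes feature importances, the sketch below fits scikit-learn's RandomForestRegressor on synthetic data. The use of scikit-learn, the feature names and the synthetic target are all assumptions made for illustration; which library and features the tool actually uses is not stated here.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical feature names; the tool's real feature set is not listed here.
FEATURE_NAMES = ["feature_a", "feature_b", "feature_c"]


def fit_and_rank(X, y, seed=0):
    """Fit one random forest regressor and return {feature: importance}."""
    model = RandomForestRegressor(n_estimators=100, random_state=seed)
    model.fit(X, y)
    # feature_importances_ is impurity-based and sums to 1 across features.
    return dict(zip(FEATURE_NAMES, model.feature_importances_))


# Synthetic stand-in for one (network size, loss %) training set.
rng = np.random.default_rng(0)
X = rng.random((200, 3))
# By construction, only feature_a drives the target.
y = 3.0 * X[:, 0] + 0.1 * rng.standard_normal(200)

importances = fit_and_rank(X, y)
for name, value in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {value:.3f}")
```

In a setup like the one described, one such model would be fitted for each combination of network size and loss percentage, giving 35 models.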

The following heatmaps show the feature importances for every network size (100, 200, 300, 400 and 500). Each heatmap cell is the average feature importance across 100 runs. 100 network size contains samples. 200 network size contains samples. 300 network size contains samples. 400 network size contains samples. 500 network size contains samples. Figures showing the distribution of variance in the importance of each feature (across 100 runs) will be available shortly.
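The averaging behind each heatmap cell, and the per-feature variance across runs, can be sketched in plain Python as follows. This is a minimal sketch: the function name and the importance values are made up for illustration, and only 3 runs are used instead of 100 to keep the example short.

```python
from statistics import fmean, pvariance


def summarize_runs(runs):
    """Return the mean and variance of each feature's importance across runs.

    `runs` is a list of per-run importance lists, one value per feature.
    """
    per_feature = list(zip(*runs))  # transpose: one tuple of values per feature
    means = [fmean(values) for values in per_feature]
    variances = [pvariance(values) for values in per_feature]
    return means, variances


# Made-up importances from 3 runs over 3 features, for one loss percentage.
runs = [
    [0.50, 0.30, 0.20],
    [0.60, 0.25, 0.15],
    [0.40, 0.35, 0.25],
]
means, variances = summarize_runs(runs)
print(means)
print(variances)
```

Under this reading, the list of means for one loss percentage fills one line of cells in the heatmap for a given network size, and repeating the computation for all 7 loss percentages completes that heatmap.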

100 network size

200 network size

300 network size


400 network size

500 network size

Coefficient of determination distribution
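For reference, the coefficient of determination (R squared) compares a model's squared prediction error with the error of always predicting the mean of the observed values: it is 1 minus the ratio of the residual sum of squares to the total sum of squares. The minimal sketch below computes it in plain Python. The function name and the example values are illustrative, and how the tool collects the distribution (for example, one value per run) is an assumption rather than something stated here.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R squared is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


y_true = [1.0, 2.0, 3.0, 4.0]
perfect = r_squared(y_true, y_true)         # predictions match exactly
baseline = r_squared(y_true, [2.5] * 4)     # always predict the mean
print(perfect, baseline)
```

A value of 1 means the predictions match the observations exactly, a value of 0 means the model does no better than predicting the mean, and negative values mean it does worse.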

100 network size

200 network size

300 network size

400 network size

500 network size