A Fault Modeling Tool for NoCs
Students: Konstantinos Aisopos (Princeton, MIT), Chia-Hsin Owen Chen (MIT)
PI: Li-Shiuan Peh (MIT)

Overview: We have developed an accurate fault modeling tool that captures variation-induced faults in Networks-on-Chip (NoCs). The core of the fault model has circuit-level accuracy, while its system-level interface eases integration with any system-level simulator. The circuit-level accuracy comes from how the model was built: we synthesized a highly parameterizable router RTL (at 45nm) and ran Monte Carlo simulations directly on SPICE model netlists to capture timing violations caused by manufacturing process variations. The system-level interface wraps this model and abstracts away the circuit-level complexity, which eases the evaluation of resilient NoC designs. System designers only need to supply high-level NoC parameters to explore the probability of occurrence of a number of system-level fault types. The tool can be plugged into any system-level NoC simulator with just 3 function calls (integration typically takes no more than an hour; see the instructions for integrating with Garnet). It is also already part of GEM5: set the enable_fault_model flag to TRUE and pass --network-fault-model as an input parameter.

The fault model interfaces with each simulated router as follows:
[Figure: fault_model_interface.JPG]

Interfacing with our code is simple and requires only 3 functions:
void FaultModel_initialize(void) : called once, to initialize our fault model.
int FaultModel_declare_router(int inputs, int outputs, int vcs_per_input, int buffers_per_vc) : called once for each router, to provide its configuration to the fault model. The fault model returns a unique routerID assigned to the router.
bool FaultModel_fault_vector(int routerID, int temperature, float fault_vector[]) : called at runtime, to fill fault_vector[] with the probability of occurrence for each fault type, for the current temperature of the requested router.

Download the source code: [click here].
Questions? Contact kaisopos (at) csail (dot) mit (dot) edu

This tool is free to use for academic research, but we request that you cite our paper [PDF] [BibTex] :
Konstantinos Aisopos, Chia-Hsin Owen Chen, and Li-Shiuan Peh.
Enabling System-Level Modeling of Variation-Induced Faults in Networks-on-Chip.
In Proceedings of the 48th Design Automation Conference 2011 (DAC '11).