Testing PCI Express

PCI Express (PCIe) buses, particularly Gen3, are susceptible to defects that conventional test may not reveal. What are these defects, and how are they detected?


PCIe uses interference-canceling differential signaling and jitter-canceling embedded clocking to deliver very robust serial data transfer. Common-mode rejection, whereby signals common to both the P and N sides of a transmitter or receiver are rejected, fends off most sources of interference, and allows for a very survivable data path.
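As a rough numerical illustration of common-mode rejection, the sketch below models an idealized differential receiver. The voltage and noise values are made up, and the `mismatch` parameter is a hypothetical stand-in for anything that makes the two legs pick up interference unequally.

```python
def receive(signal, noise, mismatch=0.0):
    """Differential voltage seen by an idealized receiver.

    signal:   intended voltage on the P leg (the N leg carries -signal)
    noise:    interference coupled onto the pair
    mismatch: fraction by which the N leg picks up less noise than the P leg
              (hypothetical parameter, for illustration only)
    """
    p_leg = signal + noise
    n_leg = -signal + noise * (1.0 - mismatch)
    return p_leg - n_leg


# Made-up per-leg signal levels and interference samples, in volts.
symbols = [0.2, -0.2, 0.2, 0.2, -0.2]
noise = [0.05, -0.10, 0.30, 0.00, -0.25]

# Perfectly matched legs: the noise term cancels in the subtraction.
balanced = [receive(s, x) for s, x in zip(symbols, noise)]

# 20% mismatch: part of the noise survives as a differential error.
unbalanced = [receive(s, x, mismatch=0.2) for s, x in zip(symbols, noise)]

for s, b, u in zip(symbols, balanced, unbalanced):
    print(f"sent {2 * s:+.2f} V  balanced {b:+.3f} V  20% mismatch {u:+.3f} V")
```

With matched legs the receiver recovers the intended differential voltage regardless of the noise. Once the legs are mismatched, a fraction of the interference leaks through.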

But engineers familiar with PCIe know that hardware reliability issues may cause changes in link width and/or link speed. For example, an open circuit on one lane will cause a change in link width from, say, x16 down to x8. This will allow the bus to continue to operate, albeit at reduced throughput. Other defects, such as a short circuit between two adjacent transmit traces from different lanes, or a degradation in the REF_CLK, may cause a reduction in link speed; for example, from 8 GT/s to 5 GT/s. An excellent PCI-SIG paper on how this is handled for PCI Express Gen2 is here:

http://www.pcisig.com/developers/main/training_materials/get_document?doc_id=090831b9a2b1210b2822c06e469992d9d028f13d

Structural defects such as these opens and shorts are best detected by boundary scan on PCIe AC-coupled nets (assuming the associated devices support IEEE 1149.6). The impact of clocking marginalities on link performance can be detected by any functional test that can access the PCIe configuration registers of the device in question, such as processor-controlled test.
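A minimal sketch of such a register check follows. It assumes the test platform has already read the raw Link Capabilities and Link Status registers from the device's PCI Express capability structure. In both registers, bits 3:0 carry the speed code and bits 9:4 carry the width. The function names and sample register values are hypothetical, and the speed table covers only the Gen1 to Gen3 encodings.

```python
# Speed codes for Gen1..Gen3, as used in the Link Capabilities / Link Status fields.
SPEED_GT_S = {1: 2.5, 2: 5.0, 3: 8.0}


def decode_link_field(reg):
    """Split a Link Capabilities or Link Status value into (speed code, width)."""
    speed_code = reg & 0xF          # bits 3:0
    width = (reg >> 4) & 0x3F       # bits 9:4
    return speed_code, width


def check_link(link_cap, link_status):
    """Return a list of human-readable findings; empty means no degradation seen."""
    max_speed, max_width = decode_link_field(link_cap)
    cur_speed, cur_width = decode_link_field(link_status)
    findings = []
    if cur_width < max_width:
        findings.append(f"width x{cur_width}, device capable of x{max_width}")
    if cur_speed < max_speed:
        cur = SPEED_GT_S.get(cur_speed, "unknown")
        cap = SPEED_GT_S.get(max_speed, "unknown")
        findings.append(f"speed {cur} GT/s, device capable of {cap} GT/s")
    return findings


# Hypothetical raw values: capable of x16 at 8 GT/s, trained to x8 at 5 GT/s.
example_cap = (16 << 4) | 3
example_status = (8 << 4) | 2

for finding in check_link(example_cap, example_status):
    print("DEGRADED:", finding)
```

A link trains to the best width and speed that both partners support, so a real test would compare the status against the expected value for that slot rather than against one device's capability alone.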

Other types of structural defects, such as a short between a REF_CLK net and GND, will be entirely invisible to any conventional functional test, but will cause impairments such as increased crosstalk. Still other defects, such as current draws in one device causing a power droop in an adjacent device, errant via stubs, and variations in trace width, thickness and roughness, can handicap common-mode rejection, resulting in smaller eyes and higher bit error rates. Structural defects are again detectable by boundary scan (if available). More subtle defects that affect signal integrity are detectable only through bit error rate testing (BERT) and margining technologies such as HSIO for Intel Architecture.
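To give a sense of what a bit error rate measurement involves, the sketch below estimates how many bits must pass without error before a given BER can be claimed with a stated statistical confidence. It uses the standard zero-error relation N = -ln(1 - CL) / BER. The 1e-12 target and 95% confidence level are illustrative assumptions, and the function names are hypothetical.

```python
import math


def bits_for_confidence(target_ber, confidence):
    """Error-free bits needed to claim BER < target_ber at the given confidence."""
    return -math.log(1.0 - confidence) / target_ber


def seconds_at_rate(bits, gt_per_s):
    """Time to transfer that many bits on one lane at the given raw rate."""
    return bits / (gt_per_s * 1e9)


# Illustrative assumptions: BER target of 1e-12, 95% confidence, one 8 GT/s lane.
needed = bits_for_confidence(1e-12, 0.95)
duration = seconds_at_rate(needed, 8.0)

print(f"bits required: {needed:.3e}")
print(f"time on one 8 GT/s lane: {duration:.0f} s")
```

Under these assumptions the run takes several minutes per lane, which is one reason margining techniques that stress the link are attractive for finding subtle defects.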

Alan Sguigna