At-Scale Debug

Embedded run-control is fast becoming a standard for remote debug on Intel server designs. When evaluating a solution for this application, it’s important to understand the key functionality and performance criteria.

The technical background on the embedded Intel In-Target Probe (known as embedded ITP, or eITP) is covered in some of my other blogs, including BMC-Assisted Debug and At-Scale Debugging. The run-control library and the JTAG Master scan engine driver reside on a board’s BMC, and the BMC is connected to the main processor’s JTAG scan chain. The underlying ITP primitives, such as EnterDebugMode, ReadMSR, and WriteIO, are made available within the BMC’s firmware stack, which is typically (but not necessarily) based on Linux. Applications such as Python-based CScripts can run on a remote workstation and connect to the target through a socket server. The topology looks like this:

[Figure: eITP with remote host]
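
As a rough illustration of this topology, here is a minimal Python sketch of a socket server exposing named primitives and a remote client calling them. Everything in it is an assumption made for illustration: the JSON-lines wire format, the handler class, and the stand-in functions are hypothetical and are not the actual eITP or CScripts interfaces. The "BMC" side runs on the loopback interface so the sketch is self-contained.

```python
import json
import socket
import socketserver
import threading

# Hypothetical stand-ins for run-control primitives on the BMC.
# Real primitives would drive the JTAG scan chain; these return fake data.
def enter_debug_mode():
    return {"halted": True}

def read_msr(address):
    return {"address": address, "value": 0}

PRIMITIVES = {"EnterDebugMode": enter_debug_mode, "ReadMSR": read_msr}

class RunControlHandler(socketserver.StreamRequestHandler):
    """Reads one JSON request per line and answers with one JSON line."""

    def handle(self):
        for line in self.rfile:
            request = json.loads(line)
            func = PRIMITIVES.get(request["op"])
            if func is None:
                reply = {"ok": False, "error": "unknown op"}
            else:
                reply = {"ok": True, "result": func(*request.get("args", []))}
            self.wfile.write((json.dumps(reply) + "\n").encode())

def call(sock_file, op, *args):
    """Client side: send one request and wait for its reply."""
    message = json.dumps({"op": op, "args": list(args)}) + "\n"
    sock_file.write(message.encode())
    sock_file.flush()
    return json.loads(sock_file.readline())

def demo():
    # Port 0 asks the OS for any free port; a real target would use a fixed one.
    server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), RunControlHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    with socket.create_connection((host, port)) as sock:
        with sock.makefile("rwb") as sock_file:
            replies = [
                call(sock_file, "EnterDebugMode"),
                call(sock_file, "ReadMSR", 0x10),
                call(sock_file, "NotAPrimitive"),
            ]
    server.shutdown()
    server.server_close()
    return replies

if __name__ == "__main__":
    for reply in demo():
        print(reply)
```

The point of the sketch is that the remote host only exchanges whole commands and results with the target; the scan-chain work behind each primitive stays on the BMC.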

With the remote connection, Intel CScripts such as sysError() and pcie.lt_loop() can be run remotely. The powerful sysError() utility dumps all error registers within the system. The lt_loop() function exercises the PCI Express Link Training and Status State Machine (LTSSM) using, for example, repeated link retrains.

Using a benchtop Intel ITP box, or ASSET’s SourcePoint debugger, these routines can be executed very quickly. For example, the sysError() routine might take a minute or two, depending on the number of cores of the target platform, the horsepower of the remote host, and other factors.

A remote application will be slower because there is no dedicated hardware probe connected to the target; instead, the functionality resides on the BMC, which can be a lower-performance compute engine. Two things are critical to high performance:

  1. Having a BMC with the JTAG Master function implemented in hardware (silicon), as opposed to bit-banging. The Pilot4 is one example.
  2. Having the run-control library embedded on the target, as opposed to running far away on the remote host. This keeps the run-control functions and the JTAG mastering function tightly coupled, without transport and handshaking overhead going back and forth for each exchange of bits.
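
To see why these two factors matter, a back-of-the-envelope cost model can help. The Python sketch below is only that: the function, its parameters, and every number in it are illustrative assumptions, not measurements of any product. It assumes that a hardware JTAG master versus bit-banging shows up as a difference in effective clock rate, and that a host-resident library pays one network round trip per scan while a target-resident library pays one per high-level command.

```python
def estimated_seconds(commands, scans_per_command, bits_per_scan,
                      tck_hz, round_trip_s, library_on_target):
    """Rough time estimate: bit-shifting time plus network round-trip time."""
    total_scans = commands * scans_per_command
    # Time spent shifting bits through the scan chain at the effective clock.
    shift_time = total_scans * bits_per_scan / tck_hz
    # Library on the target: one round trip per high-level command.
    # Library on the host: one round trip per individual scan (assumed).
    round_trips = commands if library_on_target else total_scans
    return shift_time + round_trips * round_trip_s

if __name__ == "__main__":
    # Illustrative workload only; all values are assumptions.
    workload = dict(commands=10, scans_per_command=100, bits_per_scan=64,
                    round_trip_s=0.001)
    cases = {
        "hardware master, library on target": (1_000_000, True),
        "hardware master, library on host": (1_000_000, False),
        "bit-banged, library on host": (10_000, False),
    }
    for name, (tck_hz, on_target) in cases.items():
        seconds = estimated_seconds(tck_hz=tck_hz, library_on_target=on_target,
                                    **workload)
        print(f"{name}: {seconds:.3f} s")
```

Even with made-up numbers, the shape of the result is the useful part: the round-trip term grows with the number of scans when the library sits on the host, and the shift term grows as the effective clock rate drops.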

How much slower will the embedded solution be? With bit-banging, and with heavy transport overhead going back and forth between the host and the target, the slowdown will be significant.

Another evaluation criterion in selecting a solution is run-control isolation of the target. With the run-control library on the target, the need for a host connection can be eliminated in some instances. This allows the solution to be truly “at-scale”, with hundreds or thousands of debug agents functioning independently and securely, without the need to create a remote chain of trust.

For more information on ASSET’s superior embedded ITP implementation, ScanWorks Embedded Diagnostics (SED), see here (note: requires registration).