Developing a new standard through the IEEE, JEDEC or any other credible industry group is not a trivial undertaking, but the outcome typically generates widespread benefits for an entire industry. It is tempting to assume that similar standards overlap in capabilities and can be deployed interchangeably. This misses the point, and the value, of new standards.
A recently minted standard, the IEEE 1149.7 enhanced boundary-scan standard, and the soon-to-be-ratified IEEE P1687 standard for embedded instruments offer significant value over the related IEEE 1149.1 boundary-scan standard and the IEEE 1500 standard for embedded core test.
At a time when several semiconductor companies are expressing significant interest in both 1149.7 and P1687, it’s unfortunate that referring to the P1687 preliminary standard as IJTAG (for internal JTAG) has led to a number of misconceptions. Some in the industry mistakenly perceive P1687 as 1149.1 boundary scan turned outside-in for ‘inside-the-chip’ test. They sometimes refer to P1687 as “Just more 1149.1 plumbing.” This is far from the truth. Like many if not all standards development bodies, the working group for P1687 was intent on specifying a standard that met real needs in the industry. In fact, the working group polled the industry extensively to determine the deficiencies of existing standards, including 1149.1 boundary scan and 1500 core test. In a very real sense, the five years of work that’s been invested in developing P1687 has been a long and involved response to the group’s original findings concerning the needs of the industry. The same could be said of the 1149.7 development effort.
Overcoming inertia is fundamental to widespread adoption of new standards. Often, the tendency is to stay with what’s known instead of moving on to what’s new. When this happens, the old standards, which were certainly quite capable for their intended purposes, are often misappropriated and shoehorned into applications for which they were never designed. Some of this has happened with the older, more established 1149.1 and 1500 standards.
There are a number of very telling technical differences between 1149.1/1500 and P1687. For example, the architecture upon which 1149.1 and 1500 depend is dualistic in that there are distinct data and instruction structures that make use of an instruction register (IR) where encoded data is stored. So, if 1149.1/1500 were to be appropriated for embedded instrumentation, an instrument embedded in a chip would need to support a register much like boundary scan’s Test Data Register (TDR). Instructions for this register would come from the chip’s 1149.1 Test Controller Instruction Register or 1500’s Wrapper Instruction Register (WIR). To create the active chip-level scan chain that includes the TDR associated with an instrument, an instruction encoding would be installed which would connect the selected TDR to the chip’s Test Data In (TDI) and Test Data Out (TDO) pins. This would allow serial scan data to pass through the chip and past the embedded instrument’s internal signals.
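The instruction-selected arrangement described above can be illustrated with a toy model. This is only a sketch: the class name `InstructionSelectedTap`, the opcode names and the register widths are all hypothetical and do not come from any real BSDL file or from the standards themselves. It shows the one property that matters here, namely that each instrument needs its own instruction encoding and only one TDR sits between TDI and TDO at a time.

```python
class InstructionSelectedTap:
    """Toy model: the instruction register selects ONE data register at a time."""

    def __init__(self, decode, registers):
        # decode: hypothetical opcode name -> register name
        # registers: register name -> list of bits (index 0 is nearest TDO)
        self.decode = decode
        self.registers = registers
        self.ir = "BYPASS"

    def load_instruction(self, opcode):
        # Every instrument needs its own encoding in the decode table.
        if opcode not in self.decode:
            raise ValueError(f"no instruction encoding for {opcode!r}")
        self.ir = opcode

    def scan_dr(self, bits_in):
        """Shift bits through the selected register; return the bits seen on TDO."""
        reg = self.registers[self.decode[self.ir]]
        bits_out = []
        for bit in bits_in:
            bits_out.append(reg.pop(0))  # bit nearest TDO leaves first
            reg.append(bit)              # new bit enters from the TDI side
        return bits_out


tap = InstructionSelectedTap(
    decode={"BYPASS": "bypass", "BIST_A": "tdr_a", "BIST_B": "tdr_b"},
    registers={"bypass": [0], "tdr_a": [0, 0, 0, 0], "tdr_b": [0, 0]},
)
tap.load_instruction("BIST_A")
shifted_out = tap.scan_dr([1, 0, 1, 1])

print(tap.registers["tdr_a"])  # [1, 0, 1, 1]
print(tap.registers["tdr_b"])  # [0, 0]  (not reachable without a new instruction)
```

Reaching `tdr_b` requires loading a different instruction first, and reaching both registers in one scan would require yet another encoding that concatenates them.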
This type of arrangement becomes problematic when the number of embedded instruments exceeds 10 or so. At 50, the situation becomes positively unwieldy. Today, many system-on-chip (SoC) devices can have upwards of 300 memories with anywhere from 30 to 300 built-in self test (BIST) instruments to test those memories.
Figure 1: 1149.1 Isn’t Good Enough
In a daisy-chain configuration such as this, 1149.1 does not provide efficient access to multiple embedded instruments. In addition, instruments embedded in low-power partitions in a design will not function well in a daisy-chain architecture.
Handling the extra instruction for each of those 50 instruments is just the beginning of the problem. Chip manufacturers are under extreme pressure to reduce test time and thereby reduce test costs. Reducing test time means that a chip spends less time on an expensive ATE test system, where in-socket costs of eight cents a second can quickly add up to unacceptable overall test costs when test times are long. One way to reduce test time is to schedule multiple tests at the same time. Unfortunately, chip designers often don’t know which tests they can or want to run simultaneously until the chip is on the ATE system for manufacturing test development. So, if there were 50 instruments on a chip and the developers knew that they wanted to run six tests simultaneously, they would have to develop 1149.1 JTAG instructions for every possible combination of six instruments out of the 50 that are on-chip. That means nearly 16 million instructions (50 choose 6 is 15,890,700) would have to be created, a daunting undertaking to say the least.
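The arithmetic behind this example is easy to check. The short sketch below uses the figures from the text (50 instruments, six concurrent tests, eight cents a second); the two-second test duration in the cost comparison is a made-up value chosen purely for illustration.

```python
import math

INSTRUMENTS_ON_CHIP = 50     # from the example in the text
CONCURRENT_TESTS = 6         # from the example in the text
COST_CENTS_PER_SECOND = 8    # in-socket ATE cost quoted in the text

# One hard-wired instruction per possible group of six instruments.
instructions_needed = math.comb(INSTRUMENTS_ON_CHIP, CONCURRENT_TESTS)
print(f"{instructions_needed:,} instruction encodings")  # 15,890,700

# Minimum instruction-register width able to hold that many distinct opcodes.
ir_bits = (instructions_needed - 1).bit_length()
print(f"{ir_bits}-bit instruction register")  # 24-bit

# Illustrative cost comparison. ASSUMPTION: every test takes 2 seconds.
HYPOTHETICAL_TEST_SECONDS = 2
serial_seconds = INSTRUMENTS_ON_CHIP * HYPOTHETICAL_TEST_SECONDS
grouped_seconds = (
    math.ceil(INSTRUMENTS_ON_CHIP / CONCURRENT_TESTS) * HYPOTHETICAL_TEST_SECONDS
)
serial_cents = serial_seconds * COST_CENTS_PER_SECOND
grouped_cents = grouped_seconds * COST_CENTS_PER_SECOND
print(serial_cents, grouped_cents)  # 800 144
```

Under that assumed duration, running the tests six at a time cuts in-socket cost from 800 cents to 144 cents per device, which is why concurrent scheduling is worth so much effort.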
Clearly, imposing the 1149.1 JTAG/1500 architectural scheme on this sort of situation brings with it all of the limitations of the architecture. In contrast, the P1687 standard has foreseen these sorts of difficulties and solves them with its Segment Insertion Bit (SIB). The SIB is a single TDR bit placed anywhere in the P1687 architecture, which is sometimes referred to as the P1687 Scan Instrument Access Network. The SIB can open (add) or close (subtract) a scan path. In this way, the SIB acts as a single external bypass register capable of providing access to an embedded instrument. As a result, scheduling tests through the various instruments on a chip is as simple as placing logic ‘ones’ in certain scan path configuration registers and executing an 1149.1 Update (commonly referred to as an UpdateDR). When this has completed, access to all of the needed instruments will be available in an active scan chain so that multiple tests can be launched simultaneously.
Figure 2: A P1687 Scan Instrument Access Network
P1687’s Segment Insertion Bit (SIB) can be placed anywhere in the P1687 architecture for opening or closing scan paths to embedded instruments. This simplifies the scheduling of tests for the many instruments that may be on-chip.
Once instruments are started, the scan paths providing access to them can be closed without stopping the instruments, a capability that is unique to P1687 networks. In fact, other P1687 scan operations can be going on without disturbing the instruments on a closed scan path. Only an intentional access or a JTAG reset will change the state of a scan path that is not actively selected.
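A minimal sketch of this behavior follows. It is a deliberate simplification: the `Sib` class, the instrument names and the 8-bit segment width are hypothetical, and the model tracks only which segments are spliced into the active path instead of shifting real bits. It illustrates two points from the text: scheduling any group of instruments takes one bit per SIB in place of one instruction per combination, and closing a path does not stop an instrument that is already running.

```python
class Sib:
    """Toy Segment Insertion Bit guarding one instrument's TDR segment."""

    def __init__(self, name, segment_bits):
        self.name = name
        self.segment_bits = segment_bits   # width of the TDR behind this SIB
        self.is_open = False               # closed after reset
        self.instrument_running = False


def active_path_bits(sibs):
    """Length of the active scan path: one bit per SIB plus every open segment."""
    return sum(1 + (s.segment_bits if s.is_open else 0) for s in sibs)


def update_dr(sibs, open_names):
    """Apply the shifted-in select values, as would happen at UpdateDR."""
    for sib in sibs:
        sib.is_open = sib.name in open_names


# 50 hypothetical BIST instruments, each with an 8-bit TDR.
sibs = [Sib(f"bist{i}", 8) for i in range(50)]
length_all_closed = active_path_bits(sibs)            # 50 SIB bits only

# Open six paths chosen late, on the tester, with a single scan.
chosen = {"bist0", "bist7", "bist12", "bist20", "bist33", "bist49"}
update_dr(sibs, chosen)
length_six_open = active_path_bits(sibs)              # 50 + 6 * 8

# Start the instruments that are now reachable.
for sib in sibs:
    if sib.is_open:
        sib.instrument_running = True

# Close every path again; the instruments keep running.
update_dr(sibs, set())
still_running = sum(s.instrument_running for s in sibs)

print(length_all_closed, length_six_open, active_path_bits(sibs), still_running)
# 50 98 50 6
```

Any other group of six is reached the same way, by shifting a different pattern of ones into the same 50 bits.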
Other difficulties with the 1149.1/1500 architecture arise from the parallel connectivity of some of the registers that are specified in the standards. The standards dictate that access to the Bypass, Boundary, ID Code, Private, Instruction, User, Core and other registers is arranged in a “swap a register” configuration where one of these registers is connected to the TDI-TDO pins at a time. To complete a scan chain by connecting a chip’s TDI and TDO pins, one of these parallel registers is mapped by the instruction in the instruction register. When more than one of the parallel registers is needed, they can be concatenated together to produce the active register between TDI and TDO if a designer has created this configuration. Unfortunately, this technique is often at odds with power-saving schemes that are prevalent in many contemporary SoCs, microprocessors and application-specific integrated circuits (ASICs), many of which are deployed in handheld systems.
One of the ways that power consumption is reduced in these devices is to turn off on-chip logic partitions when they are inactive. At the gate level, this is done by gating off connections to clocks or even power rails and buffers. The problem arises when a daisy-chain scan path that includes a parallel set of JTAG registers passes through a partition that is subject to being shut down to save power. The bypass register, which would be used to minimize the set of co-located swap registers, usually shares control signals, clock and power with the other registers. When the power rails or clock drivers to a partition or zone on the device are shut off, either the JTAG scan chain passing through this zone is broken or the bypass register must stay active and continue to consume power.
Once again, the P1687 SIB comes to the rescue. As a means for adding or subtracting scan paths, a P1687 SIB in a power-on zone can send Select, Scan-In and Scan-Out route data to the TDR in the segment of a scan path that’s associated with an embedded instrument in a power-off zone. The SIB propagates data from its in-line Scan-In to Scan-Out ports and it can be made to open (add) a scan path to the embedded instrument only if the power or clock for that instrument is turned on. Moreover, a SIB can function as a power-configuration register or a fuse access to enable and disable clock/power connections.
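The contrast with a daisy chain can be sketched in a few lines. The partition names and both helper functions are hypothetical, and the model assumes, as the text describes, that the SIBs themselves sit in a zone that stays powered.

```python
def daisy_chain_is_intact(powered):
    """A daisy chain passes through every partition, so one dark zone breaks it."""
    return all(powered.values())


def sib_spliced_segments(powered, requested):
    """Segments actually added to the path by SIBs in an always-on zone.

    A SIB opens its segment only if that segment was requested AND its
    power/clock is on; otherwise the SIB simply passes data straight through.
    """
    return [name for name in powered if name in requested and powered[name]]


# Hypothetical partitions; the DSP zone is shut down to save power.
powered = {"cpu": True, "dsp": False, "modem": True}

chain_ok = daisy_chain_is_intact(powered)
spliced = sib_spliced_segments(powered, requested={"cpu", "dsp"})

print(chain_ok)  # False
print(spliced)   # ['cpu']
```

In the daisy-chain case nothing is reachable while the DSP zone is dark, whereas the SIB-based path still reaches the CPU instrument and skips the unpowered segment.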
Another aspect of 1149.1 boundary scan that has served it well, but which is inadequate for the emerging world of embedded instrumentation, is the de facto standard Serial Vector Format (SVF) language. Test vectors written in SVF, which are applied at a chip’s TAP pins, can only be generated when the chip design is complete and an accurate Boundary Scan Description Language (BSDL) model of the device is available. (Test vector generation for 1149.1 boundary scan is generally from the outside-in and based on the chip’s BSDL model data. Board test engineers do not get involved with a chip’s Verilog/VHDL description of its internal logic.) If an instrument that has been embedded in one chip is later deployed on another chip, the SVF test that invokes the 1149.1 or 1500 instructions and supplies the data and operating sequence to that instrument cannot be transported intact along with the instrument. New SVF tests must be created in association with the BSDL file for the second chip.
In addition, SVF in and of itself contains no test flow control functionality. For example, a common SVF routine for an embedded instrument would program the chip’s 1149.1 TAP controller to continually apply ScanDRs, capturing status data from the instrument and shifting it off the chip. In keeping with the original objectives of the 1149.1 boundary-scan standard, encountering a DONE signal would mean the SVF test has finished and the first instruction for the next test must be inserted into the 1149.1 IR. The operation is different if a FAIL signal is encountered. Then, the test is either finished early or an error value is generated before the test is finished. Deploying more sophisticated flow control, such as looping, jumps, conditional branching and other if-then-else types of behavior, would require an external piece of software, such as a C++ program or a Python wrapper, to sequence different SVF snippets and apply the flow control externally.
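The kind of external wrapper mentioned here might look like the sketch below. Everything in it is an assumption made for illustration: the `DONE` and `FAIL` bit positions, the `poll_instrument` function and the `read_status` callback, which stands in for whatever tool actually plays an SVF snippet and returns the captured status bits.

```python
# Hypothetical status bit assignments for an embedded instrument.
DONE = 0b01
FAIL = 0b10


def poll_instrument(read_status, max_polls=1000):
    """Loop and branch OUTSIDE of SVF, because SVF itself cannot.

    read_status is a callable that applies one ScanDR (for example by playing
    an SVF snippet through a hypothetical tool) and returns the status bits.
    """
    for _ in range(max_polls):
        status = read_status()
        if status & FAIL:
            return "fail"      # stop early on an error
        if status & DONE:
            return "pass"      # move on to the next test
    return "timeout"


# Simulated instrument: busy twice, then done.
simulated = iter([0, 0, DONE])
result = poll_instrument(lambda: next(simulated))
print(result)  # pass
```

The loop, the early exit and the timeout all live in the wrapper, so they have to be rewritten or at least revalidated whenever the underlying SVF snippets are regenerated for a different chip.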
P1687 addresses these issues with its Procedural Description Language (PDL). PDL describes a process, test or operation in terms of the interface to the embedded instrument. It is complemented by P1687’s Instrument Connectivity Language (ICL), which describes the chip’s P1687 architecture, although in far less detail than the Verilog or VHDL model that chip designers use. In essence, PDL creates a portable vector format insofar as an instrument’s PDL can accompany it no matter where the instrument is embedded. For example, the same instrument might be deployed several times in a chip, or at different levels within a chip’s hierarchical architecture, or it might be implemented in several different chips with distinct configurations and logic.
PDL is portable because its vectors can be re-targeted to the TDI and TDO pins on any chip by way of the chip’s ICL, which provides the description of the chip’s P1687 network mapping. PDL also takes into consideration variable-length dynamically-changeable scan paths (that is, scan paths that change in length for any given UpdateDR).
PDL also includes some rudimentary flow control functions in the form of Tcl procedures. The basic language is built on iWrite commands (to an instrument interface), iRead commands (from an instrument interface) and iScan commands (to some portion of the P1687 scan network). The iApply construct can schedule iRead and iWrite commands to occur in the same ScanDR or in different ScanDRs.
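The retargeting idea can be sketched in Python. This is not PDL or ICL syntax: the `Retargeter` class, its `i_write` and `i_apply` methods, and the register names and layouts are hypothetical stand-ins. The layout list plays the role that ICL plays in the text (a description of where each register sits in the chip’s scan network), and registers that receive no write are simply filled with zeros, which a real tool would not necessarily do.

```python
class Retargeter:
    """Toy retargeter: instrument-level writes become one chip-level scan vector."""

    def __init__(self, layout):
        # layout: ordered (register_name, width) pairs from TDI to TDO,
        # standing in for the chip's network description.
        self.layout = layout
        self.pending = {}

    def i_write(self, register, value):
        """Queue a write to an instrument register; nothing is scanned yet."""
        self.pending[register] = value

    def i_apply(self):
        """Merge all queued writes into a single ScanDR bit string."""
        vector = ""
        for name, width in self.layout:
            value = self.pending.get(name, 0)
            if value >= 1 << width:
                raise ValueError(f"{value} does not fit in {width}-bit {name}")
            vector += format(value, f"0{width}b")
        self.pending.clear()
        return vector


def start_bist(retargeter):
    """Instrument-level procedure, written once and reused on every chip."""
    retargeter.i_write("bist.ctrl", 0b1001)
    return retargeter.i_apply()


# The same instrument embedded in two hypothetical chips.
chip_a = Retargeter([("bist.ctrl", 4), ("temp.ctrl", 2)])
chip_b = Retargeter([("pll.cfg", 3), ("bist.ctrl", 4)])

vector_a = start_bist(chip_a)
vector_b = start_bist(chip_b)
print(vector_a)  # 100100
print(vector_b)  # 0001001
```

The procedure `start_bist` never changes; only the layout handed to the retargeter does, which is the sense in which the instrument’s vectors travel with it.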
Two TAPs are not better than one
Another disturbing manifestation of the tendency to implement 1149.1/1500 inappropriately is the deployment of multiple TAP controllers in chips. Generally, a TAP Controller is an 1149.1-specific state machine and all of its associated 1149.1 registers. In inappropriate embedded core uses, a dedicated TAP and TAP state-machine may be used to control a 1500 wrapper independently of any 1149.1 defined purpose. This TAP’s instruction register functions as the 1500 WIR and its boundary-scan register functions as the 1500 Wrapper Boundary Register. As it turns out, some chip designers have had the ingenious idea to use a TAP controller to manage the instruments they are embedding on their chips and to drive the 1500 architecture. This ends up looking like a boundary-scan chip on a circuit board, but it’s inside a chip. And, of course, they also implement a second TAP to perform the board test function for which the TAP was originally intended.
The situation can get worse quickly, since SoCs typically support many cores. Prominent examples include Texas Instruments’ OMAP chips with their ARM and DSP cores, but MIPS, PowerPC, ColdFire and Intel processor cores are also featured in many SoCs. Moreover, some devices feature multiple processor cores and memory cores from LogicVision and Virage. All of this means that a chip can end up with 10 or more TAPs and only one includes the BSDL description of the chip. This complicates the TAP architecture with TAP-to-TAP links, TAP multiplexing and other exotic structures, which further complicate the protocol requirements at the boundary scan TDI, TDO, TMS and TCK pins to access embedded content. In the end, the architecture can become restrictive to the point where boundary scan instructions and instructions for the on-chip embedded instruments can’t be accessed and executed at the same time since only one TAP Controller usually gets to use the TAP pins at a time.
This time 1149.7 can come to the rescue. The 1149.7 architecture, which can be implemented with as few as two pins or as many as five, supports access to multiple TAPs by assigning an address to each TAP in a manner similar to computer networking. What had been serial TAP controller operational data, like the values that operate TCK and TMS, the data provided to the TDI and the data expected from the TDO, now becomes packetized protocol data with addressing. This type of data is compatible with any number of architectures, including a broadcast star network connection that allows multiple TAPs to be operated simultaneously.
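The addressing idea can be pictured with a small model. It is purely conceptual and does not reproduce the actual 1149.7 packet format or its pin-level protocol: the `Packet` fields, the use of `None` as a broadcast address and the `AddressedTap` class are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Packet:
    address: Optional[int]  # None means broadcast (hypothetical convention)
    tms: int
    tdi: int


class AddressedTap:
    """A TAP on a shared star connection that acts only on packets meant for it."""

    def __init__(self, address):
        self.address = address
        self.received = []

    def on_packet(self, packet):
        if packet.address is None or packet.address == self.address:
            self.received.append((packet.tms, packet.tdi))


def send(taps, packet):
    """In a star topology every TAP sees every packet on the shared pins."""
    for tap in taps:
        tap.on_packet(packet)


taps = [AddressedTap(address) for address in range(3)]
send(taps, Packet(address=1, tms=0, tdi=1))     # only TAP 1 responds
send(taps, Packet(address=None, tms=1, tdi=0))  # all TAPs respond together

print(taps[0].received)  # [(1, 0)]
print(taps[1].received)  # [(0, 1), (1, 0)]
```

Because selection is carried in the data stream, no TAP-to-TAP links or multiplexers are needed in this model to reach one controller or all of them.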
So, in summary, the following are some of the many valuable features that support a migration from the existing 1149.1 and 1500 standards toward P1687 and 1149.7.
- The ability of P1687’s ICL to describe an embedded instrument’s interface. This is not provided by 1149.1 boundary scan’s BSDL, which only describes instructions and registers and contains no information about instruments.
- Portable and re-usable vectors made possible by P1687’s PDL and ICL. These vectors remain the same whenever and wherever an instrument is embedded. 1149.1’s SVF vectors can only be applied at chip pins.
- A procedural set of vectors enabled by P1687’s PDL that includes flow control at the instrument boundary. Again, SVF is not capable of flow control.
- Flexible test scheduling is made possible by access to embedded instruments via P1687’s in-line within-the-scan-path SIB (effectively, distributed “one-shot” instruction registers). This contrasts with the hard-wired instructions of 1149.1 and 1500.
- Concurrent operation and re-configuration of embedded instruments enabled by P1687’s ability to maintain a coherent scan path and access instruments even when they reside in zones where power and clocks have been shut down to reduce power consumption. Operating in a low-power environment and power management techniques are not addressed at all in 1149.1 and 1500.
- More flexible options, enabled by P1687’s SIBs, for creating a hierarchical architecture based on engineering, efficiency and operational tradeoffs. 1149.1 and 1500 are limited to daisy-chain and star configurations.
- With 1149.7, multiple 1149.1, 1500 and P1687 architectures can be accessed and operated simultaneously.
- 1149.7 reduces the number of package pins needed to access all of the embedded content within a chip or a multi-die stack.