IEEE Computer Society

COVER FEATURE
PA-RISC to IA-64:
Transparent Execution, No Recompilation
Cindy Zheng
Carol Thompson
Hewlett-Packard

Dynamic translator

The dynamic translator translates a basic block of PA-RISC instructions into dyncode and stores the translated code into the Aries code cache for subsequent use. It has four subcomponents.

PA-RISC preprocessor. The PA-RISC preprocessor scans each PA-RISC instruction in the block and records information that the code generator will need. It also performs some pretranslation optimizations that are specific to the PA-RISC architecture. For example, most PA-RISC arithmetic instructions generate carry/borrow bits that are rarely used. To avoid generating carry/borrow bits that no later instruction reads, the preprocessor tracks where each resource (such as a register) is defined and where it is used in an execution sequence. The code generator can then produce only the necessary carry/borrow bits. The preprocessor also performs constant and copy propagation to reduce dependencies among PA-RISC instructions so that the scheduler can exploit more ILP.
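The carry/borrow tracking can be illustrated with a backward scan over one block. The following Python sketch is only an illustration under stated assumptions, not Aries' actual data structures: the Insn record, its two flags, and the function name are hypothetical, and the sketch conservatively assumes the carry/borrow bits are live when the block exits.

```python
from dataclasses import dataclass

@dataclass
class Insn:
    # Hypothetical record for one preprocessed PA-RISC instruction.
    name: str
    defines_carry: bool = False  # instruction writes the carry/borrow bits
    uses_carry: bool = False     # instruction reads the carry/borrow bits

def needed_carry_defs(block, live_at_exit=True):
    """Return the indices of instructions whose carry/borrow bits must be generated."""
    needed = set()
    live = live_at_exit  # assumption: a later block may still read the bits
    for i in range(len(block) - 1, -1, -1):
        insn = block[i]
        if insn.defines_carry:
            if live:
                needed.add(i)
            live = False  # this definition overwrites any earlier one
        if insn.uses_carry:
            live = True   # the read happens before this instruction's own write
    return needed

block = [
    Insn("ADD", defines_carry=True),
    Insn("ADD", defines_carry=True),
    Insn("ADDC", defines_carry=True, uses_carry=True),
    Insn("SUB", defines_carry=True),
    Insn("LDW"),
]
print(sorted(needed_carry_defs(block)))
```

In this example only the second ADD (read by ADDC) and the SUB (possibly read after the block) need their carry/borrow bits; the other two definitions are overwritten before anything reads them.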

Code generator. The code generator translates the preprocessed PA-RISC instructions into native IA-64 instructions. Because Aries maps all PA-RISC general registers onto designated IA-64 registers in dyncode, the code generator can use the corresponding IA-64 registers to reference the PA-RISC general registers directly. This eliminates the need to fetch register values from memory. The code generator also resolves any mode differences between PA-RISC and IA-64 processes. For example, if Aries is emulating a 32-bit PA-RISC application, it must adjust address references to 64 bits. For each memory-related PA-RISC instruction, the code generator must generate an extra IA-64 instruction, addp4, to do the conversion before it generates a load or store instruction. This process is called address swizzling.
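To make the address conversion concrete, the sketch below models addp4 in Python and shows the two-instruction sequence emitted for one load. It assumes the semantics given in the IA-64 architecture manual (a 32-bit add whose upper bits are cleared, after which bits 31:30 of the pointer operand are copied to bits 62:61 of the result). The helper names and the register names in the example are hypothetical and do not reflect Aries' actual register mapping.

```python
MASK32 = 0xFFFFFFFF

def addp4(r2, r3):
    """Model of IA-64 addp4: form a 64-bit address from a 32-bit pointer in r3."""
    low = (r2 + r3) & MASK32   # 32-bit sum, upper 32 bits forced to zero
    region = (r3 >> 30) & 0x3  # top two bits of the 32-bit pointer
    return low | (region << 61)

def swizzled_load(dest, disp, base, tmp="r2"):
    """Emit the swizzle plus the load for a 32-bit load at disp(base)."""
    return [f"addp4 {tmp} = {disp}, {base}", f"ld4 {dest} = [{tmp}]"]

print(hex(addp4(8, 0x40001000)))
print(swizzled_load("r40", 8, "r41"))
```

Every memory access pays for the extra addp4 in this naive form, which is why the optimizer later tries to remove some of these pairs.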

Optimizer and scheduler. The code generator then passes the IA-64 instructions to the lightweight optimizer for optimization and scheduling. Because the optimizer runs at runtime, its optimizations must be fast and effective. Aries uses several techniques to promote efficient optimization, including, among others:

•  Dead code elimination. Aries removes redundant instructions to reduce the final translated code size.

•  Address swizzling reduction. Aries replaces certain addp4/load or addp4/store instruction pairs with a single load or store instruction to reduce total code size and total execution cycles. Figure 4 shows how the optimizer uses address swizzling reduction to optimize a sequence of consecutive memory access instructions.

•  Memory aliasing reduction. Aries distinguishes the memory access instructions it generates for register fetching from the normal memory instructions it generates for the emulated PA-RISC application. These two types of memory instructions access different memory segments that do not overlap, so Aries can safely move instructions of one type across instructions of the other type to improve ILP.
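As an illustration of the first technique in this list, the Python sketch below removes dead instructions from a single block with one backward liveness pass. It is a generic textbook formulation, not Aries' optimizer: the tuple layout (destination, sources, side-effect flag) and the register names are assumptions made for the example.

```python
def eliminate_dead_code(block, live_out):
    """Drop instructions whose result is never read and that have no side effect.

    block:    list of (dest, srcs, has_side_effect); dest is None for stores
    live_out: registers that must hold correct values when the block exits
    """
    live = set(live_out)
    kept = []
    for dest, srcs, has_side_effect in reversed(block):
        if has_side_effect or dest in live:
            live.discard(dest)  # the value is defined here, so it is dead above
            live.update(srcs)   # its inputs are needed above this point
            kept.append((dest, srcs, has_side_effect))
    kept.reverse()
    return kept

block = [
    ("r14", ("r32",), False),       # overwritten before any use: dead
    ("r15", ("r33",), False),       # never read and not live at exit: dead
    ("r14", ("r34",), False),
    (None, ("r14", "r35"), True),   # store: always kept
]
print(eliminate_dead_code(block, live_out={"r14"}))
```

A single pass like this runs in time linear in the block length, which suits an optimizer that must itself run at runtime.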

Figure 4. How the Aries optimizer reduces address swizzling (adjusting 32-bit addresses to 64-bit addresses) to optimize a sequence of consecutive memory access instructions. The optimizer replaces an addp4/st4 instruction pair (instructions C1 and C2 in generated dyncode) with a single st4 instruction (instruction c1 in optimized dyncode) that performs simultaneous base updates. The optimization shortens the total execution from six cycles to five and reduces the instruction count from 24 to 18.

Aries’ list scheduler bundles instructions for each generated IA-64 block so that all instructions fit into IA-64 templates. It starts by building a directed acyclic graph (DAG) to capture all IA-64-specific and microarchitecture-specific dependencies such as write-after-read (WAR), read-after-write (RAW), and write-after-write (WAW) hazards between instructions. On the basis of the DAG, it then selects instructions that are free to be scheduled in each cycle. Finally, the scheduler uses a state machine to bundle instructions in each cycle, inserting NOPs (no operations) as necessary. It also uses a heuristic to reduce the number of NOPs inserted.
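The scheduling steps above can be sketched in a few lines of Python. This is a deliberately simplified model and not the Aries scheduler: it treats every RAW, WAR, and WAW register dependency as forcing a later cycle, assumes unit latency, ignores IA-64 templates and microarchitectural constraints, and simply pads each cycle to three slots with NOPs. The function names and the instruction tuples are hypothetical.

```python
def build_dag(insns):
    """insns: list of (name, dests, srcs). Returns index -> set of predecessor indices."""
    preds = {i: set() for i in range(len(insns))}
    for j, (_, dests_j, srcs_j) in enumerate(insns):
        for i in range(j):
            _, dests_i, srcs_i = insns[i]
            raw = set(dests_i) & set(srcs_j)   # j reads what i wrote
            war = set(srcs_i) & set(dests_j)   # j overwrites what i read
            waw = set(dests_i) & set(dests_j)  # j overwrites what i wrote
            if raw or war or waw:
                preds[j].add(i)
    return preds

def list_schedule(insns, width=3):
    """Return one list of instruction names per cycle, padded with NOPs."""
    preds = build_dag(insns)
    done, cycles = set(), []
    while len(done) < len(insns):
        # An instruction is ready once all its predecessors issued in earlier cycles.
        ready = [i for i in range(len(insns)) if i not in done and preds[i] <= done]
        issue = ready[:width]
        names = [insns[i][0] for i in issue]
        cycles.append(names + ["nop"] * (width - len(names)))
        done.update(issue)
    return cycles

insns = [
    ("add", ["r14"], ["r15", "r16"]),
    ("ld4", ["r17"], ["r18"]),
    ("sub", ["r19"], ["r14", "r17"]),
    ("st4", [], ["r19", "r20"]),
]
for cycle in list_schedule(insns):
    print(cycle)
```

The two independent instructions share the first cycle, while the dependent sub and st4 each wait a cycle and leave NOP slots behind. A real scheduler would use a priority heuristic when choosing among ready instructions to reduce that padding.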

Instruction packer. The instruction packer packs the scheduled IA-64 instructions into binary code and writes it into the Aries code cache. The Aries runtime module then updates the address map table to reflect the state change for the translated PA-RISC block. For subsequent emulations of that block, the Aries runtime will use the translated code instead of invoking the interpreter.

Aries implements a backpatch technique that allows a dyncode block to directly branch to another dyncode block without going through a target lookup, making it more efficient to transition between blocks. The Aries runtime module keeps track of the dyncode block that has just been executed. If the next block to be executed is the target block of the previous one, Aries modifies the final branch instruction in the previously executed dyncode block so that it can jump directly to the target dyncode block.
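The effect of backpatching can be shown with a small Python model. The classes, fields, and addresses below are hypothetical, and the model assumes each dyncode block ends in a single branch with one target; it only illustrates how a patched branch avoids repeated address map lookups.

```python
class DynBlock:
    def __init__(self, pa_addr, branch_target):
        self.pa_addr = pa_addr              # PA-RISC address this block translates
        self.branch_target = branch_target  # PA-RISC address of its final branch
        self.direct_link = None             # filled in by backpatching

class Runtime:
    def __init__(self, blocks):
        self.address_map = {b.pa_addr: b for b in blocks}
        self.lookups = 0

    def lookup(self, pa_addr):
        self.lookups += 1
        return self.address_map.get(pa_addr)

    def run(self, start, max_blocks):
        trace = []
        block = self.lookup(start)
        while block is not None and len(trace) < max_blocks:
            trace.append(block.pa_addr)
            if block.direct_link is None:
                # First transition: look up the target, then patch the branch.
                block.direct_link = self.lookup(block.branch_target)
            block = block.direct_link  # later transitions skip the lookup
        return trace

a = DynBlock(0x1000, 0x2000)
b = DynBlock(0x2000, 0x1000)
rt = Runtime([a, b])
trace = rt.run(0x1000, max_blocks=6)
print([hex(addr) for addr in trace], rt.lookups)
```

In this two-block loop, six block executions need only three lookups: one for the entry point and one for each branch the first time it is taken.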

Aries can also translate dynamically generated code and self-modifying code in an emulated PA-RISC application, treating both types of code as regular PA-RISC blocks in an emulated application. When Aries encounters a sync instruction, which indicates the existence of self-modifying code, it simply erases the current content of the code cache so that subsequent emulations will not use any translations of the old code.
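A minimal Python sketch of this flush-on-sync policy follows. The CodeCache class, its method names, and the string stand-in for translated code are assumptions made for illustration only.

```python
class CodeCache:
    def __init__(self, translate):
        self._translate = translate  # hypothetical translator callback
        self._blocks = {}            # PA-RISC address -> translated block

    def get(self, pa_addr, memory):
        """Return the translation for pa_addr, translating on a cache miss."""
        if pa_addr not in self._blocks:
            self._blocks[pa_addr] = self._translate(memory[pa_addr])
        return self._blocks[pa_addr]

    def on_sync(self):
        """Discard every translation; code may have been modified."""
        self._blocks.clear()

memory = {0x2000: "old code"}
cache = CodeCache(lambda code: f"dyncode({code})")
first = cache.get(0x2000, memory)
memory[0x2000] = "new code"        # the application rewrites its own code
stale = cache.get(0x2000, memory)  # still the old translation
cache.on_sync()                    # the application executes a sync
fresh = cache.get(0x2000, memory)
print(first, stale, fresh)
```

Flushing the whole cache is coarse, but it needs no bookkeeping about which translations came from which addresses, and retranslation restores the hot blocks on demand.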

Environment emulation module

The most common system services that the environment emulation module must handle are system calls and signal delivery.

System calls. All HP-UX system calls enter kernel space through a common system-call-gateway page. The environment emulation module captures system calls made in an emulated PA-RISC application at the gateway page and calls the corresponding emulation routines. Most system-call emulation routines are simple stubs that invoke the native system calls directly on the IA-64/HP-UX platform. Other system calls require special handling before the native system calls are made. For example, when a thread in a multithreaded PA-RISC application requests the operating system to suspend another thread, Aries cannot simply pass this request to the underlying IA-64 kernel because it could cause a deadlock on shared Aries resources. Aries must first acquire all the Aries shared resources before sending the native suspension request to the kernel.

Signal delivery. Signal delivery also requires special handling. The HP-UX operating system can deliver both synchronous and asynchronous signals to a PA-RISC application. It delivers a synchronous signal, such as a protection violation on a load, immediately to an application at the instruction that caused the exception. An asynchronous signal, such as a kill or suspend, is not associated with a particular instruction, so the system can deliver it to the application at any time, and the time at which it arrives may differ from run to run.

Aries registers a master signal handler to handle the delivery of all signals it receives, whether synchronous or asynchronous, to an emulated PA-RISC application. When the HP-UX kernel detects an exception that a PA-RISC application generates, it delivers a signal to the Aries process that is emulating the PA-RISC application by invoking the Aries master signal handler. The signal handler then determines how the signal should be delivered to the emulated application. Aries does not always deliver asynchronous signals as soon as they occur. Instead, it queues the asynchronous signals it receives and delivers them to the emulated application at the earliest point where it can construct a correct PA-RISC signal context for that application. Synchronous signals, by contrast, are delivered immediately. When Aries receives a synchronous signal in dyncode, where the PA-RISC signal context may not be up to date, Aries constructs a recovery block for the dyncode. It then executes that recovery block to synchronize the PA-RISC context before delivering the signal to the emulated application.
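The delivery policy can be sketched as a small dispatcher in Python. This is an illustrative model only: the class, its method names, the recover callback standing in for a recovery block, and the signal names used in the example are all assumptions.

```python
from collections import deque

class SignalDispatcher:
    def __init__(self, deliver):
        self._deliver = deliver  # callback that hands a signal to the emulated app
        self._pending = deque()  # asynchronous signals awaiting a safe point

    def on_signal(self, signo, synchronous, in_dyncode=False, recover=None):
        if synchronous:
            if in_dyncode and recover is not None:
                recover()        # bring the PA-RISC context up to date first
            self._deliver(signo)
        else:
            self._pending.append(signo)

    def at_safe_point(self):
        """Called where a correct PA-RISC signal context can be constructed."""
        while self._pending:
            self._deliver(self._pending.popleft())

log = []
dispatcher = SignalDispatcher(log.append)
dispatcher.on_signal("SIGTERM", synchronous=False)
dispatcher.on_signal("SIGSEGV", synchronous=True, in_dyncode=True,
                     recover=lambda: log.append("recovery block"))
dispatcher.at_safe_point()
print(log)
```

The synchronous SIGSEGV is delivered right after its recovery step, while the earlier asynchronous SIGTERM waits in the queue until the next safe point.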

Verifying emulation and translation

One of Aries’ most important goals is to emulate all user-level PA-RISC applications on IA-64 platforms, including applications not yet developed. It is impossible to run every available PA-RISC application on Aries to verify that its emulation is correct. Moreover, most existing PA-RISC applications are compiler generated. Because compilers use only a subset of the ISA to generate executables, application testing alone would be unlikely to achieve 100 percent coverage of ISA emulation. We therefore adopted other ways to verify Aries.

We developed a random testing framework to stress test the correctness of Aries ISA emulation. We used this framework to randomly generate PA-RISC instruction sequences and then execute each instruction sequence twice—once on a PA-RISC processor and once under Aries running on an IA-64 system. We compared the final states and labeled any inconsistency between them as an Aries emulation failure. With this framework, we thoroughly verified all ISA emulations, including scenarios that can never be generated in a real application.
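The structure of such a framework can be sketched in Python with a toy instruction set. Everything here is illustrative: the four-operation register machine, the function names, and the deliberately broken emulator are assumptions chosen to show the compare-final-states idea, and they are unrelated to the real PA-RISC ISA or to the actual test framework.

```python
import random

OPS = ("add", "sub", "and", "or")
MASK32 = 0xFFFFFFFF

def random_sequence(rng, length, nregs=4):
    """Generate random (op, dest, src1, src2) tuples over nregs registers."""
    return [(rng.choice(OPS), rng.randrange(nregs), rng.randrange(nregs),
             rng.randrange(nregs)) for _ in range(length)]

def reference_run(seq, regs):
    """Stand-in for running the sequence on real hardware."""
    regs = list(regs)
    for op, d, a, b in seq:
        if op == "add":
            regs[d] = (regs[a] + regs[b]) & MASK32
        elif op == "sub":
            regs[d] = (regs[a] - regs[b]) & MASK32
        elif op == "and":
            regs[d] = regs[a] & regs[b]
        else:
            regs[d] = regs[a] | regs[b]
    return regs

def buggy_run(seq, regs):
    """Stand-in for a faulty emulator: corrupts one bit of the final state."""
    out = reference_run(seq, regs)
    out[0] ^= 1
    return out

def find_mismatches(reference, candidate, trials, seed=0):
    """Run random sequences on both implementations and collect differing cases."""
    rng = random.Random(seed)
    failures = []
    for _ in range(trials):
        seq = random_sequence(rng, 8)
        init = [rng.getrandbits(32) for _ in range(4)]
        if reference(seq, init) != candidate(seq, init):
            failures.append((seq, init))
    return failures

print(len(find_mismatches(reference_run, reference_run, trials=50)))
print(len(find_mismatches(reference_run, buggy_run, trials=20)))
```

Because each failing case records the exact sequence and initial state, and the generator is seeded, any reported mismatch can be replayed on both sides.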

We also built a runtime cross-verification mechanism into Aries so that we would know the exact location of any emulation failure. This is important when the application is large and complex, because a failure may not show up in an identifiable format (such as output) until some time after it has occurred. We can identify the instruction block where Aries emulation failed. With this mechanism, Aries can run any large and complex application and report emulation failure at the exact place it occurred—without user intervention.

In contrast, verifying translation correctness has been a challenge, because we did not have a real IA-64 system when we developed Aries. To overcome this problem, we injected an IA-64 instruction emulator into Aries to act as an execution bed only for dyncode. We built the rest of Aries’ emulation components—the interpreter, dynamic translator, environment emulation module, and runtime module—as a PA-RISC application and ran them on a PA-RISC platform. This verification approach improved our dyncode testing efficiency by up to 300 times, compared to the traditional verification method using a full-blown IA-64 simulator.

Aries can emulate most user-level applications built for HP-UX/PA-RISC systems, including ones still under development. However, there are a few exceptions. For example, it cannot correctly emulate a debugger built for HP-UX/PA-RISC systems because of optimizations in the translated code. Also, it cannot yet emulate applications that link in both PA-RISC and IA-64 shared libraries.

We view dynamic translation as an important migration method, but software migration is only one of the many areas that could benefit. This technology could aid runtime instrumentation and profile gathering in a performance analysis toolkit, for example. It could also become central to runtime optimization in products such as the Java virtual machine. In the meantime, we see it as essential in helping HP customers enjoy an effortless and successful transition to more powerful IA-64 systems.

Cindy (Qinghua) Zheng is a senior software design engineer in the Adaptive Systems section at Hewlett-Packard’s Enterprise Java Lab. She received a BSc and an MEng in electrical engineering and computer science from the Massachusetts Institute of Technology. Contact her at cindy_zheng@hp.com.

Carol Thompson is the C/C++ compiler architect in Hewlett-Packard’s Development Environment Solutions Lab. Her background includes optimization and architecture definition for the IA-64 and PA-RISC architectures. She received an MS in computer science from the University of California, Davis. Contact her at carol_thompson@hp.com.


This site and all contents (unless otherwise noted) are Copyright © 2000, Institute of Electrical and Electronics Engineers, Inc. All rights reserved.