Complete board model - ACE Compiler maps RTL or gate-level designs onto custom-built or commercial prototyping platforms. In addition to FPGAs, it faithfully models the programmable switches, connectors, cables, bus switches, daughter cards, clock distribution scheme, power regions, and LVDS interconnects on the target prototyping hardware. ACE Compiler uses these models to map designs efficiently onto that hardware.

Solid, proven mapping flow - ACE Compiler takes an RTL or gate-level design, together with the target performance and a board description in Verilog, all the way through synthesis, partitioning, board routing, FPGA synthesis, and P&R to produce downloadable FPGA bitstreams and the on-board switch and cable setups. The flow can be completely automatic, or semi-automatic if the user chooses to handle any portion of the mapping manually. Every step can be verified with the test bench, so that a mistake introduced in a given step is isolated and eliminated there and hardware bring-up goes smoothly. A working compile can be reproduced for future design revisions through scripts. This is the easiest, fastest, and most reliable way to complete your prototyping projects on schedule. The tool has been used on many of today's most complex designs at major semiconductor and system companies, and has produced satisfactory results in ways that no other tools have matched.

Automatic partition - The advanced automatic partition algorithms search for the partition with the fewest inter-FPGA connections and the highest prototype performance. All user constraints and directions, including manual grouping, the target system interface, and FPGA utilization limits, are strictly observed. The timing feature produces a low-skew clock distribution, minimizes combinatorial hops, and generates timing constraints for the individual FPGAs to achieve the targeted prototype performance. Logic may be duplicated in the course of optimization. Multiple algorithms compete for the best partition of any particular design, and parallel execution will be available before the end of 2009. For the best partition, the details of the clock distribution, wire-sharing, and resulting prototype timing are reported.
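
The partition objective described above can be illustrated with a minimal Python sketch. It is not ACE Compiler's algorithm: the netlist format, the function name cut_metrics, and the two candidate partitions are hypothetical, and the sketch only scores candidates by the number of nets that cross FPGA boundaries.

```python
from collections import defaultdict

def cut_metrics(nets, assignment):
    """Return (cut_net_count, pins_per_fpga) for one candidate partition.

    nets: net name -> list of cells on that net (hypothetical format).
    assignment: cell name -> FPGA name.
    """
    cut = 0
    pins = defaultdict(int)
    for cells in nets.values():
        fpgas = {assignment[c] for c in cells}
        if len(fpgas) > 1:          # net spans more than one FPGA
            cut += 1
            for fpga in fpgas:      # one pin on every FPGA the net touches
                pins[fpga] += 1
    return cut, dict(pins)

# Toy design: four cells connected in a ring.
nets = {"n1": ["a", "b"], "n2": ["b", "c"], "n3": ["c", "d"], "n4": ["a", "d"]}
candidates = {
    "P1": {"a": "F0", "b": "F0", "c": "F1", "d": "F1"},
    "P2": {"a": "F0", "b": "F1", "c": "F0", "d": "F1"},
}

# Pick the candidate with the fewest inter-FPGA nets.
best = min(candidates, key=lambda name: cut_metrics(nets, candidates[name])[0])
print(best, cut_metrics(nets, candidates[best]))
```

A real partitioner would also weigh utilization, timing, and user constraints; this sketch covers only the connection-count term.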

Automatic board routing - The advanced board routing algorithm routes inter-FPGA and I/O signals through board traces, programmable switches, connectors, cables, bus switches, or FPGAs, with the objective of minimizing route-throughs and signal delays. ACE Compiler will route through a third FPGA, a programmable switch, a bus switch, or a cable to connect signals between non-adjacent FPGAs or to bypass over-crowded channels. LVDS board traces are automatically identified and assigned to LVDS-pair signals. Wire-sharing synchronization signals are routed together with their controlling groups to guarantee their timing. Cables can be connected between connectors, manually or automatically, to provide extra traces for inter-FPGA connections. ACE Compiler can overcome prototyping platform limits, for example by distributing local clocks or implementing bi-directional buses when global wires are in short supply. A variety of constraints, such as manual wire assignments, the target system interface, and cable connections, are observed to match the physical setup of the target prototyping platform.

Full user control - Although ACE Compiler comes with advanced algorithms that optimize the partition and board routing automatically, it strictly follows the plans imposed by the user. The interface to the target system, the clock distribution, the required performance of a certain portion of the design, the timing requirements on the daughter card interface, and similar constraints all serve as hard limits that the optimization must observe. The user's plan can be as detailed as a full manual partition or as minimal as the target system interface; either way, ACE Compiler completes the mapping according to that plan. Extensive analysis reports on timing behavior are available so the user can evaluate the partition and make any necessary planning adjustments.

Fast, predictable re-compile for design revisions - Once a satisfactory result has been verified after the first compile, the user's planning and the best setups can be recorded in a script to reproduce the result quickly and predictably for future design revisions. Only the changed source code is re-synthesized, and all the setups are re-established by the recompile. The board setups and timing behavior of the recompile will be similar to those of the previously successful compile.

RTL, gate-level or mixed - ACE Compiler maps designs in RTL, gate-level, or a mix of both. The RTL portion of the design is synthesized module by module with the encapsulated commercial FPGA synthesis tools from Altera, Mentor Graphics, Synopsys, or Xilinx. The parallel synthesis option reduces the run time dramatically for big designs. This hierarchical approach places no limit on the complexity of the design that can be processed. The original design hierarchy boundaries are kept unchanged for the best partition result. After compilation, either RTL or gate-level code can be exported for each FPGA. The hierarchical RTL code of each FPGA consists of the original RTL code of the un-partitioned modules and structural code for the partitioned modules, which most likely sit at the higher levels of the design hierarchy because their complexity exceeds a single FPGA. The supported languages include Verilog, VHDL, and EDIF.

Low-skew clock distribution - Clock skew management is critical if the design has a large number of clocks or internally generated clocks. FPGAs have a limited number of global buffers for distributing clocks, and prototyping platforms usually support only a limited number of global clocks. When the number of design clocks exceeds the FPGA limit, the timing feature in ACE Compiler limits the number of clock domains partitioned into any single FPGA, so that every clock can be driven by a global buffer. The timing feature also carefully manages the skew with which each clock reaches each FPGA on the target prototyping platform. ACE Compiler supports several clock distribution schemes. One scheme isolates the clock generation by creating a loop-back for the clock on the generating FPGA; many prototyping platforms support this loop-back feature on their global lines. If the design has more clocks than global lines, another scheme can be adopted for smaller clocks by isolating the clock generation from its driving clock domain. The clock is then distributed from the clock-generation FPGA to the destination FPGAs over low-skew traces on the target prototyping platform. Some clock generation modules, such as PLLs or DCMs, can be duplicated in every FPGA to reproduce their derived clocks locally. ACE Compiler can apply all of the above clock distribution schemes within a single design if required. In addition, automatic partition groups circuitry from the same clock domain into a single FPGA where possible, which reduces the need to distribute clocks across the target prototyping platform and raises the prototype performance.
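
As a rough illustration of the per-FPGA clock-domain limit, the following Python sketch counts the clock domains placed in each FPGA and flags any FPGA that needs more global buffers than it has. The data layout, the function names, and the buffer count of 2 are assumptions chosen for the example, not values or interfaces from ACE Compiler.

```python
from collections import defaultdict

def clock_domains_per_fpga(cell_clock, assignment):
    """Map each FPGA to the sorted list of clock domains placed in it."""
    domains = defaultdict(set)
    for cell, clock in cell_clock.items():
        domains[assignment[cell]].add(clock)
    return {fpga: sorted(clocks) for fpga, clocks in domains.items()}

def fpgas_over_limit(domains, global_buffers):
    """Return the FPGAs that hold more clock domains than global buffers."""
    return sorted(f for f, clocks in domains.items() if len(clocks) > global_buffers)

# Hypothetical design: each storage cell is tagged with its clock domain.
cell_clock = {"u1": "clkA", "u2": "clkB", "u3": "clkC", "u4": "clkA"}
assignment = {"u1": "F0", "u2": "F0", "u3": "F0", "u4": "F1"}

domains = clock_domains_per_fpga(cell_clock, assignment)
# The limit of 2 is artificially small so the toy example triggers it.
print(fpgas_over_limit(domains, global_buffers=2))  # ['F0']
```

A partitioner that enforces the limit would reject or repair the assignment for F0, for example by moving the clkC logic to another FPGA.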

Gated-clock conversion - ACE Compiler converts gated clocks generated by qualified combinatorial gates. The conversion reconnects the source clock so that it drives the clock pins of the storage instances directly, and moves the qualified combinatorial gates so that they drive the enable pins of those storage instances. Latch-generated clocks and divided clocks are also converted. Qualified conversions are automatically identified and presented to the user for approval. Conversions can be cascaded over multiple stages and can cross design hierarchy boundaries.
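
The equivalence behind this conversion can be checked with a small behavioral model. The Python sketch below is an illustration, not ACE Compiler output: it models a flip-flop clocked by clk AND en, models the converted flip-flop clocked by clk with en on its enable pin, and compares the two. It assumes the enable changes only while the clock is low, so the AND gate cannot glitch; all names and waveforms are hypothetical.

```python
def simulate_gated(clk_wave, en_wave, d_wave):
    """Original form: flop clocked by the gated clock gclk = clk AND en."""
    q, prev_gclk, out = 0, 0, []
    for clk, en, d in zip(clk_wave, en_wave, d_wave):
        gclk = clk & en
        if prev_gclk == 0 and gclk == 1:   # rising edge of the gated clock
            q = d
        prev_gclk = gclk
        out.append(q)
    return out

def simulate_enable(clk_wave, en_wave, d_wave):
    """Converted form: flop clocked by clk, with en on the enable pin."""
    q, prev_clk, out = 0, 0, []
    for clk, en, d in zip(clk_wave, en_wave, d_wave):
        if prev_clk == 0 and clk == 1 and en == 1:   # rising edge, enabled
            q = d
        prev_clk = clk
        out.append(q)
    return out

# One sample per half clock period; en toggles only while clk is 0.
clk = [0, 1, 0, 1, 0, 1, 0, 1]
en  = [1, 1, 0, 0, 1, 1, 0, 0]
d   = [1, 1, 0, 0, 0, 0, 1, 1]

gated = simulate_gated(clk, en, d)
converted = simulate_enable(clk, en, d)
print(gated == converted)  # True
```

The converted form removes logic from the clock path, which is why the source clock can then stay on a global, low-skew clock resource.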

Asynchronous or synchronous wire-sharing - When a partition needs more pins than the FPGA provides, the wire-sharing scheme can be used to carry multiple signals on a single board trace. This scheme relaxes the pin requirement on FPGA partitions, but it also degrades system performance. Both asynchronous and synchronous implementations of wire-sharing are supported. Asynchronous wire-sharing uses a fast system clock to drive the wire-sharing logic. Inter-FPGA signals with the same source and destination can be grouped to share one board trace with the help of the wire-sharing logic. After partition, ACE Compiler inserts the wire-sharing logic, including senders, receivers, and control modules, into every FPGA according to the wire-sharing signal files prepared by the automatic partition and approved by the user. Higher performance can be achieved with synchronous wire-sharing, which transmits inter-FPGA signals synchronized to the design clocks. Its wire-sharing IP is more complex, for example a SERDES implementation, and it carries additional restrictions: all signals in a group must come from the same clock domain, and the synchronization signals must be routed together with the wire-sharing groups they control. Thanks to its timing engine, ACE Compiler can prepare wire-sharing signals based on clock domains. Users are encouraged to design their own IP or to use the certified IP from Auspy's partners. Combinatorial pass-throughs can be excluded from wire-sharing for better performance if so instructed. Automatic partition accurately tracks the FPGA pin count affected by wire-sharing and searches for the solution with the lowest wire-sharing multiple.
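
The trade-off between pin count and wire-sharing multiple can be made concrete with a short calculation. The Python sketch below is a simplified model under stated assumptions: each source/destination group of N signals shared at multiple R needs ceil(N / R) traces, and the extra pins for synchronization and control signals are ignored. The function names and the numbers are hypothetical.

```python
import math

def pins_needed(group_sizes, ratio):
    """Traces needed when every group shares wires at the given multiple."""
    return sum(math.ceil(n / ratio) for n in group_sizes)

def lowest_ratio(group_sizes, pin_budget, max_ratio=64):
    """Smallest wire-sharing multiple that fits the pin budget, or None."""
    for ratio in range(1, max_ratio + 1):
        if pins_needed(group_sizes, ratio) <= pin_budget:
            return ratio
    return None

# Hypothetical FPGA: 300 signals go to one neighbor and 180 to another,
# but only 200 pins are available for inter-FPGA connections.
groups = [300, 180]
ratio = lowest_ratio(groups, pin_budget=200)
print(ratio, pins_needed(groups, ratio))  # 3 160
```

Because a higher multiple costs performance, the search stops at the first multiple that fits rather than the one that uses the fewest pins.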

Combinatorial hops elimination - One big performance killer in prototyping is combinatorial paths that cross FPGAs. Eliminating them is an important optimization objective of the automatic partition. To eliminate combinatorial hops, automatic partition must extract the timing paths on the fly and make the partition decisions that collapse hopping paths into a single FPGA. ACE Compiler is equipped with this timing capability and partitions for minimum-hopping, high-performance prototypes. However, automatic partition cannot avoid every combinatorial hop; some remain because of the user's planning constraints or the nature of the design. All combinatorial hops through FPGAs are reported after partition. The user is encouraged to review the report and make the planning adjustments needed for a higher-performance prototype. If wire-sharing is necessary, ACE Compiler can be instructed to exclude these hopping signals from wire-sharing, or to allow only one wire-sharing on any data path, to raise the prototype performance.
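
A minimal Python sketch of how hops along one path might be counted is shown here. It assumes a register-to-register path is given as an ordered list of cells and counts every FPGA boundary crossed along it; this definition, the function name, and the example path are illustrative assumptions rather than ACE Compiler's reporting format.

```python
def boundary_crossings(path, assignment):
    """Count FPGA boundaries crossed along an ordered combinatorial path."""
    fpgas = [assignment[cell] for cell in path]
    return sum(1 for a, b in zip(fpgas, fpgas[1:]) if a != b)

# Hypothetical path: source register, two gates, destination register.
path = ["regA", "g1", "g2", "regB"]

before = {"regA": "F0", "g1": "F1", "g2": "F0", "regB": "F0"}
after = dict(before, g1="F0")   # collapse the path into a single FPGA

print(boundary_crossings(path, before))  # 2
print(boundary_crossings(path, after))   # 0
```

Each crossing adds pin and board-trace delay to the path, so moving g1 next to its neighbors removes both crossings at once.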

Timing budget - To achieve the target prototyping performance, the timing engine of ACE Compiler calculates a delay constraint for every timing path. The delay budget for each segment of a timing path is proportional to the number of logic levels the segment covers. The calculation also accounts for the fixed delays through board interconnects and the latency through wire-sharing logic. The resulting timing constraints are issued to the synthesis and P&R of the individual FPGAs, so that the combined system timing meets the target frequency specified by the user.
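
The proportional budgeting described above can be worked through in a few lines of Python. This is a sketch under simple assumptions, not the tool's timing engine: the fixed board and wire-sharing delays are subtracted from the clock period first, and the remainder is split across the path segments in proportion to their logic levels. The function name and all numbers are hypothetical.

```python
def segment_budgets(period_ns, logic_levels, fixed_delays_ns):
    """Split the usable part of the clock period across path segments.

    logic_levels: logic levels covered by each segment (one per FPGA crossed).
    fixed_delays_ns: board interconnect and wire-sharing delays on the path.
    """
    available = period_ns - sum(fixed_delays_ns)
    total_levels = sum(logic_levels)
    if available <= 0 or total_levels == 0:
        raise ValueError("no time or no logic levels left to budget")
    return [available * levels / total_levels for levels in logic_levels]

# Hypothetical path at a 20 MHz target (50 ns period) that crosses three
# FPGAs, with 5 ns of fixed delay on each of the two board connections.
budgets = segment_budgets(50.0, logic_levels=[6, 2, 2], fixed_delays_ns=[5.0, 5.0])
print(budgets)  # [24.0, 8.0, 8.0]
```

Each value would then become the delay constraint for that segment inside its FPGA; the three budgets plus the 10 ns of fixed delay add up to the 50 ns period.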

Timing reports - Several timing reports are generated by ACE Compiler with details of the clock distribution, combinatorial hops and the expected prototyping performance.

Very big designs - ACE Compiler has been built on a hierarchical approach from its inception. Every command, including the optimization-intensive partition, works on the hierarchical database without global flattening. The ability to work at the gate level for timing and logic optimization without flattening the design is one of the major features of ACE Compiler. ACE Compiler is capable of handling designs of over 100M gates.

Cross reference - The design is hierarchically represented in the ACE database before and after the partition. The cross reference is established to help the user easily locate objects in the after-partition netlist.

Probes - ACE Compiler is able to bring out any design signal for observation. Probes can be brought to the connector pins or to the instrumentation module inserted by ACE Compiler if instructed to do so.

Parallel execution for speed-up - Several time-consuming steps in ACE Compiler, including synthesis and FPGA P&R, can be run in parallel by distributing them to multiple machines across the network. In the near future, automatic partition will also be set up to run in parallel, so that it can explore more algorithm setups for the best results on any design.

Integrated tool environment - ACE Compiler is integrated with the commercial FPGA synthesis tools from Altera, Mentor Graphics, Synopsys, and Xilinx to do the modular synthesis or FPGA synthesis. The simulators from Cadence, Mentor Graphics and Synopsys are supported to verify every step through the flow with the test bench. The FPGA P&R tools from Altera and Xilinx are also integrated to run FPGA P&R. For special applications, ACE Compiler also supports custom interfaces to the instrumentation, co-simulation, and emulation tools.

Supported FPGA families - The supported FPGA families include the Altera Stratix II, Stratix III, and Stratix IV families and the Xilinx Virtex-4, Virtex-5, and Virtex-6 families.

Software platforms - Linux, Windows, and Solaris.

Copyright (c) 2009 Auspy Development Inc. All rights reserved