ICE Help CORES

From ICE Enterprises

Summary: Interface to FPGA Cores

This file contains a brief description of the current interface for user FPGA cores within the ICE FPGA architecture. An ICE Core is an FPGA module with a set of function registers, two dataflow ports, and a test output port.

The Verilog implementation signature looks like this:

module userengine (sclk,srst, scs,saddr, swr,swrbus, srd,srdbus, sack,
ioclk, istat,iena,isel,ibus, ostat,oena,osel,obus, test);

or

module mcengine (sclk,srst, scs,saddr, swr,swrbus, srd,srdbus, sack,
ioclk, istat,iena,isel,ibus, ostat,oena,osel,obus, test);

The MultiCore engines are used to handle channelized functions. They can contain 1-16 single channel cores using a common input stream.

If you have access to the ICE FPGA development tree, the full files are $ICEROOT/code/soc/lib/userengine.v and $ICEROOT/code/soc/lib/mcengine.v.

Setup - Setting up a core for use in a dataflow

The FPGA's 32-bit system bus connects to the core's functional registers. These are used for setting up parameters, loading coefficients, reading status, and other debugging tasks. The register at the base address of each core is the core's system register and is predefined for all cores. The next seven registers are reserved for standard ICE control parameters, and the eight after those are reserved for other system routines. All remaining registers are user defined. The system clock is usually between 100 and 200 MHz. It can be changed at compile time by editing the `define CLKF_xxx in mdefs.h. All user processing should use this clock.

The IO bus connects the input and output dataflow ports to the DMA crossbar. The crossbar is a packet-based router with 64-byte packets that can handle simultaneous connections to/from memory and multiple connections directly between modules. The IO bus is either 32 or 64 bits wide depending on where the module is instantiated. We suggest running the dataflow ports through the ICE reformat modules, which convert different data types to/from the core's native data type and cross the data into the core's system clock domain. The ioclk runs between 166 MHz and 200 MHz. It has no relationship to any external data input/output or sampling clocks.
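The packet size and bus widths above fix how many IO-bus transfers one packet takes. The following is a minimal sketch of that arithmetic; the names ICE_PACKET_BYTES, beats_per_packet, and peak_bytes_per_second are hypothetical helpers for illustration and are not part of the ICE tree.

```c
/* Crossbar packet size, as stated in the text above. */
#define ICE_PACKET_BYTES 64u

/* Number of IO-bus transfers (beats) needed to move one packet.
 * Only the two widths named in the text are accepted; any other
 * width returns 0. */
static unsigned beats_per_packet(unsigned bus_bits)
{
    if (bus_bits != 32u && bus_bits != 64u)
        return 0u;
    return ICE_PACKET_BYTES / (bus_bits / 8u);
}

/* Idealized peak rate in bytes per second for a given ioclk.
 * Ignores arbitration gaps and istat/ostat throttling, so real
 * throughput will be lower. */
static double peak_bytes_per_second(unsigned bus_bits, double ioclk_hz)
{
    return (double)(bus_bits / 8u) * ioclk_hz;
}
```

With these helpers, a 32-bit port needs 16 beats per packet and a 64-bit port needs 8. At a 200 MHz ioclk the 64-bit figure is an upper bound of 1.6e9 bytes per second.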

A core will typically be set up through a call to pic_ioport() with the port type set to IOPT_CORE. Using the returned DMA channel, the user can then set individual user parameters with pic_setkey() or a block of parameters with pic_wpm(). The dataflow routing is directed by the IPORT and OPORT flags in the call to pic_ioport(). When pic_dmafunc() is called on this DMA channel with the DMA_ONESHOT or DMA_CONTINUOUS mode, the routes are enabled and data will begin flowing to the core's input port. Data is throttled via the ready lines istat and ostat.

The icelib.c routines are C, Fortran, or Java callable.

If you are running X-Midas or NeXtMidas, the user registers can be set up with the ICECORE primitive or with the PICD SET and PICD GET commands. For example:

ICECORE/core=V ifile ofile "User" {P8=123,P9=456} /dump=2

runs any Verilated core with the generic "User" core interface, in this case setting register 8 to 123 and register 9 to 456. The actual run-time contents of these two registers are queried and dumped to the screen for debug.

To graphically examine the signal traces for the Verilog run, set the unix environment variable VERILATOR_TFN to a filename to receive the test vector output. Install gtkwave version 3.3.42 or later and run it with this filename as the first argument.

To create test vectors for Concurrent-EDA's automated C-to-FPGA process, set the unix environment variable ICECORE_TFN to a filename root to receive the test vectors. It will create the ${ICECORE_TFN}_pin, _pout, _din, and _dout files, which hold the core's configuration plan in and plan out per pass, the input data, and the expected output data, respectively.
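As an illustration of the naming rule above, here is a small sketch that builds the four test vector filenames from a root. The helper tfn_path and the suffix table are hypothetical names for this example; only the four suffixes come from the text.

```c
#include <stdio.h>

/* The four suffixes named in the text, in the order given there. */
static const char *const ICECORE_TFN_SUFFIXES[4] = {
    "_pin", "_pout", "_din", "_dout"
};

/* Writes root + suffix[which] into out.
 * Returns 0 on success, -1 for a NULL root, a bad index, or a
 * result that does not fit in outlen bytes. */
static int tfn_path(const char *root, int which, char *out, size_t outlen)
{
    int n;

    if (root == NULL || which < 0 || which > 3)
        return -1;
    n = snprintf(out, outlen, "%s%s", root, ICECORE_TFN_SUFFIXES[which]);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

A caller would typically pass getenv("ICECORE_TFN") as the root, so a root of /tmp/run1 yields /tmp/run1_pin, /tmp/run1_pout, /tmp/run1_din, and /tmp/run1_dout.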

See HELP CARDS ICECORE for other switches.

For testing the FPGA code on an actual ICE card with non-realtime dataflow, make sure the card is reset with the proper flags such as

PIC RESET PIC1 /flags=PMFPGA=U

to load the user FPGA code. Then run the ICECORE routine with additional switches specifying the card alias and core number to use:

ICECORE/core=I ifile ofile "User" {P8=123,P9=456} /dump=2 /coredev=PIC1:21

For more examples of macros using these routines, see $ICEROOT/fat/testcore.mm, $ICEROOT/fat/testnoop.mm, $ICEROOT/fat/testuser.mm, and $ICEROOT/fat/testdemod.mm.

Currently, the realtime dataflow must still use the SOURCEPIC/SINKPIC primitives.

SOURCEPIC/PORT=PM1CORE1 ramfile _out PIC1AUTO 
PICDRIVER/PORT=PM1CORE1 set PIC1AUTO CORE+32 123
PICDRIVER/PORT=PM1CORE1 set PIC1AUTO CORE+36 456

sets up the same core using the realtime dataflow on an ICE card.
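Comparing the two examples, register 8 (P8) pairs with CORE+32 and register 9 (P9) with CORE+36, which suggests that the CORE+ offsets are byte offsets of 4 bytes per 32-bit register. A minimal sketch under that assumption follows; core_reg_offset is a hypothetical helper, not an icelib routine.

```c
#include <stdint.h>

/* Registers are 32 bits wide, so each occupies 4 bytes.
 * Assumption: the CORE+n offsets in the PICDRIVER example are
 * byte offsets from the core's base address. */
#define ICE_REG_BYTES 4u

/* Byte offset of a register from the core's base address. */
static uint32_t core_reg_offset(uint32_t reg_index)
{
    return reg_index * ICE_REG_BYTES;
}
```

Under this assumption core_reg_offset(8) gives 32 and core_reg_offset(9) gives 36, matching the two PICDRIVER lines above.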

Naming - Core naming rules

The two cores on the main board are named PM0CORE1 and PM0CORE2. Cores on processor module 1 are named CORE11, CORE12, MCORE11, and MCORE12 (or PM1CORE1 and PM1CORE2). Cores on processor module 2 are named CORE21, CORE22, MCORE21, and MCORE22 (or PM2CORE1 and PM2CORE2).
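The naming pattern above can be sketched as two small formatters. The function names are hypothetical. The ranges are limited to the names listed in the text, that is, module 0 through 2 and core 1 or 2, with short names only for the processor modules.

```c
#include <stdio.h>

/* Returns 0 if snprintf wrote the whole name, -1 otherwise. */
static int name_fits(int n, size_t outlen)
{
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/* Long form "PM<pm>CORE<core>", e.g. PM0CORE1 or PM2CORE2. */
static int core_long_name(int pm, int core, char *out, size_t outlen)
{
    if (pm < 0 || pm > 2 || core < 1 || core > 2)
        return -1;
    return name_fits(snprintf(out, outlen, "PM%dCORE%d", pm, core), outlen);
}

/* Short form "CORE<pm><core>" or "MCORE<pm><core>".
 * The text lists short names for processor modules 1 and 2 only,
 * so pm == 0 (main board) is rejected here. */
static int core_short_name(int pm, int core, int multicore,
                           char *out, size_t outlen)
{
    if (pm < 1 || pm > 2 || core < 1 || core > 2)
        return -1;
    return name_fits(snprintf(out, outlen, "%sCORE%d%d",
                              multicore ? "M" : "", pm, core), outlen);
}
```

For example, core_long_name(0, 1, ...) produces PM0CORE1, and core_short_name(1, 2, 1, ...) produces MCORE12.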

Addressing - Register addressing rules

Each core has a set of 32-bit registers that can be read and written through the pic_setkey, pic_getkey, pic_wpm, and pic_rpm routines. Each core has a 24-bit register address window with a base address (the upper 8 bits) defined in iceppc.h. Use the FLG_PPC_BUS flag on pic_wpm and pic_rpm calls to address the system register bus, and the KEY_CORE key for the pic_setkey and pic_getkey calls.
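The split between an 8-bit base and a 24-bit window can be sketched as below. This is an illustration only. The real base values live in iceppc.h, the 0x21 used in the usage note is a made-up placeholder, and core_bus_address is a hypothetical helper rather than an icelib call.

```c
#include <stdint.h>

/* Largest offset that fits in the 24-bit register window. */
#define CORE_WINDOW_MASK 0x00FFFFFFu

/* Combines an 8-bit base (upper 8 address bits) with an offset
 * inside the core's 24-bit window.
 * Returns 0 on success, -1 if the offset does not fit. */
static int core_bus_address(uint8_t base, uint32_t offset, uint32_t *addr)
{
    if (addr == 0 || offset > CORE_WINDOW_MASK)
        return -1;
    *addr = ((uint32_t)base << 24) | offset;
    return 0;
}
```

With a placeholder base of 0x21 and an offset of 32, the combined address is 0x21000020. An offset of 0x1000000 is rejected because it would spill into the base bits.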

The first register address is the System register, which the system software sets whenever a core is activated.

  0. System Register (bit 0 is enable, bit 1 is play/acq, bits [11:8] are the input format, bits [15:12] are the output format)
Input/Output formats: bits [2:0] select the sample width (0=16b 1=8b 2=4b 3=1b 4=32b); bit [3] selects complex (1) or real (0)
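The bit layout of the System register can be unpacked as in the following sketch. The struct and function names are hypothetical. The polarity of the play/acq bit is not stated here, so the code returns its raw value without interpreting it.

```c
#include <stdint.h>

typedef struct {
    int enable;       /* bit 0 */
    int play_acq;     /* bit 1, raw value; polarity not stated in the text */
    int in_bits;      /* input sample width in bits, 0 if code is unlisted */
    int in_complex;   /* 1 = complex, 0 = real */
    int out_bits;     /* output sample width in bits, 0 if code is unlisted */
    int out_complex;  /* 1 = complex, 0 = real */
} sysreg_fields;

/* Maps the 3-bit width code to a sample width: 0=16b 1=8b 2=4b 3=1b 4=32b. */
static int format_bits(unsigned code3)
{
    static const int widths[5] = { 16, 8, 4, 1, 32 };
    return code3 < 5u ? widths[code3] : 0;
}

/* Splits a System register value into its documented fields. */
static sysreg_fields decode_sysreg(uint32_t v)
{
    sysreg_fields f;
    unsigned ifmt = (v >> 8) & 0xFu;   /* bits [11:8]  */
    unsigned ofmt = (v >> 12) & 0xFu;  /* bits [15:12] */

    f.enable      = (int)(v & 1u);
    f.play_acq    = (int)((v >> 1) & 1u);
    f.in_bits     = format_bits(ifmt & 7u);
    f.in_complex  = (int)((ifmt >> 3) & 1u);
    f.out_bits    = format_bits(ofmt & 7u);
    f.out_complex = (int)((ofmt >> 3) & 1u);
    return f;
}
```

For instance, the value 0x4901 decodes as enabled, with 8-bit complex input (format 0x9) and 32-bit real output (format 0x4).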

The next 7 register addresses are pre-defined and set by every call to pic_ioport on a CORE or MCORE.

  1. Decimation (dec-1)
  2. Gain
  3. Rate (Hz)
  4. Ratio (fractional binary)
  5. Frame (frame-1)
  6. Frequency (fractional binary)
  7. Flags
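The Decimation and Frame entries above are stored as the value minus one. A minimal sketch of that encoding follows, with register indices taken from the list order and the System register assumed to be at index 0. The enum and helper names are hypothetical, and the fractional binary fields are left out because their scaling is not given here.

```c
#include <stdint.h>

/* Register indices in the order listed above. */
enum ice_std_reg {
    ICE_REG_SYSTEM = 0,
    ICE_REG_DEC    = 1,  /* holds dec-1 */
    ICE_REG_GAIN   = 2,
    ICE_REG_RATE   = 3,  /* Hz */
    ICE_REG_RATIO  = 4,  /* fractional binary */
    ICE_REG_FRAME  = 5,  /* holds frame-1 */
    ICE_REG_FREQ   = 6,  /* fractional binary */
    ICE_REG_FLAGS  = 7
};

/* Encodes a count (decimation or frame length) as count-1.
 * Returns 0 on success, -1 if the count is 0 or reg is NULL. */
static int encode_minus_one(uint32_t count, uint32_t *reg)
{
    if (reg == 0 || count == 0u)
        return -1;
    *reg = count - 1u;
    return 0;
}

/* Recovers the count from a register value. The result is 64 bits
 * wide so that a register of 0xFFFFFFFF does not wrap to 0. */
static uint64_t decode_minus_one(uint32_t reg)
{
    return (uint64_t)reg + 1u;
}
```

A decimation of 4 is therefore written as 3, and a register value of 0 means a decimation (or frame length) of 1.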

The next 8 are reserved for other system routines.

8..15) Filter coef loaders

The rest are available for user core parameters.

16..255) User parameters for each multicore.
16..1M)  User parameters for each core.

The user registers are not written or cleared by system software.
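Before writing a register from host code, it can help to check which region an index falls in. The sketch below encodes the ranges listed above. The names are hypothetical. The 255 limit is applied to multicores only, and the "1M" upper bound for single cores is not interpreted or enforced.

```c
/* Register regions, following the ranges listed in the text. */
typedef enum {
    ICE_REGION_SYSTEM,    /* index 0 */
    ICE_REGION_STANDARD,  /* indices 1..7, set by pic_ioport */
    ICE_REGION_FILTER,    /* indices 8..15, filter coef loaders */
    ICE_REGION_USER,      /* index 16 and up */
    ICE_REGION_INVALID    /* beyond 255 on a multicore */
} ice_reg_region;

/* Classifies a register index. multicore is nonzero for an MCORE. */
static ice_reg_region classify_reg(unsigned long index, int multicore)
{
    if (index == 0ul)
        return ICE_REGION_SYSTEM;
    if (index <= 7ul)
        return ICE_REGION_STANDARD;
    if (index <= 15ul)
        return ICE_REGION_FILTER;
    if (multicore && index > 255ul)
        return ICE_REGION_INVALID;
    return ICE_REGION_USER;
}
```

Only indices that classify as ICE_REGION_USER are left untouched by the system software, so those are the safe ones for user parameters.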

See the help on each of these routines for detailed syntax.

Routing - data routing rules

Each core has one data input stream and one data output stream. The multi-channel cores also have an 8-bit channel select with each of these streams. The run-time routing is handled via the IPORT and OPORT flags in the handle given to the pic_ioport call.

The default route for an acquisition is IPORT=MODULEx (where x=1|2 to be on the same side of the card) and OPORT=HOST. The default route for playback is IPORT=HOST and OPORT=MODULEx.
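The default routes can be summarized in a small lookup, sketched below. The struct ice_route and the function default_route are hypothetical and only produce the port names. How those names are passed in the pic_ioport flags is not shown here.

```c
#include <stdio.h>

/* Port names for one route, e.g. iport "MODULE1" and oport "HOST". */
typedef struct {
    char iport[16];
    char oport[16];
} ice_route;

/* Fills in the default route described in the text.
 * playback: 0 for acquisition, nonzero for playback.
 * module:   1 or 2, the IO module on the same side of the card.
 * Returns 0 on success, -1 on a bad argument. */
static int default_route(int playback, int module, ice_route *r)
{
    char mod[16];

    if (r == NULL || (module != 1 && module != 2))
        return -1;
    snprintf(mod, sizeof mod, "MODULE%d", module);
    snprintf(r->iport, sizeof r->iport, "%s", playback ? "HOST" : mod);
    snprintf(r->oport, sizeof r->oport, "%s", playback ? mod : "HOST");
    return 0;
}
```

An acquisition on module 1 gives IPORT=MODULE1 and OPORT=HOST, and a playback on module 2 gives IPORT=HOST and OPORT=MODULE2.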

DMA - DMA Crossbar functionality

The DMA engine does not transfer back-to-back packets to the same core. This allows at least 8 clocks to register the proper IO ready lines for the next transfer. The enable lines are also one cycle ahead of the actual transfer so that modules can register or preload data to ease timing considerations.

Tracer - Embedded FPGA debug trace mechanism

If an FPGA core module contains an instance of swrstatdbg, the unit can be tested while live using ICETEST TRACER from the NeXtMidas prompt. See the explain of ICETEST for more info.