The default version of the system consists of a controller, two nodes, a memory controller and a NoC. To keep things simple, only one node is shown in the images. The controller, node and memory controller are also referred to as components: higher-level modules without any logic of their own, which keeps the documentation organized.
Many different licenses are used for the different parts. With the exception of the NoC, everything is licensed under a free-software-compatible license. The license of the NoC restricts its use to research.
1.3. Design decisions
- The whole system should be as easy to understand as possible. Code is often written in a very verbose way and clever tricks are avoided.
- It should be as reusable as possible. Every module of the system has a specific function and could be used in a different system. The additional communication this requires is accepted.
- The NoC bridges are not considered to be part of the NoC, but of the nodes. This should make it easier to swap the NoC for a different one.
- It was made with simulation in mind. There are parts of the system (like the defines_xxx.vh files) that might be cumbersome to use for synthesis especially if the block designer is used.
- For FPGAs the only proprietary software considered is Vivado because as the old saying goes: You either love Vivado or you have never used Quartus Prime before.
1.4. Control scheme
The control is done over the NoC and uses the control module to record which programs the nodes should execute. The control contains a simple 32-bit flag register in which each node is identified by its ID and a 1 means that the node is currently busy. In addition to this flag register, an array holds the programs assigned to the nodes.
Once the system is started, the self-aware modules of the nodes start to read from the control to see if there are any programs their respective CPUs should execute. The controller can set programs for the nodes and once a program has been set the respective node is considered busy until the self-aware module of the node signals the completion.
A program is represented by its starting address in memory, which is returned to the self-aware module in the rdata field. If rdata is 0, no program is set for the node. Address 0 is reserved because it is the start of the controller program.
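The bookkeeping described above can be sketched as a small software model. This is only an illustration and not the HDL of the control module: the names control_t, control_set_program and control_read_program are hypothetical, and NUM_NODES assumes one bit per node in the 32-bit flag register.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_NODES 32u   /* assumption: one bit per node in the 32-bit flag register */

/* Hypothetical software model of the state held by the control module. */
typedef struct {
    uint32_t busy;                 /* bit i is 1 while node i is busy */
    uint32_t program[NUM_NODES];   /* start address per node, 0 = no program set */
} control_t;

/* Controller side: assign a program to a node and mark the node as busy. */
void control_set_program(control_t *c, uint32_t node_id, uint32_t start_addr)
{
    assert(node_id < NUM_NODES);
    assert(start_addr != 0u);      /* 0 is reserved for the controller program */
    c->program[node_id] = start_addr;
    c->busy |= (UINT32_C(1) << node_id);
}

/* Node side: the value the self-aware module would receive in rdata. */
uint32_t control_read_program(const control_t *c, uint32_t node_id)
{
    assert(node_id < NUM_NODES);
    return c->program[node_id];
}
```

A node that reads 0 simply keeps polling; any other value is treated as the start address of its program.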
Once the self-aware module reads a valid address, it turns on the CPU and sets the AXI offset to the program address. This offset is needed as every CPU starts reading from address 0 and each program is compiled the same way. The offset essentially moves all the reads and writes to the right memory space. So instead of reading from 0, the CPU reads from 0+offset.
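As a minimal sketch, the translation is a plain addition. The function name axi_apply_offset is hypothetical, and the addresses in the usage note are made-up examples.

```c
#include <stdint.h>

/* Address seen on the memory side: the address issued by the CPU,
 * shifted by the start address of the program. */
uint32_t axi_apply_offset(uint32_t cpu_addr, uint32_t offset)
{
    return cpu_addr + offset;   /* unsigned arithmetic, wraps modulo 2^32 */
}
```

With an offset of 0x1000, a CPU access to address 0x24 would end up at 0x1024 in memory.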
The same signal that is used to turn on the CPU is also connected to the AXI_joiner and switches the AXI communication from the self-aware module to the CPU.
The CPU is now executing the program. Instead of reading from the control, the CPU is reading from the memory. The AXI_splitter takes care of sending certain requests to the control and some to the memory depending on the address.
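The routing decision of the AXI_splitter can be pictured as a simple address decode. The sketch below is an assumption throughout: the real address map is defined in the source code, and CONTROL_BASE, axi_target_t and axi_split are hypothetical names chosen for illustration.

```c
#include <stdint.h>

/* Hypothetical boundary: addresses at or above it go to the control. */
#define CONTROL_BASE UINT32_C(0xFFFF0000)

typedef enum { TARGET_MEMORY, TARGET_CONTROL } axi_target_t;

/* Decide where a request is sent, based only on its address. */
axi_target_t axi_split(uint32_t addr)
{
    return (addr >= CONTROL_BASE) ? TARGET_CONTROL : TARGET_MEMORY;
}
```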
While the CPU is running, the controller can read from the control to get the state of the nodes. This simply returns the busy flag register. There is no way for the controller to interfere with the execution or to terminate it. It is possible for the controller to set a program for a busy node, but this program will never be executed.
Once the CPU is finished, it writes a certain value to a certain address. This is detected by the aptly named detector, which signals the completion to the self-aware module. The self-aware module turns the CPU off and writes a 0 to the control, representing that there is now no program set for this node. Once this request has been received, the control also clears the flag in the busy register, so the controller will learn of the completion the next time it reads the busy flag register.
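The node side of this scheme can be summarized as a small state model. It is a sketch under assumptions, not the actual self-aware module: FIN_ADDR and FIN_VALUE are placeholders because the text only speaks of "a certain value" and "a certain address", and the names node_t, node_poll, detector_hit and node_observe_write are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholders: the real address and value are defined in the source code. */
#define FIN_ADDR  UINT32_C(0x0000FFF0)
#define FIN_VALUE UINT32_C(0x00000001)

typedef struct {
    bool     cpu_on;      /* the same signal switches the AXI_joiner to the CPU */
    uint32_t axi_offset;  /* start address of the running program */
} node_t;

/* Polling step: rdata is the value read from the control, 0 means no program. */
void node_poll(node_t *n, uint32_t rdata)
{
    if (!n->cpu_on && rdata != 0u) {
        n->axi_offset = rdata;
        n->cpu_on = true;
    }
}

/* Detector: true if a CPU write is the completion signal. */
bool detector_hit(uint32_t addr, uint32_t wdata)
{
    return addr == FIN_ADDR && wdata == FIN_VALUE;
}

/* Observe a CPU write. Returns true when the self-aware module should
 * write 0 to the control to report the completion. */
bool node_observe_write(node_t *n, uint32_t addr, uint32_t wdata)
{
    if (n->cpu_on && detector_hit(addr, wdata)) {
        n->cpu_on = false;
        return true;
    }
    return false;
}
```

In this model a program set for a busy node is never picked up, because node_poll ignores rdata while the CPU is on.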
1.5. Known problems
There are a few problems that are known and tolerated at the moment. If something does not work in a different situation (e.g. on an FPGA), they might be the cause. Some might be related to bugs in the tools and are going to be reported once the issues can be condensed into smaller examples.
- In the controller component one AXI Light interface requires a parameter.
The line in question: if_axi_light #( .AXI_WSTRB_WIDTH(`AXI_WSTRB_WIDTH) ) if_axi_light_debugger();
Without the parameter this causes the following error:
%Error: Internal Error: ../../rtl/controller.sv:40: ../V3LinkDot.cpp:1317: No symbol for interface alias rhs
Solution: Just provide the parameter as this is only redundant information.
- The read_resp task in the AXI Light interface requires non-blocking assignments (<= instead of =).
The lines in question: rresp <= t_rresp; rdata <= t_rdata;
Solution: Ignore for the time being and hope for the best.
Resolved: This problem apparently just disappeared. It was most likely a side effect caused by a different bug.
- The debugging system causes a segmentation fault in the testbench.
In the sim_main.cpp the chars from the debuggers are collected in a string and printed to the terminal once a newline has been received. On one computer this is not possible as the char array causes a segmentation fault.
Solution: Print every char directly without collecting them in a string.
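A minimal sketch of this workaround, written in C so that it also compiles as part of a C++ testbench. The function name debug_put_char and its parameters are hypothetical and not taken from sim_main.cpp.

```c
#include <stdio.h>

/* Forward one character from a debugger to the given stream without
 * buffering it in a string first. Returns 0 on success, -1 on error. */
int debug_put_char(FILE *out, unsigned char c)
{
    if (fputc(c, out) == EOF) {
        return -1;
    }
    if (c == '\n') {
        fflush(out);   /* make complete lines visible immediately */
    }
    return 0;
}
```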
- The debug function print_dec() does not always work.
Solution: Use print_hex instead.
- It is not possible to set the entry point during compilation. The compiler always defaults to calling the main function.
Solution: The function in the assembly startup file has been renamed to main to make sure this one is called. The main in main.c has been renamed to my_main and called in the aforementioned assembly file.
- Stack pointer is used before it is set.
When libraries are linked the stack pointer is used during some initialization before it can be set in the main function located in start.S.
Solution: The stack pointer is set in the CPU and constant for every program.
2.1. RISC-V GNU toolchain
Specifically the toolchain for RV32I.
The Makefiles expect the toolchain to be installed in /opt/riscv32i/. It is advised that the following guide is used for the installation:
For simulating the system.
The newest version available is recommended:
2.3. GTKWave – optional
Used to display the tracefile; only needed for debugging.
Any version your package manager offers should suffice.
The main Makefile can be run from the project root.
| make | compiles the HDL |
| make run | executes the simulation |
| make wave | executes the simulation with a tracefile enabled |
| make clean | removes the compiled simulation environment and any tracefile |
| make sw | compiles all the programs and the controller software |
| make clean_sw | removes all the compiler output of the programs and controller software |
| make programs | compiles all the programs |
| make clean_programs | removes all the compiler output of the programs |
| make controller | compiles the controller software |
| make clean_controller | removes all the compiler output of the controller software |
Located in ./sw/programs
This Makefile is used by all the programs and should not be called from the ./sw/programs directory. Instead, each program directory contains a Makefile where specific additions can be made, such as the inclusion of an additional library.
| make small | compiles the code for a small node (rv32i) |
| make big | compiles the code for a big node (rv32im) |
| make clean | removes compiler output |
This Makefile produces many different files for debugging purposes. The files rv32i_main.hex and rv32im_main.hex are the ones used by the system.
Located in ./sw/controller
Similar to the software Makefile, but kept separate in case the need for a greater difference arises.
In contrast to the AXI Light interface, the NoC one does not contain any tasks at the moment.
There are a few things to keep in mind:
- Due to difficulties with setting the entry point, the main function has to be called my_main at the moment.
- There are various print functions in util.h (found in _libs) that can be used with the debugger. They should be used as little as possible as they can greatly increase the size of the program.
- At the end, the function signal_fin() has to be called to signal to the self_awareness that the program execution is finished. An endless loop afterwards is recommended. This should be moved into start.S in the future.
- The system does not support malloc. There is a library provided called memmgr (found in _libs) that can be used to replace the usual functions. Please have a look at dhrystone to see how it can be used.
- No optimization (-O0) is advised. -O1 optimizes the mul away and anything higher causes EBREAKs / ECALLs. Normally the latter gives control to an underlying system, but as nothing is there the CPU crashes.
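Put together, a program could be structured as in the following sketch. It is not taken from the repository: sum_to is a made-up workload, and signal_fin() is replaced by a stand-in (with an assumed signature and a hypothetical fin_signalled flag) so that the example compiles on its own. On the target, util.h would be included instead of the stand-in.

```c
#include <stdint.h>

/* Stand-in for signal_fin() from util.h, signature assumed.
 * It only records the call so the sketch can be tried on a host. */
volatile int fin_signalled = 0;
void signal_fin(void)
{
    fin_signalled = 1;
}

/* Made-up workload: sum of the integers from 1 to n. */
uint32_t sum_to(uint32_t n)
{
    uint32_t sum = 0u;
    for (uint32_t i = 1u; i <= n; i++) {
        sum += i;
    }
    return sum;
}

/* Entry point, named my_main instead of main as described above. */
int my_main(void)
{
    volatile uint32_t result = sum_to(10u);
    (void)result;

    signal_fin();      /* report the completion */
    for (;;) {
        /* endless loop, the CPU is turned off from the outside */
    }
}
```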
The source code should tell you everything you need to know, especially the defines.h. If anything is unclear, please feel free to contact me.
9. FPGA (using Vivado)
An IP core can be created for each module or for each component. The latter requires less work and results in a clearer block design as fewer boxes have to be connected, but makes debugging using the Integrated Logic Analyzer more difficult.
To package an IP core please follow these steps:
- Create a wrapper that does not contain any interfaces as input or output as this is not allowed by Vivado. A wrapper for the AXI offset is included that you can use as a reference.
- Create a Vivado project and include the following files:
- The module and the wrapper
- Every interface used (found in /rtl/interfaces)
- The define files for the interfaces (found in /configurations/x like “defines_axi.sv”)
- Click on the files and make sure that they are recognized as the correct type under “Type” in “Source File Properties”. Should Vivado complain about assignments this might be the cause of the issue.
- The module, wrapper and interfaces should be “System Verilog”
- The define files should be “Verilog Header”
- Open all the System Verilog files and include the Verilog Headers at the beginning, e.g. `include "defines_axi.sv". If the interfaces are not shown under “Sources” in the “Hierarchy” tab, select the “Library” tab to find them.
- Make sure that every parameter has a default value.
- Synthesize the code to make sure it is working. If the Verilog Headers have not been included correctly, the synthesis might still work. However, you will get an error during the next step.
- Package the IP core with the following recommendation:
- Remove all the memory mapped stuff under “Addressing and Memory”. Vivado likes to assign everything AXI related a memory space. This should not be needed most of the time.
9.1. CONNECT NoC
The CONNECT NoC requires special attention.
- Rename the .hex files to .data.
- Open the mkNetworkSimple.v file where the .hex (now .data) files are read and update the path.
- Create a Vivado project and include all .v and .data files.
- Synthesize the code to make sure it is working.
- Package the IP core.