Skip to main content Link Search Menu Expand Document (external link)

50.002 Computation Structures
Information Systems Technology and Design
Singapore University of Technology and Design

Debugging Strategies

This notes contain a collection of debugging strategies depending on the severity of your bugs. Refer to each section depending on your needs.

Could not compile the project

It is ideal that you incrementally compile your code to ensure that it is compilable (despite running on simulation!). If you already have a large codebase that is not compilable, do the following instead of trying to debug (for the interest of time):

  1. Create a NEW IO base project
  2. For each module in your project, create it in your new project with the same interface only without implementation
  3. Then paste the implementation of ONE of leaf modules only (the modules that do not instantiate or depend on another module), and test compile
  4. Once it compiles successfully, repeatedly do (3) until you have no more leaf modules, and then proceed with the next level

For instance, let’s say you have a module with the following dependencies:

alchitry_top
├── datapath
│   ├── alu
│   │   ├── boolean
│   │   ├── adder
│   │   ├── shifter
│   │   └── compare
│   ├── regfiles
│   └── control_unit
└── driver

You shall create each module (datapath, alu, etc) but keep the interface only (not implementation!): input, output only. In the always body, set all its output to 0 to silence the errors. You will have something like this, similar to your lab starter code:

module alu (
    input a[32],
    input b[32],
    input alufn[6],
    output out[32],
    output z,
    output v,
    output n
) {
    
    always {

        out = 0 
        z = 0
        v = 0 
        n = 0    
    }
}

Afterwards, paste the implementation of boolean, and compile. Once successful, you may proceed with shifter, then compile. Repeat this one at a time. Once all leaf modules are successfully compiled (boolean, adder, shifter, compare, regfiles, control_unit), you can begin pasting the implementation of the next level module like alu and datapath.

The goal of this protocol is to ensure that the project compiles. We do not know at this point if it is functional or not because we have not test it.

This method is guaranteed to help you pinpoint which module is problematic.

Inspect runme.log

Once you found the problematic module, fix it accordingly. You can open Vivado log file under build/vivado/[PROJECT_NAME].runs/impl_1/runme.log with a text editor and search for ERROR. That should give you a clue on what went wrong. If still unsure after reading the errors, post your question on ed discussion.

Unconstrained Port Error

This happens when you declare a port in the top module but did not declare it in the constraint file. Alchitry Labs should have warned you about it.

This is Alchitry telling you that: “You said you wanted an output called data, but you never told me which physical pins to use on the FPGA!”. Edit your constraint file to add the corresponding pin. If you don’t know what constraint file means, give this guide a read.

7 Segment / LED does not light up

Test Whether Hardware Works

You need to first test if your hardware is not damaged by connecting it to VDD and ground on a breadboard with resistor (any resistor is better than nothing, just pick one).

+---------------------- Breadboard ------------------------+
|                                                          |
|  VDD (3.3V or 5V)   GND                                  |
|     |               |                                    |
|     |               |                                    |
|     |               |                                    |
|     |               |                                    |
|     |               |                                    |
|     +--->|──[ R ]───+                                    |
|          LED     330Ω–1kΩ                                |
|                                                          |
|       ↳ Long leg (anode, +) goes to VDD via resistor     |
|       ↳ Short leg (cathode, –) goes to GND               |
+----------------------------------------------------------+

You can also use this method to test individual segments of a 7-segment display, one at a time:

  • For common cathode, connect the common pin to GND (segment pins to VDD).
  • For common anode, connect the common pin to VDD (segment pins to GND).

This test also helps you double confirm the function of each pin on your 7segment.

Test whether GPIO port on FPGA outputs signal properly

Create a project that drives logic high (1) to every pin you want to use. Then connect each GPIO to an LED + Resistor one at a time.

FPGA GPIO pin ───[ 330Ω ]───>│─── GND
                           LED

If the LED lights up, you know that pin works fine.

Test Static Output

Once the previous two tests are passed and you know the issue is not the hardware nor the GPIO pins, then the issue must be either your pinout or your code or both. Before you start debugging, double check your knowledge by outputting a static value, e.g: turning all LEDs blue, or displaying a fixed digit on your 7segment.

This ensures that your pinouts are correct. Once you are able to output a static value, you can be sure that your hardware, GPIO pins, and knowledge on how to drive the output works fine. This means that the issue is in your code. At this point, you might want to spend some time re-implementing certain logic.

Test Dynamic Output

If you reach this step, it means that there’s certain flaws in your HDL code. You may want to now display dynamic values to your 7 segment (not your project!). One simple idea is to:

  1. Create a dff to store binary values from your io_dip whenever you press an io_button
  2. This dff is read by bin_to_dec module (if using 7 segment)
  3. The output of bin_to_dec module drives both the internal 7segment and external 7segment
  4. Check if the numbers you set from io_dip is displayed on the 7segment
  5. If you are just using plain LEDs, check if the LEDs now show the pattern stored in your dff
  6. For ease of debugging, also output the content of the dff on io_led

Using io_top LEDs, dip switches, and buttons ensure that you have a good source of truth, without having to care about wiring connections or pinouts.

Code Works on Simulation but not on Hardware: Outputs flickering, Buttons unstable

Ensure pulldown Constraints

If you observe your code goes haywire whenever you hover on io_button on the physical FPGA, it is likely that you did not utilise the pulldown constraints. Consult this handout on how to fix it.

Connect Everything to a Common Ground

If you are powering your LED matrix, 7-segment display, or any external components using a battery or external power supply, you must connect the ground (GND) of that power source to the FPGA board’s ground.

Why?

Digital signals (like HIGH or LOW) are measured relative to ground. Without a shared ground, the FPGA and external circuit don’t agree on what “0V” means. This can cause unpredictable behavior, no response from the display or potential hardware damage.

Bottomline: always connect the external GND to the FPGA GND pin when using external power.

Failed to meet timing constraint

If Alchitry reported that you have failed to meet timing constraint, consider lowering the clock signal stated in alchitry.acf to 10Mhz or less. This means that your design has probably violated the t2 timing constraint, where the clock period is too fast for all combinational units (tpd) in-between. Lowering your clock to 10MHz is still fast enough for our 1D project, do not worry.

Do not attempt to use a project that fails the timing constraint. It might output unpredictable behavior.

Code Does not Work on Simulation but no Error Reported

The issue might be with your FSM or datapath connection. In order to find out, you need to build a debug infrastructure. Consult the next section.

Code Works on Simulation but not on Hardware: FSM does not go to expected state

This is the hardest bug to fix. It is likely that you encounter timing issues at this point because simulation does not consider timings.

In order to begin debugging, you need to write some kind of debug infrastructure. Ideally, you want to have an FSM that:

  1. Can be advanced by a button click state by state instead of letting it run, but this feature can be easily disabled later on (for final build)
  2. Can output all kinds of important signals to io_led, such as alu’s output, asel/bsel output, wdsel output, or regfile read data output in your 1D project

Remove any Division

Remove code that uses division /. Do not use expressions like a / b where b is not a power of 2.

Division in hardware is complex: it requires a divider circuit, which is large, slow, and significantly more complicated than an adder or shifter.

Most hardware division uses restoring or non-restoring algorithms, or iterative subtraction, which take multiple clock cycles. This introduces sequential logic into your ALU, instead of keeping it purely combinational. As a result, it can mess up your timing, increase your critical path delay, and make synthesis and debugging harder.

If you need to divide by a power of 2, use right shifts instead:

  • a >> n is equivalent to \(\frac{a}{2^n}\), and is synthesized efficiently with simple wiring logic.

In short: stick to shifts for division, unless you really know what you’re doing and have the extra cycles (literally). Once you have done this, recompile the project and test.

Ensure Timing Constraint is Met

Ensure you do not see any warning as such when you compile your project:

If you do, reduce the onboard clock frequency specified in alchitry.acf. See this section for more information.

Manually Advancing the FSM

If none of the above works, you should debug your FSM state-by-state.

This part is crucial for debugging, because it helps you narrow down the problematic states. You can start by using one of the buttons, e.g: io_button[0] as an “advance” or “next” button. As usual, pass it through a button conditioner and edge detector:

    // in alchitry_top
    button_conditioner advance_button_conditioner(#CLK_FREQ(CLK_FREQ), .in(io_button[0]))
    edge_detector advance_button_edge(#RISE(1), #FALL(0), .in(advance_button_conditioner.out))

    // if instantiated in other module, simply pass advance_button_edge.out to it 
    game_cu(
        // other port connection
        .fsm_clock(advance_button_edge.out)
    )

Then receive the button edge input in your module that contains the game FSM (e.g; the Control Unit module). You then should declare a sig to post-process this button press, and prepare a debug dff as well as debug_out port in this module. The job of the debug output is to indicate which state you are currently at.

module game_cu (
    input clk,  // clock
    input rst,  // reset
    input fsm_clock, 
    // other ports
    ...

    output state_debug_out[8] 
){
    // module instantiation 
    // other modules here 

    dff debug[8](.clk(clk), .rst(rst), #INIT(0))
    sig advance 

    always{
        advance = fsm_clock
        debug.d = debug.q
        state_debug_out = debug.q 

        // other code
    }
}

In the FSM implementation, set debug.d into different combination depending on the states we are in, and set the FSM to go to next state only when advance == 1. For instance:

    States.SET_POINTS: 
        alufn = b011010 // "A" 
        asel = b10 // Decimal 1
        wdsel = b000 // Route ALU's output
        regfile_wa = d4 
        if(advance){
            regfile_we = 1
            game_fsm.d = States.RESET_TIMER                    
        }
        debug.d = 8d2 // unique debug signal when we are at SET_POINTS

    States.RESET_TIMER: 
        
        alufn = b011010 // "A" 
        asel = b11 // Decimal 25 
        wdsel = b000 // Route ALU's output    
        
        regfile_wa = d1 
        if(advance){
            regfile_we = 1 
            game_fsm.d = States.SPAWN_MOLE
        }
        debug.d = 8d4 // unique debug signal when we are at `RESET_TIMER`

It is important that regfile_we or any write enable signal is put under the advance clause as shown above. This is because we do not want our regfile to be repeatedly overwritten.

The following state will not work as expected if regfile_we is put outside the if block:

    States.UPDATE_SCORE:
        alufn = b000000 //  +
        regfile_ra1 = d3 // currentScore dff 
        regfile_ra2 = d4 // scoreToAdd dff 
        asel = b00 // select rd1
        bsel = b00 // select rd2
        regfile_we = 1 // WRONG! we should NOT be outside advance  
        regfile_wa = d3 // currentScore dff
        if(advance){
            game_fsm.d = States.RESET_TIMER
            
        }
        debug.d = 8d8

Since advance will only be 1 if we press the io_button, then we will loop at this state for millions of loops before we press the advance button. We will end up repeatedly adding the content of currentScore dff with scoreToAdd dff for a million times. Your final score will not reflect the true value at this state.

Instead, do this:

    States.UPDATE_SCORE:
        alufn = b000000 //  +
        regfile_ra1 = d3 // currentScore dff 
        regfile_ra2 = d4 // scoreToAdd dff 
        asel = b00 // select rd1
        bsel = b00 // select rd2
        regfile_wa = d3 // currentScore dff
        if(advance){
            regfile_we = 1 // CORRECT! write ONCE only when advance is 1
            game_fsm.d = States.RESET_TIMER   
        }
        debug.d = 8d8

This ensured that we are only writing to the regfile once, that is in the clock cycle where advance is 1.

With this setup, you should be able to now manually advance your FSM with a click of a button.

The IDLE State: leave as-is

Howevere you cannot do this in the state where you are waiting for another button edge press, or some slow clock edge. One example would be the IDLE state or some kind of WAIT_INPUT state. For instance, do not add the advance button functionality on either of these states:

            States.START: 
                if (start_button_edge_pressed) {
                    game_fsm.d = States.INITIALIZE
                }
                
                else {
                    game_fsm.d = States.START
                }
                debug.d = 8d1

            States.IDLE: // wait for input
                debug.d = 8d5
                if (decrease_timer) {
                    game_fsm.d = States.CHECK_TIMER
                }
                
                else if (button_p1_edge_pressed) {  
                    regfile_we = 1
                    regfile_wa = d2 
                    wdsel_out = b010 
                    game_fsm.d = States.REGISTER_P1_MOVE
                }

                else if (button_p2_edge_pressed) {  
                    regfile_we = 1
                    regfile_wa = d3
                    wdsel_out = b010 
                    game_fsm.d = States.REGISTER_P2_MOVE
                } 

                else {
                    game_fsm.d = States.IDLE   
                }
            

This is because you won’t be able to press two buttons exactly at the same clock cycle. If you do the START state as such, you might never be able to go to the INITIALIZE state very easily.

    States.START: 
        if (start_button_edge_pressed) {
            if (advance) game_fsm.d = States.INITIALIZE // very unlikely to happen
        }
        
        else {
            game_fsm.d = States.START
        }
        debug.d = 8d1

Display All Crucial Information

Once you are able to make your FSM advance manually using button presses, you need to create an infrastructure in alchitry_top so that you can inspect all sort of signal values in that particular states. You can utilise io_dip for this, just like how we set the output to debug Lab 4: Beta. You should inspect all kinds of crucial signals: the output of asel/bsel, the output of wdsel, output of ALU, output of important dffs, etc.

For instance, your datapath module should output the following (and more):

module game_datapath (
    input clk,  // clock
    input rst,  // reset
    
    // input and output of datapath 
    
    // debug output
    output p1_score_dff_debug[16],
    output p2_score_dff_debug[16],
    output rd1_out_debug[16],
    output rd2_out_debug[16],
    output current_timer_dff_debug[16],
    output alu_out_debug[16],
    output wdsel_out_debug[16],
    output bsel_out_debug[16],
    output asel_out_debug[16],
    output control_signals_debug[16], // asel(3) bsel(3) wdsel(3) ra1(3) ra2(3), ignore MSB
    output state_out_debug[16]
    
) {
 // implementation
}

You would need to create intermediary signals to route it both to output and the corresponding module. For example:

    sig asel_out[32]

    // in always body
        case(control_system.asel){
            b0:
                asel_out = regfile_system.rd1
            b1:
                asel_out = 32d30 // constant 30
            default:
                asel_out = regfile_system.rd1
        }

        // hook it up to the output port 
        asel_out_debug = asel_out

        // also connect it to alu  
        alu.a = asel_out 

Do not do the following:

        // HARD TO DEBUG!
        case(control_system.asel){
            b0:
                alu.a = regfile_system.rd1
            b1:
                alu.a = 32d30 // constant 30
            default:
                alu.a = regfile_system.rd1
        }

Consult Lab 4: Beta code for inspiration.

Then in alchitry_top, utilize the dip switches to show different debug values. For instance:

case (io_dip[0]){

            h00:
                io_led[1:0] = $build(datapath.state_out_debug, 2)
                seg.values = $build(datapath.state_out_debug, 4)
            h01: 
                io_led[1:0] = $build(datapath.asel_out_debug, 2)
                seg.values = $build(datapath.asel_out_debug, 4)
            h02: 
                io_led[1:0] = $build(datapath.bsel_out_debug, 2)
                 seg.values = $build(datapath.bsel_out_debug, 4)
            h03: 
                io_led[1:0] = $build(datapath.wdsel_out_debug, 2)
                seg.values = $build(datapath.wdsel_out_debug, 4)
            h04: 
                io_led[1:0] = $build(datapath.alu_out_debug, 2)
                seg.values = $build(datapath.alu_out_debug, 4)
            h05: 
                io_led[1:0] = $build(datapath.rd2_out_debug, 2)
                seg.values = $build(datapath.rd2_out_debug, 4)
            ...
           // and so on, for each of debug signal from datapath  

    }

Build and Debug

With the debug infrastructure in place, you can safely trace the error both on simulation (if you didn’t use any BRAM) and on the hardware. This will give you a clue whether:

  1. Are any states problematic? (did not go to the right state in hardware but go to the right state in simulation)
  2. Is a particular state triggered? (useful to check if simulation does not work either)

Debug: Replace the ALU (Temporarily)

Once you have identified the problematic state, you can temporarily replace your ALU with instructor ALU and see if the problem persists.

Debug: Use Lucid Operators Directly in a State (Temporarily)

In a problematic state, you may try to use Lucid Operators directly such as (assume control unit directly gets RD2 from regfile):

    States.CHECK_PRESS_P1:
                    
        regfile_ra2 = d0 // P1 button press
        
        if(regfile_rd2_out < 3){ // use lucid operator, bypass ALU 
            game_fsm.d = States.INCREASE_SCORE_P1
        }
        else{
            game_fsm.d = States.IDLE // ignore button press
        }
        debug.d = 8d6

Some States (not all) does not Transition as Expected

Delay your FSM Transition

This can happen when you accidentally add sequential logic in your ALU, that is your ALU requires several cycles before it can output a valid value. This can happen for instance if you insist on adding a Division module. As a result, your ALU is unable to output the correct value in the same clock cycle, so the following state won’t behave as expected, that is we won’t store the right CMPEQ result into R6.

    States.CHECK_SCORE:    
        alufn = b110011 // CMPEQ
        regfile_ra1 = d1 // score dff
        asel = b00 // rd1 to alu 
        bsel = b01  // constant 10
        regfile_we = 1 
        regfile_wa = d6 // temp dff
        game_fsm.d = States.BRANCH_SCORE                    

You should have fixed your ALU to be able to output a stable result within 1 period of FSM clock (remove any sequential component). However, if you need to have any sequential component in the ALU, then you will need to wait several cycles in this state before advancing to the next state, e.g BRANCH_SCORE.

You can do this by declaring a temp dff in your control unit / fsm module:

    dff state_delay[10](.clk(clk), .rst(rst), #INIT(0))

And then utilise this in the state to introduce delay:

    States.CHECK_SCORE:    
        // increment state_delay 
        state_delay.d = state_delay.q + 1
        alufn = b110011 // CMPEQ
        regfile_ra1 = d1 // score dff
        asel = b00 // rd1 to alu 
        bsel = b01  // constant 10
        regfile_wa = d6 // temp dff

        if (&state_delay.q){
            regfile_we = 1  // write only once at the end of the delay cycle
            state_delay.d = 0 // reset state_delay
            game_fsm.d = States.BRANCH_SCORE // go to next state              
        }

This way, the FSM maintains the same control signal for 1024 onboard clock cycles and then only capture the output at the the 1024th cycle, while transitioning to the next state. This should give enough time for the ALU output to settle. You should adjust the size of the dff accordingly.

Adding combinational elements to an ALU is not recommended. An ALU should be purely combinational, at least the ALU in this course. Only do this if you are sure with what you intend to do.

Display Matrix Does not Work

You need to first ensure that your game logic is solid on both hardware (io top) and simulation. Once you have done this, you might realise that your display matrix did not display what you wanted. The issue might either be in your hardware or your driver.

First, check that your matrix works using instructor’s driver demo (WS2812B or HUB75 style matrix). The code should work as-is without any modification.

If it does not work, then check your connection, or it is likely that your hardware has spoiled. You might need to purchase a backup hardware.

Otherwise, if it works, then you would need to slowly trace your code again and ensure it outputs the right thing. Consider the following infrastructure that highlights separation of concern.

[Datapath] ---> (writes) ---> [Display Buffer] ---> (reads) ---> [Display Driver] ---> [LED Matrix]

Create a Display Buffer

Use a dedicated register (DFF) or memory module (e.g., BRAM, simple_ram, or simple_dual_ram) to act as a display buffer. This module’s only purpose is to store the current screen state, nothing more.

Display Driver: Read-Only Access

The display driver should only read from the display buffer. It should never perform logic or calculations. Think of it as a “dumb” renderer, it simply reads what’s in the buffer and sends it to the display hardware.

Datapath: Write-Only Access

Your logic (Datapath) should only write to the display buffer. All computations happen elsewhere typically in your REGFILE or control logic. The display buffer just holds the final output to be shown.

This separation of concerns makes your design easier to debug and reason about and avoids conflicts between components trying to read/write the same data at the same time.

Summary

These debug strategies generally work to catch any bug with your 50.002 1D project within a reasonable time, but they are not magic tools. Debugging HDL requires patience and discipline. Sometimes, it might be easier to remake the project from scratch and does incremental test (both on simulator and hardware). Good luck!