Rerunning the Verilator build without warnings disabled on the new command file now yields just 3 warnings, albeit with quite complex warning messages.
%Warning-UNOPTFLAT: ../local/rtl/verilog/ps2/ps2_top.v:154: Signal unoptimizable : Feedback to clock or circular logic: v->ps2_top.rx_kbd_data_ready %Warning-UNOPTFLAT: Use "/* verilator lint_off UNOPTFLAT */" and lint_on around source to disable this message. %Warning-UNOPTFLAT: Example path: ../local/rtl/verilog/ps2/ps2_top.v:154: v->ps2_top.rx_kbd_data_ready %Warning-UNOPTFLAT: Example path: ../local/rtl/verilog/ps2/ps2_translation_ table.v:310: ASSIGNW %Warning-UNOPTFLAT: Example path: ../local/rtl/verilog/ps2/ps2_top.v:155: v->ps2_top.rx_translated_data_ready %Warning-UNOPTFLAT: Example path: ../local/rtl/verilog/ps2/ps2_wb_if-2.v:68 4: ASSIGNW %Warning-UNOPTFLAT: Example path: ../local/rtl/verilog/ps2/ps2_top.v:156: v->ps2_top.rx_kbd_read_wb %Warning-UNOPTFLAT: Example path: ../local/rtl/verilog/ps2/ps2_keyboard-2.v :429: ALWAYS %Warning-UNOPTFLAT: Example path: ../local/rtl/verilog/ps2/ps2_top.v:154: v->ps2_top.rx_kbd_data_ready %Warning-UNOPTFLAT: ../local/rtl/verilog/or1200/or1200_sprs.v:212: Signal unopti mizable: Feedback to clock or circular logic: v.or1200_top.or1200_cpu->or1200_sp rs.read_spr ...
This is an important warning to address. It is identifying a set of signals which appear to have cyclic dependency—a combinatorial loop. Rather than evaluating the expression in a single step, Verilator will need to iterate until it settles.
Verilator identifies the problem signal, and at least one loop
through which it is being driven. In the first warning in the
example, the problem signal is rx_kbd_data_ready
at line 154 of ps2_top.v
:
wire rx_released, rx_kbd_data_ready, rx_translated_data_ready, ...
The next line of the warning identifies that
rx_kbd_data_ready
is driving
rx_translated_data_read_o
at line 310 of
ps2_translation_table.v
:
assign code_o = translate_i ? {(rx_released_i | ram_out[7]), ram_out[6:0]} : cod e_i ; assign rx_translated_data_ready_o = translate_i ? rx_translated_data_ready : rx_ data_ready_i ; assign rx_read_o = rx_read_i ;
The connection is not immediately
obvious. rx_translated_data_ready_o
is not
directly dependent on rx_kbd_data_ready
. However
this is a different module, and Verilator has flattened the
code. The signal rx_data_ready_i
is an input. In
the instantiation of ps2_translation_table
in
ps2_top.v
, the driving signal,
rx_kbd_data_ready
is the argument used for the
rx_data_ready_i
input:
ps2_translation_table i_ps2_translation_table ( ... .data_o (), .rx_data_ready_i (rx_kbd_data_ready), .rx_translated_data_ready_o (rx_translated_data_ready), ... ) ;
The next line of warning identifies that
rx_translated_data_ready_o
is driving
rx_translated_data_ready
at line 155 of
ps2_top.v
:
wire rx_released, rx_kbd_data_ready, rx_translated_data_ready, rx_kbd_read_wb, rx_kbd_read_tt,
Again the connection is not immediately clear, but the driving
signal (rx_translated_data_ready_o
) is an output
of module ps2_translation_table
. This is
connected to rx_translated_data_ready_o
via its
instantiation in ps2_top.v
:
ps2_translation_table i_ps2_translation_table ( ... .rx_data_ready_i (rx_kbd_data_ready), .rx_translated_data_ready_o (rx_translated_data_ready), .rx_read_i (rx_kbd_read_wb), ... ) ;
The next line of warning identifies that
rx_translated_data_ready
is driving
rx_kbd_read_o
at line 684 of
ps2_wb_if-2.v
(the previously modified version
of ps2_wb_if.v
):
assign rx_kbd_read_o = rx_kbd_data_ready_i && ( enable1 || ( read_input_buffer_reg && input_buffer_full && !input_buffer_filled_from_command `ifdef PS2_AUX && !aux_input_buffer_full `endif ) );
Once again this is a different module, and the connection is through
an input to the module. In this case
rx_translated_data_ready
is passed as input
rx_kbd_data_ready_i
in the instantiation of
ps2_wb_if
in ps2_top.v
:
ps2_wb_if i_ps2_wb_if ( .wb_clk_i (wb_clk_i), .wb_rst_i (wb_rst_i), ... .rx_scancode_i (rx_translated_scan_code), .rx_kbd_data_ready_i (rx_translated_data_ready), .rx_kbd_read_o (rx_kbd_read_wb), ... ) ;
The next line of warning identifies that
rx_kbd_read_o
is driving
rx_kbd_read_wb
at line 156 of
ps2_top.v
:
wire rx_released, rx_kbd_data_ready, rx_translated_data_ready, rx_kbd_read_wb, rx_kbd_read_tt, tx_kbd_write, ...
Once again this is a different module, and the connection is through
an output of the module. In this case
rx_kbd_read_o
is passed as output in the
instantiation of ps2_wb_if
in
ps2_top.v
, where it is connected to
rx_kbd_read_wb
:
ps2_wb_if i_ps2_wb_if ( ... .rx_kbd_data_ready_i (rx_translated_data_ready), .rx_kbd_read_o (rx_kbd_read_wb), .tx_kbd_data_o (tx_kbd_data), ... ) ;
The next line of warning identifies that
rx_kbd_read_wb
in turn drives
rx_read
at line 429 of
ps2_keyboard-2.v
(an already modified version
of ps2_keyboard.v
):
always @(m2_state or rx_output_strobe or rx_read) begin : m2_state_logic case (m2_state) ...
The trail through the flattened modules is a little harder this
time. The driving signal, rx_kbd_read_wb
is an
input (rx_read_i
) to module
ps2_translation_table
. Within that module it
directly drives (via a continuous assignment), the output
rx_read_o
, which is connected to
rx_kbd_read_tt
in ps2_top.v
:
ps2_translation_table i_ps2_translation_table ( ... .rx_translated_data_ready_o (rx_translated_data_ready), .rx_read_i (rx_kbd_read_wb), .rx_read_o (rx_kbd_read_tt), .rx_released_i (rx_released) ) ;
rx_kbd_read_tt
is in turn the
rx_read
input in the instantiation of
ps2_keyboard
:
ps2_keyboard #(`PS2_TIMER_60USEC_VALUE_PP, `PS2_TIMER_60USEC_BITS_PP, `PS2_TIMER _5USEC_VALUE_PP, `PS2_TIMER_5USEC_BITS_PP) i_ps2_keyboard ( ... .rx_data_ready (rx_kbd_data_ready), .rx_read (rx_kbd_read_tt), .tx_data (tx_kbd_data), ... );
The final line of warning, identifies that
rx_read
is in turn a driver of the original
signal, rx_kbd_data_ready
at line 154 of
ps2_top.v
, thus completing the loop.
The connection is through the rx_data_ready
output of ps2_keyboard
instantiated in
ps2_top.v
:
ps2_keyboard #(`PS2_TIMER_60USEC_VALUE_PP, `PS2_TIMER_60USEC_BITS_PP, `PS2_TIMER _5USEC_VALUE_PP, `PS2_TIMER_5USEC_BITS_PP) i_ps2_keyboard ( ... .rx_scan_code (rx_scan_code), .rx_data_ready (rx_kbd_data_ready), .rx_read (rx_kbd_read_tt), ... );
The rx_data_ready
output is driven from within
the always
block dependent on
rx_read
:
always @(m2_state or rx_output_strobe or rx_read) begin : m2_state_logic case (m2_state) m2_rx_data_ready_ack: begin rx_data_ready = #1 1'b0; if (rx_output_strobe) m2_next_state = #1 m2_rx_data_ready; else m2_next_state = #1 m2_rx_data_ready_ack; end m2_rx_data_ready: begin rx_data_ready = #1 1'b1; if (rx_read) m2_next_state = #1 m2_rx_data_ready_ack; else m2_next_state = #1 m2_rx_data_ready; end default : m2_next_state = #1 m2_rx_data_ready_ack; endcase end
The loop is summarized in Figure 7.1.
Combinatorial loops can be down to a number of causes. In many cases there is no loop at the bit level. The dependencies are on different bits in a multi-bit signal, none of which form a loop. Verilator looks for loops on the full register or wire, not the individual bits, so flags a warning. The solution in this case is easy - break the signal apart to its individual components.
But this is not the problem with this ORPSoC example. This is a
single bit signal, and a genuine combinatorial loop (which will
settle, so simulates correctly). The clue to fixing it is in the
combinatorial always
block in
ps2_keyboard-2.v
:
always @(m2_state or rx_output_strobe or rx_read) begin : m2_state_logic case (m2_state) m2_rx_data_ready_ack: begin rx_data_ready = #1 1'b0; if (rx_output_strobe) m2_next_state = #1 m2_rx_data_ready; else m2_next_state = #1 m2_rx_data_ready_ack; end m2_rx_data_ready: begin rx_data_ready = #1 1'b1; if (rx_read) m2_next_state = #1 m2_rx_data_ready_ack; else m2_next_state = #1 m2_rx_data_ready; end default : m2_next_state = #1 m2_rx_data_ready_ack; endcase end
There is no reason in a cycle accurate simulation (where
#
delays are ignored) for
rx_data_ready
to be set inside the
always
block. Ignoring the delay, it is low
when m2_state == m2_rs_data_ready_ack
and high
otherwise. The default
entry is meaningless in
a 2-state simulation, since m2_state
is a 1-bit
register.
The always
block can be written with
rx_data_ready
assigned outside the block, and
the loop is broken:
`ifdef verilator assign rx_data_ready = ~(m2_state == m2_rx_data_ready_ack); // Breaks comb loop `endif always @(m2_state or rx_output_strobe or rx_read) begin : m2_state_logic case (m2_state) m2_rx_data_ready_ack: begin `ifndef verilator rx_data_ready = #1 1'b0; `endif if (rx_output_strobe) m2_next_state = #1 m2_rx_data_ready; else m2_next_state = #1 m2_rx_data_ready_ack; end m2_rx_data_ready: begin `ifndef verilator rx_data_ready = #1 1'b1; `endif if (rx_read) m2_next_state = #1 m2_rx_data_ready_ack; else m2_next_state = #1 m2_rx_data_ready; end default : m2_next_state = #1 m2_rx_data_ready_ack; endcase end
The remaining two loops both concern signals in
or1200_sprs.v
. The command file
cf-optimized-7.scr
has all these problems
fixed. A performance run need not now turn off any warnings.
make clean verilate COMMAND_FILE=cf-optimized-7.scr \ VFLAGS="-language 1364-2001" NUM_RUNS=1000
This run gives a significant performance improvement over previous runs of 47.06 kHz.
This is one of two warnings that is really important to fix. The
other is UNOPT
which occurs when modules have
input and output signals crossing between them, and which does not
occur in ORPSoC.
Even though there were only 3 loops, one of which was in a peripheral with tied-off inputs, fixing this problem gave an 8% performance improvement.