Say your circuit can run at 50 MHz.
What your circuit does is calculate b(n+1) from a.
I don't know what happens to a; is it updated by b?
Anyway, if you can figure out the equation for calculating b(n+2) from a, you can calculate it in two cycles, which is equivalent to the original.
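To make the look-ahead idea concrete, here is a minimal Python sketch. It assumes, purely for illustration, that a is updated by b every cycle and that the one-step equation is a simple linear update; the constants K and C and all function names are hypothetical and not taken from the actual design.

```python
# Hypothetical constants for an illustrative one-step equation.
K, C = 3, 5

def step(a):
    """One-step equation: b(n+1) computed from a."""
    return K * a + C

def step2(a):
    """Look-ahead equation: b(n+2) computed directly from a.
    Derived by substituting step() into itself and folding the constants."""
    return (K * K) * a + (K * C + C)

def run_one_step(a0, cycles):
    """Original circuit: one step per clock cycle."""
    a = a0
    for _ in range(cycles):
        a = step(a)  # assumption: a is updated by b
    return a

def run_two_step(a0, cycles):
    """Look-ahead circuit: one result every two clock cycles."""
    a = a0
    for _ in range(cycles // 2):
        a = step2(a)  # one evaluation covers two original steps
    return a
```

For an even number of cycles both functions end in the same state, which is what "equivalent to the original" means here. Whether a closed form like step2 exists depends on the real equation.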
As a result, the new equation only needs to run at 25 MHz, but normally it can be faster than 25 MHz because there are no registers in the pipeline between the two steps.
Instead of reducing your clock, this part can be constrained as a multi-cycle path, with the output registered every other cycle. Hopefully the circuit can then run at 60 MHz if the equation can run at 30 MHz.
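The arithmetic behind those numbers can be sketched in a few lines of Python. This is only a back-of-the-envelope model: it assumes the look-ahead path is the only critical path and ignores setup time, clock skew and routing, and the function names are made up for the example.

```python
def path_delay_ns(freq_mhz):
    """Combinational delay of a path that can just run at freq_mhz."""
    return 1000.0 / freq_mhz

def max_clock_mhz(equation_mhz, multicycle):
    """Fastest clock if the path is allowed `multicycle` clock periods."""
    clock_period_ns = path_delay_ns(equation_mhz) / multicycle
    return 1000.0 / clock_period_ns

# An equation that closes timing at 30 MHz, given 2 clock periods,
# allows a clock of about 60 MHz.
clock = max_clock_mhz(30.0, 2)
```

Since each look-ahead result covers two original steps and is produced every two cycles, the effective rate is one step per clock, so about 60 MHz compared with the original 50 MHz.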
The other way is to also work out the new equation, but implement it as extra logic next to the original.
Then in every cycle you can calculate both b(n+1) and b(n+2) from a. After that, you may be able to save some clock cycles for doing other things, which may help speed.
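A rough Python model of this second option follows. It again assumes that a is updated by b and uses a made-up linear equation; P, Q and the function names are hypothetical. Both outputs are computed from the same a in one cycle, so the same sequence is produced in half as many cycles.

```python
# Hypothetical constants for an illustrative one-step equation.
P, Q = 2, 1

def one_step(a):
    """Original logic: b(n+1) from a."""
    return P * a + Q

def two_step(a):
    """Extra logic: b(n+2) from the same a (one_step substituted into itself)."""
    return (P * P) * a + (P * Q + Q)

def sequential(a0, steps):
    """Reference: one output per cycle."""
    out, a = [], a0
    for _ in range(steps):
        a = one_step(a)  # assumption: a is updated by b
        out.append(a)
    return out

def parallel(a0, cycles):
    """Two outputs per cycle, both computed from the same a."""
    out, a = [], a0
    for _ in range(cycles):
        b1 = one_step(a)  # b(n+1)
        b2 = two_step(a)  # b(n+2), in parallel with b1
        out += [b1, b2]
        a = b2            # next cycle starts two steps ahead
    return out
```

Here parallel(a0, 4) yields the same eight values as sequential(a0, 8), leaving the other four cycles free for something else.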
Anyway, what I am trying to say is that in extreme cases, when synthesis cannot meet the speed target, we can change the circuit structure, add more parallelism, and do retiming to speed things up.