Synchronous Clock Domains

asdf44 · Dec 10, 2015

Well this is hopefully a dumb question but I'm not finding it spelled out to my satisfaction.

I want to be sure that ISE is handling my synchronous clocks properly for a Spartan 6 design.

I have a 125MHz incoming clock which does the following:
1) Gets used directly for some logic
2) Goes into a DCM block and gets divided by 4 for some other logic
3) #2 goes to a flip-flop based counter divider for other slower logic

The end result I want is that signals on all 3 clocks can cross synchronously within the design. Because asynchronous clock domains are much more interesting, I'm not finding where the behavior regarding these types of synchronous domains is spelled out. The DCM, #2, I trust is taken care of: I can see a timing constraint automatically generated for the divided-down clock, showing that the tool understands both the frequency and phase relationship to #1.

But #3 is where I'm a bit fuzzy on what the limits of the tool are in terms of interpreting the design and constraining it. What I'm assuming is the following:
A) The tool won't necessarily be able to infer the actual speed of clock #3 [I could constrain it manually if I want] but:
B) The tool will apply timing constraints based on input clock #2 which it does know about thus:
C) Timing should be guaranteed between domain #3 and the other two domains.

Is this understanding correct?

ads-ee · Dec 10, 2015

Using divided counter clocks is not a good design practice in FPGAs. Make the output of the counter a pulse and use it as an enable with the divide by 4 clock (2) from the DCM.

If you insist on using the divided clock you will have the difficulty of getting the tools to constrain the relationship of the clocks for 1 & 2 with the clock generated in 3. This is because the skew associated with 3 will have to account for the BUFG skew, the Tco skew, the routing skew from the FF output to the BUFG. You would be better off treating the clock for 3 as asynchronous. I would avoid this all by doing what I first suggested.

If I was in a design review with you, I'd tell you the same thing, and would require that it be changed before the design is approved.
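
A minimal sketch of the enable-pulse approach suggested above, for illustration only. The entity name `ce_gen`, the generic `DIVIDE`, and its default value are assumptions; the thread does not state the actual divide ratio. The counter runs on the DCM divide-by-4 clock (#2) and emits a one-cycle pulse instead of a divided clock:

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity: produces a one-cycle-wide enable pulse every DIVIDE
-- cycles of clk, in place of a flip-flop generated divided clock.
entity ce_gen is
  generic (DIVIDE : positive := 8);  -- assumed ratio, not taken from the thread
  port (
    clk : in  std_logic;             -- e.g. the divide-by-4 clock (#2) from the DCM
    ce  : out std_logic := '0'       -- high for one clk period out of every DIVIDE
  );
end entity ce_gen;

architecture rtl of ce_gen is
  signal count : natural range 0 to DIVIDE - 1 := 0;
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if count = DIVIDE - 1 then
        count <= 0;    -- wrap the counter
        ce    <= '1';  -- pulse for exactly one clk period
      else
        count <= count + 1;
        ce    <= '0';
      end if;
    end if;
  end process;
end architecture rtl;
```

Every register that would have been on clock #3 is then clocked by `clk` and gated by `ce`, so the whole design stays on clocks the tools already understand.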

asdf44 · Dec 11, 2015

ads-ee said:
Using divided counter clocks is not a good design practice in FPGAs. Make the output of the counter a pulse and use it as an enable with the divide by 4 clock (2) from the DCM.

If you insist on using the divided clock you will have the difficulty of getting the tools to constrain the relationship of the clocks for 1 & 2 with the clock generated in 3. This is because the skew associated with 3 will have to account for the BUFG skew, the Tco skew, the routing skew from the FF output to the BUFG. You would be better off treating the clock for 3 as asynchronous. I would avoid this all by doing what I first suggested.

If I was in a design review with you, I'd tell you the same thing, and would require that it be changed before the design is approved.

And more specifically what do you mean by "difficulty"? You mean it's less likely to meet timing or the tool may not understand it and/or be able to constrain it properly?

If the latter, can you describe why not? I understand that the delay and skew you mention eat into the 2->1 or 2->3 timing.

K-J · Dec 11, 2015

asdf44 said:
And more specifically what do you mean by "difficulty"?

'Difficulty' as in 'not possible' to get guaranteed performance.

You mean it's less likely to meet timing or the tool may not understand it and/or be able to constrain it properly?

It is not possible to specify constraints that are achievable.

If the latter, can you describe why not? I understand that the delay and skew you mention eat into the 2->1 or 2->3 timing.

Think about it for a minute. Consider two flip flops (#1 and #2) that are clocked by a clock signal. Flip flop #1 simply toggles and is your 'divided down clock' signal. Now consider flip flop #3 that is clocked by the divided down clock signal with the 'D' input connected to the output of flip flop #2.

Hopefully, you understand that there is some clock to output delay of every flip flop. Make the somewhat reasonable assumption for the moment that the clock to output delay of all the flip flops is roughly the same. That means the output of #1 and #2 will arrive at #3 at approximately the same time. Now ask yourself how do you expect #3 to operate properly when the D input (from #2) is violating either the setup or hold time or both relative to its clock input (from #1).
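
The three-flip-flop arrangement described above might be written as follows. This is only an illustrative sketch: the entity and signal names are made up, and it shows the structure to avoid, not a recommended design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical illustration of the hazard: ff3 is clocked by the output
-- of ff1 while its D input comes from ff2, which changes on the same clk edge.
entity divided_clock_hazard is
  port (
    clk : in  std_logic;
    d   : in  std_logic;
    q   : out std_logic
  );
end entity divided_clock_hazard;

architecture rtl of divided_clock_hazard is
  signal ff1 : std_logic := '0';  -- flip-flop #1: toggles, used as the divided clock
  signal ff2 : std_logic := '0';  -- flip-flop #2: ordinary data register on clk
  signal ff3 : std_logic := '0';  -- flip-flop #3: clocked by ff1, fed by ff2
begin
  process (clk)
  begin
    if rising_edge(clk) then
      ff1 <= not ff1;  -- divided down clock
      ff2 <= d;        -- data launched on the same clk edge
    end if;
  end process;

  process (ff1)
  begin
    if rising_edge(ff1) then
      ff3 <= ff2;      -- clock and data both move one clock-to-out after clk
    end if;
  end process;

  q <= ff3;
end architecture rtl;
```

In a zero-delay RTL simulation, `ff3` picks up the new value of `ff2` because of delta-cycle ordering, so the simulation looks deterministic. In hardware the result depends on the relative clock-to-out and routing delays, which is the setup/hold problem described above.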

A workaround that is not viable in FPGAs would be to delay the output of #2 so that it arrives at #3 long enough after the clock arrives. You have no way to get this type of behavior to be guaranteed.

Now ask yourself, why do you think you even need a divided down clock in the first place? It doesn't save logic, it degrades performance, and in an FPGA it would not save power. You gain nothing but making your design fragile and you've designed in a failure mechanism since the design would likely not work reliably over full temperature and voltage range. Good luck explaining that one.

Kevin Jennings

vGoodtimes · Dec 11, 2015

@asdf44:
You can use a divided clock if you really want. For signals that cross domains, you use a clock enable in the fast domain and then over-constrain the logic in the slow domain.

For example, the fast clock domain knows where the "falling edge" of the slower clock will be. You can register the signals to/from the slow domain at this time. The constraint for the slow domain can probably be highly overconstrained without issue.

It is more popular to use a clock enable in the fast-clock domain and then set a multi-cycle constraint, as ads-ee mentions. That would also be my preferred choice.

std_match · Dec 11, 2015

vGoodtimes said:
For example, the fast clock domain knows where the "falling edge" of the slower clock will be. You can register the signals to/from the slow domain at this time.

The suggestion above does not fulfill the following requirement:

asdf44 said:
The end result I want is that signals on all 3 clocks can cross synchronously within the design.

To use a divided clock is a bad idea, and there is no reason to do it. It is also probably impossible in this case.
The correct solution is to use one of the existing clocks (probably the slowest) together with a clock enable.

The only effect is that the source code for clock #3 must be written like this:

Code:

process(clock)
begin
  if rising_edge(clock) then
    if clock_enable3 = '1' then
      -- code for clock #3 here
    end if;
  end if;
end process;

The clock #3 code will not be a separate domain, so try first with no constraint. If successful, signals can be connected freely without synchronization.
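
As a rough sketch of what "connected freely" can look like, the following moves data from full-rate logic into the enabled logic and back again on one clock. The entity name, port names other than `clock` and `clock_enable3`, and the 8-bit width are assumptions made for the example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example: "slow" logic gated by clock_enable3 exchanging
-- data with full-rate logic on the same clock, with no synchronizers.
entity single_clock_crossing is
  port (
    clock         : in  std_logic;
    clock_enable3 : in  std_logic;             -- one-cycle pulse at the slow rate
    fast_in       : in  unsigned(7 downto 0);  -- produced by full-rate logic
    fast_out      : out unsigned(7 downto 0)   -- consumed by full-rate logic
  );
end entity single_clock_crossing;

architecture rtl of single_clock_crossing is
  signal slow_reg : unsigned(7 downto 0) := (others => '0');
begin
  -- "Clock #3" logic: updates only on cycles where the enable is high.
  process (clock)
  begin
    if rising_edge(clock) then
      if clock_enable3 = '1' then
        slow_reg <= fast_in;  -- fast-to-slow: sampled directly
      end if;
    end if;
  end process;

  -- Full-rate logic: reads the slow register directly every cycle.
  process (clock)
  begin
    if rising_edge(clock) then
      fast_out <= slow_reg + 1;  -- slow-to-fast: ordinary FF-to-FF path
    end if;
  end process;
end architecture rtl;
```

Both paths are ordinary register-to-register paths on `clock`, so the existing period constraint covers them.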

asdf44 · Dec 15, 2015

K-J said:
'Difficulty' as in 'not possible' to get guaranteed performance.
It is not possible to specify constraints that are achievable.
Think about it for a minute. Consider two flip flops (#1 and #2) that are clocked by a clock signal. Flip flop #1 simply toggles and is your 'divided down clock' signal. Now consider flip flop #3 that is clocked by the divided down clock signal with the 'D' input connected to the output of flip flop #2.

Hopefully, you understand that there is some clock to output delay of every flip flop. Make the somewhat reasonable assumption for the moment that the clock to output delay of all the flip flops is roughly the same. That means the output of #1 and #2 will arrive at #3 at approximately the same time. Now ask yourself how do you expect #3 to operate properly when the D input (from #2) is violating either the setup or hold time or both relative to its clock input (from #1).

A workaround that is not viable in FPGAs would be to delay the output of #2 so that it arrives at #3 long enough after the clock arrives. You have no way to get this type of behavior to be guaranteed.

Now ask yourself, why do you think you even need a divided down clock in the first place? It doesn't save logic, it degrades performance, and in an FPGA it would not save power. You gain nothing but making your design fragile and you've designed in a failure mechanism since the design would likely not work reliably over full temperature and voltage range. Good luck explaining that one.

Kevin Jennings

Thank you, the problem makes perfect sense. I had been primarily considering the 3->2 and 3->1 paths where the added clock-to-out delay of clk2->clk3 is just added delay in general. But the problem of skew when going 2->3 or 1->3 is the more challenging problem.

But my final question is why can't I rely on the tool to flag these violations? It's already analyzing clock delays and clock skews and in this case it has all the information needed to "see" the added skew between 2 and 3 since it's just regular clock-to-out delay.

Though I'm sold on transitioning to a clk_en model for slower clocks. Thank you.

My general situation is that I'm inheriting a design where I trust that the designer handled clock domain transitions correctly but I don't like the macro decisions. There are actually 4 #2's related to 4 A/D channels which are passed around the design as independent domains. Every module that uses the data also needs the associated clock and independently handles the domain crossing down at the lowest level. And likewise with another incoming data bus and an SPI bus which cross domains at the point of use instead of the top level. The design needs a number of changes, so I'm looking to simplify the overall architecture.

I'm planning on either slowing down the A/D's to the point where I can clock them in synchronously, or implementing a proper transition at the top level so lower level modules aren't burdened by the transitions themselves (which I think is a better choice of abstraction). Likewise with the SPI bus, where I have plenty of room for oversampling, and the other incoming data bus.

K-J · Dec 15, 2015

asdf44 said:
But my final question is why can't I rely on the tool to flag these violations? It's already analyzing clock delays and clock skews and in this case it has all the information needed to "see" the added skew between 2 and 3 since it's just regular clock-to-out delay.

I don't know of any reason why the timing analysis tool will not flag the violations unless the timing constraints specifically say to not analyze paths starting in one domain and ending in another. In Quartus (Altera), I think not analyzing is the default which is one that I always change so that it does analyze across the domains.

Kevin Jennings

ads-ee · Dec 15, 2015

asdf44 said:
But my final question is why can't I rely on the tool to flag these violations? It's already analyzing clock delays and clock skews and in this case it has all the information needed to "see" the added skew between 2 and 3 since it's just regular clock-to-out delay.

The problem with logic generated clocks (output of a FF) is the delay from the clock input of the FF to the Q output of the FF added to the delay of the logic generated clock network to the setup time of the FF in the original clock domain.

Paths like this are the problem...

The clock to out of the FF on the left won't be accounted for unless you make specific constraints for min and max Tco and routing delay to the middle FF.

The constraint system is meant to check for FF to FF setup and hold on the same clock domain, from an input to a FF setup and hold, and from an output FF to an external device's setup and hold.

Using logic generated clocks from FFs requires the timing of the clock to out of the FF generating the clock output be added to the path delay for the clock (if you want the entire design to stay synchronous). I never code this way so I'm not an expert on the constraints to use, but I think you have to add a constraint for every FF that starts in one domain that crosses over to the other domain and is then used back in the first domain.

asdf44 · Dec 15, 2015

K-J said:
I don't know of any reason why the timing analysis tool will not flag the violations unless the timing constraints specifically say to not analyze paths starting in one domain and ending in another. In Quartus (Altera), I think not analyzing is the default which is one that I always change so that it does analyze across the domains.

Kevin Jennings

Ok I'll have to check this. I do know that the tool will flag violations between my #1 and #2 above because I've seen it, and had to fix a bunch. But using a DCM is a different story.

ads-ee said:
The problem with logic generated clocks (output of a FF) is the delay from the clock input of the FF to the Q output of the FF added to the delay of the logic generated clock network to the setup time of the FF in the original clock domain.

Paths like this are the problem...
View attachment 124166

The clock to out of the FF on the left won't be accounted for unless you make specific constraints for min and max Tco and routing delay to the middle FF.

The constraint system is meant to check for FF to FF setup and hold on the same clock domain, from an input to a FF setup and hold, and from an output FF to an external device's setup and hold.

Using logic generated clocks from FFs requires the timing of the clock to out of the FF generating the clock output be added to the path delay for the clock (if you want the entire design to stay synchronous). I never code this way so I'm not an expert on the constraints to use, but I think you have to add a constraint for every FF that starts in one domain that crosses over to the other domain and is then used back in the first domain.

Why not? The normal scenario in an FPGA is that you have FF -> [black box with delay] -> FF. In this case the black box has the clock-to-out delay of the middle flip-flop instead of combinational logic but that doesn't seem unreasonable. And the normal scenario requires knowing the clock-to-out delay of the leftmost flip-flop anyway.

ads-ee · Dec 15, 2015

asdf44 said:
Why not? The normal scenario in an FPGA is that you have FF -> [black box with delay] -> FF. In this case the black box has the clock-to-out delay of the middle flip-flop instead of combinational logic but that doesn't seem unreasonable. And the normal scenario requires knowing the clock-to-out delay of the leftmost flip-flop anyway.

The key is that the timing tools don't time through a FF to a clock input, out to the next FF output, and on to the setup of another FF. That is not a normal timing path. Look at any timing report and you won't find the path I drew anywhere. Try an example design and check the timing report; you won't see the path I show in the picture.

And why are you bringing up FF to FF? I already stated that is the NORMAL way tools analyze a design. You obviously have a preconceived notion of what the tools are supposed to do, so I'll just stop trying to explain.
