Why can simulator race conditions not be eliminated by predicting events in advance

matrixofdynamism · Apr 23, 2015

Race conditions could be prevented if the simulator determined the order of parallel events, so why do they still occur?

Fundamentally, race conditions occur in simulation because HDL or HVL code is written to run in parallel via multiple processes (or always blocks in Verilog). In reality, however, those parallel processes run in sequence on a processor using multithreading. It therefore comes down to which of the many parallel processes in the testbench the processor executes first. It is impossible to say this with certainty, and this uncertainty is the nature of multithreaded programs. The actual behaviour for a given testbench also varies between simulators: in some simulators a Verilog initial block may not necessarily run before the Verilog always blocks, while in others the initial block may always run before the always blocks. This makes race conditions elusive, as they may not manifest on all simulators or on all versions of the same simulator.
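
The initial-versus-always case mentioned above can be sketched in SystemVerilog roughly as follows. This is only an illustration: the module and signal names (`initial_vs_always`, `go`, `seen`) are hypothetical and do not come from any real testbench.

```systemverilog
// Hypothetical illustration: names are not from the thread.
module initial_vs_always;
  logic go   = 1'b0;  // declaration initializers run before any process starts
  logic seen = 1'b0;

  // Process A: raises go at time 0 with a blocking assignment.
  initial go = 1'b1;

  // Process B: must already be waiting at @(posedge go) to see the edge.
  always @(posedge go) seen = 1'b1;

  // The printed value depends on whether A or B was scheduled first at time 0.
  initial #1 $display("seen = %b", seen);
endmodule
```

If process A happens to run first, the edge has already passed by the time process B starts waiting, so `seen` stays 0; if B runs first, `seen` becomes 1. One common way to remove the dependence is to write `go <= 1'b1;` in process A, so that the update is deferred until the other processes have reached their event controls.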

In my understanding, if the simulator determined in advance ALL the events that are to occur at time X in the simulation from the different parallel processes, race conditions could easily be prevented. Why do simulators not do this? Since all processes are ultimately part of the same simulation and tied to the same simulation time, why can the simulator not predict which events are expected to occur at what time and thus prevent race conditions?

ads-ee · Apr 23, 2015

Because it is a race condition, which is not predictable. How is the simulator supposed to know the order of execution if you didn't design it to force that order? It basically means you coded it incorrectly in the first place if you expected an execution order for your design but didn't enforce it in the design.

I gather you want a simulator that compensates for poor coding practices and fixes anything it thinks is wrong. That is not a good method of ensuring a quality design. In fact, it's more likely to hide simulation/synthesis mismatches (a really bad trade-off).

matrixofdynamism · Apr 24, 2015

Let's put it another way. Race conditions do not occur in the same way in different simulators and different versions of the same simulator. This much is established. This means that a simulation may give different results across different simulators and different versions of the same simulator. It is also established that, as things stand at present, race conditions cannot be prevented.

One reason race conditions occur in the first place is that the standard for simulators (set by the IEEE, I suppose) is not clear on certain things. The people designing a simulator therefore deal with such missing requirements as they see fit. This results in different simulators interpreting the same code in different ways. Here I am talking about unsynthesizable testbench code and not HDL.

I would expect that for the code to be more portable, the people creating the IEEE standard will update the standard to remove the ambiguities that cause such problems in the first place. How do I ask to find out why they have not done so yet?

What I am wondering is that, in this age when we try to automate things as much as possible and remove human input into a process as much as possible to make it repeatable and thus more reliable, why are people not working to solve this problem also?

Edit:
Race conditions are usually hidden. A person may not be aware that they are occurring in the first place, and that is why they are dangerous.

dave_59 · Apr 26, 2015

You can take both race conditions and X-states as artifacts of digital event simulation. Real hardware does not have signals that transition from 0 to 1 instantaneously, or signals that are initialized to an X state. You are modelling hardware at an abstraction level that ignores many physical characteristics, and most of the time you are doing so before synthesizing that description into a form from which that information could be calculated. And the choice of any abstraction level is a trade-off between accuracy and performance.

And there are many other constructs in programming languages where the results are not deterministic:

Code:

i = 1;
A = func(i++,i++);

This may pass the values 1,2 or 2,1 to the function, depending on the order in which the arguments are evaluated.

Code:

reg t1;
...
{t1,t1} = 2'b10;

This assigns the value 1'b0 or 1'b1 to t1.

Either of these might be considered examples of bad code, but not illegal code. From a language design point of view, it is difficult to specify every possible interaction of language constructs with a deterministic set of rules and still leave room for implementation-specific optimizations. Synthesis tools and other formal linting tools may catch some of these coding errors, but they all do this within the framework of a particular application.

matrixofdynamism · Apr 26, 2015

@dave_59
Is there something wrong in my understanding here? Race conditions exist because simulation contains multiple parallel processes. It is possible that multiple processes actually drive the same signal creating race conditions.

My question was based on the following understanding: even though a simulation contains multiple parallel processes, and it is impossible to know which of them is running at a given time, a simulator will actually figure out in advance which signals will change in the future for each process, and then process these events while it continues to predict future events. Can there not be an "inter-process communication" mechanism intrinsic to the simulator, such that the different initial and always blocks in, let's say, a SystemVerilog simulation are able to determine whether something is being driven from multiple places, and thus prevent race conditions?

If this confusion is resolved then my question is answered.

dave_59 · Apr 27, 2015

For synthesizable code it is easy to determine if there are multiple drivers, but you can still have race conditions without multiple drivers. Imagine the model of a DFF with both asynchronous set and reset:
If R and S both change from 0 (enabled) to 1 (disabled) at the same time, either because of testbench stimulus or because of the behavior of some combinational logic, you have a race condition. In the absence of any timing delay information, it is impossible to predict which event arrives at the DFF first, the R rising or the S rising. So you cannot predict the final state of the DFF Q output, because the simulation may briefly see one signal enabled and the other disabled before both are disabled.
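
A rough SystemVerilog sketch of that situation is shown here. It is an assumption-laden illustration, not the model from the post: the names (`dff_sr_race`, `a`, `r_n`, `s_n`, `q`) are hypothetical, only the asynchronous part of the flop is modelled, and reset is arbitrarily given priority over set.

```systemverilog
// Hypothetical illustration: names are not from the thread.
module dff_sr_race;
  logic a;          // single source that releases both controls
  wire  r_n = a;    // active-low reset (0 = enabled, 1 = disabled)
  wire  s_n = a;    // active-low set   (0 = enabled, 1 = disabled)
  logic q;

  // Asynchronous part of a flop model; the clocked path is omitted.
  always @(r_n or s_n)
    if (!r_n)      q = 1'b0;   // reset has priority (an assumption)
    else if (!s_n) q = 1'b1;

  initial begin
    #1  a = 1'b0;   // both reset and set enabled
    #10 a = 1'b1;   // both disabled in the same time step
    #1  $display("q = %b", q);
    $finish;
  end
endmodule
```

When `a` rises, the updates of `r_n` and `s_n` and the evaluation of the always block are all events in the same time step. If the block is evaluated after `r_n` has risen but before `s_n` has, it sees reset disabled with set still enabled and drives `q` to 1; in any other order `q` stays 0. Nothing in the code says which order is intended.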

You can also have a race condition when one process writes and another process reads the same variable and both processes are synchronized to the same signal. This is where using the non-blocking assignment (NBA) helps to avoid the race. Checking for this during compilation is difficult, however, because it's not always obvious that two processes are synchronized to the same signal. There might also appear to be two different variables that are in fact connected through combinational logic.
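
The write/read case and the NBA remedy can be sketched side by side like this. The module and signal names (`read_write_race`, `a_blk`, `b_blk`, `a_nba`, `b_nba`) are hypothetical and used only for illustration.

```systemverilog
// Hypothetical illustration: names are not from the thread.
module read_write_race (input logic clk, input logic d);
  logic a_blk, b_blk;   // blocking version: racy
  logic a_nba, b_nba;   // nonblocking version: order-independent

  // Writer and reader are separate processes on the same clock edge.
  // b_blk gets the old or the new a_blk depending on which runs first.
  always @(posedge clk) a_blk = d;
  always @(posedge clk) b_blk = a_blk;

  // With NBAs, right-hand sides are sampled first and the updates happen
  // later in the time step, so b_nba always gets the previous a_nba.
  always @(posedge clk) a_nba <= d;
  always @(posedge clk) b_nba <= a_nba;
endmodule
```

The blocking pair behaves like either one or two register stages depending on process order, while the nonblocking pair behaves like a two-stage shift register regardless of order.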

When it comes to testbench code, this gets even harder because many testbenches are written more like software, without any synchronizing clocks. SystemVerilog provides many inter-process communication features like semaphores and mailboxes, but there are so many other ways of creating handshaking protocols that requiring any specific construct would be over-constraining.
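
As one possible illustration of such a feature, here is a minimal sketch of two clockless testbench processes synchronized through a SystemVerilog mailbox. The names (`mailbox_handshake`, `mbx`, `producer`, `consumer`) and the delay value are hypothetical.

```systemverilog
// Hypothetical illustration: names are not from the thread.
module mailbox_handshake;
  mailbox #(int) mbx = new();  // unbounded typed mailbox

  initial begin : producer
    for (int i = 0; i < 3; i++) begin
      #5 mbx.put(i);           // hand a value to the consumer
    end
  end

  initial begin : consumer
    int v;
    repeat (3) begin
      mbx.get(v);              // blocks until a value is available
      $display("%0t: got %0d", $time, v);
    end
  end
endmodule
```

Because `get` blocks until `put` has delivered something, the consumer never reads a value before it has been written, whichever of the two initial blocks the simulator starts first.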

This is a balance between flexibility and letting you shoot yourself in the foot.
