[SOLVED] VHDL Error: Process must contain only 1 wait statement

IckyT2012

I'm attempting to write a piece of code whose essential function is to count down toward zero upon receipt of an external pseudo "clock" signal. I'm receiving the error:
Error (10398 ) : VHDL Process Statement error: Process Statement must contain only one Wait Statement
I cannot figure out what Quartus is interpreting as the second wait statement. I've simplified the code down to just the section giving me trouble. How can I modify this code to remove the error but retain identical function?

PROCESS
BEGIN
  WHILE Xcount > "0000000000" LOOP
    WAIT UNTIL XHallA = '1';
    Xcount <= Xcount - 1;
  END LOOP;
END PROCESS;

--Thanks in advance
 

Try adding a "wait;" line just before the "End process" statement.
 
When loops are synthesised, they are unrolled into parallel hardware. When this loop is unrolled, it needs one wait statement per iteration, which gives you more than one wait per process.

There are 2 main problems with your code here:
1. a while loop
2. a wait statement.

Remove both, read a book on digital design with VHDL and then try and re-write the VHDL.
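
For readers following along, the countdown in the first post can be written without a loop or a wait statement. The following is only a sketch: it assumes a free-running system clock `clk`, a `load` strobe and a `load_value` input, none of which appear in the original post. Only `XHallA` and the 10-bit `Xcount` come from it.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity hall_down_counter is
  port (
    clk        : in  std_logic;             -- assumed system clock (not in the original post)
    load       : in  std_logic;             -- assumed strobe that loads a new start value
    load_value : in  unsigned(9 downto 0);  -- assumed start value
    XHallA     : in  std_logic;             -- encoder pulse from the original post
    Xcount     : out unsigned(9 downto 0)
  );
end entity hall_down_counter;

architecture rtl of hall_down_counter is
  signal hall_sync : std_logic_vector(2 downto 0) := (others => '0');
  signal count     : unsigned(9 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- shift the external pulse through three flip-flops to synchronise it
      hall_sync <= hall_sync(1 downto 0) & XHallA;

      if load = '1' then
        count <= load_value;
      elsif hall_sync(2) = '0' and hall_sync(1) = '1' and count /= 0 then
        -- rising edge of the synchronised pulse: count down, stop at zero
        count <= count - 1;
      end if;
    end if;
  end process;

  Xcount <= count;
end architecture rtl;
```

The "loop" is simply the register being updated once per clock; the exit condition is the `count /= 0` test, so no iteration limit is involved.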
 
Vipinlal: Thank you for the suggestion. But that has not proven to be effective.

Tricky: Thank you for pointing out the source of the error, but I can't exactly call your response a suggestion. I would not be posting on a forum unless I had exhausted all of my other resources, including textbooks.

Allow me to restate my problem:

Process
Xcount <= f(inputs);

If Xcount > 0 then
when external signal = rising edge
decrement Xcount
repeat until Xcount = 0
End Process

Since the signal is not predictable, I cannot specify the number of iterations the loop will require unless I can put a wait statement inside the loop. I need to be able to limit the number of iterations, or I receive an error that iterations cannot exceed 10,000.
If I allow the entire process block to loop, I receive an error that Xcount does not hold its value outside of the clock (signal) edge.
If I wait for a second external signal to reassign the value to Xcount, I receive an error that Xcount is dependent on multiple clocks.
My next thought was to use an intermediate signal Xint which has been assigned the value of Xcount. This intermediate signal can then be decremented while Xcount is allowed to change independently. I'm just not sure how to prevent Xint from being updated outside of the clock edge.
Any suggestions on how to implement this behavior?

---------- Post added at 14:47 ---------- Previous post was at 13:24 ----------

I'm thinking I can turn this process into an FSM. I would then use a single state to function as the loop, and only when my exit criterion is met will I return to the initial state and reassign a value to Xcount. Does this sound reasonable?
 

Doesn't sound unreasonable if that is any help. :p The first plan sounded like a no-go. Which I think is why TrickyDicky referred to the textbook. Because ... what actual physical circuit would you think that would synthesize to? Unpredictable loop count == unpredictable amount of hardware == not synthesizable.

Before I make assumptions .. is this even meant to be synthesized? If yes, then see previous remarks. If no, then it's probably meant as a testbench, in which case indeed you should put some delays in that process for it to make sense (IMO).
 
This is meant to be synthesized. I'm not very familiar with the FPGA hardware, which is where some of my confusion might come in. I'm slightly more familiar with C, where I can loop indefinitely. I was expecting similar behavior here but apparently I am mistaken. I'm confused as to how to exit a loop if I don't know when my exit criteria will be met. I have attempted to rewrite my code as an FSM. My current error is:
Error (10822 ) : HDL error : couldn't implement registers for assignments on this clock edge
In the following code, Xtable is an LUT, Xa is an input vector, and Xmotor is an output.

BEGIN
  PROCESS (all)
  BEGIN

    CASE State IS
      WHEN "00" =>
        XaStore <= Xa;
        Xdif <= signed(XaStore - Xb);
        IF Xdif(8) = '0' THEN
          Xcount <= STD_ULOGIC_VECTOR(to_unsigned(Xtable(to_integer(unsigned(Xa-Xb))), 10));
          Xdir <= "10"; --m+=1, m-=0 -> Forward
        ELSE
          Xcount <= STD_ULOGIC_VECTOR(to_unsigned(Xtable(to_integer(unsigned(Xb-Xa))), 10));
          Xdir <= "01"; --m+=0, m-=1 -> Reverse
        END IF;
        State <= "01";

      WHEN "01" =>
        IF Xcount > "0000000000" THEN
          Xmotor <= Xdir;
          IF XHallA'EVENT AND XHallA = '1' THEN
            Xcount <= Xcount - 1;
          END IF;
        END IF;
        IF Xcount = "0000000000" THEN
          Xmotor <= "00";
          State <= "10";
        ELSE
          State <= "01";
        END IF;

      WHEN "10" =>
        Xb <= XaStore;
        State <= "00";

      WHEN OTHERS =>
        State <= "00";

    END CASE;

  END PROCESS;
END BEHAVIOR;

---------- Post added at 15:41 ---------- Previous post was at 15:37 ----------

The error is identified at line:
IF XHallA'EVENT AND XHallA = '1' THEN

---------- Post added at 15:51 ---------- Previous post was at 15:41 ----------

If I change: IF XHallA'EVENT AND XHallA = '1' THEN
to: Wait until XHallA = '1';
Then I receive an error saying I cannot use both a sensitivity list and a wait statement (I currently have a sensitivity list). If I remove the sensitivity list, the error says I must use either a wait statement or a sensitivity list. There is still a wait statement in the code that produces this error. Does the wait statement need to appear at the beginning of the process to eliminate this error?
 

I would not be posting on a forum unless I had exhausted all of my other resources, including textbooks.
I don't know which HDL or FPGA textbooks you own, but apparently you skipped the basic chapter about how hardware logic programming works. It has little to do with sequential programming languages like C, although the syntax looks quite similar, e.g. for iteration constructs. Their meaning, however, can be completely different.

This is the case, for example, with a for loop, which does not define a sequence in time but instructs the tool to generate parallel logic branches.
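
To make the point about for loops concrete, here is a small sketch; the entity name `parity8` and its ports are made up for illustration. Because the loop bounds are fixed at compile time, the loop describes a chain of XOR gates that all exist at once, not eight steps in time.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity parity8 is
  port (
    data   : in  std_logic_vector(7 downto 0);
    parity : out std_logic
  );
end entity parity8;

architecture rtl of parity8 is
begin
  process (data)
    variable p : std_logic;
  begin
    p := '0';
    -- fixed bounds: the synthesiser unrolls this into a chain of 8 XOR gates
    for i in data'range loop
      p := p xor data(i);
    end loop;
    parity <= p;
  end process;
end architecture rtl;
```

A loop whose iteration count depends on a run-time signal has no such fixed amount of hardware, which is why the while loop in the first post cannot be unrolled.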

You have now avoided iteration loops and gone for case constructs, but you didn't succeed in designing a working state machine. You'll find different prototypes of state machines in textbooks, e.g. Moore and Mealy, and you can write other styles, including simpler ones. A common characteristic, however, is that the state variable is registered in a clock-edge-sensitive process in one place. Without it, the FSM won't work.

In contrast, you have placed a clock-sensitive condition inside a case statement in an otherwise combinational process, which Quartus cannot map to registers (hence error 10822). My suggestion is to study the FSM textbook examples, or e.g. the FSM templates in the Quartus editor, and use them as a starting point for your design.
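
As a rough illustration of "state registered in one clocked process", the motor sequence could look like the sketch below. Everything not in the original posts is an assumption: the system clock `clk`, the `reset` and `start` inputs, `start_count`, the already synchronised one-clock-wide `hall_pulse`, and the state names. Direction handling is left out to keep it short.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity motor_fsm is
  port (
    clk         : in  std_logic;             -- assumed system clock
    reset       : in  std_logic;             -- assumed asynchronous reset
    start       : in  std_logic;             -- assumed "new target" strobe
    start_count : in  unsigned(9 downto 0);  -- assumed number of pulses to travel
    hall_pulse  : in  std_logic;             -- assumed one-clock-wide pulse per encoder edge
    Xmotor      : out std_logic_vector(1 downto 0)
  );
end entity motor_fsm;

architecture rtl of motor_fsm is
  type state_t is (IDLE, RUN, DONE);
  signal state  : state_t := IDLE;
  signal Xcount : unsigned(9 downto 0) := (others => '0');
begin
  -- state, counter and output are all registered on the same clock edge
  process (clk, reset)
  begin
    if reset = '1' then
      state  <= IDLE;
      Xcount <= (others => '0');
      Xmotor <= "00";
    elsif rising_edge(clk) then
      case state is
        when IDLE =>
          Xmotor <= "00";
          if start = '1' then
            Xcount <= start_count;
            state  <= RUN;
          end if;
        when RUN =>
          if Xcount = 0 then
            Xmotor <= "00";
            state  <= DONE;
          else
            Xmotor <= "10";  -- forward only, to keep the sketch short
            if hall_pulse = '1' then
              Xcount <= Xcount - 1;
            end if;
          end if;
        when DONE =>
          state <= IDLE;
      end case;
    end if;
  end process;
end architecture rtl;
```

The encoder signal is treated as data sampled by `clk`, not as a second clock, so there is only one clock edge in the whole process.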
 
I also suggest you read the VHDL coding guidelines for synthesisable constructs from whichever manufacturer you are using. You have already said your background is C. You have to realise that C is a programming language and VHDL is a Hardware Description Language, with description being the key word. Without understanding what hardware you are trying to generate, you don't have much of a chance of actually being able to synthesise your code. Hence my suggestion of reading a textbook on digital design before trying to code any VHDL.

As a first tip - all synchronous code should follow this template:

Code:
my_process: process(clk, reset) --only put clock and async reset in here
begin
  if reset = '1' then 
    --async reset for your registers
  elsif rising_edge(clk) then --or clk'event and clk = '1' if you are reading an old book or examples
    --put your synchronous logic assignments here
  end if;
end process;
 
I don't know which HDL or FPGA textbooks you own, but apparently you skipped the basic chapter about how hardware logic programming works.

This is exactly what happened. What I meant is that I have exhausted all examples I have been able to find on wait statements. As for state machines, I have numerous examples at hand. Unfortunately none of them seem to be applicable to what I'm trying to do. I have synchronous and asynchronous portions of my code. In the asynchronous portion, an intermediate signal is defined as a function of combinatorial logic of the inputs. This signal is then being modified within the synchronous portion of my program. I know that the signal will not be modified until the synchronous portion is completed, but unfortunately Quartus does not know that. How do I prevent my intermediate variable from being modified asynchronously while executing the synchronous portion of my code? Does anybody have a relevant example they could direct me to? I am not familiar with the FSM templates in the Quartus editor. I will be looking into that next.
 

@Icky: Did I miss something here? Did you get that error while synthesising? (I thought you were talking about testbench code, sorry for the mistake.) As others said, "wait for" statements are not supported in synthesis.

If you want delay, I think you should implement state machines. Regarding state machines, see if the below link is of any help,
VHDL coding tips and tricks: How to implement State machines in VHDL?
 
@Icky: Using a two process method that you are describing is one way of doing things. But with this format, usually the way it works is that one process creates the asynchronous logic, and the second process simply registers the asynchronous logic - it should not modify it.
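
A minimal sketch of that two-process format, using a hypothetical counter (the entity and all port names are invented for illustration): the combinational process only computes the next value, and the clocked process only stores it.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity two_process_counter is
  port (
    clk        : in  std_logic;
    load       : in  std_logic;
    load_value : in  unsigned(9 downto 0);
    dec        : in  std_logic;
    count      : out unsigned(9 downto 0)
  );
end entity two_process_counter;

architecture rtl of two_process_counter is
  signal count_reg  : unsigned(9 downto 0) := (others => '0');
  signal count_next : unsigned(9 downto 0);
begin
  -- Process 1: combinational logic that only computes the next value
  comb : process (count_reg, load, load_value, dec)
  begin
    count_next <= count_reg;  -- default assignment avoids an inferred latch
    if load = '1' then
      count_next <= load_value;
    elsif dec = '1' and count_reg /= 0 then
      count_next <= count_reg - 1;
    end if;
  end process comb;

  -- Process 2: the register simply stores what process 1 computed
  reg : process (clk)
  begin
    if rising_edge(clk) then
      count_reg <= count_next;
    end if;
  end process reg;

  count <= count_reg;
end architecture rtl;
```

Each signal has exactly one driver: `count_next` is written only by the combinational process and `count_reg` only by the clocked one, which sidesteps the "modified in two places" problem described above.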

I still think you are thinking too much like a software programmer and not thinking about the underlying hardware. Try drawing the circuit out on paper before writing any VHDL.
 
Thanks for the help all.

I believe I am on my way to a working construct. I've broken the block down into smaller problems: one block for the counter, one for the state machine, and I think one more block for the combinatorial logic. I required a slight conceptual modification in the counter. I didn't want to begin my clock until I was in the state with the counter. This is because the clock input is from a motor encoder which will not start pulsing until the motor is turning. I modified the algorithm to switch on the motor first and use the first feedback pulse as my initialization on the counter. This should require an initial decrement on the count but should still be usable.

I'm a little hung up on defining my state machine though. What would make this a lot easier is if I can obtain a 'Ready' signal from the asynchronous combinatorial logic block indicating that the calculations have been completed. My first thought is to measure the total gate delay involved in the processing and set a predefined wait period before sending out the ready signal, but this is not exactly what I want to do. Is there a way to signal that the calculations have been completed?

---------- Post added at 20:52 ---------- Previous post was at 19:17 ----------

Also, is there any drawback to putting "all" in my sensitivity list rather than just including only the signals needed?
 

Thanks for the help all.

I'm a little hung up on defining my state machine though. What would make this a lot easier is if I can obtain a 'Ready' signal from the asynchronous combinatorial logic block indicating that the calculations have been completed. My first thought is to measure the total gate delay involved in the processing and set a predefined wait period before sending out the ready signal, but this is not exactly what I want to do. Is there a way to signal that the calculations have been completed?

This is why you make everything synchronous. If you know it takes n clocks to finish a calculation, it will always take n clocks, so no ready signal is required. If you start worrying about asynchronous gate delays you are in trouble. You cannot "wait" for asynchronous logic to finish without some form of trigger, and it sounds like you're not going to get one. I would just synchronise the lot, and then you always know when the calculations are complete.
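
If a ready flag is still convenient for the state machine, the fixed latency makes it trivial to produce. The sketch below is an assumption-laden illustration (the entity `ready_delay`, the `LATENCY` generic and both strobes are invented): it delays a `start` pulse by the known number of clocks.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity ready_delay is
  generic (
    LATENCY : integer range 2 to 64 := 4  -- assumed number of clocks the calculation takes
  );
  port (
    clk   : in  std_logic;
    start : in  std_logic;  -- pulses when new inputs enter the calculation
    ready : out std_logic   -- pulses LATENCY clocks later
  );
end entity ready_delay;

architecture rtl of ready_delay is
  signal delay_line : std_logic_vector(LATENCY - 1 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- shift the start pulse along one position per clock
      delay_line <= delay_line(LATENCY - 2 downto 0) & start;
    end if;
  end process;

  ready <= delay_line(LATENCY - 1);
end architecture rtl;
```

This only works because the calculation itself is clocked; it does not measure gate delays, it just counts clock edges.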



Also, is there any drawback to putting "all" in my sensitivity list rather than just including only the signals needed?

There are a couple of drawbacks:

1. process(all) is a VHDL 2008 construct and not all compilers are VHDL2008 compatible yet.
2. You will decrease simulation performance as it will trigger the process when ANY signal changes inside the code (even if nothing happens). The sensitivity list is ignored for synthesis though, so it has no impact here.
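
For comparison, here are the two forms side by side in a made-up multiplexer (`mux2` and its ports are illustrative only). Both processes describe the same logic; only the first needs a VHDL-2008 compiler.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity mux2 is
  port (
    sel, a, b : in  std_logic;
    y_all     : out std_logic;
    y_list    : out std_logic
  );
end entity mux2;

architecture rtl of mux2 is
begin
  -- VHDL-2008 only: the tool works out the sensitivity list itself
  p_all : process (all)
  begin
    if sel = '1' then
      y_all <= a;
    else
      y_all <= b;
    end if;
  end process p_all;

  -- Pre-2008 equivalent: every signal that is read must be listed by hand
  p_list : process (sel, a, b)
  begin
    if sel = '1' then
      y_list <= a;
    else
      y_list <= b;
    end if;
  end process p_list;
end architecture rtl;
```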
 

I have a fully functional program now that simulates correctly. Unfortunately it is 3 times too large for my device. It's using around 29000 logic elements but the Cyclone II only has room for 8200. I was processing 18-30 bit vectors with extensive multiplication and division. By shaving bits off and sacrificing accuracy, I have been able to get that number under 10000, but it is still too large and the code is less reliable.
Wikipedia says:
"It is relatively easy for an inexperienced developer to produce code that simulates successfully but that cannot be synthesized into a real device, or is too large to be practical. One particular pitfall is the accidental production of transparent latches rather than D-type flip-flops as storage elements."
How do I determine if this is my problem and how do I avoid this pitfall?

I will also be posting this question in a new thread
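
On the latch pitfall quoted above, a small sketch may help (the entity `latch_demo` and its ports are invented for illustration). A combinational process that does not assign its output on every path has to remember the old value, which the tool builds as a transparent latch.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity latch_demo is
  port (
    en, d      : in  std_logic;
    q_latch    : out std_logic;
    q_no_latch : out std_logic
  );
end entity latch_demo;

architecture rtl of latch_demo is
begin
  -- No assignment when en = '0', so q_latch must hold its value: a latch is inferred
  bad : process (en, d)
  begin
    if en = '1' then
      q_latch <= d;
    end if;
  end process bad;

  -- Every path assigns the output, so this is plain combinational logic
  good : process (en, d)
  begin
    q_no_latch <= '0';
    if en = '1' then
      q_no_latch <= d;
    end if;
  end process good;
end architecture rtl;
```

Synthesis tools normally report inferred latches as warnings, so searching the compilation messages for "latch" is a reasonable first check.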
 

Usually it is a question of design.

If you are using the small Cyclone II, you have very few dedicated multipliers. Divisions also chew up logic. If you run out of multipliers, you have basically done the two worst things you can in terms of logic use.

The problem is you probably chose too small a device in the first place.
It might also be good to post some code.
 

My logic device contains 8256 logic elements. The code below originally took in 19-bit input vectors for processing. That design took 28000 logic elements. By shaving off extra bits and sacrificing accuracy, the code below is down to 9000 resources. This includes optimizing all fitter settings. When I plug this block into the whole project, however, which includes counters and state machines, the size is actually reduced to about 7800 resources, which fits on the chip. I'm not sure why it would be smaller, given a larger circuit. What I'm working on now is to try to add bits back into the signals below to increase my accuracy again. The code shown now fits on the device but doesn't work very well. I imported fixed point libraries for this code.


Code:
LIBRARY ieee ;
USE ieee.std_logic_1164.all ;
USE IEEE.STD_LOGIC_UNSIGNED.all;
USE IEEE.numeric_std.all;
LIBRARY altera ;
USE altera.maxplus2.all;
LIBRARY ieee_proposed;      --Necessary to convert inputs to fixed point
use ieee_proposed.fixed_float_types.all;    --add file to project
use ieee_proposed.fixed_pkg.all;            --add file to project
 
ENTITY LogicBlock3 IS
    PORT ( R10      : IN STD_LOGIC_VECTOR(1 DOWNTO 0);      
           T10      : IN STD_LOGIC_VECTOR(7 DOWNTO 0);      
           R20      : IN STD_LOGIC_VECTOR(1 DOWNTO 0);
           T20      : IN STD_LOGIC_VECTOR(7 DOWNTO 0);
           R21      : IN STD_LOGIC_VECTOR(1 DOWNTO 0);
           T21      : IN STD_LOGIC_VECTOR(7 DOWNTO 0);
           R32      : IN STD_LOGIC_VECTOR(1 DOWNTO 0);
           T32      : IN STD_LOGIC_VECTOR(7 DOWNTO 0);
           Xa   : OUT STD_LOGIC_VECTOR(7 DOWNTO 0); --Output coordinates
           Ya   : OUT STD_LOGIC_VECTOR(6 DOWNTO 0));    --Output coordinates
    END LogicBlock3 ;
 
ARCHITECTURE Behavior OF LogicBlock3 IS
SIGNAL T10int : sfixed(T10'high+1 DOWNTO T10'low);
SIGNAL T20int : sfixed(T20'high+1 DOWNTO T20'low);
SIGNAL T21int : sfixed(T21'high+1 DOWNTO T21'low);
SIGNAL T32int : sfixed(T32'high+1 DOWNTO T32'low);
SIGNAL D10, D20, D21, D32 : sfixed(7 DOWNTO -3);
SIGNAL a1, b1, b2, a2: sfixed(7 DOWNTO -5); 
SIGNAL c1, c2: sfixed(13 DOWNTO -4);        
SIGNAL int3a, int4a: sfixed(12 DOWNTO 3); 
SIGNAL int5: sfixed(14 DOWNTO 0); 
SIGNAL int6: sfixed(8 DOWNTO -4); 
SIGNAL Ybint: sfixed(6 DOWNTO 0);
CONSTANT Sound : sfixed(1 DOWNTO -4) := "011101";
CONSTANT L : sfixed(7 DOWNTO 0) := "01100000";
CONSTANT W : sfixed(6 DOWNTO 0) := "0101010";
 
    --Accuracy of results depends on number of decimal places retained(-8)
    --Divide operator cannot exceed 64 bit vector, limits size of above parameters
    --Parameters may exceed allowed bits(+13) near center of field, yields undefined response
    --Trade off between number of >0 bits and number of <0 bits
 
BEGIN
    PROCESS (R10, T10, R20, T20, R21, T21, R32, T32, T10int, T20int, T21int, T32int, D10, D20, D21, D32, a1, a2, b1, b2, c1, c2,
                int3a, int4a, int5, int6, Ybint)
    BEGIN
    
--Convert input time difference vectors into fixed point vectors
        T10int <= resize((to_sfixed(signed(unsigned('0' & T10)))), T10int'high, T10int'low);
        T20int <= resize((to_sfixed(signed(unsigned('0' & T20)))), T20int'high, T20int'low);
        T21int <= resize((to_sfixed(signed(unsigned('0' & T21)))), T21int'high, T21int'low);
        T32int <= resize((to_sfixed(signed(unsigned('0' & T32)))), T32int'high, T32int'low);
        
--Determine if T10 is positive or negative
        D10 <= resize((0 + T10int)/Sound, D10'high, D10'low);
        IF R10 = "10" THEN
            D10 <= resize((0 - T10int)/Sound, D10'high, D10'low);   
        END IF;
        
--Determine if T20 is positive or negative
        D20 <= resize((0 + T20int)/Sound, D20'high, D20'low);
        IF R20 = "10" THEN
            D20 <= resize((0 - T20int)/Sound, D20'high, D20'low);
        END IF;
        
--Determine if T21 is positive or negative
        D21 <= resize((0 + T21int)/Sound, D21'high, D21'low);
        IF R21 = "10" THEN
            D21 <= resize((0 - T21int)/Sound, D21'high, D21'low);
        END IF;
        
--Determine if T32 is positive or negative
        D32 <= resize((0 + T32int)/Sound, D32'high, D32'low);
        IF R32 = "10" THEN
            D32 <= resize((0 - T32int)/Sound, D32'high, D32'low);
        END IF;
        
--Calculate variables for input into variable solutions
    --Solutions based on set of linear equations:
    --X*a1 + Y*b1 + c1 = 0
    --X*a2 + Y*b2 + c2 = 0
    --Solutions:
    --Y = ((a1 * c2) - (a2 * c1)) / ((a2 * b1) - (a1 * b2))
    --X = ((-b2 / a2) * Y) - (c2 / a2)
 
        a1 <= resize((-2) * (L / d10), a1'high, a1'low);
        b1 <= resize((2) * (W / d20), b1'high, b1'low);
 
--Intermediate signals used to avoid trouble caused by exponent operator
 
        c1 <= resize(d21 - (b1*(w/2)) + (a1*(l/(-2))), c1'high, c1'low);
        a2 <= resize(((2) * (L / d21)) + ((0 + 2) * (L / d32)), a2'high, a2'low);
        b2 <= resize((-2) * (W / d21), b2'high, b2'low);
        int3a<= resize(L*(L/d32), int3a'high, int3a'low);
        int4a<= resize((W+L)*((W-L)/d21), int4a'high, int4a'low);
        c2 <= resize(d32 + d21 - (int3a) + (int4a), c2'high, c2'low);
 
--Plug above parameters to obtain variable (X,Y) solutions
--Intermediate signals required to remain within divide vector limit
 
        int5 <=resize(((a1 * c2) - (a2 * c1)), int5'high, int5'low);
        int6 <=resize(((a2 * b1) - (a1 * b2)), int6'high, int6'low);
 
        Ybint <= resize((int5 / int6), Ybint'high, Ybint'low);
 
        Ya <= to_stdlogicvector(Ybint);
        Xa <= to_stdlogicvector(resize((((-b2 / a2) * Ybint)-(c2 / a2)), Xa'high, Xa'low));
        
    END PROCESS ;
END Behavior ;

 

The huge resource usage is mainly caused by the large number of dividers in your code. Without knowing the code's purpose, not much can be said about alternative solutions. Serial dividers are generally an option. Apart from this problem, performing all calculations asynchronously makes the design very slow.
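
To give an idea of what a serial divider looks like, here is a sketch of a restoring divider that produces one quotient bit per clock. It is not a drop-in replacement for the fixed-point `/` used above: it handles unsigned integers only, and the entity name, ports, `WIDTH` generic and `start`/`done` handshake are all assumptions made for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity serial_divider is
  generic (
    WIDTH : integer range 2 to 64 := 16  -- assumed operand width
  );
  port (
    clk       : in  std_logic;
    start     : in  std_logic;           -- pulse to begin a division
    dividend  : in  unsigned(WIDTH - 1 downto 0);
    divisor   : in  unsigned(WIDTH - 1 downto 0);
    quotient  : out unsigned(WIDTH - 1 downto 0);
    remainder : out unsigned(WIDTH - 1 downto 0);
    done      : out std_logic            -- high for one clock when the result is valid
  );
end entity serial_divider;

architecture rtl of serial_divider is
  signal rem_reg : unsigned(WIDTH downto 0) := (others => '0');      -- one extra bit for the shift
  signal quo_reg : unsigned(WIDTH - 1 downto 0) := (others => '0');  -- dividend shifts out, quotient shifts in
  signal div_reg : unsigned(WIDTH downto 0) := (others => '0');
  signal steps   : integer range 0 to WIDTH := 0;
begin
  process (clk)
    variable shifted : unsigned(WIDTH downto 0);
  begin
    if rising_edge(clk) then
      done <= '0';
      if start = '1' then
        rem_reg <= (others => '0');
        quo_reg <= dividend;
        div_reg <= resize(divisor, WIDTH + 1);
        steps   <= WIDTH;
      elsif steps > 0 then
        -- bring down the next dividend bit into the partial remainder
        shifted := rem_reg(WIDTH - 1 downto 0) & quo_reg(WIDTH - 1);
        if shifted >= div_reg then
          rem_reg <= shifted - div_reg;
          quo_reg <= quo_reg(WIDTH - 2 downto 0) & '1';
        else
          rem_reg <= shifted;
          quo_reg <= quo_reg(WIDTH - 2 downto 0) & '0';
        end if;
        if steps = 1 then
          done <= '1';  -- last bit is being written on this edge
        end if;
        steps <= steps - 1;
      end if;
    end if;
  end process;

  quotient  <= quo_reg;
  remainder <= rem_reg(WIDTH - 1 downto 0);
end architecture rtl;
```

The trade-off is time for area: one subtractor and a few registers are reused for `WIDTH` clocks instead of building a full combinational divider for every `/` in the code. A divisor of zero is not checked here and simply yields an all-ones quotient.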
 