VHDL coding style for better performance and timing constraints

Status
Not open for further replies.

ghattasak

Hello guys,
I am working on a project: block matching using online adders and sum of absolute differences (SAD) computation.
I have written the VHDL code, and the full project runs with a 3.8 ns clock period on an Altera Cyclone V, but I have not put any constraints on my inputs and outputs yet.
If possible, I would like you to check my coding style in the attached files and give me hints, guidance, and your opinion on how good or bad it is, for future reference and projects.

The top-level design is the block_add file.
olcomp is the online comparator. It contains a large computational circuit, which I implemented with a case statement. Is that OK?
I noticed that after implementing the comparator this way, my clock period increased from 3 ns to 3.8 ns.
I have attached an image showing the circuit implemented in the block_add file.

I also have a question about timing for the unconstrained inputs and outputs. In the Xilinx implementation I got a clock period of 3 ns without having to specify any constraints, but Altera requires them to be specified. I have gone through the TimeQuest tutorial but did not understand what to base my input and output timing constraints on. In addition, does adding registers to the inputs and outputs solve the problem?

Please let me know your opinion.
Thank you.
 

Attachments

  • archifull.png (31.2 KB)
  • oladd.zip (88 KB)

Timing constraints for top level I/O come from the I/O characteristics of the other devices on the board that drive those inputs or receive those outputs.

For example, if the input pins of your design are driven by another device that specifies a 4 ns clock-to-output delay, and your output pins are received by a device that requires 6 ns of setup time, then you would add those constraints as a 4 ns input delay and a 6 ns output delay on the appropriate I/O. This is the most basic kind of I/O timing constraint you will deal with. Next up would be some or all of the following:

- Once you notice that the 4 ns or 6 ns is larger than your desired clock period, but you also know that data only comes in once every 4th clock cycle, you'll be getting into adding timing constraints for those I/O that say the signal path is multi-cycle.
- The other part that drives one of your input pins is 6 inches away from your part, and your part is the clock source. You'll want to add ~1 ns of delay for the clock to reach that part and another ~1 ns for the input signal to propagate back to your input pins.
- Those I/O pins are actually on a heavily loaded capacitive bus, which will slow signals down a bit. You may need to account for that as well.
- Maybe there is no real clock input and the I/O are asynchronous. Typically what I will do here is specify them as multi-cycle so there are no warnings about unconstrained I/O.

Kevin Jennings
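
As a rough sketch, the 4 ns / 6 ns example and the multi-cycle case above might look like this in a TimeQuest .sdc file. The port names (clk, data_in, data_out), the 10 ns clock period, and the multicycle factor of 4 are assumptions for illustration only; they are not taken from the posted design.

```tcl
# Hypothetical 10 ns clock on a port assumed to be named clk
create_clock -name clk -period 10.000 [get_ports clk]

# Upstream device has a 4 ns clock-to-output delay -> 4 ns input delay
set_input_delay -clock clk -max 4.000 [get_ports {data_in[*]}]

# Downstream device needs 6 ns of setup time -> 6 ns output delay
set_output_delay -clock clk -max 6.000 [get_ports {data_out[*]}]

# If the input data only changes once every 4th clock cycle (assumed),
# relax the setup check to 4 cycles and move the hold check back to match
set_multicycle_path -setup -from [get_ports {data_in[*]}] 4
set_multicycle_path -hold  -from [get_ports {data_in[*]}] 3
```

Board trace delay (the ~1 ns figures above) would be added on top of these values, and matching -min delays would be needed for hold analysis.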
 
Oh OK, thank you :D I will recheck the tutorial and specify them as multi-cycle then. My inputs are 64 2-bit signals, a start signal, and a reset. I have not yet worked out how my input data will get in, as it's a very large and annoying part of the design: since my design is serial with parallel circuits, I will need to feed the 64 2-bit inputs MSB first, and so on until the 8th cycle, into my adder tree. I was thinking of receiving the input data from FPGA memory, but I have not yet researched the different kinds Altera provides. For now, the inputs are given directly in a simulation test file.

As for the clock, I will be using a development board, the Altera DE1-SoC. I will recheck and apply the solution for the unconstrained paths and compare with the Xilinx report as well.
 

My tips are to pipeline where possible. Also, don't overthink the adder logic. Just use "+" and don't try to write a better adder.
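
To illustrate both tips, here is a minimal sketch of a two-stage pipelined four-input adder that relies on the numeric_std "+" operator. The entity name, port names, and the default width W are hypothetical and unrelated to the posted block_add design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example: sum of four W-bit operands in two pipeline stages
entity add4_pipelined is
  generic (W : positive := 8);
  port (
    clk        : in  std_logic;
    a, b, c, d : in  unsigned(W-1 downto 0);
    sum        : out unsigned(W+1 downto 0)
  );
end entity;

architecture rtl of add4_pipelined is
  -- Stage 1 results, one extra bit each to hold the carry
  signal ab, cd : unsigned(W downto 0);
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- Stage 1: two independent additions in parallel
      ab <= resize(a, W+1) + resize(b, W+1);
      cd <= resize(c, W+1) + resize(d, W+1);
      -- Stage 2: add the registered partial sums from the previous cycle
      sum <= resize(ab, W+2) + resize(cd, W+2);
    end if;
  end process;
end architecture;
```

The result appears two clock cycles after the inputs, but each register-to-register path contains only one adder, which is usually what lets the clock period shrink.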
 

It'd be nice if you could just write generic code and leave all the fluff (device-specific constraints etc.) to some automated process.

Unfortunately, the reality is that you need to tweak the device synthesis tool to get performance.

I don't know if there is an official reference book somewhere that says things like:
a case statement generates a mux, etc.
incomplete if/case statements in combinational logic infer latches
clocked processes give registers (flip-flops), etc.
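
A small sketch of those three rules of thumb, assuming typical synthesis tool behaviour; the entity and signal names are made up for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical demo of what common coding patterns usually infer
entity inference_demo is
  port (
    clk        : in  std_logic;
    sel        : in  std_logic_vector(1 downto 0);
    en         : in  std_logic;
    a, b, c, d : in  std_logic;
    mux_out    : out std_logic;
    latch_out  : out std_logic;
    reg_out    : out std_logic
  );
end entity;

architecture rtl of inference_demo is
begin
  -- Complete case statement in a combinational process: a 4-to-1 mux
  mux_p : process (sel, a, b, c, d)
  begin
    case sel is
      when "00"   => mux_out <= a;
      when "01"   => mux_out <= b;
      when "10"   => mux_out <= c;
      when others => mux_out <= d;
    end case;
  end process;

  -- Incomplete if (no else branch): latch_out must hold its old value
  -- when en = '0', so a latch is inferred. Usually unintended.
  latch_p : process (en, a)
  begin
    if en = '1' then
      latch_out <= a;
    end if;
  end process;

  -- Assignment under rising_edge(clk): a flip-flop
  reg_p : process (clk)
  begin
    if rising_edge(clk) then
      reg_out <= a;
    end if;
  end process;
end architecture;
```

Giving latch_out a default assignment or an else branch would remove the latch and leave plain combinational logic.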

have a look at
**broken link removed**

This will give you an inkling of how to make your code efficient.
 
