process (clk)
begin
  if rising_edge(clk) then
    counter_b <= counter_b + 1;
  end if;
end process;
--
process (clk)
begin
  if rising_edge(clk) then
    count_out <= counter_b;
  end if;
end process;
Is there anything in the design that could cause the execution time to change? Does it have complex decision logic? If not, why do you even need to measure it? You should be able to work it out directly from the code or from a simulation. What makes it important that you know the execution time? If it's just a pipeline, latency is pretty irrelevant; it's throughput that's important.
Make sure you're using "clk" as the sampling clock of the ILA. Otherwise, you'll have clock domain crossing problems.
I think the clock cycle count in simulation may differ from execution on the FPGA (or maybe I'm wrong about this). So I have to count the clock cycles during execution on the FPGA.
Here, it would be better to run some large data set through both the FPGA and the CPU and either measure the time each takes for that data set or, even better, the bandwidth of each when running flat out. Doing this may help you identify bottlenecks in either and help you improve them.
But the problem is, when I want to read my counter's value in Chipscope, it shows an irrelevant value.
How did you get to this conclusion?
What is your ILA trigger?
Doing a simulation of your design code and a static timing analysis (taking into account your FPGA chip, its temperature, etc.) will give you the correct answer in 99% of cases.
It won't show you the number of clock cycles, as it doesn't know where the start and end points of any given algorithm are.
You can work out the latency from the code; it's not that hard. Just count the number of register stages in your code.
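To illustrate counting register stages, here is a minimal sketch of a small pipeline. The entity name `pipe3`, its ports and the arithmetic are all made up for this example and are not taken from the design discussed in this thread. Every signal assigned inside the clocked process becomes a register, so the latency in clock cycles is the number of registers on the path from input to output.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical three-stage pipeline, for illustration only.
entity pipe3 is
  port (
    clk  : in  std_logic;
    a, b : in  unsigned(7 downto 0);
    y    : out unsigned(15 downto 0)
  );
end entity pipe3;

architecture rtl of pipe3 is
  signal a_r, b_r : unsigned(7 downto 0);
  signal prod_r   : unsigned(15 downto 0);
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- Stage 1: register the inputs.
      a_r <= a;
      b_r <= b;
      -- Stage 2: register the 8x8 product (16 bits wide).
      prod_r <= a_r * b_r;
      -- Stage 3: register the final result.
      y <= prod_r + 1;
    end if;
  end process;
end architecture rtl;
```

There are three registers between `a`/`b` and `y`, so `y` reflects a given input pair three rising clock edges after that pair is presented, regardless of temperature or routing, as long as the design meets timing.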
Hi,
did you implement a controller core on the FPGA? And run software on this core?
Klaus
Making it purely combinatorial will make your life difficult. Latency will vary with several factors (one being temperature), and it will be very slow. You should make it fully synchronous, as then you can use the simulation to measure the latency (or count the register stages in the code).
My design is not fully combinational. As I replied to Klaus, it is a computational core, but the software that runs on this core is a combinational circuit.
My head is hurting. This makes no sense.
It also makes no sense to me.
doost4, do you really understand the difference between VHDL and software? VHDL doesn't execute, it is synthesized, then cells are placed, and finally the design is routed. This is nothing like software where a compiler builds some byte code of the program and then links it to libraries creating the executable software image.
So unless your combinational circuit is some sort of processor, it's not going to do any "software that run on this core" type of operation. Also, as a processor requires some sort of memory elements, a combinational circuit won't function well as a processor.
If you need to measure time in an FPGA, you are going about this all wrong. The simplest way to obtain empirical data on the latency of a design is to integrate an ILA into the design and add logic that generates a pulse at the start and at the completion of the algorithm. You use those pulses to capture your free-running counter, using capture qualification based on a compare value. Basically, you make the ILA capture data only when the start or complete strobe is active. This will give you delta times between start-complete-start events.
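As a rough sketch of the start/complete strobe idea, the block below goes one step further than capturing the raw counter and does the subtraction on chip, so the ILA only has to show one value. Everything here is an assumption for illustration: the entity name `latency_meter`, the port names, the 32-bit width, and the premise that your design can supply one-cycle `start` and `done` pulses in the `clk` domain.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical helper: counts clock cycles between a start and a done strobe.
entity latency_meter is
  port (
    clk        : in  std_logic;
    start      : in  std_logic;             -- assumed one-cycle pulse at algorithm start
    done       : in  std_logic;             -- assumed one-cycle pulse at algorithm completion
    cycles     : out unsigned(31 downto 0); -- latched cycle count; probe this with the ILA
    cycles_vld : out std_logic              -- one-cycle pulse; usable as the ILA trigger
  );
end entity latency_meter;

architecture rtl of latency_meter is
  signal free_cnt  : unsigned(31 downto 0) := (others => '0');
  signal start_cnt : unsigned(31 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      free_cnt   <= free_cnt + 1;  -- free-running counter
      cycles_vld <= '0';           -- default; overridden below on done

      if start = '1' then
        start_cnt <= free_cnt;     -- remember the counter value at start
      end if;

      if done = '1' then
        -- Unsigned subtraction wraps modulo 2**32, so the difference
        -- is still correct if free_cnt rolled over once in between.
        cycles     <= free_cnt - start_cnt;
        cycles_vld <= '1';
      end if;
    end if;
  end process;
end architecture rtl;
```

Clock the ILA from the same `clk` and trigger on `cycles_vld = '1'`; the value on `cycles` at the trigger is then the number of clock cycles between the two strobes. The sketch does not handle `start` and `done` arriving in the same cycle.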