Continue to Site

Welcome to EDAboard.com

Welcome to our site! EDAboard.com is an international Electronics Discussion Forum focused on EDA software, circuits, schematics, books, theory, papers, asic, pld, 8051, DSP, Network, RF, Analog Design, PCB, Service Manuals... and a whole lot more! To participate you need to register. Registration is free. Click here to register now.

bubble sort using Arrays in FPGA

Status
Not open for further replies.

eshbonzie

Junior Member level 3
Joined
Aug 16, 2010
Messages
29
Helped
16
Reputation
32
Reaction score
13
Trophy points
1,283
Activity points
1,721
Hey guys,

I need to do a bubble sort for data in one of my arrays

I have something like this:

type SignatureArray is array (639 downto 0) of std_logic_vector(33 downto 0);
signal FrameSignatureArray : SignatureArray;


for x in 0 to 633 loop
for J in 0 to 633 loop

if(FrameSignatureArray(J+1)(23 downto 0) >= FrameSignatureArray(J)(23 downto 0)) then

FrameSignatureArray(J) <= FrameSignatureArray(J+1);
FrameSignatureArray(J+1) <= FrameSignatureArray(J);

end if;
end loop;
end loop;

I believe this will always end in both cells in the array having the same value. I know that signal assignment are nonblocking statements and they are executed delta time in future. Should I use variables?

for x in 0 to 633 loop
for J in 0 to 633 loop


if(FrameSignatureArray(J+1)(23 downto 0) > FrameSignatureArray(J)(23 downto 0)) then

temp := FrameSignatureArray1(J);
FrameSignatureArray1(J) := FrameSignatureArray1(J+1);
FrameSignatureArray1(J+1) := temp;


end if;
end loop;
end loop;

But doesn't seem to be working correctly

I know that having two nested loops are alot of logic to be implemented. is it really critical and I have to sort the array over 634 cycles?..I really would like to do the bubble sort in one cycle only.

Does any one have some hints or tips for creating this algorithm using VHDL or what would I have done wrong

Thanks
 

I can tell you now that you code will not work on an FPGA. For loops are unrolled at compile time, and you're going to have 633 * 633 chained comparitors (thats over 400000). That takes more logic that even the largest and most expensive FPGAs have available. Even if you could fit it in, your timing is going to be terrible, and you're not even going to be able to clock it at a decent rate.

But if you could do it properly with a 100MHz clock the whole bubblesort will be complete in about 4 miliseconds, so you could complete 250 bubblesorts a second.

Do you understand digital logic? from the looks of your code you're trying to write software, not describe hardware.
 

Thanks for ur reply...well lets say..Im doing all what I can to understand digital logic...maybe I dunt have your knowledge but am trying my best

do you think its better when I do it in 633 cycles and in each cycle I sort the array only once? so only one for loop(633 chained comparator) or is it still too much and am thinking in the wrong direction?

Thanks for your help
 

Im sorry if I seemed a bit hard with my comment before - but you cannot learn digital logic by learning VHDL or Verilog. you need to learn how digital logic works first, think through a circuit, and then use VHDL to describe the circuit.

633 compares in one cycle is still too much. What you want to do is move 1 value per clock cycle, as in reality, if your array is in a memory, you're only able to access 1 address per clock cycle anyway.

For loops are best avoided in synthesisable VHDL altogether. They should only really be used to describe parrallel paths.
 

I would still appreciate it if you can give me some advice...I have been searching a while and came across the bitonic sort and odd-even sort which I understand is more applicable for hardware...do you have any comments concerning this or maybe some links which might help me understanding them better?

I also have a question concerning the 633 compares in one cycle. Is it true that if im comparing 23 bit values, this is done bit by bit where each bit needs a comparator and each comparator is around 5 to 6 gates so to do it in one cycle i need 633x23bitsx(5 to 6 gates for each bit) which is too much logic to use? or am I still getting it totally in a wrong way :)
 

I dont know anything about sorting algorithms, so I have no comments there.

But for your 2nd bit, you basically got it. Generally, more than a couple of compares per clock cycles is a no no. Thats why you pipeline everything. But theres nothing to stop you having many compares running in parrallel rather than in series. Even if you could do what you wanted, you wouldnt be able to access more than 1 address (data location) per clock cycle anyway.

try and forget about VHDL for a moment, and think about drawing the actual circuit down on a peice of paper. Think about the address generator for the memory, then 1 clock cycles later the data you addressed will be ready (while at the same time you're addressing the next memory element). Then you pipeline it through compares to compare it to adjacent memory locations (the previous value in the pipeline). then eventually you're going to have to write it to another memory (because writing back to your source memory is going to use up bandwidth).
 

The first point to consider is, how the data is actually organized. It's not clear to me, if you already thought about this problem. If we assume a FPGA internal memory, the solution options are quite different from a register file. But a register file would already require a considerable amount of FPGA resources, not only the registers, but also a large overhead of logic elements to create the data path. On the other hand, if you have specific speed requirements, a sequential accessible internal RAM may be too slow, there would be e.g. an option to assign several memory words in parallel.

In other words, the application requirements are driving the hardware design. Sketching abstract sorting algorithms doesn't help much in this regard. I could add faster algorithms, e.g. binary trees, but the problem is still the same.
 

the data that I want to sort are saved to an array which I created in my design. The array is filled while the data are being processed. Once the array is filled I want to start sorting it to use these sorted values in a next step in my main algorithm. That is why I think I don't have the problem of the memory accessing thing which TrickyDicky talked about, as I can access the array cells simultaneously in one cycle.

I found this information on the internet

------------------------------- comparator stagesc----------Comparators
Odd-even transposition sort -------O(n)----------------------O(n2)
Bubblesort---------------------------O(n)----------------------O(n2)
Bitonic sort------------------------O(log(n)2)--------------O(n·log(n)2)
Odd-even mergesort-------------O(log(n)2)--------------O(n·log(n)2)
Shellsort---------------------------O(log(n)2)-------------O(n·log(n)2)

in my case n is equal 633 so if I am going to divide the sorting algorithm on different stages then the Bitonic sort or the Odd-even mergesort or the shell sort would take around 5000 comparator divided on the number of cycles I would use to finish the sorting. for example if I am going to divide the sorting process on 100 different cycles then I would use only 50 comparators? is that a correct way to implement it? do you still have any comments concerning any of these sorting algorithms?

Ive also been synthesizing the design with 633 comparators(to carry out the bubble sort in 633 different clock cycles) and it has been synthesizing for 6 hours till now and still not finished...is that normal?!! is there something which I can use instead of arrays that does not need alot of logic elements to create the datapath?!

I appreciate your help.
 

In a real FPGA you cannot access all elements in a large array simultanously without chewing up huge numbers of registers, with a massive mux to select the corresponding bits. Just forget about acceesing large data sets simultaneously, it does not happen in FPGAs. FPGAs are great at pipelining and parrallel data processing, not sequential data processing (like what you do in software).

Im not surprised its taken 6 hours so far? what chip are you trying to compile on? In my mind, if you have a half decent PC (like an Intel Core 2 Duo) with a couple of gigs of ram, you should be able to compile in less than 1 hour (if logic use is small enough, no more than 10 mins). We get a large FPGA that's about 75%+ full compiled in about 3 hours.

You have fundamental problems with your design approach. Please highlight all your code, press delete and start again from a digital logic approach. The way you are going is going nowhere.
 

would you please be kind enough to tell me or even give me a hint what is a "digital logic approach" to bubble sort 633 values?!

Think about the address generator for the memory, then 1 clock cycles later the data you addressed will be ready (while at the same time you're addressing the next memory element). Then you pipeline it through compares to compare it to adjacent memory locations (the previous value in the pipeline). then eventually you're going to have to write it to another memory (because writing back to your source memory is going to use up bandwidth).

If I save my data to a BRAM for example and read a value every clock cycle then I would need 633*633(worst case scenario) cycles to sort it!!
 

If I save my data to a BRAM for example and read a value every clock cycle then I would need 633*633(worst case scenario) cycles to sort it!!

Welcome to digital design. With a decent clock speed it might not actually be that long. These are the tradeoffs when doing FPGA design.
For the digital logic approach, go into the algorithm, see what can be done in paralell. That way you reduce the latency. This maybe the reason why the bubblesort is one of the slowest sorts.
 

Status
Not open for further replies.

Part and Inventory Search

Welcome to EDABoard.com

Sponsor

Back
Top