Right now I'm creating a framework for such a coprocessor ...
Would this framework be adaptable to the OpenCL spec? That would be awesomely useful!
I'm thinking of a general purpose PC here with a modern graphics card and a PCI-e FPGA board. You can put together a pretty high performance CPU/GPU/FPGA system for ~$1000 these days.
I hear you. To be honest, I wasn't thinking about OpenCL, not yet.
First things first, I have to make something that would even work. Currently I'm working on it. It will be a simple infrastructure consisting of a software API (written in C or C++), hardware modules, and, probably, a special PC software tool. The way I currently see it:
a) you have an HDL project (a bunch of modules with a single top-level interface module);
b) you want to implement it in an FPGA with the ability to access that module from a PC;
c) you take my software tool and let it analyze your top-level interface. Maybe you also specify some settings;
d) this tool creates a project and automatically implements it in the FPGA (I have a Xilinx FPGA board, so I will initially support Xilinx ISE for this step);
e) the tool also creates a C/C++ header file containing functions to access your hardware module (essentially, a set() function to send data to the module's inputs and a get() function to read results back);
f) you include this header in your program, make API calls where necessary, write the data processing and so on; compile the program, link it with my API library, run it and get a performance gain over a pure CPU program.
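Steps e) and f) might look roughly like the sketch below. Everything in it is hypothetical: the names `Link`, `fpga_set`, `fpga_get` and `make_adder_stub` are made up for illustration, and the hardware is replaced by a software stub (a two-input adder) so the snippet can run on a plain PC without a board.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical transport handle. In a real system this would talk to the
// board (RS232, Ethernet, PCI-e, ...); here it is a software stand-in.
struct Link {
    std::vector<std::uint32_t> inputs;  // values last written to the module inputs
    // Stub for the synthesized module: maps the inputs to one output word.
    std::function<std::uint32_t(const std::vector<std::uint32_t>&)> module;
};

// set(): send one word to input port `port` of the module.
inline void fpga_set(Link& link, std::size_t port, std::uint32_t value) {
    if (port >= link.inputs.size()) link.inputs.resize(port + 1, 0);
    link.inputs[port] = value;
}

// get(): read the module's result back.
inline std::uint32_t fpga_get(const Link& link) {
    assert(link.module != nullptr);  // a module must be attached first
    return link.module(link.inputs);
}

// Example "hardware": a two-input adder standing in for the user's HDL top level.
inline Link make_adder_stub() {
    Link link;
    link.inputs.assign(2, 0);
    link.module = [](const std::vector<std::uint32_t>& in) -> std::uint32_t {
        return in[0] + in[1];
    };
    return link;
}
```

A user program would then create the link, call `fpga_set(link, 0, a)` and `fpga_set(link, 1, b)`, and read the sum with `fpga_get(link)`. A generated header for a real module would presumably expose one typed setter per input port rather than a generic port index; that is a design choice this sketch leaves open.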
This would only be the initial release, just enough to actually run programs, test the system and do some performance-related research. I hope I'll have the chance to evolve the system much further. But only if I finish my final year project in time, otherwise I'm doomed forever. I need a month or so to finish the first working prototype, while the project deadline is about 2 months away. And getting the framework working won't be enough: I need to research the domain, that is, I need some computational projects to actually implement on the FPGA, implement on the PC, and compare performance. That's where this thread comes from.
Now, as for OpenCL. The way I see it:
1) OpenCL is good for large homogeneous arrays of fixed ALUs with large, fast memory access (read: for GPUs). Trying to directly recreate such a structure would be like playing an away match: no chance against GPUs.
BTW, memory access and overall data bandwidth is my pet peeve, since my board doesn't support PCI and even has some problems with Ethernet. Currently I'm using RS232, which is indeed a dead end.
2) To enable OpenCL for FPGA, I'm going to need an OpenCL (read: C) to HDL compiler (translator), which is a whole separate story. I'm actually considering writing a simple C/C++ to HDL translator (for arithmetic operations) to enable rapid and easy acceleration of existing programs. My supervisor is working on it, and there is such a prospect, but there's no chance we'll achieve that in the next half a year (to say the least).
3) The FPGA's strength is total reconfigurability. The most effective way to utilize an FPGA's resources is to use an HDL project that was created for FPGA from the start, without any fixed architecture in mind. That's why I decided to design the system the way I described above.
The bottom line is: currently, I have no idea how to support OpenCL, but if I meet the current deadline I will indeed give it a thought.
I'm not sure about $1000: prices for good FPGAs are climbing to $500 and higher. It's more like $1200-1300, which might also be acceptable.
P.S. With the current concept of a system it's no match for GPUs, but I was hoping for problems that don't fit GPUs well. Not quite sure what those problems are, though...
Mmmmh, genetic algo implementation of traveling salesman? The benefit of traveling salesman is that the problem domain is well understood and you can read a lot about it. That always helps on the timetable.
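To make the suggestion a bit more concrete, here is a minimal sketch of the piece such a solver calls most often: the tour-length fitness function, wrapped in a mutation-only evolutionary loop. This is a deliberate simplification and not a full genetic algorithm (no population, no crossover), and the names `City`, `tour_length` and `evolve` are made up for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

struct City { double x, y; };

// Fitness: total length of the closed tour that visits the cities in `order`.
inline double tour_length(const std::vector<City>& cities,
                          const std::vector<std::size_t>& order) {
    double total = 0.0;
    for (std::size_t i = 0; i < order.size(); ++i) {
        const City& a = cities[order[i]];
        const City& b = cities[order[(i + 1) % order.size()]];
        total += std::hypot(a.x - b.x, a.y - b.y);
    }
    return total;
}

// Mutation-only search: swap two random positions in the tour and keep the
// child only if it is not longer than the current best tour.
inline std::vector<std::size_t> evolve(const std::vector<City>& cities,
                                       int generations, unsigned seed) {
    std::vector<std::size_t> best(cities.size());
    std::iota(best.begin(), best.end(), std::size_t{0});  // start from 0, 1, 2, ...
    if (cities.size() < 2) return best;

    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pick(0, cities.size() - 1);
    double best_len = tour_length(cities, best);

    for (int g = 0; g < generations; ++g) {
        std::vector<std::size_t> child = best;
        const std::size_t i = pick(rng);
        const std::size_t j = pick(rng);
        std::swap(child[i], child[j]);
        const double len = tour_length(cities, child);
        if (len <= best_len) {
            best = std::move(child);
            best_len = len;
        }
    }
    return best;
}
```

The fitness evaluation is the natural candidate for offloading, since it is called once per candidate tour and consists of simple arithmetic; whether the transfer cost to the board eats the gain depends on the link.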
What you could also do is hop by the math department and ask if they know of any cool combinatorial things like that.
Good thinking, thanks!
Memory access and overall bandwidth happen to be a pet peeve of mine as well. Yes, a pet peeve. I have several. All the boards I have are max 1 Gbit link (Atlys has Gbit Ethernet). Now 1 Gbit is nice enough for a lot of things, but for some other stuff I'd really like the bandwidth of PCI-e x16.
You are lucky to at least have that! My board has a 10/100 PHY, and it seems to malfunction (either this, or it's my brain malfunctioning).
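To put rough numbers on the gap between the links mentioned in this thread, here is a back-of-the-envelope sketch. The link parameters are assumptions, not figures from the posts: RS232 at 115200 baud with 8N1 framing (10 wire bits per byte), gigabit Ethernet with protocol overhead ignored, and PCI-e Gen1 x16 at 2.5 GT/s per lane with 8b/10b encoding. The function names are made up for illustration.

```cpp
#include <cassert>

// Seconds needed to move `bytes` of payload over a link.
// raw_bits_per_second: line rate of the link.
// bits_per_byte: wire bits spent per payload byte (framing or line encoding).
inline double transfer_seconds(double bytes, double raw_bits_per_second,
                               double bits_per_byte) {
    assert(raw_bits_per_second > 0.0);
    return bytes * bits_per_byte / raw_bits_per_second;
}

const double kMiB = 1024.0 * 1024.0;

// Assumed: 115200 baud, 8N1 framing (start bit + 8 data bits + stop bit).
inline double rs232_seconds(double bytes) {
    return transfer_seconds(bytes, 115200.0, 10.0);
}

// Assumed: 1 Gbit/s line rate, Ethernet/IP protocol overhead ignored.
inline double gbe_seconds(double bytes) {
    return transfer_seconds(bytes, 1e9, 8.0);
}

// Assumed: PCI-e Gen1, 16 lanes at 2.5 GT/s each, 8b/10b encoding,
// packet overhead ignored.
inline double pcie1_x16_seconds(double bytes) {
    return transfer_seconds(bytes, 16.0 * 2.5e9, 10.0);
}
```

Under these assumptions, moving 1 MiB takes roughly 91 s over RS232, about 8.4 ms over gigabit Ethernet, and about 0.26 ms over PCI-e Gen1 x16. Real throughput will be lower on every link once protocol overhead and latency are included.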
But hey, $$ == euro for computer stuff, right?
I don't know, in your country, possibly, but in mine the euro has more buying power than $$, and significantly more.
Well, the thing is that GPUs are really good at floating point.
Oh, about that! I've found out that FPULib (the most popular VHDL floating-point library) also has an LNS subset (logarithmic number system). Because those numbers are logarithmic, multiplication and division are essentially addition and subtraction. Addition and subtraction themselves are tricky in LNS and require 2 to 3 additional digits to produce the same or less error than IEEE 754 FP, but this LNS is fast and small. I'd like to try it out. Still, it doesn't seem to be popular, and I didn't manage to figure out why... What do you think about it?
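For anyone who hasn't met LNS before, the idea can be sketched in a few lines. This is only an illustration of the arithmetic, not how FPULib implements it: the logarithm is held in a `double` here, whereas a hardware LNS would store it in fixed point, and sign and zero handling are left out. The names `Lns`, `lns_encode`, `lns_mul` and so on are made up for this sketch.

```cpp
#include <cassert>
#include <cmath>

// A positive real x is represented by its base-2 logarithm.
struct Lns { double log2_value; };

inline Lns lns_encode(double x) {
    assert(x > 0.0);  // sketch handles positive values only
    return Lns{std::log2(x)};
}

inline double lns_decode(Lns a) { return std::exp2(a.log2_value); }

// Multiplication and division reduce to adding and subtracting the logs.
inline Lns lns_mul(Lns a, Lns b) { return Lns{a.log2_value + b.log2_value}; }
inline Lns lns_div(Lns a, Lns b) { return Lns{a.log2_value - b.log2_value}; }

// Addition is the tricky one:
//   log2(x + y) = hi + log2(1 + 2^(lo - hi)),  with hi >= lo.
// The nonlinear term log2(1 + 2^d) is what costs extra area or precision
// in hardware, where it is commonly tabulated or approximated.
inline Lns lns_add(Lns a, Lns b) {
    const double hi = std::fmax(a.log2_value, b.log2_value);
    const double lo = std::fmin(a.log2_value, b.log2_value);
    return Lns{hi + std::log2(1.0 + std::exp2(lo - hi))};
}
```

For example, `lns_decode(lns_mul(lns_encode(3.0), lns_encode(4.0)))` comes out as 12 up to rounding error, using nothing but one addition on the stored logs.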