
How to output data at 580Gbits/s from Geforce 8800 Ultra?


jsmithown

580Gbits/s Bus

Here's the deal. I am using the nVidia GeForce 8800 Ultra to perform calculations. However, I need to output this data to an external device (at 580Gbits/s). I plan on distributing the data across multiple devices, so I am trying to determine the method with the best bandwidth.

Here are my approaches:

1) I originally planned on outputting the non-graphical data through the DVI port of the GeForce. However, I think the DVI port is directly connected to an ASIC device (so I can't write custom data). This approach would have given me around 10Gbits/s. Any ideas?

2) Copy from Geforce to host computer memory. Then output from host computer to external device.
GeForce to North Bridge --> PCI-E (x16) --> 64 Gbit/s
North Bridge to Host Memory --> 136 Gbit/s
*Host memory to programmable PCI-E card (x1) --> 2 Gbit/s

*From above, the limiting factor is the bus connected to the programmable PCI-E card. I cannot find a programmable PCI-E card that can output at a higher bandwidth.
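To put rough numbers on the above (a back-of-the-envelope sketch only; the per-link rates are the figures assumed in this post, not measured values):

Code:
// sketch.cu -- rough bottleneck arithmetic for options 1 and 2 (illustrative only)
#include <cstdio>
#include <cmath>
#include <algorithm>

int main() {
    const double target_gbit = 580.0;                 // required output rate

    // Option 1: DVI output, roughly 10 Gbit/s usable per card (assumed above).
    const double dvi_gbit = 10.0;
    printf("Option 1: about %.0f cards needed\n", ceil(target_gbit / dvi_gbit));

    // Option 2: GPU -> north bridge -> host RAM -> programmable PCI-E x1 card.
    // A serial chain is only as fast as its slowest link.
    const double chain_gbit = std::min({64.0, 136.0, 2.0});
    printf("Option 2 bottleneck: %.0f Gbit/s -> about %.0f parallel chains needed\n",
           chain_gbit, ceil(target_gbit / chain_gbit));
    return 0;
}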

Does anyone have any suggestions? It does NOT have to be a PCI-E card. I just can't think of any other ways to output the data. High bandwidth is important!

**edit: Geforce 8800 Ultra
 

Re: 580Gbits/s Bus

jsmithown said:
Here's the deal. I am using the nVidia GeForce 6800 Ultra to perform calculations. However, I need to output this data to an external device (at 580Gbits/s). [...]

I am not exactly sure what you are planning to do with such insane amounts of bandwidth, but let me clear up a few things I know.

First of all, the GeForce 6800 Ultra has a memory bandwidth of around 35 GBytes/sec (280 Gbits/sec), and even that is a theoretical maximum. Only the latest 8800 series comes close to your requirement, and only in theory, and most importantly only on the path from the very fast, very wide memory to the GPU itself.

So if you are planning to reach even 1/10th of that bandwidth over serial links, you can just forget it, unless you have advanced 12x DDR InfiniBand or 40Gbps optical fibre links.

Now, for your project: I assume you want to do some GPGPU programming, or is it just plain dumping of high-speed data? In either case, where are you pumping such massive amounts of data? Even the latest quad-core processors have trouble feeding data at such rates without DMA.

Let us know what exactly it is you are looking for so that we may have better solutions for you.

1) DVI has a bit rate of 3.7Gbits/sec and 7.4Gbits/sec for single and dual links respectively, and I believe that includes error correction. You'll be needing a large number of DVI ports, and hence of 6800 Ultras.

2) This one doesn't seem to be a good idea, as PCIe PHYs are hard to come by, especially 16-lane, or 8-lane PCIe 2.0 (same bandwidth).

Also, in order for either of the above two ideas to work, you need to do a great deal of DirectX/OpenGL programming, as the 6800 series is not very programmable and lacks library support.
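If you do end up on the 8800 route with NVIDIA's GPGPU toolkit (CUDA), here is a minimal sketch of how you could measure the real GPU-to-host copy rate over PCI-E, which is what your option 2 ultimately depends on. The buffer size is arbitrary and error checking is omitted; treat it as illustrative only.

Code:
// d2h_bw.cu -- minimal sketch: measure GPU -> host copy bandwidth over PCI-E
// (assumes a CUDA-capable card such as the 8800 series; results vary by system;
//  error checking omitted for brevity)
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    const size_t bytes = 256 << 20;      // 256 MB test buffer (arbitrary size)
    void *dev = 0, *host = 0;
    cudaMalloc(&dev, bytes);
    cudaMallocHost(&host, bytes);        // pinned host memory, needed for full-speed DMA

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start, 0);
    cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost);
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    double gbit_per_s = (bytes * 8.0) / (ms * 1e-3) / 1e9;
    printf("device -> host: %.1f Gbit/s\n", gbit_per_s);

    cudaFreeHost(host);
    cudaFree(dev);
    return 0;
}

Whatever that number turns out to be on real hardware, it puts a hard ceiling on option 2 before the outbound link is even considered.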
 

Re: 580Gbits/s Bus

First of all, thank you very much for your reply. It is very informative and I understand the limitations better.

>>Let us know what exactly it is you are looking for so that we may have better solutions for you.
My intent is to use a graphics card to perform fast, parallelized data processing. I chose nVidia because they have a library maintained internally for GPGPU.
This data then needs to be output to an external device. Whatever the implementation, the external device needs to receive 580Gbit/s constantly.

My issue is how to output the data at such high data rates. I will have to use some sort of distributed approach. But, at this point, even if option 1 (output from graphics card to DVI) does work, it will take 580Gbit / 14.8Gbit ≈ 40 cards (and many PCs).

At this point, I am unsure of what device can receive data at these speeds. But if I can at least send the data that fast, I can most likely hire a company to design a custom ASIC to receive the data and process it.

-----------------------------------------------------------------

My mistake, I meant the Geforce 8800 Ultra.

-The memory bandwidth of the GeForce will be 86.4 GB/sec * 8 bits/byte = 691.2 Gbit/sec.
Option 1: 2x dual-link DVI --> 7.4 * 2 = 14.8 Gbit/sec
Option 2: transferring to host memory, it would still be limited by the PCI-E 2.0 x16 bus: 5 Gbit/s per lane * 16 lanes * 8/10 (encoding) = 64 Gbit/sec
However, the actual transfer speed is dependent on the implementation.
Also, the CPU would be overloaded by all the copying, which will probably slow things down a lot.

-----------------------------------------------------------------

I know that option 2 is not the best approach, but it seems like it will provide the higher throughput. I have been looking into 10Gb fiber optic NICs. The reasons are:
-10Gb fiber NICs are readily available
-10Gb fiber transceivers are available for the Xilinx Virtex family
-This at least addresses both sending and receiving data at such high speeds. Plus, I could avoid needing an ASIC device and store all processing in the FPGA.
-I also just discovered a 10Gb card that supports RDMA (Remote Direct Memory Access, https://en.wikipedia.org/wiki/Remote_Direct_Memory_Access), which I don't know is applicable here, but I will look into it. If it is, then the CPU being overworked shouldn't be an issue.
###OR###
Dual 20Gbit InfiniBand NICs (40Gbit). But I haven't found a transceiver that is compatible with FPGAs. (580 / 40 = 15 NICs.) Either way, the host would have to stripe the output stream across the ports; see the rough sketch below.
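Whatever the physical link ends up being (10Gb fiber, InfiniBand, something else), the host side still has to split one stream across many ports. Here is a minimal illustrative sketch of round-robin striping; send_on_link() is a hypothetical placeholder, not any real NIC or RDMA API:

Code:
// stripe.cu -- illustrative round-robin striping of one output stream across N links
// send_on_link() is a hypothetical placeholder, NOT a real NIC/RDMA call.
#include <cstdio>
#include <cstddef>

const int    NUM_LINKS  = 58;           // e.g. 580 Gbit/s / 10 Gbit/s per fiber NIC
const size_t CHUNK_SIZE = 1 << 20;      // 1 MB slices (arbitrary choice)

void send_on_link(int link, const char *data, size_t len) {
    // Placeholder: real code would queue 'data' on NIC 'link' (RDMA post, socket send, ...).
    printf("link %2d <- %zu bytes\n", link, len);
}

void stripe(const char *buf, size_t total) {
    int link = 0;
    for (size_t off = 0; off < total; off += CHUNK_SIZE) {
        size_t len = (total - off < CHUNK_SIZE) ? (total - off) : CHUNK_SIZE;
        send_on_link(link, buf + off, len);
        link = (link + 1) % NUM_LINKS;  // round-robin across the available ports
    }
}

int main() {
    static char buf[8 << 20];           // 8 MB of example data
    stripe(buf, sizeof(buf));
    return 0;
}

The receiving side (FPGA or ASIC) would then have to reassemble the chunks in order.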

-----------------------------------------------------------------

Eventually when 100Gbit NICs come out, it will be much easier to implement the above. But, this is the best solution that I could come up with.
Any ideas on other implementations?
 

Re: 580Gbits/s Bus

I know there are some bright people out there with some awesome ideas. Don't be shy =D POST!
 

Re: 580Gbits/s Bus

jsmithown said:
At this point, I am unsure of what device can receive data at these speeds. But if I can at least send the data that fast, I can most likely hire a company to design a custom ASIC to receive the data and process it.

Since you are willing to invest tons of money into a possible future ASIC design, let me give you some terribly expensive ideas.

Why do you have to use simple GPUs when you can use an array of FPGAs and application-specific processors like DSPs/x86/POWER processors? If the communication happens within the same board (motherboard), then you can choose from a plethora of high-speed bus standards like HyperTransport 3 (a 32-bit link has close to 160 Gbits/sec), and there is no reason why you can't use multiple HT3 busses. You can also check out the XDR bus structure; though it was designed for RAMs, I am pretty sure it can be adapted for other purposes, especially considering that it uses ODR (Octal Data Rate, which moves four times the data of DDR at the same clock speed).

If the communication takes place between boards, then nothing beats optical fiber links. You must have heard of DWDM (Dense Wavelength Division Multiplexing). It is an elegant technique whereby multiple 10Gbps or even 40Gbps channels (with high-quality fiber) are multiplexed onto the same pair of fibers using different wavelengths. Keep in mind these can be used without the full TCP/IP stack, as they are designed as a PHY layer; you can push just about anything over them. The modules are readily available (though expensive).
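Just to put rough counts on those suggestions (using the approximate figures above; a simple illustrative calculation, nothing more):

Code:
// link_count.cu -- rough count of parallel links/wavelengths for 580 Gbit/s
// (the per-link rates are the approximate figures quoted above, not guarantees)
#include <cstdio>
#include <cmath>

int main() {
    const double target = 580.0;        // required Gbit/s
    const struct { const char *name; double gbit; } links[] = {
        { "HyperTransport 3, 32-bit link", 160.0 },
        { "10 Gbps DWDM wavelength      ",  10.0 },
        { "40 Gbps DWDM wavelength      ",  40.0 },
    };
    for (const auto &l : links)
        printf("%s : %2.0f needed\n", l.name, ceil(target / l.gbit));
    return 0;
}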

I am being judgmental here, but let us know what you have in mind for this crazy high-speed bus and I promise I won't steal your idea :) Seriously though, if you want to commercialise your next big idea, a forum is not the best place to make a feasibility study. Talk to any of the thousands of companies in the design business for a custom solution. Good luck.
 

