shaiko
Advanced Member level 5
Hello,
I'm designing an image rotation block on a Cyclone V FPGA with DDR3.
The input is a simple 30 FPS parallel video protocol: a 1024 × 1024 image with HSYNC, VSYNC, 16-bit data, and a ~40 MHz pixel clock.
Using the 2D rotation formula, I want to map destination pixels back to source pixels, i.e.:
1. I have one DDR3 address area that acts as an input buffer (where I write the incoming image) and another DDR3 address area where I write the transformed image.
2. I raster-scan over the destination pixels; for each pixel I calculate the source address, fetch the pixel, and write it to an on-chip line buffer.
3. Once the line buffer fills up, I write it to the destination address in an efficient, byte-aligned fashion.
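The destination-to-source mapping in step 2 can be sketched as a minimal software model (the rotation angle, rotation center, and nearest-neighbor rounding here are my assumptions for illustration, not part of the original design):

```python
import math

W = H = 1024  # image dimensions from the post


def dest_to_source(dx, dy, angle_deg, cx=W // 2, cy=H // 2):
    """Map a destination pixel (dx, dy) back to its source pixel by
    applying the inverse rotation (rotate by -angle about the center).
    Returns None when the source falls outside the input image."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    x, y = dx - cx, dy - cy
    sx = round(c * x + s * y) + cx   # nearest-neighbor source column
    sy = round(-s * x + c * y) + cy  # nearest-neighbor source row
    if 0 <= sx < W and 0 <= sy < H:
        return sx, sy
    return None


def source_address(sx, sy, base=0, bytes_per_pixel=2):
    """Linear DDR3 byte address of a source pixel (16-bit pixels,
    row-major layout assumed)."""
    return base + (sy * W + sx) * bytes_per_pixel
```

Note the access pattern this produces: as you raster-scan the destination, consecutive source addresses walk diagonally through the input buffer, which is exactly why each fetch hits a different 128-bit word.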
The problem I see with my algorithm is that step 2 is terribly inefficient in memory-bandwidth terms.
I fetch 128 bits (the DDR3 controller's data bus width) even though, at that point, I might use only a single pixel of the fetched data.
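To put a number on that inefficiency: with 16-bit pixels on a 128-bit bus, each fetch returns 8 pixels, so the worst case (one useful pixel per fetch, no reuse) wastes 7/8 of the read bandwidth. A quick back-of-the-envelope check (burst length, refresh, and row-activation overhead are ignored for simplicity):

```python
BUS_BITS = 128    # DDR3 controller data bus width
PIXEL_BITS = 16   # pixel depth from the video protocol
W = H = 1024
FPS = 30

pixels_per_fetch = BUS_BITS // PIXEL_BITS       # 8 pixels per 128-bit read
worst_case_utilization = PIXEL_BITS / BUS_BITS  # 1/8 when only one pixel is used

# Read traffic per rotated frame if every destination pixel
# triggers its own 128-bit fetch (worst case, no caching):
reads_per_frame = W * H
worst_case_read_bits = reads_per_frame * BUS_BITS
ideal_read_bits = reads_per_frame * PIXEL_BITS

print(pixels_per_fetch)                          # 8
print(worst_case_read_bits // ideal_read_bits)   # 8x more read traffic than needed
print(worst_case_read_bits * FPS / 1e9)          # ~4 Gbit/s of reads at 30 FPS
```

So in the worst case the rotation reads roughly 8× the data it actually uses, on top of the write traffic for the incoming frame and the rotated output.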
Do you think my algorithm is good?