wire clkFXa, clka, locked1;
DCM dcm1 (.CLKIN(clock), .RST(1'b0), .CLKFB(), .CLK0(), .CLKDV(), .CLKFX(clkFXa), .LOCKED(locked1));
defparam dcm1.CLK_FEEDBACK = "NONE";
defparam dcm1.CLKFX_MULTIPLY = 4;
defparam dcm1.CLKFX_DIVIDE = 5;
defparam dcm1.CLKIN_PERIOD = 20;
BUFG buf1 (.I(clkFXa), .O(clka));
Read the link that treqer provided:
https://www.xilinx.com/support/documentation/data_sheets/ds529.pdf
Speed grade -4: 280 MHz max for the block RAM.
So the block RAM is not the limiting factor. It sounds like the rest of the design is.
As for why you can only reach 50 MHz: look in the timing report and check which paths are the slowest.
Timing Summary:
---------------
Speed Grade: -4
Minimum period: 22.679ns (Maximum Frequency: 44.094MHz)
Minimum input arrival time before clock: 12.525ns
Maximum output required time after clock: 7.709ns
Maximum combinational path delay: 11.560ns
//This is from the detail
Delay: 22.679ns (Levels of Logic = 21)
Source: pcpuwm1/pcpu/regWriteDst_MEMWB_1 (FF)
Destination: pcpuwm1/pcpu/ZF (FF)
Source Clock: clock rising
Destination Clock: clock rising
//clock is 50MHz
wire clkFXa, clka, locked1;
DCM dcm1 (.CLKIN(clock), .RST(1'b0), .CLKFB(), .CLK0(), .CLKDV(), .CLKFX(clkFXa), .LOCKED(locked1));
defparam dcm1.CLK_FEEDBACK = "NONE";
defparam dcm1.CLKFX_MULTIPLY = 4;
defparam dcm1.CLKFX_DIVIDE = 5;
defparam dcm1.CLKIN_PERIOD = 20;
BUFG buf1 (.I(clkFXa), .O(clka));
// pcpu is clocked by the DCM output (clka); the memory is clocked by the board clock (clock)
pcpuwm pcpuwm1 (.clock(clka), .clock_mem(clock), .reset(NBTN[0]), .start(NBTN[1]), .stall(NBTN[2]),
.sel(SW[4:0]), .y(outgr));
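For reference, the frequency on clka follows directly from the DCM attributes. A quick check, assuming the 50 MHz input stated in the comment above (M and D are just my shorthand for CLKFX_MULTIPLY and CLKFX_DIVIDE):

```latex
\[
f_{\mathrm{CLKFX}} = f_{\mathrm{CLKIN}} \cdot \frac{M}{D}
= 50\,\mathrm{MHz} \cdot \frac{4}{5} = 40\,\mathrm{MHz},
\qquad
T_{\mathrm{CLKFX}} = \frac{1}{40\,\mathrm{MHz}} = 25\,\mathrm{ns}
\]
```

So the pcpu clock period is 25 ns, which leaves roughly 2.3 ns of margin over the 22.679 ns minimum period that synthesis estimated.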
// this is my memory module code
always @(posedge clk) begin
    if (en) begin
        if (we)               // Write enable
            ram[addr] <= di;  // Update ram with di (data input)
        do <= ram[addr];      // Send data out via do (data output)
    end
end
Try setting up a UCF file, then look at the post-PAR timing. It may show in more detail where things are failing, and it gives a more realistic measure of the design. The synthesis report makes assumptions about routing that may not hold. The achievable frequency after PAR will generally be a bit lower because of routing issues.
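A minimal sketch of what such a UCF period constraint could look like, assuming the 50 MHz board oscillator enters on the top-level port named clock (the group name and the TS_clock identifier are my own placeholders, not taken from the project):

```
# Assumption: 50 MHz input (20 ns period) on the top-level net "clock"
NET "clock" TNM_NET = "clock";
TIMESPEC "TS_clock" = PERIOD "clock" 20 ns HIGH 50%;
```

As far as I know, the tools derive the constraints for the DCM outputs from a PERIOD placed on the DCM input, so the clocks behind the DCM should not need separate entries. It is worth checking the post-PAR timing report to confirm that each clock net actually shows up with a constraint.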
wire clkFXa, clka, locked1;
DCM dcm1 (.CLKIN(clock), .RST(1'b0), .CLKFB(), .CLK0(), .CLKDV(), .CLKFX(clkFXa), .LOCKED(locked1));
defparam dcm1.CLK_FEEDBACK = "NONE";
defparam dcm1.CLKFX_MULTIPLY = 25;
defparam dcm1.CLKFX_DIVIDE = 31;
BUFG buf1 (.I(clkFXa), .O(clka));
wire clkFXb, clkb, locked2;
DCM dcm2 (.CLKIN(clock), .RST(1'b0), .CLKFB(), .CLK0(), .CLKDV(), .CLKFX(clkFXb), .LOCKED(locked2));
defparam dcm2.CLK_FEEDBACK = "NONE";
defparam dcm2.CLKFX_MULTIPLY = 25;
defparam dcm2.CLKFX_DIVIDE = 25;
BUFG buf2 (.I(clkFXb), .O(clkb));
pcpuwm pcpuwm1 (.clock(clka), .clock_mem(clkb), .reset(NBTN[0]), .start(NBTN[1]), .stall(NBTN[2]),
.sel(SW[4:0]), .y(outgr));
Timing Summary:
---------------
Speed Grade: -4
Minimum period: 20.064ns (Maximum Frequency: 49.842MHz)
Minimum input arrival time before clock: 14.360ns
Maximum output required time after clock: 7.709ns
Maximum combinational path delay: 11.492ns
Release 9.2.04i par J.40
Copyright (c) 1995-2007 Xilinx, Inc. All rights reserved.
CADPC03:: Wed May 11 14:40:42 2011
par -w -intstyle ise -ol std -t 1 board_map.ncd board.ncd board.pcf
Constraints file: board.pcf.
Loading device for application Rf_Device from file '3s200.nph' in environment C:\Xilinx92i.
"board" is an NCD, version 3.1, device xc3s200, package ft256, speed -4
Initializing temperature to 85.000 Celsius. (default - Range: 0.000 to 85.000 Celsius)
Initializing voltage to 1.140 Volts. (default - Range: 1.140 to 1.260 Volts)
INFO:Par:282 - No user timing constraints were detected or you have set the option to ignore timing constraints ("par
-x"). Place and Route will run in "Performance Evaluation Mode" to automatically improve the performance of all
internal clocks in this design. The PAR timing summary will list the performance achieved for each clock. Note: For
the fastest runtime, set the effort level to "std". For best performance, set the effort level to "high". For a
balance between the fastest runtime and best performance, set the effort level to "med".
Device speed data version: "PRODUCTION 1.39 2007-10-19".
Device Utilization Summary:
Number of BUFGMUXs 2 out of 8 25%
Number of DCMs 2 out of 4 50%
Number of External IOBs 33 out of 173 19%
Number of LOCed IOBs 33 out of 33 100%
Number of RAMB16s 2 out of 12 16%
Number of Slices 665 out of 1920 34%
Number of SLICEMs 0 out of 960 0%
Overall effort level (-ol): Standard
Placer effort level (-pl): High
Placer cost table entry (-t): 1
Router effort level (-rl): Standard
WARNING:Par:288 - The signal BTN<1>_IBUF has no load. PAR will not attempt to route this signal.
WARNING:Par:288 - The signal BTN<2>_IBUF has no load. PAR will not attempt to route this signal.
WARNING:Par:288 - The signal BTN<3>_IBUF has no load. PAR will not attempt to route this signal.
Starting Placer
Phase 1.1
Phase 1.1 (Checksum:98ac4b) REAL time: 2 secs
Phase 2.7
Phase 2.7 (Checksum:1312cfe) REAL time: 2 secs
Phase 3.31
Phase 3.31 (Checksum:1c9c37d) REAL time: 2 secs
Phase 4.2
.....
..
Phase 4.2 (Checksum:26259fc) REAL time: 3 secs
Phase 5.8
..................................................
........
..................................................
.............
..........
.....
Phase 5.8 (Checksum:aa7b01) REAL time: 9 secs
Phase 6.5
Phase 6.5 (Checksum:39386fa) REAL time: 9 secs
Phase 7.18
Phase 7.18 (Checksum:42c1d79) REAL time: 16 secs
Phase 8.5
Phase 8.5 (Checksum:4c4b3f8) REAL time: 16 secs
REAL time consumed by placer: 16 secs
CPU time consumed by placer: 16 secs
Writing design to file board.ncd
Total REAL time to Placer completion: 17 secs
Total CPU time to Placer completion: 17 secs
Starting Router
Phase 1: 4967 unrouted; REAL time: 17 secs
Phase 2: 4706 unrouted; REAL time: 17 secs
Phase 3: 2287 unrouted; REAL time: 18 secs
Phase 4: 2287 unrouted; (1334) REAL time: 18 secs
Phase 5: 2323 unrouted; (0) REAL time: 19 secs
Phase 6: 0 unrouted; (5676) REAL time: 28 secs
Phase 7: 0 unrouted; (5676) REAL time: 29 secs
Updating file: board.ncd with current fully routed design.
Phase 8: 0 unrouted; (3437) REAL time: 32 secs
Phase 9: 0 unrouted; (2872) REAL time: 49 secs
Phase 10: 0 unrouted; (2872) REAL time: 49 secs
Phase 11: 0 unrouted; (0) REAL time: 50 secs
WARNING:Route:455 - CLK Net:clock_IBUFG may have excessive skew because
6 CLK pins and 0 NON_CLK pins failed to route using a CLK template.
WARNING:Route:455 - CLK Net:clock_counter<10> may have excessive skew because
0 CLK pins and 1 NON_CLK pins failed to route using a CLK template.
Total REAL time to Router completion: 50 secs
Total CPU time to Router completion: 50 secs
Partition Implementation Status
-------------------------------
No Partitions were found in this design.
-------------------------------
Generating "PAR" statistics.
**************************
Generating Clock Report
**************************
+---------------------+--------------+------+------+------------+-------------+
| Clock Net | Resource |Locked|Fanout|Net Skew(ns)|Max Delay(ns)|
+---------------------+--------------+------+------+------------+-------------+
| clka | BUFGMUX0| No | 226 | 0.004 | 1.014 |
+---------------------+--------------+------+------+------------+-------------+
| clkb | BUFGMUX3| No | 2 | 0.000 | 1.011 |
+---------------------+--------------+------+------+------------+-------------+
| clock_IBUFG | Local| | 8 | 0.697 | 1.854 |
+---------------------+--------------+------+------+------------+-------------+
| clock_counter<10> | Local| | 10 | 0.646 | 3.132 |
+---------------------+--------------+------+------+------------+-------------+
* Net Skew is the difference between the minimum and maximum routing
only delays for the net. Note this is different from Clock Skew which
is reported in TRCE timing report. Clock Skew is the difference between
the minimum and maximum path delays which includes logic delays.
The Delay Summary Report
The NUMBER OF SIGNALS NOT COMPLETELY ROUTED for this design is: 0
The AVERAGE CONNECTION DELAY for this design is: 1.487
The MAXIMUM PIN DELAY IS: 4.911
The AVERAGE CONNECTION DELAY on the 10 WORST NETS is: 4.476
Listing Pin Delays by value: (nsec)
d < 1.00 < d < 2.00 < d < 3.00 < d < 4.00 < d < 5.00 d >= 5.00
--------- --------- --------- --------- --------- ---------
1596 2034 1124 245 37 0
Timing Score: 0
Asterisk (*) preceding a constraint indicates it was not met.
This may be due to a setup or hold violation.
------------------------------------------------------------------------------------------------------
Constraint | Check | Worst Case | Best Case | Timing | Timing
| | Slack | Achievable | Errors | Score
------------------------------------------------------------------------------------------------------
Autotimespec constraint for clock net clo | SETUP | N/A| 4.215ns| N/A| 0
ck_IBUFG | HOLD | 1.124ns| | 0| 0
------------------------------------------------------------------------------------------------------
Autotimespec constraint for clock net clo | SETUP | N/A| 11.934ns| N/A| 0
ck_counter<10> | HOLD | 1.030ns| | 0| 0
------------------------------------------------------------------------------------------------------
Autotimespec constraint for clock net clk | SETUP | N/A| 21.399ns| N/A| 0
a | HOLD | 0.800ns| | 0| 0
------------------------------------------------------------------------------------------------------
All constraints were met.
INFO:Timing:2761 - N/A entries in the Constraints list may indicate that the
constraint does not cover any paths or that it has no requested value.
Generating Pad Report.
All signals are completely routed.
WARNING:Par:283 - There are 3 loadless signals in this design. This design will cause Bitgen to issue DRC warnings.
Total REAL time to PAR completion: 52 secs
Total CPU time to PAR completion: 52 secs
Peak Memory Usage: 141 MB
Placement: Completed - No errors found.
Routing: Completed - No errors found.
Number of error messages: 0
Number of warning messages: 7
Number of info messages: 1
Writing design to file board.ncd
PAR done!
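To relate the PAR numbers to the DCM settings, here is the arithmetic, assuming the 50 MHz input clock mentioned earlier and the CLKFX_MULTIPLY = 25 / CLKFX_DIVIDE = 31 setting on dcm1:

```latex
\begin{align*}
f_{\mathrm{clka}} &= 50\,\mathrm{MHz}\cdot\frac{25}{31}\approx 40.32\,\mathrm{MHz},
& T_{\mathrm{clka}} &= 20\,\mathrm{ns}\cdot\frac{31}{25} = 24.8\,\mathrm{ns}\\
f_{\max,\mathrm{PAR}} &= \frac{1}{21.399\,\mathrm{ns}} \approx 46.73\,\mathrm{MHz},
& T_{\mathrm{clka}} - 21.399\,\mathrm{ns} &= 3.401\,\mathrm{ns}
\end{align*}
```

On these figures the routed design meets the clka period with roughly 3.4 ns to spare. Note that the constraint table in this report has no entry for clkb, so this run by itself says nothing about how fast clock_mem can go.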
Delay: 22.679ns (Levels of Logic = 21)
Source: pcpuwm1/pcpu/regWriteDst_MEMWB_1 (FF)
Destination: pcpuwm1/pcpu/ZF (FF)
Source Clock: clock rising
Destination Clock: clock rising
So the clock frequencies are now separate for each module. As I understand it, I can raise the frequency of clock_mem, because it is not affected by clock or by the slowest path in the pcpu module.
Timing constraint: Default period analysis for Clock 'clock'
Clock period: 22.679ns (frequency: 44.094MHz)
Total number of paths / destination ports: 3022404 / 1196
-------------------------------------------------------------------------
Delay: 22.679ns (Levels of Logic = 21)
Source: pcpuwm1/pcpu/regWriteDst_MEMWB_1 (FF)
Destination: pcpuwm1/pcpu/ZF (FF)
Source Clock: clock rising
Destination Clock: clock rising
Data Path: pcpuwm1/pcpu/regWriteDst_MEMWB_1 to pcpuwm1/pcpu/ZF
Gate Net
Cell:in->out fanout Delay Delay Logical Name (Net Name)
---------------------------------------- ------------
//RED section
FDC:C->Q 14 0.720 1.255 pcpuwm1/pcpu/regWriteDst_MEMWB_1 (pcpuwm1/pcpu/regWriteDst_MEMWB_1)
LUT4_D:I2->O 17 0.551 1.684 pcpuwm1/pcpu/fwdWB_Reg_Con<1>26 (pcpuwm1/pcpu/fwdWB_Reg_Con<1>26)
LUT2:I0->O 18 0.551 1.443 pcpuwm1/pcpu/fwdWB_Reg_Con<1>43 (pcpuwm1/pcpu/fwdWB_Reg_Con<1>)
LUT4_D:I3->O 11 0.551 1.170 pcpuwm1/pcpu/ALUIn1_or0001161_SW0 (N269)
LUT4:I3->O 1 0.551 0.869 pcpuwm1/pcpu/ALUIn1<2>11 (pcpuwm1/pcpu/ALUIn1<2>11)
LUT4:I2->O 12 0.551 1.313 pcpuwm1/pcpu/ALUIn1<2>39 (pcpuwm1/pcpu/ALUIn1<2>)
//ORANGE section
LUT2:I1->O 1 0.551 0.000 pcpuwm1/pcpu/Madd_result_addsub0000_lut<2> (pcpuwm1/pcpu/Madd_result_addsub0000_lut<2>)
MUXCY:S->O 1 0.500 0.000 pcpuwm1/pcpu/Madd_result_addsub0000_cy<2> (pcpuwm1/pcpu/Madd_result_addsub0000_cy<2>)
MUXCY:CI->O 1 0.064 0.000 pcpuwm1/pcpu/Madd_result_addsub0000_cy<3> (pcpuwm1/pcpu/Madd_result_addsub0000_cy<3>)
MUXCY:CI->O 1 0.064 0.000 pcpuwm1/pcpu/Madd_result_addsub0000_cy<4> (pcpuwm1/pcpu/Madd_result_addsub0000_cy<4>)
MUXCY:CI->O 1 0.064 0.000 pcpuwm1/pcpu/Madd_result_addsub0000_cy<5> (pcpuwm1/pcpu/Madd_result_addsub0000_cy<5>)
MUXCY:CI->O 1 0.064 0.000 pcpuwm1/pcpu/Madd_result_addsub0000_cy<6> (pcpuwm1/pcpu/Madd_result_addsub0000_cy<6>)
MUXCY:CI->O 1 0.064 0.000 pcpuwm1/pcpu/Madd_result_addsub0000_cy<7> (pcpuwm1/pcpu/Madd_result_addsub0000_cy<7>)
MUXCY:CI->O 1 0.064 0.000 pcpuwm1/pcpu/Madd_result_addsub0000_cy<8> (pcpuwm1/pcpu/Madd_result_addsub0000_cy<8>)
XORCY:CI->O 2 0.904 1.216 pcpuwm1/pcpu/Madd_result_addsub0000_xor<9> (pcpuwm1/pcpu/result_addsub0000<9>)
LUT1:I0->O 1 0.551 0.000 pcpuwm1/pcpu/Madd_result_add0001_Madd_cy<9>_rt (pcpuwm1/pcpu/Madd_result_add0001_Madd_cy<9>_rt)
MUXCY:S->O 1 0.500 0.000 pcpuwm1/pcpu/Madd_result_add0001_Madd_cy<9> (pcpuwm1/pcpu/Madd_result_add0001_Madd_cy<9>)
XORCY:CI->O 1 0.904 0.827 pcpuwm1/pcpu/Madd_result_add0001_Madd_xor<10> (pcpuwm1/pcpu/result_add0001<10>)
//GREEN section
LUT4:I3->O 1 0.551 0.827 pcpuwm1/pcpu/result<10>151_SW0_SW0 (N447)
LUT4:I3->O 2 0.551 0.903 pcpuwm1/pcpu/result<10>151 (pcpuwm1/pcpu/result<10>)
LUT4:I3->O 1 0.551 0.996 pcpuwm1/pcpu/wZF17 (pcpuwm1/pcpu/wZF17)
LUT4:I1->O 1 0.551 0.000 pcpuwm1/pcpu/wZF99 (pcpuwm1/pcpu/wZF)
FDCE:D 0.203 pcpuwm1/pcpu/ZF
----------------------------------------
Total 22.679ns (10.176ns logic, 12.503ns route)
(44.9% logic, 55.1% route)
The RED section is my data forwarding unit, which detects when to forward data from the WB stage to the EX stage.
The ORANGE section is my ALU arithmetic operation.
The GREEN section checks and updates the Zero Flag register.
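Adding up the gate and net delays for each of the three sections (red, orange, green) in the listing gives a rough idea of where the 22.679 ns goes. These are my own sums from the report, with the 0.203 ns at the FDCE:D input counted in the green section, so they are worth double-checking:

```latex
\begin{align*}
t_{\mathrm{red}} &= 11.209\,\mathrm{ns} \approx 49.4\% \\
t_{\mathrm{orange}} &= 6.337\,\mathrm{ns} \approx 27.9\% \\
t_{\mathrm{green}} &= 5.133\,\mathrm{ns} \approx 22.6\% \\
t_{\mathrm{total}} &= 11.209 + 6.337 + 5.133 = 22.679\,\mathrm{ns}
\end{align*}
```

On this reading, the forwarding logic alone accounts for about half of the critical path, more than the ALU add and the zero-flag check individually.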