mathswork said: OK, I have fixed it. Please re-download it. There is some text explaining it on my blog: h**p://free-arm.blog.163.com
mathswork said: Hi, kel8157,
Of course I have many testbenches for it, each aimed at a single instruction. That was the only way to check it before. But now I have a "Hex" file produced by "Keil for ARM". I know little about embedded programming, and I have to learn C programming to make it work.
Recently I had a great success. I compiled one of the "Keil for ARM" examples, "Blinky", and simulated it in ModelSim. It works perfectly. As the next step, I will download it to an FPGA.
If you have "Keil for ARM", you can easily find this example in "D:\Keil\ARM\RV30\Examples\Blinky". Before compiling, set "PLL_SETUP EQU 0" and "MAM_SETUP EQU 0" in lines 117 and 37 of "startup.s".
I have a version for you to simulate, if you are interested.
You can drag the internal registers, which I named "reg_r0" to "reg_rf", into the wave window. You can also find "reg_re_usr", "cpsr_i", and "cpsr_m" there.
mathswork said: Hi, kel8157,
It is very easy to connect this core to a single-port RAM. I am familiar with dual-port RAM, so this version was made for that. I could modify a few lines; it is easy.
The critical path is a big problem for me. I think that is why ARM calls its core a low-power core: they must have had no way to increase its frequency. That is also why ARM9 has a five-stage pipeline, two more stages than ARM7. With that in mind, dividing 40 ns by 3, my future "arm9" would have a critical path of about 40/3 = 13 ns.
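To put numbers on the estimate above, here is a minimal Python sketch that converts a critical path into a maximum clock frequency. It assumes the idealised case in which the logic splits into equal pieces with no register overhead, which real designs rarely achieve. The function names are hypothetical and not part of the core.

```python
def max_clock_mhz(critical_path_ns):
    """Highest clock frequency (MHz) that a given critical path (ns) allows."""
    return 1000.0 / critical_path_ns


def ideal_split_ns(critical_path_ns, pieces):
    """Critical path after an ideal, perfectly balanced split (assumption)."""
    return critical_path_ns / pieces


current_ns = 40.0                           # figure quoted in the post
target_ns = ideal_split_ns(current_ns, 3)   # the 40/3 estimate

print(f"{current_ns:.1f} ns -> {max_clock_mhz(current_ns):.1f} MHz")
print(f"{target_ns:.1f} ns -> {max_clock_mhz(target_ns):.1f} MHz")
```

In other words, 40 ns corresponds to a 25 MHz clock, and the hoped-for 13 ns to roughly 75 MHz.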
I have another version, based on the former one. The former's multiplier is 32x32. I replaced it with an 8x8 multiplier, which means a MUL/SMUAL instruction needs more cycles to complete than before. For ASIC, this version is very successful: in SMIC 0.18 um, the former has a critical path of more than 15 ns, but this version runs at 6~7 ns. For FPGA, however, it fails to reduce the critical path significantly. Because a 32x32 multiplier is a dedicated component in an FPGA, an 8x8 multiplier plus some MUXes is not much better than a 32x32 multiplier.
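To illustrate why a smaller multiplier costs extra cycles, here is a rough Python model of building a 32x32 product out of 8x8 partial products. It assumes a plain schoolbook decomposition with one 8x8 product per step; the post does not describe the core's real schedule, so the step count is only indicative. The names `mul8x8` and `mul32_via_8x8` are made up for this sketch.

```python
MASK8 = 0xFF
MASK32 = 0xFFFFFFFF


def mul8x8(a, b):
    """Model of the small hardware multiplier: 8-bit by 8-bit, 16-bit result."""
    assert 0 <= a <= MASK8 and 0 <= b <= MASK8
    return a * b


def mul32_via_8x8(x, y):
    """Unsigned 32x32 multiply using only 8x8 products.

    Returns (64-bit product, number of 8x8 products used).
    """
    x &= MASK32
    y &= MASK32
    acc = 0
    steps = 0
    for i in range(4):                      # byte i of x
        xb = (x >> (8 * i)) & MASK8
        for j in range(4):                  # byte j of y
            yb = (y >> (8 * j)) & MASK8
            acc += mul8x8(xb, yb) << (8 * (i + j))   # shift into position
            steps += 1
    return acc, steps


product, steps = mul32_via_8x8(0x12345678, 0x9ABCDEF0)
print(hex(product), steps)
```

A full 64-bit result takes 16 partial products in this model. If only the low 32 bits are wanted, as for a plain MUL, the 10 products with i + j <= 3 are enough.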
`define DEL 1
`timescale 1 ns/1 ns
module tb_test;
reg clk;
reg rst;
reg cpu_en;
reg cpu_restart;
reg irq;
reg fiq;
wire rom_en;
wire [31:0] rom_addr;
reg [31:0] rom_data;
reg rom_abort;
wire ram_en;
wire ram_wr_en;
wire [31:0] ram_addr;
wire [31:0] ram_wr_data;
reg [31:0] ram_rd_data;
reg ram_rd_abort;
reg [127:0] rom_tmp [2047:0];
reg [7:0] rom_all [32767:0];
arm u_arm (
.clk ( clk ), //System clock
.rst ( rst ), //System reset pins, high valid
.cpu_en ( cpu_en ), //Cpu enable signal, high valid, low level suspends cpu.
.cpu_restart ( cpu_restart ), //To restart cpu, high valid.
.irq ( irq ), //IRQ interrupt enable signal, high valid
.fiq ( fiq ), //FIQ interrupt enable signal, high valid
.rom_en ( rom_en ), //Instruction ROM enable signal
.rom_addr ( rom_addr ), //Instruction ROM's 32-bit address
.rom_data ( rom_data ), //Instruction stored in rom
.rom_abort ( rom_abort ), //This instruction is invalid if this signal keeps high.
.ram_en ( ram_en ), //Ram read enable signal, low=select
.ram_wr_en ( ram_wr_en ), //Ram write enable signal, low=write, high=read
.ram_addr ( ram_addr ), //Ram read address
.ram_wr_data ( ram_wr_data ), //Ram write data signals.
.ram_rd_data ( ram_rd_data ), //Ram read data signals
.ram_rd_abort ( ram_rd_abort ) //Data on "ram_rd_data" is invalid if it keeps high
);
initial begin
clk = 1'b0;
cpu_en = 1'b0;
cpu_restart = 1'b0;
rom_abort = 1'b0;
irq = 1'b0;
fiq = 1'b0;
rst = 1'b0;
#10 rst = 1'b1;
#20 rst = 1'b0;
cpu_en = 1'b1;
cpu_restart = 1'b1;
#10 cpu_restart = 1'b0;
end
always clk = #5 ~clk;
// ROM section. It uses an enable signal, which is easier to work with flash or PROM.
// The data-side read from ROM (when ram_addr[31:28]==4'h0) still needs modification;
// otherwise arbitration with the instruction fetch must be implemented.
reg [31:0] ram_rd_data_from_rom; // ROM contents read through the data port
always @ (posedge clk) begin
    if (rom_en) begin
        rom_data <= #`DEL { rom_all[rom_addr+2'd3],rom_all[rom_addr+2'd2],rom_all[rom_addr+2'd1],rom_all[rom_addr]};
    end
    else if ( ram_addr[31:28]==4'h0 ) begin
        ram_rd_data_from_rom <= #`DEL { rom_all[ram_addr+2'd3],rom_all[ram_addr+2'd2],rom_all[ram_addr+2'd1],rom_all[ram_addr]};
    end
end
// RAM section, using standard single port RAM.
reg [7:0] ram_data [2047:0];
integer i;
initial begin
    ram_rd_abort = 1'b0;
    for ( i=0; i<2048;i=i+1 )
        ram_data[i] = 8'h0;
end
always @ ( posedge clk ) begin
    // Read: ram_wr_en high means read
    if ( ram_en & (ram_wr_en == 1'b1)) begin
        if ( ram_addr[31:28]==4'h4 )
            ram_rd_data <= #`DEL { ram_data[ram_addr[10:0]+3],ram_data[ram_addr[10:0]+2],ram_data[ram_addr[10:0]+1],ram_data[ram_addr[10:0]]};
        else if ( ram_addr[31:28]==4'h0 ) // If this read is moved into ROM interface, it's easier for RAM.
            ram_rd_data <= #`DEL ram_rd_data_from_rom;
    end
    // Write: ram_wr_en low means write; bytes are stored little-endian
    if ( ram_en & (ram_wr_en == 1'b0) & ( ram_addr[31:28]==4'h4 ) ) begin
        ram_data[ram_addr[10:0]+3] <= #`DEL ram_wr_data[31:24];
        ram_data[ram_addr[10:0]+2] <= #`DEL ram_wr_data[23:16];
        ram_data[ram_addr[10:0]+1] <= #`DEL ram_wr_data[15:8];
        ram_data[ram_addr[10:0]] <= #`DEL ram_wr_data[7:0];
    end
end
/**************************************************************/
parameter memLoadFile = "./data_test/keil_03.bin";
integer n, j;
reg [127:0] tmp;
initial begin
if (memLoadFile != "") begin
$readmemh(memLoadFile, rom_tmp); // $readmemh expects hex text, 32 hex digits (16 bytes) per word: copy the HEX section and fill the vacant bytes in the last row with xx
for (n=0; n<2048;n=n+1) begin
tmp = rom_tmp[n];
rom_all[n*16+15] = tmp[07:00];
rom_all[n*16+14] = tmp[15:08];
rom_all[n*16+13] = tmp[23:16];
rom_all[n*16+12] = tmp[31:24];
rom_all[n*16+11] = tmp[39:32];
rom_all[n*16+10] = tmp[47:40];
rom_all[n*16+9 ] = tmp[55:48];
rom_all[n*16+8 ] = tmp[63:56];
rom_all[n*16+7 ] = tmp[71:64];
rom_all[n*16+6 ] = tmp[79:72];
rom_all[n*16+5 ] = tmp[87:80];
rom_all[n*16+4 ] = tmp[95:88];
rom_all[n*16+3 ] = tmp[103:96];
rom_all[n*16+2 ] = tmp[111:104];
rom_all[n*16+1 ] = tmp[119:112];
rom_all[n*16+0 ] = tmp[127:120];
$display("IN %h", tmp);
$display("OUT %h%h%h%h%h%h%h%h%h%h%h%h%h%h%h%h",
rom_all[n*16+0 ],
rom_all[n*16+1 ],
rom_all[n*16+2 ],
rom_all[n*16+3 ],
rom_all[n*16+4 ],
rom_all[n*16+5 ],
rom_all[n*16+6 ],
rom_all[n*16+7 ],
rom_all[n*16+8 ],
rom_all[n*16+9 ],
rom_all[n*16+10],
rom_all[n*16+11],
rom_all[n*16+12],
rom_all[n*16+13],
rom_all[n*16+14],
rom_all[n*16+15]);
end
// $readmemh(memLoadFile, rom_all);
end
end
endmodule
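The testbench loads its ROM image with $readmemh into 128-bit words, so the file has to be hex text with 16 bytes per row and the last row padded with xx. A small Python helper can prepare such a file. This is only a sketch: it assumes you start from a raw binary image (not the Intel HEX text that Keil emits), and the function name and the file names in the usage note are hypothetical.

```python
def bin_to_readmemh_rows(data, row_bytes=16):
    """Split a raw image into $readmemh rows of row_bytes bytes each.

    The first byte of each row is written first, so it lands in the top
    bits of the 128-bit word, which the testbench maps to rom_all[n*16+0].
    A short last row is padded with 'xx' (unknown) digits.
    """
    rows = []
    for offset in range(0, len(data), row_bytes):
        chunk = data[offset:offset + row_bytes]
        rows.append(chunk.hex() + "xx" * (row_bytes - len(chunk)))
    return rows


# Small self-check with 18 dummy bytes: one full row and one padded row.
demo = bin_to_readmemh_rows(bytes(range(18)))
for row in demo:
    print(row)
```

To convert a real image, read it with open("blinky.bin", "rb").read(), pass the bytes to bin_to_readmemh_rows, and write the rows one per line to the file named in memLoadFile.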
mathswork said: kel8157,
Hi, I can do that.
Attached are some versions I did recently.
If you have any problems, please mail me.
mathswork said: Hi, kel8157,
I have downloaded it to my FPGA board: Digilent's Spartan-3E Starter Kit.
I use the UART to download a bin file to the ROM, and I can see it work immediately. That means the FPGA board has become an ARM development board.
The new core is much improved. The critical path is only 26 ns (it was 40 ns before).
How about you? Do you need help?