transfer data from spi master to spi slave.

Ananhasaasneh77 · May 6, 2016

how can i send data from master fpga to slave fpga .. what i mean is the master sends data slave catch .. slave send data master catch.. i assigned the mosi miso sck ss for the fpga master and the slave fpga in pmod connector .. what should i do next?

here is top module code spi_loopback.

Code VHDL - [expand]
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
----------------------------------------------------------------------------------
-- Company: 
-- Engineer: 
-- 
-- Create Date:    23:44:37 05/17/2011 
-- Design Name: 
-- Module Name:    spi_loopback - Behavioral 
-- Project Name: 
-- Target Devices: 
-- Tool versions: 
-- Description: 
--          This is a simple wrapper for the 'spi_master' and 'spi_slave' cores, to synthesize the 2 cores and
--          test them in the simulator.
--
-- Dependencies: 
--
-- Revision: 
-- Revision 0.01 - File Created
-- Additional Comments: 
--
----------------------------------------------------------------------------------
library ieee;
use ieee.std_logic_1164.all;
 
library work;
use work.all;
 
entity spi_loopback is
    Generic (   
        N : positive := 32;                                         -- 32bit serial word length is default
        CPOL : std_logic := '0';                                    -- SPI mode selection (mode 0 default)
        CPHA : std_logic := '1';                                    -- CPOL = clock polarity, CPHA = clock phase.
        PREFETCH : positive := 2;                                   -- prefetch lookahead cycles
        SPI_2X_CLK_DIV : positive := 5                              -- for a 100MHz sclk_i, yields a 10MHz SCK
        );                                  
    Port(
        ----------------MASTER-----------------------
        m_clk_i : IN std_logic;
        m_rst_i : IN std_logic;
        m_spi_ssel_o : OUT std_logic;
        m_spi_sck_o : OUT std_logic;
        m_spi_mosi_o : OUT std_logic;
        m_spi_miso_i : IN std_logic;
        m_di_req_o : OUT std_logic;
        m_di_i : IN std_logic_vector(N-1 downto 0);
        m_wren_i : IN std_logic;          
        m_do_valid_o : OUT std_logic;
        m_do_o : OUT std_logic_vector(N-1 downto 0);
        ----- debug -----
        m_do_transfer_o : OUT std_logic;
        m_wren_o : OUT std_logic;
        m_wren_ack_o : OUT std_logic;
        m_rx_bit_reg_o : OUT std_logic;
        m_state_dbg_o : OUT std_logic_vector(3 downto 0);
        m_core_clk_o : OUT std_logic;
        m_core_n_clk_o : OUT std_logic;
        m_sh_reg_dbg_o : OUT std_logic_vector(N-1 downto 0);
        ----------------SLAVE-----------------------
        s_clk_i : IN std_logic;
        s_spi_ssel_i : IN std_logic;
        s_spi_sck_i : IN std_logic;
        s_spi_mosi_i : IN std_logic;
        s_spi_miso_o : OUT std_logic;
        s_di_req_o : OUT std_logic;                                         -- preload lookahead data request line
        s_di_i : IN std_logic_vector (N-1 downto 0) := (others => 'X');     -- parallel load data in (clocked in on rising edge of clk_i)
        s_wren_i : IN std_logic := 'X';                                     -- user data write enable
        s_do_valid_o : OUT std_logic;                                       -- do_o data valid strobe, valid during one clk_i rising edge.
        s_do_o : OUT std_logic_vector (N-1 downto 0);                       -- parallel output (clocked out on falling clk_i)
        ----- debug -----
        s_do_transfer_o : OUT std_logic;                                    -- debug: internal transfer driver
        s_wren_o : OUT std_logic;
        s_wren_ack_o : OUT std_logic;
        s_rx_bit_reg_o : OUT std_logic;
        s_state_dbg_o : OUT std_logic_vector (3 downto 0)                   -- debug: internal state register
--      s_sh_reg_dbg_o : OUT std_logic_vector (N-1 downto 0)                -- debug: internal shift register
        );
end spi_loopback;
 
architecture Structural of spi_loopback is
begin
 
    --=============================================================================================
    -- Component instantiation for the SPI master port
    --=============================================================================================
    Inst_spi_master: entity work.spi_master(rtl)
        generic map (N => N, CPOL => CPOL, CPHA => CPHA, PREFETCH => PREFETCH, SPI_2X_CLK_DIV => SPI_2X_CLK_DIV)
        port map( 
            sclk_i => m_clk_i,                      -- system clock is used for serial and parallel ports
            pclk_i => m_clk_i,
            rst_i => m_rst_i,
            spi_ssel_o => m_spi_ssel_o,
            spi_sck_o => m_spi_sck_o,
            spi_mosi_o => m_spi_mosi_o,
            spi_miso_i => m_spi_miso_i,
            di_req_o => m_di_req_o,
            di_i => m_di_i,
            wren_i => m_wren_i,
            do_valid_o => m_do_valid_o,
            do_o => m_do_o,
            ----- debug -----
            do_transfer_o => m_do_transfer_o,
            wren_o => m_wren_o,
            wr_ack_o => m_wren_ack_o,
            rx_bit_reg_o => m_rx_bit_reg_o,
            state_dbg_o => m_state_dbg_o,
            core_clk_o => m_core_clk_o,
            core_n_clk_o => m_core_n_clk_o,
            sh_reg_dbg_o => m_sh_reg_dbg_o
        );
 
    --=============================================================================================
    -- Component instantiation for the SPI slave port
    --=============================================================================================
    Inst_spi_slave: entity work.spi_slave(rtl)
        generic map (N => N, CPOL => CPOL, CPHA => CPHA, PREFETCH => PREFETCH)
        port map(
            clk_i => s_clk_i,
            spi_ssel_i => s_spi_ssel_i,
            spi_sck_i => s_spi_sck_i,
            spi_mosi_i => s_spi_mosi_i,
            spi_miso_o => s_spi_miso_o,
            di_req_o => s_di_req_o,
            di_i => s_di_i,
            wren_i => s_wren_i,
            do_valid_o => s_do_valid_o,
            do_o => s_do_o,
            ----- debug -----
            do_transfer_o => s_do_transfer_o,
            wren_o => s_wren_o,
            wr_ack_o => s_wren_ack_o,
            rx_bit_next_o => s_rx_bit_reg_o,
            state_dbg_o => s_state_dbg_o
--            sh_reg_dbg_o => s_sh_reg_dbg_o
        );
 
end Structural;

here the spi master code

Code VHDL - [expand]
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
-----------------------------------------------------------------------------------------------------------------------
-- Author:          Jonny Doin, [email]jdoin@opencores.org[/email], [email]jonnydoin@gmail.com[/email]
-- 
-- Create Date:     12:18:12 04/25/2011 
-- Module Name:     SPI_MASTER - RTL
-- Project Name:    SPI MASTER / SLAVE INTERFACE
-- Target Devices:  Spartan-6
-- Tool versions:   ISE 13.1
-- Description: 
--
--      This block is the SPI master interface, implemented in one single entity.
--      All internal core operations are synchronous to the 'sclk_i', and a spi base clock is generated by dividing sclk_i downto
--      a frequency that is 2x the spi SCK line frequency. The divider value is passed as a generic parameter during instantiation.
--      All parallel i/o interface operations are synchronous to the 'pclk_i' high speed clock, that can be asynchronous to the serial
--      'sclk_i' clock.
--      For optimized use of longlines, connect 'sclk_i' and 'pclk_i' to the same global clock line.
--      Fully pipelined cross-clock circuitry guarantees that no setup artifacts occur on the buffers that are accessed by the two 
--      clock domains.
--      The block is very simple to use, and has parallel inputs and outputs that behave like a synchronous memory i/o.
--      It is parameterizable via generics for the data width ('N'), SPI mode (CPHA and CPOL), lookahead prefetch signaling 
--      ('PREFETCH'), and spi base clock division from sclk_i ('SPI_2X_CLK_DIV').
--
--      SPI CLOCK GENERATION
--      ====================
--
--      The clock generation for the SPI SCK is derived from the high-speed 'sclk_i' clock. The core divides this reference 
--      clock to form the SPI base clock, by the 'SPI_2X_CLK_DIV' generic parameter. The user must set the divider value for the 
--      SPI_2X clock, which is 2x the desired SCK frequency. 
--      All registers in the core are clocked by the high-speed clocks, and clock enables are used to run the FSM and other logic
--      at lower rates. This architecture preserves FPGA clock resources like global clock buffers, and avoids path delays caused
--      by combinatorial clock dividers outputs.
--      The core has async clock domain circuitry to handle asynchronous clocks for the SPI and parallel interfaces.
--
--      PARALLEL WRITE INTERFACE
--      ========================
--      The parallel interface has an input port 'di_i' and an output port 'do_o'.
--      Parallel load is controlled using 3 signals: 'di_i', 'di_req_o' and 'wren_i'. 'di_req_o' is a look ahead data request line,
--      that is set 'PREFETCH' clock cycles in advance to synchronize a pipelined memory or fifo to present the 
--      next input data at 'di_i' in time to have continuous clock at the spi bus, to allow back-to-back continuous load.
--      For a pipelined sync RAM, a PREFETCH of 2 cycles allows an address generator to present the new adress to the RAM in one
--      cycle, and the RAM to respond in one more cycle, in time for 'di_i' to be latched by the shifter.
--      If the user sequencer needs a different value for PREFETCH, the generic can be altered at instantiation time.
--      The 'wren_i' write enable strobe must be valid at least one setup time before the rising edge of the last SPI clock cycle,
--      if continuous transmission is intended. If 'wren_i' is not valid 2 SPI clock cycles after the last transmitted bit, the interface
--      enters idle state and deasserts SSEL.
--      When the interface is idle, 'wren_i' write strobe loads the data and starts transmission. 'di_req_o' will strobe when entering 
--      idle state, if a previously loaded data has already been transferred.
--
--      PARALLEL WRITE SEQUENCE
--      =======================
--                         __    __    __    __    __    __    __ 
--      pclk_i          __/  \__/  \__/  \__/  \__/  \__/  \__/  \...     -- parallel interface clock
--                               ___________                        
--      di_req_o        ________/           \_____________________...     -- 'di_req_o' asserted on rising edge of 'pclk_i'
--                      ______________ ___________________________...
--      di_i            __old_data____X______new_data_____________...     -- user circuit loads data on 'di_i' at next 'pclk_i' rising edge
--                                                 _______                        
--      wren_i          __________________________/       \_______...     -- user strobes 'wren_i' for one cycle of 'pclk_i'
--                      
--
--      PARALLEL READ INTERFACE
--      =======================
--      An internal buffer is used to copy the internal shift register data to drive the 'do_o' port. When a complete word is received,
--      the core shift register is transferred to the buffer, at the rising edge of the spi clock, 'spi_clk'.
--      The signal 'do_valid_o' is set one 'spi_clk' clock after, to directly drive a synchronous memory or fifo write enable.
--      'do_valid_o' is synchronous to the parallel interface clock, and changes only on rising edges of 'pclk_i'.
--      When the interface is idle, data at the 'do_o' port holds the last word received.
--
--      PARALLEL READ SEQUENCE
--      ======================
--                      ______        ______        ______        ______   
--      spi_clk          bit1 \______/ bitN \______/bitN-1\______/bitN-2\__...  -- internal spi 2x base clock
--                      _    __    __    __    __    __    __    __    __  
--      pclk_i           \__/  \__/  \__/  \__/  \__/  \__/  \__/  \__/  \_...  -- parallel interface clock (may be async to sclk_i)
--                      _____________ _____________________________________...  -- 1) rx data is transferred to 'do_buffer_reg'
--      do_o            ___old_data__X__________new_data___________________...  --    after last rx bit, at rising 'spi_clk'.
--                                                   ____________               
--      do_valid_o      ____________________________/            \_________...  -- 2) 'do_valid_o' strobed for 2 'pclk_i' cycles
--                                                                              --    on the 3rd 'pclk_i' rising edge.
--
--
--      The propagation delay of spi_sck_o and spi_mosi_o, referred to the internal clock, is balanced by similar path delays,
--      but the sampling delay of spi_miso_i imposes a setup time referred to the sck signal that limits the high frequency
--      of the interface, for full duplex operation.
--
--      This design was originally targeted to a Spartan-6 platform, synthesized with XST and normal constraints.
--      The VHDL dialect used is VHDL'93, accepted largely by all synthesis tools.
--
------------------------------ COPYRIGHT NOTICE -----------------------------------------------------------------------
--                                                                   
--      This file is part of the SPI MASTER/SLAVE INTERFACE project [url]https://opencores.org/project,spi_master_slave[/url]
--                                                                   
--      Author(s):      Jonny Doin, [email]jdoin@opencores.org[/email], [email]jonnydoin@gmail.com[/email]
--                                                                   
--      Copyright (C) 2011 Jonny Doin
--      -----------------------------
--                                                                   
--      This source file may be used and distributed without restriction provided that this copyright statement is not    
--      removed from the file and that any derivative work contains the original copyright notice and the associated 
--      disclaimer. 
--                                                                   
--      This source file is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser 
--      General Public License as published by the Free Software Foundation; either version 2.1 of the License, or 
--      (at your option) any later version.
--                                                                   
--      This source is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied
--      warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more  
--      details.
--
--      You should have received a copy of the GNU Lesser General Public License along with this source; if not, download 
--      it from [url]https://www.gnu.org/licenses/lgpl.txt[/url]
--                                                                   
------------------------------ REVISION HISTORY -----------------------------------------------------------------------
--
-- 2011/04/28   v0.01.0010  [JD]    shifter implemented as a sequential process. timing problems and async issues in synthesis.
-- 2011/05/01   v0.01.0030  [JD]    changed original shifter design to a fully pipelined RTL fsmd. solved all synthesis issues.
-- 2011/05/05   v0.01.0034  [JD]    added an internal buffer register for rx_data, to allow greater liberty in data load/store.
-- 2011/05/08   v0.10.0038  [JD]    increased one state to have SSEL start one cycle before SCK. Implemented full CPOL/CPHA
--                                  logic, based on generics, and do_valid_o signal.
-- 2011/05/13   v0.20.0045  [JD]    streamlined signal names, added PREFETCH parameter, added assertions.
-- 2011/05/17   v0.80.0049  [JD]    added explicit clock synchronization circuitry across clock boundaries.
-- 2011/05/18   v0.95.0050  [JD]    clock generation circuitry, with generators for all-rising-edge clock core.
-- 2011/06/05   v0.96.0053  [JD]    changed async clear to sync resets.
-- 2011/06/07   v0.97.0065  [JD]    added cross-clock buffers, fixed fsm async glitches.
-- 2011/06/09   v0.97.0068  [JD]    reduced control sets (resets, CE, presets) to the absolute minimum to operate, to reduce
--                                  synthesis LUT overhead in Spartan-6 architecture.
-- 2011/06/11   v0.97.0075  [JD]    redesigned all parallel data interfacing ports, and implemented cross-clock strobe logic.
-- 2011/06/12   v0.97.0079  [JD]    streamlined wr_ack for all cases and eliminated unnecessary register resets.
-- 2011/06/14   v0.97.0083  [JD]    (bug CPHA effect) : redesigned SCK output circuit.
--                                  (minor bug) : removed fsm registers from (not rst_i) chip enable.
-- 2011/06/15   v0.97.0086  [JD]    removed master MISO input register, to relax MISO data setup time (to get higher speed).
-- 2011/07/09   v1.00.0095  [JD]    changed all clocking scheme to use a single high-speed clock with clock enables to control lower 
--                                  frequency sequential circuits, to preserve clocking resources and avoid path delay glitches.
-- 2011/07/10   v1.00.0098  [JD]    implemented SCK clock divider circuit to generate spi clock directly from system clock.
-- 2011/07/10   v1.10.0075  [JD]    verified spi_master_slave in silicon at 50MHz, 25MHz, 16.666MHz, 12.5MHz, 10MHz, 8.333MHz, 
--                                  7.1428MHz, 6.25MHz, 1MHz and 500kHz. The core proved very robust at all tested frequencies.
-- 2011/07/16   v1.11.0080  [JD]    verified both spi_master and spi_slave in loopback at 50MHz SPI clock.
-- 2011/07/17   v1.11.0080  [JD]    BUG: CPOL='1', CPHA='1' @50MHz causes MOSI to be shifted one bit earlier.
--                                  BUG: CPOL='0', CPHA='1' causes SCK to have one extra pulse with one sclk_i width at the end.
-- 2011/07/18   v1.12.0105  [JD]    CHG: spi sck output register changed to remove glitch at last clock when CPHA='1'.
--                                  for CPHA='1', max spi clock is 25MHz. for CPHA= '0', max spi clock is >50MHz.
-- 2011/07/24   v1.13.0125  [JD]    FIX: 'sck_ena_ce' is on half-cycle advanced to 'fsm_ce', elliminating CPHA='1' glitches.
--                                  Core verified for all CPOL, CPHA at up to 50MHz, simulates to over 100MHz.
-- 2011/07/29   v1.14.0130  [JD]    Removed global signal setting at the FSM, implementing exhaustive explicit signal attributions
--                                  for each state, to avoid reported inference problems in some synthesis engines.
--                                  Streamlined port names and indentation blocks.
-- 2011/08/01   v1.15.0135  [JD]    Fixed latch inference for spi_mosi_o driver at the fsm.
--                                  The master and slave cores were verified in FPGA with continuous transmission, for all SPI modes.
-- 2011/08/04   v1.15.0136  [JD]    Fixed assertions (PREFETCH >= 1) and minor comment bugs.
--
-----------------------------------------------------------------------------------------------------------------------
--  TODO
--  ====
--
-----------------------------------------------------------------------------------------------------------------------
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use ieee.std_logic_unsigned.all;
 
--================================================================================================================
-- SYNTHESIS CONSIDERATIONS
-- ========================
-- There are several output ports that are used to simulate and verify the core operation. 
-- Do not map any signals to the unused ports, and the synthesis tool will remove the related interfacing
-- circuitry. 
-- The same is valid for the transmit and receive ports. If the receive ports are not mapped, the
-- synthesis tool will remove the receive logic from the generated circuitry.
-- Alternatively, you can remove these ports and related circuitry once the core is verified and
-- integrated to your circuit.
--================================================================================================================
 
entity spi_master is
    Generic (   
        N : positive := 32;                                             -- 32bit serial word length is default
        CPOL : std_logic := '0';                                        -- SPI mode selection (mode 0 default)
        CPHA : std_logic := '0';                                        -- CPOL = clock polarity, CPHA = clock phase.
        PREFETCH : positive := 2;                                       -- prefetch lookahead cycles
        SPI_2X_CLK_DIV : positive := 5);                                -- for a 100MHz sclk_i, yields a 10MHz SCK
    Port (  
        sclk_i : in std_logic := 'X';                                   -- high-speed serial interface system clock
        pclk_i : in std_logic := 'X';                                   -- high-speed parallel interface system clock
        rst_i : in std_logic := 'X';                                    -- reset core
        ---- serial interface ----
        spi_ssel_o : out std_logic;                                     -- spi bus slave select line
        spi_sck_o : out std_logic;                                      -- spi bus sck
        spi_mosi_o : out std_logic;                                     -- spi bus mosi output
        spi_miso_i : in std_logic := 'X';                               -- spi bus spi_miso_i input
        ---- parallel interface ----
        di_req_o : out std_logic;                                       -- preload lookahead data request line
        di_i : in  std_logic_vector (N-1 downto 0) := (others => 'X');  -- parallel data in (clocked on rising spi_clk after last bit)
        wren_i : in std_logic := 'X';                                   -- user data write enable, starts transmission when interface is idle
        wr_ack_o : out std_logic;                                       -- write acknowledge
        do_valid_o : out std_logic;                                     -- do_o data valid signal, valid during one spi_clk rising edge.
        do_o : out  std_logic_vector (N-1 downto 0);                    -- parallel output (clocked on rising spi_clk after last bit)
        --- debug ports: can be removed or left unconnected for the application circuit ---
        sck_ena_o : out std_logic;                                      -- debug: internal sck enable signal
        sck_ena_ce_o : out std_logic;                                   -- debug: internal sck clock enable signal
        do_transfer_o : out std_logic;                                  -- debug: internal transfer driver
        wren_o : out std_logic;                                         -- debug: internal state of the wren_i pulse stretcher
        rx_bit_reg_o : out std_logic;                                   -- debug: internal rx bit
        state_dbg_o : out std_logic_vector (3 downto 0);                -- debug: internal state register
        core_clk_o : out std_logic;
        core_n_clk_o : out std_logic;
        core_ce_o : out std_logic;
        core_n_ce_o : out std_logic;
        sh_reg_dbg_o : out std_logic_vector (N-1 downto 0)              -- debug: internal shift register
    );                      
end spi_master;
 
--================================================================================================================
-- this architecture is a pipelined register-transfer description.
-- all signals are clocked at the rising edge of the system clock 'sclk_i'.
--================================================================================================================
architecture rtl of spi_master is
    -- core clocks, generated from 'sclk_i': initialized at GSR to differential values
    signal core_clk     : std_logic := '0';     -- continuous core clock, positive logic
    signal core_n_clk   : std_logic := '1';     -- continuous core clock, negative logic
    signal core_ce      : std_logic := '0';     -- core clock enable, positive logic
    signal core_n_ce    : std_logic := '1';     -- core clock enable, negative logic
    -- spi bus clock, generated from the CPOL selected core clock polarity
    signal spi_2x_ce    : std_logic := '1';     -- spi_2x clock enable
    signal spi_clk      : std_logic := '0';     -- spi bus output clock
    signal spi_clk_reg  : std_logic;            -- output pipeline delay for spi sck (do NOT global initialize)
    -- core fsm clock enables
    signal fsm_ce       : std_logic := '1';     -- fsm clock enable
    signal sck_ena_ce   : std_logic := '1';     -- SCK clock enable
    signal samp_ce      : std_logic := '1';     -- data sampling clock enable
    --
    -- GLOBAL RESET: 
    --      all signals are initialized to zero at GSR (global set/reset) by giving explicit
    --      initialization values at declaration. This is needed for all Xilinx FPGAs, and 
    --      especially for the Spartan-6 and newer CLB architectures, where a async reset can
    --      reduce the usability of the slice registers, due to the need to share the control 
    --      set (RESET/PRESET, CLOCK ENABLE and CLOCK) by all 8 registers in a slice.
    --      By using GSR for the initialization, and reducing async RESET local init to the bare
    --      essential, the model achieves better LUT/FF packing and CLB usability.
    --
    -- internal state signals for register and combinatorial stages
    signal state_next : natural range N+1 downto 0 := 0;
    signal state_reg : natural range N+1 downto 0 := 0;
    -- shifter signals for register and combinatorial stages
    signal sh_next : std_logic_vector (N-1 downto 0);
    signal sh_reg : std_logic_vector (N-1 downto 0);
    -- input bit sampled buffer
    signal rx_bit_reg : std_logic := '0';
    -- buffered di_i data signals for register and combinatorial stages
    signal di_reg : std_logic_vector (N-1 downto 0);
    -- internal wren_i stretcher for fsm combinatorial stage
    signal wren : std_logic;
    signal wr_ack_next : std_logic := '0';
    signal wr_ack_reg : std_logic := '0';
    -- internal SSEL enable control signals
    signal ssel_ena_next : std_logic := '0';
    signal ssel_ena_reg : std_logic := '0';
    -- internal SCK enable control signals
    signal sck_ena_next : std_logic;
    signal sck_ena_reg : std_logic;
    -- buffered do_o data signals for register and combinatorial stages
    signal do_buffer_next : std_logic_vector (N-1 downto 0);
    signal do_buffer_reg : std_logic_vector (N-1 downto 0);
    -- internal signal to flag transfer to do_buffer_reg
    signal do_transfer_next : std_logic := '0';
    signal do_transfer_reg : std_logic := '0';
    -- internal input data request signal 
    signal di_req_next : std_logic := '0';
    signal di_req_reg : std_logic := '0';
    -- cross-clock do_transfer_reg -> do_valid_o_reg pipeline
    signal do_valid_A : std_logic := '0';
    signal do_valid_B : std_logic := '0';
    signal do_valid_C : std_logic := '0';
    signal do_valid_D : std_logic := '0';
    signal do_valid_next : std_logic := '0';
    signal do_valid_o_reg : std_logic := '0';
    -- cross-clock di_req_reg -> di_req_o_reg pipeline
    signal di_req_o_A : std_logic := '0';
    signal di_req_o_B : std_logic := '0';
    signal di_req_o_C : std_logic := '0';
    signal di_req_o_D : std_logic := '0';
    signal di_req_o_next : std_logic := '1';
    signal di_req_o_reg : std_logic := '1';
begin
    --=============================================================================================
    --  GENERICS CONSTRAINTS CHECKING
    --=============================================================================================
    -- minimum word width is 8 bits
    assert N >= 8
    report "Generic parameter 'N' (shift register size) needs to be 8 bits minimum"
    severity FAILURE;
    -- minimum prefetch lookahead check
    assert PREFETCH >= 1
    report "Generic parameter 'PREFETCH' (lookahead count) needs to be 1 minimum"
    severity FAILURE;
    -- maximum prefetch lookahead check
    assert PREFETCH <= N-5
    report "Generic parameter 'PREFETCH' (lookahead count) out of range, needs to be N-5 maximum"
    severity FAILURE;
    -- SPI_2X_CLK_DIV clock divider value must not be zero
    assert SPI_2X_CLK_DIV > 0
    report "Generic parameter 'SPI_2X_CLK_DIV' must not be zero"
    severity FAILURE;
 
    --=============================================================================================
    --  CLOCK GENERATION
    --=============================================================================================
    -- In order to preserve global clocking resources, the core clocking scheme is completely based 
    -- on using clock enables to process the serial high-speed clock at lower rates for the core fsm,
    -- the spi clock generator and the input sampling clock.
    -- The clock generation block derives 2 continuous antiphase signals from the 2x spi base clock 
    -- for the core clocking.
    -- The 2 clock phases are generated by separate and synchronous FFs, and should have only 
    -- differential interconnect delay skew.
    -- Clock enable signals are generated with the same phase as the 2 core clocks, and these clock 
    -- enables are used to control clocking of all internal synchronous circuitry. 
    -- The clock enable phase is selected for serial input sampling, fsm clocking, and spi SCK output, 
    -- based on the configuration of CPOL and CPHA.
    -- Each phase is selected so that all the registers can be clocked with a rising edge on all SPI
    -- modes, by a single high-speed global clock, preserving clock resources and clock to data skew.
    -----------------------------------------------------------------------------------------------
    -- generate the 2x spi base clock enable from the serial high-speed input clock
    spi_2x_ce_gen_proc: process (sclk_i) is
        variable clk_cnt : integer range SPI_2X_CLK_DIV-1 downto 0 := 0;
    begin
        if sclk_i'event and sclk_i = '1' then
            if clk_cnt = SPI_2X_CLK_DIV-1 then
                spi_2x_ce <= '1';
                clk_cnt := 0;
            else
                spi_2x_ce <= '0';
                clk_cnt := clk_cnt + 1;
            end if;
        end if;
    end process spi_2x_ce_gen_proc;
    -----------------------------------------------------------------------------------------------
    -- generate the core antiphase clocks and clock enables from the 2x base CE.
    core_clock_gen_proc : process (sclk_i) is
    begin
        if sclk_i'event and sclk_i = '1' then
            if spi_2x_ce = '1' then
                -- generate the 2 antiphase core clocks
                core_clk <= core_n_clk;
                core_n_clk <= not core_n_clk;
                -- generate the 2 phase core clock enables
                core_ce <= core_n_clk;
                core_n_ce <= not core_n_clk;
            else
                core_ce <= '0';
                core_n_ce <= '0';
            end if;
        end if;
    end process core_clock_gen_proc;
 
    --=============================================================================================
    --  GENERATE BLOCKS
    --=============================================================================================
    -- spi clk generator: generate spi_clk from core_clk depending on CPOL
    spi_sck_cpol_0_proc: if CPOL = '0' generate
    begin
        spi_clk <= core_clk;            -- for CPOL=0, spi clk has idle LOW
    end generate;
    
    spi_sck_cpol_1_proc: if CPOL = '1' generate
    begin
        spi_clk <= core_n_clk;          -- for CPOL=1, spi clk has idle HIGH
    end generate;
    -----------------------------------------------------------------------------------------------
    -- Sampling clock enable generation: generate 'samp_ce' from 'core_ce' or 'core_n_ce' depending on CPHA
    -- always sample data at the half-cycle of the fsm update cell
    samp_ce_cpha_0_proc: if CPHA = '0' generate
    begin
        samp_ce <= core_ce;
    end generate;
        
    samp_ce_cpha_1_proc: if CPHA = '1' generate
    begin
        samp_ce <= core_n_ce;
    end generate;
    -----------------------------------------------------------------------------------------------
    -- FSM clock enable generation: generate 'fsm_ce' from core_ce or core_n_ce depending on CPHA
    fsm_ce_cpha_0_proc: if CPHA = '0' generate
    begin
        fsm_ce <= core_n_ce;            -- for CPHA=0, latch registers at rising edge of negative core clock enable
    end generate;
    
    fsm_ce_cpha_1_proc: if CPHA = '1' generate
    begin
        fsm_ce <= core_ce;              -- for CPHA=1, latch registers at rising edge of positive core clock enable
    end generate;
    -----------------------------------------------------------------------------------------------
    -- sck enable control: control sck advance phase for CPHA='1' relative to fsm clock
    sck_ena_ce <= core_n_ce;            -- for CPHA=1, SCK is advanced one-half cycle
    
    --=============================================================================================
    --  REGISTERED INPUTS
    --=============================================================================================
    -- rx bit flop: capture rx bit after SAMPLE edge of sck
    rx_bit_proc : process (sclk_i, spi_miso_i) is
    begin
        if sclk_i'event and sclk_i = '1' then
            if samp_ce = '1' then
                rx_bit_reg <= spi_miso_i;
            end if;
        end if;
    end process rx_bit_proc;
 
    --=============================================================================================
    --  CROSS-CLOCK PIPELINE TRANSFER LOGIC
    --=============================================================================================
    -- do_valid_o and di_req_o strobe output logic
    -- this is a delayed pulse generator with a ripple-transfer FFD pipeline, that generates a 
    -- fixed-length delayed pulse for the output flags, at the parallel clock domain
    out_transfer_proc : process ( pclk_i, do_transfer_reg, di_req_reg, 
                                  do_valid_A, do_valid_B, do_valid_D, 
                                  di_req_o_A, di_req_o_B, di_req_o_D ) is
    begin
        if pclk_i'event and pclk_i = '1' then               -- clock at parallel port clock
            -- do_transfer_reg -> do_valid_o_reg
            do_valid_A <= do_transfer_reg;                  -- the input signal must be at least 2 clocks long
            do_valid_B <= do_valid_A;                       -- feed it to a ripple chain of FFDs
            do_valid_C <= do_valid_B;
            do_valid_D <= do_valid_C;
            do_valid_o_reg <= do_valid_next;                -- registered output pulse
            --------------------------------
            -- di_req_reg -> di_req_o_reg
            di_req_o_A <= di_req_reg;                       -- the input signal must be at least 2 clocks long
            di_req_o_B <= di_req_o_A;                       -- feed it to a ripple chain of FFDs
            di_req_o_C <= di_req_o_B;                           
            di_req_o_D <= di_req_o_C;                           
            di_req_o_reg <= di_req_o_next;                  -- registered output pulse
        end if;
        -- generate a 2-clocks pulse at the 3rd clock cycle
        do_valid_next <= do_valid_A and do_valid_B and not do_valid_D;
        di_req_o_next <= di_req_o_A and di_req_o_B and not di_req_o_D;
    end process out_transfer_proc;
    -- parallel load input registers: data register and write enable
    in_transfer_proc: process ( pclk_i, wren_i, wr_ack_reg ) is
    begin
        -- registered data input, input register with clock enable
        if pclk_i'event and pclk_i = '1' then
            if wren_i = '1' then
                di_reg <= di_i;                             -- parallel data input buffer register
            end if;
        end  if;
        -- stretch wren pulse to be detected by spi fsm (ffd with sync preset and sync reset)
        if pclk_i'event and pclk_i = '1' then
            if wren_i = '1' then                            -- wren_i is the sync preset for wren
                wren <= '1';
            elsif wr_ack_reg = '1' then                     -- wr_ack is the sync reset for wren
                wren <= '0';
            end if;
        end  if;
    end process in_transfer_proc;
 
    --=============================================================================================
    --  REGISTER TRANSFER PROCESSES
    --=============================================================================================
    -- fsm state and data registers: synchronous to the spi base reference clock
    core_reg_proc : process (sclk_i) is
    begin
        -- FF registers clocked on rising edge and cleared on sync rst_i
        if sclk_i'event and sclk_i = '1' then
            if rst_i = '1' then                             -- sync reset
                state_reg <= 0;                             -- only provide local reset for the state machine
            elsif fsm_ce = '1' then                         -- fsm_ce is clock enable for the fsm
                state_reg <= state_next;                    -- state register
            end if;
        end if;
        -- FF registers clocked synchronous to the fsm state
        if sclk_i'event and sclk_i = '1' then
            if fsm_ce = '1' then
                sh_reg <= sh_next;                          -- shift register
                ssel_ena_reg <= ssel_ena_next;              -- spi select enable
                do_buffer_reg <= do_buffer_next;            -- registered output data buffer 
                do_transfer_reg <= do_transfer_next;        -- output data transferred to buffer
                di_req_reg <= di_req_next;                  -- input data request
                wr_ack_reg <= wr_ack_next;                  -- write acknowledge for data load synchronization
            end if;
        end if;
        -- FF registers clocked one-half cycle earlier than the fsm state
        if sclk_i'event and sclk_i = '1' then
            if sck_ena_ce = '1' then
                sck_ena_reg <= sck_ena_next;                -- spi clock enable: look ahead logic
            end if;
        end if;
    end process core_reg_proc;
 
    --=============================================================================================
    --  COMBINATORIAL LOGIC PROCESSES
    --=============================================================================================
    -- state and datapath combinatorial logic
    core_combi_proc : process ( sh_reg, state_reg, rx_bit_reg, ssel_ena_reg, sck_ena_reg, do_buffer_reg, 
                                do_transfer_reg, wr_ack_reg, di_req_reg, di_reg, wren ) is
    begin
        sh_next <= sh_reg;                                              -- all output signals are assigned to (avoid latches)
        ssel_ena_next <= ssel_ena_reg;                                  -- controls the slave select line
        sck_ena_next <= sck_ena_reg;                                    -- controls the clock enable of spi sck line
        do_buffer_next <= do_buffer_reg;                                -- output data buffer
        do_transfer_next <= do_transfer_reg;                            -- output data flag
        wr_ack_next <= wr_ack_reg;                                      -- write acknowledge
        di_req_next <= di_req_reg;                                      -- prefetch data request
        spi_mosi_o <= sh_reg(N-1);                                      -- default to avoid latch inference
        state_next <= state_reg;                                        -- next state 
        case state_reg is
        
            when (N+1) =>                                               -- this state is to enable SSEL before SCK
                spi_mosi_o <= sh_reg(N-1);                              -- shift out tx bit from the MSb
                ssel_ena_next <= '1';                                   -- tx in progress: will assert SSEL
                sck_ena_next <= '1';                                    -- enable SCK on next cycle (stays off on first SSEL clock cycle)
                di_req_next <= '0';                                     -- prefetch data request: deassert when shifting data
                wr_ack_next <= '0';                                     -- remove write acknowledge for all but the load stages
                state_next <= state_reg - 1;                            -- update next state at each sck pulse
                
            when (N) =>                                                 -- deassert 'di_rdy' and stretch do_valid
                spi_mosi_o <= sh_reg(N-1);                              -- shift out tx bit from the MSb
                di_req_next <= '0';                                     -- prefetch data request: deassert when shifting data
                sh_next(N-1 downto 1) <= sh_reg(N-2 downto 0);          -- shift inner bits
                sh_next(0) <= rx_bit_reg;                               -- shift in rx bit into LSb
                wr_ack_next <= '0';                                     -- remove write acknowledge for all but the load stages
                state_next <= state_reg - 1;                            -- update next state at each sck pulse
                
            when (N-1) downto (PREFETCH+3) =>                           -- remove 'do_transfer' and shift bits
                spi_mosi_o <= sh_reg(N-1);                              -- shift out tx bit from the MSb
                di_req_next <= '0';                                     -- prefetch data request: deassert when shifting data
                do_transfer_next <= '0';                                -- reset 'do_valid' transfer signal
                sh_next(N-1 downto 1) <= sh_reg(N-2 downto 0);          -- shift inner bits
                sh_next(0) <= rx_bit_reg;                               -- shift in rx bit into LSb
                wr_ack_next <= '0';                                     -- remove write acknowledge for all but the load stages
                state_next <= state_reg - 1;                            -- update next state at each sck pulse
                
            when (PREFETCH+2) downto 2 =>                               -- raise prefetch 'di_req_o' signal
                spi_mosi_o <= sh_reg(N-1);                              -- shift out tx bit from the MSb
                di_req_next <= '1';                                     -- request data in advance to allow for pipeline delays
                sh_next(N-1 downto 1) <= sh_reg(N-2 downto 0);          -- shift inner bits
                sh_next(0) <= rx_bit_reg;                               -- shift in rx bit into LSb
                wr_ack_next <= '0';                                     -- remove write acknowledge for all but the load stages
                state_next <= state_reg - 1;                            -- update next state at each sck pulse
                
            when 1 =>                                                   -- transfer rx data to do_buffer and restart if new data is written
                spi_mosi_o <= sh_reg(N-1);                              -- shift out tx bit from the MSb
                di_req_next <= '1';                                     -- request data in advance to allow for pipeline delays
                do_buffer_next(N-1 downto 1) <= sh_reg(N-2 downto 0);   -- shift rx data directly into rx buffer
                do_buffer_next(0) <= rx_bit_reg;                        -- shift last rx bit into rx buffer
                do_transfer_next <= '1';                                -- signal transfer to do_buffer
                if wren = '1' then                                      -- load tx register if valid data present at di_i
                    state_next <= N;                                    -- next state is top bit of new data
                    sh_next <= di_reg;                                  -- load parallel data from di_reg into shifter
                    sck_ena_next <= '1';                                -- SCK enabled
                    wr_ack_next <= '1';                                 -- acknowledge data in transfer
                else
                    sck_ena_next <= '0';                                -- SCK disabled: tx empty, no data to send
                    wr_ack_next <= '0';                                 -- remove write acknowledge for all but the load stages
                    state_next <= state_reg - 1;                        -- update next state at each sck pulse
                end if;
                
            when 0 =>                                                   -- idle state: start and end of transmission
                di_req_next <= '1';                                     -- will request data if shifter empty
                sck_ena_next <= '0';                                    -- SCK disabled: tx empty, no data to send
                if wren = '1' then                                      -- load tx register if valid data present at di_i
                    spi_mosi_o <= di_reg(N-1);                          -- special case: shift out first tx bit from the MSb (look ahead)
                    ssel_ena_next <= '1';                               -- enable interface SSEL
                    state_next <= N+1;                                  -- start from idle: let one cycle for SSEL settling
                    sh_next <= di_reg;                                  -- load bits from di_reg into shifter
                    wr_ack_next <= '1';                                 -- acknowledge data in transfer
                else
                    spi_mosi_o <= sh_reg(N-1);                          -- shift out tx bit from the MSb
                    ssel_ena_next <= '0';                               -- deassert SSEL: interface is idle
                    wr_ack_next <= '0';                                 -- remove write acknowledge for all but the load stages
                    state_next <= 0;                                    -- when idle, keep this state
                end if;
                
            when others =>
                state_next <= 0;                                        -- state 0 is safe state
        end case; 
    end process core_combi_proc;
 
    --=============================================================================================
    --  OUTPUT LOGIC PROCESSES
    --=============================================================================================
    -- data output processes
    spi_ssel_o_proc:    spi_ssel_o <= not ssel_ena_reg;                 -- active-low slave select line 
    do_o_proc:          do_o <= do_buffer_reg;                          -- parallel data out
    do_valid_o_proc:    do_valid_o <= do_valid_o_reg;                   -- data out valid
    di_req_o_proc:      di_req_o <= di_req_o_reg;                       -- input data request for next cycle
    wr_ack_o_proc:      wr_ack_o <= wr_ack_reg;                         -- write acknowledge
    -----------------------------------------------------------------------------------------------
    -- SCK out logic: pipeline phase compensation for the SCK line
    -----------------------------------------------------------------------------------------------
    -- This is a MUX with an output register. 
    -- The register gives us a pipeline delay for the SCK line, pairing with the state machine moore 
    -- output pipeline delay for the MOSI line, and thus enabling higher SCK frequency. 
    spi_sck_o_gen_proc : process (sclk_i, sck_ena_reg, spi_clk, spi_clk_reg) is
    begin
        if sclk_i'event and sclk_i = '1' then
            if sck_ena_reg = '1' then
                spi_clk_reg <= spi_clk;                                 -- copy the selected clock polarity
            else
                spi_clk_reg <= CPOL;                                    -- when clock disabled, set to idle polarity
            end if;
        end if;
        spi_sck_o <= spi_clk_reg;                                       -- connect register to output
    end process spi_sck_o_gen_proc;
 
    --=============================================================================================
    --  DEBUG LOGIC PROCESSES
    --=============================================================================================
    -- these signals are useful for verification, and can be deleted after debug.
    do_transfer_proc:   do_transfer_o <= do_transfer_reg;
    state_dbg_proc:     state_dbg_o <= std_logic_vector(to_unsigned(state_reg, 4));
    rx_bit_reg_proc:    rx_bit_reg_o <= rx_bit_reg;
    wren_o_proc:        wren_o <= wren;
    sh_reg_dbg_proc:    sh_reg_dbg_o <= sh_reg;
    core_clk_o_proc:    core_clk_o <= core_clk;
    core_n_clk_o_proc:  core_n_clk_o <= core_n_clk;
    core_ce_o_proc:     core_ce_o <= core_ce;
    core_n_ce_o_proc:   core_n_ce_o <= core_n_ce;
    sck_ena_o_proc:     sck_ena_o <= sck_ena_reg;
    sck_ena_ce_o_proc:  sck_ena_ce_o <= sck_ena_ce;
 
end architecture rtl;

here is the slave code

Code VHDL - [expand]
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
----------------------------------------------------------------------------------
-- Author:          Jonny Doin, [email]jdoin@opencores.org[/email]
-- 
-- Create Date:     15:36:20 05/15/2011
-- Module Name:     SPI_SLAVE - RTL
-- Project Name:    SPI INTERFACE
-- Target Devices:  Spartan-6
-- Tool versions:   ISE 13.1
-- Description: 
--
--      This block is the SPI slave interface, implemented in one single entity.
--      All internal core operations are synchronous to the external SPI clock, and follows the general SPI de-facto standard.
--      The parallel read/write interface is synchronous to a supplied system master clock, 'clk_i'.
--      Synchronization for the parallel ports is provided by input data request and write enable lines, and output data valid line.
--      Fully pipelined cross-clock circuitry guarantees that no setup artifacts occur on the buffers that are accessed by the two 
--      clock domains.
--
--      The block is very simple to use, and has parallel inputs and outputs that behave like a synchronous memory i/o.
--      It is parameterizable via generics for the data width ('N'), SPI mode (CPHA and CPOL), and lookahead prefetch 
--      signaling ('PREFETCH').
--
--      PARALLEL WRITE INTERFACE
--      The parallel interface has a input port 'di_i' and an output port 'do_o'.
--      Parallel load is controlled using 3 signals: 'di_i', 'di_req_o' and 'wren_i'. 
--      When the core needs input data, a look ahead data request strobe , 'di_req_o' is pulsed 'PREFETCH' 'spi_sck_i' 
--      cycles in advance to synchronize a user pipelined memory or fifo to present the next input data at 'di_i' 
--      in time to have continuous clock at the spi bus, to allow back-to-back continuous load.
--      The data request strobe on 'di_req_o' is 2 'clk_i' clock cycles long.
--      The write to 'di_i' must occur at most one 'spi_sck_i' cycle before actual load to the core shift register, to avoid
--      race conditions at the register transfer.
--      The user circuit places data at the 'di_i' port and strobes the 'wren_i' line for one rising edge of 'clk_i'.
--      For a pipelined sync RAM, a PREFETCH of 3 cycles allows an address generator to present the new adress to the RAM in one
--      cycle, and the RAM to respond in one more cycle, in time for 'di_i' to be latched by the interface one clock before transfer.
--      If the user sequencer needs a different value for PREFETCH, the generic can be altered at instantiation time.
--      The 'wren_i' write enable strobe must be valid at least one setup time before the rising edge of the last clock cycle,
--      if continuous transmission is intended. 
--      When the interface is idle ('spi_ssel_i' is HIGH), the top bit of the latched 'di_i' port is presented at port 'spi_miso_o'.
--
--      PARALLEL WRITE PIPELINED SEQUENCE
--      =================================
--                     __    __    __    __    __    __    __ 
--      clk_i       __/  \__/  \__/  \__/  \__/  \__/  \__/  \...     -- parallel interface clock
--                           ___________                        
--      di_req_o    ________/           \_____________________...     -- 'di_req_o' asserted on rising edge of 'clk_i'
--                  ______________ ___________________________...
--      di_i        __old_data____X______new_data_____________...     -- user circuit loads data on 'di_i' at next 'clk_i' rising edge
--                                             ________                        
--      wren_i      __________________________/        \______...     -- 'wren_i' enables latch on rising edge of 'clk_i'
--                      
--
--      PARALLEL READ INTERFACE
--      An internal buffer is used to copy the internal shift register data to drive the 'do_o' port. When a complete 
--      word is received, the core shift register is transferred to the buffer, at the rising edge of the spi clock, 'spi_sck_i'.
--      The signal 'do_valid_o' is strobed 3 'clk_i' clocks after, to directly drive a synchronous memory or fifo write enable.
--      'do_valid_o' is synchronous to the parallel interface clock, and changes only on rising edges of 'clk_i'.
--      When the interface is idle, data at the 'do_o' port holds the last word received.
--
--      PARALLEL READ PIPELINED SEQUENCE
--      ================================
--                      ______        ______        ______        ______
--      clk_spi_i   ___/ bit1 \______/ bitN \______/bitN-1\______/bitN-2\__...  -- spi base clock
--                     __    __    __    __    __    __    __    __    __  
--      clk_i       __/  \__/  \__/  \__/  \__/  \__/  \__/  \__/  \__/  \_...  -- parallel interface clock
--                  _________________ _____________________________________...  -- 1) received data is transferred to 'do_buffer_reg'
--      do_o        __old_data_______X__________new_data___________________...  --    after last bit received, at next shift clock.
--                                                   ____________               
--      do_valid_o  ________________________________/            \_________...  -- 2) 'do_valid_o' strobed for 2 'clk_i' cycles
--                                                                              --    on the 3rd 'clk_i' rising edge.
--
--
--      This design was originally targeted to a Spartan-6 platform, synthesized with XST and normal constraints.
--
------------------------------ COPYRIGHT NOTICE -----------------------------------------------------------------------
--                                                                   
--      This file is part of the SPI MASTER/SLAVE INTERFACE project [url]https://opencores.org/project,spi_master_slave[/url]                
--
--      Author(s):      Jonny Doin, [email]jdoin@opencores.org[/email], [email]jonnydoin@gmail.com[/email]
--                                                                   
--      Copyright (C) 2011 Jonny Doin
--      -----------------------------
--                                                                   
--      This source file may be used and distributed without restriction provided that this copyright statement is not    
--      removed from the file and that any derivative work contains the original copyright notice and the associated 
--      disclaimer. 
--                                                                   
--      This source file is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser 
--      General Public License as published by the Free Software Foundation; either version 2.1 of the License, or 
--      (at your option) any later version.
--                                                                   
--      This source is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied
--      warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more  
--      details.
--
--      You should have received a copy of the GNU Lesser General Public License along with this source; if not, download 
--      it from [url]https://www.gnu.org/licenses/lgpl.txt[/url]
--                                                                   
------------------------------ REVISION HISTORY -----------------------------------------------------------------------
--
-- 2011/05/15   v0.10.0050  [JD]    created the slave logic, with 2 clock domains, from SPI_MASTER module.
-- 2011/05/15   v0.15.0055  [JD]    fixed logic for starting state when CPHA='1'.
-- 2011/05/17   v0.80.0049  [JD]    added explicit clock synchronization circuitry across clock boundaries.
-- 2011/05/18   v0.95.0050  [JD]    clock generation circuitry, with generators for all-rising-edge clock core.
-- 2011/06/05   v0.96.0053  [JD]    changed async clear to sync resets.
-- 2011/06/07   v0.97.0065  [JD]    added cross-clock buffers, fixed fsm async glitches.
-- 2011/06/09   v0.97.0068  [JD]    reduced control sets (resets, CE, presets) to the absolute minimum to operate, to reduce 
--                                  synthesis LUT overhead in Spartan-6 architecture.
-- 2011/06/11   v0.97.0075  [JD]    redesigned all parallel data interfacing ports, and implemented cross-clock strobe logic.
-- 2011/06/12   v0.97.0079  [JD]    implemented wr_ack and di_req logic for state 0, and eliminated unnecessary registers reset.
-- 2011/06/17   v0.97.0079  [JD]    implemented wr_ack and di_req logic for state 0, and eliminated unnecessary registers reset.
-- 2011/07/16   v1.11.0080  [JD]    verified both spi_master and spi_slave in loopback at 50MHz SPI clock.
-- 2011/07/29   v2.00.0110  [JD]    FIX: CPHA bugs:
--                                      - redesigned core clocking to address all CPOL and CPHA configurations.
--                                      - added CHANGE_EDGE to the FSM register transfer logic, to have MISO change at opposite 
--                                        clock phases from SHIFT_EDGE.
--                                  Removed global signal setting at the FSM, implementing exhaustive explicit signal attributions
--                                  for each state, to avoid reported inference problems in some synthesis engines.
--                                  Streamlined port names and indentation blocks.
-- 2011/08/01   v2.01.0115  [JD]    Adjusted 'do_valid_o' pulse width to be 2 'clk_i', as in the master core.
--                                  Simulated in iSim with the master core for continuous transmission mode.
-- 2011/08/02   v2.02.0120  [JD]    Added mux for MISO at reset state, to output di(N-1) at start. This fixed a bug in first bit.
--                                  The master and slave cores were verified in FPGA with continuous transmission, for all SPI modes.
-- 2011/08/04   v2.02.0121  [JD]    Changed minor comment bugs in the combinatorial fsm logic.
-- 2011/08/08   v2.02.0122  [JD]    FIX: continuous transfer mode bug. When wren_i is not strobed prior to state 1 (last bit), the
--                                  sequencer goes to state 0, and then to state 'N' again. This produces a wrong bit-shift for received
--                                  data. The fix consists in engaging continuous transfer regardless of the user strobing write enable, and
--                                  sequencing from state 1 to N as long as the master clock is present. If the user does not write new 
--                                  data, the last data word is repeated.
-- 2011/08/08   v2.02.0123  [JD]    ISSUE: continuous transfer mode bug, for ignored 'di_req' cycles. Instead of repeating the last data word, 
--                                  the slave will send (others => '0') instead.
-- 2011/08/28   v2.02.0126  [JD]    ISSUE: the miso_o MUX that preloads tx_bit when slave is desselected will glitch for CPHA='1'.
--                                  FIX: added a registered drive for the MUX select that will transfer the tx_reg only after the first tx_reg update.
--
-----------------------------------------------------------------------------------------------------------------------
--  TODO
--  ====
--
-----------------------------------------------------------------------------------------------------------------------
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use ieee.std_logic_unsigned.all;
 
entity spi_slave is
    Generic (   
        N : positive := 32;                                             -- 32bit serial word length is default
        CPOL : std_logic := '0';                                        -- SPI mode selection (mode 0 default)
        CPHA : std_logic := '0';                                        -- CPOL = clock polarity, CPHA = clock phase.
        PREFETCH : positive := 3);                                      -- prefetch lookahead cycles
    Port (  
        clk_i : in std_logic := 'X';                                    -- internal interface clock (clocks di/do registers)
        spi_ssel_i : in std_logic := 'X';                               -- spi bus slave select line
        spi_sck_i : in std_logic := 'X';                                -- spi bus sck clock (clocks the shift register core)
        spi_mosi_i : in std_logic := 'X';                               -- spi bus mosi input
        spi_miso_o : out std_logic := 'X';                              -- spi bus spi_miso_o output
        di_req_o : out std_logic;                                       -- preload lookahead data request line
        di_i : in  std_logic_vector (N-1 downto 0) := (others => 'X');  -- parallel load data in (clocked in on rising edge of clk_i)
        wren_i : in std_logic := 'X';                                   -- user data write enable
        wr_ack_o : out std_logic;                                       -- write acknowledge
        do_valid_o : out std_logic;                                     -- do_o data valid strobe, valid during one clk_i rising edge.
        do_o : out  std_logic_vector (N-1 downto 0);                    -- parallel output (clocked out on falling clk_i)
        --- debug ports: can be removed for the application circuit ---
        do_transfer_o : out std_logic;                                  -- debug: internal transfer driver
        wren_o : out std_logic;                                         -- debug: internal state of the wren_i pulse stretcher
        rx_bit_next_o : out std_logic;                                  -- debug: internal rx bit
        state_dbg_o : out std_logic_vector (3 downto 0);                -- debug: internal state register
        sh_reg_dbg_o : out std_logic_vector (N-1 downto 0)              -- debug: internal shift register
    );                      
end spi_slave;
 
--================================================================================================================
-- SYNTHESIS CONSIDERATIONS
-- ========================
-- There are several output ports that are used to simulate and verify the core operation. 
-- Do not map any signals to the unused ports, and the synthesis tool will remove the related interfacing
-- circuitry. 
-- The same is valid for the transmit and receive ports. If the receive ports are not mapped, the
-- synthesis tool will remove the receive logic from the generated circuitry.
-- Alternatively, you can remove these ports and related circuitry once the core is verified and
-- integrated to your circuit.
--================================================================================================================
 
architecture rtl of spi_slave is
    -- constants to control FlipFlop synthesis
    constant SHIFT_EDGE  : std_logic := (CPOL xnor CPHA);   -- MOSI data is captured and shifted at this SCK edge
    constant CHANGE_EDGE : std_logic := (CPOL xor CPHA);    -- MISO data is updated at this SCK edge
 
    ------------------------------------------------------------------------------------------
    -- GLOBAL RESET:
    --      all signals are initialized to zero at GSR (global set/reset) by giving explicit
    --      initialization values at declaration. This is needed for all Xilinx FPGAs, and 
    --      especially for the Spartan-6 and newer CLB architectures, where a local reset can
    --      reduce the usability of the slice registers, due to the need to share the control
    --      set (RESET/PRESET, CLOCK ENABLE and CLOCK) by all 8 registers in a slice.
    --      By using GSR for the initialization, and reducing RESET local init to the really
    --      essential, the model achieves better LUT/FF packing and CLB usability.
    ------------------------------------------------------------------------------------------
    -- internal state signals for register and combinatorial stages
    signal state_next : natural range N downto 0 := 0;      -- state 0 is idle state
    signal state_reg : natural range N downto 0 := 0;       -- state 0 is idle state
    -- shifter signals for register and combinatorial stages
    signal sh_next : std_logic_vector (N-1 downto 0);
    signal sh_reg : std_logic_vector (N-1 downto 0);
    -- mosi and miso connections
    signal rx_bit_next : std_logic;                         -- sample of MOSI input
    signal tx_bit_next : std_logic;
    signal tx_bit_reg : std_logic;                          -- drives MISO during sequential logic
    signal preload_miso : std_logic;                        -- controls the MISO MUX
    -- buffered di_i data signals for register and combinatorial stages
    signal di_reg : std_logic_vector (N-1 downto 0);
    -- internal wren_i stretcher for fsm combinatorial stage
    signal wren : std_logic;
    signal wr_ack_next : std_logic := '0';
    signal wr_ack_reg : std_logic := '0';
    -- buffered do_o data signals for register and combinatorial stages
    signal do_buffer_next : std_logic_vector (N-1 downto 0);
    signal do_buffer_reg : std_logic_vector (N-1 downto 0);
    -- internal signal to flag transfer to do_buffer_reg
    signal do_transfer_next : std_logic := '0';
    signal do_transfer_reg : std_logic := '0';
    -- internal input data request signal 
    signal di_req_next : std_logic := '0';
    signal di_req_reg : std_logic := '0';
    -- cross-clock do_valid_o logic
    signal do_valid_next : std_logic := '0';
    signal do_valid_A : std_logic := '0';
    signal do_valid_B : std_logic := '0';
    signal do_valid_C : std_logic := '0';
    signal do_valid_D : std_logic := '0';
    signal do_valid_o_reg : std_logic := '0';
    -- cross-clock di_req_o logic
    signal di_req_o_next : std_logic := '0';
    signal di_req_o_A : std_logic := '0';
    signal di_req_o_B : std_logic := '0';
    signal di_req_o_C : std_logic := '0';
    signal di_req_o_D : std_logic := '0';
    signal di_req_o_reg : std_logic := '0';
begin
    --=============================================================================================
    --  GENERICS CONSTRAINTS CHECKING
    --=============================================================================================
    -- minimum word width is 8 bits
    assert N >= 8
    report "Generic parameter 'N' error: SPI shift register size needs to be 8 bits minimum"
    severity FAILURE;    
    -- maximum prefetch lookahead check
    assert PREFETCH <= N-5
    report "Generic parameter 'PREFETCH' error: lookahead count out of range, needs to be N-5 maximum"
    severity FAILURE;    
 
    --=============================================================================================
    --  GENERATE BLOCKS
    --=============================================================================================
 
    --=============================================================================================
    --  DATA INPUTS
    --=============================================================================================
    -- connect rx bit input
    rx_bit_proc : rx_bit_next <= spi_mosi_i;
 
    --=============================================================================================
    --  CROSS-CLOCK PIPELINE TRANSFER LOGIC
    --=============================================================================================
    -- do_valid_o and di_req_o strobe output logic
    -- this is a delayed pulse generator with a ripple-transfer FFD pipeline, that generates a 
    -- fixed-length delayed pulse for the output flags, at the parallel clock domain
    out_transfer_proc : process ( clk_i, do_transfer_reg, di_req_reg,
                                  do_valid_A, do_valid_B, do_valid_D, 
                                  di_req_o_A, di_req_o_B, di_req_o_D) is
    begin
        if clk_i'event and clk_i = '1' then                     -- clock at parallel port clock
            -- do_transfer_reg -> do_valid_o_reg
            do_valid_A <= do_transfer_reg;                      -- the input signal must be at least 2 clocks long
            do_valid_B <= do_valid_A;                           -- feed it to a ripple chain of FFDs
            do_valid_C <= do_valid_B;
            do_valid_D <= do_valid_C;
            do_valid_o_reg <= do_valid_next;                    -- registered output pulse
            --------------------------------
            -- di_req_reg -> di_req_o_reg
            di_req_o_A <= di_req_reg;                           -- the input signal must be at least 2 clocks long
            di_req_o_B <= di_req_o_A;                           -- feed it to a ripple chain of FFDs
            di_req_o_C <= di_req_o_B;                               
            di_req_o_D <= di_req_o_C;                               
            di_req_o_reg <= di_req_o_next;                      -- registered output pulse
        end if;
        -- generate a 2-clocks pulse at the 3rd clock cycle
        do_valid_next <= do_valid_A and do_valid_B and not do_valid_D;
        di_req_o_next <= di_req_o_A and di_req_o_B and not di_req_o_D;
    end process out_transfer_proc;
    -- parallel load input registers: data register and write enable
    in_transfer_proc: process (clk_i, wren_i, wr_ack_reg) is
    begin
        -- registered data input, input register with clock enable
        if clk_i'event and clk_i = '1' then
            if wren_i = '1' then
                di_reg <= di_i;                                 -- parallel data input buffer register
            end if;
        end  if;
        -- stretch wren pulse to be detected by spi fsm (ffd with sync preset and sync reset)
        if clk_i'event and clk_i = '1' then
            if wren_i = '1' then                                -- wren_i is the sync preset for wren
                wren <= '1';
            elsif wr_ack_reg = '1' then                         -- wr_ack is the sync reset for wren
                wren <= '0';
            end if;
        end  if;
    end process in_transfer_proc;
 
    --=============================================================================================
    --  REGISTER TRANSFER PROCESSES
    --=============================================================================================
    -- fsm state and data registers change on spi SHIFT_EDGE
    core_reg_proc : process (spi_sck_i, spi_ssel_i) is
    begin
        -- FFD registers clocked on SHIFT edge and cleared on idle (spi_ssel_i = 1)
        -- state fsm register (fdr)
        if spi_ssel_i = '1' then                                -- async clr
            state_reg <= 0;                                     -- state falls back to idle when slave not selected
        elsif spi_sck_i'event and spi_sck_i = SHIFT_EDGE then   -- on SHIFT edge, update state register
            state_reg <= state_next;                            -- core fsm changes state with spi SHIFT clock
        end if;
        -- FFD registers clocked on SHIFT edge
        -- rtl core registers (fd)
        if spi_sck_i'event and spi_sck_i = SHIFT_EDGE then      -- on fsm state change, update all core registers
            sh_reg <= sh_next;                                  -- core shift register
            do_buffer_reg <= do_buffer_next;                    -- registered data output
            do_transfer_reg <= do_transfer_next;                -- cross-clock transfer flag
            di_req_reg <= di_req_next;                          -- input data request
            wr_ack_reg <= wr_ack_next;                          -- wren ack for data load synchronization
        end if;
        -- FFD registers clocked on CHANGE edge and cleared on idle (spi_ssel_i = 1)
        -- miso MUX preload control register (fdp)
        if spi_ssel_i = '1' then                                -- async preset
            preload_miso <= '1';                                -- miso MUX sees top bit of parallel input when slave not selected
        elsif spi_sck_i'event and spi_sck_i = CHANGE_EDGE then  -- on CHANGE edge, change to tx_reg output
            preload_miso <= spi_ssel_i;                         -- miso MUX sees tx_bit_reg when it is driven by SCK
        end if;
        -- FFD registers clocked on CHANGE edge
        -- tx_bit register (fd)
        if spi_sck_i'event and spi_sck_i = CHANGE_EDGE then
            tx_bit_reg <= tx_bit_next;                          -- update MISO driver from the MSb
        end if;
    end process core_reg_proc;
 
    --=============================================================================================
    --  COMBINATORIAL LOGIC PROCESSES
    --=============================================================================================
    -- state and datapath combinatorial logic
    core_combi_proc : process ( sh_reg, sh_next, state_reg, tx_bit_reg, rx_bit_next, do_buffer_reg, 
                                do_transfer_reg, di_reg, di_req_reg, wren, wr_ack_reg) is
    begin
        -- all output signals are assigned to (avoid latches)
        sh_next <= sh_reg;                                              -- shift register
        tx_bit_next <= tx_bit_reg;                                      -- MISO driver
        do_buffer_next <= do_buffer_reg;                                -- output data buffer
        do_transfer_next <= do_transfer_reg;                            -- output data flag
        wr_ack_next <= wr_ack_reg;                                      -- write enable acknowledge
        di_req_next <= di_req_reg;                                      -- data input request
        state_next <= state_reg;                                        -- fsm control state
        case state_reg is
        
            when (N) =>                                                 -- deassert 'di_rdy' and stretch do_valid
                wr_ack_next <= '0';                                     -- acknowledge data in transfer
                di_req_next <= '0';                                     -- prefetch data request: deassert when shifting data
                tx_bit_next <= sh_reg(N-1);                             -- output next MSbit
                sh_next(N-1 downto 1) <= sh_reg(N-2 downto 0);          -- shift inner bits
                sh_next(0) <= rx_bit_next;                              -- shift in rx bit into LSb
                state_next <= state_reg - 1;                            -- update next state at each sck pulse
                
            when (N-1) downto (PREFETCH+3) =>                           -- remove 'do_transfer' and shift bits
                do_transfer_next <= '0';                                -- reset 'do_valid' transfer signal
                di_req_next <= '0';                                     -- prefetch data request: deassert when shifting data
                wr_ack_next <= '0';                                     -- remove data load ack for all but the load stages
                tx_bit_next <= sh_reg(N-1);                             -- output next MSbit
                sh_next(N-1 downto 1) <= sh_reg(N-2 downto 0);          -- shift inner bits
                sh_next(0) <= rx_bit_next;                              -- shift in rx bit into LSb
                state_next <= state_reg - 1;                            -- update next state at each sck pulse
                
            when (PREFETCH+2) downto 3 =>                               -- raise prefetch 'di_req_o' signal
                di_req_next <= '1';                                     -- request data in advance to allow for pipeline delays
                wr_ack_next <= '0';                                     -- remove data load ack for all but the load stages
                tx_bit_next <= sh_reg(N-1);                             -- output next MSbit
                sh_next(N-1 downto 1) <= sh_reg(N-2 downto 0);          -- shift inner bits
                sh_next(0) <= rx_bit_next;                              -- shift in rx bit into LSb
                state_next <= state_reg - 1;                            -- update next state at each sck pulse
                
            when 2 =>                                                   -- transfer received data to do_buffer_reg on next cycle
                di_req_next <= '1';                                     -- request data in advance to allow for pipeline delays
                wr_ack_next <= '0';                                     -- remove data load ack for all but the load stages
                tx_bit_next <= sh_reg(N-1);                             -- output next MSbit
                sh_next(N-1 downto 1) <= sh_reg(N-2 downto 0);          -- shift inner bits
                sh_next(0) <= rx_bit_next;                              -- shift in rx bit into LSb
                do_transfer_next <= '1';                                -- signal transfer to do_buffer on next cycle
                do_buffer_next <= sh_next;                              -- get next data directly into rx buffer
                state_next <= state_reg - 1;                            -- update next state at each sck pulse
                
            when 1 =>                                                   -- transfer rx data to do_buffer and restart if new data is written
                sh_next(0) <= rx_bit_next;                              -- shift in rx bit into LSb
                di_req_next <= '0';                                     -- prefetch data request: deassert when shifting data
                state_next <= N;                                        -- next state is top bit of new data
                if wren = '1' then                                      -- load tx register if valid data present at di_reg
                    wr_ack_next <= '1';                                 -- acknowledge data in transfer
                    sh_next(N-1 downto 1) <= di_reg(N-2 downto 0);      -- shift inner bits
                    tx_bit_next <= di_reg(N-1);                         -- first output bit comes from the MSb of parallel data
                else
                    wr_ack_next <= '0';                                 -- no data reload for continuous transfer mode
                    sh_next(N-1 downto 1) <= (others => '0');           -- clear transmit shift register
                    tx_bit_next <= '0';                                 -- send ZERO
                end if;
                
            when 0 =>                                                   -- idle state: start and end of transmission
                sh_next(0) <= rx_bit_next;                              -- shift in rx bit into LSb
                sh_next(N-1 downto 1) <= di_reg(N-2 downto 0);          -- shift inner bits
                tx_bit_next <= di_reg(N-1);                             -- first output bit comes from the MSb of parallel data
                wr_ack_next <= '1';                                     -- acknowledge data in transfer
                di_req_next <= '0';                                     -- prefetch data request: deassert when shifting data
                do_transfer_next <= '0';                                -- clear signal transfer to do_buffer
                state_next <= N;                                        -- next state is top bit of new data
                
            when others =>
                state_next <= 0;                                        -- safe state
                
        end case; 
    end process core_combi_proc;
 
    --=============================================================================================
    --  OUTPUT LOGIC PROCESSES
    --=============================================================================================
    -- data output processes
    do_o_proc :         do_o <= do_buffer_reg;                          -- do_o always available
    do_valid_o_proc:    do_valid_o <= do_valid_o_reg;                   -- copy registered do_valid_o to output
    di_req_o_proc:      di_req_o <= di_req_o_reg;                       -- copy registered di_req_o to output
    wr_ack_o_proc:      wr_ack_o <= wr_ack_reg;                         -- copy registered wr_ack_o to output
 
    -----------------------------------------------------------------------------------------------
    -- MISO driver process: preload top bit of parallel data to MOSI at reset
    -----------------------------------------------------------------------------------------------
    -- this is a MUX that selects the combinatorial next tx bit at reset, and the registered tx bit
    -- at sequential operation. The mux gives us a preload of the first bit, simplifying the shifter logic.
    spi_miso_o_proc: process (preload_miso, tx_bit_reg, di_reg) is 
    begin
        if preload_miso = '1' then
            spi_miso_o <= di_reg(N-1);                                  -- copy top bit of parallel data at reset
        else
            spi_miso_o <= tx_bit_reg;                                   -- copy top bit of shifter at sequential operation
        end if;
    end process spi_miso_o_proc;
 
    --=============================================================================================
    --  DEBUG LOGIC PROCESSES
    --=============================================================================================
    -- these signals are useful for verification, and can be deleted after debug.
    do_transfer_proc:   do_transfer_o <= do_transfer_reg;
    state_debug_proc:   state_dbg_o <= std_logic_vector(to_unsigned(state_reg, 4)); -- export internal state to debug
    rx_bit_next_proc:   rx_bit_next_o <= rx_bit_next;
    wren_o_proc:        wren_o <= wren;
    sh_reg_debug_proc:  sh_reg_dbg_o <= sh_reg;                                     -- export sh_reg to debug
end architecture rtl;

- - - Updated - - -

the ucf for spi master .

Code:

##Clock signal
Net "m_clk_i" LOC=V10 | IOSTANDARD=LVCMOS33;
Net "m_clk_i" TNM_NET = sys_clk_pin;
TIMESPEC TS_sys_clk_pin = PERIOD sys_clk_pin 100000 kHz;


## 12 pin connectors

##JB
Net "m_spi_sck_o" LOC = K2 | IOSTANDARD = LVCMOS33; #Bank = 3, pin name = IO_L38P_M3DQ2, Sch name = JB1
Net "m_spi_ssel_o" LOC = K1 | IOSTANDARD = LVCMOS33; #Bank = 3, pin name = IO_L38N_M3DQ3, Sch name = JB2
Net "m_spi_mosi_o" LOC = L4 | IOSTANDARD = LVCMOS33; #Bank = 3, pin name = IO_L39P_M3LDQS, Sch name = JB3
Net "m_spi_miso_i" LOC = L3 | IOSTANDARD = LVCMOS33; #Bank = 3, pin name = IO_L39N_M3LDQSN, Sch name = JB4

the ucf for spi slave

Code:

##Clock signal
Net "s_clk_i" LOC=V10 | IOSTANDARD=LVCMOS33;
Net "s_clk_i" TNM_NET = sys_clk_pin;
TIMESPEC TS_sys_clk_pin = PERIOD sys_clk_pin 100000 kHz;

Net "s_spi_ssel_i" LOC = J1 | IOSTANDARD = LVCMOS33; #Bank = 3, pin name = IO_L40N_M3DQ7, Sch name = JB8
Net "s_spi_mosi_i" LOC = K3 | IOSTANDARD = LVCMOS33; #Bank = 3, pin name = IO_L42N_GCLK24_M3LDM, Sch name = JB9
Net "s_spi_miso_o" LOC = K5 | IOSTANDARD = LVCMOS33; #Bank = 3, pin name = IO_L43N_GCLK22_IRDY2_M3CASN, Sch name = JB10

ads-ee · May 6, 2016

Do not use quote tags use code or syntax tags see this.

You've already been warned multiple times about this.

- - - Updated - - -

Where is your block diagram? Draw one and post it.

Where is your top-level code that connects the UART to the SPI master and the UART that connects to the SPI slave?

Where is your testbench to verify the functionality of your code?

If you don't supply these items I will delete this thread as a duplicate of your previous thread (which has been closed).

Ananhasaasneh77 · May 6, 2016

sir i will not use the uart with spi .. just i want now to transfer data between master and slave.. master sends slave catch .. slave sends master catch ..
as i said i assigned the mosi miso sck ss in pmod connector
what i should do next to make the transfer happen betwwen slave and master ????

here is the tb for the code

Code:

--------------------------------------------------------------------------------
-- Company: 
-- Engineer:        Jonny Doin
--
-- Create Date:     22:59:18 04/25/2011
-- Design Name:     spi_master_slave
-- Module Name:     spi_master_slave/spi_loopback_test.vhd
-- Project Name:    SPI_interface
-- Target Device:   Spartan-6
-- Tool versions:   ISE 13.1
-- Description:     Testbench to simulate the master and slave SPI interfaces. Each module can be tested
--                  in a "real" environment, where the 'spi_master' exchanges data with the 'spi_slave'
--                  module, simulating the internal working of each design.
--                  In behavioral simulation, select a matching data width (N) and spi mode (CPOL, CPHA) for
--                  both modules, and also a different clock domain for each parallel interface.
--                  Different values for PREFETCH for each interface can be tested, to model the best value
--                  for the pipelined memory / bus that is attached to the di/do ports.
--                  To test the parallel interfaces, a simple ROM memory is simulated for each interface, with
--                  8 words of data to be sent, synchronous to each clock and flow control signals.
--
-- 
-- VHDL Test Bench Created by ISE for modules: 'spi_master' and 'spi_slave'
-- 
-- Dependencies:
-- 
-- Revision:
-- Revision 0.01 - File Created
-- Revision 0.10 - Implemented FIFO simulation for each interface.
-- Additional Comments:
--
-- Notes: 
-- This testbench has been automatically generated using types std_logic and
-- std_logic_vector for the ports of the unit under test.  Xilinx recommends
-- that these types always be used for the top-level I/O of a design in order
-- to guarantee that the testbench will bind correctly to the post-implementation 
-- simulation model.
--------------------------------------------------------------------------------
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;

library work;
use work.all;

ENTITY spi_loopback_test IS
    GENERIC (   
        N : positive := 32;                                 -- 32bit serial word length is default
        CPOL : std_logic := '0';                            -- SPI mode selection (mode 0 default)
        CPHA : std_logic := '1';                            -- CPOL = clock polarity, CPHA = clock phase.
        PREFETCH : positive := 2                            -- prefetch lookahead cycles
    );                                          
END spi_loopback_test;
 
ARCHITECTURE behavior OF spi_loopback_test IS 

    --=========================================================
    -- Component declaration for the Unit Under Test (UUT)
    --=========================================================

	COMPONENT spi_loopback
	PORT(
		m_clk_i : IN std_logic;
		m_rst_i : IN std_logic;
		m_spi_miso_i : IN std_logic;
		m_di_i : IN std_logic_vector(31 downto 0);
		m_wren_i : IN std_logic;
		s_clk_i : IN std_logic;
		s_spi_ssel_i : IN std_logic;
		s_spi_sck_i : IN std_logic;
		s_spi_mosi_i : IN std_logic;
		s_di_i : IN std_logic_vector(31 downto 0);
		s_wren_i : IN std_logic;          
		m_spi_ssel_o : OUT std_logic;
		m_spi_sck_o : OUT std_logic;
		m_spi_mosi_o : OUT std_logic;
		m_di_req_o : OUT std_logic;
		m_do_valid_o : OUT std_logic;
		m_do_o : OUT std_logic_vector(31 downto 0);
		m_do_transfer_o : OUT std_logic;
		m_wren_o : OUT std_logic;
		m_wren_ack_o : OUT std_logic;
		m_rx_bit_reg_o : OUT std_logic;
		m_state_dbg_o : OUT std_logic_vector(3 downto 0);
		m_core_clk_o : OUT std_logic;
		m_core_n_clk_o : OUT std_logic;
		m_sh_reg_dbg_o : OUT std_logic_vector(31 downto 0);
		s_spi_miso_o : OUT std_logic;
		s_di_req_o : OUT std_logic;
		s_do_valid_o : OUT std_logic;
		s_do_o : OUT std_logic_vector(31 downto 0);
		s_do_transfer_o : OUT std_logic;
		s_wren_o : OUT std_logic;
		s_wren_ack_o : OUT std_logic;
		s_rx_bit_reg_o : OUT std_logic;
		s_state_dbg_o : OUT std_logic_vector(3 downto 0)
		);
	END COMPONENT;

    --=========================================================
    -- constants
    --=========================================================
    constant fifo_memory_size : integer := 16;
    
    --=========================================================
    -- types
    --=========================================================
    type fifo_memory_type is array (0 to fifo_memory_size-1) of std_logic_vector (N-1 downto 0);

    --=========================================================
    -- signals to connect the instances
    --=========================================================
    -- internal clk and rst
    signal m_clk : std_logic := '0';                -- clock domain for the master parallel interface. Must be faster than spi bus sck.
    signal s_clk : std_logic := '0';                -- clock domain for the slave parallel interface. Must be faster than spi bus sck.
    signal rst : std_logic := 'U';
    -- spi bus wires
    signal spi_sck : std_logic;
    signal spi_ssel : std_logic;
    signal spi_miso : std_logic;
    signal spi_mosi : std_logic;
    -- master parallel interface
    signal di_m : std_logic_vector (N-1 downto 0) := (others => '0');
    signal do_m : std_logic_vector (N-1 downto 0) := (others => 'U');
    signal do_valid_m : std_logic;
    signal do_transfer_m : std_logic;
    signal di_req_m : std_logic;
    signal wren_m : std_logic := '0';
    signal wren_o_m : std_logic := 'U';
    signal wren_ack_o_m : std_logic := 'U';
    signal rx_bit_reg_m : std_logic;
    signal state_m : std_logic_vector (3 downto 0);
    signal core_clk_o_m : std_logic;
    signal core_n_clk_o_m : std_logic;
    signal sh_reg_m : std_logic_vector (N-1 downto 0) := (others => '0');
    -- slave parallel interface
    signal di_s : std_logic_vector (N-1 downto 0) := (others => '0');
    signal do_s : std_logic_vector (N-1 downto 0);
    signal do_valid_s : std_logic;
    signal do_transfer_s : std_logic;
    signal di_req_s : std_logic;
    signal wren_s : std_logic := '0';
    signal wren_o_s : std_logic := 'U';
    signal wren_ack_o_s : std_logic := 'U';
    signal rx_bit_reg_s : std_logic;
    signal state_s : std_logic_vector (3 downto 0);
--    signal sh_reg_s : std_logic_vector (N-1 downto 0);

    --=========================================================
    -- Clock period definitions
    --=========================================================
    constant m_clk_period : time := 10 ns;          -- 100MHz master parallel clock
    constant s_clk_period : time := 10 ns;          -- 100MHz slave parallel clock

BEGIN
 
    --=========================================================
    -- Component instantiation for the Unit Under Test (UUT)
    --=========================================================

    Inst_spi_loopback: spi_loopback
    port map(
        ----------------MASTER-----------------------
        m_clk_i => m_clk,
        m_rst_i => rst,
        m_spi_ssel_o => spi_ssel,
        m_spi_sck_o => spi_sck,
        m_spi_mosi_o => spi_mosi,
        m_spi_miso_i => spi_miso,
        m_di_req_o => di_req_m,
        m_di_i => di_m,
        m_wren_i => wren_m,
        m_do_valid_o => do_valid_m,
        m_do_o => do_m,
        ----- debug -----
        m_do_transfer_o => do_transfer_m,
        m_wren_o => wren_o_m,
        m_wren_ack_o => wren_ack_o_m,
        m_rx_bit_reg_o => rx_bit_reg_m,
        m_state_dbg_o => state_m,
        m_core_clk_o => core_clk_o_m,
        m_core_n_clk_o => core_n_clk_o_m,
        m_sh_reg_dbg_o => sh_reg_m,
        ----------------SLAVE-----------------------
        s_clk_i => s_clk,
        s_spi_ssel_i => spi_ssel,
        s_spi_sck_i => spi_sck,
        s_spi_mosi_i => spi_mosi,
        s_spi_miso_o => spi_miso,
        s_di_req_o => di_req_s,
        s_di_i => di_s,
        s_wren_i => wren_s,
        s_do_valid_o => do_valid_s,
        s_do_o => do_s,
        ----- debug -----
        s_do_transfer_o => do_transfer_s,
        s_wren_o => wren_o_s,
        s_wren_ack_o => wren_ack_o_s,
        s_rx_bit_reg_o => rx_bit_reg_s,
        s_state_dbg_o => state_s
--        s_sh_reg_dbg_o => sh_reg_s
    );

    --=========================================================
    -- Clock generator processes
    --=========================================================
    m_clk_process : process
    begin
        m_clk <= '0';
        wait for m_clk_period/2;
        m_clk <= '1';
        wait for m_clk_period/2;
    end process m_clk_process;

    s_clk_process : process
    begin
        s_clk <= '0';
        wait for s_clk_period/2;
        s_clk <= '1';
        wait for s_clk_period/2;
    end process s_clk_process;

    --=========================================================
    -- rst_i process
    --=========================================================
    rst <= '0', '1' after 20 ns, '0' after 100 ns;
    
    --=========================================================
    -- Master interface process
    --=========================================================
    master_tx_fifo_proc: process is
        variable fifo_memory : fifo_memory_type := 
            (X"87654321",X"abcdef01",X"faceb007",X"10203049",X"85a5a5a5",X"7aaa5551",X"7adecabe",X"57564789",
             X"12345678",X"beefbeef",X"fee1600d",X"f158ba17",X"5ee1a7e3",X"101096da",X"600ddeed",X"deaddead");
        variable fifo_head : integer range 0 to fifo_memory_size-1;
    begin
        -- synchronous rst_i
        wait until rst = '1';
        wait until m_clk'event and m_clk = '1';
        di_m <= (others => '0');
        wren_m <= '0';
        fifo_head := 0;
        wait until rst = '0';
        wait until di_req_m = '1';                          -- wait shift register request for data
        -- load next fifo contents into shift register
        for cnt in 0 to (fifo_memory_size/2)-1 loop
            fifo_head := cnt;                               -- pre-compute next pointer 
            wait until m_clk'event and m_clk = '1';         -- sync fifo data load at next rising edge
            di_m <= fifo_memory(fifo_head);                 -- place data into tx_data input bus
            wait until m_clk'event and m_clk = '1';         -- sync fifo data load at next rising edge
            wren_m <= '1';                                  -- write data into spi master
            wait until m_clk'event and m_clk = '1';         -- sync fifo data load at next rising edge
            wait until m_clk'event and m_clk = '1';         -- sync fifo data load at next rising edge
            wren_m <= '0';                                  -- remove write enable signal
            wait until di_req_m = '1';                      -- wait shift register request for data
        end loop;
        wait until spi_ssel = '1';
        wait for 2000 ns;
        for cnt in (fifo_memory_size/2) to fifo_memory_size-1 loop
            fifo_head := cnt;                               -- pre-compute next pointer 
            wait until m_clk'event and m_clk = '1';         -- sync fifo data load at next rising edge
            di_m <= fifo_memory(fifo_head);                 -- place data into tx_data input bus
            wait until m_clk'event and m_clk = '1';         -- sync fifo data load at next rising edge
            wren_m <= '1';                                  -- write data into spi master
            wait until m_clk'event and m_clk = '1';         -- sync fifo data load at next rising edge
            wait until m_clk'event and m_clk = '1';         -- sync fifo data load at next rising edge
            wren_m <= '0';                                  -- remove write enable signal
            wait until di_req_m = '1';                      -- wait shift register request for data
        end loop;
        wait;
    end process master_tx_fifo_proc;


    --=========================================================
    -- Slave interface process
    --=========================================================
    slave_tx_fifo_proc: process is
        variable fifo_memory : fifo_memory_type := 
            (X"90201031",X"97640231",X"ef55aaf1",X"babaca51",X"b00b1ee5",X"51525354",X"81828384",X"91929394",
             X"be575ec5",X"2fa57410",X"cafed0ce",X"afadab0a",X"bac7ed1a",X"f05fac75",X"2acbac7e",X"12345678");
        variable fifo_head : integer range 0 to fifo_memory_size-1;
    begin
        -- synchronous rst_i
        wait until rst = '1';
        wait until s_clk'event and s_clk = '1';
        di_s <= (others => '0');
        wren_s <= '0';
        fifo_head := 0;
        wait until rst = '0';
        wait until di_req_s = '1';                          -- wait shift register request for data
        -- load next fifo contents into shift register
        for cnt in 0 to fifo_memory_size-1 loop
            fifo_head := cnt;                               -- pre-compute next pointer 
            wait until s_clk'event and s_clk = '1';         -- sync fifo data load at next rising edge
            di_s <= fifo_memory(fifo_head);                 -- place data into tx_data input bus
            wait until s_clk'event and s_clk = '1';         -- sync fifo data load at next rising edge
            wren_s <= '1';                                  -- write data into shift register
            wait until s_clk'event and s_clk = '1';         -- sync fifo data load at next rising edge
            wait until s_clk'event and s_clk = '1';         -- sync fifo data load at next rising edge
            wren_s <= '0';                                  -- remove write enable signal
            wait until di_req_s = '1';                      -- wait shift register request for data
        end loop;
        wait;
    end process slave_tx_fifo_proc;
 
END ARCHITECTURE behavior;

and the photo is for the cable that i will connect the two fpga with

ads-ee · May 6, 2016

BTW, to illustrate the problem with your post(s)...

In the software world:
I have these three DLLs and I am using them to make a MS Word clone program, but I don't know why I can't write a novel using it.

In the mechanical world:
I have this V8 engine, a tire, and a steering rack, I am using them to make a car, but I don't know why I can't drive it.

Ananhasaasneh77 · May 6, 2016

so what should i do now?

ads-ee · May 6, 2016

I told you to draw a diagram of the "system" you are trying to build. NOT text, a real DRAWING. with everything labeled as to what it is. the boxes in the block diagram should have the names of code you have or plan to use for that piece. Use the exact same names that are in the code for any connections between blocks. If you draw that you may notice you are connecting something wrong or there is a big hole somewhere that is missing something to connect, which we might be able to help you with.

If you want any kind of help you need to provide the information that is inside your head as nobody on the forum appears to be a mind reader.

Ananhasaasneh77 · May 6, 2016

this all i want master send to slave in mosi , slave receive ,,, slave send on mosi data master receive.
i assigned the mosi miso ss sck for both in pmod connector ..

now how can i make the data send from this to this .

Tetik · May 6, 2016

Your block diagram is incomplete. From your drawing, the SPI master has 4 ports while the code for it has over 20 ports. The same with the SPI slave.

Match both your drawing and the code, then post it again.

While you will be doing this, you will realize that the ports of the modules are connected to the spi_loopback top entity. Then, you will update your block diagram to represent this hierarchy.

After, you will end up with 2 sections (master side and slave side) with ports for the spi communication and many ports to control these sections. At that time, you will need to ask yourself how these controls works and where they are to be connected. The testbench should help you to understand that.

Remember that you have 2 FPGA. So you need two different projects (two Top entity). One will include the master and the other one the slave.

Take your time by doing it. It won't be wasted.

Ananhasaasneh77 · May 6, 2016

okay i will try my best and i will tell u what will happen ..
thanks alot

ads-ee · May 6, 2016

along with what Tetik stated.

You didn't follow my advice, draw a SYSTEM level block diagram.

You should have everything including any interfaces to PCs etc to control your test setup. You state earlier in your previous thread that you have two demo boards, that should be shown and how you have the logic distributed between the two boards. The control of the two boards via what interface should also be shown.

If you used the TB and tired to synthesize it you would end up with a test platform that likely doesn't do anything as that code is NOT for synthesis.

- - - Updated - - -

As an example here is a link to a block diagram of a PC.

as you can see there is far more information that the simple diagram you posted.

- - - Updated - - -

And here is a system level diagram of an entire TV studio. Note how it shows how everything is interconnected, from the source feeds to the transmission interface to the end user (transmission tower, NTSC signal).

Ananhasaasneh77 · May 6, 2016

^^ i will try and i will post what i will got ..

but i saw sth wrote in the code """ --- debug ports: can be removed or left unconnected for the application circuit ---"""
so there is no need i guess to be assign in the ucf??

ads-ee · May 6, 2016

You don't seem to understand what goes into a UCF file, did you ever read the documents that were pointed out to you?

A UCF file should contain:
Physical constraints

package pin locations
package pin IO standard
package pin drive strength
package pin slew rate
package pin differential termination
package pin pullup/pulldown

Timing constraints

clock period constraints
input constraints for input package pins
output constraints for output package pins
timing ignore constraints (TIG)
multi-cycle constraints (e.g. like clock enables)

Internal nodes that do not connect to package pins do not need input and output timing constraints as you posted in one of your UCFs (which also didn't even have input/output constraints on the pins that are connected to the package pins, e.g. m_spi_*)

Ananhasaasneh77 · May 7, 2016

i understand that .. but the problem is i dont knw what these ports do

Code:

    m_di_req_o : OUT std_logic;
        m_di_i : IN std_logic_vector(N-1 downto 0);
        m_wren_i : IN std_logic;          
        m_do_valid_o : OUT std_logic;
        m_do_o : OUT std_logic_vector(N-1 downto 0);
        ----- debug -----
        m_do_transfer_o : OUT std_logic;
        m_wren_o : OUT std_logic;
        m_wren_ack_o : OUT std_logic;
        m_rx_bit_reg_o : OUT std_logic;
        m_state_dbg_o : OUT std_logic_vector(3 downto 0);
        m_core_clk_o : OUT std_logic;
        m_core_n_clk_o : OUT std_logic;
        m_sh_reg_dbg_o : OUT std_logic_vector(N-1 downto 0);

becuz it's my first time i face these type of ports thats why i dont know where to put it in ucf

ads-ee · May 7, 2016

They are part of the interface to whatever generates the data that is transmitted by the SPI master. They are not in the UCF file, unless the go to pins on the device, otherwise they will be constrained by the clock constraint for setup and hold.

Have you run the testbench in a simulator YET? Look at the simulation waveforms when it runs you will see that the testbench writes values into that interface and that data is sent serially over the SPI lines.

My suggestion get an education on the basics of digital electronics. Forget about this design as it is obviously far too advanced for your current level of understanding. Find a very introductory tutorial on the hardware equivalent of Hello world. Something like an LED flasher circuit. Learn everything about that circuit and how it works before moving on to something more complex, e.g. a counter to 7-set display. After you've gone though perhaps 5-6 designs and throughly understand them try returning to this SPI design. Maybe by then you will have a rudimentary understanding of system architecture. Right now you can only see the tree in front of you and not the entire forest that surrounds you. When you can see the entire picture of how everything works together in a design you'll be ready to attempt this design.

Ananhasaasneh77 · May 7, 2016

i got simple spi code but still i cant transfe data bettwen them..

slave vhd

Code:

-- Serial Peripheral Interface Bus - Slave
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity spi_slave is
  generic(
    cpol             : std_logic:= '0';  -- Clock Polarity
    cpha             : std_logic := '0'  -- Clock Phase
  );
  port(
    -- Internal (FPGA) interface
    done        : out std_logic;  -- indicates SPI transaction complete
    data_tx     : in  std_logic_vector(7 downto 0);  -- data to transmit
    data_rx     : out std_logic_vector(7 downto 0);  -- data received

    -- External (SPI) interface
    sclk        : in  std_logic;  -- Serial Clock
    mosi        : in  std_logic;  -- Master Output, Slave Input
    miso        : out std_logic;  -- Master Input, Slave Output
    ss_l        : in  std_logic   -- Slave Select (active low)
  );
end spi_slave;

architecture spi_slave_arch of spi_slave is
  signal   done_reg        : std_logic := '0';
  signal   data_tx_reg     : std_logic_vector(9 downto 0) := "0011100000";
  signal   data_rx_reg     : std_logic_vector(8 downto 0) := "000001110";
  signal   miso_reg        : std_logic := '0';
  signal   clk_reg         : std_logic := '0';
begin  -- architecture spi_slave_arch of spi_slave
  done    <= data_rx_reg(8);
  data_rx <= data_rx_reg(7 downto 0);
  miso    <= data_tx_reg(9);
  clk_reg <= (cpol xor cpha xor sclk);

  -- TODO(aray): oh crap clean up the nesting (See also TODO:Learn VHDL)
  p : process(ss_l, clk_reg)
  begin
    if (ss_l = '1') then  -- reset time!
      data_rx_reg <= "000000000";
      data_tx_reg <= "0000000000";
    else
      if (data_rx_reg = "000000000") then  -- start time!
        if (cpha = '1') then
          data_tx_reg <= data_tx(7) & data_tx & '1';
        else
          data_tx_reg <= data_tx & "10";
        end if;
        data_rx_reg <= "000000001";
      else -- TODO: bring up RTL viewer, make sure these statements are sensible
        if rising_edge(clk_reg) then  -- sample time!
          if (data_rx_reg(8) = '1') then  -- continue to next byte
            data_rx_reg <= "00000001" & mosi;
          else
            data_rx_reg <= data_rx_reg(7 downto 0) & mosi;  -- shift in
          end if;
        elsif falling_edge(clk_reg) then  -- switch time!
          if (data_tx_reg(8 downto 0) = "100000000") then
            data_tx_reg <= data_tx & "10";  -- continue to next byte
          else
            data_tx_reg <= data_tx_reg(8 downto 0) & '0';   -- shift out
          end if;
        end if;
      end if;
    end if;
  end process p;
end architecture spi_slave_arch;

ucf slave

Code:

## Buttons
#Net "busy" LOC = B8 | IOSTANDARD = LVCMOS33; #Bank = 0, pin name = IO_L33P, Sch name = BTNS
#Net "b" LOC = A8 | IOSTANDARD = LVCMOS33; #Bank = 0, pin name = IO_L33N, Sch name = BTNU
Net "done" LOC = C4 | IOSTANDARD = LVCMOS33; #Bank = 0, pin name = IO_L1N_VREF, Sch name = BTNL
#Net "reset" LOC = C9 | IOSTANDARD = LVCMOS33; #Bank = 0, pin name = IO_L34N_GCLK18, Sch name = BTND
#Net "enable" LOC = D9 | IOSTANDARD = LVCMOS33; # Bank = 0, pin name = IO_L34P_GCLK19, Sch name = BTNR

## Leds
Net "data_rx<0>" LOC = U16 | IOSTANDARD = LVCMOS33; #Bank = 2, pin name = IO_L2P_CMPCLK, Sch name = LD0
Net "data_rx<1>" LOC = V16 |  IOSTANDARD = LVCMOS33; #Bank = 2, pin name = IO_L2N_CMPMOSI, Sch name = LD1
Net "data_rx<2>" LOC = U15 |  IOSTANDARD = LVCMOS33; #Bank = 2, pin name = IO_L5P, Sch name = LD2
Net "data_rx<3>" LOC = V15 |  IOSTANDARD = LVCMOS33; #Bank = 2, pin name = IO_L5N, Sch name = LD3
Net "data_rx<4>" LOC = M11 |  IOSTANDARD = LVCMOS33; #Bank = 2, pin name = IO_L15P, Sch name = LD4
Net "data_rx<5>" LOC = N11 |  IOSTANDARD = LVCMOS33; #Bank = 2, pin name = IO_L15N, Sch name = LD5
Net "data_rx<6>" LOC = R11 |  IOSTANDARD = LVCMOS33; #Bank = 2, pin name = IO_L16P, Sch name = LD6
Net "data_rx<7>" LOC = T11 |  IOSTANDARD = LVCMOS33; #Bank = 2, pin name = IO_L16N_VREF, Sch name = LD7

## Switches
Net "data_tx<0>" LOC = T10 | IOSTANDARD = LVCMOS33; #Bank = 2, pin name = IO_L29N_GCLK2, Sch name = SW0
Net "data_tx<1>" LOC = T9 | IOSTANDARD = LVCMOS33; #Bank = 2, pin name = IO_L32P_GCLK29, Sch name = SW1
Net "data_tx<2>" LOC = V9 | IOSTANDARD = LVCMOS33; #Bank = 2, pin name = IO_L32N_GCLK28, Sch name = SW2
Net "data_tx<3>" LOC = M8 | IOSTANDARD = LVCMOS33; #Bank = 2, pin name = IO_L40P, Sch name = SW3
Net "data_tx<4>" LOC = N8 | IOSTANDARD = LVCMOS33; #Bank = 2, pin name = IO_L40N, Sch name = SW4
Net "data_tx<5>" LOC = U8 | IOSTANDARD = LVCMOS33; #Bank = 2, pin name = IO_L41P, Sch name = SW5
Net "data_tx<6>" LOC = V8 | IOSTANDARD = LVCMOS33; #Bank = 2, pin name = IO_L41N_VREF, Sch name = SW6
Net "data_tx<7>" LOC = T5 | IOSTANDARD = LVCMOS33; #Bank = MISC, pin name = IO_L48N_RDWR_B_VREF_2, Sch name = SW7


##JB
Net "sclk" LOC = K2 | IOSTANDARD = LVCMOS33; #Bank = 3, pin name = IO_L38P_M3DQ2, Sch name = JB1
Net "mosi" LOC = K1 | IOSTANDARD = LVCMOS33; #Bank = 3, pin name = IO_L38N_M3DQ3, Sch name = JB2
Net "miso" LOC = L4 | IOSTANDARD = LVCMOS33; #Bank = 3, pin name = IO_L39P_M3LDQS, Sch name = JB3
Net "ss_l" LOC = L3 | IOSTANDARD = LVCMOS33; #Bank = 3, pin name = IO_L39N_M3LDQSN, Sch name = JB4
#Net "JB<4>" LOC = J3 | IOSTANDARD = LVCMOS33; #Bank = 3, pin name = IO_L40P_M3DQ6, Sch name = JB7
#Net "JB<5>" LOC = J1 | IOSTANDARD = LVCMOS33; #Bank = 3, pin name = IO_L40N_M3DQ7, Sch name = JB8
#Net "JB<6>" LOC = K3 | IOSTANDARD = LVCMOS33; #Bank = 3, pin name = IO_L42N_GCLK24_M3LDM, Sch name = JB9
#Net "JB<7>" LOC = K5 | IOSTANDARD = LVCMOS33; #Bank = 3, pin name = IO_L43N_GCLK22_IRDY2_M3CASN, Sch name = JB10

master

Code:

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity spi_master is
  generic(
    fpga_clk_freq_hz : integer := 50000000;    -- Internal FPGA clock rate
    sclk_freq_hz     : integer := 25000000;    -- Spi Clock rate
    cpol             : std_logic := '0';  -- Clock Polarity
    cpha             : std_logic := '1' -- Clock Phase
  );
  port(
    -- Internal (FPGA) interface
    fpga_clock  : in  std_logic;  -- input clock
    enable      : in  std_logic;  -- initiates SPI transaction
    reset       : in  std_logic;  -- fpga_clock-sychronous reset
    busy        : out std_logic;  -- indicates SPI transaction in progress
    done        : out std_logic;  -- indicates SPI transaction complete
    data_tx     : in  std_logic_vector(7 downto 0);  -- data to transmit
    data_rx     : out std_logic_vector(7 downto 0);  -- data received

    -- External (SPI) interface
    sclk        : out std_logic;  -- Serial Clock
    mosi        : out std_logic;  -- Master Output, Slave Input
    miso        : in  std_logic;  -- Master Input, Slave Output
    ss_l        : out std_logic   -- Slave Select (active low)
  );
end spi_master;

architecture spi_master_arch of spi_master is
  constant clk_counter_max : integer := fpga_clk_freq_hz / (sclk_freq_hz * 2);
  signal   clk_counter     : integer range 1 to clk_counter_max := 1;
  signal   done_reg        : std_logic := '0';
  signal   data_tx_reg     : std_logic_vector(9 downto 0) := "0000011100";
  signal   data_rx_reg     : std_logic_vector(7 downto 0) := "01110000";
  signal   sclk_reg        : std_logic := cpol;
  signal   mosi_reg        : std_logic := '0';
  signal   ss_l_reg        : std_logic := '1';
begin  -- architecture spi_master_arch of spi_master
  done    <= done_reg;
  data_rx <= data_rx_reg;
  sclk    <= sclk_reg;
  mosi    <= data_tx_reg(9);
  ss_l    <= ss_l_reg;
  busy    <= not ss_l_reg;  -- simplification for now, may be changed later

  -- TODO(aray): oh crap clean up the nesting (See also TODO:Learn VHDL)
  clk_div : process(fpga_clock)
  begin
    if rising_edge(fpga_clock) then
      done_reg <= '0';  -- unless otherwise set below
      if (reset = '1') then  -- synchronous reset
        sclk_reg <= cpol;
        ss_l_reg <= '1';
      else
        if (ss_l_reg = '1') then  -- if no transaction in progress...
          if (enable = '1') then  -- ...start a new transaction
            clk_counter <= 1;
            if (cpha = '0') then
              data_tx_reg <= data_tx(7 downto 0) & '1' & '0';
            else  -- cpha = '1'
              data_tx_reg <= data_tx(7) & data_tx(7 downto 0) & '1';
            end if;
            sclk_reg <= cpol;
            ss_l_reg <= '0';
          end if;
        else  -- transaction in progress
          if (clk_counter = clk_counter_max) then -- tick
            clk_counter <= 1;
            sclk_reg <= not sclk_reg;
            if ((cpol xor cpha xor sclk_reg) = '1') then  -- switch data time!
              if (data_tx_reg(8 downto 0) = "100000000") then  -- end of byte
                if (enable = '1') then -- continue transaction
                  data_tx_reg <= data_tx(7 downto 0) & '1' & '0';
                else
                  ss_l_reg <= '1';  -- done with transaction
                  sclk_reg <= cpol;
                end if;
                done_reg <= '1';  -- blip for one clk_div (sequential assignment)
              else  -- continue transaction
                data_tx_reg <= data_tx_reg(8 downto 0) & '0';  -- shift out
              end if;
            else  -- sample time!
              data_rx_reg <= data_rx_reg(6 downto 0) & miso;  -- shift in
            end if;
          else  -- count up to tick...
            clk_counter <= clk_counter + 1;
          end if;
        end if;
      end if;
    end if;
  end process clk_div;
end architecture spi_master_arch;

ucf

Code:

##Clock signal
Net "fpga_clock" LOC=V10 | IOSTANDARD=LVCMOS33;
Net "fpga_clock" TNM_NET = sys_clk_pin;
TIMESPEC TS_sys_clk_pin = PERIOD sys_clk_pin 100000 kHz;


## Buttons
Net "busy" LOC = B8 | IOSTANDARD = LVCMOS33; #Bank = 0, pin name = IO_L33P, Sch name = BTNS
#Net "b" LOC = A8 | IOSTANDARD = LVCMOS33; #Bank = 0, pin name = IO_L33N, Sch name = BTNU
Net "done" LOC = C4 | IOSTANDARD = LVCMOS33; #Bank = 0, pin name = IO_L1N_VREF, Sch name = BTNL
Net "reset" LOC = C9 | IOSTANDARD = LVCMOS33; #Bank = 0, pin name = IO_L34N_GCLK18, Sch name = BTND
Net "enable" LOC = D9 | IOSTANDARD = LVCMOS33; # Bank = 0, pin name = IO_L34P_GCLK19, Sch name = BTNR

## Leds
Net "data_rx<0>" LOC = U16 | IOSTANDARD = LVCMOS33; #Bank = 2, pin name = IO_L2P_CMPCLK, Sch name = LD0
Net "data_rx<1>" LOC = V16 |  IOSTANDARD = LVCMOS33; #Bank = 2, pin name = IO_L2N_CMPMOSI, Sch name = LD1
Net "data_rx<2>" LOC = U15 |  IOSTANDARD = LVCMOS33; #Bank = 2, pin name = IO_L5P, Sch name = LD2
Net "data_rx<3>" LOC = V15 |  IOSTANDARD = LVCMOS33; #Bank = 2, pin name = IO_L5N, Sch name = LD3
Net "data_rx<4>" LOC = M11 |  IOSTANDARD = LVCMOS33; #Bank = 2, pin name = IO_L15P, Sch name = LD4
Net "data_rx<5>" LOC = N11 |  IOSTANDARD = LVCMOS33; #Bank = 2, pin name = IO_L15N, Sch name = LD5
Net "data_rx<6>" LOC = R11 |  IOSTANDARD = LVCMOS33; #Bank = 2, pin name = IO_L16P, Sch name = LD6
Net "data_rx<7>" LOC = T11 |  IOSTANDARD = LVCMOS33; #Bank = 2, pin name = IO_L16N_VREF, Sch name = LD7

## Switches
Net "data_tx<0>" LOC = T10 | IOSTANDARD = LVCMOS33; #Bank = 2, pin name = IO_L29N_GCLK2, Sch name = SW0
Net "data_tx<1>" LOC = T9 | IOSTANDARD = LVCMOS33; #Bank = 2, pin name = IO_L32P_GCLK29, Sch name = SW1
Net "data_tx<2>" LOC = V9 | IOSTANDARD = LVCMOS33; #Bank = 2, pin name = IO_L32N_GCLK28, Sch name = SW2
Net "data_tx<3>" LOC = M8 | IOSTANDARD = LVCMOS33; #Bank = 2, pin name = IO_L40P, Sch name = SW3
Net "data_tx<4>" LOC = N8 | IOSTANDARD = LVCMOS33; #Bank = 2, pin name = IO_L40N, Sch name = SW4
Net "data_tx<5>" LOC = U8 | IOSTANDARD = LVCMOS33; #Bank = 2, pin name = IO_L41P, Sch name = SW5
Net "data_tx<6>" LOC = V8 | IOSTANDARD = LVCMOS33; #Bank = 2, pin name = IO_L41N_VREF, Sch name = SW6
Net "data_tx<7>" LOC = T5 | IOSTANDARD = LVCMOS33; #Bank = MISC, pin name = IO_L48N_RDWR_B_VREF_2, Sch name = SW7


##JB
Net "sclk" LOC = K2 | IOSTANDARD = LVCMOS33; #Bank = 3, pin name = IO_L38P_M3DQ2, Sch name = JB1
Net "mosi" LOC = K1 | IOSTANDARD = LVCMOS33; #Bank = 3, pin name = IO_L38N_M3DQ3, Sch name = JB2
Net "miso" LOC = L4 | IOSTANDARD = LVCMOS33; #Bank = 3, pin name = IO_L39P_M3LDQS, Sch name = JB3
Net "ss_l" LOC = L3 | IOSTANDARD = LVCMOS33; #Bank = 3, pin name = IO_L39N_M3LDQSN, Sch name = JB4
#Net "JB<4>" LOC = J3 | IOSTANDARD = LVCMOS33; #Bank = 3, pin name = IO_L40P_M3DQ6, Sch name = JB7
#Net "JB<5>" LOC = J1 | IOSTANDARD = LVCMOS33; #Bank = 3, pin name = IO_L40N_M3DQ7, Sch name = JB8
#Net "JB<6>" LOC = K3 | IOSTANDARD = LVCMOS33; #Bank = 3, pin name = IO_L42N_GCLK24_M3LDM, Sch name = JB9
#Net "JB<7>" LOC = K5 | IOSTANDARD = LVCMOS33; #Bank = 3, pin name = IO_L43N_GCLK22_IRDY2_M3CASN, Sch name = JB10

what is the problem?!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

KlausST · May 7, 2016

Hi,

for first tests
* I don´t recommend that high spi clock frequency.
* Set your data switches to 0x95

Do you have a scope to check the signals of the master?
If yes, then disconnect the slave and just check the SPI output signals of the master.

Klaus

Ananhasaasneh77 · May 7, 2016

Code:

 Set your data switches to 0x95..

how to do that?

Do you have a scope to check the signals of the master?

no i dont have

KlausST · May 7, 2016

Hi,

how to do that?

You want to transmit "data". I assume the "data" value is set with switches. Isn´t it?
Adjust your switches ... or just make to send a known "data" pattern.

0x95 makes it easy to detect
* bit shifts
* bit order
* bit count
* timing

no i dont have

Then how do you decide to debug your hardware?
Try and error? Without getting a clue WHY or WHAT is wrong?

If it is possible to slow down the SPI clock frequency to 1Hz (or similar) you could use LEDs and your eyes.. Good luck!

Klaus

Ananhasaasneh77 · May 7, 2016

You want to transmit "data". I assume the "data" value is set with switches. Isn´t it?

yes that all i want .. siwtch transmit data from master to slave fpga .. and same for slave.

If it is possible to slow down the SPI clock frequency to 1Hz (or similar) you could use LEDs and your eyes.. Good luck!

do u mean to set sclk_freq_hz : integer := 2500000 .... to

sclk_freq_hz : integer := 1?

- - - Updated - - -

0x95 makes it easy to detect
* bit shifts
* bit order
* bit count
* timing

how to do that

KlausST · May 7, 2016

Hi,

do u mean to set sclk_freq_hz : integer := 2500000 .... to
sclk_freq_hz : integer := 1?

Yes...

how to do that

See the binary bit pattern: 0b 1001 0101

and compare it with that what you see.

Klaus

Welcome to EDAboard.com

transfer data from spi master to spi slave.

Member level 2

Super Moderator

Member level 2

Super Moderator

Member level 2

Super Moderator

Member level 2

Member level 5

Member level 2

Super Moderator

Member level 2

Super Moderator

Member level 2

Super Moderator

Member level 2

Advanced Member level 7

Member level 2

Advanced Member level 7

Member level 2

Advanced Member level 7

Similar threads

Part and Inventory Search

Welcome to EDABoard.com

Sponsor