Look at the Ethernet demo designs that use an Ethernet MAC with SGMII. (If the PHY has FIFO options, you might consider 1000BASE-X.)
There are a few issues. The first is the RX/TX rate mismatch: the RX clock is recovered from the incoming data and the TX clock is generated locally, so the two can differ slightly in frequency. In some of the loopback modes, the TX data is connected directly to the RX data without being clocked into the FPGA first. In that case there is no clock crossing, so the system works fine and the mismatch stays hidden.
For FPGAs, there are two main solutions to this. The first is a packet FIFO. In this case, some extra logic is added to detect packet start/end/error. Data is placed into a FIFO and then sent out only when a full packet is in the FIFO. If you want to reduce latency, you can perform a rate-mismatch calculation and change this into "data is sent when a packet is done or the FIFO holds more than X bytes". The other solution is to match the TX rate to the RX rate by using the PICXO mode of the FPGA (uses a feature of the CPLL along with an FSM to match rates). This gives the lowest latency, but you might need to get the PICXO logic from somewhere else. There is an SDI demo that has one, but I have no idea if it is encrypted.
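As a rough sketch of the rate-mismatch calculation for the "more than X bytes" threshold: assume each clock is within +/-100 ppm (the usual Ethernet tolerance, but check your oscillators), so the worst case is a slow RX clock feeding a fast TX clock. The function name and the sync_margin_bytes parameter are made up for illustration; the margin is a guess at the FIFO clock-crossing latency, not a value from any Xilinx design.

```python
def prefill_threshold_bytes(max_frame_bytes, ppm_per_side=100, sync_margin_bytes=4):
    """Bytes to buffer before starting TX so the FIFO cannot underrun mid-packet."""
    # Worst case: RX is slow by ppm_per_side and TX is fast by the same amount.
    worst_case_ppm = 2 * ppm_per_side
    # While TX drains one full frame, RX falls behind by this many bytes
    # (integer ceiling division, so a fractional byte rounds up).
    slip_bytes = (max_frame_bytes * worst_case_ppm + 999_999) // 1_000_000
    # sync_margin_bytes is an assumed allowance for FIFO synchronizer latency.
    return slip_bytes + sync_margin_bytes


if __name__ == "__main__":
    for frame in (1518, 9018):
        print(frame, "byte frame -> start TX after", prefill_threshold_bytes(frame), "bytes")
```

The point is that the slip over a single frame is tiny (a byte or two even for jumbo frames), so X is dominated by the synchronizer margin rather than by the ppm difference.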
The second part is autonegotiation. SGMII uses symbol duplication (mostly) for the lower speeds, which allows you to detect different rates. You can also poll the PHY using the MIIM (MDIO) interface.
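To make the rate detection concrete, here is a minimal sketch that decodes the 16-bit configuration word a PHY sends to the MAC during SGMII autonegotiation. The bit layout (bit 15 link, bit 12 duplex, bits 11:10 speed) is how I remember it from the Cisco SGMII spec, so treat it as an assumption and check it against your PHY and MAC core documentation. The function and dictionary names are hypothetical.

```python
# Speed field encoding (bits 11:10); 0b11 is reserved.
SPEED_MBPS = {0b00: 10, 0b01: 100, 0b10: 1000}
# How many times each byte is repeated on the link at a given speed.
BYTE_REPEAT = {10: 100, 100: 10, 1000: 1}


def decode_sgmii_config(word):
    """Decode the PHY-to-MAC SGMII autonegotiation config word (assumed layout)."""
    speed = SPEED_MBPS.get((word >> 10) & 0b11)  # None for the reserved code
    return {
        "link_up": bool(word & (1 << 15)),
        "full_duplex": bool(word & (1 << 12)),
        "speed_mbps": speed,
        "byte_repeat": BYTE_REPEAT.get(speed),
    }


if __name__ == "__main__":
    for word in (0x9801, 0x9401, 0x0001):
        print(hex(word), decode_sgmii_config(word))
```

The byte_repeat field is the symbol duplication: at 10 or 100 Mb/s each byte goes out 100 or 10 times so the serial link keeps running at the gigabit rate.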
All of these things have example designs from Xilinx. You can start with the demo, or at least pick out the logic modules you need before rewriting anything.