one method
In the analog world, a method that comes within 1 dB of optimum is:
1. Slice the baseband data signal to hard-limit it.
2. Low-pass filter with a one-pole filter at 3/8 of the bit rate.
3. Square (full-wave rectification will also work but is a few dB worse).
4. Extract the spectral line at the bit rate with a narrow filter.
Can this be approximated in your FPGA?
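As a rough discrete-time illustration of the four steps, here is a minimal Python sketch. It is not FPGA code and not a tuned design. The names (`bit_rate_line`, `sps`, `rel_freq`), the 16 samples per bit, the noise level, and the use of a single-bin DFT in place of the narrow filter of step 4 are all assumptions made for the example.

```python
import cmath
import math
import random


def bit_rate_line(samples, sps, rel_freq=1.0):
    """Apply steps 1-4 to `samples` (sps samples per bit).

    Returns the normalised complex amplitude of the squared signal at
    rel_freq times the bit rate. A single-bin DFT stands in for the
    narrow filter of step 4 (an assumption of this sketch).
    """
    # One-pole low-pass coefficient for a corner at 3/8 of the bit rate.
    alpha = 1.0 - math.exp(-2.0 * math.pi * 0.375 / sps)
    y = 0.0
    acc = 0j
    for n, x in enumerate(samples):
        hard = 1.0 if x >= 0.0 else -1.0   # step 1: slice / hard limit
        y += alpha * (hard - y)            # step 2: one-pole low-pass
        sq = y * y                         # step 3: square
        # step 4: correlate against a tone at rel_freq * bit rate
        acc += sq * cmath.exp(-2j * math.pi * rel_freq * n / sps)
    return acc / len(samples)


# Hypothetical test signal: random NRZ data at +/-1 with mild Gaussian noise.
rng = random.Random(1)
sps = 16
bits = [rng.choice((-1.0, 1.0)) for _ in range(1000)]
signal = [b + rng.gauss(0.0, 0.2) for b in bits for _ in range(sps)]

line = abs(bit_rate_line(signal, sps, 1.0))   # at the bit rate
off = abs(bit_rate_line(signal, sps, 1.3))    # away from the bit rate
print("at bit rate:", line, " off frequency:", off)
```

The component at the bit rate should stand well above the one measured off frequency, and its phase carries the bit timing. In hardware, the slicer is a comparator, the one-pole filter is a shift-and-add accumulator if `alpha` is rounded to a power of two, and the squarer is a single multiplier.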
Can you design the signal format for easier clock recovery (at the expense of wider signal bandwidth)? For instance, use a return-to-zero scheme in which the frequency is shifted up or down from the center to represent a one or a zero, then comes back to the center for a short time to form the return-to-zero part. That way you can slice the signal both above and below center and know that the bit period is completely determined.
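To make the return-to-zero idea concrete, here is a small Python sketch that works directly on frequency-deviation samples, as if they came from an ideal discriminator. The function names, the 8 samples per bit, the 2-sample return to center, and the thresholds at half the deviation are all hypothetical choices for illustration.

```python
def rz_fsk_encode(bits, sps=8, guard=2, dev=1.0):
    """Frequency deviation per sample: +dev for a one, -dev for a zero,
    followed by `guard` samples back at the center frequency."""
    out = []
    for b in bits:
        out += [dev if b else -dev] * (sps - guard) + [0.0] * guard
    return out


def rz_fsk_decode(freq, dev=1.0):
    """Three-level slice of the deviation. Each departure from center
    marks the start of a bit, and its sign gives the bit value."""
    bits, starts = [], []
    prev = 0
    for n, f in enumerate(freq):
        level = 1 if f > dev / 2 else -1 if f < -dev / 2 else 0
        if level != 0 and prev == 0:
            starts.append(n)
            bits.append(1 if level > 0 else 0)
        prev = level
    return bits, starts


data = [1, 1, 0, 0, 0, 1, 0]
decoded, starts = rz_fsk_decode(rz_fsk_encode(data))
print(decoded, starts)
```

Because every bit begins with a departure from center, runs of identical bits still produce a timing edge per bit, which is what makes the bit period directly observable. The cost is the wider bandwidth mentioned above.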