There exists a fifo implementation that would work with a depth of 16. For this to occur, the read latency must be 0. This is a "FWFT" fifo -- the data-out port always displays the top of the fifo (when not empty). thus on the 16th cycle, the full flag occurs and the read signal is asserted. The valid data is already on the output. the 16th element is removed from the fifo at the same time the 17th is added and the 15th (now top) is moved to the data out port.
Likewise, a fifo with registered outputs would only need to have a logical depth of 15 -- eg, the full flag asserted when 15 elements are in the fifo. a read at this time would move the 15th element out to a register stage for the 16th cycle of delay.
Because this situation does not make meaningful use of the flow control signals, a shift register is all that is really being implemented. The full/empty/read/write are all just things that you are trying to satisfy to get the desired shift-register. The underlying shift register only requires 16 elements.
Likewise, I'm sure there are also fifo implementations that would require 17 elements to work reliably.