Hi Linch,
of course, you can request more than 256 I/O ports for one BAR, but there is no guarantee that it will work in your case.
The best results (4K of I/O, possibly more) can be obtained under MS-DOS on old motherboards with a minimum of peripherals. The worst results (from 256 bytes of I/O down to 0) come with the latest Windows, or with a heap of hardware in the computer.
As far as I can see, violating the PCI bus specification is common practice among designers of small-series products, but obeying the 256-byte I/O limit for one BAR seems more important to me than, for example, implementing the parity signal, the latency timer, etc.
IMHO, using so many I/O ports, or non-burst memory instead of them, isn't advisable, because it complicates the design and may cause a shortage of I/O ports during Plug & Play. If you are going to use a large number of I/O ports, I presume the best way to do so is to implement several BARs, each no larger than 256 bytes.
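To make the "several small BARs" idea concrete, here is a rough sketch in C of how a port budget could be split into I/O BARs of at most 256 bytes each. The function name plan_io_bars and the constants are my own, made up for illustration; it only does the arithmetic (power-of-two sizes, a 4-byte minimum for an I/O BAR, six BAR slots in a type 0 header) and says nothing about whether your BIOS or OS will actually assign the ranges.

```c
#include <assert.h>
#include <stddef.h>

#define PCI_MAX_BARS     6u    /* a type 0 configuration header has six BAR slots */
#define IO_BAR_MAX_BYTES 256u  /* per-BAR I/O limit discussed above */
#define IO_BAR_MIN_BYTES 4u    /* smallest I/O BAR: address bits 1:0 are not writable */

/* Round n up to the next power of two, since a BAR size must be a power of two. */
static unsigned round_up_pow2(unsigned n)
{
    unsigned p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

/*
 * Hypothetical helper: split total_ports bytes of I/O space into BARs of
 * at most 256 bytes each. Stores each BAR size in sizes[] and returns the
 * number of BARs used, or -1 if the request does not fit into six BARs.
 */
static int plan_io_bars(unsigned total_ports, unsigned sizes[PCI_MAX_BARS])
{
    int count = 0;

    assert(sizes != NULL);

    while (total_ports > 0) {
        unsigned chunk, size;

        if (count == (int)PCI_MAX_BARS)
            return -1;  /* all six BAR slots are used and ports remain */

        /* Take up to 256 bytes for this BAR. */
        chunk = total_ports > IO_BAR_MAX_BYTES ? IO_BAR_MAX_BYTES : total_ports;

        /* The BAR must decode a power-of-two range of at least 4 bytes. */
        size = round_up_pow2(chunk);
        if (size < IO_BAR_MIN_BYTES)
            size = IO_BAR_MIN_BYTES;

        sizes[count++] = size;
        total_ports -= chunk;
    }
    return count;
}
```

For example, asking for 600 ports gives three BARs of 256, 256 and 128 bytes, while asking for 2048 ports returns -1, because six BARs of 256 bytes cover only 1536 bytes. In a real design some of those six slots are usually needed for memory BARs too, so the practical ceiling is lower.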