Ouch! That's a lot of I/O, man!
Well, you might try a bunch of 3-line serial-to-parallel 8-bit sinking driver ICs like the Allegro A6821 or the pin-for-pin compatible Micrel MIC5821. There are a couple of 'tricks' you can use to load the serial data into a bunch of them in parallel on a bus. For example, connect the DAT input on each of eight driver ICs to its own pin on an 8-bit bus. Tie the CLK pins on all eight driver ICs together and drive them from one more pin, and do the same with the LAT pins. Load 64 bits of data into the eight shift registers by putting eight bytes onto the 8-bit bus one after another, each byte followed by a CLK pulse, then pulse LAT once to latch the new data to the outputs. Those eight bytes need to be formatted with all of the b7 bit data in one byte, all of the b6 bit data in one byte, and so on.
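The byte formatting described above is just an 8x8 bit transpose. Here is a rough sketch in C, not tested on real hardware. The function names are made up for illustration, and it assumes driver N has its DAT pin on bus bit N and that b7 is clocked in first. The second function is only a software model of the eight shift registers so you can check the result on a PC; on the real thing you would write each bus byte to the port and pulse CLK.

```c
#include <stdint.h>

/* Hypothetical helper: data[n] is the byte meant for driver n, whose DAT
 * pin is assumed to sit on bus bit n. bus[0] ends up holding every
 * driver's b7, bus[1] every driver's b6, ... bus[7] every driver's b0. */
void format_bus_bytes(const uint8_t data[8], uint8_t bus[8])
{
    for (int step = 0; step < 8; step++) {
        int bit = 7 - step;              /* assumption: b7 goes out first */
        uint8_t out = 0;
        for (int drv = 0; drv < 8; drv++) {
            if (data[drv] & (1u << bit))
                out |= (uint8_t)(1u << drv);
        }
        bus[step] = out;
    }
}

/* Software model only: each "CLK pulse" shifts one bit from bus line n
 * into register n. After eight pulses regs[n] should equal data[n]. */
void simulate_shift(const uint8_t bus[8], uint8_t regs[8])
{
    for (int drv = 0; drv < 8; drv++)
        regs[drv] = 0;
    for (int step = 0; step < 8; step++) {
        for (int drv = 0; drv < 8; drv++) {
            uint8_t in = (uint8_t)((bus[step] >> drv) & 1u);
            regs[drv] = (uint8_t)((regs[drv] << 1) | in);
        }
    }
}
```

If your drivers end up with the bits in reverse order on the outputs, just flip the `7 - step` so b0 goes out first.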
One possible advantage of using the Allegro or Micrel drivers is that you can drive their Output Enable pins from a PWM signal for brightness control. In the case of a more traditional multiplexed display, you would simply use a PWM period equal to the column or row "scan" period, so you get one PWM cycle per column or row time slot.
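To put rough numbers on the PWM timing, here is a small sketch of the arithmetic in C. The function names, the 0..255 brightness scale, and the example figures are all assumptions for illustration, not anything from the datasheets.

```c
#include <stdint.h>

/* Length of one column (or row) time slot in microseconds, given the
 * full-display refresh rate and the number of columns (or rows) scanned. */
uint32_t scan_slot_us(uint32_t refresh_hz, uint32_t num_slots)
{
    return 1000000u / (refresh_hz * num_slots);
}

/* Time within one slot that the outputs are enabled, for an assumed
 * brightness scale of 0 (off) to 255 (full on). */
uint32_t on_time_us(uint32_t slot_us, uint8_t brightness)
{
    return (slot_us * brightness) / 255u;
}
```

For example, 8 columns refreshed at 100 Hz gives a 1250 us slot, so the PWM period would be 1250 us, and a brightness of 51 (20%) would enable the outputs for 250 us of each slot.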
Good luck with your project. Mike