At the maximum rated 20 MHz you only get 5 MIPS. A busy-wait loop takes 2 instructions (3 instruction cycles, because the unconditional branch takes two), so you don't really have time to do USB in software. Even with interrupts, you still have latency to deal with.
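To put rough numbers on that, here is a small sketch of the cycle budget per USB bit. It assumes 4 oscillator clocks per instruction cycle (which is what 20 MHz giving 5 MIPS implies) and uses the standard USB signalling rates of 1.5 Mbit/s (low speed) and 12 Mbit/s (full speed); those rates are not from the post above, and the function names are made up for illustration.

```c
#include <stdint.h>

/* Instruction cycles per second, ASSUMING 4 oscillator clocks per
   instruction cycle (consistent with 20 MHz -> 5 MIPS). */
static uint32_t instr_cycles_per_sec(uint32_t fosc_hz)
{
    return fosc_hz / 4u;
}

/* Instruction cycles available per bit, scaled by 100 so the result
   stays in integer math (333 means 3.33 cycles per bit). */
static uint32_t cycles_per_bit_x100(uint32_t fosc_hz, uint32_t bit_rate_hz)
{
    uint64_t scaled = (uint64_t)instr_cycles_per_sec(fosc_hz) * 100u;
    return (uint32_t)(scaled / bit_rate_hz);
}
```

Calling `cycles_per_bit_x100(20000000u, 1500000u)` gives 333, i.e. about 3.3 instruction cycles per low-speed bit, which is roughly one pass through the 3-cycle busy-wait loop with nothing left over for sampling, bit unstuffing or CRC. At full speed the budget drops below half a cycle per bit.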
You'll need an external USB peripheral. Pick one and someone may be able to help you. Search the board for USB.