Inputs to the chip should be registered. But if you then pass the output of that register down 10 layers of hierarchy, it goes through 10 entities as what is just a wire (and it could be a really short wire - the hierarchy has nothing to do with the final routing), so it doesn't need to be registered again going into every single entity.
It won't hurt to have them all registered, though. It just adds latency (which may be unnecessary).
If you're developing a piece of IP that someone else may use, it might be sensible to register the inputs at the top level at least, just in case the other user forgot to register their outputs.
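To put a number on the latency point, here is a rough cycle-level model in Python (not HDL, just a sketch). It assumes one sample per clock and that each registered hierarchy level delays the signal by exactly one cycle, while unregistered levels are plain wires. The names `simulate` and `latency` are made up for this illustration.

```python
def simulate(samples, registered_levels):
    """Feed one sample per clock through a hierarchy of entities.

    registered_levels is how many levels register their input.
    Levels that do not register are treated as wires (zero delay).
    Returns what the innermost level sees on each cycle.
    """
    regs = [None] * registered_levels  # None = no valid data captured yet
    seen = []
    for s in samples:
        # Value visible at the innermost level during this cycle.
        seen.append(regs[-1] if regs else s)
        # Clock edge: each register captures the value of the stage before it.
        if regs:
            regs = [s] + regs[:-1]
    return seen


def latency(registered_levels):
    """Number of cycles before the first sample reaches the innermost level."""
    samples = list(range(1, registered_levels + 3))
    return simulate(samples, registered_levels).index(samples[0])


if __name__ == "__main__":
    for n in (0, 1, 10):
        print(n, "registered levels ->", latency(n), "cycles of latency")
```

With no registers inside the hierarchy the signal arrives in the same cycle; registering at all 10 levels costs 10 cycles for the same piece of wire.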
So, as a rule, register all outputs; then whatever sits downstream doesn't have to register its inputs again.