1. PMOS devices have lower carrier mobility than NMOS devices. In a NAND gate the PMOS transistors are in parallel, which helps compensate for the lower mobility during charging, whereas the NMOS transistors have higher mobility, which offsets their series resistance during discharging. In other words, charging and discharging are roughly balanced in this case.
2. In a NOR gate this is not the case: discharging the output capacitance is very fast (higher electron mobility plus parallel NMOS), but charging through the series PMOS is slow, so the rise time is poor.
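The balance argument above can be checked with a rough switch-resistance model. This is only a sketch: it assumes all transistors have the same width, that an NMOS has on-resistance R, and that a PMOS has about 2R because of its lower mobility (the factor of 2 is an assumption, not a measured value). The function name `worst_case_resistance` is made up for illustration.

```python
def worst_case_resistance(gate, n_inputs=2, r_n=1.0, r_p=2.0):
    """Return (pull_up, pull_down) worst-case resistance for equal-width devices.

    r_n and r_p are the assumed on-resistances of one NMOS and one PMOS.
    """
    if gate == "nand":
        pull_up = r_p                # worst case: only one of the parallel PMOS is on
        pull_down = n_inputs * r_n   # all series NMOS must conduct
    elif gate == "nor":
        pull_up = n_inputs * r_p     # all series PMOS must conduct
        pull_down = r_n              # worst case: only one of the parallel NMOS is on
    else:
        raise ValueError("gate must be 'nand' or 'nor'")
    return pull_up, pull_down


for gate in ("nand", "nor"):
    up, down = worst_case_resistance(gate)
    print(gate, "pull-up:", up, "pull-down:", down, "ratio:", up / down)
```

Under these assumptions the 2-input NAND comes out balanced (ratio 1), while the 2-input NOR has a pull-up four times weaker than its pull-down.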
It is basically related to gate sizing. Let's say you have a library whose basic inverter is sized 2:1 (Wp/Wn). To keep the same drive strength in a 2-input NAND gate, you use two parallel PMOS devices of width Wp in the pull-up network and two series NMOS devices of width 2Wn in the pull-down network.
In a 2-input NOR gate you have to use two series PMOS devices of width 2Wp (that is, 4Wn) in the pull-up network with two parallel NMOS devices of width Wn. So the NOR gate uses more area.
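To put numbers on the area difference, here is a small sketch of the usual sizing rule: devices in a series stack are widened by the stack depth so the stack matches the inverter's drive, while parallel devices keep the inverter width. It assumes Wp = 2Wn for the reference inverter, and the helper `gate_widths` is hypothetical.

```python
def gate_widths(gate, n_inputs=2, wn=1.0, ratio=2.0):
    """Return (pmos_width, nmos_width, total_width) for an n-input gate.

    wn is the inverter NMOS width and ratio is the inverter Wp/Wn (assumed 2).
    """
    wp = ratio * wn
    if gate == "nand":
        pmos_w = wp              # parallel pull-up: inverter width is enough
        nmos_w = n_inputs * wn   # series pull-down: widen by stack depth
    elif gate == "nor":
        pmos_w = n_inputs * wp   # series pull-up: widen by stack depth
        nmos_w = wn              # parallel pull-down: inverter width is enough
    else:
        raise ValueError("gate must be 'nand' or 'nor'")
    total = n_inputs * (pmos_w + nmos_w)
    return pmos_w, nmos_w, total


print("NAND2:", gate_widths("nand"))
print("NOR2: ", gate_widths("nor"))
```

With these assumptions the 2-input NAND needs a total width of 8Wn and the 2-input NOR needs 10Wn, and the gap grows with the number of inputs.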
In the design and implementation of a logic circuit, it becomes easier if the expressions are in SOP (sum of products) form rather than POS (product of sums) form.
A NAND gate implementation of the SOP form is straightforward, but a POS implementation using NOR gates is a bit more involved, and of course many designers will go for NAND gates to implement their circuits since that is easier, hence superior!
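As an illustration of why SOP maps directly onto NAND gates, the sketch below checks by brute force that f = AB + CD equals NAND(NAND(A, B), NAND(C, D)), which follows from De Morgan's law. The example function and the helper names are chosen here for illustration only.

```python
from itertools import product


def nand(*inputs):
    """Logical NAND of any number of 0/1 inputs."""
    return int(not all(inputs))


def sop(a, b, c, d):
    """Reference sum-of-products form: f = AB + CD."""
    return int((a and b) or (c and d))


def nand_nand(a, b, c, d):
    """Same function built from NAND gates only (two levels)."""
    return nand(nand(a, b), nand(c, d))


# Compare both forms over all 16 input combinations.
mismatches = [bits for bits in product((0, 1), repeat=4)
              if sop(*bits) != nand_nand(*bits)]
print("mismatches:", mismatches)
```

Each AND term becomes a first-level NAND and the OR becomes a second-level NAND, so the gate count and structure stay the same as in the AND-OR form.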
In FF-based designs, data is mostly sampled at the rising edge of the clock, so the 0->1 transition should be fast. Since the PMOS transistors in a NAND gate are in parallel, the 0->1 transition is faster due to the lower effective pull-up resistance.
Imagine connecting PMOS in series: because each one is already almost double the width of an NMOS and has lower mobility, it will take much longer for the output to change to 1.
Also, PMOS is better at passing logic 1 than logic 0, whereas NMOS is better at passing logic 0.