I'm not sure this is entirely right, but from my own derivation it works out like this:
Av1 = gm1*ro1 and Av2 = gm2*ro2; the first (dominant) pole is at about 1/(ro1*gm2*ro2*Cc) in rad/s, where
gm1, gm2 are the first- and second-stage transconductances
ro1, ro2 are the first- and second-stage output resistances
Cc is the Miller compensation capacitance
So GBW is about Av1*Av2*pole1 ~ gm1/Cc.
That is, to raise GBW you increase the first-stage transconductance and reduce the Miller compensation capacitance.
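As a quick sanity check on the expressions above, here is a minimal Python sketch that plugs numbers into them. The function name and the component values are made up for illustration only; they do not come from any real design.

```python
import math

def two_stage_estimates(gm1, ro1, gm2, ro2, cc):
    """Hand-analysis estimates for a Miller-compensated two-stage opamp.

    Returns (dc_gain, pole1, gbw), with pole1 and gbw in rad/s.
    """
    av1 = gm1 * ro1                         # first-stage gain
    av2 = gm2 * ro2                         # second-stage gain
    pole1 = 1.0 / (ro1 * gm2 * ro2 * cc)    # dominant pole set by Miller-multiplied Cc
    gbw = av1 * av2 * pole1                 # collapses to gm1 / cc
    return av1 * av2, pole1, gbw

# Hypothetical values: gm in S, ro in ohm, Cc in F
gain, pole1, gbw = two_stage_estimates(
    gm1=1e-3, ro1=100e3, gm2=2e-3, ro2=50e3, cc=1e-12
)
gbw_hz = gbw / (2 * math.pi)  # same GBW expressed in Hz

print(f"DC gain = {gain:.3g} V/V")
print(f"pole1   = {pole1:.3g} rad/s")
print(f"GBW     = {gbw:.3g} rad/s = {gbw_hz:.3g} Hz")
```

Note that ro1, gm2 and ro2 cancel out of the product, which is why only gm1 and Cc are left in the GBW estimate.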
For the gain, you need gm1*ro1*gm2*ro2 to be very large. Because gm1 is proportional to sqrt(I1) and ro1 is proportional to 1/(lambda1*I1), and likewise for gm2 and ro2, the gain is proportional to 1/[lambda1*lambda2*sqrt(I1*I2)], so that is the quantity you have to make large.
lambda is the channel-length modulation coefficient of the MOS transistor. A longer channel length gives a smaller lambda.
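The scaling of the gain with bias current and lambda can be checked numerically. This is only a sketch under the simple square-law model, assuming gm = sqrt(2*k*I) and ro = 1/(lambda*I). Here k stands for mu*Cox*W/L and is held fixed, which implies W is scaled together with L. All names and values are hypothetical.

```python
import math

def stage_gain(k, lam, i_d):
    """Intrinsic gain gm*ro of one stage under the square-law model (assumed)."""
    gm = math.sqrt(2.0 * k * i_d)   # gm grows as sqrt(I)
    ro = 1.0 / (lam * i_d)          # ro falls as 1/(lambda*I)
    return gm * ro                  # net dependence: 1/(lambda*sqrt(I))

# Hypothetical values: k in A/V^2, lambda in 1/V, current in A
g_low = stage_gain(k=1e-3, lam=0.1, i_d=10e-6)    # baseline
g_high = stage_gain(k=1e-3, lam=0.1, i_d=40e-6)   # 4x the bias current
g_long = stage_gain(k=1e-3, lam=0.05, i_d=10e-6)  # longer channel, half the lambda

print(f"baseline gain     : {g_low:.1f}")
print(f"4x current        : {g_high:.1f}")
print(f"half the lambda   : {g_long:.1f}")
```

In this model, quadrupling the current halves the stage gain, while halving lambda doubles it. That is the tension with GBW: a larger gm1 usually costs more current, which lowers the gain.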
In the end, it is a tradeoff among all of these parameters.
I think it is generally very hard to build a two-stage opamp with large bandwidth compared with other structures.
Alvays