Maybe this could still work in a 0.18 µm process with a 1.5 V supply, as the threshold voltages are probably lower than those mentioned in the paper (0.6 µm process).
For this I suggest a Monticelli-type class-AB output stage, which allows strict control of the output stage cross (quiescent) current. See the following excerpt:
Don't forget that with this simple rail-to-rail input stage (complementary NMOS and PMOS pairs) you will not have a constant gm, so your GBW will change over the input common-mode range. This could affect your system, so you should check it! I would also suggest the Monticelli or battery-bias class-AB output stage as well. They are very simple to build and work very well!
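To get a feel for how much the GBW can move, here is a rough Python sketch. It is only a first-order model under stated assumptions, not a substitute for simulation: each input pair switches on and off abruptly, both pairs have the same gm, and GBW = gm / (2π·Cc). The turn-on voltages, the gm values and the compensation capacitor are made-up illustrative numbers, not taken from the paper or from any real 0.18 µm process.

```python
import math


def total_gm(vcm, vdd=1.5, v_n_on=0.6, v_p_headroom=0.6,
             gm_n=1e-3, gm_p=1e-3):
    """Total input-stage gm (in siemens) at common-mode voltage vcm.

    All default values are illustrative assumptions:
    - the NMOS pair is taken as active when vcm >= v_n_on
    - the PMOS pair is taken as active when vcm <= vdd - v_p_headroom
    - each pair switches abruptly (no gradual turn-off region)
    """
    gm = 0.0
    if vcm >= v_n_on:
        gm += gm_n  # NMOS pair contributes near the positive rail and mid-range
    if vcm <= vdd - v_p_headroom:
        gm += gm_p  # PMOS pair contributes near the negative rail and mid-range
    return gm


def gbw_hz(gm, c_comp=2e-12):
    """First-order gain-bandwidth product, GBW = gm / (2*pi*Cc)."""
    return gm / (2 * math.pi * c_comp)


if __name__ == "__main__":
    # Sweep three common-mode points: near ground, mid-supply, near VDD.
    for vcm in (0.1, 0.75, 1.4):
        gm = total_gm(vcm)
        print(f"Vcm = {vcm:.2f} V  gm = {gm * 1e3:.1f} mS  "
              f"GBW = {gbw_hz(gm) / 1e6:.1f} MHz")
```

With these assumed numbers both pairs are on in the middle of the range, so gm (and with it the GBW) is twice what it is near either rail. Also note that if `v_n_on` ends up larger than `vdd - v_p_headroom`, the model returns zero gm in between, which is the dead zone you would have to watch for at a low supply voltage.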
Jgk