Given how sloppy the thing looks, I recommend that
you start fresh, from your care-abouts and the modern
device options (discrete and/or integrated). Analyze it
from a requirements point of view, not from ancient
history (as edited).
If you are bothered by storage time, look at switching
transistors rather than "general purpose" ones, and among
those pick for the least recovery time. You could perhaps
add a low-Vf Schottky from base to collector to help out
by preventing hard saturation (a bad word for a BJT signal
chain, a good word for MOS and a "well, when you gotta..."
for power).
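
To put rough numbers on the Schottky idea: with the diode's
anode on the base and cathode on the collector, the collector
can only fall about one Schottky drop below the base, so
V_CE sits near V_BE - V_F instead of diving to V_CE(sat).
A minimal sketch of that arithmetic, where every voltage is
an assumed typical value for illustration, not a figure from
any datasheet or from the circuit in question:

```python
def clamped_vce(vbe_on: float, vf_schottky: float) -> float:
    """Approximate V_CE held by a base-to-collector Schottky clamp."""
    return vbe_on - vf_schottky


# All three values below are assumptions (typical small-signal numbers).
VBE_ON = 0.70        # V, base-emitter drop when conducting
VF_SCHOTTKY = 0.30   # V, forward drop of a low-Vf Schottky
VCE_SAT_HARD = 0.10  # V, V_CE in hard saturation without the clamp

vce = clamped_vce(VBE_ON, VF_SCHOTTKY)
margin = vce - VCE_SAT_HARD  # how far the clamp keeps you out of hard saturation

print(f"clamped V_CE ~ {vce:.2f} V, margin ~ {margin:.2f} V")
```

The price is a higher "on" voltage; the payoff is that the
base-collector junction never forward-biases hard enough to
store much charge, so turn-off is quicker.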
Yipping out a 200 ns pulse should not be hard, by various
methods. Rep rate, which parameters you want to be able to
vary, and how "clean" the pulse has to be will all bear on
your choice of method.