lots of theses discuss the minimum word length for preventing overflow.
and they say that it is the same as the average window filter in the frequency domain.
I guess you didn't read it carefully. The word length discussion is not about preventing overflow. It's about keeping the resolution. As you correctly mentioned, the integrator or accumulator will overflow in any case. However, it won't saturate, and that's an important difference: it wraps around. The short Wikipedia article about CIC filters gets the basic point across by comparing it with a moving average filter. After the subtraction operation, or in CIC terms, after pairing an integrator with a differentiator (comb), the average signal level is restored.
Cascaded integrator-comb filter - Wikipedia, the free encyclopedia
As a simple exercise, you can evaluate CIC behaviour with pencil and paper, or more easily, with a spreadsheet calculator.
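The pencil-and-paper exercise can also be sketched in a few lines of Python. This is only an illustration under assumptions of mine, not something taken from the articles: 8-bit signed input, decimation factor 8, an 11-bit two's-complement accumulator, and the helper names wrap, cic1_decimate and block_sums are made up for this post. The accumulator wraps around many times, yet the comb output equals the plain sum over each block of 8 samples.

```python
def wrap(value, bits):
    """Wrap an integer into the two's-complement range of `bits` bits."""
    value &= (1 << bits) - 1
    # If the top bit is set, the value represents a negative number.
    return value - (1 << bits) if value >> (bits - 1) else value


def cic1_decimate(samples, R, bits):
    """First-order CIC decimator: integrator, keep every R-th value, comb."""
    acc = 0    # integrator register, allowed to wrap around
    prev = 0   # previous decimated integrator value (comb delay)
    out = []
    for n, x in enumerate(samples, start=1):
        acc = wrap(acc + x, bits)
        if n % R == 0:
            out.append(wrap(acc - prev, bits))
            prev = acc
    return out


def block_sums(samples, R):
    """Reference: plain sum over consecutive blocks of R samples."""
    return [sum(samples[i:i + R]) for i in range(0, len(samples) - R + 1, R)]


if __name__ == "__main__":
    R, bits = 8, 11                      # 8 input bits + log2(8) = 11 bits
    samples = [100] * 64                 # constant input, overflows quickly
    print(wrap(1100, bits))              # what the accumulator holds after 11 samples
    print(cic1_decimate(samples, R, bits) == block_sums(samples, R))
```

Dividing each output by 8 gives the moving average. The 11-bit width is chosen so that the largest possible block sum still fits. A narrower register would lose the result, while a saturating accumulator would break it at any width.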
The analogy with a moving average filter applies to the first order CIC decimator. The interesting point, and perhaps not obvious at first sight, is that it can be extended to higher orders and still preserve the signal average, although all integrators are continuously overflowing.
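The higher-order case can be checked the same way. The sketch below is my own illustration, not code from the paper: the names register_bits and cic_decimate are hypothetical, the differential delay is fixed to 1, and the register width uses the commonly quoted rule of input bits plus ceil(N*log2(R)). It runs the same N-stage structure once with wrapping registers and once with unbounded Python integers and compares the outputs.

```python
from math import ceil, log2


def wrap(value, bits):
    """Wrap an integer into the two's-complement range of `bits` bits."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def register_bits(in_bits, order, R):
    """Assumed width rule: input bits plus ceil(order * log2(R))."""
    return in_bits + ceil(order * log2(R))


def cic_decimate(samples, order, R, bits=None):
    """N-th order CIC decimator, differential delay 1.

    bits=None uses unbounded integers as an ideal reference.
    """
    fix = (lambda v: wrap(v, bits)) if bits else (lambda v: v)
    integ = [0] * order        # integrator registers (input rate)
    comb_prev = [0] * order    # comb delay registers (output rate)
    out = []
    for n, x in enumerate(samples, start=1):
        v = x
        for i in range(order):             # cascade of integrators
            integ[i] = fix(integ[i] + v)
            v = integ[i]
        if n % R == 0:                     # decimate by R
            for i in range(order):         # cascade of combs
                diff = fix(v - comb_prev[i])
                comb_prev[i] = v
                v = diff
            out.append(v)
    return out


if __name__ == "__main__":
    order, R = 3, 8
    bits = register_bits(8, order, R)
    samples = [100] * 80
    wrapped = cic_decimate(samples, order, R, bits)
    ideal = cic_decimate(samples, order, R)
    print(bits, wrapped == ideal, wrapped[-1])
```

With a constant input the unbounded integrators grow without limit, so the wrapped ones must have overflowed repeatedly, yet both versions deliver the same output. The settled output is the input times the DC gain R^N, here 100 * 512.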
The best description of CIC filters is still the original Hogenauer paper, "An economical class of digital filters for decimation and interpolation", although the mathematical part is somewhat demanding.
Some aspects of CIC filters have been discussed in previous edaboard threads, e.g.
https://www.edaboard.com/threads/127067/