Originally, Fourier introduced his transform for solving linear differential equations (LDEs).
As you know, LDEs and systems of LDEs describe continuous-time linear systems such as RLC circuits, that is, circuits consisting of resistors, inductors and capacitors.
By the way, convolution is the integral representation of a system of LDEs: the output is the input convolved with the system's impulse response.
The idea was to represent a signal as a sum of "easy to use" signals, find the solution for each of these signals, and combine the results. These "easy to use" signals are the eigenfunctions of the system to be solved. For LDEs with constant coefficients they are complex exponentials, A*exp(j*w*t). These functions keep their shape when passing through a linear system. Only their parameters (magnitude and phase) change.
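To make the eigenfunction property concrete, here is a minimal sketch. It assumes a first-order RC low-pass described by RC*dy/dt + y = x, which is my own choice of example, and the component values and names (`rc`, `omega`, `rc_response`) are made up for illustration. It checks numerically that the input exp(j*omega*t) produces the same exponential scaled by one complex number.

```python
import numpy as np

def rc_response(omega, rc):
    """Complex gain of the assumed RC low-pass at angular frequency omega."""
    return 1.0 / (1.0 + 1j * omega * rc)

rc = 1e-3                      # assumed time constant R*C, in seconds
omega = 2.0 * np.pi * 100.0    # assumed test frequency, in rad/s

t = np.linspace(0.0, 0.05, 20001)
x = np.exp(1j * omega * t)     # complex exponential input

# Candidate output: the same exponential, only scaled by a complex constant.
H = rc_response(omega, rc)
y = H * x

# Plug the candidate into the differential equation RC*dy/dt + y = x.
# The derivative is approximated by finite differences, so the residual is
# small but not exactly zero; the end points use a cruder formula and are skipped.
dy_dt = np.gradient(y, t)
residual = np.max(np.abs(rc * dy_dt + y - x)[1:-1])

print("gain magnitude:", abs(H))
print("phase shift (rad):", np.angle(H))
print("max residual of the LDE:", residual)
```

The shape of the signal is untouched. The whole effect of the system at this frequency is the single complex number `H`.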
So, if we represent the input signal as a sum of complex exponentials (this is its forward Fourier transform), the set of LDEs becomes a set of linear algebraic equations. Solving it and combining the output exponentials (remember, their magnitudes and phases have changed), which is the inverse Fourier transform, gives us the output signal.
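The recipe above can be sketched in a few lines. This is only an illustration under assumptions of my own: the same hypothetical RC low-pass (RC*dy/dt + y = x), a periodic input that fits an integer number of cycles into the analysis window, and NumPy's FFT standing in for the Fourier transform. In the Fourier domain the differential equation becomes one algebraic equation per frequency, Y = H*X.

```python
import numpy as np

rc = 0.01          # assumed time constant R*C, in seconds
period = 1.0       # length of the analysis window, in seconds
n = 1024           # number of samples in the window
t = np.arange(n) * period / n

# Periodic test input: two sinusoids with whole numbers of cycles per window.
f1, f2 = 5.0, 20.0
x = np.cos(2 * np.pi * f1 * t) + 0.5 * np.sin(2 * np.pi * f2 * t)

# Forward transform: decompose the input into complex exponentials.
X = np.fft.fft(x)

# One algebraic equation per frequency: (1 + j*omega*RC) * Y = X.
omega = 2 * np.pi * np.fft.fftfreq(n, d=period / n)
H = 1.0 / (1.0 + 1j * omega * rc)
Y = H * X

# Inverse transform: recombine the scaled exponentials into the output.
y = np.fft.ifft(Y).real

# Reference: steady-state response worked out by hand for each sinusoid.
H1 = 1.0 / (1.0 + 1j * 2 * np.pi * f1 * rc)
H2 = 1.0 / (1.0 + 1j * 2 * np.pi * f2 * rc)
y_expected = (H1 * np.exp(2j * np.pi * f1 * t)).real \
    + 0.5 * (H2 * np.exp(2j * np.pi * f2 * t)).imag

print("max difference:", np.max(np.abs(y - y_expected)))
```

Note that this gives the steady-state (periodic) response. Transients from switching the input on are not covered by this simple sketch.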
This is the connection between these terms:
Convolution <--> System of LDEs <-- (Fourier transform) --> System of linear equations in the Fourier domain
I think it's better to look up a more detailed description in a textbook.
The same holds for discrete time, with the following changes:
convolution becomes discrete convolution
system of LDEs becomes system of finite difference equations
Fourier transform --> discrete Fourier transform
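The three discrete-time descriptions can be compared side by side. The sketch below assumes a simple first-order difference equation, y[n] = a*y[n-1] + (1-a)*x[n], picked only for illustration (the function name and the value of `a` are made up). It computes the output by running the recursion, by discrete convolution with the impulse response, and by multiplying DFTs.

```python
import numpy as np

def difference_equation(x, a):
    """Run y[n] = a*y[n-1] + (1-a)*x[n], starting from y[-1] = 0."""
    y = np.zeros(len(x))
    prev = 0.0
    for n, xn in enumerate(x):
        prev = a * prev + (1.0 - a) * xn
        y[n] = prev
    return y

rng = np.random.default_rng(0)
n_samples = 64
a = 0.8                                   # assumed filter coefficient
x = rng.standard_normal(n_samples)        # arbitrary test input

# 1) Finite difference equation, solved step by step.
y_recursive = difference_equation(x, a)

# 2) Discrete convolution with the impulse response h[n] = (1-a)*a**n.
#    Truncating h to n_samples terms is exact for the first n_samples outputs.
h = (1.0 - a) * a ** np.arange(n_samples)
y_conv = np.convolve(x, h)[:n_samples]

# 3) Product of DFTs. Zero-padding to len(x) + len(h) - 1 makes the circular
#    convolution computed by the DFT equal to the ordinary (linear) one.
n_fft = 2 * n_samples - 1
y_fft = np.fft.ifft(np.fft.fft(x, n_fft) * np.fft.fft(h, n_fft)).real[:n_samples]

print("recursion vs convolution:", np.max(np.abs(y_recursive - y_conv)))
print("convolution vs DFT:      ", np.max(np.abs(y_conv - y_fft)))
```

All three routes give the same output up to rounding error, which is the discrete-time version of the chain of connections shown above.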
So, the main reason for introducing the Fourier transform is that it makes linear systems easier to analyze, simulate and handle.
Additionally, it matches, in a way, the intuitive notion of frequency (in audio processing, for example).
Good luck in learning Signals and Systems!