Here's my high-level view of just the basics.
First, all of these Fourier transforms (FTs) are used to convert a signal from the time domain into the frequency domain. It's very important to understand this. There are also inverse transforms for converting from the frequency domain back to the time domain.
Signals in the time domain can be represented mathematically either as functions or as discrete samples. For example, a sinusoidal signal can be x = sin(wt) - a function. Or x(0) = 0, x(1) = .707, x(2) = 1.0, etc. - a set of discrete samples.
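To make the two representations concrete, here is a minimal sketch that evaluates the function form at integer sample times. The value w = pi/4 radians per sample is my assumption, picked only because it reproduces the sample values quoted above.

```python
import math

# Function form: x(t) = sin(w*t).
# w = pi/4 radians per sample is an assumed value for illustration.
w = math.pi / 4

# Discrete form: evaluate the function at n = 0, 1, 2, ...
samples = [round(math.sin(w * n), 3) for n in range(5)]
print(samples)  # [0.0, 0.707, 1.0, 0.707, 0.0]
```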
CFT
The 'C' here means continuous, which means the signal is defined by a function (or a set of functions). If you want to use functions (e.g. sin(wt)) then you need to use the continuous form of the FT (CFT). The result from the CFT is a function (or set of functions) representing the frequency spectrum. So basically, you input a continuous time signal and you get out a continuous frequency spectrum. What does this mean? It just means that you can determine the signal level at any time you choose - since it's a function. It also means that you can determine the amplitude of any single frequency component - since it's a function. This is the advantage of the CFT - it's exact and continuous. The big disadvantage is that it's really hard to compute for arbitrary signals - virtually impossible sometimes.
DTFT
The 'D' here means discrete which means it is just a collection of data points.
Discrete transforms are approximations since you only know the value at discrete times or frequencies.
For a discrete signal, there are no functions describing the input, just a bunch of data. Strictly speaking, the DTFT takes discrete samples in and still gives a continuous frequency spectrum out. If you only have a finite block of N samples (which the math treats as one period of a periodic signal) then the DTFT becomes the DFT. So in this case, you input an array of data points representing time and you get out an array of data points representing frequency. The term 'bins' is typically used to describe these data points. What this means is that a single frequency data point represents a frequency band or bin. So for example, if f(3) = 0.5 and the bins are 1 kHz wide, then the relative power of all the frequencies in that bin is 0.5. If you add up all the bins the total will be 1.0 (normalized & adjusted for phase). You can control the number of bins and their bandwidth by setting the sample frequency and the number of samples you process. The number of bins equals the number of samples, and each bin is (sample frequency / number of samples) wide. So as the number of samples increases at a fixed sample frequency, the number of bins increases and the bandwidth decreases - meaning finer frequency resolution. Increasing the sample frequency on its own widens the range of frequencies you cover, not the resolution.
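Here is a rough sketch of the bins idea using a plain O(N^2) DFT written out by hand. The sample rate `fs`, block length `N` and the 1 kHz test tone are assumed values chosen so the bins come out 1 kHz wide; the normalization (dividing by total power) is just one possible convention.

```python
import cmath
import math

def dft(x):
    """Naive DFT: X[k] = sum over n of x[n] * exp(-2j*pi*k*n/N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

fs = 8000.0          # sample frequency in Hz (assumed)
N = 8                # number of samples processed (assumed)
bin_width = fs / N   # width of each bin in Hz: 1000.0 here

# Time-domain input: a 1 kHz sine, which lands exactly on bin 1.
x = [math.sin(2 * math.pi * 1000.0 * n / fs) for n in range(N)]

X = dft(x)
power = [abs(v) ** 2 for v in X]
total = sum(power)
rel = [p / total for p in power]   # relative power per bin, sums to 1.0

for k, r in enumerate(rel):
    print(f"bin {k} ({k * bin_width:.0f} Hz): {r:.3f}")
```

For a real-valued input the power shows up split between bin 1 and its mirror image, bin N-1, so each of those reads about 0.5 and the rest read about 0.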
The frequency bins can be thought of as bandpass filters with amplitude detectors.
In practice DFTs are often used to detect a single band of frequencies. In this case the math is faster since you only have to calculate values for one bin.
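As a sketch of the one-bin case, the function below computes a single DFT bin by direct summation. Real detectors often use the Goertzel algorithm for this; what follows is the plain sum, not Goertzel. The name `dft_bin` and the values of `fs`, `N` and the 770 Hz test tone are assumptions for illustration.

```python
import cmath
import math

def dft_bin(x, k):
    """Compute only bin k of the DFT of x (O(N) instead of O(N^2))."""
    N = len(x)
    return sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))

fs = 8000.0   # sample frequency in Hz (assumed)
N = 800       # block length (assumed), so each bin is fs/N = 10 Hz wide

# Test input: a 770 Hz tone, which falls exactly on bin 77.
x = [math.sin(2 * math.pi * 770.0 * n / fs) for n in range(N)]

on_tone = abs(dft_bin(x, 77))    # bin centred on 770 Hz, close to N/2 = 400
off_tone = abs(dft_bin(x, 94))   # bin centred on 940 Hz, close to 0
print(on_tone, off_tone)
```

Comparing the magnitude against a threshold then gives a yes/no answer for that one band. The threshold itself would need tuning against real signals.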
However, to say that it is a bandpass filter is incorrect since the output is in the frequency domain. If you need to know the power in a band then it works just fine - for example if you wanted to detect DTMF tones. But in order to make a bandpass filter you would have to run an IDFT on the band to get back to the time domain, and then if you wanted an analog signal you could use a DAC. In practice you have to understand and tune all the aspects of the conversions if you want a usable output.
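A crude sketch of that DFT, keep-one-band, IDFT round trip is below. All the values (`fs`, `N`, the two test tones) are assumed, and both tones are chosen to fall exactly on bins, which is the easy case. With arbitrary signals, zeroing bins block by block produces artifacts, which is part of the tuning mentioned above.

```python
import cmath
import math

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

fs = 8000.0   # sample frequency in Hz (assumed)
N = 64        # block length (assumed), so bins are fs/N = 125 Hz wide

# Input: 1 kHz tone (bin 8) plus 3 kHz tone (bin 24).
x = [math.sin(2 * math.pi * 1000.0 * n / fs) +
     math.sin(2 * math.pi * 3000.0 * n / fs) for n in range(N)]

X = dft(x)

# Keep only the 1 kHz band. For a real signal the mirror bin N-8 must be
# kept too, otherwise the output is no longer real-valued.
keep = {8, N - 8}
Y = [X[k] if k in keep else 0 for k in range(N)]

# Back to the time domain; the imaginary parts are only rounding noise.
y = [v.real for v in idft(Y)]

expected = [math.sin(2 * math.pi * 1000.0 * n / fs) for n in range(N)]
max_err = max(abs(a - b) for a, b in zip(y, expected))
print(max_err)
```

Here `y` should match the 1 kHz tone on its own to within rounding error, so the 3 kHz component has been removed.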
Hope this helps.