Forget about reading lines: you are reading individual pixels. The input stream arrives one pixel at a time, whether it is read from memory or supplied as video.
No, you cannot read an entire row at a time. You don't have the memory bandwidth (or if you do, it would be a very inefficient use of the RAM resources).
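To make the pixel-at-a-time idea concrete, here is a small software model in Python. It is only a sketch, not HDL: it assumes raster-order input, a fixed image width, and a 3x3 window, and the names `stream_windows`, `line1`, `line2` and `win` are made up for illustration. Two row-sized buffers stand in for the on-chip line storage, so each incoming pixel needs only one read and one write per buffer.

```python
def stream_windows(pixels, width):
    """Consume pixels one at a time in raster order and yield
    (centre_row, centre_col, window) for every complete 3x3 window."""
    line1 = [0] * width                  # previous row (hypothetical line buffer)
    line2 = [0] * width                  # the row before that
    win = [[0] * 3 for _ in range(3)]    # 3x3 shift register

    for i, p in enumerate(pixels):
        row, col = divmod(i, width)

        # The three vertically aligned pixels at this column.
        column = (line2[col], line1[col], p)

        # Age the line buffers by one row at this column.
        line2[col] = line1[col]
        line1[col] = p

        # Shift the window left and insert the new column on the right.
        for r in range(3):
            win[r][0] = win[r][1]
            win[r][1] = win[r][2]
            win[r][2] = column[r]

        # The window only holds valid data after two full rows and two columns.
        if row >= 2 and col >= 2:
            yield row - 1, col - 1, [list(r) for r in win]


# Example: a 4x4 image whose pixel values are 0..15.
windows = list(stream_windows(iter(range(16)), 4))
```

At no point does the model look at more than one new pixel per step; everything else comes from the buffers it has already filled.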
The above example assumes that the filter is not separable. Many common 2D filters can be separated into a 3x1 and a 1x3 filter, which together give the 3x3 result. It is better still if the filter is symmetrical. This saves register and multiplier resources.
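The separable case can be checked with a few lines of Python. This is a rough sketch under stated assumptions: the kernel values `[1, 2, 1]` are just an example, the function names are invented here, and only the valid interior of the image is computed (no border handling). The 3x3 kernel is built as the outer product of the two 3-tap kernels, so both routes should give identical results.

```python
def filter_3x3(img, k):
    """Direct 3x3 filter over the valid interior: 9 multiplies per output pixel."""
    h, w = len(img), len(img[0])
    return [[sum(k[i][j] * img[r + i][c + j]
                 for i in range(3) for j in range(3))
             for c in range(w - 2)]
            for r in range(h - 2)]


def filter_separable(img, col_k, row_k):
    """1x3 horizontal pass, then 3x1 vertical pass: 3 + 3 multiplies per output pixel."""
    h, w = len(img), len(img[0])
    horiz = [[sum(row_k[j] * img[r][c + j] for j in range(3))
              for c in range(w - 2)]
             for r in range(h)]
    return [[sum(col_k[i] * horiz[r + i][c] for i in range(3))
             for c in range(w - 2)]
            for r in range(h - 2)]


col_k = [1, 2, 1]                                   # example 3x1 kernel (assumed)
row_k = [1, 2, 1]                                   # example 1x3 kernel (assumed)
k = [[a * b for b in row_k] for a in col_k]         # equivalent 3x3 kernel
img = [[(3 * r + 5 * c) % 7 for c in range(5)] for r in range(4)]

direct = filter_3x3(img, k)
separated = filter_separable(img, col_k, row_k)
```

With a symmetrical kernel such as `[a, b, a]`, each pass can also add the two outer samples before multiplying, `a * (x0 + x2) + b * x1`, which needs two multiplies per pass instead of three.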