hi again,
I've been racking my brains trying to think of a 'one clock' solution (just in my head...). You will still need to XOR the two values, each bit XORed with the corresponding bit in the other value. E.g. for binary numbers 'm' and 'n':
m7 XOR n7; m6 XOR n6; m5 XOR n5 etc....
For 8 bits (just for this example) that gives us 8 outputs. Every '1' indicates a difference in bit value. Now, you mentioned a look-up table: that would have 256 entries :/ since there are 256 combinations of 8 bits. If you go for a wider data width (32 bits?) your look-up table will become huge! Plus, if this is all done in boolean logic (non-synchronous) it's the same deal: the complexity increases dramatically (2-fold) with every bit you add to the data width.
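To put numbers on the table-size point, here is a small software model (Python purely for illustration, not HDL; the names `POPCOUNT_LUT`, `hamming8` and `lut_entries` are mine, not from any library). It XORs the two bytes, then looks the result up in a 256-entry table of bit counts:

```python
# 256-entry table: POPCOUNT_LUT[x] = number of 1 bits in the 8-bit value x
POPCOUNT_LUT = [bin(x).count("1") for x in range(256)]

def hamming8(m, n):
    """Hamming distance of two 8-bit values via XOR plus table lookup."""
    diff = (m ^ n) & 0xFF   # a 1 marks each bit position where m and n differ
    return POPCOUNT_LUT[diff]

def lut_entries(width):
    """Number of entries a flat look-up table needs for a given data width."""
    return 2 ** width
```

`lut_entries(8)` is 256, but `lut_entries(32)` is already over 4 billion, which is why a flat table stops being practical so quickly.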
I'm afraid synchronous logic is the only way to go; you will need more than one clock cycle. That said... as with all things like this (Hamming decoders etc.) it's a balance, or a compromise, between 'size' and 'time'.
For example, say we wanted to calculate parity. For 8 bits, we need quite a large XOR tree if we wish to calculate a parity bit instantly (without a clock). In CPLDs/FPGAs an XOR tree can take up a fair few CLBs. On the other hand, if we are restricted by 'size' (say we have a small CPLD) then we can do this calculation over time, by shifting each bit through a register and incrementing a counter (a 1-bit counter, basically, switching between odd and even). This only requires 9 registers and one AND gate... but it takes 8 clocks.
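A rough behavioural model of the 'over time' version, again in Python just to show the idea (the function name `serial_parity` is made up, and each loop pass stands in for one clock edge):

```python
def serial_parity(byte):
    """Return (parity, clocks) for an 8-bit value, one bit per simulated clock."""
    shift_reg = byte & 0xFF   # models the 8-bit shift register
    parity = 0                # models the 1-bit counter (odd/even flag)
    clocks = 0
    for _ in range(8):
        parity ^= shift_reg & 1   # flag toggles whenever a 1 is shifted out
        shift_reg >>= 1           # shift the next bit into position
        clocks += 1
    return parity, clocks
```

The result is 1 for an odd number of set bits and 0 for an even number, and it always costs 8 clocks for a byte. That is the 'time' end of the size/time trade-off.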
Back to your problem:
Now, if you want it done quickly, but want a wider bus width, you could use a combination of the two ideas above... use both time AND size. Instead of having a massive logic circuit, or using up one clock cycle for each bit, break it down into, say, 4-bit nibbles.
You work out the Hamming distance in these 4-bit nibbles; each nibble takes 1 clock cycle and requires a bit of logic. From this circuit, you have a counter, which can be incremented by 0, 1, 2, 3 or 4. That way, you *could* use a look-up table with only 16 entries, but alas, a byte would take 2 clocks, 2 bytes 4 clocks, etc.
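Here is a minimal sketch of the nibble-at-a-time scheme as a software model (Python for illustration only; `NIBBLE_LUT` and `hamming_by_nibbles` are names I've made up, and I'm assuming the width is a multiple of 4). Each loop pass stands in for one clock cycle:

```python
# 16-entry table: NIBBLE_LUT[x] = number of 1 bits in the 4-bit value x
NIBBLE_LUT = [bin(x).count("1") for x in range(16)]

def hamming_by_nibbles(m, n, width=8):
    """Return (distance, clocks), handling one 4-bit nibble per simulated clock."""
    diff = (m ^ n) & ((1 << width) - 1)   # 1s mark the differing bit positions
    distance = 0                          # models the running counter
    clocks = 0
    for shift in range(0, width, 4):
        nibble = (diff >> shift) & 0xF
        distance += NIBBLE_LUT[nibble]    # counter goes up by 0..4 each clock
        clocks += 1
    return distance, clocks
```

For one byte this takes 2 passes, for two bytes 4 passes, matching the clock counts above, while the table stays at 16 entries whatever the bus width.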
The above idea is what I used to make a (13,8) Hamming encoder/decoder fit into a 64-macrocell CPLD, along with lots of counters... the support guy for Lattice said it could not be done.
In a schematic, 'getting' your data in 4-bit nibbles can be tricky, but in Verilog or VHDL it should be straightforward. As I said in my first post, I only use VHDL for complicated designs, and for this, it really is a great tool. Is your data coming in serially, or as a parallel load?
Regards,
BuriedCode.