also, median isn't the same as sorting. You only care about finding the middle sample, not the ordering of other samples that aren't the middle sample. For things like bubble sort, this would mean you would stop after half-finishing the sort. not a tremendous savings there, but you get the idea.
as a second example of an O(N) method:
1.) compare value N to all of the other N-1 values.
2.) count the number of values (greater, less, etc...)
3.) stop once you find a suitable median
a simple way to improve this would be to use the information from the bitcount to determine if it makes sense to check the next value. eg, if there are only 3 values less than the current, and the next value is less than the current, then there is no reason to even attempt that check. (faster speed, but more resources)
area permitting, you can also move to, say 36 compares to determine a 4x4 block of medians at a time (as there is overlap)
edit -- and by O(N), I guess I was assuming the "compare to N-1" to be done in parallel. Really, the algorithm probably isn't the best.