What is The Best Data Type to Store the Result of HASHBYTES('MD5', ...)?

542    Asked by AndrewJenkins in SQL Server , Asked on Apr 22, 2021

Please suggest the best data type to store the results of the HASHBYTES('MD5', ...)? It outputs 16 bytes of binary as follows: e.g.

I could store it in the three data types: char(34) binary(16) (I think - I read here (https://stackoverflow.com/questions/14722305/what-kind-of-datatype-should-one-use-to-store-hashes#16680423) that using the same algo should return the same number of bytes every time regardless of the input string) other?

Every row will have a value (no nulls), and the column will be used for comparison against a similar column in another table. Which is the best data type to store  SQL server HASHBYTES output for use as described above? I was thinking that since fixed-length data types can sometimes be more efficient on joins, etc. binary(16) vs varbinary(8000) (the default output of HASHBYTES) seems best, and binary(16) vs a varchar(34) is better since it would use less storage space.

Answered by Ankesh Kumar
      The output conforms to the algorithm standard: 128 bits (16 bytes) for MD2, MD4, and MD5; 160 bits (20 bytes) for SHA and SHA1; 256 bits (32 bytes) for SHA2_256, and 512 bits (64 bytes) for SHA2_512.

So despite the fact that the function technically returns varbinary(8000), you'll only ever get 16 bytes with MD5. If you are completely certain you only need MD5, then a binary(16) will do it. If you want to play it safe, and allow any possible algorithm (current or future) that HASHBYTES can handle, go with varbinary(8000). I definitely wouldn't recommend a char type, since that will involve string conversion overhead and unnecessary storage overhead.

Best data type to store SQL Server HASHBYTES output

You should always use binary data types. You can use binary instead of varbinary because the hash function will always return the same number of bytes for the same type of hash (e.g. MD5, SHA1, etc.). This will cut down on the (slight) overhead required to manage a variable-length binary (varbinary) column. In terms of what size to make it, you can run this query to check the length of each hash type: And it should come out with:







Your Answer

Interviews

Parent Categories