Traditional Culture Encyclopedia - Traditional stories - Construction method of hash function

Construction method of hash function

The commonly used methods to construct hash function are: direct addressing method, digital analysis method, leveling method, folding method, division of remainder method and random number method.

1, direct addressing method

Take the keyword or the linear function value of the keyword as the hash address. That is: H(key)=key or h (key) = akey+B. Where a and b are constants (this hash function is called its own function).

2. Digital analysis method

Assuming that the keywords are numbers based on R (such as decimal numbers based on 10) and all possible keywords in the hash table are known in advance, a hash address can be composed of several numbers of keywords.

3. Take the square method.

The middle number after the keyword is squared is the hash address. This is a common method to construct hash functions. Usually, when choosing a hash function, you may not know all the keywords, and it is not necessarily appropriate to choose which ones, but the middle digit of the square of a number is related to each digit of this number.

4. Folding method

Divide the keyword into several parts with the same number of digits (the number of digits in the last part can be different), and then take the superposition sum (rounding) of these parts as the hash address. This method is called folding. When there are many keywords and the number distribution on each keyword is almost uniform, the hash address can be obtained by folding.

5, except the rest.

The remainder obtained by dividing the keyword by the number p not greater than the hash table length m is the hash address. That is, H(key) = key MOD p, pm. This is the simplest and most commonly used method to construct hash function. It can not only take the modulus of keywords directly, but also take the modulus after folding and square average operation.

6. Random number method

Select a random function and take the random function value of the keyword as its hash address, that is, H(key)=random(key), where random is a random function. Usually, it is more appropriate to construct hash function in this way when the keyword length is unequal.

Handling of conflicts:

In a hash table, different key values correspond to the same storage location. That is, the keyword K 1≠K2, but H (K 1) = H (K2). A unified hash function can reduce conflicts, but it cannot avoid them. After the conflict occurs, it must be resolved; In other words, we must find the next available address and resolve the conflict:

1. zipper method: Records with the same hash address are stored in a linear linked list. For example, in the remainder method, the key is (18, 14, 0 1 68, 27, 55, 79), the divisor is 13, and the hash address is (5, 1,/kloc).

2. Open addressing mode: If h(k) has been occupied, detect it in the following order: (H (k)+P (1))% Tsize, (H (k)+P (2))% Tsize,? ,(h(k)+p(i))%TSize,? Where h(k) is the hash function, TSize is the hash table length, and p(i) is the probe function.

On the basis of h (k)+p(i- 1))% tsize, if a conflict is found, the increment p(i) is used for new detection until no conflict occurs.

According to the different detection function p(i), open addressing methods can be divided into linear detection methods (p(i) = i: 1, 2,3,? ), secondary exploration method (p (I) = (-1) (I-1) * ((I+1)/2) 2, and the exploration sequence is:1,-1,4. )。

Random detection method (p(i): random number), double hash function method (double hash function h(key), hp (key) If h(key) conflicts, use hp (key) to find the hash address. The exploration sequence is: h(k), h(k)+ hp(k),? ,h(k)+ i*hp(k))。

3. Bucket addressing method: Bucket is a large enough storage space. Bucket addressing associates buckets with each address in the table. If the bucket is full, it can be handled by open addressing. For example, insert A5, A2, A3, B5, A9, B2, B9, C2, and solve the conflict through linear exploration.