When a collision occurs on inserting an element, we attempt to put the element in the next cell (at the next position in the table), or, more generally, in another pre-determined cell.
In the most general case, if a key k is to be inserted, the following sequence of indexes is tried, until the first empty cell is found: g(k,0), g(k, 1), g(k, 2), etc.
On lookup, the same sequence is tried, until either:
Assuming that the hash distributes uniformly the input keys on the cells, the probability that cell g(k, i) is occupied is the same for all keys and all positions in the probing sequence, and it is equal to the load factor α.
Now, the probability to have, for a fixed key to be inserted, at least i cells probed is αi-1 (it is equal to the probability that the first i-1 cells are occupied. The average number of cells to be probed is then α0 + α1 + α2 + ... = 1/(1-α).
Suppose a key k was inserted, not in the first probed cell - which was occupied - but in the second cell. Then, suppose the key occupying the first cell above is deleled. If we just set the cell as empty, then a subsequent attempt to search for the key k will fail, because the first probed cell is now empty, and the second cell is never probed.
So, when a key is removed from the hash table, the cell must have a special marking, showing that a key existed there but was removed. Such a cell can be used for inserting a new element in its place, but must be treated as an occupied cell during the search for a key.
Radu-Lucian LUPŞA