Explained

Dictionary

The following references the implementation of System.Collections.Generic.Dictionary<TKey,TValue> in the .NET Framework 4.8, as it is shown here: https://referencesource.microsoft.com/#mscorlib/system/collections/generic/dictionary.cs
The code below is simplified to show the ideas. No verification, exception handling, optimizations, etc. In other words: unsafe.
Please do not use in your project as is.


A Dictionary is a structure that can map a specific value to another. It is like a List for which we can use anything as an index, not just a predefined integer value. And to improve performance it uses hashes of those custom indices internally. That’s why you might know it as HashMap from other languages.

There is not much I can tell you about Dictionary, that I haven’t said about HashSet already. That’s because they both work the same way.
And something in me shudders whenever i see redundancies.
Therefore I’ll just take the HashSet we made over there, and modify it to create a Dictionary from it.


Let’s start with the ability to store two values:
One will be the actual value to store (Value),
the other a custom identifier for accessing that value (Key).

private struct Item
{
    internal TK Key;
    internal TV Value;
    internal int HashCode;
    internal int Next;
}

As you can see, we can simply add another field in Item. Note that all of the previous code now has to reference Key for everything to still work.

How do we use Value then?
To start with, we add a second argument to Add, which allows us to specify the value for a key.
Furthermore, since we’re not dealing with a simple set anymore, adding a value using a key that already exists is considered an error.

public bool Add(TK keyTV value)
{
    if (Contains(key))
        throw new ArgumentException("Duplicate key");
 
    int added;
    if (_freeHead > 0)
    {
        added = _freeHead;
        _freeHead = _items[_freeHead - 1].Next;
    }
    else
    {
        if (_lowestUntouchedIndex >= _items.Length)
            Grow();
 
        added = ++_lowestUntouchedIndex;
    }
 
    (int index, int hashCode) = GetMappingIndex(key);
 
    _items[added - 1].Key = key;
    _items[added - 1].Value = value;
    _items[added - 1].HashCode = hashCode;
    _items[added - 1].Next = _mapping[index];
    _mapping[index] = added;
 
    ++_count;
    return true;
}

Sometimes we’re going to change values of already stored keys. To help with performance, this is going to be a special variation of Add.
Or rather, Add is going to be a variation of a general Insert method, just like a new method SetValue.

public bool Add(TK keyTV value)
    => Insert(keyvaluetrue);
 
public bool SetValue(TK keyTV value)
    => Insert(keyvaluefalse);
 
private bool Insert(TK keyTV valuebool addNew)
{
    (int index, int hashCode) = GetMappingIndex(key);
    for (int current = _mapping[index]; current > 0; current = _items[current - 1].Next)
        if (_items[current - 1].HashCode == hashCode && _items[current - 1].Key.Equals(key))
        {
            if (addNew)
                throw new ArgumentException("Duplicate key");
 
            _items[current - 1].Value = value;
            return false;
        }
 
 
    int added;
    if (_freeHead > 0)
    {
        added = _freeHead;
        _freeHead = _items[_freeHead - 1].Next;
    }
    else
    {
        if (_lowestUntouchedIndex >= _items.Length)
            Grow();
 
        added = ++_lowestUntouchedIndex;
    }
 
    _items[added - 1].Key = key;
    _items[added - 1].Value = value;
    _items[added - 1].HashCode = hashCode;
    _items[added - 1].Next = _mapping[index];
    _mapping[index] = added;
 
    ++_count;
    return true;
}

Contains and Remove can stay as they are, since they only interact with the key.


Now that we can store a value, we also need to be able to read it.
For that we add a new method GetValue, which works almost identical to Contains, except it returns the value it finds.

public TV GetValue(TK key)
{
    (int index, int hashCode) = GetMappingIndex(key);
    for (int current = _mapping[index]; current > 0; current = _items[current - 1].Next)
        if (_items[current - 1].HashCode == hashCode && _items[current - 1].Key.Equals(key))
            return _items[current - 1].Value;
 
    throw new KeyNotFoundException();
}

This method is assumes that the key exists.
Therefore, should we reach the end without returning a value, an Exception is thrown.
As an alternative access to values, for when the existence of the key isn’t known, we add TryGetValue.

public bool TryGetValue(TK keyout TV value)
{
    (int index, int hashCode) = GetMappingIndex(key);
    for (int current = _mapping[index]; current > 0; current = _items[current - 1].Next)
        if (_items[current - 1].HashCode == hashCode && _items[current - 1].Key.Equals(key))
        {
            value = _items[current - 1].Value;
            return true;
        }
 
    value = default;
    return false;
}

And finally, to provide a little more comfort, let’s add an indexer.

public TV this[TK key]
{
    get => GetValue(key);
    set => SetValue(keyvalue);
}

If your code looks like mine, this will cause the compiler to complain, since we already have an Item in this class.
There are two ways to solve this:
Either change the name of the Indexer (“Item” by default) using the IndexerName attribute, or change the name of our struct.


Apart from directly accessing a specific value, we might want to iterate all the values we’ve stored.
Conveniently we have an array filled with all our values, which we can just step through.
The only tricky part is to distinguish between actual values, and indices that either got deleted or never filled.
For the later we already know up to which index items got inserted.
For the former however, we need to make a small addition to Remove

_items[current - 1].HashCode = -1;

Since the hash codes we store are always positive, this is an easy marker for deleted values.

public IEnumerable<TVGetValues()
{
    for (int i = 0; i < _lowestUntouchedIndex; ++i)
        if (_items[i].HashCode >= 0)
            yield return _items[i].Value;
}

And with that we have a fully functional Dictionary!

public class Dictionary<TKTV>
{
    private struct KeyValueItem
    {
        internal TK Key;
        internal TV Value;
        internal int HashCode;
        internal int Next;
    }
 
    private KeyValueItem[] _items;
 
    private int[] _mapping;
 
    private int _freeHead;
    private int _lowestUntouchedIndex;
    private int _count;
 
    public int Count => _count;
 
    public TV this[TK key]
    {
        get => GetValue(key);
        set => SetValue(keyvalue);
    }
 
    public Dictionary()
    {
        _items = new KeyValueItem[3];
        _mapping = new int[_items.Length];
 
        _freeHead = 0;
        _lowestUntouchedIndex = 0;
        _count = 0;
    }
 
    public bool Add(TK keyTV value)
        => Insert(keyvaluetrue);
 
    public bool SetValue(TK keyTV value)
        => Insert(keyvaluefalse);
 
    private bool Insert(TK keyTV valuebool addNew)
    {
        (int index, int hashCode) = GetMappingIndex(key);
        for (int current = _mapping[index]; current > 0; current = _items[current - 1].Next)
            if (_items[current - 1].HashCode == hashCode && _items[current - 1].Key.Equals(key))
            {
                if (addNew)
                    throw new ArgumentException("Duplicate key");
 
                _items[current - 1].Value = value;
                return false;
            }
 
 
        int added;
        if (_freeHead > 0)
        {
            added = _freeHead;
            _freeHead = _items[_freeHead - 1].Next;
        }
        else
        {
            if (_lowestUntouchedIndex >= _items.Length)
                Grow();
 
            added = ++_lowestUntouchedIndex;
        }
 
        _items[added - 1].Key = key;
        _items[added - 1].Value = value;
        _items[added - 1].HashCode = hashCode;
        _items[added - 1].Next = _mapping[index];
        _mapping[index] = added;
 
        ++_count;
        return true;
    }
 
    public bool TryGetValue(TK keyout TV value)
    {
        (int index, int hashCode) = GetMappingIndex(key);
        for (int current = _mapping[index]; current > 0; current = _items[current - 1].Next)
            if (_items[current - 1].HashCode == hashCode && _items[current - 1].Key.Equals(key))
            {
                value = _items[current - 1].Value;
                return true;
            }
 
        value = default;
        return false;
    }
 
    public TV GetValue(TK key)
    {
        (int index, int hashCode) = GetMappingIndex(key);
        for (int current = _mapping[index]; current > 0; current = _items[current - 1].Next)
            if (_items[current - 1].HashCode == hashCode && _items[current - 1].Key.Equals(key))
                return _items[current - 1].Value;
 
        throw new KeyNotFoundException();
    }
 
    public bool Contains(TK key)
    {
        (int index, int hashCode) = GetMappingIndex(key);
        for (int current = _mapping[index]; current > 0; current = _items[current - 1].Next)
            if (_items[current - 1].HashCode == hashCode && _items[current - 1].Key.Equals(key))
                return true;
 
        return false;
    }
 
    public bool Remove(TK key)
    {
        (int index, int hashCode) = GetMappingIndex(key);
        int prev = 0;
        for (int current = _mapping[index]; current > 0; prev = currentcurrent = _items[current - 1].Next)
        {
            if (_items[current - 1].HashCode != hashCode || !_items[current - 1].Key.Equals(key))
                continue;
 
            if (prev < 1)
                _mapping[index] = _items[current - 1].Next;
            else
                _items[prev - 1].Next = _items[current - 1].Next;
 
            _items[current - 1].Next = _freeHead;
            _freeHead = current;
 
            _items[current - 1].HashCode = -1;
 
            --_count;
            return true;
        }
 
        return false;
    }
 
    public IEnumerable<TVGetValues()
    {
        for (int i = 0; i < _lowestUntouchedIndex; ++i)
            if (_items[i].HashCode >= 0)
                yield return _items[i].Value;
    }
 
    public IEnumerable<TKGetKeys()
    {
        for (int i = 0; i < _lowestUntouchedIndex; ++i)
            if (_items[i].HashCode >= 0)
                yield return _items[i].Key;
    }
 
    private (int Indexint HashCodeGetMappingIndex(TK key)
    {
        int hashCode = key?.GetHashCode() ?? 0;
        hashCode &= 0x7FFFFFFF;
        int index = hashCode % _mapping.Length;
 
        return (index, hashCode);
    }
 
    private void Grow()
    {
        KeyValueItem[] newArray = new KeyValueItem[GetNextLength()];
 
        Array.Copy(_items, newArray, _items.Length);
 
        _items = newArray;
 
 
        _mapping = new int[_items.Length];
 
        for (int i = 0; i < _lowestUntouchedIndex; ++i)
        {
            int mappedIndex = _items[i].Next % _mapping.Length;
            _items[i].Next = _mapping[mappedIndex];
            _mapping[mappedIndex] = i + 1;
        }
    }
 
    private int GetNextLength()
    {
        int c = _items.Length * 2 + 1;
        for (; Enumerable.Range(2, (int)Math.Sqrt(c)).Any(d => c % d == 0); ++c) { }
        return c;
    }
}

As always, my implementation of course differs from the official Dictionary.
The principals behind it however, are still the same.

Leave a comment