Home ArrayMap源码解析
Post
Cancel

ArrayMap源码解析

简介

ArrayMap是专门针对内存优化设计的,因为HashMap为了快速查询带来了一定的空间浪费。而ArrayMap基于双数组设计,二分法查找,支持缩容,在比较小的数量级情况下拥有不错的效率。

原理

基本结构

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
// 永远都是false,所以使用的基本为对象的hashcode
private final boolean mIdentityHashCode;
// hash的数组
int[] mHashes;
// key value的数组,容量为hash数组的两倍
Object[] mArray;

int mSize;

public ArrayMap() {
    this(0, false);
}

public ArrayMap(int capacity) {
    this(capacity, false);
}

/** {@hide} */
public ArrayMap(int capacity, boolean identityHashCode) {
    mIdentityHashCode = identityHashCode;
    if (capacity < 0) {
        mHashes = EMPTY_IMMUTABLE_INTS;
        mArray = EmptyArray.OBJECT;
    } else if (capacity == 0) {
        mHashes = EmptyArray.INT;
        mArray = EmptyArray.OBJECT;
    } else {
        // 初始化的地方
        allocArrays(capacity);
    }
    mSize = 0;
}

UML图

mHashes与mArray基本关系如图,mHashes是有序的hashcode表,而mArray储存的为对应的key与value。如图:

缓存

相比于其他的数据结构不同,ArrayMap提供一个全局的对于mHashes和mArray的缓存。长度为4和8的缓存结构。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
private static final int CACHE_SIZE = 10;

static Object[] mBaseCache;
static int mBaseCacheSize;
static Object[] mTwiceBaseCache;
static int mTwiceBaseCacheSize;

private static final Object sBaseCacheLock = new Object();
private static final Object sTwiceBaseCacheLock = new Object();

private void allocArrays(final int size) {
    if (mHashes == EMPTY_IMMUTABLE_INTS) {
        throw new UnsupportedOperationException("ArrayMap is immutable");
    }
    if (size == (BASE_SIZE*2)) {
        synchronized (sTwiceBaseCacheLock) {
            if (mTwiceBaseCache != null) {
                final Object[] array = mTwiceBaseCache;
                mArray = array;
                try {
                    mTwiceBaseCache = (Object[]) array[0];
                    mHashes = (int[]) array[1];
                    if (mHashes != null) {
                        array[0] = array[1] = null;
                        mTwiceBaseCacheSize--;
                        if (DEBUG) {
                            Log.d(TAG, "Retrieving 2x cache " + mHashes
                                    + " now have " + mTwiceBaseCacheSize + " entries");
                        }
                        return;
                    }
                } catch (ClassCastException e) {
                }
                // Whoops!  Someone trampled the array (probably due to not protecting
                // their access with a lock).  Our cache is corrupt; report and give up.
                Slog.wtf(TAG, "Found corrupt ArrayMap cache: [0]=" + array[0]
                        + " [1]=" + array[1]);
                mTwiceBaseCache = null;
                mTwiceBaseCacheSize = 0;
            }
        }
    } else if (size == BASE_SIZE) {
        synchronized (sBaseCacheLock) {
            if (mBaseCache != null) {
                final Object[] array = mBaseCache;
                mArray = array;
                try {
                    mBaseCache = (Object[]) array[0];
                    mHashes = (int[]) array[1];
                    if (mHashes != null) {
                        array[0] = array[1] = null;
                        mBaseCacheSize--;
                        if (DEBUG) {
                            Log.d(TAG, "Retrieving 1x cache " + mHashes
                                    + " now have " + mBaseCacheSize + " entries");
                        }
                        return;
                    }
                } catch (ClassCastException e) {
                }
                // Whoops!  Someone trampled the array (probably due to not protecting
                // their access with a lock).  Our cache is corrupt; report and give up.
                Slog.wtf(TAG, "Found corrupt ArrayMap cache: [0]=" + array[0]
                        + " [1]=" + array[1]);
                mBaseCache = null;
                mBaseCacheSize = 0;
            }
        }
    }

    mHashes = new int[size];
    mArray = new Object[size<<1];
}
private static void freeArrays(final int[] hashes, final Object[] array, final int size) {
    if (hashes.length == (BASE_SIZE*2)) {
        synchronized (sTwiceBaseCacheLock) {
            if (mTwiceBaseCacheSize < CACHE_SIZE) {
                array[0] = mTwiceBaseCache;
                array[1] = hashes;
                for (int i=(size<<1)-1; i>=2; i--) {
                    array[i] = null;
                }
                mTwiceBaseCache = array;
                mTwiceBaseCacheSize++;
                if (DEBUG) Log.d(TAG, "Storing 2x cache " + array
                        + " now have " + mTwiceBaseCacheSize + " entries");
            }
        }
    } else if (hashes.length == BASE_SIZE) {
        synchronized (sBaseCacheLock) {
            if (mBaseCacheSize < CACHE_SIZE) {
                array[0] = mBaseCache;
                array[1] = hashes;
                for (int i=(size<<1)-1; i>=2; i--) {
                    array[i] = null;
                }
                mBaseCache = array;
                mBaseCacheSize++;
                if (DEBUG) Log.d(TAG, "Storing 1x cache " + array
                        + " now have " + mBaseCacheSize + " entries");
            }
        }
    }
}

这里缓存了一个全局的由mArray的0位置连接的链表,长度为4和8的分别储存,最长支持缓存10个。然后对应的1位置保存mHashes。大致如下图:

缓存触发时机

  • 当执行构造函数申请空间时会触发allocArrays()
  • 当执行ensureCapacity()在当前容量小于预期容量的情况下, 先执行allocArrays(),再执行freeArrays()
  • 当执行clear()或者remove()移除最后一个元素是会触发freeArrays()
  • 当执行put()在容量满的情况下, 先执行allocArrays, 再执行freeArrays
  • 当移除元素时size不足1/3并且hash容量大于8时会缩容。执行allocArrays

get

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
public V get(Object key) {
    final int index = indexOfKey(key);
    return index >= 0 ? (V)mArray[(index<<1)+1] : null;
}
public int indexOfKey(Object key) {
    return key == null ? indexOfNull()
            : indexOf(key, mIdentityHashCode ? System.identityHashCode(key) : key.hashCode());
}
int indexOfNull() {
    final int N = mSize;

    if (N == 0) {
        return ~0;
    }

    int index = binarySearchHashes(mHashes, N, 0);

    if (index < 0) {
        return index;
    }

    if (null == mArray[index<<1]) {
        return index;
    }

    int end;
    for (end = index + 1; end < N && mHashes[end] == 0; end++) {
        if (null == mArray[end << 1]) return end;
    }

    for (int i = index - 1; i >= 0 && mHashes[i] == 0; i--) {
        if (null == mArray[i << 1]) return i;
    }

    return ~end;
}

private static int binarySearchHashes(int[] hashes, int N, int hash) {
    try {
        return ContainerHelpers.binarySearch(hashes, N, hash);
    } catch (ArrayIndexOutOfBoundsException e) {
        if (CONCURRENT_MODIFICATION_EXCEPTIONS) {
            throw new ConcurrentModificationException();
        } else {
            throw e; // the cache is poisoned at this point, there's not much we can do
        }
    }
}

int indexOf(Object key, int hash) {
    final int N = mSize;

    if (N == 0) {
        return ~0;
    }

    int index = binarySearchHashes(mHashes, N, hash);

    if (index < 0) {
        return index;
    }

    if (key.equals(mArray[index<<1])) {
        return index;
    }

    // Search for a matching key after the index.
    int end;
    for (end = index + 1; end < N && mHashes[end] == hash; end++) {
        if (key.equals(mArray[end << 1])) return end;
    }

    // Search for a matching key before the index.
    for (int i = index - 1; i >= 0 && mHashes[i] == hash; i--) {
        if (key.equals(mArray[i << 1])) return i;
    }

    return ~end;
}

get方法主要实现为在hash数组中二分去查找。当存在hash值相等的情况时,向前以及向后去查找。如果最后仍未找到,返回可以插入位置取反。

put

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
public V put(K key, V value) {
    final int osize = mSize;
    final int hash;
    int index;
    if (key == null) {
        hash = 0;
        index = indexOfNull();
    } else {
        hash = mIdentityHashCode ? System.identityHashCode(key) : key.hashCode();
        index = indexOf(key, hash);
    }
    // 如果为正,表示找到了,直接替换
    if (index >= 0) {
        index = (index<<1) + 1;
        final V old = (V)mArray[index];
        mArray[index] = value;
        return old;
    }
    // 为负取反获得可以插入的位置
    index = ~index;
    // 若已经存满了,先扩容
    if (osize >= mHashes.length) {
        // size大于8则扩容为1.5倍,其余情况则扩容为8或者4,以达到使用缓存的目的。
        final int n = osize >= (BASE_SIZE*2) ? (osize+(osize>>1)) : (osize >= BASE_SIZE ? (BASE_SIZE*2) : BASE_SIZE);

        if (DEBUG) Log.d(TAG, "put: grow from " + mHashes.length + " to " + n);

        final int[] ohashes = mHashes;
        final Object[] oarray = mArray;
        allocArrays(n);

        if (CONCURRENT_MODIFICATION_EXCEPTIONS && osize != mSize) {
            throw new ConcurrentModificationException();
        }

        if (mHashes.length > 0) {
            if (DEBUG) Log.d(TAG, "put: copy 0-" + osize + " to 0");
            System.arraycopy(ohashes, 0, mHashes, 0, ohashes.length);
            System.arraycopy(oarray, 0, mArray, 0, oarray.length);
        }

        freeArrays(ohashes, oarray, osize);
    }

    // 空出插入位置
    if (index < osize) {
        if (DEBUG) Log.d(TAG, "put: move " + index + "-" + (osize-index)
                + " to " + (index+1));
        System.arraycopy(mHashes, index, mHashes, index + 1, osize - index);
        System.arraycopy(mArray, index << 1, mArray, (index + 1) << 1, (mSize - index) << 1);
    }

    // 容量校验,是不支持并发访问的
    if (CONCURRENT_MODIFICATION_EXCEPTIONS) {
        if (osize != mSize || index >= mHashes.length) {
            throw new ConcurrentModificationException();
        }
    }
    mHashes[index] = hash;
    mArray[index<<1] = key;
    mArray[(index<<1)+1] = value;
    mSize++;
    return null;
}

插入元素先二分查找,若查找到已有元素,直接替换返回。若未查找到,则返回的为可以插入的位置。判断是否需要扩容后移动位置插入,因为插入元素涉及到移动位置,所以建议直接按照顺序插入,也提供一个直接append到末尾的方法

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
public void append(K key, V value) {
    int index = mSize;
    final int hash = key == null ? 0
            : (mIdentityHashCode ? System.identityHashCode(key) : key.hashCode());
    if (index >= mHashes.length) {
        throw new IllegalStateException("Array is full");
    }
    // 和最后一个比较,若hash值不满足要求,调用普通的put方法
    if (index > 0 && mHashes[index-1] > hash) {
        RuntimeException e = new RuntimeException("here");
        e.fillInStackTrace();
        Log.w(TAG, "New hash " + hash
                + " is before end of array hash " + mHashes[index-1]
                + " at index " + index + " key " + key, e);
        put(key, value);
        return;
    }
    mSize = index+1;
    mHashes[index] = hash;
    index <<= 1;
    mArray[index] = key;
    mArray[index+1] = value;
}

append方法会进行hash校验,若满足则直接插入末尾。

remove

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
public V remove(Object key) {
    final int index = indexOfKey(key);
    if (index >= 0) {
        return removeAt(index);
    }

    return null;
}
public V removeAt(int index) {
    if (index >= mSize && UtilConfig.sThrowExceptionForUpperArrayOutOfBounds) {
        throw new ArrayIndexOutOfBoundsException(index);
    }

    final Object old = mArray[(index << 1) + 1];
    final int osize = mSize;
    final int nsize;
    // 容量完全清空后回收所有内存
    if (osize <= 1) {
        // Now empty.
        if (DEBUG) Log.d(TAG, "remove: shrink from " + mHashes.length + " to 0");
        final int[] ohashes = mHashes;
        final Object[] oarray = mArray;
        mHashes = EmptyArray.INT;
        mArray = EmptyArray.OBJECT;
        freeArrays(ohashes, oarray, osize);
        nsize = 0;
    } else {
        nsize = osize - 1;
        // 如果容量利用率不足1/3并且hash容量比8大时,会执行缩容。
        if (mHashes.length > (BASE_SIZE*2) && mSize < mHashes.length/3) {
            // 缩容为当前size的1.5倍,即相当于1/3+1/6=1/2,为原来的0.5回收50%的内存
            final int n = osize > (BASE_SIZE*2) ? (osize + (osize>>1)) : (BASE_SIZE*2);

            if (DEBUG) Log.d(TAG, "remove: shrink from " + mHashes.length + " to " + n);

            final int[] ohashes = mHashes;
            final Object[] oarray = mArray;
            allocArrays(n);

            if (CONCURRENT_MODIFICATION_EXCEPTIONS && osize != mSize) {
                throw new ConcurrentModificationException();
            }

            if (index > 0) {
                if (DEBUG) Log.d(TAG, "remove: copy from 0-" + index + " to 0");
                System.arraycopy(ohashes, 0, mHashes, 0, index);
                System.arraycopy(oarray, 0, mArray, 0, index << 1);
            }
            if (index < nsize) {
                if (DEBUG) Log.d(TAG, "remove: copy from " + (index+1) + "-" + nsize
                        + " to " + index);
                System.arraycopy(ohashes, index + 1, mHashes, index, nsize - index);
                System.arraycopy(oarray, (index + 1) << 1, mArray, index << 1,
                        (nsize - index) << 1);
            }
        } else {
            if (index < nsize) {
                if (DEBUG) Log.d(TAG, "remove: move " + (index+1) + "-" + nsize
                        + " to " + index);
                System.arraycopy(mHashes, index + 1, mHashes, index, nsize - index);
                System.arraycopy(mArray, (index + 1) << 1, mArray, index << 1,
                        (nsize - index) << 1);
            }
            mArray[nsize << 1] = null;
            mArray[(nsize << 1) + 1] = null;
        }
    }
    if (CONCURRENT_MODIFICATION_EXCEPTIONS && osize != mSize) {
        throw new ConcurrentModificationException();
    }
    mSize = nsize;
    return (V)old;
}

移除元素也比较简单,分为三种情况

  • 储存完全清空时,回收所有内存
  • 容量利用率不足1/3并且hash容量比8大时,会缩容为原来的50%
  • 一般情况直接拷贝移动位置即可

总结

  • ArrayMap是用两个数组维护的Map,一个为有序的Hash数组,另外一个在对应位置储存着Key与Value,适合小数据量的场景;
  • 使用二分法查找,插入和删除都需要移动位置。所以推荐按hash顺序插入,并且减少容量变更操作;
  • 当hash冲突时,会在相同hash的末尾插入。查找时会向前和向后查找所有相同hash值的进行key比较。
  • 非线程安全容器,支持扩容和缩容,推荐设置默认容量为4或8或4的倍数,以充分利用缓存。

参考

This post is licensed under CC BY 4.0 by the author.