Unity Job System Explained (3): NativeList Source Code Analysis
【Preface】
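To browse the NativeList source you need Unity's Entities package installed. NativeList itself lives in the Collections package (com.unity.collections), which Entities pulls in as a dependency.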
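NativeList provides roughly the same basic functionality as the C# List<T>, namely:
(Simple parts that work the same way as in NativeArray are not explained again.)
Construction, disposal, reading and writing elements
Growing the capacity, adding and removing elements
The walkthrough covers the basic functions, the common operations, the common properties, and the interfaces.
【Source Code Analysis】
Definition
// A generic struct marked unsafe. It implements three interfaces and requires T to be an unmanaged type; using a reference type for T is a compile error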
public unsafe struct NativeList<T> : INativeDisposable, INativeList<T>, IIndexable<T> where T : unmanaged
{}
Constructors
public NativeList(AllocatorManager.AllocatorHandle allocator)
    : this(1, allocator)
{
}

public NativeList(int initialCapacity, AllocatorManager.AllocatorHandle allocator)
{
    this = default;
    AllocatorManager.AllocatorHandle temp = allocator;
    Initialize(initialCapacity, ref temp);
}

internal void Initialize<U>(int initialCapacity, ref U allocator) where U : unmanaged, AllocatorManager.IAllocator
{
    // sizeof(T) is the size in bytes of one value of T; multiplied by the capacity it gives the number of bytes the initial allocation needs
    var totalSize = sizeof(T) * (long)initialCapacity;
    // m_ListData is a field of this struct; it is created through the static UnsafeList<T>.Create method
    m_ListData = UnsafeList<T>.Create(initialCapacity, ref allocator, NativeArrayOptions.UninitializedMemory);
}

// Where the data lives
[NativeDisableUnsafePtrRestriction] // lifts the job safety restriction that normally forbids raw pointer fields
internal UnsafeList<T>* m_ListData; // pointer to the underlying UnsafeList<T>
Next, look at UnsafeList
// Declared the same way as NativeList
public unsafe struct UnsafeList<T> : INativeDisposable, INativeList<T>, IIndexable<T> where T : unmanaged
{
}
Go straight to the Create method
internal static UnsafeList<T>* Create<U>(int initialCapacity, ref U allocator, NativeArrayOptions options) where U : unmanaged, AllocatorManager.IAllocator
{
    // Allocate memory for one UnsafeList<T> through the allocator; the result points at the start of that memory
    UnsafeList<T>* listData = allocator.Allocate(default(UnsafeList<T>), 1);
    // Build the list value with the constructor and copy it into the allocated memory.
    // NativeArrayOptions has two values: ClearMemory and UninitializedMemory
    *listData = new UnsafeList<T>(initialCapacity, allocator.Handle, options);
    return listData;
}

public UnsafeList(int initialCapacity, AllocatorManager.AllocatorHandle allocator, NativeArrayOptions options = NativeArrayOptions.UninitializedMemory)
{
    Ptr = null;      // where the elements live
    m_length = 0;    // length and capacity correspond to Count and Capacity of the C# List<T>
    m_capacity = 0;
    Allocator = allocator;
    padding = 0;

    SetCapacity(math.max(initialCapacity, 1));

    if (options == NativeArrayOptions.ClearMemory && Ptr != null)
    {
        var sizeOf = sizeof(T);
        // ClearMemory zeroes the buffer; this is essentially a memset that sets Capacity * sizeOf bytes starting at Ptr to 0
        UnsafeUtility.MemClear(Ptr, Capacity * sizeOf);
    }
}
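Now look at SetCapacity
public void SetCapacity(int capacity)
{
    SetCapacity(ref Allocator, capacity);
}

void SetCapacity<U>(ref U allocator, int capacity) where U : unmanaged, AllocatorManager.IAllocator
{
    // First check that the requested capacity is not smaller than the current length
    CollectionHelper.CheckCapacityInRange(capacity, Length);

    var sizeOf = sizeof(T); // size of T in bytes

    // CacheLineSize is the L1 cache line size of the current platform. L1 is the cache level closest to the CPU and the
    // fastest to access; it consists of a data cache and an instruction cache. A cache line is the smallest unit the cache
    // reads or writes, measured in bytes.
    // The line size varies between processor architectures; 64 bytes is common. When the CPU reads or writes data it checks
    // L1 first. On a hit the data is accessed directly; on a miss it has to be fetched from a slower level (L2 or main memory).
    // The purpose of the cache is to have data and instructions loaded ahead of time so that the CPU waits less.
    // CacheLineSize / sizeOf is the number of elements that fit in one cache line. Taking the larger of that and the requested
    // capacity guarantees the buffer holds at least one cache line's worth of elements, even when T is small
    var newCapacity = math.max(capacity, CollectionHelper.CacheLineSize / sizeOf);
    // Round up to the nearest power of two: 10 becomes 16. Keeping the capacity a power of two is also meant to be cache friendly
    newCapacity = math.ceilpow2(newCapacity);

    if (newCapacity == Capacity)
    {
        return;
    }

    ResizeExact(ref allocator, newCapacity);
}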
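To make the capacity rule concrete, here is a minimal sketch in plain C# with no Unity dependency. The class name `CapacitySketch`, the method `RoundCapacity`, and the fixed 64-byte cache line are assumptions made for illustration; the real code reads the value from `CollectionHelper.CacheLineSize` and uses `math.ceilpow2`.

```csharp
using System;

public static class CapacitySketch
{
    // Assumed cache line size for this sketch; the real value is platform dependent.
    public const int AssumedCacheLineSize = 64;

    // Rounds a positive value up to the next power of two (the same idea as math.ceilpow2).
    public static int CeilPow2(int x)
    {
        x -= 1;
        x |= x >> 1;
        x |= x >> 2;
        x |= x >> 4;
        x |= x >> 8;
        x |= x >> 16;
        return x + 1;
    }

    // Mirrors the two steps in SetCapacity: at least one cache line of elements, then round up.
    public static int RoundCapacity(int requested, int elementSize)
    {
        int minItems = AssumedCacheLineSize / elementSize;
        return CeilPow2(Math.Max(requested, minItems));
    }
}
```

Under these assumptions, `RoundCapacity(1, 4)` gives 16 (one 64-byte line holds sixteen 4-byte elements), `RoundCapacity(10, 16)` gives 16, and `RoundCapacity(100, 4)` gives 128.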
Continue with ResizeExact
// Similar to how the C# List<T> grows, with the extra step of allocating the memory explicitly
void ResizeExact<U>(ref U allocator, int newCapacity) where U : unmanaged, AllocatorManager.IAllocator
{
    newCapacity = math.max(0, newCapacity);

    CollectionHelper.CheckAllocator(Allocator);
    T* newPointer = null;

    var alignOf = UnsafeUtility.AlignOf<T>(); // alignment of T in bytes
    var sizeOf = sizeof(T);

    if (newCapacity > 0)
    {
        // Allocate: pass the size of T, its alignment and the element count, and get back a pointer to the new memory
        newPointer = (T*)allocator.Allocate(sizeOf, alignOf, newCapacity);

        if (Ptr != null && m_capacity > 0)
        {
            // Number of elements to copy; note that this uses the capacity, not the length
            var itemsToCopy = math.min(newCapacity, Capacity);
            var bytesToCopy = itemsToCopy * sizeOf;
            // Copy: takes the destination pointer, the source pointer and the byte count; underneath it is the native memcpy
            UnsafeUtility.MemCpy(newPointer, Ptr, bytesToCopy);
        }
    }

    allocator.Free(Ptr, Capacity); // free the old memory

    Ptr = newPointer;                            // pointer to the new memory
    m_capacity = newCapacity;                    // new capacity
    m_length = math.min(m_length, newCapacity);  // new length
}
Reading and writing elements
// NativeList indexer
public T this[int index]
{
    // AggressiveInlining asks the compiler to inline the method where possible.
    // Inlining replaces the method call with the body of the method, which removes the call overhead and speeds up execution.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    get
    {
        return (*m_ListData)[index];
    }

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    set
    {
        (*m_ListData)[index] = value;
    }
}

// UnsafeList indexer
public T this[int index]
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    get
    {
        CollectionHelper.CheckIndexInRange(index, m_length); // checks that index is inside [0, length)
        // As far as I can tell, AssumePositive does not check anything at runtime; it is a hint that the index is non-negative.
        // Ptr is declared as "public T* Ptr;", a pointer to the first element, so Ptr[index] accesses the element directly
        return Ptr[CollectionHelper.AssumePositive(index)];
    }

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    set
    {
        CollectionHelper.CheckIndexInRange(index, m_length);
        Ptr[CollectionHelper.AssumePositive(index)] = value;
    }
}
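One thing to watch when reading and writing: when T is a struct, you cannot modify a member of the element in place the way you would with a class. The indexer returns a copy, so you have to modify the copy and then assign it back, for example:
T t = nativeList[index];
t.a = 10; t.b = 15;
nativeList[index] = t;
Using the ElementAt method is recommended instead, for example:
ref T t = ref nativeList.ElementAt(index);
t.a = 10; t.b = 15;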
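// NativeList
public ref T ElementAt(int index)
{
    return ref m_ListData->ElementAt(index); // note: members are accessed through a pointer with ->
}

// UnsafeList
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public ref T ElementAt(int index)
{
    CollectionHelper.CheckIndexInRange(index, m_length);
    return ref Ptr[CollectionHelper.AssumePositive(index)];
}
// The difference from the indexer: if T is a struct, the value obtained through the indexer is a copy, so changing its
// fields does not affect the list; through this method the change is applied to the stored element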
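The copy-versus-reference behaviour is a general C# rule for structs, so it can be reproduced without Unity. The sketch below uses a hypothetical `TinyList` container and `Point` struct (both invented for illustration) to contrast a value-returning indexer with a ref-returning accessor.

```csharp
using System;

public struct Point
{
    public int A;
    public int B;
}

// Hypothetical minimal container, used only to contrast the two access styles.
public class TinyList
{
    private readonly Point[] _items = new Point[4];

    // Value-returning indexer: the getter hands back a copy of the struct.
    public Point this[int index]
    {
        get => _items[index];
        set => _items[index] = value;
    }

    // Ref-returning accessor: the caller gets a reference to the stored element.
    public ref Point ElementAt(int index) => ref _items[index];
}

public static class RefAccessDemo
{
    public static (int viaCopy, int viaWriteBack, int viaRef) Run()
    {
        var list = new TinyList();

        Point copy = list[0];
        copy.A = 10;                  // modifies the local copy only
        int viaCopy = list[0].A;      // the stored element is unchanged

        list[0] = copy;               // write the modified copy back
        int viaWriteBack = list[0].A;

        ref Point p = ref list.ElementAt(1);
        p.A = 15;                     // writes straight into the storage
        int viaRef = list[1].A;

        return (viaCopy, viaWriteBack, viaRef);
    }
}
```

`RefAccessDemo.Run()` returns `(0, 10, 15)`: the first edit is lost because it touched a copy, the write-back makes it stick, and the ref version needs no write-back at all.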
Disposal
There are two methods: void Dispose(), required by the IDisposable interface, and JobHandle Dispose(JobHandle inputDeps), required by INativeDisposable.
// NativeList
public void Dispose()
{
    if (!IsCreated)
    {
        return;
    }

    UnsafeList<T>.Destroy(m_ListData);
    m_ListData = null;
}

// Destroy in UnsafeList
public static void Destroy(UnsafeList<T>* listData)
{
    CheckNull(listData); // check for null
    var allocator = listData->Allocator;
    listData->Dispose();
    AllocatorManager.Free(allocator, listData); // free the memory of the list struct itself
}

// Dispose in UnsafeList
public void Dispose()
{
    if (!IsCreated)
    {
        return;
    }

    if (CollectionHelper.ShouldDeallocate(Allocator))
    {
        AllocatorManager.Free(Allocator, Ptr, m_capacity);
        Allocator = AllocatorManager.Invalid;
    }

    Ptr = null;
    m_length = 0;
    m_capacity = 0;
}
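Disposal when jobs depend on the list: as with NativeArray, a NativeListDisposeJob is created automatically and scheduled to depend on the input job. Once the input job has completed, NativeListDisposeJob frees the NativeList.
public JobHandle Dispose(JobHandle inputDeps)
{
    if (!IsCreated)
    {
        return inputDeps;
    }

    // Create a new job that depends on the input job. The new job holds the pointer to the list data
    // and frees the memory it occupies in its Execute method
    var jobHandle = new NativeListDisposeJob { Data = new NativeListDispose { m_ListData = (UntypedUnsafeList*)m_ListData } }.Schedule(inputDeps);
    m_ListData = null;

    return jobHandle;
}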
Adding elements
Adding a single element works like the C# List<T>: when the list is already full (the length has reached the capacity), it grows.
// NativeList
public void Add(in T value)
{
    m_ListData->Add(in value);
}

// UnsafeList
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public void Add(in T value)
{
    var idx = m_length;
    if (m_length < m_capacity)
    {
        Ptr[idx] = value;
        m_length++;
        return;
    }

    Resize(idx + 1); // UnsafeList.Resize ends up calling SetCapacity
    Ptr[idx] = value;
}
Adding a range of elements with AddRange
// NativeList
public void AddRange(NativeArray<T> array)
{
    // The argument is a NativeArray; GetUnsafeReadOnlyPtr() is an extension method that returns the pointer to the buffer holding its data
    AddRange(array.GetUnsafeReadOnlyPtr(), array.Length);
}

public void AddRange(void* ptr, int count)
{
    CheckArgPositive(count);
    m_ListData->AddRange(ptr, CollectionHelper.AssumePositive(count));
}

// UnsafeList
public void AddRange(void* ptr, int count)
{
    var idx = m_length;

    if (m_length + count > Capacity)
    {
        Resize(m_length + count);
    }
    else
    {
        m_length += count;
    }

    var sizeOf = sizeof(T);
    // Cast to byte* here, because the offset and the copy size are both expressed in bytes
    void* dst = (byte*)Ptr + idx * sizeOf;
    UnsafeUtility.MemCpy(dst, ptr, count * sizeOf);
}
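Adding elements in parallel: the difference from the versions above is that the length is incremented atomically.
// NativeList
public int AddNoResizeParallel(T value)
{
    return m_ListData->AddNoResizeParallel(value);
}

// UnsafeList
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public int AddNoResizeParallel(T value)
{
    // Interlocked.Increment is atomic, so the +1 is thread safe. Note the order: one might expect to store the
    // element first and then bump the length, but here the length is incremented first and the element is written afterwards
    var idx = Interlocked.Increment(ref m_length) - 1;
    CheckNoResizeHasEnoughCapacity(idx, 1);
    UnsafeUtility.WriteArrayElement(Ptr, idx, value);

    return idx;
}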
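Incrementing the length first is what makes the parallel add safe: each thread atomically reserves its own slot and only then writes into it, so no two threads ever target the same index. The following plain C# sketch illustrates that idea on a managed array; `ParallelAppendSketch` and its members are hypothetical names, and it is an analogy rather than the Unity implementation.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical fixed-capacity list that reserves slots with an atomic increment.
public class ParallelAppendSketch
{
    private readonly int[] _items;
    private int _length;

    public ParallelAppendSketch(int capacity)
    {
        _items = new int[capacity];
    }

    public int Length => _length;

    // Reserve a slot first (atomic), then write the value into the reserved slot.
    public int AddNoResize(int value)
    {
        int idx = Interlocked.Increment(ref _length) - 1;
        if (idx >= _items.Length)
            throw new InvalidOperationException("Capacity exceeded; this sketch never grows.");
        _items[idx] = value;
        return idx;
    }

    // Adds from many threads and verifies every slot index was handed out exactly once.
    public static bool Run()
    {
        const int count = 1000;
        var list = new ParallelAppendSketch(count);
        var timesHandedOut = new int[count];

        Parallel.For(0, count, i =>
        {
            int idx = list.AddNoResize(i);
            Interlocked.Increment(ref timesHandedOut[idx]);
        });

        if (list.Length != count) return false;
        foreach (int n in timesHandedOut)
        {
            if (n != 1) return false;
        }
        return true;
    }
}
```

The order of the elements depends on thread scheduling, which is why the check only verifies that every index was reserved exactly once.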
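Appending the same element several times at the end
// NativeList
public void AddReplicate(in T value, int count) // "in" passes the argument by read-only reference
{
    CheckArgPositive(count);
    m_ListData->AddReplicate(in value, CollectionHelper.AssumePositive(count));
}

// UnsafeList
public void AddReplicate(in T value, int count)
{
    var idx = m_length;
    if (m_length + count > Capacity)
    {
        Resize(m_length + count);
    }
    else
    {
        m_length += count;
    }

    // Get a pointer to where value is stored. & is the address-of operator, which returns the memory address of a
    // variable; assigning &value to void* ptr stores the address of value in the pointer variable ptr
    fixed (void* ptr = &value)
    {
        // This uses pointer arithmetic on T* directly (elsewhere the code casts to byte* first), so Ptr + idx advances by idx * sizeof(T) bytes
        UnsafeUtility.MemCpyReplicate(Ptr + idx, ptr, UnsafeUtility.SizeOf<T>(), count);
    }
}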
Removing elements
Removing a single element
// NativeList
public void RemoveAt(int index)
{
    m_ListData->RemoveAt(index);
}

// UnsafeList
public void RemoveAt(int index)
{
    CollectionHelper.CheckIndexInRange(index, m_length);
    index = CollectionHelper.AssumePositive(index);

    T* dst = Ptr + index;
    T* src = dst + 1;
    m_length--;

    // Because these tend to be smaller (< 1MB), and the cost of jumping context to native and back is
    // so high, this consistently optimizes to better code than UnsafeUtility.MemCpy
    for (int i = index; i < m_length; i++)
    {
        // For performance this avoids the MemCpy call; each following element is still copied one slot forward through the two pointers
        *dst++ = *src++;
    }
}
Removing a range
// NativeList
public void RemoveRange(int index, int count)
{
    m_ListData->RemoveRange(index, count);
}

// UnsafeList
public void RemoveRange(int index, int count)
{
    CheckIndexCount(index, count);

    index = CollectionHelper.AssumePositive(index);
    count = CollectionHelper.AssumePositive(count);

    if (count > 0)
    {
        int copyFrom = math.min(index + count, m_length);
        var sizeOf = sizeof(T);
        void* dst = (byte*)Ptr + index * sizeOf;
        void* src = (byte*)Ptr + copyFrom * sizeOf;
        // Move the elements after the removed range forward in one copy
        UnsafeUtility.MemCpy(dst, src, (m_length - copyFrom) * sizeOf);
        m_length -= count;
    }
}
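Both removal methods come down to shifting the tail of the buffer forward and shrinking the length; the capacity is untouched. Here is a small managed sketch of the range case, assuming a hypothetical `ShiftRemoveSketch` class and using `Array.Copy` (which handles overlapping ranges) in place of the native copy.

```csharp
using System;

// Hypothetical array-backed list showing removal by shifting the tail forward.
public class ShiftRemoveSketch
{
    private readonly int[] _items;
    private int _length;

    public ShiftRemoveSketch(params int[] initial)
    {
        _items = (int[])initial.Clone();
        _length = initial.Length;
    }

    public int Length => _length;

    public void RemoveRange(int index, int count)
    {
        if (index < 0 || count < 0 || index + count > _length)
            throw new ArgumentOutOfRangeException(nameof(index));

        int copyFrom = index + count;
        // Shift everything after the removed range forward; the capacity stays the same.
        Array.Copy(_items, copyFrom, _items, index, _length - copyFrom);
        _length -= count;
    }

    // Returns only the live elements, ignoring the stale slots past the length.
    public int[] ToArray()
    {
        var result = new int[_length];
        Array.Copy(_items, result, _length);
        return result;
    }
}
```

Starting from `1, 2, 3, 4, 5, 6`, calling `RemoveRange(1, 2)` leaves `1, 4, 5, 6` with a length of 4, while the backing array still has six slots.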
The Length property
Normally you only read it; avoid setting it.
// NativeList
public int Length
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    readonly get
    {
        return CollectionHelper.AssumePositive(m_ListData->Length); // returns the length of the UnsafeList
    }

    set
    {
        m_ListData->Resize(value, NativeArrayOptions.ClearMemory); // setting the length calls Resize
    }
}

// UnsafeList
public int Length
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    readonly get => CollectionHelper.AssumePositive(m_length); // m_length is assigned during initialization

    set
    {
        if (value > Capacity)
        {
            Resize(value);
        }
        else
        {
            m_length = value;
        }
    }
}
IEnumerable implementation
The enumerator returned is a NativeArray<T>.Enumerator, so the list is first converted to a NativeArray. AsArray does not copy anything, it only hands the pointer over; ToArray makes a copy.
public NativeArray<T> AsArray()
{
    var array = NativeArrayUnsafeUtility.ConvertExistingDataToNativeArray<T>(m_ListData->Ptr, m_ListData->Length, Allocator.None);
    return array;
}

public unsafe static NativeArray<T> ConvertExistingDataToNativeArray<T>(void* dataPointer, int length, Allocator allocator) where T : struct
{
    CheckConvertArguments<T>(length);
    NativeArray<T> result = default(NativeArray<T>);
    result.m_Buffer = dataPointer;
    result.m_Length = length;
    result.m_AllocatorLabel = allocator;
    result.m_MinIndex = 0;
    result.m_MaxIndex = length - 1;
    return result;
}

public NativeArray<T> ToArray(AllocatorManager.AllocatorHandle allocator)
{
    NativeArray<T> result = CollectionHelper.CreateNativeArray<T>(Length, allocator, NativeArrayOptions.UninitializedMemory);
    UnsafeUtility.MemCpy((byte*)result.m_Buffer, (byte*)m_ListData->Ptr, Length * UnsafeUtility.SizeOf<T>());
    return result;
}
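The practical consequence of view versus copy is that a view keeps seeing later writes to the list while a copy does not. As an analogy in plain C# (this is not the Unity API), a `Span<int>` over an array plays the role of AsArray and `Span<int>.ToArray()` plays the role of ToArray; the class name `ViewVersusCopy` is made up for the sketch.

```csharp
using System;

public static class ViewVersusCopy
{
    public static (int viewSees, int copySees) Run()
    {
        int[] storage = { 1, 2, 3, 4 };
        int length = 3; // pretend only the first three slots are in use

        // A view: same memory, no copy (the role AsArray plays).
        Span<int> view = storage.AsSpan(0, length);
        // A copy: new memory with the same contents (the role ToArray plays).
        int[] copy = storage.AsSpan(0, length).ToArray();

        storage[0] = 99; // later write to the underlying storage

        return (view[0], copy[0]);
    }
}
```

`ViewVersusCopy.Run()` returns `(99, 1)`: the view observes the write, the copy keeps the old value. The same reasoning suggests that an array obtained from AsArray should not be used after the list reallocates or is disposed, because it still points at the old buffer.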
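INativeList
An interface abstracted out of the list; other native containers use it as well.
// An indexable interface
public interface IIndexable<T> where T : unmanaged
{
    int Length { get; set; }    // length of the collection, i.e. the number of elements
    ref T ElementAt(int index); // get an element by index; note that struct elements that will be written to must be accessed by ref
}

// A custom native list interface that defines the shared members
public interface INativeList<T> : IIndexable<T> where T : unmanaged
{
    int Capacity { get; set; }
    bool IsEmpty { get; }
    T this[int index] { get; set; }
    void Clear();
}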
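Summary
NativeList is a wrapper around UnsafeList, and all of the core logic lives in UnsafeList. The overall implementation is not very different from an ordinary C# List<T>, apart from working on native memory directly. Keep the difference between the number of elements (length) and the size of the allocated memory (capacity) in mind. The one special point is how the capacity is chosen: it is never smaller than the number of elements that fit in one CPU L1 cache line, and it is rounded up to a power of two.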
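【Memory Allocation】
This is fairly fine-grained memory allocation. The memory generally stores the data of objects of one specific type T and is called a Block. A data structure is usually needed to record the following information about it:
- Pointer to the memory: Block.Range.Pointer
- Required size in bytes: Block.Bytes. Because of alignment, the memory actually allocated can be larger than the size required
- Alignment required by T: Block.Alignment
- Size of one T in bytes: Block.BytesPerItem
- Maximum number of objects the memory can hold: Block.Range.Items. Multiplying it by Block.BytesPerItem gives the size in bytes; upper layers rarely use it, so it does not strictly need to be listed
- How the memory was allocated (may be present): Block.Range.Allocator
Memory may also be reused. Suppose a Block originally held 20 T objects and after a while the upper layer no longer needs them and releases them. The underlying memory manager does not have to free the memory immediately; it may keep it cached. If a new request for 10 T objects arrives soon after, the already allocated Block can be handed out directly. This kind of caching is very common, and upper-layer code often does the same.
(When to reuse an existing block and when not to cache depends on where it is used; the strategy differs from case to case.)
With reuse, some extra information has to be recorded:
- Number of objects already allocated: AllocatedItems
- Size in bytes of the objects already allocated: AllocatedBytes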
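The bookkeeping listed above can be pictured with a small struct. This is a simplified sketch, not Unity's `AllocatorManager.Block`: the type `BlockSketch`, its flat field layout, and the `CanReuseFor` helper are assumptions made for illustration, and the pointer and allocator fields are left out.

```csharp
using System;

// Hypothetical, simplified stand-in for the Block bookkeeping described above.
public struct BlockSketch
{
    public long BytesPerItem;   // size of one T
    public int Alignment;       // alignment required by T
    public int Items;           // number of items currently requested
    public int AllocatedItems;  // number of items the backing memory can hold

    // Size needed for the current request.
    public long Bytes => BytesPerItem * Items;

    // Size of the memory that is actually backing the block.
    public long AllocatedBytes => BytesPerItem * AllocatedItems;

    // A cached block can serve a new request if the request fits in what is already allocated.
    public bool CanReuseFor(int requestedItems) => requestedItems <= AllocatedItems;
}

public static class BlockSketchDemo
{
    public static (bool reusable, long bytes, long allocatedBytes) Run()
    {
        // A block that was allocated for 20 items of 16 bytes each and then cached.
        var cached = new BlockSketch { BytesPerItem = 16, Alignment = 16, Items = 20, AllocatedItems = 20 };

        // A new request for 10 items arrives: reuse the cached memory instead of allocating.
        bool reusable = cached.CanReuseFor(10);
        if (reusable)
        {
            cached.Items = 10;
        }

        return (reusable, cached.Bytes, cached.AllocatedBytes);
    }
}
```

With these numbers `BlockSketchDemo.Run()` returns `(true, 160, 320)`: the request needs 160 bytes while the cached block still owns 320, which is exactly why the requested and the allocated counts are tracked separately.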
Allocation goes through AllocatorManager.TryLegacy(), which calls Memory.Unmanaged.Allocate and Array.Resize, and in the end still reaches UnsafeUtility.MallocTracked. As covered in the earlier article, NativeArray allocation ends up in the same method.
// Handles both allocation and freeing; it is not obvious to me why the two are written together
static unsafe int TryLegacy(ref Block block)
{
    if (block.Range.Pointer == IntPtr.Zero) // Allocate
    {
        block.Range.Pointer = (IntPtr)Memory.Unmanaged.Allocate(block.Bytes, block.Alignment, LegacyOf(block.Range.Allocator));
        block.AllocatedItems = block.Range.Items;
        return (block.Range.Pointer == IntPtr.Zero) ? -1 : 0;
    }

    if (block.Bytes == 0) // Free; the caller sets this to 0 when freeing
    {
        if (LegacyOf(block.Range.Allocator) != Allocator.None)
        {
            Memory.Unmanaged.Free((void*)block.Range.Pointer, LegacyOf(block.Range.Allocator));
        }

        block.Range.Pointer = IntPtr.Zero;
        block.AllocatedItems = 0;
        return 0;
    }

    // Reallocate (keep existing pointer and change size if possible. otherwise, allocate new thing and copy)
    return -1;
}

// What gets passed in is the Allocator enum, yet it is used as the AllocatorHandle struct. That works because of an implicit conversion:
public static implicit operator AllocatorHandle(Allocator a) => new AllocatorHandle
{
    Index = (ushort)((uint)a & 0xFFFF),
    Version = 0
};