动手实现引用计数

Mike Ash Friday Q&A 中文译文:动手实现引用计数

作者 TommyWu
封面圖片: 动手实现引用计数

译文 · 原文: Friday Q&A 2011-09-16: Let's Build Reference Counting · 作者 Mike Ash

原文:https://www.mikeash.com/pyblog/friday-qa-2011-09-16-lets-build-reference-counting.html 发布:2011-09-16 作者:Mike Ash 译者:MiMo(mimo-v2.5-pro);代码块保留英文原样


上次我讨论了如何构建 NSAutoreleasePool 及其内部工作原理。今天,我将延续这个主题,构建一个使用 retainrelease 的 Cocoa 引用计数实现,这个话题由 David Dunham 提出。

背景
阅读本文的每位读者可能都已对引用计数(reference count)有了基本了解,但详细讨论以确保相关概念清晰是必要的。

每个 Objective-C 对象都有一个关联的引用计数。在 Cocoa 术语中,这被称为” retain count”(引用计数),但我认为这容易造成混淆,因此我将使用” 引用计数”。一个新创建的对象初始引用计数为 1。发送 retain 消息会使其递增,发送 release 消息会使其递减。如果 release 将引用计数减至零,则通过向该对象发送 dealloc 消息来销毁它。

由于引用计数是通过普通的 Objective-C 消息实现的,它可以被特定类重写。只要该重写遵循上述语义,一切仍能正常运作。

由于历史原因,NSObject 中默认的引用计数实现并没有将引用计数存储在对象自身中。相反,引用计数存储在一张外部表中。这样做可以节省一些内存,但牺牲了速度。在我重新实现引用计数时,也将采用这种基于表的方案。

CoreFoundation 对象以及其他许多 Cocoa 对象都重写了 retainrelease,以便将引用计数直接存储在对象自身中,从而略微提升了性能。

Lion 和 ARC 都对 Cocoa 内存管理系统进行了实质性的改进,但上面讨论的基本原理仍然成立。Lion 重新实现了基于表的引用计数系统,使其速度显著提升,但基本思路没有改变。

ARC 最终仍然在幕后执行 retainrelease。出于性能考虑,它实际上会调用运行时(runtime)函数来执行 retain 和 release,在某些情况下可以绕过 Objective-C 的消息发送(message sending)。然而,它仍然尊重对 retainrelease 的重写,并且在概念上,ARC 仍然只是在向其管理的对象发送 retainrelease 消息。

源码
一如既往,今日的代码已发布于 GitHub:https://github.com/mikeash/refcounting

设计
在深入代码本身之前,我们先探讨此设计的具体细节。需要实现两个方法。为避免与 Cocoa 的实现混淆,我将我的方法命名为 ma_retainma_release

引用计数表将使用 CFMutableDictionary 实现,其中对象指针映射到整数。这是一种相当快速且直接的方式。

为简化处理,引用计数为 1 时将不在表中记录条目。换言之,任何指针的默认引用计数均为 1。这意味着新对象会自动获得正确的初始引用计数 1。这也是 Cocoa 的实现方式(译注:现代 Runtime 已采用优化后的散列表和 SideTable 结构)。

线程安全对于任何引用计数实现都至关重要。一个类通常无法控制自身在哪个线程上被 retain 和 release,即使是本身线程不安全的类,也需要能够承受同时在不同线程上被 retain 和 release 的情况,因为追踪和控制 retain 与 release 发生的位置极其困难。为了使 CFMutableDictionary(可变字典)变得安全,我使用 OSSpinLock(自旋锁)来保护它。任何锁都可以,但自旋锁非常适合此场景,因为只要临界区执行速度快,它们的开销就很小。

表上有两个原语操作:GetRefcountSetRefcount。它们负责处理操作表的繁琐工作,例如将没有条目视为计数为 1、当对象引用计数降至 2 以下时将其从表中移除,等等。它们不锁定自旋锁,这部分交由调用者负责。这样做是必要的,因为 ma_retainma_release 都需要原子地对表执行两个操作,因此它们需要在两个操作周围加锁。(译注:OSSpinLock 存在优先级反转等问题,在现代 Apple 平台上已不推荐使用,替代方案请参考 os_unfair_lock 等机制。)

构建在这两个基础之上的,是两个稍高一阶的函数:IncrementRefcountDecrementRefcountIncrementRefcount 接收一个指针,简单地增加该指针在表格中的引用计数。DecrementRefcount 同样接收一个指针,递减其引用计数。它还会返回新的计数值,这对实现对象释放(deallocation)至关重要。

最后是方法本身,它们是这些高阶函数的轻型封装。ma_retain 只是调用 IncrementRefcountma_release 调用 DecrementRefcount,如果返回值为 0,则调用 [self dealloc]

代码
看文字累了?让我们来看些代码。

引用计数表格使用了两个全局变量。一个是表格本身的字典,另一个是自旋锁(spinlock):

static CFMutableDictionaryRef gRefcountDict;
static OSSpinLock gRefcountDictLock;

GetRefcount 函数检索给定指针的当前引用计数(reference count)。gRefcountDict 变量按需懒加载初始化,因此该函数首先检查该字典是否已初始化。如果尚未初始化,则表明尚无任何对象拥有引用计数,因此该对象的引用计数必为 1:

static uintptr_t GetRefcount(void *key)
{
if(!gRefcountDict)
return 1;

如果字典确实存在,则从其中取出对应的值。如果该值不存在,则再次返回 1。否则,返回字典中的实际存储值:

const void *value;
if(!CFDictionaryGetValueIfPresent(gRefcountDict, key, &value))
return 1;
return (uintptr_t)value;
}

SetRefcount 函数将指定指针的引用计数(reference count)设置为给定值。若该值为 1 或 0,它会直接将对象从表中移除:

static void SetRefcount(void *key, uintptr_t count)
{
if(count <= 1)
{
if(gRefcountDict)
CFDictionaryRemoveValue(gRefcountDict, key);
}

如果表尚未创建,那么就无须任何操作,因为此时不可能存在任何待移除的条目。

如果要设置的值大于 1,程序会首先延迟初始化(lazily initializes)该表,随后在其中设置对象的对应值:

else
{
if(!gRefcountDict)
gRefcountDict = CFDictionaryCreateMutable(NULL, 0, NULL, NULL);
CFDictionarySetValue(gRefcountDict, key, (void *)count);
}
}

IncrementRefcount 函数接受一个指针,并在表中递增其对应的值。为了实现这一点,它首先获取自旋锁(spinlock)。在持有锁的情况下,它读取当前值,加一,然后设置新的值。最后,它释放自旋锁:

static void IncrementRefcount(void *key)
{
OSSpinLockLock(&gRefcountDictLock);
uintptr_t count = GetRefcount(key);
SetRefcount(key, count + 1);
OSSpinLockUnlock(&gRefcountDictLock);
}

DecrementRefcount 函数的实现大同小异。唯一的真正区别在于它需要返回新的计数值。为此,函数先将新计数值保存在一个临时变量中,然后在函数末尾将其返回:

static uintptr_t DecrementRefcount(void *key)
{
OSSpinLockLock(&gRefcountDictLock);
uintptr_t count = GetRefcount(key);
uintptr_t newCount = count - 1;
SetRefcount(key, newCount);
OSSpinLockUnlock(&gRefcountDictLock);
return newCount;
}

接下来是包装这些函数的那些方法。这些方法很简单,大部分都不言自明:

@implementation NSObject (MARefcounting)
- (id)ma_retain
{
IncrementRefcount(self);
return self;
}
- (void)ma_release
{
uintptr_t newCount = DecrementRefcount(self);
if(newCount == 0)
[self dealloc];
}
@end

需要注意的是,ma_retain 方法返回 self,是为了遵循 Cocoa 框架的传统,从而允许这样的语句写法:

object = [otherObject retain];

引用计数系统本身并无此内在要求。

自定义引用计数系统的所有内容就是这些!

为了测试,我编写了一些快速代码,生成一系列对象,在压力测试中反复进行 retain 和 release 操作,并确保它们最终被释放。此处不再复现具体代码,但您可以在 GitHub 上查看:https://github.com/mikeash/refcounting/blob/master/refcounting.m#L80

总结

与自动释放池(autorelease pools)类似,通过这个简单的实现,我们可以揭开 Cocoa 内存管理的幕布,看到背后既没有巫师也没有奥秘。它只是一个使用标准 Objective-C 消息发送(messaging)的简单计数系统。Cocoa 的系统使用了大量技巧来使其运行快速,但从根本上说,它与这个快速简单的版本是一样的。再者,我们可以看到一些简单问题的答案,比如 “如果我 retain 一个对象两次会怎样?” 或者 “我能否在一个线程上 retain 一个对象,然后在另一个线程上 release 它?”。如果你 retain 一个对象两次,你就需要 release 它两次,因为这只是一个简单的递增和递减操作。Retain 和 release 并不关心你在哪个线程上,只要它们能够正确地平衡和同步,一切都会正常工作。

今天的探索到此结束。两周后请回来,我们将继续 Cocoa 内存管理系统的旅程第三部分,讨论 ARC(自动引用计数)、其工作原理以及如何使用它。一如既往,Friday Q & A 由读者的建议驱动,所以如果你有希望看到的话题,请发送过来!


#Original (English)

Source: https://www.mikeash.com/pyblog/friday-qa-2011-09-16-lets-build-reference-counting.html

Last time, I discussed how to build NSAutoreleasePool and how it works internally. Today, I’m going to carry that theme forward by building an implementation of Cocoa reference counting with retain and release, a topic suggested by David Dunham.

BackgroundEveryone reading this probably already has a basic grasp of what reference counting is, but it’s good to discuss it in detail to make sure the relevant concepts are clear.

Every Objective-C object has an associated reference count. In Cocoa terminology this is referred to as a “retain count”, but I think that just confuses matters so I’ll use “reference count”. A newly created object starts out with a reference count of 1. A retain message increments it, and a release message decrements it. If release decrements the reference count to zero, the object is destroyed by sending it the dealloc message.

Because reference counting is implemented with plain Objective-C messages, it can be overridden by a particular class. As long as that override follows the above semantics, everything still works.

For what are now largely historical reasons, the default reference counting implementation in NSObject does not store the reference count in the object itself. Instead, the reference count is stored in an external table. This can save some memory at the cost of speed. I will replicate this table-based approach in my reimplementation of reference counting.

CoreFoundation objects and many other Cocoa objects override retain and release in order to store the reference count in the object itself instead, which increases performance a bit.

Lion and ARCLion and ARC both saw some substantial changes in the Cocoa memory management system, but the basics discussed above still hold true. Lion reimplemented the table-based reference count system to make it significantly faster, but the fundamental idea is still the same.

ARC still ends up doing retain and release behind the scenes. For performance reasons, it actually calls runtime functions to retain and release which are able to bypass Objective-C message sends in some cases. However, it still respects retain and release overrides and conceptually ARC is still just sending retain and release messages to the objects that it manages.

Source CodeAs usual, the code for today is available on GitHub here: https://github.com/mikeash/refcounting

DesignBefore I get into the code itself, let’s talk about the specifics of this design. There are two methods which need to be implemented. To keep mine separate from Cocoa’s, I’ll call mine ma_retain and ma_release.

The reference count table will be implemented using a CFMutableDictionary mapping object pointers to integers. This provides a reasonably fast and direct approach.

To make life simpler, the reference count of 1 will be indicated by not having an entry in the table at all. In other words, any pointer’s reference count is 1 by default. This means that new objects get the correct reference count of 1 automatically. This is how Cocoa’s implementation works as well.

Thread safety is critical for any reference counting implementation. A class typically has no control over which threads it gets retained and released on, and even an otherwise thread-unsafe class needs to be able to withstand being retained and released on separate threads simultaneously, since it’s extremely difficult to track and control where retains and releases happen. To make the CFMutableDictionary safe, I protect it with an OSSpinLock. Any lock will do, but spinlocks are well suited to this case, since they are cheap as long as the critical region is fast.

There are two primitive operations on the table: GetRefcount and SetRefcount. These take care of the grunt work of manipulating the table, of treating the lack of an entry as 1, removing objects from the table when their reference count drops below 2, and such. They do not lock the spinlock, and that is left up to the caller. This is necessary because both ma_retain and ma_release need to perform two operations on the table atomically, so they need to lock around both.

Building on top of these two primitives are two slightly higher level functions: IncrementRefcount and DecrementRefcount. IncrementRefcount takes a pointer and simply increases that pointer’s reference count in the table. DecrementRefcount takes a pointer and decrement its reference count. It also returns the new count, which is important for implementing deallocation.

Finally, there are the methods, which are thin wrappers around these higher level functions. ma_retain just calls IncrementRefcount. ma_release calls DecrementRefcount, and if the return value is 0, calls [self dealloc].

CodeTired of words? Let’s do some code.

Two global variables are used for the reference count table. One is the dictionary for the table itself, and the other is the spinlock:

static CFMutableDictionaryRef gRefcountDict;
static OSSpinLock gRefcountDictLock;

The GetRefcount function retrieves the current reference count for the given pointer. The gRefcountDict variable is initialized lazily, on demand, so the first thing this function does is check to see if the dictionary has been initialized yet. If it hasn’t, then it knows that no object has a reference count yet, so the reference count for this object must be 1:

static uintptr_t GetRefcount(void *key)
{
if(!gRefcountDict)
return 1;

If the dictionary does exist, it grabs the value out of it. If the value isn’t present, it returns 1 again. Otherwise it returns whatever value is in the dictionary:

const void *value;
if(!CFDictionaryGetValueIfPresent(gRefcountDict, key, &value))
return 1;
return (uintptr_t)value;
}

The SetRefcount function sets the reference count for the given pointer to the given value. If the value is 1 or 0, it simply removes the object from the table:

static void SetRefcount(void *key, uintptr_t count)
{
if(count <= 1)
{
if(gRefcountDict)
CFDictionaryRemoveValue(gRefcountDict, key);
}

If the table hasn’t been created yet then there’s nothing to do, since there can’t be any entry to remove.

If the value to set is greater than 1, it first lazily initializes the table, then sets the object’s value in it:

else
{
if(!gRefcountDict)
gRefcountDict = CFDictionaryCreateMutable(NULL, 0, NULL, NULL);
CFDictionarySetValue(gRefcountDict, key, (void *)count);
}
}

The IncrementRefcount function takes a pointer and increments its value in the table. To do this, it first acquires the spinlock. With the lock held, it fetches the current value, adds one, and then sets that new value. Finally, it releases the spinlock:

static void IncrementRefcount(void *key)
{
OSSpinLockLock(&gRefcountDictLock);
uintptr_t count = GetRefcount(key);
SetRefcount(key, count + 1);
OSSpinLockUnlock(&gRefcountDictLock);
}

The DecrementRefcount function is much the same. The only real difference is that it needs to return the new count. To make this work, it saves the new count in a temporary variable, then returns it at the end of the function:

static uintptr_t DecrementRefcount(void *key)
{
OSSpinLockLock(&gRefcountDictLock);
uintptr_t count = GetRefcount(key);
uintptr_t newCount = count - 1;
SetRefcount(key, newCount);
OSSpinLockUnlock(&gRefcountDictLock);
return newCount;
}

Next up are the methods which wrap these functions. These are simple and mostly self explanatory:

@implementation NSObject (MARefcounting)
- (id)ma_retain
{
IncrementRefcount(self);
return self;
}
- (void)ma_release
{
uintptr_t newCount = DecrementRefcount(self);
if(newCount == 0)
[self dealloc];
}
@end

Note that ma_retain returns self to stick with the Cocoa tradition of allowing statements like this:

object = [otherObject retain];

It’s not an intrinsic requirement of the reference counting system.

That’s all there is to the custom reference counting system!

To test it, I wrote some quick code that generates a bunch of objects, retains and releases them in a stress test, and makes sure that they get deallocated. I won’t reproduce it here, but you can view that code on GitHub here: https://github.com/mikeash/refcounting/blob/master/refcounting.m#L80

ConclusionAs with autorelease pools, with this simple implementation we can pull back the curtain from Cocoa memory management and see that there is no wizard and no mystery behind the scenes. It’s just a simple counting system using standard Objective-C messaging. Cocoa’s system has plenty of tricks to make it fast, but fundamentally it’s the same as this quick and simple version. Again, we can see the answers to simple questions like “What happens if I retain an object twice?” or “Can I retain an object on one thread and release it on another thread?” If you retain an object twice, you need to release it twice, since it’s just a simple increment and decrement. Retain and release don’t care what thread you’re on, so as long as they balance and synchronize correctly, everything works fine.

That wraps up today’s expedition. Come back in two weeks for part three of this journey through the Cocoa memory management system with a discussion of ARC, how it works, and how to use it. As always, Friday Q&A is driven by reader suggestions, so if you have a topic that you would like to see covered, send it in!