A Tour of PLWeakCompatibility: Part II

Published: June 1, 2012

Author: TommyWu

Translation · Original: Friday Q&A 2012-06-01: A Tour of PLWeakCompatibility: Part II · by Mike Ash

Original: https://www.mikeash.com/pyblog/friday-qa-2012-06-01-a-tour-of-plweakcompatibility-part-ii.html Published: 2012-06-01 Author: Mike Ash Translator: MiMo (mimo-v2.5-pro); code blocks are kept in their original English form

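Last time, I discussed the basics of PLWeakCompatibility: the motivation, the basic hooks used to get the compiler to call our code when handling __weak variables, and calling through to the original implementations where available. Today, I'm going to discuss the implementation of the zeroing weak reference facility that gets used when the runtime doesn't supply its own __weak support.

To recap, the functions implemented by PLWeakCompatibility that are called directly by the compiler's generated code are:

    PLObjectPtr objc_loadWeakRetained(PLObjectPtr *location);
    PLObjectPtr objc_initWeak(PLObjectPtr *addr, PLObjectPtr val);
    void objc_destroyWeak(PLObjectPtr *addr);
    void objc_copyWeak(PLObjectPtr *to, PLObjectPtr *from);
    void objc_moveWeak(PLObjectPtr *to, PLObjectPtr *from);
    PLObjectPtr objc_loadWeak(PLObjectPtr *location);
    PLObjectPtr objc_storeWeak(PLObjectPtr *location, PLObjectPtr obj);

PLObjectPtr is just a typedef for void *, used to prevent ARC from inserting memory management code into these functions, where it's definitely not wanted. All of these functions start with code that calls through to the Objective-C runtime's implementation where available. When the official runtime functions aren't available, these seven functions are broken down and implemented in terms of three primitive functions:

    PLObjectPtr PLLoadWeakRetained(PLObjectPtr *location);
    void PLRegisterWeak(PLObjectPtr *location, PLObjectPtr obj);
    void PLUnregisterWeak(PLObjectPtr *location, PLObjectPtr obj);

The PLLoadWeakRetained function loads a weak reference out of the given location and returns a retained reference to the object it contains, or nil if the object has already been destroyed. The PLRegisterWeak function registers a particular memory location as a weak reference to the given object, and PLUnregisterWeak removes that registration. The task that remains is to implement these three functions.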

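MAZeroingWeakRef

The goal is to make PLWeakCompatibility completely self-contained with its own zeroing weak reference implementation, which is necessarily somewhat simple. However, we also wanted it to use MAZeroingWeakRef where available, as the presence of that library in the app probably means that the programmer likes that implementation, and we might as well take advantage of it. Thus, the first task is to detect the presence of MAZeroingWeakRef and, if it's present, implement the three primitive functions using it.

First, we need a way to detect whether MAZeroingWeakRef is present, and get a reference to the class if so. This is all wrapped up in a simple function that uses NSClassFromString to attempt to fetch the MAZeroingWeakRef class, within a dispatch_once call to minimize the overhead. It also has an extra flag that allows disabling the MAZeroingWeakRef functionality for testing purposes:

    static Class MAZWR = Nil;
    static bool mazwrEnabled = true;
    static inline bool has_mazwr () {
        if (!mazwrEnabled)
            return false;

        static dispatch_once_t lookup_once = 0;
        dispatch_once(&lookup_once, ^{
            MAZWR = NSClassFromString(@"MAZeroingWeakRef");
        });

        if (MAZWR != nil)
            return true;
        return false;
    }

Now code can simply call has_mazwr and, if it returns true, use MAZWR to get a reference to the class.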

The strategy for implementing the three primitives using MAZeroingWeakRef is pretty straightforward. Each primitive gets a location where the weak reference is to be stored, and nothing says that this location is required to directly hold a pointer to the weakly referenced object. Thus, we use the passed-in location to store a pointer to a MAZeroingWeakRef instance, which in turn references the original object. PLRegisterWeak simply creates a new instance of MAZeroingWeakRef and places it in the given location. PLUnregisterWeak simply releases that instance. And PLLoadWeakRetained can just call through to the object's -target method.

Each primitive function checks has_mazwr at the top to decide what to do. The beginning of each primitive function contains the MAZeroingWeakRef calls, and they are:

    static PLObjectPtr PLLoadWeakRetained(PLObjectPtr *location) {
        if (has_mazwr()) {
            MAZeroingWeakRef *mazrw = (__bridge MAZeroingWeakRef *) *location;
            return objc_retain([mazrw target]);
        }
        ...

    static void PLRegisterWeak(PLObjectPtr *location, PLObjectPtr obj) {
        if (has_mazwr()) {
            MAZeroingWeakRef *ref = [[MAZWR alloc] initWithTarget: obj];
            *location = (__bridge_retained PLObjectPtr) ref;
            return;
        }
        ...

    static void PLUnregisterWeak(PLObjectPtr *location, PLObjectPtr obj) {
        if (has_mazwr()) {
            if (*location != nil)
                objc_release(*location);
            return;
        }
        ...

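The Built-In Implementation

Now for the real meat: the built-in zeroing weak reference implementation. The strategy is to swizzle out release and dealloc on target objects. The swizzled release method adds the object to a list of releasing objects, blocking anyone attempting to resolve a weak reference while the release is occurring. This ensures that no weak reference can be resolved while an object's final release triggers its dealloc. The swizzled dealloc method then clears out all weak references to the object.

The first thing we need is a mutex to protect all of the shared data structures:

    static pthread_mutex_t gWeakMutex;

We also need a way to track all of the weak references currently registered for any given object. This comes in the form of a CFMutableDictionary mapping objects to CFMutableSet instances containing the registered weak addresses:

    static CFMutableDictionaryRef gObjectToAddressesMap;

We're likely to see weak references to multiple instances of the same class, so we need to track which classes have been swizzled and avoid swizzling them twice. This is done by tracking the swizzled classes in a set:

    static CFMutableSetRef gSwizzledClasses;

The objects that are currently in the middle of a release call are stored in a bag:

    static CFMutableBagRef gReleasingObjects;

Since some code needs to wait around for gReleasingObjects to change, we also need a condition variable for it to wait on. (If you're unfamiliar with condition variables, I'll discuss them more later.)

    static pthread_cond_t gReleasingObjectsCond;

Some data also needs to be stored in thread-local storage. Two tables that assist with the swizzling process reside there. They are stored in a struct accessed through a pthread thread-local storage key:

    static pthread_key_t gTLSKey;

For convenience, we also put all of the selectors necessary for swizzling into global variables:

    static SEL releaseSEL;
    static SEL releaseSELSwizzled;
    static SEL deallocSEL;
    static SEL deallocSELSwizzled;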

Every primitive function needs to access these variables, and they all need to be initialized before use. This initialization is all wrapped in a dispatch_once:

    static void WeakInit(void) {
        static dispatch_once_t pred;
        dispatch_once(&pred, ^{

The first thing it does is initialize the mutex, using the recursive attribute:

            pthread_mutexattr_t attr;
            pthread_mutexattr_init(&attr);
            pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);

            pthread_mutex_init(&gWeakMutex, &attr);

            pthread_mutexattr_destroy(&attr);

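Next, the map of objects to addresses and the set of swizzled classes are created:

            gObjectToAddressesMap = CFDictionaryCreateMutable(NULL, 0, NULL, &kCFTypeDictionaryValueCallBacks);

            gSwizzledClasses = CFSetCreateMutable(NULL, 0, NULL);

By passing NULL for the dictionary key callbacks and the set callbacks, we ensure that CoreFoundation doesn't try to do any sort of memory management on the keys and set members.

The bag of releasing objects is initialized, as well as the condition variable used to wait on it:

            gReleasingObjects = CFBagCreateMutable(NULL, 0, NULL);
            pthread_cond_init(&gReleasingObjectsCond, NULL);

Next, the pthread thread-local storage key is created. There's a bit of paranoia here with the error checking because this function can realistically fail. The number of thread-local storage keys that can be created is limited and relatively small: 512 keys on Mac OS X currently. This should be more than sufficient, but since it can realistically fail, I want it to fail early and obviously:

            int err = pthread_key_create(&gTLSKey, DestroyTLS);
            if (err != 0) {
                NSLog(@"Error calling pthread_key_create, we really can't recover from that: %s (%d)", strerror(err), err);
                abort();
            }

The DestroyTLS function frees the memory allocated for the thread-local storage. I'll show its implementation momentarily.

Finally, the selectors are initialized. We can't use the standard @selector construct for release and dealloc, because this is not allowed with ARC. Instead, we use sel_getUid (the Objective-C runtime equivalent of NSSelectorFromString) to fetch the selectors under ARC's nose:

            releaseSEL = sel_getUid("release");
            releaseSELSwizzled = sel_getUid("release_PLWeakCompatibility_swizzled");
            deallocSEL = sel_getUid("dealloc");
            deallocSELSwizzled = sel_getUid("dealloc_PLWeakCompatibility_swizzled");
        });
    }

Each primitive function then makes a call to WeakInit before it does anything else, ensuring that all of these variables are set up. I'll omit that call, as well as the MAZeroingWeakRef code, when discussing the implementation of those functions, just to keep things simple.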
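The recursive attribute used in WeakInit means that a thread which already holds the mutex can lock it again without deadlocking, provided it unlocks once per lock. The article does not spell out which call paths re-enter the lock, so the following is only a minimal C sketch of the property itself. The names demo_mutex, demo_mutex_init and nested_lock_depth are hypothetical and not part of PLWeakCompatibility.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose PTHREAD_MUTEX_RECURSIVE under strict -std modes */
#endif
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t demo_mutex;  /* hypothetical name for this sketch */

/* Initialise demo_mutex as a recursive mutex. Returns 0 on success. */
static int demo_mutex_init(void) {
    pthread_mutexattr_t attr;
    int err = pthread_mutexattr_init(&attr);
    if (err != 0)
        return err;
    err = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (err == 0)
        err = pthread_mutex_init(&demo_mutex, &attr);
    pthread_mutexattr_destroy(&attr);
    return err;
}

/* Locks the mutex, then recurses while still holding it.
   Returns the number of nested locks held at the deepest point, or -1 on error. */
static int nested_lock_depth(int levels) {
    if (pthread_mutex_lock(&demo_mutex) != 0)
        return -1;
    int depth = 1;
    if (levels > 1) {
        int inner = nested_lock_depth(levels - 1);
        depth = (inner < 0) ? -1 : inner + 1;
    }
    pthread_mutex_unlock(&demo_mutex);
    return depth;
}
```

With a default (non-recursive) mutex, the second nested lock in nested_lock_depth would deadlock or fail, depending on the mutex type.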

Thread-Local Storage

A pthread thread-local storage key can be used to set and retrieve a single pointer per thread. To store multiple values, we allocate a struct which contains everything we want to store. We need two dictionaries which help the swizzled methods call through to their original implementations. Here is the struct:

    struct TLS {
        // Tables tracking the last class a swizzled method was sent to on an object
        CFMutableDictionaryRef lastReleaseClassTable;
        CFMutableDictionaryRef lastDeallocClassTable;
    };

Since this struct is used in a couple of places, we want a wrapper function that retrieves it, creating it on demand if nothing else has used the TLS struct on that thread yet. pthread_getspecific retrieves the current value, and if the current value is NULL, this function uses pthread_setspecific to store a newly allocated struct under the key:

    static struct TLS *GetTLS(void) {
        struct TLS *tls = pthread_getspecific(gTLSKey);
        if (tls == NULL) {
            tls = calloc(1, sizeof(*tls));
            tls->lastReleaseClassTable = CFDictionaryCreateMutable(NULL, 0, NULL, NULL);
            tls->lastDeallocClassTable = CFDictionaryCreateMutable(NULL, 0, NULL, NULL);
            pthread_setspecific(gTLSKey, tls);
        }
        return tls;
    }

We also need a function to destroy the TLS struct. This function was passed to pthread_key_create, and is automatically called by pthread when a thread is destroyed:

    static void DestroyTLS(void *ptr) {
        struct TLS *tls = ptr;
        if (tls != NULL && tls->lastReleaseClassTable) {
            CFRelease(tls->lastReleaseClassTable);
            CFRelease(tls->lastDeallocClassTable);
        }
        free(tls);
    }

With these in place, any thread can simply call GetTLS() and then manipulate the data in the struct, which is guaranteed to only be visible to the calling thread.
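The per-thread isolation described above can be observed with plain pthreads. This is a rough sketch, not the library's code: struct counters, get_counters and other_thread_is_isolated are hypothetical names, and pthread_once stands in for the dispatch_once call that WeakInit uses.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-thread state, standing in for struct TLS. */
struct counters { int release_calls; int dealloc_calls; };

static pthread_key_t counters_key;
static pthread_once_t counters_once = PTHREAD_ONCE_INIT;

/* Destructor that pthread runs at thread exit for a non-NULL value. */
static void destroy_counters(void *ptr) { free(ptr); }

static void make_key(void) {
    if (pthread_key_create(&counters_key, destroy_counters) != 0)
        abort();  /* fail early and obviously, as the article does */
}

/* Returns the calling thread's struct, allocating it on first use. */
static struct counters *get_counters(void) {
    pthread_once(&counters_once, make_key);
    struct counters *c = pthread_getspecific(counters_key);
    if (c == NULL) {
        c = calloc(1, sizeof(*c));
        if (c == NULL)
            abort();
        pthread_setspecific(counters_key, c);
    }
    return c;
}

struct probe { struct counters *main_counters; int differs; };

/* Runs on a second thread: mutates its own struct and compares addresses. */
static void *worker(void *arg) {
    struct probe *p = arg;
    struct counters *own = get_counters();
    own->release_calls = 42;
    p->differs = (own != p->main_counters);
    return NULL;
}

/* Returns 1 if a second thread got its own struct and left ours untouched. */
static int other_thread_is_isolated(void) {
    struct probe p = { get_counters(), 0 };
    pthread_t t;
    if (pthread_create(&t, NULL, worker, &p) != 0)
        return 0;
    pthread_join(t, NULL);
    return p.differs && p.main_counters->release_calls == 0;
}
```

The worker's struct is freed by destroy_counters when that thread exits, which is the same job DestroyTLS performs in the article.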

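The Primitive Functions

The implementation of PLLoadWeakRetained is relatively straightforward. It needs to acquire the global mutex, then retrieve the object pointer. If the object is currently being released, then it must wait until it's no longer being released.

The first thing it does is acquire the global mutex, then try to fetch the value stored at the given location:

    static PLObjectPtr PLLoadWeakRetained(PLObjectPtr *location) {
        PLObjectPtr obj;
        pthread_mutex_lock(&gWeakMutex); {
            obj = *location;

Next it checks to see if the given object is in the list of releasing objects. If it's there, it uses pthread_cond_wait to block on the condition variable, then reloads the location to get the latest value:

            while (CFBagContainsValue(gReleasingObjects, obj)) {
                pthread_cond_wait(&gReleasingObjectsCond, &gWeakMutex);
                obj = *location;
            }

pthread_cond_wait releases the given mutex and then waits for somebody to signal the condition variable. Once signalled, it re-acquires the mutex and resumes execution. This allows the thread to block until another thread signals that the table of releasing objects changed, at which point it can re-examine it.

This is a while loop rather than a simple if statement for a couple of reasons. One is simply that the signalled change may not be for this object. There's a single condition variable for the whole table, and some other object may be the one that got removed.

The other reason is a bit more interesting and is called spurious wakeup. In short, due to various implementation details, pthread_cond_wait may occasionally return even when nothing has signalled the condition variable. Because of this, any call to pthread_cond_wait should always be wrapped in a loop, not a simple if statement.

Once the object (or nil) is obtained, we simply retain it, release the mutex, and return the retained object.

            objc_retain(obj);
        }
        pthread_mutex_unlock(&gWeakMutex);

        return obj;
    }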
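The wait-in-a-loop pattern from PLLoadWeakRetained can be tried in isolation. The sketch below is a simplified stand-in under stated assumptions: a plain counter (busy_count) replaces the bag of releasing objects, a default mutex replaces the recursive one, and every name in it is hypothetical.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t busy_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t busy_cond = PTHREAD_COND_INITIALIZER;
static int busy_count = 0;  /* stand-in for "the object is in the releasing bag" */

static void mark_busy(void) {
    pthread_mutex_lock(&busy_mutex);
    busy_count++;
    pthread_mutex_unlock(&busy_mutex);
}

static void mark_idle(void) {
    pthread_mutex_lock(&busy_mutex);
    busy_count--;
    pthread_cond_broadcast(&busy_cond);  /* wake every waiter so each can re-check */
    pthread_mutex_unlock(&busy_mutex);
}

/* Blocks until busy_count is zero and returns the value it observed. */
static int wait_until_idle(void) {
    pthread_mutex_lock(&busy_mutex);
    /* A loop, not an if: a wakeup may be spurious or about something else. */
    while (busy_count > 0)
        pthread_cond_wait(&busy_cond, &busy_mutex);
    int seen = busy_count;
    pthread_mutex_unlock(&busy_mutex);
    return seen;
}

static void *releaser(void *unused) {
    (void)unused;
    mark_idle();
    return NULL;
}

/* Marks the state busy, lets another thread clear it, and waits for that. */
static int wait_demo(void) {
    mark_busy();
    pthread_t t;
    if (pthread_create(&t, NULL, releaser, NULL) != 0)
        return -1;
    int seen = wait_until_idle();
    pthread_join(t, NULL);
    return seen;
}
```

Because the predicate is re-tested under the mutex after every wakeup, the result is the same whether the releaser thread runs before or after the waiter starts waiting.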

PLRegisterWeak is a little more complex. The first thing it does is fetch the set of registered addresses for the given object so it can add the new entry:

    static void PLRegisterWeak(PLObjectPtr *location, PLObjectPtr obj) {
        pthread_mutex_lock(&gWeakMutex); {
            CFMutableSetRef addresses = (CFMutableSetRef)CFDictionaryGetValue(gObjectToAddressesMap, obj);

If this is the first weak reference to the given object, then the set won't exist yet. In that case, the function must create it:

            if (addresses == NULL) {
                addresses = CFSetCreateMutable(NULL, 0, NULL);
                CFDictionarySetValue(gObjectToAddressesMap, obj, addresses);
                CFRelease(addresses);
            }

Now that the set is in hand, the passed-in location is added to it:

            CFSetAddValue(addresses, location);

Finally, it calls a helper function to make sure that all of the necessary swizzling has been done:

            EnsureDeallocationTrigger(obj);
        } pthread_mutex_unlock(&gWeakMutex);
    }

We'll get into the implementation of that helper function shortly.

The implementation of PLUnregisterWeak is essentially the inverse of PLRegisterWeak, except that it doesn't need to deal with swizzling, which is simply left in place, and it doesn't bother to delete the set of addresses when it becomes empty:

    static void PLUnregisterWeak(PLObjectPtr *location, PLObjectPtr obj) {
        pthread_mutex_lock(&gWeakMutex); {
            // Remove the location from the set of weakly referenced addresses.
            CFMutableSetRef addresses = (CFMutableSetRef)CFDictionaryGetValue(gObjectToAddressesMap, *location);
            if (addresses != NULL)
                CFSetRemoveValue(addresses, location);
        } pthread_mutex_unlock(&gWeakMutex);
    }

Now for the implementation of EnsureDeallocationTrigger. The first thing it does is fetch the class of the given object, and bail out if that class has already been swizzled:

    static void EnsureDeallocationTrigger(PLObjectPtr obj) {
        Class c = object_getClass(obj);
        if (CFSetContainsValue(gSwizzledClasses, (__bridge const void *)c))
            return;

If the class hasn't been swizzled yet, it swizzles release and dealloc using a small helper function, and finally adds the class to the set of swizzled classes:

        Swizzle(c, releaseSEL, releaseSELSwizzled, (IMP)SwizzledReleaseIMP);
        Swizzle(c, deallocSEL, deallocSELSwizzled, (IMP)SwizzledDeallocIMP);

        CFSetAddValue(gSwizzledClasses, (__bridge const void *)c);
    }

The implementation of Swizzle is simple: it just uses class_addMethod to register the previous IMP under a new selector, and uses class_replaceMethod to place the new IMP under the original selector:

    static void Swizzle(Class c, SEL orig, SEL new, IMP newIMP) {
        Method m = class_getInstanceMethod(c, orig);
        IMP origIMP = method_getImplementation(m);
        class_addMethod(c, new, origIMP, method_getTypeEncoding(m));
        class_replaceMethod(c, orig, newIMP, method_getTypeEncoding(m));
    }

That's just about everything. All that remains are the implementations of SwizzledReleaseIMP and SwizzledDeallocIMP. However, these turn out to be surprisingly difficult and complex.

The Trouble With Subclasses

When possible, the best way to implement swizzling is to store the original IMP in a global variable and then call it as a function pointer:

    // original implementation
    void (*origIMP)(id, SEL);

    // swizzle code
    origIMP = (void *)method_getImplementation(origMethod);
    class_replaceMethod(class, selector, newIMP, method_getTypeEncoding(origMethod));

    // swizzled IMP
    void newIMP(id self, SEL _cmd)
    {
        // do stuff here
        ...

        // call the original
        origIMP(self, _cmd);
    }

However, this only works if you're swizzling a single class. When swizzling multiple classes, there are multiple original implementations to keep track of. In that case, the best approach is to register the original implementation under a new selector on the same class, then look it up through that selector:

    // swizzle code
    IMP origIMP = method_getImplementation(origMethod);
    class_addMethod(class, @selector(swizzled_method), origIMP, method_getTypeEncoding(origMethod));
    class_replaceMethod(class, selector, newIMP, method_getTypeEncoding(origMethod));

    // swizzled IMP
    void newIMP(id self, SEL _cmd)
    {
        // do stuff here
        ...

        // look up the original
        Class class = object_getClass(self);
        void (*origIMP)(id, SEL) = (void *)class_getMethodImplementation(class, @selector(swizzled_method));

        // call the original
        origIMP(self, _cmd);
    }

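However, this approach only works if you only swizzle leaf classes. If you end up swizzling two classes where one is a superclass of the other, the result is infinite recursion and a crash.

To understand why, imagine two classes, A and B, both of which override dealloc:

    @interface A : NSObject
    - (void)dealloc; // clean up stuff
    @end

    @interface B : A
    - (void)dealloc; // clean up in aisle three
    @end

Assume that we've swizzled dealloc on both A and B (perhaps because some code created __weak references to instances of both). Now some code releases an instance of B.

Because of the swizzling, the call to dealloc ends up in the swizzled implementation. So far, so good.

The swizzled implementation looks up -[B swizzled_dealloc] and calls it. This in turn runs the original implementation of -[B dealloc]. Again, so far, so good. At the end of that original implementation, the method calls [super dealloc], which ends up calling -[A dealloc]. Still good.

-[A dealloc] is also the swizzled implementation, but that's fine. The implementation needs to be reentrant, but that is unavoidable if we want to intercept calls on instances of both A and B. The swizzled implementation does its work a second time, which the code is written to tolerate. It then calls through to the original implementation. And this is where the trouble begins.

Take another look at the code that looks up the original implementation:

    Class class = object_getClass(self);
    void (*origIMP)(id, SEL) = (void *)class_getMethodImplementation(class, @selector(swizzled_method));

Even when called from A, object_getClass still returns B. The runtime has no notion of "the class this was called from". The object's class is B, so it gets B's implementation. Thus, when the swizzled implementation running as -[A dealloc] looks up and calls the original implementation, it ends up calling -[B swizzled_dealloc] again! If running that method a second time doesn't crash, it calls back to -[A dealloc], which again calls -[B swizzled_dealloc], and so on, until some piece of code objects to the abuse or the infinite recursion runs out of stack space and crashes.

To fix this, the swizzled implementation needs some way to know which class it is actually attached to. It would be easy to use imp_implementationWithBlock to create a slightly different swizzled implementation for each class. However, imp_implementationWithBlock is only available on iOS 4.3 and later. Requiring it would mean dropping support for iOS 4.0 through 4.2, which would make PLWeakCompatibility considerably less useful. We need to come up with some other way to solve the problem.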

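Faking Super

A hypothetical super call would walk up the class hierarchy, fetching the original IMP from the next class up each time and then calling it. We can simulate this by recording the most recent class that was used. For each method, we keep a table that maps an object to the class most recently used when calling the original IMP. By moving up the hierarchy from that class each time, we get the necessary behavior.

In the example above, the first call to dealloc finds no entry in the table, so it calls B's dealloc and records B in the table. The next call finds B in the table, so it calls A's dealloc and records A. The call after that finds A in the table and calls NSObject's dealloc. This is exactly what we want.

One additional trick is needed. Imagine another class, C, in the hierarchy:

    @interface C : B
    // does not override dealloc
    @end

This causes a problem. The first call finds no table entry, so it calls C's dealloc and puts C in the table. However, since C doesn't override dealloc, this actually calls B's dealloc. The next call finds C in the table, and so it once again calls... B's dealloc. That's clearly wrong.

The trick is to walk up the class hierarchy and find the topmost class that shares the given method implementation. The TopClassImplementingMethod function searches up the class hierarchy starting from the given class, looks for the point where the IMP for the given selector changes, and returns the last class before that point:

    static Class TopClassImplementingMethod(Class start, SEL sel) {
        IMP imp = class_getMethodImplementation(start, sel);

        Class previous = start;
        Class cursor = class_getSuperclass(previous);
        while (cursor != Nil) {
            if (imp != class_getMethodImplementation(cursor, sel))
                break;
            previous = cursor;
            cursor = class_getSuperclass(cursor);
        }

        return previous;
    }

Calling this function before adding an entry to the table solves the problem. On the first call, there is no table entry, so the code calls C's dealloc (which is really B's dealloc), but then records B, not C, in the table. Subsequent calls go to A and finally NSObject, just as we want.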
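The table-driven walk can be simulated without the Objective-C runtime. The following C sketch is only a toy model under stated assumptions: struct klass, lookup, top_class_implementing and next_target are hypothetical names, own_impl being NULL models a class that does not override the method, and Root plays the role of NSObject.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*impl_fn)(void);

/* Toy stand-in for an Objective-C class. */
struct klass {
    const char *name;
    const struct klass *super;
    impl_fn own_impl;  /* NULL when the method is inherited, not overridden */
};

static int calls[3];
static void root_impl(void) { calls[0]++; }
static void a_impl(void)    { calls[1]++; }
static void b_impl(void)    { calls[2]++; }

static const struct klass Root = { "Root", NULL,  root_impl };
static const struct klass A    = { "A",    &Root, a_impl };
static const struct klass B    = { "B",    &A,    b_impl };
static const struct klass C    = { "C",    &B,    NULL };  /* inherits B's method */

/* Mimics class_getMethodImplementation: nearest implementation at or above c. */
static impl_fn lookup(const struct klass *c) {
    for (; c != NULL; c = c->super)
        if (c->own_impl != NULL)
            return c->own_impl;
    return NULL;
}

/* Same walk as TopClassImplementingMethod: climb while the IMP is unchanged. */
static const struct klass *top_class_implementing(const struct klass *start) {
    assert(start != NULL);
    impl_fn imp = lookup(start);
    const struct klass *previous = start;
    for (const struct klass *cursor = start->super; cursor != NULL; cursor = cursor->super) {
        if (lookup(cursor) != imp)
            break;
        previous = cursor;
    }
    return previous;
}

/* One step of the simulated super: last_sent is the table entry (NULL if absent).
   Returns the class to record and call next, or NULL past the root. */
static const struct klass *next_target(const struct klass *object_class,
                                       const struct klass *last_sent) {
    const struct klass *start = (last_sent == NULL) ? object_class : last_sent->super;
    return (start == NULL) ? NULL : top_class_implementing(start);
}
```

For an instance of C, successive calls to next_target yield B, then A, then Root, which matches the sequence the text describes for C, B, A and NSObject.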

These tables could conflict if the same method is called on several threads at once. Storing the tables in thread-local storage eliminates that problem.

The swizzled release starts by fetching the thread-local struct, since its contents are needed throughout the method:

    static void SwizzledReleaseIMP(PLObjectPtr self, SEL _cmd) {
        struct TLS *tls = GetTLS();

Next, we add self to the list of objects that are currently being released:

        pthread_mutex_lock(&gWeakMutex); {
            // Add this object to the list of releasing objects.
            CFBagAddValue(gReleasingObjects, self);
        } pthread_mutex_unlock(&gWeakMutex);

Next, we begin the fake super strategy. The first step is to see what the current table entry is:

        Class lastSent = (__bridge Class)CFDictionaryGetValue(tls->lastReleaseClassTable, self);

Then we choose the target class. If there is no table entry, the target is the class of self. If the table contains a class, we start from that class's superclass:

        Class targetClass = lastSent == Nil ? object_getClass(self) : class_getSuperclass(lastSent);

Then we use TopClassImplementingMethod to skip over classes that don't override release, and store the result back in the table:

        targetClass = TopClassImplementingMethod(targetClass, releaseSELSwizzled);
        CFDictionarySetValue(tls->lastReleaseClassTable, self, (__bridge void *)targetClass);

With the target class in hand, the code can fetch the original release IMP on that class and call it:

        void (*origIMP)(PLObjectPtr, SEL) = (__typeof__(origIMP))class_getMethodImplementation(targetClass, releaseSELSwizzled);
        origIMP(self, _cmd);

At this point, the original code has finished running. If this call released the last reference to self, the object has now been destroyed. The first thing to do is clean out the class table, so that the next call to release at this address (whether it's the same object, if this release didn't destroy it, or a new object allocated in the same place) starts out fresh:

        CFDictionaryRemoveValue(tls->lastReleaseClassTable, self);

Finally, we re-acquire the mutex so that we can remove self from the list of releasing objects, and call pthread_cond_broadcast to wake up any threads that may be waiting on this object:

        pthread_mutex_lock(&gWeakMutex); {
            // We're no longer releasing.
            CFBagRemoveValue(gReleasingObjects, self);
            pthread_cond_broadcast(&gReleasingObjectsCond);
        } pthread_mutex_unlock(&gWeakMutex);
    }

The swizzled dealloc implementation is largely similar. Like the swizzled release, it starts by fetching the thread-local storage struct:

    static void SwizzledDeallocIMP(PLObjectPtr self, SEL _cmd) {
        struct TLS *tls = GetTLS();

Next, it acquires the global lock and clears all weak references to self by fetching the set of addresses from the global map and iterating over it:

        pthread_mutex_lock(&gWeakMutex); {
            // Clear all weak references and delete the addresses set.
            CFSetRef addresses = CFDictionaryGetValue(gObjectToAddressesMap, self);
            if (addresses != NULL)
                CFSetApplyFunction(addresses, ClearAddress, NULL);

Note that ClearAddress is a simple function that essentially performs *(void **)value = NULL to zero out each entry in the set. Now that every address in the set has been zeroed, the set is no longer needed, so we remove it from the global map:

            CFDictionaryRemoveValue(gObjectToAddressesMap, self);

Finally, we notify anybody waiting on the list of releasing objects that it has changed. Technically, the list itself hasn't changed. However, anybody waiting on self will now find that their weak reference contains nil, and nil is not in the bag, so we still need to tell the waiters to check again:

            pthread_cond_broadcast(&gReleasingObjectsCond);
        } pthread_mutex_unlock(&gWeakMutex);

From there, dealloc calls through to the original implementation using the same procedure as release:

        Class lastSent = (__bridge Class)CFDictionaryGetValue(tls->lastDeallocClassTable, self);
        Class targetClass = lastSent == Nil ? object_getClass(self) : class_getSuperclass(lastSent);
        targetClass = TopClassImplementingMethod(targetClass, deallocSELSwizzled);
        CFDictionarySetValue(tls->lastDeallocClassTable, self, (__bridge void *)targetClass);

        // Call through to the original implementation.
        void (*origIMP)(PLObjectPtr, SEL) = (__typeof__(origIMP))class_getMethodImplementation(targetClass, deallocSELSwizzled);
        origIMP(self, _cmd);

At this point, self has been destroyed. All that remains is to clean up the corresponding entry in the class table, leaving it pristine for the next object that occupies this address:

        CFDictionaryRemoveValue(tls->lastDeallocClassTable, self);
    }
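The bookkeeping that lets the swizzled dealloc zero every weak reference can be illustrated in plain C. This is a deliberately simplified sketch and not the library's data structure: a small fixed-size array (weak_table) replaces the CFMutableDictionary of CFMutableSet values, there is no locking, and register_weak also stores the pointer into the location, which PLRegisterWeak itself does not do. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WEAK_SLOTS 16

/* One registration: *location weakly refers to object. */
struct weak_entry { void *object; void **location; };
static struct weak_entry weak_table[MAX_WEAK_SLOTS];

/* Registers location as a weak reference to object and stores the pointer.
   Returns 0 on success, -1 if the table is full. */
static int register_weak(void **location, void *object) {
    assert(location != NULL);
    for (size_t i = 0; i < MAX_WEAK_SLOTS; i++) {
        if (weak_table[i].location == NULL) {
            weak_table[i].object = object;
            weak_table[i].location = location;
            *location = object;
            return 0;
        }
    }
    return -1;
}

/* Forgets a registration without touching the stored pointer. */
static void unregister_weak(void **location) {
    for (size_t i = 0; i < MAX_WEAK_SLOTS; i++) {
        if (weak_table[i].location == location) {
            weak_table[i].object = NULL;
            weak_table[i].location = NULL;
        }
    }
}

/* What the swizzled dealloc does conceptually: zero every location still
   registered for object, then drop the registrations. Returns the count. */
static int clear_weak_references(void *object) {
    int cleared = 0;
    for (size_t i = 0; i < MAX_WEAK_SLOTS; i++) {
        if (weak_table[i].location != NULL && weak_table[i].object == object) {
            *weak_table[i].location = NULL;  /* same effect as ClearAddress */
            weak_table[i].object = NULL;
            weak_table[i].location = NULL;
            cleared++;
        }
    }
    return cleared;
}
```

A location that was unregistered before the object went away keeps whatever pointer it held, which is why the compiler-facing functions must unregister a location only when they are about to overwrite or discard it.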

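Conclusion

This is a tricky problem, but with careful thought and programming we were able to make it all work. Swizzling release and dealloc makes it possible to safely zero out weak references to the target object. A table that tracks every object currently in the middle of a release ensures that nobody can obtain a reference to an object that is about to be destroyed. By tracking, in an external table, which class each swizzled method was last sent to, we can swizzle safely even when the swizzled methods are called recursively.

That's it for today. Come back next time for more wacky fun from the world of Cocoa programming. Friday Q&A is driven by reader suggestions, so in the meantime, if you have a topic you'd like to see covered here, send it in!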

#Original (English)

Source: https://www.mikeash.com/pyblog/friday-qa-2012-06-01-a-tour-of-plweakcompatibility-part-ii.html

Last time, I discussed the basics of PLWeakCompatibility in terms of the motivation, the basic hooks used to get the compiler to call our code when handling __weak variables, and calling through to the original implementations where available. Today, I’m going to discuss the implementation of the zeroing weak reference facility that gets used when the runtime doesn’t supply its own __weak support.

Recall

The functions implemented by PLWeakCompatibility which are called directly by the compiler’s generated code are:

    PLObjectPtr objc_loadWeakRetained(PLObjectPtr *location);
    PLObjectPtr objc_initWeak(PLObjectPtr *addr, PLObjectPtr val);
    void objc_destroyWeak(PLObjectPtr *addr);
    void objc_copyWeak(PLObjectPtr *to, PLObjectPtr *from);
    void objc_moveWeak(PLObjectPtr *to, PLObjectPtr *from);
    PLObjectPtr objc_loadWeak(PLObjectPtr *location);
    PLObjectPtr objc_storeWeak(PLObjectPtr *location, PLObjectPtr obj);

Where PLObjectPtr is just a typedef for void * and used as a way to prevent ARC from inserting memory management code into these functions where it’s definitely not wanted. All of these functions start with code that calls through to the Objective-C runtime’s implementation where available. When the official runtime functions aren’t available, these seven functions are broken down and implemented in terms of three primitive functions:

    PLObjectPtr PLLoadWeakRetained(PLObjectPtr *location);
    void PLRegisterWeak(PLObjectPtr *location, PLObjectPtr obj);
    void PLUnregisterWeak(PLObjectPtr *location, PLObjectPtr obj);

The PLLoadWeakRetained function loads a weak reference out of the given location and either returns a retained reference to the object it contains, or nil. The PLRegisterWeak function registers a particular memory location as a weak reference to the given object, and PLUnregisterWeak removes that registration. The task that remains is then to implement these three functions.

MAZeroingWeakRef

The goal is to make PLWeakCompatibility completely self-contained with its own zeroing weak reference implementation, which would necessarily be somewhat simple. However, we also wanted to make it use MAZeroingWeakRef where available, as the presence of that library in the app probably means that the programmer likes that implementation, and we might as well take advantage of its presence. Thus, the first task is to detect the presence of MAZeroingWeakRef, and then implement the three primitive functions using it, if it’s present.

First, we need a way to detect whether MAZeroingWeakRef is present, and get a reference to the class if so. This is all wrapped up in a simple function that uses NSClassFromString to attempt to fetch the MAZeroingWeakRef class, within a dispatch_once call to minimize the overhead. It also has an extra flag that allows disabling the MAZeroingWeakRef functionality for testing purposes:

    static Class MAZWR = Nil;
    static bool mazwrEnabled = true;
    static inline bool has_mazwr () {
        if (!mazwrEnabled)
            return false;

        static dispatch_once_t lookup_once = 0;
        dispatch_once(&lookup_once, ^{
            MAZWR = NSClassFromString(@"MAZeroingWeakRef");
        });

        if (MAZWR != nil)
            return true;
        return false;
    }

Now code can simply call has_mazwr, and if it returns true, use MAZWR to get a reference to the class.

The strategy for implementing the three primitives using MAZeroingWeakRef is pretty straightforward. Each primitive gets a location where the weak reference is to be stored, and nothing says that this location is required to directly hold a pointer to the weakly referenced object. Thus, we use the passed-in location to store a pointer to a MAZeroingWeakRef instance which in turn references the original object. PLRegisterWeak will simply create a new instance of MAZeroingWeakRef and place it in the given location. PLUnregisterWeak will simply release that instance. And PLLoadWeakRetained can just call through to the object’s -target method.

Each primitive function checks has_mazwr at the top to decide what to do. The beginning of each primitive function contains the MAZeroingWeakRef calls, and they are:

    static PLObjectPtr PLLoadWeakRetained(PLObjectPtr *location) {
        if (has_mazwr()) {
            MAZeroingWeakRef *mazrw = (__bridge MAZeroingWeakRef *) *location;
            return objc_retain([mazrw target]);
        }
        ...

    static void PLRegisterWeak(PLObjectPtr *location, PLObjectPtr obj) {
        if (has_mazwr()) {
            MAZeroingWeakRef *ref = [[MAZWR alloc] initWithTarget: obj];
            *location = (__bridge_retained PLObjectPtr) ref;
            return;
        }
        ...

    static void PLUnregisterWeak(PLObjectPtr *location, PLObjectPtr obj) {
        if (has_mazwr()) {
            if (*location != nil)
                objc_release(*location);
            return;
        }
        ...

The Built-In Implementation

Now for the real meat: the built-in zeroing weak reference implementation. The strategy is to swizzle out release and dealloc on target objects. The swizzled release method will add the object to a list of releasing objects, blocking anyone attempting to resolve a weak reference while the release is occurring, ensuring no weak reference can be resolved when an object’s final release triggers its dealloc. The swizzled dealloc method will then clear out all weak references to the object.

The first thing we need is a mutex to protect all of the shared data structures:

    static pthread_mutex_t gWeakMutex;

We also need a way to track all of the weak references currently registered for any given object. This comes in the form of a CFMutableDictionary mapping objects to CFMutableSet instances containing the registered weak addresses:

    static CFMutableDictionaryRef gObjectToAddressesMap;

We’re likely to see weak references to multiple instances of the same class, so we need to track which classes have been swizzled and avoid swizzling them twice. This is done by tracking the swizzled classes in a set:

    static CFMutableSetRef gSwizzledClasses;

The objects that are currently in the middle of a release call are stored in a bag:

    static CFMutableBagRef gReleasingObjects;

Since some code needs to wait around for gReleasingObjects to change, that means we also need a condition variable for it to wait on. (If you’re unfamiliar with condition variables, I’ll discuss that more later.)

    static pthread_cond_t gReleasingObjectsCond;

Some data also needs to be stored in thread-local storage. Two tables to assist with the swizzling process reside there. They get stored in a struct accessed through a pthread thread-local storage key:

    static pthread_key_t gTLSKey;

For convenience, we also put all of the selectors necessary for swizzling into global variables:

    static SEL releaseSEL;
    static SEL releaseSELSwizzled;
    static SEL deallocSEL;
    static SEL deallocSELSwizzled;

Every primitive function needs to access these variables, and they all need to be initialized before use. This initialization is all wrapped in a dispatch_once:

    static void WeakInit(void) {
        static dispatch_once_t pred;
        dispatch_once(&pred, ^{

The first thing it does is initialize the mutex, using the recursive attribute:

            pthread_mutexattr_t attr;
            pthread_mutexattr_init(&attr);
            pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);

            pthread_mutex_init(&gWeakMutex, &attr);

            pthread_mutexattr_destroy(&attr);

Next, the map of objects to addresses and the set of swizzled classes are created:

            gObjectToAddressesMap = CFDictionaryCreateMutable(NULL, 0, NULL, &kCFTypeDictionaryValueCallBacks);

            gSwizzledClasses = CFSetCreateMutable(NULL, 0, NULL);

By passing NULL for the dictionary key callbacks and the set callbacks, we ensure that CoreFoundation doesn’t try to do any sort of memory management on them.

The releasing objects bag is initialized, as well as the condition variable used to wait on it:

            gReleasingObjects = CFBagCreateMutable(NULL, 0, NULL);
            pthread_cond_init(&gReleasingObjectsCond, NULL);

Next, the pthread thread-local storage key is created. There’s a bit of paranoia here with the error checking because this function can realistically fail. The number of thread-local storage keys that can be created is limited and relatively small: 512 keys on Mac OS X currently. This should be more than sufficient, but since it can realistically fail, I want it to fail early and obviously:

            int err = pthread_key_create(&gTLSKey, DestroyTLS);
            if (err != 0) {
                NSLog(@"Error calling pthread_key_create, we really can't recover from that: %s (%d)", strerror(err), err);
                abort();
            }

The DestroyTLS function frees the memory allocated for the thread-local storage. I’ll show its implementation momentarily.

Finally, the selectors are initialized. We can’t use the standard @selector construct for release and dealloc, because this is not allowed with ARC. Instead, we use sel_getUid (the Objective-C runtime equivalent of NSSelectorFromString) to fetch the selectors under ARC’s nose:

            releaseSEL = sel_getUid("release");
            releaseSELSwizzled = sel_getUid("release_PLWeakCompatibility_swizzled");
            deallocSEL = sel_getUid("dealloc");
            deallocSELSwizzled = sel_getUid("dealloc_PLWeakCompatibility_swizzled");
        });
    }

Each primitive function then makes a call to WeakInit before it does anything else, ensuring that all of these variables are set up. I’ll omit that call, as well as the MAZeroingWeakRef code, when discussing the implementation of those functions, just to keep things simple.

Thread-Local Storage

A pthread thread-local storage key can be used to set and retrieve a single pointer per thread. To store multiple values, we allocate a struct which contains everything we want to store. We need two dictionaries which help the swizzled methods call through to their original values. Here is the struct:

    struct TLS {
        // Tables tracking the last class a swizzled method was sent to on an object
        CFMutableDictionaryRef lastReleaseClassTable;
        CFMutableDictionaryRef lastDeallocClassTable;
    };

Since this struct is used in a couple of places, we want a wrapper function that will retrieve it, creating it on demand if nothing else has used the TLS struct on that thread yet. pthread_getspecific retrieves the current value, and if the current value is NULL, this function uses pthread_setspecific to store a newly allocated struct under the key:

    static struct TLS *GetTLS(void) {
        struct TLS *tls = pthread_getspecific(gTLSKey);
        if (tls == NULL) {
            tls = calloc(1, sizeof(*tls));
            tls->lastReleaseClassTable = CFDictionaryCreateMutable(NULL, 0, NULL, NULL);
            tls->lastDeallocClassTable = CFDictionaryCreateMutable(NULL, 0, NULL, NULL);
            pthread_setspecific(gTLSKey, tls);
        }
        return tls;
    }

We also need a function to destroy the TLS struct. This function was passed to pthread_key_create, and is automatically called by pthread when a thread is destroyed:

    static void DestroyTLS(void *ptr) {
        struct TLS *tls = ptr;
        if (tls != NULL && tls->lastReleaseClassTable) {
            CFRelease(tls->lastReleaseClassTable);
            CFRelease(tls->lastDeallocClassTable);
        }
        free(tls);
    }

With these in place, any thread can simply call GetTLS() and then manipulate the data in the struct, which is guaranteed to only be visible to the calling thread.

The Primitive Functions

The implementation of PLLoadWeakRetained is relatively straightforward. It needs to acquire the global mutex, then retrieve the object pointer. If the object is currently being released, then it must wait until it’s no longer being released.

First thing it does is acquire the global mutex, then try to fetch the value stored at the given location:

    static PLObjectPtr PLLoadWeakRetained(PLObjectPtr *location) {
        PLObjectPtr obj;
        pthread_mutex_lock(&gWeakMutex); {
            obj = *location;

Next it checks to see if the given object is in the list of releasing objects. If it’s there, it uses pthread_cond_wait to block on the condition variable, then reloads the location to get the latest value:

            while (CFBagContainsValue(gReleasingObjects, obj)) {
                pthread_cond_wait(&gReleasingObjectsCond, &gWeakMutex);
                obj = *location;
            }

pthread_cond_wait releases the given mutex and then waits for somebody to signal the condition variable. Once signalled, it re-acquires the mutex and resumes execution. This allows the thread to block until another thread signals that the table of releasing objects changed, at which point it can re-examine it.

This is a while loop rather than a simple if statement for a couple of reasons. One is simply that the signalled change may not be for this object. There’s a single condition variable for the whole table, and some other object may be the one that got removed.

The other reason is a bit more interesting and is called spurious wakeup. In short, due to various implementation details, pthread_cond_wait may occasionally return even when nothing has signalled on the condition variable. Due to this, any call to pthread_cond_wait should always be wrapped in a loop, not a simple if statement.
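Both reasons for the loop can be seen in a small standalone model. This is a sketch in plain C with POSIX threads, not code from PLWeakCompatibility. The gBusy array and the functions SetBusy, WaitUntilIdle and WaiterSeesIdleSlot are hypothetical names invented for the illustration. A single condition variable covers two slots, so the waiter may be woken by a change to the slot it doesn't care about and must recheck its own predicate:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t gMutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t gChanged = PTHREAD_COND_INITIALIZER;
static bool gBusy[2];   /* predicate state, guarded by gMutex */

/* Changes one slot and wakes every waiter, whichever slot it waits on. */
void SetBusy(int slot, bool busy) {
    pthread_mutex_lock(&gMutex);
    gBusy[slot] = busy;
    pthread_cond_broadcast(&gChanged);
    pthread_mutex_unlock(&gMutex);
}

/* Blocks until the slot is idle. The loop rechecks the predicate after
   every wakeup, so unrelated signals and spurious wakeups are harmless. */
void WaitUntilIdle(int slot) {
    pthread_mutex_lock(&gMutex);
    while (gBusy[slot])
        pthread_cond_wait(&gChanged, &gMutex);
    pthread_mutex_unlock(&gMutex);
}

/* Thread body: clears slot 1 first, then slot 0. */
static void *ClearBothSlots(void *unused) {
    (void)unused;
    SetBusy(1, false);  /* may wake the waiter while slot 0 is still busy */
    SetBusy(0, false);
    return NULL;
}

/* Marks both slots busy, waits on slot 0, and reports whether slot 0
   really was idle once the wait returned. */
bool WaiterSeesIdleSlot(void) {
    SetBusy(0, true);
    SetBusy(1, true);
    pthread_t thread;
    pthread_create(&thread, NULL, ClearBothSlots, NULL);
    WaitUntilIdle(0);
    pthread_mutex_lock(&gMutex);
    bool idle = !gBusy[0];
    pthread_mutex_unlock(&gMutex);
    pthread_join(thread, NULL);
    return idle;
}
```

With an if in place of the while, WaitUntilIdle(0) could return right after the first broadcast, while slot 0 is still busy.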

Once the object (or nil) is obtained, we simply retain it, release the mutex, and return the retained object.

            objc_retain(obj);
        }
        pthread_mutex_unlock(&gWeakMutex);

        return obj;
    }

PLRegisterWeak is a little more complex. The first thing it does is fetch the set of registered addresses for the given object so it can add the new entry:

    static void PLRegisterWeak(PLObjectPtr *location, PLObjectPtr obj) {
        pthread_mutex_lock(&gWeakMutex); {
            CFMutableSetRef addresses = (CFMutableSetRef)CFDictionaryGetValue(gObjectToAddressesMap, obj);

That set won’t exist if this is the first weak reference to the given object. In that case, this function has to create it:

            if (addresses == NULL) {
                addresses = CFSetCreateMutable(NULL, 0, NULL);
                CFDictionarySetValue(gObjectToAddressesMap, obj, addresses);
                CFRelease(addresses);
            }

Now that it has the set, the passed-in location is added to it:

            CFSetAddValue(addresses, location);

Finally, it calls a helper function to ensure that all of the appropriate swizzling has been done:

            EnsureDeallocationTrigger(obj);
        } pthread_mutex_unlock(&gWeakMutex);
    }

We’ll get into that helper function’s implementation shortly.

The implementation of PLUnregisterWeak is essentially the inverse of PLRegisterWeak, except that it doesn’t have to worry about swizzling, which is simply left in place, and it doesn’t bother to delete the addresses set when it becomes empty:

    static void PLUnregisterWeak(PLObjectPtr *location, PLObjectPtr obj) {
        pthread_mutex_lock(&gWeakMutex); {
            // Remove the location from the set of weakly referenced addresses.
            CFMutableSetRef addresses = (CFMutableSetRef)CFDictionaryGetValue(gObjectToAddressesMap, *location);
            if (addresses != NULL)
                CFSetRemoveValue(addresses, location);
        } pthread_mutex_unlock(&gWeakMutex);
    }

Let’s look at EnsureDeallocationTrigger’s implementation now. The first thing it does is fetch the class of the given object, and bail out if that class has already been swizzled:

    static void EnsureDeallocationTrigger(PLObjectPtr obj) {
        Class c = object_getClass(obj);
        if (CFSetContainsValue(gSwizzledClasses, (__bridge const void *)c))
            return;

If it hasn’t, it then swizzles release and dealloc, using a small helper function, and finally adds the class to the set of swizzled classes:

        Swizzle(c, releaseSEL, releaseSELSwizzled, (IMP)SwizzledReleaseIMP);
        Swizzle(c, deallocSEL, deallocSELSwizzled, (IMP)SwizzledDeallocIMP);

        CFSetAddValue(gSwizzledClasses, (__bridge const void *)c);
    }

The implementation of Swizzle is simple: it just uses class_addMethod to register the previous IMP under a new selector, and class_replaceMethod to place the new IMP into the original selector:

    static void Swizzle(Class c, SEL orig, SEL new, IMP newIMP) {
        Method m = class_getInstanceMethod(c, orig);
        IMP origIMP = method_getImplementation(m);
        class_addMethod(c, new, origIMP, method_getTypeEncoding(m));
        class_replaceMethod(c, orig, newIMP, method_getTypeEncoding(m));
    }

And that’s just about it. All that remains are the implementations of SwizzledReleaseIMP and SwizzledDeallocIMP. Which, as it turns out, are pretty difficult and complex.

The Subclass Problem

When possible, the best way to implement method swizzling is to store the original IMP in a global variable, which you then call as a function pointer:

    // original implementation
    void (*origIMP)(id, SEL);

    // swizzle code
    origIMP = (void *)method_getImplementation(origMethod);
    class_replaceMethod(class, selector, newIMP, method_getTypeEncoding(origMethod));

    // swizzled IMP
    void newIMP(id self, SEL _cmd)
    {
        // do stuff here
        ...

        // call the original
        origIMP(self, _cmd);
    }

However, this only works when you’re only swizzling a single class. When swizzling multiple classes, there are multiple original implementations to keep track of. In that case, the best way to do things is to register the original implementation under a new selector on the same class, and look it up that way:

    // swizzle code
    IMP origIMP = method_getImplementation(origMethod);
    class_addMethod(class, @selector(swizzled_method), origIMP, method_getTypeEncoding(origMethod));
    class_replaceMethod(class, selector, newIMP, method_getTypeEncoding(origMethod));

    // swizzled IMP
    void newIMP(id self, SEL _cmd)
    {
        // do stuff here
        ...

        // look up the original
        Class class = object_getClass(self);
        void (*origIMP)(id, SEL) = (void *)class_getMethodImplementation(class, @selector(swizzled_method));

        // call the original
        origIMP(self, _cmd);
    }

However, this only works if you only swizzle leaf classes. If you ever end up swizzling two classes where one is a superclass of the other, this ends up with infinite recursion and crashes.

To understand why, let’s consider two classes, A and B, both of which override dealloc:

    @interface A : NSObject
    - (void)dealloc; // clean up stuff
    @end

    @interface B : A
    - (void)dealloc; // clean up in aisle three
    @end

Let’s assume that we’ve swizzled dealloc on both A and B (presumably because something created a __weak reference to an instance of A and an instance of B). Now something releases an instance of B.

Because of the swizzling, the call to dealloc ends up invoking the swizzled implementation. So far so good.

The swizzled implementation looks up -[B swizzled_dealloc] and calls it. This calls the original implementation of -[B dealloc]. Again, so far so good. At the end of this original implementation, the method will call [super dealloc], which ends up getting -[A dealloc]. Still good up to here.

-[A dealloc] is also the swizzled implementation, but that’s fine. That implementation needs to be reentrant, but it has to be this way if we want to intercept calls to instances of both A and B. The swizzled implementation does its thing again, but is written to tolerate this. Then it calls through to the original implementation. And here is where things go wrong.

Check out the code that looks up the original implementation:

    Class class = object_getClass(self);
    void (*origIMP)(id, SEL) = (void *)class_getMethodImplementation(class, @selector(swizzled_method));

Even though this is called from A, object_getClass still returns B. At runtime, there’s no concept of “called from A.” The object’s class is B, and that’s what it gets. So at the end of -[A dealloc], where the swizzled implementation looks up and calls the original, it ends up calling -[B swizzled_dealloc] again! If that method somehow runs a second time without crashing, it will call back to -[A dealloc], which calls back to -[B swizzled_dealloc], and this continues until either some piece of code gets fed up with this abuse, or the infinite recursion runs out of stack space and crashes.

In order to solve this, the swizzled implementation needs some way to know which class it’s attached to. This is trivial using imp_implementationWithBlock to create a slightly different swizzled implementation for each class. Unfortunately, imp_implementationWithBlock is only available on iOS 4.3 and later. Requiring it for PLWeakCompatibility would mean eliminating support for iOS 4.0-4.2, making it much less useful. We need to come up with some other way to handle this.

Emulating super

The hypothetical call to super would walk up the class hierarchy, retrieving the original IMP from the next highest class each time, and calling that. By tracking the last class that we retrieved, we can emulate this. For each method, we set up a table that maps objects to the last class used for the call to the original IMP. By moving up the hierarchy from that class on each call, we can achieve the necessary behavior.

In the example above, the first call to ‘dealloc’ sees no table entry, so it calls B’s dealloc and puts B in the table. The next call sees B, calls A’s dealloc, and puts A in the table. The next call sees A and calls NSObject’s dealloc. This is all exactly as we want.

There’s one extra trick needed here. Imagine yet another class in the hierarchy, C:

    @interface C : B
    // does not override dealloc
    @end

This runs into a problem. The first call sees no table entry, calls C’s dealloc, and puts C in the table. However, since C doesn’t override dealloc, it’s actually just B’s dealloc. The next call sees the C in the table and calls… B’s dealloc again. This is not good.

The trick is to search the class hierarchy for the topmost class with the given method implementation. The TopClassImplementingMethod function searches up the class hierarchy from the given class, looking for the point where the IMP for a given selector changes, and then returns the last class from before that point:

    static Class TopClassImplementingMethod(Class start, SEL sel) {
        IMP imp = class_getMethodImplementation(start, sel);

        Class previous = start;
        Class cursor = class_getSuperclass(previous);
        while (cursor != Nil) {
            if (imp != class_getMethodImplementation(cursor, sel))
                break;
            previous = cursor;
            cursor = class_getSuperclass(cursor);
        }

        return previous;
    }

Calling this function before putting an entry in the table solves the problem. The first call will see no table entry, call C’s dealloc (which is really B’s dealloc), but then place B in the table instead of C. The next call goes to A, then NSObject as we need it to.
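The walk through C, B, A and the root class can be checked with a toy model. What follows is a sketch in plain C that imitates the hierarchy with structs and function pointers. It does not use the Objective-C runtime, and ModelClass, ResolveDealloc, TopClassImplementing and NextTarget are hypothetical names made up for this illustration. ResolveDealloc plays the role of class_getMethodImplementation, and NextTarget combines the table lookup rule with the topmost-class search:

```c
#include <assert.h>
#include <stddef.h>

typedef void (*Impl)(void);

/* Toy stand-in for a class: a superclass link and an optional override. */
struct ModelClass {
    const char *name;
    const struct ModelClass *superclass;
    Impl ownDealloc;   /* NULL when the class does not override dealloc */
};

/* Distinct bodies so that each implementation has its own address. */
static int gCalls[3];
static void RootDealloc(void) { gCalls[0]++; }
static void ADealloc(void)    { gCalls[1]++; }
static void BDealloc(void)    { gCalls[2]++; }

const struct ModelClass ClassRoot = { "Root", NULL, RootDealloc };
const struct ModelClass ClassA = { "A", &ClassRoot, ADealloc };
const struct ModelClass ClassB = { "B", &ClassA, BDealloc };
const struct ModelClass ClassC = { "C", &ClassB, NULL };   /* inherits B's */

/* Models method lookup: the nearest override at or above cls. */
Impl ResolveDealloc(const struct ModelClass *cls) {
    for (; cls != NULL; cls = cls->superclass)
        if (cls->ownDealloc != NULL)
            return cls->ownDealloc;
    return NULL;
}

/* Walks up from start while the resolved implementation stays the same,
   returning the last class before it changes. */
const struct ModelClass *TopClassImplementing(const struct ModelClass *start) {
    Impl impl = ResolveDealloc(start);
    const struct ModelClass *previous = start;
    const struct ModelClass *cursor = start->superclass;
    while (cursor != NULL && ResolveDealloc(cursor) == impl) {
        previous = cursor;
        cursor = cursor->superclass;
    }
    return previous;
}

/* Emulated super: lastSent is the table entry (NULL on the first call).
   Returns the class to record next, or NULL above the root. */
const struct ModelClass *NextTarget(const struct ModelClass *objectClass,
                                    const struct ModelClass *lastSent) {
    const struct ModelClass *start =
        lastSent == NULL ? objectClass : lastSent->superclass;
    return start == NULL ? NULL : TopClassImplementing(start);
}
```

For an instance of ClassC, successive calls to NextTarget yield ClassB, then ClassA, then ClassRoot, matching the sequence described in the text. Without the topmost-class search the first entry would be ClassC, and the second call would resolve to B's implementation again.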

These tables would run into conflicts if the same method were invoked multiple times on different threads. However, by placing these tables in thread-local storage, that problem is eliminated.

Release

The first thing the release swizzle does is fetch the thread-local struct, since it’s going to use the contents throughout the method:

    static void SwizzledReleaseIMP(PLObjectPtr self, SEL _cmd) {
        struct TLS *tls = GetTLS();

Next, we add self to the list of objects being released:

        pthread_mutex_lock(&gWeakMutex); {
            // Add this object to the list of releasing objects.
            CFBagAddValue(gReleasingObjects, self);
        } pthread_mutex_unlock(&gWeakMutex);

After that, we start on the fake super strategy. The first thing to do is see what the current table entry is:

        Class lastSent = (__bridge Class)CFDictionaryGetValue(tls->lastReleaseClassTable, self);

Then we pick a target class. If the table has no entry for self, that’s just the class of self. If the table contains a class, then we start with that class’s superclass:

        Class targetClass = lastSent == Nil ? object_getClass(self) : class_getSuperclass(lastSent);

Then we use TopClassImplementingMethod to skip over classes that don’t override release, and store the result back into the table:

        targetClass = TopClassImplementingMethod(targetClass, releaseSELSwizzled);
        CFDictionarySetValue(tls->lastReleaseClassTable, self, (__bridge void *)targetClass);

With the target class in hand, the code can fetch the IMP for release on that target class and call it:

        void (*origIMP)(PLObjectPtr, SEL) = (__typeof__(origIMP))class_getMethodImplementation(targetClass, releaseSELSwizzled);
        origIMP(self, _cmd);

At this point, the superclass’s code has completed. If this call released the last reference to self, the object is now destroyed. The first thing to do is to clean up the class table so that the next call to release at this address (either the same object, if this release didn’t destroy it, or a new object allocated at the same location) starts from a clean slate:

        CFDictionaryRemoveValue(tls->lastReleaseClassTable, self);

Finally, we reacquire the mutex to remove self from the list of releasing objects, calling pthread_cond_broadcast to wake up any threads that might be waiting on this object:

        pthread_mutex_lock(&gWeakMutex); {
            // We're no longer releasing.
            CFBagRemoveValue(gReleasingObjects, self);
            pthread_cond_broadcast(&gReleasingObjectsCond);
        } pthread_mutex_unlock(&gWeakMutex);
    }

Dealloc

The swizzled dealloc implementation is largely similar. Like the swizzled release, it starts off by fetching the thread-local storage struct:

    static void SwizzledDeallocIMP(PLObjectPtr self, SEL _cmd) {
        struct TLS *tls = GetTLS();

Next, it grabs the global lock and clears all weak references to self by fetching the addresses from the global map and iterating over them:

        pthread_mutex_lock(&gWeakMutex); {
            // Clear all weak references and delete the addresses set.
            CFSetRef addresses = CFDictionaryGetValue(gObjectToAddressesMap, self);
            if (addresses != NULL)
                CFSetApplyFunction(addresses, ClearAddress, NULL);

Note that ClearAddress is just a simple function that essentially does *(void **)value = NULL to zero out every entry in the set. Now that every address in the set has been cleared, the set is no longer needed, so we remove it from the global map:

            CFDictionaryRemoveValue(gObjectToAddressesMap, self);

Finally, we notify anybody waiting on the list of releasing objects that it has changed. Technically, the list itself has not changed. However, anybody waiting on self will now find their weak reference contains nil, which isn’t in the set, so we still want to notify listeners to recheck:

            pthread_cond_broadcast(&gReleasingObjectsCond);
        } pthread_mutex_unlock(&gWeakMutex);

With that out of the way, dealloc uses the same procedure as release to call through to the original implementation:

        Class lastSent = (__bridge Class)CFDictionaryGetValue(tls->lastDeallocClassTable, self);
        Class targetClass = lastSent == Nil ? object_getClass(self) : class_getSuperclass(lastSent);
        targetClass = TopClassImplementingMethod(targetClass, deallocSELSwizzled);
        CFDictionarySetValue(tls->lastDeallocClassTable, self, (__bridge void *)targetClass);

        // Call through to the original implementation.
        void (*origIMP)(PLObjectPtr, SEL) = (__typeof__(origIMP))class_getMethodImplementation(targetClass, deallocSELSwizzled);
        origIMP(self, _cmd);

At this point, self is destroyed. All that remains is to clean up the entry in the class table for it, to leave it pristine for the next object to occupy this address:

        CFDictionaryRemoveValue(tls->lastDeallocClassTable, self);
    }
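The bookkeeping behind registration and zeroing is easier to see with the threading and swizzling stripped away. Here is a rough single-threaded sketch in plain C. It replaces the CFDictionary of CFSets with a small fixed-size table, omits the mutex entirely, and uses the hypothetical names WeakEntry, RegisterWeakModel, UnregisterWeakModel and ZeroWeakReferences, none of which come from PLWeakCompatibility:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WEAK_REFS 16

/* One registration: `location` is a weak variable that refers to `object`. */
struct WeakEntry {
    void *object;
    void **location;   /* NULL marks a free slot */
};

static struct WeakEntry gEntries[MAX_WEAK_REFS];

/* Records that *location weakly refers to object.
   Returns 1 on success, 0 if the table is full. */
int RegisterWeakModel(void **location, void *object) {
    for (size_t i = 0; i < MAX_WEAK_REFS; i++) {
        if (gEntries[i].location == NULL) {
            gEntries[i].object = object;
            gEntries[i].location = location;
            return 1;
        }
    }
    return 0;
}

/* Forgets a registration, leaving the variable's contents untouched. */
void UnregisterWeakModel(void **location) {
    for (size_t i = 0; i < MAX_WEAK_REFS; i++) {
        if (gEntries[i].location == location) {
            gEntries[i].object = NULL;
            gEntries[i].location = NULL;
        }
    }
}

/* Models the dealloc hook: zero every location still registered for
   object, then drop those registrations. */
void ZeroWeakReferences(void *object) {
    for (size_t i = 0; i < MAX_WEAK_REFS; i++) {
        if (gEntries[i].location != NULL && gEntries[i].object == object) {
            *gEntries[i].location = NULL;
            gEntries[i].object = NULL;
            gEntries[i].location = NULL;
        }
    }
}
```

A location that was unregistered before ZeroWeakReferences runs keeps its old value, which is why the real code must unregister a weak variable before its storage goes away or is reassigned.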

Conclusion

This is a tough problem to solve, but with careful thought and programming we’re able to make it all work. Swizzling release and dealloc allows safely zeroing out weak references to a target object. A table that tracks all objects currently in the middle of a release ensures that nobody can ever obtain a reference to an object that’s about to be destroyed. By tracking the class the swizzled method was sent to in an external table, we can safely swizzle these methods even when the swizzled method is called recursively.

That’s it for today. Come back next time for more wacky fun in the world of Cocoa programming. Friday Q&A is driven by reader suggestions, so in the meantime, if you have a topic that you’d like to see covered here, please send it in!