The Inner Life of Zombies

Mike Ash Friday Q&A, translated: The Inner Life of Zombies

Author: TommyWu
Translation · Original: Friday Q&A 2011-05-20: The Inner Life of Zombies · by Mike Ash

Original: https://www.mikeash.com/pyblog/friday-qa-2011-05-20-the-inner-life-of-zombies.html · Published: 2011-05-20 · Author: Mike Ash · Translator: MiMo (mimo-v2.5-pro); code blocks are left in the original English


It's Friday again, that Fridayest of days, which means it's time for another Friday Q&A. Samuel Goodwin suggested discussing how NSZombie works, and that is today's topic.

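Zombie Overview

As you may recall, an Objective-C object is just a block of allocated memory. The first pointer-sized chunk of that block is the isa pointer, which points to the object's class. The rest of the block holds the object's instance variables.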
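The layout just described can be sketched in plain C. This is only a toy model, not the real runtime: `toy_class`, `toy_object`, and the helper functions are names invented for illustration, and a real class structure holds far more than a name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a class: here it only carries a name. */
struct toy_class {
    const char *name;
};

/* Hypothetical stand-in for an object with one instance variable. */
struct toy_object {
    const struct toy_class *isa;  /* first pointer-sized slot: the class */
    uintptr_t second_eight;       /* instance variables follow the isa   */
};

/* Offset of the first instance variable from the start of the object. */
size_t toy_first_ivar_offset(void) {
    return offsetof(struct toy_object, second_eight);
}

/* Find the class name of any toy object by reading only its first slot. */
const char *toy_class_name(const void *object) {
    assert(object != NULL);
    const struct toy_class *const *isa_slot = object;
    return (*isa_slot)->name;
}
```

Because the class pointer always sits at offset zero, code that knows nothing else about an object can still find its class. That is the property the zombie mechanism relies on later.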

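When an object is deallocated, the block of memory that contains it is freed. Normally this just means the block is marked as available for reuse. If you have made a mistake and kept a pointer to the deallocated object, many strange things can happen.

In some cases, code that uses the deallocated object works just fine. If the freed memory has not actually been overwritten yet, it still behaves like a normal Objective-C object.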

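Frequently, the freed memory is reused to hold a new object. The old pointer then ends up pointing to that new object, and messages sent through it go to the new object instead, with confusing results. This is why one of the most common symptoms of a memory management error is a mystery object, such as a random NSString, showing up where you expected something else.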
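The first two outcomes can be reproduced safely with a toy allocator. The sketch below is an assumption-laden model, not how malloc or the Objective-C runtime is implemented: the fixed `pool`, `toy_alloc`, and `toy_free` are invented names, and a static array is used so that reading a "freed" slot stays well defined in C.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SLOTS 4

/* Hypothetical object: class_name plays the role of the isa pointer. */
struct toy_object {
    const char *class_name;
    int in_use;
};

static struct toy_object pool[POOL_SLOTS];

/* Hand out the first free slot, as a very simple allocator might. */
struct toy_object *toy_alloc(const char *class_name) {
    for (size_t i = 0; i < POOL_SLOTS; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = 1;
            pool[i].class_name = class_name;
            return &pool[i];
        }
    }
    return NULL;  /* pool exhausted */
}

/* Freeing only marks the slot reusable; the old contents stay in place. */
void toy_free(struct toy_object *obj) {
    assert(obj != NULL && obj->in_use);
    obj->in_use = 0;
}
```

Right after `toy_free`, a stale pointer still sees the old class name, which is the "works just fine" case. Once `toy_alloc` hands the same slot to a new object, the stale pointer sees that new object's class instead, which is the mystery-object case.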

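Occasionally, the memory is overwritten with something that is not an object, and your program crashes. This is the best of the three outcomes, since it fails sooner and makes the problem more obvious, but it also tends to be rare.

Zombies greatly improve the diagnostics for this common scenario. Instead of leaving the deallocated memory alone, zombies take it over and replace it with an object that traps every attempt to access it. Hence the term "zombie": the dead object is resurrected into a sort of unlife. When a zombie is messaged, it logs an error and crashes, giving you a convenient backtrace that shows exactly where the problem lies.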
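As a rough illustration of the idea, here is a toy C model in which "deallocating" an object keeps its memory and only swaps its class for a zombie class. Everything here is hypothetical (`toy_class`, `toy_dealloc_as_zombie`, `toy_send`, the `is_zombie` flag); the real mechanism lives inside Foundation and crashes rather than returning a status code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical class record: a name plus a flag marking zombie classes. */
struct toy_class {
    char name[64];
    int  is_zombie;
};

/* Hypothetical object: class pointer first, then one instance variable. */
struct toy_object {
    struct toy_class *isa;
    uintptr_t second_eight;
};

/* Instead of freeing obj, point its isa at a zombie class named after
 * the original class. The rest of the object is left untouched. */
void toy_dealloc_as_zombie(struct toy_object *obj, struct toy_class *zombie_class) {
    assert(obj != NULL && zombie_class != NULL && obj->isa != zombie_class);
    snprintf(zombie_class->name, sizeof zombie_class->name,
             "_NSZombie_%s", obj->isa->name);
    zombie_class->is_zombie = 1;
    obj->isa = zombie_class;
}

/* Returns 1 if the message would be delivered, 0 if it was trapped. */
int toy_send(const struct toy_object *obj, const char *selector) {
    assert(obj != NULL && selector != NULL);
    if (obj->isa->is_zombie) {
        fprintf(stderr, "*** %s: message sent to deallocated instance %p\n",
                selector, (void *)obj);
        return 0;
    }
    return 1;
}
```

Since the memory is never handed back to the allocator in this model, it cannot be reused by a new object, so a stale pointer always reaches the zombie. The cost is that the memory is never reclaimed.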

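Using Zombies

Zombies are enabled by setting the NSZombieEnabled environment variable to YES. Run the app in gdb, and it will crash on any attempted access to a dead object. Be careful, though: by default zombies are never deallocated, so your app's memory usage can become extremely high.

Another useful option is the Zombies instrument in Instruments. It enables zombies and also tracks objects' retain counts, so you can go back and review the retain/release activity of any improperly messaged object.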

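Investigating Zombies

Let's look at what these things do behind the scenes. To help with the investigation, I wrote a small function that dumps the contents of an object: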

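#import <Foundation/Foundation.h>
#import <malloc/malloc.h>  // malloc_size
#import <objc/runtime.h>   // runtime calls used in later snippets

// Prints msg, the current malloc_size of obj, and the first size bytes at obj.
void Dump(NSString *msg, id obj, int size)
{
    NSString *s = [NSString stringWithFormat: @"%@ malloc_size %d - %@", msg, (int)malloc_size(obj), [NSData dataWithBytes: obj length: size]];
    printf("%s\n", [s UTF8String]);
}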

Let's create an NSObject and log it before and after it is destroyed:

id obj = [[NSObject alloc] init];
int size = (int)malloc_size(obj);
Dump(@"Fresh NSObject", obj, size);
[obj release];
Dump(@"Destroyed NSObject", obj, size);

Output from a normal run:

Fresh NSObject malloc_size 16 - <68046370 ff7f0000 00000000 00000000>
Destroyed NSObject malloc_size 0 - <68046370 ff7f0000 00000000 00000000>

Let's try another run with zombies enabled:

Fresh NSObject malloc_size 16 - <68046370 ff7f0000 00000000 00000000>
Destroyed NSObject malloc_size 16 - <d0011100 01000000 00000000 00000000>

The second eight bytes are simply unused here. Let's write a quick dummy class that uses them and see how it behaves:

@interface Dummy : NSObject
{
    uintptr_t secondEight;
}
@end

@implementation Dummy
- (id)init
{
    if((self = [super init]))
        secondEight = 0xdeadbeefcafebabeULL;
    return self;
}
@end

obj = [[Dummy alloc] init];
size = (int)malloc_size(obj);
Dump(@"Fresh Dummy", obj, size);
[obj release];
Dump(@"Destroyed Dummy", obj, size);

Output from a normal run:

Fresh Dummy malloc_size 16 - <28110000 01000000 bebafeca efbeadde>
Destroyed Dummy malloc_size 0 - <28110000 01000000 bebafeca efbeadde>
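The second half of that dump, bebafeca efbeadde, is the value 0xdeadbeefcafebabe stored least-significant byte first, which is why it looks reversed. The C sketch below reproduces the formatting under two assumptions of mine: that the machine stores integers little-endian (modeled explicitly with shifts, so the result does not depend on the host), and that the dump groups bytes in fours inside angle brackets as in the output above. The function names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 64-bit value least-significant byte first (little-endian). */
void toy_store_le64(uint8_t out[8], uint64_t value) {
    for (int i = 0; i < 8; i++)
        out[i] = (uint8_t)(value >> (8 * i));
}

/* Format bytes as lowercase hex, a space after every 4 bytes, wrapped
 * in <>. Returns the string length, or 0 if buf is too small. */
size_t toy_hex_dump(char *buf, size_t buflen, const uint8_t *bytes, size_t len) {
    static const char digits[] = "0123456789abcdef";
    size_t pos = 0;

    assert(buf != NULL && bytes != NULL);
    /* 2 chars per byte, at most one space per 4 bytes, "<", ">", NUL. */
    if (buflen < len * 2 + len / 4 + 3)
        return 0;

    buf[pos++] = '<';
    for (size_t i = 0; i < len; i++) {
        if (i > 0 && i % 4 == 0)
            buf[pos++] = ' ';
        buf[pos++] = digits[bytes[i] >> 4];
        buf[pos++] = digits[bytes[i] & 0x0f];
    }
    buf[pos++] = '>';
    buf[pos] = '\0';
    return pos;
}
```

Feeding the eight bytes produced by `toy_store_le64` for 0xdeadbeefcafebabe into `toy_hex_dump` yields `<bebafeca efbeadde>`, matching the instance variable's bytes in the dump.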

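Here is a run with zombies:

Fresh Dummy malloc_size 16 - <28110000 01000000 bebafeca efbeadde>
Destroyed Dummy malloc_size 16 - <e0071100 01000000 bebafeca efbeadde>

The first eight bytes, the isa pointer, differ after destruction, while the instance variable is left alone. This line logs the name of the class the isa pointer now refers to:

NSLog(@"%s", class_getName(object_getClass(obj)));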

Let's see just what this class contains. Here is a function that dumps various information about a class:

void DumpClass(Class c)
{
    printf("Dumping class %s\n", class_getName(c));
    printf("Superclass: %s\n", class_getName(class_getSuperclass(c)));

    // The copied lists are NULL-terminated and must be freed by the caller.
    printf("Ivars:\n");
    Ivar *ivars = class_copyIvarList(c, NULL);
    for(Ivar *cursor = ivars; cursor && *cursor; cursor++)
        printf(" %s %s %d\n", ivar_getName(*cursor), ivar_getTypeEncoding(*cursor), (int)ivar_getOffset(*cursor));
    free(ivars);

    printf("Methods:\n");
    Method *methods = class_copyMethodList(c, NULL);
    for(Method *cursor = methods; cursor && *cursor; cursor++)
        fprintf(stderr, " %s %s\n", sel_getName(method_getName(*cursor)), method_getTypeEncoding(*cursor));
    free(methods);
}

DumpClass(object_getClass(obj));

Output:

Dumping class _NSZombie_Dummy
Superclass: nil
Ivars:
 isa # 0
Methods:

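What happens, then, when we try to message an instance of this empty class? I added [obj self] to the code after destroying the object and ran it in gdb. Here is the result: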

2011-05-19 14:42:39.427 a.out[62888:a0f] *** -[Dummy self]: message sent to deallocated instance 0x1001106b0
Program received signal SIGTRAP, Trace/breakpoint trap.
0x00007fff82a4d6c6 in ___forwarding___ ()
(gdb) bt
#0 0x00007fff82a4d6c6 in ___forwarding___ ()
#1 0x00007fff82a49a68 in __forwarding_prep_0___ ()
#2 0x0000000100001c49 in main (argc=1, argv=0x7fff5fbff690) at zomb.m:62

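The forwarding mechanism raises a SIGTRAP because the class does not implement the minimum methods required for forwarding. But what is logging "message sent to deallocated instance"? Let's put a breakpoint on CFLog and find out: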

Breakpoint 2, 0x00007fff82a98327 in CFLog ()
(gdb) bt
#0 0x00007fff82a98327 in CFLog ()
#1 0x00007fff82a4d6c5 in ___forwarding___ ()
#2 0x00007fff82a49a68 in __forwarding_prep_0___ ()
#3 0x0000000100001c49 in main (argc=1, argv=0x7fff5fbff690) at zomb.m:62
(gdb) cont
Continuing.
2011-05-19 15:45:03.905 a.out[62938:a0f] *** -[Dummy self]: message sent to deallocated instance 0x1001106b0
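The backtrace shows CFLog being called from inside ___forwarding___, and the log line names the class as Dummy even though the object's class is now _NSZombie_Dummy. One way a forwarding path could produce such a message is to recognize the zombie class by its name prefix and strip it. The C sketch below shows that idea only; it is an assumption that fits the observed output, not a description of CoreFoundation's actual code, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define ZOMBIE_PREFIX "_NSZombie_"

/* If class_name looks like a zombie class, return the original class
 * name that follows the prefix; otherwise return NULL. */
const char *toy_zombie_original_name(const char *class_name) {
    size_t prefix_len = strlen(ZOMBIE_PREFIX);
    assert(class_name != NULL);
    if (strncmp(class_name, ZOMBIE_PREFIX, prefix_len) == 0)
        return class_name + prefix_len;
    return NULL;
}

/* Build the diagnostic a forwarding path might log for a zombie.
 * Returns 1 and fills buf if the class is a zombie, 0 otherwise. */
int toy_forwarding_log(char *buf, size_t buflen, const char *class_name,
                       const char *selector, const void *obj) {
    assert(buf != NULL && selector != NULL);
    const char *original = toy_zombie_original_name(class_name);
    if (original == NULL)
        return 0;  /* ordinary class: normal forwarding would continue */
    snprintf(buf, buflen,
             "*** -[%s %s]: message sent to deallocated instance %p",
             original, selector, (void *)obj);
    return 1;
}
```

Called with the class name `_NSZombie_Dummy` and the selector `self`, this builds a line that starts with `*** -[Dummy self]: message sent to deallocated instance`, the same shape as the message in the gdb session. The pointer's exact formatting depends on the C library.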

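Conclusion

Zombies are a really useful tool for debugging memory problems. Under the hood, zombies work by rewriting the object's isa pointer to point to a special zombie class associated with the original class. When a message is sent to an instance of that zombie class, it is trapped by the runtime's message forwarding system, which then logs the event and crashes the app.

That wraps things up for today. Come back in two weeks for the next one, just in time for WWDC. In the meantime, as always, keep sending me your ideas for topics.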


Original (English)

Source: https://www.mikeash.com/pyblog/friday-qa-2011-05-20-the-inner-life-of-zombies.html

It’s Friday again, that Fridayest of days, and this week that means it’s time for another Friday Q&A. Samuel Goodwin suggested discussing how NSZombie works, and that’s the topic I will discuss today.

Zombie Overview

As you may recall, an Objective-C object is just a block of allocated memory. The first pointer-sized chunk of that block is the isa pointer, which points to the object’s class. The rest of the block contains the object’s instance variables.

When an object is deallocated, the block of memory which contains it is freed. Normally this means that it’s simply marked as being available for reuse. If you’ve screwed up and kept a pointer to this deallocated object, many mysterious things can happen.

In some cases, code which tries to use the deallocated object will work just fine. If the deallocated memory hasn’t actually been overwritten yet, it will still behave like a normal Objective-C object.

Frequently, the deallocated memory will be reused to hold a new object. In this case, the old pointer ends up pointing to this new object. Attempts to use the old pointer will send messages to the new object instead, with confusing results. This is why one of the most common symptoms of a memory management error is a mystery object, like a random NSString, showing up where you expected to see something else.

Occasionally, the memory will be overwritten with something that’s not an object, and your code crashes. This is the best outcome of the three, since it fails more quickly and makes it more clear what’s going wrong, but it also tends to be rare.

Zombies greatly improve the diagnostics available for this common scenario. Instead of simply leaving the deallocated memory alone, zombies take it over and replace it with an object which traps all attempts to access it. Thus the term “zombie”: the dead object is resurrected to a sort of unlife. When a zombie object is messaged, it logs an error and crashes, providing a convenient backtrace so you can see exactly where the problem lies.

Using Zombies

Zombies can be enabled by setting the NSZombieEnabled environment variable to YES. Run the app in gdb, and it will crash on any attempted access to a dead object. Be careful, though: by default, zombies are never deallocated, so your app’s memory usage can become extremely high.

Another useful option is the Zombies instrument in Instruments. This enables zombies and also tracks objects’ retain counts so that you can go back and see the retain/release activity for any improperly messaged object.

Investigating Zombies

Let’s take a look at what these things are doing behind the scenes. To help with the investigation, I wrote a small function to dump the contents of an object:

void Dump(NSString *msg, id obj, int size)
{
    NSString *s = [NSString stringWithFormat: @"%@ malloc_size %d - %@", msg, (int)malloc_size(obj), [NSData dataWithBytes: obj length: size]];
    printf("%s\n", [s UTF8String]);
}

Let’s create an NSObject and log it before and after being destroyed:

id obj = [[NSObject alloc] init];
int size = malloc_size(obj);
Dump(@"Fresh NSObject", obj, size);
[obj release];
Dump(@"Destroyed NSObject", obj, size);

Fresh NSObject malloc_size 16 - <68046370 ff7f0000 00000000 00000000>
Destroyed NSObject malloc_size 0 - <68046370 ff7f0000 00000000 00000000>

Let’s try another one with zombies enabled:

Fresh NSObject malloc_size 16 - <68046370 ff7f0000 00000000 00000000>
Destroyed NSObject malloc_size 16 - <d0011100 01000000 00000000 00000000>

The second eight bytes is just unused here. Let’s write a quick dummy class that uses it and see how it behaves:

@interface Dummy : NSObject
{
    uintptr_t secondEight;
}
@end

@implementation Dummy
- (id)init
{
    if((self = [super init]))
        secondEight = 0xdeadbeefcafebabeULL;
    return self;
}
@end

obj = [[Dummy alloc] init];
size = malloc_size(obj);
Dump(@"Fresh Dummy", obj, size);
[obj release];
Dump(@"Destroyed Dummy", obj, size);

Fresh Dummy malloc_size 16 - <28110000 01000000 bebafeca efbeadde>
Destroyed Dummy malloc_size 0 - <28110000 01000000 bebafeca efbeadde>

Here’s a run with zombies:

Fresh Dummy malloc_size 16 - <28110000 01000000 bebafeca efbeadde>
Destroyed Dummy malloc_size 16 - <e0071100 01000000 bebafeca efbeadde>

NSLog(@"%s", class_getName(object_getClass(obj)));

Let’s see just what this class contains. Here’s a function which will dump out various information about a class:

void DumpClass(Class c)
{
    printf("Dumping class %s\n", class_getName(c));
    printf("Superclass: %s\n", class_getName(class_getSuperclass(c)));

    printf("Ivars:\n");
    Ivar *ivars = class_copyIvarList(c, NULL);
    for(Ivar *cursor = ivars; cursor && *cursor; cursor++)
        printf(" %s %s %d\n", ivar_getName(*cursor), ivar_getTypeEncoding(*cursor), (int)ivar_getOffset(*cursor));
    free(ivars);

    printf("Methods:\n");
    Method *methods = class_copyMethodList(c, NULL);
    for(Method *cursor = methods; cursor && *cursor; cursor++)
        fprintf(stderr, " %s %s\n", sel_getName(method_getName(*cursor)), method_getTypeEncoding(*cursor));
    free(methods);
}

DumpClass(object_getClass(obj));

Dumping class _NSZombie_Dummy
Superclass: nil
Ivars:
 isa # 0
Methods:

What, then, happens when we try to message an instance of this empty class? I put [obj self] in the code after destroying the object and then ran it in gdb. Here’s the result:

2011-05-19 14:42:39.427 a.out[62888:a0f] *** -[Dummy self]: message sent to deallocated instance 0x1001106b0
Program received signal SIGTRAP, Trace/breakpoint trap.
0x00007fff82a4d6c6 in ___forwarding___ ()
(gdb) bt
#0 0x00007fff82a4d6c6 in ___forwarding___ ()
#1 0x00007fff82a49a68 in __forwarding_prep_0___ ()
#2 0x0000000100001c49 in main (argc=1, argv=0x7fff5fbff690) at zomb.m:62

The forwarding mechanism is throwing a SIGTRAP because the class doesn’t implement the minimum necessary forwarding methods. What’s logging “message sent to deallocated instance”, though? Let’s put a breakpoint on CFLog and find out:

Breakpoint 2, 0x00007fff82a98327 in CFLog ()
(gdb) bt
#0 0x00007fff82a98327 in CFLog ()
#1 0x00007fff82a4d6c5 in ___forwarding___ ()
#2 0x00007fff82a49a68 in __forwarding_prep_0___ ()
#3 0x0000000100001c49 in main (argc=1, argv=0x7fff5fbff690) at zomb.m:62
(gdb) cont
Continuing.
2011-05-19 15:45:03.905 a.out[62938:a0f] *** -[Dummy self]: message sent to deallocated instance 0x1001106b0

Conclusion

Zombies are a really useful tool for debugging memory problems. Under the hood, zombies work by rewriting the object’s isa pointer to point to a special zombie class associated with the original. When a message is sent to an instance of the special zombie class, it gets trapped by the runtime’s message forwarding system which then logs the event and crashes the app.

That wraps things up for today. Come back in two more weeks for the next one, just in time for WWDC. In the meantime, as always, keep sending me your ideas for topics.