Stack and Heap Objects in Objective-C

Published: January 15, 2010

Author: TommyWu

Translation · Original: Friday Q&A 2010-01-15: Stack and Heap Objects in Objective-C · By Mike Ash

Original: https://www.mikeash.com/pyblog/friday-qa-2010-01-15-stack-and-heap-objects-in-objective-c.html Published: 2010-01-15 Author: Mike Ash Translator: MiMo (mimo-v2.5-pro); code blocks are left in their original English

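Welcome to this week's Friday Q&A. I'm back from my travels and (just barely) ready to write another edition. This week's topic comes from Gwynne, who asked why Objective-C uses only heap objects and no stack objects.

Before getting into that, let's define our terms.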

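Stack
The stack is a region of memory that holds local variables, along with internal temporary values and housekeeping information. On a modern system, each thread of execution has its own stack. When a function is called, a stack frame is pushed onto the stack, and the function's local data is stored there. When the function returns, its stack frame is destroyed. All of this happens automatically; the programmer takes no explicit action beyond calling the function.

Heap
The heap is, essentially, everything else in memory. (There are memory regions other than the stack and the heap, but they are outside this discussion.) Memory can be allocated on the heap at any time and destroyed at any time. You have to request heap memory explicitly and, if you aren't using garbage collection, free it explicitly as well. The heap is where you store data that needs to outlive the current function call. When you call malloc and free, you are working with heap memory.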
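
As a rough illustration of the two storage regions described above, here is a minimal sketch in plain C (which also compiles as Objective-C). The function names `sum_on_stack`, `sum_on_heap`, and `heap_value_and_free` are hypothetical and exist only for this example; they are not part of any library.

```c
#include <assert.h>
#include <stdlib.h>

/* Stack storage: `local` lives in this call's stack frame.
   Only its value is copied out; the storage disappears on return. */
int sum_on_stack(int a, int b) {
    int local = a + b;
    return local;
}

/* Heap storage: the allocation outlives this call.
   The caller owns the result and must free it. Returns NULL on failure. */
int *sum_on_heap(int a, int b) {
    int *result = malloc(sizeof *result);
    if (result == NULL) {
        return NULL;
    }
    *result = a + b;
    return result;
}

/* Reads a heap value, then explicitly releases its storage. */
int heap_value_and_free(int *value) {
    assert(value != NULL);
    int copy = *value;
    free(value);
    return copy;
}
```

The pointer returned by `sum_on_heap` stays valid after the function returns, until it is passed to `free` (here through `heap_value_and_free`). Returning the address of `local` from `sum_on_stack` would instead hand the caller a dangling pointer.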

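Stack vs. Heap Objects
Given that, what is a stack object, and what is a heap object?

First, we need to understand what an object is in general. In Objective-C (and many other languages), an object is simply a contiguous block of memory with a particular layout. (If you're interested in what it contains and how it is laid out, see my introduction to the Objective-C runtime.)

Where exactly that memory lives matters much less. As long as you have some memory somewhere with the right contents, it is a working Objective-C object. In Objective-C, objects are usually created on the heap: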

    NSObject *obj = [[NSObject alloc] init];

A stack object is simply an object whose memory is allocated on the stack. Objective-C has no direct support for this, but you can build one by hand without much trouble:

    struct {
        Class isa;
    } fakeNSObject;
    fakeNSObject.isa = [NSObject class];

    NSObject *obj = (NSObject *)&fakeNSObject;
    NSLog(@"%@", [obj description]);

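Advantages of Stack Objects
Stack objects are clearly possible in general. Aside from the hack above, real languages like C++ have language-level support for them. In C++, you can create objects on the stack or on the heap: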

    std::string stackString;
    std::string *heapString = new std::string;

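Stack objects have two compelling advantages:

Speed: Allocating memory on the stack is very fast. All of the bookkeeping is done by the compiler when the program is built. At runtime, the function prologue just carves out the space needed for all local variables, and the code knows where each variable goes because the layout was computed in advance. Stack allocation is essentially free, whereas heap allocation can be quite expensive.
Simplicity: Stack objects have a well-defined lifetime. You can never leak one, because it is always destroyed at the end of the scope in which it was declared.

(Note: this is not impossible in general, and many languages move objects around as a matter of course, often as part of garbage collection. However, that requires more runtime smarts and a stricter type system than Objective-C has.)

As used in Cocoa, Objective-C manages memory with a reference counting system. The advantage of this system is that any single object can have multiple "owners", and the system won't allow the object to be destroyed until every owner has relinquished ownership.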
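
The multiple-owner idea can be sketched in a few lines of plain C. This is only an illustrative model under simplifying assumptions (single-threaded, no autorelease); the `Counted` type and the `counted_*` functions are hypothetical names and do not reflect how Cocoa actually stores retain counts.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted object; not Cocoa's real layout. */
typedef struct {
    int retain_count;
    int value;
} Counted;

/* Creates a heap object whose creator is its first owner. */
Counted *counted_create(int value) {
    Counted *obj = malloc(sizeof *obj);
    if (obj == NULL) {
        return NULL;
    }
    obj->retain_count = 1;
    obj->value = value;
    return obj;
}

/* Registers one more owner and returns the same object. */
Counted *counted_retain(Counted *obj) {
    assert(obj != NULL);
    obj->retain_count++;
    return obj;
}

/* Gives up one ownership claim.
   Returns 1 if this call destroyed the object, 0 if other owners remain. */
int counted_release(Counted *obj) {
    assert(obj != NULL && obj->retain_count > 0);
    obj->retain_count--;
    if (obj->retain_count == 0) {
        free(obj);
        return 1;
    }
    return 0;
}
```

With two owners, the first `counted_release` leaves the object alive and only the second one frees it. The scheme relies on `free` being the only thing that ends the object's life, which holds for heap memory but not for storage inside a stack frame.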

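Stack-allocated objects inherently have a single owner: the function that created them. If Objective-C had stack objects, what would happen when you passed one to other code that tried to keep it alive by retaining it? Nothing can prevent the object from being destroyed when the creating function returns, so the retain cannot work. The code that tried to keep the object would be left with a dangling reference and would crash.

Another problem is that stack objects are not very flexible. It is not uncommon in Objective-C for an initializer to destroy the original object and return a new one instead. How would you do that with a stack object? You really couldn't. Much of Objective-C's runtime flexibility depends on having heap objects.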
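
To make the initializer point concrete, here is a hedged sketch in plain C of an "init" step that may discard the object it was given and return a different one. The `Number` type, the shared zero instance, and the `number_*` functions are all hypothetical and only loosely modeled on the alloc/init pattern.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical number object, for illustration only. */
typedef struct {
    long value;
} Number;

/* One shared instance that stands in for every zero-valued Number. */
static Number shared_zero = { 0 };

/* "alloc": reserves zeroed heap storage. Returns NULL on failure. */
Number *number_alloc(void) {
    return calloc(1, sizeof(Number));
}

/* "init": may destroy `self` and return a different object.
   Callers must use the return value, never the original pointer. */
Number *number_init(Number *self, long value) {
    assert(self != NULL);
    if (value == 0) {
        free(self); /* only valid because self came from the heap */
        return &shared_zero;
    }
    self->value = value;
    return self;
}

/* Releases a Number unless it is the shared instance. */
void number_destroy(Number *n) {
    if (n != NULL && n != &shared_zero) {
        free(n);
    }
}
```

The swap works because the caller holds only a pointer, which `number_init` is free to replace. If the storage were a local variable in the caller's stack frame, `free(self)` would be invalid and the caller's variable could not be redirected to another object.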

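Actual Stack Objects in Objective-C
It turns out that, as of Mac OS X 10.6, Objective-C does have real, officially supported stack objects!

Don't get too excited, though. Only one kind of object is supported: blocks. When you write a block inside a function using the ^{} syntax, the result of that expression is a stack object!

So what about the problems I discussed above?

The runtime dynamism problem doesn't exist with blocks. The layout of a block is fixed by the language and can't be changed without breaking binary compatibility. The size of a block object can be computed at compile time, and in fact the whole object is built by compiler-generated code, so there is simply no way to write an initializer that does tricky things.

The object lifetime problem does exist with blocks, but it is less severe. Blocks are a new kind of object that never existed in the language before, so any code that deals with a block will know that, to keep a reference, it must copy the block (which creates a copy on the heap if the block isn't there already, and returns a pointer to it) rather than simply retain it.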
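
The copy-to-heap behavior can be imitated in plain C with a hand-rolled closure. This is an analogy only, under loud assumptions: the `Closure` struct, its `on_heap` flag, and the `closure_*` functions are hypothetical and are not the real blocks ABI or runtime API. In particular, copying a real heap block also adjusts a reference count, which this sketch leaves out.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a block: a code pointer plus captured state. */
typedef struct Closure {
    int (*invoke)(const struct Closure *self);
    int captured;
    int on_heap; /* 0 while the closure lives in caller-provided storage */
} Closure;

static int return_captured(const Closure *self) {
    return self->captured;
}

/* Builds a closure by value, so it lives wherever the caller stores it
   (typically a local variable on the stack). */
Closure closure_make(int captured) {
    Closure c = { return_captured, captured, 0 };
    return c;
}

/* Returns a heap copy, or the closure itself if it is already on the heap.
   Returns NULL if allocation fails. */
Closure *closure_copy(Closure *c) {
    assert(c != NULL);
    if (c->on_heap) {
        return c;
    }
    Closure *copy = malloc(sizeof *copy);
    if (copy == NULL) {
        return NULL;
    }
    *copy = *c;
    copy->on_heap = 1;
    return copy;
}

/* The local closure dies when this function returns; the copy survives. */
Closure *make_escaping_closure(int captured) {
    Closure local = closure_make(captured);
    return closure_copy(&local);
}
```

Returning `&local` from `make_escaping_closure` would hand the caller a dangling pointer, which is the same mistake as keeping a stack block past its scope without copying it. The caller owns the heap copy and releases it with `free`.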

The stack-based nature of blocks does have some pitfalls, though. For example, this code is broken:

    void (^block)();
    if(x)
    {
        block = ^{ printf("x\n"); };
    }
    else
    {
        block = ^{ printf("not x\n"); };
    }
    block();

    [dictionary setObject: ^{ printf("hey hey\n"); } forKey: key];

The speed and simplicity of stack objects are a great boon for blocks, but they also create a whole new class of bugs for unwary programmers.

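Conclusion
That wraps things up for this week. Come back in seven days for another post. Until then, keep your suggestions coming. My material comes from reader ideas, so if there is a topic you would like to see discussed here, send it in!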

#Original (English)

Source: https://www.mikeash.com/pyblog/friday-qa-2010-01-15-stack-and-heap-objects-in-objective-c.html

Welcome to another Friday Q&A. I survived my travel and am (just barely) ready to write another exciting edition. This week’s topic comes from Gwynne, who asked why Objective-C only uses heap objects, and no stack objects.

Before we get into that, let’s define our terms.

Stack
The stack is a region of memory which contains storage for local variables, as well as internal temporary values and housekeeping. On a modern system, there is one stack per thread of execution. When a function is called, a stack frame is pushed onto the stack, and function-local data is stored there. When the function returns, its stack frame is destroyed. All of this happens automatically, without the programmer taking any explicit action other than calling a function.

Heap
The heap is, essentially, everything else in memory. (Yes, there are things other than the stack and heap, but let’s ignore that for this discussion.) Memory can be allocated on the heap at any time, and destroyed at any time. You have to explicitly request for memory to be allocated from the heap, and if you aren’t using garbage collection, explicitly free it as well. This is where you store things that need to outlive the current function call. The heap is what you access when you call malloc and free.

Stack vs Heap Objects
Given that, what’s a stack object, and what’s a heap object?

First, we must understand what an object is in general. In Objective-C (and many other languages), an object is simply a contiguous blob of memory with a particular layout. (If you’re interested in just what it contains and how it’s laid out, check out my intro to the Objective-C runtime.)

The precise location of that memory is less important. As long as you have some memory somewhere with the right contents, it’s a working Objective-C object. In Objective-C, objects are usually created on the heap:

    NSObject *obj = [[NSObject alloc] init];

A stack object is just an object where the memory for that object is allocated on the stack. Objective-C doesn’t have any support for this directly, but you can construct one manually without too much trouble:

    struct {
        Class isa;
    } fakeNSObject;
    fakeNSObject.isa = [NSObject class];

    NSObject *obj = (NSObject *)&fakeNSObject;
    NSLog(@"%@", [obj description]);

Advantages of Stack Objects
It’s obviously possible to have stack objects in general. Aside from the above hack, real languages like C++ have language support for stack objects. In C++, you can create objects on the stack or the heap:

    std::string stackString;
    std::string *heapString = new std::string;

Stack objects have two compelling advantages:

Speed: Allocating memory on the stack is really fast. All of the bookkeeping is done by the compiler when you build your program. At runtime, the function prolog just carves out the amount of space it needs for all local variables, and the code knows what goes where because it was all computed in advance. Stack allocations are essentially free, whereas heap allocations can be quite expensive.
Simplicity: Stack objects have a defined lifetime. You can never leak one, because it always gets destroyed at the end of the scope where it was declared.

(Note: it’s not an impossibility in general, and many languages move objects around as a matter of course, often as part of garbage collection schemes. However, this requires more runtime smarts and a stricter type system than you get in Objective-C.)

As used in Cocoa, Objective-C uses a reference counting system for memory management. The advantage of this system is that any single object can have multiple “owners”, and the system won’t allow the object to be destroyed until all owners have relinquished ownership.

Stack allocated objects inherently have a single owner, the function which created them. If Objective-C had stack objects, what would happen if you passed it to some other code which then tried to keep it around by retaining it? There’s no way to prevent the object from being destroyed when the function which created it returns, so the retain can’t work. The code which tries to keep the object around will fail, end up with a dangling reference, and will crash.

Another problem is that stack objects are not very flexible. It’s not uncommon in Objective-C to implement an initializer which destroys the original object and returns a new one instead. How could you do that with a stack object? You really couldn’t. Much of the runtime flexibility of Objective-C depends on having heap objects.

Actual Stack Objects in Objective-C
It turns out that Objective-C does have stack objects, truly and officially, as of 10.6!

Don’t get too excited, though. It’s only supported for a single kind of object: blocks. When you write a block inside a function using the ^{} syntax, the result of that expression is a stack object!

But what about those problems I discussed above?

The problem with runtime dynamism doesn’t exist with blocks. Blocks have a layout which is fixed by the language, and which can’t be changed without destroying binary compatibility. The size of the block object can be computed at compile time, and in fact the whole object is built by compiler-generated code, so the possibility of writing an initializer that does tricky things simply doesn’t exist.

The problem of object lifetime does exist with blocks, but is less severe. The reason for this is simply because blocks are a new kind of object that never existed in the language before, and any code that deals with a block will know that it needs to copy a block (which creates a copy on the heap, if it’s not there already, and returns a pointer to that), rather than retain it, if it wants to keep a reference.

The stack nature of blocks does have some pitfalls, though. For example, this code is broken:

    void (^block)();
    if(x)
    {
        block = ^{ printf("x\n"); };
    }
    else
    {
        block = ^{ printf("not x\n"); };
    }
    block();

    [dictionary setObject: ^{ printf("hey hey\n"); } forKey: key];

The speed and simplicity of stack objects are a great boon for blocks, but it also creates a whole new class of bugs for unwary programmers.

Conclusion
That wraps things up for this week. Come back in seven days for another exciting post. Until then, keep those suggestions coming. My material comes from user ideas, so if you have a topic you would like to see discussed here, send it in!