访问器、内存管理与线程安全

Mike Ash Friday Q&A 中文译文:访问器、内存管理与线程安全

作者 TommyWu

译文 · 原文: Friday Q&A 2010-12-03: Accessors, Memory Management, and Thread Safety · 作者 Mike Ash

原文:https://www.mikeash.com/pyblog/friday-qa-2010-12-03-accessors-memory-management-and-thread-safety.html 发布:2010-12-03 作者:Mike Ash 译者:MiMo(mimo-v2.5-pro);代码块保留英文原样


比你想象的更复杂

Cocoa 的引用计数内存管理通常运作良好,并且在很大程度上实现了其目标:使内存管理决策成为一个局部而非全局的问题。

但有一个重大的例外。找出这段代码中的缺陷:

NSMutableDictionary *dict = ...;
id obj = [dict objectForKey: @"foo"];
[dict removeObjectForKey: @"foo"];
[obj something];

如果 dict 持有 obj 的唯一一次保留(retain),那么移除该键就会销毁 obj,随后向它发送消息便会崩溃。当然,这种 bug 并非总是如此容易发现。删除操作可能被埋在多层方法调用的深处,而对象有时也可能在别处被保留着,从而掩盖了 bug,使你的程序只是偶发性地崩溃。

同样的问题也可能发生在普通对象和访问器(accessor)上:

id obj = [otherObj foo];
[otherObj setFoo: newFoo];
[obj something]; // crash

基础访问器

最基础的访问器看起来大致如下:

- (void)setFoo: (id)newFoo
{
[newFoo retain];
[_foo release];
_foo = newFoo;
}
- (id)foo
{
return _foo;
}

这些访问器同样存在上述问题:setter 会立即释放旧值。这并不意味着这些基本访问器是错误的,但确实意味着你在编写使用它们的代码时需要更加小心:

id obj = [[otherObj foo] retain];
[otherObj setFoo: newFoo];
[obj something]; // safe
[obj release];

自动释放访问器

与其让调用者负责保留和释放临时对象,另一种方法是修改访问器(accessor),让它使用自动释放(autorelease)。你可以在 setter 中这样做,也可以在 getter 中这样做,下面的代码两处都用上了:

- (void)setFoo: (id)newFoo
{
[_foo autorelease];
_foo = [newFoo retain];
}
- (id)foo
{
return [[_foo retain] autorelease];
}

虽然这解决了问题,但也存在几个缺点。显而易见的是效率较低:使用 autorelease(自动释放)比直接 release(释放)稍慢,且会使目标对象存活时间延长,可能导致内存占用增加。这点通常并不重要,但需要有所意识。

更重要的是,过度使用 autorelease 会增加追踪内存管理错误的难度。如果过度释放了某个对象,你希望程序能尽快崩溃。而使用 autorelease 往往会使崩溃发生在 NSAutoreleasePool 代码中,导致问题追踪变得相当困难。虽然僵尸对象(zombies)和 Instruments 工具能提供很大帮助,但仍可能让你的工作更加棘手。

线程安全访问器

在编写线程安全访问器时,使用 autorelease 的 getter 变得必不可少。其他线程可能在 getter 返回后立即调用 setter,而 retain / autorelease 技巧是确保仍在使用中的旧对象不会因此被销毁的唯一方法。

线程安全访问器与任何对共享数据的访问一样,需要使用锁。你可以选择 @synchronized(self)、NSLock 实例,或任何其他你偏好的锁机制。

(当然,无锁访问共享数据是可行的,但其机制过于复杂且难以在此详述。我们不妨跳过这个话题,直接使用锁。)

通过对所有共享访问进行加锁,并且在 getter 方法中使用自动释放版本(译注:指 getter 返回 autorelease 对象以适应多线程环境,现代 ARC 下机制可能已变化),你最终将获得线程安全的访问器:

- (void)setFoo: (id)newFoo
{
[newFoo retain];
@synchronized(self)
{
[_foo release];
_foo = newFoo;
}
}
- (id)foo
{
id returnFoo;
@synchronized(self)
{
returnFoo = [_foo retain];
}
return [returnFoo autorelease];
}

你需要线程安全访问器吗?

在绝大多数情况下,答案都是“否”。线程安全通常不该是访问器需要操心的问题。

问题在于,线程安全并非一种可组合的属性。即便你将一堆线程安全的组件组合在一起,结果仍可能不是线程安全的。

举个简单的例子,假设有一个 Person 类,它具有表示该人名和姓的属性。一个线程执行如下操作:

[person setFirstName: @"John"];
[person setLastName: @"Doe"];

另一个线程执行如下操作:

NSString *f = [person firstName];
NSString *l = [person lastName];
NSString *fullName = [NSString stringWithFormat: @"%@ %@", f, l];

如果两个线程的执行交错进行,fullName 可能由旧的名和新的姓(或者相反)拼成,尽管每个访问器单独来看都是线程安全的。

为了保证这段代码的安全,线程安全需要在更高的层面实现。例如,可以创建一个 PersonDatabase 对象,在进行任何操作时对其加锁和解锁。也可以规定所有对 Person 的访问都通过同一个串行的 Grand Central Dispatch(GCD)队列进行。或者干脆让所有对 Person 的访问都发生在主线程上。

无论选择哪种方案,一旦确定下来,那些线程安全的访问器就不再是必要的了。更大范围的线程安全机制已经解决了问题,而所有这些线程安全访问器做的只是让你的代码不必要地变慢和变复杂。

在某些情况下,线程安全的访问器确实有用。当有一个单独的属性会被多个线程访问,并且该属性与其他属性的不一致不会引发问题时,你就会需要线程安全的访问器。然而,99% 的情况下都没有必要将它们设置为线程安全的。

属性与原子性

基于上述分析,苹果将 @property 声明默认为原子性(atomic)这一点极其令人困惑。大多数情况下,这种默认行为毫无意义,甚至可能让程序员产生错误印象,以为通过 @property 处理后无需再担忧线程安全问题。

@property 和 @synthesize 构造确实是一种无需手写代码即可生成访问器(accessor)的有效方式,但需注意其默认的原子性行为用处有限,并不能让你忽视线程安全问题。

垃圾收集机制

若你使用垃圾收集(garbage collection),整个问题将大大简化。以下是在垃圾收集代码中,一个正确且线程安全的访问器对(accessor pair)应具备的形态:

- (void)setFoo: (id)newFoo
{
_foo = newFoo;
}
- (id)foo
{
return _foo;
}

总结

编写访问器(accessor)很简单,但其中存在一些微妙之处。在绝大多数情况下,你只需要一个进行 retain / release 的 setter(设值方法)和一个直接返回 _ivar 的 getter(取值方法)即可。然而,根据你的具体情况和个人偏好,你可能会选择在 getter 中使用 retain / autorelease,以简化调用它的代码。

关于线程安全(thread safety),访问器通常不是应该关心它的地方。尽管存在一些合理的使用场景,但大多数时候,你应该退后一步,在代码的更高层面来处理线程安全问题。不要被 atomic 这个 @property 关键字所迷惑:在这方面它并没有什么特殊作用。

如果你专门为垃圾回收(garbage collection)编写代码,那么你基本上可以忘记所有这些事务,并编写简单得多的代码。

以上就是本期 Friday Q&A 的全部内容。照例,两周后将发布下一篇。在那之前,你可以向我发送主题建议来打发时间。如果你有任何希望在此处看到的主题,请发送给我。


#Original (English)

Source: https://www.mikeash.com/pyblog/friday-qa-2010-12-03-accessors-memory-management-and-thread-safety.html

It’s once again time for a brand new edition of Friday Q&A. This week, I’m going to talk about accessors, and how to properly deal with memory management and thread safety when creating them, a topic suggested by Daniel Jalkut.

More Complicated Than You Might Think

Cocoa’s reference counting memory management usually works pretty well, and for the most part accomplishes its goal of making memory management decisions a local, rather than global, affair.

There’s a big exception. Spot the bug in this code:

NSMutableDictionary *dict = ...;
id obj = [dict objectForKey: @"foo"];
[dict removeObjectForKey: @"foo"];
[obj something];

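If the dictionary held the only retain on obj, removing the key destroys the object, and the final message is sent to a dangling pointer. Of course, this bug isn’t always this easy to find. The removal may be buried several method calls deep, and the object may sometimes be retained elsewhere, hiding the bug and making your program crash only inconsistently.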

The same problem can happen with a regular object and accessors:

id obj = [otherObj foo];
[otherObj setFoo: newFoo];
[obj something]; // crash

Basic Accessors

The most basic accessors look something like this:

- (void)setFoo: (id)newFoo
{
[newFoo retain];
[_foo release];
_foo = newFoo;
}
- (id)foo
{
return _foo;
}

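These accessors are vulnerable to the problem shown above, because the setter releases the old value immediately. This does not mean that these basic accessors are wrong. It does mean that you need to be more careful when writing code that uses them: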

id obj = [[otherObj foo] retain];
[otherObj setFoo: newFoo];
[obj something]; // safe
[obj release];
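
The ownership rules behind these snippets can be modeled outside Cocoa. What follows is a minimal sketch in plain C (which also compiles as Objective-C), not Cocoa's implementation: `Obj`, `Holder`, `obj_retain`, `obj_release`, and the `dealloc_flag` field are hypothetical names invented for illustration. The flag exists only so that destruction can be observed without touching freed memory.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an Objective-C object: a manual reference
   count plus a flag that records when the object has been destroyed. */
typedef struct Obj {
    int refcount;
    int *dealloc_flag;          /* set to 1 on destruction */
} Obj;

static Obj *obj_new(int *dealloc_flag) {
    Obj *o = malloc(sizeof *o);
    if (!o) return NULL;
    o->refcount = 1;            /* the creator owns one reference */
    o->dealloc_flag = dealloc_flag;
    return o;
}

static Obj *obj_retain(Obj *o) {
    if (o) o->refcount++;
    return o;
}

static void obj_release(Obj *o) {
    if (o && --o->refcount == 0) {
        *o->dealloc_flag = 1;
        free(o);
    }
}

typedef struct Holder { Obj *foo; } Holder;

/* Basic setter: retain the new value before releasing the old one, so
   that setting the same object again does not destroy it. */
static void holder_set_foo(Holder *h, Obj *new_foo) {
    obj_retain(new_foo);
    obj_release(h->foo);
    h->foo = new_foo;
}

/* Basic getter: returns a borrowed pointer; the caller owns nothing. */
static Obj *holder_foo(const Holder *h) {
    return h->foo;
}
```

In this model, calling `holder_set_foo` while holding only a borrowed pointer from `holder_foo` destroys the old object at once, which is the crash case. Wrapping the borrowed pointer in `obj_retain` and `obj_release` keeps the object alive across the setter call.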

Autoreleasing Accessors

Rather than make callers retain and release temporary objects, another approach is to modify the accessors to use autorelease. You can do this in the setter, in the getter, or (as here) in both:

- (void)setFoo: (id)newFoo
{
[_foo autorelease];
_foo = [newFoo retain];
}
- (id)foo
{
return [[_foo retain] autorelease];
}

This solves the problem, but there are a couple of downsides. The obvious one is that it’s less efficient. Using autorelease is a bit slower than release, and it keeps the target object alive longer, which could lead to more memory usage. This isn’t usually very important, but it’s something to keep in mind.

More importantly, pervasive use of autorelease can make it harder to track down memory management errors. If you over-release an object, you want to crash as soon as possible. Using autorelease can often cause the crash to be in NSAutoreleasePool code, making it considerably harder to track down. Using zombies and Instruments will help a lot, but it can still make your job harder.
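
To make the lifetime trade-off concrete, here is a rough C model of autoreleasing accessors. It is a sketch under loud assumptions: `Pool`, `pool_autorelease`, and `pool_drain` are hypothetical and only imitate the idea of NSAutoreleasePool as a list of releases deferred until the pool drains. The real class is more involved.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted object, as a stand-in for an Objective-C object. */
typedef struct Obj { int refcount; int *dealloc_flag; } Obj;

static Obj *obj_new(int *dealloc_flag) {
    Obj *o = malloc(sizeof *o);
    if (!o) return NULL;
    o->refcount = 1;
    o->dealloc_flag = dealloc_flag;
    return o;
}

static Obj *obj_retain(Obj *o) { if (o) o->refcount++; return o; }

static void obj_release(Obj *o) {
    if (o && --o->refcount == 0) { *o->dealloc_flag = 1; free(o); }
}

/* Toy autorelease pool: a fixed-size list of releases to perform later. */
#define POOL_CAPACITY 16
typedef struct Pool { Obj *pending[POOL_CAPACITY]; int count; } Pool;

static Obj *pool_autorelease(Pool *p, Obj *o) {
    if (o) {
        assert(p->count < POOL_CAPACITY);
        p->pending[p->count++] = o;   /* defer one release */
    }
    return o;
}

static void pool_drain(Pool *p) {
    for (int i = 0; i < p->count; i++) obj_release(p->pending[i]);
    p->count = 0;
}

typedef struct Holder { Obj *foo; } Holder;

/* Autoreleasing setter: the old value's release is deferred to the pool. */
static void holder_set_foo(Holder *h, Pool *p, Obj *new_foo) {
    pool_autorelease(p, h->foo);
    h->foo = obj_retain(new_foo);
}

/* Autoreleasing getter: the caller's pointer stays valid until the drain. */
static Obj *holder_foo(Holder *h, Pool *p) {
    return pool_autorelease(p, obj_retain(h->foo));
}
```

The cost described above is visible in the model: the old value outlives the setter call and is destroyed only inside `pool_drain`, far from the code that caused it, which is why an over-release tends to surface in pool code.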

Thread Safe Accessors

When writing thread safe accessors, the autorelease getter becomes mandatory. Another thread could call the setter right after the getter returns, and the retain/autorelease dance is the only way to ensure that this does not destroy the previous object in the middle of being used.

Thread safe accessors, like any access to shared data, need to use a lock. You can use @synchronized(self), an instance of NSLock, or whatever other locking mechanism you prefer.

(Lockless access to shared data is, of course, possible, but far too tricky and difficult to cover here. Better to skip it and use locks.)

By locking all shared access, and by using the autorelease version of the getter, you end up with thread safe accessors:

- (void)setFoo: (id)newFoo
{
[newFoo retain];
@synchronized(self)
{
[_foo release];
_foo = newFoo;
}
}
- (id)foo
{
id returnFoo;
@synchronized(self)
{
returnFoo = [_foo retain];
}
return [returnFoo autorelease];
}
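
The same locking discipline can be sketched in plain C with a POSIX mutex and a C11 atomic reference count. This is an illustration under assumptions, not a translation of what @synchronized does: `Obj`, `Holder`, and the function names are hypothetical, and because C has no autorelease pool, the getter (named `holder_copy_foo` here) returns an owned reference that the caller must release.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical object with an atomic reference count, so retain and
   release are safe to call from several threads. The dealloc flag is
   plain test instrumentation. */
typedef struct Obj { atomic_int refcount; int *dealloc_flag; } Obj;

static Obj *obj_new(int *dealloc_flag) {
    Obj *o = malloc(sizeof *o);
    if (!o) return NULL;
    atomic_init(&o->refcount, 1);
    o->dealloc_flag = dealloc_flag;
    return o;
}

static Obj *obj_retain(Obj *o) {
    if (o) atomic_fetch_add(&o->refcount, 1);
    return o;
}

static void obj_release(Obj *o) {
    /* fetch_sub returns the previous value; 1 means we dropped the last one */
    if (o && atomic_fetch_sub(&o->refcount, 1) == 1) {
        *o->dealloc_flag = 1;
        free(o);
    }
}

typedef struct Holder { pthread_mutex_t lock; Obj *foo; } Holder;

static void holder_init(Holder *h) {
    pthread_mutex_init(&h->lock, NULL);
    h->foo = NULL;
}

/* Setter: retain outside the lock, swap and release the old value inside. */
static void holder_set_foo(Holder *h, Obj *new_foo) {
    obj_retain(new_foo);
    pthread_mutex_lock(&h->lock);
    obj_release(h->foo);
    h->foo = new_foo;
    pthread_mutex_unlock(&h->lock);
}

/* Getter: retain while the lock is held, so a concurrent setter cannot
   destroy the object before the caller owns a reference to it. */
static Obj *holder_copy_foo(Holder *h) {
    pthread_mutex_lock(&h->lock);
    Obj *result = obj_retain(h->foo);
    pthread_mutex_unlock(&h->lock);
    return result;              /* caller must obj_release() */
}
```

The key point mirrored from the accessors above is that the retain in the getter happens inside the critical section. Retaining after unlocking would reopen the window in which another thread's setter could destroy the object.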

Do You Need Thread Safe Accessors?

In almost every case, the answer to this question is “no”. Accessors are usually not the right place to worry about thread safety.

The problem is that thread safety isn’t a composable attribute. If you take a bunch of thread safe components and bundle them together, the result may not be thread safe.

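For a simple example, imagine a Person class with properties for the first and last name of the person it represents. One thread does this:

[person setFirstName: @"John"];
[person setLastName: @"Doe"];

Another thread does this:

NSString *f = [person firstName];
NSString *l = [person lastName];
NSString *fullName = [NSString stringWithFormat: @"%@ %@", f, l];

If the two threads interleave, fullName can end up combining the old first name with the new last name (or the reverse), even though each accessor is thread safe on its own.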

In order to make this code safe, thread safety needs to be applied at a higher level. For example, you might have a PersonDatabase object which could be locked and unlocked for any manipulations. You might decree that all Person access go through a single serial Grand Central Dispatch queue. You might just make all Person access happen on the main thread.

No matter which solution you pick, once you’ve picked it, the thread safe accessors are no longer necessary. The larger-scale thread safety takes care of problems, and all the thread safe accessors do is make your code unnecessarily slow and complex.

There are cases where a thread safe accessor is useful. You may have a single property that’s accessed from many threads and which won’t experience problems being inconsistent with other properties, in which case you’d want a thread safe accessor. However, 99% of the time there is no point in making them.
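
The composability problem, and the coarser-grained fix, can be sketched in C. Everything here is an assumption made for illustration: the `Person` struct, its two mutexes, and the function names are hypothetical, and a real design would pick one of the two schemes rather than carry both locks.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define NAME_LEN 32

typedef struct Person {
    pthread_mutex_t field_lock;   /* taken by each individual accessor */
    pthread_mutex_t record_lock;  /* taken around whole-record operations */
    char first[NAME_LEN];
    char last[NAME_LEN];
} Person;

static void person_init(Person *p) {
    pthread_mutex_init(&p->field_lock, NULL);
    pthread_mutex_init(&p->record_lock, NULL);
    p->first[0] = '\0';
    p->last[0] = '\0';
}

/* Copies at most NAME_LEN - 1 characters and always NUL-terminates. */
static void copy_name(char *dst, const char *src) {
    snprintf(dst, NAME_LEN, "%s", src);
}

/* Per-field accessors: each call is atomic on its own, nothing more. */
static void person_set_first(Person *p, const char *s) {
    pthread_mutex_lock(&p->field_lock);
    copy_name(p->first, s);
    pthread_mutex_unlock(&p->field_lock);
}

static void person_set_last(Person *p, const char *s) {
    pthread_mutex_lock(&p->field_lock);
    copy_name(p->last, s);
    pthread_mutex_unlock(&p->field_lock);
}

/* out must have room for NAME_LEN bytes. */
static void person_get_first(Person *p, char *out) {
    pthread_mutex_lock(&p->field_lock);
    copy_name(out, p->first);
    pthread_mutex_unlock(&p->field_lock);
}

static void person_get_last(Person *p, char *out) {
    pthread_mutex_lock(&p->field_lock);
    copy_name(out, p->last);
    pthread_mutex_unlock(&p->field_lock);
}

/* Higher-level operations: one lock covers both fields. These are only
   safe if every access to the Person goes through record_lock. */
static void person_set_names(Person *p, const char *f, const char *l) {
    pthread_mutex_lock(&p->record_lock);
    copy_name(p->first, f);
    copy_name(p->last, l);
    pthread_mutex_unlock(&p->record_lock);
}

static void person_full_name(Person *p, char *out, size_t n) {
    pthread_mutex_lock(&p->record_lock);
    snprintf(out, n, "%s %s", p->first, p->last);
    pthread_mutex_unlock(&p->record_lock);
}
```

With the per-field accessors, a reader that fetches the first name, is preempted while a writer updates both names, and then fetches the last name ends up with a name that never existed. With `person_set_names` and `person_full_name`, the writer cannot run between the two reads, so the per-field locks add nothing but overhead.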

Properties and nonatomic

Given the above, it’s extremely puzzling that Apple has made @property declarations default to atomic. Most of the time it’s pointless, and it can give the mistaken impression that the programmer doesn’t have to worry about thread safety anymore, because the @property handles it all.

The @property and @synthesize constructs can be a good way to generate accessors without writing code. Just be aware that the default atomic behavior is not all that useful, and doesn’t mean you can forget about thread safety.

Garbage Collection

If you’re using garbage collection, this whole question becomes vastly simpler. Here’s what a correct, thread safe accessor pair looks like in garbage collected code:

- (void)setFoo: (id)newFoo
{
_foo = newFoo;
}
- (id)foo
{
return _foo;
}

Conclusion

Writing accessors is easy, but there are some subtleties. The vast majority of the time, a plain retain/release setter and a return _ivar getter is all you need. However, depending on your situation and your individual taste, you may want to put retain/autorelease into the getter in order to simplify the code that calls it.

When it comes to thread safety, accessors are usually the wrong place to worry about it. While there are legitimate cases where it’s useful, most of the time you should back up and take on the problem of thread safety at a higher level of your code. And don’t let the atomic @property keyword fool you: it doesn’t do anything special in this regard.

If you’re coding exclusively for garbage collection, you can pretty much forget about this whole business and write much simpler code.

That’s it for this edition of Friday Q&A. As usual, another one will be posted in two weeks. Until then, you can pass the time by sending me suggestions for topics. If you have a topic that you would like to see covered here, please send it in.