The following is an excerpt from the unpublished "Cocoa Design Patterns"
book by Erik M. Buck
9
Associative Storage
Associative storage is one of the oldest and most used techniques in
software development. Associative storage organizes data and keys so that
data can be quickly and easily accessed using the keys. Associative storage
and an object-oriented approach combine to produce a pattern that promotes
flexibility as well as run-time storage efficiency.
Motivation
Efficiently store arbitrary data associated with objects, promote
flexibility, enable extensibility, and work around programming language
limitations.
Solution
The NSDictionary and NSMutableDictionary classes in Cocoa's Foundation
framework are the most prominent classes that provide associative storage.
An NSDictionary instance maps keys to object values. To retrieve an object
value previously stored in a dictionary, use the -objectForKey: method, which
returns the object that is associated with a specified key.
NSMutableDictionary is a subclass of NSDictionary and provides
the -setObject:forKey: method used to create new associations in the
dictionary. When keys and values are added to or removed from a mutable
dictionary, the memory allocated to store objects grows and shrinks
automatically. If -setObject:forKey: is called with a key that is already in
the dictionary, the object associated with that key is replaced by the new
object. Each unique key is stored at most once in any dictionary.
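The replacement behavior can be seen in a minimal sketch. The function name
MYDictionaryReplacementExample and the key and value strings are assumptions
made for illustration; only the NSMutableDictionary methods are part of Cocoa.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not a Cocoa API): stores a value, replaces it by
// reusing the same key, and returns the value finally associated with the key.
static NSString *MYDictionaryReplacementExample(void)
{
    NSMutableDictionary *dictionary = [NSMutableDictionary dictionary];

    // Create a new association between the key and a value
    [dictionary setObject:@"first" forKey:@"label"];

    // Using an equal key again replaces the associated value
    [dictionary setObject:@"second" forKey:@"label"];

    // The dictionary still holds exactly one key/value pair
    NSCAssert(1 == [dictionary count], @"each key is stored at most once");

    return [dictionary objectForKey:@"label"];
}
```

The function returns the second string because the second call
to -setObject:forKey: replaced the first association rather than adding
another one.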
The objects stored in a dictionary are retained when they are added to the
collection and released when they are removed. The implications of retaining
and releasing objects are described in Chapter 23. The keys that are
added to a dictionary are copied, which means that all objects used as keys
in a dictionary must conform to the NSCopying formal protocol declared in
NSObject.h. In addition to conforming to the NSCopying protocol, objects
used as keys in a dictionary must implement the -isEqual: and -hash
methods so that any two objects that are considered equal by the -isEqual:
method also have the same hash value. The -isEqual: and -hash methods are
declared in the NSObject class which provides basic implementations using
the addresses of objects. In other words, two objects are equal if they have
the same address, and the -hash value is computed from the address.
Subclasses of NSObject override the inherited implementations of -isEqual:
and -hash as needed. For example, instances of the NSString class are
compared based on their stored string values rather than merely their
addresses, and the value returned from -hash is also computed from the
stored strings.
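The requirements for keys can be sketched with a small class. MYPointKey and
its methods are hypothetical names invented for this illustration, and the
hash formula is just one simple choice that satisfies the rule that equal
objects have equal hash values.

```objc
#import <Foundation/Foundation.h>

// Hypothetical key class (an assumption for illustration, not part of Cocoa).
// Two MYPointKey instances with the same coordinates are considered equal.
@interface MYPointKey : NSObject <NSCopying>
{
    int x;
    int y;
}
- (id)initWithX:(int)anX y:(int)aY;
@end

@implementation MYPointKey
- (id)initWithX:(int)anX y:(int)aY
// Stores the two coordinates that define the key
{
    self = [super init];
    if(nil != self)
    {
        x = anX;
        y = aY;
    }
    return self;
}
- (id)copyWithZone:(NSZone *)zone
// Dictionaries call this method when the receiver is added as a key
{
    return [[MYPointKey allocWithZone:zone] initWithX:x y:y];
}
- (BOOL)isEqual:(id)anObject
// Compares stored values rather than addresses
{
    if(![anObject isKindOfClass:[MYPointKey class]])
    {
        return NO;
    }
    MYPointKey *other = anObject;
    return (x == other->x) && (y == other->y);
}
- (NSUInteger)hash
// Only x and y are used, so equal objects always have equal hash values
{
    return ((NSUInteger)x * 31u) + (NSUInteger)y;
}
@end
```

Because the dictionary copies its keys and compares them with -isEqual:, a
value stored under one MYPointKey instance can be retrieved with a different
instance that has the same coordinates.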
NSDictionary and other associative storage features of Cocoa are implemented
with hash tables. Hash tables are explained in almost every introductory
data structures textbook. An excellent introduction is available at
http://ciips.ee.uwa.edu.au/~morris/Year2/PLDS210/hash_tables.html, and an
advanced description is available at
http://www.cris.com/~Ttwang/tech/inthash.htm.
Cocoa provides a functional interface for associative storage using the
NSMapTable data structure and functions that manipulate it. NSMapTable is
used in the following example because it provides a little more flexibility
than NSDictionary. Dictionaries always copy their keys, but in the following
example it is necessary to store keys without copying or retaining them.
Simulating Instance Variables
One limitation of the Category pattern described in Chapter 8, "Category",
is that categories can only add methods to a class; instance variables must
be declared only in the main class interface. This example shows one way the
Associative Storage pattern is used to simulate the addition of an instance
variable to Cocoa’s NSObject class. The category in this example provides
access to a different label for each instance of NSObject or any class that
inherits from NSObject. Instances that don't have an assigned label don't
consume any extra memory. The following category declares the -mySetLabel:
and -myLabel methods:
#import <Foundation/Foundation.h>
@interface NSObject (MYSimulateIVar)
- (void)mySetLabel:(NSString *)aString;
- (NSString *)myLabel;
@end
Methods like the ones defined in this category are called Accessors.
Accessors are themselves an important pattern described in Chapter 23. The
primary purpose of Accessors is to funnel all references to each instance
variable through a few methods, usually only two. A nice benefit of using
accessors in this example is that even though the labels are not stored as
instance variables, programmers using the NSObject class don't need to
know that. The accessors shield users of a class from the actual
implementation.
#import "MYSimulateIvar.h"
@implementation NSObject (MYSimulateIVar)
// The table that associates labels with instances; created on first use
static NSMapTable *_MYSimulatedIVarMapTable = NULL;
+ (NSMapTable *)_mySimulatedIVarMapTable
// Returns the shared table, creating it the first time it is needed
{
    if(NULL == _MYSimulatedIVarMapTable)
    {
        // Keys are not retained; values (the labels) are retained
        _MYSimulatedIVarMapTable = NSCreateMapTable(
            NSNonRetainedObjectMapKeyCallBacks, NSObjectMapValueCallBacks, 16);
    }
    return _MYSimulatedIVarMapTable;
}
- (void)dealloc
// Possibly risky implementation: See +poseAsClass: solution
// in this chapter
{
    NSMapRemove([[self class] _mySimulatedIVarMapTable], self);
    NSDeallocateObject(self);
}
- (void)mySetLabel:(NSString *)aString
// Associates a copy of aString with the receiver
{
    NSString *newLabel = [aString copy];
    NSMapInsert([[self class] _mySimulatedIVarMapTable], self, newLabel);
    [newLabel release]; // the table retains the label
}
- (NSString *)myLabel
// Returns the label associated with the receiver or nil if there is none
{
    return (NSString *)NSMapGet([[self class] _mySimulatedIVarMapTable], self);
}
@end
There are several important elements to the implementation of the
MYSimulateIVar category. The +_mySimulatedIVarMapTable class method is used
to access the NSMapTable data structure that stores labels associated with
NSObject instances. The +_mySimulatedIVarMapTable method is not declared in
the category interface because it is a private implementation detail of the
category. The first time +_mySimulatedIVarMapTable is called, the data
structure is initialized to store non-retained object keys and retained
objects as values. It is critical that the keys are not retained because if
they are retained it will be impossible to correctly deallocate any instances
of NSObject that have associated labels. The table is initialized with
sufficient storage for 16 key/value pairs, but that number is arbitrary. The
storage for the table automatically increases as keys and values are added.
The -dealloc method implemented in the category replaces NSObject’s existing
implementation. The -dealloc method removes any key/value pair associated
with an instance when the instance is deallocated. It is safe to replace
NSObject’s -dealloc implementation in this case because the replaced version
is documented to do nothing except call NSDeallocateObject() at this time.
If Apple ever changes the implementation of -dealloc in the NSObject class,
the fact that this category bypasses that implementation may have undesirable
side effects. An alternative approach to just replacing the
existing -dealloc method is described later in the Adding Simulated Instance
Variables Through Posing section of this chapter.
To flesh out support for labels associated with objects, it’s necessary to
provide encoding and decoding support so that labels are stored along with
any other data stored for objects when they are encoded. An example of using
existing accessors in the implementation of encoding and decoding methods is
provided in Chapter 13, “Archiving and Unarchiving.” Labels should also be
copied when objects are copied, and a general technique is described in
Chapter 14, “Copying.”
Finally, this example is limited to storing labels for objects. A more
useful category enables the storage of any amount of data with each object.
To enable that, modify the example to store dictionaries of key/value pairs
with -mySetUserInfo: and -myUserInfo methods instead of -mySetLabel:
and -myLabel methods.
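- (void)mySetUserInfo:(NSDictionary *)aDictionary
// Associates a copy of aDictionary with the receiver
{
    NSDictionary *newDictionary = [aDictionary copy];
    NSMapInsert([[self class] _mySimulatedIVarMapTable], self, newDictionary);
    [newDictionary release]; // the table retains the dictionary
}
- (NSDictionary *)myUserInfo
// Returns the dictionary associated with the receiver or nil if there is none
{
    return (NSDictionary *)NSMapGet([[self class] _mySimulatedIVarMapTable],
        self);
}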
Any number of key/value pairs can be stored in the dictionary associated
with each object. To keep the ability to store labels, simply store a label
string associated with a key such as @"Label" in each user info dictionary.
Cocoa’s NSNotification class uses user info dictionaries to pass arbitrary
data along with each NSNotification instance.
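One way to keep label support on top of user info dictionaries is sketched
here. The MYLabelKey constant and the two helper functions are hypothetical
names chosen for this illustration; they operate on plain dictionaries so
that the sketch stands alone, and could be called with the dictionaries
passed to and returned from the user info accessors.

```objc
#import <Foundation/Foundation.h>

// Hypothetical key under which the label is stored (an assumption)
static NSString * const MYLabelKey = @"Label";

static NSDictionary *MYUserInfoBySettingLabel(NSDictionary *userInfo,
    NSString *aLabel)
// Returns a dictionary containing every key/value pair in userInfo plus
// aLabel stored under MYLabelKey. aLabel must not be nil.
{
    NSMutableDictionary *result = [NSMutableDictionary dictionary];
    if(nil != userInfo)
    {
        [result addEntriesFromDictionary:userInfo];
    }
    [result setObject:aLabel forKey:MYLabelKey];
    return result;
}

static NSString *MYLabelFromUserInfo(NSDictionary *userInfo)
// Returns the stored label, or nil if the dictionary has no label
{
    return [userInfo objectForKey:MYLabelKey];
}
```

Other key/value pairs already present in the dictionary are preserved, so
the label is just one of any number of values stored with an object.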
Adding Simulated Instance Variables Through Posing
A valuable but rarely used feature of Objective-C enables one class to pose
as another class at runtime. Before any instances of a particular class are
created, send the +poseAsClass: message to the new class that will replace
the old class. There are several limitations inherent when one class poses
as another: The new class must be a subclass of the class being replaced,
and the new class cannot add any instance variables to the class being
replaced.
However, a posing class may benefit from the simulated instance variable
technique using associative storage just like a category. The following
code implements a subclass of NSObject:
#import <Foundation/Foundation.h>
@interface MYUserInfoObject : NSObject
- (void)mySetUserInfo:(NSDictionary *)aDictionary;
- (NSDictionary *)myUserInfo;
@end
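#import "MYUserInfoObject.h"
@implementation MYUserInfoObject
// The table that associates user info dictionaries with instances
static NSMapTable *_MYSimulatedIVarMapTable = NULL;
+ (NSMapTable *)_mySimulatedIVarMapTable
// Returns the shared table, creating it the first time it is needed
{
    if(NULL == _MYSimulatedIVarMapTable)
    {
        // Keys are not retained; values (the dictionaries) are retained
        _MYSimulatedIVarMapTable = NSCreateMapTable(
            NSNonRetainedObjectMapKeyCallBacks, NSObjectMapValueCallBacks, 16);
    }
    return _MYSimulatedIVarMapTable;
}
- (void)dealloc
// Removes any associated dictionary and then calls the inherited -dealloc
{
    NSMapRemove([[self class] _mySimulatedIVarMapTable], self);
    [super dealloc];
}
- (void)mySetUserInfo:(NSDictionary *)aDictionary
// Associates a copy of aDictionary with the receiver
{
    NSDictionary *newDictionary = [aDictionary copy];
    NSMapInsert([[self class] _mySimulatedIVarMapTable], self, newDictionary);
    [newDictionary release]; // the table retains the dictionary
}
- (NSDictionary *)myUserInfo
// Returns the dictionary associated with the receiver or nil if there is none
{
    return (NSDictionary *)NSMapGet([[self class] _mySimulatedIVarMapTable],
        self);
}
@end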
As a proper subclass of NSObject, the MYUserInfoObject implementation is
able to call the implementation of -dealloc inherited from NSObject. This
capability removes the risk of problems developing if Apple ever changes the
implementation of NSObject's -dealloc method.
To use the MYUserInfoObject class in an application and assure that all
instances of NSObject or any of its subclasses have the simulated instance
variable, the application must send the following message to the
MYUserInfoObject class prior to the creation of any object instances:
[MYUserInfoObject poseAsClass:[NSObject class]];
To ensure that no instances of NSObject have been created yet, one of the
best places to ask one class to pose as another is the first statement
inside the main() function. For example, an Application Kit based
application might have a main() function implemented as follows:
int main(int argc, char *argv[])
{
    [MYUserInfoObject poseAsClass:[NSObject class]];
    return NSApplicationMain(argc, (const char **)argv);
}
Apple's Objective-C++ compiler allows mixing of Objective-C code with C++
code. When using C++, arbitrary user code may be executed before the main()
function is called. Special consideration of when and where to ask one
class to pose as another is critical: C++ code may create instances of
Objective-C objects even before main() is called.
Examples in Cocoa
Many Cocoa classes including NSAttributedString, NSFileManager,
NSNotification, and NSProcessInfo use the Associative Storage pattern
extensively. The pattern can be used to simulate instance variables in your
own code. The opposite is possible too. Cocoa uses the related key value
coding system to provide access to instance variables of any object as if
the true instance variables were all simulated with Associative Storage.
Associative Storage also provides the basis of Cocoa's keyed archiving
system described in Chapter 13, "Archiving and Unarchiving."
The use of NSDictionary to provide Associative Storage for arbitrary
properties of NSNotification and NSFileManager objects has already been
mentioned. NSNotification provides the -userInfo method that returns a
dictionary containing arbitrary keys and values.
NSFileManager's -fileAttributesAtPath:traverseLink: method returns a
dictionary that stores the subset of possible file attribute key/value pairs
available for a file. Another prominent example is the dictionary of text
formatting attributes stored by NSAttributedString instances. Each string
can have different attributes, and the set of possible attributes is
open-ended. Using a dictionary to store attributes enables the storage of
custom application-specific attributes without the need to subclass
NSAttributedString. The NSProcessInfo class provides the -environment method
that returns a dictionary of environment variable name/value pairs that are
defined for a running process. Once again, because the collection of
variable names and values is open-ended, using the Associative Storage
pattern is the perfect solution.
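Two of these dictionaries can be read with a few lines of code. In this
sketch, the function names and the @"MYAttemptCount" key are hypothetical and
chosen only for illustration; the -environment and -userInfo methods are the
Cocoa methods named above.

```objc
#import <Foundation/Foundation.h>

static NSString *MYEnvironmentValueForName(NSString *aName)
// Returns the value of the named environment variable, or nil if the
// variable is not defined for the running process
{
    NSDictionary *environment = [[NSProcessInfo processInfo] environment];
    return [environment objectForKey:aName];
}

static NSNumber *MYAttemptCountFromNotification(NSNotification *aNotification)
// Returns the value stored under the hypothetical @"MYAttemptCount" key in
// the notification's user info dictionary, or nil if there is none
{
    return [[aNotification userInfo] objectForKey:@"MYAttemptCount"];
}
```

Neither function needs to know which other keys the dictionaries contain,
which is what allows the sets of environment variables and notification
values to remain open-ended.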
Reference Counted Memory Management
Cocoa uses the Associative Storage pattern to store the reference count
needed to implement reference counted memory management. The following
example describes a hypothetical MYRefCounted category of NSObject that
stores a reference count for each object using Associative Storage in much
the same way it is implemented in Cocoa. The code shows the basic technique
and highlights some of the advantages and disadvantages of using Associative
Storage:
#import <Foundation/Foundation.h>
@interface NSObject (MYRefCounted)
- (int)retainCount;
- (id)retain;
- (void)release;
@end
The -retainCount, -retain, and -release methods form the core of Cocoa's
reference counted memory management support. Another critical
method, -autorelease, and the NSAutoreleasePool class used to
support -autorelease are not shown here but are described in Chapter 23.
#import "MYRefCounted.h"
@implementation NSObject (MYRefCounted)
// The table that stores reference counts greater than 1
static NSMapTable *_MYRefCountMapTable = NULL;
+ (NSMapTable *)_myRefCountMapTable
// Provides access to the table used to store reference counts
{
    if(NULL == _MYRefCountMapTable)
    {
        _MYRefCountMapTable =
            NSCreateMapTable(NSNonRetainedObjectMapKeyCallBacks,
            NSIntMapValueCallBacks, 16);
    }
    return _MYRefCountMapTable;
}
- (int)retainCount
// Returns the receiver's current reference count
{
    int result = 1; // if receiver is not in table, its count is 1
    void *tableValue = NSMapGet([[self class] _myRefCountMapTable], self);
    if(NULL != tableValue)
    { // if receiver is in table, its count is the value stored
        result = (int)(intptr_t)tableValue;
    }
    return result;
}
- (id)retain
// Increases the receiver's reference count
{
    // store the increased value in the table
    NSMapInsert([[self class] _myRefCountMapTable], self,
        (void *)(intptr_t)([self retainCount] + 1));
    return self;
}
- (void)release
// Decreases the receiver's reference count and deallocates if it reaches zero
{
    int currentRetainCount = [self retainCount];
    if(1 == currentRetainCount)
    { // the reference count is about to reach zero so deallocate
        // there is no need to remove receiver from table now because if its
        // reference count is 1, it is not in the table
        [self dealloc];
    }
    else if(2 == currentRetainCount)
    {
        // remove the receiver from the table to indicate that its reference
        // count is 1
        NSMapRemove([[self class] _myRefCountMapTable], self);
    }
    else
    { // store the decreased value in the table
        NSMapInsert([[self class] _myRefCountMapTable], self,
            (void *)(intptr_t)(currentRetainCount - 1));
    }
}
@end
Objects that are not stored in _MYRefCountMapTable have an implicit
reference count of 1. For example, newly allocated objects aren't stored in
the table and therefore have a reference count of 1, and no extra storage
beyond the storage needed for instance variables is needed. Working on the
assumption that at any given time, almost all objects have a reference count
of 1, this system makes efficient use of memory.
Each time an object is retained by calling the -retain method, its reference
count increases and is stored in the table. Each time an object is released
via the -release method, the associated reference count stored in the table
is decreased. If the reference count decreases to 1, the association is
removed from the table. If the reference count decreases to zero, the object
is immediately deallocated.
Key Value Coding
Much of this chapter has focused on simulating additional instance variables
with the Associative Storage pattern. Cocoa's key value coding does exactly
the opposite. It provides access to an object's instance variables using
semantics similar to Associative Storage. Key value coding provides access
to an object's properties indirectly using string keys rather than through
accessor methods or direct instance variable references. Key value coding
was introduced to simplify the interaction of scripting languages with Cocoa
objects, but the technique has value in many contexts. The two principal
methods that implement key value coding are -takeValue:forKey:
and -valueForKey:. The -takeValue:forKey: method uses the string name
specified as a key to identify an accessor method or instance variable name.
If a suitable method or variable is identified, its value is set to the
value specified. Similarly, the -valueForKey: method returns a value
obtained by calling an accessor method or directly accessing the instance
variable identified by the key.
Cocoa's existing implementations of -takeValue:forKey: and -valueForKey:
first try to use an accessor method based on the key name. The set accessor
needs to have the form -set<key>: where <key> is the string used as a key.
The first letter in the key is made uppercase if it is not already so that
Cocoa's method naming convention of capitalizing all but the first word in a
method name is preserved. For example, if -takeValue:forKey: is called with
@"label" as the key, it will try to use an accessor method named -setLabel:
to set the value. When getting a value with -valueForKey:, the method tries
to use an accessor with the name -<key>. In this case, the first letter of
the key is converted to lower case if necessary so that the method starts
with a lower case letter. For example, calling -valueForKey: with @"label"
as the key ends up calling the -label method if it exists.
The key value system is very sophisticated: it falls back to using
accessors that start with underscore (_) characters and, if all else fails,
directly accesses instance variables with names derived from the strings
used as keys. The key value system allows programs to interact with
objects as if every object is a dictionary that associates its properties
with string keys. Key value coding is also described in Chapter 23.
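The accessor lookup and the fallback to direct instance variable access can
be observed in a short sketch. MYLabeledObject, its setterCallCount instance
variable, and MYKeyValueCodingExample are hypothetical names invented for this
illustration. The sketch uses -setValue:forKey:, the method that
replaced -takeValue:forKey: in later versions of Cocoa, and it assumes
automatic reference counting so that the setter stays short.

```objc
#import <Foundation/Foundation.h>

// Hypothetical class used only for this illustration
@interface MYLabeledObject : NSObject
{
    NSString *label;
    int setterCallCount; // counts calls to -setLabel:
}
- (void)setLabel:(NSString *)aString;
- (NSString *)label;
@end

@implementation MYLabeledObject
- (void)setLabel:(NSString *)aString
// Found by key value coding when the key is @"label"
{
    setterCallCount++;
    // Assumes automatic reference counting; with manual reference counting
    // the old value would also need to be released here
    label = [aString copy];
}
- (NSString *)label
// Found by key value coding when the key is @"label"
{
    return label;
}
@end

static NSString *MYKeyValueCodingExample(MYLabeledObject *anObject)
// Sets and then gets the label using only string keys
{
    [anObject setValue:@"Main window" forKey:@"label"];
    return [anObject valueForKey:@"label"];
}
```

After the function returns, asking the same object
for -valueForKey: with @"setterCallCount" returns an NSNumber containing 1.
There is no accessor for that key, so the value is read directly from the
instance variable, and the count shows that the set operation went through
the -setLabel: accessor.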
Consequences
Associative storage is very flexible and can be used to support
unanticipated features. The uses for a dictionary associated with each
NSObject instance are completely open-ended. However, accessing values
stored in a dictionary or map table is not as efficient as accessing
instance variables directly. An instance variable can typically be accessed
from program code with a single machine instruction, but associative storage
requires multiple method or function calls, calculation of hash values, and
indexed memory access into a table. Storing an associated value requires
memory for both the key and the value. If every object uses associated
storage, the memory required to store all of those keys and values exceeds
the memory that would have been required to store the values in instance
variables. If the need to store values is rare, using associative storage
can be a big win. Rather than storing unused instance variables in every
instance, memory is only reserved when the values are actually used.
The examples in this chapter use Associative Storage with categories, but
the technique is applicable in many circumstances. In fact, the drawbacks of
replacing methods like -dealloc with methods in categories are serious. When
subclassing is an option, simply adding instance variables in a subclass is
probably the best choice. One of the most flexible techniques for using
Associative Storage is to provide an NSDictionary object as an instance
variable. Cocoa's NSNotification class uses this approach to allow storage
of arbitrary data with each instance. The NSFileManager class uses a similar
technique to store file system specific information about files. The
flexibility is needed in the case of NSFileManager because each file being
managed could be stored in a different file system with different file
system specific attributes. Using a dictionary of attributes enables the
storage of pertinent attributes without requiring storage for attributes
that don't apply.
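The dictionary-as-instance-variable approach can be sketched as follows.
MYAnnotatedObject and its method names are hypothetical and are not taken from
Cocoa, and the sketch assumes automatic reference counting. The dictionary is
created only when the first value is stored, so instances that never store a
value pay only for one nil pointer.

```objc
#import <Foundation/Foundation.h>

// Hypothetical class (an assumption for illustration, not part of Cocoa)
@interface MYAnnotatedObject : NSObject
{
    NSMutableDictionary *userInfo; // nil until the first value is stored
}
- (void)mySetValue:(id)aValue forInfoKey:(NSString *)aKey;
- (id)myValueForInfoKey:(NSString *)aKey;
- (BOOL)myHasUserInfo;
@end

@implementation MYAnnotatedObject
- (void)mySetValue:(id)aValue forInfoKey:(NSString *)aKey
// Stores aValue (which must not be nil) under aKey
{
    if(nil == userInfo)
    {
        // Assumes automatic reference counting; with manual reference
        // counting a -dealloc method that releases userInfo is also needed
        userInfo = [[NSMutableDictionary alloc] init];
    }
    [userInfo setObject:aValue forKey:aKey];
}
- (id)myValueForInfoKey:(NSString *)aKey
// Returns the value stored under aKey or nil if there is none
{
    return [userInfo objectForKey:aKey]; // a message to nil returns nil
}
- (BOOL)myHasUserInfo
// Returns YES if storage for arbitrary values has been allocated
{
    return (nil != userInfo);
}
@end
```

Unlike the category examples, this approach requires control over the class
declaration, but it needs no shared table and no replacement of -dealloc in
NSObject.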
Author's Note:
Appendix A of "Cocoa Programming" includes section "Store the IMP for a
Replaced Method." This section provides a technique for calling a replaced
method implementation from within a category that replaces the method. This
approach could be used instead of posing to call the replaced
NSObject -dealloc implementation in the category example above.