FROM : Chris Kane
DATE : Fri Feb 01 20:17:10 2008
A couple comments on the NSOperationQueue usage ...
(1) I would say that since you are choosing to only spawn maxCores
operations, you should divide up the s_importFileSet into maxCores
different arrays, and then you wouldn't have to share importKeys in
processFilesForKeys: and have to lock around it.
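
A minimal sketch of suggestion (1), assuming manual retain/release as in the quoted code. `SplitKeys` is a hypothetical helper, not something from the original sample. It deals the keys round-robin into private arrays, so each worker owns its array and the spin lock around importKeys is no longer needed:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: deal `keys` round-robin into `count` private arrays.
// Each array is meant to be handed to exactly one worker, so no locking
// is needed while a worker drains it.
static NSArray *SplitKeys(NSSet *keys, NSUInteger count) {
    if (count == 0) {
        count = 1;  // avoid a modulo-by-zero below
    }
    NSMutableArray *buckets = [NSMutableArray arrayWithCapacity:count];
    for (NSUInteger i = 0; i < count; i++) {
        [buckets addObject:[NSMutableArray array]];
    }
    NSUInteger next = 0;
    for (NSString *key in keys) {
        [[buckets objectAtIndex:(next % count)] addObject:key];
        next++;
    }
    return buckets;
}
```

Each resulting array would then be passed as the `object:` of its own NSInvocationOperation in place of the shared keyQueues. Round-robin dealing is arbitrary with respect to file cost, so one array can still end up with the expensive files.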
The shared queue importKeys style is more suited to the NSThread
approach. But, if the threads block on I/O, you're potentially
underutilizing cores while those maxCores threads (or operations) wait.
But, by blindly splitting s_importFileSet into N pieces, one thread
might get all the really cheap files to import and one thread might
get all the expensive ones, making the start-to-finish latency more
than it'd have to be.
If one ignores all other potential issues where things might be
fighting one another (processor cache affinities, kernel file buffer
cache space, potential global locks in lower layers, per-operation
RAM usage, heat and power-management and clock speed issues in the
processors, etc.), then ...
(2) I would say that the most natural decomposition of the overall
task here (wrt NSOperationQueue) would be to create one NSOperation
per element in s_importFileSet, throw those into the
NSOperationQueue, and let it grind away. Let it and the kernel worry
about maxCores. There's bound to be plenty of blocking in I/O, and
the optimal number of operations is likely more than -
activeProcessorCount (some can run with the data they have while
others block waiting for their data). If you identify the work to be
done using operations (rather than embedding it implicitly in a
loop-until-importKeys-empty loop), the kernel can run operations
that already have their data while others block waiting for the
disk/network hardware to fetch theirs.
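
Suggestion (2) might look roughly like the following, again assuming manual retain/release as in the quoted code. `EnqueueOnePerKey` is a hypothetical helper, and the selector passed to it is assumed to be a method that imports exactly one file for the key it receives:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: create one operation per key and hand them all to
// the queue. No concurrency limit is set here; the queue decides how many
// operations run at once.
static void EnqueueOnePerKey(NSOperationQueue *queue, NSSet *keys,
                             id target, SEL selector) {
    for (NSString *key in keys) {
        // The selector must take a single object argument (the key).
        NSInvocationOperation *op =
            [[NSInvocationOperation alloc] initWithTarget:target
                                                 selector:selector
                                                   object:key];
        [queue addOperation:op];
        [op release];  // the queue keeps the operation alive until it finishes
    }
}
```

The caller can still use -waitUntilAllOperationsAreFinished to detect completion. Since the quoted code builds one NSManagedObjectContext per worker, each per-file operation would presumably need its own context as well, so the extra setup cost per file is worth measuring.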
Chris Kane
Cocoa Frameworks, Apple
On Jan 30, 2008, at 4:38 PM, Ben Trumbull wrote:
> Alex,
>
> Conceptually, you should treat NSOperations as if they were on a
> separate thread. The OS may take certain liberties with
> implementation details based on system load and other factors, but
> basically NSOperations might as well be described as a lightweight
> mechanism for creating threaded tasks.
>
> Importing tasks are often easily parallelizable by simply importing
> 1/Nth of the data on a thread/operation. Here's an excerpt of some
> code I've been working with recently. It's GC and non-GC
> compatible, and has 3 implementations for comparison: NSOperation,
> NSThread, and boring serial code. As you can see, the NSOperation
> version is basically the same in terms of thread handling, but
> NSOperationQueue provides some convenient out-of-box handling for
> finding out when the tasks are complete. The NSThread code has
> whacky NSConditions and memory barriers.
>
> The key to making this pattern useful is that each element in the
> work queue ('keyQueues' below) is sufficiently large to be worth
> the overhead of queuing up. In this sample code, each key is a
> file path, so this is importing from a directory of files,
> importing 'maxCores' files simultaneously.
>
> This division of labor doesn't work if the data in each of the 1/N
> sets has relationships to data in other import groups.
>
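> static OSSpinLock _queueLock;
> static NSOperationQueue* _operationQueue;
> static NSDate *_startDate;
> // used below by the NSThread variant; not declared in the original excerpt
> static NSCondition* _condition;
> static volatile int32_t _notFinished;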
>
> #define USE_NSOPERATIONS 1
> // #define USE_NSTHREADS 1
>
> - (IBAction)createEntities:(id)sender
> {
> _startDate = [[NSDate date] retain];
>
> _operationQueue = [[NSOperationQueue alloc] init];
> NSUInteger j = 0;
> NSUInteger maxCores = [[NSProcessInfo processInfo]
> activeProcessorCount];
> NSMutableArray* keyQueues = [[NSMutableArray alloc] init];
> for(NSString* key in s_importFileSet) {
> [keyQueues addObject:key];
> }
> #if USE_NSOPERATIONS
> for (j = 0; j < maxCores; j++) {
> NSOperation* op = [[NSInvocationOperation alloc]
> initWithTarget:self selector:@selector(processFilesForKeys:)
> object:keyQueues];
> [_operationQueue addOperation:op];
> [op release];
> }
> #elif USE_NSTHREADS
> _condition = [[NSCondition alloc] init];
> _notFinished = maxCores;
> OSMemoryBarrier();
> for (j = 0; j < maxCores; j++) {
> [NSThread detachNewThreadSelector:@selector
> (processFilesForKeys:) toTarget:self withObject:keyQueues];
> }
> #else
> // serial: a single call drains the whole key array on this thread
> [self processFilesForKeys:keyQueues];
> #endif
> [NSThread detachNewThreadSelector:@selector
> (finishImportOperation:) toTarget:self withObject:keyQueues];
> }
>
> - (void)finishImportOperation:(id)keys {
> NSAutoreleasePool* pool = [[NSAutoreleasePool alloc] init];
> #if USE_NSOPERATIONS
> [_operationQueue waitUntilAllOperationsAreFinished];
> #elif USE_NSTHREADS
> [_condition lock];
> while (_notFinished > 0) {
> [_condition wait];
> }
> [_condition unlock];
> #else
> #endif
> [keys release];
> [_operationQueue release];
> _operationQueue = nil;
> NSLog(@"Total create time %f", [[NSDate date]
> timeIntervalSinceDate:_startDate] );
> [_startDate release];
> [pool drain];
> }
>
> - (void)processFilesForKeys:(NSMutableArray*)importKeys {
> NSAutoreleasePool* pool = [[NSAutoreleasePool alloc] init];
> DataDumpImporter* importer = [[DataDumpImporter alloc] init];
>
> NSPersistentStoreCoordinator* mainPSC = [[appDelegate
> managedObjectContext] persistentStoreCoordinator];
> NSPersistentStoreCoordinator* psc =
> [[NSPersistentStoreCoordinator alloc] initWithManagedObjectModel:
> [mainPSC managedObjectModel]];
> [psc addPersistentStoreWithType:NSSQLiteStoreType
> configuration:nil URL:[[[mainPSC persistentStores] lastObject] URL]
> options:[NSDictionary dictionaryWithObject:[NSDictionary
> dictionaryWithObject:@"0" forKey:@"synchronous"]
> forKey:NSSQLitePragmasOption] error:nil];
> // we disable synchronous because if an import fails, we can
> // delete the file and re-import.
> // if you can't just delete the file, don't do this
>
> NSManagedObjectContext *moc = [[NSManagedObjectContext alloc] init];
> [moc setPersistentStoreCoordinator:psc];
> [psc release];
> [importer setImportPath:[self importPath]];
> [importer setMoc:moc];
>
> [moc setUndoManager:nil];
>
> while (1) {
> NSString* key = nil;
> OSSpinLockLock(&_queueLock);
> key = [importKeys lastObject];
> if (key) {
> [importKeys removeLastObject];
> }
> OSSpinLockUnlock(&_queueLock);
> if (!key) {
> break;
> }
> @try {
> DataDumpImporterParams *params =
> [s_entityImporterParams objectForKey:key];
> [importer importFile:[params filename] usingEntity:key
> andFlags:[params flags]];
> } @catch (id e) {
> NSLog(@"e = %@", e);
> }
> }
> [importer release];
> [moc release];
> [pool drain];
> OSAtomicDecrement32Barrier(&_notFinished);
> [_condition lock];
> [_condition signal];
> [_condition unlock];
> }
>
> --
>
> -Ben
> _______________________________________________
>
> Cocoa-dev mailing list (<email_removed>)
>
> Please do not post admin requests or moderator comments to the list.
> Contact the moderators at cocoa-dev-admins(at)lists.apple.com
>
> Help/Unsubscribe/Update your Subscription:
> http://lists.apple.com/mailman/options/cocoa-dev/<email_removed>
>
> This email sent to <email_removed>
Related mails:

| Author | Date |
|---|---|
| Alexander Griekspo… | Jan 30, 23:40 |
| Ben Trumbull | Jan 31, 01:38 |
| Alexander Griekspo… | Jan 31, 11:36 |
| Chris Kane | Feb 1, 20:17 |





