Re: Using AVFoundation with AVCaptureAudioDataOutput for Audio playback

  • After filing a DTS incident, I found out the problem.
    The audioSettings property alone is not enough to convert the output to
    the necessary format.  First, I needed to add "[NSNumber
    numberWithBool:NO], AVLinearPCMIsNonInterleaved," to the dictionary so
    the samples come out interleaved.
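
    For reference, this is the full settings dictionary from my original
    message below, with that key added:

    NSDictionary *pAudioSettings = [NSDictionary dictionaryWithObjectsAndKeys:
        [NSNumber numberWithInt:kAudioFormatLinearPCM], AVFormatIDKey,
        [NSNumber numberWithInt:44100],                 AVSampleRateKey,
        [NSNumber numberWithInt:2],                     AVNumberOfChannelsKey,
        [NSNumber numberWithInt:16],                    AVLinearPCMBitDepthKey,
        [NSNumber numberWithBool:NO],                   AVLinearPCMIsFloatKey,
        [NSNumber numberWithBool:NO],                   AVLinearPCMIsBigEndianKey,
        [NSNumber numberWithBool:NO],                   AVLinearPCMIsNonInterleaved,  // the missing key
        nil];
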
    Upon receiving a sample, further PCM format conversion is still
    required.  This can be done using the "AudioConverterConvertBuffer"
    function.
    That function requires a source and a destination
    "AudioStreamBasicDescription".
    The source can be obtained by calling "CMSampleBufferGetFormatDescription"
    and passing the result to
    "CMAudioFormatDescriptionGetStreamBasicDescription".
    The destination is set up by the user; mine is set as follows:

    _destinationDescription.mFormatID         = kAudioFormatLinearPCM;
    _destinationDescription.mSampleRate       = AV::kSampleRate;
    _destinationDescription.mBitsPerChannel   = 16;
    _destinationDescription.mBytesPerFrame    = (2 * 16) / 8;  // 2 channels * 16 bits, in bytes
    _destinationDescription.mBytesPerPacket   = (2 * 16) / 8;
    _destinationDescription.mChannelsPerFrame = 2;
    _destinationDescription.mFormatFlags      = kLinearPCMFormatFlagIsPacked |
                                                kAudioFormatFlagsNativeEndian |
                                                kAudioFormatFlagIsSignedInteger;
    _destinationDescription.mFramesPerPacket  = 1;
    _destinationDescription.mReserved         = 0;
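
    As a rough sketch (the helper name and buffer handling are my own, and
    it assumes interleaved PCM in and out at the same sample rate, which is
    the only case "AudioConverterConvertBuffer" handles), the conversion
    step looks like this:

    #import <AudioToolbox/AudioToolbox.h>
    #import <CoreMedia/CoreMedia.h>

    // Convert the raw bytes of one sample buffer into the destination PCM
    // format.  pConverted must hold *ioConvertedSize bytes on entry; on
    // return *ioConvertedSize is the number of bytes actually written.
    static OSStatus ConvertSampleBytes(CMSampleBufferRef sampleBuffer,
                                       const void *pSourceBytes,
                                       UInt32 sourceSize,
                                       const AudioStreamBasicDescription *pDest,
                                       void *pConverted,
                                       UInt32 *ioConvertedSize)
    {
        // Source format, straight from the sample buffer
        const AudioStreamBasicDescription *pSource =
            CMAudioFormatDescriptionGetStreamBasicDescription(
                CMSampleBufferGetFormatDescription(sampleBuffer));

        AudioConverterRef converter = NULL;
        OSStatus err = AudioConverterNew(pSource, pDest, &converter);
        if(err != noErr) return err;

        // PCM-to-PCM at the same sample rate only; a sample rate change
        // would need AudioConverterFillComplexBuffer instead.
        err = AudioConverterConvertBuffer(converter, sourceSize, pSourceBytes,
                                          ioConvertedSize, pConverted);
        AudioConverterDispose(converter);
        return err;
    }

    In the delegate callback quoted below, this slots in right after
    "CMBlockBufferCopyDataBytes", with the converted buffer taking the place
    of pAudioData in the SDK structure.  In a real plugin you would also
    create the converter once and reuse it, rather than rebuilding it for
    every buffer.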

    Once the conversion was done, the audio sounded perfect!
    Hope anyone who comes across this problem finds this useful in the future.

    -San Saeteurn

    On 5/30/13 3:22 PM, "San Saeteurn" <sans...> wrote:

    > Hello
    >
    > I am trying to write an external plugin that simply reads audio from the
    > built-in microphone, obtains the raw audio data, and passes it to an SDK
    > API.
    > My setup is very simple: I have a single session with a single input
    > device (the built-in microphone) and a single output
    > (AVCaptureAudioDataOutput).
    > I initialize the Session, Input, and Output as follows:
    >
    > // Obtain the device and retain it
    > _pAudioCaptureInput = [pAVInputDevice retain];
    >
    > // Initialize Capture Session
    > _dispatchQueue = dispatch_get_main_queue();
    > _pCaptureSession = [[AVCaptureSession alloc] init];
    > _pCaptureSession.sessionPreset = AVCaptureSessionPresetHigh;
    >
    > // Put the Device inside an Input
    > pInputDevice = [[AVCaptureDeviceInput alloc] initWithDevice:_pAudioCaptureInput error:&pError];
    > [_pCaptureSession addInput:pInputDevice];  // Add Audio Input to Session
    > [pInputDevice release];
    > pInputDevice = nil;
    >
    >
    > // Setup the Output Configurations
    > NSDictionary *pAudioSettings = [NSDictionary dictionaryWithObjectsAndKeys:
    >     [NSNumber numberWithInt:kAudioFormatLinearPCM], AVFormatIDKey,
    >     [NSNumber numberWithInt:44100],                 AVSampleRateKey,
    >     [NSNumber numberWithInt:2],                     AVNumberOfChannelsKey,
    >     [NSNumber numberWithInt:16],                    AVLinearPCMBitDepthKey,
    >     [NSNumber numberWithBool:NO],                   AVLinearPCMIsFloatKey,
    >     [NSNumber numberWithBool:NO],                   AVLinearPCMIsBigEndianKey,
    >     nil];
    >
    > // Create the Output
    > _pAudioCaptureOutput = [[AVCaptureAudioDataOutput alloc] init];
    > [_pAudioCaptureOutput setAudioSettings:pAudioSettings];  // Apply Output Settings
    > [_pCaptureSession addOutput:_pAudioCaptureOutput];       // Add Output to Session
    >
    > // Create the Output Delegate that will receive the SampleBuffer callbacks
    > _pAudioCaptureDelegate = [[ACaptureDelegate alloc] init];
    > [_pAudioCaptureOutput setSampleBufferDelegate:_pAudioCaptureDelegate
    >                                         queue:_dispatchQueue];
    >
    > // Remember the Device Name
    > _deviceName = [_pAudioCaptureInput.localizedName UTF8String];
    >
    > I start the session running by calling startRunning on the session.  My
    > callback delegate is notified with audio samples.  The following is the
    > callback implementation:
    >
    >
    > - (void)captureOutput:(AVCaptureOutput *)captureOutput
    >         didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
    >         fromConnection:(AVCaptureConnection *)connection
    > {
    >     // Save the Timescale
    >     CMTime timestamp = CMSampleBufferGetOutputPresentationTimeStamp(sampleBuffer);
    >     if(_timescale == -1) {
    >         _timescale = timestamp.timescale;
    >     }
    >     CMItemCount numSamples = CMSampleBufferGetNumSamples(sampleBuffer);
    >
    >     // Get Audio Data
    >     CMBlockBufferRef dataBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
    >     zuint32 dataLength = CMBlockBufferGetDataLength(dataBuffer);
    >
    >     AVCaptureAudioDataOutput *pOutput = static_cast<AVCaptureAudioDataOutput *>(captureOutput);
    >     NSDictionary *pAudioSettings = pOutput.audioSettings;
    >
    >     // Retrieve Audio Settings Information from the Output
    >     NSNumber *audioSettingValue = nil;
    >
    >     zint32 sampleRate = 44100;
    >     audioSettingValue = [pAudioSettings objectForKey:AVSampleRateKey];
    >     if(audioSettingValue) {
    >         sampleRate = [audioSettingValue intValue];
    >     }
    >
    >     zint32 bitDepth = 16;
    >     audioSettingValue = [pAudioSettings objectForKey:AVLinearPCMBitDepthKey];
    >     if(audioSettingValue) {
    >         bitDepth = [audioSettingValue intValue];
    >     }
    >
    >     zint32 channels = 2;
    >     audioSettingValue = [pAudioSettings objectForKey:AVNumberOfChannelsKey];
    >     if(audioSettingValue) {
    >         channels = [audioSettingValue intValue];
    >     }
    >
    >     zuint8 *pAudioData = new zuint8[dataLength];
    >     OSStatus err = CMBlockBufferCopyDataBytes(dataBuffer, 0, dataLength, pAudioData);
    >     if(err == noErr) {
    >         // Populate SDK Audio structure
    >         _audioBuffer.format = kFormatPCM;
    >         _audioBuffer.dataLength = dataLength;
    >         _audioBuffer.timeStamp = 0;
    >         _audioBuffer.flags = kBufferFlagTimeValid;
    >         _audioBuffer.sampleRate = sampleRate;
    >         _audioBuffer.bitsPerSample = bitDepth;
    >         _audioBuffer.channels = channels;
    >         _audioBuffer.data = pAudioData;  // The Raw Audio Data
    >
    >         // Send Audio Data to SDK API
    >         _pAudioDataArriveCallback(_pAudioDataArriveContext, &_audioBuffer);
    >     }
    >
    >     // Free the copy whether or not the callback fired
    >     delete [] pAudioData;
    >     pAudioData = NULL;
    > }
    >
    > The problem is, the audio plays back at a higher pitch, like someone is
    > fast-forwarding.  If I adjust the sample rate in the output audio
    > settings, it affects the playback: the higher I set it, the less squeaky
    > it becomes, but it is never correct.  Is there an initialization step I
    > am missing?
    >
    > I have not been able to play the raw audio data back myself outside the
    > main app, as I am new to AVFoundation and CoreAudio.
    > If someone has sample code for getting this up and running, I would much
    > appreciate it.
    >
    > Thanks,
    > -San Saeteurn