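iOS supports AAC encoding, mainly through the AudioConverter API in AudioToolbox. I built an AAC encoder because I was working on an HLS feature: the TS files that HLS requires need H.264 video and AAC audio. H.264 can be encoded with either a hardware or a software encoder, as described earlier. AAC can likewise be encoded in hardware or in software, and iOS supports both.
First, create a converter, which in this case is the AAC encoder, using the following API: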
extern OSStatus
AudioConverterNew(const AudioStreamBasicDescription *inSourceFormat,
                  const AudioStreamBasicDescription *inDestinationFormat,
                  AudioConverterRef *outAudioConverter) __OSX_AVAILABLE_STARTING(__MAC_10_1,__IPHONE_2_0);
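The two input parameters are the source and destination data formats.
For AAC encoding, the source format is the captured PCM data and the destination format is AAC.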
AudioStreamBasicDescription inAudioStreamBasicDescription;
// FillOutASBDForLPCM()
inAudioStreamBasicDescription.mFormatID = kAudioFormatLinearPCM;
inAudioStreamBasicDescription.mSampleRate = 44100;
inAudioStreamBasicDescription.mBitsPerChannel = 16;
inAudioStreamBasicDescription.mFramesPerPacket = 1;
inAudioStreamBasicDescription.mBytesPerFrame = 2;
inAudioStreamBasicDescription.mBytesPerPacket = inAudioStreamBasicDescription.mBytesPerFrame * inAudioStreamBasicDescription.mFramesPerPacket;
inAudioStreamBasicDescription.mChannelsPerFrame = 1;
inAudioStreamBasicDescription.mFormatFlags = kLinearPCMFormatFlagIsPacked | kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsNonInterleaved;
inAudioStreamBasicDescription.mReserved = 0;
AudioStreamBasicDescription outAudioStreamBasicDescription = {0}; // Always initialize the fields of a new audio stream basic description structure to zero, as shown here: ...
outAudioStreamBasicDescription.mChannelsPerFrame = 1;
outAudioStreamBasicDescription.mFormatID = kAudioFormatMPEG4AAC;
UInt32 size = sizeof(outAudioStreamBasicDescription);
AudioFormatGetProperty(kAudioFormatProperty_FormatInfo, 0, NULL, &size, &outAudioStreamBasicDescription);
OSStatus status = AudioConverterNew(&inAudioStreamBasicDescription, &outAudioStreamBasicDescription, &_audioConverter);
if(status != 0) {NSLog(@"setup converter failed: %d", (int)status);}
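This creates the AAC encoder. By default, Apple creates a hardware encoder and falls back to a software encoder if the hardware is unavailable.
In my tests, the hardware AAC encoder has high encoding latency: it needs to buffer about 2 seconds of data before it starts encoding. The software encoder's latency is normal: it starts encoding as soon as it has been fed 1024 samples.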
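To put those two figures on the same scale, here is a small sketch in plain C that converts between samples, packets, and time. It assumes 1024 samples per AAC packet, as stated above; the function names are hypothetical and not part of AudioToolbox.

```c
#include <stdint.h>

/* Samples carried by one AAC packet, as described in the text above. */
#define AAC_SAMPLES_PER_PACKET 1024

/* Duration of one AAC packet in milliseconds at the given sample rate. */
static double aac_packet_duration_ms(double sample_rate_hz)
{
    return 1000.0 * AAC_SAMPLES_PER_PACKET / sample_rate_hz;
}

/* Number of whole AAC packets covered by a buffering delay in seconds. */
static uint32_t aac_packets_in_delay(double delay_seconds, double sample_rate_hz)
{
    return (uint32_t)(delay_seconds * sample_rate_hz / AAC_SAMPLES_PER_PACKET);
}
```

At 44.1 kHz one packet lasts roughly 23.2 ms, so a 2-second startup buffer corresponds to about 86 packets, against a single packet for the software encoder.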
So how do you request the software encoder at creation time? That requires the following helper:
- (AudioClassDescription *)getAudioClassDescriptionWithType:(UInt32)type
                                           fromManufacturer:(UInt32)manufacturer
{
    // Static storage so the returned pointer stays valid; not thread-safe.
    static AudioClassDescription desc;
    UInt32 encoderSpecifier = type;
    OSStatus st;
    UInt32 size;

    // First ask how many bytes the list of matching encoders occupies.
    st = AudioFormatGetPropertyInfo(kAudioFormatProperty_Encoders,
                                    sizeof(encoderSpecifier),
                                    &encoderSpecifier,
                                    &size);
    if (st) {
        NSLog(@"error getting audio format property info: %d", (int)(st));
        return NULL;
    }

    unsigned int count = size / sizeof(AudioClassDescription);
    if (count == 0) {
        return NULL; // no encoder for this format; also avoids a zero-length array
    }

    // Then fetch the encoder descriptions themselves.
    AudioClassDescription descriptions[count];
    st = AudioFormatGetProperty(kAudioFormatProperty_Encoders,
                                sizeof(encoderSpecifier),
                                &encoderSpecifier,
                                &size,
                                descriptions);
    if (st) {
        NSLog(@"error getting audio format property: %d", (int)(st));
        return NULL;
    }

    // Pick the entry that matches both the format and the manufacturer.
    for (unsigned int i = 0; i < count; i++) {
        if ((type == descriptions[i].mSubType) &&
            (manufacturer == descriptions[i].mManufacturer)) {
            memcpy(&desc, &(descriptions[i]), sizeof(desc));
            return &desc;
        }
    }
    return NULL;
}
AudioClassDescription *desc = [self getAudioClassDescriptionWithType:kAudioFormatMPEG4AAC
fromManufacturer:kAppleSoftwareAudioCodecManufacturer];
OSStatus status = AudioConverterNewSpecific(&inAudioStreamBasicDescription, &outAudioStreamBasicDescription, 1, desc, &_audioConverter);
For encoding to work correctly, the encode bit rate must be set. Otherwise the encode call returns error code 560226676 ('!dat').
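Core Audio status codes such as 560226676 are often four-character codes packed into a 32-bit integer, which is why that value reads as '!dat'. The following plain C sketch shows the conversion; `format_osstatus` is a hypothetical helper name and it takes an `int32_t` so that it does not depend on Apple headers.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdio.h>

/* Render a status code as a quoted four-character code when all four bytes
   are printable; otherwise fall back to the decimal value.
   `out` must have room for at least 16 bytes. */
static void format_osstatus(int32_t status, char out[16])
{
    uint32_t code = (uint32_t)status;
    unsigned char c[4];

    /* The first character lives in the most significant byte. */
    c[0] = (unsigned char)((code >> 24) & 0xFF);
    c[1] = (unsigned char)((code >> 16) & 0xFF);
    c[2] = (unsigned char)((code >> 8) & 0xFF);
    c[3] = (unsigned char)(code & 0xFF);

    if (isprint(c[0]) && isprint(c[1]) && isprint(c[2]) && isprint(c[3])) {
        snprintf(out, 16, "'%c%c%c%c'", c[0], c[1], c[2], c[3]);
    } else {
        snprintf(out, 16, "%d", (int)status);
    }
}
```

Logging the status this way makes errors like '!dat' recognizable at a glance instead of appearing as an opaque decimal number.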
UInt32 ulBitRate = 64000;
UInt32 ulSize = sizeof(ulBitRate);
status = AudioConverterSetProperty(_audioConverter, kAudioConverterEncodeBitRate, ulSize, &ulBitRate);
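Note that AAC does not accept arbitrary bit rates. For example, with a PCM sample rate of 44.1 kHz (44100 Hz) the bit rate can be set to 64000 bps; at 16 kHz it can be set to 32000 bps.
After creating the converter and setting the bit rate, query the maximum encoded output size, which will be needed later.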
UInt32 value = 0;
size = sizeof(value);
AudioConverterGetProperty(_audioConverter, kAudioConverterPropertyMaximumOutputPacketSize, &size, &value);
The returned value is the largest packet size the encoder can output, in bytes.
Then call AudioConverterFillComplexBuffer to encode:
AudioBufferList outAudioBufferList = {0};
outAudioBufferList.mNumberBuffers = 1;
outAudioBufferList.mBuffers[0].mNumberChannels = 1;
outAudioBufferList.mBuffers[0].mDataByteSize = value; // value is the maximum packet size queried above
outAudioBufferList.mBuffers[0].mData = malloc(value); // free() this once the encoded data has been consumed
UInt32 ioOutputDataPacketSize = 1;
status = AudioConverterFillComplexBuffer(_audioConverter, inInputDataProc, (__bridge void *)(self), &ioOutputDataPacketSize, &outAudioBufferList, NULL);
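In the encode call, inInputDataProc is the input-data callback, which feeds PCM data to the converter. Setting ioOutputDataPacketSize to 1 means the call returns as soon as one encoded packet (one AAC frame) has been produced. outAudioBufferList receives the encoded data.
inInputDataProc is implemented as follows: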
static OSStatus inInputDataProc(AudioConverterRef inAudioConverter,
                                UInt32 *ioNumberDataPackets,
                                AudioBufferList *ioData,
                                AudioStreamPacketDescription **outDataPacketDescription,
                                void *inUserData)
{
    AACEncoder *encoder = (__bridge AACEncoder *)(inUserData);
    UInt32 requestedPackets = *ioNumberDataPackets;
    uint8_t *buffer;
    // One PCM packet is one 16-bit mono frame, i.e. 2 bytes.
    uint32_t bufferLength = requestedPackets * 2;
    uint32_t bufferRead;

    bufferRead = [encoder.pcmPool readBuffer:&buffer withLength:bufferLength];
    if (bufferRead == 0) {
        // No PCM available: report zero packets and return a non-zero status.
        *ioNumberDataPackets = 0;
        return -1;
    }

    ioData->mBuffers[0].mData = buffer;
    ioData->mBuffers[0].mDataByteSize = bufferRead;
    ioData->mNumberBuffers = 1;
    ioData->mBuffers[0].mNumberChannels = 1;
    *ioNumberDataPackets = bufferRead >> 1; // bytes to 16-bit frames
    return noErr;
}
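pcmPool is a ring buffer that holds PCM data.
Because a capture callback does not necessarily deliver exactly 1024 samples, the data can be buffered first and the encoder called once 1024 samples are available.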
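The implementation of pcmPool is not shown here. As a rough illustration of the buffering idea only, the following plain C sketch accumulates 16-bit mono samples and hands them out in blocks of 1024. All names (`PcmRing`, `pcm_ring_write`, `pcm_ring_read`, `pcm_ring_has_packet`) and the capacity are assumptions made for this example, and unlike the readBuffer:withLength: call above it copies data out rather than returning a pointer into the buffer.

```c
#include <stddef.h>
#include <stdint.h>

#define PCM_RING_CAPACITY 8192        /* samples; arbitrary size for this sketch */
#define AAC_SAMPLES_PER_PACKET 1024   /* samples the encoder needs per packet */

typedef struct {
    int16_t samples[PCM_RING_CAPACITY];
    size_t  head;   /* index of the oldest stored sample */
    size_t  count;  /* number of samples currently stored */
} PcmRing;

/* Append up to n samples; returns how many were stored.
   Samples that do not fit are dropped. */
static size_t pcm_ring_write(PcmRing *r, const int16_t *src, size_t n)
{
    size_t written = 0;
    while (written < n && r->count < PCM_RING_CAPACITY) {
        size_t tail = (r->head + r->count) % PCM_RING_CAPACITY;
        r->samples[tail] = src[written++];
        r->count++;
    }
    return written;
}

/* Remove exactly n samples into dst.
   Returns 0 and consumes nothing if fewer than n samples are stored. */
static size_t pcm_ring_read(PcmRing *r, int16_t *dst, size_t n)
{
    if (r->count < n) {
        return 0;
    }
    for (size_t i = 0; i < n; i++) {
        dst[i] = r->samples[(r->head + i) % PCM_RING_CAPACITY];
    }
    r->head = (r->head + n) % PCM_RING_CAPACITY;
    r->count -= n;
    return n;
}

/* Non-zero once a full AAC packet's worth of samples is buffered. */
static int pcm_ring_has_packet(const PcmRing *r)
{
    return r->count >= AAC_SAMPLES_PER_PACKET;
}
```

In this scheme the capture side calls `pcm_ring_write` with whatever it receives, and the encode side runs only while `pcm_ring_has_packet` is true. A real implementation shared between a capture thread and an encoder thread would also need locking or a lock-free design, which this sketch omits.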
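In addition, for TS files each AAC packet needs an ADTS header prepended. The ADTS header is 7 bytes long and carries the AAC encoding parameters, so that the decoder knows how to decode the data.
The ADTS header is computed as follows: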
- (NSData*) adtsDataForPacketLength:(NSUInteger)packetLength {
    int adtsLength = 7;
    char *packet = (char *)malloc(sizeof(char) * adtsLength);
    // Variables Recycled by addADTStoPacket
    int profile = 2; // AAC LC
    //39=MediaCodecInfo.CodecProfileLevel.AACObjectELD;
    int freqIdx = 8; // 16 kHz; must match the encoder's sample rate (4 = 44.1 kHz)
    int chanCfg = 1; // MPEG-4 Audio Channel Configuration. 1 Channel front-center
    NSUInteger fullLength = adtsLength + packetLength; // frame length includes the header
    // fill in ADTS data
    packet[0] = (char)0xFF; // 11111111 = syncword
    packet[1] = (char)0xF9; // 1111 1 00 1 = syncword, MPEG-2, layer 00, no CRC
    packet[2] = (char)(((profile-1)<<6) + (freqIdx<<2) + (chanCfg>>2));
    packet[3] = (char)(((chanCfg&3)<<6) + (fullLength>>11));
    packet[4] = (char)((fullLength&0x7FF) >> 3);
    packet[5] = (char)(((fullLength&7)<<5) + 0x1F);
    packet[6] = (char)0xFC;
    // NSData takes ownership of the malloc'ed bytes and frees them.
    NSData *data = [NSData dataWithBytesNoCopy:packet length:adtsLength freeWhenDone:YES];
    return data;
}
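The method above hard-codes freqIdx = 8 (16 kHz), so the header has to be adjusted by hand when the encoder runs at another rate such as 44.1 kHz. As a sketch of one way to avoid that, the following plain C version looks up the index from the standard MPEG-4 sampling frequency table and uses the same bit layout as the method above. The function names are hypothetical, and the range checks are assumptions added for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* MPEG-4 sampling frequency index for a sample rate,
   or -1 if the rate has no index in the table. */
static int adts_sampling_index(uint32_t sample_rate_hz)
{
    static const uint32_t rates[] = {
        96000, 88200, 64000, 48000, 44100, 32000, 24000,
        22050, 16000, 12000, 11025, 8000, 7350
    };
    for (size_t i = 0; i < sizeof(rates) / sizeof(rates[0]); i++) {
        if (rates[i] == sample_rate_hz) {
            return (int)i;
        }
    }
    return -1;
}

/* Write a 7-byte ADTS header (no CRC) for one AAC packet of payload_len bytes.
   Returns 0 on success, or -1 if a field cannot be represented. */
static int adts_write_header(uint8_t out[7], int object_type,
                             uint32_t sample_rate_hz, int channel_cfg,
                             size_t payload_len)
{
    int freq_idx = adts_sampling_index(sample_rate_hz);
    size_t full_len = payload_len + 7;   /* frame length includes the header */

    if (freq_idx < 0 || object_type < 1 || object_type > 4 ||
        channel_cfg < 1 || channel_cfg > 7 || full_len > 0x1FFF) {
        return -1;                       /* frame length is a 13-bit field */
    }

    out[0] = 0xFF;                       /* syncword, high 8 bits */
    out[1] = 0xF9;                       /* syncword low 4 bits, MPEG-2, layer 00, no CRC */
    out[2] = (uint8_t)(((object_type - 1) << 6) | (freq_idx << 2) | (channel_cfg >> 2));
    out[3] = (uint8_t)(((channel_cfg & 3) << 6) | (full_len >> 11));
    out[4] = (uint8_t)((full_len & 0x7FF) >> 3);
    out[5] = (uint8_t)(((full_len & 7) << 5) | 0x1F);
    out[6] = 0xFC;                       /* buffer fullness low bits, one raw data block */
    return 0;
}
```

Calling `adts_write_header(header, 2, 44100, 1, packetLength)` would produce a header for the 44.1 kHz mono AAC LC configuration set up earlier, while passing 16000 reproduces the index 8 used in the method above.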