当前位置：首页 > news >正文

音视频入门基础：RTP专题（18）——FFmpeg源码中，获取RTP的音频信息的实现（上）

news 2025/7/13 3:43:38

由于本文篇幅较长，分为上、下两篇。

一、引言

通过FFmpeg命令可以获取到SDP描述的RTP流的的音频压缩编码格式、音频压缩编码格式的profile、音频采样率、通道数信息：

ffmpeg -protocol_whitelist "file,rtp,udp" -i XXX.sdp

而由《音视频入门基础：RTP专题（17）——音频的SDP媒体描述》可以知道，SDP协议中，a=rtpmap属性和a=fmtp属性中的config参数都会重复包含音频压缩编码格式、音频采样率、通道数音频这些信息。所以FFmpeg到底获取的是哪个地方的音频信息呢，本文为大家揭开谜底。

二、音频压缩编码格式

FFmpeg获取SDP描述的RTP流的音频压缩编码格式，是从SDP的a=rtpmap属性获取的。比如SDP中某一行的内容为：

a=rtpmap:97 MPEG4-GENERIC/48000/2

FFmpeg识别到上述“a=rtpmap”这个<type>后，会把后面的字符串“MPEG4-GENERIC”提取出来，检测是否存在相应的音视频压缩编码格式。

具体可以参考：《音视频入门基础：RTP专题（5）——FFmpeg源码中，解析SDP的实现》。

当识别到上述“a=rtpmap”这个<type>后，sdp_parse_line函数中会调用sdp_parse_rtpmap函数：

else if (av_strstart(p, "rtpmap:", &p) && s->nb_streams > 0) {/* NOTE: rtpmap is only supported AFTER the 'm=' tag */get_word(buf1, sizeof(buf1), &p);payload_type = atoi(buf1);rtsp_st = rt->rtsp_streams[rt->nb_rtsp_streams - 1];if (rtsp_st->stream_index >= 0) {st = s->streams[rtsp_st->stream_index];sdp_parse_rtpmap(s, st, rtsp_st, payload_type, p);}s1->seen_rtpmap = 1;if (s1->seen_fmtp) {parse_fmtp(s, rt, payload_type, s1->delayed_fmtp);}}

sdp_parse_rtpmap函数中又会调用ff_rtp_handler_find_by_name函数：

/* parse the rtpmap description: <codec_name>/<clock_rate>[/<other params>] */
static int sdp_parse_rtpmap(AVFormatContext *s,AVStream *st, RTSPStream *rtsp_st,int payload_type, const char *p)
{
//...if (par->codec_id == AV_CODEC_ID_NONE) {const RTPDynamicProtocolHandler *handler =ff_rtp_handler_find_by_name(buf, par->codec_type);//...}
//...
}

ff_rtp_handler_find_by_name函数定义如下，可以看到其内部调用了rtp_handler_iterate函数：

const RTPDynamicProtocolHandler *ff_rtp_handler_find_by_name(const char *name,enum AVMediaType codec_type)
{void *i = 0;const RTPDynamicProtocolHandler *handler;while (handler = rtp_handler_iterate(&i)) {if (handler->enc_name &&!av_strcasecmp(name, handler->enc_name) &&codec_type == handler->codec_type)return handler;}return NULL;
}

rtp_handler_iterate函数定义如下，可以看到该函数内部遍历了全局数组rtp_dynamic_protocol_handler_list：

static const RTPDynamicProtocolHandler *rtp_handler_iterate(void **opaque)
{uintptr_t i = (uintptr_t)*opaque;const RTPDynamicProtocolHandler *r = rtp_dynamic_protocol_handler_list[i];if (r)*opaque = (void*)(i + 1);return r;
}

rtp_dynamic_protocol_handler_list数组定义如下：

static const RTPDynamicProtocolHandler *const rtp_dynamic_protocol_handler_list[] = {/* rtp */&ff_ac3_dynamic_handler,&ff_amr_nb_dynamic_handler,&ff_amr_wb_dynamic_handler,&ff_dv_dynamic_handler,&ff_g726_16_dynamic_handler,&ff_g726_24_dynamic_handler,&ff_g726_32_dynamic_handler,&ff_g726_40_dynamic_handler,&ff_g726le_16_dynamic_handler,&ff_g726le_24_dynamic_handler,&ff_g726le_32_dynamic_handler,&ff_g726le_40_dynamic_handler,&ff_h261_dynamic_handler,&ff_h263_1998_dynamic_handler,&ff_h263_2000_dynamic_handler,&ff_h263_rfc2190_dynamic_handler,&ff_h264_dynamic_handler,&ff_hevc_dynamic_handler,&ff_ilbc_dynamic_handler,&ff_jpeg_dynamic_handler,&ff_mp4a_latm_dynamic_handler,&ff_mp4v_es_dynamic_handler,&ff_mpeg_audio_dynamic_handler,&ff_mpeg_audio_robust_dynamic_handler,&ff_mpeg_video_dynamic_handler,&ff_mpeg4_generic_dynamic_handler,&ff_mpegts_dynamic_handler,&ff_ms_rtp_asf_pfa_handler,&ff_ms_rtp_asf_pfv_handler,&ff_qcelp_dynamic_handler,&ff_qdm2_dynamic_handler,&ff_qt_rtp_aud_handler,&ff_qt_rtp_vid_handler,&ff_quicktime_rtp_aud_handler,&ff_quicktime_rtp_vid_handler,&ff_rfc4175_rtp_handler,&ff_svq3_dynamic_handler,&ff_theora_dynamic_handler,&ff_vc2hq_dynamic_handler,&ff_vorbis_dynamic_handler,&ff_vp8_dynamic_handler,&ff_vp9_dynamic_handler,&gsm_dynamic_handler,&l24_dynamic_handler,&opus_dynamic_handler,&realmedia_mp3_dynamic_handler,&speex_dynamic_handler,&t140_dynamic_handler,/* rdt */&ff_rdt_video_handler,&ff_rdt_audio_handler,&ff_rdt_live_video_handler,&ff_rdt_live_audio_handler,NULL,
};

rtp_dynamic_protocol_handler_list数组中的每个元素都把SDP协议中的<encoding name>和具体的音视频压缩编码格式对应起来，比如“MPEG4-GENERIC”对应AAC：

const RTPDynamicProtocolHandler ff_mpeg4_generic_dynamic_handler = {.enc_name           = "mpeg4-generic",.codec_type         = AVMEDIA_TYPE_AUDIO,.codec_id           = AV_CODEC_ID_AAC,.priv_data_size     = sizeof(PayloadContext),.parse_sdp_a_line   = parse_sdp_line,.close              = close_context,.parse_packet       = aac_parse_packet,
};

X-MP3-draft-00"对应MP3ADU：

static const RTPDynamicProtocolHandler realmedia_mp3_dynamic_handler = {.enc_name   = "X-MP3-draft-00",.codec_type = AVMEDIA_TYPE_AUDIO,.codec_id   = AV_CODEC_ID_MP3ADU,
};

所以sdp_parse_rtpmap函数中执行语句：ff_rtp_handler_find_by_name(buf, par->codec_type)后，会通过遍历全局数组rtp_dynamic_protocol_handler_list，找到SDP协议中的<encoding name>对应的音视频压缩编码格式（比如“MPEG4-GENERIC”对应的是AAC），将其赋值给st->codecpar->codec_id（即AVCodecParameters的codec_id）：

/* parse the rtpmap description: <codec_name>/<clock_rate>[/<other params>] */
static int sdp_parse_rtpmap(AVFormatContext *s,AVStream *st, RTSPStream *rtsp_st,int payload_type, const char *p)
{
//...if (par->codec_id == AV_CODEC_ID_NONE) {const RTPDynamicProtocolHandler *handler =ff_rtp_handler_find_by_name(buf, par->codec_type);//...}
//...
}

然后在sdp_parse_line函数外部，通过avcodec_parameters_to_context函数将AVCodecParameters的codec_id赋值给AVCodecContext的codec_id：

int avcodec_parameters_to_context(AVCodecContext *codec,const AVCodecParameters *par)
{
//...codec->codec_id   = par->codec_id;
//...
}

然后在dump_stream_format函数中，通过avcodec_string函数中的语句：codec_name = avcodec_get_name(enc->codec_id) 拿到AVCodecContext的codec_id对应的音频压缩编码格式名称。最后再在dump_stream_format函数中将音频压缩编码格式打印出来：

void avcodec_string(char *buf, int buf_size, AVCodecContext *enc, int encode)
{
//...codec_name = avcodec_get_name(enc->codec_id);
//...
}

所以FFmpeg获取SDP描述的RTP流的音频压缩编码格式，是从SDP的“a=rtpmap”这一行获取的：

三、音频压缩编码格式的profile

音频压缩编码格式还有附带的profile（规格）。比如音频压缩编码格式为AAC，根据《ISO14496-3-2009.pdf》第124页，还有AAC Main、AAC LC、AAC SSR、AAC LTP这几种规格：

FFmpeg获取SDP描述的RTP流的音频压缩编码格式的profile，获取的是a=fmtp属性中的config参数中的信息，即AudioSpecificConfig中的audioObjectType。由《音视频入门基础：AAC专题（11）——AudioSpecificConfig简介》可以知道，Audio Specific Config中存在一个占5位或11位的audioObjectType属性，表示音频对象类型：

0: Null
1: AAC Main
2: AAC LC (Low Complexity)
3: AAC SSR (Scalable Sample Rate)
4: AAC LTP (Long Term Prediction)
5: SBR (Spectral Band Replication)
6: AAC Scalable
7: TwinVQ
8: CELP (Code Excited Linear Prediction)
9: HXVC (Harmonic Vector eXcitation Coding)
10: Reserved
11: Reserved
12: TTSI (Text-To-Speech Interface)
13: Main Synthesis
14: Wavetable Synthesis
15: General MIDI
16: Algorithmic Synthesis and Audio Effects
17: ER (Error Resilient) AAC LC
18: Reserved
19: ER AAC LTP
20: ER AAC Scalable
21: ER TwinVQ
22: ER BSAC (Bit-Sliced Arithmetic Coding)
23: ER AAC LD (Low Delay)
24: ER CELP
25: ER HVXC
26: ER HILN (Harmonic and Individual Lines plus Noise)
27: ER Parametric
28: SSC (SinuSoidal Coding)
29: PS (Parametric Stereo)
30: MPEG Surround
31: (Escape value)
32: Layer-1
33: Layer-2
34: Layer-3
35: DST (Direct Stream Transfer)
36: ALS (Audio Lossless)
37: SLS (Scalable LosslesS)
38: SLS non-core
39: ER AAC ELD (Enhanced Low Delay)
40: SMR (Symbolic Music Representation) Simple
41: SMR Main
42: USAC (Unified Speech and Audio Coding) (no SBR)
43: SAOC (Spatial Audio Object Coding)
44: LD MPEG Surround
45: USAC

由《音视频入门基础：AAC专题（12）——FFmpeg源码中，解码AudioSpecificConfig的实现》可以知道，

FFmpeg源码中使用decode_audio_specific_config_gb函数来读取AudioSpecificConfig的信息。decode_audio_specific_config_gb函数中会调用ff_mpeg4audio_get_config_gb函数，而ff_mpeg4audio_get_config_gb函数中，通过语句：c->object_type = get_object_type(gb) 获取AudioSpecificConfig的audioObjectType属性。执行decode_audio_specific_config_gb函数后，m4ac指向的变量会得到从AudioSpecificConfig中解码出来的属性：

static inline int get_object_type(GetBitContext *gb)
{int object_type = get_bits(gb, 5);if (object_type == AOT_ESCAPE)object_type = 32 + get_bits(gb, 6);return object_type;
}

然后在decode_audio_specific_config_gb函数外部，通过aac_decode_frame_int函数将上一步得到的audioObjectType属性赋值给AVCodecContext的profile：

static int aac_decode_frame_int(AVCodecContext *avctx, AVFrame *frame,int *got_frame_ptr, GetBitContext *gb,const AVPacket *avpkt)
{
//...// The AV_PROFILE_AAC_* defines are all object_type - 1// This may lead to an undefined profile being signaledac->avctx->profile = ac->oc[1].m4ac.object_type - 1;
//...
}

然后在dump_stream_format函数中，通过avcodec_string函数中的语句：profile = avcodec_profile_name(enc->codec_id, enc->profile)拿到上一步中得到的AVCodecContext的profile。最后再在dump_stream_format函数中将profile打印出来：

void avcodec_string(char *buf, int buf_size, AVCodecContext *enc, int encode)
{
//...profile = avcodec_profile_name(enc->codec_id, enc->profile);
//...
}

所以FFmpeg获取SDP描述的RTP流的的音频压缩编码格式的profile，获取的是a=fmtp属性中的config参数中的信息，即AudioSpecificConfig中的audioObjectType：