In an earlier blog post we learned how to decode JPEG images using libavcodec. This is the second part of that post. Here we will learn how to encode decoded (raw) images to Theora and write them to an Ogg video file. At the end of the first part we saved our raw image in the raw_data variable and its length in the raw_data_size variable. Let's assume that we packaged all our decoding code in one function called "decode_jpeg_image", which has the following signature:
int decode_jpeg_image(char *filename,int file_name_size,uint8_t *raw_data,int *raw_data_size)
filename = name of jpeg file to decode
file_name_size = length of jpeg file's name
raw_data = contains decoded raw image on return
raw_data_size = contains length of raw_data on return
Now let's start working on how to encode this image in raw_data to theora and write that image to ogg video file.
Finding Theora encoder
We first have to find the encoder for Theora, which is represented by an AVCodec structure.
Here is how we find the encoder using the codec ID of Theora:
AVCodec *theoraEncoder = avcodec_find_encoder(AV_CODEC_ID_THEORA); //returns NULL if no Theora encoder is available
Initializing AVCodecContext and getting PixelFormat supported by encoder
The next step is to allocate an AVCodecContext and to check which PixelFormat the encoder supports. The pix_fmts field of AVCodec contains an array of the PixelFormat values supported by the encoder. Let's just take the very first one:
PixelFormat pixFormat = theoraEncoder->pix_fmts[0];
Now we will allocate the AVCodecContext for the AVCodec we found in the first step. An AVCodecContext is allocated using the function avcodec_alloc_context3. Let's do it:
AVCodecContext *encoderContext=avcodec_alloc_context3(theoraEncoder);
Configuring AVCodecContext
We will now configure our encoderContext to suit our needs. There are many fields that can be set to many different values. The following configuration worked best for me in this project:
encoderContext->bit_rate=(width*height); //width=width of decoded image,height=height of decoded image
encoderContext->bit_rate_tolerance=0;
encoderContext->width=width;
encoderContext->height=height;
encoderContext->time_base.den=frame_rate; //frame_rate=frames per second of the output video
encoderContext->time_base.num=1;
encoderContext->pix_fmt=pixFormat;
encoderContext->gop_size=1;
encoderContext->keyint_min=1;
Actually opening the encoder
So far we have only found the encoder, allocated an AVCodecContext, and configured it with our values. The next step is to actually open the encoder so that it is initialized with our configured values. Codecs are opened using the avcodec_open2 function:
int retVal=avcodec_open2(encoderContext,theoraEncoder,NULL); //0 on success, negative error code on failure
Allocating Output Format
So we have opened the encoder with our configuration. Now we have to take care of writing our encoded output to an ogg file. We will first allocate an AVFormatContext using avformat_alloc_output_context2. We can pass an AVOutputFormat, a format name, or a filename to this function so that it can detect the output type. For this post we will use the format name, since we know in advance that it is "ogg":
AVOutputFormat *outFormat=NULL;
AVFormatContext *outContext;
avformat_alloc_output_context2(&outContext,NULL,"ogg",NULL);
//set output format returned in outContext to outFormat
outFormat=outContext->oformat;
Adding video stream to newly allocated AVFormatContext
Our outContext has no stream added to it yet. Adding a stream is necessary because every encoded packet is written to one of the streams of the output file. It is very easy to add a new stream for any AVCodec to an AVFormatContext using the avformat_new_stream function:
AVStream *videoStream = avformat_new_stream(outContext,theoraEncoder);
videoStream->codec = encoderContext;
videoStream->pts.val = 0;
Allocating Input/Output Context to write to our destination
AVIOContext is a structure that we can assign to the pb field of outContext so that outContext hands its output to our callback function, since we have not given outContext any other output destination. We allocate the AVIOContext with the avio_alloc_context function, using a very large buffer:
int ioBuffSize=7200000*30; //random value that worked for me
uint8_t *ioBuff=(uint8_t *)av_malloc(ioBuffSize);
AVIOContext *ioContext=avio_alloc_context(ioBuff,ioBuffSize,1,data_tag,NULL,write_buf,NULL); //1=buffer is writable; data_tag is handed back to write_buf
ioContext->seekable=0; //we dont need to seek as not required in ogg but required for mp4
outContext->pb=ioContext;
Implementing write_buf function to write data to our ogg file:
static int write_buf(void *data_tag,uint8_t *buff,int buff_size){
//buff contains the encoded data
//buff_size contains the length of valid data in buff
FILE *fp=fopen("encoded.ogg","a+");
if(!fp) return -1; //could not open the output file
fwrite(buff,1,buff_size,fp);
fclose(fp);
return buff_size; //report how many bytes were consumed
}
The output context will call write_buf every time it has some data to write (whether an encoded frame, the header, or the trailer). data_tag is a variable that you pass to avio_alloc_context and can then access inside the write_buf function.
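As a rough illustration of how data_tag can be used: libavformat does not interpret the pointer given as the fourth argument of avio_alloc_context, it only hands it back as the first argument of the write callback. The sketch below assumes that data_tag is an already opened FILE *, so the file is opened once instead of on every call. The name write_buf_to_file is made up for this example and is not part of the original code.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical variant of write_buf: data_tag is assumed to be a FILE *
   that the caller opened before calling avio_alloc_context. */
static int write_buf_to_file(void *data_tag, uint8_t *buff, int buff_size)
{
    FILE *fp = (FILE *)data_tag;
    if (fp == NULL || buff_size < 0)
        return -1; /* a negative return value signals an error */
    size_t written = fwrite(buff, 1, (size_t)buff_size, fp);
    if (written != (size_t)buff_size)
        return -1; /* short write, treat as failure */
    return buff_size; /* number of bytes consumed */
}
```

With this variant the caller would open the file once (for example with fopen in "wb" mode), pass the FILE * in place of data_tag, and close it after the last frame has been written.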
Writing header of file before starting encoding process
Most media files have a header that contains some information about the encoded data they hold. Ogg files also have a header. There is a very simple function to write the header to our output context. Here is how we write it:
avformat_write_header(outContext,NULL);
Allocating AVFrame to hold our raw data
We have our raw data bytes saved in the raw_data variable. But libav functions expect the AVFrame and AVPicture data types for picture data, whether it is raw or encoded. So we will allocate an AVPicture that will hold our raw (decoded) picture, and we will pass this AVPicture to the encoder to encode it into a Theora frame.
AVPicture *rawPic=(AVPicture *)av_malloc(sizeof(AVPicture));
After allocating the AVPicture we will allocate its buffers for our dimensions and pixel format. Let's assume we will use the YUV 4:2:0 planar format for our raw picture:
avpicture_alloc(rawPic,PIX_FMT_YUV420P,width,height);
Encoding raw data to THEORA frame
The next step is to pass the raw data to the encoder and get it encoded into a Theora frame. We first need to set up an AVPacket that will hold the encoded Theora frame returned by the encoder.
AVPacket encodedPic;
av_init_packet(&encodedPic);
encodedPic.data=NULL;
encodedPic.size=0;
encodedPic.stream_index=videoStream->index;
Before calling the encoding function we need to fill the AVPicture (rawPic) that we allocated to hold our raw data. We fill rawPic with our raw data like this:
av_image_fill_arrays(rawPic->data,rawPic->linesize,raw_data,PIX_FMT_YUV420P,width,height,1);
Next we set the PTS value of our rawPic by first casting it to AVFrame and then setting the PTS value on it:
AVFrame *castedRawPic=(AVFrame *)rawPic;
//caution: rawPic was allocated with sizeof(AVPicture), so the pts field lies outside that allocation; allocating a real AVFrame with av_frame_alloc avoids this
castedRawPic->pts=frame_count+1; //frame_count can be an integer starting with 0
Now we will call the actual encoding function, avcodec_encode_video2, which takes our rawPic and returns the encoded frame in encodedPic. This function takes an AVFrame as input instead of an AVPicture, but luckily we have already cast our rawPic to AVFrame (castedRawPic). So let's just call this function:
int gotit; //set to non-zero when encodedPic contains an encoded frame
avcodec_encode_video2(encoderContext,&encodedPic,castedRawPic,&gotit);
If the function is successful and gotit is non-zero, you will have the encoded data in the encodedPic variable.
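The av_image_fill_arrays call above only works if raw_data really contains a complete YUV420P image. As a sanity check, the sketch below computes the plane sizes and offsets for that layout with an alignment of 1, the value passed above. The names yuv420p_layout and compute_yuv420p_layout are made up for this example, and it assumes the usual planar layout: a full-resolution Y plane followed by U and V planes that are halved in both directions.

```c
#include <stddef.h>

/* Hypothetical helper type: line sizes and byte offsets of the three planes. */
typedef struct {
    int linesize[3];   /* bytes per row of Y, U, V */
    size_t offset[3];  /* start of each plane inside the buffer */
    size_t total_size; /* bytes needed for the whole image */
} yuv420p_layout;

/* Computes the tightly packed (alignment 1) YUV420P layout.
   Returns 0 on success, -1 for invalid arguments. */
static int compute_yuv420p_layout(int width, int height, yuv420p_layout *out)
{
    if (width <= 0 || height <= 0 || out == NULL)
        return -1;
    int chroma_w = (width + 1) / 2;  /* chroma is halved, rounding up */
    int chroma_h = (height + 1) / 2;
    size_t y_size = (size_t)width * (size_t)height;
    size_t c_size = (size_t)chroma_w * (size_t)chroma_h;

    out->linesize[0] = width;
    out->linesize[1] = chroma_w;
    out->linesize[2] = chroma_w;
    out->offset[0] = 0;
    out->offset[1] = y_size;
    out->offset[2] = y_size + c_size;
    out->total_size = y_size + 2 * c_size;
    return 0;
}
```

If the raw_data_size returned by decode_jpeg_image differs from total_size, the decoded image is probably not in YUV420P and would need to be converted before it is handed to the encoder.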
Writing encoded data to OGG video file
We have successfully encoded our raw (decoded) image to a Theora frame, but we still have to save it in the ogg file. Instead of using fwrite or anything else, we just use the libav function av_interleaved_write_frame and pass our encoded picture to it. This function handles all interleaving, and when data is ready to be written it calls the write_buf function that we declared earlier, because we configured our context to call this function whenever output data is available. Before calling av_interleaved_write_frame we first have to convert the timestamps of our AVPacket (encodedPic) from the time base of our encoder context to the time base of the stream. But it's very simple. Here is how to do this:
av_packet_rescale_ts(&encodedPic, encoderContext->time_base, videoStream->time_base);
encodedPic.stream_index=videoStream->index;
Now it's time to call our last function, which will output this encoded frame to the ogg file with proper interleaving:
av_interleaved_write_frame(outContext,&encodedPic);
To make an ogg video file from multiple jpeg images, you can call decode_jpeg_image in a loop and pass the returned data to the encoder functions.
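To make the timestamp conversion before av_interleaved_write_frame less magical, here is a small sketch of the arithmetic behind it. It is not the libav implementation: rational and rescale_ts are stand-in names for this example, rounding is simply to the nearest tick, and overflow is not handled. The idea is that a timestamp counted in ticks of the source time base is converted to ticks of the destination time base.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for AVRational: a fraction num/den, in seconds per tick. */
typedef struct {
    int num;
    int den;
} rational;

/* Converts ts from time base src to time base dst:
   ts * (src.num/src.den) / (dst.num/dst.den), rounded to nearest.
   Assumes positive time bases and a non-negative ts; no overflow checks. */
static int64_t rescale_ts(int64_t ts, rational src, rational dst)
{
    assert(src.num > 0 && src.den > 0 && dst.num > 0 && dst.den > 0);
    int64_t numerator = ts * src.num * dst.den;
    int64_t denominator = (int64_t)src.den * dst.num;
    return (numerator + denominator / 2) / denominator;
}
```

For example, with an encoder time base of 1/25 and a stream time base of 1/90000 (both values chosen only for illustration), a pts of 1 becomes 1 * 90000 / 25 = 3600 ticks.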
Read Part1 to understand how to decode jpeg images
Nice article... I have one question: AVIOContext has a parameter called "data_tag". Where is that parameter set? The function "write_buf" has an argument with the same name. Can you explain how to set and use it? Thanks!