
Decoding JPEG image file using libavcodec

I got a chance to work on a video encoding application that decodes a series of JPEG files and converts them into an Ogg Theora video file. I used the well-known libavcodec library that is part of FFmpeg.

I decided to write blog posts explaining how I decode JPEG images and convert them into an Ogg video file. This is the first part, in which I explain how to decode JPEG images using libavcodec. To learn how to write the decoded images to an Ogg video file, please read
http://random-stuff-mine.blogspot.com/2017/07/encoding-raw-images-to-ogg-theora-video.html

Before reading this post you should already know how to set up and use libavcodec. I highly recommend this tutorial for the basics of using libavcodec
http://www.ffmpeg.org/doxygen/0.6/api-example_8c-source.html

Allocating input format context
We first open the input file for reading. We use the avformat_open_input function (from libavformat, which ships alongside libavcodec in FFmpeg). It allocates the AVFormatContext structure passed to it and detects the input type from the filename passed to it.
Here is how we allocate the AVFormatContext structure. The pointer must be set to NULL first so that avformat_open_input allocates the context for us.
AVFormatContext *iFormatContext=NULL;
if (avformat_open_input(&iFormatContext,filename,NULL,NULL)!=0)
{
    printf("Error in opening input file %s",filename);
    return ERROR_CODE;
}
Finding stream information
The next step is to find the AVStream objects in our input context (iFormatContext). For this libavformat provides the function avformat_find_stream_info. It takes a pointer to the AVFormatContext and an AVDictionary for extra options, which we ignore for now by passing NULL.
if (avformat_find_stream_info(iFormatContext,NULL)<0)
{
    printf("Error in finding stream info");
    avformat_close_input(&iFormatContext); //release AVFormatContext memory
    return ERROR_CODE;
}
Finding video stream from number of streams
After avformat_find_stream_info returns successfully, the nb_streams member of AVFormatContext contains the number of streams found and the streams member contains the streams themselves. Our next task is to find the video stream among them. Since we opened a JPEG file, it contains only one stream, the video one. Still, to keep the code adaptable to other formats, we loop through the streams member and pick the first video stream.
int videoStreamIndex=-1;
for (int a=0;a<iFormatContext->nb_streams;a++)
{
    if (iFormatContext->streams[a]->codec->codec_type==AVMEDIA_TYPE_VIDEO)
    {
         videoStreamIndex=a;
         break;
    }
}
if (videoStreamIndex==-1)
{
    printf("Couldn't find video stream");
    avformat_close_input(&iFormatContext);
    return ERROR_CODE;
}
Finding decoder for video stream
videoStreamIndex now contains the index of the stream we are interested in. Next we find and open a decoder for the frames of this stream.
AVCodecContext *pCodecCtx=iFormatContext->streams[videoStreamIndex]->codec;
AVCodec *pCodec=avcodec_find_decoder(pCodecCtx->codec_id);
if (pCodec==NULL)
{
    printf("Cannot find decoder");
    avformat_close_input(&iFormatContext);
    return ERROR_CODE;
}
if (avcodec_open2(pCodecCtx,pCodec,NULL)<0)
{
    printf("Cannot open decoder");
    avformat_close_input(&iFormatContext);
    return ERROR_CODE;
}
Reading video frame
Now we read a frame from our input using the av_read_frame function. It takes a pointer to the AVFormatContext and a pointer to an AVPacket. On success the AVPacket contains the encoded video frame. Before calling this function we have to initialize our AVPacket using the av_init_packet function.
AVPacket encodedPacket;
av_init_packet(&encodedPacket);
encodedPacket.data=NULL;
encodedPacket.size=0;
//now read a frame into this AVPacket
if (av_read_frame(iFormatContext,&encodedPacket)<0)
{
    printf("Cannot read frame");
    av_free_packet(&encodedPacket);
    avcodec_close(pCodecCtx);
    avformat_close_input(&iFormatContext);
    return ERROR_CODE;
}
Decoding the encoded video frame
To decode the video frame we use the avcodec_decode_video2 function. It takes a pointer to the AVCodecContext, a pointer to an AVFrame, a pointer to an int and a pointer to the AVPacket. On success, the AVFrame contains the decoded video frame. We have to allocate the AVFrame using the avcodec_alloc_frame function.
int frameFinished=0;
AVFrame *decodedFrame=avcodec_alloc_frame();
avcodec_decode_video2(pCodecCtx,decodedFrame,&frameFinished,&encodedPacket);
//On return frameFinished is nonzero if a complete frame was decoded.
//If it is 0, no frame could be produced from this packet.
Converting the pixel format 
We mostly use RGB data to represent images, but encoders and decoders also use other formats such as YUV. The image returned by the JPEG decoder is in a YUV format, so to convert it to RGB we use avpicture_alloc from libavcodec together with sws_getContext and sws_scale from libswscale.
AVPicture destPic;
//we will convert to RGB32
PixelFormat destFormat=PIX_FMT_RGB32;
avpicture_alloc(&destPic,destFormat, decodedFrame->width,decodedFrame->height);
SwsContext *ctxt = sws_getContext(decodedFrame->width, decodedFrame->height,
                      (PixelFormat)decodedFrame->format, decodedFrame->width, decodedFrame->height,
                      destFormat, SWS_BILINEAR, NULL, NULL, NULL);
if (ctxt == NULL)
{
    printf ("Error while calling sws_getContext");
    avpicture_free(&destPic); //release the picture allocated above
    return ERROR_CODE;
}
sws_scale(ctxt, decodedFrame->data, decodedFrame->linesize, 0, decodedFrame->height, destPic.data, destPic.linesize);
sws_freeContext(ctxt);
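To give an idea of what the YUV to RGB step means for a single pixel, here is a minimal sketch in plain C. It assumes the full-range JFIF YCbCr convention that JPEG files normally use. The function names are hypothetical and are not part of libavcodec or libswscale. sws_scale does the real conversion with its own optimized arithmetic, so its output may differ slightly from this illustration.

```c
#include <stdint.h>

/* Round to nearest and clamp a value to the 0..255 range. */
static uint8_t clamp_u8(double v)
{
    if (v < 0.0) return 0;
    if (v > 255.0) return 255;
    return (uint8_t)(v + 0.5);
}

/* Hypothetical helper: convert one full-range JFIF YCbCr pixel to RGB.
 * Cb and Cr are centered on 128, so 128 means "no color". */
void jfif_ycbcr_to_rgb(uint8_t y, uint8_t cb, uint8_t cr,
                       uint8_t *r, uint8_t *g, uint8_t *b)
{
    double d_cb = cb - 128.0;
    double d_cr = cr - 128.0;
    *r = clamp_u8(y + 1.402 * d_cr);
    *g = clamp_u8(y - 0.344136 * d_cb - 0.714136 * d_cr);
    *b = clamp_u8(y + 1.772 * d_cb);
}
```

A pixel with Cb and Cr both equal to 128 carries no color information, so all three RGB components come out equal to Y.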
Copying decoded frame to buffer
We now have our decoded picture in RGB format in the destPic variable. But we should not treat destPic.data as one flat RGB32 array: it is an array of plane pointers, and rows are addressed through destPic.linesize. So we copy the data in destPic into our own byte array for easy use and processing.
uint8_t *raw_data;
int raw_data_size=decodedFrame->height*decodedFrame->width*4; //each pixel is 4 bytes
raw_data=(uint8_t*)malloc(raw_data_size);
av_image_copy_to_buffer(raw_data,raw_data_size,destPic.data,destPic.linesize,
                        PIX_FMT_RGB32,decodedFrame->width,decodedFrame->height,1);
Now our raw_data byte array contains each pixel of decoded data. 
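For a single packed plane, the copy above amounts to moving the image row by row from a buffer with a row stride (linesize) into a tightly packed buffer. The sketch below shows that idea in plain C without any FFmpeg calls. The function names pack_rgb32_rows and rgb32_pixel_at are hypothetical helpers for illustration, not library functions. The pixel reader assumes that an RGB32 pixel, read as a native-endian 32-bit value, is laid out as 0xAARRGGBB.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy a 4-bytes-per-pixel image whose rows start
 * src_stride bytes apart into a newly allocated, tightly packed buffer
 * (width*4 bytes per row). Returns NULL if allocation fails; the caller
 * frees the result with free(). */
uint8_t *pack_rgb32_rows(const uint8_t *src, int src_stride, int width, int height)
{
    size_t row_bytes = (size_t)width * 4;
    assert(src_stride >= 0 && (size_t)src_stride >= row_bytes);

    uint8_t *dst = (uint8_t *)malloc(row_bytes * (size_t)height);
    if (dst == NULL)
        return NULL;

    /* Copy only the visible pixels of each row, skipping any padding. */
    for (int y = 0; y < height; y++)
        memcpy(dst + (size_t)y * row_bytes,
               src + (size_t)y * (size_t)src_stride,
               row_bytes);
    return dst;
}

/* Hypothetical helper: read pixel (x, y) from a packed buffer as a
 * native-endian 32-bit value (assumed layout 0xAARRGGBB). */
uint32_t rgb32_pixel_at(const uint8_t *packed, int width, int x, int y)
{
    uint32_t v;
    memcpy(&v, packed + ((size_t)y * (size_t)width + (size_t)x) * 4, sizeof v);
    return v;
}
```

With the buffer from this post, a call such as rgb32_pixel_at(raw_data, decodedFrame->width, x, y) would return one pixel, and (value >> 16) & 0xFF would give its red component under the layout assumed above.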
