Video editing with the FFmpeg libraries

External websites

Main Data Structures of FFmpeg

Audio formats

Main formats: resampling ac3 → aac is complicated because the two codecs use different packet durations.
AVSampleFormat asf = AVCodecContext->sample_fmt; describes the layout of AVFrame.extended_data (nchannel data planes for planar formats), e.g. fltp for the AC-3 stream in the TS example further down.
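The packet-duration mismatch can be made concrete with a little arithmetic. The sketch below assumes the usual frame sizes of 1536 samples per AC-3 frame and 1024 samples per AAC frame; the helper names frame_ticks and frames_available are made up for this note and are not FFmpeg functions. Because the sizes do not divide evenly, decoded samples have to be buffered (FFmpeg's transcode_aac.c example uses an AVAudioFifo for this) and cut into encoder-sized frames.

```c
#include <assert.h>
#include <stdint.h>

/* Duration of one audio frame in ticks of a 1/tick_rate time base.
 * Hypothetical helper, not part of FFmpeg. */
static int64_t frame_ticks(int samples_per_frame, int sample_rate, int tick_rate)
{
    assert(sample_rate > 0);
    return (int64_t)samples_per_frame * tick_rate / sample_rate;
}

/* Number of complete encoder frames that can be cut from the samples
 * currently held in a FIFO. Hypothetical helper, not part of FFmpeg. */
static int frames_available(int buffered_samples, int out_frame_size)
{
    assert(out_frame_size > 0);
    return buffered_samples / out_frame_size;
}
```

At 48000 Hz and a 90 kHz time base, frame_ticks(1536, 48000, 90000) gives 2880 ticks (32 ms, the audio packet duration seen in the TS dump), while frame_ticks(1024, 48000, 90000) gives 1920 ticks. One decoded AC-3 frame therefore yields one AAC frame plus 512 leftover samples that must wait for the next input frame.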

Workflow of 'transcoding'

Example 'transcoding-wolfk.c'. Adapted from 'transcoding.c', but without filters!
  1. avformat_open_input(&ifmt_ctx, filename, NULL, NULL)
  2. avformat_alloc_output_context2(&ofmt_ctx, NULL, NULL, filename)

StreamContext

User-defined struct, instantiated once per stream (video/audio):
typedef struct StreamContext {
    AVCodecContext *dec_ctx, *enc_ctx; // input, output
    AVFrame *dec_frame;                // current frame
} StreamContext;

crf = 'constant rate factor', defined in X264Context
crf=0: lossless; 23-28: safe values for the x264 codec (23 is the x264 default); increasing crf by 6 roughly halves the file size
if (encoder->id == AV_CODEC_ID_H264) { // = 27
   // av_opt_show2(enc_ctx->priv_data,0,0xfffff,0); // prints all options via av_log
   // X264Context *x4 = enc_ctx->priv_data; // from 'libx264.c'
   av_opt_set(enc_ctx->priv_data, "crf", "28.0", 0); // set CRF via the encoder's private options
}
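The rule of thumb that a crf step of 6 halves the file size can be written as a ratio. It is only an approximation and depends on the content. The symbols S (file size) and c_0, c_1 (old and new crf) are introduced here for illustration only:

```latex
\frac{S(c_1)}{S(c_0)} \approx 2^{-(c_1 - c_0)/6}
\qquad\text{e.g. } c_0 = 23,\ c_1 = 28:\quad 2^{-5/6} \approx 0.56
```

Under this assumption, going from the x264 default of 23 to a crf of 28 would give a file of very roughly 56% of the size.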

send/receive encoding and decoding API overview

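(From the FFmpeg documentation) The API is very similar for encoding/decoding and audio/video, and works as follows:
  1. Set up and open the AVCodecContext as usual.
  2. Send valid input: call avcodec_send_packet() to give the decoder compressed data in an AVPacket, or avcodec_send_frame() to give the encoder an AVFrame containing uncompressed audio or video.
  3. Receive output in a loop: call avcodec_receive_frame() (decoding) or avcodec_receive_packet() (encoding) until it returns AVERROR(EAGAIN) or an error. AVERROR(EAGAIN) means that new input is required before new output can be returned.

At the beginning of decoding or encoding, the codec might accept multiple input frames/packets without returning a frame, until its internal buffers are filled. This situation is handled transparently if you follow the steps outlined above.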

In theory, sending input can result in EAGAIN - this should happen only if not all output was received. You can use this to structure alternative decode or encode loops other than the one suggested above. For example, you could try sending new input on each iteration, and try to receive output if that returns EAGAIN.

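End of stream situations. These require "flushing" (aka draining) the codec, as the codec might buffer multiple frames or packets internally for performance or out of necessity (consider B-frames). This is handled as follows:
  1. Instead of valid input, send NULL to avcodec_send_packet() (decoding) or avcodec_send_frame() (encoding). This enters draining mode.
  2. Call avcodec_receive_frame() (decoding) or avcodec_receive_packet() (encoding) in a loop until AVERROR_EOF is returned.
  3. Before decoding can be resumed, the codec has to be reset with avcodec_flush_buffers().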
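The send/receive protocol described in this section can be tried out without linking FFmpeg by using a toy model. Everything in the sketch below (ToyCodec, toy_send, toy_receive, toy_run and the TOY_* codes) is hypothetical and only imitates the behaviour of avcodec_send_frame()/avcodec_receive_packet(). The toy "codec" holds back two items, much like an encoder that buffers frames for B-frame reordering, and releases them only when more input arrives or when it is flushed with NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for AVERROR(EAGAIN) and AVERROR_EOF (not the real values). */
enum { TOY_OK = 0, TOY_EAGAIN = -1, TOY_EOF = -2 };

#define TOY_DELAY 2   /* items the toy codec holds back before it emits output */
#define TOY_CAP   8   /* queue capacity, larger than TOY_DELAY + 1 */

typedef struct ToyCodec {
    int queue[TOY_CAP];  /* buffered input items */
    int count;           /* number of buffered items */
    int draining;        /* set once NULL input was sent */
} ToyCodec;

/* Imitates avcodec_send_frame(): item == NULL enters draining mode. */
static int toy_send(ToyCodec *c, const int *item)
{
    if (c->draining)
        return TOY_EOF;          /* no more input after the flush */
    if (!item) {
        c->draining = 1;
        return TOY_OK;
    }
    if (c->count > TOY_DELAY)
        return TOY_EAGAIN;       /* pending output must be received first */
    c->queue[c->count++] = *item;
    return TOY_OK;
}

/* Imitates avcodec_receive_packet(): output is available only when more
 * than TOY_DELAY items are buffered, or while draining. */
static int toy_receive(ToyCodec *c, int *out)
{
    if (c->count > TOY_DELAY || (c->draining && c->count > 0)) {
        *out = c->queue[0];
        for (int i = 1; i < c->count; i++)
            c->queue[i - 1] = c->queue[i];
        c->count--;
        return TOY_OK;
    }
    return c->draining ? TOY_EOF : TOY_EAGAIN;
}

/* Sends every input, receives after each send, then flushes with NULL and
 * drains. out must have room for n items. Returns the number of outputs. */
static size_t toy_run(const int *in, size_t n, int *out)
{
    ToyCodec c = {0};
    size_t produced = 0;
    for (size_t i = 0; i <= n; i++) {
        int ret = toy_send(&c, i < n ? &in[i] : NULL);  /* last round: flush */
        assert(ret == TOY_OK);
        while (toy_receive(&c, &out[produced]) == TOY_OK)
            produced++;
    }
    return produced;
}
```

With four inputs, the first two sends produce no output (toy_receive reports TOY_EAGAIN), the next two sends release one item each, and the final NULL send releases the remaining two before TOY_EOF is reported. This is the same pattern as the buffered packets in the real encoding loop.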

Implementation: decoding loop

while (1) {                                                       // read all packets from the input
    if ((ret = av_read_frame(ifmt_ctx, packet)) < 0) break;       // EOF or read error
    stream_index = packet->stream_index;                          // video or audio
    StreamContext *stream = &stream_ctx[stream_index];
    av_packet_rescale_ts(packet, ifmt_ctx->streams[stream_index]->time_base, stream->dec_ctx->time_base);
    ret = avcodec_send_packet(stream->dec_ctx, packet);           // send/receive loop for decoding
    av_packet_unref(packet);                                      // the decoder keeps its own reference
    while (ret >= 0) {
        ret = avcodec_receive_frame(stream->dec_ctx, stream->dec_frame);
        if (ret == AVERROR_EOF || ret == AVERROR(EAGAIN)) break;  // decoder needs more input
        else if (ret < 0) goto end;                               // real decoding error
        stream->dec_frame->pts = stream->dec_frame->best_effort_timestamp;
        ret = wolfk_encode_write_frame(stream->dec_frame, stream_index);  // **** encoding loop (see below) ****
        if (ret < 0) goto end;
    }
}

Implementation: encoding loop

static int wolfk_encode_write_frame(AVFrame *frame, unsigned int stream_index) { // without filters!
  StreamContext *stream = &stream_ctx[stream_index];
  AVCodecContext *enc_ctx = stream->enc_ctx;                           // video or audio
  AVPacket *enc_pkt = av_packet_alloc();
  if (!enc_pkt) return AVERROR(ENOMEM);
  if (frame) {
    if (stream_index == videoStream) {                                 // ---- set pict_type for cutting ----
      if (frame->pict_type == AV_PICTURE_TYPE_B) frame->pict_type = AV_PICTURE_TYPE_P; // or AV_PICTURE_TYPE_NONE
      // frame->pict_type = AV_PICTURE_TYPE_I;  // all I-frames: works!
    }
  } else  fprintf(stderr,"flushing encoder %u\n",stream_index);
  int ret = avcodec_send_frame(enc_ctx, frame);                         // send/receive loop
  /* frame = containing the raw audio or video frame to be encoded.
     It can be NULL, in which case it is considered a flush packet. This signals the end of the stream.
     If the encoder still has packets buffered, it will return them after this call.
     Once flushing mode has been entered, additional flush packets are ignored, and sending frames will return AVERROR_EOF.
   */
  while (ret >= 0) {       // receive one or more packets *** these are buffered packets! ***
    ret = avcodec_receive_packet(enc_ctx, enc_pkt);
    /* Returns 0 on success, otherwise negative error code:
       AVERROR(EAGAIN): output is not available in the current state - user must try to send input
       AVERROR_EOF: the encoder has been fully flushed, and there will be no more output packets
       AVERROR(EINVAL): codec not opened, or it is a decoder
       other errors: legitimate encoding errors
     */
    if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF) { ret = 0; break; } // back to the input loop
    if (ret < 0) break;                                            // real encoding error
    enc_pkt->stream_index = stream_index;                          // packet ready, prepare for muxing
    av_packet_rescale_ts(enc_pkt, enc_ctx->time_base, ofmt_ctx->streams[stream_index]->time_base);
    ret = av_interleaved_write_frame(ofmt_ctx, enc_pkt);           // mux encoded frame
  }
  av_packet_free(&enc_pkt);                                        // frees the packet on every exit path
  return ret;                                                      // 0 or a negative error code
}
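What av_packet_rescale_ts() does to each timestamp can be sketched in plain C. This is a simplified model, not FFmpeg's implementation: Rational and rescale_ts are hypothetical stand-ins for AVRational and av_rescale_q(), there is no overflow protection, and the real function also leaves AV_NOPTS_VALUE untouched. The sketch rounds to the nearest tick, with halfway cases rounded away from zero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for AVRational: a time base of num/den seconds. */
typedef struct Rational { int num, den; } Rational;

/* Converts ts from time base src to time base dst, rounding to nearest
 * (halfway cases away from zero). Simplified: no overflow protection. */
static int64_t rescale_ts(int64_t ts, Rational src, Rational dst)
{
    int64_t b = (int64_t)src.num * dst.den;  /* multiplier */
    int64_t c = (int64_t)src.den * dst.num;  /* divisor */
    assert(b > 0 && c > 0);
    int64_t half = c / 2;
    if (ts >= 0)
        return (ts * b + half) / c;
    return -((-ts * b + half) / c);
}
```

With the values from the TS example, a video packet duration of 1800 ticks in 1/90000 becomes 1 tick in a 1/50 time base, and an audio packet duration of 2880 ticks in 1/90000 becomes 1536 ticks (samples) in 1/48000.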

Example: packet input from a TS file

  Input #0, mpegts, from '/home/minidlna/Test-Message.m2t':
  Duration: 03:15:09.03, start: 19562.565178, bitrate: 5833 kb/s
  Program 50019 
  Program 50021 
    Stream #0:0[0x1a73]: Video: h264 (High) ([27][0][0][0] / 0x001B), yuv420p(tv, bt709, top first), 1920x1080 [SAR 1:1 DAR 16:9], 25 fps, 50 tbr, 90k tbn, 50 tbc
    Stream #0:1[0x1a74](deu): Audio: ac3 ([6][0][0][0] / 0x0006), 48000 Hz, 5.1(side), fltp, 384 kb/s
    Stream #0:2[0x1a76](deu): Subtitle: dvb_teletext ([6][0][0][0] / 0x0006)
  
**** Frame size = 1920 x 1080 (8,294,400 byte) bit_rate=5,833,496 frame_rate=25.000 time_base=1/90,000=1.11111e-05 ticksperframe=2 videoStream=0
total    start_time= 19,562.565s duration=11,709s bitrate=5,833,496  audio_preload=0ms
951305 packets 585424 videopkt 12390 I-Frames

    i        dts                    pts           duration    size flag    pos
  0 0 +0.200=1,760,648,866 +0.400=1,760,666,866 0.020=1,800 41,062 0     2,256  Video 20ms
  1 0 +0.220=1,760,650,666 +0.420=1,760,668,666 0.020=1,800  4,279 0    48,880   ""
  2 0 +0.240=1,760,652,466 +0.320=1,760,659,666 0.020=1,800  7,434 0    54,520   ""
  3 0 +0.260=1,760,654,266 +0.340=1,760,661,466 0.020=1,800  2,761 0    62,416
  4 0 +0.280=1,760,656,066 +0.280=1,760,656,066 0.020=1,800  4,048 0    65,800
  5 0 +0.300=1,760,657,866 +0.300=1,760,657,866 0.020=1,800  2,322 0    71,252
  6 0 +0.320=1,760,659,666 +0.360=1,760,663,266 0.020=1,800  4,357 0    73,696   ""
  7 1 -0.000=1,760,630,866 -0.000=1,760,630,866 0.032=2,880  1,536 1    33,652  Audio 32ms
  8 1 +0.032=1,760,633,746 +0.032=1,760,633,746 0.032=2,880  1,536 1        -1   ""
  9 1 +0.064=1,760,636,626 +0.064=1,760,636,626 0.032=2,880  1,536 1        -1
 10 1 +0.096=1,760,639,506 +0.096=1,760,639,506 0.032=2,880  1,536 1        -1
 11 0 +0.340=1,760,661,466 +0.380=1,760,665,066 0.020=1,800  2,426 0    78,208
 12 0 +0.360=1,760,663,266 +0.560=1,760,681,266 0.020=1,800 39,672 0    81,028
 13 0 +0.380=1,760,665,066 +0.580=1,760,683,066 0.020=1,800  8,313 0   127,088
 14 0 +0.400=1,760,666,866 +0.480=1,760,674,066 0.020=1,800  8,568 0   135,736
 15 0 +0.420=1,760,668,666 +0.500=1,760,675,866 0.020=1,800  3,230 0   146,452
 16 0 +0.440=1,760,670,466 +0.440=1,760,670,466 0.020=1,800  4,170 0   149,836
 17 0 +0.460=1,760,672,266 +0.460=1,760,672,266 0.020=1,800  2,373 0   154,160
 18 0 +0.480=1,760,674,066 +0.520=1,760,677,666 0.020=1,800  3,982 0   156,792
 19 0 +0.500=1,760,675,866 +0.540=1,760,679,466 0.020=1,800  2,315 0   162,244
 20 1 +0.128=1,760,642,386 +0.128=1,760,642,386 0.032=2,880  1,536 1        -1
 21 1 +0.160=1,760,645,266 +0.160=1,760,645,266 0.032=2,880  1,536 1   122,764
 22 1 +0.192=1,760,648,146 +0.192=1,760,648,146 0.032=2,880  1,536 1        -1
 23 1 +0.224=1,760,651,026 +0.224=1,760,651,026 0.032=2,880  1,536 1        -1
 24 1 +0.256=1,760,653,906 +0.256=1,760,653,906 0.032=2,880  1,536 1        -1
 25 0 +0.520=1,760,677,666 +0.720=1,760,695,666 0.020=1,800 39,670 0   164,688
 26 0 +0.540=1,760,679,466 +0.740=1,760,697,466 0.020=1,800  6,232 0   208,868
 27 0 +0.560=1,760,681,266 +0.640=1,760,688,466 0.020=1,800  9,111 0   217,140
 28 0 +0.580=1,760,683,066 +0.660=1,760,690,266 0.020=1,800  3,214 0   226,916
 29 0 +0.600=1,760,684,866 +0.600=1,760,684,866 0.020=1,800  4,704 0   231,052
 30 0 +0.620=1,760,686,666 +0.620=1,760,686,666 0.020=1,800  2,526 0   237,068
 31 0 +0.640=1,760,688,466 +0.680=1,760,692,066 0.020=1,800  4,506 0   239,700
 32 1 +0.288=1,760,656,786 +0.288=1,760,656,786 0.032=2,880  1,536 1        -1
 33 1 +0.320=1,760,659,666 +0.320=1,760,659,666 0.032=2,880  1,536 1   216,200
 34 1 +0.352=1,760,662,546 +0.352=1,760,662,546 0.032=2,880  1,536 1        -1
 35 1 +0.384=1,760,665,426 +0.384=1,760,665,426 0.032=2,880  1,536 1        -1
 36 1 +0.416=1,760,668,306 +0.416=1,760,668,306 0.032=2,880  1,536 1        -1
 37 0 +0.660=1,760,690,266 +0.700=1,760,693,866 0.020=1,800  2,479 0   244,400
 38 0 +0.680=1,760,692,066 +0.880=1,760,710,066 0.020=1,800 90,498 1   247,032 *** first I-frame
 39 0 +0.700=1,760,693,867 +0.900=1,760,711,867 0.020=1,800  6,438 0   350,432
 40 0 +0.720=1,760,695,667 +0.800=1,760,702,867 0.020=1,800  7,008 0   357,200
 41 0 +0.740=1,760,697,467 +0.820=1,760,704,667 0.020=1,800  2,688 0   366,224
 42 0 +0.760=1,760,699,267 +0.760=1,760,699,267 0.020=1,800  3,891 0   369,044
....
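The relative times in the dts and pts columns follow from the raw 90 kHz tick values and the stream start (1,760,630,866 ticks, the first audio packet). A minimal sketch of that conversion is shown here; the helper names rel_ms and reorder_ticks are made up for this note.

```c
#include <assert.h>
#include <stdint.h>

/* Milliseconds since stream start for a timestamp given in ticks of a
 * 1/tick_rate time base. Hypothetical helper, not part of FFmpeg. */
static int64_t rel_ms(int64_t ts, int64_t start, int tick_rate)
{
    assert(tick_rate > 0);
    return (ts - start) * 1000 / tick_rate;
}

/* Ticks between decoding and presentation of a packet; non-zero values
 * indicate frame reordering. Hypothetical helper, not part of FFmpeg. */
static int64_t reorder_ticks(int64_t pts, int64_t dts)
{
    return pts - dts;
}
```

For packet 0, rel_ms(1760648866, 1760630866, 90000) gives 200 and rel_ms(1760666866, 1760630866, 90000) gives 400, matching the +0.200 and +0.400 columns. The difference of 18000 ticks (200 ms) is the time the frame waits after decoding before it is displayed. Packets 4 and 5 have pts equal to dts, and the audio packets are never reordered.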

libx264 AVOptions

  -preset            <string>     E..V..... Set the encoding preset (cf. x264 --fullhelp) (default "medium")
  -tune              <string>     E..V..... Tune the encoding params (cf. x264 --fullhelp)
  -profile           <string>     E..V..... Set profile restrictions (cf. x264 --fullhelp) 
  -fastfirstpass     <boolean>    E..V..... Use fast settings when encoding first pass (default true)
  -level             <string>     E..V..... Specify level (as defined by Annex A)
  -passlogfile       <string>     E..V..... Filename for 2 pass stats
  -wpredp            <string>     E..V..... Weighted prediction for P-frames
  -a53cc             <boolean>    E..V..... Use A53 Closed Captions (if available) (default true)
  -x264opts          <string>     E..V..... x264 options
  -crf               <float>      E..V..... Select the quality for constant quality mode (from -1 to FLT_MAX) (default -1)
  -crf_max           <float>      E..V..... In CRF mode, prevents VBV from lowering quality beyond this point. (from -1 to FLT_MAX) (default -1)
  -qp                <int>        E..V..... Constant quantization parameter rate control method (from -1 to INT_MAX) (default -1)
  -aq-mode           <int>        E..V..... AQ method (from -1 to INT_MAX) (default -1)
     none                         E..V.....
     variance                     E..V..... Variance AQ (complexity mask)
     autovariance                 E..V..... Auto-variance AQ
     autovariance-biased          E..V..... Auto-variance AQ with bias to dark scenes
  -aq-strength       <float>      E..V..... AQ strength. Reduces blocking and blurring in flat and textured areas. (from -1 to FLT_MAX) (default -1)
  -psy               <boolean>    E..V..... Use psychovisual optimizations. (default auto)
  -psy-rd            <string>     E..V..... Strength of psychovisual optimization, in <psy-rd>:<psy-trellis> format.
  -rc-lookahead      <int>        E..V..... Number of frames to look ahead for frametype and ratecontrol (from -1 to INT_MAX) (default -1)
  -weightb           <boolean>    E..V..... Weighted prediction for B-frames. (default auto)
  -weightp           <int>        E..V..... Weighted prediction analysis method. (from -1 to INT_MAX) (default -1)
     none                         E..V.....
     simple                       E..V.....
     smart                        E..V.....
  -ssim              <boolean>    E..V..... Calculate and print SSIM stats. (default auto)
  -intra-refresh     <boolean>    E..V..... Use Periodic Intra Refresh instead of IDR frames. (default auto)
  -bluray-compat     <boolean>    E..V..... Bluray compatibility workarounds. (default auto)
  -b-bias            <int>        E..V..... Influences how often B-frames are used (from INT_MIN to INT_MAX) (default INT_MIN)
  -b-pyramid         <int>        E..V..... Keep some B-frames as references. (from -1 to INT_MAX) (default -1)
     none                         E..V.....
     strict                       E..V..... Strictly hierarchical pyramid
     normal                       E..V..... Non-strict (not Blu-ray compatible)
  -mixed-refs        <boolean>    E..V..... One reference per partition, as opposed to one reference per macroblock (default auto)
  -8x8dct            <boolean>    E..V..... High profile 8x8 transform. (default auto)
  -fast-pskip        <boolean>    E..V..... (default auto)
  -aud               <boolean>    E..V..... Use access unit delimiters. (default auto)
  -mbtree            <boolean>    E..V..... Use macroblock tree ratecontrol. (default auto)
  -deblock           <string>     E..V..... Loop filter parameters, in <alpha:beta> form.
  -cplxblur          <float>      E..V..... Reduce fluctuations in QP (before curve compression) (from -1 to FLT_MAX) (default -1)
  -partitions        <string>     E..V..... A comma-separated list of partitions to consider. Possible values: p8x8, p4x4, b8x8, i8x8, i4x4, none, all
  -direct-pred       <int>        E..V..... Direct MV prediction mode (from -1 to INT_MAX) (default -1)
     none                         E..V.....
     spatial                      E..V.....
     temporal                     E..V.....
     auto                         E..V.....
  -slice-max-size    <int>        E..V..... Limit the size of each slice in bytes (from -1 to INT_MAX) (default -1)
  -stats             <string>     E..V..... Filename for 2 pass stats
  -nal-hrd           <int>        E..V..... Signal HRD information (requires vbv-bufsize; cbr not allowed in .mp4) (from -1 to INT_MAX) (default -1)
     none                         E..V.....
     vbr                          E..V.....
     cbr                          E..V.....
  -avcintra-class    <int>        E..V..... AVC-Intra class 50/100/200 (from -1 to 200) (default -1)
  -me_method         <int>        E..V..... Set motion estimation method (from -1 to 4) (default -1)
     dia                          E..V.....
     hex                          E..V.....
     umh                          E..V.....
     esa                          E..V.....
     tesa                         E..V.....
  -motion-est        <int>        E..V..... Set motion estimation method (from -1 to 4) (default -1)
     dia                          E..V.....
     hex                          E..V.....
     umh                          E..V.....
     esa                          E..V.....
     tesa                         E..V.....
  -forced-idr        <boolean>    E..V..... If forcing keyframes, force them as IDR frames. (default false)
  -coder             <int>        E..V..... Coder type (from -1 to 1) (default default)
     default                      E..V.....
     cavlc                        E..V.....
     cabac                        E..V.....
     vlc                          E..V.....
     ac                           E..V.....
  -b_strategy        <int>        E..V..... Strategy to choose between I/P/B-frames (from -1 to 2) (default -1)
  -chromaoffset      <int>        E..V..... QP difference between chroma and luma (from INT_MIN to INT_MAX) (default -1)
  -sc_threshold      <int>        E..V..... Scene change threshold (from INT_MIN to INT_MAX) (default -1)
  -noise_reduction   <int>        E..V..... Noise reduction (from INT_MIN to INT_MAX) (default -1)
  -x264-params       <string>     E..V..... Override the x264 configuration using a :-separated list of key=value parameters
[libx264 @ 0x5638e67c5f80] using SAR=1/1
[libx264 @ 0x5638e67c5f80] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 0x5638e67c5f80] profile High, level 4.2
[libx264 @ 0x5638e67c5f80] 264 - core 155 r2917 0a84d98 - H.264/MPEG-4 AVC codec - Copyleft 2003-2018 - http://www.videolan.org/x264.html - options:
  cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1
  chroma_qp_offset=-2 threads=34 lookahead_threads=5 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0
  direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=28.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4
  ip_ratio=1.40 aq=1:1.00




Last modified:      wolfk.wk@wolfk-wk.de