Description
Pixels in video file formats are not necessarily "square". Such non-square rectangular pixels are defined by their pixel aspect ratio (PAR) or sample aspect ratio (SAR) (see also Wikipedia's page on Pixel Aspect Ratio).
ffmpeg supports them and ffprobe reports them, but torchcodec returns the stored pixel dimensions without any information about the pixel aspect ratio. Here's the output of ffprobe on a file with non-square pixels (note the SAR (Sample Aspect Ratio), which is not 1:1):
$ ffprobe -version
ffprobe version 6.1.1 Copyright (c) 2007-2023 the FFmpeg developers
[...]
$ ffprobe -hide_banner mp4/512/5090101893225793677.mp4
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'mp4/512/5090101893225793677.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf58.76.100
Duration: 00:59:34.20, start: 0.000000, bitrate: 865 kb/s
Stream #0:0[0x1](und): Video: h264 (High) (avc1 / 0x31637661), yuv420p(progressive), 640x512 [SAR 64:45 DAR 16:9], 731 kb/s, 25 fps, 25 tbr, 12800 tbn (default)
Metadata:
handler_name : VideoHandler
vendor_id : [0][0][0][0]
Stream #0:1[0x2](eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 128 kb/s (default)
Metadata:
handler_name : SoundHandler
vendor_id : [0][0][0][0]
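For reference, the SAR and DAR that ffprobe prints are related by DAR = (width / height) × SAR. Here is a quick check of that arithmetic for this file, using only the numbers from the output above:

```python
from fractions import Fraction

# Values copied from the ffprobe output: 640x512, SAR 64:45.
width, height = 640, 512
sar = Fraction(64, 45)

# Display aspect ratio = storage aspect ratio * sample aspect ratio.
dar = Fraction(width, height) * sar
print(dar)  # 16/9
```

So the 640x512 stored frame is meant to be displayed at 16:9, not at the 5:4 that the raw pixel dimensions suggest.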
It would be nice if torchcodec at least provided information about the pixel aspect ratio. Since most downstream tasks will likely expect square pixels, it could also optionally provide frames scaled accordingly.
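To illustrate what "scaled accordingly" could mean, here is a minimal sketch that computes the square-pixel frame size from the stored size and the SAR. The helper name `square_pixel_size` and the policy of only ever enlarging one dimension are my assumptions, not existing torchcodec behavior. Today the SAR string would still have to be obtained separately, e.g. from ffprobe.

```python
from fractions import Fraction


def square_pixel_size(width: int, height: int, sar: str) -> tuple[int, int]:
    """Return (width, height) of the frame after resampling to square pixels.

    `sar` is a "num:den" string such as "64:45". Hypothetical helper,
    not part of torchcodec.
    """
    num, den = (int(part) for part in sar.split(":"))
    if num <= 0 or den <= 0:
        # FFmpeg represents an unknown SAR as 0/1; treat it as square pixels.
        return width, height
    ratio = Fraction(num, den)
    if ratio >= 1:
        # Wide pixels: keep the height, stretch the width.
        return round(width * ratio), height
    # Tall pixels: keep the width, stretch the height.
    return width, round(height / ratio)


print(square_pixel_size(640, 512, "64:45"))  # (910, 512)
```

For the file above this gives 910x512, which is 16:9 up to rounding. The resulting size could then be passed to whatever resize operation the downstream pipeline already uses.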