ViSTRA: Video Compression based on Spatial Resolution and Effective Bit Depth Adaptation

Mariana Afonso

University of Bristol

About

ViSTRA (ViSTRA2) is a new video compression framework that exploits adaptation of spatial resolution and effective bit depth: both parameters are down-sampled at the encoder based on perceptual criteria, and up-sampled at the decoder using a deep convolutional neural network.
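
To make the encoder-side idea concrete, here is a minimal sketch of the two down-sampling operations on a single luma plane. It rests on assumptions not stated on this page: the 2x2 box filter is a simple stand-in for whatever down-sampling filter the framework uses, effective bit depth is reduced by a plain right shift, and `naive_restore` is only a placeholder for the decoder-side CNN, which is not reproduced here. All function names are hypothetical.

```python
import numpy as np


def downsample_spatial(frame: np.ndarray, factor: int = 2) -> np.ndarray:
    """Average non-overlapping factor x factor blocks (box filter, an assumption)."""
    h, w = frame.shape
    h, w = h - h % factor, w - w % factor  # crop to a multiple of the factor
    blocks = frame[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3)).round().astype(frame.dtype)


def reduce_effective_bit_depth(frame: np.ndarray, bits: int = 1) -> np.ndarray:
    """Drop the least significant bits; the container type stays the same."""
    return frame >> bits


def naive_restore(frame: np.ndarray, factor: int = 2, bits: int = 1) -> np.ndarray:
    """Placeholder for the CNN: nearest-neighbour up-sampling plus a left shift."""
    up = np.repeat(np.repeat(frame, factor, axis=0), factor, axis=1)
    return up << bits


# Hypothetical 4x4 plane with 10-bit sample values stored in 16-bit integers.
frame = np.arange(16, dtype=np.uint16).reshape(4, 4) * 64
low = reduce_effective_bit_depth(downsample_spatial(frame))
print(low.tolist())  # [[80, 144], [336, 400]]
```

The low-resolution, reduced-bit-depth plane is what the host codec would encode; the quality of the final reconstruction then depends on how well the decoder-side network recovers the discarded detail, which the naive restore above does not attempt.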


Framework


Source code

Source code will be available on GitHub very soon.

Performance

ViSTRA2 has been integrated with the reference software of both HEVC (HM 16.20) and VVC (VTM 4.0.1), and evaluated under the Joint Video Exploration Team Common Test Conditions using the Random Access configuration. Our results show consistent and significant compression gains over both HM and VTM based on Bjøntegaard Delta measurements, with average BD-rate savings of 12.6% (PSNR) and 19.5% (VMAF) over HM and 5.5% (PSNR) and 8.6% (VMAF) over VTM.
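
As a rough illustration of how a BD-rate figure is obtained, the sketch below follows the classic Bjøntegaard calculation: fit a cubic polynomial to log-rate as a function of quality for each codec, average the gap between the two fits over the shared quality range, and convert it to a percentage. The function name `bd_rate` and the rate/PSNR points are hypothetical and are not the data behind the results reported here; the tool used for the reported numbers may differ (some implementations use piecewise cubic interpolation instead of a polynomial fit).

```python
import numpy as np


def bd_rate(rate_anchor, qual_anchor, rate_test, qual_test) -> float:
    """Average bitrate difference (%) of test vs anchor at equal quality."""
    qa = np.asarray(qual_anchor, dtype=float)
    qt = np.asarray(qual_test, dtype=float)
    log_ra = np.log(np.asarray(rate_anchor, dtype=float))
    log_rt = np.log(np.asarray(rate_test, dtype=float))

    # Cubic fit of log-rate as a function of quality, one curve per codec.
    poly_a = np.polyfit(qa, log_ra, 3)
    poly_t = np.polyfit(qt, log_rt, 3)

    # Integrate both fits over the quality range the two curves share.
    lo = max(qa.min(), qt.min())
    hi = min(qa.max(), qt.max())
    int_a = np.polyint(poly_a)
    int_t = np.polyint(poly_t)
    avg_a = (np.polyval(int_a, hi) - np.polyval(int_a, lo)) / (hi - lo)
    avg_t = (np.polyval(int_t, hi) - np.polyval(int_t, lo)) / (hi - lo)

    # Mean log-rate difference converted back to a percentage.
    return float((np.exp(avg_t - avg_a) - 1.0) * 100.0)


# Hypothetical four-point rate-quality curves (kbps, dB): the test codec
# needs exactly 10% less bitrate than the anchor at every quality level.
anchor_rate = [1000.0, 2000.0, 4000.0, 8000.0]
test_rate = [0.9 * r for r in anchor_rate]
psnr = [32.0, 35.0, 38.0, 41.0]
print(round(bd_rate(anchor_rate, psnr, test_rate, psnr), 2))  # -10.0
```

A negative value means the test codec needs less bitrate than the anchor for the same quality, so a "BD-rate saving of 12.6%" corresponds to a BD-rate of about -12.6%. The same function applies to VMAF by passing VMAF scores as the quality values.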

Subjective comparison

Demo Videos

FoodMarket2 HM (Middle) vs HM+ViSTRA (Right) @ ~1.3Mbps [mp4 download]
The blocks for comparison are at their original resolutions.

CatRobot1 VTM (Middle) vs VTM+ViSTRA (Right) @ ~1.2Mbps [mp4 download]
The blocks for comparison are at their original resolutions.

More Example Frames (at similar bit rates)
Example blocks cropped from the reconstructed frames generated by the anchor (HM 16.20 or VTM 4.0.1) and ViSTRA2 codecs.
Row 1 corresponds to the FoodMarket2 sequence when the bit rate is at approximately 1.3Mbps (HM is the anchor and the ViSTRA host codec).
Row 2 corresponds to the BasketballDrive sequence when the bit rate is at approximately 1.4Mbps (HM is the anchor and the ViSTRA host codec).
Row 3 corresponds to the CatRobot1 sequence when the bit rate is at approximately 1.2Mbps (VTM is the anchor and the ViSTRA host codec).
Row 4 corresponds to the Tango2 sequence when the bit rate is at approximately 2.2Mbps (VTM is the anchor and the ViSTRA host codec).
(Left to Right) Original Frame; Original Block; Anchor Block; ViSTRA Block.


Citation

@article{zhang2021vistra2,
  title={ViSTRA2: Video coding using spatial resolution and effective bit depth adaptation},
  author={Zhang, Fan and Afonso, Mariana and Bull, David R},
  journal={Signal Processing: Image Communication},
  pages={116355},
  year={2021},
  publisher={Elsevier}
}
[paper]

@article{afonso2018video,
  title={Video compression based on spatio-temporal resolution adaptation},
  author={Afonso, Mariana and Zhang, Fan and Bull, David R},
  journal={IEEE Transactions on Circuits and Systems for Video Technology},
  volume={29},
  number={1},
  pages={275--280},
  year={2018},
  publisher={IEEE}
}
[paper]