Day 1: 13-Dec-2022:
8:00 – 11:40: Tutorials A (Coffee Break 10 minutes)
Room1: 3D Signal Compression and Processing
Session Chair: Jin Zeng (Tongji University)
Lecturers:
· Xianming Liu (Harbin Institute of Technology)
· Yuanchao Bai (Harbin Institute of Technology)
· Wenbo Zhao (Peng Cheng Laboratory)
· Zhenyu Li (Harbin Institute of Technology)
Room2: Vision Transformer: More is different
Session Chair: Yuanfang Guo (Beihang University)
Lecturers:
· Qiming Zhang (The University of Sydney)
· Yufei Xu (The University of Sydney)
· Jing Zhang (The University of Sydney)
· Dacheng Tao (JD.com, Inc.)
Room3: Visual Content Creation: history, challenges and applications.
Session Chair: Xinfeng Zhang (University of Chinese Academy of Sciences)
Lecturers:
· Chenfei Wu (Microsoft Research Asia)
· Nan Duan (Microsoft Research Asia)
12:00 – 14:00: Lunch Break
14:00 – 17:10: Tutorials B (Coffee Break 10 minutes)
Room1: Linear Video Coding and Transmission Schemes for Next Generation Video Applications
Session Chair: Zhanyu Ma (Beijing University of Posts and Telecommunications)
Lecturers:
· Anthony Trioux (Univ. Polytechnique Hauts-de-France/INSA Hauts-de-France)
· François-Xavier Coudoux (Univ. Polytechnique Hauts-de-France/INSA Hauts-de-France)
· Marco Cagnazzo (Institut Polytechnique de Paris/University of Padua)
· Michel Kieffer (Univ. Paris-Saclay)
Room2: Representation, Evaluation and Utilities of Point Clouds
Session Chair: Ruiping Wang (Institute of Computing Technology, Chinese Academy of Sciences)
Lecturer:
· Weisi Lin (Nanyang Technological University)
Room3: Deep Learning for Light Fields
Session Chair: Siheng Chen (Shanghai Jiao Tong University)
Lecturer:
· Junhui Hou (City University of Hong Kong)
17:15 – 17:50: Demo Session (Room 1)
Session Chair: Li Li (University of Science and Technology of China)
1. FPX-NVC: An FPGA-Accelerated P-frame Based Neural Video Coding System
2. Real-time Learned Image Codec on FPGA
3. SalCrop: Spatio-temporal Saliency Based Video Cropping
4. Intelligent Reflection Elimination Imaging Device based on Polarizer
5. Portable Eye Movement Feature Collection Device for Children with Autism
6. Quality-Constant Per-Shot Encoding by Two-Pass Learning-based Rate Factor Prediction
Day 2: 14-Dec-2022:
8:30 – 9:00: Opening (Room 1)
9:00 – 10:00: Keynote 1 (Room 1)
Keynote Topic: Contemporary Visual Computing: A System Perspective
Keynote Speaker: Prof. Chang-Wen Chen (The Hong Kong Polytechnic University)
Session Chair: Jingjing Meng (University at Buffalo, SUNY)
10:00 – 10:10: Coffee Break
10:10 – 11:40: Oral 1
Room1: Machine Learning for Multimedia
Session Chair: Mengyuan Liu (Sun Yat-sen University)
1. One Shot Object Detection Via Hierarchical Adaptive Alignment
2. BAM: A Bi-directional Attention Module for Masked Face Recognition
3. MCascade R-CNN: A Modified Cascade R-CNN for Detection of Calcified on Coronary Artery Angiography Images
4. ACCR: Auto-labeling for Ancient Chinese Handwritten Characters Recognition on CNN
5. Improved PSP-Net Segmentation Network for Automatic Detection of Neovascularization in Color Fundus Images
6. Weakly Supervised Region-Level Contrastive Learning for Efficient Object Detection
7. A Large-scale Sports Tracking Dataset and Progressive Re-detection Based Sports Tracking
8. PickDet: A Detection Framework for Aerial-view Scene
9. ML-FDA: Meta-Learning via Feature Distribution Alignment for Few-Shot Learning
Room2: Learning Based Compression
Session Chair: Weiqi Yan (Auckland University of Technology)
1. Learned Lossless JPEG Transcoding via Joint Lossy and Residual Compression
2. CNN-Based Post-Processing Filter for Video Compression with Multi-Scale Feature Representation
3. Neural Frank-Wolfe Policy Optimization for Region-of-Interest Intra-Frame Coding with HEVC/H.265
4. A Learning-based Approach for Martian Image Compression
5. Frequency-aware Learned Image Compression for Quality Scalability
6. Reducing The Mismatch Between Marginal and Learned Distributions in Neural Video Compression
7. High-frequency guided CNN for video compression artifacts reduction
8. Autoencoder-based intra prediction with auxiliary feature
9. On Pre-chewing Compression Degradation for Learned Video Compression
12:00 – 14:00: Lunch Break
14:00 – 15:30: Oral 2
Room1: Machine Learning for Multimedia
Session Chair: Yongxin Ge (Chongqing University)
1. Clothing Retrieval from Class Aware Attention Embedding to KN Loss Learning
2. DE-CrossDet: Divisible and Extensible Crossline Representation for Object Detection
3. Mask-Guided Transformer for Human-Object Interaction Detection
4. ERINet: Effective Rotation Invariant Network for Point Cloud based Place Recognition
5. CdCLR: Clip-Driven Contrastive Learning for Skeleton-Based Action Recognition
6. Asynchronous Autoregressive Prediction for Satellite Anomaly Detection
7. Semantic Compensation Based Dual-Stream Feature Interaction Network for Multi-oriented Scene Text Detection
8. Annotating Only at Definite Pixels: A Novel Weakly Supervised Semantic Segmentation Method for Sea Fog Recognition
9. Cross-Layer Feature based Multi-Granularity Visual Classification
Room2: Learning Based Compression
Session Chair: Ye Luo (Tongji University)
1. End-to-end Image Compression with Swin-Transformer
2. Rate Controllable Learned Image Compression Based on RFL Model
3. Deep Reference Frame Interpolation based Inter Prediction Enhancement for Versatile Video Coding
4. Human pose-based video compression via forward-referencing using deep learning.
5. Improving Latent Quantization of Learned Image Compression with Gradient Scaling
6. Multi-stage locally and long-range correlated feature fusion for Learned In-loop Filter in VVC
7. Generalized Gaussian Distribution Based Distortion Model for the H.266/VVC Video Coder
8. History-parameter-based Affine Model Inheritance
15:30 – 16:00: Coffee Break
16:00 – 17:00: Keynote 2 (Room 1)
Keynote Topic: New frontiers in machine learning interpretability
Keynote Speaker: Prof. Mihaela van der Schaar (University of Cambridge)
Session Chair: Mathias Wien (RWTH Aachen University)
17:00 – 18:00: Grand Challenge
Room1: Tire pattern image classification based on lightweight network
Grand Challenge Chair: Ying Liu (Xi’an University of Posts and Telecommunications)
Schedule:
17:00 - 17:10 Challenge summary, announcing winning teams
Presenter: Ying Liu (Xi’an University of Posts and Telecommunications)
17:10 - 17:20 Presentation from Winning Team 1
Presenter TBD
17:20 - 17:30 Presentation from Winning Team 2
Presenter TBD
17:30 - 17:40 Presentation from Winning Team 3
Presenter TBD
17:40 - 17:50 Presentation from Winning Team 4
Presenter TBD
17:50 - 18:00 Conclusion and taking photos
Room2: Practical end-to-end image compression challenge
Grand Challenge Chair: Li Li (University of Science and Technology of China)
Schedule:
17:00 - 17:10 First Track (Coding Performance) - Ranking first team
Presenter TBD
17:10 - 17:20 First Track (Coding Performance) - Ranking second team
Presenter TBD
17:20 - 17:30 Second Track (Decoding Complexity) - Ranking first team
Presenter TBD
17:30 - 17:40 Second Track (Decoding Complexity) - Ranking second team
Presenter TBD
17:40 - 17:50 Third Track (Practical Solution) - Ranking first team
Presenter TBD
17:50 - 18:00 Third Track (Practical Solution) - Ranking second team
Presenter TBD
Day 3: 15-Dec-2022:
9:00 – 10:00: Keynote 3 (Room 1)
Keynote Topic: The future of video communication
Keynote Speaker: Dr. Baining Guo (Microsoft Research)
Session Chair: Jiwen Lu (Tsinghua University)
10:00 – 10:10: Coffee Break
10:10 – 11:40: Oral 3
Room1: Machine Learning for Multimedia
Session Chair: Yansong Tang (Tsinghua University)
1. On Data Annotation Efficiency for Image Based Crowd Counting
2. Blood Volume Pulse Signal Extraction based on Spatio-Temporal Low-Rank Approximation for Heart Rate Estimation
3. Space and Level Cooperation Framework for Pathological Cancer Grading
4. Dual-stream Self-attention Network for Image Captioning
5. STSI: Efficiently Mine Spatio-Temporal Semantic Information between Different Multimodal for Video Captioning
6. Texture-aware Network for Smoke Density Estimation
7. Identify, Guess and Reconstruct: Three Principles for Cloud Removal Task
8. MAiVAR: Multimodal Audio-Image and Video Action Recognizer
9. Blind Gaussian Deep Denoiser Network using Multi-Scale Pixel Attention
Room2: Video Coding
Session Chair: Cheolkon Jung (Xidian University)
1. Performance Analysis of WebRTC Embedding Optimized HEVC Codec
2. An Efficient Content-aware Downsampling-based Video Compression Framework
3. Fast Inter Prediction Mode Decision Method Based on Random Forest For H.266/VVC
4. Global Homography Motion Compensation for Versatile Video Coding
5. Adaptive boundary width of Geometric Partitioning Mode for Beyond Versatile Video Coding
6. Enhanced motion list reordering for video coding
7. Fast CU Partition Method Based on Extra Trees for VVC Intra Coding
8. Efficient Interpolation Filters for Chroma Motion Compensation in Video Coding
9. Block Importance Mapping for Video Encoding
12:00 – 14:00: Lunch Break
12:00 – 14:00: VSPC-TC Meeting (Room 1)
14:00 – 15:00: Panel 1 (Room 1)
Panel Title: Intelligent Medical Imaging
Panel Discussion Format:
1. Each panelist will first present for 8 minutes on his/her intelligent medical imaging research work (48 minutes)
2. The moderator will ask a few common questions for the panelists to answer (30 minutes)
3. Open to the audience for more questions (12 minutes)
4. Each panelist will be asked to share a one-sentence remark on intelligent medical imaging
Moderator:
Prof. S Kevin Zhou (University of Science & Technology of China)
Panelists:
Prof. Yuan Feng (Shanghai Jiao Tong University)
Prof. Qian Wang (ShanghaiTech University)
Prof. Yinghuan Shi (Nanjing University)
Prof. Xiahai Zhuang (Fudan University)
Prof. Wenxuan Liang (University of Science & Technology of China)
Prof. Dan Wu (Zhejiang University)
15:00 – 15:10: Coffee Break
15:10 – 16:40: Oral 4
Room1: Point Cloud Compression
Session Chair: Zheng Zhu (PhiGent Robotics)
1. Near-lossless Point Cloud Geometry Compression Based on Adaptive Residual Compensation
2. An efficient predictive wavelet transform for LiDAR point cloud attribute compression
3. Geometry Reconstruction for Spatial Scalability in Point Cloud Compression Based on the Prediction of Neighbours’ Weights
4. RGBD-based Real-time Volumetric Reconstruction System: Architecture Design and Implementation
5. PCGFormer: Lossy Point Cloud Geometry Compression via Local Self-Attention
6. Reduced Reference Quality Assessment for Point Cloud Compression
7. Distribution-aware Low-bit Quantization for 3D Point Cloud Networks
8. A Fast Motion Estimation Method With Hamming Distance for LiDAR Point Cloud Compression
9. Azimuth Adjustment Considering LiDAR Calibration for the Predictive Geometry Compression in G-PCC
Room2: Quality of Experience
Session Chair: Junlin Hu (Beihang University)
1. Video Quality Assessment based on Quality Aggregation Networks
2. No-reference Stereoscopic Image Quality Assessment Based on Parallel Multi-scale Perception
3. MSCI: A Multi-source Compound Image Database for Compression Distortion Quality Assessment
4. No Reference Stereoscopic Video Quality Assessment based on Human Vision System
5. A Fast and Effective Framework for Camera Calibration in Sport Videos
6. Ultra-High Resolution Image Segmentation with Efficient Multi-Scale Collective Fusion
7. Multi-information Aggregation Network for Fundus Image Quality Assessment
8. Semantic Attribute Guided Image Aesthetics Assessment
9. Quality Assessment of Screen Content Images Based on Multi-Pathway Convolutional Neural Network
17:00 – 17:30: Award Ceremony (Room 1)
Day 4: 16-Dec-2022:
9:00 – 10:00: Keynote 4 (Room 1)
Keynote Topic: More Is Different: ViTAE elevates the art of computer vision
Keynote Speaker: Prof. Dacheng Tao (JD Explore Academy)
Session Chair: Zhu Li (University of Missouri-Kansas City)
10:00 – 10:10: Coffee Break
10:10 – 11:40: Oral 5
Room1: Quality of Experience
Session Chair: Jiahuan Zhou (Peking University)
1. A Sparsity Analysis of Light Field Signal For Capturing Optimization of Multi-view Images
2. Spectral Analysis of Aerial Light Field for Optimization Sampling and Rendering of Unmanned Aerial Vehicle
3. High-Speed Scene Reconstruction from Low-Light Spike Streams
4. MRIQA: Subjective Method and Objective Model for Magnetic Resonance Image Quality Assessment
5. Recurrent Network with Enhanced Alignment and Attention-Guided Aggregation for Compressed Video Quality Enhancement
6. On the Importance of Temporal Dependencies of Weight Updates in Communication Efficient Federated Learning
7. SAD360: Spherical Viewport-Aware Dynamic Tiling for 360-Degree Video Streaming
8. Distinguishing Computer-generated Images from Photographic Images: A Texture-Aware deep learning-based Method
9. Flocking Birds of a Feather Together: Dual-step GAN Distillation via Realer-Fake Samples
Room2: Low-level data processing
Session Chair: Yue Zhao (Chongqing University of Posts and Telecommunications)
1. DesnowFormer: an effective transformer-based image desnowing network
2. A Comparative Study of Cross-Model Universal Adversarial Perturbation for Face Forgery
3. A Privacy-Preserving and End-to-End-Based Encrypted Image Retrieval Scheme
4. Image Inpainting with Frequency Domain Wavelet Convolution
5. Visual Analysis motivated Super-Resolution Model for Image Reconstruction
6. Single Image Super-Resolution Using ConvNeXt
7. Face Super Resolution based on Contrastive Learning
8. Refine-PU: A Graph Convolutional Point Cloud Upsampling Network using Spatial Refinement
9. Controllable Space-Time Video Super-Resolution via Enhanced Bidirectional Flow Warping
12:00 – 14:00: Lunch Break
14:00 – 15:00: Panel 2 (Room 1)
Panel Title: Deep Learning based Image and Video Compression
Panel Discussion Format:
1. Each panelist will first present for 6-10 minutes on the learning-based image/video compression he/she is working on (30-45 minutes)
2. The moderator will ask a few common questions for the panelists to answer (30 minutes)
3. Open to the audience for more questions (15 minutes)
4. Each panelist will be asked to share a one-sentence remark on learning-based image/video compression
Moderator:
Prof. Siwei Ma (Peking University)
Panelists:
Prof. Lu Yu (Zhejiang University)
Prof. Zhan Ma (Nanjing University)
Prof. Dong Liu (University of Science and Technology of China)
Dr. Jiaying Liu (Peking University)
Dr. Yan Wang (Tsinghua University)
15:00 – 15:10: Coffee Break
15:10 – 16:40: Oral 6
Room1: Special Session
Session Chair: Yueqi Duan (Tsinghua University)
1. Augmented Normalizing Flow for Point Cloud Geometry Coding
2. PointNetGeM: Simple and Efficient Point Cloud Based Network for Place Recognition
3. SparseARFM-SI: Rotary Point Cloud Place Recognition Based on Multi-Resolution and Attention Mechanism
4. Dynamic Mesh Commonality Modeling Using The Cuboidal Partitioning
5. 3D Tensor Display for Non-Lambertian Content
6. Spike Signal Reconstruction Based on Inter-Spike Similarity
7. Low Light RAW Image Enhancement Using Paired Fast Fourier Convolution and Transformer
8. Recurrent Multi-connection Fusion Network for Single Image Deraining
Room2: Multimedia Content Analysis, Representation, and Understanding
Session Chair: Jinglin Xu (University of Science and Technology Beijing)
1. Hierarchical Reinforcement Learning Based Video Semantic Coding for Segmentation
2. CFNet: A Coarse-to-Fine Network for Few Shot Semantic Segmentation
3. Robust Dynamic Background Modeling for Foreground Estimation
4. Mining Regional Relation from Pixel-wise Annotation for Scene Parsing
5. ENDE-GNN: An Encoder-decoder GNN Framework for Sketch Semantic Segmentation
6. Learning from the NN-based Compressed Domain with Deep Feature Reconstruction Loss