Jeongsoo Choi

Ph.D. candidate at KAIST

I am a Ph.D. student in the School of Electrical Engineering at KAIST. My research interests include multilingual and multimodal machine translation, audio-visual speech recognition, and talking face synthesis.

About Me


Education

  • KAIST (Sep. 2020 - Present)
    Ph.D. in Electrical Engineering
    Advisor: Prof. Joon Son Chung
  • KAIST (Mar. 2015 - Aug. 2020)
    B.S. in Electrical Engineering

Publications


2024

  • AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation
    Jeongsoo Choi*, Se Jin Park*, Minsu Kim*, and Yong Man Ro
    CVPR 2024 Highlight presentation
    [ paper | code | demo ]
  • Text-driven Talking Face Synthesis by Reprogramming Audio-driven Models
    Jeongsoo Choi, Minsu Kim, Se Jin Park, and Yong Man Ro
    ICASSP 2024
    [ paper | demo ]
  • Towards Practical and Efficient Image-to-Speech Captioning with Vision-Language Pre-training and Multi-modal Tokens
    Minsu Kim, Jeongsoo Choi, Soumi Maiti, Jeong Hun Yeo, Shinji Watanabe, and Yong Man Ro
    ICASSP 2024
    [ paper | code | demo ]
  • Exploring Phonetic Context-Aware Lip-Sync For Talking Face Generation
    Se Jin Park, Minsu Kim, Jeongsoo Choi, and Yong Man Ro
    ICASSP 2024
    [ paper ]
  • AKVSR: Audio Knowledge Empowered Visual Speech Recognition by Compressing Audio Knowledge of a Pretrained Model
    Jeong Hun Yeo, Minsu Kim, Jeongsoo Choi, Dae Hoe Kim, and Yong Man Ro
    IEEE Transactions on Multimedia
    [ paper ]
  • Multilingual Visual Speech Recognition with a Single Model by Learning with Discrete Visual Speech Units
    Minsu Kim*, Jeong Hun Yeo*, Jeongsoo Choi, Se Jin Park, and Yong Man Ro
    arXiv Jan. 2024, Under review
    [ paper ]

2023

  • DiffV2S: Diffusion-based Video-to-Speech Synthesis with Vision-guided Speaker Embedding
    Jeongsoo Choi*, Joanna Hong*, and Yong Man Ro
    ICCV 2023
    [ paper | demo ]
  • Lip Reading for Low-resource Languages by Learning and Combining General Speech Knowledge and Language-specific Knowledge
    Minsu Kim*, Jeong Hun Yeo*, Jeongsoo Choi, and Yong Man Ro
    ICCV 2023
    [ paper ]
  • Intelligible Lip-to-Speech Synthesis with Speech Units
    Jeongsoo Choi, Minsu Kim, and Yong Man Ro
    Interspeech 2023
    [ paper | code | demo ]
  • Many-to-Many Spoken Language Translation via Unified Speech and Text Representation Learning with Unit-to-Unit Translation
    Minsu Kim*, Jeongsoo Choi*, Dahun Kim, and Yong Man Ro
    arXiv Aug. 2023, Under review
    [ paper | code | demo ]
  • Watch or Listen: Robust Audio-Visual Speech Recognition With Visual Corruption Modeling and Reliability Scoring
    Joanna Hong*, Minsu Kim*, Jeongsoo Choi, and Yong Man Ro
    CVPR 2023
    [ paper | code | demo | data ]

2022

  • SyncTalkFace: Talking Face Generation with Precise Lip-Syncing via Audio-Lip Memory
    Se Jin Park, Minsu Kim, Joanna Hong, Jeongsoo Choi, and Yong Man Ro
    AAAI 2022 Oral presentation
    [ paper ]

Jeongsoo Choi @ KAIST