Shang Gao
2023
Sparse Frame Grouping Network with Action Centered for Untrimmed Video Paragraph Captioning
Guorui Yu | Yimin Hu | Yuejie Zhang | Rui Feng | Tao Zhang | Shang Gao
Findings of the Association for Computational Linguistics: EMNLP 2023
Generating paragraph captions for untrimmed videos without event annotations is challenging, especially when aiming to enhance precision and minimize repetition at the same time. To address this challenge, we propose a module called Sparse Frame Grouping (SFG). It dynamically groups event information with the help of action information for the entire video and excludes redundant frames within pre-defined clips. To enhance the performance, an Intra Contrastive Learning technique is designed to align the SFG module with the core event content in the paragraph, and an Inter Contrastive Learning technique is employed to learn action-guided context with reduced static noise simultaneously. Extensive experiments are conducted on two benchmark datasets (ActivityNet Captions and YouCook2). Results demonstrate that SFG outperforms the state-of-the-art methods on all metrics.
Can Pretrained Language Models Derive Correct Semantics from Corrupt Subwords under Noise?
Xinzhe Li | Ming Liu | Shang Gao
Proceedings of the 12th Joint Conference on Lexical and Computational Semantics (*SEM 2023)
For Pretrained Language Models (PLMs), their susceptibility to noise has recently been linked to subword segmentation. However, it is unclear which aspects of segmentation affect their understanding. This study assesses the robustness of PLMs against various forms of disrupted segmentation caused by noise. An evaluation framework for subword segmentation, named Contrastive Lexical Semantic (CoLeS) probe, is proposed. It provides a systematic categorization of segmentation corruption under noise and evaluation protocols by generating contrastive datasets with canonical-noisy word pairs. Experimental results indicate that PLMs are unable to accurately compute word meanings if the noise introduces completely different subwords, small subword fragments, or a large number of additional subwords, particularly when they are inserted within other subwords.
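As an illustration of how a single typo can change subword segmentation, the following is a minimal sketch and not the CoLeS probe itself. It assumes a toy greedy longest-match segmenter in the style of WordPiece; the vocabulary `VOCAB`, the word pair, and the helper names `segment` and `compare_pair` are hypothetical and chosen only for this example.

```python
import string

# Hypothetical toy vocabulary: a few whole pieces plus every single letter,
# both as a word-initial piece and as a "##" continuation piece.
VOCAB = (
    {"tolerance", "tol", "to", "##ance", "##er", "##ran", "##ce"}
    | set(string.ascii_lowercase)
    | {"##" + c for c in string.ascii_lowercase}
)


def segment(word, vocab):
    """Greedy longest-match-first segmentation (WordPiece-style, toy version)."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        # Try the longest remaining substring first, then shrink it.
        while end > start:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece  # mark non-initial pieces
            if piece in vocab:
                break
            end -= 1
        if end == start:
            # No piece matched at this position: the whole word is unknown.
            return ["[UNK]"]
        pieces.append(piece)
        start = end
    return pieces


def compare_pair(canonical, noisy, vocab):
    """Summarize how the noisy segmentation differs from the canonical one."""
    canon_pieces = segment(canonical, vocab)
    noisy_pieces = segment(noisy, vocab)
    return {
        "canonical": canon_pieces,
        "noisy": noisy_pieces,
        # Pieces of the noisy word that also occur in the canonical word.
        "shared": [p for p in noisy_pieces if p in canon_pieces],
        # How many more pieces the noisy word needs.
        "extra": len(noisy_pieces) - len(canon_pieces),
    }


report = compare_pair("tolerance", "tolernace", VOCAB)
print(report)
```

With this toy vocabulary, the canonical word is a single piece, while the transposed form splits into several small fragments that share nothing with it. The abstract names this kind of corruption (completely different subwords, small fragments, many additional subwords) as what PLMs handle poorly.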
2018
Hierarchical Convolutional Attention Networks for Text Classification
Shang Gao | Arvind Ramanathan | Georgia Tourassi
Proceedings of the Third Workshop on Representation Learning for NLP
Recent work in machine translation has demonstrated that self-attention mechanisms can be used in place of recurrent neural networks to increase training speed without sacrificing model accuracy. We propose combining this approach with the benefits of convolutional filters and a hierarchical structure to create a document classification model that is both highly accurate and fast to train – we name our method Hierarchical Convolutional Attention Networks. We demonstrate the effectiveness of this architecture by surpassing the accuracy of the current state-of-the-art on several classification tasks while being twice as fast to train.