
Extractive Chinese Spoken Document Summarization Using Probabilistic Ranking Models

Y. Chen, Suhan Yu, H. Wang, B. Chen
Published 2006 · Computer Science

The purpose of extractive summarization is to automatically select indicative sentences, passages, or paragraphs from an original document according to a certain target summarization ratio, and then sequence them to form a concise summary. In this paper, in contrast to conventional approaches, our objective is to deal with the extractive summarization problem under a probabilistic modeling framework. We investigate the use of the hidden Markov model (HMM) for spoken document summarization, in which each sentence of a spoken document is treated as an HMM for generating the document, and the sentences are ranked and selected according to their likelihoods. In addition, the relevance model (RM) of each sentence, estimated from a contemporary text collection, is integrated with the HMM to improve the representation of the sentence model. The experiments were performed on Chinese broadcast news compiled in Taiwan. The proposed approach achieves noticeable performance gains over conventional summarization approaches.
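
The ranking idea in the abstract can be illustrated with a small sketch. The abstract does not give the model's equations, so everything concrete here is an assumption: each sentence is modeled as a two-state unigram mixture (a sentence distribution interpolated with an add-one-smoothed background distribution), the weight `lam` is a fixed hypothetical value rather than a trained one, the summarization ratio is applied to the number of sentences, and the relevance model (RM) component is left out. The function names `rank_sentences` and `summarize` are illustrative only.

```python
import math
from collections import Counter


def rank_sentences(sentences, background, lam=0.5):
    """Rank sentence indices by the log-likelihood of the whole document.

    sentences:  list of token lists making up one document
    background: token list from a general collection (assumed input)
    lam:        assumed mixture weight for the sentence distribution
    """
    doc_counts = Counter(w for sent in sentences for w in sent)
    bg_counts = Counter(background)
    bg_total = sum(bg_counts.values())
    vocab_size = len(set(bg_counts) | set(doc_counts))

    scored = []
    for idx, sent in enumerate(sentences):
        sent_counts = Counter(sent)
        sent_len = len(sent)
        loglik = 0.0
        for word, count in doc_counts.items():
            # Maximum-likelihood estimate of the word under the sentence.
            p_sent = sent_counts[word] / sent_len if sent_len else 0.0
            # Add-one smoothed background estimate, so it is never zero.
            p_bg = (bg_counts[word] + 1) / (bg_total + vocab_size)
            loglik += count * math.log(lam * p_sent + (1 - lam) * p_bg)
        scored.append((loglik, idx))

    # Highest likelihood first; ties broken by original position.
    return [idx for _, idx in sorted(scored, key=lambda t: (-t[0], t[1]))]


def summarize(sentences, background, ratio=0.3, lam=0.5):
    """Pick the top-ranked sentences and restore document order."""
    k = max(1, int(ratio * len(sentences)))
    ranked = rank_sentences(sentences, background, lam)
    return sorted(ranked[:k])
```

Calling `summarize(sentences, background, ratio=0.3)` returns the indices of the selected sentences in their original order, which mirrors the "select, then sequence" step described above. In a fuller treatment the mixture weight would be estimated from data and the sentence distribution would be combined with a relevance model.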
