SummBank v. 1.0
LDC2003T16
http://www.summarization.com/summbank/

This CD-ROM contains 40 news clusters in English and Chinese, 360 multi-document, human-written non-extractive summaries, and nearly 2 million single document and multi-document extracts created by automatic and manual methods.

The collection was prepared as part of the Johns Hopkins summer workshop on Text Summarization (http://www.clsp.jhu.edu/ws2001/groups/asmd/).

To get started, go to the "documentation" directory.

Corpus and software:
    Dragomir Radev, University of Michigan
    Simone Teufel, University of Cambridge
    Horacio Saggion, University of Sheffield
    Wai Lam, Chinese University of Hong Kong
    John Blitzer, University of Pennsylvania
    Arda Celebi, USC/ISI
    Elliott Drabek, Johns Hopkins University
    Danyu Liu, University of Alabama
    Hong Qi, University of Michigan

CD and documentation:
    Tim Allison, tballiso@umich.edu, University of Michigan
    Dragomir Radev, radev@umich.edu, University of Michigan

Original data:
    The Government of the Hong Kong Special Administrative Region
    distributed by the Linguistic Data Consortium
    (Hong Kong News Parallel Text corpus (LDC2000T46))

Additional summarization software used to produce the summaries:
    Inderjeet Mani, Chin-Yew Lin, Greg Silber

Special thanks:
    Fred Jelinek, Sanjeev Khudanpur, Laura Graham, Jacob Laderman, Bill Byrne, Adam Winkel, Michael Topper, Sasha Blair-Goldensohn, Ralph Weischedel, Scott Boisen, David Day, Dan Melamed, Regina Barzilay, John Murdie, Hans van Halteren, Wessel Kraaij, Tristan Miller, Zhu Zhang, Anna Osepayshvili, Ali Hakim, and many others.