Scott Enderle
@scottenderle
DH at Penn Libraries. Increasingly stealthy. He/him, opinions mine, everything's a bookmark.
ID: 61625084
http://lagado.name 30-07-2009 23:01:10
5.5K Tweets
637 Followers
517 Following
Going to present the work "Improving Measures of Text Reuse in English Poetry: A TF–IDF Based Method" co-authored with @tedunderwood.me (is at 🦋, not here) at #iconference2021 on Wednesday. We validated the method through the example of text reuse between Yeats and the English Romantic poets.
We've developed a Gutenberg-HathiTrust parallel corpus of 19,049 pairs of uncorrected OCR + human-proofread books in 6 domains, publ. 1780-1993. Description: hdl.handle.net/2142/109695 HT Research Center @tedunderwood.me (is at 🦋, not here) J. Stephen Downie @gworthey Yuerong Hu