Text Analysis – Duke ScholarWorks

Computational work with text in the digital humanities can take many forms, ranging from creating digital editions to applying more automated techniques such as topic modeling and document classification. Duke Libraries can help you with:

Creating editions of texts: We have staff expertise in digital editing, including the use of the Text Encoding Initiative (TEI) guidelines to create sustainable, shareable editions of scholarly texts.

Acquiring a corpus and cleaning data: The Libraries offer digitization services for certain volumes and have several scanners on site for creating digital versions of printed texts. We also have staff expertise in optical character recognition (OCR) software and techniques for cleaning scanned data. For more information, please see Preparing Content for Digital Scholarship.

Text analysis techniques: We can help you plan your text analysis project, define its outcomes, and match your corpus to specific techniques such as topic modeling, sentiment analysis, and document classification.

ScholarWorks staff often teach hands-on workshops on topics related to text analysis; see our calendar of events for current and upcoming offerings.

To learn more about the possibilities for text analysis in digital scholarship, contact ScholarWorks staff to set up a consultation.