Just as digital scholarship can encompass many research methods and outputs, “content” can cover a very wide range of file formats and materials, including images, texts, data sets, 3D models, audio, and more. Below are links to resources within ScholarWorks and elsewhere in the Libraries that can assist researchers as they begin preparing digital content for presentation or analysis.

Digitization Resources in the Libraries

The Libraries offer resources and consultations about how to digitize physical materials. For single-volume titles published in 1922 or earlier, you can use the “Request Digitization” button from within a library catalog record to ask that the Libraries digitize the work. (Note that this service is available only for materials located in the Perkins/Bostock, Lilly, Music, or Library Service Center collections.)

The Libraries also have several flatbed, overhead, and sheet-feed scanners. More information about these scanners is available on the Libraries website.

You may also wish to consult with a Libraries staff member about digitization options, best practices, and appropriate file formats:

ScholarWorks staff can address questions about scanning texts, performing optical character recognition (OCR), and organizing corpora. Contact us at scholarworks@duke.edu to set up a consultation.

Resources for Cleaning Data

It’s often necessary to normalize or otherwise “clean” textual data before using it in research. ScholarWorks staff are available to consult with you about how to prepare data for various research tasks. Likewise, the Center for Data and Visualization Sciences (CDVS) at the Libraries has expertise in all phases of data creation and maintenance, and its staff offer workshops and online learning resources on topics including data cleaning (e.g., with OpenRefine).
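As a rough illustration of what normalizing text can involve, the following Python sketch applies a few common cleanup steps to a string of scanned or OCR'd text. The function name normalize_text and the particular steps chosen are assumptions made for this example, not a ScholarWorks or CDVS workflow; the right steps depend on the corpus and the research question.

```python
import re
import unicodedata

# Map typographic quotation marks to their plain ASCII equivalents.
QUOTE_MAP = str.maketrans({
    "\u2018": "'", "\u2019": "'",   # single curly quotes
    "\u201c": '"', "\u201d": '"',   # double curly quotes
})


def normalize_text(raw: str) -> str:
    """Apply a few illustrative cleanup steps to raw text (hypothetical helper)."""
    # Unicode NFKC normalization folds compatibility characters,
    # e.g. the single-character ligature "fi" becomes the letters "f" + "i".
    text = unicodedata.normalize("NFKC", raw)
    # Rejoin words hyphenated across a line break ("digi-\ntal" -> "digital").
    # Caution: this also joins genuine hyphenated compounds split at a line end.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Replace curly quotes with straight quotes.
    text = text.translate(QUOTE_MAP)
    # Collapse runs of spaces, tabs, and newlines into single spaces.
    text = re.sub(r"\s+", " ", text).strip()
    return text


if __name__ == "__main__":
    sample = "The  \u201cdigi-\ntal\u201d\tlibrary "
    print(normalize_text(sample))
```

Each step here loses information (original line breaks, typographic quotes), so it is usually worth keeping an untouched copy of the source files alongside any cleaned version.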

Collecting and Repurposing Digital Content

There are many ways to gather digital content aside from digitizing it yourself. For example, content can be collected via electronic resource APIs, web scraping, or bulk downloading from a database. Each of these approaches may have specific technical requirements that vary depending on the resource, and each has legal considerations that should be addressed appropriately before proceeding.
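One small technical check that sometimes comes up before web scraping is reading a site's robots.txt file, which states which paths the site asks automated tools not to fetch. The sketch below uses Python's standard urllib.robotparser module on a made-up robots.txt; the helper name is_allowed, the user agent "research-bot", and the example.org URLs are all hypothetical. Passing this check is not a legal determination and does not replace reviewing a resource's terms of use or license.

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt for illustration; a real one lives at <site>/robots.txt.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's contents as a list of lines,
    # so no network request is made in this sketch.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.org/public/index.html",
        "https://example.org/private/data.csv",
    ):
        print(url, is_allowed(EXAMPLE_ROBOTS, "research-bot", url))
```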

To discuss how to collect and reuse existing digital content, contact ScholarWorks staff (scholarworks@duke.edu) to schedule a consultation. For additional training and consultation on using APIs and processing structured data, contact the Center for Data and Visualization Sciences (CDVS).

Copyright, Permissions, Licensing, and Credit

Digitizing content and acquiring previously digitized content both require careful review of any legal and ethical issues that may govern access to (or reuse of) the material. ScholarWorks staff are available to help you identify and think through these issues.