Practice: Selection Policy

By now, you should have a decent idea of how you might go about selecting documents for your project. Use the tool below to record your ideas related to your selection policy and download a copy for later use.

Course Glossary

  • A publisher based at or sponsored by a university.

  • The access file is a derivative of the master file, produced by converting the master file to a smaller file format. Access files are suitable for presentation to researchers.

  • The condition of source materials being physically available to users and intellectually understandable by users.

  • The process of adding a new item to a collection.

  • The use of descriptive, contextual, referential, or illustrative content or structure that supports the discoverability and accessibility of source materials. Annotation may take many forms (footnotes, source notes, metadata, glossaries, essays, indexes, keywords, images, maps, and more) and multiple forms of annotation may be used by a project.

  • The “supports” of any edition (other than the reading text itself) that are created for the purpose of providing additional clarifying information. Typically, this term is applied to textual and contextual notes, but it can also apply to introductions, headnotes, dictionaries, lists, indexes, and appendices as well as newer, innovative annotation types, such as data visualizations.

  • A collection of textual and non-textual artifacts in physical and/or digital form; records created or received by a person, family, or organization and preserved because of their continuing value. See also the definition provided by the Society of American Archivists: https://dictionary.archivists.org/entry/archives.html.

  • A collection of “authority records” (usually in a database or in a structured data file like XML) with stable, reliable information about places, people, and other kinds of named entities.

  • A version of a person’s name that is used consistently whenever that individual is referenced, such as in annotation or metadata.

  • The back-end refers to the data (and/or database), site system, and structure underlying a digital project, whereas the front-end refers to the website’s style, appearance, and features (otherwise known as the user interface). Many websites rely on the communication between the back-end data and the web browser for displaying it.

  • The number of bits used to represent each pixel in an image.

  • An edition conceptualized with the goal of online publication, meaning that editorial policies are made considering the digital environment.

  • 1) The act of creating and maintaining a list that describes the content, structure, and/or administration for each source material within a collection. This can be created for the benefit of document control and/or discoverability of the materials.
    2) The act of making an edition discoverable within external infrastructures, such as library catalogs.

  • The act of seeking or identifying the location of source material for the purpose of acquisition, in some form of a print or digital copy, for research purposes.

  • An edition that includes both images and transcriptions. See in contrast to image-based editions and transcribed editions.

  • With respect to a collection of source materials, a comprehensive edition publishes all or nearly all of those materials. See in contrast to a selected edition.

  • A consistent or standardized way of describing data. For example, one practitioner working with poems may choose to describe the creators of these source materials as "Author," whereas a practitioner working with correspondence may choose to describe the creators of those source materials as "Sender." Regardless of what these practitioners choose, both have used a controlled vocabulary by standardizing how they describe the source material's creator. Practitioners can develop their own controlled vocabulary, use an existing controlled vocabulary, or a combination of both. Examples of existing controlled vocabularies can be found at the University of North Carolina Library: https://guides.lib.unc.edu/c.php?g=8749&p=44502.

  • Copyediting involves revising the text of your annotations and other apparatus to ensure that your work is clear and readable and that it conforms to conventional rules of grammar. See in contrast to proofreading.

  • An identifier placed on copies of a work to inform the world of copyright ownership.

  • The length of time that copyright applies to a work before it passes into the public domain.

  • A free license that provides all creators—from individuals to large institutions—with a standardized way to grant permission for public use of their creative work under copyright law. Learn more at the Creative Commons website: https://creativecommons.org/.

  • A particular kind of annotation that records textual notes about the sources of the reading text and, in some cases, information about authoritative readings when multiple versions of a text exist.

  • A reconstructive form of editing that establishes an authoritative reading text based on a critical examination of existing witnesses; that is, it imposes change on a text through correction, emendation, or apparatus. See in contrast to documentary/historical editing.

  • The user interface for a content management system's backend.

  • A surrogate file created from the original master file.

  • The process of rotating an image that has been scanned crookedly.

  • The production of printed matter by means of a desktop computer and page layout software that integrates text and graphics.

  • The act of using digital tools in the practice of editing source materials.

  • An edition published—and sometimes also prepared or edited—in a digital or online environment. A digital edition may be created instead of or in addition to a print edition.

  • The process of creating a high-quality digital copy of your source material.

  • A set of guidelines that governs the digitization of material and aligns that work with industry specifications.

  • A literal transcription of a document, where all words, including those that were added or deleted, are represented. See in contrast to normalized transcription.

  • A folder containing files.

  • The condition of source materials being findable by users, such as through browsing, filtering, or searching.

  • A handwritten, printed, or oral type of source material. Documents may include letters, diaries, financial records, invitations, event flyers, newspaper articles, poems, speeches, interviews and more.

  • Editing documents (either private or public documents) with the goal of making them accessible and, in some cases, reproducing their content as closely as possible to their original form. This form of editing has often been distinguished from critical editing, which focuses on editing documents with the goal of establishing an authoritative text, yet many practitioners now agree that historical editing incorporates many aspects of critical editing as reflected in decisions about presentation, formatting, and annotation.

  • The process of writing down the policy decisions that you have made in order to share them with readers and ensure that you apply them consistently.

  • The application of a system for locating specific documents or groups of documents in your collection. Creating such a system involves defining what metadata needs to be collected for each document, and then consistently and accurately collecting that metadata.

  • A structure developed for creating meaningful divisions between documents. Closely related to document control, document organization refers to the framework by which documents are organized, whereas document control refers to the practical application of that framework.

  • The act of gathering, preparing, and presenting source materials in such a way as to increase their accessibility and discoverability. This work may involve a variety of activities, such as collection, selection, digitization, cataloging, versioning, transcription, annotation, and encoding, and may result in a variety of presentations or publication outputs.

  • A print, digital, or hybrid publication resulting from the act of editing source materials.

  • See practitioner.

  • The decisions practitioners make regarding how to represent source materials in their edition. Practitioners may make decisions related to the selection of materials for publication, the form and focus of transcription, the form(s) of annotation, the elements to be captured in metadata, the processes for quality control, and more. Practitioners may choose to document any or all of these decisions for use in sharing internally and/or for informing users of their edition.

  • Changing the reading of a text to correct an inaccuracy or to reflect a judgment about an author’s intentions.

  • An edition prepared using TEI-XML—a descriptive, standardized XML-based language developed and maintained by humanists. More information about the TEI and how to use it can be found at the scholarly, non-profit TEI Consortium: https://tei-c.org/.

  • The act of using a computing language, such as Markdown or XML, to represent or describe source material. See especially text encoding.

  • A note or essay that follows the presentation of a document or other source material. It is a form of annotation used for providing information about the source material and/or for creating connections to relevant resources.

  • A guided search and navigation feature that lets users filter search results by selecting a range of different attributes. For example, a faceted search of place names allows you to search a list of place names mentioned in a collection of documents. See the Wikipedia entry for more information: https://en.wikipedia.org/wiki/Faceted_search.

  • The process of verifying the accuracy of information provided in annotation and citations.

  • A legal doctrine in the United States that permits the unlicensed use of copyright-protected works under certain circumstances. There are four factors that guide determination on whether the unlicensed use of a copyright-protected work is permissible or fair. These four factors are outlined at the U.S. Copyright Office Fair Use Index: https://www.copyright.gov/fair-use/.

  • The amount of space a file consumes on a storage medium.

  • A description that provides contextual and structural information about an archival resource.

  • A note attached to a specific element in an essay or source material (such as a word, sentence, section of an image or recording, etc.). It is a form of annotation used for providing information about that specific element and/or for creating connections to relevant resources.

  • Similar to an authority file (or list), a gazetteer is a database that combines name authority files into a stable, reliable source about places, people, and other kinds of named entities to which websites can connect. Gazetteers are often used in Linked Open Data (LOD) projects, which connect to gazetteers in order to link themselves to the “semantic web.” Examples include Wikidata, GeoNames, and VIAF.

  • A curated collection of resources, such as biographies, key terms, images, or other content. It is a form of annotation used to collect any resources that may be referenced frequently and make them available in one easily findable location.

  • A note or essay that precedes the presentation of source material or a collection of source materials. It is a form of annotation used for introducing or discussing the source material.

  • A term coined in the 1960s by Ted Nelson, hypertext is any text shown on a computer screen that can link out to other documents.

  • Created by Tim Berners-Lee around 1990, HTML was the first official instantiation of a hypertext data model and became the de facto language for writing and publishing on the World Wide Web.

  • An edition that presents images or facsimiles of the source materials. May also be referred to as a facsimile edition. See in contrast to transcribed editions and combined editions.

  • The level of detail portrayed in an image, measured in pixels per inch (PPI) or dots per inch (DPI).

  • While an index refers to a list at the end of a printed book that helps you find the location of certain references, indexing in a digital edition is a means of serializing its data so that certain semantic elements (identifiers, people, places, dates) can be processed and made accessible.

  • Usually in printed editions, the lemma signals the place in the reading text to which a note is referring.

  • An agreement to utilize or reproduce a creative work. May also be referred to as a “permission(s) agreement.”

  • Used for a typesetting machine that produces each line of type in the form of a solid metal slug.

  • The act of formatting text for HTML using a plain-text editor.

  • The master file is the original file, generally produced through scanning processes that attain a high-level specification. The master file is archived for long-term preservation.

  • The process by which communications (either verbal or textual) are delivered through a material such as a book or computer.

  • Essentially, data about data. It can be used to describe the content, physical or structural features, and/or administrative elements of data. In providing such descriptions, metadata supports the management and discoverability of data. See the University of North Carolina Library's definition of metadata for more information: https://guides.lib.unc.edu/metadata/definition.

  • With respect to a collection of source materials, a modified comprehensive edition publishes all materials that fit within a defined category. For example, a practitioner creating a modified comprehensive edition might select all materials from a specific range of years, a certain format of materials (e.g., letters, speeches, oral interviews), or all materials from a specific geographical area. See in contrast to a selected edition and a comprehensive edition.

  • A transcription of a document where the substance of the content is retained, but some elements like spelling, punctuation, or contractions are changed with the intention of improving the readability of the text. See in contrast to diplomatic transcription.

  • A letter seeking to obtain permissions from the copyright holder to use, reproduce, or adapt a creative work.

  • Any individual who practices editing or recovery for the purpose of promoting the accessibility and discoverability of source materials; or any individual who engages in the discussion, development, or use of tools or methodologies relating to those practices.

  • The act of confirming the presentation of a text, whether transcription or annotation, immediately prior to publication by reviewing and making any necessary revisions. See in contrast to copyediting.

  • Where documents or data come from, which individuals or repositories have previously owned them, and how we end up accessing them (or how they have changed, through mediation).

  • Refers to creative works that are not protected by intellectual property laws, such as copyright, trademark, or patent laws. When a work is released into the public domain, the public, rather than an individual author or artist, owns the work as a collective entity. This means that anyone can use or adapt the work without obtaining permission, but no one can ever own it.

  • The act of inviting the public to substantially contribute to project work. In the practice of editing and recovery, may include involving the public in the conceptualization of the project, crowd-sourcing transcriptions or annotations, and more.

  • The act of reviewing editorially produced content, like transcription or annotation, for the purpose of ensuring quality or accuracy. Practitioners may use one or more of a variety of processes for the purpose of reviewing their content, including copyediting, fact-checking, proofreading, tandem reading, and more.

  • The act of focusing research activities like archiving or editing on source materials that have previously been silenced, dismissed, neglected, or ignored in their collection, preservation, organization, description, or presentation.

  • A statement about the intellectual property rights regarding a resource, a legal document giving official permission to use a resource, or a statement about access rights.

  • With respect to a collection of source materials, a selected edition publishes only a subset of those materials. The practitioner decides what subset will be prepared and published within the edition and what materials fit within that subset. See in contrast to a comprehensive edition.

  • The process of deciding which source materials will be included in your publication.

  • The act of labeling texts with specific, meaningful categories for machine processing.

  • The ways that the organization and material forms of a book affect our interpretations and experiences of the text. See D. F. McKenzie’s Bibliography and the Sociology of Texts (1999).

  • Any handwritten, printed, oral, visual, kinetic, or physical item that a practitioner chooses to work with. Source materials may include diaries, letters, newspapers, poems, audio recordings, video recordings, novels, short stories, artwork, dances, objects, and more.

  • A note that describes a source material's provenance and/or creation. It is a form of annotation.

  • The quality of being based on or influenced by personal feelings, tastes, or opinions, or societal or historiographical practices and beliefs.

  • A written statement allowing users to request that an item be removed (e.g., from a public website) due to a possible copyright infringement.

  • The application of digital tools that mine and analyze text in the pursuit of finding new meanings or connections within the text.

  • The act of using the Text Encoding Initiative or TEI—a set of XML guidelines that have been developed to describe humanities texts—to edit source materials through encoding. More information about TEI and how to use it can be found at the TEI Consortium, a scholarly community that maintains the guidelines: https://tei-c.org/.

  • The core components of a textual artifact, including the material history of communications among humans and their underlying systems of publication and dissemination.

  • An edition that publishes transcriptions of documents. See in contrast to image-based editions and combined editions.

  • The act of interpreting and adapting source material to create a readable form or representation of it.

  • The particular design of letters, numbers, and symbols to be used for publication.

  • The degree to which a website's design presents clear pathways for users to navigate the site and offers features and functions that are practical and accessible.

  • How an individual (or user) interacts with a product (such as a website) and how that interaction is shaped by the product's design. Sometimes referred to as UX.

  • A sequence of tasks concerning the movement of work through a stage or stages in the preparation of an edition. A practitioner may design and employ a variety of workflows to suit their needs, including cataloging or digitization workflows, a quality control or verification workflow, a publication workflow, and more.
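
Three of the definitions above (bits per pixel, pixels per inch, and the space a file consumes on a storage medium) are linked by simple arithmetic, which can help when planning storage for master files. The following is a minimal sketch only: the function name and the sample page dimensions are illustrative assumptions, and it estimates uncompressed size, so real files saved in compressed formats will usually be smaller.

```python
def uncompressed_size_bytes(width_in, height_in, ppi, bit_depth):
    """Estimate the uncompressed size of a scanned image in bytes.

    width_in, height_in: physical size of the source in inches
    ppi: scanning resolution in pixels per inch
    bit_depth: number of bits used to represent each pixel
    """
    width_px = round(width_in * ppi)
    height_px = round(height_in * ppi)
    total_bits = width_px * height_px * bit_depth
    # Round up to a whole number of bytes.
    return (total_bits + 7) // 8


# Hypothetical example: a letter-size page scanned at 300 ppi in 24-bit color.
size = uncompressed_size_bytes(8.5, 11, 300, 24)
print(f"{size / 1_000_000:.1f} MB")  # prints "25.2 MB"
```

Doubling the resolution quadruples the pixel count, so the same page at 600 ppi would need roughly four times the space before compression.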