Tag Archives: XML

XML/HTML5 and Perpetual Learning in Public

This will be a short post, rooted in humility and in the value of staying open to the fact that there is always more to learn.

I have to admit that I had a momentary meltdown when I read the text of Melissa Terras’s inaugural lecture, “A Decade in Digital Humanities,” last week.  The provocation about text encoding and over-attachment to XML hit a nerve, especially since I’m counting down days to my departure for the first of several summer efforts to feed my brain.  On Sunday, I head to Nashville, where I will spend two weeks in an NEH-sponsored Institute on Advanced Topics in Digital Humanities focused on XML and XQuery.

Attending the institute is one of the kinds of activities in which I continue to engage as I consider the ways that the technologies of the present change our practice of the discipline(s) in which we study the past.  So this morning I was more pleased than I can say to come across an announcement for another learning opportunity that I plan to take advantage of this summer.

This year, the Balisage Markup Conference includes a pre-conference symposium focused on “mending fences” between XML and HTML5.  I’m particularly interested in the presentation of Alex Milowski of the University of Edinburgh; I quote the abstract here:

In the beginning, many presumed we would move to a world where XML documents and the applications that processed them would proliferate across the Web. The Web looked like a bright place for markup; technologies like XSLT made their way into the browser and linking standards were on their way. Yet, it didn’t happen. As browsers strengthened their ability to process information, render HTML documents, display media assets, and deliver applications, the role of XML was either pushed to the other side or used as a way to deliver data to applications within the browser via AJAX. The potential mismatches between the wants of the Web developer and the generic, impoverished nature of the DOM led to the development of JSON. In places where they might once have used XML, web developers have moved in droves to using JSON and HTML. XML has been removed from its role to convey data to applications, shunted to the server, and labeled legacy by many. With an uphill, generational challenge to bring it back within favor, the fundamental question is: Do we really want XML on the Web?

I’ve never gone to Balisage because the idea of “extreme markup” intimidates me more than a little.  Okay, maybe not as much as it used to now that I have sat in rooms where people have been teaching the uses of the R statistical programming language or what’s “under the hood” in Omeka.  But I think I can manage one little day of listening to people talking about the relationship between XML, the beloved tool of the text encoding community, and HTML.


Filed under Learning Technologies

Teaching History with XML/TEI: A Contribution to Liberal Education

During the discussion period of a NITLE webinar I participated in last week, a member of the audience asked me why we choose to use eXtensible Markup Language (XML) compatible with the Guidelines of the Text Encoding Initiative (TEI) in the Wheaton College Digital History Project.*  And I think a response to that question merits a post here since I use this blog as a space to offer information about digital humanities methods and their use in digital history.  I focus here on the practice as part of my work as an educator.  In a future post, I will speak to the question of using XML/TEI in historical scholarship.

Fundamentally, using XML/TEI in a teaching project like ours gives students a chance to learn something about the digital tools we use every day.  I think this kind of opportunity is an important component of liberal education as defined by the Association of American Colleges and Universities (AAC&U):

a philosophy of education that empowers individuals with broad knowledge and transferable skills, and a strong sense of value, ethics, and civic engagement.

Because I am a historian, I understand the broad knowledge and transferable skills referred to in this definition as contextual, as dependent on time and place.  So in my view, the technological developments of the past twenty or so years have created for those of us who live and work in the United States a culture so mediated by digital devices of various types that a basic understanding of those devices has become an essential part of a liberal education.

That is, I think it is part of my responsibility as an educator to help students understand the laptops and tablets and smart phones of our daily lives as comprehensible machines, because we use them both to consume and produce the stuff of our culture.  I also think that a minimal understanding of how those devices work empowers students to put their values and ethics to use in the form of civic engagement and other elements of a fulfilling human life.

Now, this does not mean that I think I need to teach my students to become programmers or even that I think I need to be a programmer.  My colleagues in computer science teach students programming and machine structures and computational thinking.  And those colleagues are better able than I to speak to larger questions of the strengths and weaknesses of XML from those perspectives.

I am a historian, and my main goal in using XML/TEI in my teaching is to give students an opportunity to spend time with primary sources in a particular kind of way that is facilitated by using these tools.  But before I explain this point, I want to say just a bit more about the value of knowing at least a little bit about XML as an educated citizen of our world.  And that requires defining XML without getting too technical.  So here goes.

XML stands for eXtensible Markup Language.  You can look it up on Wikipedia, which also has a more general entry on markup language.  But those entries go into a lot of historical and somewhat technical detail.**  Boiled down to essentials, there are only a few things that make knowing a little bit about XML a useful thing at our moment in time and place:

  • A lot of the applications that we use every day store our data using XML.  If you use Microsoft Office (Word, or Excel, or PowerPoint) or analogous applications from OpenOffice.org or Apple iWork, when you save your work, the application preserves your work in XML.
  • XML is commonly used not only for storing data but also for its exchange over the Internet.

So XML is all around us.  We use it all the time.  And so do professionals who specialize in storing and accessing information.

  • XML is a very stable format for storing data and metadata (that is, information about information).
  • XML is so stable that it is a preferred archival format among libraries and other cultural heritage organizations all over the world.  This means that even if the applications you use now disappear, new software can be written to display your information on whatever new generations of devices exist at the time.
  • XML is built to be used internationally, with the facility to include characters in any alphabet.  So you can store data that uses Chinese logograms or Cyrillic characters; you need not confine your language to English or French or some other European language.

So, XML is one of the important building blocks of the way we store and exchange information every day.  We don’t usually think about it, but it underlies a lot of what we do, and thus we can say that knowing a bit about it could be part of the broad knowledge and transferable skills that make up a liberal education.
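
For readers who would like to see these points in action, here is a minimal sketch in Python, using only its standard library.  It saves a tiny record containing English, Cyrillic, and Chinese text as XML and then reads it back, as a different program might do years later.  The element name "word" and the attribute name "lang" are invented for this illustration; they do not come from any particular application or standard.

```python
import xml.etree.ElementTree as ET

# Build a tiny record. The names "record", "word", and "lang" are
# invented for this sketch, not taken from any real application.
record = ET.Element("record")
for lang, text in [("en", "potato"), ("ru", "картофель"), ("zh", "土豆")]:
    word = ET.SubElement(record, "word", {"lang": lang})
    word.text = text

# Serialize to UTF-8 bytes, as an application would when saving a file.
saved = ET.tostring(record, encoding="utf-8")

# Another program can later parse the same bytes and recover the text.
restored = ET.fromstring(saved)
words = {w.get("lang"): w.text for w in restored.findall("word")}
print(words)
```

The point of the sketch is only that the saved bytes are plain, labeled text: any software that can read XML can recover the words, whatever alphabet they were written in.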

Why use XML to teach students how to do history?

A lot of teaching students how to do history involves giving them many opportunities to spend time examining primary sources, which are the evidence out of which we create historical knowledge.  As historians, we explore information that people created in the past and make arguments about what those people did and why or how their actions were significant.  We ask questions prompted by the documents, and we look for information in other documents based on those questions.  But how do we know what questions to ask?  How do we learn enough about a particular document to have a good idea of what other documents to examine next?

One way we do these tasks is through close reading, by which I mean getting to know a document, its author, its audience, its context.  And historians have been transcribing documents as a practice related to close reading for a long time.  In fact, transcribing sources is a basic research skill that students learn early in their educations; it is not a skill restricted to the practice of history.  When we do research, we take notes.  We might say that good transcription and note-taking are some of the transferable skills of a liberal education.

Teaching students to use XML as they transcribe primary sources promotes close reading.  That is, asking students to transcribe primary sources and embed information about the sources in the files that hold transcriptions gives students opportunities to get to know the sources deeply in ways that help students learn how to interpret the sources, ask questions about them, find related sources, and build arguments grounded in historical evidence.
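
To give a rough sense of what "embedding information" in a transcription can look like, here is a hedged sketch.  The ledger entry, the customer, the date, and the quantities are all invented for illustration, and the element and attribute names are made up for simplicity; they are not the markup we use in the Wheaton College Digital History Project.  The short Python function (standard library only) then pulls the marked-up commodities back out of the transcription.

```python
import xml.etree.ElementTree as ET

# An invented account-book entry with made-up element names.
# A real project would choose its elements from a shared standard.
TRANSCRIPTION = """
<entry date="1845-10-03">
  <person role="customer">A. Customer</person> bought
  <item commodity="potatoes" quantity="12" unit="bushel">12 bu. potatoes</item>
  for <price currency="USD">3.00</price>
</entry>
"""

def list_commodities(xml_text):
    """Return (date, commodity, quantity, unit) for each marked-up item."""
    entry = ET.fromstring(xml_text.strip())
    return [
        (entry.get("date"), item.get("commodity"),
         item.get("quantity"), item.get("unit"))
        for item in entry.iter("item")
    ]

for row in list_commodities(TRANSCRIPTION):
    print(row)
```

Deciding that "12 bu. potatoes" names a commodity, a quantity, and a unit is exactly the kind of small interpretive decision that slows a student down and forces a closer look at the source.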

The story I like to tell to illustrate this process comes from a time I was teaching a course on historical methods a few years ago.  I asked the students to transcribe and mark up some pages from an account book that was kept in a store in a nineteenth-century New England town with a mixed agricultural and industrial economy.  The students happened to be transcribing pages that included the purchase and sale of a lot of potatoes, and they wanted to know more.  So we talked about agriculture and the seasonal cycles of planting and harvest.  We talked about how potatoes grow and buying seed potatoes.  And we considered potato blight, the Irish famine, and the dates of the transactions the students were transcribing.  All of this discussion was fine enough, but none of it led to any particularly satisfying interpretations of the information the students had found.

So we all did some more research, this time in secondary sources.  And we finally found an article in a journal focused on Vermont history that helped us make sense of all those potatoes.***  Because in that article, we read about the need for starch in the process of textile production in New England factories.  And we also learned that around the same time we had discovered all those potatoes being bought and sold, the people who ran textile factories used starch that was made from potatoes.

Now, I do not by any means wish to claim that this anecdote is a story of professional scholarship.  If I were using the primary source my students were transcribing as part of a scholarly research project, I might or might not focus on the potato question as a significant one for the larger project.  And even if I did for some reason need to know more about those potatoes, I would probably go about the next steps in my research differently from the way that my students and I had time to do in one assignment in a semester-long course.

But I do feel comfortable claiming that this exercise in figuring out a possible story behind all those potatoes was an effective lesson for the students in the process of doing historical research.  The students had a genuine intellectual experience that arose from close reading of a primary source.  They learned that spending time with a source can lead to interesting questions and that following where those questions lead can turn up unexpected information about the past.

For me as an educator, the value of the great potato quest lay in the opportunity it gave students to practice historical research.  And I would argue that asking students to transcribe the source and embed information about the source using XML facilitated the slowing down, the taking time, the close reading that is a significant skill for the practice of history.  In this case, XML was a tool for creating the conditions that helped students learn.  And that is the only good reason to use any technology in the classroom.

I haven’t said anything in this post about the Guidelines of the Text Encoding Initiative (TEI), which shape the kinds of information we embed in XML files in the Wheaton College Digital History Project.  Those guidelines are part of the use of XML in research and scholarship, so I will speak to them in a future post.


*Michelle Moravec organized the webinar, and Georgianne Hewett managed the tools that we used to present it.  Presenters focused on using digital tools in our history teaching.   Aaron Cohen presented his work using History Pin–a tool for managing images and creating exhibits–with students at Slippery Rock University.  Michelle showed a website that she and her students created using WordPress along with images of stained glass windows and a map of the college chapel at Rosemont College.  And I offered my usual presentation about our use of   The slides from all of the presentations are available here.  Amanda Hagood, who is Director of Blended Learning at Associated Colleges of the South, asked the question that prompted me to write this post.

**For more detail and an introduction to working with XML, see Joe Fawcett, Liam R.E. Quin, Danny Ayers, Beginning XML, 5th Ed. (Indianapolis, Ind.: John Wiley & Sons, Inc., 2012).

***David Demeritt, “Climate, Cropping, and Society in Vermont, 1820-1850,” Vermont History (1991) 59/1: 133-165.







Filed under digital tools

XML: The Latin of Digital Scholarship?

I’ve been playing with this analogy for a while, and I was pleased to hear the silence of assent when I took it out for a trial run at a session on Big Data at THATCamp Kansas a few weeks ago. It elicited some resistance at another moment that weekend, and I’m interested in the contextual differences.

The second group with whom I discussed my notion represented a couple of constituencies that I’m less familiar with in digital humanities, those interested in the semantic web and those who work with the languages that power social media. These folks mentioned Django, which is based on Python and was developed in Lawrence, Kansas. I haven’t yet learned Python, though I know about it, and William J. Turkel and Alan MacEachern’s The Programming Historian is bookmarked on my browser. (Thank you once again, Canada, for your excellent support of digital scholarship.)

My young colleagues pointed to Web 2.0, Facebook, and Google as examples of common tools not based on XML. I learned a lot from them–I’d never heard of Django before that conversation. But I don’t think their point invalidates my own.

I mean, after all, to point here to certain historical effects, including the use of Latin as the language of scholarship and diplomacy in medieval Europe. (Easy for me, you may say, since I’m not a medievalist.) Thus, I think the analogy may be apt since XML lies behind long-term developments in what was long ago called Humanities Computing: efforts to consider how computers might facilitate humanities research, in Medieval and Classical Studies in fact.

Since the language also underlies such proprietary applications as Microsoft Word and Excel, the analogy also alludes to the place of Latin as the foundation upon which the Romance languages were built. Apt again, perhaps, since computational linguistics also makes use of XML.

I ponder this analogy because I want to better articulate the significance for liberal education of the effects of digital innovations on scholarship. And as I do so, I seek to understand digital scholarship in the larger landscape of digital culture.

I think that learning to feel comfortable with one type of coding (XML) can help humanities students develop the confidence to explore additional languages–like Python–and become ever more nimble citizens of their digital world.


Filed under digital humanities, education

User-Friendly XML

As I continue to think through how I do history digitally, I note both that historians have been using computers for a long time and that what I do differs from the statistics-heavy social science computing people were learning when I was in graduate school. Programs like SPSS didn’t seem relevant to my dissertation project, which focused on small communities that would not have yielded statistically significant analysis. I didn’t know about ArcGIS, and it might be interesting to see what one could learn by overlaying census data on Whitney Cross’s maps of the Burned-over District. Might, at some other point.

I’m struck by how easily I accepted the idea that transcribing and marking up journals, diaries, and now financial records could yield interesting results for understanding the nineteenth-century United States. But an analogy that came to me this morning clarifies the process for me.

I’ve noted here before that I came to comfort with code as a result of the coincidence that my post-secondary education began just at the moment that computing was becoming democratized. At Rice, my own experience with mainframes began with learning to use word processors to type papers. In my early post-collegiate jobs, my comfort with learning to use similar applications earned me a position as the WordPerfect expert among the secretarial staff of a department at the UVA Medical School. I bought my first PC in grad school and developed minimal comfort with DOS, but I didn’t become a power user until I bought my first Mac and learned the joy of the Apple interface.

My development as an academic user coincided with the spread of the Internet in the 1990s, though I remained a low-end user focused on email and word processing until my first exposure to TEI and XML in 2004. The utility of statistical data remained relatively opaque to me, and my fondness for Macs and parallel contempt for Windows as a DOS-impaired lesser version of the Apple interface prevented my exploring possibilities. Coupled with my interest in pedagogical uses of technology, the advent of the World Wide Web led to my involvement in discussions about cross-platform applications, and I became more and more comfortable in conversations about technology. Thus, I had been primed for the next stage–learning about XML through exposure to TEI and therefore becoming a different kind of academic user.

The analogy between the comparative difficulties of DOS/SPSS and Mac/XML has considerable explanatory power for me as I think about how I have come to be convinced that XML/TEI tools for transcription and markup have a place in undergraduate classrooms. I think it goes a long way towards expressing some of the assumptions behind my notion that liberal education should include exposure to computational thinking.


Filed under digital humanities, liberal education