To kick-start our LINGUIST List Research Colloquium, we spent two weeks reading and discussing two articles on best practices in language documentation: “The Seven Dimensions of Portability for Language Documentation and Description” by Steven Bird & Gary Simons (2003) and “Electronic Grammars and Reproducible Research” by Mike Maxwell (2012). Since language documentation is our business at the LINGUIST List, it is good to keep up to date on new methods in fieldwork and research, and these two articles generated good discussion on the advantages and challenges of language documentation.
In Week One, Bryn Hauk from Eastern Michigan University led the discussion on Bird & Simons 2003, which lays out best practices for archiving and documenting linguistic data so that it is more accessible and longer lasting. Bird & Simons detail the problems of language archiving and documentation and how they think these problems should be addressed for the betterment of linguistic research, especially the documentation of endangered languages. They believe that all linguistic data should be “portable”, that is, able to be “ported” to and accessed on multiple systems and technologies. The “seven dimensions of portability” mentioned in the title are:
1. Content (the quality of the data recorded)
2. Format (using XML and Unicode to streamline and standardize documentation)
3. Discovery (making resources easier to find by researchers)
4. Access (making it easier to obtain and access resources)
5. Citation (providing citations to online sources and reducing broken links)
6. Preservation (digitizing records and having back-ups for resources)
7. Rights (protecting intellectual property rights, but also limiting restrictions to research)
Essentially, as linguists, we should be aiming for clarity in our research and documentation, so that they remain accessible to future generations of linguists and language enthusiasts.
In Week Two, Brent Woo of Eastern Michigan University led the discussion of the second paper, “Electronic Grammars and Reproducible Research” by Maxwell 2012. We discussed Maxwell’s argument that linguists should use computational tools, such as XML tagging, to reduce ambiguity in annotation and rule-writing. Such descriptions should be understandable to humans as well as computers so that linguistic research is reproducible, yet they should not be tied or limited to “any particular linguistic theory” or any “particular computational tool”; in other words, they should not be bound to technology that may be obsolete or difficult to use in five years. This is important to keep in mind as technology keeps advancing at an increasing rate and new technologies can become obsolete within months. If our annotations or documentation are recorded in obsolete formats, we may not be able to access them in the future, and the data could be lost.
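To make the idea of XML tagging a little more concrete, here is a minimal sketch in Python of what a tool-independent annotation might look like. It is not taken from either paper: the element and attribute names (`word`, `morpheme`, `form`, `gloss`) and the helper `build_gloss` are hypothetical, chosen only for illustration, and the example word is Spanish "niños" (niño + -s, 'child' + plural). The point is simply that explicit tags plus Unicode text can be read by a person and parsed back by any standard XML library.

```python
import xml.etree.ElementTree as ET


def build_gloss(word, morphemes):
    """Return an XML element for one word and its morphemes.

    `morphemes` is a list of (form, gloss) pairs. The tag and attribute
    names are hypothetical, not drawn from any published schema.
    """
    word_el = ET.Element("word", {"form": word})
    for form, gloss in morphemes:
        morph_el = ET.SubElement(word_el, "morpheme", {"gloss": gloss})
        morph_el.text = form
    return word_el


# Annotate a single word: each morpheme carries an explicit gloss.
entry = build_gloss("niños", [("niño", "child"), ("-s", "PL")])

# Serialize to UTF-8 bytes, a plain-text form that is not tied to one program.
xml_bytes = ET.tostring(entry, encoding="utf-8")

# Parse the bytes back to check that nothing was lost in the round trip.
parsed = ET.fromstring(xml_bytes)
glosses = [(m.text, m.get("gloss")) for m in parsed.findall("morpheme")]

print(xml_bytes.decode("utf-8"))
print(glosses)
```

Because the serialized form is plain UTF-8 text with self-describing tags, it could in principle be opened in a text editor or reprocessed with a different tool years later, which is the kind of longevity both papers argue for.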