The fall breeze brought with it the beginning of a new semester and a new season for our team of highly motivated summer interns at LINGUIST List, most of whom have just left us to continue their linguistics endeavors. We are very grateful for their hard work and for their priceless contributions to multiple LINGUIST List projects, including GORILLA, MultiTree, LL-MAP and GeoLing! These projects were all started some time ago, and they were brought much closer to completion this summer. We are now very excited to let the interns tell you what they did over the last few months.
GORILLA is an exciting project that is currently being built. Its goal is to create a unified source of annotated corpora for languages around the world, with an emphasis on endangered and under-resourced languages. So Eun, Julian, Simon-Pierre, Clare and Will contributed hugely to this project by working on novel speech corpora for Korean, German, and Kinyarwanda, and by revamping and annotating the AHEYM speech corpus for Yiddish.
“This summer, I helped to develop the Yiddish Speech Corpus: I transcribed, transliterated, and annotated Yiddish speech and developed corpus metadata. I coordinated with Will and So Eun, and together we annotated over 5 hours of media for the corpus, including interviews, poetry and audio books.”
“Over the course of the Linguist List internship, I have worked on collecting and producing speech corpora for the Yiddish and Korean languages. For the Korean corpus, I gathered texts in Korean from non-copyright-restricted online sources, made recordings of said texts, and annotated each recording using ELAN. As to the Yiddish corpus, I helped with annotating the Yiddish recordings available at Indiana University’s Archives of Historical and Ethnographic Yiddish Memories (AHEYM) by segmenting audio files as well as converting and copying Yiddish (orthographic and YIVO/romanized) transcriptions onto the ELAN annotations.”
“While interning at LINGUIST List this summer, I was involved in one main project, and several smaller ones as well. I was told about the speech corpus I would be working on, and shown how to use the program necessary for it. I started off making audio recordings, and then transcribing them to text using ELAN. This took up the majority of my time interning here, but was very useful. After I had completed the transcriptions, I was given some smaller tasks, such as improving LINGUIST List’s website by cleaning up old links. I feel that my time interning here was useful and well spent, and has helped expand my skill set.”
These three projects are valuable tools that have been in the making for quite some time here at LINGUIST List. Thanks to some of our 2016 interns, these tools are now improved!
MultiTree is a digital library of scholarly hypotheses about language relationships and subgroupings, organized in a searchable database with a fancy web interface. Noah, Chloe and Arjuna spent the summer working on the structure of this useful web interface, providing you with the new and improved MultiTree!
MultiTree interacts with the LL-MAP project, a geolinguistic database that provides users with a fully functional Geographical Information System (GIS) through which linguistic data – including subgrouping information – can be viewed in its geographical context. Jacob led this project, assisted by Chloe.
GeoLing is also an interactive map service, but with a different goal. It displays linguistics-related information from around the world on a map: jobs, conferences, internships, and, for the first time on LINGUIST List, local events. Lewis spent much time and effort reorganizing the data for this project, and with the help of Noah and Arjuna, the team was able to implement it on the website!
“I have spent the summer working on the LL-MAP project, which had been offline for several years. I began by identifying and correcting issues with the geometry and attribute data of the maps in our PostGIS database and KML files to allow them to display properly in viewers like QGIS, Google Earth, and OpenLayers. I also corrected the styles corresponding to the maps, according to recommendations by Jacob Henry, in order to show the colors, labels, and other visual aspects as they appear in the original source. Once the maps had been uploaded into GeoServer, I went through them to identify specific problems and fixed display issues with several dozen maps. Finally, I contributed along with several other interns to the new LL-MAP viewer. I would like to thank Lwin Moe and Damir Cavar for their help at every step of the process, and Damir and Malgosia Cavar for the opportunity to take part in this project.”
“As a summer intern at the Linguist List, I worked on improving the MultiTree and LL-MAP sites. Before I started, I had played around with the old and new MultiTree but didn’t know how the trees were generated. With some training in Django and D3 data visualizations, I was able to get behind the scenes of MultiTree and start exploring different tree views using the data from the Linguist List. Because of the variety of visualization options, I learned to put myself in the user’s shoes and to decide what features to prioritize in order for the site to be more helpful to the linguist community.
After MultiTree, I helped the LL-MAP team with their project. Working on the new LL-MAP was a dynamic process because we constantly adjusted our tasks based on user feedback. The result was an elegant viewer page that provides as much information as possible in a simple and organized way.
One thing I learned from my internship experience is the difference between a classroom assignment and a real project. For both MultiTree and LL-MAP, we had a lot of freedom deciding what to work on as a team as opposed to being assigned specific tasks, with the goal of making the site more informative and easier to use. I’m glad to have gained the experience of collaborating with teammates, and learning to solve issues creatively and efficiently.”
We sincerely enjoyed having these burgeoning linguists join our team, and we even have the pleasure of having Jacob and Clare stay on at LINGUIST List after the end of their internships! Thanks to the devoted work of the 2016 LINGUIST List summer interns, some novel and valuable language resources have now been created: their contribution goes beyond the limits of LINGUIST List, and is truly a contribution to the linguistics community around the world. We now invite you all to enjoy these new tools, which have been developed over the years by many different hands, and most recently by the LINGUIST List 2016 intern crew!