It has been a busy summer here at The LINGUIST List! Please take a moment to check out the projects that our 2015 summer interns and volunteers have been working on!
Edvard is currently working on MultiTree, a searchable and easily accessible database of hypotheses on language relationships. He searches for linguistic publications in Russian that are less accessible to the global linguistic community. Specifically, he analyzes Russian publications on language families and updates MultiTree with their hypotheses for further reference. In the interest of making the GORILLA website interface multilingual, he also translates its content from English to Russian.
Alec spends most of his time at the LINGUIST List creating the official LINGUIST List Google Chrome App, which will soon provide easy access to the upcoming GeoLing map and other LINGUIST List resources. He is also in the process of writing a script that automatically collects language data from Wiktionary and other open-source databases, and has so far used the program to extend the LINGUIST List’s Yiddish lexicon.
Clara García Gómez
Clara is mainly involved in the GORILLA Project, creating a speech corpus for Castilian Spanish, of which she is a native speaker. She is creating the materials necessary for automatic alignment and transcription. She also works on translating parts of the website into Spanish and on some editing tasks for LINGUIST List. She is interested in the study of undocumented languages, so she is happy to participate in GORILLA and hopes to contribute to this project further after creating the corpus for Castilian Spanish.
Jacob has spent most of his time working on the LL-MAP project, a large collection of maps containing linguistic and geographic information to be used by linguists, anthropologists, and other researchers. The LINGUIST List's relocation to Indiana University became an opportunity to relaunch and redesign the technologies. This has involved porting all of the accumulated data to new servers and testing various file formats to find the easiest to work with for our purposes. We've made some progress and, ideally, we will be able to relaunch LL-MAP by the end of the summer.
Seyed started working on the Baharlu dialect of the South Azeri Turkic language. It is spoken in western Iran, in an area neighboring the Persian, Kurdish, and Lori languages. He studied the different writing styles in use in order to produce the most suitable transcriptions. He also needed to study the standards for romanization of Baharlu Turkic. He worked on sample recordings, creating transcriptions, romanizations, and translations.
During this work he has also started preparing a Baharlu-English dictionary that includes the original word, its romanization, and its English translation, and that will be completed with other elements such as lemma, PoS, and pronunciation information.
For the last two weeks, Petar has been mainly working on the Automatic Speech Recognition Project. Currently, he is working on the Croatian speech corpus and ASR. The first part of the project consists of making recordings and transcribing them. Along with building the corpus, he has been going through the documentation on Chrome Apps, and starting this week he will work alongside Alec on the LINGUIST List Chrome App. By the end of his internship, he would like to have a working Croatian speech recognizer and an application that will ease the use of various LINGUIST List features.
Zac has been working primarily on the front and back end of GeoLing, which can be found at geoling.linguistlist.org. He has also contributed to the GORILLA project (gorilla.linguistlist.org), including the development of resources to be provided by GORILLA.