TALAf workshops take place every two years. The first workshop was held during the JEP-TALN-RÉCITAL 2012 conference on June 8, 2012 in Grenoble (see proceedings: http://aclweb.org/anthology//W/W12/#1300). The second one took place during the TALN 2014 conference on July 1, 2014 in Marseille (see proceedings: http://www.taln2014.org/site/actes-en-ligne/actes-en-ligne-ateliers/).
The third edition of TALAf will be held during the JEP-TALN-RÉCITAL conference, on July 4, 2016, at INALCO in Paris.
Natural language processing is booming in Africa. Indeed, in many countries, there is an ongoing official recognition of national languages, for instance:
In Niger, laws defining alphabets for Hausa, Kanuri, Tamajaq and Zarma were published in 1999. Since then, the National Assembly has set up simultaneous translation of its debates in three languages: French, Hausa and Zarma;
In Morocco, the Royal Institute of Amazigh Culture (IRCAM), which works for the promotion of Amazigh culture and the development of the Berber language, was founded by royal decree in 2001;
In Senegal, the recognition of national languages was stated in the first article of the Constitution of 7 January 2001: "The official language of the Republic of Senegal is French. The national languages are Diola, Malinke, Pular, Serer, Soninke, Wolof and any other national language to be codified." The Department of Technical Education, Vocational Training, Literacy and National Languages (METFPALN) is responsible for this. Since December 9, 2014, the debates of the Senegalese parliament have been translated simultaneously through an interpretation system into six national languages (Fulani, Serere, Wolof, Jola, Mandinka and Soninke) in addition to French, allowing the majority of members to speak in their mother tongue.
Moreover, a number of African colleagues and scholars trained in the North return to their countries determined to continue their work on local languages. Some diaspora communities also have the technical means to contribute directly online on a voluntary basis.
In addition, bilingual education programs (official language / national language) are expanding in primary schools in many countries, the official language remaining in most cases that of the former colonial power (French, English, Portuguese, etc.).
Mobile phones are also spreading fast: with 650 million units, Africa has surpassed the United States and Europe. In many areas, it is easier to install a mobile antenna than fixed lines, so people using a telephone for the first time do so with a mobile handset. Applications such as money transfer and the dissemination of weather reports are being developed.
Funding for research projects on these languages can now be obtained from the "Organisation Internationale de la Francophonie" through the calls for projects of its "fonds francophone des inforoutes" (see e.g. the DiLAF or flore projects), or from the "Agence Universitaire de la Francophonie". France also supports projects on these languages through its National Agency for Research (see e.g. the ALFFA project).
The conditions are therefore in place for the development of natural language processing in Africa, for both written and spoken language.
In this context, the aims of the TALAf workshop are to:
bring together researchers in the field, through meetings at the workshop and also through the talaf@imag.fr mailing list;
pool knowledge by using open source tools and standards (ISO, Unicode) and by publishing the resources produced under an open license (Creative Commons), notably to avoid the loss of information when a project stops and cannot be resumed immediately for lack of resources;
develop a set of best practices based on the experience of researchers: set up simple, efficient methodologies based on free or very cheap software for the development of resources, and share techniques that work around the lack of existing resources, so as to avoid wasting time and energy.
TALAf workshops are supported by the non-profit organisation "Lexicologie Terminologie Traduction": http://www.ltt.auf.org/index.php
Session chairperson: Mathieu Mangeot | |
09h30-10h00 | Valentin Vydrin, Andrij Rovenchak & Kirill Maslinsky. Maninka Reference Corpus: A Presentation. [PDF], [PPTX presentation] |
10h00-10h30 | Ikechukwu Onyenwe, Mark Hepple & Uchechukwu Chinedu. Improving Accuracy of Igbo Corpus Annotation Using Morphological Reconstruction and Transformation-Based Learning. [PDF], [PDF presentation] |
10h30-11h00 | Coffee break |
Session chairperson: El Hadj Mamadou Nguer | |
11h00-11h30 | Moneim Abdourahamane, Christian Boitet, Valérie Bellynck, Lingxiao Wang & Hervé Blanchon. Construction d’un corpus parallèle français-comorien en utilisant de la TA français-swahili. [PDF] |
11h30-12h00 | David Blachon, Elodie Gauthier, Laurent Besacier, Guy-Noël Kouarata, Martine Adda-Decker & Annie Rialland. Collecte de parole pour l'étude des langues peu dotées ou en danger avec l'application mobile Lig-Aikuma. [PDF], [PDF presentation] |
12h00-14h00 | Lunch break |
Session chairperson: Christian Boitet | |
14h00-14h30 | Michael Melese Woldeyohannis, Laurent Besacier & Meshesha Million. Amharic Speech Recognition for Speech Translation. [PDF], [PDF presentation] |
14h30-15h00 | El Hadji Malick Fall, El Hadji Mamadou Nguer, Sokhna Bao Diop, Mouhamadou Khoulé, Mathieu Mangeot & Mame Thierno Cissé. Digraphie des langues ouest africaines : Latin2Ajami : un algorithme de translittération automatique. [PDF], [PPTX presentation] |
Session chairperson: Chantal Enguehard | |
15h00-15h30 | Fatimazahra Nejme, Siham Boulaknadel & Driss Aboutajdine. Développement de ressources pour la langue amazighe : Le Lexique Morphologique El-AmaLex. [PDF] |
15h30-16h00 | Alla Lo, Elhadji Mamadou Nguer, Abdoulaye Youssoupha Ndiaye, Cheikh Bamba Dione, Mathieu Mangeot, Mouhamadou Khoule, Sokhna Bao Diop & Mame Thierno Cisse. Correction orthographique pour la langue wolof : état de l'art et perspectives. [PDF], [PPTX presentation] |
16h00-16h30 | Coffee break |
Session chairperson: Laurent Besacier | |
16h30-17h00 | Mouhamadou Khoule, Mathieu Mangeot, El Hadji Mamadou Nguer & Mame Thierno Cisse. iBaatukaay : un projet de base lexicale multilingue contributive sur le web à structure pivot pour les langues africaines notamment sénégalaises. [PDF], [PPS presentation] |
17h00-17h30 | Chérif Mbodj & Chantal Enguehard. Production et mise en ligne d’un dictionnaire électronique du wolof. [PDF], [PDF presentation] |
Martine Adda-Decker (CNRS-LPP & LIMSI, Paris, France)
Laurent Besacier (LIG, Grenoble, France)
Sokhna Bao Diop (Université Gaston Berger, St Louis du Sénégal, Sénégal)
Philippe Bretier (Voxygen, Pleumeur-Bodou, France)
Khalid Choukri (ELDA, Paris, France)
Mame Thierno Cissé (ARCIV, Université Cheikh Anta Diop, Dakar, Sénégal)
Chantal Enguehard (LINA, Nantes, France)
Núria Gala (LIF, Marseille, France)
Modi Issouf (Ministère de l'Éducation, Niamey, Niger)
Fary Silate Ka (IFAN, Université Cheikh Anta Diop, Dakar, Sénégal)
Mathieu Mangeot (LIG, Grenoble, France)
Chérif Mbodj (Centre de Linguistique Appliquée de Dakar, Sénégal)
Kamal Naït-Zerrad (INALCO, Paris, France)
El Hadj Mamadou Nguer (Université Gaston Berger, St Louis du Sénégal, Sénégal)
Donald Osborn (Bisharat, ltd.)
François Pellegrino (DDL, Lyon, France)
Olivier Rosec (Voxygen, Pleumeur-Bodou, France)
Fatiha Sadat (UQAM, Montréal, Canada)
Aliou Ngoné Seck (FLSH, Université Cheikh Anta Diop, Dakar, Sénégal)
Emmanuel Schang (Université d'Orléans, Orléans, France)
Gilles Sérasset (LIG, Grenoble, France)
Max Silberztein (ELLIADD, Université de Franche-Comté, Besançon, France)
Sylvie Voisin (DDL, Lyon, France)
Valentin Vydrin (LLACAN-INALCO, Paris, France)
- Submission deadline: 8 May 2016 (extended from 24 April)
- Notification of acceptance: 23 May 2016
- Final Submission Deadline: 3 June 2016
- Workshop: 4 July 2016
For any questions, please contact Mathieu.Mangeot_AT_imag.fr