Université Savoie Mont Blanc | LPNC | Lexique - A Free Lexical Database | RISC | CNRS
A site created by Boris New & Christophe Pallier and hosted by RISC




Diphones-Fr is a database of positional diphone frequencies in French.
More specifically, it provides frequencies for word-initial, word-internal, and word-final diphones for all words extracted from a 50-million-word subtitle corpus of movie and TV series dialogue. We also provide intra-syllable and inter-syllable diphone frequencies, as well as inter-word diphone frequencies. To our knowledge, no comparable tool is available to psycholinguists for studying sequential probabilities in French. This database and its new indicators should help researchers conduct new studies on speech segmentation.
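To make the positional counts concrete, here is a minimal Python sketch of how such frequencies could be tallied. It is not the Perl code distributed with Diphones-Fr. It assumes each word is given as a phonemic string with one character per phoneme, paired with a corpus frequency; the function names, the toy transcriptions, and the choice to count the single diphone of a two-phoneme word as both initial and final are all illustrative assumptions.

```python
from collections import Counter


def positional_diphone_counts(words):
    """Tally diphones by position within the word.

    `words` is an iterable of (phonemes, frequency) pairs, where `phonemes`
    is a string with one character per phoneme (an assumption of this sketch).
    """
    counts = {"initial": Counter(), "internal": Counter(), "final": Counter()}
    for phonemes, freq in words:
        n = len(phonemes)
        if n < 2:
            continue  # a single phoneme contains no diphone
        for i in range(n - 1):
            diphone = phonemes[i:i + 2]
            if i == 0:
                counts["initial"][diphone] += freq
            if i == n - 2:
                # Assumption: in a two-phoneme word the only diphone
                # is counted as both initial and final.
                counts["final"][diphone] += freq
            if 0 < i < n - 2:
                counts["internal"][diphone] += freq
    return counts


def inter_word_diphone_counts(utterances):
    """Tally diphones that straddle a word boundary.

    `utterances` is an iterable of lists of phonemic strings, one per word.
    """
    counts = Counter()
    for words in utterances:
        for left, right in zip(words, words[1:]):
            if left and right:
                counts[left[-1] + right[0]] += 1
    return counts


# Toy, made-up data for illustration only.
toy_lexicon = [("bato", 3), ("tab", 2), ("ba", 1)]
toy_counts = positional_diphone_counts(toy_lexicon)
toy_boundaries = inter_word_diphone_counts([["tab", "ba"], ["ba", "to"]])
```

Weighting each diphone by the word's corpus frequency gives token-based counts; passing a frequency of 1 for every word would give type-based counts instead.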


How to cite


Diphones-Fr 1.00

The Perl scripts used to generate the Diphones-Fr database

French Subtitles Corpus