"tdmelodic", natural language processing software that automatically estimates Tokyo-style accents for Japanese speech synthesis, is now open source.


Overview

Project period: September 18, 2020 (OSS release)

The natural language processing software "tdmelodic" (Tokyo Dialect MELOdic accent DICtionary generator) is now available as open source.
With tdmelodic, you can estimate the Tokyo-style accents of a wide range of words and thereby automatically generate a large-vocabulary accent dictionary.
Such a dictionary can be used in applications such as Japanese speech synthesis to produce more natural-sounding speech.

Technology

This software uses deep learning-based technology to estimate the accent of a word based on its surface form (the form in which the word appears in a sentence, such as kanji) and its reading (furigana).

This software can be applied to existing large-scale open-source Japanese dictionaries such as "NEologd" to automatically generate a large-scale accent dictionary for MeCab with a vocabulary of several million words. Developers can use the dictionary generated with tdmelodic and NEologd as a baseline, and then improve text-to-speech quality by correcting errors in the dictionary as necessary.
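The dictionary-generation flow described above can be pictured with a short sketch. It is not tdmelodic's actual interface: `placeholder_estimator`, `add_accents`, and `convert` are hypothetical names, the estimator is a stub standing in for the trained model, and appending the accent as an extra column is an assumption of this sketch. The column positions follow the IPADIC-style CSV layout used by MeCab dictionaries such as NEologd.

```python
import csv
import io
from typing import Callable, Iterable, Iterator, List

# Column positions in an IPADIC-style MeCab dictionary row:
# surface, left ID, right ID, cost, POS, POS1, POS2, POS3,
# conjugation type, conjugation form, base form, reading, pronunciation
SURFACE_COL = 0
READING_COL = 11


def placeholder_estimator(surface: str, reading: str) -> str:
    """Hypothetical stand-in for the neural accent estimator.

    A real estimator would return an accent annotation; this stub only
    wraps the reading so the data flow can be followed end to end.
    """
    return f"<accent:{reading}>"


def add_accents(
    rows: Iterable[List[str]],
    estimate: Callable[[str, str], str],
) -> Iterator[List[str]]:
    """Append an estimated accent to each row that has a reading column."""
    for row in rows:
        if len(row) <= READING_COL:
            # Rows without a reading are passed through unchanged.
            yield list(row)
            continue
        accent = estimate(row[SURFACE_COL], row[READING_COL])
        yield list(row) + [accent]


def convert(src_text: str, estimate: Callable[[str, str], str]) -> str:
    """Read dictionary CSV text and return it with an accent column appended."""
    reader = csv.reader(io.StringIO(src_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerows(add_accents(reader, estimate))
    return out.getvalue()
```

Because the estimator is passed in as a function, manual corrections to the generated dictionary could be layered on top in the same way, for example by wrapping the estimator with a lookup table of hand-verified entries.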

tdmelodic provides the following features:
A function that takes the surface form and reading of a Japanese word as input and outputs the word's Tokyo-style accent.
Automatic generation of a large-vocabulary accent dictionary for MeCab based on the existing Japanese morphological analysis dictionaries UniDic and NEologd.
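To make the notion of a "high/low" accent in the first feature concrete, the sketch below expands an accent nucleus position into a per-mora pitch pattern using the common textbook description of Tokyo-style accent. This is a simplification for illustration only, not tdmelodic's output format, and `pitch_pattern` is a hypothetical helper name.

```python
from typing import List


def pitch_pattern(num_morae: int, nucleus: int) -> List[str]:
    """Return a high/low label per mora for a Tokyo-style accent.

    nucleus = 0 means the pitch never falls within the word (flat type);
    nucleus = k (k >= 1) means the pitch falls right after the k-th mora.
    """
    if num_morae < 1 or not 0 <= nucleus <= num_morae:
        raise ValueError("nucleus must be between 0 and num_morae")

    pattern = []
    for i in range(1, num_morae + 1):
        if nucleus == 1:
            # Head-high: only the first mora is high.
            high = i == 1
        elif nucleus == 0:
            # Flat: low first mora, high for the rest of the word.
            high = i > 1
        else:
            # Low first mora, high up to the nucleus, low afterwards.
            high = 1 < i <= nucleus
        pattern.append("H" if high else "L")
    return pattern
```

For example, a two-mora word with the nucleus on the first mora gives `["H", "L"]`, while the same two morae with no nucleus give `["L", "H"]`. Words can share a reading and differ only in this pattern, which is why a speech synthesizer benefits from a per-word accent dictionary.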

This work was accepted at ICASSP 2020, a top conference in the field of speech and acoustics:
"Accent Estimation of Japanese Words from Their Surfaces and Romanizations for Building Large Vocabulary Accent Dictionaries."

Member in charge

TACHIBANA Hideyuki
Graduated from the Department of Mathematical Engineering, Faculty of Engineering, The University of Tokyo. Ph.D. from the Graduate School of Information Science and Engineering, The University of Tokyo.
Ph.D. in Information Science and Engineering.
After working as a researcher at Meiji University, joined PKSHA Technology.
Engaged mainly in research and development of speech processing, language processing, and signal processing.

