Hideo Okuma


2011

Annotating data selection for improving machine translation
Keiji Yasuda | Hideo Okuma | Masao Utiyama | Eiichiro Sumita
Proceedings of the 8th International Workshop on Spoken Language Translation: Papers

In order to efficiently improve machine translation systems, we propose a method which selects data to be annotated (manually translated) from speech-to-speech translation field data. The data come from field experiments conducted during the 2009 fiscal year in five areas of Japan. For the selection experiments, we used data sets from two of these areas: the one giving the lowest baseline speech translation performance on its test set and the one giving the highest. We compare two methods for selecting data to be manually translated from the field data. Both use source-side language models for data selection, but in different manners. According to the experimental results, either or both of the methods yield larger improvements than random data selection.

2009

Reordering Model Using Syntactic Information of a Source Tree for Statistical Machine Translation
Kei Hashimoto | Hirohumi Yamamoto | Hideo Okuma | Eiichiro Sumita | Keiichi Tokuda
Proceedings of the Third Workshop on Syntax and Structure in Statistical Translation (SSST-3) at NAACL HLT 2009

2008

The NICT/ATR speech translation system for IWSLT 2008
Masao Utiyama | Andrew Finch | Hideo Okuma | Michael Paul | Hailong Cao | Hirofumi Yamamoto | Keiji Yasuda | Eiichiro Sumita
Proceedings of the 5th International Workshop on Spoken Language Translation: Evaluation Campaign

This paper describes the National Institute of Information and Communications Technology/Advanced Telecommunications Research Institute International (NICT/ATR) statistical machine translation (SMT) system used for the IWSLT 2008 evaluation campaign. We participated in the Chinese–English (Challenge Task), English–Chinese (Challenge Task), Chinese–English (BTEC Task), Chinese–Spanish (BTEC Task), and Chinese–English–Spanish (PIVOT Task) translation tasks. In the English–Chinese Challenge Task, we focused on exploring various factors affecting English–Chinese translation, because research on this direction is scarce compared with the opposite direction. In the Chinese–English Challenge Task, we employed a novel clustering method, where training sentences similar to the development data in terms of the word error rate formed a cluster. In the pivot translation task, we integrated two strategies for pivot translation by linear interpolation.

Imposing Constraints from the Source Tree on ITG Constraints for SMT
Hirofumi Yamamoto | Hideo Okuma | Eiichiro Sumita
Proceedings of the ACL-08: HLT Second Workshop on Syntax and Structure in Statistical Translation (SSST-2)

Multilingual Mobile-Phone Translation Services for World Travelers
Michael Paul | Hideo Okuma | Hirofumi Yamamoto | Eiichiro Sumita | Shigeki Matsuda | Tohru Shimizu | Satoshi Nakamura
Coling 2008: Companion volume: Demonstrations

2007

Introducing translation dictionary into phrase-based SMT
Hideo Okuma | Hirofumi Yamamoto | Eiichiro Sumita
Proceedings of Machine Translation Summit XI: Papers

The NICT/ATR speech translation system for IWSLT 2007
Andrew Finch | Etienne Denoual | Hideo Okuma | Michael Paul | Hirofumi Yamamoto | Keiji Yasuda | Ruiqiang Zhang | Eiichiro Sumita
Proceedings of the Fourth International Workshop on Spoken Language Translation

This paper describes the NiCT-ATR statistical machine translation (SMT) system used for the IWSLT 2007 evaluation campaign. We participated in three of the four language pair translation tasks (CE, JE, and IE). We used a phrase-based SMT system using log-linear feature models for all tracks. This year we decoded from the ASR n-best lists in the JE track and found a gain in performance. We also applied some new techniques to facilitate the use of out-of-domain external resources by model combination and also by utilizing a huge corpus of n-grams provided by Google Inc. Using these resources gave mixed results that depended on the technique and the language pair; however, in some cases we achieved consistently positive results. The results from model interpolation in particular were very promising.

2006

The NiCT-ATR statistical machine translation system for IWSLT 2006
Ruiqiang Zhang | Hirofumi Yamamoto | Michael Paul | Hideo Okuma | Keiji Yasuda | Yves Lepage | Etienne Denoual | Daichi Mochihashi | Andrew Finch | Eiichiro Sumita
Proceedings of the Third International Workshop on Spoken Language Translation: Evaluation Campaign

2005

Practical Approach to Syntax-based Statistical Machine Translation
Kenji Imamura | Hideo Okuma | Eiichiro Sumita
Proceedings of Machine Translation Summit X: Papers

This paper presents a practical approach to statistical machine translation (SMT) based on syntactic transfer. Conventionally, phrase-based SMT generates an output sentence by combining phrase (multiword sequence) translation and phrase reordering without syntax. On the other hand, SMT based on tree-to-tree mapping, which involves syntactic information, is theoretical, so its features remain unclear from the viewpoint of a practical system. The SMT proposed in this paper translates phrases with hierarchical reordering based on the bilingual parse tree. In our experiments, the best translation was obtained when both phrases and syntactic information were used for the translation process.

Nobody is perfect: ATR’s hybrid approach to spoken language translation
Michael Paul | Takao Doi | Youngsook Hwang | Kenji Imamura | Hideo Okuma | Eiichiro Sumita
Proceedings of the Second International Workshop on Spoken Language Translation

2004

EBMT, SMT, hybrid and more: ATR spoken language translation system
Eiichiro Sumita | Yasuhiro Akiba | Takao Doi | Andrew Finch | Kenji Imamura | Hideo Okuma | Michael Paul | Mitsuo Shimohata | Taro Watanabe
Proceedings of the First International Workshop on Spoken Language Translation: Evaluation Campaign

Example-based Machine Translation Based on Syntactic Transfer with Statistical Models
Kenji Imamura | Hideo Okuma | Taro Watanabe | Eiichiro Sumita
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics