(Omachi)-Nose Laboratory, Graduate School of Engineering, Tohoku University

Journal Papers

2025

■Preserving speaker information in direct Speech-to-Speech Translation with non-autoregressive generation and pre-training
Rui Zhou, Akinori Ito, Takashi Nose
Computer Speech and Language, Vol. 97, pp. 101902

■Adaptive Fine-Grained Pruning via Binary Search for Efficient Environmental Sound Classification
Changlong Wang, Akinori Ito, Takashi Nose
IEEE Access, Vol. 13, pp. 173201-173208

■Adaptive Depth-Wise Pruning for Efficient Environmental Sound Classification
Changlong Wang, Akinori Ito, Takashi Nose
IEEE Access, Vol. 13, pp. 69751-69759

■The Development of an Emotional Embodied Conversational Agent and the Evaluation of the Effect of Response Delay on User Impression
Simon Christoph Jolibois, Akinori Ito, Takashi Nose
Applied Sciences, Vol. 15, No. 8, pp. 4256

■Reversible Spectral Speech Watermarking with Variable Embedding Locations Against Spectrum-Based Attacks
Xuping Huang, Akinori Ito
Applied Sciences, Vol. 15, No. 1, pp. 381

■We open our mouths when we are silent
Shoki Kawanishi, Yuya Chiba, Akinori Ito, Takashi Nose
Acoustical Science and Technology, Vol. 46, No. 1, pp. 96-99

■Fast end-to-end non-parallel voice conversion based on speaker-adaptive neural vocoder with cycle-consistent learning
Shuhei Imai, Aoi Kanagaki, Takashi Nose, Shogo Fukawa, Akinori Ito
Acoustical Science and Technology, Vol. 46, No. 1, pp. 116-119

■Unified model for voice conversion of speech and singing voice using adaptive pitch constraints
Shogo Fukawa, Takashi Nose, Shuhei Imai, Akinori Ito
Acoustical Science and Technology, Vol. 46, No. 1, pp. 120-123

2024

■Simulated Annealingによる論文審査会スケジュールの準最適割当てシステム
伊藤彰則
学術情報処理研究, Vol. 28, No. 1, pp. 106-113

■Multilingual Meta-Transfer Learning for Low-Resource Speech Recognition
Rui Zhou, Takaki Koshikawa, Akinori Ito, Takashi Nose, Chia-Ping Chen
IEEE Access, (2024) Vol. 12, pp. 158493-158504

■Scheduled Curiosity-Deep Dyna-Q: Efficient Exploration for Dialog Policy Learning
Xuecheng Niu, Akinori Ito, Takashi Nose
IEEE Access, (2024) Vol. 12, pp. 46940-46952

■A Replaceable Curiosity-Driven Candidate Agent Exploration Approach for Task-Oriented Dialog Policy Learning
Xuecheng Niu, Akinori Ito, Takashi Nose
IEEE Access, (2024) Vol. 12, pp. 142640-142650

■Imperceptible and Reversible Acoustic Watermarking Based on Modified Integer Discrete Cosine Transform Coefficient Expansion
Xuping Huang, Akinori Ito
Applied Sciences, Vol. 14, No. 7, pp. 2757

2022

■Spoken Term Detection of Zero-Resource Language Using Posteriorgram of Multiple Languages
Satoru Mizouchi, Takashi Nose, Akinori Ito
Interdisciplinary Information Sciences, (2022) Vol. 28, No. 1, pp. 1-13 DOI:10.4036/iis.2022.A.04

2021

■Effect of Training Data Selection for Speech Recognition of Emotional Speech
Yusuke Yamada, Yuya Chiba, Takashi Nose, and Akinori Ito
International Journal of Machine Learning and Computing, Vol. 11, No. 63, pp. 362-366

■SMOC corpus: A large-scale Japanese spontaneous multimodal one-on-one chat-talk corpus for dialog systems
Yoshihiro Yamazaki, Yuya Chiba, Takashi Nose, Akinori Ito
Acoustical Science and Technology, Vol. 42, No. 4, pp. 210-213

2020

■Language modeling in speech recognition for grammatical error detection based on neural machine translation
Jiang Fu, Yuya Chiba, Takashi Nose, Akinori Ito
Acoustical Science and Technology, Vol. 41, No. 5, pp. 788-791

■A Symbol-level Melody Completion Based on a Convolutional Neural Network with Generative Adversarial Learning
Kosuke Nakamura, Takashi Nose, Yuya Chiba, Akinori Ito
Journal of Information Processing, Vol. 28, pp. 248-257

■Human-machine metacommunication towards development of a human-like agent: A short review
Akinori Ito
Acoustical Science and Technology, Vol. 41, No. 1, pp. 166-169

2019

■Automatic assessment of English proficiency for Japanese learners without reference sentences based on deep neural acoustic models
Jiang Fu, Yuya Chiba, Takashi Nose, Akinori Ito
Speech Communication, Vol. 116, pp. 86-97

■Multi-condition training for noise-robust speech emotion recognition
Yuya Chiba, Takashi Nose, Akinori Ito
Acoustical Science and Technology, Vol. 40, No. 6, pp. 406-409

■Improving human scoring of prosody using parametric speech synthesis
Hafiyan Prafianto, Takashi Nose, Yuya Chiba, Akinori Ito
Speech Communication, Vol. 111, pp. 14-21

2018

■Analysis of Preferred Speaking Rate and Pause in Spoken Easy Japanese for Non-Native Listeners
Hafiyan Prafianto, Takashi Nose, Yuya Chiba, Akinori Ito
Acoustical Science and Technology, Vol. 39, No. 2, pp. 92-100

■Analyses of Example Sentences Collected by Conversation for Example-Based Non-Task-Oriented Dialog System
Yukiko Kageyama, Yuya Chiba, Takashi Nose, and Akinori Ito
IAENG International Journal of Computer Science, Vol. 45, No. 1, pp. 285-293

2017

■HMM-Based Photo-Realistic Talking Face Synthesis Using Facial Expression Parameter Mapping with Deep Neural Networks
Kazuki Sato, Takashi Nose, Akinori Ito
Journal of Computer and Communications, Vol. 5, No. 10, pp. 55-65

■Dimensional paralinguistic information control based on multiple-regression HSMM for spontaneous dialogue speech synthesis with robust parameter estimation
Tomohiro Nagata, Hiroki Mori, Takashi Nose
Speech Communication, Vol. 88, pp. 138-148

■統計モデルに基づく多様な音声の合成技術
能勢隆
電子情報通信学会論文誌D, Vol. J100-D, No. 4, pp. 556-569

■Sentence Selection Based on Extended Entropy Using Phonetic and Prosodic Contexts for Statistical Parametric Speech Synthesis
Takashi Nose, Yusuke Arao, Takao Kobayashi, Komei Sugiura, Yoshinori Shiga
IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 25, Issue 5, pp. 1107-1116

■クロスリンガル音声合成のための共有決定木コンテクストクラスタリングを用いた話者適応
長濱大樹, 能勢隆, 郡山知樹, 小林隆夫
電子情報通信学会論文誌D, Vol. J100-D, No. 3, pp. 385-393

■Cluster-based approach to discriminate the user’s state whether a user is embarrassed or thinking to an answer to a prompt
Yuya Chiba, Takashi Nose, Akinori Ito
Journal on Multimodal User Interfaces (2017), Vol. 11, No. 2, pp. 185-196

2016

■Prosodically rich speech synthesis interface using limited data of celebrity voice
Takashi Nose, Taiki Kamei
Journal of Computer and Communications, Vol. 4, No. 16, pp. 79-94

■DNNを利用したAnimation Unitの変換に基づく顔画像変換の検討
齋藤優貴, 能勢隆, 伊藤彰則
電子情報通信学会論文誌, Vol. J99-D, No. 11, pp. 1112-1115

■Efficient Implementation of Global Variance Compensation for Parametric Speech Synthesis
Takashi Nose
IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 24, Issue 10, pp. 1694-1704

■Estimating the User's State before Exchanging Utterances Using Intermediate Acoustic Features for Spoken Dialog Systems
Yuya Chiba, Takashi Nose, Masashi Ito, Akinori Ito
IAENG International Journal of Computer Science, Vol. 43, No. 1, pp. 1-9

■発話状態推定に基づく協調的感情音声合成による音声対話システムの評価
加瀬嵩人，能勢隆，千葉祐弥，伊藤彰則
電子情報通信学会論文誌A, Vol. J99-A, No. 1, pp. 25-35

2015

■応答タイミングを考慮した英会話練習のための音声対話型英語学習システム
鈴木直人，廣井富，千葉祐弥，能勢隆，伊藤彰則
情報処理学会論文誌, Vol. 56, No. 11, pp. 2177-2189 (Nov. 2015)

■指差しによる人間への位置提示精度調査とその精度向上手法
廣井富，伊藤彰則
情報処理学会論文誌, Vol. 56, No. 8, pp. 1634-1645

■Real-time talking avatar on the internet using Kinect and voice conversion
Takashi Nose, Yuki Igarashi
International Journal of Advanced Computer Science and Applications, Vol. 6, Issue 12, pp. 301-307, 10.14569/IJACSA.2015.061240

■HMM-based expressive singing voice synthesis with singing style control and robust pitch modeling
Takashi Nose, Misa Kanemoto, Tomoki Koriyama, Takao Kobayashi
Computer Speech and Language, Vol. 34, Issue 1, pp. 308-322

2014

■Packet Loss Concealment of Voice-over IP Packet using Redundant Parameter Transmission Under Severe Loss Conditions
Takeshi Nagano, Akinori Ito
Journal of Information Hiding and Multimedia Signal Processing, Vol. 5 No. 2 pp. 286-295
2014/4

■Automatic evaluation of singing enthusiasm for karaoke
Ryunosuke Daido, Masashi Ito, Shozo Makino, Akinori Ito
Computer Speech and Language, Vol. 28 Issue 2 pp. 501-517
2014/3

■Statistical parametric speech synthesis based on Gaussian process regression
Tomoki Koriyama, Takashi Nose, Takao Kobayashi
IEEE Journal of Selected Topics in Signal Processing, Vol. 8 No. 2 pp. 173-183
2014/2

■A parameter generation algorithm using local variance for HMM-based speech synthesis
Takashi Nose, Vataya Chunwijitra, Takao Kobayashi
IEEE Journal of Selected Topics in Signal Processing, Vol. 8 No. 2 pp. 221-228
2014/2

■Prosodic variation enhancement using unsupervised context labeling for HMM-based expressive speech synthesis
Yu Maeno, Takashi Nose, Takao Kobayashi, Tomoki Koriyama, Yusuke Ijima, Hideharu Nakajima, Hideyuki Mizuno, Osamu Yoshioka
Speech Communication, Vol. 57 pp. 144-154
2014/2

2013

■拡張現実感を用いたロボットデザインの評価
日本バーチャルリアリティ学会誌, Vol. 18 No. 2 pp. 161-170
2013/6/30

2012

■Model Shrinkage for Discriminative Language Models
IEICE Transactions on Information and Systems, E95-D(5) (2012), 1465-1474
Takanobu Oba, Takaaki Hori, Atsushi Nakamura and Akinori Ito

■Estimating a User's Internal State before the First Input Utterance
Advances in Human Computer Interaction, (2012), 2012, Article ID 865362, 10 pages
Yuya Chiba and Akinori Ito

■Robust Transmission of Audio Signals over the Internet: An Advanced Packet Loss Concealment for MP3-Based Audio Signals
Interdisciplinary Information Sciences, (2012) Vol. 18, No. 2, pp. 99-105 DOI:10.4036/iis.2012.99
Akinori Ito, Kiyoshi Konno, Masashi Ito and Shozo Makino

■Mobile Robot System With Semi-Autonomous Navigation Using Simple And Robust Person Following Behavior
Journal of Man, Machine and Technology, Vol. 1, No. 1, pp. 44-62, 2012
Yutaka Hiroi, Shohei Matsunaka, Akinori Ito

2011

■Round-Robin Duel Discriminative Language Models
IEEE Transactions on Audio, Speech and Language Processing, (2011) 20(4), 1244-1255, May 2011
Takanobu Oba, Takaaki Hori, Atsushi Nakamura and Akinori Ito

2010

■Designing Side Information of Multiple Description Coding
Journal of Information Hiding and Multimedia Signal Processing, 1 (2010) 10-19.
Akinori Ito and Shozo Makino

■A source-filter separation for non-stationary voiced speech based on sinusoidal representation
Acoustical Science and Technology, 2 (2010) 181-184.
Masashi Ito, Keiji Ohara, Akinori Ito, Masafumi Yano

■PACKET LOSS CONCEALMENT FOR MDCT-BASED AUDIO CODEC USING CORRELATION-BASED SIDE INFORMATION
International Journal of Innovative Computing, Information and Control, 6 (2010) 1347-1362.
AKINORI ITO, TOSHIYUKI SAKAI, KIYOSHI KONNO, SHOZO MAKINO, MOTOYUKI SUZUKI

■INTONATION EVALUATION OF ENGLISH UTTERANCES USING SYNTHESIZED SPEECH FOR COMPUTER-ASSISTED LANGUAGE LEARNING
International Journal of Innovative Computing, Information and Control, 6 (2010) 1501-1514.
AKINORI ITO, TOMOAKI KONNO, MASASHI ITO, SHOZO MAKINO, MOTOYUKI SUZUKI

■ADPCM 出力とサンプルの絶対値を考慮したG.711 への固定ビットレート情報ハイディング
電子情報通信学会論文誌(A)，J93-A(2) (2010) 82-90.
伊藤彰則，半田浩規，鈴木陽一

■Speech Recognition under Multiple Noise Environment Based on Multi-Mixture HMM and Weight Optimization by the Aspect Model
IEICE TRANSACTIONS on Information and Systems, (2010) E93-D(9) 2407-2416.
Seongjun Hahm, Yuichi Ohkawa, Masashi Ito, Motoyuki Suzuki, Akinori Ito and Shozo Makino

■Improved Reference Speaker Weighting using Aspect Model
IEICE TRANSACTIONS on Information and Systems, (2010) E93-D(7) 1927-1935
Seongjun Hahm, Masashi Ito, Motoyuki Suzuki, Akinori Ito and Shozo Makino

■Information Hiding for G.711 Speech Based on Substitution of Least Significant Bits and Estimation of Tolerable Distortion
IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences, (2010) E93-A(7) 1279-1286.
Akinori Ito, Shun'ichiro Abe and Yoiti Suzuki

■Multiple Description Coding Using Time Domain Division for MP3 coded Sound Signal
Journal of Information Hiding and Multimedia Signal Processing, (2010) 1(4), 269-285, October 2010
Ho-seok WEY, Akinori ITO, Takuma OKAMOTO, and Yoiti SUZUKI

2009

■EVALUATION OF ROBOT-AVATAR-BASED USER-FAMILIARITY IMPROVEMENT FOR ELDERLY PEOPLE
Kansei Engineering International, 8 (2009), 59-66.
Yutaka HIROI, Akinori ITO and Eiji NAKANO

■EFFECT OF THE SIZE FACTOR ON PSYCHOLOGICAL THREAT OF A MOBILE ROBOT MOVING TOWARD HUMAN
Kansei Engineering International, 8 (2009), 51-58.
Yutaka HIROI and Akinori ITO

■Novel Tonal Feature and Statistical User Modeling for Query-by-Humming
Journal of Information Processing, 17 (2009), 95-105.
MOTOYUKI SUZUKI, TAKUTO ICHIKAWA, AKINORI ITO and SHOZO MAKINO

■Bit rate reduction of mixed excitation linear prediction coder by Lempel-Ziv segment quantization
Acoustical Science and Technology, 30 (2009), 136-138.
Minoru Kohata, Motoyuki Suzuki, Akinori Ito and Shozo Makino

■Dictation of Japanese Speech Based on Kana and Kanji Character String
International Journal of Computer Processing of Languages, 22 (2009), 75-98.
Akinori Ito, Hiroaki Kinno, Masaharu Katoh, Tetsuo Kosaka and Masaki Kohda

■Fast and robust training of a probabilistic latent semantic analysis model by the parallel learning and data segmentation
Journal of Communication and Computer, 54 (2009), 28-35.
Masaharu Kato, Tetsuo Kosaka, Akinori Ito, Shozo Makino

■An algorithm for Fast Calculation of Back-off N-gram Probabilities with Unigram Rescaling
IAENG International Journal of Computer Science, 36 (2009), IJCS_36_4_08
Masaharu Katoh, Tetsuo Kosaka, Akinori Ito, Shozo Makino

■A speaker adaptation method for non-native speech using learners' native utterances for computer-assisted language learning systems
Speech Communication, 51 (2009), 875-881.
Yuichi Ohkawa, Motoyuki Suzuki, Hirokazu Ogasawara, Akinori Ito, Shozo Makino

■Automatic Query Generation and Query Relevance Measurement for Unsupervised Language Model Adaptation of Speech Recognition
EURASIP Journal on Audio, Speech and Music Processing, (2009) Article ID 140575, 12 pages, doi:10.1155/2009/140575
Akinori Ito, Yasutomo Kajiura, Motoyuki Suzuki, and Shozo Makino

2008

■決定木を用いた単語クラスタリングによる英語韻律自動評価の高精度化
電子情報通信学会論文誌D, J91-D (2008), 358-366.
伊藤　彰則，今野　樹，鈴木　基之，牧野　正三

■Selection of Optimum Vocabulary and Dialog Strategy for Noise-Robust Spoken Dialog Systems
IEICE TRANSACTIONS on Information and Systems, E91-D (2008), 538-548.
Akinori ITO, Takanobu OBA, Takashi KONASHI, Motoyuki SUZUKI, and Shozo MAKINO

■小型ロボットによる音声認識のための内部雑音抑圧法
ヒューマンインタフェース学会誌・論文誌，10 (2008), 1-10.
伊藤彰則，金山高志，鈴木基之，牧野正三

■人間共存型ロボットのためのロボットアバタを用いた親しみ感の向上
日本感性工学会研究論文集, 7 (2008), 797-805.
廣井富，伊藤彰則，中野栄二

■Automatic Evaluation System of English Prosody Based on Word Importance Factor
Journal of Systemics, Cybernetics and Informatics, 6 (2008), 83-90.
Motoyuki Suzuki, Tatsuki Konno, Akinori Ito and Shozo Makino

■Multiple description coding of an audio stream by optimum recovery transforms
Journal of Digital Information Management, 6 (2008), 189-195.
Akinori Ito, Shozo Makino

2007

■Music Information Retrieval from a Singing Voice Using Lyrics and Melody Information
EURASIP Journal on Advances in Signal Processing, 2007 (2007), doi:10.1155/2007/38727.
Motoyuki Suzuki, Toru Hosoya, Akinori Ito, and Shozo Makino

■A New Segment Quantization Using Lempel-Ziv Algorithm and Its Application to Quantization of Line Spectral Frequencies
IEEE Transactions on Communications, 55 (2007), 661-664.
Minoru Kohata, Motoyuki Suzuki, Akinori Ito, and Shozo Makino

■Pronunciation error detection for computer-assisted language learning system based on error rule clustering using a decision tree
Acoustical Science and Technology, 28 (2007), 131-133.
Akinori Ito, Yen-Ling Lim, Motoyuki Suzuki and Shozo Makino

■LogPCM およびADPCMへのMultiple Descriptionスカラ量子化の適用
電子情報通信学会論文誌A，J90-A (2007), 918-921.
魏　浩石，西村　竜一，伊藤　彰則，小林　まおり，鈴木　陽一

2006

■Automatic Detection of English Mispronunciation Using Speaker Adaptation and Automatic Assessment of English Intonation and Rhythm
Educational Technology Research, 29 (2006), 13-23.
Akinori ITO, Tadao NAGASAWA, Hirokazu OGASAWARA, Motoyuki SUZUKI, and Shozo MAKINO

■Lempel-Ziv符号化を用いたLSP係数のセグメント量子化
電子情報通信学会論文誌D，J89-D (2006), 1504-1513.
木幡　稔，鈴木　基之，伊藤　彰則，牧野　正三

■発話速度と言語的特徴による変動を考慮した音素持続時間モデルを用いた音声認識
情報処理学会論文誌，47 (2006), 3380-3391.
大河　雄一，伊藤　彰則，鈴木　基之，牧野　正三

■An Effective Music Information Retrieval Method Using Three-Dimensional Continuous DP
IEEE Transactions on Multimedia, 8 (2006), 633-639.
Sung-Phil Heo, Motoyuki Suzuki, Akinori Ito, and Shozo Makino

2005

■A grammatical error detection method for dialogue-based CALL system
Journal of Natural Language Processing, 12 (2005), 137-156.
Oh-pyo Kweon, Akinori Ito, Motoyuki Suzuki and Shozo Makino

■Fast optimization of language model weight and insertion penalty from n-best candidates
Acoustical Science and Technology, 26 (2005), 384-387.
Akinori Ito, Masaki Kohda, and Shozo Makino

2004

■An Evaluation Method of Japanese Pronunciation for Korean Native Speakers
Educational Technology Research, 27 (2004), 9-16.
Oh Pyo KWEON, Motoyuki SUZUKI, Akinori ITO and Shozo MAKINO

2002

■尤度差に基づくn-gram言語モデル評価のための指標
情報処理学会論文誌, 43 (2002), 2055-2064.
伊藤　彰則，好田　正紀

■Construction and Evaluation of Language Models Based on Stochastic Context-Free Grammar for Speech Recognition
Systems and Computers in Japan, 33 (2002), 48-59.
Chiori Hori, Masaharu Katoh, Akinori Ito, and Masaki Kohda

2000

■N-gram出現回数の混合によるタスク適応の性能解析
電子情報通信学会論文誌 D-II, J83-D-II(2000), 2418-2427.
伊藤　彰則，好田　正紀

■音声認識のための確率文脈自由文法に基づく言語モデルの構築と評価
電子情報通信学会論文誌 D-II, J83-D-II(2000), 2407-2417.
堀　智織，加藤　正治，伊藤　彰則，好田　正紀
