Attention-Based Models for Speech Recognition
Jan K. Chorowski, Dzmitry Bahdanau, Dmitriy Serdyuk, Kyunghyun Cho, Yoshua Bengio

Advances in Neural Information Processing Systems 28 (NIPS 2015), Curran Associates, Inc., pp. 577-585. Spotlight presentation. Conference site: http://nips.cc/

Abstract: Recurrent sequence generators conditioned on input data through an attention mechanism have recently shown very good performance on a range of tasks including machine translation, handwriting synthesis and image caption generation. We extend the attention mechanism with features needed for speech recognition. We show that while an adaptation of the model used for machine translation reaches a competitive 18.6% phoneme error rate (PER) on the TIMIT phoneme recognition task, it can only be applied to utterances which are roughly as long as the ones it was trained on. We offer a qualitative explanation of this failure and propose a novel and generic method of adding location-awareness to the attention mechanism to alleviate this issue. The new method yields a model that is robust to long inputs and achieves 18% PER on single utterances and 20% on 10-times longer (repeated) utterances. Finally, we propose a change to the attention mechanism that prevents it from concentrating too much on single frames, which further reduces PER to the 17.6% level.
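The location-aware attention the abstract describes scores each input frame with an additive (MLP) energy that also sees convolutional features of the previous attention weights, so the model knows where it attended last. A minimal NumPy sketch of one scoring step follows; the function name, parameter shapes, and the plain-NumPy framing are illustrative assumptions, not code from the paper.

```python
import numpy as np

def location_aware_scores(s_prev, h, alpha_prev, W, V, U, F, w, b):
    """One step of location-aware attention scoring (a sketch).

    e[j] = w . tanh(W s_prev + V h[j] + U f[j] + b), where the
    location features f come from convolving learned 1-D filters F
    with the previous alignment alpha_prev.

    Shapes (illustrative): s_prev (n,), h (L, d), alpha_prev (L,),
    W (n, m), V (d, m), U (k, m), F (k, r), w (m,), b (m,).
    """
    L = h.shape[0]
    k, r = F.shape
    pad = r // 2
    padded = np.pad(alpha_prev, pad)
    # f[j] is a k-dim summary of the previous alignment around frame j.
    f = np.stack([padded[j:j + r] @ F.T for j in range(L)])  # (L, k)

    # Additive scoring, then a stable softmax over input frames.
    e = np.tanh(s_prev @ W + h @ V + f @ U + b) @ w          # (L,)
    e -= e.max()
    alpha = np.exp(e) / np.exp(e).sum()
    return alpha
```

The convolutional term is what distinguishes this from the content-only attention used in machine translation: it lets the alignment generalize to utterances longer than those seen in training, which is the failure mode the abstract discusses.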