The National Academies Press

Index
Pages 525-548

From page 525... ... , 376 context-dependent utterances, 61 corpus, 61, 184-185, 234, 219, 250, 256, 257-258, 491 degree of difficulty, 383-385 error rates, 252, 486 human performance on, 162 interactive dialogue, 227, 228, 233
From page 526... ... See Assistive technology for disabled persons; Deployment of applications; Military and government applications; Telecommunications; Telephony air travel information systems, 46, 85-86, 162 aircraft pilots, 40, 41, 44, 45, 359, 365, 509 assessment criteria, 409-410 automatic teller machines, 86 computer-aided instruction, 151 databases for, 406-408 baggage handlers, 40 consumer electronics programming, 43, 353 development environment, 400-401 driving instructions, 354 economic impact of, 280 expectations for, 505-506 force feedback glove, 98, 101 foreign language learning, 44
From page 527... ... , 362 Communications and Electronics Command (CECOM) program, 361-362 Articulatory models, 88, 95, 117, 118, 120, 122, 124-125, 152-153, 461-463, 476 Artificial intelligence, 484 Artificial neural networks, 2, 21, 124, 190, 191-193, 381, 479 Assembler language, 399-400, 401 Assistive technology for disabled persons assistive listening devices, 315-316 augmentative and alternative communication, 130, 335-337 captioning, 314-315, 322-323 carpal tunnel syndrome, 43 categories of sensory aids, 316 cochlear implants, 314, 328-331, 332-333 computer-assisted instruction, 336 deaf-blind, 327 direct stimulation of auditory system, 328-331 dysarthric speech, 337 extracochlear implant, 329-330 eyeglass speechreader, 320-322 hearing aids and assistive listening devices, 278, 311, 312, 315-318, 328-331, 332 hearing impaired, 43, 292, 302-304, 312, 314-333 limitations of, 318 mobility control, 312 noise reduction, 331-333 reading machines for blind, 349 research and development efforts, 313-314 sound/speech spectrograph, 319, 325, 349 with speech/language disabilities, 311, 313, 325 speech recognition, 275-279 speech processing for sightless people, 279, 313, 329, 333-335, 349 speechreading cues, 320-321, 325, 327, 328 tactile sensory aids, 314, 324-328 talking books, 333 Telephone Relay Services (TRS)
From page 528... ... , 176, 282, 283, 292, 293, 294, 295, 299, 383-386, 437, 438, 439 Bell System, 6 Bellcore, 291-293 Bigram models, 201, 209, 211, 213, 214, 222 Bit rates and image processing, 101 speech coding and, 23, 24, 81, 83-84 text-to-speech synthesis, 29, 77 Bolt, Beranek, and Newman (BBN) Systems and Technologies ATIS, 46, 261 Delphi system, 259 directory service, 438 hidden Markov models, 175 N-best filtering and rescoring, 267 word lattice parsing, 265 "Break index" data, 147, 148 C C cross compiler, 399-400 Cambridge University, 176 Carnegie-Mellon University (CMU)
From page 529... ... , 122 CRIM, 176 Cross-word effects, 182 CSELT, 176 CSTR, 130 Currency, pronunciation of, 143 Cybernetics, 445-446, 448-449 D Databases. See also Corpora algorithms, 405-409 for applications, 406-408 dialect considerations, 409 interfaces, 240, 252 large tagged, 152 natural language interfaces, 240 NTIMIT, 409 Official Airline Guide, 46, 219 relational, 53-54 for research, 405-406 remote access to, 42, 44, 278, 296-299, 348, 349, 351 retrieval system, product quality, 57 simulated telephone lines, 408-409 speech, 387, 405, 407, 468, 472 StockTalk, 383-386, 437, 438, 439 WordNet, 499 DEC, 130 Decision criteria, 305
From page 530... ... See also Text, typewriters Digital encryption, 83 speech coding, 25, 82-83, 85 filtering, 19 telephone answering machines, 7-8 Digital computers. See also Digital signal processors and speech signal processing, 19, 78, 81, 189, 393-396 and microelectronics, 19-21, 81 Digital-to-analog converter, 23, 398 Digital signal processors/processing applications, 350, 400-401 capabilities, 391, 393-394 development environment, 399-405 distributed control of, 404-405 floating-point, 383, 394-396 growth of, 19, 78, 81 integer, 383 for LSP synthesis, 398 mechanisms, 393 microphone arrays, 97 technology status, 393-396 transputer architecture, 396, 397 workstation requirements, 189 Digitizing pens, 52 Diplophonia, 122 Discourse natural language processing, 246 and prosodic marking, 149-151 speech analysis, 145, 149-151 in spoken language systems, 227-230 in text-to-speech systems, 145 Dragon Systems, Inc., 176, 380, 401, 402 Dynamic grammar networks, 265-266 Dynamic time warping (DTW)
From page 531... ... See Military and government applications Grammars ambiguity, 380 bigram, 179 combinatory categorial, 490 context-free, 264, 461, 490, 491-494 covering, 493 dialogue, 62, 63 dynamic grammar networks, 265-266 features-value structures in, 264 finite-state, 266, 379-380 formalisms, 490 hand-coded linguistic, 483 lexicalized, 490 lexicalized tree-adjoining, 490 Markov, 179-180 modeling, 28, 63 natural language understanding and, 37-38, 264, 380, 491-494 perplexity, 180, 185, 229, 378 probabilistic context-free, 491-494 size, 37-38 speech analysis and, 28, 36-38 speech recognition, 36-37, 41-42, 63, 81, 85-86, 179-180, 185-186, 265-266 statistical n-gram, 183, 224 training speech, 179-180, 185-186 trigram, 141, 179-180, 183 unification, 461 Graphical user-interface. See also User interfaces
From page 532... ... bigram, 201, 211, 213, 214 defined, 171-173 estimation of statistical parameters of, 199, 202-208 feature extraction, 177-178 fenonic case, 207 grammar-state-transition table, 266 limitations of, 189-190 Markov chains, 170-171, 172 and mel-frequency cepstral coefficients, 178 neural nets combined with, 193-194 part-of-speech tagging, 487-488, 490 phonetic, 166, 173-175, 178-179, 182, 188 and semantics, 221 speaker recognition systems, 30, 85 speech recognition, 28, 30, 85, 170-175, 177-178, 199, 200-208, 377, 394, 396, 397, 478-479 speech variability and, 28, 415-416 and talker verification, 86 three-state, 172 theory development, 175 training and analysis, 30, 178-179, 181-182, 478-479 trellis representation, 203, 208, 212 trigram, 201-202, 212, 213-214 unigram, 210 Viterbi algorithm and, 210 word models, 179 wordspotting, 397 Human-human communication conversational dynamics, 431-432 language imitation, 60 repair rates, 260 studies, 50-51 IBM, 9, 175, 349, 380, 495 Image compression, 99 Image processing, 78, 101 Information processing in auditory systems, 91, 94 speech technologies, 453 Information retrieval, 54-55, 57 INFOVOX, 130 Institute for Defense Analyses, 175, 234-235 Institute for Perception Research, 127 INTELLECT, 57 Integrated Services Digital Network (ISDN)
From page 533... ... See also Natural language bigram, 201, 209, 211, 213, 214, 222, 461 computational, 78, 81, 86, 90-91 etymology estimates for proper names, 92 future of, 307 research needs, 26, 29 speech recognition, 29, 81-82, 90-91, 168-169, 183, 263, 307 speech synthesis, 128 statistical, 263-264, 461, 472-473 trigram, 92, 183, 209-210, 212, 213-214, 461 by users, 60 Laryngalization, 122 Law enforcement, 367 Lexicons, 138, 140, 141-142, 178-179, 188, 296, 499 LIMSI, 176 Linear predictive coding analysis by synthesis, 24, 26-27, 119 mapping code book, 128 code-excited (CELP), 24, 26, 83, 101 mixed-excitation (MELP)
From page 534... ... See also Personal computers computation speed, 19-20, 97 device density, 20 digital signal processing, 19 projected advances in, 102-103 speech processing and, 19-20, 81, 396-399 Microelectronics chip densities, 102 digital computation and, 19-21 research, 21 revolution, 108 speech signal processing, 19-20 Microphones applications, 86-87, 102 autodirective arrays, 86-89, 96, 97, 99-100, 102 beamforming systems, 87, 88, 99 characteristics, 414 digital signal processors, 97 directional, 333, 414-415 electret, 87, 88, 97, 102 environmental variation in speech input, 412-413, 460 in hearing aids, 331-332, 333 noise reduction, 331-332, 414-415 reflection and reverberation, 414 speaker distance from, 414 and speech recognition, 379, 414 technology projections, 102 three-dimensional, 96, 97, 99-100 track-while-scan mode, 87, 89 Microsoft Windows, 52 Military and government applications. See also Advanced Research Projects Agency; other government agencies Agent's Computer, 367 Air Force, 359, 365 air traffic control, 365-366 aircraft carrier flight deck control and information management, 363 Army, 359, 360-363 combat team tactical training, 364-365, 366 Command and Control on the Move (C2OTM)
From page 535... ... See also Hidden Markov models; Language modeling acoustic, 26, 36, 64, 85, 95, 117, 122, 182-183, 476 allophone, 182 articulation, 88, 95, 117, 118, 120, 122, 124-125, 152-153 auditory, 24, 26, 91, 92, 94, 97 bigram, 201, 209, 211, 213, 214 computational, 78, 81, 86, 90-91 consonants, 123 context-dependent, 182, 246 cross-word effects, 182 dialogue, 62-63 grammar, 28, 63, 380 intonation, 127 Klatt, 123 left-to-right, 175 natural language understanding, 238-253, 262-264 noise excitation, 122 phonetic, 173-174, 190-191, 193 prosody, 117 segmental, 125, 173-174, 190-191, 193 signal, 19, 101 sinusoidal, 24 sound source, 462 source/system, 22, 118, 120-122 speech perception, 26 speech production, 22 speech recognition requirements, 168-169 speech synthesis, 109, 116-130 speech variability, 176 spoken language systems, 48 stochastic segment, 190-191 trigram, 201-202 vocal tract, 95, 118, 122, 124, 125 wave propagation, 26 word, 179, 207 Modulation theory, 26 Morphemes, 137, 139, 140 Morphology, speech synthesis, 110, 111, 112, 113, 137, 141-142, 489 Morphs, 138-139, 140 Motorola, 383, 392 Mouse, 52, 350-351, 402-403 Multilingual systems. See also Foreign language; Spoken language translation; Telephony future of, 513-514 INTERTALKER, 513-514 Japanese kana-kanji preprocessor, 403 MITalk, 130 PIVOT, 512-513 speech synthesis, 42, 101, 117, 129-130, 151-152 Multimodal systems.
From page 536... ... See also Linguistics; Speech recognition; Spoken language understanding accuracy/error rates, 47, 251, 252, 255, 261, 262, 388 applications, 379-381 architecture, 485-487 background, 238-239 current capabilities, 10, 506 defined, 239 grammar, 37-38, 263, 380, 491-494 language variability and, 380 models of, 238-253, 262-264 off-the-subject input and, 287, 380, 388 part-of-speech tagging, 487-489 preprocessing and, 489 search process, 248-249 speech constraints in, 268-269 stochastic parsing, 489-495 task difficulty and, 379-381 TINA system, 222 vocabulary size and, 37-38 unknown words, 488-489 Naval. See also Military and government applications Air Technical Training Center (Orlando)
From page 537... ... , 283, 291, 292, 296-297, 398-399, 407-409, 410, 417 concatenative synthesis, 126 HMM applications, 176 systematic optimization techniques, 115 telephone speech database, 407 additive, 459 and algorithm robustness, 413 excitation, 122 immunity, 305 Lombard effect, 415, 460 reduction technology, 331-333, 414-415 sources, 122 and speaker variation, 415-416 and speech recognition, 288, 305, 379, 388, 414-415, 469, 473-474 white, 122 Northern Telecom, 278, 291, 295, 299 Numbers, pronunciation of, 143, 288 NYNEX, 282, 283, 291, 292, 300, 301-302, 407, 409, 436 O Occam parallel programming language, 396 Octel, 281 Official Airline Guide database, 46, 219 Olive, Joseph, 107 Operating systems pen, 402, 511-512 speech, 417 Optical character recognition technology, 43, 349 Oregon Graduate Institute, 407 P Packet data network (XU-NET), 99 Paget, Richard, 15-16 Palantype keyboard, 335 Parallel processing, 89, 383, 400 Parsing/parsers ambiguous, 147-148 clause-level, 144, 145 crossing brackets, 491 natural language, 59, 247, 483, 489-495 phrase-level, 144-145, 146 probabilistic, 56 and prosodic marking, 56, 144, 146-147 in speech synthesis, 137, 139, 144-145 stochastic, 489-495 of unrestricted text, 144 word lattice, 265 Pause insertion strategies, 129 Performance structures, 146 Personal Communication Devices, 306 Personal Communication Networks, 306 Personal Communication Services, 306 Personal computers hand-held, 355 portable, 64-65 sound boards, 350, 353, 397 speech interfaces for, 511 speech processing technology, 108, 374, 401-403, 509-510
From page 538... ... , 24, 82-83, 101 Q Quasi-frequency analysis, 177 Query language, artificial, 57 R Rabiner, Lawrence, 111, 113 Recursive transition networks (RTNs), 222 Repeaters, electromechanical, 81 Resonators, 78, 80 Research methodology, spoken language vs.
From page 539... ... G., 9, 42, 83 Signal modeling techniques, 19, 101 Signal processing digital, 19, 97 enhancement, 102 research, 21 Sinusoidal models, 24 Software technologies, 391 Sound generation, 118, 119, 124 source model, 462 Sound Pattern of English, 126 Sound/speech spectrograph, 319, 325, 349 Source-filter decomposition, 128 Speak 'N Spell, 110 Speaker adaptation, 459, 460 atypical, 187-188 dependence, 36 recognition/identification, 9, 30, 85, 348 style shifting, 460, 461 variation, 415-416 verification, 9, 30, 86, 300, 305 Speaking characteristics and styles, 128-129, 378-379 Spectrum analysis, 19 Speech behaviors, conversational, 430-432 casual informal conversational, 82 compression, 23, 83, 474 connected, 97 continuous, 36, 78, 95, 323, 427-428, 430-431 constraints on, 77, 268-269 databases, 405, 407-409, 468 dialect, 409 digitized, 38, 45, 189, 428 dysarthric speech, 337 gender differences, 129 information processing technologies, 453 interactive, 36 intonation, 45, 127, 129, 432 knowledge about, 117 machine-generated, 335
From page 540... ... noninteractive, 48 pause insertion strategies, 129 perception models, 26 preprocessor, 403 production, 21-22, 26, 77, 87-90, 137-138 prolongation of sounds, 322 psychological and physiological research, 462 self-correction, 256, 432 signal processing systems, 19 slips of the tongue, 257 spontaneous, 58-59, 185, 255-260, 303, 460, 461, 469-471 standard model of, 267 synthetic, 428-428; see also Speech synthesis; Speech synthesizers toll quality, 23, 24 training, 322, 325 type, 36 ungrammatical, 257 units of, 168-170, 462-463 variability, 28, 176, 378, 413, 459-460, 480 waveforms, 24, 136, 137 Speech analysis acoustic modeling, 26 analysis-by-synthesis method, 26-27 auditory modeling, 26 defined, 22 dimensions, 36-38 importance, 21 interactivity, 36 language modeling, 26 linear predictive coding, 24 robustness, 97 speech continuity, 36 speech type, 36 vocabulary and grammar, 28, 36-38 vocal tract representation in, 90, 91 Speech coding, 26 applications, 82-83 articulatory-model-based, 125 audio perception factors in, 84, 85 in cochlear implants, 331 concatenation using speech waveforms, 117 bit rates and, 23, 24, 81, 83-84 digital, 25, 82-83, 85 and masking, 84, 93 predictive, 117 psychoacoustic factors in, 101 research challenges in, 76 rule-based diphone system, 118 stereo coding, 84-85 technology status, 82-85, 281 terminal analog, 118 wideband audio signals, 84 Speech processing algorithms, 21, 393 articulatory and perceptual constraints in, 461-463 digital, 22-23, 76 equipment and systems, 19-20, 81, 396-399 evaluation methods, 463-464 in hearing aids, 317 and natural language processing, 460-461 obstacles to, 373 research challenges, 76-77 psychoacoustic behavior and, 94 for sightless people, 333-335 and speech technology development, 76, 78 Speech recognition accuracy, 28, 37, 41, 46-47, 86, 159, 181-189, 377, 378, 470, 473 acoustic modeling, 64, 182-183 adverse conditions, 459-460 algorithms, 28, 409-411, 412, 417-418, 469 alternative models, 189-193 analysis-by-synthesis, 30 applications, 28-29, 30-32, 81, 275-282, 283-284, 318, 377-379, 451, 457, 458, 471, 508-510
From page 541... ... , 187 dynamic grammar networks, 265-266 dynamic programming matching, 509 environmental factors, 413-414 error correction, 64, 261-262, 388 feature extraction, 177-178, 180 Flexible Vocabulary Recognition, 295 future, 307-309, 456-459 generalization, 479 Hidden Markov models and, 28, 30, 85, 170-175, 177-178, 199, 200-208, 377, 397, 478 historical overview, 175-176 improvements in performance, 181-184, 388 interactivity, 36 language modeling, 29, 81-82, 90-91, 168-169, 183, 263 large-vocabulary systems, 183, 193, 277, 292, 506 lip reading, 64 linguistic rules, 82 market for technology, 350-351, 416-417 microphones and, 305, 414 most likely path, 208-209 most likely word sequence, 209-214 N-best filtering or rescoring, 267 natural language and, 17, 262-267, 388 naturalness, 45, 153 neural networks, 191-193 new words, 188-189 noise immunity and channel equalization, 288, 305, 379, 388, 414-415, 469, 473 normalization of speakers in, 30, 456-457, 459, 460 pattern matching, 474, 478-479 perplexity of language model and, 37, 180, 185, 229, 378, 463 phonetics and, 167, 169-170, 188, 410 processes, 167-168, 180-181, 199, 451, 453-454, 473-474 pronunciation and, 44 prototype systems, 34 real-time, 189 rejection of irrelevant input, 287, 388 and repetitive stress injuries, 43 research challenges, 29-30, 44, 76, 108, 183-184, 304-306, 417-418 robustness, 29-30, 44, 184, 261-262, 459-460, 473, 474 sample performance figures, 184-185 search algorithms, 180-181, 248, 264-265 segmental models, 190-191, 473-474 sheep and goats phenomenon, 456 speaker-adaptive, 36, 187-188, 288, 388, 479 speaking characteristics and styles and, 128, 377, 378-379, 415-416, 460
From page 542... ... speaker-dependent, 28, 36, 54, 186-187, 292, 509-510 speaker expertise and, 378 speaker-independent, 28, 36, 37, 46, 184, 186-187, 188, 362-363, 378, 397, 425, 433-434, 506, 507 spontaneous speech and, 58-59, 185, 460, 461, 469, 471 SR-1000 system, 507 SR-3200 system, 507 subword units, 287-288, 299, 388 successful systems, 239 system structure, 27-28, 398, 401, 402 talker verification, 86 task completion rate, 410 technology status, 8-9, 18, 81, 85-86, 112-113, 159-164, 165-166, 181-189, 286-288, 428, 468 templates, 258-259, 425 terminal-type, 508-510 training data, 178-180, 185-186, 457, 459, 473, 478-479 transputer-based, 397 trials, 417 units of speech and, 168-170 user tolerance of errors and, 379 vocabulary and grammar and, 36-37, 41-42, 81, 85-86, 185-186, 265-266, 277, 378, 457 Wizard of Oz assessment technique, 410-411, 439 word lattice parsing, 265 wordspotting, 286-287, 292, 295, 298-299, 305, 387, 388, 397, 404 Speech research computational models of language, 90-91 critical directions in, 87-101 historical background, 78-82 language modeling, 26 physics of speech generation, 87-90 unification of coding, synthesis, and recognition, 94-95, 97 Speech synthesis. See also Text-to-speech synthesis acoustic models, 85, 95, 117, 122, 476 analysis-synthesis systems, 117, 118, 119, 125 applications, 30-32, 108, 109, 110, 278, 381-382 articulatory models, 88, 117, 118, 120, 124-125, 152-153, 476, 480 assessment of, 411-412 automatic learning, 127 concatenative, 110, 114, 117, 118-119, 126, 168, 406 concept-to-speech systems, 38-39 content, 45 control, 124, 118, 125-127 corpus-based optimization, 113 defined, 22, 109, 110, 116, 348 digitized speech, 22-23, 25, 38 dimensions of task difficulty, 381-382 discourse-level effects, 149-151 error rates, 112 evaluation of, 130 expectations of listeners, 382 flexibility needs, 117-118 fluid dynamics in, 89-90 formant-based terminal analog, 117, 118, 122-123, 125 forms, 38-39 frequency domain approach, 119 future of, 152-153, 455-456 higher-level parameters, 123-124 history of development, 111-115 individual voices, speaking styles, and accents and, 117-118 input, 109 intelligibility, 44-45, 129, 130, 149, 382, 429 large-vocabulary systems, 101-102, 351
From page 543... ... , 118-119, 383, 476 word-level analysis, 138-139 Speech synthesizers acoustic terminal analog, 117 cartridge-type, 510 cascade, 122-123 future, 455-456 large-vocabulary, 349 neural network controller, 124 OVE, 123 parallel, 123, 125 terminal analog, 510 voice quality, 456 Speech technology. See Deployment of applications capabilities and limitations, 427-430 challenges in, 284, 471-475 commercial developments, 352-354 foundations, 77-78 growth of, 2 information processing, 453 market, 350-352, 416-418 projections, 101-102, 355-356 readiness evaluation, 440 research on, 65-67, 417-418 service trials, 417 status, 82-87 trends, 117
From page 544... ... , 10, 42 voice output, 29 Spoken language understanding, 47 approaches to, 220-221 defined, 255 error repair, 260 limits on, 379 process, 452, 453 progress in, 224-226 spontaneous speech and, 258-260 Sprint, 300 SQL, 57 SRI International, 52, 176, 213 ATIS, 46, 261 Gemini system, 259, 260 Template Matcher, 258, 259 Stenograph, 322, 335 Stereo coding, 84-85 StockTalk, 383-386, 437, 438, 439 Stored voice, 110 Subband coders, 24, 83, 101 SUNDIAL spoken language systems, 229 Surnames, pronunciation of, 140-141, 288 Symbols, pronunciation of, 142-143 Symbolic learning techniques, 501 Syntax, 137. See also Parsing natural language processing system, 244-245, 247, 269 speech recognition systems, 305-306 and spoken language understanding, 220-221 Syntactico-semantic theory, 447 System technologies.
From page 545... ... See also Telecommunications Automated Alternate Billing Services, 292, 293, 431 Automated Customer Name and Address, 302 automatic interpreting, 513-514 bandwidth conservation, 19 banking by phone, 283, 291, 398-399, 407-408, 425 cellular, 6, 7, 81, 83, 374, 383-385, 507-508 deaf user aids, 43, 302-304 digital channels, 101 directory assistance, 41, 278, 282, 283, 291, 292, 295-296, 301-302, 355-356, 438, 458 history, 81 language translation, 10, 42, 77, 81, 82, 83, 108-109, 513-514 operator services, 8-9, 277, 282, 284, 291, 292, 293-296, 351, 353-354, 374, 380, 383-385, 387 simulated telephone lines, 278, 408-409 speech databases, 407 speech recognition technology, 428 teleconferencing, 454-455 telephone relay service, 302-304, 322 text telephone, 322, 323 voice-controlled automated attendant, 356 voice-based dialers, 40, 292, 299-300, 355, 374, 376, 383-386, 436, 507-508 voice-interactive phone service, 292, 300-301, 351 Voice Recognition Call Processing (VRCP), 292, 293-295, 376, 383-385 TELECOM, 510, 513 Telephone answering machines, digital, 7-8 Texas Instruments (TI)
From page 546... ... automatic, 263-264 databases for, 387, 405, 407, 468, 472 discriminative, 479 effects of, 185-186, 473 grammar, 179-180, 185-186 natural language processing, 56, 57, 58, 249, 250, 252, 263-264 phonetic HMMs and lexicon, 30, 178-179, 182-183 speech recognition, 178-180, 185-186, 457, 459, 473, 478-479 syntactico-semantic theory and, 447 Transatlantic radio telephone, 81 Transatlantic telegraph cables, 81 Transform coders, 24 Treebank Project, 241, 491, 495 Trigrams, 92, 183, 201-202, 209-210, 212, 213-214, 229 Triphones, 182 Turing's test, 35 Tuttle, Jerry O., 363 U United Kingdom, Defense Research Agency, 365 University of Indiana, 130 University of Pennsylvania, 181, 241, 252, 491, 495 US West, 300-301 Usability/usefulness. See also Applications of voice communications determinants of, 31-32 issues, 18, 30-32 pronunciation and, 44 voice input, 39-44 voice output, 44-45 User interfaces.
From page 547... ... Users design strategies, 387, 423-424, 426, 433-440 dialogue flow, 435-436 direct manipulation, 51, 52-55, 57-58 error recovery, 438-440 evaluation of, 440 feedback and confirmation, 434, 437-438, 445 hierarchical, 454 information requirements of, 425-426 instructions, 438 keyboard dialogs, 49-50 metaphor, 54 multimodal systems, 32, 56, 63-65, 505, 508-510 N-best, 217, 221, 226, 233 natural language interaction, 55-57 personal computer, 511-512 prompts, 435-436, 471 research directions, 56, 511-512 revisions suggested, 435 robustness, 56 smart, 512-513 system capabilities, 429-430 task modalities, 426 task requirement considerations, 424-427 telecommunications, 397 training issues, 58 user expectations and expertise and, 430-432 voice-actuated, 360 voice input, 427-428 conversational speech behaviors, 430-432 expectations and expertise, 430-432 language modeling by, 60 novices vs. experts, 432 satisfaction, 429-430 tolerance of speech recognition errors, 379 USS Ranger, 363 V Vector quantization, 28 Verbal repair, 269 Videophones, 5-6 Virtual reality technology, 454-455 Visual sensory aids, 319-324 Vocabulary algorithms, 307 confusability, 378 conversational, 101-102 Flexible Vocabulary Recognition, 295 large, 101-102, 183, 193, 277, 292, 307, 349, 351, 506 and natural language understanding, 37-38 operator services, 277 speech analysis and, 28, 36-38 speech recognition and, 36-37, 41-42, 81, 85-86, 183, 185-186, 193, 265-266, 277, 292, 378, 457, 506 speech synthesis, 101-102, 119, 349, 351 user-specific dictionaries, 335-336 wordspotting techniques, 292, 305 Vocal tract modeling, 95, 118, 122, 124, 125 Vocoder, 48, 81, 83, 119, 325 Voice control, assistive, 278-279, 313, 337, 360, 452 conversion system, 128-129 dialog applications, 375-377 fundamental frequency, tactile display, 326-327 input, 39-44, 50, 427-428 mail, 7, 81, 83, 101, 110 messaging systems, 281 mimic, 94-95 output, 44-45, 428-429 response, 25 task-specific control, 452 typewriters, 97, 376, 380, 451
From page 548... ... , 24 speech synthesis, 118, 119, 136, 137, 381, 474 Wavelets, 21 Wideband audio signals, 84 Windows, 52, 350, 353 Wizard of Oz (WOZ) assessment technique, 410-411, 439 Word-level analysis, 138-139 Word models, 179, 207 Word processors, speech only, 50 Word recognition systems, 182, 188 Workstations Hewlett-Packard 735 RISC chips in, 393 Silicon Graphics Indigo R3000, 189 speech input/output operating systems, 401-403 speech processing board, 397 Sun SparcStation 2, 189 Wheatstone, Charles, 80 X Xerox, 52 Z Zipf's law, 489
