TEST BANK FOR Speech and Language Processing 2nd Edition By Bethard S., Jurafsky D. and Martin J.H


Preface
I Words
2 Regular Expressions and Automata
3 Words and Transducers
4 N-Grams
5 Part-of-Speech Tagging
6 Hidden Markov and Maximum Entropy Models
II Speech
7 Phonetics
8 Speech Synthesis
9 Automatic Speech Recognition
10 Speech Recognition: Advanced Topics
11 Computational Phonology
III Syntax
12 Formal Grammars of English
13 Syntactic Parsing
14 Statistical Parsing
15 Features and Unification
16 Language and Complexity
IV Semantics and Pragmatics
17 The Representation of Meaning
18 Computational Semantics
19 Lexical Semantics
20 Computational Lexical Semantics
21 Computational Discourse
V Applications
22 Information Extraction
23 Question Answering and Summarization
24 Dialogue and Conversational Agents
25 Machine Translation
Chapter 2
Regular Expressions and Automata
2.1 Write regular expressions for the following languages. You may use either
Perl/Python notation or the minimal “algebraic” notation of Section 2.3, but
make sure to say which one you are using. By “word”, we mean an alphabetic
string separated from other words by whitespace, any relevant punctuation, line
breaks, and so forth.
1. the set of all alphabetic strings;
[a-zA-Z]+
2. the set of all lower case alphabetic strings ending in a b;
[a-z]*b
3. the set of all strings with two consecutive repeated words (e.g., “Humbert
Humbert” and “the the” but not “the bug” or “the big bug”);
\b([a-zA-Z]+)\s+\1\b
4. the set of all strings from the alphabet a, b such that each a is immediately
preceded by and immediately followed by a b;
(b+(ab+)*)?
5. all strings that start at the beginning of the line with an integer and that end
at the end of the line with a word;
^\d+\b.*\b[a-zA-Z]+$
6. all strings that have both the word grotto and the word raven in them (but
not, e.g., words like grottos that merely contain the word grotto);
\bgrotto\b.*\braven\b|\braven\b.*\bgrotto\b
7. write a pattern that places the first word of an English sentence in a register.
Deal with punctuation.
^[^a-zA-Z]*([a-zA-Z]+)
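The answers above use Perl/Python notation, so they can be spot-checked with Python's re module. The following is only a sketch: the dictionary and helper names are hypothetical, item 3 is written with \b on both sides so that "the theory" does not count as a repeated word, and item 4 is written with (ab+)* so that strings containing no a at all, such as "b", are accepted.

```python
import re

# Patterns for items 1-7 of Exercise 2.1, in Python notation.
PATTERNS = {
    1: r"[a-zA-Z]+",
    2: r"[a-z]*b",
    3: r"\b([a-zA-Z]+)\s+\1\b",
    4: r"(b+(ab+)*)?",
    5: r"^\d+\b.*\b[a-zA-Z]+$",
    6: r"\bgrotto\b.*\braven\b|\braven\b.*\bgrotto\b",
    7: r"^[^a-zA-Z]*([a-zA-Z]+)",
}


def in_language(item, text):
    """Whole-string membership test (suits items 1, 2 and 4)."""
    return re.fullmatch(PATTERNS[item], text) is not None


def found(item, text):
    """Search test (suits items 3, 5 and 6, which describe containing strings)."""
    return re.search(PATTERNS[item], text) is not None


def first_word(sentence):
    """Item 7: return the contents of register 1, or None if there is no word."""
    match = re.search(PATTERNS[7], sentence)
    return match.group(1) if match else None
```

For instance, found(3, "the the") holds while found(3, "the big bug") does not, and first_word('"Hello, world," she said.') skips the leading quotation mark before capturing the word.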
2.2 Implement an ELIZA-like program, using substitutions such as those described
on page 26. You may choose a different domain than a Rogerian psychologist,
if you wish, although keep in mind that you would need a domain in which your
program can legitimately engage in a lot of simple repetition.
The following implementation can reproduce the dialog on page 26.
A more complete solution would include additional patterns.
import re
import string

# Substitutions are applied in order to the lower-cased input.
# Upper-case replacement text marks material that later rules can match.
patterns = [
    (r"\b(i'm|i am)\b", "YOU ARE"),
    (r"\b(i|me)\b", "YOU"),
    (r"\b(my)\b", "YOUR"),
    (r"\b(well,?) ", ""),
    (r".* YOU ARE (depressed|sad) .*",
     r"I AM SORRY TO HEAR YOU ARE \1"),
    # Same pattern as the rule above, so this rule never fires once that
    # rule has rewritten the input.
    (r".* YOU ARE (depressed|sad) .*",
     r"WHY DO YOU THINK YOU ARE \1"),
    (r".* all .*", "IN WHAT WAY"),
    (r".* always .*", "CAN YOU THINK OF A SPECIFIC EXAMPLE"),
    # Finally, strip any remaining punctuation.
    (r"[%s]" % re.escape(string.punctuation), ""),
]

while True:
    comment = input()
    response = comment.lower()
    for pat, sub in patterns:
        response = re.sub(pat, sub, response)
    print(response.upper())
2.3 Complete the FSA for English money expressions in Fig. 2.15 as suggested in the
text following the figure. You should handle amounts up to $100,000, and make
sure that “cent” and “dollar” have the proper plural endings when appropriate.
2.4 Design an FSA that recognizes simple date expressions like March 15, the 22nd
of November, Christmas. You should try to include all such “absolute” dates
(e.g., not “deictic” ones relative to the current day, like the day before yesterday).
Each edge of the graph should have a word or a set of words on it. You should
use some sort of shorthand for classes of words to avoid drawing too many arcs
(e.g., furniture → desk, chair, table).
2.5 Now extend your date FSA to handle deictic expressions like yesterday, tomorrow,
a week from tomorrow, the day before yesterday, Sunday, next Monday,
three weeks from Saturday.
2.6 Write an FSA for time-of-day expressions like eleven o’clock, twelve-thirty, midnight,
or a quarter to ten, and others.
2.7 (Thanks to Pauline Welby; this problem probably requires the ability to knit.)
Write a regular expression (or draw an FSA) that matches all knitting patterns
for scarves with the following specification: 32 stitches wide, K1P1 ribbing on
both ends, stockinette stitch body, exactly two raised stripes. All knitting patterns
must include a cast-on row (to put the correct number of stitches on the needle)
and a bind-off row (to end the pattern and prevent unraveling). Here’s a sample
pattern for one possible scarf matching the above description [1]:

 1. Cast on 32 stitches.                               cast on; puts stitches on needle
 2. K1 P1 across row (i.e., do (K1 P1) 16 times).      K1P1 ribbing
 3. Repeat instruction 2 seven more times.             adds length
 4. K32, P32.                                          stockinette stitch
 5. Repeat instruction 4 an additional 13 times.       adds length
 6. P32, P32.                                          raised stripe stitch
 7. K32, P32.                                          stockinette stitch
 8. Repeat instruction 7 an additional 251 times.      adds length
 9. P32, P32.                                          raised stripe stitch
10. K32, P32.                                          stockinette stitch
11. Repeat instruction 10 an additional 13 times.      adds length
12. K1 P1 across row.                                  K1P1 ribbing
13. Repeat instruction 12 an additional 7 times.       adds length
14. Bind off 32 stitches.                              binds off row: ends pattern

[1] Knit and purl are two different types of stitches. The notation Kn means do n knit stitches. Similarly for
purl stitches. Ribbing has a striped texture; most sweaters have ribbing at the sleeves, bottom, and neck.
Stockinette stitch is a series of knit and purl rows that produces a plain pattern; socks or stockings are knit
with this basic pattern, hence the name.
In the expression below, C stands for cast on, K stands for knit, P
stands for purl and B stands for bind off:
C{32}
((KP){16})+
(K{32}P{32})+
P{32}P{32}
(K{32}P{32})+
P{32}P{32}
(K{32}P{32})+
((KP){16})+
B{32}
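As a sanity check, the expression can be run against the sample pattern from the exercise. This is a sketch under an assumed encoding (not given in the source): each stitch becomes one character, so a row of K32 is 32 K's, and the 14 instructions are concatenated row by row. The names SCARF and encode_sample are hypothetical.

```python
import re

# The expression above, with the lines concatenated.
SCARF = re.compile(
    r"C{32}"
    r"((KP){16})+"
    r"(K{32}P{32})+"
    r"P{32}P{32}"
    r"(K{32}P{32})+"
    r"P{32}P{32}"
    r"(K{32}P{32})+"
    r"((KP){16})+"
    r"B{32}"
)


def encode_sample():
    """Encode the 14-instruction sample scarf, one character per stitch."""
    ribbing = "KP" * 16                 # one K1 P1 row
    stockinette = "K" * 32 + "P" * 32   # one knit row followed by one purl row
    stripe = "P" * 32 + "P" * 32        # raised stripe
    return (
        "C" * 32
        + ribbing * 8          # instructions 2-3
        + stockinette * 14     # instructions 4-5
        + stripe               # instruction 6
        + stockinette * 252    # instructions 7-8
        + stripe               # instruction 9
        + stockinette * 14     # instructions 10-11
        + ribbing * 8          # instructions 12-13
        + "B" * 32
    )
```

SCARF.fullmatch(encode_sample()) succeeds, while a scarf with only one raised stripe is rejected because the expression requires exactly two.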
2.8 Write a regular expression for the language accepted by the NFSA in Fig. 2.26.
[Figure 2.1: A mystery language. NFSA diagram with states q0, q1, q2, q3 and arcs labeled a and b; the arcs are not recoverable from this copy.]
(aba?)+
2.9 Currently the function D-RECOGNIZE in Fig. 2.12 solves only a subpart of the
important problem of finding a string in some text. Extend the algorithm to solve
the following two deficiencies: (1) D-RECOGNIZE currently assumes that it is
already pointing at the string to be checked, and (2) D-RECOGNIZE fails if the
string it is pointing to includes as a proper substring a legal string for the FSA.
That is, D-RECOGNIZE fails if there is an extra character at the end of the string.
To address these problems, we will have to try to match our FSA at
each point in the tape, and we will have to accept (the current substring)
any time we reach an accept state. The former requires an
additional outer loop, and the latter requires a slightly different structure
for our case statements:
function D-RECOGNIZE(tape, machine) returns accept or reject
  for start from 0 to LENGTH(tape) − 1 do
    index ← start
    current-state ← Initial state of machine
    while index < LENGTH(tape) and
        transition-table[current-state, tape[index]] is not empty do
      current-state ← transition-table[current-state, tape[index]]
      index ← index + 1
      if current-state is an accept state then
        return accept
  return reject
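The same idea can be sketched in Python. The representation is an assumption of this sketch: the transition table is a dictionary keyed by (state, symbol), and the function and parameter names are hypothetical. Each starting position gets a fresh run from the initial state, and the function accepts as soon as any run reaches an accept state.

```python
def d_recognize_search(tape, transitions, start_state, accept_states):
    """Return True if some substring of tape is accepted by the DFSA."""
    for start in range(len(tape)):
        state = start_state
        index = start
        # Follow transitions for as long as the table has an entry.
        while index < len(tape) and (state, tape[index]) in transitions:
            state = transitions[(state, tape[index])]
            index += 1
            # Accept immediately; trailing characters are allowed.
            if state in accept_states:
                return True
    return False


# Sheep language baa+! as a transition table.
SHEEP = {(0, "b"): 1, (1, "a"): 2, (2, "a"): 3, (3, "a"): 3, (3, "!"): 4}
```

With this table, d_recognize_search("xxbaaa!yy", SHEEP, 0, {4}) finds the embedded sheep talk even though it neither starts nor ends the tape.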
2.10 Give an algorithm for negating a deterministic FSA. The negation of an FSA
accepts exactly the set of strings that the original FSA rejects (over the same
alphabet) and rejects all the strings that the original FSA accepts.
First, make sure that all states in the FSA have outward transitions for
all characters in the alphabet. If any transitions are missing, introduce
a new non-accepting state (the fail state), and add all the missing
transitions, pointing them to the new non-accepting state.
Finally, make all non-accepting states into accepting states, and
vice-versa.
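A minimal Python sketch of these two steps follows. It assumes the DFSA is given as a set of states, an alphabet, and a dictionary from (state, symbol) to the next state; the function names and the fail-state label "FAIL" are hypothetical, and the label is assumed not to clash with an existing state.

```python
def negate_dfsa(states, alphabet, transitions, accepting, fail_state="FAIL"):
    """Return (states, transitions, accepting) for the negated DFSA."""
    new_states = set(states)
    new_transitions = dict(transitions)
    # Step 1: make the transition table total, routing every missing
    # transition to a single non-accepting fail state.
    for state in states:
        for symbol in alphabet:
            if (state, symbol) not in new_transitions:
                new_states.add(fail_state)
                new_transitions[(state, symbol)] = fail_state
    if fail_state in new_states:
        for symbol in alphabet:
            new_transitions[(fail_state, symbol)] = fail_state
    # Step 2: swap accepting and non-accepting states.
    new_accepting = new_states - set(accepting)
    return new_states, new_transitions, new_accepting


def run_dfsa(transitions, start, accepting, string):
    """Run a total DFSA on string and report whether it accepts."""
    state = start
    for symbol in string:
        state = transitions[(state, symbol)]
    return state in accepting
```

For a DFSA accepting ab*, the negated machine accepts the empty string, "b", and "aba", and rejects "a" and "abb".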
2.11 Why doesn’t your previous algorithm work with NFSAs? Now extend your algorithm
to negate an NFSA.
The problem arises from the different definition of accept and reject
in an NFSA. We accept if there is “some” accepting path, and only reject if all
paths fail. So a tape that leads down a single rejecting path does not necessarily
get rejected, and so in the negated machine it does not necessarily get
accepted.
For example, we might have an ε-transition from the accept state
to a non-accepting state. Using the negation algorithm above, we
swap accepting and non-accepting states. But we can still accept
strings from the original NFSA by simply following the transitions as
before to the original accept state. Though it is now a non-accepting
state, we can simply follow the ε-transition and stop. Since the
ε-transition consumes no characters, we have reached an accepting state
with the same string as we would have using the original NFSA.
To solve this problem, we first convert the NFSA to a DFSA, and
then apply the algorithm as before.
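One way to sketch this in Python is to fold the two stages together: build the DFSA with the standard subset construction, and mark a subset as accepting exactly when it contains no accept state of the original NFSA. The representation is an assumption of this sketch: transitions map (state, symbol) to a set of states, the empty string stands for ε, and all names are hypothetical. The empty subset plays the role of the fail state.

```python
EPS = ""  # label assumed here for ε-transitions


def eps_closure(states, delta):
    """All states reachable from `states` using only ε-transitions."""
    stack, closure = list(states), set(states)
    while stack:
        state = stack.pop()
        for nxt in delta.get((state, EPS), ()):
            if nxt not in closure:
                closure.add(nxt)
                stack.append(nxt)
    return frozenset(closure)


def negate_nfsa(alphabet, delta, start, accepting):
    """Return (transitions, start, accepting) of a DFSA for the negation."""
    d_start = eps_closure({start}, delta)
    d_trans, d_accept = {}, set()
    todo, seen = [d_start], {d_start}
    while todo:
        subset = todo.pop()
        # Swapped acceptance: accept only if no original accept state is present.
        if not (subset & set(accepting)):
            d_accept.add(subset)
        for symbol in alphabet:
            moved = set()
            for state in subset:
                moved |= set(delta.get((state, symbol), ()))
            target = eps_closure(moved, delta)
            d_trans[(subset, symbol)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return d_trans, d_start, d_accept


def accepts(dfsa, string):
    """Run the constructed DFSA; symbols must come from the alphabet."""
    trans, state, accepting = dfsa
    for symbol in string:
        state = trans[(state, symbol)]
    return state in accepting
```

Applied to the example from the text (an accept state with an ε-transition to a non-accepting state), the negated machine correctly rejects the string that the original NFSA accepts.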
Chapter 3
Words and Transducers
3.1 Give examples of each of the noun and verb classes in Fig. 3.6, and find some
exceptions to the rules.
Examples:
• noun_i: fossil
• verb_j: pass
• verb_k: conserve
• noun_l: wonder
Exceptions:
• noun_i: apology accepts -ize but apologization sounds odd
• verb_j: detect accepts -ive but it becomes a noun, not an adjective
• verb_k: cause accepts -ative but causativeness sounds odd
• noun_l: arm accepts -ful but it becomes a noun, not an adjective
3.2 Extend the transducer in Fig. 3.17 to deal with sh and ch.
One possible solution:
3.3 Write a transducer(s) for the K insertion spelling rule in English.
One possible solution:
3.4 Write a transducer(s) for the consonant doubling spelling rule in English.
One possible solution, where V stands for vowel, and C stands for
consonant:
3.5 The Soundex algorithm (Knuth, 1973; Odell and Russell, 1922) is a method
commonly used in libraries and older census records for representing people’s
names. It has the advantage that versions of the names that are slightly misspelled
or otherwise modified (common, e.g., in hand-written census records) will still
have the same representation as correctly spelled names. (e.g., Jurafsky, Jarofsky,
Jarovsky, and Jarovski all map to J612).
1. Keep the first letter of the name, and drop all occurrences of non-initial a,
e, h, i, o, u, w, y.
2. Replace the remaining letters with the following numbers:
b, f, p, v → 1
c, g, j, k, q, s, x, z → 2
d, t → 3
l → 4
m, n → 5
r → 6
3. Replace any sequences of identical numbers, only if they derive from two or
more letters that were adjacent in the original name, with a single number
(e.g., 666 → 6).
4. Convert to the form Letter Digit Digit Digit by dropping digits
past the third (if necessary) or padding with trailing zeros (if necessary).
The exercise: write an FST to implement the Soundex algorithm.
One possible solution, using the following abbreviations:
V = a, e, h, i, o, u, w, y
C1 = b, f, p, v
C2 = c, g, j, k, q, s, x, z
C3 = d, t
C4 = l
C5 = m, n
C6 = r
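For checking an FST against expected outputs, the four steps of the exercise statement can be sketched as an ordinary Python function (not a transducer). The names are hypothetical, and two assumptions are made: the input is a non-empty alphabetic name, and the initial letter is kept as a letter without taking part in the collapsing of step 3.

```python
# Letter classes C1-C6 mapped to their digits.
CODES = {}
for letters, digit in [("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")]:
    for letter in letters:
        CODES[letter] = digit


def soundex(name):
    """Return the Letter Digit Digit Digit code for a non-empty name."""
    name = name.lower()
    digits = []
    prev = None  # code of the immediately preceding letter in the original name
    for letter in name[1:]:
        code = CODES.get(letter)  # None for a, e, h, i, o, u, w, y (step 1)
        # Step 3: skip a digit only when the adjacent previous letter had it too.
        if code is not None and code != prev:
            digits.append(code)
        prev = code
    # Step 4: pad with zeros, then keep the letter plus three digits.
    return (name[0].upper() + "".join(digits) + "000")[:4]
```

The four spellings from the exercise (Jurafsky, Jarofsky, Jarovsky, Jarovski) all come out as J612.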
3.6 Read Porter (1980) or see Martin Porter’s official homepage on the Porter stemmer.
Implement one of the steps of the Porter Stemmer as a transducer.
Porter stemmer step 1a looks like:
SSES → SS
IES → I
SS → SS
S → ε
One possible transducer for this step:
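Independently of the transducer, the four rules of step 1a can be sketched as ordered suffix rewrites in Python, which gives a reference against which a transducer can be tested. The names are hypothetical, and lower-case input is assumed. The rules are tried longest suffix first, and only the first matching rule applies.

```python
# Step 1a rules in priority order: (suffix, replacement).
STEP_1A_RULES = [("sses", "ss"), ("ies", "i"), ("ss", "ss"), ("s", "")]


def step_1a(word):
    """Apply the first matching step 1a rule to a lower-case word."""
    for suffix, replacement in STEP_1A_RULES:
        if word.endswith(suffix):
            return word[: len(word) - len(suffix)] + replacement
    return word
```

The SS → SS rule looks redundant, but it stops the final S → ε rule from stripping the last letter of words such as caress.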
3.7 Write the algorithm for parsing a finite-state transducer, using the pseudocode
introduced in Chapter 2. You should do this by modifying the algorithm ND-RECOGNIZE
in Fig. 2.19 in Chapter 2.
FSTs consider pairs of strings and output accept or reject. So the
major changes to the ND-RECOGNIZE algorithm all revolve around
moving from looking at a single tape to looking at a pair of tapes.
Probably the most important change is in GENERATE-NEW-STATES,
where we now must try all combinations of advancing a character or
staying put (for an ε) on either the source string or the target string.
function ND-RECOGNIZE(s-tape, t-tape, machine) returns accept/reject
  agenda ← {(Machine start state, s-tape start, t-tape start)}
  while agenda is not empty do
    current-state ← NEXT(agenda)
    if ACCEPT-STATE?(current-state) then
      return accept
    agenda ← agenda ∪ GENERATE-NEW-STATES(current-state)
  return reject

function GENERATE-NEW-STATES(current-state) returns search states
  node ← the node the current-state is on
  s-index ← the point on s-tape the current-state is on
  t-index ← the point on t-tape the current-state is on
  return
    (transition[node, ε:ε], s-index, t-index) ∪
    (transition[node, s-tape[s-index]:ε], s-index + 1, t-index) ∪
    (transition[node, ε:t-tape[t-index]], s-index, t-index + 1) ∪
    (transition[node, s-tape[s-index]:t-tape[t-index]], s-index + 1, t-index + 1)

function ACCEPT-STATE?(search-state) returns true/false
  node ← the node the search-state is on
  s-index ← the point on s-tape the search-state is on
  t-index ← the point on t-tape the search-state is on
  return s-index is at the end of s-tape and
    t-index is at the end of t-tape and
    node is an accept state of the machine
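The pseudocode can be sketched in Python as follows. The representation is an assumption of this sketch: transitions map (node, source symbol, target symbol) to a set of nodes, the empty string stands for ε, and the names are hypothetical. A set of visited search states is added, which the pseudocode does not have, so that ε:ε loops cannot keep the agenda growing forever.

```python
def fst_recognize(s_tape, t_tape, transitions, start, accepting):
    """Return True if the FST accepts the pair (s_tape, t_tape)."""
    agenda = [(start, 0, 0)]
    seen = set(agenda)
    while agenda:
        node, s_index, t_index = agenda.pop()
        # Accept only when both tapes are used up in an accept state.
        if s_index == len(s_tape) and t_index == len(t_tape) and node in accepting:
            return True
        # On each tape, either stay put (ε) or consume the next symbol.
        s_options = [("", s_index)]
        if s_index < len(s_tape):
            s_options.append((s_tape[s_index], s_index + 1))
        t_options = [("", t_index)]
        if t_index < len(t_tape):
            t_options.append((t_tape[t_index], t_index + 1))
        for s_symbol, next_s in s_options:
            for t_symbol, next_t in t_options:
                for next_node in transitions.get((node, s_symbol, t_symbol), ()):
                    state = (next_node, next_s, next_t)
                    if state not in seen:
                        seen.add(state)
                        agenda.append(state)
    return False


# Toy FST: rewrite every a as b, then append x (an ε:x arc into the accept state).
TOY = {(0, "a", "b"): {0}, (0, "", "x"): {1}}
```

With this toy machine, fst_recognize("aa", "bbx", TOY, 0, {1}) accepts, while the pair ("aa", "bb") is rejected because the final x is never produced.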
3.8 Write a program that takes a word and, using an on-line dictionary, computes
possible anagrams of the word, each of which is a legal word.
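def permutations(string):
    """Yield every ordering of the characters in string."""
    if len(string) < 2:
        yield string
    else:
        first, rest = string[:1], string[1:]
        # first can go at any of len(string) positions in each
        # permutation of rest
        indices = range(len(string))
        for sub_string in permutations(rest):
            for i in indices:
                yield sub_string[:i] + first + sub_string[i:]

def anagrams(string):
    """Yield the permutations of string that are legal words."""
    # is_word is not defined here; it stands for the dictionary lookup.
    for candidate in permutations(string):
        if is_word(candidate):  # query online dictionary
            yield candidate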
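Enumerating permutations costs n! dictionary queries for an n-letter word. A different technique, sketched below, is to index the dictionary once by each word's sorted letters and then do a single lookup. This assumes the dictionary is available locally as an iterable of words, which the exercise does not state; the function names are hypothetical.

```python
from collections import defaultdict


def build_index(words):
    """Map each sorted-letter signature to the set of words that share it."""
    index = defaultdict(set)
    for word in words:
        index["".join(sorted(word))].add(word)
    return index


def anagrams_from_index(word, index):
    """Return, in sorted order, all indexed words with the same letters as word."""
    return sorted(index.get("".join(sorted(word)), set()))
```

For example, after index = build_index(["listen", "silent", "enlist", "google", "tinsel"]), the call anagrams_from_index("listen", index) returns the four words that share those letters, including "listen" itself.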
