Finite state transducer
A finite state transducer (FST) is a finite state machine with two tapes: an input tape and an output tape. This contrasts with an ordinary finite state automaton (or finite state acceptor), which has a single tape.
Contents |
[edit] Overview
An automaton can be said to recognize a string if we view the content of its tape as input. In other words, the automaton computes a function that maps strings into the set {0,1}. Alternatively, we can say that an automaton generates strings, which means viewing its tape as an output tape. On this view, the automaton generates a formal language, which is a set of strings. The two views of automata are equivalent: the function that the automaton computes is precisely the indicator function of the set of strings it generates. The class of languages generated by finite automata is known as the class of regular languages.
The two tapes of a transducer are typically viewed as an input tape and an output tape. On this view, a transducer is said to transduce (i.e., translate) the contents of its input tape to its output tape, by accepting a string on its input tape and generating another string on its output tape. It may do so nondeterministically and it may produce more than one output for each input string. A transducer may also produce no output for a given input string, in which case it is said to reject the input. In general, a transducer computes a relation between two formal languages. The class of relations computed by finite state transducers is known as the class of rational relations.
Finite-state transducers are often used for phonological and morphological analysis in natural language processing research and applications. Pioneers in this field include Ronald Kaplan, Lauri Karttunen, Martin Kay and Kimmo Koskenniemi.[1] A common way of using transducers is in a so-called "cascade", where transducers for various operations are combined into a single transducer by repeated application of the composition operator (defined below).
[edit] Formal construction
Formally, a finite transducer T is a 6-tuple (Q, Σ, Γ, I, F, δ) such that:
- Q is a finite set, the set of states;
- Σ is a finite set, called the input alphabet;
- Γ is a finite set, called the output alphabet;
- I is a subset of Q, the set of initial states;
- F is a subset of Q, the set of final states; and
(where ε is the empty string) is the transition relation.
We can view (Q, δ) as a labeled directed graph, known as the transition graph of T: the set of vertices is Q, and
means that there is a labeled edge going from vertex q to vertex r. We also say that a is the input label and b the output label of that edge.
NOTE: This definition of finite transducer is also called letter transducer (Roche and Schabes 1997); alternative definitions are possible, but can all be converted into transducers following this one.
Define the extended transition relation
as the smallest set such that:
;
for all
; and- whenever
and
then
.
The extended transition relation is essentially the reflexive transitive closure of the transition graph that has been augmented to take edge labels into account. The elements of
are known as paths. The edge labels of a path are obtained by concatenating the edge labels of its constituent transitions in order.
The behavior of the transducer T is the rational relation [T] defined as follows:
if and only if there exists
and
such that
. This is to say that T transduces a string
into a string
if there exists a path from an initial state to a final state whose input label is x and whose output label is y.
Finite State Transducers can be weighted, where each transition is labeled with a weight in addition to the input and output labels. This property makes FSTs very useful for machine learning applications. A Weighted Finite State Transducer (WFST) over a semiring K can be defined similarly to an unweighted one as an 8-tuple T=(Q, Σ, Γ, I, F, E, λ, ρ), where:
- Q, Σ, Γ, I, F are defined as above;
(where ε is the empty string) is the finite set of transitions;
maps initial states to weights;
maps final states to weights.
In order to make certain operations on WFSTs well-defined, the weights have to form a Semiring. Two typical semirings used in practice are Log and Tropical Semirings.
[edit] Operations on finite state transducers
The following operations defined on finite automata also apply to finite transducers:
- Union. Given transducers T and S, there exists a transducer
such that
if and only if
or
.
- Concatenation. Given transducers T and S, there exists a transducer
such that
if and only if
and
.
- Kleene closure. Given a transducer T, there exists a transducer
with the following properties: (1)
; (2) if
and
then
; and
does not hold unless mandated by (1) or (2).
- Intersection. Given transducers T, S, the intersection of these transducers H has the property that (1) x[H]y if and only if x[T]y and x[S]y.
- Composition. Given a transducer T on alphabets Σ and Γ and a transducer S on alphabets Γ and Δ, there exists a transducer
on Σ and Δ such that
if and only if there exists a string
such that
and
. This operation extends to the weighted case.[2]
- This definition uses the same notation which is used in mathematics for relation composition. However, the conventional reading for relation composition is the other way around: given two relations
and
,
when there exist some
such that
and
.
- Projection to an automaton. There are two projection functions:
preserves the input tape, and
preserves the output tape. The first projection,
is defined as follows:
-
- Given a transducer T, there exists a finite automaton
such that
accepts x if and only if there exists a string y for which
.
- Given a transducer T, there exists a finite automaton
- The second projection,
is defined similarly.
- Determinization. Given a transducer T, we want to build an equivalent transducer which has a unique initial state and such that no two transitions leaving any state share the same input label. The powerset construction can be extended to transducers, or even weighted transducers, but sometimes fails to halt; indeed, some non-deterministic transducers do not admit equivalent deterministic transducers.[3] Characterizations of determinizable transducers have been proposed[4] along with efficient algorithms to test them[5]: they rely on the semiring used in the weighted case as well as a general property on the structure of the transducer (the twins property).
- Weight pushing for the weighted case.[6]
- Minimization for the weighted case.[7]
- Removal of epsilon-transitions.
[edit] Additional properties of finite state transducers
- It is decidable whether the relation [T] of a transducer T is empty.
- It is decidable whether there exists a string y such that x[T]y for a given string x.
- It is undecidable whether two transducers are equivalent.
- If one defines the alphabet of labels
, finite state transducers are isomorphic to NDFA over the alphabet
, and may therefore be determinized (turned into deterministic finite automata over the alphabet
) and subsequently minimized so that they have the minimum number of states.[citation needed]
[edit] Application examples
- Contextual rewriting

[edit] See also
[edit] Internal links
[edit] External links
- OpenFst, an open-source library for FST operations.
[edit] Notes
- ^ Koskenniemi 1983
- ^ Mohri 2004, pp. 3–5
- ^ [1]
- ^ Mohri 2004, pp. 5–6
- ^ Allauzen 2003
- ^ Mohri 2004, pp. 7–9
- ^ Mohri 2004, pp. 9–11
[edit] References
- Allauzen, Cyril; Mohri, Mehryar (2003). "Efficient Algorithms for Testing the Twins Property". Journal of Automata, Languages and Combinatorics 8 (2): 117–144. http://www.cs.nyu.edu/~mohri/pub/twins.pdf.
- Koskenniemi, Kimmo (1983), Two-level morphology: A general computational model of word-form recognition and production, Department of General Linguistics, University of Helsinki, http://www.ling.helsinki.fi/~koskenni/doc/Two-LevelMorphology.pdf
- Mohri, Mehryar (2004). "Weighted Finite-State Transducer Algorithms: An Overview". Formal Languages and Applications 148 (620): 551–564. http://www.cs.nyu.edu/~mohri/pub/fla.pdf.
[edit] Further reading
- Jurafsky, Daniel; James H. Martin (2000). Speech and Language Processing. Prentice Hall. pp. 71–83. ISBN 0-13-095069-6.
- Kornai, Andras (1999). Extended Finite State Models of Language. Cambridge University Press. ISBN 0-521-63198-X.
- Roche, Emmanuel; Yves Schabes (1997). Finite-state language processing. MIT Press. pp. 1–65. ISBN 0-262-18182-7.
- Beesley, Kenneth R.; Lauri Karttunen (2003). Finite State Morphology. Center for the Study of Language and Information. ISBN 1575864347.
- Roark, Brian; Richard Sproat (2007). Computational Approaches to Morphology and Syntax. Oxford University Press. ISBN 0199274789.
- Berstel, Jean (1979). Transductions and Context-free Languages. Teubner Verlag.. Free PDF version
(where ε is the
;
for all
; and
and
then
.
(where ε is the
maps initial states to weights;
maps final states to weights.
such that
if and only if
.
such that
if and only if
and
.
with the following properties: (1)
; (2) if
and
then
; and
does not hold unless mandated by (1) or (2).
on Σ and Δ such that
if and only if there exists a string
. This operation extends to the weighted case.
and
,
when there exist some
such that
and
.
preserves the input tape, and
preserves the output tape. The first projection,
such that
, finite state transducers are isomorphic to
, and may therefore be determinized (turned into
) and subsequently minimized so that they have the minimum number of states.[