A verbum dicendi (Latin for "word of speaking" or "verb of speaking"; plural verba dicendi) is a word that expresses speech or introduces a quotation. English examples of verbs of speaking include say, utter, ask and rumble. Because a verbum dicendi often introduces a quotation, it may grammaticalize into a quotative.

Complement of verbum dicendi: direct and indirect speech

A complement of verbum dicendi can be direct or indirect speech. Direct speech is a single linguistic object that is "'mentioned' rather than used."[1] In contrast, indirect speech is a proposition whose parts make a semantic and syntactic contribution to the whole sentence, just like parts of the matrix clause (i.e. the main clause/sentence, as opposed to an embedded clause).

Cross-linguistically, there are syntactic differences between direct and indirect speech, which include verbatimness, interpretations of deictic expressions, tense, presence or absence of complementizers, and syntactic opacity.[1]

The complement clause may or may not be verbatim

If a complement of verbum dicendi is direct speech, it is presented as a faithful report of what the original speaker exactly said. In the following examples, (1)a entails that "I will go to Tokyo" was the exact sentence that John uttered. In (1)b, on the other hand, John might have uttered a different sentence, for example, "I'll spend my vacation in Tokyo."[1]

(1)a. John said (to me): "I will go to Tokyo"

(1)b. John said (to me) that he would go to Tokyo.[1]

Indexicals in the complement clause may or may not be utterance-bound

If a complement of verbum dicendi is direct speech, deictic expressions in the complement are interpreted with respect to the context in which the original sentence was uttered.[1] In (2)a, the embedded clause is direct speech; the first person pronoun I and the second person pronoun you in "I_i will give you_j a hand" respectively refer to the utterer and the addressee in the context in which this quoted speech was uttered. In contrast, if the embedded clause is indirect speech, all deictic expressions in the sentence are interpreted in the context in which the matrix clause is uttered. In (2)b, the embedded clause is indirect speech, so all the occurrences of the first person pronoun me and the second person pronoun you in the sentence respectively refer to the utterer and the addressee in the immediate context in which (2)b is uttered.

(2)a. You_i said to me_j: "I_i will give you_j a hand."

(2)b. You_i said to me_j that you_i would give me_j a hand.[1]

Sequence of tense

Some languages, including English, show a difference in tense between direct and indirect quotes, as in the shift from will in (1)a to would in (1)b. This phenomenon is formalized as "the sequence of tense rules."[1]
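
As a rough illustration of the tense shift between (1)a and (1)b, the following sketch swaps a handful of English present-tense auxiliaries for their past-tense counterparts. The table BACKSHIFT and the function backshift are hypothetical names introduced here, not part of the cited analysis; the mapping is deliberately tiny and ignores main-verb morphology and the pronoun changes that also accompany indirect speech.

```python
# Toy illustration of tense back-shift in English indirect speech.
# BACKSHIFT and backshift() are illustrative names (assumptions),
# and the table covers only a few auxiliaries.
BACKSHIFT = {
    "will": "would",
    "can": "could",
    "may": "might",
    "am": "was",
    "is": "was",
    "are": "were",
    "have": "had",
    "has": "had",
}


def backshift(clause: str) -> str:
    """Replace each listed auxiliary with its back-shifted form, word by word."""
    return " ".join(BACKSHIFT.get(word, word) for word in clause.split())


# Embedded clause of (1)b, starting from a present-tense version.
print(backshift("he will go to Tokyo"))
```

Words that are not in the table are passed through unchanged, so the sketch only shows the direction of the shift rather than a complete rule system.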


Presence of an overt complementizer

In some languages, the distinction between direct and indirect speech can be diagnosed by the presence of an overt complementizer. Many languages, including English, have an overt complementizer (e.g. that in English) when the complement of verbum dicendi is indirect speech, as seen in (1)b and (2)b above. Some languages, such as Tikar, on the other hand, use an overt complementizer to introduce indirect speech.[1]

Syntactic opacity

If a complement of verbum dicendi is direct speech, it is "syntactically opaque,"[1] meaning that syntactic elements inside this embedded clause cannot interact with elements in the matrix clause.

For example, negative polarity items (NPIs) inside an embedded direct quote cannot be licensed by a syntactic element in the matrix clause.

(3)a. ?Nobody said "we saw anything."

(3)b. Nobody said that they had seen anything.[2]

Note that (3)a is still syntactically well-formed but cannot communicate the same meaning as (3)b, in which the NPI anything inside the embedded indirect quote [they had seen anything] is licensed by nobody in the matrix clause. Another example is that wh-movement out of an embedded direct quote is prohibited, as seen in (4)a below.

(4)a. *What did John say: "I read _"?

(4)b. What did John say that he had read _?[1]

Verbum dicendi in English

In English, verba dicendi (singular: verbum dicendi) such as say and think are used to report speech and thought processes.[3]

(1)a. If you touched a one they would say ‘wey you’re on’. (UK)
   b. And I thought ‘Well we need some more popcorn’. (US)[3]

Such examples are prototypical, but many variants exist within an open class of manner-of-speaking verbs such as ask, shout, scream, wonder, yell, holler, bellow, grunt, mumble, mutter, etc. These may be considered semantically more specific, implying a clause type (as in ask) or indicating the intensity or prosody of the reported material (e.g. shout, mutter).

Quotation indicates to a listener that a message originated from a different voice, and/or at a different time than the present. An utterance like “Jim said ‘I love you’” reports at the present moment that Jim said “I love you” at some time in the past. Thus, there are two distinct active voices: that of the narrator and that of the reportee.[3] Written English often employs manner-of-speaking verbs or verba dicendi in conjunction with quotation marks to demarcate the quoted content. Speakers use more subtle phonetic and prosodic cues like intonation, rhythm, and mimesis to indicate reported speech.


There are numerous syntactically and semantically relevant properties of verba dicendi and manner-of-speaking verbs, several of which are highlighted below:

i. They are so-called Activity Verbs. They may occur in progressive and imperative forms, among other tests:

(2)a. He was shouting obscenities
   b. Yell to George about the new quota
   c. What John did was lisp French to Mary[4]

ii. The subject of verba dicendi is normally sentient:

(3)a. My father howled for me to pick up the chair
   b. *My desk howled for me to pick up the chair[4]

However, it is possible, at least colloquially, to assign the subject role of some verba dicendi to a non-sentient entity. An expression like "when you're late, it says you don't care" could be one such example.

iii. Verba dicendi may have an indirect object, which may be marked by to and which is also normally sentient:

(4)a. Scream ‘Up the Queen’ (to the first person who passes)
   b. *She will howl ‘O my stars and garters’ to the essence of friendship[4]

iv. Manner-of-speaking verbs may have a direct object, which may be a noun describing the speech act itself, a desentential complement (that-clause, indirect question or infinitive), or a direct quotation:

(5)a. Hoffman will probably mutter a foul oath
   b. Martin shrieked that there were cockroaches in the caviar
   c. Regrettably, someone mumbled, “I suspect poison”[4]

Further, the direct object of some manner-of-speaking verbs may be deleted, resulting in a sentence that does not indicate an act of communication, but rather a description of the sound made:

   d. My companion shrieked[4]

Other verba dicendi do not permit this, however. Say, ask, tell, for instance, cannot occur freely without an object:

   e. *Said John[5]

Speak may occur without an object. In fact, its occurrence with an object is restricted. A that-clause, for example, is ungrammatical:

   f. Margaret spoke (to me)
   g. *Margaret spoke that there were cockroaches in the caviar[4]

v. Some manner-of-speaking verbs may occur with directional adverbials, which cannot co-occur with indirect objects:

(6)a. He bellowed at us (*to Sam)[4]

Other verba dicendi cannot occur in at constructions:

   b. *She {said/remarked/declared} (something) at me[4]

vi. Some manner-of-speaking verbs may have a nominal (noun) counterpart which sounds the same, but which has no communicative content, such as mutter, bellow, shriek, whine and whisper. Notice that other verba dicendi do not have these homophonous nouns (e.g. speak/speech, tell/tale, declare/declaration).[4]

Another property of verbs of speaking is the lack of a so-called factivity effect; in other words, the speaker is not required to actually believe what they are saying.[6] This has implications for the truth conditions of quotative constructions:

(7) Mary says that Paul is her friend

Mary’s statement may be false, though it may be true that she actually said it. In fact, she may even believe it to be false. However, whether or not believing is part of speaking has been debated for some time.[6]


The syntax of quotation and verba dicendi appears at first glance to be a straightforward case of transitivity, wherein the quoted material is interpreted as a direct object. In a case like

(8) Jim said “I love you”[3]

Figure: Phrase structure tree of embedded quotation as direct object[3]

the traditionally held analysis takes the reported clause “I love you” to be the complement of say. Thus the quote is termed an NP (noun phrase) and introduced as a direct object.

This analysis is supported by some of the typical syntactic tools for testing direct objects, such as moving into the focus of a question and clefting.[3] However, constituency, movement and replacement tests show that the quotative clause does not behave like a normal transitive construction. For example, clefting and passivization of these forms give marked (ungrammatical, or strange at least) results:

(9)a. ?“I’ll call you” was said by Pat
      cf. The cat was held by Pat
   b. ?What Pat did with “I’ll call you” was say it
      cf. What Pat did with the cat was hold it[7]

Quotative constructions may also be less restricted than ordinary transitive constructions. The reporting clause may occur parenthetically, interrupting the quoted material, which is not possible with other verbs:

(10)a. “I’ll call you” Pat said “and I hope you answer”
       cf. ?The cat Pat held and a book
    b. “I” Pat said “will call you and I hope you answer”
       cf. *The Pat held cat and a book[7]

Another issue is that manner-of-speaking verbs are not always obligatorily transitive. Verbs like think, laugh, scream, yell, whisper may be intransitive.[7]

A different model has been proposed, which does not rely on transitivity, but rather on an asymmetrical construction containing a reporting clause (head) and an independent reported clause.[7] Note that the asymmetry arises from the fact that the reporting clause is dependent on the quoted content for grammaticality, while the reverse is not true.

(11)a. *I said
    b. “I love you”

In this model, the dependent clause has a site of elaboration (e-site) which is filled by the independent clause:

 HEAD[Pat said      e-site]   COMPLEMENT[“I’ll call you”]
 HEAD[Pat thought   e-site]   COMPLEMENT[“I’ll call you”]
 HEAD[Pat was like  e-site]   COMPLEMENT[“I’ll call you”][7]
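
The head/e-site relationship above can be pictured as a small data structure. The following is a minimal sketch under stated assumptions: the class name ReportingClause and its method elaborate are hypothetical labels chosen for illustration, and the code models only the dependency (a head needs a reported clause to be complete), not any real syntactic theory.

```python
from dataclasses import dataclass


@dataclass
class ReportingClause:
    """Dependent head clause such as 'Pat said'; it carries an open e-site."""

    text: str

    def elaborate(self, reported: str) -> str:
        # The independent reported clause fills the head's e-site.
        if not reported:
            raise ValueError("the e-site of a reporting clause must be filled")
        return f'{self.text} "{reported}"'


# The three heads from the schema above, each elaborated by the same quote.
for head in ("Pat said", "Pat thought", "Pat was like"):
    print(ReportingClause(head).elaborate("I'll call you"))
```

The reported clause is a plain string that stands on its own, while the head raises an error if its e-site is left empty, which mirrors the asymmetry shown in (11).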

Direct/Indirect Quotation

Direct quotation is reported from the perspective of the experiencer:

(12) He said “I am leaving now”[3]

However, indirect quotation is often paraphrased, and reported by a narrator from the perspective of the reporter. Verbs like ask and tell are frequently associated with indirect speech. English indirect quotation also shows a sequence-of-tense effect: a past tense reporting verb requires a "back-shift" in verb tense within the indirect quote itself.[3]

(13)a. He said "I am leaving now"
    b. He said (that) he was leaving immediately[3]

Indirect quotation is, in theory, syntactically constrained and requires that the quoted content form a subordinate clause under the CP node.[3] However, what is seen in speech does not necessarily conform to theory. The complementizer that, though considered to be a marker of indirect quotation, is not obligatory and is often omitted. Further, it can (and does) occur with direct quotes in some dialects of English (e.g. Hong Kong, Indian).[7]

Verbs of speaking often employ the Conversational Historical Present tense, whereby actions in the past are referred to with present-tense morphology. This is considered to add immediacy or authority to the discourse.[3] However, it also illustrates the difficulty in differentiating direct and indirect quotation.[3]

(14) So uh ... this lady says ... uh this uh Bert (‘)His son’ll make them. He’s an electrician(‘)[3]

Inverted Constructions

Sentences with verba dicendi for direct quotation may use the somewhat antiquated verb-second (V2) order of English syntax. Inversion of this type with verbs of speaking or thinking frequently occurs in written English, though rarely in spoken English. It is also possible to front the quotation without changing subject-verb order. Neither pattern is possible with regular English transitives:

(15)a. “No no no” said Harry
    b. “You’re not drunk” she says 
    cf. *The cat held Pat (where Pat did the holding)[7]

There are several restrictions, however. For example, quantifiers may occur to the right of the subject in a non-inverted quotative sentence, but not in an inverted sentence. They can, however, occur to the immediate left of the subject in an inverted sentence:

(16)a. “We must do this again”, the guests all declared to Tony
    b. “We must do this again”, declared all the guests to Tony
    c. *“We must do this again”, declared the guests all to Tony[5]

A fronted quotation and negation may co-occur only if the subject and verb of the reporting clause are not themselves inverted:

(17)a. “Let’s eat”, said John just once
    b. “Let’s eat”, John didn’t just say once
    c. *“Let’s eat”, said not John just once
    d. *“Let’s eat”, not said John just once[5]

Other constraints involve subject position, DP direct objects, and movement, among others.

Grammaticalization


Grammaticalization is the attribution of grammatical character to a previously independent, autonomous word.[7] There is significant cross-linguistic evidence of verba dicendi grammaticalizing into functional syntactic categories. For instance, in some African and Asian languages, these verbs may grammaticalize into a complementizer.[7] In other East African languages, they may become markers of Tense-Aspect-Mood (TAM).[7] In English, the verb say in particular has also developed the function of a comment clause:

(18)a. Say there actually were vultures on his tail
    b. What say he does answer?
    c. Buy a big bottle – say about 250 mils 
    d. If we ran out of flour or sugar, say, we would gather up a few eggs and take them to Mr. Nichols’s general store
    e. “Say, isn’t that–” Lance started, but Buck answered before the question was even asked
    f. “Say, that’s our City,” bubbles Dolores
    g. Jump, I say, and be done with it[8]

In these examples, the verb say fulfils many roles. In the first two examples (a & b), it means 'suppose' or 'assume.' In the third and fourth examples (c & d), the meaning of say could be paraphrased as 'for example' or 'approximately.' Example (e) uses say as an imperative introducing a question and connotes 'tell me/us.' Say may also function as an interjection, as in (f), to either focus attention on the speaker or to convey some emotional state such as surprise, regret, disbelief, etc.[8] Finally, example (g) uses say in an emphatic, often imperative way. This function dates from the (early) Middle English period.[7]

Emergence of Innovative Forms: go, be all, be like

In addition to basic verba dicendi and manner of speaking verbs, other forms are frequently used in spoken English. What sets these apart is that they are not, semantically speaking, reporting verbs at all. Such forms include be like, be all, and go.

(19)a. Pat was like “I’ll call you.”[7]
    b. [...]and then my sister’s all “excuse me would you mind if I gave you, if I want your autograph” and she’s like “oh sure, no problem.” 
    c. And he goes “yeah” and looks and you can tell maybe he thinks he’s got the wrong address[…][9]

These forms, particularly be like, have captured the attention of much linguistic study and documentation. Some research has addressed the syntax of these forms in quotation, which is highly problematic. For example, a verbum dicendi like say may refer to a previously quoted clause with it. However, this is not possible with these innovative forms:

(20)a. “I don’t know if he heard it, but I know I definitely said it”
    b. *I’m like it
    c. *She was all it
    d. *I went it[10]

Notice that these forms also don’t behave as basic verba dicendi in many other ways. Passivization, for example, produces ungrammatical forms like:

(21)a. *That’s nice was gone by me 
    b. *Um, yah, I know, but there’s going to be wine there was been all by her[10]

They also can’t participate in inverted constructions like other manner-of-speaking verbs:

(22)a. *“Go home”, he was like
    b. *“I’m leaving”, was all John

Several other issues concerning these forms are the topic of much current study in syntax and sociolinguistics, including their diachrony, or change in use over time. Go, for example, dates as far back as the eighteenth century, in contexts like go bang, go crack, go crash, etc.[3]

Research on these forms has also associated be like in particular with young people. However, this assumption is questioned on the basis of more recent findings which suggest that it is used by older speakers as well.[9] It has also been largely attributed to female speakers, especially white females in California (Valley Girls). However, it is regularly used by speakers of both sexes, and in dialects of English outside of the US, including Canada and the UK.[10]

Verbum dicendi in Japanese

Verba dicendi and Syntactic construction

In Japanese, verba dicendi (発話行為動詞 [hatsuwa koui doushi] [11] 'speech act verb'), also referred to as verbs of communication[12] or verbs of saying,[13] include 言うiu/yuu 'say,' 聞くkiku 'ask,' 語るkataru 'relate,' 話す hanasu 'talk,' and 述べる noberu 'state.'[12][11]

Verba dicendi occur in the following construction: [_________] {と-to, て-tte} Verbum dicendi.[12]

と-to has been described as a complementizer and a quotative particle.[12][13] Historically, use of と-to was restricted to reporting a statement by another speaker, but it has a much wider distribution in modern Japanese.[12] In conversational Japanese, て-tte[14] is more frequently used, and it has been described as a quotative particle, a hearsay particle, a quotation marker, and a quotative complementizer.[15][16] In the above construction, the underlined phrase headed by {と-to, て-tte} can be a word, a clause, a sentence, or an onomatopoetic expression.[12]

As in English, verba dicendi in Japanese can introduce both direct and indirect speech as their complement.[13] This contrasts with verbs of thinking, which only introduce indirect speech. An exception is the verb of thinking 思う omou 'think,' which, in the simple past tense, can introduce as direct speech something that is not uttered but takes place in one's mind; this use of 思う omou 'think' is called a "quasi-communicative act."[13]

Direct and indirect speech: Ambiguity

In Japanese, a complement of verbum dicendi can be ambiguous between direct and indirect readings, meaning that the distinction can only be inferred from the discourse context.[13] For example, in (1), [boku ga Tookyoo e iku], which is in the complement of the verb 言う iu (past tense: itta), can be interpreted as either direct speech (2)a or indirect speech (2)b.

(1) 太郎は僕が東京へ行くと言った。

Taroo wa [boku ga Tookyoo e ik-u] to it-ta[13]

Taro TOP I(MALE) NOM Tokyo to go-PRS QUOT say-PST.

(2)a. Taro_i said, "I_i will go to Tokyo."

(2)b. Taro_i said that I_j would go to Tokyo.

Phrase structure representation of (1)

One reason for this direct-indirect ambiguity in Japanese is that Japanese indirect speech does not involve the "backshifting of tense"[13] observed in other languages, including English. In the English translations, the embedded clause of the direct speech in (2)a contains will, but in (2)b it has been "backshifted" to would so that it matches the past tense of the matrix clause. Tense therefore does not serve as a diagnostic of the direct-indirect distinction in Japanese.[17][12]

Another reason for the ambiguity is that both direct and indirect quotes are introduced by {と-to, て-tte} in Japanese. Hence, presence of an overt complementizer cannot disambiguate direct and indirect speech either.[13]

Diagnostics of direct speech in Japanese

Disambiguation of direct and indirect speech in Japanese depends on switches in deictic expressions and expressions of "speaker-addressee relationship."[12] One language-specific diagnostic of direct speech is the class of so-called "addressee-oriented expressions,"[16] which trigger a presupposition that there is an addressee in the discourse context. Some examples are listed below:

sentence final particles: さ -sa 'let me tell you'; ね -ne 'you know'; よ -yo 'I tell you'; わ -wa 'I want you to know'

imperative forms: 「走れ!」hashire 'Run!’

polite verbs/polite auxiliary verbs: です desu; ございます gozaimasu; ますmasu[16][12]

For example, in (3), [Ame da yo] in the complement of the verb 言う iu (past tense: itta) is unambiguously interpreted as direct speech because of the sentence final particle よ -yo 'I tell you.'

(3) 太郎は花子に「雨だよ」と言った。

Taro wa Hanako ni [Ame da yo] to it-ta[13]

Taro TOP Hanako DAT Rain COP yo QUOT say-PST

'Taro said to Hanako, "It is raining, I tell you."'

Similarly, in (4), [Ame desu] in the complement of the verb 言う iu (past tense: itta) is unambiguously interpreted as direct speech because of the polite verb です desu.

(4) 太郎は花子に「雨です」と言った。

Taro wa Hanako ni [Ame desu] to it-ta[13]

Taro TOP Hanako DAT Rain desu QUOT say-PST

'Taro said to Hanako politely, "It is raining."'

Diagnostics of indirect speech in Japanese

One diagnostic of indirect speech in Japanese is the presence of the reflexive pronoun 自分 zibun 'self.' It is a gender-neutral pronoun that uniformly refers to "the private self,"[13] or an agent of thinking, as opposed to "the public self," an agent of communicating, expressed by various personal pronouns (e.g. 僕 boku 'I(MALE)'), occupational roles (e.g. 先生 sensei 'teacher'), and kinship terms (e.g. お母さん okaasan 'mother').[13]

For example, in (5), [zibun ga Tookyoo e iku] in the complement of the verb 言う iu (past tense: itta) is unambiguously interpreted as indirect speech because of the presence of 自分 zibun 'self,' which is co-referential with 太郎 Taro.

(5) 太郎は自分が東京へ行くと言った。

Taroo_i wa [zibun_i ga Tookyoo e iku] to it-ta[13]

Taro TOP self NOM Tokyo to go QUOT say-PST.

'Taro said that he would go to Tokyo.'

Note that (5) only differs from (1) in the subject of the embedded clause; (5) has 自分 zibun 'self' and (1) has 僕 boku 'I(MALE).'
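
The diagnostics of the last two sections can be summarized as a simple cue check over a romanized embedded clause. This is only a sketch under stated assumptions: the function name classify and the cue sets are hypothetical, the input must already be split into space-separated romanized words, and imperative forms are left out because they cannot be detected by a word list.

```python
# Cue lists copied from the romanized examples in the text.
# classify() is a hypothetical helper, not an established tool.
FINAL_PARTICLES = {"sa", "ne", "yo", "wa"}   # addressee-oriented particles
POLITE_FORMS = {"desu", "gozaimasu", "masu"}  # polite (auxiliary) verbs


def classify(embedded: str) -> str:
    """Label an embedded clause as direct, indirect, mixed, or ambiguous."""
    tokens = embedded.lower().split()
    if not tokens:
        return "ambiguous"
    last = tokens[-1]
    # Direct-speech cues are checked on the clause-final word only.
    direct = last in FINAL_PARTICLES or last in POLITE_FORMS or last.endswith("masu")
    # The reflexive zibun anywhere in the clause is the indirect-speech cue.
    indirect = "zibun" in tokens
    if direct and indirect:
        return "mixed"
    if direct:
        return "direct"
    if indirect:
        return "indirect"
    return "ambiguous"


for clause in ("Ame da yo", "Ame desu", "zibun ga Tookyoo e iku", "boku ga Tookyoo e iku"):
    print(clause, "->", classify(clause))
```

The last clause, taken from (1), carries neither kind of cue, so the sketch reports it as ambiguous, in line with the discussion of (1) above.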

Simultaneously direct and indirect speech

It has been reported that some sentences in Japanese have characteristics of both direct and indirect modes simultaneously.[1] This phenomenon is called "semi-indirect mode" or "quasi-direct mode." It is also observed in reported speech, and Kuno (1988) has termed it "blended discourse."[18][1] Example (6) illustrates blended discourse.

(6) 太郎は奴のうちに何時に来いと言ったのか。

Taro_i wa [yatu_i-no uti-ni nanzi-ni ko-i] to it-ta no ka?[1]

Taro TOP [he-GEN house-DAT what.time-DAT come-IMP] QUOT say-PST Q Q

'What time did Taro_i say, [come to his_i house ______]?'

[yatu-no uti-ni nanzi-ni ko-i] in the complement of the verb 言う iu (past tense: itta) appears to be direct speech because it has an imperative verb form 来い ko-i 'Come!'.

On the other hand, the third person pronoun 奴 yatu 'he' inside the embedded clause is co-referential with the matrix subject 太郎 Taro.[1] This means that this deictic expression inside the embedded clause is interpreted in the context in which the whole sentence (6) is uttered; cross-linguistically, this is considered to be a property of indirect speech.

Moreover, the wh-phrase 何時 nanzi 'what time' inside the embedded clause takes matrix scope, meaning that it interacts with the matrix clause to influence the meaning of the whole sentence.[1] This sentence means that, for example, 太郎 Taro had said "Come to my house at ten o'clock!," and the utterer of (6), not knowing the content of "ten o'clock," is requesting this information. The availability of this meaning is an indication of indirect speech, because if the embedded clause were direct speech, it would be syntactically opaque.


