= Scrambling (linguistics) =

Scrambling is a syntactic phenomenon wherein sentences can be formulated using a variety of different word orders without a substantial change in meaning. Instead, the reordering of words from their canonical position affects their contribution to the discourse (i.e., the information's "newness" to the conversation). Scrambling does not occur in English, but it is frequent in languages with freer word order, such as German, Russian, Persian and the Turkic languages. The term was coined by John R. "Haj" Ross in his 1967 dissertation and is widely used in current work, particularly within the generative tradition.

== Analysis ==

=== Discourse ===

Although scrambling does not change the semantic interpretation ("meaning") of the sentence, scrambled configurations are used in particular discourse contexts. Discourse is the underlying information that contextualizes a conversation; a speaker who adds to the discourse references both "old" and "new" information. Within syntax, these can be structurally represented through topic (TopP) and focus (FocP) phrases. Topic is the pre-established context of the discourse, whereas focus is the "new" or emphasized information being highlighted.

=== Tree Structure (Movement Approach) ===
These additional phrasal categories occur between the complementizer phrase (CP) and the tense phrase (TP, previously referred to as the inflectional phrase, "IP"). Both TopP and FocP have empty specifier positions that can house the scrambled XPs (phrases). When discourse phrasal categories are used, what would typically be notated as "CP" becomes ForceP, which specifies the clause type (i.e., declarative, interrogative, etc.).

The empty specifier positions (i.e., 'XP') in Fig. 1 provide a landing site for phrases to move to. These sites are where phrases will be scrambled under the A-movement approach, which claims that scrambled words move to clause-initial position (see Theoretical Analysis: Base-generation vs. Movement). This movement has been proposed to be driven by the structural constraint 'EPP' (extended projection principle), which selects for a phrasal category as its specifier. Unlike the EPP:D feature in 'DP movement' into subject position, it functions similarly to EPP in wh-movement and topicalization, where it selects for a phrasal category that the language allows to move into 'topic' and 'focus' position (often determiner phrases 'DP' or prepositional phrases 'PP').

=== Case Marking ===
Scrambling is most common in morphologically rich languages with overt case markers, which help to keep track of how entities relate to a verb. For example, the Japanese suffix [-ga] is a nominative marker, which means that the entity is the subject of the verb, and [-o] is an accusative marker that signals the object of a verb. This allows speakers to know that [Kēkio Maryga taberu] means "Mary eats cake" and not "Cake eats Mary".
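The role of case markers can be illustrated with a small sketch. It is a toy model, not a parser: the hyphenated `stem-suffix` spelling, the `CASE_ROLES` table, and the `assign_roles` function are assumptions introduced here for illustration, and only the two markers mentioned above are covered.

```python
# Toy mapping from Japanese case suffixes to grammatical roles.
# Only the two markers discussed above are covered.
CASE_ROLES = {"ga": "subject", "o": "object"}


def assign_roles(words):
    """Map each case-marked word (written stem-suffix) to its role.

    Words without a recognised suffix, such as the verb, are labelled
    "verb" (a simplifying assumption for this toy example).
    """
    roles = {}
    for word in words:
        stem, _, suffix = word.rpartition("-")
        if stem and suffix in CASE_ROLES:
            roles[CASE_ROLES[suffix]] = stem
        else:
            roles["verb"] = word
    return roles


canonical = assign_roles(["Mary-ga", "kēki-o", "taberu"])
scrambled = assign_roles(["kēki-o", "Mary-ga", "taberu"])
print(canonical == scrambled)
```

Because the roles are read off the suffixes rather than the positions, both orders yield the same subject and object.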

== Examples ==

=== Examples from Japanese (Short Distance) ===
Japanese is an SOV language with extensive scrambling due to its robust case-marking system (e.g., が -ga for nominative case, subject marker; を -o for accusative case, direct object marker; に -ni for dative case, indirect object or location marker). Japanese relies heavily on case markers to determine the roles of constituents, allowing word order to be flexible without ambiguity. Scrambling has no syntactic or semantic penalty because the parser immediately uses case markers to interpret arguments. It is often driven by the need to emphasize certain constituents or highlight new information.

The following example from Japanese illustrates a transitive example of short distance scrambling (i.e., to clause initial position).

| 1(a). Canonical Sentence | | |
| Mary-ga | kēki-o | taberu |
| Mary-NOM | cake-ACC | eats |
| 'Mary eats cake.' | | |

In this default sentence, the subject ('Mary') and object (kēki, 'cake') retain their typical order. There is no specific emphasis or focus with the default sentence structure.

| 1(b). Scrambled Sentence | | |
| kēki-o | Mary-ga | taberu |
| cake-ACC | Mary-NOM | eats |
| 'Mary eats cake.' | | |

The object (kēki, "cake") is fronted, highlighting the subject ("Mary") as new information or the focus.

As the table shows, although the order of the agent (i.e., the one who carries out the action) 'Mary' and the patient (i.e., the thing being acted upon) "cake" changes, there is no difference in core meaning. Instead, what changes is the information being emphasized. In 1(a), there is either no focus (a statement), or the "cake" is being emphasized, whereas in 1(b), Mary is the focus.

A speaker would generate the sentence in 1(b) when they wanted to emphasize Mary, or when providing new information. For example, if someone asked "who is eating the cake?", the response could be "Cake_{TOP} MARY_{FOC} eats". The canonical sentence "Mary cake eats" would still be an appropriate answer, but the scrambled configuration emphasizes the "new" information.
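The question-and-answer scenario above can be sketched as a toy ordering rule. The function `order_for_discourse` and its parameters are hypothetical names introduced here; the rule (given material first as topic, new material next as focus, verb last) is a deliberate simplification of the description above and not a claim about how speakers actually compute word order.

```python
def order_for_discourse(arguments, verb, given):
    """Place given (topic) arguments before new (focus) ones; the verb stays final.

    `given` is the set of arguments already established in the discourse.
    Focused arguments are upper-cased to mimic the emphasis notation above.
    """
    topics = [a for a in arguments if a in given]
    foci = [a for a in arguments if a not in given]
    labelled = [f"{a}_TOP" for a in topics] + [f"{a.upper()}_FOC" for a in foci]
    return labelled + [verb]


# "Who is eating the cake?" establishes the cake as given information.
answer = order_for_discourse(["Mary", "cake"], "eats", given={"cake"})
print(" ".join(answer))
```

The sketch always produces the scrambled answer; as noted above, the canonical order would also be an acceptable reply.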

Studies (Yamashita, 1997) show that scrambled sentences in Japanese do not impose a processing penalty, unlike in some other languages. Case markers enable the parser to immediately assign syntactic roles, reducing reliance on word order. At early parsing stages, Japanese exhibits non-configurationality. This indicates that (1) arguments are not strictly hierarchically ordered, and (2) constituents attach directly to the clause, with case markers providing the necessary cues for syntactic and semantic interpretation.

=== Examples from German ===

==== Ditransitive Embedded Clause ====
The following examples from German illustrate typical instances of scrambling:

| a. | dass | der_{NOM} Mann | der_{DAT} Frau | die_{ACC} Bohnen | gab |
| b. | dass | der_{NOM} Mann | die_{ACC} Bohnen | der_{DAT} Frau | gab |
| c. | dass | der_{DAT} Frau | der_{NOM} Mann | die_{ACC} Bohnen | gab |
| d. | dass | der_{DAT} Frau | die_{ACC} Bohnen | der_{NOM} Mann | gab |
| e. | dass | die_{ACC} Bohnen | der_{NOM} Mann | der_{DAT} Frau | gab |
| f. | dass | die_{ACC} Bohnen | der_{DAT} Frau | der_{NOM} Mann | gab |

These examples illustrate scrambling in the midfield of a subordinate clause in German. The 'midfield' is a position within the sentence structure, with 'frontfield' and 'endfield' acting like bookends for the sentence (usually C-head/subject and V/object). The midfield is where we typically see scrambling occur in freer word order languages.

All six clauses are acceptable; the order that actually appears is determined by pragmatic considerations such as emphasis (i.e., Focus and Topic). The canonical word order is usually taken to be the one that native speakers accept as the most natural in a context in which none of the referents are known. If one takes the first clause (clause a) as the canonical order, then scrambling has occurred in clauses b–f. The three constituents (DPs) <u>der Mann</u>, <u>der Frau</u>, and <u>die Bohnen</u> have been scrambled to different positions depending on the discourse, with the second pronounced entity (DP) being the focus of the phrase (see "Discourse").
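The count of six follows from the three midfield constituents being freely ordered: three items have 3! = 6 permutations. A minimal sketch enumerates them; the variable names are introduced here, and treating the nominative-dative-accusative order as clause a is the assumption stated above.

```python
from itertools import permutations

# The three midfield constituents: nominative, dative, accusative.
constituents = ["der Mann", "der Frau", "die Bohnen"]

# Keep the subordinator and the clause-final verb fixed and permute
# only the midfield.
clauses = [
    " ".join(("dass",) + order + ("gab",))
    for order in permutations(constituents)
]

for clause in clauses:
    print(clause)
```

The first clause printed keeps the input order; the remaining five are the scrambled variants. The enumeration says nothing about which order is pragmatically appropriate in a given context.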

==== Definite vs. Indefinite Pronouns ====
There is a clear tendency for definite pronouns to appear to the left in the midfield. In this regard, definite pronouns are frequent candidates to undergo scrambling:

The canonical position of the object in German is to the right of the subject (i.e., SOV). In this regard, the object pronouns mich in the first example and uns in the second example have been scrambled to the left, so that the clauses now have OS (object-subject) order. The second example is unlike the first example as it necessitates an analysis in terms of a discontinuity, due to the presence of the auxiliary verb wird 'will'.

==== Non-midfield Scrambling or Topicalization? ====
Scrambling can be an ambiguous term, and identifying word movement that fits cleanly into it can be difficult. Standard instances of scrambling in German occur in the midfield, as stated above. There are, however, many non-canonical orderings, whose displaced constituents do not appear in the midfield. One can argue that such examples also involve scrambling:

The past participle erwähnt has been topicalized in this sentence, but its object, the pronoun das, appears on the other side of the finite verb. There is no midfield involved in this case, which means the non-canonical position in which das appears in relation to its governor erwähnt cannot be addressed in terms of midfield scrambling. The position of das also cannot be addressed in terms of extraposition, since extraposed constituents are relatively heavy, much heavier than das, which is a very light definite pronoun. Given these facts, one can argue that a scrambling discontinuity is present. The definite pronoun das has been scrambled rightward out from under its governor erwähnt. Hence, the example suggests that the scrambling mechanism is quite flexible.

On the other hand, while German's ability to scramble verbs is debatable, it does allow verb-head topicalization. As seen in Fig. 3, it can reasonably be argued that this sentence does not involve scrambling as we currently understand it. In this tree, the verb [erwähnt] moves to the topic position, the auxiliary [hat] moves to the C head, and [er] moves into subject position. This changes the underlying sentence [Er hat das nicht erwähnt] into [Erwähnt hat er das nicht ___] without any scrambling involving TopicP and FocusP. This highlights how difficult it is for syntacticians to tease apart which processes actually account for changes to canonical word order. These processes may operate completely independently, or could be aspects of the same syntactic processes that may one day be unified into a single theory.

- Note: Variations in the sentence [Er hat das nicht erwähnt] (e.g., [Das er hat das nicht erwähnt]) could be examples of A-movement scrambling

==== Similarities to Extraposition ====
Scrambling is like extraposition (but unlike topicalization and wh-fronting) in one relevant respect: it is clause-bound. That is, a constituent cannot always be scrambled out of one clause into another:

Grammatical Sentence

Ungrammatical Sentence

The first example has canonical word order. The second example illustrates how the sentence becomes strongly unacceptable when the definite pronoun [das] is scrambled out of the embedded clause into the main clause. Extraposition is similar: when one attempts to extrapose a constituent out of one clause into another, the result is unacceptable. However, there are cases where words can be scrambled outside of a clause ("long distance scrambling"), although this form of scrambling is governed by additional rules compared to clause-initial ("short distance") scrambling.

=== Example from Persian ===
Scrambling in Persian plays a significant role in organizing information structure, especially in topicalization and contrastive focus.

==== Topicalization ====
In Persian, topics are typically specific and frequently marked by the specificity particle -râ. The topicalized element scrambles to a higher syntactic position, such as Spec-CP or Spec-IP.

- Canonical SOV Word Order

- Scrambled (Topicalized) Order

In the scrambled example, the object in ketâb-ro ("this book") is topicalized. This suggests that "this book" refers to the old or background information that is already familiar to both the speaker and the listener.

==== Contrastive Focus ====
Contrastive focus highlights the elements of the sentence that are new, emphasized, or contrasted. Focused elements scramble to the left, typically into the Spec-CP or Spec-IP position. They also receive phonological stress for emphasis.
- Canonical SOV Word Order

- Scrambled (Topicalized) Order

The object YE ketâb ("A BOOK") is scrambled to the left-most position. It is also marked with stress, which indicates that it is contrastively focused. This implies that Kimea specifically bought a book rather than something else.

=== Example from Czech ===
Czech is an SVO language with free word order, made possible by its rich case-marking system. Czech's lack of overt articles or fixed positions for noun phrases allows for flexible word order. Despite flexibility, scrambling is constrained by information structure (focus and background) and specificity.

In Czech, scrambling plays a critical role in keeping syntactic structure consistent with information structure and semantic interpretation. When a constituent remains in situ within the vP phase, it can have two interpretations (or readings): an existential reading or a specific reading. An existential reading refers to an indefinite, unknown entity such as "some dog", while a specific reading refers to a known or presupposed entity, such as "her dog". Because the vP phase is the domain of focus, it is where new or unknown information is introduced, which gives rise to this flexibility. However, when a constituent is scrambled to the CP phase (typically to the left periphery of the sentence), it can only have a specific interpretation, because scrambling moves the element into the domain of backgrounded or presupposed information.

==== Drivers of Scrambling ====
Scrambling in Czech is driven by a specificity feature with an EPP (Extended Projection Principle) property. The specificity feature means that only elements that are specific (referential and presupposed) are eligible for scrambling. The EPP property ensures that these specific constituents are moved overtly to the left periphery (CP phase).

==== Semantic Interface ====
Biskup (2006) proposes a theory of scrambling based on the unification of Diesing (1992) and Chomsky's phase model (2000). He argues that the vP phase corresponds to the nuclear scope, and the CP phase corresponds to the restrictive clause. In semantic terms:

- Constituents within vP (in situ) are interpreted in the nuclear scope, allowing existential or non-specific readings.
- Constituents moved to CP (scrambled) are mapped into the restrictive clause, requiring specific interpretations.

Scrambling creates a division between elements that remain part of the predication (new information) and elements that are part of the domain of quantification (background or old information). Specific elements are constrained to restrictive clauses after scrambling.
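The mapping just described can be written down as a small lookup table. This is only a restatement of the proposal in data form; the names `READINGS_BY_PHASE` and `available_readings` are introduced here for illustration and do not come from Biskup (2006).

```python
# Readings available to a constituent, keyed by the phase it occupies.
# vP corresponds to the nuclear scope, CP to the restrictive clause.
READINGS_BY_PHASE = {
    "vP": {"existential", "specific"},
    "CP": {"specific"},
}


def available_readings(phase):
    """Return the set of readings licensed in the given phase."""
    return READINGS_BY_PHASE[phase]


in_situ = available_readings("vP")
scrambled = available_readings("CP")

# Scrambling removes the existential reading and adds nothing new.
print(sorted(in_situ - scrambled))
```

On this view, scrambling narrows the set of readings: the scrambled position licenses a strict subset of what is available in situ.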

==== Scrambling Example in Czech ====

===== In Situ (vP Phase) =====

The direct object psa ("dog") remains in its base position within the vP phase, making it part of the focus domain. It is syntactically mapped to the nuclear scope.

====== Semantic Interpretations ======

1. Existential Reading: 'psa' can mean "some dog". It can refer to an indefinite or unknown referent. As the object is not linked to a restrictive clause, it retains flexibility in interpretation.
2. Specific Reading: 'psa' can also refer to a specific, known entity, such as "her dog." Specific readings in situ are enabled through covert operations like quantifier raising (QR).

===== Scrambled (CP Phase) =====

The object psa is scrambled to the CP phase (left periphery), which makes it part of the restrictive clause. Its syntactic position outside the vP phase ensures that it no longer contributes to the focus domain.

====== Semantic Interpretations ======

1. Exclusion of Existential Reading: After scrambling, psa cannot mean "some dog," as existential readings are restricted to elements in the nuclear scope (vP phase).
2. Specific Reading: psa must refer to a specific referent, such as "her dog." The restrictive clause excludes existential or non-specific interpretations.

== Scrambling within a Constituent ==
Classical Latin and Ancient Greek were known for a more extreme type of scrambling known as hyperbaton, defined as a "violent displacement of words". This involves the scrambling (extraposition) of individual words out of their syntactic constituents. Perhaps the most well-known example is magnā cum laude "with great praise" (lit. "great with praise"). This was possible in Latin and Greek because of case-marking: For example, both magnā and laude are in the ablative case.

Hyperbaton is found in a number of prose writers, e.g. Cicero:

Hic <u>optimus</u> illīs temporibus est <u>patrōnus</u> habitus
he best in those times is lawyer considered
'He was considered the best lawyer in those times.'

Much more extreme hyperbaton occurred in poetry, often with criss-crossing constituents. An example from Ovid is

Grandia per multōs tenuantur flūmina rīvōs.
great into many are channeled rivers brooks.
'Great rivers are channeled into many brooks.'

An interlinear gloss is as follows:

The two nouns (subject and object) are placed side-by-side, with both corresponding adjectives extraposed on the opposite side of the verb, in a non-embedding fashion.
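How case, number and gender marking lets a reader reassemble the separated phrases can be sketched by grouping words that share the same features. The feature tags below are hand annotations added here for illustration (they are not part of the original example), and `group_by_agreement` is a hypothetical helper; the preposition and verb are left out because they carry no case marking.

```python
from collections import defaultdict

# (word, case, number, gender), in the order the words appear in the line.
OVID = [
    ("grandia", "nom", "pl", "neut"),
    ("multōs", "acc", "pl", "masc"),
    ("flūmina", "nom", "pl", "neut"),
    ("rīvōs", "acc", "pl", "masc"),
]


def group_by_agreement(tagged_words):
    """Group words that share case, number and gender, keeping line order."""
    groups = defaultdict(list)
    for word, *features in tagged_words:
        groups[tuple(features)].append(word)
    return dict(groups)


phrases = group_by_agreement(OVID)
for features, words in phrases.items():
    print(features, words)
```

Each adjective lands in the same group as the noun it modifies, even though the two are not adjacent in the line.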

Even more extreme cases are noted in the poetry of Horace, e.g.

Glossed interlinearly, the lines are as follows:

Because of the case, gender and number marking on the various nouns, adjectives and determiners, a careful reader can connect the discontinuous and interlocking phrases Quis ... gracilis ... puer, multā ... in rōsā, liquidīs ... odōribus in a way that would be impossible in English.

== Theoretical Analysis ==

=== Base-generation vs. Movement ===
Thus far, scrambling has mainly been discussed as a type of movement. However, whether scrambling is a result of movement or base generation is one of the great dividers among researchers, and the answer is still unclear. Many syntacticians claim that a combination of both approaches is the answer, while others maintain it is either one or the other. Base-generation theorizes that scrambled words are generated directly in their surface positions at the base of the tree, rather than moved or transformed. Many supporters of this theory defend their stance with the argument that there is little solid evidence as to what the trigger for any supposed movement is, although many theorize that an EPP feature of some variety could be the answer. Besides examining what the trigger for scrambling may be, authors also look at the locus. Some claim that base-generation fares better here also, as the different orders that constituents may show are supposedly dependent on a base generation operation "merge". When looking at where scrambling occurs from this point of view, merge leaves fewer questions to be answered than movement theories do.

As seems to be the case with scrambling, neither movement nor base-generation theories are perfect. This is why many authors concede that there is some combination of both operations going on at the very least, although many take a strong stance on either side. At present, the general consensus is that what exactly is going on in scrambling is still unknown, but that movement is the dominant theory with some possible "enrichment" from base-generation operations.

=== Clause-boundedness ===
John 'Haj' Ross, who was the first to begin formulating research on scrambling, made the initial suggestion that this was a clause-bound operation. This was largely dependent on German data, in which scrambling a word outside of its clause often resulted in ungrammaticality.

However, further research into other languages such as Hindi, Japanese, Persian and Korean revealed that Ross' initial assumption was incorrect, as scrambling in these languages is not limited to short-distance, clause-internal movement. This was the first indication that scrambling is not a uniform operation across all languages, and its varying degrees of movement or word order change are heavily language-dependent. In sum, this is not a phenomenon that is solely limited to within-clause boundaries, although the criteria for its limits change from language to language.

=== Configurational vs Non-Configurational ===
Configurational languages are known for their rigid hierarchical phrase structures. It was initially suggested that scrambling only occurs in non-configurational languages, or languages with flat clause structures. This assumption held for a number of years, until it was revealed that Japanese, a language that often makes use of scrambling, is not non-configurational at all. Furthermore, other scrambling languages such as Persian also don't fall under the non-configurational category. See below for an elaboration on configurational vs. non-configurational structures and what this means for scrambling.
Comparison: Configurational vs. Non-Configurational

| Aspect | Configurational Structure | Non-Configurational |
| Structure | Hierarchical (tree-like syntax) | Flat (arguments attach directly to clause) |
| Role of Word Order | Reflects syntactic roles (e.g., subject, object) | Flexible; case markers define roles |
| Movement | Triggered by syntactic/pragmatic features | Rare or unnecessary due to flat structure |
| Scrambling | Adds hierarchical complexity (e.g., vP → CP) | No additional complexity; roles are marker-driven |
| Sample Language | Czech (configurational) | Japanese (non-configurational) |

=== Scrambling as Shifting ===
The theoretical analysis of scrambling can vary a lot depending on the theory of sentence structure that one adopts. Constituency-based theories (phrase structure theories) that prefer strictly binary branching structures are likely to address most cases of scrambling in terms of movement (or copying) as shown in figs. 1–3. However, other theories of sentence structure, for instance those that allow n-ary branching structures (such as all dependency grammars), see many (but not all!) instances of scrambling as shifting. These two analyses are illustrated below. The first tree illustrates the movement analysis of the example above in a theory that assumes strictly binary branching structures. The German subordinate clause weil mich die anderen oft einladen is used, which translates as 'because the others often invite me':

The abbreviation "Sub" stands for "subordinator" (= subordinating conjunction), and "S" stands for "subordinator phrase" (= embedded clause). The tree on the left shows a discontinuity (= crossing lines) and the tree on the right illustrates how a movement analysis deals with the discontinuity. The pronoun mich is generated in a position immediately to the right of the subject; it then moves leftward to reach its surface position. The binary branching structures necessitate this analysis in terms of a discontinuity and movement.

A theory of syntax that rejects the subject-predicate division of traditional grammar (Sentence → NP+VP) and assumes relatively flat structures (that lack a finite VP constituent) will acknowledge no discontinuity in this example. Instead, a shifting analysis addresses many instances of scrambling. The following trees illustrate the shifting-type analysis in a dependency-based grammar. The clause from above is again used (weil mich die anderen oft einladen 'because the others often invite me'):

The tree on the left shows the object in its canonical position to the right of the subject, and the tree on the right shows the object in the derived position to the left of the subject. The important thing to acknowledge about the two trees is that there are no crossing lines. In other words, there is no discontinuity. The absence of a discontinuity is due to the flat structure assumed (which, again, lacks a finite VP constituent). The point, then, is that the relative flatness/layeredness of the structures that one assumes influences significantly the theoretical analysis of scrambling.

The example just examined can be, as just shown, accommodated without acknowledging a discontinuity (if a flat structure is assumed). There are many other cases of scrambling, however, where the analysis must acknowledge a discontinuity, almost regardless of whether relatively flat structures are assumed or not. This fact means that scrambling is generally acknowledged as one of the primary discontinuity types (in addition to topicalization, wh-fronting, and extraposition).
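In dependency terms, "no crossing lines" corresponds to what is usually called projectivity: every word lying between a head and its dependent must itself be dominated by that head. The sketch below checks this property for two clauses from this article. The head assignments are assumptions made here for illustration (the subordinator or finite verb as root, arguments and adverbs attached to the verb that governs them), and `is_projective` is a hypothetical helper, not an established tool.

```python
def is_projective(heads):
    """Return True if the dependency tree has no discontinuity.

    heads[i] is the index of word i's head, or -1 for the root.
    """
    def dominated_by(node, ancestor):
        # Walk up the tree from node until the root is passed.
        while node != -1:
            if node == ancestor:
                return True
            node = heads[node]
        return False

    for dep, head in enumerate(heads):
        if head == -1:
            continue
        lo, hi = sorted((dep, head))
        # Every word between head and dependent must hang under the head.
        if not all(dominated_by(k, head) for k in range(lo + 1, hi)):
            return False
    return True


# weil(0) mich(1) die(2) anderen(3) oft(4) einladen(5), flat analysis:
# mich, anderen and oft all depend directly on einladen.
WEIL_HEADS = [-1, 5, 3, 5, 5, 0]

# Erwähnt(0) hat(1) er(2) das(3) nicht(4): das depends on erwähnt,
# which is separated from it by the finite verb hat.
ERWAEHNT_HEADS = [1, -1, 1, 0, 1]

print(is_projective(WEIL_HEADS), is_projective(ERWAEHNT_HEADS))
```

Under these assumed structures, the flat analysis of the weil clause contains no discontinuity, whereas the topicalized erwähnt example does, in line with the discussion of that sentence in the German section.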

== Definitions ==

- canonical position: the default position of a constituent in a sentence (before scrambling)
- freer word order languages: languages that allow constituents to appear in several orders without a substantial change in meaning
- syntax tree: a representation of a sentence and its syntax/syntactic operations that takes on a tree-like structure
- case marker: a grammatical device that indicates the role of a phrase in the sentence (e.g. "ACC" = Accusative)
- embedded clause: a clause that is placed within another clause to add more information to a sentence
- extraposition: a syntactic mechanism that moves a constituent to the right of its usual position
