= Catalan language =

Catalan
- Altname: Valencian
- Nativename: català , valencià
- Pronunciation: [kətəˈla] (Eastern dialects) / [kataˈla] (Western dialects), [valensiˈa] (Valencian)
- States: Spain, Andorra, France, Italy
- Region: Southern Europe
- Speakers: L1: million
- Date: 2022
- Ref: e25
- Speakers2: L2: million, Total: million
- Speakers Label: Speakers
- Familycolor: Indo-European
- Fam2: Italic
- Fam3: Latino-Faliscan
- Fam4: Latin
- Fam5: Romance
- Fam6: Italo-Western
- Fam7: Western Romance
- Fam8: Gallo-Romance
- Fam9: Occitano-Romance
- Ancestor: Old Latin
- Ancestor2: Vulgar Latin
- Ancestor3: Proto-Romance
- Ancestor4: Old Occitan
- Ancestor5: Old Catalan
- Nation: Balearic Islands, Catalonia, Valencian Community (as Valencian), Alghero, Sardinia
- Minority: Northern Catalonia (Roussillon), part of Occitania, La Franja, part of the community of Aragon, Carche, part of the Region of Murcia (as Valencian)
- Agency: Institut d'Estudis Catalans (IEC) , Acadèmia Valenciana de la Llengua (AVL)
- Iso1: ca
- Iso2: cat
- Iso3: cat
- Lingua: 51-AAA-e
- Map: Catalan language in Europe (cropped).png
- Map2: Lang Status 80-VU.svg
- Notice: IPA
- Glotto: stan1289
- Glottorefname: Catalan

Catalan (català) is a Western Romance language and is the indigenous and official language of three autonomous communities in eastern Spain: Catalonia, the Balearic Islands, and the Valencian Community, where it is called Valencian (valencià). Catalan is also the sole official language of Andorra, has semi-official status in the Italian municipality of Alghero, and is spoken in the Pyrénées-Orientales department of France and in two further areas in eastern Spain: the eastern strip of Aragon and the Carche area in the Region of Murcia. The Catalan-speaking regions are often called the Catalan Countries (Països Catalans).

The language evolved from Vulgar Latin in the Middle Ages around the eastern Pyrenees. It became the language of the Principality of Catalonia and the kingdoms of Valencia and Mallorca, being present throughout the Mediterranean as the main language of the Crown of Aragon. It was replaced by Spanish as a language of government and literature in the 1700s, but 19th-century Spain saw a Catalan literary revival, culminating in the early 1900s. During the Francoist dictatorship (1939–1975), the use of Catalan was subject to repressive measures, before it entered a relatively successful process of re-normalization between the 1980s and the 2000s. However, during the 2010s, it showed signs of decline in social use, along with diglossia and a resurgence of discrimination cases.

== Etymology and pronunciation ==

The word Catalan is derived from the territorial name of Catalonia, itself of disputed etymology. The main theory suggests that Catalunya (Gathia Launia) derives from the name Gothia or Gauthia ('Land of the Goths'), since the origins of the Catalan counts, lords and people were found in the March of Gothia, whence Gothland > Gothlandia > Gothalania > Catalonia theoretically derived.

In English, the term referring to a person first appears in the mid-14th century as Catelaner, followed in the 15th century by Catellain (from Middle French). It has been attested as a language name since at least 1652. The word Catalan can be pronounced in English as /ˈkætələn, -æn/ (KAT-ə-lən, -lan) or /ˌkætəˈlæn/ (KAT-ə-LAN).

The endonym is pronounced [kətəˈla] in the Eastern Catalan dialects, and [kataˈla] in the Western dialects. In the Valencian Community and Carche, the term valencià [valensiˈa] is frequently used instead. Thus, the name "Valencian", although often employed for referring to the varieties specific to the Valencian Community and Carche, is also used by Valencians as a name for the language as a whole, synonymous with "Catalan". Both uses of the term have their respective entries in the dictionaries by the Acadèmia Valenciana de la Llengua (AVL) and the Institut d'Estudis Catalans (IEC). (See also status of Valencian below).

== History ==

=== Middle Ages ===

By the 9th century, Catalan had evolved from Vulgar Latin on both sides of the eastern end of the Pyrenees, as well as the territories of the Roman province of Hispania Tarraconensis to the south. From the 8th century onwards the Catalan counts extended their territory southwards and westwards at the expense of the Muslims, bringing their language with them. This process was given definitive impetus with the separation of the County of Barcelona from the Carolingian Empire in 988.

In the 11th century, documents written in macaronic Latin begin to show Catalan elements, with texts written almost completely in Romance appearing by 1080. Old Catalan shared many features with Gallo-Romance, diverging from Old Occitan between the 11th and 14th centuries.

During the 11th and 12th centuries the Catalan rulers expanded southward to the Ebro river, and in the 13th century they conquered the lands that would become the Kingdoms of Valencia and Majorca. The city of Alghero in Sardinia was repopulated with Catalan speakers in the 14th century. The language also reached Murcia, which became Spanish-speaking in the 15th century.

In the Late Middle Ages, Catalan went through a golden age, reaching a peak of maturity and cultural richness. Examples include the work of Majorcan polymath Ramon Llull (1232–1315), the Four Great Chronicles (13th–14th centuries), and the Valencian school of poetry culminating in Ausiàs March (1397–1459). By the 15th century, the city of Valencia had become the sociocultural center of the Crown of Aragon, and Catalan was present all over the Mediterranean world. During this period, the Royal Chancery propagated a highly standardized language. Catalan was widely used as an official language in Sicily until the 15th century, and in Sardinia until the 17th. At that time, the language was what Costa Carreras terms "one of the 'great languages' of medieval Europe".

Martorell's novel of chivalry Tirant lo Blanc (1490) shows a transition from Medieval to Renaissance values, something that can also be seen in Metge's work. The first book produced with movable type in the Iberian Peninsula was printed in Catalan.

=== Early modern era ===

==== Spain ====
With the union of the crowns of Castile and Aragon in 1479, the Spanish kings ruled over different kingdoms, each with its own cultural, linguistic and political particularities, and they had to swear by the laws of each territory before the respective parliaments. But after the War of the Spanish Succession, Spain became an absolute monarchy under Philip V, which led to the assimilation of the Crown of Aragon by the Crown of Castile through the Nueva Planta decrees, as a first step in the creation of the Spanish nation-state; as in other contemporary European states, this meant the imposition of the political and cultural characteristics of the dominant groups. Since the political unification of 1714, Spanish assimilation policies towards national minorities have been a constant.

The process of assimilation began with secret instructions to the corregidores of the Catalan territory: they "will take the utmost care to introduce the Castilian language, for which purpose he will give the most temperate and disguised measures so that the effect is achieved, without the care being noticed". From there, actions in the service of assimilation, discreet or aggressive, continued and reached the last detail, such as the Royal Certificate of 1799 forbidding anyone to "represent, sing and dance pieces that were not in Spanish". The use of Spanish gradually became more prestigious and marked the start of the decline of Catalan. Starting in the 16th century, Catalan literature came under the influence of Spanish, and the nobility and part of the urban and literary classes became bilingual.

==== France ====

With the Treaty of the Pyrenees (1659), Spain ceded the northern part of the Principality of Catalonia to France, and soon thereafter the local Catalan varieties came under the influence of French, which in 1700 became the sole official language of the region.

Shortly after the French Revolution (1789), the French First Republic prohibited official use of, and enacted discriminatory policies against, the regional languages of France, such as Catalan, Alsatian, Breton, Occitan, Flemish, and Basque.

=== France: 19th to 20th century ===

After the French colony of Algeria was established in 1830, many Catalan-speaking settlers moved there. People from the Spanish province of Alicante settled around Oran, while those from French Catalonia and Menorca migrated to Algiers.

By 1911, there were around 100,000 speakers of Patuet, as their speech was called. After the Algerian declaration of independence in 1962, almost all the Pied-Noir Catalan speakers fled to Northern Catalonia or Alicante.

The French government recognizes only French as an official language. Nevertheless, on 10 December 2007, the then General Council of the Pyrénées-Orientales officially recognized Catalan as one of the département's languages, with the aim of further promoting it in public life and education.

=== Spain: 18th to 20th century ===

In 1807, the Statistics Office of the French Ministry of the Interior asked the prefects for an official survey on the limits of the French language. The survey found that in Roussillon, almost only Catalan was spoken, and since Napoleon wanted to incorporate Catalonia into France, as happened in 1812, the consul in Barcelona was also asked. He declared that Catalan "is taught in schools, it is printed and spoken, not only among the lower class, but also among people of first quality, also in social gatherings, as in visits and congresses", indicating that it was spoken everywhere "with the exception of the royal courts". He also indicated that Catalan was spoken "in the Kingdom of Valencia, in the islands of Mallorca, Menorca, Ibiza, Sardinia, Corsica and much of Sicily, in the Vall d'Aran and Cerdaña".

The defeat of the pro-Habsburg coalition in the War of the Spanish Succession (1714) initiated a series of laws which, among other centralizing measures, imposed the use of Spanish in legal documentation all over Spain. Because of this, use of the Catalan language declined into the 18th century.

However, the 19th century saw a Catalan literary revival (Renaixença), which has continued up to the present day. This period starts with Aribau's Ode to the Homeland (1833), followed in the second half of the 19th century and the early 20th by the work of Verdaguer (poetry), Oller (realist novel), and Guimerà (drama). In the 19th century, the region of Carche, in the province of Murcia, was repopulated with Valencian speakers. Catalan spelling was standardized in 1913 and the language became official during the Second Spanish Republic (1931–1939). The Second Spanish Republic saw a brief period of tolerance, with most restrictions against Catalan lifted. The Generalitat (the autonomous government of Catalonia, established during the Republic in 1931) made normal use of Catalan in its administration and made efforts to promote it at the social level, including in schools and the University of Barcelona.

The Catalan language and culture were still vibrant during the Spanish Civil War (1936–1939), but were crushed at an unprecedented level throughout the subsequent decades under the Francoist dictatorship (1939–1975), which abolished the official status of Catalan and imposed the use of Spanish in schools and in public administration in all of Spain, while banning the use of Catalan in them. Between 1939 and 1943 newspapers and book printing in Catalan almost disappeared. Francisco Franco's desire for a homogeneous Spanish population resonated with some Catalans in favor of his regime, primarily members of the upper class, who began to reject the use of Catalan. Despite all of these hardships, Catalan continued to be used privately within households, and it was able to survive Franco's dictatorship. At the end of World War II, however, some of the harsh measures began to be lifted and, while Spanish remained the sole promoted language, a limited amount of Catalan literature began to be tolerated. Several prominent Catalan authors resisted the suppression through literature. Privately organized contests were created to reward works in Catalan, among them the Joan Martorell prize (1947), the Víctor Català prize (1953), the Carles Riba award (1950), and the Honor Award of Catalan Letters (1969). The first Catalan-language TV show was broadcast in 1964. At the same time, oppression of the Catalan language and identity was carried out in schools, through governmental bodies, and in religious centers.

In addition to the loss of prestige for Catalan and its prohibition in schools, migration during the 1950s into Catalonia from other parts of Spain also contributed to the diminished use of the language. These migrants were often unaware of the existence of Catalan, and thus felt no need to learn or use it. Catalonia was the economic powerhouse of Spain, so these migrations continued to occur from all corners of the country. Employment opportunities were reduced for those who were not bilingual. Daily newspapers remained exclusively in Spanish until after Franco's death, when the first one in Catalan since the end of the Civil War, Avui, began to be published in 1976.

=== Present day ===
Since the Spanish transition to democracy (1975–1982), Catalan has been institutionalized as an official language, a language of education, and a language of mass media, all of which have contributed to its increased prestige. Catalonia is home to a large bilingual non-state linguistic community that is unparalleled in Europe. The teaching of Catalan is mandatory in all schools, but it is possible to use Spanish for studying in the public education system of Catalonia in two situations: if the teacher assigned to a class chooses to use Spanish, or during the learning process of one or more recently arrived immigrant students. There is also some intergenerational shift towards Catalan.

More recently, several Spanish political forces have tried to increase the use of Spanish in the Catalan educational system. As a result, in May 2022 the Spanish Supreme Court urged the Catalan regional government to enforce a measure by which 25% of all lessons must be taught in Spanish.

According to the Statistical Institute of Catalonia, in 2013 Catalan was the second most commonly used language in Catalonia, after Spanish, as a native or self-defining language: 7% of the population self-identified with both Catalan and Spanish equally, 36.4% with Catalan and 47.5% only with Spanish. In 2003 the same studies found no language preference for self-identification within the population above 15 years old: 5% self-identified with both languages, 44.3% with Catalan and 47.5% with Spanish. To promote the use of Catalan, the Generalitat de Catalunya (Catalonia's official autonomous government) spends part of its annual budget on promoting Catalan in Catalonia and in other territories, through entities such as the Consorci per a la Normalització Lingüística (Consortium for Linguistic Normalization).

In Andorra, Catalan has always been the sole official language. Since the promulgation of the 1993 constitution, several policies favoring Catalan have been enforced, such as Catalan-medium education.

On the other hand, there are several language shift processes currently taking place. In the Northern Catalonia area of France, Catalan has followed the same trend as the other minority languages of France, with most of its native speakers being 60 or older (as of 2004). Catalan is studied as a foreign language by 30% of the primary education students, and by 15% of the secondary. The cultural association La Bressola promotes a network of community-run schools engaged in Catalan language immersion programs.

In Alicante province, Catalan is being replaced by Spanish, and in Alghero by Italian. There is also well-ingrained diglossia in the Valencian Community, Ibiza, and, to a lesser extent, the rest of the Balearic Islands.

During the 20th century many Catalans emigrated or went into exile to Venezuela, Mexico, Cuba, Argentina, and other Latin American countries. They formed a large number of Catalan colonies that today continue to maintain the Catalan language. They also founded many Catalan casals (associations).

== Classification and relationship with other Romance languages ==

One classification of Catalan is given by Pèire Bèc:
- Romance languages
  - Italo-Western languages
    - Western Romance languages
      - Gallo-Iberian languages
        - Gallo-Romance languages
          - Occitano-Romance languages
            - Catalan language

However, the ascription of Catalan to the Occitano-Romance branch of Gallo-Romance languages is not shared by all linguists and philologists, particularly among Spanish ones, such as Ramón Menéndez Pidal.

Catalan bears varying degrees of similarity to the linguistic varieties subsumed under the cover term Occitan language (see also differences between Occitan and Catalan and Gallo-Romance languages). Thus, as should be expected from closely related languages, Catalan today shares many traits with other Romance languages.

=== Relationship with other Romance languages ===
Some include Catalan in Occitan, as the linguistic distance between this language and some Occitan dialects (such as the Gascon dialect) is similar to the distance among different Occitan dialects. Catalan was considered a dialect of Occitan until the end of the 19th century and still today remains its closest relative.

Catalan shares many traits with the other neighboring Romance languages (Occitan, French, Italian, Sardinian as well as Spanish and Portuguese among others). However, despite being spoken mostly on the Iberian Peninsula, Catalan has marked differences with the Iberian Romance group (Spanish and Portuguese) in terms of pronunciation, grammar, and especially vocabulary; it shows instead its closest affinity with languages native to France and northern Italy, particularly Occitan and to a lesser extent Gallo-Romance (Franco-Provençal, French, Gallo-Italian).

According to Ethnologue, the lexical similarity between Catalan and other Romance languages is: 87% with Italian; 85% with Portuguese and Spanish; 76% with Ladin and Romansh; 75% with Sardinian; and 73% with Romanian.

Lexical comparison of 24 words among Romance languages: 17 cognates with Gallo-Romance, 5 isoglosses with Iberian Romance, 3 isoglosses with Occitan, and 1 unique word.

| Gloss | Catalan | Occitan | (Campidanese) Sardinian | Italian | French | Spanish | Portuguese | Romanian |
| cousin | cosí | cosin | fradili | cugino | cousin | primo | primo, coirmão | văr |
| brother | germà | fraire | fradi | fratello | frère | hermano | irmão | frate |
| nephew | nebot | nebot | nebodi | nipote | neveu | sobrino | sobrinho | nepot |
| summer | estiu | estiu | istadi | estate | été | verano, estío | verão, estio | vară |
| evening | vespre | ser, vèspre | seru | sera | soir | tarde, noche | tarde, serão | seară |
| morning | matí | matin | mangianu | mattina | matin | mañana | manhã, matina | dimineață |
| frying pan | paella | padena | paella | padella | poêle | sartén | frigideira, fritadeira | tigaie |
| bed | llit | lièch (or lèit) | letu | letto | lit | cama, lecho | cama, leito | pat |
| bird | ocell, au | aucèl | pilloni | uccello | oiseau | ave, pájaro | ave, pássaro | pasăre |
| dog | gos, ca | gos, canh | cani | cane | chien | perro, can | cão, cachorro | câine |
| plum | pruna | pruna | pruna | prugna | prune | ciruela | ameixa | prună |
| butter | mantega | bodre | burru (or butiru) | burro | beurre | mantequilla (or manteca) | manteiga | unt |
| piece | tros | tròç, petaç | arrogu | pezzo | morceau, pièce | pedazo, trozo | pedaço, bocado | bucată |
| gray | gris | gris | canu | grigio | gris | gris, pardo | cinzento, gris | gri, sur, cenușiu |
| hot | calent | caud | callenti | caldo | chaud | caliente | quente | cald |
| too much | massa | tròp | tropu | troppo | trop | demasiado | demais, demasiado | prea |
| to want | voler | vòler | bolli(ri) | volere | vouloir | querer | querer | a vrea |
| to take | prendre | prendre (or prene) | pigai | prendere | prendre | tomar, prender | apanhar, levar | a lua |
| to pray | pregar, resar, orar | pregar | pregai | pregare | prier | orar, rezar | orar, rezar, pregar | a se ruga |
| to ask | demanar / preguntar | demandar | dimandai, preguntai | domandare | demander | pedir, preguntar | pedir, perguntar | a cere, a întreba |
| to search | cercar / buscar | cercar | circai | cercare | chercher | buscar | procurar, buscar | a căuta |
| to arrive | arribar | arribar | arribai | arrivare | arriver | llegar, arribar | chegar | a ajunge |
| to speak | parlar | parlar | chistionnai, fueddai | parlare | parler | hablar, parlar | falar, parlar | a vorbi |
| to eat | menjar | manjar | pappai | mangiare | manger | comer (manyar in lunfardo; papear in slang) | comer, manjar (papar in slang) | a mânca |

During much of its history, and especially during the Francoist dictatorship (1939–1975), the Catalan language was ridiculed as a mere dialect of Spanish. This view, based on political and ideological considerations, has no linguistic validity. Spanish and Catalan have important differences in their sound systems, lexicon, and grammatical features, which place Catalan closer to Occitan (and French).

There is evidence that, at least from the 2nd century AD, the vocabulary and phonology of Roman Tarraconensis was different from the rest of Roman Hispania. Differentiation arose generally because Spanish, Asturian, and Galician-Portuguese share certain peripheral archaisms (Spanish hervir, Asturian and Portuguese ferver vs. Catalan bullir, Occitan bolir "to boil") and innovatory regionalisms (Spanish novillo, Asturian nuviellu vs. Catalan torell, Occitan taurèl "bullock"), while Catalan has a shared history with the Western Romance innovative core, especially Occitan.

Catalan and Spanish cognates with different meanings

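| Latin | Catalan | Spanish |
| accostare | acostar "to bring closer" | acostar "to put to bed" |
| levare | llevar "to remove; wake up" | llevar "to take" |
| trahere | treure "to remove" | traer "to bring" |
| circare | cercar "to search" | cercar "to fence" |
| collocare | colgar "to bury" | colgar "to hang" |
| mulier | muller "wife" | mujer "woman or wife" |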

Like all Romance languages, Catalan has a handful of native words which are unique to it, or rare elsewhere. These include:
- verbs: 'to fasten; transfix' > confegir 'to compose, write up', > conjuminar 'to combine, conjugate', > deixondar/-ir 'to wake; awaken', 'to thicken; crowd together' > desar 'to save, keep', > enyorar 'to miss, yearn, pine for', 'to investigate, track' > Old Catalan enagar 'to incite, induce', > Old Catalan ujar 'to exhaust, fatigue', > apaivagar 'to appease, mollify', > rebutjar 'to reject, refuse';
- nouns: > brisa 'pomace', > boga 'reedmace', > cadarn 'catarrh', > congesta 'snowdrift', > deler 'ardor, passion', > freu 'brake', > (a)llau 'avalanche', > vora 'edge, border', 'sawfish' > pestriu > pestiu 'thresher shark, smooth hound; ray', 'live coal' > espurna 'spark', > tardaó > tardor 'autumn'.

The Gothic superstrate produced different outcomes in Spanish and Catalan. For example, Catalan fang "mud" and rostir "to roast", of Germanic origin, contrast with Spanish lodo and asar, of Latin origin; whereas Catalan filosa "spinning wheel" and templa "temple", of Latin origin, contrast with Spanish rueca and sien, of Germanic origin.

The same happens with Arabic loanwords. Thus, Catalan alfàbia "large earthenware jar" and rajola "tile", of Arabic origin, contrast with Spanish tinaja and teja, of Latin origin; whereas Catalan oli "oil" and oliva "olive", of Latin origin, contrast with Spanish aceite and aceituna, of Arabic origin. However, the Arabic element is generally much more prevalent in Spanish.

Situated between two large linguistic blocks (Iberian Romance and Gallo-Romance), Catalan has many unique lexical choices, such as enyorar "to miss somebody", apaivagar "to calm somebody down", and rebutjar "to reject".

== Geographic distribution ==

=== Catalan-speaking territories ===

Traditionally Catalan-speaking territories are sometimes called the Països Catalans (Catalan Countries), a denomination based on cultural affinity and common heritage, that has also had a subsequent political interpretation but no official status. Various interpretations of the term may include some or all of these regions.

Territories where Catalan is spoken

| State | Territory | Catalan name | Notes |
| Andorra | Andorra | Andorra | A sovereign state where Catalan is the national and the sole official language. The Andorrans speak a Western Catalan variety. |
| France | Northern Catalonia | Catalunya Nord | |
| Spain | Catalonia | Catalunya | |
| Spain | Valencian Community (Valencian Country) | Comunitat Valenciana (País Valencià) | Excepting some regions in the west and south which have been Aragonese/Spanish-speaking since at least the 18th century. The Western Catalan variety spoken there is known as "Valencian". |
| Spain | La Franja | La Franja | A part of the Autonomous Community of Aragon, specifically a strip bordering Western Catalonia. It comprises the comarques of Ribagorça, Llitera, Baix Cinca, and Matarranya. |
| Spain | Balearic Islands | Illes Balears | Comprising the islands of Mallorca, Menorca, Ibiza and Formentera. |
| Spain | Carche | El Carxe | A small area of the Autonomous Community of Murcia, settled in the 19th century. |
| Italy | Alghero | L'Alguer | |

=== Number of speakers ===
The number of people known to be fluent in Catalan varies depending on the sources used. A 2004 study did not count the total number of speakers, but estimated a total of 9–9.5 million by matching the percentage of speakers to the population of each area where Catalan is spoken. The web site of the Generalitat de Catalunya estimated that as of 2004 there were 9,118,882 speakers of Catalan. These figures only reflect potential speakers; today it is the native language of only 35.6% of the Catalan population. According to Ethnologue, Catalan had 4.1 million native speakers and 5.1 million second-language speakers in 2021.

According to a 2011 study the total number of Catalan speakers was over 9.8 million, with 5.9 million residing in Catalonia. More than half of them spoke Catalan as a second language, with native speakers being about 4.4 million of those (more than 2.8 in Catalonia). Very few Catalan monoglots exist; virtually all of the Catalan speakers in Spain are bilingual speakers of Catalan and Spanish, with 99.7% of Catalan speakers in Catalonia able to speak Spanish and 99.9% able to understand it.

In Roussillon, only a minority of French Catalans speak Catalan nowadays, with French being the majority language for the inhabitants after a continued process of language shift. According to a 2019 survey by the Catalan government, 31.5% of the inhabitants of Catalonia predominantly spoke Catalan at home whereas 52.7% spoke Spanish, 2.8% both Catalan and Spanish and 10.8% other languages.

Spanish was the most spoken language in Barcelona (according to the linguistic census held by the Government of Catalonia in 2013) and it is understood almost universally. According to the 2013 census, Catalan was also very commonly spoken in the city of 1,501,262: it was understood by 95% of the population, while 72.3% over the age of two could speak it (1,137,816), 79% could read it (1,246,555), and 53% could write it (835,080). The share of Barcelona residents who could speak it (72.3%) was lower than that of the overall Catalan population, of whom 81.2% over the age of 15 spoke the language. Knowledge of Catalan has increased significantly in recent decades thanks to a language immersion educational system. An important social characteristic of the Catalan language is that all the areas where it is spoken are bilingual in practice: together with French in Roussillon, with Italian in Alghero, with Spanish and French in Andorra, and with Spanish in the rest of the territories.

| Territory | State | Understand | Can speak |
| Catalonia | Spain | 6,502,880 | 5,698,400 |
| Valencian Community | Spain | 3,448,780 | 2,407,951 |
| Balearic Islands | Spain | 852,780 | 706,065 |
| Roussillon | France | 203,121 | 125,621 |
| Andorra | Andorra | 75,407 | 61,975 |
| La Franja (Aragon) | Spain | 47,250 | 45,000 |
| Alghero (Sardinia) | Italy | 20,000 | 17,625 |
| Carche (Murcia) | Spain | ~600 | 600 |
| Total Catalan-speaking territories | | 11,150,218 | 9,062,637 |
| Rest of World | | No data | 350,000 |
| Total | | 11,150,218 | 9,412,637 |
1. The number of people who understand Catalan includes those who can speak it.
2. Figures relate to all self-declared capable speakers, not just native speakers.
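The totals in the table above can be re-derived with a short script. The following is a minimal sketch in Python: the figures are copied from the table, while the dictionary layout and the helper name `totals` are illustrative assumptions. It suggests that the published territory totals correspond to the sum of the rows without the approximate Carche figure.

```python
# Figures copied from the table above as (understand, can speak).
# The data layout and the helper name are illustrative, not from any source.
territories = {
    "Catalonia": (6_502_880, 5_698_400),
    "Valencian Community": (3_448_780, 2_407_951),
    "Balearic Islands": (852_780, 706_065),
    "Roussillon": (203_121, 125_621),
    "Andorra": (75_407, 61_975),
    "La Franja (Aragon)": (47_250, 45_000),
    "Alghero (Sardinia)": (20_000, 17_625),
    "Carche (Murcia)": (600, 600),  # the table gives "~600" for understanding
}
REST_OF_WORLD_SPEAKERS = 350_000  # no "understand" figure is given for this row


def totals(rows, exclude=()):
    """Sum the (understand, speak) pairs, skipping territories named in `exclude`."""
    kept = [pair for name, pair in rows.items() if name not in exclude]
    understand = sum(u for u, _ in kept)
    speak = sum(s for _, s in kept)
    return understand, speak


with_carche = totals(territories)
without_carche = totals(territories, exclude=("Carche (Murcia)",))
grand_total_speak = without_carche[1] + REST_OF_WORLD_SPEAKERS

print("All rows:", with_carche)
print("Without Carche:", without_carche)
print("Speakers including rest of world:", grand_total_speak)
```

Keeping the approximate Carche row separate makes it easy to see which rows a published total actually covers.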

==== Level of knowledge ====
| Area | Speak | Understand | Read | Write |
| Catalonia | 81.2 | 94.4 | 85.5 | 65.3 |
| Valencian Community | 57.5 | 78.1 | 54.9 | 32.5 |
| Balearic Islands | 74.6 | 93.1 | 79.6 | 46.9 |
| Roussillon | 37.1 | 65.3 | 31.4 | 10.6 |
| Andorra | 78.9 | 96.0 | 89.7 | 61.1 |
| Franja Oriental of Aragón | 88.8 | 98.5 | 72.9 | 30.3 |
| Alghero | 67.6 | 89.9 | 50.9 | 28.4 |
(% of the population 15 years old and older).

==== Social use ====
| Area | At home | Outside home |
| Catalonia | 45 | 51 |
| Valencian Community | 37 | 32 |
| Balearic Islands | 44 | 41 |
| Roussillon | 1 | 1 |
| Andorra | 38 | 51 |
| Franja Oriental of Aragón | 70 | 61 |
| Alghero | 8 | 4 |
(% of the population 15 years old and older).

==== Native language ====
To obtain absolute numbers, the percentages have been applied to the whole population regardless of age and rounded to the nearest 500.
| Area | People | Percentage | Year | Source |
| Catalonia | 3,101,500 | 40.6% | 2021 | |
| Valencian Community | 1,271,000 | 25.4% | 2021 | |
| Balearic Islands | 401,500 | 33.2% | 2021 | |
| Aragon | 29,500 | 2.5% | 2021 | |
| Rest of Spain | 80,500 | 0.3% | 2021 | |
| Andorra | 35,000 | 44.1% | 2022 | |
| Roussillon | 60,000 | 12.7% | 2015 | |
| Alghero | 10,500 | 24.1% | 2015 | |
| TOTAL | 4,989,500 | | | |

== Phonology ==

Catalan phonology varies by dialect. Notable features include:
- Marked contrast of the vowel pairs /ɛ, e/ and /ɔ, o/, as in other Western Romance languages, other than Spanish.
- Lack of diphthongization of Latin short ĕ, ŏ, as in Galician and Portuguese, but unlike French, Spanish, or Italian.
- Abundance of diphthongs containing /w/, as in Galician and Portuguese.

In contrast to other Romance languages, Catalan has many monosyllabic words, and these may end in a wide variety of consonants, including some consonant clusters. Additionally, Catalan has final obstruent devoicing, which gives rise to an abundance of such couplets as amic ("male friend") vs. amiga ("female friend").

Central Catalan pronunciation is considered to be standard for the language. The descriptions below are mostly representative of this variety. For the differences in pronunciation between the different dialects, see the section on pronunciation of dialects in this article.

=== Vowels ===

Catalan has inherited the typical vowel system of Vulgar Latin, with seven stressed phonemes: /a, ɛ, e, i, ɔ, o, u/, a common feature in Western Romance, with the exception of Spanish. Balearic also has instances of stressed /ə/. Dialects differ in their degrees of vowel reduction and in the incidence of the pair /ɛ, e/.

In Central Catalan, unstressed vowels reduce to three: /a, e, ɛ/ > [ə]; /o, ɔ, u/ > [u]; /i/ remains distinct. The other dialects have different vowel reduction processes (see the section pronunciation of dialects in this article).
Examples of vowel reduction processes in Central Catalan. The root is stressed in the first word and unstressed in the second.

| | Front vowels | Front vowels | Front vowels | Back vowels | Back vowels |
| Word pair | gel ("ice") gelat ("ice cream") | pedra ("stone") pedrera ("quarry") | banya ("he bathes") banyem ("we bathe") | cosa ("thing") coseta ("little thing") | tot ("everything") total ("total") |
| IPA transcription | [ˈʒɛl] [ʒəˈlat] | [ˈpeðɾə] [pəˈðɾeɾə] | [ˈbaɲə] [bəˈɲɛm] | [ˈkɔzə] [kuˈzɛtə] | [ˈtot] [tuˈtal] |

=== Consonants ===
  - Catalan consonants

| | | Labial | Alveolar / Dental | Palatal | Velar |
| Nasal | | m | n | ɲ | (ŋ) |
| Plosive | voiceless | p | t | | k |
| Plosive | voiced | b | d | | ɡ |
| Affricate | voiceless | | ts | tʃ | |
| Affricate | voiced | | dz | dʒ | |
| Fricative | voiceless | f | s | ʃ | |
| Fricative | voiced | (v) | z | (ʒ) | |
| Approximant | median | | | j | w |
| Approximant | lateral | | l | ʎ | |
| Tap | | | ɾ | | |
| Trill | | | r | | |

The consonant system of Catalan is rather conservative.
- /l/ has a velarized allophone in syllable coda position in most dialects. However, /l/ is velarized irrespective of position in Eastern dialects such as Majorcan and standard Eastern Catalan.
- /v/ occurs in Balearic, Algherese, standard Valencian and some areas in southern Catalonia. It has merged with /b/ elsewhere.
- The velar nasal /ŋ/ is an allophone of /n/ before /ɡ/ or /k/. However, it has become phonemic in Central dialects that delete the final element of word-final consonant clusters, resulting in minimal pairs such as fan [ˈfan] (“they do”) and fang [ˈfaŋ] (“mud”, pronounced [ˈfaŋk] in other dialects).
- In Valencian, the fricative [ʒ] (and [jʒ]) appears only as a voiced allophone of /ʃ/ (and /jʃ/) before vowels and voiced consonants; e.g. peix al forn [ˈpejʒ al ˈfoɾn] ('baked fish', literally 'fish in the oven'). The /ʒ/ phoneme of other Catalan dialects is pronounced /dʒ/ in standard Valencian.
- Voiced obstruents undergo final-obstruent devoicing: /b/ > [p], /d/ > [t], /ɡ/ > [k].
- Voiced stops are lenited to approximants in syllable onsets, after continuants: /b/ > [β], /d/ > [ð], /ɡ/ > [ɣ]. Exceptions include /d/ after lateral consonants, and /b/ after /f/. In coda position, these sounds are realized as stops, except in some Valencian dialects where they are lenited.
- There is some confusion in the literature about the precise phonetic characteristics of /ʃ/, /ʒ/, /tʃ/, /dʒ/. Some sources describe them as "postalveolar", others as "back alveolo-palatal", implying that the characters ⟨ɕ, ʑ, tɕ, dʑ⟩ would be more accurate. However, in all literature only the characters for palato-alveolar affricates and fricatives are used, even when the same sources use ⟨ɕ, ʑ⟩ for other languages such as Polish and Chinese.
- The distribution of the two rhotics /r/ and /ɾ/ closely parallels that of Spanish. Between vowels, the two contrast, but they are otherwise in complementary distribution: in the onset of the first syllable in a word, [r] appears unless preceded by a consonant. Dialects vary with regard to rhotics in the coda, with Western Catalan generally featuring [ɾ] and Central Catalan dialects featuring a weakly trilled [r] unless it precedes a vowel-initial word in the same prosodic unit, in which case [ɾ] appears.
- In careful speech, /n/, /m/, /l/ may be geminated. Geminated /ʎ/ may also occur. Some analyze intervocalic [r] as the result of gemination of a single rhotic phoneme. This is similar to the common analysis of Spanish and Portuguese rhotics.
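
The final-obstruent devoicing rule listed above can be illustrated with a minimal Python sketch. It covers only the three stops named in the list; the function name, the broad transcriptions without stress marks, and the underlying stem /əmiɡ/ are assumptions made for illustration, not an established analysis.

```python
# Word-final devoicing of voiced stops: /b/ > [p], /d/ > [t], /ɡ/ > [k].
# Both the IPA letter "ɡ" (U+0261) and the ASCII "g" are accepted for convenience.
DEVOICE = {"b": "p", "d": "t", "ɡ": "k", "g": "k"}


def devoice_final(phonemes: str) -> str:
    """Return the transcription with a word-final voiced stop devoiced."""
    if phonemes and phonemes[-1] in DEVOICE:
        return phonemes[:-1] + DEVOICE[phonemes[-1]]
    return phonemes


# Hypothetical underlying stem /əmiɡ/ ("friend"), stress omitted.
print(devoice_final("əmiɡ"))   # masculine amic: the final stop surfaces as [k]
print(devoice_final("əmiɡə"))  # feminine amiga: the stop is not final, so it is unchanged
```

The sketch deliberately ignores lenition and voicing assimilation, which interact with devoicing in connected speech.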

=== Phonological evolution ===

Catalan shares features with neighboring Romance languages (Occitan, Italian, Sardinian, French, Spanish).

- Marked contrast of the vowel pairs /ɛ/ ~ /e/ and /ɔ/ ~ /o/, as in other Western Romance languages, except Spanish and Sardinian.
- Lenition of voiced stops [b] → [β], [d] → [ð], [ɡ] → [ɣ], as in Galician and Spanish.
- Lack of diphthongization of Latin short ĕ, ŏ, as in Galician, Sardinian and Portuguese, and unlike French, Spanish and Italian.
- Abundance of diphthongs containing /w/, as in Galician and Portuguese.
- Abundance of /ʎ/ and /ɲ/ occurring at the end of words, as for instance moll ("wet") and any ("year"), unlike Spanish, Portuguese or Italian.

In contrast with other Romance languages, Catalan has many monosyllabic words, and these may end in a wide variety of consonants, including some consonant clusters. Catalan also has final obstruent devoicing, which yields many couplets like amic ('male friend') vs. amiga ('female friend').

== Sociolinguistics ==

Catalan sociolinguistics studies the situation of Catalan in the world and the different varieties of the language. It is a subdiscipline of Catalan philology and related fields, and its objective is to analyze the relationship between the Catalan language, its speakers, and their immediate environment (including other languages in contact with it).

=== Preferential subjects of study ===

- Dialects of Catalan
- Variations of Catalan by class, gender, profession, age and level of education
- Process of linguistic normalization
- Relations between Catalan and Spanish or French
- Perceptions of the language among Catalan speakers and non-speakers
- Presence of Catalan in several fields: labelling, public administration, media, professional sectors

=== Dialects ===

==== Overview ====

The dialects of Catalan are relatively uniform, especially when compared to those of other Romance languages, in terms of vocabulary, semantics, syntax, morphology, and phonology. Mutual intelligibility between dialects is very high, with estimates ranging from 90% to 95%. The only exception is the isolated, idiosyncratic Algherese dialect.

Catalan is split into two major dialectal blocks: Eastern and Western. The main difference lies in the treatment of unstressed a and e, which have merged to /ə/ in Eastern dialects but remain distinct as /a/ and /e/ in Western dialects. There are a few other differences in pronunciation, verbal morphology, and vocabulary.

Western Catalan comprises the two dialects of North-Western Catalan and Valencian; the Eastern block comprises four dialects: Central Catalan, Balearic, Roussillonese, and Algherese. Each dialect can be further subdivided into several subdialects. The terms "Catalan" and "Valencian" (respectively used in Catalonia and the Valencian Community) refer to two varieties of the same language. There are two institutions regulating the two standard varieties, the Institute of Catalan Studies in Catalonia and the Valencian Academy of the Language in the Valencian Community.

Central Catalan is considered the standard pronunciation of the language and has the largest number of speakers. It is spoken in the densely populated regions of the Barcelona province, the eastern half of the province of Tarragona, and most of the province of Girona.

Catalan has an inflectional grammar. Nouns have two genders (masculine, feminine) and two numbers (singular, plural). Pronouns can additionally have a neuter gender, and some are also inflected for case and politeness and can be combined in very complex ways. Verbs are split into several paradigms and are inflected for person, number, tense, aspect, mood, and gender. In terms of pronunciation, Catalan has many words ending in a wide variety of consonants and some consonant clusters, in contrast with many other Romance languages.

  - Main dialectal divisions of Catalan

| Block | Western Catalan | Western Catalan | Eastern Catalan |
| Variety | North-Western | Valencian | Central |
| Countries | Spain, Andorra | Spain | Spain |
| Area | Andorra, provinces of Lleida, western half of Tarragona, La Franja (Aragon) | Valencian Community, Carche (Murcia) | Provinces of Barcelona, eastern half of Tarragona, most of Girona |

=== Pronunciation ===

==== Vowels ====
Catalan has inherited the typical vowel system of Vulgar Latin, with seven stressed phonemes: /a, ɛ, e, i, ɔ, o, u/, a common feature in Western Romance, except Spanish. Balearic also has instances of stressed /ə/. Dialects differ in their degree of vowel reduction and in the incidence of the pair /ɛ, e/.

In Eastern Catalan (except Majorcan), unstressed vowels reduce to three: /a, e, ɛ/ > [ə]; /o, ɔ, u/ > [u]; /i/ remains distinct. There are a few instances of unreduced [e], [o] in some words. Algherese has lowered [ə] to [a].

In Majorcan, unstressed vowels reduce to four: /a, e, ɛ/ follow the Eastern Catalan reduction pattern; however, /o, ɔ/ reduce to [o], with /u/ remaining distinct, as in Western Catalan.

In Western Catalan, unstressed vowels reduce to five: /e, ɛ/ > [e]; /o, ɔ/ > [o]; /a, u, i/ remain distinct. This reduction pattern, inherited from Proto-Romance, is also found in Italian and Portuguese. Some Western dialects present further reduction or vowel harmony in some cases.
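
The three reduction patterns described above can be compared side by side in a small Python sketch. The table names, the function name, and the input format (a list of syllables plus the index of the stressed one) are assumptions chosen for illustration. The sketch models only the general mergers, not the lexical exceptions, the Algherese lowering, or vowel harmony.

```python
# Unstressed-vowel mergers per dialect group, transcribed from the prose above.
# Vowels that are not listed stay unchanged in that dialect.
REDUCTION = {
    "eastern": {"a": "ə", "e": "ə", "ɛ": "ə", "o": "u", "ɔ": "u"},
    "majorcan": {"a": "ə", "e": "ə", "ɛ": "ə", "ɔ": "o"},
    "western": {"ɛ": "e", "ɔ": "o"},
}


def reduce_unstressed(syllables, stressed, dialect):
    """Apply a dialect's mergers to every syllable except the stressed one."""
    table = REDUCTION[dialect]
    result = []
    for index, syllable in enumerate(syllables):
        if index != stressed:
            syllable = "".join(table.get(ch, ch) for ch in syllable)
        result.append(syllable)
    return result


# total ("total"): root vowel /o/ is unstressed, stress falls on the second syllable.
for dialect in ("eastern", "majorcan", "western"):
    print(dialect, ".".join(reduce_unstressed(["to", "tal"], 1, dialect)))
```

Counting the distinct unstressed vowels each table can produce gives three for Eastern, four for Majorcan and five for Western, matching the figures in the text.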

Central, Western, and Balearic differ in the lexical incidence of stressed /e/ and /ɛ/. Usually, words with /ɛ/ in Central Catalan correspond to /ə/ in Balearic and /e/ in Western Catalan. Words with /e/ in Balearic almost always have /e/ in Central and Western Catalan as well. As a result, Central Catalan has a much higher incidence of /ɛ/.

  - Different incidence of stressed /e/, /ə/, /ɛ/

| Word | North-Western (Western) | Valencian (Western) | Majorcan (Eastern) | Central (Eastern) | Northern (Eastern) |
| set ("thirst") | /ˈset/ | /ˈset/ | /ˈsət/ | /ˈsɛt/ | /ˈset/ |
| ven ("he sells") | /ˈven/ | /ˈven/ | /ˈvən/ | /ˈbɛn/ | /ˈven/ |

  - General differences in the pronunciation of unstressed vowels in different dialects

| Word | Western | Eastern |
| North-Western | Valencian | Majorcan |
| mare ("mother") | //ˈmaɾe// | //ˈmaɾə// |
| cançó ("song") | //kanˈso// | //kənˈso// |
| posar ("to put") | //poˈza(ɾ)// | //puˈza(ɾ)// |
| ferro ("iron") | //ˈfɛro// | //ˈfɛru// |

  - Detailed examples of vowel reduction processes in different dialects

| | Word pairs: the first with stressed root, the second with unstressed root | Western | Eastern | |
| Majorcan | Central | Northern | | |
| Front vowels | gel ("ice") gelat ("ice cream") | /[ˈdʒɛl]/ /[dʒeˈlat]/ | /[ˈʒɛl]/ /[ʒəˈlat]/ | /[ˈʒel]/ /[ʒəˈlat]/ |
| pera ("pear") perera ("pear tree") | /[ˈpeɾa]/ /[peˈɾeɾa]/ | /[ˈpəɾə]/ /[pəˈɾeɾə]/ | /[ˈpɛɾə]/ /[pəˈɾeɾə]/ | /[ˈpeɾə]/ /[pəˈɾeɾə]/ |
| pedra ("stone") pedrera ("quarry") | /[ˈpeðɾa]/ /[peˈðɾeɾa]/ | /[ˈpeðɾə]/ /[pəˈðɾeɾə]/ | | |
| banya ("he bathes") banyem/banyam ("we bathe") | /[ˈbaɲa]/ /[baˈɲem]/ | /[ˈbaɲə]/ /[bəˈɲam]/ | /[ˈbaɲə]/ /[bəˈɲɛm]/ | /[ˈbaɲə]/ /[bəˈɲem]/ |
| Back vowels | cosa ("thing") coseta ("little thing") | /[ˈkɔza]/ /[koˈzeta]/ | /[ˈkɔzə]/ /[koˈzətə]/ | /[ˈkɔzə]/ /[kuˈzɛtə]/ |
| tot ("everything") total ("total") | /[ˈtot]/ /[toˈtal]/ | /[ˈtot]/ /[tuˈtal]/ | /[ˈtut]/ /[tuˈtal]/ | |

==== Consonants ====

Catalan dialects are characterized by final-obstruent devoicing, lenition and voicing assimilation. Additionally, many dialects contrast two rhotics (/r, ɾ/) and two laterals (/l, ʎ/).

Most Catalan dialects are also known for their use of dark l (i.e. velarization of /l/ → [ɫ]), which is especially noticeable in syllable-final position, in comparison to neighbouring languages such as Spanish, Italian and French, which lack this pronunciation.

There is dialectal variation in regard to:
- The pronunciation and distribution of sibilants (with different results according to voicing and affrication vs. deaffrication).
  - While there are arguably seven to eight sibilants in Standard Catalan and Standard Valencian, dialects like Central Valencian and Ribagorçan only have three or four.
- The usage of the voiced labiodental fricative phoneme /v/.
- The pronunciation or not of yod ([j]) in the digraph ⟨ix⟩.
- The elision and pronunciation of final rhotics (either [r] or [ɾ]).
- The delateralization of the palatal lateral approximant (/ʎ/).
- The alternation of lenition vs. fortition (such as /b/ in poble 'village, people' → [β] vs. [b] vs. [bː] vs. [p] vs. [pː]).

=== Morphology ===
Western Catalan: In verbs, the ending for 1st-person present indicative is -e in verbs of the 1st conjugation and -∅ in verbs of the 2nd and 3rd conjugations in most of the Valencian Community, or -o in all verb conjugations in the Northern Valencian Community and Western Catalonia.
E.g. parle, tem, sent (Valencian); parlo, temo, sento (North-Western Catalan).

Eastern Catalan: In verbs, the ending for 1st-person present indicative is -o, -i, or -∅ in all conjugations.
E.g. parlo (Central), parl (Balearic), and parli (Northern), all meaning 'I speak'.
  - 1st-person singular present indicative forms

| Conjugation | Central (Eastern) | Northern (Eastern) | Balearic (Eastern) | Valencian (Western) | North-Western (Western) | Gloss |
| 1st | parlo | parli | parl | parle | parlo | 'I speak' |
| 2nd | temo | temi | tem | tem | temo | 'I fear' |
| 3rd (pure) | sento | senti | sent | sent | sento | 'I feel', 'I hear' |
| 3rd (inchoative) | poleixo | poleixi | poleix or polesc | polisc or polesc | pol(e)ixo | 'I polish' |
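
The regular (non-inchoative) endings described above can be written as a lookup table. The following Python sketch is illustrative only: the dictionary layout, the dialect keys and the function name are assumptions, and inchoative verbs such as polir are not covered.

```python
# 1st-person singular present indicative endings by dialect and conjugation,
# as described in the text ("" stands for the zero ending -∅).
ENDINGS_1SG = {
    "central": {1: "o", 2: "o", 3: "o"},
    "northern": {1: "i", 2: "i", 3: "i"},
    "balearic": {1: "", 2: "", 3: ""},
    "valencian": {1: "e", 2: "", 3: ""},
    "north-western": {1: "o", 2: "o", 3: "o"},
}


def first_person_singular(stem: str, conjugation: int, dialect: str) -> str:
    """Attach the dialect's 1st-person singular ending to a verb stem."""
    return stem + ENDINGS_1SG[dialect][conjugation]


# Stems of parlar (1st), témer (2nd) and sentir (3rd conjugation).
for dialect in ENDINGS_1SG:
    forms = [first_person_singular(s, c, dialect)
             for s, c in (("parl", 1), ("tem", 2), ("sent", 3))]
    print(dialect, forms)
```

For Valencian this yields parle, tem, sent, and for North-Western Catalan parlo, temo, sento, in line with the examples given above.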

Western Catalan: In verbs, the inchoative endings are -isc/-esc, -ix, -ixen, -isca/-esca.

Eastern Catalan: In verbs, the inchoative endings are -eixo, -eix, -eixen, -eixi.

Western Catalan: In nouns and adjectives, maintenance of /n/ of medieval plurals in proparoxytone words.
E.g. hòmens 'men', jóvens 'youth'.

Eastern Catalan: In nouns and adjectives, loss of /n/ of medieval plurals in proparoxytone words.
E.g. homes 'men', joves 'youth' (Ibicencan, however, follows the model of Western Catalan in this case).

=== Vocabulary ===
Despite its relative lexical unity, the two dialectal blocks of Catalan (Eastern and Western) show some differences in word choice. Any lexical divergence within either of the two groups can be explained as an archaism. Central Catalan usually acts as the innovative element.
  - Selection of different words between Western and Eastern Catalan

| Gloss | "mirror" | "boy" | "broom" | "navel" | "to exit" |
| Eastern Catalan | mirall | noi | escombra | llombrígol | sortir |
| Western Catalan | espill | xiquet | granera | melic | eixir |

== Standards ==

  - Written varieties

| Catalan (IEC) | Valencian (AVL) | gloss |
| anglès | anglés | English |
| conèixer | conéixer | to know |
| treure | traure | take out |
| néixer | nàixer | to be born |
| càntir | cànter | pitcher |
| rodó | redó | round |
| meva | meua | my, mine |
| ametlla | ametla | almond |
| estrella | estrela | star |
| cop | colp | hit |
| llagosta | llangosta | lobster |
| homes | hòmens | men |
| servei | servici | service |

Standard Catalan, accepted by virtually all speakers, is mostly based on Eastern Catalan, which is the most widely used dialect. Nevertheless, the standards of the Valencian Community and the Balearics admit alternative forms, mostly traditional ones, which are not current in eastern Catalonia.

The most notable difference between the two standards lies in the accentuation of some stressed vowels, for instance: francès, anglès (IEC) – francés, anglés (AVL). Nevertheless, the AVL's standard keeps the grave accent ⟨è⟩, while pronouncing it as /e/ rather than /ɛ/, in some words such as què ('what') or València. Other divergences include the use of ⟨tl⟩ (AVL) in some words instead of ⟨tll⟩, as in ametla/ametlla ('almond') and espatla/espatlla ('back'); the use of elided demonstratives (este 'this', eixe 'that') on the same level as reinforced ones (aquest, aqueix); and the use of many verbal forms common in Valencian, some of which are also common in the rest of Western Catalan, such as certain subjunctive forms, the inchoative conjugation in -ix- on the same level as -eix-, or the priority given to the -e morpheme in the 1st person singular of the present indicative (-ar verbs): jo compre instead of jo compro ('I buy').

In the Balearic Islands, the IEC's standard is used but adapted to the Balearic dialect by the philological section of the University of the Balearic Islands. For instance, the IEC accepts both cantam and cantem ('we sing') as correct, but the university holds that cantam must be the preferred form in the Balearic Islands in all fields. Another feature of the Balearic standard is the zero ending in the 1st person singular present indicative: jo compr ('I buy'), jo tem ('I fear'), jo dorm ('I sleep').

In Alghero, the IEC has adapted its standard to the Algherese dialect. Among other features, this standard includes: the definite article lo instead of el; special possessive pronouns and determiners such as la mia ('mine'), lo sou/la sua ('his/her'), lo tou/la tua ('yours'); the use of -v- /v/ in the imperfect tense in all conjugations (cantava, creixiva, llegiva); the use of many archaic words that are usual in Algherese, such as manco instead of menys ('less'), calqui u instead of algú ('someone'), qual/quala instead of quin/quina ('which'); and the adaptation of weak pronouns. In 1999, Catalan (Algherese dialect) was among the twelve minority languages officially recognized as Italy's "historical linguistic minorities" by the Italian State under Law No. 482/1999.

In 2011, the Aragonese government passed a decree approving the statutes of a new language regulator for Catalan in La Franja (the so-called Catalan-speaking areas of Aragon), as originally provided for by Law 10/2009. The new entity, designated the Institut Aragonès del Català, is intended to allow optional education in Catalan and a standardization of the Catalan language in La Franja.

== Status of Valencian ==

Valencian is classified as a Western dialect, along with the North-Western varieties spoken in Western Catalonia (provinces of Lleida and the western half of Tarragona). Central Catalan has 90% to 95% inherent intelligibility for speakers of Valencian.

Linguists, including Valencian scholars, treat Catalan and Valencian as the same language. The official regulating body of the language of the Valencian Community, the Valencian Academy of the Language (Acadèmia Valenciana de la Llengua, AVL), affirms the linguistic unity of the Valencian and Catalan varieties.

The AVL, created by the Valencian parliament, is in charge of dictating the official rules governing the use of Valencian, and its standard is based on the Norms of Castelló (Normes de Castelló). Currently, everyone who writes in Valencian uses this standard, except the Royal Academy of Valencian Culture (Real Acadèmia de Cultura Valenciana, RACV), which uses an independent standard for Valencian.

Despite the position of the official organizations, an opinion poll carried out between 2001 and 2004 showed that the majority of the Valencian people consider Valencian different from Catalan. This position is promoted by people who do not use Valencian regularly. Furthermore, the data indicates that younger generations educated in Valencian are much less likely to hold these views. A minority of Valencian scholars active in fields other than linguistics defends the position of the Royal Academy of Valencian Culture (Real Acadèmia de Cultura Valenciana, RACV), which uses for Valencian a standard independent from Catalan.

This clash of opinions has sparked much controversy. For example, during the drafting of the European Constitution in 2004, the Spanish government supplied the EU with translations of the text into Basque, Galician, Catalan, and Valencian, but the latter two were identical.

== Vocabulary ==

=== Word choices ===
Literary Catalan allows the use of words from different dialects, except those of very restricted use. However, from the 19th century onwards, there has been a tendency towards favoring words of Northern dialects to the detriment of others.

=== Latin and Greek loanwords ===
Like other languages, Catalan has a large list of loanwords from Greek and Latin. This process started very early, and one can find such examples in Ramon Llull's work. In the 14th and 15th centuries Catalan had a far greater number of Greco-Latin loanwords than other Romance languages, as is attested for example in Roís de Corella's writings. The incorporation of learned, or "bookish" words from its own ancestor language, Latin, into Catalan is arguably another form of lexical borrowing through the influence of written language and the liturgical language of the Church. Throughout the Middle Ages and into the early modern period, most literate Catalan speakers were also literate in Latin; and thus they easily adopted Latin words into their writing—and eventually speech—in Catalan.

=== Word formation ===
The process of morphological derivation in Catalan follows the same principles as in the other Romance languages, where inflection is common. Often, several affixes are appended to a preexisting lexeme, and some sound alternations can occur, for example elèctric [əˈlɛktrik] ("electrical") vs. electricitat [ələktrisiˈtat] ("electricity"), where [k] alternates with [s]. Prefixes are usually appended to verbs, as in preveure ("foresee").

There is greater regularity in the process of word-compounding, where one can find compounded words formed much like those in English.
  - Common types of word compounds in Catalan

| Type | Example | Gloss |
| two nouns, the second assimilated to the first | paper moneda | "banknote paper" |
| noun delimited by an adjective | estat major | "military staff" |
| noun delimited by another noun and a preposition | màquina d'escriure | "typewriter" |
| verb radical with a nominal object | paracaigudes (verb radical: para) | "parachute" |
| noun delimited by an adjective, with adjectival value | pit-roig | "robin" (bird) |

== Writing system ==

| Main forms | A a | B b | C c | D d | E e | F f | G g | H h | I i | J j | K k | L l | M m | N n | O o | P p | Q q | R r | S s | T t | U u | V v | W w | X x | Y y | Z z |
| Modified forms | À à | | Ç ç | | É é | È è | | Í í | Ï ï | | ĿL ŀl | | Ó ó | Ò ò | | Ú ú | Ü ü | | | | | | | | | |

Catalan uses the Latin script, with some added symbols and digraphs. Catalan orthography is systematic and largely phonologically based. Standardization of Catalan was among the topics discussed during the First International Congress of the Catalan Language, held in Barcelona in October 1906. Subsequently, the Philological Section of the Institut d'Estudis Catalans (IEC), established in 1911, published the Normes ortogràfiques in 1913 under the direction of Antoni Maria Alcover and Pompeu Fabra. In 1932, Valencian writers and intellectuals gathered in Castelló de la Plana to formally adopt the so-called Normes de Castelló, a set of guidelines following Pompeu Fabra's Catalan language norms.

  - Pronunciation of Catalan special characters and digraphs

| | Pronunciation | Usage | Examples |
| ç | /s/ | before a, o and u; or in final position | feliç ("happy") |
| gu | /ɡ/ (phonetically [ɡ ~ ɣ]) | before i and e | guerra ("war") |
| gu | /ɡw/ | elsewhere | guant ("glove") |
| ig | /t͡ʃ/ | in final position | raig ("ray") |
| ix | /ʃ/ ([jʃ] in most Western dialects) | medially and finally | caixa ("box") |
| ll | /ʎ/ | in any position | lloc ("place") |
| ŀl | /lː/ (normatively, but usually /l/) | between vowels | noveŀla ("novel") |
| ny | /ɲ/ | in any position | Catalunya ("Catalonia") |
| qu | /k/ | before i and e | qui ("who") |
| qu | /kw/ | before other vowels | quatre ("four") |
| rr | /r/ | between vowels (a single intervocalic r is pronounced /ɾ/) | carrer ("street"); compare mira ("he or she looks") |
| sc | /s/ | between vowels, before i and e | ascens ("rise") |
| ss | /s/ | between vowels (a single intervocalic s is pronounced /z/) | grossa ("big", feminine); compare casa ("house") |
| tg | /d͡ʒ/ | before i and e | fetge ("liver") |
| tj | /d͡ʒ/ | elsewhere | mitjó ("sock") |
| ts | /t͡s/ | in any position | potser ("maybe") |
| tx | /t͡ʃ/ | in any position | despatx ("office") |
| tz | /d͡z/ | mainly word medially | dotze ("twelve") |
| Learned letter combinations (found in loanwords and/or pre-reform terminology) | | | |
| ch | /k/ | in final position | Llach ("Llach") |
| kh | /x/ | in any position | sikh ("sikh") |
| ph | /f/ | in any position | pholis ("pholis") |
| th | /θ/ (/t/ in native words) | in any position | theta ("theta"); compare tothom ("everybody") |

  - Letters and digraphs with contextually conditioned pronunciations

| | Notes | Examples |
| c | /s/ before i and e; corresponds to ç in other contexts | feliç ("happy", singular) vs. felices ("happy", feminine plural); caço ("I hunt") vs. caces ("you hunt") |
| g | /ʒ/ before e and i; corresponds to j in other positions | envejar ("to envy") vs. envegen ("they envy") |
| g | final g after i and final ig after other vowels are pronounced [tʃ]; corresponds to j~g or tj~tg in other positions | desig ("wish") vs. desitjar ("to wish") vs. desitgem ("we wish"), exception: càstig ("punishment"), pronounced with /k/; boig ("mad", masculine) vs. boja ("mad", feminine) vs. boges ("mad", feminine plural) |
| gu | /ɡ/ before e and i; corresponds to g in other positions | botiga ("shop") vs. botigues ("shops") |
| gü | /ɡw/ before e and i; corresponds to gu in other positions | llengua ("language") vs. llengües ("languages") |
| qu | /k/ before e and i; corresponds to c in other positions | vaca ("cow") vs. vaques ("cows") |
| qü | /kw/ before e and i; corresponds to qu in other positions | obliqua ("oblique", feminine singular) vs. obliqües ("oblique", feminine plural) |
| x | /ʃ/ (also [tʃ] dialectally) initially and in onsets after a consonant; [ʃ] after i | xinxa ("bedbug"), guix ("chalk") |
| x | /ks/ between vowels and syllable-finally (except after i in most cases) | taxi ("taxi"), fixar ("to fix"), extra ("extra") |
| x | /ɡz/ between vowels and syllable-finally before voiced consonants | exacte ("exact"), exdirector ("ex-director") |
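
The spelling alternations in the table above (c/qu, g/gu, ç/c, j/g, qu/qü, gu/gü) can be sketched as a single respelling rule applied when an ending beginning with e or i is attached. This Python sketch is a simplification under stated assumptions: the function name is hypothetical, stems must be given in the spelling they have before a, o, u, and cases such as final ig or x are not handled.

```python
# Stem-final spellings used before a, o, u, paired with their spelling before e, i.
# Digraphs come first so that "qu" and "gu" are matched before plain "c" and "g".
BEFORE_FRONT = [
    ("qu", "qü"), ("gu", "gü"),
    ("c", "qu"), ("g", "gu"),
    ("ç", "c"), ("j", "g"),
]
FRONT_VOWELS = set("eéèiíï")


def attach(stem: str, ending: str) -> str:
    """Join stem and ending, respelling the stem-final consonant before e or i."""
    if ending[:1] in FRONT_VOWELS:
        for back, front in BEFORE_FRONT:
            if stem.endswith(back):
                return stem[: -len(back)] + front + ending
    return stem + ending


# Pairs taken from the table: singular or infinitive stem plus a front-vowel ending.
for stem, ending in [("vac", "es"), ("botig", "es"), ("llengu", "es"),
                     ("caç", "es"), ("envej", "en"), ("obliqu", "es")]:
    print(attach(stem, ending))
```

With endings that begin with a, o or u the stem is left untouched, so attach("vac", "a") gives vaca and attach("caç", "o") gives caço.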

== Grammar ==

The grammar of Catalan is similar to that of other Romance languages. Features include:
- Use of definite and indefinite articles.
- Nouns, adjectives, pronouns, and articles are inflected for gender (masculine and feminine), and number (singular and plural). There is no case inflexion, except in pronouns.
- Verbs are highly inflected for person, number, tense, aspect, and mood (including a subjunctive).
- There are no modal auxiliaries.
- Word order is freer than in English.

=== Gender and number inflection ===

  - Regular noun with definite article: el gat ("the cat")

| | masculine | feminine |
| singular | el gat | la gata |
| plural | els gats | les gates |

  - Adjective with 4 forms: verd ("green")

| | masculine | feminine |
| singular | verd | verda |
| plural | verds | verdes |

  - Adjective with 3 forms: feliç ("happy")

| | masculine | feminine |
| singular | feliç | feliç |
| plural | feliços | felices |
