User:SmallJarsWithGreenLabels/regexes for language templating

From Wikipedia, the free encyclopedia

Italics to transliterations

Find (?<=[ \[(,.\n"'])''([^']*)''(?=[ \)\],.\n"'])
Replace {{transl|lang code|$1}}
Notes:

  • Must be replaced one by one so that non-transliterations can be skipped.
  • Does not match bolded italics
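
The one-by-one review described in the notes can be sketched in Python. This is only an illustration: the Find pattern is copied unchanged, while the function name `transliterate`, the `accept` callback, and the language code `zh` are hypothetical and not part of the original workflow. Note that Python writes the `$1` backreference as `m.group(1)` (or `\1` in a replacement string).

```python
import re

# The Find pattern from above, copied unchanged into a raw triple-quoted string.
ITALIC = re.compile(r"""(?<=[ \[(,.\n"'])''([^']*)''(?=[ \)\],.\n"'])""")


def transliterate(wikitext, lang_code, accept):
    """Wrap accepted italic spans in {{transl}}; leave rejected spans untouched.

    `accept` is a hypothetical callback that receives the italicised text and
    returns True when it really is a transliteration.
    """
    def repl(match):
        if accept(match.group(1)):
            return "{{transl|" + lang_code + "|" + match.group(1) + "}}"
        return match.group(0)  # skip: keep the original italics

    return ITALIC.sub(repl, wikitext)


text = "The term ''hanyu pinyin'' is ''not'' a title."
result = transliterate(text, "zh", lambda s: s == "hanyu pinyin")
print(result)
```

In an interactive editor the `accept` step is simply the human pressing "replace" or "skip" on each match.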

Bold italics to transliterations

Find (?<=[ \[(,.\n"'])'''''([^']*)'''''(?=[ \)\],.\n"'])
Replace '''{{transl|lang code|$1}}'''

Hanzi to zh

Find ([一-龠。、 ]+)
Replace All {{lang|zh|$1}}
Notes:

  • After replacing, search for {{lang\|zh\| and review every instance, removing the template from any that fall inside ref names, file names, links, or other language templates
  • Not a worthwhile method on articles that heavily mix different CJKV scripts
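
A rough Python sketch of the replace-then-review workflow follows. It rests on one loud assumption: the blank inside the character class is taken to be the ideographic space U+3000, not an ASCII space, so the class is written with explicit escapes (一 is U+4E00, 龠 is U+9FA0). The helper names `wrap_hanzi` and `audit` are hypothetical.

```python
import re

# Assumption: the blank in the original class is U+3000 (ideographic space).
# U+4E00-U+9FA0 is the range 一-龠; U+3002 and U+3001 are 。 and 、.
HANZI = re.compile("([\u4e00-\u9fa0\u3002\u3001\u3000]+)")


def wrap_hanzi(wikitext):
    """Equivalent of Replace All: wrap every run of Hanzi in {{lang|zh|...}}."""
    return HANZI.sub(r"{{lang|zh|\1}}", wikitext)


def audit(wikitext, width=20):
    """Return a context snippet around every {{lang|zh| for manual review."""
    hits = []
    for match in re.finditer(r"\{\{lang\|zh\|", wikitext):
        start = max(0, match.start() - width)
        hits.append(wikitext[start:match.end() + width])
    return hits


wrapped = wrap_hanzi("Beijing (北京) and [[File:北京.jpg]]")
for snippet in audit(wrapped):
    print(snippet)
```

Here the second hit falls inside a file name, which is exactly the kind of instance that has to be detemplated by hand.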

Japanese

Find ([ァ-ヴぁ-ゔ一-龠。、 ーヶ〆〤々」「』『―)(]+)
Replace All {{lang|ja|$1}}
Notes: (as above)

  • After replacing, search for {{lang\|ja\| and review every instance, removing the template from any that fall inside ref names, file names, links, or other language templates
  • Not a worthwhile method on articles that heavily mix different CJKV scripts

Merge neighbouring lang templates

The commands above can split a sentence written in one language into several templates separated by spaces. Apply the following command to merge adjacent templates that share a language code:
Find {{lang\|([^\|]{0,4})\|([^}{]*)}}( ){{lang\|\1\|([^}{]*)}}
Replace All × n {{lang|$1|$2$3$4}}
Notes:

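  • Continue applying until there are zero matches; each pass merges only non-overlapping pairs, so a run of several templates needs more than one pass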
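
The repeat-until-no-matches procedure can be sketched in Python as a loop around `re.subn`, which returns the number of replacements made. The pattern is the Find expression above with the literal braces escaped for safety; the function name `merge_lang_templates` is hypothetical, and `\1` to `\4` stand in for `$1` to `$4`.

```python
import re

# Same Find pattern as above, with the literal braces escaped.
MERGE = re.compile(
    r"\{\{lang\|([^\|]{0,4})\|([^}{]*)\}\}( )\{\{lang\|\1\|([^}{]*)\}\}"
)


def merge_lang_templates(wikitext):
    """Apply the merge repeatedly; return the text and the number of passes."""
    passes = 0
    while True:
        wikitext, count = MERGE.subn(r"{{lang|\1|\2\3\4}}", wikitext)
        if count == 0:  # zero matches: nothing left to merge
            return wikitext, passes
        passes += 1


merged, passes = merge_lang_templates(
    "{{lang|zh|一}} {{lang|zh|二}} {{lang|zh|三}}"
)
print(merged, passes)
```

Three adjacent templates take two passes because the first pass consumes the first pair and leaves the third template without a partner. Templates with different language codes are never merged, since the backreference `\1` requires the second code to equal the first.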