User talk:Ark25/Romanian diacritics/Replacements in titles

From Wikipedia, the free encyclopedia

Scripts

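Script to convert S/T-cedilla (ş, Ş, ţ, Ţ) to S/T-comma (ș, Ș, ț, Ț) by matching their UTF-8 byte sequences, then pair each original title with its converted form as a numbered wikilink list: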

cat in.txt ^
 | sed -e "s/\xc5\x9f/\xc8\x99/g" ^
 | sed -e "s/\xc5\x9e/\xc8\x98/g" ^
 | sed -e "s/\xc5\xa3/\xc8\x9b/g" ^
 | sed -e "s/\xc5\xa2/\xc8\x9a/g" > out.txt
paste in.txt out.txt | sed -e "s/\(.*\)\t\(.*\)/#[[\1]] --- [[\2]]/" > out2.txt
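
A rough POSIX-shell equivalent of the script above, for anyone not running it under Windows cmd. This is only a sketch: it assumes the titles are UTF-8 and writes the characters literally instead of as `\x` byte escapes. The function names `fix_diacritics` and `list_pairs` are made up for this example.

```shell
#!/bin/sh
# Sketch only; fix_diacritics and list_pairs are hypothetical helper names.

# Read titles on stdin and write them with the cedilla forms
# replaced by the comma forms.
fix_diacritics() {
  sed -e 's/ş/ș/g' -e 's/Ş/Ș/g' -e 's/ţ/ț/g' -e 's/Ţ/Ț/g'
}

# Print one "#[[old]] --- [[new]]" line per title in the file given as $1.
# paste joins each original line and its converted form with a tab,
# and awk splits on that tab again.
list_pairs() {
  fix_diacritics < "$1" | paste "$1" - |
    awk -F'\t' '{ print "#[[" $1 "]] --- [[" $2 "]]" }'
}

# Demo on a throwaway file.
demo=$(mktemp)
printf '%s\n' 'Timişoara' 'Ţăndărei' > "$demo"
list_pairs "$demo"
rm -f "$demo"
```

Like the original, this lists every title, including those that contain no cedilla characters and therefore come out unchanged.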

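Script to blank out titles containing characters used in Turkish but not in Romanian: ç, ğ, ı, ö, ü and their uppercase forms Ç, Ğ, İ, Ö, Ü. Matching lines are replaced with empty lines rather than deleted.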

cat in.txt ^
 | sed -e "s/.*\xc3\xa7.*//g" ^
 | sed -e "s/.*\xc3\x87.*//g" ^
 | sed -e "s/.*\xc4\x9f.*//g" ^
 | sed -e "s/.*\xc4\x9e.*//g" ^
 | sed -e "s/.*\xc4\xb1.*//g" ^
 | sed -e "s/.*\xc4\xb0.*//g" ^
 | sed -e "s/.*\xc3\xb6.*//g" ^
 | sed -e "s/.*\xc3\x96.*//g" ^
 | sed -e "s/.*\xc3\xbc.*//g" ^
 | sed -e "s/.*\xc3\x9c.*//g" > out.txt
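
If the blank lines are not wanted, a variant can delete the matching lines outright. The sketch below assumes a POSIX shell and UTF-8 titles, and again uses literal characters rather than byte escapes. `drop_turkish` is a hypothetical name.

```shell
#!/bin/sh
# Sketch only; drop_turkish is a hypothetical helper name.

# Read titles on stdin and drop every line that contains one of the
# characters from the filter above (lowercase or uppercase).
drop_turkish() {
  sed -e '/ç/d' -e '/Ç/d' -e '/ğ/d' -e '/Ğ/d' -e '/ı/d' -e '/İ/d' \
      -e '/ö/d' -e '/Ö/d' -e '/ü/d' -e '/Ü/d'
}

# Demo: only the two Romanian titles should remain.
printf '%s\n' 'İstanbul' 'Brașov' 'Gümüşhane' 'Iași' | drop_turkish
```

Deleting lines changes the line count, so the output can no longer be lined up with `in.txt` using `paste`. As with the original filter, any title that legitimately contains ö or ü is dropped as well.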

Examples

Articles correctly containing both S-cedilla and S-comma: Mareşal

Articles correctly containing both Romanian comma-diacritics and a character caught by the Turkish filter (here ö): Starčevo–Kőrös–Criș culture

Resources: User:Strainu/ro - Wikipedia:Bot_requests/Archive_36#Make_redirects_from_titles_with_correct_Romanian_diacritics_to_the_currently_used_diacritics and also a sandbox

Wrong renames (made by me): Battle of Ulaş, Nejla Ateş, Torku Şeker Spor

Unnecessary redirects: