User:TomTheHand/Unit tests for AWB regexes

From Wikipedia, the free encyclopedia

Hi there! This is my testing page for my AWB (AutoWikiBrowser) regular expressions. I realized that I really needed to unit test my regexes to make sure they all work properly, so I started these pages. I then realized that if I also provided the regexes themselves, other people could come by and use them.

I've used <small> tags on the really lengthy ones to try to shrink the width of the table. I realize that this makes many of them illegible, but you can copy and paste them elsewhere for study. I also realize some tables are still horrifyingly wide and will not fit on screens with a resolution less than 1024 pixels wide. Sorry 'bout that.

In order to use them:

  1. Fire up AWB.
  2. Go to the Options tab.
  3. Click "Advanced" under "Find and replace".
  4. Click "New rule".
  5. Copy my description into Name.
  6. Copy my "Find" and "Replace with" text into their respective boxes. Don't open this page for editing and copy them out of the page source; they won't work that way. View the page normally and copy them from there.
  7. Look at my "Regular expression?" and "Case sensitive?" Y/N boxes, and check the proper options in AWB. If "Case sensitive?" is N/A, it doesn't matter whether the box is checked or not.
  8. Click OK, and you're good to go.

You can also load these pages in AWB to check that the tests pass. Let me know if you have any questions or problems.
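
The same kind of check can be approximated outside AWB. This is only a rough Python sketch, not what AWB itself runs: AWB uses the .NET regex engine, so the helper below (a made-up name, `run_cases`) rewrites `$1`-style group references into Python's `\g<1>` form, and it assumes the `&nbsp;` entity appears as literal text in the article source.

```python
import re

def run_cases(find, replace, cases, regex=True, case_sensitive=True):
    """Apply one find/replace rule to each (input, expected) pair.

    Returns a list of (input, expected, actual) tuples for failing cases.
    Hypothetical helper for illustration; it is not part of AWB.
    """
    flags = 0 if case_sensitive else re.IGNORECASE
    if regex:
        pattern = re.compile(find, flags)
        # AWB (.NET) writes group references as $1; Python expects \g<1>.
        repl = re.sub(r"\$(\d+)", r"\\g<\1>", replace)
    else:
        # "Regular expression? N" means a literal search and replace.
        pattern = re.compile(re.escape(find), flags)
        repl = lambda match: replace
    failures = []
    for text, expected in cases:
        actual = pattern.sub(repl, text)
        if actual != expected:
            failures.append((text, expected, actual))
    return failures

# Example: the TF/TG rule from the Ships section.
cases = [("TF 10", "TF&nbsp;10"), ("TG10", "TG&nbsp;10")]
print(run_cases(r"T(F|G)\s?(\d)", "T$1&nbsp;$2", cases))  # prints []
```

An empty list means every case produced its intended result. Python's `re` and .NET differ in some details, so a pass here is a sanity check rather than proof that the rule behaves identically in AWB.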

If you run into any false positives or negatives, please let me know. I'll try to fix them, and then update my regex and unit test to prove that they now handle the new situation. Also, if you dislike any of the changes I'm making, please inform me and we can discuss it. I primarily edit articles about ships, and so I'm not trying to implement these changes everywhere, but if I'm doing something that you think is blatantly wrong we should talk about it.

General

This section contains regular expressions that make general fixes, not limited to a particular topic or type of unit.

Geographical coordinates

This section contains regular expressions for converting plain-text geographical coordinates to {{coord}}.

Frequency

This section contains regular expressions for properly formatting frequency.

Area and volume

This section contains regular expressions for properly formatting units of area and volume.

Length

This section contains regular expressions for formatting units of length.

Speed

This section contains regular expressions for formatting units of speed.

Power

This section contains regular expressions for formatting units of power.

Torque

Description
Fix newton metres
Find
\bN(?:\s|&nbsp;|-|•)?m\b
Replace with
N·m
Regular expression? Y
Case sensitive? Y
Text this regex should modify:
  1. N m
  2. N&nbsp;m
  3. N-m
  4. Nm
  5. N•m
Intended result:
  1. N·m
  2. N·m
  3. N·m
  4. N·m
  5. N·m
Description
Fix foot pounds
Find
\bft(?:\s|&nbsp;|-|\.|·|•|\/)?lbf?s?\b
Replace with
ft·lbf
Regular expression? Y
Case sensitive? Y
Text this regex should modify:
  1. ftlb
  2. ft lbf
  3. ft lb
  4. ft-lbfs
  5. ft.lb
  6. ft·lb
  7. ft•lb
  8. ft/lbs
Intended result:
  1. ft·lbf
  2. ft·lbf
  3. ft·lbf
  4. ft·lbf
  5. ft·lbf
  6. ft·lbf
  7. ft·lbf
  8. ft·lbf
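
One thing worth checking in the foot-pound separator group is the dot: an unescaped `.` matches any character, while `\.` matches only a literal full stop. Here's a small Python sketch comparing the two (Python's `re` stands in for AWB's .NET engine, and the input `ftXlb` is a made-up string chosen just to show the difference).

```python
import re

# Separator group with a bare dot: "." matches ANY single character.
UNESCAPED = re.compile(r"\bft(?:\s|&nbsp;|-|.|·|•|\/)?lbf?s?\b")
# Separator group with an escaped dot: "\." matches only a full stop.
ESCAPED = re.compile(r"\bft(?:\s|&nbsp;|-|\.|·|•|\/)?lbf?s?\b")

def fix(pattern, text):
    """Replace every match of the given pattern with the canonical unit."""
    return pattern.sub("ft·lbf", text)

# The listed test inputs all normalise with the escaped version.
for text in ["ftlb", "ft lbf", "ft lb", "ft-lbfs", "ft.lb", "ft·lb", "ft•lb", "ft/lbs"]:
    assert fix(ESCAPED, text) == "ft·lbf"

# A made-up non-unit string: only the bare-dot version rewrites it.
print(fix(UNESCAPED, "ftXlb"))  # prints ft·lbf
print(fix(ESCAPED, "ftXlb"))    # prints ftXlb
```

Both versions pass the eight listed tests, so the difference only shows up with a negative test case like the last one.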

Ships

Description
Put non-breaking spaces between "TF" or "TG" (short for Task Force or Task Group) and a number
Find
T(F|G)\s?(\d)
Replace with
T$1&nbsp;$2
Regular expression? Y
Case sensitive? Y
Text this regex should modify:
  1. TF 10
  2. TG 10
  3. TF&nbsp;10
  4. TG&nbsp;10
  5. TF10
  6. TG10
Intended result:
  1. TF&nbsp;10
  2. TG&nbsp;10
  3. TF&nbsp;10
  4. TG&nbsp;10
  5. TF&nbsp;10
  6. TG&nbsp;10
Description
Put non-breaking spaces between CruDiv (Cruiser Division) and a number, and capitalize properly
Find
crudiv(?:\s|&nbsp;)?(\d)
Replace with
CruDiv&nbsp;$1
Regular expression? Y
Case sensitive? N
Text this regex should modify:
  1. crudiv 5
  2. Crudiv 5
  3. CruDiv 5
  4. CrUdIv 5
  5. crudiv&nbsp;5
  6. Crudiv&nbsp;5
  7. CruDiv&nbsp;5
  8. CrUdIv&nbsp;5
  9. crudiv5
  10. Crudiv5
  11. CruDiv5
  12. CrUdIv5
Intended result:
  1. CruDiv&nbsp;5
  2. CruDiv&nbsp;5
  3. CruDiv&nbsp;5
  4. CruDiv&nbsp;5
  5. CruDiv&nbsp;5
  6. CruDiv&nbsp;5
  7. CruDiv&nbsp;5
  8. CruDiv&nbsp;5
  9. CruDiv&nbsp;5
  10. CruDiv&nbsp;5
  11. CruDiv&nbsp;5
  12. CruDiv&nbsp;5
Description
Put non-breaking spaces between DesDiv (Destroyer Division) and a number, and capitalize properly
Find
desdiv(?:\s|&nbsp;)?(\d)
Replace with
DesDiv&nbsp;$1
Regular expression? Y
Case sensitive? N
Text this regex should modify:
  1. desdiv 5
  2. Desdiv 5
  3. DesDiv 5
  4. DeSdIv 5
  5. desdiv&nbsp;5
  6. Desdiv&nbsp;5
  7. DesDiv&nbsp;5
  8. DeSdIv&nbsp;5
  9. desdiv5
  10. Desdiv5
  11. DesDiv5
  12. DeSdIv5
Intended result:
  1. DesDiv&nbsp;5
  2. DesDiv&nbsp;5
  3. DesDiv&nbsp;5
  4. DesDiv&nbsp;5
  5. DesDiv&nbsp;5
  6. DesDiv&nbsp;5
  7. DesDiv&nbsp;5
  8. DesDiv&nbsp;5
  9. DesDiv&nbsp;5
  10. DesDiv&nbsp;5
  11. DesDiv&nbsp;5
  12. DesDiv&nbsp;5
Description
Put non-breaking spaces between DesRon (Destroyer Squadron) and a number, and capitalize properly
Find
desron(?:\s|&nbsp;)?(\d)
Replace with
DesRon&nbsp;$1
Regular expression? Y
Case sensitive? N
Text this regex should modify:
  1. desron 5
  2. Desron 5
  3. DesRon 5
  4. DeSrOn 5
  5. desron&nbsp;5
  6. Desron&nbsp;5
  7. DesRon&nbsp;5
  8. DeSrOn&nbsp;5
  9. desron5
  10. Desron5
  11. DesRon5
  12. DeSrOn5
Intended result:
  1. DesRon&nbsp;5
  2. DesRon&nbsp;5
  3. DesRon&nbsp;5
  4. DesRon&nbsp;5
  5. DesRon&nbsp;5
  6. DesRon&nbsp;5
  7. DesRon&nbsp;5
  8. DesRon&nbsp;5
  9. DesRon&nbsp;5
  10. DesRon&nbsp;5
  11. DesRon&nbsp;5
  12. DesRon&nbsp;5
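
The CruDiv, DesDiv and DesRon rules differ only in the unit name, so one way to guard against copy-paste slips between them is to generate all three from a single list. This is a Python sketch of that idea, not how AWB stores its rules: `UNITS` and `fix_unit_numbers` are made-up names, the non-breaking space is assumed to be the literal `&nbsp;` entity, and Python's `\1` stands in for AWB's `$1`.

```python
import re

# Canonical spellings; each search pattern is built from the name itself,
# so the "find" and "replace" sides cannot drift apart.
UNITS = ["CruDiv", "DesDiv", "DesRon"]

def fix_unit_numbers(text):
    """Normalise capitalisation and insert &nbsp; before the unit number."""
    for name in UNITS:
        # Case-insensitive match, optional whitespace or &nbsp; separator.
        pattern = re.compile(re.escape(name) + r"(?:\s|&nbsp;)?(\d)", re.IGNORECASE)
        text = pattern.sub(name + r"&nbsp;\1", text)
    return text

print(fix_unit_numbers("desron 21 and crudiv5"))
# prints DesRon&nbsp;21 and CruDiv&nbsp;5
```

Only the first digit is captured, as in the rules above; any further digits follow the match unchanged.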

Temperature

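Description
Replace "centigrade" with "Celsius"
Find
centigrade
Replace with
Celsius
Regular expression? N
Case sensitive? N
Text this regex should modify:
  1. centigrade
  2. Centigrade
  3. CeNtIgRaDe
Intended result:
  1. Celsius
  2. Celsius
  3. Celsius
Description
Get "Celsius" capitalized properly
Find
celsius
Replace with
Celsius
Regular expression? N
Case sensitive? N
Text this regex should modify:
  1. celsius
  2. Celsius
  3. CeLsIuS
Intended result:
  1. Celsius
  2. Celsius
  3. Celsius
Description
Get "Fahrenheit" capitalized properly
Find
fahrenheit
Replace with
Fahrenheit
Regular expression? N
Case sensitive? N
Text this regex should modify:
  1. fahrenheit
  2. Fahrenheit
  3. FaHrEnHeIt
Intended result:
  1. Fahrenheit
  2. Fahrenheit
  3. Fahrenheit
Description
Remove spaces from between degree symbol and C or F
Find
°(?:\s|&nbsp;)([CF])
Replace with
°$1
Regular expression? Y
Case sensitive? N
Text this regex should modify:
  1. ° C
  2. ° F
  3. °C
  4. °F
  5. °&nbsp;C
  6. °&nbsp;F
Intended result:
  1. °C
  2. °F
  3. °C
  4. °F
  5. °C
  6. °F
Description
Replace °Celsius with °C
Find
°celsius
Replace with
°C
Regular expression? N
Case sensitive? N
Text this regex should modify:
  1. °celsius
  2. °Celsius
Intended result:
  1. °C
  2. °C
Description
Replace °Fahrenheit with °F
Find
°fahrenheit
Replace with
°F
Regular expression? N
Case sensitive? N
Text this regex should modify:
  1. °fahrenheit
  2. °Fahrenheit
Intended result:
  1. °F
  2. °F
Description
Put a non-breaking space between the temperature and °C or °F
Find
(\d)\s?(°[CF])
Replace with
$1&nbsp;$2
Regular expression? Y
Case sensitive? N
Text this regex should modify:
  1. 40°C
  2. 40°F
  3. 40 °C
  4. 40 °F
  5. 40&nbsp;°C
  6. 40&nbsp;°F
Intended result:
  1. 40&nbsp;°C
  2. 40&nbsp;°F
  3. 40&nbsp;°C
  4. 40&nbsp;°F
  5. 40&nbsp;°C
  6. 40&nbsp;°F
Description
Get rid of degree symbol in temperatures in Kelvin
Find
(\d)(?:\s|&nbsp;)?°K
Replace with
$1&nbsp;K
Regular expression? Y
Case sensitive? N
Text this regex should modify:
  1. 300°K
  2. 300 °K
  3. 300&nbsp;°K
Intended result:
  1. 300&nbsp;K
  2. 300&nbsp;K
  3. 300&nbsp;K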
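
The temperature rules build on one another (for example, "centigrade" has to become "Celsius" before "°Celsius" can become "°C"), so it can help to test them as a chain. Below is a Python sketch that assumes the rules run top to bottom in the order listed; that order is my assumption for the example, and `RULES` and `fix_temperature` are made-up names. All eight rules are case-insensitive, and the plain-text rules contain no regex metacharacters, so they are used as patterns directly.

```python
import re

# (find, replace) pairs in listed order; $1 is written as \1 for Python.
RULES = [
    (r"centigrade", "Celsius"),
    (r"celsius", "Celsius"),
    (r"fahrenheit", "Fahrenheit"),
    (r"°(?:\s|&nbsp;)([CF])", r"°\1"),
    (r"°celsius", "°C"),
    (r"°fahrenheit", "°F"),
    (r"(\d)\s?(°[CF])", r"\1&nbsp;\2"),
    (r"(\d)(?:\s|&nbsp;)?°K", r"\1&nbsp;K"),
]

def fix_temperature(text):
    """Apply every rule in order, each one case-insensitively."""
    for find, replace in RULES:
        text = re.sub(find, replace, text, flags=re.IGNORECASE)
    return text

print(fix_temperature("40°centigrade"))  # prints 40&nbsp;°C
print(fix_temperature("40 ° F"))         # prints 40&nbsp;°F
print(fix_temperature("300 °K"))         # prints 300&nbsp;K
```

The first example passes through four rules in turn: centigrade to Celsius, the capitalization rule (a no-op here), °Celsius to °C, and finally the non-breaking space.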

Final

These are general-purpose regexes, but they should be run last, because their purpose is to make some final fixes after all the other work has been done.

Description
Use non-breaking spaces between numbers and abbreviated SI units, per MoS. This does not try to understand and fix improper SI units, so it needs to be run at the end, after all of that stuff has already been taken care of.
Find
\b(\d+)(?:\s|-)?[ ]*(Y|Z|E|P|T|G|M|k|h|da|d|c|m|µ|n|p|f|a|z|y)?(m|g|s|A|K|mol|cd|Hz|N|Pa|J|W|C|V|F|Ω|S|Wb|T|H|lm|lx|Bq|Gy|Sv|kat|M)(²|³)?\b
Replace with
$1&nbsp;$2$3$4
Regular expression? Y
Case sensitive? Y
Text this regex should modify:
  1. 15 km/h
  2. 15-m
  3. 15 km²
  4. 15cm³
Intended result:
  1. 15&nbsp;km/h
  2. 15&nbsp;m
  3. 15&nbsp;km²
  4. 15&nbsp;cm³
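
For anyone who wants to poke at the SI rule outside AWB, here is a Python port as a sketch. The pattern is copied as is; the replacement uses `\1` where AWB uses `$1`, and `add_nbsp` is a made-up name. Python and .NET don't agree on every detail of `\b` and Unicode characters, so treat this as an approximation of what AWB does.

```python
import re

SI_UNIT = re.compile(
    r"\b(\d+)(?:\s|-)?[ ]*"                                # number, optional separator
    r"(Y|Z|E|P|T|G|M|k|h|da|d|c|m|µ|n|p|f|a|z|y)?"         # optional SI prefix
    r"(m|g|s|A|K|mol|cd|Hz|N|Pa|J|W|C|V|F|Ω|S|Wb|T|H|lm|lx|Bq|Gy|Sv|kat|M)"  # unit
    r"(²|³)?\b"                                            # optional square or cube
)

def add_nbsp(text):
    """Join a number and its SI unit with a literal &nbsp; entity."""
    return SI_UNIT.sub(r"\1&nbsp;\2\3\4", text)

for sample in ["15 km/h", "15-m", "15 km²", "15cm³", "15 apples"]:
    print(sample, "->", add_nbsp(sample))
```

The last sample is a made-up negative case: "apples" does not start with a valid prefix and unit combination, so the text is left alone.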