User:Dank/Regex
List of US inventoried hardwoods
\*@1(.*?)@2(.*?)@3(.*?)@4(.*?)@5(.*?)@6(.*?)@7(.*?)@8(.*?)@9(.*?)(\n)

rowscopes: !scope="row"

\*(\d+)@(\w+)@(\d+)\,(\d+)\,(\-)?(\d+)(\n)
|-$7|{{cvt|$1|ft}}; $2@@{{cvt|$3|-|$4|in|cm}}@@{{cvt|$5$6|F}}$7
then change @@ to <br />

\*(\w+)\,(\w+)\,(\w+)\,(\w+)(\n)
|-$5|D: $1<br />F: $2<br />L: $3<br />S: $4$5

@4(\d+)@5(\d+)@6[\w \.]+@7(\w+)@8(.*?)@9(.*?)(\n)
{{sfn|$3|$5|1990|pp=$1–$2}}$6
then delete the stuff preceding the previous @4; replace || by |; and do the ones with a hyphen or with 3 or more authors by hand.

* {{Cite book |last1= |first1= |last2= |first2= |pages=– |chapter='''' |editor-last1=Burns |editor-first1=Russell M. |editor-last2=Honkala |editor-first2=Barbara H. |title=Silvics of North America, Volume 2. Hardwoods. |publisher=US Forest Service, Department of Agriculture (US Government Printing Office) |location=Washington, DC |year=1990 |isbn=978-0-16-029260-6 }}
* {{Cite book |last1=$7 |first1=$6 |last2=$9 |first2=$8 |pages=$4–$5 |chapter=''$1'' |editor-last1=Burns |editor-first1=Russell M. |editor-last2=Honkala |editor-first2=Barbara H. |title=Silvics of North America, Volume 2. Hardwoods. |publisher=US Forest Service, Department of Agriculture (US Government Printing Office) |location=Washington, DC |year=1990 |isbn=978-0-16-029260-6 }}$10

Where the most complicated line is "*Arbutus menziesii[https://plants.usda.gov/home/plantProfile?symbol=Arme].Pacific madrone.124.132@Philip M. McDonald@1@ and John C., II Tappeiner", do:
\*(\w+ \w+)\[(.+?)\]\.(.+?)\.(\d+)\.(\d+)@(.+?)(\w+)(@1@ and .+?)?(\w+)?(\n)
*@1$1@2@3$3@4$4@5$5@6$6@7$7@8$8@9$9$10
then delete "@1@ and " throughout.

(\n)\[\[File\:(.+?)\|thumb.+?\n\[\[File\:(.+?)\|thumb.+?\n\[\[File\:(.+?)\|thumb.+?\n
$1*@2$2@3$3@4$4

\*@2(.+?)@3(.+?)@4(.+?)(\n)
|-$4|{{Multiple image |perrow=3 | total_width = 400px | image_style = border:none; | border = infobox$4| image1 =$1$4| alt1 =landscape$4| image2 =$2$4| alt2 =bark$4| image3 =$3$4| alt3 =foliage}}$4
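To check the "most complicated line" step outside the editor, here is a minimal Python sketch. It assumes the `$N` replacement syntax used in these notes can be mapped to Python's `\g<N>`; the helper names `to_python_template` and `split_fields` are made up for illustration and are not part of the original workflow.

```python
import re

def to_python_template(template: str) -> str:
    """Convert $N backreferences (as written in these notes) to \\g<N>."""
    return re.sub(r"\$(\d+)", r"\\g<\1>", template)

# Pattern and replacement copied from the notes above.
PATTERN = (r"\*(\w+ \w+)\[(.+?)\]\.(.+?)\.(\d+)\.(\d+)@"
           r"(.+?)(\w+)(@1@ and .+?)?(\w+)?(\n)")
TEMPLATE = "*@1$1@2@3$3@4$4@5$5@6$6@7$7@8$8@9$9$10"

def split_fields(text: str) -> str:
    """Mark each field with @N, then drop the "@1@ and " separator."""
    marked = re.sub(PATTERN, to_python_template(TEMPLATE), text)
    return marked.replace("@1@ and ", "")

line = ("*Arbutus menziesii"
        "[https://plants.usda.gov/home/plantProfile?symbol=Arme]"
        ".Pacific madrone.124.132@Philip M. McDonald"
        "@1@ and John C., II Tappeiner\n")
print(split_fields(line), end="")
```

The result is the @1...@9 form that the first pattern in this section expects. The URL in group 2 is dropped, as in the original replacement.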
List of US forest-inventory conifers
Start with a bulleted list of species in this format (5 means the USDA symbol is ACNI5; 46 is the page number in the 1991 inventory):
*Acer nigrum5.black maple.46

Create links to the USDA Plants Database (but remember this place in the article history; you'll need the version without the links, too):

This will add the 4-letter codes:
\*(\w\w)([a-z]+) (\w\w)([a-z]+)(\.)(.+?)(\n)
*$1$2 $3$4.$1$3$5$6$7
then, for (\.), substitute (\d\.), then (\d\d\.)

And this creates the urls:
\*(\w+ \w+)\.(\w+)\.
*$1[https://plants.usda.gov/home/plantProfile?symbol=$2].

Create a data table in this format: https://en.wikipedia.org/w/index.php?title=User:Dank/Sandbox/8&oldid=1224700887#temp4 (except: the "uses" column is a string of y and @, not y and n). "Uses" mirrors these categories from "Suitability/Use": Christmas Tree, Lumber, Naval Store, Nursery Stock, Post, Pulpwood, Veneer.

Create the real table:
(\n)\|\-\n\|(\w+?)\n\|([y@]+?)\n\|(\w+?)\n\|(\w+?)\n\|(\w+?)\n\|(\w+?)\n\|(\w+?)\n\|(.+?)\n\|(.+?)\n\|(\w+?)\n\|(\w+?)\n\|(\w+?)\n\|(.+?)\n\|(\w+? \w+?) (.+?) (\d+) (\d+) ([`a-zA-Z]+)
$1|-$1!scope="row" |''[[$15]]'' ()$1|Uses: $3@1$2 ''$15'']: Characteristics}}{{sfn|$19|1991|pp=$17–$18}}$1|No$1----$1$16$1----$1{{cvt|$4|ft}}; $5@1$2 ''$15'']: Characteristics}}$1|pH $9–$10$1{{cvt|$11|-|$12|in|cm}}<br/>$1{{cvt|$14|F}}@1$2 ''$15'']: Characteristics}}$1|D: $7<br/>F: $8<br/>L: $6<br/>S: $13<br/>@1$2 ''$15'']: Characteristics}}$1|

@1
{{sfn|National Plant Data Team|2023|loc=[https://plants.usda.gov/home/plantProfile?symbol=

S: intolerant
S:<br/>intolerant

[in case I forget] intermediate{ medium{

Add the common names. Rearrange the y-@ string to the proper order for: construction, landscaping, posts, pulpwood, terpenes, veneers, winter holiday decorations.
([y@])([y@])([y@])([y@])([y@])([y@])([y@])
$2$4$5$6$3$7$1
Or:
(marker)(.)(.)(.)(.)(.)(.)(.)
(marker)$2$4$5$6$3$7$1

(?<=\|Uses\: ......)(y)
, winter holiday decorations
(?<=\|Uses\: .....)(y)
, veneers
(?<=\|Uses\: ....)(y)
, terpenes
(?<=\|Uses\: ...)(y)
, pulpwood
(?<=\|Uses\: ..)(y)
, posts
(?<=\|Uses\: .)(y)
, landscaping
(?<=\|Uses\: )(y)
construction

Remove any leftover @
|Uses:_,_ -> |Uses:_
` -> |

https://en.wikipedia.org/w/index.php?title=User:Dank/Sandbox/8&oldid=1224733838#temp0

Alphabetize by last name, then add refs for single authors to reference section, from a table in that format:
\*(\w+ \w+) (\d+) (\d+) (\w+)\, (.+?)(\n)
*{{cite book |last1=$4 |first1=$5 |pages=$2–$3 |chapter=''$1'' | editor-last1=Burns | editor-first1=Russell M. | editor-last2=Honkala | editor-first2=Barbara H. | title=Silvics of North America, Volume 1. Conifers. | publisher=US Forest Service, Department of Agriculture (US Government Printing Office) | location=Washington, DC | year=1991 | isbn=978-0160292606 }}$6

Fill in the second column and add images. If desired, this can be added manually to the last column:
{{Multiple image |perrow=2 | total_width = 360px | image_style = border:none; | border = infobox | image1 = | alt1 =landscape | image2 = | alt2 =landscape | image3 = | alt3 =bark | image4 = | alt4 =cone and foliage }}
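The y-@ rearrangement and the lookbehind expansions can be rehearsed on a single cell before running them on the whole table. This is a sketch in Python, not the editor's regex tool: `expand_uses` and `LABELS` are illustrative names, backreferences are written `\N` instead of `$N`, and the `_` in "|Uses:_,_" is assumed to stand for a space.

```python
import re

# Labels in the rearranged order: construction, landscaping, posts,
# pulpwood, terpenes, veneers, winter holiday decorations.
LABELS = ["construction", ", landscaping", ", posts", ", pulpwood",
          ", terpenes", ", veneers", ", winter holiday decorations"]

def expand_uses(cell: str) -> str:
    # Reorder the seven flags from the "Suitability/Use" order
    # (Christmas Tree, Lumber, Naval Store, Nursery Stock, Post,
    # Pulpwood, Veneer) to the order listed in LABELS.
    cell = re.sub(r"([y@])([y@])([y@])([y@])([y@])([y@])([y@])",
                  r"\2\4\5\6\3\7\1", cell, count=1)
    # Expand from the last flag to the first, so the flags before the
    # one being expanded are still single characters.
    for offset in range(6, -1, -1):
        pattern = r"(?<=\|Uses: " + "." * offset + r")y"
        cell = re.sub(pattern, LABELS[offset], cell)
    cell = cell.replace("@", "")                  # remove any leftover @
    return cell.replace("|Uses: , ", "|Uses: ")   # drop a leading comma

print(expand_uses("|Uses: y@y@@y@"))
print(expand_uses("|Uses: @y@y@@@"))
```

Working from the seventh flag back to the first is what keeps the fixed-width lookbehinds valid: once a flag has been expanded into text, nothing to its left has changed length yet.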
List of Canadian forest-inventory conifers
Remove uppercase codes at end of each line
[A-Z ]+(\n)
$1

Do lines where the common name isn't two words by hand; add "/" at the end

For the remaining lines, remove all but first two and last two words of each line
(\w+ \w+ )(.+?)(\w+ \w+)(\n)
$1$3$4

Remove each /. Add * at the beginning of each line

Add links and italics:
\*(.+?) (\w+) (\w+)(\n)
*''[[$2 $3]]'',[https://commons.wikimedia.org/wiki/$2_$3] $1$4

Check POWO for synonyms and Commons for sufficient images. Check on maps.

Add:
==Key==
:Provinces: AB [[Alberta]], BC [[British Columbia]], MB [[Manitoba]], NB [[New Brunswick]], NL [[Newfoundland and Labrador]], NS [[Nova Scotia]], NT [[Northwest Territories]], NU [[Nunavut]], ON [[Ontario]], PE [[Prince Edward Island]], QC [[Quebec]], SK [[Saskatchewan]], YT [[Yukon]]
==Species==
{|class="sortable wikitable plainrowheaders"
|+{{sronly|Species}}
! scope="col" width="1%" |Species (or genus) and a [[common name]]{{sfn|CNFI|loc=Tree Species List}}{{sfn|POWO}}{{efn-la|The taxonomy (classification) comes from POWO.}}
! scope="col" class=unsortable width="15%" |Distribution in Canada{{sfn|Burns|Honkala|1991}}
! scope="col" class=unsortable width="30%" style="min-width:120px;" |Description and uses
! scope="col" class=unsortable width="10%" |Co-named North American [[forest#Types|forest types]]{{sfn|Burns|Honkala|1991}}
! scope="col" width="1%" |[[Family (biology)|Family]]{{sfn|POWO}}
! scope="col" class=unsortable width="1%" |Images
|-
|}

Create the table
\*''\[\[(\w+ \w+)\]\]''\,\[(.+?)\] (\w+ \w+)( \w+)?( \w+)?(\n)
|-$6!scope="row" |''[[$1]]'' ($3$4$5)$6|[[|thumb|100px|center|BC |alt=Species distribution in Canada]]$6|$6|$6|$6|{{Multiple image | width = 120px | image_style = border:none; | border = infobox$6| footer =$6| image1 =$6| alt1 =$6| image2 =$6| alt2=$6| image3 =$6| alt3=$6}}$6

At some point, fill in the "family" column, and (if necessary) add explanations to the Key.
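A quick way to sanity-check the "Add links and italics" step is to run it on one line. The sketch below is Python with `\g<N>` in place of `$N`; the sample line (a two-word common name followed by the binomial) is a hypothetical input in the shape that regex seems to expect, not a line taken from the actual list.

```python
import re

# Pattern and replacement from the "Add links and italics" step.
PATTERN = r"\*(.+?) (\w+) (\w+)(\n)"
REPLACEMENT = (r"*''[[\g<2> \g<3>]]'',"
               r"[https://commons.wikimedia.org/wiki/\g<2>_\g<3>] "
               r"\g<1>\g<4>")

def add_links(text: str) -> str:
    """Italicize and link the last two words; move the rest after the URL."""
    return re.sub(PATTERN, REPLACEMENT, text)

# Hypothetical input line: common name first, binomial last.
sample = "*balsam fir Abies balsamea\n"
print(add_links(sample), end="")
```

The output has the shape that the "Create the table" pattern matches: an italicized link, a comma, a bracketed Commons URL, then the common name.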
Add distribution maps

If necessary, add {{CSS image crop}}:
\)(\n)\|
)$1|{{CSS image crop$1|Image = $1|bSize = 120$1|cWidth = $1|cHeight = $1|oTop = $1|oLeft = $1|Location = center$1|Description = $1|Alt = Species distribution in Canada$1}}$1|

Do images; get heights approximately even by cropping. Do alt text. Add license info to the talk page.

Create list of parameters for {{cite book}}. Do regex on chapter pages and authors from Silvics in the form "@456-462 Silas Little and Peter W. Garrett":
\|@(\d+)\-(\d+) (.+?) (\w+)(\n)
|first1=$3 |last1=$4 |pages=$1–$2$5
\|@(\d+)\-(\d+) (.+?) (\w+) and (.+?) (\w+)(\n)
|first1=$3 |last1=$4 |first2=$5 |last2=$6 |pages=$1–$2$7
\|@(\d+)\-(\d+) (.+?) (\w+)\, (.+?) (\w+)\, and (.+?) (\w+)(\n)
|first1=$3 |last1=$4 |first2=$5 |last2=$6 |first3=$7 |last3=$8 |pages=$1–$2$9

Append chapter names:
"row" \|''\[\[(\w+ \w+)\]\]''(.+?)(\n)\|(.+?)\n\|(.+?)\n\|
"row" |[[$1]]$2$3|$4$3|$5 |chapter=''$1''$3|

Create a blank "References" section. Create a separate bulleted entry in References for each list of parameters, except: when the author(s) is/are the same, combine into one ref.

Convert this bulleted list into properly formatted {{cite book}} citations, swapping the "first1" and "last1" on each line:
''(\n)
| editor-last1=Burns | editor-first1=Russell M. | editor-last2=Honkala | editor-first2=Barbara H. | title=Silvics of North America, Volume 1. Conifers. | url=https://www.fs.usda.gov/research/treesearch/1547 | publisher=United States Government Printing Office (Department of Agriculture, Forest Service) | location=Washington, DC | year=1991 | isbn=978-0160292606 }}$1
\*\|first1=(.+?)\|last1=(.+?)\|
*{{cite book |last1=$2|first1=$1|

Alphabetize the list

Convert each list of parameters into an appropriate {{sfn}} citation:
(\n)\|first1=(.+?)\|last1=(.+?) \|first2=(.+?)\|last2=(.+?) \|first3=(.+?)\|last3=(.+?) \|pages=(\d+.\d+)
$1|{{sfn|$3|$5|$7|1991|pp=$8}}
(\n)\|first1=(.+?)\|last1=(.+?) \|first2=(.+?)\|last2=(.+?) \|pages=(\d+.\d+)
$1|{{sfn|$3|$5|1991|pp=$6}}
(\n)\|first1=(.+?)\|last1=(.+?) \|pages=(\d+.\d+)
$1|{{sfn|$3|1991|pp=$4}}

Remove "chapter=..." from these lines:
\|chapter(.+?)(\n)
$2

Add end-sections
...
If I'm using the PLANTS database, this will add the species name to each sfn:
"row" \|''\[\[(\w+ \w+)\]\](.+?)(\n)\|(.+?)\n\|(.+?)\|2023\|loc=
"row" |''[[$1]]$2$3|$4$3|$5|2023|loc=''$1'':
...
convert e.g. "*https:...symbol=PIST, loc=Fact Sheet,|first2=John |last2=Dickerson"
^\*(.+?)\, loc=(.+?)\,(.+?)(\n)
*$3 |url=$1$4

If necessary, to add Burns citations to the blank column:
\}\{\{(.+?)\|1991\|pp=(\d+.\d+)\}\}(\n)\|\n\|(\w+) family
}{{$1|1991|pp=$2}}$3|{{$1|1991|pp=$2}}$3|$4 family
[add ---- where needed]

Adding a cite after ----:
(\n)\|\{\{sfn\|National(.+?)cs\}\}\{\{(.+?)\}\}\n\|\{\{(.+?)\}\}\n\-\-\-\-\n
$1|{{sfn|National$2cs}}{{$3}}$1|{{$4}}$1----$1{{sfn|National$2cs}}
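The Silvics author step and the later {{sfn}} step can be tried end to end on the example line. This is a sketch in Python with `\g<N>` in place of `$N`; `PARAM_RULES`, `SFN_RULES` and `apply_rules` are illustrative names. One assumption is the order: the rules here run most-specific first, because the single-author pattern also matches multi-author lines (its lazy `(.+?)` swallows everything up to the final surname).

```python
import re

# Author/page lines -> {{cite book}} parameters, most specific first.
PARAM_RULES = [
    (r"\|@(\d+)\-(\d+) (.+?) (\w+)\, (.+?) (\w+)\, and (.+?) (\w+)(\n)",
     r"|first1=\g<3> |last1=\g<4> |first2=\g<5> |last2=\g<6> "
     r"|first3=\g<7> |last3=\g<8> |pages=\g<1>–\g<2>\g<9>"),
    (r"\|@(\d+)\-(\d+) (.+?) (\w+) and (.+?) (\w+)(\n)",
     r"|first1=\g<3> |last1=\g<4> |first2=\g<5> |last2=\g<6> "
     r"|pages=\g<1>–\g<2>\g<7>"),
    (r"\|@(\d+)\-(\d+) (.+?) (\w+)(\n)",
     r"|first1=\g<3> |last1=\g<4> |pages=\g<1>–\g<2>\g<5>"),
]

# Parameter lines -> {{sfn}} citations, again most specific first.
SFN_RULES = [
    (r"(\n)\|first1=(.+?)\|last1=(.+?) \|first2=(.+?)\|last2=(.+?) "
     r"\|first3=(.+?)\|last3=(.+?) \|pages=(\d+.\d+)",
     r"\g<1>|{{sfn|\g<3>|\g<5>|\g<7>|1991|pp=\g<8>}}"),
    (r"(\n)\|first1=(.+?)\|last1=(.+?) \|first2=(.+?)\|last2=(.+?) "
     r"\|pages=(\d+.\d+)",
     r"\g<1>|{{sfn|\g<3>|\g<5>|1991|pp=\g<6>}}"),
    (r"(\n)\|first1=(.+?)\|last1=(.+?) \|pages=(\d+.\d+)",
     r"\g<1>|{{sfn|\g<3>|1991|pp=\g<4>}}"),
]

def apply_rules(text: str, rules) -> str:
    """Apply each (pattern, template) pair in turn to the whole text."""
    for pattern, template in rules:
        text = re.sub(pattern, template, text)
    return text

# The leading newline is needed because the sfn patterns start with (\n).
cell = "\n|@456-462 Silas Little and Peter W. Garrett\n"
params = apply_rules(cell, PARAM_RULES)
print(params, end="")
print(apply_rules(params, SFN_RULES), end="")
```

Running the single-author pattern first on the same line would put "Silas Little and Peter W." into first1, so the ordering is worth keeping in mind when doing these by hand.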
Plant family tables
Adding hair space before cites in 3rd col:
!scope="row"(.+?)(\n)\|(.+?)\n\|(.+?)\{\{
!scope="row"$1$2|$3$2|$4&hair;{{

Condensing Template:Multiple image
image\n\| width = 120px\n\| image_style = border\:none\;\n\| border
image | width = 120px | image_style = border:none; | border

Moving Christenhusz cites to the first column:
family\)(\n)\| \[\[(.+?)\{\{sfn\|Chr(.+?)\}\}\n
family){{sfn|Chr$3}}$1| [[$2$1

Fix VE soft-hyphen bug:
#xAD
shy

Find and mark Lamiales, etc. lines in [[List of plant family names with etymologies]]:
|| [[Lamiales]]
|| [[Lamiales]]zz

Remove anchors:
\{\{anchor\|\w+\}\}

Remove data-sort:
(\n)\|data\-sort(.+?)\|
$1|

Remove non-marked lines:
!scope="row" \|(.+?)\n\|(.+?)\n\|(.+?)\]\]\n\|(.+?)\n\|\-\n
(then remove the zz markers)

Consider whether "Gl" is needed as a source

Automated copyediting of the etym column:
Note the space:
\|\| P(\n)\|
||$1|, for
\|\| \w(\n)\|(.+?)plant name \|\|
||$1|, from a $2plant name ||
\|\| G(\n)\|(.+?) \|\|
||$1|, from Greek for "$2" ||
Then finish copyediting by hand, and blank the LG column

change two widths to 15%; add soft hyphens

fix Chr cites:
{{sfn|Christenhusz|p
{{sfn|Christenhusz|Fay|Chase|2017|p

move final Chr cites to the Orders column:
ales\]\](\n)\| (\w\w)?\{\{sfn\|Chr(.+?)\n\|\-
ales]]{{sfn|Chr$3$1| $2$1|-

remove final spaces

put total Chr page range in column header for "Family"

add vernacular names; check spacing after running this:
"row" \|\[\[(.+?)\]\] (.+?)(\n)
"row" |[[$1]] ($2 family)$3

add end-sections and etym. sfns, using for instance:
(\n)\| CS(\d+) (\d+)\n\|\-
$1| {{sfn|Stearn|2002|p=$3}}{{sfn|Coombes|2012|p=$2}}$1|-
or:
(\n)\| Bu(.+?)\n\|\-
$1| {{sfn|Burkhardt|2018|p=$2}}$1|-

combine 3 columns into the etym column, first for the Chr-only etym rows:
family\)(\n)\|''\[\[(.+?)\]\]'' \|\|\n\|(.+?) ?\|\|(.+?)\{\{sfn\|Chr(.+?)\}\}\n\|\n\|\-
family)$1|''[[$2]]''$3{{sfn|Chr$5}}$1|$1|$1|$4{{sfn|Chr$5}}$1|$1|-
and then for the others:
family\)(\n)\|''\[\[(.+?)\]\]'' \|\|\n\|(.+?) ?\|\|(.+?)\{\{sfn\|Chr(.+?)\}\}\n\| \{(.+?)\n\|\-
family)$1|''[[$2]]''$3{$6$1|$1|$1|$4{{sfn|Chr$5}}$1|$1|-

add "synonym" language. Do POWO cites for synonyms.

add table headers (with proper page range for Chr). Add total genera and single-letter code for POWO database (if any).

if total genera=1, add "genus" and move the single-letter code right one column:
(\n)\|1([a-z])\n\|
$1|1 genus, $1|$2
otherwise, add "genera" and move the single-letter code right one column:
(\n)\|(\d+)([a-z])\n\|
$1|$2 genera, $1|$3
Do the leftovers:
(\n)\|1\n
$1|1 genus, $1
(\n)\|(\d+)\n
$1|$2 genera, $1

Do the cites for the POWO databases:
(\n)\|e\n
$1|{{sfn|POWO|loc=Flora of Tropical East Africa}}$1
(\n)\|n\n
$1|{{sfn|POWO|loc=Neotropikey}}$1
(\n)\|s\n
$1|{{sfn|POWO|loc=Flora of Somalia}}$1
(\n)\|t\n
$1|{{sfn|POWO|loc=Trees of New Guinea}}$1
(\n)\|w\n
$1|{{sfn|POWO|loc=Flora of West Tropical Africa}}$1
(\n)\|z\n
$1|{{sfn|POWO|loc=Flora of Zambesiaca}}$1

Add ipni cite to second col (after "# genera" has been added in 3rd col):
"row" \|\[\[(\w+)(.+?)(\n)\|(.+?)\n\|(\d+) genera
"row" |[[$1$2$3|$4{{sfn|IPNI|loc=[ $1, Type]}}$3|$5 genera

Adding POWO cite to genera:
"row" \|\[\[(\w+)(.+?)(\n)\|(.+?)\n\|(.+?)\,
"row" |[[$1$2$3|$4$3|$5,{{sfn|POWO|loc=$1}}

Copy Chr cites into the Description & Uses column, after the POWO cites (but check the first few):
(\n)\|(.+?)ales\]\]\{\{sfn\|Chr(.+?)\}\}\n\|\n\|\-
{{sfn|Chr$3}}$1|$2ales]]{{sfn|Chr$3}}$1|$1|-

Do IPNI and USDA cites

Use FGVP (whatever is available) for the distribution column, then copy Chr cites to the remaining rows:
[This works when there's something after the first citation in that column in each row]
ceae\}\}(\n)\|(.+?)\n\|(.+?)ales\]\]\{\{sfn\|Chr(.+?)\}\}\n\|\n
ceae}} in{{sfn|Chr$4}}$1|$2$1|$3ales]]{{sfn|Chr$4}}$1|$1

Do either this or the next seven:
At some point, add image code (and add alt text later):
\|(\n)\|\-\n
|{{Multiple image |width=120px |image_style=border:none; |border=infobox$1| footer = ''[[]]''$1| image1 =$1| image2 = }}$1|-$1

(For John)
=(\w+)ceae\}\}(.+?)(\n)\|(.+?)\n\|(.+?)\n\|\n\|\-
=$1ceae}}$2$3|$4$3|$5$3|''[[c:Category:$1ceae]]''$3|-
(if necessary)
\|(\w+ \w+\W?)(\n)\|\-
|''[[c:Category:$1]]''$2|-

adding alt parameters:
(\n)\|(.+?)\n\}\}\n\|\-
| alt1="flowers"$1|$2 | alt2="foliage"$1}}$1|-

Removing "thumb" etc. from John's raw image lists:
(\.jpg)\|(.+?)\]\](\n)
$1]]$3

Adding * and colon:
[[F
*[[:F

convert raw list of images to table:
(\n)(.+?)\n\*\[\[\:File\:(.+?)\]\]\n\*\[\[\:File\:(.+?)\]\]\n
|{{Multiple image |width=120px |image_style=border:none; |border=infobox$1| footer = ''[[$2]]''$1| image1 = $3 | alt1= "flowers"$1| image2 = $4 | alt2="foliage"}}$1|-$1
(\n)(.+?)\n\*\[\[\:File\:(.+?)\]\]\n
|{{Multiple image |width=120px |image_style=border:none; |border=infobox$1| footer = ''[[$2]]''$1| image1 = $3 | alt1= "flowers"}}$1|-$1

removing (...) in last col:
\((.+?)\)(\n)\|\-
$2|-

This might be needed after copying an image column:
\}\}(\n)\n\|\-
}}$1|-

Replacing @ for cites (example):
@(\d+)\-(\d+)
{{sfn|Kubitzki et al 1993|pp=$1–$2}}
@(\d+)
{{sfn|Kubitzki et al 1993|p=$1}}

double-hyphen to dash
(\d+)\-\-(\d+)
$1–$2

Moving POWO cites to the end of the cell:
(1 genus|\d genera)\,\{\{sfn\|POWO\|loc=(\w+)\}\}(.+?)(\n)
$1,$3{{sfn|POWO|loc=$2}}$4

Remove Chr cites from Orders, or add them to the first column, as needed
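The "@ for cites" and double-hyphen replacements are easy to rehearse on a scrap of text. A minimal Python sketch, with `\g<N>` standing in for `$N`; the function name `expand_cites` and the sample strings are invented for illustration. The page-range pattern runs before the single-page one, since `@(\d+)` would otherwise consume the first number of a range.

```python
import re

def expand_cites(text: str) -> str:
    # Page ranges first: "@123-125" -> {{sfn|...|pp=123–125}}
    text = re.sub(r"@(\d+)\-(\d+)",
                  r"{{sfn|Kubitzki et al 1993|pp=\g<1>–\g<2>}}", text)
    # Then single pages: "@130" -> {{sfn|...|p=130}}
    text = re.sub(r"@(\d+)",
                  r"{{sfn|Kubitzki et al 1993|p=\g<1>}}", text)
    # Finally, any remaining double hyphen between numbers becomes a dash.
    return re.sub(r"(\d+)\-\-(\d+)", r"\g<1>–\g<2>", text)

# Hypothetical cell text, for illustration only.
print(expand_cites("Herbs.@123-125 Shrubs.@130"))
print(expand_cites("pp=10--12"))
```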
TFA
(\W\W)\|\| (\w+)\|\| (\d{4})
$1|| [[User:$2|$2]] || $3