User:Monkbot/task 12: London Gazette deprecated parameters

Task 12 is a single-use task that removes and replaces certain deprecated parameters in existing {{London Gazette}} templates. The original {{London Gazette}} has been converted to a wrapper template, first around {{cite news}} and later around {{cite magazine}}. To make the template both stylistically and functionally similar to the common cs1|2 templates, some of the original parameters have been supplemented with their equivalent cs1|2 parameters. Task 12 revises existing instances of the template to use the new parameters.

description

Task 12 removes and replaces these parameters:

  1. |ps= with |postscript= – included mostly for completeness. |ps= has been deprecated for a long time, so few if any instances are likely to remain.
  2. |separator= with |mode= – cs1|2 templates no longer support |separator=; its functionality was replaced by |mode=. Task 12 replaces |separator=, (comma) with |mode=cs2. Because the underlying {{cite magazine}} is a cs1 template, |separator=. (period) is simply removed, since |mode=cs1 would be redundant.
  3. |startpage= and |endpage= with |pages= – The old template used the value assigned to |startpage= in the URL that linked to that page on the London Gazette website. For display, the |startpage= value was then joined with an en dash to the value assigned to |endpage= to create the pagination rendered for the reader. The new version of the template can extract the first page number from a range of pages in |pages=xx–yy, so neither |startpage= nor |endpage= is required.
  4. |startpage= (without |endpage=) with |page= – for consistency with item 3 above and with common cs1|2 usage.
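
Items 3 and 4 amount to a small mapping from the old pair of values to one new parameter. The C# sketch below illustrates that mapping and, separately, how a first page number might be taken from a |pages= range. The class and method names are hypothetical, and the range-splitting logic is an assumption made for illustration; it is not the template's actual implementation.

```csharp
using System;

public static class GazettePagesSketch
{
    // Hypothetical helper: builds the replacement parameter from the old values.
    // A missing or blank endPage corresponds to item 4 (|startpage= alone).
    public static string ToPageParameter(string startPage, string endPage)
    {
        if (string.IsNullOrWhiteSpace(endPage))
            return "|page=" + startPage.Trim();

        // Item 3: join the two values with an en dash (U+2013).
        return "|pages=" + startPage.Trim() + "\u2013" + endPage.Trim();
    }

    // Hypothetical helper: returns the text before the first en dash or hyphen,
    // or the whole value when no range separator is present.
    public static string FirstPage(string pages)
    {
        int cut = pages.IndexOfAny(new[] { '\u2013', '-' });
        return (cut < 0 ? pages : pages.Substring(0, cut)).Trim();
    }
}
```

For example, `GazettePagesSketch.ToPageParameter("1234", "1236")` yields `|pages=1234–1236`, and `GazettePagesSketch.FirstPage("1234–1236")` yields `1234`.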

Task 12, like all Monkbot tasks, does not apply general fixes.

Task 12 skips pages that include {{bots|deny=Monkbot12}}.
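
As a rough illustration of that exclusion check, the C# sketch below tests whether a page's wikitext carries a {{bots|deny=...}} list that names the bot. The class and method names are hypothetical, the case-insensitive name comparison is an assumption, and other exclusion forms are deliberately not handled; how the bot actually performs this check is not shown in the script below.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BotExclusionSketch
{
    // Returns true when articleText contains {{bots|deny=...}} and the
    // comma-separated deny list names botName. Assumption: names are compared
    // case-insensitively; no other exclusion syntax is recognised here.
    public static bool IsDenied(string articleText, string botName)
    {
        Match m = Regex.Match(
            articleText,
            @"\{\{\s*bots\s*\|[^\}]*?deny\s*=\s*([^\|\}]*)",
            RegexOptions.IgnoreCase);

        if (!m.Success)
            return false;

        foreach (string name in m.Groups[1].Value.Split(','))
        {
            if (string.Equals(name.Trim(), botName, StringComparison.OrdinalIgnoreCase))
                return true;
        }
        return false;
    }
}
```

A caller would skip the page when `BotExclusionSketch.IsDenied(ArticleText, "Monkbot12")` returns true.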

ancillary tasks

A handful of other edits are made to {{London Gazette}}. These other edits are made only if one or more of the deprecated-parameter removals or replacements described above are also made. These edits are:

  1. removes |accessdate= and |access-date= – These parameters are intended to mark a point in time when the content of an ephemeral web page supported text in a Wikipedia article. The archived London Gazette facsimile is not ephemeral even though the wrapping HTML page may be. If an editor at Wikipedia is citing the content of the wrapping HTML page, then {{London Gazette}} is not the correct template and the editor would be better served by using {{cite web}}.
  2. removes empty parameters – Empty parameters clutter wikitext and contribute nothing to how the template is rendered. There is a single exception:
    1. replaces empty |ref= with |ref=none – disables automatic CITEREF generation in a manner consistent with cs1|2.
  3. standardizes redirects {{LondonGazette}}, {{Londongazette}}, and {{London gazette}} to {{London Gazette}}
  4. standardizes |supp=yes to |supp=y – included mostly for completeness; most, if not all, of these parameters have already been standardized.
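
The combined effect of these ancillary edits on a single template can be sketched in C# as follows. The regular expressions are adapted from the script in the next section, but the class name, the method name, and the simplified template-name pattern are assumptions made for this illustration, and the sketch omits the check that a deprecated parameter was replaced first.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GazetteAncillarySketch
{
    // Simplified template-name pattern (assumption: London Gazette spellings only).
    const string Name = @"[Ll]ondon\s*[Gg]azette";

    public static string Apply(string text)
    {
        // 1. remove |accessdate= / |access-date= together with its value
        text = Regex.Replace(text,
            @"(\{\{\s*" + Name + @"[^\}]*)\|\s*access\-?date\s*=[^\|\}]*([\|\}])", "$1$2");

        // 2. the one exception to empty-parameter removal: empty |ref= becomes |ref=none
        text = Regex.Replace(text,
            @"(\{\{\s*" + Name + @"[^\}]*\|\s*ref\s*=\s*)([\|\}])", "$1none$2");

        // 4. |supp=yes becomes |supp=y
        text = Regex.Replace(text,
            @"(\{\{\s*" + Name + @"[^\}]*\|\s*supp\s*=\s*)[yY][eE][sS](\s*[\|\}])", "$1y$2");

        // 3. redirect names become the canonical {{London Gazette
        text = Regex.Replace(text, @"(\{\{\s*)" + Name, "$1London Gazette");

        return text;
    }
}
```

Given `{{LondonGazette|issue=12345|supp=yes|page=7|access-date=1 May 2016|ref=}}`, the sketch returns `{{London Gazette|issue=12345|supp=y|page=7|ref=none}}`.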

script

// this script replaces deprecated {{London Gazette}} parameters:
//	|ps= with |postscript=
//	|separator= with |mode=
//	|startpage= (without |endpage=) with |page=
//	|startpage= and |endpage= with |pages=startpage–endpage

// other tasks when other changes have been made:
//	removes |accessdate= and |access-date=
//	removes empty parameters
//	changes empty |ref= to |ref=none
//	standardizes redirects {{LondonGazette |...}}, {{Londongazette |...}}, {{London gazette |...}} to {{London Gazette |...}}
//	standardizes single letter |supp= values to y
//	standardizes |supp=yes values to y


public string ProcessArticle(string ArticleText, string ArticleTitle, int wikiNamespace, out string Summary, out bool Skip)
	{
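	Skip = true;		// skip the page unless a deprecated parameter is replaced below (each replacement sets Skip = false)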
//	Summary = "[[User:Monkbot/task_12:_London_Gazette_deprecated_parameters|Task 12]]: (developmental testing): London Gazette templates: replace deprecated parameters; remove empty parameters; remove |accessdate=;";
//	Summary = "[[User:Monkbot/task_12:_London_Gazette_deprecated_parameters|Task 12]]: ([[Wikipedia:Bots/Requests_for_approval/Monkbot_12|BRFA testing]]): London Gazette templates: replace deprecated parameters; remove empty parameters; remove |accessdate=;";
	Summary = "[[User:Monkbot/task_12:_London_Gazette_deprecated_parameters|Task 12]]: London Gazette templates: replace deprecated parameters; remove empty parameters; remove |accessdate=;";
	String pattern;
	
	String IS_GAZETTE = @"(?:[Ll]ondon\s*[Gg]azette|(?:Edinburgh|Belfast|Oxford)\s*Gazette)";
	
//---------------------------< H I D E 1 >--------------------------------------------------------------------
// HIDE TEMPLATES: find templates that are not London Gazette; replace the opening {{ with __0P3N__ and the closing }} with __CL0S3__

	while (Regex.Match (ArticleText, @"\{\{(?!\s*" + IS_GAZETTE + @")([^\{\}]*)\}\}").Success)
		{
		ArticleText = Regex.Replace(ArticleText, @"\{\{((?!\s*" + IS_GAZETTE + @")([^\{\}]*))\}\}", "__0P3N__$1__CL0S3__");
		}


//---------------------------< E M P T Y   P A R A M E T E R S >----------------------------------------------
// remove empty parameters

// EMPTY CITY: Remove empty |city= parameters.
	ArticleText = Regex.Replace(ArticleText, @"(\{\{\s*" + IS_GAZETTE + @"[^\}]*)\|\s*city\s*=\s*([\|\}])", "$1$2");

// EMPTY DISPLAY-SUPP: Remove empty |display-supp= parameters.
	ArticleText = Regex.Replace(ArticleText, @"(\{\{\s*" + IS_GAZETTE + @"[^\}]*)\|\s*display\-supp\s*=\s*([\|\}])", "$1$2");

// EMPTY ENDPAGE: Remove empty |endpage= parameters (so we don't confuse later replacements).
	ArticleText = Regex.Replace(ArticleText, @"(\{\{\s*" + IS_GAZETTE + @"[^\}]*)\|\s*endpage\s*=\s*([\|\}])", "$1$2");

// EMPTY MODE: Remove empty |mode= parameters.
	ArticleText = Regex.Replace(ArticleText, @"(\{\{\s*" + IS_GAZETTE + @"[^\}]*)\|\s*mode\s*=\s*([\|\}])", "$1$2");

// EMPTY NOLINK: Remove empty |nolink= parameters.
	ArticleText = Regex.Replace(ArticleText, @"(\{\{\s*" + IS_GAZETTE + @"[^\}]*)\|\s*nolink\s*=\s*([\|\}])", "$1$2");

// EMPTY PAGE: Remove empty |page= parameters.
	ArticleText = Regex.Replace(ArticleText, @"(\{\{\s*" + IS_GAZETTE + @"[^\}]*)\|\s*page\s*=\s*([\|\}])", "$1$2");

// EMPTY PAGES: Remove empty |pages= parameters.
	ArticleText = Regex.Replace(ArticleText, @"(\{\{\s*" + IS_GAZETTE + @"[^\}]*)\|\s*pages\s*=\s*([\|\}])", "$1$2");

// EMPTY PS: Remove empty |ps= parameters.
	ArticleText = Regex.Replace(ArticleText, @"(\{\{\s*" + IS_GAZETTE + @"[^\}]*)\|\s*ps\s*=\s*([\|\}])", "$1$2");

// EMPTY POSTSCRIPT: Remove empty |postscript= parameters.
	ArticleText = Regex.Replace(ArticleText, @"(\{\{\s*" + IS_GAZETTE + @"[^\}]*)\|\s*postscript\s*=\s*([\|\}])", "$1$2");

// EMPTY QUOTE: Remove empty |quote= parameters.
	ArticleText = Regex.Replace(ArticleText, @"(\{\{\s*" + IS_GAZETTE + @"[^\}]*)\|\s*quote\s*=\s*([\|\}])", "$1$2");

// EMPTY REF: Special case. Change empty |ref= parameters to |ref=none.
	ArticleText = Regex.Replace(ArticleText, @"(\{\{\s*" + IS_GAZETTE + @"[^\}]*\|\s*ref\s*=\s*)([\|\}])", "$1none$2");

// EMPTY SEPARATOR: Remove empty |separator= parameters.
	ArticleText = Regex.Replace(ArticleText, @"(\{\{\s*" + IS_GAZETTE + @"[^\}]*)\|\s*separator\s*=\s*([\|\}])", "$1$2");

// EMPTY STARTPAGE: Remove empty |startpage= parameters.
	ArticleText = Regex.Replace(ArticleText, @"(\{\{\s*" + IS_GAZETTE + @"[^\}]*)\|\s*startpage\s*=\s*([\|\}])", "$1$2");

// EMPTY SUPP: Remove empty |supp= parameters.
	ArticleText = Regex.Replace(ArticleText, @"(\{\{\s*" + IS_GAZETTE + @"[^\}]*)\|\s*supp\s*=\s*([\|\}])", "$1$2");


//---------------------------< H I D E 2 >--------------------------------------------------------------------

// HIDE COMPLEX WIKILINKS: find complex wikilinks and replace the brackets with __WL_0P3N__ and __WL_CL053__ and replace the pipe with __P1P3__

	pattern = @"\[\[([^\|\]]+)\|([^\]\|]+)\]\]";
	while (Regex.Match (ArticleText, pattern).Success)
		{
		ArticleText = Regex.Replace(ArticleText, pattern, "__WL_0P3N__$1__P1P3__$2__WL_CL053__");
		}

// HIDE WIKILINKS: find wikilinks and replace the brackets with __WL_0P3N__ and __WL_CL053__

	pattern = @"\[\[([^\|\]]+)\]\]";
	while (Regex.Match (ArticleText, pattern).Success)
		{
		ArticleText = Regex.Replace(ArticleText, pattern, "__WL_0P3N__$1__WL_CL053__");
		}


//---------------------------< P S >--------------------------------------------------------------------------
//
// |ps= to |postscript=
//

	pattern = @"(\{\{\s*" + IS_GAZETTE + @"[^\}]*\|\s*)ps(\s*=)";		// require the '=' so that only the |ps= parameter name matches
	while (Regex.Match (ArticleText, pattern).Success)
		{
		ArticleText = Regex.Replace(ArticleText, pattern, "$1postscript$2");
		Skip = false;
		}


//---------------------------< S E P A R A T O R >------------------------------------------------------------
//
// remove |separator=.
// |separator=, to |mode=cs2
//

// Remove |separator=. because {{London Gazette}} uses a cs1 template so |mode=cs1 is superfluous
	ArticleText = Regex.Replace(ArticleText, @"(\{\{\s*" + IS_GAZETTE + @"[^\}]*)\|\s*separator\s*=\s*\.\s*([\|\}])", "$1$2");

// replace |separator=, with |mode=cs2
	pattern = @"(\{\{\s*" + IS_GAZETTE + @"[^\}]*\|\s*)separator\s*=\s*,";
	while (Regex.Match (ArticleText, pattern).Success)
		{
		ArticleText = Regex.Replace(ArticleText, pattern, "$1mode=cs2");
		Skip = false;
		}


//---------------------------< S T A R T P A G E >------------------------------------------------------------
//
// {{{2|}}} (unnamed, parameter 2) to |startpage= then
// |startpage= without |endpage= to |page=
// |startpage= with |endpage= to |pages=
//

// replace {{{2|}}} with |startpage=
	pattern = @"(\{\{\s*" + IS_GAZETTE + @"[^\}]*\|\s*\d+[^\}]*)(\|\s*)(\d+)\s*([\|\}])";
	while (Regex.Match (ArticleText, pattern).Success)
		{
		ArticleText = Regex.Replace(ArticleText, pattern, "$1$2startpage=$3$4");
		Skip = false;
		}

// replace |startpage= and |endpage= with |pages=; |startpage= occurs first
	pattern = @"(\{\{\s*" + IS_GAZETTE + @"[^\}]*\|)\s*startpage\s*=\s*(\w+)(\s*[^\}]*)\|\s*endpage\s*=\s*(\w+)";
	while (Regex.Match (ArticleText, pattern).Success)
		{
		ArticleText = Regex.Replace(ArticleText, pattern, "$1pages=$2–$4$3");
		Skip = false;
		}

// replace |startpage= and |endpage= with |pages=; |endpage= occurs first
	pattern = @"(\{\{\s*" + IS_GAZETTE + @"[^\}]*\|)\s*endpage\s*=\s*(\w+)(\s*[^\}]*)\|\s*startpage\s*=\s*(\w+)";
	while (Regex.Match (ArticleText, pattern).Success)
		{
		ArticleText = Regex.Replace(ArticleText, pattern, "$1pages=$4–$2$3");
		Skip = false;
		}

// replace |startpage= with |page=
	pattern = @"(\{\{\s*" + IS_GAZETTE + @"[^\}]*\|)\s*startpage\s*=\s*(\w+)";
	while (Regex.Match (ArticleText, pattern).Success)
		{
		ArticleText = Regex.Replace(ArticleText, pattern, "$1page=$2");
		Skip = false;
		}


//---------------------------< A C C E S S   D A T E S >------------------------------------------------------

// Remove |accessdate= and |access-date= because {{London Gazette}} cites an archive and not an ephemeral website
	ArticleText = Regex.Replace(ArticleText, @"(\{\{\s*" + IS_GAZETTE + @"[^\}]*)\|\s*access\-?date\s*=[^\|\}]*([\|\}])", "$1$2");


//---------------------------< S T A N D A R D I Z E >--------------------------------------------------------

// standardize {{LondonGazette}}, {{Londongazette}}, and {{London gazette}} to {{London Gazette}}
	ArticleText = Regex.Replace(ArticleText, @"(\{\{\s*)[Ll]ondon\s*[Gg]azette", "$1London Gazette");


//---------------------------< S U P P  >---------------------------------------------------------------------

// Standardize |supp=yes (any letter case) and |supp=<any single letter except n or N> to |supp=y
	ArticleText = Regex.Replace(ArticleText, @"(\{\{\s*" + IS_GAZETTE + @"[^\}]*\|\s*supp\s*=\s*)(?:[yY][eE][sS]|[A-MO-Za-mo-z])(\s*[\|\}])", "$1y$2");


//---------------------------< U N H I D E >------------------------------------------------------------------
// UNHIDE: replace __WL_0P3N__ with [[
	ArticleText = Regex.Replace(ArticleText, @"__WL_0P3N__", "[[");

// UNHIDE: replace __P1P3__ with |
	ArticleText = Regex.Replace(ArticleText, @"__P1P3__", "|");

// UNHIDE: replace __WL_CL053__ with ]]
	ArticleText = Regex.Replace(ArticleText, @"__WL_CL053__", "]]");

// UNHIDE: replace __0P3N__ with {{
	ArticleText = Regex.Replace(ArticleText, @"__0P3N__", "{{");

// UNHIDE: replace __CL0S3__ with }}
	ArticleText = Regex.Replace(ArticleText, @"__CL0S3__", "}}");

	return ArticleText;
	}
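
The hide, edit, unhide pattern used throughout the script can be tried in isolation with the C# sketch below. It hides non-gazette templates, applies only the |startpage= / |endpage= conversion (the |startpage=-first case), and then restores the hidden braces. The class name, method name, and simplified template-name pattern are assumptions made for this illustration; it is not a replacement for the module above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HideUnhideSketch
{
    // Simplified template-name pattern (assumption: London Gazette spellings only).
    const string Name = @"[Ll]ondon\s*[Gg]azette";

    public static string ConvertStartEnd(string text)
    {
        // HIDE: mask brace-free non-gazette templates, innermost first, until none
        // remain, so that [^\}]* can span a whole gazette template in the next step.
        string hide = @"\{\{((?!\s*" + Name + @")[^\{\}]*)\}\}";
        while (Regex.IsMatch(text, hide))
            text = Regex.Replace(text, hide, "__0P3N__$1__CL0S3__");

        // EDIT: |startpage=a ... |endpage=b becomes |pages=a–b (en dash, U+2013).
        text = Regex.Replace(text,
            @"(\{\{\s*" + Name + @"[^\}]*\|)\s*startpage\s*=\s*(\w+)(\s*[^\}]*)\|\s*endpage\s*=\s*(\w+)",
            "$1pages=$2\u2013$4$3");

        // UNHIDE: restore the masked braces.
        return text.Replace("__0P3N__", "{{").Replace("__CL0S3__", "}}");
    }
}
```

Calling `HideUnhideSketch.ConvertStartEnd` on `{{London Gazette|issue=12345|startpage=100|date={{date|1 May 1916}}|endpage=102}} and {{cite web|url=x}}` returns `{{London Gazette|issue=12345|pages=100–102|date={{date|1 May 1916}}}} and {{cite web|url=x}}`. Without the hide step, the closing braces of the nested {{date}} template would stop `[^\}]*` before it reached |endpage=.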