Module talk:String: Difference between revisions
{{ComparePages|970815276|975588539|Making a note}} of the test for enabling str._match() to return all matches at once (done a year ago by [[User:Andrybak]]) before I commandeer the sandbox. --[[User:Wqnvlz|wqnvlz]] ([[User talk:Wqnvlz|talk]] '''·''' [[Special:Contributions/Wqnvlz|contribs]]); 21:14, 4 April 2022 (UTC)
== Use circular buffer in str._match() reverse mode ==
{{tq|Surely we can do better than finding every match in the source string and returning just the last one!}}, I thought. But there really is no way to search backwards; we can't even reverse both strings because of Unicode. How vexing! A bit of memory can be saved, though, by keeping only the last <code>match_index</code> matches and returning the oldest kept match, wrapping the index so no elements need to be shifted. {{ComparePages|970815276|1081406224|I've done this in the sandbox}} and have added some more reverse mode cases to [[Module talk:String/testcases]]. --[[User:Wqnvlz|wqnvlz]] ([[User talk:Wqnvlz|talk]] '''·''' [[Special:Contributions/Wqnvlz|contribs]]); 08:00, 7 April 2022 (UTC)
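The buffer idea described above can be sketched in plain Lua. This is only an illustration under stated assumptions, not the sandbox code: the function name kth_match_from_end and its parameter k (standing in for <code>match_index</code>) are hypothetical, and string.gmatch is used instead of the module's Unicode-aware matching.

```lua
-- Hypothetical helper: return the k-th match counting from the end of s,
-- keeping only the last k matches in a fixed-size circular buffer.
local function kth_match_from_end(s, pattern, k)
    local buffer = {}   -- never holds more than k matches
    local count = 0     -- total number of matches seen so far
    for m in string.gmatch(s, pattern) do
        -- Wrap the index so the oldest kept match is overwritten in place;
        -- no elements are shifted.
        buffer[count % k + 1] = m
        count = count + 1
    end
    if count < k then
        return nil      -- fewer than k matches exist
    end
    -- The oldest kept match sits in the slot that would be overwritten next.
    return buffer[count % k + 1]
end
```

Every match is still visited once, so only memory is saved: the buffer never grows beyond k entries.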
Revision as of 08:00, 7 April 2022
This module was considered for merging with Module:HTMLDecode on 2020 May 8. The result of the discussion was "no consensus".
Text and/or other creative content from this version of Module:String was copied or moved into incubator:Module:Wp/nod/String with this edit. The former page's history now serves to provide attribution for that content in the latter page, and it must not be deleted as long as the latter page exists.
See also
sub
Why does sub only return a single character? It returns characters in the string from position "i" to position "i" (only a single character). Shouldn't it go from "i" to "j" like the Lua support page suggests? Banaticus (talk) 00:04, 21 February 2013 (UTC)
- Because I made a stupid typo. Already fixed by Tim Starling. Anomie⚔ 00:53, 21 February 2013 (UTC)
error_category
The documentation/comment in the top says: error_category: The default category is ... [Category:Errors reported by Module String]. The category has regular double brackets I assume, or is there an exception in play? -DePiep (talk) 02:44, 27 February 2013 (UTC)
- Yeah, the issue is that double bracket gets interpreted as open / close comment in Lua, so you can't really write that in the middle of the documentation without messing things up. Dragons flight (talk) 03:44, 27 February 2013 (UTC)
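A small sketch of the bracket issue in plain Lua. Long comments opened with a "level" (an equals sign between the brackets) only close at the matching level, which is one way to keep double brackets inside a comment; the variable name default_category is hypothetical and is not taken from the module.

```lua
--[=[
  A level-1 long comment ends only at the matching level-1 close, so text
  such as [[Category:Errors reported by Module String]] can sit inside it.
  In a plain long comment (level 0), the first double closing bracket
  would end the comment early.
]=]

-- Inside a quoted string the double brackets are ordinary characters.
local default_category = "[[Category:Errors reported by Module String]]"
```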
Error category: two arguments for one?
To set (overrule) the error cat, one can use two arguments: error_category=...
and no_category=true
. Why is that not one single argument: just enter error_category=<blank>
could withhold the category adding. As it is now, there even is the futile situation: error_category=[[MyCategory]]
and no_category=true
. -DePiep (talk) 12:02, 6 March 2013 (UTC)
- The presence of two parameters came about in an effort to support the existing templates. As I recall, some want one kind of control, and some want the other. It could probably be standardized, but in the initial migration I was trying to avoid making too many changes to the behavior of existing templates. There is also a bit of notation problem if one has a default category, since then it becomes unclear whether
error_category=
(empty string) is meant as "no category" or as "use the default category". Dragons flight (talk) 17:24, 9 March 2013 (UTC)
Match
Is there a way to use match to eliminate hyphens from ISBN numbers? For instance: 978-1-4200-9050-X to 978142009050X. I tried,
{{#invoke:String|match|s=978-1-4200-9050-X|pattern=^(%d*)-*(%d*)-*(%d*)-*(%d*)-*(%d*X*)}} > 978
but I couldn't make it work. Can anybody help me? —– Jaider Msg 20:06, 12 March 2013 (UTC)
- If you just want to eliminate hyphens, shouldn't you replace them with empty strings, i.e.
- {{#invoke:String|replace| source=978-1-4200-9050-X | pattern=- | replace= }} = 978142009050X
- You can use match to ensure that the input or output has the appropriate ISBN form, if that is also important. Dragons flight (talk) 00:50, 13 March 2013 (UTC)
- Thanks! But my question is not just about ISBNs. How can we access several values returned from {{#invoke:String|match|...}}? (in other words, several (...) in patterns). And how can we use match to ensure that the input and output has the appropriate ISBN form? —– Jaider Msg 01:15, 13 March 2013 (UTC)
- At present, you can't access multiple (...), not from a template anyway. This is something I should think about how to address. As to using match for checking, something like:
- {{#invoke:String|match|s=978-1-4200-9050-X|pattern=^%d[%d-]*X?$ | nomatch = Not ISBN }} = 978-1-4200-9050-X
- {{#invoke:String|match|s=978-1-BARK-9050-X|pattern=^%d[%d-]*X?$ | nomatch = Not ISBN }} = Not ISBN
- Will work if you aren't picky about the number of digits or the placement of dashes. If you want to be careful about the details you can build a more sophisticated test by using several test calls or writing a short script in Lua. Dragons flight (talk) 02:08, 13 March 2013 (UTC)
- Great! Well, I am not a programmer and I am not sure about Lua stuff, but I made the following script:
local p = {}
function p.isbn(frame)
    local isbnString = frame.args[1] or ""
    local value1, value2, value3, value4, value5 = string.match(isbnString, "^(%d*)-*(%d*)-*(%d*)-*(%d*)-*(%d*X*)")
    return value1, value2, value3, value4, value5
end
return p
And it works ({{#invoke:SomePage|isbn|978-1-4200-9050-X}}
= 978142009050X). Could it be a kind of a solution for several "(...)" in patterns? —– Jaider Msg 12:46, 13 March 2013 (UTC)
- Yes, Lua can match and return multiple patterns. The tricky part is writing a template interface that could access that in a sensible way, especially if you don't know in advance how many capture patterns (...) the template author might want to use. The string module exists mostly to support legacy template code and to provide some string functionality to editors who understand templates but aren't willing to try Lua directly. For a simple dedicated task, like finding an ISBN, writing a short Lua script is probably easier. Congratulations on your first one. Dragons flight (talk) 13:50, 13 March 2013 (UTC)
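For a dedicated task like this, the two steps discussed above (checking the form, then dropping the hyphens) could be combined in one short function. This is a minimal sketch, assuming that "appropriate ISBN form" means 10 or 13 characters after the hyphens are removed; the name clean_isbn is hypothetical and no checksum is verified.

```lua
-- Hypothetical helper: return the ISBN without hyphens, or nil when the
-- input does not look like an ISBN.
local function clean_isbn(s)
    -- Same loose shape test as the template example: a digit, then digits
    -- or hyphens, then an optional trailing X.
    if not string.match(s, "^%d[%d%-]*X?$") then
        return nil
    end
    -- gsub returns the new string and a count; keep only the string.
    local digits = string.gsub(s, "%-", "")
    if #digits ~= 10 and #digits ~= 13 then
        return nil
    end
    return digits
end
```

With the inputs used earlier in this thread, clean_isbn("978-1-4200-9050-X") gives 978142009050X and clean_isbn("978-1-BARK-9050-X") gives nil.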
Pages as strings?
Is it possible to modify this script to allow whole pages as input? For example, if one wanted to include information about article size to Wikipedia:Vital articles? Or would 1000 instances of the script be too much to run on every single page load? — Yerpo Eh? 12:07, 28 April 2013 (UTC)
- Yes, there are ways to operate on an entire page's content, though if all you wanted was page size then the parser function {{PAGESIZE:page name}} probably makes more sense. However, loading entire pages is expensive, which means somewhat slow and limited to no more than 500 times per page. That limit applies to the PAGESIZE: parser function as well, so neither Lua nor PAGESIZE: would work if you needed 1000 iterations on a single page. Dragons flight (talk) 17:15, 28 April 2013 (UTC)
- Ah, I somehow assumed that Lua would magically make loading pages trivial in terms of server load, silly me. Thanks. — Yerpo Eh? 18:19, 29 April 2013 (UTC)
- There are ways to minimize the need to call expensive parser functions repeatedly in some cases. In Lua the result can be stored in a variable for reuse. Similar can be done with templates by passing the result of an expensive parser function as the value of a template argument/parameter. Either way the result could be printed a trillion times with one invocation or template call, as long as the time allocated for Lua and template expansions isn't exceeded. If you look at the result of {{#invoke:string|rep|{{PAGESIZE}}•|1000}} for example, the page size is displayed 1000 times with only one invoke, because the expensive parser function was only called once. This wouldn't work for the specific use case you have in mind though, as there are more than 500 different pages to get the page size for. --darklama 19:52, 29 April 2013 (UTC)
Not really a script-related subquestion, but come to think about it, parsing pages isn't necessarily unavoidable if all I want is page size. Is there an on-wiki handle available to extract it from page history? — Yerpo Eh? 05:51, 30 April 2013 (UTC)
- Besides {{PAGESIZE:page name}}, the MediaWiki API can be queried through JavaScript to find out page sizes. JavaScript is probably going to be the only way you will be able to include the current size for every page. --darklama 10:53, 30 April 2013 (UTC)
Replication on other wikis
Hi, I have just discovered this new "Lua programming" functionality in Wikimedia (I am an Italian user). I have a question (I don't know if other users have already talked about this). I have noticed that all wikis are replicating this base library (String), changing only the error messages (localization). Is there no feature for using a single shared String library among wikis (like images on Commons), instead of replicating it for each wiki? "String" is a very basic library, and if someone discovers a bug here, the fix should be propagated to each wiki (or vice versa).
If a shared library is not possible, at least it would be better to set the localization error strings as variables at the beginning of the source code, so that in other wikis we can cut&paste all the remaining part of the code without changing a line.
If somewhere you have already talked about these problems I would be happy to read about it. Thanks! --Rotpunkt (talk) 11:52, 1 May 2013 (UTC)
- No sharing mechanism currently exists, other than cut and paste. There has been general discussion at the WMF about creating a central code repository for key scripts, but that is likely to be at least months away. Yes, we should do a better job of making localization easier. Dragons flight (talk) 13:39, 1 May 2013 (UTC)
- Ok, thanks. I will look for that discussion. As a repository, it would be nice for example if the modules on Commons (http://commons.wikimedia.org/wiki/Commons:Lua/Modules) could be called from all wikis, so that we could put there the most used libraries (like String), in the same way we use Commons for images. I don't know where we could ask for such a feature... here: http://www.mediawiki.org/wiki/Extension_talk:Scribunto ? --Rotpunkt (talk) 14:39, 1 May 2013 (UTC)
- Translations might be possible with something like
msg = mw.message.new('Empty string'):plain();
If I've understood the documentation correctly the message is retrieved from MediaWiki:Empty_string. $1, $2, etc. can be filled in by including additional parameters to mw.message.new. It might also be possible to use MediaWiki:Empty_string/it to include both English and Italian translations for example. If I've understood the documentation correctly this would cut down on needing to edit the module at all. --darklama 16:27, 1 May 2013 (UTC)
- To fully localize the script would be necessary also to localize the default error category.--Moroboshi (talk) 06:56, 3 May 2013 (UTC)
- Well, actually a full localization would also localize the arguments.--Snaevar (talk) 23:57, 3 May 2013 (UTC)
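The simpler suggestion made earlier in this thread, keeping all localizable strings as variables at the top of the source, might look roughly like the sketch below. Everything here is an assumption for illustration: the table name messages, the keys, the message wording and the helper format_message are hypothetical and are not the module's actual strings.

```lua
-- Hypothetical layout: every user-visible string lives in one table at the
-- top, so another wiki only has to translate this block.
local messages = {
    empty_string       = "Target string is empty",       -- invented wording
    index_out_of_range = "String index $1 out of range", -- invented wording
}

-- Look up a message by key and fill in $1, $2, ... from the extra arguments.
local function format_message(key, ...)
    local text = messages[key] or key   -- fall back to the key itself
    local args = { ... }
    -- The extra parentheses drop gsub's second return value (the count).
    return (string.gsub(text, "%$(%d+)", function(n)
        return tostring(args[tonumber(n)] or "$" .. n)
    end))
end
```

The default error category could be kept in the same table, which would cover the point about localizing it as well.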
Help needed
This help request has been answered. If you need more help, you can , contact the responding user(s) directly on their user talk page, or consider visiting the Teahouse. |
Please do not deactivate this {{help me}} until 09 Jun 2013 unless you are answering my question. I know that anyone who can help probably has this page watchlisted, but just in case... Now, my questions:
- Is there anyway I can shorten the following replace sequence?
{{#invoke:String|replace|{{#invoke:String|replace|{{Str sub old|{{{TEST-STRING}}}|0|25}}|[^%[%]\{}%`%^%-%w]|_|plain=false}}|^[^%[%]\{}%`%^%a]|_|plain=false}}
- The process currently truncates {{{TEST-STRING}}} to 25 characters, replaces all characters outside of the "allowed" set [^%[%]\{}%`%^%-%w] with _, then finally replaces the first character of the string with _ if it is outside the "allowed" first character set [^%[%]\{}%`%^%a]
- The next question is, how do I test the result of the above process to see if all of the characters have been replaced with _?
- I was thinking something like
{{#ifeq:{{#invoke:String|len|{{{TEST-STRING}}}|MATCH|100% invalid input...|{{#invoke:String|replace|{{#invoke:String|replace|{{Str sub old|{{{TEST-STRING}}}|0|25}}|[^%[%]\{}%`%^%-%w]|_|plain=false}}|^[^%[%]\{}%`%^%a]|_|plain=false}}}}
but I don't know how to count the instances of "_" in the string to fill in the "MATCH" section...
- Thanks for any help you can offer. :) Technical 13 (talk) 18:13, 7 June 2013 (UTC)
- i think maybe it would be better if you try to explain what are you actually trying to do, rather than asking us to suggest methods to optimize some obscure piece of code, no? peace - קיפודנחש (aka kipod) (talk) 18:29, 7 June 2013 (UTC)
- It is for work on the Template:Freenode/sandbox that adds an argument to allow the person leaving the template to specify an IRC handle based on the user's wikipedia username. Technical 13 (talk) 18:34, 7 June 2013 (UTC)
- More accurately, to make sure the inputted string (username) is appropriately modified so it follows the IRC rules for names
- Maximum 25 characters [So truncating the string]
- First character cannot be number [So replacing first character by _]
- No character can be outside a list of characters (a-z,A-Z,0-9,_) [So replacing all of them by _]
- Correct me if there are more rules/ the rules listed are incomplete.
- TheOriginalSoni (talk) 19:02, 7 June 2013 (UTC)
- i still do not understand what you try to do. let me try to focus the question: are you looking for a template/function that will receive a string and will return a boolean (or 0/1 or whatever) that indicates whether this string is "kosher" (according to some criteria), or are you trying to create something that receives a string and cook a "legal" string out of it? or maybe something else entirely? if it's something else, can you explain it again? maybe i'll have better luck understanding it this time around... peace - קיפודנחש (aka kipod) (talk) 20:13, 7 June 2013 (UTC)
- You should write the logic in Lua instead of parser functions that call Lua, and then replace that mess in your template with {{#invoke:YourModule|functionName|{{{VARIABLE}}}}}. Seriously, there's no reason at all to do what you did there. And I also note that those replace calls won't even do what you want, since Freenode doesn't appear to allow UTF-8 in nicks. Anomie⚔ 20:41, 7 June 2013 (UTC)
- Anomie would you be willing to help me with that? I don't know how to write the logic in Lua yet. I came here to ask because I knew that there had to be an easier shorter way to do it, but I did not know how. To answer your question kipod, create something that receives a string and cook a "legal" string out of it is the goal. Technical 13 (talk) 21:07, 7 June 2013 (UTC)
- Something vaguely like this should get you started.
local p = {}
function p.guessNick( frame )
    local username = frame.args[1]
    local nick
    -- First, strip out non-ASCII as best we can
    -- Note this will totally fail for non-Latin-script usernames. Nothing much we can do about that.
    nick = mw.ustring.toNFD( username )
    nick = string.gsub( nick, '[^\32-\126]', '' )
    -- Next, replace other unacceptable characters
    if string.match( nick, '^[0-9%-]' ) then
        -- Begins with a digit or a hyphen, so prepend an underscore
        nick = '_' .. nick
    end
    -- Replace each run of disallowed characters with one underscore
    -- (the allowed set includes a literal backslash, written \\ in the Lua string)
    nick = string.gsub( nick, '[^a-zA-Z0-9_%-%[\\%]{|}^`]+', '_' )
    -- Cut to 25 characters
    nick = string.sub( nick, 1, 25 )
    return nick
end
return p
Match problem
Is there a problem with match or is it something that I don't understand? Match is supposed to return the string that matches a pattern.
If I want all of the digits up to and including the '4
' in the string '1234567890
' I do this:
{{#invoke:String|match|1234567890|%w*4|nomatch=no match}}
→ 1234
If I want the length of a string I do this:
{{#invoke:String|len|1234}}
→ 4
If I want to find the length of the matched string I do this:
{{#invoke:String|len|{{#invoke:String|match|1234567890|%w*4|nomatch=nomatch}} }}
→ 5
Isn't '5' the wrong result?
—Trappist the monk (talk) 14:32, 10 October 2013 (UTC)
- Try it without the space on the end. -- WOSlinker (talk) 18:19, 10 October 2013 (UTC)
{{#invoke:String|len|1234}}
→ 4
{{#invoke:String|len|1234 }}
→ 5
{{#invoke:String|len|{{#invoke:String|match|1234567890|%w*4|nomatch=nomatch}}}}
→ 4
- "O that he were here to write me down an idiot! But, masters, remember that I am an idiot; though it be not written down, yet forget not that I am an idiot." (apologies to Shakespeare's Dogberry).
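The extra character in the example above is the trailing space, which is kept when the value is passed as an unnamed parameter. Seen from the Lua side, the effect and one common way to strip it look like this; the helper name trim is hypothetical, and the sketch uses plain byte strings rather than the module's own code.

```lua
-- Hypothetical helper: remove leading and trailing whitespace.
-- The parentheses keep only the captured text.
local function trim(s)
    return (string.match(s, "^%s*(.-)%s*$"))
end

local with_space = "1234 "                -- what len receives in the second call
local length_raw = #with_space            -- counts the trailing space
local length_trimmed = #trim(with_space)  -- counts only the digits
```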
Replace
Hello all, I ran into a problem and would appreciate any help. Trying to use this function to replace strings with [[
does not work, I guess because it tries to parse as a link. Example:
- Trying to transform two words separated by - into two wikilinks:
[[{{subst:#invoke:String|replace|Foo-bar|-|]] and [[}}]]
- As you can see, it does not work. Replacing with the HTML code &#91; is not an option because it does not create wikilinks.
I understand one could "post-process" that output, like replacing &#91; with [ using {{Str rep}} but that's too ugly :) Is there a way to circumvent that? Am I missing something? Cainamarques (talk) 14:00, 3 December 2013 (UTC)
- Your problem is that MediaWiki considers the brackets as well as the braces when trying to parse the wikitext, so it sees it as two potential links ("[[{{subst:#invoke:String|replace|Foo-bar|-|]]" and "[[}}]]") rather than as one parser function call inside brackets that has more brackets in its arguments. There doesn't seem to be any easy way around this. Anomie⚔ 14:57, 3 December 2013 (UTC)
- Just curious here, what exactly are you trying to accomplish? Maybe there is a way to achieve your goal without replacing - with ]] and [[? Technical 13 (talk) 15:02, 3 December 2013 (UTC)
I had lunch, thought a little bit and found a solution:
[[{{subst:#invoke:String|replace|Foo-bar|-|]] and {{subst:User:Cainamarques/sandbox}}}}]]
where User:Cainamarques/sandbox just contain two brackets [[
hehe.
Technical 13, definitely there is. All I need is to create wikilinks of two pages that are contained within a page title. These pages are separated by a semicolon and a whitespace, like:
- Wikipédia:Fusão/Central de fusões/Imagem 3D; Estereoscopia
In this example, the pages are "Imagem 3D" and "Estereoscopia". The code is gonna be in a Preload page, and it's necessary to output clean wikicode, so subst:
is needed. So the code below is what I came up to:
[[{{<includeonly>subst:</includeonly>#invoke:String|replace|{{#titleparts:{{PAGENAME}}||3}}|(; )|=]] [[|plain=false}}]]
I guess it's ok. Sorry for my english. Cainamarques (talk) 15:48, 3 December 2013 (UTC)
- Cain, you are aware that Lua Module:s are not subst:itutable, right? Technical 13 (talk) 16:07, 3 December 2013 (UTC)
- I am now :) Makes sense now that I think about it... Thank you, Technical 13. Cainamarques (talk) 16:24, 3 December 2013 (UTC)
- @Technical 13: Yes they are. Look at Module:Unsubst for a prime example. Anomie⚔ 16:33, 3 December 2013 (UTC)
- Cain, I'm guessing by the "Wikipédia:Fusão/Central de fusões/Imagem 3D; Estereoscopia" pagename that this is not on the English Wikipedia here, can you check Special:Version of the Wikipedia where you want to do this and see if mw:Extension:StringFunctions or something similar is installed? If it is, I may be able to help you with an alternative that will be subst:itutable. Technical 13 (talk) 16:36, 3 December 2013 (UTC)
Anomie is right, modules are substitutable. Anyway my solution above stops working when the simple text "Foo-bar" is changed to the expression I wanna use: {{ #titleparts:{{PAGENAME}} }}
. Even so, it would work if not trying to subst:
the Lua module. Either way, it is not good enough. If not for bugzilla:2777, it would be easy as pie. Technical 13, I'm from pt.wiki, there is no such thing, thank you for the support. I'll try again in the future... Cainamarques (talk) 17:52, 3 December 2013 (UTC)
Another
{{#invoke:String|len|s={{#invoke:String|replace|source= <span style="padding-left: 0.125em;"><!-- 1em/8 : equivalent to a "fine space" -->!</span> |pattern= %b<> }}}}
→ 45
{{#invoke:String|len|s={{#invoke:String|replace|source= <span style="padding-left: 0.125em;">!</span> |pattern= %b<> }}}}
→ 45
{{#invoke:String|len|s={{#invoke:String|replace|source= <span style="padding-left:.125em;">!</span> |pattern= %b<> }}}}
→ 43
The same when using pattern=<.->
:
{{#invoke:String|len|s={{#invoke:String|replace|source= <span style="padding-left: 0.125em;"><!-- 1em/8 : equivalent to a "fine space" -->!</span> |pattern= <.-> }}}}
→ 45
{{#invoke:String|len|s={{#invoke:String|replace|source= {{#tag:nowiki|<span style="padding-left: 0.125em;"><!-- 1em/8 : equivalent to a "fine space" -->!</span>}} |pattern= %b<> }}}}
→ 34
What to do? I am trying to get a result of "1". --Jerome Potts (talk) 19:00, 28 May 2014 (UTC)
- The string module defaults to |plain=true which means the search pattern is plain text and is not a regular expression. To fix, add |plain=false in the above. Johnuniq (talk) 02:38, 29 May 2014 (UTC)
- Oops ! Thank you much. --Jerome Potts (talk) 05:05, 29 May 2014 (UTC)
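The difference between plain and pattern matching can be seen directly in Lua. This is a sketch using the standard string library, not the module's code: the fourth argument of string.find switches on plain matching, under which %b<> is searched for as literal text and finds nothing, while in pattern mode %b<> matches each balanced angle-bracket pair.

```lua
local source = '<span style="padding-left: 0.125em;">'
    .. '<!-- 1em/8 : equivalent to a "fine space" -->!</span>'

-- Plain mode: "%b<>" is looked for as four literal characters.
local plain_hit = string.find(source, "%b<>", 1, true)

-- Pattern mode: each <...> pair is removed, leaving only the "!".
local stripped = string.gsub(source, "%b<>", "")
local remaining_length = #stripped
```

Here remaining_length comes out as 1, the result asked for at the start of this section.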
str.match & str.replace bug
In Lua
string.match("abc,def", "^%w*(%W*)%w*$")
returns ,
but Module:String's implementation
{{#invoke:String|match|s=abc,def|pattern=^%w*(%W*)%w*$|plain=false}}
returns ,def
which is clearly wrong. — {{carismagic|5 February 2014, 08:58}}
- The same problem for replacing with str.replace:
{{#invoke:String|replace|source=abc,def|pattern=^%w*%W*|replace=123|plain=false}}
returns 123 instead of 123def. --{{carismagic|5 February 2014, 10:28}}
Uh oh, this looks awkward. These results occur in a module debug console (for example, click "edit" here, then paste in each line beginning with "=" and press Enter, one line at a time):
=string.match('abc,def', '^%w*(%W*)%w*$')
,
=mw.ustring.match('abc,def', '^%w*(%W*)%w*$')
,def
=string.match('abc', '%W')
nil
=mw.ustring.match('abc', '%W')
a
In other words, it's a bug in the mw library. Johnuniq (talk) 10:56, 5 February 2014 (UTC)
- Exactly, but I didn't know where to report the bug, so I wrote about it here. --{{carismagic|5 February 2014, 11:37}}
- Bugzilla is where you report a bug. In this case, I've filed it for you: T62908. Anomie⚔ 17:15, 5 February 2014 (UTC)
- Thank you for filing it! Next time I'll make an account there and report it myself. --{{carismagic|5 February 2014, 18:37}}
Plain truth
- plain
- A flag indicating that the pattern should be understood as plain text. Defaults to false.
and
- plain
- Boolean flag indicating that target should be understood as plain text and not as a Lua-style regular expression, defaults to true
The parameter flag values for "true" and "false" should be explicitly given in the documentation, because someone looking at this page may not know this information as it differs between programming languages (or they may not know any programming language). At the moment, finding out what to pass into the module to change the boolean argument "plain" involves hunting through the code at the bottom of the page. -- PBS (talk) 10:27, 28 February 2014 (UTC)
- ? -- PBS (talk) 06:38, 13 May 2017 (UTC)
Where can I find the documentation for wild chars?
Where can I find the documentation for wild chars? %w, %d, %s and so on. I have a regexp that matches two or more words and I need to change it into matching three or more words and I can't figure how to use the wild chars. — Ark25 (talk) 06:37, 23 March 2014 (UTC)
- Thank you very much! — Ark25 (talk) 05:43, 25 March 2014 (UTC)
Error
Hello,
I tried to implement this module on my project, but somehow I got an error: "Script error: Lua error: Internal error: The interpreter exited with status 126." I'm using mediawiki 1.21 --62.65.230.239 (talk) 16:34, 11 May 2014 (UTC)
- You mean, not on a Wikipedia somewhere? If so, a guess would suggest some version problem—incompatible versions of MediaWiki and/or extensions. See Special:Version. Start by finding a very simple module and trying to make that work. Johnuniq (talk) 23:39, 11 May 2014 (UTC)
anchordecode
I created a lossy anchordecode
function at Module:Cite doi. It reverses the effect of the anchorencode:
parser function, though there can be false positives. It's being used for a preload template, but it might be useful for other purposes as well. It should probably be moved into this module. – Minh Nguyễn (talk, contribs) 06:14, 8 October 2014 (UTC)
- IMO, trying to decode anchorencoded text probably means you're doing something wrong. Use a non-lossy encoding of some sort instead. Anomie⚔ 10:12, 8 October 2014 (UTC)
- Ideally, yes. In the case of {{cite doi}}, we had to deal with Citation bot's implementation. I just figured there might be some other use for it, that's all. – Minh Nguyễn (talk, contribs) 07:03, 10 October 2014 (UTC)
Counting
I'm looking for a simple function, maybe I overlooked it somewhere, as it doesn't seem an absurd function. basically I want to count the occurrences of a single character within a string. What I'm particularly thinking of is output from wikidata, which gives a comma-separated list of values. I just want to know how many values, so I thought of counting the commas, plus 1. Maybe there's a more direct way within wikidata, but I've not found that either! Unbuttered parsnip (talk) mytime= Fri 17:30, wikitime= 09:30, 3 April 2015 (UTC)
- I don't know if there is anything built for the purpose, but if you were writing a module, you could use gsub to replace each comma with anything; gsub returns the count as the second value which is easily captured. In a template, you could use this module to replace all non-comma characters with an empty string, then get the length of the result. Johnuniq (talk) 10:25, 3 April 2015 (UTC)
- No I don't fancy writing a module, but you gave the idea of comparing the lengths of the original string and its length after REPLACEing all commas with nothing (plus 1). Unbuttered parsnip (talk) mytime= Fri 19:27, wikitime= 11:27, 3 April 2015 (UTC)
- Not exactly, here is the idea (using "aa,bb,cc" as the example text to be tested):
{{#invoke:String|len|{{#invoke:String|replace|aa,bb,cc|[^,]|plain=false}}}}
→ 2
- The result is 2 which is the number of commas. Johnuniq (talk) 00:06, 4 April 2015 (UTC)
- My idea was
{{#expr:{{str len|abc,def,ghi,klmnopq,r,stz}}-{{str len|{{replace|abc,def,ghi,klmnopq,r,stz|,|}}}}}}
→ 5
and I can more easily see what's going on – I never really mastered regexps. Unbuttered parsnip (talk) mytime= Sat 12:34, wikitime= 04:34, 4 April 2015 (UTC)
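For anyone who does go the module route, the gsub approach mentioned above is short. A minimal sketch, assuming the input is a comma-separated list; the function name count_values is hypothetical.

```lua
-- Hypothetical helper: count the values in a comma-separated list.
local function count_values(list)
    if list == "" then
        return 0
    end
    -- gsub returns the new string and the number of replacements made;
    -- only the count is needed here.
    local _, commas = string.gsub(list, ",", ",")
    return commas + 1
end
```

For "aa,bb,cc" this gives 3 (two commas plus one), and for "abc,def,ghi,klmnopq,r,stz" it gives 6.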
arraytostring
Hello, on it.wiki we added a function called arraytostring (you guess what it does) and we already found many uses for it. I'm just suggesting to adopt it on en.wiki too. It allows to do any repetitive task, such as converting |par1=X|par2=Y|par3=Z... to X, Y, Z... without repeating code, without subtemplates and without specific modules. --Bultro (talk) 13:06, 31 July 2015 (UTC)
Pattern matching
Currently the documentation states:
- pattern
- The pattern or string to find within the string
...
- plain
- Boolean flag indicating that pattern should be understood as plain text and not as a Lua-style regular expression. Defaults to false (to change: plain=true)
I know what a regular expression is but it is not clear to me what a "Lua-style regular expression" is.
If I wanted to pass in a regular expression of code |[45]
how would I escape the pipe symbol so that the function did not mistake the pipe symbol as a parameter delimiter?
Or if I passed in plain=false
as a pattern how do I do that without the function interpreting it as a parameter?
-- PBS (talk) 06:37, 13 May 2017 (UTC)
- For pipe, use {{!}} (see Template:! which points out that it is a magic word, not a template). That is untested; I'm only assuming that standard method works when invoking a module. For equals, use pattern= to identify the pattern (that is, pattern=plain=false). The link at "Scribunto patterns" has documentation. Johnuniq (talk) 08:04, 13 May 2017 (UTC)
- I could escape them that way, but the documentation does not explain that there are special characters that need escaping, or that you have to use named parameters if the pattern includes an equals. A few examples of standard use, an example escaping [0-9], and some examples with typical gotchas (like using an = in a pattern) would improve this documentation, because it would allow novices to get up to speed in using these functions much more quickly. As a novice I would appreciate it if an editor experienced in using these modules could add such documentation. -- PBS (talk) 08:50, 13 May 2017 (UTC)
- "Regular expression" is a poor term to use there, as Lua's patterns aren't as powerful (for example they lack alternation). I see links to documentation explaining Lua patterns (for string and ustring) are already present, but could perhaps be moved into the documentation of the 'pattern' parameter. Anomie⚔ 12:11, 13 May 2017 (UTC)
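To illustrate the missing alternation: a regular expression such as cat|dog has no single Lua pattern equivalent, so the usual workaround is to try several patterns in turn. A minimal sketch in plain Lua; the helper name match_any is hypothetical. Note that the pipe is an ordinary character in a Lua pattern, so code |[45] only needs escaping at the wikitext level.

```lua
-- Hypothetical helper: return the first match produced by any of the
-- given patterns, tried in order, or nil when none of them matches.
local function match_any(s, patterns)
    for _, pattern in ipairs(patterns) do
        local found = string.match(s, pattern)
        if found then
            return found
        end
    end
    return nil
end

-- "|" is literal here; [45] is a set matching either digit.
local pipe_example = string.match("code |5", "code |[45]")
```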
above br separated entries
Is {{#invoke:string|replace|{{{name}}}| |<br />|plain=false}} doing something except an extra LUA call? --Xoristzatziki (talk) 06:16, 12 August 2017 (UTC)
- It replaces every space with a break, although plain=false is not useful.
{{#invoke:string|replace|apple banana cherry| |<br />}}
→ apple
banana
cherry
- If there is a problem, please link to the page with the problem and outline the issue. Johnuniq (talk) 06:49, 12 August 2017 (UTC)
Protected edit request on 13 December 2018
This edit request has been answered. Set the |answered= or |ans= parameter to no to reactivate your request. |
Please update the code to use common getParameter functions as demonstrated in sandbox. These functions tend to be common. It is also used in Module:StringFunc. Ans (talk) 19:06, 13 December 2018 (UTC)
- Wouldn't it be better to use getArgs() from the very well used and accepted Module:Arguments instead of yet-another-module that does more-or-less-the-same-thing? Same for Module:yesno? --Trappist the monk (talk) 19:14, 13 December 2018 (UTC)
- Module:yesno is OK, but I cannot find function in Module:Arguments that does the same thing --Ans (talk) 23:13, 13 December 2018 (UTC)
- I think there might be some confusion about the proposed edit. My very quick look suggests that certain functions are currently in Module:String and Ans wants to use those functions in another module. The proposal is that the functions be moved from Module:String to a more general module which can be used as required. In a traditional software project, splitting out functions that are used elsewhere is exactly what would be done. I once used an excellent package where every function was in a separate source file and I'm still not sure about that extreme. However I don't know whether splitting short functions from commonly used modules is desirable. Practical issues include the need to monitor more than one module to know what's going on and to ensure this module is not broken, and to apply suitable protections and handle edit requests for the other modules. Moreover other projects would then need to copy yet another module. Repetition is evil but so is complexity. Johnuniq (talk) 02:37, 14 December 2018 (UTC)
- Not done please establish a consensus for this change here then reactivate the edit request. — xaosflux Talk 00:09, 16 December 2018 (UTC)
escapePattern function
I've added an escapePattern function to the sandbox, together with some test cases. This function can be used from wiki pages to escape Lua string patterns. I intend to use this in Template:Basic portal start page to escape the first argument to {{Transclude selected recent additions}}. The default argument is {{subst:PAGENAME}}, which was causing problems on portal names with magic characters in. (For example, Portal:T.I., now deleted, was at one point displaying a DYK that contained the text "TGIF".) I don't imagine there will be many other use cases for this function, but I think in some limited circumstances it would prove useful. What do people think about adding it to the module? — Mr. Stradivarius ♪ talk ♪ 15:11, 3 May 2019 (UTC)
- That looks good although I wonder if showing an error for a missing parameter is best. Why not output an empty string if there is no input? Sorry to make a massive diff, but I cleaned the whitespace in the sandbox to use tabs for indents and remove trailing space. Are you aware of the enormous wars that have been fought over portals in recent weeks? Johnuniq (talk) 00:52, 4 May 2019 (UTC)
- @Johnuniq: I did that because I think calling the function without the pattern parameter will always be an error - it indicates someone trying to call the function with no parameters or the wrong parameters, or a mistake in template syntax. An empty string argument like {{#invoke:String|escapePattern|}}, on the other hand, could happen in a template doing something like {{#invoke:String|escapePattern|{{{foo|}}}}}; the function outputs an empty string in this case. I admit to not being fully aware of the recent drama surrounding portals when I wrote the function last night, although I knew it was contentious. I've been reading up on the declined ArbCom case this morning. My stance is that if we're going to keep Template:Basic portal start page around, we may as well make sure it works properly. Best -- Mr. Stradivarius ♪ talk ♪ 01:50, 4 May 2019 (UTC)
- The escape function will be useful. Re portals, I mentioned that because it would not be worth spending time on the portal automation system. Johnuniq (talk) 02:01, 4 May 2019 (UTC)
- I've gone ahead and added the function. I did it in two edits to get a clean diff with the escapePattern function before cleaning up the whitespace. Let me know if you notice anything amiss. — Mr. Stradivarius ♪ talk ♪ 03:20, 4 May 2019 (UTC)
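The "T.I." problem described in this thread can be reproduced in a few lines of plain Lua. This is only a sketch under stated assumptions: it uses the standard string library rather than mw.ustring, and escape_pattern is an illustrative name, not necessarily how Module:String's escapePattern is written.

```lua
-- Hypothetical helper: prefix every Lua pattern magic character with "%".
local function escape_pattern(s)
	-- The set lists ( ) . % + - * ? [ ] ^ $, each escaped inside the class.
	-- The outer parentheses drop gsub's second return value (the match count).
	return (s:gsub("[%(%)%.%%%+%-%*%?%[%]%^%$]", "%%%0"))
end

-- Unescaped, each "." matches any character, so the pattern "T.I." matches "TGIF".
print(("TGIF"):find("T.I."))

-- Escaped, the dots are literal, so "TGIF" no longer matches.
print(("TGIF"):find(escape_pattern("T.I.")))

-- The escaped form of the portal name fragment.
print(escape_pattern("T.I."))
```

On a wiki page the equivalent step would be an {{#invoke:String|escapePattern|...}} call wrapped around the value before it is passed on as a pattern.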
Making match function available to other modules
Following a request, I've amended the function str.match(frame) to call a subroutine str._match() with the arguments passed as parameters. That should allow the function to be exported for use in other modules. Gonnym has been testing the /sandbox version and reports no problems, so I've updated the main module from that sandbox. Please revert if any unexpected issues arise, although the change is minor so (hopefully) hasn't much potential to screw up. Cheers --RexxS (talk) 16:08, 13 May 2019 (UTC)
- @RexxS:, how can I access |ignore_errors= from a module call? --Gonnym (talk) 10:29, 27 November 2019 (UTC)
- I was able to find the |nomatch= but not to disable the categories. Anyways, I fixed the issue I had so don't need this atm. --Gonnym (talk) 11:04, 27 November 2019 (UTC)
- @Gonnym: That's difficult, because the module uses a call at line 509 to create a new frame object from the current frame and then attempts to extract error_category, ignore_errors and no_category from that frame object (rather than passing these parameters into the call each time). That's fine if the module is being #invoked, but of course fails – as you found – if a function is exported for use in an external module. I'll sandbox a fix and let you know when I think it's working. --RexxS (talk) 13:42, 27 November 2019 (UTC)
- @Gonnym: I think the current sandbox will handle error_category, ignore_errors and no_category now:
- In the calling module, you create a pseudo-frame object, for example, local f = {}.
- Then set f.args = {}.
- Then set f.args.error_category = "Your category", and/or f.args.ignore_errors = true and/or f.args.no_category = true as required.
- Finally, call _match with parameters ( s, pattern, start, match_index, plain_flag, nomatch, f ).
- Let me know if it works for you. Cheers --RexxS (talk) 14:09, 27 November 2019 (UTC)
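The pseudo-frame steps listed above could be wrapped up roughly as follows. This is a sketch, not the module's own code: make_pseudo_frame and match_quietly are hypothetical names, the string module is passed in as a parameter so the example runs outside Scribunto, and the values used for start, match_index and plain_flag (1, 1, false) are assumptions chosen for illustration.

```lua
-- Hypothetical helper: build the pseudo-frame described above, i.e. a plain
-- table whose args field carries the error-handling options.
local function make_pseudo_frame(options)
	local f = {}
	f.args = {}
	for name, value in pairs(options or {}) do
		f.args[name] = value
	end
	return f
end

-- Hypothetical wrapper: str_module must expose _match with the parameter
-- order quoted in the thread (s, pattern, start, match_index, plain_flag,
-- nomatch, f). On a wiki, str_module would come from require('Module:String').
local function match_quietly(str_module, s, pattern, nomatch)
	local f = make_pseudo_frame({ ignore_errors = true, no_category = true })
	return str_module._match(s, pattern, 1, 1, false, nomatch, f)
end
```

Passing the module in as an argument is only a convenience for trying the sketch offline; whether the seventh parameter is accepted depends on which revision of the module is live.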
Edit request to implement merges
This edit request has been answered. Set the |answered= or |ans= parameter to no to reactivate your request. |
Please sync the sandbox of this module to merge Module:Join, Module:Str endswith, Module:PatternCount and Module:Text count into it per the applicable TfDs for those modules.
A few notes:
- Module:PatternCount and Module:Text count do the same thing, and are therefore combined into one function called str.count. That function also supports a |plain= parameter, not in either source module, for consistency reasons.
- Module:Str endswith is a module that takes arguments from the calling template. I chose to instead implement a pure "does this string end with this other string" function in Lua, leaving the parameter-handling code to Template:Str endswith (see Template:Str endswith/sandbox).
- I chose not to merge the unused functions of Module:Join, leaving only the join function as str.join.
* Pppery * survives 00:19, 26 May 2019 (UTC)
- There seems to be an undefined global variable, j, at line 474. Has anyone checked the code that's being merged into this module? Considering the six million plus transclusions, I think we need to do this merge carefully. Is there a plan to merge the documentation? --RexxS (talk) 00:58, 26 May 2019 (UTC)
- The undefined global variable appears to be present in Module:Join as well, so it's not an error in the merge per se, just an error in the code being merged, and I'm not sure what it is supposed to refer to. And yes, I do plan to merge the documentation, but it seems less damaging to me to have undocumented code than functions that exist in the documentation but not the code, so I'm not going to merge the documentation until after the code is merged. * Pppery * it has begun... 01:04, 26 May 2019 (UTC)
- Thank you for the plan to merge the documentation. I agree that's the best way to do it. The problem with undefined globals is that they have the potential to interact with other functions that may be merged into this module in future (and that's particularly a risk for simple names like 'j'). I'm pretty certain that the spurious 'j' is a relic of copying the documentation for table.concat and simply needs to be removed (it will have previously defaulted to false in its original module); do you agree? --RexxS (talk) 01:19, 26 May 2019 (UTC)
- Indeed, that does seem likely, feel free to go ahead and remove the j (or make any other changes you feel are necessary to the sandbox). * Pppery * it has begun... 01:22, 26 May 2019 (UTC)
I did a small amount of cleaning in Module:String/sandbox and might do a little more. There is a pairs problem in the new join function but I'm going to ponder how the following result occurs before thinking about it.
{{#invoke:String/sandbox|join|,|home=unknown|one|two|three|extra=xyz}} → one,two,three
{{#invoke:String/sandbox|join||home=unknown|one|two|three|extra=xyz}} → onetwothree
The last example is currently giving twoonethreeoneunknownonexyz. What should occur if named parameters are used? For consistency with other functions, I think they should be ignored; that is, ipairs should be used. Johnuniq (talk) 02:09, 26 May 2019 (UTC)
- Oh, that's obvious. There appears to be no way to specify an empty separator. It ignores the empty parameter and uses "one" as the separator. What should happen? Johnuniq (talk) 02:17, 26 May 2019 (UTC)
- I agree, it should be treated as an empty separator and named parameters should be ignored. The reason for the odd behavior before was because I copied Module:Join's join function nearly verbatim (only changes were reindenting and renaming a variable), and therefore all of its quirks-- but a join function added to Module:String should have a clean API, including handling of empty parameters, unknown parameters, and other errors. I've deactivated the edit request template so that you can continue your code review unencumbered by my changes being prematurely synced live. * Pppery * it has begun... 02:28, 26 May 2019 (UTC)
- @Johnuniq: I dealt with the problems of leading/trailing whitespace in the named prefix parameters in WikidataIB by assuming we'd never use double quotes and then stripping them from the parameter like this: local lp = (args.linkprefix or args.lp or ""):gsub('"', ''). You could take a similar course and strip double quotes from the first unnamed parameter, allowing you to use calls like {{#invoke:String/sandbox|join|""|home=unknown|one|two|three|extra=xyz}}. The other option that occurs to me is to use an extra parameter |nosep= (or similar) which would be a boolean switch that made the separator the empty string when set to false. Users would still have to supply a dummy value for the first unnamed parameter, though, or you'd have to go through the args moving all of the unnamed parameter indexes up by 1. What do you think? --RexxS (talk) 02:45, 26 May 2019 (UTC)
- I prefer Pppery's idea above and I implemented it in the sandbox. The result is that named parameters are ignored and the first parameter is always used as the separator. The separator can be empty; that is very understandable and would be expected by users. Any empty parameters in those following the separator are ignored. Johnuniq (talk) 03:43, 26 May 2019 (UTC)
- BTW, some functions have redundant semicolons while most don't. Should I remove the semicolons? I'm inclined to do that despite it making diffs more complex because people often learn by example and they might think that the semicolons were somehow desirable. Johnuniq (talk) 05:15, 26 May 2019 (UTC)
- That seems to work fine. I'd remove the semicolons and not worry about the diffs. The more prominent a module is, the more we ought to ensure it reflects best practice. --RexxS (talk) 11:44, 26 May 2019 (UTC)
- OK, that makes sense. I've removed the semicolons from the sandbox. * Pppery * it has begun... 15:59, 26 May 2019 (UTC)
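The join behaviour agreed in this thread (named parameters ignored, first positional parameter always the separator, empty separator allowed, empty items skipped) can be sketched in plain Lua. The function below is an illustration written for this summary, not the sandbox code; it takes an ordinary table in place of frame.args.

```lua
-- Hypothetical sketch of the agreed semantics; args stands in for frame.args.
local function join(args)
	-- The first positional parameter is always the separator and may be empty.
	local sep = args[1] or ""
	local parts = {}
	-- ipairs visits only the positional entries 1, 2, 3, ... so named
	-- parameters such as home= or extra= are never seen.
	for i, value in ipairs(args) do
		if i > 1 and value ~= "" then
			parts[#parts + 1] = value
		end
	end
	return table.concat(parts, sep)
end

print(join({ ",", "one", "two", "three", home = "unknown", extra = "xyz" }))
print(join({ "", "one", "two", "three", home = "unknown", extra = "xyz" }))
```

Using ipairs rather than pairs is what removes both the unpredictable ordering and the stray named values seen in the earlier output.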
- @Pppery: just making sure as you haven't mentioned it here, is Module:String count part of this merge? --Gonnym (talk) 18:17, 26 May 2019 (UTC)
- @Gonnym: No, that's not being merged into Module:String, it is a separate TfD for which the result is delete. No new functionality needs to be added to Module:String to implement Module:String count in Wikitext other than the functionality added by merging the other modules into it. * Pppery * it has begun... 21:18, 26 May 2019 (UTC)
@RexxS and Johnuniq: Do you have any further tweaks to make to the code or is this ready to be moved to Module:String? * Pppery * it has begun... 02:06, 27 May 2019 (UTC)
- Yes, I think Module:String can be updated from Module:String/sandbox, bearing in mind that all I've done is examine the new code without checking any testcases. I would prefer that str.count handled the case of missing source or pattern parameters. However, the calling template ensures the parameters are never nil so if any generalization of count is needed in the future, that problem can be considered then. Johnuniq (talk) 03:38, 27 May 2019 (UTC)
- @Johnuniq: What "calling template", exactly? Module:PatternCount and Module:text count are both modules that do not specifically implement any template but instead were called directly from non-template Wikitext pages. To paraphrase myself from earlier, "a [count] function added to Module:String should have a clean API, including handling of empty parameters, unknown parameters, and other errors", which leads to the question of what {{#invoke:String|count}} or {{#invoke:String|count|foo}} should do. Produce a custom error message (respecting |ignore_errors=)? Silently treat any missing values as the empty string? * Pppery * it has begun... 03:54, 27 May 2019 (UTC)
- @Pppery: Hmm, I might be losing it. 24 hours ago I used what-links-here for a module that was being replaced by the new functions. It was used in lots of articles so I then checked what templates used it. I found a template where you had edited the sandbox to call string/sandbox. The template only used invoke if parameters 1 and 2 were nonblank, otherwise it output an empty string. That must have been endswith that I looked at? At any rate, I agree the functions should handle missing parameters so I made another edit to the sandbox to fix that. I checked what mw.ustring.gsub does, and it agrees with string.gsub (as it should) in the way empty strings are handled. For count, the result is that an empty or missing source will give 0 if the pattern is not empty, or 1 if the pattern is empty or missing. If the source is not empty, an empty or missing pattern will give n+1 where n is the number of Unicode characters in source. For example, counting how many times an empty string occurs in abc would give 4 (an empty string occurs before and after each character). That's what Lua gsub does and is reasonable. Johnuniq (talk) 05:09, 27 May 2019 (UTC)
- Yeah, it does look like you were talking about {{str endswith}}, which has that wikitext shim because (a) the template has weird backwards compatibility requirements and (b) I don't want Module:String to take arguments from the parent frame. In any case, I've reactivated the edit request template. * Pppery * it has begun... 14:07, 27 May 2019 (UTC)
- @Pppery and Johnuniq: Done. The changes look like a valuable upgrade to me. Thank you both. --RexxS (talk) 15:49, 27 May 2019 (UTC)
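The empty-string counting rules described in this thread follow directly from how gsub reports its match count, which a short Lua sketch can show. Assumptions: this uses the standard string library, so n is a byte count rather than the number of Unicode characters that mw.ustring would give, and count and escape_pattern are illustrative names, not the module's functions.

```lua
-- Hypothetical helper: make every pattern magic character literal.
local function escape_pattern(s)
	return (s:gsub("[%(%)%.%%%+%-%*%?%[%]%^%$]", "%%%0"))
end

-- Hypothetical sketch: count matches of pattern in source, treating missing
-- values as empty strings. gsub's second return value is the match count.
local function count(source, pattern, plain)
	source = source or ""
	pattern = pattern or ""
	if plain then
		pattern = escape_pattern(pattern)
	end
	local _, n = source:gsub(pattern, "")
	return n
end

print(count("abc", ""))   -- empty pattern: one match before and after each byte
print(count("", "a"))     -- empty source, non-empty pattern
print(count(nil, nil))    -- both missing
```

An empty pattern matches at every position including the end of the string, which is where the n+1 figure comes from.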
gsub
This edit request has been answered. Set the |answered= or |ans= parameter to no to reactivate your request. |
Please add, at the end of the function body of function str.replace (that is to say, as of rev. 924313232, on line 390):
str.gsub = str.replace
This will help Lua programmers like me who keep forgetting what Module:String calls gsub.
Module:String name | Lua name (string) | Same? |
---|---|---|
len | :len | |
sub | :sub | |
sublength | N/A | |
match | :match | |
pos | N/A | |
find | :find | |
replace | :gsub | |
rep | :rep | |
escapePattern | N/A[note 1] | |
count | N/A[note 2] | |
join | table.concat ?[note 3] | |
endswith | N/A | |
References
I'd also like:
str.concat = str.join
But understand if you don't agree with that one...
Also, what do people think of me opening another TPER to add string.format etc.? Also, this should have a startswith to go along with endswith, shouldn't it? Psiĥedelisto (talk • contribs) please always ping! 01:55, 15 July 2020 (UTC)
- @Psiĥedelisto: I don't have a strong opinion on the main request, but I would say adding more functions to this module is likely to be feature creep with no benefit. For instance, I fail to see what wikitext template would want to use string.format, given that wikitext syntax is inherently based on formatting strings. Likewise, startswith can be implemented using sub and len, so it doesn't need to be there. endswith only exists because Module:Str endswith existed prior to my mass consolidation of modules last year; I still think it's unnecessary, and that error shouldn't be perpetuated by adding more unnecessary functions. * Pppery * it has begun... 02:55, 15 July 2020 (UTC)
- I agree with Pppery that extra functions are a problem. Once you have aliases, some articles/templates will have string.replace while others will feature string.gsub which will add FUD for onlookers who wonder what gsub is and whether they should be using it instead of replace. Adding features should start with examples of articles/templates that need the feature. Johnuniq (talk) 03:45, 15 July 2020 (UTC)
- I also haven't seen the need for any more functionality in Module:String, particularly as it can only be updated and maintained by administrators. I use Module:String2 for string functions that have much less usage, and it can be edited by template editors. If Psiĥedelisto has sufficient cases where new functionality is justified, he might consider a similar course of action. --RexxS (talk) 18:32, 15 July 2020 (UTC)
- Request disabled, pending consensus — Martin (MSGJ · talk) 14:22, 2 August 2020 (UTC)
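Pppery's point that startswith can be built from sub and len can be illustrated in plain Lua. This is a sketch only: startswith is a hypothetical name, and the standard string library is used, so lengths are in bytes rather than Unicode characters.

```lua
-- Hypothetical sketch: a prefix test using only the sub and len operations.
local function startswith(s, prefix)
	-- Take as many leading bytes of s as the prefix has, then compare.
	return s:sub(1, prefix:len()) == prefix
end

print(startswith("Module:String", "Module:"))
print(startswith("Module:String", "String"))
```

The same comparison could be expressed in wikitext by feeding the result of the module's len function into its sub function, which is the reasoning given above for not adding a dedicated function.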
Documentation of str.match and str._match
When function str._match was extracted from str.match (Special:Diff/896907933), the documentation for the latter was left near the former. Should function str._match be separately documented? It would at least make sense to move the existing documentation near str.match where it belongs. --andrybak (talk) 11:12, 2 August 2020 (UTC)
- @Andrybak: The in-source annotation can be considered as documentation for a function, but it is preferred to have documentation for editors to use in the /doc subpage. Most editors will not examine the source code to find documentation on using the function in articles/templates via #invoke, so the documentation they will read is at Module:String/doc #match. When I extracted str._match, I assumed that the in-source annotation would be read by programmers as applying to both str._match and str.match, but it wouldn't be too difficult to move/duplicate relevant parts of the multi-line comment so that str.match had its own info if you felt it was useful. --RexxS (talk) 15:09, 2 August 2020 (UTC)
- RexxS, ok. The documentation for public facing str.match is indeed more suited for /doc subpage. However, with the current positioning of this big documentation comment above str._match it a) incorrectly refers to function match and b) names some of the parameters incorrectly. Please see Special:Diff/970810690/970811487. --andrybak (talk) 15:18, 2 August 2020 (UTC)
- Sorry, Andrybak, you can't #invoke the function str._match because #invoke passes the parameters in the frame object, and str._match doesn't use it. The named parameters passed from #invoke do not have to have the same names as internal variables. The whole point of str._match is for programmers to use require() in other modules to import the str._match function, which they then can use by passing the individual parameters directly. I've now separated out the annotations for str._match and str.match in the module source code to avoid future confusion. --RexxS (talk) 15:52, 2 August 2020 (UTC)
Invitation to watch Category:Errors reported by Module String
The tracking category of this module—Errors reported by Module String has been cleared. Interested editors are invited to add this category to your watchlists. —andrybak (talk) 12:15, 15 August 2020 (UTC)
Countings blanks/whitespaces
Hi, what's wrong with this syntax ? {{#invoke:String|str_find| abc 123 def |%s}} -1 Shouldn't %s work? --Bouzinac (talk) 21:06, 7 December 2020 (UTC)
- str_find doesn't support patterns. {{#invoke:string|find|abc 123 def|%s|plain=false}} -> 0 works * Pppery * it has begun... 21:10, 7 December 2020 (UTC)
- @Bouzinac: There are some "features" involved in using find, particularly when you remember that using named parameters trims the whitespace from their arguments, while unnamed (positional) parameters don't:
{{#invoke:string|find|source=abc 123 def|target=123|plain=false}} → 5
{{#invoke:string|find|source= abc 123 def|target=123|plain=false}} → 5
{{#invoke:string|find| abc 123 def|123|false}} → 6
- You can use a mixture of positional and named parameters:
{{#invoke:string|find| abc 123 def|123|plain=false}} → 6
- Using pattern-matching characters works until you try to use a positional parameter for |plain=:
{{#invoke:string|find|source=abc 123 def|target=%d|plain=false}} → 5
{{#invoke:string|find|abc 123 def|%d|plain=false}} → 5
{{#invoke:string|find|abc 123 def|%d|false}} → 0
{{#invoke:string|find|source=abc 123 def|target=%s|plain=false}} → 4
{{#invoke:string|find|abc 123 def|%s|plain=false}} → 4
{{#invoke:string|find|abc 123 def|%s|false}} → 0
- If you just want the position of a target string inside a source string, you can alternatively use the posnq function from Module:String2 which strips double quotes from the target string, allowing you to search for spaces directly without using patterns:
{{#invoke:string2|posnq|abc 123 def|" "}} → Script error: The function "posnq" does not exist.
{{#invoke:string2|posnq|abc 123 def|" 123"}} → Script error: The function "posnq" does not exist.
- Note that this currently returns nothing if no match is found:
{{#invoke:string2|posnq|abc 123 def|xyz}} → Script error: The function "posnq" does not exist.
- I'll do some work on posnq to upgrade it. --RexxS (talk) 23:45, 7 December 2020 (UTC)
{{#invoke:string2|posnq|abc 123 def|xyz|nomatch=0}} → Script error: The function "posnq" does not exist.
- I've updated function posnq (and the wrapper Template:posnq) to allow named parameters, Lua patterns, UTC characters, and the ability to return whatever value is desired for no match. --RexxS (talk) 00:59, 8 December 2020 (UTC)
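The results quoted in this thread come down to two things: whether the source has been trimmed, and whether the target is treated as plain text or as a pattern. A plain-Lua sketch makes both visible. The find and trim functions here are stand-ins written for illustration (standard string library, byte positions), not Module:String's implementation.

```lua
-- Hypothetical stand-in: 1-based position of target in source, 0 if absent.
-- When plain is true the target is matched literally, not as a pattern.
local function find(source, target, plain)
	if source == "" or target == "" then
		return 0
	end
	local pos = source:find(target, 1, plain)
	return pos or 0
end

-- Whitespace trimming of the kind applied to named template parameters.
local function trim(s)
	return (s:match("^%s*(.-)%s*$"))
end

print(find(" abc 123 def", "123", false))        -- untrimmed, like a positional parameter
print(find(trim(" abc 123 def"), "123", false))  -- trimmed, like a named parameter
print(find("abc 123 def", "%s", false))          -- %s treated as a pattern
print(find("abc 123 def", "%s", true))           -- %s treated as literal text
```

The last call fails to match because the source contains no literal percent sign followed by "s", which is the situation when pattern matching is not switched on.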
Is it possible to use a wikitable as the source string?
I have this table, and I want to perform a find and replace on it by using the string module. Apparently, MediaWiki can't process the line break in each cell (see w:fa:Special:PermaLink/30566691), so I was trying to use string to remove the line break and fix the table (the table was exported from quarry:/query/50401, which, for unknown reasons, adds that line break), but I couldn't do it. Apparently, String doesn't treat the table as proper wikicode, and my guess is that the pipe character is the problem, as String treats it as a normal pipe that ends the |source= parameter (as a result, String will only get "{" as the source string, as seen here). On a side note, I want to do it by using String because I'm trying to make it as easy as possible to export the results of the SQL query, so that all users can export them in case I'm no longer available. Is there a way to do this, by using String or another template/module? Thank you. Ahmadtalk 12:17, 10 December 2020 (UTC)
- Have you tried the regex search and replace that is available under the advanced menu of the wikitext editor (quizzing glass at far right)? -- Trappist the monk (talk) 12:34, 10 December 2020 (UTC)
- Trappist the monk, thanks. Yes (and it worked), but the problem is that any "extra" step (e.g. using the regex search and replace, or using an external software) will add one more step to the process, and users who have never worked with regex may be discouraged by it (I have, for example, created an on-wiki version of the SQL query that will automatically update the timestamp by using {{#time}} so that other users don't need to edit the code). What I was trying to do was using String in a wrapper template, then preloading the wrapper template on a page, and then just asking users to use that. The wrapper template would then fix the problem with line breaks and add some other necessary templates. Ahmadtalk 12:50, 10 December 2020 (UTC)
- I don't think that Module:string can do what you want; all of the pipes in the table markup will be confused with the pipes required for the {{#invoke:string}}. No idea if this will work but you might try saving the query result in its own page (nothing else in that page). Then, write a module that reads the wikitext from the query-result page, strips the newlines as appropriate and then returns the preprocessed page. The {{#invoke:}} for the module goes in its own page (again, nothing else in that page). If that works, you should see on the {{#invoke:}} page a correctly rendered table. If you do, then transclude the {{#invoke:}} page into wherever the table is needed. If you try this and it works, let me know. -- Trappist the monk (talk) 13:58, 10 December 2020 (UTC)
- @Trappist the monk:: Thank you! I'm actually not good with modules (I can't write them from scratch), but I tried with templates, and it worked. testwiki:User:Ahmad252/query1 is the raw output. testwiki:User:Ahmad252/query1/fix/template uses testwiki:User:Ahmad252/query1 as the source string of the String module, and removes the line break. Many thanks! Ahmadtalk 18:38, 10 December 2020 (UTC)
Comma separated values
I'm looking for a template which will extract items in a list separated with a comma. It could be used in the following way:
|para1={{extract from list|{{{list}}}|1}}
|para2={{extract from list|{{{list}}}|2}}
- etc.
I've seen {{first word}} but I need something that will get the second, third, etc. words too. Any advice would be appreciated — Martin (MSGJ · talk) 12:59, 14 December 2020 (UTC)
{{stringsplit|a, b, c|,|2}} -> b * Pppery * it has begun... 16:19, 14 December 2020 (UTC)
- Thanks! -- Martin (MSGJ · talk) 17:02, 14 December 2020 (UTC)
- @MSGJ: Stringsplit allows named parameters instead of positional ones and the separator can be enclosed in double quotes ("), which are stripped out, to allow leading or trailing spaces as part of the separator, regardless of whether named or positional parameters are used. In some cases, you may want to include the space in the separator. For example:
>{{stringsplit|a, b, c|,|2}}< → > b< -- this has a leading space in the result because the separator is just ","
>{{stringsplit|a, b, c|", "|2}}< → >b< -- this has no leading space in the result because the space is included in the separator ", "
>{{stringsplit | txt=a, b, c| sep=", "| idx=2}}< → >b< -- this is the same as the above
- I've tried to collect together some general utility string-handling functions in Module:String2, so it's worth checking if you need a function that Module:String doesn't supply. --RexxS (talk) 21:36, 14 December 2020 (UTC)
- In this case I wanted to accept with or without spaces, but not to output the space. So I am now using {{trim}} to remove the extra space — Martin (MSGJ · talk) 22:00, 14 December 2020 (UTC)
>{{stringsplit|a, b, c|,|2}}<
→ b
>{{stringsplit|a,b,c|,|2}}<
→ b
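For anyone who wants to see the splitting logic outside wikitext, here is a minimal Python sketch of the behaviour described in this thread. The function name `string_split` and its arguments are hypothetical stand-ins for {{stringsplit}}, not the template's actual implementation; the quote stripping follows RexxS's description above.

```python
def string_split(txt: str, sep: str = ",", idx: int = 1) -> str:
    """Return the idx-th (1-based) piece of txt split on sep, or '' if out of range."""
    # Assumption: enclosing double quotes are stripped so that the separator
    # can carry leading or trailing spaces, as described in the thread.
    if len(sep) >= 2 and sep.startswith('"') and sep.endswith('"'):
        sep = sep[1:-1]
    parts = txt.split(sep)
    return parts[idx - 1] if 1 <= idx <= len(parts) else ""


print(repr(string_split("a, b, c", ",", 2)))          # separator is just ","
print(repr(string_split("a, b, c", '", "', 2)))       # space is part of the separator
print(repr(string_split("a, b, c", ",", 2).strip()))  # split on ",", then trim
```

The last call mirrors the approach Martin settled on: split on the bare comma so that input with or without spaces is accepted, then trim the result.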
Read parameter
Template:WikiProject Doctor Who/sandbox contains the parameter definition |tf 1={{{Torchwood-task-force|}}}
I would like to read the contents of that template and extract the parameter name "Torchwood-task-force". I can use
{{Findpagetext|text=tf 1|title=Template:WikiProject Doctor Who/sandbox}}
to get 835 which I think is the character position of the "t". So now I would like to extract the text from that page starting at position 843 and ending at the next "|" encountered. Can someone help me with this please? — Martin (MSGJ · talk) 10:36, 21 December 2020 (UTC)
- @Martin: you can use the functions supplied by Module:Page to work with page content. The simplest is to get the page content:
{{#invoke:Page |getContent |Template:WikiProject Doctor Who/sandbox}}
→
{{#invoke:WikiProject banner/sandbox|main |PROJECT = Doctor Who |substcheck=<includeonly>{{subst:</includeonly><includeonly>substcheck}}</includeonly> |category={{{category|}}} |listas={{{listas|}}} |IMAGE_LEFT = TARDIS-trans.png |QUALITY_SCALE = extended |class={{{class|}}} |importance={{{importance|}}} |MAIN_TEXT = This _PAGETYPE_ is within the scope of WikiProject Doctor Who, an attempt to build a comprehensive and detailed guide to Doctor Who and its spin-offs on Wikipedia. If you would like to participate, you can edit the article attached to this notice, or visit the project page, where you can join the project and/or contribute to the discussion. |tf 1={{{Torchwood-task-force|}}} |TF_1_LINK = Wikipedia:WikiProject Doctor Who/Torchwood |TF_1_NAME = the Torchwood task force |TF_1_NESTED = Torchwood |TF_1_IMAGE = Red T.png |TF_1_MAIN_CAT = Torchwood articles |PORTAL = Doctor Who |DOC = auto }}
- By default the content is wrapped in <pre>...</pre> tags.
- You can then manipulate that text with the usual string-handling functions:
{{#invoke:String |match |s={{#invoke:Page |getContent |Template:WikiProject Doctor Who/sandbox}} |pattern=tf 1={%{%{([%a_%s-]*) }}
→ Torchwood-task-force
- That will find parameter names that are made up of ascii letters, underscores, spaces and dashes (%a_%s-). Cheers --RexxS (talk) 17:18, 21 December 2020 (UTC)
- Great! Thanks — Martin (MSGJ · talk) 19:25, 21 December 2020 (UTC)
- @RexxS: how can I get it to recognise the pattern when there are extra spaces, e.g. |tf 1 = {{{Torchwood-task-force|}}}? — Martin (MSGJ · talk) 14:05, 24 December 2020 (UTC)
- @Martin: use %s* to match zero or more whitespace characters (includes tabs, etc.), so the pattern would become tf 1%s*=%s*{%{%{([%a_%s-]*). Cheers --RexxS (talk) 18:58, 24 December 2020 (UTC)
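As a rough cross-check of the whitespace-tolerant pattern, here is a Python sketch using the `re` module. It only approximates the Lua pattern from the thread (Python's `[A-Za-z]` stands in for `%a`, and `\s` for `%s`); the helper name `task_force_param` is made up for illustration and is not part of Module:String or Module:Page.

```python
import re

# Approximate Python translation of the Lua pattern  tf 1%s*=%s*{%{%{([%a_%s-]*)
PARAM_RE = re.compile(r"tf 1\s*=\s*\{\{\{([A-Za-z_\s-]*)")


def task_force_param(wikitext: str):
    """Return the parameter name passed to |tf 1=, or None if the pattern is absent."""
    match = PARAM_RE.search(wikitext)
    return match.group(1) if match else None


print(task_force_param("|tf 1={{{Torchwood-task-force|}}}"))
print(task_force_param("|tf 1 = {{{Torchwood-task-force|}}}"))
```

The capture stops at the first character outside the class, which in these inputs is the `|` that follows the parameter name.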
"Replace" behaves differently when I name parameters vs. when I don't
In Template:AfC comment/sandbox, if I specify source= or pattern=, the results seen in Template:AfC comment/testcases behave differently, despite the documentation implying that behavior should be the same regardless of whether these parameters are named or left unnamed. Why is that? wbm1058 (talk) 17:14, 4 February 2021 (UTC)
- @Wbm1058: When you use named parameters in any template, any whitespace is trimmed from the start and end of the parameter value. So named parameters will not "see" newlines at the start or end of the |source= parameter. Here's how it works in Template:AfC comment and Template:AfC comment/sandbox (I've shown {{tl|paragraph break}} so you can see where it would appear):
Positional parameters:
{{#invoke:String|replace|
|
+|{{tl|paragraph break}}||false}}
Result: {{paragraph break}}
Named parameters:
{{#invoke:String|replace|source=
|pattern=
+|replace={{tl|paragraph break}}|count=|plain=false}}
Result:
- Hope that helps. --RexxS (talk) 17:52, 4 February 2021 (UTC)
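The difference can be modelled in a few lines of Python. This is only a sketch of the trimming rule RexxS describes, not MediaWiki's parser: `parser_arg` and `replace_newlines` are hypothetical helpers, and the regular expression `\n+` stands in for a pattern of one or more newlines.

```python
import re


def parser_arg(raw: str, named: bool) -> str:
    """Model the rule from the thread: named values are trimmed, positional ones are not."""
    return raw.strip() if named else raw


def replace_newlines(source: str) -> str:
    """Replace each run of newlines with a visible marker."""
    return re.sub(r"\n+", "{{paragraph break}}", source)


positional = replace_newlines(parser_arg("\n", named=False))
named = replace_newlines(parser_arg("\n", named=True))
print(repr(positional))  # the newline survives, so it is replaced
print(repr(named))       # the newline was trimmed away before the replace ran
```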
If the string begins by #, it's interpreted as wikitext for a numbered list item
Here are examples:
test{{#invoke:String|replace|#test|a|a}}test
gives:
test
- testtest
Same thing with join:
test{{#invoke:String|join||#test|a|a}}test
gives:
test
- testaatest
Just returning the input string gives the same result.
Is there any way around that? I'm trying to replace #0000aa,#ff8000 by #0000aa","#ff8000 in mw:Template:Graph:Lines/sandbox for the colors parameter. The RedBurn (ϕ) 19:37, 22 February 2021 (UTC)
- Ok, I worked around it by moving the " so that the String module is called with a string that begins with " instead of #. The RedBurn (ϕ) 21:35, 22 February 2021 (UTC)
- @The RedBurn: It sounds like you've run into phab:T14974 - a longstanding bug in MediaWiki which can't be fixed without breaking several templates on many MediaWiki wikis. Working around it is probably your best bet. — Mr. Stradivarius ♪ talk ♪ 01:26, 23 February 2021 (UTC)
- Thanks for the info! The RedBurn (ϕ) 13:41, 23 February 2021 (UTC)
Not the output I expect
{{#invoke:string|replace|""abc""|['"]['"]|"}}
→ ""abc"" (not as expected)
{{#invoke:string|replace|""abc""|""|"}}
→ "abc" (as expected)
{{#invoke:string|replace|''abc''|['"]['"]|"}}
→ abc (not as expected)
{{#invoke:string|replace|''abc''|''|"}}
→ "abc" (as expected)
{{#invoke:string|replace|"'abc'"|['"]['"]|"}}
→ "'abc'" (not as expected)
{{#invoke:string|replace|"'abc'"|"'|"}}
→ "abc'" (as expected)
Or am I doing something wrong? Sb008 (talk) 22:27, 12 April 2021 (UTC)
- You need to specify |plain=false to use patterns as arguments. * Pppery * it has begun... 22:43, 12 April 2021 (UTC)
{{#invoke:string|replace|""abc""|['"]['"]|"|plain=false}}
→ "abc" (works indeed) --Sb008 (talk) 23:20, 12 April 2021 (UTC)
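The plain-versus-pattern distinction can be reproduced with a small Python sketch. The `replace` function below is a hypothetical model, not the module's code: literal mode uses `str.replace`, and pattern mode uses Python's `re.sub` as a stand-in for Lua patterns (the character class `['"]` happens to mean the same thing in both).

```python
import re


def replace(source: str, pattern: str, replacement: str, plain: bool = True) -> str:
    """Literal replacement by default; treat `pattern` as a regex only when plain is False."""
    if plain:
        return source.replace(pattern, replacement)
    return re.sub(pattern, replacement, source)


quote_pair = "['\"]['\"]"  # two consecutive quote characters, single or double

print(replace('""abc""', quote_pair, '"'))               # literal search: nothing matches
print(replace('""abc""', quote_pair, '"', plain=False))  # pattern search: both pairs collapse
```

With the default literal mode, the text `['"]['"]` is searched for verbatim, which is why the first three "not as expected" results above come back unchanged.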
String truncation fails with template calls
I am working on a template for use with political party color templates. However, several templates using this module fail to trim hex color codes when they come from templates like Template:Hugpong ng Pagbabago/meta/color. I am just trying to remove the initial '#' (if there is another way, please let me know), but I get a weird problem which you can see in my sandbox (2nd, 3rd and 4th lines; check the code as well). How can I solve this? Julio974◆ (Talk-Contribs) 13:44, 11 July 2021 (UTC)
- I suspect that the problem is the <nowiki>...</nowiki> tags used in #F8F9FA to prevent the # from being treated as ordered list markup:
<nowiki>#CE1126</nowiki>
- The content of <nowiki>...</nowiki> tags is saved as plain text early in the rendering process and replaced with a nowiki stripmarker (a more-or-less complete stripmarker is shown in your sandbox third example). At the very end of processing, the stripmarkers are replaced with the text that was initially saved. In your use case, for example this:
{{HexShade|{{party color|Hugpong ng Pagbabago}}|0.8}}
- #F8F9FA is executed first and returns the stripmarker. That means that {{HexShade}} gets something that looks somewhat like this:
{{HexShade|'"`UNIQ--nowiki-00000000-QINU`"'|0.8}}
- A possible solution (not tested) is to replace the content of #F8F9FA with this:
<nowiki/>#CE1126
- That seems to prevent the ordered list markup and still returns the stripmarker but it also returns the hex color. So, you can then do this:
{{#invoke:String|match|s={{party color|Hugpong ng Pagbabago}} |pattern=#(%x%x%x%x%x%x)$ |plain=false |nomatch=}}
- Here, I mimic the above by setting |s= to the value returned by the modified #F8F9FA:
{{#invoke:String|match|s='"`UNIQ--nowiki-00000001-QINU`"'?#CE1126 |pattern=#(%x%x%x%x%x%x)$ |plain=false |nomatch=}}
- CE1126
- —Trappist the monk (talk) 14:49, 11 July 2021 (UTC)
- This is interesting, but the problem is that the hex code is nowikied in all party color templates; I can't change hundreds or thousands of templates that are already well used, and doing so may risk breaking other templates. I would need a solution that requires no change to the party color templates as they currently are. Julio974◆ (Talk-Contribs) 18:40, 11 July 2021 (UTC)
- You can use {{unstrip}} to decode the nowiki tag.
{{HexShade|{{unstrip|{{party color|Hugpong ng Pagbabago}}}}|0.8}}
-> F7A1AA * Pppery * it has begun... 19:29, 11 July 2021 (UTC)
Count counts disjoint matches only
Count counts disjoint matches only, i.e., the cursor advances and starts again after the end of the previous match.
Upshot: "{{#invoke:String|count|ababababab|aba}}"
→ "2", not four. I've modified the doc page to clarify this because the previous wording was misleading. Mathglot (talk) 17:46, 14 October 2021 (UTC)
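The two counting conventions are easy to compare in Python. This is an illustrative sketch rather than the module's code: `count_disjoint` relies on `str.count`, which also counts non-overlapping occurrences, and `count_overlapping` is a hypothetical helper showing where the figure of four comes from.

```python
def count_disjoint(source: str, target: str) -> int:
    """Count matches, resuming the search after the end of each match."""
    return source.count(target)


def count_overlapping(source: str, target: str) -> int:
    """Count matches, resuming the search one character after each match start."""
    count = 0
    start = source.find(target)
    while start != -1:
        count += 1
        start = source.find(target, start + 1)
    return count


print(count_disjoint("ababababab", "aba"))     # 2: matches at offsets 0 and 4
print(count_overlapping("ababababab", "aba"))  # 4: matches at offsets 0, 2, 4 and 6
```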
Bug with this module
You are invited to join the discussion at Wikipedia:Village pump (technical) § String find bug. {{u|Sdkb}} talk 23:51, 1 December 2021 (UTC)
Function to find in a substring?
I have a bit of a complex problem to solve.
My first attempt to implement this code in a template:
{{#ifexpr: {{#invoke:string|find|{{:{{SUBJECTPAGENAME}}}}|<search string>}}|<string found>|<string not found>}}
caused a template loop because SUBJECTPAGENAME transcludes the template. But I expect the recursive transclusion to be towards the end of SUBJECTPAGENAME and the string I'm searching for to be towards the beginning. So my second attempt is to search just the first 500 characters, to avoid triggering the template loop:
{{#ifexpr: {{#invoke:string|find|{{#invoke:string|sub|{{:{{SUBJECTPAGENAME}}}}|1|500}}|<search string>}}|<string found>|<string not found>}}
This works if SUBJECTPAGENAME is replaced with the actual subject page name but fails with this error when I try substituting the {{#invoke:string|sub:
- String Module Error: String subset index out of range
I assume the index (500) is out of range because it's trying to search the raw code {{:{{SUBJECTPAGENAME}}}} rather than the transcluded page because the page is >500 characters long and the template loop call is located after the 500th character.
{{#invoke:string|sub|{{:{{SUBJECTPAGENAME}}}}|1|500}} works fine when it stands alone. The index out-of-range error only happens when this is nested inside of the {{#invoke:string|find so I assume this syntax is just too complex.
Is there an {{#invoke:string|find-in-sub| function that combines the functions of |find| and |sub| so I don't have to nest these? If not, can someone write one for me? Thanks. – wbm1058 (talk) 15:02, 18 February 2022 (UTC)
- |find| has a start parameter: The index within the source string to start the search, defaults to 1
- Maybe all I need is a stop parameter: The index within the source string to stop the search, defaults to end-of-string – wbm1058 (talk) 15:17, 18 February 2022 (UTC)
- (edit conflict) This is an XY problem. Try using Module:Page to get the raw wikitext of the page ({{#invoke:page|getContent|{{SUBJECTPAGENAME}}|as=raw}}), and then doing whatever string magic you are trying to do. That said, if I had to guess, the direct problem you are experiencing is that you need to subst {{SUBPAGENAME}} as well, or otherwise it doesn't get expanded when you try to substitute the string call. * Pppery * it has begun... 15:18, 18 February 2022 (UTC)
- Thanks Pppery. I wasn't aware of that function to get the raw wikitext. That gets me yet teasingly closer to a solution that works. Getting the raw wikitext is I suppose a cleaner way to avoid the template loop. Now there are no errors when I substitute. But, the problem is that I need to substitute, and my magic is only a viable solution if it works with only transclusion. My specific problem is demonstrated at Talk:David Ayers. That is populating Category:Articles with talk page redirects. But if I {{subst:R from move/except/sandbox}} it populates Category:Redirects for discussion with talk page redirects, the new category my code is intended to implement. I need that category populated with transclusion, not just substitution. I believe my proposed solution above might do that. – wbm1058 (talk) 16:06, 18 February 2022 (UTC)
- Does Special:Diff/1072625351 fix the problem? * Pppery * it has begun... 16:57, 18 February 2022 (UTC)
- Indeed that does! Doh. I initially did try searching for #invoke:RfD but wasn't finding it because of the transclusion! And {{#invoke:page|getContent|{{SUBJECTPAGENAME}}|as=raw}} was the perfect solution for that, rather than the dead-end path I tried. I've been working on this on-and-off for the past day, so very happy to finally have a solution. Thanks! wbm1058 (talk) 17:55, 18 February 2022 (UTC)
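For reference, the "find with a stop index" idea floated earlier in this thread would look roughly like the Python sketch below. This is purely illustrative: `find_in_prefix` and its `stop` argument are hypothetical (Module:String's find has no such parameter as far as this discussion shows), and the thread was ultimately resolved with Module:Page instead. The return convention follows find's documented behaviour of 1-based indices with 0 for "not found".

```python
def find_in_prefix(source: str, target: str, stop: int = 500) -> int:
    """Return the 1-based index of target within the first `stop` characters, or 0."""
    # str.find returns -1 when there is no match, so adding 1 yields 0.
    return source[:stop].find(target) + 1


page = "#REDIRECT [[Talk:Example]]\n" + "x" * 600 + "{{late template call}}"
print(find_in_prefix(page, "#REDIRECT"))      # found at the very start
print(find_in_prefix(page, "late template"))  # beyond the 500-character window
```

Note that a match straddling the cut-off is not reported, since the search only ever sees the truncated prefix.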
Alphabetical order
Is there any Lua function I can use to convert a list into alphabetical order? — Martin (MSGJ · talk) 22:21, 22 February 2022 (UTC)
- {{Sort list}} * Pppery * it has begun... 22:24, 22 February 2022 (UTC)
- Thanks — Martin (MSGJ · talk) 14:19, 23 February 2022 (UTC)
Protected edit request on 1 March 2022
This edit request has been answered. Set the |answered= or |ans= parameter to no to reactivate your request.
Please change:
If named parameters are used, Mediawiki will
automatically remove any leading or trailing whitespace from the parameter.
to:
If named parameters are used, MediaWiki will
automatically remove any leading or trailing whitespace from the parameter.
(emphasis mine). 🐶 EpicPupper (he/him | talk) 22:19, 1 March 2022 (UTC)
- If this is about the documentation, that is not protected so you can fix it yourself.
- —Trappist the monk (talk) 22:29, 1 March 2022 (UTC)
Noting sandbox test for returning all matches
Making a note of the test for enabling str._match() to return all matches at once (done a year ago by User:Andrybak) before I commandeer the sandbox. —wqnvlz (talk · contribs); 21:14, 4 April 2022 (UTC)
Use circular buffer in str._match() reverse mode
"Surely we can do better than finding every match in the source string and returning just the last one!", I thought. But there really is no way to search backwards; we can't even reverse both strings because of Unicode. How vexing! A bit of memory can be saved, though, by keeping only the last match_index matches and returning the oldest kept match, wrapping the index so no elements need to be shifted. I've done this in the sandbox and have added some more reverse mode cases to Module talk:String/testcases. —wqnvlz (talk · contribs); 08:00, 7 April 2022 (UTC)
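The circular-buffer idea can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the sandbox code: the function name `nth_match_from_end` is made up, `k` plays the role of the absolute value of a negative match_index, and Python's `re.finditer` stands in for the module's forward scan with Lua patterns.

```python
import re


def nth_match_from_end(source: str, pattern: str, k: int):
    """Return the k-th match counting from the end (k=1 is the last), or None."""
    if k < 1:
        raise ValueError("k must be at least 1")
    ring = [None] * k  # holds only the k most recent matches
    count = 0
    for match in re.finditer(pattern, source):
        ring[count % k] = match.group(0)  # overwrite the oldest slot; nothing is shifted
        count += 1
    if count < k:
        return None  # fewer than k matches in total
    # The slot that would be overwritten next holds the oldest kept match.
    return ring[count % k]


print(nth_match_from_end("a1b22c333", r"\d+", 1))  # last match
print(nth_match_from_end("a1b22c333", r"\d+", 2))  # second from last
```

Memory use is bounded by `k` rather than by the total number of matches, although the whole source string is still scanned from the front.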