Wikipedia:Lua/Requests/Archive 1


New unit test framework

I created a very simple unit test framework at Module:UnitTests for Lua scripts on Wikipedia. A simple example from Module:Bananas/tests:

-- Unit tests for [[Module:Bananas]]. Click talk page to run tests.
local p = require('Module:UnitTests')
 
function p:test_hello()
    self:preprocess_equals('{{#invoke:Bananas | hello}}', 'Hello, world!')
end
 
return p

I've got a larger sample up at Module:ConvertNumeric/tests. It'd be great to start developing unit tests for all scripts to help guard against regressions. Let me know if you have any suggestions for improvements - feel free to add additional methods as needed to Module:UnitTests. The same framework can also be used to run tests on ordinary templates. Dcoetzee 01:06, 25 February 2013 (UTC)

The above is good and should work well. I'll outline my experiences in case anyone is looking to compare the output of a new module with the output from an existing template. I use two test systems for a project I'm working on with another editor. First is a test system that he created for a specific purpose. An example can be seen at testcases area. According to the docs, it is faster to use frame:expandTemplate instead of frame:preprocess (although the latter is simpler as it takes plain wikitext). I switched Module:ConvertTestcase from using the latter to the former because we are getting script timeout errors (scripts stop after a total of 10 seconds runtime per page), but it did not help. See here and skip down to "Weird timing" for more. Currently, it appears that invoking a simple template which calls a big module has very little overhead with either of the frame calls, but invoking a complex template has a very big overhead however it's done.
My second test system runs a Lua script on a local computer. It feeds hundreds of lines of test cases into a module under test in a second. Each line is of the form:
{{convert|9680|sqyd|acre|abbr=on}}       9,680 sq yd (2.00 acres)
On my system, the LHS invokes the module I am working on, and the RHS is the expected result; the RHS came from Special:ExpandTemplates. My test program is not a wiki-friendly procedure, and it requires that the module can be run as a standard Lua script on the local computer, and it would need a fair bit of work before use elsewhere, but I'm mentioning it for completeness. Johnuniq (talk) 02:17, 25 February 2013 (UTC)
I've now added the method "preprocess_equals_preprocess" to Module:UnitTests to do simple comparisons of scripts against templates. However in the case of complex templates there may be no simple way to deal with the timeout other than to use an offline tool or to subst the result of the template invocations. The nice thing about a local tool is that in principle it could perform the ExpandTemplates automatically and cache the results when the template page hasn't been modified. Dcoetzee 02:59, 25 February 2013 (UTC)
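The line-based local harness described above can be sketched roughly as follows. This assumes each test line holds the wikitext, a run of two or more spaces, and then the expected output; `expand` is a hypothetical stand-in for whatever runs the module under test, not the actual test program.

```lua
-- Hypothetical stand-in for the module under test: it only knows one case.
local function expand(wikitext)
    local known = {
        ["{{convert|9680|sqyd|acre|abbr=on}}"] = "9,680 sq yd (2.00 acres)",
    }
    return known[wikitext]
end

-- Split "input   expected" at the first run of two or more whitespace characters.
local function parseLine(line)
    return line:match("^(.-)%s%s+(.+)$")
end

-- Run every line; count passes and collect the lines that failed.
local function runTests(lines)
    local passed, failures = 0, {}
    for _, line in ipairs(lines) do
        local input, expected = parseLine(line)
        if input and expand(input) == expected then
            passed = passed + 1
        else
            table.insert(failures, line)
        end
    end
    return passed, failures
end

local passed, failures = runTests({
    "{{convert|9680|sqyd|acre|abbr=on}}       9,680 sq yd (2.00 acres)",
    "{{convert|1|sqyd|acre|abbr=on}}       not what the stand-in returns",
})
print(passed, #failures)
```

Keeping the expected results in plain text lines means they can be regenerated from Special:ExpandTemplates and cached, as suggested above.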

Detect IP addresses

Task
Determine whether a string is a valid IPv4 or IPv6 address (using regular expressions). This can replace the current method of detecting IP addresses in templates such as Template:AfC talk which cannot detect some IPv6 addresses and takes strings which have only numbers and special characters to be IP addresses, even if they are not valid. jfd34 (talk) 17:00, 27 February 2013 (UTC)
Small note: Although it was suggested to do this with regular expressions, it's actually easier to do with a combination of regular expressions and other methods (parse integer, make sure < 256, etc.) We may also want to see what pattern MediaWiki excludes users from having because they look too much like IP addresses, and make sure we stay within the confines of that. Dcoetzee 00:43, 28 February 2013 (UTC)
Working on this now at Module:IPAddress with unit tests written first at Module:IPAddress/tests. Dcoetzee 00:46, 28 February 2013 (UTC)
if you write the tests, please try:
--[[
functions are not "local", so other modules can require this module and call them directly.
we return a table with 3 small stub functions to call the "real" ones, in a scribunto fashion,
so the functions can be called from templates also.
]]

function isIpV6( s )
    local dcolon, groups
    if type( s ) ~= "string"
        or s:len() == 0
        or s:find( "[^:%x]" ) -- only colon and hex digits are legal chars
        or s:find( "^:[^:]" ) -- can begin or end with :: but not with single :
        or s:find( "[^:]:$" )
        or s:find( ":::" )
    then
        return false
    end 
    s, dcolon = s:gsub( "::", ":" )
    if dcolon > 1 then return false end -- at most one ::
    s = s:gsub( "^:?", ":" ) -- prepend : if needed
    s, groups = s:gsub( ":%x%x?%x?%x?", "" ) -- remove and count valid groups
    return ( ( dcolon == 1 and groups < 8 ) or ( dcolon == 0 and groups == 8 ) )
        and ( s:len() == 0 or ( dcolon == 1 and s == ":" ) ) -- might be one dangling : if original ended with ::
end

function isIpV4( s )
    local function legal( n ) return ( tonumber( n ) or 256 ) < 256  end -- 0 is true.
	
    if type( s ) ~= "string" then return false end
    local p1, p2, p3, p4 = s:match( "^(%d+)%.(%d+)%.(%d+)%.(%d+)$" ) 
    return legal( p1 ) and legal( p2 ) and legal( p3 ) and legal( p4 )
end

function isIp( s )
    return isIpV4( s ) and "4" or isIpV6( s ) and "6"
end

-- return a table to allow the functions to be called from templates
return {
    ["isIpV6"] = function( frame ) return isIpV6( frame.args[ 1 ] ) and "1" or "0" end,
    ["isIpV4"] = function( frame ) return isIpV4( frame.args[ 1 ] ) and "1" or "0" end,
    ["isIp"] = function( frame ) return isIp( frame.args[ 1 ] ) or "" end
}
ipV6 is notoriously more complex, and i'm not at all sure i got all the subtleties.
peace - קיפודנחש (aka kipod) (talk) 06:03, 28 February 2013 (UTC)
Comment
Maybe use a single, three-way function "isIp": nil (no), 1 (V4) or 2 (V6). 1 and 2 evaluate to "true". Gives detailed answers when that distinction V4/V6 is needed. -DePiep (talk) 09:17, 28 February 2013 (UTC)
changed the code above and added generic "isIp()" per your suggestion. i'm not sure whether "negative" should be indicated as "0" or empty string - if anyone has strong opinion about it they are welcome to change. "nil" is an OK return value for functions called from other lua functions, but the function called by the template should return a string (can be empty). קיפודנחש (aka kipod) (talk) 15:40, 28 February 2013 (UTC)
I am not that sure about my nil proposal, I thought within Lua "0" evaluates to true. Proceed as you think best. -DePiep (talk) 18:25, 1 March 2013 (UTC)
i created 2 sets of functions: one to be called by other Lua functions, in other lua modules, and one to be called from templates. The set meant for other Lua modules returns lua boolean values for isIpV4 and isIpV6, and "6", "4" or false for isIp. the set for calling from templates returns "1" or "0" for isIpV4 and isIpV6, and "4", "6" or "0" for isIp.
We can easily change the "negative" indication for the 2nd set from "0" to "" (empty string), but i believe all "scribunto" functions (i.e., those that accept a frame object and are called from templates) must return a string and not any other value type. returning anything else might even be plain wrong.
peace - קיפודנחש (aka kipod) (talk) 19:04, 1 March 2013 (UTC)
 Done I modified the suggested script to use objects for the template entry points per convention, implemented a set of unit tests at Module talk:IPAddress/tests, and created a trivial template at Template:IsIPAddress, used like e.g. {{IsIPAddress|192.168.0.1}} producing 4, 6, or empty string. The unit tests are all passing so the above code seems to be in good shape. I also modified Template:AfC_talk to use the template. Dcoetzee 03:22, 6 March 2013 (UTC)
I have added {{trim}} around param 1 in the template {{IsIPAddress}}. The module returns a blank (not an IP address) when the argument has leading or trailing spaces; in wikicode we cannot depend on a different outcome when different legal code options are applied (e.g., with or without using 1=). -DePiep (talk) 07:47, 6 March 2013 (UTC)
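On the truthiness question raised in this thread: in Lua only nil and false are falsy, so 0, "0" and the empty string all count as true. The sketch below illustrates that, plus two hypothetical helpers (`toTemplateResult` and `trim` are names made up here, not part of Module:IPAddress) showing how a Lua-side result might be turned into a template-side string and how whitespace could be stripped before validation.

```lua
-- In Lua only nil and false are falsy; 0, "0" and "" all count as true.
local function truthy(v)
    return v and true or false
end

-- Hypothetical helper: map a Lua-side result ("4", "6" or false) to the
-- string that a template entry point hands back to wikitext.
local function toTemplateResult(result)
    return result or ""
end

-- Hypothetical helper: strip leading and trailing whitespace.
local function trim(s)
    return (s:match("^%s*(.-)%s*$"))
end

print(truthy(0), truthy("0"), truthy(""), truthy(nil), truthy(false))
print(toTemplateResult("4"), toTemplateResult(false) == "")
print("[" .. trim("  192.168.0.1 ") .. "]")
```

Because "0" is truthy inside Lua, a "0" return value is only a negative signal on the template side, where #if tests for an empty string instead.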

Extract substring (M:String.sub): a wrapper

I have created {{str mid}} that employs Module:String.sub to get a substring. This .sub is very tough on input quality (indexes must be within limits, requiring to pre-think string length before using!). I have template-coded a friendlier wrapper version, with the extra option to ask by one index and substring length (instead of two indexes). The template (Module .sub function) should replace most of such string manipulation templates, at least in future usage. See also: {{str mid/core}} {{str mid/testcases}}.

The template (not yet the module) tries to make the best out of the parameters, preventing errors and adding smart default values. It does function, but it is a bit clumsy (being in template code, dealing with blank input, etc.). I propose we add a wrapper routine to the module, that incorporates this template / proposal.

Proposal buildup
  • Use Module:String.sub internally. We only need a wrapper that handles the basic input values.
  • i, j are as defined in the Module:String.sub. Negative values are accepted.
  • i, j can get default values: i=1, j=len
Basic parameters

All params are optional. For readability here names i, sublen, j are used; in the template they are the unnamed params 2, 3, 4.

{{str mid
| 1=[string]
| i=
| sublen=[length of substring]
| j=
}}
  • adding: parameter sublen for requested substring length. Entering sublen=n is used to calculate i or j.
  • sublen can be negative too: counting backwards from i (or forward! from j).

Missing values are calculated:

j = i + sublen - 1
i = j - sublen + 1
note: the "1" alters sign when sublen<0

Assigning value to i and j in current code (as of 6 March 2013):

i={{#if:{{{2|}}}|{{{2|}}} <!-- i has input -->
|{{#ifexpr:({{#if:{{{3|}}}|1|0}} and {{#if:{{{4|}}}|1|0}}) <!-- sublen and j have input, so we can calculate -->
|{{#expr:{{min|{{{4|}}}|{{#invoke:String|len|{{trim|{{{1|}}}}}}}}}-{{{3|}}}+{{#ifexpr:{{{3|}}}<0|-1|1}}}} <!-- calculation -->
|1<!-- cannot calculate, so default to index 1 -->
}}}}

j={{min|{{#invoke:String|len|{{trim|{{{1|}}}}}}} <!-- j never exceeds string length --> 
|{{#if:{{{4|}}}|{{{4|}}} <!-- j has input -->
|{{#if:{{{3|}}} <!-- we can calculate j -->
|{{#expr:{{#if:{{{2|}}}|{{{2|}}}|1}}+{{{3|}}}+{{#ifexpr:{{{3|}}}<0|1|-1}}}} <!-- calculation -->
| <!-- cannot calculate, will default to len (string) --> }}
}}}}


If sublen<0 then the module (not the user) must switch i and j (after value assignment) for correctness.

Leftover issues
  • One new error can arise: if the user enters three params that do not fit: i=5, j=8, sublen=11 ? → error.
  • Defaults too: when len=0 or sublen=0 → return blank (nullstring). Why check for problems?
  • To be decided: what should happen when a user enters "sublen=-3" only? (answer: count from the end by same logic)
  • The goal is: please note that the user will have many options (intuitive options, I say), all within the same logic: not having to define "i=1" or "j=min(j, string len)", and directly asking for a substring length whatever the big string len is.

-DePiep (talk) 00:08, 2 March 2013 (UTC)

Just wondering if another option would be to add something so that if {{str mid|Abcdefg|4|5}} is called, rather than erroring or returning blank if ignore errors=true, it could return what character are available which would be "defg" -- WOSlinker (talk) 00:44, 2 March 2013 (UTC)
That is exactly current intended default behaviour! Except: something is wrong in the demo, one needs to use i= named param :-( : (resolved)
live code: {{str mid|Abcdefg|4|200}} → {{str mid|Abcdefg|4|200}}
Reason is fundamental: a user (editor) should not have to worry about string length. Maybe the option should be: |please show errors=yes. More like a debug-option. -DePiep (talk) 01:22, 2 March 2013 (UTC)
Code adjusted and debugged; unnamed params only. Changed source code above accordingly. Request unchanged. -DePiep (talk) 11:17, 6 March 2013 (UTC)
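As a rough illustration of how the proposed wrapper might look in Lua rather than template code, here is a minimal sketch. The function name `strMid` is hypothetical, and several choices are assumptions of the sketch rather than decisions from the proposal: only positive i and j are handled, sublen is ignored when both i and j are given, and the open "sublen=-3 only" case is not specially treated.

```lua
-- Hypothetical wrapper: all of i, sublen and j are optional.
local function strMid(s, i, sublen, j)
    local len = #s
    if i and j then sublen = nil end             -- assumption: i and j win over sublen
    if len == 0 or sublen == 0 then return "" end

    local backwards = sublen ~= nil and sublen < 0
    local one = backwards and -1 or 1            -- "the 1 alters sign when sublen<0"

    local first, last = i, j
    if first == nil then
        if sublen and j then
            first = math.min(j, len) - sublen + one   -- i = j - sublen + 1
        else
            first = 1                                 -- default i
        end
    end
    if last == nil then
        if sublen then
            last = (i or 1) + sublen - one            -- j = i + sublen - 1
        else
            last = len                                -- default j
        end
    end

    if backwards then first, last = last, first end  -- the wrapper, not the user, swaps
    first = math.max(first, 1)
    last = math.min(last, len)                        -- j never exceeds string length
    if first > last then return "" end
    return s:sub(first, last)
end

print(strMid("Abcdefg", 4, 200))
print(strMid("Abcdefg", 5, -3))
print(strMid("Abcdefg", nil, 3, 6))
```

The clamping at the end is what lets a call ask for more characters than exist without raising an error.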

mw.log question

What exactly is the incantation I need to do to use the mw module? For some reason, calling mw.log is causing Module:LegendRJL to throw an error, and I can't figure out why.

Also, is there any way to get a more specific error message than the frustratingly opaque "Script error"? I get that we don't want to overwhelm users with irrelevant error messages when they happen accidentally, but it makes learning the language extremely difficult when the number of things that could be the cause of the error approaches infinity because the message is so nonspecific. —Scott5114 [EXACT CHANGE ONLY] 04:27, 8 March 2013 (UTC)

Click on 'script error' to get the full error message and stack trace. In your case it's not any problem with mw.log, but a string concatenation error because a is nil. Toohool (talk) 05:13, 8 March 2013 (UTC)
Interesting, I hadn't tried that; didn't think to see if that was a link. Thanks. However, even after resolving the problem that makes the script error, I am not able to pass a string to the console. —Scott5114 [EXACT CHANGE ONLY] 05:24, 8 March 2013 (UTC)
How are you trying to test it? If I run =p.colspan({getParent = function() end, args = {}}) from the debug console on the module edit page, it prints "Test, test", followed by the function output. If you're using #invoke to run it, the logs are not going to be displayed anywhere, unless you explicitly call mw.getLogBuffer() and display it somehow. Toohool (talk) 06:32, 8 March 2013 (UTC)
I made the faulty assumption that the code could print to the console when using the "show preview" function to preview a page that uses a template that uses #invoke to call the module. Thanks for the information; I've learned something about how the environment works. I'm going to add a few notes to the documentation just in case anyone makes the same faulty assumption. —Scott5114 [EXACT CHANGE ONLY] 07:46, 8 March 2013 (UTC)
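One way to see log output from an #invoke call is to have the entry point append the log buffer to its own return value while debugging. The sketch below assumes a hypothetical `debug=yes` argument; the fallback `mw` table exists only so the example runs outside Scribunto and is not Scribunto's implementation.

```lua
-- Outside Scribunto there is no mw table, so fall back to a tiny stand-in
-- (an assumption of this sketch, not Scribunto's implementation).
local mw = mw or (function()
    local buffer = {}
    return {
        log = function(...)
            local parts = {}
            for n = 1, select("#", ...) do
                parts[n] = tostring((select(n, ...)))
            end
            table.insert(buffer, table.concat(parts, "\t"))
        end,
        getLogBuffer = function()
            return table.concat(buffer, "\n")
        end,
    }
end)()

local p = {}

-- Entry point that appends the log buffer to its output when debug=yes.
function p.colspan(frame)
    mw.log("Test, test")
    local out = "result"
    if frame.args.debug == "yes" then
        out = out .. "\nLOG: " .. mw.getLogBuffer()
    end
    return out
end

print(p.colspan({ args = { debug = "yes" } }))
-- A real module would end with: return p
```

From the debug console the same function can be called directly, in which case the logged text is shown without any extra plumbing.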

Lua standard libraries

The standard library provided by Lua is quite sparse. Lua is, as they say, "batteries not included". There are a number of projects out there that aim to provide an "extended standard library", such as Penlight, stdlib, Underscore.lua, and Nucleo. All these libraries could be imported as Scribunto modules with little modification, other than removing those parts that depend on unavailable functionality like file I/O.

Should we have a guideline about such "swiss army knife" libraries? Ban them altogether? Pick one and stick with it? Some factors to consider:

  • Performance It's easy to imagine a complex module ultimately importing 10 or more Penlight or stdlib modules, with over 250K of code. If a module is repeatedly invoked on a single page, all its dependencies are repeatedly loaded and executed as well. It could add up fast, especially with templates like {{convert}} and citation templates that can be used many times on a page. But maybe for contexts outside of these high-use templates, the performance hit will be acceptable.
  • Standardization Everyone has their preferred library, and it seems inevitable that eventually they'll all be ported to Wikipedia. Will editors have to master 20 different libraries to work on intricate templates? Will the steeper learning curve be worth the improved ease of development for those who do know the libraries?
  • Version lock-in Once we import the current version of, say, Penlight, and it becomes widely used, it would be very risky to ever update to a newer version, as some pages will surely rely on bugs and undocumented behaviors in the current version. It seems we would effectively be forking our own version frozen in time, warts and all.

These factors seem to cut against using these libraries, but the alternative, I think, is that we'll basically be developing our own core libraries piecemeal, without the years of testing and refinement that some of these projects have gone through. So I think it might be best to act early and import one or more of these libraries to preempt some of the inevitable reinvention of wheels. What does everyone else think? Toohool (talk) 09:54, 4 March 2013 (UTC)

i do not think this should be discussed and decided on enwiki. i think it will be much more sane to discuss and decide about it on the extension page in mediawiki (mw:Extension:Scribunto, if i'm not mistaken). it will be very miserable affair if enwiki will decide to standardize on one of those, frwiki on another, dewiki on a third, and itwiki, hewiki and eswiki will decide to go commando. same as with JS: until mediawiki decided to standardize on jQuery, all wikis went commando. imagine what a mess we would have if some wikis had decided to use different frameworks prior to the introduction of jQuery into core. despite the fact that JS scripts and gadgets are not unified, few people realize how much cross-pollination exists in the JS space between different wikis (just to illustrate: practically every single wiki i sampled so far offers the "hotcat" gadget, many offer navpop, and several offer cat-a-lot). peace - קיפודנחש (aka kipod) (talk) 16:08, 4 March 2013 (UTC)
If all major Lua scripts have proper unit tests, integrating a new version of a library is not actually that big of a deal. The tests would detect most problems caused by the upgrade. To avoid disruption, it could be tested on an offline fork of enwp, then any necessary changes are made by a bot all at once. Standardization would be immensely helpful since many Lua scripts are bound to be copied between projects (Incidentally, forking of scripts between projects is an opportunity for bugs that are fixed in one to not be fixed elsewhere, which is quite problematic.) However, not a lot of what we do requires very complex Lua, and Scribunto has its own custom string libraries and stuff which obviously haven't been integrated yet with these libraries, so at the moment I'm opting against including one. Dcoetzee 05:06, 6 March 2013 (UTC)
Dcoetzee summarized it well in pointing out that a lot of what we do doesn't intersect greatly with the added features in those libraries; look at all the things we removed from the built-in libraries. It's also worth noting that Scribunto's mw object is basically a "standard library" in itself, and that any third-party library that might be integrated would need to be pure Lua or would need to have a pure-Lua fallback to support the copying of our modules to non-WMF wikis that might be using the LuaStandalone engine. BJorsch (WMF) (talk) 15:14, 11 March 2013 (UTC)

Penlight, which seems to me the best candidate for adoption, is pure Lua. Most of its modules could probably be imported as Scribunto modules with little modification, and deal with general topics like data structures, algorithms, functional programming, and common string and table utility methods. They're the sorts of things that could be useful to anyone building moderately complex modules. It may seem like overkill for the modules that are currently being written to replace existing templates, but I think our modules will grow more complex as people figure out more cool things to do with Scribunto that weren't previously feasible, like Module:DiscussionIndex, and a richer standard library would certainly be a boon in that process. It's not something that has to be decided today, but maybe just something to think about for the future. Toohool (talk) 07:42, 13 March 2013 (UTC)

Discussion index / summary

I've been playing around with ways to use Lua to generate a summary of existing discussion pages. My present efforts are rough with a number of bugs, see Module talk:DiscussionIndex, but give a flavor of the kinds of things that should be possible. The code for this is presently spread across Module:DiscussionIndex and Module:ParsePage. I also know that Wnt (talk · contribs) was working on something similar with Module:RDIndex (see: Template:RDIndex). I think something like this might be useful for accessing a variety of talk pages and archives. Dragons flight (talk) 21:21, 28 February 2013 (UTC)

I was away for a while, but it looks great now! The moment I saw it it became the #1 way to view the discussion. :) I do still worry about the overall server load/cost with both yours and mine - is there any way we can monitor what that is as we go forward? Wnt (talk) 15:03, 11 March 2013 (UTC)
It should be possible to significantly reduce load by using the recently added mw.title:getContent() to access the raw wikitext, rather than relying on the preprocess template expansion. In case you haven't found it yet, any page that uses Lua has an HTML comment added near the end that explicitly mentions the Lua runtime. I suspect that for most discussion pages we can parse them in a second or less. Dragons flight (talk) 22:12, 13 March 2013 (UTC)
This sounds great, but... looking at [1], which as of this revision of the module is currently set to list (I hope) every mw.thingy down to six levels, I see no "getcontent" in there except "getcontentLanguage" bits, and the value of mw.title.getcontent is nil. I don't see any comments about it in the HTML source either. Where did you see this? Wnt (talk) 15:32, 14 March 2013 (UTC)
I was a little imprecise. getContent() is a method of a title object, and is not called on mw.title directly. For a usage example, see Module:CiteConversionTest. Dragons flight (talk) 22:55, 14 March 2013 (UTC)

Turning a string into a variable name

I know that a.test = a["test"], which permits you to create/access an arbitrary field of any array based on parameters passed into a module.

Is there a way to do this without the a. part?

I wrote Module:DisplayLuaTableContents to look at what all the stuff in mw. actually is (sometimes it's not clear to me if something is a text field or a function that returns text, for example) Admittedly this approach fails miserably on mw.title.new() fields, for example; you have to know the "password" to a function in advance, i.e. what kind of thing it expects, to get things out of it.

Even so, I'd like to be able to replace mw. with an arbitrary passable parameter so that I could explore the contents of any variable in some script I write like a black box, which too often it seems like it is. Wnt (talk) 19:09, 15 March 2013 (UTC)

If I understand correctly, you want to be able to do something like {{#invoke: DisplayLuaTableContents | main | foo }}, where foo is the name of a global variable to be inspected? You're looking for _G, the global variable table. _G['foo'] should give you the global variable foo. Toohool (talk) 19:40, 15 March 2013 (UTC)
Thanks! I'd sort of read that a little, but I'd totally not gotten the point. Indeed, I can even use a _G field to display the contents of _G: [2]. Wnt (talk) 21:04, 15 March 2013 (UTC)
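The _G lookup can be extended to dotted names, so that a string such as "sample.title.text" is walked one field at a time. A minimal sketch, assuming string keys only; `resolve` and the `sample` global are hypothetical names used for illustration:

```lua
-- A global to inspect (illustrative only).
sample = { title = { text = "Main Page" }, count = 3 }

-- Resolve a dotted name such as "sample.title.text", starting from _G.
-- Returns nil as soon as any step along the path is missing.
local function resolve(path)
    local value = _G
    for key in path:gmatch("[^%.]+") do
        if type(value) ~= "table" then return nil end
        value = value[key]
    end
    return value
end

print(resolve("sample.count"))
print(resolve("sample.title.text"))
print(resolve("sample.missing.field"))
```

Only globals are reachable this way; local variables never appear in _G.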

Taxonomy of weird Lua pseudo variables

As we mess with these libraries, there are all sorts of "variable types" that it isn't that easy to tell from the manual what you can do with them. For example:

  • Lists separated by commas. a,b,c = d,e,f.
  • Title objects. Stuff that has the type() = "table" but tostring(of it)="one random field", which are immune to ipairs and pairs.
  • Iterator functions. Every time I try to use gmatch there is wailing and gnashing of teeth. I seem to flop around like a fish out of water with it, unable to get it to do whatever I want unless I precisely copy some known usage.
  • mw.loadData stuff where pairs will work but next() won't.

I feel like this list is only barely begun... Wnt (talk) 19:10, 15 March 2013 (UTC)

a,b,c is not a "list", per se. it's just an efficient way to declare and assign variable, and for functions to return multiple values. there are 2 useful "tricks" that go with these "multi-value values": you can extract the 1st one by enclosing in parens, and you can convert to a list by enclosing in curlies:
function f() 
    return 1,2,3 
end
a,b,c = f()  -- now a == 1; b == 2, and c == 3.
a,b,c = ( f() ) -- now a == 1, b == nil, c == nil.
a = { f() } -- a == {1, 2, 3}, i.e., a[1] == 1; a[2] == 2; a[3] == 3;
Don't know how much this clarifies things. i also never wrote a lua enumerator (i.e. something that calls "yield"); i suggest you leave this to wizards with higher magic powers for now. peace - קיפודנחש (aka kipod) (talk) 21:28, 15 March 2013 (UTC)
Actually, RRTFM I realize that as it's written now, an "expression list" really is explained as a very distinct object, and conceptually I suppose that it's easy to say that (exp-list) extracts the first value, {exp-list} converts to a numeric table. (At the moment I'm not sure if there's some easy way to convert a numeric table to a list... I should play in my sandbox a bit) Wnt (talk) 00:15, 16 March 2013 (UTC)
Is unpack() what you're looking for? Anomie 00:24, 16 March 2013 (UTC)
Thanks - trying this [3] I see it works. I'm not sure if any 'variable' can hold a comma-list, but since unpack({a,b,c}) = abc and so forth it is always possible to port these lists around. Wnt (talk) 00:37, 16 March 2013 (UTC)
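On whether a variable can hold a comma-list: a plain variable keeps only the first value, so the usual approach is to pack the values into a table and unpack them when needed. A small sketch (the fallback on the first line is there because unpack is a global in Lua 5.1, which Scribunto uses, and lives in the table library in later versions):

```lua
local unpack = table.unpack or unpack

local function f()
    return 1, 2, 3
end

local held = f()                 -- a plain variable keeps only the first value
local packed = { f() }           -- a table keeps all of them
local a, b, c = unpack(packed)   -- and unpack turns the table back into a list
local count = select("#", unpack(packed))

print(held, #packed, a, b, c, count)
```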
coroutine.yield() and the rest of the coroutine library aren't available in Scribunto. It's possible to write an iterator function without using coroutines, BTW. BJorsch (WMF) (talk) 22:44, 15 March 2013 (UTC)
Regarding title objects, I submitted a patch to fix that: gerrit change 54091 BJorsch (WMF) (talk) 22:44, 15 March 2013 (UTC)
For using something that returns an iterator, like pairs or mw.ustring.gmatch, it's easy: you just stick a call to the function into a for loop as shown in the example for pairs(). Writing an iterator function is a bit more complicated; Chapter 7 of Programming in Lua might help with that. But it's not often that you need to write an iterator function. BJorsch (WMF) (talk) 22:44, 15 March 2013 (UTC)
OK - reading this I'm starting to get a clue. I totally failed to understand that when I created a variable a=string.gmatch(x,pattern) that the 'function' a was not simply a copy of the latter statement. Now I realize that I can actually have control by setting a variable and presumably that control lets me scan independently for the next occurrence of this and that in this and that string on the fly, without having to grab up every occurrence at any particular time. I think something like [4] (with a little editing) would be very useful to have in the local manual. Wnt (talk) 01:00, 16 March 2013 (UTC)
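The point about holding an iterator in a variable can be illustrated with a short sketch: each call to the function returned by string.gmatch yields the next match, two such iterators advance independently, and a closure is enough to write your own iterator without coroutines. The name `countTo` is made up for this example.

```lua
-- string.gmatch returns an iterator function; each call yields the next match.
local nextWord = string.gmatch("alpha beta gamma", "%a+")
local first = nextWord()
local second = nextWord()

-- A second iterator over another string keeps its own position.
local nextNumber = string.gmatch("10 20 30", "%d+")
local number = nextNumber()
local third = nextWord()

-- A closure-based iterator: no coroutines needed.
local function countTo(limit)
    local i = 0
    return function()
        i = i + 1
        if i <= limit then return i end   -- returning nil ends the for loop
    end
end

local sum = 0
for value in countTo(4) do
    sum = sum + value
end

print(first, second, number, third, sum)
```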
It's unfortunate that there's no simple way to make next() work on objects where the __pairs metamethod is used. You can cheat a bit to get something that partially works:
    local mynext, surrogate, k0 = pairs( object )

    -- Now this is mostly like next( object, nil )
    local k1, v1 = mynext( surrogate, k0 )

    -- And this is mostly like next( object, k1 )
    local k2, v2 = mynext( surrogate, k1 )
The caveat is that calling mynext( surrogate, kn ) a second time might return kn+1 and vn+1 again as it does with next(), or it might ignore what you passed and return whatever km and vm it is up to based on the number of calls. FWIW, this trick will work like next() on something returned from mw.loadData(), but will work in the other way on frame:args. BJorsch (WMF) (talk) 22:44, 15 March 2013 (UTC)
TWISI, each language has its "style", or "flavor" if you will. if it's not easy to create an iterator, then we should learn how to write elegant code without iterators, instead of using them.

however, there are some utility functions which are actually pretty natural, which i would love to see in mw.libraryUtil, which can make life more convenient:

-- split a table into a list of its keys and a list of its values
mw.libraryUtil.unzip = function( t )
    if type( t ) ~= 'table' then return end
    local tk, tv = {}, {}
    for k, v in pairs( t ) do 
        table.insert( tv, v )
        table.insert( tk, k )
    end
    return tk, tv
end

-- build a table from a list of keys and a list of values
mw.libraryUtil.zip = function( tk, tv )
    if type( tk ) ~= 'table' or type( tv ) ~= 'table' then return end
    local res = {}
    for i = 1, math.min( #tk, #tv ) do res[tk[i]] = tv[i] end
    return res
end

-- create an empty object whose missing fields are looked up in t
mw.libraryUtil.new = function( t )
    return setmetatable( {}, { __index = t } )
end

the first two are self-explanatory. the last one will allow you to do things like

x = mw.libraryUtil.new( table )

x:insert( 12 )
y = x:remove()
x:sort(comp)
-- etc.

or
j = { a = function( t ) table.insert( t, 44 )  end }
k = mw.libraryUtil.new( j )
k:a()
peace - קיפודנחש (aka kipod) (talk) 02:03, 16 March 2013 (UTC)

Table with autonumber field

It would be great if we have something like this;

{{Autonumber table
| 1 = {| class="wikitable"
|-
| ### || abc || def
|-
| ### || ghi || jkl
|}
}}

which display

1 abc def
2 ghi jkl

--Nullzero (talk) 05:46, 16 March 2013 (UTC)

Is Template:Autotable5 good enough?
  • Template:Autotable5 can handle narrow tables by empty parameters: To have only 2 data columns, in an auto-numbered table, just put empty parameters for columns 3/4/5, as with:
    {{Autotable5
    | numfmt = width=20px | colfmt1 = width=90px | colfmt2 = width=105px
    | row1col1 | row1col2 | | |
    | row2col1 | row2col2 | | |
    | row3col1 | row3col2 | | |
    }}

{{Autotable5 | numfmt = width=20px | colfmt1 = width=90px | colfmt2 = width=105px | row1col1 | row1col2 | | | | row2col1 | row2col2 | | | | row3col1 | row3col2 | | | }}

Using {autotable5} is very fast, where a short table, of only 10 rows, can reformat within 1/10 second. See: Template:Autotable5, for more examples, and setting the style/width of each column. -Wikid77 (talk) 18:10, 16 March 2013 (UTC)
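For comparison, the "###" placeholder idea from the original request is straightforward on the Lua side. A minimal sketch, assuming the table wikitext reaches the function as a single string (how it would be passed through a template parameter is not addressed here); `autonumber` is a hypothetical name:

```lua
-- Replace each "###" placeholder with the next row number.
local function autonumber(wikitext)
    local n = 0
    return (wikitext:gsub("###", function()
        n = n + 1
        return tostring(n)
    end))
end

local tbl = table.concat({
    '{| class="wikitable"',
    "|-",
    "| ### || abc || def",
    "|-",
    "| ### || ghi || jkl",
    "|}",
}, "\n")

print(autonumber(tbl))
```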

The current article history template is a mess; it should be redone through Lua. (For example, "for" loops would greatly simplify it.) -- Ypnypn (talk) 15:03, 14 March 2013 (UTC)

General suggestions

String manipulation templates

Everything in the "String manipulation templates" category ought to get converted; that'll be a great performance gain. Sumana Harihareswara, Wikimedia Foundation Engineering Community Manager (talk) 01:06, 13 March 2013 (UTC)

Almost all of them were converted days ago. Dragons flight (talk) 22:09, 13 March 2013 (UTC)
See this discussion above. Boghog (talk) 19:07, 15 March 2013 (UTC)

Mathematics templates

It will help performance to convert the mathematics templates. Sumana Harihareswara, Wikimedia Foundation Engineering Community Manager (talk) 01:06, 13 March 2013 (UTC)

You should be more specific. The vast majority of templates in that category have nothing worth converting. Wnt (talk) 14:29, 13 March 2013 (UTC)

Coord template

This bug indicates that the Coord templates would be nice to convert. Sumana Harihareswara, Wikimedia Foundation Engineering Community Manager (talk) 01:12, 13 March 2013 (UTC)

{{coord}} was converted to Lua a week ago. Toohool (talk) 02:18, 13 March 2013 (UTC)

Arabic Wikipedia

If you have time, check out this bug and this comment regarding templates on Arabic Wikipedia that could use optimizing. Sumana Harihareswara, Wikimedia Foundation Engineering Community Manager (talk) 01:18, 13 March 2013 (UTC)

#switch-like function that allows for ranges

I would like to put in a request for a switch function that will allow for matches in a number range. Something along the lines of

{{xswitch | {{{1|}}}
|0..127 = One byte
|128..2047 = Two bytes
|2047..55295 |57088..65535 = Three bytes
|65536..1114111 = Four bytes
|#default = Invalid code point
}}

where {{example|65279}} would return "Three bytes". VanIsaacWS Vexcontribs 12:23, 28 February 2013 (UTC)

Obviously, some changes may need to happen, eg the equal sign may need to migrate to an arrow "->" to avoid named variable syntax. VanIsaacWS Vexcontribs 21:29, 28 February 2013 (UTC)
I did a beginner-level effort at Module:xswitch. See [5] [6] [7] for some sample calls. I programmed it with some goofy features, like the ability to transclude in a "profile" file containing the values separated by "|" and the intentional though perhaps ill-considered feature that if the value is less than the first number it is matched as above the last number, because that's supposed to be the out-of-range value. I have all kinds of paranoid tests in there after getting a large number of nil errors. I won't take it amiss if little remains of my work by the time this sees actual on-wiki use, but hey ... it "works". That is, until you catch some case I missed. Wnt (talk) 04:31, 11 March 2013 (UTC)
So, it looks like from your third example that it matches to the largest value that is less than the input, so the range is expressed as the current match value to less than the next match, is that right? VanIsaacWS Vexcontribs 03:05, 13 March 2013 (UTC)
Yes. With the caveat I should have mentioned that the thresholds have to be in order; at least right now it doesn't sort them (though it could). Wnt (talk) 14:31, 13 March 2013 (UTC)
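The threshold matching described above (pick the largest threshold that does not exceed the input, with thresholds given in ascending order) can be sketched in plain Lua. This is only an illustration of the idea, not the code in Module:xswitch; the function name `threshold_switch`, the table layout and the sample thresholds are assumptions made for this example.

```lua
-- Hypothetical helper, not taken from Module:xswitch.
-- `thresholds` must already be sorted in ascending order of `from`.
local function threshold_switch(value, thresholds, default)
    local result = default
    for _, entry in ipairs(thresholds) do
        if value >= entry.from then
            result = entry.text   -- remember the last threshold we passed
        else
            break                 -- sorted input: no later entry can match
        end
    end
    return result
end

-- Sample data loosely based on the UTF-8 request above
-- (the surrogate gap is ignored to keep the sketch short).
local utf8_length = {
    { from = 0,       text = "One byte" },
    { from = 128,     text = "Two bytes" },
    { from = 2048,    text = "Three bytes" },
    { from = 65536,   text = "Four bytes" },
    { from = 1114112, text = "Invalid code point" },
}
```

Values below the first threshold fall through to `default`, which mirrors the "out-of-range" behaviour mentioned above.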
Added m links. -DePiep (talk) 14:50, 13 March 2013 (UTC)
(edit conflict) i do not think "|2047..55295 |57088..65535 = Three bytes" makes sense, and i'm not even sure it's possible. the parserfunctions extension can use this inconsistent syntax because it takes full control of the parsing, but in a module, there is no sane way to know which parameter comes first and which second. you probably need to change it to use a saner delimiter (instead of |), e.g., "|2047..55295,57088..65535 = Three bytes" or even something like "|77 42 30..39 101 = Unoccupied seats". it's even easy to allow both, i.e., any sequence of anything other than a digit, minus sign or a dot will be considered a delimiter (you'll have to beware not to use 2,000,000 for "2 million", because it will be interpreted as 2 or 0 or 0). also, i can't think of simple and efficient code to ensure no two ranges overlap, and in case of an overlap, there is no good way to guarantee consistent return. for all other cases, this should be simple enough. regarding "type" (i.e., whether the comparison is lexical/string versus numeric): i would not use an extra parameter to indicate which type it is - i'd much rather use two separate functions, and two separate templates. peace - קיפודנחש (aka kipod) (talk) 15:52, 13 March 2013 (UTC)
The example above was not meant to be prescriptive for syntax or delimiters, merely an example. Essentially, I just copied the current #switch syntax and extended it to take a numeric range (and used a simple test range that I worked with on a recent template), and I am well aware that the workings of Lua could require quite different delimiters than the current #switch parser. VanIsaacWS Vexcontribs 22:06, 13 March 2013 (UTC)
I understand the request like this: make it a calculation (logical check), with each option meaning "between x and y". That sounds great. (This way, it is not exactly a more "robust" #switch, but more like an extension of the #switch idea). Still the request is an improvement. Of course, there could be options added (like between [min .. max], [min .. max), (min .. max), that is including/excluding). The "and" operator needs to be clear. The essence is that we introduce a calculated range. That is new. -DePiep (talk) 23:00, 13 March 2013 (UTC)
i created Module:Range. currently it has one function: iswitch. the idea is (maybe) to add "tswitch" for a similar statement that will be based on lexical (textual) rather than numeric comparison. it allows for both ranges and discrete numbers, with an arbitrary number of ranges and discrete numbers per option, separated by spaces.
there are some caveats: if there is an overlap between several ranges such that the value falls in more than one of them, the function will return one of them, but it's unpredictable which one.
another, more insidious caveat is that you can't have "44 = something", because there is no way to tell between this and a genuine unnamed parameter whose number is 44. if you want 44 as a legit value, use "44.", "44..44" or "+44".

see example below:

{{#invoke:Range | iswitch
| 12
| 1 2 3 48 = one, two, three or forty-eight
| -1000000..2.45 = between minus one million and two point forty five
| 60..70 1.5e2..2e9 = between sixty and seventy or between hundred and fifty and 2 billion
| default = none of the above
}}
presuming this meets the requirements, it would be nice if someone can create Module:Range/tests and Module:Range/doc. i might get to it if no other volunteer does. of course, if you write some tests and find that the current code fails, fix it or give me a holler.
peace - קיפודנחש (aka kipod) (talk) 04:41, 14 March 2013 (UTC)
In the example above, what happens/should happen when input is "12"? -DePiep (talk) 15:28, 15 March 2013 (UTC)
what should happen is an interesting question. what actually happens in current code is that this specific input (12 = twelve) is ignored. if you want to have "12" as a stand-alone entry, you can use "+12", "12.", or "12..12". i can change this behavior, but the cost will be having to use a name for the "main" parameter, i.e. the number against which we compare, which currently is the unnamed parameter 1. if we allow "naked" numbers, someone, sooner or later will want to use "1 = One", at which point the whole thing falls apart.
this behavior can be changed easily, if we agree that the switch key will not be parameter #1, but rather a named parameter. what would be a good name? i can think of "value" or "switch", but there may be a better name.
peace - קיפודנחש (aka kipod) (talk) 16:37, 15 March 2013 (UTC)
A named parameter it is then. I think ease of use prevails (instead of elaborate restrictions, if they could be explained at all). Maybe n= for number? -DePiep (talk) 17:41, 15 March 2013 (UTC)
 Done, except i really don't think "n" is descriptive enough, so i called it "value". open to change, of course, when someone will come with better name for the switch variable. Currently 0 testing... if anyone can create Module:Range/tests and Module talk:Range/tests it would be grand, otherwise i hope to get to it sometime. peace - קיפודנחש (aka kipod) (talk) 18:36, 15 March 2013 (UTC)

some additions

I augmented the code a bit to allow for open ranges (i.e., strictly "smaller than/larger than" rather than the normal "smaller or equal"): "(1..2)" means anything between 1 and 2, not including 1 or 2. For completeness, it also allows the completely superfluous "[" and "]", i.e., "[1..2]" is the same as "1..2", and "[1..2)" is the same as "1..2)".

In addition, I added open-ended ranges: "..10" means 10 or smaller, "10.." means 10 or larger, "..10)" means smaller than 10, and "(10.." means larger than 10.
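To make the range syntax above concrete, here is a minimal plain-Lua sketch of how such specifications could be parsed and matched. It is not the code of Module:Range: the names `parse_token`, `in_range` and `iswitch`, and the use of an ordered list of options (so that the first matching option wins, unlike the unpredictable order described earlier), are assumptions made for this illustration.

```lua
-- Parse one token such as "48", "60..70", "(1..2)", "..10)" or "(10..".
-- Returns a table describing the range, or nil if the token is not understood.
local function parse_token(token)
    local open_low  = token:sub(1, 1) == "("
    local open_high = token:sub(-1) == ")"
    -- Strip one optional leading "(" or "[" and one optional trailing ")" or "]".
    local body = token:gsub("^[%(%[]", ""):gsub("[%)%]]$", "")
    local s, e = body:find("..", 1, true)      -- plain find, no pattern magic
    if not s then
        local n = tonumber(body)               -- discrete number, e.g. "48" or "+44"
        if not n then return nil end
        return { low = n, high = n, open_low = false, open_high = false }
    end
    local low_text, high_text = body:sub(1, s - 1), body:sub(e + 1)
    local low  = low_text  == "" and -math.huge or tonumber(low_text)
    local high = high_text == "" and  math.huge or tonumber(high_text)
    if not low or not high then return nil end
    return { low = low, high = high, open_low = open_low, open_high = open_high }
end

local function in_range(value, r)
    local above, below
    if r.open_low  then above = value > r.low  else above = value >= r.low  end
    if r.open_high then below = value < r.high else below = value <= r.high end
    return above and below
end

-- `options` is an ordered list of { spec = "...", text = "..." } entries.
local function iswitch(value, options, default)
    for _, option in ipairs(options) do
        for token in option.spec:gmatch("%S+") do
            local r = parse_token(token)
            if r and in_range(value, r) then return option.text end
        end
    end
    return default
end

local options = {
    { spec = "1 2 3 48",          text = "one, two, three or forty-eight" },
    { spec = "-1000000..2.45",    text = "between minus one million and 2.45" },
    { spec = "60..70 1.5e2..2e9", text = "sixty to seventy, or 150 to two billion" },
}
```

With these options, an input of 12 matches nothing and yields the default, which is the case asked about in the thread above.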

Remember markup has #switch:1 for any condition

Before creating too many variations, remember that wp:parser functions often run faster than Lua-based templates, and #switch:1 can branch by any combination of range-tests or complex multi-part conditions, as explained in Help:Switch. Recall how any branch, of a #switch, can be written as:

{{#ifexpr: {{{a}}}<34.5 and {{{b}}}>8*67-14 or {{{w}}} = 3*{{{a}}}+{{{b}}}|<--then-->1}} = do this branch

The tactic, for #switch:1, is to have each complex condition yield a true value as "1" which matches the #switch value "1" and so that branch is triggered whenever the complex condition is true.

It will be very difficult to out-perform #switch for complex/range conditions, although it might be difficult to remind everyone that #switch can handle numerous, complex conditions. Meanwhile, we really need a Lua function {#invoke:x|fastswitch} to branch on a list of "800" values, much faster than #switch could process 800 branches. -Wikid77 18:40, 16 March 2013 (UTC)

it's not about "reminding everyone". it's about having sane and maintainable syntax. it is possible to use #ifexpr to write something that will do something equivalent to:
{{#invoke:Range| iswitch 
| value = {{some crazy template that returns a number | param1 | param2 | param3 | param4 }}
| -200..-150 -50..-5 5..10 20..40 = range group 1
| (-150..-140 12..17 120..140 = range group 2
| 180..220 (1e7.. = range group 3
| default = none of the above
}}
but it will be an unreadable, non maintainable mess, that will make anyone who stares at it too long blind.
performance is not everything. if we want the articles and the templates to be maintainable, we can't always resort to the most basic syntax, even if sometimes it might perform slightly better (not to mention that usually it doesn't). peace - קיפודנחש (aka kipod) (talk) 01:05, 18 March 2013 (UTC)

HTML help

Do we have a forum for HTML help, or do we do it here? (For example, I've been having a problem with [8] where I'm having trouble figuring how to insert a bunch of labelled divs in absolute mode without them going further and further down from where they're supposed to be due to newlines in the text) P.S. I eventually figured that part out Wnt (talk) 03:34, 17 March 2013 (UTC)

It's not at all unreasonable to use another forum ... however, if one doesn't exist, we definitely should set it up. For many Lua applications the HTML is going to be the trickiest and most important part of the exercise, and of course, the same is also true for many templates. Wnt (talk) 13:39, 17 March 2013 (UTC)

Failure to get the basics with a new module

A couple of times I've run into mystery errors with modules that can't call getParent. In this case [9], the module was then unable to call mw.title.getCurrentTitle(): attempt to call global "getContent" (a nil value). Since it happens at the beginning, so far I've handled this kind of error by abandoning whatever it happened to. But is there an explanation for it? Wnt (talk) 21:38, 17 March 2013 (UTC)

getContent is a method of the title object. So you can do content = page:getContent() or content = page.getContent( page ), but content = getContent( page ) is an error because there is no getContent function in the global space. Dragons flight (talk) 21:44, 17 March 2013 (UTC)
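The difference between the three call forms can be tried outside the wiki with a stand-in object. The `page` table below is a hypothetical substitute for a real title object (which would come from mw.title in Scribunto); only the calling convention is the point here.

```lua
-- Hypothetical stand-in for a title object; real ones come from mw.title.
local page = { text = "Some wikitext" }

function page:getContent()        -- same as: function page.getContent(self)
    return self.text
end

local a = page:getContent()       -- method call: passes `page` as self
local b = page.getContent(page)   -- equivalent explicit form

-- Calling a bare global of that name fails, because no such global exists.
local ok, err = pcall(function()
    return getContent(page)
end)
```

Here `a` and `b` hold the same string, while `ok` is false and `err` carries an "attempt to call ... nil value" message like the one quoted above.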
Crap - I was just getting it all messed up that time - I was sure that these errors were part of some grand concept but it was just really stupid. I was doing better coding than that in the morning, I swear... but thanks! Wnt (talk) 22:08, 17 March 2013 (UTC)
Dang funny how it works though. I had to use "frame:preprocess" in order to get nowiki to work; otherwise the output was all linked. Now if only I can recall how to open the text in an edit window... Wnt (talk) 22:17, 17 March 2013 (UTC)

Article rating

Is it possible to make a module outputting the article rating for any given article? This would help, for example, with keeping the symbols on VA up-to-date. -- Ypnypn (talk) 14:20, 18 March 2013 (UTC)

It's definitely possible. But is there a bot already doing that for the Wikiproject rating tables? I'm not sure which approach would be more efficient, or if it represents a duplication of effort to do it both ways? Wnt (talk) 18:52, 18 March 2013 (UTC)
With a bot, it's hard to use for a one-off case. Let's say I'm trying to keep track of just a few articles on my userpage; it's a pain to find a bot, set it up, disable when I'm finished, etc. Also note that many pages such as WP:Vital articles are not up-to-date at all. (I recently rerated Entertainment from B to GA.) -- Ypnypn (talk) 19:46, 18 March 2013 (UTC)

It could be done by using mw.title:getContent() to get the contents of the article talk page. Parsing it would be a bit tricky, as it would have to figure out which of the templates on the page are WikiProject banners, and decide what to do if there are multiple banners with different ratings, and probably handle some other edge cases. Also, I believe fetching a title object is an "expensive operation", and thus limited to 500 per page, so it wouldn't be usable on a page like WP:VA with so many listings. Toohool (talk) 20:10, 18 March 2013 (UTC)
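As a rough illustration of the parsing step, the following plain-Lua sketch pulls the class= value out of each banner whose name starts with "WikiProject". The function name `banner_ratings` and the sample talk-page text are invented for this example, and the edge cases mentioned above (nested templates inside a banner, banner shells, redirects to banners, conflicting ratings) are deliberately not handled. In a module, the input would presumably come from something like mw.title.new('Talk:Example'):getContent().

```lua
-- Hypothetical helper: collect the class= values of all WikiProject banners.
-- Banners containing nested templates are skipped by this simple pattern.
local function banner_ratings(talk_wikitext)
    local ratings = {}
    for banner in talk_wikitext:gmatch("{{%s*[Ww]iki[Pp]roject[^{}]*}}") do
        local class = banner:match("|%s*class%s*=%s*([^|}]-)%s*[|}]")
        if class and class ~= "" then
            ratings[#ratings + 1] = class
        end
    end
    return ratings
end

-- Invented sample input, not a real talk page.
local sample = table.concat({
    "{{Talk header}}",
    "{{WikiProject Media|class=GA|importance=high}}",
    "{{WikiProject Culture | class = B }}",
}, "\n")

local found = banner_ratings(sample)
```

Deciding what to report when the collected ratings disagree is left to the caller.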

By the way - [10] says the maximum number is 99. How do I see what $wgExpensiveParserFunctionLimit is on this server? Wnt (talk) 20:26, 18 March 2013 (UTC)
View the HTML source of any page and search for "Expensive parser function count". -- WOSlinker (talk) 07:23, 20 March 2013 (UTC)

How does the sandbox/lowercase bug work?

I was so sure that the section in Help:Lua debugging about lowercase problems was out of date that I created [11] to disprove it - and was amazed to find it actually still doesn't work (struck; correction: actually this did work, I just left a stray "module" in it like an idiot, sorry!), but [12] does. Yet there are a bazillion Sandbox/isaac/.. pages that I assume do work? I'd like to make the precise details of the bug clearer.

For now I'll just change it to include the "Module:" at the beginning of the example link given, since you can't run anything in User: space (yeah, I even checked that [13]). Wnt (talk) 16:29, 22 March 2013 (UTC)

Oh, also the text reads: "So, the first step in "sandboxing" is to copy Module:String to the private page User:Lua Developer/sandbox/Module:String. Now our developer can edit the module to her heart's content. At any given moment, she can open Special:TemplateSandbox, using the default "Sandbox prefix", which in her case, will be "User:Lua Developer/sandbox". This means that viewing any page from the sandbox page, whenever the parser encounters a Template T or Module M, it will look first to see if a page named "User:Lua Developer/sandbox/Template:T" or "User:Lua Developer/sandbox/Module:M" exists"

Is all this true for Module:User:Lua Developer/sandbox/ pages? Because I know I can't run the thing they say there. Wnt (talk) 16:33, 22 March 2013 (UTC)

i think you confuse two different things here. there is a "Sandbox" pseudo namespace under the "Module" namespace, where people can place "pseudo private" pages in the public namespace, i.e., the page name is "Module:Sandbox/username/modulename". The section in Help:Lua debugging is about using pages in userspace to develop modules, by exploiting the synergy between the "Scribunto" extension and the "TemplateSandbox" extension. the TemplateSandbox extension has the feature/limitation that it does not handle the "autocasing" (i.e. converting the first character of a pagename to uppercase) as intuitively as expected, so the user has to be cognizant of the inner workings, and make sure the first character of the name _after_ the sandbox prefix she uses is always uppercase. i am the one who wrote the confusing paragraph you referred to, and i actually also opened a bugzilla ticket about it, which was closed with "wontfix", so that's the way it is. if you can find a way to improve on this confusing paragraph that will be clearer, please do it. peace - קיפודנחש (aka kipod) (talk) 17:08, 22 March 2013 (UTC)
The code in both example files does not work with me either (generating "Script error"). I rewrote it and now it is fine: Module:User:DePiep/sandbox2. May I suggest you rewrite the code to prove/demonstrate the bug? -DePiep (talk) 17:32, 22 March 2013 (UTC)
Actually, as weird as that code is [It's called with {{#invoke:user:wnt/sandbox/all lowercase|1}} :)] it did work on both pages. Wnt (talk) 18:47, 22 March 2013 (UTC)
Then, a problem with User:Lua Developer/sandbox/Module:String (in userspace!) is not Module-specific. This is the way WP page naming works. On top of this: saving code in your userspace is fine, but you cannot invoke it from there. i think the paragraph can be removed from the help-page. -DePiep (talk) 17:35, 22 March 2013 (UTC)
no, no, no. Module:User:DePiep/sandbox2 is something else entirely: this page is in the "Module" namespace.
The TemplateSandbox extension is about allowing you to use pages in _user space_ as modules. The page will be User:DePiep/Sandbox/Module:Testmodule. Once you place a module there, open Special:TemplateSandbox, and select "User:DePiep/Sandbox" as "Sandbox prefix". At this point you'll be able to use something like {{#invoke:Testmodule}}, and the system will use the code in "User:DePiep/Sandbox/Module:Testmodule" instead of "Module:Testmodule". However, the automatic uppercasing works only for the first character after the namespace ("User" in this case), so if the page you create will be User:DePiep/Sandbox/Module:testmodule instead of User:DePiep/Sandbox/Module:Testmodule, you will not be able to invoke it from Special:TemplateSandbox, since the automatic uppercasing will _still_ execute on the "invoke" side, but not on the actual page name. hope this is a bit clearer than the explanation in Help:Lua debugging. peace - קיפודנחש (aka kipod) (talk) 17:56, 22 March 2013 (UTC)
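The naming rule explained above can be modelled in a few lines of plain Lua. This is only a model of the behaviour as described in this thread, not code from the TemplateSandbox extension; the function names are made up, and `ucfirst` here only handles ASCII letters.

```lua
-- Uppercase the first character (ASCII only in this sketch).
local function ucfirst(s)
    return s:sub(1, 1):upper() .. s:sub(2)
end

-- Title that will be looked up for {{#invoke:<module_name>}} under a sandbox
-- prefix, assuming the invoke side always uppercases the module name.
local function sandbox_title(prefix, module_name)
    return prefix .. "/Module:" .. ucfirst(module_name)
end

-- True if a page saved under `saved_title` would be picked up.
local function will_be_found(prefix, saved_title, invoked_name)
    return saved_title == sandbox_title(prefix, invoked_name)
end
```

Under this model, a page saved as "User:DePiep/Sandbox/Module:testmodule" is never matched, whichever capitalisation is used in the #invoke, so the "real" module would be run instead.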
I get it. Apart from sandboxing in my own userspace, what would be the profit? I am sandboxing in Module space. -DePiep (talk) 18:01, 22 March 2013 (UTC)
Sorry - I actually just had stray text in the one test and not the other, mimicking what I thought the bug was. As I now understand it, there is no reason to worry about lowercase or uppercase when naming anything in Module: space, i.e. anything that can be viewed outside of the special Template sandbox viewer. Wnt (talk) 18:45, 22 March 2013 (UTC)
Remove all Special:TemplateSandbox associations (help, links, pages, ...) from Module solves it. All these issues should be handled at Special:TemplateSandbox. -DePiep (talk) 22:35, 22 March 2013 (UTC)

let me return the discussion to a question that was asked above: "why do we even need Special:TemplateSandbox to develop modules? why isn't the "pseudo sandbox" under "Module:Sandbox" enough?"

so here's the deal: for developing new scripts, there is no huge advantage. personally, i prefer to work and develop in my user space. maybe i feel that this gives me some sense of "privacy", but in reality there is no advantage (btw: pages in my userspace which happen to have the extension ".js", _are_ protected: only i and users with "editinterface" right (which in enwiki means admins) can edit them. i would love it if the same protection was afforded to pages in my userspace whose name contains "Module:")

so, as i said, to develop a new lua script there is not too much difference. however, when modifying an existing module, especially a module which is in wide usage, the TemplateSandbox extension allows you to test what effect your changes will have on _actual_ pages while the modification is still in your userspace, and only after you test the effect on enough actual pages to feel comfortable, you copy the modified module from your sandbox onto the page in the "Module:" namespace.

in order to do so, you need to be cognizant of the whole uppercase/lowercase debacle, otherwise you might even *think* you are testing the modified module in your userspace, but because of wrong case ("string/String"), you'll be actually running the "real" module and not testing your changes.

peace - קיפודנחש (aka kipod) (talk) 00:50, 23 March 2013 (UTC)

Please add 'note=' option

The module should allow a "note=this test checks negative input" extra column. -DePiep (talk) 00:59, 24 March 2013 (UTC)

Need switch-choice selection

As noted in earlier discussions, some #switch functions have hundreds of branches, and a Lua switch-choice function might run faster for hundreds. As a comparison to the tiny #switch in {yesno}, we could have a Lua function to run a switch-branch check for over 3 hundred choices, such as in Template:Metadata_population_AT-7 (based on German Wikipedia), which has been displaying the town populations in the Tirol region of Austria (since 2011). That template has nearly 288 branches in the #switch, and it would be interesting to #invoke a Lua utility function to compare the 288 town codes to assign each of the 288 population numbers. The general concept is to have a few dozen population templates for each nation, and then copy the latest population numbers (from German WP), so within 1 hour, all "3,000" town articles have been updated from the list of recent population figures. If the Lua switch-choices are fast, then perhaps we might list 2,000 towns in each metadata_population template. And beyond that, the subatomic-particle articles use dozens of formatted particle symbols which might benefit from a quick Lua-lookup to change particle-codes into superscript/subscript glyphs. Previously, each particle symbol had been stored in a separate template, as faster than #switch of many particle codes in one template, but then people became confused by the hundreds of template names and no longer kept all the particles formatted correctly. It is easier to proofread the formatting when many particles are all listed inside one template #switch, rather than tediously check inside hundreds of templates. We have a similar problem for flag-icon templates, where about 300 nations could all use a single "{Country_data_common}" template with a fast switch-choice, and then older years or other territories could continue to use the rare, separate thousands of {Country_data_*}, but only 1 {Country_data_common} for flags of the 300 major nations. 
-Wikid77 (talk) 22:29, 23 March 2013 (UTC)

could not make heads or tails of the above. i'll appreciate it if you could try to summarize it in 3 separate and clear sentences that will state what are you proposing (if indeed you propose something). thanks - קיפודנחש (aka kipod) (talk) 23:38, 23 March 2013 (UTC)
My experiment with Template:SymbolForElement in Template:SymbolForElement/sandbox has a Lua-based switch using loadData running a little more than twice as fast (160 / second vs. 70 / second). In that case there are approximately 270 enumerated cases leading to about 135 possible results. So it does appear that Lua offers an advantage for large switch statements. Not sure where the break-even point is except that 10 cases is probably too few to justify using Lua while 200 cases is probably more than enough. Dragons flight (talk) 05:44, 24 March 2013 (UTC)
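The reason a Lua lookup scales well is that a large switch collapses into a single table access. The sketch below shows the idea in plain Lua with a tiny invented data set; it is not the code in the sandbox mentioned above. On the wiki, the table would normally live in its own data module and be loaded with mw.loadData, so that it is parsed once per page rather than once per #invoke.

```lua
-- Invented sample data; a real table would list every name and atomic number.
local symbols = {
    ["1"]  = "H",  hydrogen = "H",
    ["2"]  = "He", helium   = "He",
    ["26"] = "Fe", iron     = "Fe",
}

-- Normalise the key (lowercase, trimmed) and do one hash lookup,
-- instead of comparing against every branch in turn as #switch does.
local function symbol_for(key)
    local normalised = tostring(key):lower():match("^%s*(.-)%s*$")
    return symbols[normalised] or ""
end
```

Several keys can map to the same result, which matches the "270 cases, 135 results" shape described above.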
  • Using 1,000 parameters exponentially slower: The passing of thousands of switch parameters to Lua incurs an exponential slowdown in the parser, so we need to encode numerous choices inside delimited strings for Lua to search, as selecting the first matching choice, to substring the result. While running tests to simulate a large #switch function of 3,000 branches, I repeatedly confirmed how passing thousands of parameters is typically exponentially slower beyond the first 500 parameters. Hence, passing 2,000 parameters runs 3 seconds, 4,000 parameters runs 13 seconds, or 6,000 parameters runs 27 seconds. It would be much faster to pass a single parameter, as a single string of delimited choices, alternating with resultant values, rather than try to pass 3,000 choices as 6,000 parameters, which has been running about 27 seconds in the parser. Although Lua can still process the 6,000 parameters, to compare choices, within "0.02" Lua seconds, the load time of the #invoke interface is 27 seconds. The NewPP parser hits the huge delays when passing 6,000 parameters whether to Lua, or to a template, or to a #switch parser function of 6,000 "|x=" branches. See expanded description and time-delay formula at: "WT:Lua#Passing 1,000 parameters exponentially slower". More later. -Wikid77 (talk) 12:32, 24 March 2013 (UTC)
  • Delimit choices as perhaps "|^x=7 ^yy= 4^w =zz" form: Because parameters are separated by the vertical-bar pipe "|" then Lua switch-branch choices could be delimited inside a single parameter by caret "^" immediately preceding each choice, as "^xxx=n" where double-caret would be a literal caret as a choice "^^=caret" to exactly match a single '^'. With all choices combined into a delimited string, then many hundreds or thousands of choices could be processed in a fraction of a second. However, for situations of just a few hundred branches, then having another Lua function to compare separate parameters would still be fast enough, while allowing for thousands of choices within a separate delimited-string format. -Wikid77 13:48, 24 March 2013 (UTC)
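A minimal plain-Lua sketch of the caret-delimited format proposed above is given here. The function name `caret_switch` is hypothetical, and the handling of the double caret is one possible reading of the proposal: an entry written as "^^=caret" is taken to have the single character "^" as its key. Results containing a caret are not supported in this sketch.

```lua
local function trim(s)
    return (s:match("^%s*(.-)%s*$"))
end

-- `choices` is one string such as "^x=7 ^yy= 4^w =zz".
-- Returns the result of the first entry whose key equals `value`.
local function caret_switch(value, choices, default)
    -- Each entry: a caret, a key (optionally starting with a literal caret),
    -- an equals sign, then the result up to the next caret.
    for key, result in choices:gmatch("%^(%^?[^=^]*)=([^^]*)") do
        if trim(key) == value then
            return trim(result)
        end
    end
    return default
end
```

Because everything travels in a single parameter, the parser only has to hand one string to #invoke, whatever the number of choices.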
There's also the other type of switch to investigate where lots of subtemplates are used. See {{ISO 639 name}} for an example. -- WOSlinker (talk) 15:29, 24 March 2013 (UTC)

Image sprites

Wikipedia doesn't allow us to insert general purpose <img src=""> into pages, and doesn't let us fool directly with the CSS sheets for general users. Still, I know that if you have a div with overflow hidden, you can display part of an image. Can we get the kind of efficiency that you'd expect from a good page using a single image with the flags of every country to generate image sprites very quickly? Or are we defeated? (Background: [14]) Wnt (talk) 01:26, 24 March 2013 (UTC)

if you have the image url, you can use it as background to a div using the "background-image" property. i believe you can use a "preprocess" call in conjunction with one kind or another of a [[File:image name]] and extract the url from it. not sure if this is a good idea, and i did not look into which of those calls fall within the "expensive" category, but i am pretty confident it can be done using the "background-image" property (i am actually contemplating allowing it for the barchart thingy, as an alternative to solid colored bars). peace - קיפודנחש (aka kipod) (talk) 02:04, 24 March 2013 (UTC)
CSS background-image should also be prevented by MediaWiki. If you find a way to get it to work (without editing MediaWiki-namespace pages or your personal JS/CSS), please email security@wikimedia.org or file a bug in the "Security" component. BJorsch (WMF) (talk) 02:48, 24 March 2013 (UTC)
Hmmm, looking it up quickly, it sounds like you can do this with a regular inline image [15]. I know that I've gotten pieces of images to display in my sandbox, though for some reason I couldn't create clickable links from the very tiny bars I was using in my protein module the last two times I tried (I suppose there's a margin?). Wnt (talk) 04:26, 24 March 2013 (UTC)
That's about the only way that is allowed for "cropping" images in CSS. See also {{CSS image crop}}. Anomie 14:52, 24 March 2013 (UTC)
I have since implemented this as Module:Sprite. Wnt (talk) 17:17, 24 March 2013 (UTC)
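For readers who want to see the cropping trick spelled out, here is a small plain-Lua sketch that builds the wikitext for one sprite: an outer div with hidden overflow, and an inner div that shifts a regular inline image so the wanted region shows through. It is not the code of Module:Sprite or {{CSS image crop}}; the function name, parameter order, exact CSS and the file name used in the usage line are assumptions for illustration.

```lua
-- file:        name of the sheet image (hypothetical in the usage below)
-- sheet_width: width in pixels at which the whole sheet is rendered
-- x, y:        top-left corner of the wanted region, in pixels
-- w, h:        size of the wanted region, in pixels
local function sprite(file, sheet_width, x, y, w, h)
    return string.format(
        '<div style="width:%dpx;height:%dpx;overflow:hidden;">' ..
        '<div style="position:relative;left:-%dpx;top:-%dpx;width:%dpx;">' ..
        '[[File:%s|%dpx]]</div></div>',
        w, h, x, y, sheet_width, file, sheet_width)
end

local flag = sprite("Flags.png", 600, 40, 20, 20, 12)
```

Every sprite cut from the same sheet references the same image at the same size, so the browser only has to fetch it once.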

Transforming page content

I don't think anyone has tried it, but I wanted to point out that Lua provides us the power to write tools that can be used to transform the page text and then permanently save those changes.

For example if you have a script:

local p = {};

function p.run( frame )
    -- Previously saved wikitext of the page being saved
    local title = mw.title.getCurrentTitle();
    local wiki_text = title:getContent();

    ---Operations on wiki_text go here---
    local transformed_text = wiki_text;

    return transformed_text;
end

return p;

And then you take any wiki page and replace it with:

{{subst:#invoke:MyModule|run}}

When you save, it will look up the previously saved wikitext of the page, perform your transformation, and then save the transformed text.

This technique could be used to generate any number of cleanup tools, e.g. sorting categories and references. Anyway, I just want to point out this possibility because I suspect that most people haven't been thinking about using Lua for things like this. Dragons flight (talk) 22:08, 13 March 2013 (UTC)
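As one possible "operation on wiki_text", the category-sorting idea could look roughly like the plain-Lua function below. The name `sort_categories` is hypothetical and the approach is an assumption for illustration: it moves every category link to the end of the text in sorted order, and it does not try to respect comments, nowiki sections or categories emitted by templates.

```lua
-- Move all [[Category:...]] links to the end of the text, sorted alphabetically.
local function sort_categories(wiki_text)
    local categories = {}
    -- Remove each category link (and one trailing newline), remembering the link.
    local body = wiki_text:gsub("%[%[Category:[^%]]*%]%]\n?", function(link)
        categories[#categories + 1] = (link:gsub("\n$", ""))
        return ""
    end)
    if #categories == 0 then
        return wiki_text              -- nothing to do
    end
    table.sort(categories)
    body = body:gsub("%s+$", "")      -- drop whitespace left behind at the end
    return body .. "\n\n" .. table.concat(categories, "\n")
end
```

Inside the run function shown above, the transformation step would then amount to assigning the result of sort_categories( wiki_text ) to the returned text.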

Of course, every template transforms page content. This idea depends on old subst: behaviour. Please explain what Lua can add. -DePiep (talk) 23:28, 14 March 2013 (UTC)
Do you know if this could be used to push content into the edit summary? I know that I have often been doing a repetitive task, where I am copying an infobox/template/other content into multiple articles in a category, and I can copy the article content, filling in values and information as I go, but the edit summary tends to be ad hoc. Being able to copy the edit summary along with the new article content would be incredibly helpful. VanIsaacWS Vexcontribs 13:29, 25 March 2013 (UTC)

Speedy for module

I want to delete a page in Module space (Module:User:DePiep/sandbox2). How to? {{db-g7}} (author self deletion) is not accepted. -DePiep (talk) 01:38, 25 March 2013 (UTC)

Good question. I deleted the page for you, but I have no idea how/if we can slap templates in module space. Martijn Hoekstra (talk) 09:29, 25 March 2013 (UTC)
Thanks. Of course, that problem was my point. -DePiep (talk) 10:12, 25 March 2013 (UTC)
The only way would be to put it on the module's doc page. Anomie 12:41, 25 March 2013 (UTC)

Unit tests

Every now and then people mention "unit tests", as a type of documentation, or above as something every module should have. The term is not used in the Lua Reference Manual. There's stuff on the Web about them whose point entirely escapes me (if false is true?? [16]) With Unit testing I start to get the idea, but it is still very abstract -- why should authors of a single module used in only one Wiki do it? Is it really possible to compose a few small sample inputs, if that's what they mean and know the module works how you want it to in every regard? Can someone please write a help file on Lua unit testing, saying what it is, who should do it, why they should do it, and how to do it? Thanks! Wnt (talk) 15:40, 14 March 2013 (UTC)

  • An essay is likely the best description about unit testing: I have explained the following issues in new essay "wp:Lua unit testing". The general concept is there are related issues of "unit testing" combined with larger "integration testing" to pass a customer's wishes in "acceptance testing" where not every possible option must work because "YAGNI" (You Aren't Gonna Need It) when the module is in actual use for real "customers". After a major upgrade, the whole subsystem typically goes through "regression testing" to again ensure major features still work. If the product will be in wide-scale use for "industrial strength" work, then better run "stress testing" to ensure it can handle the heavy workload, such as testing Lua-based cites with a huge article which contains 450 citations or such. See the essay for more details. -Wikid77 (talk) 08:50, 15 March 2013 (UTC)
  • Unit tests are ubiquitous in all development environments now, but are particularly vital in dynamic languages like Lua, where they essentially supplant the role of compiler errors. Their main purpose is that when you modify a script and/or its dependencies, you need to verify that your modifications haven't messed it up in some unexpected way and caused it to no longer function as expected. Doing this manually is possible but may be tedious - unit tests can be run easily after every single edit, allowing errors to be caught sooner, which in turn makes them easier to isolate and fix. Of course unit testing, like all testing, is never comprehensive, and test suites are generally expanded over time to address new bugs and unanticipated cases. Dcoetzee 02:36, 20 March 2013 (UTC)
  • I have noticed two things. First, it seems that errors terminate the full test run. Is it possible to catch some of them with xpcall()? In mw:Extension:Scribunto/Lua reference manual#Differences from standard Lua it is indicated that not all errors can be caught, but maybe catching some will already help. Second, a fleeting inspection (so don't burn me at the stake if I'm wrong!) of deep_compare seems to indicate it will run into an infinite loop when I try to compare t1 and t2 constructed from t1 = {}; t1[1] = t1; t2 = {}; t2[1] = t2. Martijn Hoekstra (talk) 15:10, 25 March 2013 (UTC)
You're right about deep_compare, I just added that yesterday and wanted to keep it quick and simple, so I grabbed an implementation that doesn't handle circular references. I've added a note to the docs about it for now. Toohool (talk) 01:48, 26 March 2013 (UTC)
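For reference, one common way to make a deep comparison safe against circular references is to remember which pairs of tables are already being compared. The sketch below is a generic plain-Lua illustration under that assumption, not the implementation in Module:UnitTests; the name `deep_equal` is made up, and table-valued keys are compared by identity only.

```lua
-- Compare two values structurally; `seen` records table pairs in progress.
local function deep_equal(a, b, seen)
    if a == b then return true end
    if type(a) ~= "table" or type(b) ~= "table" then return false end

    seen = seen or {}
    seen[a] = seen[a] or {}
    if seen[a][b] then
        return true          -- this pair is already being compared further up
    end
    seen[a][b] = true

    for k, v in pairs(a) do
        if not deep_equal(v, b[k], seen) then return false end
    end
    for k in pairs(b) do
        if a[k] == nil then return false end   -- key present only in b
    end
    return true
end

-- The circular case from the comment above.
local t1 = {}; t1[1] = t1
local t2 = {}; t2[1] = t2
local circular_ok = deep_equal(t1, t2)
```

Treating a revisited pair as equal is safe because any real difference between the two structures is still found on some other branch of the comparison.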

CSV Parser

I've written a CSV parser at Module:CSV, which could be used by other modules. (It was actually far more complicated than expected due to the difference between quoted and unquoted fields) and would appreciate comments and code-review. Martijn Hoekstra (talk) 18:19, 24 March 2013 (UTC)

3 comments: 1st one, is that i do not think CSV makes sense for this module/template - as mentioned above, i think we should allow for "language-formatting", which means for enwiki that numbers can appear as 12,432.33 and for some other languages it may mean allowing comma as "decimal point". 2nd comment: you may want to look into mw.text.split() (see mw:Extension:Scribunto/Lua reference manual). this module is not yet in the official deployment, but it will be soon. i copied the code into my experimental module, but as soon as mw.text is deployed, i'll remove it and will remain with the elegant call. and 3rd comment: you may want to open a new section to discuss your CSV module, rather than hiding it inside the "barchart" discussion. peace - קיפודנחש (aka kipod) (talk) 22:14, 24 March 2013 (UTC)
Hi, thanks for the review! As for the language-formatting: I have so far only written the module as a Lua function: the string is transformed to a table of tables, where each table represents a line, with each value a field still in string format. If you want numerical data out of it, the data should still be parsed to number, so processing the meaning of 12,432.33 will be left to the calling function.

mw.text.split() still has the problem of quoted fields. "blabla, blabla" is a perfectly legal field, and it won't be easy to parse this with split, as you would then have to 'reassemble' badly split fields. This is even worse with reassembling fields including whitespace. Maybe Anomie can still help out here, as my code currently calls mw.ustring.explode-utf8 for each character at least once, and up to three times. This could be fixed on the user side, but I would almost have to re-implement all of mw.ustring, and I would probably severely mess that up (I looked it over, it is currently a bit above my skill level, not to mention I don't have much UTF-8 experience).

regarding your third point: excellent point! I brought this down to a new section. — Preceding unsigned comment added by Martijn Hoekstra (talkcontribs) 23:03, 24 March 2013 (UTC)

Since the only characters you care about—quote, comma, maybe backslash—are ASCII, you should be able to do it using the string functions instead of mw.ustring. Also, BTW, note that Scribunto with LuaSandbox (as installed on WMF wikis) won't call that "explode_utf8" function at all, since pretty much everything in mw.ustring winds up being reimplemented with a call into PHP in that case. Anomie 23:28, 24 March 2013 (UTC)
Thanks Anomie, that is a relief, as the prospect of O(n^2) looked pretty ugly. I could limit the choice of field and record separators to the ASCII set, but I would still like to read the contents of the fields in UTF-8. Since most characters in the format should be dedicated to data rather than metacharacters, it helps, but doesn't get me out of the woods just yet (backslash is not a special character in CSV by the way, quotes in quoted fields are escaped by repeating them). In this module I call mw.ustring.sub(data, index, index) for every index. Is that still O(n) under the current implementation? Martijn Hoekstra (talk) 23:52, 24 March 2013 (UTC)
Assuming that mw.ustring.sub is indeed linear, it should be possible to get a more efficient implementation by using mw.ustring.gcodepoint to get all the individual characters instead. Toohool (talk) 03:49, 25 March 2013 (UTC)
On WMF wikis, most of the basic ustring functions are implemented using PHP's mbstring library. The pattern-matching functions are implemented by translating the pattern into a PCRE regex and using PHP's pcre library. And the Unicode normalization functions are implemented by using MediaWiki's Unicode normalization, which may use PHP's intl library. Note though that gcodepoint isn't terribly memory-efficient, as it creates a table holding all the codepoints as integers and iterates over that. Best thing to do is benchmark it. Anomie 12:54, 25 March 2013 (UTC)
Some implementations use backslash rather than quote-doubling. And some just blindly assume there won't be any quotes inside quoted values. There's no real "standard" for CSV, despite that RFC.
The nice thing about UTF-8 is that if you only care about ASCII delimiters and blindly pass through whatever is in between, you can ignore UTF-8's multi-byteness and just use the string module. Anomie 12:54, 25 March 2013 (UTC)
I hadn't thought about that, thank you. The point is sort of moot with the rewrite to regexes by Dragons flight, but keeping that in mind will certainly come in handy later. Martijn Hoekstra (talk) 13:17, 25 March 2013 (UTC)
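To illustrate the point about ASCII delimiters: in UTF-8 every byte of a multi-byte sequence is 0x80 or higher, so an ASCII byte such as a comma can never occur inside another character, and the byte-oriented string library can split the text safely. The helper below is a hypothetical sketch (split_ascii is a name made up here, not part of Scribunto) and ignores quoting entirely.

```lua
-- Hypothetical helper: split on a single ASCII delimiter using only the
-- byte-oriented string library; multi-byte UTF-8 text passes through as-is.
local function split_ascii(s, delim)
    local fields, start = {}, 1
    while true do
        -- The fourth argument 'true' requests a plain (non-pattern) search.
        local pos = string.find(s, delim, start, true)
        if not pos then
            fields[#fields + 1] = string.sub(s, start)
            return fields
        end
        fields[#fields + 1] = string.sub(s, start, pos - 1)
        start = pos + 1
    end
end

-- The non-ASCII fields are returned byte-for-byte unchanged.
local fields = split_ascii('naïve,日本語,ok', ',')
print(#fields)   --> 3
```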
I hope you'll forgive me for saying so, but that looks pretty dreadful. I would strongly suggest you really ought to be using regular expressions, i.e. gmatch, for identifying patterns. Reading the strings one character at a time is going to be quite slow, doubly so because ustring has a pretty harsh performance penalty for position indexing on long strings. It would be helpful if you gave some examples of expected input and expected output. Without examples it is a little hard to figure out what you are trying to do for each case. Dragons flight (talk) 23:20, 24 March 2013 (UTC)
Well, basically implement the RFC: http://tools.ietf.org/html/rfc4180#page-2 I doubt that a regular expression would perform much faster than the current recursive descent approach, but we could see. The current performance hit comes from the repeated call to explode, which indexes the string to a charnum -> codepoint table. This needs only be done once, but the library doesn't work that way at the moment. As a note, I don't think that regexes (or any solution) are able to read more than one character at a time. as for expected input and output: I would expect the string
item11, item12, item"13, "item14"
item21, "item, 22", item 23, item24
"item""31", "item
32", item33, ",item,34"
to be parsed as table
{ {"item11", "item12", "item\"13", "item14"}, 
{"item21", "item, 22", "item 23", "item24"},
{"item\"31", "item\n32", "item33", ",item,34"} }
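As a rough illustration of how the example above could be parsed with the byte-oriented string library, here is a minimal sketch. It is not the code in Module:CSV. The function name parse_csv is made up here, and the handling of blanks (skipped before a field, trimmed after an unquoted field) is an assumption read off the example rather than something RFC 4180 requires.

```lua
-- Hypothetical sketch, not the code in Module:CSV.
local function parse_csv(s)
    local rows, row = {}, {}
    local pos, len = 1, #s
    while pos <= len do
        -- Skip blanks before the field (assumption taken from the example).
        pos = s:find('[^ \t]', pos) or len + 1
        local field
        if s:sub(pos, pos) == '"' then
            -- Quoted field: runs to the next quote that is not doubled.
            local buf = {}
            pos = pos + 1
            while true do
                local q = s:find('"', pos, true)
                if not q then                  -- unterminated quote: take the rest
                    buf[#buf + 1] = s:sub(pos)
                    pos = len + 1
                    break
                end
                buf[#buf + 1] = s:sub(pos, q - 1)
                if s:sub(q + 1, q + 1) == '"' then
                    buf[#buf + 1] = '"'        -- "" inside quotes is a literal quote
                    pos = q + 2
                else
                    pos = q + 1
                    break
                end
            end
            field = table.concat(buf)
            pos = s:find('[,\n]', pos) or len + 1
        else
            -- Unquoted field: everything up to the next separator, right-trimmed.
            local stop = s:find('[,\n]', pos) or len + 1
            field = s:sub(pos, stop - 1):gsub('%s+$', '')
            pos = stop
        end
        row[#row + 1] = field
        if s:sub(pos, pos) ~= ',' then         -- newline or end of input ends the record
            rows[#rows + 1] = row
            row = {}
        end
        pos = pos + 1
    end
    if #row > 0 then                           -- input ended right after a comma
        row[#row + 1] = ''
        rows[#rows + 1] = row
    end
    return rows
end

-- The example input from the comment above.
local rows = parse_csv('item11, item12, item"13, "item14"\n'
    .. 'item21, "item, 22", item 23, item24\n'
    .. '"item""31", "item\n32", item33, ",item,34"')
assert(#rows == 3 and rows[1][3] == 'item"13' and rows[3][2] == 'item\n32')
```

Empty lines come back as a record with one empty field, so an option such as ignoreEmptyLines would still need to be layered on top.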

I added some unit tests at Module:CSV/testcases (including a few for features that I know aren't yet implemented, like the ignoreEmptyLines option and escaping of quote characters). One test is commented out because it caused a script timeout. Toohool (talk) 00:06, 25 March 2013 (UTC)

Thanks Toohool! Martijn Hoekstra (talk) 00:10, 25 March 2013 (UTC)
Currently fails with ",a" - have added a testcase. -- WOSlinker (talk) 13:23, 25 March 2013 (UTC)
Only one failing testcase remaining. The code has been changed to a regex based approach. (thanks Dragons flight!). Martijn Hoekstra (talk) 16:07, 26 March 2013 (UTC)

Discussion, Newest at Top

Request 1: I'd love (and I bet others would, too) a one time substitute module that would flip my user talk page so that newest sections are at top. Might the idea in [[#Transforming page content|]] apply? Maybe the module could be designed so that the page need not be blanked first--just add {#module:flip section order} anywhere in the page and when you Save it would flip the section order of the entire page and then disappear. It would flip top level sections and also all sub-sections within those. Normal text within each section would not be re-ordered and would thread as before, newest at bottom. Anything before the first section, eg intro text, should remain at page top. Similarly, Category tags should remain at page bottom.

Request 2: (to be transcluded via a normal {newest section at top} template that says "This Talk page is organized newest section at top") a module that would make the Add Section feature put new sections on top of the existing sections, just after any page top intro text, and not at page bottom. Or, if that cannot be done with a module, maybe this could be a user preference setting. Scrolling to the bottom of the page to see what's new is, well, clumsy and unintuitive interface design. I have no clue how to program. So this is a request/idea for those of you with code moxie. Thx!

--Rogerhc (talk) 00:58, 17 March 2013 (UTC)

  • Some advantages but many disadvantages: The newest-at-top order might seem beneficial, where the newest entry would be seen first, but there are many disadvantages. Consider the impact, long-term, with thousands of other editors:
  • Unusual to prior readers: Obviously, a big problem would be people accustomed to just add at page bottom.
  • Readers think moved message was lost: Once auto-moved, many readers might conclude they posted a "lost message" and keep reposting at bottom 5 times before burnout.
  • Old topics get buried lower: The newest-at-top would tend to obscure old topics, formerly reviewed while scrolling to bottom.
  • Not re-reading the to-do topics: Obscuring old topics is likely to increase forgotten to-do topics, as farther down the list.
  • Bottom-up and top-down subsections: Logical sub-subsections of subsections could run contrary to the paragraph flow, so there would be a bottom-up list of entries, written as top-down paragraphs, but with subsections either top-down in order of paragraphs, or reordered bottom-up to mimic the overall list of entries. Such choices of bottom-up or top-down subsections would lead to endless confusion, where readers would structure article subsections top-down, but perhaps talk-page subsections as bottom-up, and you know people would desperately hunt the message signature dates to determine which submessage was posted at which time, regardless of bottom-up placement.
  • Copying partial conversations could conflict in order: On rare occasions of quoting/copying other discussion text, then the message order might run counterflow.
  • Proposed text subsections would seem upsidedown: When using a talk-page to propose a 5-subsection text, then those 5 subsections could seem upside down, when expecting later text to be in the higher subsections.

As if there were not enough challenges for dyslexic readers (who often write their words in a "direction backwards"), this concept of newest-at-top is likely to increase the confusion levels, and extreme frustration levels, of readers who have been pushed to the brink by dealing with thousands of article pages and numerous editors. Then they post to a talk-page where the bottom message keeps disappearing, as moved to top. Not cool. -Wikid77 (talk) 03:33, 20 March 2013 (UTC)

So have "Add topic" feature add topics at top instead of bottom on pages that contain the "Newest at top" widget. Soon as page is saved reader will see his new message right here at top, on first screen of page, unlike current bottom add system which hides it off screen on even modestly used pages. And for the diligently blind bottom feeder who scrolls to page bottom without even looking at the first screen in front of him, park a {newest at top} note at page bottom (and top). Done. Upside down interface problem fixed.
Subsections topsy turvy? Don't reorder Subsections. Subsections can be newest at bottom. Sure, maybe it won't work. It might work. I'd like to try it because honestly newest at bottom is broken interface from the get go.
This is not a priority, perhaps. We manage. But it could be useful to experiment with newest at top talk pages. A widget that does that, if it could be easily created, would facilitate such experimentation. And thanks for the caveats. I don't think they are prohibitive to experimentation, but I don't know how to code. So if someone here is willing to code this, I'm willing to try using it on my talk page. --Rogerhc (talk) 00:32, 22 March 2013 (UTC)
For the heck of it I tried coding this: Module talk:NewestAtTop, which is mostly finished I think - I'm having some trouble thinking of an 'elegant' way to search for the text of the last section (I suppose you can count off numbers) ... maybe I'll get back to it later. Wnt (talk) 05:05, 26 March 2013 (UTC) OK, now it looks like it works ... as intended... but the philosophy may be up for debate! Wnt (talk) 16:53, 26 March 2013 (UTC)
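For readers curious what the core of such a flip might look like, here is a minimal plain-Lua sketch. It is not the code in Module:NewestAtTop; the function name flip_sections is made up, and it assumes top-level sections are marked by level-2 headings of the form "== Title ==". Intro text stays on top and subsections travel with their parent section in their original order, but category tags at the bottom are not treated specially.

```lua
-- Hypothetical sketch: reverse the order of level-2 sections in wikitext.
local function flip_sections(wikitext)
    local intro, sections = {}, {}
    local current = intro
    for line in (wikitext .. '\n'):gmatch('(.-)\n') do
        -- A level-2 heading starts a new section; "=== Sub ===" does not match.
        if line:match('^==[^=].-==%s*$') then
            current = {}
            sections[#sections + 1] = current
        end
        current[#current + 1] = line
    end
    local parts = {}
    if #intro > 0 then parts[1] = table.concat(intro, '\n') end
    for i = #sections, 1, -1 do
        parts[#parts + 1] = table.concat(sections[i], '\n')
    end
    return table.concat(parts, '\n')
end

local page = 'Intro text\n== Old ==\nfirst\n=== Sub ===\nreply\n== New ==\nsecond'
print(flip_sections(page))
```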

I started trying to do this at Module:TrainingPages, but quickly got out of my depth. I'm trying to build a more user-friendly and easily maintainable template for the training modules at Wikipedia:Training. I want something that works like this:

I have an index page of wiki text that lists a set of pages in a specific order (such as all the pages in the "Editing" module of the training for students). I want a module that can take an index page (the formatting of which doesn't particularly matter, but the simpler the better) and a page title that is in that index page, and it returns the name of the previous page, or the next page, or the page number of the current page, or the total number of pages.

For example, something like this:

{{#invoke:TrainingPages|next_page|currentpage={{PAGENAME}}|index=Wikipedia:Training/For_students/Editing_module_index}}

would use the PAGENAME magic word to define the currentpage, and use a list of pages at the index to determine what the next page in the sequence is. Similarly, a previous_page module would find the page before the current one, a page_number module would determine the page number (starting from 1 as the first page in the index), and a total_pages module would return the number of pages in the index. --Sage Ross (WMF) (talk) 01:35, 26 March 2013 (UTC)
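The lookups described here can be sketched in plain Lua as follows. This is only an illustration, not the code in Module:TrainingPages: the index format (one wikilink per page, in reading order), the helper parse_index, and the page titles in the example are assumptions. A real module would additionally read the index and the current page from the #invoke arguments.

```lua
-- Hypothetical index format: one [[wikilink]] per page, in reading order.
local function parse_index(wikitext)
    local pages = {}
    -- Capture the link target: everything after "[[" up to "|" or "]".
    for target in wikitext:gmatch('%[%[([^%]|]+)') do
        pages[#pages + 1] = target
    end
    return pages
end

-- 1-based position of 'current' in the index, or nil if it is not listed.
local function page_number(pages, current)
    for i, title in ipairs(pages) do
        if title == current then return i end
    end
    return nil
end

-- Both return nil at the ends of the sequence or for an unlisted page.
local function next_page(pages, current)
    local i = page_number(pages, current)
    return i and pages[i + 1]
end

local function previous_page(pages, current)
    local i = page_number(pages, current)
    return i and pages[i - 1]
end

local pages = parse_index('* [[Page A]]\n* [[Page B|second page]]\n* [[Page C]]')
print(#pages, page_number(pages, 'Page B'), next_page(pages, 'Page B'))
```

Here total_pages is simply #pages.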

Wikipedia:Training/For_students/Editing_module_index is a redlink. If you can give us the index, this is easy to do. Formatting specifics are also desirable. Wnt (talk) 01:55, 26 March 2013 (UTC)
Also, the pagename parameter is avoidable (though it can be optional) - we can just use page=mw.title.getCurrentTitle(); linkto=page.fullText (I think) to see what the current page is from within the module. Wnt (talk) 01:58, 26 March 2013 (UTC)
OK, I scribbled up something that seems to work. Thinking about it I'm inclined to put the markup in a different module or an old-fashioned template - I'm thinking this one could actually be moved to some more generic name and simply page through any index. Wnt (talk) 03:42, 26 March 2013 (UTC)
Thanks!!! I've created the example index. The formatting of the index could be simplified if necessary; feel free to change it. The idea is to use this for new versions of Wikipedia:Training/header and Wikipedia:Training/footer so that for individual pages, such as Wikipedia:Training/For students/My sandbox, there's no need to edit them all each time the order of pages or number of pages gets switched around. Moving the module to a generic title and keeping it free of specific formatting markup makes sense to me.--Sage Ross (WMF) (talk) 10:07, 26 March 2013 (UTC)
I modified the module so that it can take either your data in processed form (potentially more efficient) as at Module:TrainingPages/default index or it can access your index file directly. The one flaw I still have in it is that it doesn't work when you have an uninterrupted link in nowiki format (or inside another link in some other way, I suppose). The problem is that there is no "plain" feature for mw.ustring.gsub, so in order to use it I'd have to put escape characters in everything and get them out again flawlessly, or else take down numeric positions for every bracket and nowiki and do the substitution by substrings... it's unnecessarily annoying. (Who do I pester for such a function anyway...? It's not a bug, but it sure isn't a feature.) Wnt (talk) 18:18, 26 March 2013 (UTC) (To clarify, the short term fix was [17])
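One way around the missing "plain" option is to escape the magic characters before calling gsub. The sketch below uses the standard string library; the helper names are made up for this example. Since mw.ustring patterns are documented as following Lua pattern syntax, the same escaping should presumably carry over to mw.ustring.gsub, but that is an assumption rather than something tested here.

```lua
-- Hypothetical helpers for a "plain" substitution.
-- Prefix every pattern-magic character with % so it matches literally.
local function escape_pattern(s)
    return (s:gsub('[%^%$%(%)%%%.%[%]%*%+%-%?]', '%%%0'))
end

-- In a replacement string only % is special.
local function escape_replacement(s)
    return (s:gsub('%%', '%%%%'))
end

local function plain_gsub(s, find, repl)
    return (s:gsub(escape_pattern(find), escape_replacement(repl)))
end

print(plain_gsub('[[A]] and [[B]]', '[[A]]', 'X'))   --> X and [[B]]
```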
This is excellent, just what I was hoping for! Already it will be able to do a lot of what I'd like to do with it. A few other useful additions for my use case would be:
  • page_number and total_pages methods to return the number of the current page and the total number of pages
  • next_page and previous_page methods should return nothing (or some kind of null indicator) if the current page is the last page or first page, respectively
  • optional arguments for over-riding the above behavior and returning a specified page instead as the previous/next at the beginning/end, ideally defined as part of the Module index format (but not part of the main sequence for counting purposes).--Sage Ross (WMF) (talk) 01:11, 27 March 2013 (UTC)
OK, I think I've provided the right page_number and total_pages methods. The other behavior was actually possible before using defaultpage= , so I took a few minutes to wtfm ;) . Wnt (talk) 05:51, 27 March 2013 (UTC)

Help with transition of wp:CS1 cites

We need more admin support to install the Lua versions of Template:Cite_web/lua and Template:Citation/lua into the prior template names. Currently, the focus on rare, tiny details has overwhelmed the broader progress to move forward, as "paralysis of analysis" in debating minutiae used in a few pages, rather than installing the improvements already written for use in the other 1.6 million pages using wp:CS1 cite format. The distraction has severely derailed progress, when trying to make everything "perfect" before doing anything else. Instead, we need to install the Lua versions, first, and then discuss, debate and fret over tiny, trivial format differences afterward. The Lua interface inside Template:Cite_web should be the following markup:

<includeonly>{{#invoke:Citation/CS1|citation
|CitationClass=web
}}</includeonly><noinclude>
{{documentation}}
</noinclude>

All the major parameters have been tested, for basic operation, in the typical page of testcases:

For the rare, few cases where the format is different, then the prior version, {cite_web/old} can be used, in any page, for any citation which absolutely must keep the prior cite format. IMPACT: Over 1.4 million pages will be reformatted to use the 6x faster Lua Module:Citation/CS1, and the markup-based helper Template:Citation/core will be delinked from nearly 1 million pages. A similar upgrade was made to Template:Cite_news, which has nearly the same parameters. -Wikid77 (talk) 08:15, 26 March 2013 (UTC)

To anyone responding to this: Before you do, you should probably review the discussion at Module talk:Citation/CS1#Transition for Cite Book. Anomie 10:34, 26 March 2013 (UTC)
What you're doing is great - I mean, it's seriously some Wikipedia work you could put on a resume, and you pushed for it at a time when no one thought about the benefits. But don't give in to impatience. "Tiny, trivial format differences" would be easier to look at and dismiss if they were carefully documented in Module:Citation/doc (at this time a red link), for example. Wnt (talk) 14:52, 26 March 2013 (UTC)
The Lua script for wp:CS1 cites is in Module:Citation/CS1 with /doc page Module:Citation/CS1/doc. However, those tiny format differences have been fixed, during the delays, so there is nothing to document there. -Wikid77 (talk) 09:53, 27 March 2013 (UTC)
I'm not sure why Wikid77 is getting impatient. We've been installing one Lua citation template every few days for a while now. In the process, we've been taking the time to inspect each case for any regressions, and repair bugs as they get reported. {{cite book}} was installed late in the day on the 24th. I think all the known bugs with that one are now fixed in either the live version or the sandbox copy. My aim is to try and get {{cite web}} installed approximately Thursday of this week. At that point we will have transitioned all of the big four (book, journal, news, and web). That still leaves {{citation}} and a large number of infrequently used templates, but after cite web we should already have realized most of the performance advantages. Dragons flight (talk) 15:53, 26 March 2013 (UTC)
When {cite_web} is transitioned to Lua, it will correct an estimated 100,000 clerical errors in the cite formatting (such as "inc.." or "pp.7" to use "p."), including over 4,900 articles which omit the dots (or commas) between thousands of cite parameters. See "References" (for missing dots) in articles "Bibliography of South America" or "Area code 868" or "Arthur Roth". Also, the transition of {cite_web} will restore perhaps 20 million COinS metadata span-tags, in over 1.3 million pages, for use by DASHBot to update the dead-link URLs to also list suggested archive URLs. I regret that I have seemed "impatient" in trying to get those 1.3 million articles updated sooner. Perhaps I should have noted the millions of existing bugs in the processing of those pages, earlier, and how the transition of {cite_web} to Lua, sooner, could fix those many millions of formatting bugs within a few days. Not that anyone is really worried about bugs in citations, of course. It's just things to ponder. -Wikid77 (talk) 20:22, 26 March 2013 (UTC)

Hi! It is probably too early, but I don't know how much time it would take to implement the following idea: when the property inclusion from Wikidata is deployed to English Wikipedia, it could be worth implementing a generic navbox which shows subdivisions of a particular country administrative unit. Those subdivisions should be grouped by type. If this can be accomplished (I believe it can), we can get rid of thousands of separate templates and use one template which always reflects the current data from Wikidata and doesn't need to be updated. Any thoughts? --DixonD (talk) 11:05, 27 March 2013 (UTC)

If you have an example in mind on Wikidata, please link directly to it so we can see what we can do. There are severe limitations on what a Lua template can directly import - for example, I have no idea how to access the contents of a Category or a set of search results or the contents of a file on Wikipedia, let alone anywhere else. Wnt (talk) 15:45, 27 March 2013 (UTC)
For example, to pick a file d:Q383842:
  • Simple transclusion {{d:Q383842}} ==> {{d:Q383842}}. No dice.
  • Accessing {{#invoke:Page|id|d:Q383842}} ==> 0 - i.e. page doesn't exist as far as mw.title.new is concerned.
  • {{#invoke:Page|exists|d:Q383842}} ==> - page does not exist.
  • {{#invoke:Page|getContent|d:Q383842}} ==> . Not happening.
So far as I know (hopefully I've missed something!), we're still limited to whatever people carry over by browser cut and paste. Wnt (talk) 15:56, 27 March 2013 (UTC)
It is not deployed yet to English Wikipedia. Let's take, for instance, d:Q54150. The inclusion syntax should be {{#property:P150|id=Q54150}}, and d:P150 seems to be the property that we probably need. --DixonD (talk) 17:14, 27 March 2013 (UTC)
I wish I knew how to look up this stuff. Thanks to you I now know there's a tag #property, and that a file $wgExtensionMessagesFiles holds it, but I have no idea how to access that file. I cannot look up the Mediawiki manual for #property because stuff like "{" and "#" is universally ignored, everywhere, by everyone, in any search, and can never be contemplated. Do you rely on word of mouth to find stuff like this out? Wnt (talk) 17:58, 27 March 2013 (UTC)
You might want to check documentation here and here --DixonD (talk) 20:43, 27 March 2013 (UTC)

I don't know if it has been done before, so I created a list of the templates that use the most parserfunctions in en.wikipedia (with paserfunctioncount.py from pywikipedia): see User:Darkdadaah/Parserfunctioncount. Darkdadaah (talk) 16:28, 27 March 2013 (UTC)

Good job! That Template:Hurricane season ACE ranking definitely looks like it has its neck on the chopping block - 3090 parser functions for one dinky little table and the users have to leave empty columns and deal with a 30-storm limit we'll like as not break soon! I think the tag {{intricate}} is also a dead giveaway for programming needed. It's nice to leave simple templates to be templates so that many people can edit them, but if one is already under a "2313374u" banner it might as well be written in something sane. Wnt (talk) 16:50, 27 March 2013 (UTC)
amazing. reading through the code of some of these "top list", e.g. Template:Albumchart, you understand what inspired the creators of Brainfuck. (it also explains what's wrong in the comment #Remember markup has #switch:1 for any condition above...) peace - קיפודנחש (aka kipod) (talk) 17:00, 27 March 2013 (UTC)

Conversion (1) Integer as text

Task

Convert integer numeric values into a language-appropriate textual equivalent.

Context

Conversion of figures given as numerical or other form in articles, and for expansion use in templates.

Test cases
Test Case  Output                         Actual output of {{#invoke:ConvertNumeric | numeral_to_english | | case=u}}
0          Zero                           Zero
1          One                            One
2          Two                            Two
10         Ten                            Ten
11         Eleven                         Eleven
20         Twenty                         Twenty
21         Twenty one                     Twenty-one
1023       One thousand and twenty three  One thousand and twenty-three
9999       Nine thousand, nine hundred and ninety-nine  Nine thousand nine hundred and ninety-nine
ff         Oxff
aff        Oxff
fff        Oxfff
ca         OxCA
cb         OxCB
I          One
II         Two
IV         Four
V          Five
VI         Six
IX         Nine
XL         Forty
IL         Forty-nine
L          Fifty
XC         Ninety
IC         Ninety Nine
C          One hundred
MCM        One thousand and nine hundred
MCMLXXXIV  One thousand nine hundred and eighty four

Sfan00 IMG (talk) 23:54, 23 February 2013 (UTC)

Discussion

Roman number conversion might need a context. Sfan00 IMG (talk) 00:48, 24 February 2013 (UTC)

I removed your hexadecimal tests above. Besides the fact that it doesn't seem in scope (the suggested output is not the textual form of a numeral), it's impossible to distinguish some hexadecimal numbers from decimal numbers with the same digits. Roman numerals on the other hand should be pretty easy to do by just combining with a Roman numeral conversion script, which has other potential uses as well. Dcoetzee 01:05, 24 February 2013 (UTC)
I'm starting work on this one at Module:ConvertNumeric. Dcoetzee 01:09, 24 February 2013 (UTC)

{{Numtext}} or {{spellnum}} can do decimal to text. Hexadecimal numbers can first be converted to decimal using {{hex2dec}}. We have a decimal-to-Roman conversion at {{Roman}}; I don't know if there's a Roman-to-decimal conversion anywhere, but I can't imagine it getting much use. Toohool (talk) 01:57, 24 February 2013 (UTC)

I've now got Module:ConvertNumeric doing basic decimal to text but it doesn't yet support all the features of {{Numtext}} and {{spellnum}}. Once all options are supported it should be able to drop in and replace those complex templates. Still working on it. Dcoetzee 02:37, 24 February 2013 (UTC)
Is there actually a use for {{Numtext}} or can it just be redirected to {{Spellnum}}? -- WOSlinker (talk) 08:27, 24 February 2013 (UTC)
I've now finished Module:ConvertNumeric (numeral_to_english) as documented at Module talk:ConvertNumeric and modified both {{Numtext}} and {{Spellnum}} to use it. It should replicate all the features of the original templates, but may contain small issues or style problems still - I could use a code review from someone else. Regarding why there are two templates: it appears Numtext supports a variety of features that Spellnum didn't, like rounding, and Spellnum used the word "negative" instead of "minus" for negative numbers. So they're not quite entirely interchangeable. I haven't done converting from Roman Numerals. Dcoetzee 13:07, 24 February 2013 (UTC)
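As a rough idea of what the decimal part of such a conversion involves, here is a small plain-Lua sketch that reproduces the "actual output" column of the test table for the decimal rows. It is not the code in Module:ConvertNumeric: it only covers whole numbers from 0 to 999,999, always capitalizes the first letter, and has none of the rounding, negative-number or case options mentioned above. The function and table names are chosen for this example.

```lua
-- Hypothetical sketch covering 0..999999 only.
local ones = {
    [0] = 'zero', 'one', 'two', 'three', 'four', 'five', 'six', 'seven',
    'eight', 'nine', 'ten', 'eleven', 'twelve', 'thirteen', 'fourteen',
    'fifteen', 'sixteen', 'seventeen', 'eighteen', 'nineteen',
}
local tens = {
    [2] = 'twenty', [3] = 'thirty', [4] = 'forty', [5] = 'fifty',
    [6] = 'sixty', [7] = 'seventy', [8] = 'eighty', [9] = 'ninety',
}

local function below_hundred(n)
    if n < 20 then return ones[n] end
    local t, o = math.floor(n / 10), n % 10
    return o == 0 and tens[t] or tens[t] .. '-' .. ones[o]
end

local function below_thousand(n)
    local h, rest = math.floor(n / 100), n % 100
    if h == 0 then return below_hundred(rest) end
    if rest == 0 then return ones[h] .. ' hundred' end
    return ones[h] .. ' hundred and ' .. below_hundred(rest)
end

local function numeral_to_english(n)
    local th, rest = math.floor(n / 1000), n % 1000
    local text
    if th == 0 then
        text = below_thousand(rest)
    elseif rest == 0 then
        text = below_thousand(th) .. ' thousand'
    elseif rest < 100 then
        -- "and" joins the thousands directly to a remainder below 100.
        text = below_thousand(th) .. ' thousand and ' .. below_hundred(rest)
    else
        text = below_thousand(th) .. ' thousand ' .. below_thousand(rest)
    end
    return (text:gsub('^%l', string.upper))   -- capitalize the first letter
end

print(numeral_to_english(1023))   --> One thousand and twenty-three
```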
I added a function roman_to_numeral for practice and fun and profit, which converts all legal and most illegal roman numerals to numeric values. I'm struggling with scope though. Could someone see how I can make roman_numerals local and still callable? I haven't added an #invoke'able function to the table yet for this, nor any unit tests (they should come tomorrow, when I have some time to look at the unit testing framework). I would appreciate some code review. Martijn Hoekstra (talk) 22:13, 20 March 2013 (UTC)
AFAIK, there's no way to call a function from outside the module without adding it to the returned table (p). Code looks good. A roman:upper() call on the input may be helpful so it can handle lower-case input. Toohool (talk) 22:55, 20 March 2013 (UTC)
I think I'd rather call that from the calling function, something along the lines of {{#invoke:ConvertNumeric|convert_roman|xi}} => function p.convert_roman(frame) return convert_roman(tostring(frame.args[1]):upper()) end, but I don't know what the conventions on that are, and maybe have an extra parameter to convert it to English, as requested in the above testcases (but I really don't know what the use case for that would be). Do you guys think it's clearer to do it like this, or iterate backwards over the unreversed string? (that is, drop the rev, and go with roman:sub(-i, -i)) Martijn Hoekstra (talk) 23:40, 20 March 2013 (UTC)
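For comparison, a common way to write the backwards iteration is shown below. This is a sketch and not the roman_to_numeral in Module:ConvertNumeric. It walks the string from right to left, subtracting a symbol when it is smaller than the symbol to its right and adding it otherwise, which is also why non-standard forms such as IL evaluate to 49. Upper-casing inside the function is just one of the two options discussed above.

```lua
-- Hypothetical sketch of Roman numeral parsing by right-to-left scan.
local values = { I = 1, V = 5, X = 10, L = 50, C = 100, D = 500, M = 1000 }

local function roman_to_numeral(roman)
    roman = roman:upper()
    local total, right = 0, 0
    for i = #roman, 1, -1 do
        local v = values[roman:sub(i, i)]
        if not v then return nil end      -- not a Roman numeral character
        if v < right then
            total = total - v             -- e.g. the I in IV
        else
            total = total + v
        end
        right = v                         -- value of the symbol just processed
    end
    return total
end

print(roman_to_numeral('MCMLXXXIV'))   --> 1984
print(roman_to_numeral('xi'))          --> 11
```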

Numeric Conversion (2)

Task

Convert number and currency specifier to text (i.e. a cheque-printing algorithm).

Context

Expansion of currency amounts in articles and templates.

Test cases (using pounds)

Test case              Output
1 pounds               One pound (only).
1. pounds              One pound (only).
1.1 pounds             One pound, ten pence.
1.10 pounds            One pound, ten pence.
1.101 pounds           One pound, ten pence.
121 pounds             One hundred and twenty one pounds.
121 dollars            One hundred and twenty one dollars.
121.99 pounds          One hundred and Twenty one pounds, ninety nine pence.
100l sterling          One hundred pounds,
100l 11s 45d sterling  One hundred pounds Eleven shilling and forty-five pence.

Sfan00 IMG (talk) 23:54, 23 February 2013 (UTC)

  • Consider using {Spellnum}: In most cases, the quick Template:Spellnum (50 per second) can handle number-to-word spellings, or might provide a basis for a new template. For example:
  • {{spellnum|121.99}} → one hundred and twenty-one point nine nine
The main question to justify creating a new template is: "Will this feature be needed in many articles?" In most cases, an idea only applies to a few articles, and could be hand-coded as much easier for other editors to modify. The Template:Spellnum exists because many articles spell-out the first number in a conversion, which is coded as numerical digits but must show as words in the article text. Remember: new templates are a target for confusing hack-edits, so try to use older, standard templates. -Wikid77 (talk) 08:01, 24 February 2013 (UTC)
Wikid, can you tell me how you profile the speed of a template or script? I recently replaced Spellnum with a Lua script (mainly for maintainability and more features) and I'd like to see what kind of performance hit it took in the process. Thanks. :-) Dcoetzee 03:49, 25 February 2013 (UTC)

Numeric Conversion (3)

Task

Convert imperial measure (length) to text.

Context

A number of old engineering documents (pre 1970) will use US or Imperial units as opposed to metric style units. This script would allow expansion of these units to an appropriate text format.

Test cases
Test value  Output
1"          1 inch
2"          2 inches
3"          3 inches
13"         1 ft. 1 in.
1'          1 foot
1' 10"      1 ft. 10 in.
2'          2 feet
2' 3"       2 Ft. 3 in.
23m40       23 and a half miles
23m 20      23 and a quarter miles
23m 22      23 miles, 22 chains

Sfan00 IMG (talk) 23:52, 23 February 2013 (UTC)

Discussion

Context param needed? In some cases 24" to 24in is perfectly valid. Sfan00 IMG (talk) 00:07, 24 February 2013 (UTC)

I don't really understand where you'd use this. If you're changing these units where they appear in an article, isn't it easier (and safer) to retype it yourself rather than retype it as a template parameter? If you want to post entire original documents to Wikisource, it's probably best not to monkey with little stylistic things like this. Maybe as a matter of post-processing - you have the module take as input a transcluded article you want to revamp, and it spots all potential conversions and prints them out in red in a new nowiki-text output so you can check them over and copy and paste it? But likely I've missed the point by a mile. Wnt (talk) 07:07, 24 February 2013 (UTC)
  • Consider using {Convert} customization: Years of using custom measurement conversions have shown that they tend to lead to adding several related conversions, and before long, they become a separate rival to Template:Convert, which has dozens of features to avoid needing so many custom-conversion templates. For rare cases of old quoted forms (such as: 2'3"), the wp:MOS recommends to use world-standard notation (such as: 2 ft 3 in); however, in quoted text, {Convert} can show the equivalent metric:
  • Wikitext: Thine height were 5'3" [{{convert|5|ft|3|in|m|2|disp=out}}] in stature.
  • Results:   Thine height were 5'3" [1.60 m] in stature.
  • Wikitext: The cowboy said, "I delivered nine 50# [{{convert|50|lb|0|disp=out}}] bags of horsefeed."
  • Results:   The cowboy said, "I delivered nine 50# [23 kg] bags of horsefeed."
In general, for quoting old literal measurements, use the standard Template:Convert but with option "disp=out" to allow direct quotes of any archaic or rare symbols, but show modern conversion amounts in editorial brackets [...]. There are numerous old-style symbols beyond 2'3" (such as "lbs." for "lb"), and that is why we avoid creating more templates for every old symbol, but rather quote the old symbol and show only the "disp=out" results. However, if there are many 2'3" cases, then we could consider a {Convert} subtemplate to use "tic" to show 2'3". By that method, we can avoid proliferation of rare-use templates, which tend to grow to duplicate the typical features of {Convert}. -Wikid77 (talk) 08:01, 24 February 2013 (UTC)
  • It seems like {{Convert}} should be extended to support the same units in as out, and other output forms, so that |abbr=* could be used to do such conversions.
    The table above would require at least an |abbr= param to control which output format to use. Also, if the capitalization of "2 Ft. 3 in." was intentional, a |cap= param (capitalization) would be needed with values first, none, all.
    It seems that interpreting "23m40" as 23 miles and 40 chains would have to be very context-specific – I think most people don't even know what a chain is, much less interpret or write that to mean 23.5 miles. —[AlanM1(talk)]— 19:13, 26 February 2013 (UTC)
23m40 was a convention I'd seen in some notes about railway lines. I've also seen 23m40c and various other combinations of spacings and punctuation. Sfan00 IMG (talk) 21:15, 27 February 2013 (UTC)

Short titles (Commonwealth) legislation

Task

Generate a short title (and link) for a Commonwealth style short title.

Context

Generation of Commonwealth style short titles, in relevant articles and indexes. Short titles are widely used in the UK (and other Commonwealth jurisdictions) to refer to specific items of legislation.

Testcase
Test case | Output
Short Titles 1896 | Short Titles Act, 1896
Short Titles United Kingdom 1896 | Short Titles Act, 1896
Statute of Uses | Statute of Uses
27 Hen 8 c 10 | 27 Hen 8 c 10
6 Ann 41 c. 41 | 6 Ann 41 c.41
Constitution Scotland 2017 | Constitution Act 2017 (Scotland)

Sfan00 IMG (talk) 23:52, 23 February 2013 (UTC)

Discussion

There is a short title template at Wikisource, but it's sufficiently intricate that I daren't change it. Migrating to a Lua version across both Wikisource and Wikipedia would aid maintainability. Sfan00 IMG (talk) 01:35, 28 February 2013 (UTC)

Table Generation

Task

Generate tables using varied layout in source.

Context

In some articles, it may be easier to specify the table source data in a different format from that used in the current MediaWiki table syntax, such as in terms of columns of figures rather than rows. This scripted approach would also make data extracted from certain sources easier to process when it is in columnar form.

Parameters
  • layout: rows/cols
  • datasep:
  • blocksep:
Testcase

{{#invoke:Tables|maketable|rows|;|:|1;2;3;4:5;6;7;8:9;10;11;12;}}

Generates :

1 2 3 4
5 6 7 8
9 10 11 12

{{#invoke:Tables|maketable|cols|;|:|1;2;3;4:5;6;7;8:9;10;11;12;}}

1 5 9
2 6 10
3 7 11
4 8 12

Sfan00 IMG (talk) 01:01, 24 February 2013 (UTC)
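The request above can be sketched in plain Lua as follows. This is only an illustration under stated assumptions: the function names split, maketable and to_wikitext are hypothetical, the separators are assumed to be single characters, empty cells (such as the one after the trailing ";" in the test case) are dropped, and no escape sequence for the delimiters is defined.

```lua
-- Split `text` on a single-character separator, dropping empty pieces.
local function split(text, sep)
    -- Escape the separator so punctuation is matched literally inside the set.
    local escaped = sep:gsub("%W", "%%%0")
    local pieces = {}
    for piece in text:gmatch("[^" .. escaped .. "]+") do
        pieces[#pieces + 1] = piece
    end
    return pieces
end

-- Build a grid (list of rows) from the data string.
-- layout "rows": each block is a row; layout "cols": each block is a column.
local function maketable(layout, datasep, blocksep, data)
    local blocks = {}
    for _, block in ipairs(split(data, blocksep)) do
        blocks[#blocks + 1] = split(block, datasep)
    end
    if layout ~= "cols" then
        return blocks
    end
    -- Transpose, padding short columns with empty cells.
    local height = 0
    for _, block in ipairs(blocks) do
        if #block > height then height = #block end
    end
    local grid = {}
    for r = 1, height do
        grid[r] = {}
        for c, block in ipairs(blocks) do
            grid[r][c] = block[r] or ""
        end
    end
    return grid
end

-- Render the grid with ordinary wiki table markup.
local function to_wikitext(grid)
    local lines = { '{| class="wikitable"' }
    for _, row in ipairs(grid) do
        lines[#lines + 1] = "|-"
        lines[#lines + 1] = "| " .. table.concat(row, " || ")
    end
    lines[#lines + 1] = "|}"
    return table.concat(lines, "\n")
end

print(to_wikitext(maketable("cols", ";", ":", "1;2;3;4:5;6;7;8:9;10;11;12;")))
```

A real module would read these values from frame.args and would still need a decision on how to write a cell that contains one of the delimiters.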

Comment

I'd be really leery of starting a second table format to be used in any article - keeping track of the one we have is already a lot to ask of volunteer editors. Also, you'd have to decide what to do about delimiters - in your example, you can't display any table entry with a : or ; in it unless you define some escape sequences... However, your comment on making it easier to convert source data might be something else to do. One tool I can think of that would be a fairly straightforward exercise (and a pain in the rear) would be for the template to take a transcluded Wikipedia article as input, and output the same page (in nowiki source format) but with all the rows and columns of every table swapped. This would be a tool for editors to use, not for use in articles. Wnt (talk) 07:15, 24 February 2013 (UTC)

  • Template:Autocol already generates column tables: The quick Template:Autocol already allows specifying a list of entries which can appear in a multi-column table (for any browser), at extremely fast speed (a fraction of a second). For example:
  • {{autocol|n=6|ncols=3|*aa|*bb|*cc|*dd|*ee|*ff}}
  • {{autocol | style=border:1px solid #777
                  | n=12|ncols=4|wrap=y|1|2|3|4|5|6|7|8|9|10|11|12}}

Thank you for suggesting the one-line entry format, which helps to confirm the need for the prior template {autocol}. Again, due to the fast speed of {autocol}, there is no need to have a Lua module provide a redundant column-format feature, and markup-based templates are fine for formatting multi-column tables. In general, Lua is only needed for complex, nested calculations, or intense string-search operations, or deeply nested decision trees hitting the MediaWiki wp:expansion depth limit. Short templates run at the rate of hundreds per second, some character-insertion templates run as fast as 2,400 per second. -Wikid77 (talk) 08:01, 24 February 2013 (UTC)
Wikid77, please do not take the following as a personal attack, but i very strongly disagree with what you wrote above: "Again, due to the fast speed of {autocol}, there is no need to have a Lua module provide a redundant column-format feature".
Speed, of course, is important, and may even be the deciding factor in many cases, but it's definitely not the only one. you are the single author of {{Autocol}}, so please do not take this as an attack on yourself or the template, but rather a critique of the limitation of wiki-code: {{Autocol}} is an abomination against God and man. anyone who opens this template in "edit" mode will understand immediately why such a thing should not exist when a sane, Lua-based solution to the same problem can be made. your success in solving the problem in the face of wikicode limitation is admirable, but once a tool that can do this in a sane way exists, this abomination should not remain, or at least should not be used. it has some limitations (e.g., hard-coded maximum number of cells) but its main problem is that it's not maintainable. so far nobody except yourself has touched it, and once Lua is available, i don't think anyone else ever will. peace - קיפודנחש (aka kipod) (talk) 16:59, 26 February 2013 (UTC)
Generally speaking, I strongly agree that switching complex templates to Lua for maintainability is essential even when they run slower as a result. Dcoetzee 22:55, 27 February 2013 (UTC)
The delimiter problem was why I'd made it optional, i.e. you could use anything but |#* (which have special meaning in markup) as delimiters. Another reason for this is so that CSV datasets could be used without having to tweak them as much as currently, the delimiters being , and ; respectively IIRC. Sfan00 IMG (talk) 09:42, 24 February 2013 (UTC)

String Replacement

Task

Given a string, returns a modified string where all occurrences of a specified substring are replaced with another substring.

Context

Basic string function to replace the limited {{Str repc}} string function which replaces only the first occurrence of a substring within a string. Would be useful for example in the {{Infobox enzyme}} template which would eliminate the need for a separate IUBMB_EC_number parameter (the IUBMB_EC_number string could be generated from the EC_number by replacing periods with slashes).

Parameters
  • string
  • search_string
  • replace_string
Testcase

{{#invoke:String replace|1.1.1.1|.|/}}

Generates: 1/1/1/1

Comment

I am not sure this is the right place to make this sort of very simple request that might already be possible without any special coding. If so, pointers to the appropriate documentation would be appreciated. Thanks. Boghog (talk) 09:28, 24 February 2013 (UTC)

I created Module:StringReplace: {{#invoke:StringReplace|replace_all|1.1.1.1|.|/}}: 1/1/1/1. What do you think? Inkbug (talk) 13:36, 24 February 2013 (UTC)
Thanks for your quick response! I have tested the new module in {{Infobox enzyme/sandbox}} and it works perfectly (see Template:Infobox_enzyme/testcases). I will test this module in several other infoboxes before moving into production. Assuming all the tests look good, is there any reason not to start using this module in production versions of templates? The infobox enzyme template is transcluded into approximately 5000 articles and I just wanted to make sure it is ready to go. Boghog (talk) 14:44, 24 February 2013 (UTC)
You're welcome. As far as reasons for using / not using this in production, I have absolutely no idea (I doubt I know more about this than you – this was my first Lua code I've ever written). Inkbug (talk) 14:48, 24 February 2013 (UTC)
An impressive first effort! As you probably already had in mind, the StringReplace module should also have replace_first and replace_last options so that the module is more flexible, but the replace_all option is all I need for now. There are a large number of related templates that at some point might be worth systematically converting. One question: is it better to have a large number of special purpose modules or a smaller number of general purpose modules? Boghog (talk) 15:24, 24 February 2013 (UTC)
I've also added replace_plain into Module:String. My pattern part isn't quite as good as the one in Module:StringReplace though and doesn't yet work with "." but it does have the option to only replace the first occurrence which would allow it to replace the {{str rep}} template. Perhaps the two could be merged into one single version. -- WOSlinker (talk) 15:43, 24 February 2013 (UTC)
I don't have any problem with moving Module:StringReplace into Module:String – if we want all of the string functions in one place, it is fine with me. If you want to copy my escape_pattern code, I have no problem – all I did was copy the list of special characters from the Lua reference, and added a % before each one. Inkbug (talk) 18:48, 24 February 2013 (UTC)
I'm not sure if there are any good reasons for all in one place or for separate modules. Just that so far all the string functions have ended up in Module:String and all the numeric functions have ended up in Module:Math. I'll update the code in Module:String to use your escape_pattern function. -- WOSlinker (talk) 19:34, 24 February 2013 (UTC)
I incorporated a more generic replace function into Module:String that allows the user to indicate whether to use regular expressions or not (defaults to no) and allows one to specify the number of replacements to make (defaults to all). Dragons flight (talk) 19:36, 24 February 2013 (UTC)
Also, the escape code in Module:StringReplace was missing close bracket "]", and I used the UTF8-safe ustring class rather than string in making the escape. I'm not sure if there are any UTF-8 cases where that matters or not. Dragons flight (talk) 19:39, 24 February 2013 (UTC)
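The escaping approach discussed above (prefix every Lua pattern special character, including "]", with %) can be sketched in plain Lua like this. The function names and the argument order are assumptions made for illustration and are not the actual interface of Module:String or Module:StringReplace; the sketch also uses the byte-oriented string library rather than mw.ustring so that it runs outside MediaWiki.

```lua
-- Escape every character that is special in a Lua pattern,
-- including the closing bracket that was reported missing.
local function escape_pattern(text)
    return (text:gsub("[%^%$%(%)%%%.%[%]%*%+%-%?]", "%%%0"))
end

-- Replace occurrences of `target` in `source`.
--   count: maximum number of replacements (nil means all)
--   plain: treat target and replacement literally unless explicitly false
local function replace(source, target, replacement, count, plain)
    if plain ~= false then
        target = escape_pattern(target)
        -- In a replacement string only "%" is special.
        replacement = replacement:gsub("%%", "%%%%")
    end
    local result = source:gsub(target, replacement, count)
    return result
end

print(replace("1.1.1.1", ".", "/"))     -- replace all occurrences
print(replace("1.1.1.1", ".", "/", 1))  -- replace only the first occurrence
```

With count set to 1 the same routine covers the behaviour of the older first-occurrence template, which is why a single function with an optional count is convenient.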
For the curious, it's my intention to work on exposing all the basic string operations and update all of the non-trivially used functions at Template:String templates see also text. So far we've done {{str len}}, {{str find}}, {{str sub}}, {{str index}} and a few variants of those. Dragons flight (talk) 19:44, 24 February 2013 (UTC)
Sounds good to me. Inkbug (talk) 19:53, 24 February 2013 (UTC)
And also to me. Thanks everyone. Boghog (talk) 20:07, 24 February 2013 (UTC)
Before seeing this discussion, I just made new {{Replace}} for the same. The old template {{str rep}} is used for count=1 (one replacement), and because it is old usage, we cannot change it to default behaviour of "replace all occurrences". -DePiep (talk) 09:44, 27 February 2013 (UTC)

Color conversion (RGB)

Task and context

I'd like to propose and request a new feature Module:RGB color convert. It should convert viable RGB color definitions (by name, RGBhex, RGB triplet, and more) into any other such notation.

Behaviour by a template would be:

  • {{Color convert|gold}} → #ffd700 (default)
  • {{Color convert|255|215|0}} → #ffd700 (default)
  • {{Color convert|100%|84%|0%}} → #ffd700 (default)
  • {{Color convert|#ffd700|to=name}} → gold
  • {{Color convert|gold|to=RGBdec}} → RGB(255, 215, 0)
  • {{Color convert|51°|100%|100%|from=HSV}} → #ffd700
  • {{Color convert|51°|100%|50%|from=HSL}} → #ffd700
Usage

Its main usage would be WP (template) internally and in articles describing the topic of RGB itself. At the moment such transformations (conversions) are not available, mainly because of the inefficiency of pre-Lua templates. They are entered manually in WP pages.

Scope

Below is a table with the proposed scope. Basic definitions are available in linked W3C documentation. RGB value itself is the primary and first aim; later other conversions could be added to the module (HSV, HSL, transparency).


Color spaces

color space | data structure | # arg | example(s) | example color value | source | alternative names, versions | number of colors | wiki page | note
RGB triple hex 00-FF 1 #cd5c5c #cd5c5c W3C, rgb alt: triplet 0-F (#abc)
RGB name 1 maroon #800000 [19] CSS 1–2.0 / HTML 3.2–4 / VGA color names 16 web colors HTML 4.01 names (X11 alt names)
RGB w3cname 1 indianred
indian red
#cd5c5c W3C, color keywords X11 147 X11 color names Multiple versions are around, sometimes clashing.
RGB trip separated 3 rgb(00, 9A, FF)
rgb(80%, 36%, 36%)
rgb(55, 0, 255) dec
mixed
Values out of range (<0, >255/100%) are clipped (w3c).
RGB triphex 3 rgb(c5, 10, a5) #c510a5 All hex only. Technically possible; not W3C mentioned. Should never mix with dec values
RGB red
RGBgreen
RGBblue
cc 1 #cc0000 similar for -green, -blue. The red part value in RGB. Corresponds with #rr0000(?)
HSL trip 3 hsl(0°, 53%, 58%)
hsl(0, 53, 58)
#cd5c5c hue:0° satur:53% light:58%. With or without °, %
HSL tripnormalised 3 hsl(0, 0.53, 0.58) #cd5c5c normalised into fractions (0 ... 1) (see W3C)
HSV trip 3 hsv(0°, 55%, 80%) #cd5c5c for: hue:0° satur:55% value:80%. With or without °, %
RGB-lum Luminance, cset contrast ratio [20] {{RGBColorToLum}} and more, for contrast ratio checking W3C, transparent
RGBA RGBA-transp 4 rgba(80%, 36%, 36%, 0.2) W3C, transparent 0.0=transp, 1.0=opaque. parameter 4. HSLA w3c hsla
1.^ "CSS Color Module Level 3 W3C Recommendation 07 June 2011". W3C. 2011-06-07. Retrieved 2013-02-25.
Internals

Always work to and from #RGBhex6 value notation (triangular conversion: always bounce it on the same floor).

See note below on triplets
Development

Main RGB functions are first level of requirement. HSV, HSL, transparency and others could be added later (should not be made impossible). Also, versionings like X11 variants could be added as an option.

Input

By default, basic RGB input (single argument) should be recognised and handled. That is: RGBhex6, RGBhex3, with/without#(?), X11 (W3C?) names, sRGB names. We might need to decide on version variants (X11, W3C names); the other versions could be input with a space notification.

Three-argument input (RGB function triplet) could be default too (by %, by decimal). Other input requires space-identification: HSL, HSV, other color name versions and definitions.

Input is always case independent. All whitespace stripped (mid-spaces too) should be handled correctly to catch common, acceptable "#AC DF 09" writing.

Output

Default output (suggestion): #RGBhex6 for automation and covering most options. Other notations by argument. tbd: multi-argument output like "RGB(r,g,b)" as a string? The required output should be defined more precisely than the input space.

Not-a-color and error situations

Color not recognised or incorrect: output "" (blank), unless specified otherwise by W3C. Option for error/not-a-color situation: output value can be set (like by "default=my error text"). Prefixes, postfixes, triplet separator are optional, maybe other output formatting too.

Parameter list

To get an idea: - The 4 unnamed params are for values. Only the first one is required. 1, 2, 3 when used make a triplet, 4th is for transparency (future feature). - All other params are named, and optional. They are for settings rather than values.

{{Color convert
| [1st unnamed param, single value (also: triplet-1)]
| [2nd unnamed param, triplet-2]
| [3rd unnamed param, triplet-3]
| [4th unnamed param, transparency]
| from= [default: RGB; other: HSV, HSL, ...]
| to= [default: #RGBhex6; otherwise more specific than the from id]
| opacity= [=1 minus transparency]
| format= [output; e.g. uc or lc]
| separator= [in the triplets; default is <comma><space>]
| prefix=
| postfix=
| default= [returned when not-a-color or error]
}}

Basically, if params 2 and 3 are used, it is a triplet (in % or decimal). If they are empty, input is a single color value. A single value can be checked for a color name, for RGBhex3, and for RGBhex6.
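The input rules described in this proposal (triplet if param 2 or 3 is filled, otherwise a single value checked as name, RGBhex3 or RGBhex6; case-insensitive; all whitespace stripped; out-of-range values clipped) could be sketched in plain Lua roughly as below. The function names and the three-entry NAMES table are hypothetical; a real module would carry the full name list and read its input from frame.args.

```lua
-- Hypothetical excerpt of a colour-name table (values taken from this proposal).
local NAMES = { gold = "ffd700", maroon = "800000", indianred = "cd5c5c" }

-- Convert one triplet component ("84%", "215", 215) to an integer 0..255,
-- clipping out-of-range values.
local function component(value)
    local text = tostring(value):gsub("%s+", "")
    local digits, percent = text:match("^(%-?[%d%.]+)(%%?)$")
    local number = digits and tonumber(digits)
    if not number then return nil end
    if percent == "%" then number = number * 255 / 100 end
    number = math.floor(number + 0.5)
    return math.max(0, math.min(255, number))
end

-- Return r, g, b (0..255) for the unnamed parameters, or nil if not a colour.
local function parse_colour(p1, p2, p3)
    local function filled(p)
        return p ~= nil and tostring(p):match("%S") ~= nil
    end
    if filled(p2) or filled(p3) then
        -- Triplet input, in % or decimal.
        local r, g, b = component(p1 or ""), component(p2 or ""), component(p3 or "")
        if r and g and b then return r, g, b end
        return nil
    end
    -- Single value: strip all whitespace, lower-case, drop an optional "#".
    local text = tostring(p1 or ""):gsub("%s+", ""):lower():gsub("^#", "")
    local hex = NAMES[text] or text
    if hex:match("^%x%x%x$") then
        hex = hex:gsub("%x", "%0%0")  -- RGBhex3: "abc" becomes "aabbcc"
    end
    local r, g, b = hex:match("^(%x%x)(%x%x)(%x%x)$")
    if not r then return nil end
    return tonumber(r, 16), tonumber(g, 16), tonumber(b, 16)
end

print(parse_colour("#AC DF 09"))
print(parse_colour("80%", "36%", "36%"))
```

Returning nil for anything unrecognised leaves the caller free to output a blank or the value of a default= parameter, as the error-handling section suggests.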

See also
Out of scope
Next

I guess this proposal needs fleshing out. My initial notations and names are descriptive and may be too far off for good understanding or even correct defining. Please feel free to improve, especially the table. -DePiep (talk) 15:54, 25 February 2013 (UTC)

Test cases

params

CoCo Comment

It is, of course, possible to have a single template (say "Convert") which will do all kinds of conversions, and will look at its parameter(s) to guess which conversion is required at any given invocation. But what is the value of this? IMO, it would be better for both the code and the editors, to say explicitly what we want to do. So speaking the language of templates, this means having one distinct template per conversion type, e.g. "Convert color name to rgb", "Convert color name to hex", "Convert rgb to color name", "Convert hex to color name" etc. From the Lua side, you would have a Module:Color, with methods such as rgb_to_name(), hex_to_name(), rgb_to_hex(), name_to_rgb() etc.

IMO, this is better than having a single template and Lua method called "Convert", that will have to deduce from its parameters what the user actually passed to it, and from some special parameter named "to", what the expected output is.

The way i suggest makes life simpler, and error messages clearer: when calling "hex_to_name" you know that the expected parameter looks like so: #xxxxxx, where "x" is a digit or a letter in the [a-fA-F] range, and you can verify it. if you define a clever "convert" that will look at its input and decide which mode it was (is the input RGB? is it hex? is it color name?) - when none of the "legit" inputs was found, it will be more difficult to give a coherent error message.

peace - קיפודנחש (aka kipod) (talk) 18:14, 25 February 2013 (UTC)
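To illustrate the point about explicit functions and clear error messages, here is a minimal plain-Lua sketch of a hex_to_name function that validates its single expected input form. The name table is a hypothetical three-entry excerpt and the error wording is invented for the example.

```lua
-- Hypothetical excerpt of a hex-to-name table.
local NAME_BY_HEX = {
    ["#ffd700"] = "gold",
    ["#800000"] = "maroon",
    ["#cd5c5c"] = "indianred",
}

-- Returns the colour name, or nil plus an error message.
local function hex_to_name(input)
    local hex = tostring(input or ""):lower()
    -- Only one input form is legal here, so it can be verified precisely.
    if not hex:match("^#%x%x%x%x%x%x$") then
        return nil, "hex_to_name: expected #xxxxxx (x in 0-9, a-f, A-F), got '"
            .. tostring(input) .. "'"
    end
    local name = NAME_BY_HEX[hex]
    if not name then
        return nil, "hex_to_name: no colour name known for " .. hex
    end
    return name
end

print(hex_to_name("#FFD700"))
print(hex_to_name("gold"))
```

Because the function accepts exactly one notation, the first error message can state what was expected instead of listing every form a general converter might have tried.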

No guessing needed. It is deduced (from actual input). And yes we want to say which conversion we want. But a. we can do that by using parameters, not by using different templates. And b. we can set a default preference as to what we want, which is just as good as explicitly telling. -DePiep (talk) 11:39, 26 February 2013 (UTC)
I thought about this and there is some logic to having a single template, but not in the manner suggested by DePiep. I was thinking something more like "{{Convert color|source form|target form|...(input data here)...}}". For example {{Convert color|RGB|HSL|100|30|23}}. The reason for this is that in principle one may wish to convert directly from any form of a color directly to any other form, and having a separate template for each would require a template for every possible pair of forms. The template I suggest would be just as straightforward to implement without having to guess the form of the input (since source form/target form come from a fixed list), and by using an internal intermediate representation, it could be implemented with code size linear in the number of forms rather than quadratic. Dcoetzee 20:05, 25 February 2013 (UTC)
A few comments:
  • First, I think that Dcoetzee's suggestion is a good one, for cases in which one knows the source form. However, we should support providing a blank source form ({{Convert color||target form|...}} or {{Convert color|?|target form|...}}) which would figure it out by itself.
  • Two, as far as the structure of the module, I think that we should have functions of the following types:
    • Functions that convert different RGB notations to one standard one (A tuple with three decimal numbers from 0 to 255?).
    • Functions that convert to/from that standard to other standard forms (names, HSL, RGBA, etc.).
    • Functions that convert the RGB standard form to other RGB notations (e.g. (16,16,16) -> #101010).
  • Three, this is a great module for test-driven development.
Inkbug (talk) 06:59, 26 February 2013 (UTC)
Why require defining both incoming and outgoing form? Most RGB formats can be discerned already (and by default I say), and if one chooses to input a different form (say the three HSV values), only then the input form needs to be explicit. We can use the first four (unnamed) params for the numbers. Other params, which are basically switches, should be named (so we don't have to count input param pipes, often empty). The output form can be #rrggbb by default, and otherwise it can be set by a named param. The examples are in the top of the proposal, including the default workings. It reduces required input without losing correctness. -DePiep (talk) 11:20, 26 February 2013 (UTC)
I have expanded the param list to reflect my idea. General form: {{Convert color|||||named_param1=...| named_param2=...}}. -DePiep (talk) 11:34, 26 February 2013 (UTC)
re Inkbug: there is no guessing. It is deduced. And indeed an internal form I already proposed (namely "#rrggbb"). It seems the main difference you point to is to get rid of a default from-setting and require it every time. I do not see the need for that. -DePiep (talk) 11:45, 26 February 2013 (UTC)

On second thought, about my always-calc-to RGB string (and then to other format from there). While the triangular calculation should be maintained, I understand in Lua one can use a set of arguments, so we could calculate to the triplet (rr,gg,bb) as separate values (hex or dec?). Formatting into a string (6 hex) internally would not be needed. -DePiep (talk) 20:47, 1 March 2013 (UTC)

I've created a proof of concept implementation to test out some implementation ideas I had. It's still a work-in-progress, but the basic framework is there to add more formats and conversions between them. It would help to know the proposed use cases for this template, though, particularly when considering what the default output should be. Originally my code was not surrounding the output with parentheses, but the leading # signs for hex numbers were being interpreted as wiki markup for a numbered list. It would be helpful to know how the template would be used to understand if a better workaround is desirable. You can see sample test cases at Module talk:Sandbox/isaacl/ColourSpace/tests. isaacl (talk) 04:13, 2 March 2013 (UTC)

About how it is used: convert any color space format into any color space format. No more, no less. The table lists the options (table is in progress).
Secondary: in the module, catch WP input options (like blank or missing input params, or like value "40%"). Having to program this logic in template code is horrible. Also, another sensible logic is: if only param1 is filled, treat as single value. If param2 or param3 is used, treat as triplet input.
About the default output. 1. Decide: lowercase is default. Do we need to discuss that?
About the default output. 2. Format "#3af7e" would be preferable, because it can be used in code directly (think: <style="background:{{color value|0|50%|75%|from=HSV}};">). To prevent the # markup effect, it could be "&#x23;3af7e". Not that nice for reading, but it never conflicts with the wikicode. -DePiep (talk) 12:28, 2 March 2013 (UTC)
Isaacl, I am not happy with the tests. You even introduce a new notation form (#7F, #00, #FF). Also, as far as I can follow Lua code, I think the module is structured in too complicated a way (e.g., mixing calculations and formatting).
This is what I have in mind. About input formats. The table now shows three color spaces: RGB, HSV, HSL. Within these, there are multiple data structures possible for the same color value. Note that in a data structure, formatting with spaces and separators is irrelevant, but whether a value is "%" or "decimal" does matter. Together there are eleven input structures for the three color spaces ("from"). In general, we also want these options for output ("to"). Now, simply coding all these options would require some 11 × 11 = 121 routines. That we do not want (and imagine what to do if we add one more input structure: 12 × 12 = 144, so 23 more routines needed).
So this is what we do: every input structure is calculated to the RGB decimal triplet: RGBtripdec (trip1decimal, trip2decimal, trip3decimal). In Lua this can be three args, of course. This takes eleven routines.
Then we do this: for the eleven (or so) possible output structures, we build eleven routines: from RGBtripdec to that structure. After we have that output in the requested data structure, we can format it with separators, spaces, "#", and so.
Just some notes: HSV and HSL calculations could be postponed and added later (only four routines then!). Define "from" names (id's) first. Always use color space name in them (RGB, HSV, HSL); keep user-friendly; examples in the table. Stick to W3C definitions. Maybe we should drop the "add prefix" and "add postfix" options, because these are not tied to the topic and so should be kept outside. -DePiep (talk) 13:36, 2 March 2013 (UTC)
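The "every structure goes to and from one RGB decimal triplet" idea can be sketched in plain Lua with two small registries, so that N structures need 2 × N routines instead of N × N. The format ids (RGBdec, RGBhex6, RGBperc) and all function names are assumptions for illustration, not names from any existing module.

```lua
-- Clamp and round a number (or numeric string) to an integer 0..255.
local function to_byte(n)
    n = tonumber(n)
    if not n then return nil end
    return math.max(0, math.min(255, math.floor(n + 0.5)))
end

local to_triplet = {}    -- id -> function(input...) returning r, g, b
local from_triplet = {}  -- id -> function(r, g, b) returning output text

to_triplet.RGBdec = function(r, g, b)
    return to_byte(r), to_byte(g), to_byte(b)
end
from_triplet.RGBdec = function(r, g, b)
    return string.format("rgb(%d, %d, %d)", r, g, b)
end

to_triplet.RGBhex6 = function(text)
    local r, g, b = tostring(text):match("^#?(%x%x)(%x%x)(%x%x)$")
    if not r then return nil end
    return tonumber(r, 16), tonumber(g, 16), tonumber(b, 16)
end
from_triplet.RGBhex6 = function(r, g, b)
    return string.format("#%02x%02x%02x", r, g, b)  -- lowercase by default
end

to_triplet.RGBperc = function(r, g, b)
    local function pct(p)
        local n = tonumber((tostring(p):gsub("%%", "")))
        return n and to_byte(n * 255 / 100)
    end
    return pct(r), pct(g), pct(b)
end
from_triplet.RGBperc = function(r, g, b)
    local function pct(c) return math.floor(c * 100 / 255 + 0.5) end
    return string.format("rgb(%d%%, %d%%, %d%%)", pct(r), pct(g), pct(b))
end

-- Any "from" structure to any "to" structure, always via the triplet.
local function convert(from, to, ...)
    local reader, writer = to_triplet[from], from_triplet[to]
    if not (reader and writer) then return nil end
    local r, g, b = reader(...)
    if not (r and g and b) then return nil end
    return writer(r, g, b)
end

print(convert("RGBdec", "RGBhex6", 255, 215, 0))
print(convert("RGBhex6", "RGBperc", "#cd5c5c"))
```

Adding a further structure then means writing one reader and one writer; none of the existing routines change.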
Regarding use cases, things like "this will be used in HTML code" is exactly what I want to understand. (I thought as much, but wanted to be sure.)
Note with my code it is not necessary to define conversion routines between every format; if the 24-bit formats have a conversion to the sRGB 8-bit triplet format, and the floating point ones have a conversion to the sRGB floating point triplet format, the code will be able to convert between the two. I did not want to prematurely convert to one of these two types of format ahead of time, since if the conversion is strictly between 24-bit formats, or floating point formats, inaccuracies can be introduced. So introducing a new format only really requires two conversion routines to be defined.
As I stated, it's a work in progress; I just wanted to show a sample of what was the current state. No guarantees of course of how it will end up, but it will undergo changes. isaacl (talk) 14:41, 2 March 2013 (UTC)
Then maybe I did not understand the code well enough. What is "sRGB" compared to color space "RGB"? Where do we have floating point introduced? I'd say, if that happens (like with a %), it should be rounded to an integer asap, since the 8-bit is an integer (decimal or hex alike). -DePiep (talk) 15:23, 2 March 2013 (UTC)
Yes, the code at the moment relies on the presence of a % sign to assume the input is sRGB specified in floating point. RGB is a generic term; sRGB corresponds to a specific colour space, so the conversions between sRGB and other standard spaces are well-defined. (The W3C standards use the sRGB colour model.) The standard spaces are all floating point, since the space is continuous, so if both starting and ending formats are floating point, converting to a 24-bit model in between would lose information.
The code currently checks if a direct conversion exists between the starting format and ending format. If not, it looks for a conversion that goes through sRGB 24-bit, if one of the two formats is 24-bit, or through sRGB, if neither format is 24-bit. If that does not exist, then it looks for a conversion that goes through sRGB and sRGB 24-bit. So each format only strictly needs the conversion between itself and either sRGB or sRGB 24-bit (whichever one is of the same type as the format), and vice versa. isaacl (talk) 16:34, 2 March 2013 (UTC)
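The lookup order described above (direct conversion first, then through one hub, then through both hubs) might look something like the following plain-Lua sketch. It is not the ColourSpace module's code: the format ids, the model table, the direct table and the order in which the two-hub routes are tried are all assumptions made for the example.

```lua
-- Model type of each format: "24bit" or "float" (hypothetical ids).
local model = {
    sRGB24bit = "24bit", hex = "24bit", W3Cname = "24bit",
    sRGB = "float", HSL = "float",
}

-- Available direct conversions: direct[from][to] = true.
local direct = {
    hex       = { sRGB24bit = true },
    W3Cname   = { sRGB24bit = true },
    sRGB24bit = { hex = true, W3Cname = true, sRGB = true },
    sRGB      = { sRGB24bit = true, HSL = true },
    HSL       = { sRGB = true },
}

-- Return the list of steps if every consecutive pair is directly convertible
-- (repeated neighbours are collapsed), otherwise nil.
local function chain(...)
    local steps = { ... }
    local route = { steps[1] }
    for i = 2, #steps do
        local a, b = steps[i - 1], steps[i]
        if a ~= b then
            if not (direct[a] and direct[a][b]) then return nil end
            route[#route + 1] = b
        end
    end
    return route
end

local function find_route(from, to)
    -- One hub: sRGB 24-bit if either end is 24-bit, otherwise floating-point sRGB.
    local hub = (model[from] == "24bit" or model[to] == "24bit")
        and "sRGB24bit" or "sRGB"
    return chain(from, to)
        or chain(from, hub, to)
        or chain(from, "sRGB", "sRGB24bit", to)
        or chain(from, "sRGB24bit", "sRGB", to)
end

print(table.concat(find_route("HSL", "hex"), " > "))
```

Staying within one model type whenever possible is what avoids the rounding loss mentioned above for float-to-float conversions.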
I've added support for converting to and from the W3C colour names. I've consolidated the format-specific information so new formats can be defined in separate modules; see ColourSpace/Formats/W3Cnames and ColourSpace/Formats/sRGB24bitHex as examples. The format modules loaded by the ColourSpace module are listed in ColourSpace/Formats. A module not in this list (for example, the W3Cnames module) is loaded on demand by specifying it either as a "to" or "from" format. isaacl (talk) 22:44, 3 March 2013 (UTC)
A new module for a format? Four modules already, one for listing "formats"? What is to separate? W3Cname is not a format? I get it less and less. The listing routine is called sRGBFromW3CName, but then there is a reverseMappping? Why not consistently named sRGBToW3CName to start with, away from implicit understandings. There is no "forward" defined, we use "from" and "to". On top of this: why the id "sRGB" in this name is unspecified? There are lots of sRGB formats as you call them (actually, data structures). First: "sRGB" is not a different color space, is a subdivision of RGB. The specification is in the data structure or maybe format. We should not suggest they are two color spaces. Then, this one format is sRgbTripleDecimal. Another thing: in the W3Cname module you write "modelType = '24bit'", but actually you have coded three decimal values. Also, the "modeltype" should mention "RGB" "SRGB": not every 24bit is SRGB. Also, defaultConversion = 'sRGB24bit' looks wrong: the conversion presently is via 3xdecimal. And it suggests there could be other conversions?
Still, my main question is: do you agree that internally we should calculate and format to and from that one data structure: SRGB(r,g,b)decimal? The decimal is floating point. Limitation to discrete sRGB happens in the final formatting. I call this structure SrgbTripDec for now.
If so, I expect the next functions (they take either 1 or 3 arg):
 fromSrgbTripDecToSrgbTripDec (trivial, but we want to control the values)
 fromSrgbTripHexToSrgbTripDec
 fromSrgbTripMixedToSrgbTripDec
 fromSrgb24bitHexToSrgbTripDec
 fromSrgbW3cNameToSrgbTripDec
 fromHsvTripMixedToSrgbTripDec
 fromHsvTripNormalisedToSrgbTripDec (h,s,v, value into 0...1 fraction)
 fromHslTripMixedToSrgbTripDec
 ...
And for output:
 fromSrgbTripDecToSrgbTripHex
 fromSrgbTripDecToSrgbTripDec
 fromSrgbTripDecToSrgbTripPerc
 fromSrgbTripDecToSrgb24bitHex
 fromSrgbTripDecToSrgbW3cName
 fromSrgbTripDecToHsvTripMixed
 fromSrgbTripDecToHsvTripNormalised
 fromSrgbTripDecToHslTripMixed
 ...

Some quick notes: Descriptive subroutine names. Shortening for now would not help. These are data structure calculations, not formatting. Do not mix up "format" with "data structure". Incoming format (string dressup) is handled in the "from" routines. Outgoing formatting (into return code) would be in these functions:

 formatRgbTripHex(RgbTripHex)
 formatRgbW3cName(RgbName)
 ...

etc, possibly with options. -DePiep (talk) 10:07, 4 March 2013 (UTC)
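As an illustration of one routine from the list above, here is a plain-Lua sketch of fromHsvTripNormalisedToSrgbTripDec using the standard HSV to RGB sector formula. Two assumptions are made that the discussion leaves open: the hue is given in degrees while saturation and value are fractions in 0...1, and the result is rounded to integers 0..255 straight away rather than kept as floating point until final formatting.

```lua
-- h in degrees (any real number), s and v as fractions 0..1.
-- Returns r, g, b as integers 0..255.
local function fromHsvTripNormalisedToSrgbTripDec(h, s, v)
    local c = v * s                              -- chroma
    local hp = (h % 360) / 60                    -- hue sector position, 0 <= hp < 6
    local x = c * (1 - math.abs(hp % 2 - 1))     -- second-largest component
    local m = v - c                              -- amount added to every channel
    local sector = math.floor(hp)
    local r, g, b
    if sector == 0 then r, g, b = c, x, 0
    elseif sector == 1 then r, g, b = x, c, 0
    elseif sector == 2 then r, g, b = 0, c, x
    elseif sector == 3 then r, g, b = 0, x, c
    elseif sector == 4 then r, g, b = x, 0, c
    else r, g, b = c, 0, x end
    local function byte(f)
        return math.floor((f + m) * 255 + 0.5)
    end
    return byte(r), byte(g), byte(b)
end

print(fromHsvTripNormalisedToSrgbTripDec(0, 0.55, 0.80))
```

Note that rounded inputs such as 55% and 80% do not round-trip exactly: they give (204, 92, 92), which is close to but not identical with the #cd5c5c example in the table.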

As you noted before: I agree that we should call it "SRGB" (not RGB) as color space definition, since we follow W3C. Including W3C defined HSV and HSL, we have about a dozen structures to handle "from" and "to" (see the table). -DePiep (talk) 11:33, 4 March 2013 (UTC) Adjusted my post accordingly.-DePiep (talk) 11:58, 4 March 2013 (UTC)

Supporting the ability to define display formats in separate modules allows for new formats to be defined without having to modify the core module, and they can be tested using on-demand loading without interfering with existing uses of the core module. New formats can still be added to the core module if desired. As you have an interest in the specific implementation details, I believe you'd gain the greatest gratification in writing your own module—it's very satisfying to use a tool that you've crafted yourself! isaacl (talk) 14:23, 4 March 2013 (UTC)

Check if a pair of colors has enough contrast for accessibility

Task

Would it be possible to create a template which, given two color codes, verifies whether they meet the Web Content Accessibility Guidelines which refer to color contrast? This could be used e.g. at Template:Font color or Template:Colors to categorize and/or give a warning message if non accessible colors are used. Helder 13:57, 27 February 2013 (UTC)

Test cases
Text Color | Background color | Possible output
#000 | #FFF | yes
#FFFF00 | #00FFFF | no
red | #F99 | no
Discussion

 Done (in Wiki language, not Lua): {{Color contrast ratio|fgcolor|bgcolor}}. It might run faster if {{HexColorToLum}} were implemented as a lookup of the 256 possible values (16 switches of 16 choices each) instead of the calc method and unfortunate need for duplicate calls to {{Str sub}}. Note that when the specified colors are light on dark, you want the value to be <= 0.222_ instead of >= 4.5. —[AlanM1(talk)]— 09:09, 1 March 2013 (UTC)
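For comparison with the template version, the WCAG 2.0 calculation can be sketched in plain Lua as follows. The function names are hypothetical, only #rgb and #rrggbb input is handled (colour names such as "red" would need a lookup table first), and the 4.5 threshold is the WCAG 2.0 level AA minimum for normal-size text.

```lua
-- Parse "#rgb" or "#rrggbb" into three integers 0..255, or nil.
local function parse_hex(colour)
    local hex = colour:gsub("^#", "")
    if #hex == 3 then hex = hex:gsub("%x", "%0%0") end
    local r, g, b = hex:match("^(%x%x)(%x%x)(%x%x)$")
    if not r then return nil end
    return tonumber(r, 16), tonumber(g, 16), tonumber(b, 16)
end

-- WCAG 2.0 relative luminance of an sRGB colour.
local function luminance(r, g, b)
    local function linear(c8)
        local c = c8 / 255
        if c <= 0.03928 then return c / 12.92 end
        return ((c + 0.055) / 1.055) ^ 2.4
    end
    return 0.2126 * linear(r) + 0.7152 * linear(g) + 0.0722 * linear(b)
end

-- Contrast ratio, always with the lighter colour on top (so 1 <= ratio <= 21).
local function contrast_ratio(fg, bg)
    local r1, g1, b1 = parse_hex(fg)
    local r2, g2, b2 = parse_hex(bg)
    if not (r1 and r2) then return nil end
    local l1, l2 = luminance(r1, g1, b1), luminance(r2, g2, b2)
    if l1 < l2 then l1, l2 = l2, l1 end
    return (l1 + 0.05) / (l2 + 0.05)
end

-- true/false for valid colours, nil if either colour cannot be parsed.
local function has_enough_contrast(fg, bg)
    local ratio = contrast_ratio(fg, bg)
    if not ratio then return nil end
    return ratio >= 4.5
end

print(has_enough_contrast("#000", "#FFF"))
print(has_enough_contrast("#FFFF00", "#00FFFF"))
```

Swapping the two luminances so the lighter one is always the numerator removes the need for a separate "<= 0.222" check when the colours are given light on dark.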

  • Also with {color_contrast_visible} for a yes/no answer: the new Template:Color_contrast_visible gives a yes/no answer for the same pair of colors that the ratio template takes. Among the test cases, {color_contrast_visible|red|#FF0909} gives the answer: "Contrast: red & #FF0909, conformance level= none". Over 40 more of the Web colors, such as "gold" or "PapayaWhip", have been added as color names. -Wikid77 (talk) 07:29, 15 March 2013 (UTC)
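
For comparison with the template-based solution above, here is a minimal Lua sketch of the same check. It assumes the WCAG 2.0 definitions of relative luminance and contrast ratio and the 4.5:1 threshold mentioned above; the function names (parseHex, relativeLuminance, contrastRatio, meetsAA) are hypothetical and do not refer to any existing module. Only "#RGB" and "#RRGGBB" codes are handled, so color names such as "red" would need a separate lookup table.

```lua
-- Convert one 8-bit sRGB channel (0..255) to its linear value (WCAG 2.0 formula).
local function channelToLinear(c8)
    local c = c8 / 255
    if c <= 0.03928 then
        return c / 12.92
    end
    return ((c + 0.055) / 1.055) ^ 2.4
end

-- Parse "#RGB" or "#RRGGBB" into three numbers; returns nil for anything else.
local function parseHex(color)
    local hex = color:match("^#(%x+)$")
    if not hex then
        return nil
    end
    if #hex == 3 then
        hex = hex:gsub("%x", "%0%0") -- expand "F99" to "FF9999"
    end
    if #hex ~= 6 then
        return nil
    end
    return tonumber(hex:sub(1, 2), 16),
        tonumber(hex:sub(3, 4), 16),
        tonumber(hex:sub(5, 6), 16)
end

-- Relative luminance in the range 0 (black) to 1 (white), or nil if unparsable.
local function relativeLuminance(color)
    local r, g, b = parseHex(color)
    if not r then
        return nil
    end
    return 0.2126 * channelToLinear(r)
        + 0.7152 * channelToLinear(g)
        + 0.0722 * channelToLinear(b)
end

-- Contrast ratio with the lighter color always in the numerator,
-- so the result is >= 1 regardless of argument order.
local function contrastRatio(colorA, colorB)
    local la, lb = relativeLuminance(colorA), relativeLuminance(colorB)
    if not la or not lb then
        return nil
    end
    if la < lb then
        la, lb = lb, la
    end
    return (la + 0.05) / (lb + 0.05)
end

-- True when the pair reaches the 4.5:1 ratio discussed above.
local function meetsAA(colorA, colorB)
    local ratio = contrastRatio(colorA, colorB)
    return ratio ~= nil and ratio >= 4.5
end
```

Because the lighter color is always placed in the numerator, the light-on-dark case needs no separate "<= 1/4.5" comparison. With the test cases above, meetsAA("#000", "#FFF") is true, while meetsAA("#FFFF00", "#00FFFF") and meetsAA("#FF0000", "#F99") are false.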

Danish genitive

Task

Would somebody be so kind as to help make a function to construct the genitive case of Danish nouns or names according to the orthographic rules for the Danish language? This is to be used in templates in the Danish Wikipedia.

The rules are:

  1. Words which end in the letters s, x or z (both upper and lower case) append an apostrophe. This should ideally include s, x or z with any diacritic marks, but that is not very important as such letters are very rare.
  2. Words which end in another letter or a full stop append an s.
  3. Words which end in a non-letter symbol append an apostrophe and an s.
minimalistic approach:

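-- Danish genitive: returns the first argument with the genitive ending appended.

return {
    genitive = function( frame )
        -- Trim leading and trailing whitespace; the assignment keeps only gsub's first result.
        local input = ( frame.args[1] or '' ):gsub( "^%s*(.-)%s*$", "%1" )
        if not input or #input == 0 then
            return ''
        elseif mw.ustring.match( input, "[sSzZxX]$" ) then
            return input .. "'" -- rule 1; please add all the missing letters inside the []
        elseif mw.ustring.match( input, "[%a%.]$" ) then
            return input .. 's' -- rule 2: any other letter or a full stop
        else
            return input .. "'s" -- rule 3: any non-letter symbol
        end
    end,
} -- say no more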
this differs a little from the more common form of creating a named empty table, adding the function to it, and then returning it, but in this case there is no justification for the "common" form.
please note that i changed the "testcase" below to represent how modules are invoked: you have to use #invoke:ModuleName | functionName | arguments - you forgot "functionName".
you can test it by opening Special:TemplateSandbox, and feed as sandbox prefix "User:קיפודנחש/sandbox" (without the quotes), and as "render page" use "Wikipedia:Lua requests" (also without the quotes), press "View" and scroll to the bottom of the page - please verify that all the words are correct. peace - קיפודנחש (aka kipod) (talk) 23:33, 26 March 2013 (UTC)
I tested it, and it looks correct. Thank you very much. Byrial (talk) 04:38, 27 March 2013 (UTC)
Interesting—I've also started using that style of returning a table with the necessary fields specified in the return statement. FYI here is how I strip leading and trailing whitespace:
text = text:match("^%s*(.-)%s*$")
In Python, I also use code like #input == 0 to test for an empty string, but in Lua I have reverted to input == '' because it is clearer, and I suspect it may be more efficient. It's trivial, but I'm interested in any thoughts on that. Johnuniq (talk) 07:23, 27 March 2013 (UTC)
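
A small sketch comparing the two trimming styles mentioned in this thread may help. The helper names trimGsub and trimMatch are hypothetical. The one behavioural difference worth knowing is that string.gsub returns two values (the new string and the number of substitutions), so its result should be assigned to a single variable before being returned or passed on.

```lua
-- Trim with gsub, as in the module above. Assigning to one local
-- discards gsub's second return value (the substitution count).
local function trimGsub(text)
    local trimmed = text:gsub("^%s*(.-)%s*$", "%1")
    return trimmed
end

-- Trim with match: the pattern has one capture, so exactly one value comes back.
local function trimMatch(text)
    return text:match("^%s*(.-)%s*$")
end

-- Both ways of testing for an empty string agree.
local function isEmpty(text)
    return text == '' and #text == 0
end
```

Both helpers return the same string for any input; the match form is simply shorter and cannot leak a second return value into a caller's argument list.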
To test for a diacritic on the s, x or z, you could convert to normal form C and normal form D, and check if the last codepoint is identical. If not, the last character has a diacritic. If the second-to-last codepoint of the normal form D string is then in [szxSZX], return input .. "'". I'll put it into code tonight if you haven't implemented it yourself yet. Martijn Hoekstra (talk) 13:31, 27 March 2013 (UTC)
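
One possible way to sketch this idea is shown below. It is a variant of the suggestion rather than a literal implementation: instead of comparing the normal form C and normal form D strings, it takes a string that is assumed to be in normal form D already and strips any trailing combining marks (U+0300 to U+036F) before looking at the base letter. In a Scribunto module the input could first be passed through mw.ustring.toNFD; the helper names stripTrailingMarks and endsInSXZ are hypothetical, and marks outside that one Unicode block are not handled.

```lua
-- Remove trailing combining diacritical marks (U+0300..U+036F) from a UTF-8
-- string that is assumed to be in normal form D. In UTF-8 these marks are
-- the byte pairs CC 80..CC BF and CD 80..CD AF.
local function stripTrailingMarks(nfd)
    while #nfd >= 2 do
        local lead, trail = nfd:byte(-2, -1)
        local isMark = (lead == 0xCC and trail >= 0x80 and trail <= 0xBF)
            or (lead == 0xCD and trail >= 0x80 and trail <= 0xAF)
        if not isMark then
            break
        end
        nfd = nfd:sub(1, -3)
    end
    return nfd
end

-- True when the word ends in s, x or z, with or without diacritics.
local function endsInSXZ(nfd)
    return stripTrailingMarks(nfd):find("[sSxXzZ]$") ~= nil
end
```

For example, "z" followed by a combining caron (bytes CC 8C) is reported as ending in z, while "e" followed by a combining acute accent (bytes CC 81) is not.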
Test cases
Input | Wanted output | Actual output
Jens | Jens' | Script error: No such module "DanishGenitive".
PS | PS' | Script error: No such module "DanishGenitive".
sax | sax' | Script error: No such module "DanishGenitive".
BMX | BMX' | Script error: No such module "DanishGenitive".
jazz | jazz' | Script error: No such module "DanishGenitive".
ZZ | ZZ' | Script error: No such module "DanishGenitive".
abe | abes | Script error: No such module "DanishGenitive".
revy | revys | Script error: No such module "DanishGenitive".
Tom | Toms | Script error: No such module "DanishGenitive".
René | Renés | Script error: No such module "DanishGenitive".
Renè | Renès | Script error: No such module "DanishGenitive".
knæ | knæs | Script error: No such module "DanishGenitive".
klø | kløs | Script error: No such module "DanishGenitive".
strå | strås | Script error: No such module "DanishGenitive".
ABE | ABEs | Script error: No such module "DanishGenitive".
REVY | REVYs | Script error: No such module "DanishGenitive".
TOM | TOMs | Script error: No such module "DanishGenitive".
RENÉ | RENÉs | Script error: No such module "DanishGenitive".
RENÈ | RENÈs | Script error: No such module "DanishGenitive".
KNÆ | KNÆs | Script error: No such module "DanishGenitive".
KLØ | KLØs | Script error: No such module "DanishGenitive".
STRÅ | STRÅs | Script error: No such module "DanishGenitive".
orð | orðs | Script error: No such module "DanishGenitive".
ORÐ | ORÐs | Script error: No such module "DanishGenitive".
Leo 2. | Leo 2.s | Script error: No such module "DanishGenitive".
fork. | fork.s | Script error: No such module "DanishGenitive".
DR1 | DR1's | Script error: No such module "DanishGenitive".
100 % | 100 %'s | Script error: No such module "DanishGenitive".
C++ | C++'s | Script error: No such module "DanishGenitive".

Thank you, Byrial (talk) 21:53, 26 March 2013 (UTC)
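
A few of the test cases above can also be checked off-wiki with a plain Lua script. The sketch below is not the module posted in this thread: it avoids mw.ustring so that it runs in a standalone interpreter, and it makes the simplifying assumption that any trailing byte of 0x80 or above belongs to a non-ASCII letter (true for é, æ, ø, å and ð in UTF-8, but it would misclassify non-ASCII symbols). The names genitive, cases and failures are hypothetical.

```lua
-- Standalone approximation of the three rules, for local testing only.
local function genitive(word)
    word = word:match("^%s*(.-)%s*$")
    if word == "" then
        return ""
    end
    local last = word:sub(-1)
    if last:find("[sSxXzZ]") then
        return word .. "'"      -- rule 1: ends in s, x or z
    elseif last:find("[%a.]") or last:byte() >= 0x80 then
        return word .. "s"      -- rule 2: other letter or full stop
    end
    return word .. "'s"         -- rule 3: non-letter symbol
end

-- Input and wanted output, copied from a subset of the table above.
local cases = {
    { "Jens", "Jens'" }, { "BMX", "BMX'" }, { "jazz", "jazz'" },
    { "abe", "abes" }, { "René", "Renés" }, { "STRÅ", "STRÅs" },
    { "orð", "orðs" }, { "Leo 2.", "Leo 2.s" }, { "fork.", "fork.s" },
    { "DR1", "DR1's" }, { "100 %", "100 %'s" }, { "C++", "C++'s" },
}

-- Collect a readable message for every mismatch.
local failures = {}
for _, case in ipairs(cases) do
    local got = genitive(case[1])
    if got ~= case[2] then
        failures[#failures + 1] = string.format(
            "%s: wanted %s, got %s", case[1], case[2], got)
    end
end

print(#failures == 0 and "all cases pass" or table.concat(failures, "\n"))
```

The same cases table could be reused on-wiki by replacing the local genitive function with a call into the real module.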