Your bot, ProteinBoxBot, uploaded a bundle of images a few years ago (e.g. File:PBB Protein INDO image.jpg) with no description. About 50 of them now appear in Category:Wikipedia files lacking a description. I was wondering whether you or your fellow bot owner could get the bot to add a description to the files. (I will also be posting on your bot co-owner's talk page.) Thanks in advance. -- ТимофейЛееСуда. 21:57, 4 September 2012 (UTC)

Yes, thanks for the note. We'll look into those ASAP. Ideally, we'll just replace those older images with newer versions that are better looking (and have descriptions). Cheers, AndrewGNF (talk) 17:00, 5 September 2012 (UTC)


Based on my comments EC commission changed the EC number from to I noted this on page:

I keep changing the number to the updated number and you reverse it to the old number. Please check EC to verify that it is the correct number. Redactor271 (talk) 06:23, 25 March 2013 (UTC)

Hi Andrew,

I've read that already but don't see any reference to TR2, so I wanted to know more about this. Did you reanalyze this yourself?


Jen — Preceding unsigned comment added by (talk) 15:38, 1 February 2013 (UTC)

Proper Citation Request

I would like to know the methods you used to get the testicular receptor 2 graph because I would like to cite your work correctly.

Can you provide me with a link to your original paper or something along those lines so I can read more?

Thank you,

Jennifer (talk) 17:13, 24 January 2013 (UTC)

Hi Jennifer, those expression graphs came from this paper. Cheers, AndrewGNF (talk) 18:27, 24 January 2013 (UTC)

Welcome to the WP:MED/WP:PHARM

I really like these ideas and look forward to discussing them more. How does your current bot work? Doc James (talk · contribs · email) (if I write on your page reply on mine) 01:22, 14 May 2013 (UTC)

Thank you, very kind! Cheers, Andrew Su (talk) 05:08, 17 May 2013 (UTC)

Hey Andrew Su

AANAT (gene)

I've asked a question at Template talk:PBB#Just a hardcoding? but have not received a response. As one of ProteinBoxBot's operators, I'm wondering if you can shed some light on it for me. VanIsaacWS Vexcontribs 07:58, 30 July 2013 (UTC)

PBB templates

I have been editing the PBB templates in Category:Human protein templates making changes in the Name field:

  1. If the name includes a number kDa without a space, e.g. 30kDa, insert a nonbreaking space, e.g. 30 kDa
  2. If the name includes a genus, e.g. Drosophila, or a genus and species, italicize it, e.g. Drosophila
  3. If the name includes a plus sign that is superscripted in chemical notation, e.g. H+, superscript it, e.g. H⁺
  4. If the name includes a genus abbreviated to initial, e.g. S. cerevisiae, spell out the genus, e.g. Saccharomyces cerevisiae

These changes are consistent with Wikipedia style.

At the beginning of this project, for the fourth type of edit, I set the edit summary to: "italics; spell out genus name — if you don't like it put a note on my talk page". I used this summary for Template:PBB/10111, Template:PBB/10248, Template:PBB/10412, Template:PBB/10427, Template:PBB/10436, Template:PBB/10483, Template:PBB/10484.

When I got to the first one that included C. elegans, Template:PBB/10497, I used the edit summary "italics; spell out genus name — there are over 170 C. elegans species on C. elegans (disambiguation), not counting synonyms!"

By this time I had put the request to put a note on my talk page in 7 edit summaries, so I thought that was enough. But now I see that your ProteinBoxBot has been systematically removing all of the edits I have been making.

I request that you modify ProteinBoxBot to automatically make at least the first 3 of my four changes, if not all four, consistent with WP:MOS. —Anomalocaris (talk) 18:56, 26 September 2013 (UTC)

Sorry for butting in here. Concerning gene names in {{PBB}} templates, these are based on the official Human Genome Organisation names (see HGNC Guidelines and PMID: 11944974). The bot is acting to make sure that the gene names in the PBB templates match the current official HUGO gene names. While your changes have merit, they do differ from the approved names. By HUGO convention gene names, in part or in whole, are never italicized (gene symbols are italicized but not names) nor do they contain sub- or superscripts. In addition, "... molecular weights may be specified in kilodaltons using the SI unit: kDa with no space after the molecular weight". Finally the genus is abbreviated to keep the gene names from becoming too long. Boghog (talk) 05:27, 27 September 2013 (UTC)
One additional note. The gene names in PBB articles were systematically created by an approved bot. Hence changing these names is very likely to be controversial. As a general rule, it is wise to first seek consensus before making potentially controversial edits to a large number of articles or templates. Boghog (talk) 05:50, 27 September 2013 (UTC)
Thank you, Boghog, for this information. I didn't purchase "Guidelines for Human Gene Nomenclature" in Genomics, but I read HGNC's "Guidelines for Human Gene Nomenclature" section 3: Gene names, and here, it shows Drosophila italicized in the example "lunatic fringe homolog (Drosophila)" and the example "anillin, actin binding protein (scraps homolog, Drosophila)". But the same guideline in section 5: Homologies with other species, includes the example "BarH-like 1 (Drosophila)". [without italics!]
I began to insert the space before kDa because I found some templates had spaces, e.g. Template:PBB/10621, with name "Polymerase (RNA) III (DNA directed) polypeptide F, 39 kDa". I suspect that ProteinBoxBot removes nonbreaking spaces but not ordinary spaces before kDa. According to HGNC, there is not supposed to be a space between the number and kDa. Why isn't ProteinBoxBot taking out regular spaces between the number and kDa? —Anomalocaris (talk) 08:28, 27 September 2013 (UTC)
Since this discussion began, HGNC's "Guidelines for Human Gene Nomenclature" section 3: Gene names has been updated. The example with a genus or species name is now ASXL1 "additional sex combs like 1 (Drosophila)", without italics. As I said on 27 September, I suggest modifying ProteinBoxBot to take out regular spaces between the number and kDa. —Anomalocaris (talk) 15:47, 3 October 2013 (UTC)
Sorry for joining late. And thanks Boghog for chiming in -- as usual, I have very little to add. I'll just mention that PBB does not do anything to the gene symbols and titles as they come from HGNC (through NCBI) -- neither italicization nor adding/removing whitespaces. While technically it's possible to add logic to do as you suggest, I think we'd want to make sure we had consensus first (please post at WP:MCB). Cheers, Andrew Su (talk) 05:10, 10 October 2013 (UTC)
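For reference, the whitespace rule discussed in this thread (no space between the number and kDa, per the HGNC guideline quoted above) could be sketched in a few lines of Python. This is only an illustration under stated assumptions, not ProteinBoxBot's actual code: the function name `strip_space_before_kda` is hypothetical, and it assumes that regular spaces, non-breaking spaces (U+00A0), and the literal `&nbsp;` entity should all be removed.

```python
import re

# Hypothetical helper, not part of ProteinBoxBot.
# Matches a digit, then one or more whitespace characters or "&nbsp;"
# entities, then the unit "kDa". In Python 3, \s on str patterns also
# matches the non-breaking space U+00A0.
_KDA_SPACE = re.compile(r"(\d)(?:\s|&nbsp;)+(kDa)\b")


def strip_space_before_kda(name: str) -> str:
    """Remove any spacing between a molecular weight and 'kDa'."""
    return _KDA_SPACE.sub(r"\1\2", name)


if __name__ == "__main__":
    example = "Polymerase (RNA) III (DNA directed) polypeptide F, 39 kDa"
    print(strip_space_before_kda(example))
```

Names that already follow the convention (e.g. "30kDa") pass through unchanged, so a rule like this could be applied repeatedly without side effects. Whether it should be applied at all is the consensus question raised above.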

ProteinBoxBot updates to UniProt links

Hi Andrew. ProteinBoxBot is making a large number of updates that seem to be in error. See for example diff. In the meantime, I have requested a temporary emergency shutdown of the bot. Cheers. Boghog (talk) 11:03, 15 November 2013 (UTC)

Thanks for the note, Boghog. We're on it now and will report back here soon... Cheers, Andrew Su (talk) 17:24, 15 November 2013 (UTC)
All should be fixed now thanks to User:X0xMaximus. Thanks for reporting this! Cheers, Andrew Su (talk) 21:21, 16 November 2013 (UTC)

User pages in Category:Human proteins

Hello, Andrew. I was perusing "Category:Human proteins" and saw that two of your user pages are in the category. I do not know how to remove them from the category because it seems like a template is putting them in there. You are probably more capable than I am with templates, so I will leave it to you. You do not need to reply to this message, unless you want to.

Warmest regards, Kjkolb (talk) 01:39, 6 March 2014 (UTC)

Fixed, thank you! Cheers, Andrew Su (talk) 05:39, 6 March 2014 (UTC)

PBB Bot at Czech wiki


Our community at Czech Wikipedia is interested in enriching our articles on proteins with PBB templates (as seen on en.wikipedia), which are filled in by the bot you operate. I would like to know whether you or one of your co-operators could run the bot on the Czech edition of Wikipedia, or whether we need to do it ourselves (and if so, whether you can provide guidance). Thank you very much! --Hypothalamus (talk) 12:38, 24 March 2014 (UTC)

Hi there. We unfortunately don't have the bandwidth to take on the Czech pages. (Having enough trouble keeping up to date on English WP!) You are welcome to use/adapt the code base here: Cheers, Andrew Su (talk) 17:37, 24 March 2014 (UTC)
We really need to focus on Wikidata. A single lua-infobox and a single bot could keep all languages of Wikipedia up-to-date. I directed the discussion here: d:Wikidata_talk:WikiProject_Molecular_biology#Wikidata_Infobox_on_Czech_Wikipedia. --Tobias1984 (talk) 11:28, 25 March 2014 (UTC)
I agree completely of course. Wikidata to me is the long-term solution. I suggested the pygenewiki code base only if you wanted to do a short-term hack, but in retrospect, any effort you would have put into that would be much better directed to Wikidata... Cheers, Andrew Su (talk) 18:29, 25 March 2014 (UTC)

WikiHack in DC on April 5-6

It's a long shot, but if you were going to be in DC for the weekend after next, you might consider going to this event, Open Government WikiHack. Klortho (talk) 05:01, 26 March 2014 (UTC)

Looks awesome, but yeah, unfortunately not possible for me to attend... Thanks for the heads up! Cheers, Andrew Su (talk) 21:39, 26 March 2014 (UTC)

ProteinBoxBot Errors

As documented here, the ProteinBoxBot appears to have made a large number of erroneous edits from July 2 to 4. Please recheck the bot script before running the bot again. Thanks. Boghog (talk) 08:19, 5 July 2014 (UTC)

Aaack, sincere apologies. The bot was dormant for a few months for an unknown reason. We did a first pass of debugging to work through some changes in the mwclient library, and assumed all would be good for a small run. Obviously we were wrong. We'll clearly apply much more scrutiny and caution as we move forward. Thank you for being our second set of eyes and fixing those errors! Cheers, Andrew Su (talk) 14:06, 5 July 2014 (UTC)
... and doing more spot checking, it's clear that the error rate is unacceptably high. I'm going to just manually revert all edits from PBB made on 3 July 2014. If you have an easy mechanism to programmatically do that, feel free... Cheers, Andrew Su (talk) 14:27, 5 July 2014 (UTC)
No problem. It appears only the edits marked as Minor aesthetic updates have problems. I have been manually reverting these and leaving the other edits which appear to be OK alone. Cheers. Boghog (talk) 14:33, 5 July 2014 (UTC)
But there are hundreds of minor revisions that we made, right? Ugh... Andrew Su (talk) 14:36, 5 July 2014 (UTC)
Plus I've noticed several minor removals of correct content (EC number, chromosome). Certainly not as egregious as the others, but that also makes me not opposed to a wholesale reversion of all edits... Andrew Su (talk) 14:39, 5 July 2014 (UTC)
222 to be exact. And most, but not all of these edits are faulty. So far, I have reverted about 1/3 of them. Boghog (talk) 14:40, 5 July 2014 (UTC)
Okay, all done now... Thanks for the eagle eyes. Spotting that problem after a couple hundred edits is a lot easier to fix than after a couple thousand edits. As always, let me know if you see any other problems! Cheers, Andrew Su (talk) 23:28, 5 July 2014 (UTC)
Great! Thanks for your diligence in fixing this. Cheers. Boghog (talk) 09:19, 6 July 2014 (UTC)

New bot run with same errors

Hi Andrew. Just a friendly alert. An apparently new bot run but with the same types of errors. Cheers. Boghog (talk) 15:02, 9 July 2014 (UTC)

Thanks! Our enthusiastic new programmer gets in earlier than I do (hadn't yet had a chance to debrief on last week's run). A quick email and chat later, we're back on track. Those new edits are now being reverted... Cheers, Andrew Su (talk) 15:31, 9 July 2014 (UTC)

Proposed GNF Protein box name change

Hi Andrew. Just a heads up to the above proposed name change. Cheers. Boghog (talk) 18:29, 21 July 2014 (UTC)

Thank you, much appreciated! Cheers, Andrew Su (talk) 18:36, 21 July 2014 (UTC)