Wikipedia talk:WikiProject Chemicals/Chembox validation

WikiProject Chemicals (Rated NA-class)
This page is within the scope of WikiProject Chemicals, a daughter project of WikiProject Chemistry, which aims to improve Wikipedia's coverage of chemicals. To participate, help improve this page or visit the project page for details on the project.
 NA  This page does not require a rating on the project's quality scale.
 


How about putting pKa in the table?

Hey, I'm not sure if this is the right place to suggest this, but I have been thinking it would be useful to add the pKa value to the Chembox. Personally I find it very important when looking for information on medicines, for example itraconazole. — Preceding unsigned comment added by 80.221.61.230 (talk) 20:34, 8 April 2013 (UTC)

Curation of the Excel file

Requests for changes in the format of the various files should be made (in person) at the offices of the Chembox Validation Workgroup. Note that there are currently several vacancies in this particular dodekatheon.

It seems that we're soon going to have several different versions of the Excel file: is this manageable and will it scale? Physchim62 (talk) 21:23, 20 January 2009 (UTC)

It will as long as I am given Godlike powers to sit on Mount Olympus, controlling the official releases! (ChemSpiderMan has similar godlike control over the SDF file!) As long as individuals only work on the section they have signed up for, there should be no problem for me to collate the information. For example, once Ambix completes 301-400 he can send me his "completed" Excel file, and I can copy/paste his validation column (and any added comments too) into the master Excel file. I will make regular "official" releases onto the pluto server, and send them to people via email also, upon request. The system does require that:
  • People don't add their own columns into the Excel file without discussion
  • I receive the edited versions once people have completed their sections
To me, it's just like a group doing edits to a draft of an MS Word document and sharing their comments with one coordinator. I was planning on emailing you tomorrow with the next official release anyway, which I hope will cover the first 600 (I'm up to 535 at the moment). Walkerma (talk) 09:40, 22 January 2009 (UTC)
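
The collation step described above could be sketched roughly as follows. This is only an illustration in Python, using plain dictionaries rather than a real Excel file; the row numbers, the field names `validation` and `comments`, and the function `merge_section` are assumptions made for the example, not the layout of the actual master file.

```python
def merge_section(master, contribution, first_row, last_row):
    """Copy validation results for rows first_row..last_row (inclusive)
    from a contributor's sheet into a copy of the master sheet.

    Both sheets are modelled as {row_number: {"validation": ..., "comments": ...}}.
    Rows outside the contributor's assigned range are ignored, so nobody
    overwrites another person's section by accident.
    """
    # Work on a copy so the previous "official release" stays untouched.
    merged = {row: dict(fields) for row, fields in master.items()}
    for row, fields in contribution.items():
        if first_row <= row <= last_row and row in merged:
            merged[row]["validation"] = fields.get("validation", "")
            merged[row]["comments"] = fields.get("comments", "")
    return merged


# Hypothetical data: the contributor signed up for rows 301-400.
master = {
    300: {"validation": "ok", "comments": ""},
    301: {"validation": "", "comments": ""},
    302: {"validation": "", "comments": ""},
}
contribution = {
    300: {"validation": "CHANGED", "comments": "outside the assigned range"},
    301: {"validation": "ok", "comments": "checked against CAS file"},
}
merged = merge_section(master, contribution, 301, 400)
print(merged[300]["validation"])  # ok (the out-of-range edit is ignored)
print(merged[301]["validation"])  # ok
```

Restricting the merge to the assigned row range mirrors the rule that individuals only work on the section they have signed up for.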

Sugars and amino-acids

From our various discussions last Tuesday, I can see a certain number of problems with simple sugars and amino acids. Let's start with the example of glucose, which is on the list that Martin has volunteered for. The article currently gives CASRNs for D-glucose and L-glucose (don't worry if you can't see them, this is a "known feature", they're there in the chembox code), 50-99-7 and 921-60-8. The CAS file confirms that the former is the 'general' CASRN for D-glucose, and also gives us the numbers for α-D-glucopyranose [492-62-6] and β-D-glucopyranose [492-61-5]: nothing about L-glucose (I know because I have the file in formula order ;). I would suggest that we change the heading of the chembox to

| Name = <small>D</small>-Glucose

However this would mean that we risk losing the information about the L-enantiomer.

For amino acids, CAS have usually given us data on the L-enantiomers, as one might expect. However, there are exceptions: we have numbers in the SDF file for L-leucine [61-90-5], D-leucine [328-38-1] and 'unspecified' leucine [328-39-2]; and for L-arginine [74-79-3] and 'unspecified' arginine [7200-25-1]. How should we treat cases like these? Physchim62 (talk) 01:29, 22 January 2009 (UTC)

Thanks for raising this. We have a couple of options, as I see it:
(a) Have an "unspecified" name on the Chembox (matching the article name), and one or more specific compounds defined in the Chembox, or
(b) Have a specific name on the Chembox, and limit data to that isomer.
Your suggestion is (b) in this list. At first glance (a) might seem attractive (so the chembox matches the article name), but in practice very few people will be looking for information on L-glucose, and almost no one will be searching for details on racemic or "unspecified" forms of glucose. The same will apply to many chiral drugs. With amino acids, these are still heavily skewed towards the natural isomers, but in this case the unnatural isomers are still of some interest.
The major problem with (a) is that if we have every piece of data listed 3 or 4 times over, our Chemboxes will get horrible. I remember putting together the early version of Chemboxes for many metal chlorides, and it was a nuisance to put information for every hydrate in as well as the anhydrous (and chromium(III) chloride was a nightmare!). I think, therefore, that we should adopt your proposal, PC, for all carbohydrates. For chiral drugs, we should discuss the issue with WP:Pharm.
For amino acids, we could treat unnatural isomers the same way we treat hydrates of inorganic salts or alkaloid hydrochlorides, by adding an extra line in the Chembox. In such cases the "default" would be the "natural isomer unless otherwise specified"; I'd say we'd put the L- in the Chembox title. There are alternatives: One would be to have a "related compounds" Chembox under the main one, where we list such data in groups (CAS# - here are the unnatural, the racemic, the unspecified CAS#s all in one section). Another alternative would be to put "unnatural" data onto the supplementary data page. Thoughts? Walkerma (talk) 09:33, 22 January 2009 (UTC)

For the moment, chiral drugs, steroids, etc. are not a problem. They usually have a fairly well-defined stereochemistry–name relation. As for the "small-molecule" problem, where both enantiomers may be available, I've made an "SAA" subset of the entries. It contains (at present) 314 CASRNs which may (or may not) be related to natural amino acids or natural monosaccharides; I will hazard a guess that validation will map the CASRNs onto fewer than 100 articles, and far fewer if my original inclusion criteria are constricted at the same time. I think these will need priority attention once we're out of the first phase, but I can't do them immediately. Physchim62 (talk) 00:21, 23 January 2009 (UTC)

Aha! I just got an email reply from CAS telling me that I'm over 10 years out of date with my knowledge! Apparently racemate CAS#s went out in 1997, see this. This resolves (ha ha) the apparent ambiguity in some of our compounds between unspecified vs racemates - they're all the unspecified CAS#. Walkerma (talk) 20:55, 23 January 2009 (UTC)

Inorganics etc

The inorganics list has expanded slightly, now that I've found where the carbonates and carbonyls were hiding: currently at 677 CASRNs, of which about 400 have been checked. For reference, "inorganic" is any neutral species without a carbon–hydrogen or carbon–carbon bond, but which is not an element (internal definition). For elements, we could start a verification but we would have to speak with WP:ELEMENTS to make any visible changes: there are also problems with elements such as oxygen and chlorine, where we have CASRNs for both the molecules and the atomic species. For isotopes (we have several cases where CAS have given us the CASRN of specific isotopes such as carbon-14), there is no obvious place to put the number in the current scheme of things, and WP:ELEMENTS is probably the best place to ask for advice.

For ions, the SDF file can only be described as a hotchpotch. I have CASRNs for phosphate [14265-44-2] and dihydrogen phosphate [14066-20-7], but not for hydrogen phosphate. The Wikipedia presentation of ions can hardly be described as "our best" either, and maybe it would be best to try to start a discussion on how this chemistry should be presented. Physchim62 (talk) 00:21, 23 January 2009 (UTC)

Structures

How about creating a template that can be added to the image description pages of verified structures? Something along the lines of

The accuracy of this structural formula has been verified as part of the English Wikipedia's Validation of Chemical Data project.

This could even add images to a category. Fvasconcellos (t·c) 15:15, 24 January 2009 (UTC)

Very nice idea! We need to find out how this would work with PC's proposed system for image validation. But we've seen that talk page assessment templates worked very well for the 1.0 project, and this could do the same for validation. Thanks, Walkerma (talk) 16:09, 24 January 2009 (UTC)

I think we would need more information on the template, such as

The accuracy of this version of the structural formula has been verified as part of the English Wikipedia's Validation of Chemical Data project.
CAS registry number: 50-00-0
InChI=1/CH2O/c1-2/h1H2
WSFSSNUMVMOOMR-UHFFFAOYAT

There is also a Commons WikiProject Chemistry - a bit inactive, but it might be an appropriate place to centralize Commons-related points. Physchim62 (talk) 19:01, 25 January 2009 (UTC)

See Commons:User:Physchim62/Sandbox for a working test template. Physchim62 (talk) 12:18, 27 January 2009 (UTC)

OK, I suggest we go live on structure validation for two smallish groups of structures:

  • those like Image:Digitoxin.png which are so damn complicated that you need to draw them out again to check the stereochemistry; and
  • those which are drawn specifically from CAS data for the validation effort.

More details can be found at Commons:WikiProject Chemistry/Structure validation. Physchim62 (talk) 13:33, 5 February 2009 (UTC)

Purines and pyrimidines

Just a quick note to let people know that there is a bug somewhere, either in CAS structure files or in ChemFileBrowser, but the hydrogen is not always placed correctly in the structure you see in CFB for purines and pyrimidines. This can be shown by the fact that the displayed structure does not correspond to the quoted CAS Registry Name. For the two or three cases I've seen so far, the WP structure matches the CAS Registry Name, so I've had no problems in verifying, but I wanted to point this out to others. Physchim62 (talk) 14:45, 4 February 2009 (UTC)

Update from CAS

CAS informed us that they have now added 626 links TO Wikipedia from http://www.commonchemistry.org/. Some of the later entries on our Excel organics list such as naphthalene aren't there yet, and neither are the inorganics from PC's list, but some more of the early entries such as retinol are now included. I've asked about when the others will be added, and also asked when the press announcement on commonchemistry.org will happen. Walkerma (talk) 15:42, 29 April 2009 (UTC)

Is single verification of CAS no. from CAS website sufficient enough...???

(moved from project page)

My experience has told me that even government websites can be flawed--222.67.212.209 (talk) 05:33, 11 February 2010 (UTC)

Yep, we know. We start at one end; when the data is linked to multiple databases and we have more identifiers there, then we can start cross-checking which identifiers do not match up in different databases, and have a second look. The CAS number is sometimes problematic (multiple CAS for one compound, or multiple compounds for one CAS .. though that should not happen; but that is true for a lot of databases). --Dirk Beetstra T C 08:38, 11 February 2010 (UTC)
For validating the CAS no., the CAS website should be enough - because by definition, whatever CAS says is the CAS no. is the CAS No. It's true that there are flaws, but CAS is about the best there is, and we have to start somewhere. Pubchem, on the other hand, is worse than Wikipedia for accuracy. Walkerma (talk) 23:05, 11 February 2010 (UTC)
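
As a side note to the above: a CAS Registry Number carries a check digit, so a purely local test can catch transcription errors before anything is compared against the CAS website. The sketch below (Python; the function name `cas_checksum_ok` is made up for this example) applies the published check-digit rule: the digits before the check digit are multiplied by 1, 2, 3, ... counting from the right, and the sum modulo 10 must equal the final digit. It cannot tell whether a well-formed number belongs to the right compound, which is what the verification against CAS is for.

```python
import re


def cas_checksum_ok(casrn: str) -> bool:
    """Return True if casrn has the form NNNNNNN-NN-N and a valid check digit."""
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", casrn)
    if match is None:
        return False
    body = match.group(1) + match.group(2)  # all digits except the check digit
    check_digit = int(match.group(3))
    # Weight 1 for the rightmost body digit, 2 for the next one, and so on.
    total = sum(weight * int(digit)
                for weight, digit in enumerate(reversed(body), start=1))
    return total % 10 == check_digit


# CASRNs quoted on this page for D-glucose and L-glucose, plus a deliberate typo.
print(cas_checksum_ok("50-99-7"))   # True
print(cas_checksum_ok("921-60-8"))  # True
print(cas_checksum_ok("50-99-8"))   # False
```

A check like this only flags malformed numbers; a wrong but well-formed CASRN still has to be caught by comparison with CAS data.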

While looking over the bixin page for accuracy, I noticed the listed CAS No and ChemSpider ID are, in fact, correct. The CAS No is not included in CommonChemistry's database, but the ChemSpider value is listed on ChemSpider, yet both are still marked by a red X. Is there a way to fix this? --Skoot13 (talk) 20:46, 20 January 2012 (UTC)

Validation of structures

Currently we have this proposal for structure validation. I think this is a great start, but it was written almost two years ago (before CheMoBot validation and StdInChIs), and it also needs "beefing up". Also, things aren't so simple - structure files are separate from the related articles. Here are some thoughts:

The current structure validation scheme, outlined at the above page on Commons, is designed to check that the structure file is correct. This is valuable, but there are shortcomings in the system:

  • What if a vandal or a well-meaning editor decides to upload a new version of a structure (for example, at higher resolution), and the new version of the structure is wrong? The editor is unlikely to remove the "validated" tag - so we end up with erroneous structures tagged as correct.
  • What if we have an article with a chembox, where the structure is validated against Common Chemistry, and the structure file is then changed? For example, if (after chembox validation) someone changes the image file from File:3-Hexanol.png to File:3-HexanolStructure.png, and the new structure is wrong - how can we tell? CheMoBot may report the change, but how will an average user tell that the structure is now wrong?

It seems to me that we need to record which version of which image file was validated, and the user should be able to see this, even if the current chembox has a newer revisionID of the image, or even a different structure image. Also, we need to record in some way that the data in the chembox relate to the validated structure. This should be the case for structure identifiers (SMILES, InChI, etc) but also for physical constants - e.g., if the structure shown is enantiopure, but the MP is for a racemic mixture, that's wrong. How can we do this? I'll be on IRC tomorrow at 1600h UTC if anyone wants to discuss this. Walkerma (talk) 06:23, 9 November 2010 (UTC)

Sigh.
My point throughout has been .. we first need to get for every Wikipedia page with a chembox or drugbox correct identifiers. Let that be CAS, let it be ChemSpider - and we can then link those, unambiguously, to InChI's, StdInChI's etc. CheMoBot can keep track of those, and when we know that we have the e.g. correct StdInChI on a page, then we know which is the correct structure for a compound.
Every upload on Wikipedia or commons has a revid as well, and with having a correct StdInChI on Wikipedia the step to link that to a correct image on Commons, (with an image-revid) is also possible. CheMoBot can then track if a) the image revid on commons changed, or b) the image filename on Wikipedia changed. And then tag that somewhere the image changed on Wikipedia. That is all possible, I would only need to write it (and most of the code is there, just needs to be made specific to follow commons appropriately).
The point is - in the last year there has been no effort in validating identifiers (see this and this). Not having validated identifiers makes it a hell of a job to validate StdInChI's, and not having validated StdInChI's makes it a hell of a job to validate structure images. Of the thousands of InChI's that are there (there are fewer StdInChI's) you have no clue which ones are correct - are you guys really planning to a) validate identifiers like CAS and CSID, b) validate the InChI's and StdInChI's belonging to them, and c) validate the images scattered on commons and Wikipedia, while there has been not a single coordinated effort to do only identifiers?
I can start writing the code accordingly, make CheMoBot follow Commons Images, prepare an en.wikipedia index for image revids, and make the bot tag those images which get changed (I'll toss in a couple just to test the system) - still you guys have to help me in actually getting those frigging verified revids for both the Wikipedia pages and for the Commons images into the corresponding indices.
Sorry, if I sound a bit frustrated here, I hear a lot of 'we want this and that', but we have only 30-35% of verified CAS numbers at the moment, we could switch to ChemSpiderIDs, which may bring it up to 40-45% .. still there are thousands of pages where we do not even know if CASNo, CSID, UNII, InChI, SMILES, StdInChI, pubchem, whatever are correct (for CASNo - 65% would amount to 7000-8000 pages for which we do not know ANYTHING!). If we do not have that, I have NO clue how we will be able to find correct StdInChI for the pages (a field which is practically unfilled in Wikipedia .. maybe we have 20 now!!!). For what it is worth, please start adding those StdInChI's to 11000+ chem and drugboxes (please keep track which ones are correct), then we can link commons images to that - but without those I think that that exercise is completely, totally, utterly futile.
I don't even want to think about melting points ...
I'll have another look at unambiguously linking commons-image-revids to en.wikipedia pages .. there is no harm in having the system ready for it. --Dirk Beetstra T C 08:54, 9 November 2010 (UTC)
First half is in effect now:
I am going to make CheMoBot follow commons as well, so it will also change Benzene if there is an upload of a verified image (and not only when there is, maybe months later, an edit to Benzene).
All the settings are in effect, the only thing left to do is actually fill this log. --Dirk Beetstra T C 13:22, 10 November 2010 (UTC)
That is brilliant! I will get to work on this as soon as my work on 0.8 is complete. Many thanks, Walkerma (talk) 18:06, 10 November 2010 (UTC)
As soon as CheMoBot has established where an indexed image is used (it self-generates an index for that in its userspace; I have to write a more efficient way of generating that index of image use), it will also edit the page on en.wikipedia when a new version is uploaded on meta. --Dirk Beetstra T C 19:06, 10 November 2010 (UTC)
Efficiency added, see User:CheMoBot/Chemicals/Images.css on which the bot collects all the images for all the chembox and drugbox pages (only editable by the bot, but maybe useful to extract a full list of images from). --Dirk Beetstra T C 15:51, 13 November 2010 (UTC)
OK, thanks! I'll start work on this late next week. Walkerma (talk) 20:55, 13 November 2010 (UTC)

Issues found during implementation

I'm working through a list of around 5000 compounds at the moment - so far I've only done about 5% of them. I would describe my work as "verification" rather than "validation" because I'm checking structures, but I'm not "proving" them to be correct against a primary source. The good thing is that I'm checking that the links to WP from ChemSpider are correct, too - some aren't. Everything is working smoothly, except for a couple of things:

  1. MAIN ISSUE. On-wiki, we don't have a way of recording that Structure A is correct for compound A. In other words, a structure is only "correct" in a certain context. For example, let's say I verify a structure for 1,1,2-trichloroethane. Then someone inserts that structure in the article on 1,1,1-trichloroethane by mistake. The 1,1,2-trichloroethane structure is marked as "correct", but how do we know which article it is correct for? (This is especially true if the image file has, say, a Russian name).
  2. If I see two structures in a Chembox, but I can only validate one of them, what should I do?
  3. I'm currently only verifying structures against ChemSpider, with an obvious reality check for chemical names (I can't verify complex drugs in the same way). I figure that's a start, but it's not perfect; it assumes, too, that the structures match with Dirk's work of structure vs CAS verification/validation.
  4. If I find an incorrect structure, what should be the correct course of action? How about if I find a correct structure that is drawn badly? In a couple of cases, I've redrawn these and uploaded them, but if I don't have time, is there a way to tag for this?
  5. Should I be compiling an SDF file for this, that we can share between us?

Any comments? Should we have an IRC discussion on this, before I get deeper into the work? Walkerma (talk) 00:09, 2 January 2011 (UTC)

Funny ...
  1. The bot 'records' that by itself .. so it then knows where which image is used .. but I did not contemplate that that might be .. wrong. Good observation. Working on this. Will bring some extra work with it ..
  2. Well, you validate the image, the bot only knows which one is verified, the other is ignored by the bot. I would verify the one which is of interest, no need to verify a picture of an ampoule with the compound.
  3. They should be matched against the InChI. I must say, I also checked against ChemSpider .. I may have made mistakes (though generally, if it took too much time, I ignored). Also note, that I worked from a list that I got from ChemSpider as well, which may also contain some mistakes (I did notice some).
  4. Not add the revid to the index, what I do is tag the image on commons, there is a template for it (I don't do drawing, maybe I should learn to do that).
  5. Never hurts.
I think you can go on .. will apply a patch and explanation in the next hour. --Dirk Beetstra T C 09:26, 10 January 2011 (UTC)
Regarding '1'. I have changed the index; it does indeed need to know for which page the image is verified. The index is now '<filename>=<commons-logid>#<en.wikipedia-filename>' .. Sorry, Martin, you'll need to add all the pagenames now to the old items (but I will try and help a bit later). --Dirk Beetstra T C 09:48, 10 January 2011 (UTC)
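
To make the revised index format concrete, here is a hedged Python sketch of how entries of the form '<filename>=<commons-logid>#<en.wikipedia-filename>' might be read and queried. It is not CheMoBot's code: the names `VerifiedImage`, `parse_index_line` and `is_verified_for` are invented for illustration, the example entry uses made-up values, and the only thing taken from the discussion is the three-part pattern itself.

```python
from typing import NamedTuple


class VerifiedImage(NamedTuple):
    filename: str       # image file name
    commons_logid: str  # identifies the verified upload on Commons
    page: str           # en.wikipedia page the image was verified for


def parse_index_line(line: str) -> VerifiedImage:
    """Split one '<filename>=<commons-logid>#<page>' entry into its parts."""
    # Split from the right so that '=' inside a file name does not confuse us.
    left, hash_sep, page = line.strip().rpartition("#")
    filename, eq_sep, logid = left.rpartition("=")
    if not (hash_sep and eq_sep and filename and logid and page):
        raise ValueError(f"not a valid index entry: {line!r}")
    return VerifiedImage(filename, logid, page)


def is_verified_for(index, filename, page):
    """True only if this image was verified for this particular page."""
    return any(entry.filename == filename and entry.page == page
               for entry in index)


# Made-up entry; the log id 12345 is a placeholder, not a real Commons id.
index = [parse_index_line("Benzene.png=12345#Benzene")]
print(is_verified_for(index, "Benzene.png", "Benzene"))  # True
print(is_verified_for(index, "Benzene.png", "Toluene"))  # False
```

Keying the check on both the file and the page is what addresses the case where a correctly drawn structure ends up in the wrong article.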

InChI and InChIKey depend on the tautomeric form of a structure, so presenting a single InChI or InChIKey is not sufficient unless we agree on a 'normalized tautomeric form' of structures and on the corresponding process to create it. Either that, or we would need to list the InChI(Key)s for all possible tautomeric forms to ensure that we are linked correctly with the other databases out there. Finally, I am concerned about having 'one validated' form of the InChI(Key); strictly speaking, this is not correct right now. JKW (talk) 08:57, 4 February 2013 (UTC)

Infobox additions violate self-reference guidelines

As explained at Wikipedia:Manual of Style (self-references to avoid) Wikipedia articles, including infoboxes, should normally not refer to anything Wikipedia-specific (such as this project). It causes the article content to be broken on mirrors/off-line readers/print/etc. Is it really necessary to display the validation info in the infobox? Aren't the hidden categories sufficient for managing things? What about displaying the status in a Talk Page template instead? Kaldari (talk) 00:15, 2 December 2010 (UTC)

For scientists, knowing that specific information has been validated is of critical importance; many will ignore data that are not validated. If Wikipedia is to gain credibility in the academic community, it is essential that such validation work has been done. There may be some other way to resolve this, instead of (in effect) hiding the information from 99% of users because of a technicality. Since I actually coordinate much of the offline release work on the English Wikipedia, I will look into the effects of this in our Version 0.8 release, for which the ZIM file will be ready next week. Walkerma (talk) 17:29, 9 December 2010 (UTC)
First of all, scientists should not be depending on Wikipedia for validated information. They should validate information for themselves regardless of what assurances are offered by Wikipedia. Secondly, it is perfectly fine to validate the articles, but that self-referential information should not be in the article body itself. The validation information should be either in the iconbar or on the talk page. For example, when an article is promoted to be a featured article, we don't add that information into the article itself. We add an icon to the iconbar that links to further information and we add a template to the talk page. This project should be doing something similar. At the very least, we cannot have the 'verify' link which links to a diff in the infobox, as this will definitely be broken in offline/external applications and print. Kaldari (talk) 17:59, 10 December 2010 (UTC)
Also, how long were you guys planning on keeping these links in the infobox? Was it just until the verification push is done or were you planning on keeping them in the infobox indefinitely? Kaldari (talk) 18:20, 10 December 2010 (UTC)
If I understand it correctly, it is only the 'verify' link that you have problems with? The rest of the infobox relating to the verification work (the ticks and crosses) is not of concern? --Dirk Beetstra T C 13:40, 16 December 2010 (UTC)
All of the verification content is somewhat problematic, but the links to "What is this" (which goes to Wikipedia namespace) and "Verify" (which links to article history) are especially problematic. I would suggest either (1) moving all of it to a Talk page template, or (2) replacing the "What is this" link with an inline explanation or a footnote and removing the "Verify" link. Kaldari (talk) 19:15, 16 December 2010 (UTC)
I removed the cross-namespace link; the 'verify' link is something that maybe needs rethinking. Note, there are a plethora of templates which have 'edit', 'view' and similar buttons, displayed in mainspace; whereas I can argue that the 'verify' button has use, those are especially out of line. Maybe the 'verify' button can be replaced by something that is not rendered in an off-line version?
'all of the verification content is somewhat problematic' - how do you mean? What concerns do you have with the 'ticks' and 'crosses'? --Dirk Beetstra T C 08:45, 17 December 2010 (UTC)
I suppose the ticks and crosses are OK. We just don't normally put assessment metadata in the articles themselves and I think it could be confusing to readers. The verify link is especially confusing as there isn't really any way to understand what it links to without being familiar with this project. Why not put the infobox assessments in the WikiProject templates on the Talk pages instead? Kaldari (talk) 18:35, 17 December 2010 (UTC)
Good, why not? Because you would like to see that data is correct. Why don't we put references to sentences on the talkpage?
And well, removing the cross-namespace link has now clearly shown that explanation is necessary. I am putting it back. --Dirk Beetstra T C 20:20, 17 December 2010 (UTC)
I agree that if you're keeping the validation info in the infobox, you do need some sort of explanation for it. Kaldari (talk) 23:02, 17 December 2010 (UTC)
Where else, Kaldari? If you put the validation data on the talkpage, then no-one who reads the article will know that we actually have reliable data on Wikipedia (as no-one will check the talkpage); if you put an icon somewhere else on the page, then that would give the false impression that the whole page is checked - something more difficult, if not impossible. Identifiers and maybe to a certain extent physical data are immutable, and that is the only thing that can be checked, and which can therefore be tagged. But other solutions are welcome. Thanks! --Dirk Beetstra T C 08:15, 18 December 2010 (UTC)
If you can figure out a way to hide it in print versions (using CSS) that would help at least. Kaldari (talk) 21:27, 18 December 2010 (UTC)
I think this is a good example of where WP:IAR applies. Verification of data is CLEARLY making Wikipedia better, just like inline citations have done. This system began because of a collaboration with CAS to provide us with CAS numbers - the first time CAS has ever shared these openly with an outside body - and we certainly couldn't turn down a chance to have the most reliable CAS numbers on the Web (besides CAS themselves). And despite whether scientists "should" or "should not" rely on Wikipedia, the reality is many do; more scary, one friend who is an EMT told me that they often use Wikipedia for drug information in the ambulance because it is fast, and often the best information source they can get to in the limited time. Clearly, we don't want to make our encyclopedia worse just to avoid having a tick on the page. I think Beetstra has done a remarkably good job of hiding a lot of metadata (and hard work!) behind a few unobtrusive ticks. Walkerma (talk) 06:28, 19 December 2010 (UTC)

A mysterious red X

As an example of the validation metadata being confusing for a reader, I noticed that Lead(II) chloride has a big red X in the infobox, so I clicked on the verification link and it showed me that someone fixed the case of one of the parameter names. Why this necessitates the infobox getting a red X is a mystery to me. Is this a bug or am I not understanding something? Kaldari (talk) 18:47, 17 December 2010 (UTC)

Good, and another example of why we had a 'what is this' link, which would have explained to you in detail what to do .. I am returning it. --Dirk Beetstra T C 20:22, 17 December 2010 (UTC)
By the way, thank you for pointing me to the error in Lead(II) chloride. Seemingly, it is just as effective as a {{fact}} (hey, also a cross-namespace link there) tag .. though this is not exactly a thing which needs a fact .. it is plainly something that was wrong on one end or the other, and needed repair. --Dirk Beetstra T C 21:46, 17 December 2010 (UTC)
The CAS number for 1,2,3-Trichloropropane also has the big red X for no good reason.

coolest thing ever

This chembox validation concept is the coolest thing ever. Big kudos to whoever thought of this. Really, REALLY clever. 75.4.194.121 (talk) 09:22, 16 January 2011 (UTC)

what is this?

Plutonium hexafluoride displays a chembox with some values marked with this red cross: N. The foot of the infobox displays the same red cross appended with a "what is this?" link to Wikipedia:WikiProject Chemicals/Chembox validation followed by the text "(verify)".

Please change the "what is this?" to read "needs validating", keeping the link the same. This could save the reader an extra click, and also makes more sense when the infobox is rendered in a PDF or in hardcopy. See the PDF generated by Book:Plutonium as an example.

I think the red cross is a good idea and should appear in printed form too, but I wish to find a way to exclude the "(verify)" text from print. If I knew which template generates this I could add it to Category:Exclude in print, or otherwise request it be modified on its talk page. Could someone do this for me? -84user (talk) 14:35, 9 June 2011 (UTC)

I agree with this being one of the best ideas on the chemical part of Wikipedia! Big thanks for this! By the way, there's another red cross on Etorphine that I don't know how to fix. Could anyone please do it for me? Thanks in advance. /Aeorisdisc → 14:51, 14 June 2011 (UTC)

Melting & boiling point validation

I would like us to start a process of validating melting point and boiling point data within chemboxes. This means checking numbers against reliable sources, and ideally multiple sources. I know we have talked about this for years, and the size of the task has always appeared daunting. However, I recently had a conversation with an organic chemistry professor friend who is using computer-based tools to evaluate melting point data from the literature, so I think we perhaps have an opportunity to get some useful data quite quickly.

My proposal is that we would start initially by verifying data for important substances, perhaps just 100 or so initially. I propose we record the data and references on the supplementary data page - that way we would have a virtual paper trail for every data point, an important component of any validation effort. I would also propose that the fields be patrolled by CheMoBot, just as are the other validated fields, assuming that User:Beetstra can adapt the bot in this way. I would be willing to devote much of my wiki-time to this validation work for the next year or so, if people are supportive of the effort. What do you think?

I'm proposing an IRC meeting on #wikichem on Thursday at 1500h UTC (11am US Eastern Time), which is a time when my chemistry colleague is able to join us. Please join us! Walkerma (talk) 19:17, 25 July 2011 (UTC)

I'm interested in a common database for physical properties shared between different wikis. I have started to build a database in an Excel file with data from the Handbook of Chemistry and Physics (Tfus, Teb, density and refractive index). My question is whether there is a bot able to write the content of the database in wiki format, in order to avoid manual entry or correction of data in existing chemboxes. (excel -> text/xml file -> bot -> wikipedia articles)
The best solution for the long term is the automatic generation of the chemboxes from an external database in order to provide data in a unique format. I don't know what the policy is in en:wp, but in fr:wp we try to include, for every property, the reference from which the value was extracted. To be useful, a database has to include references, especially for data comparison purposes.
So if someone is interested, I can share my Excel file as a draft for discussion. Biglama (talk) 17:07, 10 August 2011 (UTC)
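
As a rough illustration of the pipeline mentioned above (excel -> text file -> bot -> articles), here is a minimal Python sketch of the middle step only. It assumes the spreadsheet has been exported to CSV, and every column name and template parameter name in it (Tfus_C, Teb_C, MeltingPtC, the "_ref" suffix and so on) is a placeholder chosen for illustration; none of them has been checked against the Chembox documentation or against any existing bot. It keeps the reference next to each value, as suggested for fr:wp.

```python
import csv
import io

# ASSUMPTION: hypothetical mapping from spreadsheet columns to template
# parameters. These names are placeholders, not verified Chembox parameters.
FIELD_MAP = {
    "Tfus_C": "MeltingPtC",
    "Teb_C": "BoilingPtC",
    "density": "Density",
    "refractive_index": "RefractIndex",
}


def row_to_wikitext(row):
    """Turn one spreadsheet row (a dict) into template parameter lines."""
    lines = []
    reference = (row.get("reference") or "").strip()
    for column, param in FIELD_MAP.items():
        value = (row.get(column) or "").strip()
        if not value:
            continue  # leave out properties with no data
        lines.append(f"| {param} = {value}")
        if reference:
            # ASSUMPTION: a "<param>_ref" parameter is purely illustrative.
            lines.append(f"| {param}_ref = <ref>{reference}</ref>")
    return "\n".join(lines)


def csv_to_wikitext(csv_text):
    """Map each compound name in the CSV text to its generated wikitext."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["name"]: row_to_wikitext(row) for row in reader}


if __name__ == "__main__":
    sample = (
        "name,Tfus_C,Teb_C,density,refractive_index,reference\n"
        "Water,0,100,,,Placeholder reference\n"
    )
    for name, text in csv_to_wikitext(sample).items():
        print(name)
        print(text)
```

The output is only text; a bot (or a person) would still have to merge it into each article, and that step is not sketched here.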

Validation question

Hi, I'm working on the article psilocybin, which is currently up for GA review. The reviewer has asked if I can get rid of the red x in the chembox (next to the ChEBI identifier). I read through the instructions, but am still unclear what I have to do to "validate" this. While I'm here, I would highly appreciate it if someone knowledgeable could look at the chembox in this article and see if there's anything missing; I'm aiming for FAC status eventually and would like to get all the i's dotted and the t's crossed. Thanks, Sasata (talk) 15:12, 12 September 2011 (UTC)

You need to verify that the ChEBI identifier matches the structure that you're working on - do a careful check on the linked page. I see that at present your structure has a zwitterionic form, whereas the ChEBI page has a neutral-only form of the structure - that may be the reason for the red X. (User:Beetstra has a list from the ChEBI people if you have a problem). Once you get all the validated fields correct, you save the "correct" RevisionID here, that is, the one for the revision that contains the correct data. The bot will then know if someone changes any of the verified fields - if someone tries to vandalise any of them, it should leave a big red X as a warning. I'll validate your structure on Commons for you as well - we don't have a red X for that, but it's obviously a key part of the validation process. The article is looking good - keep up the good work! Walkerma (talk) 21:41, 12 September 2011 (UTC)
Thanks for your response. I checked the fields between the two pages, and the only difference I can see is the presence of the zwitterionic structure, as you mentioned. However, the structure also differs from the one on the linked ChEMBL page, and there is no red x beside that identifier. So I updated the revid as suggested and saved the change, but the red X is still there in the psilocybin article. What's next? Sasata (talk) 16:21, 16 September 2011 (UTC)
  • Never mind, I guess I just had to wait for CheMoBot to update. Thanks for your help. Sasata (talk) 17:33, 16 September 2011 (UTC)

Request for instructions on correcting the index

The project page mentions an index to edit to correct verified values. Where is the index and how do I go about corrections? Thanks. –Temporal User (Talk) 13:05, 4 December 2011 (UTC)

Help on correcting CAS number validation

I noticed that the CAS number on the beryllium sulfate article is marked as incorrect. I checked the information through the online version of SciFinder, which lets one check CAS numbers, and the provided CAS number for the anhydrous form of BeSO4 is indeed correct. However, I am not sure how to update the validation files so that the cross is removed from the infobox, and the "What's this?" page does not provide me with that information. Can somebody tell me what to do here? I would like to be of use for this on other pages as well, so a step-by-step would be very much appreciated. Kumorifox (talk) 16:22, 3 January 2013 (UTC)

Same issue, different contributor: I believe the CAS Number for Janus Green B, 2869-83-2, is correct. It's pretty fuzzy to those of us on the "outside" how we go about helping with these, or if you even want us to do so. Robert Rossi (talk) 19:38, 14 October 2013 (UTC)
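
One check that anyone can do before asking for a CAS number to be re-validated is the built-in check digit: the last digit of a CAS Registry Number equals the weighted sum of the other digits (rightmost digit times 1, next times 2, and so on) modulo 10. The following is a minimal Python sketch of that check; the function name is made up for this example. Passing it only shows that the number is well formed, not that it belongs to the compound in the article, so it does not replace a lookup in a source such as SciFinder, and it does not by itself remove the red X.

```python
import re


def cas_check_digit_ok(cas):
    """Return True if the string looks like a CAS number and its check digit matches."""
    # Format: 2 to 7 digits, hyphen, 2 digits, hyphen, 1 check digit.
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", cas.strip())
    if match is None:
        return False
    body = match.group(1) + match.group(2)
    check_digit = int(match.group(3))
    # Weight the digits 1, 2, 3, ... starting from the rightmost body digit.
    total = sum(
        weight * int(digit)
        for weight, digit in enumerate(reversed(body), start=1)
    )
    return total % 10 == check_digit


if __name__ == "__main__":
    # The number quoted above for Janus Green B:
    # 3*1 + 8*2 + 9*3 + 6*4 + 8*5 + 2*6 = 122, and 122 mod 10 = 2.
    print(cas_check_digit_ok("2869-83-2"))  # True
```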

Red X on Methamphetamine

Not sure if this is the right place to report this, but the ChemSpider number listed for methamphetamine, 1169, is marked with a red X. — Preceding unsigned comment added by 76.181.160.60 (talk) 11:05, 1 June 2013 (UTC)