Wikipedia:WikiProject Chemistry/IRC discussions/22 Jan 2008
--- Log opened Tue Jan 22 11:06:15 EST 2008
11:06 <walkerma> Hi, sorry I'm a couple of minutes late! Have you been discussing anything?
11:06 -!- ChemSpiderMan [n=ChemSpid@c-68-33-151-242.hsd1.md.comcast.net] has joined #wikichem
11:06 <walkerma> Hi Antony
11:07 <+Rifleman_82> hi antony
11:07 <ChemSpiderMan> Hi all
11:07 <+Rifleman_82> just started i guess
11:07 <+Beetstra> Hi guys
11:07 -!- mode/#wikichem [+o Beetstra] by ChanServ
11:07 -!- mode/#wikichem [+v ChemSpiderMan] by Beetstra
11:07 -!- mode/#wikichem [+v walkerma] by Beetstra
11:08 <+walkerma> Beetstra: What does that do?
11:08 <+Rifleman_82> +v gives you a + beside your name
11:09 <+Rifleman_82> when the channel is moderated, only those with @ (ops) and + (voice) can talk
11:09 <+Rifleman_82> the rest can listen but not talk
11:09 <@Beetstra> Nothing special here .. but if I have to moderate the channel because of trolling, then people with 'voice' can still speak, the others that don't have voice can't say anything
11:09 <+dmacks> Wanna kick the bot?
11:09 -!- Netsplit niven.freenode.net <-> irc.freenode.net quits: +Physchim62
11:09 <+Rifleman_82> yes please
11:09 <+Rifleman_82> pc not staying?
11:09 <+walkerma> Thanks! Do you want to moderate this meeting, Beetstra?
11:09 <@Beetstra> CheMoBot quit
11:09 -!- CheMoBot [n=beetstra@69.37.168.214] has quit ["Mayday! Mayday! .. going down!"]
11:10 <+dmacks> netsplit...woooo:(
11:10 <@Beetstra> No, I leave that to you ..
11:10 <+Rifleman_82> i'm quite tired, so i'll prolly stay til max 1 am my time
11:10 <+dmacks> I'm logging
11:10 <+Rifleman_82> hey what happened to the last log?
11:11 <+Rifleman_82> i thought we were going to put it up somewhere?
11:11 <+dmacks> I may have a copy, can't remember who was actually planning to do it.
11:11 <+Rifleman_82> oh
11:11 <+walkerma> I have a log, but I wasn't sure how to distribute it - then the semester started...
11:11 <+Rifleman_82> i've got a copy, which i sent to pc
11:11 <+Rifleman_82> gimme a moment
11:11 <+Rifleman_82> i'll upload it
11:11 <+walkerma> Thanks
11:12 <+Rifleman_82> ed not joining us?
11:12 -!- Netsplit over, joins: +Physchim62
11:13 <+walkerma> OK, ChemSpiderMan, could you update us on the database? What still needs to be done?
11:13 <+ChemSpiderMan> I need to finish from P to W
11:14 -!- Physchim62 [n=Physchim@unaffiliated/physchim62] has quit [Read error: 110 (Connection timed out)]
11:14 <+walkerma> Will you be doing that some time next month? Is that the plan?
11:14 <+ChemSpiderMan> Then I need to go through one more time...faster second time...
11:14 <+Rifleman_82> sheesh
11:14 <+ChemSpiderMan> hopefully first week of Feb
11:15 <+walkerma> Great! I just looked at my Sandbox, and it looks like things are progressing there - many of the errors have been fixed.
11:15 <+walkerma> There are a couple of general problems we should probably agree on:
11:16 <+ChemSpiderMan> Second time through checking for some complex natural products
11:16 <+ChemSpiderMan> maitotoxin is a bear
11:17 <+walkerma> http://en.wikipedia.org/wiki/Image:Maitotoxin.png
11:17 <+walkerma> Looks more like a snake than a bear to me...
11:17 <+dmacks> ha!
11:18 * dmacks uses that as a teaching example of how "not-simple" an ether can be.
11:18 <+walkerma> Good idea, dmacks! Is it worth treating these "bears" as a separate list? That need more than one person to check them?
11:18 <+ChemSpiderMan> sorry..phone right now
11:18 <+Rifleman_82> okay logs at http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Chemistry/IRC_meeting_1
11:19 <+Rifleman_82> it's a mess but i think it's readable
11:19 <+walkerma> Thanks, R82!
11:19 <+Rifleman_82> i'll figure a way to do it nicely and clean it up
11:19 <+Rifleman_82> but i think it'll do for the moment
11:19 <+Rifleman_82> np martin
11:19 <+dmacks> Yeah, if there are some specific monsters that you want to set aside somewhere, /me can look as time permits.
11:19 <+Rifleman_82> oh...
11:19 <+Rifleman_82> well instaview and wiki doesn't give the same effect
11:19 <+Rifleman_82> i'll try it out while we discuss
11:20 <+walkerma> "Monsters" sounds like a good name for the page! Then we can check these carefully against the primary literature
11:20 <+walkerma> There are some other issues that are general:
11:21 <+walkerma> 1. How do we represent salts? We need a clear policy
11:21 <+walkerma> 2. How do we represent sugars - ring or open chain?
11:21 <+walkerma> 3. How do we address tautomers where both forms are stable
11:22 <+walkerma> Should we discuss these here?
11:22 <+Rifleman_82> what's the problem with salts?
11:22 <+walkerma> Often a structure will not have the counterion, but the CAS no does.
11:23 <+walkerma> Or perhaps a drug will be drawn in a neutral form, but the drug is a succinate salt or something
11:23 <+Rifleman_82> icic
11:24 <+walkerma> Perhaps we should say in our new MOS that we require salts to show their counterions - no quat ammoniums without the Cl- or whatever
11:24 <+walkerma> Does this sound reasonable?
11:24 <+Rifleman_82> agree
11:24 <+dmacks> concur
11:24 <+walkerma> http://en.wikipedia.org/wiki/Nile_blue
11:25 <@Beetstra> Hmm .. that gives the problem that you can't discuss the ammonium ion .. or you have to discuss it on every page (chloride, bromide, acetate, nitrate)
11:25 <@Beetstra> I would say .. a compound gets a chembox .. so ammonium chloride
11:26 <@Beetstra> But ammonium ion gets another box .. ionbox e.g.
11:26 <+dmacks> Is "nile blue" really the salt, or is it the imine, which is available as many HX salts?
11:26 <+Rifleman_82> chloride not seen
11:26 <@Beetstra> As I mentioned for functional groups
11:27 <+dmacks> Rifleman_82: Wikipedia:WikiProject Chemistry/IRC meeting 1a ?
11:27 <+walkerma> I'm guessing that it is generally used as the chloride, because that is what the CAS and formula give
11:28 <+ChemSpiderMan> Sorry..I'm back...I think the compound shown needs to be connected to the article name
11:28 <+Rifleman_82> dmacks: ?
11:28 <+dmacks> Okay, so that seems like a simple structure-drawing error.
11:28 <+ChemSpiderMan> The primary key of the article is the compound name..not the structure
11:28 <+dmacks> Rifleman_82: cleaner upload of the log
11:28 <+dmacks> Right, so again are there many possible "nile blue" with different counteranions, or is it specifically Cl- ?
11:29 <+ChemSpiderMan> So, is Nile Blue a chloride salt or not?
11:29 <+ChemSpiderMan> yes..exactly
11:29 <+ChemSpiderMan> Also, INTERNAL consistency between structure, SMILES and CAS
11:29 <+Rifleman_82> dmacks:looks very nice indeed. i'll move and delete mine
11:29 <+Rifleman_82> you're the official secretary henceforth!
11:29 <+ChemSpiderMan> Nile Blue...the structure has no Chloro...the SMILES does.
11:29 <+dmacks> I think we got not-very-far with this discussion last time, what happens when "the name" (wiki page title) does not map to a single compound.
11:30 <+dmacks> Rifleman_82: ok
11:30 <+ChemSpiderMan> Don't know what the CAS is
11:30 <+ChemSpiderMan> what's an example?
11:30 <+ChemSpiderMan> That will help me think about it...
11:30 <+Rifleman_82> betamethasone?
11:30 <+dmacks> Tartaric acid.
11:30 <+Rifleman_82> you have the valerate, and various other esters
11:31 <+Rifleman_82> betamethasone could use a copyedit since we're on it
11:32 <+ChemSpiderMan> betamethasone...is it a trade name for a material or the name of the steroid itself as drawn?
11:33 <+ChemSpiderMan> The way it is shown is that betamethasone is the structure drawn in the box...
11:33 <+Rifleman_82> free acid?
11:33 <+ChemSpiderMan> It says "It is available as a number of esters: Dipropionate (branded as Diprosone, Diprolene and others), Sodium Phosphate and Valerate (branded as Betnovate, Celestone and others)." and I think that covers the rest
11:33 <+Rifleman_82> maybe we stick with dmacks' simpler example
11:33 <+Rifleman_82> for the moment
11:34 <+ChemSpiderMan> tartaric acid...looking
11:35 <+ChemSpiderMan> This looks okay...
11:35 <+ChemSpiderMan> is there an issue I am missing?
11:35 <+walkerma> Look at the table at the bottom.
11:35 <+walkerma> The natural name for an article of this sort is "tartaric acid"
11:36 <+ChemSpiderMan> Yes...the name is fine
11:36 <+Rifleman_82> you have d and l and meso
11:36 <+walkerma> But there are several stereoisomers, and mixtures
11:36 <+dmacks> One "name" is three compounds, plus there's prolly also a CAS for the racemate.
11:36 <+ChemSpiderMan> The structure is fine...since the structure is NOT stereospecific
11:36 <+walkerma> And there's a CAS for "unspecified" as well, almost certainly
11:36 <+dmacks> (yup)
11:36 <+walkerma> So we can't use CAS as primary key
11:37 <+ChemSpiderMan> The way to specify for each of D/L/meso is to have separate articles
11:37 <+walkerma> I don't think we want that.
11:37 <+ChemSpiderMan> I agree
11:37 <+dmacks> concur strongly.
11:37 <+ChemSpiderMan> so this is fine as is I think
11:38 <+walkerma> (Walkerma considers how many isomers there are for maitotoxin)
11:38 <+dmacks> CAS in infobox is (according to the table) the generic for this name.
11:38 <+dmacks> Would we want at least separate data for each compound?
11:38 <+ChemSpiderMan> Going back to what I sense as the issue is the structure drawn should coincide with the article title and all derivatives (SMILES, etc) should be for that
11:39 <+ChemSpiderMan> So, if the article says chloride...show the chloride
11:39 <+ChemSpiderMan> have Chloride in the SMILES
11:39 <+ChemSpiderMan> have CAS for the chloride...not the neutral
11:39 <+ChemSpiderMan> have name for the chloride
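The rule ChemSpiderMan lists here (title, structure, SMILES and CAS should all describe the same salt) can be sketched as a simple check. This is only an illustration, not how the sweep was actually done: the function name, the record fields and the small counterion table are all assumptions, and covalent compounds such as acid chlorides would be flagged falsely.

```python
# Hypothetical table: counterion word in an article title -> SMILES fragment expected.
COUNTERION_FRAGMENTS = {
    "chloride": "[Cl-]",
    "bromide": "[Br-]",
    "sodium": "[Na+]",
}

def missing_counterions(title, smiles):
    """Return counterion words named in the title whose fragment is absent from the SMILES."""
    words = title.lower().split()
    # '.' separates disconnected components (e.g. cation and anion) in SMILES.
    fragments = smiles.split(".")
    return [
        word
        for word, fragment in COUNTERION_FRAGMENTS.items()
        if any(word in w for w in words) and fragment not in fragments
    ]
```

An empty result means the title and SMILES agree on the counterions in the table; anything else is a candidate for a human to look at.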
11:39 <+walkerma> Concur. We should make this VERY clear in the MOS
11:39 <+ChemSpiderMan> there are many examples where this doesn't happen
11:41 <+walkerma> Hopefully after this sweep there won't be many, and if people are more aware of it this problem won't happen so much in the future?
11:41 <+ChemSpiderMan> I think you are right.
11:41 <+ChemSpiderMan> It's very common with dyes to see no counterion
11:42 <+ChemSpiderMan> http://en.wikipedia.org/wiki/Azorubine has been cleaned up now..
11:42 <+walkerma> That's partly the history - my old UK boss worked in dyes - often they didn't even know the structure of what they made
11:43 <+ChemSpiderMan> no sodium ions before. The name was disodium and the CAS number was 2Na+
11:43 <+ChemSpiderMan> -related so should have been there
11:45 <+walkerma> So shall we agree that 1. We are consistent about counterions between structure, SMILES, CAS etc?
11:45 <+ChemSpiderMan> Agreement from me of course :-)
11:45 <+dmacks> yup
11:45 <+walkerma> 2. We use the Wikipedia article name as the "primary key" (at least for now) for the database, not the CAS?
11:45 <+ChemSpiderMan> for sure!
11:46 <+dmacks> Sounds good.
11:46 <+ChemSpiderMan> The article name "belongs" to wikipedia...CAS numbers don't
11:46 <+ChemSpiderMan> By that I mean that the article exists under the name
11:46 <+ChemSpiderMan> The CAS number is an association at best
11:46 <+Rifleman_82> hmm
11:46 <+Rifleman_82> the wp article name is not particularly static
11:46 <+Rifleman_82> or rather, there are many which need to be rationalized
11:46 <+walkerma> It's the way we organize things here, so it's the natural way - unless someone can come up with something better
11:47 <+Rifleman_82> like chloroplatinic acid
11:47 <+ChemSpiderMan> But there is no way to validate CAS or investigate via CAS...CAS is "behind closed doors"
11:47 <+Rifleman_82> c.f. dihydrogen hexachloroplatinate(2-)
11:47 <+Rifleman_82> i'm not rooting for cas
11:47 <+Rifleman_82> i'm saying there may be certain issues
11:47 <+walkerma> Yes, but if you look at the 6000 organics in ChemSpiderMan's collection, there must only be a handful changing their names each month, if that
11:47 <+Rifleman_82> unless we can validate all the names first?
11:47 <+ChemSpiderMan> You can search on CAS numbers but turn up a lot of poor associations
11:47 <+Rifleman_82> all 3000+ of them
11:48 <+ChemSpiderMan> Validate the names? Many of the names are NOT systematic...
11:48 <+ChemSpiderMan> sildenafil.
11:48 <+dmacks> ICANN is "a system" :)
11:48 <+ChemSpiderMan> heme c
11:48 <+Rifleman_82> http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Chemistry/IRC_discussions
11:48 <+ChemSpiderMan> many,many,many
11:49 <+Rifleman_82> and the ethanoic/acetic acid discussion...
11:49 <+ChemSpiderMan> chloroplatinic acid is a "common name"
11:49 <+ChemSpiderMan> you are now systematizing.
11:49 <+Rifleman_82> also placed a link to http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Chemistry
11:49 <+dmacks> thx
11:49 <+Rifleman_82> check out the big box at the top of the main page
11:51 <+ChemSpiderMan> the question is "what will people search on"...and I think chloroplatinic acid is more likely to be searched
11:51 <+ChemSpiderMan> but isn't necessarily the "correct name"
11:51 <+Rifleman_82> so long as there are redirects you can call it anything you like
11:51 <+Rifleman_82> but i guess we need to have certain... arbitrary but consistent standards for naming
11:51 <+ChemSpiderMan> Look at Walkerma and my discussion on DMF...
11:51 <+ChemSpiderMan> Yes, within the ChemBox for sure.
11:52 <+walkerma> I think we'll have to ponder the issue of validating the names. That could be a whole hour of IRC alone.
11:52 <+ChemSpiderMan> The best names possible- can convert back to the right structure for example
11:52 <+Rifleman_82> but chembox should pull the name from the article top
11:52 <+Rifleman_82> from the article name
11:52 <+ChemSpiderMan> Also...IUPAC vs CAS vs Beilstein?
11:52 <+Rifleman_82> the name = param should be used sparingly! if you use it it implies a lack of consistency, and it breaks when the page is moved
11:53 <+ChemSpiderMan> separate discussion...I say Systematic names in the ChemBox at all times (if possible)
11:53 <+ChemSpiderMan> but the article name CAN be systematic but shouldn't have to be...
11:53 <+ChemSpiderMan> otherwise you will be renaming HUNDREDS
11:54 <+ChemSpiderMan> Look on http://en.wikipedia.org/wiki/User:Walkerma/Sandbox
11:54 <+ChemSpiderMan> how many are systematic article TITLES?
11:54 <+ChemSpiderMan> 5%?
11:55 <+walkerma> Actually for inorganics, it's over 50%
11:55 <+walkerma> I'd guess
11:55 <+dmacks> Does an infobox map to the page (i.e., "generic tartaric acid") or to a particular compound (separate infobox for each isomer)?
11:55 <+ChemSpiderMan> sorry...I think you are right for inorganics...
11:56 <+ChemSpiderMan> they are also "easier" in many ways...
11:56 <+walkerma> dmacks raises an important issue
11:56 <+ChemSpiderMan> oxides, sulfides, sulphates etc...but organometallics and organics are not this way
11:57 <+walkerma> generic Tartaric acid does not have a specific MP, solubility, etc
11:57 <+Rifleman_82> generic = rac? or undefined?
11:57 <+dmacks> Rifleman_82: There appears to be a CAS for undefined.
11:57 <+walkerma> generic = undefined
11:57 <+walkerma> Because if it means rac, then we need a separate article on the meso
11:58 <+walkerma> Which we don
11:58 <+walkerma> 't want
11:58 <@Beetstra> For some, like tartaric acid, there could be a page for each .. but most are a problem
11:58 <+ChemSpiderMan> Undefined...
11:58 <+dmacks> If we had separate data table for each *compound* (in whatever sense makes it unique), would be easier to process it to/from databases. Then an article (which could have a title that is less specific than a single compound) could have data for each one.
11:59 <+dmacks> I disagree that "separate page for each compound" is a good solution, since they are often going to be copy'n'pastes of each other with [a]D sign-change.
11:59 <+Rifleman_82> m.p. may change too
12:00 <+Rifleman_82> and IR spectra , but we're not doing IR spectra
12:00 <+walkerma> We can't have separate pages for everything like that
12:00 <+walkerma> I think the tartaric acid article handles some of the data well - a nice table listing things like CAS
12:00 <@Beetstra> No, there are some exceptions .. most don't deserve both
12:00 <+walkerma> But we should really list MPs for each, solubilities, etc
12:00 <@Beetstra> 2/3 chemboxes on one page .. or a generic chembox on the page, and a /datapage?
12:01 <+walkerma> And the alpha-Ds for each of course!
12:01 <@Beetstra> And on the /datapage all chemboxes
12:01 <+Rifleman_82> or can we just stick to the undefined?
12:01 <+Rifleman_82> and prefer anhydrous over monohydrate
12:01 <@Beetstra> i mean, you can make that choice ..
12:01 <+Rifleman_82> prefer freebase over .HCl
12:02 <+Rifleman_82> WP:not CRC handbook?
12:02 <+walkerma> I think, Rifleman_82, WP is (for many people) now their CRC
12:02 <+walkerma> And their Merck index, their Aldrich catalogue
12:03 <+ChemSpiderMan> yes...it is becoming that way
12:03 <+walkerma> That's why people like Antony and Peter are interested in it
12:03 <+ChemSpiderMan> actually it's NOT the info in the ChemBox that's of interest to me at all
12:03 <+ChemSpiderMan> It's the text...
12:04 <+ChemSpiderMan> the descriptions, the history etc
12:04 <+dmacks> We (wikipedia) don't have to be comprehensive, but do need to be specific, and if others want to be able to process it automatically, need *some* systematic format for it.
12:04 <+Rifleman_82> how about a quick round - how many of you guys trust the data in chemboxes?
12:04 <+ChemSpiderMan> The majority of the ChemBox is of little concern (sorry guys)..
12:04 <+ChemSpiderMan> But it DOES need to be right for those who need it.
12:04 <+Rifleman_82> i don't trust it. if it really matters, i'll check CRC
12:05 <+ChemSpiderMan> I don't trust it...
12:05 <+Rifleman_82> if it doesn't matter, if i just want a feel for the state of matter, i'll trust the chembox
12:05 <+walkerma> So you need the Chembox as a "door" to find the text, is that right?
12:05 <@Beetstra> Same for me, trust .. nah .. but generally use it .. if it matters, I check properly
12:05 <+Rifleman_82> having entered many a chembox, i think i know better than to trust it
12:05 <+dmacks> Don't trust, but do fix when I find *blatant* errors (which isn't that often)
12:05 <+dmacks> I figure mp ballpark, etc.
12:05 <+Rifleman_82> most chemboxes are entered from MSDS, which are actually not authoritative
12:05 <+ChemSpiderMan> No...I need the article name to find the text...but the ChemBox is supposed to be correct which is why I want it fixed for you guys
12:06 <+Rifleman_82> and diff MSDS' conflict with each other esp bp mp and appearance
12:06 <+ChemSpiderMan> mp and bp are uncommon
12:06 <@Beetstra> One of the problems here is the free access .. anyone can put in anything .. all we can do is 'protect' it
12:06 <@Beetstra> (protect as in 'I don't trust your change, revert')
12:07 <+walkerma> That's the validation/flagged revisions issue, a whole other debate
12:07 <+dmacks> Last time I proposed putting the data on a separate page from the article, so that it would be easier to monitor changes to it.
12:07 <+ChemSpiderMan> The real ChemBox content of interest for most people I think is as follows: structure drawing, name, SMILES.
12:07 <+ChemSpiderMan> That's 95% of the value I think..
12:07 <+dmacks> Okay, we'll save that debate for later.
12:07 <+Rifleman_82> agree with ChemSpiderMan
12:07 <+walkerma> I'd like to bring this meeting to a close, if that's OK
12:08 <+ChemSpiderMan> ok
12:08 <+dmacks> Disagree...few care about SMILES, much of target audience cares about general physical properties and mw
12:08 <+dmacks> okay.
12:08 <+Rifleman_82> SMILES is useful
12:08 <+Rifleman_82> cut and paste smiles into chemsketch to generate structure
12:08 <+dmacks> Right, but not to most of who read wikipedia.
12:08 <+dmacks> *whom
12:08 <+ChemSpiderMan> people have no way to generate the structure
12:08 <+Rifleman_82> and for search?
12:09 <+ChemSpiderMan> text-based
12:09 <+ChemSpiderMan> no way to search Wikipedia by structure...SMILES is no good.
12:09 <+ChemSpiderMan> There are too many flavors...they have spaces on WP
12:09 <+ChemSpiderMan> too many issues
12:09 <+dmacks> yeah
12:09 <+ChemSpiderMan> It's the other reason I am doing the project with the SDF generation
12:10 <+ChemSpiderMan> Walkerma...how long left...I have a proposal
12:10 <+walkerma> Propose it, and I can always say, "Another day!"
12:11 <+ChemSpiderMan> When the SDF file is done I will supply the following:
12:11 <+ChemSpiderMan> Chemical Structures consistent with the title of the article (or my best suggestion)
12:11 <+ChemSpiderMan> SMILES strings for those structures
12:11 <+ChemSpiderMan> InChI Strings for those structures
12:11 <+ChemSpiderMan> InChiKeys for those structures
12:12 <+ChemSpiderMan> IUPAC names from software (no human bias) generated for those structures
12:12 <+ChemSpiderMan> Mw
12:12 <+ChemSpiderMan> Molecular Formulae
12:12 <+ChemSpiderMan> ALL generated by the same software package
12:12 <+ChemSpiderMan> Now...they will need publishing to WP
12:12 <+ChemSpiderMan> The challenge is as follows:
12:13 <+ChemSpiderMan> I want to have a second/third set of eyes to confirm that the structures are appropriate for the article
12:13 <+ChemSpiderMan> Or..you can trust me...
12:13 <+ChemSpiderMan> I would prefer you DON'T trust me
12:13 <+walkerma> We can trust a spider, right....?
12:13 <+ChemSpiderMan> It is an exhausting project and tired eyes...
12:13 <+Rifleman_82> haha
12:13 <+dmacks> Very funny, Miss Muffet.
12:14 <+ChemSpiderMan> (they bite you on the bum in Australia)
12:14 <+walkerma> Seriously, I agree, we should double check
12:14 <+ChemSpiderMan> Phew...
12:14 <+dmacks> Yes. If structure is the primary key for "a compound", it needs multiple eyes.
12:14 <+Rifleman_82> SDF = ?
12:14 <+ChemSpiderMan> Concatenated molfile
12:15 <+Rifleman_82> oic
12:15 <+Rifleman_82> replacing pngs?
12:15 <+ChemSpiderMan> I can just send a PDF File associated with each "letter" for now...
12:15 <+ChemSpiderMan> PNG is an image format..not a connection table
12:15 <+Rifleman_82> ok
12:15 <+ChemSpiderMan> I say we "try" a dry run with the letter "A"
12:15 <+walkerma> Better split up A into 2 or 3 PDFs
12:16 <+ChemSpiderMan> hold on..will tell you how big
12:16 <+ChemSpiderMan> about 250 records...
12:17 <+ChemSpiderMan> can split as necessary...how about chunks of 50 records per...
12:17 <+walkerma> That's OK, my workbooks for students are over 200 pages of PDF!
12:17 <+ChemSpiderMan> I am only sometimes validating PubChem links...so I am not taking that one at present
12:18 <+ChemSpiderMan> And CAS...I don't have Scifinder so can NEVER validate...just look for consistency
12:18 <+ChemSpiderMan> with other DBs...
12:18 <+ChemSpiderMan> someone else will need to check CSA
12:18 <+ChemSpiderMan> CAS
12:18 <+walkerma> Anyone here with Scifinder? I have to drive to another college for that
12:19 <+ChemSpiderMan> If we are ready to do this then I can send out the first 50 tonight/tomorrow
12:19 <+Rifleman_82> i have
12:19 <+Rifleman_82> i'll be quite free after next wed
12:19 <@Beetstra> I have .. but for 4000 compounds ..
12:19 <+Rifleman_82> but yeah, for 4k compounds?
12:19 <+ChemSpiderMan> it's expensive...
12:19 <+walkerma> We need to split the work up over several people, and probably several months too
12:19 <+Rifleman_82> can we filter out those easily verified ones?
12:20 <+Rifleman_82> use google to let webpages "vote"?
12:20 <@Beetstra> I have uni-access .. but only limited number of accounts .. guess they will be angry if they see me do this
12:20 <+Rifleman_82> if a lot of relatively reliable web sources agree on the cas, then we let it go?
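Rifleman_82's "let webpages vote" idea can be sketched as a plain majority count. The function name and the 75% threshold below are assumptions chosen for illustration; nothing in the discussion fixes how much agreement would be enough, and gathering the candidate values from web sources is left out.

```python
from collections import Counter

def consensus_cas(candidates, min_share=0.75):
    """Return the CAS number most sources agree on, or None if agreement is too weak."""
    if not candidates:
        return None
    cas, votes = Counter(candidates).most_common(1)[0]
    # Accept the top value only if it reaches the (assumed) share of all votes.
    return cas if votes / len(candidates) >= min_share else None
```

Records that return `None` would be the "exotic" ones left for a manual SciFinder check.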
12:20 <+ChemSpiderMan> this is the problem...that's my approach at present...
12:20 <+Rifleman_82> same thing, we have a limited number of accounts so i can probably only check for half an hour a day
12:20 <+Rifleman_82> it's only the exotic which really need more attention
12:20 <+ChemSpiderMan> CAS might take a while...so be it
12:21 <+Rifleman_82> the cas numbers of ethanol are probably verifiable by google
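Short of a SciFinder lookup, one cheap filter is the check digit built into every CAS Registry Number. The sketch below (function name assumed) tests only the format and check digit, so it catches typing errors but says nothing about whether the number belongs to the compound in the article.

```python
import re

def cas_checksum_ok(cas):
    """Check the format and check digit of a CAS Registry Number such as '64-17-5'."""
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", cas)
    if not match:
        return False
    digits = match.group(1) + match.group(2)
    # Weight the digits 1, 2, 3, ... from right to left; the sum modulo 10
    # must equal the final check digit.
    total = sum(i * int(d) for i, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(match.group(3))
```

For ethanol, 64-17-5: 7*1 + 1*2 + 4*3 + 6*4 = 45, and 45 mod 10 = 5, which matches the last digit.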
12:21 <+ChemSpiderMan> I have done the biggest part I "think"
12:21 <+ChemSpiderMan> So you are "checking"
12:21 <+ChemSpiderMan> should be much faster
12:21 <+walkerma> Yes, thank you for all your work, CSM
12:21 <+ChemSpiderMan> It might be good for a rotation
12:21 <+ChemSpiderMan> One person take the first 50
12:21 <+ChemSpiderMan> Next person the next 50
12:21 <+walkerma> As a student project?
12:21 <+ChemSpiderMan> Your call gents...
12:21 <+ChemSpiderMan> as long as the person "cares"
12:21 <+Rifleman_82> whoever can harness their students, please do so!
12:22 <+Rifleman_82> then, we can divvy up the remaining load
12:22 <+ChemSpiderMan> there needs to be a process so that 10 students all aren't reviewing the same stuff.
12:22 <+ChemSpiderMan> Maybe a set of letters to each group?
12:22 <+ChemSpiderMan> or records 1-500, 501-1000, etc
12:22 <+Rifleman_82> or we can all "ask" for a quantity
12:23 <+ChemSpiderMan> yup
12:23 <+Rifleman_82> let's say i want 600 entries
12:23 <+Rifleman_82> so you allocate 600 for me
12:23 <+Rifleman_82> and give it to no one else
12:23 <+ChemSpiderMan> and I can manage the distribution...
12:23 <+Rifleman_82> yeah that's simple
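The "ask for a quantity" scheme amounts to handing out consecutive, non-overlapping slices of the record list. A minimal sketch, assuming records are identified by a simple list of IDs and each reviewer asks once; the function and argument names are hypothetical.

```python
def allocate(record_ids, requests):
    """Hand out consecutive, non-overlapping slices of record_ids.

    requests is a list of (reviewer, quantity) pairs, served in order.
    A reviewer late in the queue may get fewer records than requested.
    """
    allocation, start = {}, 0
    for reviewer, quantity in requests:
        allocation[reviewer] = record_ids[start:start + quantity]
        start += quantity
    return allocation
```

Because each slice starts where the previous one ended, no record is given to two reviewers.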
12:23 <+walkerma> Rifleman, could you set up a page on wiki for this?
12:23 <+ChemSpiderMan> agreed...
12:23 <+Rifleman_82> don't think we need to worry about this too much
12:23 <+Rifleman_82> ok
12:23 <+walkerma> OK, should we meet again next week at the same time?
12:23 <+Rifleman_82> sure
12:23 <+Rifleman_82> martin
12:24 <+Rifleman_82> perhaps you or dmacks can briefly summarize the discussion of this and the last meeting
12:24 <+ChemSpiderMan> I can't...sorry. Have a meeting next week about adding 250,000 Open access chemistry articles to ChemSpider
12:24 <+Rifleman_82> good luck tony!
12:24 <+ChemSpiderMan> :-)
12:24 <+walkerma> Maybe (if we can get PC to stay) we could discuss InChIs and InChIKeys
12:24 <+Rifleman_82> and post it at http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Chemistry/IRC_discussions
12:24 <+ChemSpiderMan> we are indexing International Union of Crystallography back to 1948...fun
12:25 <+Rifleman_82> cifs?
12:25 <+ChemSpiderMan> abstracts and chemical names...and try to convert to structures..
12:25 <+ChemSpiderMan> PC?
12:25 <+walkerma> Yes, and I'll post about next week on the projects as well.
12:25 <+walkerma> PC = Physchim62
12:25 * dmacks will post log ASAP when we're done here.
12:25 <+Rifleman_82> ok
12:25 <+walkerma> He wrote the InChI script
12:26 <+ChemSpiderMan> I am interested too...
12:26 <+walkerma> I think many of the issues are to do with how we handle these with the wiki markup and formatting
12:26 <+dmacks> yeah, long InChI keys, etc.
12:27 <+walkerma> So it's probably of less interest to you, CSM
12:27 <+ChemSpiderMan> okay...I'm not needed...one comment...do not BREAK the InChI...no spaces
12:27 <+walkerma> YES!
12:27 <+ChemSpiderMan> also, for InChIKeys...there is a powerful way to use them...let me get the link
12:27 <+ChemSpiderMan> http://www.chemspider.com/news/searching-inchikeys-by-connectivities-only-with-and-without-stereo.html
12:28 <+dmacks> WP really needs a way to allow long text strings to be line-breaked in the middle (i.e., not just whitespace)
12:28 <+ChemSpiderMan> search by connectivity and search with stereo
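The two search modes rely on the InChIKey layout: the first hyphen-separated block is a hash of the connectivity, and the later part covers stereochemistry and other layers. A minimal sketch with hypothetical function names; the keys in the usage note are placeholders, not real InChIKeys.

```python
def same_connectivity(key_a, key_b):
    """True if two InChIKeys share their first block (the connectivity hash)."""
    return key_a.split("-")[0] == key_b.split("-")[0]

def same_with_stereo(key_a, key_b):
    """True only if the complete keys match, stereo block included."""
    return key_a == key_b
```

With placeholder keys `AAAAAAAAAAAAAA-BBBBBBBBBB` and `AAAAAAAAAAAAAA-CCCCCCCCCC`, the first function returns `True` and the second `False`, which is the behaviour wanted for stereoisomers sharing one article, such as the tartaric acids discussed earlier.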
12:28 <+ChemSpiderMan> You WILL need standards for the acceptance of InChIStrings...they should NOT be generated by the depositor in my opinion
12:28 <+walkerma> dmacks - that's what we need to resolve next time
12:28 <+dmacks> okay
12:28 <+ChemSpiderMan> If you give InChI generation choices you will be in trouble
12:29 <+ChemSpiderMan> Bye
12:29 <+walkerma> OK, I must get on as well
12:29 <+dmacks> Shall we close for today?
12:29 <+walkerma> See you next week?
12:29 <+dmacks> yup
12:29 -!- ChemSpiderMan [n=ChemSpid@c-68-33-151-242.hsd1.md.comcast.net] has quit []
12:30 <+Rifleman_82> ok
12:30 <+Rifleman_82> http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Chemistry/CAS_validation
12:30 <+Rifleman_82> sorry if it isn't polished, it's late and i'm not fully functioning
12:30 <+walkerma> Thanks, we can polish it later
12:30 <+Rifleman_82> whoever wants to tweak it, please don't wait for me
12:30 <@Beetstra> I am afraid the only reasonable way at the moment is to use a 'InChI' (the correct one) and a DispInChI, the one that is on display, nicely broken where needed
12:30 -!- walkerma [n=chatzill@admin-151-108.potsdam.edu] has quit ["ChatZilla 0.9.80 [Firefox 2.0.0.11/2007112718]"]
12:31 <@Beetstra> In that way the right and correct InChI is in the box ..
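Beetstra's pairing of a correct InChI with a broken-up display copy can be sketched as two small helpers. The names and the 40-character width are assumptions; the only fact relied on is that a valid InChI contains no whitespace, so removing whitespace from the display copy recovers the original.

```python
def display_inchi(inchi, width=40):
    """Break an InChI into width-sized pieces separated by spaces, for display only."""
    return " ".join(inchi[i:i + width] for i in range(0, len(inchi), width))

def canonical_inchi(shown):
    """Rebuild the unbroken InChI from a display copy by dropping all whitespace."""
    return "".join(shown.split())
```

Storing the unbroken string and deriving the display copy from it, never the reverse, keeps the box in line with ChemSpiderMan's "no spaces" request.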
12:31 <+Rifleman_82> sounds fair enough
12:31 <+Rifleman_82> guys, nice talking
12:31 <+Rifleman_82> i gotta sleep
12:31 <+Rifleman_82> good night!
12:31 -!- Rifleman_82 [n=blahblah@wikipedia/Rifleman-82] has quit []
--- Log closed Tue Jan 22 12:35:03 EST 2008