User talk:Andrew Su/Archive1

From Wikipedia, the free encyclopedia

Old Stuff

Hi Andrew, I have been playing with your gene ontology content in the template. Hope you don't mind. I have created a master list of the gene ontology functions rather than having a separate page for each element. This should make things simpler. I added a hyperlink as a link to each function but we could probably find something more elegant. Hope you like the changes. David D. (Talk) 18:58, 30 March 2007 (UTC)

Brilliant, thanks David. I noticed you were playing around with the templates, and I've been trying to "follow along" (frequent reloads). I saw a method to your madness (and I knew the way I implemented the GO index was ugly), and agreed, I like your new structure better. Thanks for the contribution! I like the way ITK (gene) is coming together! AndrewGNF 19:16, 30 March 2007 (UTC)

Need to take a break here. I can't figure out how to stop the link being preceded by a double line break for the gene ontology info. I'll have another crack at it in a while. Glad you like it. David D. (Talk) 19:52, 30 March 2007 (UTC)

Yeah, weird. Nothing obvious to me why those paragraph breaks appear like that, and it looks like from your edit comments you've tried everything that I'd try. Hmm, rather than beat our heads against this too much, what do you think about going back to the form where the function name is hyperlinked (rather than having that plain text and hyperlinking a superscripted "link" tag)? Maybe we can even convert it to a bulleted list... Thoughts? AndrewGNF 20:30, 30 March 2007 (UTC)
Actually the first thing I tried was to hyperlink the whole name and that seemed to be incompatible. Instead of the GO blue link it gave the whole long http address too. I think we need to play around with the protein taxobox template. I'll set up a temporary one in CZ, or here, so we can play around without disrupting the other protein pages. I like the bullet idea too. David D. (Talk) 20:56, 30 March 2007 (UTC)
Got it, just started playing around with it. Must have imagined it that it was working at some point. Anyway, I just started playing around with the GNF_GO template. But I'm off to a meeting now for a while so feel free to revert back to your previous working stage. Otherwise, I'll play more in an hour or so... AndrewGNF 20:59, 30 March 2007 (UTC)
Oh, how do you do temporary templates? It seems like there should be the equivalent of "preview" for templates... AndrewGNF 21:00, 30 March 2007 (UTC)
Finally found the problem. When I annotated the master list I included two spaces before the <noinclude>, and these were then introduced into the taxo box. Once I fixed the master list everything worked as expected. Good thing I took a break rather than banging my head against the wall. Amazing how taking a step back allows you to see the real problem.
As far as templates are concerned, you can preview them, but it does not always help you know what it will look like on the page itself. Especially when you are dealing with nested templates. David D. (Talk) 21:13, 30 March 2007 (UTC)
Brilliant! The page looks great now. Thanks! AndrewGNF 21:42, 30 March 2007 (UTC)

thumbnail

File:Itk GNF1Hthumb.png

I made this thumbnail. See how I have incorporated this into the protein info box and refer to the talk page there for more rationale. David D. (Talk) 21:08, 30 April 2007 (UTC)

Hi! I wanted to remind you about your active BRFA, Wikipedia:Bots/Requests for approval/ProteinBoxBot, that will probably end up expired in a week or so if you don't give us an update. If you no longer need the bot, you can put {{BotWithdrawn}} there and we will close it for you. Thanks! ST47Talk 00:48, 11 May 2007 (UTC)

Not sure if Andrew is around but he mentioned on the MCB wikiproject page that he was in the process of doing a ten article test run. See this post from Andrew, it should serve as an update. While Andrew is doing the grunt work the MCB wikiproject is fully behind this experimental bot, if that helps. David D. (Talk) 03:01, 11 May 2007 (UTC)
Hi ST47. Thanks for the note, I didn't realize that bot requests expire. (makes sense though...) In retrospect, I proposed the bot a little bit far in advance of when we'd actually be creating pages. Anyway, how do I go about delaying that expiration, or should I just repropose it when we're going to start creating pages (within the next couple weeks). Thanks, AndrewGNF 13:29, 11 May 2007 (UTC)


Your bot request

Hi AndrewGNF I wanted to let you know that Wikipedia:Bots/Requests for approval/ProteinBoxBot is labeled as needing your comment. Please visit the above link to reply to the requests. Thanks! --ST47Talk 21:55, 16 May 2007 (UTC)

Hi, I read your note on my talk page and wrote up my concerns here. I imagine when that's addressed, you'll have your answer to the questions specific to this article. Thanks! -- But|seriously|folks  17:04, 13 August 2007 (UTC)

Anything needed?

Hi there. Is there anything you need doing for the bot approval or implementation? I'll be happy to help in any way. Tim Vickers 16:49, 15 August 2007 (UTC)

Tim, you've been a great help already. I think you and others at the BAG have come up with really great enhancements to the hard work that Jon's put in. Me, I'm just a spectator now... ;) Actually, pretty soon we will need the help of PBB "curators" who can help merge enhanced protein boxes with existing articles. (Right now the emphasis has been on new articles.) Check out my edits of Apolipoprotein_E and Amyloid_precursor_protein for examples... Cheers, AndrewGNF 17:01, 15 August 2007 (UTC)

Hi again, I wonder if you could add your database and websites to the new Related projects and resources page? I'm trying to forge more links with the biological community. Tim Vickers 16:55, 16 August 2007 (UTC)

Tim, sure, happy to... But do you mean our WP / Gene page project (in which I would link to User:ProteinBoxBot under daughter projects even though it isn't an official daughter project), or do you mean SymAtlas, our free and publicly-available gene portal for gene expression data and annotation? The latter is not a wiki, so perhaps that should go under another section for "External resources"? Added a link to OpenWetWare, which is another biological wiki I'm aware of... AndrewGNF 17:28, 16 August 2007 (UTC)

Looks good. Yes, links to ProteinBox bot and Symatlas please, with a short explanation of what SymAtlas offers and how it relates to MCB. I'm trying to integrate MCB more with both the wider web organizations and the academic community. Tim Vickers 18:00, 16 August 2007 (UTC)

EMBOencounters

Hi Andrew, just a note to let you know I mentioned this project to EMBO and their magazine EMBOencounters is in touch about writing an article on it. Tim Vickers 17:24, 30 August 2007 (UTC)

cool, thanks for the heads up... I guess that means we better hurry it up. Any idea when it'd come out? Cheers, AndrewGNF 17:53, 30 August 2007 (UTC)

The person writing the article said it was going to be in the November issue. Tim Vickers 19:19, 30 August 2007 (UTC)

Androgen receptor: gene atlas

Your recent addition to Androgen receptor looks really cool! Is the "Protein Box Bot" dynamically updated to include the latest releases from the PDB? Boghog2 22:29, 14 September 2007 (UTC)

Glad you like it... the PBB (as we affectionately call it) gets its data from our gene portal at http://symatlas.gnf.org (well, actually PBB talks to Symatlas' successor). We plan to update the data in our gene portal's database quarterly (from NCBI, Ensembl, PDB, etc.), and after that will re-run PBB to refresh the data in the protein boxes. AndrewGNF 22:33, 14 September 2007 (UTC)
Could you enhance the reelin page similarly? I'd be grateful. Or could I do it using some tool? CopperKettle 13:52, 6 October 2007 (UTC)
Sure thing, we'll get that to the top of the list. Look for it hopefully next week (we're in a bit of hiatus while we do some code improvements). Also, I created a "requests" section on the ProteinBoxBot user pages and took the liberty of adding your request there as the first entry... AndrewGNF 17:54, 6 October 2007 (UTC)
Thank you! Looks great. CopperKettle 16:37, 31 October 2007 (UTC)

SymAtlas

You might wish to create a WP article SymAtlas. It seems there are enough references to it (see WP:Notability). Biophys 23:36, 1 November 2007 (UTC)

Yeah, I suppose I could. Somehow though it feels a little bit slimy to me (though not quite as bad as the folks that start articles on themselves... ;) ) Anyway, I'll put it on my list of things to do... AndrewGNF 00:18, 2 November 2007 (UTC)
You know this subject much better than others. I can take a look then and correct anything if needed. But make sure that the subject does satisfy WP:Notability - you will need some external publications that refer to SymAtlas. Biophys 04:44, 2 November 2007 (UTC)

Usher syndrome genes

Hi Andrew,

It's great to see the ProteinBoxBot up and running! I had no idea that it was so sophisticated. If we do decide to change the categories on all those enzymes, I may have to ask you about how to write a bot to update them, unless the PBB could do it. :)

I'm writing for another reason. I've recently taken an interest in Usher syndrome, sparked by an article in Marie Claire but also inspired by some amazing people I've found at YouTube. Anyway, it would be great if the human genes on that page had stubs instead of redlinks; does that fall within the PBB's scope? I'd be really grateful if you could do those; I think there are roughly 12 or so, some of which already have articles. Thanks muchly! :) Willow 11:47, 5 November 2007 (UTC)

Hey, I just read a few messages above that you already have a request list for the PBB. I'll add the Usher genes, so that you don't have to go looking for them. Thanks! :) Willow 11:52, 5 November 2007 (UTC)
Hi Willow. We're in the process of resolving a couple of technical issues right now, but as soon as we're back up and running, we'll process your genes. Cheers, AndrewGNF 14:29, 5 November 2007 (UTC)

bot

Hi Andrew, the bot is looking good. I just checked in to see how it was going. I left a few comments on the bot's talk page. David D. (Talk) 04:14, 6 November 2007 (UTC)

E=mc² Barnstar

The E=mc² Barnstar
I don't know how you haven't gotten this already. In honor of your hard work putting together ProteinBoxBot, I hereby award you this well-deserved barnstar. jonny-mt(t)(c) 06:55, 7 November 2007 (UTC)
Thanks! Is it possible to share a barnstar? While I get to be the mouthpiece of PBB, our trusted masters student JonSDSUGrad has done all the heavy lifting in the background. He deserves (at least) half the credit! Cheers, AndrewGNF 17:30, 7 November 2007 (UTC)

Overlinking

Despite the edit summary, I actually kept two links in Tuberous sclerosis: one in the lead and one in the Genetics section. I've since added one more to the Timeline of tuberous sclerosis: one in the lead and one where TSC2 is first cloned. The advice in WP:OVERLINK seems reasonable: one link is usually enough but sometimes a link later may be appropriate. One link per section, on a word like TSC2 that would appear all over the article, is probably too much. Be aware that the MOS is subject to random edits like any other part of WP, and it is a constant battle to keep it self-consistent. There seem to be too many MOS pages with something to say on links.

It is a judgement call. Some people find the multicoloured text a distraction. I suppose, reading through each article, you have to guess how often and where you should remind/inform the reader that there is a linked article. There may be a case for a further link in the History section of Tuberous sclerosis, what do you think?

Your TSC2 doesn't mention its protein: tuberin, though it mentions the one for TSC1: hamartin. Do you plan to create a TSC1? It is currently a redirect to something obscure but I think the gene TSC1 should replace it. Colin°Talk 23:01, 8 November 2007 (UTC)

I agree, so I added the one more link in the History section (and also wikilinks from the infobox, which we've been doing when a gene family page exists as well as individual gene pages). Yeah, odd that the TSC2 page doesn't mention its more common name (which I now see through the OMIM link). If you haven't seen our ProteinBoxBot effort, we're synthesizing information from many common sources to populate the infobox and stub text, and our sources aren't always complete. In any case, I've added a brief mention of "Tuberin" to TSC2 and also created the redirect from Tuberin. We absolutely plan on creating a page for TSC1 (at which time we'll replace the redirect with a disambig). In fact, it looks like TSC1 is all ready to go, just waiting for a volunteer to claim it and put it in its correct home... AndrewGNF 23:32, 8 November 2007 (UTC)
A disambig is only needed if the options are evenly weighted. To be honest, I never understood the relationship between the TSC1 redirect and what it points at—it seems tenuous at best. Perhaps this is an acronym used by a tiny number of people. A Google(Scholar) should help settle this. There are no links to TSC1 that use the current meaning, so my vote would be to simply eliminate the connection. If you feel that a number of people would search for the other thing with "TSC1" then we can put a dab link at the top of the TSC1 article. Colin°Talk 07:45, 9 November 2007 (UTC)
Great point. For existing redirects that point to non-biology topics, I've been making disambig pages. But this one points to another gene (for unknown or shaky reasons), and I agree that in this case, the redirect should just be replaced... Thanks... AndrewGNF 16:57, 9 November 2007 (UTC)

Your bot's images

Would it be possible if the bot uploaded .png or .svg images? The .jpeg artifacts are pretty ugly. Thanks! Ρх₥α 21:34, 9 November 2007 (UTC)

I assume that you mean the protein structure images, e.g., Image:PBB_Protein_MCL1_image.jpg? If so, then sorry, we download those from a public domain source and that's how we get them. As an aside, I'm no image expert but I haven't noticed any particularly distracting jpg artifacts. (The ones I'm familiar with are splotchy patterns in uniform blocks of color.) Care to explain further? Cheers, AndrewGNF 22:00, 9 November 2007 (UTC)
Ah, okay. I thought you were manually producing the images. Thanks anyways. Ρх₥α 22:02, 9 November 2007 (UTC)

ProteinBoxBot

I strongly support your bot. But you should know that if someone wants to challenge your work, he might nominate any bot-created article without abstract (e.g. CSNK2A2) for deletion claiming that it "does not belong to WP" (see WP:Deletion policy) and then refer to this: [1]. So, the first thing would be to go through all such articles and create short abstracts using UniProt, for example. Then your work will be completely safe. I will try to do some of that, but my time is very limited. Thank you for creating a great bot! Biophys 15:44, 13 November 2007 (UTC)

Thanks for the comments and support Biophys. Yes, the Uniprot summary would definitely be a good thing to have. For the time being though, I think we're going to have to deal with that on a one-off basis. We, too, are very limited on time (in particular our masters student), which is why I'm pushing so hard to get version 1.0 done. Unfortunately, Uniprot is not as tightly integrated into our local database as Entrez Gene. It may not have been clear before that PBB gets all of its content from SymAtlas (well, actually its successor), which involves a huge amount of work resolving all of these database cross-references. Querying SymAtlas allows PBB to retrieve all gene annotation data in a single XML file. The advantage is that PBB just has to go to one source. The disadvantage (evident here) is that adding new annotation data is more work than simply adding a few lines of code to PBB itself. But again, we'll fix that on round 2! AndrewGNF 16:45, 13 November 2007 (UTC)
From a WP perspective, the most important databases are those providing abstracts/annotation for individual proteins, such as Entrez Gene and Uniprot. I will comment more in Box Ideas. Biophys 15:32, 14 November 2007 (UTC)

MLL1 or HRX?

Hello! I've made a stub on MLL1 and then discovered that it is already here, created by you under the HRX name. I wonder what is the "most official" name for it and what article should be made a redirect.. CopperKettle 09:51, 16 November 2007 (UTC)

ProteinBoxBot reruns

I'll probably start working again on the project this weekend. I noticed you've been doing reruns based on all of the previous discussion. I've already claimed User:ProteinBoxBot/PBB_Log_Wiki_11-4-2007_B-0 but I haven't started it. I can claim one of the rerun files instead, or go ahead and finish this file. Whatever is easier for you. Forluvoft (talk) 23:54, 16 November 2007 (UTC)

If you want to "unclaim" it and move on to one of the newer files, that will probably be more compliant with the recent discussions. We can then ask Jon to re-run that log file so all the updated wikicode is in the log. Thanks again for all your efforts. Your pages look great! AndrewGNF (talk) 03:11, 17 November 2007 (UTC)
Done. And thanks to you guys for making all of this possible! Forluvoft (talk) 03:49, 17 November 2007 (UTC)


Beta-ADR Kinase

The protein info box looks great, but maybe you can compress it to the side so that it does not leave the huge blank white space in the middle of the article. I would do it myself, but I am not sure how to play with your table. Thanks! 198.137.30.179 (talk)

Not sure which page you're referring to. Can you provide a wikilink? Also, the best way to get rid of that whitespace is to add content... ;) AndrewGNF (talk) 19:16, 9 December 2007 (UTC)

Speaking of adding content...

What is the point of creating stubs for genes if nothing is said about what they do? For example, GPR155, GPR156, GPR157, GPR158 all read exactly the same. Aren't you just mirroring some database(s)? Adding 10,000 stubs will increase the size of Wikipedia by 0.47%. AnteaterZot (talk) 08:59, 12 December 2007 (UTC)

Those GPCR genes were created by request -- see User_talk:ProteinBoxBot. Presumably now that those stubs are created, the interested user(s) will add additional useful content. But, I'd argue that the stubs even as they are now are useful and notable (also the consensus of the BAG, MCB, etc.), even if slightly less full than some of the other gene pages. AndrewGNF (talk) 17:34, 12 December 2007 (UTC)
So you are not planning on creating 10,000 such stubs? AnteaterZot (talk) 23:32, 12 December 2007 (UTC)
Yes, we eventually will work up to ~10k genes, as described on the bot approval page. Not sure what you're getting at here... AndrewGNF (talk) 00:03, 13 December 2007 (UTC)
What I'm getting at is that most of the genes in the world are not notable. For example, fruitless is notable. AnteaterZot (talk) 00:08, 13 December 2007 (UTC)
Well, the notability issue has been discussed extensively on the bot approval page, on the MCB/Proposals page, PBB talk page, and at the village pump. (Sorry, if you can't find any of those pages, I'm happy to wikilink. Just feeling lazy...) Each time, the consensus of users has been to move ahead. If you still want to raise notability issues (hopefully with arguments that haven't been previously raised), I'd suggest doing it at the bot talk page. AndrewGNF (talk) 00:13, 13 December 2007 (UTC)
Well, okay. And I think it would be very kind of you to provide maybe one link that would lead me to the others. AnteaterZot (talk) 00:17, 13 December 2007 (UTC)
Done, added two links... AndrewGNF (talk) 00:20, 13 December 2007 (UTC)
I've looked it over, and I must commend you on your efforts to digest material from the various databases into a more accessible format than Entrez. But a couple of things still worry me. One is the assertion that a gene is inherently notable; "Notability of the genes themselves, I think, is a given. These are human genes, the stuff of life!" This is simply not true. Most genes, if knocked out, have little or no effect on phenotype. You address this by requiring the gene be mentioned in more than a couple papers, which is a good start. Two is the heavy reliance on primary sources, and I mean this in the scientific literature sense. Wikipedia requires secondary and/or tertiary sources to establish notability. I take this to mean that a gene should have a couple of mentions in review articles, and/or a mention in the popular press. Take for example, BRCA1. It has 174 mentions in the New York Times. You might say that example is a bit unfair, so how about C5a receptor? It appears in the title of a couple of review journals, and here in a story about a pricy biotech startup. So it might be okay. Now let's take one your bot created, GPR32. It has 208 unique g-hits, none of which amount to anything. I found only one citation on webofknowledge, the (Marchese et al. 1998) one, which is a short communication. They don't really seem to know what the gene does. The gene does not appear to have been in the title or abstract of any review articles. Therefore the gene appears to be not notable. Do you disagree? AnteaterZot (talk) 10:56, 13 December 2007 (UTC)
Having said that, is there any way you can tune your bot to not create stubs on genes like GPR32 while keeping notable ones? Perhaps it can require the word "review" in two sources? AnteaterZot (talk) 10:56, 13 December 2007 (UTC)
I second Andrew's suggestion to move this discussion to the PBB talk page. Since I was the one that requested these GPR pages, I feel that I have an obligation to respond, but on the PBB talk page, not here. Cheers Boghog2 (talk) 17:20, 13 December 2007 (UTC)

Policy discussion

Hi there, I'm afraid this pulled at a loose thread in a contentious definition. Things are being sorted out at Wikipedia_talk:No_original_research#Journal_articles, but this might take some time. I don't think this is anything we need worry about though, most people seem very supportive of your efforts. Tim Vickers (talk) 22:36, 14 December 2007 (UTC)

I'll happily let you sort out the bigger WP issues.  ;) Thanks... AndrewGNF (talk) 22:48, 14 December 2007 (UTC)

The insulin gene

Good day. Do you happen to know if there is any existing article on the Insulin gene? I just started a stub on it, but there sure is enormous information about it out there. Mikael Häggström (talk) 15:28, 13 January 2008 (UTC)

Hi Mikael... No, I was not aware of an article on the Insulin gene which is separate from the main Insulin article. As you've apparently found, PBB did process that gene and normally I would have tried to integrate it into Insulin. I didn't because that was one of the few cases where the existing infobox had substantial content that the PBB infobox did not have. So, I left it at Talk:Insulin#User:ProteinBoxBot_content in case someone wanted to do the legwork to merge and eliminate duplicate content. If you think this is a case where a separate page for the gene and the product is warranted, I'd certainly support you on that... Cheers, AndrewGNF (talk) 03:24, 14 January 2008 (UTC)
Oh, and bravo on all your recent hard work on various gene pages. I think you and User:Boghog2 are the two who perpetually show up on my watchlist. Makes me feel like a slacker sometimes... AndrewGNF (talk) 03:32, 14 January 2008 (UTC)