Sequence profiling tool

The current article attempts to concisely present the concept of sequence profiling tools in bioinformatics and their increasing relevance in supporting the pyramid of sequence data in genetics/molecular biology. Surprisingly, no single source exists to describe and review such web-based tools; the information contained herein is valuable in providing an overview of these tools and their design. The article is not a compilation of the numerous bio-software packages that specialize in providing focused information, nor of the public portals providing links to valuable databases.

I conceived the article and wrote a stub quite some time back and have been helped occasionally in formatting and fixing. So, yes! It’s a self-nomination - a vanity attempt... Though small enough to start with, the definition I provided happened to be the only “web definition” in the Google results page whenever one typed ‘What is sequence profiling tool’, making me realize that more intelligent minds were focusing their efforts on defining more important issues. It was then that I thought of making the contribution more comprehensive by clearly outlining the concepts and classifying the different kinds of sequence profiling tools, with an example of each. This article is one such attempt.

I am not a wiki expert by any standards in terms of creative formatting, so I might need help in editing to begin with. Meanwhile, I have reasons to believe that the piece I compiled qualifies to be considered as a Wikipedia ‘featured article’. I hope the votes confirm this.

Nattu 03:50, 14 June 2006 (UTC)


I cleaned up some of the wiki syntax and started to retool some of the writing, but I think this topic is just inherently too small/of too limited interest to get through FAC even if it is extensively rewritten. Too much expansion would just make a meta-meta-portal. You might have better luck going through FAC with an expanded version of something like sequence alignment or microarray (or even restoring bioinformatics) if you want a bioinformatics-related featured article. Drop me a note on my talk page if you want to start expanding something like that (which I may do anyway, depending on how motivated I get :). Opabinia regalis 04:53, 15 June 2006 (UTC)

Scientific peer review request

Nattu, you probably noticed this before, but so any other visitors know - Wikipedia:Scientific peer review itself is a (mostly) inactive project. This article has been submitted for peer review but the reviewers are not necessarily subject matter experts; they tend to make comments on style, grammar, and other writing features as well as content. Opabinia regalis 05:49, 19 June 2006 (UTC)


This article is well-written. However, it appears to refer to a neologism, and seems mainly to be original research. The term "sequence profiling tool" does not appear in a cursory check through PubMed abstracts, the external links on this page, or Google, with the exception of articles written or co-written by one of the editors who created this page. Is there other support for the notability of this term? Grouse 16:12, 1 November 2006 (UTC)

===Nattu=== Thank you; I understand and appreciate your concern. The concept of sequence profiling is well known, and sites like NCBI's BLAST and Ensembl offer very exhaustive services. It is also very evident in sites like Entrez and Bioinformatic Harvester, which compile, process and present metadata in bioinformatics. So while such well-known services have existed and the concept has been known implicitly, there has been very little documentation to elaborate on this. I tried to help compile a formal write-up that would help readers grasp the concept as well as understand the future directions of the field. In fact, the field is gradually evolving towards new sites/services that accommodate new kinds of data profiling for research scientists. So, while the term and article may look original, the content is well established in the literature and known to scientists.

I would be very happy to see greater participation from students, researchers and editors alike in contributing to and shaping up the piece.

Nattu 17:22, 2 November 2006 (UTC)

I sympathise with Grouse's comments; even though the content of the article is interesting, it is grouped under a term which does not seem to appear in the bioinformatics literature; if this is correct, it is not Wikipedia's role to create or promote such a word. Also, the term, wherever it comes from, seems to have a broad meaning. Given that "profile" is a word already used with many different meanings in the bioinformatics trade, using it to describe a simple "meta-search" engine as described in "Sequence data based profilers" seems a bit exaggerated. Finally, I don't really see how the microarray stuff fits into the story; even though the GSEA is an interesting method, I would hardly associate it with the words "sequence profiling".
Maybe this content could be integrated in one or several more general articles about bioinformatic tools? Schutz 23:22, 2 November 2006 (UTC)
Yes. You have a point in that the term does not seem to appear in the bioinformatics literature, but it is understood implicitly. Also, as you say, the term "profile" is used very loosely to mean different things. So I decided to make a start by defining one 'profiling' concept: that of sequence profiling tools. Some liberties were taken but the core concept holds. Yeah, the microarray stuff seems a bit far afield, but I feel it will only be a matter of time before mass spectrometric and microarray data are also integrated alongside sequence databases. Meta searches in bioinformatics are thus rapidly evolving in their complexity to integrate information from varied sources. All of these need to be described and presented under a common banner.
Merging the article with any major topic is a good idea, but given its length and content it will be too long to be included under any one particular area. I have a list of possible such topics. Alternatively, splitting the content will tend to dilute the core concept. So, it's a kind of catch-22 situation, but I am glad we have made a beginning. More discussions are welcome.
Nattu 02:58, 3 November 2006 (UTC)
You say: "So I decided to make a start by defining one 'profiling' concept". This is exactly what we should not do on Wikipedia. Scientists define concepts in research or review articles; we do not define anything. Even if there is a need for new vocabulary, we just describe what has already been defined elsewhere. "I feel it will only be a matter of time before ..." is also an indication of original research.
As for the integration of MS and microarray data, we're probably still a long way from integration... (and as a side note, the MS vocabulary also includes the word "profile", just to add to the confusion). Schutz 06:44, 3 November 2006 (UTC)

Move or merge

In the recent AfD, numerous editors suggested a move to either Bioinformatics software tools or Sequence analysis. What do you think? Strictly speaking, I do not think that the "keyword based profiler" section really fits into what has traditionally been known as sequence analysis. Grouse

Yes, it's not a bad idea to move or merge. The first suggestion of Bioinformatics software tools is a bit tricky, since it could tend to become a directory of known services. The second suggestion of Sequence analysis seems to be fine but slightly misses the point of query profiling/outlining. The core of the article should outline different types of data outlining tools that are capable of analyzing, organizing and presenting the results for a query. This brings me to the keyword based profilers... These are unlike typical search engines. A casual visit to Entrez will demonstrate the point. For one query, the results page returns the hits across all the NCBI databases. So in a single shot, a term like "muscular dystrophy" would be profiled across the NCBI databases. There are other services like this, e.g. Bioinformatic Harvester and HPRD. The term 'data profiling/outlining' is key when planning to move/merge.
So my suggestion would be to merge/move the piece with Bioinformatics data profilers OR Bioinformatics data outlining tools. You may have to be the guide lest I neologize. Opabinia regalis could also be a help in this. What do you say? Nattu 18:01, 8 November 2006 (UTC)

