Biositemap

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 128.97.129.123 (talk) at 23:14, 11 June 2008 (See also). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

The NCBC Biositemap

The Biositemaps[1] Protocol allows scientists, engineers, centers and institutions engaged in modeling, software tool development and analysis of biomedical and informatics data to announce their latest computational biology resources (data, software tools and web-services). The biositemap concept is based on ideas from Efficient, Automated Web Resource Harvesting[2] and Crawler-friendly Web Servers,[3] and it integrates the features of Sitemaps and RSS feeds into a decentralized mechanism for announcing updates to existing biomedical data and computing resources and the introduction of new ones. These site-, institution- or investigator-specific biositemap descriptions are posted online in XML format and are searched, parsed, monitored and interpreted by web search engines, human and machine interfaces, custom-designed web crawlers and other outlets interested in discovering updated or novel resources for bioinformatics and biomedical research. The biositemap mechanism separates the providers of biomedical resources (investigators or institutions) from the consumers of resource content (researchers, clinicians, news media, funding agencies, educational and research initiatives).

A Biositemap is an XML file that lists the biomedical and bioinformatics resources of a specific research group or consortium. It allows developers of biomedical resources to describe the functionality and usability of each of their software tools, databases or web-services.
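
As a minimal sketch of how a consumer might read such a file, the following Python example parses a small XML document and lists the resources it describes. The element names (`biositemap`, `resource`, `name`, `description`, `type`, `url`) and the example.org addresses are assumptions made for illustration; they are not taken from the Biositemaps specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical Biositemap document. The element names are illustrative
# assumptions, not the schema defined by the Biositemaps protocol.
BIOSITEMAP_XML = """\
<biositemap>
  <resource>
    <name>Example Atlas Viewer</name>
    <description>Displays brain atlas volumes.</description>
    <type>Software</type>
    <url>http://example.org/atlas-viewer</url>
  </resource>
  <resource>
    <name>Example Imaging Archive</name>
    <description>Raw imaging data sets.</description>
    <type>Data</type>
    <url>http://example.org/archive</url>
  </resource>
</biositemap>
"""


def list_resources(xml_text):
    """Return one dict per <resource>, mapping each child tag to its text."""
    root = ET.fromstring(xml_text)
    return [
        {child.tag: (child.text or "").strip() for child in resource}
        for resource in root.findall("resource")
    ]


resources = list_resources(BIOSITEMAP_XML)
for entry in resources:
    print(entry["name"], "-", entry["type"], "-", entry["url"])
```

A crawler or portal could run the same kind of parsing step on every biositemap it discovers and merge the resulting records into its own index.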

Biositemaps are particularly beneficial in situations:

  • when providers and consumers of bioinformatics and biomedical computing resources need to communicate in a scalable, efficient, agile and decentralized fashion. In these cases, a human (graphical) or a machine (computer) interface connects the descriptions of resources and facilitates the search, comparison and utilization of the most relevant resources for specific scientific studies. This infrastructure enables effective and timely matching of services and needs among biomedical investigators and the general public.
  • where meta-resources and computational or digital libraries need to update their contents to reflect the current state of newly developed biomedical materials and resources, using technologies such as AJAX, JSON or WSDL.
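
The update scenario in the second case can be sketched as a comparison between two snapshots of the same biositemap. This is only an illustration under stated assumptions: the function `diff_snapshots`, the use of the resource URL as the key, and the stage values are hypothetical and not part of the protocol.

```python
def diff_snapshots(previous, current):
    """Compare two {url: description} snapshots of one biositemap.

    Returns three sorted lists of resource URLs: added, updated, removed.
    Keying resources by URL is an assumption made for this sketch.
    """
    added = sorted(url for url in current if url not in previous)
    removed = sorted(url for url in previous if url not in current)
    updated = sorted(
        url for url in current
        if url in previous and current[url] != previous[url]
    )
    return added, updated, removed


# Hypothetical snapshots taken on two successive crawls.
old = {
    "http://example.org/tool-a": {"name": "Tool A", "stage": "Beta"},
    "http://example.org/data-b": {"name": "Data B", "stage": "Production"},
}
new = {
    "http://example.org/tool-a": {"name": "Tool A", "stage": "Production"},
    "http://example.org/service-c": {"name": "Service C", "stage": "Alpha"},
}

added, updated, removed = diff_snapshots(old, new)
print("added:", added)
print("updated:", updated)
print("removed:", removed)
```

A meta-resource could apply the three lists to its own catalogue after each crawl instead of rebuilding it from scratch.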

Biositemaps supplement, and do not replace, the existing frameworks for dissemination of data, tools and services. By publishing a relevant and up-to-date Biositemap file on the web, investigators and institutions only help search engines' crawlers, machine interfaces and users to acquire, interpret, process and utilize the most accurate information about the state of the resources disseminated by the developing group. Using the biositemap protocol does not guarantee that resources will be included in search indexes, nor does it influence the way that tools are ranked or perceived by the community.

Computational biology resources

There are several types of computational biology resources:[4][5]

Software Resources (Tools)

(Downloadable) Data Resources

  • Raw Data - acquired data going into various software tools (e.g., CCB IDA)
  • Model Data - processed data coming out of software tools (e.g., CCB Atlases)
  • Textual Data - spreadsheets, web-pages (e.g., Imaging Glossary)
  • Data Types
    • XML (e.g., Module Descriptions)
    • JSON Objects - a language-independent data-transfer format for data attributes (e.g., JSON output from the Yahoo! Web Services contains the same data as the corresponding XML object and differs only in format; another example is a GeoNames query for postal codes within 10 miles of US postal code 90095)
    • Binary Data
    • Comma-separated values (.csv)
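
The point that a JSON object can carry the same data as an XML object can be illustrated with a short sketch. The record below and its element names are hypothetical; only the standard-library `xml.etree.ElementTree` and `json` modules are used.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical flat record expressed as XML.
xml_text = "<resource><name>Example Atlas</name><type>Model Data</type></resource>"

# Read the XML children into a dict, then serialize the same data as JSON.
element = ET.fromstring(xml_text)
as_dict = {child.tag: child.text for child in element}
json_text = json.dumps(as_dict, sort_keys=True)
print(json_text)

# Parsing the JSON text recovers exactly the data read from the XML.
round_trip = json.loads(json_text)
```

This direct mapping only works for flat records like the one shown; nested elements or XML attributes would need an explicit convention.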

Services

  • Web-services
  • Collaborative Services
  • Other Services

Required Resource Description Fields

The Biositemap protocol allows many optional fields, but it requires several specific descriptors that are commonly used and necessary for characterizing biomedical resources. These required fields[6] are:

  • Name, txt, single line
  • Description, txt, multiline
  • Type of Resource, typed list
  • URL, txt
  • Stage, typed list
  • Organization
  • Resource Ontology Label
  • Keywords, txt, single line (May be related to NCBC Ontology)
  • License, typed list, with other

Some very useful, but optional, resource descriptors include the type, specification and expectations of the inputs, as well as the characteristics of the outputs, of these resources.
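
A provider could check a resource description against the required fields listed above before publishing it. The following is a minimal sketch: the key names in `REQUIRED_FIELDS` are hypothetical spellings of those fields, and `missing_required_fields` is an illustrative helper, not part of any Biositemaps tool.

```python
# Hypothetical key names for the nine required descriptors.
REQUIRED_FIELDS = (
    "name", "description", "resource_type", "url", "stage",
    "organization", "ontology_label", "keywords", "license",
)


def missing_required_fields(record):
    """Return the required keys that are absent or blank in `record`.

    The result follows the order of REQUIRED_FIELDS.
    """
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Example record with blank keywords and no license entry.
record = {
    "name": "Example Atlas Viewer",
    "description": "Displays brain atlas volumes.",
    "resource_type": "Software",
    "url": "http://example.org/atlas-viewer",
    "stage": "Beta",
    "organization": "Example Lab",
    "ontology_label": "Visualization",
    "keywords": "   ",
}

problems = missing_required_fields(record)
print(problems)
```

Optional descriptors, such as input and output specifications, are simply ignored by this check.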

References

  1. ^ Dinov ID, Rubin D, Lorensen W, Dugan J, Ma J, Murphy S, Kirschner B, Bug W, Sherman M, Floratos A, Kennedy D, Jagadish HV, Schmidt J, Athey B, Califano A, Musen M, Altman R, Kikinis R, Kohane I, Delp S, Parker DS, Toga AW (2008). "iTools: A Framework for Classification, Categorization and Integration of Computational Biology Resources". PLoS ONE 3(5): e2265. doi:10.1371/journal.pone.0002265.
  2. ^ M.L. Nelson, J.A. Smith, del Campo, H. Van de Sompel, X. Liu (2006). "Efficient, Automated Web Resource Harvesting" (PDF). WIDM'06.
  3. ^ O. Brandman, J. Cho, Hector Garcia-Molina, and Narayanan Shivakumar (2000). "Crawler-friendly web servers". Proceedings of ACM SIGMETRICS Performance Evaluation Review, Volume 28, Issue 2.
  4. ^ Cannata N, Merelli E, Altman RB (2005). "Time to Organize the Bioinformatics Resourceome". PLoS Computational Biology Vol. 1, No. 7, e76, 0531-0533.
  5. ^ Dinov ID, Rubin D, Lorensen W, Dugan J, Ma J, Murphy S, Kirschner B, Bug W, Sherman M, Floratos A, Kennedy D, Jagadish HV, Schmidt J, Athey B, Califano A, Musen M, Altman R, Kikinis R, Kohane I, Delp S, Parker DS, Toga AW (2008). "iTools: A Framework for Classification, Categorization and Integration of Computational Biology Resources". PLoS ONE 3(5): e2265. doi:10.1371/journal.pone.0002265.
  6. ^ Chen YB, Chattopadhyay A, Bergen P, Gadd C, Tannery N (2006). "The Online Bioinformatics Resources Collection at the University of Pittsburgh Health Sciences Library System--a one-stop gateway to online bioinformatics databases and software tools". Nucleic Acids Res. 2007 Jan;35(Database issue):D780-5.