OttoQL


OttoQL is a universal query language for tables and documents that was first implemented for XML. It has a very simple syntax. Operations are generally applied sequentially to all corresponding tuples or subtuples. In the following program, an XML document is given as a table:

BMI-example in OttoQL

<<L(NAME,      LENGTH, L(AGE,   WEIGHT))::
     Klaus      1.68      18     61
                          30     65  
                          56     80
     Rolf       1.78      40     72
     Kathi      1.70      18     55 
                          40     70
     Valerie    1.00       3     16
     Viktoria   1.61      13     51
     Bert       1.72      18     66
                          30     70 >>
 mit NAME: AGE>20                           # with: selection
 ext BMI:=(WEIGHT div LENGTH**2)            # ext: introduction of a new column
 gib BMIAVG,M(AGE,BMIAVG,B(BMI,NAME)) &&    # give me; && connects two lines to a logical unit
    BMIAVG:=avg(BMI)
 round 2

Even without tags it is visible that the tuple of Klaus ends with 80 and that Klaus has 3 subtuples. In this structure AGE, for example, is subordinate to NAME. In the gib part this hierarchy is inverted simply by giving the scheme, or more generally the DTD (Document Type Definition), of the desired XML document. Here M (German: Menge) abbreviates set, B bag and L list.

Before that, however, the example applies a selection. Instead of mit (with), ohne (without) can also be used. The selection above discards all tuples that have no AGE entry greater than 20; these are Valerie and Viktoria. The first subtuple of Klaus still remains in the table, because NAME: expresses that only complete tuples are selected, not subtuples. To omit subtuples, NAME has to be replaced by AGE or WEIGHT. The following two conditions select in both lists: mit NAME, AGE: AGE>20 and mit AGE>20.

The ext part extends the table by a new column (extension). Column names from different levels can be used here without introducing variables. The Body Mass Index column is introduced to the right of WEIGHT. Note that the BMI is computed not only for the length 1.68 and the weight 61 but also for 1.68 and the weight in the second row (65). Besides restructuring in non-recursive DTDs, a gib part can also carry out the following tasks:

  • sorting (M, B) by the first fields of the collections (M-, B-: descending)
  • aggregate (simultaneously horizontal and vertical)
  • eliminate duplicates (M, M-)
  • joins and unions
  • projections
  • groupby and nest
  • unnest
  • tagging
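
The selection and extension steps described before the list can be sketched in Python. This is only an illustration of the semantics as the text describes them, not OttoQL code or its implementation; the function names and the tuple-based data model are assumptions, as is the choice to keep a tuple whose inner list becomes empty under a subtuple-only selection.

```python
# Nested table L(NAME, LENGTH, L(AGE, WEIGHT)) modelled as plain Python data.
# Illustration only; this is not OttoQL's internal representation.
TABLE = [
    ("Klaus", 1.68, [(18, 61), (30, 65), (56, 80)]),
    ("Rolf", 1.78, [(40, 72)]),
    ("Kathi", 1.70, [(18, 55), (40, 70)]),
    ("Valerie", 1.00, [(3, 16)]),
    ("Viktoria", 1.61, [(13, 51)]),
    ("Bert", 1.72, [(18, 66), (30, 70)]),
]


def select_tuples(table, min_age):
    """Like `mit NAME: AGE>20`: keep whole tuples that have some AGE above min_age."""
    return [row for row in table if any(age > min_age for age, _ in row[2])]


def select_subtuples(table, min_age):
    """Like `mit AGE: AGE>20`: filter only the inner list.

    Assumption: a tuple whose inner list becomes empty is kept.
    """
    return [(n, l, [(a, w) for a, w in subs if a > min_age]) for n, l, subs in table]


def select_both(table, min_age):
    """Like `mit AGE>20`: filter the inner list, then drop tuples left without subtuples."""
    return select_tuples(select_subtuples(table, min_age), min_age)


def extend_bmi(table):
    """Like `ext BMI:=(WEIGHT div LENGTH**2)`: add a BMI field to every subtuple."""
    return [(n, l, [(a, w, w / l**2) for a, w in subs]) for n, l, subs in table]


if __name__ == "__main__":
    for name, _, subs in select_tuples(TABLE, 20):
        print(name, subs)
```

With `select_tuples`, Klaus keeps his AGE 18 subtuple, which matches the remark that NAME: selects only complete tuples; `select_both` removes it.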

The last operation (round) rounds all numbers that occur in the result of the gib part to 2 digits after the decimal point. Binary operations are written in infix notation in OttoQL. Altogether, the program above realizes the following query: for all persons with at least one AGE entry greater than 20, find the overall average BMI, the average BMI per age, and the BMI of each person at each age. Sort by AGE and, within an AGE group, by BMI. The result as a table:

 <<BMIAVG,M(AGE,   BMIAVG, B(BMI,  NAME))::
   23.12    18     20.98     19.03 Kathi
                             21.61 Klaus
                             22.31 Bert
            30     23.34     23.03 Klaus
                             23.66 Bert
            40     23.47     22.72 Rolf
                             24.22 Kathi
            56     28.34     28.34 Klaus>>
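
For comparison, the whole pipeline (selection, extension, regrouping by AGE, aggregation, rounding) can be sketched in Python. The function name `bmi_report` and the data model are assumptions made for illustration; the sketch mirrors the query as described, not OttoQL's evaluation strategy.

```python
from collections import defaultdict
from statistics import mean

# Same input as the OttoQL program: L(NAME, LENGTH, L(AGE, WEIGHT)).
TABLE = [
    ("Klaus", 1.68, [(18, 61), (30, 65), (56, 80)]),
    ("Rolf", 1.78, [(40, 72)]),
    ("Kathi", 1.70, [(18, 55), (40, 70)]),
    ("Valerie", 1.00, [(3, 16)]),
    ("Viktoria", 1.61, [(13, 51)]),
    ("Bert", 1.72, [(18, 66), (30, 70)]),
]


def bmi_report(table, min_age=20):
    # mit NAME: AGE>20 -- keep whole persons with at least one AGE above min_age
    kept = [row for row in table if any(a > min_age for a, _ in row[2])]
    # ext BMI:=(WEIGHT div LENGTH**2) -- one BMI value per (AGE, WEIGHT) subtuple
    flat = [(age, weight / length**2, name)
            for name, length, subs in kept for age, weight in subs]
    # gib BMIAVG,M(AGE,BMIAVG,B(BMI,NAME)) -- regroup by AGE, sort, aggregate
    by_age = defaultdict(list)
    for age, bmi, name in flat:
        by_age[age].append((bmi, name))
    groups = [
        (age,
         round(mean(b for b, _ in members), 2),                # BMIAVG per AGE
         [(round(b, 2), n) for b, n in sorted(members)])       # bag sorted by BMI
        for age, members in sorted(by_age.items())
    ]
    # round 2 -- applied to every number in the result
    return round(mean(b for _, b, _ in flat), 2), groups


if __name__ == "__main__":
    overall, groups = bmi_report(TABLE)
    print("BMIAVG", overall)
    for age, avg, members in groups:
        print(age, avg, members)
```

The sketch averages the unrounded BMI values and rounds only at the end, so the last digit of a group average may differ from the table above if OttoQL rounds in a different order or with a different rounding rule.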

BMI-example in XQuery

 let $persons:=(<PERSONS>
  <PERSON>
    <NAME>Klaus</NAME>
    <LENGTH>1.68</LENGTH>
    <SUBTUP>
      <AGE>18</AGE>
      <WEIGHT>61</WEIGHT>
    </SUBTUP>
    <SUBTUP>
      <AGE>30</AGE>
      <WEIGHT>65</WEIGHT>
    </SUBTUP>
    <SUBTUP>
      <AGE>56</AGE>
      <WEIGHT>80</WEIGHT>
    </SUBTUP>
  </PERSON>
  <PERSON>
    <NAME>Rolf</NAME>
    <LENGTH>1.78</LENGTH>
    <SUBTUP>
      <AGE>40</AGE>
      <WEIGHT>72</WEIGHT>
    </SUBTUP>
  </PERSON>
  <PERSON>
    <NAME>Kathi</NAME>
    <LENGTH>1.7</LENGTH>
    <SUBTUP>
      <AGE>18</AGE>
      <WEIGHT>55</WEIGHT>
    </SUBTUP>
    <SUBTUP>
      <AGE>40</AGE>
      <WEIGHT>70</WEIGHT>
    </SUBTUP>
  </PERSON>
  <PERSON>
    <NAME>Valerie</NAME>
    <LENGTH>1.00</LENGTH>
    <SUBTUP>
      <AGE>3</AGE>
      <WEIGHT>16</WEIGHT>
    </SUBTUP>
  </PERSON>
  <PERSON>
    <NAME>Viktoria</NAME>
    <LENGTH>1.61</LENGTH>
    <SUBTUP>
      <AGE>13</AGE>
      <WEIGHT>51</WEIGHT>
    </SUBTUP>
  </PERSON>
  <PERSON>
    <NAME>Bert</NAME>
    <LENGTH>1.72</LENGTH>
    <SUBTUP>
      <AGE>18</AGE>
      <WEIGHT>66</WEIGHT>
    </SUBTUP>
    <SUBTUP>
      <AGE>30</AGE>
      <WEIGHT>70</WEIGHT>
    </SUBTUP>
  </PERSON>
 </PERSONS>//PERSON[.//AGE>20])
 let $bmis:=(for $p in $persons
            return <PER> {$p/NAME}
                         { for $t in $p/SUBTUP
                           return <TU> {$t/AGE}
                                       <BMI>{$t/WEIGHT div($p/LENGTH * $p/LENGTH)}</BMI></TU>
                          }
                   </PER>)
 return <results><BMIDUR>{round-half-to-even(avg($bmis//BMI),2)}</BMIDUR>
          { for $a in distinct-values($bmis//AGE)
            order by $a
            return <AG>
                     <AGE> {$a} </AGE> 
                     <BMIDUR>{round-half-to-even(avg($bmis//TU
                                              [AGE = $a]/BMI),2)}</BMIDUR>
                     { for $p2 in $bmis[.//AGE=$a] for $b in $p2//TU[AGE=$a]//BMI
                       order by $b,$p2/NAME
                       return<TU2><BMI>{round-half-to-even($b,2)}</BMI>{$p2/NAME}</TU2>
                     }
                  </AG>
 }
 </results>

In particular, the following tools can be used in OttoQL:

  • operations for matrices
  • operations for tuples
  • recursive extensions
  • user-defined functions
  • nested subqueries

Independence from the data structure

The operations of OttoQL need a DTD, because the system has to be able to recognize what is a collection and what is a tuple. Nevertheless, the important operations of OttoQL are largely independent of the DTD. The BMI example above also works if the given table is flat (L(NAME, LENGTH, AGE, WEIGHT)) or inversely structured (M(WEIGHT, L(NAME, LENGTH, AGE))). This property is important if OttoQL is to be used by search engines.
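
The idea of structure independence can be illustrated with a small Python sketch: the same facts are stored in the nested, flat and inverted layouts named above, and one BMI computation that refers only to column values yields the same result for each. The reader functions and the reduced sample data are assumptions for illustration; OttoQL derives the layout from the DTD instead of requiring one reader per structure.

```python
# The same facts in three layouts (a reduced sample of the BMI table).
NESTED = [("Klaus", 1.68, [(18, 61), (30, 65)]), ("Rolf", 1.78, [(40, 72)])]   # L(NAME, LENGTH, L(AGE, WEIGHT))
FLAT = [("Klaus", 1.68, 18, 61), ("Klaus", 1.68, 30, 65), ("Rolf", 1.78, 40, 72)]  # L(NAME, LENGTH, AGE, WEIGHT)
INVERTED = [                                                                    # M(WEIGHT, L(NAME, LENGTH, AGE))
    (61, [("Klaus", 1.68, 18)]),
    (65, [("Klaus", 1.68, 30)]),
    (72, [("Rolf", 1.78, 40)]),
]


def rows_from_nested(table):
    """Unnest L(NAME, LENGTH, L(AGE, WEIGHT)) into (name, length, age, weight) rows."""
    return [(n, l, a, w) for n, l, subs in table for a, w in subs]


def rows_from_flat(table):
    """A flat table already consists of (name, length, age, weight) rows."""
    return list(table)


def rows_from_inverted(table):
    """Unnest M(WEIGHT, L(NAME, LENGTH, AGE)) into (name, length, age, weight) rows."""
    return [(n, l, a, w) for w, subs in table for n, l, a in subs]


def bmi_by_name_and_age(rows):
    """The query itself: it mentions only column values, never the nesting."""
    return {(n, a): round(w / l**2, 2) for n, l, a, w in rows}


if __name__ == "__main__":
    print(bmi_by_name_and_age(rows_from_nested(NESTED)))
    print(bmi_by_name_and_age(rows_from_flat(FLAT)))
    print(bmi_by_name_and_age(rows_from_inverted(INVERTED)))
```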

Development

The basic ideas behind the most important operations of OttoQL were first presented in [1] and extended in [2] and [3]. These publications do not yet contain a generalization to XML. Andreas Hauptmann, Martin Schnabel and Dmitri Schamschurko made major contributions to the present implementation. The algebraic background of OttoQL can be found in the work of Reichel [4].

References

  1. Klaus Benecke, "Hierarchische Datenstrukturen" (Hierarchical Data Structures) (German), Dissertation B, Technische Universität "Otto von Guericke", Magdeburg, May 1987
  2. Klaus Benecke, "Strukturierte Tabellen - Ein neues Paradigma für Datenbank- und Programmiersprachen" (Structured Tables - A New Paradigm for Databases and Programming Languages) (German), Deutscher Universitätsverlag, Wiesbaden 1998, ISBN 3-8244-2099-6
  3. Klaus Benecke, "A powerful Tool for Object Oriented Manipulation", in Object-Oriented Databases: Analysis, Design & Construction (DS-4), R. A. Meersman, W. Kent, S. Khosla (Editors), North-Holland 1991, pp. 95-122
  4. Horst Reichel, "Initial Computability, Algebraic Specifications, and Partial Algebras", Oxford University Press, Oxford 1987, ISBN 0-19-853806-5
