This article contains content that is written like an advertisement. (July 2014)
OpenFormula is an open standard for exchanging recalculated formulae in spreadsheets. OpenFormula is included in version 1.2 of the OpenDocument standard. OpenFormula was proposed and initially drafted by David A. Wheeler.
Discussion of need
OpenDocument 1.0 is a specification for the exchange of office documents, and is fully capable of describing mathematical formulas that are displayed on the screen (through its reuse of the MathML standard). It is also fully capable of exchanging spreadsheet data, formats, pivot tables, and other information typically included in a spreadsheet. OpenDocument can exchange spreadsheet formulae (formulae that are recalculated in the spreadsheet); formulae are exchanged as values of the attribute table:formula.
However, many believed that the syntax and semantics of table:formula was not defined in sufficient detail. Version 1.0 of the specification defined spreadsheet formulae using a set of simple examples which show, for example, how to specify ranges and the SUM() function. Some critics argued that a more detailed, precise specification for spreadsheet functions, including syntax and semantics, was needed.   The OpenDocument committee argued that this was outside their scope at that time. Others have argued that, while the specification is less specific than one might like, the intent is fairly clear (especially since formulae tend to follow decades-long traditions), and also because the vast majority of spreadsheets only use a small set of functions (such as SUM) which are universally supported by all spreadsheet implementations.
One of the external commentors on OpenDocument, David A. Wheeler, released a first draft of a specification for formulae in February 2005. This began a process of discussion with various spreadsheet implementors and developers.
In October 2005, Wheeler publicly began an informal project, backed by the OpenDocument Fellowship, to create a draft formula specification, based on the initial draft and on discussions since that time with various implementors. By January 2006, the group had developed a lengthy specification, and implementors had begun changing their implementations to meet the draft specification.
OASIS Formula subcommittee
In February 2006, OASIS formally created the formula subcommittee, naming Wheeler as the subcommittee chair. After discussion, the subcommittee agreed to use the OpenFormula project's document as their base document. Thus, by February 2006, OASIS had a draft formula specification with a detailed framework and over 100 functions defined.
In 2005, Microsoft's Brian Jones noted that OpenDocument did not define spreadsheet formulae in detail. However, at the time Microsoft's competing proprietary XML format also did not include this kind of detailed specification for formulae.
Microsoft continued to protest that OpenDocument could not be used because it did not define a format for spreadsheet formulae, while its own specification continued to omit any specification about formulae until April 2006. In May 2006, Microsoft also began defining formulae in its XML format, 15 months after the first version of OpenFormula and three months after OASIS posted its first official draft of its specification.
Microsoft Office 2007 SP2 uses the formula language defined in ISO/IEC IS 29500:2008 for OpenDocument spreadsheets. Microsoft stated that they consider adding support for an official ODF formula language (OpenFormula), once a future version of the OpenDocument (ISO/IEC 26300) standard specification includes one.
Completion of the effort
In June 2007 it was announced that four tasks remained before submission to the quality assurance review. Further delays were incurred, but many implementors implemented the specification while it was being written, modifying their applications where necessary to comply with the draft standard.
Finally, the OpenFormula specification was issued on 29 September 2011 as Part 2 of Version 1.2 of the Open Document Format (http://docs.oasis-open.org/office/v1.2/OpenDocument-v1.2-part2.html).
Key attributes of the OpenFormula specification and development process are:
- Fully open standard The specification meets all widely accepted definitions of being an "open standard", including those by Bruce Perens and the European Union. For example, (1) both open source software and proprietary software can implement it, and (2) the work is based on consensus, not domination by any single supplier.
- Developed by many different implementors. OpenFormula is being developed by representatives from many different implementors, working together, including OpenOffice.org and Sun StarOffice (Eike Rathke), KDE Calligra Suite (formerly KOffice) (David Faure and Tomas Mecir), Gnumeric (Dr. Andreas J. Guelzow and Jody Goldberg), IBM/Lotus 1-2-3 (Rob Weir), and wikiCalc (Dan Bricklin, co-creator of the spreadsheet).
- Developed with experienced users. Many experienced users (such as Tom Metcalf, a scientist specializing in the astrophysics of the Sun) take part. The group includes several mathematicians, both users and developers.
- Focused development. The subcommittee is a large group focused specifically on spreadsheet formulae, and nothing else.
- Not rushed. OpenFormula is based on specification work that was first released on 2005-02-26, as well as a large body of research into different applications.
- Future-proofed format The syntax has been carefully designed to work indefinitely into the future. For example, it allows an arbitrary number of columns, while also allowing arbitrary names of values.
- Embedded test cases. OpenFormula includes a large number of test cases, ones that test and demonstrate the specification including "edge cases" that people often forget. More importantly, they are specially formatted so they can be automatically extracted and placed in a test spreadsheet to test applications.
- Rigorous definitions The test cases (noted previously) help it be far more rigorous. In addition, OpenFormula defines the types for each function (as prototypes of each function). Function definitions are examined deeply, e.g., YEARFRAC() has subtle behavior in the leap years, which were carefully examined and defined.
- Doesn't mandate mistakes. The specification is carefully written to not require certain bugs, just because someone has a bug. For example, Excel incorrectly believes that 1900 was a leap year, and at least draft version 1.3 of the Excel specification claims that compatible applications must make the same mistake, and requires that applications cannot be more capable than Excel by supporting dates before 1900. By comparing many different independent implementations, the OpenFormula group can often detect when an application makes a mistake, and ensure that applications are not overly restricted.
- Innovations from many sources. OpenFormula covers the functions of Excel and OpenOffice.org, plus important functions not found in either one but instead found in other spreadsheet applications, such as Gnumeric and KSpread. For example, the specification includes the functions DECIMAL and BASE, which are much better ways to handle different bases than the old BIN2DEC (etc.) functions. It also includes bit operations like BITAND. These sources include Excel, OpenOffice.org Calc, Sun StarOffice Calc, KDE Calligra Sheets, GNOME Gnumeric, IBM/Lotus 1-2-3, Corel Word Perfect Suite Quattro Pro, wikiCalc, and DocumentToGo's SheetToGo. The subcommittee argues that by including the innovations from around the world of many different independent applications, they produce a better result that is far more inclusive.
- Room for innovation by anyone. Application-specific "namespaces" are defined for functions. This allows spreadsheet applications to add new functions, without interfering with current standard functions, future standard functions, or functions defined by other applications. As a result, different applications can add new functions without interfering with others; once a consensus arises about the new function, it can be standardized. The namespace is based on the Internet's naming service (reversed domain names), so ORG.OPENOFFICE.STYLE would be an OpenOffice.org-unique function.
- Internationalization. The specification does not assume that everyone uses "." as the decimal point, and indeed does not constrain user interfaces at all. Named expressions can have names in local character sets.
- Subset support. Applications can implement a subset or superset. To prevent user confusion, various "groups" are defined so that users can request specific sets of capabilities.
One important aspect of OpenFormula is that it provides a predefined set of "groups"; the most important of these groups are small, medium, and large:
- The small group includes a little over 100 functions, including functions for trigonometry, database, finance, and statistics. The vast majority of spreadsheet documents are ably handled by applications that implement the "small" group. At least one PDA application (SheetToGo) has this level of capability, and wikiCalc added the functions in the small group specifically to meet the set defined by OpenFormula.
- The medium group includes all the capabilities of the small group, and adds about 100 more functions.
- The large group includes all the capabilities of the medium group, adding around 130 more functions, as well as capabilities such as complex numbers.
It is expected that users will often request implementations that meet a particular group, based on their needs.
- Wheeler, David A (2004-11-01). "Proposal: More detailed specification for formulas". office-comment (Mailing list).
- Rathke, Eike (2005-06-23). "OpenDocument For Spreadsheets (reply to Morten Welinder)". Archived from the original on 2012-09-10.
- Jones, Brian (2005-10-04). "Comments from Tim Bray on OpenDocument".
- Wheeler, David A (2005-11-07). "FYI: Formulas not specified by Microsoft XML, either". openformula-discuss (Mailing list).
- Rathke, Eike (2007-06-07). "Who is on board?". office-formula (Mailing list).
- About OpenFormula