Expression problem: Difference between revisions

Content deleted Content added

Inline

Revision as of 03:25, 26 June 2021

The expression problem is challenge problem in programming languages that concerns the extensibility and modularity of statically typed data abstractions. The goal is to define a data abstraction that is extensible both in its representations and its behaviors, where one can add new representations and new behaviors to the data abstraction, without recompiling existing code, and while retaining static type safety (e.g., no casts). It exposed deficiencies in programming paradigms and programming languages, and it is still not definitively solved, although there are many proposed solutions.

History

Philip Wadler formulated the challenge and named it "The Expression Problem"^[1] in response to a discussion with Rice University's Programming Languages Team (PLT). He also cited three sources that defined the context for his challenge:

The problem was first observed by John Reynolds in 1975.^[2] Reynolds discussed two forms of Data Abstraction: User-defined Types, which are now known as Abstract Data Types (ADTS) (not to be confused with *Algebraic* Data Types), and Procedural Data Structures, which are now understood as a primitive form of Objects with only one method. He argued that they are complementary, in that User-defined Types could be extended with new behaviors, and Procedural Data Structures could be extended with new representations. His conclusions based on this early analysis turned out to be completely wrong: he wrote that adding a second method to an object "is more a tour de force than a specimen of clear programming," which completely missed the Object-Oriented paradigm and its great success. He also believed the two forms of data abstraction "are inherently distinct and complementary."

Fifteen years later in 1990, William Cook^[3] applied Reynold's work in the context of Objects and Abstract Data Types, which had both grown extensively. Cook identified the matrix of representations and behaviors that are implicit in a Data Abstraction, and discussed how ADTs are based on the behavioral axis, while Objects are based on representations.

At ECOOP '98, Krishnamurthi et al.^[4] presented a design pattern solution to the problem of simultaneously extending an expression-oriented programming language and its tool-set. They dubbed it the "expressivity problem" because they thought programming language designers could use the problem to demonstrate the expressive power of their creations. For PLT, the problem had shown up in the construction of DrScheme, now DrRacket, and they solved it^[5] via a rediscovery of mixins.^[6]^[7] To avoid using a programming language problem in a paper about programming languages, Krishnamurthi et al. used an old geometry programming problem to explain their pattern-oriented solution. In conversations with Felleisen and Krishnamurthi after the ECOOP presentation, Wadler understood the PL-centric nature of the problem and he pointed out that Krishnamurthi's solution used a cast to circumvent Java's type system. The discussion continued on the types mailing list, where Corky Cartwright (Rice) and Kim Bruce (Williams) showed how type systems for OO languages might eliminate this cast. In response Wadler formulated his essay and stated the challenge, "whether a language can solve the expression problem is a salient indicator of its capacity for expression." The label "expression problem" puns on expression = "how much can your language express" and expression = "the terms you are trying to represent are language expressions".

Others co-discovered variants of the expression problem around the same time as Rice University's PLT, in particular Thomas Kühne^[8] in his dissertation, and Smaragdakis and Batory^[9] in a parallel ECOOP 98 article.

Some follow-up work used the expression problem to showcase the power of programming language designs.^[10]^[11]

The expression problem is also a fundamental problem in multi-dimensional Software Product Line design and in particular as an application or special case of FOSD Program Cubes.^{[citation needed]}

Solutions

There are various solutions to the expression problem. Each solution varies in the amount of code a user must write to implement them, and the language features they require.

Multiple dispatch^[12]
Open classes^[13]
Coproducts of functors^[14]
Type classes^[15]
Tagless-final^[16] / Object algebras^[17]
Polymorphic Variants^[18]

Example

Problem description

We can imagine we do not have the source code for the following library, written in C#, which we wish to extend:

public interface IEvalExp
{
    int Eval();
}
public class Lit : IEvalExp
{
    public Lit(int n)
    {
        N = n;
    }
    public int N { get; }
    public int Eval()
    {
        return N;
    }
}
public class Add : IEvalExp
{
    public Add(IEvalExp left, IEvalExp right)
    {
        Left = left;
        Right = right;
    }
    public IEvalExp Left { get; }
    public IEvalExp Right { get; }
    public int Eval()
    {
        return Left.Eval() + Right.Eval();
    }
}

public static class ExampleOne
{
    public static IEvalExp AddOneAndTwo() => new Add(new Lit(1), new Lit(2));

    public static int EvaluateTheSumOfOneAndTwo() => AddOneAndTwo().Eval();
}

Using this library we can express the arithmetic expression 1 + 2 as we did in ExampleOne.AddOneAndTwo() and can evaluate the expression by calling .Eval(). Now imagine that we wish to extend this library, adding a new type is easy because we are working with an Object Oriented Programming Language. For example, we might create the following class:

public class Mult : IEvalExp
{

    public Mult(IEvalExp left, IEvalExp right)
    {
        Left = left;
        Right = right;
    }

    public IEvalExp Left { get; }
    public IEvalExp Right { get; }

    public int Eval()
    {
        return Left.Eval() * Right.Eval();
    }
}

However, if we wish to add a new function over the type (a new method in C# terminology) we have to change the IEvalExp interface and then modify all the classes that implement the interface. Another possibility is to create a new interface that extends the IEvalExp interface and then create sub-types for Lit, Add and Mult classes, but the expression returned in ExampleOne.AddOneAndTwo() has already been compiled so we will not be able to use the new function over the old type. The problem is reversed in functional programming languages like F# where it is easy to add a function over a given type, but extending or adding types is difficult.

Problem Solution using Object Algebra

Let us redesign the original library with extensibility in mind using the ideas from the paper Extensibility for the Masses.^[17]

public interface ExpAlgebra<T>
{
    T Lit(int n);
    T Add(T left, T right);
}

public class ExpFactory : ExpAlgebra<IEvalExp>
{
    public IEvalExp Add(IEvalExp left, IEvalExp right)
    {
        return new Add(left, right);
    }

    public IEvalExp Lit(int n)
    {
        return new Lit(n);
    }
}

public static class ExampleTwo<T>
{
    public static T AddOneToTwo(ExpAlgebra<T> ae) => ae.Add(ae.Lit(1), ae.Lit(2));
}

We use the same implementation as in the first code example but now add a new interface containing the functions over the type as well as a factory for the algebra. Notice that we now generate the expression in ExampleTwo.AddOneToTwo() using the ExpAlgebra<T> interface instead of directly from the types. We can now add a function by extending the ExpAlgebra<T> interface, we will add functionality to print the expression:

    public interface IPrintExp : IEvalExp
    {
        string Print();
    }

    public class PrintableLit : Lit, IPrintExp
    {
        public PrintableLit(int n) : base(n)
        {
            N = n;
        }

        public int N { get; }

        public string Print()
        {
            return N.ToString();
        }
    }

    public class PrintableAdd : Add, IPrintExp
    {
        public PrintableAdd(IPrintExp left, IPrintExp right) : base(left, right)
        {
            Left = left;
            Right = right;
        }

        public new IPrintExp Left { get; }
        public new IPrintExp Right { get; }

        public string Print()
        {
            return Left.Print() + " + " + Right.Print();
        }
    }

    public class PrintFactory : ExpFactory, ExpAlgebra<IPrintExp>
    {
        public IPrintExp Add(IPrintExp left, IPrintExp right)
        {
            return new PrintableAdd(left, right);
        }

        public new IPrintExp Lit(int n)
        {
            return new PrintableLit(n);
        }
    }

    public static class ExampleThree
    {
        public static int Evaluate() => ExampleTwo<IPrintExp>.AddOneToTwo(new PrintFactory()).Eval();
        public static string Print() => ExampleTwo<IPrintExp>.AddOneToTwo(new PrintFactory()).Print();
    }

Notice that in ExampleThree.Print() we are printing an expression that was already compiled in ExampleTwo, we did not need to modify any existing code. Notice also that this is still strongly typed, we do not need reflection or casting. If we would replace the PrintFactory() with the ExpFactory() in the ExampleThree.Print() we would get a compilation error since the .Print() method does not exist in that context.

References

^ "The Expression Problem".
^ Reynolds, John C. (1975). "User-defined Types and Procedural Data Structures as complementary approaches to Data Abstraction.". New Directions in Algorithmic Languages (PDF). IFIP Working Group 2.1 on Algol. pp. 157–168.
^ Cook, William (1990). "Object-Oriented Programming versus Abstract Data Types". In Bakker, J.W. de; Roever, W.P. de; Rozenberg, G. (eds.). Foundations of Object-Oriented Languages (FOOL), REX School/Workshop. Noordwijkerhout, The Netherlands: Springer Berlin Heidelberg. pp. 151–178. ISBN 978-3-540-46450-1.
^ "Synthesizing Object-Oriented and Functional Design to Promote Re-Use".
^ "Modular object-oriented programming with units and mixins".
^ Cook, William (1989). A Denotational Semantics of Inheritance (PDF) (PhD). Brown University.
^ Flatt, Matthew; Krishnamurthi, Shriram; Felleisen, Matthias (1998). "Classes and Mixins". Proceedings of the 25th ACM SIGPLAN-SIGACT symposium on Principles of programming languages - POPL '98. pp. 171–183. doi:10.1145/268946.268961. ISBN 978-0897919791.
^ Kühne, Thomas (1999). A Functional Pattern System for Object-Oriented Design. Darmstadt: Verlag Dr. Kovac. ISBN 978-3-86064-770-7.
^ Smaragdakis, Yannis; Don Batory (1998). "Implementing Reusable Object-Oriented Components". Lecture Notes in Computer Science. 1445.
^ Zenger, Matthias; Odersky, Martin (2001). "Extensible Algebraic Datatypes with Defaults": 241–252. CiteSeerX 10.1.1.28.6778. {{cite journal}}: Cite journal requires |journal= (help)
^ Zenger, Matthias; Odersky, Martin (2005). "Independently extensible solutions to the expression problem". Foundations of Object-Oriented Languages (FOOL). ACM SIGPLAN. CiteSeerX 10.1.1.107.4449. {{cite web}}: Missing or empty |url= (help)
^ Chambers, Craig; Leavens, Gary T. (November 1995). "Type Checking and Modules for Multi-Methods". ACM Trans. Program. Lang. Syst. (17): 805–843.
^ Clifton, Curtis; Leavens, Gary T.; Chambers, Craig; Millstein, Todd (2000). "MultiJava: Modular Open Classes and Symmetric Multiple Dispatch for Java" (PDF). Oopsla '00.
^ Wouter Swierstra (2008). "Data Types à La Carte". Journal of Functional Programming. Vol. 18, no. 4. Cambridge University Press. pp. 423–436. doi:10.1017/S0956796808006758. ISSN 0956-7968.
^ Wehr, Stefan; Thiemann, Peter (July 2011). "JavaGI: The Interaction of Type Classes with Interfaces and Inheritance". ACM Trans. Program. Lang. Syst. (33).
^ Carette, Jacques; Kiselyov, Oleg; Chung-chieh, Shan (2009). "Finally Tagless, Partially Evaluated: Tagless Staged Interpreters for Simpler Typed Languages" (PDF). J. Funct. Program.
^ ^a ^b Oliveira, Bruno C. d. S.; Cook, William R. (2012). "Extensibility for the Masses: Practical Extensibility with Object Algebras" (PDF). Ecoop '12.
^ Garrigue, Jacques (2000). "Code Reuse Through Polymorphic Variants". CiteSeerX 10.1.1.128.7169. {{cite journal}}: Cite journal requires |journal= (help)

External links

[WadlerPost-1] "The Expression Problem".

[Reynolds-2] Reynolds, John C. (1975). "User-defined Types and Procedural Data Structures as complementary approaches to Data Abstraction.". New Directions in Algorithmic Languages (PDF). IFIP Working Group 2.1 on Algol. pp. 157–168.

[CookOOPvsADT-3] Cook, William (1990). "Object-Oriented Programming versus Abstract Data Types". In Bakker, J.W. de; Roever, W.P. de; Rozenberg, G. (eds.). Foundations of Object-Oriented Languages (FOOL), REX School/Workshop. Noordwijkerhout, The Netherlands: Springer Berlin Heidelberg. pp. 151–178. ISBN 978-3-540-46450-1.

[Synth-4] "Synthesizing Object-Oriented and Functional Design to Promote Re-Use".

[FF-5] "Modular object-oriented programming with units and mixins".

[Mixins-6] Cook, William (1989). A Denotational Semantics of Inheritance (PDF) (PhD). Brown University.

[FKF-7] Flatt, Matthew; Krishnamurthi, Shriram; Felleisen, Matthias (1998). "Classes and Mixins". Proceedings of the 25th ACM SIGPLAN-SIGACT symposium on Principles of programming languages - POPL '98. pp. 171–183. doi:10.1145/268946.268961. ISBN 978-0897919791.

[K-8] Kühne, Thomas (1999). A Functional Pattern System for Object-Oriented Design. Darmstadt: Verlag Dr. Kovac. ISBN 978-3-86064-770-7.

[SB-9] Smaragdakis, Yannis; Don Batory (1998). "Implementing Reusable Object-Oriented Components". Lecture Notes in Computer Science. 1445.

[Scala0-10] Zenger, Matthias; Odersky, Martin (2001). "Extensible Algebraic Datatypes with Defaults": 241–252. CiteSeerX 10.1.1.28.6778. {{cite journal}}: Cite journal requires |journal= (help)

[Scala-11] Zenger, Matthias; Odersky, Martin (2005). "Independently extensible solutions to the expression problem". Foundations of Object-Oriented Languages (FOOL). ACM SIGPLAN. CiteSeerX 10.1.1.107.4449. {{cite web}}: Missing or empty |url= (help)

[Chambers_&_Leavens,_Multi-Methods-12] Chambers, Craig; Leavens, Gary T. (November 1995). "Type Checking and Modules for Multi-Methods". ACM Trans. Program. Lang. Syst. (17): 805–843.

[Clifton_et._al.,_MultiJava_Open_Classes-13] Clifton, Curtis; Leavens, Gary T.; Chambers, Craig; Millstein, Todd (2000). "MultiJava: Modular Open Classes and Symmetric Multiple Dispatch for Java" (PDF). Oopsla '00.

[14] Wouter Swierstra (2008). "Data Types à La Carte". Journal of Functional Programming. Vol. 18, no. 4. Cambridge University Press. pp. 423–436. doi:10.1017/S0956796808006758. ISSN 0956-7968.

[Wehr_&_Thiemann,_JavaGI_Type_Classes-15] Wehr, Stefan; Thiemann, Peter (July 2011). "JavaGI: The Interaction of Type Classes with Interfaces and Inheritance". ACM Trans. Program. Lang. Syst. (33).

[Carette_et_al.,_Finally_tagless,_partially_evaluated:_Tagless_staged_interpreters_for_simpler_typed_languages-16] Carette, Jacques; Kiselyov, Oleg; Chung-chieh, Shan (2009). "Finally Tagless, Partially Evaluated: Tagless Staged Interpreters for Simpler Typed Languages" (PDF). J. Funct. Program.

[Oliveira_&_Cook,_Object_Algebras-17] Oliveira, Bruno C. d. S.; Cook, William R. (2012). "Extensibility for the Masses: Practical Extensibility with Object Algebras" (PDF). Ecoop '12.

[Code_Reuse_Through_Polymorphic_Variants-18] Garrigue, Jacques (2000). "Code Reuse Through Polymorphic Variants". CiteSeerX 10.1.1.128.7169. {{cite journal}}: Cite journal requires |journal= (help)

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

[11]

[12]

[13]

[14]

[15]

[16]

[17]

[18]

@@ Line 1: / Line 1: @@
-The '''expression problem''' is a term used in discussing strengths and weaknesses of various [[programming paradigms]] and [[programming languages]].
+The '''expression problem''' is challenge problem in [[programming languages]] that concerns the extensibility and modularity of statically typed data abstractions.
+The goal is to define a data abstraction that is extensible both in its representations and its behaviors, where one can add new representations and new behaviors to the data abstraction, without recompiling existing code, and while retaining static type safety (e.g., no casts).
+It exposed deficiencies in [[programming paradigms]] and [[programming languages]], and it is still not definitively solved, although there are many proposed solutions.
+== History ==
-[[Philip Wadler]] coined the term<ref name="WadlerPost">{{cite web
+[[Philip Wadler]] formulated the challenge and named it "The Expression Problem"<ref name="WadlerPost">{{cite web
 | title=The Expression Problem
 | url=http://homepages.inf.ed.ac.uk/wadler/papers/expression/expression.txt
-}}</ref> in response to a discussion with Rice University's [[Racket (programming language)#Development|''Programming Languages Team'' (PLT)]]:
+}}</ref> in response to a discussion with Rice University's [[Racket (programming language)#Development|''Programming Languages Team'' (PLT)]].
+He also cited three sources that defined the context for his challenge:
-<blockquote>The expression problem is a new name for an old problem.<ref name="Reynolds">
+The problem was first observed by [[John C. Reynolds|John Reynolds]] in 1975.<ref name="Reynolds">
 {{cite book
-| chapter= User-defined types and procedural data as complementary approaches to data abstraction.
+| chapter= User-defined Types and Procedural Data Structures as complementary approaches to Data Abstraction.
 | title= New Directions in Algorithmic Languages
 | year= 1975
@@ Line 15: / Line 19: @@
 | publisher= IFIP Working Group 2.1 on Algol
 | pages= 157-168
+| url= https://www.cs.tufts.edu/~nr/cs257/archive/john-reynolds/procedural-data-structures.pdf
 }}
+</ref> Reynolds discussed two forms of Data Abstraction: User-defined Types, which are now known as Abstract Data Types (ADTS) (not to be confused with *Algebraic* Data Types), and Procedural Data Structures, which are now understood as a primitive form of Objects with only one method. He argued that they are complementary, in that User-defined Types could be extended with new behaviors, and Procedural Data Structures could be extended with new representations. His conclusions based on this early analysis turned out to be completely wrong:
-</ref><ref name="Cook">
+he wrote that adding a second method to an object "is more a tour de force than a specimen of
-{{cite web
+clear programming," which completely missed the Object-Oriented paradigm and its
-| title=Object-Oriented Programming versus Abstract Data Types
+great success. He also believed
-| url=http://www.cs.utexas.edu/users/wcook/papers/OOPvsADT/CookOOPvsADT90.pdf
+the two forms of data abstraction "are inherently distinct and complementary."
-}}</ref> The goal is to define a datatype by cases, where one can add new cases to the datatype and new functions over the datatype, without recompiling existing code, and while retaining static type safety (e.g., no casts).</blockquote>
+Fifteen years later in 1990, [[William Cook (computer scientist)|William Cook]]<ref name="CookOOPvsADT">
-== History ==
+{{cite book
-The problem was first observed by [[John C. Reynolds|John Reynolds]] in 1975.<ref name="Reynolds" />
+| last= Cook |first= William
+| author-link= William Cook (computer scientist)
+| date= 1990
+| editor-last1= Bakker | editor-first1= J.W. de
+| editor-last2= Roever | editor-first2= W.P. de
+| editor-last3= Rozenberg | editor-first3= G.
+| title= Foundations of Object-Oriented Languages (FOOL), REX School/Workshop
+| location= Noordwijkerhout, The Netherlands
+| publisher= Springer Berlin Heidelberg |pages=151-178
+| chapter= Object-Oriented Programming versus Abstract Data Types
+| chapter-url= https://link.springer.com/chapter/10.1007/BFb0019443
+| isbn= 978-3-540-46450-1}}
+</ref>
+applied Reynold's work in the context of Objects and Abstract Data Types, which had both grown extensively. Cook identified the matrix of representations and behaviors that are implicit in a Data Abstraction, and discussed how ADTs are based on the behavioral axis, while Objects are based on representations.
 At ECOOP '98, Krishnamurthi et al.<ref name="Synth">
@@ Line 33: / Line 52: @@
 | title= Modular object-oriented programming with units and mixins
 | url= http://dl.acm.org/citation.cfm?id=289432
-}}</ref> via a rediscovery of [[Mixin|mixins]].<ref name="FKF">
+}}</ref> via a rediscovery of [[Mixin|mixins]].<ref name="Mixins">{{cite thesis |type= PhD
+| last= Cook | first= William | date= 1989
+| title= A Denotational Semantics of Inheritance | publisher= Brown University
+| url= https://www.cs.utexas.edu/~wcook/papers/thesis/cook89.pdf }}</ref><ref name="FKF">
 {{cite book
 | chapter= Classes and Mixins
@@ Line 47: / Line 69: @@
 | first3= Matthias
 | isbn= 978-0897919791
-}}</ref> To avoid using a programming language problem in a paper about programming languages, Krishnamurthi et al. used an old geometry programming problem to explain their pattern-oriented solution. In conversations with Felleisen and Krishnamurthi after the ECOOP presentation, Wadler understood the PL-centric nature of the problem and he pointed out that Krishnamurthi's solution used a cast to circumvent Java's type system. The discussion continued on the types mailing list, where Corky Cartwright (Rice) and [[Kim Bruce]] (Williams) showed how type systems for OO languages might eliminate this cast. In response Wadler formulated his essay and stated the challenge, "whether a language can solve the expression problem is a salient indicator of its capacity for expression." The label "expression problem" puns on  expression = "how much can your language express" and expression =
+}}</ref> To avoid using a programming language problem in a paper about programming languages, Krishnamurthi et al. used an old geometry programming problem to explain their pattern-oriented solution. In conversations with Felleisen and Krishnamurthi after the ECOOP presentation, Wadler understood the PL-centric nature of the problem and he pointed out that Krishnamurthi's solution used a cast to circumvent Java's type system. The discussion continued on the types mailing list, where Corky Cartwright (Rice) and [[Kim Bruce]] (Williams) showed how type systems for OO languages might eliminate this cast. In response Wadler formulated his essay and stated the challenge, "whether a language can solve the expression problem is a salient indicator of its capacity for expression." The label "expression problem" puns on expression = "how much can your language express" and expression =
 "the terms you are trying to represent are language expressions".
@@ Line 57: / Line 79: @@
 {{cite journal
 | title=Extensible Algebraic Datatypes with Defaults
+| first1=Matthias | last1=Zenger
+| first2=Martin | last2=Odersky
 | pages=241–252
 | citeseerx=10.1.1.28.6778
 | year=2001
 }}</ref><ref name="Scala">
-{{cite journal
+{{cite web
 | title=Independently extensible solutions to the expression problem
+| first1=Matthias | last1=Zenger
+| first2=Martin | last2=Odersky
+| series=Foundations of Object-Oriented Languages (FOOL)
+| publisher=ACM SIGPLAN
 | citeseerx=10.1.1.107.4449
 | year=2005