Talk:Abstract data type

From Wikipedia, the free encyclopedia
Jump to: navigation, search
WikiProject Computer science (Rated C-class, High-importance)
WikiProject icon This article is within the scope of WikiProject Computer science, a collaborative effort to improve the coverage of Computer science related articles on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
C-Class article C  This article has been rated as C-Class on the project's quality scale.
 High  This article has been rated as High-importance on the project's importance scale.
 


Untitled[edit]

In computing, an abstract data type (ADT) consists of a "domain" (a set of data), a set of operations, and a set of "axioms" (rules that the operations must obey).

Wrong definition[edit]

The first paragraph is wrong in several ways: (a) Classes, while one of the better ways to implement an ADT, aren't the only one, and the style described here is just one of several possible implementations even in an OO language; (b) there's no mention of axioms (it's implicit in "specification" if you will, but that's too easy to overlook).

Here's my attempt at cleaning this up:


In programming, ADTs are often used as a design and documentation vehicle: if some code carries as comment that amounts to "this implements a stack", it is immediately clear that the essential properties of the code are those of the "stack" ADT, and that everything else was considered accidental "implementation detail" by the programmer.

HTH. -- Joachim Durchholz [[1]]

How the heck would a balanced binary tree fit under this categorization? since it being balanced means that the internals ARE important. User:hfastedge
The answer is -- it doesn't. The fact that a binary tree is balanced is an implementation property. Although it may improve the practical performance of a binary tree greatly, it shouldn't have any effect on the functional properties. Balanced and unbalanced binary trees are identical from an ADT point of view -- Derek Ross
Actually a balanced binary tree *is* an ADT. It is a set of data (trees over some given data), operations (insert/remove/lookup by walking over tree nodes), and axioms (balancedness).

However, it's usually not the ADT that users of a balanced binary tree are interested in. The ADT for these is a Map (insert/remove/lookup no matter how), plus some performance guarantees (performance guarantees are usually not considered part of an ADT, though I'd personally say that this would make the concept more useful). -- Joachim Durchholz [[2]]

Separation of interface and implementation[edit]

This article as it currently stands does not spell out the clear separation between interface and implementation that is necessary in defining an abstract data type. See my alternative data abstraction article for an explanation of this.

Ldo 12:33, 12 Sep 2003 (UTC)

Preconditions, postconditions, and invariants[edit]

This page should include information on preconditions, postconditions, and invariants.

YES! An abstract data type is not a class or any other implementation. It is abstract -- not an implementation. And the abstract data type is not merely defined in terms of its programmatic interface, but as the commenter above noted: by the specification of the behavior: preconditions, postconditions, and invariants. —The preceding unsigned comment was added by 67.161.67.58 (talkcontribs) 19:08, 16 March 2007.

Better example[edit]

The original example seemed too complex. Hopefully, seeing the interface and implementation will help the reader to understand.

I hope to add an alternative implementation when I get some more time.

Bill Pringle | Talk]] 20:39, 10 Aug 2004 (UTC)

Complex number???[edit]

Is complex number really an ADT?????? Talam 00:56, 6 December 2005 (UTC)

Depending on how you look at it... If given enough explation why (being an instance of Number and AlgebraicallyClosedField), yes, but no, in the sense that it is a container like List and Tree. —R. Koot 16:15, 6 December 2005 (UTC)
It isn't that a complex number is an ADT, but rather you can create an ADT for a complex number. That is a list of ADT examples that can be found in text books. Personally, I think Complex Numbers make a lousy example, but the text book I use happens to use that example. wrp103 (Bill Pringle) - [[User talk:Wrp103|Talk]] 21:17, 7 December 2005 (UTC)
The way I've found that is the most successful way to explain what an ADT is, to give tangible examples. The ADT for an ordered set of information provides the same API for accessing the data, regardless of it being stored in an array, an unordered linked-list, an ordered linked-list, a binary tree, a 3-2 tree, etc. A List of the common interfaces for ADTs needs to also be added to the article.

Linked List and Binary Search Tree[edit]

An anonymous editor deleted linked list and binary search tree without any explanation in the edit summary. Thinking it was a new editor that was experimenting, I reverted their edit. They since contacted me on my talk page and gave the following explanation:

I tried to make an edit to correct your page about abstract data types. You list "linked list" and "binary search tree" as abstract data types, and I removed them from the list because I believe they should not be included. I was not attempting to vandalize the entry or "experiment" as you said in your message to me.
An abstract data type should not specify an implementation; therefore a linked list is certainly NOT an ADT. A list, also written in the page, IS an ADT. But not a linked list, no more than an array list.
For the same reason, a binary search tree is not an ADT. Binary search trees are, like linked lists, particular data structures and not implementation-independent. A binary search tree can be *USED* to implement an ADT such as a set or map.

(The editor used different IP addresses when they made the edit, and when they wrote to my talk page, so I don't have an easy way to contact them directly.)

I don't totally agree with the logic, but they made a good point, so I reverted my revert back to their version. They are correct that an ADT hides the implementation from the user, but that doesn't (IMHO) preclude an ADT from implementing a data structure.

Personally, I think that a linked list, for example, can be an ADT by providing methods to perform certain operations. The result is that the user doesn't need to know anything about how to make a linked list, but can still use them.

Anybody else have any strong feelings one way or the other? wrp103 (Bill Pringle) 00:48, 27 October 2006 (UTC)

---

Bill, I was the one who made the anonymous edit about linked lists and binary trees. I apologize for my unnecessary anonymity. My name is Marty Stepp and I'm a CS&E instructor at the University of Washington. I still believe that a linked list isn't an ADT, because the name itself implies the exact style of implementation, which is contrary to the idea of ADTs.

User:Martnstepp (Marty Stepp) 02:36, 14 November 2006 (PST)

I can see arguments on both side. There are many ways in which to implement a linked list (single / double, internal / external, pointers / array, etc.) Providing an object with a series of methods allows a user to incorporate a linked list within their program without knowing anything about the implementation. A new version of the class could be distributed with a totally new implementation that would not affect any users. That seems to meet the criteria of an ADT. I don't see that any different than, say Vector in Java: it is an abstract array. wrp103 (Bill Pringle) 14:15, 14 November 2006 (UTC)


Separation of interface and implementation[edit]

Regarding the last paragraph of the "Separation of interface and implementation" section, I appreciate the note regarding ADT representation using the class construct. However the final two sentences seem to introduce language contrasts that I am not aware of. C also provides mechanisms of enforcement and hierarchy. It might be more clear to add C to the list or to remove all language examples.

Also it is not clear what an "object oriented ADT" is. Perhaps an ADT represented as a class?

--Sector7agent 22:59, 8 January 2007 (UTC)

That is what I was thinking of when I wrote that. C is limited to declaring static functions that don't create entry points, but Java/C++ have private, public, and protected. Actually, I wrote that text some time ago, and have no problems if somebody wants to tweak (or rewrite) it. wrp103 (Bill Pringle) 16:00, 9 January 2007 (UTC)
The first paragraph, especially the last sentence; it's a classic! talk about explaining something in layman's terms. Arcturus 22:16, 8 February 2007 (UTC)

Rewritten from scratch[edit]

I've rewritten the article from scratch.

DONE

  • added the summary section and two references.
  • set the conceptual boundaries clear.
  • extended the delimitation emphasizing on values.
  • extended the summary giving an idea how to work with ADTs.
  • references to some ADT tools, to give the reader more hooks.

The article now at least it should make clear, what ADTs are and what they are not.

Any technical details, even most basic definitions are missing, but if you read the predecessor article, i truly did my exploit.

--85.182.79.140 (talk) 22:48, 26 July 2008 (UTC)

Not an improvement[edit]

I don't think 85.182.72.125 improved the article at all. His article lacks any technical rigor and concrete description of what Abstract Data Type is. —Preceding unsigned comment added by 66.117.143.86 (talk) 09:25, 6 August 2008 (UTC)

Expanded the example "Rational numbers"[edit]

I have rewritten just the example, in an attempt to see if I can find the tricks to make the subject more accessible for a wider audience. If it works, similar methods could be used for the article in general. Once that has been achieved, it may no longer make sense to write so much in this particular example. Some of the stuff I have written here should probably be written more generally about the whole subject of ADTs, and would then not need to be repeated here.

I think this article reads more like poetry for those in the know, while for the rest of us, it reads like the financial records of an unknown deceased company. For example,

An abstract data structure is an abstract storage for data defined in terms of the set of operations to be performed on data and computational complexity for performing these operations, regardless of the implementation in a concrete data structure.

This expresses beutifully what an abstract data structure is. Those who know what it is, will recognize it, and feel the beauty.

But even a reader who has taken courses in programming to the point of being able to code a quicksort from a description, will not easily know what it actually means that things are "abstract". What is "an abstract storage"? Imagine the reader who thinks "I suppose a hard disk is a concrete storage. Could it be a network storage?" To deduce this from this article is at best like a crossword puzzle.

To make the article more readable for the laymen, it needs to give a little more context. Just a little may be enough. (The goal is still not to write an introductory course in programming-in-the-large for the average housewife.) Context here means, e.g., the situation where a large computer system is broken down into modules, the modules are often made by different people, often some modules are bought in the market among several competing offers, etc. Even when a system is written by a single person, it is impossible to remember all details of module-one while writing module-two. Even if that were possible, flexibility is still needed in practice, and it is hard to achieve flexibility without modularization and information hiding. And information hiding is not really about "hiding", like covert agents of CIA, but just decoupling.

In other words, if ADTs are the answers, what are the questions? What are we actually talking about?

It took many years from the computer was invented, to the object-oriented programming was. After OOP was invented, it still took many years before people were able to express in words what the gurus knew, sort-of in the form of gut feelings, what the point was, in OOP. Concepts like "information hiding" were born out of that effort.

Before that, and perhaps still today, OOP was often "sold" in unrealistical ways, emphasizing the wrong aspects. I remember during many years I had the feeling that articles about OOP told you that all these wonders came from a syntax change: obj.method(args) rather than method(obj, args). Then writers told you it was not just a syntax change, but they were unable to tell what exacly it was. There was lots of poetry satisfying mostly those who had mastered the black art.

Then it became modern to say that the wonders were the results of sending messages to objects, making the objects like animate things that would more or less pick up where the programmer left off, relieving the programmer from some of the burden. Still this version was much closer to the information hiding, decoupling aspect. Rather than relieving the (single) programmer of the whole system any burden, OOP and later inventions relieve the programmers of the client classes some burden, leaving that burden for the programmer of the server class. It is hard to understand the point in ADT and related concepts until one has tried to create something complex, and understood what errors one made. Seen how the complexity of each part flows like water through the whole system, until one learns to contain that flow in watertight compartments.

ADTs are tools for breaking down complex computer systems, by reducing the complexity of the interactions between the parts. ADTs allow most parts of a computer system to handle complex data in a simplified way. For this purpose, the ADT specifies a limited set of operations on, or interactions with, data of a particular kind. Given the ADT, one team of programmers can write the computer code to carry out this limited set of operations, while all other modules will invoke that module whenever they need such operations done. The ADT tells the programmers of other modules what operations are available, and what exactly are the results that can be relied on.

The description in the previous paragraph describes more or less equally well a couple of other related concepts. The finer points, the distinctions between ADTs and e.g "interfaces", who can say something sensible about that?

(By the way, the word "interface" is used in so many different ways... Imagine the not-so-novice computer user that thinks an "interface" is a network card or a software equivalent... It is kind of rude to say just "interface", and let readers who do not know what you mean, feel like idiots.)

The evolution of OOP served multiple purposes, going in multiple directions. ADTs and interfaces are a next step, in one of the directions. OOP and classes still works largely with specific implementations. ADTs remove everything that is not essential to know about for the programmers of the other modules. On the other hands it goes further, when it investigates other things the user may need to know about - complexity, for example. Fast implementations was always an obvious goal. A new dimension opened up when the STL was documented with the tradeoff of different kinds of efficiencies for different kinds of containers. It was easily realizable that many of the tradeoffs were unavoidable, and that made it clearer that something had to go in the specification to guide the choices of the users. At first, that was, of course, merely 'better' documentation. Then it became promoted to specification.

But, on second thought, I think that it is perhaps wrong to make all of the article readable for "morons". Perhaps what I want is an introduction that allows most readers to go home with something learned from the article. Then the rest of the article may be directed to the needs of more initiated people.

Maybe I will try to do it myself. But I need a break first, to cool down a bit. Cacadril (talk) 14:11, 24 January 2009 (UTC)

In the stack example, a stack should be defined as a long int rather than a handle[edit]

If you really want to make the stack example an ADT, then you shouldn't need to be bothered with the fact that it is implemented using a handle. It would be much cleaner to typedef a stack_t to be a long, and then use that type to refer to stacks (notice how I don't care and don't want to know about implementation details - like "why a long, and not an unsigned?" - even though there might be good reasons for it when you cross the implementation barrier). Remember people: example code should be exemplary (as quoted from "Good API design and why it matters" at googletechtalks). 62.166.203.162 (talk) 12:50, 12 March 2009 (UTC)

An ADT is an ABSTRACT (=theoretical) structure. An implementation of an ADT is not an ADT, it is a (concrete) data strcuture or a data type (in a specific language). All the best, --Jorge Stolfi (talk) 23:05, 14 May 2009 (UTC)

Removed incorrect lead paragraph[edit]

The following definition is incorrect:

"An abstract data type is a set of values and a set of procedures that manipulate those values. The client of an abstract data type (that is, a part of a program that uses that type) All declarations that make up an abstract data type are placed in a module Local identifiers that are to be seen outside a module are exported; all other local identifiers are invisible outside the module, which allows programmers to hide implementation details from the clients of the module. Identifiers from surrounding modules are not automatically inherited by a module. Instead, those that are needed must be explicitly imported. Some identifiers, like the predeclared types integer and Boolean, may be declared pervasive, which means that they are automatically imported into all nested name scopes.

This definition confuses an ADT (which is a theoretical concept) with the software engineering concepts of "module" and "interface". The differece is like between the mathematical function exp(), and a library procedure that computes it. The concepts of clients, identifiers, inheritance, visibility, etc. are as foreign to ADTs as the concepts of overflow and NAN are to mathematical exp(). All the best, --Jorge Stolfi (talk) 23:33, 14 May 2009 (UTC)

Algorithm that needs fetch of uninitialized variables[edit]

There are algorithms that perform fetch(V) when V is an uninitialized variable, and depend on that to achieve optimum asymptotic running time. An example is checking whether there are any repetitions in a given array of n numbers, each between 0 and n2-1. With an array of n2 variables, each capable of storing a number between 0 and n-1, the problem can be solved in O(n) steps. Note that this is not possible if the array elements have to be initialized, either by the algorithm or by the storage allocation operation.
There is anote to this effect in the article. It calls for a reference or wikilink; but I can't recall the name of this trick or its inventor. Googling for "uninitialized array" did not turn up that information. Does anyone have it?
Thanks, and all the best, --Jorge Stolfi (talk) 00:59, 23 May 2009 (UTC)

Removed some text which looks more like a message board discussion[edit]

The question of the relationship between abstract data types and objects is a complex one. Previous versions of this page stated that "Abstract data types are also an important conceptual tool in object-oriented programming and design by contract methodologies for software development.[citation needed]". However, a more detailed study of the relationship reveals that object-oriented programming and abstract data types are fundamental different forms of data abstraction.

The name "abstract data type" apparently was coined by researchers in software engineering and programming language design; while "abstract data structure" was coined by researchers in data structures and algorithms.[citation needed]


I removed the two paragraphs above from article here, since they are interesting observations, but require substantiation ans elabotation. 71.146.74.108 (talk) 14:56, 19 April 2010 (UTC)

Regarding the latter remark, it would be interesting to have an opinion of a historian in computer science. My impression is that while the two terms may have been used independently and interchangeably over time, the term ADT gained preference because it better reflects the corresponding concept of abstraction, which is the *behavior* rather than the *organization* of the abstracted data representation. As for who coined what, it should be remembered that at these early times there was a heavy overlap between researchers in theoretical software engineering and thoretical computer science, judging from the names such as Hopcroft or Ullman. 71.146.74.108 (talk) 15:12, 19 April 2010 (UTC)

Data abstraction in C++ essay by IP[edit]

I have removed the following essay inserted into the text by 101.210.91.186 (talk · contribs · WHOIS) in this edit on February 21, 2012., and restored the section on Advantages of abstract data typing. -- Petri Krohn (talk) 19:45, 12 May 2012 (UTC)

101.210.91.186 wrote:[edit]

Data abstraction refers to, providing only essential information... (Copyvio removed)

Copyvio[edit]

In fact the whole text was a copyright violation and a copy paste from tutorialspoint.com, fittingly named Data Abstraction in C++. Now removed. (tutorialspoint.com is listed on the Wikipedia:Spam blacklist, so I cannot link to the source.) -- Petri Krohn (talk) 12:06, 13 May 2012 (UTC)

IB Computer Science Abstract Data Type[edit]

This section feels more like a tutorial than an encyclopedia article. Does it belong in the article? Maybe the content could be redistributed in a manner that makes more sense. 77.11.28.130 (talk) 10:08, 20 June 2014 (UTC)