Talk:Splay tree

From Wikipedia, the free encyclopedia
Jump to: navigation, search
WikiProject Computer science (Rated C-class, Mid-importance)
WikiProject icon This article is within the scope of WikiProject Computer science, a collaborative effort to improve the coverage of Computer science related articles on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
C-Class article C  This article has been rated as C-Class on the project's quality scale.
 Mid  This article has been rated as Mid-importance on the project's importance scale.
 

Um. I'm not sure the zigzig step is correct. I implemented it and the performance sucked. Works better if the result from a zigzig zigzags. —Preceding unsigned comment added by 122.107.243.206 (talk) 02:01, 19 April 2009 (UTC)

Yes, I noticed there is a problem with the description of the Zig-zag step and editted it. Please someone check the changes. — Preceding unsigned comment added by MazinIssa (talkcontribs) 08:06, 11 July 2013 (UTC)

tree[edit]

Someone please do a pseudocode implementation of splay on the main page. (I removed the note suggesting that, since such notes should be confined to the talk pages.) 68.100.96.143

I've been trying to work out a way to balance a splay tree since I saw this article's claim that splay trees are "self-balancing" and support O(log N) "amortized time" insertions and deletions. I have found multiple splay trees that cannot be balanced and also keep the order defined by splaying. I'm changing the "amortized time" for now, but I would like to chat with someone about the idea that a splay tree can self-balance, just in case I'm crazy. Tjdw 14:16, 15 Jan 2004 (UTC)

I believe splay trees (at least above a certain size) will be (roughly) balanced (meaning their height will be in O(log n)), or at least cannot stay unbalanced once nodes in unbalanced parts of the tree are accessed. I have not checked for myself, but Sleator and Tarjan state in their original article that the height of nodes on the access path to a node being accessed get approximately halfed. If splay trees could be and stay unbalanced, the given worst-case access time could hardly hold. As many internet sources claim this too, I reintroduced it to the article (otherwise the last paragraph would in my eyes not be correct).DrZ 23:11, 14 May 2004 (UTC)
Actually, splay trees are not necessarily self-balancing, they are only stochastically self-balancing. That is why you get O(log n) amortized time - a single access can take n comparisons, just as with a plain binary search tree. It will, however, also somewhat rebalance the tree. Moreover, the insertations that debalance the tree are very cheap, so that in sum you get excellent behaviour. --Stephan Schulz 13:57, 26 May 2005 (UTC)

Should note that the complexity approximates O(M) where M is the working set of nodes, and is therefore appropriate to use when the working set is significantly smaller than the full set. Should also note that both reading (find) and writing (insert/delete) are about equally expensive, so it may behave poorly compared to other self-balancing trees (that only modify the tree in update) when updates are rare.

Complexity should be O(log M), unless I misuderstand what you are saying. --Stephan Schulz 13:57, 26 May 2005 (UTC)
I believe you are right. It should be O(log M). Vecter (talk) 16:36, 16 June 2010 (UTC)

Although the working set theorem says splays have a near O(1) access time for those items in the working set (given working set is small relative to the number of total nodes and constant), in practice randomly constructed BSTs have been shown to outperform splays even when the access pattern is heavily skewed (see, for example, Bell and Gupta). Also of note is that splay trees have better cache performance than BSTs under certain settings. Finally, we should discuss different splalying variants. Top Down vs. Bottom up. The "simple" top-down splaying variant (explained by Sleator and Tarjan) as implemented by Weiss (Data structures in C) is fast in practice (as compared to other variants).


I should probably write up pseudocode (or other more detailed explanation of the algorithm), given that I have both implemented it and documented it before. Bovlb 07:56, 2004 Mar 5 (UTC)

Diagrams of Splaying would be helpful.

I would like to know what literature/research support the claim that "randomly rebalancing the tree can avoid this unbalancing effect and give similar performance to the other self-balancing algorithms"? I didn't manage to verify that, though I tried.

I agree with you. Randomization for splay trees (as by such authors as Fürer, Albers and Karpinski) only helps to save time on randomly not splaying. And only in some cases. As conjectured, of course, it only improves the working time by a constant factor. —Preceding unsigned comment added by 129.67.117.122 (talk) 19:38, 7 January 2008 (UTC)

Self-balancing?[edit]

I have an issue with the fact that this page says splay trees are "self-balancing" as opposed to "self-adjusting". AVL trees, for example, are self-balancing because they maintain a height of O(log n). In contrast, splay trees can have linear height for an indefinite period of time. Of course, splay trees were indroduced as "self-adjusting binary search trees" as well. (Jamie King 16:30, 23 February 2006 (UTC))

See my comment above. They are only stochastically self-balancing. Still, I think the link is useful, so I don't know if we should change it.--Stephan Schulz 20:09, 23 February 2006 (UTC)
They are NOT self balancing. They are self adjusting. They do, as you say, for certain lookup patterns (e.g., random and others (see the "working set theorem" by Sleator and Tarjan)) provide amoritized O(lg n) lookups. However, amortized lookups (for some access patterns) is not equivalent to "stocahtiscally self-balancing". A valid splay tree could have essentially a linked list structure (at times) which is hardly "balanced".
Concur. Splay tress are NOT self balancing trees, they tend to be balanced for common lookup sequences (e.g., uniform queries) but they are not self-balancing in the same way that AVL or Red-Black trees are.
This should be taken care of. I'll do it (i.e. remove the 'self-balancing') if no objection is made during the week --24.37.173.150 (talk) 04:37, 29 November 2009 (UTC)
Not without a reference, and I doubt you'll find one.- Wolfkeeper 05:10, 29 November 2009 (UTC)

I just wanted to also point that splay tree are not self-balancing. In fact they are very unbalanced! They are self-adjusting yes, to the workload, and lookup patterns. I can easly create such pattern of accesses to splay tree which will make it a list! Which is patologically unbalannced. —Preceding unsigned comment added by 87.239.216.2 (talk) 19:46, 4 August 2010 (UTC)

If you think about it they have to be actually self-balancing in a stochastic sense. If they weren't they would never be able to achieve O(ln n) average speed. There are cases where they are partially linear and can become unbalanced, but it takes O(1) operations to cause that, and an O(n) type operation to fix it, so the average is better than O(ln n).- Wolfkeeper 03:40, 13 August 2010 (UTC)

Uniform sequence[edit]

Can anyone explain to me what "uniform sequence" (of operations) means ? --194.9.67.130 12:48, 26 November 2006 (UTC)

Uniform access basically refers to accessing elements of the tree uniformly at random. A uniform sequence of operations to a sequence exhibiting uniform access, i.e., artifacts such as long runs accessing a single node or accessing just a small subset of the nodes are uncommon. It is for non-uniform sequences such as these that splay trees show an advantage. Deco 17:06, 26 November 2006 (UTC)

Where is Splay?[edit]

Why when you search for splay are you brought here. A friend was told they had splayed feet I searched for it on here and found nothing but this on the word splay. I have of course found out what it means. But why do you redirect people to this technical description of something else. Can you add splay (and it's 2 meanings and multiple uses)? I know I can, but I don't know how to make it stop redirecting —The preceding unsigned comment was added by 87.74.86.198 (talkcontribs) .

Just go to this page and edit away. In general, if you come via a redirect, clicking on the little "(Redirected from Splay)" will take you to the page that redirected you. The redirection is part of the page contents, just replace it with whatever you think needs to be said. You probably will create a disambiguation page, so please try to follow the Manual of Style. --Stephan Schulz 22:23, 27 November 2006 (UTC)

symmetric order[edit]

The sequential access theorem mentions symmetric order. I don't know what that means. The original paper also states the theorem (using the same term), but does not prove it. Does anybody know what this means? —Preceding unsigned comment added by Raduberinde (talkcontribs) 13:37, 15 September 2007 (UTC)

Performance theorems[edit]

The last theorem of this section mentions a concrete constant for the upper bound. It is meaningless without knowing which operations are counted. 83.29.230.229 (talk) 19:37, 15 May 2008 (UTC)

Multi threading[edit]

Without being any expert on splay trees it seems like splay trees makes looking up elements into a changing operation which cannot be done from multiple threads at the same time. This is different from most other binary tree and it makes sense to mention it as a performance characteristic. —Preceding unsigned comment added by 38.99.50.50 (talk) 20:17, 20 April 2010 (UTC)

You are absolutely right. I mentioned this in the disadvantages. However, I think concurrent find operations can be made to work without much trouble. First of all, the splay operation need to be in a critical section. Secondly, we can't splay at a node x if a thread is accessing the parent or grandparent of x. Thirdly, we don't want two threads performing splay operations to be at nodes too close to each other. I think everything can be managed in a fairly straightforward way, but the fact that we need to manage it at all is certainly a drawback, and a constant-factor time increase is inevitable. --Jamie King (talk) 16:58, 8 June 2010 (UTC)
Modifying during simple get/find operations would lead to cache trashing when cache lines are owned by another CPUs. That would not be a constant time, though. Usually modifying shared state is the worst scalability bottleneck. While releasing the mutex there will be extra write buffers to be flushed as well, the performance would vary depending if the buffers are can settled during cache misses. Bestsss (talk) 00:52, 5 December 2011 (UTC)

Persistence[edit]

It's mentioned that splay trees can be made into a persistent data structure. Intuitively, I don't see how this is true. Certainly you could make insert and delete operations return new trees, but because accesses in a splay tree edit the tree, is everytime you *access* the tree going to return the desired element *and a new tree?* That seems prohibitively expensive, even for a persistent structure. —Preceding unsigned comment added by 192.160.124.68 (talk) 20:46, 12 May 2010 (UTC)

External links[edit]

There are too many external links in this article. From WP:EL: "Links in the "External links" section should be kept to a minimum," "Long lists of links are not acceptable." Many of the sites linked to are also of unknown reliability. For example [1]. This seems completely unreliable: [2]. Offliner (talk) 00:46, 8 April 2009 (UTC)

I agree. I removed all but three of the links. --Jamie King (talk) 17:19, 8 June 2010 (UTC)

Deleted deletion[edit]

I've deleted the alleged C code for deletion because a) it was not C and b) it was badly broken.

If we want to give an example implementation, I propose Sleator's original code, released into the public domain [3]:

Tree * delete(int i, Tree * t) {
/* Deletes i from the tree if it's there.			   */
/* Return a pointer to the resulting tree.			   */
	Tree * x;
	if (t==NULL) return NULL;
	t = splay(i,t);
	if (i == t->item) {			   /* found it */
		if (t->left == NULL) {
			x = t->right;
		} else {
			x = splay(i, t->left);
			x->right = t->right;
		}
		size--;
		free(t);
		return x;
	}
	return t;						 /* It wasn't there */
}

or, cleaned up a bit:

/* Deletes i from the tree if it's there.   */
/* Return a pointer to the resulting tree.  */
Tree * delete(int i, Tree * t) 
{
   Tree * x;
   if (t==NULL)
   {
      return NULL;
   } 
   t = splay(i,t);
   if (i == t->item)
   {	 /* found it */
      if (t->left == NULL)
      {
 	 x = t->right;
      }
      else
      {
         x = splay(i, t->left);
	 x->right = t->right;
      }
      size--;
      free(t);
      return x;
   }
   return t;	 /* It wasn't there */
}

Likewise, the original code for splay would be preferable because it needs not "parent" link, unlike the version we present now. --Stephan Schulz (talk) 13:28, 20 April 2011 (UTC)

Recursion in the implementation of splay method[edit]

The recursive call to the splay method inserts the same arguments "root" and "x" again. This means that the method need not be done recursively, as it becomes less efficient.

Using recursion in all cases does not really clarify the point of splaying and I was wondering whether it be best that it is turned into a non-recursive method. —Preceding unsigned comment added by 129.132.45.239 (talk) 15:10, 12 May 2011 (UTC)

What do you mean "inserts"? Recursion is a natural approach for tree operations like this. The function calls itself until x==root (after being changed by left and right rotations). I think you'll find an iterative implementation is harder to follow and only marginally more efficient. Maghnus (talk) 00:09, 13 May 2011 (UTC)

C code, splay problem[edit]

Watch out since splay function modifies only copy of root. So if you start with some root don't expect it to change when running splay(x, root)! — Preceding unsigned comment added by 89.67.175.78 (talk) 22:33, 10 January 2013 (UTC)

C++ implementation[edit]

There are several problems:

  1. Memory leaks - operator 'new' is used without single 'delete'.
  2. Splaying should always be performed for complexity statements to be true.
That means while finding an element, last visited node should be splayed regardless whether element was found or not.
This is also true for deletion, min_element and max_element.