Talk:Fractal tree index
This article is rated C-class on Wikipedia's content assessment scale. It is of interest to the following WikiProjects: | ||||||||||||||||||||||||||||
|
Notability and COI
[edit]I've added notability and COI tags to the article. I've been unable to find articles on this topic independent of the group that invented the data structure. Without independent, in-depth reliable sources per WP:RS, this article may fail to satisfy notability guidelines per WP:GNG. The COI tag comes from the article creator also being one of the inventors of the algorithm and also an author on two of the cited papers: "Cache-Oblivoius streaming B-trees" and "The TokuFS Streaming File System". Per WP:COI, this can lead to problems with the neutrality of the article. Editors knowledgable about the topic should review the content for neutrality--although without independent reliable sources to rely on, this may be difficult. Thanks, --Mark viking (talk) 18:24, 23 April 2014 (UTC)
I've removed the notability tag. Database indexing is an important topic, B-Trees and their variants are very widely used despite their limitations. A new technique that overcomes them is notable.
I too would like to see more independent sources. Paul Foxworthy (talk) 05:55, 1 April 2016 (UTC)
- @Paul Foxworthy: That's not how notability works, please see WP:GNG for the criteria. I am restoring the notability tag. -- intgr [talk] 08:29, 1 April 2016 (UTC)
- @Intgr:I respectfully leave your judgement there. However, the GNG guidelines say "Notability is a property of a subject and not of a Wikipedia article". I agree with that sentiment.Paul Foxworthy (talk)
- @Paul Foxworthy: I don't see how that quote applies here. My point is, notability is established by the existence of independent reliable sources about the subject, not hand-wavy claims like "A new technique that overcomes them is notable". A common and useful way to demonstrate notability is adding citations to such sources to the article, but you could also simply list them on the talk page. -- intgr [talk] 07:20, 4 April 2016 (UTC)
Terse line maybe?
[edit]The article says "There are several choices for how the buffers are flushed, all leading to similar I/O complexity."
It seems you are hinting at whether all buffers in a node are flushed or only the fullest buffer is flushed, and comparing/contrasting the costs resulting due to one of the 2 decisions. Is that so? If that is the case, then it is totally not clear that it is what is meant. Some more elaboration would be super helpful! — Preceding unsigned comment added by Dhruvbird (talk • contribs) 02:06, 28 April 2014 (UTC)
There are a bunch of choices for how and when to flush: as you point out, you can flush only the fullest buffer or all. But you can also flush-on-full only or also flush-on-query. None of these change the asymptotic analysis. It's all an engineering choice, based on typical workloads. Does that help? Farach (talk) 20:30, 28 April 2014 (UTC)
Thanks! That makes sense. However, I would like to see some proof or a sketch (maybe hand-wavey) of why it doesn't affect the asymptotics and whether the constants involved are different for either case. Dhruvbird (talk) 21:03, 30 April 2014 (UTC)
This looks like a cache-oblivious data structure/(crystalized) algorithm
[edit]However, there are no links to https://en.wikipedia.org/wiki/Cache-oblivious_algorithm — Preceding unsigned comment added by Dhruvbird (talk • contribs) 15:42, 28 April 2014 (UTC)
Although the COLA is cache oblivious, the current version of the FTI is not. It is based on the B-epsilon tree, which explicitly codes for a block transfer size. Farach (talk) 20:28, 28 April 2014 (UTC)
Ah! Yes, I forget - thanks for the clarification. Dhruvbird (talk) 21:03, 30 April 2014 (UTC)
Refs. for why the current version was chosen over the COLA?
[edit]You mentioned in one of the previous comments that the FT was earlier implemented using a COLA, but subsequent implementations have since been based on the B-epsilon tree. Are there any references to why this decision was made? What were the tradeoffs and why the current choice is a better one? — Preceding unsigned comment added by Dhruvbird (talk • contribs) 04:10, 1 May 2014 (UTC)