Talk:Directed acyclic word graph

From Wikipedia, the free encyclopedia
Jump to: navigation, search
WikiProject Computing  
WikiProject icon This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
 ???  This article has not yet received a rating on the project's quality scale.
 ???  This article has not yet received a rating on the project's importance scale.
 

Disputed external link[edit]

There appears to be a dispute over this external link:

http://www.pathcom.com/~vadco/dawg.html - The authoritative source for information on lexicon data-structure optimization, by JohnPaul Adamovsky

Rather than edit warring over it, please discuss here reasons to include or exclude it. I would say the default position is that it should be left out, pending reasons to keeping it in. Deltahedron (talk) 07:21, 2 February 2013 (UTC)

This is JohnPaul Adamovsky. I have conducted several years of research in the realm of lexicon data-structure optimization. The link I wish to add to this article opens up a detailed understanding of how to write computer code in C to actually create a compressed suffix graph using basic integer arrays. I have published thorough documentation about the objectives of using DAWG's, the creation process, the data storage options, and the benefits / drawbacks of using this type of data-structure. For example, many readers seem to think that there may exist an easy way to make the DAWG into a dynamic structure which can have words added and removed on the fly. As of now, DAWG's are constructed once, and then remain static; modification of any kind, especially with integer array encoding would demand a complete reconstruction of the entire graph. Further, I have meticulously documented the code and framework which uses a novel hash function encoding of DAWG to solve the popular 5x5 Boggle Board optimization NP-Complete problem. There is no other way to analyze 70,000 5x5 Boggle boards per second using a PC. In the year 2007, I was searching the internet for the kind of documentation which would show me how to build a DAWG. I did not find this documentation in the academic articles available, so I conducted years of research, wrote the documentation myself, and made it publicly available to anybody with a computer and knowledge of Google and Wikipedia. If you are smart enough, and you want it bad enough, then you can use my documentation to construct your very own DAWG. Perhaps an analogy will assist David Eppstein in understanding why the link to my DAWG web page must be included in the Wikipedia articles: Enzo Ferrari was the kind of man who could tell you about the exotic high-performance supercars which he designed, built, and sold to his millionaire automotive enthusiast customers, whereas I am the kind of man who invites you to take a tour on my Batcave, where I will show you how to build a bad-ass ride, and within a reasonable amount of time you'll be driving your very own Batmobile. I have deep connection with high-performance data-structures, and I wish to share this enthusiasm with people outside of career academia, who do not have three years of their lives to dedicate to researching this field on their own. I invite you to set aside the elitist academic snobbery, read my work, and present an informed case as to why links to my work do not belong on Wikipedia. Please do not refer to "default position"... I associate the default position of computer people to be playing video games, punctuated by jacking off to internet porn. The right thing to do is to do your research and figure out that the links I posted BELONG in the Wikipedia articles. I apologize for the "edit war", as this is the first time I have used "talk" pages, so it will not happen again. I shall restore the links until you have taken enough time to come to an informed conclusion about their legitimacy, then we shall resume this discussion. I am of the opinion that plagiarism is more likely to happen at UC Irvine than it is at Stanford, so a professor who is getting sick of seeing my code submitted over and over again in his graph theory course should work a little bit harder, and then apply for a job at a better college. Burying information is an illegitimate solution for plagiarism NoblerSavager (talk) 02:46, 3 February 2013 (UTC)

It certainly doesn't belong in the present article; this article isn't about the lexicon data structure, but about a different suffix tree-like data structure for the suffixes of a single string, with some sort of DAWG-like compression. The correct place for those links, if they belong anywhere, is Deterministic acyclic finite state automaton. And since you left the same long comment on Talk:Deterministic acyclic finite state automaton, I think we should just discuss it there, so we don't have two separated discussions about the same thing. As for a related issue, of whether the lexicon data structure should be here or at Deterministic acyclic finite state automaton, my preference would be here: I've only ever heard these things called DAWGs and not DAFSAs. But then, where would the current contents of this article go instead? —David Eppstein (talk) 03:22, 3 February 2013 (UTC)

My family calls me "John", and maybe you've heard of "John von Neumann", whose family probably called him "John". As far as I am concerned, it is perfectly acceptable for two different things to have the same name. A trie which has been compressed into a Boolean graph with shared suffix structures is what I call a DAWG. I should not have to change my name because some long-since-dead person had that name first, and neither should a powerful data-structure. I say, put the DAWG article back here where people are going to look for it. My reason is because this page will remain a stub, and DAFSA is not as catchy an acronym. NoblerSavager (talk) 04:29, 3 February 2013 (UTC)

It's too bad Wikipedia frowns on neologisms, because I think that the things described here currently should be called "directed acyclic suffix graphs", but that phrase doesn't actually seem to be in use by anybody. —David Eppstein (talk) 04:42, 3 February 2013 (UTC)