Talk:Graph partition

From Wikipedia, the free encyclopedia
Jump to: navigation, search
WikiProject Mathematics (Rated C-class, Low-priority)
WikiProject Mathematics
This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of Mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
Mathematics rating:
C Class
Low Priority
 Field: Discrete mathematics
WikiProject Computer science (Rated C-class, Mid-importance)
WikiProject icon This article is within the scope of WikiProject Computer science, a collaborative effort to improve the coverage of Computer science related articles on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
C-Class article C  This article has been rated as C-Class on the project's quality scale.
 Mid  This article has been rated as Mid-importance on the project's importance scale.
 

NP-complete?[edit]

For a problem to be NP-complete, it must be a decision problem; the current written formulation of the graph partition problem is not.— Preceding unsigned comment added by 190.163.7.170 (talkcontribs) 20:30, 24 July 2009‎

"Graph partitioning is known to be NP-complete, but can be solved in polynomial time for |Vi| = 2 by matching (see Michael R. Garey and David S. Johnson's Computers and Intractability ; A Guide to the Theory of NP-Completeness, page 209)." This statement seems incorrect as computing bisection width of an arbitrary graph is also NP-Hard. PraveenYalagandula (talk) 22:51, 12 May 2011 (UTC)

Open source packages and related links[edit]

I added open source graph partitioning packages (of tools out there) to this site but they are continously removed by a user called Ronz ( dont know how to link him here ). Moreover, I added a survey on graph partitioning and a link to my dissertation. All of these contributions are relevant to the reader of this site. Actually I am not willing to contribute anything to wikipedia if relevant material on the topic from experts in the field are removed over and over again. Moreover, I think this is a general problem (see http://slashdot.org/story/13/10/23/1643228/wikipedias-participation-problem). If conflicts of interest should really be the problem here, then the same holds for the graph partitioning book at the cite (added by a user cebichot, bichot is the editor of this book) and the dissertation of emil feldmann). I hope we can discuss this here. — Preceding unsigned comment added by Christian.schulz (talkcontribs) 18:44, 15 November 2013 (UTC)

All the editing appears to be promotional in nature. While we encourage new editors, especially with such expertise, Wikipedia is not a venue for promotion. See WP:SOAP, WP:NPOV, and WP:COI. --Ronz (talk) 18:58, 15 November 2013 (UTC)


The open source section is not promotional. It is a list of tools available out there which is alphabetically ordered. Do you want the list to be more complete or why are you saying it is promotional? Christian.schulz (talk) 20:44, 15 November 2013 (UTC)

I'm saying it's promotional. These are common enough problems with Wikipedia that they are covered by numerous policies and guidelines. Again, see WP:SOAP, which is a section of WP:NOT. Also in WP:NOT is WP:NOTDIRECTORY which applies to the list. Links included in lists are inappropriate per WP:EL. Lists in general are covered in WP:LIST. --Ronz (talk) 21:12, 15 November 2013 (UTC)
Thanks for the fast reply. Frankly, this comment is actually not very helpful. It would be helpful to tell me how to write it such that you think it isnt promotional or give me a link to guidelines. Open source packages are important to the topic and the reader (most of them implement the multilevel algorithms described within the article) and hence improve the value of the article. The same holds for the dissertation of mine. Admittedly, I am the one adding it to the bibliography -- still it is relevant to the topic and the reader, or do you see this differently? If someone else adds it there, its considered spam. So whats actually eligible to be posted here? Moreover, a third option by an administrator would be helpful. Christian.schulz (talk) 21:32, 15 November 2013 (UTC)
In general, open source projects are not encyclopedic.
WP:DR can direct you in to how to work out disputes, including getting others' opinions. Since it's only the two of us (I'm assuming the ip's are all you), then WP:THIRD would be a good next step. --Ronz (talk) 22:34, 15 November 2013 (UTC)
I will do that. And by the way you have not answered a single question of mine. The graph partitioning packages have been in here for four months and you removed my changes after I undid the removed of the overview paper I added -- which can be seen as a response of defiance from your side. Also I think that you missmatch beneficial and promotional. I am not saying that KaHIP is the only software package out there and that it should be used. I am simply listing those. Do you have a source for "open source projects are not encyclopedic"? I would be willing to extend the open source section but why should I bother if changes will be directly removed again?Christian.schulz (talk) 11:06, 16 November 2013 (UTC)
In my opinion, backed by WP:COI, you shouldn't be editing this article until you have a better understanding of the the relevant policies/guidelines, and then do so with great care.
Let's try WP:THIRD and work from there. --Ronz (talk) 16:16, 16 November 2013 (UTC)
You are definitly not used to answer questions, are you? Now I can see it clearly, it is recentivism, spam, conflict of interest and just "not encyclopedic". Its funny how three lines can be in conflict with so many rules on wikipedia. Anyway, I will proceed as suggested by Nczempin. Christian.schulz (talk) 18:32, 16 November 2013 (UTC)
Searchtool-80%.png Response to third opinion request:
I am a somewhat experienced WP editor, and I have no particular background in the area that this article covers. Of course that'll be the background of the vast majority of people looking at this from WP:THIRD; so the process is biased against Christian.schulz in that regard. I hope he can still treat my feedback not as anything against himself (WP:AGF). WP:COI is a tricky subject, especially in the context of experts on a subject. When experienced WP editors revert such contributions, sometimes with harsh-sounding comments such as rv linkspam, in many cases the less experienced editors (but experts in the field) can get offended or confused, because . Please note that any reverts in a case such as this are not a claim that the added information or links are incorrect. For situations like this WP:COI (BTW I strongly suggest you follow links to WP guidelines such as this one and the ones Ronz provided, to make your experience with WP a more pleasurable one in the future) provides a process which can help experts include their valuable knowledge in WP articles while avoiding being reverted. Namely, put edit requests on the article's discussion page and let the more experienced editors help with which bits would be considered to be problematic (which can then be discussed) and which ones are fine. This is my suggestion as an outside observer: On this particular article, Christian.schulz should make edit requests on this Talk page (and others for which there might be a WP:COI for any issues that aren't immediately obviously fine, such as spelling corrections. If the dissertation is considered notable enough, it is likely that it will be added by a third party, so I strongly advise against the author adding it himself. I also strongly suggest that Christian.schulz familiarize himself with some of the guidelines and structures, in particular those of WP:COI. It may even be a good idea to work on some articles outside his immediate field of expertise; one guideline on Wikipedia is WP:The world will not end tomorrow. Please note that this is the first time I'm providing a WP:THIRD, so I hope I haven't botched up any formatting or broken any rules -- Nczempin (talk) 17:20, 16 November 2013 (UTC)
Thanks for the reply. Although not all issues are adressed it is helpful (for example if the same rules apply, then the modifications by bichot would also be recentivism and WP:COI -- yet they are not removed). I will try to add content to the discussion page and we will see if someone else will integrate it into the main article. Christian.schulz (talk) 18:32, 16 November 2013 (UTC)
It may very well be the case that the other edits are recentism and WP:COI, and if so, the same guidelines apply and the issues should be fixed. That they haven't otherwise been brought up should not be used as justification to include more of the same (I'm not saying that you are). BTW I haven't looked deeply into the issue, but your WP:COI may not be a "full" COI (which would for example be the situation in which an editor were to edit an article about himself). It's possible that the COI is limited to your own paper being linked by yourself, and possibly (as I said, I haven't researched the details of your situation) that you're involved in FOSS projects that you linked. So if that is indeed the case, I would consider the behavioural guidelines to apply only for those limited areas. Use your best judgement (which will increase as you get more familiar with the ideosyncrasies of editing on WP; when in doubt, it's better to err on the "careful" side. Regarding the edit requests you're planning: it's not so much about "we will see if someone else will integrate" them than it is to put you on the safe side of any possible COI accusations. If you make an edit request and there are no objections after some reasonable time has passed (let's say a week; this depends a little on how frequently the particular discussion page is visited by other editors. Use your best judgement) it should be fine to add the information to the article yourself. With respect to adding a paper you wrote yourself as a citation, it would be best to let it "bubble up" through a natural process (by that I mean don't even make an edit request for it; if someone else looks for references and finds yours to be pertinent, that's fine, but adding it yourself, even suggesting it, will "ring very loudly" the COI and promotion "bells" of many seasoned WP editors). Anyway, I hope that your initial experience as a WP editor has not discouraged you, we try not to WP:BITE the newcomers, but I'm sure some of them don't feel encouraged when constantly being bombarded by WP guidelines and don't know whether their next edit has them stepping into another minefield. So my advice: Don't take reverts (or sometimes harsh-sounding discussion comments) personally, WP:BEBOLD (within the limits of what you've learned about WP:COI), it is a normal part of the WP:BRD process (sorry to send more guideline-bombs your way). --Nczempin (talk) 23:26, 16 November 2013 (UTC)
Thanks! In fact the dissertation of Emil and the graph partitioning book by Bichot should stay on the page since they are important contributions to the field. Christian.schulz (talk) 12:11, 18 November 2013 (UTC)

Software Tools Section[edit]

I would like to add the following section to the article to list available software tools to tackle the graph partitioning problem (the final edit will contain the citations to the papers):

There are a number of software packages that implement the described algorithms. One of the first publicly available software packages called Chaco is due to Hendrickson and Leland. As most of the publicly available software packages, Chaco implements the multilevel approach outlined above and basic local search algorithms. Moreover, they implement spectral partitioning techniques. Probably the fastest and best known system is the Metis family by Karypis and Kumar. kMetis is focused on partitioning speed and hMetis, which is a hypergraph partitioner, aims at partition quality. PaToH is also a widely used hypergraph partitioner that produces high quality partitions. ParMetis is a widely used parallel implementation of the Metis graph partitioning algorithm. Scotch is graph partitioning framework by Pellegrini. It uses recursive multilevel bisection and includes sequential as well as parallel partitioning techniques. Jostle is a well-known sequential and parallel graph partitioning solver developed by Chris Walshaw. The commercialised version of this partitioner is known as NetWorks. It has been able to hold most of the records in the Walshaw Benchmark for a long period of time. If a model of the communication network is available, then Jostle and Scotch are able to take this model into account for the partitioning process. Party implements the Bubble/shape-optimized framework and the Helpful Sets algorithm. The software packages DibaP and its MPI-parallel variant PDibaP by Meyerhenke implement the Bubble framework using diffusion; DibaP also uses AMG-based techniques for coarsening and solving linear systems arising in the diffusive approach. Sanders and Schulz released a graph partitioning package KaHIP (Karlsruhe High Quality Partitioning) focusing on solution quality. It implements for example flow-based methods, more-localized local searches and several parallel and sequential meta-heuristics.

To address the load balancing problem in parallel applications, distributed versions of the established sequential partitioners Metis, Jostle and Scotch have been developed. The tools Parkway by Trifunovic and Knottenbelt as well as Zoltan by Devine et al. focus on hypergraph partitioning. Recent results of the 10th DIMACS Implementation Challenge suggest that scaling current hypergraph partitioners to very large systems is even more challenging than graph partitioners.

Christian.schulz (talk) 09:57, 18 November 2013 (UTC)

Alternatively, we could add the following table similar to the table used in the Linear Programming article:

Free open-source licenses:

Name License Brief info
Chaco GPL first publicly available software package implementing spectral techniques and the multilevel approach
DiBaP * graph partitioning based on multilevel techniques, algebraic multigrid as well as graph based diffusion
Jostle * multilevel partitioning techniques and diffusive load-balancing, sequential and parallel
KaHIP GPL several parallel and sequential meta-heuristics, guarantees the balance constraint
kMetis Apache 2.0 fastest graph partitioning package based on multilevel techniques and k-way local search
Mondriaan LGPL matrix partitioner to partition rectangular sparse matrices
PaToH BSD multilevel hypergraph partitioning
Parkway * parallel multilevel hypergraph partitioning
Scotch CeCILL-C implements multilevel recursive bisection as well as diffusion techniques, sequential and parallel
Zoltan BSD hypergraph partitioning

Christian.schulz (talk) 10:57, 22 November 2013 (UTC)

This could work, if done exactly like the list in Linear Programming. Note that all the entries have their own articles, and there are no external links to official sites. This is a fairly standard way to create a list of noteworthy entries to follow the relevant policies and guidelines. For this to work, each entry has to be notable on it's own or the entries' articles won't remain. --Ronz (talk) 18:20, 22 November 2013 (UTC)
According to WP:LIST each entry does not need to have their own article. Can you point me to the guideline that states this? Christian.schulz (talk) 12:19, 24 November 2013 (UTC)
Correct.
As I said, the list in Linear Programming is fairly standard. Another option would be to find sources for each entry that meet WP:N, without actually creating individual articles for those entries based upon those sources. --Ronz (talk) 17:56, 24 November 2013 (UTC)
Again, in WP:LIST it is not said that each entry has to meet WP:N and also fairly standard does not mean that each list has to look this way (otherwise it would be stated in the guideline). Can you give an example source for one of the graph partitioners above? Are for example citations already enough? I find it hard to think about other sources for the graph partitioning packages. Thanks. Christian.schulz (talk) 12:44, 25 November 2013 (UTC)
I'm offering options to try that I know have wide consensus across Wikipedia and have worked in many other articles. If no such sources exist, then we're probably not going to find a workable solution.
"Are for example citations already enough?" I'm not clear what you mean or what examples you are referring to. --Ronz (talk) 17:18, 25 November 2013 (UTC)
The question is, wether the usage of the system of an independent party, and thus a citation of one of the papers corresponding to the respective graph partitioning framework, is enough. It is very unlikely that we will find newspaper articles on the subject. To be more precise: what kind of sources would you except for these graph partitioning packages? Another example from Linear Programming, what is the source in SCIP_(optimization_software)?. Christian.schulz (talk) 19:05, 25 November 2013 (UTC)
No, I don't think that sources that simply show usage is enough.
I've clearly stated that sources meeting WP:N criteria will work. --Ronz (talk) 20:16, 25 November 2013 (UTC)
If you read carefully through WP:N you will find that "Notability guidelines do not limit content within an article" and more precisely "The notability guidelines do not apply [...] to list content". Hence, I assume that I can safely include the list above (and potentially also the written text above the list) since in the language of Wikipedia:DUE the list "fairly represents all significant viewpoints". Christian.schulz (talk) 22:28, 25 November 2013 (UTC)
Oh and by the way according to WP:IRS "Material such as an [...] research paper that has been vetted by the scholarly community is regarded as reliable. If the material has been published in reputable peer-reviewed sources or by well-regarded academic presses, generally it has been vetted by one or more other scholars.". And: "When available, academic and peer-reviewed publications, scholarly monographs, and textbooks are usually the most reliable sources.". So in my eyes peer reviewed publications are more than enough as sources (not that we need them in the list). However, I will try to comply with your request and add the citations to each of the packages. Christian.schulz (talk) 23:07, 25 November 2013 (UTC)
I'm not saying that using WP:N-criteria sources is the only solution, only that it is a very good solution that is an alternative to having each entry have it's own article.
It's not reliability that I'm concerned about. It's WP:DUE enough to mention while not being WP:SOAP. --Ronz (talk) 01:54, 26 November 2013 (UTC)
Yes. We are now at the point that I will paste the table and the text into the main article including the sources (aka peer-reviewed publications). Thanks for your help. Christian.schulz (talk) 08:35, 26 November 2013 (UTC)

New "Software Tools" section[edit]

Dispute about the software tools section in this article. Christian.schulz (talk) 22:45, 2 December 2013 (UTC)

[1] Looks like nothing but primary sources. Am I overlooking some source that actually demonstrates that they are are worth mention - a non-primary source that is independent of the subject matter? If there's none, then the entire section is at risk of being removed. --Ronz (talk) 17:59, 2 December 2013 (UTC)

We discussed this already (see discussion above). Please tell me exactly what kind of sources you are looking for, so that I can add those. In the best case you can give an example for one of the packages. Reviewed articles are the best possible sources that we can get. Also please have a careful look at primary sources "Secondary or tertiary sources are needed to establish the _topic's notability_ and to avoid novel interpretations of primary sources.". So secondary sources are only needed when talking about a topic/page as a whole (in our case graph partitioning). Christian.schulz (talk) 18:04, 2 December 2013 (UTC)
Then I think the section should be removed. --Ronz (talk) 19:45, 2 December 2013 (UTC)
I repeat my question. Please tell me exactly what kind of sources you are looking for in this context (research), so that I can add those. Christian.schulz (talk) 19:48, 2 December 2013 (UTC)
And before you remove anything, please let an admin look at the issue. Thanks. Christian.schulz (talk) 19:58, 2 December 2013 (UTC)
I don't see any need for an admin. However, I think it would be best to get some other editors involved at this point. An WP:RfC would be a good next step. --Ronz (talk) 21:15, 2 December 2013 (UTC)
Can you please answer the question. Please tell me exactly what kind of sources you are looking for in this context (research), so that I can add those. An example in a different research context would be enough. Christian.schulz (talk) 22:22, 2 December 2013 (UTC)
We need a published survey paper that reviews and compares multiple packages for graph partitions. Right now we just have an indiscriminate list "X package implements Y algorithm" that appears to be a synthesis of primary sources. This section should instead be based on secondary sources (that is, surveys by others rather than papers by the authors of the packages themselves) and it should say something about the relative strengths and weaknesses of the packages compared to each other. Especially, claims like "possibly the fastest and best known system" should not be included without proper sourcing, independent of the package creators. It would also be a good idea to have some standard for what is to be mentioned here — inclusion this sort of multiple-package survey would be a reasonable choice, or alternatively the existence of a separate article on the package clearly demonstrating its notability. —David Eppstein (talk) 00:11, 3 December 2013 (UTC)
I agree that such a source would work. --Ronz (talk) 00:47, 3 December 2013 (UTC)
Something like this is not available. But neither is it available in the linear programming context. So why apply it here, but not in the linear programming context? I can easily remove the sentence on metis or even remove the whole text if necessary. There are plenty of papers citing the original works and in some cases also news articles. Moreover, WP:SYN is the wrong thing to cite here since I do not combine material from multiple sources to reach or imply a conclusion not explicitly stated by any of the sources. Christian.schulz (talk) 04:04, 3 December 2013 (UTC)
Your argument that we shouldn't needs such a citation here because some other article fails to have one is not convincing: see WP:WAX. Probably that other article needs improvement. —David Eppstein (talk) 04:12, 3 December 2013 (UTC)
Thank you for the quick reply. I agree. On the other hand, many people use these tools so that they are valuable to the reader. There are plenty of citations to each of the primary sources. How about, dropping the whole text and just leaving the table in this section. Christian.schulz (talk) 05:21, 3 December 2013 (UTC)
How about dropping everything including the table until it can be properly sourced. WP:ITSUSEFUL is also not a convincing argument. —David Eppstein (talk) 05:46, 3 December 2013 (UTC)
I am sorry, I am really trying my best here and also thanks for pointing that out, I am new to wikipedia so it is very likely that I don't know every guideline -- but I am willing to learn. From my point of view, it cannot be that an overview article comparing all tools is the only possible way to have the table in here. Can you give an example on wikipedia where this is the case or even name a guideline stating that we need an overview paper comparing the tools? Why are citations to the papers/primary sources not enough? Christian.schulz (talk) 06:05, 3 December 2013 (UTC)
At heart, it's about ensuring that we provide neutral coverage of our material, by not relying on sources prone to bias. WP:RS, WP:PRIMARY, and WP:3PARTY have more. Especially see the part in WP:PRIMARY stating "Articles may make an analytic or evaluative claim only if that has been published by a reliable secondary source." —David Eppstein (talk) 06:43, 3 December 2013 (UTC)
Wikipedia:PRIMARY states the following: "[...] primary sources that have been reliably published may be used in Wikipedia; but only with care, because it is easy to misuse them.[4] Any interpretation of primary source material requires a reliable secondary source for that interpretation. A primary source may only be used on Wikipedia to make straightforward, descriptive statements of facts that can be verified by any educated person with access to the source but without further, specialized knowledge.". So if only the table stays in this section then everything is fine. There is no interpretation or whatsoever in the table ( I removed the word fastest ) . Or am I misinterpreting something? Christian.schulz (talk) 06:51, 3 December 2013 (UTC)
  • Comment: The foregoing two paragraphs strike me as summarising the essentials of this problem. When one has nothing but, or nothing better than, primary sources to go on, then primary sources may justifiably be referred to until satisfactory secondary sources become available. This not only is compatible with Wikipedia stated principles, but with plain common sense. Conversely however, licence to use primary sources judiciously in such circumstances does not imply licence to insert POV remarks or even synthesis. The bottom line then is that such a list is perfectly permissible, but that remarks on relative merits of items are not, and any such should be edited out, while the list remains in place. JonRichfield (talk) 09:22, 3 December 2013 (UTC)
  • Comment JonRichfield nicely summarizes my point of view as well. I also think that there may be secondary sources for this. For instance, a minute's search yielded the graph partition entry from the Algorithm Design Manual, which has a list of recommended packages and Skiena's ratings. There may be more in the book itself. --Mark viking (talk) 21:23, 19 December 2013 (UTC)
  • Comment after random call for RFC: I hope y'all are not conflating the terms "primary source" and "journal article". Please keep in mind that many nontrivial research articles start with a brief survey of the "state of the art"; i.e., we don't need a dedicated survey/comparison article for a reasonably NPOV opinion. That said, any claims about "fastest/bested/mostest" must not only be referenced by source independent from the author, but attributed to an authority in the domain (not just some M.S. work). So far I tagged a handful of claims in the article which must be clarified in this respect. Staszek Lem (talk) 01:43, 31 December 2013 (UTC)

In my opinion, this article requires a lot of work. First, it does not contain any of the work done in the community structure field, which is essentially the same as (or, if you wish, parallel to) the graph partition field. Second, the packages described are far from being the most used programs to generate partitions in graphs. For example, modularity programs are I think very often (probably most often) used. I have seen above the discussion on self-promotion, and I think there is a bit of that as the article now stands, being very one sided. — Preceding unsigned comment added by Zombieanalytics (talkcontribs) 07:57, 12 June 2014 (UTC)