Talk:DBpedia
This article was nominated for deletion on 15 November 2009. The result of the discussion was Withdrawn.
Computing: Start‑class, Low‑importance
Wikipedia: Start‑class, Low‑importance
Members of the DBpedia Team
To clear the WP:COI tag, here is a list of people who work directly on the DBpedia project. We are scientists and value a neutral and scientific point of view, but we will not do any more editing on the main article.
- Judging from henrik's notes it seems acceptable if we fix small errors and update things such as release numbers and item / link counts.
- Soeren1611 (talk)
- ChrisBizer (talk)
- Chrisahn (talk)
- SebastianHellmann (talk)
- Jens_Lehmann (talk)
- KingsleyIdehen (talk)
- Beckr (talk)
- Echera (talk)
- maybe more, please add
We currently have a Request for comment going on here: Wikipedia:Requests_for_comment/infobox_template_coherence —Preceding unsigned comment added by SebastianHellmann (talk • contribs) 09:48, 17 November 2009 (UTC)
public sparql endpoint
The "public sparql endpoint" link (http://dbpedia.org/sparql) is 404. Espertus (talk) 18:14, 7 June 2008 (UTC)
- Usually works, sometimes it doesn't. Please send such requests to dbpedia-discussion. Chrisahn (talk) 15:33, 22 July 2009 (UTC)
Deleted references
A user has deleted these references from the External Links section with the edit summary "clean up - any relevant to article should be used as references IN the article - most are papers/talks by DBpedia's own founders", and didn't copy them here. I shall do so. I've replaced the other 2 links as obviously relevant. -- Quiddity (talk) 19:11, 7 November 2009 (UTC)
- Christian Bizer et al.: Interlinking Open Data on the Web (Poster). Poster at ESWC 2007.
- Christian Bizer et al.: DBpedia - Querying Wikipedia like a Database. Developers track presentation at WWW2007.
- Christian Becker, Christian Bizer: DBpedia Mobile – A Location-Aware Semantic Web Client. Semantic Web Challenge at ISWC 2008, Karlsruhe, Germany, October 2008.
- Sören Auer, Jens Lehmann: What have Innsbruck and Leipzig in common? Extracting Semantics from Wiki Content. Paper at ESWC 2007
- Sören Auer, Christian Bizer, Georgi Kobilarov, Jens Lehmann, Richard Cyganiak, Zachary Ives: DBpedia: A Nucleus for a Web of Open Data. 6th International Semantic Web Conference (ISWC 2007), Busan, Korea, November 2007.
- Fabian Martin Suchanek, Gjergji Kasneci, Gerhard Weikum: Yago: A Core of Semantic Knowledge - Unifying WordNet and Wikipedia. Paper at WWW2007.
"What have Innsbruck and Leipzig in common?" is ranked 40th among the most cited papers in computer science for 2007, according to [citeseer]. The last one, by Suchanek, is a paper at WWW, which is a very good conference with an acceptance rate of 10-15%. It has therefore been peer reviewed. SebastianHellmann (talk) 14:18, 12 November 2009 (UTC)
Notability
- The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section.
I just skimmed the TED talk and the only mention I see about DBpedia is about 30 seconds from 8:40-9:10. Unless I missed something, I don't see this qualifying as significant coverage in reliable third-party sources.--Crossmr (talk) 01:03, 8 November 2009 (UTC)
- It's the key example in his presentation! Everything before that is preamble/history/overview. It's the first and main example he brings up about a site that uses Linked Data. -- Quiddity (talk) 20:21, 10 November 2009 (UTC)
- I added a few more references - featured in 3sat, used by BBC and New York Times. I could add that DBpedia has been cited in several hundred scientific papers, but that doesn't really belong in the article, does it? Chrisahn (talk) 21:47, 10 November 2009 (UTC)
DBpedia is used as a main example in the RDFa specification: http://www.w3.org/TR/rdfa-syntax/ --Beckr (talk) 12:19, 11 November 2009 (UTC)
In relation to the Linking Open Data effort, Tim Berners-Lee says that "DBpedia is one of the more famous pieces of it" http://talis-podcasts.s3.amazonaws.com/twt20080207_TimBL.html and Ivan Herman, W3C Semantic Web Activity lead, says that "Dbpedia is probably the most important 'hub' in the project." http://www.w3.org/2009/Talks/0830-Nanjing-IH/Talk.odp --Beckr (talk) 12:19, 11 November 2009 (UTC)
- Out of curiosity, how many more DBpedia folks are going to come here and edit this article, after having been asked not to per Wikipedia's WP:COI? -- Collectonian (talk · contribs) 13:55, 11 November 2009 (UTC)
Dear Colectonian, I really don't understand your problems with the page in relation to the WP:COI. We are not advancing outside interests. The whole overall goal of the DBpedia project is to advance the aims of Wikipedia itself. We contribute to the future development of Wikipedia by demonstrating how Wikipedia search could be improved (see for instance Wikipedia facet search demonstrator) and by making Wikipedia a central interlinking hub in the emerging Web of Data. Code that has been written for the DBpedia Mobile demonstrator is also used for the implementation of Wikipedia's new mapping features. This goal of the DBpedia project is well understood by key members of the Wikipedia community and this is why we got really positive feedback on the project at the Wikipedia developers meeting in Berlin and at Wikimania 2009. Therefore I don't understand why DBpedia should not provide basic information about itself inside Wikipedia, just as other projects in the Wikipedia community also do. ChrisBizer —Preceding undated comment added 17:18, 11 November 2009 (UTC).
- Why do you guys continue to not understand that you are an outside interest. You are all directly involved with DBPedia, and continue to try to edit its article in a purely positive manner. Getting positive feedback on the project does not magically mean you do not have a COI on this article. This is not an article for providing your information about yourself, promoting DBpedia, or anything else. It is a normal Wikipedia article about a possibly notable project. It would be no different from Jimmy Wales editing the Wikipedia article about him (and, FYI, because he recognizes it as COI, he does not do so, he follows COI properly). If DBpedia is notable, others will write a proper, neutral article about it, but it is not appropriate for you all to use it as a platform for promotion. 17:42, 11 November 2009 (UTC)
- You are right, we should be very careful when editing this article. To make the article more balanced, I looked for critical statements about DBpedia (hard to find - scientists tend to be nice to each other) and added one to the article. The formatting is not perfect, and it would be better to have a few original sentences instead of a direct quote, but this is what I came up with. Chrisahn (talk) 19:33, 11 November 2009 (UTC)
- No, you shouldn't be editing this article at all. But it seems obvious you will continue to be obtuse about this and just continue editing it anyway. *sigh* -- Collectonian (talk · contribs) 19:36, 11 November 2009 (UTC)
- You are right, we should acquaint ourselves with the guidelines. Here is one citation that we should follow: "Editors with COIs are strongly encouraged to declare their interests, both on their user pages and on the talk page of any article they edit, particularly if those edits may be contested." Also, given "Where advancing outside interests is more important to an editor than advancing the aims of Wikipedia, that editor stands in a conflict of interest", it is kind of difficult to argue that we are not biased. Then we should maybe read these to avoid other pitfalls:
- (please add more if there is something else to read and follow other than policies or guidelines)
- So then we will proceed as follows: we should at least be allowed to collect material on this talk page, which can help editors without a conflict of interest (and maybe also influence them, but this is a talk page, so everybody tries to influence somebody else here) to review it and decide whether it should be on the article page. This would be OK and also good conduct, right? (btw. I already found Wikipedia:Meatpuppet#Meatpuppets ) SebastianHellmann (talk) 13:59, 12 November 2009 (UTC)
Break 1
OK, it seems my removal of the notability warning didn't meet consensus, so let's go back to basics. I'm reading Wikipedia:Notability_(web) (linked from Wikipedia:Notability) on the basis that we can treat DBpedia as 'Web content'. First, from the General notability guidelines (please forgive my formatting here): --DanBri (talk) 10:30, 15 November 2009 (UTC)
- If a topic has received significant coverage in reliable sources that are independent of the subject, it is presumed to satisfy the inclusion criteria for a stand-alone article.
- The Semantic Web in Action article in Scientific American.
- From [[1]],
Scientific American (informally abbreviated to SciAm) is a popular science magazine published since August 28, 1845, which according to the magazine makes it the oldest continuously published magazine in the United States. It brings articles about new and innovative research to the amateur and lay audience. Scientific American had a worldwide monthly circulation of roughly 733,000 as of December 2008, including newsstand sales of over 100,000[2] It is not a refereed scientific journal, such as Nature; rather, it is a forum where scientific theories and discoveries are explained to a broader audience.
- The full text of the article is behind a pay-wall but can be found online by searching for the title and PDF. I'll excerpt briefly from the sidebar 'Consumer Applications' / 'Combining Concepts' here:
Search engines on the World Wide Web cannot provide a single answer to a broad ranging question such as “Which television sitcoms are set in New York City?” But a new Semantic Web engine called pediax can, by analyzing different concepts (top, in approximated form) found on Wikipedia’s seven million online pages. Pediax, which grew from the DBpedia project to extract information from Wikipedia, provides a clean result (bottom) that merges text and images. (article then contains large diagram showing links between wikipedia concepts, and a screenshot of an application).
- Pediax seems offline currently, but also to have won the mashup of the day link at programmableweb, which has a little more information.
- Is 'Scientific American' a reliable source? consider Scientific_American#Controversies; also no specific warning flagged in Wikipedia:Reliable_sources/Noticeboard
- The article goes on to say "DBpedia is an effort to smartly link information within Wikipedia’s seven million articles. This project will allow Web surfers to perform detailed searches of Wikipedia’s content that are impossible today, such as, “Find me all the films nominated for a Best Picture Academy Award before 1990 that ran longer than three hours.” - This confirms or at least echoes the BBC use of the DBpedia work, already linked from the article.
- again from Wikipedia:Notability ""Reliable" means sources need editorial integrity to allow verifiable evaluation of notability, per the reliable source guideline. "
- Can anyone confirm Scientific American as a reliable source for this kind of project? Note that Wikipedia:Reliable_sources_(medicine-related_articles), in the context of medical content, says the following: "On the other hand, the high-quality popular press can be a good source for social, biographical, current-affairs and historical information in a medical article. For example, popular science magazines such as New Scientist and Scientific American are not peer reviewed but sometimes feature articles that explain medical subjects in plain English. As the quality of press coverage of medicine ranges from excellent to irresponsible, use common sense, and see how well the source fits the verifiability policy, and the general reliable sources guideline."
- "Sources, for notability purposes, should be secondary sources, as those provide the most objective evidence of notability. The number and nature of reliable sources needed varies depending on the depth of coverage and quality of the sources. Multiple sources are generally preferred."
- What do we have so far? The publication source looks reasonable; are the authors a problem? The SciAm article authors were Lee Feigenbaum, Ivan Herman, Tonya Hongsermeier, Eric Neumann and Susie Stephens. These are all members of the Semantic Web community around W3C (as I am myself --DanBri (talk) 10:30, 15 November 2009 (UTC), and as are the core DBpedia team). They naturally have an enthusiasm for the technologies presented, and as their work is on these themes, they are likely to gain if Semantic Web (RDF etc.) approaches to information sharing and linking become more popular. However, this doesn't mean that they could freely write an 8-page article on the Semantic Web unless their coverage in that article was of notable works. Does this guarantee that every topic, page and theme in the article was particularly notable? No. But it seems useful evidence.
- "Independent of the subject" excludes works produced by those affiliated with the subject including (but not limited to): self-publicity, advertising, self-published material by the subject, autobiographies, press releases, etc.[4]"
- Just covered; SciAm seems to offer some level of control here, and note that the SciAm article authors are not core DBPedia people, but are others from the wider Semantic Web scene who seem persuaded of its utility.
- Looking now at Wikipedia:Notability_(web) ... "is deemed notable based on meeting any one of the following criteria"
- "The content itself has been the subject of multiple non-trivial published works whose source is independent of the site itself. " -- it is argued above that the various DBPedia publications from the core DBPedia team are themselves quite heavily cited in the recent computer science literature. So can we take that second wave of citations as our non-trivial published works here?
- "The website or content has won a well-known and independent award from either a publication or organization." --- nothing known here
- "The content is distributed via a medium which is both respected and independent of the creators, either through an online newspaper or magazine, an online publisher, or an online broadcaster;"; BBC's usage here should qualify?
- other distribution: the Faviki social bookmarking project claims to be 'powered by DBPedia' (logo and text at page footer). The developer seems (see mail introductions) to have built the system independently of the core DBPedia team. Faviki is reviewed favourably on Mashup.com's Startup Review pages.
- Alexa reports DBPedia to have a traffic rank of 126,532.
- For discussion - Wikipedia:Notability_(organizations_and_companies)#Non-commercial_organizations - can DBPedia be examined more as an organization? While there doesn't seem to be a formal nonprofit organization set up, there is clearly some kind of group. "Organizations are usually notable if they meet both of the following standards: 1. The scope of their activities is national or international in scale. 2. Information about the organization and its activities can be verified by third-party, independent, reliable sources."--- can this be done here, regarding DBPedia as an organization? eg. who is in the organization, what legal status, funding etc does it have, what are its goals etc?
- clumsy bulleted list by DanBri (sorry i tried to get indenting right...), --DanBri (talk) 10:30, 15 November 2009 (UTC)
- SciAm didn't WRITE the article. Its being peer reviewed does not magically remove the fact that the article was written by those directly involved with the project, and not those "independent of the subject". Other wikis and anything involving developers are, again, not reliable. If treated as an organization, it's actually even less notable. -- Collectonian (talk · contribs) 16:30, 15 November 2009 (UTC)
- SciAm did however decide the article was fit for their purpose, which is publishing articles about subjects sufficiently notable to sustain their place in the market. The SciAm authors are not people who built DBpedia, but they are members of the wider Semantic Web community. If you've spent any time in the SemWeb scene you'll know there are plenty of arguments internally (a sweeping generalisation as an example: enthusiasts for the OWL/RIF work and Linked Data work have tended not to be super enthusiastic about the other strands). So it doesn't follow that all people active in the SemWeb scene are gushingly enthusiastic about all SemWeb projects. So far in this conversation we have seen evidence that the BBC (the world's oldest and largest broadcaster), SciAm and the inventor of the World Wide Web have all done things that strongly indicate support for the claim that DBpedia is notable. I myself consider it highly notable, despite having been an early (if gentle) critic of their approach; I thought it was a mistake to move things too far outside the Wikipedia world/brand; I'd rather have seen the whole thing under a *.wikipedia.org domain somehow. Re citations, it has already been pointed out here that the initial publications about DBpedia by those who created it, have been cited many many times elsewhere in the computer science and information science literature. While it may be reasonable to be skeptical about the papers authored by the DBpedia development team, it is unfair and implausible to reject other papers mentioning it on the vague notion that they are from other "Semantic Web" people and therefore partial. The Semantic Web community is all about data sharing and collaboration; it is only natural then for many SemWeb developers to start working with DBpedia data, given the scale and richness of the Wikipedia dataset. Does exploring use of DBpedia in practice make people 'point of view' w.r.t. this article? 
—Preceding unsigned comment added by DanBri (talk • contribs) 17:53, 15 November 2009 (UTC)
How to progress?
Hi, I am independent (I have never written an article mentioning DBpedia), but I read articles about DBpedia and I believe the project is highly relevant for the future of the knowledge-based society and maybe even for Wikipedia itself (to re-use the collected knowledge and make it even more useful). What am I allowed to do to progress the article? MaxVölkel (talk) 16:57, 15 November 2009 (UTC)
- Correct me if I'm wrong, but are you not directly involved with Semantic Web? -- Collectonian (talk · contribs) 17:01, 15 November 2009 (UTC)
- Yes, that is correct - I am doing semantic web research. Am I allowed in any way by Wikipedia's regulations and intentions to progress the article? MaxVölkel (talk) 17:15, 15 November 2009 (UTC)
- You can help the others by providing actual evidence of notability per Wikipedia's guidelines of notability, reliable sources, and verifiability. -- Collectonian (talk · contribs) 17:22, 15 November 2009 (UTC)
- Ok, thanks. This query shows that three independent conferences with three reviewers each accepted papers on DBpedia. DBLP itself is a trustworthy source of conference citations (at least for me. Unsure how to prove that. But also unsure how to prove that NYC is trustworthy.) MaxVölkel (talk) 17:37, 15 November 2009 (UTC)
- Papers written by DBpedia. Those papers being published still does not show notability, it just shows reliability of the source. These are two separate issues. -- Collectonian (talk · contribs) 17:43, 15 November 2009 (UTC)
- Ah, I see. And what about the TED talk about DBpedia? This seems to be both trustworthy and notable. MaxVölkel (talk) 17:45, 15 November 2009 (UTC)
- It might also behove all those involved to read User:Uncle G/On notability. It is a user essay, but considered one that speaks to Wikipedia. Note, in particular, that lacking notability does not mean the project is not important nor that it may not be notable later. -- Collectonian (talk · contribs) 17:25, 15 November 2009 (UTC)
- Thanks again. Let me try to look into the definition: An article's subject is notable if (1) it has been the subject of (2) non-trivial published works by (3) multiple separate sources (4) that are independent of that subject itself. Hmm. (1+2) A scientific article should by definition be non-trivial, otherwise it should not get accepted for publication. (3) Three different conferences with different program committees and different reviewers should count as multiple sources. On (4), the conferences are a different entity from the DBpedia project. MaxVölkel (talk) 17:37, 15 November 2009 (UTC)
- As no sources have been found, it seems the best way to proceed is to have the question answered in an article for deletion. -- Collectonian (talk · contribs) 17:33, 15 November 2009 (UTC)
- I hope I'm allowed to point the interested readership to some books citing DBpedia (O'Reilly - Programming the Semantic Web, Semantic Web for Dummies), see http://books.google.com/books?q=dbpedia&btnG=Search+Books. P.S., WP:3RR —Preceding unsigned comment added by Beckr (talk • contribs) 18:05, 15 November 2009 (UTC)
'article needs references' tag
- The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section.
IMHO there are enough references now. Someone might want to remove that tag. Chrisahn (talk) 18:55, 13 November 2009 (UTC)
- I disagree. The references first need to be checked. Half of the ones added appear to be blogs, which are not RS, and others are not third-party. They were also all added by you, who has a clear COI issue. -- Collectonian (talk · contribs) 21:59, 13 November 2009 (UTC)
- The number of research papers are fine in my opinion. I am not involved in the DBpedia project but I do know about it as part of my research and the list looks fine to me. 118.208.28.176 (talk) 05:50, 15 November 2009 (UTC)
- Just to show an example of the academic credentials... See this link [2] 224 citations is more than most papers could ever dream of getting. 118.208.28.176 (talk) 05:57, 15 November 2009 (UTC)
- Also, I haven't been around here recently, but have ZDNET and NY Times ceased to be reliable sources on WP? Even if it is the blog section of each respective news site, they still have similar editorial oversight, so they wouldn't stay around if the chief editor thought something was wrong. 118.208.28.176 (talk) 06:00, 15 November 2009 (UTC)
- Opinion pieces are not the same as news, and they need to be evaluated to be sure they are being used appropriately. As everything was added by DBpedia project members, outside editors need to actually review the contributions and available material to be sure a neutral view is actually being presented. Right now, the article still reads like a big promotional piece. -- Collectonian (talk · contribs) 06:07, 15 November 2009 (UTC)
- The article may not be neutral right now due mostly to the writing style, but at least the Notability tag could be removed now? I haven't ventured to see what the current Wikipedia Notability rules are recently, but the fact that a paper about the project has 224 citations in 2 years means it fits my personal notability standards. Has Wikipedia really gone so far as to say that scientific articles are not reliable sources for a project because they are written by members of a project? If not then the self published references tag could also go. Maybe someone should scout through the 224 citations on scholar and find references and make up a better structured impact paragraph. I don't have time for the next few days, but it shouldn't be too hard to do.
- Not everyone contributing has actually been on the DBpedia project. The larger RDF community also know about this issue and I doubt their expertise should be rejected just because as experts they might have some level of COI that a completely external party such as yourself wouldn't have. 118.208.28.176 (talk) 07:26, 15 November 2009 (UTC)
- See WP:WEB and WP:N for the notability guidelines. Papers about the project written by those involved with it are neither significant coverage nor demonstrations of notability. It's when those who are NOT involved write about it that something becomes notable. You cannot make your own notability, for Wikipedia purposes. No one has yet shown any significant coverage, nor any coverage at all beyond what DBpedia and its members have made themselves through their own talks, presentations, writings, etc. -- Collectonian (talk · contribs) 08:11, 15 November 2009 (UTC)
- If it means anything... I cited the publication in question and commented on it, although the final camera ready version was only submitted a week ago so it won't be in publication till next year. It seems a little pedantic to force a secondary source or more to have as its primary topic the project. We all have other research topics that won't be directly about DBpedia. Wikipedia is a strange beast! 118.208.28.176 (talk) 08:55, 15 November 2009 (UTC)
- How so? If those involved with something could create notability just by writing some papers or publishing press releases, anything and everything would be notable, and Wikipedia would be no better than a spam machine full of people's personal advertisements. If third parties not involved with DBpedia are not interested in it enough to give it significant news coverage (not just republishing its press releases), then it is not notable. -- Collectonian (talk · contribs) 09:05, 15 November 2009 (UTC)
- Funny how you also label peer-reviewed papers in major scientific journals and conference proceedings as SPAM. There are 4 or 5 papers by DBpedia's project members that have been accepted (e.g. at ESWC, ISWC, JWS, ODBASE), which certifies scientific notability. BTW peer review means that a paper normally gets reviewed by three independent scientists. SebastianHellmann (talk) 15:18, 15 November 2009 (UTC)
- It is funny to hear someone say that a paper which has 224 citations is non-notable for the purposes of the modern day community built encyclopedia, but that might just be the non-Wikipedian view on life. So the content has some biases... you could put some effort into cleaning them up instead of just tagging, reverting and discussing...
- It is more funny to hear that NY Times and BBC apparently just republished press releases when they started pulling out the DBpedia content for their content enhancement projects and put some information up on their blogs. Blogs are just the way things work these days, even if Wikipedia hasn't accepted that yet. Maybe they will get published in the "mainstream" news, maybe they won't, the information is there already in my opinion.
- From WP:WEB "The website or content has won a well-known and independent award from either a publication or organization."... Does it count if they came second... See list of winners at the following link for 2008 [3] What about "Best In-Use-Track Paper" at [4]? I guess that one is still a publication mostly by people inside the project together with the BBC guys, so it might be taboo by Wikipedia standards. If you go through the results of a google search for "dbpedia" you very quickly get to links like [5], that are obviously involved intimately with Dbpedia so should be rejected, but it still has to sway an untrained non-Wikipedian's eye towards this project having some sort of notability. [sarcasm]KDE sucks though right... Gnome all the way... [/sarcasm], so that reference can't make it into Wikipedia.
- I honestly thought Wikipedia was okay with multiple scientifically published peer-reviewed sources, but I guess it is even more pedantic than that these days. 118.208.28.176 (talk) 09:19, 15 November 2009 (UTC)
- A paper written by the members is not evidence of notability, no matter how many citations you put in it or who published it. It's still BY THE MEMBERS. It is a reliable source, but it does NOT show it is notable. It wasn't written by some third-party person. Let's look at those references. New York Times: notes it has mapped its data for use by several sets, with DBpedia being just one. ZDNet: a single mention quoting a DBpedia project member. The BBC reference: a list of web terms. And no, coming in second is not the same, and it is neither a major nor an independent award. Again, you are showing only that the only people talking about DBpedia are those involved with it and Semantic Web in general. If the only way you can show DBpedia's notability is by making sarcastic remarks and snide attacks, that really does not reflect any notability for the project at all, only that there must not be any to demonstrate. -- Collectonian (talk · contribs) 09:35, 15 November 2009 (UTC)
- It's not about citations within an article, but citations of an article. Sure, it's mainly Semantic Web people who currently talk about DBpedia, but it's mainly people interested in anime who talk about Tokyo Mew Mew. I don't understand your line of argument there at all. BarryNorton (talk) 11:50, 15 November 2009 (UTC)
'Review'
Collectonian says As everything was added by DBpedia project members, outside editors need to actually review the contributions and available material to be sure a neutral view is actually being presented. OK: I'll bite.
- I am not attached to the DBpedia project in any way. I am therefore an outside editor and have no COI. I have reviewed the contributions and available material and I hereby warrant that the article as it stands is presenting a neutral view of a notable topic. Regarding notability: anyone in the semantic web community who is not aware of DBpedia is ... not in the semantic web community. DBpedia is hugely important and notable. Is that enough? Can I or someone else go ahead and remove the non-notability and other tags from the article without someone reverting it? While the claim that DBpedia is non-notable has some comedy value, the clutch of tags at the top of the article just make Wikipedia look weird and slightly deranged. NormanGray (talk) 12:41, 15 November 2009 (UTC)
- No, it isn't enough. If DBpedia is only known to those directly involved with it, that means it is NOT notable. That you would claim the topic is notable. If you don't like the tags, find real notability apart from pointing at its own presentations and those related to it. -- Collectonian (talk · contribs) 16:35, 15 November 2009 (UTC)
- For the record, I have no connection to DBpedia and I couldn't agree more. The absurd quests of the 'wikicrats' are making Wikipedia into a laughing stock. I suggest a new acronym WP:MOLLC (making ourselves look like cretins) and a tag {{subst:attjfaieososi}} (arguing the toss just for an individual editor's own sense of self-importance) be instituted asap. BarryNorton (talk) 14:01, 15 November 2009 (UTC)
- As one of the authors of this article, albeit in a minor way (I included the reference to Tim Berners-Lee describing dbpedia as one of the most notable parts of the Linked Data project), it sounds like I should clarify that I don't work, nor have I ever worked, on the dbpedia project. It's probably also worth noting that, as one of the authors of the ESWC2009 paper cited here, the majority of the authors of that paper are in the same situation. In case you've not read it, the ESWC2009 paper describes how the BBC is using dbpedia as an integral part of bbc.co.uk. —Preceding unsigned comment added by Derivadow (talk • contribs) 16:29, 15 November 2009 (UTC)
- If you helped write the paper, then you do still have a COI. You can't (and shouldn't) add yourself as a source. -- Collectonian (talk · contribs) 16:35, 15 November 2009 (UTC)
- To be clear, while I'm one of the authors of the ESWC2009 paper, I didn't add that reference to this Wikipedia article. danbri did, and he isn't one of the authors of the conference paper, nor is he a member of the dbpedia project. In other words there is no COI in including a reference to the paper in this Wikipedia article. I did reference an interview with the inventor of the Web in which he describes dbpedia as 'one of the more famous parts of the Linked Data project'. I have no connection with that interview. I'm not sure I understand where you think the COI is: I've not referenced my own paper (others have), nor do I work for dbpedia. —Preceding unsigned comment added by Derivadow (talk • contribs) 17:42, 15 November 2009 (UTC)
Merge with Semantic Web
- The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section.
Absolutely ridiculous. You might as well suggest to merge Wikipedia with Web. BarryNorton (talk) 11:52, 15 November 2009 (UTC)
- DBpedia is part of Semantic Web. Other parts of it have no separate article, and DBpedia is most talked about there. A merge is perfectly sensible. -- Collectonian (talk · contribs) 16:36, 15 November 2009 (UTC)
- I meant specifically what was listed under Projects, and RIF likely could be merged as it is still just proposed and is only a stub. -- Collectonian (talk · contribs) 16:55, 15 November 2009 (UTC)
- You mean projects listed under the semantic web entry? What about FOAF (software), SIOC, Nextbio, Linked Data and OpenPSI? —Preceding unsigned comment added by Derivadow (talk • contribs) 17:48, 15 November 2009 (UTC)
- Damn, OpenPSI is worthy of deletion. shellac (talk) 18:21, 15 November 2009 (UTC)
- Yep, and it's been deleted. -- Collectonian (talk · contribs) 19:29, 15 November 2009 (UTC)
- I agree with Barry, it's an inappropriate merge suggestion. -- Earle [t/c] 19:23, 15 November 2009 (UTC)
- Are you going to offer reasons why? -- Collectonian (talk · contribs) 19:29, 15 November 2009 (UTC)
- To be honest, you seem to have no idea of what you are talking about here. This is akin to suggesting that we merge The Beatles and Ludwig van Beethoven into Music (and requesting that Joe Cocker, Claudio Abbado and Robbie Williams should please step back from the discussion due to their COI). Semantic web is a major research topic that has been around for a decade (although many of the techniques used go back much further), with hundreds of research groups working on it and whole conference series dedicated to the topic. --Stephan Schulz (talk) 21:31, 15 November 2009 (UTC)
Improvements
This article is in desperate need of a cleanup. But right now the article accomplishes little else but hammering home how notable it is. 90% of the article is a list of numbers with no context, mentions of other "interlinked" datasets with no mention of who is involved (an uneducated reader might think the CIA and US Census Bureau are using the DBpedia data), plus TBL's comment which doesn't really tell me anything (is DBpedia famous the same way TBL is famous?). Since you guys claim to know something about DBpedia, can you improve the article by answering some of these questions?
- Who started the DBpedia project, and who maintains it?
- How often is the dataset updated, and who does it?
- What is the process through which the dataset is built (algorithms & software used, etc.)
- How is DBpedia actually used? (Beyond "NYT includes links" and "BBC uses it to organize stuff".. I have no idea what that even means!)
- In fact, how is OpenCalais even based on the NYT? I scanned the references and don't see any connection.--Jonovision (talk) 11:14, 16 November 2009 (UTC) (Copied from Wikipedia:Articles for deletion/DBpedia by Chrisahn (talk) 17:28, 16 November 2009 (UTC))
- I guess much of that is a result of Collectonian's attempt to get this article deleted for lacking notability. Hopefully it can be edited into a much better article (and we'd love you guys who are actually involved to help) now that that's out of the way. I think that a simple paragraph or two of description and background would be a good start. henrik•talk 17:48, 16 November 2009 (UTC)
Dear Henrik, thank you for moving the discussion towards improving the article by raising these questions. Please find below pointers and initial answers to your questions. Please also note that I'm involved in the DBpedia project and therefore my comments might be biased (as Collectonian has pointed out).
1. Who started the DBpedia project, and who maintains it?
The project was started by researchers and students at Freie Universität Berlin and Universität Leipzig who implement the code that extracts the data from Wikipedia. The extracted data is hosted by OpenLink, a company that develops RDF databases. Please see http://wiki.dbpedia.org/Team for the complete list of participants.
2. How often is the dataset updated, and who does it?
The dataset is currently updated about every six months. The updates are done by project members from Freie Universität Berlin and Universität Leipzig. Please refer to http://wiki.dbpedia.org/ChangeLog for the history of releases.
3. What is the process through which the dataset is built (algorithms & software used, etc.)
The dataset is built using a Wikipedia-specific data extraction framework that has been developed by the DBpedia project. The framework extracts different types of structured information from Wikipedia and represents the extraction results using the RDF data model. Please refer to http://wiki.dbpedia.org/Documentation for more information about the DBpedia information extraction framework. The process is as follows: 1. We load the Wikipedia dumps into a local database. 2. We run the extraction against this database. 3. We send the extracted data to OpenLink for hosting.
4. How is DBpedia actually used? (Beyond "NYT includes links" and "BBC uses it to organize stuff".. I have no idea what that even means!)
There are three main uses of DBpedia:
4.1. Alternative Wikipedia search interfaces. As DBpedia makes structured information within Wikipedia articles more easily accessible, the DBpedia dataset can be used as a basis for implementing alternative Wikipedia search interfaces that allow users to ask complex queries against Wikipedia content. Examples of such applications that have been developed by members of the DBpedia community are found at http://wiki.dbpedia.org/Applications.
4.2. Knowledge base. Various research projects within the Knowledge Representation and Semantic Web research community use the DBpedia dataset as a knowledge base for experimentation and demonstration http://wiki.dbpedia.org/UseCases#h19-6. We don't have a list of all these projects, but regularly hear of DBpedia being used at conferences like the World Wide Web conference or the Semantic Web conference.
4.3. Interlinking Hub for the emerging Web of Data. This use case might appear a bit strange for people not involved in Linked Data and the Semantic Web. The idea of the Semantic Web is that different parties publish structured data on the Web and set data links between records in the different datasets describing the same entity or related entities. These data links can then be used by client applications like Linked Data browsers or search engines to retrieve data from various sources about an entity and integrate the data afterwards. As DBpedia offers data about a wide range of topics, many other data providers have started to set data links pointing at DBpedia. This makes DBpedia an interlinking hub of the emerging web of data, as client applications can follow a link to DBpedia and then navigate into various other datasets about the same topic. Please refer to http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData for more information about Linked Data and the Web of Data.
5. In fact, how is OpenCalais even based on the NYT?
OpenCalais is not based on the New York Times. The connection between both projects is that both set data links pointing at DBpedia. This means for example that the NYT says that some news articles are about a specific politician (identified with a DBpedia URI) and OpenCalais says that some other articles are about the same politician by also using the DBpedia URI for annotating the articles. As both data sources use the same URI to identify the politician, a client application can now retrieve articles from both sources and know that all articles are about the same guy.
ChrisBizer (talk) 10:13, 17 November 2009 (CET)
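For readers who find the "same URI" point in answer 5 abstract, here is a minimal sketch of what a client application could do with annotations from two sources. The article identifiers and the two annotation lists are made up for illustration, and the DBpedia URIs are only example values; nothing here reflects how the NYT or OpenCalais actually publish their data.

```python
from collections import defaultdict

# Hypothetical annotations: (article id, DBpedia URI the article is about).
# Both the ids and the choice of URIs are invented for this sketch.
NYT_ANNOTATIONS = [
    ("nyt-article-1", "http://dbpedia.org/resource/Angela_Merkel"),
    ("nyt-article-2", "http://dbpedia.org/resource/Barack_Obama"),
]
CALAIS_ANNOTATIONS = [
    ("calais-article-7", "http://dbpedia.org/resource/Angela_Merkel"),
]


def group_by_entity(*sources):
    """Merge annotation lists, keyed by the shared DBpedia URI."""
    grouped = defaultdict(list)
    for annotations in sources:
        for article_id, uri in annotations:
            grouped[uri].append(article_id)
    return dict(grouped)


articles_by_entity = group_by_entity(NYT_ANNOTATIONS, CALAIS_ANNOTATIONS)
```

Because both sources use the identical URI string, a plain dictionary lookup is enough to find all articles about one entity; no name matching is needed.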
6. Re. "BBC uses it to organize stuff". As described in the ESWC2009 conference paper (see http://derivadow.files.wordpress.com/2009/06/eswc2009-bbc-dbpedia-2.pdf - apologies for linking to a file on my blog, but I can only find the abstract on the conference site) - the BBC is starting to use dbpedia URIs as a controlled vocabulary i.e. 'tagging' BBC content with dbpedia URIs (the wikipedia text provides the evidence set for what the tag means) and then aggregating content around those tags.
You can see an example of this at bbc.co.uk/wildlifefinder (e.g. http://www.bbc.co.uk/nature/species/Polar_bear - apologies for those outside of the UK, the video is GeoIP restricted). Note the URL slug - it's the same as used on Wikipedia/dbpedia. The reason the news stories, clips etc. are there is because they've been tagged with this URI http://dbpedia.org/page/Polar_bear.
Derivadow (talk) —Preceding undated comment added 11:13, 17 November 2009 (UTC).
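To make the URL slug point concrete, here is a small sketch that derives the DBpedia page URL from a BBC species URL by reusing the last path segment. The function name and the mapping rule are assumptions based only on the single Polar_bear example above; this is not a description of how the BBC site works internally.

```python
from urllib.parse import urlparse

# Prefix taken from the example URI quoted above.
DBPEDIA_PAGE_PREFIX = "http://dbpedia.org/page/"


def dbpedia_page_for(bbc_url):
    """Reuse the last path segment (the slug) of a BBC URL as a DBpedia page name.

    Hypothetical helper: it assumes the slug is identical on both sites,
    which is only known to hold for the example given in this thread.
    """
    path = urlparse(bbc_url).path
    slug = path.rstrip("/").rsplit("/", 1)[-1]
    return DBPEDIA_PAGE_PREFIX + slug


polar_bear_page = dbpedia_page_for("http://www.bbc.co.uk/nature/species/Polar_bear")
```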
A small note on COI
Just to be clear, the conflict of interest guidelines are not an absolute prohibition on editing an article where you have an outside interest. What is not allowed is to advance the interest of the outside organization at the expense of Wikipedia's goal of a neutral, verifiable encyclopedia. It is a subtle difference, and most people have a hard time knowing where to draw the line (it's hard to step out of your work and view it with impartial eyes), so editing articles you have a close connection to is often said to be "strongly discouraged". But I would be most unhappy if you let errors stand in articles because of that guideline. :-) Feel free to fix any minor mistakes yourselves, and perhaps propose larger changes here in the talk page for a bit before doing them.
It is my feeling that some other people you've interacted with here have been overly hostile and we could perhaps have done a better job of explaining our culture - something I'd sincerely like to apologize for. We most definitely need people who know what they are talking about. A partial explanation is that we have tons of people who daily do try to use wikipedia for marketing and promotional purposes fairly unrepentantly. A certain battleground mentality can be understood, if not condoned.
(Becoming more active participants here will likely make your work of integrating DBpedia and Wikipedia much easier too, both by you understanding the cultural norms of the site and for the community here to get to know you). Again, my apologies for the rough start. henrik•talk 13:20, 17 November 2009 (UTC)
Demos
First of all, thanks to everybody for the constructive feedback. May I suggest the inclusion of some demos, such as the DBpedia Faceted Browser and DBpedia Relationship Finder? I think they do a great job demonstrating what can be done with DBpedia. It should be noted that the Faceted Browser was developed in collaboration with, and is hosted by the company Neofonie, who included a "powered by" tagline that can be seen as advertising. However I think they did a great job and other than being a DBpedia team member, I have no affiliation with them that would lead me to think that I'm biased regarding these statements.
- DBpedia Faceted Browser: http://dbpedia.neofonie.de/browse/
- DBpedia Relationship Finder: http://relfinder.dbpedia.org/
On a sidenote, the DBpedia Mobile demo is currently broken and I hope to have it fixed soon.
--Beckr (talk) 14:30, 17 November 2009 (UTC)
How's this for an Introduction?
DBpedia is a project started by researchers and students[1] at Freie Universität Berlin and Universität Leipzig with the objective of extracting structured data from Wikipedia so that it can be published on the Web[2] as RDF, a central data model of the Semantic Web.
DBpedia describes 2.9 million things, including over 282,000 persons, 339,000 places, 88,000 music albums, 44,000 films and 130,000 species, including abstracts in multiple languages[3]. It has been described by Tim Berners-Lee as one of the more famous parts of the Linked Data project.[4]
By extracting structured information from Wikipedia and publishing it as RDF the underlying data can be accessed using an SQL-like query language for RDF called SPARQL.
In addition to structured data, DBpedia provides URIs for the underlying concepts described in Wikipedia. The number and breadth of concepts means that DBpedia now interlinks a very large number of additional datasets within the Linked Data cloud and has been used by some content providers, notably the NYT and the BBC, as a controlled vocabulary[5] i.e. 'tagging' content with DBpedia URIs and then aggregating content around those tags.
--Derivadow (talk) —Preceding undated comment added 23:26, 17 November 2009 (UTC).
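If the introduction mentions SPARQL, a short illustration might help readers see what "SQL-like" means. The sketch below only builds a request URL for the public endpoint mentioned earlier on this page (http://dbpedia.org/sparql) and does not send it. The choice of resource URI and of the rdfs:label property is an assumption made for illustration.

```python
from urllib.parse import urlencode

# Public endpoint as referenced earlier on this talk page.
SPARQL_ENDPOINT = "http://dbpedia.org/sparql"

# Example query: fetch the English label of one resource.
# The resource URI is an illustrative assumption.
QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?label WHERE {
  <http://dbpedia.org/resource/Polar_bear> rdfs:label ?label .
  FILTER (lang(?label) = "en")
}
"""


def build_request_url(query, endpoint=SPARQL_ENDPOINT):
    """Encode the query as the 'query' parameter of a GET request URL."""
    return endpoint + "?" + urlencode({"query": query.strip()})


request_url = build_request_url(QUERY)
```

The SELECT ... WHERE shape resembles SQL, but the WHERE clause matches triple patterns (subject, property, value) rather than table rows.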
History
(Hi, I've tried to write up a bit of history about the project - who was involved, what happened when etc. Apologies for all the people I've not referenced and the mistakes I've made and in particular the relationship between Leipzig and Berlin which I've never really grokked. I'm also aware that this probably lacks sufficient references to pass Wikipedia's quality standards. Although that might not be true. Does anyone else?
I'm not sure where to go from here - I guess if others would like to contribute to what I've written we should be able to get the article into a publishable state? In particular I think fact checking on what I've written would be helpful + a bit of info on the technology stack would be handy, anyone fancy writing that?)
---
DBpedia was first proposed by Chris Bizer in early December 2006 [6]. In mid December 2006 Georgi Kobilarov and Richard Cyganiak joined the team, and the name dbpedia.org was agreed on the 20th December 2006. On the 21st December Richard Cyganiak registered the domain name dbpedia.org.
Over Christmas 2006 and early January 2007 Georgi Kobilarov developed the code to import Wikipedia dumps and on the 23rd January 2007 Sören Auer from the Universität Leipzig (who also developed the infobox extraction code) announced the first release of dbpedia.org[7]. The first release featured:
- two large extracted datasets
- a SPARQL endpoint and a data browser
- a visual query builder
This initial release featured structured data about people and cities and used D2R to publish the data.
After this initial release discussions began with OpenLink, who offered to host the triple store, which took place in late February 2007. The migration to OpenLink servers also coincided with a SPARQL endpoint thanks to OpenLink's Virtuoso server.
By mid May 2007 dereferenceable URIs were available, published on top of the Virtuoso SPARQL endpoint. This initial "linked data frontend" built on top of the SPARQL endpoint was a hack based on the original D2R Server code. In June, after some additional work, Richard Cyganiak released it as "Pubby". The frontend ran on servers in Berlin, while the SPARQL endpoint ran on OpenLink-hosted machines - this architecture resulted in poor response times early on.
In July 2007 a new data extraction framework had been built with a unified codebase from Leipzig and Berlin. By this time DBpedia contained 1,950,000 "things", including at least 80,000 people, 70,000 places, 35,000 music albums, 12,000 films, 1,600,000 links to relevant external web pages and 440,000 external links into other RDF datasets. Altogether, the DBpedia dataset consisted of around 103 million RDF triples.[8]
--Derivadow (talk) 13:28, 18 November 2009 (UTC)
- ^ http://wiki.dbpedia.org/Team
- ^ Christian Bizer, Jens Lehmann, Georgi Kobilarov, Soren Auer, Christian Becker, Richard Cyganiak, Sebastian Hellmann, DBpedia - A crystallization point for the Web of Data. Web Semantics: Science, Services and Agents on the World Wide Web, Volume 7, Issue 3, The Web of Data, September 2009, Pages 154-165, ISSN 1570-8268
- ^ http://wiki.dbpedia.org/About
- ^ Sir Tim Berners-Lee Talks with Talis about the Semantic Web. Transcript of an interview recorded on 7 February 2008.
- ^ http://derivadow.files.wordpress.com/2009/06/eswc2009-bbc-dbpedia-2.pdf
- ^ http://lists.w3.org/Archives/Public/semantic-web/2006Dec/0003.html
- ^ http://lists.w3.org/Archives/Public/semantic-web/2007Jan/0121.html
- ^ http://lists.w3.org/Archives/Public/semantic-web/2007Sep/0033.html