Talk:NoSQL

Table

In the short comparison table, I'm not sure that influences and sponsors are related. Perhaps it would be better to use two different columns, such as "based on" for the influences and "support" for the commercial or free groups that provide support. After all, the business model of many open-source NoSQL suppliers is support, making it a prime attribute of their software. aary (talk)

Why was the table removed? If there were concerns about the content, it would be better to replace it than to remove the table. I'm tempted to reverse the edits. Bhaskar (talk) 14:22, 13 November 2009 (UTC)

I'd like to propose changing the NoSQL wiki in the following way:

move the table with all the NoSQL databases to a different page similar to: [Comparison_of_SQL_database_management_systems] or perhaps like here: [Comparison_of_business_integration_software]
elaborate on the different types of NoSQL solutions and their respective uses:
1. Graph-based databases
2. Document-based databases
3. K/V stores
for each one, write why and how it differs from SQL and what problems it came to solve

This is just a suggestion, but if no one objects we can all contribute from our expertise and make this a truly detailed overview of the kind of solutions NoSQL offers (what NoSQL means) rather than just a list of available options. aary (talk) 22:54, 13 November 2009 (UTC)

Agreed. I moved the comparison table to structured storage (the formal name used in the papers); however, each of the different types of databases deserves (and generally already has) a dedicated page. This article should focus on the NoSQL group itself, who its members are, what it believes, etc. -- samj _in^out 13:09, 26 November 2009 (UTC)

Is this what the NPOV dispute was about? Can the tag be removed now? Pcap ping 14:03, 9 January 2010 (UTC)

I would like to see some pros and cons on when to use or not to use NoSQL databases.

Pros:

have tons of data (large datasets)
have sparse data
fast read/write access (but depends on primary-key/indexing)
good for Web 2.0 applications...
good communities which will support you
schemaless
no query language (need Map/Reduce function to query data)

Cons:

schemaless (sometimes important for interoperability, reusability)
still in development (you could experience some missing features like authentication, performance problems, ...)
no query language (need Map/Reduce function to query data)
- in projects many partners are familiar with SQL
you require a server-cluster to get the full performance of NoSQL DBs
ACID transactions are sacrificed for performance.
using aggregations —Preceding unsigned comment added by 129.26.162.152 (talk) 10:00, 22 September 2010 (UTC)
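
To make the "no query language (need Map/Reduce function to query data)" item concrete, here is a minimal sketch in plain Python. The records, field names and helper functions are hypothetical and do not correspond to the API of any particular NoSQL product; it only illustrates how a map function and a reduce function can stand in for something like a SQL GROUP BY over schemaless, sparse records.

```python
from collections import defaultdict

# Hypothetical schemaless records: fields may differ between documents.
documents = [
    {"id": 1, "type": "post", "author": "alice", "tags": ["nosql", "cap"]},
    {"id": 2, "type": "post", "author": "bob", "tags": ["sql"]},
    {"id": 3, "type": "comment", "author": "alice"},  # sparse: no "tags" field
    {"id": 4, "type": "post", "author": "alice", "tags": ["nosql"]},
]


def map_posts_by_author(doc):
    """Map step: emit (author, 1) for each post; skip other documents."""
    if doc.get("type") == "post":
        yield doc["author"], 1


def reduce_count(key, values):
    """Reduce step: sum the emitted values for one key."""
    return sum(values)


def map_reduce(docs, map_fn, reduce_fn):
    """Run map over all docs, group emitted values by key, reduce each group."""
    groups = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}


posts_per_author = map_reduce(documents, map_posts_by_author, reduce_count)
print(posts_per_author)  # {'alice': 2, 'bob': 1}
```

In SQL the same question would be roughly SELECT author, COUNT(*) ... WHERE type = 'post' GROUP BY author, which is why this item can reasonably be read as both a pro and a con.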

Legitimacy

I'm concerned about the legitimacy of this page. The only reference links to a Rackspace blog. The paragraph following the paragraph containing the reference claims that a Rackspace employee coined the term. Without any other corroborating links or references, this entry appears to be little more than a marketing ploy to legitimize the term 'NoSQL'. Dancrumb (talk) 20:24, 15 November 2009 (UTC)

If it helps, NoSQL is essentially an advocacy group (a "database movement", whatever that is) and the article should focus on it, its members, structure, events, "beliefs", etc. Indeed, many who actively promote this new breed of databases (myself included) don't subscribe to the abrasive approach taken by this particular group, and I doubt all those listed in the comparison table are willing participants either.

In terms of Wikipedia policy on verifiable notability, I suspect that NoSQL would pass, which is why I have resisted the urge to AfD this article and have in fact encouraged them to contribute to Wikipedia rather than create another repository elsewhere. Hopefully the article will improve significantly, however, as it is currently not great. -- samj _in^out 12:56, 23 November 2009 (UTC)

I feel this page is totally legitimate. With all the recent press about the NoSQL movement, we need to get the dispute tag removed ASAP. This page can be a great place for people to get more information on alternatives to RDBMS systems. It provides a great service to the community. I vote we add more references and remove the dispute tag now. --Dan 11:40, 4 December 2009 (UTC)

Check the companies on a Google search who are buying keywords for "nosql" -- I believe this is a totally legit topic and will only get bigger. --209.204.139.40 (talk) 06:14, 7 December 2009 (UTC)

Great point about the Google keywords. I would like to remove the disputed tag today unless we get rational arguments. --Dan 13:27, 9 December 2009 (UTC)

ebay does use an RDBMS

I believe ebay have purchased Greenplum, which is an RDBMS based on PostgreSQL. There's some information on DBMS2's site. Do ebay have any NoSQL implementations?

Note: I do not work for either Greenplum or ebay, but have spoken with Greenplum's sales people, which is how I know about this. Nic Doye (talk) 18:01, 23 November 2009 (UTC)

Taxonomy of NoSQL Systems

I would like to add a taxonomy section to the main page.

Here is a list that Srini V. Srinivasan posted on the list from Steve Yen's talk at NoSQL Oakland. I have also added a few additional items.

Key-Value APIs
1. key-value cache (memcached, repcached, coherence, infinispan, eXtreme scale, jboss cache, velocity, terracotta)
2. key-value store (keyspace, flare, schema-free, RAMCloud)
3. clustered key-value-store (dynamo, voldemort, Dynomite, SubRecord, MotionDb, Dovetaildb)
4. ordered-key-value-store (tokyo tyrant, lightcloud, NMDB, luxio, memcachedb, actord)
data-structures database (redis)
tuple-store (gigaspaces, coord, apache river)
object database (ZopeDB, db4o, Shoal)
document store (couchDB, Mongo, jackrabbit, ThruDB, CloudKit, Persevere, Riak Basho, Scalaris, Citrusleaf)
Native XML Databases (MarkLogic, eXist-db)
wide columnar store (BigTable, Hbase, Cassandra, Hypertable, KAI, OpenNeptune, Qbase, KDI)

Why do people keep calling Riak a document store if it has a Dynamo architecture and does not support changes to a field in a document once it is in the database? Is it because it can store and read documents? How is that different from the pluggable serialization in Voldemort? —Preceding unsigned comment added by 188.82.70.107 (talk) 11:13, 17 October 2010 (UTC)

Jackrabbit NoSQL ?

In my opinion, Jackrabbit is not a NoSQL database. It's a content repository, like the Ariadne content repository, which provides a unified interface to access the contents of a content repository. Perhaps it internally uses a NoSQL database (I don't know), but Jackrabbit is not a NoSQL database! —Preceding unsigned comment added by 129.26.162.152 (talk) 06:59, 22 September 2010 (UTC)

Essence of NoSQL systems

I think that the essence of NoSQL is about two things. 1. Not ACID-compliant, because full transaction support becomes too time-consuming when the amount of data is large, when the data is distributed and when relatively cheap hardware is used. 2. No joins between tables. Joins become too slow when the amount of data is large. No joins means that data has to be stored in a denormalized format.

It's not about the lack of a schema. Cassandra has structures called column families, which are something similar to a schema, and it isn't easy to change such a column family in Cassandra. I tend to think that Cassandra isn't schemaless. Heelmijnlevenlang (talk) 21:32, 4 December 2009 (UTC)
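
To illustrate the "no joins means denormalized data" point above, here is a small sketch in plain Python. The users, orders and the "user:<id>" key format are made up for the example and are not taken from any specific store; the idea is only that the join is paid once at write time, so a read becomes a single key lookup.

```python
# Normalized (relational style): two "tables" linked by user_id.
users = {1: {"name": "alice"}, 2: {"name": "bob"}}
orders = [
    {"order_id": 10, "user_id": 1, "item": "book"},
    {"order_id": 11, "user_id": 2, "item": "pen"},
    {"order_id": 12, "user_id": 1, "item": "lamp"},
]


def orders_for_user_join(users, orders, user_id):
    """Read-time join: scan the orders and combine them with the user row."""
    return {
        "name": users[user_id]["name"],
        "orders": [
            {"order_id": o["order_id"], "item": o["item"]}
            for o in orders
            if o["user_id"] == user_id
        ],
    }


def denormalize(users, orders):
    """Write-time 'join': store one self-contained record per user key."""
    store = {}
    for o in orders:
        key = f"user:{o['user_id']}"  # hypothetical key format
        record = store.setdefault(
            key, {"name": users[o["user_id"]]["name"], "orders": []}
        )
        record["orders"].append({"order_id": o["order_id"], "item": o["item"]})
    return store


kv_store = denormalize(users, orders)

# A read is now a single lookup; no join is needed.
print(kv_store["user:1"])
```

The trade-off is that the user's name is now copied into every denormalized record that needs it, so changing it means rewriting those records.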

The feature intrinsically defining NoSQL systems is that they are all distributed systems for processing Big Data. At these volumes RDBMSs just fall short and become tedious and unmanageable, and require large and expensive hardware. By adopting distribution, one can run well-performing data stores on commodity hardware or even on cloud infrastructure. However, being distributed, these systems intrinsically become subject to Eric Brewer's CAP theorem: such a system must drop one of the three CAP properties in favor of the other two. Hence, each of the systems on the example list gets very distinctive properties when compared to the others. CAp, CaP and cAP would effectively categorize every NoSQL datastore by its feature set, which every adopter must evaluate to favor one over the other, mostly depending on their own requirements. Amazon, for example, cares less for consistency and hence prefers eventual consistency in its Dynamo database, favoring its always-available property. By dropping typical RDBMS properties, and hence not supporting SQL, this range of data stores has been named 'NoSQL' datastores. wimvanleuven (talk) 10:04, 16 December 2009 (UTC)
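
As a rough illustration of the "availability over consistency" trade-off described above, here is a toy sketch in Python. It is a deliberate simplification under stated assumptions: the Replica class, the sync function and the last-write-wins rule with integer timestamps are hypothetical and are not how Dynamo or any other real system is implemented. It only shows that both replicas keep answering during a partition, that one of them may answer with stale data, and that they converge once they can exchange state again.

```python
class Replica:
    """Toy replica: stores key -> (timestamp, value); newest timestamp wins."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def write(self, key, value, timestamp):
        # Accept the write only if it is newer than what we already hold.
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None


def sync(a, b):
    """Exchange all entries in both directions (assumed last-write-wins)."""
    for key, (ts, value) in list(a.data.items()):
        b.write(key, value, ts)
    for key, (ts, value) in list(b.data.items()):
        a.write(key, value, ts)


r1, r2 = Replica("r1"), Replica("r2")

r1.write("cart", ["book"], timestamp=1)
sync(r1, r2)                      # both replicas now agree

# Network partition: only r2 receives the next write, r1 keeps serving reads.
r2.write("cart", ["book", "pen"], timestamp=2)
stale = r1.read("cart")           # available, but not consistent
fresh = r2.read("cart")

sync(r1, r2)                      # partition heals, replicas converge
converged = r1.read("cart")

print(stale, fresh, converged)
```

A system that favored consistency instead would have to refuse the read on r1 (or the write on r2) while the partition lasted.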

Brewer's CAP theorem does seem the best way to categorize NoSQL implementations, as well as bearing on their key advantage over RDBMS (large volume). But we don't seem to have a CAP entry anywhere in en.wikipedia. http://www.julianbrowne.com/article/viewer/brewers-cap-theorem has some references; do they add up to enough to make a relevant article? Jackrepenning (talk) 21:51, 3 March 2010 (UTC)

See CAP theorem. Heelmijnlevenlang (talk) 20:04, 21 March 2010 (UTC)

I think it is important to differentiate an RDBMS (i.e. the platform) from a relational database (normalised data structures). Data warehouses are often built on an RDBMS platform, deal with large data volumes and often involve multiple servers. Additionally, the data is generally structured in a denormalised manner (e.g. a star schema) and often in a non-row-based physical storage format. Finally, bulk updates to warehouses are often done in a way that is not ACID-compliant. So they share many of the characteristics of NoSQL, except for the obvious one that they are often queried through SQL (though that is often generated by a tool). But the discussion focuses on transactional workloads rather than the aggregation or analysis associated with data warehouse activity. It isn't so much data volumes that define the NoSQL use case, but massively concurrent access to detailed data items. —Preceding unsigned comment added by 203.15.73.30 (talk) 00:30, 23 March 2010 (UTC)

Removing npov tag

I'm removing the {{npov}} tag. It was added three months ago, [1], but no "discussion on the talk page" was initiated to discuss the purported NPOV issue. TJRC (talk) 23:44, 23 February 2010 (UTC)

VoltDB

Where does VoltDB fit into the list? I'd like to find a list of RDBMS NoSQL databases. --Ysangkok (talk) 18:35, 10 June 2010 (UTC)

It makes sense to add VoltDB, but it probably wasn't on the list in the first place because it's new. Leave it to the original author to add it if needed. Captchad (talk) 20:53, 16 June 2010 (UTC)

Relation to article Document-oriented database

As a software developer with a memory of more than just the current hype, I'd like to propose merging this article and the article about 'Document-oriented database'. The reasons that led to the development of document-oriented databases were more or less the same. The only difference is the usage of the current buzzwords in the NoSQL groups. In a way, NoSQL systems have taken over the role of document-oriented databases as the counterpart to SQL servers. 12:55, 30 July 2010 (UTC) —Preceding unsigned comment added by 91.40.129.120 (talk)

Transactional support to BigTable

There is a paper from 2008 presented by Google at SIGMOD/PODS about a transactional manager for BigTable called Megastore. I can't find the paper "Megastore: A Scalable Data System for User Facing Applications", but there is a description of the presentation on James Hamilton's blog. Now, I don't know the relation of this system to Percolator, but it should be mentioned in the same section, no? —Preceding unsigned comment added by 188.82.70.107 (talk) 11:24, 17 October 2010 (UTC)

Non SQL vs. NoSQL

NoSQL equals "not only SQL". But what about "non SQL"? --217.162.253.165 (talk) 16:34, 13 November 2010 (UTC)

Taxonomy section proposal

Hello,

Any objections to moving the info in the Taxonomy section to the Structured storage table? 205.228.108.185 (talk) 04:30, 25 February 2011 (UTC)

I oppose; the Structured storage table is of poor quality. For example, there are a lot of entries without articles (proofs of notability); see WP:WTAF for more. --Kgfleischmann (talk) 06:50, 25 February 2011 (UTC)

Fair enough, but what you mention is a good reason to improve that table, not to stuff a separate table in this article. 205.228.108.185 (talk) 09:39, 25 February 2011 (UTC)

Begin improving it; let's continue this discussion afterwards. --Kgfleischmann (talk) 12:11, 25 February 2011 (UTC)

Sure, can you be more specific about what you don't like in that table? In fact, can you actually lead by example and start working on it? 220.100.23.139 (talk) 13:40, 25 February 2011 (UTC)

OK, I've given it a go. Can you please have a look and let me know if this is what you had in mind? Are there any more objections? 121.102.42.157 (talk) 23:40, 25 February 2011 (UTC)

Also, I would point out that the tables in the Taxonomy section currently also suffer from the same notability problem, see e.g. Tokyo Cabinet. 219.111.119.62 (talk) 04:28, 5 March 2011 (UTC)

I oppose, as the Taxonomy section is about NoSQL. Sae1962 (talk) 11:09, 9 March 2011 (UTC)

According to the lead, Structured Storage is how NoSQL systems are referred to in academic papers, so the two terms can be used as synonyms, and indeed Structured Storage redirects to NoSQL. OK, in theory Structured Storage also includes relational DBMSs, but that is only an argument to move Comparison of structured storage software to Comparison of NoSQL software.

In the Taxonomy section I would certainly welcome some kind of technical description of the various types, with eminent examples for each category. But effectively rebuilding the same table without merging the efforts does not make sense to me. 205.228.108.58 (talk) 02:19, 10 March 2011 (UTC)