Talk:Flat file database
This is the talk page for discussing improvements to the Flat file database article.
WikiProject Computing (Rated Start-class)
WikiProject Databases / Computer science (Rated Start-class, High-importance)
- 1 definition inaccurate now
- 2 Hyphen (removed)
- 3 Graphic example
- 4 Repeatable fields, and Boolean logic
- 5 My rationale
- 6 Flat-file != Relational DB ?
- 7 Merge Flat file
- 8 Flat File and Flat File Database Topics should be kept separate
- 9 Graphic Comment
- 10 Coinage
- 11 Flat File Indexing (Instantaneous Random Access to Any Record)
- 12 Assorted Semi-boo-Boos:
definition inaccurate now
Someone went ahead and merged "flat file" with "flat file database", which I suppose is a step toward the greater good, but:
A flat file database is not always a single-file database. A single-file flat file database is a special type of flat file database.
PmWiki is a flat file database wiki, for example. It uses one page per wiki entry and sorts the entries using file names in the filesystem. TiddlyWiki, on the other hand, is a flat file database wiki that holds everything in one file. Sorting is accomplished by tag and by location in the file's structure.
- As I recall the word "flat" in "flat file" was used to contrast with hierarchical data stores. That contrast may have seemed dated for a while, but now seems relevant again as data stores with more hierarchical or graph-like structures have regained popularity. Ericfluger (talk) 20:17, 31 May 2015 (UTC)
Hyphen (removed)
Why the hyphen in the noun? Deb 18:19, 3 Apr 2004 (UTC)
Beats the hell out of me. Edit boldly, as they say. Paul Beardsell 16:23, 8 Jun 2004 (UTC)
Graphic example
This page lacks any example to illustrate the concept, thus it hews more closely to theory than to the thing itself. I shall edit an upgrade at: User:Xiong/Flat file database. — Xiong (talk) 14:30, 2005 Mar 25 (UTC)
Repeatable fields, and Boolean logic
One improvement of flat file databases was repeatable fields. Around 1970, some DBMS included "repeatable fields". In a bibliographic application, multiple subject headings may be needed to describe a document. Before repeatable fields, the designer would have to establish as many separate subject-heading fields as the maximum number expected to be needed. Many of these fixed-length fields would be unused in most of the records, thereby wasting storage space. Repeatable fields minimized such waste. Also, Boolean logic could be used to query/search flat file databases. Before the advent of relational DBMS, many bibliographic databases were flat file databases. Should this be mentioned in this article? AnonUser 02:02, 15 October 2005 (UTC)
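As a rough illustration of the two ideas in the comment above, the following Python sketch stores a repeatable subject field in a flat text file and runs a Boolean query over it. Everything here is an assumption made for illustration: the record layout (one record per line, title and subjects separated by "|", repeated subjects separated by ";"), the sample titles, and the helper names parse and search. It is not taken from any historical DBMS.

```python
# Hypothetical flat file: one record per line, "title|subject;subject;...".
# The subject field repeats as often as a record needs, so no fixed number
# of mostly empty subject-heading fields has to be reserved.
RECORDS = """\
Relational Model Basics|databases;relational model
Card Catalog Design|libraries;cataloguing
Indexing Bibliographies|libraries;databases;indexing
"""


def parse(text):
    """Split each line into a title and a set of subject headings."""
    records = []
    for line in text.splitlines():
        title, subjects = line.split("|")
        records.append({"title": title, "subjects": set(subjects.split(";"))})
    return records


def search(records, all_of=(), any_of=(), none_of=()):
    """Boolean query: AND over all_of, OR over any_of, NOT over none_of."""
    hits = []
    for record in records:
        subjects = record["subjects"]
        if not set(all_of) <= subjects:
            continue  # an AND term is missing
        if any_of and not (set(any_of) & subjects):
            continue  # none of the OR terms is present
        if set(none_of) & subjects:
            continue  # a NOT term is present
        hits.append(record["title"])
    return hits


records = parse(RECORDS)
both = search(records, all_of=["libraries", "databases"])
only_libraries = search(records, any_of=["libraries"], none_of=["databases"])
print(both)
print(only_libraries)
```

The query is a sequential scan of every record, which matches how a flat file without an index would have to be searched.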
My rationale
I'm going to go ahead and be bold. Excel is not a flat-file database, so Excel examples do not belong. FileMaker bills itself as a fully-fledged database tool, so it's not a flat-file database either. They might be able to import/export to a flat format, but this is not native to them. The article, as it was, read like an advertisement for FileMaker, providing instructions on how to use it and how to make it look a bit like a flat file database. --Sam Pointon 17:37, 5 June 2006 (UTC)
Flat-file != Relational DB ?
I find this quote strange: "The data are flat as in a sheet of paper, in contrast to more complex models such as a relational database." If I was to make a simple DB I'd just use more than one file to create some relations. Or is it imperative that a flat-file is just one file?
I ask because I have no idea of how it works, but it seems to be a simple workaround.
id  name   team
1   Amy    Blues
2   Bob    Reds
3   Chuck  Blues
4   Dick   Blues
5   Ethel  Reds
6   Fred   Blues
7   Gilly  Blues
8   Hank   Reds

team   arena
Blues  le Grand Bleu
Reds   Super Smirnoff Stadium
Query: Suppose I want to date Amy. What arena do I go to?
team = "SELECT team FROM File1 WHERE name='Amy';"
arena = "SELECT arena FROM File2 WHERE team='" + team + "';"
Of course I'd have to build a wrapper between SQL and the flat-files but that should be quite easy.
PER9000 07:06, 4 August 2006 (UTC)
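The wrapper mentioned in the comment above could look roughly like the following Python sketch. It is only an illustration under stated assumptions: the two tables are held as comma-separated text (named FILE1 and FILE2 here to match the query), and the helper select is a hypothetical stand-in for a single-table SELECT ... WHERE, not a real SQL layer.

```python
import csv
import io

# Assumed contents of the two flat files, stored as comma-separated text.
FILE1 = """id,name,team
1,Amy,Blues
2,Bob,Reds
3,Chuck,Blues
4,Dick,Blues
5,Ethel,Reds
6,Fred,Blues
7,Gilly,Blues
8,Hank,Reds
"""

FILE2 = """team,arena
Blues,le Grand Bleu
Reds,Super Smirnoff Stadium
"""


def select(text, column, where_column, where_value):
    """Return `column` from every row whose `where_column` equals `where_value`."""
    reader = csv.DictReader(io.StringIO(text))
    return [row[column] for row in reader if row[where_column] == where_value]


# Two lookups chained by hand, mirroring the two queries in the comment.
team = select(FILE1, "team", "name", "Amy")[0]
arena = select(FILE2, "arena", "team", team)[0]
print(arena)  # prints "le Grand Bleu"
```

Each call scans a whole file from the top, so the relation between the two files exists only in the calling code, not in the files themselves.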
- I also oppose this formulation: "A flat file database is described by a very simple database model, where all the information is stored in text files." You can model arbitrarily complex databases as flat-file databases. The thing that is simple about it is that it uses files with raw text instead of some complex format. I or someone else should make this article more neutral and perhaps incorporate the small example I just made. Perhaps someone with a deeper insight into why it is less efficient to store a db as flat files (pointers, hard disks, that kind of stuff) should write a little about this. PER9000 07:26, 4 August 2006 (UTC)
- Yet another insight on my part: In database they talk about a Flat model and not a Flat file database. To me this is/was/should be (I don't know any more) two separate things. That may explain my frustration. Also it is not clear to me that we must have only one table but many files (perhaps a metaphor to a phonebook: one table, many pages?) PER9000 07:35, 4 August 2006 (UTC)
- Right, the original article concerns the "flat" database model. I add a section where "flat files" are used as data stores of a relational database, using the above example. --ANONYMOUS COWARD0xC0DE 07:04, 8 February 2007 (UTC)
I agree. The preceding comments indicate some room for clarification in the article content. I will modify the introductory paragraph for clarity, possibly other items as well. dr.ef.tymac 23:55, 29 November 2006 (UTC)
- This subject matter area is rife with disagreements about definitions. There can be conflicts between formal and casual use. There are also reasonable disagreements between very well informed people. AFAIK the historical origin of the term "flat" was to contrast with hierarchical data stores. It was about structure (or lack of it) rather than representation. So yeah, a flat file could be a table, and typically was, but it could also be a group of key-value pairs as long as there is no graph structure expressed. We can create a table including pointers from one row to another. By the definition I've just given that would not be a flat file. However, in current popular use it might be considered a flat file.
- Strictly speaking, relational databases work with abstractions called relations and tuples that can be represented by tables and rows. SQL databases work with tables and rows. A relational database purist would probably call that confusing the map with the territory. SQL advocates would probably say that's how it's done in practice rather than theory and it's close enough. It's certainly quite practical to restrict, project and join ordinary text tables with common POSIX-style tools, and there are more formal systems for doing so, like shsql. Whether or not that's really relational depends on which of those views you subscribe to. (I personally try to sidestep the whole debate by referring to SQL databases as just that and reserving the term "relational" where it very clearly fits, but that's not always practical. I don't really care that much, just don't wanna start flame wars.)
- So my feeling is that it's good to say what you mean without jargon when it's practical, and when introducing terms it's probably helpful to spell out not only what you mean, but to acknowledge alternate definitions to avoid confusion, and if practical explain how you made your choice. I think it's good to be diplomatic about this stuff and say things like "for the purpose of this discussion we're defining it this way" rather than trying to present an absolute universal definition of anything.
Flat File Database is a collection of flat files. A Flat File may or may not be (part of) a database. The "Flat File Database" article should remain separate from the "Flat File" article because people looking for information about flat files may or may not want all the information about databases --- But I do believe the articles should reference/link to one another. My considered opinion. Please be kind, this is my first submission into Wikipedia.org. (previously posted Marion 18:26, 15 August 2007 (UTC)Eisforeverything on another discussion by mistake) Marion 18:37, 21 August 2007 (UTC)Eisforeverything (aka Marion)
Flat File and Flat File Database Topics should be kept separate
Flat File Database is a collection of flat files. A Flat File may or may not be (part of) a database.
The "Flat File Database" article should remain separate from the "Flat File" article because people looking for information about flat files may or may not want all the information about databases --- But I do believe the articles should reference/link to one another.
My considered opinion. Please be kind, this is my first submission into Wikipedia.org.
Marion 18:26, 15 August 2007 (UTC)Eisforeverything
I'd like to delete this; I created it as a new discussion by mistake. Marion 18:39, 21 August 2007 (UTC)
I think the article "flat file" and "flat file database" should be kept as two separate articles. Stolkin 19:01, 23 October 2007 (UTC)
Graphic Comment
I'm not sure that the blurb under the first graphic on the page actually makes any sense: it gives "one of several typical uses for a flat file database" as being convertible to a fully-fledged relational database.
Firstly, this doesn't really make sense as a use in and of itself. Moreover, "converting to" might be less representative of real usage than "converting into a format importable into" (or whatever)?
- The last part is now gone, which would seem a reasonable way to address the issues here. dr.ef.tymac (talk) 03:20, 4 February 2008 (UTC)
Coinage
I would like to discuss adding two paragraphs on the origin of the term "flat file". The term flat file was coined in 1971 on the campus of Modesto Junior College by the founder of the computer club. At that time data sets were described in the same way that they were stored in a computer on magnetic rings, i.e. two dimensional array or three dimensional array. The coiner decided that a shorter term with fewer syllables might get accepted as a substitute. The term flat file was decided upon because it had a physical image and because comic research found that people enjoy hearing and saying words with “f” sounds and “p” sounds (think of all the four-letter words that start with either of these two letters). The new term was disseminated using the phrase “… flat file, you know, a two dimensional array.” A transfer to UC Berkeley along with membership in the campus computer club spread the term on that campus during 1971 – 1975. Later a position with the Fireman’s Insurance Company, which trained a high percentage of the SF Bay Area COBOL programmers, helped spread the term throughout the local region. Please feel free to edit/change/correct as necessary. David E. Mould (talk) 15:16, 21 October 2010 (UTC)
I don't have a reliable source for my claims. I don't know how to document something like this. Does anybody have advice / experience they can provide? David E. Mould (talk) 16:51, 9 November 2010 (UTC)
- Everything in the article should be supportable by a reliable source, so if all you have is opinion or first-hand experience, then it shouldn't be included. As Adrian Willenbücher pointed you to above, you should read WP:Verifiability and WP:No original research for explanations of why we have those requirements. VernoWhitney (talk) 21:24, 16 March 2011 (UTC)
I suggest that the origin of the phrase "flat file" might be particularly difficult to document adequately. I further suggest that it may very well have arisen independently in multiple locations. For example, when IBM introduced VSAM in 1970 (or the early '70s), the term flat file rapidly came into daily use in IBM mainframe shops to describe a non-VSAM file, meaning a non-indexed, or non-keyed, file. Or, as the article states, a file in which "There are no structural relationships between the records" (personal experience). Perhaps a statement in the article to this effect might be more accurate. Merligren (talk) 20:53, 22 April 2011 (UTC)
Flat File Indexing (Instantaneous Random Access to Any Record)
Flat file databases have historically been used for sequential processing of textual data records, often large amounts of data with perhaps large records. The weakness of flat file databases is that records cannot be accessed randomly. Random access can be achieved by reading the entire flat file and loading the record offsets (in bytes) into a program hash table, for later random access while the program is running. But this has to be done each time the program is run, as the program hash table must be reloaded each time. A user is not likely to appreciate or tolerate such a wait, especially if the flat file is very large (millions of records). To solve this problem, the record offsets may be loaded and maintained within a persistent DBM file tied to a program hash table. That way, the record offsets are immediately available to the program accessing the flat file database. The file pointer can be set to any record offset (in bytes) within the opened flat file, for instantaneous random access to any record (see dbm). Erichansen1836 (talk) 16:22, 9 April 2015 (UTC)
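The offset-index scheme described above can be sketched in Python with the standard library's dbm module. This is a minimal illustration under assumptions not stated in the comment: records are one per line, fields are separated by "|", the first field is a unique key, and the file names and the functions build_offset_index and fetch_record are hypothetical.

```python
import dbm
import os
import tempfile


def build_offset_index(flat_path, index_path):
    """Scan the flat file once and persist each record's byte offset by key."""
    with open(flat_path, "rb") as flat, dbm.open(index_path, "n") as index:
        while True:
            offset = flat.tell()          # byte position where this record starts
            line = flat.readline()
            if not line:
                break
            key = line.split(b"|", 1)[0]  # assumed: first field is the record key
            index[key] = str(offset).encode("ascii")


def fetch_record(flat_path, index_path, key):
    """Seek straight to one record using the persistent offset index."""
    with open(flat_path, "rb") as flat, dbm.open(index_path, "r") as index:
        offset = int(index[key.encode("utf-8")].decode("ascii"))
        flat.seek(offset)
        return flat.readline().rstrip(b"\r\n").decode("utf-8")


# Small demonstration with a temporary three-record flat file.
workdir = tempfile.mkdtemp()
flat_path = os.path.join(workdir, "players.txt")
index_path = os.path.join(workdir, "players.idx")
with open(flat_path, "w", encoding="utf-8", newline="\n") as f:
    f.write("1|Amy|Blues\n2|Bob|Reds\n3|Chuck|Blues\n")

build_offset_index(flat_path, index_path)
record = fetch_record(flat_path, index_path, "2")
print(record)  # prints "2|Bob|Reds"
```

The full scan happens only when the index is built. Later runs open the existing index and seek directly, though the index must be rebuilt whenever records are inserted, deleted, or change length.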
Assorted Semi-boo-Boos:
(Sorry I can't be more thorough, but I'm pressed for time.)
During a quick read I noticed some statements that seemed a bit off, or maybe just confusing, that could use a bit of either correction or clarification. For example, the section on contemporary use refers to several legacy products. Also, address books and SQLite are both given as examples of flat file storage. Address books are structured in various ways. Some are flat files, but others are not. As I recall, BBDB has some hierarchical structure, and vCard, which is used for persistence as well as data exchange, is flexible enough to adapt to various structures. A vCard file can be flat, but it can also be a graph. SQLite is a surprisingly full-featured SQL database that can be used to manage flat files but is capable of much, much more. I have a feeling that some of this stuff may be accidental artifacts of editing rather than misunderstanding and that a once-over to clean up and clarify would probably help a lot. 20:35, 31 May 2015 (UTC) — Preceding unsigned comment added by Ericfluger (talk • contribs)
- Agreed. SQLite is backed by a file, but it does its own indexing and in-place updates on top of that. I will remove SQLite as an example if nobody objects soon. --Damian Yerrick (talk) 16:16, 12 June 2015 (UTC)
- SQL Server stores its data in one file. Should we list it as a flat file database as well? — Preceding unsigned comment added by 184.108.40.206 (talk) 08:21, 19 June 2015 (UTC)
I think SQLite should probably count as a flat file database. According to SQLite's website, "[t]he complete state of an SQLite database is usually contained [in] a single file on disk called the 'main database file'" (The SQLite Database File Format). That's straight from the SQLite webpage. Claystu (talk) 16:55, 25 June 2015 (UTC)
- The lead section of this article states that "the file must be read in its entirety into the computer's memory" and that changes are made by modifying the data in memory and then writing the entire database "out in its entirety to the host's file system." Though an SQLite database file appears as a single binary file to the host operating system, SQLite modifies the file in place a block at a time and does its own on-disk indexing. The only time SQLite rewrites the whole file is during a VACUUM statement. --Damian Yerrick (talk) 15:19, 12 July 2015 (UTC)
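For contrast with SQLite's block-at-a-time updates, the whole-file cycle quoted from the lead section might look like the following Python sketch. The "|"-delimited layout, the file name, and the helpers load and save are assumptions for illustration only. Writing to a temporary file and renaming it is one common way to do the rewrite, not something the article prescribes.

```python
import os
import tempfile


def load(path):
    """Read the entire flat file into memory as a list of rows."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n").split("|") for line in f]


def save(path, rows):
    """Write every row back out, replacing the old file in one step."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8", newline="\n") as f:
        for row in rows:
            f.write("|".join(row) + "\n")
    os.replace(tmp_path, path)  # swap the complete new file in for the old one


# Demonstration: changing one field still rewrites the whole file.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "players.txt")
save(path, [["1", "Amy", "Blues"], ["2", "Bob", "Reds"]])

rows = load(path)       # whole file read into memory
rows[1][2] = "Blues"    # modify the data in memory
save(path, rows)        # whole file written back out
reloaded = load(path)
print(reloaded)
```

Under this model the cost of any update grows with the size of the file, which is the behaviour the comment above says SQLite avoids outside of VACUUM.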