User:Jeremygbyrne/Datapool

Datapool is a noun, a verb and a specific historical event which took place on 01 July 2006.

According to its proponents, the datapool (n) is the emerging successor to the world wide web — one which promises to function as a universal information storehouse, a global library and ultimately the permanent repository of all shared human knowledge.

At its most abstract, the datapool is a virtual distributed datastore, comprising the sum total of publicly-accessible information (as well as a large amount of private information, stored securely). At a more practical level, the datastore is an automated peer-to-peer filesharing system comprising millions of small, self-contained programs running on up to 77% (January 2007 figures) of all internet-connected "personal computing devices". It operates as a gigantic global hard disk drive (or a RAID array, to stretch the analogy) optimised to provide high-speed reliable storage and retrieval of information and to be protected against any and all outside interference.

The datapool was designed to provide guaranteed availability, absolute security and high-speed accessibility for all its information, and to intelligently manage its huge but limited resources to best approach that ideal.

The PÛL Event

The meaning of datapool (v) is connected to the event which took place on 01 July 2006 (GMT). It refers to the "pooling" of data which took place when the datapool "went live". In addition, the event the media labelled "The PÛL Event" (see Trivia, below) is known by its originators and supporters as "the datapool".

The "declaration phase" (effectively the public launch) was advertised via mass proxy-cache replacement of advertising blocks on tens of thousands of major websites worldwide (much of it legitimately negotiated with ISPs, some involving a spoofing exploit against the caching application squid; the datapool organisation has subsequently declared that "attention is something [they’re] prepared to take" without compensation).

Initially, no uninstall option was available, and the mechanism by which the datapool worked was cloaked in "industrial secrecy", awaiting a "patent application" (according to the minimal "About" information accompanying the program). As of 12.00 +0000 04 July 2006, all queries to the datapool generated a pop-up dialog box offering to uninstall the "datapool application" and "immunise" against future installations. (This terminology fueled security concerns, and the perception that the datapool code was "viral" was very quickly promoted by many media organisations.) However, by this time, the use of the datapool was already widespread and popular, and some 81% of net-connected Windows machines were still running the code.

By the time it was announced on 01 July, much of the web had been reflected into the datapool and linked from URLs under datapool.org. The content was organised into a "Library of All World Knowledge" style search system with significantly better response and accuracy (except in a known subset of queries) than Google's engine; it retrieved cached information significantly faster than the web and hooked transparently into the web for ultra-recent content. Also available were a jukebox of "All the World’s Music", which could play in real time at mp3 quality from any ADSL link, and the ability to securely and transparently back up hard drives based on earned micro-credits.

The PÛL Event wiped 40% off the average value of entertainment industry stocks on US stockmarkets. Apple Computer’s stock dropped a mere 5%, as their wireless "ePod" became the first consumer device to ship with a native compilation of the open-source datapool code in September 2006 (iTunes was sold to EMI two months later for 10% of its pre-launch value). Microsoft's value was reduced by 20% overnight.

The Safe Data Act (2006) was introduced into the Republican-dominated US Congress, seeking to make it illegal to execute the datapool code, but by the time a modified provision was signed into law by President Bush, it was mid-2007 and the damage had been done. The even more draconian Entertainment Industry Preservation Act (2007), in which the active presence of the code constitutes evidence of copyright violation, has been widely ignored by State law enforcement officials.

History

The datapool project was begun by Russian university student hackers (led by a Trotskyist ideologue named Peter Petrovich) working in secret from the mid-eighties, as a weapon against the US. During the first two decades of its development, the team designed general classification systems for knowledge, fleshed out the concepts surrounding data perpetuity, micro-economics for resource sharing, and the mathematics of fragmentation and redundancy.

The first "release code" from the project was the vector program, which was, over a period of frantic development in late 2001, optimised and surreptitiously inserted into several dozen distributions of major shareware applications, anti-virus patches, screensavers and the like. These vectors were "seeded" in late 2003 with hundreds of genetically-programmed variant datapool storage bots, which immediately began mirroring as much valuable data as they could locate, coordinating their actions via communication "semaphores" including pseudo-spam and steganographically-hidden instructions.

The datapool organisation (which operated from the domain dartcalm.org) had, by the mid-nineties, begun to employ and subcontract from a veritable army of independent hacker and security groups and programmers-for-hire. Much of the organisation's funding for development and deployment in the two years preceding the Event came from running protection and loan systems within massively multiplayer online RPGs, and trading in virtual currency and goods.

The organisation has not been available for comment since October 2007.

Technical Details

The vector code for the datapool was a 4 KB Windows executable of hand-written machine code which had found its way onto 98% of internet-connected Windows machines, piggybacked across the internet on many legitimate releases of widely-used downloadable applications, operating system patches, viruses etc. These vectors were activated in November 2003 by the live datapool code: several hundred 8-12 KB GA-variant executables, designed to stay beneath the virus-detection radar, each of which performed the same function in its own unique way.

This final version hooks into the OS in such a way as to execute only during wait states (i.e. when the CPU would otherwise not be in use). It automatically mirrors cached web content and certain media files (those in "shared folders", although this has become a highly contentious issue) into a fully fragmented, massively redundant distributed datastore which uses unallocated blocks on the hard drive for storage. Information is distributed across the internet using spare bits in the TCP/IP implementation. Most recently, the code has implemented a "micro-credit economy" to compensate willing hosts for hard drive wear-and-tear and power used.

The client application(s) install a web proxy on port 8880, intercepting any calls to "pool://" resources. The client also generates a drag-and-drop "web folders" style interface for insertion and retrieval of files to and from the datapool, and presents datapool files to the browser as web content (allowing datapool-based "websites" using standard html).
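The interception rule described above can be sketched in a few lines. This is only an illustration: the port number and the "pool://" scheme come from the text, while the function name `route` and the example addresses are hypothetical, and nothing here reflects the actual client code.

```python
from urllib.parse import urlsplit

PROXY_PORT = 8880  # local proxy port named in the text


def route(url: str) -> str:
    """Decide which backend a request should go to.

    Hypothetical helper: returns "datapool" for pool:// resources
    and "web" for everything else.
    """
    scheme = urlsplit(url).scheme.lower()
    if scheme == "pool":
        return "datapool"
    return "web"


if __name__ == "__main__":
    # Example addresses are invented for illustration only.
    for address in ("pool://library/index.html", "http://example.org/"):
        print(address, "->", route(address))
```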

When a file is inserted into the datapool, it is first padded to ten times the minimum block size, then cryptographically scrambled, then divided into at least ten fragments (or as many more as are needed to ensure that no fragment exceeds the maximum block size), with each fragment including checksum information to guard against data corruption.
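A minimal sketch of the insertion steps described above (pad, scramble, fragment, checksum) might look like the following. The block sizes, the length header used to undo the padding, the function names, and the choice of SHA-256 are all assumptions made for illustration; the XOR keystream is a stand-in for the unspecified scrambling step and is not secure.

```python
import hashlib

MIN_BLOCK = 1024      # assumed minimum block size in bytes
MAX_BLOCK = 4096      # assumed maximum block size in bytes
MIN_FRAGMENTS = 10    # "at least ten fragments", per the text


def pad(data: bytes) -> bytes:
    """Pad to at least ten times the minimum block size.

    An 8-byte length header (an assumption) lets the padding be removed.
    """
    body = len(data).to_bytes(8, "big") + data
    target = max(len(body), MIN_FRAGMENTS * MIN_BLOCK)
    return body + b"\x00" * (target - len(body))


def scramble(data: bytes, key: bytes) -> bytes:
    """Placeholder scrambling: XOR with a SHA-256 counter keystream.

    Applying it twice with the same key restores the input.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def fragment(data: bytes) -> list[tuple[bytes, str]]:
    """Split into >= 10 pieces, none larger than MAX_BLOCK, each with a checksum."""
    count = max(MIN_FRAGMENTS, -(-len(data) // MAX_BLOCK))  # ceiling division
    size = -(-len(data) // count)
    pieces = [data[i:i + size] for i in range(0, len(data), size)]
    return [(piece, hashlib.sha256(piece).hexdigest()) for piece in pieces]


def prepare(data: bytes, key: bytes) -> list[tuple[bytes, str]]:
    """Run the three steps in the order the text gives them."""
    return fragment(scramble(pad(data), key))


def restore(fragments: list[tuple[bytes, str]], key: bytes) -> bytes:
    """Verify checksums, reassemble, descramble and strip the padding."""
    for piece, digest in fragments:
        if hashlib.sha256(piece).hexdigest() != digest:
            raise ValueError("fragment failed checksum")
    padded = scramble(b"".join(piece for piece, _ in fragments), key)
    length = int.from_bytes(padded[:8], "big")
    return padded[8:8 + length]
```

Padding before fragmenting means even a very small file yields ten full-sized fragments, so fragment size alone reveals little about the original file.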

The following is subject to change

The client then "announces" to the datapool that a file of a certain size needs to be inserted. The datapool matches known storage availability to blocksize and uploads three copies of each block/fragment, creates a file-index which will keep track of the locations of all the individual fragments, and "announces" a file-indexkey, which will point to this file-index and which will allow later recall of the file, for the client to store. When a file is retrieved, the client "announces" the relevant file-indexkey, and the datapool uses this to provide, from the file-index, the locations of the copies of each of the file's fragments which will reach the client quickest (taking into account bandwidth and availability of the various storage location options). The client then retrieves each of the fragments making optimal use of available bandwidth, assembles, descrambles and presents the file to the browser for display/local storage.
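The announce, store and retrieve cycle above can be modelled in memory as a rough sketch. Everything named here (`Host`, `Pool`, `free_blocks`, `latency_ms`, the use of random hex keys) is a hypothetical simplification: a single latency figure stands in for "bandwidth and availability", and one central object stands in for the distributed datapool.

```python
import secrets
from dataclasses import dataclass, field

REPLICAS = 3  # "three copies of each block/fragment", per the text


@dataclass
class Host:
    """A storage location (hypothetical model)."""
    name: str
    free_blocks: int
    latency_ms: float
    online: bool = True
    blocks: dict = field(default_factory=dict)


class Pool:
    """Toy stand-in for the datapool's placement and lookup logic."""

    def __init__(self, hosts):
        self.hosts = hosts
        self.indexes = {}  # file-indexkey -> file-index

    def insert(self, fragments):
        """Store three copies of each fragment and return a file-indexkey."""
        file_index = []
        for frag in fragments:
            candidates = sorted(
                (h for h in self.hosts if h.online and h.free_blocks > 0),
                key=lambda h: -h.free_blocks,
            )
            if len(candidates) < REPLICAS:
                raise RuntimeError("not enough storage hosts available")
            chosen = candidates[:REPLICAS]
            block_id = secrets.token_hex(8)
            for host in chosen:
                host.blocks[block_id] = frag
                host.free_blocks -= 1
            file_index.append((block_id, chosen))
        key = secrets.token_hex(16)
        self.indexes[key] = file_index
        return key

    def retrieve(self, key):
        """Fetch each fragment from the quickest reachable copy and reassemble."""
        parts = []
        for block_id, holders in self.indexes[key]:
            live = [h for h in holders if h.online]
            if not live:
                raise LookupError("no copy of fragment reachable")
            fastest = min(live, key=lambda h: h.latency_ms)
            parts.append(fastest.blocks[block_id])
        return b"".join(parts)
```

Because each fragment is held by three hosts, retrieval in this sketch still succeeds when any single host goes offline.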

Zero-Day purchases

Much has been made since 2013 of speculation that early datapool vector code was inserted into certain development environments, not many years ahead as has been claimed, but perhaps as little as months, through the use of three large collections of zero-day exploits purchased by associates of the main development team in early 2004. One such exploit is reputed to involve legacy NTFS code in XP which allowed the auto-execution of arbitrary file types.

Non-Windows Clients/Developments

Linux and Macintosh executables were available for download immediately from the datapool itself, and from a variety of websites. A number of mobile phone pool clients were also available at "launch", and many more have appeared as the technology matured. The mobile datapool is becoming (as of late 2007) the interchange system of choice for the pod craze which kicked into high gear in early 2006 with the convergence of mobile phone, media/game player, organiser and digital camera technology (and was ultimately driven by cheap, high-quality Chinese units, mirroring the Transistor Radio craze of the '70s).

The datapool provides a universal store for multimedia content. Poolplayer (a web-style application with optional OS hooks) software has rapidly become the "internet" connection method of choice on the majority of recent model pods. Most variants include customised content selection, grouping and indexing, inherent "publishing" capability, collaborative customisation features with variably private/secure storage of personal variants, "auto-marketing" features via peer feedback and web-of-trust review networking, as well as highly flexible "skinning", community communication features etc.

Trivia

The AI Seven (in her guise as "Czech teenager" Sophia Marie) provided very useful information to the datapool organisation in late 2004 which helped establish a major new revenue line through online auctions, based on her experience and knowledge of the workings of the Standingstone organisation.
Initial small-scale (10,000+ hosts) testing of the datapool was reportedly done via an exploit of the Google Toolbar.
In late 2005 the vector code was detected and classified as windows.PÛL by one anti-virus company, based on a repeating sequence of three bytes at the tail of the code. Its significance was not realised until nearly 12 months later when the datapool was announced, although there were rumours linking windows.PÛL to "file-sharing bots" in early 2006.
Uptake (retention) of the datapool was spurred by concurrent (June – August 2006) revelations about the use of "digital mercenaries", including Russian crime networks, by a number of private networking companies to knock out the public internet using coordinated spam-flood, worm and DDoS attacks.
Petrovich's 2009 interview for Pravda cites La Some's "Cybercrime: Fight Club Catastrophe or Wealth Redistribution?" (2006) in reference to his organisation's intention to negate the need for credit card transactions on the internet.