Wikipedia:Secret ballot process

This page lays out a possible means to prevent fraud in elections conducted on Wikipedia using a system of secret ballot. While secret ballots are favoured by a large number of editors, a minority raise the disadvantages of moving away from open ballots. These disadvantages centre on the risk of electoral fraud. The main aim of the proposal is thus to maintain transparency in the vote (comparable to that of an open ballot) while introducing secrecy to the balloting process, i.e. individual voters' ballots cannot be identified.

The method allows results to be verified as being valid through an open process involving any Wikipedian. The method does not depend on a select group of users overseeing the ballot to ensure its validity:

  • All users can assure themselves as to the validity of the vote
  • No-one - not even those with direct access to the database - can manipulate the vote without serious risk of getting caught[1]

The method proposed here could be enabled through an extended version of SecurePoll, which already implements many of the features that it proposes.

Summary

The proposed method is a process that roughly corresponds to the following six steps:

  1. Prepare a list of eligible voters ahead of the election (as wide or as narrow as necessary)
  2. Assign every voter a unique voter ID that is secret but can be demonstrated to refer uniquely to them
  3. Publish a list of eligible voters and voter IDs (separately) so that it can be seen that there can only be one vote per voter and that every voter is a genuine account
  4. Allow every voter to vote in secret, providing each voter with a sequence ID and time stamp for their vote
  5. Publish the complete list of ballots after the election has closed, showing voter IDs, sequence numbers, vote cast and time stamp
  6. Allow every Wikipedian to inspect a published list of ballots and to raise anomalies that positively identify fraud

Attempts at fraud that the system would identify include:

  • Ineligible voters
  • Voter imitation
  • Tampered ballots
  • Ballot stuffing
  • Suspicious voting patterns
  • Incorrectly calculated results

Allowing candidates to observe the list of ballots (minus the actual votes) as they are cast would allow for additional safeguards. Recording, but not publishing, voters' IP addresses would allow for additional means of verification should serious questions be raised about the validity of the poll.

Detailed requirements

Eligibility

Voting is limited to a defined set of eligible voters according to prior agreed criteria and principles. This defined set may be as wide or as narrow as agreed necessary or beneficial. The purpose of limiting eligibility to a defined set is to facilitate later validity checking of the vote.

Voter IDs

Every user that is eligible to vote is assigned a voter ID. Voter IDs must have three qualities:

  • Be unique to each voter
  • Be incapable of being used to identify the voter
  • Be capable of being demonstrated to refer specifically and uniquely to that voter

Voter IDs should be generated anew for each election (to prevent voters being identified by voting patterns across several elections). Every user should be perpetually able to retrieve their voter ID for any given election in which they were, are or will be eligible to vote (to facilitate verification of the register of voters and of the results). Every voter's voter ID should be kept private to that user, e.g. by requiring a log-in to retrieve it.

A list is maintained that links voter IDs with actual user accounts. This list, however, should never be published except in the case of a serious suspicion of fraud. In that event, the decision whether to release the linking list publicly or only to an agreed set of users will be made at that time, as the need demands.

Example

One possible way of generating a voter ID is to use an SHA1 hash. An SHA1 hash could be generated from the user's username and randomly generated words. For example, my username is Rannpháirtí anaithnid. By adding random words to this name (e.g. "bicycle idiom"), an SHA1 hash can be generated from the resulting phrase. When I log into the voting area, I could see the randomly generated phrase and the SHA1 hash of it:

Election | Phrase | SHA1
ArbCom 2010 | bicycle idiom - Rannpháirtí anaithnid | a938ee398a634ed60e1fff5437493088bca654f8

While it is effectively impossible to link the hash ("a938ee398a634ed60e1fff5437493088bca654f8") to my username ("Rannpháirtí anaithnid") from the hash alone, I can use an SHA1 calculator (e.g. http://www.xorbin.com/tools/sha1-hash-calculator) to verify that the hash matches the randomly generated phrase containing my username ("bicycle idiom - Rannpháirtí anaithnid"). Thus this voter ID is: 1.) unique to me, 2.) incapable of being used on its own to identify me, and 3.) capable of being demonstrated to refer specifically and uniquely to me.
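
The generation and verification steps above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the word list, the "words - username" phrase layout, the UTF-8 encoding and the function names are all hypothetical, and none of it is taken from SecurePoll.

```python
import hashlib
import secrets

# Hypothetical word list; a real deployment would need a far larger source
# of randomness so that phrases cannot be guessed from known usernames.
WORDS = ["bicycle", "idiom", "lantern", "marble", "orchard", "violin"]


def make_phrase(username: str, n_words: int = 2) -> str:
    """Build a phrase of randomly chosen words followed by the username."""
    words = " ".join(secrets.choice(WORDS) for _ in range(n_words))
    return f"{words} - {username}"


def voter_id(phrase: str) -> str:
    """Return the SHA1 hex digest of the phrase (UTF-8 encoding is assumed)."""
    return hashlib.sha1(phrase.encode("utf-8")).hexdigest()


def verify_voter_id(phrase: str, claimed_id: str) -> bool:
    """Check that a voter ID matches the phrase shown to the voter."""
    return voter_id(phrase) == claimed_id.lower()


phrase = make_phrase("Rannpháirtí anaithnid")
my_id = voter_id(phrase)
print(phrase, my_id, verify_voter_id(phrase, my_id))
```

Because every username is public, the secrecy of the ID rests entirely on the random words. With only a couple of words from a short list, anyone could hash every combination for every username and recover the link, so the random part would need to be much longer in practice.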

Voter list

Before the vote takes place the following two lists are published:

  • A complete list of users that are eligible to vote in that election (e.g. Rannpháirtí anaithnid (talk · contribs))
  • The complete list of voter IDs for eligible voters appropriate to that election (e.g. a938ee398a634ed60e1fff5437493088bca654f8)

The two lists should be published separately (i.e. not in one table). The list of eligible voters should be alphabetised for ease of reference. The list of voter IDs should be randomised so that it cannot be used to match voters with their IDs. Both lists should be available for inspection for an appropriate period ahead of any election.
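
A minimal sketch of how the two lists might be produced from the private linking list, assuming that list is held as a simple mapping from username to voter ID (the function name and data layout are hypothetical):

```python
import secrets


def publish_lists(register: dict[str, str]) -> tuple[list[str], list[str]]:
    """Split a private username -> voter ID mapping into two public lists."""
    # Usernames are alphabetised for ease of reference.
    usernames = sorted(register, key=str.casefold)
    # Voter IDs are shuffled with the operating system's randomness so that
    # their order carries no information about the order of usernames.
    voter_ids = list(register.values())
    secrets.SystemRandom().shuffle(voter_ids)
    return usernames, voter_ids


register = {"Zoe": "id-3", "alice": "id-1", "Bob": "id-2"}  # made-up data
usernames, voter_ids = publish_lists(register)
print(usernames)
print(voter_ids)
```

The two lists have the same length but no shared ordering, so anyone can check the counts without being able to pair an account with an ID.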

Voting sequence ID

When a user votes they receive a sequence ID. This sequence must:

  1. Follow a prior agreed method
  2. Allow a user that is given a sequence ID to identify the sequence IDs that immediately follow and immediately precede their own

Sequence IDs should begin at an arbitrary point in the sequence so that voters cannot tell how many voters have cast ballots before them in the election (in order to maintain the secrecy of the ballot).
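
One method that would satisfy both requirements is consecutive integers starting from a random offset. The sketch below assumes that method; the class name, the size of the offset range and the helper function are illustrative choices, not part of the proposal.

```python
import secrets
from typing import Optional


class SequenceCounter:
    """Issue consecutive sequence IDs starting from an arbitrary point."""

    def __init__(self, start: Optional[int] = None) -> None:
        # A random starting point hides how many ballots came before.
        self._next = secrets.randbelow(10**9) if start is None else start

    def issue(self) -> int:
        seq = self._next
        self._next += 1
        return seq


def neighbours(seq_id: int) -> tuple[int, int]:
    """Return the sequence IDs immediately preceding and following seq_id."""
    return seq_id - 1, seq_id + 1


counter = SequenceCounter(start=48213)  # fixed start only for the demonstration
first = counter.issue()
second = counter.issue()
print(first, second, neighbours(second))
```

A voter holding a sequence ID can then look for both neighbours in the published ballots, although the first and last ballots cast will each have only one.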

Publication of ballots

When the vote is over, a complete list of ballots cast is published. This publication lists all ballots cast, showing the following information for each ballot:

  1. The vote cast
  2. The voter ID of the voter that cast the ballot
  3. The sequence ID of the ballot
  4. The time the vote was cast

Supervision (optional)

As the vote is taking place, candidates (or nominated supporters of options that are being voted on) and other agreed parties - possibly everyone - have access to observe the vote. Observers are able to see the current list of ballots at any given time, minus the vote cast on each ballot. Doing so makes it difficult to stuff the ballot with unattributed ballots that would be filled in after the election has closed.

If this option is employed, candidates and other parties with access to this information should not reveal the number of ballots cast or other information, nor remonstrate publicly about the voting process, while the vote is taking place.

Recording of IP addresses

The IP address from which each vote is cast should be recorded but not revealed publicly or to candidates. In the event of serious suspicion of fraud, this data could be revealed (probably to a limited set of users) to facilitate investigation.

Verification of the vote

Using the list of voters and the list of voter IDs, all users can:

  1. Satisfy themselves that all voters listed as being eligible to vote are genuine accounts operated by human editors
  2. Verify that the number of valid voter IDs for the election is the same as the number of eligible voters
  3. Verify that their voter ID appears on the list and is demonstrably unique to them
  4. Inspect the voter list to identify sock-puppets, etc.
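
The mechanical parts of these checks (steps 2 and 3) could be automated along the following lines. This is a sketch with made-up data and a hypothetical function name; judging whether accounts are genuine or sock-puppets remains a matter for human inspection.

```python
def check_register(usernames, voter_ids, my_username, my_voter_id):
    """Run the countable checks on the two published lists."""
    return {
        # One voter ID per eligible voter, no more and no fewer.
        "counts_match": len(usernames) == len(voter_ids),
        # No voter ID appears twice.
        "ids_unique": len(set(voter_ids)) == len(voter_ids),
        # The checking user appears on both lists.
        "i_am_listed": my_username in usernames,
        "my_id_listed": my_voter_id in voter_ids,
    }


report = check_register(
    usernames=["alice", "Bob", "Zoe"],
    voter_ids=["id-2", "id-3", "id-1"],
    my_username="Bob",
    my_voter_id="id-2",
)
print(report)
```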

Using the published ballots, all users can:

  • Verify that their ballot is on the list and shows the correct vote, voter ID, sequence ID and time stamp (if they voted)
- OR -
  • Verify that their voter ID does not appear on the list of ballots (if they did not vote)

All users will be able to use the publication of ballots to independently calculate the result.
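
As an illustration, the individual check and the independent count could look like this, assuming the published ballots are available as records with the four fields listed under "Publication of ballots". The `Ballot` type, the field names and the time stamp format are assumptions, and a simple plurality count stands in for whatever counting method the election actually uses.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Ballot:
    vote: str
    voter_id: str
    sequence_id: int
    timestamp: str  # format is an assumption, e.g. an ISO 8601 string


def check_own_ballot(published: list[Ballot], my_voter_id: str,
                     my_receipt: Optional[Ballot]) -> bool:
    """Compare the published list with what the voter recorded when voting.

    my_receipt is None for a user who did not vote.
    """
    mine = [b for b in published if b.voter_id == my_voter_id]
    if my_receipt is None:
        return not mine  # a non-voter's ID must not appear at all
    return mine == [my_receipt]  # exactly one ballot, matching in every field


def tally(published: list[Ballot]) -> Counter:
    """Count the votes for each option directly from the published ballots."""
    return Counter(b.vote for b in published)


published = [
    Ballot("Option A", "id-1", 101, "2010-12-01T10:00:00Z"),
    Ballot("Option B", "id-2", 102, "2010-12-01T10:04:00Z"),
    Ballot("Option A", "id-3", 103, "2010-12-01T10:09:00Z"),
]
print(check_own_ballot(published, "id-2", published[1]))
print(tally(published))
```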

Identifying anomalies

Anomalies that users will be able to identify include:

  • Unlisted voter ID — An ineligible vote
  • Incorrect sequence ID — A stuffed ballot
  • Incorrect vote — A tampered ballot
  • Incorrect time stamp — A tampered ballot (possibly to improve the appearance of a tampered election)

If a voter ID is listed that belongs to a user who did not vote, it risks being identified as voter imitation. Additionally, if editors can identify suspicious patterns (votes, time stamps, etc.), these may also indicate fraud.
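
Some of these checks can be run by anyone over the whole published list. The sketch below assumes that sequence IDs are consecutive integers and that time stamps are directly comparable (for example, ISO 8601 strings in a single time zone). The record layout and function name are hypothetical. An incorrect vote or an imitated voter cannot be found this way, since only the voter concerned can recognise those.

```python
from collections import Counter


def find_anomalies(ballots, valid_voter_ids):
    """Return a list of (description, detail) pairs for suspicious entries.

    ballots: iterable of dicts with keys 'voter_id', 'sequence_id', 'timestamp'.
    """
    ordered = sorted(ballots, key=lambda b: b["sequence_id"])
    anomalies = []

    # Voter IDs that were never on the published list of voter IDs.
    for b in ordered:
        if b["voter_id"] not in valid_voter_ids:
            anomalies.append(("unlisted voter ID", b["sequence_id"]))

    # The same voter ID attached to more than one ballot.
    for voter_id, n in Counter(b["voter_id"] for b in ordered).items():
        if n > 1:
            anomalies.append(("voter ID used more than once", voter_id))

    # Breaks in the agreed sequence, and time stamps that run backwards.
    for prev, cur in zip(ordered, ordered[1:]):
        step = cur["sequence_id"] - prev["sequence_id"]
        if step == 0:
            anomalies.append(("duplicate sequence ID", cur["sequence_id"]))
        elif step > 1:
            anomalies.append(
                ("gap in sequence IDs", (prev["sequence_id"], cur["sequence_id"]))
            )
        if cur["timestamp"] < prev["timestamp"]:
            anomalies.append(("time stamp out of order", cur["sequence_id"]))

    return anomalies


sample = [  # made-up data
    {"voter_id": "id-1", "sequence_id": 10, "timestamp": "2010-12-01T10:00:00Z"},
    {"voter_id": "id-x", "sequence_id": 11, "timestamp": "2010-12-01T10:05:00Z"},
    {"voter_id": "id-2", "sequence_id": 13, "timestamp": "2010-12-01T10:02:00Z"},
]
for anomaly in find_anomalies(sample, {"id-1", "id-2", "id-3"}):
    print(anomaly)
```

A flagged entry is a prompt for further investigation, not proof of fraud in itself.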

To facilitate verification of the vote, a forum should be set up to allow all editors to:

  • Confirm that the vote is correct with respect to their own vote (or that their own vote has not been listed)
  • Confirm that the vote is correct with respect to other votes
  • Raise issues (suspicious voting patterns, etc.) that they see

Instructions on how to verify the vote, and invitations to do so, should accompany the announcement of the result (and the voting instructions before the vote takes place). Other areas where the result of the vote is announced should link to these instructions and invite all editors to verify the result.

Notes

  1. ^ Using the method, it is still possible to defraud the vote through voter imitation. However, with every fraudulent ballot cast through voter imitation, the risk of getting caught by the genuine owner of that ballot increases.

See also