= Arquivo.pt =

Arquivo.pt, formerly known as the Portuguese Web Archive, is a web archive that preserves Web content dating back to 1996. It is a service of the Fundação para a Ciência e Tecnologia (FCT) and was founded at the Fundação para a Computação Científica Nacional (FCCN) on 8 November 2007.

Arquivo.pt regularly collects the websites that make up the Portuguese Web, that is, all websites under the .pt top-level domain, as well as websites of national interest. Preserved content becomes available to any user on the Arquivo.pt website one year after its collection.

As of March 2025, Arquivo.pt stores over 21 billion webpages from 47 million websites, totaling 1.4 petabytes of data.

== History ==
The idea of archiving the Portuguese Web originated in 2001 with the project tumba!, developed by the XLDB research group at the Faculty of Sciences of the University of Lisbon and supported by FCCN (Fundação para a Computação Científica Nacional). The project collected about 57 million pieces of content, mainly textual. Tomba started from this project.

On 8 November 2007, the Portuguese Web Archive project was created at FCCN, combining the resources and skills acquired in the previous project. The project was led by Daniel Coelho Gomes from 2007 to 2025. At the beginning of 2008, the project team performed its first web crawl of .pt websites. The project had a two-year maturation period. It has since been transformed into a permanent service of FCT.

== Services ==

=== Search and access ===
Arquivo.pt provides a tool for searching archived web pages by entering a URL. This allows users to access versions of the same page captured on different dates. Full-text search is also supported.
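Choosing among several dated versions of one page can be illustrated with a minimal sketch. It assumes that each capture is identified by a 14-digit `YYYYMMDDhhmmss` timestamp, a convention common to web archives; the function name `closest_capture` and the sample timestamps are hypothetical and do not describe how Arquivo.pt itself is implemented.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y%m%d%H%M%S"  # assumed 14-digit capture timestamp


def closest_capture(captures, target):
    """Return the capture timestamp nearest in time to the target timestamp."""
    target_dt = datetime.strptime(target, TIMESTAMP_FORMAT)
    return min(
        captures,
        key=lambda ts: abs(datetime.strptime(ts, TIMESTAMP_FORMAT) - target_dt),
    )


# Hypothetical capture timestamps for a single URL (not real archive data).
captures = ["19961013145650", "20050301120000", "20150615083000"]
print(closest_capture(captures, "20060101000000"))  # prints 20050301120000
```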

On 24 March 2021, Arquivo.pt introduced an image search feature, known as Dionisius. This tool allows users to search for images archived from the web, dating back to 1996. Users can find images that are no longer available on the live web and can also locate the original web pages where these images were published.

Pages can also be accessed programmatically through APIs, which were introduced in 2012.
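As a rough illustration of programmatic access, the sketch below only builds a full-text search request URL and does not contact the service. The endpoint `https://arquivo.pt/textsearch` and the parameter names `q`, `maxItems`, `from` and `to` are assumptions here and should be checked against the official Arquivo.pt API documentation; `build_search_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumed endpoint; verify against the official Arquivo.pt API documentation.
TEXT_SEARCH_ENDPOINT = "https://arquivo.pt/textsearch"


def build_search_url(query, max_items=10, date_from=None, date_to=None):
    """Build a search URL; date bounds are optional 14-digit timestamps."""
    params = {"q": query, "maxItems": max_items}
    if date_from is not None:
        params["from"] = date_from  # assumed parameter name
    if date_to is not None:
        params["to"] = date_to  # assumed parameter name
    return f"{TEXT_SEARCH_ENDPOINT}?{urlencode(params)}"


url = build_search_url("fccn", max_items=5, date_from="19960101000000")
print(url)
```

A client would then fetch this URL with an HTTP library and parse the response; that step is omitted here because the response format is not described in this article.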

=== ArchivePageNow ===
In 2022, Arquivo.pt launched ArchivePageNow. This feature allows users to archive a web page at a moment of their choosing. The archived pages then remain available for search.

=== Arquivo404 ===
In 2022, Arquivo.pt developed Arquivo404, an algorithm that allows 404 error pages to include a hyperlink to the preserved version of the page at Arquivo.pt.
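The general idea can be sketched as follows. This is not the actual Arquivo404 code: `not_found_body` and the `lookup` callback are hypothetical names, and the lookup is replaced by a small in-memory table with an illustrative archived URL so that the example runs without network access.

```python
from html import escape
from typing import Callable, Optional


def not_found_body(requested_url: str,
                   lookup: Callable[[str], Optional[str]]) -> str:
    """Render a 404 body, adding a link when an archived copy is known."""
    body = "<h1>404 Not Found</h1>"
    archived = lookup(requested_url)  # returns an archived URL or None
    if archived is not None:
        body += (
            '<p>An archived version is available: '
            f'<a href="{escape(archived, quote=True)}">'
            f"{escape(requested_url)}</a></p>"
        )
    return body


# Stand-in for a real archive lookup; both URLs are illustrative only.
fake_index = {
    "http://example.pt/old":
        "https://arquivo.pt/wayback/20100101000000/http://example.pt/old",
}
print(not_found_body("http://example.pt/old", fake_index.get))
```

Passing the lookup as a parameter keeps the page-rendering logic separate from whatever service is queried for archived copies.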

=== Others ===

- CitationSaver - extracts the links contained in documents and archives the corresponding pages
- Open Data - makes available data about the information archived from the web
- Memorial - collection of websites no longer on the live web
- Educational sessions
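The link-extraction step described for CitationSaver can be approximated with a short sketch. It assumes that the document has already been converted to plain text and uses a deliberately simple regular expression; `extract_links` is a hypothetical helper and is not CitationSaver's actual method.

```python
import re

# Simple pattern: http(s) URLs up to whitespace or a common closing delimiter.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_links(text):
    """Return the distinct URLs found in text, in order of first appearance."""
    links = []
    for match in URL_PATTERN.findall(text):
        url = match.rstrip(".,;:")  # drop trailing sentence punctuation
        if url not in links:
            links.append(url)
    return links


sample = (
    "See http://example.pt/report.pdf, and (https://example.org/page). "
    "Also http://example.pt/report.pdf."
)
print(extract_links(sample))
```

Each extracted URL could then be submitted for archiving; that step depends on the archiving service and is not shown.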

== Arquivo.pt Awards ==
Since 2018, the Arquivo.pt Awards have been organized under the sponsorship of the President of Portugal and in partnership with the Público newspaper. They recognize the best investigative works that use the features of Arquivo.pt.

== Awards and recognitions ==

- 2008 - Best Paper Award for its work on measuring the Portuguese web at the Ibero-American IADIS WWW/Internet 2008.
- 2022 - Honour roll for security in Portugal according to the Portuguese Observatory of Internet Technologies.
- 2022 - Best Digital Service award.
- 2023 - Top 3 government digital services in Portugal.
- 2024 - Finalist for The National Archives (UK) Award for Safeguarding the Digital Legacy (Digital Preservation Coalition Awards 2024).
- 2024 - Best Central Public Administration Digital Project.
- 2024 - Digital Transformation 2024 award.
