WebData Research Infrastructure
The project facilitates research on data from the internet. The data is collected from the National Library of Norway’s Web Archive, which has harvested the Norwegian part of the internet since the 1990s. When the project is completed in 2029, users of the infrastructure will be able to search, visualize, and extract historical internet data for research purposes.


Main Objectives
During the project period (2025–29), we will:
- build a research platform for searching, exploring, and retrieving data
- automatically classify and clean texts containing (sensitive) personal data
- annotate data in order to provide analytical services (e.g. event extraction, sentiment analysis, analysis of language development)
- develop the infrastructure in close collaboration with the research community through needs and representation studies

Prerequisites
The Research Infrastructure follows key principles for research data and cultural heritage:
- user-oriented development, where services and tools are designed to meet the needs of researchers
- the FAIR principles, ensuring that research data is Findable, Accessible, Interoperable, and Reusable
- the CARE principles for Indigenous data
- providing as much data as possible to as many as possible, while respecting copyright and data protection legislation

Newsletter
Researchers and other interested parties can subscribe to our newsletter, issued 2–3 times a year. Subscribers receive updates on seminars and workshops and can follow the progress of the project. You might even become a test user.

What Researchers are Saying
WebData is supported by leading scholars and institutions in Norway and abroad. Here are some of the testimonials we received while developing the project.