3  Open Music Observatory: Building a Shared Music Data Space

You can visit the Open Music Observatory at https://openmusicobservatory.eu/. For API-level access, get in touch with Reprex.
Note: Open Music Observatory

Our ambition in developing the Open Music Observatory is to provide the technological basis and a practical roadmap for creating a European Music Observatory in a bottom-up, decentralised way. Instead of waiting for a grand, central agreement, any data owners or collectors who satisfy quality and cooperation rules can add their data. Once the Observatory reaches sufficient maturity, its long-term institutional form can be decided.

The Open Music Observatory (https://openmusicobservatory.eu/) is a cornerstone task of the OpenMusE project (running until 31 December 2025), delivering data collection, processing, dissemination, and innovative services. It is a digital service provider for the music industry, aligned with the European Interoperability Framework, and introduces a unique governance model that adapts best practices from the EU and other sectors.

Transparency note: Following the principles of Open Policy Analysis, we have made all key deliverables (including versions 0.99, 1.01, and 1.1 of the Open Music Observatory document) publicly accessible to foster broad stakeholder engagement and to provide a clear audit trail. These versions are available at https://zenodo.org/records/11564114, while version 1.0 remains internal and was shared only with OpenMusE evaluators. Minor edits, as well as the standardised folders, figures, and bibliographies, can be found at https://github.com/dataobservatory-eu/open-music-observatory. The documentation is also available in PDF, EPUB, and DOCX formats for readers who would like to provide comments.

Citation note: If you refer to the specification of the Open Music Observatory in correspondence, publications, or blog posts, please cite the latest versioned DOI available on Zenodo and, if applicable, include the date of access when referring to material on our GitHub repository.¹

The Music Ecosystem 2025 report already emphasised that the music sector should be understood as a distributed ecosystem where value and knowledge are held by many small actors (Music Moves Europe 2024, pp6–7). This perspective reinforces why centralised repositories fail and why federated observatories, built on cooperation and interoperability, are more realistic.

Over the past decade, feasibility studies, national reports, and EU pilot projects have laid the foundation for the Open Music Observatory. The roadmap (2014–2026) shows a gradual build-up: from local experiments, through cross-border collaborations, to a European-wide federation aligned with cultural data spaces and interoperability frameworks. This trajectory underlines the Observatory’s pragmatic, step-by-step approach to scaling music data infrastructure. DOI: [10.6084/m9.figshare.30073291.v1](https://doi.org/10.6084/m9.figshare.30073291.v1)

3.1 Discussion

3.1.1 Why centralisation is a futile model

Calls for a centralised European database of music often reappear in policy debates, but in practice such proposals are neither realistic nor aligned with current EU strategies. Centralisation assumes that highly diverse data sources can be harmonised within a single repository. In an ecosystem where knowledge is held by tens of thousands of micro-enterprises, NGOs, collective management organisations, and heritage institutions — each operating under distinct legal frameworks — this assumption is untenable.

CITF reaches the same conclusion. It notes that copyright data is inherently distributed across many custodians with incompatible mandates and governance models, and that no single centralised registry can meet the legal, operational, and semantic requirements of modern copyright workflows. Instead, it argues that future-proof infrastructures must rely on federated, lifecycle-aware registries capable of exchanging trustworthy provenance and rights metadata while preserving institutional autonomy. This conclusion also reflects the broader political economy of copyright data infrastructures: attempts at centralisation repeatedly fail because dominant actors have limited incentives to share authoritative data under neutral governance conditions, reinforcing the need for federated, incentive-compatible architectures (Bodo 2026).

Note: Lessons from the Global Repertoire Database

Between 2008 and 2014, European and global stakeholders pursued the Global Repertoire Database (GRD) as a solution to the chronic fragmentation of musical works data. Backed by collective management organisations (CMOs), major publishers, and digital service providers, the GRD aimed to establish a single, authoritative global database of musical works and rightsholders. Its promise was that licensees—especially online platforms—could obtain reliable rights information from one source, reducing duplication and disputes.

However, the GRD ultimately collapsed before launch, despite several years of investment and the establishment of a London-based operating company. A similar initiative, the International Music Registry, backed by the World Intellectual Property Organization, ended with similar results.²

Post-mortems identified several reasons:

  • Governance conflicts: disagreements between major publishers, CMOs, and other stakeholders over who would control and fund the database.

  • High costs and unclear incentives: the project’s projected maintenance costs exceeded what many participants—particularly smaller CMOs—were willing or able to sustain.

  • Asymmetries of power: large publishers and CMOs were reluctant to share sensitive commercial data on equal terms with competitors.

  • Lack of trust: concerns over who would “own” the data and how revenues would be redistributed undermined cooperation.

The failure of the GRD is now widely cited in policy and industry discussions as evidence of the limits of centralised, “single-database” solutions in the music sector. Similar initiatives have also failed at national level.

We can also add that centralisation, even if it were possible, would pose a new risk of creating monopolistic gatekeepers to the music ecosystem.

The predecessor of the Open Music Europe project, CEEMID, was built on the lessons of these failures and on the insights of a decentralised, data-space–like approach (Antal 2020). Such federated, interoperable approaches, in which data remains with its custodians but can be linked through shared identifiers, standards, and protocols, have proven more viable. CISAC’s CIS-Net, Europeana in the heritage field, and emerging European data space initiatives exemplify this more distributed model of governance.

CITF’s analysis reinforces these lessons. It identifies governance opacity, unclear mandates, and incompatible identifier regimes as recurring causes of failure in large-scale copyright registries. It stresses that unless registries adopt transparent governance, open identifiers, and shared semantic profiles, centralised projects inevitably collapse under conflicting incentives.³

EU infrastructure initiatives have already moved beyond this logic. Since the 2000s, projects such as Europeana, the European Open Science Cloud (EOSC), the European Collaborative Cloud for Cultural Heritage (ECCCH), and DARIAH have all adopted federated architectures, linking distributed collections through shared standards and interoperability frameworks rather than consolidating them into one database. The Audiovisual Observatory, established in 1993 as a centralised reporting body, represents an earlier institutional logic that is now being phased out in favour of federation.

The heritage sector, including music heritage, has consistently stressed the need for open, federated models. Libraries, archives, and museums use authority files and collaborative platforms (e.g. VIAF, Wikidata, Wikibase) to enable interoperability while preserving institutional autonomy. Commercial infrastructures do the same: the ISRC system, managed by IFPI, is inherently decentralised, while CISAC’s CIS-Net gives access to rights data without centralising ownership. Even the Mint initiative, launched by CISAC and Armonia Online, shows how shared infrastructure can deliver economies of scale for identifier allocation and metadata management while avoiding dependence on a single repository.⁴

Even official governmental statistics, often seen as centralised, are in reality decentralised. The ESSnet-Culture project, coordinated under Eurostat, produced the first comprehensive framework for cultural statistics in 2012, adapted from the UNESCO model, and remains a “basic reference” for the field. More broadly, national statistical offices, labour force surveys, and administrative registers each collect partial data, which are harmonised at EU level for comparability. Increasingly, surveys and administrative datasets are complemented by flows from platforms, rights management organisations, and other industry actors. Indicators therefore emerge from hybrid constellations of public and private data sources, confirming that decentralisation is a structural feature of European evidence creation.⁵

3.1.2 Open Data Directive: right without means

The Open Data Directive grants a right of reuse for public-sector information and requires that certain “high-value datasets” be made freely available across Europe (Directive (EU) 2019/1024). This includes cultural heritage institutions such as libraries, museums, and archives. However, the Directive stops short of providing the means to ensure that such data is actually usable. The OpenMusE Policy Brief 1: Music Metadata Mainstreaming and EU Law highlights this gap from a legal perspective: while EU law increasingly mandates data availability and reuse, it does not ensure the interoperability, attribution clarity, or governance structures required for rights-sensitive domains such as music metadata.

Studies consistently show that open data often remains more of a promise than a reality. In practice, much open data is poorly documented, lacks common identifiers, and is released in unstandardised formats. While it may be free of charge or available at marginal cost, making it interoperable and trustworthy for cross-border use requires significant additional effort. The burden of curation, harmonisation, and enrichment falls on downstream users, which can be prohibitively expensive for smaller organisations. As the CEDAR project put it, “Public authorities are only required to make existing data available, not to create new data or improve existing systems. This leads to significant disparities in usability and accessibility” (Project 2023). A recent EU-wide usability study adds that “many open data portals remain difficult to navigate, poorly documented, and inconsistent in their metadata quality, limiting actual reuse” (Jachimczyk and Nowak 2024).
These structural weaknesses of open data provision set the stage for the Observatory’s role in providing workflow playbooks and redundancy-free registration, discussed in Section 3.2.⁶

3.1.3 Why voluntary workarounds do not scale

The Slovak pilot shows that voluntary workarounds for attribution under GDPR are possible (see Section 2.1.5), but they do not scale. Even with strong communication and opt-in procedures, fewer than 1.3% of authors responded. Every new dataset requires fresh balancing tests, repeated notifications, and continued exposure to legal risk.

For observatories and data spaces, this is untenable. Interoperability requires clarity and legal certainty across borders and institutions. Without guidance from a Data Protection Authority or the European Commission, every national or sectoral initiative risks being challenged. This legal uncertainty is structural rather than incidental: existing EU frameworks provide partial and sometimes conflicting signals on attribution, personal data, and reuse, making scalable interoperability difficult without further clarification or coordinated governance mechanisms. The result is paralysis: public infrastructures cannot fully attribute works, and private actors refrain from sharing metadata for fear of liability.

In effect, Europe’s music data infrastructures remain locked in uncertainty, unable to guarantee attribution, diversity monitoring, or local content compliance. This makes a purely local or voluntary approach insufficient. The solution must be systemic: a federated data sharing space, supported by common specifications and clear governance frameworks, so that attribution and interoperability can scale. How such systemic solutions can be embedded into the Observatory’s conformance and legal levers is developed in Section 3.2. These unresolved attribution issues ultimately undermine not only observatories but also AI fairness and governance (see Section 4.1.3).⁷

3.1.4 Public infrastructures bypass music’s real data flows

The European Union has made significant investments in cross-domain cultural and research data infrastructures, including Europeana, the European Collaborative Cloud for Cultural Heritage (ECCCH), the European Open Science Cloud (EOSC), and, more recently, the planned EU Culture Data Hub announced in the Cultural Compass for Europe. These initiatives establish essential foundations for digitisation, discovery, preservation, and evidence-based cultural policy.

However, by design, such infrastructures operate at a generic, cross-domain level. They are not equipped to encode the sector-specific lifecycle logic, rights complexity, and attribution requirements that characterise recorded music. Music is not marginal in these systems because of lack of relevance, but because its data structures—multiple overlapping rights per asset, high-frequency reuse, and dense interconnection between public and private registries—exceed what cross-domain cultural infrastructures can reasonably handle without domain-specific implementation layers.

The EU Culture Data Hub reflects this same architectural logic. It is intended to aggregate and harmonise cultural evidence across domains, not to replace or replicate sectoral infrastructures. As such, it depends on the existence of interoperable, rights-aware domain-level data spaces capable of supplying structured, comparable inputs. Music illustrates particularly clearly why such domain-level implementation is necessary.

The Europeana Data Model (EDM) was designed for library holdings and is well suited to printed works, but it cannot capture the attribution needs of recorded music, which must identify at least three groups of rightsholders: authors, producers, and performers.⁸ The ECCCH report likewise overlooked music entirely, focusing instead on monuments, archaeology, textiles, and museums.⁹ Its first projects, such as AUTOMATA, TEXTaiLES, HERITALISE, and ECHOES, developed advanced tools for other heritage assets, but none addressed music directly. Our own attempts to include music datasets in ECHOES’ cascading grants illustrate the problem: proposals were screened out early, despite the clear need for music representation.

CITF highlights the same structural blind spot. It observes that national libraries already curate large volumes of copyright-protected material and maintain authoritative identifiers, yet they remain largely decoupled from rights metadata workflows. CITF therefore recommends treating national libraries and cultural institutions as copyright-infrastructure actors, not only heritage custodians, and integrating their registries into federated rights environments. This is exactly the workflow we tested in our Slovak national federated module and aim to replicate in Hungary.

Other initiatives show the same bias. The Polifonia project created modular ontologies, but it was “blind” to rights management and did not align with ISWC and ISRC identifiers used by industry. As a result, public knowledge graphs and registries do not interoperate smoothly with private-sector identifiers. The result is duplication, costly reconciliation, and under-use of culturally significant catalogues.

EOSC, intended as Europe’s backbone for research data infrastructure, is also relevant. Its federated model provides long-term preservation and persistent identifiers (via Zenodo and OpenAIRE), and music datasets deposited there already attract visibility. But EOSC has no dedicated workflow for music, and industry uptake remains minimal. As with ECCCH, music is underrepresented and rights-aware curation pathways are absent. CITF confirms that lifecycle-based metadata and provenance are prerequisites for integrating cultural and commercial systems.¹⁰

The European Interoperability Framework (EIF) helps explain why these gaps persist. Interoperability depends not only on formats but also on legal, organisational, semantic, and technical alignment. Without shared governance and profiles, public and private systems diverge. The principle of subsidiarity adds another layer: stewardship over cultural data is distributed across national and regional authorities, as well as private actors. Centralisation is therefore both impractical and politically illegitimate. The challenge is not whether decentralisation should exist, but how to make decentralised contributions work together.¹¹
This challenge directly motivates the Observatory’s bridging role with EOSC, Europeana, and ECCCH, elaborated in Section 3.2.¹²

The four layers of European interoperability: legal, organisational, semantic and technological in our federated Slovak module.

3.1.5 Subsidiarity and infrastructures for scaling music data

The European principle of subsidiarity requires that decisions be taken as closely as possible to the citizens they affect. In cultural policy, this means that responsibilities are distributed across multiple levels: in some Member States, culture is managed regionally or provincially; in others, nationally. Beyond public administrations, many important datasets are held by private actors — collective management organisations, platforms, or archives. Any attempt to centralise music data governance would therefore risk losing both legitimacy and local relevance.

Instead, subsidiarity must be built into the design of the Observatory. The European Interoperability Framework (EIF) provides a layered model — legal, organisational, semantic, and technical — for coordinating governance across institutions without requiring uniform systems. The Data Governance Act (DGA) reinforces this approach by recognising that data stewardship remains distributed: Member States and sectoral actors retain control over sensitive datasets, while common rules enable secure and comparable cross-border use. The Data Spaces Support Centre (DSSC) translates these principles into operational practice, defining blueprints and building blocks for federated data sharing across heterogeneous participants.

Taken together, these frameworks do not merely accommodate decentralisation; they depend on it. They assume that data will remain under multiple forms of stewardship and therefore focus on interoperability, trust, and controlled access rather than central aggregation. In this context, subsidiarity and federation are not constraints but design principles. They also reflect the practical reality that centralised solutions are unlikely to emerge or persist in sectors such as music, where data ownership, legal responsibilities, and economic incentives are inherently fragmented. CITF’s three-layer model aligns directly with this reasoning.¹³

At the technical level, Wikidata and Wikibase provide a proven backbone for collaborative metadata management. They are already embedded in EU infrastructures such as the official EU Knowledge Graph and in national projects like MetaBelgica in Belgium. In Flanders, the performing arts field has gone further: since 2017, Kunstenpunt and meemoo have published decades of performing arts data on Wikidata, showing how enrichment happens automatically once data becomes part of a wider ecosystem. These pilots illustrate how subsidiarity and federation can work in practice, with decentralised actors maintaining control of their own data while contributing to a shared framework.¹⁴

The problem of scale makes such infrastructures essential. Large platforms and labels can manage millions of assets cheaply, but small actors cannot. Without shared systems, independent and community-based repertoires remain undocumented because the cost of proper registration exceeds likely revenue. Federated tools — strengthened by automation and AI — are the only realistic way to close this gap.

Note: Finno-Ugric Data Sharing Space

Our pilot with the Finno-Ugric Data Sharing Space illustrates subsidiarity in practice (see: https://finnougric.net/). By collaborating with regional NGOs and national archives, we curated and repaired datasets that would have remained invisible in a central repository. The project showed that decentralised actors are best placed to manage their own data, but that interoperability frameworks and shared observability layers can connect them effectively.¹⁵

International comparison confirms this. In the United States, the Mechanical Licensing Collective (MLC), created under the 2018 Music Modernization Act, began administering a blanket mechanical license for streaming and downloads in 2021. It inherited more than $424 million in unmatched royalties and developed large-scale reconciliation systems to allocate them. By late 2022, it had already distributed nearly $700 million. The MLC shows what can be achieved when identifiers such as ISWC and ISRC are used systematically and backed by law. But it also highlights the limits of centralisation: creators must still claim and maintain their records, education gaps persist, and disputes between platforms and rights bodies continue.¹⁶

Note: The U.S. Mechanical Licensing Collective (MLC)

The Mechanical Licensing Collective was created under the U.S. Music Modernization Act (2018) to administer a blanket mechanical license for streaming and downloads. It inherited more than $424 million in unmatched royalties from digital services and developed large-scale reconciliation systems to allocate them. By late 2022, it had distributed nearly $700 million.

The MLC shows what can be achieved when identifiers (ISWC, ISRC) are captured systematically and backed by legislation. But it also highlights the limits of centralisation: creators must still claim and maintain their records, education gaps persist, and disputes between platforms and rights bodies continue. For Europe, the lesson is clear: scaling metadata infrastructure is possible, but it must respect subsidiarity and federation rather than rely on a single central clearinghouse (Mechanical Licensing Collective 2021; Varghese 2024).

3.1.6 Economies of scale in metadata

Large platforms and major labels can document millions of tracks at very low per-unit cost, because they manage everything in bulk. Smaller actors — independent labels, non-profits, or community archives — face the opposite situation: the cost of registering and maintaining each track is often higher than the revenue it will ever generate. This imbalance explains why so many “frozen” assets remain unregistered and invisible in today’s digital ecosystem.
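The cost imbalance can be made concrete with a simple average-cost calculation. The sketch below is illustrative only: the fixed and marginal cost figures are hypothetical and not taken from any dataset, and the function name `cost_per_track` is introduced here purely for the example. It assumes that documentation involves a fixed yearly cost (tooling, training, identifier registration) plus a small cost per additional track, and that pooling catalogues leaves the fixed cost unchanged.

```python
def cost_per_track(fixed_cost: float, marginal_cost: float, tracks: int) -> float:
    """Average documentation cost per track: the fixed cost is spread over the catalogue."""
    if tracks <= 0:
        raise ValueError("tracks must be positive")
    return fixed_cost / tracks + marginal_cost


# Hypothetical figures, chosen only to show the shape of the curve.
FIXED = 20_000.0   # assumed yearly cost of tooling, training, identifier registration
MARGINAL = 0.5     # assumed cost of documenting one additional track

small_label = cost_per_track(FIXED, MARGINAL, tracks=200)        # 100.5 per track
major = cost_per_track(FIXED, MARGINAL, tracks=2_000_000)        # about 0.51 per track
# Simplifying assumption: 100 small catalogues share one infrastructure
# and its fixed cost does not grow with the number of participants.
pooled = cost_per_track(FIXED, MARGINAL, tracks=200 * 100)       # 1.5 per track

print(f"small label: {small_label:.2f}, major: {major:.2f}, pooled: {pooled:.2f}")
```

Under these assumptions the small catalogue pays roughly two hundred times more per track than the large one, and most of that gap closes once the fixed cost is shared. Real cost structures will differ, but the direction of the effect is what motivates shared infrastructure.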

Without a way to share infrastructure, small actors remain stuck. They cannot afford the per-track cost of full documentation, yet under-documentation ensures their work remains undiscovered. This is not just an accounting issue, but a structural barrier to diversity in music data flows. A federated approach, as outlined in Section 3.2.2, is essential to rebalance these inequalities and enable small actors to benefit from the same efficiencies as global players. CITF frames this imbalance as a foundational infrastructure issue.¹⁷

Conformance and observability rules in the Open Music Observatory should be designed in line with the European Interoperability Framework (EIF) and the FAIR data principles. This ensures compatibility with wider European data space initiatives and reduces integration costs for institutions already adapting to these standards (Commission et al. 2020, p9).

The Open Music Observatory sits where open science, public sector information reuse, and music industry workflows overlap. By aligning with the European Interoperability Framework, it creates a shared space where libraries, rights managers, publishers, and researchers can collaborate. This positioning highlights OMO’s role as a bridge between cultural heritage, commercial distribution, and open knowledge. DOI: [10.6084/m9.figshare.30073267.v1](https://figshare.com/articles/dataset/The_Open_Music_Observatory_at_the_Intersection_of_Open_Science_Open_Data_and_Music_Industry_Workflows/30073267/1?file=57754399)

3.2 Policy Proposals

The policy proposals set out in this section are directly grounded in, and fully consistent with, the analytical and institutional trajectory established at European level over the past years. The Study on copyright and new technologies: copyright data management and artificial intelligence identified persistent structural failures in rights metadata management, interoperability, authority, and provenance across creative sectors, and highlighted that emerging AI uses amplify these weaknesses rather than resolve them. The first project report of the Copyright Infrastructure Task Force (CITF) subsequently translated these analytical findings into concrete, infrastructure-oriented requirements, including interoperable identifiers, standardised metadata schemas, trustworthy provenance, and governance arrangements capable of supporting copyright-relevant data throughout the lifecycle of works and recordings.

The Open Music Observatory proposals presented here should be read as an operational continuation of this policy line. They do not introduce a parallel conceptual framework, but instead demonstrate how the requirements articulated in the Commission-initiated study and the CITF process can be implemented in practice through federated, non-centralised infrastructures. The national and regional pilots referenced throughout this section — including the Slovak Comprehensive Music Database, its replication in Hungary, the Baltic and Latvian pilots, and the Finno-Ugric Data Sharing Space — function as policy-embedded test cases that validate, refine, and contextualise these requirements within existing legal competences, institutional mandates, and public–private cooperation frameworks.

In this sense, the Open Music Observatory is positioned neither as a standalone technical solution nor as a new central authority, but as a convening, conformance, and observability layer that operationalises the findings of the Study on copyright and new technologies and the first CITF report. It provides a practical mechanism for translating earlier Commission analysis and task-force recommendations into reusable governance patterns, interoperable workflows, and auditable data practices suitable for the digital and AI era.

Note: Public–private reconciliation in practice

Reconciling public and private infrastructures: the ALOADED pilot in Latvia

The Unlabel workflow was tested with Latvian archives and the distributor ALOADED, demonstrating how public heritage metadata can be reconciled with private music supply chains.

  • Archival recordings (including Hilda Griva’s songs and Latvian/Latgalian midsummer songs) were identified in the Latvian Archives of Folklore.
  • Metadata was translated, enriched, and aligned with international authority files.
  • ALOADED extended this material with DDEX-compliant catalogue transfers and ingested it into Spotify and other platforms.

This pilot demonstrates that reconciliation between public infrastructures (archives) and private infrastructures (distributors and platforms) is both technically and institutionally feasible. It reconnects suppressed or marginalised repertoires with contemporary audiences without requiring centralisation or loss of institutional control.

A more technical description of the workflow is available here:
https://downloads.reprex.nl/2025/open-music-observatory/coordination.html#sec-future-proofing

3.2.1 Workflow playbooks and provenance trails

The Observatory should not only harmonise data formats, but also document workflow playbooks that describe how metadata moves across the music lifecycle:

  • from rights registration,
  • to distribution and royalty attribution,
  • to charting and visibility,
  • to long-term preservation.

Each step should define change-propagation rules: when a correction is made in one register, it should propagate to dependent systems. Provenance trails must survive system boundaries, using standards such as PROV-O to document who did what, when, and under what authority. This enables auditability, cross-border comparability, and long-term trust, and prevents “data death” when assets move between systems.
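As a minimal sketch of what a change-propagation rule with a surviving provenance trail could look like, the example below models three registers in a chain. Everything in it is an assumption made for illustration: the `Register` class, the register names, the `authority` field, and the record identifier are hypothetical and do not describe the Observatory's implementation. The keys `prov:wasAssociatedWith`, `prov:endedAtTime`, and `prov:wasInformedBy` borrow PROV-O property names to record who acted, when, and which upstream correction triggered the change; a real deployment would serialise these as proper RDF.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Register:
    """One custodian's register (hypothetical model for illustration)."""
    name: str
    records: dict = field(default_factory=dict)     # record id -> metadata
    trail: list = field(default_factory=list)       # append-only provenance trail
    dependents: list = field(default_factory=list)  # registers that reuse this data

    def correct(self, record_id, changes, agent, authority, informed_by=None):
        """Apply a correction, log it, then pass it on to dependent registers.

        Assumes the dependency graph has no cycles.
        """
        record = self.records.setdefault(record_id, {})
        before = dict(record)
        record.update(changes)
        entry = {
            "id": f"{self.name}/{record_id}/{len(self.trail)}",
            "before": before,
            "after": dict(record),
            "prov:wasAssociatedWith": agent,                             # who
            "prov:endedAtTime": datetime.now(timezone.utc).isoformat(),  # when
            "authority": authority,              # under what authority (hypothetical field)
            "prov:wasInformedBy": informed_by,   # upstream correction, if any
        }
        self.trail.append(entry)
        for dependent in self.dependents:
            dependent.correct(record_id, changes, agent=f"sync-from:{self.name}",
                              authority=authority, informed_by=entry["id"])
        return entry


# Hypothetical chain: rights registry -> distributor -> archive.
rights = Register("rights-registry")
distributor = Register("distributor")
archive = Register("archive")
rights.dependents.append(distributor)
distributor.dependents.append(archive)

entry = rights.correct("work-001", {"composer": "Corrected Name"},
                       agent="registry-editor", authority="rightsholder declaration")
print(archive.records["work-001"], archive.trail[0]["prov:wasInformedBy"])
```

Each register keeps its own trail, and every downstream entry points back to the correction that caused it, so the chain of responsibility can be reconstructed across system boundaries without any register holding the others' data.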

3.2.2 Federated infrastructure as a cost and governance solution

The Cultural Compass for Europe establishes a clear strategic direction for European cultural policy: shared, interoperable, and federated data infrastructures that support evidence-based policymaking, cultural diversity, and trustworthy use of artificial intelligence. The planned EU Cultural Data Hub reflects this ambition at cross-domain level. At the same time, the Compass does not prescribe how sector-specific complexities should be handled in practice.

This section argues that music represents a domain where such complexity exceeds what generic cultural data infrastructures can reasonably encode, and therefore requires a specialised, federated implementation layer that remains fully compatible with the broader European framework. The Open Music Observatory should be understood precisely in this role: not as an alternative to the EU Culture Data Hub, but as a music- and music-rights–specific implementation layer that operationalises the Hub’s objectives for one of Europe’s most complex cultural sectors.

The imbalance described in Section 3.1.6 makes one point clear: small and medium-sized actors cannot compete on metadata quality without shared infrastructure. In a sector characterised by extreme fragmentation, federation — rather than centralisation — is the only viable path forward.

A data sharing space provides the appropriate governance framework for such federation. Instead of forcing participants into a single schema, database, or legal agreement, it enables organisations to share and reuse data on an as-needed or as-permitted basis while retaining stewardship over their own assets. For music — where rights, identifiers, and content are distributed across hundreds of micro-actors and institutions — this approach reduces duplication without creating dependency, while remaining compatible with cross-domain cultural data infrastructures such as the EU Cultural Data Hub.

Crucially, federation avoids the risks inherent in centralisation: technical fragility, governance lock-in, and the emergence of a single gatekeeper capable of restricting access or imposing unilateral conditions on others.¹⁸ These risks are particularly acute in copyright-relevant environments, where errors in attribution or provenance have direct legal and economic consequences.

Music represents one of the most demanding test cases for European data governance in the cultural sector. Attribution interacts directly with privacy law, identifiers are applied unevenly across the sector, and most music enterprises lack the resources to maintain their own compliance, documentation, and audit infrastructures. If a federated model can function in this environment, it can function elsewhere. Decentralisation, however, creates incentives for data hoarding. Effective governance must therefore make participation more attractive than isolation by offering lower administrative costs, legal clarity, shared compliance benefits, and increased visibility.

In practice, this requires combining hard alignment (minimal metadata profiles and baseline identifiers) with soft alignment (mappings, crosswalks, and shared workflow playbooks). This combination allows interoperability to be achieved without forcing all actors into a single organisational or technical model, and ensures that music-specific solutions remain interoperable with the EU Cultural Data Hub and other cross-domain infrastructures.
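The distinction between hard and soft alignment can be illustrated with a minimal sketch. The field names, the choice of required fields, and the partner's local schema below are hypothetical assumptions made for illustration; they are not a published Observatory profile. Only the ISRC syntax check (two letters, three alphanumeric characters, two year digits, five designation digits) follows the identifier's documented structure.

```python
import re

# Hard alignment: a hypothetical minimal profile. The required fields are an
# illustrative assumption, not an official Observatory specification.
REQUIRED_FIELDS = ("title", "main_artist", "isrc")

# ISRC syntax: 2-letter prefix, 3 alphanumeric registrant characters,
# 2-digit year of reference, 5-digit designation code.
ISRC_PATTERN = re.compile(r"^[A-Z]{2}[A-Z0-9]{3}[0-9]{2}[0-9]{5}$")

# Soft alignment: a crosswalk from one partner's (hypothetical) local field
# names to the shared profile. Each partner keeps its own schema.
LOCAL_TO_PROFILE = {
    "track_name": "title",
    "performer": "main_artist",
    "ISRC_code": "isrc",
}


def apply_crosswalk(local_record, mapping):
    """Rename local fields to profile fields; unmapped fields are kept as-is."""
    return {mapping.get(key, key): value for key, value in local_record.items()}


def validate(record):
    """Return a list of problems; an empty list means the record conforms."""
    problems = [
        f"missing field: {name}" for name in REQUIRED_FIELDS if not record.get(name)
    ]
    # Hyphens are a display convention only, so they are removed before checking.
    isrc = str(record.get("isrc", "")).replace("-", "").upper()
    if isrc and not ISRC_PATTERN.match(isrc):
        problems.append(f"malformed ISRC: {record['isrc']}")
    return problems


# A partner record in its local schema (placeholder values).
local = {
    "track_name": "Example Song",
    "performer": "Example Artist",
    "ISRC_code": "AA-6Q7-20-00047",
}
print(validate(apply_crosswalk(local, LOCAL_TO_PROFILE)))
```

In this sketch the profile stays small enough for a micro-enterprise to satisfy, while the crosswalk absorbs local naming differences, so no participant has to abandon its own schema.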

For these reasons, a future European Music Observatory, as part of the emerging EU Culture Data Hub ecosystem, cannot be designed as a single central database. Instead, it must function as a federated, domain-level observability and coordination layer. In this role, the Observatory complements the EU Culture Data Hub by providing validated, rights-aware, provenance-rich music data that can be aggregated at European level without centralising ownership or governance.

The Open Music Observatory is conceived as a music- and music-rights–specialist federated unit that connects shared cultural data infrastructures—including the European Collaborative Cloud for Cultural Heritage, Europeana and its emerging data sharing space, and the EU Culture Data Hub—with sector-specific workflows in music creation, distribution, rights management, and preservation. Rather than duplicating cross-domain functions, it supplies the missing domain logic required for music to be meaningfully represented within European cultural data infrastructures.

Architecturally, the Open Music Observatory inherits CEEMID’s core principles: federation rather than centralisation, reproducible indicators rather than static datasets, and collaboration across public, private, and academic actors. What changes is not the logic, but the scale and governance context. The Observatory aligns these principles with the European Interoperability Framework, FAIR data principles, the common European data space for cultural heritage, and the work of the Copyright Infrastructure Task Force, ensuring coherence across cultural, data, and copyright policy domains.

Earlier experimentation, including the CEEMID initiative and the Open Music Observatory’s Slovak cultural statistics pilot within the Open Music Europe project, already anticipated this architecture by demonstrating how music-specific data coordination could feed policy-relevant indicators without centralising data—an approach that aligns closely with the design principles now articulated for the EU Culture Data Hub.

Note: Diversity & Circulation Pillar Implementation

The Diversity & Circulation Pillar of the Open Music Observatory is implemented and documented in Music, Heritage, and Policy in the Age of AI19. While this Green Paper defines the policy rationale and governance principles for diversity-aware data infrastructures, the diversity pillar’s implementation paper provides the concrete data models, workflows, and pilot results used to operationalise these principles.

A central objective of the Slovak pilot was to demonstrate how music-specific statistical indicators could be produced in line with best statistical practice in areas where existing cultural and creative satellite accounts fall short; such accounts are themselves rarely implemented in EU Member States20. Drawing on established methods for business satellite accounts, the pilot sought to harmonise authoritative collective rights management data with administrative registers, industry datasets, and statistical surveys in order to generate indicators that are not available through Slovakia’s current cultural and creative satellite accounting system. In practice, this involved linking rights management information (often the most complete and economically meaningful source for music activity) with official statistical frameworks, while preserving methodological transparency, reproducibility, and comparability.

This approach responds to a broader structural gap at European level: most EU Member States do not operate dedicated cultural or creative satellite accounts at all, and where such systems exist, they rarely capture the distinctive revenue structures, cross-border flows, and attribution dynamics of music. The pilot therefore addressed not only a national shortcoming, but a systemic limitation of European cultural statistics, anticipating the need for domain-specific data pipelines capable of producing robust, music-aware indicators. While the OpenMusE project was ultimately unable to carry out this experiment due to resource constraints and changing institutional conditions, the design remains methodologically sound and operationally feasible, and could be taken forward as part of the preparatory piloting of the EU Cultural Data Hub.

Recent analysis of cultural heritage data governance in the context of artificial intelligence clarifies that the limitations of existing heritage infrastructures should not be interpreted as a reason to exclude libraries and memory institutions from future data architectures. On the contrary, these institutions occupy a distinctive position in the AI era precisely because they combine long-term custodianship with public accountability, authority control, and responsibility for provenance and attribution21. From the perspective of the Cultural Compass, integrating these institutions more directly into copyright-relevant data flows is essential for ensuring that attribution, provenance, and accountability are preserved across the full lifecycle of music data.

Europe already has the necessary policy and technical foundations for this approach. The European Strategy for Data, the Data Governance Act, and the Data Act define data spaces as federated by design22. The Data Spaces Support Centre (DSSC) has translated these principles into operational blueprints that can be applied directly to the music sector23. Comparable logics already exist in music and adjacent domains: ISRC governance through national agencies, CISAC’s CIS-Net, and European statistical systems based on subsidiarity rather than central repositories.

Concerns about digital sovereignty make this approach urgent. In the absence of a European solution, metadata infrastructures risk drifting toward US-style centralisation, such as the Mechanical Licensing Collective (MLC), where legislative mandates and market power converge in a single hub. OpenMusE Deliverable D5.6 explicitly warns against this outcome24. This Green Paper complements that legal-institutional analysis by demonstrating how a federated, culture-led infrastructure can translate EU law into practice.

In practical terms, this requires capture-once, reuse-many pipelines across the music lifecycle: from registration of works and recordings, through distribution and royalty attribution, to preservation and cultural statistics. The Observatory should therefore function as a redundancy-minimising registration and coordination space, aligned with the European Interoperability Framework and provenance-oriented models such as PROV-O25.
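A capture-once, reuse-many pipeline with PROV-O-style provenance can be sketched as follows. The property names (`prov:used`, `prov:wasAssociatedWith`, `prov:wasGeneratedBy`, `prov:wasDerivedFrom`, `prov:wasAttributedTo`) come from the W3C PROV-O ontology; the `ProvenanceLog` class, the `https://example.org/` namespace, and all entity, activity, and agent names are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass, field

PROV = "http://www.w3.org/ns/prov#"
EX = "https://example.org/"  # placeholder namespace, an assumption for this sketch


@dataclass
class ProvenanceLog:
    """Hypothetical helper that stores provenance as (subject, predicate, object) triples."""

    triples: list = field(default_factory=list)

    def add(self, subject, predicate, obj):
        self.triples.append((EX + subject, PROV + predicate, EX + obj))

    def derive(self, new_entity, source_entity, activity, agent):
        """Record that `agent` ran `activity` on `source_entity` to produce `new_entity`."""
        self.add(activity, "used", source_entity)
        self.add(activity, "wasAssociatedWith", agent)
        self.add(new_entity, "wasGeneratedBy", activity)
        self.add(new_entity, "wasDerivedFrom", source_entity)
        self.add(new_entity, "wasAttributedTo", agent)

    def lineage(self, entity):
        """Follow prov:wasDerivedFrom links back to the originally captured record."""
        derived_from = {
            s: o for s, p, o in self.triples if p == PROV + "wasDerivedFrom"
        }
        chain, current = [entity], EX + entity
        # The membership check guards against accidental cycles in the log.
        while current in derived_from and derived_from[current][len(EX):] not in chain:
            current = derived_from[current]
            chain.append(current[len(EX):])
        return chain

    def to_ntriples(self):
        """Serialise the log as N-Triples so that other tools can ingest it."""
        return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in self.triples)


log = ProvenanceLog()
# The registration is captured once; later stages reuse it instead of re-keying it.
log.derive("distribution-feed-1", "registration-1", "export-1", "label-a")
log.derive("statistics-row-1", "distribution-feed-1", "aggregate-1", "observatory")
print(log.lineage("statistics-row-1"))
print(log.to_ntriples())
```

Because every downstream record points back to the original registration, a statistical indicator can be audited down to the source it was derived from without copying that source into a central database.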

When implemented effectively, this approach lowers entry barriers for smaller actors, improves interoperability across institutions, and grounds Europe’s cultural and economic policies in reliable evidence rather than fragmented silos.

Broader debates on AI governance caution that technological systems can reproduce existing hierarchies when governance structures remain unequal. Guest, Suarez, and van Rooij warn that AI may extend “projects of domination, of hierarchies, of extractivism of cognitive labour”26. Federated, transparent, and participatory infrastructures therefore play a corrective role by preventing the concentration of informational and cultural power.

This consideration is particularly salient in the Finno-Ugric Data Sharing Space music module (https://finnougric.net/en/), which addresses decolonisation and subsidiarity by consolidating dispersed knowledge about the musical heritage of Livonians, Sámis, Maris, and Setos. The posthumous publication of Hilda Griva’s recordings illustrates how federated infrastructures can enable cultural self-representation rather than algorithmic erasure.

3.2.4 Alignment with the European Open Science Cloud

Position the Open Music Observatory as the convening, conformance, and observability layer connecting ECCCH, Europeana, and GLAM authority files with industry workflows. Concretely:

  1. Capture once, reuse many across creation, registration, distribution, and preservation.
  2. Require minimal profiles that smaller actors can realistically implement.
  3. Prioritise identifier crosswalks (ISRC–ISWC–ISNI–VIAF/Wikidata) and change propagation.
  4. Use Wikibase and Wikidata as a low-friction backbone where appropriate.
  5. Govern through EIF- and FAIR-aligned rules with auditability and public–private participation.
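Point 3 above, identifier crosswalks with change propagation, can be sketched in a few lines. The `Crosswalk` class and its methods are hypothetical, and the identifier values are syntactically shaped placeholders that do not describe a real recording-to-work relation. The sketch also assumes, as a simplification, that each source identifier has at most one target per scheme; real catalogues would need one-to-many links, for example several performer ISNIs per recording.

```python
from collections import defaultdict


class Crosswalk:
    """Hypothetical registry of links between identifiers of different schemes."""

    def __init__(self):
        # (scheme, value) -> {target_scheme: target_value}
        self._links = defaultdict(dict)
        self._subscribers = []

    def subscribe(self, callback):
        """Register a callback(source, scheme, old_value, new_value) for changes."""
        self._subscribers.append(callback)

    def link(self, source, target):
        """Create or update a link and notify subscribers only on real changes."""
        scheme, value = target
        previous = self._links[source].get(scheme)
        if previous == value:
            return  # nothing changed, so nothing is propagated
        self._links[source][scheme] = value
        for callback in self._subscribers:
            callback(source, scheme, previous, value)

    def resolve(self, source, scheme):
        """Return the linked identifier in `scheme`, or None if no link is known."""
        return self._links.get(source, {}).get(scheme)


events = []
crosswalk = Crosswalk()
# A downstream system (for example a library catalogue) listens for corrections.
crosswalk.subscribe(lambda *event: events.append(event))

recording = ("isrc", "AA6Q72000047")  # placeholder value
crosswalk.link(recording, ("iswc", "T-034.524.680-1"))  # placeholder value
crosswalk.link(recording, ("wikidata", "Q0"))  # placeholder value
# A correction made by one steward reaches every subscriber.
crosswalk.link(recording, ("iswc", "T-000.000.001-0"))  # placeholder value

print(crosswalk.resolve(recording, "iswc"))
print(len(events))
```

The design choice illustrated here is that corrections are pushed to subscribers rather than discovered through periodic re-reconciliation, which is where much of the duplicated effort in fragmented catalogues arises.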

This reframes Europe’s investments from siloed repositories into a shared music data space that respects subsidiarity, lowers reconciliation costs, and enables interoperability across public and commercial contexts — the practical foundation for any future European Music Observatory.


  1. Always use the latest versioned DOI when citing this Open Music Observatory technical report, available via Zenodo. If you rely on supporting material hosted in the GitHub repository, please add the date of access in your reference.↩︎

  2. See for example Goldenfein and Hunter (n.d.); Milosic (2015).↩︎

  3. See (Partanen et al. 2025, pp15–18).↩︎

  4. On heritage practices, see (Bianchini, Bargioni, and Pellizzari di San Girolamo 2021, p210) and (Sardo and Bianchini 2022, p297), which describe how VIAF, Wikidata, and Wikibase function as authority tools in libraries and archives. On identifiers, the ISRC Handbook (International ISRC Registration Authority 2021, p5) explains the decentralised structure of the ISRC system, while CISAC’s Mint Digital Services (CISAC/SUISA/SESAC 2017) illustrates how federated allocation works in practice. Together, these examples show how distributed stewardship and shared standards underpin global metadata infrastructures.↩︎

  5. The ESSnet-Culture framework (Commission et al. 2020, p9) demonstrates how cultural statistics are built on national contributions harmonised at EU level, not on central databases. A Slovak pilot (Antal 2023) further illustrates how decentralisation works in practice, integrating public and private sources into coherent cultural indicators.↩︎

  6. Early modelling stressed the economic potential of open data but also identified major obstacles in practice: lack of availability, uneven quality, and poor usability (Carrara et al. 2015, p7; Huyer and van Knippenberg 2020, p14). Comparative studies show that simply granting a right to reuse rarely produces machine-actionable datasets. In complex domains like music, where attribution depends on precise identifiers, these shortcomings become particularly costly. Cross-sector reviews underline persistent fragmentation: heterogeneous formats and divergent practices across Member States (Buttow and Meijer 2024, p12); variability even in high-value geospatial datasets (Kević, Kuveždić Divjak, and Welle Donker 2023, p3); and sectoral case studies (e.g. mineral intelligence) repeatedly call for shared profiles beyond legal openness (Simoni, Aasly, and Schjøth 2021, p5). Additional evidence shows that preparing legacy administrative data for reuse requires cleansing and enrichment that impose real costs, even when the data are nominally “open” (EuroSDR 2021, p9; Schnurr 2021, p14; Nakos and Tsoulos 2022, p6).↩︎

  7. The Slovak pilot demonstrated that even with careful communication and GDPR balancing tests, participation was below 1.3%, showing the practical limits of voluntary attribution workarounds. Without EU-level guidance, every dataset requires fresh legal reasoning, making scale impossible. Comparable findings in other cultural domains underline the risk: voluntary consent-based models tend to collapse under low response rates and high compliance costs. See also discussions of attribution and AI fairness in (Commission et al. 2020) and (Music Moves Europe 2024).↩︎

  8. The EDM builds on DCTERMS, which works well for printed music but not for recordings. It fails to capture neighbouring rights such as those of producers and performers (Europeana 2017).↩︎

  9. Ex–ante impact assessment on the European Collaborative Cloud for Cultural Heritage (Commission et al. 2022). The first ECCCH pilots (AUTOMATA, TEXTaiLES, HERITALISE, ECHOES) focused on archaeology, textiles, and monuments, leaving music out.↩︎

  10. EOSC provides federated access and persistence through Zenodo and OpenAIRE, but music workflows remain marginal. On EOSC’s role, see the European Strategy for Data (European Commission 2020). CITF notes that without harmonised identifiers, provenance chains, and semantic profiles, public infrastructures cannot interoperate with private-sector rights workflows, especially in AI contexts where reproduction and transformation rights depend on reliable metadata (Partanen et al. 2025, p20; pp28–33).↩︎

  11. The EIF defines layered interoperability (legal, organisational, semantic, technical) (Commission and Digital Services 2017). The European Strategy for Data frames subsidiarity as compatible with federation (European Commission 2020). BDVA and the Federation Working Group emphasise that interoperability frameworks are needed to operationalise federation (BDVA/DAIRO 2023; BDVA/DAIRO Federation Working Group 2023).↩︎

  12. The EIF defines layered interoperability (legal, organisational, semantic, technical) (Commission and Digital Services 2017). The European Strategy for Data frames subsidiarity as compatible with federation (European Commission 2020). BDVA and the Federation Working Group emphasise that interoperability frameworks are needed to operationalise federation (BDVA/DAIRO 2023; BDVA/DAIRO Federation Working Group 2023).↩︎

  13. On subsidiarity and federation: the Data Governance Act (European Parliament and Council 2022) and the European Strategy for Data (European Commission 2020). On technical frameworks: DSSC’s blueprints (Data Spaces Support Centre 2025b, 2025a). On governance: BDVA (BDVA/DAIRO 2023) and the Federation Working Group (BDVA/DAIRO Federation Working Group 2023). CITF’s foundational layer concerns authoritative identifiers and registries, its semantic layer provides shared meaning across heterogeneous models, and its technical layer covers APIs, mappings, and resolution services. Together these layers provide a structured approach for embedding subsidiarity into copyright data governance without forcing schema or organisational unification (Partanen et al. 2025, pp23–33).↩︎

  14. On official adoption: EU Knowledge Graph (Diefenbach, De Wilde, and Alipio 2021); SEMIC guidelines (SEMIC Support Centre 2023). On Belgian pilots: MetaBelgica (Stallmann et al. 2023) and Flemish performing arts enrichment (Magnus and Van D’huynslager 2021).↩︎

  15. See (Antal et al. 2025; Antal, Pigozne, and Federico 2025).↩︎

  16. On the MLC’s establishment and operations: (Mechanical Licensing Collective 2021); on contested governance and disputes with platforms: (Varghese 2024).↩︎

  17. Comparative research shows that costs per asset decrease sharply with catalogue size, creating scale advantages for majors and global platforms. Without shared infrastructures, small actors are disproportionately disadvantaged. The Feasibility Study for a European Music Observatory emphasised this imbalance as a structural barrier (Commission et al. 2020, p9), while the Music Ecosystem 2025 study highlighted how fragmentation and duplication reinforce these scale inequalities (Music Moves Europe 2024). CITF frames this imbalance as a foundational infrastructure issue: without open, authoritative identifiers and interoperable registries, small actors face disproportionately high documentation costs and cannot benefit from economies of scale. It therefore recommends strengthening the foundational identifier layer as a precondition for fair and efficient copyright ecosystems (Partanen et al. 2025, pp23–28).↩︎

  18. This definition paraphrases (Curry 2020) and reflects the view that a data sharing space is an ecosystem of exchange, processing, sharing and provision of data between trusted partners (EBU and Gaia-X 2022, p16). The CITF further emphasises that trustworthy provenance, machine-readable rights metadata, and auditable rights management information are essential components of such federated copyright infrastructures, particularly in the AI era (Partanen et al. 2025, p31; pp101–102).↩︎

  19. Music, Heritage, and Policy in the Age of AI (Antal 2025). (In the OpenMusE project it was Deliverable D2.3.)↩︎

  20. Pilot Program for Novel Music Industry Statistical Indicators in the Slovak Republic (Antal 2023). The pilot was not implemented within the OpenMusE project due to resource constraints and institutional changes, but its methodological design remains valid and suitable for continuation in the context of EU Cultural Data Hub preparatory work.↩︎

  21. Publishing Cultural Heritage Data in the Age of AI (Keller 2025).↩︎

  22. The European Strategy for Data (2020), the Data Governance Act (2022), and the Data Act (2023) establish federated data spaces supported by trust frameworks and shared services (European Commission 2020; European Parliament and Council 2022).↩︎

  23. The Data Spaces Support Centre (DSSC) provides blueprints and building blocks for implementing federated data spaces across sectors (Data Spaces Support Centre 2025b, 2025a).↩︎

  24. See Policy Brief 1: Music Metadata Mainstreaming and EU Law (Senftleben et al. 2024).↩︎

  25. The European Interoperability Framework defines interoperability across legal, organisational, semantic, and technical layers (Commission and Digital Services 2017). The W3C PROV model and PROV-O ontology enable standardised provenance chains linking actors, activities, and entities (W3C 2013b, 2013a).↩︎

  26. Towards Critical Artificial Intelligence Literacies (Guest, Suarez, and Rooij 2025, p3).↩︎