We care what our children learn, but we do not yet care what our robots learn from. One key idea behind trustworthy AI is that you verify which data sources your machine learning algorithms can learn from. As we have emphasised in our forthcoming academic paper and in our experiments, one key reason why you see too few artists from small countries, or too few womxn, in the charts is that big tech recommendation systems and other autonomous systems are learning from historically biased or patchy data.
In complex systems there is hardly ever a single cause that explains undesired outcomes; in the case of algorithmic bias in music streaming, no single mechanism eliminates women from the charts or makes Slovak or Estonian-language content less valuable than English-language content.
While the US has already taken steps to provide an integrated data space for music as of 1 January 2021, the EU is facing major obstacles not only in the field of music but also in other creative industry sectors. Weighing costs and benefits, there can be little doubt that new data improvement initiatives and sufficient investment in a better copyright data infrastructure should play a central role in EU copyright policy. This is a preprint of our article with copyright researchers.
At last, Reprex has its own company website, while the two flagship project sites, the Demo Music Observatory and Listen Local, remain separate. We are back to blogging after a particularly difficult lockdown period.