Good morning Conference and thank you so very much for inviting me here to speak to you today.
Now I would normally start off with a joke or a witty anecdote to break the ice, but given the audience and the subject matter – Open Information and Big Data – I promise that I’ve tried my very best to remove everything in the least bit humorous or funny from my speech today, and I apologise, in advance, for anything even mildly amusing that I may have, inadvertently, left in.
Okay… Let’s talk metadata!
I’m particularly pleased to be speaking to you today as I believe that this is a moment of immense opportunity – for every single one of us in this room, and for the public far and wide. The transformative powers of digital technologies are opening a whole new world of possibilities for anyone and everyone that works with information, data and what we still call, in a decidedly old-fashioned way, “archives.”
As I’ll argue over the next twenty minutes or so, in a digital world, the potential for institutions and individuals to create, collaborate, and share information and data of every conceivable kind and size and shape is almost unlimited.
This is not the world of tomorrow, it’s not the coming digital era – these possibilities already exist in the here and now. What *is* needed is the leadership and the conviction to seize them.
I’m going to approach the issues of Open Information and Big Data through the prism of the BBC and its archives, then on into the UK’s memory institutions and the other custodians of public domain assets and information until we arrive at something we’re calling the Digital Public Space.
I’m going to say that liberating these unique resources will help to transform the UK creative economy and by the time I reach the end I hope you will agree with my suggestion that because of our immensely large, rich and often priceless archives – amassed and lovingly preserved over many centuries by experts, including many of you here today – the UK is potentially the world’s largest and richest source of Open Information and Big Data.
We just don’t know it – yet.
One increasingly important part of my job, and that of my “fellow travellers”, within the BBC is to make sure that the Corporation does come to know and appreciate this and then act upon that knowledge, to the benefit of everyone, by enabling first ourselves and then our other, important public institutions to also appreciate this and then to work together to transform this stored heritage, our Collective Abundance, into accessible resources. Thereby facilitating the most significant creative transformation since the birth of Broadcasting almost a century ago.
To make this vision a reality we will then need to bring into being a stable and permissive environment where reasonable people are free to do reasonable things with that information and data to the benefit of everyone – so long as they abide by some basic rules which don’t involve harming the interests of others. Or indeed themselves.
And when I say to the benefit of everyone, I really do mean “everyone” – not just a metropolitan elite with a misplaced sense of entitlement or those with superfast broadband and tablets and smartphones, but ‘an everyone’ that includes single mothers on benefit, people living with disabilities, people living in remote rural areas, people in prisons, and even surly teenagers on Walthamstow High Street – the area of East London where I grew up.
Because it’s the BBC’s job to benefit them all; after all Licence Fee Payer is a term which, in accordance with our written Charter, is defined as not only those who actually *pay* but in fact:
“… any other person in the UK who watches, listens to or uses any BBC service, or may do so or wish to do so in the future.”
And with its obligation to serve everyone, the BBC occupies a unique position. Still, today, in a digital age. Even more so, in fact, in a digital age.
Many good people inside the BBC, mainly engineers, have spent the past 20 years, a generation, thinking about the role and impact of the internet, at scale – in a way that almost no one else has had the resources, nor the remit, to be able to do.
Because apart from still being – in *my* opinion at least – the greatest public service broadcaster in the world, it is our mission to make information and data, in vast quantities – available for everyone and for the broader public good – not to control it, not to lock it away, not to monitor it, not to enhance the wealth of our shareholders.
Yes, there are of course many, many others who also do this on a smaller scale, often driven by an equally vocational sense of the “public good.”
But perhaps only the BBC can liberate, and help others to liberate, vast amounts of information and data, reflecting nine decades of global history, in a way that serves a clearly defined set of public purposes, shaped in partnership, in particular, with its users.
But we now need to do this in a new way. One that isn’t constrained by analogue, top-down, from-us-to-them thinking. And I must admit, that is a pretty tough challenge.
Yet in my own view such a vision of the BBC as a liberator of information and data, sometimes curated, sometimes uncurated, really needs to be a key part of the underpinning of our next Charter.
In fact, the liberation of our archives is an almost perfect metaphor for the liberation of the BBC itself in this respect.
Why? Because our Archives represent an inert analogue past that must be digitised – digi*tal*ised even – and then subjected to a number of processes and legal and ‘imaginative’ changes, in order to help the Archive to achieve its maximum potential and therefore be truly equal to the opportunities and challenges of the 21st Century.
And, I’d argue that the same is, in many respects, true of the Corporation itself.
Yet transforming and liberating the Archives and the Corporation at the same time will almost certainly – I’ll go further… I’d bet my house on it – lead to the unexpected emergence of something we cannot yet imagine that arises from bringing both the BBC, and its past, into the digital age.
With all that in mind, let me try to explain my vision for a genuinely “open” BBC by reaching for an historical analogy from the Corporation’s own past. Please bear with me for a few minutes!
Transport yourself back one hundred years.
It’s 1913 and no one yet has ever heard or seen a “programme.” No one has the faintest inkling of what one might sound or look like – because no-one has even imagined the idea of “broadcasting.” The Today programme does not exist. There are no cookery shows, or talent shows and Big Brother is still statistically less likely than the Titanic sinking on its maiden voyage… no, hang on, wait! – that *did* happen the year before, didn’t it... oh well…
Anyway, the technology to send a wireless signal from one place to another *does* exist, although many people back then think it is a flawed technology with little or no value to society.
All the *useful* communications services were run by the GPO – the General Post Office:
For a penny you could send a sealed message to anyone, anywhere. Then the telegraph enabled you to send a sealed message to someone in a fraction of the time with minimal human intervention. Then came the telephone and you could then actually talk directly with your correspondent in real time.
But Marconi wireless telegraphy? A totally, utterly useless technology. Why? Because not only can the person you’re talking to hear you… everyone else can hear you too!
The military however did think it *could* be useful and so the GPO became responsible for keeping it under control.
Thus three forms of permits were created:
If you wanted to make the equipment that could send or receive wireless signals, you needed a permit from the Post Office.
If you wanted to transmit wireless signals from one place to another, you needed a permit from the Post Office.
And if you wanted to own the equipment to ‘Listen In’ to wireless signals, in particular because you could hear military signals, you needed to apply for a permit from the Post Office. A bit like an ownership permit for a gun, or a dog – you need one because you’ve got one.
Then in 1914 the First World War broke out and the Post Office rescinded the permits to transmit wireless signals. Eight years later, in 1922 the original BBC – the British Broadcasting Company – was formed from a consortium of radio manufacturers that owned the six key patents and this BBC produced a set of standards that determined the manufacture of radios. As long as you conformed to these standards you no longer needed permission from the GPO.
Reith became the General Manager and in less than four years single-handedly destroyed the British Broadcasting Company.
Which, it turns out, was a very good thing indeed because he created in its stead the British Broadcasting Corporation we all grew up with, taking broadcasting as we know it to a whole new level.
The radio manufacturers had originally come together to form the first BBC because without anybody broadcasting anything worth listening in to, their products weren’t much worth buying.
Reith, on the other hand, realised there was something much, much bigger at stake.
What Reith realised was that wireless broadcasting was far, far too important a medium to rest in the hands of what was, on the face of it, a commercial entity: broadcasting, he believed, had the potential to impact the whole of society for the better, and he was right.
It wasn't that Reith believed that the transmitters should only be controlled by public entities (although he may well have done), but that the medium itself – the public space lying between transmitter and receiver – was a uniquely powerful resource in which the public's needs should always be foremost.
He fought and won the battle to prevent the airwaves from being controlled by government or influenced by payment. It was to be unmetered, unlimited, unobserved and unbiased.
And so, as I’ve said, the permit to transmit and the permit to make equipment disappeared.
The only permit we never got rid of was the one to own the equipment that could receive wireless signals, free at the point of use, and that became the licence fee that we still have today. The machine itself was to be broadcaster agnostic, capable of receiving signals from anyone legally permitted to send them, without any intervening conditions or restrictions, nor politically imposed limitations or commercial interventions.
As far as I am aware no nation has ever come up with a better service nor a better way of financing one of comparable public benefit or value, and most, I’m pretty sure, would exchange their own for the BBC.
The Licence Fee is certainly the most notorious aspect of what makes the BBC singular in the world of media but there are also other, less well known and less well appreciated components that make up the BBC’s Genome – the essential strands of the BBC’s unique DNA. I’ll give you an example of what I mean:
At the heart of Reith’s BBC were four principles:
- It was not for profit.
- It was to guarantee universal access.
- It was to deliver to the highest possible standards.
- It was operated under single, centralised control.
Those four things, taken together, are things the market can never deliver. In fact the very theology of The Market would openly decry each and every one of these principles as a portent of certain failure. Value-destroying evils that must be stopped at any cost.
Nevertheless, using these guiding principles, the BBC guaranteed that a publicly funded broadcasting signal was always available to everyone, free at the point of use, regardless of wealth or location, of race or religion, of ability or disability. The broadcasting signals one person heard were the same signals everybody heard. And no amount of money nor privilege nor influence could get you a better version of the BBC.
Information at last was genuinely open.
But now move forward to 2013, and the BBC’s signals or output (but NOT ‘content’… please, please never ‘content’) is mediated in all manner of ways; by the time it reaches a piece of hardware, for example, it has passed through various intermediaries, controlled by ISPs, chip and device manufacturers, aggregators, identity providers, software companies and so on.
Each of whom has a vested interest in helping or hindering your route to the BBC, each of whom benefits from creating friction or harvesting *your* personal data or simply using gradual, asynchronous obsolescence as a business model.
In the past, that could never have been the case because the BBC would never have let it happen.
It was never the case that you could buy a TV that could not get BBC 1. Or a radio that could not receive Radio 4.
It was never the case that you could no longer access the News because you had reached your allotted bandwidth limit.
And it was certainly never, ever the case that people were collecting data about your patterns of consumption and interaction with your friends without your consent and even without your knowledge – and then handing your personal data to other businesses or selling it to the police, as was reported to have happened last week.
The network has become a crowded, cacophonous environment. And grows ever more so with each passing day.
Why all this mediation? Well, in truth, there’s money in the friction, there’s no money in simplicity.
The challenges of securing an open, frictionless environment are, I believe the term is, “decidedly nontrivial”, which I understand to mean ‘requiring real thought or significant computing power’. In other words, the challenges are massive…
For example, even at the BBC we struggle with the idea of linked open data. There are people who do linked data in the BBC, but it’s left to the individual product managers to decide for themselves what their approach is going to be – it’s not yet part of the core philosophy of the current BBC.
The BBC must, of course, commit to the linked open semantic web by publishing open data if it is to fulfil its public service mission in the 21st century – and yet I’d have to admit that we still seem to have some way left to go ourselves.
And if we are to ensure that data is open and permanently available, then we need to change the ways that the BBC produces stuff in future. The easiest way is to simply ensure that the data are there and in the right shape from the beginning.
The BBC is also committed to ensuring that it has in place an effective means of enabling our partners to access these data, including metadata in all its variants, for everything it creates in the future.
A key objective for us all must therefore be to ensure that all large, publicly-funded organisations, like the BBC, publish linked open data on the web according to emerging standards, in a way that enables “the crowds” – both crowds of people and of machines – to run analyses and extract meaning and value from that data and to use it and reuse it as they wish.
Some don't believe that the crowd will bother. I say one word to them: Wikipedia.
Another of the biggest challenges is to persuade our public organisations to let their data go, under a permissive licence such as CC-Zero or, to a lesser degree perhaps, the Open Government Licence, and to be content that people will eventually make something of value from it – with any money made along the way being eventually returned to the public purse via the taxes paid by companies which successfully license the materials.
After all, it was the public who paid for the creation or acquisition of the materials in the first place and, with very few exceptions, the *data* at least belongs to them, rather than to the institutions who simply hold it in trust on behalf of that public.
Now we all know it’s a big step for organisations to go from the philosophising about open data to actually releasing it in the wild. But it has to be done. I’m glad to say that both the BBC and the British Library are relatively forward-thinking in this respect.
Our own RES project – the Research and Education Space project – is another major initiative in this sphere. It’s a non-audience facing collection of data assets and media, pooled from both the BBC and cultural partners. Organisations authorised by the Educational Recording Agency will then be able to make the assets in RES available to authenticated learners and researchers, using their own audience propositions. We’re working with partners including the British Film Institute, the British Universities Film and Video Council, Jisc and the Welsh Government.
Ultimately it will provide a platform, initially only for those in formal education, through which linked open semantic data can be analysed. All of the metadata in RES will be open to all, but access to much of the *media* itself is going to require user authentication.
Seen together, these building blocks are starting to form the foundations of a revised set of standards or principles that need to be added to the DNA of the BBC, underpinned by the next Charter and Agreement.
Reith’s idea of the network as a guarantor of universal access, as a public space which should not be commoditised is something that my small team at the BBC have spent considerable time and effort focussing upon.
We’ve asked ourselves over and over again “what is the equivalent of that Reithian sense of a public space in the digital world?” What is the place where commercial pressures can't stand between you and the things held on your behalf; where so-called ‘advertisers’ aren't collecting reams of information about your every move; where rights and responsibilities are communicated clearly and comprehensibly; and where gatekeepers don't stand between you and your creative ambition?
The answers to these questions must also be a central theme of the next Charter in order to address and resolve major issues such as guaranteed access and the security of the citizen in the digital present.
Let me start to answer that fundamental question – What would Lord Reith have done? – by going back to the first principles which underlie the BBC’s approach to “archives” and “archiving.”
I’m afraid that neither Mr Marconi nor Mr Logie-Baird included a self-recording capability within their patents. And nobody back then much imagined that ‘broadcasting’ needed to be recorded – transmitting a wireless signal simultaneously into every home in the nation seemed miraculous enough at the time, so asking for a recorded copy to keep as well, might have sounded a little – dare I say ‘greedy’?
The miracle of broadcasting was that which Reith could see almost before anyone else; its ability to encourage Nation to speak peace unto Nation, to leap over concert hall queues into the best seats in the house; to eradicate loneliness and isolation; to bring sweetness and light into every life; to enrich and to inspire; to inform and to educate and to entertain.
I know it may seem odd to us today but at that time the act of broadcasting itself was akin to any other live performance. If you weren’t there when it happened then you simply missed it. Recording it wasn’t something many people would have imagined doing, even if it had been an easy thing to do – which it wasn’t.
Over the years, remembered broadcasts were to be recast as those half-imagined memories in time, like your star performance in the school play or your first kiss or 90th-minute goals at White Hart Lane.
So there you have it. The machines didn’t record by themselves and so every time somebody thought – “Hey! You know what…? this afternoon’s speech by the King might be worth preserving for posterity”…, they had to lug a whole recording and record pressing set-up into the studio, and it wasn’t really until video tape came along, in the late 50s and early 60s, that it was even possible to ‘consider’ recording programmes on more than a very occasional basis.
And then of course there’s the whole business of archiving itself to be taken into account – the purpose of preserving for posterity or evidence – that is somewhat at odds with the expectations of today’s increasingly digital businesses who expect everything to be stored and retrievable – on demand.
Now here, there is a huge misconception of what the BBC retained, even *after* recording it became possible. There were always many other considerations, that have determined what was kept and what wasn’t, most having very little to do with the actual ‘programmes’ themselves;
Shelf space is one, and so is the cost of perpetual preservation, or indexing for rediscovery in the likelihood of the material ever being reused – or even being *allowed* to be reused; the rights of the contributors and the prevailing laws at any given time, and so on… Therefore to the great disappointment of many, we don’t have a copy of *everything* we’ve ever made – but we do, nevertheless, have a lot, including over 500,000 programmes.
But, to think of the collection as just the finished programmes is misleading. We also currently hold more than 2½ million different items of film and video and over 1 million items of audio. Add to that 4 million music scores, 6 million photos, plus letters, scripts and other documents filling miles and miles of shelving and, of course, one of the largest record collections in the world.
A unique and priceless collection that has the potential to transform our understanding of the past and stimulate our creative industries in the future.
All of these assets contain data – Big Data – just like every other memory institution and public repository. We all need to work in collaboration to bring it back to life.
Now, when I was appointed into my current role about five years ago, I was asked what we should do with all of that. Unlike a lot of people I already knew that there wasn’t a complete set of Doctor Who or Whicker’s Worlds or Top of the Pops so I doubted the answer was as simple as to create a sort of cross between the BBC iPlayer and Blockbuster Video, even though I realised that that was what most people were initially expecting me to say.
Instead, I set about asking as many people as I could, what *they* thought the BBC should do with its archives. Now if you ask *that* question to twenty people, you’ll get thirty different answers ranging from “give it away for free to school kids” to “sell it to overseas markets through an international On Demand service”. From making it available to artists to make new creative works to putting it on a wiki and letting the more knowledgeable general public identify and annotate it.
The most powerful call, oddly enough, was for us to find a way to unite it with other archives and records in the UK – and beyond – in order that the fragmented, partially told stories we hold may be joined to other important elements or fragments in other collections.
And so, rather than hold a beauty parade or devise some other method of deciding which ideas were worth pursuing and which were not, we set out to see if we could create the necessary conditions through which *all* of the really innovative ideas people had put forward, might be possible.
In other words, to liberate our information and data in order to create the greatest possible value, with the help of both institutional partners and the public. And to atomise our archives, thinking of them not just as programmes and documents, but sets of frames, pixels, sentences and words.
Such an idea is certainly not an easy one to put into practice, particularly that part about uniting the BBC’s archives with others, and therefore it would need to be built upon – in fact, it would entirely rely upon – a vision that would have to be shared by each and every participating archive and archivist.
Fortunately for me, the small but beautifully *informed* Archive Development team are intellectually capable of conceiving this high degree of complexity and are also skilled enough to create demonstrators and early versions of working models to articulate these concepts. Just as the ground breaking engineers did in the past, at Kingswood Warren or Alexandra Palace.
So we thought we’d give it a try and working with an ever growing number of organisations, mainly in the public sector, over the past few years, we have made a lot of progress towards what that shared vision might be, and we have coined the umbrella term ‘Digital Public Space’ to try to encapsulate the many principles that it contains.
In the past of course, the memory institutions have been custodians of culture through necessity – if the British Library or the other Statutory Deposit Libraries didn’t save books then no one did. One of the effects of the internet is that we all now, as individuals, have vast amounts of storage or archival capability within our own gift.
We need to be clear that institutions such as the BBC and the British Library are custodians of culture rather than the sole arbiters – they aren’t, or shouldn’t be, ultimately the *only* ones who decide what is important for society. In the Digital Public Space we are also empowering people so that they don’t have to wait for the Imperial War Museum or The National Archives to release material in order to have a record of what society was like at a given time – the real power of the Digital Public Space resides in individuals and institutions working *together* to create something that is far more than the sum of its parts.
So far we have worked on a wide range of pilots and prototypes – mostly ‘under the radar’ so to speak but occasionally we have created public-facing experiments. Take our collaboration with Arts Council England – and many others – to produce The Space, a temporary Arts and Culture digital service for example. It contains many of the characteristics of the Digital Public Space we are aiming towards, such as neutrality and open metadata. As a consequence we have a much better idea now of what those challenges are – such as making sure that everyone involved in the chain knows they need to make their data open.
Perhaps here I should set out my vision of this Digital Public Space.
I see the Digital Public Space as an open and accessible, digital environment that would always (yes, always) put the needs of the public first and foremost. It would guarantee universal access to the vast wealth of our nation – the UK’s Collective Abundance – free at the point of use, just as we do with traditional broadcasting.
It would permit, encourage and even require contributions from the whole of our society. It will be a place where the national Conversation thrives, where all contributions are welcomed, where every story, no matter who tells it has value.
For the BBC, this presents a number of challenges, only one of which is the problem of permanent access. In this environment, things no longer disappear after a certain period of time. Material that once would have flourished briefly before languishing under lock and key, diminishing in value and being cast away — would now be available for ever.
This is something many now take for granted but for the BBC it’s still relatively new and so we’ll need time to learn how best to make such things available forever.
In fact our real task is to empty our archives, to make sure that material is never locked away again. We need to stop putting stuff in dusty vaults; we need to put information, data, everything, in open spaces where anyone can get at it.
Of course it’s impossible to access material like this without knowing that it’s there in the first place or, at least, without having access to the metadata – the catalogues, for instance. So aligning and then publishing our catalogues is an essential first step.
Again by working with partners, we have created an early prototype to map the data from the various BBC archives along with our friends at The British Library, The National Archives, The Royal Opera House, Kew Gardens and the BFI and so far it seems to work. We’d love to hear from anyone who’d be interested in joining the pilot if for no other reason than to see where it breaks down and how we might improve the data model.
But even once you ‘do’ know what’s held in each collection you still have to overcome the complex rights issues, and any number of other time-consuming and costly restrictions. There are many other issues besides the huge first step of exposing the metadata that must be resolved before any system could support our vision of a frictionless exchange between participating organisations that will genuinely work to the benefit of all users.
In many respects, the toughest issue is simply bringing together the organisations that hold the bulk of the nation’s culture and heritage in their archives. Finding that London 2012 spirit and working towards a common, shared success.
So the challenge I’d like to set you all is this: to find a way to agree and then to work together with us to achieve mutually beneficial outcomes on behalf of our organisations and the public that most, if not all of us, exist to serve. If we can make this work, then I believe we will transform not only our own industry but also the entire creative sector.
Because within our archives I believe are the reserves of raw material that can reignite the UK – metaphorically you might describe it as the Coal of the Digital Revolution, the foundations of the emerging Creative Economy.
If we can find a way to mine the UK’s archives, and if we can make the assets and the data digitally available, then we will create an entirely new Era of Possibility.
In the case of the BBC it’s not just in the programmes but in the individual elements *within* the programmes that the inert value resides… the stills and documents, the information and data – all of these can be used and reused countless times in countless ways to produce brand new works, brand new products, brand new services.
In the case of other archives it may actually be the data that has the most immediate value, or the parts of the collections that have been stored for decades due to lack of display opportunities or simply a lack of resources.
Bringing them into alignment, making them ‘interoperable’, will ultimately create new opportunities to innovate and result in new jobs, new markets and, eventually, entirely new industries. Services which can be either freely available in the public domain or commercially exploitable.
And even the processes to transform these otherwise inert assets will become business opportunities. Just as with coal mines in earlier centuries… there’ll be a need for an entirely new, skilled, labour force.
But, of course, without the exploitation of that labour, low wages, health and safety risks and pollution that went along with coal mining and much of that particular Industrial Revolution. This has the makings of something far, far more sustainable, and far more beneficial to society – for generations to come.
And central to all of this will be the librarians and archivists. No longer at the end of the chain but now firmly in the centre, setting the requirements of the creative process itself. Managing the assets and resources, into and throughout the value chain. For without planned Data and Asset Management, none of this will be possible.
Some of these skills exist today – but we don’t have enough of them. Many, we don’t yet have at all.
But when, as a nation, we master these new skills, when we move from small scale to industrial scale – then the emerging revolution will really begin, changing the UK’s reputation around the globe once again.
And that I believe is the contribution we can make to rediscovering the Britain we’ve left behind. Buried in our archives is the story of our past and also the means to transform that past into a future worthy of our heritage.
We can do this. We don’t need permission. Just the belief that we can do better by working together than we could ever achieve by working apart. To rediscover our shared values and our common purpose.
And so, I’ll apologise once again for the lack of jokes today. I actually did spend quite some time searching for some but the only thing that came up under Information Records Management and Humour was… that there wasn’t any. So that’s something else I think we can all work on together to fix.
In the meantime I’d like to leave you with a few quotes I *did* find about information, libraries and librarians.
First of all, this one from John F. Kennedy:
“We are not afraid to entrust the people with unpleasant facts, foreign ideas, alien philosophies, and competitive values. For a nation that is afraid to let its people judge the truth and falsehood in an open market is a nation that is afraid of its people.”
Then this from Peace campaigner Norman Cousins:
“The library is not a shrine for the worship of books. It is not a temple where literary incense must be burned or where one's devotion to the bound book is expressed in ritual. A library, to modify the famous metaphor of Socrates, should be the delivery room for the birth of ideas - a place where history comes to life.”
Which is perhaps better expressed in this line from Monty Python:
“You see, I don't believe that libraries should be drab places where people sit in silence, and that's been the main reason for our policy of employing wild animals as librarians.”
But I think the final word *must* go to Spider Robinson in The Callahan Touch:
"Mary Kay is one of the secret masters of the world: a librarian. They control information. Don't ever piss one off."
Thank you all for listening.