The volume of data in the world is massive and growing exponentially. Every day, studies tell us we are in the midst of a fourth revolution: the data revolution. 90% of the data ever generated worldwide has been generated in the last two years, and recent forecasts state that we will reach 163 zettabytes of data by 2025 (IDC Whitepaper). Whilst the transformational potential of data is not disputed, cities still have a long way to go to fully leverage it for more responsive and effective city management. The emergence of a) the Internet of Things, generating useful data from city sensors, and b) cloud storage and computing has created new opportunities for harnessing city data, generating new tools, techniques and businesses focused on enhancing city understanding and the city experience. Surprisingly, despite these new capabilities, only 2% of all the data in the world is effectively utilised, with just 12% of city data used for policy making.
As expected, the private sector is ahead of the game and owns much of the richer, more useful data generated daily by the city: mobility data from people’s smartphones, commerce data from store and credit cards, data from privately owned sensors and more, which it uses to increase the quality of its services and its revenue potential. Government, despite the move towards Open Data in the last ten years, is being left behind. Whilst thousands of open datasets are available for reuse across Europe, take-up remains low. European, national and local open data portals have poor name recognition, and fewer datasets are downloaded than expected. Europe is not reaching its full potential, with an overall maturity score of 65% in the 2018 EU Open Data Maturity Report. This data has not turned out to be the ‘new oil’ many predicted.
This struggling impact is due to many factors, including: a lack of data quality across all its dimensions (consistency, accuracy, coverage, freshness and completeness); a lack of data interoperability to ensure the mobility of services keeps up with the mobility of users; a lack of data understanding (data literacy) to enable meaningful interpretation of data as information; and a lack of useful real-time data sources, which have the potential to deliver the most impact.
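To make two of these quality dimensions concrete, the sketch below scores a batch of city sensor records for completeness and freshness. This is an illustrative example only: the field names, record shape and one-hour freshness threshold are assumptions for the sake of the sketch, not part of any actual city data standard.

```python
from datetime import datetime, timezone

# Illustrative only: required fields and thresholds are invented for the example.
REQUIRED_FIELDS = ("sensor_id", "value", "timestamp")

def completeness(records):
    """Fraction of records that carry every required field (non-None)."""
    if not records:
        return 0.0
    ok = sum(all(r.get(f) is not None for f in REQUIRED_FIELDS) for r in records)
    return ok / len(records)

def freshness(records, now, max_age_seconds=3600):
    """Fraction of records no older than max_age_seconds at time `now`."""
    if not records:
        return 0.0
    fresh = sum((now - r["timestamp"]).total_seconds() <= max_age_seconds
                for r in records if r.get("timestamp") is not None)
    return fresh / len(records)

if __name__ == "__main__":
    now = datetime(2019, 12, 1, 12, 0, tzinfo=timezone.utc)
    records = [
        {"sensor_id": "a1", "value": 42.0,
         "timestamp": datetime(2019, 12, 1, 11, 30, tzinfo=timezone.utc)},
        {"sensor_id": "a2", "value": None,
         "timestamp": datetime(2019, 12, 1, 9, 0, tzinfo=timezone.utc)},
    ]
    print(completeness(records))       # 0.5: one record is missing its value
    print(freshness(records, now))     # 0.5: one record is older than an hour
```

The other dimensions (consistency, accuracy, coverage) need reference data or ground truth to score, which is part of why they are harder for cities to achieve.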
Aside from the data itself, an over-reliance on traditional analytics techniques, and the lack of infrastructure with the processing power needed to analyse the volume and variety of city data fast enough, have also hampered progress.
Despite the economics of sharing hardware and software, in reality the cost of sending data to and retrieving it from the cloud is often higher than that of in-house storage.
Data sharing and reuse also amplifies subtle and complex questions of interpretation, transparency, collaboration and trust that form a number of data ethics concerns, along with confusion around balancing the principles of ‘openness’ and ‘privacy’. Use of data must meet wider ethical requirements, including: a clear public benefit; use only to the extent proportionate to the need; recognition of the limitations of the data used (including the risks of taking decisions on incomplete or inaccurate data); and a precautionary approach, with transparency and accountability in the acquisition, processing, storage and use of data, i.e. ensuring that the algorithms driving HPC analytics are open and fair.
Whilst there is a big movement towards open government, administrations should not automatically assume that ‘open’, ‘shared’ and ‘public’ are synonymous with the principle of public good. Many who are reluctant to make data public have concerns about how it will be reused; after all, research has shown several examples of surprising correlations that can unintentionally disclose sensitive information about individuals in public datasets (Accenture, The Ethics of Data Sharing).
Even if personal privacy is protected, administrations must consider whether the citizens providing the data would support the way it is being used, and whether they would have provided it had they known how it would be used. This is not a straightforward process: policy and regulation must be developed with stakeholders and social partners, as it cannot be left to technologists alone (a 2018 Stack Overflow survey found that 80% of developers wanted to know what their code would be used for, yet 40% would not rule out writing unethical code, and 80% did not feel responsible for how their unethical code was used).
Together, the issues above put cities off publishing and sharing data, meaning many notable European open data projects focused on empowering and upskilling citizens in using data and enhancing city decision-making never achieve their true potential for collaboration and innovation. Public administrations seem destined to remain stuck in a world of pilots, with data literacy capacity remaining low, so their results rarely hit the mainstream the way private sector offerings do.
Imagine if cities could overcome these challenges and apply lessons from the private sector to move beyond the 12%, using fresh approaches to bring together existing and new data sources (structured and unstructured) via an underpinning infrastructure that creatively aggregates them in a way that makes the data more valuable in both its quality and its usefulness. A new cloud-enabled approach for the public sector that aggregates city data in line with legal and ethical principles and makes it intuitively easy for all to understand. An approach that removes concerns around ethics and skills and unlocks the real potential of open data for driving future decisions for cities, whilst simultaneously enhancing today’s city experience for all.
To take advantage of the increasing opportunities presented by vast amounts of city data for improving policy making, three major barriers must be overcome:
1. Lack of Access to Computing Power: Cities need cost-effective access to high levels of computing power to creatively unlock tangible benefits from large quantities of different data, and enable real-time decision making.
2. Lack of Data Literacy: City data needs to be easier for everyone to understand, through simple interfaces that let people grasp the issue being addressed and contribute ideas, thoughts, their own data and feedback towards creating a more sustainable future.
3. Lack of Data Ethics: As policymakers move towards using data from multiple sources, new and creative data models, advanced analytical techniques and easy-to-use tools, it is increasingly crucial to ensure that the way data is collected and used conforms not only to the requirements of personal data privacy but also to the wider ethical principles of public benefit, proportionality, a precautionary approach and transparency.
The new H2020 project DUET tackles the challenges outlined above, leveraging the European Cloud Infrastructure to bring new opportunities to policy making as follows:
1: Providing access to needed computing power: Real-time city management needs algorithms and computing power that can scale to distill oceans of open data, deliver insights and maintain efficiency. Cloud computing gives cities access to highly scalable hardware and software resources for the overwhelming majority of IT use cases. However, future scenario prediction for policy modelling requires cities to execute heavy algorithms with near real-time deployment and processing, which in turn requires high-performance computing (HPC).
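The scaling idea at the heart of this point can be sketched in a few lines: the same aggregation job is split into chunks and spread across however many workers the infrastructure provides. Threads stand in here for what would be processes or cluster nodes in a real cloud or HPC deployment, and the data is invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor

def chunk_sum(chunk):
    """Partial sum of one chunk; in a real deployment this runs on a worker node."""
    return sum(chunk)

def parallel_mean(readings, workers=4):
    """Mean of `readings`, computed as partial sums on parallel workers."""
    size = max(1, len(readings) // workers)
    chunks = [readings[i:i + size] for i in range(0, len(readings), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(chunk_sum, chunks))
    return sum(partial_sums) / len(readings)

readings = [float(i % 100) for i in range(10_000)]
print(parallel_mean(readings))  # 49.5, identical to the serial result
```

The pattern matters more than the toy statistic: the per-chunk work can be an arbitrarily heavy model run, and adding workers shortens the wall-clock time without changing the result.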
Cloud computing has not been used for high-performance computing (HPC) to the same degree as other use cases for several reasons, chiefly cost, but DUET will advance this area by providing a new shared approach for its use in policy making and city management: the Digital Twin.
A “Digital Twin” is a new concept consisting of a continuously learning digital copy of real-world assets, systems and processes that can be queried for specific outcomes. DUET’s Digital Twins will consume open data and data models from different sources in the city and integrate them with new technology capabilities including HPC, Artificial Intelligence and advanced analytics to provide a replica city environment where policy experimentation can safely take place. By predicting asset behaviour and the capacity to deliver specific outcomes within given parameters and cost constraints, the Digital Twin provides a risk-free experimentation environment that tells stakeholders what they need to do with the assets in the real world to achieve both the most effective long-term policy outcomes and the best short-term operational decisions.
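The "query the replica, not the city" idea can be illustrated with a deliberately tiny toy twin. Everything here is an invented assumption for the example: the road segments, the linear emissions model and the reroute share bear no relation to DUET's actual models; the point is only that the scenario runs on a copy while the baseline stays untouched.

```python
class ToyCityTwin:
    """A toy digital twin: a queryable copy of one city asset layer (traffic)."""

    def __init__(self, traffic):
        # traffic: vehicles/hour per road segment (invented numbers).
        self.traffic = dict(traffic)

    def predicted_no2(self):
        # Toy assumption: emissions scale linearly with total traffic.
        return 0.01 * sum(self.traffic.values())

    def what_if_close(self, segment, reroute_share=0.8):
        """Return a NEW twin with `segment` closed and a share of its
        traffic rerouted evenly onto the remaining segments."""
        scenario = {k: v for k, v in self.traffic.items() if k != segment}
        displaced = self.traffic[segment] * reroute_share
        per_segment = displaced / len(scenario)
        for k in scenario:
            scenario[k] += per_segment
        return ToyCityTwin(scenario)

twin = ToyCityTwin({"ring_road": 3000, "high_street": 1200, "bypass": 1800})
scenario = twin.what_if_close("high_street")
# The baseline twin (and the real city) is untouched; only the copy changes.
print(round(twin.predicted_no2(), 2), round(scenario.predicted_no2(), 2))  # 60.0 57.6
```

A real twin would replace the one-line emissions formula with calibrated traffic and air quality models running on HPC, but the interaction pattern, asking the replica "what if?" before touching real assets, is the same.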
2: Making data easier to understand: Easy-to-understand visualisations are a critical factor in building trust in using data for democratic decision making. However, most visualisation platforms still require a degree of geo-expertise to extract intelligence from them. DUET is different: it provides a 3D interface for its Digital Twins alongside a 2D offering. Users, regardless of their technical or academic background, will be able to walk through DUET’s virtual 3D city neighbourhoods and directly see dynamic data readings from multiple sources in a familiar context that makes them easy to understand. For example, users may see air quality as colours, traffic congestion as lines, incident sites as icons, and so on. This simple, relatable way of viewing the city through multiple integrated data sources brings to life the tangible, systemic impacts of policy options, fuelling ‘what if’ experimentation that unleashes the creative and innovative qualities of all participants. This levelling of the field means that policy makers, administrative workers, emergency services, entrepreneurs, businesses and citizens can all participate in co-creation and consultation exercises as part of the traditional policy making cycle.
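The colours/lines/icons idea above amounts to a mapping from raw readings to visual encodings. The sketch below shows one hypothetical version of that mapping; the thresholds, category names and icon set are illustrative assumptions, not DUET's actual scales.

```python
# Illustrative thresholds only; real scales would follow official air
# quality and traffic standards.

def air_quality_colour(no2_ugm3):
    """Map an NO2 reading (ug/m3) onto a traffic-light colour band."""
    if no2_ugm3 < 40:
        return "green"
    if no2_ugm3 < 100:
        return "amber"
    return "red"

def congestion_style(speed_ratio):
    """Map the current/free-flow speed ratio onto a line style."""
    if speed_ratio < 0.5:
        return "thick-red"
    if speed_ratio < 0.8:
        return "medium-amber"
    return "thin-green"

ICONS = {"accident": "warning", "roadworks": "cone", "event": "star"}

def render_layer(reading):
    """Turn one raw reading into its on-map visual encoding."""
    kind = reading["kind"]
    if kind == "air_quality":
        return {"colour": air_quality_colour(reading["no2"])}
    if kind == "traffic":
        return {"line": congestion_style(reading["speed_ratio"])}
    if kind == "incident":
        return {"icon": ICONS.get(reading["type"], "unknown")}
    raise ValueError(f"unknown layer kind: {kind}")

print(render_layer({"kind": "air_quality", "no2": 55}))  # {'colour': 'amber'}
```

The value of this indirection is exactly the levelling described above: a citizen never sees "55 ug/m3", only an amber street, yet the underlying number remains available to the specialist.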
3: Establishing Ethical Principles for Data-Driven Decisions: The game-changing, cloud-based Digital Twin infrastructure, with its deep-dive visualisation platform for policy experimentation, will boost collaboration and policy innovation and bring new discoveries and intelligence through novel views of the data. Using visualisation tools, problems can be analysed in greater depth, as many multi-disciplinary and multi-sectoral layers of data relating to the physical and social world can be considered together. Using a Digital Twin, users can explore policy impacts across a whole city, rather than just one or two small localities. Instead of providing complicated graphs and multiple versions of maps from different industries to illustrate the impacts of, for example, road routing decisions on mobility, air quality and health, the Digital Twin provides one replica of the city for all to use as a trusted baseline for exploring the systemic impact of decisions. Visualising multiple data sources through the Digital Twin makes relationships more apparent and dependencies and interactions more clearly visible, while the trade-offs between possible solutions can be modelled and evaluated. For the first time, complex policy will be open for all: anyone can explore and understand the situation that needs to be improved, experiment with ideas, co-create potential solutions, and contribute to policy formalisation.
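The trade-off evaluation described above is, at its simplest, a multi-criteria comparison of policy options across several impact layers at once. The sketch below shows one minimal form of it; the candidate policies, impact metrics and weights are all invented for illustration, and a real analysis would draw these scores from the twin's models rather than a hand-written table.

```python
# Toy multi-criteria comparison: lower is better for every metric here.
# Options, metrics and weights are invented for the example.
options = {
    "status_quo":   {"congestion": 0.70, "no2": 0.60, "noise": 0.50},
    "close_centre": {"congestion": 0.55, "no2": 0.35, "noise": 0.30},
    "new_bypass":   {"congestion": 0.40, "no2": 0.50, "noise": 0.60},
}
weights = {"congestion": 0.5, "no2": 0.3, "noise": 0.2}

def score(metrics):
    """Weighted sum across impact layers (lower = better overall)."""
    return sum(weights[m] * v for m, v in metrics.items())

# Rank the options so the trade-offs are visible side by side.
for name in sorted(options, key=lambda n: score(options[n])):
    print(name, round(score(options[name]), 2))
```

Crucially, making the weights explicit is itself an ethics mechanism: publishing them alongside the twin's outputs is one concrete way to meet the transparency principle set out earlier, since anyone can rerun the comparison with weights reflecting their own priorities.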
The DUET project officially begins in December 2019.