Open energy system database projects employ open data methods to collect, clean, and republish energy-related datasets for open use. The resulting information is then available, given suitable copyright, for statistical analysis and for building numerical energy system models, including open energy system models. Permissive licenses like Creative Commons CC0 and CC BY are preferred, but some projects will house data made public under more restrictive proprietary licenses.
The databases themselves may furnish information on national power plant fleets, renewable generation assets, transmission networks, time series for electricity loads, dispatch, spot prices, and cross-boarder trades, weather information, and similar. They may also offer other energy statistics including fossil fuel imports and exports, gas, oil, and coal prices, emissions certificate prices, and information on energy efficiency costs and benefits.
Much of the data is sourced from official or semi-official agencies, including national statistics offices, transmission system operators, and electricity market operators. Data is also crowdsourced using public wikis and public upload facilities. Projects usually also maintain a strict record of the provenance and version histories of the datasets they hold. Some projects, as part of their mandate, also try to persuade primary data providers to release their data under more liberal licensing conditions.
Two drivers favor the establishment of such databases. The first is a wish to reduce the duplication of effort that accompanies each new analytical project as it assembles and processes the data that it needs from primary sources. And the second is an increasing desire to make public policy energy models more transparent to improve their acceptance by policymakers and the public. Better transparency dictates the use of open information, able to be accessed and scrutinized by third-parties, in addition to releasing the source code for the models in question.
Video Open energy system databases
General considerations
Background
In the mid-1990s, energy models used structured text files for data interchange but efforts were being made to migrate to relational database management systems for data processing. These early efforts however remained local to a project and did not involve online publishing or open data principles.
The first energy information portal to go live was OpenEI in late-2009, followed by reegle in 2011.
A 2012 paper marks the first scientific publication to advocate the crowdsourcing of energy data. The 2012 PhD thesis by Chris Davis also discusses the crowdsourcing of energy data in some depth. A 2016 thesis surveyed the spatial (GIS) information requirements for energy planning and finds that most types of data, with the exception of energy expenditure data, are available but nonetheless remain scattered and poorly coordinated.
In terms of open data, a 2017 paper concludes that energy research has lagged behind other fields, most notably physics, biotechnology, and medicine. The paper also lists the benefits of open data and open models and discusses the reasons that many projects nonetheless remain closed. A one-page opinion piece from 2017 advances the case for using open energy data and modeling to build public trust in policy analysis. The article also argues that scientific journals have a responsibility to require that data and code be submitted alongside text for peer review.
Database design
Data models are central to the design and organization of databases. Open energy database projects generally try to develop and adhere to well resolved data models, using de facto and published standards where applicable. Some projects attempt to coordinate their data models in order to harmonize their data and improve its utility. Defining and maintaining suitable metadata is also a key issue. The life-cycle management of data includes, but is not limited to, the use of version control to track the provenance of incoming and cleansed data. Some sites allow users to comment on and rate individual datasets.
Data copyright and database rights
Issues surrounding copyright remain at the forefront with regard to open energy data. As noted, most energy datasets are collated and published by official or semi-official sources. But many of the publicly available energy datasets carry proprietary licenses, limiting their reuse in numerical and statistical models, open or otherwise.
Measures to enforce market transparency have not helped because the associated information is again normally licensed to preclude downstream usage. Rather this data is provided for inspection only. Transparency measures include the 2013 European energy market transparency regulation 543/2013. Indeed the EU directive "is only an obligation to publish, not an obligation to license".
Energy databases are protected under general database law, irrespective of the legal status of the information they hold.
Energy statistics
National and international energy statistics are published regularly by governments and international agencies, such as the IEA. In 2016 the United Nations issued guidelines for energy statistics. While the definitions and sectoral breakdowns are useful when defining models, the information provided is rarely in sufficient detail to enable its use in high-resolution energy system models.
Published standards
There are few published standards covering the collection and structuring of high-resolution energy system data. The IEC Common Information Model (CIM) defines data exchange protocols for low and high voltage electricity networks.
Maps Open energy system databases
Open energy system database projects
Energy system models are data intensive and normally require detailed information from a number of sources. Dedicated projects to collect, collate, document, and republish energy system datasets have arisen to service this need. Most database projects prefer open data, issued under free licenses, but some will accept datasets with proprietary licenses in the absence of other options.
The OpenStreetMap project, which uses the Open Database License (ODbL), contains geographic information about energy system components, including transmission lines. Wikipedia itself has a growing set of information related to national energy systems, including descriptions of individual power stations.
The following table summarizes projects that specifically publish open energy system data. Some are general repositories while others (for instance, oedb) are designed to interact with open energy system models in real-time.
Three of the projects listed work with linked open data (LOD), a method of publishing structured data on the web so that it can be networked and subject to semantic queries. The overarching concept is termed the semantic web. Technically, such projects support RESTful APIs, RDF, and the SPARQL query language. A 2012 paper reviews the use of LOD in the renewable energy domain.
Energy Research Data Portal for South Africa
The Energy Research Data Portal for South Africa is being developed by the Energy Research Centre, University of Cape Town, Cape Town, South Africa. Coverage includes South Africa and certain other African countries where the Centre undertakes projects. The website uses the CKAN open source data portal software. A number of data formats are supported, including CSV and XLSX. The site also offers an API for automated downloads. As of March 2017, the portal contained 65 datasets.
energydata.info
The energydata.info project from the World Bank Group, Washington, DC, USA is an energy database portal designed to support national development by improving public access to energy information. As well as sharing data, the platform also offers tools to visualize and analyze energy data. Although the World Bank Group has made available a number of dataset and apps, external users and organizations are encouraged to contribute. The concepts of open data and open source development are central to the project. energydata.info uses its own fork of the CKAN open source data portal as its web-based platform. The Creative Commons CC BY 4.0 license is preferred for data but other open licenses can be deployed. Users are also bound by the terms of use for the site.
As of January 2017, the database held 131 datasets, the great majority related to developing countries. The datasets are tagged and can be easily filtered. A number of download formats, including GIS files, are supported: CSV, XLS, XLSX, ArcGIS, Esri, GeoJSON, KML, and SHP. Some datasets are also offered as HTML. Again, as of January 2017, four apps are available. Some are web-based and run from a browser.
Enipedia
The semantic wiki-site and database Enipedia lists energy systems data worldwide. Enipedia is maintained by the Energy and Industry Group, Faculty of Technology, Policy and Management, Delft University of Technology, Delft, the Netherlands. A key tenet of Enipedia is that data displayed on the wiki is not trapped within the wiki, but can be extracted via SPARQL queries and used to populate new tools. Any programming environment that can download content from a URL can be used to obtain data. Enipedia went live in March 2011, judging by traffic figures quoted by Davis.
A 2010 study describes how community driven data collection, processing, curation, and sharing is revolutionizing the data needs of industrial ecology and energy system analysis. A 2012 chapter introduces a system of systems engineering (SoSE) perspective and outlines how agent-based models and crowdsourced data can contribute to the solving of global issues.
oedb
The oedb or OpenEnergy Database project is a collaborative versioned dataset repository for storing open energy system model datasets. A dataset is presumed to be in the form of a database table, together with metadata. Registered users can upload and download datasets manually using a web-interface or programmatically via an API using HTTP POST calls. Uploaded datasets are screened for integrity using deterministic rules and then subject to confirmation by a moderator. The use of versioning means that any prior state of the database can be accessed (as recommended in this 2012 paper). Hence, the repository is specifically designed to interoperate with energy system models. The backend is a PostgreSQL object-relational database under subversion version control. Open source licenses are specific to each dataset. Unlike other database projects, users can download the current version (the public tables) of the entire PostgreSQL database or any previous version. Initial development is being lead by the Reiner Lemoine Institute, Berlin, Germany.
Open Power System Data
The Open Power System Data (OPSD) project seeks to characterize the German and western European power plant fleets, their associated transmission network, and related information and to make that data available to energy modelers and analysts. The project is jointly managed by the University of Flensburg, DIW Berlin, the Technical University of Berlin, and the energy economics consultancy Neon Neue Energieökonomik, all from Germany. The project is funded by the Federal Ministry for Economic Affairs and Energy (BMWi) for EUR490000 and runs from August 2015 to July 2017. Developers collate and harmonize data from a range of government, regulatory, and industry sources throughout Europe. The website and the metadata utilize English, whereas the original material can be in any one of 24 languages. The datasets are hosted on GitHub. The website was launched on 28 October 2016. As of September 2016, the project offers, for Germany and other European countries:
- details, including geolocation, of conventional power plants and renewable energy power plants
- aggregated generation capacity by technology and country
- hourly time series covering electrical load, day-ahead electricity spot prices, and wind and solar resources
- a script to filter and download NASA MERRA-2 satellite weather data
To facilitate analysis, the data is aggregated into large structured files (in CSV format) and loaded into data packages with standardized machine-readable metadata (in JSON format). The same data is usually also provided as XLSX (Excel) and SQLite files. The datasets can be accessed in real-time using stable URLs. The Python scripts deployed for data processing are also available and carry an MIT license. The licensing conditions for the data itself depends on the source and varies in terms of openness. Previous versions of the datasets and scripts can be recovered in order to track changes or replicate earlier studies. The project also engages with energy data providers, such as transmission system operators (TSO), to encourage them to make their data available under open licenses (for instance, Creative Commons and ODbL licenses).
A number of electricity market modeling analyses are based on OPSD data. In 2017, Open Power System Data won Schleswig-Holstein's "Open Science Award" and the "Germany - Land of Ideas" award.
OpenEI
Open Energy Information (OpenEI) is a collaborative website, run by the US government, providing open energy data to software developers, analysts, users, consumers, and policymakers. The platform is sponsored by the United States Department of Energy (DOE) and is being developed by the National Renewable Energy Laboratory (NREL). OpenEI launched on 9 December 2009. While much of its data is from US government sources, the platform is intended to be open and global in scope.
OpenEI provides two mechanisms for contributing structured information: a semantic wiki (using MediaWiki and the Semantic MediaWiki extension) for collaboratively-managed resources and a dataset upload facility for contributor-controlled resources. US government data is distributed under a CC0 public domain dedication, whereas other contributors are free to select an open data license of their choice. Users can rate data using a five-star system, based on accessibility, adaptability, usefulness, and general quality. Individual datasets can be manually downloaded in an appropriate format, often as CSV files. Scripts for processing data can also be shared through the site. In order to build a community around the platform, a number of forums are offered covering energy system data and related topics.
Most of the data on OpenEI is exposed as linked open data (LOD) (described elsewhere on this page). OpenEI also uses LOD methods to populate its definitions throughout the wiki with real-time connections to DBPedia, reegle, and Wikipedia.
OpenEI has been used to classify geothermal resources in the United States. And to publicize municipal utility rates, again within the US.
OpenGridMap
OpenGridMap employs crowdsourcing techniques to gather detailed data on electricity network components and then infer a realistic network structure using methods from statistics and graph theory. The scope of the project is worldwide and both distribution and transmission networks can be reverse engineered. The project is managed by the Chair of Business Information Systems, TUM Department of Informatics, Technical University of Munich, Munich, Germany. The project maintains a website and a Facebook page and provides an Android mobile app to help the public document electrical devices, such as transformers and substations. The bulk of the data is being made available under a Creative Commons CC BY 3.0 IGO license. The processing software is written primarily in Python and MATLAB and is hosted on GitHub.
OpenGridMap provides a tailored GIS web application, layered on OpenStreetMap, which contributors can use to upload and edit information directly. The same database automatically stores field recordings submitted by the mobile app. Subsequent classification by experts allows normal citizens to document and photograph electrical components and have them correctly identified. The project is experimenting with the use of hobby drones to obtain better information on associated facilities, such as photovoltaic installations. Transmission line data is also sourced from and shared with OpenStreetMap. Each component record is verified by a moderator.
Once sufficient data is available, the transnet software is run to produce a likely network, using statistical correlation, Voronoi partitioning, and minimum spanning tree (MST) algorithms. The resulting network can be exported in CSV (separate files for nodes and lines), XML, and CIM formats. CIM models are well suited for translation into software-specific data formats for further analysis, including power grid simulation. Transnet also displays descriptive statistics about the resulting network for visual confirmation.
The project is motivated by the need to provide datasets for high-resolution energy system models, so that energy system transitions (like the German Energiewende) can be better managed, both technically and policy-wise. The rapid expansion of renewable generation and the anticipated uptake of electric vehicles means that electricity system models must increasingly represent distribution and transmission networks in some detail.
As of 2017, OpenGridMap techniques have been used to estimate the low voltage network in the German city of Garching and to estimate the high voltage grids in several other countries.
reegle
reegle is a clean energy information portal covering renewable energy, energy efficiency, and climate compatible development topics. reegle was launched in 2006 by REEEP and REN21 with funding from the Dutch (VROM), German (BMU), and UK (Defra) environment ministries. Originally released as a specialized internet search engine, reegle was relaunched in 2011 as an information portal.
reegle offers and utilizes linked open data (LOD) (described elsewhere on this page). Sources of data include UN and World Bank databases, as well as dedicated partners around the world. reegle maintains a comprehensive structured glossary (driven by an LOD-compliant thesaurus) of energy and climate compatible development terms to assist with the tagging of datasets. The glossary also facilitates intelligent web searches.
reegle offers country profiles which collate and display energy data on a per-country basis for most of the world. These profiles are kept current automatically using LOD techniques.
Renewables.ninja
Renewables.ninja is a website that can calculate the hourly power output from solar photovoltaic installations and wind farms located anywhere in the world. The website is a joint project between the Department of Environmental Systems Science, ETH Zurich, Zürich, Switzerland and the Centre for Environmental Policy, Imperial College London, London, United Kingdom. The website went live during September 2016. The resulting time series are provided under a Creative Commons CC BY-NC 4.0 license and the underlying power plant models are published using a BSD-new license. As of February 2017, only the solar model, written in Python, has been released.
The project relies on weather data derived from meteorological reanalysis models and weather satellite images. More specifically, it uses the 2016 MERRA-2 reanalysis dataset from NASA and satellite images from CM-SAF SARAH. For locations in Europe, this weather data is further "corrected" by country so that it better fits with the output from known PV installations and windfarms. Two 2016 papers describe the methods used in detail in relation to Europe. The first covers the calculation of PV power. And the second covers the calculation of wind power.
The website displays an interactive world map to aid the selection of a site. Users can then choose a plant type and enter some technical characteristics. As of February 2017, only year 2014 data can be served, due to technical restrictions. The results are automatically plotted and are available for download in hourly CSV format with or without the associated weather information. The site offers an API for programmatic dataset recovery using token-based authorization. Examples deploying cURL and Python are provided.
A number of studies have been undertaking using the power production datasets underpinning the website (these studies predate the launch of the website), with the bulk focusing on energy options for Great Britain.
See also
- Comprehensive Knowledge Archive Network (CKAN) - a web-based open data management system
- Climate change mitigation scenarios
- Crowdsourcing
- Energy modeling - the process of building computer models of energy systems
- Energy system - the interpretation of the energy sector in system terms
- Open Energy Modelling Initiative - a European-based energy modeling community
- Open energy system models - a review of energy system models that are also open source
- Open Knowledge International - a global non-profit network that promotes and shares information
Notes
References
Further information
- Open energy data wiki maintained by the Open Energy Modelling Initiative
External links
- De-risking Energy Efficiency Platform (DEEP) - an open energy efficiency data platform for Europe
- European Climatic Energy Mixes project (ECEM) -- the role that climate change may play on future energy systems
- OpenEnergy Database (oedb) - an open energy system database being developed in Germany
- OpenEnergyMonitor - an open source energy use monitoring project
Source of article : Wikipedia