On 18 September, Dentons hosted an Energy Institute event in our London office with the title “The Clash of Digitalisations”. Speakers from Upside Energy, Powervault and Mixergy spoke about the Pete Project, an initiative funded by Innovate UK that is exploring the potential of domestic hot water tanks and batteries to provide flexibility services to National Grid. Fascinating as the technological and energy-regulatory aspects of this kind of household demand-side response aggregation service are, a key common theme of the evening was the central role played in such services by the analysis of large amounts of “personal data”, and whether recent changes in privacy legislation help or hinder their development. We produced this short article to put that discussion in context.
The General Data Protection Regulation (GDPR) took effect across the European Union (EU) on 25 May 2018 and overhauls the way that companies collect and use personal data. GDPR puts the onus on companies to ensure that they have a lawful basis to collect and process personal data. It also requires mechanisms that allow data subjects to exercise the new rights available to them under GDPR.
Breach reporting requirements have been strengthened, with a requirement to report most breaches to the relevant supervisory authority within 72 hours. Supervisory authorities have increased enforcement powers, including the ability to impose fines of up to EUR 20 million or 4% of total worldwide annual turnover, whichever is the greater.
Compliance with the requirements of GDPR presents a particular challenge within the energy sector. One high-profile example concerns the use of smart meters and smart grids. Smart grids, when combined with smart metering systems, automatically monitor energy usage, adjust to changes in energy supply and provide real-time information on consumer energy consumption. The EU aims to have 80% of electricity meters converted to smart meters by 2020. As such, the volume of personal data collected in the energy sector is set to increase.
What is Big Data?
Big data has been defined in various ways, including by reference to the “three Vs”: volume (the size of the dataset), velocity (the real-time nature of the data) and variety (the different sources of the data).
However, this definition does not accurately describe all big data. An alternative is to define big data as an extremely large data set that cannot be analysed using traditional methods. Instead such big data is analysed using alternative methods (such as machine learning) in order to reveal trends, patterns, interactions and other information that can be used to inform decision-making and business strategy.
The key to big data is the analysis and resulting output. Big data analytics can be achieved using machine learning, where computers are taught to “think” by creating mathematical algorithms based on accumulated data. Machine learning falls broadly into two categories, supervised and unsupervised. Supervised learning involves a training phase in which algorithms are developed by mapping specific datasets to pre-determined outputs. Alternatively, machine learning can be unsupervised, where the machine creates algorithms to find patterns within the input data without being told what to look for.
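For readers who want to see the distinction in practice, the following is a minimal sketch in Python using scikit-learn. The half-hourly consumption data, feature choices and models are invented for illustration and are not drawn from any particular smart-metering system.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)

# Synthetic features per reading: outdoor temperature (C), hour of day
X = np.column_stack([
    rng.normal(12, 5, 500),
    rng.integers(0, 24, 500),
])
# Synthetic target: household consumption in kWh, loosely temperature-driven
y = 2.0 - 0.05 * X[:, 0] + 0.02 * X[:, 1] + rng.normal(0, 0.1, 500)

# Supervised: a training phase maps known inputs to pre-determined outputs
model = LinearRegression().fit(X, y)
print("Predicted kWh at 5C, 18:00:", model.predict([[5, 18]])[0])

# Unsupervised: the machine finds structure (here, usage clusters) without
# being told what to look for
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
print("Cluster sizes:", np.bincount(labels))
```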
Big data has come under particular scrutiny following the Facebook/Cambridge Analytica story and the resulting public concern about mass data capture and exploitation.
Below, we consider seven key issues surrounding big data from a data protection perspective within the energy sector.
Key issues
1. Fairness and transparency
One of the principles of GDPR is that personal data must be processed in a fair and transparent manner.
In practice this means that companies processing personal data must provide a privacy notice to individuals that sets out how and why personal data is being processed. This raises a practical issue in connection with big data analytics, because the purposes of processing are not always known at the outset.
In addition, machine learning often operates as a “black box”. This means that the algorithm's internal workings are unknown to the data controller and cannot be interrogated to determine how an output was selected or a decision made. As a result, the privacy notice may well not be GDPR compliant.
2. Lawful basis for processing
The processing of personal data must have a lawful basis at the outset. There are a number of legal bases available (listed in Articles 6 and 9 GDPR).
Consent is unlikely to be an option where big data analytics are involved. To be valid under GDPR, consent must be freely given, specific, informed and unambiguous. Big data analysis is typically conducted to discover trends that are not known in advance (if they were, the analysis would be unnecessary), so its specific purposes cannot be described up front. Moreover, machine learning algorithms are often impossible for humans to understand, as they cannot be translated into an intelligible form without losing their meaning. If the information about how personal data is processed cannot be understood, it cannot be translated into a meaningful consent.
In addition, under GDPR, data subjects have the right to withdraw consent and have a company cease processing their personal data. This would be difficult, if not impossible, in a big data context if the machine-learning algorithm is opaque and there is no ability to segregate personal data relating to a specific individual. As such, consent is highly unlikely to be a viable lawful basis for processing big data.
A potential alternative would be reliance on “legitimate interests”. This is available where processing of personal data is necessary in pursuit of the legitimate interests of the company determining how and why the personal data is held and processed. The legitimate interests of the company need to be balanced against the interests, rights and freedoms of the individual (with particular care taken where data relates to children). A legitimate interests assessment should be conducted, and documented, to determine whether legitimate interests can be relied upon.
An issue with legitimate interests as a basis for processing big data is that processing must be “necessary” for the purpose pursued by the company. In some instances big data analytics are pursued because the output may reveal a new correlation of interest. However, processing data because it may be “interesting” is unlikely to be sufficient to qualify as a legitimate interest that needs to be pursued by the controller.
3. Purpose limitation
GDPR requires that personal data be collected for specified, explicit and legitimate purposes and not further processed in an incompatible manner.
Big data analytics by their very nature often result in data being processed for novel purposes. These may be incompatible with the original purpose for which the data was collected. The issue then arises as to how and when privacy notices should be refreshed and brought to the attention of individuals.
Where material changes are made to a privacy notice, or to the reasons and methods by which personal data are processed, these need to be actively brought to the attention of the data subject in advance of the processing. If the novel purpose or outcome is not known before the personal data is analysed, there is no practical way to refresh the privacy notice or bring it to the individual's attention.
In addition, the personal data may have been obtained in bulk from a third party. This poses an additional challenge, as it may be difficult or impossible to contact the individuals to whom the personal data relates.
4. Data minimisation
Big data analytics involves the collection and use of extremely large quantities of information. This is potentially problematic from a data minimisation perspective, because GDPR requires that personal data held and processed be limited to what is necessary for the purposes for which it was collected.
However, there are solutions to this issue. Personal data can be anonymised so that individuals are no longer identifiable from the information. A benefit of big data analytics is that it often depends not on the identification of specific individuals but on overall trends within the data population. Once personal data is anonymised, it is no longer “personal data” for the purposes of GDPR and can be used and analysed as needed, without the requirement for further refreshed privacy notices or legitimate interests assessments in relation to such processing. However, data subjects should be told how their data may be used, including that it may be anonymised, and the purposes of any subsequent usage.
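As an illustration of what anonymisation for trend analysis might look like, the sketch below (in Python, using pandas) strips direct identifiers, coarsens the postcode and aggregates readings. All column names, values and the suppression threshold are invented for illustration; whether the output is truly anonymous depends on the re-identification risk in the particular context, which is why small groups are suppressed.

```python
import pandas as pd

readings = pd.DataFrame({
    "customer_id": ["A1", "A1", "B2", "B2", "C3", "C3"],
    "postcode":    ["SW1A 1AA", "SW1A 1AA", "SW1A 2BB",
                    "SW1A 2BB", "SW1A 1AA", "SW1A 1AA"],
    "hour":        [18, 19, 18, 19, 18, 19],
    "kwh":         [1.2, 1.5, 0.8, 0.9, 1.1, 1.4],
})

# Truncate the postcode to a coarse area, drop direct identifiers and
# aggregate: the output describes the population, not any individual
anonymised = (
    readings
    .assign(area=readings["postcode"].str.split().str[0])
    .groupby(["area", "hour"], as_index=False)
    .agg(mean_kwh=("kwh", "mean"), households=("customer_id", "nunique"))
)

# Suppress groups too small to be safely anonymous (threshold is illustrative)
anonymised = anonymised[anonymised["households"] >= 3]
print(anonymised)
```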
5. Individual rights
There are practical issues around how data subjects can exercise their rights under GDPR in relation to big data. Data subjects have various rights under GDPR including the right to request confirmation that their personal data is being processed, access copies of personal data held, to correct inaccuracies, the “right to be forgotten”, to restrict processing, to have personal data “ported” to another entity and the right to object to processing.
The exercise of many of these rights requires business systems and processes that enable the identification and segregation of personal data relating to a specific individual. If personal data is being processed within an opaque algorithm then segregation of that personal data (e.g. to erase it) will be difficult.
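To make the segregation point concrete, the sketch below (Python with pandas; the data model and helper function are hypothetical) shows that an erasure request can be honoured only because each record carries a subject identifier.

```python
import pandas as pd

def erase_data_subject(df: pd.DataFrame, subject_id: str) -> pd.DataFrame:
    """Remove every record relating to one data subject.

    This works only because each row carries a subject identifier; personal
    data folded into an opaque trained model cannot be segregated this way.
    """
    return df[df["customer_id"] != subject_id].copy()

records = pd.DataFrame({
    "customer_id": ["A1", "B2", "A1", "C3"],
    "kwh":         [1.2, 0.8, 1.5, 1.1],
})
records = erase_data_subject(records, "A1")
print(records)  # only B2 and C3 remain
```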
Given the quantities of personal data held in a big data context, any exercise of individual privacy rights is likely to be time-consuming and a potentially costly administrative burden.
There are also specific rules on automated decisions concerning an individual that have a legal effect (for example, a mortgage rejection or acceptance) or a similarly significant effect. In practice this involves explicitly referencing the automated decision-making within a privacy or other notice and obtaining the explicit consent of the data subject (unless the decision-making is necessary for the performance of a contract or otherwise authorised by EU or Member State law). As discussed above, consent is a tricky concept in connection with big data analytics, and obtaining a meaningful consent to the proposed automated decision-making would be difficult.
Depending on the nature of the automated decision-making, it may be arguable that the decision has no legal or similarly significant effect on the data subject. This would need to be considered carefully in each case, in light of the decision-making involved and its effect on the individual.
6. Accuracy
GDPR requires that personal data held be accurate and that every reasonable step be taken to ensure its accuracy (with inaccurate personal data erased or rectified).
Whilst a degree of inaccuracy may have minimal impact where large data sets are analysed to reveal general trends, it can have a significant impact where the processing is used to analyse, or make decisions about, a specific individual.
An additional issue is that correlations drawn from large data sets may be misleading even where the underlying data is accurate. This is a particular problem where the input data is not representative of the population as a whole.
The machine-learning algorithm itself may also embed hidden biases that lead to inaccurate predictions; ethics committee input and user testing can help to mitigate this risk.
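A simple numerical sketch of the unrepresentative-data point, with invented figures: every individual record below is accurate, yet the population-level conclusion drawn from a skewed sample is badly wrong.

```python
import numpy as np

rng = np.random.default_rng(0)

# True population: 70% of households average 8 kWh/day, 30% average 20 kWh/day
low = rng.normal(8, 1, 7000)
high = rng.normal(20, 2, 3000)
population = np.concatenate([low, high])
print(f"True population mean: {population.mean():.1f} kWh/day")    # ~11.6

# Biased sample, skewed towards high-usage (e.g. early adopter) households
biased_sample = np.concatenate([low[:500], high[:2500]])
print(f"Biased sample mean:   {biased_sample.mean():.1f} kWh/day")  # ~18.0
```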
Although there is no quick fix for inaccuracies in data sets, the above highlights the importance of ensuring that personal data and other information are both accurate and representative of the population sampled, so that the outputs and conclusions drawn from big data analytics can be relied upon.
7. Security
The risk of hacking and data breaches is inherent to any business that processes personal data, and it only increases with the quantity of data held. A high-profile organisation holding large amounts of personal data is both a bigger target for hackers and at greater risk that human error within the business will result in the inadvertent loss of personal data.
It is therefore essential that companies within the energy sector review their security measures and procedures, to minimise both the likelihood of hackers breaching their systems and the impact of any data breach that does occur. This will inevitably involve a combination of upgrades to security systems and regular training, so that staff know how to hold and transmit personal data and what to do in the event of a breach.
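One technical measure that can reduce the impact of a breach of an analytics store is keyed pseudonymisation of customer identifiers, sketched below in Python (the key handling via an environment variable is a simplifying assumption; a real deployment would use a managed secret store). Note that pseudonymised data remains personal data under GDPR, because it can be re-linked to individuals using the key, so this reduces risk rather than removing compliance obligations.

```python
import hashlib
import hmac
import os

# In a real deployment the key would come from a managed secret store,
# never a hard-coded fallback like this one
PSEUDONYM_KEY = os.environ.get("PSEUDONYM_KEY", "demo-key-only").encode()

def pseudonymise(customer_id: str) -> str:
    """Derive a stable pseudonym; without the key, reversal is impractical."""
    return hmac.new(PSEUDONYM_KEY, customer_id.encode(),
                    hashlib.sha256).hexdigest()[:16]

# The same input and key always yield the same pseudonym, so analytics can
# still join records without storing the real identifier
print(pseudonymise("A1"))
```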
Conclusion
The energy sector faces significant challenges if it is to utilise and benefit from the large data sets available to it while complying with GDPR and protecting the rights of individuals.
However, despite those challenges, the benefits of big data analytics for both companies and individuals in the energy sector mean that solutions to these issues must be found: to facilitate the growth of domestic demand-side response services, to manage energy consumption more efficiently and respond to changes in local usage, and to give individuals greater visibility and control over their own energy consumption. A balance needs to be struck between the needs of the sector and the privacy of individuals, and a proper GDPR analysis can help you achieve that.