Technical Article

Mitigating Smart Meter Security Risk: A Privacy-preserving Approach

March 23, 2023 by Shraddha Tupe

This article provides techniques to safeguard against security threats and maintain the privacy of smart metering systems.

Smart metering systems have transformed energy distribution systems by providing real-time energy consumption data and better energy management. However, the widespread use of smart meters has also introduced significant security risks that can compromise customer privacy and security. This article proposes a privacy-preserving approach to mitigate the security risks associated with smart metering.

 

Utility meters. Image used courtesy of Adobe Stock

 

How Do Smart Meters Work?

Smart meters are IoT devices that measure and transmit data on electricity, water, and gas usage, using sensors to eliminate manual checks. They enable monitoring of resource consumption for individual units, facilities, or equipment and measuring energy production from solar panels. Some can connect to building automation systems, enabling management of heating, cooling, and other utilities based on usage.

To function properly, smart meters need to be connected to a network; they do not always transmit data directly to the cloud. As illustrated in Figure 1, smart meters send data to a local smart meter gateway that collects information from all the meters in the area, forwarding that information to the cloud. Providers and customers can access this data through a platform. Smart meters and gateways have different connectivity requirements that vary depending on the data link, network, and transport layers of the network architecture. Since they are often indoors or underground, they require communication solutions that can penetrate buildings and obstructions.

 

Figure 1. Advanced metering infrastructure. Image used courtesy of EETech

 

Smart meters use network protocols at two stages: meter-to-gateway communication and gateway-to-cloud communication. By relying on a gateway to transmit data to the cloud, smart meters can use simpler communication technologies that do not rely on TCP/IP. These technologies use less power, enabling smart meters to run on batteries.

Smart meters use both wired and wireless protocols for communication. Wired protocols include hardwired connections, Ethernet, power line communication (PLC), and meter bus (M-Bus). Ethernet connections send data to the gateway using TCP/IP or UDP/IP. Power line communication is a simple option for smart meter communication but is not widely used. M-Bus is a European standard already used in many buildings and was developed specifically for smart meters. Wireless protocols include Wireless M-Bus, which is a wireless version of the M-Bus standard, and LoRaWAN. Providers can connect to an existing network or deploy their own.

Smart meters gather information on the amount of electricity consumed in households and communicate this data, often wirelessly, to utility providers. Unfortunately, most of these devices do not encrypt the data they transmit, which leaves it open to interception. The collected fine-grained data on power usage can be used to make inferences about household behavior, which threatens homeowners' privacy.

Engineers are actively seeking ways to improve scalable privacy in smart meters while ensuring their efficiency and cost-effectiveness.

 

Privacy Concerns

The privacy concerns associated with smart energy meters primarily stem from collecting and storing personal energy consumption data. This data can provide detailed insight into a household's daily routine, habits, and lifestyle, which could be used by third parties for various purposes such as targeted marketing or even criminal activities like burglary. Moreover, there are concerns about the unauthorized sharing or selling of this data to third parties, including energy suppliers, marketers, and government agencies, which can raise issues about the misuse of personal data and erosion of privacy rights.

 

Security Concerns

Another concern with smart energy meters, which are internet-connected and can be accessed remotely, is their vulnerability to hacking and cyberattacks. If an attacker gains access to a smart meter, they could tamper with the meter's readings or disrupt the energy flow in the grid. This could result in serious safety risks, as well as financial losses for both utilities and consumers.

There is also a risk that smart meters could be used as a vector for cyber attacks on other connected devices in a household or business. This could include laptops, smartphones, and other IoT devices connected to the same network as the smart meter. A successful cyber attack on a smart meter could give an attacker access to all the devices on the network, potentially compromising sensitive personal or business data.

 

Risk Targets 

The most common advanced metering infrastructure (AMI) targets are illustrated in Figure 2. Attackers may tamper with usage data during transmission or after recording. Data collectors also face threats, as attackers may exploit their remote disconnect functions to create power outages.

 

Figure 2. Smart energy meter infrastructure with threat existence. Image used courtesy of Energies [PDF]

 

Smart meters are typically secure against physical tampering, but semi-physical attacks using specific hardware components are still possible. To prevent such attacks, utility companies can use vendor-specific data encryption together with challenge-response mechanisms that authenticate the laptops and handheld devices connecting to a meter.
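The article does not specify how such a challenge-response exchange is built. The following is a minimal sketch of one common pattern, assuming a pre-shared key and HMAC-SHA256; the key provisioning and the function names are hypothetical and chosen only for illustration.

```python
import hashlib
import hmac
import secrets

# Hypothetical pre-shared key, assumed to be provisioned on both the meter
# and the authorized field tool (laptop or handheld device).
SHARED_KEY = secrets.token_bytes(32)


def issue_challenge() -> bytes:
    """Meter side: generate a fresh random nonce for each authentication attempt."""
    return secrets.token_bytes(16)


def compute_response(key: bytes, challenge: bytes) -> bytes:
    """Field-tool side: prove knowledge of the key without transmitting it."""
    return hmac.new(key, challenge, hashlib.sha256).digest()


def verify_response(key: bytes, challenge: bytes, response: bytes) -> bool:
    """Meter side: recompute the expected response and compare in constant time."""
    expected = hmac.new(key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)


challenge = issue_challenge()
# A device holding the right key is accepted.
good = verify_response(SHARED_KEY, challenge, compute_response(SHARED_KEY, challenge))
# A device holding a different key is rejected.
bad = verify_response(SHARED_KEY, challenge, compute_response(secrets.token_bytes(32), challenge))
print(good, bad)
```

Because the meter issues a new random challenge each time, a response captured from an earlier session does not authenticate a later one.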

 

Data Risks Associated With Smart Meters

Data Breaches

Smart energy meters collect and transmit detailed information about your energy usage, which could be valuable to cybercriminals. If a hacker gains access to your meter, they could steal your personal information, such as your name, address, and energy consumption patterns.

 

Hacking

Hackers could gain access to your smart energy meter and manipulate the data it collects, which could lead to inaccurate readings and higher bills. They could also use the meter as a gateway to your home network, enabling them to gain access to other devices connected to it.

 

Data Gathering and Theft

Smart energy meters use wireless communication to send data, which can be intercepted by unauthorized individuals, leading to the theft of sensitive information and harm to the system. Strong security measures are crucial to protect against data breaches and maintain privacy.

 

Privacy Protection Solutions 

Numerous researchers have expressed their belief that existing privacy protection methods are either too expensive, too weak, or lead to an inefficient exchange of information. To make the data secure, some of the following techniques are used.

 

Machine Learning 

Machine learning can be applied in various ways, such as detecting fraudulent activity, preserving privacy, and identifying cybersecurity vulnerabilities. Machine learning (ML) algorithms can detect unusual patterns in smart meter data, such as significantly higher energy consumption, which may indicate a security breach.

 

Anomaly Detection

Anomaly detection identifies data points that deviate from normal patterns and can detect abnormal energy consumption due to faulty equipment, tampering, or other issues. Numerous algorithms are available for anomaly detection in smart energy systems, and the choice depends on specific requirements, data characteristics, and available resources. Two such algorithms are explained in detail below.

 

Gaussian mixture models

Anomaly detection in smart meters can be based on statistical models such as the Gaussian mixture model (GMM). A GMM is a probabilistic model that represents the data distribution as a mixture of several Gaussian (normal) distributions. In the case of smart meters, a GMM can be used to model the distribution of energy consumption over time, taking into account factors such as time of day, day of the week, and season.

To detect anomalies using a GMM, the following steps can be taken:

1. Model the data: A GMM is fit to the data using an algorithm such as expectation-maximization (EM) to estimate the parameters of the Gaussian distributions that make up the mixture. The number of Gaussians used in the model can be determined using techniques such as the Bayesian information criterion (BIC) or the Akaike information criterion (AIC).

2. Calculate the probability density function (PDF): Once the GMM is fitted, the data PDF can be calculated for each point in time. The PDF represents the likelihood that the energy consumption at a particular time is drawn from the distribution modeled by the GMM.

3. Calculate the anomaly score: The anomaly score for each data point is calculated as the negative logarithm of its PDF value (the negative log-likelihood). A higher anomaly score indicates a lower likelihood that the data point was drawn from the distribution modeled by the GMM and a higher likelihood that it is an anomaly.

The anomaly score can be calculated using the following equation:

\[\text{Anomaly score}(x_{t})=-\log\Big(\sum\limits^{K}_{i=1}w_{i}\,\mathcal{N}(x_{t}|\mu_{i},\Sigma_{i})\Big)\]

Where

\(x_{t}\) is the energy consumption at time t, K is the number of Gaussians in the GMM,

\(w_{i}\) is the weight of the ith Gaussian, \(\mu_{i}\) is the mean of the ith Gaussian,

\(\Sigma_{i}\) is the covariance matrix of the ith Gaussian, and

\(\mathcal{N}(x_{t}|\mu_{i},\Sigma_{i})\) is the probability density function of the ith Gaussian evaluated at \(x_{t}\).

By setting a threshold for the anomaly score, data points with scores above the threshold can be identified as anomalies. The threshold can be determined using techniques such as the receiver operating characteristic (ROC) curve or the precision-recall curve.

Thus anomaly detection using GMM provides a powerful and flexible approach to identifying abnormal energy consumption patterns in smart meter data.
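The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production detector: the readings are synthetic, the model is one-dimensional (so each covariance matrix reduces to a variance), the number of Gaussians is fixed at two instead of being chosen by BIC or AIC, and the threshold is simply the 99th percentile of the training scores rather than one tuned on a ROC or precision-recall curve. The function names are hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional normal distribution N(x | mean, var)."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm(data, k, iterations=100):
    """Fit a 1-D Gaussian mixture with expectation-maximization (EM)."""
    ordered = sorted(data)
    n = len(data)
    # Start the means at evenly spaced quantiles of the data.
    means = [ordered[int((i + 0.5) * n / k)] for i in range(k)]
    centre = sum(data) / n
    variances = [sum((x - centre) ** 2 for x in data) / n] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: responsibility of each Gaussian for each reading.
        resp = []
        for x in data:
            dens = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = max(sum(dens), 1e-300)
            resp.append([d / total for d in dens])
        # M-step: update weights, means, and variances from the responsibilities.
        for i in range(k):
            r_sum = max(sum(r[i] for r in resp), 1e-12)
            weights[i] = r_sum / n
            means[i] = sum(r[i] * x for r, x in zip(resp, data)) / r_sum
            var = sum(r[i] * (x - means[i]) ** 2 for r, x in zip(resp, data)) / r_sum
            variances[i] = max(var, 1e-6)  # floor keeps a component from collapsing
    return weights, means, variances


def anomaly_score(x, weights, means, variances):
    """Negative log of the mixture density at x (higher means more unusual)."""
    density = sum(w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
    return -math.log(max(density, 1e-300))


# Synthetic readings in kWh: a low overnight load and a higher evening load.
rng = random.Random(0)
readings = [rng.gauss(0.3, 0.05) for _ in range(200)] + [rng.gauss(2.0, 0.3) for _ in range(200)]

weights, means, variances = fit_gmm(readings, k=2)
train_scores = sorted(anomaly_score(x, weights, means, variances) for x in readings)
threshold = train_scores[int(0.99 * len(train_scores))]  # assumed 99th-percentile cut-off

for value in (0.3, 2.1, 8.0):
    score = anomaly_score(value, weights, means, variances)
    print(value, round(score, 2), "anomaly" if score > threshold else "normal")
```

In this sketch a reading of 8.0 kWh lies far from both fitted Gaussians, so its score exceeds the threshold, while readings near either typical load level do not.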

 

Robust Principal Component Analysis

The Robust Principal Component Analysis (RPCA) algorithm is a statistical method that can be used for anomaly detection in smart meter data as part of privacy protection. It separates the observed data matrix into two components: a low-rank matrix and a sparse matrix.

The low-rank matrix represents the expected data and normal energy consumption patterns. This matrix can be thought of as a summary of the underlying structure of the data. The sparse matrix represents the anomalies or outliers in the data. This matrix contains energy consumption patterns significantly different from the expected behavior.

The objective of the RPCA algorithm is to decompose the observed data matrix into these two components by solving the following optimization problem:

\[\text{minimize}\quad\|L\|_{*}+\lambda\|S\|_{1}\]

subject to D = L + S

Where

D is the observed data matrix 

L is the low-rank matrix 

S is the sparse matrix 

||.||* is the nuclear norm of a matrix (i.e., the sum of its singular values) 

||.||1 is the L1-norm of a matrix (i.e., the sum of the absolute values of its entries), and 

λ is a parameter that controls the trade-off between the low-rank and sparse components.

The first term in the objective function, ||L||*, represents the nuclear norm of the low-rank matrix, which is used to encourage the matrix to have a small rank. The second term, λ||S||1, represents the L1-norm of the sparse matrix, which is used to encourage the matrix to have a large number of zero entries. The optimization problem is subject to the constraint that the observed data matrix is equal to the sum of the low-rank and sparse matrices.

By solving this optimization problem, the RPCA algorithm separates the observed data matrix into low-rank and sparse matrices, which can be used for anomaly detection. The low-rank matrix represents the expected behavior, while the sparse matrix represents the anomalies or outliers in the data. By identifying these anomalies, privacy breaches can be detected, and appropriate measures can be taken to protect the privacy of smart meter users.

The RPCA algorithm is a powerful technique for anomaly detection in smart meter data for privacy protection. It can be used to identify unusual energy consumption patterns that may indicate a privacy breach, allowing for appropriate action to be taken to protect the privacy of smart meter users.
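A small hand-checkable example may help make the objective concrete. The 2×2 matrix below is invented for easy arithmetic and is not real meter data, and λ = 1/√2 is an assumption (it follows the commonly used default of one over the square root of the larger matrix dimension). Consider the split

\[D=\begin{pmatrix}1&1\\1&5\end{pmatrix},\qquad L=\begin{pmatrix}1&1\\1&1\end{pmatrix},\qquad S=\begin{pmatrix}0&0\\0&4\end{pmatrix},\qquad D=L+S\]

Here L has rank one with a single non-zero singular value of 2, and S has a single non-zero entry, so

\[\|L\|_{*}+\lambda\|S\|_{1}=2+\tfrac{1}{\sqrt{2}}\cdot 4\approx 4.83\]

The two trivial splits cost more. Treating everything as normal behavior (L = D, S = 0) gives

\[\|D\|_{*}=6\]

because D is symmetric positive definite, so its nuclear norm equals its trace. Treating everything as an outlier (L = 0, S = D) gives

\[\lambda\|D\|_{1}=\tfrac{8}{\sqrt{2}}\approx 5.66\]

The objective therefore favors the split that isolates the one unusual reading in S, which is the behavior anomaly detection relies on. This only compares three candidate splits; it does not show that the first one is the exact minimizer.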

Gaussian mixture models (GMMs) and Robust Principal Component Analysis (RPCA) are both statistical methods used for anomaly detection in smart meter data, but they differ in their approach to identifying anomalies.

GMMs model the distribution of energy consumption over time as a mixture of several Gaussian distributions, using techniques such as expectation maximization to estimate the parameters of the Gaussians. The PDF of the data can then be calculated, and the anomaly score for each data point is calculated as the negative log-likelihood of the PDF. By setting a threshold for the anomaly score, data points with scores above the threshold can be identified as anomalies. GMMs are powerful and flexible for identifying abnormal energy consumption patterns in smart meter data.

RPCA, on the other hand, is used for privacy protection and separates the observed data matrix into two components: a low-rank matrix and a sparse matrix. The low-rank matrix represents the expected behavior, while the sparse matrix represents the anomalies or outliers in the data. RPCA solves an optimization problem to decompose the observed data matrix into these two components, using the nuclear norm and L1-norm to encourage the low-rank and sparse matrices to have specific properties. By identifying anomalies in the sparse matrix, privacy breaches can be detected, and appropriate measures can be taken to protect the privacy of smart meter users.

Overall, GMMs are more suited for identifying abnormal energy consumption patterns, while RPCA is more suited for identifying anomalies that may indicate a privacy breach.

 

Differential Privacy

Differential Privacy (DP) is a privacy-preserving technique that can be used to protect sensitive data, such as energy consumption data collected by smart energy meters. 

DP adds random noise to data to protect individual privacy. It can be used in energy consumption data to add noise to individual load profiles and safeguard personal information. This method preserves overall statistical properties while making it challenging to infer individual energy consumption.

The DP algorithm functions as follows. 

In a large dataset containing many users' load profiles, noise is added to each consumer's load profile in proportion to the largest change that the individual's profile could make to the outcome of a given query. To determine this impact, the query is first performed on the dataset that includes the consumer's information and then on the same dataset without the consumer's data. Comparing the two results gives the impact of the consumer's data on the aggregated result, and noise is then added to the consumer's load profile based on that impact.
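The with-and-without comparison described above can be sketched as follows. The load profiles, meter identifiers, and function names are hypothetical. Note that this measures impact on the observed data only; a formal differential privacy analysis uses the worst-case impact over all possible datasets, which in practice is usually bounded by clipping readings to a fixed range.

```python
def total_consumption(profiles):
    """Example query: total energy (kWh) across all consumers in the dataset."""
    return sum(sum(profile) for profile in profiles.values())


def query_impact(profiles, consumer_id, query):
    """Difference between the query result with and without one consumer's data."""
    without = {cid: p for cid, p in profiles.items() if cid != consumer_id}
    return abs(query(profiles) - query(without))


# Hypothetical load profiles (kWh per interval) for three consumers.
profiles = {
    "meter_a": [0.5, 0.5, 1.0, 2.0],
    "meter_b": [0.25, 0.25, 0.5, 1.0],
    "meter_c": [1.0, 1.5, 2.5, 3.0],
}

impacts = {cid: query_impact(profiles, cid, total_consumption) for cid in profiles}
# The largest observed impact is an empirical stand-in for the query's sensitivity.
sensitivity = max(impacts.values())
print(impacts, sensitivity)
```

The consumer with the largest load changes the total the most, so that consumer's impact sets the amount of noise needed to hide any single participant in this toy dataset.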

This ensures that the same query performed on two different datasets, one with and one without a specified user's data, will produce nearly indistinguishable results. Even if an adversary knew that a specific consumer's data was included in a dataset, they could not reliably determine the individual's consumption. DP guarantees privacy in that datasets can only be used to infer information about a group, not one individual.

However, DP does not guarantee complete privacy, as an individual can still suffer a maximum amount of privacy loss. Generalizations of the entire dataset can still be used by an adversary to make statistical inferences about an individual's consumption. Nevertheless, the larger the dataset, the less of an impact any one individual can have on it, making it harder to make statistical inferences about a user. Moreover, larger datasets require less noise to anonymize an individual consumer's load profile.
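A worked example shows why larger datasets need less noise. It rests on assumptions the article does not state: the query is the mean consumption of n consumers, each reading is clipped to the range [0, c], the datasets being compared differ by replacing one consumer's reading, and noise is drawn from a Laplace distribution with scale b for privacy parameter ε. Under those assumptions, the largest change one consumer can make to the mean, Δf, and the resulting noise scale are

\[\Delta f=\frac{c}{n},\qquad b=\frac{\Delta f}{\varepsilon}=\frac{c}{n\varepsilon}\]

With c = 5 kWh and ε = 1, a dataset of n = 100 consumers needs b = 0.05 kWh, while a dataset of n = 10,000 consumers needs only b = 0.0005 kWh for the same privacy parameter.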

The DP algorithm's impact on the system should be evaluated based on its effects at four privacy settings: no privacy, low privacy, medium privacy, and high privacy. On the consumer side, an individual opting into the algorithm would select one of these categories and be charged accordingly. The lower the epsilon value, the stronger the privacy protection, which leads to more complex calculations requiring more computational power and higher pricing to equalize profit margins.

Let M be a randomized algorithm that takes dataset D as input and produces an output M(D). Let D and D' be neighboring datasets (datasets that differ in one individual's data), and let S be any subset of the possible outputs of M. Then, M satisfies ε-differential privacy if

\[\Pr[M(D)\in S]\leq e^{\varepsilon}\,\Pr[M(D^{'})\in S]\]

where ε is a non-negative privacy parameter that controls the strength of privacy protection. A smaller value of ε provides stronger privacy guarantees, but may also reduce the usefulness of the output. The above equation states that the probability of observing an output in S from M on dataset D is no more than e^ε times the probability of observing an output in S on dataset D'.
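One standard way to satisfy this definition for a numeric query is the Laplace mechanism, which adds Laplace noise with scale equal to the query's sensitivity divided by ε. The sketch below assumes a hypothetical total-consumption query with a true answer of 14 kWh and an assumed sensitivity of 8 kWh; the function names are illustrative. It shows the privacy-utility trade-off: the smaller ε is, the larger the average error. Sampling with ordinary floating-point arithmetic is fine for illustration, but deployed systems generally need hardened noise generation.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    """Release a noisy query answer; the noise scale is sensitivity / epsilon."""
    return true_value + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(42)
true_total = 14.0   # hypothetical query answer in kWh
sensitivity = 8.0   # assumed largest impact of one consumer on the query

mean_abs_error = {}
for epsilon in (0.1, 1.0, 10.0):
    errors = [
        abs(laplace_mechanism(true_total, sensitivity, epsilon, rng) - true_total)
        for _ in range(5000)
    ]
    mean_abs_error[epsilon] = sum(errors) / len(errors)
    print(epsilon, round(mean_abs_error[epsilon], 2))
```

The average absolute error tracks the noise scale, so it should come out near 80, 8, and 0.8 kWh for ε of 0.1, 1, and 10 respectively.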

The Laplacian noise added to a load profile makes the profile fluctuate significantly. However, when the load profiles of a group of buses or consumers are combined, the noise is smoothed out without losing any security. In some cases, smoothing functions may also be applied, because the DP guarantee is immune to post-processing: any further computation on the noisy profile alone cannot weaken it. The smoothing functions are applied to a time series to remove the fine-grained variations between time steps.

If a smoothing effect is applied, the noise sum is periodically stored on its respective server, either by a third party or directly by the retailer. As raw smart meter data are sent out at the retailer's preferred tick rate, the noise signal is added to the raw data.
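The smoothing step can be illustrated with a simple moving average, which is only one possible smoothing function. The load profile below is synthetic, and the Laplace scale and window size are assumed values. The smoothing touches only the already-noised series, so it counts as post-processing.

```python
import math
import random


def moving_average(series, window):
    """Centered moving average; the window shrinks at the edges of the series."""
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        chunk = series[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def mean_abs_error(estimate, truth):
    """Average absolute difference between two equally long series."""
    return sum(abs(e - t) for e, t in zip(estimate, truth)) / len(truth)


rng = random.Random(7)
scale = 0.5  # assumed Laplace scale (sensitivity / epsilon), in kWh

# Hypothetical week of half-hourly readings with a smooth daily cycle.
true_profile = [1.0 + 0.5 * math.sin(2 * math.pi * t / 48) for t in range(336)]
# Laplace(0, scale) noise drawn as the difference of two exponential draws.
noisy = [x + rng.expovariate(1 / scale) - rng.expovariate(1 / scale) for x in true_profile]
# Post-processing: smooth the noisy series without touching the raw data.
smoothed = moving_average(noisy, window=5)

print(round(mean_abs_error(noisy, true_profile), 3))
print(round(mean_abs_error(smoothed, true_profile), 3))
```

On a slowly varying profile like this one, the smoothed series sits closer to the true load than the raw noisy series. The trade-off is that a wider window also flattens genuine short peaks in consumption.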

Here are some ways in which DP can improve the security and privacy of smart energy meters:

  • Privacy-preserving data aggregation: Differential Privacy can aggregate data while preserving privacy and providing useful insights. Adding noise to data before aggregation using DP algorithms can make it hard to identify individual households.
  • Privacy-preserving machine learning: Machine learning can detect patterns in smart energy meter data but may reveal sensitive information. DP can safeguard privacy while allowing useful data analysis by adding noise to data before analysis.
  • Protecting against inference attacks: DP can prevent inference attacks that use statistical analysis to identify households from smart energy meter data. Adding noise to data can make it harder to identify individual households, even with other sources of information.
  • Transparency: DP measures and quantifies privacy protection provided by algorithms, enabling energy suppliers and stakeholders to understand privacy-utility trade-offs when analyzing smart energy meter data.

Together, the GMM, RPCA, and Differential Privacy algorithms help secure the data: GMM and RPCA flag anomalous activity, while DP protects individual load profiles. They are recommended because they can work alongside machine learning models, supporting pattern recognition, automation, and a range of applications.

 

Takeaways of Securing Smart Meters

Smart energy meters are a valuable tool for modernizing the power grid, providing benefits such as increased energy efficiency, cost savings, and improved grid stability. However, using these meters also poses a risk to privacy and security. Fine-grained data on power usage collected by smart meters can be used to make inferences on household behavior, posing a threat to homeowners' privacy. Additionally, the vulnerability of smart meters to hacking and cyberattacks poses a risk to utilities and consumers. Therefore, it is essential to implement strong security measures to protect against data breaches, hacking, and data snooping. Researchers are actively exploring techniques such as machine learning and differential privacy to improve scalable privacy in smart meters while ensuring their efficiency and cost-effectiveness. It is crucial to continue to develop and implement innovative solutions to protect the privacy and security of consumers' data while reaping the benefits of smart energy meters.