# Anomaly/outlier detection in energy generation

**15. Anomaly/outlier detection in energy generation**

**15.1 Rationale & Link to BEYOND Apps**

Anomaly (outlier) detection analytics for DER system components enable the detection of system failures and of unusual energy generation patterns caused by system malfunctions. Prompt and effective detection of anomalies in energy generation is imperative for initiating repairs, correcting maintenance plans, and eliminating errors in the DER system.

Anomaly/outlier detection in energy generation will be available in the BEYOND AI Analytics toolkit. The dedicated AI analytic will feed valuable insights into the self-consumption optimization features of the Digital Twin environment (BEPO application) and into the personal energy analytics application (PEASH).

**15.2 Overview of relevant implementations**

Energy generation can be interpreted as time-series data. Its values depend on independent variables such as weather forecasts and/or extracted time features (hour, day of week, day of year, etc.). For anomaly detection in time-series datasets, different types of neural networks [3], [4], [5] are often used besides statistical approaches (ARIMA [1], SARIMA [2]).
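As an illustration of the extracted time features mentioned above, the following minimal sketch derives hour, day of week, and day of year from a timestamp index with pandas. The function name `add_time_features`, the column name `generation_kw`, and the sample values are hypothetical and chosen only for this example.

```python
import pandas as pd


def add_time_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with calendar features taken from its DatetimeIndex."""
    out = df.copy()
    out["hour"] = out.index.hour
    out["day_of_week"] = out.index.dayofweek  # Monday = 0, Sunday = 6
    out["day_of_year"] = out.index.dayofyear
    return out


# Hypothetical hourly generation series covering two days (synthetic values).
index = pd.date_range("2022-06-01", periods=48, freq=pd.Timedelta(hours=1))
generation = pd.DataFrame(
    {"generation_kw": [0.1 * i for i in range(48)]}, index=index
)

features = add_time_features(generation)
print(features.head())
```

Weather variables, where available, would be joined to the same index as additional columns.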

The concept of anomaly detection is similar in all these approaches and involves determining a threshold that will later separate normal from anomalous data. The problem is first treated as a regression problem, with the model trained on normal data only. In a second step, the model is evaluated on a test dataset and on an anomaly dataset. The reconstruction errors of both datasets are plotted in a histogram so that the optimal threshold can be calculated. Finally, the problem becomes a classification problem: if the prediction for a new data point has a reconstruction error less than the threshold, the point is classified as normal; otherwise it is classified as an anomaly.
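The threshold step can be sketched as follows. This is one possible way to pick the threshold, assuming that "optimal" means the value that maximizes balanced accuracy over the two error samples; the source does not specify the criterion, and the function names and error values below are hypothetical.

```python
import numpy as np


def choose_threshold(normal_errors, anomaly_errors):
    """Pick the error threshold that best separates the two error samples.

    Every observed error value is tried as a candidate; the one with the
    highest balanced accuracy (mean of the two per-class hit rates) is kept.
    """
    normal_errors = np.asarray(normal_errors, dtype=float)
    anomaly_errors = np.asarray(anomaly_errors, dtype=float)
    candidates = np.unique(np.concatenate([normal_errors, anomaly_errors]))

    best_threshold, best_score = candidates[0], -1.0
    for t in candidates:
        normal_ok = np.mean(normal_errors < t)     # normal points below threshold
        anomaly_ok = np.mean(anomaly_errors >= t)  # anomalies at or above it
        score = 0.5 * (normal_ok + anomaly_ok)
        if score > best_score:
            best_threshold, best_score = t, score
    return float(best_threshold), float(best_score)


def classify(errors, threshold):
    """Return True for anomalous points (error not less than the threshold)."""
    return np.asarray(errors, dtype=float) >= threshold


# Hypothetical reconstruction errors from a test set and an anomaly set.
threshold, score = choose_threshold([0.1, 0.2, 0.15, 0.3], [0.9, 1.2, 0.8])
flags = classify([0.25, 1.0], threshold)
print(threshold, score, flags)
```

When the two error distributions overlap, no threshold separates them perfectly, and the criterion then determines the trade-off between missed anomalies and false alarms.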

**15.3 Implementation in BEYOND**

In BEYOND, the neural network approach is used. It consists of the following steps:

1. Import the data and fill in any missing values using interpolation.

2. Split the data into a normal dataset and an anomaly dataset.

3. Split the normal dataset into training and test datasets.

4. Train the neural network, using 20% of the training set as a validation set.

5. Configure the number of iterations and the hyperparameters based on the accuracy score of the predicted values for the test dataset.

6. Predict the data for the anomaly dataset.

7. Calculate the test loss and the anomaly loss, show both in the same histogram, and set the threshold value.
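The steps above can be sketched end to end as shown below. This is a minimal illustration under several assumptions, not the BEYOND implementation: the data are synthetic, scikit-learn's `MLPRegressor` stands in for a Keras network to keep the example short, the simulated fault (output reduced to 30%) is invented, and the threshold is simply taken as the 99th percentile of the test loss.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic hourly PV-like data (hypothetical): irradiance drives generation.
n = 24 * 60
hour = np.arange(n) % 24
irradiance = np.clip(np.sin((hour - 6) / 12 * np.pi), 0, None)  # zero at night
generation = 5.0 * irradiance + rng.normal(0, 0.1, n)
df = pd.DataFrame({"hour": hour, "irradiance": irradiance, "generation": generation})

# Step 1: fill in missing values by interpolation.
df.loc[[100, 101, 500], "generation"] = np.nan
df["generation"] = df["generation"].interpolate(method="linear")

# Step 2: split into normal and anomaly datasets (last 10 days get a simulated fault).
normal = df.iloc[: 24 * 50].copy()
anomaly = df.iloc[24 * 50:].copy()
anomaly["generation"] *= 0.3  # assumed fault: output drops to 30%

features = ["hour", "irradiance"]

# Step 3: split the normal dataset into training and test datasets.
X_train, X_test, y_train, y_test = train_test_split(
    normal[features].to_numpy(),
    normal["generation"].to_numpy(),
    test_size=0.2,
    random_state=0,
)

# Steps 4-5: train the network; 20% of the training set is held out for validation.
model = make_pipeline(
    StandardScaler(),
    MLPRegressor(
        hidden_layer_sizes=(32, 32),
        learning_rate_init=0.01,
        early_stopping=True,
        validation_fraction=0.2,
        max_iter=1000,
        random_state=0,
    ),
)
model.fit(X_train, y_train)

# Step 6: predict for the anomaly dataset (and for the test dataset).
test_loss = np.abs(model.predict(X_test) - y_test)
anomaly_loss = np.abs(
    model.predict(anomaly[features].to_numpy()) - anomaly["generation"].to_numpy()
)

# Step 7: set the threshold (assumption: 99th percentile of the test loss).
threshold = np.percentile(test_loss, 99)
print(f"threshold={threshold:.3f}")
print(f"flagged in test set:    {(test_loss > threshold).mean():.1%}")
print(f"flagged in anomaly set: {(anomaly_loss > threshold).mean():.1%}")
```

Plotting `test_loss` and `anomaly_loss` with `matplotlib.pyplot.hist` on shared bins gives the histogram from step 7. In this synthetic setup the night-time hours of the anomaly set are indistinguishable from normal data, so only part of that set can be flagged.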

An accuracy score between 70% and 90% is considered good. For regression problems, the accuracy score is usually calculated from the mean squared error or the mean absolute error. Any future data point whose prediction has a reconstruction error greater than the fixed threshold should be considered a possible anomaly.
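For reference, the two error measures named above have the standard definitions below. The symbols are notation introduced here for illustration: $y_i$ is the measured generation, $\hat{y}_i$ the model prediction, and $n$ the number of samples.

$$
\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2,
\qquad
\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|
$$

A per-sample error such as $e_i = \left|y_i - \hat{y}_i\right|$ is one quantity that can be compared against the threshold for an individual data point.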

**15.3.1. Data inputs and Analytics Pipeline (incl. assumptions/limitations)**

Given the expected forecast horizon, numerical weather prediction data are required as an input and must be available sufficiently ahead of the time at which the forecasted production is realized. These input data are used for training and validating the model. In this development phase, we rely on the meteorological reanalysis data provided by the MERRA-2 system [9], available through the Renewables.ninja website [10], coupled with publicly available data from the ECMWF [11]. Anomalies can be added manually.
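Manual addition of anomalies could look like the sketch below, which scales the generation inside chosen time windows and records which points were altered. The helper `inject_anomalies`, its parameters, and the sample series are hypothetical; the source does not describe how anomalies are actually injected.

```python
import pandas as pd


def inject_anomalies(generation: pd.Series, windows, factor: float = 0.0):
    """Scale generation inside each (start, end) window and label the points.

    `windows` is a list of (start, end) timestamps, both ends inclusive.
    Returns the modified series and a boolean series marking injected points.
    """
    modified = generation.copy()
    labels = pd.Series(False, index=generation.index)
    for start, end in windows:
        modified.loc[start:end] = modified.loc[start:end] * factor
        labels.loc[start:end] = True
    return modified, labels


# Hypothetical clean series: constant 2 kW over two days, hourly resolution.
index = pd.date_range("2022-06-01", periods=48, freq=pd.Timedelta(hours=1))
clean = pd.Series(2.0, index=index, name="generation_kw")

# Simulate a complete outage between 10:00 and 14:00 on the first day.
outage = [(pd.Timestamp("2022-06-01 10:00"), pd.Timestamp("2022-06-01 14:00"))]
faulty, labels = inject_anomalies(clean, outage, factor=0.0)
print(faulty[labels])
```

Keeping the label series alongside the modified data makes it possible to separate the normal and anomaly datasets later without guessing which points were altered.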

**15.3.2. Analytics Libraries Employed**

The key Python libraries used for data manipulation and data analytics are the following:

· Pandas for time series management

· NumPy for numerical manipulation

· scikit-learn (sklearn) and Keras for the implementation of the machine learning algorithm

· Matplotlib for visualization of the training results

**References**

[1] V. Kozitsin, I. Katser, and D. Lakontsev, “Online Forecasting and Anomaly Detection Based on the ARIMA Model,” Appl. Sci., vol. 11, no. 7, Art. no. 7, Jan. 2021, doi: 10.3390/app11073194.

[2] F. Örneholm, “Anomaly Detection in Seasonal ARIMA Models,” Department of Mathematics Uppsala University, U.U.D.M. Project Report 2019:28, 2019. [Online]. Available: https://uu.diva-portal.org/smash/get/diva2:1333467/FULLTEXT01.pdf

[3] H. Pan, Z. Yin, and X. Jiang, “High-Dimensional Energy Consumption Anomaly Detection: A Deep Learning-Based Method for Detecting Anomalies,” Energies, vol. 15, no. 17, Art. no. 17, Jan. 2022, doi: 10.3390/en15176139.

[4] T. Wen and R. Keyes, “Time Series Anomaly Detection Using Convolutional Neural Networks and Transfer Learning.” arXiv, May 31, 2019. doi: 10.48550/arXiv.1905.13628.

[5] A. Santolamazza, V. Cesarotti, and V. Introna, “Anomaly detection in energy consumption for Condition-Based maintenance of Compressed Air Generation systems: an approach based on artificial neural networks,” IFAC-Pap., vol. 51, no. 11, pp. 1131–1136, Jan. 2018, doi: 10.1016/j.ifacol.2018.08.439.
