Managing Energy Data for AI Development
In our earlier blog, Gorilla joined many others in giving an overview of how developments in AI might shape the future, with a particular focus on energy retailers. Now we want to dive into more specifics, looking at how retailers might develop their own AIs instead of using a third party. In which areas are utilities going to see the most return on AI investment? What are the weaknesses of current data portfolios for AI development? What changes are needed? We’ll look at all of these questions and more in this blog.
AI in the energy sector
The adoption of AI across various sectors has become ubiquitous, albeit under different monikers. Simple predictive algorithms and highly sophisticated LLMs like Claude or ChatGPT are both examples of AI adoption in business. The forecasting tools that most retailers use to balance supply and demand are just as much artificial intelligence as anything else.
Jason Li, Global President of Huawei’s Electric Power Digitalisation Business Unit, recently stated the following with regards to AI: “We need to complement the advantages [of different technologies] with digital such as cloud and big data and we need to adopt AI technologies so we can shift to an architecture supported digitalisation that integrates multiple technologies.”
Traditionally, much of the AI-driven innovation in the energy sector has been concentrated upstream, in grid management or generation. For example, smart grids are seeing a big push, while sensor-connected power generation promises great leaps in efficiency. Of course, no discussion of AI in energy would be complete without mentioning AI-powered advancements in weather forecasting. These technologies are pivotal for predicting and managing energy production and distribution effectively, which is crucial for grid stability and operational efficiency. However, the potential for AI to revolutionize the retail side of energy is still significant.
While generative AI garners significant attention in the media, particularly for its applications in creating content and simulating scenarios, its direct applicability to energy retailers is limited. In this sector, generative AI is most likely to enhance customer service and internal productivity, for example by automating customer interactions through chatbots. Useful, but no one would call such an advancement revolutionary. For retailers, predictive AI stands out as the game changer.
Predictive AI excels in analyzing vast amounts of data to forecast future scenarios with remarkable accuracy. For example, a machine learning-based forecasting algorithm can continually train itself on consumption patterns and similar data available to utilities, adapting to changes in demand with minimal human intervention. This capability allows energy retailers to not only predict customer usage more accurately but also manage their purchasing strategies to optimize costs.
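As a toy illustration of that idea (not a production forecasting system), the self-updating behaviour can be sketched in a few lines of Python: an exponentially weighted model that revises its demand estimate with every new meter reading, with no human retraining step. The class name and parameters here are purely illustrative.

```python
class OnlineDemandForecaster:
    """Toy self-updating forecaster: an exponentially weighted
    moving average that revises its estimate with each reading."""

    def __init__(self, alpha=0.3):
        self.alpha = alpha      # learning rate: weight given to the newest reading
        self.estimate = None    # current demand estimate (kWh)

    def update(self, reading_kwh):
        # The first reading initialises the model; later readings nudge it.
        if self.estimate is None:
            self.estimate = reading_kwh
        else:
            self.estimate += self.alpha * (reading_kwh - self.estimate)
        return self.estimate

    def forecast(self):
        return self.estimate


# Feed in hourly consumption; the forecast adapts as demand shifts.
model = OnlineDemandForecaster(alpha=0.3)
for kwh in [10.0, 10.5, 11.0, 18.0, 19.0]:   # demand jumps mid-series
    model.update(kwh)
print(round(model.forecast(), 2))             # → 14.58
```

Real systems would train far richer models on the full feature set, but the adaptive loop, where each new observation updates the model, is the same in spirit.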
One energy figure stated: “In fact, the advances in [predictive AI] have been nothing short of rapid. The challenge, though, has been in supplying the ‘right’ data to make them effective.”
Another potential application is the development of machine learning algorithms for wholesale trading. These AI systems can autonomously make purchasing decisions based on predicted changes in energy prices and demand, potentially leading to significant financial gains and increased efficiency in operations.
We could go on and on, but the key point is that plenty of room remains for drastic advances driven by AI. More importantly, a competitive advantage only develops when a company has something that other companies cannot replicate. Giving ChatGPT to all of your employees might give you a big boost, but every company can do the same. Developing your own predictive algorithms for forecasting or trading will give you a genuine edge over the competition.
The technical landscape required for machine learning
At its core, machine learning (ML) is about two things: compute and data. If you’re unfamiliar, compute refers to the computational resources, the processing power, available to train an algorithm.
The compute required to train a modern predictive AI is immense; this is not something you can build with just a few servers. If retailers tried to build this capability in-house, the cost would soon outweigh any benefits from AI, and the hardware would see little use elsewhere in the business. Fortunately, there is no need to spend millions building it up: the big cloud providers, AWS and similar, have services ready for ML training.
Data is where the energy retailer comes in. Retailer operations produce endless amounts of data that are directly relevant for training. While the total amount of data might still be a barrier - as a comparison, it is estimated that GPT-4 needed most of the text produced in human history - the smaller scope of an energy retailer AI will probably mean there is sufficient data.
It can be difficult for business stakeholders to understand the link between the development of ChatGPT and an AI for energy forecasting; the latter will not look much like the former after all. LLMs have captured the imagination, but as we discussed in the previous blog post, there’s not much opportunity for them inside an energy retailer.
The lesson to take from the rise of LLMs is in the capabilities demonstrated. At its core, an LLM is a prediction engine: it predicts the next token, roughly a word fragment, based on the preceding text. From this seemingly simple process, the AI is able to generate highly complex human language, and even more impressive emergent behaviours such as playing chess.
An AI for energy retail would be very similar - a prediction engine for energy use, weather conditions, wholesale prices, etc. It might seem surprising, but in comparison to the complexity of language, making accurate predictions in the energy environment is much simpler. By applying the techniques developed for LLMs to energy data and taking advantage of the services now available, there is a clear path to effective AI for the energy industry.
The barrier? Making use of that energy data. Even if you have access to enough compute, ML training follows the classic computer science maxim: garbage in, garbage out. Can retailers get the right data in the right place for training? Data is coming from smart meters, demand profiles, IoT devices, smart grids, baseload prices, day-ahead prices, previous forecasts, weather, competitors, and many more sources. All of this data needs to be cleaned, put in the right format, and made available for training rather than siloed away.
For predictive AI to function effectively in the context of energy retail, data selection and preparation are critical steps. Here’s an overview of the types of data needed and how to prepare it for use:
Types of Data Required
- Historical Consumption Data: This includes detailed records of customer energy usage over time, often broken down by hour, day, month, and year. It helps the AI model understand patterns and predict future demands.
- Weather Data: Since weather significantly affects energy consumption, historical and real-time weather data is crucial. This might include temperature, humidity, wind speed, and solar irradiance.
- Pricing Data: Historical pricing information helps the model predict how changes in energy prices might influence future consumption and cost.
- Customer Data: Demographic information about customers, such as location, type of residence or business, and possibly energy efficiency ratings of buildings, which can influence energy usage patterns.
- Grid Data: Information on grid performance, outages, maintenance schedules, and other operational metrics can help predict issues and optimize energy distribution.
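To make the list above concrete, here is a minimal sketch of what bringing these sources together might look like, aligning consumption records with weather and pricing data on shared keys. All field names and values are hypothetical.

```python
# Hypothetical records from three of the sources listed above.
consumption = [
    {"date": "2024-01-01", "region": "BE", "kwh": 120.0},
    {"date": "2024-01-02", "region": "BE", "kwh": 135.5},
]
weather = {("2024-01-01", "BE"): {"temp_c": 3.0},
           ("2024-01-02", "BE"): {"temp_c": -1.5}}
day_ahead_prices = {"2024-01-01": 95.0, "2024-01-02": 110.0}  # EUR/MWh

# Join on (date, region) to build one training row per observation.
training_rows = []
for row in consumption:
    key = (row["date"], row["region"])
    training_rows.append({
        **row,
        "temp_c": weather[key]["temp_c"],
        "price_eur_mwh": day_ahead_prices[row["date"]],
    })

print(training_rows[1])
```

In practice this join happens across dozens of feeds at scale, but the principle is the same: every source must share keys (timestamps, regions, meter IDs) that let the records line up.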
Data Preparation Steps
- Data Cleaning: This involves removing inaccuracies or irrelevant data, such as incorrect meter readings or data from decommissioned meters. It’s crucial to ensure the accuracy of the training data.
- Data Integration: Combining data from various sources into a unified format is essential. For instance, consumption data might need to be aligned with corresponding weather data by date and location.
- Normalization: This process adjusts different scales of data (e.g., temperature in Celsius and energy consumption in kWh) to a common scale, which prevents any one feature from dominating the predictive model.
- Feature Engineering: Creating new variables that might help in making more accurate predictions. For example, transforming raw temperature data into heating and cooling degree days can provide more insights into energy usage patterns.
- Data Partitioning: Dividing data into subsets for training, validation, and testing the model. This ensures that the model can be trained on one set of data, validated on another, and tested on unseen data to gauge its predictive accuracy.
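The preparation steps above can be sketched in plain Python. This is an illustrative toy, not a production pipeline: the thresholds, the 18 °C degree-day base, and the tiny dataset are all stand-in values.

```python
import random

# (date, kwh, temp_c) readings; one is a faulty meter record.
readings = [("2024-01-01", 120.0, 3.0),
            ("2024-01-02", -1.0, 5.0),    # impossible negative reading
            ("2024-01-03", 135.5, -2.0),
            ("2024-01-04", 128.0, 10.0)]

# 1. Cleaning: drop impossible readings (negative consumption).
clean = [r for r in readings if r[1] >= 0]

# 2. Feature engineering: heating degree days against an 18 °C base.
rows = [{"date": d, "kwh": kwh, "hdd": max(0.0, 18.0 - t)}
        for d, kwh, t in clean]

# 3. Normalisation: min-max scale consumption onto [0, 1] so it
#    cannot dominate features measured on smaller scales.
lo = min(r["kwh"] for r in rows)
hi = max(r["kwh"] for r in rows)
for r in rows:
    r["kwh_scaled"] = (r["kwh"] - lo) / (hi - lo)

# 4. Partitioning: shuffle, then split into train/validation/test.
random.seed(42)
random.shuffle(rows)
train, val, test = rows[:1], rows[1:2], rows[2:]
```

At real scale the same steps run over billions of rows with dedicated tooling, but the logic of each stage is exactly what this sketch shows.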
Ensuring Data Quality
- Data Security: Managing data privacy and security, especially personal customer data, according to regulatory requirements.
- Continuous Monitoring: Regularly updating data and monitoring for shifts in data quality or relevance, as outdated or poor-quality data can lead to inaccurate predictions.
Underlying all of this is a software layer to handle the data processing and storage needs. Building this infrastructure is no small task, as we will detail below.
By meticulously preparing the relevant data, energy retailers can at least solve the major technical challenges for creating their own AI. However, they will still need to overcome a variety of non-technical issues.
Non-technical challenges to overcome
Stick a petabyte of data into a powerful computer, and what will happen? Nothing, because someone has to tell a computer what to do first. While advances in AI may one day lead to AIs that develop themselves, we are still quite a way off. Until then, people are going to be the most scarce resource for businesses looking to compete.
Unfortunately, competition will not be limited to the energy sector. Every company in the world, particularly in tech, is frantically hiring every Python programmer and data scientist they can. Upskilling might be the best option for retailers. Joseph Santamaria, Director of WW Energy and Utilities Solution Architecture at AWS, commented: “In 2024 and beyond, as energy demands grow on a global scale, the energy sector will find combining generative AI with proper upskilling to be extremely valuable.”
When it comes to hard skills, the most popular deep learning frameworks are TensorFlow and PyTorch, both of which are primarily Python-based. On the data side, you’ll need data scientists, though the ability to work with data matters more than any specific technical skill. You will also almost certainly require a large group of support roles. It is easy to underestimate the amount of infrastructure required to start training ML models; you could have all the data scientists in the world, but they will be stuck without technical architects to provide the scaffolding.
The solution that can satisfy the demand for infrastructure and skills without hiring 100 new people is to turn to software that has already done a lot of the work for you.
Gorilla for ML infrastructure
The Gorilla platform is one such example. Let’s imagine that you are using AWS SageMaker for your training, with EC2 for compute. What else will you need? An Infrastructure as Code platform will be necessary to underpin everything: you could use AWS’s own CloudFormation or CDK, but if you want the option to use non-AWS services you might go for Ansible or Terraform. Storage is relatively simple with S3, but for data ingestion and extraction there are several choices. AWS Glue will allow you to implement Apache Spark or Python shell jobs, but now you need to start thinking about performance, as well as ease of integration. How well does Glue compare to other managed services or your own implementation? Or perhaps you end up using Kafka or Flink, requiring yet more evaluation of different platforms.
Gorilla eliminates that effort. Instead of dozens of services to source and manage, the Gorilla Platform functions as a single data layer to handle data preparation, integration, storage, and infrastructure.
Final Word
The potential of AI to enhance the business of energy retail is still poorly understood. The frontiers of AI development are only just being explored, and even AI experts have failed to predict many of the recent developments to emerge. Any attempt to push forward what would be a multi-year project to develop an internal machine learning predictive AI would be fraught with risk. At the same time, highly risky multi-year projects are the bread and butter of the energy sector; vertically integrated retailers will have plenty of experience to draw on from their generation sides.
Regardless of how retailers choose to approach AI, the actual investment will not determine the rewards. Whether you pour millions (or even billions) into data centres and AI teams or choose third-party options is meaningless compared to the AI programs and use cases that you actually pursue. It's not who you choose to work with, but what you develop, that matters.
A major advantage for retailers is that many of the investments needed for AI development can benefit other areas of the business. The biggest early investment will be in hiring and upskilling your people; even if they don’t eventually work on AI, you’re sure to benefit from strengthening your teams. Getting your data landscape under control will improve efficiency across the company. Put this way, embarking on this project begins to look promising, particularly when you factor in current enthusiasm for everything AI. The only question remaining is: Can you get the quality needed in both people and data? Gorilla, at least, can help you with the latter.