- Carlos de la b

Predicting Solar Radiation

Analyses of incident solar radiation, building energy consumption, and other environmental factors often rely on expensive, computationally intensive simulations. The forecasts are generated by running thermodynamic or fluid simulations on digital 3D models that represent a building's physical properties. This process can be slow because of the complex calculations required, and the 3D model must be geometrically well defined for the results to be accurate. As a result, energy consumption analysis tends to be expensive and complex, particularly during the early design stages.

Machine Learning (ML) presents a solution by simplifying the model and learning to approximate simulation results. By doing so, it can reduce the need for extensive energy simulations in the early design phases, allowing designers to explore optimal design solutions more efficiently. Moreover, the knowledge gained from one model can be transferred to others to produce accurate predictions.

In this article, we will present our initial findings on using a simple Machine Learning model to predict cumulative solar radiation. The model was trained on synthetic data generated through various simulations using Grasshopper, Ladybug, and the Laga library. A detailed article about how this data was generated can be found here: Synthetic Data with Rhino-Grasshopper, Ladybug, and Laga Library.

Training and testing models

A learning curve in Machine Learning is a graph that compares a model's performance on the training set with its performance on the testing set as the number of training instances grows. Ideally, as the amount of training data increases, the model's performance on unseen data should improve. However, this is not always the case.

By splitting the data into training and testing sets, and graphing performance on each, we can understand how well the model generalizes to unseen data. Learning curves help us determine when the model has learned as much as it can about the data, and whether the model is suffering from overfitting (variance) or underfitting (bias).
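The train/test comparison described above can be sketched as follows. This is a minimal illustration, not the article's actual pipeline: it assumes scikit-learn's DecisionTreeRegressor and the R² score (the article does not name the library or the metric), and it uses made-up random data in place of the Grasshopper/Ladybug dataset.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Hypothetical stand-in data: three made-up features and a noisy target.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(500, 3))
y = 3.0 * X[:, 0] + np.sin(6.0 * X[:, 1]) + rng.normal(0.0, 0.1, size=500)

# Hold out 20% of the samples as unseen test data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)


def train_test_scores(max_depth):
    """Fit a tree of the given depth and return (train R^2, test R^2)."""
    model = DecisionTreeRegressor(max_depth=max_depth, random_state=0)
    model.fit(X_train, y_train)
    return model.score(X_train, y_train), model.score(X_test, y_test)


for depth in (1, 10):
    train_r2, test_r2 = train_test_scores(depth)
    print(f"max_depth={depth}: train R^2={train_r2:.2f}, test R^2={test_r2:.2f}")
```

A low score on both sets hints at underfitting, while a high training score paired with a clearly lower testing score hints at overfitting.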

Learning Curves:

*max_depth = 1: As more training points are added, the training curve decreases while the testing curve increases, and both stabilize around 1,000 training points. Beyond this point, adding more data does not improve performance: the curves converge to a low score, which points to underfitting (high bias) rather than a lack of data.

*max_depth = 3: The training curve gently decreases and stays close to 0.8, while the testing curve starts above 0.7 after 100 points. After that, it flattens, with both curves converging around 0.8, showing diminishing returns from adding more data.

*max_depth = 6: The score remains above 0.8, but the curves don’t converge as neatly, suggesting that more training data may still be needed.

*max_depth = 10: The training curve is higher than the testing curve, but they tend to converge, indicating that more training points could further improve performance and generalization.

A persistent gap between the training and testing curves suggests that the training dataset may be too small for the model to generalize well to the test set. The goal is to minimize this gap while keeping both scores high.
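Curves like the ones described above could be computed with a sketch along these lines. It assumes scikit-learn's `learning_curve` helper, a DecisionTreeRegressor, 5-fold cross-validation, and R² scoring. None of these choices is confirmed by the article, and the data is a made-up placeholder.

```python
import numpy as np
from sklearn.model_selection import learning_curve
from sklearn.tree import DecisionTreeRegressor

# Hypothetical stand-in data (not the article's dataset).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(600, 3))
y = 3.0 * X[:, 0] + np.sin(6.0 * X[:, 1]) + rng.normal(0.0, 0.1, size=600)


def learning_curve_means(max_depth):
    """Return training sizes and mean train/test R^2 for one tree depth."""
    sizes, train_scores, test_scores = learning_curve(
        DecisionTreeRegressor(max_depth=max_depth, random_state=0),
        X,
        y,
        train_sizes=np.linspace(0.1, 1.0, 5),  # 10% to 100% of each training fold
        cv=5,
        scoring="r2",
    )
    # Average the scores over the cross-validation folds.
    return sizes, train_scores.mean(axis=1), test_scores.mean(axis=1)


for depth in (1, 3, 6, 10):
    sizes, train_mean, test_mean = learning_curve_means(depth)
    gap = train_mean[-1] - test_mean[-1]
    print(f"max_depth={depth}: final test R^2={test_mean[-1]:.2f}, gap={gap:.2f}")
```

Plotting `train_mean` and `test_mean` against `sizes` for each depth gives one learning-curve panel per depth. The printed gap is the train/test difference at the largest training size.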

Complexity curves

Complexity curve

This graph is similar to the learning curves, but its X-axis shows the maximum depth of the decision tree (Maximum Depth) rather than the number of training points. The key here is to identify the depth at which the model moves from underfitting (high bias) to overfitting (high variance). For our final model, a depth of 12 was chosen, although some variance remains.
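A complexity curve of this kind could be produced with a sketch like the one below. It assumes scikit-learn's `validation_curve` helper with 5-fold cross-validation and R² scoring, applied to made-up data. The depth range 1 to 12 is chosen only to include the depth mentioned above, and the "best" depth it prints applies to the placeholder data, not to the article's model.

```python
import numpy as np
from sklearn.model_selection import validation_curve
from sklearn.tree import DecisionTreeRegressor

# Hypothetical stand-in data (not the article's dataset).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(600, 3))
y = 3.0 * X[:, 0] + np.sin(6.0 * X[:, 1]) + rng.normal(0.0, 0.1, size=600)

depths = list(range(1, 13))  # candidate max_depth values, 1 through 12

train_scores, test_scores = validation_curve(
    DecisionTreeRegressor(random_state=0),
    X,
    y,
    param_name="max_depth",
    param_range=depths,
    cv=5,
    scoring="r2",
)

# Average over folds: one train score and one test score per depth.
train_mean = train_scores.mean(axis=1)
test_mean = test_scores.mean(axis=1)

# The depth with the highest cross-validated test score on this toy data.
best_depth = depths[int(np.argmax(test_mean))]

for depth, tr, te in zip(depths, train_mean, test_mean):
    print(f"max_depth={depth:2d}: train R^2={tr:.2f}, test R^2={te:.2f}")
print("best depth on this toy data:", best_depth)
```

Where the training score keeps rising while the testing score flattens or drops, the extra depth is mostly adding variance.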

Conclusions

Results

On the left, we show the parameters with the highest correlation to the target variable, total solar radiation. These three parameters had the most influence on the machine learning model.
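Ranking parameters by their correlation with the target can be sketched as below. The feature names and data are hypothetical placeholders, since the article does not list the actual parameters. The sketch also assumes the Pearson correlation coefficient, which may differ from the measure used for the figure.

```python
import numpy as np

# Hypothetical stand-in features; the names are placeholders, not the article's.
rng = np.random.default_rng(0)
n = 200
features = {
    "feature_a": rng.uniform(0.0, 1.0, n),
    "feature_b": rng.uniform(0.0, 1.0, n),
    "feature_c": rng.uniform(0.0, 1.0, n),
    "feature_d": rng.uniform(0.0, 1.0, n),
}
# Made-up target that depends strongly on feature_a and weakly on feature_b.
target = (
    5.0 * features["feature_a"]
    + 2.0 * features["feature_b"]
    + rng.normal(0.0, 0.1, n)
)


def rank_by_correlation(features, target):
    """Return (name, Pearson r) pairs sorted by absolute correlation, highest first."""
    scores = {
        name: float(np.corrcoef(values, target)[0, 1])
        for name, values in features.items()
    }
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


ranking = rank_by_correlation(features, target)
for name, r in ranking[:3]:
    print(f"{name}: r={r:+.2f}")
```

Correlation only captures linear, one-at-a-time relationships, so it is a rough guide to which inputs matter rather than a full account of what a tree model uses.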

On the right, we present results from 11 independent tests. The diamond represents the actual simulation result, while the blue dot represents the prediction made by the machine learning model.

Of the 11 tests, 10 were within 10% of the simulated value. Although the model is far from perfect, these early results suggest it has potential for making accurate solar radiation predictions.
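The "within 10%" check can be expressed as a relative-error comparison between each prediction and its simulated value. The sketch below uses made-up numbers purely to show the arithmetic. They are not the article's test results, and the function names are illustrative.

```python
def relative_error(simulated, predicted):
    """Absolute prediction error as a fraction of the simulated value."""
    return abs(predicted - simulated) / abs(simulated)


def count_within(simulated_values, predicted_values, tolerance=0.10):
    """Count predictions whose relative error is at most `tolerance`."""
    return sum(
        relative_error(s, p) <= tolerance
        for s, p in zip(simulated_values, predicted_values)
    )


# Made-up placeholder values, not the article's data.
simulated = [1000.0, 850.0, 1200.0, 640.0]
predicted = [1040.0, 790.0, 1390.0, 655.0]

hits = count_within(simulated, predicted)
print(f"{hits} of {len(simulated)} predictions within 10%")
```

In this toy case the third prediction is off by about 16%, so three of the four predictions pass the check.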

By integrating machine learning, we can streamline the simulation process and reduce the costs associated with building energy analysis, enabling more efficient exploration of design solutions.
