Abstract:
Efficient building design and accurate computation of heating and cooling loads are required to ensure comfortable indoor air conditions. To estimate the required heating and cooling capacities, architects and building designers need information about the characteristics of the building and the conditioned space. We focus on estimating the energy efficiency of existing buildings by studying the UC Irvine (UCI) energy efficiency dataset. The dataset has two response variables (heating load and cooling load) and eight explanatory variables (relative compactness, surface area, wall area, roof area, overall height, orientation, glazing area, and glazing area distribution). First, we calculated the correlations between the two response variables and the explanatory variables, and studied the correlations among all variables. The correlation matrix shows that the two response variables are very highly correlated (0.976), so a model for one of them largely carries over to the other. We therefore investigated the effect of the eight explanatory variables on heating load only. Graphical and tabular analyses are used to examine the features of the dataset. Linear regression is used to explore the relationship between the response and the explanatory variables. The Box-Cox method is used to find the optimal transformation of the response variable, and Ordinary Least Squares (OLS) and the Akaike Information Criterion (AIC) are used to select the best-fitting model. With heating load as the only response variable, the variable “Overall Height” has a strong positive correlation with heating load (0.889). Hence, we can say that the variable “Overall Height” is a good predictor of the response variable heating load.
Also, the variable “Relative Compactness” has a moderate positive correlation with heating load (0.622), so it is also a useful predictor. The scatterplot confirms a linear relationship between the two response variables. The variable visualizations suggest a possible interaction between glazing area and “Overall Height”. Of all the explanatory variables, “Overall Height” has the highest positive correlation with heating load. Among the regression models compared, the weighted least squares model fits the energy efficiency data best. This model indicates that surface area, wall area, overall height, glazing area, and glazing area distribution are the most important variables for heating load.
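
The workflow summarized above (correlation screening, Box-Cox transformation of the response, and AIC-based model comparison) can be illustrated with a minimal sketch. This is not the analysis code used in the study. It uses only the Python standard library, a single predictor, and small synthetic data invented for illustration. The helper names (`pearson`, `box_cox`, `fit_line`, `profile_loglik`, `aic`) and the coarse lambda grid are assumptions, and the AIC is computed only up to an additive constant.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def box_cox(y, lam):
    """Box-Cox transform of positive values; lam == 0 is the log transform."""
    if lam == 0:
        return [math.log(v) for v in y]
    return [(v ** lam - 1.0) / lam for v in y]

def fit_line(x, y):
    """Ordinary least squares fit of y = a + b*x; returns (a, b, rss)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return a, b, rss

def profile_loglik(x, y, lam):
    """Box-Cox profile log-likelihood of lam (up to an additive constant)."""
    n = len(y)
    _, _, rss = fit_line(x, box_cox(y, lam))
    jacobian = (lam - 1.0) * sum(math.log(v) for v in y)
    return -0.5 * n * math.log(rss / n) + jacobian

def aic(rss, n, k):
    """Gaussian AIC up to an additive constant; k counts estimated parameters."""
    return n * math.log(rss / n) + 2 * k

# Synthetic data (NOT the UCI dataset): a made-up predictor and a positive
# response that is roughly log-linear in it, with a small deterministic wiggle.
x = [1.0 + 0.5 * i for i in range(19)]
y = [math.exp(0.5 + 0.3 * xi + 0.01 * math.sin(i)) for i, xi in enumerate(x)]

# Step 1: correlation screening between predictor and raw response.
r = pearson(x, y)

# Step 2: choose the Box-Cox lambda that maximizes the profile log-likelihood
# over a coarse, hypothetical grid.
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
best_lam = max(grid, key=lambda lam: profile_loglik(x, y, lam))

# Step 3: on the transformed response, compare an intercept-only model with
# the one-predictor model by AIC (lower is better).
z = box_cox(y, best_lam)
n = len(z)
mean_z = sum(z) / n
rss_null = sum((v - mean_z) ** 2 for v in z)
_, _, rss_line = fit_line(x, z)
aic_null = aic(rss_null, n, 2)   # intercept + error variance
aic_line = aic(rss_line, n, 3)   # intercept + slope + error variance

print("correlation:", round(r, 3))
print("selected lambda:", best_lam)
print("AIC (null, line):", round(aic_null, 2), round(aic_line, 2))
```

With several predictors, the same comparison would be run over candidate subsets of the eight explanatory variables. In practice, a statistics library would replace these hand-written helpers.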