Production forecasting for steam-assisted gravity drainage (SAGD) in heterogeneous reservoirs is important for reservoir management and for optimizing development strategies in oil sands operations. In this work, artificial intelligence (AI) approaches are employed as a complementary tool for production forecasting and for recognizing patterns in the highly nonlinear relationships between system variables. Field data from more than 2000 wells are extracted from various publicly available sources. The data consist of petrophysical log measurements and production and injection profiles. An analysis of a raw dataset of this magnitude for SAGD reservoirs has not been published in the literature, although a previous study presented a much smaller dataset. This paper discusses and addresses a number of the challenges encountered in analyzing such a dataset. After a detailed exploratory data analysis, a refined dataset encompassing ten different SAGD operating fields with 153 complete well pairs is assembled for prediction model construction. An artificial neural network (ANN) is employed to facilitate the production performance analysis by relating reservoir heterogeneities and operating constraints to production performance. The impact of extrapolating petrophysical parameters from the nearest vertical well is assessed. As a result, an additional input attribute is introduced to capture the uncertainty in extrapolation, while a new output attribute is incorporated as a quantitative measure of process efficiency. Data-mining algorithms, including principal component analysis (PCA) and cluster analysis, are applied to improve prediction quality and model robustness by removing correlation among the input data and by identifying internal structures within the dataset; both are novel extensions of the previous SAGD analysis study. Finally, statistical analysis is conducted to study the uncertainties in the final ANN predictions. The modeling results are demonstrated to be both reliable and acceptable.
This paper demonstrates that combining AI-based approaches with data-mining analysis can facilitate practical, reliable analysis of field data, which are often prone to uncertainty, error, bias, and noise. Because many important system variables are typically unavailable in the public domain and are therefore missing from the dataset, this work illustrates how practical AI approaches can be tailored to construct models capable of predicting SAGD recovery performance from only log-derived and operational variables. It also demonstrates the potential of AI models to assist conventional SAGD analysis.
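The preprocessing idea described above (PCA to remove correlation among inputs, followed by cluster analysis to expose internal structure) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the four attribute names (`porosity`, `oil_sat`, `net_pay`, `steam_rate`) and their distributions are hypothetical, the 95% variance threshold and the choice of three clusters are arbitrary, and plain k-means stands in for whichever clustering method the study actually used. The ANN stage is not shown.

```python
import numpy as np

def standardize(X):
    """Center each column and scale it to unit sample variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def pca_scores(X, var_kept=0.95):
    """Project standardized data onto the leading principal components.

    Returns the component scores and the cumulative explained-variance
    fraction for the components that were kept.
    """
    Z = standardize(X)
    cov = np.cov(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]           # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = np.cumsum(eigvals) / eigvals.sum()
    # Smallest number of components reaching the variance threshold.
    k = min(int(np.searchsorted(explained, var_kept)) + 1, len(eigvals))
    return Z @ eigvecs[:, :k], explained[:k]

def kmeans(S, n_clusters, n_iter=50, seed=0):
    """Plain Lloyd's k-means on the rows of S (illustrative only)."""
    rng = np.random.default_rng(seed)
    centers = S[rng.choice(len(S), n_clusters, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(S[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Keep the old center if a cluster happens to become empty.
        new_centers = np.array([
            S[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(n_clusters)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Synthetic stand-in for a refined dataset of well pairs (hypothetical values).
rng = np.random.default_rng(42)
n = 153
porosity = rng.normal(0.32, 0.02, n)
oil_sat = 0.80 + 2.0 * (porosity - 0.32) + rng.normal(0.0, 0.02, n)  # correlated with porosity
net_pay = rng.normal(25.0, 5.0, n)
steam_rate = rng.normal(300.0, 50.0, n)
X = np.column_stack([porosity, oil_sat, net_pay, steam_rate])

scores, explained = pca_scores(X)                  # decorrelated inputs
labels, centers = kmeans(scores, n_clusters=3)     # group similar well pairs

print("components kept:", scores.shape[1])
print("cumulative explained variance:", np.round(explained, 3))
print("cluster sizes:", np.bincount(labels, minlength=3))
```

In a workflow of this kind, the decorrelated `scores` (optionally together with the cluster `labels`) would replace the raw attributes as inputs to the prediction model, so that redundant, strongly correlated logs do not dominate training.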