Recently, the UK Oil and Gas Authority released 130 terabytes of data, including subsurface data, through the National Data Repository (NDR). This gave us a great opportunity to expand our well planning database across the other part of the North Sea, beyond the Norwegian sector, and to use machine learning algorithms on a huge dataset. It was a perfect project for data science enthusiasts and for our summer interns, Carl and Emilia from Capgemini Consulting, to work on.

Posted 03.06.2020 10:21 by Juan Gonzalez

They both started by gaining essential oil and gas knowledge while working through the different oil and gas data types, such as stratigraphy data and wellpath data. In the process, they downloaded and stored metadata for 12,000 wellbores, stratigraphy for around 4,500 wellbores and wellpath data for 2,500 wellbores present in the UK NDR. The dataset was stored on Pro Well Plan’s cloud server, where it was easily accessible for further analysis.

Scraping stratigraphy data from HTML pages was a straightforward task. However, the file formats of the wellpath data were heterogeneous. Some of the common file formats encountered were csv, xls, xlsx and p72. Hence, the 1,900 wellpaths in the most common file formats (mentioned above) were used for further analysis. These heterogeneous file formats were standardized and stored in a single common format. Along with this, an algorithm was used to calculate the missing features (e.g. missing azimuth, TVD, etc.) and to increase the resolution of the data. Figure 1 shows the wellpath data of some of the wellbores after standardization.
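The post does not say which algorithm was used to fill in the missing features. As an illustration only, the sketch below assumes the minimum curvature method, a standard survey calculation, to derive TVD and horizontal offsets from measured depth, inclination and azimuth. The function name and the tuple layout are hypothetical.

```python
import math


def minimum_curvature(stations):
    """Compute (north, east, tvd) for each survey station.

    stations: list of (md, inclination_deg, azimuth_deg) tuples sorted by md.
    The first station is taken as the origin (an assumption of this sketch).
    """
    north = east = tvd = 0.0
    positions = [(north, east, tvd)]
    for (md1, inc1, azi1), (md2, inc2, azi2) in zip(stations, stations[1:]):
        i1, i2 = math.radians(inc1), math.radians(inc2)
        a1, a2 = math.radians(azi1), math.radians(azi2)
        # Dogleg angle: total change in direction between the two stations.
        cos_dl = math.cos(i2 - i1) - math.sin(i1) * math.sin(i2) * (1.0 - math.cos(a2 - a1))
        dogleg = math.acos(max(-1.0, min(1.0, cos_dl)))
        # Ratio factor turns the straight-line average into a circular arc.
        ratio = 1.0 if dogleg < 1e-9 else 2.0 / dogleg * math.tan(dogleg / 2.0)
        half = (md2 - md1) / 2.0 * ratio
        north += half * (math.sin(i1) * math.cos(a1) + math.sin(i2) * math.cos(a2))
        east += half * (math.sin(i1) * math.sin(a1) + math.sin(i2) * math.sin(a2))
        tvd += half * (math.cos(i1) + math.cos(i2))
        positions.append((north, east, tvd))
    return positions
```

For a vertical section, for example, `minimum_curvature([(0, 0, 0), (100, 0, 0)])` puts the second station 100 units deeper with no horizontal offset.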


Figure 1. Wellpath view of different wellbores originating from a common well.

After standardizing the wellpath data, some of the machine learning techniques applied by Carl and Emilia were:

  • Reconstructing a wellpath from only a few selected points. The model learns the variability from the wellpaths already in the database. It can be used to generate a new wellpath or to fill gaps in existing wellpath data.
  • Clustering wellpath data using image recognition and clustering techniques. This model helps in finding wellpaths similar to a user-selected wellpath.
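To make the similarity search concrete, here is a minimal sketch of the lookup step only. It assumes each wellbore has already been mapped to a point in the clustering space by some model, and that Euclidean distance is a reasonable similarity measure. The wellbore names, the coordinates and the function `most_similar` are all hypothetical.

```python
import math


def most_similar(embeddings, selected, k=3):
    """Return the k wellbores closest to `selected` in the clustering space.

    embeddings: dict mapping wellbore name -> coordinate tuple.
    """
    target = embeddings[selected]
    ranked = sorted(
        (math.dist(target, vector), name)
        for name, vector in embeddings.items()
        if name != selected
    )
    return [name for _, name in ranked[:k]]


# Hypothetical embeddings for four wellbores.
example = {
    "well-A": (0.0, 0.0),
    "well-B": (0.1, 0.0),
    "well-C": (5.0, 5.0),
    "well-D": (0.0, 0.3),
}
nearest = most_similar(example, "well-A", k=2)
```

Here `nearest` holds the two wellbores whose points lie closest to `well-A`.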

Results of clustering wellpath data are discussed further.


Each graph in Figure 2 above plots the wellbores along the arbitrary dimensions of the wellpath clustering space, and each point represents a wellbore. The graphs are colour-coded by the maximum values of true vertical depth, measured depth, reach and inclination encountered in the wellpath. It can be seen from the graphs that all the vertical wells (low reach and low inclination) are clustered on the left-hand side of the graph. Within the vertical wells, shallow wells are at the bottom whereas deeper wells are at the top. Similarly, highly deviated wells are clustered on the right-hand side of the graph.
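The four colour-coding quantities can be computed directly from a standardized wellpath. The sketch below is an illustration under stated assumptions: "reach" is taken to mean horizontal distance from the wellhead, and the tuple layout and function name are hypothetical.

```python
import math


def summarize_wellpath(stations):
    """Summarize one wellbore by its maximum MD, inclination, reach and TVD.

    stations: list of (md, inclination_deg, north, east, tvd) tuples,
    with north/east measured from the wellhead (an assumption of this sketch).
    """
    return {
        "max_md": max(s[0] for s in stations),
        "max_inclination": max(s[1] for s in stations),
        "max_reach": max(math.hypot(s[2], s[3]) for s in stations),
        "max_tvd": max(s[4] for s in stations),
    }


# A made-up wellpath that is vertical to 1000 and then deviates.
summary = summarize_wellpath([
    (0, 0, 0, 0, 0),
    (1000, 0, 0, 0, 1000),
    (2000, 60, 300, 400, 1700),
])
```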


Figure 3 (a) shows the results after applying Convolutional Neural Networks (CNN) and a clustering algorithm. The two axes are the arbitrary dimensions of the wellpath clustering space. The red points in the graph indicate the wellbores selected for the wellpath view, which are visualized further in (b) and (c). (b) shows the reach vs TVD plot of the selected wellbores, and (c) shows the 3-D view of the wellpaths. (b) and (c) show that the trajectories of all the selected wellbores are almost identical, except for the direction in which each wellbore is drilled. This is because the change in azimuth is more valuable to a well planning engineer when planning a well than the absolute value of azimuth.
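One way to see why wells drilled in different directions can end up together is to describe each wellpath by its station-to-station change in azimuth instead of the azimuth itself. The post does not say how azimuth was encoded for the model, so the following is only a sketch of the idea, with a hypothetical function name.

```python
def azimuth_changes(azimuths_deg):
    """Return the change in azimuth between consecutive stations.

    Each change is wrapped into the range [-180, 180) degrees so that
    crossing north (e.g. 350 -> 10) counts as +20, not -340.
    """
    changes = []
    for previous, current in zip(azimuths_deg, azimuths_deg[1:]):
        delta = (current - previous + 180.0) % 360.0 - 180.0
        changes.append(delta)
    return changes


# Two made-up wells with the same shape, drilled in different directions.
heading_northeast = azimuth_changes([10, 20, 35])
heading_north = azimuth_changes([350, 0, 15])
```

Both lists are identical, so a representation built on them treats the two wells as the same shape regardless of compass heading.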

And this concludes our summer of 2019 with Carl and Emilia. Not to mention that all of the above was done in six weeks!

Feel free to get in touch with us if you have any questions!

Many thanks to the UK Oil and Gas Authority for making this data publicly available!

Related Links:

Carl August Gjørsvik's LinkedIn Profile:

Emilia Botnen Vanden Bergh's LinkedIn Profile:
