Multivariate Analysis of Job Pause Time Data Using Classification and Regression Tree and Kernel Clustering

Authors:	Marko Maucec, Srimoyee Bhattacharya, Dwight D. Fulton, Jon Matthew Orth, Ajay Singh, Jeffrey Marc Yarus
Publication year:	2013
Subject:	Cart; Engineering; Multivariate analysis; business.industry; Decision tree; computer.software_genre; Missing data; Normal score; Data mining; Cluster analysis; business; computer; Categorical variable; Decision tree model
Source:	Day 2 Tue, October 29, 2013.
DOI:	10.2118/167399-ms
Description:	The well treatment program is an important part of the field development plan, and certain variables, such as job pause time (JPT), can affect its efficiency. JPT is the time during which pumping is paused between subsequent treatments of a job. The objective of this work is to investigate whether existing data reveal patterns in the significant variables that affect extreme values of JPT in a particular region. The answers are sought by applying a classification and regression tree (CART) to both categorical and continuous variables in the database. The practical application of CART is presented through case studies: first with classical CART analysis, then with CART analysis enhanced by tools such as the normal score transform (NST), and finally with the large dataset divided into smaller groups by clustering. Significant variables that affect the response variables are identified, and predictor variables are ranked in order of importance. Such information can be used to control predictor variables that cause high JPT. The results are outlined in an intuitive way, including categorical, continuous, and missing values. Because CART is a data-driven, deterministic model, we cannot calculate the confidence interval of the predicted response. Confidence in the results is based purely on the historical values, and the accuracy of the result produced by a tree model depends on the quality of the recorded data, measured in terms of volume, reliability, and consistency. The prediction capability of CART is enhanced by the use of NST and clustering techniques. The approach presented in this paper analyzes a dataset with limited information and high uncertainty and should lead to a method for generating proxy models to find future success indices (e.g., for drilling efficiency or production from a fracture). This could standardize stimulation and generate decision ‘best practices’ to save costs in field development and the optimization process.
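The abstract names the normal score transform (NST) as one of the tools applied before CART. The record does not give the authors' formulation, so the following is only a minimal sketch of one common rank-based variant: each value is replaced by the standard-normal quantile of its plotting position (rank - 0.5) / n, with tied values sharing their average rank. The function name `normal_score_transform` and the sample pause times are hypothetical and purely illustrative.

```python
from statistics import NormalDist


def normal_score_transform(values):
    """Map values to standard-normal scores by rank (assumed rank-based NST).

    Tied values receive the average of their ranks, so they map to the
    same score. This is an illustrative variant, not the paper's method.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j across a run of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    # (rank - 0.5) / n keeps every probability strictly inside (0, 1).
    return [std_normal.inv_cdf((r - 0.5) / n) for r in ranks]


# Hypothetical, strongly skewed pause times in minutes (not from the paper).
pause_times = [2.0, 3.5, 3.5, 5.0, 40.0, 180.0]
scores = normal_score_transform(pause_times)
```

Because the transform depends only on ranks, an extreme value such as 180.0 is pulled to a moderate positive score, which is one plausible reason a transform of this kind could help a tree model fitted to a heavily skewed response.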
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::903fa474367341f5e6182d0fb9efab01 https://doi.org/10.2118/167399-ms