Any process manufacturing plant has sensor measurement data, operator logbook data in electronic or handwritten form, CCTV monitoring data, audio data from CCTV or a separate system, thermal imaging data for hot areas of the plant, weather information, logistics information, etc.

Table 1. Details of different data types stored in process manufacturing plants

Data available in process manufacturing plants has the following characteristics in the context of the big data definition:

Variety: numeric, text, video and audio data types
Velocity: scan rates from milliseconds to minutes (multi-scale data)
Volume: a refinery is designed for a data logging capacity of 2 GB/day
Value: novel insights for operations, energy and asset management, etc.
Veracity: data quality and metadata management processes

Can process manufacturing data be termed BIG DATA?

It is evident that process manufacturing plant data has all the characteristics needed for it to be termed big data. Despite meeting the big data definition criteria, process manufacturing data cannot be used for big data analysis because the different data types are not available for combined analysis.

As described in Table 1, sensor measurement data resides in different DCS systems, PLCs and data loggers in the plant; because it is not combined in real time, it is unusable for combined analysis.
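To make the sensor integration problem concrete, here is a minimal sketch of aligning two sensor streams that are logged by separate systems at different scan rates. It assumes each stream is already available as a time-sorted list of (timestamp in seconds, value) pairs. The function name `align_asof`, the tag descriptions and the sample values are hypothetical and chosen only for illustration. The sketch holds the last known value of the slower stream, which is one simple alignment choice among several.

```python
from bisect import bisect_right


def align_asof(base, other):
    """Attach to each base sample the latest 'other' value at or before its timestamp.

    Both inputs are lists of (timestamp_seconds, value) sorted by timestamp.
    Returns a list of (timestamp, base_value, other_value_or_None).
    """
    other_times = [t for t, _ in other]
    merged = []
    for t, value in base:
        # Count of 'other' samples with timestamp <= t
        i = bisect_right(other_times, t)
        held = other[i - 1][1] if i > 0 else None  # None until the first sample arrives
        merged.append((t, value, held))
    return merged


# Hypothetical data: a flow tag scanned every second by a DCS,
# and a temperature tag recorded less often by a standalone data logger.
dcs_flow = [(0.0, 10.1), (1.0, 10.3), (2.0, 10.2), (3.0, 10.6)]
logger_temp = [(0.5, 351.0), (2.5, 352.5)]

combined = align_asof(dcs_flow, logger_temp)
for row in combined:
    print(row)
```

Each output row carries one timestamp with a value from both systems, which is the kind of combined record the separate DCS, PLC and data logger stores do not provide on their own.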
CCTV data stores come from different vendors and are built on different database technology than the sensor measurement data, and hence cannot be used for combined analysis.

Similarly, operations and maintenance logbooks contain text data that is stored either in a Manufacturing Execution System (MES) database or in handwritten records; thermal imaging and audio data systems come from different vendors and are built on different database technology than the sensor measurement data. Moreover, no technology platform is available for combined analysis of the different types of data found in a process manufacturing plant.

How can process manufacturing data be transformed to BIG DATA?

Data integration and a combined data analysis platform are the enablers for big data analysis of process manufacturing data. New technologies are available for common storage and management of different data types. Data stores for unstructured data, built on distributed file system architectures, can now handle a variety of data types in a scalable way. Apache Hadoop is one such open source framework; its file system can store numeric, text, video and image data.

Apache Hadoop is scalable because of its Distributed File System (DFS) architecture. In-memory data stores such as GridGain and aggregate-oriented databases such as MongoDB provide further choices for real-time and unstructured data analysis. These new technologies enable combined data analysis at large scale in near real time. Parallel computing platforms such as Apache Hadoop's MapReduce framework, and many other open source and proprietary MapReduce implementations, can carry out data mining computations on large amounts of data in a short period of time.

Fig. 1.
Schematic of big data technology deployment for process manufacturing industry

The process manufacturing industry can use new big data technologies to solve complex problems that have not been attempted so far because of data unavailability or high computational requirements.

Relevance of Big Data to Process Manufacturing Industry

With the help of big data technologies, more sources of data will be available for analysis. It will be possible to deploy a solution for combined analysis of sensor measurement data and other data types such as text, audio and thermal imaging.

Fig. 2. Potential benefits of new data

Fig. 2 shows that new data can affect process manufacturing plant operations in two ways: new questions can be answered once new data and technologies are available, and existing questions can be answered better by including new data sources and analytics techniques. Five new questions and three current questions are listed as use cases in this section to illustrate the potential benefits of big data analytics technologies for process manufacturing plants.

What new questions can be asked?

What are the best parameter settings for operating the process plant?
When could the next process disturbance or abnormal asset operation occur?
What is the root cause of a process disturbance or abnormal asset operation?
What are the best corrective actions to restore operation?
What will be the quality of the product being produced?

How existing questions can be answered better

How to operate smoothly with the least possible losses and inventory
Real-time integration of shop-floor KPIs with organization KPIs
Traceability across the value chain

Role of Big Data Stakeholders

There are three stakeholders in realizing the benefits of big data for process manufacturing plants (see Fig. 3).

Computing technology vendors
Automation technology vendors
Process manufacturing industry

Fig. 3.
Stakeholders for big data technology deployment in process manufacturing plants

The contribution of each of the three stakeholders is critical for realizing the benefits of big data technologies in process manufacturing plants. The rest of this section outlines the potential contribution of each stakeholder towards big data technology adoption in process manufacturing plants.

Computing technology vendor

Computing technology vendors have traditionally supplied IT hardware and the basic computing environment. In a big data deployment scenario, the computing vendor could bring the following components:

Distributed file system (DFS) storage for unstructured data (e.g. Apache Hadoop)
Parallel computing environment (e.g. Apache Hadoop's MapReduce)
Analytics modeling platform and computing libraries (e.g. Apache Mahout, IBM big data platform)

Automation technology vendor

Automation technology vendors have traditionally supplied control and plant information management systems. In a big data deployment scenario, the automation technology vendor could bring the following components:

Domain, control system and industrial communications knowledge
Application of new computing technologies (DFS, MapReduce) to process manufacturing plant problems
Operational rollout, refinement and field demonstration of new solutions

Process manufacturing industry

The process manufacturing industry has traditionally adopted technology solutions and partnered with automation companies on new technology development, e.g. advanced control and optimization solutions. In a big data deployment scenario, the process manufacturing industry could actively engage with all the stakeholders in developing an ecosystem for manufacturing analytics solutions.
The process manufacturing industry could provide:

Analytics ecosystem development and leadership
Actual process manufacturing plant data, which is a key element of the puzzle
Business-critical problem statements, i.e. information about what to look for in the data
Operational and maintenance nuances, which are of critical importance for a successful operational rollout

Conclusion

Process manufacturing plants have a variety of data sources such as sensors, CCTV systems, thermal videography, and operation and maintenance logbooks. These sources generate different types of data such as numeric, video and text. Each data type is used for a specific analysis, e.g. sensor data for process control, CCTV for facility monitoring, and thermal videography for operational and safety monitoring. Current automation technologies do not allow combined analysis of different data types. Big data technologies could enable complex data analysis in near real time to unlock operational efficiencies for process manufacturing plants.
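
As a closing illustration of the MapReduce style of computation mentioned earlier, the following is a minimal in-process sketch of the pattern in plain Python. It does not use the Apache Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`), the tag names and the readings are hypothetical and serve only to show the three stages. A real framework would run the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit (key, value) pairs from one raw record."""
    tag, value = record
    yield tag, value


def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse all values for one key into a summary (here, the mean)."""
    return key, sum(values) / len(values)


def run_job(records):
    """Run map, shuffle and reduce sequentially on a small in-memory data set."""
    pairs = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sensor readings as (tag, value) records
readings = [("FT-101", 10.0), ("TT-205", 350.0), ("FT-101", 12.0), ("TT-205", 354.0)]

averages = run_job(readings)
print(averages)
```

Because each map call looks at a single record and each reduce call at a single key, the work can be split across machines, which is what makes the pattern suited to large volumes of plant data.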