Artificial Intelligence Glossary, in Mind Maps

September 13, 2018

What do people mean by artificial intelligence (AI)? The term has never had a clear definition or boundaries. When it was introduced at a seminal 1956 workshop at Dartmouth College, it was taken broadly to mean making a machine behave in ways that would be called intelligent if seen in a human. An important recent advance in AI has been machine learning, which shows up in technologies from spellcheck to self-driving cars and is often carried out by computer systems called neural networks. Any discussion of AI is likely to include other terms as well, so this glossary is not the be-all and end-all. However, it’s a start, and what makes it different from other glossaries is that it is in a Mind Map format.

The Glossary was developed using the iThoughts Mind Mapping software and each map has been uploaded to BiggerPlate. All maps have no rights reserved, so you are free to do as you wish with them, but if you use them at all I would appreciate a mention. You can go directly to my Biggerplate profile, Graham0921, to download the maps or view other maps I have produced, some of them about artificial intelligence.

I am sure that you will get to use the glossary either now or in the future as it has relevance in Schools, Colleges, Universities and Businesses alike.

Glossary of Artificial Intelligence – A

Glossary of Artificial Intelligence – B

Glossary of Artificial Intelligence – C

Glossary of Artificial Intelligence – D

Glossary of Artificial Intelligence – G

Glossary of Artificial Intelligence – H

Glossary of Artificial Intelligence – I

Glossary of Artificial Intelligence – K

Glossary of Artificial Intelligence – L

Glossary of Artificial Intelligence – M

Glossary of Artificial Intelligence – N

Glossary of Artificial Intelligence – O

Glossary of Artificial Intelligence – P

Glossary of Artificial Intelligence – Q

Glossary of Artificial Intelligence – R

Glossary of Artificial Intelligence – S

Glossary of Artificial Intelligence – T

Glossary of Artificial Intelligence – U, V and W 


Smaller, Thinner, Lighter

August 18, 2018

In April last year I wrote an article about the laptop ban which our friend Mr Trump introduced. In that article, I gave some advice about how to travel without compromise. You can read it here at

Some thirteen years previously I wrote an article for Digital Oman about this very topic; you can read the original article here

Since that time I have re-evaluated my own travelling equipment and am now travelling with a few different equipment companions.

Firstly, I should mention that I still have my iPhone 6 and 6+, which, although slower than before, still perform well for what I need to do. I usually use my iPhone 6 as a camera and the 6+ for my business activities. This article is being written on my 6+ in WordPress, for my blog ‘Dr Dot’s Daily Dose’.

Last year, as in previous years, I travelled with a Bluetooth LiveScribe pen and its associated notebook. Whilst this was great technology, the pen itself was quite delicate and the notebook could only be used once. I must say as well that it was quite expensive when you had to get new notebooks and refills for the pen.

Enter the ‘Everlast’ notebook from RocketBook. This is a notebook that will (should) never run out. Why? Because the pages are erasable, assuming you use the correct pen. I paid $34 for my A5 notebook, which you may think is quite costly, but it is reusable, so it is a one-time cost. The pens that you use are from the Pilot FriXion range, which come with a built-in eraser, or you can just use a damp cloth to wipe away your notes. You are probably sitting there thinking, what use is an erasable notebook? Well, here is the thing: it comes with an app for your phone that enables you to send your notes to any service you wish, such as email, cloud storage or just notes. Write it, scan it, send it and erase it. Simplicity itself.

At this point, I should mention that I wear glasses for reading; otherwise my eyesight is perfect. So I had been carrying various reading glasses for years until I came across ‘ThinOptics’ at the beginning of this year. These are glasses without arms that just peg on your nose. Perfect, as I don’t need to fill my pockets with expensive designer-labelled glasses when I am travelling. I have two pairs: one is on a lanyard around my neck and the other fits into my travel document wallet.

If you remember my last two articles on this subject you will see that I continue to downsize and adapt to changing technology which I can benefit from.

When you are writing articles such as this on a smart device, it’s beneficial to have a keyboard that you can link. Again, last year I travelled with a Microsoft Universal Bluetooth Keyboard; whilst exceptionally good, it was quite bulky. Earlier this year I found an AVATTO A18 Portable Twice Folding Bluetooth Keyboard BT Wireless Foldable Touchpad Keypad for IOS/Android/Windows iPad Tablet which is, in fact, the same size as my iPhone 6+. Coupled to this I bought a phone stand which flips open to any angle. I believe it’s been made from reusable plastic. Both fit nicely into my backpack and waste no space.

Finally, I have purchased a personal drone, the DJI Tello, the most affordable and smallest drone on the market. To be honest, it takes wonderful pictures even though its flight time is limited (15 minutes if you are lucky, at slow speed). I use my iPhone 6+ as its controller, and photographs or videos can be sent directly from the phone onto the internet or to friends and family.

So there we have it: some changes, and I am sure there will be more to come in the future. There are a couple of things already on the wish list, but I will have to wait and see how things develop.

Mobile Mind Mapping

September 4, 2017

Since the Laptop ban that Mr Trump introduced some months ago I took it upon myself to change the way I worked and use enabling technology to my advantage. When the ban was announced I wrote an article and published it on LinkedIn entitled “Light and Easy is the way to go” (Access the article –

Twelve years previously I wrote my first article for Digital Oman about mobility; you can read the original article here


To tell you the truth there are some things that you just can’t do on a mobile device, but I think that I will leave that for another blog.

I have been a mind mapper for a considerable number of years, using mind mapping in business for project management, reports, presentations, my own book, planning, brainstorming, training, consulting, etc.; the list of uses for me seems endless. Now I am using mind mapping in education for teaching, management and my own education process. You can find many of my maps on BiggerPlate, and just recently, as part of the BiggerPlate Business Club, I was offered the opportunity to speak on the subject of Business Process Management using a mind map for the presentation.


So what has the mobile device got to do with Mind Mapping and Mr Trump’s laptop ban? Well, it’s really all about size! The ban was about only allowing devices of a certain size into the cabin, whilst your laptop had to go with your baggage (if you are crazy enough to put it in the hold of an aircraft; I used to load aircraft once upon a time, so I have a good idea about their handling). The size of the screen is all-important when you are trying to Mind Map. Previously I would only mind map on consulting assignments with an iPad Air, basically a 10-inch screen, but since the April ban I have downsized considerably to the approximately 5 x 3 inch screen of the iPhone 6+. MindGenius was my preferred mind mapping tool, but having downsized I have had to find software I can use on the iPhone which suits my purpose.


iThoughts2go on iPhone 6+

So I have a list of six which are suitable for this device and have worked extremely well on consulting assignments. One of them is not really a mind mapping tool, although you can use it for mind mapping; its primary use for me is flowcharts and BPMN diagramming.

So here is the list with links to websites

  1. iThoughts2go
  2. MindMeister
  3. MindMaple
  4. iMindMap
  5. Lucidchart
  6. Mindjet Maps

The example shown below is a flowchart for teaching Academic English Writing that was developed for a University Course on the subject. Lucidchart is a great online tool for both business and education.


To summarise, things are getting smaller and smaller, but capability is getting larger and larger. With these small portable devices and mapping software I can pretty much go anywhere and do my work or my own education. People do say that these screens are too small, but so far I have not found it a disadvantage in what I am doing. For me it’s really about the software’s functionality and usability.



IoRIT, Internet of Really Important Things

March 1, 2016


Just recently I published a post entitled “The Three A’s of Predictive Maintenance”, basically discussing the importance of maintaining assets in these economically volatile times. The post does contain some references to IoT (Internet of Things), but here I want to concentrate on what is really important, so I am going to steal the phrase from Mr Harel Kodesh, Vice President, Chief Technology Officer at GE, who introduced it in his keynote speech at the Cloud Foundry Summit in May of 2015 (

We build huge assets to support our way of living, and these assets are the REALLY important things that, without maintenance, will disrupt everything if left to the “fix it when it breaks” mentality. Mr Kodesh uses two examples, which I have explained in the table below: the Commercial Internet and the Industrial Internet. Both are important, but the impacts on business and the environment are much greater in the Industrial Internet and could have far-reaching consequences.


When we wake in the morning we tend to think about having a shower and getting ready for work, and cooking our breakfast with either electricity or gas. We don’t think about the water distribution system, we don’t think about power generation or its distribution, and we certainly don’t think about gas extraction or its distribution. We don’t think about the fuel, or where it was made, for the flight across the world that lets us do business in another country. We are not sure where the petrol or diesel that powers our cars or trucks comes from.

Well, it’s reasonably simple to define: all of these commodities come from huge assets that may power other assets and have to be maintained. We are talking here about Oil & Gas drilling and production platforms, oil refineries and power stations. All of these assets include other assets which have to be maintained.


Above is a good example of what we are talking about, and one that I was intimately involved with. Some 195 miles out to sea, the first concrete platform (Condeep, built by Aker in Stavanger, Norway), the Beryl Alpha, was given a life expectancy of 20 years when it was installed by Mobil, now part of ExxonMobil, on the Beryl oilfield (Block 9/13-1) in 1975. Now, 41 years on and having been purchased from ExxonMobil by the Apache Corporation, there is no sign of it being decommissioned, and the addition in 2001 of facilities to process gas from the nearby Skene gas field has given it a new lease of life.

At its peak in 1984, Beryl Alpha was producing some 120,000 bpd. It is still pumping an average of 90,000 to 100,000 barrels of Beryl, a high-quality crude named after Beryl Solomon, wife of Mobil Europe president Charles Solomon. Gas production is around 450 mm cfpd, representing nearly 5% of total British gas demand or the needs of 3.2 mm households. Today “The challenge is the interface between technology 41 years old and new technology.”

So here we are, thinking now about “The Internet of Really Important Things” and how we can use the technology of today with the technology of yesteryear. Doing more with less, sweating the assets, to coin a phrase! When it comes to compliance with specifications, rules and regulations, this is where we need tools and techniques such as Predictive Maintenance (PdM). The linked specifications are a snapshot of the specifications for the Beryl. Monitors and sensors ensure that data is captured in real time, and this data can be used to highlight problems before they occur.

To achieve what is called World Class Maintenance (WCM), it is necessary to improve the maintenance processes already adopted. Various tools available today have adopted the word maintenance. It is important to note that these are not new types of maintenance but tools that allow the application of the main types of maintenance.



Process Mining, Bridging the gap between BPM and BI

February 29, 2016

Later this year I will be involved in a MOOC entitled “Introduction to Process Mining with ProM”, from FutureLearn. Unfortunately it has just been delayed from April until July but, being interested in BPM and BI, I thought that I would start my own research into the subject and publish my own findings.

Prof. Dr. Ir. Wil van der Aalst, Department of Mathematics and Computer Science (Information Systems WSK&I), is the founding father of “Process Mining” and is located at the Data Science Center, Eindhoven, in the Netherlands. You will find many quotes attributed to him in this post.


Today a tremendous amount of information about business processes is recorded by information systems in the form of “event logs”. Despite the omnipresence of such data, most organisations diagnose problems based on fiction rather than facts. Process mining is an emerging discipline based on process model-driven approaches and data mining. It not only allows organisations to fully benefit from the information stored in their systems, but it can also be used to check the conformance of processes, detect bottlenecks, and predict execution problems.

So let’s see what it is all about.

Companies use information systems to enhance the processing of their business transactions. Enterprise resource planning (ERP) and workflow management systems (WFMS) are the predominant information system types that are used to support and automate the execution of business processes. Business processes like procurement, operations, logistics, sales and human resources can hardly be imagined without the integration of information systems that support and monitor relevant activities in modern companies. The increasing integration of information systems does not only provide the means to increase effectiveness and efficiency. It also opens up new possibilities of data access and analysis. When information systems are used for supporting and automating the processing of business transactions they generate data. This data can be used for improving business decisions.

The application of techniques and tools for generating information from digital data is called business intelligence (BI). Prominent BI approaches are online analytical processing (OLAP) and data mining (Kemper et al. 2010 pp. 1–5). OLAP tools allow analysing multidimensional data using operators like roll-up and drill-down, slice and dice or split and merge (Kemper et al. 2010 pp. 99–106). Data mining is primarily used for discovering patterns in large data sets (Kemper et al. 2010 p. 113).

However, the availability of data is not only a blessing as a new source of information; it can also become a curse. The phenomena of information overflow (Krcmar 2010 pp. 54–57), data explosion (Van der Aalst 2011 pp. 1–3) and big data (Chen et al. 2012) illustrate several problems that arise from the availability of enormous amounts of data. Humans are only able to handle a certain amount of information in a given time frame. When more and more data is available, how can it actually be used in a meaningful manner without overstraining the human recipient?

Data mining is the analysis of data for finding relationships and patterns. The patterns are an abstraction of the analysed data. Abstraction reduces complexity and makes information available for the recipient. The aim of “Process Mining” is the extraction of information about business processes (Van der Aalst 2011 p. 1). Process mining encompasses “techniques, tools and methods to discover, monitor and improve real processes by extracting knowledge from event logs” (Van der Aalst et al. 2012 p. 15). The data that is generated during the execution of business processes in information systems is used for reconstructing process models. These models are useful for analysing and optimising processes. Process mining is an innovative approach and builds a bridge between data mining (BI) and business process management (BPM).

Process mining evolved in the context of analysing software engineering processes by Cook and Wolf in the late 1990s (Cook and Wolf 1998). Agrawal and Gunopulos (Agrawal et al. 1998) and Herbst and Karagiannis (Herbst and Karagiannis 1998) introduced process mining to the context of workflow management. Major contributions to the field have been added during the last decade by van der Aalst and other research colleagues by developing mature mining algorithms and addressing a variety of topic-related challenges (Van der Aalst 2011). This has led to a well-developed set of methods and tools that are available for scientists and practitioners.

Introduction to the basic concepts of process mining. 

The aim of process mining is the construction of process models based on available event log data. In the context of information system science a model is an immaterial representation of its real-world counterpart used for a specific purpose (Becker et al. 2012 pp. 1–3). Models can be used to reduce complexity by representing characteristics of interest and by omitting other characteristics. A process model is a graphical representation of a business process that describes the dependencies between activities that need to be executed collectively for realising a specific business objective. It consists of a set of activity models and constraints between them (Weske 2012 p. 7).

Process models can be represented in different process modelling languages. BPMN provides more intuitive semantics that are easier to understand for recipients who do not possess a theoretical background in informatics, so I am going to use BPMN models for the examples in this post.

Above is a business process model of a simple procurement process. It starts with the definition of requirements. The goods or service are ordered, and at some point in time the ordered goods or service are delivered. After the goods or service have been received, the supplier issues an invoice, which is finally settled by the company that ordered the goods or service.

Each one of the events depicted in the process above will have an entry in an event log. An event log is basically a table. It contains all recorded events that relate to executed business activities. Each event is mapped to a case. A process model is an abstraction of the real-world execution of a business process. A single execution of a business process is called a process instance. Process instances are reflected in the event log as sets of events that are mapped to the same case. The sequence of recorded events in a case is called a trace. The model that describes the execution of a single process instance is called a process instance model. A process model abstracts from the behaviour of single process instances and provides a model that reflects the behaviour of all instances that belong to the same process. Cases and events are characterised by classifiers and attributes. Classifiers ensure the distinctness of cases and events by mapping unique names to each case and event. Attributes store additional information that can be used for analysis purposes.
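To make these terms a little more concrete, here is a minimal Python sketch of an event log and the traces derived from it. The three-column layout (case, activity, timestamp), the activity names and all of the values are my own assumptions for illustration; a real export from an ERP system will look different.

```python
from collections import defaultdict

# Hypothetical event log: each row is one event (case id, activity, timestamp).
EVENT_LOG = [
    ("case-1", "Define requirements", "2016-01-04T09:00"),
    ("case-2", "Define requirements", "2016-01-04T10:30"),
    ("case-1", "Order goods", "2016-01-05T11:00"),
    ("case-1", "Receive goods", "2016-01-12T08:15"),
    ("case-2", "Order goods", "2016-01-06T14:00"),
    ("case-1", "Receive invoice", "2016-01-13T16:40"),
    ("case-1", "Pay invoice", "2016-01-20T12:00"),
]

def traces_by_case(event_log):
    """Group events by case and return each case's activities in time order."""
    events = defaultdict(list)
    for case_id, activity, timestamp in event_log:
        events[case_id].append((timestamp, activity))
    # ISO 8601 timestamps in the same format sort correctly as plain strings.
    return {case: [activity for _, activity in sorted(rows)]
            for case, rows in events.items()}

traces = traces_by_case(EVENT_LOG)
```

Each key in `traces` is a case and each value is its trace, so `traces["case-2"]` holds the two activities recorded so far for that process instance.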

The Mining Process

The process above provides an overview of the different process mining activities. Before being able to apply any process mining technique it is necessary to have access to the data. It needs to be extracted from the relevant information systems. This step is far from trivial. Depending on the type of source system the relevant data can be distributed over different database tables. Data entries might need to be composed in a meaningful manner for the extraction. Another obstacle is the amount of data. Depending on the objective of the process mining up to millions of data entries might need to be extracted which requires efficient extraction methods. A further important aspect is confidentiality. Extracted data might include personalised information and depending on legal requirements anonymisation or pseudonymisation might be necessary.
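As an illustration of the pseudonymisation step, the sketch below replaces a user name with a keyed hash so that events by the same person can still be linked without exposing the name. The field name `user`, the sample events and the key are hypothetical, and whether this approach satisfies the legal requirements in a given country is something I am assuming would need to be checked separately.

```python
import hashlib
import hmac

# Assumption: the secret key is stored separately from the exported log.
SECRET_KEY = b"replace-with-a-secret-key"

def pseudonymise(value, key=SECRET_KEY):
    """Map a personal identifier to a stable, keyed pseudonym."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "user-" + digest[:12]

def pseudonymise_events(events, field="user"):
    """Return copies of the events with the given field pseudonymised."""
    return [{**event, field: pseudonymise(event[field])} for event in events]

# Hypothetical extracted events containing personal data.
events = [
    {"case": "case-1", "activity": "Order goods", "user": "a.smith"},
    {"case": "case-1", "activity": "Pay invoice", "user": "b.jones"},
    {"case": "case-2", "activity": "Order goods", "user": "a.smith"},
]
safe_events = pseudonymise_events(events)
```

The same input always maps to the same pseudonym, so analyses such as "who handles which activity" still work on `safe_events`.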

Before the extracted event log can be used it needs to be filtered and loaded into the process mining software. There are different reasons why filtering is necessary. Information systems are not free of errors. Data may be recorded that does not reflect real activities. Errors can result from malfunctioning programs, but also from user disruption or hardware failures that lead to erroneous records in the event log.

Process Mining Algorithms

The main component in process mining is the mining algorithm. It determines how the process models are created. A broad variety of mining algorithms do exist. The following three categories will be discussed but not in great detail.

  • Deterministic mining algorithms
  • Heuristic mining algorithms
  • Genetic mining algorithms

Determinism means that an algorithm only produces defined and reproducible results. It always delivers the same result for the same input. A representative of this category is the α-Algorithm (Van der Aalst et al. 2002). It was one of the first algorithms able to deal with concurrency. It takes an event log as input and calculates the ordering relation of the events contained in the log.

Heuristic mining also uses deterministic algorithms but they incorporate frequencies of events and traces for reconstructing a process model. A common problem in process mining is the fact that real processes are highly complex and their discovery leads to complex models. This complexity can be reduced by disregarding infrequent paths in the models.
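The idea of disregarding infrequent paths can be sketched in a few lines of Python. This is a deliberate simplification: it filters on raw counts of "directly follows" pairs, whereas real heuristic miners use more refined dependency measures. The traces and the threshold are made up for illustration.

```python
from collections import Counter

def directly_follows_counts(traces):
    """Count how often activity a is directly followed by activity b."""
    counts = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts

def frequent_paths(traces, min_count):
    """Keep only the directly-follows pairs seen at least min_count times."""
    counts = directly_follows_counts(traces)
    return {pair: n for pair, n in counts.items() if n >= min_count}

# Hypothetical traces: the last one contains a rare rework loop.
traces = [
    ["Order", "Receive goods", "Receive invoice", "Pay"],
    ["Order", "Receive goods", "Receive invoice", "Pay"],
    ["Order", "Receive goods", "Receive invoice", "Pay"],
    ["Order", "Receive goods", "Order", "Receive goods", "Receive invoice", "Pay"],
]
paths = frequent_paths(traces, min_count=2)
```

With a threshold of two, the one-off step from "Receive goods" back to "Order" is dropped, and the remaining pairs describe the simpler main flow.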

Genetic mining algorithms use an evolutionary approach that mimics the process of natural evolution. They are not deterministic. Genetic mining algorithms follow four steps: initialisation, selection, reproduction and termination. The idea behind these algorithms is to generate a random population of process models and to find a satisfactory solution by iteratively selecting individuals and reproducing them by crossover and mutation over different generations. The initial population of process models is generated randomly and might have little in common with the event log. However, because of the large number of models in the population and the repeated selection and reproduction, better-fitting models are created in each generation.
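The four steps can be shown with a toy example. To be clear, this is not the genetic miner found in any real tool: a "model" here is simply a set of allowed steps between activities, and the fitness function, population size, mutation scheme and traces are all assumptions I have made to keep the sketch short.

```python
import random

def fitness(model, traces):
    """Share of observed steps the model allows, minus a small size penalty."""
    steps = [(a, b) for t in traces for a, b in zip(t, t[1:])]
    covered = sum(1 for step in steps if step in model)
    return covered / len(steps) - 0.01 * len(model)

def evolve(traces, generations=30, size=20, seed=0):
    """Toy genetic search over sets of allowed (a, b) steps."""
    rng = random.Random(seed)
    activities = sorted({a for t in traces for a in t})
    edges = [(a, b) for a in activities for b in activities]

    def random_model():
        return frozenset(e for e in edges if rng.random() < 0.3)

    def crossover(m1, m2):
        # Each edge is inherited from one parent chosen at random.
        return frozenset(e for e in edges
                         if e in (m1 if rng.random() < 0.5 else m2))

    def mutate(model):
        # Flip the presence of one randomly chosen edge.
        return model ^ {rng.choice(edges)}

    # Initialisation: a random population that ignores the log.
    population = [random_model() for _ in range(size)]
    history = []
    for _ in range(generations):
        # Selection: keep the better half (this also preserves the best model).
        population.sort(key=lambda m: fitness(m, traces), reverse=True)
        history.append(fitness(population[0], traces))
        parents = population[: size // 2]
        # Reproduction: refill the population with mutated offspring.
        children = [mutate(crossover(rng.choice(parents), rng.choice(parents)))
                    for _ in range(size - len(parents))]
        population = parents + children
    # Termination: stop after a fixed number of generations.
    best = max(population, key=lambda m: fitness(m, traces))
    return best, history

# Hypothetical traces: four regular cases and one where the invoice comes first.
traces = [["Order", "Receive goods", "Receive invoice", "Pay"]] * 4 + \
         [["Order", "Receive invoice", "Receive goods", "Pay"]]
best, history = evolve(traces)
```

Because the best model always survives selection, the values in `history` never get worse from one generation to the next, which mirrors the point that fitter models emerge over the generations.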

The process above shows a mined process model that was reconstructed by applying the α-Algorithm to an event log. It was translated into a BPMN model for better comparability. Obviously this model is not the same as the model in the first process diagram above. The reason for this is that the mined event log includes cases that deviate from the ideal linear process execution that was assumed for modelling in the first process depiction. In case 4 the invoice is received before the goods or service. Because both possibilities are included in the event log (goods or service received before the invoice in cases 1, 2, 3 and 5, and invoice received before the ordered goods in case 4), the mining algorithm assumes that these activities can be carried out concurrently.
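The reasoning in this example can be reproduced with a short Python sketch of the ordering relations the α-Algorithm starts from. It covers only that first step, not the construction of the full model, and the activity names are my own assumptions; the order of events in the five cases follows the description above.

```python
def footprint(traces):
    """Derive the basic ordering relations the α-algorithm starts from."""
    # a > b: a is directly followed by b somewhere in the log.
    follows = {(a, b) for t in traces for a, b in zip(t, t[1:])}
    # a -> b: a is followed by b, but b is never followed by a (causality).
    causal = {(a, b) for a, b in follows if (b, a) not in follows}
    # a || b: both orders occur, so the activities are treated as concurrent.
    parallel = {(a, b) for a, b in follows if (b, a) in follows}
    return follows, causal, parallel

# Cases 1, 2, 3 and 5 receive the goods first; case 4 receives the invoice first.
goods_first = ["Define requirements", "Order", "Receive goods",
               "Receive invoice", "Pay invoice"]
invoice_first = ["Define requirements", "Order", "Receive invoice",
                 "Receive goods", "Pay invoice"]
traces = [goods_first, goods_first, goods_first, invoice_first, goods_first]

follows, causal, parallel = footprint(traces)
```

Here `parallel` contains the pair "Receive goods" and "Receive invoice" in both directions, which is exactly why the mined model shows them as concurrent, while "Order" remains a causal predecessor of both.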

Process Discovery and Enhancement

A major area of application for process mining is the discovery of formerly unknown process models for the purpose of analysis or optimisation (Van der Aalst et al. 2012 p. 13). Business process reengineering and the implementation of ERP systems in organisations gained strong attention starting in the 1990s. Practitioners have since primarily focused on designing and implementing processes and getting them to work. With maturing integration of information systems into the execution of business processes and the evolution of new technical possibilities the focus shifts to analysis and optimisation.

Actual executions of business processes can now be described and made explicit. The discovered processes can be analysed for performance indicators like average processing time or costs, for improving or reengineering the process. The major advantage of process mining is the fact that it uses reliable data. The data that is generated in the source systems is generally hard for the average system user to manipulate. For traditional process modelling, the necessary information is primarily gathered by interviews, workshops or similar manual techniques that require the interaction of persons. This leaves room for interpretation and a tendency to create ideal models based on often overly optimistic assumptions.
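As a small example of such a performance indicator, the sketch below works out the average processing time per case from timestamps in an event log. The log layout and its values are hypothetical, and I am assuming that "processing time" means the time between the first and last event of a case.

```python
from datetime import datetime

# Hypothetical event log rows: (case id, activity, ISO timestamp).
EVENT_LOG = [
    ("case-1", "Order", "2016-01-04T09:00"),
    ("case-1", "Pay invoice", "2016-01-14T09:00"),
    ("case-2", "Order", "2016-01-05T09:00"),
    ("case-2", "Pay invoice", "2016-01-25T09:00"),
]

def average_throughput_days(event_log):
    """Average time in days between the first and last event of each case."""
    first, last = {}, {}
    for case_id, _activity, stamp in event_log:
        moment = datetime.fromisoformat(stamp)
        first[case_id] = min(first.get(case_id, moment), moment)
        last[case_id] = max(last.get(case_id, moment), moment)
    durations = [(last[c] - first[c]).total_seconds() / 86400 for c in first]
    return sum(durations) / len(durations)

average_days = average_throughput_days(EVENT_LOG)
```

In this made-up log the two cases take 10 and 20 days, so the average comes out at 15 days.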

Analysis and optimisation are not limited to post-runtime inspections. Instead, process mining can be used for operational support by detecting traces being executed that do not follow the intended process model. It can also be used for predicting the behaviour of traces under execution. An example of runtime analysis is the prediction of the expected completion time by comparing the instance under execution with similar, already processed instances. Another feature can be the provision of recommendations to the user for selecting the next activities in the process. Process mining can also be used to derive information for the design of business processes before they are implemented.
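Detecting a running trace that does not follow the intended model can be illustrated very simply. In this sketch the intended model is assumed to be nothing more than a set of allowed steps, with activity names I have invented; proper conformance checking techniques are considerably more sophisticated.

```python
# Assumed intended model: the allowed "directly follows" steps of the process.
ALLOWED_STEPS = {
    ("Define requirements", "Order"),
    ("Order", "Receive goods"),
    ("Receive goods", "Receive invoice"),
    ("Receive invoice", "Pay invoice"),
}

def deviations(trace, allowed_steps=ALLOWED_STEPS):
    """Return the steps in a (possibly unfinished) trace the model does not allow."""
    return [step for step in zip(trace, trace[1:]) if step not in allowed_steps]

# A case still under execution where the invoice arrived before the goods.
running_case = ["Define requirements", "Order", "Receive invoice"]
problems = deviations(running_case)
```

An empty list means the case is on track so far; here `problems` reports the step from "Order" straight to "Receive invoice", which could trigger an alert while the case is still open.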


Process mining builds the bridge between data mining (BI) and business process management (BPM). The increasing integration of information systems for supporting and automating the execution of business transactions provides the basis for novel types of data analysis. The data that is stored in the information systems can be used to mine and reconstruct business process models. These models are the foundation for a variety of application areas including process analysis and optimisation or conformance and compliance checking. The basic constructs for process mining are event logs, process models and mining algorithms. I have summarised essential concepts of process mining in this post, illustrating the main application areas and one of the available tools, namely ProM.

Process mining is still a young research discipline and limitations concerning noise, adequate representation and competing quality criteria should be taken into account when using process mining. Although some areas like the labelling of events, complexity reduction in mined models and phenomena like concept drift need to be addressed by further research the available set of methods and tools provide a rich and innovative resource for effective and efficient business process management.

The Three “A’s” of Predictive Maintenance

February 25, 2016

Again today in the news is another Oil & Gas company posting a profit loss, a rig operator scrapping two rigs and predictions of shortfalls in supplies by 2020, plus major retrenchments of staff across the globe. With all of this going on the signs are that we are going to have to sweat the assets and do more with less. How then are we going to do more with less?


This post is going to focus on the use of Predictive Analytics for the maintenance process, or PdM (Predictive Maintenance). Organisations are looking at their operations and how to reduce costs more than ever before. They are experiencing increased consumer empowerment, global supply chains, ageing assets, raw material price volatility, increased compliance, and an ageing workforce. A huge opportunity for many organisations is a focus on their assets.

Although organisations lack not only visibility but also predictability into their assets’ health and performance, maximising asset productivity and ensuring that the associated processes are as efficient as possible are key aspects for organisations striving to gain strong financial returns.

In order for your physical asset to be productive, it has to be up, running, and functioning properly. Maintenance is a necessary evil that directly affects the bottom line. If the asset fails or doesn’t work properly, it takes a lot of time, effort, and money to get it back up and running. If the asset is down, you can’t use it. For example, you can’t manufacture products, mine for minerals, drill for oil, refine lubricants, process gas, generate power etc, etc.

Maintenance has evolved with the technology, organisational processes, and the times. Predictive maintenance (PdM) technology has become more popular and mainstream for organisations, but in many cases its use remains inconsistent.

There are many reasons for this, including the items below:

  • Availability of large amounts of data due to Instruments and connected assets (IoT)
  • Increased coupling of technology within businesses (MDM, ECM, SCADA)
  • Requirements to do more with less. For example, stretching the useful life of an asset (EOR)
  • Relative ease of use of garnering insights from raw data (SAP HANA)
  • Reduced cost of computing, network, and storage technology (Cloud Storage, SaaS, In Memory Computing)
  • Convergence of Information Technology with Operational technology (EAM, ECM)

PdM will assist organisations with key insights regarding asset failure and product quality, enabling them to optimise their assets, processes, and employees. Organisations are realising the value of PdM and how it can be a competitive advantage.

Given the economic climate and the pressure on everyone to do more with less, operations budgets are always the first to be cut, and it no longer makes sense to employ a wait-for-it-to-break mentality. Executives say that the biggest impact on operations is failure of critical assets. In this post I am going to show how Predictive Analytics or PdM will assist organisations.

Predictive Maintenance Definition.

We have all understood what Preventive Maintenance was; it was popular in the 20th century, but PdM is very much focused on the 21st century. PdM is an approach based upon various types of information that allows maintenance, quality and operational decision makers to predict when an asset needs maintenance. There is a myth that PdM is focused purely on asset data; however, it is much more. It includes information from the surrounding environment in which the asset operates and the associated processes and resources that interact with the asset.

PdM leverages various analytical techniques to give decision makers better visibility of the asset, and analyses various types of data. It is important to understand the data that is being analysed. PdM is usually based upon usage and wear characteristics of the asset, as well as other asset condition information. As we know, data comes in many different formats. The data can be at rest (data that is fixed and does not move over time) or streaming (data that is constantly on the move).

Types of Data.

From my previous posts on the subject of Big Data you will know by now that there are basically two types of data; however, in the 21st century there is a third. The first is structured data, the second is unstructured data and the third is streaming data. The most common, of course, is structured data, which is collected from various systems and processes: CRM, ERP, industrial control systems such as SCADA, HR, financial systems, information and data warehouses, etc. All of these systems contain datasets in tables. Examples include inventory information, production information, financial information and, specifically, asset information such as name, location, history, usage, type, etc.

Unstructured data comes in the form of text, such as e-mails, maintenance and operator logs, social media data and other free-form data that is available today in limitless quantities. Most organisations are still trying to fathom how to utilise this data. To accommodate it, a text analytics program must be in place to make the content usable.

Streaming data is information that needs to be collected and analysed in real time. It includes information from sensors, satellites, drones and programmable logic controllers (PLCs), which are digital computers used for automation of electromechanical processes, such as control of machinery on factory assembly lines, amusement rides, or light fixtures. Examples of streaming data include telematic, measurement, and weather information. This format is currently gaining the most traction as the need for quick decision making grows.

Why use PdM?

There are a number of major reasons to employ PdM and there is a growing recognition that the ability to predict asset failure has great long term value to the organisation.

  • Optimise maintenance intervals
  • Minimise unplanned downtime
  • Carry out in-depth root cause analysis of failures
  • Enhance equipment and process diagnostics capabilities
  • Determine optimum corrective action procedures

Many Industries Benefit from PdM

For PdM to be of benefit to organisations, the assets must have information about them as well as around them. Here are a couple of examples from my own recent history. However any industry that has access to instrumented streaming data has the ability to deploy PdM.

Energy Provider

Keeping the lights on for an entire State in Australia is no small feat. Complex equipment, volatile demand, unpredictable weather, plus other factors can combine in unexpected ways to cause power outages. An energy provider used PdM to understand when and why outages occurred so it could take steps to prevent them. Streaming meter data helped the provider analyze enormous volumes of historical data to uncover usage patterns. PdM helped define the parameters of normal operation for any given time of day, day of the week, holiday, or season and detected anomalies that signal a potential failure.
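As a rough sketch of the "parameters of normal operation" idea, here is a small Python example. It assumes hourly load readings and a simple mean and standard deviation baseline per hour of the day. The function names, the threshold of three standard deviations and the readings are all my own invention for illustration; this is not the provider's actual model.

```python
from collections import defaultdict
from statistics import mean, stdev

def build_baseline(history):
    """history: iterable of (hour_of_day, load_mw) pairs.
    Returns {hour: (mean_load, stdev_load)} for hours with at least 2 readings."""
    by_hour = defaultdict(list)
    for hour, load in history:
        by_hour[hour].append(load)
    return {h: (mean(v), stdev(v)) for h, v in by_hour.items() if len(v) >= 2}

def is_anomalous(baseline, hour, load, k=3.0):
    """Flag a reading more than k standard deviations from that hour's normal load."""
    mu, sigma = baseline[hour]
    return abs(load - mu) > k * sigma

# Invented historical readings: evening peak (18:00) and overnight trough (03:00).
history = [
    (18, 510.0), (18, 495.0), (18, 505.0), (18, 490.0),
    (3, 210.0), (3, 200.0), (3, 190.0), (3, 205.0),
]
baseline = build_baseline(history)

print(is_anomalous(baseline, 18, 500.0))  # a typical evening load
print(is_anomalous(baseline, 3, 500.0))   # the same load at 03:00 is far from normal
```

The point of keeping a separate baseline per hour is that the same reading can be perfectly normal at one time of day and a warning sign at another. A real deployment would also key the baseline on day of the week, holiday and season, as described above.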

Historical patterns showed that multiple factors in combination increased the likelihood of an outage. When national events caused a spike in energy demand and certain turbines were nearing the end of their life cycle, there was a higher likelihood of an outage. This foresight helped the company take immediate action to avoid an imminent outage and schedule maintenance for long-term prevention. With PdM, this energy provider:

  • Reduced costs by up to 20 percent (based on similar previous cases) by avoiding the expensive process of reinitiating a power station after an outage
  • Predicted turbine failure 30 hours before occurrence, while previously only able to predict 30 minutes before failure
  • Saved approximately A$100,000 in combustion costs by preventing the malfunction of a turbine component
  • Increased the efficiency of maintenance schedules, costs and resources, resulting in fewer outages and higher customer satisfaction

Oil & Gas Exploration & Production Company

A large multinational company that explores and produces oil and gas conducts exploration in the Arctic Circle. Drilling locations are often remote, and landfall can be more than 100 miles away. Furthermore, the drilling season is short, typically between July and October.

The greatest dangers to people, platforms, and structures are collisions with, or being crushed by, ice floes, which are flat expanses of moving ice that can measure up to six miles across. Should a particularly thick and large ice floe threaten a rig, companies typically have less than 72 hours to evacuate personnel and flush all pipelines to protect the environment. Although most rigs and structures are designed to withstand some ice-floe collisions, oil producers often deploy tugboats and icebreakers to manage the ice and protect their rigs and platform investments. This is easily warranted: a single oil rig costs $350 million and has a life cycle that can span decades. To better safeguard its oil rigs, personnel, and resources, the company had to track the courses of thousands of moving potential hazards. The company utilised PdM by analysing the direction, speed, and size of floes using satellite imagery to detect, track, and forecast floe trajectories. In doing so, the company:

  • Saved roughly $300 million per season by reducing mobilisation costs associated with needing to drill a second well should the first well be damaged or evacuated
  • Saved $1 billion per production platform by easing design requirements, optimising rig placement, and improving ice management operations
  • Efficiently deployed icebreakers when and where they were needed most

Workforce Planning, Management & Logistics and PdM

The focus of predictive maintenance (PdM) is physical asset performance and failure and its associated processes. One key aspect that tends to be overlooked, but is critical to ensure PdM sustainability, is Human Resources. Every asset is managed, maintained, and run by an operator or employee. PdM enables organisations to ensure that they have the right employee or service contractor assigned to the right asset, at the right time, with the right skill set.

Many organisations already have enough information about employees either in their HR, ERP, or manufacturing databases. They just haven’t analysed the information in coordination with other data they may have access to.

Some typical types of operator information include

  • Name
  • Work duration
  • Previous asset experience
  • Training courses taken
  • Safety Courses
  • Licences
  • Previous asset failures and corrective actions taken

The benefits of using PdM in the Workforce Planning, Management & Logistics (WPML) process include the following:

  • Workforce optimisation: Accurately allocate employees’ time and tasks within a workgroup, minimising costly overtime
  • Best employee on task: Ensure that the right employee is performing the most valuable tasks
  • Training effectiveness: Know which training will benefit the employee and the organisation
  • Safety: Maintain high standards of safety in the plant
  • Reduction in management time: Fewer management hours needed to plan and supervise employees
  • A more satisfied, stable workforce: Make people feel they are contributing to the good of the organisation and feel productive.

The key for asset intensive companies is to ensure that their assets are safe, reliable, and available to support their business. Companies have found that simply adding more people or scheduling more maintenance sessions doesn’t produce cost-effective results. In order for organisations to effectively utilize predictive maintenance (PdM), they must understand the analytical process, how it works, its underlying techniques, and its integration with existing operational processes; otherwise, the work to incorporate PdM in your organisation will be for nothing.

The Analytical Process, the three “A” approach.

As organisations find themselves with more data, fewer resources to manage it, and a lack of knowledge about how to quickly gain insight from the data, the need for PdM becomes evident. The world is more instrumented and interconnected, which yields a large amount of potentially useful data. Analytics transforms data to quickly create actionable insights that help organisations run their businesses more cost effectively.

First A = Align

The align process is all about the data. You understand what data sources exist, where they are located, what additional data may be needed or can be acquired, and how the data is integrated or can be integrated into operational processes. With PdM, it doesn’t matter if your data is structured or unstructured, streaming or at rest. You just need to know which type it is so you can integrate and analyse the data appropriately.

Second A = Anticipate

In this phase, you leverage PdM to gain insights from your data. You can utilise several capabilities and technologies to analyse the data and predict outcomes:

1). Descriptive analytics provides simple summaries and observations about the data. Basic statistical analyses, for which most people utilise Microsoft Excel, are included in this category. For example, a manufacturing machine failed three times yesterday for a total downtime of one hour.
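To make the downtime example concrete, here is a minimal Python sketch of a descriptive summary. The failure log, the machine names and the `summarise_downtime` helper are hypothetical; in practice this is the kind of summary most people would produce in a spreadsheet.

```python
from collections import defaultdict

# Hypothetical failure log for one day: (machine name, downtime in minutes).
failure_log = [
    ("press-1", 15),
    ("press-1", 20),
    ("press-1", 25),
    ("lathe-2", 10),
]

def summarise_downtime(events):
    """Return {machine: (failure_count, total_downtime_minutes)}."""
    counts = defaultdict(int)
    minutes = defaultdict(int)
    for machine, downtime in events:
        counts[machine] += 1
        minutes[machine] += downtime
    return {m: (counts[m], minutes[m]) for m in counts}

summary = summarise_downtime(failure_log)
for machine, (count, total) in summary.items():
    print(f"{machine}: failed {count} times, {total} minutes of downtime")
```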

2). Data mining is the analysis of large quantities of data to extract previously unknown interesting patterns and dependencies. There are several key data mining techniques:

Anomaly detection: Discovers records and patterns that are outside the norm or unusual. This can also be called outlier, change, or deviation detection. For example, out of 100 components, components #23 and #47 have different sizes than the other 98.
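The component example can be sketched in a few lines of Python. This is one simple way to do it, assuming a median and median absolute deviation (MAD) rule with a cut-off of five MADs. The sizes are invented, and real tools offer many other detection methods.

```python
from statistics import median

def find_outliers(sizes, k=5.0):
    """Return 1-based component numbers whose size deviates from the median
    by more than k times the median absolute deviation (MAD)."""
    centre = median(sizes)
    mad = median(abs(s - centre) for s in sizes)
    if mad == 0:
        # At least half the values equal the median, so flag anything that differs.
        return [i for i, s in enumerate(sizes, start=1) if s != centre]
    return [i for i, s in enumerate(sizes, start=1) if abs(s - centre) > k * mad]

# Invented data: 100 components of roughly 10.0 mm with small manufacturing variation.
sizes = [10.0 + 0.01 * ((i % 5) - 2) for i in range(1, 101)]
sizes[22] = 10.6  # component #23 is too large
sizes[46] = 9.5   # component #47 is too small

outliers = find_outliers(sizes)
print(outliers)
```

The median is used rather than the mean because the outliers themselves would otherwise drag the "normal" value towards them.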

Association rules: Searches for relationships, dependencies, links, or sequences between variables in the data. For example, a drill tends to fail when the ambient temperature is greater than 100 degrees Fahrenheit, it’s 1700 hrs, and it’s been functioning for more than 15 hours.
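A rule like the drill example is usually judged by its support (how often the whole pattern occurs) and its confidence (how often the outcome follows when the conditions are present). Here is a minimal sketch with invented observations; the condition labels and the `rule_metrics` helper are my own, purely for illustration.

```python
def rule_metrics(records, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent.
    Each record is a set of condition labels observed together."""
    with_antecedent = [r for r in records if antecedent <= r]
    with_both = [r for r in with_antecedent if consequent <= r]
    support = len(with_both) / len(records)
    confidence = len(with_both) / len(with_antecedent) if with_antecedent else 0.0
    return support, confidence

# Invented drill observations.
hot, late, long_run, fail = "temp>100F", "hour=17", "runtime>15h", "fail"
records = [
    {hot, late, long_run, fail},
    {hot, late, long_run, fail},
    {hot, late, long_run, fail},
    {hot, late, long_run},
    {hot},
    {late, long_run},
    {long_run},
    set(),
]

support, confidence = rule_metrics(records, {hot, late, long_run}, {fail})
print(f"support={support:.3f}, confidence={confidence:.2f}")
```

With this made-up data, the three conditions occur together four times and the drill fails on three of those occasions.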

Clustering: Groups a set of objects in such a way that objects in the same group are more similar to each other than to those in other groups. For example, offshore oil platforms that are located in North America and Europe are grouped together because they tend to be surrounded by cooler air temperatures, while those in South America and Australia are grouped separately because they tend to be surrounded by warmer air temperatures.
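As an illustration only, the platform example can be reduced to a tiny one-dimensional k-means with two groups. The platform names and temperatures are invented, and `two_means` is a hypothetical helper, not a library function.

```python
def two_means(values, max_iters=50):
    """Split numbers into a cooler and a warmer group with 1-D k-means (k=2).
    Returns (labels, centres); label 0 is the cooler group."""
    centres = [min(values), max(values)]
    labels = []
    for _ in range(max_iters):
        # Assign each value to its nearest centre.
        labels = [0 if abs(v - centres[0]) <= abs(v - centres[1]) else 1
                  for v in values]
        # Move each centre to the mean of its members.
        new_centres = []
        for k in (0, 1):
            members = [v for v, lab in zip(values, labels) if lab == k]
            new_centres.append(sum(members) / len(members) if members else centres[k])
        if new_centres == centres:
            break
        centres = new_centres
    return labels, centres

# Invented average air temperatures (degrees Celsius) per platform.
platform_temps = {"NA-1": 6.0, "NA-2": 8.0, "EU-1": 7.0, "SA-1": 26.0, "AU-1": 28.0}

names = list(platform_temps)
labels, centres = two_means(list(platform_temps.values()))
groups = {
    "cooler": [n for n, lab in zip(names, labels) if lab == 0],
    "warmer": [n for n, lab in zip(names, labels) if lab == 1],
}
print(groups, centres)
```

Nobody tells the algorithm which platforms belong together; the groups emerge from the data, which is the difference between clustering and classification.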

Classification: Identifies which of a set of categories a new data point belongs to. For example, a turbine may be classified simply as “old” or “new.”

Regression: Estimates the relationships between variables and determines how much a variable changes when another variable is modified. For example, plant machinery tends to fail as the age of the asset increases.
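A minimal sketch of the idea, assuming a straight-line relationship fitted by ordinary least squares. The ages and failure counts are invented, and `fit_line` is a hypothetical helper.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Invented data: asset age in years versus failures recorded that year.
ages = [1, 2, 3, 4, 5]
failures = [1, 2, 2, 4, 5]

slope, intercept = fit_line(ages, failures)
print(f"about {slope:.2f} extra failures per additional year of age")
```

The slope is the useful number here: it estimates how much the failure count changes for each extra year of asset age.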

Text mining derives insights and identifies patterns from text data via natural language processing, which enables the understanding of and alignment between computer and human languages. For example, from maintenance logs, you may determine that the operator always cleans the gasket in the morning before starting, which leads to an extended asset life.

Machine learning enables the software to learn from the data. For example, when an earthmover fails, there are three or four factors that come into play. The next time those factors are evident, the software will predict that the earthmover will fail. You may also come across the term predictive analytics, a category of analytics that utilises machine learning and data mining techniques to predict future outcomes.

Simulation enables what-if scenarios for a specific asset or process. For example, you may want to know how running the production line for 24 continuous hours will impact the likelihood of failure.
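Here is a rough Monte Carlo sketch of that what-if question. It rests on a big assumption of my own: that the chance of failure in any given hour rises in a straight line the longer the line has been running. The probabilities and function names are made up for illustration.

```python
import random

def hourly_failure_prob(hour, base=0.002, growth=0.0005):
    """Assumed chance of failure during a given hour of continuous running."""
    return base + growth * hour

def simulate_run(hours, rng):
    """Return True if the line fails at some point during one simulated run."""
    for h in range(hours):
        if rng.random() < hourly_failure_prob(h):
            return True
    return False

def estimate_failure_likelihood(hours, trials=20000, seed=42):
    """Estimate the likelihood of failure by simulating many runs."""
    rng = random.Random(seed)
    failed = sum(simulate_run(hours, rng) for _ in range(trials))
    return failed / trials

def exact_failure_likelihood(hours):
    """Closed-form answer for this simple model, used as a sanity check."""
    survive = 1.0
    for h in range(hours):
        survive *= 1.0 - hourly_failure_prob(h)
    return 1.0 - survive

print(f"8 hours:  {estimate_failure_likelihood(8):.3f}")
print(f"24 hours: {estimate_failure_likelihood(24):.3f}")
```

For a model this simple the exact answer can be calculated directly, so the simulation is overkill. The approach earns its keep when the process has too many interacting parts for a formula.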

Prescriptive analytics goes beyond predicting future outcomes by also suggesting actions and showing the implications of each decision option. For example, based on the data, organisations can predict when a water pipe is likely to burst. Additionally, the municipality can have an automated decision where for certain pipes, certain valves must be replaced by a Level-3 technician. Such an output provides the operations professional with the predictive outcome, the action, and who needs to conduct the action. A decision management framework that aligns and optimises decisions based on analytics and organisational domain knowledge can automate prescriptive analytics.
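The water pipe example can be sketched as a simple decision rule that turns a prediction into an action and an assignee. The 0.7 threshold, the `WorkOrder` fields and the asset names are hypothetical; a real decision management framework would hold far richer rules.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class WorkOrder:
    asset_id: str
    action: str
    technician_level: int

def prescribe(asset_id: str, burst_probability: float, critical: bool) -> Optional[WorkOrder]:
    """Map a predicted burst probability to an action and who should carry it out."""
    if burst_probability >= 0.7 and critical:
        return WorkOrder(asset_id, "replace valve", technician_level=3)
    if burst_probability >= 0.7:
        return WorkOrder(asset_id, "inspect pipe", technician_level=2)
    return None  # no action needed yet

print(prescribe("pipe-17", 0.85, critical=True))
print(prescribe("pipe-02", 0.20, critical=True))
```

The output carries all three things mentioned above: the predicted outcome triggers the rule, and the work order names the action and the level of technician required.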

The Final A = Act

In the final A, you want to act at the point of impact, with confidence, on the insights that your analysis provided. This is typically done by using a variety of channels including e-mail, mobile, reports, dashboards, Microsoft Excel, and enterprise asset management (EAM) systems; essentially, however your organisation makes decisions within its operational processes. A prominent aspect of the act phase is being able to view the insights from the anticipate phase so employees can act on them. There are three common outputs:

Reports: Display results, usually in list format

Scorecards: Also known as balanced scorecards; automatically track the execution of staff activities and monitor the consequences arising from these actions; primarily utilised by management

Dashboards: Exhibit an organisation’s key performance indicators in a graphical format; primarily utilised by Senior Management

Organisations that utilise as many analytical capabilities of PdM as possible will be able to apply the most appropriate analytics to the data. Ultimately, those organisations will have better insights and make better decisions than organisations that don’t. It may be easier for you to leverage a single software vendor that can provide all of these capabilities and integrate all three phases in your operational processes so you can maximise PdM’s benefits. Here are a few names to be going on with: TROBEXIS, OPENTEXT, SAP, MAXIMO.

Everything has Big Data, Big Data Has everything

February 24, 2016

During the course of 2015 and on into 2016 I have been studying and researching as well as working. My study and research have covered various topics, including IoT, Sustainability, Environment, Food Security, Complex Decision Making, Supply Chain Management, Cyber Security, In-Memory Computing and Statistics, to name but a few (FutureLearn). Work wise, I have been a Consultant for Enterprise Solutions (Trobexis), Enterprise Asset Management and Enterprise Content Management (OpenText) in a variety of industries, including Oil & Gas, Utilities, Construction, Maintenance and even Banking. Here is the thing though: what do all this study, research and work have in common? Simple: BIG DATA! And it’s getting bigger every day.


If you want to be successful with big data, you need to begin by thinking about solving problems in new ways. Many of the previous limitations placed on problem-solving techniques were due to lack of data, limitations on storage or computing power, or high costs. The technology of big data is shattering some of the old concepts of what can’t be done and opening new possibilities for innovation across many industries (SAP S/4 HANA.)

The new reality is that an enormous amount of data is generated every day that may have some relevance to your business, if you can only tap into it. Most likely, you have invested lots of resources to manage and analyse the structured data that you need to understand your customers, manage your operations, and meet your financial obligations. However, today you find huge growth in a very different type of data. The type of information you get from social media, news or stock market data feeds, and log files, and spatial data from sensors, medical-device data, or GPS data, is constantly in motion.

These newer sources of data can add new insight to some very challenging questions because of the immediacy of the knowledge. Streaming data, data in motion, provides a way to understand an event at the moment it occurs.

Working with TROBEXIS over the last 12 months we have built a model which has big data constantly in motion, which is both structured and unstructured, coming from a variety of sources, in a variety of formats. The data is constantly in motion and events occur when you least expect them, giving rise to exceptions and causing problems to occur in the best laid plans. Some of the questions that we have found ourselves asking are:

– Do we have the right people, with the right skills, in the right place, at the right time, in the right phase of the project, with the right materials, for the right service, at the right cost, in the right numbers, against the right contract, for the right client?

– What if we do this job, in a different phase of the project, using a different contractor, with a different set of numbers, an alternative process, with a higher cost?

– Are we buying the right materials, through the right channel, at the right cost using the right process?

Data in Motion – a real world view.

To complete a credit card transaction, finalise a stock market transaction, or send an e-mail, data needs to be transported from one location to another. Data is at rest when it is stored in a database in your data center or in the cloud. In contrast, data is in motion when it is in transit from one resting location to another.

Companies that must process large amounts of data in near real time to gain business insights are likely orchestrating data while it is in motion. You need data in motion if you must react quickly to the current state of the data.

Data in motion and large volumes of data go hand in hand. Many real-world examples of continuous streams of large volumes of data are in use today:

✓ Sensors are connected to highly sensitive production equipment to monitor performance and alert technicians of any deviations from expected performance. The recorded data is continuously in motion to ensure that technicians receive information about potential faults with enough lead time to make a correction to the equipment and avoid potential harm to plant or process.

✓ Telecommunications equipment is used to monitor large volumes of communications data to ensure that service levels meet customer expectations.

✓ Point-of-sale data is analysed as it is created to try to influence customer decision making. Data is processed and analysed at the point of engagement, maybe in combination with location data or social media data.

✓ Messages, including details about financial payments or stock trades, are constantly exchanged between financial organisations. To ensure the security of these messages, standard protocols such as Advanced Message Queuing Protocol (AMQP) or IBM’s MQSeries are often used. Both of these messaging approaches embed security services within their frameworks.

✓ Information is collected from sensors in a security-sensitive area so that an organisation can differentiate between the movement of a harmless animal and a car moving rapidly toward a facility.

✓ Medical devices can provide huge amounts of detailed data about different aspects of a patient’s condition and match those results against critical conditions or other abnormal indicators.

The value of streaming data

Data in motion, often in the form of streaming data, is becoming increasingly important to companies needing to make decisions when speed is a critical factor. If you need to react quickly to a situation, having the capability to analyse data in real time may mean the difference between being able to change an outcome and being unable to prevent a poor result. The challenge with streaming data is to extract useful information as it is created and transported before it comes to a resting location. Streaming data can be of great value to your business if you can take advantage of that data when it is created or when it arrives at your business.

You need to process and analyse the streaming data in real time so that you can react to the current state of the data while in motion and before it is stored. You need to have some knowledge of the context of this data and how it relates to historical performance. Also you need to be able to integrate this information with traditional operational data. The key issue to remember is that you need to have a clear understanding of the nature of that streaming data and what results you are looking for. For example, if your company is a manufacturer, it will be important to use the data coming from sensors to monitor the purity of chemicals being mixed in the production process. This is a concrete reason to leverage the streaming data. However, in other situations, it may be possible to capture a lot of data, but no overriding business requirement exists. In other words, just because you can stream data doesn’t mean that you always should.
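To illustrate reacting to data while it is still in motion, here is a small Python sketch of the chemical purity example. It assumes readings arrive one at a time and raises an alert when the average of the last three falls below a limit. The `monitor` function, the window size, the 99.0 limit and the readings are all invented.

```python
from collections import deque

def monitor(readings, window=3, min_purity=99.0):
    """Yield (position, rolling_average) whenever the average of the last
    `window` readings drops below `min_purity`. Readings are consumed one at a
    time, so this works on a live feed as well as on a list."""
    recent = deque(maxlen=window)
    for position, value in enumerate(readings):
        recent.append(value)
        if len(recent) == window:
            average = sum(recent) / window
            if average < min_purity:
                yield position, average

# Invented purity readings (percent) arriving from a sensor.
sensor_feed = [99.5, 99.4, 99.6, 98.8, 98.4, 98.3, 99.9]

alerts = list(monitor(sensor_feed))
for position, average in alerts:
    print(f"reading {position}: rolling purity {average:.2f}% is below the limit")
```

Only the last few readings are held in memory, which is the essence of working on the stream rather than storing everything first and analysing it later.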

How can you use data to change your business? In the following use cases, look at how organisations in several industries are finding ways to gain value from data in motion. In some situations, these companies are able to take data they already have and begin to use it more effectively. In other situations, they are collecting data that they were not able to collect before. Sometimes organisations can collect much more of the data that they had been only collecting snapshots of in the past. These organisations are using streaming data to improve outcomes for customers, patients, city residents, or perhaps for mankind. Businesses are using streaming data to influence customer decision making at the point of sale.

Use Cases

2015 became the age of the data driven organisation. Thanks to the rise of easier collection mechanisms and the ubiquity of available data sources, savvy organisations are now implementing data in new and novel ways that address real business issues. Here are just a few worth a mention: –

Objective: Operational Analysis; Efficiency Enhancement & Risk Elimination

Organization: Siemens

Article title: Using Big Data and Analytics to Design a Successful Future

Overview: Siemens Mobility Data Services division is capitalising on big data and analytics to ensure transportation around the globe is fast, reliable and more energy efficient. Innovations include predicting failures, ensuring a seamless supply chain for parts to reduce or eliminate downtime, and using weather data to differentiate problems in the same train model in different regions.


Objective: Cost Analysis & Reduction, Operations Analysis, Risk Elimination

Organization: N/A

Article title: Could Big Data be the Solution to Clean Technology?

Overview: Big data analysis is extraordinarily useful and powerful in terms of identifying ways to reduce waste and improve processes. There is a massive amount of data available today that can be used to predict energy needs, improve energy production methods, build more energy-efficient structures, and even curb consumer energy consumption.


Objective: Efficiency Enhancement, Market Sensing & Predictive Analysis

Organization: IBM

Article title: Big Data & Analytics adds to the power of Renewable Energy

Overview: Sophisticated weather forecasting and analytics matures renewable energy market


Objective: Cost Analysis & Reduction, Risk Elimination

Organization: N/A

Article title: Use Big Data to Survive Market Volatility

Overview: For every single well they own, executives in the oil and gas industry must know full lifecycle costs, from exploration to abandonment. Data is the foundation of this understanding; even at a basic level it requires looking at everything from geophysical exploration data to real-time production data to refining operations data to trading data, and more.


Objective: Operations Analysis; Revenue Generation / Customer Acquisition

Organization: Panjiva

Article title: The global import-export data you need to take your business across borders

Overview: From import-export trends, to the tally of cargos for individual shippers or consignees, right down to the details of each transaction – you are just clicks away from information you need to gain market insights


Objective: Risk Elimination

Organization: N/A

Article title: How advanced analytics are redefining banking

Overview: Innovators are using big data and analytics to sharpen risk assessment and drive revenue.


Objective: Efficiency Enhancement

Organization: Nutanix

Article title: The Impact of Big Data on the Data Supply Chain

Overview: The impact of big data on the data supply chain (DSC) has increased exponentially with the proliferation of mobile, social and conventional Web computing. This proliferation presents multiple opportunities in addition to technological challenges for big data service providers.


Objective: Market Sensing / Predictive Analysis

Organization: DHL

Article title: “Big Data in Logistics: A DHL perspective on how to move beyond the hype”

Overview: Big Data is a relatively untapped asset that companies can exploit once they adopt a shift of mindset and apply the right drilling techniques.


Objective: Operations Analysis

Organization: N/A

Article title: Making Big Data Work: Supply Chain Management

Overview: The combination of large, fast-moving, and varied streams of big data and advanced tools and techniques such as geo analytics represents the next frontier of supply chain innovation. When they are guided by a clear understanding of the strategic priorities, market context, and competitive needs of a company, these approaches offer major new opportunities to enhance customer responsiveness, reduce inventory, lower costs, and improve agility.


Objective: Efficiency Enhancement

Organization: Transport for London

Article title: How Big Data And The Internet Of Things Improve Public Transport In London

Overview: Transport for London (TfL) oversees a network of buses, trains, taxis, roads, cycle paths, footpaths and even ferries which are used by millions every day. Running these vast networks, so integral to so many people’s lives in one of the world’s busiest cities, gives TfL access to huge amounts of data. This is collected through ticketing systems as well as sensors attached to vehicles and traffic signals, surveys and focus groups, and of course social media.



And one of my particular favourites being a lover of Starbucks……

Objective: New Product Creation

Organization: Starbucks

Article title: How Big Data Helps Chains Like Starbucks Pick Store Locations – An (Unsung) Key To Retail Success

Overview: The reality is that 94% of retail sales are still rung up in physical stores, and where merchants place those stores plays an outsized role in determining whether their chains fly or flop.


Riding the Waves of Big Data

February 22, 2016

The data management waves over the past fifty years have culminated in where we are today, “The Era of Big Data”. To really understand what big data is all about you have to understand how we have moved through the waves from one to another and that we have never thrown anything away as we have moved forward, in terms of tools, technology and practices to address different types of problems.

The First “Big” Wave – Manageable Data Structures

When computing moved into the commercial market in the late 1960s, data was stored in flat files that imposed no structure. When companies needed to get to a level of detailed understanding about customers, they had to apply brute-force methods, including very detailed programming models to create some value. Later in the 1970s, things changed with the invention of the relational data model and the relational database management system (RDBMS) that imposed structure and a method for improving performance. More importantly, the relational model added a level of abstraction (the structured query language [SQL], report generators, and data management tools) so that it was easier for programmers to satisfy the growing business demands to extract value from data.

The relational model offered an ecosystem of tools from a large number of emerging software companies. It filled a growing need to help companies better organise their data and be able to compare transactions from one geography to another. In addition, it helped business managers who wanted to be able to examine information such as inventory and compare it to customer order information for decision-making purposes. However a problem emerged from this exploding demand for answers: Storing this growing volume of data was expensive and accessing it was slow. To make matters worse, huge amounts of data duplication existed, and the actual business value of that data was hard to measure.

At this stage, an urgent need existed to find a new set of technologies to support the relational model. The Entity-Relationship (ER) model emerged, which added additional abstraction to increase the usability of the data. In this model, each item was defined independently of its use. Therefore, developers could create new relationships between data sources without complex programming. It was a huge advance at the time, and it enabled developers to push the boundaries of the technology and create more complex models requiring complex techniques for joining entities together. The market for relational databases exploded and remains vibrant today. It is especially important for transactional data management of highly structured data.

When the volume of data that organisations needed to manage grew out of control, the data warehouse provided a solution. The data warehouse enabled the IT organisation to select a subset of the data being stored so that it would be easier for the business to try to gain insights. The data warehouse was intended to help companies deal with increasingly large amounts of structured data that they needed to be able to analyse by reducing the volume of the data to something smaller and more focused on a particular area of the business. It filled the need to separate operational processing from decision support, for performance reasons. In addition, warehouses often store data from prior years for understanding organisational performance, identifying trends, and helping to expose patterns of behaviour. The data warehouse also provided an integrated source of information from across various data sources that could be used for analysis. Data warehouses were commercialised in the 1990s, and today, both content management systems and data warehouses are able to take advantage of improvements in scalability of hardware, virtualisation technologies, and the ability to create integrated hardware and software systems, also known as appliances.

Sometimes these data warehouses themselves were too complex and large and didn’t offer the speed and agility that the business required. The answer was a further refinement of the data being managed through data marts. These data marts were focused on specific business issues, were much more streamlined, and supported the business need for speedy queries better than the more massive data warehouses. Like any wave of data management, the data warehouse has evolved to support emerging technologies such as integrated systems and data appliances. Data warehouses and data marts solved many problems for companies needing a consistent way to manage massive transactional data (SAP).

Unfortunately, when it came to managing huge volumes of unstructured or semi-structured data, the warehouse was not able to evolve enough to meet changing demands. To complicate matters, data warehouses are typically fed in batch intervals, usually weekly or daily. This is fine for planning, financial reporting, and traditional marketing campaigns, but is too slow for increasingly real-time business and consumer environments.

How would companies be able to transform their traditional data management approaches to handle the expanding volume of unstructured data elements? The solution did not emerge overnight. As companies began to store unstructured data, vendors began to add capabilities such as BLOBs (binary large objects). In essence, an unstructured data element would be stored in a relational database as one contiguous chunk of data. This object could be labelled (for example, as a customer inquiry) but you couldn’t see what was inside that object. Clearly, this wasn’t going to solve changing customer or business needs.

Enter the object database management system (ODBMS). The object database stored the BLOB as an addressable set of pieces so that we could see what was in there. Unlike the BLOB, which was an independent unit appended to a traditional relational database, the object database provided a unified approach for dealing with unstructured data. Object databases include a programming language and a structure for the data elements so that it is easier to manipulate various data objects without programming and complex joins. The object databases introduced a new level of innovation that helped lead to the second wave of data management.

The Second “Big” Wave – Web, Unstructured and Content Management

It is no secret that most data available in the world today is unstructured. Paradoxically, companies have focused their investments in the systems with structured data that were most closely associated with revenue: line of business transactional systems. Enterprise Content Management systems (for example, OpenText) evolved in the 1980s to provide businesses with the capability to better manage unstructured data, mostly documents. In the 1990s with the rise of the web, organisations wanted to move beyond documents and store and manage web content, images, audio, and video. The market evolved from a set of disconnected solutions to a more unified model that brought together these elements into a platform that incorporated business process management, version control, information recognition, text management, and collaboration. This new generation of systems added metadata (information about the organisation and characteristics of the stored information). These solutions remain incredibly important for companies needing to manage all this data in a logical manner.

However, at the same time, a new generation of requirements has begun to emerge that drives us to the next wave. These new requirements have been driven, in large part, by a convergence of factors including the web, virtualisation, and cloud computing. In this new wave, organisations are beginning to understand that they need to manage a new generation of data sources with an unprecedented amount and variety of data that needs to be processed at an unheard-of speed (SAP S/4 HANA).

The Third “Big” Wave – Managing “BIG” Data

Big data is not really new; it is an evolution in the data management journey. As with other waves in data management, big data is built on top of the evolution of data management practices over the past five decades. What is new is that, for the first time, the cost of computing cycles and storage has reached a tipping point. Why is this important? Only a few years ago, organisations typically would compromise by storing snapshots or subsets of important information because the cost of storage and processing limitations prohibited them from storing everything they wanted to analyse.

In many situations, this compromise worked fine. For example, a manufacturing company might have collected machine data every two minutes to determine the health of systems. However, there could be situations where the snapshot would not contain information about a new type of defect and that might go unnoticed for months.
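To make the snapshot compromise concrete, here is a minimal Python sketch. The readings, the fault window and the function names are all hypothetical, invented purely for illustration. It simulates a machine that reports once per second and compares a two-minute snapshot with keeping every reading.

```python
# Hypothetical machine readings: one value per second for ten minutes.
# All numbers here are illustrative assumptions, not data from the article.
NORMAL, DEFECT = 1.0, 9.0

def reading_at(second: int) -> float:
    """Return the simulated sensor value; a fault occurs from 130s to 169s."""
    return DEFECT if 130 <= second < 170 else NORMAL

def defect_seconds(sample_every: int, duration: int = 600) -> list[int]:
    """Seconds at which a defect is observed when sampling at a fixed interval."""
    return [t for t in range(0, duration, sample_every) if reading_at(t) == DEFECT]

snapshot_hits = defect_seconds(sample_every=120)  # one snapshot every two minutes
full_hits = defect_seconds(sample_every=1)        # keep every reading

print("snapshots that saw the defect:", len(snapshot_hits))
print("full-stream readings that saw the defect:", len(full_hits))
```

In this toy case the short-lived fault falls entirely between two snapshots, so the sampled view never sees it, while the full stream does.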

With big data, it is now possible to virtualise data so that it can be stored efficiently and, utilising cloud-based storage, more cost-effectively as well.  In addition, improvements in network speed and reliability have removed other physical limitations of being able to manage massive amounts of data at an acceptable pace. Add to this the impact of changes in the price and sophistication of computer memory. With all these technology transitions, it is now possible to imagine ways that companies can leverage data that would have been inconceivable only five years ago.

Is there a Fourth “Big” Wave? – Evolution, IoT

Currently we are still at an early stage of leveraging huge volumes of data to gain a 360-degree view of the business and anticipate shifts and changes in customer expectations. The technologies required to get the answers the business needs are still isolated from each other. To get to the desired end state, the technologies from all three waves will have to come together. Big data is not simply about one tool or one technology. It is about how all these technologies come together to give the right insights, at the right time, based on the right data whether it is generated by people, machines, or the web.

Five Major Challenges that Big Data Presents

February 22, 2016

The phrase “big data” is thrown around in the analytics industry to mean many things. In essence, it refers not only to the massive, nearly inconceivable, amount of data that is available to us today but also to the fact that this data is rapidly changing. People create trillions of bytes of data per day. More than 95% of the world’s data has been created in the past five years alone, and this pace isn’t slowing. Web pages. Social media. Text messages. Instagram. Photos. There is an endless amount of information available at our fingertips, but how to harness it, make sense of it, and monetise it are huge challenges. So let’s narrow the challenges down a little and put some perspective on them. After fairly extensive reading and research, I believe there are five major challenges facing big data at the moment.

1). Volume

How to deal with the massive volumes of rapidly changing data coming from multiple source systems in a heterogeneous environment?

Technology is ever-changing. However, the one thing IT teams can count on is that the amount of data coming their way to manage will only continue to increase. The numbers can be staggering: in a report published last December, market research company IDC estimated that the total amount of data created or replicated worldwide in 2012 would add up to 2.8 zettabytes (ZB). For the uninitiated, a zettabyte is 1,000 exabytes, 1 million petabytes or 1 billion terabytes or, in more informal terms, lots and lots and lots of data. By 2020, IDC expects the annual data creation total to reach 40 ZB, which would amount to a 50-fold increase from where things stood at the start of 2010.
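The unit arithmetic above is easy to check. The short Python sketch below uses decimal (SI) units and only the figures quoted in the text. The start-of-2010 baseline is derived here from the 40 ZB forecast and the 50-fold factor and is not a number the source states directly.

```python
# Decimal (SI) storage units, expressed in bytes.
TB = 10**12
PB = 10**15
EB = 10**18
ZB = 10**21

# The conversions quoted in the text.
assert ZB == 1_000 * EB == 1_000_000 * PB == 1_000_000_000 * TB

estimate_2012_zb = 2.8   # IDC estimate quoted above
forecast_2020_zb = 40    # IDC forecast quoted above
growth_factor = 50       # "50-fold increase" quoted above

# Baseline implied by the forecast and the growth factor (derived, not stated in the source).
implied_2010_zb = forecast_2020_zb / growth_factor

print(f"2012 estimate in terabytes: {estimate_2012_zb * ZB / TB:,.0f}")
print(f"Implied start-of-2010 volume: {implied_2010_zb} ZB")
```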

Corporate data expansion often starts with higher and higher volumes of transaction data. However, in many organisations, unstructured and semi-structured information, the hallmark of big data environments, is taking things to a new level altogether. This type of data typically isn’t a good fit for relational databases and comes partly from external sources. Big data growth also adds to the data integration workload and to the challenges facing IT managers and their staff.

2). Scope

How do you determine the breadth, depth and span of data to be included in cleansing, conversion and migration efforts?

Big Data is changing the way we perceive our world. The impact big data has created and will continue to create can ripple through all facets of our life. Global data is on the rise: by 2020, we will have quadrupled the data we generate every day. This data will be generated through a wide array of sensors we are continuously incorporating in our lives. Data collection will be aided by what is today dubbed the “Internet of Things”. From smart bulbs to smart cars, everyday devices are generating more data than ever before. These smart devices are not only fitted with sensors to collect data all around them but are also connected to a grid that contains other devices. A Smart Home today consists of an all-encompassing architecture of devices that can interact with each other via the internet. Bulbs that dim automatically, aided by ambient light sensors, and cars that can glide through heavy traffic using proximity sensors are examples of sensor technology advancements that we have seen over the years.

Big Data is also changing things in the business world. Companies are using big data analysis to target marketing at very specific demographics. Focus groups are becoming increasingly redundant as analytics firms such as McKinsey run analyses on very large sample bases that have been made possible by advancements in Big Data. The potential value of global personal location data is estimated to be $700 billion to end users, and it can result in an up to 50% decrease in product development and assembly costs, according to a recent McKinsey report.

Big Data does not arise out of a vacuum: it is recorded from some data-generating source. For example, consider our ability to sense and observe the world around us, from the heart rate of an elderly citizen and the presence of toxins in the air we breathe to the planned Square Kilometre Array telescope, which will produce up to 1 million terabytes of raw data per day. Similarly, scientific experiments and simulations can easily produce petabytes of data today. Much of this data is of no interest, and it can be filtered and compressed by orders of magnitude. There is immense scope in Big Data, and huge scope for research and development.

3). 360 Degree View

With all the information that is now available, how do you achieve 360-degree views of all your customers and harness the kind of detailed information that is available to you, such as WHO they are, WHAT they are interested in, and HOW and WHEN they are going to purchase?

Every brand has its own version of the perfect customer. For most, these are the brand advocates that purchase regularly and frequently, take advantage of promotions and special offers, and engage across channels. In short, they’re loyal customers.

For many brands and retailers, these loyal customers make up a smaller percentage of their overall customer base than they would prefer. Most marketers know that one loyal customer can be worth five times a newly acquired customer, but it’s often easier to attract that first-time buyer with generic messaging and offers. In order to bring someone back time and time again, marketers must craft meaningful and relevant experiences for the individual. So how can brands go about building loyalty for their businesses? Let’s start with what we know.

We all know that customers aren’t one-dimensional. They have thousands of interests, rely on hundreds of sources to make decisions, turn to multiple devices throughout the day and are much more complex than any audience model gives them credit for. It’s imperative for marketers to make knowing their customers a top priority, but it isn’t easy.

In the past, knowing a customer meant one of two things: you knew nothing about them or you knew whatever they chose to tell you. Think about it. Often in the brick and mortar world you would deal with a first-time customer about whom you knew next to nothing: age, gender and the rest was based on assumptions. Over time, a regular customer might become better understood. You’d see them come in with children, or they’d make the same purchase regularly.

Now, thanks to technology, you can know an incredible amount about your customers; some might even say too much. Amassing data is one thing, but increasingly the challenge has become how to make sense of the data you already have to create a rich, accurate and actionable view of your customer: a 360-degree view.

Building and leveraging a 360-degree view of your customer is critical to helping you drive brand loyalty. Your customers need to be at the center of everything your business does. Their actions, intentions and preferences need to dictate your strategies, tactics and approach. They aren’t a faceless mass to whom something is done; they are a group of individuals that deserve personalised attention, offers and engagement. Your goal as a marketer is to make the marketing experience positive for your customers, which, in turn, will be positive for your business.

How can marketers go about establishing that 360-degree view and creating that positive customer relationship? It must be built on insights, but that doesn’t mean simply more data. In fact, more data can make it more difficult to come to a solid understanding of your customer. On top of that, it can also clearly raise privacy concerns. Marketers need to know how to make good inferences based on smart data.

Let’s look at some of the key types of data and how they can be used:

First and most valuable is an organisation’s own (“first-party”) data. This should be obvious, but the diversity of this data – past purchase history, loyalty program participation, etc. – can cause some potentially valuable information to be overlooked.

Next is the third-party data so readily available for purchase today. This can be useful for targeting new audiences or finding look-alikes of existing customers, but it often comes at a prohibitive price and with fewer guarantees of quality and freshness than first-party data.

Finally, there is real-time data about customers or prospects. While real-time data can, in theory, come from either a first- or third-party source, it functions differently than the historical data sources described above. Certainly it can be used to help shape a customer profile but in its raw form, in the moment, it acts as a targeting beacon for buying and personalising impressions in the perfect instant.

How can you as a marketer use these three data types to come up with the most accurate view of your customer?

First, you need to understand the scope and diversity of your own data. There is valuable information lurking in all kinds of enterprise systems: CRM, merchandising, loyalty, revenue management, inventory and logistics and more. Be prepared to use data from a wide array of platforms, channels and devices.

From there, you can start answering questions about your customers. What are they saying about my products? When are they thinking about purchasing a product from me (or a competitor)? How frequently have they done business with me? How much do they spend? The faster and more fully I can answer these questions, the more prepared I am to connect with my customer in the moment.

Integrating and analysing all of this information in a single manageable view is the next challenge for marketers, allowing them to recognise, rationalise and react to the fantastic complexity that exists within their data. Doing this is no small task, but a holistic view will enable marketers to differentiate meaningful insights from the noise.
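As a rough illustration of what a “single manageable view” might look like in practice, the Python sketch below merges records from three hypothetical systems (a CRM, a loyalty programme and an order history) into one profile per customer. The field names, the shared `customer_id` key and the sample rows are assumptions made for this example. Real integrations usually also need identity matching and data cleansing, which are left out here.

```python
from collections import defaultdict

# Hypothetical extracts from three systems, keyed by a shared customer_id.
crm = [{"customer_id": "c1", "name": "Ann"}, {"customer_id": "c2", "name": "Raj"}]
loyalty = [{"customer_id": "c1", "tier": "gold"}]
orders = [
    {"customer_id": "c1", "amount": 40.0},
    {"customer_id": "c1", "amount": 60.0},
    {"customer_id": "c2", "amount": 15.0},
]

def build_customer_view(crm, loyalty, orders):
    """Merge per-system records into one profile per customer."""
    view = defaultdict(
        lambda: {"name": None, "tier": None, "order_count": 0, "total_spend": 0.0}
    )
    for row in crm:
        view[row["customer_id"]]["name"] = row["name"]
    for row in loyalty:
        view[row["customer_id"]]["tier"] = row["tier"]
    for row in orders:
        profile = view[row["customer_id"]]
        profile["order_count"] += 1
        profile["total_spend"] += row["amount"]
    return dict(view)

profiles = build_customer_view(crm, loyalty, orders)
for customer_id, profile in sorted(profiles.items()):
    print(customer_id, profile)
```

Each profile then answers the questions posed earlier (how often a customer buys and how much they spend) from one place rather than three.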

The bottom line is that customers want brand experiences that are relevant and engaging, and offers that are custom-tailored for them, not for someone an awful lot like them. This is exactly what the 360-degree approach is designed to make possible: highly personalised, perfectly-timed offers that can be delivered efficiently and at scale.

In order to deliver those experiences, marketers must think about customer engagement from the 360-degree perspective, in which every touch-point informs the others. This cannot be achieved with a hodgepodge of disconnected data. It can only be achieved when all of the resources available, insights, technology and creative, are working together in perfect harmony. Over time, personalised customer experiences drive long-term loyalty for brands and retailers, ultimately creating even more of those “perfect” customers.

4). Data Integrity

How do you establish the desired integrity level across multiple functional teams and business processes? Is it merely about complete data (something in every required field), or does it include accurate data, that is, information within those fields that is both correct and logical? What about unstructured data?
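The difference between complete and accurate data can be shown with a small Python sketch. The record layout, the required fields and the plausibility rules below are hypothetical choices for illustration only. A real integrity programme would define its own rules for each field.

```python
from datetime import date

# Hypothetical customer records and rules, invented for illustration.
REQUIRED_FIELDS = ("customer_id", "email", "birth_date")

def is_complete(record: dict) -> bool:
    """Complete: every required field holds some non-empty value."""
    return all(record.get(field) not in (None, "") for field in REQUIRED_FIELDS)

def is_accurate(record: dict, today: date) -> bool:
    """Accurate (in this sketch): values are also plausible and logical."""
    if not is_complete(record):
        return False
    email_ok = record["email"].count("@") == 1
    birth_ok = date(1900, 1, 1) <= record["birth_date"] <= today
    return email_ok and birth_ok

today = date(2016, 2, 22)
records = [
    {"customer_id": 1, "email": "ann@example.com", "birth_date": date(1980, 5, 1)},
    {"customer_id": 2, "email": "not-an-email", "birth_date": date(2090, 1, 1)},
    {"customer_id": 3, "email": "", "birth_date": date(1975, 3, 9)},
]

for r in records:
    print(r["customer_id"], "complete:", is_complete(r), "accurate:", is_accurate(r, today))
```

The second record has something in every field and so passes a completeness check, yet it fails the accuracy check. That gap is the one the question above is pointing at.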

In the previous sections, we saw what Big Data means for the search and social marketer. Now, let’s spend some time on how we can make sure that the Big Data we have actually works for us.
Specifically, it’s my belief that there are four key factors determining our return from Big Data:

  • Is our Big Data accurate?
  • Is our Big Data secure?
  • Is our Big Data available at all times?
  • Does our Big Data scale?

Collating and creating Big, Valuable Data is a very expensive process and requires lots of investment and massive engineering resources to create a rigorous and high-quality set of data streams. Currently, 75% of Fortune 500 companies use cloud-based solutions, and IDC predicts that 80% of new commercial enterprise apps will be deployed on cloud platforms.
Given these numbers, let’s address the 4 factors above in a specific context, using a cloud-based digital marketing technology platform for your Big Data needs.
1. Ensure Your Data Is As Accurate As Possible
As a search marketer, you are among the most data-driven people on this planet. You make important decisions around keywords, pages, content, link building and social media activity based on the data you have on hand.
Before gaining insight and building a plan of action based on Big Data, it’s important to know that you can trust this data to make the right decisions. While this might seem like a daunting exercise, there are a few fairly achievable steps you can take.
a. Source data from trusted sources: trust matters. Be sure that the data you, or your technology vendor, collect is from reliable sources. For example, use backlink data from credible and reputable backlink providers such as Majestic SEO, which provides accurate and up-to-date information to help you manage successful backlinking campaigns.
b. Rely on data from partnerships: this is a corollary to the previous point. Without getting into the business and technical benefits of partnerships, I strongly recommend that you seek data acquired through partnerships with trusted data sources so that you have access to the latest and greatest data from these sources.
For example, if you need insight into Twitter activity, consider accessing the Twitter firehose directly from Twitter and/or partner with a company that already has a tie-up with Twitter. For Facebook insight, use data that was acquired through the Facebook Preferred Developer Program certification. You need not go out and seek these partnerships; just work with someone who already has these relationships.
c. Avoid anything black hat: build your SEO insights and programme with a white hat approach, and take a trusted, partnership-driven approach like the ones mentioned above.
If and when in doubt, ask around, look for validation that your technology provider has these partnerships, and check it on social media sources such as Facebook and Twitter. Do not be shy about getting more information from your technology vendors, and trace back to check that their origins do not tie back to black hat approaches.
2. Ensure Your Data Is Secure
You have, on your hands, unprecedented amounts of data on users and their behavior. You also have precious marketing data that has a direct impact on your business results.
With great amounts of knowledge comes even greater responsibility to guarantee the security of this data. Remember, you and your technology provider together are expected to be the trusted guardians of this data. In many geographies, you have a legal obligation to safeguard this data.
During my readings and research, I have learned a lot about the right way to securely store data. Here are a few best practices that, hopefully, your technology provider follows:

  1. ISO/IEC 27001 standard compliance for greater data protection
  2. Government level encryption
  3. Flexible password policies
  4. Adherence to European Union and Swiss Safe Harbor guidelines to meet stringent data privacy laws

3. Ensure Your Data Is Available
Having access to the most valuable Big Data is great, but it is not enough: you need to have access to this data at all times. Another valuable lesson I learned is how to deliver high availability and site performance to customers.
To achieve this, implement industry-leading IT infrastructure, including multiple layers of replication in data centres for a high level of redundancy and failover reliability, and data centre backup facilities in separate locations for disaster recovery assurance and peace of mind. If you work with a marketing technology provider, be sure to ask them what measures they take to guarantee data availability at all times.
4. Ensure Your Data Scales With User Growth
This is the part that deals with the Big in Big Data. Earlier in the post we saw how zettabytes of data already exist and more data is being generated at an astounding pace by billions of Internet users and transactions every day. For you to understand these users and transactions, your technology should have the ability to process such huge volumes of data across channels and keep up with the growth of the Internet.
Scale should matter even if you are not a huge enterprise. Think about this analogy: even if you are searching for a simple recipe on Google, Google has to parse through huge volumes of data to serve the right results.
Similarly, your technology should be able to track billions of keywords and pages, large volumes of location-specific data and social signals to give you the right analytics. Be sure the technology you rely on is made for scale.

5). Governance Process

How do you establish Procedures across people, processes and technology to maintain a desired state of Governance? Who sets the rules? Are you adding a level of Administration here?

Big Data has many definitions, but all of them come down to these main points: It consists of a high volume of material, it comes from many different sources, it comes in a variety of formats, it arrives at high speeds and it requires a combination of analytical or other actions to be performed against it. But at heart, it’s still some form of data or content, though slightly different than what has been seen in the past at most organizations. However, because it is a form of data or content, business-critical big data needs to be included in Data Governance processes.

Do remember that not all data must be governed. Only data that is of critical importance to an organisation’s success (involved in decision making, for example) should be governed. For most companies, that translates to about 25% to 30% of all the data that is captured.

What Governance best practices apply to big data? The same best practices that apply to standard data governance programmes, enlarged to handle the particular aspects of Big Data:

  1. Take an enterprise approach to big data governance. All Data Governance Programmes should start with a strategic view and be implemented iteratively. Governance of big data is no different.
  2. Balance the people, processes and technologies involved in big data applications to ensure that they’re aligned with the rest of the data governance programme. Big data is just another part of enterprise data governance, not a separate programme.
  3. Appoint Business Data Stewards for the areas of your company that are using big data and ensure that they receive the same training as other data stewards do, with any additional focus on big data deemed necessary because of the technology in use at your organisation.
  4. Include the Value of Big Data Governance in the business case for overall data governance.
  5. Ensure that the metrics that measure the success of your data governance programme include those related to big data management capabilities.
  6. Offer incentives for participating in the data governance programme to all parts of the business using big data to encourage full participation from those areas.
  7. Create data governance policies and standards that include sets of big data and the associated metadata, or that are specific to them, depending on the situation.

It has to be said that there are many more challenges in Big Data, but in my reading and research these are the five that come up every time and are referenced by nearly everyone venturing into this world. If you have encountered any other aspects, please let me know and perhaps together we can formulate a global checklist for all to follow.

The Business Intelligence Puzzle

February 21, 2016

The Data Warehousing Institute, a provider of education and training in data warehousing and BI, defines Business Intelligence as: “The processes, technologies, and tools needed to turn data into information, information into knowledge, and knowledge into plans that drive profitable business action”. Business intelligence has also been described as an “active, model-based, and prospective approach to discover and explain hidden decision-relevant aspects in large amount of business data to better inform business decision process” (KMBI, 2005).

Defining Business Intelligence has not been a straightforward task, given the multifaceted nature of the data processing techniques involved and the managerial output expected. One definition describes it as “Business information and business analyses within the context of key business processes that lead to decisions and actions and that result in improved business performance” (Williams & Williams, 2007). Another holds that BI is “both a process and a product. The process is composed of methods that organisations use to develop useful information, or intelligence, that can help organisations survive and thrive in the global economy. The product is information that will allow organisations to predict the behaviour of their competitors, suppliers, customers, technologies, acquisitions, markets, products and services and the general business environment” with a degree of certainty (Vedder, et al., 1999). “Business intelligence is neither a product nor a system; it is an architecture and a collection of integrated operational as well as decision-support applications and databases that provide the business community easy access to business data” (Moss & Atre, 2003). “Business Intelligence environment is a quality information in well-designed data stores, coupled with business-friendly software tools that provide knowledge workers timely access, effective analysis and intuitive presentation of the right information, enabling them to take the right actions or make the right decisions” (Popovic, et al., 2012).

The aim of a business intelligence solution is to collect data from heterogeneous sources and to maintain and organise knowledge. Analytical tools present this information to users in order to support the decision-making process within the organisation. The objective is to improve the quality and timeliness of inputs to the decision process. BI systems have the potential to maximise the use of information by improving a company’s capacity to structure a large volume of information and make it accessible, thereby creating competitive advantage, what Davenport calls “competing on analytics” (Davenport, 2005). Business intelligence refers to computer-based techniques used in identifying, digging out, and analysing business data such as sales revenue by product and customer, and/or the associated costs and incomes.

Business Intelligence encompasses data warehousing, business analytic tools and content/knowledge management. BI systems comprise specialised tools for data analysis, query, and reporting, such as online analytical processing (OLAP) systems and dashboards, that support organisational decision making, which in turn enhances the performance of a range of business processes. General functions of BI technologies are reporting, online analytical processing (OLAP), analytics, business performance management, benchmarking, text mining, data mining and predictive analysis:

Online Analytical Processing (OLAP) includes software enabling multi-dimensional views of enterprise information, which is consolidated and processed from raw data and allows both current and historical analysis.
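To give a feel for what a “multi-dimensional view” means, the Python sketch below performs a simple roll-up over a tiny set of sales facts. The facts, the dimension names and the `roll_up` helper are hypothetical and stand in for what a real OLAP engine does at far larger scale and with pre-computed aggregates.

```python
from collections import defaultdict

# Hypothetical sales facts: (year, region, product, revenue).
facts = [
    (2015, "North", "Widget", 100),
    (2015, "South", "Widget", 80),
    (2015, "North", "Gadget", 50),
    (2016, "North", "Widget", 120),
    (2016, "South", "Gadget", 70),
]
DIMENSIONS = ("year", "region", "product")

def roll_up(facts, by):
    """Sum revenue over the chosen dimensions, aggregating away the rest."""
    idx = [DIMENSIONS.index(name) for name in by]
    totals = defaultdict(int)
    for *dims, revenue in facts:
        totals[tuple(dims[i] for i in idx)] += revenue
    return dict(totals)

by_year = roll_up(facts, by=("year",))
by_region_product = roll_up(facts, by=("region", "product"))
print(by_year)
print(by_region_product)
```

The same raw facts yield a historical view (revenue by year) or a cross-sectional view (revenue by region and product) simply by choosing different dimensions to keep.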

Analytics helps make predictions and forecasting of trends and relies heavily on statistical and quantitative analysis to enable decision making concerned with future predictions of business performance.

Business Performance Management tools are concerned with setting appropriate metrics and monitoring organisational performance against these identifiers.

Benchmarking tools provide organisational and performance metrics which help compare enterprise performance with benchmark data, such as the industry average.

Text Mining software helps analyse unstructured data, such as written material in natural language, in order to draw conclusions for decision making.

Data Mining involves large-scale data analysis based on such techniques as cluster analysis and anomaly and dependency discovery, in order to establish previously unknown patterns in business performance or to make predictions of future trends.

Predictive Analysis deals with analysing data, turning it into actionable insights and helping to anticipate business change with effective forecasting.

Specialised IT infrastructure such as data warehouses, data marts, and extract, transform and load (ETL) tools is necessary for the deployment and effective use of BI systems. Business intelligence systems are widely adopted in organisations to provide enhanced analytical capabilities on the data stored in Enterprise Resource Planning (ERP) and other systems. ERP systems are commercial software packages with seamless integration of all the information flowing through an organisation: financial and accounting information, human resource information, supply chain information and customer information (Davenport, 1998). ERP systems provide a single vision of data throughout the enterprise and focus on management of financial, product, human capital, procurement and other transactional data. BI initiatives in conjunction with ERP systems dramatically increase the value derived from enterprise data.

While many organisations have an information strategy in operation, an effective business intelligence strategy is only as good as the process of accumulating and processing corporate information. Intelligence can be categorised in a hierarchy, which is useful for understanding its formation and application. The traditional intelligence hierarchy, shown below, comprises data, information, knowledge, expertise and, ultimately, wisdom.


Data is associated with discrete elements – raw facts and figures; once the data is patterned in some form and is contextualised, it becomes information. Information combined with insights and experience becomes knowledge. Knowledge in a specialised area becomes expertise. Expertise morphs into the ultimate state of wisdom after many years of experience and lessons learned (Liebowitz, 2006). For small businesses, processing data is a manageable task. However, for organisations that collect and process data from millions of customer interactions per day, identifying trends in customer behaviour and accurately forecasting sales targets are more challenging.

Use of data depends on the context of each use as it pertains to the exploitation of information. At a high level it can be categorised into operational data use and strategic data use. Both are valuable for any business: without operational use the business could not survive, but it is up to the information consumer to derive the value from a strategic perspective. Some of the strategic uses of information through BI applications include:

Customer Analytics aims to maximise the value of each customer and enhance the customer’s experience.

Human Capital Productivity Analytics provides insight into how to streamline and optimise human resources within the organisation.

Business Productivity Analytics refers to the process of differentiating between forecasted and actual figures for the input/output conversion ratio of the enterprise.

Sales Channel Analytics aims to optimise the effectiveness of various sales channels and provides valuable insight into the metrics of sales and conversion rates.

Supply Chain Analytics offers the ability to sense and respond to business changes in order to optimise an organisation’s supply chain planning and execution capabilities, alleviating the limitations of the historical supply chain models and algorithms.

Behaviour Analytics helps predict trends and identify patterns in specific kinds of behaviours.
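As one small example of the analytics listed above, Business Productivity Analytics compares forecasted and actual input/output conversion ratios. A minimal Python sketch of that comparison follows. The figures and function names are hypothetical, and a signed percentage is only one of several ways to express the difference.

```python
def conversion_ratio(outputs: float, inputs: float) -> float:
    """Output produced per unit of input."""
    return outputs / inputs

def variance_pct(forecast: float, actual: float) -> float:
    """Signed percentage by which the actual ratio differs from the forecast."""
    return (actual - forecast) / forecast * 100

# Hypothetical figures for one period.
forecast_ratio = conversion_ratio(outputs=500, inputs=100)
actual_ratio = conversion_ratio(outputs=450, inputs=100)

print(f"variance: {variance_pct(forecast_ratio, actual_ratio):.1f}%")
```

A negative result means the enterprise converted inputs into outputs less efficiently than forecast.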

Organisations accumulate, process and store data continuously and rely on their information processing capabilities for staying ahead of competitors. According to the PricewaterhouseCoopers Global Data Management Survey of 2001, the companies that manage their data as a strategic resource and invest in its quality are far ahead of their competitors in profitability and reputation. A proper Business Intelligence system implemented for an organisation could lead to benefits such as increased profitability, decreased cost, improved customer relationship management and decreased risk (Loshin, 2003). Within the context of business processes, BI enables business analysis using business information that leads to decisions and actions and results in improved business performance. BI investments are wasted unless they are connected to specific business goals (Williams & Williams, 2007).

As the competitive value of BI systems and analytics solutions is being recognised in the industry, many organisations are initiating BI to improve their competitiveness, but not as quickly as they could.