The Four Faces of Automation in Industry 4.0

The Four Faces of Automation: Tactical, Point Solution, RPA, and Cognitive. Icons credits: flaticon.com/authors/eucalyp, flaticon.com/authors/catkuro, flaticon.com/authors/wanicon and flaticon.com/authors/freepik

For the last thirteen years, I've been developing software solutions ranging from desktop apps, Android apps, and websites to, more recently, bots. It was when I started working with Robotic Process Automation (RPA) and Machine Learning (ML) that I first heard of the concept of Industry 4.0:
Industry 4.0 refers to a new phase in the Industrial Revolution that focuses heavily on interconnectivity, automation, machine learning, and real-time data. [1]
This topic is extremely hot in many corporations, since they are constantly looking for ways to profit from these emerging technologies. In 2019, I read a great article by Accenture about the skills that will dominate the market in the foreseeable future. They are called DARQ skills: Distributed ledger technology, Artificial intelligence, extended Reality, and Quantum computing [2].


Artificial Intelligence is the most evolved of all these skills, since there has been significant development across industries, e.g., medicine, video games, retail, outsourcing, and management.

Nowadays, industries everywhere, especially in Canada, the USA, and Europe, are embracing automation. Nevertheless, it is becoming laborious to choose the right tech and implement it adequately, because new software solutions appear in the market almost monthly. This is something that I have learned the hard way over the last three years.

I have been involved in many talks with managers and team leaders from different industries who, just because the words RPA, Chatbot, or ML are so popular, wanted to jump on the bandwagon and implement anything. Later, the results were not as good as expected: few or no FTE (Full-Time Equivalent) savings, or significant delays that resulted in a loss of trust in the tools or their capabilities. This situation was perfectly described by Albert Einstein almost a century ago:


There was one time when I was leading a new team and we invested several months refactoring a "simple" bot to recover our client's trust in RPAs. Did we succeed? Yes, we did. The RPA is still working today, but it cost a considerable amount of time and money.

All these entertaining experiences made my team and me more careful. We now hold more in-depth talks with clients and, most importantly, we don't rush to conclusions or invest resources in techs that won't give the expected results. Nowadays, we combine multiple kinds of automation, building "blocks of automation" to create solutions that stay reliable, robust, and stable in the long term.

This article is useful if you're a decision-maker (CTO, manager, team leader, tech lead, etc.), if you want to start or promote automation in your organization, and, most importantly, if you want to avoid many of the missteps that we faced. Let's start with the four kinds of automation that I have identified:

1. Tactical

Credits: Zapier.

You might have heard the word Macro, which is the perfect example of these basic automations. They are excellent for small processes because they rarely need any special software or permissions, since they tend to be embedded in common apps like Microsoft Excel.

Let's take an example of a situation that you might have seen in the corporate world:
"A colleague shared with you a magical Excel file that can connect to the internet, download data from www.myreports.com, beautify your spreadsheet, and upload everything to SAP in one shot." Does it sound familiar?
It probably does and, in the beginning, you were surprised by that magical spreadsheet. In fact, these tools tend to have their own recorders, which take care of much of that extra work. Certainly, they were an inspiration for more advanced solutions like RPAs.


The pieces of code generated by these recorders tend to be highly accurate since they are software-specific, like an Excel Macro or an SAP Script. They access clear-cut controls by their IDs and execute commands that only these tools understand.
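To make this more concrete, here is a minimal Python sketch of what a recorded macro roughly boils down to: a list of steps, each naming a control ID, a command, and an optional value. Everything in it is hypothetical (the control IDs, the two commands, and the `replay` helper), and the "application" is simulated with a plain dictionary. A real recorder emits tool-specific code such as VBA or an SAP script instead.

```python
# Hypothetical recording: each step targets one control by its exact ID.
RECORDED_STEPS = [
    ("txtReportUrl", "set_text", "www.myreports.com"),
    ("btnDownload", "click", None),
    ("txtSheetStyle", "set_text", "corporate"),
    ("btnUpload", "click", None),
]


def replay(steps, controls):
    """Replay recorded steps against a dict that stands in for the app's controls."""
    log = []
    for control_id, command, value in steps:
        if control_id not in controls:
            # Exact IDs make recordings accurate, but the replay stops
            # as soon as the target application no longer has that control.
            raise KeyError(f"control not found: {control_id}")
        if command == "set_text":
            controls[control_id] = value
        elif command == "click":
            controls[control_id] = "clicked"
        else:
            raise ValueError(f"unknown command: {command}")
        log.append(f"{command} -> {control_id}")
    return log


# Simulated application state (assumption: four controls with known IDs).
app_controls = {"txtReportUrl": "", "btnDownload": "", "txtSheetStyle": "", "btnUpload": ""}
steps_done = replay(RECORDED_STEPS, app_controls)
print(steps_done)
```

The sketch also hints at the main weakness of these recordings: they only understand the IDs and commands of one specific tool.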

Also, I can suggest some extra tools to start with these small automations:
  • Selenium is one of the most powerful and respected frameworks for web automation [4].
  • Pulover's Macro Creator is a Windows automation tool that has its own basic recorder and scheduler, and it is entirely graphical, like the more advanced solutions.
  • AutoIt is a BASIC-like scripting language designed for automating the Windows GUI and general scripting. The language is similar to Macros.
However, they are not exactly the best solutions when you're focused on compliance, performance, scalability, traceability, and collaboration.

In my experience, they can be used by almost anyone, even the less tech-savvy. You just need to remember: "They are perfect for small automations". If you forget, you can end up with an Excel spreadsheet running for 50+ minutes without knowing what is going on or whether it's working at all.

2. Point Solutions

Credits: DomaHub

These are the most common automations all over the world since everyone who has a PC or smart device benefits from them.

They are custom-made and are commonly called apps or websites. They are experts in their fields because they are case-specific and solve a particular task like booking a hotel or paying for your dinner with your phone. You might know some already: Uber, Airbnb, PayPal, Booking.com, Google Pay, SAP, etc.

This kind of automation has existed since the beginning of the Information Age. Many organizations have developed their own apps, CRMs, ERPs, websites, etc., to help their employees be more efficient, reduce FTEs, optimize costs, sell new services, focus on their core business, and so on. Since they are specialized, they tend to be compliant, flexible, efficient, scalable, powerful, and capable of high collaboration.

Until some years ago, relatively advanced programming skills were required to build any of them. Currently, many organizations have developed their own Low-Code Development Platforms (LCDP) that can empower you to create your own apps or websites without major coding efforts:
These are just some examples of the many in the market, which I described in more depth in another article.

However, if you need any integration with other services or tools, this might involve some extra effort since they were rarely designed for that purpose [5]. Also, you may need special access to certain APIs (if any are available) or to rewrite entire apps to reach your goals, which certainly entails considerable costs.

3. Robotic Process Automation

Credits: Quipu Blog

This is unquestionably one of the hottest topics in all kinds of organizations around the globe [6].

RPAs are technologies for automating repetitive tasks, almost perfectly mimicking human interactions with a PC, such as filling in Excel forms, answering emails, downloading files, etc. They often come in two flavors:
  • Attended. They work in collaboration with humans, leveling up their work.
  • Unattended. They mostly work touchless, performing complex tasks in the background.
And are known by CUSEC:


Also, since in most cases, they imitate humans, they tend to follow all accepted rules and norms from their organizations.

Now, you might be thinking, so, what can they really do? Technically, they can:
  • Access and perform a click on, send keystrokes to, or get info from almost any software like Word, Excel, Adobe Reader, SAP, etc.
  • Collect info from or insert data into any website like Facebook, eBay, Amazon, Google, Outlook.com, etc., even internal ones made for Internet Explorer.
  • Connect to VDIs (Citrix) and perform multiple actions there.
  • Schedule and run tasks at certain times/days.
  • Be monitored, reporting whether they succeeded or not and the possible root cause of any failure.
They can do many other things too, and all of this without major changes to the organization's infrastructure or rebuilding internal tools (it depends on the chosen solution). Normally, you only need a server or cloud instance for hosting the Command Center and PCs for developing and running the bots as its clients.
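The scheduling and monitoring capabilities can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the schedule format, the `run_bot` and `is_due` helpers, and the two example tasks are hypothetical, and commercial RPA platforms offer these features through their own Command Center rather than through code like this.

```python
import datetime


def is_due(schedule, now):
    """Hypothetical schedule format: allowed weekdays (0 = Monday) and an hour of day."""
    return now.weekday() in schedule["weekdays"] and now.hour == schedule["hour"]


def run_bot(name, task, now=None):
    """Run one bot task and return a monitoring record with status and root cause."""
    started = now or datetime.datetime.now()
    record = {"bot": name, "started": started.isoformat(),
              "status": "succeeded", "root_cause": None}
    try:
        task()
    except Exception as exc:
        # Keep the exception type and message as the "possible root cause".
        record["status"] = "failed"
        record["root_cause"] = f"{type(exc).__name__}: {exc}"
    return record


def download_report():
    pass  # placeholder for the real steps of the bot


def upload_to_erp():
    raise TimeoutError("ERP login window did not appear")


schedule = {"weekdays": {0, 1, 2, 3, 4}, "hour": 6}   # working days at 06:00
now = datetime.datetime(2020, 3, 2, 6, 0)             # a Monday at 06:00

records = []
if is_due(schedule, now):
    for bot_name, task in [("download_report", download_report),
                           ("upload_to_erp", upload_to_erp)]:
        records.append(run_bot(bot_name, task, now))

for record in records:
    print(record["bot"], record["status"], record["root_cause"])
```

The point of the monitoring record is that a failed run leaves a trace someone can act on, instead of a bot that silently stopped.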

How do they work? That is another question you might have. Generally, they rely on the following techniques:
  • Control recognition. Every control you can see on a PC has an ID, and the bots can access it. This is the most efficient technique when it is available.
  • Computer Vision. They use advanced AI techniques to "see" the screen and identify objects by shapes and patterns. This is the second-best one, but it is unsupported by most RPAs (in my experience, UiPath is the only one that offers it).
  • OCR. They scan the screen and identify objects or text using Google OCR, Microsoft OCR, etc. This feature is also used for advanced options like getting info from PDFs or images, e.g., acquiring the values from bills.
  • Image Recognition. They identify an image, as you would, by its shape and location on specific controls and windows. Nevertheless, if anything changes, the step is going to fail.
  • Coordinates. They go to a specific location (x, y) on the screen and perform the requested action (click, send keystrokes, etc.) on clear-cut controls and windows. As with the previous one, if something is not in exactly the same place, it is going to fail.
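One way to picture how these techniques fit together is as a fallback chain: try the most reliable technique first and only fall back to the fragile ones when needed. The Python sketch below is my own simplification, not how any specific RPA product is implemented. The screen is simulated with a dictionary, the function names are hypothetical, and only three of the five techniques are included (Computer Vision and Image Recognition would slot in as extra entries in `STRATEGIES`).

```python
def find_by_control_id(screen, target):
    # Control recognition: look the control up by its ID.
    return screen.get("controls", {}).get(target.get("control_id"))


def find_by_ocr(screen, target):
    # Assumption: an OCR engine already produced {text: (x, y)} for this screen.
    return screen.get("ocr_text", {}).get(target.get("label"))


def find_by_coordinates(screen, target):
    # Last resort: a fixed position, which breaks if the layout moves.
    return target.get("fallback_xy")


# Ordered from most to least reliable, following the list above.
STRATEGIES = [
    ("control recognition", find_by_control_id),
    ("OCR", find_by_ocr),
    ("coordinates", find_by_coordinates),
]


def locate(screen, target):
    """Return (technique name, position) from the first technique that finds the target."""
    for name, strategy in STRATEGIES:
        position = strategy(screen, target)
        if position is not None:
            return name, position
    raise LookupError(f"could not locate {target.get('label')!r}")


target = {"control_id": "btnSubmit", "label": "Submit", "fallback_xy": (100, 300)}
native_screen = {"controls": {"btnSubmit": (120, 340)}, "ocr_text": {"Submit": (118, 338)}}
remote_screen = {"controls": {}, "ocr_text": {"Submit": (118, 338)}}  # e.g. a VDI: no IDs exposed

print(locate(native_screen, target))
print(locate(remote_screen, target))
```

Recording which technique was finally used is worth the extra line of logging: a bot that keeps falling back to coordinates is a bot that will probably break soon.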
Further, like Macros, they commonly have their own recorders (desktop, web, Citrix, etc.) that save a lot of the work of creating bots from scratch. The results can later be modified to make them more generic and reusable.

UiPath Recorders.

Furthermore, some of these platforms (Automation Anywhere and UiPath) have their own Bot Stores where you can search for and download ready-made bots that could fit your business needs; if they don't, you will probably be able to adapt them. It is similar to getting regular apps from the App Store or Play Store.
A crucial point: any task you are going to automate using RPAs must be rule-based. If the process is unstandardized or the rules are unclear, then re-engineering is the first step toward successful automation.
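A quick test of whether a process is rule-based is to try writing its rules down as a table. The Python sketch below does that for a made-up invoice process. The rules, thresholds, and action names are all hypothetical and only meant to show the shape: explicit conditions, a fixed order, and a clear route to a human when no rule applies.

```python
# Hypothetical decision table: evaluated top to bottom, first match wins.
RULES = [
    ("missing purchase order", lambda inv: not inv.get("po_number"), "return_to_supplier"),
    ("small amount", lambda inv: inv["amount"] <= 1000, "auto_approve"),
    ("medium amount", lambda inv: inv["amount"] <= 50000, "send_to_manager"),
]


def decide(invoice):
    """Return (action, reason) for one invoice, or hand it to a human."""
    for reason, condition, action in RULES:
        if condition(invoice):
            return action, reason
    # Anything the rules do not cover is an exception for a person to handle.
    return "manual_review", "no rule matched"


print(decide({"po_number": "PO-1", "amount": 250}))
print(decide({"po_number": "", "amount": 250}))
print(decide({"po_number": "PO-3", "amount": 90000}))
```

If you cannot fill in a table like this with the process owners, the process probably needs re-engineering before any bot is built.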
Now, let's analyze a final example to better understand their capabilities:


The process described above is an excellent candidate for RPA because all actions are known and repetitive. There might certainly be some exceptions, but they can be added later to improve its efficiency and performance.

From my experience, I can recommend the following RPA platforms for Windows:
Or, if you want to try some free and open-source RPAs: OpenRPA or UI.Vision.

4. Cognitive

Credits: Exponea

This one is the last frontier of automation and perhaps the most lucrative and costly, because data is the gold of the 21st century [7].

In the news and everywhere, you can hear words like ML, Deep Learning, Computer Vision, and many more. Most of these techs tend to follow three approaches to do their work:
  • Classification. To assign things to categories, such as deciding whether an animal is a dog or a cat based on millions of images.
  • Regression. To predict a value, such as when the next recession could happen, based on historical data.
  • Clustering. To group things by shared features, such as people who are more likely to buy Coke.
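The three approaches can be shown on toy data with nothing but the Python standard library. This is a deliberately tiny sketch: the algorithms chosen (1-nearest-neighbour, least-squares line fitting, and a one-dimensional k-means) are just simple representatives of each family, and all the numbers are invented. Real projects would use a proper ML framework and far more data.

```python
from statistics import mean


def classify(sample, labeled_examples):
    """Classification: copy the label of the closest known example (1-nearest-neighbour)."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    _, label = min(labeled_examples, key=lambda ex: squared_distance(ex[0], sample))
    return label


def fit_line(xs, ys):
    """Regression: ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return slope, y_bar - slope * x_bar


def cluster_1d(values, centers, rounds=10):
    """Clustering: a tiny k-means on a single feature, starting from the given centers."""
    for _ in range(rounds):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Toy data: (weight in kg, ear length in cm) -> animal
animals = [((30, 10), "dog"), ((25, 12), "dog"), ((4, 6), "cat"), ((5, 5), "cat")]
print(classify((6, 6), animals))

# Toy data: month number -> sales
slope, intercept = fit_line([1, 2, 3, 4], [10, 12, 14, 16])
print(slope * 5 + intercept)  # forecast for month 5

# Toy data: Cokes bought per week by seven customers
centers, groups = cluster_1d([0, 1, 1, 2, 8, 9, 10], centers=[0, 10])
print(centers, groups)
```

Note the difference in what each one needs: classification and regression learn from examples with known answers, while clustering only receives the raw values and finds the groups itself.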
However, as a businessperson, you might be wondering: "What are some business cases where I can use these things? All of them sound very sophisticated to me." You might be right. That's why I am going to give you some ideas that could inspire you:
  • Product recommendation. Recommend products based on purchase history.
  • Chatbots. Simulate conversation with humans and recommend actions to take.
  • Fraud detection. Detect fraudulent credit card transactions.
  • Sales spike detection. Detect spikes and changes in products.
  • Customer segmentation. Identify groups of customers with similar profiles.
  • Price prediction. Predict taxi fares based on distance traveled, for instance.
  • Sentiment analysis. Analyze the feelings of customer reviews.
  • Image classification. Classify images (e.g., cat vs dog).
  • Sales forecasting. Forecast future sales for products.
  • Object detection. Recognize objects in an image.
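To give a feeling for the first idea in the list, here is a minimal Python sketch of product recommendation based on purchase history. It simply counts which products were bought together ("customers who bought X also bought Y"). The orders and function names are invented for illustration, and production recommender systems use much richer models than co-purchase counts.

```python
from collections import Counter
from itertools import combinations


def build_co_purchases(orders):
    """Count how often each ordered pair of products appears in the same order."""
    pairs = Counter()
    for order in orders:
        for a, b in combinations(sorted(set(order)), 2):
            pairs[(a, b)] += 1
            pairs[(b, a)] += 1
    return pairs


def recommend(product, co_purchases, top_n=2):
    """Return the products most often bought together with `product`."""
    scores = Counter({other: count
                      for (first, other), count in co_purchases.items()
                      if first == product})
    return [item for item, _ in scores.most_common(top_n)]


# Invented purchase history
orders = [
    ["laptop", "mouse", "bag"],
    ["laptop", "mouse"],
    ["laptop", "mouse", "dock"],
    ["laptop", "bag"],
    ["mouse", "pad"],
]

co_purchases = build_co_purchases(orders)
print(recommend("laptop", co_purchases))
```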
And the list keeps growing since they are embedded in almost everything you know like chat apps, video games, targeted ads, or even very popular recommender systems like YouTube:


In the beginning, these systems were extremely complex, since just writing the right algorithms was a herculean task and you needed highly specialized teams. Nowadays, you have excellent frameworks to start your cognitive automations, e.g.:
Among many others. Most of these solutions tend to be supported by massive organizations, e.g., Microsoft, Facebook, Google, etc.

However, this area still tends to be very tricky for multiple reasons:
  • Lots of data is required. The right amount is hard to predict. Nevertheless, expect to need no fewer than hundreds of thousands of records for optimum results.
  • Choosing the correct features is not easy.
  • Labeling the data properly is a key element and not necessarily easy.
  • Cleaning data might be crucial in many situations, since numerous datasets have missing information that you would need to "guess" based on similar cases.
  • Choosing the right algorithms or adapting them takes time.
  • Significant resources are required for training the models (they can be moved to a cloud for a price).
  • Only a few parts of traditional Agile methodologies can be applied. Usually, these projects work with Waterfall models [8].
And many more. I can add that the latest research in these areas is mostly about identifying the best ways to label data properly and automatically, which is a critical point [9].
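The "guessing" of missing values based on similar cases mentioned above is usually called imputation. A minimal Python sketch, assuming that "similar" means "in the same group" and that the group mean is a good enough guess, could look like this. The field names and data are invented, and the helper assumes that at least one record has a known value.

```python
from statistics import mean


def impute_by_group(records, field, group_field):
    """Fill missing `field` values with the mean of the records in the same group."""
    known = {}
    for r in records:
        if r[field] is not None:
            known.setdefault(r[group_field], []).append(r[field])
    # Fallback for groups without any known value: the overall mean.
    overall = mean(v for values in known.values() for v in values)

    filled = []
    for r in records:
        row = dict(r)  # copy, so the original records stay untouched
        if row[field] is None:
            group = row[group_field]
            row[field] = mean(known[group]) if group in known else overall
            row[field + "_imputed"] = True  # keep track of what was guessed
        filled.append(row)
    return filled


# Invented customer data with two missing values
customers = [
    {"segment": "retail", "spend": 100},
    {"segment": "retail", "spend": None},
    {"segment": "retail", "spend": 140},
    {"segment": "corporate", "spend": 900},
    {"segment": "public", "spend": None},
]

for row in impute_by_group(customers, "spend", "segment"):
    print(row)
```

Flagging the imputed rows matters: a model trained on guessed values should not treat them as if they had been measured.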

Furthermore, you might be wondering whether there is any way to start without complex coding efforts, building sophisticated teams, or hiring many specialists, because that sounds costly. The answer is: yes, there is. In 2020, you have very powerful ML LCDPs that can empower you to start straight away:
My experience with ML LCDPs has been very successful and rewarding, especially since a couple of years ago I took part as a Software Architect in the development of a complex web recommender system powered by Azure ML Studio.

Practically, the recommender part of the system was a No-Code solution, built from blocks with some predefined algorithms and the power of Azure. This solution saved us hours of extra work and complex server configurations. The following picture shows an example of how this No-Code solution looked:

Credits: Microsoft.

Now, after all you have read, you might have many thoughts to process. However, I can give you some final tips and tricks to consider:
  • Understand the kind of automation that you need. Probably, in the first steps, you only need a Macro or a Point Solution. You need to talk in depth, take hundreds of notes, research what can and cannot be done at this point, and understand the requirements clearly in order to choose wisely and save money and time.
  • Run frequent reskilling programs, including for management. Generally, most of the effort goes to the IT departments or into hiring experienced people, which is correct. Nevertheless, I have heard many stories from multiple companies where the management was unaware of the limitations of certain tools for various reasons. They expected to save hundreds of FTEs pretty quickly, but that never happened, because they did not know how to choose the right automations or combine them to reach their goals.
  • Automate in small pieces. Don't try to automate huge processes at once; it will become a nightmare that is hard to maintain in the future.
  • Learn and apply good practices. A good trick to succeed in any automation initiative is to start reading about and implementing good practices. In the beginning, they are not so easy since they might involve structural changes in your organization. However, in the long term, they bring excellent benefits. If you would like to know one, I recommend creating your own Center of Excellence (CoE).
  • Video "document" everything. Many times, when a new process is automated, people just write complex BRDs. Nevertheless, in my experience, videos are far more effective, especially if a key person leaves the organization and someone else needs to take care of the project.
  • Think big. If you're building non-cognitive automations, especially RPAs, think about how you can start storing key information and so create your future base for Cognitive automations. In my experience, most RPA architects or devs rarely think about the future of their creations. They tend to focus only on imitating human behaviors to save the required FTEs.
  • Combine multiple kinds of automation. Some people might tell you that RPAs can lead you to Cognitive, that Point Solutions are useless, or that Macros are old-fashioned. However, we learned as a team that the best way is to combine all of them, and that RPAs make a first-rate orchestrator.
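The last two tips can be illustrated together with a small Python sketch: an orchestrator that chains "blocks of automation" of different kinds and stores the input and output of every step as a JSON line, so the data is there when you later want to train something on it. The three blocks are hypothetical stand-ins (a macro-like cleanup, a threshold pretending to be an ML model, and a fake ERP call), and in a real bot the log lines would be appended to a file or database instead of a list.

```python
import json


def clean_amount(payload):
    # Tactical block: the kind of cleanup a macro would do.
    amount = float(payload["amount"].replace(",", "").strip())
    return {**payload, "amount": amount}


def score_priority(payload):
    # Cognitive block: a fixed threshold standing in for a trained model.
    priority = "high" if payload["amount"] > 1000 else "normal"
    return {**payload, "priority": priority}


def post_to_erp(payload):
    # Point Solution block: stand-in for a call to an existing system.
    return {**payload, "erp_status": "posted"}


def run_pipeline(blocks, payload):
    """Run the blocks in order and keep one JSON log line per step."""
    log_lines = []
    for kind, name, block in blocks:
        before = dict(payload)
        payload = block(payload)
        log_lines.append(json.dumps(
            {"kind": kind, "block": name, "input": before, "output": payload},
            sort_keys=True))
    return payload, log_lines


blocks = [
    ("tactical", "clean_amount", clean_amount),
    ("cognitive", "score_priority", score_priority),
    ("point_solution", "post_to_erp", post_to_erp),
]

result, log_lines = run_pipeline(blocks, {"invoice": "INV-001", "amount": " 1,250.00 "})
print(result)
for line in log_lines:
    print(line)
```

Because every block has the same shape (payload in, payload out), you can replace the threshold with a real model later without touching the rest of the chain.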


If you have reached this point, you have read some parts of my journey in automation, a journey that is still ongoing and where new adventures are coming. More than anything, this article is about sharing lessons, tips, tricks, and ideas that I can recommend to you so that you can profit from Smart Automations in Industry 4.0.

If you have any questions, tips, or tricks, feel free to leave them in the comments. If you want more in-depth advice on how to get started, you can reach me at:

federiconavarrete.com

Acknowledgment.

I'd like to thank my two friends who helped improve my article with their time, suggestions, and constructive feedback.

Glossary.

  • Agile methodology is a type of project management process, mainly used for software development, where demands and solutions evolve through the collaborative effort of self-organizing and cross-functional teams and their customers. (Zenkit)
  • API (Application Programming Interface) is a set of functions and procedures allowing the creation of applications that access the features or data of an operating system, application, or other services. (Oxford)
  • Artificial Intelligence is an area of computer science that emphasizes the creation of intelligent machines that work and react like humans. (Techopedia)
  • Automation is the technology by which a process or procedure is performed with minimal human assistance. (Mikell Groover)
  • Computer Vision is a field of computer science that works on enabling computers to see, identify, and process images in the same way that human vision does, and then provide appropriate output. (Techopedia)
  • Deep Learning is a machine learning technique that teaches computers to do what comes naturally to humans: learn by example. (MATLAB)
  • A feature is a measurable property of the object you're trying to analyze. (DataRobot)
  • Labeled data is a group of samples that have been tagged with one or more labels. (Wikipedia)
  • A Low-Code Development Platform is software that provides an environment programmers use to create application software through graphical user interfaces and configuration instead of traditional computer programming. (Wikipedia)
  • Machine Learning is a way to get predictive insights from data to make repetitive decisions. (Carolyn from Google Cloud)
  • Macro is a saved sequence of commands or keyboard strokes that can be stored and then recalled with a single command or keyboard stroke. (Margaret Rouse)
  • OCR (Optical Character Recognition) is the use of technology to distinguish printed or handwritten text characters inside digital images of physical documents, such as scanned paper documents. (TechTarget)
  • Recommender System is a system that generates meaningful recommendations to a collection of users for items or products that might interest them. (Prem Melville, Vikas Sindhwani)
  • Robotic Process Automation is the technology that allows anyone today to configure computer software, or a “robot” to emulate and integrate the actions of a human interacting within digital systems to execute a business process. (UiPath)
  • The waterfall model is a classical model used in the system development life cycle to create a system with a linear and sequential approach. (The Economic Times)

Bibliography.
