From the past to the present
Nuno Rosa, Closer Consulting
For many of us, Data Science is a hot topic of the last decade, but its foundations are far older than one might imagine. Ancient history offers many simple examples of principles that match what we would now describe as the first steps of a Data Science problem.
In ancient Greece, one of the greatest interests of researchers was the study of celestial bodies, for a simple reason: it was commonly believed that the configuration of stars and planets influenced life on Earth. Knowing their positions in advance, and relating those positions to specific events, would therefore be helpful for society. To make these predictions, the Greeks needed two things: data and a model.
They had astronomical observations at their disposal and developed a quite general model, based on epicycles, to extrapolate those observations into the future. An epicycle model describes the orbit of a heavenly body as a superposition of circles, with each circle’s center moving along the circumference of a bigger circle. Given enough data, an epicycle model can be fitted to any periodic orbit, so it describes real and unrealistic orbits equally well, and the quality of its predictions depends on the quality of the data.
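In modern terms, a superposition of uniformly rotating circles is a complex Fourier series, which is one way to see why epicycles can fit any periodic orbit. The sketch below is an illustration under that assumption, not a reconstruction of the Greek method: the function names and the synthetic orbit are hypothetical, and the fit is a plain discrete Fourier transform that keeps the largest circles.

```python
import cmath


def fit_epicycles(samples, n_circles):
    """Fit epicycles to one period of an orbit sampled at equal time steps.

    samples: complex positions x + iy.
    Returns the n_circles largest (frequency, coefficient) pairs; the modulus
    of a coefficient is the circle's radius, its argument the starting angle.
    """
    n = len(samples)
    circles = []
    for k in range(n):
        # Signed frequency, so a circle may turn in either direction.
        freq = k if k <= n // 2 else k - n
        coeff = sum(
            z * cmath.exp(-2j * cmath.pi * freq * j / n)
            for j, z in enumerate(samples)
        ) / n
        circles.append((freq, coeff))
    circles.sort(key=lambda fc: abs(fc[1]), reverse=True)
    return circles[:n_circles]


def predict(circles, t):
    """Position at time t, measured in orbital periods (t > 1 is the future)."""
    return sum(c * cmath.exp(2j * cmath.pi * f * t) for f, c in circles)


def orbit(t):
    """Synthetic 'observed' orbit (made up): a big circle plus a small fast one."""
    return 3 * cmath.exp(2j * cmath.pi * t) + 0.5 * cmath.exp(2j * cmath.pi * 5 * t)


samples = [orbit(j / 64) for j in range(64)]
circles = fit_epicycles(samples, n_circles=2)
error = abs(predict(circles, 1.3) - orbit(1.3))
```

Nothing in the fit knows whether `samples` came from a physically possible orbit: any closed path sampled this way would be matched just as readily, given enough circles.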
Sounds quite familiar and current, doesn’t it? The model makes very few assumptions about the underlying laws of nature, so that it can capture any kind of regularity in the input data. Setting aside technology and the scale of the data, this is much like applying a segmentation algorithm to the clients of a bank.
Using this example, we can see Data Science as the beginning of a more complex scientific process, rather than its refinement. We start with the very first step, going from raw observations to a description of patterns; then, if we can connect those patterns to more fundamental principles, we can start to discover new ones.
Physics is a clear example of a fundamental field that started as a data-driven science and evolved into a model-driven one. Why? Because we began to understand the laws of nature. The same is happening outside the fundamental sciences, as we gather data on many open business problems and improve data storage and quality. So, Data Science is a new term, but its main principles were already among us.
Since its recent rise to prominence, Data Science has boosted the collection, monitoring, and management of performance measures that improve decision-making across the organization. It merges the analytical, rigorous methods of the fundamental sciences with computer science and with business knowledge built up over years of field experience. The following examples show how Data Science has helped companies, even without knowing which fundamental law of nature governs them:
- Transportation companies use statistical data to map customer journeys, manage unexpected circumstances, and provide people with personalized transport details.
- Customer insights data helps companies build more personal, one-to-one marketing campaigns.
- Historical client data in an insurance company, combined with an appropriate model, can help solve the churn problem and increase profits.
In the end, the final goal is always the same: achieve the best financial results within the constraints of each organization.
Nowadays, this is a simplistic vision. We don’t live in isolated trees, but in a forest where each tree contributes to a greater good, which means that decisions should not be driven by financial motivations alone.
Our society currently faces many challenges around sustainability, the climate crisis, and the social changes that followed the pandemic, so interest in this topic has grown. Progressive companies are using the sustainability framework of the triple bottom line (profit, people, and the planet) rather than focusing only on generating profit, the standard ‘bottom line’. The most astonishing aspect is that many are doing so while increasing profits, which means that there is a sweet spot in the relationship between social responsibility and financial performance at which financial performance is maximised. Once again, the solution to such a non-trivial problem should be Data Science. But how?
Like every problem involving data, it should start with a data ingestion methodology capable of collecting and merging the different sources of relevant data: consumption data, safety data, emissions data, survey data, status reports (unstructured), RSS feeds (unstructured) and financial reports. Once the data is mature, the company should advance to “for-purpose” solutions built on established AI techniques.
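As a rough illustration of the merging step, the sketch below combines a few structured sources into one row per site and month. Everything in it is an assumption made for the example: the function name, the `site` and `month` keys, the field names, and the sample records are invented, and the unstructured sources (status reports, RSS feeds) would need their own extraction step first.

```python
from collections import defaultdict


def merge_sources(**sources):
    """Merge per-source records into one row per (site, month) key.

    Each source is a list of dicts carrying 'site' and 'month' plus its own
    fields. Fields are prefixed with the source name to avoid collisions.
    """
    merged = defaultdict(dict)
    for name, records in sources.items():
        for record in records:
            key = (record["site"], record["month"])
            for field, value in record.items():
                if field not in ("site", "month"):
                    merged[key][f"{name}.{field}"] = value
    return dict(merged)


# Made-up sample records, for illustration only.
consumption = [{"site": "plant_a", "month": "2021-01", "kwh": 1200.0}]
emissions = [{"site": "plant_a", "month": "2021-01", "co2_tonnes": 3.4}]
safety = [{"site": "plant_a", "month": "2021-02", "incidents": 1}]

table = merge_sources(consumption=consumption, emissions=emissions, safety=safety)
# table[("plant_a", "2021-01")]
# → {"consumption.kwh": 1200.0, "emissions.co2_tonnes": 3.4}
```

Rows that appear in only one source, such as the February safety record here, are kept rather than dropped, so gaps in coverage stay visible when judging whether the data is mature.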
A straightforward example is the application of a full Data Science cycle to a manufacturing company. With the proper data and model, this company would be able to reduce its use of resources, making it less vulnerable to price and supply volatility, while also anticipating supply, demand, and price.
Moreover, from the point of view of labour practices, this company, with a fully mature Data Science department working alongside Human Resources, would be able to identify future risks and reduce attrition, using data from employee surveys, salary conditions, compensation and past resignations.
With Data Science, the sweet spot between profit, people and the environment can be found. Any company that reaches this level of progress will enter an ecosystem in which all stakeholders reward companies based on environmental, social, and economic metrics.