supply chain case study dataset

Supply Chain Analysis: Data Analysis Case Study Using Excel

A supply chain analysis project using Microsoft Excel and Microsoft Power BI.

Supply chain analytics plays a crucial role in driving data-driven decision-making across industries, including manufacturing, retail, healthcare, and logistics. In this report, I present an analysis of a Fashion and Beauty startup’s supply chain data, specifically focused on the movement of makeup products. By collecting, analyzing, and interpreting this dataset, I aim to gain insights that can inform strategic decisions and optimize supply chain operations.

Supply Chain Analysis

Dataset Overview

The dataset comprises various features related to the fashion and makeup product supply chain. These features provide valuable information for understanding the flow of products from suppliers to customers. The dataset includes the following features:

  • Product Type
  • Availability
  • Number of products sold
  • Revenue generated
  • Customer demographics
  • Stock levels
  • Order quantities
  • Shipping times
  • Shipping carriers
  • Shipping costs
  • Supplier name
  • Production volumes
  • Manufacturing lead time
  • Manufacturing costs
  • Inspection results
  • Defect rates
  • Transportation modes

Supply Chain Analysis using Excel

The supply chain represents a network of interconnected processes involved in the production and delivery of goods to customers. Analyzing the various components of a supply chain is crucial for identifying opportunities to enhance its effectiveness and generate greater value for customers. In this report, I conduct a supply chain analysis using Excel and share insights for optimizing supply chain operations.

Dataset Source

To perform a supply chain analysis on this company, it is crucial to gather data related to various stages of the supply chain. This includes information on sourcing, manufacturing, transportation, inventory management, sales, and customer demographics.

Fortunately, I have come across this excellent dataset that provides comprehensive data about the supply chain of a Fashion and Beauty startup.

In the following section, I will guide you through the process I use to conduct a supply chain analysis using Excel.

Dataset from Github

Dataset Analysis using Excel

The first step is to import the file into Excel.

After importing the file, the next task is to remove any duplicate entries. To do this, I selected all the data by clicking on cell A1 and pressing Ctrl+A, then opened the “Data” tab and chose “Remove Duplicates” under the “Data Tools” section. Once duplicates were removed, I filtered the data to identify any spelling errors or unwanted names, using the “Filter” tool under the “Sort & Filter” section. While filtering, I noticed that some entries in the “Customer Demographics” column are labelled “Unknown” instead of a specific demographic such as Male, Female or Non-Binary. Since I won’t be using this column in the analysis, I left those entries as they are. At this stage, the data cleaning process is complete, and I moved on to the next steps.
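For readers who prefer a scripted check, the same cleaning steps can be sketched in Python using only the standard library. This is an illustrative equivalent, not the method used in this report: the sample rows, the `SKU` values and the helper names are all made up for the example.

```python
import csv
import io

# Hypothetical sample mirroring a few dataset columns; the values are made up.
SAMPLE_CSV = """Product type,SKU,Customer demographics,Revenue generated
skincare,SKU0,Female,100.0
haircare,SKU1,Unknown,80.0
skincare,SKU0,Female,100.0
cosmetics,SKU2,Male,60.0
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def drop_duplicates(rows):
    """Keep the first occurrence of each fully identical row, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row.items())
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def distinct_values(rows, column):
    """List the distinct values in a column, like opening an Excel filter drop-down."""
    return sorted({row[column] for row in rows})


rows = drop_duplicates(load_rows(SAMPLE_CSV))
print(len(rows))
print(distinct_values(rows, "Customer demographics"))
```

Listing the distinct values of a column is a quick way to spot labels such as “Unknown” or misspelled categories before building any pivot table.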

To facilitate further analysis, I converted the data into a table by selecting the entire data range (Ctrl+A) and pressing Ctrl+T. The next step is to create a pivot table. I navigated to the “Insert” tab and selected the “PivotTable” tool, chose the location for the pivot table (such as a new worksheet) and specified the data range (Table 1 or the relevant table name). Once the pivot table was created, I formatted the numbers to display as whole numbers, with no decimal places, for better readability.
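Conceptually, a pivot table that sums one column per category is a group-by aggregation. A minimal Python sketch of that idea is shown here, with made-up rows and a hypothetical helper name (`pivot_sum`); it also rounds totals to whole numbers, like the number formatting step described.

```python
from collections import defaultdict

# Hypothetical rows; the column names follow the dataset's feature list.
SAMPLE_ROWS = [
    {"Product type": "skincare", "Revenue generated": 120.4},
    {"Product type": "haircare", "Revenue generated": 50.2},
    {"Product type": "skincare", "Revenue generated": 79.3},
]


def pivot_sum(rows, key, value):
    """Sum `value` for each distinct `key`, rounded to whole numbers."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row[value]
    return {group: round(total) for group, total in totals.items()}


print(pivot_sum(SAMPLE_ROWS, "Product type", "Revenue generated"))
```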

Now, I’m ready to analyze the data and provide insights based on the following objectives.

A. I began the analysis of the supply chain by examining the correlation between product prices and the revenue they generate.

The company generates higher revenue from skincare products, and there is a positive relationship between the price of skincare products and the revenue they generate.
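One way to put a number on such a relationship is the Pearson correlation coefficient, which is what Excel's CORREL function computes. The sketch below is a plain Python version with a hypothetical helper name and made-up price and revenue values; it is not the calculation behind the chart in this report.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


# Made-up prices and revenues for illustration only.
prices = [10.0, 20.0, 30.0, 40.0]
revenues = [150.0, 310.0, 420.0, 640.0]
print(round(pearson(prices, revenues), 3))
```

A value close to +1 indicates that revenue tends to rise with price, a value close to -1 indicates the opposite, and a value near 0 indicates no linear relationship.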

B. Now, I examined the sales based on different product types.

Skincare products account for 45% of the company’s business, while haircare products contribute to 29.5% of the revenue, and cosmetics make up 25.5% of the total sales.
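The percentages are each product type's revenue divided by total revenue. As a small sketch of that arithmetic in Python, the totals below are illustrative numbers chosen only so that the shares match the figures quoted; they are not the dataset's actual revenue values.

```python
def revenue_shares(revenue_by_type):
    """Return each category's share of total revenue, in percent (one decimal)."""
    total = sum(revenue_by_type.values())
    return {name: round(100 * amount / total, 1)
            for name, amount in revenue_by_type.items()}


# Illustrative totals only, not taken from the dataset.
revenue = {"skincare": 450.0, "haircare": 295.0, "cosmetics": 255.0}
print(revenue_shares(revenue))
```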

C. Now, I analyzed the total revenue generated from shipping carriers.

The company utilizes three shipping carriers for transportation, and among them, Carrier B contributes significantly to the company’s revenue generation.

D. Now, I examined the average lead time and average manufacturing costs for all products of the company.

Analyzing SKUs

In the dataset, there is a column labelled SKUs, which stands for Stock Keeping Units. SKUs are unique codes assigned to products to facilitate inventory management and tracking. They identify and differentiate individual items, ensuring accurate monitoring of stock levels. For instance, in a toy store with a diverse range of toys, each toy would be assigned its own SKU, an internal code used by the store, enabling efficient inventory control.

A. Now, I analyzed the revenue generated by each SKU.

Another column in the dataset is labelled Stock levels. Stock levels indicate the quantity of products available in a store or business’s inventory at a given time.

B. Now, I examined the stock levels of each SKU.

C. Now, I analyzed the order quantity of each SKU.

Cost Analysis

A. Now, I analyzed the shipping costs associated with different carriers.

The above visualizations revealed that Carrier B contributes significantly to the company’s revenue. However, it is also the most expensive carrier among the three options.

B. Now, I examined the distribution of costs by transportation mode.

The company allocates a larger portion of its transportation expenses to the Road and Rail modes for the transportation of goods.

Analyzing Defect Rate

The defect rate in the supply chain refers to the percentage of products that are found to have issues or are damaged after being shipped.

A. Now, I analyzed the average defect rate across all product types.

Haircare products have a higher defect rate compared to other product types.

B. Now, I examined the defect rates based on the mode of transportation.

Road transportation exhibits a higher defect rate, while Air transportation has the lowest. This shows how Excel can be used to analyze a company’s supply chain.

I successfully created a dashboard using Power BI that presents the insights mentioned earlier. The dashboard effectively highlights essential revenue-related information, including:

1. Revenue generated from each supplier.
2. Revenue generated from each location.
3. Revenue generated from each company product.
4. Total products delivered by each carrier.

These insightful visualizations offer valuable data-driven perspectives, enabling informed decision-making and strategic planning.

Supply Chain Analysis involves examining different aspects of a supply chain to identify areas for improvement and enhance its overall efficiency, ultimately delivering greater value to customers. This report provided an overview of conducting a supply chain analysis using Excel, highlighting the key steps, the techniques used and the revenue findings.

Statso

Supply Chain Analysis: Case Study

  • April 3, 2023

Download the dataset below to solve this Data Science case study on Supply Chain Analysis.

Supply chain analytics is a valuable part of data-driven decision-making in various industries such as manufacturing, retail, healthcare, and logistics. It is the process of collecting, analyzing and interpreting data related to the movement of products and services from suppliers to customers.

Here is a dataset we collected from a Fashion and Beauty startup. The dataset is based on the supply chain of Makeup products. Below are all the features in the dataset:

  • Product Type
  • Availability
  • Number of products sold
  • Revenue generated
  • Customer demographics
  • Stock levels
  • Order quantities
  • Shipping times
  • Shipping carriers
  • Shipping costs
  • Supplier name
  • Production volumes
  • Manufacturing lead time
  • Manufacturing costs
  • Inspection results
  • Defect rates
  • Transportation modes

You are required to perform Supply Chain Analysis to find data-driven approaches to optimize the supply chain performance and improve customer satisfaction while reducing costs and maximizing profits for all stakeholders involved.

References to Solve this Data Science Case Study

  • Supply Chain Analysis by Aman Kharwal

Supply Chain Optimization with Python

Find the optimal locations of your manufacturing facilities to meet your customers’ demands and reduce production costs.

Samir Saci

Towards Data Science

Supply chain optimization makes the best use of data analytics to find an optimal combination of factories and distribution centres to match supply and demand.

Because of the current surge in shipping costs, companies are starting to challenge their current footprint to adapt to the post-covid “New Normal”.

In this article, we will present a simple methodology using Linear Programming for Supply Chain Optimization, considering:

  • Fixed production costs of your facilities ($/Month)
  • Variable production costs per unit produced ($/Unit)
  • Shipping costs ($)
  • Customer’s demand (Units)

Should we keep outsourcing with shipping costs that have tripled in the last 12 months?

I. How do you design a supply chain network with Python?

Problem statement.

As the Head of Supply Chain Management of an international manufacturing company, you want to redefine the Supply Chain Network for the next 5 years.

You must consider the recent increase in shipping costs and the forecasts of customers’ demand.

Your Supply Chain Network

  • 5 markets in Brazil, USA, India, Japan, Germany
  • 2 types of manufacturing facilities: low-capacity and high-capacity sites
  • Shipping costs ($/container)
  • Customer’s demand (Units/year)

Manufacturing Facility Fixed Costs

  • Capital Expenditure for the Equipment (Machines, Storage, ..)
  • Utilities (Electricity, Water, ..)
  • Factory management, administrative staff
  • Space Rental

These costs depend on the country and the type of plant.

Production Variable Costs

  • Production lines operators
  • Raw materials

For instance, the variable cost of a unit produced in Germany is 13$/Unit.

Shipping Variable Costs

  • Cost per container ($/Container)
  • Assumption: 1 container can contain 1000 units

For instance, if you want to ship a container from Germany to Japan it will cost you 1,231 ($/Container).
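To see how these two variable costs combine, here is a small Python sketch of the per-unit arithmetic for a unit produced in Germany and shipped to Japan, using the figures quoted above (13 $/unit, 1,231 $/container, 1,000 units per container). The function name is hypothetical, and fixed costs are deliberately left out.

```python
UNITS_PER_CONTAINER = 1000  # assumption stated above: 1 container holds 1000 units


def variable_cost_per_unit(production_cost, container_cost,
                           units_per_container=UNITS_PER_CONTAINER):
    """Production cost plus the freight share of one unit, in $/unit."""
    return production_cost + container_cost / units_per_container


# Produced in Germany (13 $/unit), shipped to Japan (1,231 $/container).
cost = variable_cost_per_unit(13, 1231)
print(f"{cost:.3f} $/unit")
```

With 1,000 units per container, freight adds about 1.23 $ to each unit, so shipping cost only becomes decisive when container prices rise sharply or when production costs between countries are close.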

Manufacturing Capacity by Site

For instance, a high-capacity factory in Brazil can produce 1,500,000 (Units/month).

Customers' demand per market

You can find the full code in this Github repository: Link My portfolio with other projects: Samir Saci

II. Results

Let us try three scenarios

  • Scenario 1: initial parameters
  • Scenario 2: we increase the production capacity of India (x2)
  • Scenario 3: surging shipping costs due to container shortage

Scenario 1: initial scenario.

  • Brazil plant is producing for the local market and the USA
  • India plants produce for all countries except Brazil
  • Japan needs to produce locally because of the limited capacity of India

Final Costs

Scenario 2: Outsourcing to low-cost regions

What if we double the size of high-capacity plants in India?

Let us try to double the size of the India High Capacity plant, assuming that it will double the fixed costs.

  • Brazil plant is still producing for the local market and the USA
  • Japan does not produce locally anymore.

What if we have container costs multiplied by 5?

  • Brazil is producing for the local market only
  • The USA starts to produce for its local market and for Japan
  • India closed its low-capacity factory
  • Japan starts to produce for its local market

Because of their limited production capacity, Japan and the USA still rely on the Indian plant.

Do you want to try it yourself?

This model has been implemented in a web application,

👇 Access it here,

  • Sustainable Supply Chain Optimization Application

III. Build your Model

We will be using the PuLP library of Python.

PuLP is a modelling framework for Linear (LP) and Integer Programming (IP) problems written in Python and maintained by COIN-OR Foundation (Computational Infrastructure for Operations Research).

1. Declare your variables, parameters and model

  • LpMinimize: Your objective is to minimize your costs
  • lowBound =0: You cannot have negative values of units produced
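One common way to write such a network design model is given here as a sketch. The notation is an assumption introduced for illustration and does not come from the article: x_ij is the number of units produced in country i and shipped to market j, y_is equals 1 if a plant of size s is open in country i, f_is is the fixed cost, v_i the variable production cost per unit, c_ij the shipping cost per unit (container cost divided by 1,000), d_j the demand and K_is the capacity, all expressed over the same period.

```latex
\begin{aligned}
\min_{x,\,y}\quad & \sum_{i \in I}\sum_{s \in S} f_{is}\, y_{is}
  \;+\; \sum_{i \in I}\sum_{j \in J} (v_i + c_{ij})\, x_{ij} \\
\text{s.t.}\quad & \sum_{i \in I} x_{ij} = d_j && \forall j \in J \\
& \sum_{j \in J} x_{ij} \le \sum_{s \in S} K_{is}\, y_{is} && \forall i \in I \\
& x_{ij} \ge 0, \quad y_{is} \in \{0, 1\}
\end{aligned}
```

The first constraint makes sure every market's demand is met, and the second limits what a country can produce to the capacity of the plants opened there.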

2. Define the objective and add constraints
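The three steps can be sketched with PuLP on a toy instance with two countries. This is an illustrative sketch, not the code from the repository: every number below is hypothetical, the variable names are made up, and modelling the open/closed decision as a binary variable is an assumption.

```python
from pulp import (LpMinimize, LpProblem, LpStatus, LpVariable,
                  PULP_CBC_CMD, lpSum, value)

# All figures are hypothetical toy values, not the article's data.
COUNTRIES = ["USA", "INDIA"]
SIZES = ["LOW", "HIGH"]
DEMAND = {"USA": 100, "INDIA": 50}                      # units per month
VAR_COST = {"USA": 12, "INDIA": 5}                      # $ per unit produced
FREIGHT = {("USA", "USA"): 0, ("USA", "INDIA"): 2,      # $ per unit shipped
           ("INDIA", "USA"): 3, ("INDIA", "INDIA"): 0}
FIXED = {("USA", "LOW"): 500, ("USA", "HIGH"): 900,     # $ per month
         ("INDIA", "LOW"): 300, ("INDIA", "HIGH"): 500}
CAPACITY = {("USA", "LOW"): 80, ("USA", "HIGH"): 200,   # units per month
            ("INDIA", "LOW"): 80, ("INDIA", "HIGH"): 200}


def solve_network():
    """Build and solve the toy network; return (status, total cost, open plants)."""
    # 1. Declare the model and its variables.
    model = LpProblem("toy_supply_chain_network", LpMinimize)
    routes = [(i, j) for i in COUNTRIES for j in COUNTRIES]
    plants = [(i, s) for i in COUNTRIES for s in SIZES]
    x = LpVariable.dicts("ship", routes, lowBound=0, cat="Continuous")
    y = LpVariable.dicts("open", plants, cat="Binary")

    # 2. Objective: fixed costs of open plants + production and freight costs.
    model += (lpSum(FIXED[p] * y[p] for p in plants)
              + lpSum((VAR_COST[i] + FREIGHT[i, j]) * x[i, j] for i, j in routes))
    # Every market's demand must be met exactly.
    for j in COUNTRIES:
        model += lpSum(x[i, j] for i in COUNTRIES) == DEMAND[j]
    # A country cannot produce more than the capacity of its open plants.
    for i in COUNTRIES:
        model += (lpSum(x[i, j] for j in COUNTRIES)
                  <= lpSum(CAPACITY[i, s] * y[i, s] for s in SIZES))

    # 3. Solve with the CBC solver bundled with PuLP and read the results.
    model.solve(PULP_CBC_CMD(msg=False))
    open_plants = [p for p in plants if y[p].varValue > 0.5]
    return LpStatus[model.status], value(model.objective), open_plants


status, total_cost, open_plants = solve_network()
print(status, total_cost, open_plants)
```

On this toy data, the cheapest design opens only the high-capacity plant in INDIA, for a total cost of 1,550. Changing FREIGHT or CAPACITY and solving again is the same kind of what-if analysis as the scenarios discussed in the results.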

3. Solve your model and analyze the results

IV. Conclusion

Generative AI: GPT x Supply Chain Optimization

Following the trend of Generative AI with large language models (LLMs), I started to experiment with their usage with this prototype of a LangChain Agent, powered by GPT and connected to a TMS (transportation management system).

This agent’s performance is impressive; it can answer operational questions autonomously by querying a database and extracting the results.

What if we create a super agent for Supply Chain Optimization?

The idea would be to equip a GPT agent with

  • Advanced optimization models written in Python in a core module: Supply Chain Optimization, Sustainable Sourcing, Supply Planning
  • Documentation, articles and context to understand how to use them

Let us imagine agents that can interact with the users to get data, understand the constraints and run the solution to analyse results.

Implement Sustainable Sourcing

A similar approach can be used to select your suppliers, considering an objective function that will minimize the CO2 footprint.

Sustainable sourcing is the process of integrating social, ethical and environmental performance factors when selecting suppliers.

This involves assessing and evaluating suppliers based on sustainability criteria, such as labour rights, health and safety, environmental impact, human rights, and more.

Use data analytics to automatically select the best supplier with a mix of economic and environmental constraints

This article will discover how to use data analytics to design an optimal Supply Chain Network to minimize costs and environmental impacts.

This model allows you to simulate several scenarios influencing operational and commercial parameters.

  • What if the demand explodes in India?
  • What if we have to close our plant in Brazil?
  • What if we triple the production capacity in Japan?

Scenario 3 is an example of a shipping cost surge that could push companies to switch to a more local footprint with plants producing for their local market only.

This simple model can help you grasp the potential of linear optimization for supply chain network optimization. We can quickly improve this model by adding constraints:

  • Storage Costs
  • Carbon emissions limitations (CO2 = f(distance, weight))
  • Delivery lead time
  • Customs Clearance Fees
  • Currency change

Feel free to share suggestions of additional constraints to improve the model and meet the business requirements in your industry.

[1] Computational Infrastructure for Operations Research, Optimization with PuLP (Documentation), Link

Big data and the supply chain: The big-supply-chain analytics landscape (Part 1)

Big data and the era of digital means a big analytics landscape for supply chain to work with.

Your supply chains generate big data. Big supply-chain analytics turn that data into real insights.

The explosive impact of e-commerce on traditional brick and mortar retailers is just one notable example of the data-driven revolution that is sweeping many industries and business functions today. Few companies, however, have been able to apply to the same degree the "big analytics" techniques that could transform the way they define and manage their supply chains.

In our view, the full impact of big data in the supply chain is restrained by two major challenges. First, there is a lack of capabilities. Supply chain managers—even those with a high degree of technical skill—have little or no experience with the data analysis techniques used by data scientists. As a result, they often lack the vision to see what might be possible with big data analytics. Second (and perhaps more significantly), most companies lack a structured process to explore, evaluate and capture big data opportunities in their supply chains.

In the second part of this article series, we will show how companies can take control of the big data opportunity with a systematic approach. Here, we will look at the nature of that opportunity and at how some companies have managed to embed data driven methodologies into their DNA. Exhibit 1 provides an overview of the landscape of supply chain analytics opportunities.

What is big supply-chain analytics?

Big supply chain analytics uses data and quantitative methods to improve decision making for all activities across the supply chain. In particular, it does two new things. First, it expands the dataset for analysis beyond the traditional internal data held on Enterprise Resource Planning (ERP) and supply chain management (SCM) systems. Second, it applies powerful statistical methods to both new and existing data sources. This creates new insights that help improve supply chain decision-making, all the way from the improvement of front-line operations, to strategic choices, such as the selection of the right supply chain operating models.

Let's look at each main area in turn.

A. Sales, Inventory and Operations Planning

Typically, planning is already the most data-driven process in the supply chain, using a wide range of inputs from Enterprise Resource Planning (ERP) and SCM planning tools. There is now significant potential to truly redefine the planning process, however, using new internal and external data sources to make real-time demand and supply shaping a reality.

We can think about managing inventory in a supply chain similar to the way electricity is managed: Storing electricity is expensive and difficult; power companies bring in additional consumers or start and stop plants to ensure a balanced power grid. Retailers now have the opportunity to use a similar approach. Visibility of point of sale (POS) data, inventory data, and production volumes can be analyzed in real time to identify mismatches between supply and demand. These can then drive actions, like price changes, the timing of promotions or the addition of new lines, to realign things.

Retailers can also use new data sources to improve planning processes and their demand-sensing capabilities. For example, Blue Yonder has developed data intensive forecasting methods now deployed into retailing where 130,000 SKUs and 200 influencing variables generate 150,000,000 probability distributions every day. This has dramatically increased forecast accuracy; enabled a better view of the company's logistics capacity needs; and reduced obsolescence, inventory levels, and stockouts. The recent growth of third party cloud-based services like Blue Yonder is making such activities more accessible for other retailers, too.

Similarly, IBM has helped develop links between production planning and weather forecasts for bakeries. By incorporating temperature and sunshine data, baking companies are able to more accurately predict demand for different product categories based on factors that influence consumer preferences. Amazon, meanwhile, has patented an "anticipatory shipping" approach, in which orders are packaged and pushed into the delivery network before customers have actually ordered them.

Having truly mastered big-data forecasting, the next level of sophistication is to start actively shaping demand. Leading online retailers, for example, use big data analytics, inventory data, and forecasting to change the products recommended to customers. This effectively steers demand towards items that are available in stock.

B. Sourcing

In many companies, data on procurement volumes and suppliers are only gathered for a few activities in the sourcing process. However, supply data goes beyond the classic spend analysis and annual supplier performance review. On a transactional basis, supply processes can be sensed in real time to identify deviations from normal delivery patterns. Firms are also finding opportunities for predictive risk management. By mapping its supply chains and using "Google trend"-style information and social data about strikes, fires, or bankruptcies, a firm can monitor supply disruptions in transportation, or at 2nd or 3rd tier suppliers, and take decisive actions before its competitors.

Data analysis can also drive strategic decisions. In recent years, one pharmaceutical company has created a database with all bids submitted for packaging. The data has been evaluated to fully understand the cost structure of those suppliers and to create detailed cost models for different types of packaging. Using updated information on commodity prices, factor costs, and plant utilization, these models can be used to aid the selection of the most appropriate suppliers for new packaging projects. Similarly, Caterpillar has initiated a contest on the crowd-data science website Kaggle to model quoted prices for industrial tube assemblies.

These "clean sheet costing" bottom-up calculations can also be applied in the purchase of transportation and warehousing. By exploiting data on the cost breakdown of operations of trucks and warehouses across the globe, companies can create a powerful fact base to challenge carriers and Logistics Solution Providers (LSPs), and provide real insight into "should cost" during negotiations.

C. Manufacturing

Big data and analytics can already help improve manufacturing. For example, energy-intensive production runs can be scheduled to take advantage of fluctuating electricity prices. Data on manufacturing parameters, like the forces used in assembly operations or dimensional differences between parts, can be archived and analyzed to support the root-cause analysis of defects, even if they occur years later. Agricultural seed processors and manufacturers analyze the quality of their products with different types of cameras in real-time to get the quality assessments for each individual seed.

The Internet of Things, with its networks of cameras and sensors on millions of devices, may enable other manufacturing opportunities in the future. Ultimately, live information on a machine's condition could trigger production of a 3D-printed spare part that is then shipped by a drone to the plant to meet an engineer, who may use augmented reality glasses for guidance while replacing the part.

D. Warehousing

Logistics has traditionally been very cost-focused, and companies have happily invested in technologies that provide competitive advantage. Warehousing in particular has seen many advances using available ERP data. One example is "chaotic" storage approaches that enable the efficient use of warehouse space and minimize travel distances for personnel. Another is high-rack bay warehouses that can automatically reshuffle pallets at night to optimize schedules for the next day. Companies can track the performance of pickers in different picking areas to optimize future staff allocation.

New technologies, data sources and analytical techniques are also creating new opportunities in warehousing. A leading forklift provider is looking into how the forklift truck can act as a big data hub that collects all sorts of data in real time, which can then be blended with ERP and Warehouse Management System (WMS) data to identify additional waste in the warehouse process. For example, the analysis of video images collected by automated guided vehicles, along with sensor inputs including temperature, shelf weight, and the weight on the forklift, can be used to monitor picking accuracy, warehouse productivity and inventory accuracy in real time. Similarly, forklift driving behavior and route choices can be assessed and dynamically optimized to drive picking productivity. The data can also be used to conduct root-cause analysis of picking errors by shape, color, or weight, to help make processes more robust.

New 3D modelling technologies can also help to optimize warehouse design and simulate new configurations of existing warehouse space to further improve storage efficiency and picking productivity. German company Logivations, for example, offers a cloud-based 3D warehouse layout planning and optimization tool.

E. Transportation

Truck companies already make use of analytics to improve their operations. For example, they use fuel consumption analytics to improve driving efficiency; and they use GPS technologies to reduce waiting times by allocating warehouse bays in real time.

Courier companies have started real-time routing of deliveries to customers based on their truck's geo-location and traffic data. UPS, for example, has spent ten years developing its On-Road Integrated Optimization and Navigation system (Orion) to optimize the 55,000 routes in the network. The company's CEO David Abney says the new system will save the company $300 million to $400 million a year.

Big analytics will also enable logistics providers to deliver parcels with fewer delivery attempts, by allowing them to mine their data to predict when a particular customer is more likely to be at home. On a more strategic basis, companies can cut costs and carbon emissions by selecting the right transport modes. A major CPG player is investing in analytics that will help it to understand when goods need to be shipped rapidly by truck or when there is time for slower barge or train delivery.

F. Point of Sale

Brick and mortar retailers, often under heavy pressure from online competitors that have mastered analytics, have understood how data-driven optimization can provide them with competitive advantages. These techniques are being used today for activities like shelf-space optimization and mark-down pricing. Advanced analytics can also help retailers decide which products to put in high value locations, like aisle ends, and how long to keep them there. It can also enable them to explore the sales benefits achieved by clustering related products together.

Search engine giant Google has acquired Skybox, a provider of high-resolution satellite imagery, that can be used to track cars in the car park in order to anticipate in-store demand. Others have explored the use of drones equipped with cameras to monitor on-shelf inventory levels.

A topic that is still a challenge for many retailers is out-of-stock detection and prevention. In developed markets, manual inspections are expensive, while RFID tags still cost too much to be applied to individual grocery items. Instead, retailers are now monitoring sales activity for out-of-stock indicators. If an item that usually sells every few minutes does not appear at the tills, an alert is triggered to have a person check if the item is out of stock at the shelf. Other innovative technologies are also being tested, including the installation of light or weight sensors on shelves as well as the use of in-store cameras to monitor on-shelf stock levels.
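The sales-activity rule described above can be sketched in a few lines of Python. This is only an illustration of the idea under simple assumptions: the function name, the item names, the timestamps and the tolerance factor are all hypothetical, and a real system would estimate the typical gap between sales from historical point-of-sale data.

```python
def out_of_stock_alerts(last_sale_minute, typical_gap_minutes, now_minute,
                        tolerance=3.0):
    """Flag items that have gone unsold for much longer than usual.

    An item is flagged when the time since its last sale exceeds
    `tolerance` times its typical gap between sales (all in minutes).
    """
    alerts = []
    for item, last_sale in last_sale_minute.items():
        silent_for = now_minute - last_sale
        if silent_for > tolerance * typical_gap_minutes[item]:
            alerts.append(item)
    return sorted(alerts)


# Hypothetical till data: minute of the last sale and typical gap per item.
last_sale = {"milk": 100, "bread": 55}
typical_gap = {"milk": 5, "bread": 10}
print(out_of_stock_alerts(last_sale, typical_gap, now_minute=110))
```

Here milk last sold 10 minutes ago against a 15-minute threshold, so it is not flagged, while bread has been silent for 55 minutes against a 30-minute threshold and would trigger a shelf check.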

Similar technologies can be applied directly at the point of use. Amazon's Dash service, for example, provides consumers with wireless buttons that can be used to reorder domestic products, like washing powder or razor blades, with a single push. Ultimately, stores may be able to link to data gathered from consumers' Internet-connected refrigerators to forecast demand in real time.

As the examples in this article show, big data is already helping leading organizations transform the performance of their supply chains. Today, such approaches are the exception rather than the norm, however. Lack of capabilities and the lack of a structured approach to supply chain big data is holding many companies back. For big data and advanced analytical tools to deliver greater benefits for more companies, those organizations need a more systematic approach to their adoption. Part 2 of this series will address that topic in detail.

About the authors: Knut Alicke is a master expert in the Stuttgart office, Christoph Glatzel is a director in the Cologne office, and Per-Magnus Karlsson is a consultant in the Stockholm office. Kai Hoberg is an associate professor of supply chain and operations strategy at Kühne Logistics University, Germany.

Neo4j Graph Database

Intro to the advantages of graph databases

Neo4j AuraDB

Fully-managed graph database as a service

Neo4j Graph Data Science

Graph Analytics and modeling platforms

Deployment Center

Get started. Download, integrate, and deploy.

  • Neo4j Data Connectors Download Apache Kafka, Apache Spark, and BI tools

Generative AI

Back your LLMs with a Knowledge Graph for better business AI

Industries & Use Cases

Fraud detection, knowledge graphs, financial services, and more

All Use Cases

Developer Center

Best practices, guides, tutorials, and downloads

GraphAcademy

Free online courses and certifications. Join the 100K+ Neo4j experts.

Hear directly from data and business trailblazers

Touring 20+ cities globally | Free

Join our online conference. New themes every time.

Virtual | Every Quarter | Free

  • Get Started Free

Warning: JavaScript is disabled on your browser. Parts of Neo4j.com will not work properly.

(Neo4j Developer Blog)←[:BACK]

Graph Data Science for Supply Chains – Part 1: Getting Started with Neo4j GDS and Bloom

Zach Blumenfeld, Data Science Product Specialist, Neo4j. Jun 21, 2022. 15 mins read.

Actionable insights in minutes, using Neo4j Graph Data Science and Bloom to intuitively visualize and extract supply chain insights around operational load, flow control, and regional patterns.

Supply chains are inherently complex, involving multiple stages, inputs, outputs, and interconnectivity. Looking at supply chain data in raw form (tables) can be daunting, and extracting valuable insights can be difficult and unintuitive. Luckily, supply chains are intrinsically structured as a graph: a network of stages and interconnecting arcs, which in graph terminology we would refer to as nodes and relationships respectively.

Fortunately, graph-based approaches explicitly model the rich interconnected nature of supply chain data. Using products like Neo4j, stakeholders can immediately visualize the network and intuitively explore and analyze the data. By further coupling this with Neo4j Graph Data Science (GDS), practitioners are empowered to rapidly conduct more advanced inference and gain insights which would otherwise remain obfuscated and challenging to uncover in other data models.

In this running blog series, we explore how Neo4j and GDS can be practically applied to supply chain and logistics use cases with specific technical examples.

This is the first blog in the series, where I will demonstrate how graph technology provides insights for a freight forwarding logistics network. The data is obscure and unwieldy to deal with in its raw form. It is also heavily anonymized, with dates and locations redacted. However, once we get it into Neo4j, you will see that, despite all this, the data almost instantaneously starts to tell a story as insights reveal themselves naturally through the network structure.

This blog will focus specifically on getting started with experimentation and visualization of supply chain data using Neo4j Graph Data Science and Bloom together. I will introduce some basic measures for operational load, flow control, and clustering that can help you better understand your supply chain structure and different interdependencies and risks within it.

While the focus of this blog will be visualization and experimentation, the next couple of blogs will dive deeper into leveraging GDS and Neo4j to better operationalize our analysis, as well as analyze the effect of the network structure on performance and risk in a more quantitative manner.

It is important to note that while the specific example in this blog is highly focused on logistics and freight forwarding, this same methodology can be applied to other types of supply chain problems like inventory management, manufacturing, Bill of Materials and more.

Source Logistics Dataset

To explore the application of graph data science to supply chain logistics, we will use the Cargo 2000 transport and logistics case study dataset. Cargo 2000 (re-branded as Cargo iQ in 2016) is an initiative of the International Air Transport Association (IATA) that aims to deliver a new quality management system for the air cargo industry.

The below figure shows a model of the business processes covered in the IATA case study. It represents the business processes of a freight forwarding company, in which up to three smaller shipments from suppliers are consolidated and then shipped together to customers. The business process is structured into incoming and outgoing transport legs, with the overall objective that freight is delivered to customers in a timely manner.

Figure 1: Transport and Logistics Process used in the Cargo 2000 Case Study

Each of the transport legs involves the following physical transport services:

  • RCS (Freight Reception) : Freight is received by the airline. It is delivered and checked in at the departure warehouse.
  • DEP (Freight Departure) : Goods are delivered to an aircraft and, once confirmed on board, the aircraft departs.
  • RCF (Freight Transport/Arrival): Freight is transported by air and arrives at the destination airport. Upon arrival freight is checked in and stored at the arrival warehouse.
  • DLV (Freight Delivery): Freight is delivered from the destination airport warehouse to the ultimate recipient.

A transport leg may involve multiple segments (e.g., transfers to other flights or airlines). In those cases, activity RCF loops back to DEP (indicated by the “loop-back” arrow in Figure 1).

The case study data comprises tracking and tracing events from a forwarding company’s Cargo 2000 system for a period of five months. From those Cargo 2000 messages, 3,942 business process instances (end-to-end shipments from incoming to outgoing), comprising 7,932 transport legs and 56,082 service invocations, were reconstructed. The data includes planned and effective durations (in minutes) for each of the four services of the business process outlined above.

Dataset Anonymization

For confidentiality reasons, message fields that exhibit business critical or customer-related data (such as airway bill numbers, flight numbers and airport codes) have been eliminated or masked. The dataset also does not contain any datetime or geolocation information. DEP (“departure”) and RCF (“arrival”) services have specific airports associated with them, expressed through a “place” id in the dataset. However, due to the anonymization, this place id is masked with a sequential integer id rather than the airport IATA code or other real-world identifier.

To make analytical findings a bit easier to parse and recall, I will add fictitious names to each airport in the dataset before ingesting the raw data into a graph. This will allow us to refer to airports by names like “Davisfort” and “Richardberg” rather than by a raw number that may be hard to remember.

Initial Tabular Dataset Statistics

The case study dataset is provided in tabular form. While containing a lot of valuable information, the dataset feels difficult to parse through in this format, at least for me. Below is a snapshot of what it looks like.

Figure 2: Source Data in Tabular Form

Every row consists of a business process instance, an end-to-end shipment from entry of incoming legs to delivery of outgoing leg. Many/most shipments do not utilize the maximum number of legs and segments (some numbers broken down below), resulting in many null values, represented by a “?” in the table. In data science terminology, we would refer to this as “sparse” data.
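To make the sparsity concrete, here is a minimal Python sketch of how the “?” placeholders could be turned into proper missing values, and how you might check which legs a shipment actually used. The column names and the `i1_`, `i2_`, `o_` prefixes are hypothetical stand-ins for illustration, not necessarily the names in the source file.

```python
from typing import Dict, List, Optional


def parse_cell(raw: str) -> Optional[int]:
    """Convert one raw cell to an int, treating "?" as a missing value."""
    raw = raw.strip()
    return None if raw == "?" else int(raw)


def used_legs(row: Dict[str, str], leg_prefixes: List[str]) -> List[str]:
    """Return the leg prefixes that have at least one non-missing column."""
    legs = []
    for prefix in leg_prefixes:
        values = [parse_cell(v) for k, v in row.items() if k.startswith(prefix)]
        if any(v is not None for v in values):
            legs.append(prefix)
    return legs


# Hypothetical row: planned (_p) and effective (_e) minutes for the
# reception service on two incoming legs and the outgoing leg.
row = {
    "i1_rcs_p": "120", "i1_rcs_e": "95",
    "i2_rcs_p": "?", "i2_rcs_e": "?",   # second incoming leg not used
    "o_rcs_p": "60", "o_rcs_e": "71",
}
legs = used_legs(row, ["i1_", "i2_", "o_"])
print(legs)
```

In this made-up row the second incoming leg is entirely “?”, so only the first incoming leg and the outgoing leg are reported as used.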

Figure 3: Leg and Segment Statistics

Graph Data Modeling and Ingest

Given the multi-hop interconnected nature of air freight forwarding, this data seems really conducive to analyzing in a graph, especially given the sparsity involved.

The first step for any graph data science project is data modeling: deciding on a graph data model or “schema” that represents the business processes with nodes and relationships. In general, you want to

  • represent nouns like locations and checkpoints with nodes, and
  • represent verbs, like service invocations or actions with relationships

This keeps the model intuitive and will also come in handy later when we want to do path calculations, route optimization, and what-if analysis.

For this data set, the below data model will serve well.

Figure 4: Graph Data Model

In this graph data model, the four transportation services are represented by relationships with stage and checkpoint nodes in-between. Each checkpoint node is then connected to an airport representing its location.

To keep this graph model simple and robust, incoming and outgoing legs are not modeled with separate relationship types. Instead, a “TransferPoint” node type marks the transfer between incoming delivery and outgoing reception. Likewise, for multi-segment shipment legs, sequential segments are represented by freight departure coming directly from the arrival warehouse checkpoint as opposed to continuing to delivery or looping back to a departure warehouse.

There are also some important Node and Relationship properties included in the data model.

  • All the relationships, except the LOCATED_AT relationship, have a shipmentId property corresponding to the end-to-end shipment process (row in the source data set) they belong to. The property is indexed to allow for fast retrieval in queries.
  • All nodes will have an airportId property with a uniqueness constraint. This asserts that every airport has only one node for each type of checkpoint, as well as just a single airport node representing it in the graph.

Graph Ingest

Once the data model is decided, you can ingest the data into a Neo4j graph using basic data transformations and Cypher. I accomplish this with a Python notebook here if you are interested in the steps I used.

Exploring End-to-End Shipments with Neo4j Bloom

Once the data is ingested, a typical starting point for any data science project is exploratory analysis. Below, I use Cypher queries and Bloom to begin to understand the air freight forwarding data.

I can easily visualize unique end-to-end-shipments in Bloom using Cypher search phrases and the search bar. I saved my Bloom Perspective in GitHub if you are interested in replicating these visualizations.

For example, say you are interested in visualizing shipment 12. You can plug “Path for Shipment 12” into the search bar to retrieve the result.

Figure 5: End-To-End Shipment Path in Bloom

You can see how much clearer the data presents itself now. This path shows 3 incoming shipment legs, one departing from “Odonnelltown”, the other two from “Wanborough”. All legs are single segments. Once all the shipments arrived in “Moodytown”, the freight was transferred to an outgoing leg which delivered to the final destination in “Michaelstad”.

Understanding Influence & Risks in Supply Chain Stages

Most real-world supply chains aren’t perfectly uniform: there are usually certain steps or stages that are critical for ultimate delivery. With a bill of materials, this could be a specialized part or supplier; for manufacturing processes, it could be a specific core step with lots of inputs and outputs. In our freight forwarding logistics example, the critical step is transfers at airport locations that are highly central to shipment routes.

Central, and potentially high risk, stages in supply chains are common but they may lead to

  • higher and/or more volatile operational load
  • significantly higher risk to the supply chain. This is because trouble in these stages can be more likely to cause bottlenecks or otherwise carry over to other critical business processes.

When you can visualize your supply chain, these risks often stand out – without any fancy data science needed! However, we’re often required to use more quantitative and holistic approaches to identify, and measure, risk: that’s where graph algorithms come in handy.

Exploring Operational Load and Flow Control with Graph Data Science

We started off our analysis by visually exploring the supply chain data. We can get more sophisticated by applying graph algorithms, in order to find patterns, anomalies, or trends. Bloom allows you to run graph algorithms over the data in a scene, and automatically applies rule based styling to help interpret the results. This can be a powerful way to get started with graph algorithms quickly: it’s a no-code approach to experimentation.

To make our visualizations easier, we are going to collapse the graph model. We will do this by creating a new relationship called SENDS_TO between airports. The relationship will have a flightCount property that counts the number of TRANSPORT relationships going between airports.
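The collapse step is essentially a group-and-count over airport pairs. As a rough illustration outside of Neo4j, the Python sketch below counts transport hops per (origin, destination) pair. The `collapse_to_sends_to` helper and the sample hops are assumptions made up for this example; in the actual project the aggregation runs in the database.

```python
from collections import Counter


def collapse_to_sends_to(transports):
    """Count transport hops per (origin, destination) airport pair."""
    counts = Counter((src, dst) for src, dst in transports)
    return [
        {"from": src, "to": dst, "flightCount": n}
        for (src, dst), n in sorted(counts.items())
    ]


# Illustrative hops only (origin airport, destination airport).
transports = [
    ("Wanborough", "Moodytown"),
    ("Wanborough", "Moodytown"),
    ("Odonnelltown", "Moodytown"),
    ("Moodytown", "Michaelstad"),
]
sends_to = collapse_to_sends_to(transports)
for rel in sends_to:
    print(rel)
```

Each resulting record plays the role of one SENDS_TO relationship, with flightCount holding the number of hops that were merged into it.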

Once we do that we can go into Bloom and look at all the airports with the SENDS_TO relationships between them.

Figure 6: Network of Airports in Bloom

While the view provides an interesting high level picture, it doesn’t give us a lot of new information yet.

To start enriching our supply chain data, we can start by running graph algorithms on the in-scene data directly from Bloom. We will start by using centrality algorithms. This family of algorithms can calculate the importance of nodes (here, stages) based on the structure of the graph.

One of the most popular algorithms to understand operational load is degree centrality. This algorithm counts the number of relationships for each node. If we configure the algorithm to use flightCount as a weight and select an “Undirected” orientation, this will effectively count the total number of departing and arriving flights for each airport.

Degree centrality measures the operational load for stages in your supply chain. Stages with high operational load have to manage larger inflows and outflows and may be forced to reconcile conflicting schedules and priorities more often. All else held constant, stages with higher operational load tend to require more resources to run effectively.
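To show what weighted, undirected degree centrality boils down to, here is a small Python sketch. It is an illustration of the idea rather than the GDS implementation, and the `weighted_degree` helper and flight counts are invented for the example.

```python
from collections import defaultdict


def weighted_degree(edges):
    """Sum flightCount over every edge touching a node, ignoring direction."""
    load = defaultdict(int)
    for src, dst, flight_count in edges:
        load[src] += flight_count   # departures from src
        load[dst] += flight_count   # arrivals at dst
    return dict(load)


# Illustrative (origin, destination, flightCount) triples.
edges = [
    ("Wanborough", "Moodytown", 2),
    ("Odonnelltown", "Moodytown", 1),
    ("Moodytown", "Michaelstad", 1),
]
load = weighted_degree(edges)
print(load)
```

In this toy network Moodytown ends up with the highest load (4), because it receives three flights and sends one.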

Below are the results of applying degree centrality to the airport network in our Bloom scene.

Figure 7: Degree Centrality (Operational Load) in Bloom

We can see that 4-5 airports really stand out here.

While degree centrality can tell us about operational load, it only measures the local activity associated with the stage, not necessarily the control or influence the stage has on the entire supply chain network. For this we can look at another algorithm called Betweenness centrality. Technically speaking, a node’s Betweenness centrality is calculated by counting how often the node rests on the shortest paths between all the other nodes in a graph. It is generally a good metric for describing how well a node bridges different regions of the graph together.

> Betweenness centrality measures the flow control for stages in distribution and logistics networks. Stages with high Betweenness centrality have more control over the flow of material and/or product because they connect many other stages together that may otherwise be disconnected or connected through much longer less efficient paths. All else held constant, stages with higher flow control present higher risk for causing bottlenecks in supply chains if they encounter delays or other issues [4].
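For readers who want to see the shortest-path counting spelled out, below is a compact Python sketch of Brandes' algorithm for an unweighted, directed graph. It is a teaching example under simplifying assumptions (no weights, no normalization, one entry per airport pair), not the code that GDS or Bloom runs, and the tiny hub-style graph is made up.

```python
from collections import deque


def betweenness(graph):
    """Raw betweenness scores via Brandes' algorithm.

    graph maps each node to a list of distinct successor nodes.
    """
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    score = {n: 0.0 for n in nodes}
    for s in nodes:
        # BFS from s, recording shortest-path counts and predecessors.
        sigma = {n: 0 for n in nodes}
        dist = {n: -1 for n in nodes}
        preds = {n: [] for n in nodes}
        sigma[s], dist[s] = 1, 0
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph.get(v, []):
                if dist[w] < 0:               # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:    # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate dependencies from the farthest nodes inward.
        delta = {n: 0.0 for n in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score


# Made-up network: A and B both ship through hub H to reach C and D.
graph = {"A": ["H"], "B": ["H"], "H": ["C", "D"]}
scores = betweenness(graph)
print(scores)
```

Here H sits on all four shortest paths (A to C, A to D, B to C, B to D), so it gets a score of 4.0 while every other node scores 0.0. That is the "flow control" intuition in miniature: remove H and nothing can get through.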

Below we apply the Betweenness centrality algorithm in Bloom. Unlike degree centrality, we will keep the natural relationship orientation so we capture the direction of freight shipments.

Figure 8: Betweenness Centrality (Flow Control) in Bloom

More often than not, degree centrality and betweenness centrality will be positively correlated, but the correlation isn’t perfect. We see here that the ranking of top airports is a bit different, with Richardberg now having the highest score and Shanefort following closely. Davisfort, the node with the highest degree centrality, is in third place and has less than half the betweenness centrality of second-place Shanefort.

Understanding Local Networks with Graph Data Science

Centrality algorithms can measure the importance of stages in our supply chain. Another aspect we may want to consider, particularly for distribution and logistics networks, is how flows may naturally cluster into distinct, well-defined regions. This can be driven by geographical proximity, economic (supply/demand) features, or other structural factors. This clustering strongly affects flow control and risks on local/regional levels. For example, stages within a particular region often depend more heavily on each other. Additionally, some stages will have a stronger effect on flows within a region, while others will be more instrumental in flows to and from other regions. In a graph, we can use community detection algorithms to find clusters of the supply chain that are densely interconnected.

To analyze whether this regional clustering exists in our supply chain network, and if so, identify and label the stages within them, we can use the Louvain algorithm. Technically speaking, the Louvain algorithm runs recursively to optimize a modularity score – essentially seeking to assign nodes to communities such that they are as densely connected within the community as possible relative to other random nodes in the graph.

In the context of distribution and logistics networks, Louvain Community Detection finds regional interdependence within the network by identifying groups of stages which have highly interconnected flows between them. All else held constant, stages within the same community have a stronger interdependence on each other relative to stages outside the community.
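The modularity score that Louvain optimizes can be computed directly for any candidate grouping. The Python sketch below does only that scoring step, for an undirected weighted graph; it is not the Louvain algorithm itself, and the `modularity` helper and two-triangle toy network are assumptions for illustration.

```python
from collections import defaultdict


def modularity(edges, community):
    """Modularity Q of a partition of an undirected, weighted graph.

    edges: iterable of (u, v, weight); community: dict of node -> label.
    """
    edges = list(edges)
    total = sum(w for _, _, w in edges)   # m: total edge weight
    internal = defaultdict(float)         # edge weight inside each community
    degree = defaultdict(float)           # summed node degrees per community
    for u, v, w in edges:
        degree[community[u]] += w
        degree[community[v]] += w
        if community[u] == community[v]:
            internal[community[u]] += w
    return sum(
        internal[c] / total - (degree[c] / (2 * total)) ** 2
        for c in degree
    )


# Two tightly knit triangles joined by a single bridge (c-d).
edges = [
    ("a", "b", 1), ("b", "c", 1), ("a", "c", 1),
    ("d", "e", 1), ("e", "f", 1), ("d", "f", 1),
    ("c", "d", 1),
]
split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
lumped = {n: 0 for n in "abcdef"}
q_split = modularity(edges, split)
q_lumped = modularity(edges, lumped)
print(q_split, q_lumped)
```

Splitting the network along the bridge scores about 0.357, while lumping everything into one community scores 0. Louvain searches for the assignment that pushes this number as high as it can.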

Below is an example of the Louvain algorithm run within Bloom. Building on our centrality score scene above, where nodes are sized based on their importance, we can color nodes based on their community membership. Nodes in each community are assigned different colors, and we can clearly see some structural patterns emerging.

Figure 9: Louvain Communities in Bloom

You will see that Louvain found 5 large communities or “regions” within our logistics network. Louvain will label these communities numerically, but to make them easier to describe I will name each region after its highest betweenness centrality node:

  • Masseyhaven region, top left, in orange
  • Moodytown region, center left, in purple
  • Davisfort region, bottom left, in red
  • Richardberg region, center right, in blue
  • Shanefort region, top right, in yellow

If we zoom in on the Masseyhaven region, we will see that there are four airports that all other airports in the region connect to and which seem to dominate the regional flow control: Masseytown, Lambfort, Fieldsbury, and Sonyafort.

Figure 10: Masseyhaven Region (Louvain Community) in Bloom

You can also see that many other airports in the region are mostly (or exclusively) connected to just one of these four central ones, making those airports even more dependent on a single other airport in the region. If you know much about airline routing, you can see that this neatly recapitulates the hub and spoke model of air transit.

We can use these views to analyze flow between regions as well. If we zoom out of the Masseyhaven community, we can see how it’s connected to the Davisfort region.

Figure 11: Flows Between Masseyhaven & Davisfort Regions in Bloom

You will notice that even though the Masseytown airport has the highest operational load and flow control, Lambfort, Fieldsbury, and Sonyafort actually control most of the flow between the Masseyhaven and Davisfort regions. As such, goods that require transport between the Masseyhaven and Davisfort regions will be more dependent on airports like Lambfort and Fieldsbury, even though they have less overall operational load and flow control compared to other airports in the logistics network.

No Geospatial Data Needed!

It is important to mention as well that we were able to learn all of this information about air freight networks without any geographical identifiers. Since these are airports, there is likely correlation between physical distance and transport relationships. Due to the capabilities of graph algorithms and visualization tools like Neo4j GDS and Bloom, we can use historic data to infer the regional dependencies and flow patterns even when we do not have access to geospatial information.

Until Next Time….

In this blog we explored the use of Neo4j GDS and Bloom to visualize and analyze a logistics network. We established some useful algorithms to understand influence and risks in our supply chain. Namely:

  • Degree Centrality -> Operational Load
  • Betweenness Centrality -> Flow Control and Bottleneck Risk
  • And Louvain -> Regional Interdependence

While the Cargo 2000 case study data was a bit difficult to parse in its tabular form, it became much more transparent once ingested into Neo4j as a logistics network. Despite the heavily redacted nature of the source data, GDS and Bloom together provided an intuitive no-code interface which enabled us to easily identify and visualize the locations with highest operational load and flow control, along with the natural clustering/regional structure of the network.

If you are interested in this sort of analysis and want to go deeper, stay tuned for the next section of this series where we will take a more production-ready approach by running algorithms directly from Python. We will introduce a couple more algorithms and explore how to investigate the effect of operational load, flow control, and other critical metrics on supply chain performance and risk.

In this Project, I presented an analysis of a Fashion and Beauty startup’s supply chain data and by collecting, analyzing, and interpreting this dataset, I aimed to gain insights that can inform strategic decisions and optimize supply chain operations.

MisterAare/Supply-chain-analysis-with-excel

Supply-chain-analysis-with-excel.

Supply chain analytics plays a crucial role in driving data-driven decision-making across industries, including manufacturing, retail, healthcare, and logistics. In this report, I presented an analysis of a Fashion and Beauty startup's supply chain data, specifically focused on the movement of makeup products. By collecting, analyzing, and interpreting this dataset, I aim to gain insights that can inform strategic decisions and optimize supply chain operations.

The dataset comprises various features related to the fashion and makeup product supply chain. These features provide valuable information for understanding the flow of products from suppliers to customers. The following features are included:

  • Product Type
  • Availability
  • Number of products sold
  • Revenue generated
  • Customer demographics
  • Stock levels
  • Order quantities
  • Shipping times
  • Shipping carriers
  • Shipping costs
  • Supplier name
  • Production volumes
  • Manufacturing lead time
  • Manufacturing costs
  • Inspection results
  • Defect rates
  • Transportation modes

Supply Chain Analysis involves examining different aspects of a supply chain to identify areas for improvement and enhance the overall efficiency of the supply chain, ultimately delivering greater value to customers. This project provided an overview of conducting a supply chain analysis using Excel, highlighting key steps and techniques.

If you have any questions or would like to contribute further insights, please feel free to reach out.

References to the Solved Analysis Case Study Using Excel (Check out my Full Analytic Report Here)

Supply Chain Analysis with Excel by Aare Adegboyega

  • Survey Paper
  • Open access
  • Published: 25 July 2020

Predictive big data analytics for supply chain demand forecasting: methods, applications, and research opportunities

  • Mahya Seyedan
  • Fereshteh Mafakheri (ORCID: orcid.org/0000-0002-7991-4635)

Journal of Big Data, volume 7, Article number: 53 (2020)

Big data analytics (BDA) in supply chain management (SCM) is receiving growing attention. This is due to the fact that BDA has a wide range of applications in SCM, including customer behavior analysis, trend analysis, and demand prediction. In this survey, we investigate the predictive BDA applications in supply chain demand forecasting to propose a classification of these applications, identify the gaps, and provide insights for future research. We classify these algorithms and their applications in supply chain management into time-series forecasting, clustering, K-nearest-neighbors, neural networks, regression analysis, support vector machines, and support vector regression. This survey also points to the fact that the literature is particularly lacking on the applications of BDA for demand forecasting in the case of closed-loop supply chains (CLSCs) and accordingly highlights avenues for future research.

Introduction

Nowadays, businesses adopt ever-increasing precision marketing efforts to remain competitive and to maintain or grow their margin of profit. As such, forecasting models have been widely applied in precision marketing to understand and fulfill customer needs and expectations [ 1 ]. In doing so, growing attention is being paid to the analysis of consumption behavior and preferences using forecasts obtained from customer data and transaction records, in order to manage product supply chains (SC) accordingly [ 2 , 3 ].

Supply chain management (SCM) focuses on the flow of goods, services, and information from points of origin to customers through a chain of entities and activities that are connected to one another [ 4 ]. In typical SCM problems, it is assumed that capacity, demand, and cost are known parameters [ 5 ]. However, this is not the case in reality, as there are uncertainties arising from variations in customers’ demand, supply transportation, organizational risks, and lead times. Demand uncertainties, in particular, have the greatest influence on SC performance, with widespread effects on production scheduling, inventory planning, and transportation [ 6 ]. In this sense, demand forecasting is a key approach in addressing uncertainties in supply chains [ 7 , 8 , 9 ].

A variety of statistical analysis techniques have been used for demand forecasting in SCM including time-series analysis and regression analysis [ 10 ]. With the advancements in information technologies and improved computational efficiencies, big data analytics (BDA) has emerged as a means of arriving at more precise predictions that better reflect customer needs, facilitate assessment of SC performance, improve the efficiency of SC, reduce reaction time, and support SC risk assessment [ 11 ].

The focus of this meta-research (literature review) paper is on “demand forecasting” in supply chains. The characteristics of demand data in today’s ever expanding and sporadic global supply chains make the adoption of big data analytics (and machine learning) approaches a necessity for demand forecasting. The digitization of supply chains [ 12 ] and the incorporation of Blockchain technologies [ 13 ] for better tracking of supply chains further highlight the role of big data analytics. Supply chain data is high dimensional, generated across many points in the chain for varied purposes (products, supplier capacities, orders, shipments, customers, retailers, etc.), in high volumes due to the plurality of suppliers, products, and customers, and in high velocity, reflected by many transactions continuously processed across supply chain networks. In light of such complexities, there has been a departure from conventional (statistical) demand forecasting approaches that work based on identifying statistically meaningful trends (characterized by mean and variance attributes) across historical data [ 14 ], towards intelligent forecasts that can learn from the historical data and intelligently evolve to predict the ever changing demand in supply chains [ 15 ]. This capability is established using big data analytics techniques that extract forecasting rules through discovering the underlying relationships among demand data across supply chain networks [ 16 ]. These techniques are computationally intensive to process and require complex machine-programmed algorithms [ 17 ].

With SCM efforts aiming at satisfying customer demand while minimizing the total cost of supply, applying machine-learning/data analytics algorithms could facilitate precise (data-driven) demand forecasts and align supply chain activities with these predictions to improve efficiency and satisfaction. Reflecting on these opportunities, in this paper, first a taxonomy of data sources in SCM is proposed. Then, the importance of demand management in SCs is investigated. A meta-research (literature review) on BDA applications in SC demand forecasting is explored according to categories of the algorithms utilized. This review paves the path to a critical discussion of BDA applications in SCM highlighting a number of key findings and summarizing the existing challenges and gaps in BDA applications for demand forecasting in SCs. On that basis, the paper concludes by presenting a number of avenues for future research.

Data in supply chains

Data in the context of supply chains can be categorized into customer, shipping, delivery, order, sale, store, and product data [ 18 ]. Figure  1 provides the taxonomy of supply chain data. As such, SC data originates from different (and segmented) sources such as sales, inventory, manufacturing, warehousing, and transportation. In this sense, competition, price volatilities, technological development, and varying customer commitments could lead to underestimation or overestimation of demand in established forecasts [ 19 ]. Therefore, to increase the precision of demand forecast, supply chain data shall be carefully analyzed to enhance knowledge about market trends, customer behavior, suppliers and technologies. Extracting trends and patterns from such data and using them to improve accuracy of future predictions can help minimize supply chain costs [ 20 , 21 ].

Figure 1. Taxonomy of supply chain data

Analysis of supply chain data has become a complex task due to (1) increasing multiplicity of SC entities, (2) growing diversity of SC configurations depending on the homogeneity or heterogeneity of products, (3) interdependencies among these entities, (4) uncertainties in the dynamical behavior of these components, (5) lack of information related to SC entities [ 11 ], (6) networked manufacturing/production entities due to their increasing coordination and cooperation to achieve a high level of customization and adaptation to varying customers’ needs [ 22 ], and finally (7) the increasing adoption of supply chain digitization practices (and use of Blockchain technologies) to track the activities across supply chains [ 12 , 13 ].

Big data analytics (BDA) has been increasingly applied in the management of SCs [ 23 ], for procurement management (e.g., supplier selection [ 24 ], sourcing cost improvement [ 25 ], sourcing risk management [ 26 ]), product research and development [ 27 ], production planning and control [ 28 ], quality management [ 29 ], maintenance, and diagnosis [ 30 ], warehousing [ 31 ], order picking [ 32 ], inventory control [ 33 ], logistics/transportation (e.g., intelligent transportation systems [ 34 ], logistics planning [ 35 ], in-transit inventory management [ 36 ]), and demand management (e.g., demand forecasting [ 37 ], demand sensing [ 38 ], and demand shaping [ 39 ]). A key application of BDA in SCM is to provide accurate forecasting, especially demand forecasting, with the aim of reducing the bullwhip effect [ 14 , 40 , 41 , 42 ].

Big data is defined as high-volume, high-velocity, high-variety, high-value, and high-veracity data requiring innovative forms of information processing that enable enhanced insights, decision making, and process automation [ 43 ]. Volume refers to the extensive size of data collected from multiple sources (spatial dimension) and over an extended period of time (temporal dimension) in SCs. For example, in the case of freight data, we have ERP/WMS order and item-level data, tracking, and freight invoice data. These data are generated from sensors, bar codes, enterprise resource planning (ERP), and database technologies. Velocity can be defined as the rate of generation and delivery of specific data; in other words, it refers to the speed of data collection, reliability of data transfer, efficiency of data storage, and the speed of discovering useful knowledge related to decision-making models and algorithms. Variety refers to generating varied types of data from diverse sources such as the Internet of Things (IoT), mobile devices, online social networks, and so on. For instance, the vast data from SCM are usually variable due to the diverse sources and heterogeneous formats, particularly resulting from the use of various sensors in manufacturing sites, highways, retailer shops, and facilitated warehouses. Value refers to the nature of the data that must be discovered to support decision-making. It is the most important, yet the most elusive, of the 5 Vs. Veracity refers to the quality of data, which must be accurate and trustworthy, with the knowledge that uncertainty and unreliability may exist in many data sources. Veracity deals with conformity and accuracy of data. Data should be integrated from disparate sources and formats, filtered and validated [ 23 , 44 , 45 ]. In summary, big data analytics techniques can deal with a collection of large and complex datasets that are difficult to process and analyze using traditional techniques [ 46 ].

The literature points to multiple sources of big data across supply chains with varied trade-offs among volume, velocity, variety, value, and veracity attributes [ 47 ]. We have summarized these sources and trade-offs in Table 1. Although the demand forecasts in supply chains belong to the lower bounds of volume, velocity, and variety, these forecasts can use data from all sources across the supply chains, from low volume/variety/velocity on-the-shelf inventory reports to high volume/variety/velocity supply chain tracking information provided through IoT. This combination of data sources used in SC demand forecasts, with their diverse temporal and spatial attributes, places a greater emphasis on the use of big data analytics in supply chains in general, and in demand forecasting efforts in particular.

The big data analytics applications in supply chain demand forecasting have been reported in both categories of supervised and unsupervised learning. In supervised learning, data will be associated with labels, meaning that the inputs and outputs are known. The supervised learning algorithms identify the underlying relationships between the inputs and outputs in an effort to map the inputs to corresponding outputs given a new unlabeled dataset [ 48 ]. For example, in case of a supervised learning model for demand forecasting, future demand can be predicted based on the historical data on product demand [ 41 ]. In unsupervised learning, data are unlabeled (i.e. unknown output), and the BDA algorithms try to find the underlying patterns among unlabeled data [ 48 ] by analyzing the inputs and their interrelationships. Customer segmentation is an example of unsupervised learning in supply chains that clusters different groups of customers based on their similarity [ 49 ]. Many machine-learning/data analytics algorithms can facilitate both supervised learning (extracting the input–output relationships) and unsupervised learning (extracting inputs, outputs and their relationships) [ 41 ].

Demand management in supply chains

The term “demand management” emerged in practice in the late 1980s and early 1990s. Traditionally, there are two approaches for demand management: a forward approach, which looks at potential demand over the next several years, and a backward approach, which relies on past or ongoing capabilities in responding to demand [ 50 ].

In forward demand management, the focus will be on demand forecasting and planning, data management, and marketing strategies. Demand forecasting and planning refer to predicting the quantities and timings of customers’ requests. Such predictions aim at achieving customers’ satisfaction by meeting their needs in a timely manner [ 51 ]. Accurate demand forecasting could improve the efficiency and robustness of production processes (and the associated supply chains) as the resources will be aligned with requirements leading to reduction of inventories and wastes [ 52 , 53 ].

In light of the above facts, many approaches have been proposed in the literature and practice for demand forecasting and planning. Spreadsheet models, statistical methods (like moving averages), and benchmark-based judgments are among these approaches. Today, the most widely used demand forecasting and planning tool is Excel. The most widespread problem with spreadsheet models used for demand forecasting is that they are not scalable for large-scale data. In addition, the complexities and uncertainties in SCM (with multiplicity and variability of demand and supply) cannot be extracted, analyzed, and addressed through simple statistical methods such as moving averages or exponential smoothing [ 50 ]. During the past decade, traditional solutions for SC demand forecasting and planning have faced many difficulties in driving the costs down and reducing inventories [ 50 ]. Although, in some cases, the suggested solutions have improved the days payable, they have pushed up the SC costs as a burden to suppliers.
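To make the simple statistical baselines mentioned above concrete, the following is a minimal sketch of a moving-average forecast and a simple exponential smoothing forecast. The function names, the weekly demand figures, the window of 3, and the smoothing factor of 0.3 are all illustrative assumptions rather than values taken from the reviewed studies.

```python
def moving_average_forecast(demand, window):
    """Forecast the next period as the mean of the last `window` observations."""
    if window <= 0 or window > len(demand):
        raise ValueError("window must be between 1 and len(demand)")
    recent = demand[-window:]
    return sum(recent) / window


def exponential_smoothing_forecast(demand, alpha):
    """Simple exponential smoothing; returns the one-step-ahead forecast."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = demand[0]  # initialise the level with the first observation
    for observed in demand[1:]:
        # Blend the newest observation with the previous smoothed level.
        level = alpha * observed + (1 - alpha) * level
    return level


# Hypothetical weekly demand for one product (units sold).
weekly_demand = [120, 132, 101, 134, 90, 110, 125]

ma3 = moving_average_forecast(weekly_demand, window=3)
ses = exponential_smoothing_forecast(weekly_demand, alpha=0.3)
print(f"3-week moving average forecast: {ma3:.1f}")
print(f"Exponential smoothing forecast: {ses:.1f}")
```

Both functions use only the history of a single series, which illustrates why such baselines cannot capture the many external drivers of demand discussed in this section.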

The era of big data and high computing analytics has enabled data processing at a large scale that is efficient, fast, easy, and with reduced concerns about data storage and collection due to cloud services. The emergence of new technologies in data storage and analytics and the abundance of quality data have created new opportunities for data-driven demand forecasting and planning. Demand forecast accuracy can be significantly improved with data-mining algorithms and tools that can sift through data, analyze the results, and learn about the relationships involved. This could lead to highly accurate demand forecasting models that learn from data and are scalable for application in SCM. In the following section, a review of BDA applications in SCM is presented. These applications are categorized based on the techniques employed in establishing the data-driven demand forecasts.

BDA for demand forecasting in SCM

This survey aims at reviewing the articles published in the area of demand and sales forecasting in SC in the presence of big data to provide a classification of the literature based on algorithms utilized as well as a survey of applications. To the best of our knowledge, no comprehensive review of the literature specifically on SC demand forecasting has been conducted with a focus on classification of techniques of data analytics and machine learning. In doing so, we performed a thorough search of the existing literature, through Scopus, Google Scholar, and Elsevier, with publication dates ranging from 2005 to 2019. The keywords used for the search were supply chain, demand forecasting, sales forecasting, big data analytics, and machine learning.

Figure 2 shows the trend analysis of publications in demand forecasting for SC that appeared from 2005 to 2019. There is a steadily increasing trend in the number of publications from 2005 to 2019. It is expected that such growth continues in 2020. Reviewing the past 15 years of research on big data analysis/machine learning applications in SC demand forecasting, we identified 64 research papers (excluding books, book chapters, and review papers) and categorized them with respect to the methodologies adopted for demand forecasting. The five most frequently used techniques are listed in Table 2, which includes “Neural Network,” “Regression”, “Time-series forecasting (ARIMA)”, “Support Vector Machine”, and “Decision Tree” methods. This table implies the growing use of big data analysis techniques in SC demand forecasting. It should be noted that a few articles used more than one of these techniques.

Figure 2. Distribution of literature in supply chain demand forecasting from 2005 to 2019

It should be noted that there are literature review papers exploring the use of big data analytics in SCM [ 10 , 16 , 23 , 54 , 55 , 56 , 57 , 58 , 59 , 60 , 61 , 62 , 63 , 64 , 65 , 66 , 67 ]. However, this study focuses on the specific topic of “demand forecasting” in SCM to explore BDA applications in line with this particular subtopic in SCM.

As Hofmann and Rutschmann [ 58 ] indicated in their literature review, the key questions to answer are why, what and how big data analytics/machine-learning algorithms could enhance forecasts’ accuracy in comparison to conventional statistical forecasting approaches.

Conventional methods have faced a number of limitations for demand forecasting in the context of SCs. Many parameters influence the demand in supply chains; however, many of them were not captured in studies using conventional methods for the sake of simplicity. In this regard, the forecasts could only provide a partial understanding of demand variations in supply chains. In addition, the unexplained demand variations could be simply considered as statistical noise. Conventional approaches could provide shorter processing times in exchange for a compromise on robustness and accuracy of predictions. Conventional SC demand forecasting approaches are mostly done manually with high reliance on the planner’s skills and domain knowledge. It would be worthwhile to fully automate the forecasting process to reduce such a dependency [ 58 ]. Finally, data-driven techniques could learn to incorporate non-linear behaviors and could thus provide better approximations in demand forecasting compared to conventional methods that are mostly derived based on linear models. There is a significant level of non-linearity in demand behavior in SCs, particularly due to competition among suppliers, the bullwhip effect, and mismatch between supply and demand [ 40 ].

To extract valuable knowledge from a vast amount of data, BDA is used as an advanced analytics technique to obtain the data needed for decision-making. Reduced operational costs, improved SC agility, and increased customer satisfaction are mentioned among the benefits of applying BDA in SCM [ 68 ]. Researchers used various BDA techniques and algorithms in SCM context, such as classification, scenario analysis, and optimization [ 23 ]. Machine-learning techniques have been used to forecast demand in SCs, subject to uncertainties in prices, markets, competitors, and customer behaviors, in order to manage SCs in a more efficient and profitable manner [ 40 ].

BDA has been applied in all stages of supply chains, including procurement, warehousing, logistics/transportation, manufacturing, and sales management. BDA consists of descriptive analytics, predictive analytics, and prescriptive analytics. Descriptive analysis is defined as describing and categorizing what happened in the past. Predictive analytics are used to predict future events and discover predictive patterns within data by using mathematical algorithms such as data mining, web mining, and text mining. Prescriptive analytics apply data and mathematical algorithms for decision-making. Multi-criteria decision-making, optimization, and simulation are among the prescriptive analytics tools that help to improve the accuracy of forecasting [ 10 ].

Predictive analytics are the ones mostly utilized in SC demand and procurement forecasting [ 23 ]. In this sense, in the following subsections, we will review various predictive big data analytics approaches, presented in the literature for demand forecasting in SCM, categorized based on the employed data analytics/machine learning technique/algorithm, with elaborations of their purpose and applications (summarized in Table  3 ).

Time-series forecasting

Time-series methods are methodologies for mining complex and sequential data types. Time-series data are sequence data consisting of long sequences of numeric values recorded at equal time intervals (e.g., per minute, per hour, or per day). Many natural and human-made processes, such as stock markets, medical diagnosis, or natural phenomena, can generate time-series data [ 48 ].

In the case of demand forecasting using time series, demand is recorded over time at equal-sized intervals [ 69 , 70 ]. Combinations of time-series methods with product or market features have attracted much attention in demand forecasting with BDA. Ma et al. [ 71 ] proposed and developed a demand trend-mining algorithm for predictive life cycle design. In their method, they combined three models: (a) a decision tree model for large-scale historical data classification, (b) a discrete choice analysis for present and past demand modeling, and (c) an automated time-series forecasting model for future trend analysis. They tested and applied their 3-level approach in smartphone design, manufacturing and remanufacturing.

A time-series approach was used for forecasting of search traffic (service demand) subject to changes in consumer attitudes [ 37 ]. Demand forecasting has been achieved through time-series models using exponential smoothing with covariates (ESCov) to provide predictions for short-term, mid-term, and long-term demand trends in chemical industry SCs [ 7 ]. In addition, Hamiche et al. [ 72 ] used a customer-responsive time-series approach for SC demand forecasting.

In case of perishable products, with short life cycles, having appropriate (short-term) forecasting is extremely critical. Da Veiga et al. [ 73 ] forecasted the demand for a group of perishable dairy products using Autoregressive Integrated Moving Average (ARIMA) and Holt-Winters (HW) models. The results were compared based on mean absolute percentage error (MAPE) and Theil inequality index (U-Theil). The HW model showed a better goodness-of-fit based on both performance metrics.
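The two accuracy measures named above can be sketched as follows. This is an illustrative computation only: the demand values and the two sets of forecasts are hypothetical, and the Theil index is written in its U1 form, which is an assumption since the text does not state which variant of Theil's statistic was used.

```python
import math


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


def theil_u1(actual, forecast):
    """Theil's U1 inequality coefficient: 0 means a perfect fit, 1 the worst case."""
    n = len(actual)
    rmse = math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)
    rms_actual = math.sqrt(sum(a ** 2 for a in actual) / n)
    rms_forecast = math.sqrt(sum(f ** 2 for f in forecast) / n)
    return rmse / (rms_actual + rms_forecast)


# Hypothetical daily demand and forecasts from two competing models.
actual_demand = [200, 220, 210, 250, 240]
forecast_model_a = [195, 225, 205, 240, 250]
forecast_model_b = [180, 240, 230, 220, 260]

for name, forecast in [("model A", forecast_model_a), ("model B", forecast_model_b)]:
    print(name, f"MAPE={mape(actual_demand, forecast):.2f}%",
          f"U1={theil_u1(actual_demand, forecast):.4f}")
```

Lower values on both measures indicate a better fit, which is the basis on which two models such as ARIMA and HW can be ranked.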

In the case of ARIMA, the accuracy of predictions could diminish where there exists a high level of uncertainty in future patterns of parameters [ 42 , 74 , 75 , 76 ]. HW model forecasting can yield better accuracy in comparison to ARIMA [ 73 ]. HW is simple and easy to use. However, the data horizon should not be longer than a seasonal cycle; otherwise, the accuracy of forecasts will decrease sharply. This is because the inputs of an HW model are themselves predicted values subject to longer-term potential inaccuracies and uncertainties [ 45 , 73 ].
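As a rough illustration of how an HW forecast is produced, the sketch below implements the additive form of Holt-Winters (triple exponential smoothing) with a deliberately simple initialisation. The additive form, the initialisation from the first two seasons, the smoothing parameters, and the quarterly demand figures are all assumptions made for illustration; the cited studies may have used a different variant.

```python
def holt_winters_additive(series, season_length, alpha, beta, gamma, horizon):
    """Additive Holt-Winters forecast for the next `horizon` periods.

    Uses a simple initialisation that needs at least two full seasons of data.
    """
    m = season_length
    if len(series) < 2 * m:
        raise ValueError("need at least two full seasons of data")

    first, second = series[:m], series[m:2 * m]
    level = sum(first) / m                         # mean of the first season
    trend = (sum(second) - sum(first)) / (m * m)   # average change per period
    seasonal = [x - level for x in first]          # deviations from that mean

    for t in range(m, len(series)):
        observed = series[t]
        prev_level = level
        s = seasonal[t % m]  # seasonal index estimated one season ago
        level = alpha * (observed - s) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[t % m] = gamma * (observed - level) + (1 - gamma) * s

    n = len(series)
    return [level + h * trend + seasonal[(n + h - 1) % m]
            for h in range(1, horizon + 1)]


# Hypothetical quarterly demand over three years (season length of 4).
quarterly_demand = [30, 45, 60, 40, 34, 50, 66, 44, 38, 55, 72, 48]

# The horizon is kept to one seasonal cycle, in line with the caution above.
next_year = holt_winters_additive(quarterly_demand, season_length=4,
                                  alpha=0.4, beta=0.2, gamma=0.3, horizon=4)
print([round(value, 1) for value in next_year])
```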

Clustering analysis

Clustering analysis is a data analysis approach that partitions a group of data objects into subgroups based on their similarities. Several applications of clustering analysis have been reported in business analytics, pattern recognition, and web development [ 48 ]. Han et al. [ 48 ] have emphasized that, using clustering, customers can be organized into groups (clusters) such that customers within a group present similar characteristics.

A key target of demand forecasting is to identify the demand behavior of customers. Extraction of similar behavior from historical data leads to recognition of customer clusters or segments. Clustering algorithms such as K-means, self-organizing maps (SOMs), and fuzzy clustering have been used to segment similar customers with respect to their behavior. Clustering enhances the accuracy of SC demand forecasting, as the predictions are established for each segment comprised of similar customers. As a limitation, clustering methods tend to identify customers that do not follow a pattern as outliers [ 74 , 77 ].
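Customer segmentation with K-means can be sketched as below. This is a basic implementation of Lloyd's algorithm written for illustration; the customer features (orders per month and average order size), the data values, and the choice of two clusters are hypothetical, and features would normally be scaled before clustering.

```python
import math
import random


def kmeans(points, k, init=None, max_iterations=100, seed=0):
    """Basic K-means (Lloyd's algorithm) on a list of numeric tuples.

    `init` optionally supplies the k starting centroids; otherwise they are
    sampled from the data with a seeded random generator.
    """
    if init is not None:
        centroids = list(init)
    else:
        centroids = random.Random(seed).sample(points, k)
    assignments = None
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, assignments


# Hypothetical customers: (orders per month, average order size in units).
customers = [(2, 5), (3, 4), (2, 6), (12, 40), (14, 38), (13, 42)]

centroids, segments = kmeans(customers, k=2)
for customer, segment in zip(customers, segments):
    print(customer, "-> segment", segment)
```

Once the segments are known, a separate forecast can be fitted to the aggregated demand of each segment, which is the per-segment forecasting idea described above.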

Hierarchical forecasts of sales data are performed by clustering and categorization of sales patterns. Multivariate ARIMA models have been used in demand forecasting based on point-of-sales data in industrial bakery chains [ 19 ]. These bakery goods are ordered and clustered daily, with a continuous need for demand forecasts in order to avoid both shortage and waste [ 19 ]. Fuel demand forecasting in thermal power plants is another domain with applications of clustering methods. Electricity consumption patterns are derived using a clustering of consumers, and on that basis, demand for the required fuel is established [ 77 ].

K-nearest-neighbor (KNN)

KNN is a method of classification that has been widely used for pattern recognition. The KNN algorithm identifies the similarity of a given object to the surrounding objects (called tuples) by generating a similarity index. These tuples are described by n attributes. Thus, each tuple corresponds to a point in an n-dimensional space. The KNN algorithm searches for the k tuples that are closest to a given tuple [ 48 ]. These similarity-based classifications lead to the formation of clusters containing similar objects. KNN can also be integrated into regression analysis problems [ 78 ] for dimensionality reduction of the data [ 79 ]. In the realm of demand forecasting in SC, Nikolopoulos et al. [ 80 ] applied KNN for forecasting sporadic demand in an automotive spare parts supply chain. In another study, KNN was used to forecast future trends of demand for Walmart’s supply chain planning [ 81 ].
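The nearest-tuple search described above can be turned into a forecast by averaging the demand observed for the k closest historical tuples. The sketch below shows this regression-style use of KNN under stated assumptions: the feature pair (week of year, promotion flag), the demand values, and k = 3 are hypothetical, Euclidean distance is assumed as the similarity index, and features are left unscaled for brevity.

```python
import math


def knn_forecast(history, query, k=3):
    """Predict demand for `query` as the mean demand of its k nearest tuples.

    `history` is a list of (features, demand) pairs, where `features` is a
    tuple of n numeric attributes (a point in n-dimensional space).
    """
    if k <= 0 or k > len(history):
        raise ValueError("k must be between 1 and len(history)")
    # Rank historical tuples by Euclidean distance to the query tuple.
    neighbours = sorted(history, key=lambda item: math.dist(item[0], query))[:k]
    return sum(demand for _, demand in neighbours) / k


# Hypothetical history: ((week of year, promotion flag), units demanded).
history = [
    ((10, 0), 120), ((11, 0), 125), ((12, 1), 180),
    ((13, 0), 118), ((14, 1), 175), ((15, 0), 130),
]

prediction = knn_forecast(history, query=(16, 1), k=3)
print(f"Forecast for week 16 with a promotion: {prediction:.1f} units")
```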

Artificial neural networks

In artificial neural networks, a set of neurons (input/output units) are connected to one another in different layers in order to establish a mapping of the inputs to outputs by finding the underlying correlations between them. The configuration of such networks could become a complex problem, due to a high number of layers and neurons as well as the variability of their types (linear or nonlinear), and needs to be established through a data-driven learning process. In doing so, each unit (neuron) corresponds to a weight that is tuned through a training step [ 48 ]. In the end, a weighted network with a minimum number of neurons that can map the inputs to outputs with a minimum fitting error (deviation) is identified.

As the literature reveals, artificial neural networks (ANN) are widely applied for demand forecasting [ 82 , 83 , 84 , 85 ]. To improve the accuracy of ANN-based demand predictions, Liu et al. [ 86 ] proposed a combination of a grey model and a stacked auto encoder applied to a case study of predicting demand in a Brazilian logistics company subject to transportation disruption [ 87 ]. Amirkolaii et al. [ 88 ] applied neural networks in forecasting spare parts demand to minimize supply chain shortages. In this case of spare parts supply chain, although there were multiple suppliers to satisfy demand for a variety of spare parts, the demand was subject to high variability due to a varying number of customers and their varying needs. Their proposed ANN-based forecasting approach included (1) 1 input demand feature with 1 Stock-Keeping Unit (SKU), (2) 1 input demand feature with all SKUs, (3) 16 input demand features with 1 SKU, and (4) 16 input demand features with all SKUs. They applied neural networks with back propagation and compared the results with a number of benchmarks reporting a Mean Square Error (MSE) for each configuration scenario.

Huang et al. [ 89 ] compared a backpropagation (BP) neural network and a linear regression analysis for forecasting of e-logistics demand in urban and rural areas in China using data from 1997 to 2015. By comparing mean absolute error (MAE) and the average relative errors of backpropagation neural network and linear regression, they showed that backpropagation neural networks could reach higher accuracy (reflecting lower differences between predicted and actual data). This is due to the fact that a Sigmoid function was used as the transfer function in the hidden layer of BP, which is differentiable for nonlinear problems such as the one presented in their case study, whereas the linear regression works well with linear problems.
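To illustrate what a BP network with a Sigmoid hidden layer involves, the following is a minimal sketch of a one-input network trained by full-batch gradient descent. It is not the model used in the cited study: the network size, learning rate, number of epochs, and the scaled demand curve are hypothetical choices made only to show the forward and backward passes.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_bp_network(xs, ys, hidden=4, epochs=3000, lr=0.1, seed=0):
    """Train a 1-input, 1-output network with one sigmoid hidden layer.

    Uses full-batch gradient descent on half the mean squared error and
    returns a function that maps an input to a prediction.
    """
    rng = random.Random(seed)
    w1 = [rng.uniform(-1, 1) for _ in range(hidden)]  # input -> hidden weights
    b1 = [rng.uniform(-1, 1) for _ in range(hidden)]  # hidden biases
    w2 = [rng.uniform(-1, 1) for _ in range(hidden)]  # hidden -> output weights
    b2 = 0.0                                          # output bias
    n = len(xs)

    for _ in range(epochs):
        gw1, gb1, gw2, gb2 = [0.0] * hidden, [0.0] * hidden, [0.0] * hidden, 0.0
        for x, y in zip(xs, ys):
            # Forward pass.
            h = [sigmoid(w1[j] * x + b1[j]) for j in range(hidden)]
            out = sum(w2[j] * h[j] for j in range(hidden)) + b2
            err = out - y
            # Backward pass: propagate the error through the linear output
            # and the sigmoid hidden layer (derivative h * (1 - h)).
            gb2 += err
            for j in range(hidden):
                gw2[j] += err * h[j]
                delta = err * w2[j] * h[j] * (1.0 - h[j])
                gw1[j] += delta * x
                gb1[j] += delta
        # Gradient step using the mean gradient over the batch.
        for j in range(hidden):
            w1[j] -= lr * gw1[j] / n
            b1[j] -= lr * gb1[j] / n
            w2[j] -= lr * gw2[j] / n
        b2 -= lr * gb2 / n

    def predict(x):
        h = [sigmoid(w1[j] * x + b1[j]) for j in range(hidden)]
        return sum(w2[j] * h[j] for j in range(hidden)) + b2

    return predict


# Hypothetical demand curve, with time and demand both scaled to [0, 1].
scaled_time = [i / 9 for i in range(10)]
scaled_demand = [0.05, 0.08, 0.15, 0.30, 0.52, 0.71, 0.84, 0.91, 0.95, 0.97]

model = train_bp_network(scaled_time, scaled_demand)
mae = sum(abs(model(x) - y) for x, y in zip(scaled_time, scaled_demand)) / 10
print(f"Training MAE of the BP network: {mae:.4f}")
```

The same MAE can be computed for a fitted straight line on the same data to reproduce, in miniature, the kind of comparison reported above.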

ANNs have also been applied in demand forecasting for server models with one-week demand prediction ahead of order arrivals. In this regard, Saha et al. [ 90 ] proposed an ANN-based forecasting model using 52-week time-series data fitted through both BP and Radial Basis Function (RBF) networks. An RBF network is similar to a BP network except for the activation/transfer function in RBF that follows a feed-forward process using a radial basis function. RBF results in faster training and convergence to ANN weights in comparison with BP networks without compromising the forecasting precision.

Researchers have combined ANN-based machine-learning algorithms with optimization models to draw optimal courses of action, strategies, or decisions for the future. Chang et al. [ 91 ] employed a genetic algorithm in the training phase of a neural network using sales/supply chain data in the printed circuit board industry in Taiwan and presented an evolving neural network-forecasting model. They proposed the use of a genetic algorithm (GA)-based cost function optimization to arrive at the best configuration of the corresponding neural network for sales forecasting with respect to prediction precision. The proposed model was then compared to back-propagation and linear regression approaches using three performance indices, namely MAPE, Mean Absolute Deviation (MAD), and Total Cost Deviation (TCD), demonstrating its superior prediction precision.

Regression analysis

Regression models are used to generate continuous-valued functions utilized for prediction. These methods are used to predict the value of a response (dependent) variable with respect to one or more predictor (independent) variables. There are various forms of regression analysis, such as linear, multiple, weighted, symbolic (random), polynomial, nonparametric, and robust. The latter approach is useful when errors fail to satisfy normality conditions or when we deal with big data that could contain a significant number of outliers [ 48 ].

Merkuryeva et al. [ 92 ] analyzed three prediction approaches for demand forecasting in the pharmaceutical industry: a simple moving average model, multiple linear regressions, and a symbolic regression with searches conducted through an evolutionary genetic programming. In this experiment, symbolic regression exhibited the best fit with the lowest error.

As perishable products must be sold due to a very short preservation time, demand forecasting for this type of products has drawn increasing attention. Yang and Sutrisno [ 93 ] applied and compared regression analysis and neural network techniques to derive demand forecasts for perishable goods. They concluded that accurate daily forecasts are achievable with knowledge of sales numbers in the first few hours of the day using either of the above methods.
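The idea of predicting a full day's demand from sales in the first few hours can be sketched with an ordinary least squares fit on a single predictor. The figures below are hypothetical and the single-predictor linear model is an assumption chosen for brevity, not the model specification used in the cited study.

```python
def fit_simple_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical history: units sold by 10 a.m. and total units sold that day.
morning_sales = [18, 25, 22, 30, 15, 27]
daily_sales = [95, 130, 118, 160, 80, 140]

intercept, slope = fit_simple_regression(morning_sales, daily_sales)

# Forecast today's total demand from this morning's sales so far.
sales_so_far = 24
forecast = intercept + slope * sales_so_far
print(f"Expected total demand today: {forecast:.0f} units")
```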

Support vector machine (SVM)

SVM is an algorithm that uses a nonlinear mapping to transform a set of training data into a higher dimension (data classes). SVM searches for an optimal separating hyper-plane that can separate the resulting classes from one another [ 48 ]. Villegas et al. [ 94 ] tested the applicability of SVMs for demand forecasting in household and personal care SCs with a dataset comprised of 229 weekly demand series in the UK. Wu [ 95 ] applied an SVM, using particle swarm optimization (PSO) to search for the best separating hyper-plane, classifying the data related to car sales and forecasting the demand in each cluster.

Support vector regression (SVR)

Problems with continuous-valued outputs can be solved by support vector regression (SVR), which is a regression implementation of SVM. The main idea behind SVR is the computation of a linear regression function within a high-dimensional feature space. SVR has been applied in financial/cost prediction problems, handwritten digit recognition, speaker identification, object recognition, etc. [ 48 ].

Guanghui [ 96 ] used the SVR method for SC needs prediction. The use of SVR in demand forecasting can yield a lower mean square error than RBF neural networks due to the fact that the optimization (cost) function in SVR does not penalize points that lie within a margin of distance from the fitted function. Therefore, this method leads to higher forecast accuracy, although, similar to SVM, it is only applicable to a two-class problem (such as normal versus anomaly detection/estimation problems). Sarhani and El Afia [ 97 ] sought to forecast SC demand using SVR and applied particle swarm optimization (PSO) and GA to optimize SVR parameters. SVR-PSO and SVR-GA approaches were compared with respect to accuracy of predictions using MAPE. The results showed a superior performance by PSO in terms of time intensity and MAPE when configuring the SVR parameters.
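In the standard SVR formulation, residuals smaller than a margin ε incur no cost, and larger residuals are penalized only by the amount they exceed ε. A minimal sketch of this ε-insensitive loss follows; the demand values and the margin of 3 units are hypothetical, and the snippet shows only the loss function, not a full SVR solver.

```python
def epsilon_insensitive_loss(actual, predicted, epsilon):
    """Mean epsilon-insensitive loss: errors within +/- epsilon cost nothing."""
    penalties = [max(0.0, abs(a - p) - epsilon) for a, p in zip(actual, predicted)]
    return sum(penalties) / len(penalties)


# Hypothetical weekly demand and model predictions (units).
actual_demand = [100, 100, 100, 100]
predicted_demand = [102, 95, 100, 110]

# With a margin of 3 units, only the errors of 5 and 10 units are penalized.
loss = epsilon_insensitive_loss(actual_demand, predicted_demand, epsilon=3)
print(f"Mean epsilon-insensitive loss: {loss:.2f}")
```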

Mixed approaches

Some works in the literature have used a combination of the aforementioned techniques. In these studies, the data flow into a sequence of algorithms and the outputs of one stage become inputs of the next step. The outputs are explanatory in the form of qualitative and quantitative information with a sequence of useful information extracted out of each algorithm. Examples of such studies include [ 15 , 98 , 99 , 100 , 101 , 102 , 103 , 104 , 105 ].

In more complex supply chains with several points of supply, different warehouses, varied customers, and several products, demand forecasting becomes a high-dimensional problem. To address this issue, Islek and Oguducu [ 100 ] applied a clustering technique, called bipartite graph clustering, to analyze the patterns of sales for different products. Then, they combined a moving average model and a Bayesian belief network approach to improve the accuracy of demand forecasting for each cluster. Kilimci et al. [ 101 ] developed an intelligent demand forecasting system by applying time-series and regression methods, a support vector regression algorithm, and a deep learning model in a sequence. They dealt with a case involving a large amount of data accounting for 155 features over 875 million records. First, they used a principal component analysis for dimension reduction. Then, data clustering was performed. This was followed by demand forecasting for each cluster using a novel decision integration strategy called boosting ensemble. They concluded that the combination of a deep neural network with a boosting strategy yielded the best accuracy, minimizing the prediction error for demand forecasting.

Chen and Lu [ 98 ] combined clustering algorithms of SOM, a growing hierarchical self-organizing mapping (GHSOM), and K-means, with two machine-learning techniques of SVR and extreme learning machine (ELM) in sales forecasting of computers. The authors found that the combination of GHSOM and ELM yielded better accuracy and performance in demand forecasts for their computer retailing case study. Difficulties in forecasting also occur in cases with high product variety. For these types of products in an SC, patterns of sales can be extracted for clustered products. Then, for each cluster, a machine-learning technique, such as SVR, can be employed to further improve the prediction accuracy [ 104 ].

Brentan et al. [ 106 ] used and analyzed various BDA techniques for demand prediction, including support vector machines (SVM) and adaptive neural fuzzy inference systems (ANFIS). They combined the predicted values derived from each machine-learning technique, using a linear regression process to arrive at an average prediction value adopted as the benchmark forecast. The performance (accuracy) of each technique was then analyzed with respect to the root mean square error (RMSE) and MAE values obtained through comparing the target values and the predicted ones.
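The evaluation step described above can be sketched as follows. As a simplification, the forecasts of the individual models are combined here with an equal-weight average rather than the regression-based combination used in the cited study; the model names and all numbers are hypothetical.

```python
import math


def combine_forecasts(forecasts_by_model):
    """Equal-weight average of several models' forecasts, period by period."""
    return [sum(values) / len(values) for values in zip(*forecasts_by_model)]


def rmse(actual, forecast):
    """Root mean square error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical target demand and forecasts from two techniques.
target = [50, 55, 60, 58, 62]
forecasts = {
    "SVM": [48, 57, 59, 60, 61],
    "ANFIS": [52, 54, 63, 56, 65],
}
forecasts["combined"] = combine_forecasts(list(forecasts.values()))

for name, values in forecasts.items():
    print(f"{name}: RMSE={rmse(target, values):.2f}, MAE={mae(target, values):.2f}")
```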

In summary, Table  3 provides an overview of the recent literature on the application of Predictive BDA in demand forecasting.

Discussions

The data produced in SCs contain a great deal of useful knowledge. Analysis of such massive data can help us to forecast trends of customer behavior, markets, prices, and so on. This can help organizations better adapt to competitive environments. To forecast demand in an SC, in the presence of big data, different predictive BDA algorithms have been used. These algorithms could provide predictive analytics using time-series approaches, auto-regressive methods, and associative forecasting methods [ 10 ]. The demand forecasts from these BDA methods could be integrated with product design attributes as well as with online search traffic mapping to incorporate customer and price information [ 37 , 71 ].

Predictive BDA algorithms

Most of the studies examined developed and used a specific data-mining algorithm for their case studies. However, there are very few comparative studies available in the literature to provide a benchmark for understanding the advantages and disadvantages of these methodologies. Additionally, as depicted by Table 3, there is no clear relationship between the choice of the BDA algorithm/method and the application domain or category.

Predictive BDA applicability

Most data-driven models used in the literature consider historical data. Such a backward-looking forecasting ignores the new trends and highs and lows in different economic environments. Also, organizational factors, such as reputation and marketing strategies, as well as internal risks (related to availability of SCM resources), could greatly influence the demand [ 107 ] and thus contribute to inaccuracy of BDA-based demand predictions using historical data. Incorporating existing driving factors outside the historical data, such as economic instability, inflation, and purchasing power, could help adjust the predictions with respect to unseen future scenarios of demand. Combining predictive algorithms with optimization or simulation can equip the models with prescriptive capabilities in response to future scenarios and expectations.

Predictive BDA in closed-loop supply chains (CLSC)

The combination of forward and reverse flows of material in an SC is referred to as a closed-loop supply chain (CLSC). A CLSC is a more complex system than a traditional SC because it comprises the forward and reverse SC simultaneously [ 108 ]. Economic impact, environmental impact, and social responsibility are three significant factors in designing a CLSC network that includes product recycling, remanufacturing, and refurbishment functions. The complexity of a CLSC, compared to a common SC, results from the coordination between backward and forward flows. For example, transportation cost, holding cost, and demand forecasting are challenging issues because of uncertainties in the information flows from the forward chain to the reverse one. In addition, uncertainties about the rate of returned products and the efficiencies of recycling, remanufacturing, and refurbishment functions are some of the main barriers to establishing predictions for the reverse flow [ 5 , 6 , 109 ]. As such, one key finding from this literature survey is that CLSCs particularly suffer from a lack of quality data for remanufacturing. Remanufacturing refers to the disassembly, cleaning, inspection, storage, reconditioning, replacement, and reassembly of products. As a result of these data deficiencies, optimal scheduling of remanufacturing functions is cumbersome, owing to uncertainties in the quality and quantity of used products as well as in the timing of returns and delivery delays.

IoT-based approaches can overcome the difficulties of collecting data in a CLSC. In an IoT environment, objects are monitored and controlled remotely across existing network infrastructures. This enables more direct integration between the physical world and computer-based systems. The results include improved efficiency, accuracy, and economic benefit across SCs [ 50 , 54 , 110 ].

Radio frequency identification (RFID) is another technology that has become very popular in SCs. RFID can be used for automation of processes in an SC, and it is useful for coordination of forecasts in CLSCs with dispersed points of return and varied quantities and qualities of returned used products [ 10 , 111 , 112 , 113 , 114 ].

Conclusions

The growing need for customer behavior analysis and demand forecasting is driven by globalization and increasing market competition, as well as by the surge in supply chain digitization practices. In this study, we performed a thorough review of the applications of predictive big data analytics (BDA) in SC demand forecasting. The survey gave an overview of the BDA methods applied to supply chain demand forecasting and provided a comparative categorization of them. We collected and analyzed these studies with respect to the methods and techniques used in demand prediction. Seven mainstream techniques were identified and studied along with their pros and cons. Neural networks and regression analysis were observed to be the two most frequently employed techniques. The review also pointed to the fact that optimization models or simulation can be used to improve the accuracy of forecasting by formulating and optimizing a cost function for fitting the predictions to the data.
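
The last point, fitting a forecasting model by optimizing a cost function, can be sketched as follows. The example tunes the smoothing parameter of simple exponential smoothing by minimizing the mean square error (MSE) of one-step-ahead forecasts over a grid of candidate values. Simple exponential smoothing and grid search are illustrative choices rather than the specific models used in the reviewed studies, and the function names and data are hypothetical.

```python
def ses_one_step_forecasts(series, alpha):
    """One-step-ahead forecasts from simple exponential smoothing.

    The forecast for the first period is initialised to the first observation.
    """
    forecasts = [series[0]]
    for actual in series[:-1]:
        forecasts.append(alpha * actual + (1 - alpha) * forecasts[-1])
    return forecasts


def mse(series, forecasts):
    """Mean square error between observations and their forecasts."""
    return sum((a - f) ** 2 for a, f in zip(series, forecasts)) / len(series)


def fit_alpha(series, grid_size=101):
    """Return the alpha in [0, 1] that minimizes the MSE cost function."""
    candidates = [i / (grid_size - 1) for i in range(grid_size)]
    return min(candidates,
               key=lambda a: mse(series, ses_one_step_forecasts(series, a)))


# Hypothetical demand history with a steady upward trend.
demand = [1, 2, 3, 4, 5, 6]
best_alpha = fit_alpha(demand)
print(best_alpha, mse(demand, ses_one_step_forecasts(demand, best_alpha)))
```

The same pattern applies to other cost functions, such as MAE or MAPE, and to models with more parameters, where a numerical optimizer or a metaheuristic would typically replace the grid search.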

One key finding from reviewing the existing literature is that very limited research has been conducted on the applications of BDA in CLSCs and reverse logistics. There are key benefits in adopting a data-driven approach for the design and management of CLSCs. Due to increasing environmental awareness and government incentives, a vast quantity of returned (used) products, of various types and conditions, is now collected, received, and sorted at many collection points. The resulting uncertainties have a direct impact on the cost-efficiency of remanufacturing processes, the final price of the refurbished products, and the demand for these products [ 115 ]. As such, the design and operation of CLSCs present a case for big data analytics from both supply and demand forecasting perspectives.

Availability of data and materials

The paper presents a review of the literature extracted from main scientific databases without presenting data.

Abbreviations

Adaptive neural fuzzy inference systems

Auto regressive integrated moving average

Artificial neural network

Big data analytics

Backpropagation

Closed-loop supply chain

Extreme learning machine

Enterprise resource planning

Genetic algorithms

Growing hierarchical self-organizing map

Holt-winters

Internet of things

K-nearest-neighbor

Mean absolute deviation

Mean absolute error

Mean absolute percentage error

Mean square error

Mean square root error

Radial basis function

Particle swarm optimization

Self-organizing maps

Stock-keeping unit

Supply chain analytics

Supply chain

Supply chain management

Support vector machine

Support vector regression

Total cost deviation

Theil inequality index

References

You Z, Si Y-W, Zhang D, Zeng X, Leung SCH, Li T. A decision-making framework for precision marketing. Expert Syst Appl. 2015;42(7):3357–67. https://doi.org/10.1016/J.ESWA.2014.12.022 .

Guo ZX, Wong WK, Li M. A multivariate intelligent decision-making model for retail sales forecasting. Decis Support Syst. 2013;55(1):247–55. https://doi.org/10.1016/J.DSS.2013.01.026 .

Wei J-T, Lee M-C, Chen H-K, Wu H-H. Customer relationship management in the hairdressing industry: an application of data mining techniques. Expert Syst Appl. 2013;40(18):7513–8. https://doi.org/10.1016/J.ESWA.2013.07.053 .

Lu LX, Swaminathan JM. Supply chain management. Int Encycl Soc Behav Sci. 2015. https://doi.org/10.1016/B978-0-08-097086-8.73032-7 .

Gholizadeh H, Tajdin A, Javadian N. A closed-loop supply chain robust optimization for disposable appliances. Neural Comput Appl. 2018. https://doi.org/10.1007/s00521-018-3847-9 .

Tosarkani BM, Amin SH. A possibilistic solution to configure a battery closed-loop supply chain: multi-objective approach. Expert Syst Appl. 2018;92:12–26. https://doi.org/10.1016/J.ESWA.2017.09.039 .

Blackburn R, Lurz K, Priese B, Göb R, Darkow IL. A predictive analytics approach for demand forecasting in the process industry. Int Trans Oper Res. 2015;22(3):407–28. https://doi.org/10.1111/itor.12122 .

Boulaksil Y. Safety stock placement in supply chains with demand forecast updates. Oper Res Perspect. 2016;3:27–31. https://doi.org/10.1016/J.ORP.2016.07.001 .

Tang CS. Perspectives in supply chain risk management. Int J Prod Econ. 2006;103(2):451–88. https://doi.org/10.1016/J.IJPE.2005.12.006 .

Wang G, Gunasekaran A, Ngai EWT, Papadopoulos T. Big data analytics in logistics and supply chain management: certain investigations for research and applications. Int J Prod Econ. 2016;176:98–110. https://doi.org/10.1016/J.IJPE.2016.03.014 .

Awwad M, Kulkarni P, Bapna R, Marathe A. Big data analytics in supply chain: a literature review. In: Proceedings of the international conference on industrial engineering and operations management, 2018(SEP); 2018, p. 418–25.

Büyüközkan G, Göçer F. Digital Supply Chain: literature review and a proposed framework for future research. Comput Ind. 2018;97:157–77.

Kshetri N. Blockchain’s roles in meeting key supply chain management objectives. Int J Inf Manage. 2018;39:80–9.

Michna Z, Disney SM, Nielsen P. The impact of stochastic lead times on the bullwhip effect under correlated demand and moving average forecasts. Omega. 2019. https://doi.org/10.1016/J.OMEGA.2019.02.002 .

Zhu Y, Zhao Y, Zhang J, Geng N, Huang D. Spring onion seed demand forecasting using a hybrid Holt-Winters and support vector machine model. PLoS ONE. 2019;14(7):e0219889. https://doi.org/10.1371/journal.pone.0219889 .

Govindan K, Cheng TCE, Mishra N, Shukla N. Big data analytics and application for logistics and supply chain management. Transport Res Part E Logist Transport Rev. 2018;114:343–9. https://doi.org/10.1016/J.TRE.2018.03.011 .

Bohanec M, Kljajić Borštnar M, Robnik-Šikonja M. Explaining machine learning models in sales predictions. Expert Syst Appl. 2017;71:416–28. https://doi.org/10.1016/J.ESWA.2016.11.010 .

Constante F, Silva F, Pereira A. DataCo smart supply chain for big data analysis. Mendeley Data. 2019. https://doi.org/10.17632/8gx2fvg2k6.5 .

Huber J, Gossmann A, Stuckenschmidt H. Cluster-based hierarchical demand forecasting for perishable goods. Expert Syst Appl. 2017;76:140–51. https://doi.org/10.1016/J.ESWA.2017.01.022 .

Ali MM, Babai MZ, Boylan JE, Syntetos AA. Supply chain forecasting when information is not shared. Eur J Oper Res. 2017;260(3):984–94. https://doi.org/10.1016/J.EJOR.2016.11.046 .

Bian W, Shang J, Zhang J. Two-way information sharing under supply chain competition. Int J Prod Econ. 2016;178:82–94. https://doi.org/10.1016/J.IJPE.2016.04.025 .

Mourtzis D. Challenges and future perspectives for the life cycle of manufacturing networks in the mass customisation era. Logist Res. 2016;9(1):2.

Nguyen T, Zhou L, Spiegler V, Ieromonachou P, Lin Y. Big data analytics in supply chain management: a state-of-the-art literature review. Comput Oper Res. 2018;98:254–64. https://doi.org/10.1016/J.COR.2017.07.004 .

Choi Y, Lee H, Irani Z. Big data-driven fuzzy cognitive map for prioritising IT service procurement in the public sector. Ann Oper Res. 2018;270(1–2):75–104. https://doi.org/10.1007/s10479-016-2281-6 .

Huang YY, Handfield RB. Measuring the benefits of erp on supply management maturity model: a “big data” method. Int J Oper Prod Manage. 2015;35(1):2–25. https://doi.org/10.1108/IJOPM-07-2013-0341 .

Miroslav M, Miloš M, Velimir Š, Božo D, Đorđe L. Semantic technologies on the mission: preventing corruption in public procurement. Comput Ind. 2014;65(5):878–90. https://doi.org/10.1016/J.COMPIND.2014.02.003 .

Zhang Y, Ren S, Liu Y, Si S. A big data analytics architecture for cleaner manufacturing and maintenance processes of complex products. J Clean Prod. 2017;142:626–41. https://doi.org/10.1016/J.JCLEPRO.2016.07.123 .

Shu Y, Ming L, Cheng F, Zhang Z, Zhao J. Abnormal situation management: challenges and opportunities in the big data era. Comput Chem Eng. 2016;91:104–13. https://doi.org/10.1016/J.COMPCHEMENG.2016.04.011 .

Krumeich J, Werth D, Loos P. Prescriptive control of business processes: new potentials through predictive analytics of big data in the process manufacturing industry. Bus Inform Syst Eng. 2016;58(4):261–80. https://doi.org/10.1007/s12599-015-0412-2 .

Guo SY, Ding LY, Luo HB, Jiang XY. A Big-Data-based platform of workers’ behavior: observations from the field. Accid Anal Prev. 2016;93:299–309. https://doi.org/10.1016/J.AAP.2015.09.024 .

Chuang Y-F, Chia S-H, Wong J-Y. Enhancing order-picking efficiency through data mining and assignment approaches. WSEAS Transactions on Business and Economics. 2014;11(1):52–64.

Ballestín F, Pérez Á, Lino P, Quintanilla S, Valls V. Static and dynamic policies with RFID for the scheduling of retrieval and storage warehouse operations. Comput Ind Eng. 2013;66(4):696–709. https://doi.org/10.1016/J.CIE.2013.09.020 .

Alyahya S, Wang Q, Bennett N. Application and integration of an RFID-enabled warehousing management system—a feasibility study. J Ind Inform Integr. 2016;4:15–25. https://doi.org/10.1016/J.JII.2016.08.001 .

Cui J, Liu F, Hu J, Janssens D, Wets G, Cools M. Identifying mismatch between urban travel demand and transport network services using GPS data: a case study in the fast growing Chinese city of Harbin. Neurocomputing. 2016;181:4–18. https://doi.org/10.1016/J.NEUCOM.2015.08.100 .

Shan Z, Zhu Q. Camera location for real-time traffic state estimation in urban road network using big GPS data. Neurocomputing. 2015;169:134–43. https://doi.org/10.1016/J.NEUCOM.2014.11.093 .

Ting SL, Tse YK, Ho GTS, Chung SH, Pang G. Mining logistics data to assure the quality in a sustainable food supply chain: a case in the red wine industry. Int J Prod Econ. 2014;152:200–9. https://doi.org/10.1016/J.IJPE.2013.12.010 .

Jun S-P, Park D-H, Yeom J. The possibility of using search traffic information to explore consumer product attitudes and forecast consumer preference. Technol Forecast Soc Chang. 2014;86:237–53. https://doi.org/10.1016/J.TECHFORE.2013.10.021 .

He W, Wu H, Yan G, Akula V, Shen J. A novel social media competitive analytics framework with sentiment benchmarks. Inform Manage. 2015;52(7):801–12. https://doi.org/10.1016/J.IM.2015.04.006 .

Marine-Roig E, Anton Clavé S. Tourism analytics with massive user-generated content: a case study of Barcelona. J Destination Market Manage. 2015;4(3):162–72. https://doi.org/10.1016/J.JDMM.2015.06.004 .

Carbonneau R, Laframboise K, Vahidov R. Application of machine learning techniques for supply chain demand forecasting. Eur J Oper Res. 2008;184(3):1140–54. https://doi.org/10.1016/J.EJOR.2006.12.004 .

Munir K. Cloud computing and big data: technologies, applications and security, vol. 49. Berlin: Springer; 2019.

Rostami-Tabar B, Babai MZ, Ali M, Boylan JE. The impact of temporal aggregation on supply chains with ARMA(1,1) demand processes. Eur J Oper Res. 2019;273(3):920–32. https://doi.org/10.1016/J.EJOR.2018.09.010 .

Beyer MA, Laney D. The importance of ‘big data’: a definition. Stamford: Gartner; 2012. p. 2014–8.

Benabdellah AC, Benghabrit A, Bouhaddou I, Zemmouri EM. Big data for supply chain management: opportunities and challenges. In: Proceedings of IEEE/ACS international conference on computer systems and applications, AICCSA, no. 11, p. 20–26; 2016. https://doi.org/10.1109/AICCSA.2016.7945828 .

Kumar M. Applied big data analytics in operations management. Appl Big Data Anal Oper Manage. 2016. https://doi.org/10.4018/978-1-5225-0886-1 .

Zhong RY, Huang GQ, Lan S, Dai QY, Chen X, Zhang T. A big data approach for logistics trajectory discovery from RFID-enabled production data. Int J Prod Econ. 2015;165:260–72. https://doi.org/10.1016/J.IJPE.2015.02.014 .

Varela IR, Tjahjono B. Big data analytics in supply chain management: trends and related research. In: 6th international conference on operations and supply chain management, vol. 1, no. 1, p. 2013–4; 2014. https://doi.org/10.13140/RG.2.1.4935.2563 .

Han J, Kamber M, Pei J. Data mining: concepts and techniques. Burlington: Morgan Kaufmann Publishers; 2013. https://doi.org/10.1016/B978-0-12-381479-1.00001-0 .

Arunachalam D, Kumar N. Benefit-based consumer segmentation and performance evaluation of clustering approaches: an evidence of data-driven decision-making. Expert Syst Appl. 2018;111:11–34. https://doi.org/10.1016/J.ESWA.2018.03.007 .

Chase CW. Next generation demand management: people, process, analytics, and technology. Hoboken: Wiley; 2016.

SAS Institute. Demand-driven forecasting and planning: take responsiveness to the next level. 13; 2014. https://www.sas.com/content/dam/SAS/en_us/doc/whitepaper2/demand-driven-forecasting-planning-107477.pdf .

Acar Y, Gardner ES. Forecasting method selection in a global supply chain. Int J Forecast. 2012;28(4):842–8. https://doi.org/10.1016/J.IJFORECAST.2011.11.003 .

Ma S, Fildes R, Huang T. Demand forecasting with high dimensional data: the case of SKU retail sales forecasting with intra- and inter-category promotional information. Eur J Oper Res. 2016;249(1):245–57. https://doi.org/10.1016/J.EJOR.2015.08.029 .

Addo-Tenkorang R, Helo PT. Big data applications in operations/supply-chain management: a literature review. Comput Ind Eng. 2016;101:528–43. https://doi.org/10.1016/J.CIE.2016.09.023 .

Agrawal S, Singh RK, Murtaza Q. A literature review and perspectives in reverse logistics. Resour Conserv Recycl. 2015;97:76–92. https://doi.org/10.1016/J.RESCONREC.2015.02.009 .

Gunasekaran A, Kumar Tiwari M, Dubey R, Fosso Wamba S. Big data and predictive analytics applications in supply chain management. Comput Ind Eng. 2016;101:525–7. https://doi.org/10.1016/J.CIE.2016.10.020 .

Hazen BT, Skipper JB, Ezell JD, Boone CA. Big data and predictive analytics for supply chain sustainability: a theory-driven research agenda. Comput Ind Eng. 2016;101:592–8. https://doi.org/10.1016/J.CIE.2016.06.030 .

Hofmann E, Rutschmann E. Big data analytics and demand forecasting in supply chains: a conceptual analysis. Int J Logist Manage. 2018;29(2):739–66. https://doi.org/10.1108/IJLM-04-2017-0088 .

Jain A, Sanders NR. Forecasting sales in the supply chain: consumer analytics in the big data era. Int J Forecast. 2019;35(1):170–80. https://doi.org/10.1016/J.IJFORECAST.2018.09.003 .

Jin J, Liu Y, Ji P, Kwong CK. Review on recent advances in information mining from big consumer opinion data for product design. J Comput Inf Sci Eng. 2018;19(1):010801. https://doi.org/10.1115/1.4041087 .

Kumar R, Mahto D. Industrial forecasting support systems and technologies in practice: a review. Glob J Res Eng. 2013;13(4):17–33.

Mishra D, Gunasekaran A, Papadopoulos T, Childe SJ. Big Data and supply chain management: a review and bibliometric analysis. Ann Oper Res. 2016;270(1):313–36. https://doi.org/10.1007/s10479-016-2236-y .

Ren S, Zhang Y, Liu Y, Sakao T, Huisingh D, Almeida CMVB. A comprehensive review of big data analytics throughout product lifecycle to support sustainable smart manufacturing: a framework, challenges and future research directions. J Clean Prod. 2019;210:1343–65. https://doi.org/10.1016/J.JCLEPRO.2018.11.025 .

Singh Jain AD, Mehta I, Mitra J, Agrawal S. Application of big data in supply chain management. Mater Today Proc. 2017;4(2):1106–15. https://doi.org/10.1016/J.MATPR.2017.01.126 .

Souza GC. Supply chain analytics. Bus Horiz. 2014;57(5):595–605. https://doi.org/10.1016/J.BUSHOR.2014.06.004 .

Tiwari S, Wee HM, Daryanto Y. Big data analytics in supply chain management between 2010 and 2016: insights to industries. Comput Ind Eng. 2018;115:319–30. https://doi.org/10.1016/J.CIE.2017.11.017 .

Zhong RY, Newman ST, Huang GQ, Lan S. Big Data for supply chain management in the service and manufacturing sectors: challenges, opportunities, and future perspectives. Comput Ind Eng. 2016;101:572–91. https://doi.org/10.1016/J.CIE.2016.07.013 .

Ramanathan U, Subramanian N, Parrott G. Role of social media in retail network operations and marketing to enhance customer satisfaction. Int J Oper Prod Manage. 2017;37(1):105–23. https://doi.org/10.1108/IJOPM-03-2015-0153 .

Coursera. Supply chain planning. Coursera E-Learning; 2019. https://www.coursera.org/learn/planning .

Villegas MA, Pedregal DJ. Supply chain decision support systems based on a novel hierarchical forecasting approach. Decis Support Syst. 2018;114:29–36. https://doi.org/10.1016/J.DSS.2018.08.003 .

Ma J, Kwak M, Kim HM. Demand trend mining for predictive life cycle design. J Clean Prod. 2014;68:189–99. https://doi.org/10.1016/J.JCLEPRO.2014.01.026 .

Hamiche K, Abouaïssa H, Goncalves G, Hsu T. A robust and easy approach for demand forecasting in supply chains. IFAC-PapersOnLine. 2018;51(11):1732–7. https://doi.org/10.1016/J.IFACOL.2018.08.206 .

Da Veiga CP, Da Veiga CRP, Catapan A, Tortato U, Da Silva WV. Demand forecasting in food retail: a comparison between the Holt-Winters and ARIMA models. WSEAS Trans Bus Econ. 2014;11(1):608–14.

Murray PW, Agard B, Barajas MA. Forecasting supply chain demand by clustering customers. IFAC-PapersOnLine. 2015;48(3):1834–9. https://doi.org/10.1016/J.IFACOL.2015.06.353 .

Ramos P, Santos N, Rebelo R. Performance of state space and ARIMA models for consumer retail sales forecasting. Robot Comput Integr Manuf. 2015;34:151–63. https://doi.org/10.1016/J.RCIM.2014.12.015 .

Schaer O, Kourentzes N. Demand forecasting with user-generated online information. Int J Forecast. 2019;35(1):197–212. https://doi.org/10.1016/J.IJFORECAST.2018.03.005 .

Pang Y, Yao B, Zhou X, Zhang Y, Xu Y, Tan Z. Hierarchical electricity time series forecasting for integrating consumption patterns analysis and aggregation consistency. In: IJCAI international joint conference on artificial intelligence; 2018, p. 3506–12.

Goyal R, Chandra P, Singh Y. Suitability of KNN regression in the development of interaction based software fault prediction models. IERI Procedia. 2014;6:15–21. https://doi.org/10.1016/J.IERI.2014.03.004 .

Runkler TA. Data analytics: models and algorithms for intelligent data analysis. Wiesbaden: Springer Fachmedien Wiesbaden; 2016. https://doi.org/10.1007/978-3-658-14075-5 .

Nikolopoulos KI, Babai MZ, Bozos K. Forecasting supply chain sporadic demand with nearest neighbor approaches. Int J Prod Econ. 2016;177:139–48. https://doi.org/10.1016/j.ijpe.2016.04.013 .

Gaur M, Goel S, Jain E. Comparison between nearest Neighbours and Bayesian network for demand forecasting in supply chain management. In: 2015 international conference on computing for sustainable global development, INDIACom 2015, May; 2015, p. 1433–6.

Burney SMA, Ali SM, Burney S. A survey of soft computing applications for decision making in supply chain management. In: 2017 IEEE 3rd international conference on engineering technologies and social sciences, ICETSS 2017, 2018, p. 1–6. https://doi.org/10.1109/ICETSS.2017.8324158 .

González Perea R, Camacho Poyato E, Montesinos P, Rodríguez Díaz JA. Optimisation of water demand forecasting by artificial intelligence with short data sets. Biosyst Eng. 2019;177:59–66. https://doi.org/10.1016/J.BIOSYSTEMSENG.2018.03.011 .

Vhatkar S, Dias J. Oral-care goods sales forecasting using artificial neural network model. Procedia Comput Sci. 2016;79:238–43. https://doi.org/10.1016/J.PROCS.2016.03.031 .

Wong WK, Guo ZX. A hybrid intelligent model for medium-term sales forecasting in fashion retail supply chains using extreme learning machine and harmony search algorithm. Int J Prod Econ. 2010;128(2):614–24. https://doi.org/10.1016/J.IJPE.2010.07.008 .

Liu C, Shu T, Chen S, Wang S, Lai KK, Gan L. An improved grey neural network model for predicting transportation disruptions. Expert Syst Appl. 2016;45:331–40. https://doi.org/10.1016/J.ESWA.2015.09.052 .

Yuan WJ, Chen JH, Cao JJ, Jin ZY. Forecast of logistics demand based on grey deep neural network model. Proc Int Conf Mach Learn Cybern. 2018;1:251–6. https://doi.org/10.1109/ICMLC.2018.8527006 .

Amirkolaii KN, Baboli A, Shahzad MK, Tonadre R. Demand forecasting for irregular demands in business aircraft spare parts supply chains by using artificial intelligence (AI). IFAC-PapersOnLine. 2017;50(1):15221–6. https://doi.org/10.1016/J.IFACOL.2017.08.2371 .

Huang L, Xie G, Li D, Zou C. Predicting and analyzing e-logistics demand in urban and rural areas: an empirical approach on historical data of China. Int J Performabil Eng. 2018;14(7):1550–9. https://doi.org/10.23940/ijpe.18.07.p19.15501559 .

Saha C, Lam SS, Boldrin W. Demand forecasting for server manufacturing using neural networks. In: Proceedings of the 2014 industrial and systems engineering research conference, June 2014; 2015.

Chang P-C, Wang Y-W, Tsai C-Y. Evolving neural network for printed circuit board sales forecasting. Expert Syst Appl. 2005;29(1):83–92. https://doi.org/10.1016/J.ESWA.2005.01.012 .

Merkuryeva G, Valberga A, Smirnov A. Demand forecasting in pharmaceutical supply chains: a case study. Procedia Comput Sci. 2019;149:3–10. https://doi.org/10.1016/J.PROCS.2019.01.100 .

Yang CL, Sutrisno H. Short-term sales forecast of perishable goods for franchise business. In: 2018 10th international conference on knowledge and smart technology: cybernetics in the next decades, KST 2018, p. 101–5; 2018. https://doi.org/10.1109/KST.2018.8426091 .

Villegas MA, Pedregal DJ, Trapero JR. A support vector machine for model selection in demand forecasting applications. Comput Ind Eng. 2018;121:1–7. https://doi.org/10.1016/J.CIE.2018.04.042 .

Wu Q. The hybrid forecasting model based on chaotic mapping, genetic algorithm and support vector machine. Expert Syst Appl. 2010;37(2):1776–83. https://doi.org/10.1016/J.ESWA.2009.07.054 .

Guanghui W. Demand forecasting of supply chain based on support vector regression method. Procedia Eng. 2012;29:280–4. https://doi.org/10.1016/J.PROENG.2011.12.707 .

Sarhani M, El Afia A. Intelligent system based support vector regression for supply chain demand forecasting. In: 2014 2nd world conference on complex systems, WCCS 2014; 2015, p. 79–83. https://doi.org/10.1109/ICoCS.2014.7060941 .

Chen IF, Lu CJ. Sales forecasting by combining clustering and machine-learning techniques for computer retailing. Neural Comput Appl. 2017;28(9):2633–47. https://doi.org/10.1007/s00521-016-2215-x .

Fasli M, Kovalchuk Y. Learning approaches for developing successful seller strategies in dynamic supply chain management. Inf Sci. 2011;181(16):3411–26. https://doi.org/10.1016/J.INS.2011.04.014 .

Islek I, Oguducu SG. A retail demand forecasting model based on data mining techniques. In: IEEE international symposium on industrial electronics; 2015, p. 55–60. https://doi.org/10.1109/ISIE.2015.7281443 .

Kilimci ZH, Akyuz AO, Uysal M, Akyokus S, Uysal MO, Atak Bulbul B, Ekmis MA. An improved demand forecasting model using deep learning approach and proposed decision integration strategy for supply chain. Complexity. 2019;2019:1–15. https://doi.org/10.1155/2019/9067367 .

Loureiro ALD, Miguéis VL, da Silva LFM. Exploring the use of deep neural networks for sales forecasting in fashion retail. Decis Support Syst. 2018;114:81–93. https://doi.org/10.1016/J.DSS.2018.08.010 .

Punam K, Pamula R, Jain PK. A two-level statistical model for big mart sales prediction. In: 2018 international conference on computing, power and communication technologies, GUCON 2018; 2019. https://doi.org/10.1109/GUCON.2018.8675060 .

Puspita PE, İnkaya T, Akansel M. Clustering-based Sales Forecasting in a Forklift Distributor. In: Uluslararası Muhendislik Arastirma ve Gelistirme Dergisi, 1–17; 2019. https://doi.org/10.29137/umagd.473977 .

Thomassey S. Sales forecasts in clothing industry: the key success factor of the supply chain management. Int J Prod Econ. 2010;128(2):470–83. https://doi.org/10.1016/J.IJPE.2010.07.018 .

Brentan BM, Ribeiro L, Izquierdo J, Ambrosio JK, Luvizotto E, Herrera M. Committee machines for hourly water demand forecasting in water supply systems. Math Probl Eng. 2019;2019:1–11. https://doi.org/10.1155/2019/9765468 .

Mafakheri F, Breton M, Chauhan S. Project-to-organization matching: an integrated risk assessment approach. Int J IT Project Manage. 2012;3(3):45–59. https://doi.org/10.4018/jitpm.2012070104 .

Mafakheri F, Nasiri F. Revenue sharing coordination in reverse logistics. J Clean Prod. 2013;59:185–96. https://doi.org/10.1016/J.JCLEPRO.2013.06.031 .

Bogataj M. Closed Loop Supply Chain (CLSC): economics, modelling, management and control. Int J Prod Econ. 2017;183:319–21. https://doi.org/10.1016/J.IJPE.2016.11.020 .

Hopkins J, Hawking P. Big Data Analytics and IoT in logistics: a case study. Int J Logist Manage. 2018;29(2):575–91. https://doi.org/10.1108/IJLM-05-2017-0109 .

de Oliveira CM, Soares PJSR, Morales G, Arica J, Matias IO. RFID and its applications on supply chain in Brazil: a structured literature review (2006–2016). Espacios. 2017;38(31). https://www.scopus.com/inward/record.uri?eid=2-s2.0-85021922345&partnerID=40&md5=f062191611541391ded4cdb73eea55cb .

Griva A, Bardaki C, Pramatari K, Papakiriakopoulos D. Retail business analytics: customer visit segmentation using market basket data. Expert Syst Appl. 2018;100:1–16. https://doi.org/10.1016/J.ESWA.2018.01.029 .

Lee CKM, Ho W, Ho GTS, Lau HCW. Design and development of logistics workflow systems for demand management with RFID. Expert Syst Appl. 2011;38(5):5428–37. https://doi.org/10.1016/J.ESWA.2010.10.012 .

Mohebi E, Marquez L. Application of machine learning and RFID in the stability optimization of perishable foods; 2008.

Jiao Z, Ran L, Zhang Y, Li Z, Zhang W. Data-driven approaches to integrated closed-loop sustainable supply chain design under multi-uncertainties. J Clean Prod. 2018;185:105–27.

Levis AA, Papageorgiou LG. Customer demand forecasting via support vector regression analysis. Chem Eng Res Des. 2005;83(8):1009–18. https://doi.org/10.1205/CHERD.04246 .

Chi H-M, Ersoy OK, Moskowitz H, Ward J. Modeling and optimizing a vendor managed replenishment system using machine learning and genetic algorithms. Eur J Oper Res. 2007;180(1):174–93. https://doi.org/10.1016/J.EJOR.2006.03.040 .

Sun Z-L, Choi T-M, Au K-F, Yu Y. Sales forecasting using extreme learning machine with applications in fashion retailing. Decis Support Syst. 2008;46(1):411–9. https://doi.org/10.1016/J.DSS.2008.07.009 .

Efendigil T, Önüt S, Kahraman C. A decision support system for demand forecasting with artificial neural networks and neuro-fuzzy models: a comparative analysis. Expert Syst Appl. 2009;36(3):6697–707. https://doi.org/10.1016/J.ESWA.2008.08.058 .

Lee CC, Ou-Yang C. A neural networks approach for forecasting the supplier’s bid prices in supplier selection negotiation process. Expert Syst Appl. 2009;36(2):2961–70. https://doi.org/10.1016/J.ESWA.2008.01.063 .

Chen F-L, Chen Y-C, Kuo J-Y. Applying Moving back-propagation neural network and Moving fuzzy-neuron network to predict the requirement of critical spare parts. Expert Syst Appl. 2010;37(9):6695–704. https://doi.org/10.1016/J.ESWA.2010.04.037 .

Wu Q. Product demand forecasts using wavelet kernel support vector machine and particle swarm optimization in manufacture system. J Comput Appl Math. 2010;233(10):2481–91. https://doi.org/10.1016/J.CAM.2009.10.030 .

Babai MZ, Ali MM, Boylan JE, Syntetos AA. Forecasting and inventory performance in a two-stage supply chain with ARIMA(0,1,1) demand: theory and empirical analysis. Int J Prod Econ. 2013;143(2):463–71. https://doi.org/10.1016/J.IJPE.2011.09.004 .

Kourentzes N. Intermittent demand forecasts with neural networks. Int J Prod Econ. 2013;143(1):198–206. https://doi.org/10.1016/J.IJPE.2013.01.009 .

Lau HCW, Ho GTS, Zhao Y. A demand forecast model using a combination of surrogate data analysis and optimal neural network approach. Decis Support Syst. 2013;54(3):1404–16. https://doi.org/10.1016/J.DSS.2012.12.008 .

Arunraj NS, Ahrens D. A hybrid seasonal autoregressive integrated moving average and quantile regression for daily food sales forecasting. Int J Prod Econ. 2015;170:321–35. https://doi.org/10.1016/J.IJPE.2015.09.039 .

Di Pillo G, Latorre V, Lucidi S, Procacci E. An application of support vector machines to sales forecasting under promotions. 4OR. 2016. https://doi.org/10.1007/s10288-016-0316-0 .

da Veiga CP, da Veiga CRP, Puchalski W, dos Coelho LS, Tortato U. Demand forecasting based on natural computing approaches applied to the foodstuff retail segment. J Retail Consumer Serv. 2016;31:174–81. https://doi.org/10.1016/J.JRETCONSER.2016.03.008 .

Chawla A, Singh A, Lamba A, Gangwani N, Soni U. Demand forecasting using artificial neural networks—a case study of American retail corporation. In: Applications of artificial intelligence techniques in wind power generation. Integrated Computer-Aided Engineering; 2018, p. 79–90. https://doi.org/10.3233/ica-2001-8305 .

Pereira MM, Machado RL, Ignacio Pires SR, Pereira Dantas MJ, Zaluski PR, Frazzon EM. Forecasting scrap tires returns in closed-loop supply chains in Brazil. J Clean Prod. 2018;188:741–50. https://doi.org/10.1016/J.JCLEPRO.2018.04.026 .

Fanoodi B, Malmir B, Jahantigh FF. Reducing demand uncertainty in the platelet supply chain through artificial neural networks and ARIMA models. Comput Biol Med. 2019;113:103415. https://doi.org/10.1016/J.COMPBIOMED.2019.103415 .

Sharma R, Singhal P. Demand forecasting of engine oil for automotive and industrial lubricant manufacturing company using neural network. Mater Today Proc. 2019;18:2308–14. https://doi.org/10.1016/J.MATPR.2019.07.013 .

Tanizaki T, Hoshino T, Shimmura T, Takenaka T. Demand forecasting in restaurants using machine learning and statistical analysis. Procedia CIRP. 2019;79:679–83. https://doi.org/10.1016/J.PROCIR.2019.02.042 .

Wang C-H, Chen J-Y. Demand forecasting and financial estimation considering the interactive dynamics of semiconductor supply-chain companies. Comput Ind Eng. 2019;138:106104. https://doi.org/10.1016/J.CIE.2019.106104 .


Acknowledgements

The authors are very thankful to the anonymous reviewers, whose comments and suggestions were very helpful in improving the quality of the manuscript.

Author information

Authors and affiliations.

Concordia Institute for Information Systems Engineering (CIISE), Concordia University, Montreal, H3G 1M8, Canada

Mahya Seyedan & Fereshteh Mafakheri


Contributions

The authors contributed equally to the writing of the paper. First author conducted the literature search. Both authors read and approved the final manuscript.

Corresponding author

Correspondence to Fereshteh Mafakheri.

Ethics declarations

Ethics approval.

Not applicable.

Competing interests

The authors declare no competing or conflicting interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Cite this article.

Seyedan, M., Mafakheri, F. Predictive big data analytics for supply chain demand forecasting: methods, applications, and research opportunities. J Big Data 7, 53 (2020). https://doi.org/10.1186/s40537-020-00329-2


Received: 05 April 2020

Accepted: 17 July 2020

Published: 25 July 2020

DOI: https://doi.org/10.1186/s40537-020-00329-2


  • Demand forecasting
  • Closed-loop supply chains
  • Machine-learning


SupplyGraph: A Benchmark Dataset for Supply Chain Planning using Graph Neural Networks

27 Jan 2024 · Azmine Toushik Wasi, MD Shafikul Islam, Adipto Raihan Akib

Graph Neural Networks (GNNs) have gained traction across different domains such as transportation, bio-informatics, language processing, and computer vision. However, there is a noticeable absence of research on applying GNNs to supply chain networks. Supply chain networks are inherently graph-like in structure, making them prime candidates for applying GNN methodologies. This opens up a world of possibilities for optimizing, predicting, and solving even the most complex supply chain problems. A major setback in this approach lies in the absence of real-world benchmark datasets to facilitate the research and resolution of supply chain problems using GNNs. To address the issue, we present a real-world benchmark dataset for temporal tasks, obtained from one of the leading FMCG companies in Bangladesh, focusing on supply chain planning for production purposes. The dataset includes temporal data as node features to enable sales predictions, production planning, and the identification of factory issues. By utilizing this dataset, researchers can employ GNNs to address numerous supply chain problems, thereby advancing the field of supply chain analytics and planning. Source: https://github.com/CIOL-SUST/SupplyGraph


Blockchain-Enabled Accountability in Data Supply Chain: A Data Bill of Materials Approach

In the era of advanced artificial intelligence, highlighted by large-scale generative models like GPT-4, ensuring the traceability, verifiability, and reproducibility of datasets throughout their lifecycle is paramount for research institutions and technology companies. These organisations increasingly rely on vast corpora to train and fine-tune advanced AI models, resulting in intricate data supply chains that demand effective data governance mechanisms. In addition, the challenge intensifies as diverse stakeholders may use assorted tools, often without adequate measures to ensure the accountability of data and the reliability of outcomes. In this study, we adapt the concept of “Software Bill of Materials” into the field of data governance and management to address the above challenges, and introduce “Data Bill of Materials” (DataBOM) to capture the dependency relationship between different datasets and stakeholders by storing specific metadata. We demonstrate a platform architecture for providing blockchain-based DataBOM services, present the interaction protocol for stakeholders, and discuss the minimal requirements for DataBOM metadata. The proposed solution is evaluated in terms of feasibility and performance via case study and quantitative analysis respectively.

Key terms - Data governance, blockchain, accountability, data bill of materials, traceability, architecture

1 Introduction

Data governance is critical in the era of advanced artificial intelligence (AI), particularly with the proliferation of large-scale generative AI that necessitates extensive datasets for model training and fine-tuning. Organisations that navigate complex data supply chains involving multiple stakeholders and varied tools are facing challenges in ensuring the traceability, verifiability, and reproducibility of data. This complexity is compounded in cross-departmental or cross-organisational data exchanges, where maintaining data accountability becomes increasingly significant. This issue is exacerbated after the emergence of large-scale generative AI models such as Large Language Models (LLMs)  [ 1 ] . As enterprises and research institutions all need large and high-quality corpora for model development and enhancement, the lack of effective governance frameworks to manage data creation, usage, and transfer, especially across diverse stakeholders, becomes evident.

Within a data supply chain, which involves continual transformation and dissemination of dataset artifacts, stakeholders need to i) ensure data traceability in terms of the origin, authorisation and operations conducted on the dataset artifacts, ii) achieve data verifiability with authenticated sources and licences, iii) preserve data reproducibility so that specific processing or transfer steps can be replicated if questions are raised about them, and consequently, iv) establish overall accountability to identify the responsible stakeholders if violations are detected. Nevertheless, current data governance models, often tied to specific platforms and focusing on data storage schemes (e.g., object storage, InterPlanetary File System), secure trading protocols [2, 3], and privacy regulations (e.g., the General Data Protection Regulation), fall short in addressing the dynamic nature of data flows from the perspective of the overall data supply chain and the requirement for platform-agnostic traceability solutions.

Hence, in this study, we adapt the concept of “Software Bill of Materials” (SBOM) [4] into the field of data governance and management and explore “Data Bill of Materials” (DataBOM) [5] from the perspective of software architecture design. DataBOM provides a formal inventory of dependency relationships for datasets via specific metadata, facilitating traceability, verifiability, reproducibility, and ultimately accountability across the data supply chain. In addition, considering the decentralised nature of the data supply chain, we employ blockchain technology to serve as an infrastructure for managing DataBOM among diverse stakeholders. The proposed solution is evaluated via a case study and performance analysis. The main contributions of this paper are as follows:

  • We introduce and leverage the concept of “Data Bill of Materials” from the perspective of software architecture design, which provides a traceable, verifiable, and reproducible data supply chain to ensure accountability.
  • We provide a three-tiered system architecture for blockchain-based DataBOM, utilising on-chain smart contracts to create, retrieve and update DataBOM records, and discuss the application of different access control mechanisms and design patterns to enable fine-grained auxiliary services. To the best of our knowledge, this is the first study integrating Data Bill of Materials with blockchain technology in terms of software architecture design.
  • We present the interaction protocol for operationalising the proposed architecture, and discuss the minimal metadata requirements for designing and deploying customised DataBOM smart contracts.

2 Background and Related Work

2.1 Blockchain Technology

Blockchain is essentially a distributed ledger technology which was popularised by the emergence of Bitcoin  [ 6 ] and the subsequent cryptocurrencies. In recent years, blockchain has been leveraged as a software component for enabling decentralisation and on-chain autonomy in diverse applications including decentralised finance  [ 7 ] , healthcare  [ 8 ] , etc., by providing two core elements: the underlying distributed ledger as immutable, transparent and secure data storage, and a decentralised “computing” infrastructure facilitated through smart contracts. In this study, we explore blockchain-enabled Data Bill of Materials as the decentralised environment of data supply chain aligns with the nature of blockchain, while blockchain can provide on-chain program execution with a series of design patterns that can satisfy the assorted requirements of data security and privacy. Blockchain achieves data storage through encapsulating data within digital transactions, which serve as identifiable records that manage the versioning of data over time. The verification of digital transactions does not need to rely on any central authority to establish a trustworthy business relationship  [ 9 ] . Blockchain also affords Turing-complete on-chain programmability through smart contracts, which are user-defined programs deployed and executed on-chain. Smart contracts support advanced programming features such as triggers and conditions  [ 10 ] allowing the composition of complex business logic.
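The tamper-evidence property described above can be illustrated with a minimal hash-chained log. This is a toy sketch only, not a real blockchain: it has no consensus, no signatures and no smart contracts, and the `Ledger` class and its method names are hypothetical.

```python
import hashlib
import json


class Ledger:
    """Toy append-only ledger: each entry commits to the previous entry's hash."""

    GENESIS = "0" * 64  # placeholder hash used before the first entry

    def __init__(self):
        self.entries = []

    @staticmethod
    def _digest(index, payload, prev_hash):
        # Serialise deterministically so the same content always hashes the same.
        body = json.dumps(
            {"index": index, "payload": payload, "prev": prev_hash}, sort_keys=True
        )
        return hashlib.sha256(body.encode("utf-8")).hexdigest()

    def append(self, payload):
        prev_hash = self.entries[-1]["hash"] if self.entries else self.GENESIS
        index = len(self.entries)
        entry = {
            "index": index,
            "payload": payload,
            "prev": prev_hash,
            "hash": self._digest(index, payload, prev_hash),
        }
        self.entries.append(entry)
        return entry

    def is_valid(self):
        # Recompute every hash and check that the chain links are intact.
        prev_hash = self.GENESIS
        for i, entry in enumerate(self.entries):
            if entry["index"] != i or entry["prev"] != prev_hash:
                return False
            expected = self._digest(entry["index"], entry["payload"], entry["prev"])
            if entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True


ledger = Ledger()
ledger.append({"dataset": "demo", "version": 1})
ledger.append({"dataset": "demo", "version": 2})
ok_before = ledger.is_valid()

ledger.entries[0]["payload"]["version"] = 99  # tamper with an old record
ok_after = ledger.is_valid()
```

Editing an earlier entry changes its recomputed hash, so validation fails; this is the sense in which transaction records can act as versioned, tamper-evident storage.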

2.2 Bill of Materials

The Bill of Materials (BOM), historically rooted in manufacturing, serves as a comprehensive list detailing all components necessary for product assembly  [ 11 ] . This concept, pivotal for ensuring transparency and accountability in production, has been adapted to the realm of software development, manifesting as the Software Bill of Materials. An SBOM delineates all software components within an application, playing a crucial role in securing the software supply chain by cataloging every constituent component. By extending BOM’s principles to data, we aim to establish a DataBOM framework that ensures traceability and accountability throughout the data supply chain.

In the domain of SBOMs, three primary standard formats have emerged: Software Package Data Exchange (SPDX, https://spdx.dev/), CycloneDX (https://cyclonedx.org/), and Software Identification (SWID) Tagging (https://csrc.nist.gov/projects/Software-Identification-SWID). SPDX, an open-source standard recognised internationally and hosted by the Linux Foundation, primarily focuses on licence compliance. CycloneDX, developed by OWASP in 2017, is tailored towards addressing security concerns. SWID Tagging, maintained by the US National Institute of Standards and Technology (NIST), aims to offer a robust mechanism for the transparent identification of software components.

Following the U.S. Executive Order on Improving the Nation’s Cybersecurity (https://www.whitehouse.gov/briefing-room/presidential-actions/2021/05/12/executive-order-on-improving-the-nations-cybersecurity/), which mandates SBOMs for software procurement, there has been a marked acceleration in SBOM research and development. Studies have begun to explore SBOM’s potential, challenges, and opportunities (e.g., [12, 13, 14, 15]). Notably, the notion of DataBOM, while nascent, was preliminarily explored in [5], which outlined a theoretical model for ensuring data traceability. The potential of DataBOM as a transformative tool for data governance has been further corroborated by subsequent studies, including [13], highlighting its capacity to address the challenges of data accountability and lifecycle management.

2.3 Data Governance and Management

The landscape of data governance and management has witnessed significant evolution, driven by both academic and industry efforts. Specifically for responsible AI data governance, frameworks such as datasheets [16] and data cards [17] have been proposed for transparent data documentation via structured summaries of essential dataset facts from the perspective of a project lifecycle. On the other hand, Levin et al. [18] integrate AI with cloud object storage for maintaining the normal behaviour and health of IT Operations. Provena [19] leverages the W3C Provenance model for comprehensive workflow provenance in distributed environments. Apache NiFi (https://nifi.apache.org/) underscores the importance of data provenance for operational analysis, facilitating the tracing of data through user-defined pipelines. This emphasis on provenance is further extended by initiatives like the Coalition for Content Provenance and Authenticity (https://c2pa.org/), a collaborative effort by industry giants including Adobe and Microsoft, aimed at establishing standards for certifying the origin and history of media content.

The exploration of decentralised technologies, such as the InterPlanetary File System and blockchain, introduces novel paradigms for data governance and management. Research underscores the importance of a systematic approach to blockchain data governance, addressing key challenges in privacy, data quality assurance, and the provision of trustable data analytics  [ 20 ] . Further enriching this discourse, the Secure Blockchain-Based Data Trading Ecosystem model  [ 21 ] introduces a paradigm for secure data trading, emphasising governance mechanisms essential for safeguarding data integrity and confidentiality during transactions. Zhang et al.  [ 22 ] propose a responsible web framework that leverages distributed ledger technology to empower users in managing consent and copyright of their online data.

In essence, the domain of data governance and management is transitioning towards a more interconnected, standardised, and user-centric approach, reflecting a collective move to address the pressing demands of data accountability. The concept of “DataBOM” was introduced by Barclay et al. [5]; it is similar to datasheets and data cards but focuses more on the changes and features of dataset artifact transformation and circulation. However, that work emphasised only the data traceability aspect, while this study extends the scope to include verifiability and reproducibility to further preserve accountability within the data supply chain.

3 Blockchain-based Data Bill of Materials

In this section, we present a platform architecture for blockchain-based DataBOM, the corresponding interaction protocol, and the minimal requirements for metadata contained in DataBOM records.


3.1 Architecture Design

Fig. 1 illustrates the overview of our proposed three-tiered architecture for blockchain-based DataBOM, comprising a service layer, an on-chain DataBOM layer, and an off-chain data layer; we also identify where a set of blockchain design patterns [23] can be applied. The service layer consists of a DataBOM generator, which automatically detects and captures the metadata from updated dataset artifacts (e.g., Excel files), an API gateway, identity services and DataBOM services. Specifically, identity services manage stakeholders’ on-chain identities for interaction, data source tracing, and the accountability process, while DataBOM services include the development and deployment of DataBOM registry smart contracts, the specific operations on DataBOM records (i.e., creation, update, retrieval), and the access authorisation to certain granularity levels of DataBOM.

The on-chain DataBOM layer processes on-chain business logic via smart contracts. The identity registry smart contract provides identity services, where self-sovereign identity can be employed if the target usage scenario includes a series of cross-organisation interactions and collaborations, in which case the privacy requirements of both individuals and organisations should be recognised and satisfied by establishing formal business relationships. DataBOM registries define the data structure for selected dataset metadata, which can be utilised to identify, trace, verify and reproduce datasets along the data supply chain. Considering the discrepancy of metadata and the different levels of granularity, customised DataBOM registries are required to meet the diverse requirements of assorted datasets and projects. Hence, a factory contract can serve as a template for generating customised contract instances. In a DataBOM registry, embedded permission can grant stakeholders access at different granularities of a DataBOM record (i.e., only certain stakeholders can retrieve the records of a certain dataset artifact), whilst selective disclosure enables further fine-grained visibility of confidential or sensitive metadata or of certain time intervals. Finally, the off-chain data layer consists of the data repositories, which are maintained by the stakeholders themselves instead of being centralised in the implemented platform. The DataBOM registries contain the paths pointing to the actual data repositories.
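As a rough illustration of the three patterns named above (factory contract, embedded permission, selective disclosure), the following in-memory Python sketch mimics their behaviour. It is not the paper's Solidity implementation: all class, method and field names are hypothetical, and selective disclosure is simplified here to hiding a configured set of fields from every caller except the registry owner.

```python
class DataBOMRegistry:
    """In-memory stand-in for one DataBOM registry contract instance."""

    def __init__(self, owner, confidential_fields=()):
        self.owner = owner
        self.authorised = {owner}  # embedded permission list
        self.confidential_fields = set(confidential_fields)
        self.records = {}  # dataset_id -> metadata dict

    def grant(self, caller, stakeholder):
        if caller != self.owner:
            raise PermissionError("only the owner can grant access")
        self.authorised.add(stakeholder)

    def put(self, caller, dataset_id, metadata):
        if caller not in self.authorised:
            raise PermissionError("caller is not authorised")
        self.records[dataset_id] = dict(metadata)

    def get(self, caller, dataset_id):
        # Embedded permission: unauthorised callers get nothing at all.
        if caller not in self.authorised:
            raise PermissionError("caller is not authorised")
        record = self.records[dataset_id]
        if caller == self.owner:
            return dict(record)
        # Selective disclosure (simplified): hide confidential fields.
        return {k: v for k, v in record.items() if k not in self.confidential_fields}


class RegistryFactory:
    """Factory pattern: one template, many customised registry instances."""

    def __init__(self):
        self.instances = []

    def create(self, owner, confidential_fields=()):
        registry = DataBOMRegistry(owner, confidential_fields)
        self.instances.append(registry)
        return registry


factory = RegistryFactory()
registry = factory.create("producer_a", confidential_fields=["data_path"])
registry.put(
    "producer_a",
    "ds-1",
    {"licence": "CC-BY-4.0", "data_path": "https://repo.example/ds-1"},
)
registry.grant("producer_a", "consumer_b")
view_b = registry.get("consumer_b", "ds-1")  # data_path is hidden from consumer_b
```

Note that the registry stores only a path to the data, in line with the off-chain data layer described above; the dataset itself never enters the registry.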


3.2 Protocol for Blockchain-based DataBOM

Fig. 2 presents an interaction protocol for the stakeholders of blockchain-based DataBOM. First, all stakeholders need to register on-chain identities with their blockchain public keys. Data producer A possesses a dataset, deploys a DataBOM registry smart contract for this dataset, and stores the selected metadata. Data consumer B is seeking suitable dataset(s) for certain projects, and queries the DataBOM registry to understand the essential information of A’s dataset. Afterwards, consumer B can retrieve producer A’s identity information (e.g., service endpoint) from the identity registry, and request access to the dataset by directly contacting A. After authorisation, B accesses the dataset and decides to use only a subset of the included data. Hence B partitions the dataset into multiple chunks, while the DataBOM generator exports and registers the metadata of this tailored dataset in the DataBOM registry as a new version. Please note that if the dataset identifier is changed within the data supply chain, the dataset should be recorded in the DataBOM registry as a new dataset with the altered identifier. If data consumer B discovers abnormal data in the dataset, B can report it to the producer, who will review and update the dataset to address the issue; the updated dataset will then trigger the DataBOM generator to register its metadata in the DataBOM registry (e.g., the explanation for this new version).
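The sequence of steps in this protocol can be walked through with a small simulation. The sketch below uses plain dictionaries in place of the identity registry and DataBOM registry contracts; the function names, public keys and endpoints are invented for illustration.

```python
identity_registry = {}  # public key -> identity information
databom_registry = {}   # dataset id -> list of version records


def register_identity(public_key, name, endpoint):
    identity_registry[public_key] = {"name": name, "endpoint": endpoint}


def register_version(public_key, dataset_id, metadata):
    # Only registered identities may write, so every record is attributable.
    if public_key not in identity_registry:
        raise PermissionError("unregistered identity")
    versions = databom_registry.setdefault(dataset_id, [])
    record = {"version": len(versions) + 1, "registered_by": public_key, **metadata}
    versions.append(record)
    return record


def latest(dataset_id):
    return databom_registry[dataset_id][-1]


# 1. All stakeholders register on-chain identities.
register_identity("0xA", "producer A", "https://a.example/contact")
register_identity("0xB", "consumer B", "https://b.example/contact")

# 2. Producer A registers the dataset metadata.
register_version("0xA", "ds-1", {"summary": "full dataset"})

# 3. Consumer B queries the registry, then looks up A's endpoint to request access.
info = latest("ds-1")
contact = identity_registry[info["registered_by"]]["endpoint"]

# 4. B uses a subset; its metadata is registered as a new version.
register_version("0xB", "ds-1", {"summary": "subset in chunks", "operation": "subset and rechunk"})

# 5. B reports abnormal data; A fixes it and a further version is registered.
register_version("0xA", "ds-1", {"summary": "abnormal values fixed", "operation": "correction after report"})
```

Access negotiation itself (step 3) happens off-chain between A and B; the registries only record who did what to which version.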

3.3 Minimal Requirements for DataBOM Metadata

Since Data Bill of Materials is a new concept adapted from Software Bill of Materials, there is currently no widely accepted standard or guideline specifying what kind of dataset metadata should be included in DataBOM records. Here, we provide several insights on the minimal requirements for metadata, considering the objectives of DataBOM.

Data traceability refers to the ability to trace the flow of data throughout its lifecycle, from the origin or source to its final storage or use. It requires the identifiers of the different datasets and the versions involved, and the dependencies on other datasets or even particular versions, to pinpoint the data source.

Data verifiability denotes the ability to confirm the reliability and compliance of data. The generation, transfer and use of data should comply with relevant laws and specifications. Consequently, we recommend including the licence information in DataBOM to certify that the operations on a dataset or version do not violate the regulations.

Data reproducibility stands for the ability to replicate the results of a certain phase within the data supply chain. In addition to the relationships with other datasets or versions, a stakeholder needs to specify the operations conducted to transform a dataset from the previous version to the current one. Other stakeholders can refer to this operation history to reproduce a precise dataset version.

Accountability in the data supply chain is embodied by the identifiability and answerability of stakeholders for their decisions and operations on dataset artifacts. The inherent public key infrastructure of blockchain and the identity registry enable the connection between stakeholders’ identities and blockchain accounts. Hence, if violations are detected, it will be straightforward to locate the responsible stakeholder(s) according to the dataset operation history and on-chain identities.
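One way to read these four requirements is as fields of a record: identifier and version plus dependencies (traceability), licence (verifiability), operation history (reproducibility), and the responsible account (accountability). The sketch below is an assumed minimal schema, not a standard; the field names, sample values and the `trace_lineage` helper are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class DataBOMRecord:
    dataset_id: str         # traceability: which dataset
    version: int            # traceability: which version
    licence: str            # verifiability: licence governing use
    operations: tuple       # reproducibility: steps from the previous version
    responsible: str        # accountability: blockchain account of the author
    depends_on: tuple = ()  # traceability: upstream (dataset_id, version) pairs


def trace_lineage(records, start):
    """Return upstream (dataset_id, version) keys reachable from start, nearest first."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        key = queue.popleft()
        for parent in records[key].depends_on:
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


def responsible_parties(records, start):
    """Accounts answerable for a version and for everything it was derived from."""
    return [records[key].responsible for key in [start, *trace_lineage(records, start)]]


records = {
    ("raw", 1): DataBOMRecord("raw", 1, "CC-BY-4.0", ("acquired from provider",), "0xA"),
    ("raw", 2): DataBOMRecord(
        "raw", 2, "CC-BY-4.0", ("corrected missing-value encoding",), "0xA", (("raw", 1),)
    ),
    ("subset", 1): DataBOMRecord(
        "subset", 1, "CC-BY-4.0", ("selected subset", "rechunked"), "0xB", (("raw", 2),)
    ),
}

lineage = trace_lineage(records, ("subset", 1))
parties = responsible_parties(records, ("subset", 1))
```

Here the derived `subset` dataset carries a new identifier, so it is a separate record whose `depends_on` field points back to the exact upstream version it was built from.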

4 Evaluation

This section presents the proof-of-concept implementation and evaluation of the proposed blockchain-based DataBOM. We first illustrate how DataBOM can be integrated into the data supply chain of a research project consisting of four stakeholders, then elaborate the DataBOM data structure for this use case. In addition, we conduct experiments to evaluate the performance of creating and retrieving on-chain DataBOM records.


4.1 Case Study

Fig. 3 illustrates a simplified data supply chain in a research project, where the annotated steps are supported by our proposed blockchain-based DataBOM services. Please note that the figure omits the DataBOM generator, which handles automated metadata export and API invocation and is implemented as a local file watcher in our minimal viable prototype. First, for a research project, a researcher can refer to DataBOM registries as a catalogue to check whether the required dataset has been purchased by the institute. If the dataset is available, the researcher can directly access it via the data path recorded in DataBOM; otherwise, the researcher needs to contact the responsible data scientist, who will approach external organisations to acquire the dataset. After obtaining the target dataset, the data scientist stores it in the data repository and creates a DataBOM record for this new dataset for future searching and usage. Subsequently, the researcher uses the dataset for further analysis and may conduct specific operations, for instance, performing quality assurance/quality control, selecting a certain subset, or rechunking this subset with other datasets. After the research paper is published, the researcher can create a DataBOM record for the revised dataset. In particular, the paper’s Digital Object Identifier (DOI) can be added as a data path, considering that the paper itself provides the link to the data repository and explains the methodology for processing the dataset. Other researchers can review the published paper and dataset (in this case the tailored dataset is open-source). If errors are found in the dataset (e.g., missing data should be set to a specific value but was found to be 0), the external researcher can report the error to the corresponding research institute. The researcher and data scientist both need to review the dataset and operation history recorded in DataBOM, while the data scientist also needs to notify the upstream organisation about the errors found. To address the errors, the data scientist can publish the updated dataset and create a DataBOM record explaining this new version, which can be retrieved and used in future projects within the research institute.

We explored the data repository provided by the data scientist in our case study, and investigated the dataset metadata. We extracted and selected specific global attributes of the dataset artifacts to be included in the DataBOM registry smart contract, such as the Universally Unique Identifier (UUID), file name, licence, summary, and history. The DataBOM data structure contains four structs that simulate the real-world file system and enable different granularities: project, year, dataset, and version. An individual research project covers multiple years, each year contains a series of dataset artifacts, and each dataset artifact may have multiple versions. Specifically, authorisation lists are enabled by embedded permission at the project and dataset levels respectively. In this manner, only stakeholders authorised at the project level can create new dataset records, and those authorised at the dataset level can create new versions of a particular dataset. Considering that dataset artifacts are identified via UUID, if multiple dataset artifacts are merged and a new UUID is used as the identifier, the resulting artifact should be recorded as a new dataset.
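The four-level structure and its two authorisation lists can be modelled roughly as nested dataclasses. This is a Python approximation of what the text describes as Solidity structs; the field names beyond those listed above, the sample file names and the caller labels are assumptions.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Version:
    number: int
    file_name: str
    licence: str
    summary: str
    history: str


@dataclass
class Dataset:
    dataset_uuid: str
    authorised: set = field(default_factory=set)  # dataset-level permission
    versions: list = field(default_factory=list)

    def add_version(self, caller, file_name, licence, summary, history):
        if caller not in self.authorised:
            raise PermissionError("caller is not authorised for this dataset")
        version = Version(len(self.versions) + 1, file_name, licence, summary, history)
        self.versions.append(version)
        return version


@dataclass
class Year:
    year: int
    datasets: dict = field(default_factory=dict)  # dataset_uuid -> Dataset


@dataclass
class Project:
    name: str
    authorised: set = field(default_factory=set)  # project-level permission
    years: dict = field(default_factory=dict)     # year -> Year

    def add_dataset(self, caller, year, dataset_authorised):
        if caller not in self.authorised:
            raise PermissionError("caller is not authorised for this project")
        dataset = Dataset(str(uuid.uuid4()), authorised=set(dataset_authorised))
        self.years.setdefault(year, Year(year)).datasets[dataset.dataset_uuid] = dataset
        return dataset


project = Project("demo-project", authorised={"data_scientist"})
dataset = project.add_dataset("data_scientist", 2023, {"data_scientist", "researcher"})
dataset.add_version("data_scientist", "example.csv", "CC-BY-4.0", "original purchase", "acquired")
dataset.add_version("researcher", "example_qc.csv", "CC-BY-4.0", "QA/QC applied", "qc; subset; rechunk")
```

In this sketch the researcher can add versions to the dataset but cannot create new datasets under the project, which mirrors the two permission levels described above. A merged artifact with a fresh UUID would be added through `add_dataset` rather than `add_version`.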

4.2 Performance Analysis

We conducted quantitative experiments to measure the throughput and response time of two critical DataBOM service APIs in the proposed architecture: DataBOM creation and DataBOM retrieval. Other services have similar performance, as they are all essentially operations that store data to, or query data from, on-chain smart contracts. We implemented a minimal viable prototype as a proof of concept of our proposed solution. The prototype API gateway is developed using Node.js v18.19.0, Web3.js v1.10.4 and Solidity v0.5.3, and it supports input and output in JSON format. The on-chain DataBOM layer is deployed on a local Ganache blockchain network. We configured the Ganache network block size to simulate the Ethereum Mainnet. The API requests, containing the corresponding required information of the four services, are produced via JMeter, which is configured to send 20 API calls per batch. For the throughput measurement, each test ran for 30 minutes. For the response time measurement, we invoked each API 1000 times and recorded the response time of each request.


Fig. 4 illustrates the measurement results for DataBOM creation and DataBOM retrieval API throughput. The x-axis represents the time elapsed since the start of the experiment (in minutes), while the y-axis represents the average throughput (transactions per second, tps). It can be observed that DataBOM retrieval throughput stayed around 27 tps, while DataBOM creation throughput peaked at the beginning (6.5 tps) but decreased to 1 tps afterwards. The performance of DataBOM creation is heavily dependent on the transaction and block size, since the DataBOM information can be large when it includes the licence, summary, and operation history, whilst the block size is fixed. Fig. 5 shows the response time of running each of the two APIs 1000 times. The x-axis represents the number of API requests (1 to 1000), and the y-axis represents the response time of each request (in milliseconds, ms). For the DataBOM creation API, the majority of requests are completed within 100 to 200 ms, while a small subset requires a longer response time. Most DataBOM retrieval API requests are completed within 100 ms, except for the first several. We assume this is caused by the communication overhead between JMeter, the API gateway and the Ganache blockchain at the beginning of the test.
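The two metrics reported here, average throughput per minute and per-request response time, can be derived from raw request logs along the following lines. This is a generic sketch with hypothetical function names, and the sample numbers are synthetic values chosen for illustration, not the measurements from the experiments.

```python
import statistics


def throughput_per_minute(completion_times_s, duration_min):
    """Average transactions per second within each one-minute window."""
    counts = [0] * duration_min
    for t in completion_times_s:
        minute = int(t // 60)
        if 0 <= minute < duration_min:
            counts[minute] += 1
    return [count / 60 for count in counts]


def latency_summary(response_times_ms):
    """Median, maximum, and share of requests answered in under 100 ms."""
    ordered = sorted(response_times_ms)
    return {
        "median_ms": statistics.median(ordered),
        "max_ms": ordered[-1],
        "share_under_100ms": sum(1 for r in ordered if r < 100) / len(ordered),
    }


# Synthetic log: one request completes every 0.5 s for two minutes.
completions = [i * 0.5 for i in range(240)]
tps = throughput_per_minute(completions, duration_min=2)

# Synthetic response times with one slow warm-up request at the start.
summary = latency_summary([250, 90, 80, 70, 60])
```

A slow first request, as in the synthetic list, raises the maximum without moving the median much, which is one reason to report the distribution rather than a single average.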

5 Conclusion

This paper presents a platform architecture for a blockchain-based data bill of materials. In the design, the identity services enable users to manage their on-chain identities, and the DataBOM services facilitate data traceability, verifiability, reproducibility and overall accountability by storing the metadata of dataset artifacts in on-chain smart contracts. We introduce the interaction protocol of our proposed architecture, and discuss the minimal requirements for DataBOM metadata. A minimal viable prototype is implemented as a proof of concept, and we evaluate it in terms of feasibility and performance. In the future, we plan to: i) explore the feasibility of utilising foundation model based agents in the DataBOM generator to automatically analyse the dataset artifacts and extract the required metadata; and ii) integrate SBOM and DataBOM in the field of AI governance, to operationalise AIBOM [24].

  • [1] M. Khan and A. Hanna, “The subjects and stages of ai dataset development: A framework for dataset accountability,” Available at SSRN: https://ssrn.com/abstract=4217148 or http://dx.doi.org/10.2139/ssrn.4217148, 2022, accessed 17-January-2024.
  • [2] W. Dai, C. Dai, K.-K. R. Choo, C. Cui, D. Zou, and H. Jin, “Sdte: A secure blockchain-based data trading ecosystem,” IEEE Transactions on Information Forensics and Security , vol. 15, pp. 725–737, 2020.
  • [3] L. D. Nguyen, I. Leyva-Mayorga, A. N. Lewis, and P. Popovski, “Modeling and analysis of data trading on blockchain-based market in iot networks,” IEEE Internet of Things Journal , vol. 8, no. 8, pp. 6487–6497, 2021.
  • [4] The U.S. Department of Commerce, “The minimum elements for a software bill of materials,” https://www.ntia.doc.gov/files/ntia/publications/sbom_minimum_elements_report.pdf , accessed 17-January-2024.
  • [5] I. Barclay, A. Preece, I. Taylor, and D. Verma, “Towards traceability in data ecosystems using a bill of materials model,” arXiv preprint arXiv:1904.04253 , 2019.
  • [6] S. Nakamoto, “Bitcoin: A peer-to-peer electronic cash system,” https://bitcoin.org/bitcoin.pdf , 2008, accessed 6-June-2022.
  • [7] Y. Chen and C. Bellavitis, “Blockchain disruption and decentralized finance: The rise of decentralized business models,” Journal of Business Venturing Insights , vol. 13, p. e00151, 2020. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S2352673419300824
  • [8] T. McGhin, K.-K. R. Choo, C. Z. Liu, and D. He, “Blockchain in healthcare applications: Research challenges and opportunities,” Journal of Network and Computer Applications , vol. 135, pp. 62–75, 2019. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1084804519300864
  • [9] F. Tschorsch and B. Scheuermann, “Bitcoin and beyond: A technical survey on decentralized digital currencies,” IEEE Communications Surveys & Tutorials , vol. 18, no. 3, p. 464, 2016.
  • [10] S. Omohundro, “Cryptocurrencies, smart contracts, and artificial intelligence,” AI Matters , vol. 1, no. 2, pp. 19–21, Dec. 2014.
  • [11] J. Jiao, M. M. Tseng, Q. Ma, and Y. Zou, “Generic bill-of-materials-and-operations for high-variety production management,” Concurrent Engineering , vol. 8, no. 4, pp. 297–321, 2000.
  • [12] B. Xia, T. Bi, Z. Xing, Q. Lu, and L. Zhu, “An empirical study on software bill of materials: Where we stand and the road ahead,” in 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE) , 2023, pp. 2630–2642.
  • [13] T. Stalnaker, N. Wintersgill, O. Chaparro, M. Di Penta, D. M. German, and D. Poshyvanyk, “Boms away! inside the minds of stakeholders: A comprehensive study of bills of materials for software systems,” in Proceedings of the 46th IEEE/ACM International Conference on Software Engineering , 2024, pp. 1–13.
  • [14] T. Bi, B. Xia, Z. Xing, Q. Lu, and L. Zhu, “On the way to sboms: Investigating design issues and solutions in practice,” arXiv preprint arXiv:2304.13261 , 2023.
  • [15] S. Nocera, S. Romano, M. Di Penta, R. Francese, and G. Scanniello, “Software bill of materials adoption: A mining study from github,” in 2023 IEEE International Conference on Software Maintenance and Evolution (ICSME) .   IEEE, 2023, pp. 39–49.
  • [16] T. Gebru, J. Morgenstern, B. Vecchione, J. W. Vaughan, H. Wallach, H. D. Iii, and K. Crawford, “Datasheets for datasets,” Communications of the ACM , vol. 64, no. 12, pp. 86–92, 2021.
  • [17] M. Pushkarna, A. Zaldivar, and O. Kjartansson, “Data cards: Purposeful and transparent dataset documentation for responsible ai,” in Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency , 2022, pp. 1776–1826.
  • [18] A. Levin, S. Garion, E. K. Kolodner, D. H. Lorenz, K. Barabash, M. Kugler, and N. McShane, “Aiops for a cloud object storage service,” in 2019 IEEE International Congress on Big Data (BigDataCongress) , 2019, pp. 165–169.
  • [19] J. Yu, P. Baker, S. J. Cox, R. Petridis, A. C. Freebairn, F. Mirza, L. Thomas, S. Tickell, D. Lemon, and M. Rezvani, “Provena: A provenance system for large distributed modelling and simulation workflows,” in 25th International Congress on Modelling and Simulation (MODSIM2023) , 2023, pp. 14–20.
  • [20] H.-Y. Paik, X. Xu, H. D. Bandara, S. U. Lee, and S. K. Lo, “Analysis of data management in blockchain-based systems: From architecture to governance,” Ieee Access , vol. 7, pp. 186 091–186 107, 2019.
  • [21] W. Dai, C. Dai, K.-K. R. Choo, C. Cui, D. Zou, and H. Jin, “Sdte: A secure blockchain-based data trading ecosystem,” IEEE Transactions on Information Forensics and Security , vol. 15, pp. 725–737, 2019.
  • [22] D. Zhang, B. Xia, Y. Liu, X. Xu, T. Hoang, Z. Xing, M. Staples, Q. Lu, and L. Zhu, “Tag your fish in the broken net: A responsible web framework for protecting online privacy and copyright,” arXiv preprint arXiv:2310.07915 , 2023.
  • [23] X. Xu, H. Dilum Bandara, Q. Lu, I. Weber, L. Bass, and L. Zhu, “A decision model for choosing patterns in blockchain-based applications,” in 2021 IEEE 18th International Conference on Software Architecture (ICSA) , 2021, pp. 47–57.
  • [24] B. Xia, D. Zhang, Y. Liu, Q. Lu, Z. Xing, and L. Zhu, “Trust in software supply chains: Blockchain-enabled sbom and the aibom future,” arXiv preprint arXiv:2307.02088 , 2023.

The Top 5 Data Challenges Faced By Supply Chain and Logistics Providers

David Berry, Director, Client Services

Disruptions in the global supply chain demand innovation and ongoing adaptation across the data landscape. Whether it is volatile fuel prices, a shrinking labor force, or geopolitical trends, we understand that your logistics operations are under constant pressure from external forces. Hakkoda can help you fight these headwinds by activating valuable insights and focusing on business outcomes that mitigate risk for your business.

In this blog, we will walk through five of the biggest data challenges supply chain and logistics organizations face, and how a modern data stack centered on the Snowflake AI Data Cloud, together with a trusted data partner like Hakkoda, can help you solve them.

Challenge Number 1: Modernization

According to findings from Hakkoda’s State of Data report, chances are your organization is overdependent on legacy, multi-layered WMS/OMS systems. This creates outdated data silos, riddled with tech debt, that are poorly positioned for modern tools capable of leveraging AI.

The good news? Hakkoda’s Data Innovation Journey helps assess your data landscape and prioritizes short-term and long-term strategies to get you back on track. 

Challenge Number 2: Scale

The inability to properly scale warehouse operations can create problems well beyond inefficient client onboarding. Failure to adopt common data models, repeatable patterns, and proper visibility can lead to slower inventory turnover, higher return rates, and missed shipment SLAs.

Hakkoda has helped logistics companies reorganize their data processes with the sole aim of shortening the time it takes to onboard a new customer, reducing onboarding time by over 65% and leading to dramatic increases in margin and customer loyalty.

Challenge Number 3: Forecasting Sophistication

An accurate and reliable inventory or capacity forecast can make or break customer satisfaction. In many cases, legacy solutions have not evolved with changing business needs, or are still simply a black box, not allowing for basic customization or full understanding. 

Hakkoda has helped numerous Fortune 500 companies optimize their forecasting tools and rethink their approach with custom, outside-the-box solutions. Whether by leveraging technology like Snowpark for predictive analytics and propensity modeling, or by creating custom applications that solve even the most complex carryover processes, we’re confident we can help.

Challenge Number 4: Optimized Labor Model

We understand labor is most likely the largest line item on your P&L. Do you have the visibility you need to control the right levers? Hakkoda can help harmonize common data models across operations, time management, human resources, and more to help you be proactive. Whether the issue is pick pattern optimization, real-time SLA insight, or inventory capacity, let your data do the heavy lifting.

Challenge Number 5: SAP Analytics & Reporting

The steep costs and data silos associated with SAP can be a huge blocker for supply chain and logistics companies looking to refine their demand forecasting models. With BW and ECC coming to the end of their lifecycle, now is the time to migrate analytics workloads to Snowflake, unify your enterprise’s data in one place, and save big on retiring legacy tech. 

Hakkoda has in-house SAP experts with deep experience in the ERP space and teams equipped with the technical skill sets you need to tackle these complex challenges.

So You’ve Identified Your Top Data Challenges. Now What?

Every industry has a distinctive set of obstacles to navigate on its path to innovation, and the supply chain and logistics space is no exception. The complexities don’t stop there. While many of the challenges above may resonate with your organization as it navigates its data modernization initiatives, there will no doubt be other, equally crucial considerations that are unique to your enterprise. Nor is it all about the challenges you face: your financial and operational goals are just as informative when it comes to building a data stack that works for your organization.

Hakkoda’s supply chain and logistics team was built with a deep understanding of the industry at its core. We don’t just deliver data outcomes, we deliver business outcomes backed by years of experience in businesses like yours. We also understand that not every tool is the right fit for every enterprise, which is why we bring knowledge from across the modern data stack to bear on every client engagement—to help you build the data strategy that works best for your business.

Ready to take top challenges in your industry head-on with a data consultancy that understands your goals? Talk to one of our experts today.

Tesla: Transforming Supply Chain Sustainability

Tesla has launched its first all-electric train service in Germany, dubbed the "Giga Train". 

The train began operations on August 7, 2024, connecting Erkner station to Tesla Süd, serving both Tesla employees and the general public free of charge.

It represents a significant development in sustainable transportation with important implications for supply chain leaders. 

The Giga Train provides a free, eco-friendly commute for Tesla employees to the Gigafactory Berlin, optimising the workforce supply chain. This efficient transportation solution can potentially improve punctuality, ensuring a steady supply of labour to the factory.

By accommodating up to 500 passengers per trip, the Giga Train helps alleviate traffic congestion on regional roads. This improvement in local infrastructure can lead to more efficient logistics operations, potentially reducing delivery times and transportation costs for Tesla's supply chain.

The battery-electric train is estimated to save approximately 50 tons of CO2 emissions per week. This significant reduction in carbon footprint aligns with sustainable supply chain practices and may help Tesla meet environmental regulations and corporate sustainability goals.
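As a rough back-of-the-envelope sketch, the weekly estimate can be scaled to a yearly figure. The only number taken from the article is the 50 tons per week; the 52-week operating year and the function name `annual_co2_savings_tonnes` are assumptions made here for illustration, not figures published by Tesla or NEB.

```python
# Assumption: the service runs every week of the year (not stated in the article).
WEEKS_PER_YEAR = 52


def annual_co2_savings_tonnes(weekly_savings_tonnes: float,
                              weeks: int = WEEKS_PER_YEAR) -> float:
    """Scale an estimated weekly CO2 saving to a yearly figure."""
    return weekly_savings_tonnes * weeks


if __name__ == "__main__":
    # 50 tons per week is the estimate quoted in the article.
    print(annual_co2_savings_tonnes(50))  # prints 2600
```

Under that assumption the saving would come to roughly 2,600 tons of CO2 per year; any weeks without service would lower the figure proportionally.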

Scalable green transportation

The success of the Giga Train could serve as a model for implementing similar sustainable transportation solutions in other regions. This scalability could potentially benefit Tesla's global supply chain by providing environmentally friendly logistics options at other manufacturing sites.

The train service, operated by Niederbarnimer Eisenbahngesellschaft (NEB), demonstrates successful integration with existing transportation infrastructure.

This collaboration model could be replicated in other locations to enhance supply chain connectivity and efficiency. By focusing on sustainable, efficient transportation solutions like the Giga Train, Tesla is not only addressing immediate logistical needs but also paving the way for more environmentally conscious supply chain practices in the automotive industry and beyond.

Key features

The Giga Train, operated by Niederbarnimer Eisenbahngesellschaft (NEB), is a battery-electric train and the first of its kind in the Berlin and Brandenburg areas. Some notable features include:

  • Capacity : Each train can accommodate up to 500 passengers, with two carriages holding 120 passengers each.
  • Accessibility : The train is designed with ground-level access and bicycle compartments for user convenience.
  • Technology : It is equipped with an advanced passenger information system.

By replacing an older diesel-electric rail vehicle, the Giga Train aligns with Tesla's mission to accelerate the transition to renewable energy. This move towards electric rail transportation is seen as a significant step in reducing carbon emissions and promoting cleaner air.

It uses Siemens Mobility Mireo B battery-electric trains leased from Siemens Mobility subsidiary Smart Train Lease. NEB has announced plans to gradually replace its entire diesel fleet with battery-electric and hydrogen multiple units.

The success of the Giga Train could potentially pave the way for widespread adoption of electric trains in cities worldwide, revolutionizing public transportation. As Tesla continues to push the boundaries of electric transportation, the transportation industry will be closely watching the performance and efficiency of this new electric train service.

Greening supply chains

Innovations such as Tesla's Giga Train could transform supply chains of the future by providing a highly efficient and sustainable alternative to traditional freight transport.

Utilising advanced electric propulsion and cutting-edge technology, the Giga Train significantly reduces greenhouse gas emissions and operational costs, making logistics more eco-friendly and cost-effective.

Its state-of-the-art automation and high-capacity design facilitate faster and more reliable product delivery, optimising inventory management and shortening lead times.

The integration of the Giga Train into existing supply chain networks can streamline operations, enhance distribution route flexibility and support the global shift towards greener transportation solutions.

This innovative approach not only boosts the efficiency of goods movement but also contributes to a more sustainable and resilient supply chain infrastructure.

Supply Chain Digital  is a  BizClik  brand.

Navigating Supply Chain Resilience in the Knowledge Economy: Insights from Digital Technologies and Organizational Information Processing

  • Published: 23 August 2024

  • Yang Han 1  

This study investigates the role of supply chain resilience in the knowledge economy, focusing on the impact of digital technologies and organizational information processing theory (OIPT) in managing supply chain disruptions. The research employs a mixed-methods approach, combining a survey of 200 supply chain professionals and in-depth interviews with experts in the field. The results indicate that organizations with higher information processing capacity can better anticipate and respond to supply chain disruptions, recovering 20% faster than their counterparts. Additionally, digital technologies such as blockchain, additive manufacturing, artificial intelligence, and the Internet of Things (IoT) significantly enhance supply chain resilience by improving information processing, reducing lead times, and increasing flexibility. The study also highlights the importance of effective collaboration and communication among supply chain stakeholders in managing disruptions and enhancing resilience. The findings contribute to the literature on supply chain resilience, digital technologies, and OIPT, providing practical recommendations for organizations seeking to improve their supply chain resilience in the knowledge economy.

Author information

Authors and affiliations

School of Economics and Management, Northeast Agricultural University, Harbin, 150006, China

Corresponding author

Correspondence to Yang Han.

Ethics declarations

Competing interests

The authors declare no competing interests.


About this article

Han, Y. Navigating Supply Chain Resilience in the Knowledge Economy: Insights from Digital Technologies and Organizational Information Processing. J Knowl Econ (2024). https://doi.org/10.1007/s13132-024-02289-x

Received: 24 April 2024

Accepted: 10 August 2024

Published: 23 August 2024

DOI: https://doi.org/10.1007/s13132-024-02289-x


  • Supply chain resilience
  • Digital technologies
  • Organizational information processing theory
  • Knowledge economy
  • Disruptions

Case Study: Supply Chain Analytics in Power BI

Learn how to use Power BI for supply chain analytics in this case study. Create a make vs. buy analysis tool, calculate costs, and analyze production volumes.

The Buy Option

In this first chapter, you will learn about the make and buy options in a supply chain. Then you will become acquainted with the concepts of extended cost and full cost. You will learn how to calculate these with a basic quote dataset.
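As a rough illustration of the two cost concepts, the sketch below assumes a common convention: extended cost is unit price times quantity, and full cost adds per-unit freight and a one-time tooling fee. The course's exact definitions may differ, and the `Quote` fields and sample figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Quote:
    """One supplier quote (all fields are illustrative assumptions)."""
    supplier: str
    unit_price: float        # price per unit
    freight_per_unit: float  # shipping cost per unit
    tooling_fee: float       # one-time charge, independent of volume


def extended_cost(quote: Quote, volume: int) -> float:
    """Extended cost, assumed here to be unit price times quantity."""
    return quote.unit_price * volume


def full_cost(quote: Quote, volume: int) -> float:
    """Full cost, assumed here to be extended cost plus freight and tooling."""
    return extended_cost(quote, volume) + quote.freight_per_unit * volume + quote.tooling_fee


sample_quote = Quote("Supplier A", unit_price=2.5, freight_per_unit=0.5, tooling_fee=1000.0)
sample_extended = extended_cost(sample_quote, 1000)
sample_full = full_cost(sample_quote, 1000)
print(sample_extended, sample_full)
```

Keeping the two functions separate makes it easy to see how much of the total comes from the quoted price alone and how much from the additional charges.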

Creating a Volume Parameter and Scenario Analysis

In this chapter, you will explore building a scenario analysis tool to examine how your cost calculations change as volumes change. You will create visualizations of how the full cost changes across the different potential production volumes.
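A volume scenario analysis of this kind can be sketched outside Power BI as well. The example below is a minimal Python sketch under the assumption that each quote has a per-unit price and a one-time fixed fee; the supplier names, prices, and volumes are hypothetical.

```python
def full_cost(unit_price: float, fixed_fee: float, volume: int) -> float:
    """Full cost under the simplifying assumption: price * volume + fixed fee."""
    return unit_price * volume + fixed_fee


def scenario_table(quotes: dict, volumes: list) -> list:
    """For each candidate volume, compute every quote's full cost and the cheapest quote."""
    rows = []
    for volume in volumes:
        costs = {name: full_cost(price, fee, volume) for name, (price, fee) in quotes.items()}
        cheapest = min(costs, key=costs.get)
        rows.append((volume, costs, cheapest))
    return rows


# Hypothetical quotes: name -> (unit price, one-time fixed fee)
quotes = {"Supplier A": (2.0, 5000.0), "Supplier B": (3.0, 1000.0)}
table = scenario_table(quotes, [1000, 5000, 10000])
for volume, costs, cheapest in table:
    print(volume, costs, cheapest)
```

In this made-up data the quote with the lower fixed fee wins at low volume, while the quote with the lower unit price wins once volume is large enough to absorb its fixed fee.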

The Make Option

You will learn how to assess internal manufacturing costs, the "Make" option in a make-versus-buy analysis. You will also learn how to communicate your results.
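One simple way to frame the comparison is a break-even volume. The sketch below assumes a linear cost model that is not taken from the course: making incurs a fixed cost plus a variable cost per unit, and buying costs a flat price per unit. All figures are hypothetical.

```python
def make_cost(fixed_cost: float, variable_cost: float, volume: int) -> float:
    """Internal manufacturing cost under an assumed linear model."""
    return fixed_cost + variable_cost * volume


def buy_cost(unit_price: float, volume: int) -> float:
    """Purchase cost at a flat (assumed) price per unit."""
    return unit_price * volume


def break_even_volume(fixed_cost: float, variable_cost: float, unit_price: float) -> float:
    """Volume at which make and buy cost the same; needs unit_price > variable_cost."""
    if unit_price <= variable_cost:
        raise ValueError("Buying is never more expensive per unit; no break-even exists.")
    return fixed_cost / (unit_price - variable_cost)


def decide(fixed_cost: float, variable_cost: float, unit_price: float, volume: int) -> str:
    """Return 'make' if making is strictly cheaper at this volume, otherwise 'buy'."""
    if make_cost(fixed_cost, variable_cost, volume) < buy_cost(unit_price, volume):
        return "make"
    return "buy"


threshold = break_even_volume(20000.0, 3.0, 5.0)
print(threshold, decide(20000.0, 3.0, 5.0, 5000), decide(20000.0, 3.0, 5.0, 20000))
```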

Nicholas Switzer

Analytics Solutions Lead at Stryker


Enhancing Space Management through Digital Twin: A Case Study of the Lazio Region Headquarters

1. Introduction

1.1. Advanced Digital Technologies

1.2. Space Management in Smart Working Adoption

2. Materials and Methods

  • Variations due to the active interaction of occupants with building systems. The active interaction of occupants with building energy systems is a complex mechanism, where numerous parameters, including the number of occupants, occupant behavior, type of space, type of work, day, and time, exert a significant influence. The aforementioned parameters can be classified into three distinct groups: temporal, spatial, and occupancy. Temporal parameters indicate the time scale, while spatial parameters indicate spatial information. Occupancy parameters, on the other hand, indicate occupant behavior. It can be reasonably argued that psychological and sociological aspects exert a considerable influence on occupant behavior. Given the inherent unpredictability of occupants’ behavior, it remains challenging to identify any specific circumstances that may influence the consumption of energy. This uncertainty has a significant impact on the indoor environment, which in turn contributes to the observed variations in energy consumption. Other factors that influence occupants’ behavior include temperature, humidity, illuminance factor, and several others, which collectively affect the building’s energy consumption. The most common forms of occupant behavior are adjusting the thermostat, dimming the lighting, switching off lights, and using electronic devices.
  • Variations due to the passive interaction of occupants with building systems. Passive interaction depends on the presence of occupants, reported as hours of occupancy in the building, and on their absence, reported as hours of unoccupied building operation. Inefficient operation of the electrical and mechanical systems can result in high energy wastage during unoccupied hours. This issue was investigated further to estimate the amount of energy wasted: between 26% and 65% of energy is used during unoccupied/occupied hours, compared to working hours between 7.30 a.m. and 5 p.m. Of the total consumption, between 19% and 28% (mainly electricity and HVAC) is used during the weekend, when the building is unoccupied [ 44 ].
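Shares of this kind can be estimated from time-stamped meter readings. The following is a minimal sketch, assuming readings arrive as (timestamp, kWh) pairs and that working hours are weekdays from 7.30 a.m. to 5 p.m.; the function name and sample readings are hypothetical and do not reproduce the study's method.

```python
from datetime import datetime, time

WORK_START = time(7, 30)  # assumed start of working hours
WORK_END = time(17, 0)    # assumed end of working hours


def energy_shares(readings):
    """Return (share outside working hours, share on weekends) of total consumption.

    readings: iterable of (datetime, kWh) pairs. Weekend readings also count
    as outside working hours.
    """
    readings = list(readings)
    total = sum(kwh for _, kwh in readings)
    if total == 0:
        return 0.0, 0.0
    weekend = sum(kwh for ts, kwh in readings if ts.weekday() >= 5)
    off_hours = sum(
        kwh
        for ts, kwh in readings
        if ts.weekday() >= 5 or not (WORK_START <= ts.time() < WORK_END)
    )
    return off_hours / total, weekend / total


# Hypothetical readings: Monday daytime, Monday night, Saturday daytime
sample_readings = [
    (datetime(2024, 1, 8, 10, 0), 6.0),
    (datetime(2024, 1, 8, 22, 0), 2.0),
    (datetime(2024, 1, 13, 10, 0), 2.0),
]
off_share, weekend_share = energy_shares(sample_readings)
print(off_share, weekend_share)
```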

2.1. Methodology

  • Space analysis via the BIM model: Through the use of IoT sensors, which can be easily installed and configured, it is possible to start the occupancy assessment process.
      - Use analysis: Evaluation of office space use to identify inefficiencies or cases of overutilization.
      - Occupancy measurement: Determination of the occupancy percentage for each floor, with a detailed map of space use.
  • Space analysis and management (statistics): Data obtained from a dedicated booking application, processed through BIM-integrated software, make it possible to examine the actual use of workspaces, with the objective of optimizing their management. The specific objectives are as follows:
      - Presence forecasting: Estimating building occupancy over the next 30 days based on historical booking data, allowing better maintenance services and the anticipation of space and resource requirements.
      - Savings analysis: Quantification of potential maintenance cost savings through appropriate operational downsizing. By analyzing space utilization, it is possible to identify areas that can be consolidated or downsized, thus reducing operating costs.
      - Duration of bookings: Calculation of the average duration of room bookings to optimize space availability. Knowing the typical duration of bookings makes it possible to plan the use of meeting rooms better, avoiding overlaps and improving efficiency in resource allocation.
      - Temporal assessment of room use: Examination of meeting room usage by time of day and day of the week and the identification of high- and low-occupancy periods. This provides a better understanding of space utilization patterns, allowing for more flexible and adaptive management.
  • Insight and automated alerts: Provide automatic alerts to the building manager [ 55 ] to improve space management [ 56 ]. These alerts signal when there is a high probability that a particular floor or the entire building will reach 100% occupancy on a particular day or period, allowing for more efficient planning of the necessary resources.
  • Machine learning: This stage uses ML to make reliable predictions and generate insights [ 57 ], going beyond the traditional programming approach [ 58 ]. The system learns from the relationships between data and outcomes, identifying patterns and connections that may not be obvious to human intelligence [ 59 ]. This ‘knowledge discovery’ process not only provides useful results, but can also reveal new, previously unknown/ignored/unexpected information.
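Three of the quantities listed above (occupancy percentage per floor, average booking duration, and an alert when full occupancy is likely) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the data layout, the function names, the use of the historical frequency of full days as a probability estimate, and the 0.5 alert threshold are all assumptions. The machine learning stage is not covered here.

```python
from datetime import datetime


def occupancy_percent(occupied_by_floor, capacity_by_floor):
    """Occupancy percentage per floor from occupied seats and seat capacity."""
    return {
        floor: 100.0 * occupied_by_floor.get(floor, 0) / capacity
        for floor, capacity in capacity_by_floor.items()
        if capacity > 0
    }


def average_booking_minutes(bookings):
    """Average duration in minutes of (start, end) datetime pairs."""
    if not bookings:
        return 0.0
    total_seconds = sum((end - start).total_seconds() for start, end in bookings)
    return total_seconds / 60.0 / len(bookings)


def probability_full(history_percent):
    """Naive estimate: share of past days on which occupancy reached 100%."""
    if not history_percent:
        return 0.0
    return sum(1 for p in history_percent if p >= 100.0) / len(history_percent)


def floors_to_alert(history_by_floor, threshold=0.5):
    """Floors whose estimated probability of full occupancy meets the (assumed) threshold."""
    return sorted(
        floor
        for floor, history in history_by_floor.items()
        if probability_full(history) >= threshold
    )


percentages = occupancy_percent({"1": 30, "2": 45}, {"1": 60, "2": 50})
avg_minutes = average_booking_minutes([
    (datetime(2024, 5, 6, 9, 0), datetime(2024, 5, 6, 10, 0)),
    (datetime(2024, 5, 6, 14, 0), datetime(2024, 5, 6, 14, 30)),
])
alerts = floors_to_alert({"1": [100, 80, 100, 100], "2": [50, 60, 70, 80]})
print(percentages, avg_minutes, alerts)
```

A production system would likely replace the frequency estimate with a forecast built on booking data, but the alert logic (compare an estimated probability with a threshold) would keep the same shape.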

2.1.1. Analysis of Spaces via the BIM Model—Phase 1

2.1.2. Optimizing Space Management through Data Analysis (Statistics)—Phase 2

2.1.3. Insight and Automatic Alarms—Phase 3

2.1.4. Machine Learning for Enhanced Predictive Insights—Phase 4

3. Results and Discussion

3.1. Case Study: Lazio Region Headquarters

3.2. Digital Transformation in Modern Work Environments

4. Conclusions

Author Contributions

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

  • Cumo, F. Innovation for the Digitization Process of the AECO Sector. In Proceedings of the Technological Imagination in the Green and Digital Transition (CONF.ITECH 2022), Rome, Italy, 30 June–1 July 2022; The Urban Book Series. Springer: Cham, Switzerland, 2023. [ Google Scholar ] [ CrossRef ]
  • Piras, G.; Agostinelli, S.; Muzi, F. Digital Twin Framework for Built Environment: A Review of Key Enablers. Energies 2024 , 17 , 436. [ Google Scholar ] [ CrossRef ]
  • Lamagna, M.; Groppi, D.; Nezhad, M.M.; Piras, G. A comprehensive review on Digital twins for Smart energy management system. Int. J. Energy Prod. Manag. 2021 , 6 , 323–334. [ Google Scholar ] [ CrossRef ]
  • Grieves, M. Digital Twin: Manufacturing Excellence through Virtual Factory Replication. White Paper. 2014. Available online: https://www.researchgate.net/publication/275211047_Digital_Twin_Manufacturing_Excellence_through_Virtual_Factory_Replication (accessed on 15 April 2024).
  • Research and Markets. Digital Twin Market Research Report: By Type, Technology, Enterprise, Application, Industry—Global Industry Analysis and Growth Forecast to 2030. Available online: https://www.researchandmarkets.com/reports/5128896/digital-twin-market-research-report-by-type (accessed on 11 March 2024).
  • Muzi, F.; Marzo, R.; Nardi, F. Digital Information Management in the Built Environment: Data-Driven Approaches for Building Process Optimization. In Proceedings of the International Conference on Technological Imagination in the Green and Digital Transition, Rome, Italy, 30 June–2 July 2022; Springer International Publishing: Cham, Switzerland, 2023; pp. 123–132. [ Google Scholar ] [ CrossRef ]
  • Orsini, G.; Piras, G. Digital Construction and Management the Public’s Infrastructures. In Proceedings of the Technological Imagination in the Green and Digital Transition (CONF.ITECH 2022), Rome, Italy, 30 June–1 July 2022; The Urban Book Series. Springer: Cham, Switzerland, 2023. [ Google Scholar ] [ CrossRef ]
  • IFC. Available online: https://www.buildingsmart.org/standards/bsi-standards/industry-foundation-classes/ (accessed on 29 April 2024).
  • Matarneh, S.; Elghaish, F.; Rahimian, F.P.; Dawood, N.; Edwards, D. Automated and interconnected facility management system: An open IFC cloud-based BIM solution. Autom. Constr. 2022 , 143 , 104569. [ Google Scholar ] [ CrossRef ]
  • Cespedes-Cubides, A.S.; Jradi, M. A review of building digital twins to improve energy efficiency in the building operational stage. Energy Inf. 2024 , 7 , 11. [ Google Scholar ] [ CrossRef ]
  • Alankarage, S.; Chileshe, N.; Rameezdeen, R.; Edwards, D.J.; Samaraweera, A. Exploring BIM-triggered organisational and professional culture change: A systematic literature review. Constr. Innov. 2023 , 23 , 229–247. [ Google Scholar ] [ CrossRef ]
  • Tu, B.; Zuo, J.; Chang, R.-D.; Webber, R.J.; Xiong, F.; Dong, N. A system dynamic model for assessing the level of BIM implementation in construction phase: A China case study. Eng. Constr. Archit. Manag. 2023 , 30 , 1321–1343. [ Google Scholar ] [ CrossRef ]
  • Sun, J.; Paasch, J.M.; Paulsson, J.; Tarandi, V.; Harrie, L. A BIM-based approach to design a lifecycle 3D property formation process: A Swedish case study. Land Use Policy 2023 , 131 , 106712. [ Google Scholar ] [ CrossRef ]
  • Piras, G.; Muzi, F. Energy Transition: Semi-Automatic BIM Tool Approach for Elevating Sustainability in the Maputo Natural History Museum. Energies 2024 , 17 , 775. [ Google Scholar ] [ CrossRef ]
  • Sakr, M.; Sadhu, A. Visualization of structural health monitoring information using Internet-of-Things integrated with building information modeling. J. Infrastruct. Intell. Resil. 2023 , 2 , 100053. [ Google Scholar ] [ CrossRef ]
  • Pan, Y.; Zhang, L. A BIM-data mining integrated digital twin framework for advanced project management. Autom. Constr. 2021 , 124 , 103564. [ Google Scholar ] [ CrossRef ]
  • Blut, C.; Becker, R.; Kinnen, T.; Schluetter, D.; Emunds, C.; Frisch, J.; Heidermann, D.; Wenthe, M.; Rettig, T.; Baranski, M.; et al. Optimizing Building Energy Systems through BIM-enabled georeferenced Digital Twins. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2024 , 48 , 1–8. [ Google Scholar ] [ CrossRef ]
  • Ngamakeur, K.; Yongchareon, S.; Yu, J.; Islam, S. Passive infrared sensor dataset and deep learning models for device-free indoor localization and tracking. Pervasive Mob. Comput. 2023 , 88 , 101721. [ Google Scholar ] [ CrossRef ]
  • Wu, Y.; Chen, S.; Jin, Y.; Xu, H.; Zhou, X.; Wang, X.; Chong, A.; Li, J.; Yan, D. Novel occupancy detection method based on convolutional neural network model using PIR sensor and smart meter data. Adv. Eng. Inform. 2024 , 62 , 102589. [ Google Scholar ] [ CrossRef ]
  • Futagami, T. Experimental evaluation for occupancy lighting control using combination of PIR and image-based sensors. SICE J. Control. Meas. Syst. Integr. 2024 , 17 , 32–42. [ Google Scholar ] [ CrossRef ]
  • Romero, D.A.V.; Laureano, E.V.; Betancourt, R.O.J.; Álvarez, E.N. An open source IoT edge-computing system for monitoring energy consumption in buildings. Results Eng. 2024 , 21 , 101875. [ Google Scholar ] [ CrossRef ]
  • Cumo, F.; Piras, G.; Pennacchia, E.; Cinquepalmi, F. Optimization of design and management of a hydroponic greenhouse by using BIM application software. Int. J. Sustain. Dev. Plan. 2020 , 15 , 157–163. [ Google Scholar ] [ CrossRef ]
  • Cumo, F.; Pennacchia, E.; Zylka, C. Energy-Efficient Solutions: A Multi-Criteria Decision Aid Tool to Achieve the Targets of the European EPDB Directive. Energies 2023 , 16 , 6245. [ Google Scholar ] [ CrossRef ]
  • Cumo, F.; Giustini, F.; Pennacchia, E.; Romeo, C. The “D2P” Approach: Digitalization, Production and Performance in the Standardised Sustainable Deep Renovation of Buildings. Energies 2022 , 15 , 6689. [ Google Scholar ] [ CrossRef ]
  • Piras, G.; Muzi, F.; Tiburcio, V.A. Digital Management Methodology for Building Production Optimization through Digital Twin and Artificial Intelligence Integration. Buildings 2024 , 14 , 2110. [ Google Scholar ] [ CrossRef ]
  • Mckinsey Tech Trends 2022 Report. Available online: https://www.mckinsey.com/~/media/mckinsey/business%20functions/mckinsey%20digital/our%20insights/the%20top%20trends%20in%20tech%202022/mckinsey-tech-trends-outlook-2022-full-report.pdf (accessed on 10 June 2024).
  • Heydari, A.; Nezhad, M.M.; Keynia, F.; Fekih, A.; Shahsavari-Pour, N.; Garcia, D.A.; Piras, G. A combined multi-objective intelligent optimization approach considering techno-economic and reliability factors for hybrid-renewable microgrid systems. J. Clean. Prod. 2023 , 383 , 135249. [ Google Scholar ] [ CrossRef ]
  • Adhikary, A.; Halder, S.; Bose, R.; Panja, S.; Halder, S.; Pratihar, J.; Dey, A. Design and Implementation of an IOT-based Smart Home Automation System in Real World Scenario. EAI Endorsed Trans. Internet Things 2024 , 10 . [ Google Scholar ] [ CrossRef ]
  • Nivedha, M.; Titus, S. IoT-based monitoring of smart grid using high-gain converter with optimized maximum power point tracking. Electr. Eng. 2024 , 106 , 2297–2311. [ Google Scholar ] [ CrossRef ]
  • Sujith, M.; Prabu, R.T.; Kumar AT, A.; Kumar, A. Performance analysis of CsPbI3-based solar cells under light emitting diode illumination as an energy harvester for IoT and indoor photovoltaics. J. Comput. Electron. 2024 , 23 , 866–873. [ Google Scholar ] [ CrossRef ]
  • Krishna Rao, C.; Sahoo, S.K.; Yanine, F.F. An IoT-based intelligent smart energy monitoring system for solar PV power generation. Energy Harvest. Syst. 2024 , 11 , 20230015. [ Google Scholar ] [ CrossRef ]
  • Rosa, R.L.; Boulebnane, L.; Pagano, A.; Giuliano, F.; Croce, D. Towards Mass-Scale IoT with Energy-Autonomous LoRaWAN Sensor Nodes. Sensors 2024 , 24 , 4279. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Cinquepalmi, F.; Piras, G. Earth Observation Technologies for Mitigating Urban Climate Changes. In Proceedings of the Technological Imagination in the Green and Digital Transition (CONF.ITECH 2022), Rome, Italy, 30 June–1 July 2022; The Urban Book Series. Springer: Cham, Switzerland, 2023. [ Google Scholar ] [ CrossRef ]
  • Rewired and Running Ahead: Digital and AI Leaders Are Leaving the Rest Behind. Available online: https://www.mckinsey.com/capabilities/mckinsey-digital/our-insights/rewired-and-running-ahead-digital-and-ai-leaders-are-leaving-the-rest-behind (accessed on 18 June 2024).
  • Regona, M.; Yigitcanlar, T.; Hon, C.; Teo, M. Artificial intelligence and sustainable development goals: Systematic literature review of the construction industry. Sustain. Cities Soc. 2024 , 108 , 105499. [ Google Scholar ] [ CrossRef ]
  • Rajkumar, N.; Viji, C.; Latha, P.M.; Vennila, V.B.; Shanmugam, S.K.; Pillai, N.B. The power of AI, IoT, and advanced quantum based optical systems in smart cities. Opt. Quantum Electron. 2024 , 56 , 450. [ Google Scholar ] [ CrossRef ]
  • Asif, M.; Naeem, G.; Khalid, M. Digitalization for sustainable buildings: Technologies, applications, potential, and challenges. J. Clean. Prod. 2024 , 450 , 141814. [ Google Scholar ] [ CrossRef ]
  • La Gestione BIM della Classificazione di Ponti Esistenti, Harpaceas. 2021. Available online: https://www.harpaceas.it/blog/pontisicuri-la-gestione-bim-della-classificazione-di-ponti-esistenti (accessed on 16 June 2024).
  • Osservatori Digital PoliMi. Available online: https://www.osservatori.net/it/ricerche/comunicati-stampa/smart-working-emergenza-covid19-new-normal (accessed on 14 March 2024).
  • International Data Corporations. Available online: https://www.idc.com/getdoc.jsp?containerId=IDC_P44469 (accessed on 14 March 2024).
  • Marketers State of Remote Sensing. Available online: https://wearemarketers.net/remote-working-2021/ (accessed on 15 March 2024).
  • Mao, Y.; Chen, J.; Liu, X.; Dang, J.; Schiöth, H.B. Social support predicted subsequent subjective well-being during the COVID-19 pandemic: A prospective study. BMC Public Health 2024 , 24 , 943. [ Google Scholar ] [ CrossRef ]
  • Gourabpasi, A.H.; Nik-Bakht, M. BIM-based automated fault detection and diagnostics of HVAC systems in commercial buildings. J. Build. Eng. 2024 , 87 , 109022. [ Google Scholar ] [ CrossRef ]
  • Li, B.; Tavakoli, A.; Heydarian, A. Occupant privacy perception, awareness, and preferences in smart office environments. Sci Rep. 2023 , 13 , 4073. [ Google Scholar ] [ CrossRef ]
  • Marikyan, D.; Papagiannidis, S.; FRana, O.; Ranjan, R. Working in a smart home environment: Examining the impact on productivity, well-being and future use intention. Internet Res. 2024 , 34 , 447–473. [ Google Scholar ] [ CrossRef ]
  • Edwardson, C.L.; Maylor, B.D.; Biddle, S.J.H.; Clarke-Cornwell, A.M.; Clemes, S.A.; Davies, M.J.; Dunstan, D.W.; Granat, M.H.; Gray, L.J.; Hadjiconstantinou, M.; et al. Participant and workplace champion experiences of an intervention designed to reduce sitting time in desk-based workers: SMART work & life. Int. J. Behav. Nutr. Phys. Act. 2023 , 20 , 142. [ Google Scholar ] [ CrossRef ]
  • Direttiva EU PE 102 2023 REV 1. Available online: https://eur-lex.europa.eu/legal-content/IT/TXT/?uri=CONSIL%3APE_102_2023_REV_1&qid=1714658803749 (accessed on 22 March 2024).
  • Smart Working Analysis on Worker. Available online: https://www.osservatori.net/it/ricerche/comunicati-stampa/smart-working-italia-numeri-trend (accessed on 10 May 2024).
  • Li, J.; Herdem, M.S.; Nathwani, J.; Wen, J.Z. Methods and applications for Artificial Intelligence, Big Data, Internet of Things, and Blockchain in smart energy management. Energy AI 2023 , 11 , 100208. [ Google Scholar ] [ CrossRef ]
  • IEA. Energy Efficiency of Buildings as One of the Main Criteria for Limiting CO2 Emissions. Available online: https://www.iea.org/energy-system/buildings (accessed on 25 July 2024).
  • Smart Working Effects on Companies. Available online: https://codacons.it/bollette-codacons-prezzi-record-di-energia-e-benzina-incentivano-smartworking-lavoratore-risparmia-fino-a-5-100-euro-allanno-per-aziende-costi-si-abbattono-del-30/ (accessed on 15 May 2024).
  • Di Girolamo, P.; Piras, G.; Pini, F. The effect of COVID-19 on the distribution of PM10 pollution classes of vehicles: Comparison between 2020 and 2018. Sci. Total Environ. 2022 , 811 , 152036. [ Google Scholar ] [ CrossRef ]
  • Wang, C. Refined intelligent manufacturing enterprise human management based on IoT and machine learning technology. Int. J. Adv. Manuf. Technol. 2024 , 1–12. [ Google Scholar ] [ CrossRef ]
  • Palupiningtyas, D.; Maria, A.D.; Wijoyo, T.A.; Alyka, A.P.; Brawarso, K.Z.P. Application of Rapid Application Development Method in Designing Knowledge Management System to Improve Employee Knowledge and Performance at Ministry of Agriculture. J. Inf. Dan Teknol. 2024 , 29–35. [ Google Scholar ] [ CrossRef ]
  • Congiu, E.; Quaquero, E.; Rubiu, G.; Vacca, G. Building Information Modeling and Geographic Information System: Integrated Framework in Support of Facility Management (FM). Buildings 2024 , 14 , 610. [ Google Scholar ]
  • Chaudhari, P.; Xiao, Y.; Cheng MM, C.; Li, T. Fundamentals, Algorithms, and Technologies of Occupancy Detection for Smart Buildings Using IoT Sensors. Sensors 2024 , 24 , 2123. [ Google Scholar ] [ CrossRef ]
  • Westergaard, G.; Erden, U.; Mateo, O.A.; Lampo, S.M.; Akinci, T.C.; Topsakal, O. Time Series Forecasting Utilizing Automated Machine Learning (AutoML): A Comparative Analysis Study on Diverse Datasets. Information 2024 , 15 , 39. [ Google Scholar ] [ CrossRef ]
  • Adeoye, I. Unlocking Potential: A Holistic Approach to Predicting Employee Performance with Business Analytics and Machine Learning. Available at SSRN 4729240 2024 . [ Google Scholar ] [ CrossRef ]
  • Hasan, M.R.; Ray, R.K.; Chowdhury, F.R. Employee Performance Prediction: An Integrated Approach of Business Analytics and Machine Learning. J. Bus. Manag. Stud. 2024 , 6 , 215–219. [ Google Scholar ] [ CrossRef ]
  • Latiffi, A.A.; Brahim, J.; Mohd, S.; Fathi, M.S. Building information modeling (BIM): Exploring level of development (LOD) in construction projects. Appl. Mech. Mater. 2015 , 773 , 933–937. [ Google Scholar ] [ CrossRef ]
  • Dias, P.; Ergan, S. The need for representing facility information with customized LOD for specific FM tasks. In Proceedings of the Construction Research Congress 2016, San Juan, Puerto Rico, 31 May–2 June 2016; pp. 2563–2572. [ Google Scholar ] [ CrossRef ]
  • Alavi, S.H.; Forcada, N. BIM LOD for facility management tasks. In Proceedings of the EC3 Conference 2019, Crete, Greece, 10–12 July 2019; European Council on Computing in Construction: Creta, Greece, 2019; Volume 1, pp. 154–163. [ Google Scholar ]
  • Ordenes, F.V.; Silipo, R. Machine learning for marketing on the KNIME Hub: The development of a live repository for marketing applications. J. Bus. Res. 2021 , 137 , 393–410. [ Google Scholar ] [ CrossRef ]
  • Powell, B. Mastering Microsoft Power BI: Expert Techniques for Effective Data Analytics and Business Intelligence ; Packt Publishing Ltd.: Birmingham, UK, 2018. [ Google Scholar ]
  • Bruce, P.; Bruce, A.; Gedeck, P. Practical Statistics for Data Scientists: 50+ Essential Concepts Using R and Python ; O’Reilly Media: Sebastopol, CA, USA, 2020. [ Google Scholar ]
  • Ferrari, A.; Russo, M. Introducing Microsoft Power BI ; Microsoft Press: Washington, DC, USA, 2016; p. 367. ISBN 9781509302765. [ Google Scholar ]
  • Sim, E. Workplace repositioning post-pandemic: Hybrid working. In Routledge Companion to Creativity and the Built Environment ; Routledge: London, UK, 2024; pp. 84–92. [ Google Scholar ]
  • Lauesen, L.M.; Green, M. The future of work: Marx, Illich, Gorz–are we heading in their directions in post-COVID-19? In The Routledge Companion to the Future of Management Research ; Routledge: London, UK, 2024; pp. 117–135. [ Google Scholar ]
  • Cooper, P.B.; Maraslis, K.; Tryfonas, T.; Oikonomou, G. An intelligent hot-desking model harnessing the power of occupancy sensing data. Facilities 2017 , 35 , 766–786. [ Google Scholar ] [ CrossRef ]
  • Millward, L.J.; Haslam, S.A.; Postmes, T. Putting employees in their place: The impact of hot desking on organizational and team identification. Organ. Sci. 2007 , 18 , 547–559. [ Google Scholar ] [ CrossRef ]
  • Brand, J.L. Health and Productivity Effects of Hot Desks, Just-in-Time Work Spaces, and Other Flexible Workplace Arrangements. Ergon. Workplace Des. Health Wellness Product. 2017 , 341–353. [ Google Scholar ] [ CrossRef ]
  • Cui, W.; Chen, Y.; Xu, B. Application research of intelligent system based on BIM and sensors monitoring technology in construction management. Phys. Chem. Earth Parts A/B/C 2024 , 134 , 103546. [ Google Scholar ] [ CrossRef ]
  • Silipo, R.; Mazanetz, M.P. The KNIME Cookbook ; KNIME Press: Zürich, Switzerland, 2012. [ Google Scholar ]
  • Microsoft PowerBI Architecture. Available online: https://learn.microsoft.com/it-it/power-bi/guidance/center-of-excellence-business-intelligence-solution-architecture (accessed on 19 June 2024).
  • Dietterich, T.G. Ensemble methods in machine learning. In International Workshop on Multiple Classifier Systems ; Springer: Berlin/Heidelberg, Germany, 2000; pp. 1–15. [ Google Scholar ] [ CrossRef ]

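| Floor | Available Stations | Maximum Detected | Unused Stations |
|-------|--------------------|------------------|-----------------|
| 0     | 50                 | 36               | 14              |
| 1     | 120                | 104              | 16              |
| 2     | 120                | 105              | 15              |
| 3     | 120                | 97               | 23              |
| 4     | 120                | 102              | 18              |
| 5     | 120                | 88               | 32              |
| 6     | 120                | 91               | 29              |
| 7     | 120                | 106              | 14              |
| 8     | 120                | 78               | 42              |
| 9     | 120                | 58               | 62              |
| 10    | 120                | 66               | 54              |
| 11    | 120                | 60               | 60              |
| Total | 1370               | 991              | 379             |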
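
The station figures above can be cross-checked with a short script. This is a minimal sketch, assuming that the Unused Stations column is simply Available Stations minus Maximum Detected (the listed rows are consistent with that reading). The utilization ratio is an illustrative derived metric, not a figure from the source.

```python
# Rows copied from the station table: (floor, available stations, maximum detected).
rows = [
    (0, 50, 36), (1, 120, 104), (2, 120, 105), (3, 120, 97),
    (4, 120, 102), (5, 120, 88), (6, 120, 91), (7, 120, 106),
    (8, 120, 78), (9, 120, 58), (10, 120, 66), (11, 120, 60),
]


def unused(available: int, detected: int) -> int:
    """Assumed definition: stations never occupied at the observed peak."""
    return available - detected


total_available = sum(a for _, a, _ in rows)
total_detected = sum(d for _, _, d in rows)
total_unused = sum(unused(a, d) for _, a, d in rows)

# Illustrative metric (not in the source): share of stations used at peak.
peak_utilization = total_detected / total_available

for floor, a, d in rows:
    print(f"floor {floor:>2}: unused={unused(a, d):>2}  utilization={d / a:.0%}")
print(f"total: available={total_available} detected={total_detected} unused={total_unused}")
```

Running it reproduces the Total row of the table, which suggests the per-floor values were extracted correctly.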
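ANTE: 28.600 m²; POST: 20.800 m²

| Use | ANTE kWh/Year | ANTE MWh/Year | ANTE Costs [EUR/Year] | ANTE t CO2/Year | POST kWh/Year | POST MWh/Year | POST Costs [EUR/Year] | POST t CO2/Year |
|-----|---------------|---------------|-----------------------|-----------------|---------------|---------------|-----------------------|-----------------|
| Lighting, electrical devices | 400,400.00 | 400.40 | 100,100.00 | 774.77 | 291,200.00 | 291.20 | 72,800.00 | 563.47 |
| Domestic hot water | 286,000.00 | 286.00 | 100,100.00 | 27.17 | 208,000.00 | 208.00 | 72,800.00 | 19.76 |
| Cooling | 143,000.00 | 143.00 | 35,750.00 | 774.77 | 104,000.00 | 104.00 | 26,000.00 | 563.47 |
| Heating | 1,115,400.00 | 1115.40 | 278,850.00 | 774.77 | 811,200.00 | 811.20 | 202,800.00 | 563.47 |
| Total | 1,944,800.00 | 1944.80 | 514,800.00 | 2351.49 | 1,414,400.00 | 1414.40 | 374,400.00 | 1710.18 |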
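
The POST energy figures appear to be the ANTE figures scaled by the ratio of the two floor areas. The sketch below checks that reading. It assumes the areas are 28,600 m² and 20,800 m² (the unit and thousands separator are inferred from the garbled header), and the names `scale_to_area` and `intensity_kwh_per_m2` are hypothetical helpers introduced only for this illustration.

```python
# Assumed floor areas, read from the "ANTE, 28.600 m / POST, 20.800 m" header.
ANTE_AREA_M2 = 28_600
POST_AREA_M2 = 20_800

# ANTE consumption in kWh/year, copied from the table.
ante_kwh = {
    "Lighting, electrical devices": 400_400,
    "Domestic hot water": 286_000,
    "Cooling": 143_000,
    "Heating": 1_115_400,
}


def scale_to_area(kwh: float, old_area: float, new_area: float) -> float:
    """Scale consumption linearly with floor area (an assumption of this sketch)."""
    return kwh * new_area / old_area


# Energy intensity per square metre before the intervention.
intensity_kwh_per_m2 = {use: kwh / ANTE_AREA_M2 for use, kwh in ante_kwh.items()}

# Estimated consumption after the area reduction.
post_kwh = {
    use: scale_to_area(kwh, ANTE_AREA_M2, POST_AREA_M2)
    for use, kwh in ante_kwh.items()
}

for use, kwh in post_kwh.items():
    print(f"{use}: {intensity_kwh_per_m2[use]:.0f} kWh/m2 -> {kwh:,.2f} kWh/year")
print(f"Total POST: {sum(post_kwh.values()):,.2f} kWh/year")
```

Under these assumptions the scaled values match the POST kWh/Year column, so the saving in the table can be read as proportional to the reduction in occupied area rather than to any change in energy intensity.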
Share and Cite

Piras, G.; Muzi, F.; Tiburcio, V.A. Enhancing Space Management through Digital Twin: A Case Study of the Lazio Region Headquarters. Appl. Sci. 2024 , 14 , 7463. https://doi.org/10.3390/app14177463

IMAGES

  1. Case Study Interview Supply Chain

  2. Supply Chain Management Case Study Help By No1AssignmentHelp.Com by Case Study Help

  3. Supply Chain Analytics Case Study: Improving Procurement for a Retail Store

  4. Supply Chain Analysis: Case Study

  5. Supply Chain Case Study (Lesson 10)

  6. How to Solve a Supply Chain Case Study Interview [2022]

COMMENTS

  1. Supply Chain Analysis: Data Analysis Case Study Using Excel

    Supply chain analytics plays a crucial role in driving data-driven decision-making across industries, including manufacturing, retail, healthcare, and logistics. In this report, I presented an analysis of a Fashion and Beauty startup's supply chain data, specifically focused on the movement of makeup products. By collecting, analyzing, and interpreting this dataset, I aim to gain insights ...

  2. supply-chain-analytics · GitHub Topics · GitHub

    In this study, we aimed to detect fraudulent activities in the supply chain through the use of neural networks. The study focused on building two machine learning models using the MLPClassifier algorithm from the scikit-learn library and a custom neural network using the Keras library in Python.

  3. Supply Chain Analysis: Case Study

    Download the dataset below to solve this Data Science case study on Supply Chain Analysis. Supply chain analytics is a valuable part of data-driven decision-making in various industries such as manufacturing, retail, healthcare, and logistics. It is the process of collecting, analyzing and interpreting data related to the movement of products ...

  4. Supply Chain DataSet

    A dataset for solving a case study on supply chain analysis.

  5. Supply Chain Optimization with Python

    Find the optimal locations of your manufacturing facilities to meet your customers' demands and reduce production costs. Supply chain optimization makes the best use of data analytics to find an optimal combination of factories and distribution centres to match supply and demand. Because of the current surge in shipping costs, companies start ...

  6. Big data and the supply chain: The big-supply-chain analytics landscape

    Big supply chain analytics uses data and quantitative methods to improve decision making for all activities across the supply chain. In particular, it does two new things. First, it expands the dataset for analysis beyond the traditional internal data held on Enterprise Resource Planning (ERP) and supply chain management (SCM) systems.

  7. Topology of International Supply Chain Networks: A Case Study Using

    In this work, we undertake an extensive structural and topological analysis of the supply chain networks constructed from the FactSet Revere dataset. The dataset, provided by FactSet Research Systems Inc., captures global supply chain relationships between companies.

  8. Visualizing Supply Chains with Neo4j Graph Data Science and Bloom

    Source Logistics Dataset: To explore the application of graph data science to supply chain logistics, we will use the Cargo 2000 transport and logistics case study dataset.

  9. Value of data in multi-level supply chain decisions: a case study in

    This case study focuses on the data-driven multi-level supply chain decision making to overcome logistics challenges in the sector. We present four applications to demonstrate how data are used in very different supply chain decisions.

  10. (PDF) Topology of International Supply Chain Networks: A Case Study

    The dataset, provided by FactSet Research Systems Inc., captures global supply chain relationships between companies.

  11. Supply Chain Data

    Data of a Supply Chain with Producers, Ports and Customers

  12. MisterAare/Supply-chain-analysis-with-excel

    Supply-chain-analysis-with-excel: Supply chain analytics plays a crucial role in driving data-driven decision-making across industries, including manufacturing, retail, healthcare, and logistics. In this report, I presented an analysis of a Fashion and Beauty startup's supply chain data, specifically focused on the movement of makeup products. By collecting, analyzing, and interpreting this ...

  13. Predictive big data analytics for supply chain demand forecasting

    Abstract: Big data analytics (BDA) in supply chain management (SCM) is receiving growing attention because BDA has a wide range of applications in SCM, including customer behavior analysis, trend analysis, and demand prediction. In this survey, we investigate the predictive BDA applications in supply chain demand forecasting to propose a classification of these ...

  14. (PDF) Supply chain data analytics for predicting supplier disruptions: a case

    Our contribution thus includes a case study that demonstrates how data analytics, more specifically machine learning, can be used to predict supply chain disruptions.

  15. DataCo SMART SUPPLY CHAIN FOR BIG DATA ANALYSIS

  16. Goods and Activities Tracking Through Supply Chain Network Using

    The developed models are based on two datasets: DataCo Supply Chain Dataset and SCMS Delivery History Dataset. A case study is worked out, and developed models are analyzed and discussed.

  17. Case Study: Supply Chain Analytics in Tableau

    Use Tableau for Supply Chain Analysis. Test your Tableau skills on a real-world dataset focusing on supply chain analytics. In this case study, you will help solve key challenges of shipment and inventory management, analyze supply chain inefficiencies, and create insightful dashboards to inform business stakeholders about the potential problems and propose structural business improvements.

  18. Descriptive Analysis

    Supply chain case studies have attracted increasing attention since the early 2000s, and the very last years (2014–2015) of the dataset being analysed here in particular have shown significant publication numbers. The average number of cases in published...

  19. Papers with Code

    To address the issue, we present a real-world benchmark dataset for temporal tasks, obtained from one of the leading FMCG companies in Bangladesh, focusing on supply chain planning for production purposes. The dataset includes temporal data as node features to enable sales predictions, production planning, and the identification of factory issues.

  20. Supply chain data analytics for predicting supplier disruptions: a case

    Through the discussion and the development of the case study we hope to shed light on the development and application of data analytics techniques in the analysis of supply chain data.

  21. Using supply chain databases in academic research: A methodological

    The analysis is based on a review of previous studies using Bloomberg SPLC, FactSet Supply Chain Relationships, and Mergent Supply Chain, publicly available materials, interviews with information service providers, and the direct experience of the authors. Some solutions draw upon established methodological literature on the use of secondary data.

  22. Data analytics for sustainable global supply chains

    The interdisciplinary research described in this study aims to investigate and identify patterns in road freight logistics operations in Europe, using data analytics, in order to enhance the sustainability and efficiency of supply chain logistics operations with negligible investment.

  23. Blockchain-Enabled Accountability in Data Supply Chain: A Data Bill of

    The proposed solution is evaluated in terms of feasibility and performance via case study and quantitative analysis respectively. Key terms - Data governance, blockchain ... Please note that if the dataset identifier is changed within data supply chain, the dataset should be recorded in DataBOM registry as a new dataset with the ...

  24. The Top 5 Data Challenges in Supply Chain and Logistics

    Disruptions in the global supply chain demand innovation and ongoing adaptation across the data landscape. Whether it is volatile fuel prices, a shrinking labor force, or geopolitical trends, we understand your logistics operations are under constant pressure from external forces.

  25. Case Study: Universal Accessibility Makes Getting Around the ...

    Description. Team USA's Training Site at the Olympic and Paralympic Games Paris 2024 is an enormous complex that athletes, trainers, and staff unfamiliar with the space need help to traverse quickly and independently.

  26. Tesla: Transforming Supply Chain Sustainability

    Supply Chain Digital Magazine connects the leading supply chain and logistics executives of the world's largest brands. Our platform serves as a digital hub for connecting industry leaders, covering a wide range of services including media and advertising, events, research reports, demand generation, information, and data services. ...

  27. Navigating Supply Chain Resilience in the Knowledge Economy ...

    This study investigates the role of supply chain resilience in the knowledge economy, focusing on the impact of digital technologies and organizational information processing theory (OIPT) in managing supply chain disruptions. The research employs a mixed-methods approach, combining a survey of 200 supply chain professionals and in-depth interviews with experts in the field. The results ...

  28. A fuzzy logic-based risk assessment framework for the crude oil

    A Turkish international port organization case study used the fuzzy MCDM ... and promoting trade and economic growth. The world's oil supply chain connects oil-producing countries to ... RI enables experts to detect critical circumstances by evaluating the frequency of logistical risks in a procedure or dataset. ARI, on the other hand ...

  29. Case Study: Supply Chain Analytics in Power BI

    Learn how to use Power BI for supply chain analytics in this case study. Create a make vs. buy analysis tool, calculate costs, and analyze production volumes.

  30. Enhancing Space Management through Digital Twin: A Case Study of the

    Digital Twin is becoming an increasingly powerful resource in the field of building production, replacing traditional processes in the Architecture, Engineering, Construction and Operations sector. This study is concerned with the development of a DT, enabled by Building Information Modeling, artificial intelligence, machine learning, and the Internet of Things to implement space management ...