Journal of Cloud Computing: Advances, Systems and Applications


Efficiently localizing system anomalies for cloud infrastructures: a novel Dynamic Graph Transformer based Parallel Framework

A cloud environment is a virtual, online, and distributed computing environment that provides users with large-scale services, and cloud monitoring plays an integral role in protecting infrastructures in the clo...


From data to insights: the application and challenges of knowledge graphs in intelligent audit

In recent years, knowledge graph technology has been widely applied in various fields such as intelligent auditing, urban transportation planning, legal research, and financial analysis. In traditional auditin...

HybOff: a Hybrid Offloading approach to improve load balancing in fog environments

Load balancing is crucial in distributed systems like fog computing, where efficiency is paramount. Offloading with different approaches is the key to balancing the load in distributed environments. Static off...

Hierarchical Identity-Based Authenticated Encryption with Keyword Search over encrypted cloud data

With the rapid development of cloud computing technology, cloud storage services are becoming more and more mature. However, the storage of sensitive data on remote servers poses privacy risks and is presently...

Correction to: Enhancing lung cancer diagnosis with data fusion and mobile edge computing using DenseNet and CNN

The original article was published in Journal of Cloud Computing 2024, 13:91

ABWOA: adaptive boundary whale optimization algorithm for large-scale digital twin network construction

Digital twin networks (DTNs), as an emerging network paradigm, have garnered growing attention. For large-scale networks, a crucial problem is how to effectively map physical networks onto the infrastructure plat...

Distance optimization and directional overcurrent relay coordination using edge-powered biogeography-genetic algorithms

The effective functioning and regulation of power systems crucially rely on the coordination of distance and directional overcurrent relays. Accurate fault detection and successful clearing sequences require s...

Towards optimized scheduling and allocation of heterogeneous resource via graph-enhanced EPSO algorithm

Efficient allocation of tasks and resources is crucial for the performance of heterogeneous cloud computing platforms. To achieve harmony between task completion time, device power consumption, and load balanc...

Topic and knowledge-enhanced modeling for edge-enabled IoT user identity linkage across social networks

The Internet of Things (IoT) devices spawn growing diverse social platforms and online data at the network edge, propelling the development of cross-platform applications. To integrate cross-platform data, use...

Privacy-preserving sports data fusion and prediction with smart devices in distributed environment

With the rapid advancement of sports analytics and fan engagement technologies, the volume and diversity of physique data generated by smart devices across various distributed sports platforms have grown signi...

Quantum support vector machine for forecasting house energy consumption: a comparative study with deep learning models

The Smart Grid operates autonomously, facilitating the smooth integration of diverse power generation sources into the grid, thereby ensuring a continuous, reliable, and high-quality supply of electricity to e...

Enhancing multimedia management: cloud-based movie type recognition with hybrid deep learning architecture

Film and movie genres play a pivotal role in captivating relevant audiences across interactive multimedia platforms. With a focus on entertainment, streaming providers are increasingly prioritizing the automat...

MTG_CD: Multi-scale learnable transformation graph for fault classification and diagnosis in microservices

The rapid advancement of microservice architecture in the cloud has led to the necessity of effectively detecting, classifying, and diagnosing run failures in microservice applications. Due to the high dynamic...

Recognizing online video genres using ensemble deep convolutional learning for digital media service management

It's evident that streaming services increasingly seek to automate the generation of film genres, a factor profoundly shaping a film's structure and target audience. Integrating a hybrid convolutional network ...

A bizarre synthesized cascaded optimized predictor (BizSCOP) model for enhancing security in cloud systems

Due to growing network data dissemination in the cloud, elasticity, pay-as-you-go options, globally accessible facilities, and network security have become increasingly important in today's world. Cloud se...

STAM-LSGRU: a spatiotemporal radar echo extrapolation algorithm with edge computing for short-term forecasting

With the advent of Mobile Edge Computing (MEC), shifting data processing from cloud centers to the network edge presents an advanced computational paradigm for addressing latency-sensitive applications. Specif...

Constrained optimal grouping of cloud application components

Cloud applications are built from a set of components often deployed as containers, which can be deployed individually on separate Virtual Machines (VMs) or grouped on a smaller set of VMs. Additionally, the a...

Students health physique information sharing in publicly collaborative services over edge-cloud networks

Data privacy plays a vital role in many aspects of digital life. Today, the world is extensively interconnected through Internet of Things (IoT) technologies. This huge interconnectivity ...

Efficient and secure privacy protection scheme and consensus mechanism in MEC enabled e-commerce consortium blockchain

The application of blockchain technology to the field of e-commerce has solved many dilemmas, such as low transparency of transactions, hidden risks of data security and high payment costs. Mobile edge computi...

A mobile edge computing-focused transferable sensitive data identification method based on product quantization

Sensitive data identification represents the initial and crucial step in safeguarding sensitive information. With the ongoing evolution of the industrial internet, including its interconnectivity across variou...

Blockchain-based 6G task offloading and cooperative computing resource allocation study

In the upcoming era of 6G, the accelerated development of the Internet of Everything and high-speed communication is poised to provide people with an efficient and intelligent life experience. However, the exp...

Deep Reinforcement Learning techniques for dynamic task offloading in the 5G edge-cloud continuum

The integration of new Internet of Things (IoT) applications and services heavily relies on task offloading to external devices due to the constrained computing and battery resources of IoT devices. Up to now,...

Enhancing patient healthcare with mobile edge computing and 5G: challenges and solutions for secure online health tools

Patient-focused healthcare applications are important to patients because they offer a range of advantages that add value and improve the overall healthcare experience. The 5G networks, along with Mobile Edge ...

Online dynamic multi-user computation offloading and resource allocation for HAP-assisted MEC: an energy efficient approach

Nowadays, the paradigm of mobile computing is evolving from a centralized cloud model towards Mobile Edge Computing (MEC). In regions without ground communication infrastructure, incorporating aerial edge comp...

Enhancing lung cancer diagnosis with data fusion and mobile edge computing using DenseNet and CNN

The recent advancements in automated lung cancer diagnosis through the application of Convolutional Neural Networks (CNN) on Computed Tomography (CT) scans have marked a significant leap in medical imaging and...

The Correction to this article has been published in Journal of Cloud Computing 2024, 13:111

Cross-chain asset trading scheme for notaries based on edge cloud storage

Blockchain has penetrated various fields, such as finance, healthcare, supply chains, and intelligent transportation, but the value exchange between different blockchains limits their expansion. Cross-chain ...

An overview of QoS-aware load balancing techniques in SDN-based IoT networks

Increasing and heterogeneous service demands have led to traffic increase, and load imbalance challenges among network entities in the Internet of Things (IoT) environments. It can affect Quality of Service (Q...

MSCO: Mobility-aware Secure Computation Offloading in blockchain-enabled Fog computing environments

Fog computing has evolved as a promising computing paradigm to support the execution of latency-sensitive Internet of Things (IoT) applications. The mobile devices connected to the fog environment are resource...

Correction to: Edge intelligence‑assisted animation design with large models: a survey

The original article was published in Journal of Cloud Computing 2024, 13:48

Provably secure data selective sharing scheme with cloud-based decentralized trust management systems

The smart collection and sharing of data is an important part of cloud-based systems, since huge amounts of data are being created all the time. This feature allows users to distribute data to particular recip...

Stacked-CNN-BiLSTM-COVID: an effective stacked ensemble deep learning framework for sentiment analysis of Arabic COVID-19 tweets

Social networks are popular for advertising, idea sharing, and opinion formation. Due to COVID-19, coronavirus information disseminated on social media affects people’s lives directly. Individuals sometimes ma...

Traffic prediction for diverse edge IoT data using graph network

More researchers are proposing artificial intelligence algorithms for Internet of Things (IoT) devices and applying them to themes such as smart cities and smart transportation. In recent years, relevant resea...

Predicting UPDRS in Parkinson’s disease using ensembles of self-organizing map and neuro-fuzzy

Parkinson's Disease (PD) is a complex, degenerative disease that affects nerve cells that are responsible for body movement. Artificial Intelligence (AI) algorithms are widely used to diagnose and track the pr...

A cloud-edge computing architecture for monitoring protective equipment

The proper use of protective equipment is very important to avoid fatalities. One sector in which this has a great impact is that of construction sites, where a large number of workers die each year. In this s...

A cloud-edge collaborative task scheduling method based on model segmentation

With the continuous development and combined application of cloud computing and artificial intelligence, some new methods have emerged to reduce task execution time for training neural network models in a clou...

Analysis and prediction of virtual machine boot time on virtualized computing environments

Starting a virtual machine (VM) is a common operation in cloud computing platforms. In order to achieve better management of resource provisioning, a cloud platform needs to accurately estimate the VM boot tim...

IoT workload offloading efficient intelligent transport system in federated ACNN integrated cooperated edge-cloud networks

Intelligent transport systems (ITS) provide various cooperative edge cloud services for roadside vehicular applications. These applications offer additional diversity, including ticket validation across transp...

Intelligent code search aids edge software development

The growth of multimedia applications poses new challenges to software facilities in edge computing. Developers must effectively develop edge computing software to accommodate the rapid expansion of multimedia...

Correction to: Advanced series decomposition with a gated recurrent unit and graph convolutional neural network for non‑stationary data patterns

The original article was published in Journal of Cloud Computing 2024, 13:20

PMNet: a multi-branch and multi-scale semantic segmentation approach to water extraction from high-resolution remote sensing images with edge-cloud computing

In the field of remote sensing image interpretation, automatically extracting water body information from high-resolution images is a key task. However, facing the complex multi-scale features in high-resoluti...

Correction: FLM-ICR: a federated learning model for classification of internet of vehicle terminals using connection records

The original article was published in Journal of Cloud Computing 2024, 13:57

CG-PBFT: an efficient PBFT algorithm based on credit grouping

Because of its excellent properties of fault tolerance, efficiency and availability, the practical Byzantine fault tolerance (PBFT) algorithm has become the mainstream consensus algorithm in blockchain. Howeve...

Time-aware outlier detection in health physique monitoring in edge-aided sport education decision-makings

The increasing popularity of various intelligent sensor and mobile communication technologies has enabled quick health physique sensing, monitoring, collection and analyses of students, which significantly pro...

Accurate and fast congestion feedback in MEC-enabled RDMA datacenters

Mobile edge computing (MEC) is a novel computing paradigm that pushes computation and storage resources to the edge of the network. The interconnection of edge servers forms small-scale data centers, enabling ...

Optimus: association-based dynamic system call filtering for container attack surface reduction

While container adoption has witnessed significant growth in facilitating the operation of large-scale applications, this increased attention has also attracted adversaries who exploit numerous vulnerabilities...

A secure cross-domain authentication scheme based on threshold signature for MEC

The widespread adoption of fifth-generation mobile networks has spurred the rapid advancement of mobile edge computing (MEC). By decentralizing computing and storage resources to the network edge, MEC signific...

Edge intelligence empowered delivery route planning for handling changes in uncertain supply chain environment

Traditional delivery route planning faces challenges in reducing logistics costs and improving customer satisfaction with growing customer demand and complex road traffic, especially in uncertain supply chain ...

Security issues of news data dissemination in internet environment

With the rise of artificial intelligence and the development of social media, people's communication has become more and more convenient. However, in the Internet environment, the dissemination of untrue news da...

Short-term forecasting of surface solar incident radiation on edge intelligence based on AttUNet

Solar energy has emerged as a key industry in the field of renewable energy due to its universality, harmlessness, and sustainability. Accurate prediction of solar radiation is crucial for optimizing the econo...

Edge computing-oriented smart agricultural supply chain mechanism with auction and fuzzy neural networks

Powered by data-driven technologies, precision agriculture offers immense productivity and sustainability benefits. However, fragmentation across farmlands necessitates distributed transparent automation. We d...


Annual Journal Metrics

2022 Citation Impact

  • 4.0 - 2-year Impact Factor
  • 4.4 - 5-year Impact Factor
  • 1.711 - SNIP (Source Normalized Impact per Paper)
  • 0.976 - SJR (SCImago Journal Rank)

2023 Speed

  • 10 days from submission to first editorial decision for all manuscripts (median)
  • 116 days from submission to acceptance (median)

2023 Usage

  • 733,672 downloads
  • 49 Altmetric mentions

ISSN: 2192-113X (electronic)

Benefit from our free funding service


We offer a free open access support service to make it easier for you to discover and apply for article-processing charge (APC) funding. 


Cloud Computing: Recently Published Documents


Simulation and performance assessment of a modified throttled load balancing algorithm in cloud computing environment

Load balancing is crucial to ensure scalability and reliability, minimize response time and processing time, and maximize resource utilization in cloud computing. However, the load fluctuation accompanying the distribution of a huge number of requests among a set of virtual machines (VMs) is challenging and needs effective and practical load balancers. In this work, a two-listed throttled load balancer (TLT-LB) algorithm is proposed and simulated using the CloudAnalyst simulator. The TLT-LB algorithm is based on a modification of the conventional TLB algorithm to improve the distribution of tasks between different VMs. The performance of the TLT-LB algorithm compared to the TLB, round robin (RR), and active monitoring load balancer (AMLB) algorithms has been evaluated using two different configurations. Interestingly, the TLT-LB significantly balances the load between the VMs by reducing the loading gap between the heaviest and the lightest loaded VMs to 6.45%, compared to 68.55% for the TLB and AMLB algorithms. Furthermore, the TLT-LB algorithm considerably reduces the average response time and processing time compared to the TLB, RR, and AMLB algorithms.
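The abstract describes the algorithm only at a high level. As a rough illustration of the throttled family of balancers, the sketch below implements a two-list allocator in Python, assuming (our reading of "two listed") that the balancer keeps separate available and busy lists instead of scanning a single state table; the class and method names are ours, not the authors' code.

```python
# Minimal sketch of a two-list throttled VM allocator (illustrative only;
# reconstructed from the abstract, not the authors' implementation).
from collections import deque

class TwoListThrottledBalancer:
    def __init__(self, vm_ids):
        self.available = deque(vm_ids)  # VMs ready to accept a task
        self.busy = set()               # VMs currently executing a task

    def allocate(self):
        """Return a VM id for the next request, or None to throttle it."""
        if not self.available:
            return None                 # all VMs busy: caller queues the request
        vm = self.available.popleft()   # O(1) pick instead of a table scan
        self.busy.add(vm)
        return vm

    def release(self, vm):
        """Mark a VM as free once its task completes."""
        self.busy.discard(vm)
        self.available.append(vm)       # rejoin at the tail to spread load

balancer = TwoListThrottledBalancer(["vm-0", "vm-1", "vm-2"])
vm = balancer.allocate()                # -> "vm-0"
balancer.release(vm)
```

Rotating released VMs to the back of the queue is one simple way to narrow the loading gap between the most and least loaded VMs that the abstract highlights.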

An improved forensic-by-design framework for cloud computing with systems engineering standard compliance

Reliability of trust management systems in cloud computing.

Cloud computing is an innovation that delivers services such as software, platform, and infrastructure over the web. This computing structure is widespread and dynamic, operates on a pay-per-use model, and supports virtualization. Cloud computing is expanding quickly among consumers, and many organizations offer services through the web. It provides flexible, on-demand service but still carries various security risks. Its dynamic nature allows it to be customized to client and provider requirements, which is a notable benefit of cloud computing. On the other hand, this also creates trust issues and concerns such as security, privacy, identity, and legitimacy. Thus, a significant challenge in the cloud environment is selecting a suitable provider. For this, the trust mechanism plays a critical part, based on the evaluation of QoS and feedback ratings. Nonetheless, various challenges are still present in trust management systems for monitoring and evaluating QoS. This paper discusses the current obstacles in trust systems. The goal of this paper is to review the available trust models. Issues such as insufficient trust between the provider and the client, which have caused problems in data sharing, are also addressed. Furthermore, it lays out the limitations and possible improvements to help researchers who intend to investigate this topic.

Cloud Computing Adoption in the Construction Industry of Singapore: Drivers, Challenges, and Strategies

An extensive review of web-based multi-granularity service composition.

The paper reviews the efforts to compose SOAP, non-SOAP, and non-web services. Traditionally, efforts were made to compose SOAP services; however, these efforts did not include RESTful and non-web services. A SOAP service uses a structured exchange methodology for dealing with web services, while a non-SOAP service follows a different approach. The paper reviews invoking and composing a combination of SOAP, non-SOAP, and non-web services into a composite process to execute complex tasks on various devices. It also shows the systematic integration of SOAP, non-SOAP, and non-web services, describing the composition of heterogeneous services from the perspective of resource consumption. The paper further compares and reviews different layout models for the discovery, selection, and composition of services in cloud computing. Recent research trends in service composition are identified, and research on microservices is evaluated and presented in tables and graphs.

Integrated Blockchain and Cloud Computing Systems: A Systematic Survey, Solutions, and Challenges

Cloud computing is a network model of on-demand access for sharing configurable computing resource pools. Compared with conventional service architectures, cloud computing introduces new security challenges in secure service management and control, privacy protection, data integrity protection in distributed databases, data backup, and synchronization. Blockchain can be leveraged to address these challenges, partly due to underlying characteristics such as transparency, traceability, decentralization, security, immutability, and automation. We present a comprehensive survey of how blockchain is applied to provide security services in the cloud computing model, and we analyze the research trends of blockchain-related techniques in current cloud computing models. During the review, we also briefly investigate how cloud computing can affect blockchain, especially the performance improvements that cloud computing can provide for the blockchain. Our contributions include the following: (i) summarizing the possible architectures and models of the integration of blockchain and cloud computing and the roles of cloud computing in blockchain; (ii) classifying and discussing recent, relevant works based on different blockchain-based security services in the cloud computing model; (iii) briefly investigating what improvements cloud computing can provide for the blockchain; (iv) introducing the current development status of the industry/major cloud providers in the direction of combining cloud and blockchain; (v) analyzing the main barriers and challenges of integrated blockchain and cloud computing systems; and (vi) providing recommendations for future research and improvement on the integration of blockchain and cloud systems.

Cloud Computing and Undergraduate Researches in Universities in Enugu State: Implication for Skills Demand

Cloud building block chip for creating FPGA and ASIC clouds.

Hardware-accelerated cloud computing systems based on FPGA chips (FPGA clouds) or ASIC chips (ASIC clouds) have emerged as a new technology trend for power-efficient acceleration of various software applications. However, the operating systems and hypervisors currently used in cloud computing will lead to power, performance, and scalability problems in an exascale cloud computing environment. Consequently, the present study proposes a parallel hardware hypervisor system that is implemented entirely in special-purpose hardware, and that virtualizes application-specific multi-chip supercomputers, to enable virtual supercomputers to share available FPGA and ASIC resources in a cloud system. In addition to the virtualization of multi-chip supercomputers, the system's other unique features include simultaneous migration of multiple communicating hardware tasks, and on-demand increase or decrease of hardware resources allocated to a virtual supercomputer. By partitioning the flat hardware design of the proposed hypervisor system into multiple partitions and applying the chip unioning technique to its partitions, the present study introduces a cloud building block chip that can be used to create FPGA or ASIC clouds as well. Single-chip and multi-chip verification studies have been done to verify the functional correctness of the hypervisor system, which consumes only a fraction (10%) of the hardware resources.

Study On Social Network Recommendation Service Method Based On Mobile Cloud Computing

Cloud-based network virtualization in IoT with OpenStack.

In Cloud computing deployments, specifically in the Infrastructure-as-a-Service (IaaS) model, networking is one of the core enabling facilities provided for the users. The IaaS approach ensures significant flexibility and manageability, since the networking resources and topologies are entirely under users' control. In this context, considerable efforts have been devoted to promoting the Cloud paradigm as a suitable solution for managing IoT environments. Deep and genuine integration between the two ecosystems, Cloud and IoT, may only be attainable at the IaaS level. In light of extending the IoT domain's capabilities with Cloud-based mechanisms akin to the IaaS Cloud model, network virtualization is a fundamental enabler of infrastructure-oriented IoT deployments. Indeed, an IoT deployment without networking resilience and adaptability is unsuitable to meet user-level demands and services' requirements. Such a limitation restricts IoT-based services to very specific and statically defined scenarios, thus leading to limited plurality and diversity of use cases. This article presents a Cloud-based approach for network virtualization in an IoT context using the de facto standard IaaS middleware, OpenStack, and its networking subsystem, Neutron. OpenStack is extended to enable the instantiation of virtual/overlay networks between Cloud-based instances (e.g., virtual machines, containers, and bare metal servers) and/or geographically distributed IoT nodes deployed at the network edge.
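To ground the discussion, here is a minimal sketch of how an overlay network like the ones described above can be created with the openstacksdk Python client against Neutron; the cloud entry name, resource names, and CIDR are illustrative assumptions, not values from the article.

```python
# Minimal sketch: create an overlay network and subnet for IoT instances
# via openstacksdk (Neutron). Names and CIDR are illustrative assumptions;
# credentials are read from a clouds.yaml entry called "iot-cloud".
import openstack

conn = openstack.connect(cloud="iot-cloud")  # hypothetical clouds.yaml entry

# Virtual network shared by Cloud instances and edge IoT nodes.
net = conn.network.create_network(name="iot-overlay")

# IPv4 subnet on the overlay; the CIDR is an arbitrary example.
subnet = conn.network.create_subnet(
    network_id=net.id,
    name="iot-overlay-v4",
    ip_version=4,
    cidr="10.20.0.0/24",
)

print(net.id, subnet.cidr)
```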


Open access. Published: 26 February 2024

A Framework for the Interoperability of Cloud Platforms: Towards FAIR Data in SAFE Environments

  • Robert L. Grossman   ORCID: orcid.org/0000-0003-3741-5739 1 ,
  • Rebecca R. Boyles 2 ,
  • Brandi N. Davis-Dusenbery 3 ,
  • Amanda Haddock 4 ,
  • Allison P. Heath 5 ,
  • Brian D. O’Connor 6 ,
  • Adam C. Resnick 5 ,
  • Deanne M. Taylor   ORCID: orcid.org/0000-0002-3302-4610 5 , 7 &
  • Stan Ahalt   ORCID: orcid.org/0000-0002-8395-1279 8  

Scientific Data, volume 11, Article number: 241 (2024)


Subjects: Computational platforms and environments

As the number of cloud platforms supporting scientific research grows, there is an increasing need to support interoperability between two or more cloud platforms. A well-accepted core concept is to make data in cloud platforms Findable, Accessible, Interoperable and Reusable (FAIR). We introduce a companion concept that applies to cloud-based computing environments that we call a Secure and Authorized FAIR Environment (SAFE). SAFE environments require data and platform governance structures and are designed to support the interoperability of sensitive or controlled access data, such as biomedical data. A SAFE environment is a cloud platform that has been approved through a defined data and platform governance process as authorized to hold data from another cloud platform and exposes appropriate APIs for the two platforms to interoperate.

As the number of cloud platforms supporting scientific research grows 1 , there is an increasing need to support cross-platform interoperability. By a cloud platform, we mean a software platform in a public or private cloud 2 for managing and analyzing data and other authorized functions. With interoperability between cloud platforms, data does not have to be replicated in multiple cloud platforms but can be managed by one cloud platform and analyzed by researchers in another cloud platform. A common use case is to use specialized tools in another cloud platform that are unavailable in the cloud platform hosting the data. Interoperability also enables cross-platform functionality, allowing researchers analyzing data in one cloud platform to obtain the necessary amount of data required to power a statistical analysis, to validate an analysis using data from another cloud platform, or to bring together multiple data types for an integrated analysis when the data is distributed across two or more cloud platforms. In this paper, we are especially concerned with frameworks that are designed to support the interoperability of sensitive or controlled access data, such as biomedical data or qualitative research data.

There have been several attempts to provide frameworks for interoperating cloud platforms for biomedical data, including those by the GA4GH organization 3 and by the European Open Science Cloud (EOSC) Interoperability Task Force of the FAIR Working Group 4. A key idea in these frameworks is to make data in cloud platforms findable, accessible, interoperable and reusable (FAIR) 5.

The authors have developed several cloud platforms operated by different organizations and were part of a working group, one of whose goals was to increase the interoperability between these cloud platforms. The challenge is that even when a dataset is FAIR and in a cloud platform (referred to here as Cloud Platform A), in general the governance structure put in place by the organization sponsoring Cloud Platform A (called the Project Sponsor below) requires that sensitive data remain in the platform and only be accessed by users within the platform. Therefore, even if a user was authorized to analyze the data, there was no simple way for the user to analyze the data in any cloud platform (referred to here as Cloud Platform B), except for the single cloud platform operated by the organization (Cloud Platform A).

There are several reasons for this lack of interoperability between cloud platforms hosting sensitive data: First, as just mentioned, for many cloud platforms, it is against policy to remove data from the cloud platform; instead, data must be analyzed within the cloud platform.

Second, in some cases, to manage the security and compliance of the data, only a single cloud platform has the right to distribute controlled access data; other cloud platforms may contain a copy of the data but by policy cannot distribute it.

Third, a typical clause in a data access agreement requires that if the user elects not to use Cloud Platform A, the user’s organization is responsible for assessing and attesting to the security and compliance of Cloud Platform B. This can be difficult and time consuming unless there is a pre-existing relationship.

Fourth, once a Sponsor has approved a single cloud platform as authorized to host data and to analyze the hosted data, there may be a perception of increased risk to the Sponsor in allowing other third party platforms to be used to host or to analyze the data. Because of this increased risk, there has been limited interoperability of cloud platforms for controlled access data.

The consensus from the working group was that interoperability of data and an acceleration of research outcomes could be achieved if standard interoperating principles and interfaces could describe which platforms had the right to distribute a dataset and which cloud platforms could be used to analyze data.

In this note, we introduce a companion concept to FAIR that applies to cloud-based computing environments that we call a Secure and Authorized FAIR Environment (SAFE). The goal of the SAFE framework is to address the four issues described above that today limit the interoperability between cloud platforms. The cloud-based framework consisting of FAIR data in SAFE environments is intended to apply to research data that has restrictions on its access, its distribution, or both. Some examples are: biomedical data 3,6, including EHR data, clinical/phenotype data, genomics data, and imaging data; social science data 7; and administrative data 8. We emphasize that the environment itself is not FAIR in the sense of 5, but rather that a SAFE environment contains FAIR data and is designed to be part of a framework to support the interoperability of FAIR data between two or more data platforms.

Also, SAFE cloud platforms are designed to support platform governance decisions about whether data in one cloud platform may be linked or transferred to another cloud platform, either for direct use by researchers or for redistribution. As we will argue below, SAFE is designed to support decisions between two or more cloud platforms to interoperate, in the sense that data may be moved between them; it is neither designed nor intended to be a security or compliance level describing a single cloud platform.

The proposed SAFE framework provides a way for a Sponsor to “extend its boundary” to selected third party platforms that can be used to analyze the data by authorized users. In this way, researchers can use the platform and tools that they are most comfortable with.

In order to discuss the complexities of an interoperability framework across cloud-based resources, in the next section we first define some important concepts from data and platform governance.

Distinguishing Data and Platform Governance

We assume that data is generated by research projects and that there is an organization that is responsible for the project. We call this organization the Project Sponsor. This can be any type of organization, including a government agency, an academic research center, a not-for-profit organization, or a commercial organization.

In the framework that we are proposing here, the Project Sponsor sets up and operates frameworks for (1) data governance and (2) platform governance. The Project Sponsor is ultimately responsible for the security and compliance of the data and of the cloud platform. Data governance includes: approving datasets to be distributed by cloud platforms, authorizing users to access data, and related activities. Platform governance includes: approving cloud platforms as having the right to distribute datasets to other platforms and to users and approving cloud platforms as authorized environments so that the cloud platforms can be used by users to access, analyze, and explore datasets.

By controlled access data, we mean data that is considered sensitive enough that agreements for the acceptable use of the data must be signed: one between the organization providing the data (the Data Contributor) and the Project Sponsor, and another between the researchers accessing the data (which we call Users in the framework) and the Project Sponsor. Controlled access data arises, for example, when research participants contribute data for research purposes through a consent process, and a researcher signs an agreement to follow all the terms and conditions required by the consent agreements of the research participants or by an Institutional Review Board (IRB) that approves an exemption so that consents are not required.

Commonly used terms that are needed to describe SAFE are contained in Table 1. Table 2 describes the roles and responsibilities of the Project Sponsor, Platform Operator, and User.

As is usual, we use the term authorized user for someone who has applied for and been approved for access to controlled-access data. See Table 1 for a summary of definitions used in this paper.

One of the distinguishing features of our interoperability framework is that we formalize the concept of an authorized environment. An authorized environment is a cloud platform workspace or computing/analysis environment that is approved for the use or analysis of controlled access data.

Using the concepts of authorized user and authorized environment, we provide a framework enabling the interoperability between two or more cloud platforms.

SAFE Environments

Below we describe some suggested processes for authorizing environments, including having their security and compliance reviewed by the appropriate official or committee determined by the platform governance process. We also argue that the environments should have APIs so that they are findable, accessible and interoperable, enabling other cloud platforms to interoperate with them. As mentioned above, we use the acronym SAFE for Secure and Authorized FAIR Environments to describe these types of environments. In other words, a SAFE environment is a cloud platform that has been approved through a platform governance process as an authorized environment and exposes an API enabling other cloud platforms to interact with it (Fig. 1).

Figure 1: An overview of supporting FAIR data in SAFE environments.

In this paper, we make the case that SAFE environments are a natural complement to FAIR data, and that establishing a trust relationship between a cloud platform with FAIR data and a cloud platform that is a SAFE environment for analyzing data is a good basis for interoperability. Examples of the functionality to be exposed by the API and proposed identifiers are discussed below. Importantly, our focus is to provide a framework for attestation and approvals to support interoperability. Definition of the exact requirements for approvals is based on the needs of a particular project sponsor and is outside the scope of this manuscript.

Of course, a cloud platform can include both FAIR data and a SAFE environment for analyzing data. The issue of interoperability between cloud platforms arises when a researcher using a cloud platform that is a SAFE environment for analyzing data needs to access data from another cloud platform that contains data of interest.

We emphasize that the framework applies to all types of controlled-access data (e.g., clinical, genomic, imaging, environmental, etc.) and that decisions about authorized users and authorized platforms depend upon the sensitivity of the data, with more conditions for data access and use as the sensitivity of the data increases.

The SAFE framework that we are implementing uses the following identifiers:

SAFE assumes that cloud platforms have a globally unique identifier (GUID) identifying them, which we call an authorized platform identifier (APID).

SAFE assumes that cloud platforms form networks consisting of two or more cloud platforms, which we call authorized platform networks (APNs). Authorized platform networks have a globally unique identifier, which we call an authorized platform network identifier (APNI). As an example, the cloud platforms in an authorized platform network can sign a common set of agreements or otherwise agree to interoperate. A particular cloud platform can interoperate with all or selected cloud platforms in an authorized platform network.

SAFE assumes that geographic regions are identified by a globally unique identifier, which we call an Authorized Region ID (ARID). For example, the entire world may be an authorized region, or a single country may be the only authorized region. SAFE assumes that datasets that limit their distribution and analysis to specified regions identify these regions in their metadata.

To implement SAFE, we propose that a cloud environment support an API that exposes metadata with the following information:

Authorized Platform Identifier (APID)

A list of the Authorized Platform Network Identifiers (APNIs) that it belongs to.

A particular authorized platform network must also agree to a protocol for securely exchanging the APID and the list of APNIs that a platform belongs to, such as the transport layer security (TLS) protocol.

In addition, cloud platforms that host data that can be accessed and analyzed in other cloud platforms should associate with each dataset metadata that specifies: a) whether the data can be removed from the platform (i.e., does the platform have the right to distribute the data); b) a list of authorized platform networks that have been approved as authorized environments to access and analyze the data; and c) an optional list of authorized region IDs (ARIDs) describing any regional restrictions on where the data may be accessed and analyzed.
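To make the proposal concrete, the sketch below shows one possible shape for the platform and dataset metadata such an API could expose. The field names and identifier values are our assumptions for illustration; the paper specifies the identifiers (APID, APNI, ARID) but not a concrete wire format.

```python
# One possible shape for SAFE metadata (field names and values are
# illustrative assumptions; the paper does not define a concrete schema).
from typing import Optional, TypedDict

class PlatformMetadata(TypedDict):
    apid: str                    # authorized platform identifier (GUID)
    apnis: list[str]             # authorized platform networks it belongs to

class DatasetMetadata(TypedDict):
    dataset_id: str
    right_to_distribute: bool    # may the hosting platform release the data?
    authorized_apnis: list[str]  # APNs approved as authorized environments
    authorized_arids: Optional[list[str]]  # regional limits; None = unrestricted

platform_a: PlatformMetadata = {
    "apid": "apid-8f2c-example",          # hypothetical GUID
    "apnis": ["apn:example-network"],
}

dataset: DatasetMetadata = {
    "dataset_id": "dataset-0001-example",
    "right_to_distribute": True,
    "authorized_apnis": ["apn:example-network"],
    "authorized_arids": ["arid:example-region"],
}
```

A platform would serve its own PlatformMetadata over a mutually authenticated channel such as TLS, matching the secure-exchange requirement above.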

Platform Governance

Examples of platform governance frameworks

An example of a process for authorization of an environment is provided by the process used by the NIH Data Access Committees (DACs) through the dbGaP system 9 for sharing genomics data 10. Currently, if an NIH DAC approves a user's access to data, and if the user specifies in the data access request (DAR) application that a cloud platform will be used for analysis, then the user's designated IT Director takes responsibility for the cloud platform as an authorized environment for the user's analysis of controlled access data, and a designated official at the user's institution (the Signing Official) takes the overarching responsibility on behalf of the researcher's institution.

As another example, the platform governance process may follow the “NIST 800-53 Security and Privacy Controls for Information Systems and Organizations” framework developed by the US National Institute for Standards and Technology (NIST) 11 . This framework has policies, procedures, and controls at three Levels - Low, Moderate and High, and each organization designates a person that can approve an environment by issuing what is called an Authority to Operate (ATO). More specifically, in this example, the platform governance process may require the following to approve a cloud platform as an authorized environment for hosting controlled access data: (1) a potential cloud platform implement the policies, procedures and controls specified by NIST SP 800-53 at the Moderate level; (2) a potential cloud platform have an independent assessment by a third party to ensure that the policies, controls and procedures are appropriately implemented and documented; (3) an appropriate official or committee evaluate the assessment, and if acceptable, approves the environment as an authorized environment by issuing an Authority to Operate (ATO) or following another agreed to process; (4) yearly penetration tests by an independent third party, which are reviewed by the appropriate committee or official.

Many US government agencies follow NIST SP 800-53, and a designated government official issues an Authority to Operate (ATO) when appropriate after the evaluation of a system 11 . In the example above, we are using the term “authority to operate” to refer to a more general process in which any organization decides to evaluate a cloud platform using any security and compliance framework and has completed all the steps necessary so that the cloud platform can be used. In the example, an organization, which may or may not be a government organization, uses the NIST SP 800-53 security and compliance framework and designates an individual within the organization with the role and responsibility to check that (1), (2) and (4) have been accomplished and issues an ATO when this is the case.

The right to distribute controlled access data

In general, when a user or a cloud platform is granted access to controlled access data, the user or platform does not have the right to redistribute the data to other users, even if the other user has signed the appropriate Data Access Agreements. Instead, to ensure there is the necessary security and compliance in place, any user accessing data as an authorized user must access the data from a platform approved for this purpose. We refer to platforms with the ability to share controlled access data in this way as having the right to distribute the authorized data.

One of the core ideas of SAFE is that data which has been approved for hosting in a cloud platform can be accessed and transferred to another cloud platform when the first cloud platform has the right to distribute the data and the second cloud platform is recognized as an authorized environment for the data, following an approved process such as the one described in the next section. There remains the possibility that the cloud platform requesting access to the data is in fact an imposter and not the authorized environment it appears to be. For this reason, as part of SAFE, we recommend that the cloud platform with the right to distribute data should verify through a chain of trust that the requesting platform is indeed the intended authorized environment.
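The transfer decision implied by this paragraph can be written as a small check over the metadata sketched earlier; in the sketch below the chain-of-trust verification is abstracted into a single peer_identity_verified flag, and all names and values are hypothetical.

```python
# Sketch of the SAFE transfer decision (illustrative; metadata dicts are
# shaped like the earlier sketch, and identity checking is abstracted away).
from typing import Optional

def may_transfer(dataset: dict,
                 destination: dict,
                 peer_identity_verified: bool,
                 destination_arid: Optional[str] = None) -> bool:
    if not peer_identity_verified:
        return False  # possible imposter platform: refuse outright
    if not dataset["right_to_distribute"]:
        return False  # this platform may host but not release the data
    # The destination must belong to an APN approved as an authorized
    # environment for this dataset.
    if not set(destination["apnis"]) & set(dataset["authorized_apnis"]):
        return False
    # Enforce optional regional restrictions (ARIDs), if declared.
    arids = dataset.get("authorized_arids")
    if arids is not None and destination_arid not in arids:
        return False
    return True

# Allowed: verified peer, matching APN, approved region.
print(may_transfer(
    dataset={"right_to_distribute": True,
             "authorized_apnis": ["apn:example-network"],
             "authorized_arids": ["arid:example-region"]},
    destination={"apid": "apid-b-example", "apnis": ["apn:example-network"]},
    peer_identity_verified=True,
    destination_arid="arid:example-region",
))  # -> True
```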

Basis for approving authorized environments

The guiding principle of SAFE is that research outcomes are accelerated by supporting interoperability of data across authorized environments. While the specific requirements may vary by project and project sponsor, in order to align with this principle, it is critical that Project Sponsors define requirements transparently and support interoperability when the requirements are met.

Above we provided examples of approaches and requirements project sponsors may use in approving an Authorized Environment. As mentioned above, NIST SP 800-53 provides a basis for authorizing an environment, but there are many frameworks for evaluating the security and compliance of a system that may be used. As an example, the organization evaluating the cloud platform may choose to use a framework such as NIST SP 800-171 12 , or may choose another process for approving a cloud platform as an authorized environment rather than issuing an ATO.

For example, both the Genomic Data Commons 6 and the AnVIL system 13 follow NIST SP 800-53 at the Moderate Level and the four steps described above. The authorizing official for the Genomic Data Commons is a government official at the US National Cancer Institute, while the authorizing official for AnVIL is an organizational official associated with the Platform Operator.

Two or more cloud platforms can interoperate when both the Sponsors and Operators each agree to: (1) use the same framework and process for evaluating cloud platforms as authorized environments; (2) each authorize one or more cloud platforms as authorized environments for particular datasets; (3) each agree to a common protocol or process for determining when a given cloud platform is following (1) and (2). Sometimes, this situation is described as the platforms having a trust relationship between them.

Basis for approving the right to distribute datasets

For each dataset, a data governance responsibility is to determine the right of a cloud-based data repository to distribute data to an authorized user in an authorized environment. To reduce the risk of privacy and confidentiality breaches, the data governance process may choose to limit the number of data repositories that can distribute a particular controlled access dataset and to impose additional security and compliance requirements on those cloud-based data repositories that have the right to distribute particularly sensitive controlled-access datasets. These risks, of course, must be balanced with the imperative to accelerate research and improve patient outcomes, which underlies the motivations of many study participants.

Interoperability

SAFE is focused on the specific aspect of interoperability of whether data hosted in one cloud platform can be analyzed in another cloud platform.

With the concepts of an authorized user, an authorized environment, and the right to distribute, interoperability is achieved when two or more cloud platforms have the right to distribute data to an authorized user in a cloud-based authorized environment.

This suggests a general principle for interoperability: the data governance process for a dataset should authorize users, the platform governance process for a dataset should authorize cloud platform environments, and two or more cloud platforms can interoperate by trusting these authorizations .

Figure  2 summarizes some of the key decisions enabling two cloud platforms to interoperate using the SAFE framework.

Figure 2: Some of the key decisions for interoperating two cloud platforms using the SAFE framework.

Towards FAIR Data in SAFE Environments

Today there are a growing number of cloud platforms that hold biomedical data of interest to the research community, a growing number of cloud-based analysis tools for analyzing biomedical data, and a growing challenge for researchers to access the data they need, since often the analysis of data takes place in a different cloud platform than the cloud platform that hosts the data of interest.

We have presented the concept of cloud-based authorized environments that are called SAFE environments, which are secure and authorized environments that are appropriate for the analysis of sensitive biomedical data. The role of platform governance is to identify the properties required for a cloud platform to be an authorized environment for a particular dataset and to approve a cloud based platform that holds controlled access data to distribute the data to specific authorized platforms.

By standardizing the properties to be a SAFE environment and agreeing to the principle that the data governance process for a dataset should authorize users and the platform governance process should authorize cloud platform environments, then all that is required for two or more cloud platforms to interoperate is for the cloud platforms to trust these authorizations. We can shorten this principle to: “authorize the users, authorize the cloud platforms, and trust the authorizations.” This is the core basis for interoperability in the SAFE framework. See Table  3 for a summary.

This principle came out of the NIH NCPI Community and Governance Working Group and is the basis for the interoperability of the data platforms in this group. We are currently implementing the APID, APNI and ARID identifiers as described above, as well as the dataset metadata describing whether a dataset can be redistributed or transferred to other data platforms for analysis.

References

1. Navale, V. & Bourne, P. E. Cloud computing applications for biomedical science: A perspective. PLOS Comput. Biol. 14(6), e1006144, https://doi.org/10.1371/journal.pcbi.1006144 (2018).

2. Mell, P. M. & Grance, T. The NIST definition of cloud computing. National Institute of Standards and Technology, Gaithersburg, MD, NIST SP 800-145, https://doi.org/10.6028/NIST.SP.800-145 (2011).

3. Rehm, H. L. et al. GA4GH: International policies and standards for data sharing across genomic research and healthcare. Cell Genomics 1(2), 100029, https://doi.org/10.1016/j.xgen.2021.100029 (2021).

4. Achieving interoperability in EOSC: The Interoperability Framework | EOSCSecretariat. https://www.eoscsecretariat.eu/news-opinion/achieving-interoperability-eosc-interoperability-framework (accessed 30 July 2021).

5. Wilkinson, M. D. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci. Data 3(1), 160018, https://doi.org/10.1038/sdata.2016.18 (2016).

6. Heath, A. P. et al. The NCI Genomic Data Commons. Nat. Genet., 1–6, https://doi.org/10.1038/s41588-021-00791-5 (2021).

7. ICPSR Data Excellence Research Impact. https://www.icpsr.umich.edu/web/pages/ (accessed 20 November 2023).

8. Lane, J. Building an Infrastructure to Support the Use of Government Administrative Data for Program Performance and Social Science Research. Ann. Am. Acad. Pol. Soc. Sci. 675(1), 240–252, https://doi.org/10.1177/0002716217746652 (2018).

9. Mailman, M. D. et al. The NCBI dbGaP database of genotypes and phenotypes. Nat. Genet. 39(10), 1181–1186 (2007).

10. Paltoo, D. N. et al. Data use under the NIH GWAS Data Sharing Policy and future directions. Nat. Genet. 46(9), 934–938, https://doi.org/10.1038/ng.3062 (2014).

11. Dempsey, K., Witte, G. & Rike, D. Summary of NIST SP 800-53 Revision 4, Security and Privacy Controls for Federal Information Systems and Organizations. National Institute of Standards and Technology, Gaithersburg, MD, NIST CSWP 02192014, https://doi.org/10.6028/NIST.CSWP.02192014 (2014).

12. Ross, R., Pillitteri, V., Dempsey, K., Riddle, M. & Guissanie, G. Protecting Controlled Unclassified Information in Nonfederal Systems and Organizations. National Institute of Standards and Technology, NIST SP 800-171 Rev. 2, https://doi.org/10.6028/NIST.SP.800-171r2 (2020).

13. Schatz, M. C. et al. Inverting the model of genomics data sharing with the NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space (AnVIL). Preprint, https://doi.org/10.1101/2021.04.22.436044 (2021).


Acknowledgements

This document captures discussions of the NIH Cloud-Based Platform Interoperability (NCPI) Community/Governance Working Group that have occurred over the past 24 months, and we want to acknowledge the contributions of this working group. This working group included personnel from federal agencies, health systems, industry, universities, and patient advocacy groups. However, this document does not represent any official decisions or endorsement of potential policy changes and is not an official work product of the NCPI Working Group. Rather, it is a summary of some of the working group discussions and is an opinion of the authors. Research reported in this publication was supported in part by the following grants and contracts: the NIH Common Fund under Award Number U2CHL138346, which is administered by the National Heart, Lung, and Blood Institute of the National Institutes of Health; the National Heart, Lung, and Blood Institute, National Institutes of Health, Department of Health and Human Services under the Agreement No. OT3 HL142478-01 and OT3 HL147154-01S1; National Cancer Institute, National Institutes of Health, Department of Health and Human Services under Contract No. HHSN261201400008C; and ID/IQ Agreement No. 17X146 under Contract No. HHSN261201500003I. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Author information

Authors and Affiliations

Center for Translational Data Science, University of Chicago, Chicago, IL, USA

Robert L. Grossman

RTI International, Research Triangle Park, NC, USA

Rebecca R. Boyles

Velsera, Charlestown, MA, USA

Brandi N. Davis-Dusenbery

Dragon Master Initiative, Kechi, KS, USA

Amanda Haddock

Children’s Hospital of Philadelphia, Philadelphia, PA, USA

Allison P. Heath, Adam C. Resnick & Deanne M. Taylor

Nimbus Informatics, Carrboro, NC, USA

Brian D. O’Connor

University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA

Deanne M. Taylor

University of North Carolina at Chapel Hill, Chapel Hill, NC, USA

Stan Ahalt

Contributions

All the authors contributed to the drafting and review of the manuscript.

Corresponding author

Correspondence to Robert L. Grossman.

Ethics declarations

Competing interests.

One of the authors (BND-D) is an employee of a for-profit company (Velsera).

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Manuscript with marked changes.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Cite this article.

Grossman, R.L., Boyles, R.R., Davis-Dusenbery, B.N. et al. A Framework for the Interoperability of Cloud Platforms: Towards FAIR Data in SAFE Environments. Sci Data 11, 241 (2024). https://doi.org/10.1038/s41597-024-03041-5

Download citation

Received : 10 August 2023

Accepted : 03 February 2024

Published : 26 February 2024

DOI : https://doi.org/10.1038/s41597-024-03041-5


Systematic Literature Review of Cloud Computing Research Between 2010 and 2023

  • Conference paper
  • First Online: 21 May 2024


  • Shailaja Jha
  • Devina Chaturvedi (ORCID: orcid.org/0009-0004-1242-2099)

Part of the book series: Lecture Notes in Business Information Processing ((LNBIP,volume 508))

Included in the following conference series:

  • Workshop on e-Business


We present a meta-analysis of cloud computing research in information systems. The study includes 152 refereed journal articles published between January 2010 and June 2023. We take stock of the literature and the associated research themes, research frameworks, the employed research methodology, and the geographical distribution of the articles. This review provides holistic insights into trends in cloud computing research based on themes, frameworks, methodology, geographical focus, and future research directions. The results indicate that the extant literature tends to skew toward themes related to business issues, an indicator of the maturing and widespread use of cloud computing. This trend is evidenced in the more recent articles published between 2016 and 2023.


The conference proceedings were primarily used to assess the year-on-year numerical trends in publications, and they have not been used for detailed analysis.


Author information

Authors and affiliations

SP Jain Institute of Management and Research, Mumbai, India

Shailaja Jha

Indian School of Business, Hyderabad, India

Devina Chaturvedi


Corresponding author

Correspondence to Devina Chaturvedi .

Editor information

Editors and affiliations

Indian School of Business, Hyderabad, Telangana, India

Abhishek Kathuria

Chinese University of Hong Kong, Sha Tin District, Hong Kong

Prasanna P. Karhade

University of North Carolina at Charlotte, Charlotte, NC, USA

Indian School of Business, Hyderabad, Telangana, India


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Jha, S., Chaturvedi, D. (2024). Systematic Literature Review of Cloud Computing Research Between 2010 and 2023. In: Kathuria, A., Karhade, P.P., Zhao, K., Chaturvedi, D. (eds) Digital Transformation in the Viral Age. WeB 2022. Lecture Notes in Business Information Processing, vol 508. Springer, Cham. https://doi.org/10.1007/978-3-031-60003-6_5


DOI : https://doi.org/10.1007/978-3-031-60003-6_5

Published : 21 May 2024

Publisher Name : Springer, Cham

Print ISBN : 978-3-031-60002-9

Online ISBN : 978-3-031-60003-6

eBook Packages : Computer Science, Computer Science (R0)



Updated: 14 February 2024 | Contributors: Stephanie Susnjara, Ian Smalley

Cloud computing is the on-demand access of computing resources—physical servers or virtual servers, data storage, networking capabilities, application development tools, software, AI-powered analytic tools and more—over the internet with pay-per-use pricing.

The cloud computing model offers customers greater flexibility and scalability compared to traditional on-premises infrastructure.

Cloud computing plays a pivotal role in our everyday lives, whether accessing a cloud application like Google Gmail, streaming a movie on Netflix or playing a cloud-hosted video game.

Cloud computing has also become indispensable in business settings, from small startups to global enterprises. Its many business applications include enabling remote work by making data and applications accessible from anywhere, creating the framework for seamless omnichannel customer engagement and providing the vast computing power and other resources needed to take advantage of cutting-edge technologies like generative AI and quantum computing . 

A cloud services provider (CSP) manages cloud-based technology services hosted at a remote data center and typically makes these resources available for a pay-as-you-go or monthly subscription fee.


Compared to traditional on-premises IT, in which a company owns and maintains physical data centers and servers to access computing power, data storage and other resources, cloud computing offers many benefits, including the following (depending on the cloud services you select):

Cloud computing lets you offload some or all of the expense and effort of purchasing, installing, configuring and managing mainframe computers and other on-premises infrastructure. You pay only for cloud-based infrastructure and other computing resources as you use them. 

With cloud computing, your organization can use enterprise applications in minutes instead of waiting weeks or months for IT to respond to a request, purchase and configure supporting hardware and install software. This capability empowers users, particularly DevOps and other development teams, to leverage cloud-based software and supporting infrastructure.

Cloud computing provides elasticity and self-service provisioning, so instead of purchasing excess capacity that sits unused during slow periods, you can scale capacity up and down in response to spikes and dips in traffic. You can also use your cloud provider’s global network to spread your applications closer to users worldwide.

Cloud computing enables organizations to use various technologies and the most up-to-date innovations to gain a competitive edge. For instance, in retail, banking and other customer-facing industries, generative AI-powered virtual assistants deployed over the cloud can deliver better customer response time and free up teams to focus on higher-level work. In manufacturing, teams can collaborate and use cloud-based software to monitor real-time data across logistics and supply chain processes.

The origins of cloud computing technology go back to the early 1960s when  Dr. Joseph Carl Robnett Licklider  (link resides outside ibm.com), an American computer scientist and psychologist known as the "father of cloud computing", introduced the earliest ideas of global networking in a series of memos discussing an Intergalactic Computer Network. However, it wasn’t until the early 2000s that modern cloud infrastructure for business emerged.

In 2002, Amazon Web Services began offering cloud-based storage and computing services. In 2006, it introduced Elastic Compute Cloud (EC2), an offering that allowed users to rent virtual computers to run their applications. That same year, Google introduced the Google Apps suite (now called Google Workspace), a collection of SaaS productivity applications. In 2011, Microsoft launched Office 365, its subscription-based SaaS productivity suite. Today, Gartner predicts worldwide end-user spending on the public cloud will total USD 679 billion and is projected to exceed USD 1 trillion in 2027 (link resides outside ibm.com).

The following are a few of the most integral components of today’s modern cloud computing architecture.

CSPs own and operate remote data centers that house physical or bare metal servers , cloud storage systems and other physical hardware that create the underlying infrastructure and provide the physical foundation for cloud computing.

In cloud computing, high-speed networking connections are crucial. Typically, an internet connection known as a wide-area network (WAN) connects front-end users (for example, client-side interface made visible through web-enabled devices) with back-end functions (for example, data centers and cloud-based applications and services). Other advanced cloud computing networking technologies, including load balancers , content delivery networks (CDNs) and software-defined networking (SDN) , are also incorporated to ensure data flows quickly, easily and securely between front-end users and back-end resources. 

Cloud computing relies heavily on the virtualization of IT infrastructure —servers, operating system software, networking and other infrastructure that’s abstracted using special software so that it can be pooled and divided irrespective of physical hardware boundaries. For example, a single hardware server can be divided into multiple virtual servers . Virtualization enables cloud providers to make maximum use of their data center resources. 

IaaS (Infrastructure-as-a-Service), PaaS (Platform-as-a-Service), SaaS (Software-as-a-Service) and serverless computing are the most common models of cloud services, and it’s not uncommon for an organization to use some combination of all four.

IaaS (Infrastructure-as-a-Service) provides on-demand access to fundamental computing resources—physical and virtual servers, networking and storage—over the internet on a pay-as-you-go basis. IaaS enables end users to scale and shrink resources on an as-needed basis, reducing the need for high up-front capital expenditures or unnecessary on-premises or "owned" infrastructure and for overbuying resources to accommodate periodic spikes in usage. 
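To make the IaaS model concrete, the sketch below provisions and then releases a single virtual server through one typical IaaS API, AWS EC2 via the boto3 SDK. The region, AMI ID and instance type are placeholder assumptions for illustration, not values taken from this article.

```python
import boto3  # AWS SDK for Python

# Hypothetical region; substitute one valid for your account.
ec2 = boto3.client("ec2", region_name="us-east-1")

# Launch one small pay-as-you-go virtual server.
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder machine image ID
    InstanceType="t3.micro",
    MinCount=1,
    MaxCount=1,
)
instance_id = response["Instances"][0]["InstanceId"]
print(f"Launched instance {instance_id}")

# Terminating the instance when it is no longer needed is what keeps
# IaaS spending proportional to actual usage.
ec2.terminate_instances(InstanceIds=[instance_id])
```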

According to a  Business Research Company report  (link resides outside ibm.com), the IaaS market is predicted to grow rapidly in the next few years, growing to $212.34 billion in 2028 at a compound annual growth rate (CAGR) of 14.2%. 

PaaS (Platform-as-a-Service) provides software developers with an on-demand platform—hardware, complete software stack, infrastructure and development tools—for running, developing and managing applications without the cost, complexity and inflexibility of maintaining that platform on-premises. With PaaS, the cloud provider hosts everything at their data center. These include servers, networks, storage, operating system software, middleware  and databases. Developers simply pick from a menu to spin up servers and environments they need to run, build, test, deploy, maintain, update and scale applications.

Today, PaaS is typically built around containers, a virtualized compute model one step removed from virtual servers. Containers virtualize the operating system, enabling developers to package the application with only the operating system services it needs to run on any platform without modification and without the need for middleware.

Red Hat® OpenShift® is a popular PaaS built around Docker containers and Kubernetes, an open source container orchestration solution that automates deployment, scaling, load balancing and more for container-based applications.
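As a small, hedged illustration of what such an orchestrator exposes, the snippet below uses the official Kubernetes Python client to list the pods scheduled on a cluster. It assumes a reachable cluster and a kubeconfig in the default location, neither of which the article specifies.

```python
from kubernetes import client, config  # pip install kubernetes

# Assumes credentials in ~/.kube/config.
config.load_kube_config()

v1 = client.CoreV1Api()
pods = v1.list_pod_for_all_namespaces(watch=False)

# Each pod is a unit of one or more containers that the orchestrator
# places, restarts and scales across the cluster's nodes.
for pod in pods.items:
    print(pod.metadata.namespace, pod.metadata.name, pod.status.phase)
```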

SaaS (Software-as-a-Service) , also known as cloud-based software or cloud applications, is application software hosted in the cloud. Users access SaaS through a web browser, a dedicated desktop client or an API that integrates with a desktop or mobile operating system. Cloud service providers offer SaaS based on a monthly or annual subscription fee. They may also provide these services through pay-per-usage pricing. 
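Because an API is one of the access paths, many SaaS integrations reduce to an authenticated HTTPS call. The sketch below shows the general shape using Python's requests library; the endpoint, token and response format are hypothetical stand-ins rather than any specific vendor's API.

```python
import requests

# Hypothetical SaaS endpoint and bearer token, for illustration only.
resp = requests.get(
    "https://api.example-saas.com/v1/records",
    headers={"Authorization": "Bearer <access-token>"},
    timeout=10,
)
resp.raise_for_status()      # surface HTTP errors early
for record in resp.json():   # assumes the endpoint returns a JSON list
    print(record)
```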

In addition to the cost savings, time-to-value and scalability benefits of cloud, SaaS offers the following:

  • Automatic upgrades:  With SaaS, users take advantage of new features as soon as the cloud service provider adds them, without orchestrating an on-premises upgrade.
  • Protection from data loss:  Because SaaS stores application data in the cloud with the application, users don’t lose data if their device crashes or breaks.

SaaS is the primary delivery model for most commercial software today. Hundreds of SaaS solutions exist, from focused industry and departmental applications to broad administrative suites (for example, Salesforce) and robust enterprise database and artificial intelligence (AI) software. According to an International Data Corporation (IDC) survey (link resides outside ibm.com), SaaS applications represent the largest cloud computing segment, accounting for more than 48% of the $778 billion worldwide cloud software revenue.

Serverless computing , or simply serverless, is a cloud computing model that offloads all the back-end infrastructure management tasks, including provisioning, scaling, scheduling and patching to the cloud provider. This frees developers to focus all their time and effort on the code and business logic specific to their applications.

Moreover, serverless runs application code on a per-request basis only and automatically scales the supporting infrastructure up and down in response to the number of requests. With serverless, customers pay only for the resources used when the application runs; they never pay for idle capacity. 

FaaS, or Function-as-a-Service, is often confused with serverless computing when, in fact, it’s a subset of serverless. FaaS allows developers to run portions of application code (called functions) in response to specific events. Everything besides the code—physical hardware, virtual machine (VM) operating system and web server software management—is provisioned automatically by the cloud service provider in real time as the code runs and is spun back down once execution completes. Billing starts when execution starts and stops when execution stops.
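The shape of a FaaS function is easiest to see in code. Below is a minimal handler in the style of the AWS Lambda Python runtime, which the provider invokes once per triggering event; the greeting logic is an invented example. Everything around the function (servers, scaling, patching) is the provider's responsibility, and billing covers only the time the handler actually runs.

```python
import json

def handler(event, context):
    # 'event' carries the trigger payload (for example, an HTTP request body);
    # 'context' exposes runtime metadata such as the remaining execution time.
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```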

A  public cloud is a type of cloud computing in which a cloud service provider makes computing resources available to users over the public internet. These include SaaS applications, individual  virtual machines (VMs) , bare metal computing hardware, complete enterprise-grade infrastructures and development platforms. These resources might be accessible for free or according to subscription-based or pay-per-usage pricing models.

The public cloud provider owns, manages and assumes all responsibility for the data centers, hardware and infrastructure on which its customers’ workloads run. It typically provides high-bandwidth network connectivity to ensure high performance and rapid access to applications and data.

Public cloud is a  multi-tenant environment  where all customers pool and share the cloud provider’s data center infrastructure and other resources. In the world of the leading public cloud vendors, such as Amazon Web Services (AWS), Google Cloud, IBM Cloud®, Microsoft Azure and Oracle Cloud, these customers can number in the millions.

Most enterprises have moved portions of their computing infrastructure to the public cloud because public cloud services are elastic and readily scalable, flexibly adjusting to meet changing workload demands. The promise of greater efficiency and cost savings through paying only for what they use attracts customers to the public cloud; others seek to reduce spending on hardware and on-premises infrastructure. Gartner predicts (link resides outside ibm.com) that by 2026, 75% of organizations will adopt a digital transformation model predicated on cloud as the fundamental underlying platform.

A  private cloud is a cloud environment where all cloud infrastructure and computing resources are dedicated to one customer only. Private cloud combines many benefits of cloud computing—including elasticity, scalability and ease of service delivery—with the access control, security and resource customization of on-premises infrastructure.

A private cloud is typically hosted on-premises in the customer’s data center. However, it can also be hosted on an independent cloud provider’s infrastructure or built on rented infrastructure housed in an offsite data center.

Many companies choose a private cloud over a public cloud environment to meet their regulatory compliance requirements. Entities like government agencies, healthcare organizations and financial institutions often opt for private cloud settings for workloads that deal with confidential documents, personally identifiable information (PII), intellectual property, medical records, financial data or other sensitive data.

By building private cloud architecture according to  cloud-native  principles, an organization can quickly move workloads to a public cloud or run them within a hybrid cloud (see below) environment whenever ready.

A  hybrid cloud is just what it sounds like: a combination of public cloud, private cloud and on-premises environments. Specifically (and ideally), a hybrid cloud connects a combination of these three environments into a single, flexible infrastructure for running the organization’s applications and workloads. 

At first, organizations turned to hybrid cloud computing models primarily to migrate portions of their on-premises data into private cloud infrastructure and then connect that infrastructure to public cloud infrastructure hosted off-premises by cloud vendors. This process was done through a packaged hybrid cloud solution like Red Hat® OpenShift® or middleware and IT management tools to create a " single pane of glass ." Teams and administrators rely on this unified dashboard to view their applications, networks and systems.

Today, hybrid cloud architecture has expanded beyond physical connectivity and cloud migration to offer a flexible, secure and cost-effective environment that supports the portability and automated deployment of workloads across multiple environments. This feature enables an organization to meet its technical and business objectives more effectively and cost-efficiently than with a public or private cloud alone. For instance, a hybrid cloud environment is ideal for DevOps and other teams to develop and test web applications. This frees organizations from purchasing and expanding the on-premises physical hardware needed to run application testing, offering faster time to market. Once a team has developed an application in the public cloud, they may move it to a private cloud environment based on business needs or security factors.

A public cloud also allows companies to quickly scale resources in response to unplanned spikes in traffic without impacting private cloud workloads, a feature known as cloud bursting. Streaming services such as Amazon Prime Video use cloud bursting to support increased viewer traffic when they premiere new shows.

Most enterprise organizations today rely on a hybrid cloud model because it offers greater flexibility, scalability and cost optimization than traditional on-premises infrastructure setups. According to the  IBM Transformation Index: State of Cloud , more than 77% of businesses and IT professionals have adopted a hybrid cloud approach.


Multicloud uses two or more clouds from two or more different cloud providers. A multicloud environment can be as simple as email SaaS from one vendor and image editing SaaS from another. But when enterprises talk about multicloud, they typically refer to using multiple cloud services—including SaaS, PaaS and IaaS services—from two or more leading public cloud providers. 

Organizations choose multicloud to avoid vendor lock-in, to have more services to select from and to access more innovation. With multicloud, organizations can choose and customize a unique set of cloud features and services to meet their business needs. This freedom of choice includes selecting “best-of-breed” technologies from any CSP, as needed or as they emerge, rather than being locked into the offerings of a single vendor. For example, an organization may choose AWS for its global reach in web hosting, IBM Cloud for data analytics and machine learning platforms and Microsoft Azure for its security features.

A multicloud environment also reduces exposure to licensing, security and compatibility issues that can result from " shadow IT "— any software, hardware or IT resource used on an enterprise network without the IT department’s approval and often without IT’s knowledge or oversight.

Today, most enterprise organizations use a hybrid multicloud model. Apart from the flexibility to choose the most cost-effective cloud service, hybrid multicloud offers the most control over workload deployment, enabling organizations to operate more efficiently, improve performance and optimize costs. According to an  IBM® Institute for Business Value study , the value derived from a full hybrid multicloud platform technology and operating model at scale is two-and-a-half times the value derived from a single-platform, single-cloud vendor approach. 

Yet the modern hybrid multicloud model comes with more complexity. The more clouds you use—each with its own management tools, data transmission rates and security protocols—the more difficult it can be to manage your environment. With  over 97% of enterprises operating on more than one cloud  and most organizations running  10 or more clouds , a hybrid cloud management approach has become crucial. Hybrid multicloud management platforms provide visibility across multiple provider clouds through a central dashboard where development teams can see their projects and deployments, operations teams can monitor clusters and nodes and the cybersecurity staff can monitor for threats.


Traditionally, security concerns have been the primary obstacle for organizations considering cloud services, mainly public cloud services. Maintaining cloud security demands different procedures and employee skillsets than in legacy IT environments. Some cloud security best practices include the following:

  • Shared responsibility for security:  Generally, the cloud service provider is responsible for securing cloud infrastructure, and the customer is responsible for protecting its data within the cloud. However, it’s also essential to clearly define data ownership between private and public third parties.
  • Data encryption:  Data should be encrypted while at rest, in transit and in use. Customers need to maintain complete control over security keys and hardware security modules.
  • Collaborative management:  Proper communication and clear, understandable processes between IT, operations and security teams will ensure seamless cloud integrations that are secure and sustainable.
  • Security and compliance monitoring:  This begins with understanding all regulatory compliance standards applicable to your industry and establishing active monitoring of all connected systems and cloud-based services to maintain visibility of all data exchanges across all environments, on-premises, private cloud, hybrid cloud and edge.

Cloud security is constantly changing to keep pace with new threats. Today’s CSPs offer a wide array of cloud security management tools, including the following:  

  • Identity and access management (IAM):  IAM tools and services automate policy-driven enforcement protocols for all users attempting to access both on-premises and cloud-based services.
  • Data loss prevention (DLP):  DLP services combine remediation alerts, data encryption and other preventive measures to protect all stored data, whether at rest or in motion.
  • Security information and event management (SIEM):  SIEM is a comprehensive security orchestration solution that automates threat monitoring, detection and response in cloud-based environments. SIEM technology uses artificial intelligence (AI)-driven technologies to correlate log data across multiple platforms and digital assets, allowing IT teams to apply their network security protocols and react to potential threats quickly.
  • Automated data compliance platforms:  Automated software solutions provide compliance controls and centralized data collection to help organizations adhere to regulations specific to their industry. Regular compliance updates can be baked into these platforms so organizations can adapt to ever-changing regulatory compliance standards.


Sustainability in business, a company’s strategy to reduce the negative environmental impact of its operations in a particular market, has become an essential corporate governance mandate. Moreover, Gartner predicts (link resides outside ibm.com) that by 2025, the carbon emissions of hyperscale cloud services will be a top-three criterion in cloud purchase decisions.

As companies strive to advance their sustainability objectives, cloud computing has evolved to play a significant role in helping them reduce their carbon emissions and manage climate-related risks. For instance, traditional data centers require power supplies and cooling systems, which depend on large amounts of electrical power. By migrating IT resources and applications to the cloud, organizations not only enhance operational and cost efficiencies but also boost overall energy efficiency through pooled CSP resources.

All major cloud players have made net-zero commitments to reduce their carbon footprints and help clients reduce the energy they typically consume using an on-premises setup. For instance, IBM is driven by  sustainable procurement  initiatives to reach NetZero by 2030. By 2025, IBM Cloud worldwide data centers  will comprise energy procurement drawn from 75% renewable sources .

According to an  International Data Corporation (IDC) forecast  (link resides outside ibm.com), worldwide spending on the whole cloud opportunity (offerings, infrastructure and services) will surpass USD 1 trillion in 2024 while sustaining a double-digit compound annual growth rate (CAGR) of 15.7%. Here are some of the main ways businesses are benefitting from cloud computing: 

  • Scale infrastructure:  Allocate resources up or down quickly and easily in response to changes in business demands.
  • Enable business continuity and disaster recovery:  Cloud computing provides cost-effective redundancy to protect data against system failures and the physical distance required to apply disaster recovery strategies and recover data and applications during a local outage or disaster. All of the major public cloud providers offer Disaster-Recovery-as-a-Service (DRaaS) .
  • Build and test cloud-native applications : For development teams adopting Agile,  DevOps  or  DevSecOps to streamline development, the cloud offers on-demand end-user self-service that prevents operations tasks, such as spinning up development and test servers, from becoming development bottlenecks.
  • Support edge and IoT environments:  Address latency challenges and reduce downtime by bringing data sources closer to the edge . Support Internet of Things (IoT) devices (for example, patient monitoring devices and sensors on a production line) to gather real-time data.
  • Leverage cutting-edge technologies:  Cloud computing supports storing and processing huge volumes of data at high speeds—much more storage and computing capacity than most organizations can or want to purchase and deploy on-premises. These high-performance resources support technologies like blockchain, quantum computing and large language models (LLMs) that power generative AI platforms such as customer service automation.



Top 15 Cloud Computing Research Topics in 2024

Cloud computing has suddenly seen a spike in employment opportunities around the globe, with tech giants like Amazon, Google, and Microsoft hiring people for their cloud infrastructure. Before the onset of cloud computing, companies and businesses had to set up their own data centers and allocate resources and IT professionals, thereby increasing costs. The rapid development of the cloud has led to more flexibility, cost-cutting, and scalability.


The cloud computing market is at an all-time high, with a current market size of USD 371.4 billion, and is expected to grow to USD 832.1 billion by 2025! It is quickly evolving and gradually realizing its business value, attracting more and more researchers, scholars, computer scientists, and practitioners. Cloud computing is not a single topic but a composition of various techniques which together constitute the cloud. Below are 15 of the most in-demand research topics in the field of cloud computing.

What is Cloud Computing?

Cloud computing is the practice of storing and accessing data and applications on remote servers hosted over the internet, as opposed to local servers or the computer’s hard drive. Cloud computing, often known as Internet-based computing, is a technique in which the user receives a resource as a service via the Internet. Files, photos, documents, and other storable items are all examples of the data that can be stored.

Let us look at the latest in cloud computing research for 2024! We’ve compiled 15 important cloud computing research topics that are changing how cloud computing is used.

1. Big Data

Big data refers to the large amounts of data produced by various programs in a very short duration of time. It is quite cumbersome to store such huge and voluminous amounts of data in company-run data centers. Also, gaining insights from this data becomes a tedious task and takes a lot of time to run and provide results; therefore, the cloud is the best option. All the data can be pushed onto the cloud without the need for physical storage devices that have to be managed and secured. Also, some popular public clouds provide comprehensive big data platforms to turn data into actionable insights.
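For instance, landing raw data in cloud object storage is often a single SDK call, after which a managed big data platform can query it in place. The sketch below uploads a local file to Amazon S3 with boto3; the file, bucket and key names are hypothetical.

```python
import boto3

s3 = boto3.client("s3")

# Hypothetical names; the bucket must already exist in your account.
s3.upload_file(
    "events-2024-06-01.json",      # local file produced by an application
    "my-raw-data-bucket",          # cloud bucket replacing on-premises storage
    "raw/events-2024-06-01.json",  # object key under which the data is stored
)
```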

2. DevOps

DevOps is an amalgamation of two terms, Development and Operations. It has led to continuous delivery, integration, and deployment, thereby reducing boundaries between the development team and the operations team. Heavy applications and software need elaborate and complex tech stacks that demand extensive labor to develop and configure, which can easily be eliminated by cloud computing. It offers a wide range of tools and technologies to build, test, and deploy applications within a few minutes and a single click. They can be customized as per the client’s requirements and discarded when not in use, making the process seamless and cost-efficient for development teams.

3. Cloud Cryptography

Data in the cloud needs to be protected and secured from foreign attacks and breaches. To accomplish this, cryptography in the cloud is a widely used technique to secure data present in the cloud. It allows users and clients to easily and reliably access shared cloud services, since all the data is secured using either encryption techniques or the concept of a private key. It can make plain text unreadable and limit the view of the data being transferred. The best cloud cryptographic security techniques are the ones that do not compromise the speed of data transfer and provide security without delaying the exchange of sensitive data.
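As a minimal sketch of encrypting data before it leaves for the cloud, the example below uses symmetric (Fernet) encryption from the widely used cryptography package. How the key is stored and rotated (for example, in a key management service) is assumed and out of scope here.

```python
from cryptography.fernet import Fernet  # pip install cryptography

key = Fernet.generate_key()  # in practice, keep this in a key management service
f = Fernet(key)

# Encrypt locally, then upload only the ciphertext to the cloud.
token = f.encrypt(b"sensitive record")

# Only holders of the key can recover the plain text.
assert f.decrypt(token) == b"sensitive record"
```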

4. Cloud Load Balancing

Cloud load balancing refers to splitting and distributing the incoming load to servers from various sources. It permits companies and organizations to govern and supervise workload and application demands by redistributing, reallocating, and administering resources between different computers, networks, or servers. It encompasses managing the circulation of traffic and demands over the Internet. This reduces the problem of sudden outages, improves overall performance, lowers the chance of server crashes, and provides an advanced level of security. Cloud-based server farms can accomplish more precise scalability and availability using the server load balancing mechanism, through which workload demands can be easily distributed and controlled.
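The simplest distribution policy, round robin, fits in a few lines; production cloud load balancers layer health checks, weights and session affinity on top of the same idea. The server addresses below are made up for illustration.

```python
import itertools

class RoundRobinBalancer:
    """Hand each incoming request to the next server in the pool, in turn."""

    def __init__(self, servers):
        self._pool = itertools.cycle(servers)

    def route(self, request):
        return next(self._pool), request

lb = RoundRobinBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
for i in range(6):
    server, _ = lb.route({"id": i})
    print(f"request {i} -> {server}")  # traffic alternates evenly across the pool
```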

5. Mobile Cloud Computing

Mobile cloud computing is a mixture of cloud computing, mobile computing, and wireless networks that provides services such as seamless and abundant computational resources to mobile users, network operators, and cloud computing providers. The handheld device is the console, and all the processing and data storage take place outside the physical mobile device. Some advantages of mobile cloud computing are that there is no need for costly hardware, battery life is longer, data storage capacity and processing power are extended, data synchronization is improved, and availability is high due to “store in one place, accessible from anywhere”. The integration and security aspects are taken care of by the backend, which supports an abundance of access methods.

6. Green Cloud Computing

The major challenge in the cloud is to develop energy-efficient, and hence environmentally and economically friendly, cloud computing solutions. Data centers that include servers, cables, air conditioners, networks, etc. in large numbers consume a lot of power and release enormous quantities of carbon dioxide into the atmosphere. Green cloud computing focuses on making virtual data centers and servers more environmentally friendly and energy-efficient. Cloud resources often consume so much power and energy that they contribute to energy shortages and affect the global climate. Green cloud computing provides solutions to make such resources more energy-efficient and to reduce operational costs. It pivots on power management, virtualization of servers and data centers, recycling vast e-waste, and environmental sustainability.

7. Edge Computing

It is an advancement of, and a more efficient form of, cloud computing in which data is processed nearer to its source. Edge computing holds that computation should be carried out at the edge of the network itself rather than on a centrally managed platform or data warehouse. It distributes data processing techniques and mechanisms across different locations, so that data is delivered to the nearest node and processed at the edge. This also increases the security of the data, since it stays closer to the source, and eliminates late response times and latency without affecting productivity.

8. Containerization

Containerization in cloud computing is a procedure for achieving operating system virtualization. The user can work with a program and its dependencies in isolated, resource-controlled processes. Containers serve as building blocks that aid operational effectiveness, version control, developer productivity, and environmental stability. The infrastructure is improved since containers provide additional control over the granular activities of resources. The usage of containers in online services assists storage with cloud computing data security, elasticity, and availability. Containers provide advantages such as a steady runtime environment, the ability to run virtually anywhere, and low overhead compared to virtual machines.
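
As a small illustration, the sketch below, assuming the Docker SDK for Python (`pip install docker`) and a running Docker daemon, launches a throwaway container and captures its output:

```python
# Minimal sketch using the Docker SDK for Python (pip install docker);
# assumes a local Docker daemon is running.
import docker

client = docker.from_env()

# Run a throwaway container: a steady runtime environment with low
# overhead compared to a full virtual machine.
output = client.containers.run("alpine:latest", "echo hello from a container", remove=True)
print(output.decode().strip())
```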

9. Cloud Deployment Model

There are four main cloud deployment models: public cloud, private cloud, hybrid cloud, and community cloud. Each deployment model is defined by the location of the infrastructure. The public cloud allows systems and services to be easily accessible to the general public; it can also be less reliable since it is open to everyone (e.g., email). A private cloud allows systems and services to be accessible only inside an organization, with no access for outsiders; it offers better security due to its access restrictions. A hybrid cloud is a mixture of private and public clouds, with critical activities performed in the private cloud and non-critical activities in the public cloud. A community cloud allows systems and services to be accessible to a group of organizations.

10. Cloud Security

Since the number of companies and organizations using cloud computing is increasing rapidly, the security of the cloud is a major concern. Cloud computing security detects and addresses every physical and logical security issue across the varied service models of software, platform, and infrastructure. It addresses these services collectively, even though they are delivered in units under the public, private, or hybrid delivery model. Security in the cloud protects data from leakage, theft, disaster, and deletion. With the help of tokenization, Virtual Private Networks, and firewalls, data can be secured.
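
As a toy illustration of one of these techniques, tokenization, the following Python sketch replaces a sensitive value with an opaque token; the vault here is just an in-memory dictionary, whereas a real deployment would use a separately secured service:

```python
# Toy tokenization sketch: sensitive values are replaced by random tokens,
# and the token-to-value mapping is kept in a separately secured vault.
import secrets

vault = {}   # stand-in for a secured token vault

def tokenize(value):
    token = secrets.token_hex(8)   # opaque token, reveals nothing about the value
    vault[token] = value
    return token

def detokenize(token):
    return vault[token]

card = "4111-1111-1111-1111"
t = tokenize(card)
print(t)                    # safe to store or transmit
assert detokenize(t) == card
```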

11. Serverless Computing

Serverless computing is a way of running computer programs without having to manage the underlying infrastructure. Instead of worrying about servers, networking, and scaling, you can focus solely on writing code to solve your problem. In serverless computing, you write small pieces of code called functions. These functions are designed to do specific tasks, like processing data, handling user requests, or performing calculations. When something triggers your function, like a user making a request to your website or a timer reaching a certain time, the cloud provider automatically runs your function for you. You don’t have to worry about setting up servers or managing resources.
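
A minimal sketch of such a function, written in the style of an AWS Lambda Python handler (the event shape shown is hypothetical), looks like this:

```python
# Minimal sketch of a Lambda-style serverless function: the platform
# invokes handler() on each trigger, and no server-management code
# appears anywhere. The event shape here is a hypothetical example.
import json

def handler(event, context):
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"hello, {name}"}),
    }

# Local simulation of a trigger:
print(handler({"name": "cloud"}, context=None))
```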

12. Cloud-Native Applications

Modern applications built for the cloud, also known as cloud-native applications, are designed to take full advantage of cloud computing environments. Instead of bulky monolithic systems, they are built to prioritize flexibility, easy scaling, reliability, and constant updates. Rather than one big chunk, they are made up of many smaller pieces called microservices. This modular approach allows them to adapt to changing needs by growing or shrinking on demand, making them well suited to the ever-shifting world of cloud environments. Deployed in public, private, or hybrid clouds, they are optimized to make the most of cloud-native technologies and methodologies.
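
As a small illustration, one such microservice might look like the following Flask sketch (the routes and port are hypothetical); each service of this kind is deployed and scaled independently of the others:

```python
# Minimal sketch of one microservice in a cloud-native application,
# assuming the third-party Flask package (pip install flask).
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/health")
def health():
    return jsonify(status="ok")   # probed by the platform's load balancer

@app.route("/orders/<int:order_id>")
def get_order(order_id):
    # A real service would query its own datastore here.
    return jsonify(order_id=order_id, status="shipped")

if __name__ == "__main__":
    app.run(port=8080)
```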

13. Multi-Cloud Management

Multi-cloud management means handling and controlling your assets (such as software, data, and services) when they are spread across different cloud providers, like Amazon, Google, or Microsoft. It is like having a central command center for cloud resources spread across different cloud services. Multi-cloud gives you the freedom to use the strengths of different providers: you can choose the best service for each specific workload based on factors like cost, performance, or features. This flexibility allows you to scale applications up or down as required. Managing a complex environment with resources spread across multiple cloud providers can be a challenge; multi-cloud management tools simplify this by providing a unified view and a standardized management interface.

14. Blockchain in Cloud Computing

Cloud computing provides flexible storage and processing power that can grow or shrink as needed. Blockchain keeps data secure by spreading it across many computers. When we use them together, blockchain apps can use the cloud’s power for big tasks while keeping data safe and transparent. This combo boosts cloud data security and makes it easy to track data. It also lets people manage their identities without a central authority. However, there are challenges like making sure different blockchain and cloud systems work well together and can handle large amounts of data.
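
A toy Python sketch of the underlying hash-chain idea shows why such storage is tamper-evident; this illustrates the principle only and is not a production blockchain:

```python
# Toy hash chain showing why blockchain-backed cloud storage is tamper-evident:
# each block commits to its predecessor, so editing any record breaks the chain.
import hashlib
import json

def block_hash(block):
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

chain = [{"index": 0, "data": "genesis", "prev": "0" * 64}]
for i, record in enumerate(["upload A", "upload B"], start=1):
    chain.append({"index": i, "data": record, "prev": block_hash(chain[-1])})

def verify(chain):
    return all(chain[i]["prev"] == block_hash(chain[i - 1]) for i in range(1, len(chain)))

print(verify(chain))          # True
chain[1]["data"] = "tampered"
print(verify(chain))          # False: the tampering is detected
```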

15. Cloud-Based Internet of Things (IoT)

Cloud-based Internet of Things (IoT) refers to the integration of cloud computing with IoT devices and systems. This integration allows IoT devices to leverage the computational power, storage, and analytics capabilities of cloud platforms to manage, process, and analyze the vast amounts of data they generate. The cloud serves as a central hub for connecting and managing multiple IoT devices, regardless of their geographical location. This connectivity is crucial for monitoring and controlling devices remotely.
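
As a minimal sketch, a device-side publish to such a cloud hub might look like the following, assuming the third-party paho-mqtt package (the 1.x-style constructor is shown) and a reachable MQTT broker; the hostname, device name, and topic are placeholders:

```python
# Minimal sketch of an IoT device pushing a sensor reading to a cloud hub,
# assuming paho-mqtt (pip install paho-mqtt) and a reachable broker.
# "broker.example.com" and the topic are placeholders.
import json
import paho.mqtt.client as mqtt

client = mqtt.Client()   # paho-mqtt 1.x-style constructor
client.connect("broker.example.com", 1883)

reading = {"device": "thermo-42", "temp_c": 21.5}
client.publish("sensors/temperature", json.dumps(reading))
client.disconnect()
```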


Cloud computing has helped businesses grow by offering greater scalability and flexibility and by reducing the cost of getting the same job done. As cloud computing is in a period of strong growth, it has created many employment opportunities, and research in its different areas is shaping the future of the technology. We have discussed the top 15 cloud computing research topics; you can explore and research these areas to contribute to the growth of cloud computing technology.


Systematic Review Article

Securing Machine Learning in the Cloud: A Systematic Review of Cloud Machine Learning Security


  • 1 Information Technology University (ITU), Lahore, Pakistan
  • 2 AI4Networks Research Center, University of Oklahoma, Norman, OK, United States
  • 3 Social Data Science (SDS) Lab, Queen Mary University of London, London, United Kingdom
  • 4 School of Computing and Communications, Lancaster University, Lancaster, United Kingdom
  • 5 Hamad Bin Khalifa University (HBKU), Doha, Qatar

With the advances in machine learning (ML) and deep learning (DL) techniques, and the potency of cloud computing in offering services efficiently and cost-effectively, Machine Learning as a Service (MLaaS) cloud platforms have become popular. In addition, there is increasing adoption of third-party cloud services for outsourcing the training of DL models, which requires substantial, costly computational resources (e.g., high-performance graphics processing units (GPUs)). Such widespread usage of cloud-hosted ML/DL services opens a wide range of attack surfaces for adversaries to exploit the ML/DL system for malicious goals. In this article, we conduct a systematic evaluation of the literature on cloud-hosted ML/DL models along both of the important dimensions related to their security: attacks and defenses. Our systematic review identified a total of 31 related articles, of which 19 focused on attacks, six focused on defenses, and six focused on both attacks and defenses. Our evaluation reveals increasing interest from the research community in attacking and defending Machine Learning as a Service platforms. In addition, we identify the limitations and pitfalls of the analyzed articles and highlight open research issues that require further investigation.

1 Introduction

In recent years, machine learning (ML) techniques have been successfully applied to a wide range of applications, significantly outperforming previous state-of-the-art methods in various domains, for example, image classification, face recognition, and object detection. These ML techniques, in particular deep learning (DL)-based techniques, are resource intensive and require large amounts of training data to accomplish a specific task with good performance. Training DL models on large-scale datasets is usually performed using high-performance graphics processing units (GPUs) and tensor processing units (TPUs). However, given the cost of GPUs/TPUs and the fact that small businesses and individuals cannot afford such computational resources, the training of deep models is typically outsourced to clouds, which is referred to in the literature as "Machine Learning as a Service" (MLaaS).

MLaaS refers to different ML services offered as a component of cloud computing services, for example, predictive analytics, face recognition, natural language services, and data modeling APIs. MLaaS allows users to upload their data and model for training on the cloud. In addition to training, cloud-hosted ML services can also be used for inference purposes, that is, models can be deployed in cloud environments; the system architecture of a typical MLaaS is shown in Figure 1.

FIGURE 1. An illustration of a typical cloud-based ML or machine learning as a service (MLaaS) architecture.

MLaaS 1 can help reduce the entry barrier to the use of ML and DL through access to managed services of wide hardware heterogeneity and incredible horizontal scale. MLaaS is currently provided by several major organizations such as Google, Microsoft, and Amazon. For example, Google offers Cloud ML Engine 2 that allows developers and data scientists to upload training data and a model, which is trained on the cloud in the Tensorflow 3 environment. Similarly, Microsoft offers Azure Batch AI 4, a cloud-based service for training DL models using different frameworks supported by both Linux and Windows operating systems, and Amazon offers a cloud service named Deep Learning AMI (DLAMI) 5 that provides several pre-built DL frameworks (e.g., MXNet, Caffe, Theano, and Tensorflow) in Amazon's EC2 cloud computing infrastructure. Such cloud services are popular among researchers, as evidenced by the price of Amazon's p2.16xlarge instance rising to the maximum possible two days before the deadline of NeurIPS 2017 (the largest research venue on ML), indicating that a large number of users request to reserve instances.

In addition to MLaaS services that allow users to upload their model and data for training on the cloud, transfer learning is another strategy to reduce computational cost, in which a pretrained model is fine-tuned for a new task (using a new dataset). Transfer learning is widely applied to image recognition tasks using convolutional neural networks (CNNs). A CNN model learns and encodes features like edges and other patterns; the learned weights and convolutional filters are useful for image recognition tasks in other domains, and state-of-the-art results can be obtained with a minimal amount of training, even on a single GPU. Moreover, various popular pretrained models such as AlexNet ( Krizhevsky et al., 2012 ), VGG ( Simonyan and Zisserman, 2015 ), and Inception ( Szegedy et al., 2016 ) are available online for download and fine-tuning. Both of the aforementioned outsourcing strategies come with new security concerns. In addition, the literature suggests that different types of attacks can be realized on different components of the communication network as well ( Usama et al., 2020a ), for example, intrusion detection ( Han et al., 2020 ; Usama et al., 2020b ), network traffic classification ( Usama et al., 2019 ), and malware detection systems ( Chen et al., 2018 ). Moreover, adversarial ML attacks have also been devised for client-side ML classifiers, for example, Google's phishing pages filter ( Liang et al., 2016 ).

Contributions of the article: In this article, we analyze the security of MLaaS and other cloud-hosted ML/DL models and provide a systematic review of associated security challenges and solutions. To the best of our knowledge, this article is the first effort on providing a systematic review of the security of cloud-hosted ML models and services. The following are the major contributions of this article:

(1) We conducted a systematic evaluation of 31 articles related to MLaaS attacks and defenses.

(2) We investigated five themes of approaches aiming to attack MLaaS and cloud-hosted ML services.

(3) We examined five themes of defense methods for securing MLaaS and cloud-hosted ML services.

(4) We identified the pitfalls and limitations of the examined articles. Finally, we have highlighted open research issues that require further investigation.

Organization of the article: The rest of the article is organized as follows. The methodology adopted for the systematic review is presented in Section 2. The results of the systematic review are presented in Section 3. Section 4 presents various security challenges associated with cloud-hosted ML models, and potential solutions for securing cloud-hosted ML models are presented in Section 5. The pitfalls and limitations of the reviewed approaches are discussed in Section 6. Various open research issues that require further investigation are highlighted in Section 7, and we briefly reflect on our methodology to identify any threats to validity in Section 8. Finally, we conclude the article in Section 9.

2 Review Methodology

In this section, we present the research objectives and the adopted methodology for the systematic review. The purpose of this article is to identify and systematically review the state-of-the-art research related to the security of cloud-based ML/DL techniques. The methodology followed for this study is depicted in Figure 2.

FIGURE 2. The methodology for systematic review.

2.1 Research Objectives

The following are the key objectives of this article.

O1: To build upon the existing work around the security of cloud-based ML/DL methods and present a broad overview of the existing state-of-the-art literature related to MLaaS and cloud-hosted ML services.

O2: To identify and present a taxonomy of different attack and defense strategies for cloud-hosted ML/DL models.

O3: To identify the pitfalls and limitations of the existing approaches in terms of research challenges and opportunities.

2.2 Research Questions

To achieve our objectives, we consider the two important questions described below and conduct a systematic analysis of 31 articles.

Q1: What are the well-known attacks on cloud-hosted/third-party ML/DL models?

Q2: What are the countermeasures and defenses against such attacks?

2.3 Review Protocol

We developed a review protocol to conduct the systematic review; the details are described below.

2.3.1 Search Strategy and Searching Phase

To build a knowledge base and extract the relevant articles, eight major publishers and online repositories were queried: ACM Digital Library, IEEE Xplore, ScienceDirect, the international conference on machine learning, the international conference on learning representations, the journal of machine learning research, neural information processing systems, USENIX, and arXiv. As we added non-peer-reviewed articles from the electronic preprint archive (arXiv), we (AQ and AI) performed a critical appraisal using the AACODS checklist ( Tyndall, 2010 ), which is designed for the critical evaluation and appraisal of gray literature.

In the initial phase, we queried the main libraries using a set of different search terms that evolved through an iterative process to maximize the number of relevant articles. To achieve optimal sensitivity, we used combinations of the words: attack, poisoning, Trojan attack, contamination, model inversion, evasion, backdoor, model stealing, black box, ML, neural networks, MLaaS, cloud computing, outsource, third party, secure, robust, and defense. The combinations of search keywords used are depicted in Figure 3. We then created search strategies with the controlled or index terms given in Figure 3. Please note that no lower limit for the publication date was applied; the last search date was June 2020. The researchers (WI and AI) searched for additional articles through citations and by snowballing on Google Scholar. Any disagreement was adjudicated by the third reviewer (AQ). Finally, articles focusing on attacks/defenses for cloud-based ML models were retrieved.

FIGURE 3. Search queries used to identify publications to include in the systematic review.

2.3.2 Inclusion and Exclusion Criteria

The inclusion and exclusion criteria followed for this systematic review are defined below.

2.3.2.1 Inclusion Criteria

The following are the key points that we considered for screening retrieved articles as relevant for conducting a systematic review.

• We included all articles relevant to the research questions and published in the English language that discuss attacks on cloud-based ML services, for example, those offered by cloud computing service providers.

• We then assessed the eligibility of the relevant articles by identifying whether they discussed either attack or defense for cloud-based ML/DL models.

• We included comparative studies that compare attacks and robustness against different well-known attacks on cloud-hosted ML services (poisoning attacks, black box attacks, Trojan attacks, backdoor attacks, contamination attacks, inversion, stealing, and evasion attacks).

• Finally, we categorized the selected articles into three categories, that is, articles on attacks, articles on defenses, and articles on attacks and defenses.

2.3.2.2 Exclusion Criteria

The exclusion criteria are outlined below.

• Articles that are written in a language other than English.

• Articles not available in full text.

• Secondary studies (e.g., systematic literature reviews, surveys, editorials, and abstracts or short papers) are not included.

• Articles that do not discuss attacks and defenses for cloud-based/third-party ML services, that is, we only consider those articles which have proposed an attack or defense for a cloud-hosted ML or MLaaS service.

2.3.3 Screening Phase

For the screening of articles, we employed two phases based on the content of the retrieved articles: 1) title and abstract screening and 2) full-text screening. Please note that to avoid bias and to ensure that judgments about the relevancy of articles were based entirely on the content of the publications, we intentionally did not consider authors, publication type (e.g., conference or journal), or publisher (e.g., IEEE or ACM). Titles and abstracts might not be true reflectors of an article's contents; however, we concluded that our review protocol is sufficient to avoid provenance-based bias.

It is very common for the same work to be published in multiple venues, for example, conference papers are usually extended to journals. In such cases, we only considered the original article. In the screening phase, every article was screened by at least two authors of this article, who were tasked with annotating the articles as relevant, not relevant, or needing further investigation; decisions were finalized by discussion between the authors until every such article was marked as either relevant or not relevant. Only original technical articles were selected, while survey and review articles were ignored. Finally, all selected publications were thoroughly read by the authors for categorization and thematic analysis.

3 Review Results

3.1 Overview of the Search and Selection Process Outcome

The search using the aforementioned strategy identified a total of 4,384 articles. After removing duplicates and performing title and abstract screening, the overall number of articles was reduced to 384. A total of 230 articles did not meet the inclusion criteria and were therefore excluded. From the remaining 154 articles, 123 did not discuss attacks/defenses for third-party cloud-hosted ML models and were excluded as well. Of the remaining articles, a total of 31 were identified as relevant. Reasons for excluding articles were documented and reported in a PRISMA flow diagram, depicted in Figure 4. These articles were categorized into three classes, that is, articles specifically focused on attacks, articles specifically focused on defenses, and articles that considered both attacks and defenses, containing 19, six, and six articles, respectively.

FIGURE 4. Flowchart of systematic review and categorization.

3.2 Overview of the Selected Studies

The systematic review eventually identified a set of 31 articles related to cloud-based ML/DL models and MLaaS, which we categorized into three classes as mentioned above and shown in Figure 4. As shown in Figure 5, a significant portion of the selected articles were published in conferences (41.94%); a much smaller proportion were published in journals or transactions (19.35%). The percentage of gray literature (i.e., non-peer-reviewed articles) is 25.81%. A small proportion of publications appeared in symposia (6.45%), and the same percentage appeared in workshops. The distribution of selected publications by type over the years is shown in Figure 6. The figure depicts that interest in the security of cloud-hosted ML/DL models increased in 2017, peaked in 2018, and was slightly lower in 2019 compared to 2018. The majority of the articles during these years were published in conferences. The distribution of selected publications by publisher over the years is depicted in Figure 7; the figure shows that the majority of the publications appeared at IEEE, ACM, and arXiv, with a similar trend in the number of articles in 2017, 2018, and 2019 as discussed previously.

FIGURE 5. Distribution of selected publications according to their types.

FIGURE 6. Distribution of selected publications by types over years.

FIGURE 7. Distribution of selected publications by publishers over years.

3.3 Some Partially Related Non-Selected Studies: A Discussion

We have described our inclusion and exclusion criteria, which helped us identify relevant articles. We note, however, that some seemingly relevant articles failed to meet the inclusion criteria. Here, we briefly describe a few such articles to give a rationale for why they were not included.

• Liang et al. (2016) investigated the security challenges for client-side classifiers via a case study on Google's phishing pages filter, a very widely used classifier for automatically detecting unknown phishing pages. They devised an attack that is not relevant to cloud-based services.

• Demetrio et al. (2020) presented WAF-A-MoLE, a tool that models the presence of an adversary. This tool leverages a set of mutation operators that alter the syntax of a payload without affecting the original semantics. Using the results, the authors demonstrated that ML-based WAFs are exposed to a concrete risk of being bypassed. However, this attack is not associated with any cloud-based services.

• The authors of Apruzzese et al. (2019) discussed adversarial attacks in which the machine learning model is compromised to induce an output favorable to the attacker. These attacks are realized in a different setting than the scope of this systematic review, as we only included articles that discuss an attack or defense when the cloud is outsourcing its services as MLaaS.

• Han et al. (2020) conducted the first systematic study of the practical traffic-space evasion attack on learning-based network intrusion detection systems; again, it falls outside the inclusion criteria of our work.

• Chen et al. (2018) designed and evaluated three types of attackers targeting the training phases to poison detection. To address this threat, the authors proposed a detection system, KuafuDet, and showed that it significantly reduces false negatives and boosts detection accuracy.

• Song et al. (2020) presented a federated defense approach for mitigating the effect of adversarial perturbations in a federated learning environment. This article is potentially relevant to our study, as it addresses the problem of defending cloud-hosted ML models; however, instead of using a third-party service, the authors conducted the experiments on a single computer system in a simulated environment. Therefore, this study is not included in the analysis of this article.

• In a similar study, Zhang et al. (2019) presented a defense mechanism against adversarial attacks on cloud-aided automatic speech recognition (ASR); however, it is not explicitly stated that the cloud is outsourcing ML services, nor which ML/DL model or MLaaS was used in the experiments.

4 Attacks on Cloud-Hosted Machine Learning Models (Q1)

In this section, we present the findings from the systematically selected articles that aim at attacking cloud-hosted/third-party ML/DL models.

4.1 Attacks on Cloud-Hosted Machine Learning Models: Thematic Analysis

In ML practice, it is very common to outsource the training of ML/DL models to third-party services that provide high computational resources on the cloud. Such services enable ML practitioners to upload their models along with training data, which is then trained on the cloud. Although such services have clear benefits for reducing training and inference time, they can easily be compromised, and different types of attacks against them have been proposed in the literature. In this section, we present the thematic analysis of 19 articles that are focused on attacking cloud-hosted ML/DL models. These articles are classified into five major themes: 1) attack type, 2) threat model, 3) attack method, 4) target model(s), and 5) dataset.

Attack type: A wide variety of attacks have been proposed in the literature. These are listed below with their descriptions provided in the next section.

• Adversarial attacks ( Brendel et al., 2017 );

• Backdoor attacks 6 ( Chen et al., 2017 ; Gu et al., 2019 );

• Cyber kill chain–based attack ( Nguyen, 2017 );

• Data manipulation attacks ( Liao et al., 2018 );

• Evasion attacks ( Hitaj et al., 2019 );

• Exploration attacks ( Sethi and Kantardzic, 2018 );

• Model extraction attacks ( Correia-Silva et al., 2018 ; Kesarwani et al., 2018 ; Joshi and Tammana, 2019 ; Reith et al., 2019 );

• Model inversion attacks ( Yang et al., 2019 );

• Model-reuse attacks ( Ji et al., 2018 );

• Trojan attacks ( Liu et al., 2018 ).

Threat model: Based on the adversary's knowledge of the target model, the reviewed attacks assume one of three threat models:

• Black box attacks (no knowledge) ( Brendel et al., 2017 ; Chen et al., 2017 ; Hosseini et al., 2017 ; Correia-Silva et al., 2018 ; Sethi and Kantardzic, 2018 ; Hitaj et al., 2019 );

• White box attacks (full knowledge) ( Liao et al., 2018 ; Liu et al., 2018 ; Gu et al., 2019 ; Reith et al., 2019 );

• Gray box attacks (partial knowledge) ( Ji et al., 2018 ; Kesarwani et al., 2018 ).

Attack method: In each article, a different type of method is proposed for attacking cloud-hosted ML/DL models; a brief description of these methods is presented in Table 1 and is discussed in detail in the next section.

TABLE 1. Summary of the state-of-the-art attack types for cloud-based/third-party ML/DL models.

Target model(s): The considered studies have used different MLaaS services (e.g., Google Cloud ML services ( Hosseini et al., 2017 ; Salem et al., 2018 ; Sethi and Kantardzic, 2018 ), ML models on the BigML platform ( Kesarwani et al., 2018 ), IBM's visual recognition ( Nguyen, 2017 ), and Amazon Prediction APIs ( Reith et al., 2019 ; Yang et al., 2019 )).

Dataset: These attacks have been realized using different datasets, ranging from small datasets (e.g., MNIST ( Gu et al., 2019 ) and Fashion-MNIST ( Liu et al., 2018 )) to larger datasets (e.g., the YouTube Aligned Face Dataset ( Chen et al., 2017 ), Project Wolf Eye ( Nguyen, 2017 ), and the Iris dataset ( Joshi and Tammana, 2019 )). Other datasets include California Housing, Boston House Prices, UJIIndoorLoc, and IPIN 2016 Tutorial ( Reith et al., 2019 ), and FaceScrub, CelebA, and CIFAR-10 ( Yang et al., 2019 ). A summary of the thematic analysis of these attacks is presented in Table 1 and briefly described in the next section.

4.2 Taxonomy of Attacks on Cloud-Hosted Machine Learning Models

In this section, we present a taxonomy and description of different attacks described above in thematic analysis. A taxonomy of attacks on cloud-hosted ML/DL models is depicted in Figure 8 and is described next.

FIGURE 8. Taxonomy of different attacks realized on the third-party cloud-hosted machine learning (ML) or deep learning (DL) models.

4.2.1 Adversarial Attacks

In recent years, DL models have been found vulnerable to carefully crafted, imperceptible adversarial examples ( Goodfellow et al., 2014 ). For instance, a decision-based adversarial attack, namely the boundary attack, against two black box ML models trained for brand and celebrity recognition hosted at Clarifai.com is proposed in Brendel et al. (2017) . The first model identifies brand names from natural images for 500 distinct brands, and the second model recognizes over 10,000 celebrities. To date, a variety of adversarial example generation methods have been proposed in the literature; interested readers are referred to recent survey articles for a detailed taxonomy of the different types of adversarial attacks (i.e., Akhtar and Mian, 2018 ; Yuan et al., 2019 ; Qayyum et al., 2020b ; Demetrio et al., 2020 ).
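
To illustrate the core idea behind gradient-based adversarial examples (not the boundary attack above, which is decision-based), here is a minimal numpy sketch of the fast gradient sign method applied to a toy logistic-regression model with assumed weights:

```python
# Toy numpy sketch of the fast gradient sign method (FGSM) idea behind
# many adversarial attacks, applied to an assumed logistic-regression model.
import numpy as np

w, b = np.array([2.0, -1.5]), 0.3            # assumed weights of a trained model

def predict_proba(x):
    """Sigmoid probability of class 1 for the logistic-regression victim."""
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))

x, y = np.array([0.9, 0.2]), 1               # an input the model classifies correctly
grad_x = (predict_proba(x) - y) * w          # gradient of the cross-entropy loss w.r.t. x

eps = 0.6
x_adv = x + eps * np.sign(grad_x)            # FGSM step: small move along the gradient sign

print(f"clean confidence for class 1: {predict_proba(x):.3f}")      # ~0.86
print(f"adversarial confidence:       {predict_proba(x_adv):.3f}")  # drops below 0.5
```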

4.2.2 Exploratory Attacks

These are inference-time attacks in which an adversary attempts to evade the underlying ML/DL model, for example, by forcing the classifier (i.e., the ML/DL model) to misclassify a positive sample as a negative one. Exploratory attacks do not harm the training data and only affect the model at test time. A data-driven exploratory attack using the Seed – Explore – Exploit strategy for evading Google's cloud prediction API under black box settings is presented in Sethi and Kantardzic (2018) . The performance evaluation of the proposed framework was performed using 10 real-world datasets.

4.2.3 Model Extraction Attacks

In model extraction attacks, adversaries can query the deployed ML model and use query–response pairs to compromise future predictions; they can also potentially realize privacy breaches of the training data and steal the model by learning extraction queries. In Kesarwani et al. (2018) , the authors presented a novel method for quantifying the extraction status of models for users with an increasing number of queries, which aims to measure the model learning rate using the information gain observed in users' query and response streams. The key objective of the authors was to design a cloud-based system for monitoring model extraction status and issuing warnings. The performance evaluation of the proposed method was performed using a decision tree model deployed on the BigML MLaaS platform under different adversarial attack scenarios. Similarly, a model extraction/stealing strategy is presented by Correia-Silva et al. (2018) . The authors queried the cloud-hosted DL model with random unlabeled samples and used its predictions to create a fake dataset. They then used the fake dataset to build a fake model by training an oracle (copycat) model, in an attempt to achieve performance similar to that of the target model.
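
The copycat idea can be illustrated with a minimal scikit-learn sketch; this is our own toy reconstruction under assumed models and data, not the reviewed authors' code:

```python
# Toy sketch of the "copycat" extraction idea: query a victim model with
# unlabeled samples and train a substitute on its answers.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=600, n_features=8, random_state=0)
victim = LogisticRegression(max_iter=1000).fit(X, y)   # stand-in for the cloud-hosted model

# Attacker: no labels, only query access to the victim's prediction API.
X_query = np.random.default_rng(1).normal(size=(2000, 8))
fake_labels = victim.predict(X_query)                  # "fake dataset" built from responses

copycat = DecisionTreeClassifier(random_state=0).fit(X_query, fake_labels)
agreement = (copycat.predict(X) == victim.predict(X)).mean()
print(f"copycat agrees with victim on {agreement:.1%} of inputs")
```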

4.2.4 Backdooring Attacks

In backdooring attacks, an adversary maliciously creates a trained model that performs as well as expected on the users' training and validation data but performs badly on attacker input samples. Backdooring attacks on deep neural networks (DNNs) are explored and evaluated in Gu et al. (2019) . The authors first explored the properties of backdooring for a toy example and created a backdoor model for a handwritten digit classifier, and then demonstrated that backdoors are powerful for DNNs by creating a backdoor model for a United States street sign classifier. Two scenarios were considered, that is, outsourced training of the model and transfer learning, where an attacker can acquire a backdoored pretrained model online. In another similar study ( Chen et al., 2017 ), a targeted backdoor attack for two state-of-the-art face recognition models, that is, DeepID ( Sun et al., 2014 ) and VGG-Face ( Parkhi et al., 2015 ), is presented. The authors proposed two categories of backdooring poisoning attacks, that is, input–instance–key attacks and pattern–key attacks, using two corresponding data poisoning strategies.
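
The data poisoning step behind such backdoors can be sketched in a few lines of numpy; this toy example (random arrays stand in for real images) stamps a trigger patch on a subset of samples and relabels them:

```python
# Toy numpy sketch of backdoor data poisoning: stamp a trigger patch onto
# some images and relabel them, so a model trained on the data obeys the trigger.
import numpy as np

def poison(images, labels, target_class, fraction=0.05, seed=0):
    """Stamp a 3x3 white patch on a random subset and relabel that subset."""
    images, labels = images.copy(), labels.copy()
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(images), size=int(fraction * len(images)), replace=False)
    images[idx, -3:, -3:] = 1.0          # trigger in the bottom-right corner
    labels[idx] = target_class           # attacker-chosen label
    return images, labels

clean_x = np.random.default_rng(1).random((1000, 28, 28))   # stand-in for MNIST images
clean_y = np.random.default_rng(2).integers(0, 10, 1000)
poisoned_x, poisoned_y = poison(clean_x, clean_y, target_class=7)
print(int((poisoned_y != clean_y).sum()), "training labels changed to the target class")
```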

4.2.5 Trojan Attacks

In Trojan attacks, the attacker inserts malicious content into the system that looks legitimate but can take over control of the system. The purpose of Trojan insertion can vary, for example, stealing, disruption, misbehavior, or obtaining some intended behavior. In Liu et al. (2018) , the authors proposed a stealth infection on neural networks, namely SIN2, to realize practical supply-chain-triggered neural Trojan attacks. They also proposed a variety of Trojan insertion strategies for agile and practical Trojan attacks. The proof of concept is demonstrated by developing a prototype of the proposed neural Trojan attack (i.e., SIN2) in a Linux sandbox, using the Torch ( Collobert et al., 2011 ) ML/DL framework to build visual recognition models on the Fashion-MNIST dataset.

4.2.6 Model-Reuse Attacks

In model-reuse attacks, an adversary creates a malicious model (i.e., an adversarial model) that influences the host model to misbehave on targeted inputs (i.e., triggers) in an extremely predictable fashion, that is, getting a sample classified into a specific (intended) class. For instance, model-reuse attacks on four pretrained primitive DL models (i.e., speech recognition, autonomous steering, face verification, and skin cancer screening) are experimentally evaluated by Ji et al. (2018) .

4.2.7 Data Manipulation Attacks

Attacks in which training data are manipulated to obtain intended behavior from the ML/DL model are known as data manipulation attacks. Data manipulation attacks for stealthily manipulating traditional supervised ML techniques, logistic regression (LR), and CNN models are studied by Liao et al. (2018) . In the attack strategy, the authors added a new constraint on the fully connected layers of the models and used gradient descent to retrain them, while the other layers were frozen (i.e., made non-trainable).

4.2.8 Cyber Kill Chain–Based Attacks

Kill chain is a military term defining the steps for attacking a target. In cyber kill chain–based attacks, cloud-hosted ML/DL models are attacked; for example, a high-level threat model targeting the ML cyber kill chain is presented by Nguyen (2017) . The authors also provided proof of concept through a case study using IBM's visual recognition MLaaS (i.e., a cognitive classifier for classifying cats and female lions) and gave recommendations for ensuring secure and robust ML.

4.2.9 Membership Inference Attacks

In a typical membership inference attack, given input data and black box access to the ML model, an attacker attempts to figure out whether the given input sample was part of the training set. To realize a membership inference attack against a target model, a classification model is trained to distinguish between the predictions of the target model on the inputs on which it was trained and those on which it was not trained ( Shokri et al., 2017 ).
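
A minimal sketch of the underlying intuition, that an overfit model is more confident on its own training points, can be written with scikit-learn; this is a simplified confidence-threshold variant, not the shadow-model pipeline of Shokri et al. (2017):

```python
# Toy membership inference sketch: threshold the victim's confidence to
# guess which points were in its training set.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=2000, n_features=12, random_state=0)
X_in, y_in, X_out = X[:1000], y[:1000], X[1000:]     # members vs. non-members

victim = RandomForestClassifier(random_state=0).fit(X_in, y_in)  # tends to overfit

conf_in = victim.predict_proba(X_in).max(axis=1)     # confidence on members
conf_out = victim.predict_proba(X_out).max(axis=1)   # confidence on non-members

guess_member = np.concatenate([conf_in, conf_out]) > 0.9
truth = np.r_[np.ones(1000), np.zeros(1000)]
# Typically lands above the 50% random-guess baseline, illustrating the leak.
print(f"attack accuracy: {(guess_member == truth).mean():.1%}")
```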

4.2.10 Evasion Attacks

Evasion attacks are inference-time attacks in which an adversary attempts to modify the test data to obtain the intended outcome from the ML/DL model. Two evasion attacks against watermarking techniques for DL models hosted as MLaaS have been presented by Hitaj et al. (2019) . The authors used five publicly available models and trained them to distinguish between watermarked and clean (non-watermarked) images, that is, a binary image classification task.

4.2.11 Model Inversion Attacks

In model inversion attacks, an attacker tries to learn about the training data using the model's outcomes. Two model inversion techniques have been proposed by Yang et al. (2019) , that is, training an inversion model using an auxiliary set composed from the adversary's background knowledge, and a truncation-based method for aligning the inversion model. The authors evaluated their proposed methods on a commercial prediction MLaaS named Amazon Rekognition.

5 Toward Securing Cloud-Hosted Machine Learning Models (Q2)

In this section, we present the insights from the systematically selected articles that provide tailored defenses against specific attacks, and we report the articles that, along with creating attacks, propose countermeasures for attacks on cloud-hosted/third-party ML/DL models.

5.1 Defenses for Attacks on Cloud-Hosted Machine Learning Models: Thematic Analysis

Leveraging cloud-based ML services for computational offloading and minimizing communication overhead is accepted as a promising trend. While cloud-based prediction services have significant benefits, sharing the model and the training data raises many privacy and security challenges, and several attacks can compromise model and data integrity, as described in the previous section. To avoid such issues, users can download the model and make inferences locally. However, this approach has certain drawbacks, including confidentiality issues, the inability of service providers to update the models, the possibility that adversaries use the model to develop evading strategies, and compromised privacy of user data. To outline the countermeasures against these attacks, we present the thematic analysis of the six articles focused on defenses against tailored attacks on cloud-hosted ML/DL models or data, together with the six articles that propose both an attack and a defense against it. These articles are classified into five major themes: 1) attack type, 2) defense, 3) target model(s), 4) dataset, and 5) measured outcomes. The thematic analysis of these systematically reviewed articles focused on developing defense strategies is given below.

Considered attacks for developing defenses: The defenses proposed in the reviewed articles are developed against the following specific attacks.

• Extraction attacks ( Tramèr et al., 2016 ; Liu et al., 2017 );

• Inversion attacks ( Liu et al., 2017 ; Sharma and Chen, 2018 );

• Adversarial attacks ( Hosseini et al., 2017 ; Wang et al., 2018b ; Rouhani et al., 2018 );

• Evasion attacks ( Lei et al., 2020 );

• GAN attacks ( Sharma and Chen, 2018 );

• Privacy threat attacks ( Hesamifard et al., 2017 );

• Side channel and cache-timing attacks ( Jiang et al., 2018 );

• Membership inference attacks ( Shokri et al., 2017 ; Salem et al., 2018 ).

Most of the aforementioned attacks are elaborated in previous sections. However, in the selected articles identified as either defense or attack-and-defense articles, some attacks are specifically created, for instance, GAN attacks, side channel and cache-timing attacks, and privacy threats. These attacks are therefore worth mentioning in this section to explain the specific countermeasures proposed against them in the defense articles.

Defenses against different attacks: To provide resilience against these attacks, the authors of selected articles proposed different defense algorithms, which are listed below against each type of attack.

• Extraction attacks: MiniONN ( Liu et al., 2017 ), rounding confidence, differential, and ensemble methods ( Tramèr et al., 2016 );

• Adversarial attacks: ReDCrypt ( Rouhani et al., 2018 ) and Arden ( Wang et al., 2018b );

• Inversion attacks: MiniONN ( Liu et al., 2017 ) and image disguising techniques ( Sharma and Chen, 2018 );

• Privacy attacks: encryption-based defense ( Hesamifard et al., 2017 ; Jiang et al., 2018 );

• Side channel and cache-timing attacks: encryption-based defense ( Hesamifard et al., 2017 ; Jiang et al., 2018 );

• Membership inference attack: dropout and model stacking ( Salem et al., 2018 ).

Target model(s): Different cloud-hosted ML/DL models have been used for the evaluation of the proposed defenses, as shown in Table 2 .

TABLE 2. Summary of attack types and corresponding defenses for cloud-based/third-party ML/DL models.

Dataset(s) used: The robustness of these defenses has been evaluated using various datasets, ranging from small datasets (e.g., MNIST ( Liu et al., 2017 ; Wang et al., 2018b ; Rouhani et al., 2018 ; Sharma and Chen, 2018 ) and CIFAR-10 ( Liu et al., 2017 ; Wang et al., 2018b ; Sharma and Chen, 2018 )) to larger ones (e.g., the Iris dataset ( Tramèr et al., 2016 ), the fertility and climate datasets ( Hesamifard et al., 2017 ), and breast cancer data ( Jiang et al., 2018 )). Other datasets include the Crab dataset ( Hesamifard et al., 2017 ), the Face and Traffic signs datasets ( Tramèr et al., 2016 ), SVHN ( Wang et al., 2018b ), and the Edinburgh MI, WI-Breast Cancer, and MONKs Prob datasets ( Jiang et al., 2018 ). Each of the defense techniques discussed above is mapped in Table 2 to the specific attack for which it was developed.

Measured outcomes: The measured outcomes on which the defenses are evaluated are response latency and message sizes ( Liu et al., 2017 ; Wang et al., 2018b ), throughput comparison ( Rouhani et al., 2018 ), average cache miss rates per second ( Sharma and Chen, 2018 ), AUC and space complexity to demonstrate approximate storage costs ( Jiang et al., 2018 ), classification accuracy of the model as well as running time ( Hesamifard et al., 2017 ; Sharma and Chen, 2018 ), similarity index ( Lei et al., 2020 ), and training time ( Hesamifard et al., 2017 ; Jiang et al., 2018 ).

5.2 Taxonomy of Defenses on Cloud-Hosted Machine Learning Model Attacks

In this section, we present a taxonomy and summary of the different defense strategies against attacks on cloud-hosted ML/DL models described above in the thematic analysis. A taxonomy of these defense strategies is presented in Figure 9 and described next.

FIGURE 9. Taxonomy of different defenses proposed for defending attacks on the third-party cloud-hosted machine learning (ML) or deep learning (DL) models.

5.2.1 MiniONN

DNNs are vulnerable to model inversion and extraction attacks. Liu et al. (2017) proposed that, without making any changes to the training phase of the model, it is possible to convert the model into an oblivious neural network. They made nonlinear functions such as tanh and sigmoid more flexible, and by training the models on several datasets, they demonstrated significant results with minimal loss in accuracy. In addition, they implemented an offline precomputation phase to perform encryption and incremental operations, along with an SIMD batch processing technique.

5.2.2 ReDCrypt

A reconfigurable hardware-accelerated framework is proposed by Rouhani et al. (2018) for protecting the privacy of deep neural models in cloud networks. The authors performed an innovative and power-efficient implementation of Yao's Garbled Circuit (GC) protocol on FPGAs for preserving privacy. The proposed framework was evaluated for different DL applications and achieved up to a 57-fold throughput gain per core.

5.2.3 Arden

To offload a large portion of DNNs from mobile devices to the cloud while keeping the framework secure, a privacy-preserving mechanism named Arden is proposed by Wang et al. (2018b) . Before the data are uploaded from the mobile device to the cloud, they are perturbed with noisy samples to keep them secure. To verify robustness, the authors performed rigorous analysis on three image datasets and demonstrated that this defense is capable of preserving user privacy along with inference performance.

5.2.4 Image Disguising Techniques

While leveraging services from a cloud GPU server, an adversary can realize an attack by introducing maliciously crafted training data, performing model inversion, and using the model to obtain desirable incentives and outcomes. To protect against such attacks and to preserve both the data and the model, Sharma and Chen (2018) proposed an image disguising mechanism. They developed a toolkit that can be leveraged to calibrate certain parameter settings, and they claim that disguised images with block-wise permutation and transformations are resilient to GAN-based attacks and model inversion attacks.

5.2.5 Homomorphic Encryption

To make the cloud services of outsourced MLaaS secure, Hesamifard et al. (2017) proposed a privacy-preserving framework using homomorphic encryption. They trained the neural network using encrypted data and then performed encrypted predictions. The authors demonstrated that, by carefully choosing polynomial approximations of the activation functions adopted in neural networks, it is possible to achieve the desired accuracy along with privacy-preserving training and classification.

In a similar study, to preserve the privacy of outsourced biomedical data and computation on public cloud servers, Jiang et al. (2018) built a homomorphically encrypted model that reinforces hardware security through Software Guard Extensions. They combined homomorphic encryption and Software Guard Extensions to devise a hybrid model for securing the model most commonly used in biomedical applications, that is, LR. The robustness of the Secure LR framework was evaluated on various datasets, and the authors also compared its performance with state-of-the-art secure LR solutions and demonstrated its superior efficiency.
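
The flavor of computing on encrypted data can be illustrated with a toy additively homomorphic scheme, assuming the third-party phe (python-paillier) package; this sketches the principle only, not the reviewed frameworks:

```python
# Toy additively homomorphic encryption sketch, assuming the third-party
# "phe" package (pip install phe): the server computes on data it never
# sees in the clear.
from phe import paillier

public_key, private_key = paillier.generate_paillier_keypair()

# Client encrypts inputs before sending them to the cloud.
enc_a, enc_b = public_key.encrypt(3.5), public_key.encrypt(2.0)

# Cloud computes on ciphertexts only: addition and scalar multiplication.
enc_result = enc_a + enc_b * 4

# Only the client, holding the private key, can read the answer.
print(private_key.decrypt(enc_result))   # 11.5
```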

5.2.6 Pelican

Lei et al. (2020) proposed three mutation-based evasion attacks and a sample-based collision attack in white box, gray box, and black box scenarios. They evaluated the attacks and demonstrated a 100% attack success rate on Google's phishing page filter classifier, and a transferability success rate of up to 81% on Bitdefender TrafficLight. To deal with such attacks and to increase the robustness of classifiers, they proposed a defense method known as Pelican.

5.2.7 Rounding Confidences and Differential Privacy

Tramèr et al. (2016) presented model extraction attacks against the online services of BigML and Amazon ML. The attacks are capable of model evasion and monetization and can compromise the privacy of training data. The authors also proposed and evaluated countermeasures, such as rounding confidences against equation-solving and decision tree pathfinding attacks; however, this defense has no impact on the regression tree model attack. For the preservation of training data, differential privacy is proposed; this defense reduces the ability of an attacker to learn insights about the training dataset. The impact of both defenses is evaluated on the attacks for different models. The authors also proposed ensemble models to mitigate the impact of attacks; however, their resilience was not evaluated.
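
The rounding-confidences defense itself is simple to sketch; the following toy numpy example (the probability vector is hypothetical) shows the coarsened scores a defended API would return:

```python
# Toy sketch of the "rounding confidences" defense: coarsen the prediction
# vector that an extraction attacker relies on for equation solving.
import numpy as np

proba = np.array([0.8731, 0.0912, 0.0357])   # hypothetical API output
defended = np.round(proba, 1)                # server returns coarse scores only
print(defended)                              # [0.9 0.1 0. ]
```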

5.2.8 Increasing Entropy and Reducing Precision

The training of attack models using shadow training techniques against black box models in the cloud-based Google Prediction API and Amazon ML models is studied by Shokri et al. (2017) . The attack does not require prior knowledge of the training data distribution. The authors emphasize that countermeasures should be designed to protect the privacy of medical or other public datasets, for instance, restricting the prediction vector to the top k classes, which prevents the leakage of important information, or rounding the classification probabilities in the prediction up or down. They show that regularization can be effective in coping with overfitting and increasing the randomness of the prediction vector.

5.2.9 Dropout and Model Stacking

In the study by Salem et al. (2018) , the authors created three diverse attacks and tested their applicability on eight datasets, six of which are the same as those used by Shokri et al. (2017) , with a news dataset and a face dataset added in this work. In the threat model, the authors considered black box access to the target model, a supervised ML classifier trained for binary classification. To mitigate the privacy threats, the authors proposed a dropout-based method that reduces the impact of an attack by randomly deleting a proportion of edges in each training iteration in a fully connected neural network. The second defense strategy is model stacking, which hierarchically organizes multiple ML models to avoid overfitting. After extensive evaluation, these defense techniques showed the potential to mitigate the performance of the membership inference attack.

5.2.10 Randomness to Video Analysis Algorithms

Hosseini et al. (2017) designed two attacks specifically to analyze the robustness of video classification and shot detection. The attacks can subtly manipulate the content of a video in such a way that the manipulation goes undetected by humans, while the output of the automatic video analysis method is altered. Exploiting the fact that video and shot labels are generated by the API by processing only the first video frame of every second, the attacks can successfully deceive the API. To deal with the shot removal and generation attacks, the authors proposed the inclusion of randomness to enhance the robustness of the algorithms. However, while the authors thoroughly evaluated the applicability of these attacks in different video settings, the proposed defense was not rigorously evaluated.

5.2.11 Neuron Distance Threshold and Obfuscation

Transfer learning is an effective technique for quickly building DL Student models by transferring knowledge from a Teacher model. However, Wang et al. (2018a) discussed that, due to the centralization of model training, the vulnerability of black box Student models to misclassification attacks on image recognition increases. The authors proposed several defenses to mitigate the impact of such attacks, such as changing the internal representation of the Student model relative to the Teacher model. Other defense methods include increasing dropout randomization, which alters the Student model training process, modifying input data before classification, adding redundancy, and using an orthogonal model against transfer learning attacks. The authors analyzed the robustness of these defenses and demonstrated that the neuron distance threshold is the most effective at obfuscating the identity of the Teacher model.

6 Pitfalls and Limitations

6.1 Lack of Attack Diversity

The attacks presented in the selected articles have limited scope and lack diversity, that is, they are limited to specific settings, and the variability of attacks is limited as well. However, attack diversity is an important consideration for developing robust attacks from the adversary's perspective, as it makes detection and prevention difficult; it also ultimately helps in the development of robust defense strategies. Moreover, the empirical evaluation of attack variability can identify potential vulnerabilities of cybersecurity systems. Therefore, to build more robust defense solutions, it is important to test model robustness under a diverse set of attacks.

6.2 Lack of Consideration for Adaptable Adversaries

Most of the defenses in the systematically reviewed articles are proposed for a specific attack and do not consider adaptable adversaries. In practice, however, adversarial attacks are an arms race between attackers and defenders: attackers continuously evolve and enhance their knowledge and attacking strategies to evade the underlying defensive system. Therefore, the consideration of adaptable adversaries is crucial for developing a robust and long-lasting defense mechanism. Without it, an adversary will adapt to the defensive system over time and bypass it to obtain the intended behavior or outcomes.

6.3 Limited Progress in Developing Defenses

Of the systematically selected articles collected from the different databases, only 12 presented defense methods, compared with the 19 articles focused on attacks. Of these 12 articles, six only discussed or presented a defense strategy, and six developed a defense against a particular attack. This indicates limited activity from the research community in developing defense strategies for the attacks already proposed in the literature. In addition, the proposed defenses only mitigate or detect the attacks for which they were developed and are therefore not generalizable. On the contrary, the increasing interest in developing different attacks and the popularity of cloud-hosted/third-party services demand a proportionate amount of interest in developing defense systems as well.

7 Open Research Issues

7.1 Adversarially Robust Machine Learning Models

In recent years, adversarial ML attacks have emerged as a major threat to ML/DL models, and the systematically selected articles have highlighted this threat for cloud-hosted ML/DL models as well. Moreover, the diversity of these attacks is increasing drastically compared with that of defensive strategies, which can pose serious challenges and consequences for the security of cloud-hosted ML/DL models. Each defense method presented in the literature so far has been shown to be resilient only to a particular attack realized in specific settings, and fails to withstand stronger, unseen attacks. Therefore, the development of adversarially robust ML/DL models remains an open research problem, and the literature suggests that worst-case robustness analysis should be performed under adversarial ML settings ( Qayyum et al., 2020a ; Qayyum et al., 2020b ; Ilahi et al., 2020 ). In addition, it has been argued in the literature that most ML developers and security incident responders are not equipped with the tools required to secure industry-grade ML systems against adversarial ML attacks ( Kumar et al., 2020 ). This indicates the increasing need for the development of defense strategies for securing ML/DL models against adversarial ML attacks.

7.2 Privacy-Preserving Machine Learning Models

In cloud-hosted ML services, preserving user privacy is fundamentally important and a matter of high concern. It is also desirable that ML models built using users' data should not learn information that can compromise the privacy of individuals. However, the literature on developing privacy-preserving ML/DL models or MLaaS is limited. Moreover, one of the privacy-preserving techniques used to build a defense system for cloud-hosted ML/DL models, the homomorphic encryption-based protocol ( Jiang et al., 2018 ), has been shown to be vulnerable to a model extraction attack ( Reith et al., 2019 ). Therefore, the development of privacy-preserving ML models for cloud computing platforms is another open research problem.

7.3 Proxy Metrics for Evaluating Security and Robustness

From the systematically reviewed literature on the security of cloud-hosted ML/DL models, we observe that interest from the research community in developing novel security-centric proxy metrics, for evaluating both security threats and the robustness of cloud-hosted models, is very limited. However, with the increasing proliferation of cloud-hosted ML services (i.e., MLaaS) and the development of different attacks (e.g., adversarial ML attacks), effective and scalable metrics for evaluating the robustness of ML/DL models against different attacks and defense strategies are required.
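One minimal example of such a proxy metric is empirical robust accuracy: the fraction of test inputs a model still classifies correctly after every perturbation drawn from a bounded budget. The sketch below uses random L-infinity noise as a cheap, optimistic stand-in for a true worst-case attack; the sklearn-style `model.predict` interface and the NumPy arrays `X`, `y` are assumptions of this illustration.

```python
import numpy as np

def robust_accuracy(model, X, y, epsilon=0.03, trials=10, seed=0):
    """Estimate accuracy under random L-infinity perturbations.

    An input counts as robust only if the model predicts its true
    label for every sampled perturbation within the budget. Random
    sampling is optimistic: an adversarial (worst-case) search would
    give a lower, more faithful robustness estimate.
    """
    rng = np.random.default_rng(seed)
    robust = np.ones(len(X), dtype=bool)
    for _ in range(trials):
        noise = rng.uniform(-epsilon, epsilon, size=X.shape)
        preds = model.predict(np.clip(X + noise, 0.0, 1.0))
        robust &= (preds == y)
    return robust.mean()
```

A security-centric metric suite for MLaaS would report such scores alongside clean accuracy, across several perturbation budgets and attack families.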

8 Threats to Validity

We now briefly reflect on our methodology to identify any threats to the validity of our findings. First, internal validity is maintained because the research questions we pose in Section 2.2 capture the objectives of the study. Construct validity relies on a sound understanding of the literature and how it represents the state of the field; a detailed study of the reviewed articles, along with deep discussions among the members of the research team, helped ensure the quality of this understanding. Note that the research team has diverse skills and expertise in ML, DL, cloud computing, ML/DL security, and analytics. The inclusion and exclusion criteria (Section 2.3) also help define the remit of our survey. Data extraction is, as always, prone to human error; this was mitigated by having different members of the research team review each reviewed article. However, we did not attempt to evaluate the quality of the reviewed studies or validate their content due to time constraints. To minimize selection bias, we cast a wide net to capture articles from the different communities publishing in the area of MLaaS via a comprehensive set of bibliographical databases, without discriminating based on the venue/source.

9 Conclusion

In this article, we presented a systematic review of the literature focused on the security of cloud-hosted ML/DL models, also known as MLaaS. The relevant articles were collected from major publishers and venues, including the ACM Digital Library, IEEE Xplore, ScienceDirect, the International Conference on Machine Learning (ICML), the International Conference on Learning Representations (ICLR), the Journal of Machine Learning Research (JMLR), USENIX, the Conference on Neural Information Processing Systems (NeurIPS), and arXiv. For the selection of articles, we developed a review protocol with inclusion and exclusion criteria, analyzed the selected articles that fulfill these criteria across two dimensions (i.e., attacks and defenses on MLaaS), and provided a thematic analysis of these articles across five attack and five defense themes, respectively. We also identified the limitations and pitfalls in the reviewed literature, and finally, we highlighted various open research issues that require further investigation.

Data Availability Statement

The original contributions presented in the study are included in the article/supplementary material, further inquiries can be directed to the corresponding author/s.

Author Contributions

AQ led the work in writing the manuscript and performed the annotation of the data and analysis as well. AI performed data acquisition, annotation, and analysis from four venues, and contributed to the paper write-up. MU contributed to writing a few sections, did annotations of papers, and helped in analysis. WI performed data scrapping, annotation, and analysis from four venues, and helped in developing graphics. All the first four authors validated the data, analysis, and contributed to the interpretation of the results. AQ and AI helped in developing and refining the methodology for this systematic review. JQ conceived the idea and supervises the overall work. JQ, YEK, and AF provided critical feedback and helped shape the research, analysis, and manuscript. All authors contributed to the final version of the manuscript.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

1 We use MLaaS to cover both ML and DL as a Service cloud provisions.

2 https://cloud.google.com/ml-engine/ .

3 A popular Python library for DL.

4 https://azure.microsoft.com/en-us/services/machine-learning-service/ .

5 https://docs.aws.amazon.com/dlami/latest/devguide/AML2_0.html .

6 Backdoor attacks on cloud-hosted models can be further categorized into three categories ( Chen et al., 2020 ): 1) complete model–based attacks, 2) partial model–based attacks, and 3) model-free attacks.

Akhtar, N., and Mian, A. (2018). Threat of adversarial attacks on deep learning in computer vision: a survey. IEEE Access 6, 14410–14430. doi:10.1109/access.2018.2807385

Apruzzese, G., Colajanni, M., Ferretti, L., and Marchetti, M. (2019). “Addressing adversarial attacks against security systems based on machine learning,” in 2019 11th International conference on cyber conflict (CyCon) , Tallinn, Estonia , May 28–31, 2019 ( IEEE ), 900, 1–18

Brendel, W., Rauber, J., and Bethge, M. (2017). “Decision-based adversarial attacks: reliable attacks against black-box machine learning models,” in International Conference on Learning Representations (ICLR)

Chen, S., Xue, M., Fan, L., Hao, S., Xu, L., Zhu, H., et al. (2018). Automated poisoning attacks and defenses in malware detection systems: an adversarial machine learning approach. Comput. Secur. 73, 326–344. doi:10.1016/j.cose.2017.11.007

Chen, X., Liu, C., Li, B., Lu, K., and Song, D. (2017). Targeted backdoor attacks on deep learning systems using data poisoning. arXiv

Chen, Y., Gong, X., Wang, Q., Di, X., and Huang, H. (2020). Backdoor attacks and defenses for deep neural networks in outsourced cloud environments. IEEE Network 34 (5), 141–147. doi:10.1109/MNET.011.1900577

Collobert, R., Kavukcuoglu, K., and Farabet, C. (2011). “Torch7: a Matlab-like environment for machine learning,” in BigLearn, NIPS workshop .

Correia-Silva, J. R., Berriel, R. F., Badue, C., de Souza, A. F., and Oliveira-Santos, T. (2018). “Copycat CNN: stealing knowledge by persuading confession with random non-labeled data,” in 2018 International joint conference on neural networks (IJCNN) , Rio de Janeiro, Brazil , July 8–13, 2018 ( IEEE ), 1–8

Demetrio, L., Valenza, A., Costa, G., and Lagorio, G. (2020). “Waf-a-mole: evading web application firewalls through adversarial machine learning,” in Proceedings of the 35th annual ACM symposium on applied computing , Brno, Czech Republic , March 2020 , 1745–1752

Gong, Y., Li, B., Poellabauer, C., and Shi, Y. (2019). “Real-time adversarial attacks,” in Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI) , Macao, China , August 2019

Goodfellow, I. J., Shlens, J., and Szegedy, C. (2014). Explaining and harnessing adversarial examples. arXiv

Gu, T., Liu, K., Dolan-Gavitt, B., and Garg, S. (2019). BadNets: evaluating backdooring attacks on deep neural networks. IEEE Access 7, 47230–47244. doi:10.1109/access.2019.2909068

Han, D., Wang, Z., Zhong, Y., Chen, W., Yang, J., Lu, S., et al. (2020). Practical traffic-space adversarial attacks on learning-based NIDSs. arXiv

Hesamifard, E., Takabi, H., Ghasemi, M., and Jones, C. (2017). “Privacy-preserving machine learning in cloud,” in Proceedings of the 2017 on cloud computing security workshop , 39–43

Hilprecht, B., Härterich, M., and Bernau, D. (2019). “Monte Carlo and reconstruction membership inference attacks against generative models,” in Proceedings on Privacy Enhancing Technologies , Stockholm, Sweden , July 2019 , 2019, 232–249

Hitaj, D., Hitaj, B., and Mancini, L. V. (2019). “Evasion attacks against watermarking techniques found in MLaaS systems,” in 2019 sixth international conference on software defined systems (SDS) , Rome, Italy , June 10–13, 2019 ( IEEE )

Hosseini, H., Xiao, B., Clark, A., and Poovendran, R. (2017). “Attacking automatic video analysis algorithms: a case study of google cloud video intelligence API,” in Proceedings of the 2017 conference on multimedia Privacy and security (ACM) , 21–32

Ilahi, I., Usama, M., Qadir, J., Janjua, M. U., Al-Fuqaha, A., Hoang, D. T., et al. (2020). Challenges and countermeasures for adversarial attacks on deep reinforcement learning. arXiv

Ji, Y., Zhang, X., Ji, S., Luo, X., and Wang, T. (2018). “Model-reuse attacks on deep learning systems,” in Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security (New York, NY: ACM) , December 2018 , 349–363

Jiang, Y., Hamer, J., Wang, C., Jiang, X., Kim, M., Song, Y., et al. (2018). Securelr: secure logistic regression model via a hybrid cryptographic protocol. IEEE ACM Trans. Comput. Biol. Bioinf 16, 113–123. doi:10.1109/TCBB.2018.2833463

Joshi, N., and Tammana, R. (2019). “GDALR: an efficient model duplication attack on black box machine learning models,” in 2019 IEEE international Conference on system, computation, Automation and networking (ICSCAN) , Pondicherry, India , March 29–30, 2019 ( IEEE ), 1–6

Kesarwani, M., Mukhoty, B., Arya, V., and Mehta, S. (2018). Model extraction warning in MLaaS paradigm. In Proceedings of the 34th Annual Computer Security Applications Conference (ACM) , 371–380

Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012). “Imagenet classification with deep convolutional neural networks,” in Advances in neural information processing systems , 1097–1105 Available at: http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf

Kumar, R. S. S., Nyström, M., Lambert, J., Marshall, A., Goertzel, M., Comissoneru, A., et al. (2020). Adversarial machine learning–industry perspectives. arXiv . Available at: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3532474

Lei, Y., Chen, S., Fan, L., Song, F., and Liu, Y. (2020). Advanced evasion attacks and mitigations on practical ml-based phishing website classifiers. arXiv

Liang, B., Su, M., You, W., Shi, W., and Yang, G. (2016). “Cracking classifiers for evasion: a case study on the google’s phishing pages filter,” in Proceedings of the 25th international conference on world wide web Montréal, Québec, Canada , 345–356

Liao, C., Zhong, H., Zhu, S., and Squicciarini, A. (2018). “Server-based manipulation attacks against machine learning models,” in Proceedings of the eighth ACM conference on data and application security and privacy (ACM) , New York, NY , March 2018 , 24–34

Liu, J., Juuti, M., Lu, Y., and Asokan, N. (2017). “Oblivious neural network predictions via MiniONN transformations,” in Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security , October 2017 , 619–631

Liu, T., Wen, W., and Jin, Y. (2018). “SIN 2: stealth infection on neural network—a low-cost agile neural Trojan attack methodology,” in 2018 IEEE international symposium on hardware oriented security and trust (HOST) , Washington, DC , April 30–4 May, 2018 ( IEEE ), 227–230

Nguyen, T. N. (2017). Attacking machine learning models as part of a cyber kill chain. arXiv

Parkhi, O. M., Vedaldi, A., Zisserman, A., et al. (2015). Deep face recognition. Bmvc 1, 6. doi:10.5244/C.29.41

Qayyum, A., Qadir, J., Bilal, M., and Al-Fuqaha, A. (2020a). Secure and robust machine learning for healthcare: a survey. IEEE Rev. Biomed. Eng. , 1. doi:10.1109/RBME.2020.3013489

Qayyum, A., Usama, M., Qadir, J., and Al-Fuqaha, A. (2020b). Securing connected & autonomous vehicles: challenges posed by adversarial machine learning and the way forward. IEEE Commun. Surv. Tutorials 22, 998–1026. doi:10.1109/comst.2020.2975048

Reith, R. N., Schneider, T., and Tkachenko, O. (2019). “Efficiently stealing your machine learning models,” in Proceedings of the 18th ACM workshop on privacy in the electronic society , November 2019 , 198–210

Rouhani, B. D., Hussain, S. U., Lauter, K., and Koushanfar, F. (2018). Redcrypt: real-time privacy-preserving deep learning inference in clouds using fpgas. ACM Trans. Reconfigurable Technol. Syst. 11, 1–21. doi:10.1145/3242899

Saadatpanah, P., Shafahi, A., and Goldstein, T. (2019). Adversarial attacks on copyright detection systems. arXiv .

Salem, A., Zhang, Y., Humbert, M., Berrang, P., Fritz, M., and Backes, M. (2018). ML-leaks: model and data independent membership inference attacks and defenses on machine learning models. arXiv .

Sehwag, V., Bhagoji, A. N., Song, L., Sitawarin, C., Cullina, D., Chiang, M., et al. (2019). Better the devil you know: an analysis of evasion attacks using out-of-distribution adversarial examples. arXiv .

Sethi, T. S., and Kantardzic, M. (2018). Data driven exploratory attacks on black box classifiers in adversarial domains. Neurocomputing 289, 129–143. doi:10.1016/j.neucom.2018.02.007

Sharma, S., and Chen, K. (2018). “Image disguising for privacy-preserving deep learning,” in Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security ( ACM, Toronto, Canada ), 2291–2293

Shokri, R., Stronati, M., Song, C., and Shmatikov, V. (2017). “Membership inference attacks against machine learning models,” in 2017 IEEE Symposium on Security and privacy (SP) , San Jose, CA , May 22–26, 2017 ( IEEE ), 3–18

Simonyan, K., and Zisserman, A. (2015). “Very deep convolutional networks for large-scale image recognition,” in International Conference on Learning Representations (ICLR)

Song, Y., Liu, T., Wei, T., Wang, X., Tao, Z., and Chen, M. (2020). FDA3: federated defense against adversarial attacks for cloud-based IIoT applications. IEEE Trans. Industr. Inform. , 1. doi:10.1109/TII.2020.3005969

Sun, Y., Wang, X., and Tang, X. (2014). “Deep learning face representation from predicting 10,000 classes,” in Proceedings of the IEEE conference on computer vision and pattern recognition , Columbus, OH , June 23–28, 2014 , ( IEEE ).

Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., and Wojna, Z. (2016). “Rethinking the inception architecture for computer vision,” in Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR) , Las Vegas, NV , June 27–30, 2016 ( IEEE ), 2818–2826

Tramèr, F., Zhang, F., Juels, A., Reiter, M. K., and Ristenpart, T. (2016). “Stealing machine learning models via prediction APIs,” in 25th USENIX security symposium (USENIX Security 16) , 601–618

Tyndall, J. (2010). AACODS checklist . Adelaide, Australia: Adelaide Flinders University

Usama, M., Mitra, R. N., Ilahi, I., Qadir, J., and Marina, M. K. (2020a). Examining machine learning for 5g and beyond through an adversarial lens. arXiv . Available at: https://arxiv.org/abs/2009.02473 .

Usama, M., Qadir, J., Al-Fuqaha, A., and Hamdi, M. (2020b). The adversarial machine learning conundrum: can the insecurity of ML become the achilles' heel of cognitive networks? IEEE Network 34, 196–203. doi:10.1109/mnet.001.1900197

Usama, M., Qayyum, A., Qadir, J., and Al-Fuqaha, A. (2019). “Black-box adversarial machine learning attack on network traffic classification,” in 2019 15th international wireless communications and mobile computing conference (IWCMC) , Tangier, Morocco , June 24–28, 2019

Wang, B., Yao, Y., Viswanath, B., Zheng, H., and Zhao, B. Y. (2018a). “With great training comes great vulnerability: practical attacks against transfer learning,” in 27th USENIX security symposium (USENIX Security 18) , Baltimore, MD , August 2018 , 1281–1297

Wang, J., Zhang, J., Bao, W., Zhu, X., Cao, B., and Yu, P. S. (2018b). “Not just privacy: improving performance of private deep learning in mobile cloud,” in Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining London, United Kingdom , January 2018 , 2407–2416

Yang, Z., Zhang, J., Chang, E.-C., and Liang, Z. (2019). “Neural network inversion in adversarial setting via background knowledge alignment,” in Proceedings of the 2019 ACM SIGSAC conference on computer and communications security , London, UK , November 2019 , 225–240

Yuan, X., He, P., Zhu, Q., and Li, X. (2019). Adversarial examples: attacks and defenses for deep learning. IEEE Trans. Neural. Netw. Learn. Syst. 30 (9), 2805–2824. doi:10.1109/TNNLS.2018.2886017

Zhang, J., Zhang, B., and Zhang, B. (2019). “Defending adversarial attacks on cloud-aided automatic speech recognition systems,” in Proceedings of the seventh international workshop on security in cloud computing , New York , 23–31. Available at: https://dl.acm.org/doi/proceedings/10.1145/3327962

Keywords: Machine Learning as a Service, cloud-hosted machine learning models, machine learning security, cloud machine learning security, systematic review, attacks, defenses

Citation: Qayyum A, Ijaz A, Usama M, Iqbal W, Qadir J, Elkhatib Y and Al-Fuqaha A (2020) Securing Machine Learning in the Cloud: A Systematic Review of Cloud Machine Learning Security. Front. Big Data 3:587139. doi: 10.3389/fdata.2020.587139

Received: 24 July 2020; Accepted: 08 October 2020; Published: 12 November 2020.

Copyright © 2020 Qayyum, Ijaz, Usama, Iqbal, Qadir, Elkhatib and Al-Fuqaha. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Adnan Qayyum, [email protected]

This article is part of the Research Topic

Safe and Trustworthy Machine Learning

12 Latest Cloud Computing Research Topics

Cloud computing is gaining popularity and demand in the market, and it is being implemented in many organizations very quickly.

One of the major barriers to the cloud is a real and perceived lack of security. There are many cloud computing research topics that can be pursued further to produce fruitful results.

In this tutorial, we are going to discuss 12 of the latest cloud computing research topics. These topics will help in your research, projects, and assignments.

So, let’s start with the cloud computing research topics.

List of Cloud Computing Research Topics

These cloud computing research topics can help eliminate many issues and provide a better environment. We can associate these issues with:

  • Virtualizations infrastructure
  • Software platform
  • Identity management
  • Access control

There are important research directions in cloud security, in areas such as trusted computing, privacy-preserving models, and information-centric security. The following are trending cloud computing research topics:

  • Green Cloud Computing
  • Edge Computing
  • Cloud Cryptography
  • Load Balancing
  • Cloud Analytics
  • Cloud Scalability
  • Service Model
  • Cloud Computing Platforms
  • Mobile Cloud Computing
  • Big Data
  • Cloud Deployment Model
  • Cloud Security

i. Green Cloud Computing

Green cloud computing is a broad topic that aims to make virtualized data centres and servers more energy efficient. IT services consume a great many resources, and this leads to resource shortages.

Green cloud computing offers solutions that make IT resources more energy efficient and reduce operational costs. It also covers power management, virtualization, sustainability, and environmentally friendly recycling.

ii. Edge Computing

Although edge computing has several benefits, it is frequently combined with cloud computing to form a hybrid strategy. In this hybrid architecture, certain data processing and analytics take place at the edge, while more intense and extensive long-term data storage and analysis happen in the central cloud infrastructure. The edge-to-cloud continuum refers to this fusion of edge and cloud computing.

iii. Cloud Cryptography

Cloud cryptography is the practice of securing data and communications in cloud computing environments using cryptographic methods and protocols. Sensitive data is protected against unauthorised access and potential security breaches by encrypting it both in transit and at rest.

By allowing consumers to keep control of their data while entrusting it to cloud service providers, cloud cryptography protects the confidentiality, integrity, and authenticity of that data. Through encryption methods and key management procedures, it improves the security posture of cloud-based apps and services, promoting trust and compliance with data privacy rules.
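As a small illustration of encrypting data at rest before handing it to a cloud provider, the sketch below uses the Fernet recipe from the widely used Python cryptography package; the bucket-upload call is a hypothetical placeholder for whatever storage API is in use.

```python
from cryptography.fernet import Fernet

# The key stays with the data owner; only ciphertext leaves for the cloud.
key = Fernet.generate_key()
fernet = Fernet(key)

plaintext = b"sensitive customer record"
ciphertext = fernet.encrypt(plaintext)        # encrypt before upload
# upload_to_bucket("records/1", ciphertext)   # hypothetical cloud call

restored = fernet.decrypt(ciphertext)         # decrypt after download
assert restored == plaintext
```

Because the provider never holds the key, even a breach of the storage service exposes only ciphertext, which is the core idea behind client-side cloud cryptography.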

iv. Load Balancing

Load balancing is the distribution of load across servers so that work can be done efficiently, allowing workload demands to be distributed and managed. Load balancing has several advantages:

  • Fewer chances of a server crash.
  • Improved security.
  • Improved overall performance.

Load balancing techniques are easy to implement and inexpensive, and they reduce the problem of sudden outages.
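To make the idea concrete, here is a minimal round-robin balancer sketch in Python (the server names are made up); each incoming request is handed to the next server in the pool, which spreads the workload evenly.

```python
from itertools import cycle

class RoundRobinBalancer:
    """Distribute incoming requests across a fixed pool of servers in turn."""

    def __init__(self, servers):
        self._servers = cycle(servers)

    def route(self, request):
        # Pick the next server in rotation and pair it with the request.
        return next(self._servers), request

balancer = RoundRobinBalancer(["web-1", "web-2", "web-3"])
for i in range(6):
    server, _ = balancer.route({"id": i})
    print(f"request {i} -> {server}")   # web-1, web-2, web-3, web-1, ...
```

Production balancers add health checks and weighting on top of this loop, but the rotation shown here is the essence of the technique.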

v. Cloud Analytics

Cloud analytics can be an interesting topic for researchers, as it has evolved from the diffusion of data analytics and cloud computing technologies. Cloud analytics is beneficial for small as well as large organizations.

It has been observed that there is tremendous growth in the cloud analytics market. Moreover, cloud analytics can be delivered through various models, such as the community model.

Analytics has a wide scope, as there are many segments in which to perform research. Some of these segments are business intelligence tools, enterprise information management, analytics solutions, governance, risk and compliance, enterprise performance management, and complex event processing.

vi. Scalability

Scalability can advance considerably if proper research is done on it: limits can be pushed, workloads on the infrastructure can be maintained, and existing infrastructure can be expanded.

There are two types of scalability: vertical (scaling a single machine up or down) and horizontal (scaling the number of machines out or in).

Applications have room to scale up and down, which eliminates the lack of resources that hampers performance.
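A common way to exploit this elasticity is a simple threshold rule: add an instance when average utilization is high and remove one when it is low. The sketch below is a toy illustration; the thresholds and instance bounds are assumptions, not any provider's defaults.

```python
def scale_decision(cpu_utilization, instances,
                   high=0.75, low=0.25,
                   min_instances=1, max_instances=10):
    """Return the new instance count for one autoscaling step.

    Scale out when average CPU is above `high`, scale in when it is
    below `low`, and otherwise hold steady.
    """
    if cpu_utilization > high and instances < max_instances:
        return instances + 1
    if cpu_utilization < low and instances > min_instances:
        return instances - 1
    return instances

assert scale_decision(0.90, 3) == 4   # overloaded: scale out
assert scale_decision(0.10, 3) == 2   # idle: scale in
assert scale_decision(0.50, 3) == 3   # comfortable: hold
```

Real autoscalers add cooldown periods and smoothing so that short utilization spikes do not cause the instance count to oscillate.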

vii. Cloud Computing Platforms

Cloud computing platforms host the many different applications run by organizations. This is a very vast area, and research can be done either on an individual platform or across existing platforms, such as:

  • Amazon’s Elastic Compute Cloud
  • IBM Computing
  • Microsoft’s Azure
  • Google’s AppEngine
  • Salesforce.com

viii. Cloud Service Model

There are three cloud service models:

  • Platform as a Service (PaaS)
  • Software as a Service (SaaS)
  • Infrastructure as a Service (IaaS)

These are vast topics for research and development. IaaS provides resources such as storage, virtual machines, and networks to users, who then deploy and run their own software and applications. In software as a service, software services are delivered to the customer.

The customer consumes these software services and can do research on them. PaaS likewise delivers services over the internet, providing the underlying infrastructure on which customers can deploy their applications.

ix. Mobile Cloud Computing

In mobile cloud computing , the mobile device is the console, while storage and processing of the data take place outside of it. It is one of the leading cloud computing research topics.

The main advantages of mobile cloud computing are that no costly hardware is needed and battery life is extended. The disadvantages are low bandwidth and heterogeneity.

x. Big Data

Big data denotes tremendous amounts of data. This data is classified into two forms: structured (organized) and unstructured (unorganized).

Big data is characterized by three Vs which are:

  • Volume – the amount of data, which is handled by technologies such as Hadoop.
  • Variety – the format in which the data is present.
  • Velocity – the speed of data generation and transmission.

This can be used for research purposes, and companies can use it to detect failures, costs, and issues. Big data, along with Hadoop, is one of the major topics for research.
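As a toy taste of the Hadoop-style processing mentioned above, the classic word-count job splits into a map phase that emits (word, 1) pairs and a reduce phase that sums them per word. The pure-Python sketch below mimics that split on a tiny in-memory dataset.

```python
from collections import defaultdict

documents = ["cloud data cloud", "big data"]

# Map: emit a (word, 1) pair for every word in every document.
mapped = [(word, 1) for doc in documents for word in doc.split()]

# Shuffle + reduce: group the pairs by word and sum the counts.
counts = defaultdict(int)
for word, one in mapped:
    counts[word] += one

print(dict(counts))   # {'cloud': 2, 'data': 2, 'big': 1}
```

Frameworks like Hadoop run exactly this pattern, but with the map and reduce phases distributed across many machines and a shuffle stage moving data between them.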

xi. Cloud Deployment Model

The deployment model is one of the major cloud computing research topics and includes models such as:

Public Cloud – It is under the control of a third party and has the benefit of pay-as-you-go pricing.

Private Cloud – It is under a single organization and so has few restrictions. It is used by only a single organization or a particular group within it.

Hybrid Cloud – The hybrid cloud comprises two or more different models. Its architecture is complex to deploy.

Community Cloud – It is shared among several organizations that have common requirements or concerns.

xii. Cloud Security

Cloud security is one of the most significant shifts in information technology, and its development is bringing revolution to current business models. As cloud computing opens new doors, cloud security is becoming a hot research topic.

To build a strong, secure cloud storage model and tackle the issues faced by the cloud, one can postulate that cloud groups can identify the issues and create a context-specific access model that limits data exposure and preserves privacy.

In security research, there are three specific areas: trusted computing, information-centric security, and privacy-preserving models.

Cloud security protects data from leakage, theft, disaster, and deletion. With the help of tokenization, VPNs, and firewalls, we can secure our data. Cloud security is a vast topic that can support much further research.

The number of organizations using cloud services is increasing. Some security measures that help implement cloud security are:

  • Accessibility
  • Confidentiality

So, this was all about cloud computing research topics; we hope you liked our explanation.

Cloud computing can be used for remote processing of applications, outsourcing, and data, and it is gaining momentum quickly. The above research topics can provide various benefits to customers and make the cloud better.

With this cloud computing research, security can be made more advanced; there are many high-level steps towards a security assessment framework, which will bring many benefits to cloud computing in the future.

What is Cloud Computing?

Provisioning traditional computing resources can be time-consuming and costly. Enterprises need to buy physical servers and other infrastructure through procurement processes that can take months. The acquired systems require physical space, typically a specialized room with sufficient power and cooling. After configuring and deploying the systems, enterprises need expert personnel to manage them.

This long process is difficult to scale when demand spikes or the business expands. Enterprises can end up acquiring more computing resources than they need, resulting in low utilization.

Cloud computing addresses these issues by offering computing resources as scalable, on-demand services. Learn more about Google Cloud , a suite of cloud computing service models offered by Google.

Cloud computing defined

Cloud computing is the on-demand availability of computing resources (such as storage and infrastructure) as services over the internet. It eliminates the need for individuals and businesses to self-manage physical resources, and lets them pay only for what they use.

The main cloud computing service models are infrastructure as a service, which offers compute and storage services; platform as a service, which offers a develop-and-deploy environment for building cloud apps; and software as a service, which delivers apps as services.

Understanding how cloud computing works

Cloud computing service models are based on the concept of sharing on-demand computing resources, software, and information over the internet. Companies or individuals pay to access a virtual pool of shared resources, including compute, storage, and networking services, which are located on remote servers that are owned and managed by service providers. 

One of the many advantages of cloud computing is that you only pay for what you use. This allows organizations to scale faster and more efficiently without the burden of having to buy and maintain their own physical data centers and servers.  

In simpler terms, cloud computing uses a network (most often, the internet) to connect users to a cloud platform where they request and access rented computing services. A central server handles all the communication between client devices and servers to facilitate the exchange of data. Security and privacy features are common components to keep this information secure and safe.  

When adopting cloud computing architecture, there is no one-size-fits-all. What works for another company may not suit you and your business needs. In fact, this flexibility and versatility is one of the hallmarks of cloud, allowing enterprises to quickly adapt to changing markets or metrics.

There are three different cloud computing deployment models: public cloud, private cloud, and hybrid cloud.

Types of cloud computing deployment models

Public cloud

Public clouds are run by third-party cloud service providers. They offer compute, storage, and network resources over the internet, enabling companies to access shared on-demand resources based on their unique requirements and business goals.

Private cloud

Private clouds are built, managed, and owned by a single organization and privately hosted in their own data centers, commonly known as “on-premises” or “on-prem.” They provide greater control, security, and management of data while still enabling internal users to benefit from a shared pool of compute, storage, and network resources.

Hybrid cloud

Hybrid clouds combine public and private cloud models, allowing companies to leverage public cloud services and maintain the security and compliance capabilities commonly found in private cloud architectures.

What are the types of cloud computing services?

There are three main types of cloud computing service models that you can select based on the level of control, flexibility, and management your business needs: 

Infrastructure as a service (IaaS)

Infrastructure as a service (IaaS) offers on-demand access to IT infrastructure services, including compute, storage, networking, and virtualization. It provides the highest level of control over your IT resources and most closely resembles traditional on-premises IT resources.

Platform as a service (PaaS)

Platform as a service (PaaS) offers all the hardware and software resources needed for cloud application development. With PaaS, companies can focus fully on application development without the burden of managing and maintaining the underlying infrastructure.

Software as a service (SaaS)

Software as a service (SaaS) delivers a full application stack as a service, from underlying infrastructure to maintenance and updates to the app software itself. A SaaS solution is often an end-user application, where both the service and the infrastructure is managed and maintained by the cloud service provider.

What are the benefits of cloud computing?

It’s flexible

Due to the architecture of cloud computing, enterprises and their users can access cloud services from anywhere with an internet connection, scaling services up or down as needed.

It’s efficient

Enterprises can develop new applications and rapidly get them into production—without worrying about the underlying infrastructure.

It offers strategic value

Because cloud providers stay on top of the latest innovations and offer them as services to customers, enterprises can get more competitive advantages—and a higher return on investment—than if they’d invested in soon-to-be obsolete technologies.

It’s secure

Enterprises often ask: what are the security risks of cloud computing? They are considered relatively low. Cloud computing security is generally recognized as stronger than that in enterprise data centers, because of the depth and breadth of the security mechanisms cloud providers put into place. Plus, cloud providers’ security teams are known as top experts in the field.

It’s cost-effective

Whatever cloud computing service model is used, enterprises only pay for the computing resources they use. They don’t need to overbuild data center capacity to handle unexpected spikes in demand or business growth, and they can deploy IT staff to work on more strategic initiatives.

How cloud computing can help your organization

The pace of innovation—and the need for advanced computing to accelerate this growth—makes cloud computing a viable option to advance research and speed up new product development. Cloud computing can give enterprises access to scalable resources and the latest technologies without needing to worry about capital expenditures or limited fixed infrastructure. What is the future of cloud computing? It’s expected to become the dominant enterprise IT environment.

If your organization experiences any of the following, you’re probably a good candidate for cloud computing:

  • High business growth that outpaces infrastructure capabilities
  • Low utilization of existing infrastructure resources
  • Large volumes of data that are overwhelming your on-premises data storage resources
  • Slow response times with on-premises infrastructure
  • Delayed product development cycles due to infrastructure constraints
  • Cash flow challenges due to high computing infrastructure expenses
  • Highly mobile or distributed user population

These scenarios require more than traditional data centers can provide.

Infrastructure scaling

Many organizations, including those in retail, have wildly varying needs for compute capacity. Cloud computing easily accommodates these fluctuations.  

Disaster recovery

Rather than building more data centers to ensure continuity during disasters, businesses use cloud computing to safely back up their digital assets.

Data storage

Cloud computing helps overloaded data centers by storing large volumes of data, making it more accessible, easing analysis, and making backup easier.

Application development

Cloud computing offers enterprise developers quick access to tools and platforms for building and testing applications, speeding up time to market.

Big data analytics

Cloud computing offers almost unlimited resources to process large volumes of data to speed research and reduce time to insights.

Related products and services

Google Cloud is a suite of cloud computing services that runs on the same infrastructure that Google uses internally for its own consumer products, such as Google Search, Gmail, and YouTube.

The list of available Google Cloud services is long—and it keeps growing. When developing applications or running workloads on Google Cloud, enterprises can mix and match these services into combinations that provide the infrastructure they need.

What is cloud computing? Everything you need to know about the cloud explained

What is cloud computing, in simple terms?

Cloud computing is the delivery of computing services—including servers, storage, databases, networking, software, analytics, and intelligence—over the Internet ("the cloud") to offer faster innovation, flexible resources, and economies of scale.

How does cloud computing work?

Rather than owning their own computing infrastructure or data centres, companies can rent access to anything from applications to storage from a cloud service provider.

One benefit of using cloud-computing services is that firms can avoid the upfront cost and complexity of owning and maintaining their own IT infrastructure, and instead simply pay for what they use, when they use it.

In turn, providers of cloud-computing services can benefit from significant economies of scale by delivering the same services to a wide range of customers.

What cloud-computing services are available?

Cloud-computing services cover a vast range of options now, from the basics of storage, networking and processing power, through to natural language processing and artificial intelligence as well as standard office applications. Pretty much any service that doesn't require you to be physically close to the computer hardware that you are using can now be delivered via the cloud –  even quantum computing .

What are examples of cloud computing?

Cloud computing underpins a vast number of services. That includes consumer services like Gmail or the cloud backup of the photos on your smartphone, through to the services that allow large enterprises to host all their data and run all of their applications in the cloud. For example, Netflix relies on cloud-computing services to run its video-streaming service and its other business systems, too.

Cloud computing is becoming the default option for many apps: software vendors are increasingly offering their applications as services over the internet rather than standalone products as they try to switch to a subscription model. However, there are potential downsides to cloud computing, in that it can also introduce new costs and new risks for companies using it.

Why is it called cloud computing?

A fundamental concept behind cloud computing is that the location of the service, and many of the details such as the hardware or operating system on which it is running, are largely irrelevant to the user. It's with this in mind that the metaphor of the cloud was borrowed from old telecoms network schematics, in which the public telephone network (and later the internet) was often represented as a cloud to denote that the location didn't matter – it was just a cloud of stuff. This is an over-simplification of course; for many customers, location of their services and data remains a key issue.

What is the history of cloud computing?

Cloud computing as a term has been around since the early 2000s, but the concept of computing as a service has been around for much, much longer – as far back as the 1960s, when computer bureaus would allow companies to rent time on a mainframe, rather than have to buy one themselves.

These 'time-sharing' services were largely overtaken by the rise of the PC, which made owning a computer much more affordable, and then in turn by the rise of corporate data centres where companies would store vast amounts of data.

But the concept of renting access to computing power has resurfaced again and again – in the application service providers, utility computing, and grid computing of the late 1990s and early 2000s. This was followed by cloud computing, which really took hold with the emergence of software as a service and hyperscale cloud-computing providers such as Amazon Web Services.

How important is the cloud?

Building the infrastructure to support cloud computing now accounts for a significant chunk of all IT spending, while spending on traditional, in-house IT slides as computing workloads continue to move to the cloud, whether that is public cloud services offered by vendors or private clouds built by enterprises themselves.

Indeed, it's increasingly clear that when it comes to enterprise computing platforms,  like it or not, the cloud has won .

Tech analyst Gartner predicts that as much as half of spending across application software, infrastructure software, business process services and system infrastructure markets will have shifted to the cloud by 2025, up from 41% in 2022. It estimates that almost two-thirds of spending on application software will be via cloud computing, up from 57.7% in 2022.

That's a shift that only gained momentum in 2020 and 2021 as businesses accelerated their digital transformation plans during the pandemic. The lockdowns throughout the pandemic showed companies how important it was to be able to access their computing infrastructure, applications and data from wherever their staff were working – and not just from an office.

Gartner said that demand for integration capabilities, agile work processes and composable architecture will drive the continued shift to the cloud.

The scale of cloud spending continues to rise. For the full year 2021, tech analyst IDC expects cloud infrastructure spending to have grown 8.3% compared to 2020 to $71.8 billion, while non-cloud infrastructure is expected to grow just 1.9% to $58.4 billion. Long term, the analyst expects spending on compute and storage cloud infrastructure to see a compound annual growth rate of 12.4% over the 2020-2025 period, reaching $118.8 billion in 2025, and it will account for 67.0% of total compute and storage infrastructure spend. Spending on non-cloud infrastructure will be relatively flat in comparison and reach $58.6 billion in 2025.

All predictions around cloud-computing spending are pointing in the same direction, even if the details are slightly different. The momentum they are describing is the same: tech analyst Canalys reports that worldwide cloud infrastructure services expenditure topped $50 billion in a quarter for the first time in Q4 2021. For the full year, it has cloud infrastructure services spending growing 35% to $191.7 billion.

Canalys argues that there is already a new growth opportunity for cloud on the horizon, in the form of augmented and virtual reality and the metaverse. "This will be a significant driver for both cloud services spend and infrastructure deployment over the next decade. In many ways, the metaverse will resemble the internet today, with enhanced capabilities and an amplified compute consumption rate," the analyst said .

What are the core elements of cloud computing?

Cloud computing can be broken down into a number of different constituent elements, focusing on different parts of the technology stack and different use cases. Let's take a look at some of the best known in a bit more detail.

What is Infrastructure as a Service?

Infrastructure as a Service (IaaS) refers to the fundamental building blocks of computing that can be rented: physical or virtual servers, storage and networking. This is attractive to companies that want to build applications from the very ground up and want to control nearly all the elements themselves, but it does require firms to have the technical skills to be able to orchestrate services at that level. 

What is Platform as a Service?

Platform as a Service (PaaS) is the next layer up – as well as the underlying storage, networking, and virtual servers, this layer also includes the tools and software that developers need to build applications on top, which could include middleware, database management, operating systems, and development tools.

What is Software as a Service?

Software as a Service (SaaS) is the delivery of applications as a service, probably the version of cloud computing that most people are used to on a day-to-day basis. The underlying hardware and operating system is irrelevant to the end user, who will access the service via a web browser or app; it is often bought on a per-seat or per-user basis.

SaaS is the largest chunk of cloud spending simply because the variety of applications delivered via SaaS is huge, from CRM such as Salesforce, through to Microsoft's Office 365. And while the whole market is growing at a furious rate, it's the IaaS and PaaS segments that have consistently grown at much faster rates, according to analyst IDC: "This highlights the increasing reliance of enterprises on a cloud foundation built on cloud infrastructure, software-defined data, compute and governance solutions as a Service, and cloud-native platforms for application deployment for enterprise IT internal applications." IDC predicts that IaaS and PaaS will continue growing at a higher rate than the overall cloud market "as resilience, flexibility, and agility guide IT platform decisions".

What is multi-cloud computing?

While the big cloud vendors would be very happy to provide all the computing needs of their enterprise customers, businesses are increasingly looking to spread the load across a number of suppliers. All of this has led to the rise of multi-cloud. Part of this approach is to avoid being locked in to just one vendor (which can lead to the sort of high costs and inflexibility that the cloud is often claimed to avoid), and part of it is to find the best mix of technologies across the industry.

That means being able to connect and integrate cloud services from multiple vendors is going to be a new and increasing challenge for business. Problems here include skills shortages (a lack of workers with expertise across multiple clouds) and workflow differences between cloud environments. Customers will also want to manage all their different cloud infrastructure from one place, make it easy to build applications and services and then move them, and ensure that security tools can work across multiple clouds – none of which is especially easy right now.

What are the benefits of cloud computing?

The exact benefits will vary according to the type of cloud service being used but, fundamentally, using cloud services means companies not having to buy or maintain their own computing infrastructure.

No more buying servers, updating applications or operating systems, or decommissioning and disposing of hardware or software when it is out of date, as it is all taken care of by the supplier. For commodity applications, such as email, it can make sense to switch to a cloud provider, rather than rely on in-house skills. A company that specializes in running and securing these services is likely to have better skills and more experienced staff than a small business could afford to hire, so cloud services may be able to deliver a more secure and efficient service to end users.

Using cloud services means companies can move faster on projects and test out concepts without lengthy procurement and big upfront costs, because firms only pay for the resources they consume. This concept of business agility is often mentioned by cloud advocates as a key benefit. The ability to spin up new services without the time and effort associated with traditional IT procurement should mean that it is easier to get going with new applications faster. And if a new application turns out to be wildly popular, the elastic nature of the cloud means it is easier to scale it up fast.

For a company with an application that has big peaks in usage, such as one that is only used at a particular time of the week or year, it might make financial sense to have it hosted in the cloud, rather than have dedicated hardware and software lying idle for much of the time. Moving to a cloud-hosted application for services like email or CRM could remove a burden on internal IT staff, and if such applications don't generate much competitive advantage, there will be little other impact. Moving to a services model also moves spending from capital expenditure (capex) to operational expenditure (opex), which may be useful for some companies.

What are the advantages and disadvantages of cloud computing?

Cloud computing is not necessarily cheaper than other forms of computing, just as renting is not always cheaper than buying in the long term. If an application has a regular and predictable requirement for computing services it may be more economical to provide that service in-house.

Some companies may be reluctant to host sensitive data in a service that is also used by rivals. Moving to a SaaS application may also mean you are using the same applications as a rival, which might make it hard to create any competitive advantage if that application is core to your business.

While it may be easy to start using a new cloud application, migrating existing data or apps to the cloud might be much more complicated and expensive. And it seems there is now something of a  shortage in cloud skills , with staff with DevOps and multi-cloud monitoring and management knowledge in particularly short supply.

In one report, a significant proportion of experienced cloud users  said they thought upfront migration costs  ultimately outweigh the long-term savings created by IaaS.

And of course, you can only access your applications if you have an internet connection.

What is cloud-computing adoption doing to IT budgets?

Cloud computing tends to shift spending from capex to opex, as companies buy computing as a service rather than in the form of physical servers. This may allow companies to avoid large increases in IT spending which would traditionally be seen with new projects; using the cloud to make room in the budget might be easier than going to the CFO and looking for more money.

Of course, this doesn't mean that cloud computing is always or necessarily cheaper than keeping applications in-house; for applications with a predictable and stable demand for computing power, it might be cheaper (from a processing power point of view at least) to keep them in-house.

How do you build a business case for cloud computing?

To build a business case for moving systems to the cloud, you first need to understand what your existing infrastructure actually costs. There is a lot to factor in: obvious things like the cost of running data centres and extras such as leased lines; and the cost of physical hardware, including servers with specifications like CPUs, cores, and RAM, plus the cost of storage. You'll also need to calculate the cost of applications, whether you plan to dump them, re-host them in the cloud unchanged, completely rebuild them for the cloud, or buy an entirely new SaaS package. Each of these options will have different cost implications. The cloud business case also needs to include people costs (often second only to the infrastructure costs) and more nebulous concepts like the benefit of being able to provide new services faster. Any cloud business case should also factor in the potential downsides, including the risk of being locked into one vendor for your tech infrastructure (see multi-cloud, above).
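A back-of-the-envelope comparison like the sketch below is often the first pass at such a business case; every figure here is a made-up placeholder, and a real estimate would itemize far more (software licences, migration labour, training, and so on).

```python
# Hypothetical annual costs for one workload (placeholder figures only).
on_prem = {
    "hardware_amortized": 40_000,  # servers spread over their lifetime
    "datacenter_space":   12_000,  # power, cooling, leased lines
    "staff":              60_000,  # share of operations personnel
}
cloud = {
    "compute": 55_000,             # pay-as-you-go instances
    "storage":  8_000,
    "egress":   5_000,             # data transfer out is easy to forget
}

on_prem_total = sum(on_prem.values())
cloud_total = sum(cloud.values())
delta = on_prem_total - cloud_total
print(f"on-prem: ${on_prem_total:,}  cloud: ${cloud_total:,}")
print(f"cloud saves ${delta:,}/year" if delta > 0
      else f"on-prem saves ${-delta:,}/year")
```

Note how the comparison can flip for steady workloads: the same arithmetic with higher utilization often favours keeping the system in-house, matching the earlier point that regular, predictable demand may be more economical to serve on-premises.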

  • Cloud computing: How to build a business case
  • Cloud computing: What it's like to make the move

Cloud-computing adoption

Analysts argue that as the cloud now underpins most new technological disruptions in everything from mobile banking to healthcare, usage is only going to grow. It's hard to see many new technology projects being delivered that don't harness the cloud in some way. Gartner says that more than 85% of organizations will embrace a cloud-first principle by 2025 and will not be able to fully execute on their digital strategies without it. The analyst says new workloads deployed in a cloud-native environment will be pervasive, not just popular, and anything non-cloud will be considered legacy. By 2025, Gartner estimates that over 95% of new digital workloads will be deployed on cloud-native platforms, up from 30% in 2021.

And if that sounds unrealistic, it may be that figures on adoption of cloud depend on who you talk to inside an organisation. Not all cloud spending will be driven centrally by the CIO: cloud services are relatively easy to sign up for, so business managers can start using them, and pay out of their own budget, without needing to inform the IT department. This can enable businesses to move faster, but can also create security risks if the use of apps is not managed.

Adoption will also vary by application: cloud-based email is much easier to adopt than a new finance system, for example. And for systems such as supply chain management, that are working efficiently as they are, there will be less short-term pressure to do a potentially costly and risky shift to the cloud.

What about cloud-computing security?

Many companies remain concerned about the security of cloud services, although breaches of security are rare. How secure you consider cloud computing to be will largely depend on how secure your existing systems are. In-house systems managed by a team with many other things to worry about are likely to be more leaky than systems monitored by a cloud provider's engineers dedicated to protecting that infrastructure.

However, concerns do remain about security, especially for companies moving their data between many cloud services, which has led to growth in cloud security tools, which monitor data moving to and from the cloud and between cloud platforms. These tools can identify fraudulent use of data in the cloud, unauthorised downloads, and malware. There is a financial and performance impact, however: these tools can reduce the return on investment of the cloud by 5% to 10%, and impact performance by 5% to 15%. The country of origin of cloud services is also worrying some organisations (see 'Is geography irrelevant when it comes to cloud computing?' below).

  • Cloud security and IoT are the new peanut butter and jelly
  • Azure confidential computing: Microsoft boosts security for cloud data
  • Three smart cloud services that can help keep your business more secure
  • Cloud computing security: This is where you'll be spending the money
  • Security as a Service? We want it, say IT leaders

What is public cloud?

Public cloud is the classic cloud-computing model, where users can access a large pool of computing power over the internet (whether that is IaaS, PaaS, or SaaS). One of the significant benefits here is the ability to rapidly scale a service. The cloud-computing suppliers have vast amounts of computing power, which they share out between a large number of customers – the 'multi-tenant' architecture. Their huge scale means they have enough spare capacity that they can easily cope if any particular customer needs more resources, which is why it is often used for less-sensitive applications that demand a varying amount of resources.

What is private cloud?

Private cloud allows organizations to benefit from some of the advantages of public cloud – but without the concerns about relinquishing control over data and services, because it is tucked away behind the corporate firewall. Companies can control exactly where their data is being held and can build the infrastructure in a way they want – largely for IaaS or PaaS projects – to give developers access to a pool of computing power that scales on-demand without putting security at risk. However, that additional security comes at a cost, as few companies will have the scale of AWS, Microsoft or Google, which means they will not be able to create the same economies of scale. Still, for companies that require additional security, private cloud might be a useful stepping stone, helping them to understand cloud services or rebuild internal applications for the cloud, before shifting them into the public cloud.

What is hybrid cloud?

Hybrid cloud is perhaps where everyone is in reality: a bit of this, a bit of that. Some data in the public cloud, some projects in private cloud, multiple vendors and different levels of cloud usage.

  • Infographic: Companies are turning to hybrid cloud to save money
  • What does 'hybrid cloud' mean? It depends on whom you ask
  • Managing the multi-cloud: It's complicated

What are the cloud-computing migration costs?

For startups that plan to run all their systems in the cloud, getting started is pretty simple. But for the majority of companies, it is not so simple: with existing applications and data, they need to work out which systems are best left running as they are, and which to start moving to cloud infrastructure. This is a potentially risky and expensive move, and migrating to the cloud could cost companies more if they underestimate the scale of such projects.

A survey of 500 businesses that  were early cloud adopters  found that the need to rewrite applications to optimise them for the cloud was one of the biggest costs, especially if the apps were complex or customised. A third of those surveyed cited high fees for passing data between systems as a challenge in moving their mission-critical applications. The skills required for migration are both difficult and expensive to find – and even when organisations could find the right people, they risked them being stolen away by cloud-computing vendors with deep pockets. 

Beyond this, the majority also remained worried about the performance of critical apps, and one in three cited this as a reason for not moving some critical applications.

  • Cloud computing migration: More expensive and complicated than you thought
  • Technology migrations are more painful, and cloud isn't making them any easier
  • Where does the NAS fit in an increasingly cloud-centric world?

Is geography irrelevant when it comes to cloud computing?

Actually, it turns out that is where the cloud really does matter. Geopolitics is forcing significant changes on cloud-computing users and vendors. Firstly, there is the issue of latency: if the application is coming from a data centre on the other side of the planet, or on the other side of a congested network, you may find it sluggish compared to a local connection.

Secondly, there is the issue of data sovereignty. Many companies, particularly in Europe, have to worry about where their data is being processed and stored. European companies are worried that, for example, if their customer data is being stored in data centres in the US (or owned by US companies), it could be accessed by US law enforcement. As a result, the big cloud vendors have been building out a regional data centre network so that organizations can keep their data in their own region.

Some have gone further, effectively detaching some of those datacenters from their main business to make it much harder for US authorities – and others – to demand access to the customer data stored there. The customer data in the data centres is under the control of an independent company, which acts as a "data trustee", and US parents cannot access data at the sites without the permission of customers or the data trustee. Expect to see cloud vendors opening more data centres around the world to cater to customers with requirements to keep data in specific locations.

Cloud security is another issue; the UK government's cyber security agency has warned that government agencies  need to consider the country of origin  when it comes to adding cloud services into their supply chains. While it was warning about antivirus software in particular, the issue is the same for other types of services too.

What is a cloud-computing region? And what is a cloud-computing availability zone?

Cloud-computing services are operated from giant datacenters around the world. AWS divides this up by  'regions' and 'availability zones' . Each AWS region is a separate geographic area, like EU (London) or US West (Oregon), which AWS then further subdivides into what it calls availability zones (AZs). An AZ is composed of one or more datacenters that are far enough apart that in theory a single disaster won't take both offline, but close enough together for business continuity applications that require rapid failover. Each AZ has multiple internet connections and power connections to multiple grids: AWS has over 80 AZs.
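
For readers who want to explore this structure directly, here is a minimal sketch using boto3 (the AWS SDK for Python) that lists the availability zones visible in one example region. The region name is just an illustration, and the sketch assumes AWS credentials are already configured in the environment.

```python
# Minimal sketch: list the availability zones of one example AWS region.
# Assumes AWS credentials are already configured (e.g. via ~/.aws/credentials).
import boto3

ec2 = boto3.client("ec2", region_name="eu-west-2")  # EU (London), for example

response = ec2.describe_availability_zones()
for zone in response["AvailabilityZones"]:
    print(zone["ZoneName"], zone["State"])
```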

Google uses a similar model, dividing its cloud-computing resources into regions that are then subdivided into zones, which include one or more datacenters from which customers can run their services. It currently has over eight zones; Google recommends customers deploy applications across multiple zones and regions to help protect against unexpected failures.

Microsoft Azure divides its resources slightly differently. It offers regions that it describes as a "set of datacentres deployed within a latency-defined perimeter and connected through a dedicated regional low-latency network". It also offers 'geographies', typically containing two or more regions, that can be used by customers with specific data-residency and compliance needs "to keep their data and apps close". It also offers availability zones made up of one or more data centres equipped with independent power, cooling and networking.

Cloud computing and power usage

Those data centres are also sucking up a huge amount of power: for example, Microsoft struck a deal with GE to buy all of the output from its new 37-megawatt wind farm in Ireland for the next 15 years in order to power its cloud data centres. Ireland said it now expects data centres to account for  15% of total energy demand by 2026 , up from less than 2% back in 2015.

  • Cloud computing: IBM overhauls access rules at Euro data centre
  • AWS just sold some of its cloud computing infrastructure in China

Which are the big cloud-computing companies?

When it comes to IaaS and PaaS, there are really only a few giant cloud providers. Leading the way is Amazon Web Services, and then the following pack of Microsoft's Azure, Google, and IBM. According to data from Synergy Research, Amazon, Microsoft and Google continue to attract well over half of worldwide cloud spending, with Q3 market shares of 33%, 20% and 10% respectively. And with growth rates that are higher than the overall market, their share of worldwide revenues continues to grow. However, that still leaves plenty of revenue for the chasing pack of companies – about $17 billion. "Clearly there are challenges with the big three companies lurking in the background, so the name of the game is not competing with them head on,"  said the analyst .

AWS, Azure and Google Cloud – what's the difference?

The big three cloud companies all have their own strengths. AWS is the most established player and was behind Amazon's ability to support huge seasonal swings in demand from consumers. Being first out to market with cloud services and pushing hard to gain market share has made it the market leader, and it continues to innovate. Microsoft's Azure has become an absolutely core part of Microsoft's strategy, and the company has the enterprise history and products to support businesses as they switch to the cloud. Google Cloud is the smallest of the big three players, but clearly has the might of the advertising-to-Android giant behind it.

Who are the other main cloud-computing players?

Beyond the big three there are others, such as Alibaba Cloud, IBM, Dell and Hewlett Packard Enterprise, that all want to be part of the enterprise cloud project. And of course, from giants like Salesforce down to tiny startups, pretty much every software company is a SaaS company now.  

Can cloud computing go wrong?

There are and will continue to be cloud outages. Those outages might happen at a local level because your internet is disrupted either by physical means (a digger cuts your broadband) or because of cyberattacks. But the big vendors have outages too, and because we are all increasingly reliant on their services, when the cloud stops, work stops. Few companies have backup systems to turn to in this situation. So long as cloud vendors keep outages to a minimum, then users will probably consider that using the cloud is more reliable than home-grown apps. But if outages become widespread, that opinion might change.

What is the future of cloud computing?

Cloud computing is reaching the point where it is likely to account for more of enterprise tech spending than the traditional forms of delivering applications and services in-house that have been around for decades. However, use of the cloud is only likely to climb as organisations get more comfortable with the idea of their data being somewhere other than a server in the basement. And now cloud-computing vendors are increasingly pushing cloud computing as an agent of digital transformation instead of focusing simply on cost. Moving to the cloud can help companies rethink business processes and accelerate business change, goes the argument, by helping to break down data and organisational silos. Some companies that need to boost momentum around their digital transformation programmes might find this argument appealing; others may find enthusiasm for the cloud waning as the costs of making the switch add up.

  • Why you're still scared of the Cloud (it's not about security or reliability)
  • Cloud computing switch as digital transformation takes priority
  • Moving to the cloud? Some advice to consider

Cloud-computing case studies

There are plenty of examples of organisations deciding to go down the cloud-computing route: here are a few examples of recent announcements.

  • What's the best way to make the most of the cloud?
  • US Air Force plots IT overhaul, aims for cloud
  • DuluxGroup paints a future of procurement in the cloud
  • Marketo to migrate to Google Cloud as part of multi-year deal
  • AWS infrastructure is now behind three main streaming media providers
  • American Airlines to move consumer-facing apps to IBM Cloud, says Cloud Foundry key

Previous coverage

The Art of the Hybrid Cloud

Cloud computing is gobbling up more of the services that power businesses. But, some have privacy, security, and regulatory demands that preclude the public cloud. Here's how to find the right mix.

  • Public cloud, private cloud, or hybrid cloud: What's the difference?

Trying to understand and articulate the differences between public, private, and hybrid cloud? Here's a quick breakdown.

Read more on cloud computing

  • Businesses will spend $128 billion on public cloud this year, says IDC


Top 10 Cloud Computing Research Topics of 2024


Cloud computing is a fast-growing area of the technical landscape due to its recent developments. Looking ahead to 2024, new research topics in cloud computing are gaining traction among researchers and practitioners, ranging from new developments in security and privacy and the use of AI and ML in the cloud to new cloud-based applications for specific domains or industries. In this article, we will investigate some of the top cloud computing research topics for 2024 and explore what researchers and cloud practitioners can get out of them. To master the cloud computing field, check out these Cloud Computing online courses.

Why Cloud Computing is Important for Data-driven Business?

Cloud computing is crucial for data-driven businesses because it provides scalable and cost-effective ways to store and process huge amounts of data. Cloud-based storage and analytics platforms help businesses easily access their data whenever required, irrespective of where it is physically located. This helps businesses make better decisions about their products and marketing plans.

Cloud computing can also help businesses improve their data security: cloud providers offer features such as data encryption and access control so that customers can protect their data from unauthorized access.

A few benefits of cloud computing are listed below:

  • Scalability: Cloud computing provides scalable applications suited to large-scale production systems for businesses that store and process large data sets.
  • Cost-effectiveness: Compared with traditional on-premises storage and analytics solutions, cloud computing is cost effective because its scaling capacity saves IT costs.
  • Security: Cloud providers offer security features, including data encryption and access control, that help businesses protect their data from unauthorized access.
  • Reliability: Cloud providers ensure high reliability to their customers based on their SLAs, which lets data-driven businesses operate 24x7.

Top 10 Cloud Computing Research Topics

1. Neural network based multi-objective evolutionary algorithm for dynamic workflow scheduling in cloud computing

Cloud computing research topics are getting wider traction in the field. This paper proposes a neural-network-based multi-objective evolutionary algorithm (NN-MOEA) for dynamic workflow scheduling in cloud computing. Scheduling workflows in the cloud is difficult because of the dynamic nature of cloud resources and the numerous competing objectives that need to be optimized. The NN-MOEA algorithm uses neural networks to optimize multiple objectives, such as makespan, cost, and resource utilization. This research focuses on cloud computing and its potential to enhance the efficiency and effectiveness of businesses' cloud-based workflows.

The algorithm predicts workflow completion time using a feedforward neural network based on input and output data sizes and cloud resources. It generates a balanced schedule by taking into account conflicting objectives and projected execution time. It also includes an evolutionary algorithm for future improvement.
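
As a rough sketch of the prediction component only (not the authors' actual model), a small feedforward network can map task features such as input size, output size, and the speed of the assigned cloud resource to an estimated completion time. The synthetic data and feature choices below are illustrative assumptions.

```python
# Illustrative sketch, not the paper's model: a feedforward neural network
# that predicts workflow task completion time from data sizes and VM speed.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 500
input_mb = rng.uniform(1, 1000, n)   # task input size (MB) - assumed feature
output_mb = rng.uniform(1, 500, n)   # task output size (MB) - assumed feature
vm_speed = rng.uniform(1, 16, n)     # relative speed of the assigned VM

# Assumed ground truth: runtime grows with data volume, shrinks with VM speed.
runtime = (input_mb + output_mb) / vm_speed + rng.normal(0, 5, n)

X = np.column_stack([input_mb, output_mb, vm_speed])
model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=0),
)
model.fit(X, runtime)

# Predict the runtime of a new task on a candidate VM.
print(model.predict([[200.0, 50.0, 8.0]]))
```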

The proposed NN-MOEA algorithm has several benefits, such as the capacity to manage dynamic changes in cloud resources and the capacity to simultaneously optimize multiple objectives. The algorithm is also capable of handling a variety of workflows and is easily expandable to include additional goals. The algorithm's use of neural networks to forecast task execution times is a crucial component because it enables the algorithm to generate better schedules and more accurate predictions.

The paper concludes by presenting a novel neural-network-based multi-objective evolutionary approach to dynamic workflow scheduling in cloud computing. The proposed NN-MOEA algorithm shows encouraging results in optimizing multiple objectives, such as makespan and cost, and in achieving a better balance between them.

Key insights and Research Ideas:

Investigate the use of different neural network architectures for predicting the future positions of optimal solutions. Explore the use of different multi-objective evolutionary algorithms for solving dynamic workflow scheduling problems. Develop a cloud-based workflow scheduling platform that implements the proposed algorithm and makes it available to researchers and practitioners.

2. A systematic literature review on cloud computing security: threats and mitigation strategies 

This is one of the notable cloud computing security research topics. The authors provide a systematic literature review of studies, published between 2010 and 2020, that address security threats to cloud computing and mitigation techniques. They list and classify the risks and defense mechanisms covered in the literature, as well as the frequency and distribution of these subjects over time.

The paper finds that data breaches, insider threats, and DDoS attacks are the most discussed threats to cloud computing security. Identity and access management, encryption, and intrusion detection and prevention systems are the most frequently discussed mitigation techniques. The authors suggest that, in future, machine learning and artificial intelligence might help cloud computing mitigate these risks.

The paper offers a thorough overview of security risks and mitigation techniques in cloud computing, and it emphasizes the need for more research and development in this field to address the constantly changing security issues of cloud computing.

Explore the use of blockchain technology to improve the security of cloud computing systems. Investigate the use of machine learning and artificial intelligence to detect and prevent cloud computing attacks. Develop new security tools and technologies for cloud computing environments. 

3. Spam Identification in Cloud Computing Based on Text Filtering System

A text filtering system is suggested in the paper "Spam Identification in Cloud Computing Based on Text Filtering System" to help identify spam emails in cloud computing environments. Spam emails are a significant issue in cloud computing because they can use up computing resources and jeopardize the system's security. 

To detect spam emails, the suggested system combines text filtering methods with machine learning algorithms. The email content is first pre-processed by the system, which eliminates stop words and stems the remaining words. The preprocessed text is then subjected to several filters, including a blacklist filter and a Bayesian filter, to identify spam emails.

In order to categorize emails as spam or non-spam based on their content, the system also employs machine learning algorithms like decision trees and random forests. The authors use a dataset of emails gathered from a cloud computing environment to train and test the system. They then assess its performance using metrics like precision, recall, and F1 score.
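
As a minimal sketch of the classification step only (the paper's full pipeline, filters, and dataset are more elaborate), a bag-of-words naive Bayes classifier can separate spam from legitimate mail. The tiny corpus here is an illustrative assumption.

```python
# Minimal sketch of Bayesian spam classification, not the paper's full system:
# bag-of-words features feeding a multinomial naive Bayes classifier.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny illustrative training set; a real system would use a large corpus.
emails = [
    "win a free prize now", "cheap loans click here",
    "meeting agenda for monday", "quarterly cloud usage report",
]
labels = ["spam", "spam", "ham", "ham"]

classifier = make_pipeline(CountVectorizer(), MultinomialNB())
classifier.fit(emails, labels)

print(classifier.predict(["free cloud prize click now"]))  # -> ['spam']
```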

The findings demonstrate the effectiveness of the proposed system in detecting spam emails, achieving high precision and recall rates. By contrasting their system with other spam identification systems, the authors also show how accurate and effective it is. 

The method presented in the paper for identifying spam emails in cloud computing environments has the potential to improve the overall security and performance of cloud computing systems, making this one of the more interesting current cloud computing research topics to explore, and a good one for protecting against mail threats.

Create a stronger spam filtering system that can recognize spam emails even when they are crafted to evade common spam filters. Examine the application of artificial intelligence and machine learning to evaluating spam filtering accuracy. Create a more efficient spam filtering system that can handle large volumes of email quickly and accurately.

4. Blockchain data-based cloud data integrity protection mechanism 

The "Blockchain data-based cloud data integrity protection mechanism" paper suggests a method for safeguarding the integrity of cloud data and which is one of the Cloud computing research topics. In order to store and process massive amounts of data, cloud computing has grown in popularity, but issues with data security and integrity still exist. For the proposed mechanism to guarantee the availability and integrity of cloud data, data redundancy and blockchain technology are combined.

The mechanism comprises a data redundancy layer, a blockchain layer, and a verification and recovery layer. The data redundancy layer replicates the cloud data across multiple cloud servers for availability in the event of server failure. The blockchain layer stores the hash values of the cloud data along with metadata such as access rights and access control information.
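
To make the integrity idea concrete, here is a toy hash chain; it is a stand-in for the paper's blockchain layer (which is a real distributed ledger) and records the hash of each stored object so later tampering can be detected. All names and structures are illustrative assumptions.

```python
# Toy hash chain illustrating the integrity idea; a stand-in for the paper's
# blockchain layer, not a distributed ledger.
import hashlib
import json

def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

chain = []  # each entry links to the previous one via its hash

def append_record(object_id: str, payload: bytes) -> None:
    prev_hash = chain[-1]["entry_hash"] if chain else "0" * 64
    record = {"object_id": object_id,
              "data_hash": sha256(payload),
              "prev_hash": prev_hash}
    record["entry_hash"] = sha256(json.dumps(record, sort_keys=True).encode())
    chain.append(record)

def verify(object_id: str, payload: bytes) -> bool:
    # Data is intact if its current hash matches the recorded one.
    return any(r["object_id"] == object_id and r["data_hash"] == sha256(payload)
               for r in chain)

append_record("report.csv", b"col1,col2\n1,2\n")
print(verify("report.csv", b"col1,col2\n1,2\n"))  # True
print(verify("report.csv", b"tampered"))          # False
```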

Using a dataset of cloud data, the authors assess the performance of the suggested mechanism and compare it to other cloud data protection mechanisms. The findings demonstrate that the suggested mechanism offers high levels of data availability and integrity and is superior to other mechanisms in terms of processing speed and storage space.

Overall, the paper offers a promising strategy for using blockchain technology to guarantee the availability and integrity of cloud data. The suggested mechanism may assist in addressing cloud computing's security issues and enhancing the dependability of cloud data processing and storage. This research could help businesses to protect the integrity of their cloud-based data from unauthorized access and manipulation.

Create a data integrity protection system based on blockchain that is capable of detecting and preventing data tampering in cloud computing environments. For enhancing the functionality and scalability of blockchain-based data integrity protection mechanisms, look into the use of various blockchain consensus algorithms. Create a data integrity protection system based on blockchain that is compatible with current cloud computing platforms. Create a safe and private data integrity protection system based on blockchain technology.

5. A survey on internet of things and cloud computing for healthcare

This article surveys how recent technology trends like the Internet of Things (IoT) and cloud computing could transform the healthcare industry. These emerging technologies open exciting possibilities by enabling remote patient monitoring, personalized care, and efficient data management, and the survey aims to share a wide range of information on IoT and cloud computing research.

The authors categorize the research into IoT-based systems, cloud-based systems, and integrated systems using both IoT and the cloud. They discuss the benefits of real-time data collection, improved care coordination, and automated diagnosis and treatment.

However, the authors also acknowledge concerns around data security, privacy, and the need for standardized protocols and platforms. Widespread adoption of these technologies faces challenges in ensuring they are implemented responsibly and ethically. To begin the journey, KnowledgeHut's Cloud Computing online courses are a good starting point for beginners who want to combine cloud computing with IoT.

Overall, the paper provides a comprehensive overview of this rapidly developing field, highlighting opportunities to revolutionize how healthcare is delivered. New devices, systems and data analytics powered by IoT, and cloud computing could enable more proactive, preventative and affordable care in the future. But careful planning and governance will be crucial to maximize the value of these technologies while mitigating risks to patient safety, trust and autonomy. This research could help businesses to explore the potential of IoT and cloud computing to improve healthcare delivery.

Examine how IoT and cloud computing are affecting patient outcomes in various healthcare settings, including hospitals, clinics, and home care. Analyze how well various IoT devices and cloud computing platforms perform real-time patient data collection, storage, and analysis. Assess the security and privacy risks connected to IoT devices and cloud computing in the healthcare industry, and develop mitigation strategies.

6. Targeted influence maximization based on cloud computing over big data in social networks

Big data and cloud computing research papers enjoy wide visibility in the industry. The paper "Targeted Influence Maximization based on Cloud Computing over Big Data in Social Networks" proposes a targeted influence maximization algorithm to identify the most influential users in a social network. Influence maximization is the process of identifying a group of users in a social network who can have a significant impact or spread information widely.

The suggested algorithm comprises four steps: data preprocessing, feature extraction, classification, and influence maximization. During data preprocessing, the authors gather and preprocess social network data, such as user profiles and interaction data. During feature extraction, they extract features from the data using machine learning methods like text mining and sentiment analysis. Overall, the paper offers a promising strategy for targeted influence maximization using big data and cloud computing. The suggested algorithm could help companies and organizations target their marketing or communication strategies at the most influential members of a social network.
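
For intuition, here is a greedy influence-maximization sketch using Monte Carlo simulation of the independent cascade model. This is a standard textbook approach rather than the paper's exact algorithm; the toy graph and the propagation probability p are illustrative assumptions.

```python
# Greedy influence maximization via Monte Carlo independent-cascade simulation.
# A standard approach, not the paper's exact algorithm.
import random

random.seed(42)  # reproducible simulations

def simulate_spread(graph, seeds, p=0.1):
    """Run one independent-cascade simulation; return how many nodes activate."""
    active, frontier = set(seeds), list(seeds)
    while frontier:
        node = frontier.pop()
        for neighbour in graph.get(node, []):
            if neighbour not in active and random.random() < p:
                active.add(neighbour)
                frontier.append(neighbour)
    return len(active)

def greedy_seeds(graph, k, trials=200):
    """Greedily pick k seed users that maximize the estimated average spread."""
    seeds = set()
    for _ in range(k):
        best, best_gain = None, -1.0
        for candidate in graph:
            if candidate in seeds:
                continue
            gain = sum(simulate_spread(graph, seeds | {candidate})
                       for _ in range(trials)) / trials
            if gain > best_gain:
                best, best_gain = candidate, gain
        seeds.add(best)
    return seeds

network = {"a": ["b", "c"], "b": ["d"], "c": ["d", "e"], "d": [], "e": []}
print(greedy_seeds(network, k=2))
```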

Key insights and Research Ideas: 

Develop a cloud-based targeted influence maximization algorithm that can effectively identify and influence a small number of users in a social network to achieve a desired outcome. Investigate the use of different cloud computing platforms to improve the performance and scalability of cloud-based targeted influence maximization algorithms. Develop a cloud-based targeted influence maximization algorithm that is compatible with existing social network platforms. Design a cloud-based targeted influence maximization algorithm that is secure and privacy-preserving.

7. Security and privacy protection in cloud computing: Discussions and challenges

Security and privacy remain among the cloud computing research topics gaining traction. This article provides an overview of the challenges and discussions surrounding security and privacy protection in cloud computing. The authors highlight the importance of protecting sensitive data in the cloud, along with the potential risks and threats to data privacy and security. The article explores various security and privacy issues that arise in cloud computing, including data breaches, insider threats, and regulatory compliance.

The article explores challenges associated with implementing these security measures and highlights the need for effective risk management strategies. The Azure Solution Architect Certification course suits anyone who needs to work on the Azure cloud as an architect, designing systems with security in mind.

The paper concludes by discussing some of the emerging trends in cloud security and privacy, including the use of artificial intelligence and machine learning to enhance security and the emergence of new regulatory frameworks designed to protect data in the cloud. This is one of the cloud computing research topics to keep an eye on in the security domain.

Develop a more comprehensive security and privacy framework for cloud computing. Explore the options with machine learning and artificial intelligence to enhance the security and privacy of cloud computing. Develop more robust security and privacy mechanisms for cloud computing. Design security and privacy policies for cloud computing that are fair and transparent. Educate cloud users about security and privacy risks and best practices.

8. Intelligent task prediction and computation offloading based on mobile-edge cloud computing

This cloud computing thesis paper, "Intelligent Task Prediction and Computation Offloading Based on Mobile-Edge Cloud Computing", proposes a task prediction and computation offloading mechanism to improve the performance of mobile applications.

The suggested mechanism has two main parts: a task prediction model and a computation offloading algorithm. The task prediction model uses machine learning techniques to forecast the mobile application's upcoming tasks based on its usage patterns. This prediction is used to decide whether to execute a specific task locally on the mobile device or to offload its computation to the cloud.
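
A toy decision rule makes the trade-off concrete: execute locally unless transferring the task and running it in the cloud is predicted to be faster. The cost model and all the numbers below are illustrative assumptions, not the paper's.

```python
# Toy offloading decision (illustrative assumptions, not the paper's model):
# offload only when upload time plus cloud execution beats local execution.
def should_offload(task_cycles: float, data_mb: float,
                   local_hz: float = 1e9,       # assumed mobile CPU speed
                   cloud_hz: float = 8e9,       # assumed cloud CPU speed
                   uplink_mbps: float = 20.0) -> bool:
    local_time = task_cycles / local_hz
    transfer_time = (data_mb * 8) / uplink_mbps  # seconds to upload the input
    cloud_time = transfer_time + task_cycles / cloud_hz
    return cloud_time < local_time

# Heavy computation, little data: offloading wins.
print(should_offload(task_cycles=5e10, data_mb=2))    # True
# Light computation, lots of data: stay local.
print(should_offload(task_cycles=1e8, data_mb=100))   # False
```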

Using a dataset of mobile application usage patterns, the authors assess the performance of the suggested mechanism and compare it to other computation offloading mechanisms. The findings demonstrate that the suggested mechanism performs better in terms of energy usage, response time, and network usage.

The authors also go over the difficulties in putting the suggested mechanism into practice, including the need for real-time task prediction and the trade-off between offloading computation and network usage. Additionally, they outline future research directions for mobile-edge cloud computing applications, including the use of edge caching and the integration of blockchain technology for security and privacy. 

Overall, the paper offers a promising strategy for enhancing mobile application performance through mobile-edge cloud computing. The suggested mechanism could improve the user experience for mobile users while lowering the energy consumption and response time of mobile applications. This cloud computing dissertation topic opens the door to many innovative ideas.

Develop an accurate task prediction model considering mobile device and cloud dynamics. Explore machine learning and AI for efficient computation offloading. Create a robust framework for diverse tasks and scenarios. Design a secure, privacy-preserving computation offloading mechanism. Assess computation offloading effectiveness in real-world mobile apps.

9. Cloud Computing and Security: The Security Mechanism and Pillars of ERPs on Cloud Technology

Enterprise resource planning (ERP) systems face particular security challenges in cloud computing, and the paper "Cloud Computing and Security: The Security Mechanism and Pillars of ERPs on Cloud Technology" discusses these challenges and suggests a security mechanism and pillars for protecting ERP systems on cloud technology.

The authors begin by reviewing the benefits of ERP systems and cloud computing, as well as the security issues of cloud computing, such as data breaches and insider threats. They then present a security framework for cloud-based ERP systems built around four pillars: access control, data encryption, data backup and recovery, and security monitoring. The access control pillar restricts user access, while the data encryption pillar secures sensitive data. Data backup and recovery ensures that lost or corrupted data can be restored. Security monitoring continuously watches the ERP system for threats. The authors also discuss interoperability challenges and the need for standardization in securing ERP systems on the cloud. They propose future research directions, such as applying machine learning and artificial intelligence to security analytics.

Overall, the paper outlines a thorough strategy for safeguarding ERP systems in the cloud and emphasizes the significance of addressing the security issues of this technology. By implementing these security pillars and mechanisms, organizations can protect their ERP systems and ensure the security and privacy of their data.

Investigate the application of blockchain technology to enhance the security of cloud-based ERP systems. Look into the use of machine learning and artificial intelligence to identify and stop security threats in cloud-based ERP systems. Create fresh security measures that are intended only for cloud-based ERP systems. By more effectively managing access control and data encryption, cloud-based ERP systems can be made more secure. Inform ERP users about the security dangers that come with cloud-based ERP systems and how to avoid them.

10. Optimized data storage algorithm of IoT based on cloud computing in distributed system

The article proposes an optimized data storage algorithm for Internet of Things (IoT) devices that runs on cloud computing in a distributed system. In IoT applications, where many devices generate huge amounts of data, the algorithm aims to improve data storage efficiency and speed up retrieval.

The algorithm proposed includes three main components: Data Processing, Data Storage, and Data Retrieval. The Data Processing module preprocesses IoT device data by filtering or compressing it. The Data Storage module distributes the preprocessed data across cloud servers using partitioning and stores it in a distributed database. The Data Retrieval module efficiently retrieves stored data in response to user queries, minimizing data transmission and enhancing query efficiency. The authors evaluated the algorithm's performance using an IoT dataset and compared it to other storage and retrieval algorithms. Results show that the proposed algorithm surpasses others in terms of storage effectiveness, query response time, and network usage. 
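
As a minimal sketch of the partitioning idea in the Data Storage module (an illustrative stand-in, not the paper's algorithm), each IoT record can be assigned to one of several cloud storage nodes by hashing its device ID, spreading load across the distributed system. The node names are hypothetical.

```python
# Minimal hash-based partitioning sketch; a stand-in for the paper's storage
# module, with hypothetical node names.
import hashlib

NODES = ["node-a", "node-b", "node-c"]

def assign_node(device_id: str) -> str:
    """Deterministically map a device to a storage node via its ID hash."""
    digest = hashlib.md5(device_id.encode()).hexdigest()
    return NODES[int(digest, 16) % len(NODES)]

store = {node: [] for node in NODES}
for device in ["sensor-1", "sensor-2", "sensor-3", "sensor-4"]:
    store[assign_node(device)].append(device)

print(store)  # devices spread across the three storage nodes
```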

They suggest future directions such as leveraging edge computing and blockchain technology for optimizing data storage and retrieval in IoT applications. In conclusion, the paper introduces a promising method to improve data archival and retrieval in distributed cloud based IoT applications, enhancing the effectiveness and scalability of IoT applications.

Create a data storage algorithm capable of storing and managing large amounts of IoT data efficiently. Examine the use of cloud computing to improve the performance and scalability of data storage algorithms for IoT. Create a secure and privacy-preserving data storage algorithm. Assess the performance and effectiveness of data storage algorithms for IoT in real-world applications.

How to Write a Perfect Research Paper?

  • Choose a topic: Select a topic that interests you, so that you can write about it confidently and produce good content for the reader.
  • Do your research: Read books, articles, and websites on your topic. Take notes and gather evidence to support your arguments.
  • Write an outline: This will help you organize your thoughts and make sure your paper flows smoothly.
  • Start your paper: Start with an introduction that grabs the reader's attention. Then, state your thesis statement and support it with evidence from your research. Finally, write a conclusion that summarizes your main points.
  • Edit and proofread your paper: check for grammatical errors and spelling mistakes.

Cloud computing is a rapidly evolving area, with new research topics gaining traction among researchers and practitioners. Cloud providers conduct their own research to keep customer data secure, covering encryption algorithms, improved access control, and mitigation of DDoS (Denial of Service) attacks, among other areas.

With improvements in AI and ML, new techniques have been developed to improve the performance, efficiency, and security of cloud computing systems. Some of the research topics in this area include developing new algorithms for resource allocation, optimizing cloud workflows, and detecting and mitigating cyberattacks.

Cloud computing is being used in industries such as healthcare, finance, and manufacturing. Some of the research topics in this area include developing new cloud-based medical imaging applications, building cloud-based financial trading platforms, and designing cloud-based manufacturing systems.

Frequently Asked Questions (FAQs)

Data security and privacy problems, vendor lock-in, complex cloud management, a lack of standardization, and the risk of service provider disruptions are all current issues in cloud computing. Because data is housed on third-party servers, data security and privacy are key considerations. Vendor lock-in makes transferring providers harder and increases reliance on a single one. Managing many cloud services complicates things. Lack of standardization causes interoperability problems and restricts workload mobility between providers. 

Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS) are the cloud computing models on which industries are focusing right now.

The six major components of cloud infrastructure are compute, storage, networking, security, management and monitoring, and database. These components enable cloud-based processing and execution, data storage and retrieval, communication between components, security measures, management and monitoring of the infrastructure, and database services.  

Profile

Vinoth Kumar P

Vinoth Kumar P is a Cloud DevOps Engineer at Amadeus Labs. He has over 7 years of experience in the IT industry, and is specialized in DevOps, GitOps, DevSecOps, MLOps, Chaos Engineering, Cloud and Cloud Native landscapes. He has published articles and blogs on recent tech trends and best practices on GitHub, Medium, and LinkedIn, and has delivered a DevSecOps 101 talk to the developer community and a GitOps with Argo CD webinar for the DevOps community. He has helped multiple enterprises with their cloud migration, cloud native design, CICD pipeline setup, and containerization journey.



  • Original Papers
  • Open access
  • Published: 20 April 2010

Cloud computing: state-of-the-art and research challenges

  • Qi Zhang,
  • Lu Cheng &
  • Raouf Boutaba

Journal of Internet Services and Applications, volume 1, pages 7–18 (2010)

91k Accesses

2082 Citations

28 Altmetric


Cloud computing has recently emerged as a new paradigm for hosting and delivering services over the Internet. Cloud computing is attractive to business owners as it eliminates the requirement for users to plan ahead for provisioning, and allows enterprises to start from the small and increase resources only when there is a rise in service demand. However, despite the fact that cloud computing offers huge opportunities to the IT industry, the development of cloud computing technology is currently at its infancy, with many issues still to be addressed. In this paper, we present a survey of cloud computing, highlighting its key concepts, architectural principles, state-of-the-art implementation as well as research challenges. The aim of this paper is to provide a better understanding of the design challenges of cloud computing and identify important research directions in this increasingly important area.



Author information

Authors and Affiliations

University of Waterloo, Waterloo, Ontario, Canada, N2L 3G1

Qi Zhang, Lu Cheng & Raouf Boutaba


Corresponding author

Correspondence to Raouf Boutaba .

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


About this article

Cite this article

Zhang, Q., Cheng, L. & Boutaba, R. Cloud computing: state-of-the-art and research challenges. J Internet Serv Appl 1, 7–18 (2010). https://doi.org/10.1007/s13174-010-0007-6


Received: 08 January 2010

Accepted: 25 February 2010

Published: 20 April 2010

Issue Date: May 2010

DOI: https://doi.org/10.1007/s13174-010-0007-6


  • Cloud computing
  • Data centers
  • Virtualization


NIH Extramural Nexus

cloud computing research

Making Advanced Research Computing More Accessible With NIH Cloud Lab

Guest post by Sean Mooney, PhD, recently appointed Director of the NIH Center for Information Technology, to announce the launch of NIH Cloud Lab for all NIH-affiliated* researchers. Learn more or request an account here.


As I step into my role as Director of the NIH Center for Information Technology (CIT), I am continually amazed by the contributions our NIH community makes to the digital ecosystem underpinning biomedical research. A key focus of mine is the advancement of collaborative, data-driven research in “the cloud”, and my colleagues and I are determined to make access to and expertise in cloud computing more widely available across the NIH research community.

Researchers today often face barriers when adopting and integrating cloud-based technologies. Common concerns range from complex procurement processes and unpredictable costs to the availability of relevant training and assurances on data security. As a response, the NIH Center for Information Technology has partnered with NIH’s Office of Data Science Strategy to offer two programs: the NIH STRIDES Initiative and NIH Cloud Lab.

NIH STRIDES streamlines the procurement of commercial cloud services from our partners: Amazon Web Services, Google Cloud, and Microsoft Azure. Participating NIH-funded researchers benefit from exclusive features, favorable pricing, dedicated support, and discounted cloud training. As of May 2024, more than 2,300 research programs have enrolled in STRIDES, storing over 300 petabytes of data!

For those interested in using cloud technology without making an initial long-term commitment, NIH Cloud Lab enables researchers to try cloud capabilities at no cost in a secure, NIH-approved environment. Participants receive 90 days of access to Amazon Web Services, Google Cloud, or Microsoft Azure along with $500 of no-cost cloud credits. 

To help accelerate researchers’ cloud fluency, we offer an extensive repository of interactive tutorials that demonstrate how to run realistic bioinformatic, data science, and AI workflows. Tutorials address a variety of scientific topics, including generative AI, variant calling, single-cell RNA sequencing, proteomics, medical imaging, and more! The best part is that these tutorials are also public and available on GitHub to anyone with a self-funded cloud account.

Cloud Lab’s breadth of technical features and support from bioinformatic and cloud experts make it a good fit for individuals with all levels of cloud knowledge, from cloud beginners to experienced researchers. Regardless of a participant’s level of experience, Cloud Lab’s integration with the Researcher Auth Services (RAS) solution minimizes time spent setting up an account by enabling most researchers to log in with their home institution username and password. This helps participants get up and running as quickly as possible.

One of the most significant benefits of this simplified onboarding process has been the democratization of access to computational and data science resources. Since 2023, Cloud Lab has seen participation from 368 National Institute of General Medical Sciences (NIGMS) grantees from 102 research institutions in the Institutional Development Program, which builds research capacity in states that have historically received low levels of NIH funding. Our team also integrated twelve interactive bioinformatics tutorials developed by NIGMS grantees into the Cloud Lab experience!

NIH-affiliated* researchers interested in Cloud Lab are welcome to sign up at any time by visiting our Cloud Lab homepage, and academic institutions looking to supplement their coursework or workshops with Cloud Lab can contact our team at [email protected]. Stay tuned to the CIT homepage for exciting updates and enhancements to our offerings. We’re looking forward to helping you on your cloud adoption journey!

*NIH-affiliated includes recipients of NIH funding and researchers at institutions eligible for NIH funding who may not have an active award.


Comput Intell Neurosci. 2022.

This article has been retracted.

The rise of cloud computing: data protection, privacy, and open research challenges—a systematic literature review (SLR)

Junaid Hassan

1 Department of Computer Science, National University of Computer and Emerging Sciences, Islamabad, Chiniot-Faisalabad Campus, Chiniot 35400, Pakistan

Danish Shehzad

2 Department of Computer Science, Superior University, Lahore 54000, Pakistan

Usman Habib

3 Faculty of Computer Sciences and Engineering, GIK Institute of Engineering Sciences and Technology, Topi, Swabi 23640, Khyber Pakhtunkhwa, Pakistan

Muhammad Umar Aftab

Muhammad Ahmad

Ramil Kuleev

4 Institute of Software Development and Engineering, Innopolis University, Innopolis 420500, Russia

Manuel Mazzara

Associated data

The data used to support the findings of this study are provided in this article.

Cloud computing is a long-standing dream of computing as a utility, where users can store their data remotely in the cloud to enjoy on-demand services and high-quality applications from a shared pool of configurable computing resources. The privacy and security of data are therefore of utmost importance to all users, regardless of the nature of the data being stored. This is especially critical in cloud computing environments, because data is stored in various locations around the world and users have no physical access to their sensitive data. We therefore need data protection techniques to protect the sensitive data that is outsourced to the cloud. In this paper, we conduct a systematic literature review (SLR) of the data protection techniques that protect sensitive data outsourced to cloud storage. The main objective of this research is to synthesize, classify, and identify important studies in the field. Accordingly, an evidence-based approach is used, and the results are based on answers to four research questions. Out of 493 research articles, 52 studies were selected. These 52 papers use different data protection techniques, which fall into two main categories, namely noncryptographic techniques and cryptographic techniques. Noncryptographic techniques consist of data splitting, data anonymization, and steganographic techniques, whereas cryptographic techniques consist of encryption, searchable encryption, homomorphic encryption, and signcryption. We compare all of these techniques in terms of data protection accuracy, overhead, and the operations they support on masked data. Finally, we discuss the future research challenges facing the implementation of these techniques.

1. Introduction

Recent advances have given rise to the popularity and success of cloud computing, a new computing and business model that provides on-demand storage and computing resources. A key attraction of cloud computing is financial: it offers an effective way to reduce operational and capital costs. Cloud storage is a basic service of the cloud computing architecture that allows users to store and share data over the internet. Its advantages include offsite backup, efficient and secure file access, effectively unlimited storage space, and low cost of use. Generally, cloud storage is divided into five categories: (1) private cloud storage, (2) personal cloud storage, (3) public cloud storage, (4) community cloud storage, and (5) hybrid cloud storage.

However, when we outsource data and business applications to a third party, security and privacy become major concerns [1]. Before outsourcing private data to the cloud, the data must be protected by applying the data protection techniques that we discuss later in this SLR. After outsourcing private data to the cloud, users sometimes want to perform certain operations on it, such as secure search; during such operations, the data must remain protected so that intruders cannot hack or steal the sensitive information.

Cloud computing has many further technical advantages. For example, it makes it possible to store large amounts of data, perform computation on that data, and use many other services. In addition, the cloud computing platform reduces the cost of services and alleviates the problem of limited resources by sharing important resources among different users. Reliable performance requires that the platform be able to withstand security threats [2]. In recent years, cloud computing has become one of the most important topics in security research, spanning software security, network security, and data storage security.

The National Institute of Standards and Technology (NIST) defines cloud computing as [3] “a model for easy access, ubiquitous, resource integration, and on-demand access that can be easily delivered through various types of service providers.” Cloud computing follows the Pay-As-You-Go (PAYG) mechanism, in which users pay only for the services they use. The PAYG model gives users the ability to develop platforms and storage and to customize the software according to the needs of the end user or client. These advantages are the reason the research community has put so much effort into this modern concept [4].

Security is gained by achieving confidentiality, integrity, and data availability. Cloud users want assurance that their data is safe while they use cloud services. Various types of attacks are launched against users' private data, such as intrusion attacks, hacking, theft of private data, and denial-of-service attacks; 57% of companies report security breaches when using cloud services [5]. Data privacy is even more delicate than data security, because cloud service providers (CSPs) have full access to all cloud users' data and can monitor their activities, compromising user privacy. For example, suppose a user is diabetic: a CSP analyzing his activities can see what he searches for most and which medicines he uses most often. With this access, the CSP can derive sensitive information about an individual user and could share it with a medicine company or an insurance company [6]. Another problem is that the user cannot fully trust the CSP, which raises many legal issues; because of this mistrust, many users will not store their personal or sensitive data on unreliable cloud services. One way to solve this problem is for the user to install a proxy on his side: the proxy takes the user's data, protects and encrypts it using data protection techniques, and then sends it to the untrusted CSP [7].

Google's recent privacy policy, for instance, lets any user use any Google service free of cost, while Google monitors their data to improve its services [8]. In this paper, we compare different types of data protection techniques that provide privacy and security for data stored on the cloud. Many papers discuss outsourcing data storage to the cloud [9, 10]; we also discuss how the outsourced data can be secured. Most papers describe cloud data security against external intruder attacks [11, 12]; this paper discusses not only attacks from outside intruders and the corresponding protection mechanisms but also insider attacks from the CSP itself. Many surveys cover data privacy through cryptographic techniques [13, 14]. These cryptographic techniques are very powerful for data protection and provide significant results; however, they require key management, and some cloud functionalities do not work over them. In this paper, we also discuss steganographic techniques. To the best of our knowledge, no existing study discusses all the conventional and nonconventional security techniques; therefore, this paper brings all of these data protection techniques together.

The rest of this paper is organized as follows: Section 2 reviews related work. Section 3 describes the research methodology, consisting of the inclusion and exclusion criteria, quality assessment criteria, study selection process, research questions, and data extraction process; it also discusses assumptions and requirements for data protection in the cloud. Section 4 presents the cryptographic and noncryptographic techniques used for data protection over the cloud, discusses the demographic characteristics of the relevant studies along four aspects, (i) publication trend, (ii) publication venues (proceedings and journals), (iii) number of citations, and (iv) author information, and compares all of these data protection techniques. Finally, in Section 5, we discuss the results and present the conclusion and future work.

2. Related Work

The first access control mechanism and data integrity scheme in the provable data possession (PDP) model was proposed in [15], which provides two mobile applications based on the RSA algorithm. Like PDP, a proof of retrievability (PoR) scheme was proposed in [16] to ensure the integrity of remote data; PoR efficiency was later improved by integrating a shorter authentication tag with the PoR system [17]. A more flexible PDP scheme, using symmetric-key encryption techniques to support dynamic operations, was proposed in [18]. A PDP protocol with flexible functionality, in which blocks can be added at run time, was developed in [19]. A new PDP system with a different data structure improved flexibility performance [20], and another PDP model with a different data structure was designed to handle dynamic data functionality [21]. To improve data accuracy, a multireplica data verification scheme that fully supports dynamic data updates was designed in [22].

A unique data integration protocol for multicloud servers was developed in [23]. The author of [24] also considers the complex case where multiple copies are stored at multiple CSPs and builds a solid system to ensure the integrity of all copies at once. A proxy PDP scheme supporting delegated data checking, which uses concessions to verify auditor consent, was proposed in [25]. The restrictions on the verifier were then removed, strengthening the scheme, and a separate PDP certification system was proposed [26]. To maintain information security, a concept for information security was proposed and a PDP protocol for public verification was developed [27]. To resolve the certificate management issue, a PDP system with data protection was introduced [28].

Identity-based cryptography, in which a user's unique identity is used as input to generate a secret key, was developed in [29]. Another PDP protocol was recommended to ensure confidentiality [30]. The author of [31] proposed a scheme in which tags are generated through the ring signature technique for group-based data sharing, supporting public auditing while maintaining user privacy. A new PDP system for data sharing over the cloud while maintaining user privacy was introduced in [32]; additionally, it supports dynamic groups, allowing users to exit or join the group at any time. Another PDP system based on broadcast encryption and supporting dynamic groups [33, 34] was introduced. The issue of user revocation was raised in [35], which proposed a PDP scheme that removes a user from the CSP using the proxy signature method. A PDP-based group data protocol was developed to track user privacy and identity [36]. A PDP system for data sharing between multiple senders was proposed in [37]. The author of [38] provides a SEPDP system while maintaining data protection; however, the author of [39] proved that the scheme proposed in [38] is vulnerable to malicious counterfeiting by the CSP. A collusion-resistant user revocable public auditing (CRUPA) system was introduced for managing data shared in groups [40]. Another scheme was introduced to ensure the integrity of mobile data terminals in cloud computing [41].

To address the PKI issue, identity-based encryption was designed to enhance the PDP protocol and maintain user privacy in a dynamic community [42]. Before sharing user-sensitive data with third parties or researchers, data owners must ensure that the privacy of that data is protected; this can be done using data anonymization techniques [43]. In recent years, the research community has focused on privacy-preserving data publishing (PPDP) and developed several approaches for tabular data and social networks (SN) [44–49]. There are two popular settings in PPDP: interactive and noninteractive [50]. The k-anonymity model [51] and its variants are most commonly used in the noninteractive setting of PPDP [52–56]. Differential privacy (DP) [57] is used extensively in the interactive setting of PPDP [58–60], while several studies have also reported DP-based approaches for the noninteractive setting [61]. Researchers have further extended the concepts used to anonymize tabular data to protect the privacy of SN users [62–64].

Most images on the internet are stored in compressed form; hence, various studies design techniques for AMBTC-compressed images. Data hiding has become an active research area: we can hide data by adding confidential information to a cover image, producing a stego image. There are two types of data hiding schemes: irreversible [65–68] and reversible [69–71]. A ciphertext encrypted for one recipient can be re-encrypted for another by a semitrusted proxy without decryption [72]. The first concrete construction of a collusion-resistant unidirectional identity-based proxy re-encryption scheme, for both selective and adaptive identity, was proposed in [73]. One of the most widely used data hiding schemes is histogram shifting [74–76]; a histogram-shifting data hiding scheme that detects pixel histograms in the cover image was introduced in [77]. When big and diverse data are distributed everywhere, vicious attacks cannot be fully controlled; therefore, cryptosystems are needed to protect the data [78–80].

Some identity-based signature (IBS) schemes based on bilinear pairing have been introduced [81–84]. Authentication schemes based on bilinear pairing over elliptic curves are more efficient and safer than traditional public key infrastructure [85, 86]. The paper [87] proposed a privacy-preserving proxy re-encryption scheme for public cloud access control. A differential attack on one-to-many order-preserving encryption (OPE), exploiting the differences of the ordered ciphertexts, is performed in [88]. Another proposal is a cancelable biometric template protection scheme based on format-preserving encryption and Bloom filters [89]. Some researchers also use pairing-free identity-based signature schemes [90–93]. A lightweight proxy re-encryption scheme with certificate-based and incremental cryptography for fog-enabled e-healthcare is proposed in [94].

3. Research Methodology

The objective of this SLR is to identify, investigate, and evaluate the existing research on data storage security in cloud computing. An SLR is a fair and unbiased way of evaluating all existing techniques, providing a complete, evidence-based search on a specific topic. To date, no SLR on data storage security techniques covers both the cryptographic and the noncryptographic techniques; this SLR fills that gap. It follows the SLR guidelines provided by Kitchenham [95] and, to strengthen the evidence, also follows the study provided by [96]. Our SLR consists of three phases, namely planning, conducting, and reporting, as shown in Figure 1.

Figure 1. Review procedure.

3.1. Research Questions

The primary research question of this systematic literature review is “What types of data protection techniques have been proposed in cloud computing?” It is further divided into the four RQs listed below.

  • RQ1: What types of data protection techniques have been proposed in cloud computing?
  • RQ2: What are the demographic characteristics of the relevant studies?
  • RQ3: Which data protection technique provides the most data protection among all the techniques?
  • RQ4: What are the primary findings, research challenges, and directions for future research in the field of data privacy in cloud computing?

3.2. Electronic Databases

Six electronic databases were selected to collect the primary search articles. All six are well reputed in the domain of cloud computing, and most of the relevant articles were taken from two of them, IEEE and Elsevier. The electronic databases used in this research process are given in Table 1.

Table 1. Database sources.

Electronic database | URL
IEEE Xplore |
Wiley |
Springer Link |
ACM |
Elsevier |
Hindawi |

3.3. Research Terms

First, a title-based search is performed on the electronic databases given in Table 1, and the most relevant studies are retained. The search uses the string (p1 OR p2 … OR pn) AND (t1 OR t2 … OR tn). This query is constructed using the population, intervention, comparison, and outcomes (PICO) structure, of which we use the population, intervention, and outcome elements. The database search queries are given in Table 2.

  • Population: “cloud computing”
  • Intervention: “data security,” “data privacy,” “data integrity”
  • Generic query: using the PICO structure, we construct a general query for the electronic databases: ((“Document Title”: cloud∗) AND (“Document Title”: data AND (privacy OR protect∗ OR secure∗ OR integrity∗))). A small sketch of assembling this query programmatically follows this list.
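For illustration only, the generic query above can be rebuilt from the PICO term lists. The term lists and field syntax below are taken from this section; the code itself is a hypothetical convenience, not part of the original search methodology.

```python
# Hypothetical helper: rebuild the generic PICO-based search string.
population = ["cloud*"]
intervention = ["privacy", "protect*", "secure*", "integrity*"]

def title(term: str) -> str:
    """Wrap a term in the 'Document Title' field syntax used above."""
    return f'"Document Title": {term}'

query = "(({}) AND ({} AND ({})))".format(
    " OR ".join(title(t) for t in population),
    title("data"),
    " OR ".join(intervention),
)
print(query)
# (("Document Title": cloud*) AND ("Document Title": data AND (privacy OR protect* OR secure* OR integrity*)))
```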

Table 2. Database search queries.

Database | Search query
IEEE Xplore | ((“Document Title”: cloud∗) AND (“Document Title”: data AND (privacy OR protect∗ OR secure∗ OR integrity∗)))
Wiley | “Cloud computing” in Title and “data AND (privacy OR protect∗ OR secure∗ OR integrity∗)” in Title
Springer Link | ((“Document Title”: cloud∗) AND (“Document Title”: data AND (privacy OR protect∗ OR secure∗ OR integrity∗)))
ACM | acmdlTitle:(+“cloud computing” +data privacy protect∗ secure∗ integrity∗)
Elsevier | ((Document Title: cloud computing∗) AND (Document Title: data AND (privacy OR protect∗ OR secure∗)))
Hindawi | ((“Document Title” cloud) AND (“Document Title” data AND (privacy OR protect OR secure OR integrity)))

3.4. Procedure of Study Selection

The procedure of study selection is described in Figure 2. It has three phases: first, exclusion based on title, in which articles with irrelevant titles are excluded and relevant titles are retained; second, exclusion based on abstract, in which the abstracts are read and only the most relevant articles are retained; and third, exclusion based on full text, which also applies the quality assessment criteria.

Figure 2. Study selection procedure.

3.5. Eligibility Control

In this phase, all the selected papers are read in full, and the relevant papers are selected for further processing in our SLR. Table 3 shows the papers finally selected from each database, based on the inclusion and exclusion criteria given in Table 4.

Table 3. Results from the electronic databases.

Identifier | Database | Initial results | After title screening | After abstract screening | After exclusion and inclusion
ED1 | IEEE | 942 | 223 | 38 | 24
ED2 | ACM | 337 | 127 | 28 | 0
ED3 | Elsevier | 78 | 52 | 17 | 11
ED4 | Springer | 45 | 31 | 18 | 9
ED5 | Wiley | 53 | 45 | 40 | 2
ED6 | Hindawi | 44 | 9 | 3 | 1
ED7 | Others | 171 | 53 | 40 | 5

Table 4. Inclusion and exclusion criteria.

Inclusion criteria | Exclusion criteria
(a) Articles proposing data protection techniques in the context of cloud computing. | (a) Articles not in English.
(b) Peer-reviewed articles only. | (b) Articles that do not support the research questions.
(c) The latest study when several papers have the same objectives. | (c) Articles providing no validation of the proposed techniques.
(d) Comparative studies that compare one or more data protection techniques in cloud computing. | (d) Articles that do not clearly define findings and unbiased results.
(e) Journal papers with impact factors only. | (e) Duplicate studies with respect to title or content.
(f) Ranked conference papers only. | (f) Editorials, short papers, posters, technical reports, patents, and reviews.

3.6. Inclusion and Exclusion Criteria

We use the inclusion and exclusion criteria to determine the eligibility of the primary studies, applying them to the studies that remain after abstract screening. The criteria, set out in Table 4, list the conditions we apply to the articles. After applying them, we obtain the relevant articles that are finally added to our SLR. The search period is from 2010 to 2021, and most of the papers included in our SLR are from 2015 onward.

Applying the inclusion and exclusion criteria in the third phase of the study selection process yields 139 results. After also applying the quality criteria, we finally obtain the 52 articles included in this SLR. Most of the articles come from the Elsevier and IEEE electronic databases; IEEE is the largest venue for data storage security in cloud computing. The proportion of selected articles from each database is shown in Figure 3.

Figure 3. Percentage of selected studies.

3.7. Quality Assessment Criteria

Quality assessment is done in the third phase of the study selection process. A 0-1 scale is used for the quality assessment (QA) of the articles: poor-quality articles receive 0 points and good-quality articles receive 1 point, and only articles with 1 point are included in this SLR. Applying these criteria to all the articles leaves 52 articles. All the selected papers show validity and novelty for different data protection techniques, and the quality assessment also checks the relevance of the articles, ensuring that all are related to the topic of this SLR (data storage protection and privacy in cloud computing). The quality checking (QC) criteria are given in Table 5.

Table 5. Quality checking criteria.

QC1 | Are the goals and objectives of the paper described?
QC2 | Are there concise and clear limitations and statements?
QC3 | Does the research design support the stated objectives?
QC4 | Does the proposed technique provide any validation?

3.8. Taxonomy of the Data Protection Techniques

In this section, all the data protection techniques are depicted in Figure 4, arranged and classified into their related categories. The purpose of the taxonomy is to give an at-a-glance view of all the data protection techniques, which are mainly divided into two categories, namely (1) noncryptographic techniques and (2) cryptographic techniques.

Figure 4. Taxonomy of the data protection techniques.

4. Results and Discussions

Data protection on the cloud is achieved by deploying a third-party proxy that is trusted by the user. The trusted proxy is not a physical entity; it is a logical entity that can run on the user's end (such as on the user's personal computer) or at a location the user trusts. Most local proxies are deployed as an additional service or module (like a browser plugin). For proxies to fulfill the objective of data protection, the following requirements must be met:

  • User privilege. There are several objectives of user privilege or user empowerment; the main one is to increase users' trust in the data protection proxies used with the cloud.
  • Transparency. When users outsource their sensitive data through trusted proxies, the data should remain intact and must not be altered.
  • Low computation overhead. Cloud computing provides large computing power and cost-saving resources; however, increasing data security should not increase computation overhead, so the overhead imposed on the proxies must be minimized.
  • Cloud functionalities preservation. This is the most important objective. Users encrypt their sensitive data on their personal computers with different encryption techniques to increase its protection; however, having applied these techniques, they cannot use some cloud functionalities because of compatibility issues [97]. Preserving those functionalities is therefore the main issue.

Figure 5 shows the data workflow for protecting sensitive data on the cloud using a local proxy. Different assumptions are made for data protection; some are discussed below.

  • Curious CSPs. The honest-but-curious CSP is the most commonly used model in the cloud computing literature [98]. The cloud service provider honestly fulfills its responsibilities: it does not interfere in user activities and follows only the standard protocols. However, the CSP is sometimes curious and analyzes users' queries and sensitive data, which violates the protocol and compromises user privacy. We can prevent this by applying data protection techniques on the user end to protect the users' sensitive data from the CSP.
  • In some cases, CSPs may collaborate with the data protection proxies on the users' side to increase the level of trust between users and CSPs, because better trust can motivate more users to move to the cloud. This collaboration works when CSPs give users a stable interface for storing, searching, and computing over their data.
  • A multicloud approach to cloud computing infrastructure has also been proposed to improve performance, with multiple cloud computing services provided within the same heterogeneous architecture [19]. A multicloud gives the user multiple places to store data at desired locations. It has several benefits, e.g., reduced reliance on a single CSP and increased flexibility.

Figure 5. Data workflow on the cloud using a local proxy.

4.1. RQ1: What Types of Data Protection Techniques Have Been Proposed in Cloud Computing?

In this section, we discuss the techniques for data storage security over the cloud. They fall into two main categories, namely (i) cryptographic techniques and (ii) noncryptographic techniques. The local proxy uses different techniques to protect data stored on the cloud, and because of this, we cannot always enjoy all the advantages of cloud services. We therefore analyze and compare all these techniques against the following criteria: (i) the data accuracy of each technique, (ii) the level of data protection each technique provides, (iii) the functionalities each scheme allows on masked and unmasked data, and (iv) the overhead of encrypting and decrypting data over the cloud.

4.1.1. Noncryptographic Techniques

The noncryptographic techniques discussed in this paper are as follows:

(1) Data Anonymization. Data anonymization is a data privacy technique used to protect a user's personal information by hiding the identifiers or attributes that could reveal the person's identity. It can be achieved through various mechanisms, for example, by removing or hiding identifiers or attributes, or by encrypting the user's personal information. Data anonymity means that the user's personal data is altered in such a way that the person can no longer be identified directly or indirectly, and the CSP cannot retrieve any personal information. Data anonymization techniques were developed in the field of statistical disclosure control and are most often used when sensitive data is outsourced for testing purposes. Data anonymization is represented graphically in Figure 6.

Figure 6. Data anonymization flow diagram.

For example, if doctors want to diagnose certain diseases, they need details of those diseases, obtained from the patients who suffer from them; however, it is illegal to share or disclose anyone's personal information. Data anonymization is therefore used to hide or conceal the person's personal information before the data is outsourced. In some cases, however, the CSP wants to analyze the user's masked data. In data anonymization, attributes are the most important element; attributes can include name, age, gender, address, salary, etc. Table 6 shows the classification of identifiers.

Table 6. Identifier classification.

Identifier | Categorical | Numerical
Name | × |
Age | | ×
Gender | × |
Address | × |
Zip code | | ×
Designation | × |
Salary information | | ×
Diseases | × |

Data anonymization can be performed horizontally or vertically on this table, on a single record or a group of records. The attributes are further classified into the following categories.

  • Sensitive attributes: these contain a person's sensitive information, such as salary, disease information, or phone number. They are strongly protected by applying protection techniques.
  • Nonsensitive attributes: these do not belong to any sensitive category and do not disclose the identity of a person.
  • Identifiers: these carry the identity of a person, such as ID card number, name, or social security number. Because identifiers allow relationships between different attributes to be detected, they must be replaced or anonymized.
  • Quasi-identifiers: these are groups of attributes that are publicly available, such as zip code, designation, or gender. Individually they cannot reveal a person's identity, but in combination they may; hence, quasi-identifiers must be separated to avoid disclosure.

There are two main categories of data masking: (1) perturbative masking and (2) nonperturbative masking.

  • (1) Perturbative masking
  • In perturbative masking, data is altered or masked with dummy datasets: original data is replaced with dummy data that looks like the original with some added noise. The masked data retains the statistical properties of the original data; nonperturbative masking does not, because perturbative masking replaces values with physically similar dummy data.
  • Data swapping
  • In data swapping, values are randomly exchanged between different records, using the same kind of dummy data [99]. Numerical values may only be changed within certain limits; otherwise, the meaning of the data changes and the masked data no longer resembles the original. For attributes that can be ranked, a value is replaced with a nearby-ranked value, as a very large rank difference is not suitable [100]. In data swapping, higher-level attributes are swapped [101], and individual values are not changed.
  • Noise addition
  • In this mechanism, some noise is added to the original dataset to alter it. Noise is added only to data that is continuous and divided into categories [102], and it is applied to all the attributes present in the original dataset, both sensitive attributes and quasi-attributes. A small sketch of noise addition and generalization follows this list.
  • Microaggregation
  • In this technique, related records are gathered into groups, and each group releases average values for its records [103]. The more similar the records within a group, the better the data utility. Data can be clustered in many ways, e.g., into categorical versions [104]. Microaggregation is performed on quasi-attributes to protect them from reidentification, and protecting the quasi-attributes in turn protects all the other attributes. Reidentification can also be minimized by data clustering [105].
  • Pseudonymization
  • In this method, the original data is replaced with artificial datasets [106]: each attribute present in the original data becomes a pseudonym, making the data less identifiable.
  • (2) Nonperturbative masking
  • Nonperturbative masking does not change or alter the original data values; however, it changes the statistical properties of the original data. Masked data is created by reduction or suppression of the original data [107].
  • Bucketization
  • In this method, original data is stored in different buckets, which are protected through encryption [108]. Bucketization can protect the sensitive attributes.
  • Slicing
  • Data slicing is a method in which a larger group of data is divided into smaller slices or segments [109]. The sensitive attributes and the quasi-attributes are placed in different slices, so inspecting an individual slice cannot disclose the identity of the person.
  • Sampling
  • Sampling is a technique built on the concepts of population and sample: the entire dataset is the population, and the masked data is a sample of it. We make different samples of the original data; a smaller sample provides more protection [110].
  • Generalization
  • In this technique, some additional attributes are added to a record: if the quasi-attributes are rare, dummy attributes that look like quasi-attributes are added, making reidentification more difficult [111]. Applying generalization protects a person's identity because it hides the relationships among the quasi-identifiers.
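To make two of the masking operations above concrete, the following Python sketch applies noise addition to a numerical sensitive attribute and a generalization-style coarsening to a quasi-identifier. The records, column names, and noise scale are invented for illustration and do not come from any of the surveyed schemes.

```python
import random

# Toy records; all names and values are invented for illustration.
records = [
    {"name": "A. Khan",  "zip": "35401", "salary": 52000},
    {"name": "B. Riaz",  "zip": "35402", "salary": 61000},
    {"name": "C. Aslam", "zip": "35409", "salary": 58000},
]

def add_noise(value: float, scale: float) -> float:
    """Noise addition: perturb a numerical attribute with zero-mean noise."""
    return value + random.gauss(0, scale)

def generalize_zip(zip_code: str, digits_kept: int = 3) -> str:
    """Generalization-style coarsening of a quasi-identifier."""
    return zip_code[:digits_kept] + "*" * (len(zip_code) - digits_kept)

masked = [
    {
        "name": None,                                   # identifier: removed outright
        "zip": generalize_zip(r["zip"]),                # quasi-identifier: coarsened
        "salary": round(add_noise(r["salary"], 1000)),  # sensitive: noise added
    }
    for r in records
]
print(masked)  # e.g. [{'name': None, 'zip': '354**', 'salary': 52894}, ...]
```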

The summary of the data anonymization techniques is given in Table 7.

Table 7. Summary of the data anonymization techniques.

Method | References | Operations supported | Usability | Privacy
Swapping | [–] | Research and application testing | Applicable to any type of attribute |
Noise addition | [ ] | Research and application testing | Used for numerical datasets | Differential privacy
Microaggregation | [–] | Research and application testing | Used for categorical attributes and numerical datasets | k-anonymity, l-diversity, t-closeness
Pseudonymization | [ ] | Research and application testing | Used for numerical datasets |
Bucketization | [ ] | Research and application testing | Used for categorical attributes and numerical datasets | Segmentation
Slicing | [ ] | Research and application testing | Used for categorical attributes | Clustering
Sampling | [ ] | Research and application testing | Large utility loss |
Generalization | [ ] | Research and application testing | Granularity and utility loss | k-anonymity, l-diversity, t-closeness

(2) Data Splitting. Data splitting is a technique in which sensitive data is divided into different fragments [112] to protect it from unauthorized access. The fragments are then stored randomly on different clouds, so even if an intruder obtains a single fragment, the person concerned cannot be identified. For example, a fragment containing the salary figures of an organization is useless to an intruder who does not know which salary belongs to which person. Data splitting is therefore a very useful technique for protecting data stored on the cloud.

Local proxies can outsource data to the cloud without splitting it, or split the data first and then store the fragments with the same CSP under different accounts, or store them on different cloud platforms run by different CSPs that provide some of the same services. Data is split before being stored in different locations so that, even if an intruder learns some piece of the data, no one can be identified from it.

First, the local proxy receives the sensitive data from the user and calculates the risk of disclosure. The user can define the privacy level, which identifies the sensitive attributes that could reveal someone's identity; these are the quasi-attributes or quasi-identifiers. Next, the local proxy decides how many fragments the sensitive data will be split into and how many locations are needed to store them, so that no one can reveal a person's identity. All information about the splitting mechanism is stored at the local proxy, and the system must still function properly and respond to queries on time. The proxy then stores the fragments in different cloud databases, where they are free from disclosure. The data-splitting mechanism supports almost all cloud functions, so almost all services provided by the CSP remain available when data is stored this way.

When users want to retrieve the original data, they submit a query to the local proxy. The query is processed, and the storage locations of the fragments are retrieved from the local database. The query is then replicated once for each fragment and forwarded to the relevant CSPs. Each CSP returns a set of results representing a partial view of the complete result; finally, the proxy assembles the partial results according to the criteria used to split the data and returns the complete result to the user. Since the fragments are mostly stored on the different cloud databases in their original structure, computation over them is generally easy. However, no algorithm currently exists for computations that must run on the separate fragments individually; such computations require communication between different CSPs, so suitable algorithms are still needed. Redundancy of the proxy metadata and backup policies are essential to ensure the robustness of the mechanism. Data splitting is represented graphically in Figure 7.
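A minimal sketch of this proxy workflow is shown below, with in-memory dictionaries standing in for three CSP stores. The proxy keeps only the placement map; every name and the fragment size are hypothetical.

```python
from typing import Dict, List, Tuple

# Stand-ins for three CSP stores and the proxy's private placement map.
clouds: List[Dict[int, bytes]] = [{}, {}, {}]
placement: List[Tuple[int, int]] = []   # (cloud index, key) per fragment

def split_and_store(data: bytes, fragment_size: int = 4) -> None:
    """Cut data into fragments and scatter them round-robin across clouds."""
    for i in range(0, len(data), fragment_size):
        cloud_id = (i // fragment_size) % len(clouds)
        key = len(clouds[cloud_id])
        clouds[cloud_id][key] = data[i:i + fragment_size]
        placement.append((cloud_id, key))

def retrieve() -> bytes:
    """Replay the placement map to reassemble the original data."""
    return b"".join(clouds[c][k] for c, k in placement)

split_and_store(b"salary:52000;name:A.Khan")
assert retrieve() == b"salary:52000;name:A.Khan"
# No single store holds enough contiguous data to link a salary to a person.
```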

Figure 7. Data-splitting flow diagram.

The summary of data splitting is given in Table 8. Several data-splitting techniques are used to protect data stored on the cloud; some are given below.

  • Byte-level splitting
  • In this type, the sensitive data is converted into bytes [113], which are randomly shuffled with each other and then recombined. Fixed-length fragments are formed and stored on different clouds (a sketch of this shuffle-and-split idea follows this list).
  • Privacy-level splitting
  • In this mechanism, the user chooses a privacy level for each file [114] to be stored in a cloud database; the privacy level is attached to the file. Using it, the user can decide that files with higher privacy levels are stored on the more trusted clouds.
  • Byte-level splitting with replication
  • Byte-level data splitting is combined with data replication to improve both performance and security. The author of [115] proposed an algorithm that stores the data fragments on different clouds at a certain distance from each other, which prevents attacks in which the intruder aggregates the split fragments.
  • Byte-level splitting with encryption
  • Byte-level data splitting in which every fragment is encrypted to further enhance the security of sensitive data was proposed in [116, 117]. The data is split into bytes, which are randomly shuffled and finally recombined. This type of data splitting suits binary or multimedia files that are not processed in the cloud.
  • Another issue is choosing a fragment length at which the data cannot be reidentified. If the length is too short, the probability of disclosure increases; if it is too long, the fragments become difficult to handle. The length must therefore be chosen so that the identity of a person remains protected.
  • Attribute-level splitting
  • There is another type of data splitting in which the data is split by attributes, performed in two ways: horizontal splitting and vertical splitting. These are mostly applied to structured databases and provide strong privacy.
  • Vertical splitting
  • In vertical data splitting [118, 119], the quasi-identifiers are divided so that all risky attributes end up in different fragments, preventing reidentification. Some sensitive fragments additionally require encryption; we can encrypt them with encryption algorithms or apply other privacy methods to increase the security level.
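The byte-level variants above can be sketched as follows: shuffle the bytes under a secret permutation, cut the result into fixed-length fragments, and invert both steps on retrieval. Here the seed stands in for the proxy's recombination secret; this is an illustration, not a reconstruction of the cited algorithms.

```python
import random

def byte_split(data: bytes, n_fragments: int, seed: int) -> list:
    """Shuffle all bytes under a seeded permutation, then cut into fragments."""
    order = list(range(len(data)))
    random.Random(seed).shuffle(order)           # secret byte permutation
    shuffled = bytes(data[i] for i in order)
    size = -(-len(shuffled) // n_fragments)      # ceiling division
    return [shuffled[i:i + size] for i in range(0, len(shuffled), size)]

def byte_join(fragments: list, seed: int) -> bytes:
    """Concatenate the fragments and invert the seeded permutation."""
    shuffled = b"".join(fragments)
    order = list(range(len(shuffled)))
    random.Random(seed).shuffle(order)
    out = bytearray(len(shuffled))
    for src, dst in enumerate(order):
        out[dst] = shuffled[src]                 # shuffled[src] came from data[dst]
    return bytes(out)

frags = byte_split(b"sensitive record", 3, seed=42)
assert byte_join(frags, seed=42) == b"sensitive record"
```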

Table 8. Summary of the data-splitting techniques.

Splitting technique | References | Operations supported | Usability | Privacy
Byte-level splitting | [ ] | Storage and retrieval | Useful for binary files; provides weak data privacy | Low level of privacy
Privacy-level splitting | [ ] | Storage and retrieval | Used for sensitive data; provides strong protection | Fragments stored in trusted locations; high level of privacy
Byte-level splitting with replication | [ ] | Storage and retrieval | Provides fast retrieval | Data duplication provides a low level of privacy
Byte-level splitting with encryption | [ , ] | Storage and retrieval | Provides very strong protection | Ciphertext provides a very high level of privacy
Vertical splitting | [–] | Storage, retrieval, search, computation | Useful for structured data | Low level of privacy

A solution for splitting sensitive data without encrypting the fragments has been proposed [120]; it suits data on which computation is required, because computation cannot be performed directly on encrypted data. Another technique [121] demonstrates the redaction and sanitization of documents, identifying all sensitive attributes and protecting the data in most documents.

Schemes that use vertical splitting to protect data are faster than other splitting techniques because the fragments consist of a single attribute or multiple whole attributes and involve no data masking or encryption, so computation is easy. There is also a type of encryption in which data need not be decrypted and re-encrypted for every computation: homomorphic encryption. All modifications are performed on the encrypted data, the actual data is not changed, and yet the final result is preserved [122].

(3) Steganography. Steganography is the practice of concealing a message within another message or a physical object. In computing contexts, a video, audio, image, message, or computer file is concealed within another image, message, or file. The steganography flow diagram is depicted in Figure 8. There are two main types of steganography, namely (1) linguistic steganography and (2) technical steganography, described as follows:

  • (1) Linguistic steganography
  • Semagrams
  • Semagrams use images and symbols alone to conceal the data. There are two types of semagrams [123]: the visual semagram, in which the message is embedded in something visible, and the text semagram, in which the font, color, or symbols of the text message are changed.
  • Open codes
  • In this case, the real message is hidden from the intruder by embedding it in an authorized carrier [124]. Open-code techniques are further divided into two types: jargon codes and covered ciphers.
  • (2) Technical steganography
  • Text steganography
  • In this type, some textual characteristics of the text are changed, such as the font, color, or symbols of the message [127]. Three coding techniques are used to change these textual features: (1) line-shift coding, (2) word-shift coding, and (3) feature coding.
  • Image steganography
  • This is the most popular type of steganography. Image steganography refers to hiding sensitive data inside an image file [128]. The transformed image, the stego image, is expected to look very similar to the original image because its visible features remain the same. Image steganography is divided into three parts, namely (1) least-significant-bit coding, (2) masking and filtering, and (3) transformations (a small least-significant-bit sketch follows this list).
  • Audio steganography
  • Audio steganography transmits secret data by modifying a digitized audio signal in an imperceptible manner [129]. Its types are: (1) least-significant-bit coding, (2) phase coding, (3) spread spectrum, and (4) echo hiding.
  • Video steganography
  • Video steganography combines image and audio steganography [130]. A video consists of many frames, so video steganography can hide a large amount of data in carrier images; a specific frame is selected in which to hide the sensitive data.
  • Methods
  • Frequency domain
  • Frequency-domain steganography is used for hiding a large amount of data with no loss of the secret message, good invisibility, and high security [131]. In the frequency domain, the magnitudes of the DCT coefficients of the cover image are changed. There are two types: (1) discrete cosine transformation and (2) discrete wavelet transformation.
  • Spatial domain
  • The spatial domain is based on the physical location of pixels in an image [132]. A spatial-domain technique manipulates pixels directly, minimizing the changes in the stego image created from the cover image. Some spatial-domain methods are: (1) least significant bit, (2) pixel value differencing, (3) pixel indicator, (4) gray-level modification, and (5) quantized index modulation.
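As a toy illustration of the least-significant-bit coding mentioned under image steganography and the spatial domain, the sketch below hides a short message in the lowest bit of each byte of a stand-in pixel buffer. A real system would operate on decoded pixel data from an actual image; the buffer and message here are invented.

```python
def embed(pixels: bytearray, message: bytes) -> bytearray:
    """Hide message bits (MSB first) in the lowest bit of each cover byte."""
    bits = [(byte >> i) & 1 for byte in message for i in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("cover too small for message")
    stego = bytearray(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit   # overwrite only the lowest bit
    return stego

def extract(pixels: bytearray, length: int) -> bytes:
    """Read back `length` bytes from the lowest bits of the stego buffer."""
    bits = [p & 1 for p in pixels[: length * 8]]
    return bytes(
        sum(bit << (7 - i) for i, bit in enumerate(bits[b * 8:(b + 1) * 8]))
        for b in range(length)
    )

cover = bytearray(range(256))       # stand-in for pixel intensities
stego = embed(cover, b"hi")
assert extract(stego, 2) == b"hi"   # each cover byte changed by at most 1
```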

Figure 8. Steganography flow diagram.

The summary of the steganographic techniques is given in Table 9.

Table 9. Summary of the steganographic techniques.

Steganographic technique | References | Operations supported | Usability | Privacy
Semagrams | [ ] | Storage and retrieval | Uses only images and symbols to cover the data | Low level of privacy
Open code | [ ] | Storage | Used to hide the message from the intruder | Lower privacy compared to cryptographic techniques
Text steganography | [ ] | Storage and retrieval | Changes some textual characteristics of the text | Very low level of privacy
Image steganography | [ ] | Storage and retrieval | Hides sensitive data inside an image file | Medium level of privacy
Audio steganography | [ ] | Storage and retrieval | Modifies a digitized audio signal | High level of privacy
Video steganography | [ ] | Storage and retrieval | Uses both image and audio steganography | Depends on the video resolution: the higher the resolution, the greater the privacy
Frequency domain | [ ] | Storage only | Hides a large amount of data with no loss of the secret message | High level of privacy compared to other steganographic techniques
Spatial domain | [ ] | Storage and retrieval | Uses the physical location of pixels in an image | Depends on the image resolution: the higher the resolution, the greater the privacy

4.1.2. Cryptographic Techniques

Cryptography is the most important and most widely used technique for security. In cryptography, plaintext is converted into ciphertext using a key and an encryption algorithm. Cryptographic techniques are the most secure of all the security techniques, and they are therefore widely used for data storage security on the cloud. Modern cryptographic techniques are increasingly practical, and applying them achieves several objectives, such as data confidentiality and data integrity. Because of the rise in data breaches over the last few years, some cloud service provider companies are shifting toward cryptographic techniques to achieve more security. The most commonly used cryptographic technique is AES [133].

Key management is an important issue in cryptographic techniques, because if the key is obtained by an intruder, all the data is exposed. Key protection and key management are therefore critical, and they are mostly the responsibility of the CSP. Cryptographic techniques also protect the user from an untrusted CSP: sometimes a CSP outsources sensitive data without the user's permission, which is illegal, and encrypting data is the user's best option to protect against this.

However, the user faces some difficulties when using cryptographic techniques. For instance, updating even a small amount of data requires decrypting the data first, which makes minor updates very costly. In general, cryptographic techniques trade performance for security: the user must decide how much performance to give up for the required level of protection. In this paper, we focus on the four main functionalities required of cloud computing when cryptographic techniques are used. Figure 9 shows the encryption flow diagram.
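As a concrete but hedged example of the symmetric encryption mentioned above, the sketch below uses AES in GCM mode via the third-party Python `cryptography` package. The naive in-process key handling is deliberate: as discussed above, deciding who holds the key is the hard part in a cloud setting.

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)   # in a cloud setting, who holds this?
aesgcm = AESGCM(key)

nonce = os.urandom(12)                      # a GCM nonce must never repeat per key
plaintext = b"patient-id:1234;diagnosis:diabetes"
ciphertext = aesgcm.encrypt(nonce, plaintext, None)

# Only a holder of the key (user or CSP, depending on the trust model) can decrypt.
assert aesgcm.decrypt(nonce, ciphertext, None) == plaintext
```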


Encryption flow diagram.

The main functionalities required of cryptographic techniques in the cloud are given below.

  • Search on encrypted data. When a user wants to retrieve data stored in a cloud database, the user generates a query, runs it on a local proxy, and searches for the required data. Search on encrypted data is a very important cryptographic functionality, because every user who stores sensitive data in a cloud database eventually wants to retrieve it through queries, and doing so over ciphertext is difficult.
  • Storage control. Sometimes the user wants to store data in a desired location or on a trusted database. Hence, the user must have full control over where the data is stored.
  • Access control. Access control refers to restricting who may access the data. Sometimes the user does not want to share a private file publicly, so access control is an essential functionality.
  • Computation on data. Data computation is the main functionality of cloud computing. If a user wants to perform computation on encrypted data stored in a cloud database, the naive approach is to decrypt the entire data set, perform the computation, re-encrypt the result, and store it back in the cloud, which is very expensive in terms of computation.

Some of the cryptographic techniques are as follows:

(1) Homomorphic Encryption . Homomorphic encryption is a form of encryption that permits users to perform computations on encrypted data without decrypting it. The results of these computations remain in encrypted form and, when decrypted, are identical to the output that would have been produced by performing the same operations on the unencrypted data. The main types of homomorphic encryption are described below.

  • Partial Homomorphic Encryption. In partially homomorphic encryption, only one arithmetic operation, either addition or multiplication, is supported. If combining two ciphertexts decrypts to the sum of the plaintexts, the scheme is called additively homomorphic; if it decrypts to the product of the plaintexts, the scheme is called multiplicatively homomorphic. Two multiplicative homomorphic schemes are given in [ 134 , 135 ], and a well-known additive homomorphic scheme is Paillier [ 136 ] (see the toy example after this list).
  • Somewhat Homomorphic Encryption. This technique allows both addition and multiplication, but only a limited number of operations, because every operation adds noise to the ciphertext, and too much noise makes the original data unrecoverable. A somewhat homomorphic encryption scheme is presented by the authors of [ 137 , 138 ]. In this scheme, encryption and decryption time grows as the number of multiplication operations grows, which is a further reason to allow only a limited number of operations.
  • Fully Homomorphic Encryption. This technique allows an unlimited number of additions and multiplications, evaluated at the bit level as XOR and AND gates [ 139 ]. Fully homomorphic encryption requires a very high computation time to encrypt and decrypt data, so it is not yet practical for most real-life applications. It uses a bootstrapping algorithm to refresh ciphertexts when a large number of multiplication operations is performed, keeping decryption possible. Homomorphic encryption thus represents a trade-off between supported operations and speed: a limited number of arithmetic operations if low computation cost is the priority, and a large number of operations if rich computation over encrypted data is the priority. It depends on the needs of the user.
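
As a toy illustration of the additive homomorphic property of Paillier [ 136 ], the following Python sketch uses deliberately tiny, insecure demonstration primes (real keys use large random primes). Multiplying two ciphertexts decrypts to the sum of the two plaintexts, so a cloud server could add encrypted values without ever seeing them.

```python
# Toy Paillier cryptosystem: additively homomorphic encryption.
# Insecure demo parameters; real keys use large random primes.
from math import gcd

p, q = 293, 433                  # small demonstration primes
n = p * q
n2 = n * n
lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)   # lcm(p - 1, q - 1)
g = n + 1                        # standard simple choice of generator
mu = pow(lam, -1, n)             # modular inverse of lambda mod n

def L(x):                        # the Paillier "L" function
    return (x - 1) // n

def encrypt(m, r):               # r must be coprime to n
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return (L(pow(c, lam, n2)) * mu) % n

c1 = encrypt(41, r=17)
c2 = encrypt(59, r=23)
# Multiplying ciphertexts adds the underlying plaintexts (mod n):
assert decrypt((c1 * c2) % n2) == (41 + 59) % n
```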

(2) Searchable Encryption . A searchable encryption technique is proposed by the author of [ 140 ]. In this technique, data is encrypted before being stored in a cloud database. The advantage of this technique is that it supports secure search over the encrypted data in the cloud database.

  • Searchable Asymmetric Encryption. Over the past two decades, much of the work on searchable encryption has addressed the multiwriter, single-reader setting. This variant is known as public-key encryption with keyword search (PEKS) [ 141 ].
  • Searchable Symmetric Encryption. Symmetric-key algorithms use the same key for message encryption and ciphertext decryption (or two keys related by a simple transformation). Verifiable searchable symmetric encryption, a key cloud security technique, allows users to retrieve encrypted data from the cloud by keyword and verify the accuracy of the returned results. A scheme for keyword search over dynamic encrypted cloud data with symmetric-key-based verification is proposed in [ 142 ]. A toy index illustrating the symmetric flavor is sketched below.
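
The following sketch shows the core idea of a searchable symmetric index in simplified form, using only the Python standard library: keywords are replaced by HMAC tokens, so the server can match query tokens without learning the words. All names are illustrative, and a real scheme would also encrypt the document identifiers and payloads, hide access patterns, and (for verifiable variants) return proofs.

```python
# Toy searchable symmetric encryption: an HMAC-based keyword index.
# The server stores only opaque tokens, never the keywords themselves.
import hmac, hashlib

KEY = b"demo-secret-key"        # shared secret held by the data owner

def token(word: str) -> bytes:
    return hmac.new(KEY, word.encode(), hashlib.sha256).digest()

# Client side: build the encrypted index before uploading it.
documents = {1: "cloud data privacy", 2: "edge computing survey"}
index: dict[bytes, list[int]] = {}
for doc_id, text in documents.items():
    for word in text.split():
        index.setdefault(token(word), []).append(doc_id)

# Server side: match an opaque query token against the index
# (real schemes also encrypt the posting lists and documents).
def search(query_token: bytes) -> list[int]:
    return index.get(query_token, [])

assert search(token("privacy")) == [1]   # client sends token("privacy")
```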

(3) Encryption . In cryptography, encryption is the process of encoding information. This process converts the original representation of the information, known as plaintext, into an alternative form known as ciphertext. Ideally, only authorized parties can decipher a ciphertext back to plaintext and access the original information.

  • Symmetric Key Encryption. Only one key is used in symmetric encryption, both to encrypt and to decrypt the message. Two parties communicating through symmetric encryption must exchange the key so that it can be used in the decryption process. This differs from asymmetric encryption, where a pair of keys is used to encrypt and decrypt messages. A secure transmission method for network communication data based on a symmetric-key encryption algorithm is proposed in [ 143 ].
  • Public Key Encryption. The public-key encryption scheme is proposed by the author of [ 144 ]. In this scheme, the receiver creates a key pair consisting of a public key, which is known to everyone, and a private key, which is kept secret. The sender encrypts the data using the receiver's public key and sends the ciphertext to the receiver, who decrypts it with the private key. In this way, two parties can communicate securely without sharing a secret in advance. (A short sketch contrasting the symmetric and public-key styles follows this list.)
  • Identity-Based Encryption. Identity-based encryption is proposed by the author of [ 145 ]. In this technique, a set of users is registered in the database, and a central authority assigns a unique identity to each registered user; the identity can be, for example, the user's name or e-mail address. The user's identity plays the role of the public key, while the corresponding private key is generated by the central authority (the private key generator), not by the user. Identity-based encryption is improved by the author of [ 146 ]. Its main advantage is that anyone can derive the public key of a user directly from that user's identity.
  • Attribute-Based Encryption. The authors of [ 147 , 148 ] propose a technique called attribute-based encryption. Like identity-based encryption, it depends on a central authority, which generates private keys and distributes them to all registered users. A user can decrypt a ciphertext only if the user's attributes satisfy the policy attached to it. Attribute-based encryption is especially useful when the number of registered users is very large. It comes in two forms: key-policy and ciphertext-policy.
  • Functional Encryption. Functional encryption [ 149 , 150 ] generalizes identity-based encryption, attribute-based encryption, and public-key encryption; the functionalities of these three techniques together make up functional encryption. In this technique, every private key is generated by the central authority and associated with a specific function. Functional encryption is a very powerful primitive and is used in many applications.
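
To contrast the symmetric and public-key styles described above, here is a brief sketch that assumes the third-party Python `cryptography` package: Fernet for shared-key encryption and RSA with OAEP padding for public-key encryption.

```python
# Symmetric vs. public-key encryption with the `cryptography` package.
from cryptography.fernet import Fernet
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives import hashes

# Symmetric: one shared key both encrypts and decrypts,
# so the two parties must exchange it beforehand.
shared_key = Fernet.generate_key()
f = Fernet(shared_key)
assert f.decrypt(f.encrypt(b"report.csv")) == b"report.csv"

# Public-key: the receiver's public key encrypts; only the
# receiver's private key can decrypt, so no key exchange is needed.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)
ciphertext = private_key.public_key().encrypt(b"report.csv", oaep)
assert private_key.decrypt(ciphertext, oaep) == b"report.csv"
```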

(4) Signcryption . Signcryption functions simultaneously as a digital signature and a cipher. Digital signatures and encryption are two basic cryptographic tools that together can ensure confidentiality, integrity, and nonrepudiation. In [ 151 ], a new signcryption scheme based on efficiently verifiable credentials is proposed; the system not only performs signing and encryption in one step but can also provide an encryption-only or signature-only form when needed. Reference [ 152 ] proposes a lightweight certificate-based signcryption scheme (CBSS) using a proxy cipher for smart devices connected to an IoT network, to reduce computation and communication costs; to ensure the security and efficiency of the proposed CBSS scheme, the authors use a cipher with 80-bit security parameters. Reference [ 153 ] proposes an access control scheme for the IoT environment using an efficient and robust signcryption scheme; besides security services such as protection against attacks, confidentiality, integrity, and nonrepudiation, the scheme keeps computation and communication costs low compared to existing schemes. Reference [ 154 ] gives informal and formal security proofs of the proposed scheme; the Automated Validation of Internet Security Protocols and Applications (AVISPA) tool is used for the formal security analysis, which confirms that the proposed CB-PS scheme can be implemented for resource-constrained, low-computing electronic devices in e-prescription systems. The scheme proposed in [ 155 ] introduces a concept that does not require a secure channel: the key generation center sends a partial private key to users over a public channel. The summary of the cryptographic schemes is given in Table 10 .

Table 10. The summary of the cryptographic techniques.

Cryptographic technique | References | Supported operations | Usability | Privacy
Identity-based encryption | [ , ] | Used for data access control | Requires a valid private key to access data | Provides a high level of privacy
Symmetric-key encryption | [ ] | Encryption and decryption of data using the same key | No functionality can be performed on encrypted data | Provides a high level of privacy
Public-key encryption | [ ] | Used for data access control | Requires a valid public key for encryption and private key for decryption | No key exchange is required; provides a very high level of privacy
Attribute-based encryption | [ , ] | Data access control based on attributes | Less secure than public-key encryption | Provides less privacy than public-key encryption
Functional encryption | [ , ] | Used for selected plaintext | Requires a valid function | Privacy depends on the function
Fully HE | [ ] | Allows all arithmetic operations | Not practically usable | Provides a very high level of privacy
Somewhat HE | [ , ] | Allows many additions and one multiplication | Usable for limited arithmetic operations | Provides a medium level of privacy
Partially HE | [ – ] | Allows only one arithmetic operation | Usable for limited arithmetic operations | Provides a low level of privacy
Searchable encryption | [ – ] | Allows query search on encrypted data | Usable on encrypted data | Provides a high level of privacy
Signcryption | [ – ] | Used for user authentication | Usable when efficient authentication is required | Provides a high level of privacy
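
Signcryption itself combines signing and encryption in a single primitive; as a rough stand-in, the sketch below composes the two steps explicitly (an Ed25519 signature, then Fernet encryption) using the third-party Python `cryptography` package. This is sign-then-encrypt, not a true signcryption scheme, and is meant only to show the confidentiality-plus-authentication goal discussed above.

```python
# Sign-then-encrypt: an illustration of signcryption's goal
# (confidentiality + sender authentication), not a real signcryption scheme.
from cryptography.fernet import Fernet
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

signing_key = Ed25519PrivateKey.generate()   # sender's long-term key
verify_key = signing_key.public_key()        # known to the receiver
shared_key = Fernet.generate_key()           # shared with the receiver
f = Fernet(shared_key)

def signcrypt(message: bytes) -> bytes:
    sig = signing_key.sign(message)          # 64-byte Ed25519 signature
    return f.encrypt(sig + message)          # encrypt signature || message

def unsigncrypt(blob: bytes) -> bytes:
    data = f.decrypt(blob)
    sig, message = data[:64], data[64:]
    verify_key.verify(sig, message)          # raises if forged or tampered
    return message

assert unsigncrypt(signcrypt(b"prescription #42")) == b"prescription #42"
```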

The data storage protection techniques for cloud computing discussed above all fall into three main categories: (i) data splitting, (ii) data anonymization, and (iii) cryptography. We have discussed these techniques from several points of view, e.g., the overhead on the local proxy, the computation cost, support for search on encrypted data, the data accuracy each technique retains, the level of protection each provides, and the functionalities that can still be performed on the masked data; considering these views, we can analyze all the data protection techniques. Cryptography provides high security but limited cloud functionality and a high cost for computation on cloud data. Data splitting provides a low computation cost but a low level of security. Data anonymization comes in two flavors, perturbative and nonperturbative masking; in perturbative masking, data is altered with dummy data, so security is high, but some functionalities can no longer be performed.
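
As a minimal illustration of the masking idea, the sketch below suppresses a direct identifier and generalizes two quasi-identifiers. The records and rules are invented for the example; real anonymization methods additionally enforce formal guarantees, such as k-anonymity, over the whole data set.

```python
# Toy anonymization sketch: suppress direct identifiers and
# generalize quasi-identifiers before uploading to the cloud.
records = [
    {"name": "Alice", "age": 34, "zip": "90210", "diagnosis": "flu"},
    {"name": "Bob",   "age": 37, "zip": "90213", "diagnosis": "asthma"},
]

def anonymize(rec: dict) -> dict:
    return {
        "name": "*",                          # suppress direct identifier
        "age": f"{rec['age'] // 10 * 10}s",   # generalize: 34 -> "30s"
        "zip": rec["zip"][:3] + "**",         # generalize: 90210 -> "902**"
        "diagnosis": rec["diagnosis"],        # analysis attribute kept
    }

print([anonymize(r) for r in records])
```

Because the anonymized records remain statistically similar to the originals, the CSP can still run analyses on them, which is exactly the transparency advantage noted above; the price is that the original values cannot be recovered.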

4.2. RQ2: What are the Demographic Characteristics of the Relevant Studies?

We answer this question by considering the four following aspects: (i) publication trend, (ii) publication venues (proceeding and journals), (iii) number of citations, and (iv) author information.

4.2.1. Publication Trend

From 2010 to 2021, we found 52 papers published in top-ranked journals and conferences. From 2010 to 2017, work on cloud data security grew only slowly, but after 2017 activity increased sharply: 37 of the papers were published between 2018 and 2021, and the most, including the highest-ranked studies, appeared in 2021. Figure 10 shows the publication trend from 2010 onward. Most of the articles were published in journals, with the highest number, 6 papers, appearing in IEEE Access.


Number of publications per year.

4.2.2. Publication Venues

There are different types of publication venues, including book chapters, conference proceedings, journals, workshop proceedings, and symposium proceedings. The number of publications in each venue is given in Figure 11 . We have a total of 52 papers after applying the inclusion and exclusion criteria of Section 2 .


Publication venues.

Of the 52 papers, none are published as book chapters, 1 is published in workshop proceedings, none in symposium proceedings, 43 in journals, and 8 in conference proceedings. The most active journals in cloud data security are listed in Table 11 .

Table 11. Top 5 most active journals.

Title | Number of papers
IEEE Access | 6
Journal of Cryptology | 3
Information Fusion | 3
Information Sciences | 2
IEEE Transactions on Knowledge and Data Engineering | 2

The most active journal is IEEE Access, with 6 papers. The Journal of Cryptology is the second most active journal in the field of data storage security and privacy in cloud computing, with 3 papers, followed by Information Fusion, also with 3 papers. Information Sciences and IEEE Transactions on Knowledge and Data Engineering each contributed 2 papers. The most active conferences are given in Table 12 .

Table 12. Top 5 most active conferences.

Title | Number of papers
International Conference on Privacy in Statistical Databases | 1
International Conference on Database Systems for Advanced Applications | 1
International Conference on High Performance and Smart Computing | 1
International Conference on Mechatronic Sciences, Electric Engineering, and Computer | 1
Conference on Computer Vision and Pattern Recognition | 1

4.2.3. Number of Citations

The number of citations of a paper is one indicator of its quality and influence: generally, the more citations, the greater the impact. Table 13 shows the most influential authors, and Figure 12 shows the number of citations of all the papers used in this SLR. A few papers have more than 100 citations each, indicating very high quality and influence; these papers are [ 105 , 118 , 124 , 139 ].


Number of citations of the papers.

Table 13. Top 10 most influential authors in data protection in cloud computing.

Name | Institution | Number of papers
Insaf Ullah | Department of Information Technology, Hazara University, Mansehra 21120, Pakistan | 4
Rodríguez-Hoyos A | Departamento de Electrónica, Telecomunicaciones y Redes de Información, Escuela Politécnica Nacional, Ladrón de Guevara | 2
Yang JJ | Tsinghua National Laboratory for Information Science and Technology, Tsinghua University | 2
Ahmad Al Badawi | Faculty of Engineering, National University of Singapore, Singapore | 1
Nicolas Gama | Laboratoire de mathématiques de Versailles | 1
Xinrui Ge | College of Computer Science and Technology, Qingdao University | 1
Hua Deng | College of Computer Science and Electronic Engineering, Hunan University | 1
Jiguo Li | Fujian Provincial Key Laboratory of Network Security and Cryptology | 1
Gil Segev | School of Computer Science and Engineering, Hebrew University of Jerusalem | 1
Andreea B | Department of Electrical and Systems Engineering, University of Pennsylvania | 1

4.2.4. Author Information

Some authors are particularly active in this field. To identify them, we list the top 10 authors in data protection and privacy in cloud computing, together with their numbers of publications, in Table 13 .

4.3. RQ3: Which Data Protection Technique Provides More Data Protection among all the Techniques?

We answer this question by comparing the data protection techniques across five aspects: (i) local proxy overhead, (ii) data accuracy retained, (iii) level of data protection, (iv) transparency, and (v) operations supported.

4.3.1. Comparison of Data Protection Techniques

In this section, we compare all the data protection techniques discussed in this SLR and assess which technique provides the most protection. We compare the techniques on five aspects: (i) local proxy overhead, (ii) data accuracy retained, (iii) level of data protection, (iv) transparency, and (v) operations supported. Table 14 provides a brief comparison of all the data protection techniques discussed in this SLR. We now discuss these five aspects one by one in more detail.

  • Encryption. The overhead on the local proxy for encryption is very high because the data is stored encrypted. If the user wants to update the data, the user must first decrypt it, apply the update, and then encrypt it again. This cycle takes considerable time, and all of the work is performed by the local proxy.
  • Data Splitting. The overhead on the local proxy for data splitting is very low and remains roughly constant while splitting the data into fragments.
  • Anonymization. The overhead on the local proxy for anonymization is moderate, because most anonymization methods require quasilinear computation in the number of records to generate the anonymized data set. Once the anonymized data is generated and stored in the cloud database, there is no further overhead on the local proxy.
  • Homomorphic Encryption. The overhead on the local proxy for homomorphic encryption is very high because homomorphic encryption involves a large number of mathematical operations.
  • Steganography. The overhead on the local proxy for steganography is modest, as the data is simply concealed inside a cover for secure communication. However, because of the complexity of the operations involved, transform-domain techniques impose more local proxy overhead than spatial-domain techniques.
  • Signcryption. The overhead on the local proxy for signcryption is higher than for simple encryption, because signcryption performs hashing and encryption in a single logical step; this extra work raises the overhead above that of simple encryption.
  • Encryption. The data accuracy level for encryption is very high. The sensitive data is encrypted by the sender and decrypted by the receiver using a key, and it cannot be read by anyone who does not hold the secret key, so the original data is recovered exactly.
  • Data Splitting. The data accuracy level for data splitting is average, because the data is present in the form of cleartext fragments, which the CSP can access directly. Both encryption and data splitting are reversible methods, so the original data can be retrieved easily.
  • Anonymization. The data accuracy level for data anonymization is very low because anonymization is irreversible: data is replaced with dummy data and cannot be retrieved afterwards.
  • Homomorphic Encryption. The data accuracy level for homomorphic encryption is very high because, as with ordinary encryption, the original data is recovered exactly upon decryption.
  • Steganography. The data accuracy level for steganography is low compared to cryptographic techniques, because the data is embedded inside the cover of another medium, and any change to the cover during transmission changes the concealed data. The stego image carrying the secret data is transmitted over the communication channel, and the receiver can extract the concealed data accurately only if the cover arrives unmodified; it is therefore hard to guarantee a high accuracy level.
  • Signcryption. The data accuracy level for signcryption is also very high, because signcryption provides both confidentiality and authentication, so the receiver can additionally verify the identity of the sender.
  • Encryption. The level of data protection is very high for encryption techniques, because data is transformed into ciphertext that cannot be understood; identifying the data without decryption is practically impossible for anyone who lacks the secret key.
  • Data Splitting. The level of data protection for data splitting is lower than for cryptographic techniques, because the data is split into fragments that remain in their original, readable form. If an intruder hacks or steals these fragments, the data in them can be read directly.
  • Anonymization. The level of data protection for data anonymization is also lower than for cryptographic techniques, because only the quasi-identifiers are protected; if they are not protected strongly, there is a chance that a person's sensitive data can be reidentified.
  • Homomorphic Encryption. The level of data protection is very high for homomorphic encryption, because here too the data is transformed into ciphertext that cannot be understood.
  • Steganography. The data protection level for steganography is medium, because the data is embedded inside the cover of another medium. The stego image containing the secret data is transmitted over the communication channel, and the receiver extracts the concealed data from the cover; concealment thus provides covert, rather than cryptographically protected, transmission.
  • Signcryption. The data protection level for signcryption is very high, because signcryption achieves both confidentiality and authentication, so the identity of the sender can also be verified.
  • Encryption. There is no transparency for encrypted data, because encryption requires key management: the local proxy must keep records of all the keys and manage them.
  • Data Splitting. There is no transparency for the data-splitting mechanism, because the data is split into fragments that the local proxy stores in different locations, and a record of the location of every fragment must be kept.
  • Anonymization. Anonymization is fully transparent, because the local proxy does not need to keep any record of data storage. The anonymized data is statistically similar to the original data, so the CSP can also perform computation and analysis on it.
  • Homomorphic Encryption. There is no transparency for homomorphically encrypted data, because, as with ordinary encryption, the local proxy must keep records of all the keys.
  • Steganography. In steganography, unlike the other data protection techniques, the main aim is to transmit data without letting an attacker even know that a transmission is taking place, since the data is concealed inside the cover of another medium. Data transmission in steganography is fully transparent: no key management is required, and there is no need to keep track of data storage.
  • Signcryption. There is no transparency for signcrypted data, because signcryption requires key management, so the local proxy must keep records of all the keys and manage them.
  • Encryption. Only the data storage operation is supported on encrypted data: if the user wants to update encrypted data stored in a cloud database, the user must first decrypt the data and then perform the update. No modification operation can be performed directly on the ciphertext.
  • Data Splitting. All operations can be performed on split data, because the data is present in its original form; data storage, search, update, and computation are all possible.
  • Anonymization. Anonymized data is either masked or nonmasked. On nonmasked data, both storage and search can be performed; on masked data, only storage.
  • Homomorphic Encryption. Homomorphically encrypted data supports storage and the permitted arithmetic computations over ciphertext, but a general update still requires decrypting the data first.
  • Steganography. A stego image supports only data storage, because if the user wants to update the data hidden in it, the user must first extract the data from the stego image before making any modification.
  • Signcryption. Only data storage is supported on signcrypted data, because to update signcrypted data stored in a cloud database, the user must first unsigncrypt it before performing any update.

Table 14. Comparison of data protection techniques.

Technique | Local proxy overhead | Data accuracy retained | Level of data protection | Transparency | Operations supported | Applicable condition
Encryption | Large overhead on the proxy because of encryption and decryption | Provides a high level of data accuracy | Provides a very high level of data protection | Requires key management | Only storage | Applicable when the user wants high security and can accept lower performance
Anonymization | Quasi-attribute splitting overhead | Low data accuracy, depending on the masking method | Average level of data protection, depending on the anonymization method | Fully transparent for CSP and local proxy | Storage, search on nonmasked data, and computation | Applicable when analysis over statistically similar data is required
Splitting | Remains the same for all operations | Provides a high level of accuracy for the user and CSP | Provides no guarantee about the protection of data fragments | Not transparent for the local proxy, which keeps a record of the fragments' locations | All operations can be performed | Applicable when the user wants high computation performance and can accept lower data security
Homomorphic encryption | Large overhead on the proxy because of the large number of arithmetic operations | Provides a high level of accuracy | Provides a high level of data protection | Requires key management | Storage and arithmetic computation | Applicable when the user wants both computation on stored data and high data security
Signcryption | Large overhead on the proxy because of signcryption and unsigncryption | Provides a high level of data accuracy | Provides a very high level of data protection, including confidentiality and authentication | Requires key management | Only storage | Applicable when the user wants data confidentiality and authentication with high protection
Steganography | Little overhead on the local proxy | Provides lower accuracy than the cryptographic techniques | Provides a medium level of data protection | Fully transparent for CSP and local proxy | Only storage | Applicable when the user wants a medium level of data protection with low computation

5. Conclusion and Future Work

5.1. RQ4: What are the Primary Findings, Research Challenges, and Directions for Future Work in the Field of Data Privacy in Cloud Computing?

5.1.1. Conclusion and Research Challenges

In this SLR, we have systematically presented the data privacy techniques related to data storage in cloud computing, and we have compared all the protection techniques with respect to five functionalities: (i) local proxy overhead, (ii) data accuracy retained, (iii) level of data protection, (iv) transparency, and (v) operations supported. We found several research gaps across the techniques of data splitting, anonymization, steganography, encryption, homomorphic encryption, and signcryption.

  • There is a strong need to develop ad hoc protocols for communicating data-splitting fragments that are stored on different CSPs, and likewise protocols for communication between the CSPs themselves. Noncryptographic techniques are faster across CSPs but do not provide enough security; developing such methods would improve the security of data-splitting techniques.
  • Anonymization techniques work effectively on small data sets but not on big data, so there is a research gap in developing anonymization techniques with more efficient performance. Schemes that provide stronger protection of quasi-identifiers are also needed; current anonymization techniques are still immature.
  • A limitation of steganography is that it only defends against a third party who does not know steganography is in use; a third party who does can extract the data the same way the recipient does. For this reason, encryption is usually combined with steganography, and there is a need for steganographic techniques that can protect sensitive data even from such third parties.
  • There is a need for cryptographic techniques that take less time than existing ones to perform search and computation operations on encrypted data. Cryptographic techniques provide high security but low computational utility, so developing techniques that combine high security with efficiency remains a research gap.
  • The complexity of homomorphic encryption and decryption is far greater than that of normal encryption and decryption, which makes it unsuitable for many applications, such as healthcare and other time-sensitive systems. There is therefore an urgent need for homomorphic encryption schemes with low complexity and computation cost.
  • Signcryption is used to verify and authenticate users, providing both confidentiality and authentication, but its main limitation is the very high computation cost of the underlying encryption algorithms. Signcryption schemes built on low-cost encryption algorithms are therefore needed.

Acknowledgments

This research was financially supported by The Analytical Center for the Government of the Russian Federation (Agreement nos. 70-2021- 00143 dd. 01.11.2021, IGK 000000D730321P5Q0002).

Data Availability

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.


Cloud Is Having a Moment—Why Now?

Forbes Technology Council


Fredrik Nilsson is Vice President of the Americas for Axis Communications , overseeing the company’s operations in North and South America.

Cloud technology isn’t exactly new. In fact, it’s been around for decades. So why does it seem like organizations across nearly every industry are suddenly more eager to talk about the cloud than ever? Even in the security industry—which began leveraging cloud solutions two decades ago—it feels like the cloud is having a bit of a "moment."

There are a few reasons for this—the most important of which is the simple fact that as technology has improved, so has our ability to get the most out of the cloud. Surveillance devices are a perfect example: Dramatic increases in edge-based deep learning capabilities have improved processing power, while vastly improved compression algorithms are reducing the bandwidth needed to transmit video data to the cloud.

As a result, the technology is now affordable to businesses of all sizes—and as IoT devices are increasingly built with cybersecurity in mind, they are safer to use than ever. Cloud capabilities have become thoroughly democratized—and more and more organizations are taking advantage.

Why Cloud Is More Accessible Than Ever

It’s important to understand that cloud adoption isn’t surging because we just figured out how to use it. We’ve known about the potential of the cloud for a long time, but it hasn’t always been easy to leverage. For security companies, the ability to send video footage to the cloud for further analysis was always tantalizing—but uploading thousands of hours of raw footage carried with it extreme bandwidth and storage requirements, creating a financial barrier.


Today, processing speeds have improved, and it has become increasingly common for surveillance devices and other security sensors to be equipped with deep learning capabilities, allowing the devices to leverage the power of AI to process video in real time.

As a result, organizations no longer need to send raw footage to the cloud for analysis. Instead, they can perform their desired analytics at the network edge—on the devices themselves—sending only relevant metadata and effectively compressed video to the cloud where it can be further analyzed and correlated with other data.

Perhaps the most impactful change has been the advent of hybrid cloud solutions. In the past, organizations seeking to leverage analytics and store video had to choose whether to invest in an on-premises server (or servers) or commit fully to the cloud at a high cost. Now, a business might maintain a small server for certain functions while turning to the cloud where appropriate. This allows the business to scale more effectively, balancing cloud and on-premises resources according to the organization’s evolving needs.

Some organizations might not even use a server, instead relying primarily on SD storage cards. Others might use an on-site network video recorder (NVR), and others might use their hybrid deployment as a bridge before turning to a full cloud solution. This flexibility has enabled organizations to dip their toes into the cloud and see how it works for them.

Using The Cloud To Generate Critical Business Insights

Increased power at the network edge and the growing accessibility of the cloud have put exciting new capabilities in the hands of businesses. It has greatly enhanced a host of security use cases, allowing organizations to more accurately and effectively detect humans, vehicles and other objects; identify suspicious behavior; and leverage analytics like license plate and facial recognition.

Deep learning capabilities have made these security use cases significantly more reliable, generating fewer false alarms and giving operators a more accurate impression of what is actually happening. Additionally, this data can be integrated with information from across the organization, such as customer traffic flows, inventory data or marketing information, generating valuable business insights.

While some of this analysis can be performed at the network edge, the ability to correlate between vast data sets is done most effectively in the cloud—and now that powerful edge devices are significantly more accessible, a wider range of organizations can generate and leverage those datasets. The cloud has been around for a long time, but this one-two punch is what has led to its growing prominence today.

As advanced technology becomes more accessible to organizations and those organizations continue to mature in their understanding of how to analyze and leverage a broad range of data, the cloud has become an invaluable resource.

Of course, that doesn’t mean all cloud platforms are the same. There is no “one-size-fits-all” cloud solution—different businesses have different individual needs, and prioritizing a flexible platform that allows integration with third-party solutions (rather than locking the organization into a single product line) is generally a wise course of action.

Cybersecurity is also a significant concern and was once a reason for customers to steer clear of the cloud. Today, it is often the other way around. Thanks to cyber-secure onboarding of devices using the cloud (and the ability to automatically manage devices via firmware patches), the cloud can now make systems more secure, not less. Still, it’s important to choose a cloud platform provider with a reputation for strong security, reliable service and frequent updates—one that can both protect their data and maintain their flexibility.

What’s Next For The Cloud

While the cloud has long been a valuable resource for data analysis, advancements in processing power have made it possible for more businesses than ever to leverage those capabilities for their own gain. It’s very exciting—and it’s also just beginning.

As organizations grow more comfortable with the cloud, many are turning to not just one provider, but many. Multi-cloud environments are on the rise, often balanced with on-premises resources amid hybrid deployments. This has led to an even greater level of flexibility and customization, generating opportunities to leverage analytics and AI for a growing slate of individualized use cases.

The cloud is having a moment, yes—and it’s not going to end anytime soon. The availability of cloud resources and advancement of cloud capabilities will only continue putting exciting new resources in the hands of modern businesses.


Amazon Maintains Cloud Lead as Microsoft Edges Closer

Cloud Infrastructure Market

Having established itself as an early leader in the market for cloud infrastructure, Amazon Web Services (AWS), the online retailer’s profitable cloud platform, is still ahead of the pack. According to estimates from Synergy Research Group , Amazon’s market share in the worldwide cloud infrastructure market amounted to 31 percent in the first quarter of 2024, down from 32 percent a year earlier. Meanwhile, Amazon's main rival Microsoft slowly edges closer, growing the market share of its Azure platform to an all-time high of 25 percent in Q1 2024. Combined with Google at 11 percent market share, the "Big Three" now account for two thirds of the ever-growing cloud market, with the rest of the competition stuck in the low single digits.

In Q1 2024, global cloud infrastructure service spending grew $13.5 billion, or 21 percent, compared to the first quarter of 2023, bringing total spending to over $76 billion for the three months ended March 31. Looking at the full year, the market is now at a $300-billion revenue run rate, which explains why it is so fiercely contested. Despite its size, the cloud market is still growing strongly, with year-over-year growth re-accelerating for the second consecutive quarter.

"In terms of annualized run rate we now have a $300-billion market which is growing at 21 percent per year," John Dinsdale, chief analyst at Synergy Research Group said. "We will not return to the growth rates seen prior to 2022, as the market has become too massive to grow that rapidly, but we will see the market continue to expand substantially. We are forecasting that it will double in size over the next four years."

This chart shows the worldwide market share of leading cloud infrastructure service providers in Q1 2024.


McKinsey Technology Trends Outlook 2023

After a tumultuous 2022 for technology investment and talent, the first half of 2023 has seen a resurgence of enthusiasm about technology’s potential to catalyze progress in business and society. Generative AI deserves much of the credit for ushering in this revival, but it stands as just one of many advances on the horizon that could drive sustainable, inclusive growth and solve complex global challenges.

To help executives track the latest developments, the McKinsey Technology Council  has once again identified and interpreted the most significant technology trends unfolding today. While many trends are in the early stages of adoption and scale, executives can use this research to plan ahead by developing an understanding of potential use cases and pinpointing the critical skills needed as they hire or upskill talent to bring these opportunities to fruition.

Our analysis examines quantitative measures of interest, innovation, and investment to gauge the momentum of each trend. Recognizing the long-term nature and interdependence of these trends, we also delve into underlying technologies, uncertainties, and questions surrounding each trend. This year, we added an important new dimension for analysis—talent. We provide data on talent supply-and-demand dynamics for the roles of most relevance to each trend. (For more, please see the sidebar, “Research methodology.”)

New and notable

All of last year’s 14 trends remain on our list, though some experienced accelerating momentum and investment, while others saw a downshift. One new trend, generative AI, made a loud entrance and has already shown potential for transformative business impact.

Research methodology

To assess the development of each technology trend, our team collected data on five tangible measures of activity: search engine queries, news publications, patents, research publications, and investment. For each measure, we used a defined set of data sources to find occurrences of keywords associated with each of the 15 trends, screened those occurrences for valid mentions of activity, and indexed the resulting numbers of mentions on a 0–1 scoring scale that is relative to the trends studied. The innovation score combines the patents and research scores; the interest score combines the news and search scores. (While we recognize that an interest score can be inflated by deliberate efforts to stimulate news and search activity, we believe that each score fairly reflects the extent of discussion and debate about a given trend.) Investment measures the flows of funding from the capital markets into companies linked with the trend. Data sources for the scores include the following:

  • Patents. Data on patent filings are sourced from Google Patents.
  • Research. Data on research publications are sourced from the Lens (www.lens.org).
  • News. Data on news publications are sourced from Factiva.
  • Searches. Data on search engine queries are sourced from Google Trends.
  • Investment. Data on private-market and public-market capital raises are sourced from PitchBook.
  • Talent demand. Number of job postings is sourced from McKinsey’s proprietary Organizational Data Platform, which stores licensed, de-identified data on professional profiles and job postings. Data is drawn primarily from English-speaking countries.

In addition, we updated the selection and definition of trends from last year’s study to reflect the evolution of technology trends:

  • The generative-AI trend was added since last year’s study.
  • We adjusted the definitions of electrification and renewables (previously called future of clean energy) and climate technologies beyond electrification and renewables (previously called future of sustainable consumption).
  • Data sources were updated. This year, we included only closed deals in PitchBook data, which revised downward the investment numbers for 2018–22. For future of space technologies investments, we used research from McKinsey’s Aerospace & Defense Practice.
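
As an illustration of the indexing step described above, here is a minimal sketch that min–max normalizes hypothetical mention counts to a 0–1 scale and combines them into innovation and interest scores. The counts and the simple averaging rule are assumptions for the example; the report does not publish its exact formula.

```python
# Hypothetical 0-1 indexing of mention counts across trends, then
# combining measures as described: innovation from patents + research,
# interest from news + search. All numbers are illustrative only.
counts = {
    "generative AI":  {"patents": 120, "research": 900, "news": 5000, "search": 8000},
    "cloud and edge": {"patents": 800, "research": 1500, "news": 3000, "search": 4000},
    "quantum":        {"patents": 300, "research": 700, "news": 1000, "search": 1500},
}

def indexed(measure: str) -> dict[str, float]:
    vals = [c[measure] for c in counts.values()]
    lo, hi = min(vals), max(vals)
    return {t: (c[measure] - lo) / (hi - lo) for t, c in counts.items()}

patents, research = indexed("patents"), indexed("research")
news, search = indexed("news"), indexed("search")
for trend in counts:
    innovation = (patents[trend] + research[trend]) / 2  # assumed combiner
    interest = (news[trend] + search[trend]) / 2         # assumed combiner
    print(f"{trend:15s} innovation={innovation:.2f} interest={interest:.2f}")
```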

This new entrant represents the next frontier of AI. Building upon existing technologies such as applied AI and industrializing machine learning, generative AI has high potential and applicability across most industries. Interest in the topic (as gauged by news and internet searches) increased threefold from 2021 to 2022. As we recently wrote, generative AI and other foundational models  change the AI game by taking assistive technology to a new level, reducing application development time, and bringing powerful capabilities to nontechnical users. Generative AI is poised to add as much as $4.4 trillion in economic value from a combination of specific use cases and more diffuse uses—such as assisting with email drafts—that increase productivity. Still, while generative AI can unlock significant value, firms should not underestimate the economic significance and the growth potential that underlying AI technologies and industrializing machine learning can bring to various industries.

Investment in most tech trends tightened year over year, but the potential for future growth remains high, as further indicated by the recent rebound in tech valuations. Indeed, absolute investments remained strong in 2022, at more than $1 trillion combined, indicating great faith in the value potential of these trends. Trust architectures and digital identity grew the most out of last year’s 14 trends, increasing by nearly 50 percent as security, privacy, and resilience become increasingly critical across industries. Investment in other trends—such as applied AI, advanced connectivity, and cloud and edge computing—declined, but that is likely due, at least in part, to their maturity. More mature technologies can be more sensitive to short-term budget dynamics than more nascent technologies with longer investment time horizons, such as climate and mobility technologies. Also, as some technologies become more profitable, they can often scale further with lower marginal investment. Given that these technologies have applications in most industries, we have little doubt that mainstream adoption will continue to grow.

Organizations shouldn’t focus too heavily on the trends that are garnering the most attention. By focusing on only the most hyped trends, they may miss out on the significant value potential of other technologies and hinder the chance for purposeful capability building. Instead, companies seeking longer-term growth should focus on a portfolio-oriented investment across the tech trends most important to their business. Technologies such as cloud and edge computing and the future of bioengineering have shown steady increases in innovation and continue to have expanded use cases across industries. In fact, more than 400 edge use cases across various industries have been identified, and edge computing is projected to win double-digit growth globally over the next five years. Additionally, nascent technologies, such as quantum, continue to evolve and show significant potential for value creation. Our updated analysis for 2023 shows that the four industries likely to see the earliest economic impact from quantum computing—automotive, chemicals, financial services, and life sciences—stand to potentially gain up to $1.3 trillion in value by 2035. By carefully assessing the evolving landscape and considering a balanced approach, businesses can capitalize on both established and emerging technologies to propel innovation and achieve sustainable growth.

Tech talent dynamics

We can’t overstate the importance of talent as a key source in developing a competitive edge. A lack of talent is a top issue constraining growth. There’s a wide gap between the demand for people with the skills needed to capture value from the tech trends and available talent: our survey of 3.5 million job postings in these tech trends found that many of the skills in greatest demand have less than half as many qualified practitioners per posting as the global average. Companies should be on top of the talent market, ready to respond to notable shifts and to deliver a strong value proposition to the technologists they hope to hire and retain. For instance, recent layoffs in the tech sector may present a silver lining for other industries that have struggled to win the attention of attractive candidates and retain senior tech talent. In addition, some of these technologies will accelerate the pace of workforce transformation. In the coming decade, 20 to 30 percent of the time that workers spend on the job could be transformed by automation technologies, leading to significant shifts in the skills required to be successful. And companies should continue to look at how they can adjust roles or upskill individuals to meet their tailored job requirements. Job postings in fields related to tech trends grew at a very healthy 15 percent between 2021 and 2022, even though global job postings overall decreased by 13 percent. Applied AI and next-generation software development together posted nearly one million jobs between 2018 and 2022. Next-generation software development saw the most significant growth in number of jobs (exhibit).

Job postings in fields related to tech trends grew by 400,000 between 2021 and 2022, with generative AI growing the fastest.

[Exhibit: fifteen slope charts of job postings per tech trend, 2021 to 2022. Applied AI had the most postings in 2022 (up 6 percent from 2021); next-generation software development had the second most (up 29 percent). The remaining categories, from most to fewest 2022 postings: cloud and edge computing, trust architecture and digital identity, future of mobility, electrification and renewables, climate tech beyond electrification and renewables, advanced connectivity, immersive-reality technologies, industrializing machine learning, Web3, future of bioengineering, future of space technologies, generative AI, and quantum technologies.]

This bright outlook for practitioners in most fields highlights the challenge facing employers who are struggling to find enough talent to keep up with their demands. The shortage of qualified talent has been a persistent limiting factor in the growth of many high-tech fields, including AI, quantum technologies, space technologies, and electrification and renewables. The talent crunch is particularly pronounced for trends such as cloud computing and industrializing machine learning, which are required across most industries. It’s also a major challenge in areas that employ highly specialized professionals, such as the future of mobility and quantum computing (see interactive).

Michael Chui is a McKinsey Global Institute partner in McKinsey’s Bay Area office, where Mena Issler is an associate partner, Roger Roberts is a partner, and Lareina Yee is a senior partner.

The authors wish to thank the following McKinsey colleagues for their contributions to this research: Bharat Bahl, Soumya Banerjee, Arjita Bhan, Tanmay Bhatnagar, Jim Boehm, Andreas Breiter, Tom Brennan, Ryan Brukardt, Kevin Buehler, Zina Cole, Santiago Comella-Dorda, Brian Constantine, Daniela Cuneo, Wendy Cyffka, Chris Daehnick, Ian De Bode, Andrea Del Miglio, Jonathan DePrizio, Ivan Dyakonov, Torgyn Erland, Robin Giesbrecht, Carlo Giovine, Liz Grennan, Ferry Grijpink, Harsh Gupta, Martin Harrysson, David Harvey, Kersten Heineke, Matt Higginson, Alharith Hussin, Tore Johnston, Philipp Kampshoff, Hamza Khan, Nayur Khan, Naomi Kim, Jesse Klempner, Kelly Kochanski, Matej Macak, Stephanie Madner, Aishwarya Mohapatra, Timo Möller, Matt Mrozek, Evan Nazareth, Peter Noteboom, Anna Orthofer, Katherine Ottenbreit, Eric Parsonnet, Mark Patel, Bruce Philp, Fabian Queder, Robin Riedel, Tanya Rodchenko, Lucy Shenton, Henning Soller, Naveen Srikakulam, Shivam Srivastava, Bhargs Srivathsan, Erika Stanzl, Brooke Stokes, Malin Strandell-Jansson, Daniel Wallance, Allen Weinberg, Olivia White, Martin Wrulich, Perez Yeptho, Matija Zesko, Felix Ziegler, and Delphine Zurkiya.

They also wish to thank the external members of the McKinsey Technology Council.

This interactive was designed, developed, and edited by McKinsey Global Publishing’s Nayomi Chibana, Victor Cuevas, Richard Johnson, Stephanie Jones, Stephen Landau, LaShon Malone, Kanika Punwani, Katie Shearer, Rick Tetzeli, Sneha Vats, and Jessica Wang.



Oracle Cloud Infrastructure (OCI)

The next-generation cloud designed to run any application, faster and more securely, for less.



Developers can now incorporate semantic and generative AI search within their applications, as well as in their observability and security analytics dashboards.
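As an illustration of what that integration can look like, here is a minimal sketch of a semantic (vector) search against an OpenSearch-compatible endpoint such as OCI Search with OpenSearch, using the opensearch-py client. The host, credentials, index name, field names, and the stand-in embedding vector are all assumptions for illustration, not values from any real deployment.

    # Minimal sketch: semantic (k-NN) search against an OpenSearch-compatible
    # endpoint. Host, auth, index, and field names are illustrative assumptions.
    from opensearchpy import OpenSearch

    client = OpenSearch(
        hosts=[{"host": "my-cluster.example.oraclecloud.com", "port": 9200}],
        http_auth=("search_user", "password"),  # assumption: basic auth enabled
        use_ssl=True,
    )

    # Assumes documents were indexed with a knn_vector field named "embedding".
    query_vector = [0.12, -0.03, 0.88]  # stand-in for a real text embedding

    response = client.search(
        index="docs",
        body={
            "size": 5,
            "query": {"knn": {"embedding": {"vector": query_vector, "k": 5}}},
        },
    )

    for hit in response["hits"]["hits"]:
        print(hit["_score"], hit["_source"].get("title"))

In practice the query vector would come from an embedding model, and the same client can back application search boxes as well as observability and security dashboards.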


Address AI sovereignty needs with Oracle’s distributed cloud. Deploy Oracle’s full AI stack in the public cloud or in your data center for increased control of data and AI infrastructure.


Oracle and NVIDIA to Deliver Sovereign AI Worldwide

Oracle and NVIDIA are collaborating to put accelerated computing and generative AI services wherever countries must meet digital sovereignty requirements.


Clay Magouyrk, executive vice president of OCI development, shares how Oracle is bringing the cloud and AI to industries worldwide during his keynote at Oracle CloudWorld.


The expanded Oracle-Microsoft partnership, announced by Larry Ellison and Satya Nadella, delivers Oracle database services on Oracle Cloud Infrastructure in Microsoft Azure.

Learn how to get the most out of AI for your business at an Oracle Data and AI Forum.

A distributed cloud that can be wherever you need it

The only cloud that delivers complete cloud services around the world, across clouds, or in your data center.

1. Multicloud

Make multiple clouds work as one. Oracle offers direct database integration with Microsoft Azure as well as high-performance interconnection.

2. Public cloud

The only cloud with a consistent set of services and consistent low prices across commercial and government public cloud regions.


3. Hybrid cloud

Trusted cloud solutions for nearly any location, including Oracle Exadata Cloud@Customer, a highly optimized database as a service already deployed in more than 60 countries.

4. Dedicated cloud

OCI Dedicated Region and Oracle Alloy are the only products that provide more than 100 cloud services at public cloud pricing entirely within your data centers.

Complete cloud infrastructure and platform services for every workload

OCI offers a common set of 100+ services in each cloud region. Get all the services you need—from containers and VMware to AI—to migrate, modernize, build, and scale your IT. Automate all your workloads, including both existing and new applications and data platforms.

Developer Services

Build, deploy, and manage modern cloud applications using developer-friendly tools and services.

Build and run

  • API Gateway
  • API Management
  • Blockchain Platform
  • Notifications
  • Resource Manager
  • Visual Builder Studio
  • Application Integration
  • Content Management
  • Digital Assistant

Integration Services

Oracle Cloud Infrastructure integration services connect any application and data source to automate end-to-end processes and centralize management. The broad array of integrations, with prebuilt adapters and low-code customization, simplifies migration to the cloud while streamlining hybrid and multicloud operations.

Application integration

  • Service Oriented Architecture (SOA)

Data integration

  • Data Integration
  • Autonomous Database Data Studio
  • Data Integrator

Process automation

  • Process Automation

API full lifecycle management

Event-driven

Analytics and BI

Gain comprehensive business intelligence with augmented analytics to help your organization grow through unique insights.

  • Oracle Analytics Platform
  • Oracle Fusion CX Analytics
  • Oracle Fusion ERP Analytics
  • Oracle Fusion HCM Analytics
  • Oracle Analytics mobile app

AI and Machine Learning

Easily add intelligence to your applications and workloads with prebuilt perception and decision models and out-of-the-box chatbots, or build and train your own models with our data science services.

AI Services

  • Document Understanding

ML Services

  • Data Science
  • In-Database Machine Learning
  • HeatWave AutoML

Media Services

  • Media Streams

Big Data and Data Lake

Gain new insights across all of your data with our comprehensive platform of managed Spark, Hadoop, Elasticsearch, and Kafka-compatible services, combined with best-in-class data warehouse and data management services.

  • Big Data Service
  • Data Catalog

Compute

Secure and elastic compute capacity in the cloud that ranges from flexible virtual machines (Flex VMs) and high-performance bare metal servers to HPC and GPUs (see the sketch after the list below).

  • Arm-based Compute
  • Bare Metal Servers
  • GPU-Accelerated Compute
  • HPC Compute
  • Virtual Machines
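For a concrete sense of how these compute services are driven programmatically, here is a minimal sketch using the OCI Python SDK (the oci package) to list the instances in a compartment. It assumes a standard ~/.oci/config credentials file; the compartment OCID is a placeholder.

    # Minimal sketch: list compute instances with the OCI Python SDK.
    import oci

    config = oci.config.from_file()  # reads ~/.oci/config by default
    compute = oci.core.ComputeClient(config)

    compartment_id = "ocid1.compartment.oc1..example"  # placeholder OCID
    for inst in compute.list_instances(compartment_id=compartment_id).data:
        print(inst.display_name, inst.shape, inst.lifecycle_state)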

Storage

Address key use cases with on-demand local, object, file, block, and archive storage (see the sketch after the list below).

  • Archive Storage
  • Block Volumes
  • Data Transfer Service
  • File Storage
  • Object Storage
  • Storage Gateway
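The storage services follow the same SDK pattern. Below is a minimal sketch that uploads a local file to an Object Storage bucket; the bucket and object names are placeholders, and the same ~/.oci/config setup as above is assumed.

    # Minimal sketch: upload a file to OCI Object Storage.
    import oci

    config = oci.config.from_file()
    object_storage = oci.object_storage.ObjectStorageClient(config)

    # Each tenancy has a single Object Storage namespace.
    namespace = object_storage.get_namespace().data

    with open("backup.tar.gz", "rb") as f:
        object_storage.put_object(namespace, "my-bucket", "backups/backup.tar.gz", f)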

Containers and Functions

Deploy microservices applications on high-performance, managed, open source Docker, Kubernetes, and Fn Functions services (see the sketch after the list below).

  • Container Instances
  • Kubernetes Engine
  • Service Mesh
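Because Kubernetes Engine (OKE) exposes a standard Kubernetes API, the usual tooling applies. As a minimal sketch, once a kubeconfig for the cluster has been generated (for example, with the OCI CLI), the standard Kubernetes Python client works unchanged; the default kubeconfig location is assumed here.

    # Minimal sketch: list pods on an OKE cluster with the standard
    # Kubernetes Python client (pip install kubernetes).
    from kubernetes import client, config

    config.load_kube_config()  # assumes a kubeconfig pointing at the cluster

    v1 = client.CoreV1Api()
    for pod in v1.list_pod_for_all_namespaces().items:
        print(pod.metadata.namespace, pod.metadata.name, pod.status.phase)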

Networking

Connect securely to a customizable, isolated virtual cloud network (VCN) as the heart of your cloud deployment. Move data at 25% of the cost of other cloud providers.

  • Customer-Premises Equipment
  • DNS Management
  • FastConnect
  • Load Balancers
  • Networking Gateways
  • Private Endpoint
  • Site-to-Site VPN
  • Virtual Cloud Networks

Oracle Databases

Run cost-optimized, high-performance, and autonomous versions of Oracle Database, the world's leading converged, multimodel database management system in the cloud.

  • Autonomous Database
  • Autonomous Database on Exadata Cloud@Customer
  • Database Management
  • Database Migration
  • Oracle Database@Azure
  • Database Service for Azure
  • Enterprise Database Service
  • Exadata Cloud@Customer
  • Exadata Database Service
  • Globally Distributed Autonomous Database
  • Ops Insights
  • Standard Database Service
  • Zero Data Loss Autonomous Recovery Service

Open Source Databases

MySQL HeatWave is a fully managed database service, powered by the integrated HeatWave in-memory query accelerator. It’s the only cloud database service that combines transactions, analytics, and machine learning services in one MySQL Database, delivering real-time, secure analytics without the complexity, latency, and cost of ETL duplication. A brief connection sketch follows the list below.

  • Database with PostgreSQL
  • MySQL HeatWave
  • Search with OpenSearch
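Because MySQL HeatWave is MySQL-compatible, any standard MySQL driver can run both transactional and analytic SQL against it. The sketch below uses mysql-connector-python; the host, credentials, schema, and table are placeholders, and it assumes the queried table has been loaded into the HeatWave cluster so the accelerator can pick the query up transparently.

    # Minimal sketch: run an analytic query against a MySQL HeatWave endpoint.
    # Requires: pip install mysql-connector-python
    import mysql.connector

    cnx = mysql.connector.connect(
        host="10.0.1.10",   # placeholder: the DB system's private address
        user="admin",
        password="...",     # placeholder
        database="sales",
    )
    cur = cnx.cursor()

    # Ordinary MySQL SQL; HeatWave accelerates it once the table is loaded.
    cur.execute("SELECT region, SUM(amount) FROM orders GROUP BY region")
    for region, total in cur.fetchall():
        print(region, total)

    cur.close()
    cnx.close()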

Security, Observability and Management, Compliance, and Cost Management and Governance

Protect your most valuable data in the cloud with Oracle’s security-first approach and comprehensive compliance programs. Oracle provides visibility and machine-learning–driven insights to ease management across all layers of the stack deployed on any technology, anywhere.

Cloud Infrastructure Security

  • Access Governance
  • Autonomous Linux
  • Certificates
  • Cloud Guard
  • Hardware Root of Trust
  • Identity and Access Management
  • Isolated Network Virtualization
  • Key Management
  • Network Firewall
  • Security Zones
  • Threat Intelligence
  • Vulnerability Scanning

Cloud Database Security

  • Autonomous Database Security (PDF)

Cloud Application Security

  • Web Application Firewall

Observability and Management

  • Application Performance Monitoring
  • Connector Hub
  • Full Stack Disaster Recovery
  • Java Management
  • Logging Analytics
  • OS Management Hub
  • OS Management Service

Cost Management and Governance

  • Cloud Advisor
  • Cost Analysis
  • License Manager
  • Organization Management
  • Support Rewards
  • Oracle Cloud Compliance
  • Oracle Corporate Security

Global Cloud Data Center Infrastructure

Global, secure, high-performance environments to move, build, and run all your workloads. Hybrid and edge offerings provide specialized deployment, disconnected and intermittently connected operation, low latency and high performance, as well as data locality and security.

  • Multicloud solutions
  • Database Service for Microsoft Azure
  • Interconnect for Microsoft Azure

Public cloud

  • Global cloud regions
  • Oracle EU Sovereign Cloud
  • UK government regions
  • US Department of Defense regions
  • US FedRAMP regions

Hybrid Cloud

  • Hybrid cloud solutions
  • Roving Edge Infrastructure

Dedicated cloud

  • Dedicated Regions

Run IT for a lot less with ease

Save money without inflexible up-front commitments. In contrast to others, OCI has the same low price in all cloud regions, including government and dedicated regions. We offer innovative features, such as flexible compute, auto-tuning storage, up to 10X lower data egress fees, and free tools and reports to help eliminate needless overpayment and billing surprises. Committed use discounts, software license portability, and loyalty rewards are also available.
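To see why egress pricing differences matter, a little arithmetic helps. The sketch below compares the monthly cost of moving data out of two hypothetical clouds; the per-GB rates are illustrative placeholders chosen to show a roughly 10x gap, not current list prices from any provider.

    # Illustrative arithmetic only: the per-GB rates below are hypothetical
    # placeholders, not actual list prices.
    egress_tb_per_month = 50
    gb = egress_tb_per_month * 1024

    rate_provider_a = 0.09    # hypothetical $/GB
    rate_provider_b = 0.0085  # hypothetical $/GB, roughly 10x lower

    cost_a = gb * rate_provider_a
    cost_b = gb * rate_provider_b
    print(f"A: ${cost_a:,.0f}/mo  B: ${cost_b:,.0f}/mo  ({cost_a / cost_b:.1f}x)")

At these assumed rates, 50 TB of monthly egress costs about $4,608 versus $435, which is the kind of gap the comparison below is pointing at.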

Pricing highlights (comparison as of April 9, 2023): lower cost for outbound bandwidth, better price-performance for compute, and base IOPS at half the price. Sources: OCI Network Pricing; AWS Compute and Network Pricing; OCI Compute Pricing; AWS Storage Pricing; OCI Storage Pricing.

Cloud infrastructure and applications everywhere

Oracle Cloud spans 48 interconnected geographic commercial and government cloud regions. Unlike other providers, each region offers a consistent set of more than 100 Oracle Cloud Infrastructure services, with consistent low pricing worldwide. For more complete support of customer cloud strategies, Oracle Cloud also offers a full suite of Oracle Cloud Applications and direct interconnection with Microsoft Azure.

Oracle data centers are distributed around the world, as summarized in the table below.

Oracle data centers by region

Region | Current regions | Regions coming soon | Azure Interconnect regions
North America | 13 | 0 | 4
South America | 5 | 0 | 1
Europe | 15 | 0 | 3
Middle East & Africa | 5 | 3 | 1
Asia Pacific | 10 | 0 | 3

Technology and industry solutions

Resolve your organization’s unique challenges, make data-driven decisions, and increase efficiency with OCI’s purpose-built solutions.


Financial services institutions benefit from OCI’s secure, reliable, and highly scalable multicloud platform, which offers robust services and an accelerated AI infrastructure with the broadest set of deployment options to meet your data residency requirements.

Meet all your business and technology objectives for an on-premises database cloud deployment with Oracle Exadata Cloud@Customer.

Benefit from a single unified platform to run every healthcare workload—from legacy applications to AI/ML-enabled services—to enhance the patient experience, decrease the cost of care, and enable precision medicine.

Realize the value of data—use all the available data sources to help improve patient outcomes and provide a seamless healthcare experience.

Many defense, intelligence community, and federal civilian agencies across the globe rely on Oracle Cloud for Government to modernize and innovate faster for better mission outcomes. Oracle Cloud provides world-class security and compliance, consistent high performance, and simple and predictable pricing.

Dedicated to serving governments and safeguarding global defense missions at hyperscale, this innovative cloud solution includes a fully integrated infrastructure with IaaS/PaaS/SaaS and industry-leading price-performance.

Oracle Cloud for telcos is a comprehensive set of cloud solutions built on OCI to help telcos become more agile, reduce capital investments and operating costs, and establish a flexible foundation for innovation.

Use Oracle’s database clustering and massive I/O to maintain near zero-downtime environments while improving performance by 50% or more compared with on-premises or other clouds.

Get started with OCI

  • Free OCI services
  • Hands-on labs
  • Reference architectures
  • Events and webinars

Contact sales

Try 20+ always free cloud services, with a 30-day trial for even more.

Oracle offers a Free Tier with no time limits on more than 20 services such as Autonomous Database, Arm Compute, and Storage, as well as US$300 in free credits to try additional cloud services. Get the details and sign up for your free account today.

What’s included with Oracle Cloud Free Tier?

  • 2 Autonomous Databases, 20 GB each
  • AMD and Arm Compute VMs
  • 200 GB total block storage
  • 10 GB object storage
  • 10 TB outbound data transfer per month
  • 10+ more Always Free services
  • US$300 in free credits for 30 days for even more

Learn with step-by-step guidance

Experience a wide range of OCI services through tutorials and hands-on labs. Whether you're a developer, admin, or analyst, we can help you see how OCI works. Many labs run on the Oracle Cloud Free Tier or an Oracle-provided free lab environment.

Get started with OCI core services

The labs in this workshop cover an introduction to Oracle Cloud Infrastructure (OCI) core services including virtual cloud networks (VCN) and compute and storage services.

Autonomous Database quick start

In this workshop, you’ll go through the steps to get started using Oracle Autonomous Database.

Build an app from a spreadsheet

This lab walks you through uploading a spreadsheet into an Oracle Database table, and then creating an application based on this new table.
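The lab itself uses a browser-based flow, but the same load can be done programmatically. Here is a minimal sketch with the python-oracledb driver that inserts rows from a CSV export of a spreadsheet into a table; the connection string, table, and column names are placeholders, and the target table is assumed to already exist.

    # Minimal sketch: load a CSV (spreadsheet export) into an Oracle table.
    # Requires: pip install oracledb
    import csv
    import oracledb

    conn = oracledb.connect(user="demo", password="...", dsn="localhost/freepdb1")
    cur = conn.cursor()

    with open("projects.csv", newline="") as f:
        rows = [(r["name"], int(r["budget"])) for r in csv.DictReader(f)]

    cur.executemany("INSERT INTO projects (name, budget) VALUES (:1, :2)", rows)
    conn.commit()
    print(f"inserted {len(rows)} rows")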

Deploy an HA application on OCI

In this lab you’ll deploy web servers on two compute instances in Oracle Cloud Infrastructure (OCI), configured in High Availability mode by using a Load Balancer.

Explore over 150 best practice designs

See how our architects and other customers deploy a wide range of workloads, from enterprise apps to HPC, from microservices to data lakes. Understand the best practices, hear from other customer architects in our Built & Deployed series, and even deploy many workloads with our "click to deploy" capability or do it yourself from our GitHub repo.

Popular architectures

  • Apache Tomcat with MySQL Database Service
  • Oracle Weblogic on Kubernetes with Jenkins
  • Machine-learning (ML) and AI environments
  • Tomcat on Arm with Oracle Autonomous Database
  • Log analysis with ELK Stack
  • HPC with OpenFOAM

Explore our informative events series featuring the latest announcements, customer conversations, product-specific insights, technical sessions, and hands-on labs.

Oracle Cloud Infrastructure events

Get more out of OCI with a wide range of live and on-demand events.

Oracle CloudWorld

Join us at Oracle’s new global customer and partner conference for cloud infrastructure and applications.

Interested in learning more about Oracle Cloud Infrastructure? Let one of our experts help.

They can answer questions like:

  • What workloads run best on OCI?
  • How do I get the most out of my overall Oracle investments?
  • How does OCI compare to other cloud computing providers?
  • How can OCI support your IaaS and PaaS goals?


Will Knight

Pocket-Sized AI Models Could Unlock a New Era of Computing


When ChatGPT was released in November 2022, it could only be accessed through the cloud because the model behind it was downright enormous.

Today I am running a similarly capable AI program on a MacBook Air, and it isn’t even warm. The shrinkage shows how rapidly researchers are refining AI models to make them leaner and more efficient. It also shows how going to ever larger scales isn’t the only way to make machines significantly smarter.

The model now infusing my laptop with ChatGPT-like wit and wisdom is called Phi-3-mini. It’s part of a family of smaller AI models recently released by researchers at Microsoft. Although it’s compact enough to run on a smartphone, I tested it by running it on a laptop and accessing it from an iPhone through an app called Enchanted that provides a chat interface similar to the official ChatGPT app.
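For readers who want to try the same thing, a common setup is a local inference server such as Ollama, which chat apps like Enchanted can connect to. The sketch below assumes Ollama is running on its default port with a Phi-3 model pulled under the tag "phi3"; both the port and the model tag are assumptions about a typical local setup, not details from the article.

    # Minimal sketch: query a locally served small model through Ollama's
    # HTTP API. Assumes `ollama` is running locally with a "phi3" model pulled.
    import json
    import urllib.request

    payload = {
        "model": "phi3",
        "prompt": "In one sentence, why do on-device models reduce latency?",
        "stream": False,
    }
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["response"])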

In a paper describing the Phi-3 family of models, Microsoft’s researchers say the model I used measures up favorably to GPT-3.5, the OpenAI model behind the first release of ChatGPT. That claim is based on measuring its performance on several standard AI benchmarks designed to measure common sense and reasoning. In my own testing, it certainly seems just as capable.

Microsoft announced a new “multimodal” Phi-3 model capable of handling audio, video, and text at its annual developer conference, Build, this week. That came just days after OpenAI and Google both touted radical new AI assistants built on top of multimodal models accessed via the cloud.

Microsoft’s Lilliputian family of AI models suggests it’s becoming possible to build all kinds of handy AI apps that don’t depend on the cloud. That could open up new use cases by allowing such apps to be more responsive and private. (Offline algorithms are a key piece of the Recall feature Microsoft announced, which uses AI to make everything you ever did on your PC searchable.) But the Phi family also reveals something about the nature of modern AI, and perhaps how it can be improved. Sébastien Bubeck, a researcher at Microsoft involved with the project, tells me the models were built to test whether being more selective about what an AI system is trained on could provide a way to fine-tune its abilities.

The large language models like OpenAI’s GPT-4 or Google’s Gemini that power chatbots and other services are typically spoon-fed huge gobs of text siphoned from books, websites, and just about any other accessible source. Although it’s raised legal questions, OpenAI and others have found that increasing the amount of text fed to these models, and the amount of computer power used to train them, can unlock new capabilities.


Bubeck, who is interested in the nature of the “intelligence” exhibited by language models, decided to see if carefully curating the data fed to a model could improve its abilities without having to balloon its training data.

Last September, his team took a model roughly one-seventeenth the size of OpenAI’s GPT-3.5 and trained it on “textbook quality” synthetic data generated by a larger AI model, including facts from specific domains such as programming. The resulting model displayed surprising abilities for its size. “Lo and behold, what we observed is that we were able to beat GPT-3.5 at coding using this technique,” he says. “That was really surprising to us.”
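Microsoft hasn’t published its exact pipeline here, but one common form of this kind of curation is to score candidate documents with a small quality classifier and keep only the high scorers. The sketch below is purely illustrative: the seed examples, labels, and threshold are made up, and a real pipeline would use far more data and a stronger model.

    # Illustrative sketch of data curation via a small quality classifier
    # (not Microsoft's actual method): label a few seed examples, train a
    # classifier, then keep only documents that score as "textbook-like".
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    seed_texts = [
        "A function maps each input to exactly one output.",        # textbook-like
        "Sorting algorithms differ in time and space complexity.",  # textbook-like
        "CLICK HERE!!! best deals wow",                             # low quality
        "asdf lol idk random words here",                           # low quality
    ]
    seed_labels = [1, 1, 0, 0]

    vec = TfidfVectorizer()
    clf = LogisticRegression().fit(vec.fit_transform(seed_texts), seed_labels)

    corpus = [
        "Sorting and time complexity are discussed for each algorithm.",
        "CLICK HERE wow best random deals",
    ]
    scores = clf.predict_proba(vec.transform(corpus))[:, 1]
    curated = [doc for doc, score in zip(corpus, scores) if score > 0.5]
    print(curated)  # keeps only the textbook-like document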

Bubeck’s group at Microsoft has made other discoveries using this approach. One experiment showed that feeding an extra-tiny model children’s stories allowed it to produce consistently coherent output, even though AI programs of this size typically produce gibberish when trained the conventional way. Once again, the result suggests you can make seemingly underpowered AI software useful if you educate it with the right material.

Bubeck says these results seem to indicate that making future AI systems smarter will require more than just scaling them up to still greater sizes. And it also seems likely that scaled-down models like Phi-3 will be an important feature of the future of computing. Running AI models “locally” on a smartphone, laptop, or PC reduces the latency or outages that can occur when queries have to be fed into the cloud. It guarantees that your data stays on your device and could unlock entirely new use cases for AI not possible under the cloud-centric model, such as AI apps deeply integrated into a device’s operating system.

Apple is widely expected to unveil its long-awaited AI strategy at its WWDC conference next month, and it has previously boasted that its custom hardware and software allows machine learning to happen locally on its devices. Rather than go toe-to-toe with OpenAI and Google in building ever more enormous cloud AI models, it might think different by focusing on shrinking AI down to fit into its customers’ pockets.


  1. Research Note Cloud computing research: A review of research themes, frameworks, methods and future research directions

Cloud computing research started to gain recognition around 2009 and has seen a considerable rise over the years. From 6 journal articles in 2009, cloud computing research continues to rise yearly, with over 200 journal articles currently. We predict that more studies will be conducted on cloud computing in the coming years.

  2. Home page

    The Journal of Cloud Computing: Advances, Systems and Applications (JoCCASA) will publish research articles on all aspects of Cloud Computing. Principally, articles will address topics that are core to Cloud Computing, focusing on the Cloud applications, the Cloud systems, and the advances that will lead to the Clouds of the future.

  3. Cloud computing research: A review of research themes, frameworks

Thus, this paper provides a summative meta-analysis of cloud computing research from 2009 to 2015, with the aim of taking stock and providing insights into the theoretical frameworks and models, research methodologies, geographical focus, and trends of cloud computing research over these years. The rest of the paper is organized as follows.

  4. 40 cloud computing stats and trends to know in 2023

    1. IT leaders are looking to the cloud to help prepare for whatever lies ahead. Due to the current macroeconomic climate, cloud leaders say they are increasing their use of cloud-based services and products (41.4%), planning to migrate from legacy enterprise software to cloud-based tools (33.4%), and migrating on-premises workloads to the cloud (32.8%).

  5. Articles

    With the rapid development of cloud computing technology, cloud storage services are becoming more and more mature. However, the storage of sensitive data on remote servers poses privacy risks and is presently... Danial Shiraly, Ziba Eslami and Nasrollah Pakniat. Journal of Cloud Computing 2024 13 :112. Research Published on: 26 May 2024.

  6. Cloud services selection: A systematic review and future research

1. Introduction. Cloud computing is a paradigm that provides on-demand computing resources such as storage, network, servers, databases, platforms, and so on as a service to its users via the internet, using a pay-as-you-go model [1], [2]. Cloud computing has changed the way Micro, Small, and Medium Enterprises provide and use computing services.

  7. Future of cloud computing: 5 insights from new global research

    Here are five themes that stood out to us from this brand-new research. 1. Cloud computing will move to the forefront of enterprise technology over the next decade, backed by strong executive support. Globally, 47 percent of survey participants said that the majority of their companies' IT infrastructures already use public or private cloud ...

  8. cloud computing Latest Research Papers

The paper further compares and reviews different layout models for the discovery, selection, and composition of services in cloud computing. Recent research trends in service composition are identified, and research on microservices is evaluated and presented in the form of tables and graphs.

  9. A Framework for the Interoperability of Cloud Platforms ...

    As the number of cloud platforms supporting scientific research grows, there is an increasing need to support interoperability between two or more cloud platforms. A well accepted core concept is ...

  10. Cloud Computing: A Systematic Literature Review and Future Agenda

The review is thought to inspire enterprises and managers that would like to use cloud computing in terms of the scope, solution methods, factors, dimensions, and the results achieved in a holistic ...

  11. Systematic Literature Review of Cloud Computing Research ...

We present a meta-analysis of cloud computing research in information systems. The study includes 152 referenced journal articles published between January 2010 and June 2023. We take stock of the literature and the associated research themes, research frameworks, the employed research methodology, and the geographical distribution of the articles.

  12. What Is Cloud Computing?

    Cloud computing is the on-demand access of computing resources—physical servers or virtual servers, data storage, networking capabilities, application development tools, software, AI-powered analytic tools and more—over the internet with pay-per-use pricing. The cloud computing model offers customers greater flexibility and scalability ...

  13. Top 10 Cloud Computing Research Topics in 2020

    We've compiled 15 important cloud computing research topics that are changing how cloud computing is used. 1. Big Data. Big data refers to the large amounts of data produced by various programs in a very short duration of time. It is quite cumbersome to store such huge and voluminous amounts of data in company-run data centers.

  14. Securing Machine Learning in the Cloud: A Systematic Review of Cloud

    With the advances in machine learning (ML) and deep learning (DL) techniques, and the potency of cloud computing in offering services efficiently and cost-effectively, Machine Learning as a Service (MLaaS) cloud platforms have become popular. In addition, there is increasing adoption of third-party cloud services for outsourcing training of DL models, which requires substantial costly ...

  15. Adoption of cloud computing as innovation in the organization

Finally, we investigate future research directions for cloud computing and expand this paper into further articles with experiments and results. Introduction. Cloud computing makes data processing more efficient across multiple computing and storage systems, with access provided through the internet. With the new inventive and ...

  16. 12 Latest Cloud Computing Research Topics

Cloud computing is gaining popularity and demand in the market and is being implemented in many organizations very quickly. One of the major barriers for the cloud is a real and perceived lack of security. There are many cloud computing research topics that can be pursued to produce fruitful results. In this tutorial, we are going to discuss 12 of the latest cloud computing research topics.

  17. What is Cloud Computing?

    Cloud computing defined. Cloud computing is the on-demand availability of computing resources (such as storage and infrastructure), as services over the internet. It eliminates the need for individuals and businesses to self-manage physical resources themselves, and only pay for what they use. The main cloud computing service models include ...

  18. What is cloud computing? Everything you need to know about the cloud

    What is cloud computing, in simple terms? Cloud computing is the delivery of computing services—including servers, storage, databases, networking, software, analytics, and intelligence—over ...

  19. Next generation cloud computing: New trends and research directions

However, incorporating resilient computing into distributed cloud applications remains challenging, still requires significant programming effort, and is an open area of research [122]. Notwithstanding, disaster recovery is an expensive operation and is required as a service to minimise recovery time and costs after a failure has occurred ...

  20. Top 10 Cloud Computing Research Topics of 2024

4. Blockchain data-based cloud data integrity protection mechanism. The "Blockchain data-based cloud data integrity protection mechanism" paper suggests a method for safeguarding the integrity of cloud data, one of the notable cloud computing research topics. To store and process massive amounts of data, cloud computing has grown ...

  21. Cloud Computing

Annual public cloud revenue reaches half a trillion dollars. Worldwide spending on public cloud services reached $545.8 billion in 2022, new data from IDC says. (Jon Gold, Jul 07, 2023)

  22. Cloud computing: state-of-the-art and research challenges

    Cloud computing has recently emerged as a new paradigm for hosting and delivering services over the Internet. Cloud computing is attractive to business owners as it eliminates the requirement for users to plan ahead for provisioning, and allows enterprises to start from the small and increase resources only when there is a rise in service demand.

  23. Making Advanced Research Computing More Accessible With NIH Cloud Lab

    A key focus of mine is the advancement of collaborative, data-driven research in "the cloud", and my colleagues and I are determined to make access to and expertise in cloud computing more widely available across the NIH research community. Researchers today often face barriers when adopting and integrating cloud-based technologies.

  24. The Rise of Cloud Computing: Data Protection, Privacy, and Open

From 2010 to 2021, we found 52 papers in top-ranked journals and conferences. From 2010 to 2017, work on cloud computing grew linearly; after 2017, a lot of work was done on cloud computing data security. From 2018 to 2021, 37 papers were published. After 2018, the trend toward data security in cloud computing increased very ...

  25. (PDF) A COMPREHENSIVE STUDY ON CLOUD COMPUTING

A Comprehensive Study on Cloud Computing Paradigm. Ab Rashid Dar and D. Ravindran, Department of Computer Science, St. Joseph's College (Autonomous), Tiruchirappalli, Tamil Nadu, India ...

  26. Cloud Is Having A Moment—Why Now?

    This has led to an even greater level of flexibility and customization, generating opportunities to leverage analytics and AI for a growing slate of individualized use cases. The cloud is having a ...

  27. Amazon Maintains Cloud Lead as Microsoft Edges Closer

    Meanwhile, Amazon's main rival Microsoft slowly edges closer, growing the market share of its Azure platform to an all-time high of 25 percent in Q1 2024. Combined with Google at 11 percent market ...

  28. McKinsey Technology Trends Outlook 2023

    Technologies such as cloud and edge computing and the future of bioengineering have shown steady increases in innovation and continue to have expanded use cases across industries. In fact, more than 400 edge use cases across various industries have been identified, and edge computing is projected to win double-digit growth globally over the ...

  29. Cloud Infrastructure

    Complete cloud infrastructure and platform services for every workload. OCI offers a common set of 100+ services in each cloud region. Get all the services you need—from containers and VMware to AI—to migrate, modernize, build, and scale your IT. Automate all your workloads, including both existing and new applications and data platforms.

  30. Pocket-Sized AI Models Could Unlock a New Era of Computing

    Research at Microsoft shows it's possible to make AI models small enough to run on phones or laptops without major compromises to their smarts. The technique could open up new use cases for AI.