
Data Collection – Methods, Types and Examples


Data Collection

Definition:

Data collection is the process of gathering and collecting information from various sources to analyze and make informed decisions based on the data collected. This can involve various methods, such as surveys, interviews, experiments, and observation.

In order for data collection to be effective, it is important to have a clear understanding of what data is needed and what the purpose of the data collection is. This can involve identifying the population or sample being studied, determining the variables to be measured, and selecting appropriate methods for collecting and recording data.

Types of Data Collection

Types of Data Collection are as follows:

Primary Data Collection

Primary data collection is the process of gathering original and firsthand information directly from the source or target population. This type of data collection involves collecting data that has not been previously gathered, recorded, or published. Primary data can be collected through various methods such as surveys, interviews, observations, experiments, and focus groups. The data collected is usually specific to the research question or objective and can provide valuable insights that cannot be obtained from secondary data sources. Primary data collection is often used in market research, social research, and scientific research.

Secondary Data Collection

Secondary data collection is the process of gathering information from existing sources that have already been collected and analyzed by someone else, rather than conducting new research to collect primary data. Secondary data can be collected from various sources, such as published reports, books, journals, newspapers, websites, government publications, and other documents.

Qualitative Data Collection

Qualitative data collection is used to gather non-numerical data such as opinions, experiences, perceptions, and feelings, through techniques such as interviews, focus groups, observations, and document analysis. It seeks to understand the deeper meaning and context of a phenomenon or situation and is often used in social sciences, psychology, and humanities. Qualitative data collection methods allow for a more in-depth and holistic exploration of research questions and can provide rich and nuanced insights into human behavior and experiences.

Quantitative Data Collection

Quantitative data collection is used to gather numerical data that can be analyzed using statistical methods. This data is typically collected through surveys, experiments, and other structured data collection methods. Quantitative data collection seeks to quantify and measure variables, such as behaviors, attitudes, and opinions, in a systematic and objective way. This data is often used to test hypotheses, identify patterns, and establish correlations between variables. Quantitative data collection methods allow for precise measurement and generalization of findings to a larger population. It is commonly used in fields such as economics, psychology, and natural sciences.

Data Collection Methods

Data Collection Methods are as follows:

Surveys

Surveys involve asking questions to a sample of individuals or organizations to collect data. Surveys can be conducted in person, over the phone, or online.

Interviews

Interviews involve a one-on-one conversation between the interviewer and the respondent. Interviews can be structured or unstructured and can be conducted in person or over the phone.

Focus Groups

Focus groups are group discussions that are moderated by a facilitator. Focus groups are used to collect qualitative data on a specific topic.

Observation

Observation involves watching and recording the behavior of people, objects, or events in their natural setting. Observation can be done overtly or covertly, depending on the research question.

Experiments

Experiments involve manipulating one or more variables and observing the effect on another variable. Experiments are commonly used in scientific research.

Case Studies

Case studies involve in-depth analysis of a single individual, organization, or event. Case studies are used to gain detailed information about a specific phenomenon.

Secondary Data Analysis

Secondary data analysis involves using existing data that was collected for another purpose. Secondary data can come from various sources, such as government agencies, academic institutions, or private companies.

How to Collect Data

The following are some steps to consider when collecting data:

  • Define the objective : Before you start collecting data, you need to define the objective of the study. This will help you determine what data you need to collect and how to collect it.
  • Identify the data sources : Identify the sources of data that will help you achieve your objective. These sources can be primary sources, such as surveys, interviews, and observations, or secondary sources, such as books, articles, and databases.
  • Determine the data collection method : Once you have identified the data sources, you need to determine the data collection method. This could be through online surveys, phone interviews, or face-to-face meetings.
  • Develop a data collection plan : Develop a plan that outlines the steps you will take to collect the data. This plan should include the timeline, the tools and equipment needed, and the personnel involved.
  • Test the data collection process: Before you start collecting data, test the data collection process to ensure that it is effective and efficient.
  • Collect the data: Collect the data according to the plan you developed. Make sure you record the data accurately and consistently.
  • Analyze the data: Once you have collected the data, analyze it to draw conclusions and make recommendations.
  • Report the findings: Report the findings of your data analysis to the relevant stakeholders. This could be in the form of a report, a presentation, or a publication.
  • Monitor and evaluate the data collection process: After the data collection process is complete, monitor and evaluate the process to identify areas for improvement in future data collection efforts.
  • Ensure data quality: Ensure that the collected data is of high quality and free from errors. This can be achieved by validating the data for accuracy, completeness, and consistency, as shown in the sketch after this list.
  • Maintain data security: Ensure that the collected data is secure and protected from unauthorized access or disclosure. This can be achieved by implementing data security protocols and using secure storage and transmission methods.
  • Follow ethical considerations: Follow ethical considerations when collecting data, such as obtaining informed consent from participants, protecting their privacy and confidentiality, and ensuring that the research does not cause harm to participants.
  • Use appropriate data analysis methods : Use appropriate data analysis methods based on the type of data collected and the research objectives. This could include statistical analysis, qualitative analysis, or a combination of both.
  • Record and store data properly: Record and store the collected data properly, in a structured and organized format. This will make it easier to retrieve and use the data in future research or analysis.
  • Collaborate with other stakeholders : Collaborate with other stakeholders, such as colleagues, experts, or community members, to ensure that the data collected is relevant and useful for the intended purpose.
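
To make the data-quality step above concrete, here is a minimal sketch of automated validation using Python and pandas. The file name, column names, and the 1-5 rating range are hypothetical assumptions, not prescribed by any particular method.

```python
import pandas as pd

# Hypothetical survey export; the file and column names are illustrative only.
df = pd.read_csv("survey_responses.csv")

# Completeness: count missing values per column.
print("missing values per column:")
print(df.isna().sum())

# Accuracy: flag responses outside the expected 1-5 rating scale.
out_of_range = df[~df["satisfaction"].between(1, 5)]
print("out-of-range ratings:", len(out_of_range))

# Consistency: detect duplicate respondent IDs.
duplicates = int(df["respondent_id"].duplicated().sum())
print("duplicate respondent IDs:", duplicates)
```

Checks like these can run after each collection batch, so errors are caught before analysis begins.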

Applications of Data Collection

Data collection methods are widely used in different fields, including social sciences, healthcare, business, education, and more. Here are some examples of how data collection methods are used in different fields:

  • Social sciences : Social scientists often use surveys, questionnaires, and interviews to collect data from individuals or groups. They may also use observation to collect data on social behaviors and interactions. This data is often used to study topics such as human behavior, attitudes, and beliefs.
  • Healthcare : Data collection methods are used in healthcare to monitor patient health and track treatment outcomes. Electronic health records and medical charts are commonly used to collect data on patients’ medical history, diagnoses, and treatments. Researchers may also use clinical trials and surveys to collect data on the effectiveness of different treatments.
  • Business : Businesses use data collection methods to gather information on consumer behavior, market trends, and competitor activity. They may collect data through customer surveys, sales reports, and market research studies. This data is used to inform business decisions, develop marketing strategies, and improve products and services.
  • Education : In education, data collection methods are used to assess student performance and measure the effectiveness of teaching methods. Standardized tests, quizzes, and exams are commonly used to collect data on student learning outcomes. Teachers may also use classroom observation and student feedback to gather data on teaching effectiveness.
  • Agriculture : Farmers use data collection methods to monitor crop growth and health. Sensors and remote sensing technology can be used to collect data on soil moisture, temperature, and nutrient levels. This data is used to optimize crop yields and minimize waste.
  • Environmental sciences : Environmental scientists use data collection methods to monitor air and water quality, track climate patterns, and measure the impact of human activity on the environment. They may use sensors, satellite imagery, and laboratory analysis to collect data on environmental factors.
  • Transportation : Transportation companies use data collection methods to track vehicle performance, optimize routes, and improve safety. GPS systems, on-board sensors, and other tracking technologies are used to collect data on vehicle speed, fuel consumption, and driver behavior.

Examples of Data Collection

Examples of Data Collection are as follows:

  • Traffic Monitoring: Cities collect real-time data on traffic patterns and congestion through sensors on roads and cameras at intersections. This information can be used to optimize traffic flow and improve safety.
  • Social Media Monitoring : Companies can collect real-time data on social media platforms such as Twitter and Facebook to monitor their brand reputation, track customer sentiment, and respond to customer inquiries and complaints in real-time.
  • Weather Monitoring: Weather agencies collect real-time data on temperature, humidity, air pressure, and precipitation through weather stations and satellites. This information is used to provide accurate weather forecasts and warnings.
  • Stock Market Monitoring : Financial institutions collect real-time data on stock prices, trading volumes, and other market indicators to make informed investment decisions and respond to market fluctuations in real-time.
  • Health Monitoring : Medical devices such as wearable fitness trackers and smartwatches can collect real-time data on a person’s heart rate, blood pressure, and other vital signs. This information can be used to monitor health conditions and detect early warning signs of health issues.

Purpose of Data Collection

The purpose of data collection can vary depending on the context and goals of the study, but generally, it serves to:

  • Provide information: Data collection provides information about a particular phenomenon or behavior that can be used to better understand it.
  • Measure progress : Data collection can be used to measure the effectiveness of interventions or programs designed to address a particular issue or problem.
  • Support decision-making : Data collection provides decision-makers with evidence-based information that can be used to inform policies, strategies, and actions.
  • Identify trends : Data collection can help identify trends and patterns over time that may indicate changes in behaviors or outcomes.
  • Monitor and evaluate : Data collection can be used to monitor and evaluate the implementation and impact of policies, programs, and initiatives.

When to use Data Collection

Data collection is used when there is a need to gather information or data on a specific topic or phenomenon. It is typically used in research, evaluation, and monitoring and is important for making informed decisions and improving outcomes.

Data collection is particularly useful in the following scenarios:

  • Research : When conducting research, data collection is used to gather information on variables of interest to answer research questions and test hypotheses.
  • Evaluation : Data collection is used in program evaluation to assess the effectiveness of programs or interventions, and to identify areas for improvement.
  • Monitoring : Data collection is used in monitoring to track progress towards achieving goals or targets, and to identify any areas that require attention.
  • Decision-making: Data collection is used to provide decision-makers with information that can be used to inform policies, strategies, and actions.
  • Quality improvement : Data collection is used in quality improvement efforts to identify areas where improvements can be made and to measure progress towards achieving goals.

Characteristics of Data Collection

Data collection can be characterized by several important characteristics that help to ensure the quality and accuracy of the data gathered. These characteristics include:

  • Validity : Validity refers to the accuracy and relevance of the data collected in relation to the research question or objective.
  • Reliability : Reliability refers to the consistency and stability of the data collection process, ensuring that the results obtained are consistent over time and across different contexts.
  • Objectivity : Objectivity refers to the impartiality of the data collection process, ensuring that the data collected is not influenced by the biases or personal opinions of the data collector.
  • Precision : Precision refers to the degree of accuracy and detail in the data collected, ensuring that the data is specific and accurate enough to answer the research question or objective.
  • Timeliness : Timeliness refers to the efficiency and speed with which the data is collected, ensuring that the data is collected in a timely manner to meet the needs of the research or evaluation.
  • Ethical considerations : Ethical considerations refer to the ethical principles that must be followed when collecting data, such as ensuring confidentiality and obtaining informed consent from participants.

Advantages of Data Collection

There are several advantages of data collection that make it an important process in research, evaluation, and monitoring. These advantages include:

  • Better decision-making : Data collection provides decision-makers with evidence-based information that can be used to inform policies, strategies, and actions, leading to better decision-making.
  • Improved understanding: Data collection helps to improve our understanding of a particular phenomenon or behavior by providing empirical evidence that can be analyzed and interpreted.
  • Evaluation of interventions: Data collection is essential in evaluating the effectiveness of interventions or programs designed to address a particular issue or problem.
  • Identifying trends and patterns: Data collection can help identify trends and patterns over time that may indicate changes in behaviors or outcomes.
  • Increased accountability: Data collection increases accountability by providing evidence that can be used to monitor and evaluate the implementation and impact of policies, programs, and initiatives.
  • Validation of theories: Data collection can be used to test hypotheses and validate theories, leading to a better understanding of the phenomenon being studied.
  • Improved quality: Data collection is used in quality improvement efforts to identify areas where improvements can be made and to measure progress towards achieving goals.

Limitations of Data Collection

While data collection has several advantages, it also has some limitations that must be considered. These limitations include:

  • Bias : Data collection can be influenced by the biases and personal opinions of the data collector, which can lead to inaccurate or misleading results.
  • Sampling bias : Data collection may not be representative of the entire population, resulting in sampling bias and inaccurate results.
  • Cost : Data collection can be expensive and time-consuming, particularly for large-scale studies.
  • Limited scope: Data collection is limited to the variables being measured, which may not capture the entire picture or context of the phenomenon being studied.
  • Ethical considerations : Data collection must follow ethical principles to protect the rights and confidentiality of the participants, which can limit the type of data that can be collected.
  • Data quality issues: Data collection may result in data quality issues such as missing or incomplete data, measurement errors, and inconsistencies.
  • Limited generalizability : Data collection may not be generalizable to other contexts or populations, limiting the generalizability of the findings.


Data collection in research: Your complete guide


In the late 16th century, Francis Bacon coined the phrase "knowledge is power," which implies that knowledge is a powerful force, like physical strength. In the 21st century, knowledge in the form of data is unquestionably powerful.

But data isn't something you just have - you need to collect it. This means utilizing a data collection process and turning the collected data into knowledge that you can leverage into a successful strategy for your business or organization.

Believe it or not, there's more to data collection than just conducting a Google search. In this complete guide, we shine a spotlight on data collection, outlining what it is, types of data collection methods, common challenges in data collection, data collection techniques, and the steps involved in data collection.


What is data collection?

There are two specific data collection techniques: primary and secondary data collection. Primary data collection is the process of gathering data directly from sources. It's often considered the most reliable data collection method, as researchers can collect information directly from respondents.

Secondary data collection is data that has already been collected by someone else and is readily available. This data is usually less expensive and quicker to obtain than primary data.

What are the different methods of data collection?

There are several data collection methods, which can be either manual or automated. Manual data collection typically relies on pen and paper, while automated data collection uses software to collect data from online sources, such as social media, website data, transaction data, etc.

Here are the five most popular methods of data collection:

Surveys

Surveys are a very popular method of data collection that organizations can use to gather information from many people. Researchers can conduct multi-mode surveys that reach respondents in different ways, including in person, by mail, over the phone, or online.

As a method of data collection, surveys have several advantages. For instance, they are relatively quick and easy to administer, you can be flexible in what you ask, and they can be tailored to collect data on various topics or from certain demographics.

However, surveys also have several disadvantages. For instance, they can be expensive to administer, and the results may not represent the population as a whole. Additionally, survey data can be challenging to interpret. It may also be subject to bias if the questions are not well-designed or if the sample of people surveyed is not representative of the population of interest.

Interviews

Interviews are a common method of collecting data in social science research. You can conduct interviews in person, over the phone, or even via email or online chat.

Interviews are a great way to collect qualitative and quantitative data. Qualitative interviews are likely your best option if you need to collect detailed information about your subjects' experiences or opinions. If you need to collect more generalized data about your subjects' demographics or attitudes, then quantitative interviews may be a better option.

Interviews are relatively quick and very flexible, allowing you to ask follow-up questions and explore topics in more depth. The downside is that interviews can be time-consuming and expensive due to the amount of information to be analyzed. They are also prone to bias, as both the interviewer and the respondent may have certain expectations or preconceptions that may influence the data.

Direct observation

Observation is a direct way of collecting data. It can be structured (with a specific protocol to follow) or unstructured (simply observing without a particular plan).

Organizations and businesses use observation as a data collection method to gather information about their target market, customers, or competition. Businesses can learn about consumer behavior, preferences, and trends by observing people using their products or service.

There are two types of observation: participatory and non-participatory. In participatory observation, the researcher is actively involved in the observed activities. This type of observation is used in ethnographic research, where the researcher wants to understand a group's culture and social norms. Non-participatory observation is when researchers observe from a distance and do not interact with the people or environment they are studying.

There are several advantages to using observation as a data collection method. It can provide insights that may not be apparent through other methods, such as surveys or interviews. Researchers can also observe behavior in a natural setting, which can provide a more accurate picture of what people do and how and why they behave in a certain context.

There are some disadvantages to using observation as a method of data collection. It can be time-consuming, intrusive, and expensive to observe people for extended periods. Observations can also be tainted if the researcher is not careful to avoid personal biases or preconceptions.

Automated data collection

Business applications and websites are increasingly collecting data electronically to improve the user experience or for marketing purposes.

There are a few different ways that organizations can collect data automatically. One way is through cookies, which are small pieces of data stored on a user's computer. They track a user's browsing history and activity on a site, measuring levels of engagement with a business’s products or services, for example.
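
As a rough illustration of the cookie mechanism just described, here is a minimal sketch using Flask; the framework choice, cookie name, and expiry are assumptions for demonstration, not details from the original text.

```python
from flask import Flask, request, make_response

app = Flask(__name__)

@app.route("/")
def index():
    # Read the visit counter from the request cookie, defaulting to 0.
    visits = int(request.cookies.get("visit_count", "0")) + 1

    resp = make_response(f"You have visited this page {visits} times.")
    # Store the updated counter in the user's browser for 30 days.
    resp.set_cookie("visit_count", str(visits), max_age=30 * 24 * 3600)
    return resp
```

A real deployment would also need user consent and a privacy policy covering this cookie, in line with the transparency point made later in this section.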

Another way organizations can collect data automatically is through web beacons. Web beacons are small images embedded on a web page to track a user's activity.

Finally, organizations can also collect data through mobile apps, which can track user location, device information, and app usage. This data can be used to improve the user experience and for marketing purposes.

Automated data collection is a valuable tool for businesses, helping improve the user experience or target marketing efforts. Businesses should aim to be transparent about how they collect and use this data.

Sourcing data through information service providers

Organizations need to be able to collect data from a variety of sources, including social media, weblogs, and sensors. The process to do this and then use the data for action needs to be efficient, targeted, and meaningful.

In the era of big data, organizations are increasingly turning to information service providers (ISPs) and other external data sources to help them collect data to make crucial decisions. 

Information service providers help organizations collect data by offering personalized services that suit the specific needs of the organizations. These services can include data collection, analysis, management, and reporting. By partnering with an ISP, organizations can gain access to the newest technology and tools to help them to gather and manage data more effectively.

There are also several tools and techniques that organizations can use to collect data from external sources, such as web scraping, which collects data from websites, and data mining, which involves using algorithms to extract data from large data sets. 

Organizations can also use APIs (application programming interface) to collect data from external sources. APIs allow organizations to access data stored in another system and share and integrate it into their own systems.
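
For instance, here is a minimal sketch of collecting records from an external source over an API with Python's requests library. The endpoint URL, parameters, and response field names are hypothetical placeholders.

```python
import requests

# Hypothetical REST endpoint; replace with a real data provider's API.
URL = "https://api.example.com/v1/mentions"

resp = requests.get(URL, params={"query": "acme", "limit": 100}, timeout=10)
resp.raise_for_status()  # fail loudly on HTTP errors

records = resp.json()  # assumes the API returns a JSON list of records
for record in records:
    # Field names are illustrative; check the provider's documentation.
    print(record.get("timestamp"), record.get("text"))
```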

Finally, organizations can also use manual methods to collect data from external sources. This can involve contacting companies or individuals directly to request data, using the right tools and methods to get the insights they need.

What are common challenges in data collection?

There are many challenges that researchers face when collecting data. Here are five common examples:

Big data environments

Data collection can be a challenge in big data environments for several reasons. First, the data can be located in different places, such as archives, libraries, or online, and the sheer volume of data can make it difficult to identify the most relevant data sets.

Second, the complexity of data sets can make it challenging to extract the desired information. Third, the distributed nature of big data environments can make it difficult to collect data promptly and efficiently.

It is therefore important to have a well-designed data collection strategy that considers the specific needs of the organization and which data sets are the most relevant. Alongside this, consideration should be given to the tools and resources available to support data collection and to protect the data from unintended use.

Data bias

Data bias is a common challenge in data collection. It occurs when data is collected from a sample that is not representative of the population of interest.

There are different types of data bias, but some common ones include selection bias, self-selection bias, and response bias. Selection bias can occur when the collected data does not represent the population being studied. For example, if a study only includes data from people who volunteer to participate, that data may not represent the general population.

Self-selection bias can also occur when people self-select into a study, such as by taking part only if they think they will benefit from it. Response bias happens when people respond in a way that is not honest or accurate, such as by only answering questions that make them look good. 

These types of data bias present a challenge because they can lead to inaccurate results and conclusions about behaviors, perceptions, and trends. Data bias can be avoided by identifying potential sources or themes of bias and setting guidelines for eliminating them.
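
To see how self-selection bias distorts results, here is a small simulation sketch (an illustration constructed for this guide, not taken from any study): it assumes more satisfied customers are more likely to respond, so the surveyed sample overstates true satisfaction.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# True population: satisfaction scores on a 1-5 scale, uniformly distributed.
population = rng.integers(1, 6, size=100_000)

# Assumption: the probability of responding rises with satisfaction,
# so a score of 5 is five times as likely to respond as a score of 1.
weights = population / population.sum()
sample = rng.choice(population, size=1_000, p=weights)

print("true population mean:     ", round(population.mean(), 2))  # ~3.0
print("self-selected sample mean:", round(sample.mean(), 2))      # biased high
```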

Lack of quality assurance processes

One of the biggest challenges in data collection is the lack of quality assurance processes. This can lead to several problems, including incorrect data, missing data, and inconsistencies between data sets.

Quality assurance is important because there are many data sources, and each source may have different levels of quality or corruption. There are also different ways of collecting data, and data quality may vary depending on the method used. 

There are several ways to improve quality assurance in data collection. These include developing clear and consistent goals and guidelines for data collection, implementing quality control measures, using standardized procedures, and employing data validation techniques. By taking these steps, you can ensure that your data is of adequate quality to inform decision-making.

Limited access to data

Another challenge in data collection is limited access to data. This can be due to several reasons, including privacy concerns, the sensitive nature of the data, security concerns, or simply the fact that data is not readily available.

Legal and compliance regulations

Most countries have regulations governing how data can be collected, used, and stored. In some cases, data collected in one country may not be used in another. This means gaining a global perspective can be a challenge. 

For example, if a company is required to comply with the EU General Data Protection Regulation (GDPR), it may not be able to collect data from individuals in the EU without their explicit consent. This can make it difficult to collect data from a target audience.

Legal and compliance regulations can be complex, and it's important to ensure that all data collected is done so in a way that complies with the relevant regulations.

What are the key steps in the data collection process?

There are five steps involved in the data collection process. They are:

1. Decide what data you want to gather

Have a clear understanding of the questions you are asking, and then consider where the answers might lie and how you might obtain them. This saves time and resources by avoiding the collection of irrelevant data, and helps maintain the quality of your datasets. 

2. Establish a deadline for data collection

Establishing a deadline for data collection helps you avoid collecting too much data, which can be costly and time-consuming to analyze. It also allows you to plan for data analysis and prompt interpretation. Finally, it helps you meet your research goals and objectives and allows you to move forward.

3. Select a data collection approach

The data collection approach you choose will depend on different factors, including the type of data you need, available resources, and the project timeline. For instance, if you need qualitative data, you might choose a focus group or interview methodology. If you need quantitative data , then a survey or observational study may be the most appropriate form of collection.

4. Gather information

When collecting data for your business, identify your business goals first. Once you know what you want to achieve, you can start collecting data to reach those goals. The most important thing is to ensure that the data you collect is reliable and valid. Otherwise, any decisions you make using the data could result in a negative outcome for your business.

5. Examine the information and apply your findings

As a researcher, it's important to examine the data you're collecting and analyzing before you apply your findings. This is because data can be misleading, leading to inaccurate conclusions. Ask yourself: is the data what you were expecting? Is it similar to other datasets you have looked at?

There are many scientific ways to examine data, but some common methods include:

looking at the distribution of data points

examining the relationships between variables

looking for outliers

By taking the time to examine your data and noticing any patterns, strange or otherwise, you can avoid making mistakes that could invalidate your research.
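
Here is a minimal sketch of those three checks using Python and pandas; the file name and the assumption that the dataset has numeric columns are hypothetical.

```python
import pandas as pd

df = pd.read_csv("collected_data.csv")  # hypothetical dataset

# 1. Distribution of data points: summary statistics per numeric column.
print(df.describe())

# 2. Relationships between variables: pairwise correlations.
print(df.corr(numeric_only=True))

# 3. Outliers: values beyond 1.5 * IQR from the quartiles.
for col in df.select_dtypes("number"):
    q1, q3 = df[col].quantile([0.25, 0.75])
    iqr = q3 - q1
    outliers = df[(df[col] < q1 - 1.5 * iqr) | (df[col] > q3 + 1.5 * iqr)]
    print(col, "has", len(outliers), "potential outliers")
```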

How qualitative analysis software streamlines the data collection process

Knowledge derived from data does indeed carry power. However, if you don't convert the knowledge into action, it will remain a resource of unexploited energy and wasted potential.

Luckily, data collection tools enable organizations to streamline their data collection and analysis processes and leverage the derived knowledge to grow their businesses. For instance, qualitative analysis software can be highly advantageous in data collection by streamlining the process, making it more efficient and less time-consuming.

Qualitative analysis software also provides a structure for data collection and analysis, ensuring that data is of high quality. It can help to uncover patterns and relationships that would otherwise be difficult to discern. Moreover, you can use it to replace more expensive data collection methods, such as focus groups or surveys.

Overall, qualitative analysis software can be valuable for any researcher looking to collect and analyze data. By increasing efficiency, improving data quality, and providing greater insights, qualitative software can help to make the research process much more efficient and effective.


Data Collection: What It Is, Methods & Tools + Examples


Let’s face it, no one wants to make decisions based on guesswork or gut feelings. The most important objective of data collection is to ensure that the data gathered is reliable and packed to the brim with juicy insights that can be analyzed and turned into data-driven decisions. There’s nothing better than good statistical analysis.


Collecting high-quality data is essential for conducting market research, analyzing user behavior, or just trying to get a handle on business operations. With the right approach and a few handy tools, you can gather reliable and informative data.

So, let’s get ready to collect some data because when it comes to data collection, it’s all about the details.

Content Index

  • What is Data Collection?
  • Data Collection Methods
  • Data Collection Examples
  • Reasons to Conduct Online Research and Data Collection
  • Conducting Customer Surveys for Data Collection to Multiply Sales
  • Steps to Effectively Conduct an Online Survey for Data Collection
  • Survey Design for Data Collection

What is Data Collection?

Data collection is the procedure of collecting, measuring, and analyzing accurate insights for research using standard validated techniques.

Put simply, data collection is the process of gathering information for a specific purpose. It can be used to answer research questions, make informed business decisions, or improve products and services.

To collect data, we must first identify what information we need and how we will collect it. We can also evaluate a hypothesis based on collected data. In most cases, data collection is the primary and most important step for research. The approach to data collection is different for different fields of study, depending on the required information.


There are many ways to collect information when doing research. The data collection methods that the researcher chooses will depend on the research question posed. Some data collection methods include surveys, interviews, tests, physiological evaluations, observations, reviews of existing records, and biological samples. Let’s explore them.


Data Collection Methods

Phone vs. Online vs. In-Person Interviews

Essentially there are four choices for data collection – in-person interviews, mail, phone, and online. There are pros and cons to each of these modes.

  • In-person interviews – Pros: In-depth and a high degree of confidence in the data. Cons: Time-consuming, expensive, and can be dismissed as anecdotal.
  • Mail – Pros: Can reach anyone and everyone – no barrier. Cons: Expensive, data collection errors, lag time.
  • Phone – Pros: High degree of confidence in the data collected, reach almost anyone. Cons: Expensive, cannot self-administer, need to hire an agency.
  • Online – Pros: Cheap, can self-administer, very low probability of data errors. Cons: Not all your customers might have an email address/be on the internet, and customers may be wary of divulging information online.

In-person interviews are always better, but the big drawback is the trap you might fall into if you don’t do them regularly. It is expensive to conduct interviews regularly, and not conducting enough interviews might give you false positives. Validating your research is almost as important as designing and conducting it.

We’ve seen many instances where, after the research is conducted, results that do not match up with the “gut feel” of upper management are dismissed as anecdotal and a “one-time” phenomenon. To avoid such traps, we strongly recommend that data collection be done on an “ongoing and regular” basis.


This will help you compare and analyze the change in perceptions according to marketing for your products/services. The other issue here is sample size. To be confident with your research, you must interview enough people to weed out the fringe elements.

A couple of years ago there was a lot of discussion about online surveys and their statistical analysis plan. The fact that not every customer had internet connectivity was one of the main concerns.


Although some of the discussions are still valid, the reach of the internet as a means of communication has become vital in the majority of customer interactions. According to the US Census Bureau, the number of households with computers has doubled between 1997 and 2001.


In 2001, nearly 50% of households had a computer. Nearly 55% of all households with an income of more than $35,000 have internet access, which jumps to 70% for households with an annual income of $50,000. This data is from the US Census Bureau for 2001.

There are primarily three modes of data collection that can be employed to gather feedback – mail, phone, and online. The method actually used for data collection is really a cost-benefit analysis. There is no slam-dunk solution, but you can weigh the risks and advantages associated with each of the mediums listed above.

Keep in mind that reach here is defined as “All U.S. Households.” In most cases, you need to look at how many of your customers are online to determine your effective reach. If all your customers have email addresses, you have 100% reach of your customers.

Another important thing to keep in mind is the ever-increasing dominance of cellular phones over landline phones. United States FCC rules prevent automated dialing and calling of cellular phone numbers, and there is a noticeable trend towards people having cellular phones as their only voice communication device.

This introduces the inability to reach cellular phone customers who are dropping home phone lines in favor of going entirely wireless. Even if automated dialing is not used, another FCC rule prohibits phoning anyone who would have to pay for the call.


Multi-Mode Surveys

Multi-mode surveys, where data is collected via different modes (online, paper, phone, etc.), are another way of going. It is fairly straightforward to run an online survey and have data-entry operators enter data from the phone and paper surveys into the same system. That system can also be used to collect data directly from respondents.


Data Collection Examples

Data collection is an important aspect of research. Let’s consider an example of a mobile manufacturer, company X, which is launching a new product variant. To conduct research about features, price range, target market, competitor analysis, etc., data has to be collected from appropriate sources.

The marketing team can conduct various data collection activities, such as online surveys or focus groups.

The survey should have all the right questions about features and pricing, such as “What are the top 3 features expected from an upcoming product?” or “How much are you likely to spend on this product?” or “Which competitors provide similar products?” etc.

For conducting a focus group, the marketing team should decide the participants and the mediator. The topic of discussion and objective behind conducting a focus group should be clarified beforehand to conduct a conclusive discussion.

Data collection methods are chosen depending on the available resources. For example, conducting questionnaires and surveys would require the least resources, while focus groups require moderately high resources.

Reasons to Conduct Online Research and Data Collection

Feedback is a vital part of any organization’s growth. Whether you conduct regular focus groups to elicit information from key players or your account manager calls up all your marquee accounts to find out how things are going, these are essentially all processes to find out, through your customers’ eyes: How are we doing? What can we do better?

Online surveys are just another medium to collect feedback from your customers, employees, and anyone your business interacts with. With the advent of do-it-yourself tools for online surveys, data collection on the internet has become really easy, cheap, and effective.


Conducting Customer Surveys for Data Collection to Multiply Sales

It is a well-established marketing fact that acquiring a new customer is 10 times more difficult and expensive than retaining an existing one. This is one of the fundamental driving forces behind the extensive adoption of and interest in CRM and related customer retention tactics.

In a research study conducted by Rice University Professor Dr. Paul Dholakia and Dr. Vicki Morwitz, published in Harvard Business Review, the experiment found that simply asking customers how an organization was performing proved, by itself, to be an effective customer retention strategy.

In the research study, conducted over the course of a year, one set of customers was sent a satisfaction and opinion survey and the other set was not surveyed. Over the next year, the group that took the survey saw twice the number of people continuing and renewing their loyalty to the organization.


The research study offered a couple of interesting reasons, grounded in consumer psychology, for this phenomenon:

  • Satisfaction surveys boost the customers’ desire to be coddled and induce positive feelings. This stems from a part of human psychology that wants to “appreciate” a product or service they already like or prefer. The survey feedback collection method is solely a medium to convey this. The survey is a vehicle to “interact” with the company and reinforces the customer’s commitment to the company.
  • Surveys may increase awareness of auxiliary products and services. Surveys can be considered modes of both inbound and outbound communication. Surveys are generally considered to be a data collection and analysis source, but most people are unaware that consumer surveys can also serve as a medium for distributing data. It is important to note a few caveats here: in most countries, including the US, “selling under the guise of research” is illegal. However, information is inevitably distributed while collecting information, so disclaimers may be included in the survey to ensure users are aware of this fact. For example: “We will collect your opinion and inform you about products and services that have come online in the last year…”
  • Induced judgments: The entire procedure of asking people for their feedback can prompt them to form an opinion on something they otherwise would not have thought about. This is a subtle yet powerful effect that can be compared to the “product placement” strategy currently used for marketing products in mass media like movies and television shows. One example is the extensive and exclusive use of the Mini Cooper in the blockbuster movie “The Italian Job.” This strategy is questionable and should be used with great caution.

Surveys should be considered a critical tool in the customer journey dialog. The best thing about surveys is their ability to carry “bi-directional” information. The research conducted by Paul Dholakia and Vicki Morwitz shows that surveys not only get you the information that is critical for your business, but also enhance and build upon the established relationship you have with your customers.

Recent technological advances have made it incredibly easy to conduct real-time surveys and opinion polls. Online tools make it easy to frame questions and answers and create surveys on the Web. Distributing surveys via email, website links, or even integration with online CRM tools like Salesforce.com has made online surveying a quick-win solution.

Steps to Effectively Conduct an Online Survey for Data Collection

So, you’ve decided to conduct an online survey. There are a few questions in your mind that you would like answered, and you are looking for a fast and inexpensive way to find out more about your customers, clients, etc.

First and foremost, you need to decide what the objectives of the study are. Ensure that you can phrase these objectives as questions or measurements. If you can’t, you are better off looking at other data sources like focus groups and other qualitative methods. The data collected via online surveys is predominantly quantitative in nature.

Review the basic objectives of the study. What are you trying to discover? What actions do you want to take as a result of the survey? Answers to these questions help in validating the collected data. Online surveys are just one way of collecting and quantifying data.


  • Visualize all of the relevant information items you would like to have. What will the output survey research report look like? What charts and graphs will be prepared? What information do you need to be assured that action is warranted?
  • Assign a rank to each topic according to its priority, placing the most important topics first. Revisit these items again to ensure that the objectives, topics, and information you need are appropriate. Remember, you can’t solve the research problem if you ask the wrong questions.
  • How easy or difficult is it for the respondent to provide information on each topic? If it is difficult, is there an alternative medium to gain insights by asking a different question? This is probably the most important step. Online surveys have to be precise, clear, and concise. Given the nature of the internet and the fluctuations involved, if your questions are too difficult to understand, the survey dropout rate will be high.
  • Create a sequence for the topics that is unbiased. Make sure that the questions asked first do not bias the results of the next questions. Sometimes providing too much information, or disclosing the purpose of the study, can create bias. Once you have decided on a series of topics, you have the basic structure of a survey. It is always advisable to add an “Introductory” paragraph before the survey to explain the project objective and what is expected of the respondent. It is also sensible to have a “Thank You” text as well as information about where to find the results of the survey when they are published.
  • Page breaks – The attention span of respondents can be very low when it comes to a long scrolling survey. Add page breaks wherever possible. Having said that, a single question per page can also hamper response rates, as it increases the time to complete the survey as well as the chances of dropouts.
  • Branching – Create smart and effective surveys with the implementation of branching wherever required. Eliminate the use of text such as “If you answered No to Q1, then answer Q4” – this leads to annoyance among respondents, which results in increased survey dropout rates. Design online surveys using branching logic so that appropriate questions are automatically routed based on previous responses (see the sketch after this list).
  • Write the questions. Initially, write a significant number of survey questions, out of which you can use the ones best suited for the survey. Divide the survey into sections so that respondents do not get confused by seeing a long list of questions.
  • Sequence the questions so that they are unbiased.
  • Repeat all of the steps above to find any major holes. Are the questions really answered? Have someone review it for you.
  • Time the length of the survey. A survey should take less than five minutes. At three to four research questions per minute, you are limited to about 15 questions. One open-ended text question counts for three multiple-choice questions. Most online software tools will record the time taken for respondents to answer questions.
  • Include a few open-ended survey questions that support your survey objective. This will be a type of feedback survey.
  • Email the project survey to your test group, and then email the feedback survey afterward. This way, your test group can provide their opinion about the functionality as well as the usability of your project survey.
  • Make changes to your questionnaire based on the received feedback.
  • Send the survey out to all your respondents!
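
To make the branching bullet above concrete, here is a minimal sketch of answer-dependent routing in Python. The question IDs, texts, and routing rules are hypothetical; survey platforms implement this logic for you, but it looks roughly like this under the hood.

```python
# A minimal sketch of survey branching (skip) logic.
# Question IDs, texts, and routes are hypothetical.
SURVEY = {
    "q1": {"text": "Have you used our product? (yes/no)",
           "routes": {"yes": "q2", "no": "q3"}, "next": None},
    "q2": {"text": "How satisfied are you, on a scale of 1-5?",
           "routes": {}, "next": None},
    "q3": {"text": "What kept you from trying it?",
           "routes": {}, "next": None},
}

def run_survey():
    answers, current = {}, "q1"
    while current is not None:
        question = SURVEY[current]
        answers[current] = input(question["text"] + " ").strip().lower()
        # Route on the answer where a rule exists; otherwise use the default.
        current = question["routes"].get(answers[current], question["next"])
    return answers

if __name__ == "__main__":
    print(run_survey())
```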

Online surveys have, over the course of time, evolved into an effective alternative to expensive mail or telephone surveys. However, you must be aware of a few conditions that need to be met for online surveys. If you are trying to survey a sample representing the target population, please remember that not everyone is online.

Moreover, not everyone is receptive to an online survey either. Generally, the younger demographic segments are more inclined to respond to an online survey.


Survey Design for Data Collection

Good survey design is crucial for accurate data collection. From question wording to response options, let’s explore how to create effective surveys that yield valuable insights with these survey design tips.

  • Writing Great Questions for data collection

Writing great questions can be considered an art. Art always requires a significant amount of hard work, practice, and help from others.

The questions in a survey need to be clear, concise, and unbiased. A poorly worded question or a question with leading language can result in inaccurate or irrelevant responses, ultimately impacting the data’s validity.

Moreover, the questions should be relevant and specific to the research objectives. Questions that are irrelevant or do not capture the necessary information can lead to incomplete or inconsistent responses too.

  • Avoid loaded or leading words or questions

A small change in wording can produce a large difference in results. Words such as could, should, and might are all used for almost the same purpose, but they may produce a 20% difference in agreement with a question. For example, “The management could… should… might… have shut the factory.”

Intense words such as prohibit, which represent control or action, produce similar effects. For example, “Do you believe Donald Trump should prohibit insurance companies from raising rates?”

Sometimes the content is just biased. For instance, “You wouldn’t want to go to Rudolpho’s Restaurant for the organization’s annual party, would you?”

  • Misplaced questions

Questions should always reference the intended context, and questions placed out of order or without its requirement should be avoided. Generally, a funnel approach should be implemented – generic questions should be included in the initial section of the questionnaire as a warm-up and specific ones should follow. Toward the end, demographic or geographic questions should be included.

  • Mutually non-overlapping response categories

Multiple-choice answers should be mutually exclusive to provide distinct choices. Overlapping answer options frustrate the respondent and make interpretation difficult at best. Also, the questions should always be precise.

For example: “Do you like orange juice?”

This question is vague. In what terms is the liking for orange juice to be rated? Sweetness, texture, price, nutrition, etc.

  • Avoid the use of confusing/unfamiliar words

Asking about industry-related terms such as caloric content, bits, bytes, MBs, as well as other terms and acronyms, can confuse respondents. Ensure that the audience understands your language level, terminology, and, above all, the question you ask.

  • Non-directed questions give respondents excessive leeway

In survey design for data collection, non-directed questions can give respondents excessive leeway, which can lead to vague and unreliable data. These types of questions are also known as open-ended questions, and they do not provide any structure for the respondent to follow.

For instance, a non-directed question like “ What suggestions do you have for improving our shoes?” can elicit a wide range of answers, some of which may not be relevant to the research objectives. Some respondents may give short answers, while others may provide lengthy and detailed responses, making comparing and analyzing the data challenging.

To avoid these issues, it’s essential to ask direct questions that are specific and have a clear structure. Closed-ended questions, for example, offer structured response options and can be easier to analyze as they provide a quantitative measure of respondents’ opinions.

  • Never force questions

There will always be certain questions that cross certain privacy rules. Since privacy is an important issue for most people, these questions should either be eliminated from the survey or not be kept as mandatory. Survey questions about income, family income, status, religious and political beliefs, etc., should always be avoided as they are considered to be intruding, and respondents can choose not to answer them.

  • Unbalanced answer options in scales

Unbalanced answer options in scales such as Likert and semantic differential scales may be appropriate for some situations and biased in others. For example, when analyzing patterns in eating habits, one study used a quantity scale on which obese respondents appeared in the middle, with the polar ends reflecting starvation and an irrationally large amount to consume. Unbalanced scales can be appropriate, however, where one end of the scale is rarely expected in practice, such as ratings of hospital service, where we usually do not expect reports of poor service.

  • Questions that cover two points

In survey design for data collection, questions that cover two points can be problematic for several reasons. These types of questions are often called “double-barreled” questions and can cause confusion for respondents, leading to inaccurate or irrelevant data.

For instance, a question like “Do you like the food and the service at the restaurant?” covers two points, the food and the service, and it assumes that the respondent has the same opinion about both. If the respondent only liked the food, their opinion of the service could affect their answer.

It’s important to ask one question at a time to avoid confusion and ensure that the respondent’s answer is focused and accurate. This also applies to questions with multiple concepts or ideas. In these cases, it’s best to break down the question into multiple questions that address each concept or idea separately.

  • Dichotomous questions

Dichotomous questions are used when you want a distinct answer, such as Yes/No or Male/Female. For example, the question “Do you think this candidate will win the election?” can be answered Yes or No.

  • Avoid the use of long questions

The use of long questions will increase the time taken for completion, which generally leads to an increase in the survey dropout rate. Of the common formats, multiple-choice questions with long option lists are the longest and most complex to read, while open-ended questions are the shortest to present (though they may take longer to answer), so question length should be managed deliberately.

Data collection is an essential part of the research process, whether you’re conducting scientific experiments, market research, or surveys. The methods and tools used for data collection will vary depending on the research type, the sample size required, and the resources available.

Several data collection methods exist, including surveys, observations, interviews, and focus groups. Each method has advantages and disadvantages, and choosing the one that best suits the research goals is important.

With the rise of technology, many tools are now available to facilitate data collection, including online survey software and data visualization tools. These tools can help researchers collect, store, and analyze data more efficiently, with greater accuracy.

By understanding the various methods and tools available for data collection, we can develop a solid foundation for conducting research. With these research skills, we can make informed decisions, solve problems, and contribute to advancing our understanding of the world around us.

Chapter 5: Collecting data

Tianjing Li, Julian PT Higgins, Jonathan J Deeks

Key Points:

  • Systematic reviews have studies, rather than reports, as the unit of interest, and so multiple reports of the same study need to be identified and linked together before or after data extraction.
  • Because of the increasing availability of data sources (e.g. trials registers, regulatory documents, clinical study reports), review authors should decide on which sources may contain the most useful information for the review, and have a plan to resolve discrepancies if information is inconsistent across sources.
  • Review authors are encouraged to develop outlines of tables and figures that will appear in the review to facilitate the design of data collection forms. The key to successful data collection is to construct easy-to-use forms and collect sufficient and unambiguous data that faithfully represent the source in a structured and organized manner.
  • Effort should be made to identify data needed for meta-analyses, which often need to be calculated or converted from data reported in diverse formats.
  • Data should be collected and archived in a form that allows future access and data sharing.

Cite this chapter as: Li T, Higgins JPT, Deeks JJ (editors). Chapter 5: Collecting data. In: Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA (editors). Cochrane Handbook for Systematic Reviews of Interventions version 6.4 (updated August 2023). Cochrane, 2023. Available from www.training.cochrane.org/handbook .

5.1 Introduction

Systematic reviews aim to identify all studies that are relevant to their research questions and to synthesize data about the design, risk of bias, and results of those studies. Consequently, the findings of a systematic review depend critically on decisions relating to which data from these studies are presented and analysed. Data collected for systematic reviews should be accurate, complete, and accessible for future updates of the review and for data sharing. Methods used for these decisions must be transparent; they should be chosen to minimize biases and human error. Here we describe approaches that should be used in systematic reviews for collecting data, including extraction of data directly from journal articles and other reports of studies.

5.2 Sources of data

Studies are reported in a range of sources, which are detailed later. As discussed in Section 5.2.1, it is important to link together multiple reports of the same study. The relative strengths and weaknesses of each type of source are discussed in Section 5.2.2. For guidance on searching for and selecting reports of studies, refer to Chapter 4.

Journal articles are the source of the majority of data included in systematic reviews. Note that a study can be reported in multiple journal articles, each focusing on some aspect of the study (e.g. design, main results, and other results).

Conference abstracts are commonly available. However, the information presented in conference abstracts is highly variable in reliability, accuracy, and level of detail (Li et al 2017).

Errata and letters can be important sources of information about studies, including critical weaknesses and retractions, and review authors should examine these if they are identified (see MECIR Box 5.2.a ).

Trials registers (e.g. ClinicalTrials.gov) catalogue trials that have been planned or started, and have become an important data source for identifying trials, for comparing published outcomes and results with those planned, and for obtaining efficacy and safety data that are not available elsewhere (Ross et al 2009, Jones et al 2015, Baudard et al 2017).

Clinical study reports (CSRs) contain unabridged and comprehensive descriptions of the clinical problem, design, conduct and results of clinical trials, following a structure and content guidance prescribed by the International Conference on Harmonisation (ICH 1995). To obtain marketing approval of drugs and biologics for a specific indication, pharmaceutical companies submit CSRs and other required materials to regulatory authorities. Because CSRs also incorporate tables and figures, with appendices containing the protocol, statistical analysis plan, sample case report forms, and patient data listings (including narratives of all serious adverse events), they can be thousands of pages in length. CSRs often contain more data about trial methods and results than any other single data source (Mayo-Wilson et al 2018). CSRs are often difficult to access, and are usually not publicly available. Review authors could request CSRs from the European Medicines Agency (Davis and Miller 2017). The US Food and Drug Administration had historically avoided releasing CSRs but launched a pilot programme in 2018 whereby selected portions of CSRs for new drug applications were posted on the agency’s website. Many CSRs are obtained through unsealed litigation documents, repositories (e.g. clinicalstudydatarequest.com), and other open data and data-sharing channels (e.g. The Yale University Open Data Access Project) (Doshi et al 2013, Wieland et al 2014, Mayo-Wilson et al 2018).

Regulatory reviews such as those available from the US Food and Drug Administration or European Medicines Agency provide useful information about trials of drugs, biologics, and medical devices submitted by manufacturers for marketing approval (Turner 2013). These documents are summaries of CSRs and related documents, prepared by agency staff as part of the process of approving the products for marketing, after reanalysing the original trial data. Regulatory reviews often are available only for the first approved use of an intervention and not for later applications (although review authors may request those documents, which are usually brief). Using regulatory reviews from the US Food and Drug Administration as an example, drug approval packages are available on the agency’s website for drugs approved since 1997 (Turner 2013); for drugs approved before 1997, information must be requested through a freedom of information request. The drug approval packages contain various documents: approval letter(s), medical review(s), chemistry review(s), clinical pharmacology review(s), and statistical review(s).

Individual participant data (IPD) are usually sought directly from the researchers responsible for the study, or may be identified from open data repositories (e.g. www.clinicalstudydatarequest.com ). These data typically include variables that represent the characteristics of each participant, intervention (or exposure) group, prognostic factors, and measurements of outcomes (Stewart et al 2015). Access to IPD has the advantage of allowing review authors to reanalyse the data flexibly, in accordance with the preferred analysis methods outlined in the protocol, and can reduce the variation in analysis methods across studies included in the review. IPD reviews are addressed in detail in Chapter 26 .

MECIR Box 5.2.a Relevant expectations for conduct of intervention reviews

5.2.1 Studies (not reports) as the unit of interest

In a systematic review, studies rather than reports of studies are the principal unit of interest. Since a study may have been reported in several sources, a comprehensive search for studies for the review may identify many reports from a potentially relevant study (Mayo-Wilson et al 2017a, Mayo-Wilson et al 2018). Conversely, a report may describe more than one study.

Multiple reports of the same study should be linked together (see MECIR Box 5.2.b ). Some authors prefer to link reports before they collect data, and collect data from across the reports onto a single form. Other authors prefer to collect data from each report and then link together the collected data across reports. Either strategy may be appropriate, depending on the nature of the reports at hand. It may not be clear that two reports relate to the same study until data collection has commenced. Although sometimes there is a single report for each study, it should never be assumed that this is the case.

MECIR Box 5.2.b Relevant expectations for conduct of intervention reviews

It can be difficult to link multiple reports from the same study, and review authors may need to do some ‘detective work’. Multiple sources about the same trial may not reference each other, may not share common authors (Gøtzsche 1989, Tramèr et al 1997), or may report discrepant information about the study design, characteristics, outcomes, and results (von Elm et al 2004, Mayo-Wilson et al 2017a).

Some of the most useful criteria for linking reports are:

  • trial registration numbers;
  • authors’ names;
  • sponsor for the study and sponsor identifiers (e.g. grant or contract numbers);
  • location and setting (particularly if institutions, such as hospitals, are named);
  • specific details of the interventions (e.g. dose, frequency);
  • numbers of participants and baseline data; and
  • date and duration of the study (which also can clarify whether different sample sizes are due to different periods of recruitment), length of follow-up, or subgroups selected to address secondary goals.

Review authors should use as many trial characteristics as possible to link multiple reports. When uncertainties remain after considering these and other factors, it may be necessary to correspond with the study authors or sponsors for confirmation.
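As an illustration only (the records and field names below are hypothetical, not part of any Cochrane tool), a first automated pass at linking might group reports by trial registration number and set aside the remainder for manual checking against the other criteria listed above:

    # Group candidate reports by trial registration number; reports without
    # one are set aside for manual 'detective work' using other criteria.
    from collections import defaultdict

    reports = [
        {"id": "R1", "registration": "NCT00000001", "first_author": "Smith"},
        {"id": "R2", "registration": "NCT00000001", "first_author": "Smith"},
        {"id": "R3", "registration": None,          "first_author": "Garcia"},
    ]

    by_trial = defaultdict(list)
    unlinked = []
    for report in reports:
        if report["registration"]:
            by_trial[report["registration"]].append(report["id"])
        else:
            unlinked.append(report["id"])

    print(dict(by_trial))  # {'NCT00000001': ['R1', 'R2']}
    print(unlinked)        # ['R3'] -> compare authors, sponsors, dates, sample sizes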

5.2.2 Determining which sources might be most useful

A comprehensive search to identify all eligible studies from all possible sources is resource-intensive but necessary for a high-quality systematic review (see Chapter 4 ). Because some data sources are more useful than others (Mayo-Wilson et al 2018), review authors should consider which data sources may be available and which may contain the most useful information for the review. These considerations should be described in the protocol. Table 5.2.a summarizes the strengths and limitations of different data sources (Mayo-Wilson et al 2018). Gaining access to CSRs and IPD often takes a long time. Review authors should begin searching repositories and contact trial investigators and sponsors as early as possible to negotiate data usage agreements (Mayo-Wilson et al 2015, Mayo-Wilson et al 2018).

Table 5.2.a Strengths and limitations of different data sources for systematic reviews

5.2.3 Correspondence with investigators

Review authors often find that they are unable to obtain all the information they seek from available reports about the details of the study design, the full range of outcomes measured and the numerical results. In such circumstances, authors are strongly encouraged to contact the original investigators (see MECIR Box 5.2.c ). Contact details of study authors, when not available from the study reports, often can be obtained from more recent publications, from university or institutional staff listings, from membership directories of professional societies, or by a general search of the web. If the contact author named in the study report cannot be contacted or does not respond, it is worthwhile attempting to contact other authors.

Review authors should consider the nature of the information they require and make their request accordingly. For descriptive information about the conduct of the trial, it may be most appropriate to ask open-ended questions (e.g. how was the allocation process conducted, or how were missing data handled?). If specific numerical data are required, it may be more helpful to request them specifically, possibly providing a short data collection form (either uncompleted or partially completed). If IPD are required, they should be specifically requested (see also Chapter 26 ). In some cases, study investigators may find it more convenient to provide IPD rather than conduct additional analyses to obtain the specific statistics requested.

MECIR Box 5.2.c Relevant expectations for conduct of intervention reviews

5.3 What data to collect

5.3.1 What are data?

For the purposes of this chapter, we define ‘data’ to be any information about (or derived from) a study, including details of methods, participants, setting, context, interventions, outcomes, results, publications, and investigators. Review authors should plan in advance what data will be required for their systematic review, and develop a strategy for obtaining them (see MECIR Box 5.3.a ). The involvement of consumers and other stakeholders can be helpful in ensuring that the categories of data collected are sufficiently aligned with the needs of review users ( Chapter 1, Section 1.3 ). The data to be sought should be described in the protocol, with consideration wherever possible of the issues raised in the rest of this chapter.

The data collected for a review should adequately describe the included studies, support the construction of tables and figures, facilitate the risk of bias assessment, and enable syntheses and meta-analyses. Review authors should familiarize themselves with reporting guidelines for systematic reviews (see online Chapter III and the PRISMA statement (Liberati et al 2009)) to ensure that relevant elements and sections are incorporated. The following sections review the types of information that should be sought, and these are summarized in Table 5.3.a (Li et al 2015).

MECIR Box 5.3.a Relevant expectations for conduct of intervention reviews

Table 5.3.a Checklist of items to consider in data collection

*Full description required for assessments of risk of bias (see Chapter 8 , Chapter 23 and Chapter 25 ).

5.3.2 Study methods and potential sources of bias

Different research methods can influence study outcomes by introducing different biases into results. Important study design characteristics should be collected to allow the selection of appropriate methods for assessment and analysis, and to enable description of the design of each included study in a table of ‘Characteristics of included studies’, including whether the study is randomized, whether the study has a cluster or crossover design, and the duration of the study. If the review includes non-randomized studies, appropriate features of the studies should be described (see Chapter 24 ).

Detailed information should be collected to facilitate assessment of the risk of bias in each included study. Risk-of-bias assessment should be conducted using the tool most appropriate for the design of each study, and the information required to complete the assessment will depend on the tool. Randomized studies should be assessed using the tool described in Chapter 8 . The tool covers bias arising from the randomization process, due to deviations from intended interventions, due to missing outcome data, in measurement of the outcome, and in selection of the reported result. For each item in the tool, a description of what happened in the study is required, which may include verbatim quotes from study reports. Information for assessment of bias due to missing outcome data and selection of the reported result may be most conveniently collected alongside information on outcomes and results. Chapter 7 (Section 7.3.1) discusses some issues in the collection of information for assessments of risk of bias. For non-randomized studies, the most appropriate tool is described in Chapter 25 . A separate tool also covers bias due to missing results in meta-analysis (see Chapter 13 ).

A particularly important piece of information is the funding source of the study and potential conflicts of interest of the study authors.

Some review authors will wish to collect additional information on study characteristics that bear on the quality of the study’s conduct but that may not lead directly to risk of bias, such as whether ethical approval was obtained and whether a sample size calculation was performed a priori.

5.3.3 Participants and setting

Details of participants are collected to enable an understanding of the comparability of, and differences between, the participants within and between included studies, and to allow assessment of how directly or completely the participants in the included studies reflect the original review question.

Typically, aspects that should be collected are those that could (or are believed to) affect presence or magnitude of an intervention effect and those that could help review users assess applicability to populations beyond the review. For example, if the review authors suspect important differences in intervention effect between different socio-economic groups, this information should be collected. If intervention effects are thought to be constant over such groups, and if such information would not be useful to help apply results, it should not be collected. Participant characteristics that are often useful for assessing applicability include age and sex. Summary information about these should always be collected unless they are obvious from the context. These characteristics are likely to be presented in different formats (e.g. ages as means or medians, with standard deviations or ranges; sex as percentages or counts for the whole study or for each intervention group separately). Review authors should seek consistent quantities where possible, and decide whether it is more relevant to summarize characteristics for the study as a whole or by intervention group. It may not be possible to select the most consistent statistics until data collection is complete across all or most included studies. Other characteristics that are sometimes important include ethnicity, socio-demographic details (e.g. education level) and the presence of comorbid conditions. Clinical characteristics relevant to the review question (e.g. glucose level for reviews on diabetes) also are important for understanding the severity or stage of the disease.

Diagnostic criteria that were used to define the condition of interest can be a particularly important source of diversity across studies and should be collected. For example, in a review of drug therapy for congestive heart failure, it is important to know how the definition and severity of heart failure was determined in each study (e.g. systolic or diastolic dysfunction, severe systolic dysfunction with ejection fractions below 20%). Similarly, in a review of antihypertensive therapy, it is important to describe baseline levels of blood pressure of participants.

If the settings of studies may influence intervention effects or applicability, then information on these should be collected. Typical settings of healthcare intervention studies include acute care hospitals, emergency facilities, general practice, and extended care facilities such as nursing homes, offices, schools, and communities. Sometimes studies are conducted in different geographical regions with important differences that could affect delivery of an intervention and its outcomes, such as cultural characteristics, economic context, or rural versus city settings. Timing of the study may be associated with important technology differences or trends over time. If such information is important for the interpretation of the review, it should be collected.

Important characteristics of the participants in each included study should be summarized for the reader in the table of ‘Characteristics of included studies’.

5.3.4 Interventions

Details of all experimental and comparator interventions of relevance to the review should be collected. Again, details are required for aspects that could affect the presence or magnitude of an effect or that could help review users assess applicability to their own circumstances. Where feasible, information should be sought (and presented in the review) that is sufficient for replication of the interventions under study. This includes any co-interventions administered as part of the study, and applies similarly to comparators such as ‘usual care’. Review authors may need to request missing information from study authors.

The Template for Intervention Description and Replication (TIDieR) provides a comprehensive framework for full description of interventions and has been proposed for use in systematic reviews as well as reports of primary studies (Hoffmann et al 2014). The checklist includes descriptions of:

  • the rationale for the intervention and how it is expected to work;
  • any documentation that instructs the recipient on the intervention;
  • what the providers do to deliver the intervention (procedures and processes);
  • who provides the intervention (including their skill level), how (e.g. face to face, web-based) and in what setting (e.g. home, school, or hospital);
  • the timing and intensity;
  • whether any variation is permitted or expected, and whether modifications were actually made; and
  • any strategies used to ensure or assess fidelity or adherence to the intervention, and the extent to which the intervention was delivered as planned.

For clinical trials of pharmacological interventions, key information to collect will often include routes of delivery (e.g. oral or intravenous delivery), doses (e.g. amount or intensity of each treatment, frequency of delivery), timing (e.g. within 24 hours of diagnosis), and length of treatment. For other interventions, such as those that evaluate psychotherapy, behavioural and educational approaches, or healthcare delivery strategies, the amount of information required to characterize the intervention will typically be greater, including information about multiple elements of the intervention, who delivered it, and the format and timing of delivery. Chapter 17 provides further information on how to manage intervention complexity, and how the intervention Complexity Assessment Tool (iCAT) can facilitate data collection (Lewin et al 2017).

Important characteristics of the interventions in each included study should be summarized for the reader in the table of ‘Characteristics of included studies’. Additional tables or diagrams such as logic models ( Chapter 2, Section 2.5.1 ) can assist descriptions of multi-component interventions so that review users can better assess review applicability to their context.

5.3.4.1 Integrity of interventions

The degree to which specified procedures or components of the intervention are implemented as planned can have important consequences for the findings from a study. We describe this as intervention integrity; related terms include adherence, compliance and fidelity (Carroll et al 2007). The verification of intervention integrity may be particularly important in reviews of non-pharmacological trials such as behavioural interventions and complex interventions, which are often implemented in conditions that present numerous obstacles to idealized delivery.

It is generally expected that reports of randomized trials provide detailed accounts of intervention implementation (Zwarenstein et al 2008, Moher et al 2010). In assessing whether interventions were implemented as planned, review authors should bear in mind that some interventions are standardized (with no deviations permitted in the intervention protocol), whereas others explicitly allow a degree of tailoring (Zwarenstein et al 2008). In addition, the growing field of implementation science has led to an increased awareness of the impact of setting and context on delivery of interventions (Damschroder et al 2009). (See Chapter 17, Section 17.1.2.1 for further information and discussion about how an intervention may be tailored to local conditions in order to preserve its integrity.)

Information about integrity can help determine whether unpromising results are due to a poorly conceptualized intervention or to an incomplete delivery of the prescribed components. It can also reveal important information about the feasibility of implementing a given intervention in real life settings. If it is difficult to achieve full implementation in practice, the intervention will have low feasibility (Dusenbury et al 2003).

Whether a lack of intervention integrity leads to a risk of bias in the estimate of its effect depends on whether review authors and users are interested in the effect of assignment to intervention or the effect of adhering to intervention, as discussed in more detail in Chapter 8, Section 8.2.2 . Assessment of deviations from intended interventions is important for assessing risk of bias in the latter, but not the former (see Chapter 8, Section 8.4 ), but both may be of interest to decision makers in different ways.

An example of a Cochrane Review evaluating intervention integrity is provided by a review of smoking cessation in pregnancy (Chamberlain et al 2017). The authors found that process evaluation of the intervention occurred in only some trials and that the implementation was less than ideal in others, including some of the largest trials. The review highlighted how the transfer of an intervention from one setting to another may reduce its effectiveness when elements are changed, or aspects of the materials are culturally inappropriate.

5.3.4.2 Process evaluations

Process evaluations seek to evaluate the process (and mechanisms) between the intervention’s intended implementation and the actual effect on the outcome (Moore et al 2015). Process evaluation studies are characterized by a flexible approach to data collection and the use of numerous methods to generate a range of different types of data, encompassing both quantitative and qualitative methods. Guidance for including process evaluations in systematic reviews is provided in Chapter 21 . When it is considered important, review authors should aim to collect information on whether the trial accounted for, or measured, key process factors and whether the trials that thoroughly addressed integrity showed a greater impact. Process evaluations can be a useful source of factors that potentially influence the effectiveness of an intervention.

5.3.5 Outcomes

An outcome is an event or a measurement value observed or recorded for a particular person or intervention unit in a study during or following an intervention, and that is used to assess the efficacy and safety of the studied intervention (Meinert 2012). Review authors should indicate in advance whether they plan to collect information about all outcomes measured in a study or only those outcomes of (pre-specified) interest in the review. Research has shown that trials addressing the same condition and intervention seldom agree on which outcomes are the most important, and consequently report on numerous different outcomes (Dwan et al 2014, Ismail et al 2014, Denniston et al 2015, Saldanha et al 2017a). The selection of outcomes across systematic reviews of the same condition is also inconsistent (Page et al 2014, Saldanha et al 2014, Saldanha et al 2016, Liu et al 2017). Outcomes used in trials and in systematic reviews of the same condition have limited overlap (Saldanha et al 2017a, Saldanha et al 2017b).

We recommend that only the outcomes defined in the protocol be described in detail. However, a complete list of the names of all outcomes measured may allow a more detailed assessment of the risk of bias due to missing outcome data (see Chapter 13 ).

Review authors should collect all five elements of an outcome (Zarin et al 2011, Saldanha et al 2014); a minimal structured representation of these elements is sketched below:

1. outcome domain or title (e.g. anxiety);

2. measurement tool or instrument (including definition of clinical outcomes or endpoints); for a scale, name of the scale (e.g. the Hamilton Anxiety Rating Scale), upper and lower limits, and whether a high or low score is favourable, definitions of any thresholds if appropriate;

3. specific metric used to characterize each participant’s results (e.g. post-intervention anxiety, or change in anxiety from baseline to a post-intervention time point, or post-intervention presence of anxiety (yes/no));

4. method of aggregation (e.g. mean and standard deviation of anxiety scores in each group, or proportion of people with anxiety);

5. timing of outcome measurements (e.g. assessments at end of eight-week intervention period, events occurring during eight-week intervention period).

Further considerations for economics outcomes are discussed in Chapter 20 , and for patient-reported outcomes in Chapter 18 .
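A simple structured record can help ensure that all five elements are captured for every outcome. The following Python sketch is illustrative only; the field names are assumptions, not a prescribed Cochrane schema:

    # One record per outcome, mirroring the five elements listed above.
    from dataclasses import dataclass

    @dataclass
    class OutcomeDefinition:
        domain: str       # 1. outcome domain or title
        instrument: str   # 2. measurement tool or instrument
        metric: str       # 3. metric characterizing each participant's result
        aggregation: str  # 4. method of aggregation
        timing: str       # 5. timing of outcome measurement

    anxiety = OutcomeDefinition(
        domain="anxiety",
        instrument="Hamilton Anxiety Rating Scale (lower score is favourable)",
        metric="change from baseline to post-intervention",
        aggregation="mean and standard deviation per group",
        timing="end of eight-week intervention period",
    )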

5.3.5.1 Adverse effects

Collection of information about the harmful effects of an intervention can pose particular difficulties, discussed in detail in Chapter 19 . These outcomes may be described using multiple terms, including ‘adverse event’, ‘adverse effect’, ‘adverse drug reaction’, ‘side effect’ and ‘complication’. Many of these terminologies are used interchangeably in the literature, although some are technically different. Harms might additionally be interpreted to include undesirable changes in other outcomes measured during a study, such as a decrease in quality of life where an improvement may have been anticipated.

In clinical trials, adverse events can be collected either systematically or non-systematically. Systematic collection refers to collecting adverse events in the same manner for each participant using defined methods such as a questionnaire or a laboratory test. For systematically collected outcomes representing harm, data can be collected by review authors in the same way as efficacy outcomes (see Section 5.3.5 ).

Non-systematic collection refers to collection of information on adverse events using methods such as open-ended questions (e.g. ‘Have you noticed any symptoms since your last visit?’), or reported by participants spontaneously. In either case, adverse events may be selectively reported based on their severity, and whether the participant suspected that the effect may have been caused by the intervention, which could lead to bias in the available data. Unfortunately, most adverse events are collected non-systematically rather than systematically, creating a challenge for review authors. The following pieces of information are useful and worth collecting (Nicole Fusco, personal communication):

  • any coding system or standard medical terminology used (e.g. COSTART, MedDRA), including version number;
  • name of the adverse events (e.g. dizziness);
  • reported intensity of the adverse event (e.g. mild, moderate, severe);
  • whether the trial investigators categorized the adverse event as ‘serious’;
  • whether the trial investigators identified the adverse event as being related to the intervention;
  • time point (most commonly measured as a count over the duration of the study);
  • any reported methods for how adverse events were selected for inclusion in the publication (e.g. ‘We reported all adverse events that occurred in at least 5% of participants’); and
  • associated results.

Different collection methods lead to very different accounting of adverse events (Safer 2002, Bent et al 2006, Ioannidis et al 2006, Carvajal et al 2011, Allen et al 2013). Non-systematic collection methods tend to underestimate how frequently an adverse event occurs. It is particularly problematic when the adverse event of interest to the review is collected systematically in some studies but non-systematically in other studies. Different collection methods introduce an important source of heterogeneity. In addition, when non-systematic adverse events are reported based on quantitative selection criteria (e.g. only adverse events that occurred in at least 5% of participants were included in the publication), use of reported data alone may bias the results of meta-analyses. Review authors should be cautious of (or refrain from) synthesizing adverse events that are collected differently.

Regardless of the collection methods, precise definitions of adverse effect outcomes and their intensity should be recorded, since they may vary between studies. For example, in a review of aspirin and gastrointestinal haemorrhage, some trials simply reported gastrointestinal bleeds, while others reported specific categories of bleeding, such as haematemesis, melaena, and proctorrhagia (Derry and Loke 2000). The definition and reporting of severity of the haemorrhages (e.g. major, severe, requiring hospital admission) also varied considerably among the trials (Zanchetti and Hansson 1999). Moreover, a particular adverse effect may be described or measured in different ways among the studies. For example, the terms ‘tiredness’, ‘fatigue’ or ‘lethargy’ may all be used in reporting of adverse effects. Study authors also may use different thresholds for ‘abnormal’ results (e.g. hypokalaemia diagnosed at a serum potassium concentration of 3.0 mmol/L or 3.5 mmol/L).

No mention of adverse events in trial reports does not necessarily mean that no adverse events occurred. It is usually safest to assume that they were not reported. Quality of life measures are sometimes used as a measure of the participants’ experience during the study, but these are usually general measures that do not look specifically at particular adverse effects of the intervention. While quality of life measures are important and can be used to gauge overall participant well-being, they should not be regarded as substitutes for a detailed evaluation of safety and tolerability.

5.3.6 Results

Results data arise from the measurement or ascertainment of outcomes for individual participants in an intervention study. Results data may be available for each individual in a study (i.e. individual participant data; see Chapter 26 ), or summarized at arm level, or summarized at study level into an intervention effect by comparing two intervention arms. Results data should be collected only for the intervention groups and outcomes specified to be of interest in the protocol (see MECIR Box 5.3.b ). Results for other outcomes should not be collected unless the protocol is modified to add them. Any modification should be reported in the review. However, review authors should be alert to the possibility of important, unexpected findings, particularly serious adverse effects.

MECIR Box 5.3.b Relevant expectations for conduct of intervention reviews

Reports of studies often include several results for the same outcome. For example, different measurement scales might be used, results may be presented separately for different subgroups, and outcomes may have been measured at different follow-up time points. Variation in the results can be very large, depending on which data are selected (Gøtzsche et al 2007, Mayo-Wilson et al 2017a). Review protocols should be as specific as possible about which outcome domains, measurement tools, time points, and summary statistics (e.g. final values versus change from baseline) are to be collected (Mayo-Wilson et al 2017b). A framework should be pre-specified in the protocol to facilitate making choices between multiple eligible measures or results. For example, a hierarchy of preferred measures might be created, or plans articulated to select the result with the median effect size, or to average across all eligible results for a particular outcome domain (see also Chapter 9, Section 9.3.3 ). Any additional decisions or changes to this framework made once the data are collected should be reported in the review as changes to the protocol.

Section 5.6 describes the numbers that will be required to perform meta-analysis, if appropriate. The unit of analysis (e.g. participant, cluster, body part, treatment period) should be recorded for each result when it is not obvious (see Chapter 6, Section 6.2 ). The type of outcome data determines the nature of the numbers that will be sought for each outcome. For example, for a dichotomous (‘yes’ or ‘no’) outcome, the number of participants and the number who experienced the outcome will be sought for each group. It is important to collect the sample size relevant to each result, although this is not always obvious. A flow diagram as recommended in the CONSORT Statement (Moher et al 2001) can help to determine the flow of participants through a study. If one is not available in a published report, review authors can consider drawing one (available from www.consort-statement.org ).

The numbers required for meta-analysis are not always available. Often, other statistics can be collected and converted into the required format. For example, for a continuous outcome, it is usually most convenient to seek the number of participants, the mean and the standard deviation for each intervention group. These are often not available directly, especially the standard deviation. Alternative statistics enable calculation or estimation of the missing standard deviation (such as a standard error, a confidence interval, a test statistic (e.g. from a t-test or F-test) or a P value). These should be extracted if they provide potentially useful information (see MECIR Box 5.3.c ). Details of recalculation are provided in Section 5.6 . Further considerations for dealing with missing data are discussed in Chapter 10, Section 10.12 .
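As a worked illustration of such conversions, here is a minimal Python sketch under the usual assumptions of an approximately normal sampling distribution and a 95% confidence interval reported for a single group mean; the numbers are invented:

    import math

    def se_from_ci(lower, upper, z=1.96):
        # Standard error recovered from a 95% CI for a group mean.
        return (upper - lower) / (2 * z)

    def sd_from_se(se, n):
        # Standard deviation recovered from the standard error of a mean.
        return se * math.sqrt(n)

    n = 50                                  # participants in the group
    se = se_from_ci(lower=4.0, upper=6.0)   # reported 95% CI of the mean
    print(round(sd_from_se(se, n), 2))      # estimated SD, about 3.61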

MECIR Box 5.3.c Relevant expectations for conduct of intervention reviews

5.3.7 Other information to collect

We recommend that review authors collect the key conclusions of the included study as reported by its authors. It is not necessary to report these conclusions in the review, but they should be used to verify the results of analyses undertaken by the review authors, particularly in relation to the direction of effect. Further comments by the study authors, for example any explanations they provide for unexpected findings, may be noted. References to other studies that are cited in the study report may be useful, although review authors should be aware of the possibility of citation bias (see Chapter 7, Section 7.2.3.2 ). Documentation of any correspondence with the study authors is important for review transparency.

5.4 Data collection tools

5.4.1 Rationale for data collection forms

Data collection for systematic reviews should be performed using structured data collection forms (see MECIR Box 5.4.a ). These can be paper forms, electronic forms (e.g. Google Forms), or commercially or custom-built data systems (e.g. Covidence, EPPI-Reviewer, Systematic Review Data Repository (SRDR)) that allow online form building, data entry by several users, data sharing, and efficient data management (Li et al 2015). All different means of data collection require data collection forms.

MECIR Box 5.4.a Relevant expectations for conduct of intervention reviews

The data collection form is a bridge between what is reported by the original investigators (e.g. in journal articles, abstracts, personal correspondence) and what is ultimately reported by the review authors. The data collection form serves several important functions (Meade and Richardson 1997). First, the form is linked directly to the review question and criteria for assessing eligibility of studies, and provides a clear summary of these that can be used to identify and structure the data to be extracted from study reports. Second, the data collection form is the historical record of the provenance of the data used in the review, as well as the multitude of decisions (and changes to decisions) that occur throughout the review process. Third, the form is the source of data for inclusion in an analysis.

Given the important functions of data collection forms, ample time and thought should be invested in their design. Because each review is different, data collection forms will vary across reviews. However, there are many similarities in the types of information that are important. Thus, forms can be adapted from one review to the next. Although we use the term ‘data collection form’ in the singular, in practice it may be a series of forms used for different purposes: for example, a separate form could be used to assess the eligibility of studies for inclusion in the review to assist in the quick identification of studies to be excluded from or included in the review.

5.4.2 Considerations in selecting data collection tools

The choice of data collection tool is largely dependent on review authors’ preferences, the size of the review, and resources available to the author team. Potential advantages and considerations of selecting one data collection tool over another are outlined in Table 5.4.a (Li et al 2015). A significant advantage that data systems have is in data management ( Chapter 1, Section 1.6 ) and re-use. They make review updates more efficient, and also facilitate methodological research across reviews. Numerous ‘meta-epidemiological’ studies have been carried out using Cochrane Review data, resulting in methodological advances which would not have been possible if thousands of studies had not all been described using the same data structures in the same system.

Some data collection tools, such as CSV (Excel) files and Covidence, facilitate automatic import of extracted data into RevMan (Cochrane’s authoring tool). Details are available at https://documentation.cochrane.org/revman-kb/populate-study-data-260702462.html

Table 5.4.a Considerations in selecting data collection tools

5.4.3 Design of a data collection form

Regardless of whether data are collected using a paper or electronic form, or a data system, the key to successful data collection is to construct easy-to-use forms and collect sufficient and unambiguous data that faithfully represent the source in a structured and organized manner (Li et al 2015). In most cases, a document format should be developed for the form before building an electronic form or a data system. This can be distributed to others, including programmers and data analysts, and as a guide for creating an electronic form and any guidance or codebook to be used by data extractors. Review authors also should consider compatibility of any electronic form or data system with analytical software, as well as mechanisms for recording, assessing and correcting data entry errors.

Data described in multiple reports (or even within a single report) of a study may not be consistent. Review authors will need to describe how they work with multiple reports in the protocol, for example, by pre-specifying which report will be used when sources contain conflicting data that cannot be resolved by contacting the investigators. Likewise, when there is only one report identified for a study, review authors should specify the section within the report (e.g. abstract, methods, results, tables, and figures) for use in case of inconsistent information.

If review authors wish to automatically import their extracted data into RevMan, it is advisable that their data collection forms match the data extraction templates available via the RevMan Knowledge Base. Details are available at https://documentation.cochrane.org/revman-kb/data-extraction-templates-260702375.html.

A good data collection form should minimize the need to go back to the source documents. When designing a data collection form, review authors should involve all members of the team, that is, content area experts, authors with experience in systematic review methods and data collection form design, statisticians, and persons who will perform data extraction. Here are suggested steps and some tips for designing a data collection form, based on the informal collation of experiences from numerous review authors (Li et al 2015).

Step 1. Develop outlines of tables and figures expected to appear in the systematic review, considering the comparisons to be made between different interventions within the review, and the various outcomes to be measured. This step will help review authors decide the right amount of data to collect (not too much or too little). Collecting too much information can lead to forms that are longer than original study reports, and can be very wasteful of time. Collection of too little information, or omission of key data, can lead to the need to return to study reports later in the review process.

Step 2. Assemble and group data elements to facilitate form development. Review authors should consult Table 5.3.a , in which the data elements are grouped to facilitate form development and data collection. Note that it may be more efficient to group data elements in the order in which they are usually found in study reports (e.g. starting with reference information, followed by eligibility criteria, intervention description, statistical methods, baseline characteristics and results).

Step 3. Identify the optimal way of framing the data items. Much has been written about how to frame data items for developing robust data collection forms in primary research studies. We summarize a few key points and highlight issues that are pertinent to systematic reviews.

  • Ask closed-ended questions (i.e. questions that define a list of permissible responses) as much as possible. Closed-ended questions do not require post hoc coding and provide better control over data quality than open-ended questions. When setting up a closed-ended question, one must anticipate and structure possible responses and include an ‘other, specify’ category because the anticipated list may not be exhaustive. Avoid asking data extractors to summarize data into uncoded text, no matter how short it is. (A minimal sketch of such a closed-ended item follows this list.)
  • Avoid asking a question in a way that the response may be left blank. Include ‘not applicable’, ‘not reported’ and ‘cannot tell’ options as needed. The ‘cannot tell’ option tags uncertain items that may prompt review authors to contact study authors for clarification, especially on data items critical to reaching conclusions.
  • Remember that the form will focus on what is reported in the article rather than what was done in the study. The study report may not fully reflect how the study was actually conducted. For example, a question ‘Did the article report that the participants were masked to the intervention?’ is more appropriate than ‘Were participants masked to the intervention?’
  • Where a judgement is required, record the raw data (i.e. quote directly from the source document) used to make the judgement. It is also important to record the source of information collected, including where it was found in a report or whether information was obtained from unpublished sources or personal communications. As much as possible, questions should be asked in a way that minimizes subjective interpretation and judgement to facilitate data comparison and adjudication.
  • Incorporate flexibility to allow for variation in how data are reported. It is strongly recommended that outcome data be collected in the format in which they were reported and transformed in a subsequent step if required. Review authors also should consider the software they will use for analysis and for publishing the review (e.g. RevMan).
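Putting several of these points together, here is a minimal sketch of one closed-ended data item; the question wording, options, and field names are illustrative assumptions only, not a prescribed template:

    # A closed-ended item with 'other, specify', 'not reported' and
    # 'cannot tell' options, plus fields for the supporting quote and source.
    allocation_item = {
        "question": ("Did the article report how the allocation "
                     "sequence was generated?"),
        "options": [
            "computer-generated random numbers",
            "random number table",
            "other, specify",   # the anticipated list may not be exhaustive
            "not applicable",
            "not reported",
            "cannot tell",      # tags items for possible author correspondence
        ],
        "quote": "",            # verbatim text supporting the judgement
        "source": "",           # e.g. 'Methods section' or 'trial register'
    }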

Step 4. Develop and pilot-test data collection forms, ensuring that they provide data in the right format and structure for subsequent analysis. In addition to data items described in Step 2, data collection forms should record the title of the review as well as the person who is completing the form and the date of completion. Forms occasionally need revision; forms should therefore include the version number and version date to reduce the chances of using an outdated form by mistake. Because a study may be associated with multiple reports, it is important to record the study ID as well as the report ID. Definitions and instructions helpful for answering a question should appear next to the question to improve quality and consistency across data extractors (Stock 1994). Provide space for notes, regardless of whether paper or electronic forms are used.

All data collection forms and data systems should be thoroughly pilot-tested before launch (see MECIR Box 5.4.a ). Testing should involve several people extracting data from at least a few articles. The initial testing focuses on the clarity and completeness of questions. Users of the form may provide feedback that certain coding instructions are confusing or incomplete (e.g. a list of options may not cover all situations). The testing may identify data that are missing from the form, or likely to be superfluous. After initial testing, accuracy of the extracted data should be checked against the source document or verified data to identify problematic areas. It is wise to draft entries for the table of ‘Characteristics of included studies’ and complete a risk of bias assessment ( Chapter 8 ) using these pilot reports to ensure all necessary information is collected. A consensus between review authors may be required before the form is modified to avoid any misunderstandings or later disagreements. It may be necessary to repeat the pilot testing on a new set of reports if major changes are needed after the first pilot test.

Problems with the data collection form may surface after pilot testing has been completed, and the form may need to be revised after data extraction has started. When changes are made to the form or coding instructions, it may be necessary to return to reports that have already undergone data extraction. In some situations, it may be necessary to clarify only coding instructions without modifying the actual data collection form.

5.5 Extracting data from reports

5.5.1 Introduction

In most systematic reviews, the primary source of information about each study is published reports of studies, usually in the form of journal articles. Despite recent developments in machine learning models to automate data extraction in systematic reviews (see Section 5.5.9 ), data extraction is still largely a manual process. Electronic searches for text can provide a useful aid to locating information within a report. Examples include using search facilities in PDF viewers, internet browsers and word processing software. However, text searching should not be considered a replacement for reading the report, since information may be presented using variable terminology and presented in multiple formats.

5.5.2 Who should extract data?

Data extractors should have at least a basic understanding of the topic, and have knowledge of study design, data analysis and statistics. They should pay attention to detail while following instructions on the forms. Because errors that occur at the data extraction stage are rarely detected by peer reviewers, editors, or users of systematic reviews, it is recommended that more than one person extract data from every report to minimize errors and reduce introduction of potential biases by review authors (see MECIR Box 5.5.a ). As a minimum, information that involves subjective interpretation and information that is critical to the interpretation of results (e.g. outcome data) should be extracted independently by at least two people (see MECIR Box 5.5.a ). In common with implementation of the selection process ( Chapter 4, Section 4.6 ), it is preferable that data extractors are from complementary disciplines, for example a methodologist and a topic area specialist. It is important that everyone involved in data extraction has practice using the form and, if the form was designed by someone else, receives appropriate training.

Evidence in support of duplicate data extraction comes from several indirect sources. One study observed that independent data extraction by two authors resulted in fewer errors than data extraction by a single author followed by verification by a second (Buscemi et al 2006). A high prevalence of data extraction errors (errors in 20 out of 34 reviews) has been observed (Jones et al 2005). A further study of data extraction to compute standardized mean differences found that a minimum of seven out of 27 reviews had substantial errors (Gøtzsche et al 2007).

MECIR Box 5.5.a Relevant expectations for conduct of intervention reviews

5.5.3 Training data extractors

Training of data extractors is intended to familiarize them with the review topic and methods, the data collection form or data system, and issues that may arise during data extraction. Results of the pilot testing of the form should prompt discussion among review authors and extractors of ambiguous questions or responses to establish consistency. Training should take place at the onset of the data extraction process and periodically over the course of the project (Li et al 2015). For example, when data related to a single item on the form are present in multiple locations within a report (e.g. abstract, main body of text, tables, and figures) or in several sources (e.g. publications, ClinicalTrials.gov, or CSRs), the development and documentation of instructions to follow an agreed algorithm are critical and should be reinforced during the training sessions.

Some have proposed that some information in a report, such as its authors, be blinded to the review author prior to data extraction and assessment of risk of bias (Jadad et al 1996). However, blinding of review authors to aspects of study reports generally is not recommended for Cochrane Reviews as there is little evidence that it alters the decisions made (Berlin 1997).

5.5.4 Extracting data from multiple reports of the same study

Studies frequently are reported in more than one publication or in more than one source (Tramèr et al 1997, von Elm et al 2004). A single source rarely provides complete information about a study; on the other hand, multiple sources may contain conflicting information about the same study (Mayo-Wilson et al 2017a, Mayo-Wilson et al 2017b, Mayo-Wilson et al 2018). Because the unit of interest in a systematic review is the study and not the report, information from multiple reports often needs to be collated and reconciled. It is not appropriate to discard any report of an included study without careful examination, since it may contain valuable information not included in the primary report. Review authors will need to decide between two strategies:

  • Extract data from each report separately, then combine information across multiple data collection forms.
  • Extract data from all reports directly into a single data collection form.

The choice of which strategy to use will depend on the nature of the reports and may vary across studies and across reports. For example, when a full journal article and multiple conference abstracts are available, it is likely that the majority of information will be obtained from the journal article; completing a new data collection form for each conference abstract may be a waste of time. Conversely, when there are two or more detailed journal articles, perhaps relating to different periods of follow-up, then it is likely to be easier to perform data extraction separately for these articles and collate information from the data collection forms afterwards. When data from all reports are extracted into a single data collection form, review authors should identify the ‘main’ data source for each study when sources include conflicting data and these differences cannot be resolved by contacting authors (Mayo-Wilson et al 2018). Flow diagrams such as those modified from the PRISMA statement can be particularly helpful when collating and documenting information from multiple reports (Mayo-Wilson et al 2018).
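
When data from multiple reports are collated electronically, a simple tabular convention can make the reconciliation auditable. The sketch below is one possible approach, not a Cochrane-specified procedure; the study names, field names, and the use of pandas are illustrative assumptions.

```python
# Minimal sketch: collating extraction records from multiple reports of the
# same study into one row per study, keeping track of a designated 'main'
# source. Study IDs, fields, and values are hypothetical.
import pandas as pd

extractions = pd.DataFrame([
    {"study": "Smith 2019", "report": "journal article",     "main": True,  "n_randomized": 240},
    {"study": "Smith 2019", "report": "conference abstract", "main": False, "n_randomized": 236},
    {"study": "Lee 2021",   "report": "journal article",     "main": True,  "n_randomized": 180},
])

# Prefer the main source where reports conflict; keep all rows for audit.
collated = (extractions.sort_values("main", ascending=False)
                       .groupby("study", as_index=False).first())
print(collated[["study", "report", "n_randomized"]])
```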

5.5.5 Reliability and reaching consensus

When more than one author extracts data from the same reports, there is potential for disagreement. After data have been extracted independently by two or more extractors, responses must be compared to assure agreement or to identify discrepancies. An explicit procedure or decision rule should be specified in the protocol for identifying and resolving disagreements. Most often, the source of the disagreement is an error by one of the extractors and is easily resolved. Thus, discussion among the authors is a sensible first step. More rarely, a disagreement may require arbitration by another person. Any disagreement that cannot be resolved should be addressed by contacting the study authors; if this is unsuccessful, the disagreement should be reported in the review.

The presence and resolution of disagreements should be carefully recorded. Maintaining a copy of the data ‘as extracted’ (in addition to the consensus data) allows assessment of reliability of coding. Examples of ways in which this can be achieved include the following:

  • Use one author’s (paper) data collection form and record changes after consensus in a different ink colour.
  • Enter consensus data onto an electronic form.
  • Record original data extracted and consensus data in separate forms (some online tools do this automatically).

Agreement of coded items before reaching consensus can be quantified, for example using kappa statistics (Orwin 1994), although this is not routinely done in Cochrane Reviews. If agreement is assessed, this should be done only for the most important data (e.g. key risk of bias assessments, or availability of key outcomes).
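
For review teams that do wish to quantify agreement, Cohen's kappa compares observed agreement with the agreement expected by chance. The following is a minimal sketch of that calculation; the item codes and judgements are hypothetical examples, not taken from any review.

```python
# Minimal sketch: Cohen's kappa for two extractors' codes on the same items.
from collections import Counter

def cohens_kappa(codes_a, codes_b):
    """Observed agreement corrected for agreement expected by chance."""
    assert len(codes_a) == len(codes_b)
    n = len(codes_a)
    p_observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    labels = set(codes_a) | set(codes_b)
    p_expected = sum((freq_a[l] / n) * (freq_b[l] / n) for l in labels)
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical risk-of-bias judgements from two extractors:
extractor_1 = ["low", "high", "unclear", "low", "low", "high"]
extractor_2 = ["low", "high", "low", "low", "unclear", "high"]
print(round(cohens_kappa(extractor_1, extractor_2), 2))  # 0.45
```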

Throughout the review process informal consideration should be given to the reliability of data extraction. For example, if after reaching consensus on the first few studies, the authors note a frequent disagreement for specific data, then coding instructions may need modification. Furthermore, an author’s coding strategy may change over time, as the coding rules are forgotten, indicating a need for retraining and, possibly, some recoding.

5.5.6 Extracting data from clinical study reports

Clinical study reports (CSRs) obtained for a systematic review are likely to be in PDF format. Although CSRs can be thousands of pages in length and very time-consuming to review, they typically follow the content and format required by the International Conference on Harmonisation (ICH 1995). Information in CSRs is usually presented in a structured and logical way. For example, numerical data pertaining to important demographic, efficacy, and safety variables are placed within the main text in tables and figures. Because of the clarity and completeness of information provided in CSRs, data extraction from CSRs may be clearer and conducted more confidently than from journal articles or other short reports.

To extract data from CSRs efficiently, review authors should familiarize themselves with the structure of the CSRs. In practice, review authors may want to browse or create ‘bookmarks’ within a PDF document that record section headers and subheaders and search key words related to the data extraction (e.g. randomization). In addition, it may be useful to utilize optical character recognition software to convert tables of data in the PDF to an analysable format when additional analyses are required, saving time and minimizing transcription errors.

CSRs may contain many outcomes and present many results for a single outcome (due to different analyses) (Mayo-Wilson et al 2017b). We recommend review authors extract results only for outcomes of interest to the review (Section 5.3.6 ). With regard to different methods of analysis, review authors should have a plan and pre-specify preferred metrics in their protocol for extracting results pertaining to different populations (e.g. ‘all randomized’, ‘all participants taking at least one dose of medication’), methods for handling missing data (e.g. ‘complete case analysis’, ‘multiple imputation’), and adjustment (e.g. unadjusted, adjusted for baseline covariates). It may be important to record the range of analysis options available, even if not all are extracted in detail. In some cases it may be preferable to use metrics that are comparable across multiple included studies, which may not be clear until data collection for all studies is complete.

CSRs are particularly useful for identifying outcomes assessed but not presented to the public. For efficacy outcomes and systematically collected adverse events, review authors can compare what is described in the CSRs with what is reported in published reports to assess the risk of bias due to missing outcome data ( Chapter 8, Section 8.5 ) and in selection of reported result ( Chapter 8, Section 8.7 ). Note that non-systematically collected adverse events are not amenable to such comparisons because these adverse events may not be known ahead of time and thus not pre-specified in the protocol.

5.5.7 Extracting data from regulatory reviews

Data most relevant to systematic reviews can be found in the medical and statistical review sections of a regulatory review. Both of these are substantially longer than journal articles (Turner 2013). A list of all trials on a drug usually can be found in the medical review. Because trials are referenced by a combination of numbers and letters, it may be difficult for the review authors to link the trial with other reports of the same trial (Section 5.2.1 ).

Many of the documents downloaded from the US Food and Drug Administration’s website for older drugs are scanned copies and are not searchable because of redaction of confidential information (Turner 2013). Optical character recognition software can convert most of the text. Reviews for newer drugs have been redacted electronically; documents remain searchable as a result.

Compared to CSRs, regulatory reviews contain less information about trial design, execution, and results. They provide limited information for assessing the risk of bias. In terms of extracting outcomes and results, review authors should follow the guidance provided for CSRs (Section 5.5.6 ).

5.5.8 Extracting data from figures with software

Sometimes numerical data needed for systematic reviews are only presented in figures. Review authors may request the data from the study investigators, or alternatively, extract the data from the figures either manually (e.g. with a ruler) or by using software. Numerous tools are available, many of which are free. Those available at the time of writing include Plot Digitizer, WebPlotDigitizer, Engauge, Dexter, ycasd, and GetData Graph Digitizer. The software works by taking an image of a figure and then digitizing the data points off the figure using the axes and scales set by the user. The numbers exported can be used for systematic reviews, although additional calculations may be needed to obtain the summary statistics, such as calculation of means and standard deviations from individual-level data points (or conversion of time-to-event data presented on Kaplan-Meier plots to hazard ratios; see Chapter 6, Section 6.8.2 ).
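
The underlying transformation these tools perform is a simple linear mapping from pixel coordinates to data coordinates, calibrated from known points on each axis. The sketch below illustrates the idea under the assumption of linear axes; all pixel positions and axis values are hypothetical.

```python
# Minimal sketch of what graph-digitizing tools do internally: map pixel
# coordinates of clicked points to data coordinates using two calibration
# points per axis. Assumes linear axes; all numbers are hypothetical.

def make_axis_mapper(pixel_lo, pixel_hi, value_lo, value_hi):
    """Return a function converting a pixel position to a data value."""
    scale = (value_hi - value_lo) / (pixel_hi - pixel_lo)
    return lambda pixel: value_lo + (pixel - pixel_lo) * scale

# Calibration: x-axis pixels 100..700 span weeks 0..12;
# y-axis pixels 500..50 span scores 0..100 (pixel y grows downwards).
x_of = make_axis_mapper(100, 700, 0, 12)
y_of = make_axis_mapper(500, 50, 0, 100)

for px, py in [(150, 430), (400, 250), (650, 120)]:  # digitized points
    print(f"week {x_of(px):.1f}: score {y_of(py):.1f}")
```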

It has been demonstrated that software is more convenient and accurate than visual estimation or use of a ruler (Gross et al 2014, Jelicic Kadic et al 2016). Review authors should consider using software for extracting numerical data from figures when the data are not available elsewhere.

5.5.9 Automating data extraction in systematic reviews

Because data extraction is time-consuming and error-prone, automating or semi-automating this step may make the extraction process more efficient and accurate. The state of science relevant to automating data extraction is summarized here (Jonnalagadda et al 2015).

  • At least 26 studies have tested various natural language processing and machine learning approaches for facilitating data extraction for systematic reviews.

  • Each tool focuses on only a limited number of data elements (ranging from one to seven). Most of the existing tools focus on PICO information (e.g. number of participants, their age, sex, country, recruiting centres, intervention groups, outcomes, and time points). A few are able to extract study design and results (e.g. objectives, study duration, participant flow), and two extract risk of bias information (Marshall et al 2016, Millard et al 2016). To date, well over half of the data elements needed for systematic reviews have not been explored for automated extraction.

  • Most tools highlight the sentence(s) that may contain the data elements as opposed to directly recording these data elements into a data collection form or a data system.
  • There is no gold standard or common dataset to evaluate the performance of these tools, limiting our ability to interpret the significance of the reported accuracy measures.

At the time of writing, we cannot recommend a specific tool for automating data extraction for routine systematic review production. There is a need for review authors to work with experts in informatics to refine these tools and evaluate them rigorously. Such investigations should address how the tool will fit into existing workflows. For example, the automated or semi-automated data extraction approaches may first act as checks for manual data extraction before they can replace it.

5.5.10 Suspicions of scientific misconduct

Systematic review authors can uncover suspected misconduct in the published literature. Misconduct includes fabrication or falsification of data or results, plagiarism, and research that does not adhere to ethical norms. Review authors need to be aware of scientific misconduct because the inclusion of fraudulent material could undermine the reliability of a review’s findings. Plagiarism of results data in the form of duplicated publication (either by the same or by different authors) may, if undetected, lead to study participants being double counted in a synthesis.

It is preferable to identify potential problems before, rather than after, publication of the systematic review, so that readers are not misled. However, empirical evidence indicates that the extent to which systematic review authors explore misconduct varies widely (Elia et al 2016). Text-matching software and systems such as CrossCheck may be helpful for detecting plagiarism, but they can detect only matching text, so data tables or figures need to be inspected by hand or using other systems (e.g. to detect image manipulation). Lists of data such as in a meta-analysis can be a useful means of detecting duplicated studies. Furthermore, examination of baseline data can lead to suspicions of misconduct for an individual randomized trial (Carlisle et al 2015). For example, Al-Marzouki and colleagues concluded that a trial report was fabricated or falsified on the basis of highly unlikely baseline differences between two randomized groups (Al-Marzouki et al 2005).

Cochrane Review authors are advised to consult with Cochrane editors if cases of suspected misconduct are identified. Searching for comments, letters or retractions may uncover additional information. Sensitivity analyses can be used to determine whether the studies arousing suspicion are influential in the conclusions of the review. Guidance for editors for addressing suspected misconduct will be available from Cochrane’s Editorial Publishing and Policy Resource (see community.cochrane.org ). Further information is available from the Committee on Publication Ethics (COPE; publicationethics.org ), including a series of flowcharts on how to proceed if various types of misconduct are suspected. Cases should be followed up, typically including an approach to the editors of the journals in which suspect reports were published. It may be useful to write first to the primary investigators to request clarification of apparent inconsistencies or unusual observations.

Because investigations may take time, and institutions may not always be responsive (Wager 2011), articles suspected of being fraudulent should be classified as ‘awaiting assessment’. If a misconduct investigation indicates that the publication is unreliable, or if a publication is retracted, it should not be included in the systematic review, and the reason should be noted in the ‘excluded studies’ section.

5.5.11 Key points in planning and reporting data extraction

In summary, the methods section of both the protocol and the review should detail:

  • the data categories that are to be extracted;
  • how extracted data from each report will be verified (e.g. extraction by two review authors, independently);
  • whether data extraction is undertaken by content area experts, methodologists, or both;
  • pilot testing, training and existence of coding instructions for the data collection form;
  • how data are extracted from multiple reports from the same study; and
  • how disagreements are handled when more than one author extracts data from each report.

5.6 Extracting study results and converting to the desired format

In most cases, it is desirable to collect summary data separately for each intervention group of interest and to enter these into software in which effect estimates can be calculated, such as RevMan. Sometimes the required data may be obtained only indirectly, and the relevant results may not be obvious. Chapter 6 provides many useful tips and techniques to deal with common situations. When summary data cannot be obtained from each intervention group, or where it is important to use results of adjusted analyses (for example to account for correlations in crossover or cluster-randomized trials), effect estimates may be available directly.
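
As an illustration of the kind of calculation such software performs, the sketch below computes a risk ratio and a 95% confidence interval from hypothetical per-group summary data, using the standard large-sample formula for the standard error of the log risk ratio. It is a worked example, not a substitute for the methods described in Chapter 6.

```python
# Minimal sketch: risk ratio and 95% CI from per-group summary data,
# as meta-analysis software does. Event counts are hypothetical.
import math

events_int, total_int = 12, 100      # intervention group
events_ctl, total_ctl = 24, 100      # control group

risk_ratio = (events_int / total_int) / (events_ctl / total_ctl)
se_log_rr = math.sqrt(1 / events_int - 1 / total_int
                      + 1 / events_ctl - 1 / total_ctl)

ci_lo = math.exp(math.log(risk_ratio) - 1.96 * se_log_rr)
ci_hi = math.exp(math.log(risk_ratio) + 1.96 * se_log_rr)
print(f"RR {risk_ratio:.2f} (95% CI {ci_lo:.2f} to {ci_hi:.2f})")
```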

5.7 Managing and sharing data

When data have been collected for each individual study, it is helpful to organize them into a comprehensive electronic format, such as a database or spreadsheet, before entering data into a meta-analysis or other synthesis. When data are collated electronically, all or a subset of them can easily be exported for cleaning, consistency checks and analysis.

Tabulation of collected information about studies can facilitate classification of studies into appropriate comparisons and subgroups. It also allows identification of comparable outcome measures and statistics across studies. It will often be necessary to perform calculations to obtain the required statistics for presentation or synthesis. It is important through this process to retain clear information on the provenance of the data, with a clear distinction between data from a source document and data obtained through calculations. Statistical conversions, for example from standard errors to standard deviations, ideally should be undertaken with a computer rather than using a hand calculator to maintain a permanent record of the original and calculated numbers as well as the actual calculations used.
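
For example, a standard deviation can be recovered from a reported standard error via SD = SE × √n. A minimal sketch of performing this conversion while preserving provenance might look as follows; the study label and numbers are hypothetical.

```python
# Minimal sketch: converting a reported standard error to a standard
# deviation (SD = SE * sqrt(n)) while keeping a record of the original
# value, the rule applied, and the derived value. Numbers are hypothetical.
import math

record = {"study": "Example 2020", "n": 45, "se_reported": 1.2}
record["sd_calculated"] = record["se_reported"] * math.sqrt(record["n"])
record["conversion"] = "SD = SE * sqrt(n)"
print(record)  # permanent record: source value, formula, result
```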

Ideally, data only need to be extracted once and should be stored in a secure and stable location for future updates of the review, regardless of whether the original review authors or a different group of authors update the review (Ip et al 2012). Standardizing and sharing data collection tools as well as data management systems among review authors working in similar topic areas can streamline systematic review production. Review authors have the opportunity to work with trialists, journal editors, funders, regulators, and other stakeholders to make study data (e.g. CSRs, IPD, and any other form of study data) publicly available, increasing the transparency of research. When legal and ethical to do so, we encourage review authors to share the data used in their systematic reviews to reduce waste and to allow verification and reanalysis because data will not have to be extracted again for future use (Mayo-Wilson et al 2018).

5.8 Chapter information

Editors: Tianjing Li, Julian PT Higgins, Jonathan J Deeks

Acknowledgements: This chapter builds on earlier versions of the Handbook. For details of previous authors and editors of the Handbook, see Preface. Andrew Herxheimer, Nicki Jackson, Yoon Loke, Deirdre Price and Helen Thomas contributed text. Stephanie Taylor and Sonja Hood contributed suggestions for designing data collection forms. We are grateful to Judith Anzures, Mike Clarke, Miranda Cumpston and Peter Gøtzsche for helpful comments.

Funding: JPTH is a member of the National Institute for Health Research (NIHR) Biomedical Research Centre at University Hospitals Bristol NHS Foundation Trust and the University of Bristol. JJD received support from the NIHR Birmingham Biomedical Research Centre at the University Hospitals Birmingham NHS Foundation Trust and the University of Birmingham. JPTH received funding from National Institute for Health Research Senior Investigator award NF-SI-0617-10145. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health.

5.9 References

Al-Marzouki S, Evans S, Marshall T, Roberts I. Are these data real? Statistical methods for the detection of data fabrication in clinical trials. BMJ 2005; 331 : 267-270.

Allen EN, Mushi AK, Massawe IS, Vestergaard LS, Lemnge M, Staedke SG, Mehta U, Barnes KI, Chandler CI. How experiences become data: the process of eliciting adverse event, medical history and concomitant medication reports in antimalarial and antiretroviral interaction trials. BMC Medical Research Methodology 2013; 13 : 140.

Baudard M, Yavchitz A, Ravaud P, Perrodeau E, Boutron I. Impact of searching clinical trial registries in systematic reviews of pharmaceutical treatments: methodological systematic review and reanalysis of meta-analyses. BMJ 2017; 356 : j448.

Bent S, Padula A, Avins AL. Better ways to question patients about adverse medical events: a randomized, controlled trial. Annals of Internal Medicine 2006; 144 : 257-261.

Berlin JA. Does blinding of readers affect the results of meta-analyses? University of Pennsylvania Meta-analysis Blinding Study Group. Lancet 1997; 350 : 185-186.

Buscemi N, Hartling L, Vandermeer B, Tjosvold L, Klassen TP. Single data extraction generated more errors than double data extraction in systematic reviews. Journal of Clinical Epidemiology 2006; 59 : 697-703.

Carlisle JB, Dexter F, Pandit JJ, Shafer SL, Yentis SM. Calculating the probability of random sampling for continuous variables in submitted or published randomised controlled trials. Anaesthesia 2015; 70 : 848-858.

Carroll C, Patterson M, Wood S, Booth A, Rick J, Balain S. A conceptual framework for implementation fidelity. Implementation Science 2007; 2 : 40.

Carvajal A, Ortega PG, Sainz M, Velasco V, Salado I, Arias LHM, Eiros JM, Rubio AP, Castrodeza J. Adverse events associated with pandemic influenza vaccines: Comparison of the results of a follow-up study with those coming from spontaneous reporting. Vaccine 2011; 29 : 519-522.

Chamberlain C, O'Mara-Eves A, Porter J, Coleman T, Perlen SM, Thomas J, McKenzie JE. Psychosocial interventions for supporting women to stop smoking in pregnancy. Cochrane Database of Systematic Reviews 2017; 2 : CD001055.

Damschroder LJ, Aron DC, Keith RE, Kirsh SR, Alexander JA, Lowery JC. Fostering implementation of health services research findings into practice: a consolidated framework for advancing implementation science. Implementation Science 2009; 4 : 50.

Davis AL, Miller JD. The European Medicines Agency and publication of clinical study reports: a challenge for the US FDA. JAMA 2017; 317 : 905-906.

Denniston AK, Holland GN, Kidess A, Nussenblatt RB, Okada AA, Rosenbaum JT, Dick AD. Heterogeneity of primary outcome measures used in clinical trials of treatments for intermediate, posterior, and panuveitis. Orphanet Journal of Rare Diseases 2015; 10 : 97.

Derry S, Loke YK. Risk of gastrointestinal haemorrhage with long term use of aspirin: meta-analysis. BMJ 2000; 321 : 1183-1187.

Doshi P, Dickersin K, Healy D, Vedula SS, Jefferson T. Restoring invisible and abandoned trials: a call for people to publish the findings. BMJ 2013; 346 : f2865.

Dusenbury L, Brannigan R, Falco M, Hansen WB. A review of research on fidelity of implementation: implications for drug abuse prevention in school settings. Health Education Research 2003; 18 : 237-256.

Dwan K, Altman DG, Clarke M, Gamble C, Higgins JPT, Sterne JAC, Williamson PR, Kirkham JJ. Evidence for the selective reporting of analyses and discrepancies in clinical trials: a systematic review of cohort studies of clinical trials. PLoS Medicine 2014; 11 : e1001666.

Elia N, von Elm E, Chatagner A, Popping DM, Tramèr MR. How do authors of systematic reviews deal with research malpractice and misconduct in original studies? A cross-sectional analysis of systematic reviews and survey of their authors. BMJ Open 2016; 6 : e010442.

Gøtzsche PC. Multiple publication of reports of drug trials. European Journal of Clinical Pharmacology 1989; 36 : 429-432.

Gøtzsche PC, Hróbjartsson A, Maric K, Tendal B. Data extraction errors in meta-analyses that use standardized mean differences. JAMA 2007; 298 : 430-437.

Gross A, Schirm S, Scholz M. Ycasd - a tool for capturing and scaling data from graphical representations. BMC Bioinformatics 2014; 15 : 219.

Hoffmann TC, Glasziou PP, Boutron I, Milne R, Perera R, Moher D, Altman DG, Barbour V, Macdonald H, Johnston M, Lamb SE, Dixon-Woods M, McCulloch P, Wyatt JC, Chan AW, Michie S. Better reporting of interventions: template for intervention description and replication (TIDieR) checklist and guide. BMJ 2014; 348 : g1687.

ICH. ICH Harmonised tripartite guideline: Structure and content of clinical study reports E3. ICH; 1995. www.ich.org/fileadmin/Public_Web_Site/ICH_Products/Guidelines/Efficacy/E3/E3_Guideline.pdf.

Ioannidis JPA, Mulrow CD, Goodman SN. Adverse events: The more you search, the more you find. Annals of Internal Medicine 2006; 144 : 298-300.

Ip S, Hadar N, Keefe S, Parkin C, Iovin R, Balk EM, Lau J. A web-based archive of systematic review data. Systematic Reviews 2012; 1 : 15.

Ismail R, Azuara-Blanco A, Ramsay CR. Variation of clinical outcomes used in glaucoma randomised controlled trials: a systematic review. British Journal of Ophthalmology 2014; 98 : 464-468.

Jadad AR, Moore RA, Carroll D, Jenkinson C, Reynolds DJM, Gavaghan DJ, McQuay H. Assessing the quality of reports of randomized clinical trials: Is blinding necessary? Controlled Clinical Trials 1996; 17 : 1-12.

Jelicic Kadic A, Vucic K, Dosenovic S, Sapunar D, Puljak L. Extracting data from figures with software was faster, with higher interrater reliability than manual extraction. Journal of Clinical Epidemiology 2016; 74 : 119-123.

Jones AP, Remmington T, Williamson PR, Ashby D, Smyth RL. High prevalence but low impact of data extraction and reporting errors were found in Cochrane systematic reviews. Journal of Clinical Epidemiology 2005; 58 : 741-742.

Jones CW, Keil LG, Holland WC, Caughey MC, Platts-Mills TF. Comparison of registered and published outcomes in randomized controlled trials: a systematic review. BMC Medicine 2015; 13 : 282.

Jonnalagadda SR, Goyal P, Huffman MD. Automating data extraction in systematic reviews: a systematic review. Systematic Reviews 2015; 4 : 78.

Lewin S, Hendry M, Chandler J, Oxman AD, Michie S, Shepperd S, Reeves BC, Tugwell P, Hannes K, Rehfuess EA, Welch V, McKenzie JE, Burford B, Petkovic J, Anderson LM, Harris J, Noyes J. Assessing the complexity of interventions within systematic reviews: development, content and use of a new tool (iCAT_SR). BMC Medical Research Methodology 2017; 17 : 76.

Li G, Abbade LPF, Nwosu I, Jin Y, Leenus A, Maaz M, Wang M, Bhatt M, Zielinski L, Sanger N, Bantoto B, Luo C, Shams I, Shahid H, Chang Y, Sun G, Mbuagbaw L, Samaan Z, Levine MAH, Adachi JD, Thabane L. A scoping review of comparisons between abstracts and full reports in primary biomedical research. BMC Medical Research Methodology 2017; 17 : 181.

Li TJ, Vedula SS, Hadar N, Parkin C, Lau J, Dickersin K. Innovations in data collection, management, and archiving for systematic reviews. Annals of Internal Medicine 2015; 162 : 287-294.

Liberati A, Altman DG, Tetzlaff J, Mulrow C, Gøtzsche PC, Ioannidis JPA, Clarke M, Devereaux PJ, Kleijnen J, Moher D. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration. PLoS Medicine 2009; 6 : e1000100.

Liu ZM, Saldanha IJ, Margolis D, Dumville JC, Cullum NA. Outcomes in Cochrane systematic reviews related to wound care: an investigation into prespecification. Wound Repair and Regeneration 2017; 25 : 292-308.

Marshall IJ, Kuiper J, Wallace BC. RobotReviewer: evaluation of a system for automatically assessing bias in clinical trials. Journal of the American Medical Informatics Association 2016; 23 : 193-201.

Mayo-Wilson E, Doshi P, Dickersin K. Are manufacturers sharing data as promised? BMJ 2015; 351 : h4169.

Mayo-Wilson E, Li TJ, Fusco N, Bertizzolo L, Canner JK, Cowley T, Doshi P, Ehmsen J, Gresham G, Guo N, Haythomthwaite JA, Heyward J, Hong H, Pham D, Payne JL, Rosman L, Stuart EA, Suarez-Cuervo C, Tolbert E, Twose C, Vedula S, Dickersin K. Cherry-picking by trialists and meta-analysts can drive conclusions about intervention efficacy. Journal of Clinical Epidemiology 2017a; 91 : 95-110.

Mayo-Wilson E, Fusco N, Li TJ, Hong H, Canner JK, Dickersin K, MUDS Investigators. Multiple outcomes and analyses in clinical trials create challenges for interpretation and research synthesis. Journal of Clinical Epidemiology 2017b; 86 : 39-50.

Mayo-Wilson E, Li T, Fusco N, Dickersin K. Practical guidance for using multiple data sources in systematic reviews and meta-analyses (with examples from the MUDS study). Research Synthesis Methods 2018; 9 : 2-12.

Meade MO, Richardson WS. Selecting and appraising studies for a systematic review. Annals of Internal Medicine 1997; 127 : 531-537.

Meinert CL. Clinical trials dictionary: Terminology and usage recommendations. Hoboken (NJ): Wiley; 2012.

Millard LAC, Flach PA, Higgins JPT. Machine learning to assist risk-of-bias assessments in systematic reviews. International Journal of Epidemiology 2016; 45 : 266-277.

Moher D, Schulz KF, Altman DG. The CONSORT Statement: revised recommendations for improving the quality of reports of parallel-group randomised trials. Lancet 2001; 357 : 1191-1194.

Moher D, Hopewell S, Schulz KF, Montori V, Gøtzsche PC, Devereaux PJ, Elbourne D, Egger M, Altman DG. CONSORT 2010 explanation and elaboration: updated guidelines for reporting parallel group randomised trials. BMJ 2010; 340 : c869.

Moore GF, Audrey S, Barker M, Bond L, Bonell C, Hardeman W, Moore L, O'Cathain A, Tinati T, Wight D, Baird J. Process evaluation of complex interventions: Medical Research Council guidance. BMJ 2015; 350 : h1258.

Orwin RG. Evaluating coding decisions. In: Cooper H, Hedges LV, editors. The Handbook of Research Synthesis. New York (NY): Russell Sage Foundation; 1994. p. 139-162.

Page MJ, McKenzie JE, Kirkham J, Dwan K, Kramer S, Green S, Forbes A. Bias due to selective inclusion and reporting of outcomes and analyses in systematic reviews of randomised trials of healthcare interventions. Cochrane Database of Systematic Reviews 2014; 10 : MR000035.

Ross JS, Mulvey GK, Hines EM, Nissen SE, Krumholz HM. Trial publication after registration in ClinicalTrials.Gov: a cross-sectional analysis. PLoS Medicine 2009; 6 .

Safer DJ. Design and reporting modifications in industry-sponsored comparative psychopharmacology trials. Journal of Nervous and Mental Disease 2002; 190 : 583-592.

Saldanha IJ, Dickersin K, Wang X, Li TJ. Outcomes in Cochrane systematic reviews addressing four common eye conditions: an evaluation of completeness and comparability. PloS One 2014; 9 : e109400.

Saldanha IJ, Li T, Yang C, Ugarte-Gil C, Rutherford GW, Dickersin K. Social network analysis identified central outcomes for core outcome sets using systematic reviews of HIV/AIDS. Journal of Clinical Epidemiology 2016; 70 : 164-175.

Saldanha IJ, Lindsley K, Do DV, Chuck RS, Meyerle C, Jones LS, Coleman AL, Jampel HD, Dickersin K, Virgili G. Comparison of clinical trial and systematic review outcomes for the 4 most prevalent eye diseases. JAMA Ophthalmology 2017a; 135 : 933-940.

Saldanha IJ, Li TJ, Yang C, Owczarzak J, Williamson PR, Dickersin K. Clinical trials and systematic reviews addressing similar interventions for the same condition do not consider similar outcomes to be important: a case study in HIV/AIDS. Journal of Clinical Epidemiology 2017b; 84 : 85-94.

Stewart LA, Clarke M, Rovers M, Riley RD, Simmonds M, Stewart G, Tierney JF, PRISMA-IPD Development Group. Preferred reporting items for a systematic review and meta-analysis of individual participant data: the PRISMA-IPD statement. JAMA 2015; 313 : 1657-1665.

Stock WA. Systematic coding for research synthesis. In: Cooper H, Hedges LV, editors. The Handbook of Research Synthesis. New York (NY): Russell Sage Foundation; 1994. p. 125-138.

Tramèr MR, Reynolds DJ, Moore RA, McQuay HJ. Impact of covert duplicate publication on meta-analysis: a case study. BMJ 1997; 315 : 635-640.

Turner EH. How to access and process FDA drug approval packages for use in research. BMJ 2013; 347 .

von Elm E, Poglia G, Walder B, Tramèr MR. Different patterns of duplicate publication: an analysis of articles used in systematic reviews. JAMA 2004; 291 : 974-980.

Wager E. Coping with scientific misconduct. BMJ 2011; 343 : d6586.

Wieland LS, Rutkow L, Vedula SS, Kaufmann CN, Rosman LM, Twose C, Mahendraratnam N, Dickersin K. Who has used internal company documents for biomedical and public health research and where did they find them? PloS One 2014; 9 .

Zanchetti A, Hansson L. Risk of major gastrointestinal bleeding with aspirin (Authors' reply). Lancet 1999; 353 : 149-150.

Zarin DA, Tse T, Williams RJ, Califf RM, Ide NC. The ClinicalTrials.gov results database: update and key issues. New England Journal of Medicine 2011; 364 : 852-860.

Zwarenstein M, Treweek S, Gagnier JJ, Altman DG, Tunis S, Haynes B, Oxman AD, Moher D. Improving the reporting of pragmatic trials: an extension of the CONSORT statement. BMJ 2008; 337 : a2390.

Data collection - What is it and why is it important?

The data collected for your study informs its analysis. Gathering data in a transparent and thorough manner lends rigor to the rest of your research and makes it persuasive to your audience.

We will look at the data collection process, the methods of data collection that exist in quantitative and qualitative research, and the various issues around data in qualitative research.

Broadly defined, data can be any sort of information that people use to better understand the world around them. Having this information allows us to draw and verify conclusions robustly, as opposed to relying on blind guesses or thought exercises.

Necessity of data collection skills

Collecting data is critical to the fundamental objective of research as a vehicle for organizing knowledge. While this may seem intuitive, it's important to acknowledge that researchers must be as skilled in data collection as they are in data analysis.

Collecting the right data

Rather than just collecting as much data as possible, it's important to collect data that is relevant for answering your research question. Imagine a simple research question: what factors do people consider when buying a car? It would not be possible to ask every living person about their car purchases. Even if it were possible, not everyone drives a car, so asking non-drivers would be unproductive. As a result, a researcher conducting a study to devise data reports and marketing strategies has to take a sample of the relevant data to ensure reliable analysis and findings.

Data collection examples

In the broadest terms, any sort of data gathering contributes to the research process. In any work of science, researchers cannot draw empirical conclusions without relying on some body of data to make rational judgments.

Various examples of data collection in the social sciences include:

  • responses to a survey about product satisfaction
  • interviews with students about their career goals
  • reactions to an experimental vitamin supplement regimen
  • observations of workplace interactions and practices
  • focus group data about customer behavior

Data science and scholarly research have almost limitless possibilities to collect data, and the primary requirement is that the dataset should be relevant to the research question and clearly defined. Researchers thus need to rule out any irrelevant data so that they can develop new theory or key findings.

Types of data

Researchers can collect data themselves (primary data) or use third-party data (secondary data). Deciding which type of data to work with depends directly on your research question and objectives.

Primary data

Original research relies on first-party data, or primary data that researchers collect themselves for their own analysis. When you collect the information in a primary study yourself, you are more likely to obtain data of the quality you require.

Because the researcher is most aware of the inquiry they want to conduct and has tailored the research process to it, first-party data collection offers the greatest potential for congruence between the data collected and the research question, and thus for generating relevant insights.

Ethnographic research, for example, relies on first-party data collection, since a description of a culture or a group of people is contextualized through the researcher's comprehensive understanding of, and relative positioning to, that culture.

Secondary data

Researchers can also use publicly available secondary data that other researchers have generated, analyzing it with a different approach to produce new insights. Online databases and literature reviews are good examples of places where researchers can find existing data for a previously unexplored inquiry. However, it is important to consider data accuracy and relevance when using third-party data, given that the researcher can exercise only limited quality control over data that has already been collected.

A relatively new consideration in data collection and data analysis has been the advent of big data, where data scientists employ automated processes to collect data in large amounts.

The advantage of collecting data at scale is that a thorough analysis of a greater scope of data can potentially generate more generalizable findings. Nonetheless, this is a daunting task: it is time-consuming and arduous, and it requires skilled data scientists to sift through large datasets, filter out irrelevant data, and generate useful insights. It is also important for qualitative researchers to weigh their needs for data breadth against data depth. Qualitative studies typically rely on a relatively small number of participants, but very detailed data is collected for each participant, because understanding the specific context and individual interpretations or experiences is often of central importance. With big data, this depth is usually traded for a much greater breadth of data covering many more participants. Researchers need to consider their need for depth or breadth to decide which data collection method is best suited to answering their research question.

Different procedures for gathering data exist depending on the research inquiry you want to conduct. Let's explore the common data collection methods in quantitative and qualitative research.

Quantitative data collection methods

Quantitative methods are used to collect numerical or quantifiable data. These can then be processed statistically to test hypotheses and gain insights. Quantitative data gathering is typically aimed at measuring a particular phenomenon (e.g., the amount of awareness a brand has in the market, the efficacy of a particular diet, etc.) in order to test hypotheses (e.g., social media marketing campaigns increase brand awareness, eating more fruits and vegetables leads to better physical performance, etc.).

Some methods used in qualitative research can also contribute to quantitative data collection and analysis. Online surveys and questionnaires with multiple-choice questions produce structured data that is ready to be analyzed. A survey platform like Qualtrics, for example, aggregates survey responses in a spreadsheet to allow for numerical or frequency analysis, as in the sketch below.
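
As a minimal illustration (assuming responses have been exported to a table; the column name and values are hypothetical), a frequency analysis of one multiple-choice item might look like this:

```python
# Minimal sketch: frequency analysis of exported multiple-choice survey
# responses with pandas. The column name and values are hypothetical.
import pandas as pd

responses = pd.DataFrame(
    {"brand_awareness": ["high", "low", "high", "medium", "high", "low"]}
)  # in practice, e.g.: pd.read_csv("survey_export.csv")

counts = responses["brand_awareness"].value_counts()
print(counts)                              # absolute frequencies
print((counts / len(responses)).round(2))  # relative frequencies
```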

Qualitative data collection methods

Analyzing qualitative data is important for describing a phenomenon (e.g., the requirements for good teaching practices), which may lead to the creation of propositions or the development of a theory. Behavioral data, transactional data, and data from social media monitoring are examples of different forms of data that can be collected qualitatively.

Consideration of tools or equipment for collecting data is also important. Primary data collection methods in observational research, for example, employ tools such as audio and video recorders, notebooks for writing field notes, and cameras for taking photographs. As long as the products of such tools can be analyzed, those products can be incorporated into a study's data collection.

Employing multiple data collection methods

Moreover, qualitative researchers seldom rely on one data collection method alone. Ethnographic researchers, in particular, can incorporate direct observation, interviews, focus group sessions, and document collection in their data collection process to produce the most contextualized data for their research. Mixed methods research employs multiple data collection methods, including qualitative and quantitative data, along with multiple tools to study a phenomenon from as many different angles as possible.

New forms of data collection

External data sources such as social media data and big data have also gained contemporary focus as social trends change and new research questions emerge. This has prompted the creation of novel data collection methods in research.

Ultimately, there are countless data collection instruments used in qualitative methods, but the key objective is to produce relevant data that can be systematically analyzed. As a result, researchers can analyze audio, video, images, and other formats beyond text. As the world continues to change, for example with the growing prominence of generative artificial intelligence and social media, researchers will undoubtedly bring forth new inquiries that require continued innovation and adaptation in data collection methods.

Collecting data for qualitative research is a complex process that often comes with unique challenges. This section discusses some of the common obstacles that researchers may encounter during data collection and offers strategies to navigate these issues.

Access to participants

Obtaining access to research participants can be a significant challenge. This might be due to geographical distance, time constraints, or reluctance from potential participants. To address this, researchers need to clearly communicate the purpose of their study, ensure confidentiality, and be flexible with their scheduling.

Cultural and language barriers

Researchers may face cultural and language barriers, particularly in cross-cultural research. These barriers can affect communication and understanding between the researcher and the participant. Employing translators, cultural mediators, or learning the local language can be beneficial in overcoming these barriers.

Non-responsive or uncooperative participants

At times, researchers might encounter participants who are unwilling or unable to provide the required information. In these situations, rapport-building is crucial. The researcher should aim to build trust, create a comfortable environment for the participant, and reassure them about the confidentiality of their responses.

Time constraints

Qualitative research can be time-consuming, particularly when it involves interviews or focus groups that require coordination of multiple schedules, transcription, and in-depth analysis. Adequate planning and organization can help mitigate this challenge.

Bias in data collection

Bias in data collection can occur when the researcher's preconceptions or the participant's desire to present themselves favorably affect the data. Strategies for mitigating bias include reflexivity, triangulation, and member checking.

Handling sensitive topics

Research involving sensitive topics can be challenging for both the researcher and the participant. Ensuring a safe and supportive environment, practicing empathetic listening, and providing resources for emotional support can help navigate these sensitive issues.

Collecting data in qualitative research can be a very rewarding but challenging experience. However, with careful planning, ethical conduct, and a flexible approach, researchers can effectively navigate these obstacles and collect robust, meaningful data.

Considerations when collecting data

Research relies on empiricism and credibility at all stages of a research inquiry. As a result, there are various data collection problems and issues that researchers need to keep in mind.

Data quality issues

Your analysis may depend on capturing the fine-grained details that some data collection tools may miss. In that case, you should carefully consider data quality issues regarding the precision of your data collection. For example, think about a picture taken with a smartphone camera and a picture taken with a professional camera. If you need high-resolution photos, it would make sense to rely on a professional camera that can provide adequate data quality.

Quantitative data collection often relies on precise data collection tools to evaluate outcomes, but researchers collecting qualitative data should also be concerned with quality assurance. For example, suppose a study involving direct observation requires multiple observers in different contexts. In that case, researchers should take care to ensure that all observers can gather data in a similar fashion to ensure that all data can be analyzed in the same way.

Data quality is a crucial consideration when gathering information. Even if the researcher has chosen an appropriate method for data collection, is the data that they collect useful and detailed enough to provide the necessary analysis to answer the given research inquiry?

One example where data quality is consequential in qualitative data collection includes interviews and focus groups. Recordings may lose some of the finer details of social interaction, such as pauses, thinking words, or utterances that aren't loud enough for the microphone to pick up.

Suppose you are conducting an interview for a study where such details are relevant to your analysis. In that case, you should consider employing tools that collect sufficiently rich data to record these aspects of interaction.

Data integrity

The possibility of inaccurate data has the potential to confound the data analysis process, as drawing conclusions or making decisions becomes difficult, if not impossible, with low-quality data. Failure to establish the integrity of data collection can cast doubt on the findings of a given study. Accurate data collection is just one aspect researchers should consider to protect data integrity. After that, it is a matter of preserving the data after data collection. How is the data stored? Who has access to the collected data? To what extent can the data be changed between data collection and research dissemination?

Data integrity is an issue of research ethics as well as research credibility. The researcher needs to establish that the data presented for research dissemination is an accurate representation of the phenomenon under study.

Imagine a photograph of wildlife that has aged so much that its colors have become distorted over time. If the findings depend on describing the colors of a particular animal or plant, then failing to preserve the integrity of the data presents a serious threat to the credibility of the research and the researcher. Similarly, when transcribing an interview or focus group, it is important to take care that participants’ words are accurately transcribed to avoid unintentionally changing the data.

Transparency

As explored earlier, researchers rely on both intuition and data to make interpretations about the world. As a result, researchers have an obligation to explain how they collected data and describe their data so that audiences can also understand it. Establishing research transparency also allows other researchers to examine a study and determine if they find it credible and how they can continue to build off it.

To address this need, research papers typically have a methodology section, which includes descriptions of the tools employed for data collection and the breadth and depth of the data collected for the study. It is important to transparently convey every aspect of the data collection and analysis, which might involve providing a sample of the questions participants were asked, demographic information about participants, or proof of compliance with ethical standards, to name a few examples.

Subjectivity

How to gather data is also a key concern, especially in social sciences where people's perspectives represent the collected data, and these perspectives can vastly differ.

In interviews and focus groups, how questions are framed may change the nature of the answers that participants provide. In market research, researchers have to carefully design questions to not inadvertently lead customers to provide a certain response or to facilitate useful feedback. Even in the natural sciences, researchers have to regularly check whether the data collection equipment they use for gathering data is producing accurate data sets for analysis.

Finally, the different methods of data collection raise questions about whether the data says what we think it says. Consider how people might establish monitoring systems to track behavioral data online. When a user spends a certain amount of time on a mobile app, are they deeply engaged in using the app, or are they leaving it on while they work on other tasks?

Data collection is only as useful as the extent to which the resulting data can be systematically analyzed and is relevant to the research inquiry being pursued. While it is tempting to collect as much data as possible, it is the researcher’s analyses and inferences, not just the quantity of data, that ultimately determine the impact of the research.

Validity and reliability in qualitative data

Ensuring validity and reliability in qualitative data collection is paramount to producing meaningful, rigorous, and trustworthy research findings. This section will outline the core principles of validity and reliability, which stem from quantitative research, and then we will consider relevant quality criteria for qualitative research.

Understanding validity

In general terms, validity is about ensuring that the research accurately reflects the phenomena it purports to represent. It is tied to how well the methods and techniques used in a study align with the intended research question and how accurately the findings represent the participants' experiences or perceptions. Qualitative research, however, often recognizes the co-existence of multiple realities, rather than assuming there is only one “true” reality out there that can be measured. Thus, qualitative researchers can instead convey credibility by transparently communicating their research question, their operationalization of key concepts, and how these translated into their data collection instruments and analysis. Moreover, qualitative researchers should pay attention to whether their own preconceptions or goals might be inadvertently shaping their findings. Potential reactivity effects should also be considered, to assess how the researcher may have influenced the participants or the research setting while collecting data.

Understanding reliability

Reliability broadly refers to the consistency of the research approach across different contexts and with different researchers. A quantitative study is considered reliable if its findings can be replicated in a similar context or if the same results can be obtained by a different researcher following the same research procedure.

In qualitative research, however, researchers acknowledge and embrace the specific context of their data and analysis. All knowledge that is generated is context-specific, so rather than claiming that a study’s findings can be reliably reproduced in a wholly different context, qualitative researchers aim to demonstrate the trustworthiness or dependability of their data and findings. Transparent descriptions and clear communication can convey to audiences that the research was conducted with rigor and with coherence between the research question, methods, and findings, all of which can bolster the credibility of the qualitative study.

Enhancing data quality

Various strategies can be used to enhance data quality in qualitative research. Among them are:

  1. Triangulation: using multiple data sources, methods, or researchers to gather data about the same phenomenon. This can help to ensure the findings are robust and not dependent on a single source.
  2. Member checking: returning the findings to the participants to check whether the interpretations accurately reflect their experiences or perceptions. This can help to ensure the validity of the research findings.
  3. Thick description: providing detailed accounts of the context, interactions, and interpretations in the research report, allowing others to better understand the research process and fostering the communicability of one’s research.
  4. Audit trail: keeping a detailed record of the research process, decisions, and reflections, which can increase the transparency and coherence of the study.

A wide variety of technologies can be used to work with qualitative data. Technology not only aids in data collection but also in the organization, analysis, and presentation of data.

This section explores some of the key ways that technology can be integrated into qualitative data collection.

Digital tools for data collection

Digital tools can vastly improve the efficiency and effectiveness of data collection. For example, audio and video recording devices can capture interviews, focus groups, and observational data in great detail.

Online surveys and questionnaires can reach a wider audience, often at a lower cost and with quicker turnaround times compared with traditional methods. Mobile applications can also be used to capture real-time experiences, emotions, and activities through diary studies or experience sampling.

Online platforms for qualitative research

Online platforms like social media, blogs, and discussion forums provide a rich source of qualitative data. Researchers can analyze these platforms for insights into people's behaviors, attitudes, and experiences.

In addition, virtual communities and digital ethnography are becoming increasingly common as researchers explore these online spaces.

Ethical considerations with technology

With the increased use of technology, researchers must be mindful of ethical considerations, including privacy and consent. It's important to secure informed consent when collecting data from online platforms or using digital tools, and all researchers should obtain the necessary approvals for collecting data and adhere to any applicable regulations (such as the GDPR). It's also crucial to ensure data security and confidentiality when storing data on digital platforms.

Advantages and limitations of technology

While technology offers numerous advantages in terms of efficiency, accessibility, and breadth of data, it also presents limitations. For example, digital tools may not capture the full nuance and richness of face-to-face interactions.

Furthermore, technological glitches and data loss are potential risks. Therefore, it's important for researchers to understand these trade-offs when incorporating technology into their data collection process.

As technology continues to evolve, so too will its applications in qualitative research. Embracing these technological advancements can help researchers to enhance their data collection practices, offering new opportunities for capturing, analyzing, and presenting qualitative data.

Data analysis after collecting data is only possible if the data is sufficiently organized into a form that can be easily sorted and understood. Imagine collecting social media data, which could be millions of posts from millions of social media users every day. You can dump every single post into a file, but how can you make sense of it?

Data organization is especially important when dealing with unstructured data. The researcher needs to structure the data in some way that facilitates the analytical process.

Transcription

Collecting data in focus groups, interviews, or other similar interactions produces raw video and audio recordings. This data can often be analyzed for contextual cues such as non-verbal interaction, facial expressions, and accents. However, most traditional analyses of interview and focus group data benefit from converting participants’ words into text.

Recordings are typically transcribed so that the text can be systematically analyzed and incorporated into research papers or presentations. Transcription can be a tedious task, especially if a researcher has to deal with hours of audio data. These days, researchers can often choose between manually transcribing their raw data or using automated transcription services to greatly speed up this process.

Survey data

In online survey platforms, participant responses to closed-ended questions can be easily aggregated in a spreadsheet. Responses to any open-ended questions can also be included in a spreadsheet or saved as separate files for subsequent analysis of the text participants wrote. Since survey data is relatively structured, it tends to be quicker and easier to organize than other forms of qualitative data that are more unstructured, such as interviews or observations.
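
To make this concrete, here is a minimal Python sketch (using pandas) of splitting a survey export into its structured and unstructured parts. The file name and column names are hypothetical stand-ins for a real platform's download.

```python
# Minimal sketch: organizing a survey export for analysis.
# "survey_export.csv", "q1_rating", and "q5_comments" are hypothetical.
import pandas as pd

responses = pd.read_csv("survey_export.csv")

# Closed-ended items aggregate directly into counts and percentages.
rating_counts = responses["q1_rating"].value_counts().sort_index()
print(rating_counts / len(responses) * 100)  # percentage per rating level

# Open-ended items are split off for qualitative analysis of the text.
comments = responses["q5_comments"].dropna()
comments.to_csv("open_ended_for_coding.csv", index=False)
```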

Field notes and artifacts

In ethnographic research or research involving direct observation, gathering data often means writing notes or taking photographs during field work. While field notes can be typed into a document for data analysis, the researcher can also scan their notes into an image or a PDF for later organization.

This degree of flexibility allows researchers to code all forms of data that aren't textual in nature but can still provide useful data points for analysis and theoretical development.

Coding is among the most fundamental skills in qualitative research, because coding is how researchers can effectively reduce large datasets into a series of compact codes for later analysis. If you are dealing with dozens or hundreds of pages of qualitative data, then applying codes to your data is a key method for condensing, synthesizing, and understanding the data.
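
As a rough illustration of what coded data can look like once digitized, here is a minimal Python sketch; the codes and excerpts are invented for the example.

```python
# Minimal sketch of coded qualitative data: each segment of text
# carries one or more codes, condensing pages of data into
# countable, retrievable categories. All content here is invented.
from collections import Counter

coded_segments = [
    {"excerpt": "I never know who to ask for help", "codes": ["isolation"]},
    {"excerpt": "My manager checks in every week", "codes": ["support", "communication"]},
    {"excerpt": "Deadlines change without warning", "codes": ["communication"]},
]

# Code frequencies give a first overview of the dataset.
code_counts = Counter(c for seg in coded_segments for c in seg["codes"])
print(code_counts.most_common())

# Retrieving every excerpt for one code supports later interpretation.
communication_quotes = [s["excerpt"] for s in coded_segments
                        if "communication" in s["codes"]]
print(communication_quotes)
```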

Data Collection Methods: A Comprehensive View

  • Written by John Terra
  • Updated on February 21, 2024

Companies that want to be competitive in today’s digital economy enjoy the benefit of countless reams of data available for market research. In fact, thanks to the advent of big data, there’s a veritable tidal wave of information ready to be put to good use, helping businesses make intelligent decisions and thrive.

But before that data can be used, it must be processed; and before it can be processed, it must be collected, and that’s what we’re here for. This article explores the subject of data collection. We will learn about the types of data collection methods and why they are essential.

We will detail primary and secondary data collection methods and discuss data collection procedures. We’ll also share how you can learn practical skills through online data science training.

But first, let’s get the definition out of the way. What is data collection?

What is Data Collection?

Data collection is the act of collecting, measuring and analyzing different kinds of information using a set of validated standard procedures and techniques. The primary objective of data collection procedures is to gather reliable, information-rich data and analyze it to make critical business decisions. Once the desired data is collected, it undergoes a process of data cleaning and processing to make the information actionable and valuable for businesses.

Your choice of data collection method (alternatively called a data gathering procedure) depends on the research questions you’re working on, the type of data required, and the available time and resources. You can categorize data-gathering procedures into two main methods:

  • Primary data collection. Primary data is collected via first-hand experiences and does not draw on pre-existing sources. The data obtained by primary data collection methods is exceptionally accurate and geared to the research’s purpose. These methods are divided into two categories: quantitative and qualitative. We’ll explore the specifics later.
  • Secondary data collection. Secondary data is the information that’s been used in the past. The researcher can obtain data from internal and external sources, including organizational data.

Let’s take a closer look at specific examples of both data collection methods.

Also Read: Why Use Python for Data Science?

The Specific Types of Data Collection Methods

As mentioned, primary data collection methods are split into quantitative and qualitative. We will examine each method’s data collection tools separately. Then, we will discuss secondary data collection methods.

Quantitative Methods

Quantitative techniques for demand forecasting and market research typically use statistical tools. When using these techniques, historical data is used to forecast demand. These primary data-gathering procedures are most often used to make long-term forecasts. Statistical analysis methods are highly reliable because they carry minimal subjectivity.

  • Barometric Method. Also called the leading indicators approach, data analysts and researchers employ this method to speculate on future trends based on current developments. Indicators whose current movements foreshadow future events are considered leading indicators.
  • Smoothing Techniques. Smoothing techniques can be used in cases where the time series lacks significant trends. These techniques eliminate random variation from historical demand and help identify demand levels and patterns to estimate future demand. The most popular methods used in these techniques are the simple moving average and the weighted moving average (a minimal sketch of both appears after this list).
  • Time Series Analysis. The term “time series” refers to the sequential order of values in a variable, also known as a trend, at equal time intervals. Using patterns, organizations can predict customer demand for their products and services during the projected time.
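
Here is a minimal Python sketch of the two smoothing techniques named above; the demand series and the weights are invented for illustration.

```python
# Minimal sketch of smoothing-based demand forecasting.
# The monthly demand figures and weights below are hypothetical.
demand = [102, 98, 105, 110, 107, 111]  # units sold per month

def simple_moving_average(series, k=3):
    # Equal weight on the last k observations.
    return sum(series[-k:]) / k

def weighted_moving_average(series, weights=(0.2, 0.3, 0.5)):
    # Heavier weight on more recent observations; weights sum to 1.
    recent = series[-len(weights):]
    return sum(w * x for w, x in zip(weights, recent))

print(simple_moving_average(demand))    # ~109.33 as next month's forecast
print(weighted_moving_average(demand))  # 109.6, a recency-weighted forecast
```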

Qualitative Methods

Qualitative data collection methods are instrumental when no historical information is available, or when numbers and mathematical calculations aren’t required. Qualitative research is closely linked to words, emotions, sounds, feelings, colors, and other non-quantifiable elements. These techniques rely on experience, conjecture, intuition, judgment, emotion, etc. Quantitative methods do not capture the motives behind participants’ responses, while qualitative methods often fail to reach underrepresented populations and usually involve long data collection periods. Therefore, you get the best results by using quantitative and qualitative methods together.

  • Questionnaires. Questionnaires are a printed set of either open-ended or closed-ended questions. Respondents must answer based on their experience and knowledge of the issue. A questionnaire is often part of a survey, but a questionnaire’s end goal doesn’t necessarily have to be a survey.
  • Surveys. Surveys collect data from target audiences, gathering insights into their opinions, preferences, choices, and feedback on the organization’s goods and services. Most survey software has a wide range of question types, or you can also use a ready-made survey template that saves time and effort. Surveys can be distributed via different channels such as e-mail, offline apps, websites, social media, QR codes, etc.

Once researchers collect the data, survey software generates reports and runs analytics algorithms to uncover hidden insights. Survey dashboards give you statistics relating to completion rates, response rates, filters based on demographics, export and sharing options, etc. Practical business intelligence depends on the synergy between analytics and reporting. Analytics uncovers valuable insights while reporting communicates these findings to the stakeholders.

  • Polls. Polls consist of one or more multiple-choice questions. Marketers can turn to polls when they want to take a quick snapshot of the audience’s sentiments. Since polls tend to be short, getting people to respond is more manageable. Like surveys, online polls can be embedded into various media and platforms. Once the respondents answer the question(s), they can be shown how they stand concerning other people’s responses.
  • Delphi Technique. The name is a callback to the Oracle of Delphi, a priestess at Apollo’s temple in ancient Greece, renowned for her prophecies. In this method, marketing experts are given the forecast estimates and assumptions made by other industry experts. The first batch of experts may then use the information provided by the other experts to revise and reconsider their estimates and assumptions. The total expert consensus on the demand forecasts creates the final demand forecast.
  • Interviews. In this method, interviewers talk to the respondents either face-to-face or by telephone. In the first case, the interviewer asks the interviewee a series of questions in person and notes the responses. The interviewer can opt for a telephone interview if the parties cannot meet in person. This data collection form is practical for use with only a few respondents; repeating the same process with a considerably larger group takes longer.
  • Focus Groups. Focus groups are one of the primary examples of qualitative data collection. In focus groups, small groups of people, usually around 8–10 members, discuss common aspects of the research problem. Each person provides their insights on the issue, and a moderator regulates the discussion. When the discussion ends, the group reaches a consensus.

Also Read: A Beginner’s Guide to the Data Science Process

Secondary Data Collection Methods

Secondary data is information that has been collected in the past. Secondary data collection methods can include quantitative and qualitative techniques. In addition, secondary data is easily available, so it’s less time-consuming and expensive than collecting primary data. However, the authenticity of data gathered with secondary data collection tools cannot always be verified.

Internal secondary data sources:

  • CRM software
  • Executive summaries
  • Financial statements
  • Mission and vision statements
  • Organization’s health and safety records
  • Sales reports

External secondary data sources:

  • Business journals
  • Government reports
  • Press releases

The Importance of Data Collection Methods

Data collection methods play a critical part in the research process, as they determine the quality and accuracy of the collected data. Here’s a sample of some reasons why data collection procedures are so important:

  • They determine the quality and accuracy of collected data
  • They ensure the data and the research findings are valid, relevant and reliable
  • They help reduce bias and increase the sample’s representation
  • They are crucial for making informed decisions and arriving at accurate conclusions
  • They provide accurate data, which facilitates the achievement of research objectives

Also Read: What Is Data Processing? Definition, Examples, Trends

So, What’s the Difference Between Data Collecting and Data Processing?

Data collection is the first step in data processing. Data collection involves gathering information (raw data) from various sources such as interviews, surveys, questionnaires, etc. Data processing describes the steps taken to organize, manipulate and transform the collected data into a useful and meaningful resource. This process may include tasks such as cleaning and validating data, analyzing and summarizing data, and creating visualizations or reports.

So, data collection is just one step in the overall data processing chain of events.

Do You Want to Become a Data Scientist?

If this discussion about data collection and the professionals who conduct it has sparked your enthusiasm for a new career, why not check out this online data science program?

The Glassdoor.com jobs website shows that data scientists in the United States typically make an average yearly salary of $129,127 plus additional bonuses and cash incentives. So, if you’re interested in a new career or are already in the field but want to upskill or refresh your current skill set, sign up for this bootcamp and prepare to tackle the challenges of today’s big data.


Scientific Research and Methodology: An Introduction to Quantitative Research and Statistics

10 Collecting data

So far, you have learnt to ask an RQ and design the study. In this chapter, you will learn how to:

  • record the important steps in data collection.
  • describe study protocols.
  • ask survey questions.

10.1 Protocols

If the RQ is well-constructed, terms are clearly defined, and the study is well designed and explained, then the process for collecting the data should be easy to describe. Data collection is often time-consuming, tedious and expensive, so collecting the data correctly the first time is important.

Before collecting the data, a plan should be established and documented that explains exactly how the data will be obtained, which will include operational definitions (Sect. 2.10). This plan is called a protocol.

Definition 10.1 (Protocol) A protocol is a procedure documenting the details of the design and implementation of studies, and for data collection.

Unforeseen complications are not unusual, so often a pilot study (or practice run) is conducted before the real data collection, to:

  • determine the feasibility of the data collection protocol.
  • identify unforeseen challenges.
  • obtain data to determine appropriate sample sizes (Sect. 30).
  • potentially save time and money.

The pilot study may suggest changes to the protocol.

Definition 10.2 (Pilot study) A pilot study is a small test run of the study protocol used to check that the protocol is appropriate and practical, and to identify (and hence fix) possible problems with the research design or protocol.

A pilot study allows the researcher to test and refine the protocol before committing to full-scale data collection.

The data can be collected once the protocol has been finalised. Protocols ensure studies are repeatable (Sect. 4.3) so others can confirm or compare results, and others can understand exactly what was done, and how. Protocols should indicate how design aspects (such as blinding the individuals, random allocation of treatments, etc.) will happen. The final protocol, without pedantic detail, should be reported. Diagrams can be useful to support explanations. All studies should have a well-established protocol for describing how the study was done.

A protocol usually has at least three components that describe:

  • how individuals are chosen from the population (i.e., external validity).
  • how information is collected from the individuals (i.e., internal validity).
  • the analyses and software (including version) used.

Example 10.1 (Protocol) Romanchik-Cerpovicz, Jeffords, and Onyenwoke (2018) made cookies using pureed green peas in place of margarine (to increase the nutritional value of the cookies). They assessed the acceptability of these cookies among college students.

The protocol discussed how the individuals were chosen (p. 4):

...through advertisement across campus from students attending a university in the southeastern United States.

This voluntary sample comprised 80.6% women, a higher percentage of women than in the general population, or the college population. (Other extraneous variables were also recorded.)

Exclusion criteria were also applied, excluding people "with an allergy or sensitivity to an ingredient used in the preparation of the cookies" (p. 5). The researchers also described how the data was obtained (p. 5):

During the testing session, panelists were seated at individual tables. Each cookie was presented one at a time on a disposable white plate. Samples were previously coded and randomized. The presentation order for all samples was 25%, 0%, 50%, 100% and 75% substitution of fat with puree of canned green peas. To maintain standard procedures for sensory analysis [...], panelists cleansed their palates between cookie samples with distilled water (25 °C) [...] characteristics of color, smell, moistness, flavor, aftertaste, and overall acceptability, for each sample of cookies [was recorded]...

Thus, internal validity was managed using random allocation, blinding individuals, and washouts. Details are also given of how the cookies were prepared, and how objective measurements (such as moisture content) were determined.

The analyses and software used were also given.

Consider this partial protocol, notable for the honesty of its description:

Fresh cow dung was obtained from free-ranging, grass fed, and antibiotic-free Milking Shorthorn cows (Bos taurus) in the Tilden Regional Park in Berkeley, CA. Resting cows were approached with caution and startled by loud shouting, whereupon the cows rapidly stood up, defecated, and moved away from the source of the annoyance. Dung was collected in ZipLoc bags (1 gallon), snap-frozen and stored at -80 °C. --- Hare et al. (2008), p. 10

10.2 Collecting data using questionnaires

10.2.1 Writing questions

Collecting data using questionnaires is common for both observational and experimental studies. Questionnaires are very difficult to do well: question wording is crucial, and surprisingly difficult to get right (Fink 1995). Pilot testing questionnaires is essential!

Definition 10.3 (Questionnaire) A questionnaire is a set of questions for respondents to answer.

A questionnaire is a set of questions used to obtain information from individuals. A survey is an entire methodology that includes gathering data using a questionnaire, selecting a sample, and other components.

Questions in a questionnaire may be open-ended (respondents can write their own answers) or closed (respondents select from a small number of possible answers, as in multiple-choice questions). Open and closed questions both have advantages and disadvantages. Answers to open questions more easily lend themselves to qualitative analysis. This section briefly discusses writing questions.

Example 10.2 (Open and closed questions) Raab and Bogner (2021) asked German students a series of questions about microplastics, including:

  • Name sources of microplastics in the household.
  • In which ecosystems are microplastics in Germany? Tick the answer (multiple ticks are possible). Options: (a) sea; (b) rivers; (c) lakes; (d) groundwater.
  • Assess the potential danger posed by microplastics. Options: (a) very dangerous; (b) dangerous; (c) hardly dangerous; (d) not dangerous.

The first question is open: respondents could provide their own answers. The second question is closed, where multiple options can be selected. The third question is closed, where only one option can be selected.

Important advice for writing questionnaire questions includes:

  • Avoid leading questions, which may lead respondents to answer a certain way. Imprecise question wording is the usual reason for leading questions.
  • Avoid ambiguity: avoid unfamiliar terms and unclear questions.
  • Avoid asking the uninformed: avoid asking respondents about issues they don't know about. Many people will give a response even if they do not understand (such responses are worthless). For example, people may give directions to places that do not even exist (Collett and O’Shea 1976).
  • Avoid complex and double-barrelled questions, which are hard to understand.
  • Avoid problems with ethics: avoid questions about people breaking laws, or revealing confidential or private information. In special cases and with justification, ethics committees may allow such questions.
  • Ensure clarity in question wording.
  • Ensure options are mutually exclusive, so answers fit into only one category.
  • Ensure options are exhaustive, so that the categories cover all options.

Example 10.3 (Poor question wording) Consider a questionnaire asking these questions:

  • Because bottles from bottled water create enormous amounts of non-biodegradable landfill and hence threaten native wildlife, do you support banning bottled water?
  • Do you drink more water now?
  • Are you more concerned about Coagulase-negative Staphylococcus or Neisseria pharyngis in bottled water?
  • Do you drink water in plastic and glass bottles?
  • Do you have a water tank installed illegally, without permission?
  • Do you avoid purchasing water in plastic bottles unless it is carbonated, unless the bottles are plastic but not necessarily if the lid is recyclable?

Question 1 is leading because the expected response is obvious.

Question 2 is ambiguous: it is unclear what 'more water now' is being compared to.

Question 3 is unlikely to give sensible answers, as most people will be uninformed. Many people will still give an opinion, but the data will be effectively useless (though the researcher may not realise).

Question 4 is double-barrelled, and would be better asked as two separate questions (one asking about plastic bottles, and one about glass bottles).

Question 5 is unlikely to be given ethical approval or to obtain truthful answers, as respondents are unlikely to admit to breaking rules.

Question 6 is unclear, since it is hard to know what a yes or no answer would mean.

Example 10.4 (Question wording) Question wording can be important. In the 2014 General Social Survey (https://gss.norc.org), when white Americans were asked for their opinion of the amount America spends on welfare, 58% of respondents answered 'Too much' (Jardina 2018).

However, when white Americans were asked for their opinion of the amount America spends on assistance to the poor, only 16% of respondents answered 'Too much'.

Example 10.5 (Leading question) Consider this question:

Do you like this new orthotic?

This question is leading, since liking is the only option presented. Better would be:

Do you like or dislike this new orthotic?

Example 10.6 (Mutually exclusive options) In a study to determine the time doctors spent on patients (from Chan et al. (2008)), doctors were given the options:

  • 0–5 mins;
  • 5–10 mins; or
  • more than 10 mins.

This is a poor question, because a respondent does not know which option to select for an answer of '5 minutes'. The options are not mutually exclusive.
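
The flaw is easy to demonstrate mechanically. Here is a minimal Python sketch (not from the textbook) that checks whether a set of numeric answer options is mutually exclusive; the bins mirror Example 10.6.

```python
# Minimal sketch: testing answer options for mutual exclusivity.
# The bins reproduce the flawed options from Example 10.6, where
# both endpoints of each range are (implicitly) included.
bins = [(0, 5), (5, 10), (10, float("inf"))]  # (low, high), inclusive

def matching_options(answer, bins):
    return [b for b in bins if b[0] <= answer <= b[1]]

print(matching_options(5, bins))    # two matches: the options overlap at 5
print(matching_options(3.5, bins))  # exactly one match, as intended
```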

10.2.2 Challenges using questionnaires

Using questionnaires presents myriad challenges.

  • Non-response bias (Sect. 5.11): Non-response bias is common with questionnaires, as they are often used with voluntary-response samples. The people who do not respond to the survey may be different from those who do respond.
  • Response bias (Sect. 5.11): People do not always answer truthfully; for example, what people say may not correspond with what people do (Example 9.6). Sometimes this is unintentional (e.g., due to poor question wording); sometimes it is due to embarrassment, or because questions are controversial. Sometimes, respondents repeatedly provide the same answer to a series of multichoice questions.
  • Recall bias: People may not be able to accurately recall past events, or when they happened.
  • Question order: The order of the questions can influence the responses.
  • Interpretation: Phrases and words such as 'Sometimes' and 'Somewhat disagree' may mean different things to different people.

Many of these can be managed with careful questionnaire design, but discussing these methods is beyond the scope of this book.

10.3 Chapter summary

Having a detailed procedure for collecting the data (the protocol) is important. Using a pilot study to trial the protocol can often reveal unexpected changes necessary for a good protocol. Creating good questionnaire questions is difficult, but important.

10.4 Quick review questions

What is the biggest problem with this question: 'Do you have bromodosis?'

What is the biggest problem with this question: 'Do you spend too much time connected to the internet?'

What is the biggest problem with this question: 'Do you eat fruits and vegetables?'

Which of these are reasons for producing a well-defined protocol?

  • It allows the researchers to make the study externally valid.
  • It ensures that others know exactly what was done.
  • It ensures that the study is repeatable for others.

Which of the following questionnaire questions are likely to be leading questions?

  • Do you, or do you not, believe that permeable pavements are a viable alternative to traditional pavements?
  • Do you support a ban on bottled water?
  • Do you believe that double-gloving by paramedics reduces the risk of infection, increases the risk of infection, or makes no difference to the risk of infection?
  • Should Ireland ban breakfast cereals with unhealthy sugar levels?

10.5 Exercises

Answers to odd-numbered exercises are available in App. E.

Exercise 10.1 What is the problem with this question?

What is your age? (Select one option) Under 18 / Over 18

Exercise 10.2 What is the problem with this question?

How many children do you have? (Select one option) None / 1 or 2 / 2 or 3 / More than 4

Exercise 10.3 Which of these questionnaire questions is better? Why?

  • Should concerned cat owners vaccinate their pets?
  • Should domestic cats be required to be vaccinated or not?
  • Do you agree that pet-owners should have their cats vaccinated?

Exercise 10.4 Which of these questionnaire questions is better? Why?

  • Do you own an environmentally-friendly electric vehicle?
  • Do you own an electric vehicle?
  • Do you own or do you not own an electric vehicle?

Exercise 10.5 Falk and Anderson (2013) studied sunscreen use, and asked participants questions, including these:

  • How often do you sun bathe with the intention to tan during the summer in Sweden? (Possible answers: never, seldom, sometimes, often, always).
  • How long do you usually stay in the sun between 11 am and 3 pm, during a typical day-off in the summer (June–August)? (Possible answers: <30 min, 30 min–1 h, 1–2 h, 2–3 h, >3 h).

Critique these questions. What biases may be present?

Exercise 10.6 Morón-Monge, Hamed, and Morón Monge (2021) studied primary-school children's knowledge of their natural environment. They were asked three questions:

  • No, I don’t like parks.
  • No, I don’t usually visit it.
  • Yes, once per week.
  • Yes, more than once a week
  • Two to three times
  • More than three times
  • Write a story
  • Draw a picture

Which questions are open and which are closed? Critique the questions.

Data Collection Methods | Step-by-Step Guide & Examples

Published on 4 May 2022 by Pritha Bhandari.

Data collection is a systematic process of gathering observations or measurements. Whether you are performing research for business, governmental, or academic purposes, data collection allows you to gain first-hand knowledge and original insights into your research problem .

While methods and aims may differ between fields, the overall process of data collection remains largely the same. Before you begin collecting data, you need to consider:

  • The aim of the research
  • The type of data that you will collect
  • The methods and procedures you will use to collect, store, and process the data

To collect high-quality data that is relevant to your purposes, follow these four steps.

Table of contents

  • Step 1: Define the aim of your research
  • Step 2: Choose your data collection method
  • Step 3: Plan your data collection procedures
  • Step 4: Collect the data
  • Frequently asked questions about data collection

Step 1: Define the aim of your research

Before you start the process of data collection, you need to identify exactly what you want to achieve. You can start by writing a problem statement: what is the practical or scientific issue that you want to address, and why does it matter?

Next, formulate one or more research questions that precisely define what you want to find out. Depending on your research questions, you might need to collect quantitative or qualitative data:

  • Quantitative data is expressed in numbers and graphs and is analysed through statistical methods.
  • Qualitative data is expressed in words and analysed through interpretations and categorisations.

If your aim is to test a hypothesis, measure something precisely, or gain large-scale statistical insights, collect quantitative data. If your aim is to explore ideas, understand experiences, or gain detailed insights into a specific context, collect qualitative data.

If you have several aims, you can use a mixed methods approach that collects both types of data.

For example, suppose you are researching employees' perceptions of their managers in a large organisation. A mixed methods approach could serve two aims:

  • Your first aim is to assess whether there are significant differences in perceptions of managers across different departments and office locations.
  • Your second aim is to gather meaningful feedback from employees to explore new ideas for how managers can improve.

Step 2: Choose your data collection method

Based on the data you want to collect, decide which method is best suited for your research.

  • Experimental research is primarily a quantitative method.
  • Interviews, focus groups, and ethnographies are qualitative methods.
  • Surveys, observations, archival research, and secondary data collection can be quantitative or qualitative methods.

Carefully consider what method you will use to gather data that helps you directly answer your research questions.

Step 3: Plan your data collection procedures

When you know which method(s) you are using, you need to plan exactly how you will implement them. What procedures will you follow to make accurate observations or measurements of the variables you are interested in?

For instance, if you’re conducting surveys or interviews, decide what form the questions will take; if you’re conducting an experiment, make decisions about your experimental design.

Operationalisation

Sometimes your variables can be measured directly: for example, you can collect data on the average age of employees simply by asking for dates of birth. However, often you’ll be interested in collecting data on more abstract concepts or variables that can’t be directly observed.

Operationalisation means turning abstract conceptual ideas into measurable observations. When planning how you will collect data, you need to translate the conceptual definition of what you want to study into the operational definition of what you will actually measure.

For example, to operationalise the abstract concept of managers' leadership skills:

  • You ask managers to rate their own leadership skills on 5-point scales assessing the ability to delegate, decisiveness, and dependability.
  • You ask their direct employees to provide anonymous feedback on the managers regarding the same topics.

You may need to develop a sampling plan to obtain data systematically. This involves defining a population, the group you want to draw conclusions about, and a sample, the group you will actually collect data from.

Your sampling method will determine how you recruit participants or obtain measurements for your study. To decide on a sampling method you will need to consider factors like the required sample size, accessibility of the sample, and time frame of the data collection.
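
As a small illustration of the mechanics (not part of the original guide), here is a Python sketch of drawing a simple random sample from a sampling frame; the frame, seed, and sample size are invented.

```python
# Minimal sketch: a simple random sample from a sampling frame.
# The frame of 2,000 hypothetical employees and the sample size
# of 200 are invented for illustration.
import random

sampling_frame = [f"employee_{i:04d}" for i in range(1, 2001)]

random.seed(42)  # record the seed so the draw can be reproduced
sample = random.sample(sampling_frame, k=200)  # 10% of the frame
print(sample[:5])
```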

Standardising procedures

If multiple researchers are involved, write a detailed manual to standardise data collection procedures in your study.

This means laying out specific step-by-step instructions so that everyone in your research team collects data in a consistent way – for example, by conducting experiments under the same conditions and using objective criteria to record and categorise observations.

This helps ensure the reliability of your data, and you can also use it to replicate the study in the future.

Creating a data management plan

Before beginning data collection, you should also decide how you will organise and store your data.

  • If you are collecting data from people, you will likely need to anonymise and safeguard the data to prevent leaks of sensitive information (e.g. names or identity numbers).
  • If you are collecting data via interviews or pencil-and-paper formats, you will need to perform transcriptions or data entry in systematic ways to minimise distortion.
  • You can prevent loss of data by having an organisation system that is routinely backed up.
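
As one concrete illustration of the first point, here is a hedged Python sketch of pseudonymising direct identifiers before storage. The column names, salt, and file name are all hypothetical, and a real data management plan would also cover secure storage of any re-identification key.

```python
# Minimal sketch: replacing names with stable pseudonyms before
# the data is stored or shared. All names and values are invented.
import hashlib
import pandas as pd

df = pd.DataFrame({"name": ["A. Jones", "B. Smith"], "score": [4, 5]})

def pseudonymise(value, salt="project-specific-secret"):
    # Salted hash: the same person always maps to the same code,
    # but the stored file no longer contains readable names.
    return hashlib.sha256((salt + value).encode()).hexdigest()[:10]

df["participant_id"] = df["name"].map(pseudonymise)
df = df.drop(columns=["name"])
df.to_csv("anonymised_responses.csv", index=False)
```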

Step 4: Collect the data

Finally, you can implement your chosen methods to measure or observe the variables you are interested in.

For example, in the employee survey described above, the closed-ended questions ask participants to rate their manager’s leadership skills on scales from 1 to 5. The data produced is numerical and can be statistically analysed for averages and patterns.

To ensure that high-quality data is recorded in a systematic way, here are some best practices:

  • Record all relevant information as and when you obtain data. For example, note down whether or how lab equipment is recalibrated during an experimental study.
  • Double-check manual data entry for errors.
  • If you collect quantitative data, you can assess the reliability and validity to get an indication of your data quality.
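
For the last point, one common reliability check on multi-item scales is Cronbach's alpha. Here is a minimal Python sketch (not from the original guide) using invented ratings; alpha is computed as k/(k-1) * (1 - sum of item variances / variance of respondents' totals).

```python
# Minimal sketch: Cronbach's alpha for a 3-item rating scale.
# The scores are invented; rows are respondents, columns are items.
import numpy as np

scores = np.array([
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [3, 3, 3],
])

k = scores.shape[1]
sum_item_var = scores.var(axis=0, ddof=1).sum()  # per-item variances
total_var = scores.sum(axis=1).var(ddof=1)       # variance of row totals
alpha = k / (k - 1) * (1 - sum_item_var / total_var)
print(round(alpha, 2))  # values near 1 suggest internally consistent items
```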

Frequently asked questions about data collection

Data collection is the systematic process by which observations or measurements are gathered in research. It is used in many different contexts by academics, governments, businesses, and other organisations.

When conducting research, collecting original data has significant advantages:

  • You can tailor data collection to your specific research aims (e.g., understanding the needs of your consumers or user testing your website).
  • You can control and standardise the process for high reliability and validity (e.g., choosing appropriate measurements and sampling methods ).

However, there are also some drawbacks: data collection can be time-consuming, labour-intensive, and expensive. In some cases, it’s more efficient to use secondary data that has already been collected by someone else, but the data might be less reliable.

Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.

Quantitative methods allow you to test a hypothesis by systematically collecting and analysing data, while qualitative methods allow you to explore ideas and experiences in depth.

Reliability and validity are both about how well a method measures something:

  • Reliability refers to the consistency of a measure (whether the results can be reproduced under the same conditions).
  • Validity refers to the accuracy of a measure (whether the results really do represent what they are supposed to measure).

If you are doing experimental research, you also have to consider the internal and external validity of your experiment.

In mixed methods research, you use both qualitative and quantitative data collection and analysis methods to answer your research question.

Operationalisation means turning abstract conceptual ideas into measurable observations.

For example, the concept of social anxiety isn’t directly observable, but it can be operationally defined in terms of self-rating scores, behavioural avoidance of crowded places, or physical anxiety symptoms in social situations.

Before collecting data, it’s important to consider how you will operationalise the variables that you want to measure.


Table of Contents

  • What is data collection?
  • Why do we need data collection?
  • What are the different data collection methods?
  • Data collection tools
  • The importance of ensuring accurate and appropriate data collection
  • Issues related to maintaining the integrity of data collection
  • What are common challenges in data collection?
  • What are the key steps in the data collection process?
  • Data collection considerations and best practices
  • Choose the right data science program
  • Are you interested in a career in data science?

What is Data Collection? Definition, Types, Tools, and Techniques

Data collection is the process of gathering and analyzing accurate data from various sources to find answers to research problems, identify trends and probabilities, and evaluate possible outcomes. Knowledge is power, information is knowledge, and data is information in digitized form, at least as defined in IT. Hence, data is power. But before you can leverage that data into a successful strategy for your organization or business, you need to gather it. That’s your first step.

So, to help you get the process started, we shine a spotlight on data collection. What exactly is it? Believe it or not, it’s more than just doing a Google search! Furthermore, what are the different types of data collection? And what kinds of data collection tools and data collection techniques exist?

If you want to get up to speed on the data collection process, you’ve come to the right place.

Data collection is the process of collecting and evaluating information or data from multiple sources to find answers to research problems, answer questions, evaluate outcomes, and forecast trends and probabilities. It is an essential phase in all types of research, analysis, and decision-making, including that done in the social sciences, business, and healthcare.

Accurate data collection is necessary to make informed business decisions, ensure quality assurance, and keep research integrity.

During data collection, the researchers must identify the data types, the sources of data, and what methods are being used. We will soon see that there are many different data collection methods. There is heavy reliance on data collection in research, commercial, and government fields.

Before an analyst begins collecting data, they must answer three questions first:

  • What’s the goal or purpose of this research?
  • What kinds of data are they planning on gathering?
  • What methods and procedures will be used to collect, store, and process the information?

Additionally, we can break up data into qualitative and quantitative types. Qualitative data covers descriptions such as color, size, quality, and appearance. Quantitative data, unsurprisingly, deals with numbers, such as statistics, poll numbers, percentages, etc.

Before a judge makes a ruling in a court case or a general creates a plan of attack, they must have as many relevant facts as possible. The best courses of action come from informed decisions, and information and data are synonymous.

The concept of data collection isn’t a new one, as we’ll see later, but the world has changed. There is far more data available today, and it exists in forms that were unheard of a century ago. The data collection process has had to change and grow with the times, keeping pace with technology.

Whether you’re in the world of academia, trying to conduct research, or part of the commercial sector, thinking of how to promote a new product, you need data collection to help you make better choices.

Now that you know what data collection is and why we need it, let's take a look at the different methods of data collection. While the phrase “data collection” may sound all high-tech and digital, it doesn’t necessarily entail things like computers, big data, and the internet. Data collection could mean a telephone survey, a mail-in comment card, or even some guy with a clipboard asking passersby some questions. But let’s see if we can sort the different data collection methods into a semblance of organized categories.

Primary and secondary methods of data collection are two approaches used to gather information for research or analysis purposes. Let's explore each data collection method in detail:

1. Primary Data Collection:

Primary data collection involves the collection of original data directly from the source or through direct interaction with the respondents. This method allows researchers to obtain firsthand information specifically tailored to their research objectives. There are various techniques for primary data collection, including:

a. Surveys and Questionnaires: Researchers design structured questionnaires or surveys to collect data from individuals or groups. These can be conducted through face-to-face interviews, telephone calls, mail, or online platforms.

b. Interviews: Interviews involve direct interaction between the researcher and the respondent. They can be conducted in person, over the phone, or through video conferencing. Interviews can be structured (with predefined questions), semi-structured (allowing flexibility), or unstructured (more conversational).

c. Observations: Researchers observe and record behaviors, actions, or events in their natural setting. This method is useful for gathering data on human behavior, interactions, or phenomena without direct intervention.

d. Experiments: Experimental studies involve the manipulation of variables to observe their impact on the outcome. Researchers control the conditions and collect data to draw conclusions about cause-and-effect relationships.

e. Focus Groups: Focus groups bring together a small group of individuals who discuss specific topics in a moderated setting. This method helps in understanding opinions, perceptions, and experiences shared by the participants.

2. Secondary Data Collection:

Secondary data collection involves using existing data collected by someone else for a purpose different from the original intent. Researchers analyze and interpret this data to extract relevant information. Secondary data can be obtained from various sources, including:

a. Published Sources: Researchers refer to books, academic journals, magazines, newspapers, government reports, and other published materials that contain relevant data.

b. Online Databases: Numerous online databases provide access to a wide range of secondary data, such as research articles, statistical information, economic data, and social surveys.

c. Government and Institutional Records: Government agencies, research institutions, and organizations often maintain databases or records that can be used for research purposes.

d. Publicly Available Data: Data shared by individuals, organizations, or communities on public platforms, websites, or social media can be accessed and utilized for research.

e. Past Research Studies: Previous research studies and their findings can serve as valuable secondary data sources. Researchers can review and analyze the data to gain insights or build upon existing knowledge.

Now that we’ve explained the various techniques, let’s narrow our focus even further by looking at some specific tools. For example, we mentioned interviews as a technique, but we can further break that down into different interview types (or “tools”).

Word Association

The researcher gives the respondent a set of words and asks them what comes to mind when they hear each word.

Sentence Completion

Researchers use sentence completion to understand what kind of ideas the respondent has. This tool involves giving an incomplete sentence and seeing how the interviewee finishes it.

Role-Playing

Respondents are presented with an imaginary situation and asked how they would act or react if it was real.

In-Person Surveys

The researcher asks questions in person.

Online/Web Surveys

These surveys are easy to accomplish, but some users may be unwilling to answer truthfully, if at all.

Mobile Surveys

These surveys take advantage of the increasing proliferation of mobile technology. Mobile collection surveys rely on mobile devices like tablets or smartphones to conduct surveys via SMS or mobile apps.

Phone Surveys

No researcher can call thousands of people at once, so they need a third party to handle the chore. However, many people have call screening and won’t answer.

Observation

Sometimes, the simplest method is the best. Researchers who make direct observations collect data quickly and easily, with little intrusion or third-party bias. Naturally, it’s only effective in small-scale situations.

Accurate data collection is crucial to preserving the integrity of research, regardless of the field of study or whether the data is quantitative or qualitative. Errors are less likely to occur when the right data gathering tools are used, whether they are new, updated, or well-established instruments.

The effects of data collection done incorrectly include the following:

  • Erroneous conclusions that squander resources
  • Decisions that compromise public policy
  • Incapacity to correctly respond to research inquiries
  • Bringing harm to participants who are humans or animals
  • Deceiving other researchers into pursuing futile research avenues
  • The study's inability to be replicated and validated

While the degree of impact from flawed data collection may vary by discipline and the type of investigation, the potential for disproportionate harm is greatest when study findings are used to support recommendations for public policy.

Let us now look at the various issues that we might face while maintaining the integrity of data collection.

The main justification for maintaining data integrity is to support the detection of errors in the data gathering process, whether they were made deliberately (falsification) or not (systematic or random errors).

Quality assurance and quality control are two strategies that help protect data integrity and guarantee the scientific validity of study results.

Each strategy is used at various stages of the research timeline:

  • Quality assurance - events that happen before data gathering starts
  • Quality control - tasks that are performed both during and after data collecting

Let us explore each of them in more detail now.

Quality Assurance

As quality assurance comes before data collecting, its primary goal is "prevention" (i.e., forestalling problems with data collection). The best way to protect the accuracy of data collection is through prevention. The clearest example of this proactive step is the uniformity of protocol created in a thorough and exhaustive procedures manual for data collection.

The likelihood of failing to spot issues and mistakes early in the research attempt increases when guides are written poorly. There are several ways to show these shortcomings:

  • Failure to specify exactly what and how staff members should be trained or retrained in data collection
  • An incomplete list of the items to be collected
  • No system in place to track modifications to processes that may occur as the investigation continues
  • A vague description of the data gathering tools to be employed, instead of detailed, step-by-step instructions on how to administer tests
  • Uncertainty regarding when, how, and by whom the data will be reviewed
  • Incomprehensible guidelines for using, adjusting, and calibrating the data collection equipment

Now, let us look at how to ensure Quality Control.

Quality Control

Although quality control activities (detection/monitoring and intervention) occur both during and after data collection, their specifics should be meticulously detailed in the procedures manual. A clearly defined communication structure is a prerequisite for establishing monitoring systems. Following the discovery of data collection problems, there should be no ambiguity regarding the information flow between the primary investigators and staff personnel. A poorly designed communication system promotes slack oversight and reduces opportunities for error detection.

Detection or monitoring can take the form of direct staff observation during site visits or conference calls, or frequent and routine reviews of data reports to spot discrepancies, out-of-range values, or invalid codes. Site visits might not be appropriate for all disciplines. Still, without routine auditing of records, whether qualitative or quantitative, it will be challenging for investigators to confirm that data gathering is taking place in accordance with the manual's defined methods. Additionally, quality control identifies the appropriate responses, or "actions," to fix flawed data gathering procedures and reduce recurrences.
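
To make the routine review of data reports concrete, here is a hedged Python sketch of an automated check for out-of-range values and invalid codes; the field names, valid ranges, and codes are hypothetical.

```python
# Minimal sketch: auditing incoming records for discrepancies,
# implausible values, and invalid codes. All fields are invented.
import pandas as pd

records = pd.DataFrame({
    "site": ["A", "A", "B", "B"],
    "age": [34, 210, 29, 41],                 # 210 is implausible
    "visit_code": ["V1", "V2", "V9", "V1"],   # V9 is not a defined code
})

valid_codes = {"V1", "V2", "V3"}
problems = records[
    ~records["age"].between(0, 120) | ~records["visit_code"].isin(valid_codes)
]
print(problems)  # flagged rows are routed back to the responsible site
```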

Problems with data collection, for instance, that call for immediate action include:

  • Fraud or misbehavior
  • Systematic mistakes, procedure violations 
  • Individual data items with errors
  • Issues with certain staff members or a site's performance 

In the social and behavioral sciences, where primary data collection involves human subjects, researchers are trained to include one or more secondary measures that can be used to verify the quality of the information being obtained.

For instance, a researcher conducting a survey might be interested in learning more about the prevalence of risky behaviors among young adults, as well as the social factors that influence the propensity for and frequency of those risky behaviors. Let us now explore the common challenges with regard to data collection.

There are some prevalent challenges faced while collecting data, let us explore a few of them to understand them better and avoid them.

Data Quality Issues

The main threat to the broad and successful application of machine learning is poor data quality. Data quality must be your top priority if you want to make technologies like machine learning work for you. Let's talk about some of the most prevalent data quality problems in this blog article and how to fix them.

Inconsistent Data

When working with various data sources, it's conceivable that the same information will have discrepancies between sources. The differences could be in formats, units, or occasionally spellings. The introduction of inconsistent data might also occur during firm mergers or relocations. Inconsistencies in data have a tendency to accumulate and reduce the value of data if they are not continually resolved. Organizations that have heavily focused on data consistency do so because they only want reliable data to support their analytics.
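
A hedged Python sketch of the kind of reconciliation this requires, with invented spellings and units:

```python
# Minimal sketch: resolving inconsistencies across merged sources,
# where the same facts arrive in different spellings and units.
import pandas as pd

df = pd.DataFrame({
    "country": ["USA", "U.S.A.", "United States"],
    "height": [180.0, 5.9, 178.0],
    "height_unit": ["cm", "ft", "cm"],
})

# Normalize spellings against an explicit mapping table.
df["country"] = df["country"].replace({"U.S.A.": "USA", "United States": "USA"})

# Convert all measurements to a single unit before analysis.
is_ft = df["height_unit"] == "ft"
df.loc[is_ft, "height"] = df.loc[is_ft, "height"] * 30.48
df["height_unit"] = "cm"
print(df)
```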

Data Downtime

Data is the driving force behind the decisions and operations of data-driven businesses. However, there may be brief periods when their data is unreliable or not ready. This data unavailability can have a significant impact on businesses, from customer complaints to subpar analytical outcomes. A data engineer spends about 80% of their time updating, maintaining, and guaranteeing the integrity of the data pipeline. The lengthy operational lead time from data capture to insight means there is a high marginal cost to asking the next business question.

Schema modifications and migration problems are just two examples of the causes of data downtime. Data pipelines can be difficult due to their size and complexity. Data downtime must be continuously monitored, and it must be reduced through automation.

Ambiguous Data

Even with thorough oversight, some errors can still occur in massive databases or data lakes. The issue becomes more overwhelming for data streaming at high speed. Spelling mistakes can go unnoticed, formatting issues can arise, and column headings can be misleading. Such ambiguous data can cause a variety of problems for reporting and analytics.
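
A lightweight validation pass can surface this kind of ambiguity before it reaches reports. The following pandas sketch uses invented columns and rules purely for illustration:

```python
import pandas as pd

# Hypothetical extract showing mixed date formats and category drift.
df = pd.DataFrame({
    "dob": ["1990-04-12", "12/04/1990", "1985-11-30"],  # one row breaks ISO format
    "status": ["active", "actve", "ACTIVE"],            # typo and case drift
})

# Flag rows whose date string does not parse under the expected ISO format.
bad_dates = pd.to_datetime(df["dob"], format="%Y-%m-%d", errors="coerce").isna()

# Flag values outside the allowed category set (after case-folding).
allowed = {"active", "inactive"}
bad_status = ~df["status"].str.lower().isin(allowed)

print(df[bad_dates | bad_status])  # rows needing review before analysis
```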


Duplicate Data

Streaming data, local databases, and cloud data lakes are just a few of the sources of data that modern enterprises must contend with. They might also have application and system silos. These sources are likely to duplicate and overlap each other quite a bit. For instance, duplicate contact information has a substantial impact on customer experience. If certain prospects are ignored while others are engaged repeatedly, marketing campaigns suffer. The likelihood of biased analytical outcomes increases when duplicate data are present. It can also result in ML models with biased training data.
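
As a rough illustration, deduplication often starts with normalizing the fields that identify a record and then dropping exact duplicates; the contact data below is invented:

```python
import pandas as pd

# Hypothetical contact list assembled from several systems.
contacts = pd.DataFrame({
    "email": ["jo@example.com", "JO@example.com ", "mei@example.com"],
    "name":  ["Jo Smith", "Jo Smith", "Mei Lin"],
})

# Light normalization first, so near-duplicates become exact duplicates.
contacts["email"] = contacts["email"].str.strip().str.lower()
deduped = contacts.drop_duplicates(subset="email", keep="first")
print(deduped)  # each prospect now appears once
```

Real-world matching often also requires fuzzy comparison of names and addresses; the exact-match step above is only a first line of defense.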

Too Much Data

While we emphasize data-driven analytics and its advantages, a data quality problem with excessive data exists. There is a risk of getting lost in an abundance of data when searching for information pertinent to your analytical efforts. Data scientists, data analysts, and business users devote 80% of their work to finding and organizing the appropriate data. With an increase in data volume, other problems with data quality become more serious, particularly when dealing with streaming data and big files or databases.

Inaccurate Data

For highly regulated industries like healthcare, data accuracy is crucial. Recent experience has made it more important than ever to improve data quality for COVID-19 and later pandemics. Inaccurate information does not give you a true picture of the situation and cannot be used to plan the best course of action. Personalized customer experiences and marketing strategies underperform if your customer data is inaccurate.

Data inaccuracies can be attributed to a number of causes, including data degradation, human error, and data drift. Worldwide data decay occurs at a rate of about 3% per month, which is quite concerning. Data integrity can be compromised as data is transferred between systems, and data quality can deteriorate over time.
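
The 3% figure compounds: if decay is treated as roughly independent from month to month (a simplifying assumption), the share of records still accurate after n months is about 0.97^n, as this small calculation shows:

```python
# Compound effect of ~3% monthly data decay (simplifying assumption:
# decay is independent month to month).
for months in (6, 12, 24):
    accurate = 0.97 ** months
    print(f"after {months:2d} months: {accurate:.1%} accurate, "
          f"{1 - accurate:.1%} decayed")
# after  6 months: 83.3% accurate, 16.7% decayed
# after 12 months: 69.4% accurate, 30.6% decayed
# after 24 months: 48.1% accurate, 51.9% decayed
```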

Hidden Data

The majority of businesses use only a portion of their data, with the remainder sometimes lost in data silos or discarded in data graveyards. For instance, the customer service team might not receive client data from sales, missing an opportunity to build more precise and comprehensive customer profiles. Hidden data means missed opportunities to develop novel products, enhance services, and streamline processes.

Finding Relevant Data

Finding relevant data is not so easy. There are several factors to consider when trying to find relevant data, including:

  • Relevant domain
  • Relevant demographics
  • Relevant time period, among many other factors

Data that is irrelevant to our study on any of these factors becomes unusable, and we cannot effectively proceed with its analysis. This can lead to incomplete research or analysis, repeated rounds of data collection, or even abandonment of the study.

Deciding the Data to Collect

Determining what data to collect is one of the most important decisions in data collection and should be made first. We must choose the subjects the data will cover, the sources we will use to gather it, and the quantity of information we will require. Our answers to these questions will depend on our aims, or what we expect to achieve using the data. As an illustration, we may choose to gather information on the categories of articles that website visitors between the ages of 20 and 50 most frequently access. We might also decide to compile data on the typical age of all the customers who made a purchase from the business over the previous month.

Not addressing this up front can lead to duplicated work, collection of irrelevant data, or even the failure of the study as a whole.

Dealing With Big Data

Big data refers to exceedingly large data sets with more intricate and diversified structures. These traits typically make storing, analyzing, and extracting results more challenging and require additional methods. The term applies especially to data sets so large or complex that conventional data processing tools are insufficient for the overwhelming volume of structured and unstructured data a business faces on a daily basis.

The amount of data produced by healthcare applications, the internet, social networking sites, sensor networks, and many other industries is growing rapidly as a result of recent technological advancements. Big data refers to the vast volume of data created from numerous sources, in a variety of formats, at extremely fast rates. Dealing with this kind of data is one of the major challenges of data collection and is a crucial step toward collecting effective data.

Low Response and Other Research Issues

Poor design and low response rates have been shown to be two common issues with data collection, particularly in health surveys that use questionnaires. These can leave the study with an insufficient or inadequate supply of data. Creating an incentivized data collection program can be beneficial in this case to attract more responses.

Now, let us look at the key steps in the data collection process.

There are five key steps in the data collection process. They are explained briefly below.

1. Decide What Data You Want to Gather

The first thing that we need to do is decide what information we want to gather. We must choose the subjects the data will cover, the sources we will use to gather it, and the quantity of information that we would require. For instance, we may choose to gather information on the categories of products that an average e-commerce website visitor between the ages of 30 and 45 most frequently searches for. 

2. Establish a Deadline for Data Collection

The process of creating a strategy for data collection can now begin. We should set a deadline for our data collection at the outset of the planning phase. Some forms of data we may want to collect continuously; for instance, we might set up a technique for tracking transactional data and website visitor statistics over the long term. If we are tracking data for a particular campaign, however, we will track it over a defined time frame, with a schedule for when we will begin and finish gathering data.

3. Select a Data Collection Approach

At this stage, we select the data collection technique that will serve as the foundation of our data gathering plan. To choose the best gathering strategy, we must take into account the type of information we wish to collect, the time frame over which we will obtain it, and the other factors we have decided on.

4. Gather Information

Once our plan is complete, we can put it into action and begin gathering data. We can store and organize our data in a data management platform (DMP). We need to be careful to follow our plan and keep an eye on its progress. Especially if we are collecting data regularly, it may help to set a timetable for checking in on how the data gathering is going. As circumstances change and we learn new details, we may need to amend the plan.

5. Examine the Information and Apply Your Findings

It's time to examine our data and arrange our findings after we have gathered all of our information. The analysis stage is essential because it transforms unprocessed data into insightful knowledge that can be applied to better our marketing plans, goods, and business judgments. The analytics tools included in our DMP can be used to assist with this phase. We can put the discoveries to use to enhance our business once we have discovered the patterns and insights in our data.

Let us now look at some data collection considerations and best practices that one might follow.

We must plan carefully before spending time and money traveling to the field to gather data. Effective data collection strategies can help us collect richer, more accurate data while saving time and resources.

Below, we will be discussing some of the best practices that we can follow for the best results -

1. Take Into Account the Price of Each Extra Data Point

Once we have decided on the data we want to gather, we need to make sure to take the expense of doing so into account. Our surveyors and respondents will incur additional costs for each additional data point or survey question.

2. Plan How to Gather Each Data Piece

Freely accessible data is scarce. Sometimes the data exists, but we may not have access to it; for instance, without a compelling reason, we cannot openly view another person's medical records. Several types of information can also be challenging to measure.

Consider how time-consuming and difficult it will be to gather each piece of information while deciding what data to acquire.

3. Think About Your Choices for Data Collecting Using Mobile Devices

Mobile-based data collection can be divided into three categories -

  • IVRS (interactive voice response system) - calls respondents and asks them questions that have already been recorded.
  • SMS data collection - sends a text message to the respondent, who can then answer questions by text on their phone.
  • Field surveyors - can enter data directly into an interactive questionnaire while speaking to each respondent, thanks to smartphone apps.

We need to select the appropriate tool for our survey and respondents, because each one has its own advantages and disadvantages.

4. Carefully Consider the Data You Need to Gather

It's all too easy to get information about anything and everything, but it's crucial to only gather the information that we require. 

It is helpful to consider these 3 questions:

  • What details will be helpful?
  • What details are available?
  • What specific details do you require?

5. Remember to Consider Identifiers

Identifiers, or details describing the context and source of a survey response, are just as crucial as the information about the subject or program that we are actually researching.

In general, adding more identifiers will enable us to pinpoint our program's successes and failures with greater accuracy, but moderation is the key.

6. Data Collecting Through Mobile Devices is the Way to Go

Although collecting data on paper is still common, modern data collection relies heavily on mobile devices. They enable us to gather many different types of data at relatively low cost, and they are both accurate and quick. With the boom in low-cost Android devices available nowadays, there are few reasons not to choose mobile-based data collection.

Frequently Asked Questions (FAQs)

1. What is data collection with example?

Data collection is the process of collecting and analyzing information on relevant variables in a predetermined, methodical way so that one can respond to specific research questions, test hypotheses, and assess results. Data collection can be either qualitative or quantitative. Example: A company collects customer feedback through online surveys and social media monitoring to improve their products and services.

2. What are the primary data collection methods?

As is well known, gathering primary data is costly and time intensive. The main techniques for gathering data are observation, interviews, questionnaires, schedules, and surveys.

3. What are data collection tools?

The term "data collecting tools" refers to the tools/devices used to gather data, such as a paper questionnaire or a system for computer-assisted interviews. Tools used to gather data include case studies, checklists, interviews, occasionally observation, surveys, and questionnaires.

4. What’s the difference between quantitative and qualitative methods?

While qualitative research focuses on words and meanings, quantitative research deals with figures and statistics. You can systematically measure variables and test hypotheses using quantitative methods. You can delve deeper into ideas and experiences using qualitative methodologies.

5. What are quantitative data collection methods?

While there are numerous ways to collect quantitative information, the methods indicated above (probability sampling, interviews, questionnaires, observation, and document review) are the most typical and frequently employed, whether the information is collected offline or online.

6. What is mixed methods research?

Research that combines both qualitative and quantitative techniques is known as mixed methods research. For deeper insights, mixed methods research pairs rich qualitative data with useful statistics.

7. What are the benefits of collecting data?

Collecting data offers several benefits, including:

  • Knowledge and Insight
  • Evidence-Based Decision Making
  • Problem Identification and Solution
  • Validation and Evaluation
  • Identifying Trends and Predictions
  • Support for Research and Development
  • Policy Development
  • Quality Improvement
  • Personalization and Targeting
  • Knowledge Sharing and Collaboration

8. What’s the difference between reliability and validity?

Reliability is about consistency and stability, while validity is about accuracy and appropriateness. Reliability focuses on the consistency of results, while validity focuses on whether the results are actually measuring what they are intended to measure. Both reliability and validity are crucial considerations in research to ensure the trustworthiness and meaningfulness of the collected data and measurements.


Research-Methodology

Data Collection Methods

Data collection is a process of collecting information from all the relevant sources to find answers to the research problem, test the hypothesis (if you are following a deductive approach), and evaluate the outcomes. Data collection methods can be divided into two categories: secondary methods of data collection and primary methods of data collection.

Secondary Data Collection Methods

Secondary data is data that has already been published in books, newspapers, magazines, journals, online portals, etc. These sources offer an abundance of data about almost any research area in business studies. Therefore, applying an appropriate set of criteria to select the secondary data used in the study plays an important role in increasing the levels of research validity and reliability.

These criteria include, but are not limited to, the date of publication, the credentials of the author, the reliability of the source, the quality of the discussion, the depth of analysis, and the extent of the text's contribution to the development of the research area. Secondary data collection is discussed in greater depth in the Literature Review chapter.

Secondary data collection methods offer a range of advantages, such as saving time, effort, and expense. However, they have a major disadvantage: secondary research does not contribute to the expansion of the literature by producing fresh (new) data.

Primary Data Collection Methods

Primary data is data that did not exist before your study; it constitutes the unique findings of your research. Primary data collection and analysis typically require more time and effort than secondary data research. Primary data collection methods can be divided into two groups: quantitative and qualitative.

Quantitative data collection methods are based on mathematical calculations in various formats. They include questionnaires with closed-ended questions, correlation and regression analysis, and measures such as the mean, median, and mode.

Quantitative methods are cheaper to apply and can be completed within a shorter time than qualitative methods. Moreover, thanks to the high level of standardisation of quantitative methods, it is easy to compare findings.
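
As a small illustration of the descriptive measures mentioned above, the sketch below summarizes hypothetical closed-ended questionnaire responses using Python's standard library (statistics.correlation requires Python 3.10 or later):

```python
import statistics

# Hypothetical responses to a closed-ended question on a 1-5 scale.
responses = [4, 5, 3, 4, 2, 4, 5, 3, 4, 4]
print("mean:  ", statistics.mean(responses))
print("median:", statistics.median(responses))
print("mode:  ", statistics.mode(responses))

# Correlation between two measured variables, e.g., satisfaction
# scores and hours of product usage (both invented).
usage = [10, 14, 6, 9, 3, 11, 15, 7, 10, 12]
print("correlation:", round(statistics.correlation(responses, usage), 3))
```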

Qualitative research methods, on the contrary, do not involve numbers or mathematical calculations. Qualitative research is closely associated with words, sounds, feelings, emotions, colours, and other non-quantifiable elements.

Qualitative studies aim at a greater depth of understanding; qualitative data collection methods include interviews, questionnaires with open-ended questions, focus groups, observation, games or role-playing, case studies, etc.

Your choice between quantitative or qualitative methods of data collection depends on the area of your research and the nature of research aims and objectives.

My e-book, The Ultimate Guide to Writing a Dissertation in Business Studies: a step by step assistance, offers practical help to complete a dissertation with minimum or no stress. The e-book covers all stages of writing a dissertation, starting from the selection of the research area to submitting the completed work within the deadline.

John Dudovskiy


Can J Hosp Pharm. 2015;68(3)

Qualitative Research: Data Collection, Analysis, and Management

INTRODUCTION

In an earlier paper, 1 we presented an introduction to using qualitative research methods in pharmacy practice. In this article, we review some principles of the collection, analysis, and management of qualitative data to help pharmacists interested in doing research in their practice to continue their learning in this area. Qualitative research can help researchers to access the thoughts and feelings of research participants, which can enable development of an understanding of the meaning that people ascribe to their experiences. Whereas quantitative research methods can be used to determine how many people undertake particular behaviours, qualitative methods can help researchers to understand how and why such behaviours take place. Within the context of pharmacy practice research, qualitative approaches have been used to examine a diverse array of topics, including the perceptions of key stakeholders regarding prescribing by pharmacists and the postgraduation employment experiences of young pharmacists (see “Further Reading” section at the end of this article).

In the previous paper, 1 we outlined 3 commonly used methodologies: ethnography 2 , grounded theory 3 , and phenomenology. 4 Briefly, ethnography involves researchers using direct observation to study participants in their “real life” environment, sometimes over extended periods. Grounded theory and its later modified versions (e.g., Strauss and Corbin 5 ) use face-to-face interviews and interactions such as focus groups to explore a particular research phenomenon and may help in clarifying a less-well-understood problem, situation, or context. Phenomenology shares some features with grounded theory (such as an exploration of participants’ behaviour) and uses similar techniques to collect data, but it focuses on understanding how human beings experience their world. It gives researchers the opportunity to put themselves in another person’s shoes and to understand the subjective experiences of participants. 6 Some researchers use qualitative methodologies but adopt a different standpoint, and an example of this appears in the work of Thurston and others, 7 discussed later in this paper.

Qualitative work requires reflection on the part of researchers, both before and during the research process, as a way of providing context and understanding for readers. When being reflexive, researchers should not try to simply ignore or avoid their own biases (as this would likely be impossible); instead, reflexivity requires researchers to reflect upon and clearly articulate their position and subjectivities (world view, perspectives, biases), so that readers can better understand the filters through which questions were asked, data were gathered and analyzed, and findings were reported. From this perspective, bias and subjectivity are not inherently negative but they are unavoidable; as a result, it is best that they be articulated up-front in a manner that is clear and coherent for readers.

THE PARTICIPANT’S VIEWPOINT

What qualitative study seeks to convey is why people have thoughts and feelings that might affect the way they behave. Such study may occur in any number of contexts, but here, we focus on pharmacy practice and the way people behave with regard to medicines use (e.g., to understand patients’ reasons for nonadherence with medication therapy or to explore physicians’ resistance to pharmacists’ clinical suggestions). As we suggested in our earlier article, 1 an important point about qualitative research is that there is no attempt to generalize the findings to a wider population. Qualitative research is used to gain insights into people’s feelings and thoughts, which may provide the basis for a future stand-alone qualitative study or may help researchers to map out survey instruments for use in a quantitative study. It is also possible to use different types of research in the same study, an approach known as “mixed methods” research, and further reading on this topic may be found at the end of this paper.

The role of the researcher in qualitative research is to attempt to access the thoughts and feelings of study participants. This is not an easy task, as it involves asking people to talk about things that may be very personal to them. Sometimes the experiences being explored are fresh in the participant’s mind, whereas on other occasions reliving past experiences may be difficult. However the data are being collected, a primary responsibility of the researcher is to safeguard participants and their data. Mechanisms for such safeguarding must be clearly articulated to participants and must be approved by a relevant research ethics review board before the research begins. Researchers and practitioners new to qualitative research should seek advice from an experienced qualitative researcher before embarking on their project.

DATA COLLECTION

Whatever philosophical standpoint the researcher is taking and whatever the data collection method (e.g., focus group, one-to-one interviews), the process will involve the generation of large amounts of data. In addition to the variety of study methodologies available, there are also different ways of making a record of what is said and done during an interview or focus group, such as taking handwritten notes or video-recording. If the researcher is audio- or video-recording data collection, then the recordings must be transcribed verbatim before data analysis can begin. As a rough guide, it can take an experienced researcher/transcriber 8 hours to transcribe one 45-minute audio-recorded interview, a process that will generate 20–30 pages of written dialogue.

Many researchers will also maintain a folder of “field notes” to complement audio-taped interviews. Field notes allow the researcher to maintain and comment upon impressions, environmental contexts, behaviours, and nonverbal cues that may not be adequately captured through the audio-recording; they are typically handwritten in a small notebook at the same time the interview takes place. Field notes can provide important context to the interpretation of audio-taped data and can help remind the researcher of situational factors that may be important during data analysis. Such notes need not be formal, but they should be maintained and secured in a similar manner to audio tapes and transcripts, as they contain sensitive information and are relevant to the research. For more information about collecting qualitative data, please see the “Further Reading” section at the end of this paper.

DATA ANALYSIS AND MANAGEMENT

If, as suggested earlier, doing qualitative research is about putting oneself in another person’s shoes and seeing the world from that person’s perspective, the most important part of data analysis and management is to be true to the participants. It is their voices that the researcher is trying to hear, so that they can be interpreted and reported on for others to read and learn from. To illustrate this point, consider the anonymized transcript excerpt presented in Appendix 1 , which is taken from a research interview conducted by one of the authors (J.S.). We refer to this excerpt throughout the remainder of this paper to illustrate how data can be managed, analyzed, and presented.

Interpretation of Data

Interpretation of the data will depend on the theoretical standpoint taken by researchers. For example, the title of the research report by Thurston and others, 7 “Discordant indigenous and provider frames explain challenges in improving access to arthritis care: a qualitative study using constructivist grounded theory,” indicates at least 2 theoretical standpoints. The first is the culture of the indigenous population of Canada and the place of this population in society, and the second is the social constructivist theory used in the constructivist grounded theory method. With regard to the first standpoint, it can be surmised that, to have decided to conduct the research, the researchers must have felt that there was anecdotal evidence of differences in access to arthritis care for patients from indigenous and non-indigenous backgrounds. With regard to the second standpoint, it can be surmised that the researchers used social constructivist theory because it assumes that behaviour is socially constructed; in other words, people do things because of the expectations of those in their personal world or in the wider society in which they live. (Please see the “Further Reading” section for resources providing more information about social constructivist theory and reflexivity.) Thus, these 2 standpoints (and there may have been others relevant to the research of Thurston and others 7 ) will have affected the way in which these researchers interpreted the experiences of the indigenous population participants and those providing their care. Another standpoint is feminist standpoint theory which, among other things, focuses on marginalized groups in society. Such theories are helpful to researchers, as they enable us to think about things from a different perspective. Being aware of the standpoints you are taking in your own research is one of the foundations of qualitative work. Without such awareness, it is easy to slip into interpreting other people’s narratives from your own viewpoint, rather than that of the participants.

To analyze the example in Appendix 1, we will adopt a phenomenological approach because we want to understand how the participant experienced the illness and we want to try to see the experience from that person's perspective. It is important for the researcher to reflect upon and articulate his or her starting point for such analysis; in this example, the coder could reflect upon her own experience as a female of a majority ethnocultural group who has lived within middle class and upper middle class settings. This personal history therefore forms the filter through which the data will be examined. This filter does not diminish the quality or significance of the analysis, since every researcher has his or her own filters; however, by explicitly stating and acknowledging what these filters are, the researcher makes it easier for readers to contextualize the work.

Transcribing and Checking

For the purposes of this paper it is assumed that interviews or focus groups have been audio-recorded. As mentioned above, transcribing is an arduous process, even for the most experienced transcribers, but it must be done to convert the spoken word to the written word to facilitate analysis. For anyone new to conducting qualitative research, it is beneficial to transcribe at least one interview and one focus group. It is only by doing this that researchers realize how difficult the task is, and this realization affects their expectations when asking others to transcribe. If the research project has sufficient funding, then a professional transcriber can be hired to do the work. If this is the case, then it is a good idea to sit down with the transcriber, if possible, and talk through the research and what the participants were talking about. This background knowledge for the transcriber is especially important in research in which people are using jargon or medical terms (as in pharmacy practice). Involving your transcriber in this way makes the work both easier and more rewarding, as he or she will feel part of the team. Transcription editing software is also available, but it is expensive. For example, ELAN (more formally known as EUDICO Linguistic Annotator, developed at the Max Planck Institute for Psycholinguistics) 8 is a tool that can help keep data organized by linking media and data files (particularly valuable if, for example, video-taping of interviews is complemented by transcriptions). It can also be helpful in searching complex data sets. Products such as ELAN do not actually transcribe interviews or complete analyses automatically, and they do require some time and effort to learn; nonetheless, for some research applications, it may be valuable to consider such software tools.

All audio recordings should be transcribed verbatim, regardless of how intelligible the transcript may be when it is read back. Lines of text should be numbered. Once the transcription is complete, the researcher should read it while listening to the recording and do the following: correct any spelling or other errors; anonymize the transcript so that the participant cannot be identified from anything that is said (e.g., names, places, significant events); insert notations for pauses, laughter, looks of discomfort; insert any punctuation, such as commas and full stops (periods) (see Appendix 1 for examples of inserted punctuation), and include any other contextual information that might have affected the participant (e.g., temperature or comfort of the room).
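
Parts of this checking work can be scripted. As a hedged illustration (the helper below is hypothetical and not part of any standard qualitative-analysis workflow), a few lines of Python can number transcript lines and replace known names with anonymized labels, leaving the contextual annotations to the researcher:

```python
import re

def prepare_transcript(text: str, names: dict) -> str:
    """Anonymize known names, then number each line of the transcript."""
    for real, label in names.items():
        # Word boundaries keep partial matches from being replaced.
        text = re.sub(rf"\b{re.escape(real)}\b", label, text)
    return "\n".join(f"{i:>3}  {line}"
                     for i, line in enumerate(text.splitlines(), start=1))

raw = ("Dr Jones asked about my medication.\n"
       "I told Dr Jones it made me drowsy.")
print(prepare_transcript(raw, {"Dr Jones": "Dr XXX"}))
```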

Dealing with the transcription of a focus group is slightly more difficult, as multiple voices are involved. One way of transcribing such data is to “tag” each voice (e.g., Voice A, Voice B). In addition, the focus group will usually have 2 facilitators, whose respective roles will help in making sense of the data. While one facilitator guides participants through the topic, the other can make notes about context and group dynamics. More information about group dynamics and focus groups can be found in resources listed in the “Further Reading” section.

Reading between the Lines

During the process outlined above, the researcher can begin to get a feel for the participant’s experience of the phenomenon in question and can start to think about things that could be pursued in subsequent interviews or focus groups (if appropriate). In this way, one participant’s narrative informs the next, and the researcher can continue to interview until nothing new is being heard or, as it says in the text books, “saturation is reached”. While continuing with the processes of coding and theming (described in the next 2 sections), it is important to consider not just what the person is saying but also what they are not saying. For example, is a lengthy pause an indication that the participant is finding the subject difficult, or is the person simply deciding what to say? The aim of the whole process from data collection to presentation is to tell the participants’ stories using exemplars from their own narratives, thus grounding the research findings in the participants’ lived experiences.

Smith 9 suggested a qualitative research method known as interpretative phenomenological analysis, which has 2 basic tenets: first, that it is rooted in phenomenology, attempting to understand the meaning that individuals ascribe to their lived experiences, and second, that the researcher must attempt to interpret this meaning in the context of the research. That the researcher has some knowledge and expertise in the subject of the research means that he or she can have considerable scope in interpreting the participant’s experiences. Larkin and others 10 discussed the importance of not just providing a description of what participants say. Rather, interpretative phenomenological analysis is about getting underneath what a person is saying to try to truly understand the world from his or her perspective.

Once all of the research interviews have been transcribed and checked, it is time to begin coding. Field notes compiled during an interview can be a useful complementary source of information to facilitate this process, as the gap in time between an interview, transcribing, and coding can result in memory bias regarding nonverbal or environmental context issues that may affect interpretation of data.

Coding refers to the identification of topics, issues, similarities, and differences that are revealed through the participants’ narratives and interpreted by the researcher. This process enables the researcher to begin to understand the world from each participant’s perspective. Coding can be done by hand on a hard copy of the transcript, by making notes in the margin or by highlighting and naming sections of text. More commonly, researchers use qualitative research software (e.g., NVivo, QSR International Pty Ltd; www.qsrinternational.com/products_nvivo.aspx ) to help manage their transcriptions. It is advised that researchers undertake a formal course in the use of such software or seek supervision from a researcher experienced in these tools.

Returning to Appendix 1 and reading from lines 8–11, a code for this section might be “diagnosis of mental health condition”, but this would just be a description of what the participant is talking about at that point. If we read a little more deeply, we can ask ourselves how the participant might have come to feel that the doctor assumed he or she was aware of the diagnosis or indeed that they had only just been told the diagnosis. There are a number of pauses in the narrative that might suggest the participant is finding it difficult to recall that experience. Later in the text, the participant says “nobody asked me any questions about my life” (line 19). This could be coded simply as “health care professionals’ consultation skills”, but that would not reflect how the participant must have felt never to be asked anything about his or her personal life, about the participant as a human being. At the end of this excerpt, the participant just trails off, recalling that no-one showed any interest, which makes for very moving reading. For practitioners in pharmacy, it might also be pertinent to explore the participant’s experience of akathisia and why this was left untreated for 20 years.

One of the questions that arises about qualitative research relates to the reliability of the interpretation and representation of the participants’ narratives. There are no statistical tests that can be used to check reliability and validity as there are in quantitative research. However, work by Lincoln and Guba 11 suggests that there are other ways to “establish confidence in the ‘truth’ of the findings” (p. 218). They call this confidence “trustworthiness” and suggest that there are 4 criteria of trustworthiness: credibility (confidence in the “truth” of the findings), transferability (showing that the findings have applicability in other contexts), dependability (showing that the findings are consistent and could be repeated), and confirmability (the extent to which the findings of a study are shaped by the respondents and not researcher bias, motivation, or interest).

One way of establishing the “credibility” of the coding is to ask another researcher to code the same transcript and then to discuss any similarities and differences in the 2 resulting sets of codes. This simple act can result in revisions to the codes and can help to clarify and confirm the research findings.
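
A simple way to structure that discussion is to line up the two coders' codes segment by segment and list agreements and differences. The sketch below is only an organizational aid with invented segment labels and codes; it does not replace the qualitative judgement described above:

```python
# Codes assigned independently by two coders to the same segments
# (segment labels and codes are invented for illustration).
coder_a = {"lines 8-11": {"diagnosis disclosure"},
           "line 19": {"consultation skills"}}
coder_b = {"lines 8-11": {"diagnosis disclosure", "patient confusion"},
           "line 19": {"not being listened to"}}

for segment in coder_a:
    agreed = coder_a[segment] & coder_b[segment]
    to_discuss = coder_a[segment] ^ coder_b[segment]
    print(f"{segment} | agreed: {agreed or 'none'} | discuss: {to_discuss or 'none'}")
```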

Theming refers to the drawing together of codes from one or more transcripts to present the findings of qualitative research in a coherent and meaningful way. For example, there may be examples across participants’ narratives of the way in which they were treated in hospital, such as “not being listened to” or “lack of interest in personal experiences” (see Appendix 1 ). These may be drawn together as a theme running through the narratives that could be named “the patient’s experience of hospital care”. The importance of going through this process is that at its conclusion, it will be possible to present the data from the interviews using quotations from the individual transcripts to illustrate the source of the researchers’ interpretations. Thus, when the findings are organized for presentation, each theme can become the heading of a section in the report or presentation. Underneath each theme will be the codes, examples from the transcripts, and the researcher’s own interpretation of what the themes mean. Implications for real life (e.g., the treatment of people with chronic mental health problems) should also be given.

DATA SYNTHESIS

In this final section of this paper, we describe some ways of drawing together or “synthesizing” research findings to represent, as faithfully as possible, the meaning that participants ascribe to their life experiences. This synthesis is the aim of the final stage of qualitative research. For most readers, the synthesis of data presented by the researcher is of crucial significance—this is usually where “the story” of the participants can be distilled, summarized, and told in a manner that is both respectful to those participants and meaningful to readers. There are a number of ways in which researchers can synthesize and present their findings, but any conclusions drawn by the researchers must be supported by direct quotations from the participants. In this way, it is made clear to the reader that the themes under discussion have emerged from the participants’ interviews and not the mind of the researcher. The work of Latif and others 12 gives an example of how qualitative research findings might be presented.

Planning and Writing the Report

As has been suggested above, if researchers code and theme their material appropriately, they will naturally find the headings for sections of their report. Qualitative researchers tend to report “findings” rather than “results”, as the latter term typically implies that the data have come from a quantitative source. The final presentation of the research will usually be in the form of a report or a paper and so should follow accepted academic guidelines. In particular, the article should begin with an introduction, including a literature review and rationale for the research. There should be a section on the chosen methodology and a brief discussion about why qualitative methodology was most appropriate for the study question and why one particular methodology (e.g., interpretative phenomenological analysis rather than grounded theory) was selected to guide the research. The method itself should then be described, including ethics approval, choice of participants, mode of recruitment, and method of data collection (e.g., semistructured interviews or focus groups), followed by the research findings, which will be the main body of the report or paper. The findings should be written as if a story is being told; as such, it is not necessary to have a lengthy discussion section at the end. This is because much of the discussion will take place around the participants’ quotes, such that all that is needed to close the report or paper is a summary, limitations of the research, and the implications that the research has for practice. As stated earlier, it is not the intention of qualitative research to allow the findings to be generalized, and therefore this is not, in itself, a limitation.

Planning out the way that findings are to be presented is helpful. It is useful to insert the headings of the sections (the themes) and then make a note of the codes that exemplify the thoughts and feelings of your participants. It is generally advisable to put in the quotations that you want to use for each theme, using each quotation only once. After all this is done, the telling of the story can begin as you give your voice to the experiences of the participants, writing around their quotations. Do not be afraid to draw assumptions from the participants’ narratives, as this is necessary to give an in-depth account of the phenomena in question. Discuss these assumptions, drawing on your participants’ words to support you as you move from one code to another and from one theme to the next. Finally, as appropriate, it is possible to include examples from literature or policy documents that add support for your findings. As an exercise, you may wish to code and theme the sample excerpt in Appendix 1 and tell the participant’s story in your own way. Further reading about “doing” qualitative research can be found at the end of this paper.

CONCLUSIONS

Qualitative research can help researchers to access the thoughts and feelings of research participants, which can enable development of an understanding of the meaning that people ascribe to their experiences. It can be used in pharmacy practice research to explore how patients feel about their health and their treatment. Qualitative research has been used by pharmacists to explore a variety of questions and problems (see the “Further Reading” section for examples). An understanding of these issues can help pharmacists and other health care professionals to tailor health care to match the individual needs of patients and to develop a concordant relationship. Doing qualitative research is not easy and may require a complete rethink of how research is conducted, particularly for researchers who are more familiar with quantitative approaches. There are many ways of conducting qualitative research, and this paper has covered some of the practical issues regarding data collection, analysis, and management. Further reading around the subject will be essential to truly understand this method of accessing peoples’ thoughts and feelings to enable researchers to tell participants’ stories.

Appendix 1. Excerpt from a sample transcript

The participant (age late 50s) had suffered from a chronic mental health illness for 30 years. The participant had become a “revolving door patient,” someone who is frequently in and out of hospital. As the participant talked about past experiences, the researcher asked:

  • What was treatment like 30 years ago?
  • Umm—well it was pretty much they could do what they wanted with you because I was put into the er, the er kind of system er, I was just on
  • endless section threes.
  • Really…
  • But what I didn’t realize until later was that if you haven’t actually posed a threat to someone or yourself they can’t really do that but I didn’t know
  • that. So wh-when I first went into hospital they put me on the forensic ward ’cause they said, “We don’t think you’ll stay here we think you’ll just
  • run-run away.” So they put me then onto the acute admissions ward and – er – I can remember one of the first things I recall when I got onto that
  • ward was sitting down with a er a Dr XXX. He had a book this thick [gestures] and on each page it was like three questions and he went through
  • all these questions and I answered all these questions. So we’re there for I don’t maybe two hours doing all that and he asked me he said “well
  • when did somebody tell you then that you have schizophrenia” I said “well nobody’s told me that” so he seemed very surprised but nobody had
  • actually [pause] whe-when I first went up there under police escort erm the senior kind of consultants people I’d been to where I was staying and
  • ermm so er [pause] I . . . the, I can remember the very first night that I was there and given this injection in this muscle here [gestures] and just
  • having dreadful side effects the next day I woke up [pause]
  • . . . and I suffered that akathesia I swear to you, every minute of every day for about 20 years.
  • Oh how awful.
  • And that side of it just makes life impossible so the care on the wards [pause] umm I don’t know it’s kind of, it’s kind of hard to put into words
  • [pause]. Because I’m not saying they were sort of like not friendly or interested but then nobody ever seemed to want to talk about your life [pause]
  • nobody asked me any questions about my life. The only questions that came into was they asked me if I’d be a volunteer for these student exams
  • and things and I said “yeah” so all the questions were like “oh what jobs have you done,” er about your relationships and things and er but
  • nobody actually sat down and had a talk and showed some interest in you as a person you were just there basically [pause] um labelled and you
  • know there was there was [pause] but umm [pause] yeah . . .

This article is the 10th in the CJHP Research Primer Series, an initiative of the CJHP Editorial Board and the CSHP Research Committee. The planned 2-year series is intended to appeal to relatively inexperienced researchers, with the goal of building research capacity among practising pharmacists. The articles, presenting simple but rigorous guidance to encourage and support novice researchers, are being solicited from authors with appropriate expertise.

Previous articles in this series:

Bond CM. The research jigsaw: how to get started. Can J Hosp Pharm. 2014;67(1):28–30.

Tully MP. Research: articulating questions, generating hypotheses, and choosing study designs. Can J Hosp Pharm. 2014;67(1):31–4.

Loewen P. Ethical issues in pharmacy practice research: an introductory guide. Can J Hosp Pharm. 2014;67(2):133–7.

Tsuyuki RT. Designing pharmacy practice research trials. Can J Hosp Pharm. 2014;67(3):226–9.

Bresee LC. An introduction to developing surveys for pharmacy practice research. Can J Hosp Pharm. 2014;67(4):286–91.

Gamble JM. An introduction to the fundamentals of cohort and case–control studies. Can J Hosp Pharm. 2014;67(5):366–72.

Austin Z, Sutton J. Qualitative research: getting started. Can J Hosp Pharm. 2014;67(6):436–40.

Houle S. An introduction to the fundamentals of randomized controlled trials in pharmacy research. Can J Hosp Pharm. 2015;68(1):28–32.

Charrois TL. Systematic reviews: what do you need to know to get started? Can J Hosp Pharm. 2015;68(2):144–8.

Competing interests: None declared.

Further Reading

Examples of qualitative research in pharmacy practice.

  • Farrell B, Pottie K, Woodend K, Yao V, Dolovich L, Kennie N, et al. Shifts in expectations: evaluating physicians’ perceptions as pharmacists integrated into family practice. J Interprof Care. 2010;24(1):80–9.
  • Gregory P, Austin Z. Postgraduation employment experiences of new pharmacists in Ontario in 2012–2013. Can Pharm J. 2014;147(5):290–9.
  • Marks PZ, Jennings B, Farrell B, Kennie-Kaulbach N, Jorgenson D, Pearson-Sharpe J, et al. “I gained a skill and a change in attitude”: a case study describing how an online continuing professional education course for pharmacists supported achievement of its transfer to practice outcomes. Can J Univ Contin Educ. 2014;40(2):1–18.
  • Nair KM, Dolovich L, Brazil K, Raina P. It’s all about relationships: a qualitative study of health researchers’ perspectives on interdisciplinary research. BMC Health Serv Res. 2008;8:110.
  • Pojskic N, MacKeigan L, Boon H, Austin Z. Initial perceptions of key stakeholders in Ontario regarding independent prescriptive authority for pharmacists. Res Soc Adm Pharm. 2014;10(2):341–54.

Qualitative Research in General

  • Breakwell GM, Hammond S, Fife-Schaw C. Research methods in psychology. Thousand Oaks (CA): Sage Publications; 1995.
  • Given LM. 100 questions (and answers) about qualitative research. Thousand Oaks (CA): Sage Publications; 2015.
  • Miles B, Huberman AM. Qualitative data analysis. Thousand Oaks (CA): Sage Publications; 2009.
  • Patton M. Qualitative research and evaluation methods. Thousand Oaks (CA): Sage Publications; 2002.
  • Willig C. Introducing qualitative research in psychology. Buckingham (UK): Open University Press; 2001.

Group Dynamics in Focus Groups

  • Farnsworth J, Boon B. Analysing group dynamics within the focus group. Qual Res. 2010;10(5):605–24.

Social Constructivism

  • Social constructivism. Berkeley (CA): University of California, Berkeley, Berkeley Graduate Division, Graduate Student Instruction Teaching & Resource Center; [cited 2015 June 4]. Available from: http://gsi.berkeley.edu/gsi-guide-contents/learning-theory-research/social-constructivism/

Mixed Methods

  • Creswell J. Research design: qualitative, quantitative, and mixed methods approaches. Thousand Oaks (CA): Sage Publications; 2009.

Collecting Qualitative Data

  • Arksey H, Knight P. Interviewing for social scientists: an introductory resource with examples. Thousand Oaks (CA): Sage Publications; 1999.
  • Guest G, Namey EE, Mitchel ML. Collecting qualitative data: a field manual for applied research. Thousand Oaks (CA): Sage Publications; 2013.

Constructivist Grounded Theory

  • Charmaz K. Grounded theory: objectivist and constructivist methods. In: Denzin N, Lincoln Y, editors. Handbook of qualitative research. 2nd ed. Thousand Oaks (CA): Sage Publications; 2000. pp. 509–35.

Data collection in qualitative research

David Barrett (Faculty of Health Sciences, University of Hull, Hull, UK) and Alison Twycross (School of Health and Social Care, London South Bank University, London, UK)

https://doi.org/10.1136/eb-2018-102939
Qualitative research methods allow us to better understand the experiences of patients and carers; they allow us to explore how decisions are made and provide us with a detailed insight into how interventions may alter care. To develop such insights, qualitative research requires data which are holistic, rich and nuanced, allowing themes and findings to emerge through careful analysis. This article provides an overview of the core approaches to data collection in qualitative research, exploring their strengths, weaknesses and challenges.

Interviews

Collecting data through interviews with participants is a characteristic of many qualitative studies. Interviews give the most direct and straightforward approach to gathering detailed and rich data regarding a particular phenomenon. The type of interview used to collect data can be tailored to the research question, the characteristics of participants and the preferred approach of the researcher. Interviews are most often carried out face-to-face, though the use of telephone interviews to overcome geographical barriers to participant recruitment is becoming more prevalent. 1

A common approach in qualitative research is the semistructured interview, where core elements of the phenomenon being studied are explicitly asked about by the interviewer. A well-designed semistructured interview should ensure data are captured in key areas while still allowing flexibility for participants to bring their own personality and perspective to the discussion. Finally, interviews can be much more rigidly structured to provide greater control for the researcher, essentially becoming questionnaires where responses are verbal rather than written.

Deciding where to place an interview design on this ‘structural spectrum’ will depend on the question to be answered and the skills of the researcher. A very structured approach is easy to administer and analyse but may not allow the participant to express themselves fully. At the other end of the spectrum, an open approach allows for freedom and flexibility, but requires the researcher to walk an investigative tightrope that maintains the focus of an interview without forcing participants into particular areas of discussion.

Example of an interview schedule 3

  • What do you think is the most effective way of assessing a child’s pain?
  • Have you come across any issues that make it difficult to assess a child’s pain?
  • What pain-relieving interventions do you find most useful and why?
  • When managing pain in children, what is your overall aim?
  • Whose responsibility is pain management?
  • What involvement do you think parents should have in their child’s pain management?
  • What involvement do children have in their pain management?
  • Is there anything that currently stops you managing pain as well as you would like?
  • What would help you manage pain better?

Interviews present several challenges to researchers. Most interviews are recorded and will need transcribing before analysis. This can be extremely time-consuming, with 1 hour of interview requiring 5–6 hours to transcribe. 4 The analysis itself is also time-consuming, requiring transcriptions to be pored over word-for-word and line-by-line. Interviews also present the problem of bias: the researcher needs to take care to avoid leading questions or providing non-verbal signals that might influence the responses of participants.

Focus groups

The focus group is a method of data collection in which a moderator/facilitator (usually a co-researcher) speaks with a group of 6–12 participants about issues related to the research question. As an approach, the focus group offers qualitative researchers an efficient method of gathering the views of many participants at one time. Also, the fact that many people are discussing the same issue together can result in an enhanced level of debate, with the moderator often able to step back and let the focus group enter into a free-flowing discussion. 5 This provides an opportunity to gather rich data from a specific population about a particular area of interest, such as barriers perceived by student nurses when trying to communicate with patients with cancer. 6

From a participant perspective, the focus group may provide a more relaxed environment than a one-to-one interview; they will not need to be involved with every part of the discussion and may feel more comfortable expressing views when they are shared by others in the group. Focus groups also allow participants to ‘bounce’ ideas off each other, which sometimes results in different perspectives emerging from the discussion. However, focus groups are not without their difficulties. As with interviews, focus groups provide a vast amount of data to be transcribed and analysed, with discussions often lasting 1–2 hours. Moderators also need to be highly skilled to ensure that the discussion can flow while remaining focused and that all participants are encouraged to speak, while ensuring that no individuals dominate the discussion. 7

Observation

Participant and non-participant observation are powerful tools for collecting qualitative data, as they give nurse researchers an opportunity to capture a wide array of information—such as verbal and non-verbal communication, actions (eg, techniques of providing care) and environmental factors—within a care setting. Another advantage of observation is that the researcher gains a first-hand picture of what actually happens in clinical practice. 8 If the researcher is adopting a qualitative approach to observation, they will normally record field notes. Field notes can take many forms, such as a chronological log of what is happening in the setting, a description of what has been observed, a record of conversations with participants or an expanded account of impressions from the fieldwork. 9 10

As with other qualitative data collection techniques, observation provides an enormous amount of data to be captured and analysed—one approach to helping with collection and analysis is to digitally record observations to allow for repeated viewing. 11 Observation also provides the researcher with some unique methodological and ethical challenges. Methodologically, the act of being observed may change the behaviour of the participant (often referred to as the ‘Hawthorne effect’), impacting on the value of findings. However, most researchers report a process of habituation taking place where, after a relatively short period of time, those being observed revert to their normal behaviour. Ethically, the researcher will need to consider when and how they should intervene if they view poor practice that could put patients at risk.

The three core approaches to data collection in qualitative research—interviews, focus groups and observation—provide researchers with rich and deep insights. All methods require skill on the part of the researcher, and all produce a large amount of raw data. However, with careful and systematic analysis 12 the data yielded with these methods will allow researchers to develop a detailed understanding of patient experiences and the work of nurses.


Competing interests None declared.

Patient consent Not required.

Provenance and peer review Commissioned; internally peer reviewed.


Case Western Reserve University

  • Research Data Lifecycle Guide

Developing a Data Management Plan

This section breaks down different topics required for the planning and preparation of data used in research at Case Western Reserve University. In this phase you should understand the research being conducted, the types of data and the methods used to collect them, the methods used to prepare and analyze the data, and the budget and resources required, and you should have a sound understanding of how you will manage data activities during your research project.

Many federal sponsors of Case Western Reserve-funded research have required data sharing plans in research proposals since 2003. As of Jan. 25, 2023, the National Institutes of Health has revised its data management and sharing requirements.

This website is designed to provide basic information and best practices to seasoned and new investigators as well as detailed guidance for adhering to the revised NIH policy.  

Basics of Research Data Management

What is research data management?

Research data management (RDM) comprises a set of best practices that include file organization, documentation, storage, backup, security, preservation, and sharing, which affords researchers the ability to more quickly, efficiently, and accurately find, access, and understand their own or others' research data.

Why should you care about research data management?

RDM practices, if applied consistently and as early in a project as possible, can save you considerable time and effort later, when specific data are needed, when others need to make sense of your data, or when you decide to share or otherwise upload your data to a digital repository. Adopting RDM practices will also help you more easily comply with the data management plan (DMP) required for obtaining grants from many funding agencies and institutions.

Does data need to be retained after a project is completed?

Research data must be retained in sufficient detail and for an adequate period of time to enable appropriate responses to questions about accuracy, authenticity, primacy and compliance with laws and regulations governing the conduct of the research. External funding agencies will each have different requirements regarding storage, retention, and availability of research data. Please carefully review your award or agreement for the disposition of data requirements and data retention policies.

A good data management plan begins with understanding the requirements of the sponsor funding your research. As a principal investigator (PI), it is your responsibility to be knowledgeable of sponsor requirements. The Data Management Plan Tool (DMPTool) has been designed to help PIs adhere to sponsor requirements efficiently and effectively. It is strongly recommended that you take advantage of the DMPTool.

CWRU has an institutional account with DMPTool that enables users to access all of its resources via your Single Sign On credentials. CWRU's DMPTool account is supported by members of the Digital Scholarship team with the Freedman Center for Digital Scholarship. Please use the RDM Intake Request form to schedule a consultation if you would like support or guidance regarding developing a Data Management Plan.

Some basic steps to get started:

  • Sign into the  DMPTool site  to start creating a DMP for managing and sharing your data. 
  • On the DMPTool site, you can find the most up-to-date templates for creating a DMP for a long list of funders, including the NIH, NEH, NSF, and more.
  • Explore sample DMPs to see examples of successful plans.

Be sure that your DMP addresses all federal and/or funder requirements and uses any associated DMP templates that may apply to your project. It is strongly recommended that investigators submitting proposals to the NIH utilize this tool.

The NIH is mandating Data Management and Sharing Plans for all proposals submitted after Jan. 25, 2023.  Guidance for completing a NIH Data Management Plan has its own dedicated content to provide investigators detailed guidance on development of these plans for inclusion in proposals. 

A Data Management Plan can help create and maintain reliable data and promote project success. DMPs, when carefully constructed and reliably adhered to, help guide elements of your research and data organization.

A DMP can help you:

Document your process and data.

  • Maintain a file with information on researchers and collaborators and their roles, sponsors/funding sources, methods/techniques/protocols/standards used, instrumentation, software (w/versions), references used, any applicable restrictions on its distribution or use.
  • Establish how you will document file changes, name changes, dates of changes, etc. Where will you record these changes? Try to keep this sort of information in a plain text file located in the same folder as the files to which it pertains (a sample template follows this list).
  • How are derived data products created? A DMP encourages consistent description of data processing performed, software (including version number) used, and analyses applied to data.
  • Establish regular forms or templates for data collection. This helps reduce gaps in your data and promotes consistency throughout the project.
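As a concrete illustration, a documentation file of this kind might look like the following minimal template. This is a hypothetical sketch, not a CWRU-prescribed format; every field name is an assumption:

    PROJECT:        <project name>
    PI / CONTACT:   <name, email>
    COLLABORATORS:  <names and roles>
    FUNDING:        <sponsor, award number>
    PROTOCOLS:      <methods, standards, or links to them>
    SOFTWARE:       <name and version, e.g., R 4.3.1>
    RESTRICTIONS:   <any limits on distribution or use>
    CHANGE LOG:
      2024-01-15  renamed survey_raw.csv to survey_raw_v02.csv; corrected site codes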

Explain your data

  • From the outset, consider why your data were collected, what the known and expected conditions may be for collection, and information such as time and place, resolution, and standards of data collected.
  • What attributes, fields, or parameters will be studied and included in your data files? Identify and describe these in each file that employs them (see the sample data dictionary after this list).
  • For an overview of data dictionaries, see the USGS page here: https://www.usgs.gov/products/data-and-tools/data-management/data-dictionaries
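For instance, a minimal data dictionary for a hypothetical survey file might look like this (the variables shown are illustrative assumptions only):

    variable     type     units       description
    subject_id   string   -           anonymized participant identifier
    visit_date   date     YYYY-MM-DD  date of data collection
    pain_score   integer  0-10        self-reported pain rating
    weight_kg    float    kilograms   participant weight at visit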

DMP Requirements

Why are you being asked to include a data management plan (DMP) in your grant application? For grants awarded by US governmental agencies, two federal memos from the US Office of Science and Technology Policy (OSTP), issued in 2013 and 2015 , respectively, have prompted this requirement. These memos mandate public access to federally- (and, thus, taxpayer-) funded research results, reflecting a commitment by the government to greater accountability and transparency. While "results" generally refers to the publications and reports produced from a research project, it is increasingly used to refer to the resulting data as well.

Federal research-funding agencies  have responded to the OSTP memos by issuing their own guidelines and requirements for grant applicants (see below), specifying whether and how research data in particular are to be managed in order to be publicly and properly accessible.

  • NSF—National Science Foundation "Proposals submitted or due on or after January 18, 2011, must include a supplementary document of no more than two pages labeled 'Data Management Plan'. This supplementary document should describe how the proposal will conform to NSF policy on the dissemination and sharing of research results." Note: Additional requirements may apply per Directorate, Office, Division, Program, or other NSF unit.
  • NIH—National Institutes of Health "To facilitate data sharing, investigators submitting a research application requesting $500,000 or more of direct costs in any single year to NIH on or after October 1, 2003 are expected to include a plan for sharing final research data for research purposes, or state why data sharing is not possible."
  • NASA—National Aeronautics and Space Administration "The purpose of a Data Management Plan (DMP) is to address the management of data from Earth science missions, from the time of their data collection/observation, to their entry into permanent archives."
  • DOD—Department of Defense "A Data Management Plan (DMP) describing the scientific data expected to be created or gathered in the course of a research project must be submitted to DTIC at the start of each research effort. It is important that DoD researchers document plans for preserving data at the outset, keeping in mind the potential utility of the data for future research or to support transition to operational or other environments. Otherwise, the data is lost as researchers move on to other efforts. The essential descriptive elements of the DMP are listed in section 3 of DoDI 3200.12, although the format of the plan may be adjusted to conform to standards established by the relevant scientific discipline or one that meets the requirements of the responsible Component"
  • Department of Education "The purpose of this document is to describe the implementation of this policy on public access to data and to provide guidance to applicants for preparing the Data Management Plan (DMP) that must outline data sharing and be submitted with the grant application. The DMP should describe a plan to provide discoverable and citable dataset(s) with sufficient documentation to support responsible use by other researchers, and should address four interrelated concerns—access, permissions, documentation, and resources—which must be considered in the earliest stages of planning for the grant."
  • " Office of Scientific and Technical Information (OSTI) Provides access to free, publicly-available research sponsored by the Department of Energy (DOE), including technical reports, bibliographic citations, journal articles, conference papers, books, multimedia, software, and data.

Data Management Best Practices

As you plan to collect data for research, keep in mind the following best practices. 

Keep Your Data Accessible to You

  • Store your temporary working files somewhere easily accessible, like on a local hard drive or shared server.
  • While cloud storage is a convenient solution for storage and sharing, there are often concerns about data privacy and preservation. Be sure to only put data in the cloud that you are comfortable with and that your funding and/or departmental requirements allow.
  • For long-term storage, data should be put into preservation systems that are well-managed. [U]Tech provides several long-term data storage options for cloud and campus. 
  • Don't keep your original data on a thumb drive or portable hard drive, as it can be easily lost or stolen.
  • Think about file formats that have a long life and that are readable by many programs. Formats like ASCII .txt, .csv, and .pdf are great for long-term preservation (a small conversion sketch follows this list).
  • A DMP is not a replacement for good data management practices, but it can set you on the right path if it is consistently followed. Consistently revisit your plan to ensure you are following it and adhering to funder requirements.
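As a minimal sketch of moving a working file into such a format, assuming pandas (with an Excel reader such as openpyxl) is installed, and using hypothetical file names:

    # Convert a working Excel file to plain CSV for long-term preservation.
    # Keep the original .xlsx alongside the derived .csv (see Preservation below).
    import pandas as pd

    df = pd.read_excel("survey_responses.xlsx")     # hypothetical working file
    df.to_csv("survey_responses.csv", index=False)  # durable, program-agnostic copy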

Preservation

  • Know the difference between storing and preserving your data. True preservation is the ongoing process of making sure your data are secure and accessible for future generations. Many sponsors have preferred or recommended data repositories. The DMP tool can help you identify these preferred repositories. 
  • Identify data with long-term value. Preserve the raw data and any intermediate/derived products that are expensive to reproduce or can be directly used for analysis. Preserve any scripted code that was used to clean and transform the data.
  • Whenever converting your data from one format to another, keep a copy of the original file and format to avoid loss or corruption of your important files (a checksum sketch follows this list).
  • Online platforms like OSF can help your group organize, version, share, and preserve your data, if the sponsor hasn’t specified a particular platform.
  • Adhere to federal sponsor requirements on utilizing accepted data repositories (NIH dbGaP, NIH SRA, NIH CRDC, etc.) for preservation. 
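One common way to detect silent loss or corruption, sketched here with Python's standard library (the file name is a hypothetical example), is to record a checksum when a file is archived and verify it again before reuse:

    # Record and later verify a SHA-256 checksum to detect file corruption.
    import hashlib

    def sha256_of(path: str) -> str:
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(8192), b""):
                h.update(chunk)
        return h.hexdigest()

    recorded = sha256_of("survey_responses.csv")  # store this value with your records
    # ...later, before reusing the archived copy:
    assert sha256_of("survey_responses.csv") == recorded, "file changed or corrupted"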

Backup, Backup, Backup

  • The general rule is to keep 3 copies of your data: 2 copies onsite, 1 offsite.
  • Back up your data regularly and frequently, and automate the process if possible (see the sketch below). This may mean weekly duplication of your working files to a separate drive, syncing your folders to a cloud service like Box, or dedicating a block of time every week to ensure you've copied everything to another location.
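One possible automation sketch, again using only Python's standard library (the directory paths are hypothetical; in practice you might schedule such a script with cron or Task Scheduler):

    # Copy the working directory to a new, dated backup folder.
    import shutil
    from datetime import date

    src = "C:/research/working_files"                        # hypothetical working directory
    dst = f"D:/backups/working_files_{date.today():%Y-%m-%d}"
    shutil.copytree(src, dst)  # raises an error if today's backup already exists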

Organization

  • Establish a consistent, descriptive filing system that is intelligible to future researchers and does not rely on your own inside knowledge of your research.
  • A descriptive directory and file-naming structure should guide users through the contents to help them find whatever they are looking for (an example layout follows).
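For example, one hypothetical layout that stays intelligible without insider knowledge of the project:

    project_pain_study/
      README.txt        # who, what, funding, change log
      data/
        raw/            # original, untouched files
        processed/      # cleaned and derived data
      scripts/          # analysis code, one step per file
      docs/             # protocols, consent forms, data dictionary
      outputs/          # figures and tables for publication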

Naming Conventions

  • Use consistent, descriptive filenames that reliably indicate the contents of the file.
  • If your discipline requires or recommends particular naming conventions, use them!
  • Do not use spaces between words; use camelCase or underscores to separate them.
  • Include LastnameFirstname descriptors where appropriate.
  • Avoid MM-DD-YYYY date formats; prefer YYYY-MM-DD, which sorts chronologically.
  • Do not append vague descriptors like "latest" or "final" to your file versions. Instead, append the version's date or a consistently iterated version number (see the sketch after this list).
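To make these conventions concrete, here is a small illustrative sketch in Python; the naming scheme itself is an assumption, not a prescribed standard:

    # Build filenames with no spaces, an ISO date, and an explicit version number.
    from datetime import date

    def data_filename(project: str, description: str, version: int) -> str:
        stamp = date.today().strftime("%Y-%m-%d")
        return f"{project}_{description}_{stamp}_v{version:02d}.csv"

    print(data_filename("painStudy", "nurseInterviews", 3))
    # e.g., painStudy_nurseInterviews_2024-05-24_v03.csv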

Clean Your Data

  • Mistakes happen, and often researchers don't notice at first. If you are manually entering data, be sure to double-check the entries for consistency and duplication. Often having a fresh set of eyes will help to catch errors before they become problems.
  • Tabular data can often be error checked by sorting the fields alphanumerically to catch simple typos, extra spaces, or otherwise extreme outliers. Be sure to save your data before sorting it to ensure you do not disrupt the records! A brief sketch of such checks follows this list.
  • Programs like OpenRefine are useful for checking for consistency in coding for records and variables, catching missing values, transforming data, and much more.
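The same kinds of checks can also be scripted. A brief sketch, assuming pandas is installed and using a hypothetical file with hypothetical column names:

    # Quick consistency checks on tabular data before analysis.
    import pandas as pd

    df = pd.read_csv("survey_responses.csv")  # hypothetical data file

    print(df.duplicated().sum())                       # count exact duplicate rows
    print(df["site_code"].str.strip().value_counts())  # spot stray spaces and variant codes
    print(df["pain_score"].describe())                 # min/max reveal out-of-range outliers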

What should you do if you need assistance implementing RDM practices?

Whether it's because you need discipline-specific metadata standards for your data, help with securing sensitive data, or assistance writing a data management plan for a grant, help is available to you at CWRU. In addition to consulting the resources featured in this guide, you are encouraged to contact your department's liaison librarian.

If you are planning to submit a research proposal and need assistance with budgeting for data storage and or applications used to capture, manage, and or process data UTech provides information and assistance including resource boilerplates that list what centralized resources are available. 

More specific guidance for including a budget for Data Management and Sharing is included on this document: Budgeting for Data Management and Sharing . 

Custody of Research Data

The PI is the custodian of research data, unless agreed on in writing otherwise and the agreement is on file with the University, and is responsible for the collection, management, and retention of research data. The PI should adopt an orderly system of data organization and should communicate the chosen system to all members of a research group and to the appropriate administrative personnel, where applicable. Particularly for long-term research projects, the PI should establish and maintain procedures for the protection and management of essential records.

CWRU Custody of Research Data Policy  

Data Sharing

Many funding agencies require data to be shared for the purposes of reproducibility and other important scientific goals. It is important to plan for the timely release and sharing of final research data for use by other researchers.  The final release of data should be included as a key deliverable of the DMP. Knowledge of the discipline-specific database, data repository, data enclave, or archive store used to disseminate the data should also be documented as needed. 

The NIH is mandating Data Management and Sharing Plans for all proposals submitted after Jan. 25, 2023. Guidance for completing a NIH Data Management and Sharing Plan  has its own dedicated content to provide investigators detailed guidance on development of these plans for inclusion in proposals.

Conducting sustainability research in the anthropocene: toward a relational approach

  • Original Article
  • Open access
  • Published: 24 May 2024

  • Jessica Böhme, ORCID: orcid.org/0000-0003-4591-3754
  • Eva-Maria Spreitzer
  • Christine Wamsler, ORCID: orcid.org/0000-0003-4511-1532

Scholars and practitioners are urgently highlighting the need to apply a relational approach to effectively address societal crises. At the same time, little is known about the associated challenges, and there is little advice regarding how to operationalize this approach in sustainability science. Against this background, this article explores how we can break out of our current paradigms and approaches, and instead apply relational thinking, being, and acting in the way we conduct research. To achieve this, we systematically list all major research phases, and assess possible pathways for integrating a relational paradigm for each step. We show that moving toward a relational paradigm requires us to methodically question and redefine existing theories of change, concepts, and approaches, for instance by combining abductive reasoning, first-person inquiries, and decentering the human through critical complexity theory. Challenging mainstream thought, and daring to ask different questions in each step is crucial to ultimately shift scientific norms and systems. Hence, we offer a catalog of questions that may help to systematically integrate relational being, thinking, and acting into the process, as a tool for transforming current paradigms in research, and associated education and practice. Finally, we highlight the importance of further research to develop and refine our outcomes.

Introduction

The anthropocene is characterized by significant human impact on the Earth’s geology and ecosystems; examples include biodiversity loss, climate change, social inequalities, and conflicts (IPCC 2021 ). These challenges are part of an underlying metacrisis of accelerating, causally entangled, complex grand challenges (Jørgsen et al. 2023 ; Rosa 2019 ). In fact, there is mounting evidence that today’s societal crises have one common denominator, or root cause: they are a reflection of an inner, human crisis of disconnection or separation from self, others, and nature, which is grounded in modern societies’ social paradigm (Ives et al. 2023 ; Leichenko and O’Brien 2020 ; Rowson 2021 ; Wamsler et al. 2021 ; Wamsler and Bristow 2022 ). Hence, the current focus on external, technological approaches is insufficient to support transformation toward sustainable and just futures (ibid).

Consequently, there is also a need for sustainability science to re-consider and expand its ontological, epistemological, and ethical foundations, and associated approaches for researching and engaging with complex, wicked sustainability challenges (Alford and Head 2017; Ives et al. 2023; Lang et al. 2012; Lönngren and van Poeck 2020; Mauser et al. 2013; Wiek and Lang 2016; Xiang 2013). Accordingly, an increasing number of scholars and practitioners argue that effectively addressing and researching sustainability challenges requires a shift in paradigms to address societal crises differently (Ives et al. 2023; Walsh et al. 2020; Wamsler et al. 2021).

The dominant social paradigm in the modern, industrialized world is what we refer to in the following as the ‘mechanistic paradigm’. Scholars and practitioners highlight that current mechanistic approaches, and associated reductionist strategies and perspectives, are inadequate for tackling sustainability issues (Leichenko and O’Brien 2020; Porter and Reischer 2018). Furthermore, it can be argued that the paradigm’s underlying values (individualism, materialism, capitalism) and the associated norms, mechanisms, and structures enhance separation from self, others and nature, so that a kind of alienation forms as an integral element of modern life (Wamsler and Bristow 2022; Rosa 2019).

The core pattern that emerges from the mechanistic paradigm, which is especially relevant in the context of today’s sustainability crises and associated research, is that we are increasingly exhausting and exploiting ourselves, others, and nature (Wamsler and Bristow 2022). This is based on the perception that humans are separate from each other, that they are separate and superior to the rest of the natural world, and that nature, like any other system, behaves like a machine, and can be controlled and known by reducing it to its parts (Capra and Luisi 2014; Redclift and Sage 1994; Rees 1999; Walsh et al. 2020). The result is separation between self, others, and the more-than-human world (Ives et al. 2019; Wamsler and Bristow 2022).

The mechanistic paradigm has dominated both policy-making and research. It favors “outer” approaches and solutions (IPCC 2022a ; b ; Wendt 2015 ; Todd 2016 ; Wamsler and Bristow 2022 ), while largely ignoring the inner dimension of sustainability, which includes people’s individual and collective mindsets, beliefs, values, worldviews, and associated inner qualities/capacities (Capra and Luisi 2014 ; Redclift and Sage 1994 ; Rees 1999 ; Wamsler 2020 ; Wamsler et al. 2021 , 2022a ). This has, in turn, narrowed the possibilities for deeper change that can tackle the underlying root causes of today’s crises, while fostering mechanistic and unsustainable interactions with the living world around us (Leal Filho and Consorte McCrea 2019 ; Wamsler et al. 2021 ).

To address this gap, an increasing number of scholars advocate for a shift toward a relational paradigm (e.g., Audouin et al. 2013; Böhme et al. 2022; Hertz et al. 2020; Ives et al. 2023; Mancilla Garcia et al. 2020; Stalhammar and Thorén 2019; Walsh et al. 2020; Wamsler et al. 2021, 2022a; West et al. 2020). A relational paradigm attempts to understand complex phenomena in terms of constitutive processes and relations and recognizes the intricate interconnectedness of humans and the more-than-human world, as well as the associated nonlinear dynamics, uncertainty, and the emergence of change (West et al. 2020; Walsh et al. 2020). It builds on the ontological premise that inner and outer phenomena are entangled and interconnected across individual, collective, and system levels, and recognizes the multiple potential that is latent within each of us to enable transformative change across these scales (Ives et al. 2023). From an epistemological point of view, it requires the inclusion of diverse perspectives, and the expansion of knowledge systems for enhanced “transformation” toward more sustainable futures (Ives et al. 2023; Künkel and Ragnarsdottir 2022).

On these premises, relational research should not be understood as simple introspection, but as a new form of praxis for integrative inner–outer transformation that includes different modes of activating the inner human dimension across individual, collective, and system levels, and the generation of so-called transformative capacities through intentional practices (Ives et al. 2023 ; Spreitzer 2021 ). “Such cognitive, emotional and relational capacities support the cultivation of values, beliefs, and worldviews regarding how people relate (or reconnect) to themselves, others, nature, and future generations in ways that can support transformation” (Wamsler et al. 2022a , b , p. 9).

In contrast, current scientific mainstream approaches and methods risk reproducing and strengthening the dominant social paradigm that underlies today’s sustainability crises, instead of questioning and reframing the underlying assumptions (Fischer et al. 2015 ; Walsh et al. 2020 ). While these challenges are increasingly addressed in emerging frameworks and perspectives that may form the foundations of transformative approaches toward more sustainable and just futures (e.g., Ison 2018 ; Gearty and Marshall 2020 ; Hertz et al. 2020 ; Wamsler et al. 2021 ), there is little knowledge on how to systemically conduct sustainability science from a relational paradigm perspective (Fischer et al. 2015 ; Walsh et al. 2020 ; West et al. 2020 ).

Sustainability science is both an inter- and transdisciplinary field, and it is concerned with addressing complex challenges that threaten humanity and the planet (Wiek and Lang 2016 ). It bridges natural and social sciences and the humanities in the search for creative solutions to these challenges (Jerneck et al. 2010 ; Kajikawa et al. 2014 ; Miller 2012 ; van Kerkhoff 2013 ). Accordingly, sustainability research tends to combine “descriptive-analytical” and “transformational” approaches with different methodologies, based on systems thinking as an epistemological frame (Miller et al. 2013 ; Wiek and Lang 2016 ). While the descriptive-analytical stream draws mostly on systems modeling for describing and analyzing the causes and effects of complex sustainability challenges, the transformational stream often focuses on evidence-based solutions, by accommodating systems thinking for generating actionable insights into how to address sustainability challenges more effectively (Abson et al. 2017 ; Wiek and Lang 2016 ). Hence, both approaches are built on the premise of addressing un/sustainability by identifying and “solving” wicked problems. In general, however, these premises and their corresponding understanding of systems, and systems change, operate within the dominant social paradigm (Latour 2005 ; Poli 2013 ). In other words, they typically do not align with, or support, a relational paradigm, notably its epistemological, ontological, and praxis dimensions (Ives et al. 2023 ). Despite the above-described call for a relational turn in sustainability science, related endeavors are still in their infancy, and there is a need for further efforts to learn how to nurture more relational and, thus, transformative approaches.

Put together, there is an urgent need for a move toward more relational thinking, being and acting, and thus a related shift in: (1) how we see the world; and (2) how we get to know, (3) engage, and (4) ensure quality and equity considerations across these aspects (Ives et al. 2023 ; Walsh et al. 2020 ; Wamsler et al. 2021 , 2024 ). This involves examining how ethical considerations shape our understanding of reality (ontology), influence the ways we acquire, validate, and apply knowledge (epistemology), and translate it into action (praxis).

Against this background, in this article we explore how we can break out of societies’ dominant social paradigm and apply a relational paradigm to the conduct of sustainability research in more transformative ways. More specifically, we identify key implications and possible ways forward for all major steps typically found in any scientific research process.

Methodological considerations

In the next section, we describe the particularities that result from a relational paradigm for each of the following research steps: (1) identifying the research problem and niche; (2) reviewing the literature; (3) creating research hypotheses; (4) designing the overall approach; (5) data collection and analyses; (6) writing up the results; and (7) disseminating them (Booth et al. 2016; Cohen et al. 2018; Creswell 2018). For each of these steps, we compare: (1) how sustainability research is generally conducted based on the mechanistic paradigm; and (2) how the approach might change if a relational paradigm is applied. Related analyses are based on an exploratory analysis of the literature that calls for a relational shift in sustainability and social sciences. While our comparison relates to mainstream sustainability approaches that are built on a mechanistic paradigm, we recognize the existence of alternatives (cf. Bradbury 2015, 2022; Drawson et al. 2017; Goodchild 2021; Mbah et al. 2022; Romm 2015; Rowell et al. 2017).

We do not attempt to present a comprehensive overview of research methodologies based on a relational paradigm. Instead, we critically reflect on existing approaches and review how a relational paradigm could be operationalized in sustainability research, particularly as there is no single, coherent relational paradigm to build upon (Alvesson and Sandberg 2020 ; Böhme et al. 2022 ).

To do so, we do not present tools, methods, or steps with specific prescriptions and instructions for how to move toward a more relational paradigm and overcome related challenges—instead, we offer a proposition that could trigger conditions of emergence (Springgay 2015 ). This is important, because the idea that specific actions lead to defined outcomes is not aligned with a relational perspective and thus on how transformation can be supported in complex systems (Smartt Gullion 2018 ). Moreover, relational epistemologies question the idea that tools can be used to represent reality, without acknowledging the entanglement of the researcher who is co-creating the knowledge (Latour 2005 ). Ultimately, “tools are never ‘mere’ tools ready to be applied: they always modify the goals you had in mind” (Latour 2005 , p. 143). Offering a practical tool runs the risk of offering a simplistic conceptualization that narrows understanding and changes our object of study (Mancilla Garcia et al. 2020 ).

Instead, in each section, we conclude with some questions that can be used to make the implicit explicit when conducting research within a relational paradigm. Making the implicit explicit is an important strategy for dealing with complexity (Audouin et al. 2013 ; Cilliers 2005 ). We thus follow Puig de la Bellacasa ( 2017 ), who suggests that the aim should be a commitment to asking how things could be different, as developing processes and practices of asking can challenge the status quo and, thus, help to increasingly integrate the relational paradigm into current approaches.

Pathways toward a relational paradigm in research

Step 1: identifying the research problem and niche

The first step in the process is the identification of the research problem and niche.

From a mechanistic paradigm, the problem and niche can be found by identifying and isolating certain parts of a system that relate to a particular sustainability challenge: for example, a focus on carbon emissions in a particular sector (e.g., transportation).

A relational paradigm would require adding a perspective that is based on an understanding of sustainability challenges as evolving, complex adaptive systems marked by interdependencies, connectedness, nonlinearity, uncertainty, and emergence (Ives et al. 2023; Turner and Baker 2019). Instead of focusing on individual parts of systems—such as carbon emissions in transportation—a relational approach thus also requires looking into relationships, and the quality of these relationships, within and between systems, and how this influences or prevents integrative inner–outer transformation processes across individual, collective, and system levels (Wamsler et al. 2021, 2022a, b). In this context, “boundaries” do not define a research problem or theoretical puzzle, but “interfaces” do, which are understood as dynamic interchanges that form the edges of systems and are, at the same time, the focus; that is, “the appropriate center of interest in a particular system, process, or mind” (Bateson 1979; Charlton 2008, p. 41).

An important aspect to consider during the first research step is the fact that paradigms form frames and language, and vice versa (Lakoff 2014 ; Ives et al. 2019 ). Reframing sustainability challenges is thus crucial for supporting transformation (Lakoff 2014 ) and must be accounted for when conducting research. While formulating the problem, it is for instance essential to consider which pre-defined concepts the problem is based upon, as moving toward a relational paradigm asks us to question established norms and understandings.

A relational paradigm also requires special attention to the wording of the research gap and associated niche, including the use of expressions that can foster or challenge dominant beliefs, values, and worldviews. Examples of wording that aims to support more relational understandings are natureculture and intra-action (Barad 2007; Hertz and Mancilla Garcia 2021), socialecological (Böhme 2023), thinking-with (Vu 2018), or the more-than-human world (Haraway 2016). In contrast, Hertz et al. (2020) point out that current sustainability research often employs “the environment” or “nature” and “the social” or “culture” as separate entities or phenomena, which can reinforce a reductionist paradigm. The separation between the social and the ecological also manifests in research on so-called socio-ecological systems, a conceptualization that has strongly influenced related research, frameworks, theories, methods, and policy insights.

In summary, identifying the problem through the lens of a relational paradigm involves a shift from focusing only on analyzing certain parts of a system, to the quality of relationships, associated meaning-making, and integrative inner–outer transformation processes. It also involves identifying and developing appropriate frames, language, and concepts that align with these characteristics.

A study on reducing carbon emissions from transportation might, for instance, be framed within a continuum and integrative understanding that links analyses at the level of behavior, at the level of systems and structures, and at the level of individual and collective mindsets. Moreover, employing a relational paradigm might involve framing emission and transportation-related challenges also around concepts of community well-being, social connectivity, and environmental justice.

In conclusion, the following questions can help in moving toward a relational paradigm:

I. How do my research problem and associated niche consider interdependencies, connectedness, nonlinearity, uncertainty, and emergence? How do they account for (the quality of) relationships and related inner–outer transformation processes across individual, collective, and system levels? For example, if my research focus and associated aims reinforce (the perception of) a separation between humans and non-humans, I might want to reframe the research.

II. Is the wording of the problem, niche and associated aims aligned with relational perspectives, or does it strengthen current mechanistic paradigms? For example, “if the words in a given language focus on shapes over function, then no wonder the speakers of that language prefer to group things according to their shape rather than their function” (Bollier and Helfrich 2019 , p. 708).

III. How can I explain relational, unfamiliar, or new concepts so that others (co-researchers, readers), who are new to this way of thinking, can understand? How can I create a bridge between the current and a potential new, more sustainable paradigm? For example, I could consider adding a glossary of newly-formed or uncommon terms.

Step 2: reviewing the literature

In general, the literature review entails identifying relevant sources and databases, and screening and selecting articles based on predetermined criteria. After extracting relevant information and data from the selected articles, researchers systematize and synthesize the findings to identify gaps, themes, and patterns.

From the perspective of the dominant mechanistic paradigm, scientific, peer-reviewed information is generally considered the key source for ensuring credibility and reliability. Adopting a relational paradigm challenges this notion. It requires questioning the dominance of the existing sensemaking frames and discussing their possible limitations, biases, and blind spots, including regarding the ontological premises underlying other epistemological and ethical considerations and emergent phenomena (Storm et al. 2019; Ives et al. 2023; Alvesson and Sandberg 2020).

Epistemologically, the focus shifts from privileging empiricism and positivism to embracing multiple ways of knowing. It acknowledges that different knowledge systems offer unique perspectives and understandings of the world. This may include lived experience, traditional and Indigenous wisdom, artistic expression, and other non-conventional sources that can offer valuable insights into the complexities of environmental issues, associated human–environment relationships, and esthetics (Osgood et al. 2020 ). It challenges the idea that only ‘objectifiable’ data is valid and recognizes that experiential, subjective, and transpersonal insights are equally essential in comprehending sustainability and the associated literature (Storm et al. 2019 ).

Ethically, the relational paradigm prompts critical reflection on whose knowledge is recognized and legitimized. It questions power dynamics within knowledge production, highlighting the need to amplify marginalized voices and perspectives that may have been historically excluded or undervalued within academia or the scientific discourse. It thus requires decolonizing strategies for identifying and reviewing the literature (Vu 2018 ).

Continuing with the example of carbon emissions from transport that was given in step 1, a relational paradigm would also require reviewing related, non-scientific literature and other sources and perspectives that shed light on aspects that have so far not been explored by mainstream science. This might involve considering the (limited) methodological bases and foci of the examined literature, and including additional data and voices for a more comprehensive review (e.g., examining all levels of transformation, related views, structures, and practices that might add additional context and perspectives).

Other common assumptions during step 2 are that the literature presents external, fixed knowledge, which the author has developed, and that the reader interprets the literature through a reflective process that is independent of dominant social paradigms. Accordingly, a systematic literature review should always lead to the same results and interpretations when repeated, regardless of the author(s), researcher(s), and reader(s). In contrast, a relational paradigm acknowledges the relational nature of knowledge creation, distribution, and interpretation, which arises from a process of entangled relations and associated paradigms (Barad 2007 ). The literature review is thus as much influenced by the researcher(s) themselves, as it is influenced by the perspective(s) of the respective author(s).

In the light of these observations, reviewing the literature is as much about understanding current knowledge as it is about understanding and considering how knowledge came to be. A relational paradigm thus posits that knowledge arises because it is co-produced by sociomaterial configurations and associated inner–outer change processes; it is neither fixed and permanent, nor individualized. Knowledge is a product of intra-action, “not something that someone or something has” (Barad 2007 , p. 178). As Cilliers ( 2005 , p. 609) argues, “There are facts that exist independently of the observer of those facts, but the facts do not have their meaning written on their faces. Meaning only comes to be in the process of interaction. Knowledge is interpreted data.”

Put simply, any literature review needs to recognize that (the analyzed and produced) knowledge is co-created and influenced by dominant social paradigms and associated inner–outer change processes. By conducting a literature review, we participate in a relational configuration through the entanglements of the involved agents.

The following guiding questions might thus help in moving toward a relational paradigm:

How can I integrate sources beyond scholarly articles to better understand current knowledge? Are there ways to systematically include non-human perspectives? For example, if the identified literature only represents knowledge from certain elements, communities or groups, other sources need to be considered (e.g., illustrated by Vu’s ( 2018 ) ethico-auto-ethnographies or Kuntz and Presnall’s ( 2012 ) intra-views).

What underlying or tacit ontological, epistemological, and ethical assumptions might be present within the reviewed literature? For example, how might dominant social paradigms and perspectives have influenced the presented theories of change, the exclusion of inner dimensions, or an overlooking of marginalized agents and non-human actors?

How does my perspective, subjectivity, and social–ecological position influence the interpretation and analysis of the literature, and how can I take account of this? For example, I could consider adding related considerations when discussing the limitations of the review.

Step 3: creating research hypotheses

From a mechanistic, positivist stance, a literature review is generally used to formulate hypotheses about the relationship between the independent and dependent variables. Within our dominant paradigm, these are generally expressed as testable hypotheses, and each hypothesis should be specific, concise, and presented as a statement that establishes a clear cause-and-effect relationship between the variables. They should also be falsifiable, which means that they can be supported or refuted through empirical qualitative or quantitative testing. Commonly, such hypotheses are formulated using either inductive or deductive reasoning (Smartt Gullion 2018).

An alternative approach, which is aligned with a relational paradigm, is abductive reasoning (Tullio 2016), sometimes also referred to as adductive reasoning. Abduction differs from both deduction and induction. It begins with an observed phenomenon that requires an explanation, then speculates on potential answers. Such reasoning involves a leap of the imagination, proposing hypotheses or interpretations that go beyond current evidence or knowledge. It is essentially a creative process of suggesting answers based on relational patterns, analogies, and insights from diverse sources. The researcher synthesizes information and uses speculative reasoning to suggest potential explanations, in addition to ‘obvious’ hypotheses (Nersessian 2010; Selg and Ventsel 2020; Van der Hoorn 1995). As Hertz et al. (2020, p. 9) point out:

“Abduction reverses the order of reasoning. It focuses on a phenomenon that needs explaining and then ponders potential causes. During this speculative activity, novel conceptualizations and dynamics can be introduced to an explanatory scheme. Methods and approaches in social-ecological systems research with this potential include place-based and context-rich qualitative research methods (like narratives and participatory scenario development) and computational methods.”

Bateson ( 1982 ) argues that abductive reasoning is particularly pertinent for studying complex systems, such as ecosystems, social systems, and associated mental processes. Engaging in abductive reasoning allows researchers to extend their understanding beyond existing knowledge, potentially revealing deeper insights (ibid).

This can be illustrated by studying community resilience in the face of natural disasters. From a positivist perspective, the focus might be on testing specific hypotheses that predict the relationship between factors like socioeconomic status and disaster preparedness. Each hypothesis would be clearly defined and testable, aiming to establish a cause-and-effect relationship between independent and dependent variables. For instance, a hypothesis could propose that higher socioeconomic status correlates with better disaster preparedness measures. In contrast, adopting a relational paradigm would also involve abductive reasoning, which allows for additional exploration of the phenomenon and associated inner–outer transformation processes, enabling the researchers to identify and explore further hypotheses.

In summary, formulating hypotheses from a relational perspective requires their anchoring in the above-described steps 1–2. In addition, it should not only involve inductive and deductive, but also abductive reasoning. Deductive reasoning starts with a general rule, and inductive reasoning begins with a specific observation. In contrast, abductive reasoning assumes that observations are incomplete. Abductive reasoning embraces the idea that phenomena are unpredictable, contingent, dynamic, and emerge through open-ended intra-actions and relationships.

To explore this alternative path and move toward a relational paradigm, the following guiding questions might assist:

I. Do my hypotheses reflect the dominant social paradigm and related ontological assumptions? For example, are they based on a ‘fix-it’ and ‘fix-others’ mindset that reinforces current, unsustainable paradigms? Do they only focus on apparent external problems and solutions without due consideration of related inner dimensions of transformation? Or do they presuppose a division between nature and culture? If yes, I might need to reconsider or make explicit related biases and effects.

II. How do my hypotheses adequately consider the role of relationships (to self, others, nature, and the world at large)? For example, if they examine values without considering the relationships from which these values are co-created and emerge across individual, collective, and system levels, I might consider redirecting their focus.

III. How might abductive reasoning enhance my hypotheses? For example, I might speculate on potential explanations through the lens of different disciplines and sources, including Indigenous and local knowledge systems.

Step 4: defining the overall research design

During the design process, the research object is further defined, and an overall methodology is chosen to investigate it. Within the mechanistic paradigm, the boundaries of the object are clearly drawn. Complex phenomena are broken down into simpler components. The prevailing thought is that all complexity can be reduced to manageable parts and then understood through discrete analyses, measurements, or computational simulation (Smartt Gullion 2018 ).

It is clear that reductionist approaches are necessary in all scientific approaches to study some ‘thing’ or some ‘one’. At the same time, reduction has to be handled with particular care to include relational, ever-moving, and changing processes and aspects of systems that are key for understanding and transformation. For example, Bateson (2021) argues that common research approaches alone cannot answer questions regarding what and how autopoietic cycles of adaptation within complexity are learning. In other words, overly mechanistic reduction might result in overlooking, or not engaging enough with, so-called ‘warm data’, which is information about the interrelationships that form complexity, and thus the foundation of living systems and life itself. Warm data capture qualitative dynamics and offer another dimension of understanding to what is learned through “living data” (Bateson 2021, 2022).

The overall research design has to take account of this relational living systems information and associated knowledge creation processes. It requires consideration of constantly emerging inner–outer learning processes of experiences, cultural beliefs, and perspectives. Unreflected simplification might lead to unintended or even harmful outcomes and consequences that support unsustainable paradigms.

At the same time, as the relational paradigm builds on the ontological premise that everything is related to everything else, the challenge is to design research in a way that stays true to its ideas, while not becoming too diffuse or abstract. A view that attempts to encompass all relations risks losing the distinction between the system and its environment. Researchers can then fall into two traps—either a radical openness systems view that leads to relativism, or an approach that relies on measurement and computational simulation (Morin 2008 ). The former is criticized for being a reaction to reductionism and promoting a kind of holism that negates the need for ontology. The latter fails to recognize the intangible nature of emergent properties (Preiser 2012 ). Therefore, both views have limitations: they either neglect the need for a reliable ontology, or oversimplify the intangible nature of ever-moving and emergent properties. A rigorous understanding of complexity denies total holism and total reductionism simultaneously, resulting in what Cilliers ( 2005 , p. 261) describes as “performative tension”.

In practice, this performative tension can be addressed by drawing boundaries, while simultaneously redirecting attention to related interfaces and being aware of, and making explicit, the fact that these boundaries are artificial. This is also referred to as “critical complexity” (Audouin et al. 2013 ), which transcends and incorporates mechanistic strategies while recognizing the need for reduction and transparency. Critical complexity can bring value-based choices to the forefront, if the reduction itself is a conscious value-based choice, where the researcher chooses which aspects to focus on, while staying aware that the research and the researcher(s) themselves are part of the living system of engaging with knowledge creation (and thus are constantly changing and are changed through responsively relating with the emergent character of this process). It is not either the researcher(s) or the research outcome that independently creates knowledge; instead, the overall design process can be regarded as learning and potentially transformation on all levels (Bateson 2021 ; Preiser 2012 ; Wamsler et al. 2022a , b ). This differs from the mechanistic approach, which often overlooks the consequences of reductionist practices, especially when defining the overall research design.

The critical complexity rationale recognizes that reductionism, under specific conditions, can by itself effectively enhance understanding. For instance, Cilliers ( 2005 ) argues that although reduction is unavoidable in our efforts to comprehend socialecological systems, we can shift our focus toward framing the strategies that are employed during the process of reduction. This change promotes a more relational standpoint, fostered through self-reflection.

Overall, finding an appropriate methodology can be a challenge and requires the careful consideration of relationships and engagements regarding both external and internal research stakeholders. Although several relational methodologies exist, such as intra-views (Kuntz and Presnall 2012 ), diffractive ethnography (Smartt Gullion 2018 ), ethico-auto-ethnography (Vu 2018 ), phenomenology, integral and narrative-based methodologies (Snowden and Greenberg 2021 ; Van der Merwe et al. 2019 ; Wilber 2021 ), the relational paradigm does not advocate prescriptive methodologies.

In summary, the challenge is to maintain a relational perspective without becoming overly abstract and risking relativist holism. This requires explicitly: (1) acknowledging the limitations of reductionist strategies; (2) accounting for relationships and associated inner–outer change processes (individual, collective, system levels) that are relevant for understanding the research object; and (3) considering how the overall design can itself support transformation, both regarding its object and stakeholders.

For example, when investigating the impact of a city’s electric vehicle adoption program on reducing carbon emissions, the researcher might consciously adopt a design that avoids falling into the trap of exhausting and exploiting oneself, others, and the planet (e.g., through explicit consideration of wellbeing, equity issues, the research’s inherent CO2 emissions, time management, and meeting formats). At the same time, methodologies can be applied in ways that can themselves support individual, cultural, and system transformation toward post-carbon behaviors (e.g., Osberg et al. 2024; Wamsler et al. 2022b).
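
To make such an accounting tangible, here is a minimal sketch in Python (all emission factors and activity counts are invented placeholders, not authoritative values) that compares the CO2 implied by two hypothetical designs for the same study:

```python
# Hypothetical emission factors (kg CO2); real values would need a vetted source.
EMISSION_FACTORS_KG_CO2 = {
    "short_haul_flight": 250.0,   # per person, per round trip (assumed)
    "train_trip": 15.0,           # per person, per round trip (assumed)
    "online_meeting_hour": 0.05,  # per participant-hour (assumed)
}

def project_footprint_kg(activities: dict[str, float]) -> float:
    """Sum the kg CO2 implied by the planned research activities."""
    return sum(EMISSION_FACTORS_KG_CO2[name] * count
               for name, count in activities.items())

# Two hypothetical designs for the same study: flying to project meetings
# versus traveling by train and meeting online more often.
plan_fly = {"short_haul_flight": 6, "online_meeting_hour": 40}
plan_train = {"train_trip": 6, "online_meeting_hour": 120}
print(project_footprint_kg(plan_fly), project_footprint_kg(plan_train))  # 1502.0 96.0
```

Making the comparison explicit in this way turns the research’s own footprint into one of the conscious, value-based design choices discussed above.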

To navigate alternative pathways for designing an overall research methodology, the following guiding questions might thus be helpful:

I. How can I explicitly integrate a relational perspective when using reductionist methodologies? For instance, would it be beneficial to develop a research process that pursues a reductionist approach, while critically highlighting its limitations?

II. How can I design the overall research approach in a way that accounts for relationships and associated inner–outer change processes (individual, collective, and system levels) that are relevant for understanding the selected object? For instance, how might I employ a hybrid methodology that integrates qualitative, quantitative, and related innovative approaches to ensure a comprehensive understanding (e.g., contemplative and creative approaches)?

III. How can the overall design support transformation, for example, a change toward a more relational paradigm (both regarding the research object and stakeholders)? For instance, what relational approaches exist, and how might I combine them in my overall research design?

Step 5: data collection and analysis

Data collection aims to gather relevant information and answer the research questions and/or hypotheses. Diverse methods and techniques are used to systematically collect, record, organize, examine, and interpret related data and draw meaningful conclusions.

Within the mechanistic approach, new scientific knowledge and theory are usually built on the collection and analysis of credible sources of data. In this context, the focus tends to be on certain (but not all) dimensions of reality and associated methods for data collection and, consequently, on certain (but not all) ways of generating knowledge about the world (Ives et al. 2019; see Footnote 6).

The relational paradigm questions this fragmented approach (cf. Steps 1–4). In a context where all parts (e.g., culture, institutions, individual and collective behavior and views) are colored by the dominant social paradigm, combining scientific, philosophical (see Footnote 7), and other methods of enquiry is particularly important to support both an integrated understanding of existing ways of knowing and innovative pathways for new knowledge generation. This requires introspective, contemplative, esthetic, visual, sensory, and embodied forms of sensemaking, and it also demands that we decolonize current methods, for instance, to avoid undermining local knowledge and the experiences of marginalized populations.

From a relational perspective, data that can be used to construct and test ideas can be empirical, but can also take theoretical, conceptual, or other forms (Bhaskar et al. 2016). For instance, viewing first-person enquiry or embodiment as a way of perceiving and understanding the world distinguishes it from the dominant mode of knowledge (Frank et al. 2024), known as propositional knowing. Propositional knowing primarily relies on creating conceptual maps, which, although helpful, can sometimes be deceptive as they oversimplify reality (the map is not the territory). According to systems theorist Nassim Nicholas Taleb, phenomenological knowledge is often more resilient and adaptable than propositional knowledge (Taleb 2013). This does not mean that propositional knowledge should be disregarded entirely; rather, when enriched by phenomenological knowledge, it creates space for the emergence of more imaginative and practical ideas (Pöllänen et al. 2023).

Purely objective data does not exist, as pointed out by post-structuralists (Kirby 2011). Accordingly, St. Pierre (2013, p. 226) states that “if being is always already entangled, then something called data cannot be separate from me, out there for me to collect.” Denzin (2013, p. 35) therefore suggests thinking about data in terms of “empirical materials”. Data selection and interpretation thus always have material consequences (Barad 2007; Smartt Gullion 2018). Based on this understanding, data are phenomena that “cannot be engineered by human subjects but are differential patterns of mattering produced by neither the material nor the cultural but the material–cultural” (Vu 2018, p. 85) or naturecultural (Haraway 2016). Phenomenological and narrative-based methods explicitly account for this perspective (see related studies by Pöllänen et al. 2023; Wamsler et al. 2022b).

Furthermore, a relational paradigm involves acknowledging the potential relevance of data that are generally dismissed (Smartt Gullion 2018). For example, in statistical modeling, deviation from the mean is often dismissed as noise. To streamline the analysis, ‘noisy’ data undergo various manipulations, including outlier removal, logarithmic transformation, and smoothing, ultimately resulting in a linear form (ibid.). While reductionist approaches are necessary (cf. Step 4), noise might conceal significant insights, for instance from non-human or marginalized groups (West 2006).

West (2006, p. 72) asserts that “smoothing or filtering the time series might eliminate the very thing we want to know.” Such processing tends to neglect the unique variability that characterizes individuals and to emphasize commonalities. Additionally, the assumption that larger sample sizes are always better undermines attention to individual variability. As sample sizes grow, models tend to produce statistically significant results. However, this significance is purely a statistical concept and does not always reflect substantial relationships between variables: even random correlations can appear statistically significant with large sample sizes (Smartt Gullion 2019). For certain studies, it might thus be beneficial to scrutinize the noise.
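
Both points can be made concrete with a short simulation (a toy sketch with invented numbers, not data from any cited study): with a very large sample, a practically negligible association becomes ‘statistically significant’, and smoothing a noisy series removes exactly the variability that may be of interest.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# (1) With a huge sample, a negligible association is 'significant'.
n = 1_000_000
x = rng.normal(size=n)
y = 0.005 * x + rng.normal(size=n)   # true effect is tiny
r, p = stats.pearsonr(x, y)
print(f"r = {r:.4f}, p = {p:.1e}")   # r near 0.005, yet p is far below 0.05

# (2) Smoothing strips out the individual variability West (2006) warns about.
t = np.arange(500)
series = np.sin(t / 25) + rng.normal(scale=0.8, size=t.size)  # 'noisy' series
window = np.ones(51) / 51                                     # moving average
smoothed = np.convolve(series, window, mode="same")
print("variance before:", round(float(series.var()), 2),
      "after:", round(float(smoothed.var()), 2))
```

The second print shows the variance collapsing after smoothing: the ‘noise’ is gone, and with it any information it carried.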

Building on the previous arguments, it is crucial to employ methods that can investigate all dimensions of reality, including those that are ‘hidden’ today, and their inherent relationships. This requires combining traditional methods with other techniques and data sources, such as introspective, contemplative, esthetic, visual, sensory, and embodied forms of sensemaking.

For example, researchers studying the impact of a city’s new bike-sharing program on reducing carbon emissions might not only statistically analyze the number of bikes rented daily and the corresponding decrease in individual car usage and emissions, but also consider data from users about underlying (shifts in) values, beliefs, emotions, and paradigms; inter-group variations; and obstacles and enablers for inner–outer change, which can take different forms (e.g., collected stories, constellations, or drawings).
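
A minimal sketch of what pairing these data types can look like (the counts and coded themes below are invented for illustration; in practice the themes would come from qualitative coding of the collected material):

```python
from collections import Counter
from statistics import mean

daily_rentals = [312, 298, 355, 340, 301]   # hypothetical usage statistic
story_codes = [                             # (user group, coded theme)
    ("commuter", "shifted value: health"),
    ("commuter", "barrier: perceived safety"),
    ("occasional user", "shifted belief: car not needed"),
    ("occasional user", "barrier: perceived safety"),
]

print("mean rentals/day:", mean(daily_rentals))
for (group, theme), count in Counter(story_codes).items():
    print(f"{group} | {theme} | n={count}")
```

The point is not the code itself, but that the quantitative indicator and the coded inner-dimension material sit side by side in the same analysis, rather than the latter being discarded as anecdote.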

When collecting and analyzing data from a relational perspective, the following questions should be considered to move toward a relational paradigm:

I. How can I critically examine my role as a researcher during the data collection and analysis process? For example, how might my perspectives, assumptions, and values shape my data selection and interpretation?

II. How can I embrace a broad range of methods, data types, and formats beyond traditional textual or numerical approaches? For example, maybe I can incorporate experiential, visual, or sensory forms of data to capture relevant human and non-human interactions.

III. What is the noise that I might be overlooking? For example, if I have smoothed or filtered data, it might be relevant to revisit those data points (if possible) for a closer examination.

Step 6: the writing process

The end product of research is some form of representation of the findings. Commonly, findings are reported in written form in an international journal, a poster, a book, or a monograph. The underlying assumption is that the results—through the use of language—can reflect and influence reality.

This is based on a certain understanding of objectivity and the role of information. From a mechanistic paradigm, research results represent an objective truth that was discovered. Epistemologically, the common understanding is that a knowing subject (the researcher) can objectively study objects (things in the world) to understand them.

As described above, relational epistemology questions the idea of an objective observer (Ngunjiri et al. 2010). This understanding is by no means original in its attempt to expose the limitations of reductionist practices. “Philosophers of science, such as Popper (1963), Feyerabend (1975), and Kuhn (1996), are well known for their arguments against false claims of objectivity and scientific autonomy” (Audouin et al. 2013, p. 17).

The challenge that arises from this understanding is how to represent this subjectivity when reporting results, sometimes referred to as a crisis of representation (Smartt Gullion 2018). The crisis of representation comes from asking whether the final product represents reality: Is it accurate? Trustworthy? Ethical? It results from speaking for others (in sustainability science, often marginalized humans or non-humans) and from questions about the adequacy of their representation (ibid.).

Within the relational paradigm, the crisis of representation could be addressed by explicitly acknowledging related challenges, choosing alternative or additional forms of representation (art, stories, music), and portraying the self as performative (Verlie 2018). The latter can, for instance, involve moving away from a first-person scholarly narrator who is self-referential and unavailable to criticism or revision (Pollock 2007).

In contrast to representationalism, performativism focuses on “understanding thinking, observing, and theorizing as practices of engagement with, and as part of, the world in which we have our being” (Barad 2007, p. 133). This understanding of self represents identity and experience as uncertain, fluid, and open to interpretation and revision (Jones and Adams 2010).

Although this last research step makes the performative ‘I’ visible, related considerations are relevant for all steps. In practice, this relates to: inquiries about one’s role and entanglements; actively engaging with the subjects of research, for example, through dialog; making deliberate methodological choices; considering potential power dynamics, informed consent, confidentiality, and the well-being of participants and oneself; and being transparent about the role of the performative ‘I’ in shaping outcomes.

Another important aspect to consider during the writing process is the fact that writing itself can (and should) be understood as a relational process that, in turn, can foster or hamper relationality in real life (Barad 2007; Puig de la Bellacasa 2017). For example, the process can be constrained by project schedules, power structures, or other external pressures, which tends to result in a more mechanical, instrumental, and task-oriented approach to crafting, or ‘fitting’, content, scope, and form. Conversely, when writing emanates from an integrated self and an embodied, deeper connection to one’s thoughts, emotions, body, and creativity, words can flow more organically. In these instances, the writing process becomes an expressive act that allows the person to tap into their full potential, rather than merely fulfilling external demands. Hawkins (2015) points out that writing is not merely a cognitive or linguistic activity, but is deeply entwined with social, emotional, and spatial contexts and relationships. Thus, writing itself is affected by relational influences, and the way of writing can support or hamper engagement in transformational change (ibid.).

In summary, the writing process requires addressing relational aspects of representation. It involves explicitly addressing related limitations (such as power dynamics and ethical considerations), portraying the self as performative, and using alternative or additional forms of representation where relevant (art, stories, music).

To integrate relational perspectives into the writing process, the following questions can thus be helpful:

I. How might my perspectives and assumptions shape the interpretation and representation of my results, and how can I make them explicit in my writing? For instance, do I acknowledge related limitations in the description of research outcomes?

II. Who do I speak for? Am I contributing to empowerment and justice, or am I disempowering certain individuals, groups, or other agents? For example, how can I give voice to non-human actors and consider their perspectives and interactions? How can I make my writings widely accessible for diverse audiences?

III. What kinds of words or other forms of representation can I use to support integrative understanding and transformation? For example, do my research results contribute to, or challenge, existing paradigms and practices? How can I reach people’s minds and hearts, and foster individual and collective agency, hope, and courage to act?

Step 7: dissemination of the results

Lastly, the research process involves the dissemination of its results. Especially in sustainability science, the transfer of knowledge is a crucial step for fostering transformation; commonly, results are disseminated through publication in academic journals (cf. Step 6).

From a relational perspective, relationships also play a key role in dissemination. As relational approaches require the consideration of the perspectives, needs, and relationships of human and non-human stakeholders, it is important to involve stakeholders in different forms in all research steps. In the context of dissemination, this relates to the sharing and application of research findings. Science communication increasingly uses dissemination formats beyond academic papers, such as podcasts, books, or policy briefs that aim to reach different societal groups. However, to support transformation, more relational communication and implementation strategies might also be needed, for instance, the creation of reflection and generative dialog spaces, community workshops, communities of practice, or other interactive formats (Mar et al. 2023). What makes these formats particularly relevant is their co-creative approach, placing researchers within a learning ecosystem, field, or network, as learning subjects themselves. Moreover, dissemination could place greater emphasis on the relationships and contexts in which the results were generated. This could involve storytelling, case study illustrations, or imaginary narratives as part of the dissemination strategy that highlight the interconnectedness of the findings within specific social, cultural, or environmental contexts.

Another key aspect of the relational paradigm that is relevant for dissemination is epistemic justice (Fricker 2007; Puig de la Bellacasa 2017; Whyte 2020). Epistemic justice calls for the recognition and amplification of marginalized or underrepresented voices in knowledge production and learning. In the context of dissemination, this translates into actively seeking out, addressing, and including diverse perspectives and knowledge holders in the communication and sharing of research findings. It involves sharing results beyond the scientific community, both with humans and other agents, where possible. It also includes the use of diverse communication channels and formats that cater to different audiences, languages, and accessibility needs. Such an approach embraces tangible actions and accessibility, to have a more inclusive impact that integrates different ways of learning and understanding.

In summary, dissemination requires actively seeking out and addressing relationships and diverse perspectives, and making research outcomes widely accessible in ways that integrate cognitive, social, emotional, ethical, and embodied learning.

For example, when disseminating outcomes, the researcher might also want to represent and ‘let speak’ other voices—such as birds or trees—through videos, photographs, or exhibitions, as an addition to the dissemination of written material.

To integrate relational perspectives into the dissemination process, the following questions can be helpful:

I. In what forms can I best share these research results to account for, and address, diverse stakeholders’ needs and perspectives? For example, are videos, exhibitions, networks, or communities of practice relevant channels for dissemination and implementation?

II. Am I conveying information accurately, respectfully, and in ways that honor diverse contributions and contexts, particularly those of marginalized groups? For example, have I critically examined and reframed narratives that perpetuate injustices or exclude certain perspectives?

III. How do I engage with relevant stakeholders during the dissemination process to support integrative understanding and transformation? For example, how can I move from traditional communication formats to more relational approaches that challenge current paradigms?

Our assessment of the different research steps has shown that some characteristics of a relational paradigm apply to several, or all, steps. For example, it is important to consider that sustainability science, by nature, is intertwined with human values, societal norms, and ethics throughout the process. It is inherently subjective and normative, which makes the idea of “total” objectivity obsolete (Ngunjiri et al. 2010). Consequently, inner dimensions, including people’s individual and collective mindsets, beliefs, values, worldviews, and associated inner qualities/capacities are key for defining, pursuing, and achieving sustainability goals across all levels (individual, collective, system). Embracing a relational approach in sustainability science therefore necessitates an explicit consideration of related inner–outer transformation processes, which, in turn, requires conscious inter- and intrarelating through introspection and reflexivity. This shift broadens the scope of sustainability science and poses epistemological, ontological, ethical, and praxis-related questions regarding (1) how we see the world, (2) how we get to know, (3) how we engage, and (4) how we ensure equity considerations across all aspects (Ives et al. 2023; Wamsler et al. 2024). The relational paradigm thus decenters the human in the production of knowledge. We have explained related aspects in detail in the previous sections, and in those research steps in which their influence is greatest.

New pathways for sustainability science: toward a relational approach in research

Given the challenges of the anthropocene, scholars are increasingly calling for a relational turn to address the root causes of today’s polycrisis. At the same time, little is known about the associated challenges, and there is little advice regarding how to operationalize the approach in sustainability science.

Against this background, this paper explored how we can break out of modern, unsustainable paradigms and approaches, and instead apply more relational thinking, being, and acting in the way we conduct research. To achieve this, we systematically listed all major research phases and assessed possible pathways for integrating a relational paradigm (see Table 1 for an overview, and the supplementary material).

We show that moving toward a relational paradigm requires us to methodically question and redefine existing theories of change, concepts, and approaches. However, transitioning from a mechanistic to a relational paradigm in the domain of sustainability science and beyond does not involve a straightforward substitution.

Instead of viewing paradigm shifts as abrupt replacements, our analyses highlight the evolutionary and emergent nature of such changes. Contrary to Kuhn’s (1996) concept of successive paradigms, our approach recognizes the value of integrating and acknowledging the partial validity of multiple, preceding, and mutually informing paradigms. It is about taking small steps and creating bridges between the current and a potential new paradigm, by exploring how best to be in relationship, with ourselves, our fellow humans, and the other-than-human in a regenerative way.

Yet, as Raymond et al. (2021) point out, methodological challenges and pragmatic decisions to move toward more relational thinking must be addressed, such as the need for setting certain systems boundaries or interfaces. As suggested by the concept of critical complexity, it is possible to transcend the limitations of our dominant mechanistic approaches, while acknowledging the necessity for reduction in research. It embraces the nuanced understanding that some reductionist practices are indispensable, while advocating for a broader framework that encompasses the complexity of entangled socio-ecological systems. Moreover, as Walsh et al. (2020) point out, applying a differentiated relational ontology acknowledges both the separate as well as the relational reality. For instance, dealing with challenges such as identifying leverage points in research—which stems from a bifurcation—means that we acknowledge paradoxes. We might apply the leverage points model to identify where to intervene in the system, while at the same time acknowledging that the model is limited and not fully aligned with relational thinking (Raymond et al. 2021). The need to embrace paradoxes is, in fact, part of moving toward a relational approach (e.g., Kulundu-Bolus 2023): it requires a humble and thus relational attitude and understanding of the research process and the results in themselves.

A key challenge for moving toward a relational paradigm is the current landscape within which sustainability science operates, as it is in itself an expression of the dominant modern paradigm. The field operates within a larger context that is characterized by constant acceleration, a high-speed society, exponential technological development, and continuous social change, all of which affect our own relationships and those involved in any research object (Rosa 2019). Tensions thus arise from the clash between the inherent qualities of a relational approach—which emphasizes interdependencies, connectedness, nonlinearity, uncertainty, and emergence—and systemic pressures that prioritize rapid outputs, quantifiable outcomes, and often individualistic gains. We therefore acknowledge that a paradigm shift needs to go hand in hand with an overall reevaluation of how systems, institutions, policies, and practices are structured and incentivized within sustainability science.

To integrate a relational paradigm into the researcher’s work, we suggest developing processes and practices of reflexive praxis, such as interrupting existing conversations, listening deeply to overlooked, marginalized, or suppressed perspectives, and daring to ask difficult and new questions that support mutual learning toward the emergence of a more relational being, understanding, and acting upon the world (Spreitzer 2021). Moving toward a more relational paradigm is thus not just about adopting a different framework, but is about cultivating individual and collective capabilities and capacities that allow us to challenge conventional norms, structures, and institutions, and encourage exploration and creation from diverse viewpoints toward potential alternatives (Wamsler et al. 2024).

Challenging mainstream thought and daring to ask different questions in each research step are crucial to shifting current scientific norms and systems. Hence, we offer a catalog of questions that allows us to systematically integrate relational being, thinking, and acting into the research process (see Table 1, as well as the supplementary material for an overview of the questions and examples). Each question encapsulates underlying assumptions and implications for the research process and can thus serve as a catalyst for embracing a more relational perspective.

Many of the characteristics of a relational paradigm have an impact across multiple research steps. These aspects include the need to decenter the human perspective, account for the role of relationships, support integrative inner–outer transformation processes across individual, collective, and system levels, and encourage deep reflection on one’s positionality. While these characteristics influence the entire research process, their significance becomes more pronounced in certain steps, which we therefore explored in more detail in the previous sections.

Although we offer some concrete ideas regarding how to move toward a relational paradigm, further research is required to test our theoretical and conceptual considerations and generate further measures and pathways. As the relational paradigm focuses on the (quality of) relationships within systems, and associated inner–outer transformation processes, one key aspect to consider is whether, and how, changes in relationships can best be addressed. Research on the human–nature connection, such as the Connectedness to Nature Scale (Mayer and McPherson Frantz 2004), already exists. However, this only addresses a small part of the story, and related work is generally not linked to sustainability outcomes across individual, collective, and system levels. Other ways to study changes in relationships and their link to sustainability outcomes have been tested, for instance, in the context of leadership training for the European Commission, the UNDP Conscious Food Systems Alliance, and the Inner Development Goals (IDG) initiative (Janss et al. 2023; Jordan 2021; Ramstetter et al. 2023; Rupprecht and Wamsler 2023; Wamsler et al. 2024). Based on the inner–outer transformation model, the change in the relationship to self, others, nature, and the world at large is here applied as a proxy for inner–outer transformation and associated sustainability outcomes (Wamsler et al. 2021). Further research is needed to assess related aspects, for instance, to account for intergenerational trauma and power dynamics, and to identify whether relationships are transactional or an end in themselves, as transactional relationships often lead to overexploitation and injustice (Rosa 2019).
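
As an illustration of what such a proxy can look like in practice, the following minimal sketch computes a pre/post change in a Likert-based connectedness score (the items, responses, and reverse-scored index are invented; they are not the actual Connectedness to Nature Scale items):

```python
def scale_score(items, reverse=frozenset()):
    """Mean of 1-5 Likert responses, flipping any reverse-scored items."""
    adjusted = [6 - v if i in reverse else v for i, v in enumerate(items)]
    return sum(adjusted) / len(adjusted)

pre = [2, 3, 2, 4, 2]    # responses before an intervention (hypothetical)
post = [4, 4, 3, 2, 4]   # responses after (hypothetical)
REVERSED = {3}           # hypothetical reverse-scored item index

change = scale_score(post, REVERSED) - scale_score(pre, REVERSED)
print(f"change in connectedness score: {change:+.2f}")  # +1.60
```

Whether such a score change meaningfully captures inner–outer transformation is, of course, exactly the open research question raised above.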

To conclude, we must dare to question our questions, and dare to ask new questions: relational, existential questions about our identity, our role, and our responsibility in the world, in more reflexive and thus transformative ways. It is about developing sustainability and regeneration as a capacity, and as a foundation for pursuing research not only as a form of ‘about-ing’ and ‘enact-ing’, but also as a ‘within-ing’ and thus ‘be-ing’. The suggested guiding questions may appear to be small, individual acts. However, these small choices can have profound impacts, as they can help to initiate deeper changes, to let go of mental habits, decolonize our minds, and, ultimately, challenge the cultural, institutional, and political landscape that maintains the story of separation of humans and nature, and the story of human dominance and superiority over the “living” that underlies both our current research approaches and today’s sustainability crises.

Footnotes

1. Paradigms shape our ways of knowing, being, and acting in the world (Walsh et al. 2020) and can thus be both a critical barrier and driver for sustainability. They not only influence us personally (i.e., our motivation, values, attitudes, and psychological makeup), but also shape our systems (social, economic, political, technical, ecological) and cultural associations (i.e., narrative frames and cultural norms) (Escobar 2017; Lakoff 2014; Orr 2002; Wahl 2017). Paradigms represent the dominant thought patterns in societies, and thus underlie the theories and methods we use in science (O’Brien 2016; Walsh et al. 2020). This is also true for sustainability, climate science, and any other related field (Kuhn 1996). As a result, they hold significant potential as catalysts for transforming systems (Meadows 1999).

2. In the context of research, related norms are characterized by rationalism, reductionism, empiricism, dualism, and determinism (Redclift and Sage 1994; Rees 1999; Capra and Luisi 2014; Böhme et al. 2022).

3. Despite a rich discourse on relationality, there is no single, comprehensive definition of a relational paradigm. It can rather be seen as an umbrella term that encompasses various strands of thought (Walsh et al. 2020), as presented in our article.

4. Transformation literacy is the skill to steward transformative change collectively across the boundaries of institutions, nations, sectors, and cultures (Künkel and Ragnarsdottir 2022).

5. Our work draws heavily on a literature review that explores relational ontologies, epistemologies, and ethics by Walsh et al. (2020). We also included recent research papers that specifically address sustainability science and relational perspectives. Examples include Hertz et al. (2020) and Mancilla Garcia et al. (2020), who look at socio-ecological systems research from a process-relational perspective, and West et al. (2020), who look at the relational turn in sustainability science in general. These key sources led us to further papers dealing with the relational research approaches relevant for our review.

6. According to integral theory (Wilber 2021), there are two dimensions of reality: an internally versus externally experienced dimension, and an individually versus collectively experienced dimension. Combining these two dimensions yields four domains of human experience, or ways of generating knowledge about the world: (1) ‘it’, knowledge of exterior and individual phenomena; (2) ‘they’, knowledge of exterior and collective phenomena and their interactions; (3) ‘we’, knowledge of internal and collective phenomena and their interactions; and (4) ‘I’, knowledge of internal and individual phenomena and experiences (Esbjörn-Hargens 2010). In sustainability science, the fourth dimension (‘I’) and the in-depth assessment of the relationship between the different dimensions have been largely neglected (Ives et al. 2019, 2023).

7. For a philosophical theory to be valid, it must be internally consistent within its self-referential axioms and core assumptions. Philosophy makes reasoned arguments based on systems of logic, while science is focused on the systematic collection of evidence (Esbjörn-Hargens 2010).

References

Abson DJ, Fischer J, Leventon J, Newig J, Schomerus T, Vilsmaier U, von Wehrden H, Abernethy P, Ives CD, Jager NW et al (2017) Leverage points for sustainability transformation. Ambio 46:30–39

Alford J, Head BW (2017) Wicked and less wicked problems: a typology and a contingency framework. Policy Soc 36(3):397–413. https://doi.org/10.1080/14494035.2017.1361634

Alvesson M, Sandberg J (2020) The problematizing literature review: a counterpoint to Elsbach and van Knippenberg’s argument for integrative reviews. J Manag Stud 57(6):1290–1304. https://doi.org/10.1111/joms.12582

Audouin M, Preiser R, Nienaber S, Downsborough L, Lanz J, Mavengahama S (2013) Exploring the implications of critical complexity for the study of social-ecological systems. Ecol Soc 18(3):12. https://doi.org/10.5751/ES-05434-180312

Barad K (2007) Meeting the universe halfway: quantum physics and the entanglement of matter and meaning. Duke University Press, Durham

Bateson G (1979) Mind and nature: a necessary unity. Dutton, New York

Bateson G (1982) Steps to an ecology of mind. Reprint 1987. Jason Aronson, Lanham

Bateson N (2021) Aphanipoiesis. Journal of the International Society for the Systems Sciences, Proceedings of the 64th Annual Meeting of the ISSS 1(1)

Bateson N (2022) An essay on ready-ing: tending the prelude to change. Syst Res Behav Sci 39(5):990–1004. https://doi.org/10.1002/sres.2896

Bhaskar R, Esbjörn-Hargens S, Hedlund N, Hartwig M (2016) Metatheory for the twenty-first century (ontological explorations), Kindle edn. Taylor and Francis, London

Böhme J (2023) Inner and outer transformation in the anthropocene: a relational approach. Universitätsbibliothek der Leuphana Universität Lüneburg, Lüneburg

Böhme J, Walsh Z, Wamsler C (2022) Sustainable lifestyles: towards a relational approach. Sustain Sci 17:2063–2076. https://doi.org/10.1007/s11625-022-01117-y

Bollier D, Helfrich S (2019) Free, fair and alive. New Society Publishers, Gabriola

Booth WC, Colomb GG, Williams JM (2016) The craft of research, 4th edn. University of Chicago Press, Chicago

Bradbury H (ed) (2015) The SAGE handbook of action research, 3rd edn. SAGE Publications, Thousand Oaks

Bradbury H (2022) How to do action research for transformations at a time of eco-social crisis. Edward Elgar Publishing, Cheltenham

Capra F, Luisi PL (2014) The systems view of life: a unifying vision. Cambridge University Press, Cambridge

Charlton NG (2008) Understanding Gregory Bateson. Mind, beauty and the sacred earth. Suny Press, New York

Cilliers P (2005) Knowledge, limits and boundaries. Futures 37:605–613

Cohen L, Manion L, Morrison K (2018) Research methods in education, 8th edn. Routledge, London

Creswell JW (2018) Research design: qualitative, quantitative, and mixed methods approaches, 5th edn. SAGE Publications, New York

Drawson AS, Toombs E, Mushquash CJ (2017) Indigenous research methods: a systematic review. Int Indig Policy J. https://doi.org/10.18584/iipj.2017.8.2.5

Denzin N (2013) The death of data? Cult Stud Crit Methodol 13(4):353–356

Esbjörn-Hargens S (2010) An overview of integral theory: an all-inclusive framework for the twenty-first century. In: Esbjörn-Hargens S (ed) Integral theory in action: applied, theoretical, and constructive perspectives on the AQAL model. State University of New York Press, New York, pp 33–61

Escobar A (2017) Designs for the pluriverse: radical interdependence, autonomy, and the making of worlds. Duke University Press Books, Durham

Feyerabend PK (1975) Against method. New Left Books, London

Fischer J, Gardner TA, Bennett EM, Balvanera P, Biggs R, Carpenter S, Tenhunen J (2015) Advancing sustainability through mainstreaming a social-ecological systems perspective. Curr Opin Environ Sustain 14:144–149. https://doi.org/10.1016/j.cosust.2015.06.002

Frank P, Wagemann J, Grund J et al (2024) Directing personal sustainability science toward subjective experience: conceptual, methodological, and normative cornerstones for a first-person inquiry into inner worlds. Sustain Sci 19:555–574. https://doi.org/10.1007/s11625-023-01442-w

Fricker M (2007) Epistemic injustice: power and the ethics of knowing. Oxford University Press, Oxford

Gearty MR, Marshall J (2020) Living life as inquiry—a systemic practice for change agents. Syst Pract Action Res. https://doi.org/10.1007/s11213-020-09539-4

Goodchild M (2021) Relational systems thinking. J Aware Based Syst Change 1(1):75–103. https://doi.org/10.47061/jabsc.v1i1.577

Haraway DJ (2016) Staying with the trouble: making kin in the Chthulucene. Duke University Press, Durham

Hawkins H (2015) Creative geographic methods: knowing, representing, intervening. On composing place and page. Cult Geogr 22(2):247–268. https://doi.org/10.1177/1474474015569995

Hertz T, Mancilla Garcia M (2021) The cod and the cut: intra-active intuitions. Front Sociol 6:724751. https://doi.org/10.3389/fsoc.2021.724751

Hertz T, Mancilla Garcia M, Schlüter M (2020) From nouns to verbs: how process ontologies enhance our understanding of social-ecological systems understood as complex adaptive systems. People Nat. https://doi.org/10.1002/pan3.10079

IPCC (2021) Climate change 2021: the physical science basis. IPCC Working Group I contribution to AR6. Cambridge University Press, Cambridge

IPCC (2022a) Climate change 2022: mitigation of climate change. In: Shukla PR, Skea J, Slade R, Al Khourdajie A, van Diemen R, McCollum D, Pathak M, Some S, Vyas P, Fradera R, Belkacemi M, Hasija A, Lisboa G, Luz S, Malley J (eds) Contribution of Working Group III to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change. Cambridge University Press, Cambridge. https://doi.org/10.1017/9781009157926

IPCC (2022b) Climate change 2022: impacts, adaptation and vulnerability. In: Pörtner H-O, Roberts DC, Tignor M, Poloczanska ES, Mintenbeck K, Alegría A, Craig M, Langsdorf S, Löschke S, Möller V, Okem A, Rama B (eds) Contribution of Working Group II to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change. Cambridge University Press, Cambridge. https://doi.org/10.1017/9781009325844

Ison R (2018) Governing the human–environment relationship: systemic practice. Curr Opin Environ Sustain 33:114–123. https://doi.org/10.1016/j.cosust.2018.05.009

Ives C, Freeth R, Fischer J (2019) Inside-out sustainability: the neglect of inner worlds. Ambio 49:208–217

Ives CD, Schäpke N, Woiwode C, Wamsler C (2023) IMAGINE sustainability: integrated inner–outer transformation in research, education and practice. Sustain Sci 18:2777–2786. https://doi.org/10.1007/s11625-023-01368-3

Janss J, Wamsler C, Smith A, Stephan L (2023) The human dimension of the Green Deal: How to overcome polarisation and facilitate culture and system change. Published by the Inner Green Deal gGmbH, Cologne, Germany, and Lund University Centre for Sustainability Studies (LUCSUS), Lund, Sweden

Jerneck A, Olsson L, Ness B, Anderberg S, Baier M, Clark E, Persson J (2010) Structuring sustainability science. Sustain Sci 6(1):69–82. https://doi.org/10.1007/s11625-010-0117-x

Jones SH, Adams TE (2010) Autoethnography and queer theory: making possibilities. In: Denzin NK, Giardina MD (eds) Qualitative inquiry and human rights. Left Coast Press, Walnut Creek, pp 136–157

Jordan T (2021) Inner development goals: background, method and the IDG framework. Growth that Matters AB. https://drive.google.com/file/d/1s_TQbFreKH13kruxss8aQsbWlxKVFsfK/edit

Jørgensen PS, Jansen RE, Ortega DIA et al (2023) Evolution of the polycrisis: anthropocene traps that challenge global sustainability. Philos Trans R Soc B. https://doi.org/10.1098/rstb.2022.0261

Kajikawa Y, Tacoa F, Yamaguchi K (2014) Sustainability science: the changing landscape of sustainability research. Sustain Sci 9(4):431–438. https://doi.org/10.1007/s11625-014-0244-x

Kirby V (2011) Quantum anthropologies: life at large. Duke University Press, Durham

Kuhn TS (1996 [1962]) The structure of scientific revolutions, 3rd edn. The University of Chicago Press, Chicago

Kulundu-Bolus I (2023) On regenerative African futures: sovereignty, becoming human, death, and forgiveness as fertile paradoxes for decolonial soul work. J Aware Based Syst Change 3(2):11–22

Künkel P, Ragnarsdottir KV (2022) Transformation literacy: pathways to regenerative civilizations. Springer, Cham

Kuntz AM, Presnall MM (2012) Wandering the tactical: from interview to intraview. Qual Inq 18(9):732–744. https://doi.org/10.1177/1077800412453016

Lakoff G (2014) The ALL NEW Don’t Think of an Elephant! Chelsea Green Publishing, London

Lang DJ, Wiek A, Bergmann M, Stauffacher M, Martens P, Moll P, Swilling M, Thomas CJ (2012) Transdisciplinary research in sustainability science: practice, principles, and challenges. Sustain Sci 7(S1):25–43. https://doi.org/10.1007/s11625-011-0149-x

Latour B (2005) Reassembling the social: an introduction to actor-network theory. Oxford University Press, Oxford

Leal Filho W, Consorte McCrea A (2019) Sustainability and the humanities. Springer International Publishing, Cham

Leichenko R, O’Brien K (2020) Climate and society: transforming the future. Wiley, New York

Lönngren J, van Poeck K (2020) Wicked problems: a mapping review of the literature. Int J Sust Dev World 28(6):481–502. https://doi.org/10.1080/13504509.2020.1859415

Mancilla Garcia M, Hertz T, Schlüter M, Preiser R, Woermann M (2020) Adopting process-relational perspectives to tackle the challenges of social-ecological systems research. Ecol Soc 25(1):29. https://doi.org/10.5751/ES-11425-250129

Mar KA, Schäpke N, Fraude C, Bruhn T, Wamsler C, Stasiak D, Schroeder H, Lawrence MG (2023) Learning and community building in support of collective action: toward a new climate of communication at the COP. Wiley Interdiscip Rev Clim Change 14(4):e832. https://doi.org/10.1002/wcc.832

Mauser W, Klepper G, Rice M, Schmalzbauer BS, Hackmann H, Leemans R, Moore H (2013) Transdisciplinary global change research: the co-creation of knowledge for sustainability. Curr Opin Environ Sustain 5(3–4):420–431. https://doi.org/10.1016/j.cosust.2013.07.001

Mayer FS, McPherson Frantz C (2004) The connectedness to nature scale: a measure of individuals’ feeling in community with nature. J Environ Psychol 24:503–515

Mbah MF, Leal Filho W, Ajaps S (eds) (2022) Indigenous methodologies, research and practices for sustainable development. Springer, Cham. https://doi.org/10.1007/978-3-031-12326-9

Meadows D (1999) Leverage points: places to intervene in a system. The Sustainability Institute, Chennai

Miller TR (2012) Constructing sustainability science: Emerging perspectives and research trajectories. Sustain Sci 8(2):279–293. https://doi.org/10.1007/s11625-012-0180-6

Miller TR, Wiek A, Sarewitz D, Robinson J, Olsson L, Kriebel D, Loorbach D (2013) The future of sustainability science: a solutions-oriented research agenda. Sustain Sci 9(2):239–246. https://doi.org/10.1007/s11625-013-0224-6

Morin E (2008) On complexity. Cresskill, Hampton

Nersessian N (2010) Creating scientific concepts. MIT Press, Cambridge

Ngunjiri F, Hernandez KA, Chang H (2010) Living autoethnography: connecting life and research. J Res Pract 6:E1

O’Brien KL (2016) Climate change and social transformations: is it time for a quantum leap? Wiley Interdiscip Rev Clim Change 7:618–626

Orr DW (2002) The nature of design—ecology, culture, and human intention. Oxford University Press, Oxford

Osberg G, Islar M, Wamsler C (2024) Toward a post-carbon society: supporting agency for collaborative climate action. Ecol Soc 29(1):16

Osgood J, Taylor C, Andersen C, Benozzo A, Carey N, Elmenhorst C, Fairchild N, Koro M, Moxnes A, Otterstad A, Rantala T, Tobias-Green K (2020) Conferencing otherwise: a feminist new materialist writing experiment. Cult Stud Crit Methodol 20:596–609. https://doi.org/10.1177/1532708620912801

Poli R (2013) A note on the difference between complicated and complex social systems. Cadmus 2(1):142–147

Pöllänen E, Walter O, Bojner Horwitz E, Wamsler C (2023) Education for sustainability: understanding processes of change across individual, collective and systems levels. Challenges 14(1):5. https://doi.org/10.3390/challe14010005

Pollock D (2007) The performative “I”. Cult Stud Crit Methodol 7(3):239–255

Popper K (1963) Conjectures and refutations: the growth of scientific knowledge. Routledge, London

Porter T, Reischer R (2018) We can’t get here from there: sustainability from complexity vs. conventional perspectives. Emerg Complex Organ 1:1–8

Preiser R (2012) The problem of complexity. Re-thinking the role of critique. Dissertation. Department of Philosophy, Stellenbosch University, Stellenbosch

Puig de la Bellacasa M (2017) Matters of care: speculative ethics in more than human worlds. University of Minnesota Press, Minneapolis

Ramstetter L, Rupprecht S, Mundaca L, Klackl J, Osika W, Stenfors C, Wamsler C (2023) Fostering collective climate action and leadership: insights from a pilot experiment involving mindfulness and compassion. iScience 26(3):106191. https://doi.org/10.1016/j.isci.2023.106191

Raymond CM, Kaaronen R, Giusti M, Linder N, Barthel S (2021) Engaging with the pragmatics of relational thinking, leverage points and transformations—Reply to West et al. Ecosyst People 17(1):1–5. https://doi.org/10.1080/26395916.2020.1867645

Redclift M, Sage C (1994) Strategies for sustainable development. Local agendas for the Southern Hemisphere. Wiley, Chichester

Rees WE (1999) Achieving sustainability: reform or transformation? In: Satterthwaite D (ed) The earthscan reader in sustainable cities. Earthscan, London, pp 22–52

Romm NRA (2015) Reviewing the transformative paradigm: a critical systemic and relational (indigenous) lens. Syst Pract Action Res 28(5):411–427. https://doi.org/10.1007/s11213-015-9344-5

Rosa H (2019) Resonance: a sociology of the relationship to the world. Polity Press, Medford, MA

Rowell L, Bruce CD, Shosh JM, Riel MM (eds) (2017) The Palgrave international handbook of action research. Palgrave Macmillan, New York

Rowson J (2021) Tasting the pickle: ten flavours of meta-crisis and the appetite for a new civilisation. Perspectiva. https://systems-souls-society.com/wp-content/uploads/2021/02/Tasting-the-Pickle-Ten-flavours-of-meta-crisis-and-the-appetite-for-a-new-civilisation-1.pdf

Rupprecht S, Wamsler C (2023) The global leadership for sustainable development programme: inner development for accelerating action towards the sustainable development goals, evaluation report written for the inner development goals and the Templeton World Charity Foundation. Published by the Inner Green Deal and Lund University Centre for Sustainability Studies (LUCSUS), Lund

Selg P, Ventsel A (2020) Introducing relational political analysis. Palgrave Studies in Relational Sociology. Springer, Cham

Smartt Gullion J (2018) Diffractive ethnography: social sciences and the ontological turn. Routledge, London

Snowden D, Greenberg R, Bertsch B (2021) Cynefin: weaving sense-making into the fabric of our world. Cognitive Edge, Singapore

Spreitzer EM (2021) Being the change? How learning communities shape social changemaking as an awareness-led praxis. University of Cambridge. Unpublished Master Dissertation

Springgay S (2015) Approximate-rigorous-abstractions: propositions of activation for posthumanist research. In: Snaza N, Weaver JA (eds) Posthumanism and educational research. Routledge, New York, pp 76–91

St. Pierre E (2013) The appearance of data. Cult Stud Crit Methodol 13(4):223–227

Stålhammar S, Thorén H (2019) Three perspectives on relational values of nature. Sustain Sci 14:1201–1212. https://doi.org/10.1007/s11625-019-00718-4

Storm K, Ringrose J, Osgood J, Renold E (2019) Special issue, PhEmaterialism: response-able research and pedagogy. Reconceptualizing Educ Res Methodol 10(2):3

Taleb NN (2013) Antifragile: things that gain from disorder. Penguin, London

Todd Z (2016) An Indigenous feminist’s take on the ontological turn: ‘Ontology’ is just another word for colonialism. J Hist Sociol 29(4):22

Tullio V (2016) Peirce on abduction and embodiment. In: Madzia R, Jung M (eds) Pragmatism and embodied cognitive science. De Gruyter, Berlin, pp 251–268

Turner JR, Baker RM (2019) Complexity theory: an overview with potential applications for the social sciences. Systems 7(1):4. https://doi.org/10.3390/systems7010004

Van der Hoorn S (1995) The development of ecosystemic thinking: an epistemological study. Unpublished doctoral thesis. University of Stellenbosch

Van der Merwe SE, Biggs R, Preiser R, Cunningham C, Snowden DJ, O’Brien K, Jenal M, Vosloo M, Blignaut S, Goh Z (2019) Making sense of complexity: using SenseMaker as a research tool. Systems 7(2):25. https://doi.org/10.3390/systems7020025

Van Kerkhoff L (2013) Developing integrative research for sustainability science through a complexity principles-based approach. Sustain Sci 9(2):143–155. https://doi.org/10.1007/s11625-013-0203-y

Verlie B (2018) From action to intra-action? Agency, identity and ‘goals’ in a relational approach to climate change education. Environ Educ Res. https://doi.org/10.1080/13504622.2018.1497147

Vu C (2018) New materialist auto-ethico-ethnography: agential-realist authenticity and objectivity in intimate scholarship. In: Strom K, Mills T, Ovens A (eds) Decentering the researcher in intimate scholarship. Emerald Group Publishing. https://bit.ly/2TqCodh

Wahl D (2017) Designing regenerative cultures. Triarchy Press, Axminster

Walsh Z, Böhme J, Wamsler C (2020) Towards a relational paradigm in sustainability research, practice, and education. Ambio. https://doi.org/10.1007/s13280-020-01322-y

Wamsler C (2020) Education for sustainability: fostering a more conscious society and transformation towards sustainability. Int J Sustain High Educ 21(1):112–130. https://doi.org/10.1108/IJSHE-04-2019-0152

Wamsler C, Bristow J (2022) At the intersection of mind and climate change: integrating inner dimensions of climate change into policymaking and practice. Clim Change 173:7. https://doi.org/10.1007/s10584-022-03398-9

Wamsler C, Osberg G, Osika W, Hendersson H, Mundaca L (2021) Linking internal and external transformation for sustainability and climate action: towards a new research and policy agenda. Glob Environ Change 71:102373. https://doi.org/10.1016/j.gloenvcha.2021.102373

Wamsler C, Bristow J, Cooper K, Steidle G, Taggart S, Søvold L, Bockler J, Oliver TH, Legrand T (2022a) Theoretical foundations report: research and evidence for the potential of consciousness approaches and practices to unlock sustainability and systems transformation. Report written for the UNDP Conscious Food Systems Alliance (CoFSA). https://www.contemplative-sustainable-futures.com/_files/ugd/4cc31e_143f3bc24f2c43ad94316cd50fbb8e4a.pdf

Wamsler C, Osberg G, Panagiotou A, Smith B, Stanbridge P, Osika W, Mundaca L (2022b) Meaning-making in a context of climate change: supporting agency and political engagement. Clim Policy 23:829–844. https://doi.org/10.1080/14693062.2022.2121254

Wamsler C, Osberg G, Janss J, Stephan L (2024) Revolutionising sustainability leadership and education: addressing the human dimension to support flourishing, culture and system transformation. Clim Change 177:4. https://doi.org/10.1007/s10584-023-03636-8

Wendt A (2015) Quantum mind and social science: unifying physical and social ontology. Cambridge University Press, Cambridge

West B (2006) Where medicine went wrong: rediscovering the path to complexity. World Scientific, Hackensack

West S, Haider LJ, Stålhammar S, Woroniecki S (2020) A relational turn for sustainability science? Relational thinking, leverage points and transformations. Ecosyst People 16(1):304–325. https://doi.org/10.1080/26395916.2020.1814417

Whyte KP (2020) Too late for Indigenous climate justice: ecological and relational tipping points. WIREs Clim Change 11:e603

Wiek A, Lang DJ (2016) Transformational sustainability research methodology. In: Heinrichs H, Martens P, Michelsen G, Wiek A (eds) Sustainability science. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-7242-6_3

Wilber K (2021) A theory of everything: an integral vision for business, politics, science, and spirituality. Shambhala

Xiang WN (2013) Working with wicked problems in socio-ecological systems: awareness, acceptance and adaptation. Landsc Urban Plan 110:1–4

Acknowledgements

This article is a co-creation, encompassing the more-than-human world, along with our cultural heritage and the predecessors who have formed our knowledge and understanding, as well as cultural and technological tools, such as networks and laptops, and the people involved in their development.

The research was supported by the Existential Resilience project funded by Lund University and two projects funded by the Swedish Research Council Formas: (1) Mind4Change (grant number 2019-00390; full title: Agents of Change: Mind, Cognitive Bias and Decision-Making in a Context of Social and Climate Change), and (2) TransVision (grant number 2019-01969; full title: Transition Visions: Coupling Society, Well-being and Energy Systems for Transitioning to a Fossil-free Society).

Open access funding provided by Lund University.

Author information

Jessica Böhme and Christine Wamsler have contributed equally to this work and thus share first authorship.

Authors and Affiliations

Fachhochschule des Mittelstands Berlin, Ernst-Reuter-Platz 3-5, 10587, Berlin, Germany

Jessica Böhme

Leuphana University Lüneburg, Universitätsallee 1, 21335, Lüneburg, Germany

Eva-Maria Spreitzer

Lund University Centre for Sustainability Studies (LUCSUS), Box 170, 221 00, Lund, Sweden

Christine Wamsler

Corresponding author

Correspondence to Christine Wamsler.

Ethics declarations

Conflict of interest: The authors declare no competing interests.

Additional information

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Handled by Shizuka Hashimoto, Tokyo Daigaku, Japan.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (PDF 88 kb)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Böhme, J., Spreitzer, EM. & Wamsler, C. Conducting sustainability research in the anthropocene: toward a relational approach. Sustain Sci (2024). https://doi.org/10.1007/s11625-024-01510-9

Received: 26 July 2023

Accepted: 21 March 2024

Published: 24 May 2024

DOI: https://doi.org/10.1007/s11625-024-01510-9

Keywords

  • Eco-justice
  • Inner transformation
  • Inner transition
  • Existential sustainability
  • Relationality
  • Relational ontology
  • Systems thinking
  • Transformation research
  • Existential resilience
  • Inner-outer transformation

Open access • Published: 22 May 2024

Feasibility and acceptability of a cohort study baseline data collection of device-measured physical behaviors and cardiometabolic health in Saudi Arabia: expanding the Prospective Physical Activity, Sitting and Sleep consortium (ProPASS) in the Middle East

  • Abdulrahman I. Alaqil (ORCID: orcid.org/0000-0003-0458-2354) 1,2,3,
  • Borja del Pozo Cruz (ORCID: orcid.org/0000-0002-9728-1317) 2,4,5,
  • Shaima A. Alothman (ORCID: orcid.org/0000-0003-2739-0929) 6,
  • Matthew N. Ahmadi (ORCID: orcid.org/0000-0002-3115-338X) 7,8,
  • Paolo Caserotti 2,
  • Hazzaa M. Al-Hazzaa (ORCID: orcid.org/0000-0002-3099-0389) 6,9,
  • Andreas Holtermann (ORCID: orcid.org/0000-0003-4825-5697) 3,
  • Emmanuel Stamatakis 7,8 &
  • Nidhi Gupta 3

BMC Public Health volume 24, Article number: 1379 (2024)

Physical behaviors such as physical activity, sedentary behavior, and sleep are associated with mortality, but there is a lack of epidemiological data and knowledge based on device-measured physical behaviors.

This study aimed to assess the feasibility of baseline data collection using the Prospective Physical Activity, Sitting, and Sleep consortium (ProPASS) protocols in the specific context of Saudi Arabia. ProPASS is a recently developed global platform for collaborative research that aims to harmonize retrospective and prospective data on device-measured behaviors and health. Using ProPASS methods to collect data for such studies in Saudi Arabia will provide standardized data from underrepresented countries.

This study explored the feasibility of baseline data collection in Saudi Arabia between November and December 2022, with a target recruitment of 50 participants aged ≥ 30 years. Established ProPASS methods were used to measure anthropometrics and blood pressure, collect blood samples, carry out physical function tests, and assess health status and the context of physical behaviors using questionnaires. The ActivPAL™ device was used to assess physical behaviors, and participants were asked to attend two sessions at the LHRC. The feasibility of the current study was assessed by evaluating recruitment capability, acceptability, suitability of study procedures, and the resources and abilities needed to manage and implement the study. Exit interviews were conducted with all participants.
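
For readers unfamiliar with thigh-worn accelerometry, the following simplified sketch illustrates the basic principle behind such devices (a toy classifier, not the proprietary ActivPAL algorithm): the thigh's inclination relative to gravity separates sitting/lying (thigh near horizontal) from upright time.

```python
import math

def thigh_inclination_deg(ax: float, ay: float, az: float) -> float:
    """Angle between the thigh's long axis (x) and vertical, in degrees."""
    norm = math.sqrt(ax * ax + ay * ay + az * az)  # magnitude of the gravity vector
    return math.degrees(math.acos(ax / norm))

def classify_posture(sample, threshold_deg=45.0):
    """Label a static accelerometer sample by thigh inclination."""
    angle = thigh_inclination_deg(*sample)
    return "sitting/lying" if angle > threshold_deg else "upright"

print(classify_posture((0.95, 0.10, 0.20)))  # thigh near vertical -> upright
print(classify_posture((0.10, 0.20, 0.95)))  # thigh near horizontal -> sitting/lying
```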

Results: A total of 75 participants expressed an interest in the study, of whom 54 initially agreed to participate. Ultimately, 48 participants were recruited into the study (recruitment rate: 64%). The study completion rate was 87.5% of recruited participants; 95% of participants were satisfied with their participation in the study, and 90% reported no negative feelings related to participating. One participant reported experiencing moderate skin irritation related to placement of the accelerometer. Additionally, 96% of participants expressed their willingness to participate in the study again.

Conclusions: Based on the successful methodology, the data collection results, and participant acceptability, the ProPASS protocols are feasible to administer in Saudi Arabia. These findings are promising for establishing a prospective cohort in Saudi Arabia.


Background

Global data from 2023 indicate that an estimated 27.5% of adults do not meet physical activity guidelines and have poor physical behaviors (e.g., physical activity, sedentary behavior, and sleep) that are linked with an increased risk of morbidity and mortality [ 1 , 2 , 3 , 4 ]. Sufficient physical activity and limited sedentary time are associated with better health outcomes (e.g., cardiovascular health, mental health, and physical function) [ 1 , 2 ]. Despite this, 50–90% of Saudi Arabian adults perform low or insufficient daily physical activity, and about 50% spend at least five hours per day sitting [ 5 ]. Furthermore, around 33% of the population sleeps less than 7 h per night [ 6 ]. These trends could be a reason why non-communicable diseases account for 73% of mortality, and cardiovascular diseases for 37% of all deaths, among Saudi Arabian adults [ 7 ]. However, there have been few studies in Middle Eastern countries, and evidence linking physical behaviors and health outcomes is under-represented in Saudi Arabia [ 1 ].

Furthermore, within Saudi Arabia, the few studies exploring this connection often rely on self-reported physical behaviors, which do not always provide an accurate picture [ 5 , 8 , 9 , 10 , 11 ]. This lack of data necessitates studies that incorporate measurements from devices that directly track these behaviors among Saudi Arabian adults. Such an approach aligns with recent guidance from the World Health Organization (WHO) on the necessity of incorporating device-measured physical behaviors into future studies to explore their relationships with various health aspects [ 1 , 12 ]. By employing such methods, we can gain more precise insights into the dose-response relationships between different physical behaviors and various health outcomes among Saudi Arabian adults.

The Prospective Physical Activity, Sitting, and Sleep Consortium (ProPASS) is an initiative that aims to explore how physical behaviors, measured with thigh-based accelerometry, influence a wide range of health outcomes. This initiative operates on a global scale and aims to harmonize data from both retrospective and future studies [ 13 ]. To fulfill this aim, ProPASS is developing methods for collecting prospective data and for processing, harmonizing, and pooling data from previous and future studies [ 14 ]. To date, the methods of the ProPASS consortium have been used to harmonize data from large-scale epidemiological studies, such as the 1970 British Birth Cohort, the Australian Longitudinal Study on Women’s Health [ 15 ], and Norway’s Trøndelag Health Study (HUNT) [ 16 , 17 ]. As such, this study seeks to determine whether the ProPASS methodologies are effective in the context of data collection within Saudi Arabia. This will help to standardize the measurement of physical behaviors, enhance harmonization across studies, and create a more representative and valid understanding of the associations between physical behaviors and health globally, including in under-represented countries such as Saudi Arabia.

This paper describes the feasibility of baseline ProPASS data collection in Saudi Arabia, with data prospectively harmonized with the main ProPASS resource. This feasibility study of baseline data collection will serve as a framework for a future cohort study investigating the associations between device-measured physical behaviors (e.g., physical activity, sedentary behavior, and sleep) and cardiometabolic health in Saudi adults.

Methods

The study was approved by the Institutional Review Board at Princess Nourah Bint Abdul Rahman University, Riyadh, Saudi Arabia (IRB 22–0146), and was carried out in accordance with the principles of the Declaration of Helsinki.

Study design and procedures

Participants were informed about the study’s aims and asked to read and sign the consent form before any measurements were taken. After agreeing to participate, they were asked to attend two sessions at the Lifestyle and Health Research Center (LHRC) at the Health Sciences Research Center of Princess Nourah Bint Abdulrahman University. During the first visit, each participant’s anthropometric measurements (e.g., height, weight, waist circumference), blood pressure and heart rate, blood samples, and handgrip strength were taken. Next, the participants completed questionnaires on demographic information, dietary habits, self-rated health, self-reported smoking status, and the Global Physical Activity, Sedentary Behaviors, and Sleep behavior questionnaires. At the end of the first visit, the researcher attached the ActivPAL™ accelerometer to each participant’s thigh; participants were asked to wear it for seven consecutive days. Participants were also provided with a diary to record their waking and sleeping hours [ 18 ]. On the 8th day of the study, the participants were asked to attend the LHRC for session two, where they returned the device and were interviewed (see Fig.  1 ).

Figure 1. Demonstration and summary of the study procedure.

Participants and eligibility

The study aimed to recruit a total of 50 Saudi adults aged ≥ 30 years, which is generally considered a common sample size for feasibility studies [ 19 , 20 ]. The eligibility criteria were: (1) Saudi national, (2) resident in Riyadh, and (3) aged ≥ 30 years. The exclusion criteria were: (1) having a current medical condition that forces them to be chair-bound or bedridden for more than half of their waking hours, (2) being allergic to plasters or adhesives, (3) being allergic to low-density polyethylene, (4) having a skin condition that would prevent them from wearing the monitor, and (5) needing to pass through a metal detector/security checkpoint during the study period. The study’s aims, protocol, and procedures were clearly described to all participants before any measurements were taken.
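
For illustration only, here is a minimal sketch of how the stated eligibility and exclusion criteria could be encoded as a screening check. All field names and flags are hypothetical; the study does not describe any screening script.

```python
# Hypothetical screening check encoding the stated criteria.
# Field names are illustrative, not part of the study's tooling.

def is_eligible(p: dict) -> bool:
    meets_inclusion = (
        p["nationality"] == "Saudi"
        and p["city"] == "Riyadh"
        and p["age"] >= 30
    )
    meets_exclusion = (
        p["chair_bound_or_bedridden_most_waking_hours"]
        or p["allergic_to_plasters_or_adhesives"]
        or p["allergic_to_low_density_polyethylene"]
        or p["skin_condition_preventing_monitor_wear"]
        or p["needs_metal_detector_during_study"]
    )
    return meets_inclusion and not meets_exclusion

candidate = {
    "nationality": "Saudi", "city": "Riyadh", "age": 42,
    "chair_bound_or_bedridden_most_waking_hours": False,
    "allergic_to_plasters_or_adhesives": False,
    "allergic_to_low_density_polyethylene": False,
    "skin_condition_preventing_monitor_wear": False,
    "needs_metal_detector_during_study": False,
}
print(is_eligible(candidate))  # True
```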

Recruitment

Participant recruitment was carried out over the month of November 2022. Participants were recruited from different locations across Riyadh, Saudi Arabia, using electronic flyers on social media (e.g., Twitter, WhatsApp) that provided information about the study and the researcher’s contact details. Prospective participants who were interested in joining the study were asked to provide their contact information via a link to Google Forms featured in the study description. Participants who initially expressed interest but later decided not to join were invited to share their reasons for non-participation in person or by telephone.

Measurements based on ProPASS methodology

The current study employed the ProPASS method and protocol for new cohort studies that seek to join ProPASS prospectively [ 14 , 21 ]. All measurements were taken by researchers who were well trained in the ProPASS protocol and methods. Blood pressure and handgrip strength measurements were taken three times and the mean was then calculated; all other measurements were taken only once.

Anthropometric measurements

Height (to the nearest 0.1 cm) and weight (to the nearest 0.1 kg) were measured with a stadiometer (SECA 284; Seca, Hamburg, Germany) and scale (SECA 284; Seca, Hamburg, Germany), respectively. Waist circumference (to the nearest 0.1 cm) was measured midway between the lower rib margin and the iliac crest at the end of a gentle expiration [ 22 ]. Body mass index (BMI) was calculated using the standard formula (body weight in kilograms divided by height in meters squared).
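
As a quick sanity check of the formula, a minimal sketch follows; the weight/height pair below is invented to reproduce the sample’s reported mean BMI of 28.3 and is not study data.

```python
def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index: weight (kg) divided by height (m) squared."""
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2

# Invented pair chosen to reproduce the sample's reported mean BMI of 28.3.
print(round(bmi(weight_kg=80.0, height_cm=168.0), 1))  # 28.3
```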

Blood pressure and heart rate

Blood pressure was measured after the participant had rested for five minutes in a sitting position. It was taken three times, with one minute between measurements, and the average reading was recorded [ 23 ]. Blood pressure and heart rate were measured using a Welch Allyn Connex 7300 Spot Vital Signs Monitor, which provides a high degree of accuracy [ 24 ]. Mean arterial pressure (MAP) was then calculated (MAP = 1/3 × SBP + 2/3 × DBP, in mm Hg) using the averaged SBP and DBP values [ 25 ].
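
A minimal sketch of the averaging and MAP steps described above; the three example readings are invented for illustration and are not taken from the study.

```python
def mean_of_three(readings: list[float]) -> float:
    # The protocol takes three readings, one minute apart, and averages them.
    assert len(readings) == 3
    return sum(readings) / 3.0

def mean_arterial_pressure(sbp: float, dbp: float) -> float:
    # MAP = 1/3 * SBP + 2/3 * DBP (mm Hg)
    return sbp / 3.0 + 2.0 * dbp / 3.0

# Invented example readings, not study data:
sbp = mean_of_three([122.0, 120.0, 121.0])  # -> 121.0
dbp = mean_of_three([80.0, 79.0, 78.0])     # -> 79.0
print(round(mean_arterial_pressure(sbp, dbp), 1))  # 93.0 mm Hg
```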

Blood samples

Non-fasting finger-prick (capillary) blood samples (40 µL) were collected for analysis after warming the finger for five minutes. A drop of blood was taken directly from the heated finger to be analysed for blood glucose, triglycerides, total cholesterol, high-density lipoprotein cholesterol, and low-density lipoprotein cholesterol. A previously validated CardioChek PA analyser (CardioChek PA Blood Analyser, UK) was used to analyse the blood samples [ 26 , 27 ].

Medication use

Participants’ medication use was evaluated by the question: Do you currently use any prescription medicines? If the answer was yes, the participants were asked which medications they used, such as medication for high blood pressure, high cholesterol, asthma, COPD, anxiety, depression, thyroid problems, or allergies. They were also asked whether the medication was in the form of tablets or nasal sprays; whether the medication was anti-inflammatory, chemotherapeutic, urological, birth control, or neurological; and the age at which they had begun using the medication.

Familial disease history

Familial disease history was assessed by the question: Do your parents, siblings or children have, or have they ever had, some of the following diseases before the age of 60? The responses included asthma, hay fever/nasal allergies, chronic bronchitis, emphysema or COPD, anxiety or depression, myocardial infarction (heart attack), diabetes, stroke or brain hemorrhage, and cancer. The response options were yes, no, and I don’t know.

Chronic health status

Participants’ chronic disease status and/or long-term health issues were assessed by the question: Have you had, or do you have any of the following diseases? The responses included angina, myocardial infarction (heart attack), heart failure, peripheral vascular disease, atrial fibrillation, stroke/brain hemorrhage, thrombosis, pulmonary embolism, asthma, COPD or emphysema, diabetes, hypothyroidism (low metabolism), hyperthyroidism (high metabolism), cancer, migraine, psoriasis, kidney disease, arthritis (rheumatoid arthritis), Bechterew’s disease, gout, mental health problems, osteoporosis, sleep apnea, arthrosis, nerve disease, hearing/ear disease, eye disease, and infection. Those who replied yes were asked a follow-up question: How old were you when you had it for the first time?

Mobility limitations

The questionnaire was based on three questions on performance-based measures of mobility, which had already been translated and culturally adapted into Arabic [ 28 ]. These three questions are valid and reliable tools to identify early indications of disability and can be used to identify those at high risk of future disability [ 29 ]. Self-reported mobility was assessed via the following questions: (1) Do you have difficulty in walking 2.0 km? (2) Do you have difficulty in walking 0.5 km? and (3) Do you have difficulty in walking up one flight of stairs? The five response options were: (1) able to manage without difficulty, (2) able to manage with some difficulty, (3) able to manage with a great deal of difficulty, (4) able to manage only with the help of another person, and (5) unable to manage even with help.

Dietary habits

The dietary habits questionnaire was translated and culturally adapted into Arabic [ 28 ]. The questionnaire, which assessed the dietary habits of the participants, was adapted from the Survey of Health, Ageing, and Retirement in Europe (SHARE), which has been demonstrated to be a valid and reliable tool for assessing diet [ 30 ]. The questionnaire focused on the consumption of dairy products, legumes, eggs, meat, and fruit and vegetables.

Self-rated health

A set of valid and reliable questions adapted from Idler et al. (1997) was used to assess participants’ self-rated health by asking them to rate their health status using the following questions: (1) In general, would you say your health is: excellent, very good, good, fair, or poor? (2) Compared to one year ago, how would you rate your health in general now: much better now than one year ago, somewhat better now than one year ago, about the same, somewhat worse now than one year ago, or much worse now than one year ago? [ 31 , 32 ]

Smoking habits

Self-report questions on smoking behavior were adapted from the UK Biobank questionnaire and used to assess participants’ present and past smoking habits, including the age at which they began smoking, the number of cigarettes smoked per day, the type of tobacco used, the duration of smoking, and, among former smokers, the age when smoking ceased [ 33 ].

Physical behaviours

Physical behaviors such as physical activity, sedentary behavior, and sleep were measured by using (1) self-reported and (2) device-based measures:

Self-report measures

Physical activity was measured on a self-report basis via the Global Physical Activity Questionnaire (GPAQ) which was translated into Arabic and previously validated [ 34 ]. In addition, the Sedentary Behavior Questionnaire (SBQ), which had already been translated into Arabic [ 28 ], was used to subjectively assess participants’ sedentary behavior time [ 35 ]. Lastly, the Pittsburgh Sleep Quality Index was used to assess sleep quality and sleep disturbances over a one-month period [ 36 ].

Device-based measures

Physical behaviors were measured using a thigh-worn accelerometer (ActivPAL™ Micro4; PAL Technologies, Glasgow, Scotland) that participants wore continuously, 24 h a day, for seven full days [ 37 ]. The ActivPAL™ device was sealed in a nitrile sleeve and attached to the front of the right mid-thigh, on the muscle belly, with a medical waterproof 3M Tegaderm transparent dressing by a well-trained member of the research team. The ActivPAL™ monitor is a valid and reliable measure of time spent walking [ 38 ], sitting, and standing in healthy adults [ 39 ]. In addition, participants were asked to fill in a recording sheet that included a sleep diary (times that the participant went to and got out of bed) as well as the dates and times when the accelerometer fell off or was removed.
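
Since adherence is later reported as the number of days with valid accelerometry data, a minimal sketch of how valid wear days might be counted is shown below. The ≥ 20 h/day validity threshold is an assumption for illustration; the paper does not state its exact criterion, and the actual processing was done with the ActiPASS software described under Statistical analysis.

```python
# The >= 20 h/day wear threshold is an assumption for illustration;
# the paper does not report its exact validity criterion.
VALID_WEAR_HOURS_PER_DAY = 20.0

def count_valid_days(wear_hours_by_day: list[float]) -> int:
    return sum(1 for h in wear_hours_by_day if h >= VALID_WEAR_HOURS_PER_DAY)

# Invented per-day wear totals for one participant; the device came off on day 5.
week = [24.0, 24.0, 23.5, 24.0, 12.0, 24.0, 21.0]
print(count_valid_days(week))  # 6 valid days out of 7
```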

Physical function

Physical function was objectively measured using a digital handgrip strength dynamometer (Takei Hand Grip Dynamometer 5401-C, Japan) via three successive handgrip assessments for each hand (left and right); the mean value for each hand was then recorded. The instrument can measure handgrip values from 5 to 100 kg, with a minimum unit of measurement of 0.1 kg. The tool is a good predictor of health outcomes [ 40 , 41 ].

Data collection evaluation of feasibility

Overall, the study evaluated feasibility in two main stages: feedback from the first six participants was used to resolve any unforeseen issues in the protocol before it was implemented with the remaining participants. Any changes to the procedure were documented.

The current study evaluated the feasibility of Saudi adults’ participation based on the following constructs: (1) recruitment capability, (2) acceptability and suitability of study procedures, and (3) resources and ability to manage and implement the study. Table  1 outlines the feasibility constructs, measures, outcome definitions, and methods employed. In evaluating feasibility, the current study followed the recommendations for a feasibility study reported by Orsmond and Cohn (2015) [ 42 ].

Overall, the study collected data on the feasibility constructs by tracking registration, equipment availability, time spent on various tasks (for example, training researchers and attaching the sensor), and completion rates (such as diary entries, questionnaire entries, and the number of days with accelerometer data); through personal contacts (for information on barriers to and facilitators of participation); by processing sensor data; and through interviews after the measurement (for example, obtaining information on potential issues during measurement and willingness to participate again).

Participant interviews after measurement

After the completion of the study, face-to-face semi-structured interviews were conducted with all participants who had completed the 7-day study period. The aim of these interviews was to collect comprehensive feedback on participants’ experiences with the study protocol and to capture additional insights that were not captured by other feasibility measures, such as motivations for joining the study, expectations prior to participation, and levels of satisfaction with the study procedures. A detailed interview guide is described in Appendix A [ 28 , 43 , 44 ].

Statistical analysis

Descriptive analyses summarized participants’ demographics, anthropometric measurements, health status, clinical measurements, physical behavior characteristics, and interview responses. Continuous variables were characterized using means ± standard deviations (SD), while categorical variables were presented as frequencies with percentages (%). The recruitment rate was calculated as the number of participants who participated and signed the consent form divided by the total number of participants who registered for the study (see Fig.  2 ). Additional analyses were performed to compare participants who reported a burden of participation with those who reported none (see supplementary materials); t-tests and chi-square tests were employed for this comparison. IBM’s Statistical Package for the Social Sciences (SPSS; version 27, SPSS Inc., Chicago, Illinois) was used to conduct these statistical analyses. The raw ActivPAL data were analyzed using the ActiPASS software (ActiPASS © 2021, Uppsala University, Sweden).
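
A minimal sketch of the two simple summaries described above (recruitment rate and mean ± SD), using the study’s reported recruitment counts. The age values are invented for illustration, and the authors’ actual analyses were run in SPSS.

```python
import statistics

def recruitment_rate(consented: int, registered: int) -> float:
    # Participants who attended and signed the consent form, divided by
    # everyone who registered interest in the study.
    return consented / registered

print(f"{recruitment_rate(48, 75):.0%}")  # 64%, as reported

# Continuous variables were summarized as mean +/- SD, e.g. for ages
# (illustrative values, not study data):
ages = [31, 35, 32, 44, 38, 40, 33]
print(f"{statistics.mean(ages):.1f} ± {statistics.stdev(ages):.1f} years")
```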

Figure 2. Recruitment and study participant flow diagram.

Results

A total of 75 participants initially volunteered to participate. Ten were excluded from the study because they did not meet the inclusion criteria ( n  = 8) or could not be contacted ( n  = 2). In addition, 11 participants withdrew their interest in participating for various reasons: (1) excessive distance between the study location (LHRC) and their residence ( n  = 3), (2) hesitancy about joining the study ( n  = 1), (3) belief that the ActivPAL™ device would interfere with their health ( n  = 1), (4) belief that the ActivPAL™ device would interfere with their regular exercise routine ( n  = 2), (5) family and work commitments ( n  = 3), and (6) unsuitable timing ( n  = 1). Of the 54 participants who had agreed to participate in the study, 48 participants from Riyadh, Saudi Arabia, attended and completed the consent form. However, four of those participants provided incomplete data (i.e., they completed the questionnaires only and did not wear an ActivPAL™ device). Therefore, a total of 44 participants out of 75 potential participants (59%) successfully completed the study (wore an ActivPAL™ device and completed all questionnaires). See Fig.  2 for the study’s recruitment flow.

Participants

Of the 48 participants, nearly half were female (47.9%). On average, the participants were 37 ± 7.3 years old, had a BMI of 28.3 ± 5.6, and had a waist circumference of 86.9 ± 16.4 cm. Most participants were married, had college degrees, were employed as office workers and professionals, had never smoked, and did not use any medication (see Table  2 ). A total of 87.5% of participants had a family history of disease; 85.4%, 95.8%, and 89.6% reported having no difficulty walking 2 km, 500 m, and up one flight of stairs, respectively. Approximately 48% of participants rated their health as very good, while 39.6% reported their health as about the same compared to one year ago. In terms of dietary habits, nearly half the participants reported consuming dairy products every day, 25% consumed legumes and eggs 3 to 6 times a week, 56.3% consumed meat every day, and 45.8% consumed fruits and vegetables 3 to 6 times a week.

Table  3 presents the primary variables of the study, including average systolic, diastolic, and mean arterial pressure values of 121.13 ± 11.81 mmHg, 79.26 ± 8.92 mmHg, and 93.15 ± 9.20 mmHg, respectively. The mean resting heart rate was 74.3 ± 12.66 beats/min. Furthermore, the non-fasting blood profile of the sample showed the following values: total cholesterol: 177.89 ± 33.79 mg/dL; HDL-cholesterol: 50.96 ± 13.02 mg/dL; triglycerides: 123.94 ± 68.92 mg/dL; LDL-cholesterol: 103 ± 29.89 mg/dL; TC/HDL-cholesterol ratio: 3.71 ± 1.11; LDL/HDL-cholesterol ratio: 2.19 ± 0.81; non-HDL-cholesterol: 127.06 ± 33.51 mg/dL; and non-fasting glucose: 102.98 ± 35.36 mg/dL. Table  3 also provides an overview of the participants’ physical activity-related behaviors.

Feasibility evaluation

The following results highlight the approaches taken by the current study to assess the feasibility of baseline data collection using ProPASS methodology specifically in the context of Saudi Arabia.

The evaluation of the feasibility of the study protocol was conducted in two stages, initially involving six participants whose feedback was used to refine and improve the protocol implementation for the remaining participants. Of the six selected participants, three were female. In the pre-evaluation, only two minor issues were encountered: (1) accessing the lab outside of working hours (16:00–22:00), as most participants were unable to attend during the day (07:00–16:00) due to work commitments; this issue was resolved for all subsequent data collection points by obtaining approval for extended lab hours; and (2) obtaining the required number of ActivPAL™ devices from the technical coordinator, due to miscommunication and high demand from other researchers. To prevent further issues, the author obtained 30 devices in advance for the feasibility evaluation.

Recruitment capability

The recruitment rate was used to measure the feasibility of the recruitment methodology for collecting baseline ProPASS data; the results showed that 64% ( n  = 48) of registered participants signed the consent form and attended the LHRC lab (see Fig.  2 ). After screening against the eligibility criteria, 65 of the 75 registered participants met the study criteria, and 11 of these withdrew for the reasons detailed in Fig.  2 . As Fig.  2 illustrates, although 54 participants scheduled an appointment for the study, only 48 (64%) attended and signed the consent form. In the final stage of the recruitment process, around 59% ( n  = 44) of participants completed all the required measurements for the study.

Acceptability and suitability of study procedures

Adherence (i.e., the extent to which participants followed the outlined procedures, measured as the number of days with valid accelerometry data) averaged 5.7 days. Furthermore, participants provided sleep diary entries for 85.4% of days. All questionnaires were completed, with a 100% response rate.

To assess the study’s time demands on participants, the time needed to complete all measurements was recorded: the mean was 25 min (23 min to complete the questionnaires and two minutes to attach the sensor). Additionally, the completion rate for registered participants who completed all the required measurements (i.e., accelerometer measurement, diary registration, and questionnaires) was 91.6% (see Table  4 ).

Resources and ability

The final feasibility outcomes (i.e., having the required resources and ability to manage and implement the study) are presented in Table  5 . This objective was assessed across four domains: skin irritation, equipment availability, training requirements, and accelerometer loss (see Table  5 ). The first domain revealed that three participants experienced skin irritation during the study; of these, two had mild symptoms, such as itchiness and discomfort, that lasted for the first three days but did not lead to their withdrawal from the study. However, one participant reported moderate irritation resulting in red skin, which required them to withdraw from the study. The second domain, equipment availability, indicated that all the necessary equipment was available 100% of the time. In the third domain, training requirements, the researchers required four hours of training on the study protocol and equipment. Finally, in the accelerometer loss domain, four of the 30 devices failed to generate data for the full seven days.

Participant interview after measurement

After completing the study, all participants were interviewed around five primary themes: (1) motivation and expectations of participation, (2) participant satisfaction, (3) the burden of participation, (4) willingness to participate again, and (5) perception of time usage (see Fig.  3 ).

Figure 3. Interview outcomes of participants’ experience with the study protocol.

To determine the participants’ motivations for and expectations about joining the study, they were asked: What made you want to join this study? The results showed that 90% of participants were interested in learning about their physical behaviors and health status, 43% participated to support the researcher, and 14% reported that the final report attracted them to participate (see Fig.  3 a and the example final report in the supplementary material). Participant satisfaction was assessed via two questions: (1) What was your overall experience of participating in the study? and (2) Was it as you expected? The findings indicated that 62% of participants were satisfied and found the study as expected, 33% were more satisfied than expected, and 5% were unsatisfied and found the study below their expectations (see Fig.  3 b).

Regarding the overall burden of participation, 76% of participants reported that it was no burden , 5% reported that it was a burden , and 14% believed it was somewhat burdensome (see Fig.  3 c). Additionally, 79% of participants expressed their willingness to participate again in the future (see Fig.  3 d). Finally, regarding time usage, 67% of participants found it easy to complete the seven-day study without any concerns (see Fig.  3 h).

Discussion

The feasibility of the baseline ProPASS data collection methodology was evaluated among the Saudi adults who participated in this study. The findings revealed that the methodology was both feasible and acceptable, paving the way for large-scale prospective cohort research in Saudi Arabia. This research marks the first attempt to establish a prospective cohort study in Saudi Arabia using established ProPASS methods and protocols [ 13 , 15 ]. Conducting such a cohort study in Saudi Arabia is crucial given the country’s high prevalence of non-communicable diseases, which are largely attributable to poor physical behaviors (e.g., lack of physical activity, sedentary behavior, and poor sleep) [ 7 ] amid recent enormous economic growth accompanied by technological transformation and urbanization [ 11 ].

The first aspect of feasibility evaluated for the baseline ProPASS data collection methodology was the capability to recruit participants. The findings indicated a recruitment rate of 64%, which is similar to prior studies [ 46 , 47 ]. One study indicated that a recruitment rate of at least 20–40% is required for a study to be deemed feasible [ 48 ]. Thus, the recruitment rate in the current study seems acceptable for creating a future cohort using ProPASS methods in Saudi Arabia. Additionally, in the current study, the refusal rate was only 15%, which is considerably lower than in previous studies [ 45 , 49 ], where refusal rates ranged from 50 to 66%. One reason for the low refusal rate is that the recruitment material was specifically designed to motivate Saudi participants to join the study by indicating that the study would provide data and insight into their current state of health. For example, the results of the semi-structured interviews illustrated that 90% of participants joined the study because they wanted to know about their physical behaviors and health status (see Fig.  3 ). This result also indicates that our recruitment material might be suitable for ensuring high participation in the future cohort study.

The second aspect evaluated was the acceptability and suitability of the study procedures. Previous studies have shown that recording accelerometer data for 3–5 days is necessary to obtain reliable estimates of adults’ habitual physical activity and to gather valid data for analysis [ 50 , 51 ]. A recent study indicated that distributing accelerometers in person was associated with a high proportion of participants consenting to wear an accelerometer and meeting minimum wear criteria [ 21 ]. Our study collected an average of six days of valid data, which was sufficient to obtain representative descriptions of the participants’ physical behaviors [ 52 ]. There were generally high adherence rates for participant diary entries, questionnaire completion, and adherence to the study protocol, indicating that the ProPASS methods could feasibly be implemented with a larger study population. The study also assessed the time commitment necessary to complete the questionnaires and attach the ActivPAL™ devices to participants’ thighs. Completing the questionnaires took approximately 23 min (SD = 8). Prior studies have indicated that shorter questionnaires (e.g., 20 min) yield a higher response rate from participants, a finding consistent with our study [ 53 , 54 ]. Additionally, attaching the sensor to the participant’s thigh took about two minutes. These findings indicate that participation in this study was not burdensome, which was confirmed by the interviews showing that 95% of participants felt that participating (i.e., filling out all questionnaires and wearing the ActivPAL™ device for 7 days) was not a burden. Overall, the ProPASS methods appear to be minimally burdensome, well suited, and readily accepted by participants.

The third aspect evaluated was the availability of resources and the ability to manage and execute the study. As we aim to create a new cohort adhering to global (ProPASS) standards, protocol training was vital to obtaining quality outcomes as per the ProPASS protocol. The protocol training took around four hours, which was similar to a prior study [ 45 ]. In terms of the availability of resources, all essential equipment was always accessible. The study also considered skin irritation as an important factor. One study noted that 38% of participants stopped using the ActivPAL™ due to skin irritation from PALstickies or Tegaderm dressings [ 55 ]; another reported one discontinuation due to irritation associated with a Tegaderm dressing [ 56 ]. In the current study, there were three reported irritations, two involving mild initial discomfort that eventually subsided. One participant left the study due to moderate irritation. Nonetheless, it is important to note that the data collection occurred during the colder winter period (average 20 degrees Celsius). Instances of skin irritation could be more pronounced during Saudi Arabia’s hot summer season, characterized by temperatures of approximately 40 degrees Celsius. Future studies should investigate the feasibility of using devices and tape suitable for summer temperatures. In addition, the current study had a low accelerometer failure rate: only four accelerometers failed to record, which is similar to previous studies [ 57 , 58 ]. All ActivPAL™ devices were returned at the end of the study during visit two, supporting the suitability of the ProPASS method for future cohorts in Saudi Arabia.

Strengths and limitations of the study

This study is the first of its kind to utilize device-based measures for assessing physical behaviors among adults in Saudi Arabia. Device-based measures have been shown to provide useful information about physical behaviors compared with self-report questionnaires [ 16 ]. Furthermore, it marks the first examination of the ProPASS consortium methods in the Middle East, particularly in Saudi Arabia. Nevertheless, the current study has certain limitations, including the recruitment of relatively young participants, presumably without any medical conditions and with postgraduate qualifications; this may limit the generalizability of the findings to the entire population. The acceptability of the study in other age groups and among individuals with lower educational backgrounds is yet to be studied. In addition, the study was conducted during winter, which might have influenced the observed levels of physical behaviors in our sample; similarly, the study was unable to evaluate the feasibility of utilizing 3M Tegaderm dressings in hot summer months. Lastly, it is important to note that our study employed a relatively small sample size; nonetheless, this size is considered acceptable for feasibility studies.

Conclusions

The baseline ProPASS data collection methodology and protocol for a future cohort study are both feasible and acceptable for implementation within the context of Saudi Arabia. This feasibility study represents the first step toward establishing a prospective ProPASS cohort study to examine the associations between physical behaviors and cardiometabolic health among Saudi Arabian adults.

Availability of data and materials

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Abbreviations

ProPASS: The Prospective Physical Activity, Sitting and Sleep consortium

Physical activity, sedentary behavior, and sleep

Bull FC, Al-Ansari SS, Biddle S, Borodulin K, Buman MP, Cardon G, et al. World health organization 2020 guidelines on physical activity and sedentary behaviour. Br J Sports Med. 2020;54(24):1451–62.


Chrysant SG, Chrysant GS. Association of physical activity and trajectories of physical activity with cardiovascular disease. Expert Rev Cardiovasc Ther. 2023;0(0):1–10.


Falck RS, Davis JC, Li L, Stamatakis E, Liu-Ambrose T. Preventing the ‘24-hour Babel’: the need for a consensus on a consistent terminology scheme for physical activity, sedentary behaviour and sleep. Br J Sports Med. 2022;56(7):367–8.

Guthold R, Stevens GA, Riley LM, Bull FC. Worldwide trends in insufficient physical activity from 2001 to 2016: a pooled analysis of 358 population-based surveys with 1·9 million participants. Lancet Global Health. 2018;6(10):e1077-1086.

Evenson KR, Alhusseini N, Moore CC, Hamza MM, Al-Qunaibet A, Rakic S, et al. Scoping review of Population-based physical activity and sedentary behavior in Saudi Arabia. J Phys Activity Health. 2023;20(6):471–86.


Ahmed AE, Al-Jahdali F, AlALwan A, Abuabat F, Salih SB, Al-Harbi A, et al. Prevalence of sleep duration among Saudi adults. Saudi Med J. 2017;38(3):276–83.


World Health Organization. Noncommunicable Diseases Progress Monitor 2022. 2022. Available from: https://www.who.int/publications-detail-redirect/9789240047761 . Cited 2023 Jun 22.

Besson H, Brage S, Jakes RW, Ekelund U, Wareham NJ. Estimating physical activity energy expenditure, sedentary time, and physical activity intensity by self-report in adults. Am J Clin Nutr. 2010;91(1):106–14.


Cerin E, Cain KL, Oyeyemi AL, Conway TL, Cochrane T, et al. Correlates of agreement between accelerometry and self-reported physical activity. Med Sci Sports Exerc. 2016;48(6):1075–84.

Klesges RC, Eck LH, Mellon MW, Fulliton W, Somes GW, Hanson CL. The accuracy of self-reports of physical activity. Med Sci Sports Exerc. 1990;22(5):690–7.

Al-Hazzaa HM. Physical inactivity in Saudi Arabia revisited: a systematic review of inactivity prevalence and perceived barriers to active living. Int J Health Sci (Qassim). 2018;12(6):50–64.


DiPietro L, Al-Ansari SS, Biddle SJH, Borodulin K, Bull FC, Buman MP, et al. Advancing the global physical activity agenda: recommendations for future research by the 2020 WHO physical activity and sedentary behavior guidelines development group. Int J Behav Nutr Phys Act. 2020;17(1):143.

Stamatakis E, Koster A, Hamer M, Rangul V, Lee IM, Bauman AE, et al. Emerging collaborative research platforms for the next generation of physical activity, sleep and exercise medicine guidelines: the prospective physical activity, sitting, and Sleep consortium (ProPASS). Br J Sports Med. 2020;54(8):435–7.

The Prospective Physical Activity, Sitting and Sleep consortium (ProPASS). 2022. Available from: https://www.propassconsortium.org . Cited 2022 May 20.

Wei L, Ahmadi MN, Chan HW, Chastin S, Hamer M, Mishra GD, et al. Association between device-measured stepping behaviors and cardiometabolic health markers in middle-aged women: the Australian longitudinal study on women’s Health. Scand J Med Sci Sports. 2023;33(8):1384–98.

Ahmadi MN, Blodgett JM, Atkin AJ, Chan HW, del Pozo Cruz B, Suorsa K, et al. Device-measured physical activity type, posture, and cardiometabolic health markers: pooled dose-response associations from the ProPASS Consortium. medRxiv. 2023:2023.07.31.23293468. Available from: https://www.medrxiv.org/content/10.1101/2023.07.31.23293468v1 . Cited 2023 Aug 28.

Blodgett JM, Ahmadi MN, Atkin AJ, Chastin S, Chan HW, Suorsa K, et al. Device measured sedentary behaviour, sleep, light and moderate-vigorous physical activity and cardio-metabolic health: A compositional individual participant data analysis in the ProPASS consortium. medRxiv. 2023:2023.08.01.23293499. Available from: https://www.medrxiv.org/content/10.1101/2023.08.01.23293499v1 . Cited 2023 Aug 28.

Inan-Eroglu E, Huang BH, Shepherd L, Pearson N, Koster A, Palm P, et al. Comparison of a thigh-worn accelerometer algorithm with diary estimates of time in bed and time asleep: the 1970 British cohort study. J Meas Phys Behav. 2021;4(1):60–7.

Lancaster GA, Dodd S, Williamson PR. Design and analysis of pilot studies: recommendations for good practice. J Eval Clin Pract. 2004;10(2):307–12.

Thabane L, Ma J, Chu R, Cheng J, Ismaila A, Rios LP, et al. A tutorial on pilot studies: the what, why and how. BMC Med Res Methodol. 2010;10(1):1.

Pulsford RM, Brocklebank L, Fenton SAM, Bakker E, Mielke GI, Tsai LT, et al. The impact of selected methodological factors on data collection outcomes in observational studies of device-measured physical behaviour in adults: a systematic review. Int J Behav Nutr Phys Act. 2023;20(1):26.

Ma WY, Yang CY, Shih SR, Hsieh HJ, Hung CS, Chiu FC, et al. Measurement of Waist circumference. Diabetes Care. 2013;36(6):1660–6.


Berenson GS, Srinivasan SR, Bao W, Newman WP, Tracy RE, Wattigney WA. Association between multiple cardiovascular risk factors and atherosclerosis in children and young adults. The Bogalusa heart study. N Engl J Med. 1998;338(23):1650–6.

Alpert BS, Quinn D, Kinsley M, Whitaker T, John TT. Accurate blood pressure during patient arm movement: the Welch allyn connex spot monitor’s SureBP algorithm. Blood Press Monit. 2019;24(1):42–4.

The Sixth Report of the Joint National Committee on Prevention, Detection, Evaluation, and Treatment of High Blood Pressure. Arch Intern Med. 1997;157(21):2413–46.

Panz VR, Raal FJ, Paiker J, Immelman R, Miles H. Performance of the CardioChek PA and Cholestech LDX point-of-care analysers compared to clinical diagnostic laboratory methods for the measurement of lipids. Cardiovasc J S Afr. 2005;16(2):112–7.


PTS Diagnostics. CardioChek PA Analyzer. PTS Diagnostics. 2022. Available from: https://ptsdiagnostics.com/cardiochek-pa-analyzer/ . Cited 2022 Feb 26.

Alaqil AI, Gupta N, Alothman SA, Al-Hazzaa HM, Stamatakis E, del Pozo Cruz B. Arabic translation and cultural adaptation of sedentary behavior, dietary habits, and preclinical mobility limitation questionnaires: a cognitive interview study. PLOS One. 2023;18(6):e0286375.

Mänty M, Heinonen A, Leinonen R, Törmäkangas T, Sakari-Rantala R, Hirvensalo M, et al. Construct and predictive validity of a self-reported measure of preclinical mobility limitation. Arch Phys Med Rehabil. 2007;88(9):1108–13.

Börsch-Supan A, Brandt M, Hunkler C, Kneip T, Korbmacher J, Malter F, et al. Data Resource Profile: the Survey of Health, Ageing and Retirement in Europe (SHARE). Int J Epidemiol. 2013;42(4):992–1001.

Idler EL, Benyamini Y. Self-rated health and mortality: a review of twenty-seven community studies. J Health Soc Behav. 1997;38(1):21–37.

Lundberg O, Manderbacka K. Assessing reliability of a measure of self-rated health. Scand J Soc Med. 1996;24(3):218–24.

Peters SAE, Huxley RR, Woodward M. Do smoking habits differ between women and men in contemporary western populations? Evidence from half a million people in the UK Biobank study. BMJ Open. 2014;4(12):e005663.

Doyle C, Khan A, Burton N. Reliability and validity of a self-administered Arabic version of the global physical activity questionnaire (GPAQ-A). J Sports Med Phys Fit. 2019;59(7):1221–8.

Rosenberg DE, Norman GJ, Wagner N, Patrick K, Calfas KJ, Sallis JF. Reliability and validity of the sedentary behavior questionnaire (SBQ) for adults. J Phys Act Health. 2010;7(6):697–705.

Buysse DJ, Reynolds CF, Monk TH, Berman SR, Kupfer DJ. The Pittsburgh sleep quality index: a new instrument for psychiatric practice and research. Psychiatry Res. 1989;28(2):193–213.

Crowley P, Skotte J, Stamatakis E, Hamer M, Aadahl M, Stevens ML, et al. Comparison of physical behavior estimates from three different thigh-worn accelerometers brands: a proof-of-concept for the prospective physical activity, sitting, and Sleep consortium (ProPASS). Int J Behav Nutr Phys Act. 2019;16(1):65.

Ryan CG, Grant PM, Tigbe WW, Granat MH. The validity and reliability of a novel activity monitor as a measure of walking. Br J Sports Med. 2006;40(9):779–84.

Kozey-Keadle S, Libertine A, Lyden K, Staudenmayer J, Freedson PS. Validation of wearable monitors for assessing sedentary behavior. Med Sci Sports Exerc. 2011;43(8):1561–7.

Altankhuyag I, Byambaa A, Tuvshinjargal A, Bayarmunkh A, Jadamba T, Dagvajantsan B, et al. Association between hand-grip strength and risk of stroke among Mongolian adults: results from a population-based study. Neurosci Res Notes. 2021;4(3Suppl):8–16.

Bohannon RW. Hand-grip dynamometry predicts future outcomes in aging adults. J Geriatr Phys Ther. 2008;31(1):3–10.

Garcia L, Ferguson SE, Facio L, Schary D, Guenther CH. Assessment of well-being using fitbit technology in college students, faculty and staff completing breathing meditation during COVID-19: a pilot study. Mental Health Prev. 2023;30:200280.

Al-Hazzaa HM, Alothman SA, Albawardi NM, Alghannam AF, Almasud AA. An Arabic sedentary behaviors questionnaire (ASBQ): development, content validation, and pre-testing findings. Behav Sci. 2022;12(6):183.

Orsmond GI, Cohn ES. The distinctive features of a feasibility study: objectives and guiding questions. OTJR. 2015;35(3):169–77. https://doi.org/10.1177/1539449215578649 . (Cited 2022 Aug 4).

Marmash D, Ha K, Sakaki JR, Hair R, Morales E, Duffy VB, et al. A feasibility and pilot study of a personalized nutrition intervention in mobile food pantry users in Northeastern Connecticut. Nutrients. 2021;13(9):2939.

Ouchi K, Lee RS, Block SD, Aaronson EL, Hasdianda MA, Wang W, Rossmassler S, Palan Lopez R, Berry D, Sudore R, Schonberg MA, Tulsky JA. An emergency department nurse led intervention to facilitate serious illness conversations among seriously ill older adults: A feasibility study. Palliat Med. 2023;37(5):730–9. https://doi.org/10.1177/02692163221136641 .

Bajwah S, Ross JR, Wells AU, Mohammed K, Oyebode C, Birring SS, et al. Palliative care for patients with advanced fibrotic lung disease: a randomised controlled phase II and feasibility trial of a community case conference intervention. Thorax. 2015;70(9):830–9.

Mosadeghi S, Reid MW, Martinez B, Rosen BT, Spiegel BMR. Feasibility of an immersive virtual reality intervention for hospitalized patients: an observational cohort study. JMIR Mental Health. 2016;3(2):e5801.

Papatzikis E, Elhalik M, Inocencio SAM, Agapaki M, Selvan RN, Muhammed FS, et al. Key challenges and future directions when running auditory Brainstem Response (ABR) Research Protocols with newborns: a Music and Language EEG Feasibility Study. Brain Sci. 2021;11(12):1562.

Trost SG, Mciver KL, Pate RR. Conducting accelerometer-based activity assessments in field-based research. Med Sci Sports Exerc. 2005;37(11):S531-543.

Wagnild JM, Hinshaw K, Pollard TM. Associations of sedentary time and self-reported television time during pregnancy with incident gestational diabetes and plasma glucose levels in women at risk of gestational diabetes in the UK. BMC Public Health. 2019;19(1):575.

Ham SA, Ainsworth BE. Disparities in data on healthy people 2010 physical activity objectives collected by accelerometry and self-report. Am J Public Health. 2010;100(S1):S263-268.

Marcus B, Bosnjak M, Lindner S, Pilischenko S, Schütz A. Compensating for low topic interest and long surveys: a field experiment on nonresponse in web surveys. Social Sci Comput Rev. 2007;25(3):372–83.

Sharma H. How short or long should be a questionnaire for any research? Researchers dilemma in deciding the appropriate questionnaire length. Saudi J Anaesth. 2022;16(1):65–8.

De Decker E, De Craemer M, Santos-Lozano A, Van Cauwenberghe E, De Bourdeaudhuij I, Cardon G. Validity of the ActivPAL ™ and the ActiGraph monitors in preschoolers. Med Sci Sports Exerc. 2013;45(10):2002.

Aguilar-Farias N, Martino-Fuentealba P, Chandia-Poblete D. Cultural adaptation, translation and validation of the Spanish version of past-day adults’ sedentary time. BMC Public Health. 2021;21(1):182.

Reid RER, Carver TE, Andersen KM, Court O, Andersen RE. Physical activity and sedentary behavior in bariatric patients long-term post-surgery. Obes Surg. 2015;25(6):1073–7.

Reid RER, Carver TE, Reid TGR, Picard-Turcot MA, Andersen KM, Christou NV, et al. Effects of neighborhood walkability on physical activity and sedentary behavior long-term post-bariatric surgery. Obes Surg. 2017;27(6):1589–94.


Acknowledgements

The authors would like to express gratitude to all participants for their involvement in the study. Additionally, we extend our appreciation to the research assistants (Rasil Alhadi, Ragad Alasiri, and Khalid Aldosari) who assisted in the data collection. Finally, we would like to thank the LHRC, Princess Nourah Bint Abdulrahman University for providing their site for collecting the data.

Funding

This research was supported by the Deanship of Scientific Research, Vice Presidency for Graduate Studies and Scientific Research, King Faisal University, Saudi Arabia [Project No. GrantA353]. The funders had no role in study design, data collection and analysis, the decision to publish, or the preparation of the manuscript.

Author information

Authors and Affiliations

Department of Physical Education, College of Education, King Faisal University, Al-Ahsa, 31982, Saudi Arabia

Abdulrahman I. Alaqil

Center for Active and Healthy Ageing (CAHA), Department of Sports Science and Clinical Biomechanics, University of Southern Denmark, Odense, 5230, Denmark

Abdulrahman I. Alaqil, Borja del Pozo Cruz & Paolo Caserotti

Department of Musculoskeletal Disorders and Physical Workload, National Research Centre for the Working Environment, Lersø Parkalle 105, Copenhagen, 2100, Denmark

Abdulrahman I. Alaqil, Andreas Holtermann & Nidhi Gupta

Faculty of Education, Department of Physical Education, University of Cádiz, Cádiz, Spain

Borja del Pozo Cruz

Biomedical Research and Innovation Institute of Cádiz (INiBICA) Research Unit, University of Cádiz, Cadiz, Spain

Lifestyle and Health Research Center, Health Sciences Research Center, Princess Nourah Bint Abdulrahman University, Riyadh, 11671, Saudi Arabia

Shaima A. Alothman & Hazzaa M. Al-Hazzaa

Mackenzie Wearables Research Hub, Charles Perkins Centre, The University of Sydney, Camperdown, NSW, Australia

Matthew N. Ahmadi & Emmanuel Stamatakis

School of Health Sciences, Faculty of Medicine and Health, The University of Sydney, Camperdown, NSW, Australia

School of Sports Sciences, University of Jordan, Amman, Jordan

Hazzaa M. Al-Hazzaa


Contributions

Conceptualization: AIA, NG, ES, and BdC. Methodology: AIA, NG, ES, HMA, and BdC. Investigation: AIA. Data collection: AIA. Interpretation of the findings: AIA, HMA, ES, NG, AH, PC, MNA, and BdC. Drafting the paper: AIA. Reviewing and editing the draft: AIA, ES, HMA, BdC, SAA, PC, MNA, AH, and NG. All authors critically read and revised the draft for important intellectual content, approved the final version of the manuscript to be published, and agreed to be accountable for all aspects of the work.

Corresponding author

Correspondence to Abdulrahman I. Alaqil .

Ethics declarations

Ethics approval and consent to participate

Ethics approval was obtained from the Institutional Review Board at Princess Nourah Bint Abdul Rahman University, Riyadh, Saudi Arabia (IRB 22–0146). Written informed consent was obtained from all participants. All methods were carried out in accordance with the Declaration of Helsinki.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Material 1. Supplementary Material 2. Supplementary Material 3.


Alaqil, A.I., del Pozo Cruz, B., Alothman, S.A. et al. Feasibility and acceptability of a cohort study baseline data collection of device-measured physical behaviors and cardiometabolic health in Saudi Arabia: expanding the Prospective Physical Activity, Sitting and Sleep consortium (ProPASS) in the Middle East. BMC Public Health 24 , 1379 (2024). https://doi.org/10.1186/s12889-024-18867-2


Received : 12 September 2023

Accepted : 16 May 2024

Published : 22 May 2024

DOI : https://doi.org/10.1186/s12889-024-18867-2


Keywords:

  • Feasibility
  • Epidemiology
  • Physical activity
  • Physical behavior
  • Sedentary behaviors
  • Accelerometry
  • Saudi adults



  18. Data Collection Methods

    Data Collection Methods. Data collection is a process of collecting information from all the relevant sources to find answers to the research problem, test the hypothesis (if you are following deductive approach) and evaluate the outcomes. Data collection methods can be divided into two categories: secondary methods of data collection and ...

  19. (PDF) Data Collection Methods and Tools for Research; A Step-by-Step

    One of the main stages in a research study is data collection that enables the researcher to find answers to research questions. Data collection is the process of collecting data aiming to gain ...

  20. Data Collection Procedure

    The data collection provides the basis for reliability estimations. Thus, a good data collection procedure is crucial to ensure that the reliability estimate is trustworthy. A prediction is never better than the data on which it is based. Thus, it is important to ensure the quality of the data collection. Quality of data collection involves: •

  21. Data Collection

    Data collection is the process of gathering and measuring information on variables of interest, in an established systematic fashion that enables one to answer stated research questions, test hypotheses, and evaluate outcomes. The data collection component of research is common to all fields of study including physical and social sciences, humanities, business, etc.

  22. Qualitative Research: Data Collection, Analysis, and Management

    INTRODUCTION. In an earlier paper, 1 we presented an introduction to using qualitative research methods in pharmacy practice. In this article, we review some principles of the collection, analysis, and management of qualitative data to help pharmacists interested in doing research in their practice to continue their learning in this area.

  23. Data collection in qualitative research

    The three core approaches to data collection in qualitative research—interviews, focus groups and observation—provide researchers with rich and deep insights. All methods require skill on the part of the researcher, and all produce a large amount of raw data. However, with careful and systematic analysis 12 the data yielded with these ...

  24. Developing a Data Management Plan

    A good data management plan begins by understanding the sponsor requirements funding your research. As a principal investigator (PI) it is your responsibility to be knowledgeable of sponsors requirements. The Data Management Plan Tool (DMPTool) has been designed to help PIs adhere to sponsor requirements efficiently and effectively.

  25. Conducting sustainability research in the anthropocene: toward a

    Data collection aims to gather relevant information and answer the research questions and/or hypotheses. Diverse methods and techniques are used to systematically collect, record, organize, examine, and interpret related data and draw meaningful conclusions.

  26. Feasibility and acceptability of a cohort study baseline data

    Physical behaviors such physical activity, sedentary behavior, and sleep are associated with mortality, but there is a lack of epidemiological data and knowledge using device-measured physical behaviors. To assess the feasibility of baseline data collection using the Prospective Physical Activity, Sitting, and Sleep consortium (ProPASS) protocols in the specific context of Saudi Arabia.

  27. Exploring Customers Experience and Satisfaction with Theme Hotels: A

    The process of data collection was conducted by SCTM 3.0 which was developed by Wellness and Tourism Big Data Institute, Kyungsung University ... " and "Disneyland Hotel" + "California" were inserted into the system to collect respectively relative research data. Data Analysis. When dealing with textual reviews in the form of text, it ...

  28. Applied Sciences

    Object detection in computer vision requires a sufficient amount of training data to produce an accurate and general model. However, aerial images are difficult to acquire, so the collection of aerial image datasets is a priority issue. Building on the existing research on image generation, the goal of this work is to create synthetic aerial image datasets that can be used to solve the problem ...

  29. Call for experts: Technical Advisory Group on Violence against Children

    Measurement, methodological considerations and related research on violence against children; Population-based survey methodology and implementation - sampling, ethics, data collection, analyses and reporting, preferably in relation to national or large-scale surveys measuring violence against children.