Oral NSAIDs.
Intra-articular (IA) Injections.
Exercise.
Fifteen studies were conducted in a single country – one in Australia, five in the UK, and nine in the United States of America (USA). One study was conducted across multiple countries – Australia, Canada, the UK, and the USA. Sample sizes for the studies ranged from 11 in the pilot study 31 to 3895 in the multi-site study. 32 Justifications for the sample sizes were based on the study type (eg, whether it was a pilot study or part of a larger trial) and the sampling strategies employed. Most studies recruited patient participants directly from clinical lists using letters, telephone interviews or face-to-face methods. Four studies sampled members of the general population via emails through market research databases to recruit participants who self-identified as living with OA. One study recruited participants from both clinical lists (the patient sample) and a random public sample (identified through random-digit telephone dialling). 23 One study recruited participants from a clinical trial as part of the evaluation 33 (see Table 2 ).
Sampling for All Reviewed Studies
Study | Country | Sample Size | RR | Sampling Method | Inclusion Criteria |
---|---|---|---|---|---|
Al-Omari, (2017) | UK | 11 | 100% | Participants were drawn from members of a Research Users’ Group (RUG). | Had been diagnosed with OA and had reported one or more of hip, knee, hand and foot joint pain in the past 12 months. |
Al-Omari et al (2015) | UK | 11 | 100% | Members of a research users’ group (RUG) in a research centre who have osteoarthritis were contacted by telephone and invited to attend one group session. | Participants who were representative of potential users of the software for discrete choice experiments and shared decision-making regarding OA medication in clinical practice. All participants were diagnosed with osteoarthritis and reported experiencing one or more of hip, knee, hand, or foot joint pain in the past 12 months. |
Al-Omari et al (2017) | UK | 11 | 100% | Random selection from members of a research users’ group (RUG) in a research centre. | Not previously involved in design of the ACBC task; diagnosed with osteoarthritis and reporting one or more of hip, knee, hand, and foot joint pain over the previous 12 months. |
Byrne et al (2006) | USA | Public:193 Patient: 198 | Public: 25% Patient: 28% | Public sample: Random-digit-dialing list of 4000 telephone numbers Patient sample: list of 1286 patients from Kelsey Seybold clinics. | Public sample: Adults living in Houston, age 20 or older Patient sample: Patients treated for knee osteoarthritis, age 55 to 80. |
Chang et al (2005) | Australia, Canada, the UK, and the USA | 3895 | 7.6% of the total invitation | Distributed 57,452 invitations by email using Harris Interactive. Harris Interactive is a website for methods and tools of market research (Harris Interactive, 2010). | Osteoarthritis patients who provided consistent ratings to the benchmark rating scenarios. |
Fraenkel et al (2004 A) | USA | 100 | 84% | Patients were sent a letter describing the study and then contacted by telephone 1 week later. | Osteoarthritis patients having pain in one or both knees on most days of the month and not having rheumatoid arthritis, gout, pseudogout, or bilateral knee replacements. |
Fraenkel et al (2004 B) | USA | 100 | 84% | Patients were sent a letter describing the study and then contacted by telephone 1 week later. | Osteoarthritis patients having pain in one or both knees on most days of the month and not having rheumatoid arthritis, gout, pseudogout, or bilateral knee replacements. |
Fraenkel et al (2004 C) | USA | 100 | 84% | Patients were sent a letter describing the study and then contacted by telephone 1 week later. | Osteoarthritis patients having pain in one or both knees on most days of the month and not having rheumatoid arthritis, gout, pseudogout, or bilateral knee replacements. |
Fraenkel and Fried, (2008) | USA | 90 | 78.9% | A research assistant recruited participants by approaching patients waiting in the primary care waiting room area. | Patients over 60 years of age, reporting pain in one or both knees on most days of the month, able to read and understand English, and able to perform a choice task. |
Fraenkel et al (2014) | USA | 304 | 100% | Convenience sample | Patients attending general medicine and subspecialty outpatient clinics affiliated with a large university medical centre. |
Harris et al (2018) | USA | 404 | 49.5 | Respondents were recruited via e-mail invitation from Harris Interactive’s (Rochester, New York, USA) online chronic-illness panel in the UK. | Participating patients were required to have a self-reported physician’s diagnosis of OA and to be a UK resident aged 45 years or older. |
Hauber et al (2013) | UK | 289 | 98% | Respondents were recruited via e-mail invitation from Harris Interactive’s (Rochester, New York, USA) online chronic-illness panel in the UK. | Participating patients were required to have a self-reported physician’s diagnosis of OA and to be a UK resident aged 45 years or older. |
Laba et al (2013) | Australia | 188 | 37% | A paper-based survey was given to all LEGS (Long-term Evaluation of Glucosamine Sulfate study - a two-year, double-blind, placebo-controlled randomised clinical trial) participants attending their end-of-study visit by a member of the LEGS research team; surveys were mailed to participants who had already completed end-of-study visits. | All LEGS participants completing their end-of-study visit were eligible to participate. |
Moorman et al (2017) | USA | 323 | 81.8% | An email invitation to the survey was sent in June 2016 to a group of Internet panelists in the United States. They were recruited from Research Now, an online sampling and data collection company that provides a nationally representative panel of consumers. | Men and women aged 25 to 80 years; Diagnosed with OA in the knee; Experience pain in the knee of ≥4 on a 0 to 10 scale, where 0 means not at all painful and 10 means extremely painful; Experience knee pain at least once a week; Previously failed nonsurgical treatments for knee OA pain; Pass a security screen; No previous surgical implant involving the knee (ie TKA, UKA). |
Pinto et al (2019) | USA | 150 | 97.3 | Participants were recruited at community senior centers and resource fairs and from general internal medicine clinics at Northwestern Medicine, the Shirley Ryan AbilityLab (formerly the Rehabilitation Institute of Chicago) and via flyers posted on the Northwestern University medical campus, Chicago, USA. | Participants self-reported knee pain, ache or stiffness on most days of at least 1 month during the last year, were at least 45 years old, expressed interest in increasing or maintaining PA, and had no prior history of knee replacement on the side of complaint. Participants underwent a standing, fixed-flexion knee X-ray to identify presence of KOA. |
Ratcliffe et al (2004) | Not reported. Appear to be the UK | 412 | Not reported | The general population sample of respondents aged 55 years and over was identified using a market research database. The respondents answered a recruitment questionnaire over the phone. | Patients living with osteoarthritis over 55 years of age. |
All studies included participants with OA with a mean age of 55 years or more, and reported more females than males. One study included a public sample of people aged 20 and over. 23 One study did not report the gender of its population. 19 The reported response rates (RR) varied from 7.6% 32 to 100% 9 , 31 , 34 , 35 in the included studies; population and sampling features are presented in Table 2 . The methods of data collection also varied, with most studies using either computer-based questionnaires 9 , 31 , 34–40 or online web-based questionnaires 32 , 40 , 41 (see Table 2 ).
A range of CA methods was used in the included studies. One study used Conjoint Value Analysis (CVA), three studies used Choice-Based Conjoint (CBC), three studies used Discrete Choice Experiments (DCE), three studies used Adaptive Choice-Based Conjoint (ACBC), and six studies used Adaptive Conjoint Analysis (ACA) (see Table 3 ). The number of attributes and levels identified in the studies ranged from 4 attributes with 12 levels 35 to 9 attributes with 29 levels 41 (see Table 3 ). The attributes tended to define the features of the OA symptoms, OA treatment such as the benefits and the risks, and cost of treatment (for all attributes and levels of the included studies see appendix 2 ).
The CA Methods’ Characteristics for All Reviewed Studies
Study | CA Method | Attributes/Levels | Scenarios | Statistical Analysis |
---|---|---|---|---|
Al-Omari, (2017) | ACBC | 8/28 | Not reported | Hierarchical Bayes |
Al-Omari et al (2015) | ACBC | 8/28 | Not reported | Not reported |
Al-Omari et al (2017) | ACBC | 8/28 | Variable | Monotone regression |
Byrne et al (2006) | CBC | 6/17 | 36 paired choices divided into 6 sets of 6 paired scenarios and each participant was randomly assigned to one of the 6 sets. | Logistic regression analysis |
Chang et al (2005) | CVA | 6/31 | 25 OA health state–side effect scenarios related to NSAIDs | Multivariable regression analysis |
Fraenkel et al (2004A) | ACA | 7/27 | Not reported | Least squares regression analysis |
Fraenkel et al (2004B) | ACA | 7/27 | Not reported | Least squares regression analysis |
Fraenkel et al (2004C) | ACA | 7/27 | Not reported | Least squares regression analysis |
Fraenkel and Fried, (2008) | ACA | 5/13 | Not reported | Least squares regression analysis |
Fraenkel et al (2014) | CBC | 4/12 | 12 | Hierarchical Bayes (HB) modelling. Subsequently performed Latent Class analysis to examine whether preferences clustered by specific segments. |
Harris et al (2018) | DCE | 5/12 | 72 | Individual pooled aggregate logit (Empirical Bayes & MLE) |
Hauber et al (2013) | DCE | 6/24 | 30, split across 3 questionnaires | Random parameters logit model. All analyses were conducted using NLOGIT 4.0. |
Laba et al (2013) | DCE | 7/20 | 16 | For the choice data, a panel mixed multinomial (random parameters) logit (MMNL) model was used to investigate changes in utility (U) (ie preference to continue taking a medication) when the level of a factor was changed using NLOGIT Version 4.0. |
Moorman et al (2017) | CBC | 9/29 | 12 | A hierarchical Bayesian multinomial logit model was used to generate utilities that accounted for individual preferences. |
Pinto et al (2019) | ACA | 6/18 | On average 35 | The PAPRIKA method was used to estimate ‘Part-worth utilities’ (weights) representing the relative importance of the attributes. |
Ratcliffe et al (2004) | DCE | 5/15 | 16 paired choices divided into 3 sets of 8 paired scenarios and each participant was randomly assigned to one of the 3 sets. | Random effects probit regression model |
In all types of CA, regression analysis techniques are generally used to study patients’ preferences. The type of regression analysis depends on the type of the main outcome under study (eg, binary or continuous). More recent studies have adopted Hierarchical Bayesian (HB) models to investigate participants’ preferences at both the group “average” level and the individual level 31 , 35 , 41 (see Table 3 ).
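As a minimal illustration of how regression recovers preferences from a rating-based conjoint task, the sketch below estimates part-worths for a small, balanced full-factorial design. It relies on a standard property of such designs: under effects coding, the least squares estimate for a level equals the mean rating of the profiles containing that level minus the grand mean. The attributes, levels, and ratings are hypothetical and are not taken from any reviewed study.

```python
from statistics import mean

# Hypothetical ratings (0-10) for a balanced full-factorial design.
# Attribute names, levels, and ratings are made up for illustration.
ratings = {
    ("pill", "low risk"): 8.0,
    ("pill", "high risk"): 3.0,
    ("cream", "low risk"): 9.0,
    ("cream", "high risk"): 4.0,
}
attributes = ["route", "ulcer_risk"]


def part_worths(ratings, attributes):
    """Effects-coded least squares part-worths for a balanced design.

    Each level's estimate is the mean rating of the profiles that
    contain the level, minus the grand mean of all ratings.
    """
    grand_mean = mean(ratings.values())
    result = {}
    for position, attribute in enumerate(attributes):
        levels = sorted({profile[position] for profile in ratings})
        result[attribute] = {
            level: mean(
                rating
                for profile, rating in ratings.items()
                if profile[position] == level
            ) - grand_mean
            for level in levels
        }
    return result


worths = part_worths(ratings, attributes)
```

Choice-based designs with binary outcomes would instead call for logit or probit models, and individual-level estimation would call for HB methods, neither of which this sketch attempts.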
The review included studies investigating pharmaceutical, non-pharmaceutical, and surgical treatment for OA (see Table 1 ).
The majority of studies investigated the side effects and other features of nonsteroidal anti-inflammatory drugs (NSAIDs) and other medications such as disease-modifying drugs and supplements (glucosamine) on patients’ preferences for treatment of OA. 9 , 32 , 33 , 35–39 , 41–43
The risks of side effects, both rare and common, were rated as more important than the benefits associated with the treatment, time to benefit, out-of-pocket monthly cost, route of administration, and the product label. 36–38 One study found that the most important attribute was the route of administration (cream, pills, injections into the knee, and exercise), with a relative importance of 24%, followed by the risk of dyspepsia and the risk of bleeding ulcer; the least important attributes were decrease in pain and improved strength (relative importance of approximately 14%). 42 Similarly, a study investigating the long-term evaluation of glucosamine sulphate found that the most important attributes were the side effects of high blood pressure and heart/liver/kidney problems, followed by cost. 33 The authors concluded that, in their study, preferences to continue with OA treatments were influenced first and foremost by side effects, and that treatment efficacy did not significantly influence patient choice. 33 Again, a study 31 investigating 8 medication attributes found that the risks of side effects were the most important (combined, they accounted for 66% of the treatment decision), whereas the effectiveness of the medication accounted for only 8% of the treatment decision.
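The relative importance figures quoted above are percentages that sum to 100 across the attributes of a study. A common convention, assumed here rather than confirmed for each reviewed study, is to divide the range of an attribute's part-worth utilities by the sum of the ranges over all attributes. The sketch below applies that convention to made-up part-worths; the attribute names and values are hypothetical.

```python
# Hypothetical part-worth utilities per attribute level (made-up numbers).
part_worths = {
    "route of administration": {"pill": -0.5, "cream": 0.5},
    "risk of bleeding ulcer": {"low": 2.5, "high": -2.5},
    "decrease in pain": {"small": -1.0, "large": 1.0},
}


def relative_importance(part_worths):
    """Return each attribute's share (%) of the summed part-worth ranges."""
    ranges = {
        attribute: max(levels.values()) - min(levels.values())
        for attribute, levels in part_worths.items()
    }
    total = sum(ranges.values())
    return {
        attribute: 100.0 * attribute_range / total
        for attribute, attribute_range in ranges.items()
    }


importance = relative_importance(part_worths)
```

Because the measure depends on the range of levels offered, an attribute can look more or less important simply because its levels were set further apart or closer together, which is one reason figures from different studies are hard to compare directly.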
One study examined patients’ preferences for exercise in the context of other available treatment options (excluding surgery). 42 The authors found that patients preferred exercise over pharmacological treatment: the risks of dyspepsia and bleeding ulcer combined accounted for a relative importance of 41.3%, compared with 28.9% for the decrease-in-pain and improved-strength attributes combined. 42 Another study investigated individual preferences for physical activity attributes (with no comparison to other types of OA treatment). 40 This study found that the “health benefits” (26%) and “enjoyment” (24%) attributes were considered by patients to be the most important.
Three studies investigated patients’ preferences for surgical treatment of OA. One study investigated the relative preferences for 9 different surgery-related procedure attributes and simulated how patients may have responded to real-world knee OA procedures based on their preferences. 41 They found that patient preferences for surgical interventions were influenced most by “the amount of cutting and removal of existing bone required” (relative importance of 18.7%), followed by “chance of additional surgery” (relative importance of 14.1%) and “amount of pain relief” (relative importance of 12.7%), with the least important attributes being “limits or complicates any future treatment need on the knee” and “length of hospital stay”, each with a relative importance of 7.3%. 41
Similarly, in the study comparing patient preferences for surgery for patients with a hand OA diagnosis, 44 the authors found that “the need for future surgery” (relative importance=19%) and “recovery time” (relative importance=3%) were the least important factors influencing surgical preferences, while “joint stiffness” (relative importance=32%) and “grip strength” (relative importance=29%) were the most important. This supports the results from the earlier study that explored preferences for surgery versus medical treatment of knee OA, 23 which found that the severity of OA symptoms, directly and indirectly, influenced the patients’ choice of OA treatment, even in the presence of cultural differences in attitudes towards particular treatments.
To the best of our knowledge, this is the first review to investigate and summarise the use of CA techniques to value patients’ preferences for OA treatment. In addition, the search strategy was comprehensive, including the search of many databases, contacting authors and experts in the field, and searching the reference lists of published studies.
One of the limitations of this review is the lack of a validated quality assessment tool for CA studies. We tried to assess the methodological quality of these studies using the ISPOR Conjoint Analysis Experimental Design Good Research Practices Checklist, but scoring studies with this checklist may be subject to the examiner’s opinion. We were unable to make an objective decision regarding the minimum acceptable evidence required to award the scores. For example, for question 2 (“was the choice of attributes and levels supported by evidence?”), we were unable to determine the quality and quantity of evidence required. This caused lengthy subjective disputes between the reviewers. Furthermore, the total scores indicated that CA studies published after the ISPOR checklist scored higher than those published pre-2011. This would be expected, as most of these studies referenced ISPOR in their papers, meaning that we are assessing their quality against the same or similar criteria they used to design their studies, criteria that were not available for studies published before 2011. It is not clear whether this improvement in the scores is related to the publication of the ISPOR checklist or simply reflects an improvement in reporting. We agree with Webb and colleagues that the ISPOR checklist should not be used as a quality assessment tool for conjoint studies in its current format, as it was not originally developed for this purpose. 45
The studies have a high degree of heterogeneity in study design, study population, and treatment choice. The included papers incorporated studies using both rating/ranking and choice-based methods to investigate different treatment options for OA (exercise, medication, and surgery) in the UK, Australia, Canada, and the USA. All included studies had samples that were homogeneous in that participants had OA. Thus, the study samples may represent the OA population. However, the healthcare systems differ between the countries within which the studies were conducted; therefore, the generalisability of the results could be limited.
Variations in the sample sizes between included studies (n = 11 to 3895) may indicate that there is still no consensus on an appropriate sample size calculation method for CA studies, as it depends on many factors such as the number of questions and scenarios in the conjoint task. It has been suggested that the sample size for a CA study should be at least 300 in one sample group. 46 However, the traditional calculations for sample size determination cannot readily be applied to CA 43 and are rarely applied for practical reasons. 47 Furthermore, it has been argued that collecting more data from each respondent by designing high-quality conjoint tasks may reduce sampling and measurement error. 46 In a study of patient preferences for acute pain treatment that used CA methods similar to those in this review, 36–38 , 42 researchers attempted to offset the limitation of a small sample (50 participants) by interviewing their respondents 4 times at 4 different stages of pain treatment. 48 Limitations around sample size in CA studies may be overcome in the design of the conjoint task and data collection.
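To make the dependence on task design concrete, the sketch below implements a widely cited rule of thumb for choice-based designs, usually attributed to Johnson and Orme: n × t × a / c ≥ 500, where n is the number of respondents, t the number of choice tasks, a the number of alternatives per task, and c the largest number of levels of any attribute. This rule is not taken from the review and is shown only as one example of how such factors can enter a calculation; the function name and input values are hypothetical.

```python
import math


def min_respondents(tasks, alternatives, max_levels, threshold=500):
    """Smallest n satisfying n * tasks * alternatives / max_levels >= threshold.

    The threshold of 500 follows the commonly quoted rule of thumb;
    it is a heuristic, not a formal power calculation.
    """
    return math.ceil(threshold * max_levels / (tasks * alternatives))


# Hypothetical design: 12 paired-choice tasks, at most 4 levels per attribute.
n = min_respondents(tasks=12, alternatives=2, max_levels=4)
```

Under the rule, doubling the number of tasks per respondent halves the suggested sample, which mirrors the argument that collecting more data from each respondent can compensate for a smaller sample.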
The variation in the RR (7.6% to 100%) in the studies is potentially a reflection of the robustness of the methods of recruitment and data collection. The included studies used a variety of methods of data collection. Studies that used face-to-face interviewing or questionnaires targeted at a specific population of interest tended to have higher response rates. Studies using telephone interviewing or emails, predominantly in a general population, had lower response rates. The studies with low RR recognised the limitations of using an untargeted strategy and suggested that response rates could be improved in future research by pre-screening participants in order to target the full survey to those who report a diagnosis or other study characteristic of interest. 32
All included studies recognised the value in utilising CA method to investigate patients’ preferences for OA treatment, but there was no consensus on which CA approach is the most appropriate. Both rating/ranking and choice-based methods were used to examine patients’ preferences for the treatment of OA. Recent academic and practical research applications have tended to favour choice-based approaches as opposed to rating/ranking. 49 However, the rating/ranking approach has also been used and recommended by many researchers to study patients’ preferences for OA treatment 36–38 , 42 as well as treatment preferences in rheumatoid arthritis (RA), 50 chronic pain, 51 and abdominal surgery 48 because it allows the inclusion of a large number of attributes and levels, which reflect the outcomes/concerns of patients with OA. The main advantage of ACA is that it is adaptive and therefore allows a large number of characteristics to be evaluated without resulting in information overload or respondent fatigue, and minimises interviewer, product, and brand bias. Nevertheless, there are still practical limitations associated with ACA, with researchers reporting that not all treatment characteristics could be included in an ACA task. 36–38
In this review, studies that used the choice-based approach reported that the use of the discrete choice method allowed them to identify attributes significantly influencing patients’ preferences for OA treatment. 43 Furthermore, a very low number of inconsistent responses were found, and participants reported that the questions were easy or very easy to answer. 23 Those studies that used ACBC 9 , 31 , 34 argued that the approach can capture more individual-level data and precise estimates than through a traditional CBC approach and that it can yield similar group-level standard errors using up to 38% fewer participants. 39 , 40 Furthermore, it has been reported that the ACBC method is more user friendly and engaging than alternative CA methods 31 , 34 , 52 , 53 and it can be used to elicit individual patients’ preferences. 9
Overwhelmingly, the results of the studies in this review indicated that patient preferences for OA medications were driven by the desire to avoid both common and rare side effects, especially those with more serious drug-related toxic effects, and that the effectiveness of the OA medication had very little impact on patients’ preferences. However, where investigated, studies suggested that preferences for side effects were affected by patient characteristics such as age and symptom severity. Older respondents were more willing than younger respondents to trade off an increased risk of side effects 36–38 , 43 for an improvement in the symptoms of OA. The side effects associated with NSAIDs had a greater negative influence on the preferences of patients with milder OA than on those of patients in more severe OA states. 32 Even when exercise was compared to OA medications, patients were still more concerned about the side effects of the treatment than the benefits. 42 However, patients with more knee pain were more reluctant to choose exercise.
Patients generally attached greater importance to reducing or eliminating adverse events than to reducing pain, but one study investigated the level of treatment-related risks patients were willing to accept in exchange for various improvements in pain. 39 The investigators found that participants’ “risk tolerance” varied according to their pain level at baseline and type of symptom relief – participants were willing to accept greater risks for improvements in ambulatory pain than in resting pain. 39 Similarly, a study of treatment options for disease-modifying drugs found that sub-groups of participants were willing to trade off the risks of side effects for improvements in a benefit. 35 In relation to surgical treatment for OA, it was reported that younger patients and those who reported the highest pain thresholds and the greatest functional limitations were more likely to opt for surgical intervention. 41 Furthermore, the severity of the patients’ underlying symptoms proved to be the main driver influencing their preferences for surgery. 44
Where the severity of OA symptoms was measured alongside the conjoint task, all included studies suggested that the severity of symptoms influenced the patients’ preference of treatment, and consequently the relative importance of treatment characteristics. However, it is not clear whether these differences are a result of symptom severity or artefacts of the CA methods, attributes used, or treatments being assessed.
The severity of OA symptoms and the side effects of treatment have a significant influence on patients’ preferences for OA treatment. Both rating/ranking and choice-based CA methods are recommended in investigating patients’ preferences for OA treatment, but there is no consensus on which CA approach is the most appropriate.
ACA, Adaptive Conjoint Analysis; ACBC, Adaptive Choice-Based Conjoint; BWS, Best–Worst Scaling; CBC, Choice-Based Conjoint; CA, Conjoint Analysis; CVA, Conjoint Value Analysis; DCEs, Discrete Choice Experiments; HB, Hierarchical Bayesian; ISPOR, International Society of Pharmacoeconomics and Outcomes Research; MeSH, Medical Subject Headings; NHS, National Health Service; NSAIDs, Nonsteroidal Anti-Inflammatory Drugs; OA, Osteoarthritis; RR, Response Rates; RA, Rheumatoid Arthritis; UK, United Kingdom; USA, United States of America; WTP, Willingness to Pay.
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.
All authors made substantial contributions to conception and design, acquisition of data, or analysis and interpretation of data; took part in drafting the article or revising it critically for important intellectual content; agreed to submit to the current journal; gave final approval of the version to be published; and agree to be accountable for all aspects of the work.
The authors report no conflicts of interest in this work.
BMC Health Services Research volume 21, Article number: 589 (2021)
In designing, adapting, and integrating mental health interventions, it is pertinent to understand patients’ needs and their own perceptions and values in receiving care. Conjoint analysis (CA) and discrete choice experiments (DCEs) are survey-based preference-elicitation approaches that, when applied to healthcare settings, offer opportunities to quantify and rank the healthcare-related choices of patients, providers, and other stakeholders. However, a knowledge gap exists in characterizing the extent to which DCEs/CA have been used in designing mental health services for patients and providers.
We performed a scoping review from the past 20 years (2009–2019) to identify and describe applications of conjoint analysis and discrete choice experiments. We searched the following electronic databases using MeSH terms to identify stakeholder preferences for mental health services: PubMed, CINAHL, PsycINFO, Embase, Cochrane, and Web of Science. Studies were categorized according to whether they pertained to patients, providers, or parents or caregivers.
Among the 30 studies we reviewed, most were published after 2010 (24/30, 80%), the majority were conducted in the United States (11/30, 37%) or Canada (10/30, 33%), and all were conducted in high-income settings. Studies more frequently elicited preferences from patients or potential patients (21/30, 70%) as opposed to providers. About half of the studies used CA while the others utilized DCEs. Nearly half of the studies sought preferences for mental health services in general (14/30, 47%) while a quarter specifically evaluated preferences for unipolar depression services (8/30, 27%). Most of the studies sought stakeholder preferences for attributes of mental health care and treatment services (17/30, 57%).
Overall, preference elicitation approaches have been increasingly applied to mental health services globally in the past 20 years. To date, these methods have been exclusively applied to populations within the field of mental health in high-income countries. Prioritizing patients’ needs and preferences is a vital component of patient-centered care – one of the six domains of health care quality. Identifying patient preferences for mental health services may improve quality of care and, ultimately, increase acceptability and uptake of services among patients. Rigorous preference-elicitation approaches should be considered, especially in low-income settings where mental health resources are scarce, to guide resource allocation toward preferred service characteristics.
Mental disorders are the leading cause of disability and the second leading cause of death globally, accounting for over 276 million disability-adjusted life years and leading to over 9 million deaths annually [ 1 ]. The burden of depression, anxiety, substance use, and some neurological disorders is comparable to that of noncommunicable diseases like cancer and coronary heart disease, which are more prominently known for their worldwide health impact [ 2 ]. Despite this burden, mental health services are scarce in many areas of the world, especially low- and middle-income countries [ 3 ]. Even when services exist, they may not serve patient and provider needs or be based on the preferences of either group, as would be needed to optimize formal health care services.
There is strong evidence from other disease areas (e.g., cancer, HIV, and veteran health services, among others) that services which engage patients from the beginning – during conceptualization of the service – can be highly successful and effective [ 4 ]. The global impetus from the Sustainable Development Goals (SDGs) Universal Health Coverage initiative (SDG 3) focuses on the need for services that are accessible, affordable, of good quality, and acceptable to the people for whom they are designed [ 5 ]. Correspondingly, taking the example of services for adolescents and youth, the World Health Organization (WHO) encourages service provision that is responsive to patient preferences, such as the “youth-friendly services” described in the Global Accelerated Action for Health of Adolescents (AA-HA!) guidelines, to encourage uptake of and engagement in services [ 6 ]. The WHO considers patient-centeredness not only integral to human rights enforcement in health services but also central to developing integrated systems [ 7 ].
As mental ill-health becomes increasingly recognized as a global burden, innovations are emerging to provide accessible, affordable, and acceptable prevention, care, and treatment services to the diverse populations faced with mental health issues [ 8 , 9 , 10 ]. Information and messages about mental health, preventative services, treatment characteristics, provider approaches, and care provision modalities must continue to evolve based on stakeholder preferences to ensure relevance and desirability. However, patient involvement in shaping mental health practice has been minimal, especially in low-resource settings [ 11 , 12 , 13 , 14 ].
Although the need to rigorously elicit patient preferences for healthcare is established, “precisely how to systematically assess and incorporate patient preferences in the clinical setting remains an area with a need for methodological development” (astutely articulated by Wittink et al) [ 15 ]. Multiple methods have been developed and applied to empirically identify preferences. Two widely used quasi-experimental, quantitative approaches made popular by their use in market research and grounded in macroeconomic principles [ 16 ] are conjoint analysis (CA) and discrete choice experiments (DCEs) [ 17 , 18 , 19 ]. Both methods offer rigorous and systematic approaches for eliciting preferences for service or product attributes from customers and stakeholders [ 20 ].
Conjoint analyses decompose an intervention into its key attributes, then pose the attributes to patients to understand patient-determined values for each attribute [ 21 , 22 ]. Similarly, in discrete choice experiments, researchers construct treatment or service options from a set of attributes and pose them to patients in an experimental design to enable independent assessment of preferences for specific attributes in statistical analysis [ 23 ]. The methods are grounded in the premise that goods and services are composed of discrete attributes and that consumers holistically value goods and services based on the collective levels of the attributes [ 18 ]. As such, these methods involve posing options for attributes of services to a stakeholder group, who select preferred options from a series of choices that pit attributes against each other. Ultimately, conjoint analysis and discrete choice experiments allow for estimation of the relative importance of aspects of the service, trade-offs between attributes made by stakeholders, and overall service satisfaction based on stakeholder preferences.
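The construction of service options from attributes can be sketched in a few lines. The example below enumerates every combination of attribute levels (a full-factorial set of profiles) and then draws random pairs of distinct profiles as choice tasks. The attributes, levels, and function names are hypothetical, and random pairing is used only for brevity; applied studies typically rely on fractional-factorial or statistically efficient designs rather than random draws.

```python
import random
from itertools import product

# Hypothetical attributes and levels for a mental health service.
attributes = {
    "provider": ["psychiatrist", "counsellor", "peer worker"],
    "format": ["in person", "telephone"],
    "wait_time": ["1 week", "4 weeks"],
}


def full_factorial(attributes):
    """Every combination of one level per attribute, as a list of dicts."""
    names = list(attributes)
    return [dict(zip(names, combo)) for combo in product(*attributes.values())]


def paired_choice_tasks(profiles, n_tasks, seed=0):
    """Draw n_tasks random pairs of distinct profiles.

    This is not an optimised experimental design; it only illustrates
    how options built from attributes are pitted against each other.
    """
    rng = random.Random(seed)
    return [tuple(rng.sample(profiles, 2)) for _ in range(n_tasks)]


profiles = full_factorial(attributes)
tasks = paired_choice_tasks(profiles, n_tasks=8)
```

With 3 × 2 × 2 levels there are 12 possible profiles; the count grows multiplicatively with each added attribute, which is why respondents are usually shown only a subset of the possible comparisons.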
These methods are increasingly applied to healthcare settings to enable patient input for patient-centered care [ 18 ]. CA and DCEs have been successfully applied for patient preference elicitation in multiple areas of healthcare, including provider interactions, health service delivery content and format, and treatment options [ 18 ]. Increasingly, CA and DCE methods are applied to mental health service delivery and treatment options.
Appropriate and acceptable presentations of mental health services differ between groups, such that cultural adaptations should be made for optimal effectiveness [ 24 , 25 ]. Especially in settings where few mental health services exist, development of novel, multimodal services should be directly informed by patients. Additionally, understanding preferences may elucidate patient perception of risks and causes of mental disorder, as well as social determinants driving mental health outcomes. In this way, CA and DCEs offer opportunities to further scientific understanding of mental health underpinnings within communities while illuminating gaps in patient knowledge worthy of attention. CA and DCEs offer rigorous and evidence-based approaches to improving acceptability and reducing barriers to mental health services, especially among hard-to-reach populations.
Despite the utility of CA and DCE methods toward improving mental health services, no studies have systematically synthesized information about application of CA and DCE toward preferences in mental health care provision. Understanding where such studies have occurred geographically, the mental health issues to which they were applied, and service and treatment attributes investigated would help identify gaps for further exploration. Further, systematically evaluating the study design components such as the preparatory work utilized, number and type of choices and attributes used, and other methodologic and analytic characteristics may facilitate application of CA and DCE for eliciting preferences in new populations and settings.
Due to the rapid developments in the application of CA and DCEs to healthcare, specifically to mental health, we considered it timely to conduct a scoping review of applications of CA and DCEs for eliciting and identifying stakeholder preferences for mental health services globally within the past 20 years. We see a need to promote their use in global mental health, with a focus on low- and middle-income countries (LMICs).
Through this scoping review we identified published examples of CA and DCEs for mental health within the literature and mapped their characteristics with the ultimate goal of informing future preference elicitation for mental health services.
We performed a broad search of the literature to identify articles depicting use of CA and DCEs to identify patient and stakeholder preferences for mental health services. Six databases were systematically consulted: PubMed, CINAHL, PsycINFO, Embase, Cochrane, and Web of Science. Prior to conducting the search, we identified keywords and search terms and organized them appropriately for each database (see Supplementary Table 1 ). We performed the scoping search in July 2019, yielding 695 total citations (CINAHL: 63, Cochrane: 64, Embase: 355, PsycINFO: 61, PubMed: 67, Web of Science: 85). EndNote X7 reference manager was used to manage the citations identified. After duplicates ( n = 160) and citations published before 1990 ( n = 2) were removed, 533 citations remained. The PRISMA 2020 statement and the Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) guidelines were followed for this review [ 26 , 27 ].
A two-phased approach was used to identify articles included in the review. In phase 1, all 533 article titles and abstracts were assessed by a single reviewer for their consistency with inclusion/exclusion criteria (see Table 1 ). Articles were included if they utilized CA or DCE methods and sought preferences for aspects of mental health services. We excluded articles that did not utilize these methods or that sought preferences for services not related to mental health, as well as non-English language publications. All articles that did not fit the inclusion criteria were excluded. The main reason for exclusion at the full-text review phase was that the CA or DCE was not focused on mental health.
During phase 2, the remaining articles were reviewed in full-text separately but in parallel by two reviewers for their consistency with inclusion/exclusion criteria. During this phase, articles without full text versions and student dissertations or theses were additionally excluded. Any remaining reviewer disagreement was resolved with collective review of full-text articles and discussion about relevance. Both reviewers had to agree for an article to be excluded. Overall, 30 articles fit scoping review criteria and were identified for synthesis.
To address our research objective of investigating the applications of CA and DCEs to ascertain key stakeholder preferences for mental health services, and to understand individual-level service needs and demand characteristics, we systematically examined each article for the population studied, geographical location, sample size, mental health service preferences assessed, methods used to design the study, methods used to analyze preferences, and categories/sub-categories of choices presented. Categories for data extraction were informed by a checklist for developing CA applied to health care settings from the International Society for Pharmacoeconomics and Outcomes Research, which helps explain the utility of these methods toward health care improvement (see Table 2 ). We extracted this information into a comprehensive matrix and assessed it for emerging patterns and gaps in the use of CA and DCEs to evaluate stakeholder preferences for mental health services within existing literature.
An electronic search yielded a total of 695 titles and abstracts that were judged to be potentially relevant. Of these, 160 records were excluded as duplicates and 2 were excluded because they were published before 1990. The remaining 533 articles were screened, and 480 were excluded because they were not related to mental health. The remaining 53 full-text articles were assessed for eligibility, and 23 were excluded because they did not use CA or DCEs or were not peer reviewed. A total of 30 articles satisfied the inclusion criteria and were ultimately reviewed.
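The selection counts reported in the text can be checked with simple arithmetic. The short Python sketch below only restates numbers given in this review; the variable names are our own labels for each stage.

```python
# Record counts as reported in the text for each stage of study selection.
per_database = {
    "CINAHL": 63, "Cochrane": 64, "Embase": 355,
    "PsycINFO": 61, "PubMed": 67, "Web of Science": 85,
}
identified = sum(per_database.values())

duplicates = 160
published_before_1990 = 2
screened = identified - duplicates - published_before_1990

not_mental_health = 480
full_text_assessed = screened - not_mental_health

not_ca_dce_or_not_peer_reviewed = 23
included = full_text_assessed - not_ca_dce_or_not_peer_reviewed

print(identified, screened, full_text_assessed, included)  # 695 533 53 30
```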
A flow chart through the different steps of study selection is provided in Fig. 1 .
Scoping review flow diagram
Study location and year.
The studies included were published between 2000 and 2018, the majority (21/30, 70%) of which were published since 2010. Most studies were conducted in the United States (11/30, 37%) or Canada (10/30, 33%), and all were conducted in high-income settings (Germany: 4/30, 13%, UK: 3/30, 10%, Japan: 2/30, 7%) (Table 3 ).
Studies most frequently elicited preferences from patient and prospective patient populations (21/30, 70%); others sought preferences from parents of children requiring mental health services (7/30, 23%), and a few sought preferences from mental health providers and administrators (4/30, 13%). Some studies included multiple population types. Source populations for the studies ranged widely, with some studies recruiting participants directly from waiting rooms and outpatient health facilities [ 28 , 29 , 30 , 31 , 32 , 33 , 34 , 35 , 36 ], some from inpatient services [ 37 , 38 , 39 , 40 ], some querying university students [ 41 , 42 , 43 , 44 ], some recruiting from service waitlists (such as those awaiting initiation of a service in the Canadian national health system) [ 45 , 46 , 47 , 48 , 49 , 50 , 51 ], and others recruiting from provider databases [ 52 ] or internet-based health communities [ 15 , 53 , 54 ].
Nearly half of the studies sought preferences for mental health services generally without focus on a particular issue or disorder (14/30, 47%). About a quarter focused on preferences for unipolar depression services (8/30, 27%), and fewer focused on other mental health issues (attention deficit hyperactivity disorder [3/30, 10%], addiction/substance use disorder [2/30, 7%], dementia [2/30, 7%], and bipolar disorder [1/30, 3%]). The mental health services of focus for the included studies ranged widely, with over half (17/30, 57%) seeking stakeholder preferences for attributes of mental health care and treatment, and others focused on choices for health messages and information (2/30, 7%), prevention and early intervention services (2/30, 7%), child mental health interventions (4/30, 13%), and campus-, school-, and community-based programs (2/30, 7%). One study sought preferences for psychosocial support services (1/30, 3%), one for genetic testing services for dementia (1/30, 3%), and another for pharmacologic attribute preferences (1/30, 3%). Individual attributes assessed ranged too widely across studies to permit further generalization, although depression remains a commonly studied condition.
Due to variability in stakeholder populations assessed, mental health issues explored, and attributes investigated in these CA and DCE studies, we did not synthesize information about the patient and provider preferences they identified. Through our scoping review, we aim to facilitate greater understanding of the design and application of CA and DCE studies for use in mental health care settings; thus, we focused our results on practical aspects of existing studies. Across the 30 studies included from the last 20 years, we saw encouraging evidence of more recent CA and DCEs building upon methodologic and analytic experience from prior CA and DCEs applied to mental health topics across varied populations. By identifying this rapidly expanding collection of CA and DCEs applied to mental health, we aim to amplify this trend so that future studies can build on the knowledge accumulated over the past 20 years, expanding the application of CA and DCEs to new populations and settings.
CA and DCEs were employed in nearly equal proportions across the included studies (CA: 16/30, 53%; DCEs: 14/30, 47%) (see Table 4 ). Prior to developing the CA or DCE, 70% (21/30) of studies conducted qualitative exploration among patients, 50% (15/30) conducted quantitative exploration, 43% (13/30) performed literature or policy review, and 3% conducted qualitative exploration among policy makers (Table 4 ). About half of studies (53%) employed ternary choice types, while others used binary choice types (40%) or did not specify (13%). The number of attributes explored ranged from three to more than eight; the most frequently used numbers were more than eight (40%) and four (37%). Studies most frequently posed more than 15 choices to each participant (33%), while the second most frequent number of choices was 5 or fewer (27%). Self-completed questionnaires were the most common form of administering CA and DCEs (80%), while in five studies questionnaires were administered by a study staff member. Sample sizes for the studies ranged from 29 to 2469, with 27% (8/30) of studies having 100 participants or fewer, 37% (11/30) having sample sizes between 101 and 300, and 33% (10/30) having over 300 participants.
The majority of CA and DCEs (57%) employed main effects and interactions in their study design plans. The most common methodologic approach to designing the choice tasks was use of orthogonal design with Bayesian analysis. Across the 30 studies, the total number of choice tasks posed within CA and DCEs ranged from 10 to over 150. Half of the analyses (15/30, 50%) utilized Sawtooth software, while SPSS was the second most-utilized statistical software (20%). Other analyses utilized SAS (13%), Stata (3%), or R (3%), and many studies used more than one of these packages.
Similarly, most studies utilized multiple statistical analysis methods. The most frequently used method was logistic regression (12/30, 40%), followed by latent class analysis (10/30, 33%) and hierarchical Bayes estimation (8/30, 27%). Other methods included ordinary least squares regression (6/30, 20%); chi-squared, ANOVA, and MANOVA tests (7/30, 23%); and ordered probit regression (4/30, 13%).
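As a rough illustration of how logit-type analyses turn estimated preferences into interpretable quantities, the Python sketch below computes multinomial logit choice probabilities and relative attribute importance from a set of part-worth utilities. The part-worth values, attribute names, and function names are hypothetical; the reviewed studies used a variety of models and software, and this is a generic sketch rather than the procedure of any one of them.

```python
import math

# Hypothetical part-worth utilities, such as coefficients from a logit model.
# The values are invented for illustration and come from no reviewed study.
PART_WORTHS = {
    "provider": {
        "psychiatrist": 0.4,
        "primary care clinician": 0.1,
        "peer counselor": -0.5,
    },
    "format": {"individual": 0.3, "group": -0.3},
    "wait_time_weeks": {1: 0.6, 4: 0.0, 8: -0.6},
}


def utility(profile, part_worths):
    """Sum the part-worths of the levels that make up one profile."""
    return sum(part_worths[attr][level] for attr, level in profile.items())


def choice_probabilities(profiles, part_worths):
    """Multinomial logit: P(i) = exp(U_i) / sum over j of exp(U_j)."""
    exp_u = [math.exp(utility(p, part_worths)) for p in profiles]
    total = sum(exp_u)
    return [e / total for e in exp_u]


def relative_importance(part_worths):
    """Share of each attribute's part-worth range in the sum of all ranges."""
    ranges = {
        attr: max(levels.values()) - min(levels.values())
        for attr, levels in part_worths.items()
    }
    total = sum(ranges.values())
    return {attr: r / total for attr, r in ranges.items()}


option_a = {"provider": "psychiatrist", "format": "group", "wait_time_weeks": 8}
option_b = {"provider": "peer counselor", "format": "individual", "wait_time_weeks": 1}
probs = choice_probabilities([option_a, option_b], PART_WORTHS)
importance = relative_importance(PART_WORTHS)
print(probs)
print(importance)
```

With these invented values, waiting time has the widest part-worth range and therefore the largest relative importance, and the option with the shorter wait receives the higher predicted choice probability.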
Our scoping review examined CA and DCE studies that sought to elicit stakeholder preferences and individual-level service needs and demand for mental health services. We summarize the use of these preference elicitation methods to date in finding solutions for mental health service design and management, given the increasing global health burden of mental health disorders [ 55 ]. We identified few ( n = 30) applications of these methods in this context and highlighted depression as the mental health disorder to which they have been most frequently applied. All existing studies took place in high-income settings, showcasing a gap in current application and an opportunity for expansion to low- and middle-income settings. Such settings may face a scarcity of mental health resources such that prioritization based on patient-centered and provider-informed preferences could aid in tailoring services to optimize access and acceptability. Further, applications to date have mostly focused on adult mental health care and treatment, with fewer studies focused on child health. Four studies focused on preferences of university students [ 41 , 42 , 43 , 44 ], highlighting potential utility in seeking mental health preferences among adolescent and young adult groups, an age category at higher risk for mental health issues globally and a demographic for whom mental health promotion and prevention services are important. Our results add to the limited literature appraising well-developed methods to improve patient-centeredness of mental health services using rigorous sequential mixed methods. Existing evidence demonstrates feasibility and increasing interest in seeking stakeholder preferences for mental health services, and can be used to inform future studies that expand the application of these methods to other contexts and populations facing mental health problems.
The need to address behavioral and psychosocial problems globally is more urgent than ever and is gaining recognition within global health goal-setting, such as health systems strengthening to address the non-communicable disease burden (including mental disorders) within the Sustainable Development Goals [ 5 ]. Patient and provider preference elicitation to inform intervention development and evaluation should be considered an integral component of quality of care and service development globally. Recognizing our patients and community stakeholders as experts in their own treatment and service needs empowers them to take part in designing care that is acceptable, appropriate, and desirable. Service areas such as psychological and psychiatric services, which may be underdeveloped and stigmatized in many settings, could especially benefit from patient-informed alternatives, which may encourage utilization of services and, ultimately, alleviation of mental health burden. Such methods might also help us develop programs and services that mitigate stigma and routinely experienced barriers to care. One DCE illustrates what patients may prioritize and which inconveniences they may be willing to overlook: a study of public health care in South Africa found that communities were prepared to tolerate public sector health service characteristics such as long waiting times, poor staff attitudes, and lack of direct access to doctors if they received the medicine they needed, a thorough examination, and a clear explanation of the diagnosis and prescribed treatment from health professionals [ 56 ].
Conjoint methods sharpen the focus on “what it is about treatment” that drives preferences and provide specific guideposts for how to design packages of treatment that are patient-centered. A number of studies covered depression and psychosocial support [ 15 , 28 , 29 , 30 , 31 , 32 , 33 , 35 , 37 , 38 , 39 , 40 , 42 , 43 , 53 , 54 , 57 , 58 , 59 ] on the premise that stated preferences for intervention or treatment characteristics might differ from real-life choices and concerns. A DCE is a quantitative tool that measures the weight of different factors that affect a decision. Participants are presented with two or more hypothetical scenarios to choose between. Some studies found that patients expected more personal support from healthcare providers, including flexible working hours and higher quality of patient-provider relationships [ 60 ]. Preference elicitation is a key component of the treatment engagement process, improving understanding of which treatment types or strategies best support the priorities of the patient population and thus improving their outcomes while bolstering their connection to care. Choices prioritized by patients for mental health services may illuminate their own conceptualizations of mental health issues, which may highlight opportunities to utilize key health messages for psychotherapeutic interventions. Studies identified in this scoping review show that low-literacy populations can be effectively included in preference elicitation exercises using simple visualizations and choice tasks that are broken down into basic categories. Other studies demonstrated that patient preferences identified with conjoint analysis or discrete choice experiments could be used in conjunction with information about existing services, input from healthcare professionals, and qualitative interviews with patients to arrive at a more comprehensive plan for intervention and service development.
Importantly, these methods may help serve the needs of diverse populations by informing appropriate and effective mental health services tailored to unique sub-groups. Discrete choice experiments and conjoint analysis might be useful to inform the development of tools to assist shared decision making in psychiatry [ 61 ]. Similar ideas were expressed in a DCE carried out in Tanzania focused on maternal health care, which found that care quality, both technical and interpersonal, was more important than clinic inputs such as equipment and cleanliness [ 62 ].
Our findings identified examples where conjoint analysis and discrete choice experiments were used to identify nuanced barriers and needs for capacity building among health providers and mental health specialists [ 33 , 36 , 52 , 63 ]. Implementation of evidence-based psychological and psychiatric interventions is complex, thus using quantitative preference elicitation methods to understand service provision processes at the administration, health system, and provider levels could streamline the complexity. Studies from our scoping review identified the desire from mental health providers and administrators for enhanced supervisory support, local decision control in treatment approaches, improved training in psychopathology, more leadership and flexibility in implementation processes, and further training opportunities. Overall, these methods may offer opportunities to improve service evaluation and health system feedback loops via input from health providers and administrators to improve quality of mental health service provision.
As the need for effective mental health services is increasingly recognized globally, methods to ensure that such services are relevant and responsive to the needs of patient populations are essential. Rigorous, quantitative approaches to ascertaining input from stakeholders, such as conjoint analysis and discrete choice experiments, have been specifically recommended for integrating mental health services within health systems in low- and middle-income countries [ 64 ]. Individual level patient and provider preferences that are identified and incorporated into design and implementation of mental health services synergistically strengthen provision. By seeking contributions from populations served, use of these methods improves appropriateness and desirability of services which may improve equity in mental health care. Additionally, the development of knowledge transfer strategies that align the preferences of professionals with those of the families they serve will go a long way in strengthening the system and services [ 65 ].
This scoping review was limited to peer-reviewed, published literature; thus, we did not account for conjoint analyses or discrete choice experiments for mental health service preferences reported in other sources. Further, we limited our review to studies available in English language, thus we may have missed findings from other settings published in other languages. Despite these limitations, we feel we were able to achieve our goal of scoping applications of conjoint analysis and DCEs for preference elicitation regarding mental health services through this review.
The objective of this scoping review was to describe existing applications of conjoint analysis and discrete choice experiments for eliciting stakeholder preferences, at the individual patient and provider level, for mental health services within published literature. We found that conjoint analysis and discrete choice experiments have been increasingly used over the past 20 years to identify preferences from diverse populations and for a range of mental health issues and services. All studies identified for this scoping review were performed within high-income countries, yet a few were performed within low-income populations in those settings. Conjoint analysis and discrete choice experiments have been shown to be effective methods for eliciting preferences for mental health services within diverse settings, illustrating a promising approach to increasing patient-centered mental health care. Future applications of such methods should be performed within low- and middle-income countries to assess the performance of this methodology within settings where patient involvement in care is traditionally low and appropriate mental health services are lacking. Ultimately, we assert that preference elicitation methods such as conjoint analysis and discrete choice experiments should be applied to mental health services among populations globally to expand utilization and reduce mental health burden.
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
HIV: Human immunodeficiency virus
SDGs: Sustainable Development Goals
WHO: World Health Organization
AA-HA: Accelerated Action for Health of Adolescents
PRISMA-ScR: Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews
CINAHL: Cumulative Index to Nursing and Allied Health Literature
EMBASE: Excerpta Medica dataBASE
ANOVA: Analysis of Variance
MANOVA: Multivariate analysis of variance
Bridges JF, Hauber AB, Marshall D, Lloyd A, Prosser LA, Regier DA, et al. Conjoint analysis applications in health--a checklist: a report of the ISPOR Good Research Practices for Conjoint Analysis Task Force. Value Health. 2011;14(4):403–13.
Feigin VL, Nichols E, Alam T, Collaborators GN. Global, regional, and national burden of neurological disorders, 1990-2016: a systematic analysis for the Global Burden of Disease Study 2016. Lancet Neurol. 2019;18(5):459–80.
Patel V. Mental health in low- and middle-income countries. Br Med Bull. 2007;81-82:81–96.
Beusterien KM, Dziekan K, Flood E, Harding G, Jordan JC. Understanding patient preferences for HIV medications using adaptive conjoint analysis: Feasibility assessment. Value Health. 2005;8(4):453–61.
Kieny MP, Bekedam H, Dovlo D, Fitzgerald J, Habicht J, Harrison G, et al. Strengthening health systems for universal health coverage and sustainable development. Bull World Health Organ. 2017;95(7):537–9.
World Health Organization (WHO). Global Accelerated Action for the Health of Adolescents (AA-HA)! 2017.
World Health Organization. WHO global strategy on integrated people-centred health services 2016-2026. 2015.
Magnabosco JL. Innovations in mental health services implementation: a report on state-level data from the U.S. Evidence-Based Practices Project. Implement Sci. 2006;1(1):13.
Kimberly J, Cook JM. Organizational Measurement and the Implementation of Innovations in Mental Health Services. Adm Policy Ment Health Ment Health Serv Res. 2008;35(1):11–20.
Hollis C, Morriss R, Martin J, Amani S, Cotton R, Denis M, et al. Technological innovations in mental healthcare: harnessing the digital revolution. Br J Psychiatry. 2015;206(4):263–5.
Mitton C, Smith N, Peacock S, Evoy B, Abelson J. Public participation in health care priority setting: a scoping review. Health Policy (Amsterdam, Netherlands). 2009;91:219–28.
Tambuyzer E, Pieters G, Van Audenhove C. Patient involvement in mental health care: one size does not fit all. Health Expect. 2014;17(1):138–50.
Ng C-J, Lee P-Y, Lee Y-K, Chew B-H, Engkasan JP, Irmi Z-I, et al. An overview of patient involvement in healthcare decision-making: a situational analysis of the Malaysian context. BMC Health Serv Res. 2013;13(1):408.
Armstrong N, Herbert G, Aveling E-L, Dixon-Woods M, Martin G. Optimizing patient involvement in quality improvement. Health Expect. 2013;16(3):e36–47.
Wittink MN, Cary M, TenHave T, Baron J, Gallo JJ. Towards patient-centered care for depression: Conjoint methods to tailor treatment based on preferences. Patient. 2010;3(3):145–57.
Green P. On the design of choice experiments involving multifactor alternatives. J Consum Res. 1974;1(2):61–8.
Ryan M. Discrete choice experiments in health care; 2004.
Ryan M, Farrar S. Using conjoint analysis to elicit preferences for health care. BMJ. 2000;320(7248):1530–3.
Louviere JJ, Flynn TN, Carson RT. Discrete Choice Experiments Are Not Conjoint Analysis. J Choice Model. 2010;3(3):57–72.
Green P, Krieger AM, Wind Y. Thirty years of conjoint analysis: reflections and prospects. Interfaces. 2001.
Bridges JF, Hauber AB, Marshall D, Lloyd A, Prosser LA, Regier DA, et al. Conjoint analysis applications in health--a checklist: a report of the ISPOR Good Research Practices for Conjoint Analysis Task Force. Value Health. 2011;14(4):403–13.
Ryan M, Farrar S. Using conjoint analysis to elicit preferences for health care. BMJ (Clinical research ed). 2000;320(7248):1530–3. https://doi.org/10.1136/bmj.320.7248.1530 .
Lancsar E, Louviere J. Conducting discrete choice experiments to inform healthcare decision making: a user's guide. PharmacoEconomics. 2008;26(8):661–77. https://doi.org/10.2165/00019053-200826080-00004 PMID: 18620460.
Whaley AL, Davis KE. Cultural competence and evidence-based practice in mental health services: A complementary perspective. Am Psychol. 2007;62(6):563–74.
Griner D, Smith TB. Culturally adapted mental health intervention: A meta-analytic review. Psychotherapy (Chic). 2006;43(4):531–48. https://doi.org/10.1037/0033-3204.43.4.531 . PMID: 22122142.
Tricco A, Lillie E, Zarin W, O'Brien K, Colquhoun H, Levac D, et al. PRISMA extension for scoping reviews (PRISMA-ScR): checklist and explanation. Ann Intern Med. 2018;169(7):467–73.
Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. 2021.
Dwight-Johnson M, Lagomasino IT, Aisenberg E, Hay J. Using conjoint analysis to assess depression treatment preferences among low-income Latinos. Psychiatr Serv (Washington, DC). 2004;55(8):934–6.
Dwight Johnson M, Apesoa-Varano C, Hay J, Unutzer J, Hinton L. Depression treatment preferences of older white and Mexican origin men. Gen Hosp Psychiatry. 2013;35(1):59–65.
Albus C, Schmeißer N, Salzberger B, Fätkenheuer G. Preferences regarding medical and psychosocial support in HIV-infected patients. Patient Educ Couns. 2005;56(1):16–20.
Zimmermann TM, Clouth J, Elosge M, Heurich M, Schneider E, Wilhelm S, et al. Patient preferences for outcomes of depression treatment in Germany: a choice-based conjoint analysis study. J Affect Disord. 2013;148(2-3):210–9.
Dwight-Johnson M, Lagomasino IT, Hay J, Zhang L, Tang L, Green JM, et al. Effectiveness of collaborative care in addressing depression treatment preferences among low-income Latinos. Psychiatr Serv (Washington, DC). 2010;61(11):1112–8.
Becker MP, Christensen BK, Cunningham CE, Furimsky I, Rimas H, Wilson F, et al. Preferences for early intervention mental health services: a discrete-choice conjoint experiment. Psychiatr Serv (Washington, DC). 2016;67(2):184–91.
Herman PM, Ingram M, Rimas H, Carvajal S, Cunningham CE. Patient preferences of a low-income Hispanic population for mental health services in primary care. Admin Pol Ment Health. 2016;43(5):740–9.
Flach SD, Diener A. Eliciting patients' preferences for cigarette and alcohol cessation: an application of conjoint analysis. Addict Behav. 2004;29(4):791–9.
Cunningham CE, Henderson J, Niccols A, Dobbins M, Sword W, Chen Y, et al. Preferences for evidence-based practice dissemination in addiction agencies serving women: a discrete-choice conjoint experiment. Addiction. 2012;107(8):1512–24.
Townend M. An application of conjoint analysis to the process of psychiatric day hospital care. J Psychiatr Ment Health Nurs. 2000;7(4):371–2.
Townend M, Shackley P. Establishing and quantifying the preferences of mental health service users for day hospital care: pilot study using conjoint analysis. J Ment Health. 2002;11(1):85–96.
Fahey A, Ní Chaoimh D, Mulkerrin EC, O'Keeffe ST, Mulkerrin GR. Deciding about nursing home care in dementia: A conjoint analysis of how older people balance competing goals. Geriatr Gerontol Int. 2017;17(12):2435–40.
Zipursky RB, Cunningham CE, Stewart B, Rimas H, Cole E, Vaz SM. Characterizing outcome preferences in patients with psychotic disorders: a discrete choice conjoint experiment. Schizophr Res. 2017;185:107–13.
Lee EJ, Chan F, Ditchman N, Feigon M. Factors influencing Korean international students' preferences for mental health professionals: a conjoint analysis. Community Ment Health J. 2014;50(1):104–10.
Hajime S. Preferences for suicide prevention strategies among university students in Japan: a cross-sectional study using full-profile conjoint analysis. Psychol Health Med. 2018;23(9):1046–53.
Cunningham CE, Zipursky RB, Christensen BK, Bieling PJ, Madsen V, Rimas H, et al. Modeling the mental health service utilization decisions of university undergraduates: A discrete choice conjoint experiment. J Am College Health. 2017;65(6):389–99.
Okumura Y, Sakamoto S. Depression treatment preferences among Japanese undergraduates: using conjoint analysis. Int J Soc Psychiatry. 2012;58(2):195–203.
Wymbs FA. Examining parents' preferences for varieties and elements of behavioral parenting programs: ProQuest Information & Learning; 2012.
Wymbs FA. Parents' preferences for school- and community-based services for children at risk for ADHD. School Mental Health Multidiscipl Res Pract J. 2018;10(4):386–401.
Fegert JM, Slawik L, Wermelskirchen D, Nubling M, Muhlbacher A. Assessment of parents' preferences for the treatment of school-age children with ADHD: a discrete choice experiment. Expert Rev Pharmacoecon Outcomes Res. 2011;11(3):245–52.
Waschbusch DA, Cunningham CE, Pelham WE, Rimas HL, Greiner AR, Gnagy EM, et al. A discrete choice conjoint experiment to evaluate parent preferences for treatment of young, medication naive children with ADHD. J Clin Child Adolesc Psychol. 2011;40(4):546–61.
Cunningham CE, Chen Y, Deal K, Rimas H, McGrath P, Reid G, et al. The interim service preferences of parents waiting for children’s mental health treatment: A discrete choice conjoint experiment. J Abnorm Child Psychol. 2013;41(6):865–77.
Cunningham CE, Rimas H, Chen Y, Deal K, McGrath P, Lingley-Pottie P, et al. Modeling parenting programs as an interim service for families waiting for children's mental health treatment. J Clin Child Adolesc Psychol. 2015;44(4):616–29.
Cunningham CE, Deal K, Rimas H, Buchanan DH, Gold M, Sdao-Jarvie K, et al. Modeling the information preferences of parents of children with mental health problems: a discrete choice conjoint experiment. J Abnorm Child Psychol. 2008;36(7):1123–38.
Riepe MW, Gritzmann P, Brieden A. Preferences of psychiatric practitioners for core symptoms of major depressive disorder: a hidden conjoint analysis. Int J Methods Psychiatr Res. 2017;26(1):e1528. https://doi.org/10.1002/mpr.1528 .
Ng-Mak D, Poon JL, Roberts L, Kleinman L, Revicki DA, Rajagopalan K. Patient preferences for important attributes of bipolar depression treatments: a discrete choice experiment. Patient Preference Adherence. 2018;12:35–44.
Bell RA, Paterniti DA, Azari R, Duberstein PR, Epstein RM, Rochlen AB, et al. Encouraging patients with depressive symptoms to seek care: a mixed methods approach to message development. Patient Educ Couns. 2010;78(2):198–205.
GBD 2017 Risk Factor Collaborators. Global, regional, and national comparative risk assessment of 84 behavioural, environmental and occupational, and metabolic risks or clusters of risks for 195 countries and territories, 1990–2017: a systematic analysis for the Global Burden of Disease Study 2017. Lancet. 2018;392(10159):1923–94. https://doi.org/10.1016/S0140-6736(18)32225-6 Epub 2018 Nov 8. Erratum in: Lancet. 2019 Jan 12;393(10167):132. Erratum in: Lancet. 2019 Jun 22;393(10190):e44. PMID: 30496105; PMCID: PMC6227755.
Honda A, Ryan M, Van Niekerk R, Diane M. Improving the public health sector in South Africa: eliciting public preferences using a discrete choice experiment. Health Policy Plan. 2015;30(5):600–11. https://doi.org/10.1093/heapol/czu038 .
Lee S-J, Brooks R, Bolan RK, Flynn R. Assessing willingness to test for HIV among men who have sex with men using conjoint analysis, evidence for uptake of the FDA-approved at-home HIV test. AIDS Care. 2013;25(12):1592–8.
Huang MY, Huston SA, Perri M. Consumer preferences for the predictive genetic test for Alzheimer disease. J Genet Couns. 2014;23(2):172–8.
Herman PM, Ingram M, Cunningham CE, Rimas H, Murrieta L, Schachter K, et al. A Comparison of Methods for Capturing Patient Preferences for Delivery of Mental Health Services to Low-Income Hispanics Engaged in Primary Care. Patient. 2016;9(4):293–301.
Moor SE, Tusubira AK, Akiteng AR, Hsieh E. Development of a discrete choice experiment to understand patient preferences for diabetes and hypertension management in rural Uganda. Lancet Glob Health. 2020;8:S22. https://doi.org/10.1016/S2214-109X(20)30163-7 .
Alguera-Lara V, Dowsey MM, Ride J, Kinder S, Castle D. Shared decision making in mental health: the importance for current clinical practice. Australas Psychiatry. 2017;25(6):578–82. https://doi.org/10.1177/1039856217734711 Epub 2017 Oct 10. PMID: 29017332.
Larson E, Vail D, Mbaruku G, Kimweri A, Freedman L, Kruk M. moving toward patient-centered care in Africa: a discrete choice experiment of preferences for delivery Care among 3,003 Tanzanian women; 2020.
Cunningham CE, Barwick M, Rimas H, Mielko S, Barac R. Modeling the decision of mental health providers to implement evidence-based children’s mental health services: A discrete choice conjoint experiment. Adm Policy Ment Health Ment Health Serv Res. 2018;45(2):302–17.
Semrau M, Alem A, Ayuso-Mateos JL, Chisholm D, Gureje O, Hanlon C, et al. Strengthening mental health systems in low- and middle-income countries: recommendations from the Emerald programme. BJPsych Open. 2019;5(5):e73. https://doi.org/10.1192/bjo.2018.90 .
Cunningham CE, Deal K, Rimas H, Chen Y, Buchanan DH, Sdao-Jarvie K. Providing information to parents of children with mental health problems: a discrete choice conjoint analysis of professional preferences. J Abnorm Child Psychol. 2009;37(8):1089–102.
Download references
Authors would like to thank Jurgen Unutzer for introducing us to these methods.
The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008.
Research reported in this publication was supported by the Fogarty International Center of the National Institutes of Health under Award Number K43TW010716, which also supported the contributions of MK to this work. AL is supported by the National Institutes of Health under Award Number F31HD101149. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Authors and affiliations.
Department of Global Health, University of Washington, Seattle, WA, 98195, USA
Anna Larsen
IKUZE AFRICA, Nairobi, 00100, Kenya
Albert Tele
Department of Psychiatry, University of Nairobi, (47074), Nairobi, 00100, Kenya
Manasi Kumar
AL - Conception of the study, study design, implementation, interpretation of data, drafting and revising the paper, approval of the final draft, overall oversight. AT - Study design, data quality control, interpretation of the data, literature review and initial drafting. MK - Conception of the study, study design, implementation and interpretation of data, literature review, manuscript revision and approval of the final draft.
Correspondence to Manasi Kumar .
Ethics approval and consent to participate.
Not applicable.
Competing interests.
To the best of our knowledge, no conflict of interest, financial or other, exists. All the authors read and approved the manuscript.
Publisher’s note.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Additional file 1: Supplementary Table 1.
Search terms per database searched.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
Cite this article.
Larsen, A., Tele, A. & Kumar, M. Mental health service preferences of patients and providers: a scoping review of conjoint analysis and discrete choice experiments from global public health literature over the last 20 years (1999–2019). BMC Health Serv Res 21 , 589 (2021). https://doi.org/10.1186/s12913-021-06499-w
Received : 22 November 2020
Accepted : 04 May 2021
Published : 18 June 2021
DOI : https://doi.org/10.1186/s12913-021-06499-w
ISSN: 1472-6963
BMC Primary Care volume 23, Article number: 234 (2022)
While patients’ preferences in primary care have been examined in numerous conjoint analyses, there has been little systematic effort to synthesise the findings. This review aimed to identify, to organise and to assess the strength of evidence for the attributes and factors associated with preference heterogeneity in conjoint analyses for primary care outpatient visits.
We searched five bibliographic databases (PubMed, Embase, PsycINFO, Econlit and Scopus) from inception until 15 December 2021, complemented by hand-searching. We included conjoint analyses for primary care outpatient visits. Two reviewers independently screened papers for inclusion and assessed the quality of all included studies using the checklist by ISPOR Task Force for Conjoint Analysis. We categorized the attributes of primary care based on Primary Care Monitoring System framework and factors based on Andersen’s Behavioural Model of Health Services Use. We then assessed the strength of evidence and direction of preference for the attributes of primary care, and factors affecting preference heterogeneity based on study quality and consistency in findings.
Of 35 included studies, most (82.9%) were performed in high-income countries. Each study examined 3–8 attributes, mainly identified through literature reviews ( n = 25). Only six examined visits for chronic conditions, with the rest examining acute or non-specific / other conditions. Process attributes were more commonly examined than structure or outcome attributes. The three most commonly examined attributes were waiting time for appointment, out-of-pocket costs and ability to choose the providers they see. We identified 24/58 attributes with strong or moderate evidence of association with primary care uptake (e.g., various waiting times, out-of-pocket costs) and 4/43 factors with strong evidence of affecting preference heterogeneity (e.g., age, gender).
We found 35 conjoint analyses examining 58 attributes of primary care and 43 factors that potentially affect the preference of these attributes. The attributes and factors, stratified into evidence levels based on study quality and consistency, can guide the design of research or policies to improve patients’ uptake of primary care. We recommend future conjoint analyses to specify the types of visits and to define their attributes clearly, to facilitate consistent understanding among respondents and the design of interventions targeting them.
On Open Science Framework: https://osf.io/m7ts9
Primary care, defined as the first contact a person has with the health system, encompasses a broad range of health services, including preventive, curative and rehabilitative services, that address both acute and chronic conditions [ 1 , 2 , 3 ]. Internationally, better access to primary care has been associated with better health outcomes and lower total healthcare costs [ 4 ]. Thus, not only can primary care meet a broad range of people’s health needs, it can also provide quality health services without resulting in financial hardship [ 5 , 6 ].
To better address the changing health needs arising from an ageing population and the rising prevalence of chronic conditions, many countries worldwide, including low- and middle-income countries (LMICs), have undertaken initiatives to reform their delivery of primary care [ 7 , 8 ]. A central idea behind many such reforms is person-centred care, which emphasises the value of patients’ views in co-designing and delivering health care [ 9 , 10 ]. Co-designing and delivering person-centred care in primary care settings requires policy makers and primary care service providers to understand patients’ preferences for the health services delivered in primary care.
Conjoint analysis is a stated-preference method that derives the implicit values for an attribute of a product or a service using surveys [ 11 ]. In a conjoint analysis survey, respondents are presented with hypothetical alternatives of a product or a service characterised (conjointly) by two or more attributes, each over a range of levels, and are asked to rank, rate, or choose among these alternatives. A choice-based conjoint analysis, in which respondents are asked to choose between two or more alternatives, is also known as a “discrete choice experiment” (DCE). Based on how the rankings, ratings or choices differ between the shortlisted attributes or between the alternatives of primary care services characterised by the shortlisted attributes, one can estimate the preferences associated with the attributes [ 11 ] and use these preferences to predict uptake of the primary care service. Conjoint analyses can also elucidate preference heterogeneity by examining factors (e.g., patient characteristics) that modify the preference (and by extension, the uptake of the primary care service), which provides insight into how to tailor the service to the characteristics of the target population.
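As an illustration of how estimated preferences can be used to predict uptake, the following minimal sketch applies a multinomial logit rule to one choice set. It is not drawn from any of the reviewed studies: the attribute names, the part-worth values and the two clinics are hypothetical, and the logit form is an assumption chosen because it is widely used for DCE data.

```python
import math


def choice_probabilities(alternatives, part_worths):
    """Return multinomial-logit choice probabilities for one choice set.

    alternatives: list of dicts mapping attribute name -> level (numeric).
    part_worths:  dict mapping attribute name -> preference weight.
    """
    # Utility of each alternative is the weighted sum of its attribute levels.
    utilities = [
        sum(part_worths[attr] * level for attr, level in alt.items())
        for alt in alternatives
    ]
    # Subtract the maximum utility before exponentiating for numerical stability.
    top = max(utilities)
    exp_u = [math.exp(u - top) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]


# Hypothetical part-worths: longer waits and higher costs lower utility,
# being able to choose the provider raises it.
part_worths = {"wait_days": -0.10, "cost": -0.05, "choose_provider": 0.60}

clinic_a = {"wait_days": 1, "cost": 20, "choose_provider": 0}
clinic_b = {"wait_days": 5, "cost": 10, "choose_provider": 1}

probs = choice_probabilities([clinic_a, clinic_b], part_worths)
print(probs)
```

With these made-up weights, clinic B has the higher utility (-0.4 versus -1.1) and therefore the higher predicted uptake, even though its waiting time is longer.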
Given its usefulness, numerous conjoint analyses on patients’ preference in primary care have been performed among patients visiting primary care facilities or among public members who are potential users of primary care. The only review of conjoint analyses on patients’ preference in primary care thus far found 18 DCEs (including two on out-of-hour service) performed between 2006 and 2015. The review [ 12 ] summarised a list of the attributes examined, organised into three general categories of structure, process and outcome attributes. However, it did not synthesise the direction of preference and the strengths of evidence of the attributes. The review also did not examine factors affecting preference heterogeneity. A synthesis of evidence for primary care attributes and factors affecting preference heterogeneity would advise which attributes or factors should be considered in future research and policy decisions in providing person-centred care at primary care settings.
To address these gaps, our review aims (1) to update the list of primary care attributes and to provide a list of factors affecting preference heterogeneity, focusing on outpatient visits and drawing on all studies since the inception of the databases; (2) to categorise the attributes based on a framework developed to describe primary care systems [ 13 , 14 ], and the factors based on a framework of health services utilisation [ 15 ]; and (3) to synthesise the direction and the strength of evidence of the attributes and the factors affecting preference heterogeneity.
This systematic review was prospectively registered on Open Science Framework ( https://osf.io/m7ts9 ) and is reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guideline (Appendix 1 ).
We conducted systematic searches in five databases (PubMed, Embase, PsycINFO, Econlit and Scopus) from inception until 15 December 2021 using terms related to “primary care” and “preferences”, “conjoint analyses” or “DCE” (Appendix 2 ); these terms were adapted from the previous review on the same topic [ 12 ], as well as other systematic reviews in primary care [ 16 , 17 , 18 , 19 ] and systematic reviews of discrete choice experiments in healthcare [ 20 , 21 , 22 ]. To identify studies that may have been missed by the database searches, we also hand-searched Google, the studies included in the previous review [ 12 ], and the reference lists of included studies.
All articles from the database searches were downloaded into EndNote for de-duplication, before being screened for eligibility by two independent reviewers (AHL, SWN) based on titles and abstracts and subsequently, based on full text. Any disagreements were reconciled via consensus and if necessary, involving a third reviewer (XRT or KKL). In cases of no access to full text, we contacted the corresponding authors of the studies and the journals multiple times. If we did not receive any response from the corresponding authors and the journals by the time the manuscript draft was complete, the studies were excluded.
We included studies that used DCEs or conjoint analyses to survey the patients or the general public on preferences for primary care outpatient visits.
We excluded studies that examined preferences on specific treatment (e.g., anti-diabetics), specific services in a clinic (e.g., pharmacy services), services in hospital outpatient clinics or out-of-hour services. Studies on out-of-hours service were excluded because they have evolved in some settings to be delivered over the phone or in tandem with hospital emergency departments, hence cater to patients with perceived urgent problems who are different from the general population who use primary care [ 23 ]. The inclusion and exclusion criteria are also summarised in Appendix 3 .
We created a data extraction form and a data dictionary using Microsoft Excel to extract data on study settings (publication year, continent, country’s income level, sources of funding), study design (recruitment setting and methods of survey administration), questionnaire design (the choice contexts, the types of primary care visits, the attributes, methods to identify the attributes and levels, the factors affecting preference heterogeneity, methods to generate choice sets and whether the study reported design efficiency), study samples (sample size, response rate, age, gender) and analyses (statistical model) from eligible articles. We also extracted the direction of association and statistical significance at p < 0.05 for the attributes and the factors affecting preference heterogeneity. Factors affecting preference heterogeneity were identified from study sample characteristics that were associated with latent class memberships (among studies that performed latent class analysis) or characteristics that moderated the associations between attributes and primary care uptake (among studies that performed logit or probit regression analyses). The data extraction form and the data dictionary were pre-tested with two studies by AHL and SWN and feedback was obtained to update the form before use.
The quality of the included studies was appraised using the checklist by ISPOR Task Force for Conjoint Analysis [ 24 ]. The checklist is made up of 10 items, each comprising 3 criteria. Each criterion was first evaluated “Yes”, “Partial” or “No” by two independent reviewers (AHL with SMO, or SWN). Based on the extent to which the three criteria were met, each item was then rated “Yes”, “Partial” or “No”. Any disagreements between them were reconciled via consensus, and if necessary, involving a third reviewer (LKK).
To provide an overview, we tabulated, in numbers and percentages, the study and sample characteristics, including the contexts of the choice questions (hereafter “choice contexts”), the types of primary care visits, the attributes and the factors affecting preference heterogeneity. The choice contexts were categorised based on for whom the primary care services were chosen (self, friend or relative) and if specified, the hypothetical reason the choices were required (e.g., current primary care clinic closes). The types of visits were categorised into visits for major acute, minor acute, chronic, or non-specific / other conditions based on data that emerged from the included studies. “Minor acute” conditions included influenza, urinary tract infections and upper respiratory tract infections while “major acute” conditions included severe lower back pain, “new urgent symptoms”, and perceived severe disease. Meanwhile, “non-specific / other conditions” referred to routine check-ups or conditions that were not explicitly stated and thus unable to be categorised into acute or chronic.
Meanwhile, the attributes were categorised into three levels (structure, process, or outcome). Each level was broken down into dimensions and features, based on the Primary Care Monitoring System (PC Monitor) framework. The framework describes primary care systems in three levels of structure, process, and outcome, each further divided into dimensions and features, with a total of 11 dimensions and 57 features. For example, the structure level comprises three dimensions: (a) governance, (b) economic conditions, and (c) workforce development. The governance dimension, for instance, includes the use of appropriate technology, decentralisation, ownership, etc. as its features. Meanwhile, the process level comprises four dimensions: (a) access, (b) continuity of care, (c) coordination of care, and (d) comprehensiveness of care; the outcome level comprises three dimensions: (a) quality of care, (b) efficiency of care, and (c) equity in health [ 13 , 14 ] (Fig. 1 ).
Finally, the factors affecting preference heterogeneity were categorised based on Andersen’s Behavioural Model of Health Services Use [ 15 ] into predisposing, enabling, health behaviour or need factors.
In the absence of a gold standard on what constitutes “high quality”, we considered studies rated either “Yes” or “Partial” across all 10 items as high quality in the main analysis and studies rated “Yes” in ≥ 5 out of 10 items as high quality in the sensitivity analysis [ 24 ].
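The two quality definitions can be expressed as a short rule. The sketch below is illustrative only: the function name and the list-of-strings representation of a study's 10 item ratings are assumptions made for the example, not part of the review's tooling.

```python
def is_high_quality(item_ratings, analysis="main"):
    """Classify a study from its 10 ISPOR checklist item ratings.

    item_ratings: list of 10 strings, each "Yes", "Partial" or "No".
    analysis:     "main"        -> no item rated "No";
                  "sensitivity" -> at least 5 items rated "Yes".
    """
    if len(item_ratings) != 10:
        raise ValueError("expected ratings for exactly 10 checklist items")
    if analysis == "main":
        return all(r in ("Yes", "Partial") for r in item_ratings)
    if analysis == "sensitivity":
        return sum(r == "Yes" for r in item_ratings) >= 5
    raise ValueError("analysis must be 'main' or 'sensitivity'")


# Hypothetical studies: the two definitions can disagree in either direction.
mostly_partial = ["Yes"] * 4 + ["Partial"] * 6
one_no = ["Yes"] * 9 + ["No"]

print(is_high_quality(mostly_partial, "main"), is_high_quality(mostly_partial, "sensitivity"))
print(is_high_quality(one_no, "main"), is_high_quality(one_no, "sensitivity"))
```

The example shows why the sensitivity analysis is informative: a study with no “No” ratings but few “Yes” ratings counts as high quality only under the main definition, whereas a study with a single “No” counts only under the sensitivity definition.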
To synthesise the evidence level, we stratified each attribute and each factor into strong, moderate, limited, conflicting or inconclusive based on study quality and consistency of findings across ≥ 75% of studies [ 25 , 26 , 27 ]. As illustrated in Fig. 2 , an attribute (or a factor) had “strong evidence” if it had been examined ≥ 2 times in studies of high quality, of which ≥ 75% produced consistent findings. If an attribute had been examined once in a high-quality study and ≥ 2 times in low-quality studies with consistent findings, it would be assigned “moderate evidence”. If an attribute had only been examined once in a high- and a low-quality study each or produced consistent findings ≥ 3 times in low-quality studies, it would be assigned “limited evidence”. If an attribute had been examined < 3 times in low-quality studies, the level of evidence would be deemed “inconclusive”. If < 75% of the findings were consistent, the evidence level would be deemed “conflicting” regardless of the study quality. For attributes that were binary (yes/no), ordinal or continuous, consistency accounted for the direction of association (positive, negative, none) as well as statistical significance (at p < 0.05), whereas for attributes that were nominal (e.g., choice of providers), consistency accounted for statistical significance only; the same applied to factors affecting preference heterogeneity. We were unable to account for consistency in the direction for binary (yes/no), ordinal or continuous factors affecting preference heterogeneity due to the small number of studies examining the interaction terms of the same factor with the same attribute. This approach to evidence synthesis is commonly used in systematic reviews where meta-analyses are not feasible due to heterogeneity among the included studies. While it has been applied to synthesise evidence levels in systematic reviews of prognostic factors of clinical conditions [ 25 , 26 , 27 ], we are not aware of any attempt to apply the approach to synthesise the evidence levels for attributes and / or factors affecting preference heterogeneity in a systematic review of conjoint analyses.
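The stratification rules described above can be sketched as a decision function. This is one possible reading of the text rather than the authors' implementation: the function names, the coding of findings as "+", "-" or "0", and the definition of consistency as the share of findings agreeing with the most common finding are all assumptions. Combinations that the text does not spell out (for example, a single high-quality study on its own) return `None` instead of a guessed level.

```python
from collections import Counter


def consistency(findings):
    """Share of findings that agree with the most common finding."""
    if not findings:
        return 0.0
    return Counter(findings).most_common(1)[0][1] / len(findings)


def evidence_level(high, low, threshold=0.75):
    """Assign an evidence level from findings in high- and low-quality studies.

    high, low: lists of findings (e.g. "+", "-", "0"), one per examination.
    Returns a level name, or None when the text does not specify the case.
    """
    if not high and not low:
        raise ValueError("attribute was never examined")
    # Conflicting: fewer than 75% of all findings agree, whatever the quality.
    if consistency(high + low) < threshold:
        return "conflicting"
    # Strong: at least 2 high-quality examinations, 75% or more of them consistent.
    if len(high) >= 2 and consistency(high) >= threshold:
        return "strong"
    # Moderate: one high-quality plus at least 2 low-quality examinations.
    if len(high) == 1 and len(low) >= 2:
        return "moderate"
    # Limited: one of each, or at least 3 consistent low-quality examinations.
    if (len(high) == 1 and len(low) == 1) or (not high and len(low) >= 3):
        return "limited"
    # Inconclusive: fewer than 3 examinations, all in low-quality studies.
    if not high and len(low) < 3:
        return "inconclusive"
    return None  # combination not spelled out in the text


print(evidence_level(["+", "+", "+"], []))
print(evidence_level(["+"], ["+", "+"]))
print(evidence_level(["+", "-"], ["+"]))
```

Checking the conflicting case first mirrors the statement that inconsistency overrides study quality; the remaining branches are then evaluated only for attributes whose findings are at least 75% consistent.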
All analyses were performed on Microsoft Excel or R version 4.0.5 (The R Foundation for Statistical Computing, Vienna).
The search strategy identified 18,980 articles (Fig. 3 ), of which 17,233 were unique. After screening their titles and abstracts, 166 were retrieved for full text screening, from which 132 were excluded because they were not DCEs ( n = 53), were not on primary care ( n = 45), examined specific treatment ( n = 20), not English ( n = 8), examined preferences for out-of-hours treatment ( n = 5), or conference abstract ( n = 1). One additional article [ 28 ] was retrieved from the previous review [ 12 ]. For one abstract that may be eligible based on title and abstract [ 29 ], we had to contact the author and the journal via their contact emails and ResearchGate accounts for the full-text but did not receive a reply despite five attempts over a span of nine months. This gave 35 eligible articles for extraction, of which two were rating-based conjoint analyses, and the rest choice-based conjoint analysis or DCEs.
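The study flow reported above can be checked with simple arithmetic. The sketch below only re-adds the counts stated in the text; the variable names are chosen for the example.

```python
# Counts as reported in the text.
identified = 18_980
unique = 17_233
full_text_screened = 166
full_text_exclusions = {
    "not a DCE": 53,
    "not on primary care": 45,
    "specific treatment": 20,
    "not English": 8,
    "out-of-hours treatment": 5,
    "conference abstract": 1,
}
added_from_previous_review = 1

duplicates_removed = identified - unique
excluded_on_title_abstract = unique - full_text_screened
excluded_on_full_text = sum(full_text_exclusions.values())
included = full_text_screened - excluded_on_full_text + added_from_previous_review

print(duplicates_removed)          # 1747
print(excluded_on_title_abstract)  # 17067
print(excluded_on_full_text)       # 132
print(included)                    # 35
```

The exclusion reasons sum to the 132 full-text exclusions reported, and the remaining 34 articles plus the one retrieved from the previous review give the 35 included studies.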
Table 1 summarises the study and sample characteristics, with details for each study in Appendix 4 . The studies were mostly published after 2010 (60.0%), in Europe (65.7%), and from high-income countries (82.9%). Among studies that reported funding sources (71.4%), government funding dominated (45.7%). Study samples were recruited from primary care facilities (54.3%) or the community (42.9%), and most respondents self-completed the questionnaires (62.9%). On average, the studies recruited 881.8 respondents, with an average response rate of 62.8%. The respondents had a mean age of 51.6 years, and 41.9% were men.
The studies examined minor acute (54.3%), non-specific / other (45.7%), chronic (17.1%) and / or major acute (11.4%) conditions. They more frequently used process (94.3%) or outcome (91.4%) than structure attributes (51.4%), predominantly identified through literature review (71.4%). The 16 studies that investigated factors affecting preference heterogeneity most frequently examined predisposing characteristics (28.6%), followed by enabling resources (25.7%), needs (14.3%) and health behaviour (5.7%). As for statistical analysis, the logit model (74.5%) was the most widely used.
Study quality was determined based on the number of items rated “Yes” for each study. Including one study that received only “Yes” ratings, 29/35 studies had “Yes” or “Partial” across all 10 items; these studies were considered high quality in the main analysis. Meanwhile, 25/35 studies received ≥ 5 “Yes” ratings and were considered high quality in the sensitivity analysis.
Only 4/10 items received at least one “No” – “choice of attributes and levels supported by evidence” (3/35 studies were rated “No”), “choice of experimental design justified and evaluated” (2/35 “No”), “appropriate statistical analyses and model estimations” (2/35 “No”) and “appropriate design of data collection instrument” (1/35 “No”) (Appendix 5 ).
Overall, the 35 included studies examined 58 unique primary care attributes 183 times (average 5.2 attributes per study). These attributes fell into 3 levels, 9 dimensions and 19 features of primary care of the PC Monitor framework (Fig. 1 , Appendix 6 ).
Among the 3 levels of primary care, process had the largest number of unique attributes (34) across 4 dimensions (access, comprehensiveness, continuity, and coordination) and 12 features; outcome had 19 unique attributes across 2 dimensions (quality, efficiency) and 3 features; structure had 5 unique attributes across 3 dimensions (governance, workforce, others) and 4 features. Relational continuity of care was the most examined feature within the process level, efficiency in the performance of primary care workforce was the most examined feature within the outcome level, whereas profile of workforce was the most examined feature within the structure level (Fig. 1 ).
Across all levels, dimensions, and features of primary care, the ten most frequently examined attributes were waiting time for appointment (20 studies), out-of-pocket cost (15 studies), ability to choose the providers they see (15 studies), length of consultation time (12 studies), waiting time at clinic (10 studies), involvement in decision making (10 studies), amount of information received during consultation (8 studies), quality of the physical exam (7 studies), depth of the explanation (6 studies), and convenience of appointment time (5 studies) (Appendix 7 ).
Based on all 35 included studies regardless of type of visits, of the 58 attributes, none had inconclusive or conflicting evidence, but 21 had strong, 3 had moderate and 34 had limited strength of evidence (Table 2 a). Most of the attributes, listed in Table 3 , either positively or negatively influenced preference for primary care. For example, higher experience of care providers, availability of a convenient appointment time, better communication skills, better drug availability, longer consultation time, extended opening hours, and a greater amount of information received were associated with higher preference for primary care, whereas longer distance, higher out-of-pocket cost and longer waiting time were associated with lower preference; these attributes had strong or moderate strength of evidence in the main analyses and retained their strengths of evidence in the sensitivity analyses, except for drug availability, for which the strength of evidence became limited. On the other hand, some attributes in the main analyses had limited strength of evidence of positively influencing preference (e.g., clinic managed by the government, availability of home visits, opening at lunch time or more days in a week, multidisciplinary care) or negatively influencing preference (e.g., clinics seeking voluntary contribution in addition to out-of-pocket cost, waiting time for referral). Finally, a minority of attributes, for instance, amount of billing problems, facility size, and provision of preventive care by the facility, were found to have no association with preference for primary care, although the evidence for them is also of limited strength.
The number of attributes with strong or moderate evidence decreased when the evidence was stratified by the type of visits, with some attributes becoming inconclusive (Table 2 a). The full list of attributes is available in Appendix 7 , including how their strengths of evidence varied with the type of visits.
The 16 studies examined 43 unique factors affecting preference heterogeneity (Table 2 b) 196 times (average 12.3 factors per study) – enabling resources (22 factors), needs factors (12 factors), predisposing characteristics (7 factors), and health behaviour (2 factors). Of these, only 4 had strong evidence of affecting preference heterogeneity of primary care (Table 4 ), i.e., age, gender, employment status, and income; all retained their strength of evidence in sensitivity analysis. Older respondents preferred lower out-of-pocket cost [ 30 , 31 ] and to choose their own healthcare provider [ 32 , 33 , 34 ] while younger respondents preferred shorter waiting times [ 31 , 35 ]. Meanwhile, female respondents preferred to choose their own healthcare provider [ 33 , 34 , 36 ] and better quality physical examination [ 31 ]. Patients who are employed were more willing to pay higher out-of-pocket cost [ 30 ] but preferred shorter waiting times [ 34 ], likewise for those with higher incomes [ 37 ]. The remaining factors had limited ( n = 31), inconclusive ( n = 5) or conflicting ( n = 3) evidence of affecting preference heterogeneity of primary care. The full list of factors is available in Appendix 8 , including how their strengths of evidence varied with the type of visits.
To provide person-centred care, primary care provision should align with patients’ preferences. The preferences of patients as well as public members who could be patients have been examined in numerous conjoint analyses. However, no systematic effort has been undertaken to synthesise their findings. To address this gap, our systematic review identified, organised, and assessed the evidence level of the attributes examined for patients’ preferences in primary care as well as the factors affecting these preferences. The 35 included conjoint analyses had similar characteristics – most were published in the last decade (since 2010), by high-income countries in Europe based on samples recruited from primary care facilities seeking to elicit preferences on visits for acute or non-specific / other conditions. Thus, it may not be surprising that despite spanning diverse levels, dimensions, and features of primary care, none of the 58 attributes was found to have conflicting evidence. Instead, 24 had strong or moderate evidence of an association with preference for primary care, while the remaining 34 attributes had limited evidence of an association or no association. Similarly for the factors affecting preference heterogeneity, albeit with smaller number of studies and only 4 factors found to have strong or moderate evidence.
Process of care, which had the highest number of unique attributes (vs structure and outcomes), was the most studied level of primary care. That no single attribute dominated the list indicates more varied priorities in selecting process attributes. Conversely, the lack of interest in structure of care (the lowest number of unique attributes) may be because structural attributes are less observable by the public and less amenable to change by policy makers in the short term.
Meanwhile, the absence of attributes with conflicting evidence in our syntheses implies that patients or members of the public generally have consistent preferences, at least within the contexts examined by the included studies. This consistency suggests that it is feasible to improve primary care uptake by changing the attributes in the direction associated with a higher preference. Based on our review, examples of such attributes include the providers’ communication skills (strong evidence for all visits except those for chronic conditions), the quality of the physical examination (strong evidence for minor acute conditions) and opening hours at the weekend (strong evidence for other/non-specific visits). On the other hand, our review also found some studies reporting attributes with a subjective or unclear definition, e.g., “best care” in one of the included studies [38]. Such attributes are likely to be challenging to operationalise and to target in policy interventions, as different respondents may understand them differently. To facilitate consistent understanding and the design of policy interventions [39, 40], we recommend that future studies clearly define and present their attributes (e.g., in a table, as in Wang et al. [41]).
As few studies examined factors affecting preference heterogeneity, most factors had either limited or inconclusive evidence. Of the 43 unique factors, only four were examined across enough studies to have strong evidence of affecting preference heterogeneity (age, gender, employment status, and income). Younger respondents and those with higher incomes may have a lower preference for long waiting times for acute conditions [35] due to the perceived lower value of a visit [42], while older respondents prefer lower out-of-pocket costs [30, 37], possibly due to growing financial constraints [43] or rising healthcare expenditure with age [44]. Meanwhile, female respondents may prefer to choose their own providers [33], as they are likely to trust female physicians more [45] and to be more comfortable with them [46, 47]. On the other hand, three factors were found to have conflicting evidence (education level, health status, and chronic disease status), which may be due to the same factor interacting differently with different attributes. For instance, those with chronic diseases were found to prefer more information on their condition but also less involvement in their treatment [48]. Hence, unlike for the attributes, we could not examine the direction of association for the factors affecting preference heterogeneity; this should be explored further in future conjoint analyses.
The only other review [12] on patients’ preferences in primary care searched three databases between 2006 and 2015, compared to five databases without date restriction (until 15 December 2021) in our review. This gave us more eligible studies (35 vs 18) and more unique attributes (58 vs 30). Of the 18 studies in the previous review [12], 16 were included in our current review (15 of which appeared in our database searches); the remaining two [49, 50] were excluded as they examined out-of-hours services. In terms of findings, the earlier review [12] found structure attributes to be the most common, whereas our review found process attributes to be predominant. This difference arises because the two reviews used different definitions to categorise the attributes. The earlier review [12] followed the definitions in Donabedian’s model for quality of health care [51], whereas we followed those in the PC Monitor framework [13, 14], which was specifically designed for primary care and allowed us to sub-categorise each attribute into dimensions and features. As a result, some attributes (e.g., opening hours, cost and distance) that were classed as “structure” in the earlier review [12] were considered “process” in our review.
In addition to a list of attributes, our review generates further insights by (1) examining the factors affecting heterogeneity, (2) appraising the quality of included studies and (3) synthesising, based on study quality and consistency in findings, the evidence levels of the attributes and of the factors affecting preference heterogeneity, overall and by type of visit. Our findings on the attributes, their evidence levels and directions of association largely corroborate findings from other quantitative or qualitative studies on barriers and facilitators to accessing primary care, which found a higher preference for shorter travel distance to the health facility [52], shorter waiting time [53, 54], lower out-of-pocket costs [55], being treated with respect and having their own choice of healthcare provider [56]. Our findings on the factors affecting preference heterogeneity are likewise consistent with other studies: female respondents preferred to choose a healthcare provider with whom they were more comfortable [46, 47], while older respondents preferred to choose their healthcare provider but placed higher emphasis on the doctor making decisions [57]. Those with higher incomes were also willing to pay more for treatment than respondents with lower incomes [57].
Our findings should be interpreted alongside several limitations. First, the categories of attributes are based on the PC Monitor framework, whose definitions may differ from those of other frameworks for primary care services [13]. However, as the framework was developed from a systematic review [13, 14], it increases the generalisability of our findings to other settings. Second, some attributes may fit under more than one category. For instance, “quality of the physical exam” reported in Cheraghi-Sohi et al. [58] and Kruk et al. [31] was categorised under the “treatment and follow-up of diagnosis” feature of primary care (Appendix 6), although it may also fit under “quality of diagnosis and treatment in primary care”. For ease of interpretation, however, we assigned each attribute to only one level, one dimension and one feature. Third, as we synthesised evidence only from published literature, our findings on the evidence levels may be susceptible to publication bias. Fourth, as we extracted findings only from the final model of each study, our findings on the evidence levels may also be sensitive to the model selection of the respective studies. Fifth, the small number of studies that examined factors affecting preference heterogeneity only allowed us to synthesise the overall evidence levels of these factors, rather than how they interact with different attributes; this can be explored in future conjoint analyses or reviews. Finally, we only included conjoint analyses examining primary care outpatient visits. Hence, our findings may not generalise to other services that may be considered primary care, e.g., antenatal care [59, 60] or pharmacy services [61].
Despite these limitations, the syntheses of evidence levels for the attributes and for the factors affecting preference heterogeneity are the main strengths of our review. To our knowledge, such syntheses have been done in systematic reviews of prognostic factors [25, 26, 27] but not in any systematic review of DCEs.
For research, our findings may inform the choice of attributes and of factors affecting preference heterogeneity in future conjoint analyses. For instance, future conjoint analyses may focus on attributes with limited or inconclusive evidence, or on attributes in levels, dimensions or features of primary care that have been less studied. We also found a paucity of evidence for chronic conditions and for LMICs other than China, despite the importance of primary care in meeting the preventive and curative care needs of patients with chronic conditions, including in LMICs. In addressing these gaps, we recommend that future conjoint analyses specify the types of visits, as our findings suggest that patients’ preferences may differ between types of primary care visits.
For policy, our findings provide an evidence-based list of attributes for designing primary care services for optimal uptake at the local, regional, and national levels. At the local level, the attributes with strong or moderate evidence suggest that extending opening hours, and allowing patients to choose their own providers or see a provider they are familiar with, would improve the uptake of primary care services. Similarly, proactive management of the waiting time to get an appointment, or of the waiting time at the clinic, may also help. Healthcare providers may also be given training in communication skills, including how to involve patients in their treatment decisions. At the regional or national level, new primary care facilities should ideally be built within a reasonable distance or travel time of nearby communities, with services available at a reasonable out-of-pocket cost. It will be up to policy makers to determine which attributes should be prioritised based on local context, whether as part of ongoing changes or of a larger reform.
Our review found 35 studies that examined 58 attributes and 43 factors that potentially affect patients’ preferences in primary care. We categorised these based on the PC Monitor framework and synthesised their strength of evidence based on study quality and the consistency of findings across studies. The lists of attributes and factors, with their evidence levels, can guide policies to improve patients’ uptake of primary care as well as future DCE studies in this area. Given the lack of conjoint analyses performed in LMICs or examining visits for chronic conditions, we recommend that future DCEs address these gaps. In addressing any research gaps on preferences for primary care outpatient visits, future studies should specify the types of visits and define their attributes clearly, to facilitate the design of interventions that target these attributes.
Number of studies examining each level, dimension and feature of the Primary Care (PC) Monitor Framework
Graphical presentation of the algorithm used to assign evidence level for each attribute and each factor
PRISMA flow diagram
All data presented in the manuscript or additional files are extracted from published papers, hence are publicly available.
DCE: Discrete choice experiment
PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses
LMICs: Low- and middle-income countries
ISPOR: International Society for Pharmacoeconomics and Outcomes Research
PC Monitor: Primary care monitoring system
International Conference on Primary Health Care, World Health Organization, United Nations Children's Fund. Primary health care: Report of the International Conference on Primary Health Care, Alma-Ata, USSR, 6–12 September 1978 / Jointly Sponsored by the World Health Organization and the United Nations Children's Fund. Geneva: World Health Organization; 1978. Report No.: 9241800011.
Organisation for Economic Co-operation and Development. Primary Care. 2022. [cited 2022 26 May]. Available from: https://www.oecd.org/health/primary-care.htm .
World Health Organization. Primary health care. 2021. [cited 2022 26 May]. Available from: https://www.who.int/health-topics/primary-health-care#tab=tab_1 .
Starfield B, Shi L, Macinko J. Contribution of primary care to health systems and health. Milbank Q. 2005;83(3):457–502.
van Weel C, Kidd MR. Why strengthening primary health care is essential to achieving universal health coverage. CMAJ. 2018;190(15):E463–6.
Declaration of Astana. Global Conference on Primary Health Care. Astana: World Health Organisation; 2018. [cited 2022 26 May]. Available from: https://www.who.int/primary-health/conference-phc/declaration .
Wang Y, Wilkinson M, Ng E, Cheng KK. Primary care reform in China. Br J Gen Pract. 2012;62(603):546.
Ekawati FM, Claramita M, Hort K, Furler J, Licqurish S, Gunn J. Patients’ experience of using primary care services in the context of Indonesian universal health coverage reforms. Asia Pac Fam Med. 2017;16:4.
Santana MJ, Manalili K, Jolley RJ, Zelinsky S, Quan H, Lu M. How to practice person-centred care: a conceptual framework. Health Expect. 2018;21(2):429–40.
Epstein RM, Street RL Jr. The values and value of patient-centered care. Ann Fam Med. 2011;9(2):100–3.
Ryan M, Bate A, Eastmond CJ, Ludbrook A. Use of discrete choice experiments to elicit preferences. Qual Health Care. 2001;10(Suppl 1):i55–60.
Kleij KS, Tangermann U, Amelung VE, Krauth C. Patients’ preferences for primary health care - a systematic literature review of discrete choice experiments. BMC Health Serv Res. 2017;17(1):476.
Kringos DS, Boerma WGW, Bourgueil Y, Cartier T, Hasvold T, Hutchinson A, et al. The European primary care monitor: structure, process and outcome indicators. BMC Fam Pract. 2010;11(1):81.
Kringos DS, Boerma WGW, Hutchinson A, van der Zee J, Groenewegen PP. The breadth of primary care: a systematic literature review of its core dimensions. BMC Health Serv Res. 2010;10(1):65.
Babitsch B, Gohl D, von Lengerke T. Re-revisiting Andersen's Behavioral Model of Health Services Use: a systematic review of studies from 1998–2011. Psycho-Soc Med. 2012;9:Doc11.
Welzel FD, Stein J, Hajek A, König H-H, Riedel-Heller SG. Frequent attenders in late life in primary care: a systematic review of European studies. BMC Fam Pract. 2017;18(1):104.
Kronenberg C, Doran T, Goddard M, Kendrick T, Gilbody S, Dare CR, et al. Identifying primary care quality indicators for people with serious mental illness: a systematic review. Br J Gen Pract. 2017;67(661):e519–30.
Esponda GM, Hartman S, Qureshi O, Sadler E, Cohen A, Kakuma R. Barriers and facilitators of mental health programmes in primary care in low-income and middle-income countries. Lancet Psychiatry. 2020;7(1):78–92.
Wiysonge CS, Paulsen E, Lewin S, Ciapponi A, Herrera CA, Opiyo N, et al. Financial arrangements for health systems in low-income countries: an overview of systematic reviews. Cochrane Database Syst Rev. 2017;9(9):CD011084.
Soekhai V, de Bekker-Grob EW, Ellis AR, Vass CM. Discrete Choice Experiments in Health Economics: Past, Present and Future. Pharmacoeconomics. 2019;37(2):201–26.
Clark MD, Determann D, Petrou S, Moro D, de Bekker-Grob EW. Discrete choice experiments in health economics: a review of the literature. Pharmacoeconomics. 2014;32(9):883–902.
de Bekker-Grob EW, Ryan M, Gerard K. Discrete choice experiments in health economics: a review of the literature. Health Econ. 2012;21(2):145–72.
Foster H, Moffat KR, Burns N, Gannon M, Macdonald S, O’Donnell CA. What do we know about demand, use and outcomes in primary care out-of-hours services? A systematic scoping review of international literature. BMJ Open. 2020;10(1): e033481.
Bridges JFP, Hauber AB, Marshall D, Lloyd A, Prosser LA, Regier DA, et al. Conjoint Analysis Applications in Health—a Checklist: A Report of the ISPOR Good Research Practices for Conjoint Analysis Task Force. Value Health. 2011;14(4):403–13.
Lievense AM, Bierma-Zeinstra SMA, Verhagen AP, Verhaar JAN, Koes BW. Influence of hip dysplasia on the development of osteoarthritis of the hip. Ann Rheum Dis. 2004;63(6):621–6.
Bastick AN, Runhaar J, Belo JN, Bierma-Zeinstra SM. Prognostic factors for progression of clinical osteoarthritis of the knee: a systematic review of observational studies. Arthritis Res Ther. 2015;17(1):152.
Lim KK, Matchar DB, Chong JL, Yeo W, Howe TS, Koh JSB. Pre-discharge prognostic factors of physical function among older adults with hip fracture surgery: a systematic review. Osteoporos Int. 2019;30(5):929–38.
Tinelli M, Ryan M, Bond C. Patients’ preferences for an increased pharmacist role in the management of drug therapy. Int J Pharm Pract. 2010;17(5):275–82.
Kuzmanovic M, Vujosevic M, Martic M. Using Conjoint Analysis to Elicit Patients’ Preferences for Public Primary Care Service in Serbia. HealthMED. 2012;6:497–504.
Hjelmgren J, Anell A. Population preferences and choice of primary care models: a discrete choice experiment in Sweden. Health Policy. 2007;83(2–3):314–22.
Kruk ME, Rockers PC, Tornorlah Varpilah S, Macauley R. Population preferences for health care in Liberia: insights for rebuilding a health system. Health Serv Res. 2011;46(2):2057–78.
Gerard K, Salisbury C, Street D, Pope C, Baxter H. Is fast access to general practice all that should matter? A discrete choice experiment of patients’ preferences. J Health Serv Res Policy. 2008;13(Suppl 2):3–10.
Vick S, Scott A. Agency in health care. Examining patients’ preferences for attributes of the doctor-patient relationship. J Health Econ. 1998;17(5):587–605.
Rubin G, Bate A, George A, Shackley P, Hall N. Preferences for access to the GP: a discrete choice experiment. Br J Gen Pract. 2006;56(531):743–8.
Seghieri C, Mengoni A, Nuti S. Applying discrete choice modelling in a priority setting: an investigation of public preferences for primary care models. Eur J Health Econ. 2014;15(7):773–85.
Jia E, Gu Y, Peng Y, Li X, Shen X, Jiang M, et al. Preferences of Patients with Non-Communicable Diseases for Primary Healthcare Facilities: A Discrete Choice Experiment in Wuhan, China. Int J Environ Res Public Health. 2020;17(11):3987.
Zhu J, Li J, Zhang Z, Li H. Patients’ choice and preference for common disease diagnosis and diabetes care: A discrete choice experiment. Int J Health Plann Manage. 2019;34(4):e1544–55.
Tinelli M, Nikoloski Z, Kumpunen S, Knai C, Pribakovic Brinovec R, Warren E, et al. Decision-making criteria among European patients: exploring patient preferences for primary care services. Eur J Pub Health. 2014;25(1):3–9.
Kløjgaard ME, Bech M, Søgaard R. Designing a Stated Choice Experiment: The Value of a Qualitative Process. J Choice Model. 2012;5(2):1–18.
Pearce A, Harrison M, Watson V, Street DJ, Howard K, Bansback N, et al. Respondent Understanding in Discrete Choice Experiments: A Scoping Review. Patient. 2021;14(1):17–53.
Wang X, Song K, Zhu P, Valentijn P, Huang Y, Birch S. How Do Type 2 Diabetes Patients Value Urban Integrated Primary Care in China? Results of a Discrete Choice Experiment. Int J Environ Res Public Health. 2019;17(1):117.
Chu H, Westbrook RA, Njue-Marendes S, Giordano TP, Dang BN. The psychology of the wait time experience – what clinics can do to manage the waiting experience for patients: a longitudinal, qualitative study. BMC Health Serv Res. 2019;19(1):459.
Huang R, Ghose B, Tang S. Effect of financial stress on self-reported health and quality of life among older adults in five developing countries: a cross sectional analysis of WHO-SAGE survey. BMC Geriatr. 2020;20(1):288.
Hazra NC, Rudisill C, Gulliford MC. Determinants of health care costs in the senior elderly: age, comorbidity, impairment, or proximity to death? Eur J Health Econ. 2018;19(6):831–42.
Derose KP, Hays RD, McCaffrey DF, Baker DW. Does physician gender affect satisfaction of men and women visiting the emergency department? J Gen Intern Med. 2001;16(4):218–26.
Kerssens JJ, Bensing JM, Andela MG. Patient preference for genders of health professionals. Soc Sci Med. 1997;44(10):1531–40.
Leach B, Gradison M, Morgan P, Everett C, Dill MJ, de Oliveira JS. Patient preference in primary care provider type. Healthcare. 2018;6(1):13–6.
Mengoni A, Seghieri C, Nuti S. Heterogeneity in Preferences for Primary Care Consultations: Results from a Discrete Choice Experiment. Int J Stat Med Res. 2013;2:67–75.
Gerard K, Lattimer V, Surridge H, George S, Turnbull J, Burgess A, et al. The introduction of integrated out-of-hours arrangements in England: a discrete choice experiment of public preferences for alternative models of care. Health Expect. 2006;9(1):60–9.
Philips H, Mahr D, Remmen R, Weverbergh M, De Graeve D, Van Royen P. Predicting the place of out-of-hours care–a market simulation based on discrete choice analysis. Health Policy. 2012;106(3):284–90.
Donabedian A. The quality of care. How can it be assessed? J Am Med Assoc. 1988;260(12):1743–8.
Zhang W, Ung COL, Lin G, Liu J, Li W, Hu H, et al. Factors Contributing to Patients' Preferences for Primary Health Care Institutions in China: A Qualitative Study. Front Public Health. 2020;8:414.
Norwood P, Correia I, Veiga P, Watson V. Patients’ experiences and preferences for primary care delivery: a focus group analysis. Primary Health Care Res Develop. 2019;20:e106.
Li Y, Li W, Wu Z, Yuang J, Wei Y, Huang C, et al. Findings About Patient Preferences for Medical Care Based on a Decision Tree Method Study Design for Influencing Factors. Inquiry. 2022;59:00469580221092831.
van den Broek-Altenburg EM, Atherly AJ. Patient preferences for provider choice: a discrete choice experiment. Am J Manag Care. 2020;26(7):e219–24.
García JA, Paterniti DA, Romano PS, Kravitz RL. Patient preferences for physician characteristics in university-based primary care clinics. Ethn Dis. 2003;13(2):259–67.
Jung HP, Baerveldt C, Olesen F, Grol R, Wensing M. Patient characteristics as predictors of primary health care preferences: a systematic literature analysis. Health Expect. 2003;6(2):160–81.
Cheraghi-Sohi S, Hole AR, Mead N, McDonald R, Whalley D, Bower P, et al. What patients want from primary care consultations: a discrete choice experiment to identify patients’ priorities. Ann Fam Med. 2008;6(2):107–15.
van der Pol M, Shiell A, Au F, Johnston D, Tough S. Convergent validity between a discrete choice experiment and a direct, open-ended method: comparison of preferred attribute levels and willingness to pay estimates. Soc Sci Med. 2008;67(12):2043–50.
van der Pol M, Shiell A, Au F, Johnston D, Tough S. Eliciting individual preferences for health care: a case study of perinatal care. Health Expect. 2010;13(1):4–12.
Vass C, Gray E, Payne K. Discrete choice experiments of pharmacy services: a systematic review. Int J Clin Pharm. 2016;38(3):620–30.
We would like to thank the Director General of Health Malaysia for his permission to publish this article. We would also like to acknowledge Dr. Azreena Che Abdullah for performing the initial searches and downloading the search hits from the bibliographic databases.
The study did not receive any funding.
Authors and affiliations.
Centre for Clinical Outcomes Research, Institute for Clinical Research, National Institutes of Health, Ministry of Health, Shah Alam, Malaysia
Audrey Huili Lim, Sock Wen Ng, Xin Rou Teh, Su Miin Ong & Sheamini Sivasampu
School of Life Course & Population Sciences, Faculty of Life Sciences & Medicine, King’s College London, London, UK
Ka Keat Lim
National Institute for Health Research (NIHR) Biomedical Research Centre, Guy’s and St Thomas’ NHS Foundation Trust and King’s College London, London, UK
S Sivasampu (SS) and KK Lim (KKL) conceptualised and designed the study. AH Lim (AHL) prepared the search strategies and performed the searches. AHL and SW Ng (SWN) screened the abstracts and full texts. XR Teh (XRT), AHL and SWN prepared and piloted the data extraction tables. AHL and SWN extracted and cross-checked the data. AHL, SWN and SM Ong (SMO) assessed the methodological quality of the included papers and discussed any ratings on which they could not agree. AHL and KKL cleaned the data and performed the analyses based on input from SS, SWN and SMO. AHL and KKL prepared the first draft of the manuscript. SS, XRT, SWN, SMO and KKL critically reviewed drafts of the manuscript for important intellectual content. SS sought and obtained the funding for open access publication. All authors approved the final draft and agreed to the final submission.
Correspondence to Ka Keat Lim.
Ethics approval and consent to participate.
Not applicable. This is a systematic literature review.
Not applicable.
The authors declare that they have no competing interests.
Publisher’s note.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Additional file 1:
Appendix 1. PRISMA checklist. Appendix 2. Search strategies. Appendix 3. List of inclusion and exclusion criteria. Appendix 4. Detailed characteristics of included studies (include quality rating for each paper). Appendix 5. Methodological quality ratings of included studies, based on ISPOR Task Force for Conjoint Analysis checklist. Appendix 6. Number of studies that examined attributes within various levels, dimensions, and features of primary care according to the types of visits. Appendix 7. Full list of attributes according to evidence levels, overall and by types of visits (main analyses). Appendix 8. Full list of factors affecting preference heterogeneity according to evidence levels, overall and by types of visits (main analyses).
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
Cite this article.
Lim, A.H., Ng, S.W., Teh, X.R. et al. Conjoint analyses of patients’ preferences for primary care: a systematic review. BMC Prim. Care 23 , 234 (2022). https://doi.org/10.1186/s12875-022-01822-8
Received : 07 March 2022
Accepted : 09 August 2022
Published : 09 September 2022
DOI : https://doi.org/10.1186/s12875-022-01822-8
ISSN: 2731-4553
Background: The use of conjoint analysis (CA) to elicit patients' preferences for osteoarthritis (OA) treatment has the potential to contribute to tailoring treatments and enhancing patients' compliance and adherence. This review's main aim was to identify and summarise the evidence that used conjoint analysis techniques to quantify patient preferences for OA treatments.
Methods: A comprehensive search strategy was conducted using electronic databases and hand reference checks. Databases were searched from their inception until 10th June 2019. All OA and CA related terms were used to conduct the search. The authors reviewed the papers and used the International Society of Pharmacoeconomics and Outcomes Research (ISPOR) checklist to assess the quality of the included studies.
Results: The search identified 534 records. Sixteen records were selected for full-text review and quality assessment and all were included in the narrative data synthesis. All included studies suggested that the severity of symptoms influenced the patients' preference for OA treatment. All included studies recognised CA as a useful method to investigate patients' preferences concerning OA treatment.
Conclusion: Patients' preferences for OA treatment are driven by the severity of their symptoms and the desire to avoid treatment side effects. CA is a useful tool for investigating patients' preferences for OA treatment.
Keywords: conjoint analysis; osteoarthritis; patient preferences.
© 2021 Al-Omari et al.
The authors report no conflicts of interest in this work.
The PRISMA flowchart.
Purpose: Since the inception of the conjoint analysis technique in the year 1971, papers addressing the epistemological aspects of conjoint analysis are scant. Hence, this paper attempts to address the vacuum of qualitative discourse addressing the epistemological and methodological aspects of conjoint analysis including different issues, challenges, probable solutions, limitations and future direction of conjoint analysis in the recent decade.
Design/methodology/approach: For exploring the methodological and epistemological aspects of conjoint analysis, the seminal papers on conjoint analysis were reviewed. Moreover, the authors' experience for the state-of-art review was also taken into consideration.
Findings: The findings suggest that conjoint analysis that roots back since 1971 has not seen much exploration in Asian regions and is mainly used for new product development in the field of marketing or allied areas. Moreover, the reliability and validity of conjoint analysis is always a matter of concern for the researchers that hinders this technique's wider adaptability. Thus, the paper presents some probable solutions to address the focal issues useful for improved reliability and validity of the conjoint analysis technique.
Research limitations/implications: This paper attempts to familiarize the researchers with epistemological and methodological aspects of conjoint analysis with certain solutions to evolve beyond existing conjoint analysis dimensions in terms of improved validity, reliability, epistemological and methodological aspects of conjoint analysis (CA). Moreover, it acts as a call for research in different research domains, especially in the Asian continent.
Originality/value: There exist certain seminal research papers on epistemological aspects of conjoint analysis. However, there is a dearth of such attempt in the recent decade addressing the application issues of conjoint analysis incorporating the recent issues as well. Therefore, this paper is an attempt to usher the future researcher to understand the methodological aspects of conjoint analysis. It may prevent them from violating the basic assumptions and methodological threshold. This research technique is preferred equally by academicians and practitioners, thus making it imperative to have clarity beforehand for improved research rigor.
Purpose The purpose of this paper is to define a general and common construct in order to measure the level of difficulty companies experience when they implement continuous improvement (CI). Additionally, a rank of barriers is obtained together with a rank of companies. Design/methodology/approach In order to achieve the objective, first, a literature review is carried out to specify the domain of the construct; second, a sample of items is selected; third, a survey is carried out in companies that have already implemented CI initiatives, the results being thus limited to this population; fourth, measures are purified by analysing the reliability and validity of the measurements; and finally, results are obtained. Rasch measurement theory is used to provide a new perspective on a mature research topic. Findings It can be concluded that a new valid construct has been defined together with a rank of CI barriers, with lack of time being the main barrier. A rank of companies is also obtained, which is a first step in the development of future research studies. Practical implications Managers are provided with a better understanding of the barriers that can obstruct CI implementation. Thus, the rank of CI barriers guides managers through the most common and important obstacles so that they will be able to plan better CI strategies. In addition, the rank of companies allows each company to undertake a benchmarking exercise. Originality/value This work proposes a new way of analysing the difficulty in implementing CI as a continuum, rather than as independent barriers. From a theoretical point of view, it defines a new construct and offers a rank of CI barriers together with a rank of companies based on their level of difficulty when implementing CI initiatives. This is something new, as previous studies were mainly focussed on the items side.
From a practical point of view, this study offers the surveyed companies the opportunity to see how they are positioned with respect to the other companies. Moreover, this rank of companies is the foundation on which to develop further studies with a practical orientation in the future.
Purpose The purpose of this paper is to investigate the relationships between strategic orientations as well as the role they play in the performance of industrial firms. Design/methodology/approach The paper formulates some hypotheses from the literature review. These hypotheses are tested using structural equation modeling with data collected from 292 randomly selected firms operating in several industrial sectors in the Kingdom of Saudi Arabia. Findings The findings of this study showed the importance of these strategic orientations in enhancing the performance of Saudi industrial firms and emphasized the mediating role of entrepreneurial orientation in the relationships of market orientation and technology orientation to new product development performance and firm performance. Research limitations/implications The study discusses the findings and advances certain limitations and research and managerial implications for future research avenues. It proposes some recommendations to help Saudi firms to choose more than one orientation simultaneously and adopt an appropriate configuration of orientations. Future research has to consider the interplay between these strategic orientations and the impacts of environmental turbulence, in terms of market and technology turbulence, on the relationship between strategic orientations and performance. Practical implications The study suggests that managers of Saudi industrial firms should utilize a mix of aspects from several strategic orientations, such as market and technology, through entrepreneurial capabilities and resources that enhance higher levels of performance. Originality/value This study contributes to the literature on entrepreneurship and strategic management by showing the reliability of the scales used and confirming the factor structure.
It also contributes to business practices by showing the importance for Saudi firms to combine different strategic orientations and provide more attention to the interplay of these orientations in order to perform better in such a transitional context.
Purpose This paper explains how servitization disrupts long-established internal and external boundaries of product-focused manufacturers and investigates the root causes of servitization challenges. Design/methodology/approach The authors draw from the collective experiences of 20 senior executives from ten multinational manufacturers involved in servitization, using a multiple case study approach, and employ a codebook thematic analysis technique. Findings The authors develop an integrative framework based on the theoretical notions of power, competency and identity boundaries to offer insights into the root causes of various servitization-related challenges. Research limitations/implications Although the extant literature discusses servitization challenges, it does not examine the underlying root causes that create them in the first place. This study contributes to the extant research by establishing rational links between organisational boundaries (internal and external) and servitization challenges in the interest of building a coherent and systematically integrated body of theory that can be successfully applied and built upon by future research. Practical implications This study provides a foundation for managers to recognise, anticipate and systematically manage various boundary-related challenges triggered by servitization. Originality/value It is one of the first studies to employ the concept of organisational boundary to understand the challenges created by servitization and to account for both internal (between different functions of the same organisation) and external boundaries (between an organisation and its external stakeholders) to establish a holistic understanding of the impacts of servitization on manufacturers.
Purpose The purpose of this paper is to present a review of the foodservice and restaurant literature that has been published over the past 10 years in the top hospitality and tourism journals. This information will be used to identify the key trends and topics studied over the past decade, and help to identify the gaps that appear in the research to identify opportunities for advancing future research in the area of foodservice and restaurant management. Design/methodology/approach This paper takes the form of a critical review of the extant literature that has been done in the foodservice and restaurant industries. Literature from the past 10 years will be qualitatively assessed to determine trends and gaps in the research to help guide the direction for future research. Findings The findings show that the past 10 years have seen an increase in the number of and the quality of foodservice and restaurant management research articles. The topics have been diverse and the findings have explored the changing and evolving segments of the foodservice industry, restaurant operations, service quality in foodservice, restaurant finance, foodservice marketing, food safety and healthfulness and the increased role of technology in the industry. Research limitations/implications Given the number of research papers done over the past 10 years in the area of foodservice, it is possible that some research has been missed and that some specific topics within the breadth and depth of the foodservice industry could have lacked sufficient coverage in this one paper. The implications from this paper are that it can be used to inform academics and practitioners where there is room for more research, it could provide ideas for more in-depth discussion of a specific topic and it is a detailed start into assessing the research done of late. 
Originality/value This paper helps foodservice researchers in determining where past research has gone and gives future direction for meaningful research to be done in the foodservice area moving forward to inform academicians and practitioners in the industry.
Purpose The purpose of this paper is to outline the evolution of research on airport service quality and the measurement index of passenger satisfaction to explore opportunities for future research directions. Design/methodology/approach A systematic literature review was conducted on a final sample of 27 articles published during 2000–2020. The databases used in this study were Emerald, ScienceDirect and Harzing's Publish or Perish with API Key, and articles were selected using a set of inclusion/exclusion criteria for analysis and synthesis to meet the purpose of the paper. Findings Dimensions of measuring airport service quality (ASQ) are currently based on a process approach. There are eight dimensions of ASQ measurement practiced by the industry, which differ from the five dimensions generally used to measure service quality. There is still a theoretical and empirical gap, so one of the challenges in applying the ASQ measurement dimensions is bridging research with applications in the airport industry. In addition, research on airport service quality measurement is currently focused on passenger satisfaction. The integration of expectation-disconfirmation theory and service profit chain models can be used in service quality, passenger satisfaction and profitability. Research limitations/implications This paper seeks to contribute to and analyze the limited articles on service quality at airports and identify further research areas. Originality/value This paper tries to explain the development of research on the dimensions of measuring service quality at airports. The author identifies a gap in airport service quality measurement dimensions used by researchers and the industry. The author believes that this study can provide a comprehensive view on using airport service quality measurement dimensions for future research.
Purpose This paper aims to investigate the effect of product complexity and communication quality on inter-organizational cost management (IOCM) and open book accounting (OBA) practices in buyer–supplier relationships in Malaysian manufacturing firms. Design/methodology/approach A questionnaire survey was administrated to CFOs or accounting managers of Malaysian suppliers. Exploratory factor analysis and Structural Equation Modeling procedures were applied to test convergent and discriminant validity of the measurement model and examine the relationships among the latent constructs in the structural model. Findings The results suggest that IOCM and OBA scales show acceptable reliability and validity. The findings also report that both product complexity and communication quality have a positive effect on IOCM and OBA in buyer–supplier relationships. However, the results suggest that IOCM does not influence OBA practice. Research limitations/implications Although IOCM and OBA constructs exhibited satisfactory reliability and validity, future research is required to refine and further validate these constructs. The data were only collected from the supplier’s perspective. Thus, future research is invited to benefit from matched data from both suppliers and buyers to generate additional insights on IOCM and OBA. Practical implications This study may assist suppliers and buyers in relationships by suggesting that complex products require the adoption of IOCM and OBA practices to reduce information asymmetries and manage costs. Furthermore, emphasizing quality of communication may enhance the implementation of these practices. Originality/value Theoretically, this study contributes to the academic stream of management accounting and cost management as it enhances an understanding of contributions introduced in prior literature on IOCM and OBA. 
It uses a complementary approach of transaction cost theory (TCT) and social exchange theory (SET) to explain the research model. Methodologically, the study validated scales for measuring IOCM and OBA in a new environment.
Purpose This paper aims to provide a systematic review of literature on the demand for takāful (Islamic insurance) from articles published from January 2009 to June 2019. The review aims to synthesise and segment previously published research to identify the gaps and provide future research direction. Design/methodology/approach A systematic review of the literature was conducted. Past research was analysed, and content comparisons based on research focus, context and methodology were evaluated. Findings It was found that not much has been written and published on takāful demand in quality journals. The first two articles were published in 2009, but it was only in 2017 that coverage of the topic rapidly expanded. Although no article was found to have been published in 2018 on takāful demand, there was one published article on the topic in 2019. This paper also found that not much attention has been given to takāful demand from the corporate sector. Research limitations/implications The defined rule for document searching and selection excluded out-of-scope documents that might be relevant. Furthermore, as this paper concentrates exclusively on articles published in English journals, the possibility that other relevant works do appear elsewhere in a different language is not denied. Practical implications Factors determining takāful demand are provided, and general directions are discussed, which managers can use to develop market share further. Originality/value Such an extensive review of literature on takāful demand has not been done before. Other than revealing ambiguities, gaps and contradictions in the literature, this paper sketches an avenue for further research. It also provides information and guidance for other researchers wishing to embark on research on takāful demand.
Purpose Coopetition, namely, the interplay between cooperation and competition, has received a good deal of interest in the business-to-business marketing literature. Academics have operationalised the coopetition construct and have used these measures to test the antecedents and consequences of firms collaborating with their competitors. However, business-to-business marketing scholars have not developed and validated an agreed operationalisation that reflects the dimensionality of the coopetition construct. Thus, the purpose of this study is to develop and validate a multi-dimensional measure of coopetition for marketing scholars to use in future research. Design/methodology/approach To use a highly cooperative and highly competitive empirical context, sporting organisations in New Zealand were sampled, as the key informants within these entities engaged in different forms of coopetition. Checks were made to ensure that the sampled entities produced generalisable results. That is, it is anticipated that the results apply to other industries with firms engaging in similar business-to-business behaviours. Various sources of qualitative and quantitative data were acquired to develop and validate a multi-dimensional measure of coopetition (the COOP scale), which passed all major assessments of reliability and validity (including common method variance). Findings The results indicated that coopetition is a multi-dimensional construct, comprising three distinct dimensions. First, local-level coopetition is collaboration among competing entities within a close geographic proximity. Second, national-level coopetition is cooperation with rivals within the same country but across different geographic regions. Third, organisation-level coopetition is cooperation with competitors across different firms (including with indirect rivals), regardless of their geographic location and product markets served. 
Indeed, organisation-level coopetition extends to how companies engage in coopetition in domestic and international capacities, depending on the extent to which they compete in similar product markets in comparison to industry rivals. Also, multiple indicators were used to measure each facet of the coopetition construct after the scale purification stage. Originality/value Prior coopetition-based investigations have predominately been conceptual or qualitative in nature. The scarce number of existing scales have significant problems, such as not appreciating that coopetition is a multi-dimensional variable, as well as using single indicators. In spite of a recent call for research on the multiple levels of coopetition, there has not been an agreed measure of the construct that accounts for its multi-dimensionality. Hence, this investigation responds to such a call for research by developing and validating the COOP scale. Local-, national- and organisation-level coopetition are anticipated to be the main facets of the coopetition construct, which offer several avenues for future research.
Abstract Industry 4.0: Study of Aspects Classification and Future Research Direction. The term Industry 4.0 originates from the idea of a fourth industrial revolution, which offers many potential benefits. In order to realize Industry 4.0, academic involvement is required in the form of research. This article aims to examine the aspects and future direction of research related to Industry 4.0. A literature review of various definitions and framework models of Industry 4.0 was conducted to identify the aspects. Mapping and analysis of several publications were conducted to determine the future direction of research. Publications were sorted according to research method, aspect and type of industry. The results show that Industry 4.0 has fourteen aspects. In terms of research methods, most of the research is done through descriptive and conceptual methods.
Business and technology aspects are the focus of researchers, and most of the research is done in the manufacturing industry. In terms of volume, Industry 4.0 research has experienced a significant upward trend. This article is expected to illustrate the concept, development and research potential of Industry 4.0. Keywords: Industry 4.0; Literature Review; Research Trend
Conjoint analysis has proven to be a useful method for decomposing and estimating consumer preference for each attribute of a product or service through evaluations of sets of different versions of the product with varying attribute levels. The predictive value of conjoint analysis is confounded, however, by increasing market uncertainties and changes in user expectations. We explore the use of scenario-based conjoint analysis in order to complement qualitative design research methods in the early stages of concept development. The proposed methodology focuses on quantitatively assessing user experiences rather than product features to create experience-driven products, especially in cases in which the technology is advancing beyond consumer familiarity. Rather than replace conventional conjoint analysis for feature selection near the end of the product development cycle, our method broadens the scope of conjoint analysis so that this powerful measurement technique can be applied in the early stage of design to complement qualitative research and drive strategic directions for developing product experiences. We illustrate on a new product development case study of a flexible wearable for parent-child communication and tracking as an example of scenario-based conjoint analysis implementation. The results, limitations, and findings are discussed in more depth followed by future research directions.
An update on current practice in the published literature between 2005 and 2008
Marshall, D., Bridges, J. F., Hauber, A., Cameron, R., Donnalley, L., Fyie, K., & Johnson, F. (2010). Conjoint analysis applications in health - How are studies being designed and reported? An update on current practice in the published literature between 2005 and 2008. The Patient, 3(4), 249-256. https://doi.org/10.2165/11539650-000000000-00000
Despite the increased popularity of conjoint analysis in health outcomes research, little is known about what specific methods are being used for the design and reporting of these studies. This variation in method type and reporting quality sometimes makes it difficult to assess substantive findings. This review identifies and describes recent applications of conjoint analysis based on a systematic review of conjoint analysis in the health literature. We focus on significant unanswered questions for which there is neither compelling empirical evidence nor agreement among researchers. We searched multiple electronic databases to identify English-language articles of conjoint analysis applications in human health studies published since 2005 through to July 2008. Two independent reviewers completed the detailed data extraction, including descriptive information, methodological details on survey type, experimental design, survey format, attributes and levels, sample size, number of conjoint scenarios per respondent, and analysis methods. Review articles and methods studies were excluded. The detailed extraction form was piloted to identify key elements to be included in the database using a standardized taxonomy. We identified 79 conjoint analysis articles that met the inclusion criteria. The number of applied studies increased substantially over time in a broad range of clinical applications, cancer being the most frequent. Most used a discrete-choice survey format (71%), with the number of attributes ranging from 3 to 16. Most surveys included 6 attributes, and 73% presented 7–15 scenarios to each respondent. Sample size varied substantially (minimum = 13, maximum = 1258), with most studies (38%) including between 100 and 300 respondents. Cost was included as an attribute to estimate willingness to pay in approximately 40% of the articles across all years.
Conjoint analysis in health has expanded to include a broad range of applications and methodological approaches. Although we found substantial variation in methods, terminology, and presentation of findings, our observations on sample size, the number of attributes, and number of scenarios presented to respondents should be helpful in guiding researchers when planning a new conjoint analysis study in health.
Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 851)
This article investigates apparel preferences among young female consumers in a developing country, with a specific focus on sustainable fashion. To do so, we employ a conjoint analysis approach in which the material used serves as the attribute signaling sustainability. In addition to material preferences, we investigate preferences for different levels of four further attributes: design, country of origin, uniqueness, and price. We also use cluster analysis to identify two market segments. By investigating consumer apparel preferences in a developing country setting, accounting for sustainability preferences by including material as a product attribute, and providing clear guidelines on product characteristics worth considering when designing new clothing products or redesigning existing ones, this research contributes to both marketing theory and practice.
The World Bank: A New Era In Development. Washington: The World Bank (2023). https://openknowledge.worldbank.org/entities/publication/4683db9c-31bb-4bb0-9e00-ed6dc6aeae86
Ozdamar Ertekin, Z., Sevil Oflac, B., Serbetcioglu, C.: Fashion consumption during economic crisis: emerging practices and feelings of consumers. J. Glob. Fash. Market. 11 (3), 270–288 (2020). https://doi.org/10.1080/20932685.2020.1754269
Article Google Scholar
Vázquez-Martínez, U.J., Morales-Mediano, J., Leal-Rodríguez, A.L.: The impact of the COVID-19 crisis on consumer purchasing motivation and behavior. Eur. Res. Manag. Bus. Econ. 27 (3), 100166 (2021). https://doi.org/10.1016/j.iedeen.2021.100166
Voinea, L., Filip, A.: Analyzing the main changes in new consumer buying behavior during economic crisis. Int. J. Econ. Practices Theor. 1 (1), 14–19 (2011)
Google Scholar
Čaušević, F.: Principles of Macroeconomics (Volume II), School of Economics and Business, University of Sarajevo. (2012). ISBN 978-9958-25-071-2
Jacobs, K., Hörisch, J.: The importance of product lifetime labelling for purchase decisions: strategic implications for corporate sustainability based on a conjoint analysis in Germany. Bus. Strateg. Environ. 31 (4), 1275–1291 (2022). https://doi.org/10.1002/bse.2954
Jafar, H., Muda, I., Zainal, A., Yasin, W.: Profit maximization theory, survival-based theory and contingency theory: a review on several underlying research theories of corporate turnaround. Jurnal Ekonom 13 (4) (2010)
Samuelson, P., Nordhaus, W.: Economics. 19th Edition. Zagreb: Mate. (2011)
Amiruddin, A. H.: What are Portuguese consumers’ preferences towards sustainable fashion?-a conjoint analysis (Doctoral dissertation). 22–34 (2022). https://run.unl.pt/bitstream/10362/143245/1/wp.pdf
Brand, B.M., Rausch, T.M.: Examining sustainability surcharges for outdoor apparel using adaptive choice-based conjoint analysis. J. Clean. Prod. 289 , 125654 (2021). https://doi.org/10.1016/j.jclepro.2020.125654
Mendes, V.L.N.: What are portuguese consumers’ preferences towards sustainable fashion?-a study about the sustainable fashion personas (Doctoral dissertation). 27–50 (2022). https://run.unl.pt/bitstream/10362/145410/1/2021-22_spring_44850_vera-mendes.pdf
Spindler, V., Schunk, H., Könecke, T.: Sustainable consumption in sports fashion–German runners’ preference and willingness to pay for more sustainable sports apparel. Sustain. Prod. Consumption (2023). https://doi.org/10.1016/j.spc.2023.05.003
Olson, E.L.: Sustainable’ marketing mixes and the paradoxical consequences of good intentions. J. Bus. Res. 150 , 389–398 (2022). https://doi.org/10.1016/j.jbusres.2022.05.063
Chaturvedi, P., Kulshreshtha, K., Tripathi, V.: Investigating the determinants of behavioral intentions of generation Z for recycled clothing: an evidence from a developing economy. Young Consumers 21 (4), 403–417 (2020). https://doi.org/10.1108/YC-03-2020-1110
Chang, W.Y., Taecharungroj, V., Kapasuwan, S.: Sustainable luxury consumers’ preferences and segments: conjoint and cluster analyses. Sustainability 14 (15), 9551 (2022). https://doi.org/10.3390/su14159551
Chung, T., Lee, K.Y., Kim, U.: The impact of sustainable management strategies of sports apparel brands on brand reliability and purchase intention through single person media during COVID-19 pandemic: a path analysis. Sustainability 14 (12), 7076 (2022). https://doi.org/10.3390/su14127076
Thorisdottir, T.S., Johannsdottir, L.: Sustainability within fashion business models: a systematic literature review. Sustainability 11 (8), 2233 (2019). https://doi.org/10.3390/su11082233
Kanchanapibul, M., Lacka, E., Wang, X., Chan, H.K.: An empirical investigation of green purchase behaviour among the young generation. J. Clean. Prod. 66 , 528–536 (2014). https://doi.org/10.1016/j.jclepro.2013.10.062
Herbst, F., Burger, C.: Attributes used by young consumers when assessing a fashion product: a conjoint analysis approach. J. Consum. Sci. 30 , 40–45 (2002). https://doi.org/10.4314/jfecs.v30i1.52825
Jin, B., Yong Park, J., Sang Ryu, J.: Comparison of Chinese and Indian consumers’ evaluative criteria when selecting denim jeans: a conjoint analysis. J. Fashion Mark. Manage. Int. J. 14 (1), 180–194 (2010). https://doi.org/10.1108/13612021011025492
Birtwistle, G., Clarke, I., Freathy, P.: Customer decision making in fashion retailing: a segmentation analysis. Int. J. Retail Distrib. Manage. 26 (4), 147–154 (1998). https://doi.org/10.1108/09590559810214912
Brand, B.M., Rausch, T.M., Brandel, J.: The importance of sustainability aspects when purchasing online: comparing generation X and generation Z. Sustainability 14 (9), 5689 (2022). https://doi.org/10.3390/su14095689
Eckman, M., Damhorst, M.L., Kadolph, S.J.: Toward a model of the in-store purchase decision process: consumer use of criteria for evaluating women’s apparel. Cloth. Text. Res. J. 8 (2), 13–22 (1990). https://doi.org/10.1177/0887302X9000800202
Dickerson, K.G.: Relative importance of country of origin as an attribute in apparel choices. J. Consum. Stud. Home Econ. 11 (4), 333–343 (1987). https://doi.org/10.1111/j.1470-6431.1987.tb00144.x
Bertoli, G.: International marketing and the country of origin effect: the global impact of’ made in Italy. Edward Elgar Publishing. (2013). ISBN 978 1 78195 560 4
Bhaduri, G., Stanforth, N.: Evaluation of absolute luxury: effect of cues, consumers’ need for uniqueness, product involvement and product knowledge on expected price. J. Fashion Mark. Manage. Int. J. 20 (4), 471–486 (2016). https://www.emerald.com/insight/content/doi/ https://doi.org/10.1108/JFMM-12-2015-0095/full/html
Adnan, A., Ahmad, A., Khan, M.N.: Examining the role of consumer lifestyles on ecological behavior among young Indian consumers. Young Consumers 18 (4), 348–377 (2017). https://doi.org/10.1108/YC-05-2017-00699
Gurtner, S., Soyez, K: How to catch the generation Y.: identifying consumers of ecological innovations among youngsters. Technol. Forecast. Soc. Chang. 106 , 101–107 (2016). https://doi.org/10.1016/j.techfore.2016.02.015
Yadav, R., Pathak, G.S.: Young consumers’ intention towards buying green products in a developing nation: extending the theory of planned behavior. J. Clean. Prod. 135 , 732–739 (2016). https://doi.org/10.1016/j.jclepro.2016.06.120
Severo, E.A., Guimarães, J.C.F.D., Dellarmelin, M.L., Ribeiro, R.P.: The influence of social networks on environmental awareness and the social responsibility of generations. BBR. Braz. Bus. Rev. 16 (5), 500–518 (2019). https://doi.org/10.15728/bbr.2019.16.5.5
Dabija, D.C.: Enhancing green loyalty towards apparel retail stores: a cross-generational analysis on an emerging market. J. Open Innovation: Technol. Market Complex. 4 (1), 8 (2018). https://doi.org/10.1186/s40852-018-0090-7
Tait, P., Saunders, C., Dalziel, P., Rutherford, P., Driver, T., Guenther, M.: Comparing generational preferences for individual components of sustainability schemes in the Californian wine market. Appl. Econ. Lett. 27 (13), 1091–1095 (2020). https://doi.org/10.1080/13504851.2019.1661952
Wang, L., Xu, Y., Lee, H., Li, A.: Preferred product attributes for sustainable outdoor apparel: a conjoint analysis approach. Sustain. Prod. Consumption 29 , 657–671 (2022). https://doi.org/10.1016/j.spc.2021.11.011
Fuchs, M., Hovemann, G.: Consumer preferences for circular outdoor sporting goods: an adaptive choice-based conjoint analysis among residents of European outdoor markets. Cleaner Eng. Technol. 11 , 100556 (2022). https://doi.org/10.1016/j.clet.2022.100556
Achabou, M.A., Dekhili, S., Codini, A.P.: Consumer preferences towards animal-friendly fashion products: an application to the Italian market. J. Consum. Mark. 37 (6), 661–673 (2020). https://doi.org/10.1108/JCM-10-2018-2908
Anh, P.T.C., Huong, L.M., Oanh, V.T.K.: Generation Z willingness to pay for sustainable apparel: the influence of labelling for origin and eco-friendly material. J. Int. Econ. Manage. 20 (3), 42–59 (2020). https://doi.org/10.38203/jiem.020.3.0015
Hamin, H., Baumann, C., L. Tung, R.: Attenuating double jeopardy of negative country of origin effects and latecomer brand: an application study of ethnocentrism in emerging markets. Asia Pac. J. Mark. Logistics 26 (1), 54–77 (2014). https://doi.org/10.1108/APJML-07-2013-0090
Witek-Hajduk, M. K., Grudecka, A.: Does the developed-country brand name still matter? Consumers’ purchase intentions and ethnocentrism and materialism as moderators. J. Prod. Brand Manage. 31 (6), 50–71 (2022). https://doi.org/10.15678/EBER.2021.090110
Cutura, M.: The impacts of ethnocentrism on consumers’ evaluation processes and willingness to buy domestic vs. imported goods in the case of Bosnia and Herzegovina. South East Eur. J. Econ. Bus. 54–63 (2006). https://ssrn.com/abstract=1803993
Brkic, N., Corbo, M., Berberovic, D.: Ethnocentrism and animosity in consumer behavior in Bosnia and Herzegovina and implication for companies. Econ. Rev. J. Econ. Bus. 9 (1), 45–61 (2011). https://www.econstor.eu/bitstream/10419/193797/1/econ-review-v09-i1-p045-061.pdf
Jegethesan, K., Sneddon, J.N., Soutar, G.N.: Young Australian consumers’ preferences for fashion apparel attributes. J. Fashion Mark. Manage. Int. J. 16 (3), 275–289 (2012). https://www.emerald.com/insight/content/doi/ https://doi.org/10.1108/13612021211246044/full/html
Dabija, D.C., Băbuț, R.: Enhancing apparel store patronage through retailers’ attributes and sustainability. a generational approach. Sustainability 11 (17), 4532 (2019). https://doi.org/10.3390/su11174532
Johnstone, L., Lindh, C.: The sustainability-age dilemma: a theory of (un) planned behaviour via influencers. J. Consum. Behav. 17 (1), e127–e139 (2018). https://doi.org/10.1002/cb.1693
Zhang, B., Zhang, Y., Zhou, P.: Consumer attitude towards sustainability of fast fashion products in the UK. Sustainability 13 (4), 1646 (2021). https://doi.org/10.3390/su13041646
Home - Eurostat. (n.d.). https://ec.europa.eu/eurostat
Roseira, C., Teixeira, S., Barbosa, B., Macedo, R.: How collectivism affects organic food purchase intention and behavior: a study with norwegian and portuguese young consumers. Sustainability 14 (12), 7361 (2022). https://doi.org/10.3390/su14127361
Markert, J.: Demographics of age: Generational and cohort confusion. J. Curr. Issues Res. Advertising 26 (2), 11–25 (2004). https://doi.org/10.1080/10641734.2004.10505161
Kapferer, J.N., Michaut-Denizeau, A.: Are millennials really more sensitive to sustainable luxury? A cross-generational international comparison of sustainability consciousness when buying luxury. J. Brand Manag. 27 (1), 35–47 (2020). https://doi.org/10.1057/s41262-019-00165-7
Orme, B.: Getting started with conjoint analysis: strategies for product design and pricing research. Madison (WI): Research Publishers LLC. (2006) ISBN: 9780972729772
Diamantopoulos, A., Schlegelmilch, B.B., Halkias, G.: Taking the Fear Out of Data Analysis: Completely Revised, Significantly Extended and Still Fun. Edward Elgar Publishing. (2023). ISBN: 978 1 80392 985 9
Marshall, D., et al.: Conjoint analysis applications in health—how are studies being designed and reported? An update on current practice in the published literature between 2005 and 2008. Patient: Patient-Centered Outcomes Res. 3 , 249–256 (2010). https://doi.org/10.2165/11539650-000000000-00000
Peduzzi, P., Concato, J., Kemper, E., Holford, T.R., Feinstein, A.R.: A simulation study of the number of events per variable in logistic regression analysis. J. Clin. Epidemiol. 49 (12), 1373–1379 (1996). https://doi.org/10.1016/s0895-4356(96)00236-3
Curtis, M. J., et al.: Experimental design and analysis and their reporting: new guidance for publication in BJP. Br. J. Pharmacol. 172 (14), 3461 (2015). https://doi.org/10.1111/bph.12856
Vukic, M., Kuzmanovic, M., Kostic Stankovic, M.: Understanding the heterogeneity of Generation Y’s preferences for travelling: a conjoint analysis approach. Int. J. Tour. Res. 17 (5), 482–491 (2015). https://doi.org/10.1002/jtr.2015
Bhardwaj, V., Fairhurst, A.: Fast fashion: response to changes in the fashion industry. The Int. Rev. Retail, distrib. Consum. Res. 20 (1), 165–173 (2010). https://doi.org/10.1080/09593960903498300
Coskun, M., Burnaz, S.: Exploring the literal effect of COO for a new brand: a conjoint analysis approach. J. Int. Consum. Mark. 28 (2), 106–120 (2016). https://doi.org/10.1080/08961530.2015.1135677
Central Bank of Bosnia and Herzegovina: Annual report 2022. Sarajevo: Central Bank of Bosnia and Herzegovina (2023). https://www.cbbh.ba/Content/Archive/36
Agency for Statistics of Bosnia and Herzegovina: International trade in goods of BiH, 2022. Sarajevo: Agency for Statistics of Bosnia and Herzegovina (2023). https://bhas.gov.ba/data/Publikacije/Bilteni/2023/ETR_00_2022_TB_1_BS.pdf
Boufous, S., Hudson, D., Carpio, C.: Consumer willingness to pay for production attributes of cotton apparel. Agribusiness 39 (4), 1026–1048 (2023). https://doi.org/10.1002/agr.21802
Cleofas, M.A., Prasetyo, Y.T., Ong, A.K.S., Persada, S.F.: Brand or clothing function? Consumer preference analysis on clothing apparel attributes and design: a conjoint analysis approach 28 (2), 1–15 (2022). https://doi.org/10.18178/wcse.2022.04.146
Kondort, G., Pelau, C., Gati, M., Ciofu, I.: The role of fashion influencers in shaping consumers’ buying decisions and trends. In: Proceedings of the International Conference on Business Excellence vol. 17 (1), 1009–1018 (2023). https://doi.org/10.2478/picbe-2023-0092
Jung, S., Jin, B.: Sustainable development of slow fashion businesses: customer value approach. Sustainability 8 (6), 540 (2016). https://doi.org/10.3390/su8060540
Authors and affiliations.
School of Economics and Business Sarajevo, University of Sarajevo, Trg Oslobođenja-Alija Izetbegović 1, 71000, Sarajevo, Bosnia and Herzegovina
Esmeralda Marić & Lamija Biber
Correspondence to Esmeralda Marić .
Editors and affiliations.
University of Sarajevo-Faculty of Civil Engineering, Sarajevo, Bosnia and Herzegovina
Naida Ademović
CANDARC, LLC, Illinois, IL, USA
Tijana Tufek-Memišević
University of Sarajevo-School of Economics and Business, Sarajevo, Bosnia and Herzegovina
Maja Arslanagić-Kalajdžić
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper.
Marić, E., Biber, L. (2024). Analysis of Consumer Apparel Preferences with Emphasis on Sustainability in a Developing Country Setting: Conjoint Analysis Approach. In: Ademović, N., Tufek-Memišević, T., Arslanagić-Kalajdžić, M. (eds) Interdisciplinary Advances in Sustainable Development III. BHAAAS 2024. Lecture Notes in Networks and Systems, vol 851. Springer, Cham. https://doi.org/10.1007/978-3-031-71076-6_12
DOI : https://doi.org/10.1007/978-3-031-71076-6_12
Published : 03 September 2024
Publisher Name : Springer, Cham
Print ISBN : 978-3-031-71075-9
Online ISBN : 978-3-031-71076-6
eBook Packages : Engineering Engineering (R0)
For a business to run effectively, its leadership needs a firm understanding of the value its products or services bring to consumers. This understanding allows for a more informed strategy across the board—from long-term planning to pricing and sales.
In today’s business environment, most products and services include multiple features and functions by default. So, how do businesses go about learning which ones their customers value most? Is it possible to assign a specific value to each feature a product offers?
This is where conjoint analysis becomes an essential tool.
Here’s an overview of conjoint analysis, why it’s important, and steps you can take to analyze your products or services.
Conjoint analysis is a form of statistical analysis that firms use in market research to understand how customers value different components or features of their products or services. It’s based on the principle that any product can be broken down into a set of attributes that ultimately impact users’ perceived value of an item or service.
Conjoint analysis is typically conducted via a specialized survey that asks consumers to rank the importance of the specific features in question. Analyzing the results allows the firm to then assign a value to each one.
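As a rough illustration of how survey answers can turn into a value for each feature, the sketch below estimates "part-worths" from ratings of four product profiles. Everything in it is an assumption made for illustration: the two attributes, the ratings, and the simple rating-based design. Real studies often use choice-based designs and dedicated software.

```python
from statistics import mean

# Hypothetical full-factorial rating data (all values invented for illustration).
# Each row: (large_screen, premium_price, rating on a 1-10 scale).
responses = [
    (0, 0, 5.0),
    (0, 1, 3.0),
    (1, 0, 9.0),
    (1, 1, 7.0),
]

def part_worth(rows, attr_index):
    """Mean rating with the attribute present minus mean rating with it absent.

    For a balanced full-factorial design this equals the least-squares
    main-effect estimate for a 0/1 dummy-coded attribute.
    """
    present = [r[-1] for r in rows if r[attr_index] == 1]
    absent = [r[-1] for r in rows if r[attr_index] == 0]
    return mean(present) - mean(absent)

part_worths = {
    "large_screen": part_worth(responses, 0),
    "premium_price": part_worth(responses, 1),
}

# Relative importance: each attribute's utility range as a share of the total.
total_range = sum(abs(v) for v in part_worths.values())
importance = {k: abs(v) / total_range for k, v in part_worths.items()}

print(part_worths)
print(importance)
```

In this toy data the larger screen raises ratings and the premium price lowers them, and the importance shares show how much each attribute drives the overall rating.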
Conjoint analysis can take various forms.
The type of conjoint analysis a company uses is determined by the goals driving its analysis (i.e., what does it hope to learn?) and, potentially, the type of product or service being evaluated. It’s possible to combine multiple conjoint analysis types into “hybrid models” to take advantage of the benefits of each.
The insights a company gleans from conjoint analysis of its product features can be leveraged in several ways. Most often, conjoint analysis impacts pricing strategy, sales and marketing efforts, and research and development plans.
Conjoint analysis works by asking users to directly compare different features to determine how they value each one. When a company understands how its customers value its products or services’ features, it can use the information to develop its pricing strategy.
For example, a software company hoping to take advantage of network effects to scale its business might pursue a “freemium” model wherein its users access its product at no charge. If the company determines through conjoint analysis that its users highly value one feature above the others, it might choose to place that feature behind a paywall.
As such, conjoint analysis is an excellent means of understanding what product attributes determine a customer’s willingness to pay . It’s a method of learning what features a customer is willing to pay for and whether they’d be willing to pay more.
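One common way to put a money figure on a feature, sketched here under stated assumptions, is to divide the feature's estimated utility by the utility lost per unit of price. The function name and both numbers below are hypothetical; they are not taken from any real study.

```python
def willingness_to_pay(feature_utility: float, price_coefficient: float) -> float:
    """Convert a feature's part-worth utility into money terms.

    price_coefficient is the change in utility per unit of currency and is
    expected to be negative (higher prices lower utility).
    """
    if price_coefficient >= 0:
        raise ValueError("price coefficient must be negative")
    return feature_utility / -price_coefficient

# Hypothetical estimates: the feature adds 2.0 utility units and
# each extra dollar of price removes 0.25 utility units.
wtp = willingness_to_pay(feature_utility=2.0, price_coefficient=-0.25)
print(f"Estimated willingness to pay: ${wtp:.2f}")
```

Under these made-up numbers, a customer would trade about eight dollars for the feature, which is the kind of figure a pricing team could weigh against the cost of offering it.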
Conjoint analysis can inform more than just a company’s pricing strategy; it can also inform how it markets and sells its offerings. When a company knows which features its customers value most, it can lean into them in its advertisements, marketing copy, and promotions.
On the other hand, a company may find that its customers aren’t uniform in assigning value to different features. In such a case, conjoint analysis can be a powerful means of segmenting customers based on their interests and how they value features—allowing for more targeted communication.
For example, an online store selling chocolate may find through conjoint analysis that its customers primarily value two features: quality and the fact that a portion of each sale goes toward funding environmental sustainability efforts. The company can then use that information to send different messaging and appeal to what each segment values.
Conjoint analysis can also inform a company’s research and development pipeline. The insights gleaned can help determine which new features are added to its products or services, along with whether there’s enough market demand for an entirely new product.
For example, consider a smartphone manufacturer that conducts a conjoint analysis and discovers its customers value larger screens over all other features. With this information, the company might logically conclude that the best use of its product development budget and resources would be to develop larger screens. If, however, future analyses reveal that customer value has shifted to a different feature—for example, audio quality—the company may use that information to pivot its product development plans.
Additionally, a company may use conjoint analysis to narrow down its product or service’s features. Returning to the smartphone example: There’s only so much space within a smartphone for components. How a phone manufacturer’s customers value different features can inform which components make it into the end product—and which are cut.
One example is Apple’s 2016 decision to remove the headphone jack from the iPhone to free up space for other components. It’s reasonable to assume this decision was reached after analysis revealed that customers valued other features above a headphone jack.
Conjoint analysis is an incredibly useful tool you can leverage at your company. By using it to understand which product or service features your customers value over others, you can make more informed decisions about pricing, product development, and sales and marketing activities.
Humanities and Social Sciences Communications, volume 11, Article number: 1130 (2024)
Financial fraud negatively impacts organizational administrative processes, particularly affecting owners and/or investors seeking to maximize their profits. Addressing this issue, this study presents a literature review on financial fraud detection through machine learning techniques. The PRISMA and Kitchenham methods were applied, and 104 articles published between 2012 and 2023 were examined. These articles were selected based on predefined inclusion and exclusion criteria and were obtained from databases such as Scopus, IEEE Xplore, Taylor & Francis, SAGE, and ScienceDirect. These selected articles, along with the contributions of authors, sources, countries, trends, and datasets used in the experiments, were used to detect financial fraud and its existing types. Machine learning models and metrics were used to assess performance. The analysis indicated a trend toward using real datasets. Notably, credit card fraud detection models are the most widely used for detecting credit card loan fraud. The information obtained by different authors was acquired from the stock exchanges of China, Canada, the United States, Taiwan, and Tehran, among other countries. Furthermore, the usage of synthetic data has been low (less than 7% of the employed datasets). Among the leading contributors to the studies, China, India, Saudi Arabia, and Canada remain prominent, whereas Latin American countries have few related publications.
Introduction.
Financial fraud represents a highly significant problem, resulting in grave consequences across business sectors and impacting people’s daily lives (Singh et al., 2022). Its occurrence leads to reduced confidence in the economy, resulting in destabilization and direct economic repercussions for stakeholders (Reurink, 2018). Abdallah et al. (2016) define fraud as a criminal act aimed at obtaining money unlawfully. There are diverse types of fraud, such as asset misappropriation, expense reimbursement, and financial statement manipulation. Scholars have classified fraud into three categories: banking, corporate, and insurance (Ali et al., 2022; Nicholls et al., 2021; West and Bhattacharya, 2016).
The scale of the problem is evident in the 2022 PricewaterhouseCoopers survey report, which found that 56% of companies globally have fallen victim to some form of fraud. In Latin America, 32% of companies have experienced fraud (PricewaterhouseCoopers, 2022). These alarming statistics align with the findings from Klynveld Peat Marwick Goerdeler (KPMG), indicating that 83% of the surveyed executives reported being targeted by cyber-attacks in the past 12 months. Furthermore, 71% had encountered some type of internal or external fraud (KPMG, 2022). These survey results reveal the higher risks of financial fraud faced by companies in Latin America, the United States, and Canada. In this context, traditional approaches and techniques, as well as manual methods, have lost relevance because they cannot effectively address the complexity and scale of the information involved in detecting financial fraud.
Despite the interest of organizations in detecting financial fraud using machine learning (ML), current knowledge in this field remains limited. After an initial research phase, the specialized literature shows that most researchers have directed their efforts toward the analysis of credit card fraud using a supervised approach (Femila Roseline et al., 2022; Madhurya et al., 2022; Plakandaras et al., 2022; Saragih et al., 2019). In the studies of Ali et al. (2022), Hilal et al. (2022), and Ramírez-Alpízar et al. (2020), ML techniques employing the supervised approach were found to be the most widely used for detecting financial fraud, compared to the unsupervised, deep learning, reinforcement, and semi-supervised approaches, among others. Moreover, scholars such as Whiting et al. (2012) have compared the performance of data mining models for detecting fraudulent financial statements using data from quarterly and annual financial indexes of public companies from the COMPUSTAT database.
Reurink (2018) has analyzed financial fraud resulting from false financial reports, scams, and misleading financial sales in the context of the financial market. Just like Wadhwa et al. (2020), he presented a wide variety of data mining methods, approaches, and techniques used in fraud detection, in addition to research addressing online banking fraud (Zhou et al., 2018; Moreira et al., 2022; Srokosz et al., 2023) and financial statement fraud (S. Chen, 2016; Ramírez-Alpízar et al., 2020). The abovementioned research works show that the accuracy of ML techniques in developing models for detecting financial fraud has increased (Al-Hashedi and Magalingam, 2021).
The effectiveness of financial fraud detection and prevention depends on selecting appropriate ML techniques that identify new threats and minimize false alarms, in response to the negative impact of financial fraud on organizations (Ahmed et al., 2016). The use of ML techniques has made it possible to identify patterns and anomalies in large financial data sets. However, developments in detection tools, inaccurate classification, detection methods, privacy, computer performance, and disproportionate misclassification costs continue to hinder the accurate and timely detection of financial fraud (Dantas et al., 2022; Mongwe and Malan, 2020; Nicholls et al., 2021; West and Bhattacharya, 2016).
Recently, several studies have reviewed financial statement fraud detection methods in data mining and ML (Gupta and Mehta, 2021 ; Shahana et al., 2023 ); however, the present study is different from these past works in the area. These authors established the types of financial fraud and the different data mining techniques and approaches used to detect financial statement fraud. In contrast, our study explains the trends in the use of ML approaches and techniques to detect financial fraud, and it presents the more frequently used datasets in the literature for conducting experiments.
Fraud detection mechanisms using machine learning techniques help detect unusual transactions and prevent cybercrime (Polak et al., 2020). Although each of these approaches uses different methods in its experiments, the systematic literature review (SLR) shows that each algorithm is evaluated with performance metrics that measure how accurately it predicts whether a financial transaction is fraudulent. Such metrics include Accuracy, Precision, F1 Score, and Recall (also called Sensitivity), among others.
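To make these metrics concrete, the following minimal sketch computes them from the four cells of a binary confusion matrix. The function name and the counts are hypothetical and are not drawn from any of the reviewed studies; the sketch also assumes that none of the denominators is zero.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute common fraud-detection metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)      # share of flagged transactions that are fraud
    recall = tp / (tp + fn)         # share of fraud cases that were flagged (sensitivity)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical counts: 1,000 transactions, 90 of them fraudulent.
metrics = classification_metrics(tp=80, fp=20, fn=10, tn=890)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With heavily imbalanced data such as this, accuracy stays high even when a noticeable share of fraud is missed, which is one reason precision, recall, and F1 are usually reported alongside it.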
The research presented uses a rigorous and well-structured methodology to expand current knowledge on financial fraud detection using machine learning (ML) techniques. Through the use of a systematic literature review that follows adaptations of PRISMA guidelines and Kitchenham’s methodology, the study ensures a carefully planned and transparent review process. The sources of information consulted include research articles published in reputable academic databases such as Scopus, IEEE Xplore, Taylor & Francis, SAGE, and ScienceDirect, ensuring that the review covers the most relevant and quality scientific literature in the field of financial fraud and machine learning. Moreover, the study includes a bibliometric analysis using VOSviewer software, which allows identifying trends and patterns within the literature both quantitatively and visually. Based on the 104 articles reviewed, which cover the period 2012–2023, we manage to describe the types of fraud, the models applied, the ML techniques used, the datasets employed, and the metrics of performance reported. These contribute to filling the existing gaps in the literature by providing a comprehensive and up-to-date synthesis of the evidence on the use of machine learning techniques for financial fraud detection, thus laying the groundwork for future research and practical applications in this field.
Our responses to the initial research questions constitute four main contributions that justify this research. Thus, this study contributes to the literature on financial fraud detection by examining the relationship between the current literature on financial fraud detection and ML based on the scholars, articles, countries, journals, and trends in the area. Fraud has been classified as internal and external, with a focus on credit card loan fraud investigations and insurance fraud. The different ML techniques and their models applied to experiments were grouped. The most widely used datasets in financial fraud detection using ML are analyzed according to the 86 articles that contained experiments, highlighting that most of them involve real data. This paper is useful for researchers because it studies and presents the metrics used in supervised and unsupervised learning experiments, providing a clear view of their application in the different models.
Therefore, this study is relevant because it presents in a consolidated and updated manner new contributions derived from experiment results regarding the use of ML, which helps address the problem when financial fraud occurs.
The research work is organized as follows: the section “Methods” comprehensively describes the research method and the questions addressed in the study. Section “Results of the data synthesis” presents the findings encompassing authors, articles, sources, countries, trends, financial fraud types, and datasets with their characteristics to which the detection models using ML techniques were applied, with the results of their metrics. Finally, the section “Discussion and conclusion” highlights the conclusions, including future lines of research in the field.
The study is an SLR, which provides a comprehensive view of the major developments in financial fraud detection. With this purpose in mind, the literature review followed the scientific guidelines of the PRISMA and Kitchenham methods, which were adapted by the authors (Ashtiani and Raahemi, 2022; Kitchenham and Brereton, 2013; Kitchenham and Stuart, 2007; Kumbure et al., 2022; Moher et al., 2009; Roehrs et al., 2017; Saputra et al., 2023; Wohlin, 2014).
The method used in the SLR was developed with carefully planned and executed activities: (a) planning of the review, (b) definition of research questions, (c) description of the search strategy, (d) consultation concerning the search strategy, (e) selection of the inclusion/exclusion criteria and data selection, (f) description of the quality assessment, (g) investigation of the study topics, (h) description of data extraction, and (i) synthesis of the data.
Each of the activities conducted in this study is explained below.
The research purpose was established in accordance with the indicated research goals and questions. The analysis focused on research articles published between 2012 and 2023, particularly those using ML methods for financial fraud detection. Accordingly, the SLR procedure presented by Kitchenham and Stuart ( 2007 ) and Moher et al. ( 2009 ) was implemented following a series of steps adapted and modified by Ashtiani and Raahemi ( 2022 ) and Kumbure et al. ( 2022 ), as depicted in Fig. 1 . Thus, it was possible to ensure a rigorous and objective analysis of the available literature in our field of interest.
Fig. 1. Description of the general process used to review the literature in the study area. Authors’ own elaboration.
The procedures implemented in this review process are discussed in the following subsections.
In SLR, research questions are key and decisive for the success of the study (Kitchenham and Stuart, 2007 ). Therefore, analyzing the existing literature on financial fraud detection through ML techniques and its characteristics, problems, challenges, solutions, and research trends is crucial. Table 1 describes the research questions to provide a structured framework for the study.
Within the proposed systematic review, the questions were fine-tuned, achieving a better classification and thematic analysis. The research questions were categorized into two groups: general questions (GQ) and specific questions (SQ). GQs provide an overview of the current state of the art, that is, a general framework for future research. Meanwhile, SQs focus on specific matters emerging from the application areas of the topic, thereby improving the filtering process of the study.
The search strategy was designed to identify a set of studies addressing the research questions posed. It was implemented in two stages. In the first stage, a manual search was conducted by selecting a set of test documents from a defined database. Following the strategy proposed by Wohlin (2014), a snowball search was then conducted. This approach starts from a set of initial references (e.g., relevant articles or books addressing the subject matter) and uses them to find new related references relevant to the study.
In the second stage, an automated search was performed using the technique described by Kitchenham and Brereton ( 2013 ), which included preparing a list of the main search terms to be applied in the queries in each database, as indicated in subsection “Search queries”.
In the study’s initial stage, nine journal articles were selected from the test set of papers (Ahmed et al., 2016 ; Ali et al., 2022 ; Bakumenko and Elragal, 2022 ; Gupta and Mehta, 2021 ; Hilal et al., 2022 ; Nicholls et al., 2021 ; Nonnenmacher and Marx Gómez, 2021 ; Ramírez-Alpízar et al., 2020 ; West and Bhattacharya, 2016 ). The manual literature search helped identify articles related to financial fraud detection through ML techniques, which were used as an initial set and were part of the final analysis. In the subsequent stage, a backward and forward snowball search was conducted. This approach involved using the initial set to select the relevant articles.
The backward snowball search process comprised reviewing article titles, including those meeting the inclusion and exclusion criteria. In the forward snowball search, the analysis was performed in the Scopus database to identify studies citing one or more of the articles in the initial set. This filtering method helped identify studies meeting the inclusion and exclusion criteria, eliminate duplicates from the previous set, and analyze articles answering the questions posed, which were retained in the final study set.
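The backward and forward snowball steps can be sketched as two lookups over a citation graph. This is only an illustration under stated assumptions: the paper identifiers, the graph, and the `meets_criteria` predicate are hypothetical, and in the study itself the forward search was run in Scopus rather than over a local data structure.

```python
# Hypothetical citation graph: paper id -> ids of the papers it cites.
cites = {
    "seed_A": ["old_1", "old_2"],
    "seed_B": ["old_2", "old_3"],
    "new_1": ["seed_A"],
    "new_2": ["seed_B", "old_1"],
    "new_3": ["old_3"],
}

def backward_snowball(seeds, graph):
    """Papers that appear in the reference lists of the seed papers."""
    return {ref for s in seeds for ref in graph.get(s, [])}

def forward_snowball(seeds, graph):
    """Papers whose reference lists contain at least one seed paper."""
    return {p for p, refs in graph.items() if any(s in refs for s in seeds)}

def snowball(seeds, graph, meets_criteria):
    """Combine both directions, drop duplicates and seeds, then screen."""
    candidates = backward_snowball(seeds, graph) | forward_snowball(seeds, graph)
    # The set union removes duplicates; seeds are already part of the final set.
    return {p for p in candidates - set(seeds) if meets_criteria(p)}

seeds = {"seed_A", "seed_B"}
# Placeholder screening rule (an assumption): pretend "old_3" fails the criteria.
selected = snowball(seeds, cites, meets_criteria=lambda p: p != "old_3")
print(sorted(selected))
```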
The research work mainly aimed to obtain a reliable set of relevant studies to minimize bias and increase the validity of the results. To this end, a manual search for articles meeting the inclusion and exclusion criteria was conducted by assessing the abstracts and other sections of articles. To complement this search, we implemented an automated search strategy, with the inclusion and exclusion criteria already defined, across five databases known for their impartiality in the representation of research works: Scopus, IEEE Xplore, Taylor & Francis, SAGE, and ScienceDirect. In total, 104 related articles meeting the established criteria were identified for the final set.
Studies from 2012 onward were reviewed with keywords such as “financial fraud” and “machine learning” to identify model-based approaches and associated techniques. Table 2 presents a summary of the queries used in each data source.
The study established inclusion and exclusion criteria, a key process to select the most relevant articles. Documents published between 2012 and 2023 (until March) were considered, and document types such as conference reviews, book chapters, editorials, and reviews were excluded. Further, the availability of the full text of the article was considered. We decided to exclude articles published before 2012 for the following reasons: (i) they were over 11 years old; (ii) relevant publications prior to 2012 were scarce; and (iii) a sufficient number of articles were available between 2012 and 2023.
For the inclusion and exclusion criteria, appropriate filtering tools were applied to each data source during the search stage. This enabled the automated selection of the most relevant and appropriate studies based on the research goal.
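A screening step of this kind can be expressed as a simple filter. The sketch below is an assumption-laden illustration, not the authors' tooling: the `Record` fields and the sample records are hypothetical, and the publication window is checked at year granularity only, so the March 2023 cutoff is not modeled.

```python
from dataclasses import dataclass

# Document types excluded in the review, as listed in the text.
EXCLUDED_TYPES = {"conference review", "book chapter", "editorial", "review"}

@dataclass
class Record:
    title: str           # hypothetical field names, chosen for illustration
    year: int
    doc_type: str
    has_full_text: bool

def is_included(rec: Record) -> bool:
    """Apply three screening rules: publication window, document type, full text."""
    in_window = 2012 <= rec.year <= 2023  # year granularity only
    allowed_type = rec.doc_type.lower() not in EXCLUDED_TYPES
    return in_window and allowed_type and rec.has_full_text

# Invented records, used only to exercise the filter.
records = [
    Record("Paper A", 2015, "article", True),
    Record("Paper B", 2010, "article", True),    # too old
    Record("Paper C", 2020, "editorial", True),  # excluded type
    Record("Paper D", 2022, "article", False),   # no full text
]
kept = [r.title for r in records if is_included(r)]
print(kept)
```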
In the data processing strategy used, databases were selected following strict inclusion and exclusion criteria to ensure the quality and relevance of the information collected (Table 3 ). Various databases initially identified the following number of relevant articles: Scopus (28), Taylor & Francis (80), SAGE (71), ScienceDirect (663), and IEEE Xplore (5132). This initial step provides a broad overview of the available literature in the field of financial fraud detection using ML models.
Subsequently, a removal phase was carried out in which the following numbers of articles were removed from each database: Scopus (0), Taylor & Francis (63), SAGE (57), ScienceDirect (636), and IEEE Xplore (5114). This step safeguards the integrity of the data collected and avoids redundancy.
The final step consisted of obtaining the consolidated number of articles included after selection and the exclusion of duplicates: Scopus (28), Taylor & Francis (17), SAGE (14), ScienceDirect (27), and IEEE Xplore (18), for a total of 104 articles. This methodological strategy ensured that the retained articles were relevant and offered a complete analysis in the field of financial fraud detection using ML models.
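The screening arithmetic reported in the three steps above can be checked with a short sketch. The counts are those given in the text; the variable names are illustrative only.

```python
# Article counts per database, as reported in the text.
identified = {
    "Scopus": 28,
    "Taylor & Francis": 80,
    "SAGE": 71,
    "ScienceDirect": 663,
    "IEEE Xplore": 5132,
}
removed = {
    "Scopus": 0,
    "Taylor & Francis": 63,
    "SAGE": 57,
    "ScienceDirect": 636,
    "IEEE Xplore": 5114,
}

# Articles retained per database = identified - removed.
included = {db: identified[db] - removed[db] for db in identified}
total_included = sum(included.values())

for db, count in included.items():
    print(f"{db}: {count}")
print("Total included:", total_included)
```

The per-database results agree with the consolidated figures listed above, and their sum equals the 104 articles in the final set.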
Once the inclusion and exclusion criteria were applied, the remaining articles were assessed for quality. The evaluation criteria used included the purpose of the research; contextualization; literature review; and related works, methods, conclusions, and results. To minimize the empirical obstacles associated with full-text filtering, a set of questions proposed by Roehrs et al. ( 2017 ) (see Table 4 ) was used to validate whether the selected articles met the previously established quality criteria.
In conducting the literature review to understand the current state of published research on the topic, a data orientation process was addressed, including preprocessing techniques and ML models and their metrics. Accordingly, four research topics were defined based on the research goals. They are presented in Table 5 .
For data extraction, the necessary attributes were first defined and the information pertaining to the study goals was summarized. Next, the relevant information was identified and obtained through a detailed reading of the full text of each article. The information was then stored in a Microsoft Excel spreadsheet. Data were collected on the attributes specified in Table 6 . In Table 6 , the “Study” column corresponds to the identifiers of the research topics in Quality Assessment, and the “Subject” column refers to the category to which the different attributes belong. The names of the attributes and a brief description are presented in the last two columns of the table, including additional columns with relevant information.
Data synthesis included analyzing and summarizing the information observed in the selected articles to address the research questions. To perform this task, a synthesis was conducted following the guidelines proposed by Moher et al. ( 2009 ) based on qualitative data. Further, a descriptive analysis was performed to obtain answers to the research questions. Consequently, a qualitative approach to data evidence was followed.
In this section, the 104 finally selected articles have been considered. The data were synthesized to address the five research questions mentioned.
General questions (GQ)
GQ1: Which were the most relevant authors, articles, sources, countries, and trends in the literature review on financial fraud detection based on the application of machine learning (ML) models?
The literature on financial fraud detection applying ML models has been studied by a large number of authors. However, some authors stood out in terms of the number of published papers and number of citations. Specifically, the most significant authors with two publications are Ahmed M. (with 318 citations), Ileberi E. (82 citations), Ali A. (20 citations), Chen S. (84 citations), and Domashova J and Kripak E. (each with 6 citations). Other relevant authors with one publication and who have been cited several times are Abdallah A. (with 333 citations), Abbasimehr H. (18 citations), Abd Razak S. (13 citations), Achakzai M. A. K. (5 citations), and Abosaq H. (2 citations). The aforementioned authors have contributed significantly to the development of research in financial fraud detection using ML models (Fig. 2 ).
Fig. 2. Analysis of the connections between authors based on co-authorship of publications. Produced with VOSviewer.
Collectively, the researchers have contributed a solid knowledge base and have laid the foundation for future research in financial fraud detection using ML models. Although other researchers contributed to the field, such as Khan, S. and Mishra, B., both with 7 citations, among others, some have been more prominent in terms of the number of papers published. Their collective works have enriched the field and have promoted a greater understanding of the challenges and opportunities in this area.
As depicted in Fig. 3 , clusters 2 (green) and 4 (yellow) present the most relevant research articles on financial fraud detection using ML models. Cluster 2, comprising 9 articles with 357 citations and 32 links, is highlighted because of the significant impact of the articles by Sahin, Huang, and Kim. These articles have the highest number of citations and are deemed to be useful starting points for those intending to dive into this research field. Cluster 4, constituting 6 articles with 158 citations and 27 links, includes the works of Dutta and Kim, who have also been cited considerably.
Fig. 3. Connections between articles based on their bibliographic references. Produced with VOSviewer.
Articles in clusters 1 (red) and 3 (dark blue) could also be valuable sources of information, although they have fewer citations and links than those in clusters 2 and 4. Even so, some articles in these clusters have been cited substantially, such as those by Nian K. (62 citations and 4 links) and Olszewski (92 citations and 4 links).
In Cluster 10 (pink), the article by Reurink A. is prominent, with 38 citations. This is followed by the article by Ashtiani M.N. with 10 citations. In Cluster 11 (light green), the article by Hájek P. has 129 citations. In Cluster 12 (grayish blue), the articles by Blaszczynski J. and Elshaar S. have the greatest number of citations, indicating their influence in the field of financial fraud detection.
In Cluster 13 (light brown), the article by Pourhabibi T. has the greatest number of citations at 102, suggesting that it has been relevant in the research on financial fraud detection. Finally, in Cluster 14 (purple), the article by Seera M. has 63 citations and 2 links, and the article by Ileberi E. has 11 citations and 1 link. Both articles have a comparatively small number of citations, indicating a lower influence on the topic.
In conclusion, clusters 2, 4, and 11 are the most relevant in this literature review. The articles by Sahin, Huang, Kim, Dutta, and Pumsirirat are the most influential ones in the research on financial fraud detection through the application of ML models.
The information presented in Fig. 4 is the result of a clustering analysis of the articles resulting from the literature review on financial fraud detection by implementing ML models. In total, 48 items were identified and grouped into 12 clusters. The links between the items were 100, with a total link strength of 123.
Fig. 4. Relationship between different scientific journals based on bibliographic links. Produced with VOSviewer.
The following is a description of each cluster with its respective number of items, links, and total link strength (the number of times a link appears between two items and its strength):
Cluster 1 (6 articles—red): This cluster includes journals such as Computers and Security , Journal of Network and Computer Applications , and Journal of Advances in Information Technology . The total number of links is 27, and the total link strength is 32.
Cluster 2 (6 articles—dark green): This cluster includes articles from Technological Forecasting and Social Change , Journal of Open Innovation: Technology, Market, and Complexity , and Global Business Review . The total number of links is 18, and the total link strength is 19.
Cluster 3 (5 articles—dark blue): This cluster includes articles from the International Journal of Advanced Computer Science and Applications , Decision Support Systems , and Sustainability . The total number of links is 19, and the total link strength is 20.
Cluster 4 (4 articles—dark yellow): This cluster includes articles from Expert Systems with Applications and Applied Artificial Intelligence . The total number of links is 26, and the total link strength is 45.
Cluster 5 (4 articles—purple): This cluster includes articles from Future Generation Computer Systems and the International Journal of Accounting Information Systems . The total number of links is 15, and the total link strength is 16.
Cluster 6 (4 articles—dark blue): This cluster includes articles from IEEE Access and Applied Intelligence . The total number of links is 18, and the total link strength is 26.
Cluster 7 (4 articles—orange): This cluster includes articles from Knowledge-Based Systems and Mathematics . The total number of links is 23, and the total link strength is 29.
Cluster 8 (4 articles—brown): This cluster includes articles from the Journal of King Saud University—Computer and Information Sciences and the Journal of Finance and Data Science . The total number of links is 13, and the total link strength is 13.
Cluster 9 (4 articles—light purple): This cluster includes articles from the International Journal of Digital Accounting Research and Information Processing and Management . The total number of links is 2, and the total link strength is 2.
The clusters represent groups of related articles published in different academic journals. Each cluster has a specific number of articles, links, and total link strength. These findings provide an overview of the distribution and connectedness of articles in the literature on financial fraud detection using ML models. Further, clustering helps identify patterns and common thematic areas in the research, which may be useful for future researchers seeking to explore this field.
Clusters 1, 4, and 7 have the greatest number of links and the highest total link strength. These clusters encompass articles from Computers and Security, Expert Systems with Applications, and Knowledge-Based Systems, which are important sources for the SLR on financial fraud detection through the implementation of ML models.
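The ranking behind this observation can be reproduced from the link counts and total link strengths listed for clusters 1 to 9 above. The following is a minimal sketch; the data structure and helper name are illustrative, not part of the original analysis.

```python
# (links, total link strength) per cluster, as listed in the text.
clusters = {
    1: (27, 32),
    2: (18, 19),
    3: (19, 20),
    4: (26, 45),
    5: (15, 16),
    6: (18, 26),
    7: (23, 29),
    8: (13, 13),
    9: (2, 2),
}

def top_clusters(metric_index, n=3):
    """Return the n cluster ids with the largest value of the chosen metric."""
    ranked = sorted(clusters, key=lambda c: clusters[c][metric_index], reverse=True)
    return ranked[:n]

top_by_links = top_clusters(0)
top_by_strength = top_clusters(1)
print("Top by links:", top_by_links)
print("Top by total link strength:", top_by_strength)
```

Both rankings select the same three clusters (1, 4, and 7), although their order differs between the two metrics.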
The analysis indicates the number of documents produced in different countries and territories. A list of 50 countries/territories and the number of documents related to the research conducted in each of them is presented. China leads with the highest paper count at 18, followed by India at 13 and Saudi Arabia and Canada at 9 each. Malaysia, Pakistan, South Africa, the United Kingdom, France, Germany, and Russia have similar research outputs, with 4–9 papers. Sweden and Romania have 1 or 2 research papers, indicating limited scientific research output.
The presence of little-known countries such as Armenia, Costa Rica, and Slovenia suggests ongoing research in places less common in the academic world. From that point on, the number of papers has gradually decreased.
The production of papers is geographically distributed across countries from different continents and regions. However, more research exists on the subject from countries with developed and transition economies, which allows for a greater capacity to conduct research and produce papers.
Figure 5 , sourced from Scopus’s “Analyze search results” option, depicts countries with their respective number of published papers on the topic of financial fraud detection through ML models.
Fig. 5. Number of scientific publications in the study area classified by country. Produced with VOSviewer.
The above shows the diversity of countries involved in the research: China leads with 18 papers, followed by India with 13 and Saudi Arabia and Canada with 9 each. The other countries show little production, with fewer than 7 publications each, which indicates an emerging topic of interest for the survival of companies that must prevent and detect different financial frauds using ML techniques.
The most relevant keywords in the review of literature on financial fraud detection implementing ML models include the following:
In Cluster 1, the most relevant keywords are “decision trees” (13 repetitions), “support vector machine (SVM)” (11 repetitions), “machine-learning” (10 repetitions), and “credit card fraud detection” (9 repetitions). The focus here is on ML as a branch of artificial intelligence, in particular supervised learning algorithms such as decision trees and support vector machines, and on their application to credit card fraud detection.
In Cluster 2, the most relevant keywords are “crime” (46 repetitions), “fraud detection” (43 repetitions), and “learning systems” (13 repetitions). These terms reflect a broader focus on financial fraud detection, where the aspects of crime in general, fraud detection, and learning systems used for this purpose have been addressed.
In Cluster 3, the most relevant keywords are “Finance” (19 repetitions), “Data Mining” (18 repetitions), and “Financial Fraud” (12 repetitions). These keywords indicate a focus on the financial industry, where data mining is used to reveal patterns and trends related to financial fraud.
In Cluster 4, the most relevant keywords are “Machine Learning” (45 repetitions), “Anomaly Detection” (16 repetitions), and “Deep Learning” (11 repetitions). They reflect an emphasis on the use of traditional ML and deep learning techniques for anomaly detection and financial fraud detection.
In general, the different clusters indicate the most relevant keywords in the SLR on financial fraud detection through ML models. Each cluster presents a specific set of keywords reflecting the most relevant trends and approaches in this field of research (Fig. 6 ).
Fig. 6. Relationships between keywords based on their co-occurrence in the literature reviewed. Produced with VOSviewer.
GQ2: What types of financial fraud have been identified in ML studies?
Financial fraud is generated by weaknesses in companies’ control mechanisms, which are analyzed based on the variables that allow them to materialize. These include opportunity, motivation, self-fulfillment, capacity, and pressure. Some of these are comprehensively analyzed by Donald Cressey through the fraud theory approach. The lack of modern controls has led organizations to use ML in response to this major problem. According to the findings of the Global Economic Crime and Fraud Survey 2022–2023, which gathered insights from 1,028 respondents across 36 countries worldwide, instances of fraud within these companies have caused a financial loss of approximately 10 million dollars (PricewaterhouseCoopers, 2022 ).
Referring to the concept of fraud, as outlined in international studies (Estupiñán Gaitán, 2015 ; Márquez Arcila, 2019 ; Montes Salazar, 2019 ) and the guidelines of the American Institute of Certified Public Accountants, it is an illegal, intentional act in which there is a victim (someone who loses a financial resource) and a victimizer (someone who obtains a financial resource from the victim). Thus, the proposed classification includes corporate fraud and/or fraud in organizations, considering that the purpose is to misappropriate the capital resources of an entity or individual: cash, bank accounts, loans, bonds, stocks, real estate, and precious metals, among others.
In this SLR study, we have considered fraud classifications by authors of 86 articles, which encompass experiments. We have excluded the 18 SLR articles from our analysis. The types presented in Table 7 follow the holistic view of the authors of the research for a better understanding of the subject of financial fraud, considering whether it is internal or external fraud.
Table 7 highlights the diverse types of frauds, and the research works on them. According to the classification, external frauds correspond to those performed by stakeholders outside the company. This study’s findings show that 54% of the analyzed articles investigate external fraud, among which the most important studies are on credit card loan fraud, followed by insurance fraud, using supervised and unsupervised ML techniques for their detection.
In research works (Kumar et al., 2022 ) analyzing credit card fraud, attention is drawn to the importance of prevention through the behavioral analysis of customers who acquire a bank loan and identifying applicants for bad loans through ML models. The datasets used in these fraud studies have covered transactions performed by credit card holders (Alarfaj et al., 2022 ; Baker et al., 2022 ; Hamza et al., 2023 ; Madhurya et al., 2022 ; Ounacer et al., 2018 ; Sahin et al., 2013 ), while other research works have covered master credit card money transactions in different countries (Wu et al., 2023 ) and fraudulent transactions gathered from 2014 to 2016 by the international auditing firm Mazars (Smith and Valverde, 2021 ).
The second major type of external fraud is insurance fraud. It is classified into fraud in health insurance programs, involving practices such as document forgery, fraudulent billing, and false medical prescriptions (Sathya and Balakumar, 2022; Van Capelleveen et al., 2016), and automobile insurance fraud, involving fraudulent actions between policyholders and repair shops, who mutually rely on each other to obtain benefits (Aslam et al., 2022; Nian et al., 2016; Subudhi and Panigrahi, 2020). In response to these issues, insurance companies have developed robust models using ML.
As regards internal fraud, caused by an individual within the company, 46% of studies have analyzed this type, with financial statement fraud, money laundering fraud, and tax fraud standing out. The studies show that the investigations are based on information reported by the US Securities and Exchange Commission (SEC) and the stock exchanges of China, Canada, Tehran, and Taiwan, among others. To a considerable extent, the information taken is from the real sector, and very few studies have obtained synthetic information based on the application of different learning models.
The following is a summary of the financial information obtained by the researchers to apply AI models and techniques:
Stock market financial reports: fraud in the Canadian securities industry (Lokanan and Sharma, 2022), companies listed on the Chinese stock exchanges (Achakzai and Juan, 2022; Y. Chen and Wu, 2022; Xiuguo and Shengyong, 2022), companies with shares according to the SEC (Hajek and Henriques, 2017; Papík and Papíková, 2022), companies listed on the Tehran Stock Exchange (Kootanaee et al., 2021), companies in the Taiwan Economic Journal Data Bank (TEJ) stock market (S. Chen, 2016; S. Chen et al., 2014), and analysis of SEC accounting and auditing publications (Whiting et al., 2012)
Wrong financial reporting to manipulate stock prices (Chullamonthon and Tangamchit, 2023 ; Khan et al., 2022 ; Zhao and Bai, 2022 )
Financial data of 2318 companies with the highest number of financial frauds (mechanical equipment, medical biology, media, and chemical industries; Shou et al., 2023 ), fraudulent financial restatements (Dutta et al., 2017 )
Data from 950 companies in the Middle East and North Africa region (Ali et al., 2023 ), analyzing outliers in sampling risk and inefficiency of general ledger financial auditing (Bakumenko and Elragal, 2022 ), fraudulent intent errors by top management of public companies (Y. J. Kim et al., 2016 ), reporting of general ledger journal entries from an enterprise resource planning system (Zupan et al., 2020 )
Synthetic financial dataset for fraud detection (Alwadain et al., 2023 ).
Studies have analyzed situations involving fraudulent financial statements. In these cases, instances of fraud have already occurred, leading to the creation of financial reports that contain statements with outliers that can be deemed fraudulent intent or errors in financial figures. This raises a reasonable doubt about whether an intent exists with regard to the reporting of unrealistic figures. Notably, once there are parties responsible for the financial information presented to stakeholders, such as organization owners, managers, administrators, accountants, or auditors, it is unlikely for it to be unintentional (an error). In this context, transparency and explainability are essential so as to ensure fairness in decisions, thus avoiding bias and discrimination based on prejudiced data (Rakowski et al., 2021 ).
Because of its significance, the information reported in financial statements is vital for investigations. Studies have indicated substantial amounts of data extracted from the financial reports of regulatory bodies such as stock exchanges and auditing firms. These entities use the data to establish the existence of fraud and its types through predictive models that use ML techniques. Thus, they require financial data such as dates, the third party affected, user, debit or credit amount, and type of document, among other aspects involving an accounting record. This information aids in identifying the possible impact in terms of lower profits and the perpetrator and/or perpetrators to gather sufficient evidence and file criminal proceedings for the financial damage caused.
Moreover, investigations concerning money laundering, the second most investigated internal fraud type, encompass the reports of natural and legal persons exposed by the Financial Action Task Force in countries such as the Kingdom of Saudi Arabia (Alsuwailem et al., 2022), transactions from April to September 2018 from Taiwan’s “T” bank and the account watch list of the National Police Agency of the Ministry of Interior (Ti et al., 2022), money laundering frauds in Middle East banks (Lokanan, 2022), transactions of financial institutions in Mexico from January 2020 (Rocha-Salazar et al., 2021), and synthetic data of simulated banking transactions (Usman et al., 2023).
Concerns regarding the entry of proceeds from money laundering into an organization have been articulated in relation to the financial damage it causes to the country. At the macroeconomic level, these activities negatively affect financial stability, distorting the prices of goods and services. Moreover, such activities disrupt markets, making it difficult to make efficient financial decisions. At the microeconomic level, legitimate businesses face unfair competition with companies using illegal money, which may lead to higher unemployment levels. Furthermore, money laundering has a social impact because it affects the security and welfare of society.
Thus, some research works (Alsuwailem et al., 2022) have indicated the need to implement ML models for promoting anti-money laundering measures. For instance, in Saudi Arabia, money from illicit drug trafficking, corruption, counterfeiting, and product piracy has entered the country. The measures to be taken are categorized according to the three stages of money laundering: placement, layering (also known as concealment), and integration. These include new legal regulations against money laundering, staff training, customer identification and validation, reporting of suspicious activities, and documentation and storage of relevant data (Bolgorian et al., 2023).
Regarding the 7.5% incidence of internal fraud, specifically categorized as tax fraud resulting from tax evasion, the studies have analyzed tax returns on income and/or profits of legal persons and/or individuals from the Serbian tax administration during 2016–2017 (Savić et al., 2022 ). Studies have encompassed periodic value-added tax (VAT) returns, together with the anonymous list of clients for the tax year 2014 obtained from the Belgian tax administration (Vanhoeyveld et al., 2020 ) and income tax and VAT taxpayers registered and provided by the State Revenue Committee of the Republic of Armenia in 2018 (Baghdasaryan et al., 2022 ). These studies hold great relevance for tax administrations using different strategies to minimize the impact of fraud resulting from tax evasion. Tax evasion reduces the government’s ability to collect revenue, directly affecting government finances and causing budget deficits, thereby increasing public debt.
GQ3: Which ML models were implemented to detect financial fraud in the datasets?
Given that ML is a key tool to extract meaningful information and make informed decisions, this study analyzes the most widely used ML techniques in the field of financial fraud detection. It takes as reference 86 experimental articles, excluding 18 SLR articles. In these articles, the most commonly used trends and approaches in the implementation of ML techniques in financial fraud detection were identified.
For the analysis, the frequency of use of ML models was observed. Several of them have been prominent because of their popularity and implementation in detecting financial fraud (Fig. 7). Some of the most widely used models include long short-term memory (LSTM) with 7 mentions, autoencoder with 10 mentions, XGBoost with 13 mentions, k-nearest neighbors (KNN) with 14 mentions, artificial neural network (ANN) with 17 mentions, NB with 19 mentions, SVM with 29 mentions, DT with 29 mentions, LR with 32 mentions, and RF with 34 mentions.
Fig. 7. Most common machine learning models in financial fraud detection. Authors’ own elaboration.
The LSTM model is a recurrent neural network used for sequence processing, especially for tasks concerning natural language processing (Chullamonthon and Tangamchit, 2023 ; Esenogho et al., 2022 ; Femila Roseline et al., 2022 ). Moreover, autoencoders are models used for data compression and decompression. These models are useful in dimensionality reduction applications (Misra et al., 2020 ; Srokosz et al., 2023 ). XGBoost is a library combining multiple weak DT models, offering a scalable and efficient solution in classification and regression tasks (Dalal et al., 2022 ; Udeze et al., 2022 ).
KNN and ANN are widely used models in various ML applications. KNN is based on neighbor closeness, and ANN is inspired by human brain functioning. NB is a probabilistic algorithm commonly used in text classification and data mining (Ashtiani and Raahemi, 2022 ; Lei et al., 2022 ; Shahana et al., 2023 ).
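To make the idea of "neighbor closeness" concrete, the following is a minimal sketch of a KNN classifier using only the Python standard library. The two-dimensional toy points, the labels (0 for genuine, 1 for fraud), and the function name `knn_predict` are hypothetical and are not taken from any of the reviewed studies.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (features, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical, already-scaled feature vectors with labels 0 (genuine) / 1 (fraud).
train = [
    ((1.0, 1.0), 0), ((1.2, 0.9), 0), ((0.8, 1.1), 0),
    ((5.0, 5.2), 1), ((5.1, 4.9), 1), ((4.8, 5.0), 1),
]

print(knn_predict(train, (1.1, 1.0)))  # near the genuine group
print(knn_predict(train, (5.0, 5.0)))  # near the fraud group
```

In practice, features are usually scaled before distances are computed, because otherwise the feature with the largest numeric range dominates the vote.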
SVM, DT, LR, and RF, the most commonly mentioned models, are used in a wide range of classification and regression applications. These models are prominent because of their effectiveness and applicability to different scenarios, such as credit card loan fraud (external fraud) and financial statement fraud (internal fraud).
The most frequently used ML techniques are supervised learning (56.73%), unsupervised learning (18.27%), a combination of supervised and unsupervised learning (15.38%), a combination of supervised and deep learning (2.88%), and a mathematical approach combined with supervised and semi-supervised learning (0.96%). Figure 8 presents the ML techniques in the literature reviewed and indicates the number of times each type of technique is applied. Some articles applied several ML methods, in which the algorithms are mainly classified according to the learning method. In this case, there are four main types: supervised, semi-supervised, unsupervised, and deep learning.
Fig. 8. Different experimental approaches used in the study. Authors’ own elaboration.
Supervised learning is the most widely used technique, with 56.73% of mentions in financial fraud studies. In this approach, a model is built from labeled training data, where the expected outputs are known, so that it can make accurate predictions on new unlabeled data. Common examples of supervised learning techniques include LR, SVM, DT, RF, KNN, NB, and ANN.
Moreover, unsupervised learning constitutes 18.27% of the mentions. The technique focuses on discovering patterns in the data without labeled examples for training. Techniques of this kind include DBSCAN, autoencoder, and isolation forest (IF).
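As a rough illustration of detecting anomalies without labels, the sketch below scores each value by its distance from the median, measured in units of the median absolute deviation (MAD). This is a deliberately simple stand-in and not DBSCAN, an autoencoder, or an isolation forest; the amounts, the threshold of 3.5, and the function name are assumptions chosen for the example.

```python
import statistics

def mad_outliers(values, threshold=3.5):
    """Return indices of values whose robust score exceeds `threshold`.

    The score is |x - median| / MAD; no labels are used at any point.
    """
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)
    if mad == 0:
        return []  # all deviations are zero or too concentrated to score
    return [i for i, v in enumerate(values) if abs(v - median) / mad > threshold]

# Hypothetical transaction amounts; the last one is far from the rest.
amounts = [12, 15, 14, 13, 16, 15, 14, 400]
print(mad_outliers(amounts))
```

Here only the last amount is flagged. Real unsupervised detectors work on many features at once, but the principle is the same: unusual records are identified from the structure of the data rather than from known fraud labels.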
The combination of supervised, unsupervised, and semi-supervised learning is used with a frequency of 1.92%. This technique and/or approach combines elements of supervised and unsupervised learning, using both labeled and unlabeled data to train the models. It is also used when labeled data are scarce or expensive to obtain; thus, the aim is to take advantage of unlabeled information to improve model performance.
Finally, supervised and deep learning represents 2.88% of the mentions. It is based on deep neural networks with multiple neurons and hidden layers to learn complex data representations. It has achieved remarkable developments in areas such as image processing, voice recognition, and machine translation.
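The shares quoted in this section can be related back to article counts with a short calculation. The sketch assumes that each percentage is computed over the 104 selected articles; this denominator is an assumption, since the text does not state it explicitly, and the category labels are shortened for readability.

```python
TOTAL_ARTICLES = 104  # assumption: shares are taken over all selected articles

# Percentages as quoted in the text.
reported_shares = {
    "supervised": 56.73,
    "unsupervised": 18.27,
    "supervised + unsupervised": 15.38,
    "supervised + deep learning": 2.88,
    "supervised + unsupervised + semi-supervised": 1.92,
    "mathematical + supervised + semi-supervised": 0.96,
}

def implied_count(share_percent, total=TOTAL_ARTICLES):
    """Nearest whole number of articles that corresponds to a percentage."""
    return round(share_percent / 100 * total)

for name, share in reported_shares.items():
    count = implied_count(share)
    recomputed = 100 * count / TOTAL_ARTICLES
    print(f"{name}: about {count} articles ({recomputed:.2f}%)")
```

Under this assumption, every quoted share corresponds to a whole number of articles, and recomputing the percentage from that count reproduces the quoted value to two decimals.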
Specific questions (SQ)
SQ1: What datasets were used by implementing ML models for financial fraud detection?
First, the data structure and fraud types may vary with the collection of datasets. The performance of fraud detection models may be affected by variations in the number of instances and attributes selected. Therefore, investigating the datasets and their characteristics is relevant, as data differ in terms of data type (number, text) and the data source from which they were obtained (synthetic and/or real), as can be observed in Fig. 9 .
Fig. 9. Datasets used in the research on financial fraud detection. Authors’ own elaboration.
The dataset was created by the Machine Learning Group at Université Libre de Bruxelles. It encompasses anonymized credit card transactions labeled as fraudulent or genuine. The transactions were performed in September 2013 over two days by European cardholders. With only 492 frauds out of 284,807 transactions, the dataset is highly unbalanced: the positive class (frauds) represents only 0.172% of all transactions (Machine Learning Group, 2018).
The dataset comprises numerical variables resulting from a principal component analysis (PCA) transformation. For confidentiality, the original features of the data have not been disclosed. Features V1, V2, …, V28 are the principal components obtained through PCA. The only features that have not been transformed with PCA are “Time,” which denotes the seconds elapsed between each transaction and the first transaction in the dataset, and “Amount,” which denotes the transaction amount. The “Class” feature is the response variable, taking the value 1 in case of fraud and 0 otherwise.
This dataset has been used by 15 authors in their papers, who have applied different financial fraud detection techniques (Alarfaj et al., 2022 ; Baker et al., 2022 ; Fanai and Abbasimehr, 2023 ; Fang et al., 2019 ; Femila Roseline et al., 2022 ; Hwang and Kim, 2020 ; Ileberi et al., 2021 , 2022 ; Khan et al., 2022 ; Misra et al., 2020 ; Ounacer et al., 2022 ).
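The imbalance reported for this credit card dataset has a practical consequence that a short calculation makes visible. The sketch below uses only the two counts given in the text; the "always predict genuine" baseline is a hypothetical reference point, not a model from the reviewed studies.

```python
fraud = 492          # fraudulent transactions, as reported
total = 284_807      # all transactions, as reported
genuine = total - fraud

fraud_share = fraud / total

# Hypothetical baseline that labels every transaction as genuine.
true_positives = 0
false_negatives = fraud
true_negatives = genuine

accuracy = (true_positives + true_negatives) / total
recall = true_positives / (true_positives + false_negatives)

print(f"Fraud share: {fraud_share:.4%}")
print(f"Baseline accuracy: {accuracy:.4%}")
print(f"Baseline recall on fraud: {recall:.0%}")
```

The baseline reaches an accuracy above 99.8% while detecting no fraud at all, which is why studies on such data typically report recall, precision, or related measures alongside accuracy.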
The dataset was contributed by Professor Hofmann to the UC Irvine ML repository on November 16, 1994, to facilitate credit rating (Hofmann, 1994). It mainly aims to determine whether a person presents a favorable or unfavorable credit risk (binary rating). The set is multivariate, which implies that it contains many attributes used in credit rating. These attributes include information on existing current account status, credit duration, credit history, and credit purpose and amount, among others. In total, the dataset contains 1000 instances and 20 attributes describing several characteristics of individuals; it has been widely used in research related to credit rating (Esenogho et al., 2022; Fanai and Abbasimehr, 2023; Lee et al., 2018; Pumsirirat and Yan, 2018; Seera et al., 2021).
The dataset belongs to the UC Irvine ML repository and was created by Ross Quinlan in 1997. It focuses on credit card applications within the financial field (Quinlan, 1997). It has a total of 690 instances and 14 attributes, of which 6 are numeric (integer or real) and 8 are categorical; consequently, the data are multivariate, that is, they contain multiple variables and/or attributes. Several studies have used this dataset (Lee et al., 2018; Pumsirirat and Yan, 2018; Seera et al., 2021; Singh et al., 2022).
The China Stock Market and Accounting Research (CSMAR) Database contains financial reports and records of violations. It provides information on China’s stock markets and the financial statements of listed companies; the data were collected between 1998 and 2016 from publicly funded companies (CSMAR, 2022). It covers both fraudulent and non-fraudulent companies, and the fraud types include overstated profits and/or earnings, fictitious assets, false records, and other irregularities in financial reporting.
The set comprises 35,574 samples, including 337 annual fraud samples of companies in the Chinese stock market. Three studies selected it as a data source for the financial statement information of listed companies (Achakzai and Juan, 2022; Y. Chen and Wu, 2022; Shou et al., 2023).
It was generated by the PaySim mobile money simulator using aggregated data from a private dataset deriving from one month of financial records from a mobile money service in an African country (López-Rojas, 2017 ). The original records were provided by a multinational company offering mobile financial services in more than 14 countries worldwide. The dataset has been used in numerous studies (Alwadain et al., 2023 ; Hwang and Kim, 2020 ; Moreira et al., 2022 ).
The synthetic dataset provided is a scaled-down version, representing a quarter of the original dataset. It was made available on Kaggle. It comprises 6,362,620 samples, with 8213 fraudulent transaction samples and 6,354,407 non-fraudulent transactions. It includes several attributes related to mobile money transactions: transaction type (cash-in, cash-out, debit, payment, and transfer); transaction amount in local currency; customer information (customer conducting the transaction and transaction recipient); balances before and after the transaction; and fraudulent behavior indicators (isFraud and isFlaggedFraud). These attributes indicate a binary classification.
It was created by I-Cheng Yeh, introduced on January 25, 2016, and is available in the UC Irvine ML repository (Yeh, 2016). The dataset, which is used for classification tasks, focuses on defaulted payments of credit card customers in Taiwan and falls within the business area. It is a multivariate dataset with 30,000 instances and 24 attributes, including the amount of credit granted, payment history, and statement records spanning April through September 2005. This data source is selected in studies such as those by Esenogho et al. (2022), Pumsirirat and Yan (2018), and Seera et al. (2021).
Edgar Lopez Rojas created the dataset in 2017. The synthetic data were generated in the BankSim payment simulator. It is based on a sample of transactional data provided by a bank in Spain (López-Rojas, 2017 ). It includes the following characteristics: step, customer ID, age, gender, zip code, merchant ID, zip code of merchant, category of purchase, amount of purchase, and fraud status. It comprises 594,643 transactions, of which ~1.2% (7200) were labeled as fraud and the rest (587,443) were labeled as genuine, and it was processed as a binary classification problem. The dataset has been used in several investigations (Esenogho et al., 2022 ; Pumsirirat and Yan, 2018 ; Seera et al., 2021 ).
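The class imbalance implied by the sample counts reported above for the two synthetic datasets can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `fraud_rate` is hypothetical, and the counts are simply those quoted in the text.

```python
def fraud_rate(n_fraud: int, n_total: int) -> float:
    """Share of fraudulent samples in a dataset."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_fraud / n_total


# (fraudulent samples, total samples) as reported in the text.
datasets = {
    "PaySim": (8_213, 6_362_620),
    "BankSim": (7_200, 594_643),
}

for name, (n_fraud, n_total) in datasets.items():
    rate = fraud_rate(n_fraud, n_total)
    # Also print the number of genuine samples implied by the counts.
    print(f"{name}: {rate:.3%} fraud, {n_total - n_fraud:,} genuine")
```

Both rates are far below 50%, which is why metrics other than accuracy matter when models trained on these datasets are evaluated.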
This dataset is a financial and economic information and research database (Compustat, 2022 ). It contains characteristics related to various aspects of companies, such as asset quality, revenues earned, administrative and sales expenses, and sales growth, among others. COMPUSTAT collects and stores detailed information on listed companies in the United States and Canada. The set includes information on 61 characteristics and consists of 228 companies, of which half showed fraud in their information while the other half did not present fraud (binary classification), and it is used in studies (Dutta et al., 2017 ; Whiting et al., 2012 ).
This dataset was used in the CoIL 2000 challenge and is available at the UC Irvine Machine Learning Repository; it was created by Peter Van Der Putten. It consists of 9822 instances and 86 attributes containing information about customers of an insurance company and includes data on product use and sociodemographic data (Putten, 2000). It is characterized as multivariate and has been used for regression/classification tasks (Huang et al., 2018; Sathya and Balakumar, 2022).
This dataset contains Bitcoin transaction metadata from 2011 to 2013. It was created by Omer Shafiq (Kaggle handle: OmerShafiq) and introduced to the Kaggle online community in 2019. The set comprises 11 attributes and 30,000 instances related to Bitcoin transactions, bitcoin flows, connections between transactions, average ratings, and malicious transactions (Omershafiq, 2019). It is suited to investigating and analyzing anomalies and fraud detection in Bitcoin transactions (Ashfaq et al., 2022).
SQ2: What were the metrics used to assess the performance of ML models to detect financial fraud?
Based on previous studies (Nicholls et al., 2021; Shahana et al., 2023), evaluating ML models with performance metrics is the last step in determining whether the results align with the problem at hand. The metrics quantify a model’s ability to perform a specific task, such as classification, regression, or clustering, and allow the performance of different models to be compared.
Many evaluation metrics have been used in previous studies, such as precision, sensitivity (recall), accuracy, and area under the curve. These metrics can be calculated using the confusion matrix. Figure 10 compares the true values with the predicted ones, based on the study by Torrano et al. (2018).
Confusion matrix generated during the evaluation of the financial fraud detection models. Authors’ own elaboration.
According to previous studies (Shahana et al., 2023; Zhao and Bai, 2022), true positive (TP) denotes a predicted positive (fraud) that matches the true value; true negative (TN) denotes a correctly predicted negative outcome (no fraud); false positive (FP) denotes a predicted positive whose true value is negative (no fraud); and false negative (FN) denotes a predicted negative whose true value is positive (fraud). FP and FN represent the misclassification cost, also known as the prediction error of the classification model.
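As a minimal sketch of how the four cells of the confusion matrix can be counted, the following Python snippet tallies TP, TN, FP, and FN from true and predicted labels. The function name `confusion_counts` and the toy labels are hypothetical; the encoding 1 = fraud and 0 = no fraud follows the convention of the “Class” variable described earlier.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels, with 1 = fraud and 0 = no fraud."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Count each (true label, predicted label) combination once.
    pairs = Counter(zip(y_true, y_pred))
    tp = pairs[(1, 1)]  # fraud predicted as fraud
    tn = pairs[(0, 0)]  # no fraud predicted as no fraud
    fp = pairs[(0, 1)]  # no fraud predicted as fraud
    fn = pairs[(1, 0)]  # fraud predicted as no fraud
    return tp, tn, fp, fn


# Toy labels for illustration only.
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 0]
print(confusion_counts(y_true, y_pred))  # (2, 4, 1, 1)
```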
The metrics used to evaluate the effectiveness of supervised ML techniques are as follows. The accuracy metric is the most commonly used (Ramírez-Alpízar et al., 2020). It is defined as the proportion of correct predictions over the total number of records analyzed. It evaluates how well a binary classification model distinguishes between the two classes. Equation (1) calculates the accuracy metric.
The sensitivity metric, also known as recall or the true positive rate (TPR), is the ratio of correctly identified fraudulent samples to the total number of fraudulent samples. Equation (2) calculates the sensitivity metric.
The specificity metric (TN rate or TNR) is the percentage of non-fraudulent samples properly designated as non-fraudulent. It is represented in Eq. ( 3 ).
Precision is the ratio of correctly classified fraudulent predictions to the total number of fraudulent predictions. Equation (4) calculates the precision metric.
F1-score is a metric that combines precision and recall using a weighted harmonic mean (Bakumenko and Elragal, 2022). It is presented in Eq. (5).
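For reference, the standard confusion-matrix definitions that match the verbal descriptions of the five metrics above can be written as follows. The numbering is assumed to correspond to Eqs. (1) to (5); the notation is the TP, TN, FP, FN notation introduced with the confusion matrix.

```latex
\begin{align}
\text{Accuracy} &= \frac{TP + TN}{TP + TN + FP + FN} \tag{1}\\
\text{Sensitivity (recall)} &= \frac{TP}{TP + FN} \tag{2}\\
\text{Specificity} &= \frac{TN}{TN + FP} \tag{3}\\
\text{Precision} &= \frac{TP}{TP + FP} \tag{4}\\
F_1 &= \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{5}
\end{align}
```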
Type I error, or the false positive rate (FPR), is the number of legitimate samples mistakenly labeled as fraudulent as a percentage of all legitimate samples. The metric is defined in Eq. (6).
Type II error, or the false negative rate (FNR), is the proportion of fraudulent samples incorrectly designated as non-fraudulent. It is defined in Eq. (7). Type I and II errors together make up the overall error rate.
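The supervised metrics described above can all be derived from the four confusion-matrix counts. The following Python sketch uses the standard definitions, with FPR = FP / (FP + TN) and FNR = FN / (FN + TP) assumed for Eqs. (6) and (7); the function names and the example counts are hypothetical.

```python
def _ratio(numerator: float, denominator: float) -> float:
    """Safe division: return 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute the supervised evaluation metrics from confusion-matrix counts."""
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    return {
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),
        "recall": recall,                      # sensitivity, TPR
        "specificity": _ratio(tn, tn + fp),    # TNR
        "precision": precision,
        "f1": _ratio(2 * precision * recall, precision + recall),
        "fpr": _ratio(fp, fp + tn),            # type I error rate
        "fnr": _ratio(fn, fn + tp),            # type II error rate
        "error_rate": _ratio(fp + fn, tp + tn + fp + fn),
    }


# Hypothetical counts for an imbalanced test set of 10,000 transactions.
metrics = classification_metrics(tp=80, tn=9800, fp=100, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these counts the accuracy is high while the precision stays below 0.5, which illustrates why several metrics are usually reported together for fraud data. Note also that FPR and specificity sum to 1, as do FNR and recall.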
The area under the curve (AUC) is the area under the receiver operating characteristic (ROC) curve, which plots TPR against FPR (Y. Chen and Wu, 2022). AUC values range from 0 to 1; the better an ML model separates the two classes, the higher its AUC value. It is therefore a metric that summarizes the model’s performance when differentiating between two classes.
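One way to compute the AUC without drawing the ROC curve is to use its rank interpretation: the AUC equals the probability that a randomly chosen fraudulent sample receives a higher score than a randomly chosen non-fraudulent one, with ties counted as one half. The sketch below is a plain pairwise implementation of that idea; the function name `roc_auc` and the toy scores are hypothetical, and the quadratic loop is meant for illustration rather than for large datasets.

```python
def roc_auc(y_true, scores):
    """AUC via pairwise comparison of positive (1) and negative (0) scores."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # fraud sample ranked above genuine sample
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy example: three of the four (fraud, genuine) pairs are ranked correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```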
Following the guidelines in previous studies (Amrutha et al., 2023 ; García-Ordás et al., 2023 ; Palacio, 2019 ), some metrics used to evaluate the effectiveness of unsupervised ML techniques will be defined.
The silhouette coefficient identifies the most appropriate number of clusters; a higher coefficient means better quality with this number of clusters. Equation ( 8 ) calculates the metric.
where x denotes the average distance from observation j to the other observations in the cluster to which j belongs, and y denotes the minimum distance to a different cluster. The silhouette score takes values between −1 and 1. Based on the study by Viera et al. (2023), a value of 1 indicates that observation j is assigned to a good cluster, 0 indicates that observation j lies between two distinct clusters, and −1 indicates that observation j is assigned to the wrong cluster.
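A minimal sketch of the silhouette coefficient for a single observation is given below, assuming the standard form s = (y − x) / max(x, y), with y taken as the smallest mean distance from the observation to the members of any other cluster. The function name `silhouette_sample`, the Euclidean distance, and the toy points are assumptions made for illustration; at least two clusters are required.

```python
from math import dist


def silhouette_sample(j, points, labels):
    """Silhouette coefficient of observation j under Euclidean distance."""
    own = labels[j]
    # x: mean distance to the other members of j's own cluster.
    same = [dist(points[j], p)
            for i, p in enumerate(points) if labels[i] == own and i != j]
    if not same:
        return 0.0  # common convention for a cluster with a single member
    x = sum(same) / len(same)
    # y: smallest mean distance to the members of any other cluster.
    other_means = []
    for c in set(labels) - {own}:
        d = [dist(points[j], p) for i, p in enumerate(points) if labels[i] == c]
        other_means.append(sum(d) / len(d))
    y = min(other_means)
    denom = max(x, y)
    return (y - x) / denom if denom else 0.0


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
print(silhouette_sample(0, points, [0, 0, 1, 1]))  # close to 1: good assignment
print(silhouette_sample(0, points, [1, 0, 1, 1]))  # negative: poor assignment
```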
The Rand index is a similarity measure between two clusterings that considers all pairs of samples and counts the pairs that are assigned either to the same cluster or to different clusters in both the predicted and the true clustering. Equation (9) calculates the index.
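The pair-counting idea behind the Rand index can be sketched directly: for every pair of samples, check whether the two clusterings agree on putting the pair together or apart, and divide the number of agreements by the number of pairs. The function name `rand_index` and the toy labels are hypothetical, and the formula used is the standard (unadjusted) Rand index, assumed to match Eq. (9).

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Fraction of sample pairs on which two clusterings agree."""
    n = len(labels_true)
    if n != len(labels_pred) or n < 2:
        raise ValueError("need two label lists of equal length >= 2")
    agreements = 0
    total_pairs = 0
    for i, j in combinations(range(n), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        # Agreement: both together, or both apart.
        agreements += same_true == same_pred
        total_pairs += 1
    return agreements / total_pairs


print(rand_index([0, 0, 1, 1], [0, 0, 1, 2]))  # 5 of 6 pairs agree
print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))  # 1.0: label names do not matter
```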
The Davies–Bouldin metric is a score used to evaluate clustering algorithms. It is defined as the mean, over all clusters, of each cluster’s similarity to its most similar cluster, as represented in Eq. (10).
where k denotes the number of groups; \({c}_{i}\) and \({c}_{j}\) represent the centroids of clusters i and j , respectively, with \(d\left({c}_{i},{c}_{j}\right)\) as the distance between them; and \({\alpha }_{i}\) and \({\alpha }_{j}\) correspond to the average distance of all elements in clusters i and j to their respective centroids \({c}_{i}\) and \({c}_{j}\) (Viera et al., 2023).
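Using the symbols defined above, the Davies–Bouldin score can be sketched as the average over clusters of the largest ratio (α_i + α_j) / d(c_i, c_j). The snippet below assumes this standard form for Eq. (10), Euclidean distances, and distinct centroids; the function name `davies_bouldin` and the toy points are hypothetical. Lower values indicate more compact and better separated clusters.

```python
from math import dist


def davies_bouldin(points, labels):
    """Davies-Bouldin score for points (tuples of floats) and cluster labels."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")
    dim = len(points[0])
    centroids, scatter = {}, {}
    for c in clusters:
        members = [p for p, lab in zip(points, labels) if lab == c]
        centroid = tuple(sum(p[d] for p in members) / len(members)
                         for d in range(dim))
        centroids[c] = centroid
        # alpha: mean distance of the cluster's members to its centroid.
        scatter[c] = sum(dist(p, centroid) for p in members) / len(members)
    total = 0.0
    for i in clusters:
        # Similarity of cluster i to its most similar other cluster.
        total += max((scatter[i] + scatter[j]) / dist(centroids[i], centroids[j])
                     for j in clusters if j != i)
    return total / len(clusters)


points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
print(davies_bouldin(points, [0, 0, 1, 1]))  # (1 + 1) / 10 for both clusters
```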
The Fowlkes–Mallows index is defined as the geometric mean between precision and recall, represented in Eq. ( 11 ).
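In the clustering setting, precision and recall are computed over pairs of samples: a pair is a true positive when both clusterings place it in the same cluster. The sketch below assumes the standard pair-counting form FM = TP / sqrt((TP + FP)(TP + FN)) for Eq. (11); the function name `fowlkes_mallows` and the toy labels are hypothetical.

```python
from itertools import combinations
from math import sqrt


def fowlkes_mallows(labels_true, labels_pred):
    """Geometric mean of pairwise precision and recall of a clustering."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label lists must have the same length")
    tp = fp = fn = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        if same_true and same_pred:
            tp += 1   # together in both clusterings
        elif same_pred:
            fp += 1   # together only in the predicted clustering
        elif same_true:
            fn += 1   # together only in the true clustering
    if tp == 0:
        return 0.0
    return tp / sqrt((tp + fp) * (tp + fn))


print(fowlkes_mallows([0, 0, 1, 1], [0, 0, 1, 2]))  # 1 / sqrt(2)
```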
The cophenetic correlation coefficient measures how faithfully a dendrogram (tree diagram) produced by hierarchical clustering preserves the pairwise distances between the original observations. Equation (12) indicates the metric.
where \(x(i,j)=|{x}_{i}-{x}_{j}|\) represents the Euclidean distance between the i th and j th points of \(x\) , \(t(i,j)\) is the height of the node at which the two dendrogram points \({t}_{i}\) and \({t}_{j}\) meet, and \(\bar{x}\) and \(\bar{t}\) are the mean values of \(x(i,j)\) and \(t(i,j)\) , respectively.
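With the symbols defined above, the cophenetic correlation coefficient is commonly written as the Pearson correlation between the original distances and the dendrogram heights, taken over all pairs i < j. The expression below is this standard form and is assumed to correspond to Eq. (12); the symbol c for the coefficient is an assumption.

```latex
\begin{equation}
c = \frac{\sum_{i<j}\left(x(i,j)-\bar{x}\right)\left(t(i,j)-\bar{t}\right)}
         {\sqrt{\left[\sum_{i<j}\left(x(i,j)-\bar{x}\right)^{2}\right]
                \left[\sum_{i<j}\left(t(i,j)-\bar{t}\right)^{2}\right]}} \tag{12}
\end{equation}
```

Values of c close to 1 indicate that the dendrogram preserves the original pairwise distances well.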
Research on the detection of financial fraud by applying ML techniques is a significant topic. On the one hand, fraud directly affects the business world and, on the other hand, detecting it early involves great challenges; this has led to designing tools using AI, such as ML techniques. This study is an SLR using adaptations of the PRISMA and Kitchenham methods to critically analyze and synthesize the study results. Research articles published in Scopus, IEEE Xplore, Taylor & Francis, SAGE, and ScienceDirect were explored. The results were presented in two parts. The first one included a bibliometric study with the open-source software VOSviewer, followed by a discussion of the SLR results.
The bibliometric analysis presented the most important authors, articles, sources, countries, and trends in the literature on financial fraud detection using ML, as well as an analysis of fraud types, ML models, and datasets. The 104 articles, dating from 2012 to 2023, describe several types of fraudulent activity, both external (e.g., credit cards, insurance) and internal (e.g., financial statements, money laundering), and a brief report on fraud in general is provided. Further, it was possible to extract the supervised and unsupervised ML techniques used, with RF standing out among the 10 most used supervised models and autoencoder among the unsupervised ones.
During the literature review on the detection of financial fraud using machine learning models, it became evident that several authors have made significant contributions. However, some stand out more in terms of the number of publications and citations. Some of the most notable, such as Ahmed M. with 318 citations, Ileberi E. with 82, and Chen S. with 84, have made important advances in the field. Others, such as Abdallah A., with only one publication but 333 citations, have also made a considerable impact. Although researchers such as Khan S. and Mishra B. have fewer citations, the combined work of all these authors has established a robust knowledge base, providing a deeper understanding of the challenges and opportunities present in financial fraud detection through machine learning techniques.
Consistent with the analysis of the article clusters, clusters 2, 4, and 11 emerge as the most influential in this field, with topics of interdisciplinary interest (artificial intelligence/machine learning, accounting, finance) among academics and auditing firms. The SLR shows that authors in these domains often cooperate on publications; in turn, the studies by Huang et al. (2018), J. Kim et al. (2019), Sahin et al. (2013), and Dutta et al. (2017) are highly cited.
Similarly, the leading countries in the research area include China, which has the largest number of published articles, followed by India and Saudi Arabia. The production of articles on the subject was found to be geographically distributed among countries whose economies are developing or in transition, which indicates a greater capacity for the production of papers and research. By comparison, Ashtiani and Raahemi (2022) highlight the United States as leading with the largest number of papers (18) in the area, followed by China (8) and Greece (7), whereas Al-Hashedi and Magalingam (2021) posit that India is the top producer of articles with 24, followed by China (14) and the United States (9).
The journals that have published these studies are mainly in the accounting and computer science domains. There is much literature on financial fraud detection through ML models in Computers and Security, Expert Systems with Applications, and Knowledge-Based Systems, as supported by Al-Hashedi and Magalingam (2021) and Ali et al. (2022). The keywords highlighted in the studies include crime, fraud detection, and ML. These words indicate a central focus on the financial industry, where learning and/or data mining systems help discover patterns or anomalies in financial data, in addition to attractive trends and approaches in the research field.
The literature includes articles investigating fraud types, particularly credit card loan fraud and insurance fraud, which are of great interest to the scientific community (Al-Hashedi and Magalingam, 2021; Ali et al., 2022; West and Bhattacharya, 2016). This study has classified the different types of fraud into internal and external, and sub-classifications have been derived. In both types, ML techniques have been used to detect financial fraud: supervised (59 articles), unsupervised (19 articles), supervised and unsupervised (16 articles), and deep learning (3 articles), among others. Most of the studies analyzed have developed binary classification models, that is, fraud or non-fraud. Supervised learning techniques require labeled data, and the most frequently used models are LR, RF, and SVM, among others. In the experiments, the prevalence of metrics such as accuracy, precision, sensitivity, and F1-score is highlighted. In unsupervised learning, the data are unlabeled and the focus is on discovering new patterns with algorithms such as DBSCAN, autoencoder, and IF, among others. The evaluation with internal metrics was not reported in detail. Few studies using semi-supervised learning and deep learning techniques stand out, because these techniques are novel.
Further, the keyword trends show that the research works address ML, learning algorithms, deep learning, SVM, fraudulent transactions, and anomaly detection, but it is evident that there is little research on unsupervised learning and deep learning. The scarce use of these techniques may be because of the complexity of the models and the high consumption of computational resources. In the analysis of the 86 experiment articles, few articles were found that used unsupervised techniques. Also, a large part of the datasets used is labeled, which calls for further experimentation with models and unlabeled real-world datasets (Ounacer et al., 2018; Pumsirirat and Yan, 2018; Rubio et al., 2020; Van Capelleveen et al., 2016; Vanini et al., 2023). Meanwhile, labeled data are costly because an expert is required for their construction. Thus, more attention has been given to data origin, preprocessing, and feature extraction before training an ML model to increase detection accuracy. Accordingly, it should be emphasized that deep learning models require more thorough design and tuning than earlier models. They are quite sensitive to the architecture and the choice of hyperparameters. Further, the data quality and quantity they require are relatively high, so these should be considered in the design stage.
The studies show that the datasets for the experiments were taken from the stock exchanges of China, Canada, the United States, Taiwan, and Tehran, among others. The researchers used ML models to detect financial fraud in credit card loans, highlighting the use of the “Credit Card Fraud Detection” dataset, mentioned 15 times. Also, the performance of ML models can be affected by the selected dataset, in particular by its number of attributes and instances. From the analysis, it was observed that most of the articles use real datasets obtained from existing databases, historical records, or other collection methods, and few studies (four articles) use synthetic datasets, which are generated by modeling or simulation techniques and try to mimic a real dataset.
Still, the integration of real and synthetic datasets enables a comprehensive approach to the problem by providing a basis and complementary information for conclusions and comparisons with other studies on the performance of ML models. Specifically, although the studies span 2012 to 2023, some of the datasets they use date back to approximately 1994, which raises concern about obsolete data. Because of their age, such datasets do not provide effective and accurate results in the current context, as new fraud modalities emerge day after day, with characteristics and behavior patterns that have evolved significantly over time.
The literature review and bibliometric analyses on financial fraud detection using machine learning and its various techniques conducted between 2012 and 2023 show a remarkable evolution in this field. Authors including Ahmed M., Ileberi E., and Chen S. have made important contributions with a high number of citations. There has been fundamental interdisciplinary collaboration between areas such as artificial intelligence, accounting, finance, and information security, highlighting widely cited studies such as Huang et al. (2018), J. Kim et al. (2019), Sahin et al. (2013), and Dutta et al. (2017). Countries such as China, India, and Saudi Arabia lead in publications, which reflects the global effort of emerging economies. Supervised learning techniques such as Random Forest, and unsupervised ones, like Autoencoder, are the most widely used. Furthermore, the effort and enthusiasm for the use of deep learning, despite its complexity and high computational resource requirements, are evident.
Research mainly uses real datasets such as those from the Chinese, Canadian, US, Taiwanese, and Tehran stock exchanges, with the “Credit Card Fraud Detection” dataset being the most important one. The journals that publish these studies belong both to the accounting area and to computer science, with extensive literature in Computers and Security, Expert Systems with Applications, and Knowledge-Based Systems. While it is true that the accuracy of fraud detection depends on the quality of the data and preprocessing with various algorithms, the need for robust and updated approaches to face new fraud modalities is particularly highlighted.
The study had limitations that affected the scope and interpretation of the results. Although a systematic review was performed, the lack of quantitative support in the data collected is acknowledged. From the 104 articles identified in the SLR, 18 correspond to systematic reviews, which limits the availability of studies with specific details or experiments. This affected the depth of the analysis and the comprehensiveness of the results obtained.
The literature review reveals a predominant emphasis on the banking sector, especially in relation to credit card fraud and insurance fraud. The narrow focus leads to a lack of diversity in the types of fraud studied, excluding internal fraud types such as embezzlement, racketeering, smurfing, defalcation, collusion, signature forgery, and manipulation of accounting documents, among others. The underrepresentation of these other fraud types compromises the generalization of the findings and the applicability of ML models to contexts beyond the banking sector.
The datasets analyzed show a significant deficiency in the representation of fraud types. It can be observed that most of these datasets originated from the main stock exchanges and, additionally, the information used to carry out the experiments is old. This scenario indicates the inclusion of non-contemporary fraud types in the analysis. The limited availability of information on the performance metrics of the unsupervised learning models made it difficult to count the evaluation metrics used to predict financial fraud.
The field of financial fraud detection using ML models offers promising prospects for future research. An area of potential improvement is experimentation with advanced techniques, such as reinforcement learning or deep neural network architectures, to improve the accuracy and efficiency of models, including unsupervised learning. This approach could enable the development of more sophisticated systems capable of identifying complex fraud patterns and dynamically adjusting to the changing strategies of criminals, who are constantly innovating new fraud methods.
Moreover, it is suggested that the applicability of fraud detection systems in contexts other than banking be analyzed by adopting the anomaly approach, which would make it possible to move forward in the detection of fraud in real-time and minimize risks in organizations. It is also proposed that a dataset be created, containing real context information, which is freely accessible and includes new fraud methods to provide the scientific community with an updated dataset.
The datasets generated and/or analyzed in this study are available in the Harvard Dataverse repository https://doi.org/10.7910/DVN/CM8NVY .
Abdallah A, Maarof MA, Zainal A (2016) Fraud detection system: a survey. J Netw Comput Appl 68:90–113. https://doi.org/10.1016/j.jnca.2016.04.007
Achakzai MAK, Juan P (2022) Using machine learning meta-classifiers to detect financial frauds. Financ Res Lett 48:102915. https://doi.org/10.1016/j.frl.2022.102915
Ahmed M, Mahmood AN, Islam MdR (2016) A survey of anomaly detection techniques in financial domain. Future Gener Comput Syst 55:278–288. https://doi.org/10.1016/j.future.2015.01.001
Al Ali A, Khedr AM, El-Bannany M, Kanakkayil S (2023) A powerful predicting model for financial statement fraud based on optimized XGBoost ensemble learning technique. Appl Sci 13(4):2272. https://doi.org/10.3390/app13042272
Alarfaj FK, Malik I, Khan HU, Almusallam N, Ramzan M, Ahmed M (2022) Credit card fraud detection using state-of-the-art machine learning and deep learning algorithms. IEEE Access 10:39700–39715. https://doi.org/10.1109/ACCESS.2022.3166891
Al-Hashedi KG, Magalingam P (2021) Financial fraud detection applying data mining techniques: a comprehensive review from 2009 to 2019. Comput Sci Rev 40:100402. https://doi.org/10.1016/j.cosrev.2021.100402
Ali A, Abd Razak S, Othman SH, Eisa TAE, Al-Dhaqm A, Nasser Tusneem ME, Elshafie H, Saif A (2022) Financial fraud detection based on machine learning: a systematic literature review. Appl Sci (Switz). https://doi.org/10.3390/app12199637
Alsuwailem AAS, Salem E, Saudagar AKJ (2022) Performance of different machine learning algorithms in detecting financial fraud. Comput Econ. https://doi.org/10.1007/s10614-022-10314-x
Alwadain A, Ali RF, Muneer A (2023) Estimating financial fraud through transaction-level features and machine learning. Mathematics 11(5):1184. https://doi.org/10.3390/math11051184
Amrutha E, Arivazhagan S, Jebarani WSL (2023) Deep clustering network for steganographer detection using latent features extracted from a novel convolutional autoencoder. Neural Process Lett 55(3):2953–2964. https://doi.org/10.1007/s11063-022-10992-6
Arévalo F, Barucca P, Téllez-León I-E, Rodríguez W, Gage G, Morales R (2022) Identifying clusters of anomalous payments in the salvadorian payment system. Lat Am J Cent Bank. 3(1):100050. https://doi.org/10.1016/j.latcb.2022.100050
Ashfaq T, Khalid R, Yahaya A, Aslam S, Alsafari S, Hameed I (2022) A machine learning and blockchain bases efficient fraud detection mechanism. Sensors 22(19):7162. https://doi.org/10.3390/s22197162
Ashtiani MN, Raahemi B (2022) Intelligent fraud detection in financial statements using machine learning and data mining: a systematic literature review. IEEE Access 10:72504–72525. https://doi.org/10.1109/ACCESS.2021.3096799
Aslam F, Hunjra A, Ftiti Z, Louhichi W, Shams T (2022) Insurance fraud detection: evidence from artificial intelligence and machine learning. Res Int Bus Financ. https://doi.org/10.1016/j.ribaf.2022.101744
Baghdasaryan V, Davtyan H, Sarikyan A, Navasardyan Z (2022) Improving tax audit efficiency using machine learning: the role of taxpayer’s network data in fraud detection. Appl Artif Intell 36(1). https://doi.org/10.1080/08839514.2021.2012002
Baker MR, Mahmood ZN, Shaker EH (2022) Ensemble learning with supervised machine learning models to predict credit card fraud transactions. Rev Intell Artif. https://doi.org/10.18280/ria.360401
Bakumenko A, Elragal A (2022) Detecting anomalies in financial data using machine learning algorithms. Systems. https://doi.org/10.3390/systems10050130
Bekirev AS, Klimov VV, Kuzin MV, Shchukin BA (2015) Payment card fraud detection using neural network committee and clustering. Optical Mem. Neural Netw 24(3):193–200. https://doi.org/10.3103/S1060992X15030030
Benchaji I, Douzi S, Ouahidi BEl (2021) Credit card fraud detection model based on LSTM recurrent neural networks. J Adv Inf Technol 12(2):113–118. https://doi.org/10.12720/jait.12.2.113-118
Błaszczyński J, de Almeida Filho AT, Matuszyk A, Szeląg M, Słowiński R (2021) Auto loan fraud detection using dominance-based rough set approach versus machine learning methods. Expert Syst Appl 163:113740. https://doi.org/10.1016/j.eswa.2020.113740
Bolgorian M, Mayeli A, Ronizi NG (2023) CEO compensation and money laundering risk. J Econ Criminol 1:100007. https://doi.org/10.1016/j.jeconc.2023.100007
Chen S (2016) Detection of fraudulent financial statements using the hybrid data mining approach. SpringerPlus 5(1):89. https://doi.org/10.1186/s40064-016-1707-6
Chen S, Goo Y-JJ, Shen Z-D (2014) A hybrid approach of stepwise regression, logistic regression, support vector machine, and decision tree for forecasting fraudulent financial statements. Sci World J 2014:1–9. https://doi.org/10.1155/2014/968712
Chen Y, Wu Z (2022) Financial fraud detection of listed companies in China: a machine learning approach. Sustainability 15(1):105. https://doi.org/10.3390/su15010105
Chullamonthon P, Tangamchit P (2023) Ensemble of supervised and unsupervised deep neural networks for stock price manipulation detection. Expert Syst Appl 220:119698. https://doi.org/10.1016/j.eswa.2023.119698
Compustat (2022) Compustat. S&P Global Market Intelligence. https://www.marketplace.spglobal.com/en/datasets?cq_cmp=9778467255&cq_plac=&cq_net=g&cq_pos=&cq_plt=gp&utm_source=google&utm_medium=cpc&utm_campaign=DMS_Marketplace_Search_Google&utm_term=&utm_content=586436401424&_bt=586436401424&_bk=&_bm=&_bn=g&_bg=133704002389&gclid=Cj0KCQjw4s-kBhDqARIsAN-ipH3TguUoVohfDZgD65fjvKomc6BBgJ3uA9zP95m6u4vOs5yG7_L7w2UaAnnvEALw_wcB
CSMAR (2022) China Stock Market & Accounting Research (CSMAR). Wharton University of Pennsylvania. https://wrds-www.wharton.upenn.edu/pages/about/data-vendors/china-stock-market-accounting-research-csmar/
Dalal S, Seth B, Radulescu M, Secara C, Tolea C (2022) Predicting fraud in financial payment services through optimized hyper-parameter-tuned XGBoost model. Mathematics 10(24):4679. https://doi.org/10.3390/math10244679
Dantas RM, Firdaus R, Jaleel F, Neves Mata P, Mata MN, Li G (2022) Systemic acquired critique of credit card deception exposure through machine learning. J Open Innov: Technol Mark Complex 8(4):192. https://doi.org/10.3390/joitmc8040192
Domashova J, Kripak E (2021) Identification of non-typical international transactions on bank cards of individuals using machine learning methods. Procedia Comput Sci 190:178–183. https://doi.org/10.1016/j.procs.2021.06.023
Domashova J, Kripak E (2022) Development of a generalized algorithm for identifying atypical bank transactions using machine learning methods. Procedia Comput Sci 213:101–109. https://doi.org/10.1016/j.procs.2022.11.044
Dutta I, Dutta S, Raahemi B (2017) Detecting financial restatements using data mining techniques. Expert Syst Appl 90:374–393. https://doi.org/10.1016/j.eswa.2017.08.030
Elshaar S, Sadaoui S (2020) Semi-supervised Classification of Fraud Data in Commercial Auctions. Appl Artif Intell 34(1):47–63. https://doi.org/10.1080/08839514.2019.1691341
Esenogho E, Mienye ID, Swart TG, Aruleba K, Obaido G (2022) A neural network ensemble with feature engineering for improved credit card fraud detection. IEEE Access 10:16400–16407. https://doi.org/10.1109/ACCESS.2022.3148298
Eshghi A, Kargari M (2019) Introducing a new method for the fusion of fraud evidence in banking transactions with regards to uncertainty. Expert Syst Appl 121:382–392. https://doi.org/10.1016/j.eswa.2018.11.039
Estupiñán Gaitán R (2015) Control interno y fraudes: análisis de informe COSO I, II y III con base en los ciclos transaccionales, Tercera edición (Niebel BW (ed)). Ecoe Ediciones
Fanai H, Abbasimehr H (2023) A novel combined approach based on deep autoencoder and deep classifiers for credit card fraud detection. Expert Syst Appl 217:119562. https://doi.org/10.1016/j.eswa.2023.119562
Fang Y, Zhang Y, Huang C (2019) Credit card fraud detection based on machine learning. Comput Mater Contin 61(1):185–195. https://doi.org/10.32604/cmc.2019.06144
Femila Roseline J, Naidu G, Samuthira Pandi V, Alamelu alias Rajasree S, Mageswari N (2022) Autonomous credit card fraud detection using machine learning approach. Comput Electr Eng 102:108132. https://doi.org/10.1016/j.compeleceng.2022.108132
García-Ordás MT, Alaiz-Moretón H, Casteleiro-Roca J-L, Jove E, Benítez-Andrades JA, García-Rodríguez I, Quintián H, Calvo-Rolle JL (2023) Clustering techniques selection for a hybrid regression model: a case study based on a solar thermal system. Cybern Syst 54(3):286–305. https://doi.org/10.1080/01969722.2022.2030006
Gupta S, Mehta SK (2021) Data mining-based financial statement fraud detection: systematic literature review and meta-analysis to estimate data sample mapping of fraudulent companies against non-fraudulent companies. Global Bus Rev https://doi.org/10.1177/0972150920984857
Hajek P, Henriques R (2017) Mining corporate annual reports for intelligent detection of financial statement fraud—a comparative study of machine learning methods. Knowl-Based Syst 128:139–152. https://doi.org/10.1016/j.knosys.2017.05.001
Hamza C, Lylia A, Nadine C, Nicolas C (2023) Semi-supervised method to detect fraudulent transactions and identify fraud types while minimizing mounting costs. Int J Adv Comput Sci Appl 14(2). https://doi.org/10.14569/IJACSA.2023.0140298
Hilal W, Gadsden SA, Yawney J (2022) Financial fraud: a review of anomaly detection techniques and recent advances. Expert Syst Appl 193:116429. https://doi.org/10.1016/j.eswa.2021.116429
Hofmann H (1994) Statlog (German credit data). UCI Machine Learning Repository. https://doi.org/10.24432/C5NC77
Huang D, Mu D, Yang L, Cai X (2018) CoDetect: financial fraud detection with anomaly feature detection. IEEE Access 6:19161–19174. https://doi.org/10.1109/ACCESS.2018.2816564
Hwang J, Kim K (2020) An efficient domain-adaptation method using GAN for fraud detection. Int J Adv Comput Sci Appl 11(11). https://doi.org/10.14569/IJACSA.2020.0111113
Ileberi E, Sun Y, Wang Z (2021) Performance evaluation of machine learning methods for credit card fraud detection using SMOTE and AdaBoost. IEEE Access 9:165286–165294. https://doi.org/10.1109/ACCESS.2021.3134330
Ileberi E, Sun Y, Wang Z (2022) A machine learning based credit card fraud detection using the GA algorithm for feature selection. J Big Data 9(1):24. https://doi.org/10.1186/s40537-022-00573-8
Khan S, Alourani A, Mishra B, Ali A, Kamal M (2022) Developing a credit card fraud detection model using machine learning approaches. Int J Adv Comput Sci Appl 13(3). https://doi.org/10.14569/IJACSA.2022.0130350
Kim J, Kim H-J, Kim H (2019) Fraud detection for job placement using hierarchical clusters-based deep neural networks. Appl Intell 49(8):2842–2861. https://doi.org/10.1007/s10489-019-01419-2
Kim YJ, Baik B, Cho S (2016) Detecting financial misstatements with fraud intention using multi-class cost-sensitive learning. Expert Syst Appl 62:32–43. https://doi.org/10.1016/j.eswa.2016.06.016
Kitchenham B, Brereton P (2013) A systematic review of systematic review process research in software engineering. Inf Softw Technol 55(12):2049–2075. https://doi.org/10.1016/j.infsof.2013.07.010
Kitchenham B, Stuart C (2007) Guidelines for performing systematic literature reviews in software engineering. https://www.researchgate.net/publication/302924724_Guidelines_for_performing_Systematic_Literature_Reviews_in_Software_Engineering
Kootanaee AJ, Aghajan AAP, Shirvani MH (2021) A hybrid model based on machine learning and genetic algorithm for detecting fraud in financial statements. J Optim Ind Eng 14(2):183–201. https://doi.org/10.22094/JOIE.2020.1877455.1685
KPMG (2022) Una triple amenaza en las Américas. KMPG. https://kpmg.com/co/es/home/insights/2022/01/kpmg-fraud-outlook-survey.html
Kumar S, Ahmed R, Bharany S, Shuaib M, Ahmad T, Tag Eldin E, Rehman AU, Shafiq M (2022) Exploitation of machine learning algorithms for detecting financial crimes based on customers’ behavior. Sustainability 14(21):13875. https://doi.org/10.3390/su142113875
Kumbure MM, Lohrmann C, Luukka P, Porras J (2022) Machine learning techniques and data for stock market forecasting: a literature review. Expert Syst Appl 197:116659. https://doi.org/10.1016/j.eswa.2022.116659
Lee H, Choi E, Kim I, Choi D, Go W, Lee K, Yim H, Lee T (2018) Feature selection practice for unsupervised learning of credit card fraud detection. J Theor Appl Inf Technol 96(2):408–417
Google Scholar
Lei X, Mohamad UH, Sarlan A, Shutaywi M, Daradkeh YI, Mohammed HO (2022) Development of an intelligent information system for financial analysis depend on supervised machine learning algorithms. Inf Process Manag 59(5):103036. https://doi.org/10.1016/j.ipm.2022.103036
Lokanan M, Tran V, Vuong NH (2019) Detecting anomalies in financial statements using machine learning algorithm. Asian J Account Res 4(2):181–201. https://doi.org/10.1108/AJAR-09-2018-0032
Lokanan ME, Sharma K (2022) Fraud prediction using machine learning: The case of investment advisors in Canada. Mach Learn Appl 8:100269. https://doi.org/10.1016/j.mlwa.2022.100269
Lokanan ME (2022) Predicting money laundering using machine learning and artificial neural networks algorithms in banks. J Appl Secur Res 1–25. https://doi.org/10.1080/19361610.2022.2114744
López-Rojas E (2017) Synthetic financial datasets for fraud detection. Kaggle. https://www.kaggle.com/datasets/ealaxi/paysim1
Machine Learning Group (2018) Credit card fraud detection. Kaggle. https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud
Madhurya MJ, Gururaj HL, Soundarya BC, Vidyashree KP, Rajendra AB (2022) Exploratory analysis of credit card fraud detection using machine learning techniques. Glob Transit Proc 3(1):31–37. https://doi.org/10.1016/j.gltp.2022.04.006
Malik EF, Khaw KW, Belaton B, Wong WP, Chew X (2022) Credit card fraud detection using a new hybrid machine learning architecture. Mathematics 10(9):1480. https://doi.org/10.3390/math10091480
Márquez Arcila RH (2019) Auditoría forense. Ecoe Ediciones
Misra S, Thakur S, Ghosh M, Saha SK (2020) An autoencoder based model for detecting fraudulent credit card transaction. Procedia Comput Sci 167:254–262. https://doi.org/10.1016/j.procs.2020.03.219
Moher D, Liberati A, Tetzlaff J, Altman DG (2009) Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med 6(7):e1000097. https://doi.org/10.1371/journal.pmed.1000097
Mongwe W, Malan K (2020) A survey of automated financial statement fraud detection with relevance to the South African context. S Afr Comput J 32(1). https://doi.org/10.18489/sacj.v32i1.777
Montes Salazar CA (2019) Riesgos de fraude en una auditoría de estados financieros (1.a ed.). Alfaomega. ISBN: 9789587782639. https://www.alfaomegacloud.com/reader/riesgos-de-fraude-en-una-auditoria-de-estados-financieros?location=3
Moreira MÂL, Junior C, de SR, Silva DF, de L, de Castro Junior MAP, Costa IP, de A, Gomes CFS, dos Santos M (2022) Exploratory analysis and implementation of machine learning techniques for predictive assessment of fraud in banking systems. Procedia Comput Sci 214:117–124. https://doi.org/10.1016/j.procs.2022.11.156
Narsimha B, Raghavendran CV, Rajyalakshmi P, Reddy GK, Bhargavi M, Naresh P (2022) Cyber defense in the age of artificial intelligence and machine learning for financial fraud detection application. Int J Electr Electron Res 10(2):87–92. https://doi.org/10.37391/ijeer.100206
Nian K, Zhang H, Tayal A, Coleman T, Li Y (2016) Auto insurance fraud detection using unsupervised spectral ranking for anomaly. J Financ Data Sci 2(1):58–75. https://doi.org/10.1016/j.jfds.2016.03.001
Nicholls J, Kuppa A, Le-Khac N-A (2021) Financial cybercrime: a comprehensive survey of deep learning approaches to tackle the evolving financial crime landscape. IEEE Access 9:163965–163986. https://doi.org/10.1109/ACCESS.2021.3134076
Nonnenmacher J, Marx Gómez J (2021) Unsupervised anomaly detection for internal auditing: Literature review and research agenda. Int J Digit Account Res 1–22. https://doi.org/10.4192/1577-8517-v21_1
Olszewski D (2014) Fraud detection using self-organizing map visualizing the user profiles. Knowl Based Syst 70:324–334. https://doi.org/10.1016/j.knosys.2014.07.008
Omershafiq (2019) Bitcoin network transactional metadata. Kaggle. https://www.kaggle.com/datasets/omershafiq/bitcoin-network-transactional-metadata
Ounacer S, Ait El Bour H, Oubrahim Y, Ghoumari MY, Azzouazi M (2018) Using isolation forest in anomaly detection: the case of credit card transactions. Period Eng Nat Sci 6(2):394. https://doi.org/10.21533/pen.v6i2.533
Palacio SM (2019) Abnormal pattern prediction: detecting fraudulent insurance property claims with semi-supervised machine-learning. Data Sci J 18(1):35. https://doi.org/10.5334/dsj-2019-035
Papík M, Papíková L (2022) Detecting accounting fraud in companies reporting under US GAAP through data mining. Int J Account Inf Syst 45:100559. https://doi.org/10.1016/j.accinf.2022.100559
Plakandaras V, Gogas P, Papadimitriou T, Tsamardinos I (2022) Credit card fraud detection with automated machine learning systems. Appl Artif Intell 36(1). https://doi.org/10.1080/08839514.2022.2086354
Polak P, Nelischer C, Guo H, Robertson DC (2020) Intelligent” finance and treasury management: what we can expect. AI Soc 35(3):715–726. https://doi.org/10.1007/s00146-019-00919-6
PricewaterhouseCoopers (2022) Encuesta Global de Crimen y Fraude Económico de PwC Colombia 2022 – 2023. https://www.pwc.com/co/es/publicaciones/encuesta-crimen-fraude-economico.html
Pumsirirat A, Yan L (2018) Credit card fraud detection using deep learning based on auto-encoder and restricted Boltzmann machine. Int J Adv Comput Sci Appl 9(1). https://doi.org/10.14569/IJACSA.2018.090103
Putten P (2000) Insurance Company Benchmark (COIL 2000). UCI Machine Learning Repository. https://doi.org/10.24432/C5630S
Quinlan R (1997) Statlog (Australian credit approval). UCI Machine Learning Repository. https://doi.org/10.24432/C59012
Rakowski R, Polak P, Kowalikova P (2021) Ethical aspects of the impact of AI: the status of humans in the era of artificial intelligence. Society 58(3):196–203. https://doi.org/10.1007/s12115-021-00586-8
Ramírez-Alpízar A, Jenkins M, Martínez A, Quesada-López C (2020a) Use of data mining and machine learning techniques for fraud detection in financial statements: a systematic mapping study. Rev Ibér Sist Tecnol Inf Lousada No. E28:97–109
Reurink A (2018) Financial fraud: a literature review. J Econ Surv 32(5):1292–1325. https://doi.org/10.1111/joes.12294
Rocha-Salazar J-J, Segovia-Vargas M-J, Camacho-Miñano M-M (2021) Money laundering and terrorism financing detection using neural networks and an abnormality indicator. Expert Syst Appl 169:114470. https://doi.org/10.1016/j.eswa.2020.114470
Roehrs A, da Costa CA, Righi R, da R, de Oliveira KSF (2017) Personal health records: a systematic literature review. J Med Internet Res 19(1):e13. https://doi.org/10.2196/jmir.5876
Rubio J, Barucca P, Gage G, Arroyo J, Morales-Resendiz R (2020) Classifying payment patterns with artificial neural networks: an autoencoder approach. Lat Am J Cent Bank 1(1–4):100013. https://doi.org/10.1016/j.latcb.2020.100013
Sahin Y, Bulkan S, Duman E (2013) A cost-sensitive decision tree approach for fraud detection. Expert Syst Appl 40(15):5916–5923. https://doi.org/10.1016/j.eswa.2013.05.021
Saputra M, Santosa PI, Permanasari AE (2023) Consumer behaviour and acceptance in fintech adoption: a systematic literature review. Acta Inform Pragensia 12(2):468–489. https://doi.org/10.18267/j.aip.222
Saragih MG, Chin J, Setyawasih R, Nguyen PT, Shankar K (2019) Machine learning methods for analysis fraud credit card transaction. Int J Eng Adv Technol 8(6S):870–874. https://doi.org/10.35940/ijeat.F1164.0886S19
Sathya M, Balakumar B (2022) Insurance fraud detection using novel machine learning technique. Int J Intell Syst Appl Eng 10(3):374–381
Savić M, Atanasijević J, Jakovetić D, Krejić N (2022) Tax evasion risk management using a hybrid unsupervised outlier detection method. Expert Syst Appl 193:116409. https://doi.org/10.1016/j.eswa.2021.116409
Seera M, Lim CP, Kumar A, Dhamotharan L, Tan KH (2021) An intelligent payment card fraud detection system. Ann Oper Res. https://doi.org/10.1007/s10479-021-04149-2
Shahana T, Lavanya V, Bhat AR (2023) State of the art in financial statement fraud detection: a systematic review. Technol Forecast Soc Change 192:122527. https://doi.org/10.1016/j.techfore.2023.122527
Shou M, Bao X, Yu J (2023) An optimal weighted machine learning model for detecting financial fraud. Appl Econ Lett 30(4):410–415. https://doi.org/10.1080/13504851.2021.1989367
Singh A, Jain A, Biable SE (2022) Financial fraud detection approach based on firefly optimization algorithm and support vector machine. Appl Comput Intell Soft Comput 2022:1–10. https://doi.org/10.1155/2022/1468015
Smith Q-J, Valverde R (2021) A perceptron based neural network data analytics architecture for the detection of fraud in credit card transactions in financial legacy systems. WSEAS Trans Syst Control 16:358–374. https://doi.org/10.37394/23203.2021.16.31
Sofy MA, Khafagy MH, Badry RM (2023) An intelligent Arabic model for recruitment fraud detection using machine learning. J Adv Informat Technol. https://doi.org/10.12720/jait.14.1.102-111
Srokosz M, Bobyk A, Ksiezopolski B, Wydra M (2023) Machine-learning-based scoring system for antifraud CISIRTs in banking environment. Electronics 12(1):251. https://doi.org/10.3390/electronics12010251
Subudhi S, Panigrahi S (2020) Use of optimized fuzzy C -Means clustering and supervised classifiers for automobile insurance fraud detection. J King Saud Univ— Comput Inf Sci 32(5):568–575. https://doi.org/10.1016/j.jksuci.2017.09.010
Ti Y-W, Hsin Y-Y, Dai T-S, Huang M-C, Liu L-C (2022) Feature generation and contribution comparison for electronic fraud detection. Sci Rep 12(1):18042. https://doi.org/10.1038/s41598-022-22130-2
Article ADS CAS PubMed PubMed Central Google Scholar
Tingfei H, Guangquan C, Kuihua H (2020) Using variational auto encoding in credit card fraud detection. IEEE Access 8:149841–149853. https://doi.org/10.1109/ACCESS.2020.3015600
Torrano C, Recuero P, Ramirez F, Hernández S, Torres J (2018) Machine learning aplicado a la ciberseguridad: técnicas y ejemplos en detección de amenazas. Zeroxword Computing
Udeze CL, Eteng IE, Ibor AE (2022) Application of machine learning and resampling techniques to credit card fraud detection. J Niger Soc Phys Sci 769. https://doi.org/10.46481/jnsps.2022.769
Usman A, Naveed N, Munawar S (2023) Intelligent anti-money laundering fraud control using graph-based machine learning model for the financial domain. J Cases Inf Technol 25(1):1–20. https://doi.org/10.4018/JCIT.316665
Van Capelleveen G, Poel M, Mueller RM, Thornton D, Van Hillegersberg J (2016) Outlier detection in healthcare fraud: a case study in the Medicaid dental domain. Int J Account Inf Syst 21:18–31. https://doi.org/10.1016/j.accinf.2016.04.001
Vanhoeyveld J, Martens D, Peeters B (2020) Value-added tax fraud detection with scalable anomaly detection techniques. Appl Soft Comput 86:105895. https://doi.org/10.1016/j.asoc.2019.105895
Vanini P, Rossi S, Zvizdic E, Domenig T (2023) Online payment fraud: from anomaly detection to risk management. Financ Innov 9(1):66. https://doi.org/10.1186/s40854-023-00470-w
Vanneschi L, Horn DM, Castelli M, Popovič A (2018) An artificial intelligence system for predicting customer default in e-commerce. Expert Syst Appl 104:1–21. https://doi.org/10.1016/j.eswa.2018.03.025
Viera J, Aguilar J, Rodríguez-Moreno M, Quintero-Gull C (2023) Analysis of the behavior pattern of energy consumption through online clustering techniques. Energies 16(4):1649. https://doi.org/10.3390/en16041649
Wadhwa VK, Saini AK, Kumar SS (2020) Financial fraud prediction models: a review of research evidence. Int J Sci Technol Res 9(1):677–680
West J, Bhattacharya M (2016) Intelligent financial fraud detection: a comprehensive review. Comput Secur 57:47–66. https://doi.org/10.1016/j.cose.2015.09.005
Whiting DG, Hansen JV, McDonald JB, Albrecht C, Albrecht WS (2012) Machine learning methods for detecting patterns of management fraud. Comput Intell 28(4):505–527. https://doi.org/10.1111/j.1467-8640.2012.00425.x
Article MathSciNet Google Scholar
Wohlin C (2014) Guidelines for snowballing in systematic literature studies and a replication in software engineering. In: Proceedings of the 18th international conference on evaluation and assessment in software engineering. pp. 1–10
Wu B, Lv X, Alghamdi A, Abosaq H, Alrizq M (2023) Advancement of management information system for discovering fraud in master card based intelligent supervised machine learning and deep learning during SARS-CoV2. Inf Process Manag 60(2):103231. https://doi.org/10.1016/j.ipm.2022.103231
Article PubMed Google Scholar
Xiong T, Ma Z, Li Z, Dai J (2022) The analysis of influence mechanism for internet financial fraud identification and user behavior based on machine learning approaches. Int J Syst Assur Eng Manag 13(S3):996–1007. https://doi.org/10.1007/s13198-021-01181-0
Xiuguo W, Shengyong D (2022) An analysis on financial statement fraud detection for Chinese listed companies using deep learning. IEEE Access 10:22516–22532. https://doi.org/10.1109/ACCESS.2022.3153478
Yeh I-C (2016) Default of credit card clients. UCI Machine Learning Repository. https://doi.org/10.24432/C55S3H
Zhang Z, Zhou X, Zhang X, Wang L, Wang P (2018) A model based on convolutional neural network for online transaction fraud detection. Secur Commun. Netw. 2018:1–9. https://doi.org/10.1155/2018/5680264
Zhao Z, Bai T (2022) Financial fraud detection and prediction in listed companies using SMOTE and machine learning algorithms. Entropy 24(8):1157. https://doi.org/10.3390/e24081157
Zhou H, Chai H, Qiu M (2018) Fraud detection within bankcard enrollment on mobile device based payment using machine learning. Front Inf Technol Electron Eng 19(12):1537–1545. https://doi.org/10.1631/FITEE.1800580
Zupan M, Budimir V, Letinic S (2020) Journal entry anomaly detection model. Intell Syst Account Financ Manag 27(4):197–209. https://doi.org/10.1002/isaf.1485
Download references
We would like to express our gratitude to the Universidad Cooperativa de Colombia, Ibagué campus, Espinal. This research work was supported by Universidad Cooperativa de Colombia and derived from research project INV3456 entitled “Detection of anomalies in financial data in social economy organizations through machine learning techniques” associated with the PLANAUDI, AQUA and SINERGIA UCC group, from the Research Center of the Public Accounting and Systems Engineering program of the UCC Ibagué campus.
Authors and affiliations.
School of Public Accounting, Universidad Cooperativa de Colombia, 730001, Ibagué-Espinal campus, Ibagué, Colombia
Ludivia Hernandez Aros & John Johver Moreno Hernandez
School of Systems Engineering, Universidad Cooperativa de Colombia, 730001, Ibagué-Espinal campus, Ibagué, Colombia
Luisa Ximena Bustamante Molano & Fernando Gutierrez-Portela
School of Business Administration, Universidad Cooperativa de Colombia, 730001, Ibagué-Espinal campus, Ibagué, Colombia
Mario Samuel Rodríguez Barrero
All authors contributed to the creation and design of the study.
Correspondence to Ludivia Hernandez Aros.
Competing interests.
The authors declare no competing interests.
The authors declare that they have no human participants, human data, or human tissue.
The authors have no data from any individual person on any form.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access. This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
Cite this article.
Hernandez Aros, L., Bustamante Molano, L.X., Gutierrez-Portela, F. et al. Financial fraud detection through the application of machine learning techniques: a literature review. Humanit Soc Sci Commun 11, 1130 (2024). https://doi.org/10.1057/s41599-024-03606-0
Received: 15 November 2023
Accepted: 13 August 2024
Published: 03 September 2024
DOI: https://doi.org/10.1057/s41599-024-03606-0
Word sense disambiguation for morphologically rich low-resourced languages: a systematic literature review and meta-analysis.
Author | Approach | Model | Accuracy |
---|---|---|---|
[ ] | Supervised | SVM | 97% |
[ ] | Knowledge-Based | Effect Coarse-Grained | 83% |
[ ] | Unsupervised | Leacock–Chodrow | 72% |
[ ] | Supervised | BERT | 96% |
[ ] | Supervised | BiLSTM | 90% |
[ ] | Transformer Models | Arabic BERT | 84% |
[ ] | Unsupervised | Graph-Based Algorithm | 63% |
[ ] | Knowledge-Based | Selectional Preferences | 75% |
[ ] | Supervised | Bootstrapping | 69% |
[ ] | Knowledge-Based | LESK | 34% |
[ ] | Supervised | BiLSTM | 90% |
[ ] | Supervised | Naïve Bayes | 89% |
[ ] | Supervised | K-Nearest Neighbor | 94% |
[ ] | Supervised + Unsupervised | Distributional Semantic Space | 86% |
[ ] | Unsupervised + Knowledge-Based | PCA and CE | 92% |
[ ] | Unsupervised | Graph-Based | 47% |
[ , ] | Knowledge-Based | Dependency Disambiguation Graph + Contextual Disambiguation Graph | 47% |
[ ] | Supervised | Baseline Method is Modified (inclusion of Lemmatization and Bootstrapping) | 84% |
[ ] | Unsupervised | Graph-Based | 80% |
[ ] | Knowledge-Based | Maximum Overlap | 75% |
[ ] | Deep Learning | LSTM | 84% |
[ ] | Transformer-based | ELMO | 78% |
Database | Results | Search Phrase | Notes |
---|---|---|---|
SCOPUS | 1124 | Article title, Abstract, keywords ((word sense disambiguation OR “WSD”) OR (“Morphologically rich”) OR (“Low-resourced Languages”)) | extensive database with a broad scope. |
Springer | 560 | ((“Natural Language Processing”) OR (“Word Embedding”) OR (“Word Vector Space”) OR (“Lexical Ambiguity”) OR (“Polysemy”)) | focused on information technology and computers. |
IEEE Xplore | 300 | ((“Lexical Ambiguity”) OR (“Polysemy”) OR (“Language Models”) OR (“Semantic Space”) OR (“Semantic Similarity”)) | abundant in publications on engineering and technology. |
Google Scholar | 150 | ((word sense disambiguation OR “WSD”) OR (“Morphologically rich”) OR (“Low-resourced Languages”)) | offers a quick and easy method for searching academic publications in general. |
Criteria | Decision |
---|---|
The predetermined keywords appear throughout the document, or at the very least in the title, keywords, and abstract sections. | Inclusion |
Publications released in the year 2014 and after. | Inclusion |
Research article written in the English language. | Inclusion |
Research articles without WSD-selected approaches, which are Supervised, Unsupervised, and Knowledge-based learning. | Exclusion |
Research articles without evaluation metrics. | Exclusion |
Research articles without a corpus or dataset. | Exclusion |
Articles not written in English, reports published prior to 2014, case reports and series, editorial letters, commentary, opinions, conference abstracts, and dissertations. | Exclusion |
Meta-Analysis Summary: Random-Effects Model. Method: DerSimonian–Laird.
Heterogeneity: τ² = 5.8194, I² (%) = 82.29, H² = 5.65.
Study (n = 32) | Effect Size | 95% CI Lower | 95% CI Upper | Weight (%) |
---|---|---|---|---|
(Al-Hajj and Jarrar, 2022) | −12.201 | −14.340 | −10.063 | 3.17 |
(Alian and Awajan, 2020) | −12.643 | −15.501 | −9.784 | 2.79 |
(Biś et al., 2019) | −13.005 | −15.071 | −10.939 | 3.20 |
(Chasin et al., 2014) | −10.777 | −13.384 | −8.169 | 2.93 |
(Choi et al., 2017) | −7.776 | −9.928 | −5.624 | 3.16 |
(Demlew and Yohannes, 2022) | −12.286 | −14.399 | −10.172 | 3.18 |
(Dhungana and Shakya, 2017) | −5.227 | −7.232 | −3.223 | 3.23 |
(Fard et al., 2014) | −12.462 | −14.652 | −10.272 | 3.14 |
(Huang et al., 2019) | −8.527 | −10.795 | −6.259 | 3.10 |
(Jaber and Martinez, 2021) | −8.822 | −10.812 | −6.833 | 3.24 |
(Jain and Lobiyal, 2020) | −9.162 | −11.359 | −6.965 | 3.14 |
(Jha et al., 2023) | −10.935 | −13.244 | −8.627 | 3.08 |
(Jha et al., 2023b) | −7.327 | −9.790 | −4.865 | 3.00 |
(Jia et al., 2018) | −12.436 | −14.695 | −10.177 | 3.11 |
(Yepes, 2018) | −13.263 | −15.263 | −11.262 | 3.24 |
(Lopukhin and Lopukhina, 2016) | −11.963 | −14.226 | −9.700 | 3.11 |
(Meng, 2022) | −11.858 | −14.714 | −9.002 | 2.80 |
(Mohd et al., 2020) | −9.770 | −12.137 | −7.403 | 3.05 |
(Pal and Saha, 2019) | −8.309 | −10.819 | −5.800 | 2.98 |
(Pal et al., 2017) | −6.470 | −8.735 | −4.205 | 3.10 |
(Pal et al., 2018) | −12.377 | −14.515 | −10.238 | 3.17 |
(Pal et al., 2017) | −12.286 | −14.329 | −10.242 | 3.22 |
(Pal Singh and Kuma, 2019) | −13.971 | −16.049 | −11.893 | 3.20 |
(Rios et al., 2018) | −8.158 | −10.191 | −6.126 | 3.22 |
(Sabbir et al., 2017) | −10.639 | −12.680 | −8.598 | 3.22 |
(Saidi and Jarray, 2022) | −9.367 | −11.364 | −7.370 | 3.24 |
(Shafi et al., 2023) | −9.049 | −11.071 | −7.027 | 3.23 |
(Singh and Kumar, 2019) | −10.180 | −12.319 | −8.041 | 3.17 |
(Torunoglu-Selamet et al., 2020) | −10.062 | −12.281 | −7.844 | 3.13 |
(Yusuf et al., 2022) | −3.302 | −5.671 | −0.934 | 3.05 |
(Zhang et al., 2019) | −7.288 | −9.355 | −5.222 | 3.20 |
(Zhang et al., 2019b) | −5.326 | −7.397 | −3.255 | 3.20 |
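The random-effects summary above can be illustrated with a minimal Python sketch of DerSimonian–Laird pooling. It assumes that each study's standard error can be recovered from its symmetric 95% confidence interval; the function names `se_from_ci` and `dersimonian_laird` are hypothetical, and only three rows of the table are used, so the printed values will not match the reported 32-study summary.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def se_from_ci(lower, upper, z=Z_95):
    """Recover a standard error from a symmetric confidence interval."""
    return (upper - lower) / (2.0 * z)


def dersimonian_laird(effects, ses):
    """Pool effect sizes with the DerSimonian-Laird random-effects estimator."""
    k = len(effects)
    if k < 2 or k != len(ses):
        raise ValueError("need at least two studies with matching lengths")
    w = [1.0 / s ** 2 for s in ses]                   # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = k - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)                     # between-study variance
    w_star = [1.0 / (s ** 2 + tau2) for s in ses]     # random-effects weights
    sws = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sws
    return {
        "pooled": pooled,
        "se": math.sqrt(1.0 / sws),
        "tau2": tau2,
        "i2_pct": max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0,
        "h2": q / df,
        "weights_pct": [100.0 * wi / sws for wi in w_star],
    }


# Three rows from the table: (effect size, CI lower, CI upper).
rows = [(-12.201, -14.340, -10.063),
        (-7.776, -9.928, -5.624),
        (-3.302, -5.671, -0.934)]
result = dersimonian_laird([r[0] for r in rows],
                           [se_from_ci(r[1], r[2]) for r in rows])
print(result)
```

In this formulation H² equals Q divided by its degrees of freedom, so whenever I² is positive it also equals 1 / (1 − I²); that is consistent with the reported pair I² = 82.29% and H² = 5.65.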
Parameter | Coefficient | Std. Err. | z | p-value | 95% CI Lower | 95% CI Upper |
---|---|---|---|---|---|---|
Pubyear | 0.1499619 | 0.2004735 | 0.75 | 0.454 | −0.242959 | 0.5428827 |
Constant | −312.7112 | 404.7987 | −0.77 | 0.440 | −1106.102 | 480.6797 |
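As a reading aid for the meta-regression rows, the z statistic, two-sided p-value, and 95% confidence interval can be recomputed from a coefficient and its standard error. The sketch below assumes a normal (Wald) approximation; the helper name `wald_summary` is hypothetical and is not taken from the original analysis.

```python
import math


def wald_summary(coef, se, z_crit=1.959964):
    """Return z, two-sided p-value, and 95% CI bounds under a normal approximation."""
    z = coef / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p, coef - z_crit * se, coef + z_crit * se


# Pubyear row: coefficient and standard error as reported in the table.
z, p, lower, upper = wald_summary(0.1499619, 0.2004735)
print(round(z, 2), round(p, 3), round(lower, 6), round(upper, 6))
```

Because the interval for Pubyear spans zero, publication year shows no statistically significant association with effect size under this approximation.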
Parameter | Coefficient | Std. Err. | z | p-value | 95% CI Lower | 95% CI Upper |
---|---|---|---|---|---|---|
Dataset | −7.81 × 10⁻⁶ | 1.73 × 10⁻⁶ | −4.52 | 0.000 | −0.0000112 | −4.43 × 10⁻⁶ |
Constant | −8.987219 | 404.7987 | 0.4182412 | 0.000 | −1106.102 | −8.167481 |
Studies (n = 36) | Coefficient | 95% CI Lower | 95% CI Upper |
---|---|---|---|
Observed (n = 32) | −9.906 | −10.830 | −8.983 |
Observed + Imputed (32 + 4) | −9.434 | −10.371 | −8.496 |
The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Masethe, H.D.; Masethe, M.A.; Ojo, S.O.; Giunchiglia, F.; Owolawi, P.A. Word Sense Disambiguation for Morphologically Rich Low-Resourced Languages: A Systematic Literature Review and Meta-Analysis. Information 2024, 15, 540. https://doi.org/10.3390/info15090540
This article aims to describe the conjoint analysis (CA) method and its application in healthcare settings, and to provide researchers with a brief guide to conduct a conjoint study. CA is a method for eliciting patients' preferences that offers choices similar to those in the real world and allows researchers to quantify these preferences.
Background The use of conjoint analysis (CA) to elicit patients' preferences for osteoarthritis (OA) treatment has the potential to contribute to tailoring treatments and enhancing patients' compliance and adherence. This review's main aim was to identify and summarise the evidence that used conjoint analysis techniques to quantify patient preferences for OA treatments.
Several earlier review articles in marketing and consumer academic research have documented the evolution of conjoint analysis. 2 This manuscript provides an organizing framework for this vast literature and reviews key articles, critically discusses several advanced issues and developments, and identifies directions for future research.
The objective of this scoping review was to describe existing applications of conjoint analysis and discrete choice experiments for eliciting stakeholder preferences, individual patient and provider level for mental health services within published literature.
Background While patients' preferences in primary care have been examined in numerous conjoint analyses, there has been little systematic effort to synthesise the findings. This review aimed to identify, to organise and to assess the strength of evidence for the attributes and factors associated with preference heterogeneity in conjoint analyses for primary care outpatient visits. Methods We ...
This variation in method type and reporting quality sometimes makes it difficult to assess substantive findings. This review identifies and describes recent applications of conjoint analysis based on a systematic review of conjoint analysis in the health literature.
This review's main aim was to identify and summarise the evidence that used conjoint analysis techniques to quantify patient preferences for OA treatments. Methods: A comprehensive search strategy was conducted using electronic databases and hand reference checks. Databases were searched from their inception until 10th June 2019.
Despite the increased popularity of conjoint analysis in health outcomes research, little is known about what specific methods are being used for the design and reporting of these studies. This variation in method type and reporting quality sometimes makes it difficult to assess substantive findings. This review identifies and describes recent applications of conjoint analysis based on a ...
The conjoint analysis method elicits patients' preferences by offering choices similar to those in the real world, and it allows researchers to quantify these preferences. There are some limitations regarding the appropriate sample size, the quality assessment tool, and the external validity of CA. This article aims to describe the conjoint analysis (CA) method and its application ...
Purpose: Since the inception of the conjoint analysis technique in 1971, papers addressing the epistemological aspects of conjoint analysis have been scant. Hence, this paper attempts to address the vacuum of qualitative discourse addressing the epistemological and methodological aspects of conjoint analysis including different issues, challenges, probable solutions, limitations and future ...
In conjoint analysis (CA), one generates a set of product profiles that vary systematically in terms of their attributes and attribute levels. These p…
This paper discusses various issues involved in implementing conjoint analysis and describes some new technical developments and application areas for the methodology. The modeling of consumer preferences among multiattribute alternatives has been one of the major activities in consumer research for at least a ...
Tenopir et al. (2011) even utilized conjoint analysis in the context of a research question from the sphere of scholarly communication similar to ours: controlling for seven different characteristics of research articles, they found article topic, online accessibility, and peer review status to be the most important factors for researchers when ...
... conjoint analysis and highlighted gaps in the current literature. Conclusions: 62 articles focused on hedonic goods and 38 on extrinsic qualities. Insights from this review champion conjoint analysis as an indispensable tool, highlighting its potential to refine future research endeavours in the domain. Results and supporting data from conjoint resear ...
3.1 Conjoint Analysis Method. All methods of Conjoint Analysis—Traditional, Adaptive, and Choice-Based Conjoint—have their respective advantages and drawbacks. However, considering the purpose of this research, as well as the number of attribute levels, the Traditional Conjoint Analysis seems to be a suitable option.
Conjoint analysis is a form of statistical analysis that firms use in market research to understand how customers value different components or features of their products or services. It's based on the principle that any product can be broken down into a set of attributes that ultimately impact users' perceived value of an item or service.
Plant-based alternatives have a lower environmental impact than animal-derived proteins, but many consumers hesitate to try them. An alternative strategy is partially substituting animal proteins with plant proteins, creating hybrid products with improved characteristics. This study investigates consumer perception of hybrid yogurt using choice-based conjoint analysis (CBC) with five ...
Choice-based approach to conjoint analysis [7]. More recently Green and Srinivasan offered a review of the literature on conjoint analysis in a prestigious marketing ...