Effective Learning Practices

Learning at college requires processing and retaining a high volume of information across various disciplines and subjects at the same time, which can be a daunting task, especially if the information is brand new. In response, college students try out varied approaches to their learning – often drawing from their high school experiences and modeling what they see their peers doing. While it’s great to try different styles and approaches to learning and studying for your courses, it's smart to incorporate into your daily habits some learning practices that are backed up by current research. 

Below are some effective learning practices suggested by research in the cognitive and learning sciences:

Take ownership of your educational experience.

As an engaged learner, it is important to take an active, self-directed role in your academic experience. Taking agency might feel new to you. In high school, you might have felt like you had little control over your learning experience, so transitioning to an environment where you are implicitly expected to be in the driver’s seat can be disorienting. 

A shift in your mindset regarding your agency, however, can make a big difference in your ability to learn effectively and get the results you want out of your courses.  

Here are four concrete actions you can take to assert ownership over your education:

  • Attend office hours. Come prepared with questions for your instructor about lectures, readings, or other aspects of the course.
  • Schedule meetings with administrators and faculty to discuss your academic trajectory and educational goals. You might meet with your academic adviser, course heads, or the Director of Undergraduate Studies (DUS) in your concentration.
  • Identify areas for growth and development based on your academic goals. Then, explore opportunities to shape and further refine your skills in those areas.
  • Advocate for support, tools, equipment, or considerations that address your learning needs.

Seek out opportunities for active learning.

Many courses include opportunities for active and engaged learning within their structure. Take advantage of those opportunities in order to enhance your understanding of the material. If such opportunities are not built into the course structure, you can develop your own active learning strategies, including joining study groups and using other active studying techniques. Anytime you grapple actively with your course material, rather than taking it in passively, you’re engaging in active learning. By doing so, you are increasing your retention of key course concepts.

One particularly effective way to help yourself stay focused and engaged in the learning process is to cultivate learning communities, such as accountability groups and study groups. Working in the company of other engaged learners can help remind you why you love learning or why you chose a particular course, concentration, research project, or field of study. Those reminders can re-energize and refocus your efforts. 

Practice study strategies that promote deep learning.

In an attempt to keep up with the demands of college, many students learn concepts just in time for assessment benchmarks (tests, exams, and quizzes). The problem with this approach is that, in many disciplines (and especially in STEM), the concepts build on one another. Students survive the course only to be met at the final with concepts from the first quiz that they have long since forgotten. This is why deep learning is important. Deep learning occurs when students use study strategies that ensure course ideas and concepts are embedded into long-term, rather than just short-term, memory. Building your study plans and review sessions in a way that helps create a conceptual framing of the material will serve you now and in the long run.

Here are some study strategies that promote deep learning: 

Concept Mapping: A concept map is a visualization of knowledge that is organized by the relationships between the topics. At its core, it is made of concepts that are connected together by lines (or arrows) that are labeled with the relationship between the concepts.
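
If you like thinking in code, a concept map can be sketched as a small labeled graph. The following is a minimal, hypothetical illustration; the topics, relationship labels, and function name are invented for this example and are not part of the original guidance:

# Minimal sketch of a concept map in code: each entry links two concepts with a
# labeled relationship. The biology topics below are made up purely for illustration.
concept_map = [
    ("cellular respiration", "takes place in", "mitochondria"),
    ("mitochondria", "produce", "ATP"),
    ("ATP", "powers", "muscle contraction"),
]

def connections_from(concept, edges=concept_map):
    # List every labeled link that starts at the given concept.
    return [(label, target) for source, label, target in edges if source == concept]

for label, target in connections_from("mitochondria"):
    print(f"mitochondria --{label}--> {target}")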

Collaboration: You don’t have to go it alone. In fact, research on learning suggests that it’s best not to. Using study groups, ARC accountability hours, office hours, question centers, and other opportunities to engage with your peers helps you not only test your understanding but also learn different approaches to tackling the material.

Self-test: Quiz yourself about the material you need to know with your notes put away. Refamiliarize yourself with the answers to questions you get wrong, wait a few hours, and then try asking yourself again. Use practice tests provided by your courses or use free apps to create quizzes for yourself.
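
For those comfortable with a little scripting, self-testing can even be automated. Below is a minimal, hypothetical flashcard-style sketch; the questions, answers, and function name are placeholders rather than any official tool:

# Minimal self-quiz sketch: questions are asked with notes put away, and missed
# items are collected so you can retry them after a few hours. The questions and
# answers here are placeholders; swap in your own course material.
import random

cards = {
    "What kind of memory does deep learning target?": "long-term memory",
    "What does interleaving mean?": "mixing different topics or problem types in one session",
}

def self_test(deck):
    missed = []
    for question, answer in random.sample(list(deck.items()), len(deck)):
        response = input(question + " ").strip().lower()
        if response != answer.lower():
            print("Not quite. One acceptable answer:", answer)
            missed.append(question)
    return missed  # come back to these after a break

if __name__ == "__main__":
    print("Missed this round:", self_test(cards))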

Create a connection: As you try to understand how all the concepts and ideas from your course fit together, try to associate new information with something you already know. Making connections can help you create a more holistic picture of the material you’re learning.

Teach someone (even yourself!): Try teaching someone the concept you’re trying to remember. You can even try to talk to yourself about it! Vocalizing helps activate different sensory processes, which can enhance memory and help you embed concepts more deeply.

Interleave: We often think we’ll do best if we study one subject for long periods of time, but research contradicts this. Try to work with smaller units of time (a half-hour to an hour) and switch up your subjects. Return to concepts you studied earlier at intervals to ensure you learned them sufficiently.
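
To make the idea concrete, here is a small, hypothetical sketch of how an interleaved study plan could be laid out in code; the subjects, session length, and function name are invented for illustration:

# Minimal interleaving sketch: divide a study session into short blocks and rotate
# subjects rather than studying one subject straight through. The subjects and the
# block length are placeholders; adjust them to your own courses.
from itertools import cycle

def interleaved_schedule(subjects, total_minutes=180, block_minutes=30):
    rotation = cycle(subjects)
    blocks = total_minutes // block_minutes
    return [(i * block_minutes, next(rotation)) for i in range(blocks)]

for start, subject in interleaved_schedule(["chemistry", "statistics", "history"]):
    print(f"minute {start:3d}: {subject}")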

Be intentional about getting started and avoiding procrastination.

When students struggle to complete tasks and projects, procrastination is usually not a matter of laziness but of the anxiety and negative emotions that accompany starting the task. Understanding what conditions promote or derail your intention to begin a task can help you avoid procrastinating.

Consider the following tips for getting started: 

Eat the Frog: The frog is that one thing on your to-do list that you have absolutely no motivation to do and that you’re most likely to procrastinate on. Eating the frog means doing that task first and getting it over with. If you don’t, odds are that you’ll procrastinate all day. With that one task done, you will experience a sense of accomplishment at the beginning of your day and gain some momentum that will help you move through the rest of your tasks.

Pomodoro Technique: Sometimes we procrastinate because we’re overwhelmed by the sheer amount of time we expect a task will take. But while it might feel hard to sit down for several hours to work on something, most of us feel we can easily work for a half hour on almost any task. Enter the Pomodoro Technique! When faced with any large task or series of tasks, break the work down into short, timed intervals (25 minutes or so) that are spaced out by short breaks (5 minutes). Working in short intervals trains your brain to focus for manageable periods of time and helps you stay on top of deadlines. With time, the Pomodoro Technique can even help improve your attention span and concentration. Pomodoro is a cyclical system: you work in short sprints, which keeps you consistently productive, and you take regular breaks that bolster your motivation and get you ready for your next pomodoro.
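
As a rough illustration of the cycle described above, here is a minimal, hypothetical timer sketch. The interval lengths follow the common 25/5 pattern; the function name and structure are invented for this example:

# Minimal Pomodoro sketch: 25-minute work sprints separated by 5-minute breaks.
# time.sleep simply stands in for the interval; a real tool would add notifications.
import time

def pomodoro(cycles=4, work_minutes=25, break_minutes=5):
    for i in range(1, cycles + 1):
        print(f"Pomodoro {i}: focus for {work_minutes} minutes.")
        time.sleep(work_minutes * 60)
        print(f"Pomodoro {i} complete. Take a {break_minutes}-minute break.")
        time.sleep(break_minutes * 60)
    print("Session finished. Take a longer break.")

if __name__ == "__main__":
    pomodoro()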

Distraction Pads: Sometimes we stop a task that took us a lot of time to get started on because we get distracted by something else. To avoid this, have a notepad beside you while working, and every time you get distracted with a thought, write it down, then push it aside for later. Distracting thoughts can be anything from remembering that you still have another assignment to complete to daydreaming about your next meal. Later on in the day, when you have some free time, you can review your distraction pad to see if any of those thoughts are important and need to be addressed.

Online Apps: It can be hard to rely on our own force of will to get ourselves to start a task, so consider using an external support. There are many self-control apps available for free online (search for "self-control apps"). Check out a few and decide on one that seems most likely to help you eliminate the distractions that can get in the way of starting and completing your work.

Engage in metacognition.

An effective skill for learning is metacognition. Metacognition is the process of “thinking about thinking” or reflecting on personal habits, knowledge, and approaches to learning. Engaging in metacognition enables students to become aware of what they need to do to initiate and persist in tasks, to evaluate their own learning strategies, and to invest adequate mental effort to succeed. When students work at being aware of their own thinking and learning, they are more likely to recognize patterns and to intentionally transfer knowledge and skills to solve increasingly complex problems. They also develop a greater sense of self-efficacy.

Mentally checking in with yourself while you study is a great metacognitive technique for assessing your level of understanding. Asking lots of “why,” “how,” and “what” questions about the material you’re reviewing helps you to be reflective about your learning and to strategize about how to tackle tricky material. If you know something, you should be able to explain to yourself how you know it. If you don’t know something, you should start by identifying exactly what you don’t know and determining how you can find the answer.

Metacognition is important in helping us overcome illusions of competence (our brain’s natural inclination to think that we know more than we actually know). All too often students don’t discover what they really know until they take a test. Metacognition helps you be a better judge of how well you understand your course material, which then enables you to refine your approach to studying and better prepare for tests.

Learning Strategies That Work

Dr. Mark A. McDaniel shares effective, evidence-based strategies about learning to replace less effective but widely accepted practices.

How do we learn and absorb new information? Which learning strategies actually work and which are mere myths?

Such questions are at the center of the work of Mark McDaniel, professor of psychology and the director of the Center for Integrative Research on Cognition, Learning, and Education at Washington University in St. Louis. McDaniel coauthored the book Make It Stick: The Science of Successful Learning.

In this Q&A adapted from a Career & Academic Resource Center podcast episode, McDaniel discusses his research on human learning and memory, including the most effective strategies for learning throughout a lifetime.

Harvard Extension: In your book, you talk about strategies to help students be better learners in and outside of the classroom. You write, “We harbor deep convictions that we learn better through single-minded focus and dogged repetition. And these beliefs are validated time and again by the visible improvement that comes during practice, practice, practice.”

McDaniel: This judgment that repetition is effective is hard to shake. There are cues present that your brain picks up when you’re rereading, when you’re repeating something, that give your metacognition, that is, your judgment about your own cognition, the misimpression that you really have learned this stuff well.

And one of the primary cues is familiarity. So as you keep rereading, the material becomes more familiar to you. And we mistakenly judge familiarity as meaning robust learning.

And the second cue is fluency. It’s very clear from much work in reading and cognitive processes during reading that when you reread something at every level, the processes are more fluent. Word identification is more fluent. Parsing the structure of the sentence is more fluent. Extracting the ideas is more fluent. Everything is more fluent. And we misinterpret these fluency cues that the brain is getting. And these are accurate cues. It is more fluent. But we misinterpret that as meaning, I’ve really got this. I’ve really learned this. I’m not going to forget this. And that’s really misleading.

So let me give you another example. It’s not just rereading. It’s situations in, say, the STEM fields or any place where you’ve got to learn how to solve certain kinds of problems. One of the standard ways that instructors present homework is to present the same kind of problem in block fashion. You may have encountered this in your own math courses, your own physics courses.

So for example, in a physics course, you might get a particular type of work problem. And the parameters on it, the numbers, might change, but in your homework you’re trying to solve two or three or four of these work problems in a row. Well, it gets more and more fluent because you know exactly what formula you have to use. You know exactly what the problem is about. And as you get more fluent, and as we say in the book, it looks like you’re getting better. You are getting better at these problems.

But the issue is: can you remember how to identify which kinds of problems go with which kinds of solutions a week later, when you’re asked to take a test that has all different kinds of problems? And the answer is no, you cannot when you’ve done this block practice. So even though instructors feel like their students are doing great with block practice, and students feel like they’re doing great, they are doing great on that kind of block practice, but they’re not at all good at retaining which distinguishing features of problems signal certain kinds of approaches.

What you want to do is interleave practice in these problems. You want to randomly have a problem of one type and then solve a problem of another type and then a problem of another type. And in doing that, it feels difficult and it doesn’t feel fluent. And the signals to your brain are, I’m not getting this. I’m not doing very well. But in fact, that effort to try to figure out what kinds of approaches do I need for each problem as I encounter a different kind of problem, that’s producing learning. That’s producing robust skills that stick with you.

So this is a seductive thing that instructors and students alike have to understand. We have to move beyond those initial judgments of ‘I haven’t learned very much’ and trust that the more difficult practice schedule really is the better learning.

And I’ve written more on this since Make It Stick. And one of my strong theoretical tenets now is that in order for students to really embrace these techniques, they have to believe that they work for them. Each student has to believe it works for them. So I prepare demonstrations to show students these techniques work for them.

The net result of adopting these strategies is that students aren’t spending more time. Instead they’re spending more effective time. They’re working better. They’re working smarter.

When students take an exam after doing lots of retrieval practice, they see how well they’ve done. The classroom becomes very exciting. There’s lots of buy-in from the students. There’s lots of energy. There’s lots of stimulation to want to do more of this retrieval practice, more of this difficulty. Because trying to retrieve information is a lot more difficult than rereading it. But it produces robust learning for a number of reasons.

I think students have to trust these techniques, and I think they also have to observe that these techniques work for them. It’s creating better learning. And then as a learner, you are more motivated to replace these ineffective techniques with more effective techniques.

Harvard Extension: You talk about tips for learners, how to make it stick. And there are several methods or tips that you share: elaboration, generation, reflection, calibration, among others. Which of these techniques is best?

McDaniel: It depends on the learning challenges that are faced. So retrieval practice, which is practicing trying to recall information from memory, is really super effective if your course requires you to reproduce factual information.

For other things, it may be that you want to try something like generating understanding, creating mental models. So if your exams require you to draw inferences and work with new kinds of problems that are illustrative of the principles, but they’re new problems you haven’t seen before, a good technique is to try to connect the information into what I would call mental models. This is your representation of how the parts and the aspects fit together, relate together.

It’s not that one technique is better than the other. It’s that different techniques produce certain kinds of outcomes. And depending on the outcome you want, you might select one technique or the other.

I really firmly believe that to the extent that you can make learning fun, and to the extent that one technique really seems more fun to you, that may be your go-to technique. I teach a learning strategy course and I make it very clear to students: you don’t need to use all of these techniques. Find a couple that really work for you, then put those in your toolbox and replace rereading with these techniques.

Harvard Extension: You reference lifelong learning and lifelong learners. You talk about the brain being plastic, the mutability of the brain in some ways, and give examples of how some lifelong learners approach their learning.

McDaniel: In some sense, more mature learners, older learners, have an advantage because they have more knowledge. And part of learning involves relating new information that’s coming in to your prior knowledge, relating it to your knowledge structures, relating it to your schemas for how you think about certain kinds of content.

And so older adults have the advantage of having this richer knowledge base with which they can try to integrate new material. So older learners shouldn’t feel that they’re at a definitive disadvantage, because they’re not. Older learners really want to try to leverage their prior knowledge and use that as a basis to structure and frame and understand new information coming in.

Our challenge as older learners is that we do have these habits of learning that are not very effective. We turn to these habits. And if these aren’t such effective habits, we may attribute our failures to learn to age or a lack of native ability, and so on. And in fact, that’s not it at all. If you adopt more effective strategies at any age, you’re going to find that your learning is more robust, it’s more successful, it falls into place.

You can learn these strategies at any age. Successful lifelong learning is getting these effective strategies in place, trusting them, and having them become a habit for how you’re going to approach your learning challenges.

Lessons in learning.

Peter Reuell

Harvard Staff Writer

Study shows students in ‘active learning’ classrooms learn more than they think

For decades, there has been evidence that classroom techniques designed to get students to participate in the learning process produce better educational outcomes at virtually all levels.

And a new Harvard study suggests it may be important to let students know it.

The study, published Sept. 4 in the Proceedings of the National Academy of Sciences, shows that, though students felt as if they learned more through traditional lectures, they actually learned more when taking part in classrooms that employed so-called active-learning strategies.

Lead author Louis Deslauriers, the director of science teaching and learning and senior physics preceptor, knew that students would learn more from active learning. He published a key study in Science in 2011 that showed just that. But many students and faculty remained hesitant to switch to it.

“Often, students seemed genuinely to prefer smooth-as-silk traditional lectures,” Deslauriers said. “We wanted to take them at their word. Perhaps they actually felt like they learned more from lectures than they did from active learning.”

In addition to Deslauriers, the study is authored by director of sciences education and physics lecturer Logan McCarty, senior preceptor in applied physics Kelly Miller, preceptor in physics Greg Kestin, and Kristina Callaghan, now a physics lecturer at the University of California, Merced.

The question of whether students’ perceptions of their learning match how well they’re actually learning is particularly important, Deslauriers said, because while students eventually see the value of active learning, initially it can feel frustrating.

“Deep learning is hard work. The effort involved in active learning can be misinterpreted as a sign of poor learning,” he said. “On the other hand, a superstar lecturer can explain things in such a way as to make students feel like they are learning more than they actually are.”

To understand that dichotomy, Deslauriers and his co-authors designed an experiment that would expose students in an introductory physics class to both traditional lectures and active learning.

For the first 11 weeks of the 15-week class, students were taught using standard methods by an experienced instructor. In the 12th week, half the class was randomly assigned to a classroom that used active learning, while the other half attended highly polished lectures. In a subsequent class, the two groups were reversed. Notably, both groups used identical class content and only active engagement with the material was toggled on and off.

Following each class, students were surveyed on how much they agreed or disagreed with statements such as “I feel like I learned a lot from this lecture” and “I wish all my physics courses were taught this way.” Students were also tested on how much they learned in the class with 12 multiple-choice questions.

When the results were tallied, the authors found that students felt as if they learned more from the lectures, but in fact scored higher on tests following the active learning sessions. “Actual learning and feeling of learning were strongly anticorrelated,” Deslauriers said, “as shown through the robust statistical analysis by co-author Kelly Miller, who is an expert in educational statistics and active learning.”
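
To illustrate what “anticorrelated” means in practice, here is a small, hypothetical sketch of the kind of calculation involved. The numbers and code are invented for illustration and are not the study’s actual data or analysis:

# Illustrative sketch only (not the study's code or data): computing the
# correlation between self-reported "feeling of learning" ratings and test
# scores. A negative coefficient is what "anticorrelated" refers to.
from statistics import mean, pstdev

def pearson(xs, ys):
    mx, my = mean(xs), mean(ys)
    cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / (pstdev(xs) * pstdev(ys))

# Made-up numbers purely to show the shape of the analysis.
feeling_of_learning = [4.5, 4.2, 4.4, 3.1, 3.3, 3.0]  # survey ratings (1-5)
test_scores = [55, 58, 52, 75, 72, 78]                # percent correct

print(round(pearson(feeling_of_learning, test_scores), 2))  # prints a negative value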

Those results, the study authors are quick to point out, shouldn’t be interpreted as suggesting students dislike active learning. In fact, many studies have shown students quickly warm to the idea, once they begin to see the results. “In all the courses at Harvard that we’ve transformed to active learning,” Deslauriers said, “the overall course evaluations went up.”

Co-author Kestin, who in addition to being a physicist is a video producer with PBS’ NOVA, said, “It can be tempting to engage the class simply by folding lectures into a compelling ‘story,’ especially when that’s what students seem to like. I show my students the data from this study on the first day of class to help them appreciate the importance of their own involvement in active learning.”

McCarty, who oversees curricular efforts across the sciences, hopes this study will encourage more of his colleagues to embrace active learning.

“We want to make sure that other instructors are thinking hard about the way they’re teaching,” he said. “In our classes, we start each topic by asking students to gather in small groups to solve some problems. While they work, we walk around the room to observe them and answer questions. Then we come together and give a short lecture targeted specifically at the misconceptions and struggles we saw during the problem-solving activity. So far we’ve transformed over a dozen classes to use this kind of active-learning approach. It’s extremely efficient — we can cover just as much material as we would using lectures.”

A pioneer in work on active learning, Balkanski Professor of Physics and Applied Physics Eric Mazur hailed the study as debunking long-held beliefs about how students learn.

“This work unambiguously debunks the illusion of learning from lectures,” he said. “It also explains why instructors and students cling to the belief that listening to lectures constitutes learning. I recommend every lecturer reads this article.”

Dean of Science Christopher Stubbs, Samuel C. Moncher Professor of Physics and of Astronomy, was an early convert. “When I first switched to teaching using active learning, some students resisted that change. This research confirms that faculty should persist and encourage active learning. Active engagement in every classroom, led by our incredible science faculty, should be the hallmark of residential undergraduate education at Harvard.”

Ultimately, Deslauriers said, the study shows that it’s important to ensure that neither instructors nor students are fooled into thinking that lectures are the best learning option. “Students might give fabulous evaluations to an amazing lecturer based on this feeling of learning, even though their actual learning isn’t optimal,” he said. “This could help to explain why study after study shows that student evaluations seem to be completely uncorrelated with actual learning.”

This research was supported with funding from the Harvard FAS Division of Science.

How to Learn More Effectively

10 Learning Techniques to Try

Kendra Cherry, MS, is a psychosocial rehabilitation specialist, psychology educator, and author of the "Everything Psychology Book."

Amy Morin, LCSW, is a psychotherapist and international bestselling author. Her books, including "13 Things Mentally Strong People Don't Do," have been translated into more than 40 languages. Her TEDx talk,  "The Secret of Becoming Mentally Strong," is one of the most viewed talks of all time.

Knowing the most effective strategies for how to learn can help you maximize your efforts when you are trying to acquire new ideas, concepts, and skills. If you are like many people, your time is limited, so it is important to get the most educational value out of the time you have.

Speed of learning is not the only important factor, however. It is important to be able to accurately remember the information that you learn, recall it at a later time, and use it effectively in a wide variety of situations.

How can you teach yourself to learn? As you approach a new subject, incorporate some of the following tactics:

  • Find ways to boost your memory
  • Always keep learning new things
  • Use a variety of learning techniques
  • Try teaching it to someone else
  • Connect new information to things you already know
  • Look for opportunities to have hands-on experiences
  • Remember that mistakes are part of the process
  • Study a little bit every day
  • Test yourself
  • Focus on one thing at a time

Knowing how to learn well is not something that happens overnight, but putting a few of these learning techniques into daily practice can help you get more out of your study time.

Improve Your Memory

There are a number of different strategies that can boost memory. Basic tips such as improving your focus, avoiding cram sessions, and structuring your study time are good places to start, but there are even more lessons from psychology that can dramatically improve your learning efficiency.

Strategies that can help improve your memory include:

  • Getting regular physical exercise, which is linked to improvements in memory and brain health
  • Spending time socializing with other people
  • Getting enough sleep
  • Eliminating distractions so you can focus on what you are learning
  • Organizing the information you are studying to make it easier to remember
  • Using elaborative rehearsal when studying; when you learn something new, spend a few moments describing it to yourself in your own words
  • Using visual aids like photographs, graphs, and charts
  • Reading the information you are studying out loud

For example, you might use general learning techniques like setting aside quiet time to study, rehearsing, and reading information aloud. You might combine this with strategies that can foster better memory, such as exercising and socializing.

If you're pressed for time, consider combining study strategies. Listen to a podcast while you're taking a walk or join a group where you can practice your new skills with others.

Keep Learning New Things

One sure-fire way to become a more effective learner is to simply keep learning. Research has found that the brain is capable of producing new brain cells, a process known as neurogenesis. However, many of these cells will eventually die unless a person engages in some type of effortful learning.

By learning new things, these cells are kept alive and incorporated into brain circuits.

So, if you are learning a new language, it is important to keep practicing the language in order to maintain the gains you have achieved. This "use-it-or-lose-it" phenomenon involves a brain process known as "pruning."

In pruning, certain pathways in the brain are maintained, while others are eliminated. If you want the new information you just learned to stay put, keep practicing and rehearsing it.

Learn in Multiple Ways

Another one of the best ways to learn is to focus on learning in more than one way. For example, instead of just listening to a podcast, which involves auditory learning, find a way to rehearse the information both verbally and visually.

This might involve describing what you learned to a friend, taking notes, or drawing a mind map. By learning in more than one way, you’re further cementing the knowledge in your mind.

For example, if you are learning a new language, try varying techniques such as listening to language examples, reading written language, practicing with a friend, and writing down your own notes.

One helpful tip is to try writing out your notes on paper rather than typing on a laptop, tablet, or computer. Research has found that longhand notes can help cement information in memory more effectively than digital note-taking.

Varying your learning techniques and giving yourself the opportunity to learn in different ways and in different contexts can help make you a more efficient learner.

Teach What You Are Learning

Educators have long noted that one of the best ways to learn something is to teach it to someone else. Remember your seventh-grade presentation on Costa Rica? By having you teach it to the rest of the class, your teacher hoped you would gain even more from the assignment.

You can apply the same principle today by sharing newly learned skills and knowledge with others. Start by translating the information into your own words. This process alone helps solidify new knowledge in your brain. Next, find some way to share what you’ve learned.

Some ideas include writing a blog post, creating a podcast, or participating in a group discussion.

Build on Previous Learning

Another great way to become a more effective learner is to use relational learning, which involves relating new information to things that you already know.

For example, if you are learning a new language, you might associate the new vocabulary and grammar you are learning with what you already know about your native language or other languages you may already speak.

Gain Practical Experience

For many students, learning typically involves reading textbooks, attending lectures, or doing research in the library or online. While seeing information and then writing it down is important, actually putting new knowledge and skills into practice can be one of the best ways to improve learning.

If it is a sport or athletic skill, perform the activity on a regular basis. If you are learning a new language, practice speaking with another person and surround yourself with language-immersion experiences. Watch foreign-language films and strike up conversations with native speakers to practice your budding skills.

If you are trying to acquire a new skill or ability, focus on gaining practical experience.

Don't Be Afraid to Make Mistakes

Research suggests that making mistakes when learning can improve learning outcomes. According to one study, trial-and-error learning where the mistakes were close to the actual answer was actually a helpful part of the learning process.

Another study found that mistakes followed by corrective feedback can be beneficial to learning. So if you make a mistake when learning something new, spend some time correcting the mistake and examining how you arrived at the incorrect answer.

This strategy can help foster critical thinking skills and make you more adaptable in learning situations that require being able to change your mind.

Research suggests that making mistakes when learning can actually help improve outcomes, especially if you correct your mistake and take the time to understand why it happened.

Use Distributed Practice

Another strategy that can help is known as distributed practice. Instead of trying to cram all of your learning into a few long study sessions, try a brief, focused session, and then take a break.

So if you were learning a new language, you might devote a period of time to an intensive session of studying. After a break, you would then come back and rehearse your previous learning while also extending it to new learning.

This process of returning for brief sessions over a long period of time is one of the best ways to learn efficiently and effectively. 
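
As a concrete (and hypothetical) illustration, a distributed-practice plan can be sketched as a simple schedule of expanding review intervals; the interval values and function name below are invented for this example:

# Minimal distributed-practice sketch: plan several short review sessions at
# expanding intervals instead of one long cram session. The interval lengths are
# illustrative, not a prescription.
from datetime import date, timedelta

def review_dates(first_study_day, intervals_in_days=(1, 3, 7, 14, 30)):
    return [first_study_day + timedelta(days=d) for d in intervals_in_days]

for review_day in review_dates(date.today()):
    print("review on", review_day.isoformat())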

What is the best way to learn?

Research suggests that this type of distributed learning is one of the most effective learning techniques. Focus on spending a little time studying each topic every day.

While it may seem that spending more time studying is one of the best ways to maximize learning, research has demonstrated that taking tests actually helps you better remember what you've learned, even if it wasn't covered on the test.

This phenomenon, known as the testing effect, suggests that spending time retrieving information from memory improves the long-term memory of that information. This retrieval practice makes it more likely that you will be able to remember that information again in the future.

Stop Multitasking

For many years, it was thought that people who multitask (perform more than one activity at once) had an edge over those who did not. However, research now suggests that multitasking can actually make learning less effective.

Multitasking can involve trying to do more than one thing at the same time, but it can also involve quickly switching back and forth between tasks or trying to rapidly perform tasks one after the other. 

According to research, doing this not only makes people less productive when they work but also impairs attention and reduces comprehension. Multitasking when you are studying makes it harder to focus on the information and reduces how much you understand it.

Research has also found that media multitasking, or dividing attention between different media sources, can also have a detrimental impact on learning and academic performance.

To avoid the dangers of multitasking, start by focusing your attention on the task at hand and continue working for a predetermined amount of time.

If you want to know how to learn, it is important to explore learning techniques that have been shown to be effective. Strategies such as boosting your memory and learning in multiple ways can be helpful. Regularly learning new things, using distributed practice, and testing yourself often can also be helpful ways to become a more efficient learner.

A Word From Verywell

Becoming a more effective learner can take time, and it always takes practice and determination to establish new habits. Start by focusing on just a few of these tips to see if you can get more out of your next study session.

Perhaps most importantly, work on developing the mindset that you are capable of improving your knowledge and skills. Research suggests that believing in your own capacity for growth is one of the best ways to take advantage of the learning opportunities you pursue.

Frequently Asked Questions

Create a study schedule, eliminate distractions, and try studying frequently for shorter periods of time. Use a variety of learning methods such as reading the information, writing it down, and teaching it to someone else.

Learning techniques that can help when you have ADHD include breaking up your study sessions into small blocks, giving yourself plenty of time to prepare, organizing your study materials, and concentrating on information at times when you know that your focus is at its best.

Practice testing and distributed practice have been found to be two of the most effective learning strategies. Test yourself in order to practice recalling information and spread your learning sessions out into shorter sessions over a longer period of time.

The easiest way to learn is to build on the things that you already know. As you gradually extend your knowledge a little bit at a time, you'll eventually build a solid body of knowledge around that topic.

Five ways to learn include visual, auditory, text-based, kinesthetic, and multimodal learning. The VARK model of learning styles suggests that people tend to have a certain preference for one or more of these ways to learn.

How a New Learning Theory Can Benefit Transformative Learning Research: Empirical Hypotheses

  • 1 Department of Human Development, Teachers College, Columbia University, New York, NY, United States
  • 2 Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, United States

Transformative Learning research and practice has consistently stalled on three fundamental debates: (1) what transformative learning is, and how it’s differentiated from other learning; (2) what the preconditions for transformative learning are; and (3) what transformative learning’s predictable and relevant outcomes are. The following article attempts two main feats: (1) to provide a re-organization of transformative learning theory through the work of Vygotskian cultural-historical activity theory, and a newly synthesized meta-theory of learning and development generally, and (2) to use that re-organized model to articulate empirical research questions and hypotheses that are more amenable to observation and analysis than the typical time and cost intensive methods available to most researchers studying transformative learning today. The newly synthesized model draws on historical work in cognitive, social, educational, and clinical psychology, and clearly articulates the dialectical nature between the environment and experience, and what is meant by classical transformative learning concepts such as cognitive-rational frame of reference shifts, self/soul inner work, critical reflection, imaginative engagement, and everything in between.

Introduction

In the last four decades of transformative learning research, analytical-reductionist psychological science has proliferated characteristics and definitions of transformative learning without doing enough critical-dialectical theoretical work to resolve the inconsistencies between them ( Cranton and Taylor, 2013 ; Howie and Bagnall, 2013 ). The following article is intended to make progress toward a resolution. Transformative Learning (TL), according to its most cited theorist, Jack Mezirow, is:

The process by which we transform problematic frames of reference (mindsets, habits of mind, meaning perspectives) – sets of assumption and expectation – to make them more inclusive, discriminating, open, reflective and emotionally able to change. Such frames are better because they are more likely to generate beliefs and opinions that will prove more true or justified to guide action ( Mezirow, 2008 , p. 92).

In this context, frames of reference are composed of “habits of mind” and “points of view” (2008, p. 92). Habits of mind are defined as “broad, abstract, orienting, habitual ways of thinking, feeling, and acting, influenced by assumptions that constitute a set of codes” (2008, p. 92). Points of view are defined as “the constellation of belief, memory, value judgment, attitude and feeling that shapes a particular interpretation” (2008, p. 92). An example provided by Mezirow of a habit of mind is ethnocentrism, a resulting point of view being the negative feelings, beliefs, judgments, and attitudes toward individuals or groups with different characteristics than our own (2008, p. 93). Finally, “problematic frames of reference” are those that result in a “disorienting dilemma” for the individual, where their current habits of mind and points of view are inadequate for overcoming some challenge through changing only a point of view or a habit of mind, and can only be resolved through changing the entire frame of reference, or the meaning-making relationships between the habits of mind and the points of view, or how habits of mind “result” in points of view (2008, p. 94).

Transformative learning then, is neatly described as occurring in the moment when a point of view transforms not only the habit of mind, but the entire frame of reference (habits of mind as well as resulting points of view and the relationships between them, p. 94; also defined as “structures of assumptions,” 1997, p. 5). This deceptively simple illustration of TL has led to its application in diverse but not always easily relatable contexts and conditions ( Nohl, 2015 ), and what exactly is meant by how points of view “result” from habits of mind (i.e., the frame-of-reference process) isn’t very clear, and neither are its necessary and sufficient conditions ( Dirkx et al., 2018 ). As a further confusion, frames of reference are alternatively described in Mezirow’s later writings as composed by two dimensions (habits of mind and points of view, i.e., greater than the sum of these parts), as well as equated with one of these dimensions (habits of mind), often on the same page (2008, p. 92). Yet in his earlier writings, these concepts are clearly differentiated (1991, p. 5–6).

Not only has Mezirow’s own thinking around TL evolved over time (Kitchenham, 2008), his original 10-step critical-dialectical theory (Mezirow, 2000) has been criticized for a lack of generalizability, and alternative models have proliferated within the gap (Taylor, 2007; Hoggan, 2016b). Both factors combined make theoretical differentiation (between TL and not-TL) and linkage (between various observations of TL) challenging. An example of the ad hoc proliferation: Taylor (1997) categorizes TL processes as psychocritical, psychodevelopmental, psychoanalytical, or social-emancipatory, all of which require a disorienting dilemma but specify various conditions that produce it and engage different processes to resolve it. Then, Taylor (2008) adds neurobiological, cultural-spiritual, race-centric, and planetary to the typology, but it isn’t clear how any of these new categories demonstrate consistent discriminant or convergent validity beyond loosely and incompletely described content validity (see Taylor, 2007, p. 10). Hoggan (2016b) further complicates this picture by categorizing TL outcomes without regard to the processes that may give rise to one category of outcomes instead of another.

An empirical issue resulting from this theoretical milieu: strategies for measuring TL or TL outcomes have relied on intensive qualitative data collection such as retrospective interviews (Taylor, 1994), focus groups (Hoggan, 2014), written content analysis (Boyer et al., 2006), video content analysis (Burden and Atkinson, 2008), and ethnography (Quinn and Sinclair, 2016), or on crude quantitative methods such as self-report scales (see Romano, 2017, for a review). These methods limit the scope and generalizability of TL research generally due to the time and cost implications of the qualitative strategies (Harder et al., 2021), or the lack of reliability found in self-reports. Further, methods have also tended to impose data collection instruments that probably instigate the TL outcomes they hope to observe (e.g., Carrington and Selva, 2010, "reflection logs," p. 1; Harder et al., 2021, WeValue InSitu; see also Pernell-Arnold et al., 2012; Dirkx et al., 2018). These characteristics of TL research gate its theoretical advance and understanding by underemphasizing a priori hypotheses about what causes transformation in favor of arguing for the expansion of TL theory to include the researcher’s domain of practice and/or methodology of choice. While it is important to find the conceptual and practical boundaries of TL, this is impossible to do without an anchored perspective, just as, somewhat ironically, the transformation from one perspective to another isn’t possible without first one identified perspective and then a differentiated other perspective to transform to (Mezirow, 2003, p. 60). The purpose here is to show how previous TL meta-theory attempts have fallen short, and why, before explaining how a new theory of learning generally can boost TL research by providing such an anchor. To do so, I return to Mezirow’s original conceptualization of TL, and show how its most mature evolution can be clarified and associated with evidence-based TL outcomes with this new theory. I then specify empirically testable hypotheses that afford broader, faster, and cheaper data collection methods for TL researchers.

What has been missing since the beginning are empirically testable hypotheses concerning:

(1) What is transformative learning, and how does it compare to other kinds of learning (Mezirow, 2000; Kitchenham, 2008; Sessa et al., 2011)?

(2) What are the preconditions for transformative learning to occur (Mezirow, 1978, 1991, 2003; Dirkx et al., 2018)?

(3) What are the predictable outcomes of transformative learning (Hoggan, 2016a, b; for relevant discussions, see Dirkx et al., 2006; Taylor and Cranton, 2012)?

These questions have been addressed in the literature by numerous authors examining qualitative data from their own perspectives with their own biases, resulting in disparate theories that pay minor lip service to one another without critically examining the gaps, overlaps, and confusions across them ( Cranton and Taylor, 2013 ). This trend hampers theoretical development as the meanings of central terms like “perspective,” “meaning,” “frame of reference,” and “habits of mind” are defined in conflict with previous definitions ( Howie and Bagnall, 2013 ).

This article attempts to resolve these issues by applying a newly synthesized theory of learning and development to transformative learning, and then contrasting it with perceptual, adaptive, and generative learning ( Goldstone, 1998 ; Sessa et al., 2011 ). First, a Vygotskian perspective on cultural-historical activity theory ( Roth and Lee, 2007 ) is presented as the theoretical basis for this new theory of learning, known as the Introduction-Conflict-Balance-Creation-Identity Theory of Learning and Development (ICBCI), which is then briefly outlined (see Friedman, 2021 for full details). Next, the stubborn challenges of TL research are reviewed in light of this new theory. Finally, ICBCI is used to state empirically testable hypotheses for TL theory as a theory-in-practice of learning-leading-development through human activity ( Holzman, 2006 ; Roth and Lee, 2007 ).

Vygotskian Cultural-Historical Activity Theory

As early as the 1930s, Russian psychologist Lev Vygotsky expressed frustration with educational psychology as employing “atomistic and functional modes of analysis…[that] treated psychological processes in isolation” ( Vygotsky, 1986 , p. 1). In the time since, numerous psychologists have taken up the charge to integrate psychological processes with one another with varying degrees of analytical-reductionism. While the various threads of this work go by many names, Vygotsky’s colleagues and students developed what is known generally as cultural-historical activity theory (CHAT; Roth and Lee, 2007 ). Vygotsky’s original emphasis on engaging critical-dialectical methods to discover the processes involved in human learning and development spurred his students, particularly Alexander Luria and A. N. Leont’ev, to develop his work further, culminating in what is today considered “third-generation CHAT” ( Roth and Lee, 2007 , p. 188). The roots of CHAT can be traced back to dialectical materialism (e.g., Marx, 1967 ), classical German philosophy (e.g., Hegel, 1991 ; Wittgenstein, 2010 ), and Vygotsky’s (1978 , 1986 ) writings. Vygotsky’s work, considered the genesis of first-generation activity theory, emphasized activity, rather than the individual person, as the appropriate unit of psychological analysis ( Newman and Holzman, 2013 , p. 52), a revolutionary act amongst dominant Western constructivist theory ( Loughlin, 1992 , p. 791). In the second generation, students of Vygotsky incorporated societal, cultural, and historical dimensions into the dialectical materialist focus on activity ( Roth and Lee, 2007 , p. 189). And in its third generation, Leont’ev (1978) specifically argued for historically evolving object-practical activity as the fundamental unit and the explanatory principle for human learning and development ( Langner, 1984 ).

Put simply, Vygotsky posited that psychological science was far more insightful and productive when viewing activity, rather than individuals, under definite conditions; his contemporaries and immediate students expanded these observations of definite local conditions, such as a teacher working with a student to learn language or mathematical operations, to global conditions, incorporating the cultural-historical dimensions of that activity, such as who was culturally welcome to learn math (e.g., largely wealthy men and boys) and by what historically embedded method (e.g., direct instruction). Finally, Vygotsky’s intellectual descendants in Soviet Russia as well as Europe and the United States (e.g., Leont’ev, 1978 ; Cole, 1995 ) discovered the value and relevance of cultural tools , or objects and methods of practice under definite conditions. These tools develop and change through praxis , or the moments of real human activity that occur only once ( Bakhtin, 1993 ), distinguished from practice , or the patterned form of action over time. For Vygotsky, what mattered was the activity engaged; for his students, the activity plus its contextualized expectations and norms; and for his descendants, that activity in normed context around stable tools also under development and change themselves, including but not limited to objects, theories, and spaces for and of activity. The development from first generation activity theory to present day CHAT is easily traced back to Vygotsky’s work, and its reliance on Marxist dialectical materialism (applied to educational psychology). For this reason, CHAT is interchangeably referred to below as “Vygotskian” theory.

Actions in Activity

More recently, researchers pursuing further theoretical advancement of these Vygotskian ideas have emphasized the important distinction between activity and behavior (Newman and Holzman, 2013, p. 46). Activity is defined by conscious awareness of, and contribution to, dialectical-critical learning and development, in a radically monistic sense, in history, rather than for society (p. 49). In other words, human activity changes the conditions that define it while being defined by them (i.e., a tool-and-result, p. 47); it is capable of making tools with which to remake itself, similar to a die-maker’s machine in a machine shop, which can produce parts to repair or enhance the die-maker itself, essentially constituting a machine that constructs itself (an imperfect analogy to neurobiological systems such as the human brain). This is fundamental human activity, where the products (cultural tools in Vygotskian theory) of that activity redefine the activity itself in their construction and use (p. 87).

A simple example of activity under definite conditions would be when a group of people agree on norms for creating norms in the group, such as deciding to use voting to make decisions about which tasks to prioritize in completing a project. Another example: a classroom of students deciding to improve the ecosystem of a local creek in order to learn about scientific observation techniques (e.g., Roth and Lee, 2004). While subtler, this example highlights the radical monism (Newman and Holzman, 2013, p. 137) of Vygotskian theory: in praxis (i.e., the exact same moment that is never repeated), students are learning (acquiring) and developing (evolving) scientific cultural tools as their unique perspective participates in the activity, adopting some pieces wholesale (e.g., velocity is equal to distance over time) while also adapting the provided tools (e.g., exchanging Styrofoam balls for oranges to counter the wind’s confounding effect; Roth and Lee, 2007, p. 204), the nature of their own interactional stance (from child/observer to student/actor), and the nature of interaction generally believed to be culturally appropriate (direct instruction in dialogue with project-based learning). The refusal to engage in dualistic thinking (subject/object, individual/collective, and learning/development) in Vygotskian theory forces the theorist to think dialectically, which is:

Equivalent to saying that any part that one might heuristically isolate within a unit [of activity] presupposes all other parts; a unit can be analyzed in terms of its component parts, but none of these parts can be understood or theorized apart from the others that contribute to defining it (p. 196).

Roth and Lee’s (2004) study is a radically monistic description of humans engaged in activity under definite conditions, as “they not only contribute to the ultimate reproduction of society, but also increase action possibilities for themselves” (p. 205). This is what is meant below by “learning-and-development”: activity is, dialectically, the cause-and-effect of simultaneous individual and societal learning within praxis (a single moment that occurs only once).

Critically, for ICBCI (see below), Vygotskian theorists characterize various forms of activity by the nature of their motives (Leont’ev, 1981), realized by adopting the general object or motive of the activity itself (Roth and Lee, 2007, p. 201). ICBCI clarifies this motive as the purpose of the activity, useful for anchoring critical-dialectical analysis of human activity under definite conditions (i.e., in pursuit of an implied or identified purpose; Friedman, 2021, p. 6–7). Thus, preliminarily for the discussion below, [one form of] activity is praxis that reciprocally defines, and is defined by, the purpose (or motive) for which it is conducted (Leont’ev, 1981; Newman and Holzman, 2013, p. 148), such as when children engage in imaginative play and develop a world where each child’s assertions and contributions through word and action change both the nature of their own understanding and the nature of the imagined world itself, in the same moment and with the same act (p. 99; Vygotsky, 1978, p. 102–103). The theoretical advancement made by the ICBCI model is to extend and clarify how purpose (such as “imagine a world to play imagination in”) is a dialectical unity with the norms, goals, and meaning of praxis as well (Friedman, 2021, p. 5–6; also see Figure 1 and section “ICBCI: A Learning Theory on its Frontier” below). Before discussing ICBCI in more detail, it is necessary to clarify what is not activity: behavior.


Figure 1. Introduction-Conflict-Balance-Creation-Identity (ICBCI) model of transformative learning (TL). See full model details in Friedman (2021).

Actions in Behavior

When human actions are not dialectical in praxis (e.g., not simultaneously defining and defined by their definite conditions), they are instrumental, in service of a particular purpose (i.e., function): they are defined by their conditions but do not define them. Such actions are referred to here as behavior (Newman and Holzman, 2013, p. 46). Behavior (i.e., a tool-for-result) implies a constellation of actions in service of societal conditions, with no access or capacity to change those conditions themselves, like using a screwdriver and a screw (Roth and Lee, 2007, p. 201–202). A screwdriver can make use of a screw because conditions allow for that, but it cannot change the norms of the screw-screwdriver relationship itself. In fact, it can only entropically deteriorate in service of those norms, such as stripping the head of the screw. Behavior can only change conditions defined by the purpose of the tool itself; in this example, the tool secures one material to another with the use of the screw. Behavior, as the term is used here, is akin to what has also been called operations (p. 202). Leont’ev (1978) viewed them as emergent “in the objective-object conditions of [goal] achievement” (p. 65), such as turning the screw “lefty-loosey” or “righty-tighty.” Deciding to do so is, potentially, conscious and goal-directed (e.g., “I want to tighten/loosen”), but given the overt goal (e.g., “tighten that screw”), it is relegated to subconscious instrumental action, taken for granted and barely attended to. Instead, the action is assumed and conditioned over time. Thus, behavior (as opposed to activity) is defined entirely by its conditions and cannot change the conditions themselves (e.g., the direction of the screw’s helix, or what screws are for). An example relevant to education: a teacher simply assigning basic workbook problems “to teach math” and students completing those problems “to learn math.” Activity in this case might involve students writing arithmetic word problems for each other or going shopping on a budget with various calculation requirements (see Lave, 1988).

For the present discussion, this distinction between activity (tool-and-result) and behavior (tool-for-result) lays the theoretical foundation for Mezirow’s (2008) transformation in the context of ICBCI. For transformative learning to occur, activity is necessary, as the tools applied in the learning context are necessarily changed by the actions (i.e., tool use) of those experiencing transformation. In Mezirow’s (2008) terminology, this is a point of view changing not only a habit of mind, but an entire frame of reference, or the relationships between points of view and habits of mind. ICBCI helps clarify this notion by connecting learning tools (predicting, trying, doing, and reflecting, i.e., habits of mind) to the products of tool use (purpose, norms, goals, and meaning, i.e., points of view), and further, by describing exactly what the relationships between points of view and habits of mind are: connections between purpose, and norms, goals, and meanings (i.e., Introduction, Conflict, Balance, Creation, and Identity activity and behavior). To clarify the meaning of this statement, a general outline of the ICBCI model is necessary.

Introduction-Conflict-Balance-Creation-Identity: A Learning Theory on Its Frontier

See Figure 1 for a reduced presentation of the ICBCI model of learning-and-development. ICBCI is a meta-theory that synthesizes historical work from cognitive, social, educational, and clinical psychology (Friedman, 2021). It posits that “zones of proximal development” (ZPDs; Vygotsky, 1986, p. 208–209) define-and-are-defined-by five “spheres of activity” (or behavior): Introduction, Conflict, Balance, Creation, and Identity (the hyphens here denote activity-like reciprocity between the constructs, i.e., that they are in dialectical unity). These spheres of activity (or behavior) are qualified by four “balance tools”: Purpose, Norms, Goals, and Meaning; and two “imbalance forces”: Rigidity and Chaos, resulting in Balance (i.e., activity/integration) or imbalance (i.e., behavior/trauma), whose interaction defines-and-is-defined-by learning-and-development. Each of these constructs is briefly explained below, and full details of the model can be found elsewhere (e.g., Friedman, 2021).

These spheres, tools, and forces are always in dynamic interplay in human activity under definite conditions (e.g., during all forms of learning). In other words, the purpose, norms, goals, and meaning (i.e., conditions) of an activity (or behavior) meet the rigidity| chaos present in the individual| group and the environment| purpose and produce either a ZPD (i.e., activity), behavior, or trauma. Note that here and below, the Sheffer stroke (“|”) corresponds to the NAND operation in classical Boolean logic and is used to denote the dialectical nature of these categories (Roth and Lee, 2007, p. 197). The terms on either side of the stroke presuppose one another and are understood as mutually exclusive terms of the same entity that together explain what neither alone does. While the rigidity| chaos unity is not discussed at length in this paper, all that matters for the present discussion is that it explains the natural and unknowable forces of change that we, in praxis, affect and that affect us. The rigidity| chaos unity thus explains the infinite milieu of conditions in history that humans contend with in their own processes of learning-and-development.
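As a brief aside for readers less familiar with the notation (a reminder of the classical logical definition only, not part of Roth and Lee’s or the present model’s formalization), the truth-functional reading of the Sheffer stroke is:

p | q ≡ ¬(p ∧ q),

that is, the expression is false only when p and q are both true. In the present article, of course, the stroke is used analogically, to mark dialectical unities, rather than as a literal truth function.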

Under conditions where ZPDs emerge, learning-and-development is perceptual, adaptive, generative, and/or transformative, depending on the spheres of activity that are defining-and-being-defined-by the ZPD (see Figure 1). Under conditions where ZPDs do not emerge, learning takes the form of conditioning, which is to say that the individual| group engages in behavior primed and enforced by conditions that they have no power to change; they simply execute expectations, perfectly or imperfectly, without conscious access to the conditions’ development, or their own. Before describing how this theoretical shift can aid TL research in section “Theoretical and Real Obstacles to Current TL Theory,” the main constructs of the model relevant to the present discussion are briefly described.

Spheres of Activity or Behavior

Introduction-Conflict-Balance-Creation-Identity posits five modes of activity (or behavior; depicted as spheres in Figure 1) extended from the integration of classical group dynamics theory (Tuckman and Jensen, 1977) and the Kolb Experiential Learning Cycle (Kolb, 2014; for details of this integration, see Friedman, 2021). Each mode is defined by the interaction between two spectra: (a) perception-action and (b) internal-external. The distinction between perception and action is related to common-sense notions of observing or sensing and acting or doing, respectively. The distinction between internal and external is related to whether perception and/or action is directed toward the outside world or the inner milieu of the individual| group.

Thus, external perception describes the “Introduction” mode, wherein individual| groups observe and get a sense of their environment| purpose. Following clockwise around Figure 1, internal action describes “Conflict,” wherein individual| groups act on their own internal milieu, essentially to organize and resolve apparent contradiction or tension. “Creation” is described as “external action,” the mode individual| groups engage in while acting on their environment| purpose. Internal perception describes “Identity,” or the mode wherein individual| groups observe and get a sense of their own being within the environment| purpose. Finally, “Balance” describes the mode of any unity between (i.e., co-occurrence of) Introduction, Conflict, Creation, and/or Identity. Further, the model borrows Vygotskian theorists’ discovery of activity as defining-and-defined-by learning-and-development and extends the discovery of this unity (and its disunity, behavior) to activity as defining-and-defined-by the five modes (as each is a form of learning; see Figure 1), while behavior is simply defined by them (see above; Newman and Holzman, 2013, p. 46). The actions that support (i.e., create the potential for) activity, and thus learning-and-development, are called “balance tools” [note their places on the border between the Balance sphere and ZPDs (i.e., activity) in Figure 1].

Balance Tools

The balance tools – Purpose, Norms, Goals, and Meaning – are derived from the integration of the five spheres in Figure 1 with the Kolb Experiential Learning Cycle actions: Predict (also referred to as Think; i.e., abstract conceptualization), Try (i.e., active experimentation), Do (i.e., concrete experience), and Reflect (i.e., reflective observation; Kolb, 2014), and they serve as supports between the spheres (i.e., the more developed the balance tools, the more capable the activity or behavior). By taking the Vygotskian view of activity, rather than the individual, as the proper unit of psychological analysis (see Roth and Lee, 2007, Figure 4, p. 198; Newman and Holzman, 2013, p. 52), ICBCI recasts the actions individual| groups engage in as tools (tools-for-results and tools-and-results, depending on the definite conditions) that human activity (and behavior) requires to function. Sometimes these tools are explicit and conscious (i.e., articulated, acknowledged, and intentional), such as when the purpose of the learning activity, the methods engaged in pursuing that purpose, the goals (i.e., objectives) those methods aim to achieve, and the meaning of the resulting experience for that purpose are articulated. Other times they are implicit and subconscious (i.e., assumed, taken for granted, unknown potentially to both teachers and students), as is their negotiation. An example of activity at the conscious level is a project-based learning environment where actions (and their environment| purpose) are co-constructed by both teacher and student. The unconscious level is common in apprenticeships, where shifting balance tools may not be articulated or recorded but are nonetheless evolving through reciprocal activity between the apprentice and the expert. This evolution does not occur in behavior, where the tools are inaccessible to definition by the learner. Note here that these tools (purpose, norms, goals, and meaning) are also postulated to be the “definite conditions”; thus, while they can each define-and-be-defined-by one another in praxis, they need not be, and this is the distinction between activity and behavior, one of the crucial points of the argument presented here.

Given the focus of this article on transformative learning, the balance tools (i.e., conditions) most important for the present discussion are Meaning and Purpose. Or, as Vygotskian theorists consider it, the unity: human-activity-as-meaning-making-as-learning-and-development (Newman and Holzman, 2013, p. 198–199). ICBCI furthers this Vygotskian discovery by clarifying the unity’s definite conditions and, in so doing, defines TL phenomena: TL occurs when Meaning (i.e., the reflective observation of experience, such as an appraisal, judgment, or metaphor) is engaged in as activity (i.e., meaning is made in such a way as to transform meaning-making, i.e., reflection), and that activity transforms Purpose (i.e., the conceptual abstraction of experience into a model or prediction) under those [transforming] definite conditions, which, further, is itself engaged in as activity (i.e., transforms concept-building activity, i.e., thinking/predicting). Thus, the Vygotskian discovery of meaning-making-as-learning-and-development is, in ICBCI’s theory of TL, further elucidated as meaning-making-transforming-purpose-as-learning-and-development (see also Immordino-Yang et al., 2019 for a discussion of this phenomenon from educational neuroscience). It is this meaning-making activity that transforms the purpose of human activity under definite conditions (i.e., the balance tools, including purpose) that ICBCI identifies as transformative learning, in a radically monistic account. This is only a slight clarification of Mezirow’s (2003) point of view (i.e., meaning) that transforms a frame of reference (i.e., purpose), but, as shown below, a crucially important one.

To preview: human activity under definite conditions describes reciprocity between human actions and the conditions that define them; those conditions are balance tools; one of those balance tools is Purpose; and Purpose most powerfully influences the other three tools (Norms, Goals, and Meaning; see Leont’ev, 1981; Friedman, 2021). ICBCI therefore shows how TL, in making Meaning that transforms Purpose that transforms Norms, Goals, and Meaning, can lead to radical and irreversible change in individual| groups within their [transformed] environment| purpose: it transforms points of view (constellations of purpose, norms, goals, and meaning), habits of mind (predicting, trying, goal-setting, and reflecting processes), and frames of reference (the quality and capacity of Introduction, Conflict, Balance, Creation, and Identity activity and behavior). In other words, it is a radically monistic account of TL. The goal of the following section is to suggest that the most intractable issues of TL research and practice can be at least chipped away at, if not alleviated, by making exactly this relationship (meaning-making-transforming-purpose-transforming-conditions) clear and amenable to observation, without the need for mountains of time and data to do so.

Theoretical and Real Obstacles to Current Transformative Learning Theory

Despite 30 years of work, theoretical progress on TL has stalled in the same places (Cranton and Taylor, 2013; Howie and Bagnall, 2013; Dirkx et al., 2018): what exactly is being transformed, what are the predictable consequences of this transformation, and how is this transformation an example of learning [processes] (i.e., how is transformative learning related to other, non-transformative, forms of learning)? The following section attempts to show how these obstacles can be resolved by a Vygotskian perspective on education and the role of educational psychology. It is not the author’s view that researchers today are unaware of Vygotskian cultural-historical activity theory, but rather that this work and Vygotsky’s life-as-lived are often misinterpreted to fit a dominant, institutionalized concept of the bounds of psychology and its appropriate unit of analysis (the individual, or in less Westernized traditions, the collective). The following is an attempt to return to Vygotsky’s discovery of human-activity-as-meaning-making-as-learning-and-development to show how ICBCI makes TL processes and outcomes observable across sets of conditions (i.e., Purpose, Norms, Goals, and Meaning). First, a brief review of the history of learning research is presented, before describing how ICBCI, following Vygotskian cultural-historical activity theory, articulates TL’s necessary and sufficient conditions as re-organizing [revolutionary] activity.

Before the work of Vygotsky and his contemporaries from the early 20th century became widespread in the West in the 1970s and 1980s, “learning” was first conceived by James et al. (1890), Thorndike (1927), and another Russian psychologist, Pavlov (1957; later championed most strongly by the American, Skinner, 1965), as innumerable stored, Lockean (Locke, 1847) representations of stimulus-response (S-R) links, for which all that mattered was how many times the S-R link had been “occasioned.” Later, thanks publicly to Chomsky (1959), and privately to numerous passionate researchers (e.g., Newell and Simon, 1972; Neisser, 2014, among many others), the quality, rather than solely the quantity, of information processing was discovered as a factor in determining learning processes and outcomes. Only very recently in the West have biopsychosocial approaches to educational psychology and cognitive neuroscience (those that consider the biophysiological and social environment of learning in the process of research and practice) argued strongly, with tantalizing neural and behavioral evidence, that while the information processing approach was certainly an improvement over behaviorism’s S-R links, it still lacks much in the way of explaining learning phenomena, and is improved in this capacity by accounting for the motor, emotional, and social (i.e., the nature of the group and individual relationships present) contexts of the learning environment, and the surrounding socio-cultural-historical environment (i.e., the dominant culture(s) present; see Bandura, 1997 for a classical argument; Barsalou, 2008, Barrett, 2017, and Immordino-Yang et al., 2019 for modern perspectives).

During all this time, Vygotsky and his contemporaries published their work and passed away, largely ignored by the West. Also during this time, Mezirow (1978) began his research program to investigate a particularly important sort of learning that seems to transform the very people who experience it, rather than simply provide another tool in their toolbelt (i.e., the learning experience re-organizes the entire structure of what they already know, rather than adding a new tool that simply applies or extends the structure already known). It is relevant to consider what Mezirow would have thought, or what direction his work would have taken, had Vygotsky’s work been better known in his time, but more pertinent to the present goal is how Mezirow’s work can be understood in terms of the radical monism championed by Vygotskian scholars. In other words, Mezirow’s classic 10 steps of TL (and, as treated by Mezirow’s descendants and colleagues, learning more generally) will be described as a dynamic emergent process in ICBCI, before describing the concrete predictable consequences of TL according to ICBCI. First, a broad overview of learning as conceived by TL researchers generally is presented in dialogue with Vygotskian ideas.

What Is Learning?

Though Mezirow’s actual attention to non-transformative learning waxed and waned over his career, it was clear to him that TL was a kind of learning separable from other kinds of learning (Mezirow, 2000). Particularly, TL according to Mezirow is a form of Habermas’s (1984) “communicative learning,” as compared to “instrumental” (learning to manipulate or control the environment or other people to enhance performance), “impressionistic” (learning to enhance one’s impression on others), or “normative” (learning oriented to common values and a normative sense of entitlement to expect certain behavior) learning (Mezirow, 1997). “Communicative learning,” or learning to understand the meaning of what is being communicated, is exactly what Vygotskian theorists had in mind when describing the unity of imitation-as-revolutionary-activity-as-learning-and-development, when they described how children imitate adults (and peers) in performing the activity they observe in others – and this is crucial – only the activity, and not the behavior (Bloom et al., 1974; Newman and Holzman, 2013, p. 56). In other words, Mezirow (and Habermas) are pointing at the tip of the Vygotskian iceberg: that learning to understand meaning is necessarily communicative, necessarily an activity between, rather than of, individuals.

For Vygotskian cultural-historical activity theory, instrumental, impressionistic, and normative learning are not learning-leading-development, or revolutionary activity, but rather behavior, or development-leading-learning (also just plain “acting”; Newman and Holzman, 2013, p. 176; i.e., operations, Roth and Lee, 2007, p. 202). Behavior, and acting out behaviors, despite any learning’s newness to a given individual| group, will not enable them to maintain that behavior outside the present conditions, unless those conditions are recreated for that individual| group. Vygotskian scholars contrast this kind of learning with the revolutionary activity of learning-leading-development, in which individuals can transfer that activity to new sets of conditions (within limits; see Lave, 1988; Bransford and Schwartz, 1999; and Immordino-Yang et al., 2019 for discussions).

For Mezirow (1991), TL occurs when a new “point of view” (as the result of cumulative progression toward that point, or of a sudden situational experience of it) changes not just a present habit of mind but the over-arching and determining frame of reference. This is contrasted with non-transformative forms of learning, in which the new point of view does not change anything (i.e., a new point of view), or changes only a habit of mind or other points of view (i.e., both new meaning schemes), rather than the entire frame of reference (i.e., a new meaning perspective; 1991, p. 93–94; also described as content, process, and premise reflection, respectively, p. 107–108). For example, in the ethnocentric example earlier, a new point of view (e.g., “that person of a different ethnicity is more intelligent than I thought”) experienced within an old habit of mind (e.g., “persons of different ethnicities are less intelligent”) may lead a learner to adopt only a new habit of mind (e.g., “my ethnicity’s peoples are more intelligent for other reasons than ethnicity, like our culture”), or a new frame of reference (e.g., “all individuals exist on the same scale of intelligence regardless of origin”). Only the latter is an example of TL [related: Bateson’s (1972) “Learning III,” p. 293].

This is what makes TL irreversible for Mezirow: the shift in the frame of reference as a result of the new point of view, because the new frame of reference transfers to new and old points of view (e.g., “many people I thought inferior before are actually not”). For Vygotskian theory, this (tool-and-result activity: points of view defining-and-defined-by frames of reference) is what makes activity learning-and-development, or adapting to history, and behavior simply acting, or adapting to society (i.e., the adoption of a point of view, or a habit of mind, without understanding why, or how, i.e., without having access to the conditions; Newman and Holzman, 2013, p. 187–188). In this sense, behavior can be thought of as the expression of a point of view or the expansion or application of a current habit of mind. Activity, on the other hand, is either the adoption of a new habit of mind (when norming, goal-setting, and meaning-making processes are accessible and changed in accessing them) or a new frame of reference (when meaning-making-as-activity defines-and-is-defined-by purpose, which then transforms habits of mind: norming, goal-setting, and meaning-making processes), resulting in reorganized points of view (constellations of norms, goals, meaning, and purpose).

For both the TL and CHAT research programs, there is something unique about dynamic and reciprocal activity between humans and their conditions, and ICBCI attempts to articulate this uniqueness by clarifying what a “point of view” and a “frame of reference” are (meaning and purpose, respectively), how they prime transformative experiences (meaning-making-transforms-purpose), and the product that is transformed (meaning-making-transforming-purpose-transforming-conditions). In the case of a TL experience relative to ethnocentrism, an old purpose (e.g., “maintain assumption of natural superiority over other humans”) is transformed through a meaning-making process (see above) into a new one (e.g., “recognize common humanity regardless of ethnicity”), which then proliferates through new norms, goals, and meanings (i.e., conditions and points of view), and new norming, goal-setting, meaning-making, and purpose-identifying processes (i.e., balance tools, or habits of mind). A final brief note on how learning is viewed in TL theory generally will help interpret this claim (and Figure 1) before elaborating on transformative learning in ICBCI terms.

TL researchers since Mezirow have taken diverse directions in defining learning, and transformative learning as a special case thereof (for a relevant dialogue on divisions within TL research itself, see Dirkx et al., 2006). Probably the most well-known taxonomy of this work within the TL literature (besides the Mezirow/Habermas taxonomy above) is described in detail by Sessa et al. (2011), who, working in a team-learning space, define TL as:

Re-shaping or altering the team’s purpose, goals, structure, or processes…and requires experiencing disorientation and then reorientation for an entirely new direction for growth…produc[ing] a new team, structure, strategy, goals, and identity (p. 149).

Sessa et al. (2011) anchor this definition of TL by comparing Transformative Learning to Adaptive Learning (“reacting almost automatically to stimuli to make changes in process and outcome as a coping mechanism”) and Generative Learning (“proactively and intentionally applying new skills, knowledge, behaviors, and interaction patterns to improve…performance”) processes (p. 149). When activity, rather than the individual vs. group distinction, is taken as the appropriate unit of analysis, this tool-and-result aspect of TL, and the tool-for-result character of adaptive and generative learning, clearly emerge. This suggests that for Sessa et al. (2011), adaptive and generative learning are forms of behavior [according to Newman and Holzman (2013)], and transformative learning is a form of activity (as defined by CHAT; Roth and Lee, 2007). ICBCI disagrees.

Relying on Vygotskian cultural-historical activity theory, ICBCI defines learning as increasing capacity to act on a specified purpose under definite conditions. Note the use of “act” here, rather than activity or behavior. The increased capacity is independent of any definite future reciprocity between actions and conditions. Some learning increases capacity for activity, some for behavior, and some for both. Some learning is learning-leading-development, and some learning is development-leading-learning. A key insight that follows from this formulation is that all types of learning can be activity (tool-and-result) or behavior (tool-for-result), including TL (see above, and Figure 1).

To be clear, the transformative learning process that Mezirow (1991) describes is, to ICBCI, transformative learning-and-development (i.e., activity, or more specifically: meaning-making-transforming-purpose-transforming-conditions), but this is not the only kind of TL, because sometimes individual| groups “act out” TL and are thus able to recreate the consequences of that TL experience in those conditions, but not in others (Newman and Holzman, 2013, p. 176). Their transformed frame of reference, in the case of Identity as behavior, is relevant only to the environment| purpose it was transformed in, and not to others (e.g., being able to take a humanistic meaning perspective, or purpose, with a group of colleagues after an anti-racist workshop but reverting to egotistic perspectives with family). Remember, for ICBCI, conditions and balance tools are essentially the same; what matters is whether they are accessible to individual| groups’ actions. If they are, activity results; if not, behavior. The theoretical existence of TL activity does not preclude that of TL behavior [the “acting out,” or unaware pretending of transformation, in Newman and Holzman’s (2013) language, p. 176]. TL behavior is meaning-making-that-transforms-purpose (but is not transformed, or to use Vygotskian language, reorganized, by it). In other words, the environment| purpose is transformed, but the individual| group’s capacity for Identity is not. This is also akin to Mezirow’s (1991) point of view that changes a habit of mind (in this case, how purpose is identified, or “process reflection,” p. 107–108), but not the frame of reference (how identified purpose establishes conditions, or “premise reflection,” p. 108). Before describing this difference in detail, it will be helpful to review the TL literature’s response to the second persistent obstacle: what is transformed.

What Is Transformed?

As mentioned above, for Mezirow (2008), problematic frames of reference are what is transformed. Also referred to as meaning perspectives, and defined as the “structures of culture and language through which we construe meaning by attributing coherence and significance to our experience,” these frames of reference are transformed when those structures encounter a “disorienting dilemma,” instigating a practical-critical process of reflection, identification, communication, and integration of changes in perception and action that culminate in a novel point of view from which an entirely re-organized frame of reference propagates (p. 92). This cohering and signifying structure of experience is, for human activity, Purpose, or more specifically, tool-and-result activity-as-identifying-purpose-transforming-conditions. The relationships between and among the individual’s points of view are themselves reorganized to reflect a new meaning perspective (i.e., frame of reference). For ICBCI, Purpose constructs (i.e., is) the frame of reference, and is also the primary condition for the activity [or behavior] engaged in, framing every other condition (Norms, Goals, and Meaning). This formation of perspective (i.e., Purpose) for human activity sets the stage for transformative experiences, serving as the landmark for meaning-making activity to transform and, in so doing, transforming every other condition for the individual| group. Purpose has a special place in ICBCI, and in human activity (Leont’ev, 1981; Friedman, 2021).

No matter the typology of the transformation itself (or the typology of its outcomes), it can be described by ICBCI. Taylor (1997, 2008) identifies eight types of TL processes (see section “Introduction”). ICBCI can anchor every kind under the umbrella of a relevant and articulated Purpose of human activity under definite conditions, without the need for eight categories overlapping to different extents with one another. To simultaneously echo and update Taylor (2008), the exciting part of the diversity offered by the Purpose concept is that it emulates the diversity of human learning-and-development, and thus helps us get that much clearer on the more fundamental question of what exactly develops – the capacity for [revolutionary] activity (itself enabling behavior) within the reach of present definite conditions – and how that development occurs: activity-as-meaning-making-transforming-purpose. When conditions (Purpose, Norms, Goals, and Meaning) are such that individual| groups can change their conditions through their actions (i.e., engage in activity), and one of those actions is a meaning-making process that transforms their purpose in that environment| purpose (transforming the rest of their conditions), we can say that TL, as Mezirow (2008) described it, occurs.

The infinite number of purposes that may be identified (and their context-bound necessity) provides scope and structure to TL research by enabling taxonomic efforts to focus on the nature of the change itself, rather than on its antecedents and consequences. Thus far, the codification effort in TL has proliferated in walled gardens within the taxonomy, each claiming a unique kind of transformation (e.g., psychocritical, cultural-spiritual, race-centric, etc.), for which the list of necessary and sufficient conditions for a “disorienting dilemma,” “critical reflection,” or “imaginative engagement” to occur has rarely been simplified, and has far more often compounded on itself in the effort to answer critics and broaden the umbrella that TL theory covers (e.g., Taylor, 2008; Hoggan, 2016b).

In contrast to these efforts to categorize disparate content, ICBCI focuses on the dynamic and continuous process of emergent transformational activity (or behavior), making clear what exactly is transformed: Purpose (and as a result, the balance tools, as well as the capacity of their interactions: Introduction, Conflict, Balance, Creation, and Identity); how it is transformed: tool-and-result meaning-making-transforming-purpose; and what enables, or instigates, this activity: a set of conditions (i.e., purpose, norms, goals, and meaning) that do not have the capacity to fulfill the current Purpose. This can be mapped onto the model and compared to other forms of learning-and-development (i.e., activity, not behavior) that are not transformative (see Figure 1): perceptual activity transforms the Norming process through trying new norms (based on present purpose); adaptive activity transforms the Goal-setting process through setting new goals (based on present norms and purpose); generative activity transforms the Meaning-making process through making new meaning (based on set goals, norms, and purpose); and finally, transformative activity transforms the Purpose-identification process through identifying new purpose (based on made meaning, in pursuit of a goal, through norms, hinged on purpose), which, due to the environment| purpose unity (i.e., the conditions-defining nature of purpose), transforms perceptual, adaptive, and generative activity, or the relationships between norms, goals, meanings, and their formation processes. In this way, ICBCI’s definition of learning can be further elucidated as taking the shape of either (a) learning-and-development, or transferable learning (to new sets of definite conditions), when engaged in as activity; or (b) development-leading-learning, or non-transferable learning, when engaged in as behavior. This is a very Vygotskian idea: that the development we are in search of in the process of education is that which can be carried around, and this is only made possible when the learning individual| group has access to reshaping (through activity) the conditions of their environment| purpose, or what Vygotsky described as the ZPD (Vygotsky, 1978). See Table 1 for examples of activity and behavior for each kind of learning.


Table 1. Examples of activity vs. behavior for various learning types.

What makes TL truly unique in the pantheon of learning phenomena is its emphasis on changes that change everything else. Again, ICBCI models exactly this, as it is only through transforming Purpose, through transformative activity, that one “re-Introduces” one’s “entire self” (purpose in this set of definite conditions) to a new set of definite conditions from a new meaning perspective, or purpose. Further, for the purposes of TL research, that newly transformed purpose can be anchored to a set of meanings before, after, and within any particular meaning-making process, the changes in those meanings can be identified, and any resulting changes in activity or behavior under new conditions (i.e., new norms, goals, meaning, and purpose) can be integrated and observed to build a theory of what potentiates TL experiences. Finally, the complexity of any given environment| purpose (its depth, breadth, and coherent integration, or rigidity| chaos) can be interrogated with systematic clarity compared to the transformed environment| purpose. Before an illustration of this potential, the TL predictions ICBCI makes warrant elaboration.

What Are the Predictable Consequences of Transformation?

The final stubborn stumbling block for TL theory and practice that ICBCI can help resolve concerns the predictable consequences (i.e., evidence) of transformative learning. Here, the challenge is collecting practical and observable data from TL phenomena. Because it has not been clear what the antecedents to transformation systematically are (other than a “disorienting dilemma”), data are typically sampled from settings considered dramatic enough to make TL likely (e.g., breast cancer survivors, Hoggan, 2014; outdoor adventure education, Meerts-Brandsma et al., 2020; developing cultural competency as members of historical majorities, Taylor, 1994; and the women’s liberation movement, Mezirow, 1978), rather than by observing TL under definite conditions where TL is theoretically potentiated for some actions, but not all actions, and where the hypotheses determining which are tested empirically.

In other words, in TL’s fragmented theoretical landscape, researchers can study who transforms when they do transform, why they transformed, and what the consequences of their transformation are, but they cannot study who does not transform, or which actions or conditions prime transformation and which do not, because the experimental contexts engaged assume that transformation is inevitable for at least someone under those conditions (and researchers focus on them). The limitations of these contexts restrict researchers’ ability to understand the boundaries of what TL is and what it is not (Nohl, 2015). TL research today cannot study why certain actions do not lead to transformation unless one or more of Mezirow’s 10 steps did not occur, or the active frame of reference was not “problematic,” but these are vague and insufficient negative definitions [Apte’s (2009) dialectical model is an interesting practical-critical exception that has not been noticed much by TL researchers]. Further, the theoretical models available for collecting systematic data on a TL experience (i.e., transformative activity and its consequents) remain sparse, and they require an intensive amount of qualitative data collection and analysis to draw conclusions (see Harder et al., 2021 for a relevant discussion and an attempted technological solution resulting in similar limitations). These limitations in scope and efficiency can be overcome by conducting TL research based on ICBCI.

Regardless of the setting observed, TL outcomes are often categorized in terms of their depth, breadth, and stability (e.g., Hoggan, 2016b). ICBCI further clarifies “stability” as “integration” (differentiation and linkage; Siegel, 2001), or increasingly greater capacity of the modes of activity (i.e., ICBCI; Friedman, 2021). Every TL experience, according to ICBCI, leads to a sweeping activity period in which meaning-making-transforms-purpose, and that made meaning propagates through transforming purpose, which then re-organizes norms, goals, and meanings related to that environment| purpose. This is what ICBCI means by a transformed re-introduction to definite conditions. Those definite conditions are defined by the identified purpose. The introduction (or any other) mode can be one of either activity or behavior. In both cases, the perceptual learning (or any mode of learning) and the formation of norms (or any balance tool) are based on, or related to, the environment| purpose. Engaging in activity (rather than behavior) in any form of learning extends the environment| purpose to which that learning will transfer. However, it is only when the introduction mode (or any mode) is engaged in as activity, as the direct result of the Identity mode as activity, that there is evidence of TL (i.e., if perceptual, adaptive, and generative activity transforms as a result of meaning-making-transforming-purpose). If perceptual, adaptive, and generative activity (and behavior) is a spontaneous propagation of that meaning-making-transforming-purpose process, there is evidence of TL. When there is evidence of TL, ICBCI predicts that, in Siegel’s (2010) language, the [transformed] definite conditions (purpose, norms, goals, and meanings) will be more flexible, adaptive, coherent, energized, and stable across and within the environment| purpose (p. 69–71). In essence, their capacity for [revolutionary] activity (as opposed to [societally expected] behavior) will be greater: challenges that used to be more difficult will now be less so, and achievements that were impossible before will now be possible. Wondrously, this claim is, of course, an empirically testable one, because we can anchor on each environment| purpose and test each individual| group within it.

Thus, the 30-year-old questions (what are the consequences of transformation, and how do they differ from the consequences of non-transformation?) can finally be answered. The consequences of transformation are contained in the dialectical unity Meaning-making| Purpose| Norms| Goals| Meaning (i.e., a meaning-making process that transforms purpose results in a new purpose, or meaning perspective, that requires transformations of Norms, Goals, and Meanings, and their formative processes, to align with the transformed Purpose). This means that no matter the content of the outcomes (e.g., Hoggan, 2016b, p. 70), they can be described in terms of Purpose and its transformation under definite conditions (of norms, goals, and meaning). This focus on Purpose allows individual| groups (be they researchers or learners) to identify specific changes relevant to history (i.e., their activity), rather than to society (i.e., their behavior; acting out what is expected). Additionally, each purpose can be seen both as what is transformed (from the previously identified purpose to the newly identified one) and as the outcome of that transformation (new purpose propagated through new norms, goals, and meanings, as well as new norming, goal-setting, and meaning-making processes). An identical formulation of the consequences of TL is “triple-loop” learning (Peschl, 2007), or that which re-organizes itself, is re-organized by, and re-organizes its container in the process of its performance. The consequences of non-transformative learning-and-development (i.e., perceptual, adaptive, and generative activity) correspond to “double-loop” learning (Argyris, 1977), or that which re-organizes itself in the process of its performance. These are the predictable consequences and key pieces of evidence that TL theory has been searching for: new meanings re-organizing purpose, which then re-organizes norming, goal-setting, and meaning-making processes to align with the historical direction of activity for each individual| group experiencing TL.

Thus far, this theoretical proposal has suggested that TL theory has faced the same obstacles since Mezirow’s formulation of the topic: a lack of clarity on what exactly learning is, what transformative learning specifically transforms, and what the predictable consequences of these transformations are. These obstacles have kept TL research largely in a qualitative, case-study space, only tentatively inching forward into experimental and generalizable methods before a stringent criterion for sufficiently dramatic change harries researchers and hampers further progress (Cranton and Taylor, 2013).

Introduction-Conflict-Balance-Creation-Identity offers the following resolutions: (1) learning is conceived of in Vygotskian terms as tool-and-result activity or tool-for-result behavior. While the latter is still learning, it is not capable of re-organizing its conditions, only of being defined by them, and thus cannot be transformative activity (or transferable to new sets of conditions), though it might be the “acting out” of transformative behavior (in which case we would expect meaning to shape purpose, but not purpose to re-shape meaning, losing any holistic transformation, or “breaking the loop”); (2) TL is the transforming of purpose through meaning-making processes that are themselves transformed through transforming that purpose of activity under definite conditions. It is the unity, meaning-making-transforming-purpose, that is itself transformed during TL activity. Finally, (3) the predictable consequences of transformation are (as discovered so far) transformed Norming, Goal-setting, and Meaning-making activity (tool-and-result change, and their ICBCI interactions) related to Purpose-identifying activity for the environment| purpose. Given these tool-and-result methods for investigating TL, researchers can be more efficiently equipped to observe the necessary and sufficient conditions for TL for every purpose under definite conditions. An example of what this could look like follows, before final thoughts and empirically testable hypotheses based on ICBCI for TL are presented.

An Illustration of Transformative Learning According to Introduction-Conflict-Balance-Creation-Identity

Outdoor adventure education (OAE) is so well known for its TL potential that a large part of the field’s research and practice is focused on TL theory and outcomes (e.g., Meerts-Brandsma et al., 2020). Briefly, OAE typically involves a stable group of learners spending a significant amount of time together engaged in challenge-based problem solving (usually, but not always, outdoors in nature). The significance of each element of these conditions can hardly be overstated: the group primes dialogue; the environmental challenge primes practical-critical activity; and the significant time together primes reflection and conceptualization. Typical TL examples in these environments occur when individuals come to see themselves as more capable and competent as a result of overcoming an obstacle they thought impossible for them to overcome (usually following a challenge they saw themselves as incapable of accepting, but then, through activity, through imitation-learning-leading-development in a ZPD, they realize they are actually quite capable; Newman and Holzman, 2013, p. 176). This new point of view, that they are more capable than they realized, propagates through their frame of reference (who they are as a person, what they as a person are capable of) and habits of mind (responding “oh, I can do this” to a tall tower to climb or a long hike instead of “get me out of here”) across contexts, or sets of definite conditions (i.e., feeling capable of public speaking as a result of completing a long hike, not because long hikes make you good at public speaking, but because the frame of reference, the individual’s competence judgment, has re-organized to prime confidence in the face of challenge rather than insecurity). While basic, this is, in a general sense, the archetypical TL trajectory in Mezirow’s (2008) language.

Introduction-Conflict-Balance-Creation-Identity can help define what is observed in this example and what can be predicted about similar purposes under definite conditions. The “disorienting dilemma” can be further clarified in terms of Norms (e.g., the normative belief: “I am incapable of doing things that scare me”) that did not support the capacity of articulated Goals (e.g., “I am going to climb this tower”), which instigated Meaning-making activity that transformed Purpose (e.g., “If I can climb this tower, I was wrong about being incapable; I wonder what else I thought myself incapable of that I might actually be quite able to do…”). In this case, Purpose has shifted from, for example, “I am here to climb towers” to “I am here to increase my self-confidence, in climbing towers as well as in doing many other things.” Each of these four conditions can be identified prior to and in the moment of disorientation, what ICBCI refers to as imbalance, to interrogate the dynamic interrelationships that prime TL for every purpose (in our current example, what is stated above, or perhaps “to increase feelings of competence in the face of challenges”). Importantly, this shift in purpose is only possible in activity; in behavior, these conditions cannot be accessed or negotiated, and the purpose likely takes the form of “to climb a tower as a group” (Newman and Holzman, 2013, p. 194–195).

Introduction-Conflict-Balance-Creation-Identity can also further clarify the shift in meaning perspective by anchoring on the pre-transformational meanings and interrogating meanings post-transformation, or during TL activity, to better explain the mechanisms of TL (i.e., Norms-Goals-Meanings in conflict with Purpose under conditions of activity, which is to say Purpose-Norms-Goals-Meaning constellations that are accessible to the learner). This allows researchers and educators to peer inside the black box of “shifts in meaning perspective.” In this case, pre-dilemma meanings had to do with maintaining norms, related to the purpose of competence, that interpret the environment as threatening, overwhelming, and beyond the competence of the individual| group. Since the hypothetical post-transformation norms are observed as “interpret challenging environment| purposes as welcoming and tantalizing,” the TL activity itself (i.e., the during-imbalance meanings) can be interrogated for change processes with clarity. For example, imagine that in the moment of struggle, the individual| group [potentially] undergoing transformation is probed for their current meaning of the environment| purpose; this is surely more reliable and less expensive than extensive retrospective interviews.

Purpose, and its transformation – in this case, first to increase competence by going on an OAE trip, and then to feel competent in the face of challenge – is what helps anchor TL theory, research, and practice according to ICBCI. What norms were meaningfully related to the imbalanced purpose, and what norms are now meaningfully related to the balanced purpose? What goals? What meanings? Was there Norming, Goal-setting, and Meaning-making activity preliminary to Purpose-identifying activity, or only norming, goal-setting, and meaning-making behavior? These are empirically testable hypotheses. As the articulated purpose changes, and as activity supplants behavior, hypotheses can also be articulated as to the direction of that purposeful change, and the effect of its direction and magnitude on consequent Norming, Goal-setting, Meaning-making, and Purpose-identifying activity processes. These empirical hypotheses can then help potentiate activity that reorganizes TL (and ICBCI) theory itself, to understand the lifespan of TL, the activity under definite conditions that creates capacity for TL, and the resulting impact on the lives-as-lived of individual| groups who experience TL. Nothing is more important in a world with so much integral change to make so quickly.

Vygotsky and his descendants’ discovery that all [revolutionary] learning-and-development is a dialectical unity (meaning-making-as-learning-and-development) necessarily embedded in history (human activity under definite conditions) helps us, through a synthetic meta-theory of learning (ICBCI), organize the classical features of TL and clarify exactly what they are: disorienting dilemmas are threads of Norming, Goal-setting, and Meaning-making activity (or behavior) incapable of fulfilling an articulated (or implicit) Purpose (i.e., they are imbalanced), instigating Purpose-identification activity, or tool-and-result meaning-making-transforming-purpose. Dialogue, imaginative engagement, and critical reflection are more integrated Norming, Goal-setting, and Meaning-making activity propagating from Purpose-identification activity, and not behavior; and transformed frames of reference are more capable and complex, which is to say deeper, broader, and more integrated meaning perspectives, or Purpose under definite conditions. A final example from development: consider a baby’s environment| purpose of understanding utterances shifting to the application of utterances in communication. It is an open question whether this is identical to TL in the adult context (i.e., whether the transforming purpose is instigated by meaning-making activity) or whether it is simply behavior. ICBCI-based experiments can help sort this out by pursuing methods to probe the concept of “meaning-making” itself, and how the activity of it develops.

Future Directions

With ICBCI and its tool-and-result methods covering the entire TL trajectory, TL researchers and practitioners can now readily articulate sets of concrete empirical hypotheses. Some examples are summarized in Table 2; many others appear in the preceding text. The hope is that this clarification of TL theory and concepts will enable researchers to interrogate deeper relationships between activity and behavior; between perceptual, adaptive, generative, and transformative learning; and, most importantly, between activity and exactly what develops as humans digest experience. Additionally, some classic lines in the sand for TL researchers, such as whether TL is a qualitative or quantitative phenomenon, an individual or group process, or has social or individual sources of disorientation, can be wiped away by recognizing activity, rather than the individual, as the appropriate unit of analysis. Researchers can then specify the conditions of that activity (i.e., Purpose, Norms, Goals, and Meanings) in terms of both its qualitative and quantitative characteristics, its differences when observed in individuals, groups, entire cultures, or some individuals in groups or cultures but not others, and how these levels interact. For example, an open empirical question is whether transformational learning on the part of a leader primes transformational learning across the culture that they lead (Friedman, 2021). This shift in the unit of analysis is itself revolutionary activity in the service of psychology-in-history’s Purpose-identification: to describe and predict human activity under definite conditions, rather than to describe and objectify a human, as humans are not objects, in their transformation or otherwise, and do not behave well or act naturally when studied or interacted with as such. Recognizing activity as the appropriate unit of analysis opens the door to agency on behalf of those studied in the context of transformation, for which agency is crucial according to Vygotskian theory. The purpose above has been to show how recognizing agency as such can move TL beyond the stumbling blocks currently keeping it on a treadmill. In Vygotsky’s words, “the method is simultaneously prerequisite and product, the tool and the result of the study” (1978, p. 65), and it is time that transformative learning research methods engage in transformative activity themselves, rather than simply attempting its description. Cultural-historical activity theory, and ICBCI as a revolutionary progression of it, provide one such option for doing so.


Table 2. ICBCI empirical transformative learning (TL) questions and hypotheses.

Data Availability Statement

The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.

Author Contributions

The author confirms being the sole contributor of this work and has approved it for publication.

Conflict of Interest

The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Apte, J. (2009). Facilitating transformative learning: a framework for practice. Aust. J. Adult Learn. 49, 169–189.


Argyris, C. (1977). Double loop learning in organizations. Harv. Bus. Rev. 55, 115–125.

Bakhtin, M. (1993). Toward a Philosophy of the Act. Austin, TX: University of Texas Press.

Bandura, A. (1997). Self Efficacy: The Exercise of Control. New York, NY: W.H. Freeman and Co.

Barrett, L. F. (2017). How Emotions are Made: The Secret Life of the Brain. Boston, MA: Houghton Mifflin Harcourt.

Barsalou, L. W. (2008). Grounded cognition. Annu. Rev. Psychol. 59, 617–645. doi: 10.1146/annurev.psych.59.103006.093639


Bateson, G. (1972). “The logical categories of learning and communication,” in Steps to an Ecology of Mind: Collected Essays in Anthropology, Psychiatry, Evolution, and Epistemology , ed. G. Bateson (San Francisco, CA: Chandler Publishing Company).

Bloom, L., Hood, L., and Lightbown, P. (1974). Imitation in language development: If, when, and why. Cogn. Psychol. 6, 380–420. doi: 10.1016/0010-0285(74)90018-8


Boyer, N. R., Maher, P. A., and Kirkman, S. (2006). Transformative learning in online settings: the use of self-direction, metacognition, and collaborative learning. J. Transform. Educ. 4, 335–361. doi: 10.1177/1541344606295318

Bransford, J. D., and Schwartz, D. L. (1999). Chapter 3: rethinking transfer: a simple proposal with multiple implications. Rev. Res. Educ. 24, 61–100. doi: 10.3102/0091732X024001061

Burden, K., and Atkinson, S. (2008). “The transformative potential of the DiAL-e framework: Crossing boundaries, pushing frontiers,” in Proceedings of ASCILITE - Australian Society for Computers in Learning in Tertiary Education Annual Conference 2008 , Melbourne.

Carrington, S., and Selva, G. (2010). Critical social theory and transformative learning: evidence in pre−service teachers’ service−learning reflection logs. High. Educ. Res. Dev. 29, 45–57. doi: 10.1080/07294360903421384

Chomsky, N. (1959). Verbal behavior (a review of Skinner’s book). Language 35, 26–58. doi: 10.2307/411334

Cole, M. (1995). “Socio-cultural-historical psychology: Some general remarks and a proposal for a new kind of cultural-genetic methodology,” in Sociocultural Studies of Mind , eds J. V. Wertsch, P. del Río, and A. Alvarez (Madrid: Cambridge University Press). doi: 10.1017/CBO9781139174299.010

Cranton, P., and Taylor, E. W. (2013). A theory in progress? Issues in transformative learning theory. Eur. J. Res. Educ. Learn. Adults 4, 35–47. doi: 10.3384/rela.2000-7426.rela5000

Dirkx, J. M., Espinoza, B. D., and Schlegel, S. (2018). “Critical Reflection and Imaginative Engagement: Towards an Integrated Theory of Transformative Learning,” Paper Presented at the Adult Education Research Conference , Norman, OK.

Dirkx, J. M., Mezirow, J., and Cranton, P. (2006). Musings and reflections on the meaning, context, and process of transformative learning: a dialogue between John M. Dirkx and Jack Mezirow. J. Transform. Educ. 4, 123–139. doi: 10.1177/1541344606287503

Friedman, J. (2021). “ICBCI: An Integrated Model of Group and Individual Development for Learning Facilitators,” in Proceedings of the Learning Ideas Conference , Cham. doi: 10.1007/978-3-030-90677-1_10

Goldstone, R. L. (1998). Perceptual learning. Annu. Rev. Psychol. 49, 585–612. doi: 10.1146/annurev.psych.49.1.585

Habermas, J. (1984). The Theory of Communicative Action: Vol.1, Reason and Rationalization in Society. Boston: Beacon Press.

Harder, M. K., Dike, F. O., Firoozmand, F., Des Bouvrie, N., and Masika, R. J. (2021). Are those really transformative learning outcomes? Validating the relevance of a reliable process. J. Clean. Prod. 285:125343. doi: 10.1016/j.jclepro.2020.125343

Hegel, G. W. F. (1991). Hegel: Elements of the Philosophy of Right. Cambridge: Cambridge University Press.

Hoggan, C. (2014). Insights from breast cancer survivors: the interplay between context, epistemology, and change. Adult Educ. Q. 64, 191–205. doi: 10.1177/0741713614523666

Hoggan, C. (2016b). Transformative learning as a metatheory: definition, criteria, and typology. Adult Educ. Q. 66, 57–75. doi: 10.1177/0741713615611216

Hoggan, C. (2016a). “Bringing clarity to transformative learning research,” in Paper Presented at the Adult Education Research Conference , Manhattan, KS. doi: 10.1057/978-1-137-55783-4_3

Holzman, L. (2006). What kind of theory is activity theory? Introduction. Theory Psychol. 16, 5–11. doi: 10.1177/0959354306060105

Howie, P., and Bagnall, R. (2013). A beautiful metaphor: transformative learning theory. Int. J. Lifelong Educ. 32, 816–836. doi: 10.1080/02601370.2013.817486

Immordino-Yang, M. H., Darling-Hammond, L., and Krone, C. R. (2019). Nurturing nature: How brain development is inherently social and emotional, and what this means for education. Educ. Psychol. 54, 185–204. doi: 10.1080/00461520.2019.1633924

James, W., Burkhardt, F., Bowers, F., and Skrupskelis, I. K. (1890). The Principles of Psychology. London: Macmillan.

Kitchenham, A. (2008). The evolution of John Mezirow’s transformative learning theory. J. Transform. Educ. 6, 104–123. doi: 10.1177/1541344608322678

Kolb, D. A. (2014). Experiential Learning: Experience as the Source of Learning and Development. Upper Saddle River, NJ: FT press.

Langner, M. (1984). Rezeption der Tätigkeitstheorie und der Sprachtätigkeitstheorie in der Bundesrepublik Deutschland, Teil II [Reception of activity theory and speech act theory in the Federal Republic of Germany, part II]. Dtsch. Sprach. 4, 326–358.

Lave, J. (1988). Cognition in Practice: Mind, Mathematics and Culture in Everyday Life. Cambridge: Cambridge University Press. doi: 10.1017/CBO9780511609268

Leont’ev, A. N. (1978). Activity, Consciousness, and Personality. Englewood Cliffs, NJ: Prentice Hall.

Leont’ev, A. N. (1981). Problems of the Development of the Mind. Moscow: Progress.

Locke, J. (1847). An Essay Concerning Human Understanding. Philadelphia, PA: Kay & Troutman.

Loughlin, M. O. (1992). Rethinking science education: beyond piagetian constructivism toward a sociocultural model of teaching and learning. J. Res. Sci. Teach. 29, 791–820. doi: 10.1002/tea.3660290805

Marx, K. (1967). Capital: A Critique of Political Economy. New York, NY: International Publishers.

Meerts-Brandsma, L., Sibthorp, J., and Rochelle, S. (2020). Using transformative learning theory to understand outdoor adventure education. J. Advent. Educ. Outdoor Learn. 20, 381–394. doi: 10.1080/14729679.2019.1686040

Mezirow, J. (1978). Perspective transformation. Adult Educ. 28, 100–110. doi: 10.1177/074171367802800202

Mezirow, J. (1991). Transformative Dimensions of Adult Learning. San Francisco, CA: Jossey-Bass.

Mezirow, J. (1997). Transformative learning: theory to practice. New Dir. Adult Contin. Educ. 1997, 5–12. doi: 10.1002/ace.7401

Mezirow, J. (2000). “Learning to think like an adult,” in Learning as Transformation: Critical Perspectives on a Theory in Progress , ed. J. Mezirow (San Francisco, CA: Jossey-Bass), 3–34.

Mezirow, J. (2003). Transformative learning as discourse. J. Transform. Educ. 1, 58–63. doi: 10.1177/1541344603252172

Mezirow, J. (2008). An overview on transformative learning. Lifelong Learn. 40–54.

Neisser, U. (2014). Cognitive Psychology: Classic edition. Hove: Psychology press. doi: 10.4324/9781315736174

Newell, A., and Simon, H. A. (1972). Human Problem Solving. Englewood Cliffs, NJ: Prentice-Hall.

Newman, F., and Holzman, L. (2013). Lev Vygotsky (classic edition): Revolutionary Scientist. Hove: Psychology Press. doi: 10.4324/9780203758076

Nohl, A. M. (2015). Typical phases of transformative learning: a practice-based model. Adult Educ. Q. 65, 35–49. doi: 10.1177/0741713614558582

Pavlov, I. P. (1957). Experimental Psychology and Other Essays. New York, NY: Philosophical Library.

Pernell-Arnold, A., Finley, L., Sands, R. G., Bourjolly, J., and Stanhope, V. (2012). Training mental health providers in cultural competence: a transformative learning process. Am. J. Psychiatr. Rehabil. 15, 334–356. doi: 10.1080/15487768.2012.733287

Peschl, M. F. (2007). Triple-loop learning as foundation for profound change, individual cultivation, and radical innovation. Construction processes beyond scientific and rational knowledge. Constr. Found. 2, 136–145.

Quinn, L. J., and Sinclair, A. J. (2016). Undressing transformative learning: the roles of instrumental and communicative learning in the shift to clothing sustainability. Adult Educ. Q. 66, 199–218. doi: 10.1177/0741713616640881

Romano, A. (2017). The challenge of the assessment of processes and outcomes of transformative learning. Educ. Reflect. Pract. 184–219. doi: 10.3280/ERP2017-001012

Roth, W. M., and Lee, S. (2004). Science education as/for participation in the community. Sci. Educ. 88, 263–291. doi: 10.1002/sce.10113

Roth, W. M., and Lee, Y. J. (2007). “Vygotsky’s neglected legacy”: cultural-historical activity theory. Rev. Educ. Res. 77, 186–232. doi: 10.3102/0034654306298273

Sessa, V. I., London, M., Pingor, C., Gullu, B., and Patel, J. (2011). Adaptive, generative, and transformative learning in project teams. Team Perform. Manage. 17, 146–167. doi: 10.1108/13527591111143691

Siegel, D. J. (2001). Toward an interpersonal neurobiology of the developing mind: attachment relationships, “mindsight,” and neural integration. Infant Ment. Health J. 22, 67–94. doi: 10.1002/1097-0355(200101/04)22:1<67::AID-IMHJ3>3.0.CO;2-G

Siegel, D. J. (2010). Mindsight: The New Science of Personal Transformation. New York, NY: Bantam.

Skinner, B. F. (1965). Science and Human Behavior. New York, NY: Simon and Schuster.

Taylor, E. W. (1994). Intercultural competency: a transformative learning process. Adult Educ. Q. 44, 154–174. doi: 10.1177/074171369404400303

Taylor, E. W. (1997). Building upon the theoretical debate: a critical review of the empirical studies of Mezirow’s transformative learning theory. Adult Educ. Q. 48, 34–59. doi: 10.1177/074171369704800104

Taylor, E. W. (2007). An update of transformative learning theory: a critical review of the empirical research (1999–2005). Int. J. Lifelong Educ. 26, 173–191. doi: 10.1080/02601370701219475

Taylor, E. W. (2008). Transformative learning theory. New Dir. Adult Contin. Educ. 2008, 5–15. doi: 10.1002/ace.301

Taylor, E. W., and Cranton, P. (2012). “A content analysis of transformative learning theory,” in Proceedings of the Adult Education Research Conference . Available online at: https://newprairiepress.org/aerc/2012/papers/47

Thorndike, E. L. (1927). The law of effect. Am. J. Psychol. 39, 212–222. doi: 10.2307/1415413

Tuckman, B. W., and Jensen, M. A. C. (1977). Stages of small-group development revisited. Group Organ. Stud. 2, 419–427. doi: 10.1177/105960117700200404

Vygotsky, L. S. (1978). Mind in Society: Development of Higher Psychological Processes. Cambridge, MA: Harvard University Press.

Vygotsky, L. S. (1986). Thought and Language. Cambridge, MA: MIT Press.

Wittgenstein, L. (2010). Philosophical Investigations. Hoboken, NJ: John Wiley & Sons.

Keywords: transformative learning, Vygotsky, ZPD, ICBCI, meta-theory, practical-critical

Citation: Friedman J (2022) How a New Learning Theory Can Benefit Transformative Learning Research: Empirical Hypotheses. Front. Educ. 7:857091. doi: 10.3389/feduc.2022.857091

Received: 18 January 2022; Accepted: 16 May 2022; Published: 01 June 2022.


Copyright © 2022 Friedman. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Joshua Friedman, [email protected]

This article is part of the Research Topic

Transformative Learning, Teaching and Action in the Most Challenging Times

ScienceDaily

How do we learn to learn? New research offers an education

Study on mice reveals importance of ignoring distraction while learning.

Cognitive training designed to focus on what’s important while ignoring distractions can enhance the brain’s information processing, enabling the ability to “learn to learn,” finds a new study on mice. 

“As any educator knows, merely recollecting the information we learn in school is hardly the point of an education,” says André Fenton, a professor of neural science at New York University and the senior author of the study, which appears in the journal Nature . “Rather than using our brains to merely store information to recall later, with the right mental training, we can also ‘learn to learn,’ which makes us more adaptive, mindful, and intelligent.”

Researchers have frequently studied the machinations of memory--specifically, how neurons store the information gained from experience so that the same information can be recalled later. However, less is known about the underlying neurobiology of how we “learn to learn”--the mechanisms our brains use to go beyond drawing from memory to utilize past experiences in meaningful, novel ways.

A greater understanding of this process could point to new methods to enhance learning and to design precision cognitive behavioral therapies for neuropsychiatric disorders like anxiety, schizophrenia, and other forms of mental dysfunction.

To explore this, the researchers conducted a series of experiments using mice, who were assessed for their ability to learn cognitively challenging tasks. Prior to the assessment, some mice received “cognitive control training” (CCT). They were put on a slowly rotating arena and trained to avoid the stationary location of a mild shock using stationary visual cues while ignoring locations of the shock on the rotating floor. CCT mice were compared to control mice. One control group also learned the same place avoidance, but it did not have to ignore the irrelevant rotating locations. 

The use of the rotating arena place avoidance methodology was vital to the experiment, the scientists note, because it manipulates spatial information, dissociating the environment into stationary and rotating components. Previously, the lab had shown that learning to avoid shock on the rotating arena requires using the hippocampus, the brain’s memory and navigation center, as well as the persistent activity of a molecule (protein kinase M zeta [PKMζ]) that is crucial for maintaining increases in the strength of neuronal connections and for storing long-term memory. 

“In short, there were molecular, physiological, and behavioral reasons to examine long-term place avoidance memory in the hippocampus circuit as well as a theory for how the circuit could persistently improve,” explains Fenton.

Analysis of neural activity in the hippocampus during CCT confirmed the mice were using relevant information for avoiding shock and ignoring the rotating distractions in the vicinity of the shock. Notably, this process of ignoring distractions was essential for the mice learning to learn as it allowed them to do novel cognitive tasks better than the mice that did not receive CCT. Remarkably, the researchers could measure that CCT also improves how the mice’s hippocampal neural circuitry functions to process information. The hippocampus is a crucial part of the brain for forming long-lasting memories as well as for spatial navigation, and CCT improved how it operates for months.

“The study shows that two hours of cognitive control training causes learning to learn in mice and that learning to learn is accompanied by improved tuning of a key brain circuit for memory,” observes Fenton. “Consequently, the brain becomes persistently more effective at suppressing noisy inputs and more consistently effective at enhancing the inputs that matter.”

The paper’s other authors were: Ain Chung and Eliott Levy, NYU doctoral students at the time of the research; Claudia Jou, a doctoral student at the City University of New York’s Hunter College and the Graduate Center; Alejandro Grau-Perales and Dino Dvorak, NYU postdoctoral fellows at the time of the study; and Nida Hussain, a student at NYU’s College of Arts and Science at the time of the study.

The research was supported by grants from the National Institutes of Health (R01MH115304, R01NS105472, and R01AG043688).

Story Source:

Materials provided by New York University. Note: Content may be edited for style and length.

Journal Reference :

  • Ain Chung, Claudia Jou, Alejandro Grau-Perales, Eliott R. J. Levy, Dino Dvorak, Nida Hussain, André A. Fenton. Cognitive control persistently enhances hippocampal information processing. Nature, 2021; DOI: 10.1038/s41586-021-04070-5




APS

Improving Students’ Learning With Effective Learning Techniques: Promising Directions From Cognitive and Educational Psychology


Read the Full Text (PDF, HTML)

Some students seem to breeze through their school years, whereas others struggle, putting them at risk for getting lost in our educational system and not reaching their full potential. Parents and teachers want to help students succeed, but there is little guidance on which learning techniques are the most effective for improving educational outcomes. This leads students to implement studying strategies that are often ineffective, resulting in minimal gains in performance. What then are the best strategies to help struggling students learn?

Fortunately for students, parents, and teachers, psychological scientists have developed and evaluated the effectiveness of a wide range of learning techniques meant to enhance academic performance. In this report, Dunlosky (Kent State University), Rawson (Kent State University), Marsh (Duke University), Nathan (University of Wisconsin–Madison), and Willingham (University of Virginia) review the effectiveness of 10 commonly used learning techniques.

The authors describe each learning technique in detail and discuss the conditions under which each technique is most successful. They also describe the students (age, ability level, etc.) for whom each technique is most useful, the materials needed to utilize each technique, and the specific skills each technique promotes. To allow readers to easily identify which methods are the most effective, the authors rate the techniques as having high, medium, or low utility for improving student learning.

Which learning techniques made the grade? According to the authors, some commonly used techniques, such as underlining, rereading material, and using mnemonic devices, were found to be of surprisingly low utility. These techniques were difficult to implement properly and often resulted in inconsistent gains in student performance. Other learning techniques such as taking practice tests and spreading study sessions out over time — known as distributed practice — were found to be of high utility because they benefited students of many different ages and ability levels and enhanced performance in many different areas.

The real-world guidance provided by this report is based on psychological science, making it an especially valuable tool for students, parents, and teachers who wish to promote effective learning. Although there are many reasons why students struggle in school, these learning techniques, when used properly, should help provide meaningful gains in classroom performance, achievement test scores, and many other tasks students will encounter across their lifespan.

About the Authors (PDF, HTML)

Editorial: Applying Cognitive Psychology to Education: Translational Educational Science

By Henry L. Roediger, III

Read coverage of this report on TIME.com


This article confirms my own observations in the classroom. Thank you. I am anxious to read the entire article.


if these techniques may be made available to school authorities and parent then our students can get the best out of it.


This article is great and these techniques are valid and can transform our schools performance. Thanks for thorough researched work.


A lot has to be done by the school authorities to see that these techniques are used to better the education of our country, Cameroon.


Thank you, this article will help me a lot in my teaching career.


teachers as facilitators in classes need to be aware of these techniques through continual learning.


Thanks to the person who had created this link. It is very useful for our QC project and we had used these methods for helping our fellow mates.


These techniques should be used at different levels and teachers should be fully equipped with all techniques otherwise required results can not be achieved


This article’s techniques are very different from what teachers use these days. I need to get this information out to the teachers at my school and try to get them to utilize it in class.


We teach these techniques and other educational philosophies through workshops to students and teachers at schools. We would love to connect if someone is interested in organizing the workshop at a school / society. Please get in touch through [email protected] / http://www.augbrain.com


its an informative article.


Thanks. But please enumerate strategies for the Nursery and Primary levels, for it seems more complex.


I think it was a very informative article, and the learning techniques are very detailed.


Very informative


Mnemonics: An Outdated Device in Learning


How to study effectively? I am a computer science student and have to search online, which on one side is useful, since something missing in one article can be found in another, but it is so time-consuming, and I want to get all the related knowledge, which is not possible.

kindly guide me to “learn how to learn!”

I should concentrate on one topic, first learn the basics, and then go into further details.


This article is very interesting.


thank you so much, great article.


This is an outstanding article


This article is very Informative.


Good way of explaining, and good paragraph to obtain information on the topic of my presentation subject matter, which i am going to present in academy.


I’m curious as to what you define as ‘learning’ because this research seems to focus mostly on memorisation which on Bloom’s taxonomy is a fairly low element of learning, though obviously vital for some subjects.

I wonder if you should make this distinction clearer because this research is useful in identifying the best pedagogies for memorisation and retrieval but may not be the best for different types of learning outcomes.


Thank you, it’s great learning way


This article is important because learning effective learning techniques will help me to find the right technique to be successful in my educational journey.


I really like the techniques used in the article, and I believe that they can enhance learning skills in the classroom. Great article!


I think that this technique will be very effective at a college or learning institution.


I agree that distributed practice is indeed a valuable tool; however, I disagree that it is of high utility because I am never able to actually “utilize” the tools, e.g., practice tests/exams. Why aren’t more online practice exams available for scoring and self-challenging purposes? That would be more of an actual incentive to me. I also disagree that double checking, marking, highlighting, and underlining are low utility. As someone with OCD, I find this very satisfying and actually very good at helping with retention of the subject material learned.


The highlighting technique does not work for my 6th grader. But this technique works for me. Distributed practice was indeed a more successful tool and technique.


If you want to study hard and effectively, then: 1. First of all, just relax. 2. Motivate yourself by watching some videos or reading some thoughts. 3. Suppose that the subject you are reading is very easy. 4. Now take that subject and read it or learn it with enthusiasm. 5. And do it. Thanks


THANK YOU SIR

THIS ARTICLE WILL HELP ME THROUGH MY TEACHING CAREER.


Where is the CP Practice Exam? I ended up here through Nala.org


Your Best and Brightest always seem to want to change the playing field of learning. The gentlemen and women who wrote most of the worlds great literature, would not respect your methods; nor should the parents, who see graduates with ever-decreasing test scores, abilities and aptitude.


I have found this article a learning experience that I can apply to my everyday lifestyle by using the moral tools given to persevere through this college course.


It’s a good conductor for any student to know what is important in teaching methods.


I was disappointed to not identify strategies based on Problem Based Learning, which is particularly relevant for STEM classes and students.

“Problem-based learning is a student-centered pedagogy in which students learn about a subject through the experience of solving an open-ended problem found in trigger material.” Wikipedia


Very helpful!




Active Learning

  • Classroom Debate
  • Leading Discussions
  • Flipped Classrooms
  • Polling & Clickers
  • Teaching with Cases
  • Problem Solving in STEM

Active learning includes any type of instructional activity that engages students in learning, beyond listening, reading, and memorizing.  As examples, students might talk to a classmate about a challenging question, respond to an in-class prompt in writing, make a prediction about an experiment, or apply knowledge from a reading to a case study.  Active learning commonly includes collaboration between students in pairs or larger groups, but independent activities that involve reflection or writing—like quick-writes, or real-time polling in lectures—are also valuable.

Instructors can employ active learning in classes of any size, although certain activities may be better suited for smaller classes than large lecture halls.  Nonetheless, even large classes—including classes that meet in lecture halls with fixed seats—can incorporate a variety of activities that encourage students to talk with each other, work in small groups on an activity, or respond to a question through in-class writing or polling.  Furthermore, even small classes can increase student engagement beyond what might occur in a full group discussion by varying the instructional approaches and including small group discussions and activities.

Why should I use active learning?

 Active learning is valuable for a variety of reasons:

  • It provides instructors with feedback about what students are learning.
  • It helps students gauge their own understanding. By grappling with ideas, students connect new concepts to prior knowledge in meaningful ways and construct their own understanding.
  • Collaborating with classmates promotes community and connection between students, which can enhance a sense of belonging as well as motivation.
  • It creates a low bar to participation for quiet or passive students by encouraging every student to think and do.

Many of the larger-scale studies on active learning have been conducted in STEM disciplines, although it is reasonable to expect that the benefits of active learning extend to any field.  A 2014 meta-analysis of 225 research studies in STEM classes found that students in classes with active learning performed 6% better on exams than students in classes with traditional lecturing, and that students in classes with traditional lecturing were 1.5 times more likely to fail than those in classes with active learning (Freeman et al., 2014).  Additionally, active learning has been shown to decrease the achievement gap for underrepresented minorities and first-generation college students (Theobald et al., 2020).

What are some examples?


Active learning strategies come in many varieties, most of which can be grafted into existing courses without costly revisions. One of the simplest and most elegant exercises, called  Think-pair-share , could easily be written into almost any lecture. In this exercise, students are given a minute to think about—and perhaps respond in writing—to a question on their own.  Students next exchange ideas with a partner.  Finally, some students share with the entire class. A think-pair-share engages every student, and also encourages more participation than simply asking for volunteers to respond to a question.

Other active learning exercises include:

  • Case studies :  In a case study, students apply their knowledge to real life scenarios, requiring them to synthesize a variety of information and make recommendations.
  • Collaborative note taking :  The instructor pauses during class and asks students to take a few minutes to summarize in writing what they have just learned and/or consolidate their notes.  Students then exchange notes with a partner to compare; this can highlight key ideas that a student might have missed or misunderstood.
  • Concept map :  This activity helps students understand the relationship between concepts. Typically, students are provided with a list of terms.  They arrange the terms on paper and draw arrows between related concepts, labeling each arrow to explain the relationship.
  • Group work :  Whether solving problems or discussing a prompt, working in small groups can be an effective method of engaging students.  In some cases, all groups work on or discuss the same question; in other cases, the instructor might assign different topics to different groups.  The group’s task should be purposeful, and should be structured in such a way that there is an obvious advantage to working as a team rather than individually.  It is useful for groups to share their ideas with the rest of the class—whether by writing answers on the board, raising key points that were discussed, or sharing a poster they created.
  • Jigsaw :  Small groups of students each discuss a different, but related topic. Students are then shuffled such that new groups are comprised of one student from each of the original groups. In these new groups, each student is responsible for sharing key aspects of their original discussion. The second group must synthesize and use all of the ideas from the first set of discussions in order to complete a new or more advanced task.  A nice feature of a jigsaw is that every student in the original group must fully understand the key ideas so that they can teach their classmates in the second group. 

  • Minute paper :  At a natural pause, students spend a minute writing a brief response to a prompt about the material just covered. A minute paper can also be used as a reflection at the end of class.  The instructor might ask students to write down the most important concept that they learned that day, as well as something they found confusing.  Targeted questions can also provide feedback to the instructor about students’ experience in the class.
  • Statement correction , or  intentional mistakes :  The instructor provides statements, readings, proofs, or other material that contains errors.  The students are charged with finding and correcting the errors.  Concepts that students commonly misunderstand are well suited for this activity.
  • Strip sequence , or  sequence reconstruction : The goal of this activity is for students to order a set of items, such as steps in a biological process or a series of historical events.  As one strategy, the instructor provides students with a list of items written on strips of paper for the students to sort.  Removable labels with printed items also work well for this activity.
  • Polling :  During class, the instructor asks a multiple-choice question.  Students can respond in a variety of ways.  Possibilities include applications such as  PollEverywhere  or  Learning Catalytics .  In some courses, each student uses a handheld clicker, or personal response device, to record their answers through software such as  TurningPoint  or  iClicker .  Alternatively, students can respond to a multiple-choice question by raising the appropriate number of fingers or by holding up a colored card, where colors correspond to the different answers. A particularly effective strategy is to ask each student to first respond to the poll independently, then discuss the question with a neighbor, and then re-vote.

ABL Connect  provides more in-depth information about and examples of many of these activities.

In addition to these classroom-based strategies, instructors might take students out of the classroom; for example, students can visit museums or libraries, engage in field research, or work with the local community. 

For more information...

Tipsheet: Active Learning

PhysPort resources on stimulating productive engagement

Ambrose, S. A., Bridges, M. W., DiPietro, M., Lovett, M. C., & Norman, M. K. (2010). How Learning Works: Seven Research-based Principles for Smart Teaching. Chicago, IL: John Wiley & Sons.

Bain, K. (2004). What the Best College Teachers Do. Cambridge, MA: Harvard University Press.

Brookfield, S. D., & Preskill, S. (2012). Discussion As a Way of Teaching: Tools and Techniques for Democratic Classrooms. Chicago, IL: John Wiley & Sons.

Brookfield, S. D., & Preskill, S. (2016). The Discussion Book: 50 Great Ways to Get People Talking. San Francisco, CA: Jossey-Bass.

Brown, P. C., Roediger, H. L., & McDaniel, M. A. (2014). Make It Stick: The Science of Successful Learning, 1st Edition. Cambridge, MA: Harvard University Press.

Handelsman, J., Miller, S., & Pfund, C. (2007). Scientific Teaching. New York, NY: W. H. Freeman and Company.

Lang, J. (2010). On Course: A Week-by-Week Guide to Your First Semester of College Teaching, 1st Edition. Cambridge, MA: Harvard University Press.

Lang, J. (2016). Small Teaching: Everyday Lessons from the Science of Learning. San Francisco, CA: Jossey-Bass.

Millis, B. J. (1990). Helping faculty build learning communities through cooperative groups. Available: http://digitalcommons.unl.edu/podimproveacad/202/ [accessed August 31, 2017].



  • Review Article
  • Open access
  • Published: 05 April 2022

Recent advances and applications of deep learning methods in materials science

  • Kamal Choudhary   ORCID: orcid.org/0000-0001-9737-8074 1 , 2 , 3 ,
  • Brian DeCost   ORCID: orcid.org/0000-0002-3459-5888 4 ,
  • Chi Chen   ORCID: orcid.org/0000-0001-8008-7043 5 ,
  • Anubhav Jain   ORCID: orcid.org/0000-0001-5893-9967 6 ,
  • Francesca Tavazza   ORCID: orcid.org/0000-0002-5602-180X 1 ,
  • Ryan Cohn   ORCID: orcid.org/0000-0002-7898-0059 7 ,
  • Cheol Woo Park 8 ,
  • Alok Choudhary 9 ,
  • Ankit Agrawal 9 ,
  • Simon J. L. Billinge   ORCID: orcid.org/0000-0002-9734-4998 10 ,
  • Elizabeth Holm 7 ,
  • Shyue Ping Ong   ORCID: orcid.org/0000-0001-5726-2587 5 &
  • Chris Wolverton   ORCID: orcid.org/0000-0003-2248-474X 8  

npj Computational Materials, volume 8, Article number: 59 (2022)


  • Atomistic models
  • Computational methods

Deep learning (DL) is one of the fastest-growing topics in materials data science, with rapidly emerging applications spanning atomistic, image-based, spectral, and textual data modalities. DL allows analysis of unstructured data and automated identification of features. The recent development of large materials databases has fueled the application of DL methods in atomistic prediction in particular. In contrast, advances in image and spectral data have largely leveraged synthetic data enabled by high-quality forward models as well as by generative unsupervised DL methods. In this article, we present a high-level overview of deep learning methods followed by a detailed discussion of recent developments of deep learning in atomistic simulation, materials imaging, spectral analysis, and natural language processing. For each modality we discuss applications involving both theoretical and experimental data, typical modeling approaches with their strengths and limitations, and relevant publicly available software and datasets. We conclude the review with a discussion of recent cross-cutting work related to uncertainty quantification in this field and a brief perspective on limitations, challenges, and potential growth areas for DL methods in materials science.


Introduction

“Processing-structure-property-performance” is the key mantra in Materials Science and Engineering (MSE) 1 . The length and time scales of material structures and phenomena vary significantly among these four elements, adding further complexity 2 . For instance, structural information can range from detailed knowledge of atomic coordinates of elements to the microscale spatial distribution of phases (microstructure), to fragment connectivity (mesoscale), to images and spectra. Establishing linkages between the above components is a challenging task.

Both experimental and computational techniques are useful to identify such relationships. Due to rapid growth in automation in experimental equipment and immense expansion of computational resources, the size of public materials datasets has seen exponential growth. Several large experimental and computational datasets 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 have been developed through the Materials Genome Initiative (MGI) 11 and the increasing adoption of Findable, Accessible, Interoperable, Reusable (FAIR) 12 principles. Such an outburst of data requires automated analysis which can be facilitated by machine learning (ML) techniques 13 , 14 , 15 , 16 , 17 , 18 , 19 , 20 .

Deep learning (DL) 21 , 22 is a specialized branch of machine learning (ML). Originally inspired by biological models of computation and cognition in the human brain 23 , 24 , one of DL’s major strengths is its potential to extract higher-level features from the raw input data.

DL applications are rapidly replacing conventional systems in many aspects of our daily lives, for example, in image and speech recognition, web search, fraud detection, email/spam filtering, financial risk modeling, and so on. DL techniques have been proven to provide exciting new capabilities in numerous fields (such as playing Go 25 , self-driving cars 26 , navigation, chip design, particle physics, protein science, drug discovery, astrophysics, object recognition 27 , etc).

Recently DL methods have been outperforming other machine learning techniques in numerous scientific fields, such as chemistry, physics, biology, and materials science 20 , 28 , 29 , 30 , 31 , 32 . DL applications in MSE are still relatively new, and the field has not fully explored its potential, implications, and limitations. DL provides new approaches for investigating material phenomena and has pushed materials scientists to expand their traditional toolset.

DL methods have been shown to act as a complementary approach to physics-based methods for materials design. While large datasets are often viewed as a prerequisite for successful DL applications, techniques such as transfer learning, multi-fidelity modelling, and active learning can often make DL feasible for small datasets as well 33 , 34 , 35 , 36 .

Traditionally, materials have been designed experimentally using trial and error methods with a strong dose of chemical intuition. In addition to being a very costly and time-consuming approach, the number of material combinations is so huge that it is intractable to study experimentally, leading to the need for empirical formulation and computational methods. While computational approaches (such as density functional theory, molecular dynamics, Monte Carlo, phase-field, finite elements) are much faster and cheaper than experiments, they are still limited by length and time scale constraints, which in turn limits their respective domains of applicability. DL methods can offer substantial speedups compared to conventional scientific computing, and, for some applications, are reaching an accuracy level comparable to physics-based or computational models.

Moreover, entering a new domain of materials science and performing cutting-edge research requires years of education, training, and the development of specialized skills and intuition. Fortunately, we now live in an era of increasingly open data and computational resources. Mature, well-documented DL libraries make DL research much more easily accessible to newcomers than almost any other research field. Testing and benchmarking methodologies such as underfitting/overfitting/cross-validation 15 , 16 , 37 are common knowledge, and standards for measuring model performance are well established in the community.

Despite their many advantages, DL methods have disadvantages too, the most significant one being their black-box nature 38 which may hinder physical insights into the phenomena under examination. Evaluating and increasing the interpretability and explainability of DL models remains an active field of research. Generally a DL model has a few thousand to millions of parameters, making model interpretation and direct generation of scientific insight difficult.

Although there are several good recent reviews of ML applications in MSE 15 , 16 , 17 , 19 , 39 , 40 , 41 , 42 , 43 , 44 , 45 , 46 , 47 , 48 , 49 , DL for materials has been advancing rapidly, warranting a dedicated review to cover the explosion of research in this field. This article discusses some of the basic principles in DL methods and highlights major trends among the recent advances in DL applications for materials science. As the tools and datasets for DL applications in materials keep evolving, we provide a github repository ( https://github.com/deepmaterials/dlmatreview ) that can be updated as new resources are made publicly available.

General machine learning concepts

It is beyond the scope of this article to give a detailed hands-on introduction to Deep Learning. There are many materials for this purpose, for example, the free online book “Neural Networks and Deep Learning” by Michael Nielsen ( http://neuralnetworksanddeeplearning.com ), Deep Learning by Goodfellow et al. 21 , and multiple online courses at Coursera, Udemy, and so on. Rather, this article aims to motivate materials scientist researchers in the types of problems that are amenable to DL, and to introduce some of the basic concepts, jargon, and materials-specific databases and software (at the time of writing) as a helpful on-ramp to help get started. With this in mind, we begin with a very basic introduction to Deep learning.

Artificial intelligence (AI) 13 is the development of machines and algorithms that mimic human intelligence, for example, by optimizing actions to achieve certain goals. Machine learning (ML) is a subset of AI that provides the ability to learn from data without being explicitly programmed for a given task, such as playing chess or making social network recommendations. DL, in turn, is the subset of ML that takes inspiration from biological brains and uses multilayer neural networks to solve ML tasks. A schematic of the AI-ML-DL context and some of the key application areas of DL in the materials science and engineering field are shown in Fig. 1.

Figure 1. Deep learning is considered a part of machine learning, which is contained within the umbrella term artificial intelligence.

Some of the commonly used ML technologies are linear regression, decision trees, and random forest in which generalized models are trained to learn coefficients/weights/parameters for a given dataset (usually structured i.e., on a grid or a spreadsheet).

Applying traditional ML techniques to unstructured data (such as pixels or features from an image, sounds, text, and graphs) is challenging because users have to first extract generalized meaningful representations or features themselves (such as calculating pair-distribution for an atomic structure) and then train the ML models. Hence, the process becomes time-consuming, brittle, and not easily scalable. Here, deep learning (DL) techniques become more important.

DL methods are based on artificial neural networks and allied techniques. According to the “universal approximation theorem” 50 , 51 , neural networks can approximate any function to arbitrary accuracy. However, it is important to note that the theorem doesn’t guarantee that the functions can be learnt easily 52 .

Neural networks

A perceptron, or single artificial neuron 53 , is the building block of artificial neural networks (ANNs) and performs forward propagation of information. For a set of inputs [x 1 , x 2 , . . . , x m ] to the perceptron, we assign floating-point weights [w 1 , w 2 , . . . , w m ] (plus a bias term that shifts the output), multiply each input by its corresponding weight, and sum the products. Some of the common software packages for training NNs are PyTorch 54 , TensorFlow 55 , and MXNet 56 . Please note that certain commercial equipment, instruments, or materials are identified in this paper in order to specify the experimental procedure adequately. Such identification is not intended to imply recommendation or endorsement by NIST, nor is it intended to imply that the materials or equipment identified are necessarily the best available for the purpose.
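
To make this concrete, here is a minimal sketch of a single perceptron’s forward pass in plain NumPy; the input values, weights, bias, and the simple threshold activation are illustrative choices, not taken from any specific model discussed in the text.

```python
import numpy as np

def perceptron(x, w, b):
    """Forward pass of a single artificial neuron:
    weighted sum of inputs plus a bias, followed by a step activation."""
    z = np.dot(w, x) + b          # sum_i w_i * x_i + b
    return 1.0 if z > 0 else 0.0  # classic threshold (step) activation

# Illustrative values (assumed, not from the article)
x = np.array([0.5, -1.2, 3.0])    # inputs x_1..x_m
w = np.array([0.4, 0.1, -0.2])    # weights w_1..w_m
b = 0.05                          # bias shifts the decision boundary
print(perceptron(x, w, b))
```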

Activation function

Activation functions (such as sigmoid, hyperbolic tangent (tanh), rectified linear unit (ReLU), leaky ReLU, Swish) are the critical nonlinear components that enable neural networks to compose many small building blocks to learn complex nonlinear functions. For example, the sigmoid activation maps real numbers to the range (0, 1); this activation function is often used in the last layer of binary classifiers to model probabilities. The choice of activation function can affect training efficiency as well as final accuracy 57 .
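
As a rough illustration (not tied to any particular library mentioned above), the activation functions listed here can be written in a few lines of NumPy:

```python
import numpy as np

# Common activation functions named in the text, in NumPy form for clarity.
def sigmoid(z):                   # maps reals to (0, 1); often used for binary probabilities
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):                      # maps reals to (-1, 1)
    return np.tanh(z)

def relu(z):                      # zero for negative inputs, identity for positive
    return np.maximum(0.0, z)

def leaky_relu(z, alpha=0.01):    # small slope for negative inputs instead of zero
    return np.where(z > 0, z, alpha * z)

def swish(z):                     # z * sigmoid(z)
    return z * sigmoid(z)

z = np.linspace(-3, 3, 7)
print(relu(z))
print(sigmoid(z))
```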

Loss function, gradient descent, and normalization

The weight matrices of a neural network are initialized randomly or obtained from a pre-trained model. These weight matrices are multiplied with the input matrix (or output from a previous layer) and subjected to a nonlinear activation function to yield updated representations, which are often referred to as activations or feature maps. The loss function (also known as an objective function or empirical risk) is calculated by comparing the output of the neural network and the known target value data. Typically, network weights are iteratively updated via stochastic gradient descent algorithms to minimize the loss function until the desired accuracy is achieved. Most modern deep learning frameworks facilitate this by using reverse-mode automatic differentiation 58 to obtain the partial derivatives of the loss function with respect to each network parameter through recursive application of the chain rule. Colloquially, this is also known as back-propagation.

Common gradient descent algorithms include: Stochastic Gradient Descent (SGD), Adam, Adagrad etc. The learning rate is an important parameter in gradient descent. Except for SGD, all other methods use adaptive learning parameter tuning. Depending on the objective such as classification or regression, different loss functions such as Binary Cross Entropy (BCE), Negative Log likelihood (NLLL) or Mean Squared Error (MSE) are used.

The inputs of a neural network are generally scaled i.e., normalized to have zero mean and unit standard deviation. Scaling is also applied to the input of hidden layers (using batch or layer normalization) to improve the stability of ANNs.
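
The following is a minimal, self-contained training-loop sketch in PyTorch that ties these pieces together: a forward pass, an MSE loss, back-propagation via automatic differentiation, and SGD weight updates. The data, network sizes, learning rate, and epoch count are all arbitrary placeholders chosen only for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(64, 10)            # 64 samples, 10 features (illustrative shapes)
y = torch.randn(64, 1)             # regression targets

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
loss_fn = nn.MSELoss()             # mean squared error for regression
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for epoch in range(100):
    optimizer.zero_grad()          # clear gradients from the previous step
    pred = model(X)                # forward pass
    loss = loss_fn(pred, y)        # compare predictions with known targets
    loss.backward()                # reverse-mode autodiff (back-propagation)
    optimizer.step()               # gradient-descent weight update
```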

Epoch and mini-batches

A single pass of the entire training data is called an epoch, and multiple epochs are performed until the weights converge. In DL, datasets are usually large and computing gradients for the entire dataset and network becomes challenging. Hence, the forward passes are done with small subsets of the training data called mini-batches.
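
In PyTorch, for example, mini-batching is typically handled by a DataLoader; the sketch below (with made-up shapes and batch size) shows how one epoch decomposes into mini-batches.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Illustrative dataset of 1000 samples split into mini-batches of 32.
X = torch.randn(1000, 10)
y = torch.randn(1000, 1)
loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)

n_epochs = 3                        # one epoch = one full pass over the training data
for epoch in range(n_epochs):
    for xb, yb in loader:           # each (xb, yb) pair is one mini-batch
        # forward pass, loss, backward pass, and weight update would go here,
        # exactly as in the training-loop sketch above
        pass
```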

Underfitting, overfitting, regularization, and early stopping

During an ML training, the dataset is split into training, validation, and test sets. The test set is never used during the training process. A model is said to be underfitting if the model performs poorly on the training set and lacks the capacity to fully learn the training data. A model is said to overfit if the model performs too well on the training data but does not perform well on the validation data. Overfitting is controlled with regularization techniques such as L2 regularization, dropout, and early stopping 37 .

Regularization discourages the model from simply memorizing the training data, resulting in a model that is more generalizable. Overfitting models are often characterized by neurons whose weights have large magnitudes. L2 regularization reduces the possibility of overfitting by adding a term to the loss function that penalizes large weight values, keeping the weights and biases small during training. Another popular regularization technique is dropout 59 , in which we randomly set the activations of an NN layer to zero during training. Similar to bagging 60 , dropout has the effect of training a collection of randomly chosen models, which prevents co-adaptation among the neurons and consequently reduces the likelihood of the model overfitting. In early stopping, training is halted before the model overfits, i.e., when accuracy on the validation set flattens or begins to decrease.
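
A hedged example of how these ideas commonly appear in PyTorch code is shown below: the weight_decay argument supplies the L2 penalty, an nn.Dropout layer randomly zeroes activations, and a simple patience counter implements early stopping. The dataset, architecture, and hyperparameters are invented purely for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Illustrative random data split into training and validation sets.
X_tr, y_tr = torch.randn(200, 10), torch.randn(200, 1)
X_va, y_va = torch.randn(50, 10), torch.randn(50, 1)

# Dropout layer plus an L2 penalty (weight_decay) both act as regularizers.
model = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Dropout(p=0.5), nn.Linear(64, 1))
opt = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=1e-4)
loss_fn = nn.MSELoss()

best_val, patience, bad_epochs = float("inf"), 10, 0
for epoch in range(500):
    model.train()                        # enable dropout during training
    opt.zero_grad()
    loss_fn(model(X_tr), y_tr).backward()
    opt.step()

    model.eval()                         # disable dropout for evaluation
    with torch.no_grad():
        val_loss = loss_fn(model(X_va), y_va).item()
    if val_loss < best_val:
        best_val, bad_epochs = val_loss, 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:       # early stopping: quit before overfitting worsens
            break
```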

Convolutional neural networks

Convolutional neural networks (CNN) 61 can be viewed as a regularized version of multilayer perceptrons with a strong inductive bias for learning translation-invariant image representations. There are four main components in CNNs: (a) learnable convolution filterbanks, (b) nonlinear activations, (c) spatial coarsening (via pooling or strided convolution), (d) a prediction module, often consisting of fully connected layers that operate on a global instance representation.

In CNNs we use convolution functions with multiple kernels or filters with trainable and shared weights or parameters, instead of general matrix multiplication. These filters/kernels are matrices with a relatively small number of rows and columns that convolve over the input to automatically extract high-level local features in the form of feature maps. The filters slide/convolve (element-wise multiply) across the input with a fixed number of strides to produce the feature map and the information thus learnt is passed to the hidden/fully connected layers. Depending on the input data, these filters can be one, two, or three-dimensional.

Similar to the fully connected NNs, nonlinearities such as ReLU are then applied that allows us to deal with nonlinear and complicated data. The pooling operation preserves spatial invariance, downsamples and reduces the dimension of each feature map obtained after convolution. These downsampling/pooling operations can be of different types such as maximum-pooling, minimum-pooling, average pooling, and sum pooling. After one or more convolutional and pooling layers, the outputs are usually reduced to a one-dimensional global representation. CNNs are especially popular for image data.
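
As an illustrative sketch only, the small PyTorch module below strings together the four components described above (convolutional filterbanks, ReLU nonlinearities, pooling, and a fully connected prediction head); the channel counts and the assumed 28 x 28 single-channel input size are arbitrary choices.

```python
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    """Tiny image classifier: conv filters -> nonlinearity -> pooling -> FC head."""
    def __init__(self, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),   # learnable convolution filters
            nn.ReLU(),                                     # nonlinear activation
            nn.MaxPool2d(2),                               # spatial coarsening (downsampling)
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 7 * 7, n_classes)  # assumes 28x28 inputs

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))               # flatten to a global representation

out = SmallCNN()(torch.randn(4, 1, 28, 28))  # batch of 4 single-channel 28x28 images
print(out.shape)                             # torch.Size([4, 10])
```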

Graph neural networks

Graphs and their variants.

Classical CNNs as described above are based on a regular grid Euclidean data (such as 2D grid in images). However, real-life data structures, such as social networks, segments of images, word vectors, recommender systems, and atomic/molecular structures, are usually non-Euclidean. In such cases, graph-based non-Euclidean data structures become especially important.

Mathematically, a graph G is defined as a set of nodes/vertices V, a set of edges/links E, and node features X: G = (V, E, X) 62 , 63 , 64 , and can be used to represent non-Euclidean data. An edge is formed between a pair of nodes and contains the relational information between them. Each node and edge can have attributes/features associated with it. An adjacency matrix A is a square matrix indicating whether pairs of nodes are connected (1) or unconnected (0). A graph can be of various types: undirected/directed, weighted/unweighted, homogeneous/heterogeneous, static/dynamic.

An undirected graph captures symmetric relations between nodes, while a directed one captures asymmetric relations such that \(A_{ij}\ne A_{ji}\). In a weighted graph, each edge carries a scalar weight rather than just 1s and 0s. In a homogeneous graph, all nodes are of the same type and all edges capture relations of the same type, while in a heterogeneous graph the nodes and edges can be of different types. Heterogeneous graphs provide an easy interface for managing nodes and edges of different types as well as their associated features. Graphs whose input features or topology vary with time are called dynamic; otherwise they are considered static. If a pair of nodes can be connected by more than one edge, the graph is termed a multigraph.
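
A toy sketch of these definitions, assuming a small undirected, unweighted graph: the adjacency matrix A holds 1s for connected node pairs and 0s otherwise, and X stores one feature vector per node. The node count and feature dimension are arbitrary.

import numpy as np

# Undirected, unweighted toy graph with 4 nodes.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
A = np.zeros((4, 4), dtype=int)
for i, j in edges:
    A[i, j] = A[j, i] = 1            # symmetric entries for an undirected graph
X = np.random.rand(4, 2)             # one feature vector per node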

Types of GNNs

At present, GNNs are probably the most popular AI method for predicting various materials properties based on structural information 33 , 65 , 66 , 67 , 68 , 69 . Graph neural networks (GNNs) are DL methods that operate on the graph domain and capture the dependencies within graphs via message passing between nodes and edges. There are two key steps in GNN training: (a) aggregate information from neighbors and (b) update the node and/or edge representations. Importantly, the aggregation is permutation invariant. Similar to fully connected NNs, the input node features X (with an embedding matrix) are multiplied by the adjacency matrix and the weight matrices, and the result is passed through a nonlinear activation function to provide the outputs for the next layer. This update is called the propagation rule.
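
A minimal sketch of one such propagation step is shown below, assuming the commonly used GCN-style rule H' = σ(Â H W) with self-loops and symmetric degree normalization; the matrices and sizes are illustrative, not taken from any specific package.

import numpy as np

# One GCN-style propagation step: H' = ReLU(A_norm @ H @ W).
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)

def gcn_layer(A, H, W):
    A_hat = A + np.eye(A.shape[0])                      # add self-loops
    D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt            # normalized, permutation-invariant aggregation
    return np.maximum(0.0, A_norm @ H @ W)              # ReLU nonlinearity

H0 = np.random.rand(4, 2)     # input node features
W0 = np.random.rand(2, 8)     # trainable weight matrix
H1 = gcn_layer(A, H0, W0)     # updated node representations for the next layer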

Based on the propagation rule and aggregation methodology, there are different variants of GNNs, such as the graph convolutional network (GCN) 70 , graph attention network (GAT) 71 , relational GCN 72 , graph recurrent network (GRN) 73 , graph isomorphism network (GIN) 74 , and line graph neural network (LGNN) 75 . Graph convolutional neural networks are the most popular GNNs.

Sequence-to-sequence models

Traditionally, learning from sequential inputs such as text involves generating a fixed-length representation of the data. For example, the “bag-of-words” approach simply counts the number of occurrences of each word in a document and produces a fixed-length vector the size of the overall vocabulary.
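
For illustration, a bag-of-words representation can be produced in a few lines, assuming scikit-learn's CountVectorizer; the two toy "abstracts" are invented for the example.

from sklearn.feature_extraction.text import CountVectorizer

# Each document becomes a fixed-length vector of word counts over the vocabulary.
docs = [
    "TiO2 thin films were grown by sputtering",
    "we report bandgap engineering in TiO2 films",
]
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)            # shape: (n_documents, vocabulary_size)
print(vectorizer.get_feature_names_out())
print(X.toarray())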

In contrast, sequence-to-sequence models can take into account sequential/contextual information about each word and produce outputs of arbitrary length. For example, in named entity recognition (NER), an input sequence of words (e.g., a chemical abstract) is mapped to an output sequence of “entities” or categories where every word in the sequence is assigned a category.

An early form of sequence-to-sequence model is the recurrent neural network (RNN). Unlike the fully connected NN architecture, where there are connections only between nodes in adjacent layers and none between hidden nodes in the same layer, an RNN has feedback connections. Each hidden layer can be unfolded and processed similarly to a traditional NN, sharing the same weight matrices. There are multiple types of RNNs, of which the most common are the gated recurrent unit recurrent neural network (GRURNN), the long short-term memory (LSTM) network, and the clockwork RNN (CW-RNN) 76 .

However, all such RNNs suffer from some drawbacks, including: (i) difficulty of parallelization and therefore difficulty in training on large datasets and (ii) difficulty in preserving long-range contextual information due to the “vanishing gradient” problem. Nevertheless, as we will later describe, LSTMs have been successfully applied to various NER problems in the materials domain.

More recently, sequence-to-sequence models based on a “transformer” architecture, such as Google’s Bidirectional Encoder Representations from Transformers (BERT) model 77 , have helped address some of the issues of traditional RNNs. Rather than passing a state vector that is iterated word-by-word, such models use an attention mechanism to allow access to all previous words simultaneously without explicit time steps. This mechanism facilitates parallelization and also better preserves long-term context.
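
A minimal sketch of the scaled dot-product self-attention operation at the heart of transformer models is given below: every position attends to every other position in a single matrix operation, with no sequential time steps. The sequence length and embedding size are arbitrary, and the sketch omits the multi-head and masking details used in full models such as BERT.

import numpy as np

# Scaled dot-product self-attention over a toy sequence of token embeddings.
def attention(Q, K, V):
    scores = Q @ K.T / np.sqrt(K.shape[-1])                                 # pairwise similarities
    weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)   # softmax over positions
    return weights @ V                                                      # context-weighted mixture

seq_len, d_model = 5, 16
x = np.random.rand(seq_len, d_model)      # embeddings for 5 tokens
out = attention(x, x, x)                  # every position attends to all positions at once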

Generative models

While the above DL frameworks are supervised (i.e., the target or ground-truth data are known, as in classification and regression) and discriminative (i.e., they learn the features that differentiate datasets), many AI tasks are unsupervised (such as clustering) or generative (i.e., they aim to learn the underlying data distribution) 78 .

Generative models are used to (a) generate data samples similar to the training set with variations (i.e., for augmentation and synthetic data), (b) learn good generalized latent features, and (c) guide mixed-reality applications such as virtual try-on. There are various types of generative models, of which the most common are: (a) variational autoencoders (VAE), which explicitly define and learn the likelihood of the data, and (b) generative adversarial networks (GAN), which learn to generate samples from the model's distribution directly, without defining any density function.

A VAE model has two components, namely an encoder and a decoder. The VAE's encoder takes input from a target distribution and compresses it into a low-dimensional latent space. The decoder then takes that latent representation and reproduces the original input. Once the network is trained, we can generate latent representations of various images and interpolate between them before passing them through the decoder, which produces new images. A VAE is similar to principal component analysis (PCA), but whereas PCA assumes linear structure in the data, VAEs operate in the nonlinear domain. A GAN model also has two components, namely a generator and a discriminator. The GAN's generator produces fake/synthetic data intended to fool the discriminator, while the discriminator tries to distinguish fake data from real data. This setup is also termed a min-max two-player game. We note that VAE models learn the hidden (latent) distributions during training, whereas in GANs the latent distribution is predefined and the generator instead learns to produce samples that can fool the discriminator. These techniques are widely used for images and spectra and have also recently been applied to atomic structures.
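
The following is a minimal sketch of a VAE in PyTorch, assuming a single linear encoder and decoder purely for illustration: the encoder outputs a mean and log-variance for the latent space, sampling uses the reparameterization trick, and the loss combines a reconstruction term with a KL-divergence term.

import torch
import torch.nn as nn

class TinyVAE(nn.Module):
    def __init__(self, d_in=784, d_latent=8):
        super().__init__()
        self.enc = nn.Linear(d_in, 2 * d_latent)   # encoder predicts mean and log-variance
        self.dec = nn.Linear(d_latent, d_in)       # decoder reconstructs the input

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)   # reparameterization trick
        return self.dec(z), mu, logvar

def vae_loss(x, x_hat, mu, logvar):
    recon = nn.functional.mse_loss(x_hat, x, reduction="sum")        # reconstruction term
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())     # KL-divergence term
    return recon + kl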

Deep reinforcement learning

Reinforcement learning (RL) deals with tasks in which a computational agent learns to make decisions by trial and error. Deep RL incorporates DL into the RL framework, allowing agents to make decisions from unstructured input data 79 . Traditional RL is formulated as a Markov decision process (MDP): at every timestep an agent takes an action, receives a scalar reward, and transitions to the next state according to the system dynamics, learning a policy that maximizes its returns. In deep RL, the states are high-dimensional (such as continuous images or spectra) and serve as inputs to DL methods. DRL architectures can be either model-based or model-free.
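
A sketch of the MDP interaction loop is shown below, assuming a classic Gym-style environment interface (reset/step returning state, reward, done); in deep RL the policy function would be a neural network acting on high-dimensional states such as images or spectra.

# Agent-environment loop of an MDP; `env` is assumed to follow the classic Gym
# interface and `policy` maps a state to an action (a neural network in deep RL).
def run_episode(env, policy, max_steps=200):
    state = env.reset()
    total_return = 0.0
    for t in range(max_steps):
        action = policy(state)                         # forward pass of the policy network
        state, reward, done, info = env.step(action)   # system dynamics and scalar reward
        total_return += reward
        if done:
            break
    return total_return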

Scientific machine learning

The nascent field of scientific machine learning (SciML) 80 is creating new opportunities across all paradigms of machine learning, and deep learning in particular. SciML is focused on creating ML systems that incorporate scientific knowledge and physical principles, either directly in the specific form of the model or indirectly through the optimization algorithms used for training. This offers potential improvements in sample and training complexity, robustness (particularly under extrapolation), and model interpretability. One prominent theme can be found in ref. 57 . Such implementations usually involve applying multiple physics-based constraints while training a DL model 81 , 82 , 83 . A key challenge of universal function approximators is that a NN can quickly learn spurious features in the data that have nothing to do with the features a researcher is actually interested in; physics-based regularization can help guard against this. Physics-based deep learning can also aid inverse design problems, a challenging but important task 84 , 85 . Conversely, deep learning using graph neural networks together with symbolic regression (stochastically building symbolic expressions) has even been used to “discover” symbolic equations from data that capture known (and unknown) physics behind the data 86 , i.e., to deep-learn a physics model rather than to use a physics model to constrain DL.
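
As a hedged illustration of physics-based regularization, the sketch below combines an ordinary data-fitting loss with a penalty on the residual of a user-supplied physical constraint; the residual function, its weighting factor, and all names are assumptions for illustration rather than a specific published scheme.

import torch
import torch.nn.functional as F

# Total loss = data-fitting loss + weighted penalty on a physics residual.
# `physics_residual` is an assumed user-supplied function (e.g., a PDE or
# conservation-law residual evaluated at the model's predictions).
def physics_informed_loss(model, x, y, physics_residual, lambda_phys=0.1):
    pred = model(x)
    data_loss = F.mse_loss(pred, y)
    phys_loss = (physics_residual(x, pred) ** 2).mean()   # penalize constraint violations
    return data_loss + lambda_phys * phys_loss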

Overview of applications

Some aspects of successful DL application that require materials-science-specific considerations are:

acquiring large, balanced, and diverse datasets (often on the order of 10,000 data points or more),

determining an appropriate DL approach and a suitable vector or graph representation of the input samples, and

selecting appropriate performance metrics relevant to scientific goals.

In the following sections we discuss some of the key areas of materials science in which DL has been applied, with available links to repositories and datasets that aid the reproducibility and extensibility of the work. In this review we categorize materials science applications at a high level by the type of input data considered: (1) atomistic, (2) stoichiometric, (3) spectral, (4) image, and (5) text. We summarize prevailing machine learning tasks and their impact on materials research and development within each broad materials data modality.

Applications in atomistic representations

In this section, we provide a few examples of solving materials science problems with DL methods trained on atomistic data. The atomic structure of a material usually consists of its atomic coordinates and atomic composition. The arbitrary number of atoms and element types in a system poses a challenge for applying traditional ML algorithms to atomistic predictions, and DL-based methods are an obvious strategy to tackle this problem. There have been several previous attempts to represent crystals and molecules using fixed-size descriptors such as the Coulomb matrix 87 , 88 , 89 , classical force field inspired descriptors (CFID) 90 , 91 , 92 , the pair-distribution function (PRDF), and Voronoi tessellation 93 , 94 , 95 . Recently, graph neural network methods have been shown to surpass these previous hand-crafted feature sets 28 .

DL for atomistic materials applications includes: (a) force-field development, (b) direct property prediction, and (c) materials screening. In addition to these points, we also describe some recent generative adversarial network approaches and methods complementary to atomistic approaches.

Databases and software libraries

In Table 1 we list some of the datasets commonly used for atomistic DL models for molecules, solids, and proteins. We note that the computational methods used for different datasets differ and many of them are continuously evolving. Generating such databases with conventional methods such as density functional theory generally takes years; in contrast, DL methods can be used to make predictions at much reduced computational cost and with reasonable accuracy.

In Table 1 we also provide DL software packages used for atomistic materials design. The types of models include general property (GP) predictors and interatomic force fields (FF). The models have been demonstrated on molecules (Mol), solid-state materials (Sol), or proteins (Prot). For some force fields, high-performance large-scale implementations (LSI) that leverage parallel computing exist. Some of these methods use only interatomic distances to build graphs, while others use distances as well as bond-angle information. Including bond angles within GNNs has recently been shown to drastically improve performance at comparable computational cost.

Force-field development

The first application is the development of DL-based force fields (FF) 96 , 97 /interatomic potentials. A major advantage of such models is that they are very fast for making predictions (on the order of hundreds to thousands of times faster 64 ) and alleviate the laborious development of classical FFs; the disadvantage is that they still require large training datasets generated with computationally expensive methods.

Models such as the Behler-Parrinello neural network (BPNN) and its variants 98 , 99 are used to develop interatomic potentials that can be applied beyond 0 K and to time-dependent behavior via molecular dynamics simulations, for example for nanoparticles 100 . Such FF models have been developed for molecular systems, such as water, methane, and other organic molecules 99 , 101 , as well as for solids such as silicon 98 , sodium 102 , graphite 103 , and titania (TiO2) 104 .

While the above works are mainly based on NNs, there has also been the development of graph neural network force-field (GNNFF) frameworks 105 , 106 that bypass both computational bottlenecks. GNNFF can predict atomic forces directly using automatically extracted structural features that are not only translationally invariant but also rotationally covariant with respect to the coordinate space of the atomic positions, i.e., the features and hence the predicted force vectors rotate the same way as the coordinates. In addition to the development of pure NN-based FFs, there have also been recent efforts combining traditional FFs such as bond-order potentials with NNs, and ReaxFF with message passing neural networks (MPNN), which can help mitigate the extrapolation issues of NNs 82 , 107 .

Direct property prediction from atomistic configurations

DL methods can be used to establish structure-property relationships between atomic structures and their properties with high accuracy 28 , 108 . Models such as SchNet, the crystal graph convolutional neural network (CGCNN), the improved crystal graph convolutional neural network (iCGCNN), the directional message passing neural network (DimeNet), the atomistic line graph neural network (ALIGNN), and the materials graph neural network (MEGNet) shown in Table 1 have been used to predict up to 50 properties of crystalline and molecular materials. These property datasets are usually obtained from ab initio calculations. A schematic of such models is shown in Fig. 2 . While SchNet, CGCNN, and MEGNet are primarily based on atomic distances, the iCGCNN, DimeNet, and ALIGNN models also capture many-body interactions within the GCNN framework.

Figure 2

a CGCNN model in which crystals are converted to graphs with nodes representing atoms in the unit cell and edges representing atom connections. Nodes and edges are characterized by vectors corresponding to the atoms and bonds in the crystal, respectively [Reprinted with permission from ref. 67 Copyright 2019 American Physical Society], b ALIGNN 65 model in which the convolution layer alternates between message passing on the bond graph and its bond-angle line graph. c MEGNet in which the initial graph is represented by the set of atomic attributes, bond attributes and global state attributes [Reprinted with permission from ref. 33 Copyright 2019 American Chemical Society] model, d iCGCNN model in which multiple edges connect a node to neighboring nodes to show the number of Voronoi neighbors [Reprinted with permission from ref. 122 Copyright 2019 American Physical Society].

Some of these properties include formation energies, electronic bandgaps, solar-cell efficiency, topological spin-orbit spillage, dielectric constants, piezoelectric constants, 2D exfoliation energies, electric field gradients, elastic modulus, Seebeck coefficients, power factors, carrier effective masses, highest occupied molecular orbital, lowest unoccupied molecular orbital, energy gap, zero-point vibrational energy, dipole moment, isotropic polarizability, electronic spatial extent, internal energy.

For instance, the current state-of-the-art mean absolute error for the formation energy of solids at 0 K is 0.022 eV/atom, as obtained by the ALIGNN model 65 . DL is also heavily used for predicting the catalytic behavior of materials, for example in the Open Catalyst Project 109 , which is driven by DL methods for materials design. There is an ongoing effort to continuously improve these models. Usually, models for energy-based quantities such as formation and total energies are more accurate than those for electronic properties such as bandgaps and power factors.

In addition to molecules and solids, property prediction models have also been used for biomaterials such as proteins, which can be viewed as large molecules. There have been several efforts to predict protein-based properties, such as binding affinity 66 and docking predictions 110 .

There have also been several applications of DL methods such as autoencoders 111 and reinforcement learning 112 , 113 , 114 for identifying promising regions of chemical space in inverse materials design. Inverse materials design with techniques such as GANs deals with finding chemical compounds with desired properties and acts as a complement to forward prediction models. While such concepts have been widely applied to molecular systems 115 , these methods have recently been applied to solids as well 116 , 117 , 118 , 119 , 120 .

Fast materials screening

DFT-based high-throughput methods are usually limited to a few thousand compounds and take a long time to compute; DL-based methods can aid this process and allow much faster predictions. The DL-based property prediction models mentioned above can be used for pre-screening chemical compounds; hence, DL-based tools can be viewed as pre-screening tools for traditional methods such as DFT. For example, Xie et al. used the CGCNN model to screen stable perovskite materials 67 as well as for hierarchical visualization of the materials space 121 . Park et al. 122 used iCGCNN to screen ThCr2Si2-type materials. Lugier et al. used DL methods to predict thermoelectric properties 123 . Rosen et al. 124 used graph neural network models to predict the bandgaps of metal-organic frameworks. DL for molecular materials has been used to predict technologically important properties such as aqueous solubility 125 and toxicity 126 .

It should be noted that full atomistic representations and the associated DL models are only possible if the crystal structure and atom positions are available. In practice, precise atom positions are only available from DFT structural relaxations or experiments, and they are themselves one of the goals of materials discovery rather than the starting point. Hence, alternative methods have been proposed to bypass the need for atom positions when building DL models. For example, Jain and Bligaard 127 proposed atomic-position-independent descriptors and used a CNN model to learn the energies of crystals. Such descriptors include information based only on symmetry (e.g., space group and Wyckoff position). In principle, the method can be applied universally to all crystals; nevertheless, the model errors tend to be much higher than those of graph-based models. A similar coarse-grained representation using Wyckoff positions was also used by Goodall et al. 128 . Alternatively, Zuo et al. 129 started from hypothetical structures without precise atom positions and used Bayesian optimization coupled with a MEGNet energy model as an energy evaluator to perform direct structural relaxation. The resulting Bayesian optimization with symmetry relaxation (BOWSR) algorithm successfully discovered the hard materials ReWB (Pca21) and MoWC2 (P63/mmc), which were then experimentally synthesized.

Applications in chemical formula and segment representations

Some of the earliest applications of DL used SMILES strings for molecules, elemental fractions and chemical descriptors for solids, and sequences of protein names as descriptors. Such descriptors lack explicit atomic structure information but are still useful for various pre-screening applications for both theoretical and experimental data.

SMILES and fragment representation

The simplified molecular-input line-entry system (SMILES) is a method to represent the elemental composition and bonding of molecular structures using short American Standard Code for Information Interchange (ASCII) strings. SMILES can express structural differences, including the chirality of compounds, making it more informative than a simple chemical formula. A SMILES string is a simple grid-like (1D grid) structure that can also represent molecular sequences such as DNA, macromolecules/polymers, and protein sequences 130 , 131 . In addition to the chemical constituents, as in the chemical formula, bonding (such as double and triple bonds) is represented by special symbols (such as ’=’ and ’#’). A branch point is indicated with a left-hand bracket “(”, while the right-hand bracket “)” indicates that all the atoms in that branch have been taken into account. SMILES strings are converted into a distributed representation termed a SMILES feature matrix (a sparse matrix), to which DL can then be applied similarly to image data. The length of the SMILES matrix is generally kept fixed (such as 400) during training, and in addition to the SMILES characters, multiple elemental and bonding attributes (such as chirality and aromaticity) can be used. Key DL tasks for molecules include (a) novel molecule design and (b) molecule screening.
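
The sketch below shows one simple way to turn a SMILES string into a fixed-size one-hot feature matrix of the kind described above; the character vocabulary, the fixed length of 400, and the caffeine example are illustrative choices, and real implementations typically add further elemental and bonding attributes.

import numpy as np

# One-hot encode a SMILES string into a fixed-size (max_len x vocabulary) matrix.
CHARSET = "#()+-.=0123456789BCFHINOPS[]cilnors"   # illustrative character vocabulary

def smiles_to_matrix(smiles, charset=CHARSET, max_len=400):
    mat = np.zeros((max_len, len(charset)), dtype=np.float32)
    for i, ch in enumerate(smiles[:max_len]):
        j = charset.find(ch)
        if j >= 0:
            mat[i, j] = 1.0            # one-hot entry for this character position
    return mat

caffeine = "CN1C=NC2=C1C(=O)N(C(=O)N2C)C"
features = smiles_to_matrix(caffeine)   # shape: (400, len(CHARSET))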

Novel molecules with target properties can be designed using VAE-, GAN-, and RNN-based methods 132 , 133 , 134 . The DL-generated molecules might not all be physically valid, but the goal is to train the model to learn the patterns in SMILES strings such that the output resembles valid molecules; chemical intuition can then be used to further screen the candidates. DL for SMILES can also be used for molecular screening, such as predicting molecular toxicity. Some of the common SMILES datasets are ZINC 135 , Tox21 136 , and PubChem 137 .

Due to the difficulty of enforcing the generation of valid molecular structures from SMILES, fragment-based models such as DeepFrag and DeepFrag-K have been developed 138 , 139 . In fragment-based models, a fragment of the ligand in a ligand/receptor complex is removed and a DL model is then trained to predict the most suitable fragment substituent. A set of useful tools for SMILES and fragment representations is provided in Table 2 .

Chemical formula representation

There are several ways of using chemical formula-based representations for building ML/DL models, beginning with a simple vector of raw elemental fractions 140 , 141 or of weight percentages of alloying compositions 142 , 143 , 144 , 145 , as well as more sophisticated hand-crafted descriptors or physical attributes that add known chemistry knowledge (e.g., electronegativity, valency, etc. of the constituent elements) to the feature representations 146 , 147 , 148 , 149 , 150 , 151 . Statistical and mathematical operations such as average, max, min, median, mode, and exponentiation can be carried out on the elemental properties of the constituent elements to obtain a set of descriptors for a given compound. The number of such composition-based features can range from a few dozen to a few hundred. One of the commonly used representations that has been shown to work for a variety of use cases is the materials agnostic platform for informatics and exploration (MagPie) 150 . All these composition-based representations can be used with both traditional ML methods such as random forests and with DL.
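
A minimal sketch of such composition-based featurization is given below for BaTiO3: raw elemental fractions plus a few statistics of one elemental property (Pauling electronegativity). Real feature sets such as MagPie use many more properties and statistics; the parsing and property table here are simplified assumptions.

import numpy as np

# Elemental fractions and simple statistics of an elemental property for BaTiO3.
pauling_en = {"Ba": 0.89, "Ti": 1.54, "O": 3.44}     # Pauling electronegativities
composition = {"Ba": 1, "Ti": 1, "O": 3}

total = sum(composition.values())
fractions = {el: n / total for el, n in composition.items()}   # raw elemental fractions

en = np.array([pauling_en[el] for el in composition])
w = np.array([fractions[el] for el in composition])
features = {
    **{f"frac_{el}": f for el, f in fractions.items()},
    "en_mean": float(w @ en),        # composition-weighted average
    "en_max": float(en.max()),
    "en_min": float(en.min()),
}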

It is relevant to note that ElemNet 141 , a 17-layer neural network composed of fully connected layers that uses only raw elemental fractions as input, was found to significantly outperform traditional ML methods such as random forests, even when the latter were allowed to use more sophisticated physical attributes based on MagPie as input. Although no periodic table information was provided to the model, it was found to self-learn some interesting chemistry, like groups (element similarity) and charge balance (element interaction). It was also able to predict phase diagrams for unseen materials systems, underscoring the power of DL for representation learning directly from raw inputs without explicit feature extraction. Further increasing the depth of the network was found to adversely affect the model accuracy due to the vanishing gradient problem. To address this issue, Jha et al. 152 developed IRNet, which uses individual residual learning to allow a smoother flow of gradients and enable deeper learning for cases where big data is available. IRNet models were tested on a variety of large and small materials datasets, such as OQMD, AFLOW, Materials Project, and JARVIS, using different vector-based materials representations (element fractions, MagPie, structural). They were found not only to successfully alleviate the vanishing gradient problem and enable deeper learning, but also to yield significantly better model accuracy than plain deep neural networks and traditional ML techniques for a given input representation in the presence of big data 153 . Further, graph-based methods such as Roost 154 have also been developed and can outperform many similar techniques.

Such methods have been used for the diverse DFT datasets mentioned above in Table 1 as well as for experimental datasets such as SuperCon 155 , 156 for quick pre-screening applications. In terms of applications, they have been used to predict properties such as formation energy 141 , bandgap and magnetization 152 , superconducting temperature 156 , and bulk and shear moduli 153 . They have also been used for transfer learning across datasets to enhance predictive accuracy on small data 34 , even for different source and target properties 157 , which is especially useful for building predictive models of target properties for which big source datasets may not be readily available.

Libraries of such descriptors have been developed, such as MatMiner 151 and DScribe 158 , and some examples of such models are given in Table 2 . These representations are especially useful for experimental datasets, such as those for superconducting materials, where the atomic structure is not tabulated. However, they cannot distinguish different polymorphs of a system, which have different point groups and space groups. It has recently been shown that although composition-based representations can help build ML/DL models that predict some properties, like formation energy, with remarkable accuracy, this does not necessarily translate into accurate predictions of other properties, such as stability, when compared to DFT's own accuracy 159 .

Spectral models

When electromagnetic radiation hits a material, the interaction between the radiation and matter, measured as a function of the wavelength or frequency of the radiation, produces a spectroscopic signal. By studying such spectra, researchers can gain insight into a material's composition, structure, and dynamic properties. Spectroscopic techniques are foundational in materials characterization. For instance, X-ray diffraction (XRD) has been used to characterize the crystal structure of materials for more than a century. Spectroscopic analysis can involve fitting quantitative physical models (for example, Rietveld refinement) or more empirical approaches such as fitting linear combinations of reference spectra, as with X-ray absorption near-edge spectroscopy (XANES). Both approaches require a high degree of researcher expertise through careful design of experiments; specification, revision, and iterative fitting of physical models; or the availability of template spectra of known materials. In recent years, with advances in high-throughput experiments and computational data, spectroscopic data have multiplied, giving researchers opportunities to learn from the data and potentially displace conventional analysis methods. This section covers emerging DL applications in various modes of spectroscopic data analysis, aiming to offer practical examples and insights. Some of the applications are shown in Fig. 3 .

Figure 3

a Predicting structure information from the X-ray diffraction 374 , Reprinted according to the terms of the CC-BY license 374 . Copyright 2020. b Predicting catalysis properties from computational electronic density of states data. Reprinted according to the terms of the CC-BY license 202 . Copyright 2021.

Currently, large-scale and element-diverse spectral data mainly exist in computational databases. For example, in ref. 160 , the authors calculated the infrared spectra, piezoelectric tensor, Born effective charge tensor, and dielectric response as part of the JARVIS-DFT DFPT database. The Materials Project has established the largest computational X-ray absorption database (XASDb), covering the K-edge X-ray absorption near-edge structure (XANES) 161 , 162 and the L-edge XANES 163 of a large number of material structures. The database currently hosts more than 400,000 K-edge XANES site-wise spectra and 90,000 L-edge XANES site-wise spectra of many compounds in the Materials Project. There are considerably fewer experimental XAS spectra, on the order of hundreds, as in the EELSDb and the XASLib; collecting large experimental spectral databases that cover a wide range of elements is a challenging task. Collective efforts have focused on curating data extracted from different sources, as found in the RRUFF Raman, XRD, and chemistry database 164 , the open Raman database 165 , and the SOP spectra library 166 ; however, data consistency is not guaranteed. It is also now possible for contributors to share experimental data in a Materials Project curated database, MPContribs 167 . This database is supported by the US Department of Energy (DOE), providing some expectation of persistence. Entries can be kept private or published and are linked to the main Materials Project computational databases. There is an ongoing effort to capture data from DOE-funded synchrotron light sources ( https://lightsources.materialsproject.org/ ) into MPContribs in the future.

Recent advances in sources, detectors, and experimental instrumentation have made high-throughput measurements of experimental spectra possible, giving rise to new possibilities for spectral data generation and modeling. Such examples include the HTEM database 10 , which contains 50,000 optical absorption spectra, and the UV-Vis database of 180,000 samples from the Joint Center for Artificial Photosynthesis. Some of the common databases for spectral data are shown in Table 3 . Cloud-based software-as-a-service platforms for high-throughput data analysis are also beginning to appear, for example pair-distribution function (PDF) analysis in the cloud ( https://pdfitc.org ) 168 , backed by structured databases where data can be kept private or made public. This transition from data analysis software installed and run locally on a user's computer to the cloud will facilitate the sharing and reuse of data by the community.

Applications

Due to the widespread deployment of XRD across many materials technologies, XRD spectra became one of the first test grounds for DL models. Phase identification from XRD can be mapped onto a classification task (assuming all phases are known) or an unsupervised clustering task. Unlike the traditional analysis of XRD data, where the spectra are treated as convolved, discrete peak positions and intensities, DL methods treat the data as a continuous pattern, similar to an image. Unfortunately, large collections of experimental XRD patterns gathered in one place are not readily available at the moment. Nevertheless, the extensive, high-quality crystal structure data that do exist make generating simulated XRD patterns straightforward.
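
For example, a simulated XRD pattern can be generated from a crystal structure in a few lines, assuming pymatgen's XRDCalculator; the rock-salt NaCl structure here is illustrative, and in practice such patterns are typically broadened and noise-augmented before being used as CNN training data.

from pymatgen.core import Lattice, Structure
from pymatgen.analysis.diffraction.xrd import XRDCalculator

# Build rock-salt NaCl and compute its simulated Cu K-alpha diffraction pattern.
structure = Structure.from_spacegroup(
    "Fm-3m", Lattice.cubic(5.64), ["Na", "Cl"], [[0, 0, 0], [0.5, 0.5, 0.5]]
)
calc = XRDCalculator(wavelength="CuKa")
pattern = calc.get_pattern(structure, two_theta_range=(10, 90))
print(pattern.x[:5])   # 2-theta positions of the first reflections
print(pattern.y[:5])   # corresponding intensities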

Park et al. 169 calculated 150,000 XRD patterns from the Inorganic Crystal Structure Database (ICSD) structural database 170 and then used CNN models to predict structural information from the simulated XRD patterns. The accuracies of the CNN models reached 81.14%, 83.83%, and 94.99% for space-group, extinction-group, and crystal-system classifications, respectively.

Liu et al. 95 obtained similar accuracies by using a CNN for classifying atomic pair-distribution function (PDF) data into space groups. The PDF is obtained by Fourier transforming XRD into real space and is particularly useful for studying the local and nanoscale structure of materials. In the case of the PDF, models were trained, validated, and tested on simulated data from the ICSD. However, the trained model showed excellent performance when given experimental data, something that can be a challenge in XRD data because of the different resolutions and line-shapes of the diffraction data depending on specifics of the sample and experimental conditions. The PDF seems to be more robust against these aspects.

Similarly, Zaloga et al. 171 also used the ICSD database for XRD pattern generation and CNN models to classify crystals. The models achieved 90.02% and 79.82% accuracy for crystal systems and space groups, respectively.

It should be noted that the ICSD database contains many duplicates, and such duplicates should be filtered out to avoid information leakage. There is also a large difference in the number of structures represented in each space group (the label) in the database resulting in data normalization challenges.

Lee et al. 172 developed a CNN model for phase identification from samples consisting of a mixture of several phases in a limited chemical space relevant for battery materials. The training data are mixed patterns consisting of 1,785,405 synthetic XRD patterns from the Sr-Li-Al-O phase space. The resulting CNN can not only identify the phases but also predict the compound fraction in the mixture. A similar CNN was utilized by Wang et al. 173 for fast identification of metal-organic frameworks (MOFs), where experimental spectral noise was extracted and then synthesized into the theoretical XRD for training data augmentation.

An alternative idea was proposed by Dong et al. 174 . Instead of recognizing only phases with a CNN, their proposed “parameter quantification network” (PQ-Net) was able to extract physico-chemical information. The PQ-Net yields accurate predictions of scale factors, crystallite size, and lattice parameters for simulated and experimental XRD spectra. The work by Aguiar et al. 175 went a step further and proposed a modular neural network architecture that enables the combination of diffraction patterns and chemistry data and provides a ranked list of predictions. The ranked predictions offer user flexibility and overcome some aspects of overconfidence in model predictions. In practical applications, AI-driven XRD identification can be beneficial for high-throughput materials discovery, as shown by Maffettone et al. 176 . In their work, an ensemble of 50 CNN models was trained on synthetic data reproducing experimental variations (missing peaks, broadening, peak shifting, noise). The model ensemble is capable of predicting the probability of each category label. A similar data augmentation idea was adopted by Oviedo et al. 177 , where experimental XRD data for 115 thin-film metal halides were measured, and CNN models trained on the augmented XRD data achieved accuracies of 93% and 89% for classifying dimensionality and space group, respectively.

Although not a DL method, an unsupervised machine learning approach, non-negative matrix factorization (NMF), is showing great promise for yielding chemically relevant XRD spectra from time- or spatially-dependent sets of diffraction patterns. NMF is closely related to principal component analysis in that it takes a set of patterns as a matrix and then compresses the data by reducing the dimensionality to find the most important components. In NMF, a constraint is applied that all components and their weights must be strictly positive. This often corresponds to a real physical situation (for example, spectra tend to be positive, as are the weights of chemical constituents). As a result, the mathematical decomposition often yields interpretable, physically meaningful components and weights, as shown by Liu et al. for PDF data 178 . An extension of this showed that, in a spatially resolved study, NMF could be used to extract chemically resolved differential PDFs (similar to the information in EXAFS) from non-chemically resolved PDF measurements 179 . NMF is very quick and easy to apply and can be applied to just about any set of spectra. It is likely to become widely used and is being implemented on the PDFitc.org website to make it more accessible to potential users.
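
A short sketch of applying NMF to a set of spectra, assuming scikit-learn's implementation: each row of the data matrix is one pattern, and the factorization returns non-negative components and per-pattern weights. The random data stand in for a real time- or spatially-dependent series of patterns.

import numpy as np
from sklearn.decomposition import NMF

# Decompose a stack of spectra into non-negative components and weights.
spectra = np.abs(np.random.rand(50, 1000))     # 50 patterns x 1000 intensity points (placeholder data)
model = NMF(n_components=3, init="nndsvd", max_iter=500)
weights = model.fit_transform(spectra)         # (50, 3): contribution of each component to each pattern
components = model.components_                 # (3, 1000): candidate "pure" spectra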

Beyond XRD, the XAS, Raman, and infrared spectra also contain rich structure-dependent spectroscopic information about a material. Unlike XRD, where relatively simple theories and equations exist to relate structures to the spectral patterns, the relationships between general spectra and structures are somewhat elusive. This difficulty has created a higher demand for machine learning models that can learn structural information from other spectra.

For instance, X-ray absorption spectroscopy (XAS), including X-ray absorption near-edge spectroscopy (XANES) and extended X-ray absorption fine structure (EXAFS), is usually used to analyze structural information at the atomic level. However, the high signal-to-noise XANES region has no closed-form equation for data fitting, so DL modeling of XAS data offers new opportunities. Timoshenko et al. used neural networks to predict the coordination numbers of Pt 180 and Cu 181 in nanoclusters from XANES. Aside from the high accuracies, the neural networks also offer high prediction speed and new opportunities for quantitative XANES analysis. Timoshenko et al. 182 further carried out a novel analysis of EXAFS using DL. Although EXAFS analysis has an explicit equation to fit, it is limited to the first few coordination shells and to relatively ordered materials. Timoshenko et al. 182 first transformed the EXAFS data into 2D maps with a wavelet transform and then supplied the 2D data to a neural network model. The model can instantly predict relatively long-range radial distribution functions, offering in situ local structure analysis of materials. The advent of high-throughput XAS databases has recently opened more possibilities for machine learning models to be deployed on XAS data. For example, Zheng et al. 161 used an ensemble learning method to match and rapidly search for new spectra in the XASDb. Later, the same authors showed that random forest models outperform DL models such as MLPs or CNNs in directly predicting atomic environment labels from XANES spectra 183 . Similar approaches were also adopted by Torrisi et al. 184 . In practical applications, Andrejevic et al. 185 used the XASDb data together with the topological materials database to construct CNN models that classify the topology of materials from XANES and symmetry-group inputs. The model correctly predicted 81% of topological and 80% of trivial cases and achieved 90% accuracy in material classes containing certain elements.

Raman, infrared, and other vibrational spectroscopies provide structural fingerprints and are usually used to discriminate and estimate the concentration of components in a mixture. For example, Madden et al. 186 have used neural network models to predict the concentration of illicit materials in a mixture using Raman spectra. Interestingly, several groups have independently found that DL models outperform chemometric analysis in vibrational spectroscopies 187 , 188 . For learning vibrational spectra, the number of training spectra is usually less than or on the order of the number of features (intensity points), and the models can easily overfit. Hence, dimensionality reduction strategies are commonly used to compress the information using, for example, principal component analysis (PCA) 189 , 190 . DL approaches do not have such concerns and offer elegant and unified solutions. For example, Liu et al. 191 applied CNN models to the Raman spectra in the RRUFF spectral database and showed that CNN models outperform classical machine learning models such as SVMs in classification tasks. More DL applications in vibrational spectral analysis can be found in a recent review by Yang et al. 192 .

Although most current DL work focuses on the inverse problem, i.e., predicting structural information from spectra, some innovative approaches also solve the forward problem by predicting spectra from the structure. In this case, the spectroscopic data can be viewed simply as a high-dimensional material property of the structure. This is most common in molecular science, where predicting infrared spectra 193 and molecular excitation spectra 194 is of particular interest. In the early 2000s, Selzer et al. 193 and Kostka et al. 195 attempted to predict infrared spectra directly from molecular structural descriptors using neural networks. Non-DL models can also perform such tasks with reasonable accuracy 196 . Among DL models, Chen et al. 197 used a Euclidean neural network (E(3)NN) to predict phonon density of states (DOS) spectra 198 from atom positions and element types. The E(3)NN model captures the symmetries of the crystal structures, with no need for data augmentation to achieve the target invariances; hence the model is extremely data-efficient and can give reliable DOS spectra and heat capacity predictions using relatively sparse data of 1200 calculation results covering 65 elements. A similar idea was also used to predict XAS spectra. Carbone et al. 199 used a message passing neural network (MPNN) to predict the O and N K-edge XANES spectra from the molecular structures in the QM9 database 7 . The training XANES data were generated using the FEFF package 200 . The trained MPNN model reproduced all prominent peaks in the predicted XANES, and 90% of the predicted peaks were within 1 eV of the FEFF calculations. Similarly, Rankine et al. 201 started from the two-body radial distribution function (RDC) and used a deep neural network model to predict the Fe K-edge XANES spectra of arbitrary local environments.

In addition to learning structure-to-spectra or spectra-to-structure relationships, a few works have also explored the possibility of relating spectra to other material properties in a non-trivial way. The DOSnet proposed by Fung et al. 202 (Fig. 3 b) uses electronic DOS spectra calculated from DFT as inputs to a CNN model to predict the adsorption energies of H, C, N, O, S and their hydrogenated counterparts CH, CH2, CH3, NH, OH, and SH on bimetallic alloy surfaces. This approach extends the earlier d-band theory 203 , in which only the d-band center, a scalar, was used to correlate with the adsorption energy on transition metals. Similarly, Kaundinya et al. 204 used the Atomistic Line Graph Neural Network (ALIGNN) to predict the DOS of 56,000 materials in the JARVIS-DFT database using a direct discretized spectrum (D-ALIGNN) and a compressed low-dimensional representation obtained with an autoencoder (AE-ALIGNN). Stein et al. 205 tried to learn the mapping between an image of a material and its UV-vis spectrum using a conditional variational autoencoder (cVAE) with neural network backbones; such models can generate the UV-vis spectrum directly from a simple material image, offering much faster material characterization. Predicting gas adsorption isotherms for direct air capture (DAC) is another important application of spectra-based DL models, and there have been several important works 206 , 207 on CO2 capture with high-performance metal-organic frameworks (MOFs), which are important for mitigating climate change.

Image-based models

Computer vision is often credited as precipitating the current wave of mainstream DL applications a decade ago 208 . Naturally, materials researchers have developed a broad portfolio of applications of computer vision for accelerating and improving image-based material characterization techniques. High-level microscopy vision tasks can be organized as follows: image classification (and material property regression); auto-tuning of experimental imaging hyperparameters; pixelwise learning (e.g., semantic segmentation); super-resolution imaging; object/entity recognition, localization, and tracking; and microstructure representation learning.

Often these tasks generalize across many different imaging modalities, spanning optical microscopy (OM), scanning electron microscopy (SEM) techniques, scanning probe microscopy (SPM, as in scanning tunneling microscopy (STM) or atomic force microscopy (AFM)), and transmission electron microscopy (TEM) variants, including scanning transmission electron microscopy (STEM).

The images obtained with these techniques range from local atomic-scale to mesoscale structures (microstructure), capturing the distribution and type of defects and their dynamics, which are critically linked to the functionality and performance of materials. Over the past few decades, atomic-scale imaging has become widespread and near-routine due to aberration-corrected STEM 209 . The collection of large image datasets increasingly presents an analysis bottleneck in the materials characterization pipeline, creating an immediate need for automated image analysis. Non-DL image analysis methods have driven tremendous progress in quantitative microscopy, but image processing pipelines are often brittle and require too much manual identification of image features to be broadly applicable. Thus, DL is currently the most promising solution for high-performance, high-throughput automated analysis of image datasets. For a good overview of applications in microstructure characterization specifically, see ref. 210 .

Image datasets for materials can come from either experiments or simulations. The software libraries mentioned above can be used to generate simulated images such as STM/STEM images, and images can also be obtained from the literature. A few common image datasets are listed in Table 4 . Recently, there has been rapid development of image learning tasks for materials, leading to several useful packages; we list some of them in Table 4 as well.

Applications in image classification and regression

DL for images can be used to automatically extract information from images or transform images into a more useful state. The benefits of automated image analysis include higher throughput, better consistency of measurements compared to manual analysis, and even the ability to measure signals in images that humans cannot detect. The benefits of altering images include image super-resolution, denoising, inferring 3D structure from 2D images, and more. Examples of the applications of each task are summarized below.

Image classification and regression

Classification and regression are the processes of predicting one or more values associated with an image. In the context of DL, the only difference between the two is that the outputs of classification are discrete while the outputs of regression models are continuous. The same network architecture may be used for both by choosing the appropriate output activation function (i.e., linear for regression or softmax for classification). Due to its simplicity, image classification is one of the most established DL techniques in the materials science literature; nonetheless, it remains an area of active research.
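
The point about sharing an architecture between classification and regression can be made concrete with a small PyTorch sketch: the same backbone feeds either a multi-class output (with softmax applied inside the cross-entropy loss) or a single linear output for a continuous property. Sizes and class counts are illustrative.

import torch.nn as nn

# Shared backbone with interchangeable output heads.
backbone = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 256), nn.ReLU())

classifier = nn.Sequential(backbone, nn.Linear(256, 5))   # 5 class logits; softmax is applied in the loss
regressor = nn.Sequential(backbone, nn.Linear(256, 1))    # single linear output for a continuous property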

Modarres et al. applied DL with transfer learning to automatically classify SEM images of different material systems 211 . They demonstrated how a single approach can be used to identify a wide variety of features and material systems, such as particles, fibers, microelectromechanical systems (MEMS) devices, and more. The model achieved 90% accuracy on a test set; misclassifications resulted from images containing objects from multiple classes, an inherent limitation of single-class classification. More advanced techniques, such as those described in subsequent sections, can be applied to avoid these limitations. Additionally, they developed a system to deploy the trained model at scale and process thousands of images in parallel, an approach that is essential for large-scale, high-throughput experiments or industrial applications of classification. ImageNet-based deep transfer learning has also been successfully applied for crack detection in macroscale materials images 212 , 213 , as well as for property prediction on small, noisy, and heterogeneous industrial datasets 214 , 215 .

DL has also been applied to characterize the symmetries of simulated measurements of samples. In ref. 216 , Ziletti et al. obtained a large database of perfect crystal structures, introduced defects into the perfect lattices, and simulated diffraction patterns for each structure. DL models were trained to identify the space group of each diffraction pattern. The models achieved high classification performance, even on crystals with significant numbers of defects, surpassing the performance of conventional algorithms for detecting symmetries from diffraction patterns.

DL has also been applied to classify symmetries in simulated STM measurements of 2D material systems 217 . DFT was used to generate simulated STM images for a variety of material systems. A convolutional neural network was trained to identify which of the five 2D Bravais lattices each material belonged to using the simulated STM image as input. The model achieved an average F1 score of around 0.9 for each lattice type.

DL has also been used to improve the analysis of electron backscatter diffraction (EBSD) data, with Liu et al. 218 presenting one of the first DL-based solutions for EBSD indexing, capable of taking an EBSD pattern as input and predicting the three Euler angles representing the orientation that would have produced it. However, they treated the three Euler angles as independent of each other, creating a separate CNN for each angle, although the three angles should be considered together. Jha et al. 219 built upon that work to train a single DL model to predict the three Euler angles in simulated EBSD patterns of polycrystalline Ni while directly minimizing the misorientation angle between the true and predicted orientations. When tested on experimental EBSD patterns, the model achieved a 16% lower disorientation error than dictionary-based indexing. Similarly, Kaufman et al. trained a CNN to predict the corresponding space group for a given diffraction pattern 220 , enabling EBSD to be used for phase identification in samples where the existing phases are unknown and providing a faster or more cost-effective characterization method than X-ray or neutron diffraction. The results from these studies demonstrate the promise of applying DL to improve the performance and utility of EBSD experiments.

Recently, DL has also been used to learn crystal plasticity from images of strain profiles 221 , 222 . The work in ref. 221 used domain knowledge integration in the form of two-point auto-correlations to enhance predictive accuracy, while ref. 222 applied residual learning to learn crystal plasticity at the nanoscale. The latter used strain profiles of materials with sample widths ranging from 2 μm down to 62.5 nm, obtained from discrete dislocation dynamics, to build a deep residual network capable of identifying the prior deformation history of a sample as low, medium, or high. Compared to a correlation-function-based method (68.24% accuracy), the DL model was found to be significantly more accurate (92.48%) and also capable of predicting the stress-strain curves of test samples. This work additionally used saliency maps to try to interpret the trained DL model.

Pixelwise learning

DL can also be applied to generate one or more predictions for every pixel in an image, providing more detailed information about the size, position, orientation, and morphology of features of interest. Pixelwise learning has thus been a significant area of focus, with many recent studies appearing in the materials science literature.
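
A minimal sketch of pixelwise prediction is shown below: a small fully convolutional network maps an input image to one class score per pixel at the same spatial resolution, and the per-pixel argmax gives the segmentation map. Channel counts, image size, and the number of classes are illustrative.

import torch
import torch.nn as nn

# Tiny fully convolutional network: one class score per pixel.
seg_net = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 4, kernel_size=1),             # 4 classes
)
image = torch.rand(1, 1, 128, 128)
scores = seg_net(image)                          # shape: (1, 4, 128, 128)
labels = scores.argmax(dim=1)                    # predicted class for every pixel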

Azimi et al. applied an ensemble of fully convolutional neural networks to segment martensite, tempered martensite, bainite, and pearlite in SEM images of carbon steels. Their model achieved 94% accuracy, a significant improvement over previous efforts to automate the segmentation of different phases in SEM images. DeCost, Francis, and Holm applied PixelNet to segment microstructural constituents in the UltraHigh Carbon Steel Database 223 , 224 . In contrast to fully convolutional networks, which encode and decode visual signals using a series of convolution layers, PixelNet constructs “hypercolumns”, i.e., concatenations of the feature representations corresponding to each pixel at different layers of a neural network. The hypercolumns are treated as individual feature vectors, which can then be classified using any typical classification approach, such as a multilayer perceptron. This approach achieved phase segmentation precision and recall scores of 86.5% and 86.5%, respectively. Additionally, this approach was used to segment spheroidite particles in the matrix, achieving precision and recall scores of 91.1% and 91.1%, respectively.

Pixelwise DL has also been applied to automatically segment dislocations in Ni superalloys 210 . Dislocations are visually similar to \(\gamma -{\gamma }^{\prime}\) interfaces in Ni superalloys, and with limited training data a single segmentation model could not distinguish between these features. To overcome this, a second model was trained to generate a coarse mask corresponding to the deformed region of the material. Overlaying this mask with predictions from the first model selects the dislocations, enabling them to be distinguished from \(\gamma -{\gamma }^{\prime}\) interfaces.

Stan, Thompson, and Voorhees applied pixelwise DL to characterize dendritic growth from serial sectioning and synchrotron computed tomography data 225 . Both of these techniques generate large amounts of data, making manual analysis impractical. Conventional image processing approaches, which rely on thresholding, edge detectors, or other hand-crafted filters, cannot effectively deal with the noise, contrast gradients, and other artifacts present in the data. Despite having only a small training set of labeled images, a SegNet model automatically segmented these images with much higher performance than the conventional approaches.

Object/entity recognition, localization, and tracking

Object detection or localization is needed when individual instances of recognized objects in a given image need to be distinguished from each other. In cases where instances do not overlap each other by a significant amount, individual instances can be resolved through post-processing of semantic segmentation outputs. This technique has been applied extensively to detect individual atoms and defects in microstructural images.

Madsen et al. applied pixelwise DL to detect atoms in simulated atomic-resolution TEM images of graphene 226 . A neural network was trained to detect the presence of each atom as well as predict its column height. Pixelwise results are used as seeds for watershed segmentation to achieve instance-level detection. Analysis of the arrangement of the atoms led to the autonomous characterization of defects in the lattice structure of the material. Interestingly, despite being trained only on simulations, the model successfully detected atomic positions in experimental images.

Maksov et al. demonstrated atomistic defect recognition and tracking across sequences of atomic-resolution STEM images of WS2 227 . The lattice structure and defects existing in the first frame were characterized through a physics-based approach utilizing Fourier transforms, and the positions of atoms and defects in the first frame were used to train a segmentation model. Despite only using the first frame for training, the model successfully identified and tracked defects in the subsequent frames of each sequence, even when the lattice underwent significant deformation. Similarly, Yang et al. 228 used the U-Net architecture (as shown in Fig. 4 ) to detect vacancies and dopants in WSe2 in STEM images with a model accuracy of up to 98%. They classified the possible atomic sites, based on experimental observations, into five different types: tungsten, vanadium substituting for tungsten, selenium with no vacancy, mono-vacancy of selenium, and di-vacancy of selenium.

Figure 4

a Deep neural network U-Net model constructed for quantitative analysis of annular dark-field scanning transmission electron microscopy (ADF-STEM) images of V-WSe2. b Examples of the training dataset for deep learning of the atom segmentation model for five different species. c Pixel-level accuracy of the atom segmentation model as a function of training epoch. d Measurement accuracy of the segmentation model compared with human-based measurements. Scale bars are 1 nm [Reprinted according to the terms of the CC-BY license ref. 228 ].

Roberts et al. developed DefectSegNet to automatically identify defects in transmission electron microscopy (TEM) and STEM images of steel, including dislocations, precipitates, and voids 229 . They provide detailed information on the model’s design, training, and evaluation. They also compare measurements generated by the model to manual measurements performed by several different human experts, demonstrating that the measurements generated by DL are quantitatively more accurate and consistent.

Kusche et al. applied DL to localize defects in panoramic SEM images of dual-phase steel 230 . Manual thresholding was applied to identify dark defects against the brighter matrix. Regions containing defects were then classified via two neural networks. The first neural network distinguished between inclusions and ductile damage in the material. The second classified the type of ductile damage (e.g., notching, martensite cracking, etc.). Each defect was also segmented via a watershed algorithm to obtain detailed information on its size, position, and morphology.

Applying DL to localize defects and atomic structures is a popular area in materials science research. Thus, several other recent studies on these applications can be found in the literature 231 , 232 , 233 , 234 .

In the above examples, pixelwise DL or classification models are combined with conventional image analysis to distinguish individual instances of detected objects. However, when several adjacent objects of the same class touch or overlap each other in the image, this approach will falsely detect them as a single, larger object. In this case, DL models designed for object detection or instance segmentation can be used to resolve overlapping instances. In one such study, Cohn and Holm applied DL for instance-level segmentation of individual particles and satellites in dense powder images 235 . Segmenting each particle allows computer vision to generate detailed size and morphology information, which can be used to supplement experimental powder characterization for additive manufacturing. Additionally, overlaying the powder and satellite masks yielded the first method for quantifying the satellite content of powder samples, which cannot be measured experimentally.
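
As an illustration of the instance-segmentation approach, the short Python sketch below builds an off-the-shelf Mask R-CNN from torchvision and runs it on a placeholder image; the three classes (background, particle, satellite), the score threshold, and the random input are assumptions for illustration and do not reproduce the model of ref. 235:

```python
import torch
import torchvision

# Mask R-CNN with a custom number of classes (background + particle + satellite).
model = torchvision.models.detection.maskrcnn_resnet50_fpn(num_classes=3)
model.eval()

image = torch.rand(3, 512, 512)          # placeholder for a powder micrograph, scaled to [0, 1]
with torch.no_grad():
    pred = model([image])[0]             # dict with 'boxes', 'labels', 'scores', 'masks'

keep = pred["scores"] > 0.5              # keep confident detections
masks = pred["masks"][keep] > 0.5        # one binary mask per detected instance
print("instances detected:", int(keep.sum()))
```

In practice such a model would first be fine-tuned on annotated powder images before its detections are used for size and morphology analysis.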

Super-resolution imaging and auto-tuning experimental parameters

The studies listed so far focus on automating the analysis of existing data after it has been collected experimentally. However, DL can also be applied during experiments to improve the quality of the data itself. This can reduce the time for data collection or improve the amount of information captured in each image. Super-resolution and other DL techniques can also be applied in situ to autonomously adjust experimental parameters.

Recording high-resolution electron microscope images often requires long dwell times, limiting the throughput of microscopy experiments. Additionally, during imaging, interactions between the electron beam and the microscopy sample can result in undesirable effects, including charging of non-conductive samples and damage to sensitive samples. Thus, there is interest in using DL to artificially increase the resolution of images without introducing these artifacts. One method of interest is the application of generative adversarial networks (GANs) to this task.

De Haan et al. recorded SEM images of the same regions of interest in carbon samples containing gold nanoparticles at two resolutions 236 . The low-resolution images were used as inputs to a GAN, and the corresponding images with twice the resolution were used as the ground truth. After training, the GAN reduced the fraction of undetected gaps between nanoparticles from 13.9% to 3.7%, indicating that super-resolution was successful. Thus, applying DL led to a four-fold reduction in the interaction time between the electron beam and the sample.

Ede and Beanland collected a dataset of STEM images of different samples 237 . Images were subsampled with spiral and ‘jittered’ grid masks to obtain partial images with resolutions reduced by a factor up to 100. A GAN was trained to reconstruct full images from their corresponding partial images. The results indicated that despite a significant reduction in the sampling area, this approach successfully reconstructed high-resolution images with relatively small errors.
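
A minimal Python sketch of a super-resolution generator is given below; it upsamples a low-resolution micrograph by a factor of two using sub-pixel convolution. The architecture and sizes are illustrative assumptions and do not correspond to the networks of refs. 236 or 237; in a GAN setting this generator would be trained against a discriminator together with a pixel-wise loss on paired low- and high-resolution images:

```python
import torch
import torch.nn as nn

class SRGenerator(nn.Module):
    def __init__(self, channels=1, features=64, upscale=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, features, 3, padding=1), nn.ReLU(),
            nn.Conv2d(features, features, 3, padding=1), nn.ReLU(),
            # Sub-pixel convolution: produce upscale**2 * channels maps, then shuffle.
            nn.Conv2d(features, channels * upscale ** 2, 3, padding=1),
            nn.PixelShuffle(upscale),
        )

    def forward(self, x):
        return self.net(x)

lowres = torch.rand(4, 1, 64, 64)        # batch of low-resolution patches (placeholder)
highres = SRGenerator()(lowres)          # shape (4, 1, 128, 128)
```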

DL has also been applied to automated tip conditioning for SPM experiments. Rashidi and Wolkow trained a model to detect artifacts in SPM measurements resulting from degradation in tip quality 238 . Using an ensemble of convolutional neural networks resulted in 99% accuracy. After detecting that a tip had degraded, the SPM was configured to automatically recondition the tip in situ until the network indicated that the atomic sharpness of the tip had been restored. Monitoring and reconditioning the tip is the most time- and labor-intensive part of conducting SPM experiments, so automating this process through DL can increase the throughput and decrease the cost of collecting data through SPM.

In addition to materials characterization, DL can be applied to autonomously adjust parameters during manufacturing. Scime et al. mounted a camera to multiple 3D printers 239 . Images of the build plate were recorded throughout the printing process. A dynamic segmentation convolutional neural network was trained to recognize defects such as recoater streaking, incomplete spreading, spatter, porosity, and others. The trained model achieved high performance and was transferable to multiple printers representing three different additive manufacturing methods. This work is a first step toward enabling smart additive manufacturing machines that can correct defects and adjust parameters during printing.

There is also growing interest in establishing instruments and laboratories for autonomous experimentation. Eppel et al. trained multiple models to detect chemicals, materials, and transparent vessels in a chemistry lab setting 240 . This study provides a rigorous analysis of several different approaches for scene understanding. Models were trained to characterize laboratory scenes with different methods including semantic segmentation and instance segmentation, both with and without overlapping instances. The models successfully detected individual vessels and materials in a variety of settings. Finer-grained understanding of the contents of vessels, such as segmentation of individual phases in multi-phase systems, was limited, outlining the path for future work in this area. The results represent an important step towards realizing automated experimentation for laboratory-scale experiments.

Microstructure representation learning

Materials microstructure is often represented in the form of multi-phase, high-dimensional 2D/3D images and can therefore readily leverage image-based DL methods to learn robust, low-dimensional microstructure representations. These representations can subsequently be used to build predictive and generative models that learn forward and inverse structure-property linkages, which are typically studied across different length scales (multi-scale modeling). In this context, homogenization and localization refer to the transfer of information from lower length scales to higher length scales and vice versa. DL using customized CNNs has been used both for homogenization, i.e., predicting the macroscale property of a material given its microstructure information 221 , 241 , 242 , and for localization, i.e., predicting the strain distribution across a given microstructure for a given loading condition 243 .
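
A minimal Python (PyTorch) sketch of the homogenization idea is shown below: a small CNN maps a two-phase microstructure image to a single effective property. The layer sizes, the synthetic microstructures, and the target property are illustrative assumptions, not the architectures of refs. 221 or 241-243:

```python
import torch
import torch.nn as nn

class HomogenizationCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),                     # global average pooling
        )
        self.regressor = nn.Linear(32, 1)                # scalar effective property

    def forward(self, x):
        return self.regressor(self.features(x).flatten(1))

micro = (torch.rand(8, 1, 128, 128) > 0.5).float()      # synthetic two-phase microstructures
prop = HomogenizationCNN()(micro)                        # (8, 1) predicted properties
```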

Transfer learning has also been widely used for analyzing materials microstructure images; methods for improving the application of transfer learning to materials science problems remain an area of active research. Goetz et al. investigated the use of unsupervised domain adaptation as an alternative to simply fine-tuning a pre-trained model 244 . In this technique, a model is first trained on a labeled dataset in the source domain. Next, a discriminator model is used to train the model to generate domain-agnostic features. Compared to simple fine-tuning, unsupervised domain adaptation improved the performance of classification and segmentation neural networks on materials science datasets. However, the highest performance was achieved when the source domain was more visually similar to the target (for example, using a different set of microstructural images instead of ImageNet). This highlights the utility of establishing large, publicly available datasets of annotated images in materials science.
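
For reference, the simple fine-tuning baseline mentioned above can be sketched in a few lines of Python: an ImageNet-pretrained ResNet has its final layer replaced and is trained at a small learning rate on the target micrographs. The five-class task, the learning rate, and the placeholder batch are assumptions for illustration (the weights argument assumes torchvision 0.13 or later):

```python
import torch
import torch.nn as nn
import torchvision

model = torchvision.models.resnet18(weights="IMAGENET1K_V1")   # ImageNet-pretrained backbone
model.fc = nn.Linear(model.fc.in_features, 5)                  # new head for 5 micrograph classes

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on a placeholder batch.
images, labels = torch.rand(16, 3, 224, 224), torch.randint(0, 5, (16,))
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```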

Kitahara and Holm used the output of an intermediate layer of a pre-trained convolutional neural network as a feature representation for images of steel surface defects and Inconel fracture surfaces 245 . Images were classified by defect type or fracture surface orientation using unsupervised DL. Even though no labeled data were used to train the neural network or the unsupervised classifier, the model found natural decision boundaries that achieved classification performances of 98% and 88% for the defect classes and fracture surface orientations, respectively. Visualization of the representations through principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE) provided qualitative insights into the representations. Although a detailed physical interpretation of the representations is still a distant goal, this study provides tools for investigating patterns in visual signals contained in image-based datasets in materials science.
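
The feature-extraction strategy described above can be sketched with a minimal Python example (not the pipeline of ref. 245; the placeholder micrographs and the choice of ResNet-18 are assumptions): a pretrained CNN with its classification head removed produces fixed image representations, which are then projected to two dimensions with PCA for visualization or unsupervised grouping:

```python
import torch
import torch.nn as nn
import torchvision
from sklearn.decomposition import PCA

backbone = torchvision.models.resnet18(weights="IMAGENET1K_V1")
backbone.fc = nn.Identity()                 # expose the 512-dimensional penultimate features
backbone.eval()

images = torch.rand(32, 3, 224, 224)        # placeholder micrographs
with torch.no_grad():
    feats = backbone(images).numpy()        # (32, 512) image representations

coords = PCA(n_components=2).fit_transform(feats)   # 2D projection for plotting or clustering
```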

Larmuseau et al. investigated the use of triplet networks to obtain consistent representations for visually similar images of materials 246 . Triplet networks are trained with three images at a time. The first image, the reference, is classified by the network. The second image, called the positive, is another image with the same class label. The last image, called the negative, is an image from a separate class. During training, the loss function includes errors in predicting the class of the reference image, the difference in representations of the reference and positive images, and the similarity in representations of the reference and negative images. This process allows the network to learn consistent representations for images in the same class while distinguishing images from different classes. The triplet network outperformed an ordinary convolutional neural network trained for image classification on the same dataset.
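
A minimal Python sketch of the triplet objective is given below, using PyTorch's built-in margin loss; the toy embedding network and random images are placeholders rather than the model of ref. 246:

```python
import torch
import torch.nn as nn

embed = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 128))   # toy embedding network
loss_fn = nn.TripletMarginLoss(margin=1.0)

anchor   = embed(torch.rand(8, 1, 64, 64))   # reference images
positive = embed(torch.rand(8, 1, 64, 64))   # images from the same class as the reference
negative = embed(torch.rand(8, 1, 64, 64))   # images from a different class

loss = loss_fn(anchor, positive, negative)   # pulls positives together, pushes negatives apart
loss.backward()
```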

In addition to investigating representations used to analyze existing images, DL can generate synthetic images of materials systems. Generative Adversarial Networks (GANs) are currently the predominant method for synthetic microstructure generation. GANs consist of a generator, which creates a synthetic microstructure image, and a discriminator, which attempts to predict if a given input image is real or synthetic. With careful application, GANs can be a powerful tool for microstructure representation learning and design.
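
The adversarial training loop can be sketched in a few lines of Python; the fully connected toy networks and the flattened 64 x 64 binary "microstructures" below are illustrative assumptions, whereas the studies discussed next (refs. 247-251) use convolutional architectures and real or simulated micrographs:

```python
import torch
import torch.nn as nn

latent_dim, img_dim = 32, 64 * 64
G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, img_dim), nn.Sigmoid())
D = nn.Sequential(nn.Linear(img_dim, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1), nn.Sigmoid())
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

real = (torch.rand(16, img_dim) > 0.5).float()   # placeholder "real" microstructures
z = torch.randn(16, latent_dim)                  # latent design variables

# Discriminator step: real images labeled 1, generated images labeled 0.
fake = G(z).detach()
loss_d = bce(D(real), torch.ones(16, 1)) + bce(D(fake), torch.zeros(16, 1))
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

# Generator step: try to make the discriminator label generated images as real.
loss_g = bce(D(G(z)), torch.ones(16, 1))
opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```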

Yang and Li et al. 247 , 248 developed a GAN-based model for learning a low-dimensional embedding of microstructures, which could then be easily sampled and used with the generator of the GAN model to generate realistic, statistically similar microstructure images, thus enabling microstructural materials design. The model was able to capture complex, nonlinear microstructure characteristics and learn the mapping between the latent design variables and microstructures. In order to close the loop, the method was combined with a Bayesian optimization approach to design microstructures with optimal optical absorption performance. The discovered microstructures were found to have up to 17% better properties than randomly sampled microstructures. The unique architecture of their GAN model also facilitated generator scalability to generate arbitrary-sized microstructure images and discriminator transferability to build structure-property prediction models. Yang et al. 249 recently combined GANs with mixture density networks (MDNs) to enable inverse modeling in microstructural materials design, i.e., to generate the microstructure for a given desired property.

Hsu et al. constructed a GAN to generate 3D synthetic solid oxide fuel cell microstructures 250 . These microstructures were compared to other synthetic microstructures generated by DREAM.3D as well as experimentally observed microstructures measured via sectioning and imaging with PFIB-SEM. Synthetic microstructures generated from the GAN were observed to qualitatively show better agreement to the experimental microstructures than the DREAM.3D microstructures, as evidenced by the more realistic phase connectivity and lower amount of agglomeration of solid phases. Additionally, a statistical analysis of various features such as volume fraction, particle size, and several other quantities demonstrated that the GAN microstructures were quantitatively more similar to the real microstructures than the DREAM.3D microstructures.

In a similar study, Chun et al. generated synthetic microstructures of high energy materials using a GAN 251 . Once again, a synthetic microstructure generated via GAN showed better qualitative visual similarity to an experimentally observed microstructure compared to a synthetic microstructure generated via a transfer learning approach, with sharper phase boundaries and fewer computational artifacts. Additionally, a statistical analysis of the void size, aspect ratio, and orientation distributions indicated that the GAN produced microstructures that were quantitatively more similar to real materials.

Applications of DL to microstructure representation learning can help researchers improve the performance of the predictive models used for the applications listed above. Additionally, generative models can produce more realistic simulated microstructures. This can help researchers develop more accurate models for predicting material properties and performance without needing to synthesize and process these materials, significantly increasing the throughput of materials selection and screening experiments.

Mesoscale modeling applications

In addition to image-based characterization, deep learning methods are increasingly used in mesoscale modeling. Dai et al. 252 successfully trained a GNN to predict magnetostriction in a wide range of synthetic polycrystalline systems with around 10% prediction error. The microstructure is represented by a graph in which each node corresponds to a single grain, and the edges between nodes indicate an interface between neighboring grains. Five node features (three Euler angles, volume, and the number of neighbors) were associated with each grain. The GNN outperformed other machine learning approaches for property prediction of polycrystalline materials by accounting for interactions between neighboring grains.
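
The grain-graph idea can be illustrated with a minimal Python sketch (plain PyTorch, not the model of ref. 252; the adjacency matrix, hidden size, and random grain features are assumptions): each grain is a node with five features, one round of mean-aggregation message passing mixes information between neighboring grains, and pooling over all grains yields a single property prediction:

```python
import torch
import torch.nn as nn

class GrainGNN(nn.Module):
    def __init__(self, in_feats=5, hidden=32):
        super().__init__()
        self.msg = nn.Linear(in_feats, hidden)
        self.update = nn.Linear(in_feats + hidden, hidden)
        self.readout = nn.Linear(hidden, 1)

    def forward(self, x, adj):
        # x: (N, 5) per-grain features; adj: (N, N) with 1 where two grains share an interface.
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        messages = (adj @ self.msg(x)) / deg                  # mean over neighboring grains
        h = torch.relu(self.update(torch.cat([x, messages], dim=1)))
        return self.readout(h.mean(dim=0))                    # pool over grains -> property

n_grains = 50
x = torch.rand(n_grains, 5)                                   # Euler angles, volume, neighbor count
adj = (torch.rand(n_grains, n_grains) > 0.9).float()
adj = ((adj + adj.T) > 0).float()                             # symmetric random neighbor graph
prediction = GrainGNN()(x, adj)                               # e.g., magnetostriction
```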

Similarly, Cohn and Holm presented preliminary work applying GNNs to predict the occurrence of abnormal grain growth (AGG) in Monte Carlo simulations of microstructure evolution 253 . AGG appears to be stochastic, making it notoriously difficult to predict, control, and even observe experimentally in some materials. AGG has been reproduced in Monte Carlo simulations of material systems, but a model that can predict which initial microstructures will undergo AGG had not been established before. A dataset of Monte Carlo simulations was created using SPPARKS 254 , 255 . A microstructure GNN was trained to predict AGG in individual simulations, achieving 75% classification accuracy. In comparison, an image-based model only achieved 60% accuracy. The GNN also provided physical insight into AGG, indicating that only two neighborhood shells are needed to achieve the maximum performance reached in the study. These early results motivate additional work on applying GNNs to predict the occurrence of AGG in both simulated and real materials during processing.

Natural language processing

Most of the existing knowledge in the materials domain is currently unavailable as structured information and only exists as unstructured text, tables, or images in various publications. There exists a great opportunity to use natural language processing (NLP) techniques to convert text to structured data or to directly learn and make inferences from the text information. However, because this is a relatively new field within materials science, many challenges remain unsolved, such as resolving dependencies between words and phrases across multiple sentences and paragraphs.

Datasets for NLP

Datasets relevant to natural language processing include peer-reviewed journal articles, articles published on preprint servers such as arXiv or ChemRxiv, patents, and online material such as Wikipedia. Unfortunately, accessing or parsing most such datasets remains difficult. Peer-reviewed journal articles are typically subject to copyright restrictions and thus difficult to obtain, especially in the large numbers required for machine learning. Many publishers now offer text and data mining (TDM) agreements that can be signed online, allowing at least a limited, restricted amount of work to be performed. However, gaining access to the full text of many publications still typically requires strict and dedicated agreements with each publisher. The major advantage of working with publishers is that they have often already converted the articles from a document format such as PDF into an easy-to-parse format such as HyperText Markup Language (HTML). In contrast, articles on preprint servers and patents are typically available with fewer restrictions, but are commonly available only as PDF files. It remains difficult to properly parse text from PDF files in a reliable manner, even when the text is embedded in the PDF. Therefore, new tools that can easily and automatically convert such content into well-structured HTML format with few residual errors would likely have a major impact on the field. Finally, online sources of information such as Wikipedia can serve as another type of data source. However, such online sources are often more difficult to verify in terms of accuracy and also do not contain as much domain-specific information as the research literature.

Software libraries for NLP

Applying NLP to a raw dataset involves multiple steps. These steps include retrieving the data, various forms of “pre-processing” (sentence and word tokenization, word stemming and lemmatization, featurization such as word vectors or part of speech tagging), and finally machine learning for information extraction (e.g., named entity recognition, entity-relationship modeling, question and answer, or others). Multiple software libraries exist to aid in materials NLP, as described in Table 5 . We note that although many of these steps can in theory be performed by general-purpose NLP libraries such as NLTK 256 , SpaCy 257 , or AllenNLP 258 , the specialized nature of chemistry and materials science text (including the presence of complex chemical formulas) often leads to errors. For example, researchers have developed specialized codes to perform preprocessing that better detect chemical formulas (and not split them into separate tokens or apply stemming/lemmatization to them) and scientific phrases and notation such as oxidation states or symbols for physical units.
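
The tokenization issue can be illustrated with a small Python example; the regular expression below is a deliberately crude stand-in for the specialized tokenizers discussed above, intended only to show why chemical formulas need to be kept as single tokens:

```python
import re

# Crude pattern: a chemical-formula-like run (element symbols with digits), or an
# ordinary word, or a number, or any other single non-space character.
TOKEN = re.compile(r"[A-Z][a-z]?\d+(?:[A-Z][a-z]?\d*)+|[A-Za-z]+|\d+(?:\.\d+)?|\S")

def tokenize(sentence):
    return TOKEN.findall(sentence)

print(tokenize("The Li7La3Zr2O12 electrolyte showed 0.5 mS conductivity at 300 K."))
# ['The', 'Li7La3Zr2O12', 'electrolyte', 'showed', '0.5', 'mS', 'conductivity', 'at', '300', 'K', '.']
```

A generic word tokenizer or stemmer would typically split or alter "Li7La3Zr2O12", destroying exactly the information a downstream entity- or property-extraction model needs.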

Similarly, chemistry-specific codes for extracting entities are better at extracting the names of chemical elements (e.g., recognizing that “He” likely represents helium and not a male pronoun) and abbreviations for chemical formulas. Finally, word embeddings that convert words such as “manganese” into numerical vectors for further data mining are more informative when trained specifically on materials science text versus more generic texts, even when the latter datasets are larger 259 . Thus, domain-specific tools for NLP are required in nearly all aspects of the pipeline. The main exception is that the architecture of the specific neural network models used for information extraction (e.g., LSTM, BERT, or architectures used to generate word embeddings such as word2vec or GloVe) are typically not modified specifically for the materials domain. Thus, much of the materials and chemistry-centric work currently regards data retrieval and appropriate preprocessing. A longer discussion of this topic, with specific examples, can be found in refs. 260 , 261 .

NLP methods for materials have been applied for information extraction and search (particularly as applied to synthesis prediction) as well as materials discovery. As the domain is rapidly growing, we suggest dedicated reviews on this topic by Olivetti et al. 261 and Kononova et al. 260 for more information.

One of the major uses of NLP methods is to extract datasets from the text of published studies. Conventionally, building such datasets required manual entry by researchers combing the literature, a laborious and time-consuming process. Recently, software tools such as ChemDataExtractor 262 and other methods 263 based on more conventional machine learning and rule-based approaches have enabled automated or semi-automated extraction of datasets such as Curie and Néel magnetic phase transition temperatures 264 , battery properties 265 , UV-vis spectra 266 , and surface and pore characteristics of metal-organic frameworks 267 . In the past few years, DL approaches such as LSTMs and transformer-based models have been employed to extract various categories of information 268 , and in particular materials synthesis information 269 , 270 , 271 , from text sources. Such data have been used to predict synthesis maps for titania nanotubes 272 , various binary and ternary oxides 273 , and perovskites 274 .

Databases based on natural language processing have also been used to train machine learning models to identify materials with useful functional properties, such as the recent discovery of the large magnetocaloric properties of HoBe 2 275 . Similarly, Cooper et al. 276 demonstrated a “design to device approach” for designing dye-sensitized solar cells that are co-sensitized with two dyes. This study used automated text mining to compile a list of candidate dyes for the application along with measured properties such as maximum absorption wavelengths and extinction coefficients. The resulting list of 9431 dyes extracted from the literature was downselected to 309 candidates using various criteria such as molecular structure and ability to absorb in the solar spectrum. These candidates were evaluated for suitable combinations for co-sensitization, yielding 33 dyes that were further downselected using density functional theory calculations and experimental constraints. The resulting 5 dyes were evaluated experimentally, both individually and in combinations, resulting in a combination of dyes that not only outperformed any of the individual dyes but also demonstrated performance comparable to an existing standard material. This study demonstrates the possibility of using literature-based extraction to identify materials candidates for new applications from the vast body of published work, even when those materials may never have been tested for the desired application.

It is even possible that natural language processing can directly make materials predictions without intermediary models. In a study reported by Tshitoyan et al. 259 (as shown in Fig. 5 ), word embeddings (i.e., numerical vectors representing distinct words) trained on materials science literature could directly predict materials applications through a simple dot product between the trained embedding for a composition word (such as PbTe) and an application word (such as thermoelectric). The researchers demonstrated that such an approach, if applied in the past using historical data, may have subsequently predicted many recently reported thermoelectric materials; they also presented a list of potentially interesting thermoelectric compositions using the known literature at the time. Since then, several of these predictions have been tested either computationally 277 , 278 , 279 , 280 , 281 , 282 or experimentally 283 as potential thermoelectrics. Such approaches have recently been applied to search for understudied areas of metallocene catalysis 284 , although challenges still remain in such direct approaches to materials prediction.
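
The dot-product ranking at the heart of this approach can be sketched in a few lines of Python; the vocabulary and the random vectors below are placeholders for embeddings actually trained on materials science abstracts, so the resulting ranking is meaningless and only the mechanics are illustrated:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["thermoelectric", "PbTe", "Bi2Te3", "SnSe", "NaCl"]
emb = {w: rng.normal(size=200) for w in vocab}       # stand-ins for trained word embeddings

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

query = emb["thermoelectric"]
scores = {w: cosine(emb[w], query) for w in vocab if w != "thermoelectric"}
for word, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{word:8s} {score:+.3f}")                 # highest-scoring compositions first
```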

Figure 5

a Network for training word embeddings for natural language processing applications. A one-hot encoded vector at left represents each distinct word in the corpus; the role of the hidden layer is to predict the probability of neighboring words in the corpus. This network structure trains a relatively small hidden layer of 100–200 neurons to contain information on the context of words in the entire corpus, with the result that similar words end up with similar hidden-layer weights (word embeddings). Such word embeddings can transform words in text form into numerical vectors that may be useful for a variety of applications. b Projection of word embeddings for various materials science words, as trained on a corpus of scientific abstracts, into two dimensions using principal component analysis. Without any explicit training, the word embeddings naturally preserve relationships between chemical formulas, their common oxides, and their ground state structures. [Reprinted according to the terms of the CC-BY license, ref. 259 ].

Uncertainty quantification

Uncertainty quantification (UQ) is an essential step in evaluating the robustness of DL. Specifically, DL models have been criticized for their lack of robustness, interpretability, and reliability, and the addition of carefully quantified uncertainties would go a long way towards addressing such shortcomings. While most of the focus in the DL field currently goes into developing new algorithms or training networks to high accuracy, there is increasing attention to UQ, as exemplified by the detailed review of Abdar et al. 285 . However, determining the uncertainty associated with DL predictions is still challenging and far from a completely solved problem.

The main drawback to estimating UQ when performing DL is the fact that most currently available UQ implementations do not work for arbitrary, off-the-shelf models without retraining or redesigning. Bayesian NNs are the exception; however, they require significant modifications to the training procedure, are computationally expensive compared to non-Bayesian NNs, and become increasingly inefficient as the dataset size grows. A considerable fraction of current research in DL UQ focuses on exactly this issue: how to evaluate uncertainty without requiring computationally expensive retraining or DL code modifications. An example of such an effort is the work of Mi et al. 286 , in which three scalable methods are explored to evaluate the variance of the output of a trained NN without requiring any retraining. Another example is Teye, Azizpour, and Smith’s exploration of the use of batch normalization as a way to approximate inference in Bayesian models 287 .

Before reviewing the most common methods used to evaluate uncertainty in DL, let us briefly point out key reasons to add UQ to DL modeling. Reaching high accuracy when training DL models implicitly assumes the availability of a sufficiently large and diverse training dataset. Unfortunately, this rarely occurs in materials discovery applications 288 . ML/DL models are prone to perform poorly on extrapolation 289 . It is also extremely difficult for ML/DL models to recognize ambiguous samples 290 . In general, determining the amount of data necessary to train a DL model to the required accuracy is a challenging problem. Careful evaluation of the uncertainty associated with DL predictions would not only increase the reliability of predicted results but would also provide guidance on estimating the needed training dataset size, as well as suggesting what new data should be added to reach the target accuracy (uncertainty-guided decision making). The work of Zhang, Kailkhura, and Han emphasizes how including a UQ-motivated reject option in the DL model substantially improves the performance on the remaining materials data 288 . Such a reject option is associated with the detection of out-of-distribution samples, which is only possible through UQ analysis of the predicted results.

Two different uncertainty types are associated with each ML prediction: epistemic uncertainty and aleatory uncertainty. Epistemic uncertainty is related to insufficient training data in part of the input domain. As mentioned above, while DL models are very effective at interpolation tasks, they can have more difficulty with extrapolation. Therefore, it is vital to quantify the lack of accuracy due to localized, insufficient training data. Aleatory uncertainty, instead, is related to parameters not included in the model. It relates to the possibility of training on data that the DL model perceives as very similar but that are associated with different outputs because of missing features in the model. Ideally, we would like UQ methodologies to distinguish and quantify both types of uncertainty separately.

The most common approaches to evaluating uncertainty with DL are dropout methods, deep ensemble methods, quantile regression, and Gaussian processes. Dropout methods are commonly used to avoid overfitting. In this type of approach, network nodes are disabled randomly during training, resulting in the evaluation of a different subset of the network at each training step. When a similar randomization procedure is also applied at prediction time, the methodology becomes Monte-Carlo dropout 291 . Repeating such randomization multiple times produces a distribution over the outputs, from which a mean and variance are determined for each prediction. Another example of using a dropout approach to approximate Bayesian inference in deep Gaussian processes is the work of Gal and Ghahramani 292 .
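
A minimal Python (PyTorch) sketch of Monte-Carlo dropout is shown below; the toy network, dropout rate, and number of stochastic passes are illustrative assumptions:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Dropout(p=0.2), nn.Linear(64, 1))

def mc_dropout_predict(model, x, n_samples=100):
    model.train()                        # keep dropout active at prediction time
    with torch.no_grad():
        samples = torch.stack([model(x) for _ in range(n_samples)])
    return samples.mean(dim=0), samples.var(dim=0)

x = torch.rand(5, 10)                    # placeholder feature vectors
mean, var = mc_dropout_predict(model, x) # per-input predictive mean and variance
```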

Deep ensemble methodologies 293 , 294 , 295 , 296 combine deep learning modelling with ensemble learning. Ensemble methods utilize multiple models with different random initializations to improve predictive performance. Because multiple predictions are made, statistical distributions of the outputs are generated. Combining such results into a Gaussian distribution, confidence intervals are obtained through variance evaluation. Such a multi-model strategy allows the evaluation of aleatory uncertainty when sufficient training data are provided. For areas without sufficient data, the predicted mean and variance will not be accurate, but the expectation is that a very large variance will be estimated, clearly indicating non-trustworthy predictions. Monte-Carlo dropout and deep ensemble approaches can be combined to further improve confidence in the predicted outputs.
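
The corresponding deep ensemble sketch is even simpler (training of the individual members is omitted; the toy networks and data are placeholders):

```python
import torch
import torch.nn as nn

def make_model():
    return nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 1))

ensemble = [make_model() for _ in range(5)]   # each member trained from a different initialization

x = torch.rand(5, 10)
with torch.no_grad():
    preds = torch.stack([m(x) for m in ensemble])     # (n_members, batch, 1)
mean, var = preds.mean(dim=0), preds.var(dim=0)       # large variance -> low confidence
```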

Quantile regression can also be utilized with DL 297 . In this approach, the loss function is designed so that the model predicts the chosen quantile a (between 0 and 1). A choice of a = 0.5 corresponds to evaluating the Mean Absolute Error (MAE) and predicting the median of the distribution. Predicting two additional quantile values (amin and amax) determines confidence intervals of width amax − amin. For instance, predicting for amin = 0.1 and amax = 0.8 produces confidence intervals covering 70% of the population. The largest drawback of using quantiles to estimate prediction intervals is the need to run the model three times, once for each quantile needed. However, a recent implementation in TensorFlow allows multiple quantiles to be obtained simultaneously in one run.
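
The quantile (pinball) loss itself is a one-line function; in the Python sketch below the predictions and targets are placeholders, and training one regressor each at a = 0.1 and a = 0.8 would yield an interval expected to contain 70% of the observations:

```python
import torch

def pinball_loss(pred, target, a):
    """Quantile (pinball) loss for a quantile a in (0, 1)."""
    err = target - pred
    return torch.mean(torch.maximum(a * err, (a - 1) * err))

pred, target = torch.rand(32, 1), torch.rand(32, 1)
print(pinball_loss(pred, target, a=0.5))   # a = 0.5 recovers half the mean absolute error
```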

Lastly, Gaussian Processes (GP) can be used within a DL approach as well and have the side benefit of providing UQ information at no extra cost. Gaussian processes are a family of infinite-dimensional multivariate Gaussian distributions completely specified by a mean function and a flexible kernel function (the prior distribution). By optimizing such functions to fit the training data, the posterior distribution is determined, which is later used to predict outputs for inputs not included in the training set. Because the prior is a Gaussian process, the posterior distribution is Gaussian as well 298 , thus providing mean and variance information for each prediction. However, in practice standard kernels under-perform 299 . In 2016, Wilson et al. 300 suggested processing the inputs through a neural network prior to the Gaussian process model. This procedure can extract high-level patterns and features but requires careful design and optimization. In general, deep Gaussian processes improve the performance of Gaussian processes by mapping the inputs through multiple Gaussian process ‘layers’. Several groups have followed this avenue and further refined such approaches (ref. 299 and references therein). A common drawback of Bayesian methods is their prohibitive computational cost when dealing with large datasets 292 .
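
For completeness, a minimal Gaussian-process regression sketch (here with scikit-learn rather than a deep kernel; the synthetic data are placeholders) shows how the posterior standard deviation is returned alongside each prediction:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

X = np.random.rand(50, 4)                         # 50 training points, 4 features
y = X.sum(axis=1) + 0.05 * np.random.randn(50)    # noisy synthetic target

gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0) + WhiteKernel(), normalize_y=True)
gp.fit(X, y)

mean, std = gp.predict(np.random.rand(5, 4), return_std=True)   # predictive mean and uncertainty
```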

Limitations and challenges

Although DL methods present various fascinating opportunities for materials design, they have several limitations, and there is much room for improvement. Reliability and quality assessment of the datasets used in DL tasks is challenging because there is either a lack of ground truth data, there are not enough metrics for global comparison, or datasets obtained with similar or identical set-ups may not be reproducible 301 . This poses an important challenge in relying upon DL-based predictions.

Material representations based on chemical formula alone by definition do not consider structure, which on the one hand makes them more applicable to new compounds for which structure information may not be available, but on the other hand makes it impossible for them to capture phenomena such as phase transitions. Properties of materials depend sensitively on structure, to the extent that properties can be quite opposite depending on the atomic arrangement, as for diamond (a hard, wide-band-gap insulator) and graphite (a soft semi-metal). It is thus not a surprise that chemical formula-based methods may not be adequate in some cases 159 .

Atomistic graph-based predictions, although considered a full atomistic description, have been tested on bulk materials only, not on defective systems or in multi-dimensional phase-space explorations such as those using genetic algorithms. In general, this underscores that the input features must be predictive of the output labels and must not be missing key information. Although atomistic graph neural network models such as the atomistic line graph neural network (ALIGNN) have achieved remarkable accuracy compared to previous atomistic models, the model errors still need to be brought down further to reach something resembling chemical accuracy.

In terms of images and spectra, experimental data are often too noisy and require substantial processing before DL can be applied. In contrast, theory-based simulated data represent an alternative path forward but may not capture realistic scenarios such as the presence of structured noise 217 .

Uncertainty quantification for deep learning in materials science is important, yet only a few works have been published in this field. To alleviate the black-box 38 nature of DL methods, packages such as GNNExplainer 302 have been tried in the materials context. Such attempts at greater interpretability will be important moving forward to gain the trust of the materials community.

Training-validation-test split strategies were primarily designed for DL image classification tasks with a fixed number of classes; applying the same strategies to regression models in materials science may not be the best approach. This is because the model may see, during training, a material very similar to one in the test set, so apparent test performance can overstate how well the model actually generalizes. Best practices need to be developed for data splitting, normalization, and augmentation to avoid such issues 289 .

Finally, we note that an important technological challenge is to create a closed-loop autonomous materials design and synthesis process 303 , 304 that can include both machine learning and experimental components in a self-driving laboratory 305 . For an overview of early proof-of-principle attempts, see ref. 306 . For example, in an autonomous synthesis experiment, the oxidation state of copper (and therefore the oxide phase) was varied in a sample of copper oxide by automatically flowing more oxidizing or more reducing gas over the sample and monitoring the charge state of the copper using XANES. An algorithmic decision policy was then used to automatically change the gas composition for a subsequent experiment based on the prior experiments, with no human in the loop, in such a way as to autonomously move towards a target copper oxidation state 307 . This simple proof-of-principle experiment provides just a glimpse of what is possible moving forward.

Data availability

The data from new figures are available on reasonable request from the corresponding author. Data from other publishers are not available from the corresponding author of this work but may be available by reaching the corresponding author of the cited work.

Code availability

Software packages mentioned in the article (those made available by the authors) can be found at https://github.com/deepmaterials/dlmatreview . Software for other packages can be obtained by contacting the corresponding author of the cited work.

Callister, W. D. et al. Materials Science and Engineering: An Introduction (Wiley, 2021).

Saito, T. Computational Materials Design, Vol. 34 (Springer Science & Business Media, 2013).

Choudhary, K. et al. The joint automated repository for various integrated simulations (jarvis) for data-driven materials design. npj Comput. Mater. 6 , 1–13 (2020).

Kirklin, S. et al. The open quantum materials database (oqmd): assessing the accuracy of dft formation energies. npj Comput. Mater. 1 , 1–15 (2015).

Jain, A. et al. Commentary: The materials project: A materials genome approach to accelerating materials innovation. APL Mater. 1 , 011002 (2013).

Curtarolo, S. et al. Aflow: An automatic framework for high-throughput materials discovery. Comput. Mater. Sci. 58 , 218–226 (2012).

Ramakrishnan, R., Dral, P. O., Rupp, M. & Von Lilienfeld, O. A. Quantum chemistry structures and properties of 134 kilo molecules. Sci. Data 1 , 1–7 (2014).

Draxl, C. & Scheffler, M. Nomad: The fair concept for big data-driven materials science. MRS Bull. 43 , 676–682 (2018).

Wang, R., Fang, X., Lu, Y., Yang, C.-Y. & Wang, S. The pdbbind database: methodologies and updates. J. Med. Chem. 48 , 4111–4119 (2005).

Zakutayev, A. et al. An open experimental database for exploring inorganic materials. Sci. Data 5 , 1–12 (2018).

de Pablo, J. J. et al. New frontiers for the materials genome initiative. npj Comput. Mater. 5 , 1–23 (2019).

Wilkinson, M. D. et al. The fair guiding principles for sci. data management and stewardship. Sci. Data 3 , 1–9 (2016).

Friedman, J. et al. The Elements of Statistical Learning, Vol. 1 (Springer series in statistics New York, 2001).

Agrawal, A. & Choudhary, A. Perspective: Materials informatics and big data: Realization of the “fourth paradigm” of science in materials science. APL Mater. 4 , 053208 (2016).

Vasudevan, R. K. et al. Materials science in the artificial intelligence age: high-throughput library generation, machine learning, and a pathway from correlations to the underpinning physics. MRS Commun. 9 , 821–838 (2019).

Schmidt, J., Marques, M. R., Botti, S. & Marques, M. A. Recent advances and applications of machine learning in solid-state materials science. npj Comput. Mater. 5 , 1–36 (2019).

Butler, K. T., Davies, D. W., Cartwright, H., Isayev, O. & Walsh, A. Machine learning for molecular and materials science. Nature 559 , 547–555 (2018).

Xu, Y. et al. Deep dive into machine learning models for protein engineering. J. Chem. Inf. Model. 60 , 2773–2790 (2020).

Schleder, G. R., Padilha, A. C., Acosta, C. M., Costa, M. & Fazzio, A. From dft to machine learning: recent approaches to materials science–a review. J. Phys. Mater. 2 , 032001 (2019).

Agrawal, A. & Choudhary, A. Deep materials informatics: applications of deep learning in materials science. MRS Commun. 9 , 779–792 (2019).

Goodfellow, I., Bengio, Y. & Courville, A. Deep Learning (MIT Press, 2016).

LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521 , 436–444 (2015).

McCulloch, W. S. & Pitts, W. A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys. 5 , 115–133 (1943).

Rosenblatt, F. The perceptron: A probabilistic model for information storage and organization in the brain. Psychol. Rev. 65 , 386–408 (1958).

Gibney, E. Google ai algorithm masters ancient game of go. Nat. News 529 , 445 (2016).

Ramos, S., Gehrig, S., Pinggera, P., Franke, U. & Rother, C. Detecting unexpected obstacles for self-driving cars: Fusing deep learning and geometric modeling. in 2017 IEEE Intelligent Vehicles Symposium (IV) , 1025–1032 (IEEE, 2017).

Buduma, N. & Locascio, N. Fundamentals of deep learning: Designing next-generation machine intelligence algorithms (O’Reilly Media, Inc., O’Reilly, 2017).

Kearnes, S., McCloskey, K., Berndl, M., Pande, V. & Riley, P. Molecular graph convolutions: moving beyond fingerprints. J. Computer Aided Mol. Des. 30 , 595–608 (2016).

Albrecht, T., Slabaugh, G., Alonso, E. & Al-Arif, S. M. R. Deep learning for single-molecule science. Nanotechnology 28 , 423001 (2017).

Ge, M., Su, F., Zhao, Z. & Su, D. Deep learning analysis on microscopic imaging in materials science. Mater. Today Nano 11 , 100087 (2020).

Agrawal, A., Gopalakrishnan, K. & Choudhary, A. In Handbook on Big Data and Machine Learning in the Physical Sciences: Volume 1. Big Data Methods in Experimental Materials Discovery, World Scientific Series on Emerging Technologies, 205–230 (World Scientific, 2020).

Erdmann, M., Glombitza, J., Kasieczka, G. & Klemradt, U. Deep Learning for Physics Research (World Scientific, 2021).

Chen, C., Ye, W., Zuo, Y., Zheng, C. & Ong, S. P. Graph networks as a universal machine learning framework for molecules and crystals. Chem. Mater. 31 , 3564–3572 (2019).

Jha, D. et al. Enhancing materials property prediction by leveraging computational and experimental data using deep transfer learning. Nat. Commun . 10 , 1–12 (2019).

Cubuk, E. D., Sendek, A. D. & Reed, E. J. Screening billions of candidates for solid lithium-ion conductors: a transfer learning approach for small data. J. Chem. Phys. 150 , 214701 (2019).

Chen, C., Zuo, Y., Ye, W., Li, X. & Ong, S. P. Learning properties of ordered and disordered materials from multi-fidelity data. Nat. Comput. Sci. 1 , 46–53 (2021).

Artrith, N. et al. Best practices in machine learning for chemistry. Nat. Chem. 13 , 505–508 (2021).

Holm, E. A. In defense of the black box. Science 364 , 26–27 (2019).

Mueller, T., Kusne, A. G. & Ramprasad, R. Machine learning in materials science: Recent progress and emerging applications. Rev. Comput. Chem. 29 , 186–273 (2016).

Wei, J. et al. Machine learning in materials science. InfoMat 1 , 338–358 (2019).

Liu, Y. et al. Machine learning in materials genome initiative: a review. J. Mater. Sci. Technol. 57 , 113–122 (2020).

Wang, A. Y.-T. et al. Machine learning for materials scientists: an introductory guide toward best practices. Chem. Mater. 32 , 4954–4965 (2020).

Morgan, D. & Jacobs, R. Opportunities and challenges for machine learning in materials science. Annu. Rev. Mater. Res. 50 , 71–103 (2020).

Himanen, L., Geurts, A., Foster, A. S. & Rinke, P. Data-driven materials science: status, challenges, and perspectives. Adv. Sci. 6 , 1900808 (2019).

Rajan, K. Informatics for materials science and engineering: data-driven discovery for accelerated experimentation and application (Butterworth-Heinemann, 2013).

Montáns, F. J., Chinesta, F., Gómez-Bombarelli, R. & Kutz, J. N. Data-driven modeling and learning in science and engineering. Comptes Rendus Mécanique 347 , 845–855 (2019).

Aykol, M. et al. The materials research platform: defining the requirements from user stories. Matter 1 , 1433–1438 (2019).

Stanev, V., Choudhary, K., Kusne, A. G., Paglione, J. & Takeuchi, I. Artificial intelligence for search and discovery of quantum materials. Commun. Mater. 2 , 1–11 (2021).

Chen, C. et al. A critical review of machine learning of energy materials. Adv. Energy Mater. 10 , 1903242 (2020).

Cybenko, G. Approximation by superpositions of a sigmoidal function. Math. Control Signals Syst. 2 , 303–314 (1989).

Kidger, P. & Lyons, T. Universal approximation with deep narrow networks . in Conference on learning theory , 2306–2327 (PMLR, 2020).

Lin, H. W., Tegmark, M. & Rolnick, D. Why does deep and cheap learning work so well? J. Stat. Phys. 168 , 1223–1247 (2017).

Minsky, M. & Papert, S. A. Perceptrons: An introduction to computational geometry (MIT press, 2017).

Paszke, A. et al. Pytorch: An imperative style, high-performance deep learning library. Adv. Neural Inf. Process. Syst. 32 , 8026–8037 (2019).

Abadi, M. et al. Tensorflow: A system for large-scale machine learning. arXiv . https://arxiv.org/abs/1605.08695 (2016).

Chen, T. et al. Mxnet: A flexible and efficient machine learning library for heterogeneous distributed systems. arXiv . https://arxiv.org/abs/1512.01274 (2015).

Nwankpa, C., Ijomah, W., Gachagan, A. & Marshall, S. Activation functions: comparison of trends in practice and research for deep learning. arXiv . https://arxiv.org/abs/1811.03378 (2018).

Baydin, A. G., Pearlmutter, B. A., Radul, A. A. & Siskind, J. M. Automatic differentiation in machine learning: a survey. J. Machine Learn. Res. 18 , 1–43 (2018).

Hinton, G. E., Srivastava, N., Krizhevsky, A., Sutskever, I. & Salakhutdinov, R. R. Improving neural networks by preventing co-adaptation of feature detectors. arXiv. https://arxiv.org/abs/1207.0580 (2012).

Breiman, L. Bagging predictors. Machine Learn. 24 , 123–140 (1996).

LeCun, Y. et al. The Handbook of Brain Theory and Neural Networks vol. 3361 (MIT press Cambridge, MA, USA 1995).

Wilson, R. J. Introduction to Graph Theory (Pearson Education India, 1979).

West, D. B. et al. Introduction to Graph Theory Vol. 2 (Prentice hall Upper Saddle River, 2001).

Wang, M. et al. Deep graph library: A graph-centric, highly-performant package for graph neural networks. arXiv . https://arxiv.org/abs/1909.01315 (2019).

Choudhary, K. & DeCost, B. Atomistic line graph neural network for improved materials property predictions. npj Comput. Mater. 7 , 1–8 (2021).

Li, M. et al. Dgl-lifesci: An open-source toolkit for deep learning on graphs in life science. arXiv . https://arxiv.org/abs/2106.14232 (2021).

Xie, T. & Grossman, J. C. Crystal graph convolutional neural networks for an accurate and interpretable prediction of material properties. Phys. Rev. Lett. 120 , 145301 (2018).

Klicpera, J., Groß, J. & Günnemann, S. Directional message passing for molecular graphs. arXiv . https://arxiv.org/abs/2003.03123 (2020).

Schutt, K. et al. Schnetpack: A deep learning toolbox for atomistic systems. J. Chem. Theory Comput. 15 , 448–455 (2018).

Kipf, T. N. & Welling, M. Semi-supervised classification with graph convolutional networks. arXiv . https://arxiv.org/abs/1609.02907 (2016).

Veličković, P. et al. Graph attention networks. arXiv . https://arxiv.org/abs/1710.10903 (2017).

Schlichtkrull, M. et al. Modeling relational data with graph convolutional networks. arXiv. https://arxiv.org/abs/1703.06103 (2017).

Song, L., Zhang, Y., Wang, Z. & Gildea, D. A graph-to-sequence model for AMR-to-text generation . In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) , 1616–1626 (Association for Computational Linguistics, 2018).

Xu, K., Hu, W., Leskovec, J. & Jegelka, S. How powerful are graph neural networks? arXiv . https://arxiv.org/abs/1810.00826 (2018).

Chen, Z., Li, X. & Bruna, J. Supervised community detection with line graph neural networks. arXiv . https://arxiv.org/abs/1705.08415 (2017).

Jing, Y., Bian, Y., Hu, Z., Wang, L. & Xie, X.-Q. S. Deep learning for drug design: an artificial intelligence paradigm for drug discovery in the big data era. AAPS J. 20 , 1–10 (2018).

Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv. https://arxiv.org/abs/1810.04805 (2018).

De Cao, N. & Kipf, T. Molgan: An implicit generative model for small molecular graphs. arXiv . https://arxiv.org/abs/1805.11973 (2018).

Pereira, T., Abbasi, M., Ribeiro, B. & Arrais, J. P. Diversity oriented deep reinforcement learning for targeted molecule generation. J. Cheminformatics 13 , 1–17 (2021).

Baker, N. et al. Workshop report on basic research needs for scientific machine learning: core technologies for artificial intelligence. Tech. Rep . https://doi.org/10.2172/1478744 . (2019).

Chan, H. et al. Rapid 3d nanoscale coherent imaging via physics-aware deep learning. Appl. Phys. Rev. 8 , 021407 (2021).

Pun, G. P., Batra, R., Ramprasad, R. & Mishin, Y. Physically informed artificial neural networks for atomistic modeling of materials. Nat. Commun. 10 , 1–10 (2019).

Onken, D. et al. A neural network approach for high-dimensional optimal control. arXiv. https://arxiv.org/abs/2104.03270 (2021).

Zunger, A. Inverse design in search of materials with target functionalities. Nat. Rev. Chem. 2 , 1–16 (2018).

Chen, L., Zhang, W., Nie, Z., Li, S. & Pan, F. Generative models for inverse design of inorganic solid materials. J. Mater. Inform. 1 , 4 (2021).

Cranmer, M. et al. Discovering symbolic models from deep learning with inductive biases. arXiv . https://arxiv.org/abs/2006.11287 (2020).

Rupp, M., Tkatchenko, A., Müller, K.-R. & Von Lilienfeld, O. A. Fast and accurate modeling of molecular atomization energies with machine learning. Phys. Rev. Lett. 108 , 058301 (2012).

Bartók, A. P., Kondor, R. & Csányi, G. On representing chemical environments. Phys. Rev. B 87 , 184115 (2013).

Faber, F. A. et al. Prediction errors of molecular machine learning models lower than hybrid dft error. J. Chem. Theory Comput. 13 , 5255–5264 (2017).

Choudhary, K., DeCost, B. & Tavazza, F. Machine learning with force-field-inspired descriptors for materials: Fast screening and mapping energy landscape. Phys. Rev. Mater. 2 , 083801 (2018).

Choudhary, K., Garrity, K. F., Ghimire, N. J., Anand, N. & Tavazza, F. High-throughput search for magnetic topological materials using spin-orbit spillage, machine learning, and experiments. Phys. Rev. B 103 , 155131 (2021).

Choudhary, K., Garrity, K. F. & Tavazza, F. Data-driven discovery of 3d and 2d thermoelectric materials. J. Phys. Condens. Matter 32 , 475501 (2020).

Ward, L. et al. Including crystal structure attributes in machine learning models of formation energies via voronoi tessellations. Phys. Rev. B 96 , 024104 (2017).

Isayev, O. et al. Universal fragment descriptors for predicting properties of inorganic crystals. Nat. Commun. 8 , 1–12 (2017).

Liu, C.-H., Tao, Y., Hsu, D., Du, Q. & Billinge, S. J. Using a machine learning approach to determine the space group of a structure from the atomic pair distribution function. Acta Crystallogr. Sec. A 75 , 633–643 (2019).

Smith, J. S., Isayev, O. & Roitberg, A. E. Ani-1: an extensible neural network potential with dft accuracy at force field computational cost. Chem. Sci. 8 , 3192–3203 (2017).

Behler, J. Atom-centered symmetry functions for constructing high-dimensional neural network potentials. J. Chem. Phys. 134 , 074106 (2011).

Behler, J. & Parrinello, M. Generalized neural-network representation of high-dimensional potential-energy surfaces. Phys. Rev. Lett. 98 , 146401 (2007).

Ko, T. W., Finkler, J. A., Goedecker, S. & Behler, J. A fourth-generation high-dimensional neural network potential with accurate electrostatics including non-local charge transfer. Nat. Commun. 12 , 398 (2021).

Weinreich, J., Romer, A., Paleico, M. L. & Behler, J. Properties of alpha-brass nanoparticles. 1. neural network potential energy surface. J. Phys. Chem C 124 , 12682–12695 (2020).

Wang, H., Zhang, L., Han, J. & E, W. Deepmd-kit: A deep learning package for many-body potential energy representation and molecular dynamics. Computer Phys. Commun. 228 , 178–184 (2018).

Eshet, H., Khaliullin, R. Z., Kühne, T. D., Behler, J. & Parrinello, M. Ab initio quality neural-network potential for sodium. Phys. Rev. B 81 , 184107 (2010).

Khaliullin, R. Z., Eshet, H., Kühne, T. D., Behler, J. & Parrinello, M. Graphite-diamond phase coexistence study employing a neural-network mapping of the ab initio potential energy surface. Phys. Rev. B 81 , 100103 (2010).

Artrith, N. & Urban, A. An implementation of artificial neural-network potentials for atomistic materials simulations: Performance for tio2. Comput. Mater. Sci. 114 , 135–150 (2016).

Park, C. W. et al. Accurate and scalable graph neural network force field and molecular dynamics with direct force architecture. npj Comput. Mater. 7 , 1–9 (2021).

Chmiela, S., Sauceda, H. E., Müller, K.-R. & Tkatchenko, A. Towards exact molecular dynamics simulations with machine-learned force fields. Nat. Commun. 9 , 1–10 (2018).

Xue, L.-Y. et al. Reaxff-mpnn machine learning potential: a combination of reactive force field and message passing neural networks. Phys. Chem. Chem. Phys. 23 , 19457–19464 (2021).

Gilmer, J., Schoenholz, S. S., Riley, P. F., Vinyals, O. & Dahl, G. E. Neural message passing for quantum chemistry. arXiv . https://arxiv.org/abs/1704.01212 (2017).

Zitnick, C. L. et al. An introduction to electrocatalyst design using machine learning for renewable energy storage. arXiv. https://arxiv.org/abs/2010.09435 (2020).

McNutt, A. T. et al. Gnina 1.0: molecular docking with deep learning. J. Cheminformatics 13 , 1–20 (2021).

Jin, W., Barzilay, R. & Jaakkola, T. Junction tree variational autoencoder for molecular graph generation. in International conference on machine learning , 2323–2332 (PMLR, 2018).

Olivecrona, M., Blaschke, T., Engkvist, O. & Chen, H. Molecular de-novo design through deep reinforcement learning. J. Cheminformatics 9 , 1–14 (2017).

You, J., Liu, B., Ying, R., Pande, V. & Leskovec, J. Graph convolutional policy network for goal-directed molecular graph generation. arXiv. https://arxiv.org/abs/1806.02473 (2018).

Putin, E. et al. Reinforced adversarial neural computer for de novo molecular design. J. Chem. Inf. Model. 58 , 1194–1204 (2018).

Sanchez-Lengeling, B., Outeiral, C., Guimaraes, G. L. & Aspuru-Guzik, A. Optimizing distributions over molecular space. an objective-reinforced generative adversarial network for inverse-design chemistry (organic). ChemRxiv https://doi.org/10.26434/chemrxiv.5309668.v3 (2017).

Nouira, A., Sokolovska, N. & Crivello, J.-C. Crystalgan: learning to discover crystallographic structures with generative adversarial networks. arXiv. https://arxiv.org/abs/1810.11203 (2018).

Long, T. et al. Constrained crystals deep convolutional generative adversarial network for the inverse design of crystal structures. npj Comput. Mater. 7 , 66 (2021).

Noh, J. et al. Inverse design of solid-state materials via a continuous representation. Matter 1 , 1370–1384 (2019).

Kim, S., Noh, J., Gu, G. H., Aspuru-Guzik, A. & Jung, Y. Generative adversarial networks for crystal structure prediction. ACS Central Sci. 6 , 1412–1420 (2020).

Long, T. et al. Inverse design of crystal structures for multicomponent systems. arXiv. https://arxiv.org/abs/2104.08040 (2021).

Xie, T. & Grossman, J. C. Hierarchical visualization of materials space with graph convolutional neural networks. J. Chem. Phys. 149 , 174111 (2018).

Park, C. W. & Wolverton, C. Developing an improved crystal graph convolutional neural network framework for accelerated materials discovery. Phys. Rev. Mater. 4 , 063801 (2020).

Laugier, L. et al. Predicting thermoelectric properties from crystal graphs and material descriptors-first application for functional materials. arXiv. https://arxiv.org/abs/1811.06219 (2018).

Rosen, A. S. et al. Machine learning the quantum-chemical properties of metal–organic frameworks for accelerated materials discovery. Matter 4 , 1578–1597 (2021).

Lusci, A., Pollastri, G. & Baldi, P. Deep architectures and deep learning in chemoinformatics: the prediction of aqueous solubility for drug-like molecules. J. Chem. Inf. Model. 53 , 1563–1575 (2013).

Xu, Y. et al. Deep learning for drug-induced liver injury. J. Chem. Inf. Model. 55 , 2085–2093 (2015).

Jain, A. & Bligaard, T. Atomic-position independent descriptor for machine learning of material properties. Phys. Rev. B 98 , 214112 (2018).

Goodall, R. E., Parackal, A. S., Faber, F. A., Armiento, R. & Lee, A. A. Rapid discovery of novel materials by coordinate-free coarse graining. arXiv . https://arxiv.org/abs/2106.11132 (2021).

Zuo, Y. et al. Accelerating Materials Discovery with Bayesian Optimization and Graph Deep Learning. arXiv . https://arxiv.org/abs/2104.10242 (2021).

Lin, T.-S. et al. Bigsmiles: a structurally-based line notation for describing macromolecules. ACS Central Sci. 5 , 1523–1531 (2019).

Tyagi, A. et al. Cancerppd: a database of anticancer peptides and proteins. Nucleic Acids Res. 43 , D837–D843 (2015).

Krenn, M., Häse, F., Nigam, A., Friederich, P. & Aspuru-Guzik, A. Self-referencing embedded strings (selfies): a 100% robust molecular string representation. Machine Learn. Sci. Technol. 1 , 045024 (2020).

Lim, J., Ryu, S., Kim, J. W. & Kim, W. Y. Molecular generative model based on conditional variational autoencoder for de novo molecular design. J. Cheminformatics 10 , 1–9 (2018).

Krasnov, L., Khokhlov, I., Fedorov, M. V. & Sosnin, S. Transformer-based artificial neural networks for the conversion between chemical notations. Sci. Rep. 11 , 1–10 (2021).

Irwin, J. J., Sterling, T., Mysinger, M. M., Bolstad, E. S. & Coleman, R. G. Zinc: a free tool to discover chemistry for biology. J. Chem. Inf. Model. 52 , 1757–1768 (2012).

Dix, D. J. et al. The toxcast program for prioritizing toxicity testing of environmental chemicals. Toxicol. Sci. 95 , 5–12 (2007).

Kim, S. et al. Pubchem 2019 update: improved access to chemical data. Nucleic Acids Res. 47 , D1102–D1109 (2019).

Hirohara, M., Saito, Y., Koda, Y., Sato, K. & Sakakibara, Y. Convolutional neural network based on smiles representation of compounds for detecting chemical motif. BMC Bioinformatics 19 , 83–94 (2018).

Gómez-Bombarelli, R. et al. Automatic chemical design using a data-driven continuous representation of molecules. ACS Central Sci. 4 , 268–276 (2018).

Liu, R. et al. Deep learning for chemical compound stability prediction . In Proceedings of ACM SIGKDD workshop on large-scale deep learning for data mining (DL-KDD) , 1–7. https://rosanneliu.com/publication/kdd/ (ACM SIGKDD, 2016).

Jha, D. et al. Elemnet: deep learning the chemistry of materials from only elemental composition. Sci. Rep. 8, 1–13 (2018).

Agrawal, A. et al. Exploration of data science techniques to predict fatigue strength of steel from composition and processing parameters. Integr. Mater. Manuf. Innov. 3 , 90–108 (2014).

Agrawal, A. & Choudhary, A. A fatigue strength predictor for steels using ensemble data mining: steel fatigue strength predictor . In Proceedings of the 25th ACM International on Conference on information and knowledge management , 2497–2500. https://doi.org/10.1145/2983323.2983343 (2016).

Agrawal, A. & Choudhary, A. An online tool for predicting fatigue strength of steel alloys based on ensemble data mining. Int. J. Fatigue 113 , 389–400 (2018).

Agrawal, A., Saboo, A., Xiong, W., Olson, G. & Choudhary, A. Martensite start temperature predictor for steels using ensemble data mining . in 2019 IEEE International Conference on Data Science and Advanced Analytics (DSAA) , 521–530 (IEEE, 2019).

Meredig, B. et al. Combinatorial screening for new materials in unconstrained composition space with machine learning. Phys. Rev. B 89 , 094104 (2014).

Agrawal, A., Meredig, B., Wolverton, C. & Choudhary, A. A formation energy predictor for crystalline materials using ensemble data mining . in 2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) , 1276–1279 (IEEE, 2016).

Furmanchuk, A., Agrawal, A. & Choudhary, A. Predictive analytics for crystalline materials: bulk modulus. RSC Adv. 6 , 95246–95251 (2016).

Furmanchuk, A. et al. Prediction of seebeck coefficient for compounds without restriction to fixed stoichiometry: A machine learning approach. J. Comput. Chem. 39 , 191–202 (2018).

Ward, L., Agrawal, A., Choudhary, A. & Wolverton, C. A general-purpose machine learning framework for predicting properties of inorganic materials. npj Comput. Mater. 2 , 1–7 (2016).

Ward, L. et al. Matminer: An open source toolkit for materials data mining. Comput. Mater. Sci. 152 , 60–69 (2018).

Jha, D. et al. Irnet: A general purpose deep residual regression framework for materials discovery . In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining , 2385–2393. https://arxiv.org/abs/1907.03222 (2019).

Jha, D. et al. Enabling deeper learning on big data for materials informatics applications. Sci. Rep. 11 , 1–12 (2021).

Goodall, R. E. & Lee, A. A. Predicting materials properties without crystal structure: Deep representation learning from stoichiometry. Nat. Commun. 11 , 1–9 (2020).

NIMS. Superconducting material database (supercon) . https://supercon.nims.go.jp/ (2021).

Stanev, V. et al. Machine learning modeling of superconducting critical temperature. npj Comput. Mater. 4 , 1–14 (2018).

Gupta, V. et al. Cross-property deep transfer learning framework for enhanced predictive analytics on small materials data. Nat. Commun . 12 , 1–10 (2021).

Himanen, L. et al. Dscribe: Library of descriptors for machine learning in materials science. Computer Phys. Commun. 247 , 106949 (2020).

Bartel, C. J. et al. A critical examination of compound stability predictions from machine-learned formation energies. npj Comput. Mater. 6 , 1–11 (2020).

Choudhary, K. et al. High-throughput density functional perturbation theory and machine learning predictions of infrared, piezoelectric, and dielectric responses. npj Comput. Mater. 6 , 1–13 (2020).

Zheng, C. et al. Automated generation and ensemble-learned matching of X-ray absorption spectra. npj Comput. Mater. 4 , 1–9 (2018).

Mathew, K. et al. High-throughput computational x-ray absorption spectroscopy. Sci. Data 5 , 1–8 (2018).

Chen, Y. et al. Database of ab initio l-edge x-ray absorption near edge structure. Sci. Data 8 , 1–8 (2021).

Lafuente, B., Downs, R. T., Yang, H. & Stone, N. The power of databases: the RRUFF project. In Highlights in Mineralogical Crystallography 1–30 (De Gruyter, 2015).

El Mendili, Y. et al. Raman open database: first interconnected raman–x-ray diffraction open-access resource for material identification. J. Appl. Crystallogr. 52 , 618–625 (2019).

Fremout, W. & Saverwyns, S. Identification of synthetic organic pigments: the role of a comprehensive digital raman spectral library. J. Raman Spectrosc. 43 , 1536–1544 (2012).

Huck, P. & Persson, K. A. Mpcontribs: user contributed data to the materials project database . https://docs.mpcontribs.org/ (2019).

Yang, L. et al. A cloud platform for atomic pair distribution function analysis: Pdfitc. Acta Crystallogr. A 77 , 2–6 (2021).

Park, W. B. et al. Classification of crystal structure using a convolutional neural network. IUCrJ 4 , 486–494 (2017).

Hellenbrandt, M. The Inorganic Crystal Structure Database (ICSD)—present and future. Crystallogr. Rev. 10 , 17–22 (2004).

Zaloga, A. N., Stanovov, V. V., Bezrukova, O. E., Dubinin, P. S. & Yakimov, I. S. Crystal symmetry classification from powder X-ray diffraction patterns using a convolutional neural network. Mater. Today Commun. 25 , 101662 (2020).

Lee, J.-W., Park, W. B., Lee, J. H., Singh, S. P. & Sohn, K.-S. A deep-learning technique for phase identification in multiphase inorganic compounds using synthetic XRD powder patterns. Nat. Commun. 11 , 86 (2020).

Wang, H. et al. Rapid identification of X-ray diffraction patterns based on very limited data by interpretable convolutional neural networks. J. Chem. Inf. Model. 60 , 2004–2011 (2020).

Dong, H. et al. A deep convolutional neural network for real-time full profile analysis of big powder diffraction data. npj Comput. Mater. 7 , 1–9 (2021).

Aguiar, J. A., Gong, M. L. & Tasdizen, T. Crystallographic prediction from diffraction and chemistry data for higher throughput classification using machine learning. Comput. Mater. Sci. 173 , 109409 (2020).

Maffettone, P. M. et al. Crystallography companion agent for high-throughput materials discovery. Nat. Comput. Sci. 1 , 290–297 (2021).

Oviedo, F. et al. Fast and interpretable classification of small X-ray diffraction datasets using data augmentation and deep neural networks. npj Comput. Mater. 5 , 1–9 (2019).

Liu, C.-H. et al. Validation of non-negative matrix factorization for rapid assessment of large sets of atomic pair-distribution function (pdf) data. J. Appl. Crystallogr. 54 , 768–775 (2021).

Rakita, Y. et al. Studying heterogeneities in local nanostructure with scanning nanostructure electron microscopy (snem). arXiv https://arxiv.org/abs/2110.03589 (2021).

Timoshenko, J., Lu, D., Lin, Y. & Frenkel, A. I. Supervised machine-learning-based determination of three-dimensional structure of metallic nanoparticles. J. Phys. Chem Lett. 8 , 5091–5098 (2017).

Timoshenko, J. et al. Subnanometer substructures in nanoassemblies formed from clusters under a reactive atmosphere revealed using machine learning. J. Phys. Chem C 122 , 21686–21693 (2018).

Timoshenko, J. et al. Neural network approach for characterizing structural transformations by X-ray absorption fine structure spectroscopy. Phys. Rev. Lett. 120 , 225502 (2018).

Zheng, C., Chen, C., Chen, Y. & Ong, S. P. Random forest models for accurate identification of coordination environments from X-ray absorption near-edge structure. Patterns 1 , 100013 (2020).

Torrisi, S. B. et al. Random forest machine learning models for interpretable X-ray absorption near-edge structure spectrum-property relationships. npj Comput. Mater. 6 , 1–11 (2020).

Andrejevic, N., Andrejevic, J., Rycroft, C. H. & Li, M. Machine learning spectral indicators of topology. arXiv preprint at https://arxiv.org/abs/2003.00994 (2020).

Madden, M. G. & Ryder, A. G. Machine learning methods for quantitative analysis of raman spectroscopy data . in Opto-Ireland 2002: Optics and Photonics Technologies and Applications , Vol. 4876, 1130–1139 (International Society for Optics and Photonics, 2003).

Conroy, J., Ryder, A. G., Leger, M. N., Hennessey, K. & Madden, M. G. Qualitative and quantitative analysis of chlorinated solvents using Raman spectroscopy and machine learning . in Opto-Ireland 2005: Optical Sensing and Spectroscopy, Vol. 5826, 131–142 (International Society for Optics and Photonics, 2005).

Acquarelli, J. et al. Convolutional neural networks for vibrational spectroscopic data analysis. Anal. Chim. Acta 954 , 22–31 (2017).

O’Connell, M.-L., Howley, T., Ryder, A. G., Leger, M. N. & Madden, M. G. Classification of a target analyte in solid mixtures using principal component analysis, support vector machines, and Raman spectroscopy . in Opto-Ireland 2005: Optical Sensing and Spectroscopy , Vol. 5826, 340–350 (International Society for Optics and Photonics, 2005).

Zhao, J., Chen, Q., Huang, X. & Fang, C. H. Qualitative identification of tea categories by near infrared spectroscopy and support vector machine. J. Pharm. Biomed. Anal. 41 , 1198–1204 (2006).

Liu, J. et al. Deep convolutional neural networks for Raman spectrum recognition: a unified solution. Analyst 142 , 4067–4074 (2017).

Yang, J. et al. Deep learning for vibrational spectral analysis: Recent progress and a practical guide. Anal. Chim. Acta 1081 , 6–17 (2019).

Selzer, P., Gasteiger, J., Thomas, H. & Salzer, R. Rapid access to infrared reference spectra of arbitrary organic compounds: scope and limitations of an approach to the simulation of infrared spectra by neural networks. Chem. Euro. J. 6 , 920–927 (2000).

Ghosh, K. et al. Deep learning spectroscopy: neural networks for molecular excitation spectra. Adv. Sci. 6 , 1801367 (2019).

Kostka, T., Selzer, P. & Gasteiger, J. A combined application of reaction prediction and infrared spectra simulation for the identification of degradation products of s-triazine herbicides. Chemistry 7 , 2254–2260 (2001).

Mahmoud, C. B., Anelli, A., Csányi, G. & Ceriotti, M. Learning the electronic density of states in condensed matter. Phys. Rev. B 102 , 235130 (2020).

Chen, Z. et al. Direct prediction of phonon density of states with Euclidean neural networks. Adv. Sci. 8 , 2004214 (2021).

Kong, S. et al. Density of states prediction for materials discovery via contrastive learning from probabilistic embeddings. arXiv . https://arxiv.org/abs/2110.11444 (2021).

Carbone, M. R., Topsakal, M., Lu, D. & Yoo, S. Machine-learning X-ray absorption spectra to quantitative accuracy. Phys. Rev. Lett. 124 , 156401 (2020).

Rehr, J. J., Kas, J. J., Vila, F. D., Prange, M. P. & Jorissen, K. Parameter-free calculations of X-ray spectra with FEFF9. Phys. Chem. Chem. Phys. 12 , 5503–5513 (2010).

Rankine, C. D., Madkhali, M. M. M. & Penfold, T. J. A deep neural network for the rapid prediction of X-ray absorption spectra. J. Phys. Chem A 124 , 4263–4270 (2020).

Fung, V., Hu, G., Ganesh, P. & Sumpter, B. G. Machine learned features from density of states for accurate adsorption energy prediction. Nat. Commun. 12 , 88 (2021).

Hammer, B. & Nørskov, J. Theoretical surface science and catalysis-calculations and concepts. Adv. Catal. Impact Surface Sci. Catal. 45 , 71–129 (2000).

Kaundinya, P. R., Choudhary, K. & Kalidindi, S. R. Prediction of the electron density of states for crystalline compounds with atomistic line graph neural networks (alignn). arXiv. https://arxiv.org/abs/2201.08348 (2022).

Stein, H. S., Soedarmadji, E., Newhouse, P. F., Guevarra, D. & Gregoire, J. M. Synthesis, optical imaging, and absorption spectroscopy data for 179072 metal oxides. Sci. Data 6 , 9 (2019).

Choudhary, A. et al. Graph neural network predictions of metal–organic framework CO2 adsorption properties. arXiv. https://arxiv.org/abs/2112.10231 (2021).

Anderson, R., Biong, A. & Gómez-Gualdrón, D. A. Adsorption isotherm predictions for multiple molecules in mofs using the same deep learning model. J. Chem. Theory Comput. 16 , 1271–1283 (2020).

Krizhevsky, A., Sutskever, I. & Hinton, G. E. Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 25 , 1097–1105 (2012).

Varela, M. et al. Materials characterization in the aberration-corrected scanning transmission electron microscope. Annu. Rev. Mater. Res. 35 , 539–569 (2005).

Holm, E. A. et al. Overview: Computer vision and machine learning for microstructural characterization and analysis. Metal. Mater Trans. A 51 , 5985–5999 (2020).

Modarres, M. H. et al. Neural network for nanoscience scanning electron microscope image recognition. Sci. Rep. 7 , 1–12 (2017).

Gopalakrishnan, K., Khaitan, S. K., Choudhary, A. & Agrawal, A. Deep convolutional neural networks with transfer learning for computer vision-based data-driven pavement distress detection. Construct. Build. Mater. 157 , 322–330 (2017).

Gopalakrishnan, K., Gholami, H., Vidyadharan, A., Choudhary, A. & Agrawal, A. Crack damage detection in unmanned aerial vehicle images of civil infrastructure using pre-trained deep learning model. Int. J. Traffic Transp. Eng . 8 , 1–14 (2018).

Yang, Z. et al. Data-driven insights from predictive analytics on heterogeneous experimental data of industrial magnetic materials . In IEEE International Conference on Data Mining Workshops (ICDMW) , 806–813. https://doi.org/10.1109/ICDMW.2019.00119 (IEEE Computer Society, 2019).

Yang, Z. et al. Heterogeneous feature fusion based machine learning on shallow-wide and heterogeneous-sparse industrial datasets . In 25th International Conference on Pattern Recognition Workshops, ICPR 2020 , 566–577. https://doi.org/10.1007/978-3-030-68799-1_41 (Springer Science and Business Media Deutschland GmbH, 2021).

Ziletti, A., Kumar, D., Scheffler, M. & Ghiringhelli, L. M. Insightful classification of crystal structures using deep learning. Nat. Commun. 9 , 2775 (2018).

Choudhary, K. et al. Computational scanning tunneling microscope image database. Sci. Data 8 , 1–9 (2021).

Liu, R., Agrawal, A., Liao, W.-k., Choudhary, A. & De Graef, M. Materials discovery: Understanding polycrystals from large-scale electron patterns . in 2016 IEEE International Conference on Big Data (Big Data) , 2261–2269 (IEEE, 2016).

Jha, D. et al. Extracting grain orientations from EBSD patterns of polycrystalline materials using convolutional neural networks. Microsc. Microanal. 24 , 497–502 (2018).

Kaufmann, K., Zhu, C., Rosengarten, A. S. & Vecchio, K. S. Deep neural network enabled space group identification in EBSD. Microsc. Microanal. 26 , 447–457 (2020).

Yang, Z. et al. Deep learning based domain knowledge integration for small datasets: Illustrative applications in materials informatics . in 2019 International Joint Conference on Neural Networks (IJCNN) , 1–8 (IEEE, 2019).

Yang, Z. et al. Learning to predict crystal plasticity at the nanoscale: Deep residual networks and size effects in uniaxial compression discrete dislocation simulations. Sci. Rep. 10 , 1–14 (2020).

Decost, B. L. et al. Uhcsdb: Ultrahigh carbon steel micrograph database. Integr. Mater. Manuf. Innov. 6 , 197–205 (2017).

Decost, B. L., Lei, B., Francis, T. & Holm, E. A. High throughput quantitative metallography for complex microstructures using deep learning: a case study in ultrahigh carbon steel. Microsc. Microanal. 25 , 21–29 (2019).

Stan, T., Thompson, Z. T. & Voorhees, P. W. Optimizing convolutional neural networks to perform semantic segmentation on large materials imaging datasets: X-ray tomography and serial sectioning. Materials Characterization 160 , 110119 (2020).

Madsen, J. et al. A deep learning approach to identify local structures in atomic-resolution transmission electron microscopy images. Adv. Theory Simulations 1 , 1800037 (2018).

Maksov, A. et al. Deep learning analysis of defect and phase evolution during electron beam-induced transformations in WS2. npj Comput. Mater. 5, 1–8 (2019).

Yang, S.-H. et al. Deep learning-assisted quantification of atomic dopants and defects in 2d materials. Adv. Sci. https://doi.org/10.1002/advs.202101099 (2021).

Roberts, G. et al. Deep learning for semantic segmentation of defects in advanced stem images of steels. Sci. Rep. 9 , 1–12 (2019).

Kusche, C. et al. Large-area, high-resolution characterisation and classification of damage mechanisms in dual-phase steel using deep learning. PLoS ONE 14 , e0216493 (2019).

Vlcek, L. et al. Learning from imperfections: predicting structure and thermodynamics from atomic imaging of fluctuations. ACS Nano 13 , 718–727 (2019).

Ziatdinov, M., Maksov, A. & Kalinin, S. V. Learning surface molecular structures via machine vision. npj Comput. Mater. 3 , 1–9 (2017).

Ovchinnikov, O. S. et al. Detection of defects in atomic-resolution images of materials using cycle analysis. Adv. Struct. Chem. Imaging 6 , 3 (2020).

Li, W., Field, K. G. & Morgan, D. Automated defect analysis in electron microscopic images. npj Comput. Mater. 4 , 1–9 (2018).

Cohn, R. et al. Instance segmentation for direct measurements of satellites in metal powders and automated microstructural characterization from image data. JOM 73 , 2159–2172 (2021).

de Haan, K., Ballard, Z. S., Rivenson, Y., Wu, Y. & Ozcan, A. Resolution enhancement in scanning electron microscopy using deep learning. Sci. Rep. 9 , 1–7 (2019).

Ede, J. M. & Beanland, R. Partial scanning transmission electron microscopy with deep learning. Sci. Rep. 10 , 1–10 (2020).

Rashidi, M. & Wolkow, R. A. Autonomous scanning probe microscopy in situ tip conditioning through machine learning. ACS Nano 12 , 5185–5189 (2018).

Scime, L., Siddel, D., Baird, S. & Paquit, V. Layer-wise anomaly detection and classification for powder bed additive manufacturing processes: A machine-agnostic algorithm for real-time pixel-wise semantic segmentation. Addit. Manufact. 36 , 101453 (2020).

Eppel, S., Xu, H., Bismuth, M. & Aspuru-Guzik, A. Computer vision for recognition of materials and vessels in chemistry lab settings and the Vector-LabPics Data Set. ACS Central Sci. 6 , 1743–1752 (2020).

Yang, Z. et al. Deep learning approaches for mining structure-property linkages in high contrast composites from simulation datasets. Comput. Mater. Sci. 151 , 278–287 (2018).

Cecen, A., Dai, H., Yabansu, Y. C., Kalidindi, S. R. & Song, L. Material structure-property linkages using three-dimensional convolutional neural networks. Acta Mater. 146 , 76–84 (2018).

Yang, Z. et al. Establishing structure-property localization linkages for elastic deformation of three-dimensional high contrast composites using deep learning approaches. Acta Mater. 166 , 335–345 (2019).

Goetz, A. et al. Addressing materials’ microstructure diversity using transfer learning. arXiv. https://arxiv.org/abs/2107.13841 (2021).

Kitahara, A. R. & Holm, E. A. Microstructure cluster analysis with transfer learning and unsupervised learning. Integr. Mater. Manuf. Innov. 7 , 148–156 (2018).

Larmuseau, M. et al. Compact representations of microstructure images using triplet networks. npj Comput. Mater. 6, 1–11 (2020).

Li, X. et al. A deep adversarial learning methodology for designing microstructural material systems . in International Design Engineering Technical Conferences and Computers and Information in Engineering Conference , Vol. 51760, V02BT03A008 (American Society of Mechanical Engineers, 2018).

Yang, Z. et al. Microstructural materials design via deep adversarial learning methodology. J. Mech. Des. 140 , 111416 (2018).

Yang, Z. et al. A general framework combining generative adversarial networks and mixture density networks for inverse modeling in microstructural materials design. arXiv . https://arxiv.org/abs/2101.10553 (2021).

Hsu, T. et al. Microstructure generation via generative adversarial network for heterogeneous, topologically complex 3d materials. JOM 73 , 90–102 (2020).

Chun, S. et al. Deep learning for synthetic microstructure generation in a materials-by-design framework for heterogeneous energetic materials. Sci. Rep. 10 , 1–15 (2020).

Dai, M., Demirel, M. F., Liang, Y. & Hu, J.-M. Graph neural networks for an accurate and interpretable prediction of the properties of polycrystalline materials. npj Comput. Mater. 7 , 1–9 (2021).

Cohn, R. & Holm, E. Neural message passing for predicting abnormal grain growth in Monte Carlo simulations of microstructural evolution. arXiv. https://arxiv.org/abs/2110.09326v1 (2021).

Plimpton, S. et al. SPPARKS Kinetic Monte Carlo Simulator . https://spparks.github.io/index.html . (2021).

Plimpton, S. et al. Crossing the mesoscale no-man’s land via parallel kinetic Monte Carlo. Tech. Rep . https://doi.org/10.2172/966942 (2009).

Xue, N. Review of: Bird, S., Klein, E. & Loper, E. Natural Language Processing with Python (O’Reilly Media, 2009; ISBN 978-0-596-51649-9). Nat. Lang. Eng. 17, 419–424 (2010).

Honnibal, M. & Montani, I. spaCy 2: Natural language understanding with Bloom embeddings, convolutional neural networks and incremental parsing. https://doi.org/10.5281/zenodo.3358113 (2017).

Gardner, M. et al. Allennlp: A deep semantic natural language processing platform. arXiv. https://arxiv.org/abs/1803.07640 (2018).

Tshitoyan, V. et al. Unsupervised word embeddings capture latent knowledge from materials science literature. Nature 571 , 95–98 (2019).

Kononova, O. et al. Opportunities and challenges of text mining in materials research. iScience 24, 102155 (2021).

Olivetti, E. A. et al. Data-driven materials research enabled by natural language processing and information extraction. Appl. Phys. Rev. 7 , 041317 (2020).

Swain, M. C. & Cole, J. M. Chemdataextractor: a toolkit for automated extraction of chemical information from the scientific literature. J. Chem. Inf. Model. 56 , 1894–1904 (2016).

Park, S. et al. Text mining metal–organic framework papers. J. Chem. Inf. Model. 58 , 244–251 (2018).

Court, C. J. & Cole, J. M. Auto-generated materials database of curie and néel temperatures via semi-supervised relationship extraction. Sci. Data 5 , 1–12 (2018).

Huang, S. & Cole, J. M. A database of battery materials auto-generated using chemdataextractor. Sci. Data 7 , 1–13 (2020).

Beard, E. J., Sivaraman, G., Vázquez-Mayagoitia, Á., Vishwanath, V. & Cole, J. M. Comparative dataset of experimental and computational attributes of uv/vis absorption spectra. Sci. Data 6 , 1–11 (2019).

Tayfuroglu, O., Kocak, A. & Zorlu, Y. In silico investigation into H2 uptake in MOFs: combined text/data mining and structural calculations. Langmuir 36, 119–129 (2019).

Weston, L. et al. Named entity recognition and normalization applied to large-scale information extraction from the materials science literature. J. Chem. Inf. Model. 59 , 3692–3702 (2019).

Vaucher, A. C. et al. Automated extraction of chemical synthesis actions from experimental procedures. Nat. Commun. 11 , 1–11 (2020).

He, T. et al. Similarity of precursors in solid-state synthesis as text-mined from scientific literature. Chem. Mater. 32 , 7861–7873 (2020).

Kononova, O. et al. Text-mined dataset of inorganic materials synthesis recipes. Sci. Data 6 , 1–11 (2019).

Kim, E. et al. Materials synthesis insights from scientific literature via text extraction and machine learning. Chem. Mater. 29 , 9436–9444 (2017).

Kim, E., Huang, K., Jegelka, S. & Olivetti, E. Virtual screening of inorganic materials synthesis parameters with deep learning. npj Comput. Mater. 3 , 1–9 (2017).

Kim, E. et al. Inorganic materials synthesis planning with literature-trained neural networks. J. Chem. Inf. Model. 60 , 1194–1201 (2020).

de Castro, P. B. et al. Machine-learning-guided discovery of the gigantic magnetocaloric effect in HoB2 near the hydrogen liquefaction temperature. NPG Asia Mater. 12, 1–7 (2020).

Cooper, C. B. et al. Design-to-device approach affords panchromatic co-sensitized solar cells. Adv. Energy Mater. 9 , 1802820 (2019).

Yang, X., Dai, Z., Zhao, Y., Liu, J. & Meng, S. Low lattice thermal conductivity and excellent thermoelectric behavior in Li3Sb and Li3Bi. J. Phys. Condens. Matter 30, 425401 (2018).

Wang, Y., Gao, Z. & Zhou, J. Ultralow lattice thermal conductivity and electronic properties of monolayer 1T phase semimetal SiTe2 and SnTe2. Phys. E 108, 53–59 (2019).

Jong, U.-G., Yu, C.-J., Kye, Y.-H., Hong, S.-N. & Kim, H.-G. Manifestation of the thermoelectric properties in Ge-based halide perovskites. Phys. Rev. Mater. 4, 075403 (2020).

Yamamoto, K., Narita, G., Yamasaki, J. & Iikubo, S. First-principles study of thermoelectric properties of mixed iodide perovskite Cs(B, B')I3 (B, B' = Ge, Sn, and Pb). J. Phys. Chem. Solids 140, 109372 (2020).

Viennois, R. et al. Anisotropic low-energy vibrational modes as an effect of cage geometry in the binary barium silicon clathrate Ba24Si100. Phys. Rev. B 101, 224302 (2020).

Haque, E. Effect of electron-phonon scattering, pressure and alloying on the thermoelectric performance of TMCu3Ch4 (TM = V, Nb, Ta; Ch = S, Se, Te). arXiv. https://arxiv.org/abs/2010.08461 (2020).

Yahyaoglu, M. et al. Phase-transition-enhanced thermoelectric transport in rickardite mineral Cu3–xTe2. Chem. Mater. 33, 1832–1841 (2021).

Ho, D., Shkolnik, A. S., Ferraro, N. J., Rizkin, B. A. & Hartman, R. L. Using word embeddings in abstracts to accelerate metallocene catalysis polymerization research. Computers Chem. Eng. 141 , 107026 (2020).

Abdar, M. et al. A review of uncertainty quantification in deep learning: techniques, applications and challenges. Inf. Fusion . 76 , 243–297 (2021).

Mi, L. et al. Training-free uncertainty estimation for dense regression: sensitivity as a surrogate. arXiv. https://arxiv.org/abs/1910.04858 (2019).

Teye, M., Azizpour, H. & Smith, K. Bayesian uncertainty estimation for batch normalized deep networks . in International Conference on Machine Learning , 4907–4916 (PMLR, 2018).

Zhang, J., Kailkhura, B. & Han, T. Y.-J. Leveraging uncertainty from deep learning for trustworthy material discovery workflows. ACS Omega 6 , 12711–12721 (2021).

Meredig, B. et al. Can machine learning identify the next high-temperature superconductor? examining extrapolation performance for materials discovery. Mol. Syst. Des. Eng. 3 , 819–825 (2018).

Zhang, J., Kailkhura, B. & Han, T. Y.-J. Mix-n-match: Ensemble and compositional methods for uncertainty calibration in deep learning . in International Conference on Machine Learning , 11117–11128 (PMLR, 2020).

Seoh, R. Qualitative analysis of monte carlo dropout. arXiv. https://arxiv.org/abs/2007.01720 (2020).

Gal, Y. & Ghahramani, Z. Dropout as a bayesian approximation: Representing model uncertainty in deep learning . in international conference on machine learning , 1050–1059 (PMLR, 2016).

Jain, S., Liu, G., Mueller, J. & Gifford, D. Maximizing overall diversity for improved uncertainty estimates in deep ensembles . In Proceedings of the AAAI Conference on Artificial Intelligence , 34 , 4264–4271. https://doi.org/10.1609/aaai.v34i04.5849 (2020).

Ganaie, M. et al. Ensemble deep learning: a review. arXiv. https://arxiv.org/abs/2104.02395 (2021).

Fort, S., Hu, H. & Lakshminarayanan, B. Deep ensembles: a loss landscape perspective. arXiv. https://arxiv.org/abs/1912.02757 (2019).

Lakshminarayanan, B., Pritzel, A. & Blundell, C. Simple and scalable predictive uncertainty estimation using deep ensembles. arXiv. https://arxiv.org/abs/1612.01474 (2016).

Moon, S. J., Jeon, J.-J., Lee, J. S. H. & Kim, Y. Learning multiple quantiles with neural networks. J. Comput. Graph. Stat. 30 , 1–11. https://doi.org/10.1080/10618600.2021.1909601 (2021).

Rasmussen, C. E. Gaussian processes in machine learning. In Summer School on Machine Learning, 63–71 (Springer, 2003).

Hegde, P., Heinonen, M., Lähdesmäki, H. & Kaski, S. Deep learning with differential gaussian process flows. arXiv. https://arxiv.org/abs/1810.04066 (2018).

Wilson, A. G., Hu, Z., Salakhutdinov, R. & Xing, E. P. Deep kernel learning. in Artificial intelligence and statistics , 370–378 (PMLR, 2016).

Hegde, V. I. et al. Reproducibility in high-throughput density functional theory: a comparison of aflow, materials project, and oqmd. arXiv. https://arxiv.org/abs/2007.01988 (2020).

Ying, R., Bourgeois, D., You, J., Zitnik, M. & Leskovec, J. Gnnexplainer: Generating explanations for graph neural networks. Adv. Neural Inf. Process. Syst. 32 , 9240 (2019).

Roch, L. M. et al. Chemos: orchestrating autonomous experimentation. Sci. Robot. 3 , eaat5559 (2018).

Szymanski, N. et al. Toward autonomous design and synthesis of novel inorganic materials. Mater. Horiz. 8 , 2169–2198. https://doi.org/10.1039/D1MH00495F (2021).

MacLeod, B. P. et al. Self-driving laboratory for accelerated discovery of thin-film materials. Sci. Adv. 6 , eaaz8867 (2020).

Stach, E. A. et al. Autonomous experimentation systems for materials development: a community perspective. Matter https://www.cell.com/matter/fulltext/S2590-2385(21)00306-4 (2021).

Rakita, Y. et al. Active reaction control of Cu redox state based on real-time feedback from in situ synchrotron measurements. J. Am. Chem. Soc. 142, 18758–18762 (2020).

Chmiela, S. et al. Machine learning of accurate energy-conserving molecular force fields. Sci. Adv. 3 , e1603015 (2017).

Thomas, R. S. et al. The us federal tox21 program: a strategic and operational plan for continued leadership. Altex 35 , 163 (2018).

Russell Johnson, N. Nist computational chemistry comparison and benchmark database . In The 4th Joint Meeting of the US Sections of the Combustion Institute . https://ci.confex.com/ci/2005/techprogram/P1309.HTM (2005).

Lopez, S. A. et al. The harvard organic photovoltaic dataset. Sci. Data 3 , 1–7 (2016).

Johnson, R. D. et al. Nist computational chemistry comparison and benchmark database . http://srdata.nist.gov/cccbdb (2006).

Mobley, D. L. & Guthrie, J. P. Freesolv: a database of experimental and calculated hydration free energies, with input files. J. Computer Aided Mol. Des. 28 , 711–720 (2014).

Andersen, C. W. et al. Optimade: an api for exchanging materials data. arXiv. https://arxiv.org/abs/2103.02068 (2021).

Chanussot, L. et al. Open catalyst 2020 (oc20) dataset and community challenges. ACS Catal. 11 , 6059–6072 (2021).

Dunn, A., Wang, Q., Ganose, A., Dopp, D. & Jain, A. Benchmarking materials property prediction methods: the matbench test set and automatminer reference algorithm. npj Comput. Mater. 6 , 1–10 (2020).

Talirz, L. et al. Materials cloud, a platform for open computational science. Sci. Data 7 , 1–12 (2020).

Chung, Y. G. et al. Advances, updates, and analytics for the computation-ready, experimental metal–organic framework database: Core mof 2019. J. Chem. Eng. Data 64 , 5985–5998 (2019).

Sussman, J. L. et al. Protein data bank (pdb): database of three-dimensional structural information of biological macromolecules. Acta Crystallogr. Sec. D Biol. Crystallogr. 54 , 1078–1084 (1998).

Benson, M. L. et al. Binding moad, a high-quality protein–ligand database. Nucleic Acids Res. 36 , D674–D678 (2007).

Fung, V., Zhang, J., Juarez, E. & Sumpter, B. G. Benchmarking graph neural networks for materials chemistry. npj Comput. Mater. 7 , 1–8 (2021).

Louis, S.-Y. et al. Graph convolutional neural networks with global attention for improved materials property prediction. Phys. Chem. Chem. Phys. 22 , 18141–18148 (2020).

Khorshidi, A. & Peterson, A. A. Amp: A modular approach to machine learning in atomistic simulations. Computer Phys. Commun. 207 , 310–324 (2016).

Yao, K., Herr, J. E., Toth, D. W., Mckintyre, R. & Parkhill, J. The tensormol-0.1 model chemistry: a neural network augmented with long-range physics. Chem. Sci. 9 , 2261–2269 (2018).

Doerr, S. et al. Torchmd: A deep learning framework for molecular simulations. J. Chem. Theory Comput. 17 , 2355–2363 (2021).

Kolb, B., Lentz, L. C. & Kolpak, A. M. Discovering charge density functionals and structure-property relationships with prophet: A general framework for coupling machine learning and first-principles methods. Sci. Rep. 7 , 1–9 (2017).

Zhang, L., Han, J., Wang, H., Car, R. & Weinan, E. Deep potential molecular dynamics: a scalable model with the accuracy of quantum mechanics. Phys. Rev. Lett. 120 , 143001 (2018).

Geiger, M. et al. e3nn/e3nn: 2021-06-21 . https://doi.org/10.5281/zenodo.5006322 (2021).

Duvenaud, D. K. et al. Convolutional networks on graphs for learning molecular fingerprints. In Adv. Neural Inf. Process. Syst. 28 (eds Cortes, C., Lawrence, N. D., Lee, D. D., Sugiyama, M. & Garnett, R.) 2224–2232 (Curran Associates, Inc., 2015).

Li, X. et al. Deepchemstable: Chemical stability prediction with an attention-based graph convolution network. J. Chem. Inf. Model. 59 , 1044–1049 (2019).

Wu, Z. et al. MoleculeNet: A benchmark for molecular machine learning. Chem. Sci. 9 , 513–530 (2018).

Wang, A. Y.-T., Kauwe, S. K., Murdock, R. J. & Sparks, T. D. Compositionally restricted attention-based network for materials property predictions. npj Comput. Mater. 7 , 77 (2021).

Zhou, Q. et al. Learning atoms for materials discovery. Proc. Natl Acad. Sci. USA 115 , E6411–E6417 (2018).

O’Boyle, N. & Dalke, A. Deepsmiles: An adaptation of smiles for use in machine-learning of chemical structures. ChemRxiv https://doi.org/10.26434/chemrxiv.7097960.v1 (2018).

Green, H., Koes, D. R. & Durrant, J. D. Deepfrag: a deep convolutional neural network for fragment-based lead optimization. Chem. Sci. 12 , 8036–8047. https://doi.org/10.1039/D1SC00163A (2021).

Elhefnawy, W., Li, M., Wang, J. & Li, Y. Deepfrag-k: a fragment-based deep learning approach for protein fold recognition. BMC Bioinformatics 21 , 203 (2020).

Paul, A. et al. Chemixnet: Mixed dnn architectures for predicting chemical properties using multiple molecular representations. arXiv . https://arxiv.org/abs/1811.08283 (2018).

Paul, A. et al. Transfer learning using ensemble neural networks for organic solar cell screening . in 2019 International Joint Conference on Neural Networks (IJCNN) , 1–8 (IEEE, 2019).

Choudhary, K. et al. Computational screening of high-performance optoelectronic materials using optb88vdw and tb-mbj formalisms. Sci. Data 5 , 1–12 (2018).

Wong-Ng, W., McMurdie, H., Hubbard, C. & Mighell, A. D. Jcpds-icdd research associateship (cooperative program with nbs/nist). J. Res. Natl Inst. Standards Technol. 106 , 1013 (2001).

Belsky, A., Hellenbrandt, M., Karen, V. L. & Luksch, P. New developments in the inorganic crystal structure database (icsd): accessibility in support of materials research and design. Acta Crystallogr. Sec. B Struct. Sci. 58 , 364–369 (2002).

Gražulis, S. et al. Crystallography Open Database—an open-access collection of crystal structures. J. Appl. Crystallogr. 42 , 726–729 (2009).

Linstrom, P. J. & Mallard, W. G. The nist chemistry webbook: a chemical data resource on the internet. J. Chem. Eng. Data 46 , 1059–1063 (2001).

Saito, T. et al. Spectral database for organic compounds (sdbs). (National Institute of Advanced Industrial Science and Technology (AIST), 2006).

Steinbeck, C., Krause, S. & Kuhn, S. Nmrshiftdb constructing a free chemical information system with open-source components. J. Chem. inf. Computer Sci. 43 , 1733–1739 (2003).

Fung, V., Hu, G., Ganesh, P. & Sumpter, B. G. Machine learned features from density of states for accurate adsorption energy prediction. Nat. Commun. 12 , 1–11 (2021).

Kong, S., Guevarra, D., Gomes, C. P. & Gregoire, J. M. Materials representation and transfer learning for multi-property prediction. arXiv . https://arxiv.org/abs/2106.02225 (2021).

Bang, K., Yeo, B. C., Kim, D., Han, S. S. & Lee, H. M. Accelerated mapping of electronic density of states patterns of metallic nanoparticles via machine-learning. Sci. Rep . 11 , 1–11 (2021).

Chen, D. et al. Automating crystal-structure phase mapping by combining deep learning with constraint reasoning. Nat. Machine Intell. 3 , 812–822 (2021).

Ophus, C. A fast image simulation algorithm for scanning transmission electron microscopy. Adv. Struct. Chem. imaging 3 , 1–11 (2017).

Aversa, R., Modarres, M. H., Cozzini, S., Ciancio, R. & Chiusole, A. The first annotated set of scanning electron microscopy images for nanoscience. Sci. Data 5 , 1–10 (2018).

Ziatdinov, M. et al. Causal analysis of competing atomistic mechanisms in ferroelectric materials from high-resolution scanning transmission electron microscopy data. npj Comput. Mater. 6 , 1–9 (2020).

Souza, A. L. F. et al. Deepfreak: Learning crystallography diffraction patterns with automated machine learning. arXiv. http://arxiv.org/abs/1904.11834 (2019).

Scime, L. et al. Layer-wise imaging dataset from powder bed additive manufacturing processes for machine learning applications (peregrine v2021-03). Tech. Rep . https://www.osti.gov/biblio/1779073 (2021).

Somnath, S., Smith, C. R., Laanait, N., Vasudevan, R. K. & Jesse, S. Usid and pycroscopy–open source frameworks for storing and analyzing imaging and spectroscopy data. Microsc. Microanal. 25 , 220–221 (2019).

Savitzky, B. H. et al. py4dstem: A software package for multimodal analysis of four-dimensional scanning transmission electron microscopy datasets. arXiv. https://arxiv.org/abs/2003.09523 (2020).

Madsen, J. & Susi, T. The abtem code: transmission electron microscopy from first principles. Open Res. Euro. 1 , 24 (2021).

Koch, C. T. Determination of core structure periodicity and point defect density along dislocations . (Arizona State University, 2002).

Allen, L. J. et al. Modelling the inelastic scattering of fast electrons. Ultramicroscopy 151 , 11–22 (2015).

Maxim, Z., Jesse, S., Sumpter, B. G., Kalinin, S. V. & Dyck, O. Tracking atomic structure evolution during directed electron beam induced si-atom motion in graphene via deep machine learning. Nanotechnology 32 , 035703 (2020).

Khadangi, A., Boudier, T. & Rajagopal, V. Em-net: Deep learning for electron microscopy image segmentation . in 2020 25th International Conference on Pattern Recognition (ICPR) , 31–38 (IEEE, 2021).

Meyer, C. et al. Nion swift: Open source image processing software for instrument control, data acquisition, organization, visualization, and analysis using python. Microsc. Microanal. 25 , 122–123 (2019).

Kim, J., Tiong, L. C. O., Kim, D. & Han, S. S. Deep learning-based prediction of material properties using chemical compositions and diffraction patterns as experimentally accessible inputs. J. Phys. Chem Lett. 12 , 8376–8383 (2021).

Von Chamier, L. et al. Zerocostdl4mic: an open platform to simplify access and use of deep-learning in microscopy. BioRxiv. https://www.biorxiv.org/content/10.1101/2020.03.20.000133v4 (2020).

Jha, D. et al. Peak area detection network for directly learning phase regions from raw x-ray diffraction patterns . in 2019 International Joint Conference on Neural Networks (IJCNN) , 1–8 (IEEE, 2019).

Hawizy, L., Jessop, D. M., Adams, N. & Murray-Rust, P. Chemicaltagger: A tool for semantic text-mining in chemistry. J. Cheminformatics 3 , 1–13 (2011).

Corbett, P. & Boyle, J. Chemlistem: chemical named entity recognition using recurrent neural networks. J. Cheminformatics 10 , 1–9 (2018).

Rocktäschel, T., Weidlich, M. & Leser, U. Chemspot: a hybrid system for chemical named entity recognition. Bioinformatics 28 , 1633–1640 (2012).

Jessop, D. M., Adams, S. E., Willighagen, E. L., Hawizy, L. & Murray-Rust, P. Oscar4: a flexible architecture for chemical text-mining. J. Cheminformatics 3 , 1–12 (2011).

Leaman, R., Wei, C.-H. & Lu, Z. tmchem: a high performance approach for chemical named entity recognition and normalization. J. Cheminformatics 7 , 1–10 (2015).

Suzuki, Y. et al. Symmetry prediction and knowledge discovery from X-ray diffraction patterns using an interpretable machine learning approach. Sci. Rep. 10 , 21790 (2020).

Acknowledgements

Contributions from K.C. were supported by the financial assistance award 70NANB19H117 from the U.S. Department of Commerce, National Institute of Standards and Technology. E.A.H. and R.C. (CMU) were supported by the National Science Foundation under grant CMMI-1826218 and the Air Force D3OM2S Center of Excellence under agreement FA8650-19-2-5209. A.J., C.C., and S.P.O. were supported by the Materials Project, funded by the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences, Materials Sciences and Engineering Division under contract no. DE-AC02-05-CH11231: Materials Project program KC23MP. S.J.L.B. was supported by the U.S. National Science Foundation through grant DMREF-1922234. A.A. and A.C. were supported by NIST award 70NANB19H005 and NSF award CMMI-2053929.

Author information

Authors and Affiliations

Materials Science and Engineering Division, National Institute of Standards and Technology, Gaithersburg, MD, 20899, USA

Kamal Choudhary & Francesca Tavazza

Theiss Research, La Jolla, CA, 92037, USA

Kamal Choudhary

DeepMaterials LLC, Silver Spring, MD, 20906, USA

Material Measurement Science Division, National Institute of Standards and Technology, Gaithersburg, MD, 20899, USA

Brian DeCost

Department of NanoEngineering, University of California San Diego, San Diego, CA, 92093, USA

Chi Chen & Shyue Ping Ong

Energy Technologies Area, Lawrence Berkeley National Laboratory, Berkeley, CA, USA

Anubhav Jain

Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, PA, 15213, USA

Ryan Cohn & Elizabeth Holm

Department of Materials Science and Engineering, Northwestern University, Evanston, IL, 60208, USA

Cheol Woo Park & Chris Wolverton

Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL, 60208, USA

Alok Choudhary & Ankit Agrawal

Department of Applied Physics and Applied Mathematics and the Data Science Institute, Fu Foundation School of Engineering and Applied Sciences, Columbia University, New York, NY, 10027, USA

Simon J. L. Billinge

Contributions

The authors contributed equally to the search as well as analysis of the literature and writing of the manuscript.

Corresponding author

Correspondence to Kamal Choudhary.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article

Choudhary, K., DeCost, B., Chen, C. et al. Recent advances and applications of deep learning methods in materials science. npj Comput Mater 8 , 59 (2022). https://doi.org/10.1038/s41524-022-00734-6

Received: 25 October 2021

Accepted: 24 February 2022

Published: 05 April 2022

DOI: https://doi.org/10.1038/s41524-022-00734-6

Recent progress in leveraging deep learning methods for question answering

  • Published: 16 January 2022
  • Volume 34, pages 2765–2783 (2022)

  • Tianyong Hao,
  • Xinxin Li,
  • Yulan He,
  • Fu Lee Wang &
  • Yingying Qu (ORCID: orcid.org/0000-0002-0365-3168)

Question answering, one of the core tasks in natural language processing, enables machines to understand questions posed in natural language and answer them concisely. From web search to expert systems, question answering systems are applied across many domains to assist information seeking. Deep learning methods have boosted a wide range of question answering tasks and delivered substantial performance gains for the essential steps of question answering, so leveraging them has drawn much attention from both academia and industry in recent years. This paper provides a systematic review of recent developments in deep learning methods for question answering, covering methods, datasets, and applications. The methods are discussed in terms of network structure, methodological innovations, and effectiveness. The survey is intended to summarize recent research progress and outline future directions for deep learning methods in question answering.
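
To make the survey's subject concrete, the sketch below runs extractive question answering with a pretrained transformer. It is purely illustrative and not taken from the paper; the Hugging Face `transformers` pipeline API and the SQuAD-fine-tuned checkpoint named here are assumptions about a typical setup, and any comparable reader model could be substituted.

```python
# Minimal illustrative sketch of extractive question answering with a
# pretrained transformer. Assumes the Hugging Face `transformers` package
# is installed; the checkpoint name is an assumed example, not from the paper.
from transformers import pipeline

qa = pipeline(
    "question-answering",
    model="distilbert-base-cased-distilled-squad",  # assumed SQuAD-fine-tuned model
)

context = (
    "Question answering systems take a natural-language question and a "
    "passage of text, and return a short span of the passage as the answer."
)

result = qa(question="What do question answering systems return?", context=context)

# The pipeline returns the predicted answer span, a confidence score,
# and character offsets into the context.
print(result["answer"], result["score"])
```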

Adlouni YE, Rodríguez H, Meknassi M, El Alaoui SO, En-nahnahi N (2019) A multi-approach to community question answering. Expert Sys Appl 137:432–442

Alom MZ, Taha TM, Yakopcic C, Westberg S, Sidike P, Nasrin MS, Hasan M, van Essen BC, Awwal AAS, Asari VK (2019) A state-of-the-art survey on deep learning theory and architectures. Electronics 8(3):292

Bahdanau D, Cho K, Bengio Y (2015) Neural machine translation by jointly learning to align and translate. In: ICLR

Banerjee S, Naskar S, Rosso P, Bandyopadhyay S (2018) Code mixed cross script factoid question classification - a deep learning approach. J Intell & Fuzzy Sys 34(5):2959–2969

Bast H, Haussmann E (2015) More accurate question answering on freebase. In: CIKM’15, pp 1431–1440

Ben Abacha A, Demner-Fushman D (2019) A question-entailment approach to question answering. BMC Bioinfo 20(1):e33

Bengio Y (2009) Learning deep architectures for AI. Found Trends in Machine Learn 2(1):1–127

Berant J, Chou A, Roy F, Liang P (2013) Semantic parsing on freebase from question-answer pairs. In: EMNLP, pp 1533–1544

Bi M, Zhang Q, Zuo M, Xu Z, Jin Q (2019) Bi-directional lstm model with symptoms-frequency position attention for question answering system in medical domain. Neural Process Lett 51(5):570

Bisk Y, Reddy S, Blitzer J, Hockenmaier J, Steedman M (2016) Evaluating induced ccg parsers on grounded semantic parsing. In: EMNLP, pp 2022–2027

Cai L, Zhou S, Yan X, Yuan R (2019) A stacked bilstm neural network based on coattention mechanism for question answering. Computat Intell Neurosci 9:1–12

Cai LQ, Wei M, Zhou ST, Yan X (2020) Intelligent question answering in restricted domains using deep learning and question pair matching. IEEE Access 8:32922–32934

Chen Z, Zhang C, Zhao Z, Yao C, Cai D (2018) Question retrieval for community-based question answering via heterogeneous social influential network. Neurocomputing 285:117–124

Chen ZY, Chang CH, Chen YP, Nayak J, Ku LW (2019) Uhop: An unrestricted-hop relation extraction framework for knowledge-based question answering. In: NAACL-HLT, pp 345–356

Cho K, van Merrienboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, Bengio Y (2014) Learning phrase representations using rnn encoder-decoder for statistical machine translation. In: EMNLP, pp 1724–1734

Cortes E, Woloszyn V, Binder A, Himmelsbach T, Barone D, Möller S (2020) An empirical comparison of question classification methods for question answering systems. In: LREC, pp 5408–5416

Croce D, Filice S, Basili R (2019) Making sense of kernel spaces in neural learning. Computer Speech & Language 58:51–75

Dargan S, Kumar M, Ayyagari MR, Kumar G (2019) A survey of deep learning and its applications: A new paradigm to machine learning. Archi Computat Method Eng 85(4):114

Devlin J, Chang MW, Lee K, Toutanova K (2019) Bert: Pre-training of deep bidirectional transformers for language understanding. In: NAACL-HLT, pp 4171–4186

Dimitrakis E, Sgontzos K, Tzitzikas Y (2019) A survey on question answering systems over linked data and documents. J Intell Info Sys 51(5):570

Dong L, Mallinson J, Reddy S, Lapata M (2017) Learning to paraphrase for question answering. In: EMNLP, pp 875–886

Du X, Shao J, Cardie C (2017) Learning to ask: Neural question generation for reading comprehension. In: ACL, pp 1342–1352

Dubey M, Banerjee D, Abdelkawi A, Lehmann J (2019) Lc-quad 2.0: A large dataset for complex question answering over wikidata and dbpedia. SEMWEB 11779:69–78

Elman JL (1990) Finding structure in time. Cognitive Sci 14(2):179–211

Elsahar H, Gravier C, Laforest F (2018) Zero-shot question generation from knowledge graphs for unseen predicates and entity types. In: NAACL-HLT, pp 218–228

Fukushima K (1988) Neocognitron: A hierarchical neural network capable of visual pattern recognition. Neural Networks 1(2):119–130

Garg S, Vu T, Moschitti A (2020) Tanda: Transfer and adapt pre-trained transformer models for answer sentence selection. AAAI 34:7780–7788

Goldberg Y (2016) A primer on neural network models for natural language processing. J Artif Intell Res 57(1):345–420

Green BF, Wolf AK, Chomsky C, Laughery K (1961) Baseball: an automatic question-answerer. In: IRE-AIEE-ACM ’61 (Western), pp 219–224

Gulcehre C, Ahn S, Nallapati R, Zhou B, Bengio Y (2016) Pointing the unknown words. In: ACL, pp 140–149

Hao Z, Wu B, Wen W, Cai R (2019) A subgraph-representation-based method for answering complex questions over knowledge bases. Neural Networks 119:57–65

He J, Fu M, Tu M (2019) Applying deep matching networks to chinese medical question answering: a study and a dataset. BMC Med Info Decision Making 19(S2):1

Hirschman L, Gaizauskas R (2001) Natural language question answering: the view from here. Nat Lang Eng 7(4):275–300

Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Computat 9(8):1735–1780

Huang H, Wei X, Nie L, Mao X, Xu XS (2019) From question to text: Question-oriented feature attention for answer selection. ACM Trans Info Sys 37(1):1–33

Huang W, Qu Q, Yang M (2020) Interactive knowledge-enhanced attention network for answer selection. Neural Comput Appl 32(15):11343–11359

Indurthi SR, Raghu D, Khapra MM, Joshi S (2017) Generating natural language question-answer pairs from a knowledge graph using a rnn based question generation model. In: EACL, pp 376–385

Jiang B, Tan L, Ren Y, Li F (2019) Intelligent interaction with virtual geographical environments based on geographic knowledge graph. ISPRS Int J Geo-Info 8(10):428

Jing L, Gulcehre C, Peurifoy J, Shen Y, Tegmark M, Soljacic M, Bengio Y (2019) Gated orthogonal recurrent units: on learning to forget. Neural Computat 31(4):765–783

Khalifa M, Shaalan K (2019) Character convolutions for arabic named entity recognition with long short-term memory networks. Comp Speech & Language 58:335–346

Kim S, Park D, Choi Y, Lee K, Kim B, Jeon M, Kim J, Tan AC, Kang J (2018) A pilot study of biomedical text comprehension using an attention-based deep neural reader: design and experimental analysis. JMIR Med Info 6(1):e2

Kim Y, Lee H, Shin J, Jung K (2019) Improving neural question generation using answer separation. AAAI 33:6602–6609

Kolomiyets O, Moens MF (2011) A survey on question answering technology from an information retrieval perspective. Info Sci 181(24):5412–5434

Kumar A, Irsoy O, Ondruska P, Iyyer M, Bradbury J, Gulrajani I, Zhong V, Paulus R, Socher R (2016) Ask me anything: Dynamic memory networks for natural language processing. In: ICML, pp 1378–1387

Kumar V, Hua Y, Ramakrishnan G, Qi G, Gao L, Li YF (2019) Difficulty-controllable multi-hop question generation from knowledge graphs. SEMWEB 11778:382–398

Lan Y, Jiang J (2020) Query graph generation for answering multi-hop complex questions from knowledge bases. In: ACL, pp 969–974

Lan Y, Wang S, Jiang J (2019) Knowledge base question answering with a matching-aggregation model and question-specific contextual relations. IEEE/ACM Trans Audio, Speech, and Language Process 27(10):1629–1638

Lan Y, Wang S, Jiang J (2019) Multi-hop knowledge base question answering with an iterative sequence matching model. In: ICDM, pp 359–368

Lan Z, Chen M, Goodman S, Gimpel K, Sharma P, Soricut R (2020) Albert: A lite bert for self-supervised learning of language representations. In: ICLR

Lecun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proceed of the IEEE 86:2278–2324

LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444

Lee CH, Lee HY, Wu SL, Liu CL, Fang W, Hsu JY, Tseng BH (2019) Machine comprehension of spoken content: Toefl listening test and spoken squad. IEEE/ACM Trans on Audio, Speech, and Language Process 27(9):1469–1480

Li J, Sun A, Han J, Li C (2022) A survey on deep learning for named entity recognition. IEEE Transact Knowledge & Data Eng 34:50–70

Li X, Zhang S, Wang B, Gao Z, Fang L, Xu H (2019) A hybrid framework for problem solving of comparative questions. IEEE Access 7:185961–185976

Lin T, Goyal P, Girshick R, He K, Dollár P (2020) Focal loss for dense object detection. IEEE Trans Pattern Anal Machine Intell 42(2):318–327

Liu D, Niu Z, Zhang C, Zhang J (2019) Multi-scale deformable cnn for answer selection. IEEE Access 7:164986–164995

Liu H, Liu Y, Wong LP, Lee LK, Hao T (2020) A hybrid neural network bert-cap based on pre-trained language model and capsule network for user intent classification. Complexity 2020:1–11

Luo K, Lin F, Luo X, Zhu K (2018) Knowledge base question answering via encoding of complex query graphs. In: EMNLP, pp 2185–2194

Mahmoud A, Zrigui M (2019) Sentence embedding and convolutional neural network for semantic textual similarity detection in arabic language. Arab J Sci Eng 44(11):9263–9274

Minaee S, Kalchbrenner N, Cambria E, Nikzad N, Chenaghlu M, Gao J (2021) Deep learning-based text classification: A comprehensive review. ACM Comput Surv 54(3):62:1–62:40

Ojokoh B, Adebisi E (2019) A review of question answering systems. J Web Eng 17(8):717–758

Otter DW, Medina JR, Kalita JK (2021) A survey of the usages of deep learning in natural language processing. IEEE Trans Neural Network Learn Sys 32:604–624

Pan L, Lei W, Chua TS, Kan MY (2019) Recent advances in neural question generation. ArXiv abs/1905.08949

Parshakova T, Rameau F, Serdega A, Kweon IS, Kim DS (2019) Latent question interpretation through variational adaptation. IEEE/ACM Trans Audio, Speech and Language Process 27(11):1713–1724

Peters ME, Neumann M, Iyyer M, Gardner M, Clark C, Lee K, Zettlemoyer L (2018) Deep contextualized word representations. In: NAACL-HLT, pp 2227–2237

Qu Y, Liu J, Kang L, Shi Q, Ye D (2018) Question answering over freebase via attentive rnn with similarity matrix based cnn. arXiv: abs/1804.03317

Rajpurkar P, Zhang J, Lopyrev K, Liang P (2016) Squad: 100,000+ questions for machine comprehension of text. In: EMNLP, pp 2383–2392

Ren Q, Cheng X, Su S (2020) Multi-task learning with generative adversarial training for multi-passage machine reading comprehension. AAAI 34:8705–8712

Roy PK, Singh JP (2019) Predicting closed questions on community question answering sites using convolutional neural network. Neural Comput Appl 19(5):53

Sanh V, Debut L, Chaumond J, Wolf T (2019) Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv: abs/1910.01108

Sawant U, Garg S, Chakrabarti S, Ramakrishnan G (2019) Neural architecture for question answering using a knowledge graph and web corpus. Info Retr J 22(3–4):324–349

Shah AA, Ravana SD, Hamid S, Ismail MA (2018) Accuracy evaluation of methods and techniques in web-based question answering systems: a survey. Knowl Info Sys 58(03):611–650

Shao T, Guo Y, Chen H, Hao Z (2019) Transformer-based neural network for answer selection in question answering. IEEE Access 7:26146–26156

Shao T, Kui X, Zhang P, Chen H (2019) Collaborative learning for answer selection in question answering. IEEE Access 7:7337–7347

Shuang K, Liu Y, Zhang W, Zhang Z (2018) Summarization filter: Consider more about the whole query in machine comprehension. IEEE Access 6:58702–58709

Song L, Wang Z, Hamza W, Zhang Y, Gildea D (2018) Leveraging context information for natural question generation. In: NAACL-HLT, New Orleans, Louisiana, pp 569–574

Song Y, Hu QV, He L (2019) P-cnn: Enhancing text matching with positional convolutional neural network. Knowledge-Based Sys 169:67–79

Subramanian S, Wang T, Yuan X, Zhang S, Trischler A, Bengio Y (2018) Neural models for key phrase extraction and question generation. In: QA@ACL, pp 78–88

Sukhbaatar S, Szlam A, Weston J, Fergus R (2015) End-to-end memory networks. In: NIPS, p 2440-2448

Sun Y, Xia T (2019) A hybrid network model for tibetan question answering. IEEE Access 7:52769–52777

Talmor A, Berant J (2018) Repartitioning of the complexwebquestions dataset. arXiv: abs/1807.09623

Talmor A, Berant J (2018) The web as a knowledge-base for answering complex questions. In: NAACL-HLT, pp 641–651

Tan C, Wei F, Zhou Q, Yang N, Du B, Lv W, Zhou M (2018) Context-aware answer sentence selection with hierarchical gated recurrent neural networks. IEEE/ACM Trans Audio, Speech and Language Process 26(3):540–549

Tay Y, Tuan LA, Hui SC (2018) Hyperbolic representation learning for fast and efficient neural question answering. In: WSDM, pp 583–591

Tixier AJP (2018) Notes on deep learning for nlp. arXiv: abs/1808.09772

Tolias K, Chatzis SP (2019) \(t\) -exponential memory networks for question-answering machines. IEEE Trans Neural Networks Learn Sys 30(8):2463–2477

Wang M, A Smith N, Mitamura T (2007) What is the jeopardy model? a quasi-synchronous grammar for qa. In: EMNLP-CoNLL, pp 22–32

Wang S, Zhou W, Jiang C (2020) A survey of word embeddings based on deep learning. Computing 102(3):717–740

Wang Z, Liu J, Xiao X, Lyu Y, Wu T (2018) Joint training of candidate extraction and answer selection for reading comprehension. In: ACL, pp 1715–1724

Wen J, Tu H, Cheng X, Xie R, Yin W (2019) Joint modeling of users, questions and answers for answer selection in cqa. Expert Sys Appl 118:563–572

Weston J, Bordes A, Chopra S, Rush AM, van Merriënboer B, Joulin A, Mikolov T (2016) Towards ai-complete question answering: A set of prerequisite toy tasks. In: ICLR (Poster)

Wu Y, Wu W, Li Z, Zhou M (2018) Knowledge enhanced hybrid neural network for text matching. In: AAAI, pp 5586–5593

Wulamu A, Sun Z, Xie Y, Xu C, Yang A (2019) An improved end-to-end memory network for qa tasks. Computers, Materials & Continua 60(3):1283–1295

Xia C, Zhang C, Yan X, Chang Y, Yu P (2018) Zero-shot user intent detection via capsule neural networks. In: EMNLP, pp 3090–3099

Xin J, Lin Y, Liu Z, Sun M (2018) Improving neural fine-grained entity typing with knowledge attention. In: AAAI, pp 5997–6004

Yang B, Mitchell T (2017) Leveraging knowledge bases in lstms for improving machine reading. In: ACL, pp 1436–1446

Yang M, Tu W, Qu Q, Zhou W, Liu Q, Zhu J (2019) Advanced community question answering by leveraging external knowledge and multi-task learning. Knowledge-Based Sys 171:106–119

Yang X, Fan P (2019) Convolutional end-to-end memory networks for multi-hop reasoning. IEEE Access 7:135268–135276

Yang Z, Dai Z, Yang Y, Carbonell J, Salakhutdinov R, Le QV (2019) Xlnet: generalized autoregressive pretraining for language understanding. In: NeurIPS, pp 5754–5764

Yao X (2014) Feature-driven question answering with natural language alignment. John Hopkins University (PhD thesis)

Yih Wt, Richardson M, Meek C, Chang MW, Suh J (2016) The value of semantic parse labeling for knowledge base question answering. In: ACL, pp 201–206

Young T, Hazarika D, Poria S, Cambria E (2018) Recent trends in deep learning based natural language processing. IEEE Comput Intell Mag 13(3):55–75

Yuan X, Wang T, Gulcehre C, Sordoni A, Bachman P, Zhang S, Subramanian S, Trischler A (2017) Machine comprehension by text-to-text neural question generation. In: Rep4NLP@ACL, pp 15–25

Yue C, Cao H, Xiong K, Cui A, Qin H, Li M (2017) Enhanced question understanding with dynamic memory networks for textual question answering. Expert Sys Appl 80:39–45

Zhang L, Winn J, Tomioka R (2016) Gaussian attention model and its application to knowledge base embedding and question answering. arXiv: abs/1611.02266

Zhang S, Zhang X, Wang H, Cheng J, Li P, Ding Z (2017) Chinese medical question answer matching using end-to-end character-level multi-scale cnns. Appl Sci 7(8):767

Zhang S, Zhang X, Wang H, Guo L, Liu S (2018) Multi-scale attentive interaction networks for chinese medical question answer selection. IEEE Access 6:74061–74071

Zhang S, Zhang W, Niu J (2019) Improving short-text representation in convolutional networks by dependency parsing. Knowledge and Information Systems 61(1):463–484

Zhang X, Lu W, Li F, Peng X, Zhang R (2019) Deep feature fusion model for sentence semantic matching. Comput, Mater & Continua 61(2):601–616

Zhang Y, Dai H, Kozareva Z, Smola AJ, Le Song (2018) Variational reasoning for question answering with knowledge graph. In: AAAI, pp 6069–6076

Zhao Y, Ni X, Ding Y, Ke Q (2018) Paragraph-level neural question generation with maxout pointer and gated self-attention networks. In: EMNLP, pp 3901–3910

Zhou M, Huang M, Zhu X (2018) An interpretable reasoning network for multi-relation question answering. In: COLING, pp 2010–2022

Zhou Q, Yang N, Wei F, Tan C, Bao H, Zhou M (2017) Neural question generation from text: A preliminary study. NLPCC 10619:662–671

Zhu S, Cheng X, Su S (2020) Knowledge-based question answering by tree-to-sequence learning. Neurocomputing 372:64–72

Download references

Acknowledgements

This work is supported by grants from the National Natural Science Foundation of China (No. 61772146), the Science and Technology Plan of Guangzhou (No. 201804010296), and the Natural Science Foundation of Guangdong Province, China (No. 2018A030310051).

Author information

Authors and affiliations

School of Computer Science, South China Normal University, Guangzhou, China

Tianyong Hao & Yulan He

Institute of Logic and Cognition, Sun Yat-sen University, Guangzhou, China

School of Science and Technology, Hong Kong Metropolitan University, Hong Kong, Hong Kong SAR

Fu Lee Wang

School of Business, Guangdong University of Foreign Studies, Guangzhou, China

Yingying Qu


Corresponding author

Correspondence to Yingying Qu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article

Hao, T., Li, X., He, Y. et al. Recent progress in leveraging deep learning methods for question answering. Neural Comput & Applic 34, 2765–2783 (2022). https://doi.org/10.1007/s00521-021-06748-3


Received: 21 August 2020

Accepted: 11 November 2021

Published: 16 January 2022

Issue Date: February 2022

DOI: https://doi.org/10.1007/s00521-021-06748-3

Share this article

Anyone you share the following link with will be able to read this content:

Sorry, a shareable link is not currently available for this article.

Provided by the Springer Nature SharedIt content-sharing initiative

  • Question answering
  • Deep learning
  • Performance evaluation


May 6, 2024


Do good lessons promote students' attention and behavior?

by Anke Wilde, Leibniz-Institut für Bildungsforschung und Bildungsinformation


Students are better able to regulate themselves in lessons that they consider to be particularly well implemented. This is the conclusion drawn from a study by the DIPF | Leibniz Institute for Research and Information in Education, published in the journal Learning and Instruction.

The link between teaching quality and self-regulation is particularly pronounced for pupils who have problems controlling their behavior and following lessons, for example due to ADHD symptoms.

Good teaching is characterized by the teacher leading the class through the lesson without disruption, encouraging the students to think, taking an interest in them and supporting them individually. The better the teacher is at this, the better the students will be able to regulate their behavior, for example by paying attention, cooperating and adhering to the class rules.

As a result, they learn better. This link, which has already been established in research, has now been examined in more detail in this daily diary study and evaluated with the help of multilevel analyses.

It became clear that the quality of teaching affects self-regulation not only overall but also in each individual lesson, summarizes Dr. Friederike Blume, lead author of the newly published study.

"When teachers are particularly good at classroom management and providing student support in a lesson, students are better able to regulate their behavior. When these two characteristics of good teaching are not working well in a lesson, students also reported that they were less able to concentrate and engage."

Cognitive activation, the third characteristic of good teaching, was hardly relevant for self-regulation. The personal relationship between teacher and student is therefore particularly important, Dr. Blume emphasizes.

This is especially true for students who have difficulties with self-regulation, such as those with attention deficit hyperactivity disorder (ADHD).

"Many teachers find it difficult to establish a positive relationship with children with ADHD symptoms," says the educational researcher. "However, our study showed that in lessons where children with self-regulation difficulties felt particularly supported by their teacher, they were more likely to report being able to concentrate better and follow class rules.

"It is therefore worth taking a positive approach to these children in the classroom and showing a genuine interest in them, as this can reduce the pressure on teachers in the long term and bring more calm to the classroom."

The DIPF researcher also recommends that teachers ask their students for feedback on their teaching from time to time. Although this is still a taboo for many, it can provide valuable information on how to better tailor their teaching to the needs of individual students.

A total of 64 pupils in years 5 and 6 took part in the study. They did not necessarily belong to the same school or class, but were recruited through an email appeal to music schools, sports and leisure centers, for example.

At the start of the study, the children completed a questionnaire about general information such as their grade level and type of school, as well as how they rated their self-regulation skills. Over the next three school weeks, the children answered daily questions about the last lesson of each day.

The questions related to the quality of teaching (e.g., support from the teacher, disruptions in lessons, stimulation of reflection), as well as their ability to regulate themselves in that lesson (e.g., attention, impulse control, motor activity).

The links between the individual lessons and the corresponding daily entries were evaluated using multilevel analysis. Among other things, the results were analyzed on an intrapersonal level, which allows conclusions to be drawn at the level of the individual child. In addition, interpersonal associations were examined, which allows conclusions to be drawn about all participants together.
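
To make the analysis concrete, a multilevel (mixed-effects) model of this general kind might be fit in Python as sketched below; the synthetic data, variable names, and model specification are illustrative placeholders, not the study's actual data or model.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for the diary data: 64 children x 15 lessons,
# with attention loosely driven by two teaching-quality facets.
rng = np.random.default_rng(0)
n_children, n_lessons = 64, 15
child = np.repeat(np.arange(n_children), n_lessons)
cm = rng.normal(size=child.size)        # classroom-management rating per lesson
sup = rng.normal(size=child.size)       # student-support rating per lesson
attention = 0.4 * cm + 0.3 * sup + rng.normal(size=child.size) + rng.normal(size=n_children)[child]
df = pd.DataFrame({"child_id": child, "classroom_management": cm,
                   "student_support": sup, "attention": attention})

# A random intercept per child separates within-child (lesson-to-lesson)
# variation from between-child differences.
model = smf.mixedlm("attention ~ classroom_management + student_support",
                    data=df, groups=df["child_id"])
print(model.fit().summary())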

Limitations of the study

Studies with such an elaborate design, involving daily diaries, always aim to collect data in as short a time as possible. As a result, teaching quality was measured here on the basis of only a few statements, which certainly do not cover all the characteristics of good teaching.

Future studies should therefore take a closer look at classroom interaction processes to explore which features of teaching are particularly beneficial, especially for children with stronger ADHD symptoms.

Furthermore, future studies must show whether the results found here apply to all subjects or only to certain ones, and what role different teaching methods play.

Journal information: Learning and Instruction

Provided by Leibniz-Institut für Bildungsforschung und Bildungsinformation


Coding for Animals Key to Engaging Children in STEM

An education pilot study bridges animal behavior research and computer coding to engage elementary school students in real-world, interdisciplinary science.

By Stacy Kish

Teachers today are facing a bit of a conundrum. Their goal is to prepare their young students to enter a rapidly changing world. Even basic jobs require technical proficiency, which in turn demands computational and analytical skills. To address this need, many educators are pushing to fold these important STEM skills into the elementary curriculum.

Here’s the problem. Young students can lose interest and even develop an aversion to the tasks that build the skills associated with computational thinking. Past studies have pointed to historically low completion rates in STEM fields, with computer science among the lowest. A bridge is needed to engage students in the tasks that develop these important 21st-century skills.

Innovative Teaching Methods: Merging Interests for Effective Learning

“Sometimes students disengage from science because they do not see the science that they are doing in the classroom as connected to the real world,” said Jessica Cantlon, the Ronald J. and Mary Ann Zdrojkowski Associate Professor of Developmental Neuroscience and Psychology at Carnegie Mellon University. “When young students engage in authentic science experiences, they can absorb facts more effectively.”

Unlike in the classroom, science does not fall into neat, separated boxes. Real-world science is interdisciplinary. Past attempts to build this bridge have focused on topics such as robotics, gaming, or animation, but the niche nature of these subjects often leaves many students uninterested.

Cantlon and her colleagues took a different approach. They merged a topic that younger children (grades 3 to 6) enjoy — animals — with one that most kids might look at like a plate of steaming Brussels sprouts — computer coding. The results of their pilot program are available in the April 2 issue of the journal STEM Education Research .

“The focus of this pilot study is whether, in principle, students can acquire skills in computational thinking during a relatively short, loose format, authentic science experience,” said Cantlon. “By learning these skills, the students also maintained or gained excitement throughout the project’s unique immersive experience in animal behavior.”

Integrating Coding and Animal Behavior

Cantlon and her colleagues developed an educational program in collaboration with the Primate Portal, an exhibit at the Seneca Park Zoo in Rochester, N.Y., in which the public can watch olive baboons solve problems presented as computerized tasks on a touchscreen computer.

Through the program, students learned a basic coding language (Scratch) to develop a game that olive baboons at the zoo play to test their intelligence. While the students have the freedom to create their game, they are given different frameworks as a starting point, such as a matching game or a search game, like “Where’s Waldo.” At the end of the five-day programming course, the students took a field trip to the zoo to watch the primates play the games they programmed.

Insights and Challenges: Teacher and Student Perspectives

“The students definitely struggle with the complexity of the code as they had little to no experience with coding,” said Greg Booth, a teacher in the REACH program for gifted/talented students at QUEST Elementary in the Hilton Central School District who worked with the researchers on this project. “They were not given the opportunity to do [coding] in school prior to this, and they had a tremendous amount of intrinsic motivation to learn and develop their coding skills.”

In the first iteration of this pilot project, the team engaged 57 elementary-aged students from three elementary schools in Western New York, of which 36 completed pre- and post-surveys to evaluate the skills acquired during the class.

“It is rare for anyone to collect data from informal, teacher–scientist interventions,” said Cantlon, who is first author on the study. “[The study’s] effect size is large, because [the students] learned a lot of new computational thinking skills by completing the coding projects.”

Enhancing STEM Interest and Skills

According to Cantlon, the effect size of the study is large because students began the course with little to no knowledge about coding and developed concrete coding skills that supported computational thinking. For example, the students learned to write conditional statements and loops and to interpret logical statements. In addition, the students experienced a significant increase in accuracy and problem-solving attempts. The project also showed that it is possible to integrate “learning and doing” into the curriculum for elementary-aged students.
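
To make those constructs concrete, here is a rough text-based analogue in Python (the students actually worked in Scratch's block language, and the trial data below is invented): a loop over game trials with an if/then/else check, much like the matching games described above.

trials = [("apple", "apple"), ("apple", "banana"), ("cherry", "cherry")]  # invented game trials

correct = 0
for sample, choice in trials:        # a loop over the game's trials
    if sample == choice:             # conditional: did the player's choice match the sample?
        correct += 1
        print("Match! +1 point")
    else:
        print("Try again")

print("Score:", correct, "/", len(trials))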

“I love seeing kids get interested in science, especially girls,” said Caroline DeLong, professor at Rochester Institute of Technology and co-author on the study. “This program is a fantastic way to utilize kids’ love of animals as a bridge to learning new computational skills and a way to show them how science works in real time.”

The students’ computational thinking scores improved by 17% from the beginning of the course to the end. There was no difference in the level of improvement between boys and girls who participated in the program. In addition, the students spoke highly of the program, citing their interest in creativity and independence during the learning process. According to Cantlon, the program shows that it is possible to enhance students’ interest in science and cultivate essential 21st-century skills. 

“Yes it is possible to engage students early, in elementary school, and hook them into STEM interests in something that they might think is boring — coding,” said Cantlon. “It is important to engage students before they make up their mind that STEM isn’t for them and while they are still open to learning about STEM and hopefully come to a new conclusion that STEM is for them.”

Cantlon and her colleagues aim to expand this approach to engage a more diverse group of students in future studies.

Cantlon and DeLong were joined by Katherine Becker at the Rochester Institute of Technology on the project, titled “Computational thinking during a short, authentic, interdisciplinary STEM experience for elementary students.” The project received funding from the National Science Foundation and a fund associated with the Ronald and Mary Ann Zdrojkowski Chair in Developmental Neuroscience at CMU.

Student responses to a learning query from one of the post-course surveys

  • “I learned that baboons are just like us.”
  • “The coolest thing I learned was how to use the ‘if then else’ block.”
  • “The thing I got to learn is that baboons can do math! (They are super intelligent.)”
  • “I learned how to think mechanically.”


USC at ICLR 2024

Researchers from the USC School of Advanced Computing present 13 papers at the International Conference on Learning Representations (ICLR).

The International Conference on Learning Representations (ICLR) takes place in Vienna, Austria, May 7-11.


Researchers from the USC School of Advanced Computing , a unit of the USC Viterbi School of Engineering , are presenting 13 papers at the International Conference on Learning Representations (ICLR), a premier gathering of professionals dedicated to the advancement of the branch of artificial intelligence called representation learning, commonly referred to as deep learning. These include faculty and students from the Thomas Lord Department of Computer Science and the Ming Hsieh Department of Computer Science and Electrical Engineering .

ORAL papers are approximately in the top 1%, and SPOTLIGHT papers are in the top 6%, of accepted papers.

Datasets and benchmarks

WildChat: 1M ChatGPT Interaction Logs in the Wild (SPOTLIGHT)

Wenting Zhao, Xiang Ren, Jack Hessel, Claire Cardie, Yejin Choi, Yuntian Deng

General machine learning

Phenomenal Yet Puzzling: Testing Inductive Reasoning Capabilities of Language Models with Hypothesis Refinement (ORAL)

Linlu Qiu, Liwei Jiang, Ximing Lu, Melanie Sclar, Valentina Pyatkin, Chandra Bhagavatula, Bailin Wang, Yoon Kim, Yejin Choi, Nouha Dziri, Xiang Ren

The ability to derive underlying principles from a handful of observations and then generalize to novel situations — known as inductive reasoning — is central to human intelligence. Prior work suggests that language models (LMs) often fall short on inductive reasoning, despite achieving impressive success on research benchmarks. In this work, we conduct a systematic study of the inductive reasoning capabilities of LMs through iterative hypothesis refinement, a technique that more closely mirrors the human inductive process than standard input-output prompting. Iterative hypothesis refinement employs a three-step process: proposing, selecting, and refining hypotheses in the form of textual rules. By examining the intermediate rules, we observe that LMs are phenomenal hypothesis proposers (i.e., generating candidate rules), and when coupled with a (task-specific) symbolic interpreter that is able to systematically filter the proposed set of rules, this hybrid approach achieves strong results across inductive reasoning benchmarks that require inducing causal relations, language-like instructions, and symbolic concepts. However, they also behave as puzzling inductive reasoners, showing notable performance gaps between rule induction (i.e., identifying plausible rules) and rule application (i.e., applying proposed rules to instances), suggesting that LMs are proposing hypotheses without being able to actually apply the rules. Through empirical and human analyses, we further reveal several discrepancies between the inductive reasoning processes of LMs and humans, shedding light on both the potentials and limitations of using LMs in inductive reasoning tasks.

Can We Get the Best of Both Binary Neural Networks and Spiking Neural Networks for Efficient Computer Vision?

Gourav Datta, Zeyu Liu, Peter Beerel

Binary neural networks (BNN) have emerged as an attractive computing paradigm for a wide range of low-power vision tasks. However, state-of-the-art (SOTA) BNNs do not yield any sparsity, and induce a significant number of non-binary operations. On the other hand, activation sparsity can be provided by spiking neural networks (SNN), which have also gained significant traction in recent times. Thanks to this sparsity, SNNs, when implemented on neuromorphic hardware, have the potential to be significantly more power-efficient compared to traditional artificial neural networks (ANN). However, SNNs incur multiple time steps to achieve close to SOTA accuracy. Ironically, this increases latency and energy—costs that SNNs were proposed to reduce—and presents itself as a major hurdle in realizing SNNs’ theoretical gains in practice. This raises an intriguing question: Can we obtain SNN-like sparsity and BNN-like accuracy and enjoy the energy-efficiency benefits of both? To answer this question, in this paper, we present a training framework for sparse binary activation neural networks (BANN) using a novel variant of the Hoyer regularizer. We estimate the threshold of each BANN layer as the Hoyer extremum of a clipped version of its activation map, where the clipping value is trained using gradient descent with our Hoyer regularizer. This approach shifts the activation values away from the threshold, thereby mitigating the effect of noise that can otherwise degrade the BANN accuracy. Our approach outperforms existing BNNs, SNNs, and adder neural networks (that also avoid energy-expensive multiplication operations similar to BNNs and SNNs) in terms of the accuracy-FLOPs trade-off for complex image recognition tasks. Downstream experiments on object detection further demonstrate the efficacy of our approach. Lastly, we demonstrate the portability of our approach to SNNs with multiple time steps. Codes are publicly available here.
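
For readers unfamiliar with the term, one common form of the Hoyer sparsity regularizer is the squared ratio of a tensor's L1 to L2 norms; the Python sketch below shows only that generic form, not the paper's clipped, threshold-estimating variant.

import numpy as np

def hoyer_square(x, eps=1e-12):
    # (sum |x_i|)^2 / sum x_i^2: scale-invariant and smaller for sparser tensors
    x = np.asarray(x, dtype=float).ravel()
    return (np.abs(x).sum() ** 2) / (np.square(x).sum() + eps)

print(hoyer_square(np.ones(8)))                    # 8.0 for a fully dense vector
print(hoyer_square([3.0, 0, 0, 0, 0, 0, 0, 0]))    # 1.0 for a maximally sparse one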

LMUFormer: Low Complexity Yet Powerful Spiking Model With Legendre Memory Units

Zeyu Liu, Gourav Datta, Anni Li, Peter Beerel

Transformer models have demonstrated high accuracy in numerous applications but have high complexity and lack sequential processing capability, making them ill-suited for many streaming applications at the edge, where devices are heavily resource-constrained. Thus motivated, many researchers have proposed reformulating the transformer models as RNN modules which modify the self-attention computation with explicit states. However, these approaches often incur significant performance degradation. The ultimate goal is to develop a model that has the following properties: parallel training, streaming and low-cost inference, and state-of-the-art (SOTA) performance. In this paper, we propose a new direction to achieve this goal. We show how architectural modifications to a fully-sequential recurrent model can help push its performance toward Transformer models while retaining its sequential processing capability. Specifically, inspired by the recent success of Legendre Memory Units (LMU) in sequence learning tasks, we propose LMUFormer, which augments the LMU with convolutional patch embedding and convolutional channel mixer. Moreover, we present a spiking version of this architecture, which introduces the benefit of states within the patch embedding and channel mixer modules while simultaneously reducing the computing complexity. We evaluated our architectures on multiple sequence datasets. Of particular note is our performance on the Speech Commands V2 dataset (35 classes). In comparison to SOTA transformer-based models within the ANN domain, our LMUFormer demonstrates comparable performance while necessitating a remarkable 70× reduction in parameters and a substantial 140× decrement in FLOPs. Furthermore, when benchmarked against extant low-complexity SNN variants, our model establishes a new SOTA with an accuracy of 96.12%. Additionally, owing to our model’s proficiency in real-time data processing, we are able to achieve a 32.03% reduction in sequence length, all while incurring an inconsequential decline in performance.

Neuro-Inspired Information-Theoretic Hierarchical Perception for Multimodal Learning

Xiongye Xiao, Gengshuo Liu, Gaurav Gupta, Defu Cao, Shixuan Li, Yaxing Li, Tianqing Fang, Mingxi Cheng, Paul Bogdan

Integrating and processing information from various sources or modalities are critical for obtaining a comprehensive and accurate perception of the real world in autonomous systems and cyber-physical systems. Drawing inspiration from neuroscience, we develop the Information-Theoretic Hierarchical Perception (ITHP) model, which utilizes the concept of information bottleneck. Different from most traditional fusion models that incorporate all modalities identically in neural networks, our model designates a prime modality and regards the remaining modalities as detectors in the information pathway, serving to distill the flow of information. Our proposed perception model focuses on constructing an effective and compact information flow by achieving a balance between the minimization of mutual information between the latent state and the input modal state, and the maximization of mutual information between the latent states and the remaining modal states. This approach leads to compact latent state representations that retain relevant information while minimizing redundancy, thereby substantially enhancing the performance of multimodal representation learning. Experimental evaluations on the MUStARD, CMU-MOSI, and CMU-MOSEI datasets demonstrate that our model consistently distills crucial information in multimodal learning scenarios, outperforming state-of-the-art benchmarks. Remarkably, on the CMU-MOSI dataset, ITHP surpasses human-level performance in the multimodal sentiment binary classification task across all evaluation metrics (i.e., Binary Accuracy, F1 Score, Mean Absolute Error, and Pearson Correlation).

Learning theory

Towards Establishing Guaranteed Error for Learned Database Operations

Sepanta Zeighami, Cyrus Shahabi

Machine learning models have demonstrated substantial performance enhancements over non-learned alternatives in various fundamental data management operations, including indexing (locating items in an array), cardinality estimation (estimating the number of matching records in a database), and range-sum estimation (estimating aggregate attribute values for query-matched records). However, real-world systems frequently favor less efficient non-learned methods due to their ability to offer (worst-case) error guarantees — an aspect where learned approaches often fall short. The primary objective of these guarantees is to ensure system reliability, ensuring that the chosen approach consistently delivers the desired level of accuracy across all databases. In this paper, we embark on the first theoretical study of such guarantees for learned methods, presenting the necessary conditions for such guarantees to hold when using machine learning to perform indexing, cardinality estimation and range-sum estimation. Specifically, we present the first known lower bounds on the model size required to achieve the desired accuracy for these three key database operations. Our results bound the required model size for given average and worst-case errors in performing database operations, serving as the first theoretical guidelines governing how model size must change based on data size to be able to guarantee an accuracy level. More broadly, our established guarantees pave the way for the broader adoption and integration of learned models into real-world systems.
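
As a concrete illustration of the first operation mentioned above, the Python snippet below sketches the general idea of a learned index (a simple model predicts a key's position in a sorted array, then an error-bounded local search corrects the guess); it illustrates the concept only and is not the paper's construction or its bounds.

import numpy as np

keys = np.sort(np.random.default_rng(0).uniform(0, 1_000, size=10_000))  # synthetic sorted keys
positions = np.arange(len(keys))
a, b = np.polyfit(keys, positions, deg=1)            # model: position ~ a * key + b
max_err = int(np.ceil(np.max(np.abs(a * keys + b - positions))))  # worst-case training error

def lookup(q):
    guess = int(np.clip(round(a * q + b), 0, len(keys) - 1))
    lo, hi = max(0, guess - max_err), min(len(keys), guess + max_err + 1)
    return lo + np.searchsorted(keys[lo:hi], q)      # exact search inside the bounded window

print(lookup(keys[1234]), 1234)                      # both print 1234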

Probabilistic methods

Closing the Curious Case of Neural Text Degeneration

Matthew Finlayson, John Hewitt, Alexander Koller, Swabha Swayamdipta, Ashish Sabharwal

Despite their ubiquity in language generation, it remains unknown why truncation sampling heuristics like nucleus sampling are so effective. We provide a theoretical explanation for the effectiveness of truncation sampling by proving that truncation methods that discard tokens below some probability threshold (the most common type of truncation) can guarantee that all sampled tokens have nonzero true probability. However, thresholds are a coarse heuristic, and necessarily discard some tokens with nonzero true probability as well. In pursuit of a more precise sampling strategy, we show that we can leverage a known source of model errors, the softmax bottleneck, to prove that certain tokens have nonzero true probability, without relying on a threshold. Based on our findings, we develop an experimental truncation strategy and present pilot studies demonstrating the promise of this type of algorithm. Our evaluations show that our method outperforms its threshold-based counterparts under automatic and human evaluation metrics for low-entropy (i.e., close to greedy) open-ended text generation. Our theoretical findings and pilot experiments provide both insight into why truncation sampling works, and make progress toward more expressive sampling algorithms that better surface the generative capabilities of large language models.
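
For context, the threshold-style truncation the paper analyzes is the standard nucleus (top-p) recipe, sketched below in Python over a made-up five-token vocabulary; the authors' proposed softmax-bottleneck-based method is not shown here.

import numpy as np

def top_p_sample(probs, p=0.9, rng=np.random.default_rng(0)):
    order = np.argsort(probs)[::-1]                            # token ids, most probable first
    cutoff = np.searchsorted(np.cumsum(probs[order]), p) + 1   # smallest prefix with mass >= p
    kept = order[:cutoff]                                      # the "nucleus" of retained tokens
    return rng.choice(kept, p=probs[kept] / probs[kept].sum()) # renormalize and sample

vocab_probs = np.array([0.5, 0.25, 0.15, 0.07, 0.03])          # invented 5-token distribution
print(top_p_sample(vocab_probs, p=0.9))                        # samples only among the top tokens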

Explain like I’m five: Matthew Finlayson


“We explain why language models, especially small ones, make prediction mistakes, then find a new way to predict when they will make those mistakes and avoid them. Our research helps make small language models, like the ones that Apple is releasing that can run on your phone, more reliable. Anyone who needs to use language models without an internet connection may need to use a small model that can fit on their device. Our method will help make those small models less likely to make mistakes.”

Reinforcement learning

Proximal Policy Gradient Arborescence for Quality Diversity Reinforcement Learning (SPOTLIGHT)

Sumeet Batra, Bryon Tjanaka, Matthew Christopher Fontaine, Aleksei Petrenko, Stefanos Nikolaidis, Gaurav S. Sukhatme

Training generally capable agents that thoroughly explore their environment and learn new and diverse skills is a long-term goal of robot learning. Quality Diversity Reinforcement Learning (QD-RL) is an emerging research area that blends the best aspects of both fields – Quality Diversity (QD) provides a principled form of exploration and produces collections of behaviorally diverse agents, while Reinforcement Learning (RL) provides a powerful performance improvement operator enabling generalization across tasks and dynamic environments. Existing QD-RL approaches have been constrained to sample-efficient, deterministic off-policy RL algorithms and/or evolution strategies and struggle with highly stochastic environments. In this work, we, for the first time, adapt on-policy RL, specifically Proximal Policy Optimization (PPO), to the Differentiable Quality Diversity (DQD) framework and propose several changes that enable efficient optimization and discovery of novel skills on high-dimensional, stochastic robotics tasks. Our new algorithm, Proximal Policy Gradient Arborescence (PPGA), achieves state-of-the-art results, including a 4x improvement in best reward over baselines on the challenging humanoid domain.

Explain like I’m five: Sumeet Batra

“In this paper, we develop an algorithm that allows robots in simulation to discover new ways of moving, including walking, running, galloping, hopping on one leg, and limping. Our paper achieves promising results in making AI for humanoid robots more reliable and adaptable to the complexity of the world, bringing us one step closer to having helpful humanoid robot assistants in our households and workplaces. We are in the works of bringing this to a real robot for a future paper!”

Representation learning for computer vision, audio, language, and other modalities

TEMPO: Prompt-based Generative Pre-trained Transformer for Time Series Forecasting

Defu Cao, Furong Jia, Sercan O Arik, Tomas Pfister, Yixiang Zheng, Wen Ye, Yan Liu

The past decade has witnessed significant advances in time series modeling with deep learning. While achieving state-of-the-art results, the best-performing architectures vary highly across applications and domains. Meanwhile, for natural language processing, the Generative Pre-trained Transformer (GPT) has demonstrated impressive performance via training one general-purpose model across various textual datasets. It is intriguing to explore whether GPT-type architectures can be effective for time series, capturing the intrinsic dynamic attributes and leading to significant accuracy improvements. In this paper, we propose a novel framework, TEMPO, that can effectively learn time series representations. We focus on utilizing two essential inductive biases of the time series task for pre-trained models: (i) decomposition of the complex interaction between trend, seasonal and residual components; and (ii) introducing the design of prompts to facilitate distribution adaptation in different types of time series. TEMPO expands the capability for dynamically modeling real-world temporal phenomena from data within diverse domains. Our experiments demonstrate the superior performance of TEMPO over state-of-the-art methods in the zero-shot setting for a number of time series benchmark datasets. This performance gain is observed not only in scenarios involving previously unseen datasets but also in scenarios with multi-modal inputs. This compelling finding highlights TEMPO’s potential to constitute a foundational model-building framework.
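
As a concrete reference point for the first inductive bias, the Python snippet below sketches a plain trend/seasonal/residual decomposition on synthetic data; TEMPO itself learns such structure inside a pre-trained Transformer, so this is only an illustration of the idea, not the paper's method.

import numpy as np

rng = np.random.default_rng(0)
period, t = 12, np.arange(240)
series = 0.05 * t + np.sin(2 * np.pi * t / period) + 0.2 * rng.standard_normal(t.size)

trend = np.convolve(series, np.ones(period) / period, mode="same")          # moving-average trend
detrended = series - trend
seasonal = np.array([detrended[i::period].mean() for i in range(period)])   # mean seasonal profile
seasonal_full = np.tile(seasonal, t.size // period + 1)[: t.size]
residual = series - trend - seasonal_full                                   # what neither component explains
print(round(residual.std(), 3))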

Tailoring Self-Rationalizers with Multi-Reward Distillation

Sahana Ramnath, Brihi Joshi, Skyler Hallinan, Ximing Lu, Liunian Li, Aaron Chan, Jack Hessel, Yejin Choi, Xiang Ren

Large language models (LMs) are capable of generating free-text rationales to aid question answering. However, prior work 1) suggests that useful self-rationalization is emergent only at significant scales (e.g., 175B parameter GPT-3); and 2) focuses largely on downstream performance, ignoring the semantics of the rationales themselves, e.g., are they faithful, true, and helpful for humans? In this work, we enable small-scale LMs (∼200x smaller than GPT-3) to generate rationales that not only improve downstream task performance, but are also more plausible, consistent, and diverse, assessed both by automatic and human evaluation. Our method, MaRio (Multi-rewArd RatIOnalization), is a multi-reward conditioned self-rationalization algorithm that optimizes multiple distinct properties like plausibility, diversity and consistency. Results on three difficult question-answering datasets StrategyQA, QuaRel and OpenBookQA show that not only does MaRio improve task accuracy, but it also improves the self-rationalization quality of small LMs across the aforementioned axes better than a supervised fine-tuning (SFT) baseline. Extensive human evaluations confirm that MaRio rationales are preferred vs. SFT rationales, as well as qualitative improvements in plausibility and consistency.

PlaSma: Procedural Knowledge Models for Language-based Planning and Re-Planning

Faeze Brahman, Chandra Bhagavatula, Valentina Pyatkin, Jena Hwang, Xiang Lorraine Li, Hirona Arai, Soumya Sanyal, Keisuke Sakaguchi, Xiang Ren, Yejin Choi

Procedural planning, which entails decomposing a high-level goal into a sequence of temporally ordered steps, is an important yet intricate task for machines. It involves integrating common-sense knowledge to reason about complex and often contextualized situations, e.g. “scheduling a doctor’s appointment without a phone.” While current approaches show encouraging results using large language models (LLMs), they are hindered by drawbacks such as costly API calls and reproducibility issues. In this paper, we advocate planning using smaller language models. We present PlaSma, a novel two-pronged approach to endow small language models with procedural knowledge and (constrained) language-based planning capabilities. More concretely, we develop symbolic procedural knowledge distillation to enhance the commonsense knowledge in small language models and an inference-time algorithm to facilitate more structured and accurate reasoning. In addition, we introduce a new related task, Replanning, that requires a revision of a plan to cope with a constrained situation. In both the planning and replanning settings, we show that orders-of-magnitude smaller models (770M-11B parameters) can compete and often surpass their larger teacher models’ capabilities. Finally, we showcase successful application of PlaSma in an embodied environment, VirtualHome.

EmerNeRF: Emergent Spatial-Temporal Scene Decomposition via Self-Supervision

Jiawei Yang, Boris Ivanovic, Or Litany, Xinshuo Weng, Seung Wook Kim, Boyi Li, Tong Che, Danfei Xu, Sanja Fidler, Marco Pavone, Yue Wang

We present EmerNeRF, a simple yet powerful approach for learning spatial-temporal representations of dynamic driving scenes. Grounded in neural fields, EmerNeRF simultaneously captures scene geometry, appearance, motion, and semantics via self-bootstrapping. EmerNeRF hinges upon two core components: First, it stratifies scenes into static and dynamic fields. This decomposition emerges purely from self-supervision, enabling our model to learn from general, in-the-wild data sources. Second, EmerNeRF parameterizes an induced flow field from the dynamic field and uses this flow field to further aggregate multi-frame features, amplifying the rendering precision of dynamic objects. Coupling these three fields (static, dynamic, and flow) enables EmerNeRF to represent highly-dynamic scenes self-sufficiently, without relying on ground truth object annotations or pre-trained models for dynamic object segmentation or optical flow estimation. Our method achieves state-of-the-art performance in sensor simulation, significantly outperforming previous methods when reconstructing static (+2.93 PSNR) and dynamic (+3.70 PSNR) scenes. In addition, to bolster EmerNeRF’s semantic generalization, we lift 2D visual foundation model features into 4D space-time and address a general positional bias in modern Transformers, significantly boosting 3D perception performance (e.g., 37.50% relative improvement in occupancy prediction accuracy on average). Finally, we construct a diverse and challenging 120-sequence dataset to benchmark neural fields under extreme and highly-dynamic settings.

Transfer learning, meta learning and lifelong learning

Federated Orthogonal Training: Mitigating Global Catastrophic Forgetting in Continual Federated Learning

Yavuz Faruk Bakman, Duygu Nur Yaldiz, Yahya Ezzeldin, Salman Avestimehr

Federated learning (FL) has gained significant attraction due to its ability to enable privacy-preserving training over decentralized data. Current literature in FL mostly focuses on single-task learning. However, over time, new tasks may appear in the clients and the global model should learn these tasks without forgetting previous tasks. This real-world scenario is known as Continual Federated Learning (CFL). The main challenge of CFL is Global Catastrophic Forgetting, which corresponds to the fact that when the global model is trained on new tasks, its performance on old tasks decreases. There have been a few recent works on CFL to propose methods that aim to address the global catastrophic forgetting problem. However, these works either have unrealistic assumptions on the availability of past data samples or violate the privacy principles of FL. We propose a novel method, Federated Orthogonal Training (FOT), to overcome these drawbacks and address the global catastrophic forgetting in CFL. Our algorithm extracts the global input subspace of each layer for old tasks and modifies the aggregated updates of new tasks such that they are orthogonal to the global principal subspace of old tasks for each layer. This decreases the interference between tasks, which is the main cause for forgetting. Our method is almost computation-free on the client side and has negligible communication cost. We empirically show that FOT outperforms state-of-the-art continual learning methods in the CFL setting, achieving an average accuracy gain of up to 15% with 27% lower forgetting while only incurring a minimal computation and communication cost. Code can be found here .
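
The core geometric idea, projecting a new-task update so it is orthogonal to a subspace used by old tasks, can be sketched in a few lines of Python; the basis U and the update below are random stand-ins, and the full FOT procedure (per-layer subspace extraction on the server, and so on) is not reproduced.

import numpy as np

def project_out(update, U):
    # Remove the component of `update` lying in span(U), so the update
    # no longer interferes with directions used by old tasks.
    return update - U @ (U.T @ update)

rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((6, 2)))     # orthonormal basis of an "old task" subspace
g = rng.standard_normal(6)                           # aggregated update for a new task
print(np.round(U.T @ project_out(g, U), 10))         # ~[0. 0.]: orthogonal to the old subspace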

Published on May 7th, 2024


International organizations and research methods: an introduction


Timon Forster, International organizations and research methods: an introduction, International Affairs, Volume 100, Issue 3, May 2024, Pages 1303–1304, https://doi.org/10.1093/ia/iiae084


Research methods are predominantly viewed as techniques that enable academics to collect new data, test hypotheses, advance scholarly debates and produce knowledge. International organizations and research methods challenges readers to think about methods as performative tools as well. Accordingly, the application of any method generates a distinct representation of the social world in which international institutions operate. From this vantage point, the choice of method itself is a function of the researchers’ academic training and background. The editors, Fanny Badache, Leah R. Kimber and Lucile Maertens, therefore call for a ‘deliberate and reflexive stance in the research process’ (p. 4). By treating methods as technical and performative tools, this edited volume showcases an extensive range of approaches that illuminate both the well-trodden paths and the uncharted territory of international organizations.

Badache, Kimber and Maertens, alongside their 59 contributors, have written a gentle methodological introduction to the study of international organizations. The book consists of five parts dedicated to different steps of the research process: observing, interviewing, documenting, measuring and combining. Each chapter describes a particular method (or set of methods) and its relevance, offering a brief how-to guide and discussing common challenges. Moreover, the book features 26 boxes that focus on more specific methodological tools or tricks, and it includes five interludes to take stock of the pertinent debates in the field. Thus, the book invites readers to approach international organizations in new ways. For instance, one could perceive the headquarters of the International Monetary Fund and the World Bank on 19th Street in Washington DC as artefacts (Box o); an exhibition celebrating the United Nations’ 80th anniversary suddenly lends itself to branding analysis (Box n) or to composing collages (chapter 29); and when a former high-level official publishes a memoir, an opportunity to conduct prosopography emerges (chapter 26).

Automated Machine Learning (AutoML)

In the past decade, machine learning has experienced explosive growth, both in the range of applications to which it is applied and in the amount of new research produced on it. Some of the largest driving forces behind this growth are the maturity of the ML algorithms and methods themselves, the generation and proliferation of massive volumes of data for the algorithms to learn from, the abundance of cheap compute to run the algorithms, and the increasing awareness among businesses that ML can address complex data structures and problems.

Many organizations want to use ML to take advantage of their data and derive actionable new insights from it, but it has become clear that there is an imbalance between the number of potential ML applications and the number of trained, expert ML practitioners available to address them. As a result, there is increasing demand to democratize ML by creating tools that make it widely accessible throughout an organization and usable off the shelf by domain experts who are not ML specialists.

Recently, Automated Machine Learning (AutoML) has emerged as a way to address the massive demand for ML within organizations across all experience and skill levels. AutoML aims to create a single system that automates (that is, removes human input from) as much of the ML workflow as possible, including data preparation, feature engineering, model selection, hyperparameter tuning, and model evaluation. In doing so, it can benefit non-experts by lowering their barrier to entry into ML, and it can benefit trained ML practitioners by eliminating some of the most tedious and time-consuming steps in the ML workflow.
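
To make that concrete, the sketch below shows what automated model selection boils down to in miniature: enumerate candidate pipelines, score each by cross-validation, and keep the best. It is written with scikit-learn on a built-in toy dataset purely for illustration; it is not the ArcGIS implementation, and real AutoML systems search far larger spaces and also automate data preparation and feature engineering.

# Minimal sketch of the core AutoML loop: try several candidate pipelines,
# score each by cross-validation, and keep the best. Illustrative only.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)   # stand-in for a prepared tabular dataset

candidates = {
    "logistic_regression": Pipeline([("scale", StandardScaler()),
                                     ("clf", LogisticRegression(max_iter=2000))]),
    "random_forest": Pipeline([("clf", RandomForestClassifier(n_estimators=300, random_state=0))]),
    "gradient_boosting": Pipeline([("clf", GradientBoostingClassifier(random_state=0))]),
}

scores = {name: cross_val_score(model, X, y, cv=5).mean() for name, model in candidates.items()}
best = max(scores, key=scores.get)
print({name: round(score, 3) for name, score in scores.items()})
print("best candidate:", best)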

  • AutoML for the non-ML expert (GIS analysts, business analysts, and data analysts who are domain experts)

For the non-ML expert, the key advantage of AutoML is that it eliminates the steps in the ML workflow that require the most technical expertise. Analysts who are domain experts can define their business problem and collect the appropriate data, then essentially let the computer do the rest. They don't need a deep understanding of data science techniques for data cleaning and feature engineering, they don't have to know what all the different ML algorithms do, and they don't need to spend time aimlessly exploring different algorithms and hyperparameter configurations. Instead, these analysts can focus on applying their domain expertise to the specific business problem or domain application at hand, rather than on the ML workflow itself. They also become less dependent on trained data scientists and ML engineers within their organization, because they can build and use advanced models on their own, often without writing any code.

  • AutoML for the ML expert (Data scientist/ML engineer)

AutoML can also be hugely beneficial to ML experts, though the reasons may be less obvious. For one, ML experts do not have to spend as much time supporting the domain experts in their organization and can therefore focus on their own, more advanced ML work. For the ML experts' own projects, AutoML can be a tremendous time saver and productivity booster. Many of the time-consuming, tedious steps in the ML workflow, such as data cleaning, feature engineering, model selection, and hyperparameter tuning, can be automated. The time saved by automating these repetitive, exploratory steps can be shifted to more advanced technical tasks or to tasks that require more human input (for example, collaborating with domain experts, understanding the business problem, or interpreting the ML results).

In addition to saving time, AutoML can boost the productivity of ML practitioners because it eliminates some of the subjective choice and experimentation involved in the ML workflow. For example, an ML practitioner approaching a new project may in theory have the training and expertise to decide which new features to construct, which ML algorithm might be best for a particular problem, and which hyperparameter settings are likely to work well. In practice, however, they may overlook the construction of certain features or fail to try every possible combination of hyperparameters while actually performing the workflow. They may also bias the feature selection process or the choice of algorithm because they prefer a particular ML algorithm based on their previous work or its success in other applications they've seen. In reality, no single ML algorithm performs best on all datasets, certain algorithms are more sensitive than others to the choice of hyperparameters, and business problems vary in their complexity and in how interpretable the resulting models must be. AutoML can help reduce some of this human bias by applying many different ML algorithms to the same dataset and determining which one performs best. AutoML also uses advanced techniques such as model ensembling, which can push the accuracy of the final model even further.
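
As a rough, purely illustrative sketch of that ensembling idea (generic scikit-learn code on a toy dataset, not any particular AutoML product), several different algorithms can be combined so that their pooled predictions depend less on any one practitioner's favorite model:

# Illustrative sketch of model ensembling: combine several different algorithms
# and average their predicted probabilities ("soft" voting).
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

ensemble = VotingClassifier(
    estimators=[
        ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=2000))),
        ("rf", RandomForestClassifier(n_estimators=300, random_state=0)),
        ("svc", make_pipeline(StandardScaler(), SVC(probability=True, random_state=0))),
    ],
    voting="soft",
)
print("ensemble CV accuracy:", round(cross_val_score(ensemble, X, y, cv=5).mean(), 3))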

For the ML practitioner, AutoML can also serve as a starting point or benchmark in an ML project. They can use it to automatically develop a baseline model for a dataset, which gives them a set of preliminary insights into a particular problem. From there, they may decide to add or remove certain features from the input dataset, or home in on a particular ML algorithm and fine-tune its hyperparameters. In this sense, AutoML can be viewed as a means of narrowing down the initial choices for a trained ML practitioner, so they can focus on improving the overall performance of the model. This is a very common workflow in practice: ML experts develop a data-driven benchmark using AutoML, then build on that benchmark by incorporating their expertise to tweak and refine the results. The ML tools in ArcGIS Pro, along with the MLModel class in the ArcGIS API for Python, let them build upon this strong baseline and arrive at the most suitable model.
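
One plausible rendering of that baseline-then-refine workflow, again sketched with generic scikit-learn rather than the ArcGIS MLModel API, is to take the winning model family from an automated search and then tune its hyperparameters deliberately:

# Sketch of the "AutoML baseline, then expert refinement" workflow.
# The baseline stands in for whatever model an automated search selected.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, cross_val_score

X, y = load_breast_cancer(return_X_y=True)

baseline = GradientBoostingClassifier(random_state=0)
print("baseline CV accuracy:", round(cross_val_score(baseline, X, y, cv=5).mean(), 3))

search = GridSearchCV(
    baseline,
    param_grid={"n_estimators": [100, 300],
                "learning_rate": [0.05, 0.1],
                "max_depth": [2, 3]},
    cv=5,
)
search.fit(X, y)
print("refined CV accuracy:", round(search.best_score_, 3), search.best_params_)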

In the end, democratizing ML via AutoML within an organization allows domain experts to focus their attention on the business problem and obtain actionable results; it allows more analysts to build better models; and it can reduce the number of ML experts the organization needs to hire. It can also boost the productivity of trained ML practitioners and data scientists, allowing them to focus their expertise on the many other tasks where it is needed most.

To help with these tasks, the GeoAI toolbox in ArcGIS provides tools that use AutoML to train, fine-tune, and ensemble several popular machine learning models for classification and regression, given your data and available compute resources.

Training a good model takes work, but the benefits can be enormous. Good AI models can be powerful tools for automating the analysis of huge volumes of geospatial data. However, care should be taken to ensure that they are applied to relevant tasks, with an appropriate level of human oversight and with transparency about the type of model and the datasets used to train it.

Science has an AI problem. This group says they can fix it.

By Scott Lyon

May 1, 2024

Researchers recommend 32 best practices to stamp out a smoldering crisis that threatens to engulf all of science: thousands of AI-driven claims across dozens of fields that cannot be reproduced. Illustration courtesy Adobe Stock

AI holds the potential to help doctors find early markers of disease and to help policymakers avoid decisions that lead to war. But a growing body of evidence has revealed deep flaws in how machine learning is used in science, a problem that has swept through dozens of fields and implicated thousands of erroneous papers.

Now an interdisciplinary team of 19 researchers, led by Princeton University computer scientists Arvind Narayanan and Sayash Kapoor, has published guidelines for the responsible use of machine learning in science.

“When we graduate from traditional statistical methods to machine learning methods, there are a vastly greater number of ways to shoot oneself in the foot,” said Narayanan, director of Princeton’s Center for Information Technology Policy and a professor of computer science. “If we don’t have an intervention to improve our scientific standards and reporting standards when it comes to machine learning-based science, we risk not just one discipline but many different scientific disciplines rediscovering these crises one after another.”

The authors say their work is an effort to stamp out this smoldering crisis of credibility that threatens to engulf nearly every corner of the research enterprise. A paper detailing their guidelines appeared May 1 in the journal Science Advances.

Because machine learning has been adopted across virtually every scientific discipline, with no universal standards safeguarding the integrity of those methods, Narayanan said the current crisis, which he calls the reproducibility crisis, could become far more serious than the replication crisis that emerged in social psychology more than a decade ago.

The good news is that a simple set of best practices can help resolve this newer crisis before it gets out of hand, according to the authors, who come from computer science, mathematics, social science and health research.

“This is a systematic problem with systematic solutions,” said Kapoor, a graduate student who works with Narayanan and who organized the effort to produce the new consensus-based checklist.

The checklist focuses on ensuring the integrity of research that uses machine learning. Science depends on the ability to independently reproduce results and validate claims. Otherwise, new work cannot be reliably built atop old work, and the entire enterprise collapses. While other researchers have developed checklists that apply to discipline-specific problems, notably in medicine, the new guidelines start with the underlying methods and apply them to any quantitative discipline.

One of the main takeaways is transparency. The checklist calls on researchers to provide detailed descriptions of each machine learning model, including the code, the data used to train and test the model, the hardware specifications used to produce the results, the experimental design, the project’s goals and any limitations of the study’s findings. The standards are flexible enough to accommodate a wide range of nuance, including private datasets and complex hardware configurations, according to the authors.
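
As a loose illustration of what such a disclosure could look like in practice (a hypothetical structure, not the template published with the paper), the items above could be captured in a small machine-readable record that travels with the study:

# Hypothetical reporting record, loosely inspired by the kinds of items the
# checklist asks for (code, data, hardware, design, goals, limitations).
# Field names and values are illustrative placeholders only.
from dataclasses import dataclass, field, asdict
import json

@dataclass
class MLReport:
    model_description: str
    code_url: str
    training_data: str
    test_data: str
    hardware: str
    experimental_design: str
    goals: str
    limitations: list = field(default_factory=list)

report = MLReport(
    model_description="gradient-boosted trees, 300 estimators, max depth 3",
    code_url="https://example.org/placeholder-repo",
    training_data="cohort A, n=12,000, collected 2015-2020",
    test_data="held-out cohort B, n=3,000, collected 2021",
    hardware="single 16-core CPU node, 64 GB RAM",
    experimental_design="temporal train/test split; no tuning on the test set",
    goals="screening aid, not a standalone diagnostic",
    limitations=["single-site data", "label noise in the outcome variable"],
)
print(json.dumps(asdict(report), indent=2))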

While the increased rigor of these new standards might slow the publication of any given study, the authors believe wide adoption of these standards would increase the overall rate of discovery and innovation, potentially by a lot.

“What we ultimately care about is the pace of scientific progress,” said sociologist Emily Cantrell, one of the lead authors, who is pursuing her Ph.D. at Princeton. “By making sure the papers that get published are of high quality and that they’re a solid base for future papers to build on, that potentially then speeds up the pace of scientific progress. Focusing on scientific progress itself and not just getting papers out the door is really where our emphasis should be.”

Kapoor concurred. The errors hurt. “At the collective level, it’s just a major time sink,” he said. That time costs money. And that money, once wasted, could have catastrophic downstream effects, limiting the kinds of science that attract funding and investment, tanking ventures that are inadvertently built on faulty science, and discouraging countless young researchers.

In working toward a consensus about what should be included in the guidelines, the authors said they aimed to strike a balance: simple enough to be widely adopted, comprehensive enough to catch as many common mistakes as possible.

They say researchers could adopt the standards to improve their own work; peer reviewers could use the checklist to assess papers; and journals could adopt the standards as a requirement for publication.

“The scientific literature, especially in applied machine learning research, is full of avoidable errors,” Narayanan said. “And we want to help people. We want to keep honest people honest.”

The paper, “Consensus-based recommendations for machine-learning-based science,” published on May 1 in Science Advances, included the following authors:

Sayash Kapoor, Princeton University; Emily Cantrell, Princeton University; Kenny Peng, Cornell University; Thanh Hien (Hien) Pham, Princeton University; Christopher A. Bail, Duke University; Odd Erik Gundersen, Norwegian University of Science and Technology; Jake M. Hofman, Microsoft Research; Jessica Hullman, Northwestern University; Michael A. Lones, Heriot-Watt University; Momin M. Malik, Center for Digital Health, Mayo Clinic; Priyanka Nanayakkara, Northwestern; Russell A. Poldrack, Stanford University; Inioluwa Deborah Raji, University of California-Berkeley; Michael Roberts, University of Cambridge; Matthew J. Salganik, Princeton University; Marta Serra-Garcia, University of California-San Diego; Brandon M. Stewart, Princeton University; Gilles Vandewiele, Ghent University; and Arvind Narayanan, Princeton University.

UChicago scientists use machine learning to turn cell snapshots dynamic

Pritzker School of Molecular Engineering study hopes to use machine learning to boost cancer, immunology research

Imagine predicting the exact finishing order of the Kentucky Derby from a still photograph taken 10 seconds into the race.

That challenge pales in comparison to what researchers face when trying to study how embryos develop, cells differentiate, cancers form, and the immune system reacts—all using the snapshots from microscopes or genome sequencing.

But in a paper published April 26 in Proceedings of the National Academy of Sciences , researchers from the UChicago Pritzker School of Molecular Engineering and the Chemistry Department unveiled a powerful new method of using the static snapshots from single-cell RNA-sequencing to study how cells and genes change over time.

To develop the method, which they call TopicVelo, the team took an interdisciplinary approach, incorporating concepts from classical machine learning as well as computational biology and chemistry.

“In terms of unsupervised machine learning, we use a very simple, well-established idea. And in terms of the transcriptional model we use, it's also a very simple, old idea. But when you put them together, they do something more powerful than you might expect,” said PME Assistant Professor of Molecular Engineering and Medicine Samantha Riesenfeld, who wrote the paper with Chemistry Department Prof. Suriyanarayanan Vaikuntanathan and their joint student, UChicago Chemistry PhD candidate Cheng Frank Gao.

The trouble with pseudotime

When trying to understand complex processes in the body, researchers often use single-cell RNA-sequencing, or scRNA-seq, to get measurements that are powerful and detailed, but by nature are static.

The trouble is, Riesenfeld explained, “Single-cell RNA-sequencing is destructive. When you measure the cell this way, you destroy the cell.”

This leaves researchers only a snapshot of the moment the cell was measured and destroyed. However, the information many researchers need is how the cells transition over time. They need to know how a cell becomes cancerous, or how a particular gene program behaves during an immune response.

To help figure out dynamic processes from a static snapshot, researchers have traditionally used what's called "pseudotime." A snapshot also captures other cells and genes of the same type that might be a little further along in the same process. If the scientists connect the dots correctly, they can gain powerful insights into how the process unfolds over time.

However, connecting those dots is difficult guesswork, based on the assumption that similar-looking cells are just at different points along the same path—and biology is often much more complicated, with false starts, stops, bursts, and multiple chemical forces tugging on each gene.

Instead of traditional pseudotime approaches, scientists have been interested in an alternative known as "RNA velocity," which looks at the dynamics of transcription, splicing, and degradation of the mRNA within those cells. It is promising, but still an early technology.

To improve the RNA velocity approach, TopicVelo embraces—and gleans insights from—a far more difficult stochastic model that reflects biology’s inescapable randomness.

“Cells, when you think about them, are intrinsically random,” said Gao, the first author on the paper. “You can have twins or genetically identical cells that will grow up to be very different. TopicVelo introduces the use of a stochastic model. We're able to better capture the underlying biophysics in the transcription processes that are important for mRNA transcription.”
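
To give a flavor of what a stochastic transcriptional model means here, the sketch below simulates the classic two-state (telegraph) model of bursty transcription with a Gillespie algorithm. The rates are arbitrary placeholders, and this is a generic illustration rather than the TopicVelo code itself.

# Gillespie simulation of a two-state ("telegraph") gene: the promoter switches
# on and off, transcribes mRNA only while on, and transcripts degrade over time.
import numpy as np

def simulate_telegraph(k_on=0.1, k_off=0.5, k_tx=20.0, gamma=1.0, t_max=50.0, seed=0):
    rng = np.random.default_rng(seed)
    t, gene_on, m = 0.0, False, 0
    times, counts = [0.0], [0]
    while t < t_max:
        rates = np.array([
            k_off if gene_on else k_on,   # promoter switching
            k_tx if gene_on else 0.0,     # transcription, only while the gene is on
            gamma * m,                    # degradation of existing mRNA
        ])
        total = rates.sum()
        if total == 0:
            break
        t += rng.exponential(1.0 / total)          # waiting time to the next event
        event = rng.choice(3, p=rates / total)     # which event fires
        if event == 0:
            gene_on = not gene_on
        elif event == 1:
            m += 1
        else:
            m -= 1
        times.append(t)
        counts.append(m)
    return np.array(times), np.array(counts)

times, counts = simulate_telegraph()
print("final mRNA count:", counts[-1])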

Machine learning shows the way

The team also realized that another assumption limits standard RNA velocity. “Most methods assume that all cells are basically expressing the same big gene program, but you can imagine that cells have to do different kinds of processes simultaneously, to varying degrees,” Riesenfeld said. Disentangling these processes is a challenge.

Probabilistic topic modeling—a machine learning tool traditionally used to identify themes from written documents—provided the UChicago team with a strategy.

TopicVelo groups scRNA-seq data not by the types of cell or gene, but by the processes those cells and genes are involved in. The processes are inferred from the data, rather than imposed by external knowledge.

“If you look at a science magazine, it will be organized along topics like ‘physics,’ ‘chemistry’ and ‘astrophysics,’ these kinds of things,” Gao said. “We applied this organizing principle to single-cell RNA-sequencing data. So now, we can organize our data by topics, like ‘ribosomal synthesis,’ ‘differentiation,’ ‘immune response,’ and ‘cell cycle’. And we can fit stochastic transcriptional models specific to each process.”

After TopicVelo disentangles this kludge of processes and organizes them by topic, it applies topic weights back onto the cells, to account for what percentage of each cell’s transcriptional profile is involved in which activity.
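
For readers who want to see the topic-modeling step in miniature, the sketch below runs a standard probabilistic topic model (latent Dirichlet allocation, via scikit-learn) on a random placeholder cell-by-gene count matrix and recovers per-cell topic weights. It illustrates only the general idea, not TopicVelo's actual implementation.

# Probabilistic topic modeling on a cell-by-gene count matrix: each "topic" is a
# weighted set of genes, and each cell gets a mixture of topic weights.
# The counts here are random placeholders (1,000 cells by 200 genes).
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

rng = np.random.default_rng(0)
counts = rng.poisson(lam=1.0, size=(1000, 200))

lda = LatentDirichletAllocation(n_components=5, random_state=0)
cell_topic_weights = lda.fit_transform(counts)   # rows sum to ~1: per-cell topic mixture
topic_gene_loadings = lda.components_            # which genes characterize each topic

print(cell_topic_weights.shape)    # (1000, 5)
print(topic_gene_loadings.shape)   # (5, 200)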

According to Riesenfeld, “This approach helps us look at the dynamics of different processes and understand their importance in different cells. And that's especially useful when there are branch points, or when a cell is pulled in different directions.”

The results of combining the stochastic model with the topic model are striking. For example, TopicVelo was able to reconstruct trajectories that previously required special experimental techniques to recover. These improvements greatly broaden potential applications.

Gao compared the paper’s findings to the paper itself—the product of many areas of study and expertise.

“At PME, if you have a chemistry project, chances are there’s a physics or engineering student working on it,” he said. “It’s never just chemistry.”

Citation: “Dissection and Integration of Bursty Transcriptional Dynamics for Complex Systems,” Gao et al., Proceedings of the National Academy of Sciences, April 26, 2024. DOI: 10.1073/pnas.2306901121

Funding: NIH.

— Adapted from an article published by the Pritzker School of Molecular Engineering.

Doctoral thesis, 2024

Deep Learning Strategies for Pandemic Preparedness and Post-Infection Management

Lee, Sang Won

The global transmission of Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2) has resulted in over 677 million infections and 6.88 million tragic deaths worldwide as of March 10, 2023. During the pandemic, the ability to effectively combat SARS-CoV-2 was hindered by the lack of rapid, reliable, and cost-effective testing platforms for readily screening patients, discerning incubation stages, and accounting for variants. The limited knowledge of the viral pathogenesis further hindered rapid diagnosis and long-term clinical management of this complex disease. While effective in the short term, measures such as social distancing and lockdowns resulted in devastating economic loss, in addition to material and psychological hardships. Therefore, successfully reopening society during a pandemic depends on frequent, reliable testing, which can result in the timely isolation of highly infectious cases before they spread the virus. Viral loads, and consequently an individual's infectiousness, change throughout the progression of the illness. These dynamics necessitate frequent testing to identify when an infected individual can safely interact with non-infected individuals. Thus, scalable, accurate, and rapid serial testing is a cornerstone of an effective pandemic response, a prerequisite for safely reopening society, and invaluable for early containment of epidemics. Given the significant challenges posed by the pandemic, the power of artificial intelligence (AI) can be harnessed to create new diagnostic methods and be used in conjunction with serial tests. With increasing utilization of at-home lateral flow immunoassay (LFIA) tests, the National Institutes of Health (NIH) and Centers for Disease Control and Prevention (CDC) have consistently raised concerns about a potential underreporting of actual SARS-CoV-2-positive cases. When AI is paired with serial tests, it could instantly notify, automatically quantify, aid in real-time contact tracing, and assist in isolating infected individuals. Moreover, computer vision-assisted methodology can help objectively diagnose conditions, especially where subjective LFIA tests are employed. Recent advances in the interdisciplinary fields of machine learning and biomedical engineering present a unique opportunity to design AI-based strategies for pandemic preparation and response. Deep learning algorithms are transforming the interpretation and analysis of image data when used in conjunction with biomedical imaging modalities such as MRI, X-ray, CT scans, and confocal microscopy. These advances have enabled researchers to carry out real-time viral infection diagnostics that were previously thought to be impossible.

The objective of this thesis is to use SARS-CoV-2 as a model virus and investigate the potential of applying multi-class instance segmentation deep learning and other machine learning strategies to build pandemic preparedness for rapid, in-depth, and longitudinal diagnostic platforms. The thesis encompasses three research tasks: 1) computer vision-assisted rapid serial testing, 2) infected cell phenotyping, and 3) diagnosing the long-term consequences of infection (i.e., long COVID). The objective of Task 1 is to leverage the power of AI, in conjunction with smartphones, to rapidly and simultaneously diagnose COVID-19 infections for millions of people across the globe.

AI not only makes rapid, simultaneous screening of millions of people possible but can also aid in the identification and contact tracing of individuals who may be carriers of the virus. The technology could be used, for example, in university settings to manage the entry of students into university buildings, ensuring that only students who test negative for the virus are allowed on campus premises, while students who test positive are placed in quarantine until they recover. The technology could also be used in settings where strict adherence to COVID-19 prevention protocols is compromised, for example, in an emergency room. It could also help address the CDC's concern about growing underreporting of positive COVID-19 cases as at-home LFIA tests become more widely used. AI can address issues that arise from relying solely on the visual interpretation of LFIA tests to make accurate diagnoses. One problem is that LFIA test results may be subjective or ambiguous, especially when the test line of the LFIA displays faint color, indicating a low analyte abundance. Therefore, reaching a decisive conclusion regarding the patient's diagnosis becomes challenging. Additionally, the inclusion of a secondary source for verifying the test results could potentially increase the test's cost, as it may require the purchase of complementary electronic gadgets. To address these issues, our innovation would be accurately calibrated with appropriate sensitivity markers, ensuring increased accuracy of the diagnostic test and rapid acquisition of test results from the simultaneous classification of millions of LFIA tests as either positive or negative. Furthermore, the designed network architecture can be utilized to read other LFIA-based tests, such as early pregnancy tests, HIV LFIA tests, and LFIA-based tests for other viruses. Such advances in machine learning and artificial intelligence can be leveraged on many different scales and at various levels to revolutionize the health sector.

The motivating purpose of Task 2 is to design a highly accurate instance segmentation network architecture, not only for the analysis of SARS-CoV-2 infected cells but one that yields the highest possible segmentation accuracy for all applications in the biomedical sciences. For example, the designed network architecture can be utilized to analyze macrophages, stem cells, and other types of cells.

Task 3 focuses on conducting studies that were previously considered computationally impossible. The invention will assist medical researchers and dentists in automatically calculating alveolar crest height (ACH) in teeth using over 500 dental X-rays. This will help determine whether patients diagnosed with COVID-19 by a positive PCR test exhibited more alveolar bone loss, and had greater bone loss in the two years preceding their COVID-positive test, when compared to a control group without a positive COVID-19 test. Contracting periodontal disease results in higher levels of transmembrane serine protease 2 (TMPRSS2) within the buccal cavity, which is instrumental in enabling the entry of SARS-CoV-2. Gum inflammation, a symptom of periodontal disease, can lead to alterations in the ACH of teeth within the oral mucosa. Through this innovation, we can calculate the ACHs of various teeth and, therefore, determine the correlation between ACH and the risk of contracting SARS-CoV-2 infection.
Without the invention, extensive manpower and time would be required to make such calculations and gather data for further research into the effects of SARS-CoV-2 infection, as well as other related biological phenomena within the human body. Furthermore, the novel network framework can be modified and used to calculate dental caries and other periodontal diseases of interest.
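
A minimal, purely illustrative sketch of the kind of computer-vision step Task 1 describes (classifying a photographed LFIA strip as positive or negative) might fine-tune a small convolutional network. The code below uses PyTorch with random placeholder tensors standing in for labeled photos; it is not the multi-class instance segmentation architecture developed in the thesis.

# Toy image-classification loop for LFIA photos. Random tensors stand in for a
# real labeled dataset (which would be loaded, e.g., with torchvision's ImageFolder).
import torch
import torch.nn as nn
from torchvision import models

images = torch.rand(16, 3, 224, 224)          # placeholder batch of strip photos
labels = torch.randint(0, 2, (16,))           # 0 = negative, 1 = positive

model = models.resnet18(weights=None)         # pretrained weights could be used instead
model.fc = nn.Linear(model.fc.in_features, 2) # two output classes

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for step in range(3):                         # a few toy optimization steps
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    print(f"step {step}: loss {loss.item():.3f}")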

  • Biomedical engineering
  • Deep learning (Machine learning)
  • Artificial intelligence
  • Imaging systems in biology
  • COVID-19 Pandemic (2020-)
  • COVID-19 (Disease)--Testing
  • Macrophages
  • Centers for Disease Control and Prevention (U.S.)
  • National Institutes of Health (U.S.)

IMAGES

  1. 6 Effective Learning Methods

  2. 15 Types of Research Methods (2024)

  3. Your Step-by-Step Guide to Writing a Good Research Methodology

  4. Types of Research Methodology

  5. How Learning Works: 10 Research-Based Insights

  6. 6 Powerful Learning Strategies You MUST Share with Students (2022)

VIDEO

  1. Methods and Strategies of Teaching New Curriculum

  2. Teaching Research Methods: Overcoming Your Obstacles Webinar

  3. How research-based microlearning works? (Best examples)

  4. Mastering Your Mind

  5. Reinforcement Learning with Large Datasets: a Path to Resourceful Autonomous Agents

  6. Three Keys to Effective Online Learning

COMMENTS

  1. New Research Shows Learning Is More Effective When Active

    The research also found that effective active learning methods use not only hands-on and minds-on approaches, but also hearts-on, providing increased emotional and social support. Interest in active learning grew as the COVID-19 pandemic challenged educators to find new ways to engage students.

  2. Effective Learning Practices

  3. The science of effective learning with spacing and retrieval practice

    Alexander Renkl, Educational Psychology Review (2023). Research on the psychology of learning has highlighted straightforward ways of enhancing learning. However, effective learning strategies are ...

  4. Learning Strategies That Work

    Dr. Mark A. McDaniel. How do we learn and absorb new information? Which learning strategies actually work and which are mere myths? Such questions are at the center of the work of Mark McDaniel, professor of psychology and the director of the Center for Integrative Research on Cognition, Learning, and Education at Washington University in St. Louis. McDaniel coauthored the book Make it Stick ...

  5. PLAT 20 (1) 2021: Enhancing Student Learning in Research and

    More research is needed to better understand when retrieval practice should be combined with other productive techniques to optimize students' learning. Taken together, the lab-based studies of moderators of the testing effect in this Special Issue (see Table 1 ) contribute to the growing body of research on moderators of the testing effect.

  6. Improving Students' Learning With Effective Learning Techniques:

    Many students are being left behind by an educational system that some people believe is in crisis. Improving educational outcomes will require efforts on many fronts, but a central premise of this monograph is that one part of a solution involves helping students to better regulate their learning through the use of effective learning techniques.

  7. Lessons in learning

    "On the other hand, a superstar lecturer can explain things in such a way as to make students feel like they are learning more than they actually are." Director of sciences education and physics lecturer Logan McCarty is the co-author of a new study that says students who take part in active learning actually learn more than they think they do.

  8. Frontiers

    This article outlines a meta-analysis of the 10 learning techniques identified in Dunlosky et al. (2013a), and is based on 242 studies, 1,619 effects, 169,179 unique participants, with an overall mean of 0.56. The most effective techniques are Distributed Practice and Practice Testing and the least effective (but still with relatively high effects) are Underlining and Summarization. A major ...

  9. How to Learn More Effectively: 10 Learning Techniques to Try

    Organizing the information you are studying to make it easier to remember. Using elaborative rehearsal when studying; when you learn something new, spend a few moments describing it to yourself in your own words. Using visual aids like photographs, graphs, and charts. Reading the information you are studying out loud.

  10. Improving Students' Learning With Effective Learning Techniques:

    One potential reason for the disconnect between research on the efficacy of learning techniques and their use in educational practice is that because so many techniques are available, it would be challenging for educators to sift through the relevant research to decide which ones show promise of efficacy and could feasibly be implemented by ...

  11. How a New Learning Theory Can Benefit Transformative Learning Research

    Transformative Learning research and practice has consistently stalled on three fundamental debates: (1) what transformative learning is, and how it's differentiated from other learning; (2) what the preconditions for transformative learning are; and (3) what transformative learning's predictable and relevant outcomes are. The following article attempts two main feats: (1) to provide a re ...

  12. Learning strategies: a synthesis and conceptual model

    Surface learning includes subject matter vocabulary, the content of the lesson and knowing much more. Strategies include record keeping, summarisation, underlining and highlighting, note taking ...

  13. 6 Effective Learning Techniques that are Backed by Research

    And we all know that the harder you make your practice sessions, the better you'll learn. 4. Self-Explanation. Till now, we've discussed some valuable learning techniques that work in almost all types of learning. Self-explanation, although not that universal a method, is still one that shows promising results.

  14. Full article: Strategies and best practices for effective eLearning

    Fit among eLearning methods, learning task and learner styles. Khazanchi et al. (2015) propose a contingency theory-based model of eLearning. Using this theoretical lens, the authors argue that given a virtual learning environment, there are ideal profiles of eLearning ("fit") that result from a combination of learner engagement ...

  15. How do we learn to learn? New research offers an education

    New York University. (2021, November 10). How do we learn to learn? New research offers an education. ScienceDaily. Retrieved May 3, 2024 from www.sciencedaily.com/releases/2021/11 ...

  16. Improving Students' Learning With Effective Learning Techniques

    Other learning techniques such as taking practice tests and spreading study sessions out over time — known as distributed practice — were found to be of high utility because they benefited students of many different ages and ability levels and enhanced performance in many different areas ...

  17. Deep Learning: A Comprehensive Overview on Techniques ...

    Deep learning (DL), a branch of machine learning (ML) and artificial intelligence (AI) is nowadays considered as a core technology of today's Fourth Industrial Revolution (4IR or Industry 4.0). Due to its learning capabilities from data, DL technology originated from artificial neural network (ANN), has become a hot topic in the context of computing, and is widely applied in various ...

  18. Active Learning

    Active learning includes any type of instructional activity that engages students in learning, beyond listening, reading, and memorizing. As examples, students might talk to a classmate about a challenging question, respond to an in-class prompt in writing, make a prediction about an experiment, or apply knowledge from a reading to a case study. Active learning commonly includes collaboration ...

  19. Full article: Is research-based learning effective? Evidence from a pre

    The effectiveness of research-based learning. Conducting one's own research project involves various cognitive, behavioural, and affective experiences (Lopatto, 2009, 29), which in turn lead to a wide range of benefits associated with RBL. RBL is associated with long-term societal benefits because it can foster scientific careers: Students participating in RBL reported a greater ...

  20. Recent advances and applications of deep learning methods in ...

    Deep learning (DL) is one of the fastest-growing topics in materials data science, with rapidly emerging applications spanning atomistic, image-based, spectral, and textual data modalities. DL ...

  21. Recent progress in leveraging deep learning methods for question

    Question answering, serving as one of important tasks in natural language processing, enables machines to understand questions in natural language and answer the questions concisely. From web search to expert systems, question answering systems are widely applied to various domains in assisting information seeking. Deep learning methods have boosted various tasks of question answering and have ...

  22. (PDF) Learning styles: A detailed literature review

    The literature review shows several studies on a variety of learning styles: interactive, social, innovative, experiential, game-based, self-regulated, integrated, and expeditionary learning ...

  23. Do good lessons promote students' attention and behavior?

    Cognitive activation, the third characteristic of good teaching, was hardly relevant for self-regulation. Therefore, the personal relationship between teacher and student is particularly important ...

  24. Coding for Animals Key to Engaging Children in STEM

    An education pilot study bridges animal behavior research and computer coding to engage elementary school students in real-world, interdisciplinary science ... Innovative Teaching Methods: Merging Interests for Effective Learning ... "This program is a fantastic way to utilize kids' love of animals as a bridge to learning new computational ...

  25. USC at ICLR 2024

    Quality Diversity Reinforcement Learning (QD-RL) is an emerging research area that blends the best aspects of both fields - Quality Diversity (QD) provides a principled form of exploration and produces collections of behaviorally diverse agents, while Reinforcement Learning (RL) provides a powerful performance improvement operator enabling ...
