Relationships Between Expertise And Distinctiveness: Abnormal Medical Images Lead To Enhanced Memory Performance Only in Experts Part 3
Apr 16, 2024
Recall from the similarity analysis in Experiment 1 that the normal images in our data set are less similar to each other than the abnormal images. Thus, memory for normal images should be better than memory for abnormal images (as it was in novices).
However, for radiologist observers it is memory for abnormal images that is better. This suggests that the effect of expertise more than compensates for differences between the stimulus categories in image similarity. To see what the effect of abnormality is, independent of baseline image similarity differences, we can compare radiologists' memory performance to novices' performance with the same images.
To do this, we compare the benefit (in terms of AUC of the ROC) for radiologists relative to controls in each condition. Doing so reveals a significant abnormality benefit in expert radiologists at both 3-back, t(31) = 6.67, p < .001, and 30-back, t(31) = 4.33, p < .001: once their performance is baselined against the performance of novice participants, radiologists were specifically better at remembering abnormal images (see Fig. 8).
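One way to make this baselining concrete is the minimal sketch below. It assumes (the text does not spell this out) that AUC is computed per observer from confidence ratings, that the novice baseline is the mean novice AUC in each condition, and that the abnormality benefit is tested with a one-sample t test on each radiologist's baselined abnormal-minus-normal difference. All function names and numbers are hypothetical illustrations, not the authors' analysis code or data.

```python
from math import sqrt
from statistics import mean, stdev


def auc_from_ratings(old_ratings, new_ratings):
    """Area under the ROC curve from confidence ratings.

    Equals the probability that a randomly chosen old (previously seen)
    item receives a higher rating than a randomly chosen new item,
    with ties counted as one half.
    """
    wins = 0.0
    for old in old_ratings:
        for new in new_ratings:
            if old > new:
                wins += 1.0
            elif old == new:
                wins += 0.5
    return wins / (len(old_ratings) * len(new_ratings))


def baselined_abnormality_benefit(expert_abnormal, expert_normal,
                                  novice_abnormal_mean, novice_normal_mean):
    """Per-expert abnormal benefit minus normal benefit, each relative to novices."""
    return [
        (abn - novice_abnormal_mean) - (nor - novice_normal_mean)
        for abn, nor in zip(expert_abnormal, expert_normal)
    ]


def one_sample_t(differences):
    """t statistic and degrees of freedom for difference scores tested against zero."""
    n = len(differences)
    t = mean(differences) / (stdev(differences) / sqrt(n))
    return t, n - 1


# Hypothetical per-radiologist AUCs (made-up values, not the study's data).
expert_abnormal = [0.80, 0.76, 0.83, 0.78]
expert_normal = [0.70, 0.72, 0.71, 0.69]
benefit = baselined_abnormality_benefit(expert_abnormal, expert_normal, 0.55, 0.58)
t_value, df = one_sample_t(benefit)
print(f"t({df}) = {t_value:.2f}")
```

A one-sample test on n observers has n - 1 degrees of freedom, so under this reading the reported t(31) would correspond to 32 observers.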
Extracting additional information with a second presentation
Due to the structure of this experiment, designed to probe memory, each item in the memory set has two classification ratings (for normal/abnormal). Thus, while we set out to probe memory, the experiment also makes it possible for us to combine both ratings to examine whether there is a "crowd-within" effect in this situation (Vul & Pashler, 2008). The authors proposed the crowd-within as a variant of the "wisdom of the crowd."
They found that averaging a single individual's responses to repetitions of the same question led to better performance than single responses alone. This is what one would expect if a single judgment did not incorporate all of the information people could have about a question. If this is true for assessments of mammograms by expert radiologists, we would expect that averaging a radiologist's ratings of abnormality from two exposures to the same mammogram should result in better accuracy than looking at either rating alone.
Note that in this situation, however, unlike Vul and Pashler (2008), participants have additional information the second time: they get to see the image again before the second judgment; they are not just asked again. Thus, the crowd-within effect here could arise from actual new information being incorporated (e.g., the observer might scrutinize different parts of the image), rather than internal sampling.
We find a modest but significant advantage to incorporating both judgments: Averaging radiologists' responses from the first and second time that they saw an image resulted in slightly higher performance in the 30-back condition (AUC = 0.745) compared with single item performance (AUC = 0.716), t(31) = 3.46, p = .002 (see Fig. 9, left). The effect was not significant in the 3-back condition (joint AUC = .712, single AUC = .705), t(31) = 1.15, p = .259. Unsurprisingly, this effect was not present in novices, since their performance was very poor on both responses (see Fig. 9, right; all ps > .10).
Thus, expert performance can be improved (albeit, rather modestly) by averaging more than one response. It remains to be seen whether this benefit would occur if radiologists were offered unlimited time to process each image, rather than the 3 seconds in the current study. The limited viewing time here may have particularly enhanced radiologists' ability to extract new information in the second viewing of the mammogram.
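The averaging step described above can be sketched as follows. This is a minimal illustration under stated assumptions: ratings are on a hypothetical 1-5 abnormality scale, AUC is computed with the rank-based (pairwise comparison) formula, and the toy ratings are invented so that the errors in the two judgments do not coincide. The function names are illustrative and are not taken from the study.

```python
def auc(positive_scores, negative_scores):
    """Probability that a positive case outscores a negative case (ties count 0.5)."""
    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


def crowd_within_aucs(first, second, is_abnormal):
    """Classification AUC for the first rating, the second rating, and their average."""
    averaged = [(a + b) / 2 for a, b in zip(first, second)]

    def classification_auc(scores):
        abnormal = [s for s, label in zip(scores, is_abnormal) if label]
        normal = [s for s, label in zip(scores, is_abnormal) if not label]
        return auc(abnormal, normal)

    return (classification_auc(first),
            classification_auc(second),
            classification_auc(averaged))


# Hypothetical ratings for six images (three abnormal, three normal).
is_abnormal = [True, True, True, False, False, False]
first = [4, 2, 5, 3, 1, 2]   # rating at first presentation
second = [2, 4, 5, 1, 3, 2]  # rating at second presentation
auc_first, auc_second, auc_avg = crowd_within_aucs(first, second, is_abnormal)
print(auc_first, auc_second, auc_avg)
```

In this toy case the averaged rating separates the categories better than either rating alone because the two judgments err on different images. If the second rating simply repeated the first, averaging would give no gain, which is one way to read the weaker effect at the shorter 3-back lag.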


General Discussion
In the current study, we examined memory performance by nonexpert novices and expert radiologists for normal versus abnormal mammography images as a case study in understanding the role of schemas, distinctiveness, and expertise in memory.
To do so, we relied on ROC analysis, designed to properly measure memory independent of differences in response criteria and to take into account both enhanced memory for seen items as well as the possibility of false alarms.

First, we looked at how confident and competent novice and expert observers were at classifying medical images as either normal or abnormal. Unsurprisingly, radiologists were much better than novices at this task. Novices did show some ability to distinguish abnormality, although this appeared to be largely the result of a few salient images. Second, we examined our main question of interest: memory for the images. In Experiment 1, we examined memory for mammograms in novices, who have none of the expertise or schemas needed to process these images.
We found poor performance overall, as well as a small normality benefit in novice participants' memory, which could be explained by the greater image dissimilarity of normal images. Thus, Experiment 1 (on novices) gave us not only a baseline for memory performance but also an understanding of the intricacies of our image set, showing that some abnormal images were quite salient and that our normal images were more dissimilar from each other.
Even though the normal images in our set were more visually distinctive, in Experiment 2, we found that radiologists had better memory for abnormal images, and had far superior memory performance to novices. This gives insight into how expertise changes memory: not only enhancing the encoding of normal items but also enhancing the distinctiveness of abnormal items.
Thus, while experts might have access to perceptual encoding benefits, distinctiveness, and/or schemas/chunking to enable them to outperform novices, our finding of an extra benefit of expertise for abnormal images is most consistent with a special role of distinctiveness. For experts, the abnormal images have unique features that make them distinct from other items in memory; whereas for novices, these features are not appreciated and so these images are just like any other image.
For example, one possibility is that rather than encoding the entire image, in the case of abnormal images, radiologists specifically encode the abnormality and not the rest of the image into memory. This might reduce the load on memory for that image and might make the memory trace for that image more distinctive.
Broadly speaking, then, we find strong evidence for a role of schemas and distinctiveness in memory, even after taking into account false memory and the possibility of response criterion shifts: We find that experts significantly outperform novices, and that memory for abnormal cases with visible, focal lesions is better than memory for other images. There was no evidence of a memory benefit for "abnormal" contralateral cases.
Measuring memory: False alarms and ROC analysis
In the current studies, we used ROC analysis to examine memory. This is because, in previous work, it has often been unclear if benefits for schema-consistent information like those reported by experts are, in fact, improvements in memory, as opposed to changes in response criteria. To determine whether memory has improved, it is not adequate to simply find a reliable increase in the rate with which observers correctly report having been exposed to some piece of information (the true positive, or "hit" rate). The observer could simply be saying "Yes, I have seen it" more often.
This would produce an increase in false-positive (or false-alarm) errors. In the context of memory research, these false-positive errors can be seen as a form of false memory. In theory, signal detection models and measures like d' can distinguish between these two, but in practice, the prerequisites for d' to properly adjust for response bias (equal variance; zROC slopes = 1.0) are rarely present in recognition memory contexts and were not present here.
Thus, ROC analysis is needed to distinguish differences in the ability to remember from criterion shifts, which would reflect an increased tendency of observers to say that they remember (e.g., Wixted & Mickes, 2015).
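The dependence of d' on the equal-variance assumption can be illustrated with a small Gaussian signal detection sketch. It assumes new items have memory strength distributed as N(0, 1) and old items as N(mu, sigma), and it computes d' = z(hit rate) - z(false-alarm rate) at two response criteria. The parameter values are hypothetical and chosen only for illustration; they are not estimates from the present data.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # strength distribution for new items (mean 0, SD 1)


def hit_and_false_alarm(criterion, old_mean, old_sd):
    """Hit and false-alarm rates when observers say "old" above the criterion."""
    old_items = NormalDist(old_mean, old_sd)
    hit_rate = 1.0 - old_items.cdf(criterion)
    false_alarm_rate = 1.0 - STD_NORMAL.cdf(criterion)
    return hit_rate, false_alarm_rate


def d_prime(hit_rate, false_alarm_rate):
    """Standard d' = z(H) - z(FA), which presumes equal variances."""
    return STD_NORMAL.inv_cdf(hit_rate) - STD_NORMAL.inv_cdf(false_alarm_rate)


def d_prime_at(criterion, old_mean, old_sd):
    return d_prime(*hit_and_false_alarm(criterion, old_mean, old_sd))


# Same underlying memory strength, two different response criteria.
equal_lenient = d_prime_at(0.0, old_mean=1.0, old_sd=1.0)
equal_strict = d_prime_at(1.0, old_mean=1.0, old_sd=1.0)
unequal_lenient = d_prime_at(0.0, old_mean=1.0, old_sd=1.25)  # zROC slope = 1/1.25 = 0.8
unequal_strict = d_prime_at(1.0, old_mean=1.0, old_sd=1.25)

print("equal variance:  ", equal_lenient, equal_strict)
print("unequal variance:", unequal_lenient, unequal_strict)
```

With equal variances, d' is the same at both criteria. With unequal variances, d' works out to mu/sigma + c(1 - 1/sigma), so it shifts with the criterion c even though the underlying distributions have not changed. In that case a single hit and false-alarm pair can make a pure criterion shift look like a memory difference, whereas the full ROC is traced out by the same two distributions at every criterion.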

Is false memory a true concern? Previous work has found that organizing information in memory via schemas can have both positive and negative consequences and, in particular, often does increase false alarms, making it difficult to tell whether memory is genuinely improved. While greater understanding (as in expertise) may allow encoding of only the relevant details, reducing memory load, it may also cause us to falsely remember information that was not present (e.g., Owens et al., 1979). For example, in recognition tests, people are more likely to false alarm to schema-consistent relative to schema-inconsistent lures.
For instance, they would be more likely to falsely report seeing books in a graduate student's office than inconsistent objects like a piece of tree bark or a pair of pliers (Brewer & Treyens, 1981; Lampinen et al., 2001). And while participants are more likely to correctly remember schema-consistent information in a briefly presented scene (Biederman et al., 1982; Brewer & Treyens, 1981), they are also more likely to falsely remember such information (e.g., Hollingworth & Henderson, 2003; Pezdek et al., 1989).
Thus, measuring full ROCs, rather than attempting to infer how response bias would change performance using measures like A', d', or hits minus false alarms, often reveals surprising answers about memory, particularly in situations like expertise and consistent/inconsistent items where it is known that both hit and false-alarm rates are affected.
For example, Dougal and Rotello (2007) used ROC analysis to show that the well-known effect of "improved memory" for emotional words compared with neutral words is a response bias effect, not a true difference in memory between the words. Similarly, Mickes et al. (2012) showed in the domain of eyewitness memory that sequential lineups, which reduce both false alarms and hit rates relative to simultaneous lineups, are inferior to simultaneous lineups, contrary to a large body of literature suggesting the opposite (e.g., Wells et al., 2011), as the major "benefit" arises simply from a response criterion shift, not a change in memory strength.
Thus, the current experiments provide unique evidence that expertise, and distinctiveness that is apparent only to experts, do enhance memory, and that this is not just a response criterion shift.
What explains radiologists outperforming novices
Consistent with a wide variety of work on expertise, we find that expert radiologists outperform novices in remembering mammograms. One likely possibility is that this occurs because of experts' knowledge about these images: they have relevant knowledge that allows them to understand these images in a way novices do not and they likely have perceptual expertise built into their visual system from years of experience (e.g., in the form of greater holistic processing; e.g., Richler et al., 2011). In particular, for an expert, the abnormal images would have an added attribute (that mass, that calcification), learned over years of experience, that would help to distinguish the item in memory.
However, in the current study, we did not attempt to directly match our experts to our novices. Our novice pool was sampled from the internet, which is much more broadly representative of the demographics of the United States than an undergraduate population (e.g., Difallah et al., 2018), but still likely differs in several ways from our radiologists (in demographic and socioeconomic factors, as well as motivation to focus on mammogram images).
Thus, Experiment 1 should be taken as only an approximate baseline: it revealed important image features in our stimulus set, and points to the possibility of strong expertise effects, but does not directly confirm these are based solely on knowledge rather than other factors.
Memory and abnormality judgments in radiologists
Previous work has found mixed results when investigating memory improvements in radiologists. For example, Hardesty et al. (2005) investigated radiologists' long-term memory for medical images presented months later and found that none of the radiologists remembered cases that they had read previously. Evans et al. (2016) found mixed results when investigating whether abnormality improves memory in expert observers, including radiologists.
Our results provide context to these ambiguities, as they suggest that expert radiologists do have stronger memory for abnormal images even in a long-term memory setting and even when response bias is properly taken into account using ROC analysis. However, our long delays were only on the order of minutes, not months, and so it remains unclear how such advantages would last over long durations. It is worth noting that in the classification task, radiologists performed on average much more poorly than would be expected of radiologists in the clinic with unlimited viewing time (d' = 2.5–3.0, as in D'Orsi et al., 2013). One reason for this might be that each image in our study was only presented for 3 seconds.
For instance, Evans et al. (2013) showed radiologists only a brief glimpse of mammograms and varied timing from 250 ms to 2,000 ms. The AUCs for radiologists in their experiment for 500 ms, 1,000 ms, and 2,000 ms viewing times were 0.65, 0.66, and 0.72, respectively. In our experiment with a presentation time of 3,000 ms, we found an AUC of 0.72. Thus, our 3,000-ms presentations resulted in a similar level of performance to the 2,000-ms presentations of Evans et al. (2013), which, while well below what is expected with unlimited viewing time, is consistent with other studies and consistent with viewing time being the main constraint that leads to lower performance.
The "crowd-within" effect in radiologists
Because our study had radiologists answer the same classification question about an image twice, we looked at whether averaging radiologists' two responses to the same image resulted in better performance (a "crowd-within" effect; Vul & Pashler, 2008). We found that radiologist performance improved when the two responses to the same image were averaged, compared with either response alone, but only in the 30-back condition and only modestly even then. This indicates that by the time radiologists were presented with the same image 30 images later, they gave a response that was somewhat independent of their first response.
This suggests that, under the current experimental conditions, there might be information the radiologists are not using the first time they see an image, and that the opportunity to see the image again allows the radiologist to glean additional useful information. Future studies might determine whether such benefits persist when experts are given unlimited time to process the images as well as whether this effect can be made larger with an even longer delay between the first and second presentation of an image (as found by Vul & Pashler, 2008).
The "gist" of abnormality
Given the Evans et al. (2016) finding that there is a "gist of abnormality" present in the contralateral breast when no localizable abnormality is present, we were interested to know whether these contralateral-abnormal images had any advantage over normal images in expert memory. We found no such evidence. In our experiment, we also found no difference in the classification of abnormality between contralateral-abnormal images and normal images.
While at first this might seem to contradict earlier work, several methodological differences make it difficult to compare our results directly with Evans et al. (2016). It is possible that we did not find this result because we presented images for a longer encoding time (3,000 ms). Typical stimulus exposure in mammogram "gist" studies has been less than a second; 500 ms is typical. It is possible that presenting images for longer encoding times might obscure the gist information, overwriting an initial "gist" impression with more semantic or meaningful information.
Recall, also, that our radiologists were not informed about the gist and likely reserved their "abnormal" ratings for cases where they could localize a lesion. It is possible that we would observe a contralateral-abnormal effect even at long encoding times if we explicitly directed participants to look for a more general abnormal texture or gist. Given these methodological differences, the current study cannot be readily compared with Evans et al. (2016). However, this seems to be a promising avenue for future work.
Conclusion
Using radiologists as a case study, we find an advantage for memory in experts as well as an advantage for abnormal images, even when properly measuring memory via ROC analysis. This is broadly consistent with the literature on schemas. Our findings have important implications for both applied fields that utilize expert intelligence in making inferential decisions as well as theoretical fields interested in how memory changes with expertise. In particular, understanding the structure of memory in experts is critical in situations where decisions need to be made by people who have significant expertise.
Acknowledgements All persons who contributed to this project are authors of the final paper.
Authors' contributions All authors contributed to the original hypothesis, and read and approved the final manuscript. H.M.S. contributed to data collection, data analysis, and writing of the manuscript. T.F.B. contributed to data collection, data analysis, and editing of the manuscript. J.M.W. provided general guidance and contributed to the editing of the manuscript.
Funding This research was supported by NSF BCS-1829434 to T.F.B.
Data Availability For data and material, please contact the corresponding author.
Declarations
Ethics approval and consent to participate All participants gave informed consent. For all experiments in this study, informed consent procedures were approved by the Institutional Review Board of the University of California, San Diego.

Consent for publication Not applicable.
References
1. Bainbridge, W. A., Isola, P., & Oliva, A. (2013). The intrinsic memorability of face photographs. Journal of Experimental Psychology: General, 142(4), 1323–1334.
2. Bartlett, F. C. (1932). Remembering: An experimental and social study. Cambridge University Press.
3. Berinsky, A. J., Huber, G. A., & Lenz, G. S. (2012). Evaluating online labor markets for experimental research: Amazon.com's Mechanical Turk. Political Analysis, 20(3), 351–368.
4. Biederman, I., Mezzanotte, R. J., & Rabinowitz, J. C. (1982). Scene perception: Detecting and judging objects undergoing relational violations. Cognitive Psychology, 14(2), 143–177.
5. Bilalić, M., Langner, R., Ulrich, R., & Grodd, W. (2011). Many faces of expertise: Fusiform face area in chess experts and novices. Journal of Neuroscience, 31(28), 10206–10214.
6. Brady, T. F., & Alvarez, G. A. (2011). Hierarchical encoding in visual working memory: Ensemble statistics bias memory for individual items. Psychological Science, 22(3), 384–392.
7. Brady, T. F., Alvarez, G., & Störmer, V. (2019). The role of meaning in visual memory: Face-selective brain activity predicts memory for ambiguous face stimuli. Journal of Neuroscience, 39(6), 1100–1108.
8. Brewer, W. F., & Treyens, J. C. (1981). Role of schemata in memory for places. Cognitive Psychology, 13(2), 207–230.
9. Buhrmester, M., Kwang, T., & Gosling, S. D. (2011). Amazon's Mechanical Turk: A new source of inexpensive, yet high-quality, data? Perspectives on Psychological Science, 6(1), 3–5.
10. Calkins, M. W. (1894). Experimental. Psychological Review, 1(3), 327–329.