Invited Paper

Performance Analysis of Artificial Intelligence Models Trained with Open-Source Dataset in Clinical Environment

10.4274/atfm.galenos.2022.97830

  • Ramazan Terzi
  • Mustafa Umut Demirezen

Received Date: 11.11.2022 Accepted Date: 23.11.2022 J Ankara Univ Fac Med 2022;75(1):25-34

Objectives:

Deep learning-based methods for tumor detection and segmentation have been under development for many years and are now widely reported in the literature. Earlier methods generally rely on architectures based on convolutional neural networks, whereas more recent methods based on visual transformer architectures offer several advanced capabilities and are in wide use today. In this study, both approaches were trained on a dataset frequently used in the literature and tested on real clinical data obtained in a hospital environment. The aim was to measure how usable models trained on open datasets are, and how well they generalize, on 5 different lesion types in real clinical settings.

Materials and Methods:

Eight deep learning architectures based on Convolutional Neural Networks and Visual Transformers were trained on the open BraTS 2020 dataset. The trained models were tested on MR images prepared and labeled by physicians of the Department of Radiology, Ankara University Faculty of Medicine, and their performance was reported using the Intersection over Union (IoU) and Dice coefficient metrics.
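The two metrics named above can be illustrated with a minimal sketch. It assumes per-label evaluation on flattened label maps, and it assumes the label codes 1, 2, and 4 for NCR/NET, Edema, and Enhancing Tumor that are conventional for BraTS 2020. The function name `dice_and_iou`, the toy masks, and the handling of an absent label are hypothetical choices for illustration, not the evaluation code used in the study.

```python
from typing import Sequence, Tuple


def dice_and_iou(pred: Sequence[int], truth: Sequence[int], label: int) -> Tuple[float, float]:
    """Return (Dice, IoU) for one label over two flattened label maps."""
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of voxels")

    # Binary membership of each voxel in the chosen label
    pred_hits = [p == label for p in pred]
    truth_hits = [t == label for t in truth]

    intersection = sum(p and t for p, t in zip(pred_hits, truth_hits))
    pred_size = sum(pred_hits)
    truth_size = sum(truth_hits)
    union = pred_size + truth_size - intersection

    if union == 0:
        # Label absent from both masks: scored as a perfect match here (an assumption)
        return 1.0, 1.0

    dice = 2 * intersection / (pred_size + truth_size)
    iou = intersection / union
    return dice, iou


# Toy flattened masks; label codes are assumed to follow the BraTS convention
truth = [0, 1, 1, 2, 2, 2, 4, 0]
pred = [0, 1, 2, 2, 2, 0, 4, 4]
for label, name in [(1, "NCR/NET"), (2, "Edema"), (4, "Enhancing Tumor")]:
    dice, iou = dice_and_iou(pred, truth, label)
    print(f"{name}: Dice={dice:.3f} IoU={iou:.3f}")
```

For a single label the two metrics are monotonically related (Dice = 2·IoU / (1 + IoU)), so they rank models identically on a given case, but Dice is always at least as large as IoU.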

Results:

In the analyses grouped by lesion type, the models trained on the BraTS 2020 dataset showed the following performance decreases when tested on the Ankara University dataset. For HGG lesions, decreases of approximately 17%, 4%, and 9% were observed for the NCR/NET, Edema, and Enhancing Tumor labels, respectively. For LGG tumors, decreases of approximately 45% and 30% were observed for the NCR/NET and Enhancing Tumor labels, respectively. For Cavernoma, a decrease of approximately 60% was observed for the Edema label. For Meningioma, average decreases of approximately 36% and 33% were observed for the Edema and Enhancing Tumor labels, respectively. Finally, for Schwannoma, decreases of approximately 61% and 2% were observed for the Edema and Enhancing Tumor labels, respectively.

Conclusion:

The findings show that deep learning models trained only on the open-source dataset generalize quite poorly in the clinical setting, that their generalization ability varies with lesion type, and that they give better results on the open dataset and on datasets similar to it. To use models developed on open datasets in the clinical environment, transfer learning should be carried out to improve model performance.

Keywords: Deep Learning, Brain Tumor Segmentation, Visual Transformers, Convolutional Neural Networks

Full Text (Turkish)