DeepAstroUDA: semi-supervised universal domain adaptation for cross-survey galaxy morphology classification and anomaly detection

Ćiprijanović, A and Lewis, A and Pedro, K and Madireddy, S and Nord, B and Perdue, G N and Wild, S M (2023) DeepAstroUDA: semi-supervised universal domain adaptation for cross-survey galaxy morphology classification and anomaly detection. Machine Learning: Science and Technology, 4 (2). 025013. ISSN 2632-2153

Abstract

Artificial intelligence methods show great promise in increasing the quality and speed of work with large astronomical datasets, but the high complexity of these methods leads to the extraction of dataset-specific, non-robust features. Therefore, such methods do not generalize well across multiple datasets. We present a universal domain adaptation method, DeepAstroUDA, as an approach to overcome this challenge. This algorithm performs semi-supervised domain adaptation (DA) and can be applied to datasets with different data distributions and class overlaps. Non-overlapping classes can be present in either of the two datasets (the labeled source domain or the unlabeled target domain), and the method can even be used in the presence of unknown classes. We apply our method to three examples of galaxy morphology classification tasks of different complexities (three-class and ten-class problems), with anomaly detection: (1) datasets created after different numbers of observing years from a single survey (Legacy Survey of Space and Time mock data of one and ten years of observations); (2) data from different surveys (Sloan Digital Sky Survey (SDSS) and DECaLS); and (3) data from observing fields with different depths within one survey (wide field and Stripe 82 deep field of SDSS). For the first time, we demonstrate the successful use of DA between very discrepant observational datasets. DeepAstroUDA is capable of bridging the gap between two astronomical surveys, increasing classification accuracy in both domains (up to 40% on the unlabeled data), and making model performance consistent across datasets. Furthermore, our method also performs well as an anomaly detection algorithm and successfully clusters unknown class samples even in the unlabeled target dataset.
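
The abstract does not say how samples from unknown classes are separated from known ones. As a rough illustration only, the sketch below assumes one common open-set approach: a target-domain prediction is flagged as "unknown" when the normalized entropy of the classifier output exceeds a threshold. The function names, the threshold value of 0.5, and the use of entropy itself are assumptions made for this example and are not taken from the paper.

```python
import math

def softmax(logits):
    """Convert raw classifier scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def normalized_entropy(probs):
    """Shannon entropy divided by log(K), so the result lies in [0, 1]."""
    k = len(probs)
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(k)

def label_or_unknown(logits, threshold=0.5):
    """Return the predicted class index, or -1 when the prediction is too
    uncertain. The threshold is a hypothetical value chosen for illustration."""
    probs = softmax(logits)
    if normalized_entropy(probs) > threshold:
        return -1  # treated as a candidate unknown-class (anomalous) sample
    return max(range(len(probs)), key=probs.__getitem__)

# A confident three-class prediction keeps its label; a near-uniform one is flagged.
confident = label_or_unknown([8.0, 0.0, 0.0])
uncertain = label_or_unknown([0.1, 0.0, 0.05])
```

In this sketch a low-entropy output is assigned to one of the known morphology classes, while a high-entropy output is set aside for later inspection or clustering. A real pipeline would tune the threshold on held-out data rather than fix it in advance.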

Item Type: Article
Subjects: OA Digital Library > Multidisciplinary
Depositing User: Unnamed user with email support@oadigitallib.org
Date Deposited: 22 Jun 2024 08:52
Last Modified: 22 Jun 2024 08:52
URI: http://library.thepustakas.com/id/eprint/1761
