Exoplanet Detection with Decision Trees
DOI: https://doi.org/10.58445/rars.526
Keywords: exoplanet, decision trees, machine learning algorithms
Abstract
Exoplanets can be detected by observing the brightness and movement of the stars they orbit. Machine learning algorithms have previously been used to classify possible candidates, analyzing large samples of data and automating the otherwise tedious process of classification. In our research, we train a decision tree algorithm on datasets of confirmed exoplanets, candidates, and false positives from the Kepler Mission, taken from the NASA Exoplanet Archive. The resulting decision tree classification model achieves 94.12% accuracy when trained on confirmed exoplanets, candidates, and false positives, and 99.78% accuracy when trained only on confirmed exoplanets and false positives. Alternatively, when training a decision tree regression model to predict Kepler Object of Interest (KOI) scores, we obtain a loss of 0.04. These results indicate that the decision tree algorithm is a viable option for classifying and detecting exoplanets.
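As a rough illustration of the kind of classifier described in the abstract, the following is a minimal sketch of a decision tree that chooses threshold splits by Gini impurity. It is not the model used in this work: the two features (a transit depth and a transit duration), the eight data rows, and every function name are hypothetical and chosen only to keep the example self-contained, and the split criterion is an assumption, since the abstract does not state one.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return the (feature_index, threshold) that most reduces weighted
    Gini impurity, or None if no split improves on the unsplit node."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build_tree(rows, labels, depth=0, max_depth=3):
    """Recursively grow a tree. Internal nodes are tuples
    (feature_index, threshold, left_subtree, right_subtree);
    leaves are the majority class label."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (
        f,
        t,
        build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    )


def predict(tree, row):
    """Walk from the root to a leaf and return that leaf's class label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


# Made-up toy data, NOT taken from the NASA Exoplanet Archive.
# Hypothetical features: (transit depth in ppm, transit duration in hours).
rows = [
    (150.0, 2.5), (320.0, 3.1), (90.0, 1.8), (410.0, 4.0),
    (12000.0, 6.5), (25000.0, 8.2), (18000.0, 5.9), (30000.0, 7.4),
]
labels = ["CONFIRMED"] * 4 + ["FALSE POSITIVE"] * 4

tree = build_tree(rows, labels)
train_accuracy = sum(predict(tree, r) == y for r, y in zip(rows, labels)) / len(rows)
print(tree)
print(train_accuracy)
```

On this toy data a single split separates the two classes, so the training accuracy is trivially perfect; that says nothing about the accuracies reported above, which come from the real Kepler data. In practice a library implementation would normally be used in place of hand-written code like this, together with a held-out test set.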
License
Copyright (c) 2023 Sriram Loganathan
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.