Preprint / Version 4

Random Forest Identification of Pulsars


  • Ankhita Sathanur, Eastlake High School



Keywords: Machine Learning, Random Forest, Astrophysics, Astronomy, Pulsars


Pulsars are rotating neutron stars that emit beams of radio emission. As these beams sweep across Earth, they are detected as repetitive pulses. Traditionally, pulsar candidates have been identified through manual signal processing. As data volumes increase, automated methods, such as artificial neural networks, have been proposed. In this study, a random forest classifier (an algorithm that takes the majority output of multiple decision trees) was used to separate real pulsar signals from radio frequency interference (RFI) and other noise. Candidates selected in this way can then be allotted telescope time for follow-up study to confirm them as pulsars. The model was built from 1,639 real pulsar examples and 16,259 examples of RFI/noise from the HTRU2 data set. The features used were the mean, standard deviation, excess kurtosis, and skewness of both the integrated pulse profile and the dispersion measure-signal-to-noise ratio (DM-SNR) curve. The model identified pulsars with 98% accuracy. The excess kurtosis, skewness, and mean of the integrated profile were found to be the most important features for differentiating between pulsars and interference. This tool could be used to filter data from future surveys and reduce the number of candidates that must be inspected by humans.
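The workflow summarized above can be sketched in a few lines of Python with scikit-learn. This is a minimal illustration, not the study's code: the data are a synthetic stand-in generated with `make_classification` (so the printed accuracy and importance ranking say nothing about real pulsars), the feature names in `FEATURES` are hypothetical labels for the eight statistics listed in the abstract, and the hyperparameters (`n_estimators=100`, a 75/25 split) are assumptions rather than the values used in the paper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical names for the eight statistics described in the abstract:
# four from the integrated pulse profile, four from the DM-SNR curve.
FEATURES = [
    "profile_mean", "profile_std", "profile_excess_kurtosis", "profile_skewness",
    "dmsnr_mean", "dmsnr_std", "dmsnr_excess_kurtosis", "dmsnr_skewness",
]

# Synthetic stand-in for the candidate table (assumption: NOT the HTRU2 data).
# The class weights roughly mimic the imbalance of about 9% pulsars.
X, y = make_classification(
    n_samples=2000, n_features=len(FEATURES), n_informative=4, n_redundant=2,
    weights=[0.91, 0.09], random_state=0,
)

# Hold out a stratified test set so both classes appear in each split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Fit an ensemble of decision trees; settings here are illustrative defaults.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

# Overall accuracy on the held-out candidates.
accuracy = accuracy_score(y_test, forest.predict(X_test))

# Impurity-based importances, sorted from most to least important.
ranked = sorted(
    zip(FEATURES, forest.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)

print(f"test accuracy: {accuracy:.3f}")
for name, importance in ranked:
    print(f"{name:26s} {importance:.3f}")
```

Two caveats apply when adapting this sketch. With classes this imbalanced, a model that labels every candidate as noise would already reach roughly 91% accuracy, so precision and recall on the pulsar class are worth reporting alongside accuracy. Also, scikit-learn's `RandomForestClassifier` combines trees by averaging their predicted class probabilities rather than by a strict majority vote, which usually but not always gives the same label.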





2022-10-03 (updated on 2022-12-20)