Background and Aims: Single-lead electrocardiograms (ECGs) can be recorded using widely available devices such as smartwatches and handheld ECG recorders. Such devices have been approved for atrial fibrillation (AF) detection. However, little evidence exists on the reliability of single-lead ECG interpretation. We aimed to assess the level of agreement on detection of AF between independent cardiologists interpreting single-lead ECGs, and to identify factors influencing agreement. Methods: In a population-based AF screening study, adults aged ≥65 years recorded four single-lead ECGs per day for 1–4 weeks using a handheld ECG recorder. ECGs showing potential AF were identified by a nurse with the aid of an automated algorithm. These ECGs were reviewed by two independent cardiologists who assigned participant- and ECG-level diagnoses. Inter-rater reliability of AF diagnosis was calculated using linearly weighted Cohen’s kappa (κw). Results: A total of 185 participants and 1,843 ECGs were reviewed by both cardiologists. The level of agreement was moderate: κw = 0.42 (95% CI, 0.32–0.52) at the participant level and κw = 0.51 (95% CI, 0.46–0.56) at the ECG level. At the participant level, agreement was associated with the number of adequate-quality ECGs recorded, with higher agreement in participants who recorded at least 67 adequate-quality ECGs. At the ECG level, agreement was associated with ECG quality and with whether ECGs exhibited algorithm-identified possible AF. Conclusions: Inter-rater reliability of AF diagnosis from single-lead ECGs was moderate in older adults. Strategies to improve reliability might include participant and cardiologist training, and designing AF detection programmes to obtain sufficient ECGs for reliable diagnoses.
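
For readers unfamiliar with the agreement statistic named in the Methods, the following is a minimal sketch of how a linearly weighted Cohen's kappa can be computed for two raters. It is not the study's analysis code. The three ordered diagnosis categories ("no AF", "possible AF", "AF"), the function name `linear_weighted_kappa`, and the six toy ratings are assumptions made purely for illustration; the abstract does not list the diagnostic categories used, and the numbers below are not study data.

```python
from collections import Counter


def linear_weighted_kappa(ratings_a, ratings_b, categories):
    """Linearly weighted Cohen's kappa for two raters on an ordinal scale.

    `categories` lists the labels in their ordinal order. Disagreement
    between categories i and j is penalised by |i - j| / (k - 1).
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two categories are required")

    index = {label: i for i, label in enumerate(categories)}
    n = len(ratings_a)

    # Joint counts (observed) and per-rater marginal counts.
    observed = Counter((index[a], index[b]) for a, b in zip(ratings_a, ratings_b))
    margin_a = Counter(index[a] for a in ratings_a)
    margin_b = Counter(index[b] for b in ratings_b)

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)
            observed_disagreement += weight * observed[(i, j)] / n
            # Expected proportion under independence of the two raters.
            expected_disagreement += weight * margin_a[i] * margin_b[j] / (n * n)

    if expected_disagreement == 0:
        raise ValueError("kappa is undefined when both raters use one identical category")
    return 1.0 - observed_disagreement / expected_disagreement


# Hypothetical ordinal scale and toy ratings (illustrative only, not study data).
CATEGORIES = ["no AF", "possible AF", "AF"]
cardiologist_1 = ["no AF", "no AF", "possible AF", "AF", "AF", "no AF"]
cardiologist_2 = ["no AF", "possible AF", "possible AF", "AF", "possible AF", "AF"]

kappa = linear_weighted_kappa(cardiologist_1, cardiologist_2, CATEGORIES)
print(round(kappa, 3))  # 0.294 (exactly 5/17 for this toy data)
```

With linear weights, a disagreement between adjacent categories (for example "possible AF" versus "AF") counts half as much as a disagreement between the two extremes, which is why a weighted statistic suits ordinal diagnoses. The confidence intervals reported in the Results would require an additional step, such as an analytic standard error or bootstrapping, which this sketch omits.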