Flying Ad Hoc Networks (FANETs) are a critical component of UAV-based communication systems, with applications in surveillance, disaster response, and defense. Their highly dynamic topology, high node mobility, and limited battery capacity make reliable, energy-efficient communication challenging. Traditional handover mechanisms, often adapted from Vehicular Ad Hoc Networks (VANETs), rely on static thresholds and are ill-suited to the three-dimensional mobility and energy constraints of FANETs. This study proposes QEEH, a Q-learning-based Energy-Efficient Handover framework designed for FANET environments. QEEH employs reinforcement learning to make adaptive handover decisions based on signal strength, node density, residual energy, and traffic load. It also integrates multiple energy states (active, sleep, hibernate, and wake-up) to reduce power consumption without compromising connectivity. NS3-based simulations show that QEEH consistently outperforms CLEA-AODV, LFEAR, and PARouting. Compared with CLEA-AODV, QEEH achieves up to 23% higher throughput, 20% higher packet delivery ratio, 30% lower end-to-end delay, and 28% lower energy consumption, while maintaining more than 90% node survivability at the end of the simulation, exceeding the other protocols by 15–21%. These results demonstrate that intelligent, energy-aware handover schemes can enhance FANET performance. However, the findings are limited to NS3 simulations with moderate UAV densities. Future work will focus on testbed validation, scalability to large UAV swarms, and extending QEEH with deep reinforcement learning and federated learning for decentralized training.
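To make the handover decision process concrete, the following is a minimal sketch of a tabular Q-learning agent over the four state features named above (signal strength, node density, residual energy, traffic load). The discretization thresholds, reward scheme, hyperparameters, and class name `QEEHAgent` are illustrative assumptions, not the paper's actual implementation or NS3 code.

```python
import random

# Illustrative tabular Q-learning handover agent. The state features follow
# the abstract (signal strength, node density, residual energy, traffic load),
# but all thresholds and hyperparameters here are assumed for demonstration.
ACTIONS = ("stay", "handover")

def discretize(signal_dbm, density, energy_frac, load_frac):
    """Map continuous observations to a coarse discrete state tuple."""
    s = 0 if signal_dbm < -85 else (1 if signal_dbm < -70 else 2)
    d = 0 if density < 5 else 1
    e = 0 if energy_frac < 0.3 else 1
    l = 0 if load_frac < 0.5 else 1
    return (s, d, e, l)

class QEEHAgent:
    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, seed=42):
        self.q = {}  # (state, action) -> estimated value
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def choose(self, state):
        # Epsilon-greedy exploration over the two handover actions.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q.get((state, a), 0.0))

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update rule.
        best_next = max(self.q.get((next_state, a), 0.0) for a in ACTIONS)
        old = self.q.get((state, action), 0.0)
        self.q[(state, action)] = old + self.alpha * (
            reward + self.gamma * best_next - old)
```

In a full system, the reward would combine link quality after the decision with the energy cost of executing a handover, so the agent learns to hand over only when the gain in connectivity outweighs the power spent.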