This thesis explores the use of metaheuristics to tackle complex problems in machine learning, with a focus on feature selection, across two major application domains: medicine and graph theory. Feature selection is an NP-hard combinatorial problem for which exact methods quickly become impractical as data dimensionality increases. To address this challenge, a method based on differential evolution, called Tournament in Differential Evolution (TiDE), was developed. It incorporates adaptive mechanisms for initialisation, mutation, and crossover.
The first part of the thesis is devoted to the theoretical foundations of metaheuristics and machine learning, with particular attention paid to data and model quality, as well as to the diversity of classification and regression algorithms. A comprehensive experimental study was then conducted on a benchmark of datasets with varying characteristics (e.g. dimensionality, noise, imbalance, redundancy) to assess the robustness and generality of the proposed approaches. Subsequently, the methods were applied to real-world medical datasets, particularly in the context of survival analysis for amyotrophic lateral sclerosis (ALS) and COVID-19 pneumonia. The results suggest that the proposed approaches not only improve predictive performance but also significantly reduce data dimensionality while preserving adequate interpretability.
The last part of the thesis applies metaheuristics and machine learning to the generation and refutation of conjectures in graph theory. A benchmark of conjectures was established, and several methods, including TiDE, were compared experimentally. The findings highlight the relevance of metaheuristics in underexplored areas of fundamental research, where combinatorial complexity renders classical approaches impractical.
Overall, this thesis demonstrates the value and robustness of metaheuristics across a variety of contexts, at the intersection of theoretical and applied approaches in artificial intelligence.