4.1.2. Performance Measures: 

The testing-based evaluation metrics (precision, recall, and F1 score) and accuracy (overall, macro, and weighted average) were used to evaluate the classification accuracy of the three methods applied to the testing datasets.

4.1.3. Methods setup and execution environment:

The BireyselValue method has one hyperparameter, the scalar of the radius, which is used in the building stage, outlined above in (\ref{103702}). For that purpose, values ranging from 0.1 to 0.3 are recommended. The range was chosen through experimentation, but in some datasets, such as \cite{Chicco_2020} and \cite{Koklu2020MulticlassCO}, when using this range, the accuracy measures after applying the prediction stage were very low. For those datasets, the range was adjusted to values ranging from 0.05 to 0.09 to reach an acceptable accuracy. In additionnot all the variables \(n\) for the datasets \cite{Chicco_2020}\cite{pubmed}\cite{Koklu2020MulticlassCO}, and \cite{Martins_2021}  were selected; however, only 30-50% of the variables were selected because, after some experimentation, the BireyselValue method performed much better with fewer variables. Notably, the selected variables are randomly chosen. A Python package from \cite{dahman2024a} was used to construct the building, training and prediction stages, as outlined in (\ref{921392}), (\ref{625657}), and (\ref{500846}), respectively. The documentation about the usage and the implementation steps are available \cite{dahman2024}.
The \(k\) hyperparameter for the K-nearest neighbor method was fixed \(k=5\). The value was chosen through experimentation, and the best performance was observed at this value. Similarly, the random state value for the logistic regression method was fixed with a range from 5-16, which varied depending on the dataset. For both methods, the Python package from Scikit-learn \cite{p2011} was used to perform the training and prediction steps.
Finally, the running environment was implemented on a basic machine containing a 5-core Intel(R) CPU (3.4 GHz-3.6 GHz) with 8 GB of RAM.

4.2. Results

The performance metric results in (\ref{625006}) obtained using the three methods applied to the testing datasets are presented in this section. Each table corresponds to one of the datasets in (\ref{384212}). Notably, LR, KN, and BV refer to logistic regression, the K-nearest neighbor, and the BireyselValue, respectively.