Table 1: Dimensions of the Kariki Farm dataset from wunderground.com
First, the data were loaded and preprocessed to ensure they were clean and contained no missing values. Next, we used Rough Set theory to detect the interaction terms in the dataset. This began with discretization, which converts numerical attributes to nominal ones and makes the data easier to evaluate and manage. Discretization is a data transformation that finds cut points in the range of each attribute and uses them to divide the values into intervals. All values lying within the same interval are then mapped to the same value, which reduces the size of the attribute value set (Hassanien, Abdelhafez, & Own, 2008).
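The discretization step can be illustrated with a minimal sketch. The text does not state which cut-point strategy was used, so equal-width binning is assumed here purely for illustration; the function names `equal_width_cuts` and `discretize` and the temperature readings are hypothetical and are not taken from the Kariki Farm dataset.

```python
from bisect import bisect_right


def equal_width_cuts(values, n_bins):
    """Return the n_bins - 1 interior cut points of an equal-width binning."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + width * i for i in range(1, n_bins)]


def discretize(values, cuts):
    """Map each value to the index of the interval it falls into."""
    return [bisect_right(cuts, v) for v in values]


# Made-up temperature readings (assumption, for illustration only).
temps = [18.2, 19.0, 21.5, 24.8, 25.1, 29.9]

cuts = equal_width_cuts(temps, n_bins=3)   # two interior cut points
codes = discretize(temps, cuts)            # one nominal code per reading

print(cuts)
print(codes)
```

Six distinct numerical values collapse into three nominal codes, which is the reduction in the attribute value set described above. Other strategies (equal-frequency or supervised cuts, for example) would change only `equal_width_cuts`.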
Next, the indiscernibility relation was used to determine which objects in the dataset cannot be distinguished from one another on the basis of a given set of attributes. From this relation we deduced the lower and upper approximations, which in turn define the lower (positive), upper, and boundary regions. The lower region contains the objects that certainly belong to the subset of interest, and the boundary region contains those that can be neither included nor excluded with certainty. After deducing the approximations, the reduct (feature subset) was formulated from the lower/positive region; the method employed was the greedy heuristic method for feature selection, which is a wrapper feature selection algorithm (Janusz & Ślęzak, 2014).
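The sequence of indiscernibility classes, approximations, and greedy reduct search can be sketched as follows. This is an illustrative toy under stated assumptions, not the implementation used in the study: the quality measure (size of the positive region), the helper names, and the six discretized records are all hypothetical, and no pruning step is applied, so the result may be a superset of a minimal reduct.

```python
from collections import defaultdict


def ind_classes(rows, attrs):
    """Partition row indices into indiscernibility classes over attrs."""
    classes = defaultdict(set)
    for i, row in enumerate(rows):
        classes[tuple(row[a] for a in attrs)].add(i)
    return list(classes.values())


def approximations(rows, attrs, target):
    """Return (lower, upper, boundary) of the object set `target`."""
    lower, upper = set(), set()
    for c in ind_classes(rows, attrs):
        if c <= target:      # class lies entirely inside the target
            lower |= c
        if c & target:       # class overlaps the target
            upper |= c
    return lower, upper, upper - lower


def positive_region(rows, attrs, decision):
    """Union of the lower approximations of every decision class."""
    pos = set()
    for d_class in ind_classes(rows, [decision]):
        pos |= approximations(rows, attrs, d_class)[0]
    return pos


def greedy_reduct(rows, cond_attrs, decision):
    """Repeatedly add the attribute that enlarges the positive region most."""
    full = len(positive_region(rows, cond_attrs, decision))
    chosen = []
    while len(positive_region(rows, chosen, decision)) < full:
        best = max(
            (a for a in cond_attrs if a not in chosen),
            key=lambda a: len(positive_region(rows, chosen + [a], decision)),
        )
        chosen.append(best)
    return chosen


# Hypothetical discretized records; rows 4 and 5 are deliberately inconsistent.
rows = [
    {"temp": 0, "humidity": 1, "wind": 0, "rain": "no"},
    {"temp": 0, "humidity": 1, "wind": 1, "rain": "no"},
    {"temp": 1, "humidity": 1, "wind": 0, "rain": "yes"},
    {"temp": 1, "humidity": 0, "wind": 0, "rain": "no"},
    {"temp": 2, "humidity": 1, "wind": 1, "rain": "yes"},
    {"temp": 2, "humidity": 1, "wind": 1, "rain": "no"},
]
conditions = ["temp", "humidity", "wind"]

rainy = {i for i, r in enumerate(rows) if r["rain"] == "yes"}
lower, upper, boundary = approximations(rows, conditions, rainy)
reduct = greedy_reduct(rows, conditions, "rain")

print(lower, upper, boundary)
print(reduct)
```

In this toy table the two inconsistent records fall into the boundary region, and the search stops once the chosen attributes reproduce the positive region of the full attribute set.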
The experiment framework is shown in the diagram below: