How can AI support data quality maintenance systems? How do you find out which method performs best? Is it possible to improve the results further? And how can complementary human and machine perception be put to use? This is an example of a successful implementation. Read my article on this topic here.
Title
Classifying Changes in the Consolidation Perimeter in the German Group Accounts Statistics Using Statistical Learning
Abstract
The Deutsche Bundesbank publishes aggregated rates of change for the revenue, the earnings before interest, taxes, depreciation and amortization (EBITDA), and the earnings before interest and taxes (EBIT) of the German non-financial groups listed in the Prime Standard of the Frankfurt Stock Exchange. Where there are changes in the consolidation perimeter, it is crucial to adjust these growth rates to obtain an unbiased impression of economic developments. However, an adjustment is only possible if it is known which groups have undergone a significant change in group structure. To find evidence of changes in the perimeter, different statistical learning approaches are tested and combined.
This is the first paper to classify changes in the consolidation perimeter for data quality maintenance. A combination of a random forest and a weighted nearest neighbors approach, both trained on oversampled data, provides the best results. The approach avoids overlooking changes in the consolidation perimeter in an efficient and affordable way.
Keywords: Machine Learning, Data Quality Management, Deep Learning, Statistical Learning, Group Accounts, Changes in the Consolidation Perimeter, Ensemble Learning, SMOTE
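To illustrate the kind of pipeline the abstract describes, here is a minimal sketch: SMOTE oversampling combined with a random forest and a distance-weighted nearest neighbors classifier, joined in a soft-voting ensemble. This is not the paper's code; all feature names, parameters, and the `build_perimeter_classifier` helper are illustrative assumptions.

```python
# Sketch (not the authors' implementation) of an oversampled ensemble of a
# random forest and a weighted nearest neighbors classifier.
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline as ImbPipeline
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.neighbors import KNeighborsClassifier


def build_perimeter_classifier(random_state=42):
    """Ensemble that flags groups with a likely change in the consolidation perimeter."""
    rf = ImbPipeline([
        # Oversample the rare "perimeter change" class before fitting.
        ("smote", SMOTE(random_state=random_state)),
        ("rf", RandomForestClassifier(n_estimators=500, random_state=random_state)),
    ])
    knn = ImbPipeline([
        ("smote", SMOTE(random_state=random_state)),
        # weights="distance" gives a weighted nearest neighbors vote.
        ("knn", KNeighborsClassifier(n_neighbors=5, weights="distance")),
    ])
    # Soft voting averages the predicted class probabilities of both models.
    return VotingClassifier(estimators=[("rf", rf), ("knn", knn)], voting="soft")


# Hypothetical usage: X holds per-group financial features, y marks whether a
# significant perimeter change occurred (1) or not (0).
# clf = build_perimeter_classifier()
# clf.fit(X_train, y_train)
# flagged = clf.predict(X_new)
```

In such a setup, oversampling matters because perimeter changes are the minority class, and the soft-voting ensemble lets the two complementary learners compensate for each other's blind spots.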