Decision trees: information gain vs. feature importance
I have built a decision tree in Python using scikit-learn. When I compute the results and display the tree (along with the 20 most important features), feature “A” is the most important and is also used as the root node.
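For reference, here is a minimal sketch of how this kind of comparison might be set up. It assumes a synthetic dataset from `make_classification` and made-up feature names (`f0`, `f1`, ...), since the actual data and model settings are not shown; the entropy criterion is also an assumption, chosen so that splits are picked by information gain.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real data (assumption: 30 features, 5 informative).
X, y = make_classification(
    n_samples=500, n_features=30, n_informative=5, random_state=0
)
feature_names = [f"f{i}" for i in range(X.shape[1])]  # hypothetical names

# criterion="entropy" makes each split maximize information gain.
clf = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)

# Impurity-based importances: total weighted impurity decrease per feature,
# summed over every node that splits on it, then normalized to sum to 1.
importances = clf.feature_importances_
top20 = np.argsort(importances)[::-1][:20]

# Node 0 of the fitted tree is the root; tree_.feature holds the index of
# the feature each node splits on (negative values mark leaves).
root_feature = clf.tree_.feature[0]

print("Root node splits on:", feature_names[root_feature])
for rank, idx in enumerate(top20, start=1):
    print(f"{rank:2d}. {feature_names[idx]}: {importances[idx]:.4f}")
```

The root feature is the one with the largest gain on the full training set, while `feature_importances_` accumulates the weighted impurity decrease over all nodes that use a feature, so the two can coincide, as with feature “A” here, but they are computed differently.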