Decision tree examples with solutions

In machine learning, a decision tree is a predictive model that maps observations about an item to conclusions about its target value. It's called a decision tree because it starts with a single decision point and branches out into a number of outcomes.


Real-World Example:

Let's apply this to predicting whether a customer will subscribe to a new service, based on their demographic and behavioral data.

  • Starting Point:
    • Features: Age, Income, Subscription History
  • Decision Nodes:
    • First node might be based on income: "Is income greater than $50,000?"
  • Branches:
    • If income is greater than $50,000, next node might be age: "Is age less than 40?"
    • If income is $50,000 or less, next node might be subscription history: "Has the customer subscribed to a similar service before?"
  • Further Decisions:
    • For those with high income and age less than 40, further decisions might include past subscription behavior.
    • For those with lower income, decisions might involve different factors like age or education level.
  • Leaf Nodes:
    • Eventually, each path leads to a leaf node that predicts whether the customer is likely to subscribe to the new service based on their unique combination of features.
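The walk-through above can be sketched as a hand-written function, before any library is involved. This is only an illustration: the thresholds ($50,000 and age 40) come from the text, but the function name `predict_subscription` and the outcome at each leaf are assumptions, since the text does not say what each leaf predicts.

```python
def predict_subscription(income, age, subscribed_before):
    """Walk a small hand-written decision tree.

    income is in dollars, age in years, subscribed_before is a bool.
    Leaf outcomes are hypothetical; a trained tree learns them from data.
    """
    if income > 50_000:
        # High-income branch: the next question is about age.
        if age < 40:
            # Further decision: past subscription behavior (assumed leaf rule).
            return bool(subscribed_before)
        # Assumed leaf: older, high-income customers are predicted to subscribe.
        return True
    # Income of $50,000 or less: the next question is subscription history.
    return bool(subscribed_before)


print(predict_subscription(60_000, 30, True))
print(predict_subscription(40_000, 30, False))
```

A trained decision tree has the same shape as this function; the difference is that the questions, thresholds, and leaf predictions are chosen from the training data rather than written by hand.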


Code:


# Import necessary libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical data (income in thousands, age in years, subscription history as binary)
data = {
    'Income': [55, 35, 75, 45, 60, 30, 80, 25],
    'Age': [25, 30, 40, 35, 27, 32, 45, 28],
    'SubscriptionHistory': [1, 0, 1, 0, 1, 0, 1, 0],  # 1 for subscribed before, 0 for not
    'WillSubscribe': [1, 0, 1, 0, 1, 0, 1, 0]  # Target variable: 1 for yes, 0 for no
}

# Create a DataFrame
df = pd.DataFrame(data)

# Separate features (X) and target variable (y)
X = df[['Income', 'Age', 'SubscriptionHistory']]
y = df['WillSubscribe']

# Split data into training and testing sets (70% train, 30% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Initialize Decision Tree classifier
clf = DecisionTreeClassifier(random_state=42)

# Fit the classifier to the training data
clf.fit(X_train, y_train)

# Visualize the decision tree rules
tree_rules = export_text(clf, feature_names=list(X.columns))
print("Decision Tree Rules:")
print(tree_rules)

# Predict on test data
y_pred = clf.predict(X_test)

# Print predicted values and actual values
print("\nPredicted Values:", y_pred)
print("Actual Values:", y_test.values)
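Printing predicted and actual values side by side works for a handful of rows, but a single accuracy figure is easier to read. The sketch below computes it in plain Python; the helper name `accuracy` and the example labels are made up for illustration and are not output from the model above. scikit-learn's `accuracy_score` from `sklearn.metrics` computes the same fraction.

```python
def accuracy(predicted, actual):
    """Return the fraction of predictions that match the actual labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if len(actual) == 0:
        raise ValueError("cannot compute accuracy on empty input")
    # Count positions where the prediction equals the true label.
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Made-up labels for illustration only, not results from the classifier.
example_predicted = [1, 0, 1]
example_actual = [1, 0, 0]
print(f"Accuracy: {accuracy(example_predicted, example_actual):.2f}")
```

With a test set of only a few rows, as in this toy dataset, the accuracy can swing widely from one split to another, so it should be read as a rough check rather than a reliable estimate.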



Example: Skin Health Prediction


import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Define the data manually for simplicity
data = {
    'itching': [1, 1, 1, 0, 0],
    'red_spots': [1, 0, 0, 1, 0],
    'rash': [0, 1, 0, 0, 0],
    'diagnosis': ['Allergic Reaction', 'Contact Dermatitis', 'Dry Skin', 'Insect Bite', 'Normal Skin Condition']
}

# Convert data into a pandas DataFrame
df = pd.DataFrame(data)

# Features and target
X = df[['itching', 'red_spots', 'rash']]  # Features
y = df['diagnosis']  # Target variable

# Initialize the Decision Tree Classifier
clf = DecisionTreeClassifier(random_state=42)

# Train the classifier
clf.fit(X, y)

# Function to predict diagnosis based on user input
def predict_diagnosis(itching, red_spots, rash):
    # Build a one-row DataFrame with the training column names so that
    # scikit-learn does not warn about missing feature names
    symptoms = pd.DataFrame([[itching, red_spots, rash]], columns=X.columns)
    prediction = clf.predict(symptoms)
    return prediction[0]

# Main function to interact with the user
def main():
    print("Welcome to the Skin Condition Diagnosis System!")
    print("Please enter your symptoms:")

    itching = int(input("Do you have itching? (0 for no, 1 for yes): "))
    red_spots = int(input("Do you have red spots? (0 for no, 1 for yes): "))
    rash = int(input("Do you have a rash? (0 for no, 1 for yes): "))

    diagnosis = predict_diagnosis(itching, red_spots, rash)
    print("Based on your symptoms:")
    print(f"Itching: {'Yes' if itching == 1 else 'No'}")
    print(f"Red Spots: {'Yes' if red_spots == 1 else 'No'}")
    print(f"Rash: {'Yes' if rash == 1 else 'No'}")
    print(f"The predicted diagnosis is: {diagnosis}")

if __name__ == "__main__":
    main()

