Building an algorithm to predict mortgage approval based on historical HMDA lending practices
Author: Brian Byrne
Uploaded: 2021-03-01
Views: 629
https://sites.google.com/view/vinegar...
Data Description
dir: debt payments to total income ratio;
hir: housing expenses to income ratio;
lvr: ratio of size of loan to assessed value of property;
ccs: consumer credit score;
mcs: mortgage credit score;
pbcr: public bad credit record;
dmi: denied mortgage insurance;
self: self-employed;
single: applicant is single;
uria: 1989 Massachusetts unemployment rate in the applicant's industry;
condominiom: property is a condominium;
black: race of applicant is black;
deny: mortgage application denied;
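The variables listed above come from the Boston HMDA sample. As a rough sketch of the preparation assumed in the code that follows, the snippet below reads the data from a CSV file, recodes the yes/no flags as 0/1 and holds out a test set with sklearn's train_test_split; the file name 'hdma.csv' and the yes/no coding are assumptions rather than part of the original material.
#Sketch of the assumed data preparation (file name and yes/no coding are assumptions)
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv('hdma.csv')                      #hypothetical file holding the Boston HMDA sample
yes_no = ['pbcr', 'dmi', 'self', 'single', 'condominiom', 'black', 'deny']
df[yes_no] = (df[yes_no] == 'yes').astype(int)    #recode yes/no flags to 1/0

X = df.drop(columns='deny')                       #12 predictors, including race
y = df['deny']                                    #1 = mortgage application denied

#Hold out a test set so the trained algorithms can be checked out of sample
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)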
Munnell, Tootell, Browne, and McEneaney (1996) at the Boston Fed examined mortgage lending in Boston to determine whether race played a significant role in who was approved for a mortgage. The primary econometric technique they relied upon was logistic regression, with race included as one of the predictors or independent variables. The coefficient on race showed a statistically significant negative impact on the probability of mortgage approval for minority applicants. This finding prompted considerable subsequent debate and discussion. Here we apply machine learning techniques of the type suggested by Varian (2014). The data consist of 2,380 observations on 12 predictors, one of which is race.
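Before turning to the machine learning models, the Boston Fed's baseline approach can be approximated with a standard logit of the deny indicator on the full set of predictors. The sketch below uses statsmodels and the data frame prepared above; it is illustrative only and does not reproduce the exact published specification.
#Sketch of a baseline logistic regression with race among the predictors (statsmodels assumed)
import statsmodels.api as sm

X_logit = sm.add_constant(X.astype(float))        #add an intercept term
logit_model = sm.Logit(y, X_logit).fit()

#The coefficient on 'black' is the quantity debated in Munnell et al. (1996)
print(logit_model.summary())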
We extend the analysis to consider how to train algorithms to automate the lending or mortgage approval process and then test the trained algorithms against actual out-of-sample data (a sketch of this out-of-sample check follows the training code below). We use the sklearn library and import a number of models, including Logistic Regression, SVM, K Nearest Neighbours, Decision Tree and Random Forest classifiers. We then use historical lending patterns to shape eligibility and predict mortgage approval. The algorithms do nothing more than attempt to replicate the historical lending patterns of loan officers. The lending algorithms created here are therefore not state of the art, but they do reflect historical norms, flawed or not. These benchmarks could nevertheless be used to track how lending patterns change.
#Helper that fits a set of classifiers on the training data and returns the fitted models
def models(X_train, y_train):

    #Using Logistic Regression Algorithm on the Training Set
    from sklearn.linear_model import LogisticRegression
    log = LogisticRegression(random_state = 0)
    log.fit(X_train, y_train)

    #Using the KNeighborsClassifier method of the neighbors class to use the Nearest Neighbour algorithm
    from sklearn.neighbors import KNeighborsClassifier
    knn = KNeighborsClassifier(n_neighbors = 5, metric = 'minkowski', p = 2)
    knn.fit(X_train, y_train)

    #Using the SVC method of the svm class to use the Support Vector Machine algorithm
    from sklearn.svm import SVC
    svc_lin = SVC(kernel = 'linear', random_state = 0)
    svc_lin.fit(X_train, y_train)

    #Using the SVC method of the svm class to use the Kernel SVM algorithm
    svc_rbf = SVC(kernel = 'rbf', random_state = 0)
    svc_rbf.fit(X_train, y_train)

    #Using the GaussianNB method of the naive_bayes class to use the Naive Bayes algorithm
    from sklearn.naive_bayes import GaussianNB
    gauss = GaussianNB()
    gauss.fit(X_train, y_train)

    #Using the DecisionTreeClassifier method of the tree class to use the Decision Tree algorithm
    from sklearn.tree import DecisionTreeClassifier
    tree = DecisionTreeClassifier(criterion = 'entropy', random_state = 0)
    tree.fit(X_train, y_train)

    #Using the RandomForestClassifier method of the ensemble class to use the Random Forest Classification algorithm
    from sklearn.ensemble import RandomForestClassifier
    forest = RandomForestClassifier(n_estimators = 10, criterion = 'entropy', random_state = 0)
    forest.fit(X_train, y_train)

    #Print model accuracy on the training data
    print('[0]Logistic Regression Training Accuracy:', log.score(X_train, y_train))
    print('[1]K Nearest Neighbour Training Accuracy:', knn.score(X_train, y_train))
    print('[2]Support Vector Machine (Linear Classifier) Training Accuracy:', svc_lin.score(X_train, y_train))
    print('[3]Support Vector Machine (RBF Classifier) Training Accuracy:', svc_rbf.score(X_train, y_train))
    print('[4]Gaussian Naive Bayes Training Accuracy:', gauss.score(X_train, y_train))
    print('[5]Decision Tree Classifier Training Accuracy:', tree.score(X_train, y_train))
    print('[6]Random Forest Classifier Training Accuracy:', forest.score(X_train, y_train))

    return log, knn, svc_lin, svc_rbf, gauss, tree, forest
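The training accuracies above only measure how closely each algorithm reproduces the loan officers' decisions in sample. A minimal sketch of the out-of-sample check described earlier, assuming the models helper and the held-out X_test and y_test from the split above, is:
#Sketch of the out-of-sample test (assumes the models helper and the test split defined above)
from sklearn.metrics import accuracy_score, confusion_matrix

fitted = models(X_train, y_train)
for i, clf in enumerate(fitted):
    y_pred = clf.predict(X_test)
    print('Model', i, 'Test Accuracy:', accuracy_score(y_test, y_pred))
    print(confusion_matrix(y_test, y_pred))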