Data Analysis, Machine Learning and Neural Network technologies

  • By Daniel
  • 13 Sep, 2021
  • 3 min read

Technologies: Excel, Python, Streamlit, seaborn, pyplot, Pandas, NumPy, Matplotlib, scikit-learn, SMOTE, Jupyter Notebook, Markdown, SAS Enterprise Miner, GitHub, VS Code, HTML5, CSS3, Bootstrap

Description: This project applies the theory and practice of Data Analysis and Machine Learning using Python and its libraries, alongside other technologies such as SAS Enterprise Miner and Streamlit, to show step by step why each stage of the process matters. The dataset is presented, analysed, cleaned, organized, balanced and saved. The models are then created, trained, used for forecasting and compared with each other, using various metrics to determine which of them is the best choice for the given problem and to explain why. All the results are presented on a web page built with Streamlit, where the user can navigate between several explanatory sections written to be understandable even without prior knowledge of the subject.
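To make the balancing step more concrete, here is a minimal sketch of the idea behind SMOTE, which the technologies list names: new minority-class rows are created by interpolating between an existing minority row and one of its nearest minority neighbours. This is a simplified plain-Python illustration, not the code used in the project and not the imbalanced-learn implementation; the function name `smote_like_oversample` and the toy rows are assumptions made up for the example.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic rows by interpolating between a minority row
    and one of its k nearest minority neighbours (SMOTE-style)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority rows to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority rows by Euclidean distance to the base row.
        others = [row for j, row in enumerate(minority) if j != i]
        others.sort(key=lambda row: math.dist(base, row))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position on the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data, invented for illustration: 8 majority rows vs 3 minority rows.
majority = [[float(x), float(x % 3)] for x in range(8)]
minority = [[10.0, 10.0], [11.0, 10.5], [10.5, 11.5]]

new_rows = smote_like_oversample(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_rows
print(len(majority), len(balanced_minority))
```

Because every synthetic row lies on a segment between two real minority rows, the new rows stay inside the region the minority class already occupies. In a real project, balancing is normally applied to the training split only, so that the test data keeps its original class distribution.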

Chosen dataset: The dataset used for this project comes from a marketing campaign. The goal is to predict likely customers for the campaign and so reduce unnecessary spending. The initial dataset has 1500 rows and 19 columns, with two possible targets. The dataset and this project comply with the General Data Protection Regulation (GDPR).

Workflow of the entire process and stages:

  • Data Understanding
  • Implementation
  • Summary of Metrics and Comparison
  • Visualization

Each of these stages is addressed in more depth in its own subtopics.
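As an illustration of the "Summary of Metrics and Comparison" stage, the sketch below derives accuracy, precision, recall and F1 from confusion-matrix counts and ranks two candidates. The names `model_a` and `model_b` and all the counts are hypothetical, chosen only to show how the metrics can disagree; they are not results from this project.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives


def summarize(c):
    """Return accuracy, precision, recall and F1 for one confusion matrix."""
    total = c.tp + c.fp + c.fn + c.tn
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (c.tp + c.tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts on a 200-row test set (invented for the example).
results = {
    "model_a": summarize(Confusion(tp=40, fp=10, fn=20, tn=130)),
    "model_b": summarize(Confusion(tp=50, fp=30, fn=10, tn=110)),
}

for name, metrics in results.items():
    print(name, {key: round(value, 3) for key, value in metrics.items()})

# Rank by F1 here; another metric may suit a different business goal.
best = max(results, key=lambda name: results[name]["f1"])
print("best by F1:", best)
```

In this toy case `model_b` finds more of the real customers (higher recall) but contacts more people who will not respond (lower precision). That trade-off is why a campaign that wants to reduce unnecessary spending should not rely on a single metric.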

Some of the models used in both Python and SAS Enterprise Miner:

  • Decision Tree (DT)
    • DecisionTreeClassifier (DTC)
    • Random Forest (RF)
  • Logistic Regression (LR)
  • Neural Network (NN)
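On the Python side, the model families listed above could be trained and compared with scikit-learn along these lines. This is a sketch under stated assumptions: the data is synthetic (generated with `make_classification`, not the marketing dataset), and the hyperparameters and the choice of F1 as the comparison metric are arbitrary illustrations rather than the project's actual settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in data; NOT the project's dataset.
X, y = make_classification(
    n_samples=1500, n_features=10, weights=[0.8, 0.2], random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Hyperparameters are illustrative assumptions, not tuned values.
models = {
    "DT": DecisionTreeClassifier(max_depth=5, random_state=42),
    "RF": RandomForestClassifier(n_estimators=100, random_state=42),
    # LR and NN are sensitive to feature scale, so they get a scaler.
    "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "NN": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=42),
    ),
}

# Fit each model on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = f1_score(y_test, model.predict(X_test))

for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: F1 = {score:.3f}")
```

Keeping the models in one dictionary means every candidate is fitted and scored on exactly the same split, which keeps the comparison fair.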

Visualization: The results are visualized with Streamlit, a Python library created so that a Data Analyst can present views and results without having to learn the complexities of other languages or frameworks, such as JS or Django. Using the library, the Analyst can easily build an elegant and clear view on a web page.
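A page of this kind might look roughly like the sketch below, which offers a sidebar choice between a table and a bar chart of model scores. The function names, the section labels and the scores in `EXAMPLE_SCORES` are all placeholders invented for the example; the real application's layout and numbers are not shown in this post.

```python
def to_rows(scores):
    """Turn {model: score} into table rows sorted from best to worst."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [{"model": name, "f1": score} for name, score in ranked]


def render_page(scores):
    """Draw a small Streamlit page with a sidebar to switch sections."""
    # Imported here so to_rows() can be reused without Streamlit installed.
    import pandas as pd
    import streamlit as st

    st.title("Model comparison")
    section = st.sidebar.selectbox("Section", ["Summary table", "Bar chart"])
    table = pd.DataFrame(to_rows(scores)).set_index("model")
    if section == "Summary table":
        st.dataframe(table)
    else:
        st.bar_chart(table)


# Placeholder scores, invented for the sketch (not project results).
EXAMPLE_SCORES = {"DT": 0.61, "RF": 0.70, "LR": 0.66, "NN": 0.68}
```

To try it, save the block under a name such as app.py (a hypothetical filename), add `render_page(EXAMPLE_SCORES)` as the last line, and start it with `streamlit run app.py`. The whole page is plain Python, with no HTML, JS or routing code.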