Please use this identifier to cite or link to this item:
https://hdl.handle.net/10356/165048
Title: | Machine learning for mixed data |
Authors: | Zhu, Yixuan |
Keywords: | Engineering::Electrical and electronic engineering |
Issue Date: | 2023 |
Publisher: | Nanyang Technological University |
Source: | Zhu, Y. (2023). Machine learning for mixed data. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/165048 |
Abstract: | People constantly deal with mixed data, whether in scientific research, industrial production or daily life. As computer technology and the performance of machine learning models continue to advance, the demand for processing mixed data keeps growing, and classification is one of the typical requirements. In this dissertation, we normalize the numeric data in mixed datasets using relevant mathematical tools and transform the categorical data into numeric form with three embedding methods: one-hot encoding, TF-IDF, and neural-network-based embedding. We then run classification experiments on the transformed data with five machine learning models: Logistic Regression (LR), K-Nearest Neighbors (KNN), Random Forest (RF), Gradient Boosting Decision Tree (GBDT) and XGBoost. We collect the performance metrics of each model at its best result, then compare the classification performance of the three embedding methods and the five models and discuss how they relate to one another. The result is a complete pipeline that applies the embedding transformation to categorical data and classifies mixed data. |
URI: | https://hdl.handle.net/10356/165048 |
Schools: | School of Electrical and Electronic Engineering |
Fulltext Permission: | restricted |
Fulltext Availability: | With Fulltext |
Appears in Collections: | EEE Theses |
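
The preprocessing and classification steps named in the abstract can be illustrated with a small, self-contained sketch. This is not the thesis code (the full text is restricted). It assumes min-max scaling for the numeric normalization, which the abstract only describes as "relevant mathematical tools", shows only one-hot encoding out of the three embedding methods, and uses a 1-nearest-neighbor rule as a stand-in for KNN. The toy columns, labels and function names are hypothetical.

```python
from math import dist


def min_max_normalize(column):
    """Scale a list of numbers to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def one_hot_encode(column):
    """Return 0/1 indicator vectors plus the sorted category list."""
    categories = sorted(set(column))
    index = {c: i for i, c in enumerate(categories)}
    vectors = []
    for value in column:
        vec = [0.0] * len(categories)
        vec[index[value]] = 1.0
        vectors.append(vec)
    return vectors, categories


def encode_mixed(numeric_columns, categorical_columns):
    """Build one numeric feature vector per row from column-wise mixed data."""
    blocks = [[[v] for v in min_max_normalize(col)] for col in numeric_columns]
    blocks += [one_hot_encode(col)[0] for col in categorical_columns]
    # zip(*blocks) walks the rows; concatenate each row's per-column pieces.
    return [sum(pieces, []) for pieces in zip(*blocks)]


def nearest_neighbor_label(train_X, train_y, query):
    """Predict the label of the closest training row (Euclidean distance)."""
    best = min(range(len(train_X)), key=lambda i: dist(train_X[i], query))
    return train_y[best]


# Hypothetical toy dataset: one numeric and one categorical column.
ages = [20, 30, 40, 50]
colors = ["red", "blue", "red", "green"]
labels = ["yes", "no", "yes", "no"]

features = encode_mixed([ages], [colors])

# Hold out row 2 as the query and classify it from the remaining rows.
# For brevity the scaler and encoder saw every row; a real experiment would
# fit them on the training rows only.
train_X = [features[i] for i in (0, 1, 3)]
train_y = [labels[i] for i in (0, 1, 3)]
prediction = nearest_neighbor_label(train_X, train_y, features[2])
print(prediction)
```

Each encoded row is the scaled numeric value followed by the one-hot indicators, so any classifier that expects purely numeric input can consume it. The TF-IDF and neural-network embeddings mentioned in the abstract would replace `one_hot_encode` at the same point in the pipeline.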
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
Zhu Yixuan_Dissertation_Machine learning for mixed data.pdf (Restricted Access) | | 5.78 MB | Adobe PDF |
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.