Dimensionality reduction on automotive data for meaningful insights

Short description

Exploring an automotive data set with principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE) to uncover structure in the data, such as clusters related to the number of engine cylinders

Background

This is placeholder text.

Objective

This is placeholder text.

Client Name

MIT Applied Data Science Program

Release Date

February 8, 2022

Project Types

Dimensionality Reduction, Data Science

Skills

Data Preprocessing, Exploratory Data Analysis, Correlation Analysis, Dimensionality Reduction, Principal Component Analysis (PCA), t-Distributed Stochastic Neighbor Embedding (t-SNE)

Tools

Python 3, Jupyter Notebooks, JetBrains DataSpell, Anaconda

Data set

  • Data type: string, float, int

Results

This is placeholder text.

  • This is placeholder text.

Principal component analysis (PCA) scatter plot of PC1 and PC2, with points plotted in blue

Principal component analysis (PCA) scatter plot of PC1 and PC2, with the number of engine cylinders represented by gradient-based shades of purple

t-SNE scatter plot displaying three distinct clusters of similarity within the data

t-SNE scatter plot displaying the three clusters of data points in relation to the number of engine cylinders
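
The plots described above could be produced along the following lines. This is only a minimal sketch, assuming scikit-learn and NumPy are available; it is not the project's actual notebook code. The data here is synthetic stand-in data, and the feature values, the `cylinders` labels, the t-SNE settings, and the `plot_embedding` helper are all hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (NOT the project's data set): three groups of
# vehicles whose numeric features shift with a hypothetical cylinder count.
rng = np.random.default_rng(0)
n_per_group = 50
cylinders = np.repeat([4, 6, 8], n_per_group)

# Hypothetical feature means per group, on deliberately different scales.
centers = {
    4: [2.0, 90.0, 2200.0],
    6: [3.5, 110.0, 3200.0],
    8: [5.5, 160.0, 4100.0],
}
spread = np.array([0.3, 10.0, 200.0])
X = np.vstack(
    [rng.normal(centers[c], spread, size=(n_per_group, 3)) for c in (4, 6, 8)]
)

# Standardize first so no single feature dominates because of its units.
X_scaled = StandardScaler().fit_transform(X)

# PCA: project onto the first two principal components (PC1, PC2).
pca = PCA(n_components=2)
pcs = pca.fit_transform(X_scaled)

# t-SNE: nonlinear 2-D embedding; perplexity=30 is an assumed setting.
tsne = TSNE(n_components=2, perplexity=30, random_state=0)
embedding = tsne.fit_transform(X_scaled)


def plot_embedding(points, labels, title):
    """Scatter a 2-D embedding, shading points by label (hypothetical helper)."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    fig, ax = plt.subplots()
    scatter = ax.scatter(points[:, 0], points[:, 1], c=labels, cmap="Purples")
    ax.set_title(title)
    fig.colorbar(scatter, ax=ax, label="Number of cylinders")
    return fig
```

With matplotlib installed, calling `plot_embedding(pcs, cylinders, "PCA: PC1 vs. PC2")` or `plot_embedding(embedding, cylinders, "t-SNE")` would draw scatter plots shaded by cylinder count, in the style of the figures above. Note that t-SNE output depends on the perplexity and the random seed, so cluster shapes may differ between runs with other settings.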