I used data from Kaggle. I chose this dataset because I love cars, although my attention usually stays on hypercars 🚘. I am planning to buy a useful car for traveling and commuting, and since I am still a student, it has to be a practical one. In the future, I will probably be able to buy my dream car. For now, I will focus only on cars whose price can win my parents over. This visualization therefore mainly compares the selling prices of used cars to see what general patterns the data shows, so that I have a clear idea when I go to pick a car. (This dataset covers the used cars listed on cardekho.com, so it serves as a reference rather than a recommendation.)
import pandas as pd
data=pd.read_csv('CAR DETAILS FROM CAR DEKHO.csv')
In this dataset, there are 8 different columns:
See the head of the dataset
data.head()
|   | name | year | selling_price | km_driven | fuel | seller_type | transmission | owner |
|---|---|---|---|---|---|---|---|---|
| 0 | Maruti 800 AC | 2007 | 60000 | 70000 | Petrol | Individual | Manual | First Owner |
| 1 | Maruti Wagon R LXI Minor | 2007 | 135000 | 50000 | Petrol | Individual | Manual | First Owner |
| 2 | Hyundai Verna 1.6 SX | 2012 | 600000 | 100000 | Diesel | Individual | Manual | First Owner |
| 3 | Datsun RediGO T Option | 2017 | 250000 | 46000 | Petrol | Individual | Manual | First Owner |
| 4 | Honda Amaze VX i-DTEC | 2014 | 450000 | 141000 | Diesel | Individual | Manual | Second Owner |
Get all columns
data.columns
Index(['name', 'year', 'selling_price', 'km_driven', 'fuel', 'seller_type', 'transmission', 'owner'], dtype='object')
Using Altair to visualize
import altair as alt
alt.Chart(data).mark_line().encode(x='km_driven',y='selling_price').interactive()
At first, I used a line chart to see the pattern between selling price and km driven. In general, the fewer km a car has been driven, the higher its price. The pattern does not always hold, which means the price is also influenced by other features.
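A line chart over thousands of points can be hard to read, so one alternative is to summarise the price within mileage bands. The sketch below only demonstrates the mechanics: it uses the five rows printed by data.head() above, and both the helper name `median_price_by_km` and the band edges are my own assumptions, not part of the original notebook. Five rows are far too few to show the real trend, so the same call should be run on the full `data` frame.

```python
import pandas as pd

# The five rows shown by data.head(); replace `sample` with `data` in practice.
sample = pd.DataFrame({
    "km_driven": [70000, 50000, 100000, 46000, 141000],
    "selling_price": [60000, 135000, 600000, 250000, 450000],
})

def median_price_by_km(df, edges):
    """Median selling price within each km_driven band (hypothetical helper)."""
    bands = pd.cut(df["km_driven"], bins=edges)
    return df.groupby(bands, observed=True)["selling_price"].median()

# Band edges are arbitrary and chosen only for illustration.
summary = median_price_by_km(sample, edges=[0, 75000, 150000])
print(summary)
```

The median is used instead of the mean so that a few very expensive cars do not dominate a band.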
Using a bar chart to visualize the relation between year and seller type.
alt.Chart(data).mark_bar().encode(
x='year',
y='count()',
color='seller_type',
tooltip=['name','year','selling_price','seller_type']).interactive()
From the chart, we can see that over time more and more people sell their cars through dealers, although most still sell individually. The article Used car market continues to boom, beating slowdown blues says, 'Compared with 2018, the used car market will grow by about 12% in 2019, with sales of about 4.2 million vehicles', which is consistent with what the last graph shows, since few people sell a car they have used for less than a year. The sales volume of cars produced between 2012 and 2018 has increased significantly. However, due to the limitations of the data, it is impossible to determine in which year the cars were sold. Also, because this data comes from only one used car sales website, it cannot represent the whole market, especially since the larger group of sellers are individuals, who can sell their cars on any website.
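The counts behind the stacked bars can also be read as a table. This is a minimal sketch using `pd.crosstab` on the five rows from data.head(); on the full `data` frame the same two lines would give one column per seller type, which makes the dealer share per year easy to compare.

```python
import pandas as pd

# Year and seller_type for the five rows shown by data.head().
sample = pd.DataFrame({
    "year": [2007, 2007, 2012, 2017, 2014],
    "seller_type": ["Individual"] * 5,
})

# Number of listings per year and seller type (same numbers as the bar chart).
counts = pd.crosstab(sample["year"], sample["seller_type"])

# Share of each seller type within a year, so years with few listings stay comparable.
share = counts.div(counts.sum(axis=1), axis=0)
print(counts)
print(share)
```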
Using bubble chart to illustrate the relation among year, transmission, price and km driven.
#alt.data_transformers.disable_max_rows()
alt.Chart(data).mark_circle().encode(x='km_driven',
y='selling_price',
color=alt.Color('year', scale=alt.Scale(scheme='Spectral')),
tooltip = ['name','selling_price','owner','year'],
size='transmission'
).interactive()
In this graph, we can see that the less a car has been driven and the newer it is, the higher its selling price. In general, automatic cars sell for more than manual ones. A car from 2005 or earlier is not worth much, even if it has not been driven much.
From the catplot above, it is observed that age, brand, body type, engine capacity and transmission system have the strongest influence on the predicted price in machine learning models, which supports the idea that the year the car was produced and, in this case, the transmission system can affect the selling price in the Indian used car market.
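One way to check the effect of year and transmission directly in this dataset is to group the prices by both columns. The sketch below assumes nothing beyond the columns already listed; it runs on the five rows from data.head(), which all happen to be manual cars, so the automatic group only appears when the full `data` frame is used.

```python
import pandas as pd

# Relevant columns from the five rows shown by data.head().
sample = pd.DataFrame({
    "year": [2007, 2007, 2012, 2017, 2014],
    "transmission": ["Manual"] * 5,
    "selling_price": [60000, 135000, 600000, 250000, 450000],
})

# Median selling price for every (transmission, year) combination.
by_group = sample.groupby(["transmission", "year"])["selling_price"].median()
print(by_group)
```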
Interactive graphs comparing the year, fuel, selling price and km driven.
# create a selection
selection = alt.selection(type="multi", fields=['year'])
# set up a base chart with a shared size
base = alt.Chart(data).properties(width=250, height=250)
# create a histogram of cars per year
hist = base.mark_bar(color="gray").encode(
x='year',
y='count()',
tooltip=['year','count()']
).add_selection(selection) # make the histogram bars selectable
scat = base.mark_circle().encode(
x='km_driven',
y='selling_price',
tooltip=['name','fuel','transmission','year','owner','selling_price'],
color=alt.Color('fuel', scale=alt.Scale(scheme='rainbow'))
).transform_filter(selection).interactive()
# show the 2 graphs
hist | scat
In the graphs above, we can see that most of the cars were built between about 2012 and 2018. By clicking on each year, we can see that diesel cars usually have a higher selling price than the others. Also, the newer the car, the wider the range of prices it can sell for, the same as in the last graph.
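Clicking a year in the histogram is equivalent to filtering the table by that year before comparing fuels. A rough, non-interactive version of that step is sketched below; the helper `median_price_by_fuel` is a hypothetical name of mine, and the input is again just the five rows from data.head().

```python
import pandas as pd

# Relevant columns from the five rows shown by data.head().
sample = pd.DataFrame({
    "year": [2007, 2007, 2012, 2017, 2014],
    "fuel": ["Petrol", "Petrol", "Diesel", "Petrol", "Diesel"],
    "selling_price": [60000, 135000, 600000, 250000, 450000],
})

def median_price_by_fuel(df, years):
    """Keep only the selected years, then take the median price per fuel."""
    picked = df[df["year"].isin(years)]
    return picked.groupby("fuel")["selling_price"].median()

petrol_2007 = median_price_by_fuel(sample, [2007])
diesel_2012_2014 = median_price_by_fuel(sample, [2012, 2014])
print(petrol_2007)
print(diesel_2012_2014)
```

Passing a list of years mirrors the multi-selection in the chart, where several bars can be picked at once.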
You might have noticed that the sales in 2020 are abnormal compared with previous years, which show an increasing trend. I searched the web, and here is what I found: because of COVID, sales in the country's auto industry declined for most of fiscal year 2020. With the economy under severe pressure, discretionary purchases such as cars or real estate are hit hard. As the article mentioned, not a single car was sold in India in April, but the end of the lockdown in June would boost the ownership of used cars. More people will choose to sell their cars to raise cash, so they sell them even when the cars are new.
The visualization below shows the same pattern. The year the car was built and the distance it has been driven significantly affect the selling price. If the car is newer or has been driven a shorter distance, it will have a higher range of selling prices.
The article Used car market in India, online start-ups organising the sector shows that, as used cars gain acceptance, more and more consumers choose to buy them. With rising disposable income and urbanization, customers no longer prefer to keep their cars for several years, but are always eager to upgrade to the latest models. As the premium used car market gains more and more new cars, it offers additional luxury, comfort and prestige. The second-hand luxury car market is experiencing unprecedented growth. More and more people choose high-quality second-hand cars to satisfy their enthusiasm for owning a luxury car. From the chart, more and more second-hand luxury cars and low-mileage second-hand cars appear in the market, which reflects people's increasing desire to change cars.
From the visualization below, we can get a general idea of which used cars customers might purchase. Historically, customers are not only looking for the cheapest car. Instead, luxury brands such as Volkswagen, BMW, Audi and Mercedes-Benz also have a good market.
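The dataset has no brand column, so comparing brands needs one to be derived from `name`. A simple sketch, assuming the brand is always the first word of the name, is shown here on the five rows from data.head(). That assumption is mine: it holds for these rows, but any multi-word brand name would be cut to its first word and would need special handling.

```python
import pandas as pd

# Name and price from the five rows shown by data.head().
sample = pd.DataFrame({
    "name": [
        "Maruti 800 AC",
        "Maruti Wagon R LXI Minor",
        "Hyundai Verna 1.6 SX",
        "Datsun RediGO T Option",
        "Honda Amaze VX i-DTEC",
    ],
    "selling_price": [60000, 135000, 600000, 250000, 450000],
})

# Assumption: the first word of `name` is the brand.
sample["brand"] = sample["name"].str.split().str[0]

# Number of listings and median price per brand.
brand_counts = sample["brand"].value_counts()
brand_price = sample.groupby("brand")["selling_price"].median()
print(brand_counts)
print(brand_price)
```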
If people want to buy a car in India, or want some reference points for selecting one, here is what the data shows: