Using Python To Defeat The Car Market
November 2022 (1824 Words, 11 Minutes)
Background
Purchasing a vehicle has become a daunting, time-consuming task since the height of the COVID-19 pandemic. The semiconductor shortage caused by interruptions in complex supply chains has inhibited the rate of vehicle production worldwide. At the time of writing, US Light Vehicle Sales remain down by over 13% and demand is expected to exceed supply into 2025 according to Haig Partners. Dealerships across the nation added additional markup beyond the Manufacturer Suggested Retail Price (MSRP) to make up for the reduced sales volume at the expense of the consumer. Excessive market adjustment fees combined with mandatory dealer-installed options have plagued the new vehicle market in the US since the beginning of the supply issues.
Despite it being an awful time to purchase a vehicle, it has become inevitable for many Americans - myself included. Early this year, I embarked on a journey to purchase a new vehicle out of necessity. Since it is a large, important purchase that I plan to keep for a long time, I was looking for something fun, practical, reasonably priced, and something that I wouldn’t outgrow anytime soon. The resulting decision led me to the Hyundai Elantra N.

The Fun Begins
Upon visiting the Hyundai USA website, I went to the “Inventory” section to see what my options were. However, the website only gives a maximum search radius of 250 miles. Unfortunately, in today’s market, consumers may need to travel in excess of that in order to find a vehicle without unnecessary markups. In order to efficiently discover inventory outside of that search radius, I needed to figure out how the site was getting the information it displays to the user.

This is when I first began to examine the web application in greater detail. Given that the search radius was a user-defined variable, I wondered if it is possible to alter its value to exceed the maximum allowed by Hyundai. Using the browser’s built-in developer tools to examine network traffic, I filtered the traffic using one of the likely parameters of 250 and noticed an HTTP GET request to “vehicleList.json” which includes our expected parameters. The response includes a JSON object containing dealerships, and those dealerships contain vehicles. This is how the web application fetches all of the available inventory within the search radius.

The “vehicleList.json” API appears to take 4 parameters: the user’s postal code, the year/model name of the vehicle, and the search radius. If we want to build a tool that can help us track Elantra N inventory nationwide, we must see if we can modify the search radius value successfully. If they are validating the user input on the server side before processing the request, the API call would be denied and we would be forced to find another solution.

When we try to directly visit the API, we are met with an HTTP 403 Error indicating access denied. Comparing the request made from the Inventory Search page versus the request made from directly visiting the API, the only major difference I noted between the two is the lack of the “referer” header in the direct visit. This indicates that Hyundai’s API likely uses this header as a means to detect and prevent unintelligent scraping. I went ahead and implemented this basic request programmatically using the Python Requests library and it worked. The same data that I saw through visiting the website was now saved as a variable that I could control in my Python code. Oh, and I confirmed that adding extra zeros to the end of the radius works. I now had the ability to query the inventory for the entire United States with one HTTP request.
import requests
url = "https://www.hyundaiusa.com/var/hyundai/services/inventory/vehicleList.json?zip=90210&year=2023&model=Elantra-N&radius=10"
ua = "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.131 Safari/537.36"
ref = "https://www.hyundaiusa.com/us/en/inventory-search/vehicles-list?model=Elantra%20N&year=2023"
headers = {'User-Agent': ua, 'Referer': ref}
resp = requests.get(url, headers=headers)
print(resp.json())
{'status': 'SUCCESS', 'data': [{'modelYear': 2023, 'yrSerCd': '20234', 'dealerInfo': [{'dealerCd': 'CA393', 'dealerNm': 'Keyes Hyundai Of Van Nuys', 'dealerEmail': None, 'dealerUrl': 'www.keyeshyundai.com', 'address1': '5746 Van Nuys Boulevard', 'address2': '', 'city': 'Van Nuys', 'state': 'CA', 'zip': '91401', 'phone': '747-233-7900', 'fax': None, 'region': 'WE', 'latitude': 34.17514, 'longitude': -118.448165, 'distance': 6.34, 'shopperAssurance': 'N', 'IsPMADealer': 1, 'vehicles': [{'vin': 'KMHLW4AK9PU009801', 'modelNm': 'Elantra N', 'trimDesc': 'N', 'modelCd': '4N4A3FT5', 'price': '$33,775.00', 'exteriorColorCd': 'S3B', 'interiorColorCd': 'NNB', 'drivetrainDesc': 'FRONT WHEEL DRIVE', 'transmissionDesc': 'MANUAL', 'TotalPackages': 0, 'TotalOptions': 0, 'packages': None, 'colors': [{'SAPExterioColorCode': 'S3B', 'ExtColorLongDesc': 'PHANTOM BLACK', '360ImagePaths': [{'ImagePath': '/content/dam/hyundai/us/com/image/2023/elantran/n/exterior/base/phantom-black/360/', 'PackageID': 'P1'}]}], 'PlannedDeliveryDate': '2022-11-01T07:00:00', 'inventoryStatus': 'DS'}]}, {'dealerCd': 'CA01L', 'dealerNm': 'Hyundai Of Glendale', 'dealerEmail': None, 'dealerUrl': 'www.glendalehyundai.com', 'address1': '411 South Brand Blvd', 'address2': '', 'city': 'Glendale', 'state': 'CA', 'zip': '91204', 'phone': '818-243-4110', 'fax': '818-243-4646', 'region': 'WE', 'latitude': 34.14081192016602, 'longitude': -118.2553100585938, 'distance': 9.32, 'shopperAssurance': 'N', 'IsPMADealer': 0, 'vehicles': [{'vin': 'KMHLW4AK2PU009946', 'modelNm': 'Elantra N', 'trimDesc': 'N', 'modelCd': '4N4A3FT5', 'price': '$33,940.00', 'exteriorColorCd': 'S3B', 'interiorColorCd': 'NNB', 'drivetrainDesc': 'FRONT WHEEL DRIVE', 'transmissionDesc': 'MANUAL', 'TotalPackages': 0, 'TotalOptions': 0, 'packages': None, 'colors': [{'SAPExterioColorCode': 'S3B', 'ExtColorLongDesc': 'PHANTOM BLACK', '360ImagePaths': [{'ImagePath': '/content/dam/hyundai/us/com/image/2023/elantran/n/exterior/base/phantom-black/360/', 'PackageID': 'P1'}]}], 'PlannedDeliveryDate': '2022-10-30T07:00:00', 'inventoryStatus': 'DS'}]}]}]}
Data Parsing, Storage, and Analysis
Armed with the ability to programmatically fetch information from the server, it was time to make it usable. This meant first analyzing its structure to determine which information is actually important to me so I could determine how to extract it. I used an online JSON parser to help make sense of the data - it’s not quite JSON:API compliant but it is pretty close. There is a ‘status’ key whose value indicates whether or not the request was successful. There is also a JSON object which contains an array named ‘dealerInfo’ which contains objects for each dealership that has at least one Elantra N allocated to it. Each dealer object contains an array called ‘vehicles’ and each vehicle is represented as a JSON object including data such as the VIN, color, inventory status, and more.
I decided to use SQLite to provide long-term storage for the data obtained via the API since it is lightweight and simple to integrate into a Python project. The database contains two tables - one for storing dealership information using the dealer code as the primary key, and one for storing each individual vehicle using the VIN as the primary key.
Storing the data allows us to analyze historic trends over the course of the vehicle’s production that we would normally not be able to access. For example, by looking at the last five digits of each VIN, I know that there have been about 11k units produced at the time of writing. Because I have been storing this data in the SQLite database, I know that at least 2k of those were destined for the US. That said, this number is a lower bound since this method is only to track the subset of US-bound inventory which has been listed on the website.
I was also able to determine that about 74% of sampled US-bound Elantra N vehicles use automatic (DCT) transmissions, while the other 26% are equipped with a manual. Initially, this started off closer to 66-33% split but the DCT option gained a larger share of the market. Also, the distribution of vehicles by color is as follows: Phantom Black (27%), Ceramic White (22%), Intense Blue (18%), Cyber Gray (18%), and Performance Blue (14%). Anecdotally speaking, the colors with the lowest production numbers seem to be the most sought after by far so it is interesting to see that they are the rarest. The distribution of vehicles per state somewhat follows the order of states by population, but the vehicle per capita ratio does not align as well with California, Texas, and Florida in the lead with 190, 182, and 132 units respectively.
Beating the Odds
Purchasing a new car in these market conditions and in a rare configuration (Performance Blue paint, manual transmission) would be no easy task. However, armed with up-to-date information of the market and some handy alerting which I implemented, I was able to defy the odds and find the exact configuration I wanted just a few months after US deliveries had begun. And I’ve been loving it ever since.
