In this research project, I investigated how weather patterns in Brazil influence global Arabica coffee futures prices. Brazil accounts for about 40% of the world's coffee production and is the leading producer of Arabica coffee. Consequently, variations in Brazil's production capacity significantly impact coffee prices worldwide.
Using daily time series data and machine learning models, I predicted price trends from temperature and precipitation patterns in key Brazilian coffee-growing regions. This research illustrates how climate factors feed through to commodity prices in global markets.
Unlike stocks, which are influenced by numerous factors including management changes, geopolitical events, and investor sentiment, commodity prices like coffee primarily operate on simple supply and demand principles. Coffee consumption is relatively stable and gradually increasing worldwide, creating a predictable demand side.
Latin America, particularly Brazil, dominates global coffee production. Weather disruptions in these regions have substantial impacts on worldwide supply, directly affecting prices. With less geopolitical interference compared to commodities like petroleum, coffee prices provide a cleaner dataset for studying how climate factors influence agricultural markets.
Arabica coffee thrives within a narrow temperature range of roughly 18°C to 21°C. My data analysis revealed that coffee-growing regions in Brazil experienced temperatures ranging from 15°C to 24°C, with occasional spikes reaching 30°C. Conditions outside the optimal range complicate coffee production.
Additionally, precipitation patterns showed cyclical behavior with minimal rainfall most of the year but significant increases during December and January. These weather variations directly impact harvesting conditions and crop yields, ultimately affecting global coffee prices.
This research combined multiple data sources to create a robust analytical framework. Weather data was collected from the Meteostat platform, focusing on daily temperature and precipitation readings from the Minas Gerais and São Paulo regions, Brazil's primary coffee-producing areas.
Daily Arabica coffee futures prices were obtained from Yahoo Finance, along with BRL/USD exchange rates to normalize price fluctuations across currencies. The data spans from 2019 to 2024, providing a comprehensive time series dataset for analysis.
To account for the delayed impact of weather on coffee prices, I applied a 40-day lag: each day's weather readings were paired with the futures price recorded 40 days later, reflecting the time between weather events and their market consequences.
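The lag alignment can be sketched as follows. This is a minimal illustration using only the standard library, not the project's actual code; the function name `align_with_lag`, the dictionary layout, and the sample values are all hypothetical, and it assumes the lag means "weather on day t is matched with the price on day t + 40".

```python
from datetime import date, timedelta

LAG_DAYS = 40  # lag stated in the text


def align_with_lag(weather, prices, lag_days=LAG_DAYS):
    """Pair each weather day with the price observed `lag_days` later.

    weather: dict mapping date -> (avg_temp_c, precip_mm)
    prices:  dict mapping date -> futures price
    Weather days whose lagged date has no price (weekends, holidays,
    end of sample) are dropped.
    """
    rows = []
    for day in sorted(weather):
        target = day + timedelta(days=lag_days)
        if target in prices:
            temp, precip = weather[day]
            rows.append((day, temp, precip, prices[target]))
    return rows


# Tiny made-up sample, purely to show the mechanics.
weather = {
    date(2021, 7, 1): (16.5, 0.0),
    date(2021, 7, 2): (15.8, 1.2),
    date(2021, 7, 3): (17.1, 0.0),
}
prices = {
    date(2021, 8, 10): 180.0,  # 40 days after 2021-07-01
    date(2021, 8, 11): 182.5,  # 40 days after 2021-07-02
}

aligned = align_with_lag(weather, prices)
```

Here the third weather day is dropped because no price exists 40 days later. With a pandas DataFrame on a daily calendar index, a similar pairing could be obtained with `Series.shift`.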
Initial analysis employed robust regression approaches to limit the influence of outliers and heteroskedasticity, including ordinary least squares with HC3 robust standard errors and a robust regression using the Huber T norm. While these produced statistically significant coefficients, their explanatory power was limited, with R² values close to zero.
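To illustrate how Huber-type robust regression dampens outliers, here is a hand-rolled sketch for a single predictor using iteratively reweighted least squares. It is an illustration of the general technique, not the code used in this project; the helper names, the tuning constant `c=1.345` (a common default), and the made-up data are assumptions.

```python
from statistics import median


def weighted_line_fit(x, y, w):
    """Weighted least squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def huber_line_fit(x, y, c=1.345, n_iter=50):
    """Huber M-estimate of a line via iteratively reweighted least squares."""
    w = [1.0] * len(x)  # the first pass is plain OLS
    for _ in range(n_iter):
        a, b = weighted_line_fit(x, y, w)
        resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        # Robust scale: median absolute residual, rescaled so that it
        # matches the standard deviation under normal errors.
        scale = median(abs(r) for r in resid) / 0.6745
        if scale == 0:
            break
        # Huber weights: 1 inside the threshold, shrinking outside it.
        w = [1.0 if abs(r) <= c * scale else c * scale / abs(r) for r in resid]
    return a, b


# Made-up data: y is roughly 1 + 2x, with one gross outlier at the end.
x = [float(i) for i in range(10)]
noise = [0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.4, -0.3, 0.1, 0.0]
y = [1 + 2 * xi + e for xi, e in zip(x, noise)]
y[9] = 100.0

ols_intercept, ols_slope = weighted_line_fit(x, y, [1.0] * len(x))
huber_intercept, huber_slope = huber_line_fit(x, y)
```

In this toy data the single outlier pulls the OLS slope far from 2, while the Huber fit stays close to it. In practice a library would normally be used instead: statsmodels, for example, provides `RLM` with the `HuberT` norm, and HC3 standard errors through `OLS(...).fit(cov_type="HC3")`. Note that HC3 changes only the standard errors, not the fitted coefficients.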
To address these limitations, I implemented machine learning models using scikit-learn, with features including average temperature, precipitation, year, and month. The data was preprocessed by standardizing numerical values and encoding categorical variables.
I tested four machine learning models: Random Forest Regression, Gradient Boosting Regression, Linear Regression, and K-Nearest Neighbors. Each model underwent extensive hyperparameter tuning to optimize performance while preventing overfitting.
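A comparison of this kind might be set up roughly as follows. This is a sketch under stated assumptions, not the project's actual configuration: the data is synthetic, the parameter grids are placeholders, month is treated as the only categorical feature, and time-ordered cross-validation (`TimeSeriesSplit`) is my choice rather than something the write-up specifies.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic, time-ordered stand-in data (NOT the study's data).
rng = np.random.default_rng(0)
n = 240
idx = np.arange(n)
year = 2019 + idx // 40               # 2019..2024
month = (idx % 40) * 12 // 40 + 1     # 1..12 within each synthetic year
temp = rng.normal(20, 3, size=n)      # average temperature, deg C
precip = rng.gamma(2.0, 2.0, size=n)  # precipitation, mm
X = np.column_stack([temp, precip, year, month])
y = 150 - 1.5 * temp + 2.0 * precip + rng.normal(0, 5, size=n)

# Hold out the last synthetic year so the test set lies in the "future".
X_train, X_test, y_train, y_test = X[:200], X[200:], y[:200], y[200:]

# Standardize numeric columns (0-2), one-hot encode month (column 3).
preprocess = ColumnTransformer(
    [
        ("num", StandardScaler(), [0, 1, 2]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), [3]),
    ],
    sparse_threshold=0,  # always return a dense array
)

# Placeholder grids; the real search space is not given in the text.
candidates = {
    "random_forest": (
        RandomForestRegressor(random_state=0),
        {"model__n_estimators": [50], "model__max_depth": [3, 6]},
    ),
    "gradient_boosting": (
        GradientBoostingRegressor(random_state=0),
        {"model__n_estimators": [50, 100], "model__learning_rate": [0.05, 0.1]},
    ),
    "linear": (LinearRegression(), {"model__fit_intercept": [True]}),
    "knn": (KNeighborsRegressor(), {"model__n_neighbors": [3, 7, 15]}),
}

results = {}
for name, (model, grid) in candidates.items():
    pipe = Pipeline([("prep", preprocess), ("model", model)])
    search = GridSearchCV(
        pipe,
        grid,
        cv=TimeSeriesSplit(n_splits=4),  # folds always train on earlier rows
        scoring="neg_mean_absolute_error",
    )
    search.fit(X_train, y_train)
    results[name] = {
        "best_params": search.best_params_,
        "test_mae": -search.score(X_test, y_test),  # flip sign back to MAE
    }

best_name = min(results, key=lambda k: results[k]["test_mae"])
```

Keeping the preprocessing inside the pipeline means the scaler and encoder are refit on each training fold, which avoids leaking information from the validation rows into the tuning step.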
The Gradient Boosting Regressor emerged as the best-performing of the four models in predicting coffee price trends. It builds weak prediction models sequentially, each one correcting the errors of those before it, which suits the complex relationship between weather patterns and price movements.
The analysis revealed that higher temperatures correlate with decreased coffee prices (coefficient of -0.009, significant at the 5% level). This is consistent with agronomic expectations: when warmer conditions bring temperatures into the optimal range, production capacity improves, which increases supply and consequently reduces prices.
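The idea of sequential weak learners that correct earlier errors can be shown with a stripped-down boosting loop for squared error, where each weak learner is a one-split "stump" fitted to the current residuals. This is a teaching sketch with hypothetical helper names and made-up numbers, not scikit-learn's implementation or the model trained in this project.

```python
def fit_stump(x, residuals):
    """Find the single threshold on x that best fits the residuals."""
    best = None
    xs = sorted(set(x))
    for lo, hi in zip(xs, xs[1:]):
        thr = (lo + hi) / 2
        left = [r for xi, r in zip(x, residuals) if xi <= thr]
        right = [r for xi, r in zip(x, residuals) if xi > thr]
        l_mean = sum(left) / len(left)
        r_mean = sum(right) / len(right)
        sse = sum((r - l_mean) ** 2 for r in left) + sum(
            (r - r_mean) ** 2 for r in right
        )
        if best is None or sse < best[0]:
            best = (sse, thr, l_mean, r_mean)
    return best[1:]  # (threshold, left value, right value)


def boost(x, y, n_rounds=50, learning_rate=0.1):
    """Fit stumps one after another, each on the remaining residuals."""
    base = sum(y) / len(y)  # start from the mean prediction
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        thr, l_val, r_val = fit_stump(x, residuals)
        stumps.append((thr, l_val, r_val))
        # Take a small step toward the stump's correction.
        pred = [
            p + learning_rate * (l_val if xi <= thr else r_val)
            for xi, p in zip(x, pred)
        ]
    return base, learning_rate, stumps


def predict(model, xi):
    base, learning_rate, stumps = model
    return base + sum(
        learning_rate * (l_val if xi <= thr else r_val)
        for thr, l_val, r_val in stumps
    )


def mse(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


# Made-up temperature (deg C) and price values with a nonlinear shape.
x = [15.0, 16.0, 17.0, 18.0, 19.0, 20.0, 21.0, 22.0, 23.0, 24.0]
y = [190.0, 188.0, 185.0, 180.0, 176.0, 175.0, 176.0, 181.0, 187.0, 195.0]

model = boost(x, y)
baseline_mse = mse(y, [sum(y) / len(y)] * len(y))
boosted_mse = mse(y, [predict(model, xi) for xi in x])
```

On the training data the boosted error falls below the mean-only baseline because every round fits what the previous rounds left unexplained. The learning rate and number of rounds control how aggressively that happens, which is why they are typical targets for tuning.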
Conversely, increased precipitation showed a positive relationship with prices (coefficient of 0.0116, significant at the 1% level). This initially counterintuitive finding suggests that excessive rainfall during certain periods might disrupt harvesting operations and potentially damage crops, reducing supply and driving prices higher.
This research successfully demonstrates that machine learning models, particularly Gradient Boosting Regression, can effectively predict Arabica coffee prices by analyzing weather patterns in Brazil's key growing regions.
The findings have significant implications for:
As climate change continues to introduce greater variability in weather patterns, these predictive models will become increasingly valuable for understanding and navigating commodity markets.
For more details, access the full research paper here.