2020 · Bachelor's Thesis · University of Warsaw

Stock Price Prediction using GAN and BERT

Advanced deep learning research combining Generative Adversarial Networks and BERT-based sentiment analysis to predict gaming industry stock prices and develop profitable investment strategies.

Research Overview

This bachelor's thesis explored the application of cutting-edge machine learning techniques to financial market prediction. The research focused on four major gaming industry companies: Electronic Arts (EA), Ubisoft (UBSFY), Take-Two Interactive Software (TTWO), and Activision Blizzard (ATVI).

The gaming sector was chosen because its valuations may be especially sensitive to public sentiment: player opinions expressed on Reddit forums could significantly affect how a company is valued. The core innovation combined Generative Adversarial Networks (GANs) for price prediction with BERT-based sentiment analysis of Reddit r/Games discussions.

Key Research Question: Can a GAN-based system utilizing both technical indicators and sentiment analysis create profitable investment strategies that outperform the traditional Buy & Hold approach?

Figure: GAN Training Architecture. The generator creates price predictions while the discriminator learns to distinguish real from synthetic data.

Technical Architecture

GAN Model Design:
  • Generator: 3-layer GRU (1024, 512, 256 neurons) with MLP layers (128, 64, 1) and 0.2 recurrent dropout
  • Discriminator: 1D CNN with 3 convolutional layers (32, 64, 128 units) + MLP layers (220, 220, 1)
  • 30-day lookback window for sequential time series prediction
  • Adam optimizer with learning rate 0.0016 for both networks
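
The layer sizes listed above can be sketched in Keras as follows. This is a minimal sketch, not the thesis code: the number of input features (`N_FEATURES`), the activations, the convolution kernel size and padding, and the discriminator's input length (history plus one real or generated step) are all assumptions, since the summary does not state them.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

LOOKBACK = 30    # 30-day lookback window (from the list above)
N_FEATURES = 8   # assumption: number of input features per day


def build_generator(lookback=LOOKBACK, n_features=N_FEATURES):
    """3-layer GRU (1024, 512, 256) followed by an MLP (128, 64, 1)."""
    return models.Sequential([
        layers.Input(shape=(lookback, n_features)),
        layers.GRU(1024, return_sequences=True, recurrent_dropout=0.2),
        layers.GRU(512, return_sequences=True, recurrent_dropout=0.2),
        layers.GRU(256, recurrent_dropout=0.2),
        layers.Dense(128, activation="relu"),   # activation is an assumption
        layers.Dense(64, activation="relu"),
        layers.Dense(1),                        # one predicted price
    ], name="generator")


def build_discriminator(seq_len=LOOKBACK + 1):
    """1D CNN (32, 64, 128 filters) followed by an MLP (220, 220, 1)."""
    return models.Sequential([
        layers.Input(shape=(seq_len, 1)),       # assumption: price sequence only
        layers.Conv1D(32, kernel_size=3, padding="same", activation="relu"),
        layers.Conv1D(64, kernel_size=3, padding="same", activation="relu"),
        layers.Conv1D(128, kernel_size=3, padding="same", activation="relu"),
        layers.Flatten(),
        layers.Dense(220, activation="relu"),
        layers.Dense(220, activation="relu"),
        layers.Dense(1, activation="sigmoid"),  # probability the input is real
    ], name="discriminator")


generator = build_generator()
discriminator = build_discriminator()

# Same optimizer type and learning rate for both networks, as listed above.
g_optimizer = tf.keras.optimizers.Adam(learning_rate=0.0016)
d_optimizer = tf.keras.optimizers.Adam(learning_rate=0.0016)
```

Separate optimizer instances are used so that each network keeps its own moment estimates during adversarial training.
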
Sentiment Analysis Pipeline:
  • Google's BERT model for analyzing Reddit r/Games comments
  • Company-specific keywords: game titles, publisher names
  • Daily sentiment aggregation: mean, std dev, median, Q1, Q3, comment count
  • 3-year dataset (2019-2021) of Reddit data retrieved with the psaw library (a Python wrapper for the Pushshift API)
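
The daily aggregation step can be illustrated with pandas. The sketch below assumes a hypothetical table with one row per comment and a numeric `sentiment` score already produced by the BERT model; the column names, score scale, and sample values are made up for illustration.

```python
import pandas as pd

# Hypothetical input: one row per Reddit comment with a BERT sentiment score.
comments = pd.DataFrame({
    "date": pd.to_datetime(
        ["2020-03-01", "2020-03-01", "2020-03-01", "2020-03-02"]
    ),
    "sentiment": [0.2, 0.6, 1.0, -0.4],
})


def q1(scores):
    """First quartile of the day's sentiment scores."""
    return scores.quantile(0.25)


def q3(scores):
    """Third quartile of the day's sentiment scores."""
    return scores.quantile(0.75)


# One row per day with the six statistics named in the list above.
daily = comments.groupby("date")["sentiment"].agg(
    mean="mean",
    std="std",
    median="median",
    q1=q1,
    q3=q3,
    comment_count="count",
)
print(daily)
```

Days with a single comment get a missing (NaN) standard deviation, so a real pipeline would need to decide how to fill or drop such days before feeding them to the model.
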

Figure: Investment Strategy Architecture. GAN predictions are converted into buy/sell/hold signals using alpha parameters.
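
One plausible way to turn a prediction into a signal is a threshold rule on the predicted return. The rule below is an assumption for illustration only: the summary says alpha parameters are involved but does not give the exact rule, and the names `alpha_buy` and `alpha_sell` and their default values are hypothetical.

```python
def trading_signal(last_price, predicted_price, alpha_buy=0.01, alpha_sell=0.01):
    """Map a predicted price to 'buy', 'sell', or 'hold'.

    alpha_buy / alpha_sell are hypothetical thresholds on the predicted
    relative price change; they are not values taken from the thesis.
    """
    expected_return = predicted_price / last_price - 1.0
    if expected_return > alpha_buy:
        return "buy"
    if expected_return < -alpha_sell:
        return "sell"
    return "hold"


# Example: a predicted 2% rise exceeds the 1% buy threshold.
print(trading_signal(last_price=100.0, predicted_price=102.0))
```

Wider thresholds trade less often and ignore small predicted moves, which matters when the prediction error is of the same order as the typical daily price change.
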

Model Training Results

The GAN model was trained using TensorFlow and Keras on an NVIDIA GTX 1070 GPU with CUDA acceleration. Each model was trained adversarially for 1,000 epochs, with the generator and discriminator losses monitored throughout.
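
Before training, each time series has to be cut into fixed-length input sequences. The sketch below shows one way to do this for the 30-day lookback window from the model design. It assumes a hypothetical layout in which `features` is a (days, features) array and `target` is the price series to predict; the thesis's actual preprocessing is not described in this summary.

```python
import numpy as np


def make_windows(features, target, lookback=30):
    """Build (samples, lookback, n_features) inputs and next-day targets."""
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    n_samples = len(features) - lookback
    if n_samples <= 0:
        raise ValueError("series must be longer than the lookback window")
    # Window i covers days i .. i+lookback-1; its target is day i+lookback.
    X = np.stack([features[i:i + lookback] for i in range(n_samples)])
    y = target[lookback:]
    return X, y


# Example with 40 days of 3 made-up features.
features = np.arange(40 * 3, dtype=float).reshape(40, 3)
target = np.arange(40, dtype=float)
X, y = make_windows(features, target)
print(X.shape, y.shape)
```
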

Figure: EA Training Progress. Generator and discriminator loss over training epochs, showing adversarial competition and convergence.

Figure: EA Price Prediction Results. Actual vs. predicted stock prices; the model achieved an MAE of 1.25% relative to the average price.

Investment Strategy Results

The investment strategy was tested during a challenging market period in which all four companies experienced downtrends. Despite these conditions, the GAN-based strategy ended in profit for three of the four companies and beat Buy & Hold in each of those cases.

Figure: EA Strategy Performance. The GAN-based approach outperformed Buy & Hold during the market downturn.

Key Findings

Model Performance Metrics

Company                       MAE ($)   RMSE ($)   Avg Price ($)   Error (% of avg price)
Electronic Arts (EA)          1.73      2.32       139.19          1.25%
Activision Blizzard (ATVI)    1.36      1.89       83.65           1.62%
Ubisoft (UBSFY)               0.26      0.34       12.84           2.01%
Take-Two Interactive (TTWO)   3.54      4.51       170.88          2.07%
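
The metrics in this table can be computed as in the sketch below, which assumes the error percentage is the MAE divided by the average actual price (as the EA figure caption suggests). The sample prices are made up for illustration.

```python
import numpy as np


def error_metrics(actual, predicted):
    """Return (MAE, RMSE, MAE as a percentage of the average actual price)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    errors = predicted - actual
    mae = float(np.mean(np.abs(errors)))
    rmse = float(np.sqrt(np.mean(errors ** 2)))
    relative_pct = 100.0 * mae / float(np.mean(actual))
    return mae, rmse, relative_pct


# Made-up prices, not thesis data.
mae, rmse, rel = error_metrics([100.0, 102.0, 98.0], [101.0, 100.0, 99.0])
print(f"MAE={mae:.2f}  RMSE={rmse:.2f}  error={rel:.2f}%")
```

As a check against the table, the TTWO row gives 3.54 / 170.88 ≈ 2.07%.
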

Investment Strategy Results

Company   Buy & Hold          GAN Strategy          Outperformance
UBSFY     $631.84 (-36.8%)    $1,258.93 (+25.9%)    +99.3%
EA        $935.41 (-6.5%)     $1,018.30 (+1.8%)     +8.9%
ATVI      $660.48 (-33.9%)    $1,110.53 (+11.1%)    +68.1%
TTWO      $944.50 (-5.5%)     $870.91 (-12.9%)      -7.8%
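
The Outperformance column appears to be the ratio of the two final portfolio values minus one. That reading is an inference from the numbers rather than a formula stated in the source; the sketch below reproduces the table's values to within about 0.1 percentage point.

```python
def outperformance_pct(strategy_final, benchmark_final):
    """Relative gain of the strategy's final value over the benchmark's, in %."""
    return 100.0 * (strategy_final / benchmark_final - 1.0)


# Final portfolio values taken from the table above.
results = {
    "EA": (1018.30, 935.41),
    "ATVI": (1110.53, 660.48),
    "TTWO": (870.91, 944.50),
}
for ticker, (gan_final, buy_hold_final) in results.items():
    print(ticker, round(outperformance_pct(gan_final, buy_hold_final), 1))
```

Because the measure is a ratio of final values, it does not depend on the starting capital, which the summary does not state explicitly.
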

Key Research Insights

  • Breakthrough Performance: Strategies for 3 of the 4 companies were profitable and outperformed Buy & Hold by 8.9% to 99.3% while all four stocks were in a downtrend
  • Exceptional Case Study: The UBSFY strategy gained 25.9% while Buy & Hold lost 36.8%, a 99.3% relative outperformance
  • Technical Analysis Validation: Processed price indicators reduced prediction error by 70-85% compared to raw market data
  • Sentiment Analysis Insights: Social media data showed limited effectiveness due to noise and temporal misalignment challenges
  • Bear Market Resilience: Short selling capability proved essential, enabling profit generation during overall market downturns
  • Model Generalization: GAN architecture successfully adapted across multiple companies with consistent sub-2.1% prediction errors
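
The point about short selling can be illustrated with a toy backtest. This is a simplified sketch under stated assumptions, not the thesis's strategy engine: positions are +1 (long), 0 (flat), or -1 (short), exposure is reset to the full equity each day, transaction costs are ignored, and the prices and starting capital are made up.

```python
def backtest(prices, positions, initial_capital=1000.0):
    """Compound daily returns, signed by the position held over each day.

    positions[i] is the position held from prices[i] to prices[i + 1].
    """
    equity = initial_capital
    for today, tomorrow, position in zip(prices, prices[1:], positions):
        daily_return = tomorrow / today - 1.0
        equity *= 1.0 + position * daily_return
    return equity


falling_prices = [100.0, 90.0, 81.0]   # made-up downtrend
buy_and_hold = backtest(falling_prices, [1, 1])
always_short = backtest(falling_prices, [-1, -1])
print(round(buy_and_hold, 2), round(always_short, 2))
```

In a falling market the long-only position loses value on every step, while a strategy that is allowed to go short can gain from the same price moves, provided its predictions point the right way.
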

Technologies Used

Python 3.8, TensorFlow, Keras, BERT, GAN, GRU, CNN, Reddit API, Yahoo Finance, Pandas, NumPy, CUDA, LaTeX