Richard Hart

Head of Something @ Somewhere
Kent, UK

My Music
My Photos

LinkedIn
Mastodon

Product recommendations in Spree using the Jaccard Index

Being able to recommend products to shoppers is a vital part of any online store. The “Customers Who Bought This Item Also Bought” section can lead to a lot of extra sales if done well. The Jaccard Index is a way of measuring similarity between items. Using some custom SQL we can extract the values we need:

WITH
  product_sales AS (
    SELECT spree_variants.id AS variant_id, SUM(quantity) as sold
      FROM spree_line_items
      JOIN spree_variants ON spree_line_items.variant_id = spree_variants.id
      GROUP BY spree_variants.id ),

  product_joint_frequency AS (
    SELECT v1.product_id AS product_a, v2.product_id AS product_b, COUNT(*) AS frequency
      FROM spree_line_items AS li1
      JOIN spree_line_items AS li2 ON li1.order_id = li2.order_id
      JOIN spree_variants AS v1 ON li1.variant_id = v1.id
      JOIN spree_variants AS v2 ON li2.variant_id = v2.id
      WHERE li1.variant_id != li2.variant_id
      GROUP BY product_a, product_b )

SELECT product_a, product_b, frequency, sales1.sold AS a_sold, sales2.sold AS b_sold
  FROM product_joint_frequency
  JOIN product_sales AS sales1 ON product_joint_frequency.product_a = sales1.variant_id
  JOIN product_sales AS sales2 ON product_joint_frequency.product_b = sales2.variant_id
  ORDER BY product_a, joint_frequency DESC, product_b

With these values we can then calculate the affinity between sold products:

frequency / (a_sold + b_sold - frequency)