Being able to recommend products to shoppers is a vital part of any online store. The “Customers Who Bought This Item Also Bought” section can lead to a lot of extra sales if done well. The Jaccard Index is a way of measuring similarity between items. Using some custom SQL we can extract the values we need:
WITH product_sales AS ( SELECT spree_variants.id AS variant_id, SUM(quantity) as sold FROM spree_line_items JOIN spree_variants ON spree_line_items.variant_id = spree_variants.id GROUP BY spree_variants.id ), product_joint_frequency AS ( SELECT v1.product_id AS product_a, v2.product_id AS product_b, COUNT(*) AS frequency FROM spree_line_items AS li1 JOIN spree_line_items AS li2 ON li1.order_id = li2.order_id JOIN spree_variants AS v1 ON li1.variant_id = v1.id JOIN spree_variants AS v2 ON li2.variant_id = v2.id WHERE li1.variant_id != li2.variant_id GROUP BY product_a, product_b ) SELECT product_a, product_b, frequency, sales1.sold AS a_sold, sales2.sold AS b_sold FROM product_joint_frequency JOIN product_sales AS sales1 ON product_joint_frequency.product_a = sales1.variant_id JOIN product_sales AS sales2 ON product_joint_frequency.product_b = sales2.variant_id ORDER BY product_a, joint_frequency DESC, product_b
With these values we can then calculate the affinity between sold products:
frequency / (a_sold + b_sold - frequency)