Venn Plot
This module provides functionality for creating Venn and Euler diagrams from pandas DataFrames.
It is designed to visualize relationships between sets, highlighting intersections and differences between them.
Core Features
- Supports 2-set and 3-set Diagrams: Allows visualization of up to three overlapping sets.
- Venn and Euler Diagrams: Uses Venn diagrams by default; switches to Euler diagrams when vary_size=True.
- Customizable Colors and Labels: Automatically assigns colors and labels for subset representation.
- Dynamic Sizing: Adjusts circle sizes for Euler diagrams to reflect proportions.
- Title and Source Attribution: Optionally adds a title and source text.
Use Cases
- Set Comparisons: Identify shared and unique elements across two or three sets.
- Proportional Representation: With vary_size=True, Euler diagrams scale circle sizes to reflect subset proportions.
- Data Overlap Visualization: Helps in understanding relationships within categorical data.
Limitations and Warnings
- Only Supports 2 or 3 Sets: Does not extend to Venn diagrams with more than three sets.
- Pre-Aggregated Data Required: The module does not perform data aggregation; input data should already be structured correctly.
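Because aggregation is left to the caller, it may help to see one way of producing a DataFrame with 'groups' and 'percent' columns. The sketch below uses only pandas and makes several assumptions that this page does not confirm: that each row identifies a subset by a tuple of 0/1 membership flags, that 'percent' is a fraction between 0 and 1, and that the "in neither set" group may be present. The customer data and column names are hypothetical.

```python
import pandas as pd

# Hypothetical customer-level flags: 1 if the customer belongs to the set, else 0.
customers = pd.DataFrame(
    {
        "bought_coffee": [1, 1, 1, 1, 0, 0, 0, 1, 0, 1],
        "bought_tea": [0, 1, 0, 1, 1, 0, 1, 1, 0, 0],
    }
)

# Count customers for each combination of membership flags.
counts = customers.groupby(["bought_coffee", "bought_tea"]).size()

# Convert counts to shares of all customers (assumed to be fractions, not 0-100).
shares = counts / counts.sum()

# Assumed layout: one row per subset, keyed by a tuple such as (1, 0) = coffee only.
df = pd.DataFrame({"groups": list(shares.index), "percent": shares.to_numpy()})

print(df)
```

If the assumed layout matches what the module expects, a frame like this could then be passed as `df` with `labels=["Coffee", "Tea"]`; check the module source for the exact encoding of 'groups' before relying on it.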
plot(df, labels, title=None, eyebrow=None, subtitle=None, source_text=None, vary_size=False, figsize=None, ax=None, subset_label_formatter=None, **kwargs)
Plots a Venn or Euler diagram using subset sizes extracted from a DataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `df` | `DataFrame` | DataFrame with 'groups' and 'percent' columns. | required |
| `labels` | `list[str]` | Labels for the sets in the diagram. | required |
| `title` | `str` | Title of the plot. Defaults to None. | `None` |
| `eyebrow` | `str` | Small uppercase label rendered above the title. Defaults to None. | `None` |
| `subtitle` | `str` | Supporting copy rendered below the title. Defaults to None. | `None` |
| `source_text` | `str` | Source text for attribution. Defaults to None. | `None` |
| `vary_size` | `bool` | Whether to vary circle size based on subset sizes. Defaults to False. | `False` |
| `figsize` | `tuple[int, int]` | Size of the plot. Defaults to None. | `None` |
| `ax` | `Axes` | Matplotlib axes object to plot on. Defaults to None. | `None` |
| `subset_label_formatter` | `callable` | Function to format subset labels. Defaults to None. | `None` |
| `**kwargs` | `Any` | Additional keyword arguments. | `{}` |
Returns:
| Name | Type | Description |
|---|---|---|
| `SubplotBase` | `SubplotBase` | The matplotlib axes object with the plotted diagram. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If the number of sets is not 2 or 3. |
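The subset_label_formatter parameter accepts a callable, so a small helper is one way to control how subset sizes are displayed. The sketch below assumes the formatter is called with a single numeric subset size and returns a string; that calling convention is an assumption, as is treating the value as a fraction. The name `format_share` is hypothetical.

```python
def format_share(value: float) -> str:
    """Render a subset size given as a fraction (e.g. 0.3) as a percentage string."""
    # Assumption: the value is a fraction between 0 and 1, not a 0-100 number.
    return f"{value:.1%}"


print(format_share(0.125))

# Hypothetical call, assuming the module is importable from the path shown on this page:
# from openretailscience.plots import venn
# ax = venn.plot(df, labels=["Coffee", "Tea"], subset_label_formatter=format_share)
```

Per the Raises table, passing a labels list whose length is not 2 or 3 raises ValueError, so the labels in a call like the commented one should name exactly two or three sets.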
Source code in openretailscience/plots/venn.py