In AP English Language and Composition, we ask students to enter an ongoing conversation, craft an argument, and make deliberate rhetorical choices. The course and exam description (CED) guarantees that, in each synthesis source set, “[t]wo of the provided sources are visual, including at least one quantitative source” (111). Yet when they actually spend time with a graph or a chart, too many students shift gears. While the best AP Lang students learn to carefully develop their prose as they represent their reasoning, the numbers too often become almost decorative. Students may “cite” the statistic and possibly explain how it relates to their reasoning, but rarely do they really engage with it or analyze it. They just dump in some data and move on.
With the growth of AI and the nearly unbelievable (though perfectly reasonable) oft-cited estimate that 90% of the data humans have access to has been generated in just the last two years, the importance of “Data Literacy” extends far beyond the AP Language classroom and the synthesis task. Like everything else we teach in a college composition course and in AP Language, it becomes an essential skill for living in society and participating fully in civilization.
In short, statistical and data literacy is civic literacy.
Here’s the uncomfortable truth: very few secondary schools across the country, and even fewer undergraduate programs, require sustained instruction in reading or thinking critically about data, graphs, and charts unless students major in STEM, economics, or specific social sciences. Even then, the emphasis is often on producing calculations rather than interrogating representations.
This means our AP Lang students are often encountering quantitative rhetoric without a framework for analyzing it.
They may know how to:
Identify the highest bar on a graph.
Quote a percentage.
Restate a trend.
But they may not know how to:
Question sampling methods.
Evaluate scale manipulation.
Consider omitted variables.
Distinguish correlation from causation.
Analyze how a statistic functions rhetorically within an argument.
In other words, they can report data—but not analyze it as a rhetorical choice.
As teachers of rhetoric, that's our lane.
“Design is Argument”: Design Choices Are Rhetorical Choices
Every graph and statistic in a synthesis packet is a constructed artifact. Someone chose:
What to measure.
How to measure it.
What to include.
What to exclude.
How to label axes.
How to scale the visual.
How to frame the conclusion.
As Edward Tufte’s work on visual display reminds us, “design is argument.” Choices about scale, proportion, and emphasis shape perception. A shortened y-axis can dramatize small differences. A selective timeframe can imply trends that aren’t really there.
When students treat data as objective “proof,” they miss many of the complexities of the rhetorical situation.
Instead, we need to train them to ask:
What claim is this data being used to support?
What assumptions underlie the measurement?
What audience is this designed to persuade?
What alternative interpretations might exist?
This is analysis—not summary.
If we want students to move beyond data dumping, we need to explicitly teach a few common quantitative fallacies.
1. Correlation ≠ Causation
Students often treat two trends moving together as proof that one causes the other. But as researchers across fields emphasize, correlation alone cannot establish causal relationships. Confounding variables may explain the pattern.
Teaching students to ask, “What else might explain this?” strengthens their line of reasoning and deepens their commentary.
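A short simulation can make the idea of a confounding variable concrete for a class. What follows is only a sketch under invented assumptions: the scenario (daily temperature driving both ice cream sales and pool visits), every number, and the helper name `pearson` are made up for illustration and come from no real dataset.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical data: temperature is the hidden (confounding) variable.
temperature = [rng.uniform(0, 30) for _ in range(500)]
ice_cream = [t + rng.gauss(0, 3) for t in temperature]
pool_visits = [t + rng.gauss(0, 3) for t in temperature]

# The two outcomes move together even though neither causes the other.
raw_r = pearson(ice_cream, pool_visits)

# Subtract the part of each outcome that temperature explains.
ice_left = [i - t for i, t in zip(ice_cream, temperature)]
pool_left = [p - t for p, t in zip(pool_visits, temperature)]
adjusted_r = pearson(ice_left, pool_left)

print(f"correlation before accounting for temperature: {raw_r:.2f}")
print(f"correlation after accounting for temperature:  {adjusted_r:.2f}")
```

The first figure should come out strongly positive and the second close to zero, which is the “What else might explain this?” question expressed in numbers.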
2. Cherry-Picking Data
A writer may decide to select only the most favorable statistic from a broader dataset. Students need to recognize when a single data point is standing in for a more complex picture.
Encourage them to ask:
Is this representative?
What is the broader context?
What data might complicate this claim?
These questions may help them recognize a writer’s biases and, in turn, how those biases shape the writer’s use and representation of the data.
3. Misleading Scales and Visual Distortion
Graphs can exaggerate or minimize differences depending on how the axes are scaled. Tufte’s concept of the “lie factor” is useful here: the degree to which visual representation distorts the numerical reality.
Students should learn to scrutinize:
Axis starting points.
Uneven intervals.
Inconsistent category groupings.
Because data representation is rhetorical, that scrutiny is rhetorical analysis.
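For teachers who want a number to attach to this, Tufte defines the lie factor as the size of the effect shown in the graphic divided by the size of the effect in the data, with a value near 1 indicating a faithful display. Here is a minimal sketch of that ratio for a two-bar chart. The function names and the figures (a change from 50 to 55 drawn on an axis that starts at 45) are hypothetical, chosen only to illustrate a truncated axis.

```python
def effect_size(first, second):
    """Relative change from the first value to the second."""
    return abs(second - first) / abs(first)

def lie_factor(data_first, data_second, axis_start):
    """Lie factor for two bars drawn on an axis that begins at axis_start."""
    # Bar heights as the reader sees them on the page.
    shown = effect_size(data_first - axis_start, data_second - axis_start)
    # The change that actually occurred in the data.
    actual = effect_size(data_first, data_second)
    return shown / actual

# Hypothetical chart: a rise from 50 to 55.
truncated = lie_factor(50, 55, axis_start=45)  # axis begins at 45
honest = lie_factor(50, 55, axis_start=0)      # axis begins at 0

print(f"truncated axis lie factor: {truncated:.1f}")
print(f"zero-based axis lie factor: {honest:.1f}")
```

In this invented case the truncated axis makes a 10% increase look like a doubling, a lie factor of 10, while the zero-based axis yields a lie factor of 1.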
If you want to cultivate intellectual humility and critical reasoning, introduce your students to Simpson’s Paradox.
Simpson’s Paradox occurs when a trend that appears in several groups of data disappears or reverses when the groups are combined. It demonstrates how aggregated data can obscure subgroup differences, and how interpretation depends on framing.
This is gold for rhetorical instruction.
Simpson’s Paradox teaches students:
Data can tell different stories depending on grouping.
Context matters.
Aggregation is a rhetorical move.
“Overall” trends can conceal inequities.
A Concrete Example
One classic example of Simpson’s Paradox involves university admissions data. Imagine a university where, at first glance, the overall admission rate for men is 60% and for women is 40%. A quick reader might conclude gender bias.
But when the data is broken down by department, a different pattern emerges:
In the engineering department (which admits only 30% of applicants), both men and women are admitted at roughly the same rate.
In the humanities department (which admits 70% of applicants), both men and women are again admitted at roughly the same rate.
However, a larger proportion of women applied to the more competitive engineering program, while a larger proportion of men applied to the less competitive humanities program.
Within each department, admission rates are comparable. The overall disparity emerges from how applicants are distributed across programs.
That shift, where aggregated data suggests discrimination but disaggregated data complicates the narrative, is Simpson’s Paradox.
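The arithmetic behind this example can be checked in a few lines. The sketch below assumes hypothetical applicant counts (1,000 men and 1,000 women, split 25/75 and 75/25 between the two departments) and assumes that men and women are admitted at exactly the same rate within each department. Those counts are not from any real university; they are simply one set of numbers that reproduces the 60% and 40% overall figures.

```python
# Hypothetical applicant counts, chosen to reproduce the percentages in the example.
applications = {
    "men":   {"engineering": 250, "humanities": 750},
    "women": {"engineering": 750, "humanities": 250},
}

# Assumed: each department admits men and women at the same rate.
admit_rate = {"engineering": 0.30, "humanities": 0.70}

def overall_rate(group):
    """Share of a group's applicants admitted across both departments."""
    applied = applications[group]
    admitted = sum(count * admit_rate[dept] for dept, count in applied.items())
    return admitted / sum(applied.values())

for group in applications:
    print(f"{group}: overall admission rate {overall_rate(group):.0%}")
```

With identical rates inside each department, the script still reports 60% for men and 40% for women. The entire gap comes from where each group applied.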
How This Might Be Used in an Argument
Imagine an AP synthesis prompt about equity in higher education. A student might initially write:
“Because women are admitted at a rate 20 percentage points lower than men, the university’s admissions process is clearly biased.”
That is data dumping.
A more rhetorically sophisticated student might write:
“Although the university’s overall admission rate appears to favor male applicants (60% compared to 40% for women), departmental breakdowns reveal comparable acceptance rates within individual programs. This discrepancy suggests that application patterns—rather than explicit admissions bias—may account for the aggregate gap. However, the concentration of women in more competitive fields raises additional questions about advising structures and access to information.”
Now the student is:
Analyzing grouping choices.
Questioning assumptions.
Identifying multiple plausible interpretations.
Extending the conversation rather than prematurely closing it.
That is an argument grounded in critical data literacy.
Researchers in statistics and data science repeatedly emphasize that context and grouping decisions shape interpretation. For AP Lang, that insight translates directly into sharper analysis of evidence, more nuanced commentary, and a more defensible line of reasoning.
Scholars like Gerd Gigerenzer have written extensively about statistical literacy and the dangers of innumeracy in public life. When citizens misinterpret risk percentages or misunderstand probability, policy conversations suffer.
Similarly, cognitive research on graph comprehension shows that visual displays require learned interpretive skills. Reading a graph is not intuitive; it is a form of literacy.
If literacy is our domain, then quantitative literacy intersects with it.
We cannot assume that because students can decode prose, they can interpret a scatterplot. Interpretation requires:
Understanding methodology.
Recognizing sample size implications.
Evaluating reliability.
Considering margin of error.
Identifying potential bias.
In AP synthesis, this means moving from “Source D shows that 65% of participants prefer X” to “Source D’s survey of 200 self-selected respondents suggests X, though its limited sample and potential response bias complicate its findings.”
That shift from reporting to evaluating could mean the difference between a high-performing essay and an okay one.
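To show students why a sample of 200 deserves a qualifier, it can help to compute a rough margin of error. The sketch below uses the standard approximation for a proportion at 95% confidence, z × sqrt(p(1 − p) / n) with z = 1.96. The function name is hypothetical, and the comparison sample of 2,000 is an invented contrast, not something stated in any source.

```python
import math

def margin_of_error(proportion, sample_size, z=1.96):
    """Approximate 95% margin of error for a proportion from a random sample."""
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

small = margin_of_error(0.65, 200)    # the 200-respondent survey from the example
large = margin_of_error(0.65, 2000)   # hypothetical larger survey for contrast

print(f"n = 200:  65% plus or minus {small * 100:.1f} points")
print(f"n = 2000: 65% plus or minus {large * 100:.1f} points")
```

For 200 respondents the result is roughly plus or minus 6.6 percentage points, compared with about 2 points for 2,000. The formula also assumes a random sample, so for self-selected respondents it understates the real uncertainty, which is exactly the kind of limitation students should name.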
If we want students to stop dumping data, we have to change how we teach it.
1. Treat Graphs Like Texts
Ask students to annotate graphs the way they annotate prose:
Circle loaded labels.
Note scale choices.
Identify implied claims.
Write margin commentary about limitations.
Make visual analysis routine, not occasional.
2. Require Methodology Commentary
When students cite quantitative evidence in a practice synthesis essay, require one sentence that addresses methodology or limitation.
For example:
“Although the study’s sample is geographically limited…”
“Because the data aggregates multiple income brackets…”
“Given the short time span represented…”
This builds habits of rhetorical evaluation.
3. Practice Disaggregation Thinking
Present students with a dataset summary, then ask:
What subgroups might matter here?
How could this look different if broken down by region, age, or income?
This is where Simpson’s Paradox becomes instructional rather than theoretical.
4. Teach the “So What?” and “Who Cares?” of Statistics
Students often believe that citing a large number automatically strengthens an argument. But magnitude alone does not equal significance.
After a statistic, ask:
Why does this matter?
Compared to what?
What threshold makes this meaningful?
Why will/should the reader care about that?
As most AP Lang teachers already know, synthesis is not about stacking sources; it is about entering a conversation. The best essays recognize the larger conversation and place sources in dialogue as the argument develops.
Students often use quantitative sources as ways of:
Establishing the scope of a problem.
Providing empirical grounding.
Signaling credibility.
Challenging assumptions.
But those same sources can also:
Oversimplify.
Obscure complexity.
Mask inequities.
Mislead.
When students learn to analyze data as rhetorical, they can position themselves more effectively in the conversation.
They become better able (and willing) to:
Qualify sweeping claims.
Complicate binary debates.
Identify limitations.
Build more defensible arguments.
This leads to sophistication—not in the buzzword sense, but in the intellectual sense.
Here’s the no-nonsense part: if we allow students to treat statistics only as decorative proof, we are lowering the cognitive demand of the task. They are capable of more.
We already teach them to:
Analyze language.
Evaluate tone.
Establish claims.
Select information as evidence.
Consider the audience.
Create and follow a line of reasoning.
Data analysis is an extension of those same rhetorical moves.
When students understand that a graph is an argument, that aggregation is a choice, and that even a percentage carries assumptions, they become stronger readers and more responsible writers.
And in a world saturated with infographics, viral statistics, and weaponized data, that responsibility matters.
The goal isn’t to turn AP Lang into AP Statistics. It’s to ensure that when students encounter quantitative evidence in a synthesis packet—or in civic life—they don’t just dump the data.
They question it. They contextualize it. They analyze it. They use it with purpose.
That’s rhetoric.