Research Tool
Item Difficulty Estimator
Estimate assessment item difficulty using the AIED 2026 approach. Supports 9 item types and 28 interactive math widgets across K-8.
Try an example
Select an example or enter your own item
9 Item Types
Assessment formats
Multiple Choice
4 options, 1 correct, misconception-mapped distractors
Multiple Select
Select all that apply from a set of options
True / False
Binary judgment on a mathematical statement
Short Answer
Free-text response, pattern-matched scoring
Numeric Entry
Exact numeric answer with tolerance range
Fill in the Blank
Cloze-style with inline blanks in context
Matching
Connect items across two columns
Ordering
Drag items into correct sequence
Essay / Explanation
Extended response, LLM-evaluated with rubric
28 Interactive Widgets
Math manipulatives
Counting
2×5 grid for addition/subtraction to 20
60+ sprite types in configurable arrangements
Place Value
Hundreds, tens, and ones blocks
Interactive column chart
Operations
Jumps, highlights, position markers
Rectangular grid for multiplication/division
Dot array with row/column highlights
Negative to positive with regions
Fractions
Bar divided into equal parts
Pie-chart style sectors
Side-by-side bars or circles
Grid fraction with eaten parts
Measurement
12-hour face with draggable hands
Liquid fractions
Data
Stacked dots on number line
Grouped frequency bars
Quartiles and whiskers
Correlation visualization
Bar segments for ratios
Geometry
Points, lines, regions
Drag-and-drop geometric shapes
Labeled sides with Pythagorean theorem
3D cube structures (WebGL)
How it works
1. Chain-of-thought analysis
The LLM analyzes 7 difficulty factors before estimating: cognitive steps, prerequisites, misconceptions, transfer distance, working memory, distractor quality, and reading load.
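As a rough illustration of this step, the sketch below assembles a prompt that asks the model to reason through each of the seven factors before committing to an estimate. The function name, the prompt wording, and the `grade` parameter are assumptions made for this example; the tool's actual prompt is not shown here.

```python
# Hypothetical sketch: the prompt wording and names are assumptions,
# not the tool's actual prompt.
DIFFICULTY_FACTORS = [
    "cognitive steps",
    "prerequisites",
    "misconceptions",
    "transfer distance",
    "working memory",
    "distractor quality",
    "reading load",
]


def build_analysis_prompt(item_text: str, grade: str) -> str:
    """Build a prompt that requests factor-by-factor analysis first."""
    lines = [
        f"You are estimating the difficulty of a grade {grade} math item.",
        "Before giving any estimate, analyze each factor in turn:",
    ]
    # Number the factors so the model's analysis can be checked for coverage.
    for i, factor in enumerate(DIFFICULTY_FACTORS, start=1):
        lines.append(f"{i}. {factor}")
    lines.append("Only after all factors are analyzed, state a difficulty estimate.")
    lines.append("")
    lines.append(f"Item: {item_text}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_analysis_prompt("What is 3 + 4?", "1"))
```

Numbering the factors makes it easy to verify afterwards that the model's response addressed all seven before giving its estimate.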
2. Anchored rubric
Instead of asking “how hard is this?” in the abstract, we provide examples calibrated to the grade level at each difficulty level, grounding estimates in concrete comparisons.
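One way to picture an anchored rubric is a prompt that places the target item next to reference items of known difficulty for the same grade. The sketch below is a minimal version under stated assumptions: the `Anchor` type, the three-level scale, and the anchor items themselves are invented for illustration and are not the tool's calibrated examples.

```python
from dataclasses import dataclass

# Assumed three-level scale; the tool's real scale is not specified here.
LEVEL_ORDER = {"easy": 0, "medium": 1, "hard": 2}


@dataclass(frozen=True)
class Anchor:
    grade: int
    level: str  # one of the keys in LEVEL_ORDER
    item: str


# Placeholder anchors, made up for this example only.
ANCHORS = [
    Anchor(3, "hard", "Sam has 3 bags of 8 marbles and gives away 5. How many are left?"),
    Anchor(3, "easy", "What is 2 x 5?"),
    Anchor(3, "medium", "A box holds 6 rows of 4 crayons. How many crayons?"),
]


def build_rubric_prompt(item_text: str, grade: int, anchors: list[Anchor]) -> str:
    """Build a prompt that compares the item against same-grade anchors."""
    same_grade = [a for a in anchors if a.grade == grade]
    if not same_grade:
        raise ValueError(f"no anchors available for grade {grade}")
    # Present anchors from easiest to hardest so the scale reads in order.
    same_grade.sort(key=lambda a: LEVEL_ORDER[a.level])
    lines = [f"Reference items for grade {grade}:"]
    lines += [f"[{a.level}] {a.item}" for a in same_grade]
    lines.append("")
    lines.append(f"Which reference item is closest in difficulty to: {item_text}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_rubric_prompt("What is 7 x 6?", 3, ANCHORS))
```

Asking for the closest reference item turns an open-ended judgment into a comparison, which is the point of anchoring.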
3. Bias correction
LLM difficulty estimates exhibit variance collapse (estimates cluster in a narrow range) and systematic overconfidence. We apply corrections derived from our AIED 2026 research, which spanned 200 experimental conditions.
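The specific corrections are not described here, so the sketch below shows only one generic way to counter variance collapse: linearly rescaling a batch of raw estimates around their mean so their spread matches a target. The function name, the 0 to 1 difficulty scale, and the target mean and standard deviation are all assumptions for illustration; it does not address overconfidence and is not the tool's actual correction.

```python
from statistics import mean, pstdev


def expand_variance(estimates, target_mean, target_sd, lo=0.0, hi=1.0):
    """Rescale estimates to a target mean and spread, clipped to [lo, hi].

    Hypothetical helper: a plain linear rescaling, not the tool's method.
    """
    m = mean(estimates)
    sd = pstdev(estimates)
    if sd == 0:
        # All estimates identical: there is no spread to stretch.
        return [min(hi, max(lo, target_mean)) for _ in estimates]
    scale = target_sd / sd
    # Stretch each estimate's distance from the batch mean, then clip.
    return [min(hi, max(lo, target_mean + (x - m) * scale)) for x in estimates]


if __name__ == "__main__":
    raw = [0.45, 0.50, 0.55]  # made-up estimates bunched near the middle
    print(expand_variance(raw, target_mean=0.5, target_sd=0.2))
```

The rescaling preserves the rank order of items and changes only how far apart they sit; clipping at the scale boundaries can shrink the spread again if the target is set too wide.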