Thursday, April 17, 2025

Analyzing "expert questions" statistically

 



How you analyze "expert questions" statistically depends on the type of data you have and what you’re trying to find out. Let’s break it down with an overview of possible approaches.


🔍 Step 1: Understand the Question Type

What do you mean by “expert question”? Possibilities:

  1. A question written by an expert (e.g., to test others' knowledge)?

  2. A question posed to an expert (e.g., in interviews)?

  3. A question answered by experts, where you’re evaluating the quality/consistency?

  4. A survey question labeled as “expert-level”?

Once that’s clear, you can move to…


📊 Step 2: Collect the Right Data

Typical data points could include:

  • Difficulty ratings (from a panel or AI)

  • Number of correct answers by non-experts

  • Time taken to answer

  • Agreement among experts

  • Textual features (e.g., length, keywords, jargon density)


📐 Step 3: Statistical Methods You Can Use

Here are some methods depending on what you’re analyzing:

A. Descriptive Statistics

To describe the question characteristics:

  • Mean, median, SD of difficulty rating

  • Frequency of topic types

  • Length of question (word count, sentence complexity)
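As a minimal sketch of these descriptive summaries, assuming each question is stored as a small record with a text, a topic, and a difficulty rating (the field names and values below are made up for illustration):

```python
import statistics
from collections import Counter

# Hypothetical data: one record per question (values are invented for illustration).
questions = [
    {"text": "What is the median of 3, 1, 2?", "topic": "statistics", "difficulty": 2.0},
    {"text": "Define heteroscedasticity and name one test for it.", "topic": "statistics", "difficulty": 4.5},
    {"text": "Which organelle produces most of a cell's ATP?", "topic": "biology", "difficulty": 3.0},
    {"text": "Explain why a p-value is not the probability that the null hypothesis is true.", "topic": "statistics", "difficulty": 4.0},
]

ratings = [q["difficulty"] for q in questions]
summary = {
    "mean": statistics.mean(ratings),
    "median": statistics.median(ratings),
    "sd": statistics.stdev(ratings),  # sample standard deviation (divides by n - 1)
}
topic_counts = Counter(q["topic"] for q in questions)       # frequency of topic types
word_counts = [len(q["text"].split()) for q in questions]   # crude length measure

print(summary)
print(topic_counts.most_common())
print(word_counts)
```

Splitting on whitespace is only a rough word count; sentence complexity would need a proper readability measure.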

B. Inter-Rater Reliability

If multiple experts rate the same question:

  • Cohen’s Kappa (for 2 raters, categorical ratings)

  • Fleiss’ Kappa (for 3+ raters, categorical ratings)

  • ICC (Intraclass Correlation Coefficient) if ratings are on a continuous scale
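To make the two-rater case concrete, here is a small hand-rolled sketch of Cohen’s Kappa. The function name and the example ratings are assumptions for illustration; in practice you would probably use a statistics library rather than this code.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical labels to the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must label the same non-empty set of items.")
    n = len(rater_a)
    # Observed agreement: share of items where both raters chose the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: sum over labels of the product of each rater's marginal proportions.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)

# Hypothetical ratings of 10 questions as "easy", "medium", or "hard".
expert_1 = ["easy", "hard", "hard", "medium", "easy", "hard", "medium", "medium", "easy", "hard"]
expert_2 = ["easy", "hard", "medium", "medium", "easy", "hard", "hard", "medium", "easy", "hard"]
kappa = cohens_kappa(expert_1, expert_2)
print(f"kappa = {kappa:.3f}")
```

Here the experts agree on 8 of 10 questions, but kappa is lower than 0.8 because some of that agreement would be expected by chance.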

C. Item Analysis (common in education/testing)

Used to evaluate questions in a test:

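  • Difficulty index (the item “p-value”, not a significance-test p-value): proportion of people who answered it correctly

  • Discrimination index: how well the item separates high scorers from low scorers (e.g., proportion correct in the top-scoring group minus the bottom-scoring group)

  • Point-biserial correlation: correlation between the item score (0/1) and the total score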
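A rough sketch of the difficulty index and the point-biserial correlation, assuming a 0/1 response matrix with learners in rows and questions in columns. The data is invented, and the total score here includes the item itself, which slightly inflates the correlation.

```python
import math

def pearson(x, y):
    """Pearson correlation; returns None if either variable has zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return None
    return cov / math.sqrt(var_x * var_y)

# Hypothetical response matrix: rows are learners, columns are questions (1 = correct).
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
totals = [sum(row) for row in responses]

item_stats = []
for item in range(len(responses[0])):
    scores = [row[item] for row in responses]
    item_stats.append({
        "item": item + 1,
        "difficulty": sum(scores) / len(scores),      # proportion correct; higher means easier
        "point_biserial": pearson(scores, totals),    # Pearson r between a 0/1 item and the total
    })

for stats in item_stats:
    print(stats)
```

The point-biserial is just the ordinary Pearson correlation applied to a binary variable, which is why no special formula is needed.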

D. Inferential Statistics

If you want to compare groups (e.g., expert vs non-expert responses):

  • t-test / ANOVA: compare mean ratings or performance

  • Chi-square test: compare distributions (e.g., topic frequency)

  • Regression Analysis: predict difficulty or accuracy based on features of the question
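For the group comparison, a minimal sketch of Welch’s t statistic (the unequal-variance form of the two-sample t-test), using only the standard library. The scores are made up, and the function only returns the statistic and its approximate degrees of freedom; turning that into a p-value needs a t distribution from a statistics package.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Welch's t statistic and approximate degrees of freedom for two independent samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    se2_a, se2_b = var_a / n_a, var_b / n_b  # squared standard error of each group mean
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df

# Hypothetical scores (number correct out of 10) for two groups.
expert_scores = [9, 8, 10, 7, 9, 8]
novice_scores = [5, 6, 4, 7, 5, 6]
t_stat, df = welch_t(expert_scores, novice_scores)
print(f"t = {t_stat:.3f}, df = {df:.1f}")
```

A large positive t here would suggest the expert group scores higher on average, though with samples this small the result should be read cautiously.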

E. Text Analysis + Statistics

If you're analyzing the question text itself:

  • Use text mining to extract features (TF-IDF, readability scores)

  • Then apply clustering, factor analysis, or logistic regression to relate these to outcomes (like answer accuracy or expert agreement)
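As an illustration of the feature-extraction step, here is a small TF-IDF sketch. It uses one common variant (term frequency times log(N / document frequency)); libraries often apply smoothing or normalization, so their numbers will differ. The questions and function names are assumptions for the example.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase a question and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def tf_idf(documents):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many questions each term appears at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical question texts.
questions = [
    "What is the mean of a sample?",
    "What is the variance of a sample?",
    "Explain the role of the likelihood in Bayesian inference.",
]
features = tf_idf(questions)
for question, weights in zip(questions, features):
    top_terms = sorted(weights, key=weights.get, reverse=True)[:3]
    print(question, "->", top_terms)
```

Words that appear in every question (like "the") get a weight of zero, while distinctive terms get the highest weights. Those weights can then feed clustering or a regression model.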


🧠 Example Use Case

You have 50 expert-written questions and 100 learners answering them. You want to know which questions are too easy or too hard.

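  • Calculate the difficulty index (item p-value, i.e., the proportion answering correctly) for each question; values near 1 suggest a question is too easy, values near 0 that it is too hard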

  • Use discrimination index to see which questions best separate high- and low-performing learners

  • Use item-total correlation to flag bad items (a low or negative value means the item does not move with the rest of the test)

  • Optional: run a factor analysis to see if items group by topic or skill
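The first steps of this use case could be sketched roughly as follows. The data is simulated to stand in for 100 learners and 50 questions, and the thresholds (0.9, 0.2, and the 27% top/bottom groups) are illustrative rules of thumb, not fixed standards. Factor analysis is left out because it needs a dedicated library.

```python
import math
import random

def item_report(responses, group_share=0.27, easy=0.9, hard=0.2, min_disc=0.2):
    """Per-item difficulty, upper-lower discrimination, and flags.

    responses: list of learner rows, each a list of 0/1 item scores.
    The thresholds are illustrative defaults, not fixed standards.
    """
    n_learners, n_items = len(responses), len(responses[0])
    k = max(1, round(n_learners * group_share))  # size of the top and bottom groups
    ranked = sorted(responses, key=sum)          # learners ordered by total score, lowest first
    low, high = ranked[:k], ranked[-k:]
    report = []
    for j in range(n_items):
        difficulty = sum(row[j] for row in responses) / n_learners
        # Proportion correct in the top group minus proportion correct in the bottom group.
        discrimination = (sum(r[j] for r in high) - sum(r[j] for r in low)) / k
        flags = []
        if difficulty > easy:
            flags.append("too easy")
        if difficulty < hard:
            flags.append("too hard")
        if discrimination < min_disc:
            flags.append("weak discrimination")
        report.append({"item": j + 1, "difficulty": difficulty,
                       "discrimination": discrimination, "flags": flags})
    return report

# Simulated data standing in for 100 learners and 50 questions (not real results).
rng = random.Random(0)
abilities = [rng.gauss(0, 1) for _ in range(100)]
easiness = [rng.uniform(-3, 3) for _ in range(50)]
responses = [
    [int(rng.random() < 1 / (1 + math.exp(-(a + e)))) for e in easiness]
    for a in abilities
]

for row in item_report(responses):
    if row["flags"]:
        print(row)
```

Flagged items are candidates for review rather than automatic removal; a very easy question can still be worth keeping if it covers essential content.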

