

Generating and evaluating hypotheses about past, present, and future events is core to argumentation in many domains, such as forensic investigation, medical diagnostics, and scientific research. In this paper, we explore the role of hypothesis-making in argumentative dialogue. To do so, we introduce a dataset of 502 annotated hypotheses within the existing RIP corpus of collaborative problem-solving in murder mystery games, creating the RIP1 corpus. Propositions marked as hypotheses in RIP1 correlate systematically with argument structure (previously annotated according to Inference Anchoring Theory). We explore the interaction between arguments and hypotheses, showing that hypotheses are often the conclusions of arguments and that hypotheses and assertions are treated differently in dialogue. Based on this quantitative analysis, we conduct preliminary computational experiments that establish a baseline for the automatic mining of hypotheses. Experiments with a Support Vector Machine and a fine-tuned RoBERTa model yield initial text span classification performance with an F1 score of 80.0, outperforming random and majority baselines and providing a target for future improvement.