Directed graphs from papers with OpenAI o3
I've been having fun asking o3 to read papers, extract hypotheses and results, and then construct "causal"[1] diagrams, i.e., directed acyclic graphs that capture the relationships between variables.
Strong associations are connected with black lines, weak or negative associations with red lines.
Here's an example from "World Models and Consistent Mistakes in LLMs".
Prompt:
Read this paper and then do a causal diagram that captures the hypotheses tested in it.
Add nodes for hypotheses and results. You may use dark lines for hypotheses that have sufficient evidence
and red dashed lines for hypotheses without sufficient evidence or negative results.
Keep all relevant nodes, but ensure the diagram is readable.
Return this in mermaid format.
Result:

Kind of useful, although one would have to read the paper to understand what is going on.
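For readers unfamiliar with the format the prompt asks for, here is a minimal sketch of how such a diagram can be expressed in Mermaid, with solid dark edges for supported hypotheses and red dashed edges for unsupported ones. The `Edge` class, the `to_mermaid` helper, and the node labels are hypothetical placeholders of mine; this is neither o3's output nor content from the paper.

```python
from dataclasses import dataclass


@dataclass
class Edge:
    source: str      # node id of the hypothesis
    target: str      # node id of the result
    supported: bool  # True if the paper gives sufficient evidence


def to_mermaid(labels: dict[str, str], edges: list[Edge]) -> str:
    """Render nodes and edges as Mermaid flowchart text."""
    lines = ["flowchart TD"]
    for node_id, label in labels.items():
        lines.append(f'    {node_id}["{label}"]')
    styles = []
    for index, edge in enumerate(edges):
        # Solid arrow for supported hypotheses, dotted arrow otherwise
        arrow = "-->" if edge.supported else "-.->"
        lines.append(f"    {edge.source} {arrow} {edge.target}")
        # linkStyle addresses edges by definition order, starting at 0
        color = "black" if edge.supported else "red"
        styles.append(f"    linkStyle {index} stroke:{color}")
    # Style statements go last so every referenced edge already exists
    return "\n".join(lines + styles)


# Hypothetical placeholder content, not taken from any paper
labels = {"H1": "Hypothesis A", "H2": "Hypothesis C", "R1": "Result B"}
edges = [Edge("H1", "R1", supported=True), Edge("H2", "R1", supported=False)]
diagram = to_mermaid(labels, edges)
print(diagram)
```

Pasting the printed text into any Mermaid renderer should draw one black solid arrow and one red dashed arrow.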
A more controlled setting
I asked o4-mini to write a mini-paper on Newton's law of gravitation, specifically the dependence on the separation $r$, using fictitious experiments. It produced a mini-paper containing this table:
| Separation $r$ (m) | Mean force $\bar F$ (N) | $\bar F \times r^2$ (N·m²) |
|---|---|---|
| 0.30 | 3.72 × 10⁻⁹ | 3.35 × 10⁻¹⁰ |
| 0.50 | 1.34 × 10⁻⁹ | 3.35 × 10⁻¹⁰ |
| 0.70 | 6.82 × 10⁻¹⁰ | 3.35 × 10⁻¹⁰ |
| 1.00 | 3.35 × 10⁻¹⁰ | 3.35 × 10⁻¹⁰ |
Then, in a separate chat, I asked for a causal diagram based on the mini-paper:

What happens if we give o3 only the table, without any explanation?

Without a lot to go on, o3 focusses on the functional relationships, but the graph looks correct given the table.
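The functional relationship can also be checked directly from the table. The sketch below is my own check, not something o3 produced: it recomputes the product of force and squared separation for each row and estimates the exponent from a least-squares fit of ln F against ln r. The variable names are mine; the numbers are copied from the table.

```python
import math

# Values copied from the table in the mini-paper
r = [0.30, 0.50, 0.70, 1.00]                # separation in m
F = [3.72e-9, 1.34e-9, 6.82e-10, 3.35e-10]  # mean force in N

# If F is proportional to 1/r^2, then F * r^2 should be constant
products = [f * d**2 for f, d in zip(F, r)]
mean_product = sum(products) / len(products)
max_rel_dev = max(abs(p - mean_product) / mean_product for p in products)

# Least-squares slope of ln F against ln r estimates the exponent
x = [math.log(d) for d in r]
y = [math.log(f) for f in F]
x_mean = sum(x) / len(x)
y_mean = sum(y) / len(y)
slope = (
    sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    / sum((xi - x_mean) ** 2 for xi in x)
)

print("F * r^2 per row:", products)
print("largest relative deviation from the mean:", max_rel_dev)
print("fitted exponent:", slope)
```

A fitted exponent close to -2 and a small relative deviation are what an inverse-square relationship predicts.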
Scaling this
With ChatGPT agent, we can even scale this across papers and perhaps get some nice summaries of what worked and what didn't, although the agent needs more guidance on which papers to read.

For those interested, this is the document o4-mini created.
Testing the Inverse-Square Law of Gravitation: A Single Fictitious Experiment
Introduction
Newton’s law of universal gravitation asserts that two point masses $m_1$ and $m_2$ attract each other with a force given by:
$$F = G \frac{m_1 m_2}{r^2},$$
where $r$ is the distance between their centers of mass and $G$ is the gravitational constant. This experiment isolates the distance dependency by holding both masses constant and varying $r$.
Experiment: Verifying the $1/r^2$ Distance Dependency
Experimental Setup
• Masses: Two identical spherical masses, each of $0.5\,\mathrm{kg}$, mounted on insulated, vibration-damped supports.
• Torsion Balance: A high-precision torsion balance measures the tiny gravitational force between the spheres.
• Distance Control: A micrometer-driven translation stage adjusts separation in precise steps.
Procedure
1. Fix both masses at $m_1 = m_2 = 0.5\,\mathrm{kg}$.
2. Set separation distances $r$ at four values: 0.30 m, 0.50 m, 0.70 m, and 1.00 m.
3. For each $r$, record the torsion angle and convert to force using the calibrated torsion constant.
4. Repeat each measurement five times and compute the mean force $\bar F$ and its standard deviation.
Simulated Results
| Separation $r$ (m) | Mean force $\bar F$ (N) | $\bar F \times r^2$ (N·m²) |
|---|---|---|
| 0.30 | 3.72 × 10⁻⁹ | 3.35 × 10⁻¹⁰ |
| 0.50 | 1.34 × 10⁻⁹ | 3.35 × 10⁻¹⁰ |
| 0.70 | 6.82 × 10⁻¹⁰ | 3.35 × 10⁻¹⁰ |
| 1.00 | 3.35 × 10⁻¹⁰ | 3.35 × 10⁻¹⁰ |

The near-constant value of $\bar F r^2$ across all separations demonstrates the $1/r^2$ dependency.
Analysis
Plotting $\bar F$ versus $1/r^2$ yields a straight line through the origin. A linear regression gives:
$$\bar F = (3.35 \times 10^{-10}\,\mathrm{N\cdot m^2}) \times \frac{1}{r^2},$$
with an $R^2$ value of 0.999, confirming the inverse-square relation.
End of Document
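The regression described in the Analysis section of the mini-paper can be re-run from the table. This is a sketch of mine, not part of o4-mini's document, and it assumes a least-squares fit constrained to pass through the origin, with R² computed as one minus the residual sum of squares over the mean-centered total sum of squares. The mini-paper does not say which convention it used, so the R² here need not match its reported value exactly.

```python
# Values copied from the table in the mini-paper
r = [0.30, 0.50, 0.70, 1.00]                # separation in m
F = [3.72e-9, 1.34e-9, 6.82e-10, 3.35e-10]  # mean force in N

# Regress F on x = 1/r^2 with no intercept: F = k * x
x = [1.0 / d**2 for d in r]
k = sum(xi * fi for xi, fi in zip(x, F)) / sum(xi**2 for xi in x)

# Assumed convention: R^2 = 1 - SS_res / SS_tot, with SS_tot centered on the mean
f_mean = sum(F) / len(F)
ss_res = sum((fi - k * xi) ** 2 for xi, fi in zip(x, F))
ss_tot = sum((fi - f_mean) ** 2 for fi in F)
r_squared = 1.0 - ss_res / ss_tot

print("fitted coefficient k (N*m^2):", k)
print("R^2:", r_squared)
```

The fitted coefficient plays the role of the 3.35 × 10⁻¹⁰ N·m² factor in the mini-paper's regression equation.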
[1] I put "causal" in quotes because ML papers most often show strong association rather than causality. Also, extracting causal relationships isn't an easy task for LLMs; see, e.g., https://arxiv.org/pdf/2312.04350