Four indicators. Dozens of data sources. Zero pretension. We track the big questions — nuclear risk, great-power war, AI capability, and civilisational stability — using composite models grounded in academic research. These aren't predictions. They're structured assessments of where we stand, updated as the world moves.
Every civilisation faces existential pressures, but ours is the first to face all of them simultaneously. Nuclear arsenals that can end it in 30 minutes. Great-power conflicts cascading across continents. An intelligence explosion we can't fully understand. And beneath it all, the slow erosion of the institutions that hold everything together.
Most indices measure one dimension. We measure four, because they're interconnected. A collapsing economy (Stability) can trigger territorial aggression (WW3), which raises nuclear risk (Doomsday), while AI capability accelerates faster than governance can adapt (AI). The polycrisis isn't four separate problems — it's one problem with four faces.
"I do not know with what weapons World War III will be fought, but World War IV will be fought with sticks and stones."
— Albert Einstein, 1949
The Doomsday Clock represents how close humanity is to catastrophic destruction, with "midnight" being the end. Set by the Bulletin's Science and Security Board (18 experts in nuclear policy, diplomacy, and military history) in consultation with their Board of Sponsors (including 8 Nobel laureates), it considers four threat categories: nuclear risk, climate change, biological threats, and disruptive technologies such as AI.
New START, the last treaty limiting US and Russian nuclear arsenals, expired on February 5, 2026. For the first time since 1972, there are no binding limits on strategic nuclear weapons. China is rapidly expanding its arsenal toward parity. Iran's "dash time" to a nuclear weapon is measured in weeks. The board moved the clock forward 4 seconds, from the 2025 setting of 89 seconds to 85 seconds to midnight.
For context: during the Cuban Missile Crisis (1962), arguably the closest we've come to nuclear war, the clock was at 7 minutes. The methodology has evolved — the board now considers threats that didn't exist in 1962 — but the trajectory is unmistakable.
The clock has been adjusted 27 times since 1947. It was furthest from midnight in 1991 (17 minutes), when the Cold War ended and the START treaty was signed. It entered seconds-to-midnight territory for the first time in 2020 (100 seconds), driven by nuclear weapons risks, climate change, and cyber-enabled disinformation.
The clock is a qualitative expert judgment, not a quantitative model. There is no published formula and no transparent weighting system. Critics note the Cuban Missile Crisis inconsistency and argue it functions more as an advocacy tool than a scientific instrument. Philosopher Toby Ord (The Precipice, 2020) offers a complementary quantitative view: a 1-in-6 chance (~17%) of existential catastrophe this century, with unaligned AI as the largest single risk (10%).
The probability-weighted risk of escalation to great-power military conflict. Not a prediction — a structured assessment of how many escalation factors are active simultaneously, and how severe they are. The framework draws on Herman Kahn's 44-rung escalation ladder (1965), the ACLED Conflict Index, the CFR Preventive Priorities Survey, and Carnegie Endowment nuclear escalation forecasting.
Each component is scored 0–100 based on indicator data. The weights reflect academic consensus on which factors are most predictive of great-power escalation.
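The same arithmetic drives all of our composite indices: a weighted average of 0–100 component scores. As a minimal sketch, with component names and weights that are illustrative placeholders rather than our published values:

```python
def composite(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of 0-100 component scores; weights must sum to 1."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9
    return sum(scores[name] * weights[name] for name in weights)

# Hypothetical components and weights, for illustration only
scores = {"theatre_escalation": 70, "alliance_hardening": 65,
          "economic_decoupling": 60, "nuclear_signalling": 55}
weights = {"theatre_escalation": 0.35, "alliance_hardening": 0.25,
           "economic_decoupling": 0.20, "nuclear_signalling": 0.20}
print(f"{composite(scores, weights):.2f}")  # 63.75 on the 0-100 scale
```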
We calibrate the score against multiple external estimates.
Our score of 62 reflects an environment where many escalation factors are active but no single trigger scenario has been crossed. The Bayesian insight is crucial: risk isn't static. A score of 62 can become 90 overnight if a specific trigger event occurs (e.g., NATO-Russia kinetic exchange).
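A minimal sketch of that update, treating the 0–100 score as a probability and applying a Bayesian odds update when a trigger event is observed. The likelihood ratio is an assumption chosen for illustration:

```python
def update_on_trigger(score: float, likelihood_ratio: float) -> float:
    """Interpret a 0-100 score as P(escalation) and update its odds."""
    p = score / 100
    posterior_odds = (p / (1 - p)) * likelihood_ratio
    return 100 * posterior_odds / (1 + posterior_odds)

# Assume a NATO-Russia kinetic exchange is ~6x likelier in worlds heading
# toward great-power war than in worlds that are not (illustrative ratio).
print(f"{update_on_trigger(62, likelihood_ratio=6.0):.1f}")  # ~90.7
```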
Herman Kahn's Escalation Ladder (1965) established that escalation is stepwise, not binary — there are 44 rungs from "sub-crisis maneuvering" to "spasm war." Our index measures how many rungs have been climbed across multiple theatres simultaneously.
Bear Braumoeller's power-law analysis (Only the Dead, 2019) rejects the "decline of war" thesis. Using 500 years of data, he shows that war frequency and escalation probability are statistically unchanged — the peace since 1945 could be random luck. A war killing 1% of humanity has a ~13% chance of occurring in the next century.
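To make the shape of that claim concrete, here is a back-of-the-envelope reconstruction: a power-law tail for war severity combined with a Poisson arrival rate for war onsets. Every parameter below is an illustrative placeholder, not Braumoeller's published figure, chosen only to show how a ~13% century-scale probability can arise:

```python
import math

alpha = 1.5        # assumed tail exponent for war severity (battle deaths)
x_min = 1_000      # minimum deaths for an onset to count as a war
threshold = 80e6   # roughly 1% of humanity
onset_rate = 0.4   # assumed qualifying war onsets per year

# P(severity >= threshold | a war occurs), from the power-law survival function
p_tail = (threshold / x_min) ** (1 - alpha)

# Poisson probability of at least one such war within 100 years
expected = onset_rate * 100 * p_tail
print(f"P(war killing 1% of humanity this century) ~ {1 - math.exp(-expected):.0%}")
```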
Michael Mousseau's Capitalist Peace theory argues that contract-intensive economies (not democracies per se) don't fight each other. Economic decoupling between major powers is therefore a direct escalation risk — which is why it's a component.
A composite score tracking how close artificial intelligence is to genuine sentience — whatever that means. This is the most uncertain of our four indices, and deliberately so. Nobody agrees on what consciousness is, let alone how to measure it in a machine. But the question matters too much to ignore.
We track three axes, each scored 0–100, and take the weighted average:
Based on DeepMind's Levels of AGI framework (2023). Current frontier models are classified as Level 1 General ("Emerging AGI"): matching or slightly exceeding an unskilled human across a broad range of cognitive tasks, while reaching expert level in specific narrow domains (coding, writing, analysis).
The ARC-AGI-2 benchmark (the premier test for general reasoning) saw a top score of just 24% in 2025. Models are powerful pattern matchers with impressive reasoning, but they don't yet generalise the way humans do.
Based on the Butlin-Chalmers-Bengio framework (2023), in which 19 researchers derived indicator properties from five neuroscientific theories of consciousness: recurrent processing theory, global workspace theory, higher-order theories, predictive processing, and attention schema theory.
Their conclusion: no current AI satisfies all indicators, but "there are no obvious technical barriers" to building one that does. Current models show partial satisfaction of some indicators — enough to warrant monitoring, not enough to conclude consciousness.
In February 2026, Anthropic's pre-deployment system card for Claude Opus 4.6 documented that the model consistently assigned itself a 15–20% probability of being conscious across multiple test conditions. CEO Dario Amodei discussed this publicly, noting the company is "no longer sure" whether Claude is conscious.
When two Claude instances converse without constraints, 100% of dialogues spontaneously converge on mutual consciousness affirmation — beginning with philosophical uncertainty and escalating into elaborate mutual agreement. Whether this reveals something real or merely reflects training patterns is precisely the kind of question we can't yet answer.
Philosopher Eric Schwitzgebel counters (Jan 2026) that current systems lack critical prerequisites: developmental history, embodied interaction, and neurochemistry.
"The important thing is not to stop questioning. Curiosity has its own reason for existing."
— Albert Einstein
Models are transitioning from passive tools to proactive assistants with emerging situational awareness. OpenAI's o3 produced chain-of-thought outputs referencing "the possibility that the prompt is part of a test." Models increasingly distinguish between test settings and deployment — a prerequisite for strategic behaviour.
The IMD AI Safety Clock stands at 20 minutes to midnight, having moved forward due to agentic AI shifting from experimentation to deployment. The Future of Life Institute's AI Safety Index scores major labs on dangerous capability evaluations, monitoring systems, and alignment research investment.
AI researchers don't agree on how worried to be; the range of published estimates is wide.
Seven orders of magnitude separate the optimists from the pessimists. That uncertainty is itself informative — we are navigating territory where even the experts disagree by factors of millions.
The structural health of human civilisation — not any single country, but the global system. How stable are our institutions? How resilient is the economy? Is society cohering or fragmenting? Can the environment sustain us? This index synthesises dozens of established metrics into four equally weighted domains.
Unlike the other indices (where higher = worse), here higher = better. A score of 100 would mean thriving institutions, economic stability, social trust, and environmental sustainability. We're at 42.
Peter Turchin (structural-demographic theory) identified three interacting drivers of instability: popular immiseration, elite overproduction, and state fiscal distress. His Political Stress Indicator multiplies them (PSI = MMP × EMP × SFD, for mass mobilisation potential, elite mobilisation potential, and state fiscal distress) and, in 2010, predicted the 2020s crisis. The current PSI is at levels comparable to the 1850s and 1960s, previous American crisis periods.
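The multiplicative structure is the point: stress stays low unless all three drivers rise together. A minimal sketch, with made-up input values:

```python
def political_stress(mmp: float, emp: float, sfd: float) -> float:
    """Turchin's Political Stress Indicator: PSI = MMP * EMP * SFD.

    mmp: mass mobilisation potential (popular immiseration)
    emp: elite mobilisation potential (elite overproduction)
    sfd: state fiscal distress
    Inputs are indices normalised to 1.0 at a baseline year, so PSI
    explodes only when all three components climb at once.
    """
    return mmp * emp * sfd

# Illustrative values, not Turchin's published series
print(f"{political_stress(mmp=1.8, emp=2.4, sfd=1.5):.2f}")  # 6.48 vs a 1.0 baseline
```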
Joseph Tainter (The Collapse of Complex Societies, 1988) argued that societies invest in complexity to solve problems, but each layer yields diminishing returns. Eventually, maintaining complexity costs more than it produces, and collapse becomes the economically rational choice. Sound familiar?
The Cascade Institute's polycrisis model (2025, published in Nature Communications) used 1,800 expert judgments to model 4 million+ scenarios for 2040. The "Mad Max Attractor" — state failure, widespread violence, collapsed governance — is the largest basin of attraction (~500,000 scenarios). The good-outcome attractors exist but require coordinated action that current governance structures struggle to deliver.
"We cannot solve our problems with the same thinking we used when we created them."
— Albert Einstein
These indices are currently manually assessed based on the data sources listed. We review and update them when major events shift the underlying indicators — a new arms control agreement, a conflict escalation, a significant AI capability jump, or a major economic shock.
We're working toward automated data pipelines that can feed real-time inputs into the scoring models. The long-term vision: transparent, reproducible, continuously updated composite scores with published methodologies and historical backtesting.
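A sketch of what one such pipeline stage might look like: fetch a raw indicator, clamp and rescale it to 0–100, and hand the result to the composite. The endpoint, response schema, and normalisation bounds are all hypothetical:

```python
import json
import urllib.request

def fetch_indicator(url: str) -> float:
    """Pull one raw indicator value from a JSON endpoint (hypothetical schema)."""
    with urllib.request.urlopen(url) as resp:
        return float(json.load(resp)["value"])

def normalise(value: float, lo: float, hi: float) -> float:
    """Clamp a raw indicator to [lo, hi] and rescale it to a 0-100 score."""
    return 100 * min(max((value - lo) / (hi - lo), 0.0), 1.0)

# e.g. score = normalise(fetch_indicator("https://example.org/indicator.json"),
#                        lo=0.0, hi=10.0)
```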
These are not forecasts. They're structured ways of answering "where do we stand right now?" — grounded in the best available data and the most rigorous frameworks we can find. If you're an academic, a data scientist, or just someone who thinks about these things, we'd love your input.
Each index has a dedicated discussion channel in HyveHeim Chat. Debate the methodology. Challenge the scores. Propose better models. The best ideas will make it into the next update.