Darwin AI · 522 words · 2 min
AI models are gravity fields. Your prompts are masses.
Rafael Lima, an early member of our prompting team at Darwin, gave me this analogy.
Picture a model as the canvas of spacetime in general relativity. A planet placed on the canvas distorts the geometry around it, and that distortion changes how everything else moves through the field. Each AI model — GPT, Claude, Gemini, DeepSeek — is its own canvas with its own planets in different places, because each was trained with different weights and different data. Some models are very similar (rumors of distillation), and you’ll see eerily similar outputs across them. Some are entirely different galaxies.
Your prompt is a set of masses you place on the canvas. The output is the trajectory those masses produce, given the existing gravity. So:
- Same prompt + similar model = similar trajectory (similar output).
- Same prompt + different model = different trajectory.
```
model A — gravity field          model B — gravity field
(e.g. GPT)                       (e.g. Claude)

╔═══════════════╗                ╔═══════════════╗
║   ·    *   ·  ║                ║      *    ·   ║
║   ⊙ ─╮        ║  same prompt   ║  ·   ⊙ ─╮     ║
║      ↓        ║ ═════════════→ ║         ↘     ║
║      *        ║                ║        *      ║
║ trajectory A  ║                ║ trajectory B  ║
╚═══════════════╝                ╚═══════════════╝

⊙ = your prompt (a mass placed on the canvas)
* = the model's training-data planets warping the canvas
↓ = the trajectory = the output
```
To get the same output across two very different models, you have to add new masses to your prompt — extra instructions, extra context, extra constraints — to bend the trajectory back to where you want it.
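One way to make those "extra masses" concrete is to keep a single base prompt and layer model-specific constraints on top of it when you assemble the final prompt. Below is a minimal sketch of that idea; the model names, base prompt, and constraint text are hypothetical placeholders, not anything from our actual prompts.

```python
# Sketch: a shared base prompt plus per-model "masses" (extra constraints) layered on top.
# Model names and constraint text are hypothetical placeholders.

BASE_PROMPT = "You are a support agent for an online store."

EXTRA_CONSTRAINTS = {
    # Hypothetical model whose gravity already lands where we want: no extra masses needed.
    "model-a": [],
    # Hypothetical model whose gravity pulls toward rambling: add masses to bend it back.
    "model-b": [
        "Answer in two sentences or fewer.",
        "Do not restate the customer's question back to them.",
    ],
}

def build_prompt(model_id: str) -> str:
    """Assemble the prompt for a model: the shared base plus that model's extra constraints."""
    return "\n".join([BASE_PROMPT, *EXTRA_CONSTRAINTS.get(model_id, [])])

print(build_prompt("model-b"))
```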
We had a real case we called the “NT problem.” We’d built a prompt on one model that included never say "I understand", because that model was annoyingly servile. It worked fine. Then we migrated to a different model, and the new model started replying with “NT, …” — it had partially absorbed the instruction. Different gravity, different distortion: the same prompt produced a degenerate output. The fix was to rewrite the constraint at a higher level: “don’t be servile; don’t use phrases like ‘I understand’ or its synonyms or near-synonyms.”
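A cheap guard against this class of failure is a regression check over migrated outputs that flags known degenerate patterns. The snippet below is a toy version of that idea, not our production tooling; the patterns and sample outputs are made up for illustration.

```python
import re

# Toy regression check for migrated outputs: flag replies that show the degenerate
# pattern (a bare "NT," prefix) or the original behavior we banned ("I understand").
# Patterns and samples are illustrative only.
DEGENERATE_PATTERNS = [
    re.compile(r"^\s*NT\b", re.IGNORECASE),
    re.compile(r"^\s*I understand\b", re.IGNORECASE),
]

def flag_degenerate(outputs: list[str]) -> list[str]:
    """Return every output that matches a known degenerate pattern."""
    return [o for o in outputs if any(p.search(o) for p in DEGENERATE_PATTERNS)]

sample = ["NT, your order has shipped.", "Your order has shipped."]
print(flag_degenerate(sample))  # -> ['NT, your order has shipped.']
```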
Now zoom out. Imagine all of humanity using these models constantly — for ordering coffee, for writing to friends, for running businesses. When a model provider changes their gravity (a new version, a new RLHF pass), every app built on top of it shifts. The companies that survive the shift are the ones that have built the eval infrastructure, the comparison tooling, the team whose entire job is monitoring model-to-model migrations.
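Comparison tooling can start very small: replay a fixed prompt suite against the old and new model and score how far the outputs drift. A rough sketch follows, assuming a stubbed `call_model` client and a deliberately crude word-overlap metric; real infrastructure would use task-specific evals.

```python
# Sketch of model-to-model drift measurement over a fixed prompt suite.
# call_model is a stub standing in for whatever model client you actually use.

def call_model(model_id: str, prompt: str) -> str:
    raise NotImplementedError("wire this to your model client")

def overlap(a: str, b: str) -> float:
    """Fraction of shared words between two outputs (1.0 = identical vocabulary)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(len(wa | wb), 1)

def drift_report(prompts: list[str], old_model: str, new_model: str) -> list[tuple[str, float]]:
    """Low scores mean the new model's gravity moved the trajectory for that prompt."""
    return [
        (p, overlap(call_model(old_model, p), call_model(new_model, p)))
        for p in prompts
    ]
```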
We have that team at Darwin. It’s a new role; it didn’t exist five years ago. As models keep evolving, the role only becomes more critical. If your application is one brittle prompt away from breaking when the next frontier model ships, you don’t have a moat — you have a fuse.
So: assume your gravity is going to change. Build for migration. Have the team for it.