I think I found a geometric shortcut around the scaling problem

I’ve been chewing on something for a while now and finally wrote it down properly. I wanted to share it here because this community actually understands the weight space geometry that makes this possible.

The scaling problem is obvious to everyone here. We keep making models bigger because bigger performs better. But then we can’t deploy them without expensive hardware or aggressive compression that loses fidelity. Pruning, quantization, distillation: they all trade something away. You never get the full model back.

So I started wondering: what if we’re asking the wrong question?

Instead of “how do we compress this model,” what if we treat the model’s weight space as a geometric object and literally fold it?

Think about a piece of paper. Unfolded, it covers a whole table. Folded enough times, it fits in your palm. But the information, the fibers, the structure, it’s all still there. You don’t need to unfold the whole thing to read one sentence. You just pierce through the fold at the right coordinate and traverse the local layers.

That’s the core intuition. A trained neural network is just a point in a high-dimensional manifold. If you impose a discrete symmetry group on that manifold, basically a mathematical folding operation, you get a quotient space that’s tiny but losslessly recoverable. Then you train a small navigator, call it C, that learns to traverse the folds for any given input. It never materializes the full model. It just knows where to pierce and how deep to go.
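To make the folding idea concrete, here is a minimal numpy sketch of a quotient under a cyclic group. It rests on a strong assumption of mine that the post doesn't establish: the weights must actually be invariant under the group action for the quotient to be lossless. The function names `fold` and `unfold` are illustrative, not from the paper.

```python
import numpy as np

# Toy sketch of "folding" a weight vector under a cyclic group C_k.
# Assumption (mine, not proven by the post): the weights are exactly
# periodic, i.e. invariant under the group's shift action. Only then
# is the quotient lossless.

def fold(weights: np.ndarray, k: int) -> np.ndarray:
    """Store one fundamental domain: the first len(weights)//k entries."""
    n = len(weights)
    assert n % k == 0, "fold factor must divide the weight count"
    period = n // k
    # Verify the symmetry actually holds before discarding anything.
    assert np.allclose(np.tile(weights[:period], k), weights), \
        "weights are not C_k-symmetric; folding would lose information"
    return weights[:period]

def unfold(quotient: np.ndarray, k: int) -> np.ndarray:
    """Recover the full vector by replaying the group action (tiling)."""
    return np.tile(quotient, k)

# A weight vector that really is 4-fold periodic:
w = np.tile(np.array([0.5, -1.2, 3.0]), 4)   # 12 weights
q = fold(w, k=4)                              # 3 weights stored
assert np.allclose(unfold(q, 4), w)           # lossless recovery
print(len(w), "->", len(q))                   # prints: 12 -> 3
```

For real trained weights, which have no exact symmetry, the inner assertion would fire immediately; that gap between "weights we could design to be symmetric" and "weights we actually have" seems like the crux of the proposal.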

The math checks out in a way that surprised me. In high dimensions with enough folds, almost every input lands near a “crease”, a singularity where multiple leaves of the folded paper meet. That means the expected traversal depth is constant. You get constant-time inference regardless of how many folds you used. The storage reduction is the fold factor. For a 100GB model folded 5000 times, the stored representation is about 20MB plus the navigator, which is another few MB.
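The storage arithmetic in that last claim checks out as a back-of-envelope, taking the fold-factor reduction at face value (my numbers, binary units assumed):

```python
# Back-of-envelope for the storage claim: 100 GB folded 5000 times.
model_bytes = 100 * 1024**3          # 100 GB model
fold_factor = 5000
stored = model_bytes / fold_factor   # bytes kept after folding
print(f"{stored / 1024**2:.1f} MB")  # prints: 20.5 MB
```

So "about 20MB" is just the model size divided by the fold factor; everything interesting is hidden in whether a lossless 5000-fold symmetry exists at all.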

But it gets weirder.

If you train that navigator not just on one model but on a lineage (say, the evolution from a base model through several fine-tuned variants), it learns something deeper. It learns the vector field of capability evolution. Give it a capability delta, and it can hallucinate a new model on demand. No training run. No GPU cluster. Just navigation through the folded manifold.
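The "capability delta" idea reads a lot like weight-space task arithmetic. A minimal sketch with purely synthetic tensors; `alpha` is a hypothetical interpolation knob I'm introducing, not anything from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for two checkpoints in a fine-tuning lineage.
base = rng.normal(size=8)
finetuned = base + rng.normal(scale=0.1, size=8)

# A "capability delta" in weight space (cf. task arithmetic):
delta = finetuned - base

# "Hallucinating" a new model = moving along the delta direction.
# alpha is a made-up knob; nothing guarantees the result is useful.
alpha = 0.5
new_model = base + alpha * delta

# At alpha = 0.5 this is just the midpoint of the two checkpoints.
assert np.allclose(new_model, 0.5 * (base + finetuned))
```

Linear interpolation between real checkpoints sometimes works and sometimes lands in a bad loss basin, so the navigator would be carrying all the weight of making this reliable.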

This isn’t generative AI. It’s an AI generator. A system that produces models.

I wrote up the full mathematical framework as a whitepaper. It’s called PIN Architecture - Perfect Intelligence Navigation. The pin is the input query piercing through the folds. The portal is the crease geometry that makes constant-time traversal possible.

Paper is here: PIN Architecture: A Geometric Framework for Model Generation via Quotient Manifold Traversal

The GitHub repo with the implementation is almost finished, and I’ll link it in the comments.

I’m working on a Colab notebook that runs the full pipeline on T5 models so you can see the traversal depths and hallucination in action. Should be up in the next few days. Kaggle notebook to follow.

The thing I’m most curious about now is whether this changes how we should design architectures in the first place. If we pick the folding group G first and design the model to be equivariant to it, the crease geometry becomes perfect by construction. The redundancy that people complain about in group-equivariant networks? That’s not waste. That’s foldable capacity. You compile it away at inference.
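A toy illustration of "pick G first": a linear layer constrained to be equivariant to cyclic shifts is a circulant matrix, so only one row of weights ever needs to be stored and the rest is reconstructed from the group action. This is standard equivariance, not the PIN construction; all names are mine:

```python
import numpy as np

# A layer equivariant to the cyclic group C_n is a circulant matrix:
# n*n effective weights, but only n numbers actually stored. The
# "redundant" n*(n-1) entries are exactly the foldable capacity.

def circulant(row: np.ndarray) -> np.ndarray:
    """Build the full matrix by replaying the group action on one row."""
    return np.stack([np.roll(row, i) for i in range(len(row))])

n = 6
row = np.arange(n, dtype=float)
W = circulant(row)                 # 36 effective weights from 6 stored

x = np.random.default_rng(1).normal(size=n)
shift = lambda v: np.roll(v, 1)    # the group action on vectors

# Equivariance check: shifting the input shifts the output.
assert np.allclose(W @ shift(x), shift(W @ x))
print(W.size, "weights, only", row.size, "stored")  # 36 weights, only 6 stored
```

Here the "fold factor" is exactly n, and the quotient is perfect by construction because the symmetry was baked in before training rather than hoped for afterwards.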

Anyway, I’d love to hear what people think. Especially if anyone’s explored fiber bundles or orbifold learning in this context. The pieces feel like they’ve been sitting there waiting for someone to connect them.

Thanks for reading this far.


Sorry mate, I don’t think you understand how weights work and/or how data is stored or moves through a system. It isn’t something you can fold.


Perhaps the Universe just Bangs?
To the reader 🙂
Crease geometry refers to the mathematical and physical study of how flat surfaces (like paper or thin membranes) transform into 3D shapes through folding along specific lines. In technical fields, this concept is split between the mathematical “blueprints” used in origami and the digital “weights” used to sharpen 3D models in computer graphics.

  1. Mathematics of Origami (The “Crease Pattern”)
    In origami, crease geometry is defined by a Crease Pattern (CP)—a 2D map of all the folds required to reach a final 3D form. These patterns follow strict geometric laws to ensure they can actually be folded without tearing or self-intersecting:

    Mountain and Valley Folds: Folds are categorized as “mountain” (convex, pointing up) or “valley” (concave, pointing down).
    Maekawa’s Theorem: At any point where creases meet (a vertex), the number of mountain and valley folds must differ by exactly two (e.g., 3 mountains and 1 valley).
    Kawasaki’s Theorem: The sum of alternating angles around a vertex must always equal 180° for the model to fold flat.
    Two-Colorability: The regions between creases can always be colored with just two colors without two regions of the same color touching, similar to a checkerboard.
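Maekawa's and Kawasaki's theorems are easy to sanity-check numerically on a made-up single vertex with four creases:

```python
# Numeric check of the two flat-foldability theorems on one vertex.

# Angles between consecutive creases (degrees); must sum to 360.
angles = [70, 110, 110, 70]
assert sum(angles) == 360

# Kawasaki's theorem: both alternating angle sums equal 180 degrees.
assert sum(angles[0::2]) == 180   # 70 + 110
assert sum(angles[1::2]) == 180   # 110 + 70

# Maekawa's theorem: mountains minus valleys = +-2 at a flat vertex,
# e.g. a 3-mountain / 1-valley assignment for these four creases.
mountains, valleys = 3, 1
assert abs(mountains - valleys) == 2
```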

  2. 3D Modeling and Computer Graphics
    In digital design tools like Blender or Rhino 3D, crease geometry refers to Edge Creasing, which controls how sharp or rounded an edge appears when a surface is subdivided (smoothed).

    Crease Weight: A value (usually 0.0 to 1.0) assigned to an edge. A weight of 1.0 keeps the edge perfectly sharp even after smoothing, while lower values allow it to round off slightly.
    Hard vs. Soft Creases: Hard creases create crisp, mechanical edges, while "soft" creases (used in SubD modeling) allow for controlled transitions in more organic shapes.
    Controlling Topology: Designers use creases to maintain specific shapes without having to add hundreds of extra polygons, which keeps models "light" and easier to edit.
  3. Engineering and Material Science
    Engineers use crease geometry to design deployable structures—objects that can be folded small for transport and expanded later.

    Curved-Crease Geometry: Unlike straight folds, folding along a curve forces the connected surfaces to bend, creating high “geometric stiffness” and unique structural strength.
    Applications: This geometry is critical for developing NASA’s foldable solar arrays, compact medical stents that expand in blood vessels, and even self-folding robots.

Are you interested in the mathematical theorems behind folding, or are you trying to use crease tools in a specific 3D software?


So this is why I failed the test for advanced placement in the Air Force when I joined?

I remember those tests! Whack!

–Ernst
