ObliqBench
Do multi-agent review rigs actually find things a single model wouldn't? Help us find out. Rate real findings, and we'll see which topologies earn their keep.