About Us
*Lightning Rod Labs trains AI to predict the future.*
We turn messy, timestamped history into grounded training data automatically—no labeling or extraction required. We treat the future as the label, generating high-signal supervision at scale, so models learn causal factors—not just tokens. This means you can go from raw data to deployable specialized models in hours, removing the data bottleneck entirely.
We’ve used this to beat frontier AIs 100x larger on prediction‑market benchmarks, and have demonstrated success in domains ranging from financial forecasting to predicting supply chain disruptions.
What You’ll Do
Training data is the biggest bottleneck in AI today. You’ll build the platform that eliminates it, giving engineers everywhere the easiest way to generate high-quality, grounded training data from their own messy sources and teach AI real‑world reasoning at scale.
- Build the platform for prediction agents. Own the infrastructure for training and serving prediction agents.
- Scale the data generation engine. Build pipelines that turn messy public + private data into training and evaluation signals.
- Own the platform & SDK. Improve the developer experience for enterprises generating custom training data.
- Make it production-grade. Set up the engineering systems that keep quality high and reliability strong as we scale.
- Build the team. Hire, manage, and grow engineers as we scale.
What We Expect from You
- Excellent Python chops; comfortable with TypeScript and Next.js.
- Proven track record shipping systems that matter.
- Engineering leadership: you’ve led teams and will set technical direction, drive the roadmap, and make good tradeoffs (scrappy vs. over-engineered).
- People management: hiring, setting expectations, evaluating performance, and giving clear feedback.
- Execution discipline: you can improve team velocity and delivery without heavy process.
- Quality + reliability: you’ve set up systems that minimize bugs and keep tech debt under control through code review norms, testing practices, and practical continuous integration (CI).
- Built for scale: you’ve designed systems that handle high job throughput and stay reliable. Bonus for GPU clusters or distributed training/inference.
- Data + pipelines at scale: large datasets, orchestration, and production data quality (for example, PyArrow, BigQuery, and Prefect).