---
title: Founding ML Engineer
date: 02.24.25
tags:
  - positions
  - product
  - dev
  - announcements
---

(NYC, Full-Time)

## About the Role

We're seeking a leading machine learning engineer who can architect breakthrough systems while staying immersed in cutting-edge research. In direct collaboration with the CEO, you'll shape the future of AI at Plastic--tackling challenges across the entire machine learning stack.

This role demands someone who thrives at the intersection of research and engineering; someone who can read and reproduce state-of-the-art papers, design novel architectures, and transform promising experiments into production-ready systems. You'll move fluidly between theoretical frameworks and practical implementation, making intuitive decisions about model architecture, quantization strategies, and serving infrastructure.

We need a technical polymath who excels across the ML stack: from designing systematic experiments and running rigorous evaluations to building robust data pipelines and scalable model serving/inference systems. You should be particularly adept with post-training techniques that are redefining the field--from advanced inference-time computation methods to reinforcement learning with reasoning models.

The LLM space moves at lightning speed, and so do we. You'll prototype rapidly while maintaining research rigor, implement robust MLOps practices, and craft observable systems that scale.

Our small, interdisciplinary team moves fast--high agency is core to who we are. You'll have the freedom to directly impact our research and products to push the boundaries of what's possible in AI.

We're building systems that haven't been built before, solving problems that haven't been solved. If you're a technical leader who thrives on these challenges and can serve as our ML north star, we want you on our team.

## About You

- 3+ years of applied ML experience with deep LLM expertise
- High cultural alignment with Plastic Labs' ethos
- NYC-based or open to NYC relocation
- Strong command of a popular Python ML library (e.g., PyTorch, TF, JAX, HF transformers)
- Experience replicating research papers soon after publication
- Experience building and scaling robust inference systems
- Practical experience with post-training and inference-time techniques (RL a plus)
- Ability to build reliable MLOps pipelines that perform under load
- Proficiency with Unix environments and developer tools (Git, Docker, etc.)
- Up-to-date with the Open Source AI community and emerging technologies
- Self-directed with a bias toward rapid execution
- Driven to push language models beyond conventional boundaries
- Background in cognitive sciences (CS, linguistics, neuroscience, philosophy, psychology, etc.) or related fields a plus

## Research We're Excited About

[s1: Simple test-time scaling](https://arxiv.org/abs/2501.19393)

[Neural Networks Are Elastic Origami!](https://youtu.be/l3O2J3LMxqI?si=bhodv2c7GG75N2Ku)

[Titans: Learning to Memorize at Test Time](https://arxiv.org/abs/2501.00663v1)

[Mind Your Theory: Theory of Mind Goes Deeper Than Reasoning](https://arxiv.org/abs/2412.13631)

[Generative Agent Simulations of 1,000 People](https://arxiv.org/abs/2411.10109)

[DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning](https://arxiv.org/abs/2501.12948)

[Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains](https://arxiv.org/abs/2501.05707)

[Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm](https://arxiv.org/pdf/2102.07350)

[Theory of Mind May Have Spontaneously Emerged in Large Language Models](https://arxiv.org/pdf/2302.02083v3)

[Think Twice: Perspective-Taking Improved Large Language Models' Theory-of-Mind Capabilities](https://arxiv.org/pdf/2311.10227)

[Representation Engineering: A Top-Down Approach to AI Transparency](https://arxiv.org/abs/2310.01405)

[Theia Vogel's post on Representation Engineering Mistral 7B an Acid Trip](https://vgel.me/posts/representation-engineering/)

[A Roadmap to Pluralistic Alignment](https://arxiv.org/abs/2402.05070)

[Open-Endedness is Essential for Artificial Superhuman Intelligence](https://arxiv.org/pdf/2406.04268)

[Simulators](https://generative.ink/posts/simulators/)

[Extended Mind Transformers](https://arxiv.org/pdf/2406.02332)

[Violation of Expectation via Metacognitive Prompting Reduces Theory of Mind Prediction Error in Large Language Models](https://arxiv.org/abs/2310.06983)

[Constitutional AI: Harmlessness from AI Feedback](https://arxiv.org/pdf/2212.08073)

[Claude's Character](https://www.anthropic.com/research/claude-character)

[Language Models Represent Space and Time](https://arxiv.org/pdf/2310.02207)

[Generative Agents: Interactive Simulacra of Human Behavior](https://arxiv.org/abs/2304.03442)

[Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge](https://arxiv.org/abs/2407.19594)

[Cyborgism](https://www.lesswrong.com/posts/bxt7uCiHam4QXrQAA/cyborgism)

[Spontaneous Reward Hacking in Iterative Self-Refinement](https://arxiv.org/abs/2407.04549)

[... accompanying twitter thread](https://x.com/JanePan_/status/1813208688343052639)

(Back to [[Working at Plastic]])