Senior Data Platform Engineer
Crosby | In-person | Full time | New York City | $170,000 - $225,000 | Offers equity

Welcome to Crosby, the next-generation law company!

We're a team of technologists and legal experts collaborating closely to reimagine corporate legal services from the ground up. We build proprietary technology and human-in-the-loop workflows to radically enhance the lawyer-machine relationship. Our mission is to review complex documents faster than ever and with perfect quality.

Crosby was founded by Ryan (Penn, Stanford Law, ex-Cooley, former GC) and John (Penn M&T, ex-Ramp, ex-Google).

We believe:

  • A great legal system is the watermark of a great society.
  • Legal work is art and science. We are discovering the frontier between the two: codifying what can be systematized and amplifying human expertise where judgment matters most.
  • The right way to deploy AI in law isn’t by selling software. It’s by selling outcomes: higher-quality legal work, delivered faster and more efficiently.
  • In an in-person culture at our NYC office.

The Role

As a Founding Data Platform Engineer, you will build and own the technical backbone that powers Crosby's entire AI-driven platform. Your mission is to design, build, and scale the data infrastructure responsible for ingesting, processing, and storing vast amounts of complex legal documents. You will create the reliable, high-performance systems that our machine learning models and application engineers depend on to transform the legal industry. This is a foundational role with a massive impact on our ability to scale.

What You’ll Do

  • Build the core pipeline: Design and build scalable data ingestion and processing pipelines for unstructured documents (PDFs, DOCX) using tools like Python and Prefect.
  • Own the storage layer: Manage our core data storage layer, including PostgreSQL with pgvector for hybrid search and Redis for high-speed caching.
  • Power AI Workloads: Develop and maintain our serverless compute and ML workloads using infrastructure like Modal and AWS.
  • Codify Infrastructure: Implement and manage robust, version-controlled infrastructure using Infrastructure as Code (Pulumi).
  • Fuel Product Velocity: Partner closely with ML and application engineers to build the APIs and data models they need to launch new, AI-powered features.
  • Raise the bar: Define best practices for data quality, observability, and reliability across the entire platform.

What We’re Looking For

  • Experience: 3+ years in data engineering, platform engineering, or a related backend role.
  • Strong Programming Skills: Deep proficiency in Python and a strong command of SQL.
  • Cloud Infrastructure: Hands-on experience building and managing infrastructure on AWS (e.g., RDS, ECS, S3).
  • Builder Mindset: You are excited about the opportunity to build data systems from the ground up and make foundational architectural decisions.
  • Data Modeling: A strong understanding of how to model data for both analytical and operational use cases.
  • Ownership: You are driven to take full responsibility for the reliability and performance of the systems you build.

Ideal Qualifications

  • Direct experience with workflow orchestration tools like Prefect or Airflow.
  • Experience with Infrastructure as Code tools like Pulumi or Terraform.
  • Familiarity with vector databases and search technologies (pgvector, Pinecone, etc.).
  • Experience in a fast-paced, early-stage startup environment.
  • Knowledge of serverless compute platforms like Modal or AWS Lambda.

Why Work at Crosby

  • Foundational Impact: As a founding engineer, you will shape the data platform, culture, and technical direction of the entire company.
  • Competitive salary and equity compensation.
  • Comprehensive health, dental, and vision insurance.
  • Unlimited PTO.
  • In-person team in NYC with a collaborative, high-energy environment.

Apply now to join Crosby and be part of transforming the legal landscape.