New York Times Bldg, 620 8th Ave, New York, NY 10018, USA - fully remote

The NYT Messaging Platforms group delivers hundreds of millions of messages per day to tens of millions of NYTimes users while providing the resiliency required to deliver breaking news at any time. The group is responsible for developing the core messaging platforms for the New York Times, including email, push notifications, and on-site messaging. To reach our goal of 10 million digital subscribers by 2025, this group is committed to evolving our platform’s architecture to ensure we can extend the reach of our content and drive engagement for our subscribers.

We are about to embark on several projects to evolve our current platforms into a more mature distributed architecture. To do so, we need a Principal Engineer with extensive experience in building and operating distributed systems at scale. This role will report to Jeremy Smith, Engineering Director of Messaging Platforms, and will drive the group’s technical strategy.

You will:

  • Provide technical leadership to the engineers in the Messaging Platforms group and beyond
  • Drive effective decision making across the Messaging Platforms group regarding architecture, design, and implementation details
  • Ensure that our messaging systems are highly available, scalable, secure, observable, operable, and performant while scaling to deliver billions of messages to our readers each month
  • Mentor engineers while providing coaching where needed to level up the group’s abilities
  • Engage in collaborative discussions across the Messaging Platforms group to garner buy-in on your technical strategy
  • Make pragmatic decisions between delivery of new capabilities and repayment of technical debt
  • Influence organization-wide technical decisions

You should apply if you have:

  • Extensive experience building distributed systems in the cloud, from concept through production
  • Ability to design systems to be reliable despite constantly evolving and potentially unreliable conditions
  • Experience migrating a tightly coupled architecture (e.g., monolith or distributed monolith) to a distributed system with discrete, loosely coupled services
  • Strong understanding of concurrency patterns and fault-tolerant design 
  • Deep understanding of IPC and RPC patterns for distributed systems
  • Practical experience with traffic shaping for services with unpredictable load patterns
  • Experience with multiple database models and the design trade-offs between ACID transactions and eventual consistency
  • An eye toward designing for long-term scalability
  • Experience with container orchestration and related technologies (e.g., Kubernetes, service mesh)
  • Experience with Continuous Integration and Continuous Delivery techniques and tooling 
  • A product mindset, in which engineering costs are evaluated against benefits to our users
  • Comfort navigating uncertainty and driving clarity on long-term cross-team initiatives


  • The New York Times is committed to a diverse and inclusive workforce, one that reflects the varied global community that we serve. Our journalism and the products we build in the service of that journalism greatly benefit from a range of perspectives that can only come from diversity across our ranks, at all levels of the organization. Our dedication to diversity and inclusion is the right thing to do, and it is also the smart thing for our business.
  • We have 11 employee-led groups for people who share identities and interests, ranging from the Arab Collective to Young Professionals. These groups help to build an inclusive community and host over 100 events a year.
  • We value transparency. Our processes reflect this. We share roadmaps, ideas, praise, constructive criticism, and generally over-communicate. When we have something to say to someone or ask of someone, we go and find them and say or ask them directly.
  • We clean up. We have seen firsthand the effects of sprawl, neglect, and things left unfinished. We finish, we clean up, we close the loop. We celebrate simplification and retiring the old.
  • We learn from failure. This means an open examination of what went wrong, through blameless post-mortems, five whys, etc., and, whenever possible, failing to a state that a) works and b) defaults to availability for our users.
  • We value technical leadership equally with, and alongside, management. We use our technical professional tracks to identify the people we have entrusted with technical leadership. These individuals are very important within our organization.
  • Our technical leaders also teach and mentor our team members. They are the key to us continuously improving our technical capability.
