Senior Data Scientist



Data Science
Posted on Wednesday, June 5, 2024

Company Overview

Deepgram is a foundational AI company on a mission to transform human-machine interaction using natural language. With just an API call, any developer can access the fastest, most powerful voice AI platform, including models for speech-to-text, text-to-speech, and spoken language understanding. From transcription to sentiment analysis to voice synthesis, Deepgram is the preferred partner for builders of voice AI applications.

The Opportunity

Despite the proliferation of text-based communication, voice remains the preferred medium for humans to interact with machines. Delivering real-world voice AI solutions to our customers' most challenging problems ultimately drives our mission. At Deepgram, you will have the unique opportunity to innovate, experiment, and build, significantly shaping our products and AI capabilities. We value tenacious problem-solving and the ability to iterate, learn, and adapt. Domain-specific expertise in speech or language AI is not required; you're encouraged to deepen your skills on the job, broadening your knowledge and expertise through constant iteration and invention. Our start-up environment offers a steep growth trajectory, thanks to a level of ownership and a direct connection with end customers that larger research labs simply cannot provide. Embark on a journey to redefine voice technology with us at Deepgram.

The Role

Deepgram is currently looking for seasoned Data Scientists with demonstrated experience solving hard data problems while exploring research frontiers. Conversational audio presents incredibly rich scientific, engineering, and infrastructure challenges that are orders of magnitude harder than those posed by text. At Deepgram, you will help us build an industrial “data factory” that will power the next generation of Voice AI systems. This will unlock models that go beyond basic transcription and comprehension: capturing nuanced meanings in complex conversations, adapting robustly to diverse speech patterns, and generating empathic responses with human-like, contextualized speech. You will collaborate closely with our product, engineering, and data teams to build and deploy models in the most scalable voice API on the planet. We look forward to you bringing your expertise, sharing insights from your latest experiments, and collaborating with us to push the boundaries of AI and voice technology.

What You’ll Do

  • Drive high-performance data acquisition, preparation, and synthesis pipelines to generate data for the next generation of speech and language AI foundation models

  • Develop advanced characterizations of complex conversational audio using a diverse toolkit of signal processing techniques and deep learning models

  • Collaborate with DataOps and Engineering to create automated systems that scale the ability of human annotators to label high-value data and provide critical feedback on model outputs

  • Build advanced benchmarking methodologies and curated datasets for evaluating conversational voice systems

  • Document and present results of data experiments and analysis for internal and external audiences

You’ll Love This Role If You

  • Are obsessed with making sense out of complex and/or messy data

  • Enjoy building from the ground up and love to create new systems from scratch

  • Are passionate about AI and interested in leveraging data to solve hard problems

  • Are motivated by the prospect of scaling yourself using automation and AI models

It’s Important To Us That You Have

  • Experience building data processing pipelines from a blank page and owning the entire data stack including data acquisition, characterization, cleaning, serving and transformation

  • Experience and expertise applying statistical methods and deep learning models to understand complex data

  • Strong communication skills and the ability to translate complex concepts into simple terms, depending on the target audience

  • Strong software engineering skills with particular emphasis on developing clean, modular code in Python and working with PyTorch

Nice To Have

  • Background in Physics, Mechanical Engineering or Language Processing

  • Experience building models

  • Speech and audio experience

Backed by prominent investors including Y Combinator, Madrona, Tiger Global, Wing VC and NVIDIA, Deepgram has raised over $85 million in total funding after closing our Series B funding round last year. If you're looking to work on cutting-edge technology and make a significant impact in the AI industry, we'd love to hear from you!

Deepgram is an equal opportunity employer. We want all voices and perspectives represented in our workforce. We are a curious bunch focused on collaboration and doing the right thing. We put our customers first, grow together and move quickly. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, gender identity or expression, age, marital status, veteran status, disability status, pregnancy, parental status, genetic information, political affiliation, or any other status protected by the laws or regulations in the locations where we operate.

We are happy to provide accommodations for applicants who need them.