Senior Data Scientist

Deepgram

Data Science

Remote

Posted on Wednesday, June 5, 2024

Company Overview

Deepgram is a foundational AI company on a mission to transform human-machine interaction using natural language. We give any developer access to the fastest, most powerful voice AI platform, including models for speech-to-text, text-to-speech, and spoken language understanding, with just an API call. From transcription to sentiment analysis to voice synthesis, Deepgram is the preferred partner for builders of voice AI applications.

The Opportunity

Despite the proliferation of text-based communication, voice remains the preferred medium for humans to interact with machines. Delivering real-world voice AI solutions to our customers' most challenging problems ultimately drives our mission. At Deepgram, you will have the unique opportunity to innovate, experiment, and build -- significantly shaping our products and AI capabilities. We value tenacious problem-solving and the ability to iterate, learn, and adapt. Domain-specific expertise in speech or language AI is not required. As such, you're encouraged to deepen your skills on the job, broadening your knowledge and expertise through constant iteration and invention. Our start-up environment offers a stunning growth trajectory due to a level of ownership and an on-the-ground connection with end customers that larger research labs simply cannot provide. Embark on a journey to redefine voice technology with us at Deepgram.

The Role

Deepgram is currently looking for seasoned Data Scientists with demonstrated experience solving hard data problems while exploring research frontiers. Conversational audio presents incredibly rich scientific, engineering, and infrastructure challenges that are orders of magnitude harder than working with text. At Deepgram, you will help us build an industrial “data factory” that will power the next generation of Voice AI systems -- unlocking the creation of models that go beyond basic transcription and comprehension, capturing nuanced meanings in complex conversations, adapting robustly to diverse speech patterns, and generating empathic responses with human-like, contextualized speech. You will collaborate closely with our product, engineering, and data teams to build and deploy models in the most scalable voice API on the planet. We look forward to you bringing your expertise, sharing insights from your latest experiments, and collaborating with us to push the boundaries of AI and voice technology.

What You’ll Do

Drive high-performance data acquisition, preparation, and synthesis pipelines to generate data for the next generation of speech and language AI foundation models
Develop advanced characterizations of complex conversational audio using a diverse toolkit of signal processing techniques and deep learning models
Collaborate with DataOps and Engineering to create automated systems that scale the ability of human annotators to label high-value data and provide critical feedback on model outputs
Build advanced benchmarking methodologies and curated datasets for evaluating conversational voice systems
Document and present results of data experiments and analysis for internal and external audiences

You’ll Love This Role If You

Are obsessed with making sense out of complex and/or messy data
Enjoy building from the ground up and love to create new systems from scratch
Are passionate about AI and interested in leveraging data to solve hard problems
Are motivated by the prospect of scaling yourself using automation and AI models

It’s Important To Us That You Have

Experience building data processing pipelines from a blank page and owning the entire data stack, including data acquisition, characterization, cleaning, serving, and transformation
Experience and expertise applying statistical methods and deep learning models to understand complex data
Strong communication skills and the ability to translate complex concepts into simple terms, depending on the target audience
Strong software engineering skills with particular emphasis on developing clean, modular code in Python and working with PyTorch

Nice to have

Background in Physics, Mechanical Engineering or Language Processing
Experience building models
Speech and audio experience

Backed by prominent investors including Y Combinator, Madrona, Tiger Global, Wing VC and NVIDIA, Deepgram has raised over $85 million in total funding after closing our Series B funding round last year. If you're looking to work on cutting-edge technology and make a significant impact in the AI industry, we'd love to hear from you!

Deepgram is an equal opportunity employer. We want all voices and perspectives represented in our workforce. We are a curious bunch focused on collaboration and doing the right thing. We put our customers first, grow together and move quickly. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, gender identity or expression, age, marital status, veteran status, disability status, pregnancy, parental status, genetic information, political affiliation, or any other status protected by the laws or regulations in the locations where we operate.

We are happy to provide accommodations for applicants who need them.
