Career Opportunities in Machine Learning: A Live Talk with Cohere’s Alexandre Matton

Natalie Cone
March 13, 2023
15 min read

In this post, we share a summary of the main portion of Matton’s talk, aimed at an audience of students, early ML career practitioners, and career transitioners. Readers of the post will learn: 

  • 5 main roles in ML: what the work looks like in practice and where you can find it in the job market
  • Matton’s thoughts on preparing yourself for a career in ML engineering, including education, building a portfolio, and interviewing tips

And a bonus discussion for current industry professionals! During the audience Q&A, Matton shares his tips for automated monitoring and deployment of ML models.

Background

Alexandre Matton is a Research Machine Learning Engineer at Cohere, where he works on building semantic search applications. Previously, he led R&D for the document processing team at Scale AI. Alexandre focuses on using cutting-edge machine learning techniques to address real-world industry challenges. He holds a Master of Science in computational mathematics from Stanford University and École Polytechnique.

Founded in Toronto in 2019, Cohere provides natural language processing models that help companies improve human-machine interactions. Of the company and his role within it, Matton says, “Cohere is very research-oriented with a significant part of the team working on large language models similar to GPT. The ML team works on many NLP use cases, such as semantic search and classification, since they provide high value across a range of companies and industries.” He adds, “Because of the topics we're working on, we’re extremely focused on research at the edge of current boundaries. There’s a culture of spending most of our time coding and implementing as many ideas as possible to figure out what works and what doesn't. My role involves a lot of research, but I also do a bit of everything,” as might be expected in a company of Cohere’s size (just over 150 employees).

5 Main Roles in ML

  1. ML operations or ML infrastructure engineer 
  2. ML research scientist
  3. General ML engineer
  4. Data engineer
  5. Data scientist


The ML operations or ML infrastructure engineer is responsible for building everything around models, including deploying, monitoring, and maintaining them. 

  • Their goal is to ensure that the infrastructure is solid, robust, and scalable. They monitor the model in production to ensure it correctly answers what it was made to answer. 
  • This role bridges the gap between data science/AI and software engineering. It’s suitable for software engineers who want to move into ML. 
  • There is a focus on inference and deployment: keeping all deployments up to date and scaling as expected, handling logging and monitoring, and ensuring that nothing abnormal happens with the model. 
  • The role typically requires less knowledge in pure ML and more in computer science. 
  • The role can be found in companies of any size, as long as they are deploying ML models. However, it may be more common in larger companies with more complex and demanding ML systems.

The ML research scientist is focused on pure research and improving machine learning methods. 

  • Their primary responsibility is pure research. They either take an existing research problem or create a new one, get acquainted with the state-of-the-art, and try to improve on it. 
  • Their end goal is often to publish new papers at conferences.
  • They are usually found in big companies and big research labs like Facebook AI and Google Brain. 
  • While smaller companies may also advertise ML research scientist positions, be aware that you might not be getting the type of work you’re expecting; this is because research in AI is generally high-risk and not usually the priority for smaller companies focused on building a business.
  • In large companies, an ML research engineer is the equivalent of a junior research scientist. At smaller companies or companies that are less geared towards research, it generally just means ML engineer.
  • If you want to work at labs like Google Brain or FAIR, a PhD is the most direct path, but it’s still possible without one. You’ll need to publish some papers, build a network within the company that interests you, and participate in research projects. 

The general ML engineer works on ML projects end to end. 

  • Their role can vary depending on the size of the company. 
  • They tend to work on the whole pipeline: pre-processing and curating data, training models, deploying them, and building methods to improve them over time. The general ML engineer can also be involved in discussions with customers to define the problem. 
  • Their role is very broad and might require less specialization. 
  • This role can be found in companies of all sizes, with specific responsibilities varying depending on company size. In larger companies, they may be more specialized and work on only one part of the pipeline.
  • This role can also vary a lot depending on the use case and the tools the company uses. For example, working with classical statistical algorithms is quite different from working with deep learning. Deep learning work generally involves much longer training times and larger quantities of data, and the engineer will spend more time on pipeline optimization than on, for example, data preprocessing.
  • This role requires a mix of software engineering and ML skills. Most of this engineer’s time is usually spent on software engineering problems. 

The data engineer focuses on data collection, storage, curation, and preprocessing. Data engineering is the foundation of ML, so this discipline is present wherever ML exists.

  • They are responsible for ensuring that data is labeled correctly and is of high quality. 
  • They might work with data infrastructure and storage systems.
  • This role is more commonly found in larger companies with complex data needs, such as those in the finance, healthcare, and tech industries. However, data engineers may also be needed in smaller companies that are working with large amounts of data.
  • In very small companies, the specific title "data engineer" might not exist due to limited resources. In practice, ML engineers absorb this work under these circumstances.

The data scientist job title signals a broad range of potential responsibilities. It’s important to read through the job description thoroughly and talk with the recruiter to ensure there’s no misunderstanding about the role in question. Depending on the company, a data scientist can be:

  • A data engineer
  • A data analyst (who will spend most of their time on data visualization)
  • An ML engineer
  • A data scientist in the generally accepted, pure sense of the role. In this case, they build predictive models to draw useful insights for their companies, encourage or make data-driven decisions, or work on a data-driven product. For instance, a data scientist’s job can be to understand what factors lead to customer churn from a company’s subscription plan.

Data scientists generally write less code than ML engineers, and their work is more geared toward the application of statistics and visualization. They spend more time writing Jupyter notebooks and less time building pipelines.

Preparing for a Career in ML: Education, Building a Portfolio & Preparing for an Interview

Education 

Multiple educational trajectories lead to work in ML. The path one follows (courses undertaken, degrees earned) depends on one's interests and goals. Alex suggests balancing computer science and applied mathematics classes in preparation for an industry role in ML, as both are necessary to become a strong ML engineer. “Coming from a mathematics background, I wish I had taken more deep computer science classes, as it’s difficult to catch up on this knowledge outside of a course,” Alex laments. 

It’s possible to dive straight into an ML role in industry after earning a bachelor's degree. A master’s degree, however, can be beneficial for practitioners who want to switch to a new field or whose university offers an accelerated program. Alex shares that, as a foreign student, earning a master’s degree was a natural choice since this is the most common trajectory in Europe for students of computer science or applied mathematics. Continuing his education in the U.S. via the master’s program also supported his goal of landing a professional ML job in the U.S. Multiple factors inform decisions around which degrees to pursue, but in most cases a graduate degree is not necessary for a career in industry. Additionally, for jobs that are not focused on research, five years of industry experience is often more valuable than a PhD.

Two scenarios justify the pursuit of a PhD and the better part of a decade one would invest in earning one: passion for the research question or broader topic, and/or interest in a career as a research scientist. Alex shares that “PhDs are useful for those who want to become research scientists, but the field may change rapidly, making this long path stressful.” He himself pursued a career in industry rather than continuing with academia because he was attracted to the idea of creating products that people use, and didn’t find a research field he was overwhelmingly passionate about. He hasn’t regretted his choice. 

Education in ML doesn’t end once you’ve earned a degree. The field evolves quickly and requires the habit of continuous learning. “As an ML engineer you want to spend a lot of time learning, at least an hour per day is the north star. It’s an investment for your company and amazing for you personally because it makes you better at what you do.” Taking courses such as Sphere’s live cohort courses is a great way to commit to the process amid competing deadlines and life responsibilities. Professionals can be confident in the quality and applicability of reading materials selected specifically for the course topic, as opposed to blogs and resources excavated on the internet. Reading the most current academic papers is also valuable. Alex reveals, "I spend a lot of time reading papers because I'm pretty familiar with the topic I'm working on. My goal is to see what's out there, what people are working on right now, and see if I can incorporate some of their work inside my projects." On the other hand, if you aren’t already an expert in the papers’ subjects, they can be difficult to fully understand. Filling this gap is where, once again, a formal course with instructor and peer support can be immensely beneficial.  

Building Your ML Portfolio

Working on large open-source projects can be rewarding and a gold star on a resume. Getting started can be challenging, though, as some projects require a high level of experience. Alex offers, “one of the most famous open-source projects to contribute to is Hugging Face. They encourage external contributions to their large body of open source repositories.”

Entering Kaggle competitions is another way to build ML chops. These events are time-consuming but also intensive learning opportunities. As Alex shares, “what I've observed in working with people who are very good at Kaggle is that they are also strong ML engineers.”

Finally, many research papers are published without their code, creating ample opportunities to implement the methods they describe yourself. This activity augments your portfolio and makes your resume more attractive to prospective employers, even if you’re early in your career. 

Preparing for an ML Systems Design Interview

Make sure you can list concrete ML use cases and design pipelines (no need to implement them). The activity forces you to think about what data to use; how to acquire, format and store it; what models to train; and how to deploy and monitor them.

Become familiar with some MLOps tools like Weights & Biases, or open-source alternatives like MLflow. Also get to know monitoring tools like Datadog and deployment tools like Kubernetes and SageMaker. Be able to articulate clearly why one tool might be superior in a given use case.

Knowledge about deployment is useful, even if the job description is not specifically that of an ML engineer.

Consider investing in formal, applied courses in the subject area, such as Chip Huyen's ML Systems Design and Strategy.

Bonus Discussion! Automated Deployment and Monitoring of ML Models

Small companies can use online tools such as Weights & Biases to log hyperparameters and compare different models easily. Alex also suggests tools such as Amazon SageMaker, which can manage the deployment of models automatically, or tools like Docker with Kubernetes for more control over the deployment process. 

Training models in a completely automated fashion involves risk. ML engineers need to be sure that each new model remains accurate on the majority of basic cases, and that each iteration is better than the previous one. On this topic, Alex notes, “it's important to structure a monitoring framework to ensure that the model's accuracy is being tracked regularly.”
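
One way to picture such a safeguard is a small "promotion gate" that an automated training job runs before replacing the production model. The sketch below is illustrative only and was not presented in the talk: the names `EvalResult` and `should_promote`, the 0.95 threshold, and the example scores are all hypothetical, and it assumes accuracy on a fixed suite of basic cases and on a held-out set has already been computed elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalResult:
    """Accuracy scores for one model version (hypothetical structure)."""
    version: str
    basic_case_accuracy: float  # accuracy on a fixed suite of basic cases
    holdout_accuracy: float     # accuracy on a held-out evaluation set


def should_promote(candidate: EvalResult,
                   production: EvalResult,
                   min_basic_accuracy: float = 0.95,
                   min_improvement: float = 0.0) -> bool:
    """Return True only if the candidate passes both checks.

    1. It still handles the basic cases (guards against silent regressions).
    2. It beats the current production model on held-out data by more
       than min_improvement.
    """
    if candidate.basic_case_accuracy < min_basic_accuracy:
        return False
    gain = candidate.holdout_accuracy - production.holdout_accuracy
    return gain > min_improvement


# Made-up scores, purely for illustration.
production = EvalResult("v1", basic_case_accuracy=0.97, holdout_accuracy=0.88)
better = EvalResult("v2", basic_case_accuracy=0.98, holdout_accuracy=0.90)
regressed = EvalResult("v3", basic_case_accuracy=0.91, holdout_accuracy=0.92)

print(should_promote(better, production))     # True
print(should_promote(regressed, production))  # False
```

Note that the third version scores higher on held-out data but is still rejected because it regressed on the basic cases, which is exactly the kind of failure a fully automated pipeline could otherwise ship unnoticed. In practice, the scores from each run would also be logged over time so that accuracy is tracked regularly rather than checked once.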

Don’t spend too much time optimizing the accuracy of the model. Once a model has a “good enough” accuracy it should be put into production as quickly as possible to ensure a robust pipeline. “The deployment of the model is the most important part, and it's better to spend more time improving the model after it's deployed. The process of achieving perfect customer alignment is difficult, and it's not until the model is deployed that the understanding of the ground truth is clear. Most customers may not know the exact problems they are facing, and deploying a solution quickly can help them refine their understanding of the problem.” Sage direction from an industry veteran, indeed.

Follow Sphere on LinkedIn and stay in the loop about community events, new blog publications, and new courses.

- Natalie Cone, Head of Community at Sphere

Sphere recently hosted a live talk with Alexandre Matton of Cohere on the topic of career opportunities in machine learning. With robust participation from the audience, it would be impossible to capture in writing the many “productive digressions” that ensued from the lively Q&A, but you can watch the full video here. 