Sphere recently hosted a live talk with Alexandre Matton of Cohere on the topic of career opportunities in machine learning. With robust participation from the audience, it would be impossible to capture in writing the many “productive digressions” that ensued from the lively Q&A, but you can watch the full video here.

In this post, we share a summary of the main portion of Matton’s talk, aimed at students, early-career ML practitioners, and career transitioners. Readers of the post will learn:
And a bonus discussion for current industry professionals! During the audience Q&A, Matton shares his tips for automated monitoring and deployment of ML models.
Background
Alexandre Matton is a Research Machine Learning Engineer at Cohere, where he works on building semantic search applications. Previously, he led R&D for the document processing team at Scale AI. Alexandre focuses on using cutting-edge machine learning techniques to address real-world industry challenges. He holds a Master of Science in computational mathematics from Stanford University and École Polytechnique.
Founded in Toronto in 2019, Cohere provides natural language processing models that help companies improve human-machine interactions. Of the company and his role within it, Matton says “Cohere is very research-oriented with a significant part of the team working on large language models similar to GPT. The ML team works on many NLP use cases, such as semantic search and classification, since they provide high value across a range of companies and industries.” He adds, “Because of the topics we're working on, we’re extremely focused on research at the edge of current boundaries. There’s a culture of spending most of our time coding and implementing as many ideas as possible to figure out what works and what doesn't. My role involves a lot of research, but I also do a bit of everything,” as might be expected in a company of Cohere’s size (just over 150 employees).
The ML operations or ML infrastructure engineer is responsible for building everything around models, including deploying, monitoring, and maintaining them.
The ML research scientist is focused on pure research and improving machine learning methods.
The general ML engineer works on ML projects end to end.
The data engineer focuses on data collection, storage, curation, and preprocessing. Data engineering is the foundation of ML, so this role exists wherever ML does.
The data scientist job title signals a broad range of potential responsibilities. It’s important to read through the job description thoroughly and talk with the recruiter to ensure there’s no misunderstanding about the role in question. Depending on the company, a data scientist can be:
Multiple educational trajectories lead to work in ML. The path one follows (courses undertaken, degrees earned) depends on their interests and goals. Alex suggests taking a balance of computer science and applied mathematics classes in preparation for an industry role in ML, as both are necessary to become a strong ML engineer. “Coming from a mathematics background, I wish I had taken more deep computer science classes, as it’s difficult to catch up on this knowledge outside of a course,” Alex laments.
It’s possible to dive straight into an ML role in industry after earning a bachelor's degree. A master’s degree, however, can be beneficial for practitioners who want to switch to a new field or whose university offers an accelerated program. Alex shares that, as a foreign student, earning a master’s degree was a natural choice since this is the most common trajectory in Europe for students of computer science or applied mathematics. Continuing his education in the U.S. via the master’s program also supported his goal of landing a professional ML job there. Multiple factors inform decisions around which degrees to pursue, but in most cases a graduate degree is not necessary for a career in industry. Additionally, for jobs that are not focused on research, five years of industry experience is often more valuable than a PhD.
Two scenarios justify the pursuit of a PhD and the better part of a decade one would invest in earning one: passion for the research question or broader topic, and/or interest in a career as a research scientist. Alex shares that “PhDs are useful for those who want to become research scientists, but the field may change rapidly, making this long path stressful.” He himself pursued a career in industry rather than continuing with academia because he was attracted to the idea of creating products that people use, and didn’t find a research field he was overwhelmingly passionate about. He hasn’t regretted his choice.
Education in ML doesn’t end once you’ve earned a degree. The field evolves quickly and requires the habit of continuous learning. “As an ML engineer you want to spend a lot of time learning, at least an hour per day is the north star. It’s an investment for your company and amazing for you personally because it makes you better at what you do.” Taking courses such as Sphere’s live cohort courses is a great way to commit to the process amid competing deadlines and life responsibilities. Professionals can be confident in the quality and applicability of reading materials selected specifically for the course topic, as opposed to blogs and resources excavated on the internet. Reading the most current academic papers is also valuable. Alex reveals, "I spend a lot of time reading papers because I'm pretty familiar with the topic I'm working on. My goal is to see what's out there, what people are working on right now, and see if I can incorporate some of their work inside my projects." On the other hand, if you aren’t already an expert in the papers’ subjects, they can be difficult to fully understand. Filling this gap is where, once again, a formal course with instructor and peer support can be immensely beneficial.
Working on large open-source projects can be rewarding and a gold star on a resume. Getting involved can be challenging, though, as some projects require a high level of experience. Alex offers, “One of the most famous open-source projects to contribute to is Hugging Face. They encourage external contributions to their large body of open-source repositories.”
Entering Kaggle competitions is another way to build ML chops. These events are time-consuming but also intensive learning opportunities. As Alex shares, “what I've observed in working with people who are very good at Kaggle is that they are also strong ML engineers.”
Finally, many research papers do not publish their code, creating ample opportunities to implement their methods yourself. This activity augments your portfolio and makes your resume more attractive to prospective employers, even if you’re early in your career.
Make sure you can list concrete ML use cases and design pipelines (no need to implement them). The activity forces you to think about what data to use; how to acquire, format and store it; what models to train; and how to deploy and monitor them.
Become familiar with some MLOps tools like Weights & Biases, or open-source alternatives like MLflow, as well as monitoring tools like Datadog and deployment tools like Kubernetes and SageMaker; a minimal experiment-tracking sketch with MLflow follows this list. Be able to articulate clearly why one tool might be better suited to a particular use case.
Knowledge about deployment is useful, even if the job description does not specifically reflect that of an ML engineer.
Consider investing in formal, applied courses in the subject area, such as Chip Huyen's ML Systems Design and Strategy.
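To make the tool familiarity concrete, here is a minimal experiment-tracking sketch using MLflow, the open-source option mentioned above. The run name, hyperparameters, and metric value are placeholders invented for illustration, not details from the talk.

```python
# Minimal experiment-tracking sketch with MLflow.
# Run name, parameter values, and the metric value are illustrative placeholders.
import mlflow

with mlflow.start_run(run_name="baseline-classifier"):
    # Record the hyperparameters so runs can be compared later in the MLflow UI.
    mlflow.log_param("learning_rate", 3e-4)
    mlflow.log_param("batch_size", 32)
    # In a real project this value would come from evaluating the trained model.
    mlflow.log_metric("val_accuracy", 0.83)
```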
Small companies can use online tools such as Weights & Biases to log hyperparameters and compare different models easily. Alex also suggests tools such as Amazon SageMaker, which can manage model deployment automatically, or frameworks like Docker with Kubernetes for more control over the deployment process.
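As a point of comparison with the MLflow snippet above, the following is a minimal sketch of the same logging-and-comparison idea with Weights & Biases. The project name, hyperparameters, and metric values are assumptions made for illustration.

```python
# Hypothetical sketch: logging hyperparameters and metrics to Weights & Biases
# so that runs can be compared in the W&B dashboard. All names and values are illustrative.
import wandb

config = {"learning_rate": 3e-4, "batch_size": 32, "epochs": 3}
run = wandb.init(project="demo-experiments", config=config)  # project name is made up

for epoch in range(config["epochs"]):
    # Placeholder numbers; in practice these come from your training loop and eval set.
    wandb.log({
        "epoch": epoch,
        "train_loss": 1.0 / (epoch + 1),
        "val_accuracy": 0.78 + 0.02 * epoch,
    })

run.finish()
```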
Training models in a completely automated fashion involves risk. ML engineers need to be sure that each newly trained model remains accurate on the majority of basic cases and that each iteration improves on the previous one. On this topic, Alex notes, “it's important to structure a monitoring framework to ensure that the model's accuracy is being tracked regularly.”
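One way to encode that requirement is a simple promotion gate: a retrained model is only accepted if it still handles a fixed suite of basic cases and does not score below the currently deployed model. The sketch below is a hedged illustration of that idea, not Cohere's actual setup; the function names, thresholds, and test cases are all assumptions.

```python
# Hedged sketch of an automated promotion gate: the new model must pass a fixed
# suite of basic cases and must not regress against the previously deployed model.
# The predict-function signature, example cases, and threshold are all assumptions.
from typing import Callable, List, Tuple

PredictFn = Callable[[str], str]

def accuracy(predict_fn: PredictFn, cases: List[Tuple[str, str]]) -> float:
    """Fraction of (input, expected_label) pairs the model gets right."""
    correct = sum(1 for text, expected in cases if predict_fn(text) == expected)
    return correct / len(cases)

def should_promote(
    new_predict: PredictFn,
    old_predict: PredictFn,
    basic_cases: List[Tuple[str, str]],
    min_basic_accuracy: float = 0.95,
) -> bool:
    """Return True only if the new model passes the basic cases and does not regress."""
    new_acc = accuracy(new_predict, basic_cases)
    old_acc = accuracy(old_predict, basic_cases)
    return new_acc >= min_basic_accuracy and new_acc >= old_acc
```

Keeping the basic-case suite small and fixed makes a check like this cheap enough to run on every automated retraining cycle.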
Don’t spend too much time optimizing the accuracy of the model. Once a model has “good enough” accuracy it should be put into production as quickly as possible to ensure a robust pipeline. “The deployment of the model is the most important part, and it's better to spend more time improving the model after it's deployed. The process of achieving perfect customer alignment is difficult, and it's not until the model is deployed that the understanding of the ground truth is clear. Most customers may not know the exact problems they are facing, and deploying a solution quickly can help them refine their understanding of the problem.” Sage advice from an industry veteran, indeed.
Follow Sphere on LinkedIn and stay in the loop about community events, new blog publications, and new courses.
- Natalie Cone, Head of Community at Sphere