Productionalization of ML models is a complex process involving numerous stakeholders, such as product managers, data engineers, data scientists, DevOps engineers, and MLOps engineers. Their interaction is further complicated by the nature of ML projects, which tend to involve high levels of uncertainty and experimentation, and by the large arsenal of tools, languages, and algorithms at the disposal of data scientists.
Today, estimates indicate that 50 to 80% of ML models never make it into production, and of those that do, most are prone to numerous failures and operational issues. The reasons for this sad state of affairs range from a lack of organizational support to the unavailability of tried and tested frameworks. In this presentation, we will take a journey through the ML lifecycle and share the challenges that arise at each stage and make productionalization harder. The intention is not to focus on how to carry out these steps but rather to share common errors and mistakes that I have seen (or made!), with the aim of raising practitioners' awareness so that they are better equipped to build sustainable ML systems. The stages we will explore are Project Goal Definition, Data Collection and Preparation, Data Preprocessing, Feature Engineering, Model Training and Evaluation, and Model Deployment and Monitoring.