Machine learning system


A machine learning system or ML system is a data processing system that employs machine learning in some or all of its components.

Design process

ML systems are complicated and come in an enormous range of shapes and sizes. It's necessary, or at least strongly recommended, to take a systematic approach to designing one, so that the logic behind the system stays rigorous. There are many ways to go about this; here is one path.

  1. Decide whether you even need machine learning. Though ML is incredibly powerful, in many cases it's just unnecessary. Perhaps the problem is simple enough that it can be solved rigorously and deterministically, especially if a solution already exists. No point in using ML to calculate 17 + 45. Or perhaps the problem is so complicated that creating an ML system to approximate a good solution would take more time, hardware or funds than you have available. You could model continent-scale population dynamics using ML, but doing so would take an enormous amount of labor.
  2. Supervised or unsupervised? Supervised machine learning is expensive and labor-intensive since you need an enormous amount of curated, labeled data prepared specifically for your training process. Unsupervised machine learning instead requires only the raw data and finds structure on its own, but you sacrifice a lot of control over the process, and the model can learn to predict the wrong thing [1].
  3. Define the problem.
  4. Design the ML system.
  5. Implement the ML system.
  6. Assess the ML system.
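
The trade-off in step 2 can be illustrated with a toy sketch in plain Python. Everything here is invented for illustration: the data, the feature names (`district_signal`, `brightness`) and the helper functions. The supervised model is a nearest-centroid classifier and the unsupervised one is a basic 2-means clustering; neither is meant as a recommended method. Brightness varies far more than the district signal, so the clustering, which never sees the labels, groups the points by day versus night rather than by district.

```python
# Toy data: (district_signal, brightness). Invented for illustration only.
# District A has district_signal near 0, district B near 1.
# Brightness (day = 10, night = 0) is unrelated to the district.
points = [
    (0.0, 10.0), (0.1, 0.0), (0.2, 10.0), (0.1, 0.0),   # district A
    (1.0, 10.0), (0.9, 0.0), (1.1, 10.0), (1.0, 0.0),   # district B
]
labels = ["A", "A", "A", "A", "B", "B", "B", "B"]


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(ps):
    """Component-wise mean of a non-empty list of points."""
    n = len(ps)
    return tuple(sum(p[i] for p in ps) / n for i in range(len(ps[0])))


def nearest(p, centers):
    """Index of the center closest to p."""
    return min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))


# Supervised: the labels decide which points form each centroid.
classes = sorted(set(labels))
centroids = [mean([p for p, l in zip(points, labels) if l == c]) for c in classes]
supervised = [classes[nearest(p, centroids)] for p in points]


# Unsupervised: 2-means clustering; the labels are never used.
def k_means(ps, centers, steps=10):
    for _ in range(steps):
        groups = [[] for _ in centers]
        for p in ps:
            groups[nearest(p, centers)].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return [nearest(p, centers) for p in ps]


clusters = k_means(points, centers=[points[0], points[1]])

print(supervised)  # follows the district labels
print(clusters)    # alternates with brightness, ignoring the district
```

The supervised result is only as good as the labels it was given, which is where the cost of curation comes from. The unsupervised result is a valid grouping of the data, just not the one that was wanted.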

Some additional points. Consider some ML system summarized as $f_\text{predict}: X \mapsto Y$.

  • Even if you do make the ML system, can $y$ be calculated fast enough for your purposes? Would a human be able to do it faster or more accurately?
  • If a human can do it, why not let them? Cost? No available workers? Danger to the person (e.g. the chance that an abandoned building collapses)? Inherent human bias? Is $y$ not valuable enough to justify having a person do it?
  • While $f_\text{predict}$ is expected to be derived using machine learning, nothing's stopping a person from estimating the function themselves. If, say, a physicist can hand-make a competent model or theory for a phenomenon, you don't necessarily need ML. The question then becomes: is the hand-made solution better than the ML one? For instance, is the human solution too expensive, time-consuming or inaccurate?
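
The last point can be made concrete with a small sketch that treats both the hand-made model and the learned one as candidates for the same prediction function and compares them on held-out data. The setting is an assumption chosen for illustration: a drag-free falling-object formula as the physicist's model, and a straight line fitted by least squares as the "learned" one. The measurements are synthetic and generated from the hand-made formula itself, so the comparison is rigged in its favor; with real data either could win.

```python
from typing import Callable, Sequence

# f_predict: X -> Y, with X = Y = float in this sketch
Predictor = Callable[[float], float]

G = 9.81  # gravitational acceleration in m/s^2


def hand_made(t: float) -> float:
    """Physicist's model: distance fallen after t seconds, ignoring drag."""
    return 0.5 * G * t ** 2


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Predictor:
    """Learn y = a*x + b by ordinary least squares."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    b = my - a * mx
    return lambda x: a * x + b


def mean_abs_error(f: Predictor, xs: Sequence[float], ys: Sequence[float]) -> float:
    """Average absolute difference between predictions and targets."""
    return sum(abs(f(x) - y) for x, y in zip(xs, ys)) / len(xs)


# Synthetic "measurements" (assumption: generated from the drag-free formula).
train_t = [0.5, 1.0, 1.5, 2.0]
train_d = [hand_made(t) for t in train_t]
test_t = [2.5, 3.0]
test_d = [hand_made(t) for t in test_t]

learned = fit_line(train_t, train_d)
err_hand = mean_abs_error(hand_made, test_t, test_d)
err_learned = mean_abs_error(learned, test_t, test_d)

print(f"hand-made error: {err_hand:.3f} m, learned error: {err_learned:.3f} m")
```

The same comparison works in the other direction: if the hand-made model is too crude or too slow to evaluate, the learned predictor may come out ahead on the same held-out data.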

By and large, you could oversimplify the design philosophy of an ML system down to three metrics:

  • Efficiency. How fast and easily does the ML system run? Are the resources used justified?
  • Effectiveness. How good are the decisions made by the ML system? Are they good enough for your purposes?
  • Human dignity. Would a human do the same thing better? Would the ML system encourage damaging behavior? [2]
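
The first two metrics can be checked mechanically. Below is a minimal sketch, assuming efficiency is measured as wall-clock time per prediction and effectiveness as plain accuracy; both choices, the thresholds and the `toy_predict` stand-in are hypothetical. The third metric, human dignity, is not something a harness like this can measure.

```python
import time


def evaluate(predict, inputs, expected, max_seconds_per_call, min_accuracy):
    """Check a predictor against an efficiency budget and an effectiveness target."""
    start = time.perf_counter()
    outputs = [predict(x) for x in inputs]
    seconds_per_call = (time.perf_counter() - start) / len(inputs)

    # Fraction of predictions that match the expected answers exactly.
    accuracy = sum(o == e for o, e in zip(outputs, expected)) / len(expected)
    return {
        "seconds_per_call": seconds_per_call,
        "accuracy": accuracy,
        "efficient": seconds_per_call <= max_seconds_per_call,
        "effective": accuracy >= min_accuracy,
    }


# Hypothetical stand-in for a trained model: flags numbers above a threshold.
def toy_predict(x):
    return x > 0.5


report = evaluate(
    toy_predict,
    inputs=[0.1, 0.4, 0.6, 0.9],
    expected=[False, False, True, False],
    max_seconds_per_call=0.01,
    min_accuracy=0.7,
)
print(report)
```

In practice the budget and the target come from the purpose of the system: what counts as fast enough or good enough depends on what the decision is used for.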

Components

ML systems are made up of many components. The difference between a system and a simple ML model is that the system also includes all accessory code and methods that combine to turn an input into a decision. It may include multiple models strung together.
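
One way to picture the difference between a model and a system is a small pipeline that wraps models in the accessory code around them. This is a sketch under assumptions, not a reference architecture: the `MLSystem` class, the message-filter scenario and all four component functions are hypothetical, and the two "models" are simple hand-written stand-ins for trained ones.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class MLSystem:
    """Input -> preprocessing -> one or more models -> decision."""
    preprocess: Callable
    models: List[Callable]  # applied in order, each feeding the next
    decide: Callable

    def run(self, raw_input):
        value = self.preprocess(raw_input)
        for model in self.models:
            value = model(value)
        return self.decide(value)


# Hypothetical components for a message filter.
def tokenize(text):
    """Accessory code: turn raw text into lowercase tokens."""
    return text.lower().split()


def count_flagged(tokens):
    """Stand-in for a first model: count suspicious tokens."""
    return sum(t in {"free", "winner", "prize"} for t in tokens), len(tokens)


def score(counts):
    """Stand-in for a second model: turn counts into a score in [0, 1]."""
    flagged, total = counts
    return flagged / total if total else 0.0


def threshold(p):
    """Accessory code: turn the score into a decision."""
    return "spam" if p >= 0.3 else "ok"


system = MLSystem(preprocess=tokenize, models=[count_flagged, score], decide=threshold)
decision = system.run("You are a WINNER claim your FREE prize")
print(decision)
```

Swapping a stand-in for a trained model changes one component; the tokenizer and the threshold are still part of the system and still shape the final decision.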

Footnotes

  1. An anecdote: an MLOps professor of mine was working on a system to make predictions on photos of places in different districts of a Brazilian city. If I recall correctly, it was supposed to determine which district each image was taken in. Instead, the unsupervised model learnt to categorize the images by time of day... If you're doing unsupervised learning, perhaps invest in explainable AI.

  2. For instance, an unsupervised ML system that spontaneously learns to discriminate based on race or gender.