
Python is irreplaceable for Machine Learning, but running Python in production can be a problem if other parts of the system are written using C#. ML.NET is a Machine Learning library for C# that helps deliver Machine Learning features in a .NET environment more quickly.

Difficulties of running Python in production

At this point, Python is the de facto language for Machine Learning (ML). Python has a great ML ecosystem for implementing end-to-end ML workflows. There are data analysis tools like pandas, modeling libraries like scikit-learn, TensorFlow, and PyTorch, deployment and monitoring solutions like MLflow and Seldon, and even cloud platforms like Azure ML or SageMaker.

Unfortunately, using Python in production can range from complicated to impossible. Say you have a big ASP.NET project with a microservice architecture and complex infrastructure (security, CI/CD, audit, logging, etc.). Although that infrastructure can be reimplemented for a Python microservice, doing so requires a lot of effort and Python software engineering skills. Then there are WPF or Xamarin applications, where Python simply cannot be used.

That is where ML.NET can help.

ML.NET overview

ML.NET is an open-source and cross-platform Machine Learning framework developed by Microsoft. It was developed internally for more than a decade and then published on GitHub in 2018, where it has 7k+ stars. ML.NET is used by Power BI, Windows Defender, and others.

ML.NET is an all-in-one framework that provides a wide range of features, including:

Writing a simple pipeline via ML.NET is just as easy as using sklearn. Here is an ML.NET code sample taken from ML.NET’s site that trains a binary text classifier:

You can find more examples on the dotnet/machinelearning-samples repository on GitHub.

NimbusML: Research in Python and run in .NET

ML.NET can be used from Python. NimbusML provides Python bindings for ML.NET as sklearn-compatible objects.

That means that a data scientist can work in Python and leverage the power of the Python ML ecosystem while keeping Machine Learning models close to the production .NET environment. The trained model can be saved in the ML.NET format in just one line. Later, a C# developer can easily load the trained model and make predictions, even if they don’t know anything about ML algorithms.

Please note that custom feature engineering code still has to be rewritten from Python to C#.

Here is a NimbusML code sample taken from their GitHub:

The drawbacks of ML.NET

  • Most importantly, ML.NET is not extensible. Unlike with sklearn, you cannot develop your custom ML.NET-compatible components. This is because most ML.NET base classes are marked as internal. This limitation was confirmed by Microsoft.
  • Limited set of algorithms. Some important algorithms are missing from ML.NET. For example, there is no KNN classifier, and K-means is the only clustering algorithm. Since ML.NET cannot be extended, there is no way around these missing algorithms.
  • Only the top-level API is exposed. ML.NET pipelines are created through the MLContext class, and individual component constructors are mainly internal. In some cases, a more granular component configuration is needed, but the constructor is not accessible. For example, there is an ApplyOnnxModel method, but it cannot accept an ONNX model as byte[], while the underlying ONNX Runtime session can.
  • Hard-to-use documentation. ML.NET has quite extensive documentation, but it is not organized very well. It is hard to understand how all the classes are interconnected and which ones you need to use. Sklearn, on the other hand, provides intuitive and easy-to-grasp documentation with component references, examples, guides, and best practices like “try this algorithm as an alternative.” Microsoft is using its standard documentation UI for ML.NET, but a different template would likely be better.
  • Lack of comprehensive ML examples. There are a number of examples for ML.NET, including end-to-end-apps, but they use pretty simple ML pipelines. It would be nice to see more advanced ML use cases with complex pipelines and feature engineering.

ML.NET is a fairly young framework and is still rapidly developing, so these drawbacks might be resolved soon. You may also find that the available set of algorithms is sufficient for many real-life cases.

ML.NET alternatives

ML.NET is not the only technology that allows you to do research in Python and run Machine Learning models in C#.

There are C# bindings for TensorFlow and PyTorch. They would be a good option if you already use TensorFlow or PyTorch for research.

ONNX is another promising technology. ONNX is a unified format for representing Machine Learning models. Many popular frameworks (including sklearn, TensorFlow, PyTorch, and more) let you convert their models to the ONNX format. The highly efficient ONNX Runtime can then run ONNX models on different devices using a variety of computational back ends (CPU, GPU, and other hardware accelerators). ONNX Runtime is implemented in C++ and has bindings for C#, Java, Python, and Node.js.

A set of SciSharp STACK projects can help with many ML problems. It even includes an NLP library and a chatbot framework!

Conclusion

ML.NET is a young and promising Machine Learning framework for .NET. It can deal with many common ML problems right out of the box. NimbusML allows you to do research in Python while running models in C#. Still, it’s important to note that, as of now, ML.NET lacks extensibility and has hard-to-use documentation.

