Scikit-learn design principles

In this post we will look at the design principles behind scikit-learn, a very popular machine learning library.

If you work with machine learning or deep learning, you are probably already familiar with the scikit-learn library. If you are a beginner, you may only have a rough idea of how things work here, and this post should help you get a general picture of this open source library.

The following topics are covered in this post:
1. What is scikit-learn
2. Some details about scikit-learn
3. General design principles
4. General API

1. What is scikit-learn?
Scikit-learn is an open source library written in Python that supports many machine learning tasks, such as classification, regression and clustering. It was designed to work in harmony with other libraries like NumPy and SciPy.

2. Some more details about scikit-learn
The library was started by David Cournapeau as a Google Summer of Code project in 2007. It was later developed extensively by developers at the French Institute for Research in Computer Science and Automation (INRIA), and the first public release was made on February 1, 2010.

Most of the algorithms in the library are written in Python, but some are written in Cython to improve performance.

3. General design principles [1]
A few principles were followed while designing the interfaces, in order to avoid frequent changes to the API and to keep the code maintainable.

The general principles behind the scikit-learn API design are:

Consistency: All objects share a consistent interface built from a limited set of methods, and they are documented in the same consistent way.
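To make the consistency principle concrete, here is a minimal sketch that drives two unrelated classifiers through exactly the same calls. The tiny data set and the choice of LogisticRegression and DecisionTreeClassifier are my own assumptions for illustration; they are not taken from the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# Made-up data for illustration: 6 samples, 2 features, 2 classes.
X = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.3],
              [1.0, 0.9], [0.9, 1.1], [1.2, 1.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# Two very different algorithms, used through the same fit/predict calls.
predictions = {}
for model in (LogisticRegression(), DecisionTreeClassifier(random_state=0)):
    model.fit(X, y)
    predictions[type(model).__name__] = model.predict(X)

for name, y_pred in predictions.items():
    print(name, y_pred)
```

Swapping one algorithm for another only changes the class that is instantiated; the rest of the code stays the same.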

Inspection: Constructor parameters and the parameter values learned by the algorithms are stored and exposed as public attributes.
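A small sketch of what inspection looks like in practice, assuming a LinearRegression model fitted on made-up data that follows y = 2x + 1. The constructor parameter can be read back directly, and the values learned during fitting appear as attributes whose names end with an underscore.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data that lies exactly on the line y = 2x + 1.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = 2.0 * X.ravel() + 1.0

model = LinearRegression(fit_intercept=True)

# Constructor parameters are public attributes, readable before fitting.
print(model.fit_intercept)
print(model.get_params())

model.fit(X, y)

# Learned parameters are public too; their names end with an underscore.
print(model.coef_, model.intercept_)
```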

Non-proliferation of classes: Only learning algorithms are represented using custom classes. Data sets are represented as NumPy arrays or SciPy sparse matrices, and hyperparameter names and values are expressed as standard Python strings and numbers wherever possible.
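As a rough illustration, the sketch below passes the same made-up data once as a NumPy array and once as a SciPy sparse matrix, and sets hyperparameters with a plain string and a plain float. The choice of LogisticRegression and of the values "lbfgs" and 10.0 is an assumption made only for this example.

```python
import numpy as np
from scipy import sparse
from sklearn.linear_model import LogisticRegression

# The data set is just an array; there is no special "Dataset" class.
X_dense = np.array([[0.0, 1.0], [0.0, 2.0], [3.0, 0.0], [4.0, 0.0]])
X_sparse = sparse.csr_matrix(X_dense)  # same data, sparse representation
y = np.array([0, 0, 1, 1])

# Hyperparameters are ordinary Python values: a string and a float.
dense_model = LogisticRegression(solver="lbfgs", C=10.0).fit(X_dense, y)
sparse_model = LogisticRegression(solver="lbfgs", C=10.0).fit(X_sparse, y)

# Predictions also come back as plain NumPy arrays.
dense_pred = dense_model.predict(X_dense)
sparse_pred = sparse_model.predict(X_sparse)
print(type(dense_pred), dense_pred.shape, sparse_pred.shape)
```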

Composition: Many machine learning tasks can be expressed as sequences or combinations of transformations applied to data. Wherever feasible, algorithms are implemented and composed from these basic building blocks.
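One way composition shows up in the library is the Pipeline class, which chains several steps into a single object. The sketch below is only an example under my own assumptions: the step names "scale" and "classify", the made-up data, and the choice of StandardScaler followed by a 1-nearest-neighbour classifier.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Made-up data: the second feature is on a much larger scale than the first.
X = np.array([[0.1, 100.0], [0.2, 120.0], [0.9, 900.0], [1.0, 950.0]])
y = np.array([0, 0, 1, 1])

# A transformation followed by a classifier, composed into one estimator.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("classify", KNeighborsClassifier(n_neighbors=1)),
])

# The pipeline is used exactly like a single estimator.
pipeline.fit(X, y)
y_pred = pipeline.predict(X)
print(y_pred)
```

Because the composed object has the same fit and predict methods as any other estimator, it can be used wherever a single model is expected.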

Sensible defaults: Whenever an operation requires a user-defined parameter, the library provides a sensible default value, so that the default settings already give a reasonable baseline result.
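A short sketch of how defaults work in practice, using KNeighborsClassifier as an arbitrary example of my own choosing. The estimator can be created without any arguments, its defaults can be inspected with get_params, and only the values you care about need to be overridden.

```python
from sklearn.neighbors import KNeighborsClassifier

# No arguments needed: every hyperparameter has a default value.
default_model = KNeighborsClassifier()

# Override only the hyperparameter you want to change.
custom_model = KNeighborsClassifier(n_neighbors=3)

# get_params returns the hyperparameters as a plain dictionary.
default_params = default_model.get_params()
custom_params = custom_model.get_params()
print(default_params["n_neighbors"], custom_params["n_neighbors"])
```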

For more details, refer to the paper on scikit-learn's design principles [1].

4. General API [1]
All scikit-learn objects are built around a small set of interfaces, and those interfaces are implemented in the same way across the library.

Those interfaces are:
A. Predictor
B. Estimator
C. Transformer

A. Predictor
This interface is used for making predictions on a given data set. It extends the estimator interface by adding a predict method.
The predict method takes new data and returns predictions from the trained model. Some predictors also provide methods that quantify the confidence of their predictions, as well as a score method for assessing how well the model performs.
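Here is a minimal sketch of the predictor interface, assuming a LogisticRegression classifier and a tiny made-up data set. It shows predict for labels, predict_proba as one example of a confidence method (not every predictor has it), and score for measuring accuracy.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Made-up training data: 2 features, 2 classes.
X_train = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.3],
                    [1.0, 0.9], [0.9, 1.1], [1.2, 1.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])

clf = LogisticRegression().fit(X_train, y_train)

# New, unseen samples to predict on.
X_new = np.array([[0.1, 0.2], [1.1, 0.9]])

labels = clf.predict(X_new)               # one predicted label per sample
probabilities = clf.predict_proba(X_new)  # one row per sample, one column per class
accuracy = clf.score(X_train, y_train)    # mean accuracy on the given data

print(labels, probabilities, accuracy)
```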

B. Estimator
The estimator interface is at the core of scikit-learn. It defines how objects are instantiated and exposes a fit method for learning a model from training data. An estimator's constructor never sees the data; it only accepts hyperparameters, which are stored as public attributes. The fit method then accepts the training data and fits the model.
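The separation between construction and fitting can be sketched as follows. Ridge and the made-up data are assumptions chosen only for illustration. The constructor receives hyperparameters and nothing else, and the learned state only appears once fit has been called with data.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Step 1: construction. Only hyperparameters are passed, no data.
model = Ridge(alpha=1.0)
has_coef_before_fit = hasattr(model, "coef_")  # nothing learned yet

# Made-up training data with 2 features.
X = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, 1.0], [3.0, 0.0]])
y = np.array([1.0, 2.0, 5.0, 6.0])

# Step 2: fitting. The data is passed to fit, which returns the estimator itself.
returned = model.fit(X, y)
has_coef_after_fit = hasattr(model, "coef_")

print(has_coef_before_fit, has_coef_after_fit, returned is model)
```

Because fit returns the estimator, calls can be chained, for example Ridge(alpha=1.0).fit(X, y).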

C. Transformer
The transformer interface is an extension of the estimator interface that exposes a transform method. The transform method accepts an array as input and returns a transformed version of that data. The transformation applied is based on the parameters learned when the estimator was fitted.
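As a sketch of the transformer interface described in [1], the example below uses StandardScaler on made-up data (both are my own choices for illustration). fit learns the column means and standard deviations, transform applies them to any array, and fit_transform does both in one call.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up data: two features on very different scales.
X_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])

scaler = StandardScaler()
scaler.fit(X_train)                   # learns the mean and scale of each column
X_scaled = scaler.transform(X_train)  # applies the learned scaling

# New data is transformed with the parameters learned from X_train.
X_new_scaled = scaler.transform(np.array([[2.0, 20.0]]))

# fit_transform is a shortcut for fit followed by transform.
X_scaled_again = StandardScaler().fit_transform(X_train)

print(scaler.mean_, X_scaled, X_new_scaled)
```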

For more details, refer to the paper on scikit-learn's design principles [1].

That's all for this post!
Thank you for reading.
If you have any suggestions about the post, or would like more details on any other topic, please post them in the comments section.
Your suggestions are valuable, so please do not hold them back.

References
1. https://arxiv.org/pdf/1309.0238.pdf
2. https://en.wikipedia.org/wiki/Scikit-learn
