Classification assigns an observation to one category within a set of categories. Two common types of classification are binary classification
and multiclass classification.
With binary classification, one assigns the elements of a set to one of two groups based on a rule.
Some of the methods used for binary classification are: decision trees, Bayesian
networks, support vector machines, and the probit model.
One widely used variant of the support vector machine (SVM) is the linear SVM, which separates data into two classes with a linear decision boundary. The following modules can be imported to build a linear SVM with scikit-learn:
from sklearn import svm
from sklearn import metrics
from sklearn.neighbors import KNeighborsClassifier
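Using the imports above, a minimal linear SVM sketch might look like the following (the synthetic dataset and the train/test split are illustrative assumptions, not from the text):

```python
from sklearn import svm
from sklearn import metrics
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Generate a small synthetic binary classification dataset
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit an SVM with a linear kernel and evaluate on held-out data
clf = svm.SVC(kernel="linear")
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
print(metrics.accuracy_score(y_test, y_pred))
```

The `metrics` module scores the predictions; `kernel="linear"` is what makes this a linear SVM rather than the default RBF kernel.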
With clustering, one groups observations/objects into clusters so that observations in one cluster are more similar to each other than to those in other groups/clusters. Below is a list of some of the algorithms used for clustering:
- Fuzzy clustering
- Expectation Maximization
- BIRCH
- DBSCAN
- K-Means
Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH) can cluster large datasets by first building a compact summary of the data that retains as much information as possible. In order to use the BIRCH algorithm, one can import the following modules:
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.cluster import Birch
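With those modules, a minimal BIRCH sketch might look like this (the blob data and the choice of three clusters are illustrative assumptions):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so no display is required
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.cluster import Birch

# Generate synthetic data with three cluster centres
X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

# Fit BIRCH; n_clusters sets the number of final clusters
model = Birch(n_clusters=3)
labels = model.fit_predict(X)

# Plot the points coloured by their assigned cluster label
plt.scatter(X[:, 0], X[:, 1], c=labels)
plt.title("BIRCH clustering")
```

BIRCH builds a tree of subcluster summaries first, so the final clustering step only has to work on the summary rather than every point.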
In order to use the DBSCAN algorithm, one can import the following module:
from sklearn.cluster import DBSCAN
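A short DBSCAN sketch could then look like the following (the blob data and the `eps`/`min_samples` values are illustrative assumptions):

```python
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs

# Dense, well-separated synthetic blobs
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.5, random_state=0)

# eps is the neighbourhood radius; min_samples is the density threshold
db = DBSCAN(eps=0.5, min_samples=5)
labels = db.fit_predict(X)

# Points labelled -1 are treated as noise, so exclude them from the count
n_clusters = len(set(labels.tolist())) - (1 if -1 in labels else 0)
print(n_clusters)
```

Unlike BIRCH or K-Means, DBSCAN does not take the number of clusters as input; it discovers dense regions and marks sparse points as noise.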
Regression can be used to estimate the relationship between a dependent variable and one or more independent variables. There are different types of regression: linear regression, logistic regression, and stepwise regression, among others. Below is the module to import for logistic regression:
from sklearn.linear_model import LogisticRegression
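A minimal logistic regression sketch might look like this (the synthetic dataset is an illustrative assumption):

```python
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification

# Synthetic two-class dataset
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

clf = LogisticRegression()
clf.fit(X, y)

# predict_proba returns per-class membership probabilities
probs = clf.predict_proba(X[:5])
```

Although its output is a probability, logistic regression is typically used as a binary classifier by thresholding that probability.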
Below is the module to import for linear regression:
from sklearn.linear_model import LinearRegression
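And a minimal linear regression sketch (the noise-free toy data is an illustrative assumption, chosen so the fitted coefficients are easy to check):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# One-feature dataset following y = 2x + 1 exactly
X = np.arange(10).reshape(-1, 1)
y = 2 * X.ravel() + 1

reg = LinearRegression()
reg.fit(X, y)

# With noise-free data the fit recovers the slope and intercept
print(reg.coef_[0], reg.intercept_)
```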
One should note that no method for calculating statistical power and sample size is provided here; as a fallback, one can apply the rule of thumb given by Good and Hardin.
Feature engineering consists of extracting features from raw data. It is used to improve predictive models and appears frequently in code competitions.
Below is a list of some feature engineering techniques:
- Imputation
- Categorical encoding
- Binning
- Scaling
- Log transform
- Feature selection
- Feature grouping
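Two of these techniques, imputation and scaling, can be sketched with scikit-learn as follows (the toy matrix and the choice of mean imputation with standardization are illustrative assumptions):

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler

# Toy feature matrix with one missing entry
X = np.array([[1.0, 2.0],
              [np.nan, 3.0],
              [7.0, 6.0]])

# Imputation: fill missing entries with the column mean
X_imputed = SimpleImputer(strategy="mean").fit_transform(X)

# Scaling: standardize each column to zero mean and unit variance
X_scaled = StandardScaler().fit_transform(X_imputed)
```

Imputation is usually applied before scaling, since the scaler cannot handle missing values.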
Reinforcement learning is a part of machine learning that studies how agents take actions in an environment to maximize a notion of cumulative reward. This field is also studied in other disciplines such as game theory, control theory, operations research, information theory, simulation-based optimization, multi-agent systems, swarm intelligence, and statistics.
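The cumulative-reward idea can be sketched with tabular Q-learning on a toy environment (the corridor, reward, and hyperparameters below are illustrative assumptions, not from the text):

```python
import random

# Hypothetical 1-D corridor: states 0..4, actions left (0) and right (1),
# and a reward of 1 for reaching the goal at state 4.
N_STATES = 5
ACTIONS = [0, 1]
alpha, gamma = 0.5, 0.9  # learning rate and discount factor
Q = [[0.0, 0.0] for _ in range(N_STATES)]

def step(state, action):
    """Move left or right within the corridor; reward 1 at the goal."""
    nxt = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0)

random.seed(0)
for _ in range(200):                  # episodes
    s = 0
    while s != N_STATES - 1:
        a = random.choice(ACTIONS)    # uniformly random behavior policy
        s2, r = step(s, a)
        # Q-learning is off-policy: update toward the reward plus the
        # discounted value of the best action in the next state
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2
```

After training, the learned Q-values prefer moving right toward the goal, which is the agent maximizing its expected cumulative reward.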