= Machine Learning =

== K-means clustering ==
* https://en.wikipedia.org/wiki/K-means_clustering

K-means is an unsupervised learning algorithm that aims to partition n observations into k clusters, assigning each observation to the cluster with the nearest mean.

Implementations:
* PSPP: the QUICK CLUSTER command performs k-means clustering on the dataset.
* Weka contains k-means and x-means.
* Octave contains k-means.
* OpenCV contains a k-means implementation.
* Spark MLlib implements a distributed k-means algorithm.

== K-NN classifier ==
* https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm

The k-nearest neighbors algorithm can be used for both classification and regression.

A confusion matrix, or "matching matrix", is often used as a tool to validate the accuracy of k-NN classification.
* https://en.wikipedia.org/wiki/Confusion_matrix

== Decision trees ==
* https://en.wikipedia.org/wiki/Decision_tree_learning

Decision tree learning creates a model that predicts the value of a target variable based on several input variables.
* Classification tree: the outcome is the (discrete) class to which the data belongs.
* Regression tree: the outcome can be considered a real number.

Notable decision tree algorithms include:
* ID3 (Iterative Dichotomiser 3)
* C4.5 (successor of ID3)
* CART (Classification And Regression Tree)
* Chi-square automatic interaction detection (CHAID)
* MARS (multivariate adaptive regression splines)

=== ID3 ===
* https://en.wikipedia.org/wiki/ID3_algorithm

Algorithm invented by Ross Quinlan, used to generate a decision tree from a dataset.

== Naive Bayes classifier ==
* https://en.wikipedia.org/wiki/Naive_Bayes_classifier

A typical application is document classification: classifying documents by their content, for example sorting e-mails into spam and non-spam.

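The spam / non-spam classification problem described above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the training messages are made up, the function names <code>train</code> and <code>classify</code> are hypothetical, and it assumes a multinomial model with add-one (Laplace) smoothing and simple whitespace tokenization.

```python
import math
from collections import Counter, defaultdict


def train(documents):
    """Count documents per class and words per class from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in documents:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_counts, word_counts, vocabulary


def classify(text, model):
    """Return the label with the highest log-posterior score."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        # Log prior: fraction of training documents with this label.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # words never seen in training are ignored
            # Add-one (Laplace) smoothing avoids zero probabilities.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


# Made-up toy training data (assumption, for illustration only).
training_data = [
    ("win money now", "spam"),
    ("free prize claim now", "spam"),
    ("cheap money offer", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
    ("project status report", "ham"),
]

model = train(training_data)
print(classify("claim your free money", model))
print(classify("status of the monday meeting", model))
```

Working with log probabilities instead of multiplying raw probabilities keeps the scores numerically stable when documents contain many words.
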
== Apriori algorithm ==
* https://en.wikipedia.org/wiki/Apriori_algorithm

Used for association rule learning, for example in market basket analysis.

== Libraries/frameworks ==
* scikit-learn
* R (an open-source software environment for statistical computing, which includes several CART implementations such as the rpart, party and randomForest packages)
* Weka (a free and open-source data-mining suite that contains many decision tree algorithms)
* Orange
* KNIME
* OpenCV

=== w3schools python ML ===
* https://www.w3schools.com/python/python_ml_getting_started.asp

Functions, classes and modules covered:
* matplotlib.pyplot.scatter
* matplotlib.pyplot.hist
* numpy.mean
* numpy.median
* numpy.std
* numpy.var
* numpy.percentile
* numpy.random.uniform
* numpy.random.normal
* numpy.poly1d
* numpy.polyfit
* pandas.read_csv
* scipy.stats.mode
* scipy.stats.linregress
* scipy.cluster.hierarchy.dendrogram
* scipy.cluster.hierarchy.linkage
* sklearn.metrics.r2_score
* sklearn.linear_model
* sklearn.preprocessing.StandardScaler
* sklearn.tree
* sklearn.tree.DecisionTreeClassifier
* sklearn.metrics.confusion_matrix
* sklearn.metrics.accuracy_score
* sklearn.metrics.precision_score
* sklearn.metrics.recall_score
* sklearn.metrics.f1_score
* sklearn.cluster.AgglomerativeClustering
* sklearn.linear_model.LogisticRegression
* sklearn.cluster.KMeans
* sklearn.neighbors.KNeighborsClassifier
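
Several of the scikit-learn names listed above can be combined into a small k-NN experiment that is validated with a confusion matrix. The sketch below is only an illustration: the data points are made up, <code>n_neighbors=3</code> is an arbitrary choice, and it assumes scikit-learn is installed.

```python
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Made-up toy data (assumption): two features per sample, classes 0 and 1.
X_train = [[1.0, 1.0], [1.5, 2.0], [2.0, 1.5],
           [6.0, 6.0], [6.5, 7.0], [7.0, 6.5]]
y_train = [0, 0, 0, 1, 1, 1]
X_test = [[1.2, 1.4], [6.8, 6.2], [2.2, 2.0], [5.8, 6.4]]
y_test = [0, 1, 0, 1]

# Fit the scaler on the training data only, then apply it to both sets.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Classify each test point by majority vote of its 3 nearest neighbours.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train_scaled, y_train)
y_pred = knn.predict(X_test_scaled)

# Rows of the confusion matrix are true classes, columns are predictions.
matrix = confusion_matrix(y_test, y_pred)
accuracy = accuracy_score(y_test, y_pred)
print(matrix)
print(accuracy)
```

Scaling matters for k-NN because the algorithm relies on distances, so a feature with a larger numeric range would otherwise dominate the neighbour search.
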