Revision 5 as of 2025-01-28 17:07:15

Contents

  1. Machine Learning
    1. K-means clustering
    2. K-NN classifier
    3. Decision trees
    4. Naive Bayes classifier
    5. Apriori algorithm
    6. Libraries/frameworks

Machine Learning

K-means clustering

  • https://en.wikipedia.org/wiki/K-means_clustering

It aims to partition n observations into k clusters, where each observation belongs to the cluster with the nearest mean. It is an unsupervised learning algorithm.

  • PSPP contains k-means; the QUICK CLUSTER command performs k-means clustering on the dataset.
  • Weka contains k-means and x-means.
  • Octave contains k-means.
  • OpenCV contains a k-means implementation.
  • Spark MLlib implements a distributed k-means algorithm.
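
The partitioning idea can be sketched in plain Python. This is a minimal illustration of the usual assign-then-update iteration (Lloyd's algorithm), not the implementation used by any of the tools listed above; the function name kmeans, its parameters and the sample points are assumptions made up for this example.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Cluster equal-length numeric tuples into k groups (illustrative sketch)."""
    rng = random.Random(seed)
    # Start from k distinct observations picked at random.
    centroids = rng.sample(points, k)
    assignment = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance).
        new_assignment = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no point changed cluster, so the result is stable
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster went empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, assignment


# Hypothetical data: two visually separate groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2)
print(centroids, labels)
```

The cluster numbers themselves are arbitrary; only which points share a number is meaningful.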

K-NN classifier

  • https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm

The k-nearest neighbors algorithm can be used for both classification and regression.
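
Both uses can be sketched in a few lines of plain Python. The sketch assumes Euclidean distance, a majority vote for classification and a plain average for regression; the function names and the training data are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classification: majority label among the k nearest training points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def knn_regress(train, query, k=3):
    """Regression: mean target value of the k nearest training points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return sum(value for _, value in nearest) / len(nearest)


# Hypothetical labelled points: (features, label).
train_cls = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b"),
]
print(knn_predict(train_cls, (1.1, 1.0)))

# Hypothetical numeric targets: (features, value).
train_reg = [((1.0,), 10.0), ((2.0,), 20.0), ((3.0,), 30.0), ((10.0,), 100.0)]
print(knn_regress(train_reg, (2.1,)))
```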

A confusion matrix or "matching matrix" is often used as a tool to validate the accuracy of k-NN classification.

  • https://en.wikipedia.org/wiki/Confusion_matrix
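
As a rough illustration, a confusion matrix can be built by counting (actual, predicted) label pairs. The sketch assumes rows are actual classes and columns are predicted classes; the helper names and the label lists are invented for the example.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


def accuracy(matrix):
    """Share of predictions on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical classifier output for six samples.
actual = ["spam", "spam", "ham", "ham", "ham", "spam"]
predicted = ["spam", "ham", "ham", "ham", "spam", "spam"]

matrix = confusion_matrix(actual, predicted, labels=["spam", "ham"])
print(matrix)
print(accuracy(matrix))
```

Off-diagonal cells show which classes are being mistaken for which, something a single accuracy figure hides.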

Decision trees

  • https://en.wikipedia.org/wiki/Decision_tree_learning

Create a model that predicts the value of a target variable based on several input variables. In a classification tree, the outcome is the (discrete) class to which the data belongs. In a regression tree, the outcome can be considered a real number.

Notable decision tree algorithms include:

  • ID3 (Iterative Dichotomiser 3)
  • C4.5 (successor of ID3)
  • CART (Classification And Regression Tree)
  • Chi-square automatic interaction detection (CHAID)
  • MARS (Multivariate Adaptive Regression Splines)

ID3

  • https://en.wikipedia.org/wiki/ID3_algorithm

An algorithm invented by Ross Quinlan[1], used to generate a decision tree from a dataset.
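
ID3 chooses the attribute to split on by information gain, that is, the reduction in entropy of the class labels. The sketch below covers only that selection step, not the full recursive tree construction; the function names and the tiny dataset are assumptions for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the rows on one attribute."""
    total = len(labels)
    remainder = 0.0
    for value in set(row[attribute] for row in rows):
        subset = [label for row, label in zip(rows, labels)
                  if row[attribute] == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """Attribute with the highest information gain (the split ID3 would pick)."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))


# Hypothetical dataset: "outlook" fully determines the label, "windy" does not.
rows = [
    {"outlook": "sunny", "windy": False},
    {"outlook": "sunny", "windy": True},
    {"outlook": "rain", "windy": False},
    {"outlook": "rain", "windy": True},
]
labels = ["no", "no", "yes", "yes"]

print(best_attribute(rows, labels, ["outlook", "windy"]))
```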

Naive Bayes classifier

  • https://en.wikipedia.org/wiki/Naive_Bayes_classifier

Document classification: naive Bayes classification can be applied to the document classification problem. Consider the problem of classifying documents by their content, for example into spam and non-spam e-mails.
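
A small sketch of the spam/non-spam case, assuming a multinomial word-count model with add-one (Laplace) smoothing and log probabilities to avoid underflow. The training messages, the function names and the "ham" label for non-spam are made up for the example.

```python
import math
from collections import Counter, defaultdict


def train(documents):
    """Count documents per class and words per class from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in documents:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_counts, word_counts, vocabulary


def classify(text, model):
    """Return the class with the highest log posterior for the given text."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        # Log prior of the class.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen in training
            # Add-one smoothing keeps unseen class/word pairs from zeroing the score.
            score += math.log((word_counts[label][word] + 1)
                              / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical training e-mails.
documents = [
    ("win money now", "spam"),
    ("cheap money offer", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]
model = train(documents)
print(classify("cheap money", model))
print(classify("monday meeting", model))
```

The "naive" part is the assumption that words occur independently given the class, which is why the per-word log probabilities are simply added.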

Apriori algorithm

  • https://en.wikipedia.org/wiki/Apriori_algorithm

An algorithm for association rule learning, commonly applied to market basket analysis.
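
The frequent-itemset part of Apriori can be sketched as follows, assuming support is measured as an absolute transaction count. Deriving association rules from the itemsets is left out; the function name and the baskets are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        # Support count: number of transactions containing the candidate.
        return {c: sum(1 for t in transactions if c <= t) for c in candidates}

    items = {frozenset([item]) for t in transactions for item in t}
    current = {c: n for c, n in count(items).items() if n >= min_support}
    frequent = dict(current)
    size = 2
    while current:
        # Join step: combine frequent itemsets into candidates one item larger.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == size}
        # Prune step: every subset of a frequent itemset must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, size - 1))}
        current = {c: n for c, n in count(candidates).items()
                   if n >= min_support}
        frequent.update(current)
        size += 1
    return frequent


# Hypothetical shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
frequent = apriori(baskets, min_support=2)
for itemset, support in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), support)
```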

Libraries/frameworks

  • scikit-learn
  • R (an open-source software environment for statistical computing, which includes several CART implementations such as the rpart, party and randomForest packages)
  • Weka (a free and open-source data-mining suite, contains many decision tree algorithms)
  • Orange
  • KNIME
  • OpenCV

W3Schools Python ML

  • https://www.w3schools.com/python/python_ml_getting_started.asp

    • matplotlib.pyplot.scatter
    • matplotlib.pyplot.hist
    • numpy.mean
    • numpy.median
    • numpy.std
    • numpy.var
    • numpy.percentile
    • numpy.random.uniform
    • numpy.random.normal
    • numpy.poly1d
    • numpy.polyfit
    • pandas.read_csv
    • scipy.stats.mode
    • scipy.stats.linregress
    • scipy.cluster.hierarchy.dendrogram
    • scipy.cluster.hierarchy.linkage
    • sklearn.metrics.r2_score
    • sklearn.linear_model
    • sklearn.preprocessing.StandardScaler

    • sklearn.tree
    • sklearn.tree.DecisionTreeClassifier

    • sklearn.metrics.confusion_matrix
    • sklearn.metrics.accuracy_score
    • sklearn.metrics.precision_score
    • sklearn.metrics.recall_score
    • sklearn.metrics.f1_score
    • sklearn.cluster.AgglomerativeClustering

    • sklearn.linear_model.LogisticRegression

    • sklearn.cluster.KMeans
    • sklearn.neighbors.KNeighborsClassifier