Revision 3 as of 2023-06-02 14:42:52

Machine Learning

K-means clustering

  • https://en.wikipedia.org/wiki/K-means_clustering

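K-means clustering aims to partition n observations into k clusters, where each observation belongs to the cluster with the nearest mean (centroid). It is an unsupervised learning algorithm.

  • PSPP contains k-means; the QUICK CLUSTER command performs k-means clustering on the dataset.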
  • Weka contains k-means and x-means.
  • Octave contains k-means.
  • OpenCV contains a k-means implementation.
  • Spark MLlib implements a distributed k-means algorithm.
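
To make the partitioning idea concrete, here is a minimal pure-Python sketch of the standard iterative k-means procedure (Lloyd's algorithm). It is an illustration only, not the implementation used by any of the tools listed above; the function names, the seed parameter and the sample points are made up for this example.

```python
import random

def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, max_iterations=100, seed=0):
    """Partition points into k clusters; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k of the observations
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, labels

# Made-up data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2)
print(labels)
```

The result depends on the initial centroids, so real implementations usually run several initialisations and keep the best one.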

K-NN classifier

  • https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm

The k-nearest neighbors (k-NN) algorithm can be used for both classification and regression.
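
A minimal sketch of both uses, assuming Euclidean distance and an unweighted vote (classification) or unweighted mean (regression) over the k nearest training points. The function names and the training data are hypothetical.

```python
import math
from collections import Counter

def nearest(train, query, k):
    """Return the k training items (features, target) closest to query."""
    return sorted(train, key=lambda item: math.dist(item[0], query))[:k]

def knn_classify(train, query, k=3):
    """Majority class among the k nearest neighbours."""
    votes = Counter(target for _, target in nearest(train, query, k))
    return votes.most_common(1)[0][0]

def knn_regress(train, query, k=3):
    """Mean target value of the k nearest neighbours."""
    targets = [target for _, target in nearest(train, query, k)]
    return sum(targets) / len(targets)

# Made-up training sets.
labelled = [((1, 1), "a"), ((1, 2), "a"), ((2, 1), "a"),
            ((8, 8), "b"), ((8, 9), "b"), ((9, 8), "b")]
numeric = [((1.0,), 10.0), ((2.0,), 20.0), ((3.0,), 30.0), ((10.0,), 100.0)]

print(knn_classify(labelled, (1.5, 1.5)))
print(knn_regress(numeric, (2.0,)))
```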

A confusion matrix or "matching matrix" is often used as a tool to validate the accuracy of k-NN classification.

  • https://en.wikipedia.org/wiki/Confusion_matrix
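
As a small illustration, a confusion matrix can be built by counting (actual, predicted) pairs. The sketch below assumes rows are actual classes and columns are predicted classes; the labels and predictions are invented for the example.

```python
from collections import Counter

def confusion_matrix(actual, predicted, labels):
    """Rows: actual class, columns: predicted class, in the order of labels."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]

# Made-up ground truth and classifier output.
actual    = ["spam", "spam", "ham", "ham", "ham",  "spam"]
predicted = ["spam", "ham",  "ham", "ham", "spam", "spam"]

matrix = confusion_matrix(actual, predicted, ["spam", "ham"])
# Accuracy is the share of predictions on the diagonal.
accuracy = sum(matrix[i][i] for i in range(len(matrix))) / len(actual)

for row in matrix:
    print(row)
print(accuracy)
```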

Decision trees

  • https://en.wikipedia.org/wiki/Decision_tree_learning

Decision tree learning creates a model that predicts the value of a target variable based on several input variables. In a classification tree, the outcome is the (discrete) class to which the data belongs. In a regression tree, the outcome is a real number.
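
The difference between the two tree types can be seen in a one-split tree (a "stump"). In this sketch the split threshold is supplied by hand, which is a simplification: real algorithms search for the best split. The data and function names are hypothetical.

```python
from collections import Counter
from statistics import mean

def majority_class(targets):
    """Leaf value for a classification tree: the most common class."""
    return Counter(targets).most_common(1)[0][0]

def fit_stump(xs, ys, threshold, leaf_value):
    """Split on x <= threshold and summarise each side with leaf_value."""
    left = [y for x, y in zip(xs, ys) if x <= threshold]
    right = [y for x, y in zip(xs, ys) if x > threshold]
    return threshold, leaf_value(left), leaf_value(right)

def predict(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value

# Made-up data: hours studied versus exam result.
hours  = [1, 2, 3, 6, 7, 8]
passed = ["no", "no", "no", "yes", "yes", "yes"]  # discrete target
scores = [40.0, 45.0, 50.0, 70.0, 80.0, 90.0]     # real-valued target

classifier = fit_stump(hours, passed, 4.5, majority_class)
regressor = fit_stump(hours, scores, 4.5, mean)

print(predict(classifier, 2), predict(classifier, 7))
print(predict(regressor, 2), predict(regressor, 7))
```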

Notable decision tree algorithms include:

  • ID3 (Iterative Dichotomiser 3)
  • C4.5 (successor of ID3)
  • CART (Classification And Regression Tree)
  • Chi-square automatic interaction detection (CHAID)
  • MARS (multivariate adaptive regression splines)

ID3

  • https://en.wikipedia.org/wiki/ID3_algorithm

ID3 is an algorithm invented by Ross Quinlan[1] that generates a decision tree from a dataset.
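
ID3 builds the tree top-down, at each node splitting on the attribute with the highest information gain (the reduction in entropy of the class labels). The following is a compact sketch for categorical attributes only, with no pruning; the toy dataset and function names are invented for the illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """Entropy of labels minus the weighted entropy after splitting on attribute."""
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [label for row, label in zip(rows, labels) if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder

def id3(rows, labels, attributes):
    """Return a nested dict {attribute: {value: subtree_or_class}}."""
    if len(set(labels)) == 1:  # pure node: stop with that class
        return labels[0]
    if not attributes:         # nothing left to split on: majority class
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: information_gain(rows, labels, a))
    remaining = [a for a in attributes if a != best]
    tree = {best: {}}
    for value in {row[best] for row in rows}:
        keep = [i for i, row in enumerate(rows) if row[best] == value]
        tree[best][value] = id3([rows[i] for i in keep],
                                [labels[i] for i in keep], remaining)
    return tree

# Made-up dataset in which "outlook" alone determines the class.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]

tree = id3(rows, labels, ["outlook", "windy"])
print(tree)
```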

Naive Bayes classifier

  • https://en.wikipedia.org/wiki/Naive_Bayes_classifier

Document classification is a typical application of naive Bayes classification. Consider the problem of classifying documents by their content, for example into spam and non-spam e-mails.
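
A minimal sketch of such a classifier, assuming a multinomial word-count model with add-one (Laplace) smoothing and whitespace tokenisation. The training messages are made up, and words never seen in training are simply ignored.

```python
import math
from collections import Counter, defaultdict

def train(documents):
    """documents: list of (text, label). Returns the counts needed for prediction."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in documents:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_counts, word_counts, vocabulary

def classify(model, text):
    """Return the label with the highest log posterior under the naive assumption."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_docs)  # log prior
        for word in text.lower().split():
            if word in vocabulary:  # skip words unseen in training
                # Add-one smoothing avoids zero probabilities.
                likelihood = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
                score += math.log(likelihood)
        scores[label] = score
    return max(scores, key=scores.get)

# Made-up training messages.
model = train([
    ("win money now", "spam"),
    ("cheap money offer", "spam"),
    ("meeting agenda attached", "ham"),
    ("lunch meeting tomorrow", "ham"),
])

print(classify(model, "cheap money"))
print(classify(model, "meeting tomorrow"))
```

Working in log space avoids numeric underflow when many word probabilities are multiplied together.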

Apriori algorithm

  • https://en.wikipedia.org/wiki/Apriori_algorithm

An algorithm for association rule learning, used for example in market basket analysis.
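
The core of Apriori is a level-wise search for frequent itemsets: candidates of size k are built only from frequent itemsets of size k-1, because any superset of an infrequent itemset is itself infrequent. The sketch below uses absolute support counts; the baskets and function names are invented for the example.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset found in >= min_support transactions."""
    transactions = [frozenset(t) for t in transactions]
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: combine frequent itemsets into candidates of size k.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {c for c in joined
                      if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent

def confidence(frequent, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    antecedent, consequent = frozenset(antecedent), frozenset(consequent)
    return frequent[antecedent | consequent] / frequent[antecedent]

# Made-up shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

frequent = apriori(baskets, min_support=3)
for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
print(confidence(frequent, {"beer"}, {"diapers"}))
```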

Libraries/frameworks

  • scikit-learn
  • R (an open-source software environment for statistical computing, which includes several CART implementations such as the rpart, party and randomForest packages)
  • Weka (a free and open-source data-mining suite that contains many decision tree algorithms)
  • Orange
  • KNIME
  • OpenCV

w3schools python ML

  • https://www.w3schools.com/python/python_ml_getting_started.asp

Functions, classes and modules covered:

  • matplotlib.pyplot.scatter, matplotlib.pyplot.hist
  • numpy.mean, numpy.median, numpy.std, numpy.var, numpy.percentile, numpy.random.uniform, numpy.random.normal, numpy.poly1d, numpy.polyfit
  • pandas.read_csv
  • scipy.stats.mode, scipy.stats.linregress, scipy.cluster.hierarchy.dendrogram, scipy.cluster.hierarchy.linkage
  • sklearn.linear_model, sklearn.linear_model.LogisticRegression
  • sklearn.preprocessing.StandardScaler
  • sklearn.tree, sklearn.tree.DecisionTreeClassifier
  • sklearn.metrics.r2_score, sklearn.metrics.confusion_matrix, sklearn.metrics.accuracy_score, sklearn.metrics.precision_score, sklearn.metrics.recall_score, sklearn.metrics.f1_score
  • sklearn.cluster.AgglomerativeClustering, sklearn.cluster.KMeans
  • sklearn.neighbors.KNeighborsClassifier
