{"id":35,"date":"2017-09-26T16:18:18","date_gmt":"2017-09-26T16:18:18","guid":{"rendered":"http:\/\/www.nullplug.org\/ML-Blog\/?p=35"},"modified":"2017-10-19T07:38:58","modified_gmt":"2017-10-19T07:38:58","slug":"supervised-learning","status":"publish","type":"post","link":"http:\/\/www.nullplug.org\/ML-Blog\/2017\/09\/26\/supervised-learning\/","title":{"rendered":"Supervised Learning"},"content":{"rendered":"<blockquote><p>\n  A big computer, a complex algorithm, and a long time does not equal science. &#8211; <a href=\"https:\/\/www.computerhope.com\/people\/robert_gentleman.htm\">Robert Gentleman<\/a>\n<\/p><\/blockquote>\n<h2>Examples<\/h2>\n<p>Before getting into what supervised learning precisely is, let&#8217;s look at some examples of supervised learning tasks:<\/p>\n<ol>\n<li><a href=\"https:\/\/www.kaggle.com\/uciml\/breast-cancer-wisconsin-data\">Identifying breast cancer<\/a>. <!--- ![Image of cells](http:\/\/curiousily.com\/assets\/1.diagnosing_breast_cancer_files\/biopsy.jpg) -->\n<ul>\n<li>A <a href=\"https:\/\/www.kaggle.com\/gargmanish\/basic-machine-learning-with-cancer\/data\">sample study<\/a>.<\/li>\n<\/ul>\n<\/li>\n<li><a href=\"http:\/\/cs.stanford.edu\/people\/karpathy\/ilsvrc\/\">Image classification<\/a>.\n<ul>\n<li>List of last year&#8217;s ILSVRC <a href=\"http:\/\/image-net.org\/challenges\/LSVRC\/2016\/results\">Winners<\/a><\/li>\n<\/ul>\n<\/li>\n<li><a href=\"https:\/\/www.kaggle.com\/c\/passenger-screening-algorithm-challenge\">Threat assessment<\/a><br \/>\n<!--- ![Best to be secure](https:\/\/media.giphy.com\/media\/pNTqjthDxPDBm\/giphy.gif) --><\/li>\n<li><a href=\"https:\/\/translate.google.com\/#\">Language Translation<\/a> <!--- ![Alignment problems](https:\/\/devblogs.nvidia.com\/wp-content\/uploads\/2015\/07\/Figure6_sample_translations1-624x282.png) -->\n<ul>\n<li><a href=\"https:\/\/www.nytimes.com\/2016\/12\/14\/magazine\/the-great-ai-awakening.html\">NYTimes: Google&#8217;s neural machine 
translation<\/a><\/li>\n<\/ul>\n<\/li>\n<li><a href=\"https:\/\/qph.ec.quoracdn.net\/main-qimg-1e398bd3e145227faf1cb31373dc4ec8.webp\">Identifying faces in images<\/a><\/li>\n<\/ol>\n<h4>Definition<\/h4>\n<p><em>Supervised learning<\/em> is concerned with the construction of machine learning algorithms $f\\colon 2^{D\\times T}\\times D\\to T$. A subset of $D\\times T$ is referred to as a set of <em>labeled examples<\/em>.<\/p>\n<p>We often assume that the labeled examples arise from restricting some presumed function $F\\colon D\\to T$ whose values we know on $E\\subset D$. We can then train $f$ on the pairs $&#92;{(s, F(s)) | s\\in E&#92;}$ and devise a cost function which measures the distance between the learned $f$ and the presumed $F$ (e.g., $L^2$-distance).<\/p>\n<p>Of the three branches of machine learning, supervised learning has been the most successfully applied to real-world problems. The existence of labeled examples usually leads to well-defined performance metrics that transform supervised learning tasks into two other tasks: finding an appropriate parametrized class of functions from which to choose $f$, and solving the optimization problem of finding the best function in that class.<\/p>\n<p>Typically $D$ will be some subset of $\\Bbb R^n$ and we will refer to the components of $\\Bbb R^n$ as <em>features<\/em>, <em>independent variables<\/em>, or <em>attributes<\/em> (many concepts in machine learning have many names). Sometimes $D$ will be discrete, in which case we refer to its components as <em>categorical variables<\/em>. There are a few simple tricks to map categorical variables into $\\Bbb R^n$ (such as <a href=\"https:\/\/en.wikipedia.org\/wiki\/One-hot\">one-hot encoding<\/a>), so it usually does not hurt to think of $D$ as a subset of $\\Bbb R^n$.<\/p>\n<p>When $T$ is discrete (typically finite), then the supervised learning problem is called a <em>classification problem<\/em>. 
When it is continuous (e.g., $\\Bbb R$) then it is called a <em>regression problem<\/em>.<\/p>\n<h3>Examples<\/h3>\n<p>Let us consider the following sequence of supervised learning methods in turn.<\/p>\n<ol>\n<li><a href=\"http:\/\/www.nullplug.org\/ML-Blog\/2017\/10\/04\/linear-regression\/\">Linear Regression<\/a> (Regression problems).<\/li>\n<li>$k$-Nearest Neighbors (Classification or Regression problems).<\/li>\n<li>Logistic Regression (Classification problems<sup id=\"fnref-35-1\"><a href=\"#fn-35-1\" class=\"jetpack-footnote\">1<\/a><\/sup>).<\/li>\n<li>Naive Bayes Classifier (Classification problems).<\/li>\n<li>Linear Discriminant Analysis (Classification problems).<\/li>\n<li>Support Vector Machines (Classification problems).<\/li>\n<li>Decision Trees (Primarily Classification problems).<\/li>\n<li>Neural Networks (Classification or Regression problems).<\/li>\n<\/ol>\n<div class=\"footnotes\">\n<hr \/>\n<ol>\n<li id=\"fn-35-1\">\nI know the name is confusing: despite being called a regression, it is used for classification.&#160;<a href=\"#fnref-35-1\">&#8617;<\/a>\n<\/li>\n<\/ol>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>A big computer, a complex algorithm, and a long time does not equal science. &#8211; Robert Gentleman Examples Before getting into what supervised learning precisely is, let&#8217;s look at some examples of supervised learning tasks: Identifying breast cancer. A sample study. Image classification. 
List of last year&#8217;s ILSVRC Winners Threat assessment Language Translation NYTimes: Google&#8217;s &hellip; <a href=\"http:\/\/www.nullplug.org\/ML-Blog\/2017\/09\/26\/supervised-learning\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Supervised Learning&#8221;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"advanced_seo_description":"","jetpack_seo_html_title":"","jetpack_seo_noindex":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2},"jetpack_post_was_ever_published":false},"categories":[3],"tags":[],"class_list":["post-35","post","type-post","status-publish","format-standard","hentry","category-supervised-learning"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p9dIpN-z","jetpack_likes_enabled":true,"jetpack-related-posts":[{"id":33,"url":"http:\/\/www.nullplug.org\/ML-Blog\/2017\/09\/26\/machine-learning-overview\/","url_meta":{"origin":35,"position":0},"title":"Machine Learning Overview","author":"Justin Noel","date":"September 26, 2017","format":false,"excerpt":"Science is knowledge which we understand so well that we can teach it to a computer; and if we don't fully understand something, it is an art to deal with it. 
Donald Knuth Introduction First Attempt at a Definition One says that an algorithm learns if its performance improves with\u2026","rel":"","context":"In &quot;General&quot;","block_context":{"text":"General","link":"http:\/\/www.nullplug.org\/ML-Blog\/category\/general\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/web.stanford.edu\/class\/cs234\/images\/header2.png?resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/web.stanford.edu\/class\/cs234\/images\/header2.png?resize=350%2C200 1x, https:\/\/i0.wp.com\/web.stanford.edu\/class\/cs234\/images\/header2.png?resize=525%2C300 1.5x, https:\/\/i0.wp.com\/web.stanford.edu\/class\/cs234\/images\/header2.png?resize=700%2C400 2x"},"classes":[]},{"id":508,"url":"http:\/\/www.nullplug.org\/ML-Blog\/2017\/11\/09\/problem-set-4\/","url_meta":{"origin":35,"position":1},"title":"Problem Set 4","author":"Justin Noel","date":"November 9, 2017","format":false,"excerpt":"Problem Set 4 This is to be completed by November 16th, 2017. Exercises Datacamp Complete the lessons: a. Supervised Learning in R: Regression b. Supervised Learning in R: Classification c. Exploratory Data Analysis (If you did not already do so) Let $\\lambda\\geq 0$, $X\\in \\Bbb R^n\\otimes \\Bbb R^m$, $Y\\in \\Bbb\u2026","rel":"","context":"In &quot;General&quot;","block_context":{"text":"General","link":"http:\/\/www.nullplug.org\/ML-Blog\/category\/general\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":118,"url":"http:\/\/www.nullplug.org\/ML-Blog\/2017\/09\/27\/additional-sources\/","url_meta":{"origin":35,"position":2},"title":"Additional Sources","author":"Justin Noel","date":"September 27, 2017","format":false,"excerpt":"Textbooks Machine Learning: A probabilistic perspective\u00a0by Kevin Murphy. The material in this book is closest to what we will cover in the course, but is unfortunately not available for free. 
Written by an academic and a practitioner of machine learning, this text is full of real world examples and applications,\u2026","rel":"","context":"In &quot;Supplementary material&quot;","block_context":{"text":"Supplementary material","link":"http:\/\/www.nullplug.org\/ML-Blog\/category\/supplementary-material\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":531,"url":"http:\/\/www.nullplug.org\/ML-Blog\/2017\/11\/17\/problem-set-5\/","url_meta":{"origin":35,"position":3},"title":"Problem Set 5","author":"Justin Noel","date":"November 17, 2017","format":false,"excerpt":"Problem Set 5 This is to be completed by November 23rd, 2017. Exercises Datacamp Complete the lesson: a. Machine Learning Toolbox R Lab: Write a function in R that will take in a vector of discrete variables and will produce the corresponding one hot encodings. Write a function in R\u2026","rel":"","context":"In &quot;General&quot;","block_context":{"text":"General","link":"http:\/\/www.nullplug.org\/ML-Blog\/category\/general\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":538,"url":"http:\/\/www.nullplug.org\/ML-Blog\/2017\/11\/24\/problem-set-6\/","url_meta":{"origin":35,"position":4},"title":"Problem Set 6","author":"Justin Noel","date":"November 24, 2017","format":false,"excerpt":"Problem Set 6 This is to be completed by November 30th, 2017. Exercises Datacamp Complete the lesson: a. Text Mining: Bag of Words Exercises from Elements of Statistical Learning Complete exercises: a. 4.2 b. 
4.6 Run the perceptron learning algorithm by hand for the two class classification problem with $(X,Y)$-pairs\u2026","rel":"","context":"In &quot;General&quot;","block_context":{"text":"General","link":"http:\/\/www.nullplug.org\/ML-Blog\/category\/general\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":214,"url":"http:\/\/www.nullplug.org\/ML-Blog\/2017\/10\/04\/linear-regression\/","url_meta":{"origin":35,"position":5},"title":"Linear Regression","author":"Justin Noel","date":"October 4, 2017","format":false,"excerpt":"Prediction is very difficult, especially about the future. - Niels Bohr The problem Suppose we have a list of vectors (which we can think of as samples) $x_1, \\cdots, x_m\\in \\Bbb R^n$ and a corresponding list of output scalars $y_1, \\cdots, y_m \\in \\Bbb R$ (which we can regard as\u2026","rel":"","context":"In &quot;Regression&quot;","block_context":{"text":"Regression","link":"http:\/\/www.nullplug.org\/ML-Blog\/category\/regression\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/www.nullplug.org\/ML-Blog\/wp-content\/uploads\/2017\/10\/trace.png?resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/www.nullplug.org\/ML-Blog\/wp-content\/uploads\/2017\/10\/trace.png?resize=350%2C200 1x, https:\/\/i0.wp.com\/www.nullplug.org\/ML-Blog\/wp-content\/uploads\/2017\/10\/trace.png?resize=525%2C300 
1.5x"},"classes":[]}],"_links":{"self":[{"href":"http:\/\/www.nullplug.org\/ML-Blog\/wp-json\/wp\/v2\/posts\/35","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/www.nullplug.org\/ML-Blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.nullplug.org\/ML-Blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.nullplug.org\/ML-Blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"http:\/\/www.nullplug.org\/ML-Blog\/wp-json\/wp\/v2\/comments?post=35"}],"version-history":[{"count":11,"href":"http:\/\/www.nullplug.org\/ML-Blog\/wp-json\/wp\/v2\/posts\/35\/revisions"}],"predecessor-version":[{"id":440,"href":"http:\/\/www.nullplug.org\/ML-Blog\/wp-json\/wp\/v2\/posts\/35\/revisions\/440"}],"wp:attachment":[{"href":"http:\/\/www.nullplug.org\/ML-Blog\/wp-json\/wp\/v2\/media?parent=35"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.nullplug.org\/ML-Blog\/wp-json\/wp\/v2\/categories?post=35"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.nullplug.org\/ML-Blog\/wp-json\/wp\/v2\/tags?post=35"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}