Collecting, Analyzing, and Visualizing Data with Python - Part I

The Art of Analyzing Big Data - The Data Scientist’s Toolbox - Lecture 2

By Dr. Michael Fire


1. Collecting Data from Websites

Let's write code that gets the post titles from Guido van Rossum's blog. The first step is to download the page's raw HTML:
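Before touching the live site, it may help to see what "getting the titles" means once the HTML is in hand. The sketch below is only an illustration, not the lecture's own method: it assumes each post title sits inside an `<h3>` element whose class list includes `post-title` (which is how this blog's template marks them), and it uses the standard library's `html.parser` so it runs offline on a small hand-copied sample. The names `TitleParser`, `extract_titles`, and `sample` are hypothetical.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text of every <h3> whose class list contains 'post-title'."""

    def __init__(self):
        super().__init__()
        self.titles = []        # finished titles, in document order
        self._in_title = False  # True while inside a matching <h3>
        self._buffer = []       # text fragments of the current title

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        classes = (dict(attrs).get("class") or "").split()
        if tag == "h3" and "post-title" in classes:
            self._in_title = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_title:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "h3" and self._in_title:
            self.titles.append("".join(self._buffer).strip())
            self._in_title = False


def extract_titles(html):
    """Return the list of post titles found in an HTML string."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.titles


# Assumption: a tiny sample shaped like the blog's markup, not the full page.
sample = """
<h3 class='post-title entry-title' itemprop='name'>
<a href='http://neopythonic.blogspot.com/2019/03/why-operators-are-useful.html'>Why operators are useful</a>
</h3>
<h3 class='post-title entry-title' itemprop='name'>
<a href='http://neopythonic.blogspot.com/2016/05/union-syntax.html'>Union syntax</a>
</h3>
<h3>Why not X|Y|Z?</h3>
"""

print(extract_titles(sample))
```

The plain `<h3>` at the end of the sample is a sub-heading inside a post body; filtering on the `post-title` class is what keeps it out of the result. The same function should work on a downloaded page as long as the template keeps that class name.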

In [1]:
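import requests

# Download the blog's front page and decode the response body as UTF-8 text
u = "http://neopythonic.blogspot.com/"
s = requests.get(u).content.decode('utf-8')
s  # display the raw HTML string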
Out[1]:
'<!DOCTYPE html>\n<html dir=\'ltr\' xmlns=\'http://www.w3.org/1999/xhtml\' xmlns:b=\'http://www.google.com/2005/gml/b\' xmlns:data=\'http://www.google.com/2005/gml/data\' xmlns:expr=\'http://www.google.com/2005/gml/expr\'>\n<head>\n<link href=\'https://www.blogger.com/static/v1/widgets/2549344219-widget_css_bundle.css\' rel=\'stylesheet\' type=\'text/css\'/>\n<link href=\'http://google-code-prettify.googlecode.com/svn/trunk/src/prettify.css\' rel=\'stylesheet\' type=\'text/css\'/>\n<script src=\'http://google-code-prettify.googlecode.com/svn/trunk/src/prettify.js\' type=\'text/javascript\'></script>\n<meta content=\'text/html; charset=UTF-8\' http-equiv=\'Content-Type\'/>\n<meta content=\'blogger\' name=\'generator\'/>\n<link href=\'http://neopythonic.blogspot.com/favicon.ico\' rel=\'icon\' type=\'image/x-icon\'/>\n<link href=\'http://neopythonic.blogspot.com/\' rel=\'canonical\'/>\n<link rel="alternate" type="application/atom+xml" title="Neopythonic - Atom" href="http://neopythonic.blogspot.com/feeds/posts/default" />\n<link rel="alternate" type="application/rss+xml" title="Neopythonic - RSS" href="http://neopythonic.blogspot.com/feeds/posts/default?alt=rss" />\n<link rel="service.post" type="application/atom+xml" title="Neopythonic - Atom" href="https://www.blogger.com/feeds/4195135246107166251/posts/default" />\n<link rel="me" href="https://www.blogger.com/profile/12821714508588242516" />\n<!--Can\'t find substitution for tag [blog.ieCssRetrofitLinks]-->\n<meta content=\'http://neopythonic.blogspot.com/\' property=\'og:url\'/>\n<meta content=\'Neopythonic\' property=\'og:title\'/>\n<meta content=\'Ramblings through technology, politics, culture and philosophy by the creator of the Python programming language.\' property=\'og:description\'/>\n<!--[if IE]> <script> (function() { var html5 = ("abbr,article,aside,audio,canvas,datalist,details," + "figure,footer,header,hgroup,mark,menu,meter,nav,output," + "progress,section,time,video").split(\',\'); for (var i = 0; 
i < html5.length; i++) { document.createElement(html5[i]); } try { document.execCommand(\'BackgroundImageCache\', false, true); } catch(e) {} })(); </script> <![endif]-->\n<title>Neopythonic</title>\n<style id=\'page-skin-1\' type=\'text/css\'><!--\n/*\n-----------------------------------------------\nBlogger Template Style\nName:     Dots\nDate:     24 Feb 2004\nUpdated by: Blogger Team\n----------------------------------------------- */\n\nbody {\nmargin: 0px 0px 0px 0px;\nbackground:#fff url("https://resources.blogblog.com/blogblog/data/dots/bg_dots.gif");\nbackground-position: 50% 31px;\ntext-align:center;\nfont:x-small Verdana, Arial, Sans-serif;\ncolor:#333333;\nfont-size/* */:/**/small;\nfont-size: /**/small;\n}\n/* Page Structure\n----------------------------------------------- */\n#outer-wrapper {\nbackground:url("https://resources.blogblog.com/blogblog/data/dots/bg_3dots.gif") no-repeat 250px 50px;\nwidth:700px;\nmargin:0 auto;\ntext-align:left;\nfont:normal normal 100% Verdana,Arial,Sans-Serif;\n}\n#header-wrapper {\ndisplay: none;\n}\n#main-wrapper {\nwidth:450px;\nfloat:right;\npadding:100px 0 20px;\nfont-size:85%;\nword-wrap: break-word; /* fix for long text breaking sidebar float in IE */\noverflow: hidden;     /* fix for long non-text content breaking IE sidebar float */\n}\n#main {\nbackground:url("https://resources.blogblog.com/blogblog/data/dots/bg_dots2.gif") -100px -100px;\npadding:20px 10px 15px;\n}\n#sidebar-wrapper {\nwidth:200px;\nfloat:left;\nfont-size:85%;\npadding-bottom:20px;\nword-wrap: break-word; /* fix for long text breaking sidebar float in IE */\noverflow: hidden;     /* fix for long non-text content breaking IE sidebar float */\n}\n#sidebar {\nbackground:url("https://resources.blogblog.com/blogblog/data/dots/bg_dots2.gif") 150px -50px;\npadding:40px 10px 15px;\nwidth:200px;\nwidth/* */:/**/180px;\nwidth: /**/180px;\n}\n/* Title & Description\n----------------------------------------------- */\n.Header h1 {\nmargin:0 0 
.5em;\nline-height: 1.4em;\nfont: normal normal 250% Georgia,Serif;\ncolor: #335533;\n}\n.Header h1 a {\ncolor:#335533;\ntext-decoration:none;\n}\n.Header .description {\nmargin:0 0 1.75em;\ncolor: #999966;\nfont: normal normal 100% Verdana, Arial, Sans-Serif;\n}\n/* Links\n----------------------------------------------- */\na:link {\ncolor:#448888;\n}\na:visited {\ncolor:#888855;\n}\na:hover {\ncolor:#888855;\n}\na img {\nborder-width:0;\n}\n/* Posts\n----------------------------------------------- */\nh2.date-header {\nmargin:0 0 .75em;\npadding-bottom:.35em;\nborder-bottom:1px dotted #99bb99;\ntext-transform:uppercase;\nletter-spacing:.3em;\ncolor: #666633;\nfont: normal normal 95% Georgia, Serif;\n}\n.post {\nmargin:0 0 2.5em;\n}\n.post h3 {\nmargin:.25em 0;\nline-height: 1.4em;\nfont: normal normal 100% Georgia,Serif;\nfont-size: 130%;\nfont-weight: bold;\ncolor:#999966;\nbackground:url("https://resources.blogblog.com/blogblog/data/dots/bg_post_title_left.gif") no-repeat left .25em;\npadding:0 0 1px 45px;\n}\n.post h3 a {\ntext-decoration:none;\ncolor: #999966;\n}\n.post h3 a:hover {\ncolor: #333333;\n}\n.post-body {\nmargin:0 0 .75em;\nline-height:1.6em;\n}\n.post-body blockquote {\nline-height:1.3em;\n}\n.post-footer {\nmargin:0;\n}\n.uncustomized-post-template .post-footer {\ntext-align: right;\n}\n.uncustomized-post-template .post-author,\n.uncustomized-post-template .post-timestamp {\ndisplay: block;\nfloat: left;\nmargin-right: 4px;\ntext-align: left;\n}\n.post-author, .post-timestamp {\ncolor:#999966;\n}\na.comment-link {\n/* IE5.0/Win doesn\'t apply padding to inline elements,\nso we hide these two declarations from it */\nbackground/* */:/**/url("https://resources.blogblog.com/blogblog/data/dots/icon_comment_left.gif") no-repeat left .25em;\npadding-left:15px;\n}\nhtml>body a.comment-link {\n/* Respecified, for IE5/Mac\'s benefit */\nbackground:url("https://resources.blogblog.com/blogblog/data/dots/icon_comment_left.gif") no-repeat left 
.25em;\npadding-left:15px;\n}\n.post img, table.tr-caption-container {\nmargin:0 0 5px 0;\npadding:4px;\nborder:1px solid #99bb99;\n}\n.tr-caption-container img {\nborder: none;\nmargin: 0;\npadding: 0;\n}\n.feed-links {\nclear: both;\nline-height: 2.5em;\n}\n#blog-pager-newer-link {\nfloat: left;\n}\n#blog-pager-older-link {\nfloat: right;\n}\n#blog-pager {\ntext-align: center;\n}\n/* Comments\n----------------------------------------------- */\n#comments {\nmargin:0;\n}\n#comments h4 {\nmargin:0 0 10px;\nborder-top:1px dotted #99bb99;\npadding-top:.5em;\nline-height: 1.4em;\nfont: bold 110% Georgia,Serif;\ncolor:#333;\n}\n#comments-block {\nline-height:1.6em;\n}\n.comment-author {\nbackground:url("https://resources.blogblog.com/blogblog/data/dots/icon_comment_left.gif") no-repeat 2px .35em;\nmargin:.5em 0 0;\npadding-top: 0;\npadding-bottom:0;\npadding-left:20px;\npadding-right:20px;\nfont-weight:bold;\n}\n.comment-body {\nmargin:0;\npadding-top: 0;\npadding-bottom:0;\npadding-left:20px;\npadding-right:20px;\n}\n.comment-body p {\nmargin:0 0 .5em;\n}\n.comment-footer {\nmargin:0 0 .5em;\npadding:0 0 .75em 20px;\npadding-top: 0;\npadding-bottom:.75em;\npadding-left:20px;\npadding-right:0;\ncolor:#996;\n}\n.comment-footer a:link {\ncolor:#996;\n}\n.deleted-comment {\nfont-style:italic;\ncolor:gray;\n}\n/* More Sidebar Content\n----------------------------------------------- */\n.sidebar h2 {\nmargin:2em 0 .75em;\npadding-bottom:.35em;\nborder-bottom:1px dotted #99bb99;\nline-height: 1.4em;\nfont-size: 95%;\nfont: normal normal 100% Georgia,Serif;\ntext-transform:uppercase;\nletter-spacing:.3em;\ncolor:#666633;\n}\n.sidebar p {\nmargin:0 0 .75em;\nline-height:1.6em;\n}\n.sidebar ul {\nlist-style:none;\nmargin:.5em 0;\npadding:0 0px;\n}\n.sidebar .widget {\nmargin: .5em 0 1em;\npadding: 0 0px;\nline-height: 1.5em;\n}\n.main .widget {\npadding-bottom: 1em;\n}\n.sidebar ul li {\nbackground:url("https://resources.blogblog.com/blogblog/data/dots/bullet.gif") no-repeat 
3px .45em;\nmargin:0;\npadding-top: 0;\npadding-bottom:5px;\npadding-left:15px;\npadding-right:0;\n}\n.sidebar p {\nmargin:0 0 .6em;\n}\n/* Profile\n----------------------------------------------- */\n.profile-datablock {\nmargin: 0 0 1em;\n}\n.profile-img {\nfloat: left;\nmargin-top: 0;\nmargin-bottom:5px;\nmargin-left:0;\nmargin-right:8px;\nborder: 4px solid #cc9;\n}\n.profile-data {\nmargin: 0;\nline-height: 1.5em;\n}\n.profile-textblock {\nclear: left;\nmargin-left: 0;\n}\n/* Footer\n----------------------------------------------- */\n#footer {\nclear:both;\npadding:15px 0 0;\n}\n#footer p {\nmargin:0;\n}\n/* Page structure tweaks for layout editor wireframe */\nbody#layout #sidebar, body#layout #main,\nbody#layout #main-wrapper,\nbody#layout #outer-wrapper,\nbody#layout #sidebar-wrapper {\npadding: 0;\n}\nbody#layout #sidebar, body#layout #sidebar-wrapper {\npadding: 0;\nwidth: 240px;\n}\n\n--></style>\n<link href=\'https://www.blogger.com/dyn-css/authorization.css?targetBlogID=4195135246107166251&amp;zx=3404a461-9e33-4ad1-8b2b-080af63b1f5a\' media=\'none\' onload=\'if(media!=&#39;all&#39;)media=&#39;all&#39;\' rel=\'stylesheet\'/><noscript><link href=\'https://www.blogger.com/dyn-css/authorization.css?targetBlogID=4195135246107166251&amp;zx=3404a461-9e33-4ad1-8b2b-080af63b1f5a\' rel=\'stylesheet\'/></noscript>\n\n</head>\n<body onload=\'prettyPrint()\'>\n<div class=\'navbar section\' id=\'navbar\'><div class=\'widget Navbar\' data-version=\'1\' id=\'Navbar1\'><script type="text/javascript">\n    function setAttributeOnload(object, attribute, val) {\n      if(window.addEventListener) {\n        window.addEventListener(\'load\',\n          function(){ object[attribute] = val; }, false);\n      } else {\n        window.attachEvent(\'onload\', function(){ object[attribute] = val; });\n      }\n    }\n  </script>\n<div id="navbar-iframe-container"></div>\n<script type="text/javascript" src="https://apis.google.com/js/plusone.js"></script>\n<script 
type="text/javascript">\n      gapi.load("gapi.iframes:gapi.iframes.style.bubble", function() {\n        if (gapi.iframes && gapi.iframes.getContext) {\n          gapi.iframes.getContext().openChild({\n              url: \'https://www.blogger.com/navbar.g?targetBlogID\\x3d4195135246107166251\\x26blogName\\x3dNeopythonic\\x26publishMode\\x3dPUBLISH_MODE_BLOGSPOT\\x26navbarType\\x3dBLUE\\x26layoutType\\x3dLAYOUTS\\x26searchRoot\\x3dhttps://neopythonic.blogspot.com/search\\x26blogLocale\\x3den\\x26v\\x3d2\\x26homepageUrl\\x3dhttp://neopythonic.blogspot.com/\\x26vt\\x3d5369531368964104765\',\n              where: document.getElementById("navbar-iframe-container"),\n              id: "navbar-iframe"\n          });\n        }\n      });\n    </script><script type="text/javascript">\n(function() {\nvar script = document.createElement(\'script\');\nscript.type = \'text/javascript\';\nscript.src = \'//pagead2.googlesyndication.com/pagead/js/google_top_exp.js\';\nvar head = document.getElementsByTagName(\'head\')[0];\nif (head) {\nhead.appendChild(script);\n}})();\n</script>\n</div></div>\n<div id=\'outer-wrapper\'><div id=\'wrap2\'>\n<!-- skip links for text browsers -->\n<span id=\'skiplinks\' style=\'display:none;\'>\n<a href=\'#main\'>skip to main </a> |\n      <a href=\'#sidebar\'>skip to sidebar</a>\n</span>\n<div id=\'content-wrapper\'>\n<div id=\'crosscol-wrapper\' style=\'text-align:center\'>\n<div class=\'crosscol no-items section\' id=\'crosscol\'></div>\n</div>\n<div id=\'main-wrapper\'>\n<div class=\'main section\' id=\'main\'><div class=\'widget Blog\' data-version=\'1\' id=\'Blog1\'>\n<div class=\'blog-posts hfeed\'>\n\n          <div class="date-outer">\n        \n<h2 class=\'date-header\'><span>Friday, March 15, 2019</span></h2>\n\n          <div class="date-posts">\n        \n<div class=\'post-outer\'>\n<div class=\'post hentry uncustomized-post-template\' itemprop=\'blogPost\' itemscope=\'itemscope\' itemtype=\'http://schema.org/BlogPosting\'>\n<meta 
content=\'4195135246107166251\' itemprop=\'blogId\'/>\n<meta content=\'775339472173253922\' itemprop=\'postId\'/>\n<a name=\'775339472173253922\'></a>\n<h3 class=\'post-title entry-title\' itemprop=\'name\'>\n<a href=\'http://neopythonic.blogspot.com/2019/03/why-operators-are-useful.html\'>Why operators are useful</a>\n</h3>\n<div class=\'post-header\'>\n<div class=\'post-header-line-1\'></div>\n</div>\n<div class=\'post-body entry-content\' id=\'post-body-775339472173253922\' itemprop=\'description articleBody\'>\nThis is something I posted on python-ideas, but I think it\'s interesting to a wider audience.<br />\n<br />\nThere\'s been a lot of discussion recently about an operator to merge two dicts.<br />\n<br />\nIt prompted me to think about the reason (some) people like operators, and a discussion I had with my mentor Lambert Meertens over 30 years ago came to mind.<br />\n<br />\nFor mathematicians, operators are essential to how they think. Take a simple operation like adding two numbers, and try exploring some of its behavior.<br />\n<br />\n&nbsp;&nbsp;&nbsp; add(x, y) == add(y, x)&nbsp;&nbsp;&nbsp; (1)<br />\n<br />\nEquation (1) expresses the law that addition is commutative. 
It\'s usually written using an operator, which makes it more concise:<br />\n<br />\n&nbsp;&nbsp;&nbsp; x + y == y + x&nbsp;&nbsp;&nbsp; (1a)<br />\n<br />\nThat feels like a minor gain.<br />\n<br />\nNow consider the associative law:<br />\n<br />\n&nbsp;&nbsp;&nbsp; add(x, add(y, z)) == add(add(x, y), z)&nbsp;&nbsp;&nbsp; (2)<br />\n<br />\nEquation (2) can be rewritten using operators:<br />\n<br />\n&nbsp;&nbsp;&nbsp; x + (y + z) == (x + y) + z&nbsp;&nbsp;&nbsp; (2a)<br />\n<br />\nThis is much less confusing than (2), and leads to the observation that the parentheses are redundant, so now we can write<br />\n<br />\n&nbsp;&nbsp;&nbsp; x + y + z&nbsp;&nbsp;&nbsp; (3)<br />\n<br />\nwithout ambiguity (it doesn\'t matter whether the + operator binds tighter to the left or to the right).<br />\n<br />\nMany other laws are also written more easily using operators.&nbsp; Here\'s one more example, about the identity element of addition:<br />\n<br />\n&nbsp;&nbsp;&nbsp; add(x, 0) == add(0, x) == x&nbsp;&nbsp;&nbsp; (4)<br />\n<br />\ncompare to<br />\n<br />\n&nbsp;&nbsp;&nbsp; x + 0 == 0 + x == x&nbsp;&nbsp;&nbsp; (4a)<br />\n<br />\nThe general idea here is that once you\'ve learned this simple notation, equations written using them are easier to *manipulate* than equations written using functional notation -- it is as if our brains grasp the operators using different brain machinery, and this is more efficient.<br />\n<br />\nI think that the fact that formulas written using operators are more easily processed *visually* has something to do with it: they engage the brain\'s visual processing machinery, which operates largely subconsciously, and tells the conscious part what it sees (e.g. "chair" rather than "pieces of wood joined together"). 
The functional notation must take a different path through our brain, which is less subconscious (it\'s related to reading and understanding what you read, which is learned/trained at a much later age than visual processing).<br />\n<br />\nThe power of visual processing really becomes apparent when you combine multiple operators. For example, consider the distributive law:<br />\n<br />\n&nbsp;&nbsp;&nbsp; mul(n, add(x, y)) == add(mul(n, x), mul(n, y))&nbsp; (5)<br />\n<br />\nThat was painful to write, and I believe that at first you won\'t see the pattern (or at least you wouldn\'t have immediately seen it if I hadn\'t mentioned this was the distributive law).<br />\n<br />\nCompare to:<br />\n<br />\n&nbsp;&nbsp;&nbsp; n * (x + y) == n * x + n * y&nbsp;&nbsp;&nbsp; (5a)<br />\n<br />\nNotice how this also uses relative operator priorities. Often mathematicians write this even more compact:<br />\n<br />\n&nbsp;&nbsp;&nbsp; n(x+y) == nx + ny&nbsp;&nbsp;&nbsp; (5b)<br />\n<br />\nbut alas, that currently goes beyond the capacities of Python\'s parser.<br />\n<br />\nAnother very powerful aspect of operator notation is that it is convenient to apply them to objects of different types. For example, laws (1) through (5) also work when x, y and z are same-size vectors and n is a scalar (substituting a vector of zeros for the literal "0"), and also if they are matrices (again, n has to be a scalar).<br />\n<br />\nAnd you can do this with objects in many different domains. 
For example, the above laws (1) through (5) apply to functions too (n being a scalar again).<br />\n<br />\nBy choosing the operators wisely, mathematicians can employ their visual brain to help them do math better: they\'ll discover new interesting laws sooner because sometimes the symbols on the blackboard just jump at you and suggest a path to an elusive proof.<br />\n<br />\nNow, programming isn\'t exactly the same activity as math, but we all know that Readability Counts, and this is where operator overloading in Python comes in. Once you\'ve internalized the simple properties which operators tend to have, using + for string or list concatenation becomes more readable than a pure OO notation, and (2) and (3) above explain (in part) why that is.<br />\n<br />\nOf course, it\'s definitely possible to overdo this -- then you get Perl. But I think that the folks who point out "there is already a way to do this" are missing the point that it really is easier to grasp the meaning of this:<br />\n<br />\n&nbsp;&nbsp;&nbsp; d = d1 + d2<br />\n<br />\ncompared to this:<br />\n<br />\n&nbsp;&nbsp;&nbsp; d = d1.copy()<br />\n&nbsp;&nbsp;&nbsp; d.update(d2)&nbsp;&nbsp;&nbsp; # CORRECTED: This line was previously wrong<br />\n<br />\nand it is not just a matter of fewer lines of code: the first form allows us to use our visual processing to help us see the meaning quicker -- and without distracting other parts of our brain (which might already be occupied by keeping track of the meaning of d1 and d2, for example).<br />\n<br />\nOf course, everything comes at a price. You have to learn the operators, and you have to learn their properties when applied to different object types. (This is true in math too -- for numbers, x*y == y*x, but this property does not apply to functions or matrices; OTOH x+y == y+x applies to all, as does the associative law.)<br />\n<br />\n"But what about performance?" I hear you ask. Good question. IMO, readability comes first, performance second. 
And in the basic example (d = d1 + d2) there is no performance loss compared to the two-line version using update, and a clear win in readability. I can think of many situations where performance difference is irrelevant but readability is of utmost importance, and for me this is the default assumption (even at Dropbox -- our most performance critical code has already been rewritten in ugly Python or in Go). For the few cases where performance concerns are paramount, it\'s easy to transform the operator version to something else -- *once you\'ve confirmed it\'s needed* (probably by profiling).\n<div style=\'clear: both;\'></div>\n</div>\n<div class=\'post-footer\'>\n<div class=\'post-footer-line post-footer-line-1\'>\n<span class=\'post-author vcard\'>\nPosted by\n<span class=\'fn\' itemprop=\'author\' itemscope=\'itemscope\' itemtype=\'http://schema.org/Person\'>\n<meta content=\'https://www.blogger.com/profile/12821714508588242516\' itemprop=\'url\'/>\n<a class=\'g-profile\' href=\'https://www.blogger.com/profile/12821714508588242516\' rel=\'author\' title=\'author profile\'>\n<span itemprop=\'name\'>Guido van Rossum</span>\n</a>\n</span>\n</span>\n<span class=\'post-timestamp\'>\nat\n<meta content=\'http://neopythonic.blogspot.com/2019/03/why-operators-are-useful.html\' itemprop=\'url\'/>\n<a class=\'timestamp-link\' href=\'http://neopythonic.blogspot.com/2019/03/why-operators-are-useful.html\' rel=\'bookmark\' title=\'permanent link\'><abbr class=\'published\' itemprop=\'datePublished\' title=\'2019-03-15T10:58:00-07:00\'>10:58 AM</abbr></a>\n</span>\n<span class=\'reaction-buttons\'>\n</span>\n<span class=\'post-comment-link\'>\n<a class=\'comment-link\' href=\'https://www.blogger.com/comment.g?blogID=4195135246107166251&postID=775339472173253922\' onclick=\'\'>\nNo comments:\n    </a>\n</span>\n<span class=\'post-backlinks post-comment-link\'>\n</span>\n<span class=\'post-icons\'>\n<span class=\'item-control blog-admin pid-1774424698\'>\n<a 
href=\'https://www.blogger.com/post-edit.g?blogID=4195135246107166251&postID=775339472173253922&from=pencil\' title=\'Edit Post\'>\n<img alt=\'\' class=\'icon-action\' height=\'18\' src=\'https://resources.blogblog.com/img/icon18_edit_allbkg.gif\' width=\'18\'/>\n</a>\n</span>\n</span>\n<div class=\'post-share-buttons goog-inline-block\'>\n</div>\n</div>\n<div class=\'post-footer-line post-footer-line-2\'>\n<span class=\'post-labels\'>\n</span>\n</div>\n<div class=\'post-footer-line post-footer-line-3\'>\n<span class=\'post-location\'>\n</span>\n</div>\n</div>\n</div>\n</div>\n\n          </div></div>\n        \n\n          <div class="date-outer">\n        \n<h2 class=\'date-header\'><span>Monday, November 26, 2018</span></h2>\n\n          <div class="date-posts">\n        \n<div class=\'post-outer\'>\n<div class=\'post hentry uncustomized-post-template\' itemprop=\'blogPost\' itemscope=\'itemscope\' itemtype=\'http://schema.org/BlogPosting\'>\n<meta content=\'4195135246107166251\' itemprop=\'blogId\'/>\n<meta content=\'2471146972433715807\' itemprop=\'postId\'/>\n<a name=\'2471146972433715807\'></a>\n<h3 class=\'post-title entry-title\' itemprop=\'name\'>\n<a href=\'http://neopythonic.blogspot.com/2018/11/what-do-do-with-your-computer-science.html\'>What to do with your computer science career</a>\n</h3>\n<div class=\'post-header\'>\n<div class=\'post-header-line-1\'></div>\n</div>\n<div class=\'post-body entry-content\' id=\'post-body-2471146972433715807\' itemprop=\'description articleBody\'>\nI regularly receive questions from students in the field of computer science looking for career advice.<br />\n<br />\nHere\'s an answer I wrote to one of them. It\'s not comprehensive or anything, but I thought people might find it interesting.<br />\n<br />\n[A question about whether to choose a 9-5 job or be an entrepreneur]<br />\n<br />\nThe question about "9-5" vs. 
"entrepreneur" is a complex one -- not everybody can be a successful entrepreneur (who would do the work? :-) and not everybody has the temperament for it. For me personally it was never an option -- there are vast parts of management and entrepreneurship that I wouldn\'t enjoy doing, such as hiring (I hate interviewing and am bad at it) and firing (too emotionally draining -- even just giving negative feedback is hard for me). Pitching ideas to investors is another thing that I\'d rather do without.<br />\n<br />\nIf any of that resonates with you, you may be better off not opting for entrepreneurship -- the kind of 9-5 software development jobs I have had are actually (mostly) very rewarding: I get to write software that gets used by hundreds or thousands of other developers (or millions in the case of Python), and those other developers in turn use my software to produce product that get uses by hundreds of thousands or, indeed hundreds of millions of users. Not every 9-5 job is the same! For me personally, I don\'t like the product stuff (since usually that means it\'s products I have no interest in using myself), but "your mileage may vary" (as they say in the US). Just try to do better than an entry-level web development job;&nbsp; that particular field (editing HTML and CSS) is likely to be automated away, and would feel repetitive to me.<br />\n<br />\n[A question about whether AI would make human software developers redundant (not about what I think of the field of AI as a career choice)]<br />\n<br />\nRegarding AI, I\'m not worried at all. The field is focused on automating boring, repetitive tasks like driving a car or recognizing faces, which humans can learn to do easily but find boring if they have to do it all the time. 
The field of software engineering (which includes the field of AI) is never boring, since as soon as a task is repetitive, you automate it, and you start solving new problems.\n<div style=\'clear: both;\'></div>\n</div>\n<div class=\'post-footer\'>\n<div class=\'post-footer-line post-footer-line-1\'>\n<span class=\'post-author vcard\'>\nPosted by\n<span class=\'fn\' itemprop=\'author\' itemscope=\'itemscope\' itemtype=\'http://schema.org/Person\'>\n<meta content=\'https://www.blogger.com/profile/12821714508588242516\' itemprop=\'url\'/>\n<a class=\'g-profile\' href=\'https://www.blogger.com/profile/12821714508588242516\' rel=\'author\' title=\'author profile\'>\n<span itemprop=\'name\'>Guido van Rossum</span>\n</a>\n</span>\n</span>\n<span class=\'post-timestamp\'>\nat\n<meta content=\'http://neopythonic.blogspot.com/2018/11/what-do-do-with-your-computer-science.html\' itemprop=\'url\'/>\n<a class=\'timestamp-link\' href=\'http://neopythonic.blogspot.com/2018/11/what-do-do-with-your-computer-science.html\' rel=\'bookmark\' title=\'permanent link\'><abbr class=\'published\' itemprop=\'datePublished\' title=\'2018-11-26T09:13:00-08:00\'>9:13 AM</abbr></a>\n</span>\n<span class=\'reaction-buttons\'>\n</span>\n<span class=\'post-comment-link\'>\n<a class=\'comment-link\' href=\'https://www.blogger.com/comment.g?blogID=4195135246107166251&postID=2471146972433715807\' onclick=\'\'>\nNo comments:\n    </a>\n</span>\n<span class=\'post-backlinks post-comment-link\'>\n</span>\n<span class=\'post-icons\'>\n<span class=\'item-control blog-admin pid-1774424698\'>\n<a href=\'https://www.blogger.com/post-edit.g?blogID=4195135246107166251&postID=2471146972433715807&from=pencil\' title=\'Edit Post\'>\n<img alt=\'\' class=\'icon-action\' height=\'18\' src=\'https://resources.blogblog.com/img/icon18_edit_allbkg.gif\' width=\'18\'/>\n</a>\n</span>\n</span>\n<div class=\'post-share-buttons goog-inline-block\'>\n</div>\n</div>\n<div class=\'post-footer-line post-footer-line-2\'>\n<span 
class=\'post-labels\'>\n</span>\n</div>\n<div class=\'post-footer-line post-footer-line-3\'>\n<span class=\'post-location\'>\n</span>\n</div>\n</div>\n</div>\n</div>\n\n          </div></div>\n        \n\n          <div class="date-outer">\n        \n<h2 class=\'date-header\'><span>Saturday, July 23, 2016</span></h2>\n\n          <div class="date-posts">\n        \n<div class=\'post-outer\'>\n<div class=\'post hentry uncustomized-post-template\' itemprop=\'blogPost\' itemscope=\'itemscope\' itemtype=\'http://schema.org/BlogPosting\'>\n<meta content=\'4195135246107166251\' itemprop=\'blogId\'/>\n<meta content=\'2468107226962512288\' itemprop=\'postId\'/>\n<a name=\'2468107226962512288\'></a>\n<h3 class=\'post-title entry-title\' itemprop=\'name\'>\n<a href=\'http://neopythonic.blogspot.com/2016/07/about-spammers-and-comments.html\'>About spammers and comments</a>\n</h3>\n<div class=\'post-header\'>\n<div class=\'post-header-line-1\'></div>\n</div>\n<div class=\'post-body entry-content\' id=\'post-body-2468107226962512288\' itemprop=\'description articleBody\'>\nI\'m turning off commenting for my blogs. While I\'ve enjoyed some feedback, the time wasted to moderate spam posts just isn\'t worth it. Thank you, spammers! 
:-(\n<div style=\'clear: both;\'></div>\n</div>\n<div class=\'post-footer\'>\n<div class=\'post-footer-line post-footer-line-1\'>\n<span class=\'post-author vcard\'>\nPosted by\n<span class=\'fn\' itemprop=\'author\' itemscope=\'itemscope\' itemtype=\'http://schema.org/Person\'>\n<meta content=\'https://www.blogger.com/profile/12821714508588242516\' itemprop=\'url\'/>\n<a class=\'g-profile\' href=\'https://www.blogger.com/profile/12821714508588242516\' rel=\'author\' title=\'author profile\'>\n<span itemprop=\'name\'>Guido van Rossum</span>\n</a>\n</span>\n</span>\n<span class=\'post-timestamp\'>\nat\n<meta content=\'http://neopythonic.blogspot.com/2016/07/about-spammers-and-comments.html\' itemprop=\'url\'/>\n<a class=\'timestamp-link\' href=\'http://neopythonic.blogspot.com/2016/07/about-spammers-and-comments.html\' rel=\'bookmark\' title=\'permanent link\'><abbr class=\'published\' itemprop=\'datePublished\' title=\'2016-07-23T14:11:00-07:00\'>2:11 PM</abbr></a>\n</span>\n<span class=\'reaction-buttons\'>\n</span>\n<span class=\'post-comment-link\'>\n<a class=\'comment-link\' href=\'https://www.blogger.com/comment.g?blogID=4195135246107166251&postID=2468107226962512288\' onclick=\'\'>\nNo comments:\n    </a>\n</span>\n<span class=\'post-backlinks post-comment-link\'>\n</span>\n<span class=\'post-icons\'>\n<span class=\'item-control blog-admin pid-1774424698\'>\n<a href=\'https://www.blogger.com/post-edit.g?blogID=4195135246107166251&postID=2468107226962512288&from=pencil\' title=\'Edit Post\'>\n<img alt=\'\' class=\'icon-action\' height=\'18\' src=\'https://resources.blogblog.com/img/icon18_edit_allbkg.gif\' width=\'18\'/>\n</a>\n</span>\n</span>\n<div class=\'post-share-buttons goog-inline-block\'>\n</div>\n</div>\n<div class=\'post-footer-line post-footer-line-2\'>\n<span class=\'post-labels\'>\n</span>\n</div>\n<div class=\'post-footer-line post-footer-line-3\'>\n<span class=\'post-location\'>\n</span>\n</div>\n</div>\n</div>\n</div>\n\n          
</div></div>\n        \n\n          <div class="date-outer">\n        \n<h2 class=\'date-header\'><span>Wednesday, May 18, 2016</span></h2>\n\n          <div class="date-posts">\n        \n<div class=\'post-outer\'>\n<div class=\'post hentry uncustomized-post-template\' itemprop=\'blogPost\' itemscope=\'itemscope\' itemtype=\'http://schema.org/BlogPosting\'>\n<meta content=\'4195135246107166251\' itemprop=\'blogId\'/>\n<meta content=\'4387175608679924841\' itemprop=\'postId\'/>\n<a name=\'4387175608679924841\'></a>\n<h3 class=\'post-title entry-title\' itemprop=\'name\'>\n<a href=\'http://neopythonic.blogspot.com/2016/05/union-syntax.html\'>Union syntax</a>\n</h3>\n<div class=\'post-header\'>\n<div class=\'post-header-line-1\'></div>\n</div>\n<div class=\'post-body entry-content\' id=\'post-body-4387175608679924841\' itemprop=\'description articleBody\'>\n<h2>\nUnion syntax</h2>\n<blockquote class="tr_bq">\n<i>(I\'m trying to do this as a quick post in response to some questions I received on this topic. I realize this will probably reopen the whole discussion about the best syntax for types, but sorry folks, PEP 484 was accepted nearly a year ago, after many months of discussions and hundreds of messages. It\'s unlikely that any idea you can think of here would be new. This post just explains the rationale of one particular decision and tries to put it in some context.)</i></blockquote>\nI\'ve heard some grumbling about the union syntax in <a href="https://www.python.org/dev/peps/pep-0484/">PEP 484</a>: Union[X, Y, Z] (where X, Y and Z are arbitrary type expressions). In the past people have suggested X|Y|Z for this, or (X, Y, Z) or {X, Y, Z}. Why did we go with the admittedly clunkier Union[X, Y, Z]?<br />\n<br />\nFirst of all, despite all the attention drawn to it, unions are actually a pretty minor feature, and you shouldn\'t be using them much. 
So you also shouldn\'t care that much.<br />\n<h3>\nWhy not X|Y|Z?</h3>\nThis won\'t fly because we want compatibility with versions of Python 3 that were already frozen (see below). We want to be able to express e.g. a union of int and str, which under this notation would be written as int|str. But for that to fly we\'d have to modify the builtin \'type\' class to implement __or__ -- and that wouldn\'t fly on already-frozen Python versions. Supporting X|Y only for types (like List) imported from the typing module and some other notation for builtin types would only sow confusion. So X|Y|Z is out.<br />\n<h3>\nWhy not {X, Y, Z}?</h3>\nThat\'s the set with elements X, Y and Z, using the builtin set notation. We can usefully consider types to be sets of values, and this makes a union a set of values too (that\'s why it\'s called union :-).<br />\n<br />\nHowever, {X, Y, Z} confuses the set of <i>types</i> with the set of <i>values</i>, which I consider a mortal sin. This would just cause endless confusion.<br />\n<br />\nThis notation would also confuse things when taking the union of several classes that overlap, e.g. if we have classes B and C, where C inherits from B, then the union of B and C is just B. But the builtin set doesn\'t see it that way. In contrast, the X|Y notation could actually solve this (since in principle we could overload __or__ to do whatever we want), and the Union[] operator ("functor"?) from PEP 484 indeed solves this -- in this example Union[B, C] returns the (non-union) type B, both in the type checker and at runtime.<br />\n<h3>\nWhy not (X, Y, Z)?</h3>\nThat\'s the tuple (X, Y, Z). It has the same disadvantages as {X, Y, Z}, but at least it has the advantage of being similar to how unions are expressed as arguments to isinstance(), for example isinstance(x, (int, str, list)) or isinstance(x, (Sequence, Mapping)). (Similarly the except clause: try: ... 
/ except (KeyError, IndexError): ...)<br />\n<br />\nAnother problem with tuples is that the tuple syntax is already overloaded in so many ways that it would be confused with other uses even more easily. One particular confusion would be other generic types, for which we\'d still want to use square brackets. (You can\'t really beat Iterable[int] for clarity if you have an iterable of integers. :-) Suppose you have a sequence of values that could be integers or strings. In PEP 484 notation we write this as Sequence[Union[int, str]]. Using the tuple notation we\'d want to write this as Sequence[(int, str)]. But it turns out that the __getitem__ overload on the metaclass can\'t tell the difference between Sequence[(int, str)] and Sequence[int, str] -- and we would like to reject the latter as a mistake since Sequence[] is a generic class over a single parameter. (An example of a generic class over two parameters would be Mapping[K, V].) Disambiguating all this would place us on very thin ice indeed.<br />\n<br />\nThe nail in this idea\'s coffin is the competing idea of using (X, Y, Z) to indicate a tuple with three items, with respective types, X, Y and Z. At first sight this seems an even better use of the tuple syntax than unions would be, and tuples are way more common than unions. But it runs afoul of the same problems with Foo[(X, Y)] vs. Foo[X, Y]. (Also, there would be no easy way to describe what PEP 484 calls Tuple[X, ...], i.e. a variable-length tuple with uniform item type X.)<br />\n<h3>\nPS. Why support old Python 3 versions?</h3>\nThe reason for supporting older versions is adoption. Only a relatively small crowd of early adopters can upgrade to the latest Python version as soon as it\'s out; the rest of us are stuck on older versions (even Python 2.7!). 
<br />\n<br />\nSo for PEP 484 and the typing module, we wanted to support 3.2 and up -- we chose 3.2 because it\'s the newest Python 3 supported by some older but still popular Ubuntu and Debian distributions. (Also, 3.0 and 3.1 were too immature at their time of release to ever have a large following.)<br />\n<br />\nThere\'s a typing package that you can install easily using pip, and this defines all sorts of useful things for typing, from Any and Union to generic versions of List and Sequence. But such a package can\'t modify existing builtins like int or list.<br />\n<br />\n(Eventually we also added Python 2.7 support, using type comments for function signatures.)\n<div style=\'clear: both;\'></div>\n</div>\n<div class=\'post-footer\'>\n<div class=\'post-footer-line post-footer-line-1\'>\n<span class=\'post-author vcard\'>\nPosted by\n<span class=\'fn\' itemprop=\'author\' itemscope=\'itemscope\' itemtype=\'http://schema.org/Person\'>\n<meta content=\'https://www.blogger.com/profile/12821714508588242516\' itemprop=\'url\'/>\n<a class=\'g-profile\' href=\'https://www.blogger.com/profile/12821714508588242516\' rel=\'author\' title=\'author profile\'>\n<span itemprop=\'name\'>Guido van Rossum</span>\n</a>\n</span>\n</span>\n<span class=\'post-timestamp\'>\nat\n<meta content=\'http://neopythonic.blogspot.com/2016/05/union-syntax.html\' itemprop=\'url\'/>\n<a class=\'timestamp-link\' href=\'http://neopythonic.blogspot.com/2016/05/union-syntax.html\' rel=\'bookmark\' title=\'permanent link\'><abbr class=\'published\' itemprop=\'datePublished\' title=\'2016-05-18T11:55:00-07:00\'>11:55 AM</abbr></a>\n</span>\n<span class=\'reaction-buttons\'>\n</span>\n<span class=\'post-comment-link\'>\n<a class=\'comment-link\' href=\'https://www.blogger.com/comment.g?blogID=4195135246107166251&postID=4387175608679924841\' onclick=\'\'>\nNo comments:\n    </a>\n</span>\n<span class=\'post-backlinks post-comment-link\'>\n</span>\n<span class=\'post-icons\'>\n<span 
class=\'item-control blog-admin pid-1774424698\'>\n<a href=\'https://www.blogger.com/post-edit.g?blogID=4195135246107166251&postID=4387175608679924841&from=pencil\' title=\'Edit Post\'>\n<img alt=\'\' class=\'icon-action\' height=\'18\' src=\'https://resources.blogblog.com/img/icon18_edit_allbkg.gif\' width=\'18\'/>\n</a>\n</span>\n</span>\n<div class=\'post-share-buttons goog-inline-block\'>\n</div>\n</div>\n<div class=\'post-footer-line post-footer-line-2\'>\n<span class=\'post-labels\'>\n</span>\n</div>\n<div class=\'post-footer-line post-footer-line-3\'>\n<span class=\'post-location\'>\n</span>\n</div>\n</div>\n</div>\n</div>\n<div class=\'post-outer\'>\n<div class=\'post hentry uncustomized-post-template\' itemprop=\'blogPost\' itemscope=\'itemscope\' itemtype=\'http://schema.org/BlogPosting\'>\n<meta content=\'4195135246107166251\' itemprop=\'blogId\'/>\n<meta content=\'8854185106045973213\' itemprop=\'postId\'/>\n<a name=\'8854185106045973213\'></a>\n<h3 class=\'post-title entry-title\' itemprop=\'name\'>\n<a href=\'http://neopythonic.blogspot.com/2016/05/adding-type-annotations-for-fspath.html\'>Adding type annotations for fspath</a>\n</h3>\n<div class=\'post-header\'>\n<div class=\'post-header-line-1\'></div>\n</div>\n<div class=\'post-body entry-content\' id=\'post-body-8854185106045973213\' itemprop=\'description articleBody\'>\n<div>\n<h1 class="ace-copy-paste-skip-this-tag">\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Type annotations for fspath</span></h1>\n</div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Python 3.6 will have a new </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z attrlink url"><a class="attrlink" 
data-target-href="http://www.pixelmonkey.org/2013/04/11/python-double-under-double-wonder" href="http://www.pixelmonkey.org/2013/04/11/python-double-under-double-wonder" rel="noreferrer nofollow" target="_blank">dunder protocol</a></span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">, </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">__fspath__()</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> , which should be supported by classes that represent filesystem paths. Example of such classes are the </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">pathlib.Path</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> family and </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">os.DirEntry</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;(returned by </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">os.scandir()</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> ).</span></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">You can read more about this protocol in the brand new </span><span 
class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z attrlink url"><a class="attrlink" data-target-href="https://www.python.org/dev/peps/pep-0519/" href="https://www.python.org/dev/peps/pep-0519/" rel="noreferrer nofollow" target="_blank">PEP 519</a></span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">. In this blog post I&#8217;m going to discuss how we would add type annotations for these additions to the standard library.</span></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">I&#8217;m making frequent use of </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">AnyStr</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> , a quite magical type variable predefined in the typing module. 
If you&#8217;re not familiar with it, I recommend reading my </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z attrlink url"><a class="attrlink" data-target-href="http://neopythonic.blogspot.com/2016/05/the-anystr-type-variable.html" href="http://neopythonic.blogspot.com/2016/05/the-anystr-type-variable.html" rel="noreferrer nofollow" target="_blank">blog post about </a></span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code attrlink url"><a class="attrlink" data-target-href="http://neopythonic.blogspot.com/2016/05/the-anystr-type-variable.html" href="http://neopythonic.blogspot.com/2016/05/the-anystr-type-variable.html" rel="noreferrer nofollow" target="_blank">AnyStr</a></span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> . 
You may also want to read up on </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z attrlink url"><a class="attrlink" data-target-href="https://www.python.org/dev/peps/pep-0484/#generics" href="https://www.python.org/dev/peps/pep-0484/#generics" rel="noreferrer nofollow" target="_blank">generics in PEP 484</a></span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> (or read </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z attrlink url"><a class="attrlink" data-target-href="http://mypy.readthedocs.io/en/latest/generics.html" href="http://mypy.readthedocs.io/en/latest/generics.html" rel="noreferrer nofollow" target="_blank">mypy&#8217;s docs on the subject</a></span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">).</span></div>\n<div>\n<h2>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Adding os.scandir() to the stubs for os.py</span></h2>\n</div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">For practice, let&#8217;s see if we can add something to the </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z attrlink url"><a class="attrlink" data-target-href="https://github.com/python/typeshed/blob/master/stdlib/3/os/__init__.pyi" href="https://github.com/python/typeshed/blob/master/stdlib/3/os/__init__.pyi" rel="noreferrer nofollow" target="_blank">stub file for os.py</a></span><span 
class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">. As of this writing there&#8217;s no </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z attrlink url"><a class="attrlink" data-target-href="https://github.com/python/typeshed" href="https://github.com/python/typeshed" rel="noreferrer nofollow" target="_blank">typeshed</a></span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> information for </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code attrlink url"><a class="attrlink" data-target-href="https://docs.python.org/3/library/os.html" href="https://docs.python.org/3/library/os.html" rel="noreferrer nofollow" target="_blank">os.scandir()</a></span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> , which I think is a shame. I think the following will do nicely. Note how we only define </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">DirEntry</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;and </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">scandir()</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> for Python versions &gt;= 3.5. 
(Mypy </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z attrlink url"><a class="attrlink" data-target-href="https://github.com/python/mypy/issues/698" href="https://github.com/python/mypy/issues/698" rel="noreferrer nofollow" target="_blank">doesn&#8217;t support this yet</a></span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">, but it will soon, and the example here still works &#8212; it just doesn&#8217;t realize </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">scandir()</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;is only available in Python 3.5.) This could be added to the end of </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">stdlib/3/os/__init__.pyi</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">:</span></div>\n<div>\n<br /></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">from typing import Generic, AnyStr, overload, Iterator</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><br /></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">if sys.version_info </span></code><code 
class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&gt;</span>= (3, 5):</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><br /></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; class DirEntry(</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z thread-42889384956">Generic</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">[AnyStr]):</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; &nbsp; &nbsp; name = ... &nbsp;# type: AnyStr</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; &nbsp; &nbsp; path = ... 
&nbsp;# type: AnyStr</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; &nbsp; &nbsp; def inode(self) -</span></code><code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&gt;</span> int: ...</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; &nbsp; &nbsp; def is_dir(self, *, follow_symlinks: bool = ...) -</span></code><code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&gt;</span> bool: ...</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; &nbsp; &nbsp; def is_file(self, *, follow_symlinks: bool = ...) 
-</span></code><code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&gt;</span> bool: ...</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; &nbsp; &nbsp; def is_symlink(self) -</span></code><code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&gt;</span> bool: ...</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; &nbsp; &nbsp; def stat(self, *, follow_symlinks: bool = ...) 
-</span></code><code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&gt;</span> stat_result: ...</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><br /></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z thread-53340393283">&nbsp; &nbsp; @overload</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z thread-53340393283">&nbsp; &nbsp; def scandir() -</span></code><code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z thread-53340393283"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&gt;</span> Iterator[DirEntry[str]]: ...</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z thread-53340393283">&nbsp; &nbsp; @overload</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span 
class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z thread-53340393283">&nbsp; &nbsp; def scandir(path: </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z thread-21653656371 thread-53340393283">AnyStr</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z thread-53340393283">) -</span></code><code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z thread-53340393283"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&gt;</span> Iterator[DirEntry[AnyStr]]: ...</span></code></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Deconstructing this a bit, we see a </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z attrlink url"><a class="attrlink" data-target-href="https://www.python.org/dev/peps/pep-0484/#generics" href="https://www.python.org/dev/peps/pep-0484/#generics" rel="noreferrer nofollow" target="_blank">generic class</a></span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> (that&#8217;s what the </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">Generic[AnyStr]</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;base 
class means) and an overloaded function. &nbsp;The </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">scandir()</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> definition uses </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">@overload</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> because it can also be called without arguments. </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z thread-53340393283">We could also write it as follows; it&#8217;ll work either way:</span></div>\n<div>\n<br /></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z thread-53340393283">&nbsp; &nbsp; @overload</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z thread-53340393283">&nbsp; &nbsp; def scandir(</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">path: </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z thread-72143937476">str</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> = ...) 
-</span></code><code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&gt;</span> Iterator[DirEntry[str]]: ...</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; @overload</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; def scandir(path: </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z thread-86865838424">bytes</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">) -</span></code><code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&gt;</span> Iterator[DirEntry[bytes</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z thread-53340393283">]]: ...</span></code></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Either way 
there really are three ways to call </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">scandir()</span><span class=""> </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">, all three returning an iterable of DirEntry objects:</span></div>\n<div>\n<br /></div>\n<ul class="listtype-bullet listindent1 list-bullet1">\n<li><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">scandir() -</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&gt;</span> Iterator[DirEntry[str]]</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp;</span></li>\n<li><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">scandir(str) -</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&gt;</span> Iterator[DirEntry[str]]</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp;</span></li>\n<li><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">scandir(bytes) -</span><span 
class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&gt;</span> Iterator[DirEntry[bytes]]</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp;</span></li>\n</ul>\n<div>\n<h2>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Adding os.fspath()</span></h2>\n</div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Next I&#8217;ll show how to add </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">os.fspath()</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> and how to add support for the </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">__fspath__()</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;protocol to </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">DirEntry</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> .</span></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z attrlink url"><a class="attrlink" 
data-target-href="https://www.python.org/dev/peps/pep-0519/" href="https://www.python.org/dev/peps/pep-0519/" rel="noreferrer nofollow" target="_blank">PEP 519</a></span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> defines a simple ABC (</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z attrlink url"><a class="attrlink" data-target-href="https://docs.python.org/3/library/abc.html" href="https://docs.python.org/3/library/abc.html" rel="noreferrer nofollow" target="_blank">abstract base class</a></span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">), </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">PathLike</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> , with one method, </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">__fspath__()</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> . 
We need to add this to the </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z attrlink url"><a class="attrlink" data-target-href="https://github.com/python/typeshed/blob/master/stdlib/3/os/__init__.pyi" href="https://github.com/python/typeshed/blob/master/stdlib/3/os/__init__.pyi" rel="noreferrer nofollow" target="_blank">stub for </a></span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code attrlink url"><a class="attrlink" data-target-href="https://github.com/python/typeshed/blob/master/stdlib/3/os/__init__.pyi" href="https://github.com/python/typeshed/blob/master/stdlib/3/os/__init__.pyi" rel="noreferrer nofollow" target="_blank">os.py</a></span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> , as follows:</span></div>\n<div>\n<br /></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">class PathLike(</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z thread-84113787329">Generic[AnyStr]</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">):</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; @abstractmethod</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span 
class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; def __fspath__(self) -</span></code><code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&gt;</span> AnyStr: ...</span></code></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">That&#8217;s really all there is to it (except for the </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">sys.version_info</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;check, which I&#8217;ll leave out here since it doesn&#8217;t really work yet). Next we define </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">os.fspath()</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> , which wraps this protocol. It&#8217;s slightly more complicated than just calling its argument&#8217;s </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">__fspath__()</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;method, because it also handles strings and bytes. 
So here it is:</span></div>\n<div>\n<br /></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">@overload</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">def fspath(path: PathLike[AnyStr]) -</span></code><code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&gt;</span> AnyStr: ...</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">@overload</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">def fspath(path: AnyStr) -</span></code><code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&gt;</span> AnyStr: ...</span></code></div>\n<div>\n<br /></div>\n<div>\n<span 
class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Easy enough! Next is update the definition of </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">DirEntry</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> . That&#8217;s easy too &#8212; in fact we only need to make it inherit from </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">PathLike[AnyStr]</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> , the rest is the same as the definition I gave above:</span></div>\n<div>\n<br /></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">class DirEntry(PathLike[AnyStr], Generic[AnyStr]):</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; # Everything else unchanged!</span></code></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">The only slightly complicated bit here is the extra base class </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">Generic[AnyStr]</span><span 
class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> . This seems redundant, and in fact PEP 484 says we can leave it off, but mypy doesn&#8217;t support that yet, and it&#8217;s quite harmless &#8212; this just rubs into mypy&#8217;s face that this is a generic class of one type variable (the by-now famous </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">AnyStr</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> ).</span></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Finally we need to make a similar change to the </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z attrlink url"><a class="attrlink" data-target-href="https://github.com/python/typeshed/blob/master/stdlib/3.4/pathlib.pyi" href="https://github.com/python/typeshed/blob/master/stdlib/3.4/pathlib.pyi" rel="noreferrer nofollow" target="_blank">stub for </a></span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code attrlink url"><a class="attrlink" data-target-href="https://github.com/python/typeshed/blob/master/stdlib/3.4/pathlib.pyi" href="https://github.com/python/typeshed/blob/master/stdlib/3.4/pathlib.pyi" rel="noreferrer nofollow" target="_blank">pathlib.py</a></span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> . 
Again, all we need to do is to make </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">PurePath</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;inherit from </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">PathLike[str]</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> , like so:</span></div>\n<div>\n<br /></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">from os import PathLike</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><br /></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">class PurePath(PathLike[str]):</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp;</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z thread-86168840759"> # Everything else unchanged!</span></code></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">However, here we don&#8217;t add 
</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">Generic</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> , because this is not a generic class! It inherits from </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">PathLike[str]</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> , which is quite un-generic, since it&#8217;s </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">PathLike</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z i"><i>specialized</i></span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> for just </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">str</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> .</span></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Note that we don&#8217;t actually have to define the </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z 
inline-code">__fspath__()</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;method in these stubs &#8212; we&#8217;re not supposed to call them directly, and stubs don&#8217;t provide implementations, only interfaces.</span></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Putting it all together, we see that it&#8217;s quite elegant:</span></div>\n<div>\n<br /></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-bash" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">for a in os.scandir(\'.\'):</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-bash" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; b = os.fspath(a)</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-bash" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; # Here, </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z thread-459852399">the</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> typechecker will know that the type of b is str!</span></code></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">The derivation that </span><span 
class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">b</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> has type </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">str</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;is not too complicated: first, </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">os.scandir(\'.\')</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;has a </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">str</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;argument, so it returns an iterator of </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">DirEntry</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;objects parameterized with </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">str</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> , which we write as </span><span 
class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">DirEntry[str]</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> . Passing this </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">DirEntry[str]</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;to </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">os.fspath()</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;then takes the first of that function&#8217;s two overloads (the one with </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">PathLike[AnyStr]</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> ), since it doesn&#8217;t match the second one ( </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">DirEntry</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;doesn&#8217;t inherit from </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">AnyStr</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> , because 
it&#8217;s neither a </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">str</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;nor </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">bytes</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> ). Further the </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">AnyStr</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> type variable in </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">PathLike[AnyStr]</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> is solved to stand for just </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">str</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> , because </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">DirEntry[str]</span><span class=""> </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp;inherits from </span><span 
class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">PathLike[str]</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> . This is the specialized version of what the code says: </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">DirEntry[AnyStr]</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;inherits from </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">PathLike[AnyStr]</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> .</span></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Okay, so maybe that last paragraph was intermediate or advanced. And maybe it could be expanded. Maybe I&#8217;ll write another blog about how type inference works, but there&#8217;s a lot on that topic, and other authors have probably already written better introductory material about generics (in other languages, though).</span></div>\n<div>\n<h2>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Making things accept PathLike</span></h2>\n</div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">There&#8217;s a bit of cleanup work that I&#8217;ve left out. 
PEP 519 says that many stdlib functions that currently take strings for pathnames will be modified to also accept </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">PathLike</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> . For example, here&#8217;s how the signatures for </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">os.scandir()</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;would change:</span></div>\n<div>\n<br /></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z thread-53340393283">@overload</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z thread-53340393283">def scandir() -</span></code><code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z thread-53340393283"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&gt;</span> Iterator[DirEntry[str]]: ...</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span 
class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z thread-53340393283">@overload</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z thread-53340393283">def scandir(path: </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z thread-21653656371">AnyStr</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z thread-53340393283">) -</span></code><code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z thread-53340393283"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&gt;</span> Iterator[DirEntry[AnyStr]]: ...</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">@overload</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">def scandir(path: PathLike[AnyStr]) -</span></code><code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"><span 
class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&gt;</span> Iterator[DirEntry[AnyStr]]: ...</span></code></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">The first two entries are unchanged; I&#8217;ve just added a third overload. (Note that the alternative way of defining </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">scandir()</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> would require more changes &#8212; an indication that this way is more natural.)</span></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">I also tried doing this with a union:</span></div>\n<div>\n<br /></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">@overload</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">def scandir() </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z thread-53340393283">-</span></code><code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z thread-53340393283"><span 
class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&gt;</span> Iterator[DirEntry[str]]: ...</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z thread-53340393283">@overload</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z thread-53340393283">def scandir(path: </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z thread-21653656371">Union[AnyStr</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z thread-53340393283">, PathLike[AnyStr]]) -</span></code><code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z thread-53340393283"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&gt;</span> Iterator[DirEntry[AnyStr]]: ...</span></code></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">But I couldn&#8217;t get this to work, so the extra overload is probably the best we can do. 
Quite a few functions will require a similar treatment, sometimes introducing overloading where none exists today (but that shouldn&#8217;t hurt anything).</span></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">A note about </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">pathlib</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> : since it only deals with strings, its methods (the ones that PEP 519 says should be changed anyway) should use </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">PathLike[str]</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;rather than </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">PathLike[AnyStr]</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> .</span></div>\n<div>\n<h2>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Acknowledgments</span></h2>\n</div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">(Thanks for comments on the draft to Stephen Turnbull, Koos Zevenhoven, Eth</span><span class="author-d-4z65zz66zl57z75zyiz66zfr2fz87zwz89znuiz90zz78zoz72zz87zhgh7z71zz88zz77zfz66zquz87zq3xz82zcz82zq5caz88z9">a</span><span 
class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">n Furman, and Brett Cannon.)</span></div>\n<div style=\'clear: both;\'></div>\n</div>\n<div class=\'post-footer\'>\n<div class=\'post-footer-line post-footer-line-1\'>\n<span class=\'post-author vcard\'>\nPosted by\n<span class=\'fn\' itemprop=\'author\' itemscope=\'itemscope\' itemtype=\'http://schema.org/Person\'>\n<meta content=\'https://www.blogger.com/profile/12821714508588242516\' itemprop=\'url\'/>\n<a class=\'g-profile\' href=\'https://www.blogger.com/profile/12821714508588242516\' rel=\'author\' title=\'author profile\'>\n<span itemprop=\'name\'>Guido van Rossum</span>\n</a>\n</span>\n</span>\n<span class=\'post-timestamp\'>\nat\n<meta content=\'http://neopythonic.blogspot.com/2016/05/adding-type-annotations-for-fspath.html\' itemprop=\'url\'/>\n<a class=\'timestamp-link\' href=\'http://neopythonic.blogspot.com/2016/05/adding-type-annotations-for-fspath.html\' rel=\'bookmark\' title=\'permanent link\'><abbr class=\'published\' itemprop=\'datePublished\' title=\'2016-05-18T07:06:00-07:00\'>7:06 AM</abbr></a>\n</span>\n<span class=\'reaction-buttons\'>\n</span>\n<span class=\'post-comment-link\'>\n<a class=\'comment-link\' href=\'https://www.blogger.com/comment.g?blogID=4195135246107166251&postID=8854185106045973213\' onclick=\'\'>\n3 comments:\n    </a>\n</span>\n<span class=\'post-backlinks post-comment-link\'>\n</span>\n<span class=\'post-icons\'>\n<span class=\'item-control blog-admin pid-1774424698\'>\n<a href=\'https://www.blogger.com/post-edit.g?blogID=4195135246107166251&postID=8854185106045973213&from=pencil\' title=\'Edit Post\'>\n<img alt=\'\' class=\'icon-action\' height=\'18\' src=\'https://resources.blogblog.com/img/icon18_edit_allbkg.gif\' width=\'18\'/>\n</a>\n</span>\n</span>\n<div class=\'post-share-buttons goog-inline-block\'>\n</div>\n</div>\n<div class=\'post-footer-line post-footer-line-2\'>\n<span 
class=\'post-labels\'>\n</span>\n</div>\n<div class=\'post-footer-line post-footer-line-3\'>\n<span class=\'post-location\'>\n</span>\n</div>\n</div>\n</div>\n</div>\n\n          </div></div>\n        \n\n          <div class="date-outer">\n        \n<h2 class=\'date-header\'><span>Tuesday, May 17, 2016</span></h2>\n\n          <div class="date-posts">\n        \n<div class=\'post-outer\'>\n<div class=\'post hentry uncustomized-post-template\' itemprop=\'blogPost\' itemscope=\'itemscope\' itemtype=\'http://schema.org/BlogPosting\'>\n<meta content=\'4195135246107166251\' itemprop=\'blogId\'/>\n<meta content=\'1468618515324597653\' itemprop=\'postId\'/>\n<a name=\'1468618515324597653\'></a>\n<h3 class=\'post-title entry-title\' itemprop=\'name\'>\n<a href=\'http://neopythonic.blogspot.com/2016/05/the-anystr-type-variable.html\'>The AnyStr type variable</a>\n</h3>\n<div class=\'post-header\'>\n<div class=\'post-header-line-1\'></div>\n</div>\n<div class=\'post-body entry-content\' id=\'post-body-1468618515324597653\' itemprop=\'description articleBody\'>\n<div>\n<h1 class="ace-copy-paste-skip-this-tag">\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">The AnyStr type variable </span></h1>\n</div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">I was drafting a blog post on how to add type annotations for the new </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">__fspath__()</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;protocol (</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z attrlink url"><a class="attrlink" 
data-target-href="https://www.python.org/dev/peps/pep-0519/" href="https://www.python.org/dev/peps/pep-0519/" rel="noreferrer nofollow" target="_blank">PEP 519</a></span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">) when I realized that I should write a separate post about </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">AnyStr</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> . So here it is.</span></div>\n<div>\n<h2>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">A simple function on strings</span></h2>\n</div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Let&#8217;s write a function that surrounds a string in parentheses. 
We&#8217;ll put it in a file named </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">demo.py</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> :</span></div>\n<div>\n<br /></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">def parenthesize(s):</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; return \'(\' + s + \')\'</span></code></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">It works, too:</span></div>\n<div>\n<br /></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&gt;&gt;&gt; from demo import parenthesize</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&gt;&gt;&gt; print(parenthesize(\'hola\'))</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">(hola)</span></code></div>\n<div>\n<br /></div>\n<div>\n<span 
class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Of course, if you pass it something that&#8217;s not a string it will fail:</span></div>\n<div>\n<br /></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-ruby" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&gt;&gt;&gt; parenthesize(42)</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-ruby" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Traceback (most recent call last):</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-ruby" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; File "demo.py", line 1, in <module></module></span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-ruby" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; File "demo.py", line 2, in parenthesize</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-ruby" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">TypeError: Can\'t convert \'int\' object to str implicitly</span></code></div>\n<div>\n<h2>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Adding type annotations</span></h2>\n</div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Using 
</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z attrlink url"><a class="attrlink" data-target-href="https://www.python.org/dev/peps/pep-0484/" href="https://www.python.org/dev/peps/pep-0484/" rel="noreferrer nofollow" target="_blank">PEP 484</a></span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> type annotations we can clarify our little function&#8217;s signature:</span></div>\n<div>\n<br /></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">def parenthesize(s: str) -&gt; str:</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; return \'(\' + s + \')\'</span></code></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Nothing to it, right? Even if you&#8217;ve never heard of PEP 484 before you can guess what this means. (Note that PEP 484 also says that the runtime behavior is unchanged. 
The calls I showed above will still have exactly the same effect, including the </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">TypeError</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> raised by </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">parenthesize(42)</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> .)</span></div>\n<div>\n<h2>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Polymorphic functions</span></h2>\n</div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Now suppose this is actually part of a networking app and we need to be able to parenthesize byte strings as well as text strings. 
Here&#8217;s how you&#8217;d implement that:</span></div>\n<div>\n<br /></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">def parenthesize(s):</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; if isinstance(s, str):</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; &nbsp; &nbsp; return \'(\' + s + \')\'</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; elif isinstance(s, bytes):</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; &nbsp; &nbsp; return b\'(\' + s + b\')\'</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; else:</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; 
&nbsp; &nbsp; raise TypeError(f"That\'s not a string, it\'s a {type(s)}") &nbsp;# See </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z attrlink url"><a class="attrlink" data-target-href="https://www.python.org/dev/peps/pep-0498/" href="https://www.python.org/dev/peps/pep-0498/" rel="noreferrer nofollow" target="_blank">PEP 498</a></span></code></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">With a fancy word we call that a polymorphic function. How do you write a signature for such a function? For the answer we have to dive a little deeper into PEP 484. It defines a nifty operator named </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">Union</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;that lets us state that a type can be either this or that (or something else). 
In our case, it&#8217;s either </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">str</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;or </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">bytes</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> , so we can write it like this:</span></div>\n<div>\n<br /></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">from typing import Union</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><br /></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">def parenthesize(s: Union[str, bytes]) -&gt; Union[str, bytes]:</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; if isinstance(s, str):</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; # Etc.</span></code></div>\n<div>\n<br /></div>\n<div>\n<span 
class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Now let&#8217;s write a little main program with a bug, to show off the type checker:</span></div>\n<div>\n<br /></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">from demo import parenthesize</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1" spellcheck="false"><br /></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">a = parenthesize(\'hello\')</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">b = parenthesize(b\'hola\')</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">c = a + b &nbsp;### bug here<-- bug="" span=""></--></span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">print(c)</span></code></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">When we try to run this, the two </span><span 
class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">parenthesize()</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;calls work fine (yay polymorphism!) but we get a </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">TypeError</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> on the last line:</span></div>\n<div>\n<br /></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-ruby" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">$ python3 main.py&nbsp;</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-ruby" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Traceback (most recent call last):</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-ruby" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; File "main.py", line 5, in <module></module></span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-ruby" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; c = a + b &nbsp;### bug here<-- bug="" span=""></--></span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-ruby" spellcheck="false"><span 
class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">TypeError: Can\'t convert \'bytes\' object to str implicitly</span></code></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">The reason should be pretty obvious: in Python 3 you can&#8217;t mix bytes and str objects. And when we type-check this program using </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z attrlink url"><a class="attrlink" data-target-href="http://mypy-lang.org/" href="http://mypy-lang.org/" rel="noreferrer nofollow" target="_blank">mypy</a></span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> we indeed get a type error:</span></div>\n<div>\n<br /></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-ruby" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">$ mypy main.py&nbsp;</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-ruby" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">main.py:5: error: Unsupported operand types for + (likely involving Union)</span></code></div>\n<div>\n<h2>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Debugging the bug</span></h2>\n</div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">So let&#8217;s try a program without a bug:</span></div>\n<div>\n<br /></div>\n<div>\n<code 
class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">from demo import parenthesize</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1" spellcheck="false"><br /></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-bash" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">a = parenthesize(\'hello\')</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-bash" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">b = parenthesize(\'hola\')</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-bash" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">c = a + b &nbsp;### bug here<-- bug="" no="" span=""></--></span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-bash" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">print(c)</span></code></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Run it and it works great:</span></div>\n<div>\n<br /></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-ruby" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">$ python3 main.py</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-ruby" 
spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">(hello)(hola)</span></code></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">So the type checker should be happy too, right?</span></div>\n<div>\n<br /></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-ruby" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">$ mypy main.py</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-ruby" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">main.py:5: error: Unsupported operand types for + (likely involving Union)</span></code></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Whoops! The same error. What happened? 
Of course, I set you up, so I can explain something about type checking.</span></div>\n<div>\n<h2>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">The trouble with </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z s"><s>tribbles</s></span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> unions</span></h2>\n</div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">The type checker takes the signature at face value, so that when checking the call, it infers the type </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">Union[str, bytes]</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;for every call to </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">parenthesize()</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> , regardless of what the arguments are. 
This is because, for most functions of even modest complexity, a type checker doesn&#8217;t understand enough about what&#8217;s going on in the function body, so it just has to believe the types in the signature (even though in this particular case it would probably be easy enough to do better).</span></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">In our test program the types of </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">a</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;and </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">b</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;are both inferred to be exactly what </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">parenthesize()</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;claims to return, i.e., both variables have the type </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">Union[str, bytes]</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> . 
The type checker then analyzes the expression </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">a + b</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> , and for this i</span><span class="author-d-z89zz72zz79zvhpz67zz83z9z66zz78zxz122z1xz74zu4z83z4myz73zkiz71zdz77zz71zz65zz79z4iz79ziosz75zz85zreqz69z">t</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> discovers a problem: if </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">a</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> is either str or bytes, and so is </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">b</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> , then the </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">+</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;operator may be invoked on any of these combinations of types: </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">str + str</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> , </span><span 
class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">str + bytes</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> , </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">bytes + str</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> , or </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">bytes + bytes</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> . But only the first and the last are valid! In Python 3, </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">str + bytes</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;or </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">bytes + str</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;are invalid operations.</span></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Aside: Even in Python 2, those two are suspect: since while </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">\'x\' + 
u\'y\'</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;indeed works (returning </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">u\'xy\'</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> ), other combinations will raise UnicodeDecodeError, e.g.:</span></div>\n<div>\n<br /></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-ruby" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&gt;&gt;&gt;\'Franç\' + u\'ois\'</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-ruby" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Traceback (most recent call last):</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-ruby" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; File "<stdin>", line 1, in <module></module></stdin></span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-ruby" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">UnicodeDecodeError: \'ascii\' codec can\'t decode byte 0xc3 in position 4:</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-ruby" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">ordinal not in 
range(128)</span></code></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Anyway, the type checker doesn&#8217;t like this business, and it rejects operations on Unions where some combinations are invalid. What can we do instead?</span></div>\n<div>\n<h2>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Function overloading</span></h2>\n</div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">One option would be function overloading. PEP 484 defines a magical decorator, </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">@overload</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> , which lets us get around this problem. 
We could write something like this:</span></div>\n<div>\n<br /></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">from typing import overload</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><br /></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">@overload</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">def parenthesize(s: str) -&gt; str: ...</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">@overload</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">def parenthesize(s: bytes) -&gt; bytes: ...</span></code></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">This tells the type checker that if the argument is a </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">str</span><span 
class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> , the return value is also a </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">str</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> , and similarly for </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">bytes</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> . Unfortunately </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">@overload</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;is only allowed in </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z attrlink url"><a class="attrlink" data-target-href="http://mypy.readthedocs.io/en/latest/basics.html#library-stubs-and-the-typeshed-repo" href="http://mypy.readthedocs.io/en/latest/basics.html#library-stubs-and-the-typeshed-repo" rel="noreferrer nofollow" target="_blank">stub files</a></span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">, which are a kind of interface definition files that show a type checker the signatures of a module&#8217;s contents without giving the implementation.</span></div>\n<div>\n<h2>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Type 
variables</span></h2>\n</div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Fortunately there&#8217;s an even better way, using type variables. This is how it goes:</span></div>\n<div>\n<br /></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">from typing import TypeVar</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><br /></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">S = TypeVar(\'S\')</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><br /></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">def parenthesize(s: S) -&gt; S:</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; if isinstance(s, str):</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; &nbsp; &nbsp; return \'(\' + s + \')\'</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span 
class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; elif isinstance(s, bytes):</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; &nbsp; &nbsp; return b\'(\' + s + b\')\'</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; else:</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; &nbsp; &nbsp; raise TypeError("That\'s not a string, dude! It\'s a %s" % type(s))</span></code></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Well&#8230; Almost. 
Our main.py program (unchanged from above) now gets a clean bill of health, but when we type-check this version we get errors on both </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">return</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;lines:</span></div>\n<div>\n<br /></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-bash" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">demo.py: note: In function "parenthesize":</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-bash" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">demo.py:7: error: Incompatible return value type: expected S`-1, got builtins.str</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-bash" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">demo.py:9: error: Incompatible return value type: expected S`-1, got builtins.bytes</span></code></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">This is a </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z attrlink url"><a class="attrlink" data-target-href="https://github.com/python/mypy/issues/1539" href="https://github.com/python/mypy/issues/1539" rel="noreferrer nofollow" target="_blank">bit hard to fathom</a></span><span 
class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">, but the fix is what I was leading up to anyway, so I&#8217;ll reveal it now:</span></div>\n<div>\n<br /></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">from typing import TypeVar</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><br /></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">S = TypeVar(\'S\', str, bytes)</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><br /></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">def parenthesize(s: S) -&gt; S:</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; if isinstance(s, str):</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; &nbsp; &nbsp; return \'(\' + s + \')\'</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span 
class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; elif isinstance(s, bytes):</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; &nbsp; &nbsp; return b\'(\' + s + b\')\'</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; else:</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; &nbsp; &nbsp; raise TypeError("That\'s not a string, dude! 
It\'s a %s" % type(s))</span></code></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">The only changed line is this one:</span></div>\n<div>\n<br /></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-bash" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">S = TypeVar(\'S\', str, bytes)</span></code></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">This notation is called a </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z attrlink url"><a class="attrlink" data-target-href="http://mypy.readthedocs.io/en/latest/generics.html#type-variables-with-value-restriction" href="http://mypy.readthedocs.io/en/latest/generics.html#type-variables-with-value-restriction" rel="noreferrer nofollow" target="_blank">type variable with value restriction</a></span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> . Yes, it&#8217;s mouthful; we sometimes also call it a </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z i"><i>constrained type variable</i></span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">. 
</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">S</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> is a type variable restricted to a set of types. It also has the advantage of telling the type checker that types other than </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">str</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;or </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">bytes</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;are not acceptable. 
Without that, a call like this would have been considered valid:</span></div>\n<div>\n<br /></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-ini" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">x = parenthesize(42)</span></code></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">because the original type variable (without the restrictions) doesn\'t tell mypy that this is a bad idea.</span></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">In fact, this particular use case (a type variable constrained to str or bytes) is so commonly needed that it\'s predefined in the </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">typing</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> module, and all we have to do is import it:</span></div>\n<div>\n<br /></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">from typing import AnyStr</span></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1" spellcheck="false"><br /></code></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">def parenthesize(s: AnyStr) -&gt; AnyStr:</span></code></div>\n<div>\n<code class="listtype-code 
listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">&nbsp; &nbsp; # Etc. -- trust me, it works!</span></code></div>\n<div>\n<h2>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">Real-world use of AnyStr</span></h2>\n</div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">In fact, this is how many polymorphic functions in the </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">os</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;and </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">os.path</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;modules are defined. 
For example, in the stub for </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">os.py</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;we find definitions like </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z attrlink url"><a class="attrlink" data-target-href="https://github.com/python/typeshed/blob/master/stdlib/3/os/__init__.pyi#L236" href="https://github.com/python/typeshed/blob/master/stdlib/3/os/__init__.pyi#L236" rel="noreferrer nofollow" target="_blank">the following</a></span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">:</span></div>\n<div>\n<br /></div>\n<div>\n<code class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">def link(src: AnyStr, link_name: AnyStr) -&gt; None: ...</span></code></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">and also </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z attrlink url"><a class="attrlink" data-target-href="https://github.com/python/typeshed/blob/master/stdlib/3/os/path.pyi#L57" href="https://github.com/python/typeshed/blob/master/stdlib/3/os/path.pyi#L57" rel="noreferrer nofollow" target="_blank">this</a></span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">:</span></div>\n<div>\n<br /></div>\n<div>\n<code 
class="listtype-code listindent1 list-code1 lang-python" spellcheck="false"><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">def split(path: AnyStr) -&gt; Tuple[AnyStr, AnyStr]: ...</span></code></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">These show us a bit more of the power of type variables: the signature for </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">link()</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;indicates that either both arguments must be </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">str</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;or both must be </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">bytes</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> ; </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">split()</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;demonstrates that the type variable may also occur in more complex constructs: splitting a </span><span 
class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">str</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> returns a tuple of two </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">str</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> objects, while splitting </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">bytes</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> returns a tuple of two </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">bytes</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> &nbsp;objects.</span></div>\n<div>\n<br /></div>\n<div>\n<span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z">That&#8217;s all I wanted to share about </span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z inline-code">AnyStr</span><span class="author-d-16z86ztz122z98z81zz82zz85zunv3z82zpqfnlaklz69zehdlvnz73zz65zz81zz79zz73zpz76z22z66ztsz89zz122zz73zz122zfz83z"> . 
Thanks for comments on the draft to Stephen Turnbull, Koos Zevenhoven, Ethan Furman, and Brett Cannon.</span></div>\n<div>\n<br /></div>\n<div style=\'clear: both;\'></div>\n</div>\n<div class=\'post-footer\'>\n<div class=\'post-footer-line post-footer-line-1\'>\n<span class=\'post-author vcard\'>\nPosted by\n<span class=\'fn\' itemprop=\'author\' itemscope=\'itemscope\' itemtype=\'http://schema.org/Person\'>\n<meta content=\'https://www.blogger.com/profile/12821714508588242516\' itemprop=\'url\'/>\n<a class=\'g-profile\' href=\'https://www.blogger.com/profile/12821714508588242516\' rel=\'author\' title=\'author profile\'>\n<span itemprop=\'name\'>Guido van Rossum</span>\n</a>\n</span>\n</span>\n<span class=\'post-timestamp\'>\nat\n<meta content=\'http://neopythonic.blogspot.com/2016/05/the-anystr-type-variable.html\' itemprop=\'url\'/>\n<a class=\'timestamp-link\' href=\'http://neopythonic.blogspot.com/2016/05/the-anystr-type-variable.html\' rel=\'bookmark\' title=\'permanent link\'><abbr class=\'published\' itemprop=\'datePublished\' title=\'2016-05-17T09:53:00-07:00\'>9:53 AM</abbr></a>\n</span>\n<span class=\'reaction-buttons\'>\n</span>\n<span class=\'post-comment-link\'>\n<a class=\'comment-link\' href=\'https://www.blogger.com/comment.g?blogID=4195135246107166251&postID=1468618515324597653\' onclick=\'\'>\n5 comments:\n    </a>\n</span>\n<span class=\'post-backlinks post-comment-link\'>\n</span>\n<span class=\'post-icons\'>\n<span class=\'item-control blog-admin pid-1774424698\'>\n<a href=\'https://www.blogger.com/post-edit.g?blogID=4195135246107166251&postID=1468618515324597653&from=pencil\' title=\'Edit Post\'>\n<img alt=\'\' class=\'icon-action\' height=\'18\' src=\'https://resources.blogblog.com/img/icon18_edit_allbkg.gif\' width=\'18\'/>\n</a>\n</span>\n</span>\n<div class=\'post-share-buttons goog-inline-block\'>\n</div>\n</div>\n<div class=\'post-footer-line post-footer-line-2\'>\n<span class=\'post-labels\'>\n</span>\n</div>\n<div 
class=\'post-footer-line post-footer-line-3\'>\n<span class=\'post-location\'>\n</span>\n</div>\n</div>\n</div>\n</div>\n\n        </div></div>\n      \n</div>\n<div class=\'blog-pager\' id=\'blog-pager\'>\n<span id=\'blog-pager-older-link\'>\n<a class=\'blog-pager-older-link\' href=\'http://neopythonic.blogspot.com/search?updated-max=2016-05-17T09:53:00-07:00&amp;max-results=7\' id=\'Blog1_blog-pager-older-link\' title=\'Older Posts\'>Older Posts</a>\n</span>\n<a class=\'home-link\' href=\'http://neopythonic.blogspot.com/\'>Home</a>\n</div>\n<div class=\'clear\'></div>\n<div class=\'blog-feeds\'>\n<div class=\'feed-links\'>\nSubscribe to:\n<a class=\'feed-link\' href=\'http://neopythonic.blogspot.com/feeds/posts/default\' target=\'_blank\' type=\'application/atom+xml\'>Posts (Atom)</a>\n</div>\n</div>\n</div></div>\n</div>\n<div id=\'sidebar-wrapper\'>\n<div class=\'sidebar section\' id=\'header\'><div class=\'widget Header\' data-version=\'1\' id=\'Header1\'>\n<div id=\'header-inner\'>\n<div class=\'titlewrapper\'>\n<h1 class=\'title\'>\nNeopythonic\n</h1>\n</div>\n<div class=\'descriptionwrapper\'>\n<p class=\'description\'><span>Ramblings through technology, politics, culture and philosophy by the creator of the Python programming language.</span></p>\n</div>\n</div>\n</div></div>\n<div class=\'sidebar section\' id=\'sidebar\'><div class=\'widget Followers\' data-version=\'1\' id=\'Followers1\'>\n<h2 class=\'title\'>Followers</h2>\n<div class=\'widget-content\'>\n<div id=\'Followers1-wrapper\'>\n<div style=\'margin-right:2px;\'>\n<div><script type="text/javascript" src="https://apis.google.com/js/plusone.js"></script>\n<div id="followers-iframe-container"></div>\n<script type="text/javascript">\n    window.followersIframe = null;\n    function followersIframeOpen(url) {\n      gapi.load("gapi.iframes", function() {\n        if (gapi.iframes && gapi.iframes.getContext) {\n          window.followersIframe = gapi.iframes.getContext().openChild({\n            url: 
url,\n            where: document.getElementById("followers-iframe-container"),\n            messageHandlersFilter: gapi.iframes.CROSS_ORIGIN_IFRAMES_FILTER,\n            messageHandlers: {\n              \'_ready\': function(obj) {\n                window.followersIframe.getIframeEl().height = obj.height;\n              },\n              \'reset\': function() {\n                window.followersIframe.close();\n                followersIframeOpen("https://www.blogger.com/followers.g?blogID\\x3d4195135246107166251\\x26colors\\x3dCgt0cmFuc3BhcmVudBILdHJhbnNwYXJlbnQaByMzMzMzMzMiByM0NDg4ODgqByNGRkZGRkYyByMwMDAwMDA6ByMzMzMzMzNCByM0NDg4ODhKByMwMDAwMDBSByM0NDg4ODhaC3RyYW5zcGFyZW50\\x26pageSize\\x3d21\\x26origin\\x3dhttp://neopythonic.blogspot.com/");\n              },\n              \'open\': function(url) {\n                window.followersIframe.close();\n                followersIframeOpen(url);\n              },\n              \'blogger-ping\': function() {\n              }\n            }\n          });\n        }\n      });\n    }\n    followersIframeOpen("https://www.blogger.com/followers.g?blogID\\x3d4195135246107166251\\x26colors\\x3dCgt0cmFuc3BhcmVudBILdHJhbnNwYXJlbnQaByMzMzMzMzMiByM0NDg4ODgqByNGRkZGRkYyByMwMDAwMDA6ByMzMzMzMzNCByM0NDg4ODhKByMwMDAwMDBSByM0NDg4ODhaC3RyYW5zcGFyZW50\\x26pageSize\\x3d21\\x26origin\\x3dhttp://neopythonic.blogspot.com/");\n  </script></div>\n</div>\n</div>\n<div class=\'clear\'></div>\n<span class=\'widget-item-control\'>\n<span class=\'item-control blog-admin\'>\n<a class=\'quickedit\' href=\'//www.blogger.com/rearrange?blogID=4195135246107166251&widgetType=Followers&widgetId=Followers1&action=editWidget&sectionId=sidebar\' onclick=\'return _WidgetManager._PopupConfig(document.getElementById("Followers1"));\' rel=\'nofollow\' target=\'configFollowers1\' title=\'Edit\'>\n<img alt=\'\' height=\'18\' src=\'https://resources.blogblog.com/img/icon18_wrench_allbkg.png\' width=\'18\'/>\n</a>\n</span>\n</span>\n<div 
class=\'clear\'></div>\n</div>\n</div><div class=\'widget BlogArchive\' data-version=\'1\' id=\'BlogArchive1\'>\n<h2>Blog Archive</h2>\n<div class=\'widget-content\'>\n<div id=\'ArchiveList\'>\n<div id=\'BlogArchive1_ArchiveList\'>\n<ul class=\'hierarchy\'>\n<li class=\'archivedate expanded\'>\n<a class=\'toggle\' href=\'javascript:void(0)\'>\n<span class=\'zippy toggle-open\'>\n\n        &#9660;&#160;\n      \n</span>\n</a>\n<a class=\'post-count-link\' href=\'http://neopythonic.blogspot.com/2019/\'>\n2019\n</a>\n<span class=\'post-count\' dir=\'ltr\'>(1)</span>\n<ul class=\'hierarchy\'>\n<li class=\'archivedate expanded\'>\n<a class=\'toggle\' href=\'javascript:void(0)\'>\n<span class=\'zippy toggle-open\'>\n\n        &#9660;&#160;\n      \n</span>\n</a>\n<a class=\'post-count-link\' href=\'http://neopythonic.blogspot.com/2019/03/\'>\nMarch\n</a>\n<span class=\'post-count\' dir=\'ltr\'>(1)</span>\n<ul class=\'posts\'>\n<li><a href=\'http://neopythonic.blogspot.com/2019/03/why-operators-are-useful.html\'>Why operators are useful</a></li>\n</ul>\n</li>\n</ul>\n</li>\n</ul>\n<ul class=\'hierarchy\'>\n<li class=\'archivedate collapsed\'>\n<a class=\'toggle\' href=\'javascript:void(0)\'>\n<span class=\'zippy\'>\n\n        &#9658;&#160;\n      \n</span>\n</a>\n<a class=\'post-count-link\' href=\'http://neopythonic.blogspot.com/2018/\'>\n2018\n</a>\n<span class=\'post-count\' dir=\'ltr\'>(1)</span>\n<ul class=\'hierarchy\'>\n<li class=\'archivedate collapsed\'>\n<a class=\'toggle\' href=\'javascript:void(0)\'>\n<span class=\'zippy\'>\n\n        &#9658;&#160;\n      \n</span>\n</a>\n<a class=\'post-count-link\' href=\'http://neopythonic.blogspot.com/2018/11/\'>\nNovember\n</a>\n<span class=\'post-count\' dir=\'ltr\'>(1)</span>\n</li>\n</ul>\n</li>\n</ul>\n<ul class=\'hierarchy\'>\n<li class=\'archivedate collapsed\'>\n<a class=\'toggle\' href=\'javascript:void(0)\'>\n<span class=\'zippy\'>\n\n        &#9658;&#160;\n      \n</span>\n</a>\n<a class=\'post-count-link\' 
href=\'http://neopythonic.blogspot.com/2016/\'>\n2016\n</a>\n<span class=\'post-count\' dir=\'ltr\'>(5)</span>\n<ul class=\'hierarchy\'>\n<li class=\'archivedate collapsed\'>\n<a class=\'toggle\' href=\'javascript:void(0)\'>\n<span class=\'zippy\'>\n\n        &#9658;&#160;\n      \n</span>\n</a>\n<a class=\'post-count-link\' href=\'http://neopythonic.blogspot.com/2016/07/\'>\nJuly\n</a>\n<span class=\'post-count\' dir=\'ltr\'>(1)</span>\n</li>\n</ul>\n<ul class=\'hierarchy\'>\n<li class=\'archivedate collapsed\'>\n<a class=\'toggle\' href=\'javascript:void(0)\'>\n<span class=\'zippy\'>\n\n        &#9658;&#160;\n      \n</span>\n</a>\n<a class=\'post-count-link\' href=\'http://neopythonic.blogspot.com/2016/05/\'>\nMay\n</a>\n<span class=\'post-count\' dir=\'ltr\'>(3)</span>\n</li>\n</ul>\n<ul class=\'hierarchy\'>\n<li class=\'archivedate collapsed\'>\n<a class=\'toggle\' href=\'javascript:void(0)\'>\n<span class=\'zippy\'>\n\n        &#9658;&#160;\n      \n</span>\n</a>\n<a class=\'post-count-link\' href=\'http://neopythonic.blogspot.com/2016/04/\'>\nApril\n</a>\n<span class=\'post-count\' dir=\'ltr\'>(1)</span>\n</li>\n</ul>\n</li>\n</ul>\n<ul class=\'hierarchy\'>\n<li class=\'archivedate collapsed\'>\n<a class=\'toggle\' href=\'javascript:void(0)\'>\n<span class=\'zippy\'>\n\n        &#9658;&#160;\n      \n</span>\n</a>\n<a class=\'post-count-link\' href=\'http://neopythonic.blogspot.com/2013/\'>\n2013\n</a>\n<span class=\'post-count\' dir=\'ltr\'>(2)</span>\n<ul class=\'hierarchy\'>\n<li class=\'archivedate collapsed\'>\n<a class=\'toggle\' href=\'javascript:void(0)\'>\n<span class=\'zippy\'>\n\n        &#9658;&#160;\n      \n</span>\n</a>\n<a class=\'post-count-link\' href=\'http://neopythonic.blogspot.com/2013/10/\'>\nOctober\n</a>\n<span class=\'post-count\' dir=\'ltr\'>(2)</span>\n</li>\n</ul>\n</li>\n</ul>\n<ul class=\'hierarchy\'>\n<li class=\'archivedate collapsed\'>\n<a class=\'toggle\' href=\'javascript:void(0)\'>\n<span class=\'zippy\'>\n\n        
&#9658;&#160;\n      \n</span>\n</a>\n<a class=\'post-count-link\' href=\'http://neopythonic.blogspot.com/2011/\'>\n2011\n</a>\n<span class=\'post-count\' dir=\'ltr\'>(5)</span>\n<ul class=\'hierarchy\'>\n<li class=\'archivedate collapsed\'>\n<a class=\'toggle\' href=\'javascript:void(0)\'>\n<span class=\'zippy\'>\n\n        &#9658;&#160;\n      \n</span>\n</a>\n<a class=\'post-count-link\' href=\'http://neopythonic.blogspot.com/2011/08/\'>\nAugust\n</a>\n<span class=\'post-count\' dir=\'ltr\'>(1)</span>\n</li>\n</ul>\n<ul class=\'hierarchy\'>\n<li class=\'archivedate collapsed\'>\n<a class=\'toggle\' href=\'javascript:void(0)\'>\n<span class=\'zippy\'>\n\n        &#9658;&#160;\n      \n</span>\n</a>\n<a class=\'post-count-link\' href=\'http://neopythonic.blogspot.com/2011/07/\'>\nJuly\n</a>\n<span class=\'post-count\' dir=\'ltr\'>(1)</span>\n</li>\n</ul>\n<ul class=\'hierarchy\'>\n<li class=\'archivedate collapsed\'>\n<a class=\'toggle\' href=\'javascript:void(0)\'>\n<span class=\'zippy\'>\n\n        &#9658;&#160;\n      \n</span>\n</a>\n<a class=\'post-count-link\' href=\'http://neopythonic.blogspot.com/2011/06/\'>\nJune\n</a>\n<span class=\'post-count\' dir=\'ltr\'>(1)</span>\n</li>\n</ul>\n<ul class=\'hierarchy\'>\n<li class=\'archivedate collapsed\'>\n<a class=\'toggle\' href=\'javascript:void(0)\'>\n<span class=\'zippy\'>\n\n        &#9658;&#160;\n      \n</span>\n</a>\n<a class=\'post-count-link\' href=\'http://neopythonic.blogspot.com/2011/01/\'>\nJanuary\n</a>\n<span class=\'post-count\' dir=\'ltr\'>(2)</span>\n</li>\n</ul>\n</li>\n</ul>\n<ul class=\'hierarchy\'>\n<li class=\'archivedate collapsed\'>\n<a class=\'toggle\' href=\'javascript:void(0)\'>\n<span class=\'zippy\'>\n\n        &#9658;&#160;\n      \n</span>\n</a>\n<a class=\'post-count-link\' href=\'http://neopythonic.blogspot.com/2009/\'>\n2009\n</a>\n<span class=\'post-count\' dir=\'ltr\'>(16)</span>\n<ul class=\'hierarchy\'>\n<li class=\'archivedate collapsed\'>\n<a class=\'toggle\' 
href=\'javascript:void(0)\'>\n<span class=\'zippy\'>\n\n        &#9658;&#160;\n      \n</span>\n</a>\n<a class=\'post-count-link\' href=\'http://neopythonic.blogspot.com/2009/12/\'>\nDecember\n</a>\n<span class=\'post-count\' dir=\'ltr\'>(1)</span>\n</li>\n</ul>\n<ul class=\'hierarchy\'>\n<li class=\'archivedate collapsed\'>\n<a class=\'toggle\' href=\'javascript:void(0)\'>\n<span class=\'zippy\'>\n\n        &#9658;&#160;\n      \n</span>\n</a>\n<a class=\'post-count-link\' href=\'http://neopythonic.blogspot.com/2009/11/\'>\nNovember\n</a>\n<span class=\'post-count\' dir=\'ltr\'>(1)</span>\n</li>\n</ul>\n<ul class=\'hierarchy\'>\n<li class=\'archivedate collapsed\'>\n<a class=\'toggle\' href=\'javascript:void(0)\'>\n<span class=\'zippy\'>\n\n        &#9658;&#160;\n      \n</span>\n</a>\n<a class=\'post-count-link\' href=\'http://neopythonic.blogspot.com/2009/09/\'>\nSeptember\n</a>\n<span class=\'post-count\' dir=\'ltr\'>(1)</span>\n</li>\n</ul>\n<ul class=\'hierarchy\'>\n<li class=\'archivedate collapsed\'>\n<a class=\'toggle\' href=\'javascript:void(0)\'>\n<span class=\'zippy\'>\n\n        &#9658;&#160;\n      \n</span>\n</a>\n<a class=\'post-count-link\' href=\'http://neopythonic.blogspot.com/2009/07/\'>\nJuly\n</a>\n<span class=\'post-count\' dir=\'ltr\'>(2)</span>\n</li>\n</ul>\n<ul class=\'hierarchy\'>\n<li class=\'archivedate collapsed\'>\n<a class=\'toggle\' href=\'javascript:void(0)\'>\n<span class=\'zippy\'>\n\n        &#9658;&#160;\n      \n</span>\n</a>\n<a class=\'post-count-link\' href=\'http://neopythonic.blogspot.com/2009/06/\'>\nJune\n</a>\n<span class=\'post-count\' dir=\'ltr\'>(3)</span>\n</li>\n</ul>\n<ul class=\'hierarchy\'>\n<li class=\'archivedate collapsed\'>\n<a class=\'toggle\' href=\'javascript:void(0)\'>\n<span class=\'zippy\'>\n\n        &#9658;&#160;\n      \n</span>\n</a>\n<a class=\'post-count-link\' href=\'http://neopythonic.blogspot.com/2009/05/\'>\nMay\n</a>\n<span class=\'post-count\' dir=\'ltr\'>(1)</span>\n</li>\n</ul>\n<ul 
class=\'hierarchy\'>\n<li class=\'archivedate collapsed\'>\n<a class=\'toggle\' href=\'javascript:void(0)\'>\n<span class=\'zippy\'>\n\n        &#9658;&#160;\n      \n</span>\n</a>\n<a class=\'post-count-link\' href=\'http://neopythonic.blogspot.com/2009/04/\'>\nApril\n</a>\n<span class=\'post-count\' dir=\'ltr\'>(4)</span>\n</li>\n</ul>\n<ul class=\'hierarchy\'>\n<li class=\'archivedate collapsed\'>\n<a class=\'toggle\' href=\'javascript:void(0)\'>\n<span class=\'zippy\'>\n\n        &#9658;&#160;\n      \n</span>\n</a>\n<a class=\'post-count-link\' href=\'http://neopythonic.blogspot.com/2009/03/\'>\nMarch\n</a>\n<span class=\'post-count\' dir=\'ltr\'>(1)</span>\n</li>\n</ul>\n<ul class=\'hierarchy\'>\n<li class=\'archivedate collapsed\'>\n<a class=\'toggle\' href=\'javascript:void(0)\'>\n<span class=\'zippy\'>\n\n        &#9658;&#160;\n      \n</span>\n</a>\n<a class=\'post-count-link\' href=\'http://neopythonic.blogspot.com/2009/01/\'>\nJanuary\n</a>\n<span class=\'post-count\' dir=\'ltr\'>(2)</span>\n</li>\n</ul>\n</li>\n</ul>\n<ul class=\'hierarchy\'>\n<li class=\'archivedate collapsed\'>\n<a class=\'toggle\' href=\'javascript:void(0)\'>\n<span class=\'zippy\'>\n\n        &#9658;&#160;\n      \n</span>\n</a>\n<a class=\'post-count-link\' href=\'http://neopythonic.blogspot.com/2008/\'>\n2008\n</a>\n<span class=\'post-count\' dir=\'ltr\'>(14)</span>\n<ul class=\'hierarchy\'>\n<li class=\'archivedate collapsed\'>\n<a class=\'toggle\' href=\'javascript:void(0)\'>\n<span class=\'zippy\'>\n\n        &#9658;&#160;\n      \n</span>\n</a>\n<a class=\'post-count-link\' href=\'http://neopythonic.blogspot.com/2008/12/\'>\nDecember\n</a>\n<span class=\'post-count\' dir=\'ltr\'>(2)</span>\n</li>\n</ul>\n<ul class=\'hierarchy\'>\n<li class=\'archivedate collapsed\'>\n<a class=\'toggle\' href=\'javascript:void(0)\'>\n<span class=\'zippy\'>\n\n        &#9658;&#160;\n      \n</span>\n</a>\n<a class=\'post-count-link\' 
href=\'http://neopythonic.blogspot.com/2008/11/\'>\nNovember\n</a>\n<span class=\'post-count\' dir=\'ltr\'>(5)</span>\n</li>\n</ul>\n<ul class=\'hierarchy\'>\n<li class=\'archivedate collapsed\'>\n<a class=\'toggle\' href=\'javascript:void(0)\'>\n<span class=\'zippy\'>\n\n        &#9658;&#160;\n      \n</span>\n</a>\n<a class=\'post-count-link\' href=\'http://neopythonic.blogspot.com/2008/10/\'>\nOctober\n</a>\n<span class=\'post-count\' dir=\'ltr\'>(7)</span>\n</li>\n</ul>\n</li>\n</ul>\n</div>\n</div>\n<div class=\'clear\'></div>\n<span class=\'widget-item-control\'>\n<span class=\'item-control blog-admin\'>\n<a class=\'quickedit\' href=\'//www.blogger.com/rearrange?blogID=4195135246107166251&widgetType=BlogArchive&widgetId=BlogArchive1&action=editWidget&sectionId=sidebar\' onclick=\'return _WidgetManager._PopupConfig(document.getElementById("BlogArchive1"));\' rel=\'nofollow\' target=\'configBlogArchive1\' title=\'Edit\'>\n<img alt=\'\' height=\'18\' src=\'https://resources.blogblog.com/img/icon18_wrench_allbkg.png\' width=\'18\'/>\n</a>\n</span>\n</span>\n<div class=\'clear\'></div>\n</div>\n</div><div class=\'widget Profile\' data-version=\'1\' id=\'Profile1\'>\n<h2>About Me</h2>\n<div class=\'widget-content\'>\n<a href=\'https://www.blogger.com/profile/12821714508588242516\'><img alt=\'My photo\' class=\'profile-img\' height=\'80\' src=\'//2.bp.blogspot.com/_FG9t5W1SJ14/SO0aRdEpTAI/AAAAAAAACvw/sQy2btDo2DI/S220-s80/IMG_2192.jpg\' width=\'53\'/></a>\n<dl class=\'profile-datablock\'>\n<dt class=\'profile-data\'>\n<a class=\'profile-name-link g-profile\' href=\'https://www.blogger.com/profile/12821714508588242516\' rel=\'author\' style=\'background-image: url(//www.blogger.com/img/logo-16.png);\'>\nGuido van Rossum\n</a>\n</dt>\n<dd class=\'profile-textblock\'>Python\'s BDFL</dd>\n</dl>\n<a class=\'profile-link\' href=\'https://www.blogger.com/profile/12821714508588242516\' rel=\'author\'>View my complete profile</a>\n<div class=\'clear\'></div>\n<span 
class=\'widget-item-control\'>\n<span class=\'item-control blog-admin\'>\n<a class=\'quickedit\' href=\'//www.blogger.com/rearrange?blogID=4195135246107166251&widgetType=Profile&widgetId=Profile1&action=editWidget&sectionId=sidebar\' onclick=\'return _WidgetManager._PopupConfig(document.getElementById("Profile1"));\' rel=\'nofollow\' target=\'configProfile1\' title=\'Edit\'>\n<img alt=\'\' height=\'18\' src=\'https://resources.blogblog.com/img/icon18_wrench_allbkg.png\' width=\'18\'/>\n</a>\n</span>\n</span>\n<div class=\'clear\'></div>\n</div>\n</div></div>\n</div>\n<!-- spacer for skins that want sidebar and main to be the same height-->\n<div class=\'clear\'>&#160;</div>\n</div>\n<!-- end content-wrapper -->\n<div id=\'footer-wrapper\'>\n<div class=\'footer no-items section\' id=\'footer\'></div>\n</div>\n</div></div>\n<!-- end outer-wrapper -->\n<script src=\'https://apis.google.com/js/plusone.js\' type=\'text/javascript\'></script>\n\n<script type="text/javascript" src="https://www.blogger.com/static/v1/widgets/3236635003-widgets.js"></script>\n<script type=\'text/javascript\'>\nwindow[\'__wavt\'] = \'AOuZoY5yNKA5W7NfDL5jC8sgYKdhWMg9Vw:1582878948016\';_WidgetManager._Init(\'//www.blogger.com/rearrange?blogID\\x3d4195135246107166251\',\'//neopythonic.blogspot.com/\',\'4195135246107166251\');\n_WidgetManager._SetDataContext([{\'name\': \'blog\', \'data\': {\'blogId\': \'4195135246107166251\', \'title\': \'Neopythonic\', \'url\': \'http://neopythonic.blogspot.com/\', \'canonicalUrl\': \'http://neopythonic.blogspot.com/\', \'homepageUrl\': \'http://neopythonic.blogspot.com/\', \'searchUrl\': \'http://neopythonic.blogspot.com/search\', \'canonicalHomepageUrl\': \'http://neopythonic.blogspot.com/\', \'blogspotFaviconUrl\': \'http://neopythonic.blogspot.com/favicon.ico\', \'bloggerUrl\': \'https://www.blogger.com\', \'hasCustomDomain\': false, \'httpsEnabled\': true, \'enabledCommentProfileImages\': true, \'gPlusViewType\': \'FILTERED_POSTMOD\', \'adultContent\': 
false, \'analyticsAccountNumber\': \'\', \'encoding\': \'UTF-8\', \'locale\': \'en\', \'localeUnderscoreDelimited\': \'en\', \'languageDirection\': \'ltr\', \'isPrivate\': false, \'isMobile\': false, \'isMobileRequest\': false, \'mobileClass\': \'\', \'isPrivateBlog\': false, \'feedLinks\': \'\\x3clink rel\\x3d\\x22alternate\\x22 type\\x3d\\x22application/atom+xml\\x22 title\\x3d\\x22Neopythonic - Atom\\x22 href\\x3d\\x22http://neopythonic.blogspot.com/feeds/posts/default\\x22 /\\x3e\\n\\x3clink rel\\x3d\\x22alternate\\x22 type\\x3d\\x22application/rss+xml\\x22 title\\x3d\\x22Neopythonic - RSS\\x22 href\\x3d\\x22http://neopythonic.blogspot.com/feeds/posts/default?alt\\x3drss\\x22 /\\x3e\\n\\x3clink rel\\x3d\\x22service.post\\x22 type\\x3d\\x22application/atom+xml\\x22 title\\x3d\\x22Neopythonic - Atom\\x22 href\\x3d\\x22https://www.blogger.com/feeds/4195135246107166251/posts/default\\x22 /\\x3e\\n\', \'meTag\': \'\\x3clink rel\\x3d\\x22me\\x22 href\\x3d\\x22https://www.blogger.com/profile/12821714508588242516\\x22 /\\x3e\\n\', \'adsenseHostId\': \'ca-host-pub-1556223355139109\', \'adsenseHasAds\': false, \'view\': \'\', \'dynamicViewsCommentsSrc\': \'//www.blogblog.com/dynamicviews/4224c15c4e7c9321/js/comments.js\', \'dynamicViewsScriptSrc\': \'//www.blogblog.com/dynamicviews/752be96649ffb269\', \'plusOneApiSrc\': \'https://apis.google.com/js/plusone.js\', \'disableGComments\': true, \'sharing\': {\'platforms\': [{\'name\': \'Get link\', \'key\': \'link\', \'shareMessage\': \'Get link\', \'target\': \'\'}, {\'name\': \'Facebook\', \'key\': \'facebook\', \'shareMessage\': \'Share to Facebook\', \'target\': \'facebook\'}, {\'name\': \'BlogThis!\', \'key\': \'blogThis\', \'shareMessage\': \'BlogThis!\', \'target\': \'blog\'}, {\'name\': \'Twitter\', \'key\': \'twitter\', \'shareMessage\': \'Share to Twitter\', \'target\': \'twitter\'}, {\'name\': \'Pinterest\', \'key\': \'pinterest\', \'shareMessage\': \'Share to Pinterest\', \'target\': \'pinterest\'}, {\'name\': 
\'Email\', \'key\': \'email\', \'shareMessage\': \'Email\', \'target\': \'email\'}], \'disableGooglePlus\': true, \'googlePlusShareButtonWidth\': 300, \'googlePlusBootstrap\': \'\\x3cscript type\\x3d\\x22text/javascript\\x22\\x3ewindow.___gcfg \\x3d {\\x27lang\\x27: \\x27en\\x27};\\x3c/script\\x3e\'}, \'hasCustomJumpLinkMessage\': false, \'jumpLinkMessage\': \'Read more\', \'pageType\': \'index\', \'pageName\': \'\', \'pageTitle\': \'Neopythonic\'}}, {\'name\': \'features\', \'data\': {\'sharing_get_link_dialog\': \'true\', \'sharing_native\': \'false\'}}, {\'name\': \'messages\', \'data\': {\'edit\': \'Edit\', \'linkCopiedToClipboard\': \'Link copied to clipboard!\', \'ok\': \'Ok\', \'postLink\': \'Post Link\'}}, {\'name\': \'template\', \'data\': {\'name\': \'custom\', \'localizedName\': \'Custom\', \'isResponsive\': false, \'isAlternateRendering\': false, \'isCustom\': true}}, {\'name\': \'view\', \'data\': {\'classic\': {\'name\': \'classic\', \'url\': \'?view\\x3dclassic\'}, \'flipcard\': {\'name\': \'flipcard\', \'url\': \'?view\\x3dflipcard\'}, \'magazine\': {\'name\': \'magazine\', \'url\': \'?view\\x3dmagazine\'}, \'mosaic\': {\'name\': \'mosaic\', \'url\': \'?view\\x3dmosaic\'}, \'sidebar\': {\'name\': \'sidebar\', \'url\': \'?view\\x3dsidebar\'}, \'snapshot\': {\'name\': \'snapshot\', \'url\': \'?view\\x3dsnapshot\'}, \'timeslide\': {\'name\': \'timeslide\', \'url\': \'?view\\x3dtimeslide\'}, \'isMobile\': false, \'title\': \'Neopythonic\', \'description\': \'Ramblings through technology, politics, culture and philosophy by the creator of the Python programming language.\', \'url\': \'http://neopythonic.blogspot.com/\', \'type\': \'feed\', \'isSingleItem\': false, \'isMultipleItems\': true, \'isError\': false, \'isPage\': false, \'isPost\': false, \'isHomepage\': true, \'isArchive\': false, \'isLabelSearch\': false}}]);\n_WidgetManager._RegisterWidget(\'_NavbarView\', new _WidgetInfo(\'Navbar1\', \'navbar\', document.getElementById(\'Navbar1\'), {}, 
\'displayModeFull\'));\n_WidgetManager._RegisterWidget(\'_BlogView\', new _WidgetInfo(\'Blog1\', \'main\', document.getElementById(\'Blog1\'), {\'cmtInteractionsEnabled\': false, \'lightboxEnabled\': true, \'lightboxModuleUrl\': \'https://www.blogger.com/static/v1/jsbin/577060686-lbx.js\', \'lightboxCssUrl\': \'https://www.blogger.com/static/v1/v-css/368954415-lightbox_bundle.css\'}, \'displayModeFull\'));\n_WidgetManager._RegisterWidget(\'_HeaderView\', new _WidgetInfo(\'Header1\', \'header\', document.getElementById(\'Header1\'), {}, \'displayModeFull\'));\n_WidgetManager._RegisterWidget(\'_FollowersView\', new _WidgetInfo(\'Followers1\', \'sidebar\', document.getElementById(\'Followers1\'), {}, \'displayModeFull\'));\n_WidgetManager._RegisterWidget(\'_BlogArchiveView\', new _WidgetInfo(\'BlogArchive1\', \'sidebar\', document.getElementById(\'BlogArchive1\'), {\'languageDirection\': \'ltr\', \'loadingMessage\': \'Loading\\x26hellip;\'}, \'displayModeFull\'));\n_WidgetManager._RegisterWidget(\'_ProfileView\', new _WidgetInfo(\'Profile1\', \'sidebar\', document.getElementById(\'Profile1\'), {}, \'displayModeFull\'));\n</script>\n</body>\n</html>'

We have several options to get the titles:

  • Using string splits
  • Using regular expressions
  • Using an HTML parsing package

Let's try to find the titles using each method:

1.1 Parsing using String Splits

In [2]:
html = s
html_parts = html.split("h3 class='post-title entry-title")

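def get_title(html):
    # Keep only the text that precedes the closing </h3> tag
    h = html.split("</h3>")[0]
    print("After first split:\n %s\n" % h)
    # Pieces after splitting on "'>": leftover h3 attributes, the <a href=...> tag, the title text
    h = h.split("'>")[2]
    print("After second split:\n %s\n" % h)
    # Drop the closing </a> tag and the surrounding whitespace
    return h.replace("</a>", "").strip()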


l = [get_title(i) for i in html_parts[1:]]
l
    
After first split:
 ' itemprop='name'>
<a href='http://neopythonic.blogspot.com/2019/03/why-operators-are-useful.html'>Why operators are useful</a>


After second split:
 Why operators are useful</a>


After first split:
 ' itemprop='name'>
<a href='http://neopythonic.blogspot.com/2018/11/what-do-do-with-your-computer-science.html'>What to do with your computer science career</a>


After second split:
 What to do with your computer science career</a>


After first split:
 ' itemprop='name'>
<a href='http://neopythonic.blogspot.com/2016/07/about-spammers-and-comments.html'>About spammers and comments</a>


After second split:
 About spammers and comments</a>


After first split:
 ' itemprop='name'>
<a href='http://neopythonic.blogspot.com/2016/05/union-syntax.html'>Union syntax</a>


After second split:
 Union syntax</a>


After first split:
 ' itemprop='name'>
<a href='http://neopythonic.blogspot.com/2016/05/adding-type-annotations-for-fspath.html'>Adding type annotations for fspath</a>


After second split:
 Adding type annotations for fspath</a>


After first split:
 ' itemprop='name'>
<a href='http://neopythonic.blogspot.com/2016/05/the-anystr-type-variable.html'>The AnyStr type variable</a>


After second split:
 The AnyStr type variable</a>


Out[2]:
['Why operators are useful',
 'What to do with your computer science career',
 'About spammers and comments',
 'Union syntax',
 'Adding type annotations for fspath',
 'The AnyStr type variable']
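
The chained splits above depend on the exact position of each delimiter (for example the index `[2]`), so a small change in the markup can break them. The following is a minimal sketch of a slightly more defensive variant built on `str.partition`. The helper name `between` and the hand-written `sample` string are assumptions made for illustration; they are not part of the lecture code, and the sample only mimics the blog's markup.

```python
def between(text, start, end):
    """Return the text between the first `start` and the next `end`, or None."""
    _, found_start, rest = text.partition(start)
    if not found_start:
        return None
    inner, found_end, _ = rest.partition(end)
    return inner if found_end else None


# Hand-written sample that mimics the blog's markup (illustrative only).
sample = (
    "<h3 class='post-title entry-title' itemprop='name'>\n"
    "<a href='http://example.com/a.html'>First post</a>\n"
    "</h3>\n"
    "<h3 class='post-title entry-title' itemprop='name'>\n"
    "<a href='http://example.com/b.html'>Second post</a>\n"
    "</h3>\n"
)

titles = []
for part in sample.split("<h3 class='post-title entry-title")[1:]:
    # Text between "<a " and "</a>", e.g. "href='...'>First post"
    anchor = between(part, "<a ", "</a>")
    if anchor is not None:
        # Everything after the first ">" is the link text
        titles.append(anchor.partition(">")[2].strip())

print(titles)
```

Because `between` returns `None` when a marker is missing, a malformed entry is skipped instead of raising an `IndexError`.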

1.2 Parsing using Regular Expressions

In [3]:
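import re
# Capture the link text inside each post-title <h3> element
r = re.compile(r"<h3 class='post-title entry-title'.*?>.*?>(.*?)</a></h3>")
# Remove line breaks first, because "." does not match newlines by default
r.findall(html.replace("\r","").replace("\n",""))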
Out[3]:
['Why operators are useful',
 'What to do with your computer science career',
 'About spammers and comments',
 'Union syntax',
 'Adding type annotations for fspath',
 'The AnyStr type variable']
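
The cell above strips line breaks so that `.` can match across what used to be separate lines. As an alternative sketch, the `re.DOTALL` flag and `\s*` let the pattern cope with the line breaks directly. The `sample` string below is hand-written to mimic the blog's markup, and the pattern is one possible formulation, not the lecture's own.

```python
import re

# Hand-written sample that mimics the blog's markup (illustrative only).
sample = (
    "<h3 class='post-title entry-title' itemprop='name'>\n"
    "<a href='http://example.com/a.html'>First post</a>\n"
    "</h3>\n"
    "<h3 class='post-title entry-title' itemprop='name'>\n"
    "<a href='http://example.com/b.html'>Second post</a>\n"
    "</h3>\n"
)

title_re = re.compile(
    # [^>]* skips the remaining attributes of a tag; \s* skips line breaks
    r"<h3 class='post-title entry-title'[^>]*>\s*<a [^>]*>(.*?)</a>\s*</h3>",
    re.DOTALL,  # let "." match newlines inside the captured title as well
)

titles = title_re.findall(sample)
print(titles)
```

Using `[^>]*` instead of `.*?` for the tag attributes keeps each part of the pattern from running past the end of its own tag.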

1.3 Parsing using BeautifulSoup

In [4]:
from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')
l = soup.find_all('h3', attrs={'class': 'post-title entry-title'})
l
Out[4]:
[<h3 class="post-title entry-title" itemprop="name">
 <a href="http://neopythonic.blogspot.com/2019/03/why-operators-are-useful.html">Why operators are useful</a>
 </h3>, <h3 class="post-title entry-title" itemprop="name">
 <a href="http://neopythonic.blogspot.com/2018/11/what-do-do-with-your-computer-science.html">What to do with your computer science career</a>
 </h3>, <h3 class="post-title entry-title" itemprop="name">
 <a href="http://neopythonic.blogspot.com/2016/07/about-spammers-and-comments.html">About spammers and comments</a>
 </h3>, <h3 class="post-title entry-title" itemprop="name">
 <a href="http://neopythonic.blogspot.com/2016/05/union-syntax.html">Union syntax</a>
 </h3>, <h3 class="post-title entry-title" itemprop="name">
 <a href="http://neopythonic.blogspot.com/2016/05/adding-type-annotations-for-fspath.html">Adding type annotations for fspath</a>
 </h3>, <h3 class="post-title entry-title" itemprop="name">
 <a href="http://neopythonic.blogspot.com/2016/05/the-anystr-type-variable.html">The AnyStr type variable</a>
 </h3>]
In [5]:
# Get the title text from each <h3> element
[t.text.strip() for t in l]
Out[5]:
['Why operators are useful',
 'What to do with your computer science career',
 'About spammers and comments',
 'Union syntax',
 'Adding type annotations for fspath',
 'The AnyStr type variable']
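
When installing a third-party package is not an option, the same idea can be sketched with `html.parser.HTMLParser` from the Python standard library. The class name `TitleParser` and the hand-written `sample` string are assumptions made for illustration; this is a simplified sketch and not a replacement for BeautifulSoup.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text inside <h3> elements whose class list contains 'post-title'."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; the class value may hold several names
        classes = (dict(attrs).get("class") or "").split()
        if tag == "h3" and "post-title" in classes:
            self.in_title = True
            self.titles.append("")

    def handle_endtag(self, tag):
        if tag == "h3":
            self.in_title = False

    def handle_data(self, data):
        # Accumulate text only while inside a matching <h3>
        if self.in_title:
            self.titles[-1] += data


# Hand-written sample that mimics the blog's markup (illustrative only).
sample = (
    "<h3 class='post-title entry-title' itemprop='name'>\n"
    "<a href='http://example.com/a.html'>First post</a>\n"
    "</h3>\n"
    "<h3 class='post-title entry-title' itemprop='name'>\n"
    "<a href='http://example.com/b.html'>Second post</a>\n"
    "</h3>\n"
)

parser = TitleParser()
parser.feed(sample)
titles = [t.strip() for t in parser.titles]
print(titles)
```

The parser tracks state by hand (`in_title`), which is exactly the bookkeeping that BeautifulSoup's `find_all` hides.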

2. Collecting Data using APIs

One straightforward way to collect data is to use APIs. In the following example, we will use the Wikipedia Python package, which wraps the MediaWiki API. First, we install the Wikipedia package. For visualization, we will also install the NetworkX package.

In [6]:
!pip install wikipedia
!pip install networkx
Collecting wikipedia
Requirement already satisfied: requests<3.0.0,>=2.0.0 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from wikipedia) (2.22.0)
Requirement already satisfied: beautifulsoup4 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from wikipedia) (4.8.0)
Requirement already satisfied: certifi>=2017.4.17 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from requests<3.0.0,>=2.0.0->wikipedia) (2019.9.11)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from requests<3.0.0,>=2.0.0->wikipedia) (1.24.2)
Requirement already satisfied: idna<2.9,>=2.5 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from requests<3.0.0,>=2.0.0->wikipedia) (2.8)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from requests<3.0.0,>=2.0.0->wikipedia) (3.0.4)
Requirement already satisfied: soupsieve>=1.2 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from beautifulsoup4->wikipedia) (1.9.3)
Installing collected packages: wikipedia
Successfully installed wikipedia-1.4.0
Requirement already satisfied: networkx in /anaconda3/envs/massivedata/lib/python3.6/site-packages (2.3)
Requirement already satisfied: decorator>=4.3.0 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from networkx) (4.4.0)
In [7]:
import wikipedia
w = wikipedia.page("Machine Learning")
w.summary
Out[7]:
'Machine learning (ML) is the scientific study of algorithms and statistical models that computer systems use to perform a specific task without using explicit instructions, relying on patterns and inference instead. It is seen as a subset of artificial intelligence. Machine learning algorithms build a mathematical model based on sample data, known as "training data", in order to make predictions or decisions without being explicitly programmed to perform the task. Machine learning algorithms are used in a wide variety of applications, such as email filtering and computer vision, where it is difficult or infeasible to develop a conventional algorithm for effectively performing the task.\nMachine learning is closely related to computational statistics, which focuses on making predictions using computers. The study of mathematical optimization delivers methods, theory and application domains to the field of machine learning. Data mining is a field of study within machine learning, and focuses on exploratory data analysis through unsupervised learning. In its application across business problems, machine learning is also referred to as predictive analytics.\n\n'
In [8]:
w.links[:20]
Out[8]:
['ACM Computing Classification System',
 'ACM Computing Surveys',
 'ADALINE',
 'AT&T Labs',
 'Action selection',
 'Active learning (machine learning)',
 'Adaptive website',
 'Affective computing',
 'Alan Turing',
 'Algorithm',
 'Algorithm design',
 'Algorithmic bias',
 'Algorithmic efficiency',
 'Amazon Machine Learning',
 'Analysis of algorithms',
 'Angoss',
 'Anomaly detection',
 'Apache Mahout',
 'Apache Spark',
 'Apache SystemML']
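
The `wikipedia` package hides the underlying HTTP requests. As a rough sketch of what a direct MediaWiki API call for a page's links could look like, the snippet below builds an `action=query` / `prop=links` URL and pulls the link titles out of a response. The function names `build_links_url` and `extract_link_titles` are hypothetical, and `sample_response` is hand-written to mirror the general shape of such a response; it is not real API output. No network request is made.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_links_url(title, limit=20):
    """Build a MediaWiki query URL that asks for the links on one page."""
    params = {
        "action": "query",
        "prop": "links",
        "titles": title,
        "plnamespace": 0,   # article namespace only
        "pllimit": limit,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


def extract_link_titles(response):
    """Collect link titles from a decoded JSON response (a dict)."""
    titles = []
    for page in response.get("query", {}).get("pages", {}).values():
        for link in page.get("links", []):
            titles.append(link["title"])
    return titles


# Hand-written stand-in for a decoded JSON response (illustrative only).
sample_response = {
    "query": {
        "pages": {
            "1": {
                "title": "Machine learning",
                "links": [
                    {"ns": 0, "title": "ADALINE"},
                    {"ns": 0, "title": "Alan Turing"},
                ],
            }
        }
    }
}

url = build_links_url("Machine learning")
link_titles = extract_link_titles(sample_response)
print(url)
print(link_titles)
```

To try it against the live API, the URL could be fetched with `requests.get(url).json()`. Long link lists are returned in several batches, so a complete client would also need to follow the continuation values in the response.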

Let's build a directed graph in which each vertex is a Wikipedia page linked from the "Machine Learning" page, and each edge goes from a page to another page in the set that it links to.

In [9]:
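# Use the first 50 linked pages as the vertex set
vertices = set(w.links[:50])
# The root page links to every vertex
links = [("Machine Learning", v) for v in vertices]

for v in vertices:
    try:
        page = wikipedia.page(v)
        # Keep only links that point to another page in the vertex set
        for v2 in page.links:
            if v2 in vertices:
                links.append((v, v2))
    except Exception as e:
        # Some titles are ambiguous or missing; report them and move on
        print(e)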
        
/anaconda3/envs/massivedata/lib/python3.6/site-packages/wikipedia/wikipedia.py:389: UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

The code that caused this warning is on line 389 of the file /anaconda3/envs/massivedata/lib/python3.6/site-packages/wikipedia/wikipedia.py. To get rid of this warning, pass the additional argument 'features="lxml"' to the BeautifulSoup constructor.

  lis = BeautifulSoup(html).find_all('li')
"adeline" may refer to: 
Adeline (given name)
Yves-Marie Adeline
Adeline, Illinois
Adeline Records
Adeline Software International
Ballade pour Adeline
Portrait of Mary Adeline Williams
"Adeline" (song)
Adeleorina
Cyclone Adeline
Pépinières Arboretum Adeline
Adeline (rocket)
Sweet Adeline (disambiguation)

Let's draw the graph:

In [10]:
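import networkx as nx
%matplotlib inline
g = nx.DiGraph()
# Converting to a set removes duplicate (source, target) pairs
g.add_edges_from(set(links))
# Note: nx.info() works in NetworkX 2.x (used here); it was removed in NetworkX 3.0
nx.info(g)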
Out[10]:
'Name: \nType: DiGraph\nNumber of nodes: 51\nNumber of edges: 281\nAverage in degree:   5.5098\nAverage out degree:   5.5098'
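
The average in-degree and out-degree reported above are equal because every directed edge adds one to the out-degree of its source and one to the in-degree of its target, so both averages equal the number of edges divided by the number of nodes. A minimal sketch with a toy edge list (invented for illustration) and the same arithmetic for the reported graph:

```python
from collections import Counter

# Toy directed edge list (illustrative only).
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "A")]

out_degree = Counter(src for src, _ in edges)
in_degree = Counter(dst for _, dst in edges)
nodes = {n for edge in edges for n in edge}

avg_out = sum(out_degree.values()) / len(nodes)
avg_in = sum(in_degree.values()) / len(nodes)
print(avg_in, avg_out)

# The same arithmetic for the graph reported above: 281 edges, 51 nodes.
reported_avg = round(281 / 51, 4)
print(reported_avg)
```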
In [11]:
import matplotlib.pyplot as plt
plt.figure(3,figsize=(14,14))
nx.draw_kamada_kawai(g, with_labels=True)
/anaconda3/envs/massivedata/lib/python3.6/site-packages/networkx/drawing/nx_pylab.py:579: MatplotlibDeprecationWarning: 
The iterable function was deprecated in Matplotlib 3.1 and will be removed in 3.3. Use np.iterable instead.
  if not cb.iterable(width):
/anaconda3/envs/massivedata/lib/python3.6/site-packages/networkx/drawing/nx_pylab.py:676: MatplotlibDeprecationWarning: 
The iterable function was deprecated in Matplotlib 3.1 and will be removed in 3.3. Use np.iterable instead.
  if cb.iterable(node_size):  # many node sizes