{"id":5388,"date":"2020-05-16T02:16:46","date_gmt":"2020-05-16T01:16:46","guid":{"rendered":"http:\/\/mathscitech.org\/articles\/?page_id=5388"},"modified":"2020-12-25T21:38:03","modified_gmt":"2020-12-25T21:38:03","slug":"data-science","status":"publish","type":"post","link":"https:\/\/mathscitech.org\/articles\/data-science","title":{"rendered":"Data Science"},"content":{"rendered":"<p>Short Articles on Data Science<br \/>\n<!--more--><\/p>\n<hr\/>\n<p><a id=\"211\"><\/a><br \/>\n<strong>#211 &#8211; Antibiotics effective on drug-resistant bacteria have been found using computer-aided drug discovery running machine learning\/AI search algorithms on databases of pharmaceutical compounds<\/strong><br \/>\n<em>Feb 21st, 2020<\/em><br \/>\nResearchers at MIT trained a deep learning algorithm using 2,500 compounds that were effective at killing bacteria.  They then turned the algorithm loose on 6,000 compounds under investigation and found &#8220;halicin&#8221;.  They then expanded the search space of the algorithm to 107 million compounds (~7% of a massive database of 1.5 billion known pharma compounds).  
In a few hours, the algorithm had identified 23 promising candidates, of which two, in addition to halicin, have been found to be highly effective in lab trials on almost all known drug-resistant bacteria.<\/p>\n<p>This is good news for medical science, and a strong demonstration of the potency of deep machine learning.<\/p>\n<p>[1] <a href=\"https:\/\/www.theguardian.com\/society\/2020\/feb\/20\/antibiotic-that-kills-drug-resistant-bacteria-discovered-through-ai\" rel=\"noopener noreferrer\" target=\"_blank\">Discovering novel super antibiotics using machine learning<\/a><\/p>\n<hr\/>\n<p><a id=\"191\"><\/a><br \/>\n<strong>#191 &#8211; Statistical Testing: Why Signal to Noise (and Type II errors) matter.<\/strong><br \/>\n<em>Mon 19th August, 2019<\/em><\/p>\n<p>The sample size for a statistical test is a function of three factors:<\/p>\n<p>(1) significance level, or the accepted likelihood of Type I errors (false alarms). Typically this is at most 5% (at least 95% confidence).<\/p>\n<p>(2) power of the test to detect the effect, or the accepted likelihood of Type II errors (missed detections). 
Typically this is at most 20% (at least 80% power), though for important studies it is set to 10% (90% power).<\/p>\n<p>(3) Signal to Noise Ratio (SNR), which is the ratio of the difference in the means (signal) to the standard deviation of the measurements (noise).<\/p>\n<p>Together, these three factors determine the recommended sample size.<\/p>\n<p>Rules of thumb:<br \/>\nSNR=0.2 means N=500,<br \/>\nSNR=1.0 means N=20,<br \/>\nSNR&gt;=2.0 means N&lt;=6.<\/p>\n<div id=\"attachment_3413\" style=\"width: 161px\" class=\"wp-caption aligncenter\"><img decoding=\"async\" aria-describedby=\"caption-attachment-3413\" loading=\"lazy\" class=\"size-full wp-image-3413\" src=\"https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2019\/06\/screenshot.0780.png\" alt=\"Sample Size from Signal to Noise Ratio and Desired Power\" width=\"151\" height=\"491\" srcset=\"https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2019\/06\/screenshot.0780.png 151w, https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2019\/06\/screenshot.0780-92x300.png 92w, https:\/\/mathscitech.org\/articles\/wp-content\/uploads\/2019\/06\/screenshot.0780-46x150.png 46w\" sizes=\"auto, (max-width: 151px) 100vw, 151px\" \/><p id=\"caption-attachment-3413\" class=\"wp-caption-text\">Sample Size from Signal to Noise Ratio and Desired Power<\/p><\/div>\n<p>Reference:<br \/>\n<a href=\"http:\/\/www.3rs-reduction.co.uk\/html\/6__power_and_sample_size.html\" target=\"_blank\" rel=\"noopener noreferrer\">Power &amp; Sample Size<\/a><\/p>\n<hr\/>\n<p><a id=\"186\"><\/a><br \/>\n<strong>#186 &#8211; Data looks better naked. 
Less is more&#8230; effective, attractive, impactive.<\/strong><br \/>\n<em>Sat 20th July, 2019<\/em><br \/>\nIn Aug 2013, Joey Cherdarchuk, cofounder of Darkhorse Analytics, published a 3-part series called <a href=\"https:\/\/www.darkhorseanalytics.com\/search?q=data%20looks%20better%20naked\" target=\"_blank\" rel=\"noopener noreferrer\">Data looks better naked<\/a>.<\/p>\n<p>Everyone working with tables, numbers, data, or visualizations will benefit from looking at the slides. The changes are easy to make but powerful. They embody some fundamental tenets: (1) maximize the data-ink ratio by reducing non-data ink, (2) simplify, and (3) let the data speak for itself, i.e. get rid of the clutter.<\/p>\n<p>Love the quote at the end: &#8220;Perfection is achieved not when there is nothing more to add, but when there is nothing left to take away\u201d \u2013 Antoine de Saint-Exup\u00e9ry<\/p>\n<p><a href=\"http:\/\/static1.squarespace.com\/static\/56713bf4dc5cb41142f28d1f\/5671e8bf816924fc22651410\/5671eb1e816924fc2265196e\/1450306334085\/ClearOffTheTableMd.gif?format=original\"><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter\" src=\"http:\/\/static1.squarespace.com\/static\/56713bf4dc5cb41142f28d1f\/5671e8bf816924fc22651410\/5671eb1e816924fc2265196e\/1450306334085\/ClearOffTheTableMd.gif?format=original\" alt=\"Data Looks better Naked - Less Terrible Tables\" width=\"80%\" height=\"80%\" \/><\/a><\/p>\n<p>The series:<br \/>\nPart I: <a href=\"https:\/\/www.darkhorseanalytics.com\/blog\/data-looks-better-naked\" target=\"_blank\" rel=\"noopener noreferrer\">Improve your Bar Charts<\/a><\/p>\n<p>Part II: <a href=\"https:\/\/www.darkhorseanalytics.com\/blog\/clear-off-the-table?rq=data%20looks%20better%20naked\" target=\"_blank\" rel=\"noopener noreferrer\">Improve your Data Tables<\/a><\/p>\n<p>Part III: <a href=\"https:\/\/www.darkhorseanalytics.com\/blog\/data-looks-better-naked-maps-edition?rq=data%20looks%20better%20naked\" target=\"_blank\" 
rel=\"noopener noreferrer\">Improve your Heat Maps<\/a><\/p>\n<p><strong>References:<\/strong><\/p>\n<ol>\n<li><a href=\"https:\/\/www.darkhorseanalytics.com\/search?q=data%20looks%20better%20naked\" target=\"_blank\" rel=\"noopener noreferrer\">Data Looks Better Naked series<\/a><br \/>\n<a href=\"https:\/\/www.darkhorseanalytics.com\/\" target=\"_blank\" rel=\"noopener noreferrer\">DarkHorse Analytics<\/a><\/li>\n<li><a href=\"https:\/\/www.darkhorseanalytics.com\/joey\" target=\"_blank\" rel=\"noopener noreferrer\">Joey Cherdarchuk<\/a><\/li>\n<\/ol>\n<hr \/>\n<p><a id=\"113\"><\/a><br \/>\n#113 <a class=\"takeout-resource-link\" href=\"http:\/\/www.mathscitech.org\/gplus\/20170806%20-%20Tribute%20to%20Usain%20Bolt%20-%20100m%20stats%20%20Thi.html\" target=\"_blank\" rel=\"noopener noreferrer\">20170806 &#8211; Tribute to Usain Bolt &#8211; 100m stats<\/a><\/p>\n<p><a id=\"101\"><\/a><br \/>\n#101 <a class=\"takeout-resource-link\" href=\"http:\/\/www.mathscitech.org\/gplus\/20160917%20-%20Data%20Visualization%20-%20The%20Fallen%20of%20Worl%281%29.html\" target=\"_blank\" rel=\"noopener noreferrer\">20160917 &#8211; Data &amp; Infographics &#8211; The Fallen in the World War<\/a><\/p>\n<p><a id=\"100\"><\/a><br \/>\n#100 <a class=\"takeout-resource-link\" href=\"http:\/\/www.mathscitech.org\/gplus\/20160917%20-%20Data%20Visualization%20-%20The%20Fallen%20of%20Worl.html\" target=\"_blank\" rel=\"noopener noreferrer\">20160917 &#8211; Data Visualization &#8211; The Fallen in the World War<\/a><\/p>\n<p><a id=\"92\"><\/a><br \/>\n#092 <a class=\"takeout-resource-link\" href=\"http:\/\/www.mathscitech.org\/gplus\/20160313%20-%20Financial%20Signals%20in%20the%20US%20Macro%20Econo.html\" target=\"_blank\" rel=\"noopener noreferrer\">20160313 &#8211; Financial Signals in the US Macro Economy<\/a><\/p>\n<p><a id=\"81\"><\/a><br \/>\n#081 <a class=\"takeout-resource-link\" 
href=\"http:\/\/www.mathscitech.org\/gplus\/20151007%20-%20Analytics%20in%20the%20AWS%20cloud%C2%A0%20Amazon_s%20AW.html\" target=\"_blank\" rel=\"noopener noreferrer\">20151007 &#8211; Analytics in the AWS cloud<\/a><\/p>\n<p><a id=\"80\"><\/a><br \/>\n#080 <a class=\"takeout-resource-link\" href=\"http:\/\/www.mathscitech.org\/gplus\/20150923%20-%20Distinguishing%20occupants%20in%20a%20room%20from.html\" target=\"_blank\" rel=\"noopener noreferrer\">20150923 &#8211; Distinguishing occupants in a room<\/a><\/p>\n<p><a id=\"55b\"><\/a><br \/>\n#55b <a href=\"http:\/\/www.forbes.com\/sites\/ciocentral\/2013\/04\/22\/big-data-isnt-about-big\/\" rel=\"noopener noreferrer\" target=\"_blank\">20140927 &#8211; Big Data isn&#8217;t about Big<\/a>  The focus on data should be about analytical capability rather than size: &#8220;&#8216;Big Data&#8217; is the subjective state a company finds itself in when its human and technical infrastructure can\u2019t keep pace with its data needs.&#8221;<\/p>\n<p><a id=\"49\"><\/a><br \/>\n#049 <a class=\"takeout-resource-link\" href=\"http:\/\/www.mathscitech.org\/gplus\/20140819%20-%20Less%20terrible%20tables%20by%20Darkhorse%20Analy.html\" target=\"_blank\" rel=\"noopener noreferrer\">20140819 &#8211; Presenting Data: Less terrible tables<\/a><\/p>\n<p><a id=\"36\"><\/a><br \/>\n#036 <a class=\"takeout-resource-link\" href=\"http:\/\/www.mathscitech.org\/gplus\/20140627%20-%20There%20is%20a%20dictum,%20%60%60The%20Map%20is%20Not%20the.html\" target=\"_blank\" rel=\"noopener noreferrer\">20140627 &#8211; &#8220;The Map is Not the Territory&#8221;<\/a><\/p>\n<p><a id=\"33\"><\/a><br \/>\n#033 <a class=\"takeout-resource-link\" href=\"http:\/\/www.mathscitech.org\/gplus\/20140624%20-%20Who,%20really,%20is%20a%20_Data%20Scientist_%C2%A0%C2%A0%20+.html\" target=\"_blank\" rel=\"noopener noreferrer\">20140624 &#8211; Who, <em>really<\/em>, is a Data Scientist?<\/a><\/p>\n<p><a id=\"24\"><\/a><br \/>\n#024 <a class=\"takeout-resource-link\" 
href=\"http:\/\/www.mathscitech.org\/gplus\/20140508%20-%20Innovation%20IV_%C2%A0%20Why%20Successful%20Small%20Co.html\" target=\"_blank\" rel=\"noopener noreferrer\">20140508 &#8211; Innovation I<\/a><\/p>\n<p><a id=\"18\"><\/a><br \/>\n#018 <a class=\"takeout-resource-link\" href=\"http:\/\/www.mathscitech.org\/gplus\/20140426%20-%20The%20technical%20world%20tends%20to%20splinter%20i.html\" target=\"_blank\" rel=\"noopener noreferrer\">20140426 &#8211; The technical world tends to splinter<\/a><\/p>\n<p><a id=\"13\"><\/a><br \/>\n#013 <a class=\"takeout-resource-link\" href=\"http:\/\/www.mathscitech.org\/gplus\/20140424%20-%20%20%23Probability%20%20is%20the%20heart%20of%20%20%23simula.html\" target=\"_blank\" rel=\"noopener noreferrer\">20140424 &#8211; Probability is the heart of simulation<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Short Articles on Data Science<\/p>\n<p> [Read More&#8230;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"nf_dc_page":"","footnotes":""},"categories":[194,31,8,189],"tags":[],"coauthors":[112],"class_list":["post-5388","post","type-post","status-publish","format-standard","hentry","category-data-science","category-mathematics","category-statistics","category-short-articles","odd"],"views":2316,"_links":{"self":[{"href":"https:\/\/mathscitech.org\/articles\/wp-json\/wp\/v2\/posts\/5388","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mathscitech.org\/articles\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mathscitech.org\/articles\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mathscitech.org\/articles\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mathscitech.org\/articles\/wp-json\/wp\/v2\/comments?post=5388"}],"version-history":[{"count":2,"href":"https:\/\/mathscitech.org\/articles\/wp-json\/wp\/v2\/posts\/5388\/
revisions"}],"predecessor-version":[{"id":6246,"href":"https:\/\/mathscitech.org\/articles\/wp-json\/wp\/v2\/posts\/5388\/revisions\/6246"}],"wp:attachment":[{"href":"https:\/\/mathscitech.org\/articles\/wp-json\/wp\/v2\/media?parent=5388"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mathscitech.org\/articles\/wp-json\/wp\/v2\/categories?post=5388"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mathscitech.org\/articles\/wp-json\/wp\/v2\/tags?post=5388"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/mathscitech.org\/articles\/wp-json\/wp\/v2\/coauthors?post=5388"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}