Revolutions
http://blog.revolutionanalytics.com/
Daily news about using open source R for big data analysis, predictive modeling, data science, and visualization since 2008.

Introducing Joyplots
http://blog.revolutionanalytics.com/2017/07/joyplots.html
<p>This is a <strong>joyplot</strong>: a series of histograms, density plots or time series for a number of data segments, all aligned to the same horizontal scale and presented with a slight overlap.</p>
<blockquote class="twitter-tweet" data-lang="en">
<p dir="ltr" lang="en">Peak time for sports and leisure <a href="https://twitter.com/hashtag/dataviz?src=hash">#dataviz</a>. About time for a joyplot; might do a write-up on them. <a href="https://twitter.com/hashtag/rstats?src=hash">#rstats</a> code at <a href="https://t.co/Q2AgW068Wa">https://t.co/Q2AgW068Wa</a> <a href="https://t.co/SVT6pkB2hB">pic.twitter.com/SVT6pkB2hB</a></p>
— Henrik Lindberg (@hnrklndbrg) <a href="https://twitter.com/hnrklndbrg/status/883675698300420098">July 8, 2017</a></blockquote>
<script src="//platform.twitter.com/widgets.js"></script>
<p>The name "Joy Plot" was apparently <a href="https://twitter.com/JennyBryan/status/856674638981550080">coined by Jenny Bryan</a> in April 2017, in response to one of Lindberg's earlier visualizations using this style. (The community appears to have settled on 'joyplot' since then.) The name refers to the classic 1979 Joy Division "Unknown Pleasures" <a href="https://en.wikipedia.org/wiki/File:Unknown_Pleasures_Joy_Division_LP_sleeve.jpg">album cover</a>, which was in actuality a <a href="https://blogs.scientificamerican.com/sa-visual/pop-culture-pulsar-the-science-behind-joy-division-s-unknown-pleasures-album-cover/">joyplot of radio intensities</a> from the first known pulsar. The album cover reproduced the design from a <a href="https://www.scientificamerican.com/magazine/sa/1971/01-01/#article-the-nature-of-pulsars">1971 Scientific American article</a> about pulsars.</p>
<p>Lindberg created the chart above using a <a href="https://github.com/halhen/viz-pub/blob/master/sports-time-of-day/2_gen_chart.R">simple R script</a> and some <a href="http://ggplot2.org/">ggplot2</a> calls, but now it's even easier to create joyplots in R thanks to the <a href="https://github.com/clauswilke/ggjoy">ggjoy package</a> by Claus Wilke, which is now <a href="http://mran.microsoft.com/package/ggjoy/">available on CRAN</a>. The <a href="http://mran.microsoft.com/web/packages/ggjoy/vignettes/introduction.html">vignette provides instructions</a> on how to use the package to create your own joyplots: you can either use the <code>geom_ridgeline</code> geom to plot individual lines (useful if your data is already in density or time series format), or use <code>geom_joy</code> to calculate densities from the samples. There's also a <code>theme_joy</code> theme which eliminates the traditional ggplot grid for a cleaner result more reminiscent of the classic 1971 joyplot. You can see several examples (with code) in <a href="http://mran.microsoft.com/web/packages/ggjoy/vignettes/gallery.html">this gallery</a>, and I've also included a couple of other examples below.</p>
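<p>As a quick sketch of the package in use (a hypothetical example with the built-in <code>iris</code> data; see the vignette for the full set of options):</p>
<pre>
library(ggplot2)
library(ggjoy)

# One density ridge per species, all sharing a common horizontal scale
ggplot(iris, aes(x = Sepal.Length, y = Species)) +
  geom_joy() +
  theme_joy()
</pre>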
<p>Polarization of the US House of Representative members in the Democratic and Republican parties (based on <a href="http://rpubs.com/ianrmcdonald/293304">code by Ian McDonald</a>):</p>
<blockquote class="twitter-tweet" data-lang="en">
<p dir="ltr" lang="en">Asymmetric polarization: The entire GOP has moved to the right, where Dems have just become slightly less moderate. (H/t <a href="https://twitter.com/ianrmcdonald">@ianrmcdonald</a>) <a href="https://t.co/IeIo959qRV">pic.twitter.com/IeIo959qRV</a></p>
— G Elliott Morris📈🙂 (@gelliottmorris) <a href="https://twitter.com/gelliottmorris/status/888904322326638592">July 22, 2017</a></blockquote>
<script src="//platform.twitter.com/widgets.js"></script>
<p>Worker wages by industry:</p>
<blockquote class="twitter-tweet" data-lang="en">
<p dir="ltr" lang="en">The <a href="https://twitter.com/justcapital_">@justcapital_</a> worker wage data for this year's rankings is taking shape. <a href="https://twitter.com/hashtag/ESG?src=hash">#ESG</a> <a href="https://twitter.com/hashtag/rstats?src=hash">#rstats</a> Thank you <a href="https://twitter.com/ClausWilke">@ClausWilke</a> for the joyplot package <a href="https://t.co/HBRuF7mIdq">pic.twitter.com/HBRuF7mIdq</a></p>
— Hernando Cortina (@cortinah) <a href="https://twitter.com/cortinah/status/889970303308320769">July 25, 2017</a></blockquote>
<script src="//platform.twitter.com/widgets.js"></script>
<p>As you can see from the examples above, joyplots work best when you want to compare the distributions of several subgroups, and the number of groups is large (so that a traditional panel plot would take up too much space). That said, joyplots do inevitably obscure some data due to the overlap, and you may want to <a href="https://eagereyes.org/blog/2017/joy-plots">consider alternatives</a> like heat maps or even simple bar charts of means or medians. In any case, the new ggjoy package makes it easy to try them out and decide what works best for your data.</p>
<p>Github (clauswilke): <a href="https://github.com/clauswilke/ggjoy">ggjoy: Geoms to make joy plots using ggplot2</a></p>
<p>Tags: graphics, R. Posted by David Smith, 2017-07-26.</p>

SQL Server 2017 release candidate now available
http://blog.revolutionanalytics.com/2017/07/sql-server-rc1.html
<p>SQL Server 2017, the next major release of the SQL Server database, has been available as a community preview for around 8 months, but now the <a href="https://blogs.technet.microsoft.com/dataplatforminsider/2017/07/17/first-release-candidate-of-sql-server-2017-now-available/">first full-featured release candidate is available for public preview</a>. For those looking to do data science with data in SQL Server, there are a number of new features compared to SQL Server 2016 which might be of interest: </p>
<ul>
<li>SQL Server 2017 will be <a href="https://docs.microsoft.com/en-us/sql/linux/sql-server-linux-overview">supported on Linux</a>, specifically on Red Hat, SUSE, and Ubuntu (as well as within a Docker container). You can read some of the <a href="https://techcrunch.com/2017/07/17/how-microsoft-brought-sql-server-to-linux/">backstory on how SQL Server came to Linux here</a>.</li>
<li>The new <a href="https://docs.microsoft.com/en-us/sql/advanced-analytics/what-s-new-in-sql-server-machine-learning-services">SQL Server Machine Learning Services</a> will provide in-database analytics in both the R and — new to SQL Server 2017 — <a href="https://blogs.technet.microsoft.com/dataplatforminsider/2017/04/19/python-in-sql-server-2017-enhanced-in-database-machine-learning/">Python</a> languages.</li>
<li>There are several improvements to the in-database R integration (previously known as "R Services" in SQL Server 2016). These include a <a href="https://docs.microsoft.com/en-us/sql/advanced-analytics/r/r-package-management-for-sql-server-r-services">new package management</a> system, <a href="https://docs.microsoft.com/en-us/sql/advanced-analytics/sql-native-scoring">real-time in-database scoring</a> of predictive models, and an updated R engine.</li>
</ul>
<p>SQL Server 2017 Release Candidate 1 is <a href="https://www.microsoft.com/en-us/sql-server/sql-server-2017#resources">available for download now</a>. For more details on these and other new features in this release, check out the link below.</p>
<p>SQL Server Blog: <a href="https://blogs.technet.microsoft.com/dataplatforminsider/2017/07/17/first-release-candidate-of-sql-server-2017-now-available/">First release candidate of SQL Server 2017 now available</a></p>

<p>Tags: Microsoft, python, R. Posted by David Smith, 2017-07-25.</p>

Analyzing Github pull requests with Neural Embeddings, in R
http://blog.revolutionanalytics.com/2017/07/neural-embeddings-nlp.html
<p>At the useR!2017 conference earlier this month, my colleague Ali Zaidi <a href="https://user2017.sched.com/event/Axpk/neural-embeddings-and-nlp-with-r-and-spark?iframe=no&w=&sidebar=yes&bg=no">gave a presentation</a> on using <a href="http://colah.github.io/posts/2014-07-NLP-RNNs-Representations/">Neural Embeddings</a> to analyze <a href="https://github.com/Microsoft/ghinsights">GitHub pull request comments</a> (processed using the tidy text framework). The data analysis was done using R and distributed on Spark, and the resulting neural network was trained using the <a href="https://github.com/Microsoft/CNTK">Microsoft Cognitive Toolkit</a>. You can see the slides here, and you can <a href="https://channel9.msdn.com/Events/useR-international-R-User-conferences/useR-International-R-User-2017-Conference/Neural-Embeddings-and-NLP-with-R-and-Spark">watch the presentation</a> below. </p>
<p class="asset-video"><iframe allowfullscreen="" frameborder="0" height="288" src="https://channel9.msdn.com/Events/useR-international-R-User-conferences/useR-International-R-User-2017-Conference/Neural-Embeddings-and-NLP-with-R-and-Spark/player" width="512"></iframe></p>
<p>Tags: events, Microsoft, R. Posted by David Smith, 2017-07-24.</p>

Because it's Friday: How Bitcoin works
http://blog.revolutionanalytics.com/2017/07/because-its-friday-how-bitcoin-works.html
<p>Cryptocurrencies have been in the news quite a bit lately. Bitcoin prices have been <a href="https://qz.com/1035565/bitcoin-price-and-segwit-the-cryptocurrency-is-surging-because-a-hard-fork-has-been-averted/">soaring</a> recently after the community narrowly avoided the need for a fork, while $32M in rival currency Ethereum was <a href="https://arstechnica.com/security/2016/06/bitcoin-rival-ethereum-fights-for-its-survival-after-50-million-heist/">recently stolen</a>, thanks to a coding error in the wallet application Parity. But what is a cryptocurrency, and what does a "wallet" or a "fork" mean in that context? The video below gives the best explanation I've seen for how cryptocurrencies work. It's 25 minutes long, but it's a complex and surprisingly subtle topic, made easy to understand by math explainer channel <a href="https://www.youtube.com/channel/UCYO_jab_esuFRV4b17AJtAw">3Blue1Brown</a>.</p>
<p class="asset-video"><iframe allowfullscreen="" frameborder="0" height="281" src="https://www.youtube.com/embed/bBC-nXj3Ng4?rel=0" width="500"></iframe></p>
<p>That's all from the blog for this week. Have a great weekend, and we'll be back on Monday.</p>

<p>Tags: random. Posted by David Smith, 2017-07-21.</p>

IEEE Spectrum 2017 Top Programming Languages
http://blog.revolutionanalytics.com/2017/07/ieee-spectrum-2017-top-programming-languages.html
<p>IEEE Spectrum has published its <a href="http://blog.revolutionanalytics.com/2016/07/r-moves-up-to-5th-place-in-ieee-language-rankings.html">fourth annual ranking</a> of top programming languages, and the R language is again featured in the Top 10. This year R ranks at #6, down a spot from its <a href="http://blog.revolutionanalytics.com/2016/07/r-moves-up-to-5th-place-in-ieee-language-rankings.html">2016 ranking</a> (and with an IEEE score — <a href="http://spectrum.ieee.org/ns/IEEE_TPL_2017/methods.html">derived from</a> search, social media, and job listing trends — tied with the #5 place-getter, C#). Python has taken the #1 slot from C, jumping from its #3 ranking in 2016.</p>
<p><a class="asset-img-link" href="http://spectrum.ieee.org/static/interactive-the-top-programming-languages-2017" style="display: inline;"><img alt="IEEE Spectrum 2017" border="0" class="asset asset-image at-xid-6a010534b1db25970b01b7c90de665970b image-full img-responsive" src="http://revolution-computing.typepad.com/.a/6a010534b1db25970b01b7c90de665970b-800wi" title="IEEE Spectrum 2017" /></a></p>
<p>That R (a domain-specific language for data science) ranks in the top 10, and that Python (a general-purpose language with many data science applications) takes the top spot, may seem like a surprise. I attribute this to continued broad demand for machine intelligence application development, driven by the growth of "big data" initiatives and the strategic imperative of companies worldwide to capitalize on these data stores. Other data-oriented languages appear in the <a href="http://spectrum.ieee.org/static/interactive-the-top-programming-languages-2017">Top 50</a> rankings, including Matlab (#15), SQL (#23), Julia (#31) and SAS (#37).</p>
<p>For the complete announcement of the 2017 IEEE Spectrum rankings, including additional commentary and analysis of changes, follow the link below.</p>
<p>IEEE Spectrum: <a href="http://spectrum.ieee.org/computing/software/the-2017-top-programming-languages">The 2017 Top Programming Languages</a></p>
<p>Tags: popularity, python, R. Posted by David Smith, 2017-07-21.</p>

Data Analysis for Life Sciences
http://blog.revolutionanalytics.com/2017/07/harvardx.html
<p>Rafael Irizarry from the Harvard T.H. Chan School of Public Health has presented a number of courses on R and Biostatistics on EdX, and he recently also <a href="http://rafalab.github.io/pages/harvardx.html">provided an index</a> of all of the course modules as YouTube videos with supplemental materials. The EdX courses are linked below; you can take them for free, or simply follow the series of YouTube videos and materials provided in the index. </p>
<p><a class="asset-img-link" href="https://leanpub.com/dataanalysisforthelifesciences" style="float: right;"><img alt="DALS" class="asset asset-image at-xid-6a010534b1db25970b01b8d29797b0970c img-responsive" src="http://revolution-computing.typepad.com/.a/6a010534b1db25970b01b8d29797b0970c-120wi" style="margin: 0px 0px 5px 5px;" title="DALS" /></a><strong><a class="asset-img-link" href="https://www.edx.org/xseries/data-analysis-life-sciences">Data Analysis for the Life Sciences Series</a></strong> </p>
<ul>
<li><a href="https://www.edx.org/course/statistics-r-harvardx-ph525-1x-0">Statistics and R</a></li>
<li><a href="https://www.edx.org/course/introduction-linear-models-matrix-harvardx-ph525-2x-1">Introduction to Linear Models and Matrix Algebra</a></li>
<li><a href="https://www.edx.org/course/statistical-inference-modeling-high-harvardx-ph525-3x-0">Statistical Inference and Modeling for High-throughput Experiments</a></li>
<li><a href="https://www.edx.org/course/high-dimensional-data-analysis-harvardx-ph525-4x-0">High-Dimensional Data Analysis</a></li>
</ul>
<p>A <a href="https://leanpub.com/dataanalysisforthelifesciences">companion book</a> and associated <a href="http://genomicsclass.github.io/book/">R Markdown documents</a> are also available for download.</p>
<p><strong>Genomics Data Analysis Series</strong></p>
<ul>
<li><a href="https://www.edx.org/course/introduction-bioconductor-annotation-harvardx-ph525-5x-0">Introduction to Bioconductor: Annotation and Analysis of Genomes and Genomic Assays</a></li>
<li><a href="https://www.edx.org/course/high-performance-computing-reproducible-harvardx-ph525-6x-0">High-performance computing for reproducible genomics with Bioconductor</a></li>
<li><a href="https://www.edx.org/course/case-studies-functional-genomics-harvardx-ph525-7x-0">Case Studies in Functional Genomics</a></li>
</ul>
<p>For links to all of the course components, including videos and supplementary materials, follow the link below.</p>
<p>rafalab: <a href="http://rafalab.github.io/pages/harvardx.html">HarvardX Biomedical Data Science Open Online Training</a></p>

<p>Tags: courses, life sciences, R. Posted by David Smith, 2017-07-20.</p>

Securely store API keys in R scripts with the "secret" package
http://blog.revolutionanalytics.com/2017/07/secret-package.html
<p>If you use an API key to access a secure service, or need to use a password to access a protected database, you'll need to provide these "secrets" in your R code somewhere. That's easy to do if you just include those keys as strings in your code — but it's not very secure. This means your private keys and passwords are stored in plain-text on your hard drive, and if you email your script they're available to anyone who can intercept that email. It's also really easy to inadvertently include those keys in a public repo if you use Github or similar code-sharing services.</p>
<p>To address this problem, Gábor Csárdi and Andrie de Vries created the <a href="https://mran.microsoft.com/package/secret/">secret package</a> for R. The secret package integrates with <a href="https://www.openssh.com/">OpenSSH</a>, providing R functions that allow you to create a vault for your keys on your local machine, define trusted users who can access those keys, and then include encrypted keys in R scripts or packages that can only be decrypted by you or by people you trust. You can see how it works in the vignette "secret: Share Sensitive Information in R Packages", and in <a href="https://channel9.msdn.com/Events/useR-international-R-User-conferences/useR-International-R-User-2017-Conference/Can-you-keep-a-secret">this presentation by Andrie de Vries</a> at <a href="https://user2017.sched.com/event/Axq3/can-you-keep-a-secret?iframe=yes&w=&sidebar=yes&bg=no#">useR!2017</a>:</p>
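<p>In outline, a session with the package might look like the sketch below (function names as I read them from the package documentation; treat the paths and values as placeholders rather than a definitive reference):</p>
<pre>
library(secret)

# Create a vault directory to hold users and encrypted secrets
vault <- file.path(tempdir(), "vault")
create_vault(vault)

# Register a trusted user by their public key
add_user("alice", public_key = "~/.ssh/id_rsa.pub", vault = vault)

# Store a secret, encrypted so only the listed users can read it
add_secret("db_password", "s3cr3t", users = "alice", vault = vault)

# Decryption requires the matching private key
get_secret("db_password", key = "~/.ssh/id_rsa", vault = vault)
</pre>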
<p> <iframe allowfullscreen="" frameborder="0" height="288" src="https://channel9.msdn.com/Events/useR-international-R-User-conferences/useR-International-R-User-2017-Conference/Can-you-keep-a-secret/player" width="512"></iframe></p>
<p>To use the secret package, you'll need access to your private key, which you'll also need to store securely. For that, you might also want to take a look at the in-progress <a href="https://github.com/gaborcsardi/keyring">keyring package</a>, which allows you to access secrets stored in Keychain on macOS, Credential Store on Windows, and the Secret Service API on Linux.</p>
<p>The secret package is <a href="https://cran.r-project.org/package=secret">available now on CRAN</a>, and you can also find the <a href="https://github.com/gaborcsardi/secret">latest development version on Github</a>.</p>
<p>Tags: packages, R. Posted by David Smith, 2017-07-19.</p>

Neural Networks from Scratch, in R
http://blog.revolutionanalytics.com/2017/07/nnets-from-scratch.html
<p><em>By Ilia Karmanov, Data Scientist at Microsoft </em></p>
<p>This post is for those of you with a statistics/econometrics background but not necessarily a machine-learning one and for those of you who want some guidance in building a neural-network from scratch in R to better understand how everything fits (and how it doesn't).</p>
<p>Andrej Karpathy <a href="https://medium.com/@karpathy/yes-you-should-understand-backprop-e2f06eab496b">wrote</a> that when CS231n (Deep Learning at Stanford) was offered:</p>
<blockquote>
<p>"we intentionally designed the programming assignments to include explicit calculations involved in backpropagation on the lowest level. The students had to implement the forward and the backward pass of each layer in raw numpy. Inevitably, some students complained on the class message boards".</p>
</blockquote>
<p>Why bother with backpropagation when all frameworks do it for you automatically and there are more interesting deep-learning problems to consider?</p>
<p>Nowadays we can literally train a full neural-network (on a GPU) in 5 lines.</p>
<pre>
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import RMSprop
model = Sequential()
model.add(Dense(512, activation='relu', input_shape=(784,)))
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer=RMSprop())
model.fit()
</pre>
<p>Karpathy sets aside the "intellectual curiosity" and "you might want to improve on the core algorithm later" arguments. His argument is that the calculations are a <a href="https://en.wikipedia.org/wiki/Leaky_abstraction">leaky abstraction</a>:</p>
<blockquote>
<p>“it is easy to fall into the trap of abstracting away the learning process — believing that you can simply stack arbitrary layers together and backprop will 'magically make them work' on your data”</p>
</blockquote>
<p>Hence, my motivation for this post is two-fold:</p>
<ol>
<li><p>Understanding (by writing from scratch) the leaky abstractions behind neural-networks dramatically shifted my focus to elements whose importance I initially overlooked. If my model is not learning I have a better idea of what to address rather than blindly wasting time switching optimisers (or even frameworks).</p></li>
<li><p>A deep-neural-network (DNN), once taken apart into lego blocks, is no longer a black-box that is inaccessible to other disciplines outside of AI. It's a combination of many topics that are very familiar to most people with a basic knowledge of statistics. I believe they need to cover very little (just the glue that holds the blocks together) to get an insight into a whole new realm.</p></li>
</ol>
<p>Starting from a linear regression, we will work through the maths and the code all the way to a deep-neural-network (DNN) in the accompanying R-notebooks, hopefully showing that very little of it is actually new information.</p>
<p><img src="https://github.com/ilkarman/DemoNeuralNet/blob/master/pic_support/blog_summ.png?raw=true" alt="Summary" /></p>
<h2>Step 1 - Linear Regression (<a href="https://github.com/ilkarman/DemoNeuralNet/blob/master/01_LinearRegression.ipynb">See Notebook</a>)</h2>
<p><a class="asset-img-link" style="display: inline;" href="http://revolution-computing.typepad.com/.a/6a010534b1db25970b01b8d2961da1970c-pi"><img class="asset asset-image at-xid-6a010534b1db25970b01b8d2961da1970c img-responsive" alt="Step1" title="Step1" src="http://revolution-computing.typepad.com/.a/6a010534b1db25970b01b8d2961da1970c-500wi" /></a><br /></p>
<p>Implementing the closed-form solution for the Ordinary Least Squares estimator in R requires just a few lines:</p>
<pre>
# Matrix of explanatory variables
X <- as.matrix(X)
# Add column of 1s for intercept coefficient
intcpt <- rep(1, length(y))
# Combine predictors with intercept
X <- cbind(intcpt, X)
# OLS (closed-form solution)
beta_hat <- solve(t(X) %*% X) %*% t(X) %*% y
</pre>
<p>The vector of values in the variable <code>beta_hat</code> defines our "machine-learning model". A linear regression is used to predict a continuous variable (e.g. how many minutes will this plane be delayed by). In the case of predicting a category (e.g. will this plane be delayed - yes/no) we want our prediction to fall between 0 and 1 so that we can interpret it as the probability of observing the respective category (given the data). </p>
<p>When we have just two mutually-exclusive outcomes we would use a binomial logistic regression. With more than two outcomes (or "classes") which are mutually-exclusive (e.g. this plane will be delayed by less than 5 minutes, 5-10 minutes, or more than 10 minutes), we would use a multinomial logistic regression (or "softmax"). In the case of many (n) classes that are not mutually-exclusive (e.g. this post references "R" and "neural-networks" and "statistics"), we can fit n separate binomial logistic regressions.</p>
<p>An alternative approach to the closed-form solution we found above is to use an iterative method, called <a href="https://en.wikipedia.org/wiki/Gradient_descent">Gradient Descent</a> (GD). The procedure may look like so:</p>
<ul>
<li>Start with a random guess for the weights</li>
<li>Plug guess into loss function</li>
<li>Move the guess in the opposite direction of the gradient at that point by a small amount (something we call the 'learning rate')</li>
<li>Repeat above for N steps</li>
</ul>
<p>GD uses only the <a href="https://en.wikipedia.org/wiki/Jacobian_matrix_and_determinant">Jacobian</a> matrix of first derivatives (not the <a href="https://en.wikipedia.org/wiki/Hessian_matrix">Hessian</a>). However, when the loss is convex, all local minima are global minima, so GD is guaranteed to converge to the global minimum. </p>
<p>The loss-function used for a linear-regression is the Mean Squared Error:</p>
<p>\begin{equation*}
C = \frac{1}{2n}\sum_x(y(x) - a(x))^2
\end{equation*}</p>
<p>To use GD we only need to find the partial derivative of this with respect to <code>beta_hat</code> (the 'delta'/gradient).</p>
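<p>For reference, differentiating the loss above with respect to the coefficient vector gives, in matrix form (with \(X\) the design matrix and \(a = X\hat\beta\)):</p>
<p>\begin{equation*}
\frac{\partial C}{\partial \hat\beta} = \frac{1}{n}X^\top(X\hat\beta - y)
\end{equation*}</p>
<p>The \(\frac{1}{2}\) in the loss cancels against the exponent when differentiating, which is why only a plain \(\frac{1}{n}\) factor appears in the gradient.</p>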
<p>This can be implemented in R, like so:</p>
<pre>
# Start with a random guess
beta_hat <- matrix(0.1, nrow=ncol(X_mat))
# Repeat below for N-iterations
for (j in 1:N)
{
# Calculate the cost/error (y_guess - y_truth)
residual <- (X_mat %*% beta_hat) - y
# Calculate the gradient at that point
delta <- (t(X_mat) %*% residual) * (1/nrow(X_mat))
# Move guess in opposite direction of gradient
beta_hat <- beta_hat - (lr*delta)
}
</pre>
<p>Running this for 200 iterations gets us to the same gradient and coefficients as the closed-form solution. Aside from being a stepping stone to a neural-network (where we use GD), this iterative method can be useful in practice when the closed-form solution cannot be calculated because the matrix is too big to invert (i.e. too big to fit into memory).</p>
<h2>Step 2 - Logistic Regression (<a href="https://github.com/ilkarman/DemoNeuralNet/blob/master/02_LogisticRegression.ipynb">See Notebook</a>)</h2>
<p><a class="asset-img-link" style="display: inline;" href="http://revolution-computing.typepad.com/.a/6a010534b1db25970b01b7c90bcaaa970b-pi"><img class="asset asset-image at-xid-6a010534b1db25970b01b7c90bcaaa970b img-responsive" alt="Step2" title="Step2" src="http://revolution-computing.typepad.com/.a/6a010534b1db25970b01b7c90bcaaa970b-500wi" /></a><br /></p>
<p>A logistic regression is a linear regression for binary classification problems. The two main differences to a standard linear regression are:</p>
<ol>
<li>We use an 'activation'/link function called the logistic-sigmoid to squash the output to a probability bounded by 0 and 1</li>
<li>Instead of minimising the quadratic loss we minimise the negative log-likelihood of the Bernoulli distribution</li>
</ol>
<p>Everything else remains the same.</p>
<p>We can calculate our activation function like so:</p>
<pre>
sigmoid <- function(z){1.0/(1.0+exp(-z))}
</pre>
<p>We can create our log-likelihood function in R:</p>
<pre>
log_likelihood <- function(X_mat, y, beta_hat)
{
scores <- X_mat %*% beta_hat
ll <- (y * scores) - log(1+exp(scores))
sum(ll)
}
</pre>
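<p>It's worth checking that this matches the Bernoulli log-likelihood: with \(z = X\hat\beta\) and \(a = \sigma(z)\), we have \(\ln(a) = z - \ln(1+e^{z})\) and \(\ln(1-a) = -\ln(1+e^{z})\), so</p>
<p>\begin{equation*}
y\ln(a) + (1-y)\ln(1-a) = yz - \ln(1+e^{z})
\end{equation*}</p>
<p>which is exactly the per-observation term summed in the code above.</p>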
<p>This loss function (the logistic loss or the log-loss) is also called the cross-entropy loss. The cross-entropy loss is basically a measure of 'surprise' and will be the foundation for all the following models, so it is worth examining a bit more.</p>
<p>If we simply constructed the least-squares loss like before, because we now have a non-linear activation function (the sigmoid), the loss will no longer be convex which will make optimisation hard.</p>
<p>\begin{equation*}
C = \frac{1}{2n}\sum_x(y(x) - a(x))^2
\end{equation*}</p>
<p>We could construct our own loss function for the two classes. When \(y=1\), we want our loss function to be very high if our prediction is close to 0, and very low when it is close to 1. When \(y=0\), we want our loss function to be very high if our prediction is close to 1, and very low when it is close to 0. This leads us to the following loss function:</p>
<p>\begin{equation*}
C = -\frac{1}{n}\sum_xy(x)\ln(a(x)) + (1 - y(x))\ln(1-a(x))
\end{equation*}</p>
<p>The delta for this loss function is pretty much the same as the one we had earlier for a linear-regression. The only difference is that we apply our sigmoid function to the prediction. This means that the GD function for a logistic regression will also look very similar:</p>
<pre>
logistic_reg <- function(X, y, epochs, lr)
{
  X_mat <- cbind(1, X)
  beta_hat <- matrix(1, nrow=ncol(X_mat))
  for (j in 1:epochs)
  {
    # For a linear regression this was:
    # 1*(X_mat %*% beta_hat) - y
    residual <- sigmoid(X_mat %*% beta_hat) - y
    # Update weights with gradient descent
    delta <- t(X_mat) %*% as.matrix(residual, ncol=nrow(X_mat)) * (1/nrow(X_mat))
    beta_hat <- beta_hat - (lr*delta)
  }
  # Print the log-likelihood
  print(log_likelihood(X_mat, y, beta_hat))
  # Return the fitted weights
  beta_hat
}
</pre>
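<p>One way to convince ourselves the gradient-descent version works is to compare it against R's built-in glm() on simulated data. The data and hyper-parameters below are purely illustrative, and the helper functions are repeated so the snippet stands alone; both fits should recover roughly the true coefficients (1, 2).</p>

```r
sigmoid <- function(z){1.0/(1.0+exp(-z))}

log_likelihood <- function(X_mat, y, beta_hat)
{
  scores <- X_mat %*% beta_hat
  sum((y * scores) - log(1 + exp(scores)))
}

logistic_reg <- function(X, y, epochs, lr)
{
  X_mat <- cbind(1, X)
  beta_hat <- matrix(1, nrow=ncol(X_mat))
  for (j in 1:epochs)
  {
    residual <- sigmoid(X_mat %*% beta_hat) - y
    delta <- t(X_mat) %*% as.matrix(residual, ncol=nrow(X_mat)) * (1/nrow(X_mat))
    beta_hat <- beta_hat - (lr*delta)
  }
  print(log_likelihood(X_mat, y, beta_hat))
  beta_hat
}

# Simulate data where the true model is logit(p) = 1 + 2x
set.seed(123)
n <- 1000
x <- matrix(rnorm(n), ncol = 1)
y <- rbinom(n, 1, sigmoid(1 + 2 * x))

beta_gd  <- logistic_reg(x, y, epochs = 5000, lr = 0.5)
beta_glm <- coef(glm(y ~ x, family = binomial))
# Both columns should be very close to each other
cbind(beta_gd, beta_glm)
```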
<h2>Step 3 - Softmax Regression (No Notebook)</h2>
<p><a class="asset-img-link" style="display: inline;" href="http://revolution-computing.typepad.com/.a/6a010534b1db25970b01bb09af042b970d-pi"><img class="asset asset-image at-xid-6a010534b1db25970b01bb09af042b970d img-responsive" alt="Step3" title="Step3" src="http://revolution-computing.typepad.com/.a/6a010534b1db25970b01bb09af042b970d-500wi" /></a><br /></p>
<p>A generalisation of the logistic regression is the multinomial logistic regression (also called 'softmax'), which is used when there are more than two classes to predict. I haven't created this example in R, because the neural-network in the next step can reduce to something similar; however, for completeness, I wanted to highlight the main differences in case you want to create it.</p>
<p>First, instead of using the sigmoid function to squash our (one) value between 0 and 1:</p>
<p>\begin{equation*}
\sigma(z)=\frac{1}{1+e^{-z}}
\end{equation*}</p>
<p>We use the softmax function to squash the sum of our \(n\) values (for \(n\) classes) to 1:</p>
<p>\begin{equation*}
\phi(z)_j=\frac{e^{z_j}}{\sum_ke^{z_k}}
\end{equation*}</p>
<p>This means the value supplied for each class can be interpreted as the probability of that class, given the evidence. This also means that when we see the target class and increase the weights to increase the probability of observing it, the probability of the other classes will fall. The implicit assumption is that our classes are mutually exclusive.</p>
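<p>As a concrete sketch, the softmax function can be written in R as below; subtracting the maximum before exponentiating is a standard numerical-stability trick (it does not change the result, since it cancels in the ratio):</p>

```r
softmax <- function(z) {
  # Subtract max(z) so exp() cannot overflow; the output is unchanged
  exp_z <- exp(z - max(z))
  exp_z / sum(exp_z)
}

softmax(c(1, 2, 3))       # probabilities that favour the largest score
sum(softmax(c(1, 2, 3)))  # always sums to 1
```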
<p>Second, we use a more general version of the cross-entropy loss function:</p>
<p>\begin{equation*}
C = -\frac{1}{n}\sum_x\sum_j y_j\ln(a_j)
\end{equation*}</p>
<p>To see why, remember that for binary classification (the previous example) we had two classes (\(j=2\)). Under the condition that the categories are mutually exclusive (\(\sum_ja_j=1\)) and that \(y\) is <a href="https://www.quora.com/What-is-one-hot-encoding-and-when-is-it-used-in-data-science">one-hot</a> (so that \(y_1+y_2=1\)), we can re-write the general formula as:</p>
<p>\begin{equation*}
C = -\frac{1}{n}\sum_xy_1\ln(a_1) + (1 - y_1)\ln(1-a_1)
\end{equation*}</p>
<p>This is the same equation we first started with; however, now we relax the constraint that \(j=2\). It can be shown that the cross-entropy loss here has the same gradient as the binary/two-class cross-entropy on logistic outputs:</p>
<p>\begin{equation*}
\frac{\partial C}{\partial \beta_i} = \frac{1}{n}\sum_xx_i(a(x) - y)
\end{equation*}</p>
<p>However, although the gradient has the same formula it will be different because the activation here takes on a different value (softmax instead of logistic-sigmoid). </p>
<p>In most deep-learning frameworks you have the choice of 'binary-crossentropy' or 'categorical-crossentropy' loss. Depending on whether your last layer contains sigmoid or softmax activation you would want to choose binary or categorical cross-entropy (respectively). The training of the network should not be affected, since the gradient is the same, however the reported loss (for evaluation) would be wrong if these are mixed up. </p>
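<p>A small numerical check makes the relationship concrete: for two mutually-exclusive classes, the categorical cross-entropy on a two-column output equals the binary cross-entropy on a single sigmoid-style output. (The toy predictions below are made up for illustration.)</p>

```r
binary_ce <- function(y, a) {
  -mean(y * log(a) + (1 - y) * log(1 - a))
}
categorical_ce <- function(Y, A) {
  # Y and A are n x k matrices of one-hot targets and class probabilities
  -mean(rowSums(Y * log(A)))
}

a <- c(0.9, 0.2, 0.6)  # predicted P(y = 1) for three observations
y <- c(1, 0, 1)
A <- cbind(a, 1 - a)   # the same predictions as two-class probabilities
Y <- cbind(y, 1 - y)

binary_ce(y, a)       # identical ...
categorical_ce(Y, A)  # ... to this
```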
<p>The motivation to go through softmax is that most neural-networks will use a softmax layer as the final/'read-out' layer, with a multinomial/categorical cross-entropy loss instead of using sigmoids with a binary cross-entropy loss — when the categories are mutually exclusive. Although multiple sigmoids for multiple classes can also be used (and will be used in the next example), this is generally only used for the case of non-mutually-exclusive labels (i.e. we can have multiple labels). With a softmax output, since the sum of the outputs is constrained to equal 1, we have the advantage of interpreting the outputs as class probabilities.</p>
<h2>Step 4 - Neural Network (<a href="https://github.com/ilkarman/DemoNeuralNet/blob/master/03_NeuralNet.ipynb">See Notebook</a>)</h2>
<p><a class="asset-img-link" style="display: inline;" href="http://revolution-computing.typepad.com/.a/6a010534b1db25970b01b7c90bcbd2970b-pi"><img class="asset asset-image at-xid-6a010534b1db25970b01b7c90bcbd2970b img-responsive" alt="Step4" title="Step4" src="http://revolution-computing.typepad.com/.a/6a010534b1db25970b01b7c90bcbd2970b-500wi" /></a><br /></p>
<p>A neural network can be thought of as a series of logistic regressions stacked on top of each other. This means we could say that a logistic regression is a neural-network (with sigmoid activations) with no hidden-layer.</p>
<p>This hidden-layer lets a neural-network generate non-linearities and leads to the <a href="https://en.wikipedia.org/wiki/Universal_approximation_theorem">Universal approximation theorem</a>, which states that a network with just one hidden layer can approximate any continuous function to arbitrary accuracy, given enough hidden neurons. In practice, the number of hidden-layers can go into the hundreds. </p>
<p>It can be useful to think of a neural-network as a combination of two things: 1) many logistic regressions stacked on top of each other that are 'feature-generators' and 2) one read-out-layer which is just a softmax regression. The recent successes in deep-learning can arguably be attributed to the 'feature-generators'. For example, previously with computer vision, we had to painfully state that we wanted to find triangles, circles, colours, and in what combination (similar to how economists decide which interaction-terms they need in a linear regression). Now, the hidden-layers are basically an optimisation to decide which features (which 'interaction-terms') to extract. A lot of deep-learning (transfer learning) is actually done by generating features using a trained-model with the head (read-out layer) cut off, and then training a logistic regression (or boosted decision-trees) using those features as inputs.</p>
<p>The hidden-layer also means that our loss function is not convex in parameters and we can't roll down a smooth hill to get to the bottom. Instead of using Gradient Descent (which we did for the case of a logistic-regression) we will use Stochastic Gradient Descent (SGD), which basically shuffles the observations (random/stochastic) and updates the gradient after each mini-batch (generally much smaller than the total number of observations) has been propagated through the network. There are many alternatives to SGD that Sebastian Ruder does a great job of summarising <a href="http://sebastianruder.com/optimizing-gradient-descent/">here</a>. I think this is a fascinating topic to go through, but outside the scope of this blog-post. Briefly, however, the vast majority of the optimisation methods are first-order (including SGD, Adam, RMSprop, and Adagrad) because calculating the second-order is too computationally difficult. However, some of these first-order methods have a fixed learning-rate (SGD) and some have an adaptive learning-rate (Adam), which means that the 'amount' we update our weights by becomes a function of the loss - we may make big jumps in the beginning but then take smaller steps as we get closer to the target.</p>
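<p>The core of an SGD loop can be sketched in a few lines of R. Here grad_step is a hypothetical stand-in for whatever per-batch gradient update the model uses (the real implementation lives in the notebook gists); the sketch just shows the shuffle-then-update-per-mini-batch structure:</p>

```r
# One epoch of mini-batch SGD: shuffle the observations, then apply a
# gradient update after each mini-batch has been seen.
sgd_epoch <- function(X, y, beta, lr, batch_size, grad_step) {
  idx <- sample(nrow(X))  # the 'stochastic' part: a random shuffle
  batches <- split(idx, ceiling(seq_along(idx) / batch_size))
  for (b in batches) {
    beta <- grad_step(X[b, , drop = FALSE], y[b], beta, lr)
  }
  beta
}
```

<p>Plugging in, say, a least-squares gradient step for grad_step recovers mini-batch linear regression; plugging in the back-propagated deltas recovers the network training loop.</p>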
<p>It should be clear, however, that minimising the loss on training data is not the main goal - in theory we want to minimise the loss on 'unseen'/test data; hence all the optimisation methods proxy for that under the assumption that a low loss on training data will generalise to 'new' data from the same distribution. This means we may prefer a neural-network with a higher training-loss, because it has a lower validation-loss (on data it hasn't been trained on) - we would typically say that the network has 'overfit' in this case. There have been some <a href="https://arxiv.org/abs/1705.08292">recent papers</a> that claim that adaptive optimisation methods do not generalise as well as SGD because they find very sharp minima points.</p>
<p>Previously we only had to back-propagate the gradient one layer, now we also have to back-propagate it through all the hidden-layers. Explaining the back-propagation algorithm is beyond the scope of this post, however it is crucial to understand. Many good <a href="http://neuralnetworksanddeeplearning.com/chap2.html">resources</a> exist online to help.</p>
<p>We can now create a neural-network from scratch in R using four functions.</p>
<p>First, we initialise our weights:</p>
<pre>
neuralnetwork <- function(sizes, training_data, epochs,
                          mini_batch_size, lr, C, verbose=FALSE,
                          validation_data=training_data)
</pre>
<p>Since we now have a complex combination of parameters we can't just initialise them to be 1 or 0, like before - the network may get stuck. To help, we use the Gaussian distribution (however, just as with the optimisation, there are many other methods):</p>
<script src="https://gist.github.com/ilkarman/eedfca43224c44a360ac151e65294b69.js"></script>
<p>Second, we use stochastic gradient descent as our optimisation method:</p>
<script src="https://gist.github.com/ilkarman/09766b37841e6dccba52c932bf57cf29.js"></script>
<p>Third, as part of the SGD method, we update the weights after each mini-batch has been forward and backwards-propagated:</p>
<script src="https://gist.github.com/ilkarman/bee1d8af000b0a0d17f3948f90448af6.js"></script>
<p>Fourth, the algorithm we use to calculate the deltas is the back-propagation algorithm.</p>
<p>In this example we use the cross-entropy loss function, which produces the following gradient:</p>
<pre>
cost_delta <- function(method, z, a, y) {
  if (method == 'ce') {return (a - y)}
}
</pre>
<p>Also, to be consistent with our logistic regression example we use the sigmoid activation for the hidden layers and for the read-out layer:</p>
<pre>
# Calculate activation function
sigmoid <- function(z){1.0/(1.0+exp(-z))}
# Partial derivative of activation function
sigmoid_prime <- function(z){sigmoid(z)*(1-sigmoid(z))}
</pre>
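<p>We can sanity-check the analytic derivative with a quick central finite-difference approximation (the two definitions are repeated so the snippet stands alone; the point z = 0.5 and step h are arbitrary illustrative choices):</p>

```r
sigmoid <- function(z){1.0/(1.0+exp(-z))}
sigmoid_prime <- function(z){sigmoid(z)*(1-sigmoid(z))}

# Central finite difference: (f(z + h) - f(z - h)) / (2h)
z <- 0.5
h <- 1e-6
numeric_grad <- (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)
c(numeric = numeric_grad, analytic = sigmoid_prime(z))  # the two should agree
```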
<p>As mentioned previously, usually the softmax activation is used for the read-out layer. For the hidden layers, ReLU is more common, which is just the max function (negative values get flattened to 0). The activation function for the hidden layers can be imagined as a race to carry a baton/flame (the gradient) without it dying out. The sigmoid function flattens out at 0 and at 1, resulting in a flat gradient, which is equivalent to the flame dying out (we have lost our signal). The ReLU function helps preserve this gradient.</p>
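<p>For reference, ReLU and its derivative are one-liners in R; the derivative is 1 wherever the unit is active, which is what keeps the gradient (the 'flame') alive through many layers:</p>

```r
relu <- function(z) pmax(z, 0)
# The gradient passes through unchanged for positive inputs,
# and is blocked (0) for negative inputs
relu_prime <- function(z) as.numeric(z > 0)

relu(c(-2, -0.5, 0, 1, 3))        # 0 0 0 1 3
relu_prime(c(-2, -0.5, 0, 1, 3))  # 0 0 0 1 1
```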
<p>The back-propagation function is defined as:</p>
<pre>
backprop <- function(x, y, C, sizes, num_layers, biases, weights)
</pre>
<p>Check out the <a href="https://notebooks.azure.com/ilia/libraries/nnetR">notebook</a> for the full code — however the principle remains the same: we have a forward-pass where we generate our prediction by propagating the weights through all the layers of the network. We then plug this into the cost gradient and update the weights through all of our layers.</p>
<p>This concludes the creation of a neural network (with as many hidden layers as you desire). It can be a good exercise to replace the hidden-layer activation with ReLU and the read-out with softmax, and also to add L1 and L2 regularisation. Running this on the <a href="http://scikit-learn.org/stable/auto_examples/datasets/plot_iris_dataset.html">iris dataset</a> in the notebook (which contains 4 explanatory variables with 3 possible outcomes), with just one hidden-layer containing 40 neurons, we get an accuracy of 96% after 30 rounds/epochs of training.</p>
<p>The notebook also runs a 100-neuron <a href="http://yann.lecun.com/exdb/mnist/">handwriting-recognition</a> example to predict the digit corresponding to a 28x28 pixel image.</p>
<h2>Step 5 - Convolutional Neural Network (<a href="https://github.com/ilkarman/DemoNeuralNet/blob/master/04_Convolutions.ipynb">See Notebook</a>)</h2>
<p><a class="asset-img-link" style="display: inline;" href="http://revolution-computing.typepad.com/.a/6a010534b1db25970b01bb09af05cf970d-pi"><img class="asset asset-image at-xid-6a010534b1db25970b01bb09af05cf970d img-responsive" alt="Step5" title="Step5" src="http://revolution-computing.typepad.com/.a/6a010534b1db25970b01bb09af05cf970d-500wi" /></a><br /></p>
<p>Here, we will briefly examine only the forward-propagation in a convolutional neural-network (CNN). CNNs were first made popular in 1998 by <a href="http://yann.lecun.com/exdb/publis/pdf/lecun-98b.pdf">LeCun's seminal paper</a>. Since then, they have proven to be the best method we have for recognising patterns in images, sounds, videos, and even text!</p>
<p>Image recognition was initially a manual process; researchers would have to specify which bits (features) of an image were useful to identify. For example, if we wanted to classify an image into 'cat' or 'basketball' we could have created code that extracts colours (basketballs are orange) and shapes (cats have triangular ears). Perhaps with a count of these features we could then run a linear regression to get the relationship between the number of triangles and whether the image is a cat or a basketball. This approach suffers from issues of image scale, angle, quality and light.
<a href="https://en.wikipedia.org/wiki/Scale-invariant_feature_transform">Scale Invariant Feature Transformation</a> (SIFT) largely improved upon this and was used to provide a 'feature description' of an object, which could then be fed into a linear regression (or any other relationship learner). However, this approach had set-in-stone rules that could not be optimally altered for a specific domain.</p>
<p>CNNs look at images (extract features) in an interesting way. To start, they look only at very small parts of an image (at a time), perhaps through a restricted window of 5 by 5 pixels (a filter). 2D convolutions are used for images, and these slide the window across until the whole image has been covered. This stage would typically extract colours and edges. However, the next layer of the network would look at a combination of the previous filters and thus 'zoom-out'. After a certain number of layers the network would be 'zoomed-out' enough to recognise shapes and larger structures. </p>
<p>These filters end up as the 'features' that the network has learned to identify. It can then pretty much count the presence of each feature to identify a relationship with the image label ('basketball' or 'cat'). This approach appears quite natural for images — since they can be broken down into small parts that describe them (colours, textures, etc.). CNNs appear to thrive on the fractal-like nature of images. This also means they may not be a great fit for other forms of data, such as an Excel worksheet, where there is no inherent structure: we can change the column order and the data remains the same — try swapping pixels in an image (the image changes)! </p>
<p>In the previous example we looked at a standard neural-net classifying handwritten digits. In that network, each neuron from layer \(i\) was connected to each neuron at layer \(j\) — our 'window' was the whole image. This means that if we learn what the digit '2' looks like, we may not recognise it when it is written upside down by mistake, because we have only seen it upright. CNNs have the advantage of looking at small bits of the digit '2' and finding patterns between patterns between patterns. This means that a lot of the features they extract may be immune to rotation, skew, etc. For more detail, Brandon Rohrer explains what a CNN actually is <a href="https://www.youtube.com/watch?v=FmpDIaiMIeA">here</a>.</p>
<p>We can define a 2D convolution function in R:</p>
<pre>
convolution <- function(input_img, filter, show=TRUE, out=FALSE)
{
  # The filter dimensions give the size of the sliding window
  kernel_size <- dim(filter)
  conv_out <- outer(
    1:(nrow(input_img)-kernel_size[[1]]+1),
    1:(ncol(input_img)-kernel_size[[2]]+1),
    Vectorize(function(r,c) sum(input_img[r:(r+kernel_size[[1]]-1),
                                          c:(c+kernel_size[[2]]-1)]*filter))
  )
  if (show) {image(conv_out)}
  if (out) {conv_out}
}
</pre>
<p>And use it to apply a 3x3 filter to an image:</p>
<pre>
conv_emboss <- matrix(c(2,0,0,0,-1,0,0,0,-1), nrow = 3)
convolution(input_img = r_img, filter = conv_emboss)
</pre>
<p>You can check the notebook to see the result; however, this seems to extract the edges from a picture. Other convolutions can 'sharpen' an image, like this 3x3 filter:</p>
<pre>
conv_sharpen <- matrix(c(0,-1,0,-1,5,-1,0,-1,0), nrow = 3)
convolution(input_img = r_img, filter = conv_sharpen)
</pre>
<p>Typically we would randomly initialise a number of filters (e.g. 64):</p>
<pre>
filter_map <- lapply(X=c(1:64), FUN=function(x){
# Random matrix of 0, 1, -1
conv_rand <- matrix(sample.int(3, size=9, replace = TRUE), ncol=3)-2
convolution(input_img = r_img, filter = conv_rand, show=FALSE, out=TRUE)
})
</pre>
<p>We can visualise this map with the following function:</p>
<script src="https://gist.github.com/ilkarman/3a2cb478a39b7a53d4a2ad2a2ca804f7.js"></script>
<p><a class="asset-img-link" style="display: inline;" href="http://revolution-computing.typepad.com/.a/6a010534b1db25970b01bb09af0634970d-pi"><img class="asset asset-image at-xid-6a010534b1db25970b01bb09af0634970d img-responsive" alt="Features" title="Features" src="http://revolution-computing.typepad.com/.a/6a010534b1db25970b01bb09af0634970d-500wi" /></a><br /></p>
<p>Running this function, we notice how computationally intensive the process is (compared to a standard fully-connected layer). If these feature maps are not useful 'features' (i.e. the loss is difficult to decrease when they are used), then back-propagation will update the weights so that we get different feature-maps, ones that are more useful for making the classification.</p>
<p>Typically we stack convolutions on top of other convolutions (and hence the need for a deep network) so that edges becomes shapes and shapes become noses and noses become faces. It can be interesting to examine some <a href="https://adeshpande3.github.io/assets/deconvnet.png">feature maps</a> from trained networks to see what the network has actually learnt.</p>
<h3>Download Notebooks</h3>
<p>You can find notebooks implementing the code behind this post on Github by following the links in the section headings, or as Azure Notebooks at the link below:</p>
<p>Azure Notebooks: <a href="https://notebooks.azure.com/ilia/libraries/nnetR">NeuralNetR</a></p>
Guest Blogger2017-07-18T09:30:00-07:00Revisiting the useR!2017 conference: Recordings now available
http://blog.revolutionanalytics.com/2017/07/revisiting-user2017.html
The annual useR!2017 conference took place July 4-7 in Brussels, and in every dimension it was the best yet. It was the largest (with over 1,100 R users from around the world in attendance), and yet still very smoothly run with many amazing talks and lots of fun for everyone. If you weren't able to make it to Brussels, take a look at these recaps from Nick Strayer & Lucy D'Agostino McGowan, Once Upon Data and DataCamp to get a sense of what it was like, or simply take a look at this recap video: From my personal point of...<p>The annual useR!2017 conference took place July 4-7 in Brussels, and in every dimension it was the best yet. It was the largest (with over 1,100 R users from around the world in attendance), and yet still very smoothly run with many amazing talks and lots of fun for everyone. If you weren't able to make it to Brussels, take a look at these recaps from <a href="http://livefreeordichotomize.com/2017/07/12/user-rundown/">Nick Strayer & Lucy D'Agostino McGowan</a>, <a href="http://www.onceupondata.com/2017/07/12/user-2017/">Once Upon Data</a> and <a href="https://www.datacamp.com/community/blog/user-2017-in-retrospect#gs.1TQSZdw">DataCamp</a> to get a sense of what it was like, or simply take a look at this recap video:</p>
<p class="asset-video"><iframe allowfullscreen="" frameborder="0" height="281" src="https://www.youtube.com/embed/YWF6nbUTRao?rel=0" width="500"></iframe></p>
<p class="asset-video">From my personal point of view, if I were to try and capture user!2017 in just one word, it would be: <strong>vibrant</strong>. With so many first-time attendees, an atmosphere of excitement was everywhere, and the conference was noticeably much more <a href="https://channel9.msdn.com/Events/useR-international-R-User-conferences/useR-International-R-User-2017-Conference/Diversity-of-the-R-Community">diverse</a> than in prior years — a really positive development. Kudos to the organizers for their focus on making useR!2017 a <a href="https://twitter.com/olga_mie/status/883301504991547396">welcoming and inclusive</a> conference, and a special shout-out to the <a href="https://twitter.com/RLadiesGlobal/status/882561056752754689">R-Ladies community</a> for encouraging and inspiring so many. I especially enjoyed meeting the diversity scholars and being a part of the special <a href="https://user2017.brussels/schedule">beginner's session</a> held before the conference officially began (and so sadly unrecorded). Judging from the 200+ attendees reactions there, many welcomed getting a jump-start on the R project, its community, and how best to participate and contribute.</p>
<p>The diversity was reflected in the content, too, with a great mix of tutorials, keynotes and talks on R packages, R applications, the R community and ecosystem, and the R project itself. With thanks to Microsoft, all of this material was recorded, and is now available to view on Channel 9: </p>
<p style="text-align: center;"><strong>useR!2017 Recordings</strong>: <a href="https://channel9.msdn.com/Events/useR-international-R-User-conferences/useR-International-R-User-2017-Conference">useR! International R User 2017 Conference</a></p>
<p>All recordings are streamable and downloadable, and are shared under a <a href="https://creativecommons.org/licenses/by-nc-nd/3.0/">Creative Commons license</a>. (Note: a few talks are still in the editing room awaiting posting, but all the content should be available at the link above by July 21.) In many cases, you can also find slides in the sessions listed in the <a href="https://user2017.brussels/schedule">useR!2017 schedule</a>. </p>
<p>With around 300 videos it might be tricky to find the one you want, but you can use the Filters button to reveal a search tool, and you can also filter by specific speakers:</p>
<p><a class="asset-img-link" href="https://channel9.msdn.com/Events/useR-international-R-User-conferences/useR-International-R-User-2017-Conference" style="display: inline;"><img alt="Filters" border="0" class="asset asset-image at-xid-6a010534b1db25970b01b8d296ea52970c image-full img-responsive" src="http://revolution-computing.typepad.com/.a/6a010534b1db25970b01b8d296ea52970c-800wi" title="Filters" /></a></p>
<p>Here are a few searches you might find useful:</p>
<ul>
<li><a href="https://channel9.msdn.com/Events/useR-international-R-User-conferences/useR-International-R-User-2017-Conference?sort=status&direction=desc&d=0&term=">Tutorials</a> (Day 0)</li>
<li><a href="https://channel9.msdn.com/Events/useR-international-R-User-conferences/useR-International-R-User-2017-Conference?sort=status&direction=desc&term=keynote">Keynote presentations</a></li>
<li><a href="https://channel9.msdn.com/Events/useR-international-R-User-conferences/useR-International-R-User-2017-Conference?sort=status&direction=desc&term=lightning">Lightning talk sessions</a></li>
<li><a href="https://channel9.msdn.com/Events/useR-international-R-User-conferences/useR-International-R-User-2017-Conference?sort=status&direction=desc&term=sponsor">Sponsor presentations</a></li>
</ul>
<p>Next year's useR! conference, <a href="https://user2018.r-project.org/">useR!2018</a>, will be held July 10-13 in Brisbane, Australia. The organizers have opened a <a href="https://user2018.r-project.org/blog/2017/06/14/survey/">survey on useR!2018</a> to give the R community an opportunity to make suggestions on the content. If you have ideas for tutorial topics and presenters, keynote speakers, services like child care or sign language interpreters, or how scholarships should be awarded, please do contribute your ideas.</p>
<p>Looking even further out, useR!2019 will be in Toulouse (France), and useR!2020 will be in Boston (USA). That's a lot to be looking forward to, and with useR!2017 setting such a high bar I'm sure these will be outstanding conferences as well. See you there! </p>
<p> </p>eventsRDavid Smith2017-07-17T13:12:50-07:00Because it's Friday: Hidden Holes
http://blog.revolutionanalytics.com/2017/07/because-its-friday-hidden-holes.html
They recently resurfaced the street in front of my house in Chicago. The first step was to grade away the existing layer of bitumen, to level the ground ready for a fresh layer. To my surprise (and I'm sure to the surprise of the engineers — the project was suspended for a couple of weeks), the old bitumen layer was hiding two sinkholes, one easily large enough to swallow a car. It was shocking to think we'd driven over that hole hundreds of times, and the only thing keeping us from falling in was a thin layer of bitumen. As...<p>They recently resurfaced the street in front of my house in Chicago. The first step was to grade away the existing layer of bitumen, to level the ground ready for a fresh layer. To my surprise (and I'm sure to the surprise of the engineers — the project was suspended for a couple of weeks), the old bitumen layer was hiding two sinkholes, one easily large enough to swallow a car. It was shocking to think we'd driven over that hole hundreds of times, and the only thing keeping us from falling in was a thin layer of bitumen.</p>
<p>As the video below explains, such sinkholes are usually caused by water erosion — in our case, probably by a broken water main. Check out the demo <a href="https://youtu.be/e-DVIQPqS8E?t=4m">at the 4:00 mark</a> to see how this can happen.</p>
<p class="asset-video"><iframe allowfullscreen="" frameborder="0" height="281" src="https://www.youtube.com/embed/e-DVIQPqS8E?rel=0" width="500"></iframe></p>
<p>That's all for this week. Enjoy your weekend, and we'll be back with more on the blog on Monday.</p>randomDavid Smith2017-07-14T13:57:54-07:00