
June 16, 2016

Comments


Typo: lpSolveAPI and not lpSolveAPO.

Regards

Hi Joe,
I also wanted to add heatmaply to the mix, but didn't get to release it before submitting the abstract (I will, however, speak about it, alongside d3heatmap).

See you soon :)
Tal

Very nice post. Sadly I won't attend in person, but the videos will hopefully be awesome.

Did you do automated scraping of the abstracts to get the package mentions? Or has that all been manual?

This was mostly a manual process. I did write some simple R code to scrape the abstract description fields, break them into words, etc., and then check whether any of these words were in the list of CRAN packages. (The available.packages() function makes this easy.) However, I soon realized that this approach is far too simple to solve the problem in general. Look at this first sentence from the abstract for the bamdit package:
In this work we present the R package bamdit, its name stands for "Bayesian meta-analysis of diagnostic test-data".

"its" is an R package on CRAN, but obviously not relevant to this talk. I need a way to establish the context to sort this out. Maybe for this relatively small corpus someone (probably not me) could write a few simple rules that would get 90% of the way there. But in general, I think this is a pretty difficult and interesting problem.
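For anyone who wants to try this, here is a minimal sketch of the kind of word matching described above. It is not the code actually used for the post. The cran_packages vector is a small hand-written stand-in (an assumption, so the example runs without a network connection), and find_after_package() is a hypothetical helper showing one possible context rule.

```r
# Stand-in for the real package list. With network access you could use:
#   cran_packages <- rownames(available.packages())
cran_packages <- c("bamdit", "its", "lpSolveAPI")

abstract <- paste0(
  "In this work we present the R package bamdit, its name stands for ",
  "\"Bayesian meta-analysis of diagnostic test-data\"."
)

# Naive approach: split on anything that cannot appear in a package name,
# then look every word up in the package list.
words <- unlist(strsplit(abstract, "[^A-Za-z0-9.]+"))
words <- unique(words[nzchar(words)])
naive_hits <- intersect(words, cran_packages)
print(naive_hits)  # includes the false positive "its"

# Hypothetical context rule: keep only names that directly follow the
# word "package". This is one illustrative rule, not a general solution.
find_after_package <- function(text, pkgs) {
  pattern <- "(?<=\\bpackage\\s)[A-Za-z0-9.]+"
  candidates <- regmatches(text, gregexpr(pattern, text, perl = TRUE))[[1]]
  intersect(candidates, pkgs)
}

print(find_after_package(abstract, cran_packages))
```

A rule like this would miss any abstract that names a package without the word "package" in front of it, which is why a handful of such rules probably only gets part of the way there.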

I have been advised that the package listed as "WTAQ2" in the table above is actually called "kwb.wtaq" and is available on GitHub: https://github.com/KWB-R/kwb.wtaq.

The comments to this entry are closed.
