by Joseph Rickert
The R Consortium, the non-profit trade organization formed under the Linux Foundation to support the R language and the R Community, is beginning to build real momentum. First of all, two new companies recently joined the Consortium: Avant, which provides online personal and auto loans, and Procogia, a consulting firm that helps companies make data-driven business decisions. Having companies join the R Consortium that are not directly involved in making R tools and extensions is an exciting development that shows the reach of R and its importance to data-driven businesses of all types.
Last month the R Consortium announced the results of the first broad call for proposals. The Infrastructure Steering Committee (ISC) funded 7 projects and 3 working groups. The projects included:
A unified framework for doing distributed computing in R, proposed by Michael Lawrence (an R core member), Edward Ma and Indrajit Roy, was awarded $10K. This project grew out of the workshop on distributed computing that was held at HP in January 2015. The immediate work is to build out the ddR package to include more algorithms and add more "back end" distributed platforms. However, since the project could have significant influence on future work on distributed computing, it is also being organized as a working group (see below).
An effort proposed by Kirill Müller to improve the DBI interface was awarded $25K. The general idea is to create a DBI specification, centralized tests and boilerplate that will make it easier to connect to databases, while improving the existing DBI interfaces RMySQL, RPostgres and RSQLite.
The creation of a one-day RIOT (R Implementation, Optimization and Tooling) workshop aimed at identifying R language development and tooling opportunities was awarded $10K. Proposed by Mark Hornick, Lucas Stadler and Adam Welc, it is focused on increasing the involvement of the R Community in this area.
RL10N, a project proposed by Richie Cotton and Thomas Leeper to make it easier to localize R packages, was awarded $10K. The work here will focus on improving the msgtools package and developing two new packages: MTurkR for managing local translation and translateR for automated translations.
SatRdays, an effort proposed by Steph Locke and Gergely Daroczi to create a series of one-day, community-led regional R conferences, was awarded $10K.
John Blischak, Jonah Duckles, Laurent Gatto, David LeBauer and Greg Wilson of Software Carpentry were awarded $10K to create a two-day, in-person course to train R instructors. The course is intended to show accomplished R programmers how to teach R programming to adult learners.
Edzer Pebesma of the Institute for Geoinformatics at the University of Muenster was awarded $10K to create an R package and other materials to simplify the R analysis of geospatial data. The idea is to comply with the Simple Features standard for the access and manipulation of spatial vector data.
A new development that came out of the recent round of funding is that of organizing collaborative projects as "Working Groups" under the ISC. The idea here is for the ISC to provide the structure for projects of some significance that may not need immediate funding but require consensus and collaboration before progress can be made. As you might expect, the ISC is hesitant to fund any project that has the potential to influence the direction of the development of the R language without achieving some level of consensus from the knowledgeable and interested members of the extended R Community.
These considerations led to an ISC vote to organize a working group around the ddR project in addition to providing initial funding. Even if the very promising technical approach implemented in ddR turns out to be the preferred way to provide a foundation for distributed computing in R, discussions about what “backends” to build out ought to include representatives from the broader R Community. Should connecting to Spark, for example, be a priority for the R community?
Two additional working groups were also authorized by the ISC.
Adam Welc, Lucas Stadler and Mark Hornick proposed a working group to assess R’s native API. Their proposal asserted that the current API is broader than necessary, redundant and, in many cases, tied to the internals of GNU R. The working group will seek consensus from the R Core group, R developers and the greater R Community on designing an easy-to-understand, consistent and verifiable API.
Shivank Agrawal, Santosh K Chaudhari, Chen Liang, Qin Wang, Vlad Sharanhovich and Mark Hornick proposed a working group to seek direction and consensus for developing tools for R that determine code coverage upon execution of a test suite, and to promote code coverage more systematically within the R ecosystem.
I am hopeful that the ISC working groups will be able to foster meaningful discussion and involve the greater R Community in the work of the ISC.
Finally, Gábor Csárdi is making very good progress with R-Hub, a platform for building, testing and validating R packages. This major project, funded last year with an initial budget of $80K, should streamline and improve the package building experience when completed. The following figure from a recent presentation to the R Consortium’s board illustrates the R-Hub architecture. The part in green indicates the prototype structures that have been built out.
For the details, check out the project on GitHub.
Having been a member of the R Consortium’s Infrastructure Steering Committee (ISC) for nearly a year now, I am convinced that, in the long run, the success of the R Consortium will hinge on three factors: (1) the impact of the projects it funds, (2) the ability of the ISC to guide and, in the case of working groups, help manage projects to successful completion, and (3) the interest and participation of the greater R Community.
We will know we are succeeding at this last factor if we can attract Community experts into the working groups and continue to receive proposals for high-quality projects.
Here is my take on the problems that the ISC perceived with proposals that were rejected during the last call for proposals, along with some ideas about formulating a fundable project.
Projects that were rejected displayed one or more of the following shortcomings. They were too specific, too small, too ambitious for the resources requested, too open-ended, or proposed to do something that already exists or has been partially implemented somewhere else.
On the flip side, here are some ideas that you might find helpful in writing a proposal for this next round of funding:
- Think big: shoot for something that will benefit a sizable portion of the R Community for years to come
- Collaborate: seek expert opinion about your ideas and find potential collaborators
- Do your homework: make sure you understand what relevant work already exists
- Write a detailed proposal and carefully estimate work, resources and money required
- Be careful what you ask for!
If you think that you have an idea for a project that can benefit a significant portion of the R Community, please find some collaborators, strap yourselves in and start writing. The window for submitting a proposal for the next round of funding closes on July 10th. To submit a proposal or find more information, please visit https://www.r-consortium.org/about/isc/proposals or email [email protected].
Here is the link to the mailing lists for the ISC working groups.