by Stephen Weller, Senior Support Engineer at Revolution Analytics, and Joseph Rickert
For someone trying to learn any new technology, getting help with a problem on a public forum can be stressful. Knowing where to go, deciding how to pose a question and figuring out how to deal with a response can be challenging. Moreover, an unpleasant interaction could be ego-bruising and a real setback to learning. Before posting a question on an internet forum, do everything you can to make it a positive experience for everyone involved. Here are some recommendations Steve and I have for getting help with R questions.
Preliminary Work
Two of the most novice-friendly places to go for help in the R world are the R-Help mailing list on CRAN and the R section of Stack Overflow. Both of these forums are monitored by experts who are very willing to patiently answer questions, but not always well disposed towards mind-reading. Maximize your chances of getting a quick, positive response by formulating your question or problem as clearly as possible, with minimum ambiguity. And then: do your homework. The R-project posting guide shows several ways to search for R help, lists the common mistakes people make in posting questions, and provides a host of details on the resources available for getting help and the mechanics of using the various R mailing lists.
Stack Overflow provides some excellent suggestions on posting questions. Doing the work to thoroughly research your question is also at the top of their list. Moreover, they point out that taking the trouble to do this makes you a valuable contributor to the R community. They write:
Sharing your research helps everyone. Tell us what you found ...and why it didn’t meet your needs. This demonstrates that you’ve taken the time to try to help yourself, it saves us from reiterating obvious answers, and above all, it helps you get a more specific and relevant answer!
Posting Your Question
When it comes time to post your question, you may find Steve's guidelines helpful. These are based on years of troubleshooting problems as a member of Revolution’s Technical Support organization.
- You should always include information on which version of R and what flavor of operating system you are running (see 'R.version' and 'Revo.version' in R)
- Be as specific as you can in describing your problem. Others will often need to duplicate the issue, even if they have worked with similar code in the past. Simply saying 'function xxx doesn't work' is not helpful enough. Ask yourself what someone else might need to reproduce the problem on their system.
- Include sufficient context information, so that someone reading your question understands what your goal is with running your code or in doing a set of calculations in R.
- If the problem occurs in complicated or lengthy code, identify the problem function and, when possible, provide a simpler reproducible example for others.
- If you are having problems with an analytic function, either provide test data or provide information on how to reproduce your problem using built-in data in R (for example, using 'kyphosis', 'airquality', etc.)
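The guidelines above can be sketched in a few lines of R. This is only an illustration, not a required template: it uses base R alone, the built-in 'airquality' data, and a model formula chosen arbitrarily for the example. 'Revo.version' is left out because it exists only in Revolution R.

```r
# Environment details worth pasting into a post (base R only).
r_version <- R.version.string      # the R version as one line of text
r_platform <- R.version$platform   # the platform / operating system flavor
cat(r_version, "\n")
cat(r_platform, "\n")
# sessionInfo() prints the same details plus the attached packages.

# A small reproducible example built on the built-in 'airquality' data.
# The formula is arbitrary; substitute the call that misbehaves for you.
data(airquality)
fit <- lm(Ozone ~ Temp + Wind, data = airquality)
n_total <- nrow(airquality)   # rows in the data set
n_used <- nobs(fit)           # rows lm() kept after dropping missing values
cat("Rows in data:", n_total, "\n")
cat("Rows used by lm():", n_used, "\n")

# If the problem needs your own data, dput() writes a small sample as R code
# that others can paste straight into their session.
dput(head(airquality, 3))
```

Pasting the code together with the output it prints gives readers the version, the platform, the data and the exact call in one place.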
Finally, here are three examples from the Revolution Technical Support archives that illustrate good and bad posts. Two are examples of what Steve calls "pretty well framed support questions" and one is an example of a question that lacks needed information.
#1 - a good post
A few folks here are trying to load the rJava library, they have JAVA_HOME set to their 64 bit java (1.6) but are getting this error.
call: inDL(x, as.logical(local), as.logical(now), ...)
error: unable to load shared library 'z:/R/win64-library/2.11/rJava/libs/x64/rJava.dll':
LoadLibrary failure: The specified module could not be found.
We get this when trying to load the library in the console line. I am sure we’re missing something, but we are not sure what.
thanks
What makes this a good post is that it provides information on the versions of Java and R being run and provides a complete error message.
#2 - a good post
Platform: Windows (32-bit)
I am working with a 1.9 GB SPSS data file with 99 variables and 4,684,587 cases. When I try to read the file into Revolution Analytics R using the following command:
inDataFileR3C <- "D:/2012 Base Year/RevolutionR/RandomVariables3C.sav"
reaValExtData <- rxImport(inData = inDataFileR3C, outFile = "D:/2012 Base Year/RevolutionR/RandomVariables3C.xdf", stringsAsFactors = TRUE,rowsPerRead = 50000)
I get the following error message:
Rows Read: 50000, Total Rows Processed: 2550000, Total Chunk Time: 12.152 seconds
Rows Read: 50000, Total Rows Processed: 2600000Failed to allocate 15300000 bytes.
Error in rxCall("Rx_ImportDataSource", params) : bad allocation
However, if I break the SPSS data file into 2 parts, one with 2,300,000 cases and the second with 2,384,857 cases, both parts can be read into R successfully.
Thank you,
This post provides very specific information on the error involved and on what the user did to troubleshoot the problem.
#3 - Not a good post
Hi,
On a number of occasions I have been importing fairly large csv's (2 - 3 million rows). I know these are properly formatted (e.g. data is encapsulated by double quotes) and have row counts from using the wc command in a Unix environment. When I import these using rxImport, fewer rows are imported. Is there any reason why this might occur? No errors are reported and the job seems to complete successfully. Changing the number of rowsPerRead doesn't seem to make any difference.
Thanks in advance for any advice.
This question is missing the key information required to reproduce and troubleshoot the problem:
- How is the data file being imported delimited (comma-delimited CSV, or something else)?
- What operating system is involved (Linux, Windows)?
- What version of R is running?
- The post does not provide an example of the R code that led to the problem.
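As a rough sketch of what a self-contained example for a question like #3 could look like, the code below builds a tiny CSV file and compares its physical line count with the number of rows imported. Several things here are assumptions made purely for illustration: base R's read.csv stands in for rxImport, the file contents are invented, and the embedded line break is just one hypothetical way the two counts can differ, not a diagnosis of the original problem.

```r
# Build a tiny CSV in a temporary file. The record with id 2 contains a
# line break inside a quoted field (invented data, for illustration only).
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  '"id","comment"',
  '"1","fine"',
  '"2","line one',
  'line two"',
  '"3","also fine"'
), csv_path)

# Physical lines in the file, header included: what a line counter sees.
physical_lines <- length(readLines(csv_path))

# Rows actually imported. read.csv is a stand-in for rxImport here.
imported <- read.csv(csv_path, stringsAsFactors = FALSE)
imported_rows <- nrow(imported)

cat("Physical lines (incl. header):", physical_lines, "\n")
cat("Rows imported:", imported_rows, "\n")
```

A post that includes a small file like this, the import call, and both counts gives readers something they can run and compare on their own systems.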
Steve estimates that roughly 50% of the time the support engineers at Revolution Analytics have to ask for more information. When you post a request for help, do your best to become part of the solution.
Thanks for some great info & examples!
Re novice friendly: Stack overflow on R is extremely useful, but "novice friendly" it is not -- or at least not compared to, say, some of the Python and Ruby communities I've frequented. While there are plenty of really nice R folks on stack overflow, there are too many people who are curt or just plain jerks for the R newbies I know to feel comfortable asking for help.
I can certainly understand why folks providing help can get irritated at having to keep asking the same questions -- boy do I understand. But if a site wants to be novice friendly, folks who respond have got to be a little more relaxed. It's ok not to want to have to deal with that, but then it's not in any meaningful sense novice friendly. Being a novice can be overwhelming, involving a lot of flailing about, and so expecting that novices will usually have well formed questions after they may have just spent a few hours -- or days -- banging their head against a seemingly intractable problem is a lot to ask. I love stack overflow's coverage of R, and I've gotten an enormous amount out of it. But novice friendly? Not so much.
Speaking of trying to be novice friendly, if you're trying to help novices get better at asking for help, I would not call a section, "not a good post." To a lot of newbies it's going to come off as too harsh; one novice I asked compared it to getting smacked in the nose with a rolled up newspaper -- clearly not the effect you were going for. :) Instead, I'd flip around the structure so it goes from not-good post to good post, with titles like these:
1) A Typical Newbie Post or
1) A Typical Post That's Missing Info
2) A Better Post or
2) A More Focused Post
Same info, more novice friendly vibe.
Posted by: Anders Schneiderman | January 02, 2014 at 17:07
Anders, I couldn't agree with you more about some SO members being jerks or curt. Here's an example ...
http://stackoverflow.com/questions/1923273/counting-the-number-of-elements-with-the-values-of-x-in-a-vector
A valid and sensible question, and some arse feels the need to write the following comment ...
'homework... or someone learning R. Or both! shrug'.
At present, 13 other R 'superstars' have +1 that comment.
Well, I am a newbie to R, and I would not have the kahuna's to ask a question on SO because that comment is not that uncommon. I don't understand the superior attitude of these folks ... how did they learn to use R ?
John.
Posted by: John | January 31, 2014 at 03:10