Nina Zumel, Ph.D.
Win-Vector LLC
The combination of R plus SQL offers an attractive way to work with what we call medium-scale data: data that's perhaps too large to work with gracefully in its entirety within your favorite desktop analysis tool (whether that be R or Excel), but too small to justify the overhead of big data infrastructure. In some cases you can use a serverless SQL database that gives you the power of SQL for data manipulation while maintaining a lightweight infrastructure. We call this work pattern "SQL Screwdriver": delegating data manipulation to SQL while keeping the infrastructure lightweight.

We assume for this how-to that you already have a PostgreSQL database up and running and that you want to work with its data in R. If you don't, follow the instructions at PostgreSQL downloads to install PostgreSQL for Windows, OS X, or Unix. If you happen to be on a Mac, then Postgres.app provides a "serverless" (or application-oriented) install option.
In the rest of this post, we give a quick how-to on using the RPostgreSQL package to interact with Postgres databases in R.
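
As a rough preview of the kind of workflow involved, here is a minimal sketch of connecting to Postgres from R, pushing a data frame into the database, and letting SQL do an aggregation. The database name, user, and table name are hypothetical placeholders, and the password is assumed to be stored in the PGPASSWORD environment variable; adjust all of these to match your own setup.

```r
library(DBI)
library(RPostgreSQL)

# Hypothetical connection settings: replace dbname, host, port, and user
# with your own. The password is read from an environment variable
# (an assumption for this sketch) rather than hard-coded.
drv <- dbDriver("PostgreSQL")
con <- dbConnect(drv,
                 dbname = "mydb",
                 host = "localhost",
                 port = 5432,
                 user = "analyst",
                 password = Sys.getenv("PGPASSWORD"))

# Copy a built-in R data frame into the database.
# "mtcars_demo" is a hypothetical table name; overwrite = TRUE replaces
# any existing table with that name.
dbWriteTable(con, "mtcars_demo", mtcars,
             row.names = FALSE, overwrite = TRUE)

# Let the database do the aggregation and return only the small result.
summary_df <- dbGetQuery(con, "
  SELECT cyl, COUNT(*) AS n, AVG(mpg) AS mean_mpg
  FROM mtcars_demo
  GROUP BY cyl
  ORDER BY cyl
")
print(summary_df)

# Release the connection and the driver when finished.
dbDisconnect(con)
dbUnloadDriver(drv)
```

The point of the pattern is that the grouping and averaging happen inside Postgres, so only the summarized rows are pulled back into R's memory.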