The American Community Survey, conducted by the US Census Bureau, collects data from around 3.5 million households each year to estimate demographic statistics of the US population, including appliances installed in the home, languages spoken, work experience and much more (here's the complete data dictionary). The data science competition platform Kaggle recently introduced a library of hosted datasets, and the American Community Survey is one of the datasets available.
Kaggle users can also publish scripts (in R, Python, or Julia) that analyze the datasets and link each analysis to the data it uses. For example, Kaggle user "A.M.A." used R to look at the educational attainment and income data from the ACS to decide whether it's worth pursuing a PhD.
As it turns out, there's a surprisingly distinct jump in income range as educational attainment rises from Bachelor's to Master's to Doctorate. (The horizontal axis is in units of $10,000.) You can find more analyses of the data by Kaggle users at the link below.
Kaggle datasets: 2013 American Community Survey
I wonder if the data is somewhat normalized, because the range is very small (about $1k for each level of education). Assuming it is something like mean income in the first year after school, I think the difference is quite small (about $3k between a BSc and a PhD).
Posted by: Owe Jessen | January 26, 2016 at 00:57
PS: I looked at the source, and it now becomes clearer: the horizontal axis is log10(income), so 4.6 is about 40k and 4.8 is about 63k, and the OP plotted the distribution of the medians of samples he took from the original data. This explains it to me, but I think it's a curious treatment of the income variable.
Posted by: Owe Jessen | January 26, 2016 at 01:06
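To make the axis reading in the comment above concrete, here is a minimal Python sketch of the two steps described there: converting a log10(income) tick back to dollars, and building a distribution of medians from repeated samples. This is not the original R script. The function names, the sample sizes, and the synthetic log-normal incomes are all assumptions chosen for illustration; the real analysis used the ACS income data.

```python
import math
import random
import statistics


def log10_to_dollars(tick):
    """Convert a log10(income) axis value back to dollars."""
    return 10 ** tick


def sample_medians(incomes, n_samples=200, sample_size=50, seed=0):
    """Draw repeated random samples and return the median of each one.

    n_samples and sample_size are illustrative defaults, not values
    taken from the original Kaggle script.
    """
    rng = random.Random(seed)
    return [
        statistics.median(rng.sample(incomes, sample_size))
        for _ in range(n_samples)
    ]


# Reading the axis: a tick at 4.6 or 4.8 corresponds to these dollar amounts.
for tick in (4.6, 4.8):
    print(f"log10(income) = {tick} -> ${log10_to_dollars(tick):,.0f}")

# Synthetic stand-in for one education level's incomes (assumption: log-normal
# with a median of $40,000). Replace with the real ACS income column.
data_rng = random.Random(42)
incomes = [data_rng.lognormvariate(math.log(40_000), 0.6) for _ in range(5_000)]

# The plotted quantity would then be the log10 of each sample median.
medians = sample_medians(incomes)
log_medians = [math.log10(m) for m in medians]
print(f"log10(median) ranges from {min(log_medians):.2f} to {max(log_medians):.2f}")
```

Because each point is the median of a sample rather than an individual income, the resulting distribution is much narrower than the raw income distribution, which is why the ranges in the chart look so small.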