
November 16, 2010



I didn't know that the digital humanities field was as new as it sounds here. Booklamp.org has been around for 7 years now. I have been working as an analyst for Booklamp for over 2 years, and it is astonishing what sorts of information can be derived from simple text.

About the debate over individual versus quantitative analysis: they're complementary, period. Which one you need depends on your question. I will say that the sort of information we (mass-)produce on a daily basis at Booklamp is nowhere near possible using only the human mind. There's a lot to be learned, and (apparently) the push is just getting started.

PS The technology that can be experienced at Booklamp.org for public (pre-beta) consumption is about 2.5 years old now - FYI. We've kept everything to ourselves since then, but it'll make it to the public in time, and you'll never read a bad book again.

I thought I'd add a rough-n-tumble bit of R code to this Humanities blog post:

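## the bare 2 is coerced to the string '2' because c() mixes it with characters
text1 <- c('a', 'b', 'c', 'd', 2)
text2 <- c('1', 'b', 'e', 2, 'a', 'f')

length1 <- length( text1 ) ## 5
length2 <- length( text2 ) ## 6
overlap <- sum( text1 %in% text2 ) ## 3 ('a', 'b' and '2')

## twice the shared count, divided by the combined length of both texts
overlap_percentage <- ( 2 * overlap ) / ( length1 + length2 )

## overlap_percentage = 6/11 = 0.545, i.e. about 55%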

... rough-n-tumble fer sure; but fun nonetheless.
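When neither vector repeats a token, the ratio above is the Sørensen-Dice coefficient: twice the number of shared tokens over the total number of tokens. Here is a minimal sketch that wraps it in a function. The name dice_overlap is hypothetical, and dropping duplicates with unique() is an assumption that the snippet above does not make.

```r
# Hypothetical helper (not from the original comment): Dice-style overlap
# between two token vectors.
# Assumption: repeated tokens are dropped first, so each token counts once.
dice_overlap <- function(x, y) {
  x <- unique(as.character(x))
  y <- unique(as.character(y))
  # Two empty inputs have no defined overlap; return NA instead of 0/0.
  if (length(x) + length(y) == 0) return(NA_real_)
  shared <- length(intersect(x, y))
  (2 * shared) / (length(x) + length(y))
}

text1 <- c('a', 'b', 'c', 'd', 2)
text2 <- c('1', 'b', 'e', 2, 'a', 'f')
score <- dice_overlap(text1, text2)
round(score, 2)  ## 0.55
```

On real text, the token vectors would come from splitting each document into words first, and repeated words would then count only once under this de-duplicating assumption.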

The comments to this entry are closed.
