
November 16, 2010



I didn't know that the digital humanities field was as new as it sounds here. Booklamp.org's been around for 7 years now. I've been working as an analyst for Booklamp for over 2 years now, and it is astonishing what sorts of information can be derived from simple text.

About the debate over individual and quantitative analysis - they're complementary, period. It depends on your question. I will say that the sort of information we (mass-)produce on a daily basis, at Booklamp, is nothing close to possible using only the human mind. There's a lot to be learned, and (apparently) the push is just getting started.

PS The technology that can be experienced at Booklamp.org for public (pre-beta) consumption is about 2.5 years old now - FYI. We've kept everything to ourselves since then, but it'll make it to the public in time, and you'll never read a bad book again.

I thought I'd add a rough-n-tumble bit of R code to this Humanities blog post:

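## two toy "texts" as token vectors; c() coerces a bare 2 to the string '2' anyway
text1 <- c('a', 'b', 'c', 'd', '2')
text2 <- c('1', 'b', 'e', '2', 'a', 'f')

length1 <- length( text1 ) ## 5
length2 <- length( text2 ) ## 6
## tokens of text1 that also appear in text2: 'a', 'b', '2'
overlap <- sum( text1 %in% text2 ) ## 3

## shared tokens counted once for each text, divided by the combined length
overlap_percentage <- ( 2 * overlap ) / ( length1 + length2 )

## overlap_percentage = 6 / 11, about 55%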
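
The same formula can be wrapped up so it runs on actual sentences instead of hand-typed vectors. This is only a minimal sketch: the names tokenize and overlap_score are made up for illustration, and the tokenizer (lowercase, split on anything that isn't a letter or digit, drop duplicates) is an assumption about what counts as a "word". With duplicates dropped, the score has the same form as the Sørensen-Dice coefficient.

```r
## hypothetical helper: split a string into unique lowercase word tokens
tokenize <- function(text) {
  tokens <- strsplit(tolower(text), "[^a-z0-9]+")[[1]]
  unique(tokens[tokens != ""])   # drop empty strings and repeated words
}

## hypothetical helper: 2 * shared tokens / (tokens in a + tokens in b)
overlap_score <- function(a, b) {
  tokens_a <- tokenize(a)
  tokens_b <- tokenize(b)
  total <- length(tokens_a) + length(tokens_b)
  if (total == 0) return(NA_real_)   # nothing to compare
  shared <- sum(tokens_a %in% tokens_b)
  (2 * shared) / total
}

## two shared words ("the", "brown") out of 4 + 4 tokens
overlap_score("The quick brown fox", "The lazy brown dog")
```

Dropping duplicates is a design choice: without it, a word repeated many times in the first text would be counted as a match every time it appears.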

... rough-n-tumble fer sure; but fun nonetheless.

