Tag Archives: map

GenBase: A Benchmark for the Genomics Era

By Rebecca Taft, MIT CSAIL*

Genomics is quickly becoming the focus of many Big Data scientists due to the seemingly sudden availability of vast amounts of data. As mentioned in a previous post, a single gene-sequencing facility can sequence 2000 people per day and produce 6 TB of data […]

Hoya (HBase on YARN) : Application Architecture

At Hadoop Summit in June, we introduced a little project we’re working on: Hoya: HBase on YARN. Since then the code has been reworked and is now up on GitHub. It’s still very raw, and requires some local builds of bits of Hadoop and HBase – but it […]

The Eve of the NYC Democratic Mayoral Primary Election

It is the eve of the New York City Democratic mayoral primary election.  This is a simple follow-up on my post from last Friday as I was curious how the final pre-Election Day polling was going to fall.  It’s fairly clear who is going to get the most votes. […]

Text Mining the Complete Works of William Shakespeare

I am starting a new project that will require some serious text mining. So, in the interests of bringing myself up to speed on the tm package, I thought I would apply it to the Complete Works of William Shakespeare and just see what falls out. The first order […]

Addressing Big Data Security

Data security rules have changed in the age of big data. The V-Force (Volume, Velocity and Variety) has changed the landscape for data processing and storage in many organizations. Organizations are collecting, analyzing and making decisions based on analysis of massive data sets from various sources such […]

How Do You Map America’s Scary Shortage of Fresh Food?

Next time you’re going to complain about lugging a few bags back from the grocery store on foot, you should really take a look at Nathan Yau’s most recent data visualization. The statistician behind Flowing Data has plotted the nation’s food deserts, which by definition is any place residents […]

Chronic Illnesses Outpace Infections As Big Killers Worldwide

People around the world are getting healthier and living longer. Infectious diseases are declining around the globe. But at the same time, chronic health problems are on the rise, particularly in developing nations. These are some of the key findings in the latest reports released by the World Bank […]

Introducing Weblog Add-on

Another exciting day at Splunk and another great product release! I am thrilled to announce the release of Weblog Add-on. During .conf2011, we announced the beta release of Splunk App for Web Intelligence. We learned quite a bit from this beta release. After over 7500 downloads of the Web Intelligence […]

Unleashing NASA MODIS Data for Earth and Ocean Scientists

By Leilani Battle, MIT CSAIL; James Frew, University of California, Santa Barbara; and Bill Howe, University of Washington

With the massive influx of data streaming in from telescopes, satellites, sequencers, imagers and other instruments, research in the physical and biological sciences is becoming more […]

Big Data Bits Featuring Couchbase, Datawatch, Panopticon, MongoDB, Hadoop 2.x

Europe seems to be on holiday this week and in North America it’s the unofficial last week of summer. But in the world of Big Data there’s no time for reclining or relaxing. Though we can’t cover everything, here are a few bits of news we find worth mentioning. […]