## Benchmarking Graph Databases

By Alekh Jindal, MIT CSAIL. Graph data management has recently received a lot of attention, particularly with the explosion of social media and other complex, inter-dependent datasets. As a result, a number of graph data management systems have been proposed. But this brings us to the question: What […]

## Wolfram Science Summer School 2013: Bigger and Better

The Wolfram Science Summer School is an intense three-week course that furthers people’s careers by teaching them the ideas and methods used by Stephen Wolfram and his advanced research team. This year we had 47 students and 10 instructors, more than we have ever had since we started back […]

## Introduction to ElasticSearch

ElasticSearch is an open source tool written in Java. It is a Lucene-based, scalable, full-text search engine and a data analysis tool. In today’s world of information technology, a huge amount of data is produced at every moment in social media, in video sharing sites, and in medium- […]

## Splunking Foursquare

I tend to travel quite a bit in my role at Splunk. The other day I was wondering to myself how far I had traveled in the last week, the last month, the last year. It just so happens that I am a Foursquare user, not because […]

## Laplace the Bayesianista and the Mass of Saturn

I’m reviewing Bayes’ theorem and related topics for the upcoming GDAT class. In its simplest form, Bayes’ theorem is a statement about conditional probabilities. The probability of A, given that B has occurred, is expressed as:

\begin{equation}
\Pr(A|B) = \dfrac{\Pr(B|A) \times \Pr(A)}{\Pr(B)} \label{eqn:bayes}
\end{equation}

In Bayesian language, $\Pr(A|B)$ is called the […]
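As a rough illustration of the conditional-probability statement in the excerpt above, here is a minimal Python sketch. The function name `bayes_posterior` and all of the numbers are hypothetical and chosen only for illustration; they do not come from the original post. Pr(B) is expanded with the law of total probability, which is an added assumption about how the denominator would be obtained in practice.

```python
def bayes_posterior(likelihood, prior, evidence):
    """Return Pr(A|B) = Pr(B|A) * Pr(A) / Pr(B)."""
    if evidence <= 0:
        raise ValueError("Pr(B) must be positive")
    return likelihood * prior / evidence


# Hypothetical inputs (not from the original post).
prior = 0.01           # Pr(A)
likelihood = 0.9       # Pr(B|A)
false_positive = 0.05  # Pr(B|not A)

# Law of total probability: Pr(B) = Pr(B|A)Pr(A) + Pr(B|not A)Pr(not A).
evidence = likelihood * prior + false_positive * (1 - prior)

posterior = bayes_posterior(likelihood, prior, evidence)
print(f"Pr(A|B) = {posterior:.3f}")
```

With these made-up inputs, a small prior keeps Pr(A|B) well below Pr(B|A), which is the usual reminder that the two conditional probabilities are not interchangeable.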

## Social Media Marketing: How Big Data is Changing Everything

Every second of every day, Big Data gets bigger. Social media alone generates endless streams of data, flowing in from Facebook, Twitter, Pinterest and other social sites like never before. Fortunately, sophisticated analytics platforms have arrived on the scene to help social media marketers manage, analyze and leverage large […]

## How Data Visualization Experts See the Future

Without a way to show correlations, trends and outliers, organizations now busy collecting multiple terabytes of data won’t be able to pull out actionable insights. Fortunately, data visualization software tools have been keeping […]

## How to Plan and Configure YARN and MapReduce 2 in HDP 2.0

As part of HDP 2.0 Beta, YARN takes the resource management capabilities that were in MapReduce and packages them so they can be used by new engines. This also streamlines MapReduce to do what it does best: process data. With YARN, you can now run multiple applications in Hadoop, all […]

## Apache Tez: A New Chapter in Hadoop Data Processing

In this post we introduce the motivation behind Apache Tez (http://incubator.apache.org/projects/tez.html) and provide some background around the basic design principles for the project. As Carter discussed in our previous post on Stinger progress, Apache Tez is a crucial component of phase 2 of that project. What […]

## Bringing Big Data Analytics into Focus for Marketers: 3 Principles to Simplify Your Life

How’s it going with figuring out how you’re going to use “big data” to get more return on your campaigns? Or understand which customers to target for upsells or new […]