Tag Archives: hadoop

Wanna Race? Cloudera Says Impala is Faster than Hive and Proprietary RDMS

Cloudera made a big splash at O’Reilly Strata + Hadoop World 2013 in New York City last October when it announced its Enterprise Data Hub strategy. It wants the hub to be the place where companies park all of their data, regardless of its format, and from which […]

How Much Time to Conceive?

This morning my wife presented me with a rather interesting statistic: a healthy couple has a 25% chance of conception every month [1], which should result in a 75% to 85% chance of conception after a year. This sounded rather interesting and it occurred to me that […]

Pivotal Hadoop Distribution and HAWQ Realtime Query Engine

SQL on Hadoop and support for interactive, ad-hoc queries in Hadoop are in increasing demand, and all the vendors are providing their answers to these requirements. In the open source world, Cloudera’s Impala, Apache Drill (backed by MapR), and Hortonworks’s Stinger initiative are competing in this market, […]

Spark: Low Latency, Massively Parallel Processing Framework

While Hadoop fits well in most batch processing workloads, and is the primary choice for big data processing today, it is not optimized for other types of workloads due to the following limitations: For a more detailed elaboration of Hadoop's limitations, refer to my previous post. […]

Splunk Digs Into the Year of the Big Data Application

Thursday, Jan 2nd 2014, by Mike Vizard. This year will be the one in which businesses discover the applications that turn all that data into something of business value. If 2013 was the year that most organizations discovered what […]

CMWire's Top 10 Hits of 2013: Big Data

Yes, Big Data was a Big Buzzword in 2013. The technology and business press — and even mainstream media — got a piece of the action, churning out article after article about what Big Data means to you. And that’s part of the problem. Big Data means lots of […]

Using Amazon’s Elastic MapReduce to Compute Recommendations with Apache Mahout 0.8

Apache Mahout is a “scalable machine learning library” which, among other things, contains implementations of various single-node and distributed recommendation algorithms. In my last blog post, I described how to implement an on-line recommender system processing data on a single node. What if the data is too large to fit […]

Make Big Data Portable: the Basics

By Soam Acharya. If you’re reading this, then you probably know that we’re very much pro Hadoop-as-a-Service. Obviously, many organizations we speak to have concerns about the logistics of transporting all their data. While at first glance this process can appear intimidating, it’s actually a lot easier than many suspect, […]

Cloudera's Enterprise Data Hub Rises to the Call of Amazon's AWS

Someone joked at Strata and Hadoop World earlier this year that Cloudera was ahead of its time when it chose its name. “You should have called it On-premise era,” said the would-be comedian, referring to the fact that Cloudera and most other enterprise-grade Hadoop distros live […]

Hadoop gets native R programming for big data analysis

Sensing a growing interest in big data-style analysis, software provider Revolution Analytics has updated its flagship package of R statistical functions so it can be run with the Hadoop data processing platform. Revolution R Enterprise 7 (RRE 7), to be made available on Monday, also features the ability […]