Tag Archives: reduce

EnterpriseDB Says JSON Toolkit Nixes NoSQL Drawbacks

NoSQL databases start out easy, but you’ll later struggle with data logic, says EnterpriseDB. JSON toolkit promises best of NoSQL and RDBMS. EnterpriseDB on Tuesday introduced a free developer kit…

Pivotal Hadoop Distribution and HAWQ Realtime Query Engine

SQL on Hadoop and support for interactive, ad-hoc queries in Hadoop are in increasing demand, and all the vendors are providing their answers to these requirements. In the open source world, Cloudera’s Impala, Apache Drill (backed by MapR), and Hortonworks’s Stinger initiative are competing in this market, […]

Spark: Low Latency, Massively Parallel Processing Framework

While Hadoop fits well in most batch processing workloads and is the primary choice for big data processing today, it is not optimized for other types of workloads because of its limitations. For a more detailed elaboration of Hadoop’s limitations, refer to my previous post. […]

Splunk Digs Into the Year of the Big Data Application

Thursday Jan 2nd 2014, by Mike Vizard. This year will be the one in which businesses discover the applications that turn all that data into something of business value. If 2013 was the year that most organizations discovered what […]

Make Big Data Portable: the Basics

By Soam Acharya. If you’re reading this, then you probably know that we’re very much pro Hadoop-as-a-Service. Obviously, many organizations we speak to have concerns about the logistics of transporting all their data. While at first glance this process can appear intimidating, it’s actually a lot easier than many suspect, […]

How Ancestry.com Manages Generations Of Big Data

By Jeff Bertolucci. Over the past year, the genealogy site’s repository of family historical data has more than doubled in size. Here’s how Ancestry managed its growth. Businesses often use, or overuse, the term “big data” to describe all sorts of data-related products and services, but the buzzword […]

Object Storage: The Next Storage Paradigm

By Jim O’Reilly. Object storage is evolving from a data archive to the primary form of storage in large systems. I remember my first object store in 2007. Using a COTS x86 server with 6 TB of storage, it was powered by Caringo software. I needed a cluster […]

Five ways to handle Big Data in R

Five strategies to tackle big data with R. Big data was one of the biggest topics at this year’s useR conference in Albacete, and it is definitely one of today’s hottest buzzwords. But what defines “Big Data”? And on the practical side: How can big data be tackled in […]

UPS Nets Huge Fuel Savings With Analytics

Constructive dissatisfaction. That’s what UPS calls its ongoing quest for process improvement that brought about ORION, an On-Road Integrated Optimization and Navigation system that will save the shipper 1.5 million gallons of fuel in […]

Spatial Clustering With Equal Sizes

This is a problem I have encountered many times, where the goal is to take a sample of spatial locations and apply constraints to the algorithm. In addition to providing a predetermined number of K clusters, a fixed size of elements needs to be held constant within […]