Tag Archives: data

How to Stream Internet of Things Data into Splunk in Ten Easy Steps!

Inspired by Discovered Intelligence’s blog post “How to Stream Twitter into Splunk in 10 Simple Steps” last week, I began thinking about a simple Internet of Things example where we could demonstrate an easy integration of IoT platforms and data into Splunk that everyone could access. There […]

Ethernet Interfaces Transform Object Storage

Commentary by Jim O’Reilly. New direct Ethernet interfaces for object-oriented storage will change the rules of storage, allowing for large performance gains. The idea of direct Ethernet drive interfaces dates back to at least 2001, although it generated very little interest among the very conservative storage clientele. A product platform […]

What the Internet of Things Can Learn from IBM's Smarter Cities Initiative

If you take a step back and look at the bigger picture, it’s clear that many of the concepts driving the Internet of Things (IoT) have been around for some time now. It took a while for the penny to drop, but […]

Wanna Race? Cloudera Says Impala is Faster than Hive and Proprietary RDMS

Cloudera made a big splash at O’Reilly Strata + Hadoop World 2013 in New York City last October when it announced its Enterprise Data Hub strategy. It wants the hub to be the place where companies park all of their data, regardless of its format, and from which […]

How Much Time to Conceive?

This morning my wife presented me with a rather interesting statistic: a healthy couple has a 25% chance of conception every month [1], which should result in a 75% to 85% chance of conception after a year. This sounded rather interesting and it occurred to me that […]
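As a rough check on the two figures quoted in this excerpt, here is a minimal sketch of the arithmetic. It assumes every month is an independent trial with the same 25% chance, which is a simplification of mine and not something the excerpt states; the function name `chance_within` is hypothetical.

```python
def chance_within(months: int, monthly_chance: float) -> float:
    """Probability of at least one success in `months` tries.

    Assumes each month is independent and has the same chance,
    which is a simplifying assumption, not a claim from the post.
    """
    # Chance of no success in any month, subtracted from 1.
    return 1.0 - (1.0 - monthly_chance) ** months


if __name__ == "__main__":
    # 25% per month over 12 months, the figures quoted in the excerpt.
    print(f"{chance_within(12, 0.25):.1%}")
```

Under that assumption the one-year figure comes out near 97%, noticeably higher than the quoted 75% to 85%, so the two numbers do not fit a constant, independent monthly chance.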

Getting Started with ElasticSearch

Introduction: ElasticSearch is an open-source, distributed search engine that is highly scalable and supports many enterprise search use cases. It’s built on top of Lucene (just like Apache Solr4). It supports real-time indexing and full-text search. You can read about Elastic Search […]

Pivotal Hadoop Distribution and HAWQ Realtime Query Engine

Introduction: SQL on Hadoop, with support for interactive, ad-hoc queries, is in increasing demand, and all the vendors are providing their answers to these requirements. In the open source world, Cloudera’s Impala, Apache Drill (backed by MapR), and Hortonworks’s Stinger initiative are competing in this market, […]

Spark: Low Latency, Massively Parallel Processing Framework

While Hadoop fits well in most batch processing workloads, and is the primary choice for big data processing today, it is not optimized for other types of workloads due to the following limitations: For a more detailed elaboration of the Hadoop limitations, refer to my previous post. […]

Semantic Web Business: Going Nowhere Slowly

By Seth Grimes. The semantic web vision persists, but the tools and processes don’t stand up to today’s data chaos. I’ve been a semantic web skeptic for years. SemWeb is a narrowly purposed replica of a subset of the World Wide Web. It’s useful for information enrichment in certain domains, […]

Splunk Digs Into the Year of the Big Data Application

Thursday, Jan 2nd 2014, by Mike Vizard. This year will be the one in which businesses discover the applications that turn all that data into something of business value. If 2013 was the year that most organizations discovered what […]