Pivotal: So who fancies skinny-dipping in our 'Business Data Lake'? PS: It'll cost you

EMC and VMware's spin-off Pivotal is trying to make money from a bevy of open-source big-data-crunching technologies by making their most complicated aspects disappear. With the launch of Hadoop distribution Pivotal HD 2.0 and analytics engine Pivotal GemFire XD on Monday, the company has created a clutch of technologies that …

COMMENTS

This topic is closed for new posts.
  1. BlueGreen

    so...

    > an in-memory SQL data store [...]

    "in-memory" meaning what here? Guaranteed to be in memory always? I don't think so, given the amounts of data Hadoop may handle.

    > that sits upon data stored in the Hadoop File System (HDFS),

    So a data store that sits on a data store. Or rather, an (SQL-ish, DDL-ish) interface that sits on a data store? Or what?

    > and an engine called HAWQ that can query data via SQL.

    Like Hive then. Smells of NIH.

    Perhaps an engineer from Pivotal can explain, thanks.

    1. mclarenfan

      Re: so...

      GemFireXD is meant to be an OLTP database that also stores data on HDFS. It stores all data in memory, but allows evicting data from memory based on a SQL clause.

      For example, if you had a TRADES table, you can choose to keep only the last month of data in memory for super fast access, although GemFireXD will also be able to read/query historic data on HDFS too. You will not have to archive your data.

      All (even the most recent) data is available on HDFS to map-reduce jobs/HAWQ for deep analytics.

      1. BlueGreen

        Re: so...

        Thanks for the reply, sorry I didn't see it earlier. Unfortunately I'm still unclear.

        > GemFireXD is meant to be an OLTP database that also stores data on HDFS.

        I thought HDFS was unsuitable for OLTP as it's optimised for few writes but many reads?

        > It stores all data in memory

        If it can do that, then you don't have much data; probably not enough to justify using Hadoop.

        > but allows evicting data from memory based on a SQL clause. For example, if you had a TRADES table, you can choose to keep only the last month of data in memory for super fast access, although GemFireXD will also be able to read/query historic data on HDFS too. You will not have to archive your data.

        And if you just limit yourself to querying the last month's data, then an SQL DB will also keep it in memory. And what's this about archiving data anyway? But if, as you said, "It stores all data in memory", then why evict any of it?

        And why not use Hive?
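
To make the disagreement above concrete, here is a minimal sketch of the tiering mclarenfan describes: every write lands in a durable tier, and a predicate decides which rows also stay in memory. It is a toy model only, not GemFire XD's actual API or DDL. The names `TieredTradeStore`, `evict_if`, `older_than_a_month`, and the fixed date `TODAY` are all hypothetical, and a plain dict stands in for HDFS.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Trade:
    trade_id: int
    symbol: str
    traded_on: date


class TieredTradeStore:
    """Toy two-tier store: all rows are durable, only 'hot' rows stay in memory."""

    def __init__(self, evict_if: Callable[[Trade], bool]) -> None:
        self._evict_if = evict_if              # stands in for the SQL eviction clause
        self._memory: Dict[int, Trade] = {}    # hot tier
        self._durable: Dict[int, Trade] = {}   # hypothetical stand-in for HDFS

    def insert(self, trade: Trade) -> None:
        # Every row is written to the durable tier, so nothing needs archiving.
        self._durable[trade.trade_id] = trade
        if not self._evict_if(trade):
            self._memory[trade.trade_id] = trade

    def evict(self) -> int:
        """Drop rows from memory that now match the predicate; return the count."""
        stale = [key for key, trade in self._memory.items() if self._evict_if(trade)]
        for key in stale:
            del self._memory[key]
        return len(stale)

    def get(self, trade_id: int) -> Tuple[Trade, str]:
        """Return the row and the tier that served it."""
        hit = self._memory.get(trade_id)
        if hit is not None:
            return hit, "memory"
        return self._durable[trade_id], "durable"

    def scan_all(self) -> List[Trade]:
        # Batch-style analytics read the durable tier, which holds every row.
        return list(self._durable.values())


TODAY = date(2014, 3, 31)  # arbitrary fixed date so the example is deterministic


def older_than_a_month(trade: Trade) -> bool:
    return trade.traded_on < TODAY - timedelta(days=30)


store = TieredTradeStore(evict_if=older_than_a_month)
store.insert(Trade(1, "EMC", date(2014, 3, 28)))   # recent: kept in memory
store.insert(Trade(2, "VMW", date(2014, 1, 15)))   # old: durable tier only
```

In this model `store.get(1)` is served from memory and `store.get(2)` from the durable tier, while `store.scan_all()` sees both rows. Under that reading, "stores all data" would describe the durable tier and "evicting" would describe the memory tier, which may be the distinction the two posters are talking past. Whether the real product behaves this way is not established by the thread.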
