
About Ambleside

Ambleside Logic is led by Aaron Rosenbaum. Father of 3, Programming since 7, DevOps since 11 (hacking RSTS), exIngres, exCTP, exCohera. Sold two companies to Oracle, one to HP. Research + Strategy for NoSQL/BigData ecosystem implementors, vendors and investors.

Friday
Jul 29, 2011

A taxonomy of Big Data

Every big data presentation I've seen starts with a discussion of how there are huge mountains of unanalyzed valuable data and how so much of the data produced is unstructured.  Not all big data, however, is created equal.

Log Data (structured but big)

System logs such as web logs and error logs are fairly structured data.  They are most likely un-normalized and may have some time sync errors, but as data goes, machine-generated data is pretty structured.  At large organizations, though, the volume can be very large, and while traditional data management can deal with it, NoSQL is sometimes cheaper. Splunk and Flume are among the leaders here.

Big Graphs

Click to read more ...

Monday
Jul 25, 2011

Peaks, Valleys and wrong-turns - presenting time series data in analytics applications


There are many ways to fail when presenting time series data.  
I am going to start by assuming your data is clean and in order.  This is, of course, non-trivial.  This post focuses on the presentation issues.

1) Is your data smooth or noisy? 

If what you are measuring is continuous (staffing levels, minutes) rather than episodic (sales $, calls, hits), then the sampled data should not be that noisy. If it is, it may be that the sampling interval is too long or that there is a step-function somewhere in how things are measured.  If the data is episodic, the reverse can happen - too short a sampling interval makes things very noisy/jumpy.
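To make the episodic case concrete, here is a minimal sketch using synthetic data (the event count, the one-hour horizon and the bucket widths are all made up for illustration). It counts the same random "calls" in buckets of different widths and reports how jumpy each series is, measured as standard deviation divided by mean.

```python
import random
import statistics

def bucket_counts(event_times, width, horizon):
    """Count events falling in consecutive buckets of `width` seconds."""
    n_buckets = int(horizon // width)
    counts = [0] * n_buckets
    for t in event_times:
        idx = int(t // width)
        if idx < n_buckets:  # ignore an event landing exactly on the horizon
            counts[idx] += 1
    return counts

def relative_noise(counts):
    """Standard deviation divided by mean (coefficient of variation)."""
    return statistics.pstdev(counts) / statistics.mean(counts)

random.seed(0)
HORIZON = 3600  # one hour, in seconds
# Synthetic episodic data: 1,800 "calls" arriving at random moments in the hour.
events = [random.uniform(0, HORIZON) for _ in range(1800)]

noise_by_width = {
    width: relative_noise(bucket_counts(events, width, HORIZON))
    for width in (1, 60, 600)
}
for width, noise in noise_by_width.items():
    print(f"{width:>4}s buckets: relative noise = {noise:.2f}")
```

The one-second buckets are mostly zeros and ones, so the chart would look like static. The wider buckets carry the same events but show a much steadier line.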

2) Has the data already been smoothed?

Regressions, moving averages, or any type of curve fitting can alter the data quite a bit.  Many tools allow zooming in on exact values, but on smoothed data those values are no longer the actual measurements. If you then apply any other aggregation to the data, what you present will be quite flawed.
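A small illustration of the problem, assuming a trailing three-point moving average and a made-up series with a single spike: taking the maximum of the smoothed series reports a peak that never occurred and hides the one that did.

```python
def moving_average(values, window):
    """Trailing moving average; the output is shorter by window - 1 points."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]

raw = [10, 10, 10, 100, 10, 10, 10]  # made-up readings with one spike
smoothed = moving_average(raw, 3)

raw_peak = max(raw)
smoothed_peak = max(smoothed)
print(raw_peak, smoothed_peak)  # prints: 100 40.0
```

The same thing happens with threshold counts: one raw reading exceeds 50, but none of the smoothed ones do.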

3) Is the graph being used for identification of new issues or validation of trends? What decisions are being made with the graphs?

All visualization needs to be aligned to specific business processes.

Click to read more ...

Friday
Jul 22, 2011

Open Source ETL with Hadoop

As a follow-up to my article on BI projects for 2012, I got a few questions about ETL and Hadoop.  Here are some of the leading options for doing ETL projects with Hadoop.

 

Cloudera/Sqoop

Lots of nifty tools.  Sqoop moves data between HDFS and RDBMSs.  Flume moves log files.  Transform logic gets written as part of Map(). I think they are bundling connectors for Netezza and some stuff from Quest for Oracle, but I'm fuzzy on the licensing terms...
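As a rough sketch of what "transform logic in Map()" can look like, here is a map-side transform written in Python in the style of a Hadoop Streaming mapper (lines in, tab-separated key/value lines out). The record layout, the field names and the choice of Python rather than a Java Mapper class are all assumptions made for illustration, not anything Cloudera ships.

```python
def transform(line):
    """Clean one raw record; return (key, value) or None to drop it.

    Assumed (hypothetical) input layout: customer_id,country,amount
    """
    fields = [f.strip() for f in line.rstrip("\n").split(",")]
    if len(fields) != 3:
        return None  # malformed row: skip it
    customer_id, country, amount = fields
    try:
        cents = round(float(amount) * 100)  # normalize dollars to integer cents
    except (ValueError, OverflowError):
        return None  # non-numeric or non-finite amount: skip it
    return country.upper(), f"{customer_id}\t{cents}"

def run_mapper(lines):
    """Yield tab-separated key/value lines, as a streaming mapper would emit."""
    for line in lines:
        result = transform(line)
        if result is not None:
            key, value = result
            yield f"{key}\t{value}"

# In a streaming job the input would be sys.stdin; a small sample is used here.
sample = ["42, us ,19.99\n", "bad row\n", "7,de,abc\n"]
mapped = list(run_mapper(sample))
for out in mapped:
    print(out)
```

Keeping the per-record cleanup in a plain function like `transform` makes it easy to test outside the cluster before wiring it into a job.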

Click to read more ...

Wednesday
Jul 20, 2011

User Experience Guidelines for shared dashboards (less is more)

Dashboards are cool.  

In offices everywhere, the "mission control" style displays made famous by NASA are being thrown up right and left, as are cool graphics of KPIs for running the business.

There are different sorts of dashboard displays that have dramatically different UX requirements.  There is often no serious user advocate for the UX portion of the project, and typically the prototyping environment for public displays doesn't use the actual display but just a computer screen.  This post is intended to help you discuss these issues with your users.

Click to read more ...

Thursday
Jul 7, 2011

PaaS for Enterprise - make it simpler and it will happen

 

There are some common notions that PaaS is stalled at the enterprise because of enterprise readiness, lock-in, security and integration.  I think each of these is true to a certain extent but I think mostly it’s still too damn hard to make sense of how everything fits together.

Departmental app growth leads all enterprise adoption, and developer productivity has led every single transition.  IMS to RDBMS? Report writing productivity.  3GLs to RAD tools? Business logic encapsulation and GUI productivity.  And I’ll claim that mid-tier Unix never really would have taken off at all without RAD and RDBMS. Many concerns melt away if there is enough money to be made off the app.  But you can’t get to value quickly if the business domain experts have trouble making stuff work at all.

Each generation of Enterprise Technology to achieve widespread adoption has done so because of dramatically easier development vs. the previous generation.

Click to read more ...