Jindabyne Datalake

Origin Energy

Project Date

October 2016 - Present

problem statement

Cloudten was originally engaged by Origin Energy over 4 years ago to perform some of the initial design and delivery work for one of Australia’s first Redshift implementations.

This environment has now grown to become one of the largest data lakes in the country. It provides real-time analytics capabilities that enable Origin customers to effectively query and model their energy usage and projected spend. The platform has also expanded to include Snowflake and Amazon Aurora.

proposal

We have established ourselves as a trusted data partner to Origin and have a number of engineers deployed full-time to manage the entire data lake infrastructure and to provide architectural consulting, development and DBA support for a range of platforms (PostgreSQL, MySQL, SQL Server). We pride ourselves on our commitment to our joint goals and our reputation for always going the extra mile to ensure the smooth running of the estate.

The environment includes a number of third-party integrations and tools such as Matillion ETL, BryteFlow, AlexSolutions, Tableau BI and SAS. We have recently been involved in incorporating several native AWS services to improve data loading, provide scalable compute and enable querying of unstructured data. A number of machine learning components have also been added to the platform, including Amazon SageMaker and Databricks.

Cloudten provides 24x7 support, monitoring and reporting for the data and analytics components as well as the AWS infrastructure. We also provide maintenance, recommendations and proofs of concept (PoCs) for prospective new technologies.

We work closely with Origin’s internal teams and external vendors, and use Origin’s internal tools for incident and change management.

We currently manage and support multiple Matillion clusters, handling issues such as stopped jobs, incorrect configuration, application and system out-of-memory (OOM) errors, application patch installation, new job setup and data access requests.

We work closely with Origin’s internal IT teams, other vendors and Matillion themselves to identify, analyse and resolve issues on an ongoing basis. Our engineers attend regular conference calls outside of business hours with Matillion’s head office in the UK.

Outcomes and results

Outcomes: Cloudten’s relationship with Origin Energy has been extremely successful over the last 4 years. Our original 2+1 support contract with the Energy Markets team has been renewed for another 3 years and we have also picked up new support contracts with their Integrated Gas and Virtual Power Plant teams.

In addition to support, we are also a preferred partner for the design and delivery of project work. This has covered a number of areas, including the design and development of Matillion ETL flows, the implementation and migration of Snowflake warehouses, and the design and development of BI dashboards. We are also constantly involved in the onboarding and integration of new data sets into a data ingestion/processing pipeline that we built using cloud-native services, established CI/CD tools and Apache Airflow.

Challenges: The Origin Energy data services landscape is extremely large and complex, with stakeholders and business units spread across three Australian states. There is a wide range of products and services, with multiple support teams from a range of vendors. Whilst it can be quite daunting at times, Cloudten has adapted to this environment extremely well and is actively involved in large-scale collaborations.

aws services used

Cloudten’s data services and managed services teams provide Level 2 (operational) and Level 3 (engineering) support for several analytics tools and technologies, and have experience in the following:

  • Data analysis with Python and ANSI SQL
  • Creating Tableau dashboards and stories
  • Creating data repositories and visualisations with the ELK stack
  • Data analysis with Amazon Athena
  • Data warehousing with Amazon Redshift and Snowflake
  • ETL/ELT processing with AWS DMS, Databricks, AWS Glue, Matillion and Talend
  • ETL scheduling with Apache Airflow
  • Installing and configuring standalone Hadoop clusters and managed clusters with Amazon EMR
  • Managing notebook environments in Amazon SageMaker
  • Apache toolsets such as Hive, Sqoop, Cassandra and Flume

Third party application or solution used

The Origin Energy data services environment is supported by a range of internal teams and third-party service providers, from small boutique vendors to large multi-national system integrators. The Cloudten team is an established and respected part of this ecosystem and works effectively and cohesively within it.

The diagram gives a high-level overview of the Integrated Gas workflow: