data-engineering

Open-source Data Engineering with PostgreSQL

Blog-4: Apache Drill Magic across PostgreSQL, Local Parquet, and S3. Introduction: Welcome back! Following our exploration of data movement between PostgreSQL and Amazon S3 in the previous blog, we now venture into querying with Apache Drill. In …

Open-source Data Engineering with PostgreSQL

Blog-3: Data Loading with Apache Spark. Introduction: Welcome to the next installment of our series on Open-source Data Engineering with PostgreSQL. In this blog, we’ll delve into the practicalities of transforming table data from PostgreSQL into the Parquet format and …

Open-source Data Engineering with PostgreSQL

Blog-2: Installation and Setup on Ubuntu. Introduction: Welcome back to the series on Open-source Data Engineering with PostgreSQL. In this post, we walk through installing and configuring Apache Spark and Apache Drill on an Ubuntu environment. Our …

Open-source Data Engineering with PostgreSQL

Overview – A Curtain Raiser. Introduction: In the ever-evolving landscape of data management, organizations are constantly seeking efficient ways to handle, transform, and query massive datasets. Data archiving has become an important component of data engineering in the ever-evolving landscape …

Mastering Timestamp-Based CDC Hurdles: A Proven Solution

Introduction: Have you experimented with timestamp-based Change Data Capture using the Pentaho Data Integration (PDI) tool? Replicating data from a source database to a target database through timestamp-based Change Data Capture with PDI is indeed straightforward. Perhaps …

Unleashing the Power of Change Data Capture

Introduction: Here I am again, continuing the series of topics around data integration with PostgreSQL, the world’s most advanced open source relational database. If you haven’t looked at the previous blog in the series, I’d highly recommend reading the …

Pentaho Data Integration with PostgreSQL

Introduction: Pentaho Data Integration (PDI) serves as a robust ETL (Extract, Transform, Load) tool, playing a pivotal role in handling the complexities of data ingestion pipelines. As organizations accumulate vast amounts of data from diverse sources and in different formats, …

Data Engineering with Hydra – The Basics

It is already well known that Postgres itself offers a solid foundation for efficient, fast analytical processing, and we have chosen Hydra to enhance these features further. It uses advanced techniques like columnar storage, vectorized execution, and …

Deep Dive into Data Engineering – our take on PostgreSQL with Hydra

Our take on PostgreSQL with Hydra DWaaS. At the forefront of the open-source database revolution, the OpenSource DB team is dedicated to a straightforward yet powerful goal: empowering businesses with speed, security, and resilience to enable rapid development and widespread …