In the ever-evolving realm of microservices, one of the fundamental decisions developers face is whether to adopt a shared database model or a separate database for each service. This blog dissects both approaches, offering insights to help … Read More
data-engineering
Open-source Data Engineering with PostgreSQL
Blog-4: Apache Drill Magic across PostgreSQL, Local Parquet, and S3. Introduction: Welcome back! Following our exploration of data movement between PostgreSQL and Amazon S3 in the previous blog, we now venture into the realm of querying with Apache Drill. In … Read More
Open-source Data Engineering with PostgreSQL
Blog-3: Data Loading with Apache Spark. Introduction: Welcome to the next installment of our series on Open-source Data Engineering with PostgreSQL. In this blog, we’ll delve into the practicalities of transforming table data from PostgreSQL into the Parquet format and … Read More
Open-source Data Engineering with PostgreSQL
Blog-2: Installation and Setup on Ubuntu. Introduction: Welcome back to the series on Open-source Data Engineering with PostgreSQL. In this post, we delve into the installation and configuration of Apache Spark and Apache Drill in an Ubuntu environment. Our … Read More
Open-source Data Engineering with PostgreSQL
Overview – A Curtain-Raiser. Introduction: In the ever-evolving landscape of data management, organizations are constantly seeking efficient ways to handle, transform, and query massive datasets. Data archiving has become an important component of data engineering in the ever-evolving landscape … Read More
Mastering Timestamp-Based CDC Hurdles: A Proven Solution
Introduction: Have you experimented with timestamp-based Change Data Capture using the Pentaho Data Integration (PDI) tool? Replicating data from a source database to a target database through timestamp-based Change Data Capture with Pentaho Data Integration is indeed straightforward. Perhaps … Read More
Unleashing the Power of Change Data Capture
Introduction: Here I am again, continuing the series of topics around data integration with PostgreSQL, the world’s most advanced open source relational database. If you haven’t looked at the previous blog in the series, I’d highly recommend reading the … Read More
Pentaho Data Integration with PostgreSQL
Introduction: Pentaho Data Integration (PDI) serves as a robust ETL (Extract, Transform, Load) tool, playing a pivotal role in handling the complexities of data ingestion pipelines. As organizations accumulate vast amounts of data from diverse sources and in different formats, … Read More
Data Engineering with Hydra – The Basics
It is already well known that Postgres itself offers a solid foundation for the efficient and speedy execution of analytical processes, and we have chosen Hydra to enhance these features further. Hydra uses advanced techniques like columnar storage, vectorized execution, and … Read More
Deep Dive into Data Engineering – our take on PostgreSQL with Hydra
Our take on PostgreSQL with Hydra DWaaS. At the forefront of the Open-Source Databases revolution, OpenSource DB Team is dedicated to a straightforward yet powerful goal: Empowering businesses with speed, security, and resilience to enable rapid development and widespread … Read More