Data Integration Pipeline

Combine data from various internal and external data sources to enable downstream Dashboards/Reports, Business Intelligence, and Machine Learning/AI. We help build ETL/ELT data pipelines that are scalable (to petabytes of data) and extensible (supporting hundreds of data sources).

Data Pipeline: Orchestration and Choreography

Manage the data pipeline states and steps through Orchestration Engines, Schedulers, Choreography, and Orchestration/Choreography Hybrid Architecture.
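
As an illustration of the orchestration side, here is a minimal sketch in which a central coordinator runs steps in dependency order and records the state of each one. The step names (extract, validate, transform, load) and the run_pipeline helper are hypothetical, chosen only for this example; a real orchestration engine adds scheduling, retries, and persistence on top of the same idea.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical pipeline: each step maps to the set of steps it depends on.
DEPENDENCIES = {
    "extract": set(),
    "validate": {"extract"},
    "transform": {"validate"},
    "load": {"transform"},
}


def run_pipeline(dependencies, actions):
    """Run steps in dependency order and return the final state of each step.

    Execution stops at the first failing step, so every later step
    keeps the state "pending".
    """
    state = {step: "pending" for step in dependencies}
    for step in TopologicalSorter(dependencies).static_order():
        try:
            actions[step]()
            state[step] = "succeeded"
        except Exception:
            state[step] = "failed"
            break
    return state


# Placeholder actions that only record the order in which they ran.
executed = []
ACTIONS = {name: (lambda n=name: executed.append(n)) for name in DEPENDENCIES}
final_state = run_pipeline(DEPENDENCIES, ACTIONS)
```

In a choreography design there is no such coordinator: each step would instead react to an event published by the previous step.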

Data Validation, Transformation and Mastering

Build effective Data Validation, Data Transformation, and Data Mastering logic to prepare data for Business Intelligence and Machine Learning/AI use cases.
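
A minimal sketch of the validation and transformation steps, assuming records arrive as dictionaries with hypothetical customer_id and amount fields. The field names and rules are illustrative only, and mastering (matching and merging records that describe the same entity) is not shown.

```python
def validate(record):
    """Return a list of validation errors for one record (empty means valid)."""
    errors = []
    if not str(record.get("customer_id") or "").strip():
        errors.append("missing customer_id")
    try:
        float(record.get("amount"))
    except (TypeError, ValueError):
        errors.append("amount is not numeric")
    return errors


def transform(record):
    """Normalize a record that has already passed validation."""
    return {
        "customer_id": record["customer_id"].strip().upper(),
        "amount": float(record["amount"]),
    }


def prepare(records):
    """Split records into transformed clean rows and rejected rows with reasons."""
    clean, rejected = [], []
    for record in records:
        errors = validate(record)
        if errors:
            rejected.append((record, errors))
        else:
            clean.append(transform(record))
    return clean, rejected
```

Keeping rejected rows together with their reasons, rather than dropping them silently, makes it possible to review and remediate bad data later.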

Data Ops

Make analytics infrastructure agile and reliable: build analytics infrastructure that responds to changing business needs and achieves high reliability through monitoring, observability, lineage/governance, remediation, and collaboration.

Stream and Complex Event Processing (CEP)

Integrate streaming data (audio/video) and event data (such as HL7 and IoT events), and perform windowing, transforms, and aggregations.
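
To make windowing concrete, here is a minimal sketch of a tumbling-window aggregation over an in-memory list of events. The event shape (timestamp in seconds, device id, value) and the function name are assumptions for illustration; a production stream processor would also handle late and out-of-order events.

```python
from collections import defaultdict


def tumbling_window_average(events, window_seconds):
    """Average values per (window_start, device_id) over fixed, non-overlapping windows.

    events: iterable of (timestamp_seconds, device_id, value) tuples.
    """
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running total, count]
    for timestamp, device_id, value in events:
        # Each event falls into exactly one window, identified by its start time.
        window_start = int(timestamp // window_seconds) * window_seconds
        accumulator = sums[(window_start, device_id)]
        accumulator[0] += value
        accumulator[1] += 1
    return {key: total / count for key, (total, count) in sorted(sums.items())}
```

A sliding window differs in that windows overlap, so one event can contribute to several aggregates.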


Data Cleansing: Handling Missing Values

Data cleansing plays a key role in building models in machine learning. A significant amount of any data scientist’s time is dedicated to data cleansing activity. In a lot of…

Web Scrapers: Basic proxy authentication and headless Firefox

Selenium, Web Proxy and HTTP Basic Auth. Problem Statement: Please offer a solution for basic authentication · Issue #6644 · SeleniumHQ/selenium. Solution: Install Firefox version 66.0.5 from here -…

Exploratory Data Analysis with Trifacta and Mode/Tableau

Use case: We wanted to process raw data to identify new patterns and grasp difficult concepts which later we could share with our customers in a pictorial or a graphical…