ICF

[Remote] Data Engineer - Data Warehouse Architect

Job summary

United States

Work model

Fully remote (United States only)

Job description

Responsibilities

  • Implement and optimize data pipeline architectures for sourcing, ingestion, transformation, and extraction processes, ensuring data integrity and compliance with organizational standards
  • Develop and maintain scalable database schemas, data models, and data warehouse structures; perform data mapping, schema evolution, and integration between source systems, staging areas, and data marts
  • Automate data extraction workflows and create comprehensive technical documentation for ETL/ELT procedures; collaborate with cross-functional teams to translate business requirements into technical specifications
  • Establish and enforce data governance standards, including data quality metrics, validation rules, and best practices for data warehouse design and architecture
  • Develop, test, and deploy ETL/ELT scripts using SQL, Python, Spark, or other relevant languages; optimize code for performance and scalability (see the first sketch after this list)
  • Tune data warehouse systems for query performance and batch processing efficiency; apply indexing, partitioning, and caching strategies
  • Perform advanced data analysis, validation, and profiling using SQL and scripting languages; develop data models, dashboards, and reports in collaboration with stakeholders (see the second sketch after this list)
  • Conduct testing and validation of ETL workflows to ensure data loads meet SLAs and quality standards; document testing protocols and remediation steps
  • Troubleshoot production issues, perform root cause analysis, and implement corrective actions; validate data accuracy and consistency across systems
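
For illustration, a minimal PySpark sketch of the kind of ELT step described above. It assumes a Parquet landing zone; the dataset name, paths, and columns are hypothetical, not part of the role:

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("orders_elt").getOrCreate()

    # Ingest: read a raw landing-zone extract (hypothetical path and layout).
    raw = spark.read.parquet("s3://landing/raw_orders/")

    # Transform: standardize types, derive a load date, drop bad records.
    clean = (
        raw.withColumn("order_ts", F.to_timestamp("order_ts"))
           .withColumn("load_date", F.to_date("order_ts"))
           .filter(F.col("order_id").isNotNull())
    )

    # Load: partition by date so downstream queries can prune partitions.
    (clean.write
          .mode("overwrite")
          .partitionBy("load_date")
          .parquet("s3://warehouse/staging/orders/"))

Partitioning the staging table by load date is one common way to keep both batch loads and ad hoc queries efficient.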
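
And a matching validation/profiling sketch of the kind of post-load check the SLA bullet describes; the table path, column names, and thresholds are again hypothetical:

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("orders_validation").getOrCreate()
    df = spark.read.parquet("s3://warehouse/staging/orders/")

    # Profile: row count plus per-column null rates in a single pass.
    profile = df.agg(
        F.count(F.lit(1)).alias("row_count"),
        *[F.mean(F.col(c).isNull().cast("int")).alias(c + "_null_rate")
          for c in df.columns],
    ).first()

    # Validate against simple SLA-style thresholds before promoting the load.
    assert profile["row_count"] > 0, "staging load is empty"
    assert profile["order_id_null_rate"] == 0.0, "order_id must not be null"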

Skills

  • Minimum of 3 years of experience in data analysis
  • Strong analytical and problem-solving skills with attention to detail
  • Proficiency in SQL and ability to develop complex queries (e.g., multi-join), tune performance, and troubleshoot
  • Experience with Unix/Linux shell scripting for ETL automation
  • Familiarity with database tools and platforms (e.g., Teradata, Oracle, and non-relational databases)
  • Excellent verbal and written communication skills; ability to collaborate across all levels
  • Ability to prioritize and multi-task in a fast-paced environment
  • Knowledge of Java/J2EE, REST APIs, Web Services, and event-driven microservices
  • Experience with Kafka streaming, schema registry, and OAuth authentication
  • Familiarity with Spring Framework, GCP services, Git, CI/CD pipelines, containerization, and data ingestion/data modeling
  • Experience with Databricks concepts and terminology (e.g., workspace, catalog)
  • Proficiency in Python and Spark
  • Background in architecting real-time data ingestion solutions using microservices and Kafka (see the sketch after this list)
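
For illustration, a minimal Spark Structured Streaming sketch of Kafka ingestion of the kind listed above. The broker address, topic, schema, and paths are assumptions; a production job would also carry OAuth/SASL settings and schema-registry-based Avro decoding:

    from pyspark.sql import SparkSession, functions as F
    from pyspark.sql.types import (StructType, StructField,
                                   StringType, DoubleType)

    spark = SparkSession.builder.appName("orders_stream").getOrCreate()

    # Hypothetical event schema; a real job would decode against a registry.
    event_schema = StructType([
        StructField("order_id", StringType()),
        StructField("amount", DoubleType()),
    ])

    # Subscribe to the topic and parse the JSON payload of each message.
    events = (
        spark.readStream.format("kafka")
             .option("kafka.bootstrap.servers", "broker:9092")
             .option("subscribe", "orders")
             .load()
             .select(F.from_json(F.col("value").cast("string"),
                                 event_schema).alias("e"))
             .select("e.*")
    )

    # Land micro-batches in the staging area for downstream warehouse loads.
    (events.writeStream
           .format("parquet")
           .option("path", "s3://warehouse/streaming/orders/")
           .option("checkpointLocation", "s3://warehouse/_chk/orders/")
           .start())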

Company Overview

ICF is a global consulting and technology services provider focused on making big things possible for its clients. Founded in 1969 and headquartered in Fairfax, Virginia, USA, the company employs 5,001-10,000 people. Its website is https://www.icf.com.

Company H-1B Sponsorship

ICF has a track record of offering H-1B sponsorship: 1 filing in 2026, 29 in 2025, 31 in 2024, 35 in 2023, 29 in 2022, 31 in 2021, and 38 in 2020. Please note that this does not guarantee sponsorship for this specific role.

Note: This is a fully remote position open to candidates based in the USA.