NOT OPEN TO C2C OR W2 REFERRALS AT THIS TIME

Position Details

Location: Fully remote with potential for quarterly travel to the Gaithersburg, MD / Washington, D.C. metro area.
Clearance: Public Trust (or willing to obtain; MUST be a U.S. Citizen).

Job Description

Seeking a Data Automation Engineer to design and implement AI-driven automation solutions across AWS and Azure hybrid environments. You will be responsible for building intelligent, scalable data pipelines and automations that integrate cloud services, enterprise tools, and Generative AI to support mission-critical analytics, reporting, and customer engagement platforms. The ideal candidate is mission-focused and delivery-oriented, and applies critical thinking to build new capabilities and solve technical issues.

Key Responsibilities

Design and maintain data pipelines in AWS using S3, RDS/SQL Server, Glue, Lambda, EMR, DynamoDB, and Step Functions.
Develop ETL/ELT processes to move data between systems, including DynamoDB, SQL Server (AWS), and Azure SQL.
Integrate Amazon Connect CRM data into the enterprise data pipeline for analytics and operational reporting.
Engineer and enhance ingestion pipelines with Apache Spark, Flume, and Kafka for real-time and batch processing into Apache Solr and Amazon OpenSearch platforms.
Leverage Generative AI services and frameworks (Amazon Bedrock, Amazon Q, Azure OpenAI, Hugging Face, LangChain) to:
- Create automated processes for vector generation and embeddings from unstructured data.
- Automate data quality checks, metadata tagging, and lineage tracking.
- Enhance ingestion/ETL with LLM-assisted transformation and anomaly detection.
- Build conversational BI interfaces for natural language access to Solr and SQL data.
- Develop AI-powered copilots for pipeline monitoring and automated troubleshooting.
Implement SQL Server stored procedures, indexing, query optimization, profiling, and execution plan tuning.
Apply CI/CD best practices using GitHub, Jenkins, or Azure DevOps.
Ensure security and compliance through IAM, KMS encryption, VPC isolation, RBAC, and firewalls.
Support Agile DevOps processes with sprint-based delivery.

Required Qualifications

BS in Computer Science or related field with 2+ years of data engineering and automation experience.
Hands-on experience with SQL, SSIS, Python, Spark, Bash, PowerShell, and AWS/Azure CLIs.
Experience with AWS services: S3, RDS/SQL Server, Glue, Lambda, EMR, DynamoDB.
Familiarity with Apache Flume, Kafka, and Solr for large-scale data ingestion and search.
Familiarity with LLM and GenAI frameworks (Amazon Bedrock, Azure OpenAI, or open-source tools).
Experience integrating REST API calls in data pipelines and workflows.
Familiarity with JIRA, GitHub, Azure DevOps, or Jenkins for SDLC and CI/CD.
Strong troubleshooting and performance optimization skills in SQL, Spark, or other data engineering solutions.
Experience operationalizing GenAI Ops pipelines (model deployment, monitoring, retraining, and lifecycle management).
Good communication and presentation skills.
Ability to obtain Federal government Public Trust clearance.

Preferred Qualifications

Certifications: AWS Data Engineer, AWS AI/ML Specialty, Azure AI Engineer, or Databricks Certified Data Engineer.
Experience implementing RAG pipelines, embeddings, and vector search (Solr, OpenSearch, FAISS, Pinecone, or pgvector).
Experience with GenAI-powered coding tools (Claude Code, OpenAI Codex, VS Code).
Experience with multi-cloud data integration (AWS to Azure SQL).
Familiarity with Microsoft BizTalk and SSIS for SQL Server ETL workflows.
Knowledge of data lineage/governance tools (Microsoft Purview, Unity Catalog, AWS Glue Data Catalog).
Familiarity with Infrastructure-as-Code (Terraform, CloudFormation, Bicep).
Experience with compliance frameworks (FedRAMP, PCI-DSS, HIPAA).
