Google Cloud Platform Data Engineer

Job summary

Toronto, ON M5G 1W8
Engineering

Work model

Fully remote
Canada only
1 week ago
Job description

Role Overview:

We're looking for a skilled Data Engineer to design, build, and optimize scalable, cloud-native data pipelines on Google Cloud Platform (GCP). The role involves extensive work with Apache Airflow, Spark, Python, and Scala to develop high-performance data solutions supporting analytics, streaming, and generative AI initiatives.

Key Responsibilities:

● Develop, automate, and maintain batch and streaming ETL pipelines using Apache Airflow, Apache Spark, Python, and Scala.

● Build and manage cloud-based data ecosystems on Google Cloud Platform (BigQuery, Bigtable, Dataproc, Pub/Sub, Cloud Storage, IAM, VPC).

● Design and optimize SQL and NoSQL data models for data lakes and warehouses (BigQuery, MongoDB, Snowflake).

● Write complex SQL queries for advanced data transformation, aggregation, and analytics optimization within BigQuery or equivalent platforms.

● Apply modern Test Driven Development (TDD) methodologies for big data pipelines, ensuring test automation across Airflow workflows, Spark jobs, and transformation logic.

● Apply data mesh and data-as-a-product principles to enable reusable and domain-driven datasets.

● Implement real-time ingestion with Kafka Connect and process streaming data using Spark Streaming, Apache Flink, or similar technologies.

● Optimize data performance, scalability, and cost efficiency across Google Cloud Platform components.

● Ensure PCI and PII data are handled in compliance with standards such as GDPR, PCI DSS, SOX, and CCPA.

● Integrate GenAI tools such as OpenAI, Gemini, and Anthropic LLMs for intelligent data quality and analytics enhancement.

● Collaborate with stakeholders, data scientists, and full-stack engineers to deliver trusted, documented, and reusable data products.

Required Qualifications:

● Bachelor's or Master's in Computer Science, Data Engineering, or related field.

● 5 years of hands-on experience with large-scale data engineering in cloud environments.

● Advanced skills in Python, Scala, the Spark ecosystem, and SQL for building data pipelines.

● Strong Google Cloud Platform expertise (BigQuery, Bigtable, Dataproc, Pub/Sub, IAM, VPC).

● Proficiency in SQL/NoSQL modeling and data architecture for cloud data lakes.

● Familiarity with streaming frameworks (Kafka, Flume).

● Experience handling sensitive data and ensuring regulatory compliance.

● Working knowledge of Docker, CI/CD, and modern DevOps practices for data platforms.

Preferred Qualifications:

● Experience with Infrastructure as Code (IaC) tools such as Terraform or Ansible.

● Contributions to open-source projects or internal developer tooling.

● Prior experience building Customer Data Platforms (CDPs) in-house.

● Experience with AI-assisted developer tools such as IntelliJ plug-ins using OpenAI or Anthropic models, Codex CLI, or Windsurf.