Data engineer, GCP and LLM prototyping (Python, SQL, Go)

Job Type: Remote

A leading organization is hiring a data engineer to build and support cloud-based data capabilities while also creating proof-of-concept prototypes that apply large language models and agent-style workflows to real business problems. This position blends strong data engineering fundamentals with practical experience prototyping AI solutions. The ideal candidate has hands-on experience building prototypes, not just studying AI concepts, and is comfortable working across Python, SQL, Google Cloud Platform, and Go.

Responsibilities

  • Design, build, and maintain data pipelines and data workflows in a Google Cloud Platform environment
  • Develop reliable data transformations and data validation routines using Python and SQL
  • Create and iterate on proof-of-concept implementations that apply large language models to data and operational use cases
  • Prototype agent-style workflows, including multi-step task execution patterns and tool-based interactions that connect AI outputs to data systems
  • Implement backend components and services that support AI-enabled prototypes, including integrations with data stores and APIs
  • Produce clear, maintainable deliverables suitable for handoff, reuse, and extension beyond prototype stages
  • Collaborate with cross-functional partners to understand the problem context and translate it into practical technical experiments and deliverables
  • Document approaches, assumptions, and outcomes so prototypes can be evaluated and compared consistently
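
For illustration, the agent-style pattern described above (multi-step task execution with tool-based interactions against a data system) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a description of the organization's actual stack: ORDERS, the two tool functions, stub_planner, and run_agent are all hypothetical names, and the planner is a hard-coded stand-in for a real LLM call.

```python
# Hypothetical in-memory records standing in for a real data store.
ORDERS = [
    {"region": "EU", "amount": 120.0},
    {"region": "US", "amount": 80.0},
    {"region": "EU", "amount": 30.0},
]


def total_by_region(region):
    """Tool: sum order amounts for one region."""
    return sum(o["amount"] for o in ORDERS if o["region"] == region)


def count_orders(region):
    """Tool: count orders for one region."""
    return sum(1 for o in ORDERS if o["region"] == region)


# Registry mapping tool names to callables the agent is allowed to invoke.
TOOLS = {"total_by_region": total_by_region, "count_orders": count_orders}


def stub_planner(task, history):
    """Stand-in for an LLM call (assumption): picks the next tool or finishes."""
    if len(history) == 0:
        return {"tool": "total_by_region", "args": {"region": "EU"}}
    if len(history) == 1:
        return {"tool": "count_orders", "args": {"region": "EU"}}
    total, count = history[0]["result"], history[1]["result"]
    return {"final": f"EU: {count} orders, total {total:.2f}"}


def run_agent(task, planner, max_steps=5):
    """Loop: ask the planner for a step, run the named tool, record the result."""
    history = []
    for _ in range(max_steps):
        step = planner(task, history)
        if "final" in step:
            return step["final"], history
        tool = TOOLS.get(step["tool"])
        if tool is None:
            raise ValueError(f"unknown tool: {step['tool']}")
        result = tool(**step["args"])
        history.append({"tool": step["tool"], "args": step["args"], "result": result})
    raise RuntimeError("step limit reached without a final answer")


answer, history = run_agent("Summarize EU orders", stub_planner)
print(answer)
```

Keeping the tool registry explicit and capping the number of steps are design choices that make a prototype like this easier to evaluate and hand off; swapping stub_planner for a real model call would leave the loop itself unchanged.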

Required experience and skills

  • Strong Python skills for data engineering, automation, and building prototype workflows
  • Advanced SQL skills for complex querying, joining datasets, and building repeatable transformations
  • Experience working on Google Cloud Platform for data workloads
  • Demonstrated experience building LLM proofs of concept or prototypes that move beyond experimentation into working demos or pilot-ready workflows
  • Familiarity with building agent-style prototypes and orchestration patterns
  • Working knowledge of Go (Golang) for building services, utilities, or performance-oriented components
  • Ability to work independently, execute quickly, and communicate progress and outcomes clearly

FAQ

1. What are the core responsibilities of a Data Engineer focused on GCP and LLM prototyping?
This role combines data engineering with rapid prototyping of large language model (LLM) use cases. It involves building scalable data pipelines on GCP, preparing datasets for AI applications, and experimenting with LLM-driven solutions. The engineer ensures that prototypes can evolve into production-ready systems.

2. What types of projects are typically handled in this role?
Projects include building data pipelines, enabling analytics platforms, and developing LLM-powered features such as chatbots, summarization tools, or search enhancements. Work often spans both experimentation and productionization. The focus is on delivering practical, data-driven AI solutions.

3. What GCP services are commonly used in this position?
Common services include BigQuery for data warehousing, Dataflow for processing, and Cloud Storage for data lakes. Tools like Vertex AI may be used for model development and deployment. The role requires strong familiarity with cloud-native data architectures.

4. How are LLMs used in data engineering workflows?
LLMs are used for tasks such as text processing, data enrichment, and generating insights from unstructured data. They can also support automation of data transformations and documentation. Prototyping helps evaluate feasibility before scaling solutions.

5. What programming languages and tools are required?
Python is widely used for data processing and AI integration, while SQL is essential for querying and transforming data. Go may be used for building high-performance services. Additional tools include APIs, orchestration frameworks, and containerization technologies.

6. How is data quality ensured in this role?
Data quality is maintained through validation checks, monitoring pipelines, and implementing data governance practices. Clean and reliable data is critical for both analytics and AI model performance. Automated testing helps ensure consistency.
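
As a rough illustration of the validation checks mentioned above, the following Python sketch runs a completeness check and a uniqueness check over a small batch of rows. It is a minimal example under stated assumptions: CheckResult, validate_rows, and the sample order_id and amount fields are hypothetical and are not taken from any specific pipeline.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str


def validate_rows(rows, required, unique_key):
    """Run basic completeness and uniqueness checks on a list of dict rows."""
    results = []

    # Completeness: every required field must be present and non-empty.
    missing = [
        (i, field)
        for i, row in enumerate(rows)
        for field in required
        if row.get(field) in (None, "")
    ]
    results.append(
        CheckResult("required_fields", not missing, f"{len(missing)} missing value(s)")
    )

    # Uniqueness: the key column must not repeat across rows.
    seen, dupes = set(), set()
    for row in rows:
        key = row.get(unique_key)
        if key in seen:
            dupes.add(key)
        seen.add(key)
    results.append(
        CheckResult("unique_key", not dupes, f"{len(dupes)} duplicate key(s)")
    )
    return results


# Hypothetical sample batch with one missing amount and one repeated key.
rows = [
    {"order_id": 1, "amount": 10.0},
    {"order_id": 2, "amount": None},
    {"order_id": 2, "amount": 5.0},
]
results = validate_rows(rows, required=["order_id", "amount"], unique_key="order_id")
for r in results:
    print(r.name, "PASS" if r.passed else "FAIL", r.detail)
```

Checks like these can run as an automated test step before data is loaded or passed to a model, so that a failing batch is caught early rather than downstream.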

Apply for this position

If you have already submitted your resume for another job opening, please do not re-apply to a different role. You can email us through Contact Us about your interest in other roles.
