At BairesDev®, we've been leading the way in technology projects for over 15 years. We deliver cutting-edge solutions to giants like Google and the most innovative startups in Silicon Valley.
Our diverse 4,000+ team, composed of the world's Top 1% of tech talent, works remotely on roles that drive significant impact worldwide.
When you apply for this position, you're taking the first step in a process that goes beyond the ordinary. We aim to align your passions and skills with our vacancies, setting you on a path to exceptional career development and success.
Distributed Systems Engineer (Apache Spark Internals) at BairesDev
We are seeking a Distributed Systems Engineer with experience in Apache Spark internals to work on the engine itself, not pipelines built on top of it. The work covers the core components of Spark — Catalyst Optimizer, Tungsten execution, DAG Scheduler, Shuffle subsystem, and memory management — at petabyte scale. The technology stack centers on the open source Apache Spark project and adjacent ecosystem projects (Iceberg, Hive, Hadoop, Arrow, Parquet), with ample scope for upstream contribution. We're looking for engineers who have worked on engine-level code in Spark or comparable distributed compute systems and are comfortable debugging concurrency, memory, and execution issues at scale.
What You'll Do:
- Contribute production-grade code to the Apache Spark project, particularly in Spark SQL and Structured Streaming.
- Debug and optimize Spark internals — Catalyst, Tungsten, DAG Scheduler, Shuffle, and memory management — at petabyte scale.
- Influence architectural direction for Spark performance and scalability.
- Profile and tune JVM behavior (GC, memory layout, concurrency) at the engine level.
- Collaborate with cross-functional engineering teams and open source committers on integrations and ecosystem work.
- Mentor senior engineers and raise the engineering bar through code reviews and design critiques.
What we are looking for:
- 6+ years of experience in software development.
- Strong Java and/or Scala skills.
- Experience with distributed systems and concurrent or parallel programming.
- Working knowledge of Spark internals (Catalyst, Tungsten, DAG Scheduler, Shuffle, or memory management).
- Familiarity with JVM performance characteristics (GC, memory, threading).
- Advanced level of English.
Nice to have:
- Upstream contributions to Apache Spark (merged PRs in Spark SQL or Structured Streaming); committer status is a strong plus.
- Contributions to Apache Iceberg, Hive, Hadoop, Arrow, or Parquet.
- Kubernetes and cloud-native Spark deployment experience.
- Experience operating Spark at petabyte scale in production.
How we make your work (and your life) easier:
- Remote Work.
- Excellent compensation in USD or, if you prefer, your local currency.
- Hardware and software setup for you to work from home.
- Flexible hours: create your own schedule.
- Paid parental leave, vacation, and national holidays.
- Innovative and multicultural work environment: collaborate with and learn from the global Top 1% of talent.
- Supportive environment with mentorship, promotions, skill development, and diverse growth opportunities.
Join a global team where your unique talents can truly thrive and make a significant impact!
Apply now!