Column Technical Services is seeking a highly skilled AI DevOps Systems Administrator to architect, support, and evolve the infrastructure powering our cutting-edge Artificial Intelligence and Machine Learning initiatives in a secure, classified environment based in Scottsdale, AZ. In this role, you'll be at the forefront of innovation, driving reliable model development and deployment by optimizing pipelines, maximizing compute performance, and ensuring robust scalability and security across platforms. This is a unique opportunity to work with advanced technologies while making a direct impact on mission-critical systems. If you're passionate about AI infrastructure, thrive in high-performance environments, and are ready to take on meaningful, complex challenges, we encourage you to apply.
**Sponsorship is not available for this role. Candidates must currently reside in or near Scottsdale, Arizona. Applicants must hold an active TS/SCI clearance with polygraph.**
In this position, you will work closely with data scientists and machine learning engineers to enable seamless transitions from experimentation to production.
Core Responsibilities
- Architect, deploy, and support scalable environments for AI/ML training and inference workloads
- Build and maintain automated CI/CD workflows for machine learning models and AI-driven applications
- Administer and fine-tune Linux-based systems across physical and virtual infrastructures
- Implement and manage containerized environments using tools such as Docker and Kubernetes to support scalable ML services
- Utilize Infrastructure as Code (IaC) solutions (e.g., Terraform, Ansible) to automate provisioning, configuration, and system management
- Optimize allocation and usage of GPU resources for compute-intensive workloads
- Establish monitoring, logging, and alerting frameworks to ensure system health, availability, and performance
- Partner with engineering teams to troubleshoot issues, improve workflows, and meet infrastructure requirements
Additional Responsibilities
As a senior-level contributor, you will serve as a key technical point of contact, supporting users and participating in system design and evolution efforts to align with emerging technologies. You will:
- Install, configure, and maintain software and system components
- Diagnose and resolve technical issues, including access control and permissions
- Provide guidance and training to users on system functionality
- Manage daily operations of server environments across both physical and virtual platforms
- Configure, maintain, and troubleshoot hardware, operating systems, and network interfaces
- Investigate and resolve system alerts, ensuring continuity of services
- Develop scripts to streamline and automate repetitive operational tasks
- Collaborate directly with stakeholders to identify, isolate, and resolve system-related issues impacting broader services
What Sets You Apart
- A collaborative mindset with a strong commitment to team success and shared outcomes
- Solid understanding of how systems, servers, and services interconnect within a broader IT ecosystem
- Advanced expertise in supporting both physical and virtual server environments
- Deep knowledge of access controls, permissions, and security practices to ensure appropriate and secure data access
- A proactive approach to identifying opportunities to leverage AI for operational efficiency, continuous improvement, and innovation
What You'll Experience
- Work with advanced and often highly classified technologies
- Be part of a forward-thinking team focused on innovation and exploration
- Pursue continuous learning opportunities aligned with emerging technologies
Qualifications
- Minimum of 8 years of relevant experience OR a Master's degree with 6+ years of experience
- Bachelor's degree in Computer Science, a related discipline, or equivalent experience
- Deep expertise in server-based operating systems
- Strong proficiency in Linux environments, containerization, and AI/ML infrastructure
- Proven ability to serve as a subject matter expert and mentor team members
- Advanced troubleshooting skills across operating systems, networking, and storage technologies
- Hands-on experience building, deploying, and maintaining enterprise-scale server environments
- Exposure to or experience working with AI/ML workloads is highly desirable
- Willingness to travel occasionally
Pay: From $130,000.00 per year
Benefits:
- 401(k) matching
- Dental insurance
- Flexible spending account
- Health insurance
- Life insurance
- Paid time off
- Vision insurance
Application Question(s):
- How have you tuned Linux systems to support high-performance AI/ML workloads?
- What’s your experience scaling containerized ML workloads with Docker and Kubernetes?
- How have you automated ML model deployment using CI/CD pipelines?
- Do you hold an active TS/SCI clearance with polygraph?
- Will you now or in the future require sponsorship to work in the USA?
Location:
- Scottsdale, AZ 85257 (Required)
Ability to Commute:
- Scottsdale, AZ 85257 (Required)
Work Location: In person