Lead AI Inference Engineer, QVAC

Elevare Search • Anywhere

Any experience

Negotiable

Posted: 5 days ago

Other

Full-time

Job Summary

Tether is a global fintech innovator leveraging blockchain and advanced technologies to power secure digital finance systems. The company is seeking a Lead AI Inference Engineer to own and optimize the inference infrastructure for its QVAC platform, leading cross-functional teams to deploy high-performance, on-device AI systems that are scalable, reliable, and production-ready across diverse hardware environments.

Job Description

Tether is a leader in digital finance and advanced technology, combining blockchain innovation with AI-driven solutions to build scalable and secure global systems. Within its QVAC platform, Tether is developing next-generation on-device AI capabilities. The Lead AI Inference Engineer will play a key role in building and optimizing the runtime infrastructure that powers efficient, reliable AI inference across edge devices.

Responsibilities:
- Lead development of AI inference systems optimized for edge device performance
- Deploy machine learning models using frameworks such as llama.cpp, ggml, and ONNX
- Collaborate with researchers to transition models from research to production
- Integrate AI capabilities into products to enhance functionality and performance
- Manage a cross-functional team including C++, JavaScript, QA, and documentation engineers
- Ensure stable releases through structured processes and performance evaluation

Requirements:
- Strong programming expertise in C++
- Experience with inference engines such as llama.cpp and ggml
- Solid understanding of deep learning models including transformers and diffusion models
- Experience working with LLMs and deploying models to production environments
- Proven ability to manage small, specialized engineering teams
- Degree in Computer Science, AI, or related field with strong AI R&D experience

Benefits:
- Fully remote work with a globally distributed engineering team
- Opportunity to build state-of-the-art edge AI and peer-to-peer technologies
- Work on high-impact systems powering next-generation AI applications
- Collaborative environment focused on innovation and performance

Join Tether to help define the future of on-device AI inference and scalable machine learning systems.

Keyskills

C++, Machine Learning, LLMs, ONNX, ggml, llama.cpp, Deep Learning, Edge AI, JavaScript, Team Leadership
