Mindrift is seeking a Senior Software Engineer in Test with strong full-stack development and Python expertise to help evaluate and improve AI coding systems. This project-based opportunity focuses on creating challenging coding tasks and comprehensive automated tests that assess the performance of AI-generated software solutions. Contributors will analyze how AI systems perform across complex development scenarios and provide insights that support the advancement of AI-assisted programming technologies. The role requires strong technical expertise in testing methodologies, software engineering practices, and full-stack development environments.
Responsibilities:
- Design and refine realistic coding tasks based on existing production codebases.
- Develop comprehensive functional and integration tests that validate end-to-end application behavior.
- Create challenging test scenarios that evaluate AI coding capabilities and reasoning skills.
- Analyze AI-generated outputs to identify strengths, weaknesses, and failure patterns.
- Collaborate with QA reviewers to refine test cases based on evaluation feedback.
- Continuously improve testing frameworks and evaluation strategies for AI systems.
Requirements:
- Degree in Computer Science, Software Engineering, or a related technical field.
- Minimum of 5 years of professional experience in software development with Python.
- Experience in full-stack development, including React-based interfaces and backend systems.
- Strong experience writing functional and integration tests.
- Familiarity with Docker containers and running local evaluation environments.
- Understanding of CI/CD workflows and tools such as GitHub Actions.
Benefits:
- Flexible project-based work schedule with remote participation.
- Opportunity to contribute to the advancement of AI technologies.
- Competitive hourly compensation based on expertise and contribution level.
- Exposure to complex software testing and AI evaluation projects.
Join Mindrift to help push the boundaries of AI coding systems through advanced software testing and technical evaluation.