Job Details

Data Scientist - AI Evaluation

2026-04-16 Wizard all cities, AK

Description:

About Wizard

Wizard is the top-performing AI Shopping Agent, delivering the best products from across the web with unmatched accuracy, quality, and trust.

The Role

We're looking for a Data Scientist to own how we measure, understand and improve the accuracy of our AI agent. This role sits at the intersection of data science, machine learning and product and is focused on evaluation, experimentation and insight generation. You won't be building models but you will make sure they work in real world scenarios. You will build the systems to measure what good looks like and partner closely with ML, AI Engineering and Product to continuously improve the agent's performance.

What You'll Do

- Define and evolve accuracy metrics across the full shopping experience (retrieval, ranking, recommendations and outcomes)
- Design and run experiments to measure improvements and regressions
- Build and maintain evaluation datasets, benchmarks and scoring frameworks
- Translate ambiguous product questions into clear, measurable hypotheses and analysis
- Partner with ML Engineers to validate model changes and guide iteration
- Identify failure modes and edge cases and drive improvements through data
- Create dashboards and reporting that make agent performance visible, trusted and actionable

What Success Looks Like

- Clear, trusted accuracy metrics are consistently used across product and engineering
- A robust automated evaluation framework exists for both offline and live experiments
- Model and product changes are consistently measured before and after launch

Ideal Background

- 4-6+ years in Data Science, ML Evaluation or Applied AI or similar roles
- Deep experience evaluating AI/ML systems (ranking, recommendations, LLMs, etc.)
- Strong experience with experimentation (A/B testing, causal inference)
- Experience working on consumer products or user facing systems and exposure to marketplace or e-commerce systems
- Ability to translate messy problems into structured analysis and metrics
- Strong product mindset, you care about real user outcomes
- Clear communication with the ability to influence across engineering and product

Compensation & Benefits

The expected base salary range for this role is $225,000 - $280,000 USD, and will vary based on skills, experience, role level, and geographic location. Final compensation will be determined by considering these factors alongside overall role scope and responsibilities.

In addition to base salary, Wizard offers:

- Equity in the form of stock options
- Medical, dental, and vision coverage
- 401(k) plan
- Flexible PTO and company holidays
- Fully remote work within the United States
- Periodic company offsites and team gatherings

Wizard is committed to fair, transparent, and competitive compensation practices.
