Position Summary:

As a Robot Data Annotation Analyst, you will help shape the next generation of Physical AI foundation models. You’ll analyze and label videos of robot interactions with fine-grained precision, segmenting actions, identifying objects, and describing events according to detailed annotation specifications. Your work will directly support the training and evaluation of vision-language-action (VLA) models used to teach robots how to perceive and act in the world.

Job Details:

Work from Home
Monday to Friday | 9 AM to 6 PM Philippine Time (The schedule may change depending on workload and client requirements.)
US holidays are observed.

Responsibilities:

  • Annotate and Segment Videos: Accurately timestamp video segments according to discrete robot actions such as pick, move, place, push, and pull.
  • Identify and Classify Objects: Label target objects with the correct attributes (e.g., color, shape, and type/noun).
  • Action Attribution: Identify which robotic arm is used and apply the correct action verb for each event.
  • Generate Descriptive Captions: Write clear, concise natural language descriptions summarizing each segment’s action, object, and spatial context.
  • Evaluate Performance: Rate each action and the overall task as success, sub-optimal, or failure based on completion criteria.
  • Detect Idle Time: Identify and flag idle or non-action segments within videos.
  • Quality Control: Review annotations for accuracy, consistency, and linguistic quality; provide feedback for continuous improvement.
  • Documentation: Contribute to annotation guidelines, standard operating procedures (SOPs), and troubleshooting documentation.

Qualifications:

  • Exceptional attention to detail and a proactive, quality-focused mindset.
  • Strong written and verbal communication skills in English.
  • Ability to follow written and verbal instructions with high precision.
  • Experience with data labeling, linguistic annotation, or video analysis tools is an advantage.
  • Background or coursework in robotics, cognitive science, linguistics, or computer vision is a plus.
  • Demonstrated consistency, reliability, and self-motivation.
  • Openness to feedback and commitment to continuous improvement.
  • Ability to maintain accuracy and efficiency while working independently.
  • Typing speed of at least 40 words per minute is highly preferred.