Article 2: Evaluating Large Language Models on Gender-Occupational Stereotypes Using the WinoBias Test