Bounding boxes coordinates
#13
by
ljoana
- opened
What rescaling should be done so that the bbox coordinates are matching the original image? I am seeing some mismatches but can't seem to figure what's the issue.
The bbox_2d coordinates are x1, y1, x2, y2 rather than x,y,w,h. And they will be relative to your resized image size if you are resizing. For example:
image = Image.open(image_path)
img_width, img_height = image.size
max_size = 1280
if max(image.size) > max_size:
ratio = max_size / max(image.size)
new_size = tuple(int(dim * ratio) for dim in image.size)
# set each dimension to be a multiple of 28
new_size = tuple(int(dim // 28) * 28 for dim in new_size)
image = image.resize(new_size, Image.LANCZOS)
img_width, img_height = image.size
then in the messages:
{
"role": "user",
"content": [
{
"type": "image",
"image": f"file://{image_path}",
"resized_width": img_width,
"resized_height": img_height,
},
.....