zeroshot-detect

Drop an image and type any English noun, or several separated by commas, to see bounding boxes for each one. No class list, no fine-tuning.

Model: google/owlv2-base-patch16-ensemble · CPU inference: 5–15 s per image.
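For readers who want to reproduce this outside the page, here is a minimal sketch of one way to do it. It is not necessarily how this demo is implemented. It assumes the Hugging Face `transformers` zero-shot object detection pipeline and Pillow are installed. The helper names `parse_queries` and `detect`, the lowercasing and de-duplication of labels, and the default threshold of 0.1 are illustrative assumptions, not taken from this page.

```python
from typing import Any

MODEL_ID = "google/owlv2-base-patch16-ensemble"  # model named on this page


def parse_queries(text: str) -> list[str]:
    """Split a comma-separated query into clean labels (hypothetical helper)."""
    labels: list[str] = []
    for part in text.split(","):
        label = part.strip().lower()  # lowercasing is an assumption of this sketch
        if label and label not in labels:  # skip blanks and duplicates
            labels.append(label)
    return labels


def detect(image_path: str, query: str, threshold: float = 0.1) -> list[dict[str, Any]]:
    """Return detections as dicts with 'label', 'score' and 'box' keys."""
    labels = parse_queries(query)
    if not labels:
        return []  # nothing to look for, so skip loading the model

    # Imported lazily: these are heavy, and parse_queries works without them.
    from PIL import Image
    from transformers import pipeline

    detector = pipeline("zero-shot-object-detection", model=MODEL_ID, device="cpu")
    image = Image.open(image_path).convert("RGB")
    # Each result's 'box' holds xmin, ymin, xmax, ymax in pixel coordinates.
    return detector(image, candidate_labels=labels, threshold=threshold)
```

A call such as `detect("photo.jpg", "cat, remote control")` would download the model on first use and then return one entry per box that scores above the threshold. The image path here is a placeholder.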


Detections