Did the image get copied onto their servers without a legal right to copy it? Then they violated copyright. Whatever they do after that isn’t the copyright violation.
And this is obvious because they could easily assemble a dataset with no copyright issues. They could also attempt to get permission from the copyright holders for many other images, but that would be hard and/or costly, and some would refuse. They want to use the extra images but don’t want to get permission, so they just take them, just like anyone else who would like an image but doesn’t want to pay for it.
A freely available and unencumbered binary (e.g., the model weights) isn’t the same thing as open-source. The source is the data. You can’t rebuild the model without the data, nor can you verify that it wasn’t intentionally biased or crippled.