The dataset is a massive part of the AI ecosystem, used by Google and Stable Diffusion. The removal follows discoveries made by Stanford researchers, who found thousands of instances of suspected child sexual abuse material in the dataset.
While I get what you are saying, it’s pretty clear his point was that if you actually populate the dataset by downloading the images the links point to (which anyone actually using the dataset to train a model would need to do), then you have inadvertently downloaded illegal images.
It is mentioned repeatedly in the article that the dataset itself is simply a list of URLs to the images.
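For context, "populating" a URL-only dataset just means fetching whatever each link points to. Here is a minimal sketch of that step, assuming a plain text file with one image URL per line; the file name and output layout are made up for illustration:

```python
import os
import requests

os.makedirs("images", exist_ok=True)

# Assumed format: one image URL per line in a plain text file.
with open("image_urls.txt") as f:
    urls = [line.strip() for line in f if line.strip()]

for i, url in enumerate(urls):
    try:
        resp = requests.get(url, timeout=10)
        resp.raise_for_status()
        # Whatever is hosted at the URL ends up on disk, sight unseen.
        with open(f"images/{i}.jpg", "wb") as out:
            out.write(resp.content)
    except requests.RequestException:
        continue  # dead links are skipped, but anything still live gets downloaded as-is
```

The point of the sketch: the downloader has no way to know what it is saving until it has already saved it, which is why training on the dataset means pulling down every image behind those links.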