• Zarxrax@lemmy.world
    English
    11 months ago

    While I get what you are saying, it’s pretty clear that his point was this: if you actually populate the dataset by downloading the images behind the links (which anyone using the dataset to train a model would need to do), then you have inadvertently downloaded illegal images.

    The article mentions repeatedly that the dataset itself is simply a list of URLs pointing to the images.