Also, the alt texts vary in descriptiveness for that exact purpose. They’re meant to be useful for humans, not for training data.
What would a blind person rather have as the alt text:
(there are no photos here, for the blind people listening)
A cute Alsatian puppy looking into the camera with a dog toy in its mouth
A 14-week-old black/brown dog sitting on a tiled floor with a synthetic-rubber cuboid-cylindrical-shaped, blue-green-gradient chew toy in its mouth with its eyes and nose poised at a 30° angle towards the photographer’s origin. Each tile on the floor is approximately 1.47m^2 and is a pearlescent shade of off-white. There is an unidentifiable black speck on the first tile in the top left quadrant of the image. The cameraman’s fat finger is covering 1.97% of the bottom right quadrant. Focal length is set to 100mm. Exposure settings appear to be increased. The dog’s genitals are not visible.