Bullshit. One is generated from a prompt. The other starts with a real raw image. The CNN's inference process has some cosmetic similarities to filter convolutions, but they are entirely different pipelines. This is like saying cows and leather handbags have similar textures.

Three quick things: 

1. Go swear someplace else. 

2. You really think all that extra ML silicon in SoCs is for filter convolutions?

3. Here’s a close-up from the “real photo”. You think the chef has those lines across his head IRL?
