'No AI' Tag and Site Scraping

A topic by Fals Fiction created 24 days ago Views: 361 Replies: 19
(+1)

The “No AI” tag is reassuring.

I hesitate to use it because greedy, grabby programmer types go after work that’s explicitly tagged as not being AI generated.

Is that a problem here? Is an asset, book, or game more likely to be added to a training dataset for generative programs if it’s tagged this way?

Deleted 23 days ago
(1 edit) (+2)

This was a deeply disappointing response.

  1. Asking the bot is not “asking at the source.” An actual source relevant to my question would be a person familiar with itch.io who is also involved in web scraping or building datasets for AI generators. All the bot knows is what’s been explained already by someone else, like in an FAQ.

  2. I’m disgusted my question inspired you to prompt the chatbot, wasting the energy (estimated) of a hundred standard web searches, and I’m annoyed you posted its meaningless output into my topic.

  3. Your answer missed the point of my asking. I’m aware people here can simultaneously answer the required yes/no question honestly and hide the status of AI-generated content. That’s why the “No AI” added to tags is reassuring. That’s why I want to specify with a tag that I’m not using that technology in my uploads! It’s easy to miss the filter in the site search.

I’ve had my images and my writing scraped without my permission for training datasets before. For a while, I stopped sharing my work online in frustration over the disregard for my copy and use rights, the breaking of trust between creators and our audience, and the abusive ways our data is used by AI developers.

Maybe— yes, it’s best to assume that because so many sneaky AI fans are on itch.io, and because a chatbot will quote pages here, then anything that’s easy to find and copy will be thrown into a dataset for these things. The “No AI” tag would be easy to send an outside bot through if there are no protections.

(+1)

100% agree, but this is ultimately a societal problem, not a technological one.

Even though I myself place some anti-AI features on my webpages, I don’t expect them to work indefinitely. It’s a cat and mouse game, same as adblocking or DRM, and the only way out of it is through spreading the message, something I don’t think is going to work anyway. The majority of people will always go the degenerate route, no matter the consequences.

Yeah, I also feel the same way about using the ‘No AI’ tags. I’m afraid it’ll make my projects more likely to be scraped, since scrapers want to avoid AI ‘inbreeding’.

For AI scrapers/crawlers specifically, itch.io doesn’t seem to implement any measures against them in the robots.txt, the HTML head meta tags, or the HTTP headers. At least as far as I know.
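If anyone wants to check a robots.txt themselves, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt text below is a made-up example, not itch.io's actual file, and `is_allowed` is just a helper name I picked. `GPTBot` is used only as an example of a crawler user agent.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration; NOT copied from itch.io.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/some-game"
    # The sample file blocks GPTBot everywhere and everyone else only under /private/.
    print("GPTBot:", is_allowed(SAMPLE_ROBOTS_TXT, "GPTBot", page))
    print("SomeBrowser:", is_allowed(SAMPLE_ROBOTS_TXT, "SomeBrowser", page))
```

Keep in mind that robots.txt is only a request. A well-behaved crawler honors it, but nothing forces a scraper to read it at all.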

It does not matter if you tag it or not.

See here

https://itch.io/games/tag-cats/tag-cooking/tag-no-ai

What do you think this shows?

(+2)

It shows a game of nnda appearing in the "no-ai" tag, despite not showing the "no-ai" tag in the information box of that game.

Ohhh, I get it.

Itch.io is sometimes adding the tag.

Uhm?

To my knowledge, all the no-ai tags and the ai-generated tags that you can see in the info boxes were selected by the developer as one of the 10 tags they can select or write in. Both of those tags were in use before the AI disclosure changes. And they can still be selected.

Just like you are considering manually using one of your 10 tags to say "no-ai".

For completion ;-)

https://itch.io/game-assets/tag-16x16/tag-backgrounds/tag-fantasy/tag-no-ai/tag-...

I’m confused….

(+1)

I assume this is about the tagging or lack of tagging and the project still showing up under the tag regardless.

The announcement of the feature:

https://itch.io/t/4309690/generative-ai-disclosure-tagging

If you select “yes,” then your page will automatically receive the AI Generated tag.
For projects that select “no,” they will receive the tag No AI.
Note: It’s not necessary to use these “automatic” tags manually when classifying your page. We may rename or modify them to fit future classification needs, so please correctly fill out the disclosure form instead of manually tagging.

"Receiving" the tags is not exactly correct. The projects will decidedly not get a visible tag. But they will be searchable via the tag search. This results in what you see: your project appears in tag-no-ai, but it will not have no-ai in the info box under tags. Same for the 5 AI tags. Any AI tags you do see were chosen manually as regular tags.

As leafo said, they might change the feature in the future. Maybe there will be a different button for filtering. Maybe they will display a meta information section with more detailed information in case a project is disclosed as having AI. Just like the info box currently expands on different kinds of existing meta information. The section about "made with" and "average session" and others will also only display if there is any information. That is how I elaborated in my wish below that it should look. Non-information should not be displayed, but information should. Not having AI is not information to me.

But unfortunately, they currently do not display the information of a game using AI. But to my reading of the quality guidelines, games should have had disclosure even before that disclosure feature. In the description. Maybe my reading is not correct and the context is different. But it still says this in the quality guidelines (in an easy to miss sentence, so maybe there is a context issue in my understanding.)

If your project involves automatic or AI generation, make sure it’s clearly stated in your project description

This whole implementation feels like a quick hack to make it work. A makeshift solution for that grace period with the assets that is mentioned in the announcement. So that users can already use some of that functionality.

(+2)

Ah, I see… Even though I didn’t explicitly put the No AI tag. My project is still shown under the No AI tags list, because I picked No in the AI generation disclosure.

Good news: The robots.txt does tell bots to stay out of user pages. Now that I’m looking at the file for itch.io again, I feel better about the quote in the earlier reply to my question.

Crawling bots are allowed on the site’s About and FAQ pages.

Wait, so projects that aren't tagged with No AI will make your games get scraped? Yikes

Moderator(+2)

I doubt it. AI scrapers hoover up everything they can. Including their own refuse.

You seem to have misread the question.

no-ai means that no ai was used. It is not a warding sign against ai scrapers to not visit.

So many misunderstandings here. I deleted my post above, it had an explanation too, but OP was offended because I asked an AI how AI are trained.

It kinda is explained here https://itch.io/t/4309690/generative-ai-disclosure-tagging , but people tagging their projects or browsing for things might not have read that. Even after reading, it is not that intuitive.

The AI disclosure creates virtual tags that are not visible on a page, but can be filtered with the tag system regardless. And because Itch has freestyle tagging, those virtual tags also exist as regular tags. I suspect that the OP stems from that quirk. The projects of OP are already tagged no-ai because of the AI disclosure, so musing over tagging them no-ai will lead nowhere. Unless OP wants a discussion about whether one should also tag it manually.

Thank you for deleting your first reply. Sincerely, that was a distressing response that other creators don’t need to read.

As for my question: you misunderstood. I am asking about the manually added tags. Those are visible when opening the arrowed meta box on a work page. I look at tags regularly. I would like to use them, but on other sites #noAI has been explicitly targeted by scrapers. So I was wondering about the history with the tag on this site.

Admin(+1)

As far as we know this has not happened on itch.io, no. While your concern is understandable, I do think you may be over-thinking it a bit. Unfortunately, just as spambots are an inevitability on the internet in 2025, AI scrapers are, as well -- and if someone really wanted to use your project as material for AI scraping, whether or not it's tagged "no-AI" doesn't really have much of an effect. With the way that AI scrapers are utilized on the internet, many of them are not discriminating as to whether or not content has or doesn't have AI before scraping from it. 

If you believe that a project on itch.io may be a result of AI scraping of your own content, please reach out to us and we will work to get your case handled.

(+1)

Sorry for distressing you. But I am afraid you will get only opinions and guesses here. The real scrapers are unlikely to show up here and talk business.

On those other sites the situation might be vastly different due to how they handle tags and how accurate those tags are. Filtering for #noai might be the scrapers' most reliable option there.

On Itch, manually adding the tag would not change what shows up when someone filters for no-ai. A project appears under the no-ai tag either because no-ai was selected manually or because the AI disclosure was filled out with "no".
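To spell that rule out, here is a small toy model in Python. It only illustrates the behavior as I understand it from the announcement and from what the tag pages show. The `Project` class and both function names are made up for this sketch and are not itch.io's actual code or API.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Project:
    """Hypothetical stand-in for a project page; not an itch.io data structure."""
    title: str
    manual_tags: Set[str] = field(default_factory=set)
    ai_disclosure: Optional[str] = None  # "yes", "no", or None if not filled out


def appears_under_no_ai(project: Project) -> bool:
    """A project is listed under no-ai via a manual tag OR a "no" disclosure."""
    return "no-ai" in project.manual_tags or project.ai_disclosure == "no"


def info_box_tags(project: Project) -> list:
    """Only manually chosen tags are shown in the info box."""
    return sorted(project.manual_tags)


if __name__ == "__main__":
    disclosed_only = Project("Game A", {"cats", "cooking"}, ai_disclosure="no")
    tagged_only = Project("Game B", {"cats", "no-ai"})
    for p in (disclosed_only, tagged_only):
        print(p.title, appears_under_no_ai(p), info_box_tags(p))
```

Both example projects end up in the same filtered list. The only difference is whether no-ai is visible in the info box.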

If a scraper bot looks for tags on the page on Itch, that means the bot creator knows about Itch tags and can just as well use the built-in tag filter system to prefilter the input for the scraper. That would be a lot more accurate too, since the 11 tags are not really meant for meta information.

And from a player's perspective (mine), seeing no-ai (as a tag) is not informative. That's the default. It is not information. If anything, I want to see detailed info about AI usage and not about non-AI usage. I also do not see that a Ren'Py game was not made with Unity and not made with RPG Maker and that no puppies were hurt in the making of it. I like the AI disclosure information on Steam with the detailed info, and that is only shown if there are AI things going on.

If a dev cares about these things and wants to make a statement about how no AI was used, I would prefer this to be positively worded. Saying that it was hand crafted by the developer. Or commissioned, with links to the artist, or whatever. no-ai can also mean pictures drawn in a sweatshop by child labor, to exaggerate. A lot of people who want to avoid AI do so for ethical reasons. Not having used AI does not mean that the things were created ethically. And I know of a professional studio that was very unethical, like not paying artists and such. At least that's what I heard about that studio.