copyright and hypermint

(a diffusion-based image generation AI)


copyright and diffusion-based AI have been at odds for quite a while. the conflict covers several issues, but this document focuses on one specific point:

  • artists do not want their artwork "stolen" or used without consent to train AI models.
which is entirely understandable.


copyright is a finicky thing. here's the basic rundown:

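  • in the US, copyright applies automatically as soon as an original work is created and fixed in some form. no registration is needed for it to exist.
  • works by a known author stay protected for the author's lifetime plus 70 years.
  • anonymous works, pseudonymous works, and works made for hire become public domain 95 years after publication, or 120 years after creation (whichever expires first).
this includes posts made on popular social media, such as twitter or instagram. any original artwork or media posted there is protected under general copyright law.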

so why is this such a large issue against AI?
people are using non-public-domain images and media to train their AI models, including but not limited to image generation models. because some training sets are so large, thousands or potentially millions of copyrighted artworks are used without the author's permission or even knowledge.


before getting deeper into what hypermint does to at least attempt to stay away from copyright infringement, let's talk about image EXIF data.

what is EXIF?

EXIF (meta)data is essentially a list of tags that most cameras attach to an image when it is captured. the tags hold information such as the capturing device, the capture date, and a host of other data.

what does this have to do with copyright?

EXIF has a tag specifically for identifying copyright. the tag is named Copyright, and it can be set to the artist's name, handle, or whatever works.

how to modify EXIF tags

EXIF data can be edited with a number of programs. popular choices are adobe lightroom, photoshop, GIMP, and ExifTool. each program has its own way of editing EXIF, so refer to its documentation for the exact steps. ExifTool is recommended for ease of use.
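
as a rough example of the ExifTool route, here is a minimal python sketch that calls ExifTool to set the tag. the function names (build_exiftool_command, set_copyright) and the file name are made up for illustration. -EXIF:Copyright= and -overwrite_original are regular ExifTool options, but check the ExifTool documentation before relying on this.

```python
import shutil
import subprocess


def build_exiftool_command(image_path, holder, keep_backup=True):
    """Return the ExifTool argument list that sets the EXIF Copyright tag.

    Hypothetical helper for illustration only.
    """
    command = ["exiftool", f"-EXIF:Copyright={holder}"]
    if not keep_backup:
        # By default ExifTool keeps the untouched file as "<name>_original".
        command.append("-overwrite_original")
    command.append(image_path)
    return command


def set_copyright(image_path, holder):
    """Run ExifTool on one image. Assumes ExifTool is installed and on PATH."""
    if shutil.which("exiftool") is None:
        raise RuntimeError("ExifTool is not installed or not on PATH")
    subprocess.run(build_exiftool_command(image_path, holder), check=True)


# example usage (hypothetical file name):
# set_copyright("my_artwork.jpg", "mintea")
```

passing the arguments as a list (instead of one shell string) means a name with spaces or quotes in it is written as-is.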

note that setting this tag is not a replacement for formal copyright protection (such as registering the work) and should not be treated as one. the next section details specifically how EXIF data prevents hypermint, and by extension any of my models, from using copyrighted artwork in training data.

mintea's "method"

hypermint currently reads the EXIF data of each image and looks for a Copyright tag that is not blank. if one is found, the image is tagged as copyrighted and, as a result, excluded from any training data hypermint pulls from various sources.

it is a quick and easy way to check for copyrighted images. hopefully EXIF data becomes a more normal addition to artistic works, since many art creation programs do not include it by default.
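
to make the idea concrete, here is a minimal python sketch of this kind of check. it is not hypermint's actual code: every function name here is made up for illustration, it only handles JPEG files, and it only looks at the Copyright tag (id 0x8298) in the first EXIF directory. make_sample_jpeg exists only to build tiny test inputs.

```python
import struct

COPYRIGHT_TAG = 0x8298  # EXIF/TIFF tag id for "Copyright"
ASCII_TYPE = 2          # TIFF field type for ASCII strings


def find_exif_block(jpeg):
    """Return the TIFF-formatted EXIF payload of a JPEG, or None if absent."""
    if jpeg[:2] != b"\xff\xd8":  # not a JPEG (missing start-of-image marker)
        return None
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            return None
        marker = jpeg[pos + 1]
        if marker in (0xDA, 0xD9):  # image data or end of file: stop looking
            return None
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        payload = jpeg[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            return payload[6:]
        pos += 2 + length
    return None


def read_copyright(tiff):
    """Return the Copyright string from the first directory, or ''."""
    if len(tiff) < 8 or tiff[:2] not in (b"II", b"MM"):
        return ""
    e = "<" if tiff[:2] == b"II" else ">"  # byte order of the EXIF block
    (ifd,) = struct.unpack(e + "I", tiff[4:8])
    if ifd + 2 > len(tiff):
        return ""
    (entries,) = struct.unpack(e + "H", tiff[ifd:ifd + 2])
    for i in range(entries):
        entry = tiff[ifd + 2 + 12 * i:ifd + 14 + 12 * i]
        if len(entry) < 12:
            break
        tag, field_type, count = struct.unpack(e + "HHI", entry[:8])
        if tag != COPYRIGHT_TAG or field_type != ASCII_TYPE:
            continue
        if count <= 4:  # short values are stored inside the entry itself
            raw = entry[8:8 + count]
        else:           # longer values live at an offset in the block
            (offset,) = struct.unpack(e + "I", entry[8:12])
            raw = tiff[offset:offset + count]
        return raw.replace(b"\x00", b" ").decode("ascii", "replace").strip()
    return ""


def is_marked_copyrighted(jpeg):
    """True if the image carries a non-blank EXIF Copyright tag."""
    tiff = find_exif_block(jpeg)
    return bool(tiff) and read_copyright(tiff) != ""


def keep_for_training(images):
    """Given {name: jpeg_bytes}, return the names that are NOT marked."""
    return [name for name, data in images.items()
            if not is_marked_copyrighted(data)]


def make_sample_jpeg(copyright_text):
    """Build a tiny JPEG-like byte string with one EXIF entry (demo only)."""
    value = copyright_text.encode("ascii") + b"\x00"
    if len(value) <= 4:
        value_field, extra = value.ljust(4, b"\x00"), b""
    else:
        value_field, extra = struct.pack("<I", 26), value  # data after the IFD
    tiff = (b"II" + struct.pack("<HI", 42, 8) + struct.pack("<H", 1)
            + struct.pack("<HHI", COPYRIGHT_TAG, ASCII_TYPE, len(value))
            + value_field + struct.pack("<I", 0) + extra)
    payload = b"Exif\x00\x00" + tiff
    app1 = b"\xff\xe1" + struct.pack(">H", len(payload) + 2) + payload
    return b"\xff\xd8" + app1 + b"\xff\xd9"


samples = {"a.jpg": make_sample_jpeg("mintea"), "b.jpg": make_sample_jpeg("")}
print(keep_for_training(samples))
```

one thing this makes obvious: an image with no EXIF at all, or with a blank tag, passes the check, which is why the tag has to be set for the method to protect a work.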

also note, i am not claiming that this method is solely my own, and i am not claiming that all AI models use this method, nor am i claiming that this will become standard in the future. this is simply a document on hypermint's method to avoid copyright infringement, and my stance on the creation of this model in reference to copyright, so readers may be informed.

- mintea