This is a hard topic to discuss at surface-level depth, for me at least.
In the context of LLMs, I would define ethical AI as an LLM trained on data owned by, or licensed to, the entity training it.
The legal debate happening outside that scope is interesting, and I'm personally conflicted, which adds complexity to the previous statement.
On one side, it is theoretically "learning" the way humans can. On the other, it is basically predictive text that learned its predictive process by consuming published works.
If published works were put on the internet for anyone to consume, is that not like putting your trash out at the street? It's intentional publication for anyone to dig through, including bots. We just didn't see all the bots or what they were doing. But just because something was published freely doesn't negate IP/copyright law.
If it's paid content, and someone pays to access it within the license terms and learns from it (programmatically or as a human), is that not legal?
Circling back, the easy place for me to land is this: if the output of the LLM doesn't violate the license, it's legal.
The ethics of business are gray, so if it's legal, is it ethical? Maybe not by the standards of the people who feel screwed over, but are those same people OK with using it? If so, then, like me with my conflicted/double-standard position, I think we may need to accept that it's ethical if we choose to use it en masse, because by participating we are accepting it.
Ethics are also subjective, moving targets, and separate from legality, although there can be overlap. Something that was accepted as ethical in the past may not be now due to changes in societal norms, and for other things the opposite may be true.
Summarize that beyond "it's complicated"!