Not much point in this anyway
As has been said, most forums have filters to block the obvious and the less obvious (forum mods generally know the language of their posters, with all its variations in spelling). Add to this that the MOST abusive posts are generally in pretty good English, not requiring any swearing or nasty words.
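To make that concrete, here is a rough sketch of the kind of word filter being talked about. The blocklist and both example posts are made up for illustration, and a real forum filter will be far bigger and tuned by its mods, but the weakness is the same: it only sees words, not intent.

```python
import re

# Hypothetical blocklist, invented for this example only.
BLOCKLIST = {"idiot", "moron", "stupid"}


def is_blocked(post: str) -> bool:
    """Return True if any blocklisted word appears as a whole word."""
    words = re.findall(r"[a-z']+", post.lower())
    return any(word in BLOCKLIST for word in words)


# Two made-up posts: one crude, one polite on the surface.
crude = "You are an idiot."
polished = "I admire how you never let facts get in the way of a confident opinion."

print(is_blocked(crude))     # True: caught by the word list
print(is_blocked(polished))  # False: no banned words, still an insult
```

The second post sails straight through, which is the whole problem with judging a post by its vocabulary alone.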
The idea is to hurt the person you are trolling, and generally this is accomplished by making them feel small. Physical size doesn't come into it, so the attacker can generally beat their target into submission by putting up a post that is just "better written" than anything the target can respond with. (A case of the small people being able to pick their battlefield and attack from a position of strength.)
Check out some of the threads on El Reg for examples: those in the lower leagues tend to be left looking like the only surviving brain transplant donor by those who can string a few non-swearing insults together.
And then we have the final issue with language: words having their meaning changed, or just having a generally accepted double meaning. Those mugs at the AI lab will have a hard time working out what's an actual attack by only looking at the language being used. (See what I did there? English is great for this kind of thing.)
Syntax and placement of words are almost as important as the words themselves. But I guess that is the point of using this as a training ground for an AI. After all, a self-learning system would come out of this exercise either totally broken and crying or able to work for the IGN forums as a mod.