The term 'Artificial Intelligence' is so ill-defined as to be near-useless.
We have only recently come to accept that 'real' intelligence comes in many forms. To me, 'intelligence' is the capability to take input and process it in such a way that it is understood.
People with Asperger's can display a lack of cognitive empathy - a capacity sometimes generalised as 'emotional intelligence'. It fits well here because the inputs are there but the person cannot really process them in a way that produces actual understanding.
Likewise, there are people (some of those on the 'spectrum') who don't understand facial expressions. They can be taught to recognise certain expressions and develop rules, but these are applied inflexibly and so miss subtlety and context.
That is very much like what happens with most AI projects: they can be taught (programmed) rules, but these get applied without any real understanding. Hence the odd responses that even a non-native speaker wouldn't give.
The question, of course, comes to what it is to 'understand' something.
For much 'intelligence', I believe that understanding is rooted in our ability to place things into the context of our own experiences. For language, we build up this context gradually, starting from a very young age. It's a vast and complex structure with fluid links and convoluted dependencies. New experiences may subtly redefine our understanding or connection with certain words because those words are really just labels for concepts.
There is a similar requirement for the ability to process images. When you see an image you can identify the individual components - people, buildings, trees, cars, cats, etc... - because you have an understanding that each of these things can exist in isolation or in different contexts. We know that a sign is not part of a building because we have experiences of signs without buildings and buildings without signs, and know that signs are added to and removed from buildings.
The point is that our intelligence comes from our ability to place items correctly (or even inventively) in relation to others.
There may come a time when we can imbue a computer with this kind of knowledge - perhaps force-feeding it masses of raw data, such as somehow piping the whole of the Internet into it. Full texts of innumerable works; dictionary and thesaurus definitions; billions of photos and images; movie scripts and synopses; blogs; news; videos of cats on pianos, dogs on surfboards and people on each other; encyclopedia entries, research papers and textbooks; court transcripts; religious texts; product descriptions, political speeches and song lyrics.
It's amazing the breadth of things we humans process and remember and can draw upon to help us assess a familiar situation, interpret a new one, or even to imagine fantastical and even impossible things.
This is especially true in the interpretation of ambiguity. For example, when faced with the name 'Sam', I might assume a male but someone else - say, with a wife named 'Samantha' - might assume a female. If the name referred to a nurse then, having been in hospitals before, most people would assume a female. If the context were "I took Sam for a walk" then we would assume a dog.
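The brittleness of rule-based 'understanding' described above can be made concrete. This is a minimal sketch, not a real NLP technique: the contexts, cue words and guesses are all illustrative assumptions, and the function fails in exactly the inflexible way the text describes whenever the context falls outside its hard-coded rules.

```python
def guess_sam(context: str) -> str:
    """Guess what 'Sam' refers to from a few hard-coded context cues.

    A deliberately naive, rule-based approach: each rule encodes one of the
    assumptions from the text, applied without any real understanding.
    """
    context = context.lower()
    if "walk" in context:
        return "dog"    # "I took Sam for a walk" -> probably a pet
    if "nurse" in context:
        return "woman"  # the hospital-experience assumption from the text
    return "man"        # the default guess, absent any other cue

print(guess_sam("I took Sam for a walk"))     # dog
print(guess_sam("Sam is the nurse on duty"))  # woman
```

Note how quickly it breaks: "Sam walked the nurse home" still returns "dog", because a keyword rule has no notion of who is doing what to whom.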
One big hurdle with 'AI' is that many things that are obvious to most people are near-impossible to determine programmatically.
For example - how would you program a computer to identify stage directions in a play or script? Admittedly, that is apparently difficult for humans* as well...
* - Or demigods if you like.
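To see why the stage-direction question is hard, here is one naive heuristic a programmer might reach for. It is a sketch under a big assumption - that directions sit alone on a line wrapped in (parentheses) or [brackets] - which real scripts, relying on italics, indentation and convention, routinely violate.

```python
import re

# Assumed convention: a stage direction is a whole line wrapped in
# parentheses or square brackets. Many real scripts won't follow this.
STAGE_DIRECTION = re.compile(r"^\s*[(\[].*[)\]]\s*$")

def is_stage_direction(line: str) -> bool:
    """Return True if a line looks like a bracketed stage direction."""
    return bool(STAGE_DIRECTION.match(line))

script = [
    "HAMLET: To be, or not to be...",
    "(He draws his sword.)",
    "[Exit, pursued by a bear]",
]
for line in script:
    print(is_stage_direction(line), "->", line)
```

The rule works on the toy script but, like the facial-expression rules earlier, it is applied without understanding: a parenthetical aside inside dialogue would be misclassified, and an unbracketed direction missed entirely.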