Advances in natural language processing

Last year, Science published an excellent and well worth reading special issue on Artificial Intelligence (AI), which reviewed current trends and applications in many aspects of AI. One of the articles, “Advances in natural language processing” by Julia Hirschberg (Columbia University) and Christopher D. Manning (Stanford University), reviews some current application areas of interest in language research and describes very nicely how computational linguistics is nowadays increasingly being incorporated into consumer products. The article also explains why this is happening, which the authors summarise in four key factors: “(i) a vast increase in computing power, (ii) the availability of very large amounts of linguistic data, (iii) the development of highly successful machine learning (ML) methods, and (iv) a much richer understanding of the structure of human language and its deployment in social contexts.”

Given that a number of people in our research group (OEG) are interested in language technologies, I organised one of our “reading club over coffee” meetings about this paper a few weeks ago (see here and there for reports of other meetings). My colleagues Olga Giraldo, Julia Bosque-Gil, Pablo Calleja, Carlos Badenes, and I (Jorge Gracia) participated in the discussion, over good coffee and some chocolates to make the occasion even better.
[“Coffee” picture: our reading club meeting]

The discussion was open and informal as usual. I started by putting these three questions on the table:

  1. What is, in your view, the most remarkable aspect of the current state of things in natural language processing (NLP)?
  2. Can you mention any idea contained in the paper that you have found particularly appealing, motivating, interesting? (either for your own work or in general)
  3. How do you perceive the relation between the NLP and Semantic Web areas?
I will try to summarise some of our reflections below, without going into much detail (and in no particular order, as you will see).

1. What is, in your view, the most remarkable aspect of the current state of things in NLP?

Despite the recent progress in NLP, very well summarised in the paper, there are also interesting unresolved challenges. For Olga, Machine Translation is one of the most remarkable ones; the same goes for evaluation, as Carlos pointed out. For Pablo, sentiment analysis “beyond polarity” also poses many challenges. For me, one of the most remarkable aspects is how deep learning is boosting the field on many fronts.

From a historical perspective, the evolution of NLP can be summarised as follows: first, rule-based techniques were proposed to capture the regularities of human language; second, statistical analysis and bag-of-words models were used to explore frequently occurring patterns; and third, semantic and syntactic knowledge was added to such statistical models. Regarding the latter, we wondered how the authors’ ideas relate to the notion of “common sense”.
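To make the contrast between the first two stages concrete, here is a minimal toy sketch (my own illustration in Python, not taken from the paper): a hand-written rule versus a bag-of-words representation that simply counts words, ignoring their order. The rules and example sentence are invented.

```python
from collections import Counter

# Stage 1: a rule-based approach hand-codes linguistic patterns.
# These rules are hypothetical toy examples.
def rule_based_sentiment(text: str) -> str:
    text = text.lower()
    if "not good" in text:   # a rule capturing simple negation
        return "negative"
    if "good" in text:
        return "positive"
    return "unknown"

# Stage 2: a bag-of-words model discards word order and just counts
# occurrences, letting frequent patterns emerge from data instead.
def bag_of_words(text: str) -> Counter:
    return Counter(text.lower().split())

print(rule_based_sentiment("The plot was not good"))  # -> negative
print(bag_of_words("the plot was not good"))
# -> Counter({'the': 1, 'plot': 1, 'was': 1, 'not': 1, 'good': 1})
```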

But what is “common sense”, by the way? For Pablo, common sense is something that is usually not represented explicitly, something “immanent” to people. OK, so, for instance, does the capacity to recognise that an expression is ironic come from common sense? For Julia, you still need to know what the subject of the sentence is in order to interpret the expression as ironic. Carlos replied that, on the other hand, you do not need to know the grammatical rules: you can intuit what the subject is without knowing them. Unavoidably, this argument led us to Chomsky’s notion of “universal grammar” as well as Greenberg’s “linguistic universals”.

I pointed out that, if we are talking about making machines understand language, current probabilistic methods are working pretty well without explicit grammar rules. But for Julia, even if you do not represent “common sense”, you still need language to understand it: for instance, you still need grammatical rules for text generation, as well as for learning.

In general, we agreed that text models plus common sense models are needed to reach new horizons in NLP. The question is how to model common sense (are ontologies enough? Probably not).

2. Can you mention any idea contained in the paper that you have found particularly appealing, motivating, interesting? (either for your own work or in general)

Moving to the next question, deep learning is for Pablo one of the most interesting aspects of modern NLP, as it can overcome some known limitations of rule-based approaches. I mentioned that, in addition to deep learning for NLP, the notion of distantly supervised learning caught my attention. In fact, it matches intuitions I had some time ago, and I do not see it as difficult to reuse for my own work.
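For readers unfamiliar with the idea, here is a minimal sketch of distant supervision (my own toy illustration, not code from the paper): instead of hand-annotated examples, an existing knowledge base of relation triples is used to label sentences automatically, and the resulting (noisy) data can then train a relation classifier. The triples and sentences below are invented.

```python
# Toy knowledge base of (entity1, relation, entity2) triples.
KB = {
    ("Dublin", "capital_of", "Ireland"),
    ("Paris", "capital_of", "France"),
}

sentences = [
    "Dublin is the capital of Ireland.",
    "Paris hosted the conference in France.",
    "Madrid is far from Dublin.",
]

def distant_label(sentence: str):
    """Distant-supervision heuristic: if a sentence mentions both
    entities of a KB triple, assume (noisily!) that it expresses
    the triple's relation."""
    for e1, rel, e2 in KB:
        if e1 in sentence and e2 in sentence:
            return (sentence, e1, e2, rel)
    return (sentence, None, None, "no_relation")

# Automatically labelled training data; a classifier would be trained on it.
# Note the second sentence gets mislabelled: that is exactly the noise
# distant supervision has to cope with.
for example in map(distant_label, sentences):
    print(example)
```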

Carlos pointed out that the authors give a lot of weight to the notion of “context”, meaning external, non-textual features (such as gestures or intonation in oral communication). A lot remains to be explored in that direction.

Olga sees the potential of NLP techniques to infer rules and to extract domain knowledge from laboratory protocols.

For Julia, deep learning is appealing… but semantics and context are still the key. One still needs to understand grammar, discourse, etc., as the authors point out in their conclusions. She also likes the fact that “pure” linguists are moving more and more towards scientific and technological areas (e.g., researchers in pragmatics, poetry, etc. are introducing new technologies into their studies). See the POSTDATA project (Poetry Standardization and Linked Open Data) as a remarkable example.

But we did not talk only about appealing ideas; we talked about worrying ones as well. For instance, creating AI agents capable of giving “a sense of companionship” (to the elderly, for example). Isn’t that dangerous? Aren’t we relegating true company and communication to second place? We also mentioned the “intelligent” Mattel dolls, which were able to speak with children and learn from them, opening up a range of educational and moral dilemmas. Given its interest, though, we left the continuation of this discussion thread for a future “coffee”.

3. How do you perceive the relation between the NLP and Semantic Web areas?

As for me, hybrid approaches (machine learning combined with knowledge-representation-based systems) need further exploration in the near future. In that scenario, the standard representation and querying mechanisms that the Semantic Web provides can play an important role.
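Just to illustrate what those standard mechanisms look like in practice (a minimal sketch of my own, using the Python rdflib library; the vocabulary and data are invented), an NLP pipeline could consult an RDF graph through SPARQL, the Semantic Web’s standard query language:

```python
from rdflib import Graph, Namespace, RDF

# Build a toy RDF graph that an ML/NLP component could consult.
EX = Namespace("http://example.org/")
g = Graph()
g.add((EX.Dublin, RDF.type, EX.City))
g.add((EX.Dublin, EX.capitalOf, EX.Ireland))
g.add((EX.Paris, RDF.type, EX.City))
g.add((EX.Paris, EX.capitalOf, EX.France))

# Query the graph with SPARQL.
query = """
PREFIX ex: <http://example.org/>
SELECT ?city ?country WHERE {
    ?city ex:capitalOf ?country .
}
"""
for row in g.query(query):
    print(row.city, "is the capital of", row.country)
```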

Julia advocates for richer annotation mechanisms, resulting in a richer representation of context (e.g., annotation of dialogue intonation, sentiment, etc.). The Semantic Web could provide the common framework to make these annotations interoperable. The MASC corpus is a step in that direction, but there is much more to come.

Olga pointed out the interest of exploring the relation between NLP and SW in the context of “nanopublications”.

Carlos, for his part, opened another “old” debate: is research on the SW a means or an end in itself? Well, that depends on the type of research you do.

For Pablo (and myself), distantly supervised learning is an example of the overlap between current trends in NLP (based on statistical analysis) and the need for knowledge representation schemes (which the SW can provide in the form of ontologies).

[“Terminator” picture]

 

DIS-TOPIC FUTURE, or “what the hell did you put in our coffee, man?”

We are in an early phase of massive communication: what we have seen so far is nothing compared to what is to come. Companies now want to extract opinions from text, but soon they will want to do so from video and other media types, and multilingualism makes that challenge even bigger. As for future computer interfaces, we think they will rely more and more on NLP techniques, making our current means of interaction look “obsolete” sooner rather than later.

Eventually, our conversation drifted towards more “extreme” futuristic visions (unavoidable after more than an hour talking about machines and their capacity to understand us):

Is the human being going to become “dispensable” after the advent of what some people call the “singularity”? For illustration, hypothesis generation (a task that typically relies on human knowledge and intuition) is envisaged as feasible for machines in the short term, in the context of so-called Big Data.

More interestingly, in a not-so-distant future of machines talking to and “understanding” each other, is NLP going to be necessary at all? Actually, the rules of human languages play a “limiting” role in machine-to-machine interaction. Maybe at some point machines will be able to invent their own language (much more precise, unambiguous, and efficient), so they will be able to communicate under our radar, for their own purposes. Once they have encoded all human knowledge in their own formats, NLP will not be needed anymore. Instead of developing techniques for machines to understand us, we will have to decode the language they have created. Maybe to stop the robot revolution. Isn’t that scary?

 


CREDITS
Featured image taken from https://fabiusmaximus.wordpress.com
“Coffee” picture by the author (cc by)
“Terminator” picture by Slds, taken from Flickr (cc by-sa)