SYNTH Project

 

Luc De Raedt and his team at the lab for Declarative Languages and Artificial Intelligence at KU Leuven were awarded a prestigious Advanced Grant by the European Research Council. These highly competitive grants are designed to allow outstanding research leaders to pursue ground-breaking, high-risk projects in Europe. The SYNTH project starts on September 1, 2016 and will run until 2021.

Luc De Raedt introduces the SYNTH project:

It is a core artificial intelligence project. As you probably know, one of the goals of Artificial Intelligence is to develop machines that carry out and automate tasks that require intelligence. Now, doing science requires a lot of intelligence. It is therefore no surprise that AI researchers have tried to automate scientific reasoning and scientific processes. For instance, the robot scientist project aimed to do this for the life sciences. The robot scientist could autonomously select and perform experiments related to drug design. But so far, robot scientists have focussed almost exclusively on the life sciences.

What we want to do in the SYNTH project is to automate a subfield of AI itself. That field is data science. You could also call it data mining or machine learning; these are all closely related and study essentially the same problems. So, SYNTH is related to big data, that is, to techniques that can analyse data and discover new knowledge, which can then be used to make predictions. Machine learning, data mining and data science are really causing a revolution in society today. It has become so easy and so cheap to gather large amounts of data, and that data can then be used to extract knowledge. The question is no longer how to gather or store data, but how to analyse the data in order to find the right patterns and the right predictive models.

Data analysis is usually quite painful because there is so much data to consider, and one needs to select the right subset of the data, put that data in the right form, determine what the learning tasks will be, select the right algorithms, evaluate the results, ask the experts, etc. There are many steps that must be carried out. It is a real craft, and it requires the involvement of highly skilled data scientists.

The key research question that we want to answer in the SYNTH project is whether it is possible to automate or semi-automate the data science process. So, we want to develop techniques and tools that automate the different steps in this process. If we succeed, that will have a lot of applications: it will become much easier to analyse data and to develop applications of machine learning and data mining.

I am really very excited about this project. It is a big opportunity for our lab, and it also builds upon the expertise developed in our lab over the past 20-25 years. The expertise of Hendrik Blockeel will be especially important, as he has been working on many of the techniques we will need. He is the key collaborator in the project.

Many techniques will be needed in SYNTH. Obviously, we will be working in a relational setting, as most data is stored in relational databases and we have been developing logical and relational learning techniques for the past 20 years. We shall also build on inductive databases: these are databases that allow one to query not only for data but also for patterns and models, using expressive query languages. Other ingredients come from constraint programming and probabilistic programming. We have been using these for a while now in machine learning and data mining. Another ingredient is program synthesis and automatic programming. Here the idea is that one simply provides examples of the input-output behaviour of a program, and the program is then generated automatically! We will need all of these techniques to synthesise inductive data models.
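To make the idea of programming by example concrete, here is a minimal sketch in Python. It is not the SYNTH approach, only a toy illustration: it assumes a hypothetical four-operation language over integers (the names inc, dec, double and square are invented for this example) and finds a program by brute-force enumeration, trying shorter programs first.

```python
from itertools import product

# Hypothetical toy language: each primitive maps an integer to an integer.
# These names are invented for illustration only.
PRIMITIVES = {
    "inc": lambda x: x + 1,
    "dec": lambda x: x - 1,
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
}


def run(program, x):
    """Apply the primitives named in `program` to x, from left to right."""
    for name in program:
        x = PRIMITIVES[name](x)
    return x


def synthesise(examples, max_length=4):
    """Return the first (shortest) program consistent with all examples.

    `examples` is a list of (input, output) pairs. Programs are tuples of
    primitive names. Returns None if no program of at most `max_length`
    steps fits every example.
    """
    for length in range(max_length + 1):
        for program in product(PRIMITIVES, repeat=length):
            if all(run(program, i) == o for i, o in examples):
                return program
    return None


# The user supplies only input-output behaviour, never the program itself.
examples = [(1, 4), (2, 6), (5, 12)]
program = synthesise(examples)
print(program)           # the program that was found
print(run(program, 10))  # apply it to an input not among the examples
```

Exhaustive enumeration only works for tiny languages like this one, since the number of candidate programs grows exponentially with program length. Practical synthesis systems rely on much stronger search and pruning techniques.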

Last but certainly not least, we want to do useful science. That is, we do not want to do only theory; we will also build practical systems and evaluate them on real-life problems. In particular, we will look into sports analytics (with team member Jesse Davis) and rostering (with team member Patrick De Causmaecker).