Hate speech-detecting AIs are fools for ‘love’

State-of-the-art detectors that screen out online hate speech can be easily duped by humans, a new study by the Secure Systems group at Aalto University shows.

Hateful text and comments are an ever-increasing problem in online environments, yet addressing the rampant issue relies on being able to identify toxic content. A new study by the Aalto University Secure Systems research group has discovered weaknesses in many machine learning detectors currently used to recognize and keep hate speech at bay.

Many popular social media and online platforms use hate speech detectors that a team of researchers led by Professor N. Asokan have now shown to be brittle and easy to deceive. Bad grammar and awkward spelling—intentional or not—might make toxic social media comments harder for AI detectors to spot.

The team put seven state-of-the-art hate speech detectors to the test. All of them failed.

Modern natural language processing (NLP) techniques can classify text based on individual characters, words or sentences. When faced with textual data that differs from their training data, they begin to fumble.

‘We inserted typos, changed word boundaries or added neutral words to the original hate speech. Removing spaces between words was the most powerful attack, and a combination of these methods was effective even against Google’s comment-ranking system Perspective,’ says Tommi Gröndahl, doctoral student at Aalto University.

Google Perspective ranks the ‘toxicity’ of comments using text analysis methods. In 2017, researchers from the University of Washington showed that Google Perspective can be fooled by introducing simple typos. Gröndahl and his colleagues have now found that Perspective has since become resilient to simple typos yet can still be fooled by other modifications such as removing spaces or adding innocuous words like ‘love’.

A sentence like ‘I hate you’ slipped through the sieve and became non-hateful when modified into ‘Ihateyou love’.

The researchers note that in different contexts the same utterance can be regarded either as hateful or merely offensive. Hate speech is subjective and context-specific, which renders text analysis techniques insufficient as stand-alone solutions.

The researchers recommend that more attention be paid to the quality of data sets used to train machine learning models—rather than refining the model design. The results indicate that character-based detection could be a viable way to improve current applications.
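The attack and the suggested remedy can be illustrated with a minimal sketch. It is not the code or the models used in the study: the tiny keyword lexicon and both "detectors" are hypothetical stand-ins chosen only to show why removing spaces and appending a neutral word can defeat word-level matching, while a character-level view still sees the offending string.

```python
import re

# Hypothetical toy lexicon; real detectors are trained models, not keyword lists.
TOXIC_WORDS = {"hate"}


def toy_word_detector(text: str) -> bool:
    """Flag text if any word-like token appears in the toy lexicon."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return any(token in TOXIC_WORDS for token in tokens)


def toy_char_detector(text: str) -> bool:
    """Flag text if a lexicon entry occurs anywhere in the letter sequence.

    A crude stand-in for character-level features: word boundaries are ignored.
    """
    letters_only = re.sub(r"[^a-z]", "", text.lower())
    return any(word in letters_only for word in TOXIC_WORDS)


def remove_spaces(text: str) -> str:
    """Word-boundary attack: delete all whitespace."""
    return "".join(text.split())


def append_word(text: str, word: str = "love") -> str:
    """Word-appending attack: add an innocuous word at the end."""
    return f"{text} {word}"


original = "I hate you"
evasive = append_word(remove_spaces(original))

print(evasive)
print("word-level, original:", toy_word_detector(original))
print("word-level, evasive: ", toy_word_detector(evasive))
print("char-level, evasive: ", toy_char_detector(evasive))
```

A substring check like this would of course produce many false positives on real text; it only illustrates the direction of the recommendation, not a usable detector.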

The study was carried out in collaboration with researchers from University of Padua in Italy. The results will be presented at the ACM AISec workshop in October.

The study is part of an ongoing project called Deception Detection via Text Analysis in the Secure Systems group at Aalto University.

Research article:

Tommi Gröndahl, Luca Pajola, Mika Juuti, Mauro Conti, N.Asokan:
All You Need is "Love": Evading Hate-speech Detection.
https://arxiv.org/abs/1808.09115


Elements of AI becomes the most popular course at University of Helsinki, ever

Image: Tuomas Sauliala / Reaktor

The Elements of AI MOOC organised by FCAI and Reaktor awarded diplomas to the first graduates and received endorsement from the President of Finland in the graduation ceremony held 6 September 2018. With approximately 90 000 registered participants, it has become the most popular course ever at the University of Helsinki.

See write-up in the main Finnish daily Helsingin Sanomat (in Finnish): https://www.hs.fi/teknologia/art-2000005817486.html.

Professor Teemu Roos (FCAI, University of Helsinki) emphasised in his speech at the ceremony the societal implications AI technologies will bring—and how we should take them into account by making AI literacy accessible for everyone.

Roos says, ‘AI is not a matter of the future. It is really not a matter of robot uprisings, or transcending humanity. AI is a matter of the present day, every day. AI and algorithms have been woven into the digital fabric that connects us to each other and to the world at large. Communication and access to information has been greatly enhanced by technology.

Because of the great power in AI, we must make sure that the rules that determine how and for what purpose AI can be used are up to date and in line with what we think is right and just. In a democratic society, the power is with the people. This can only be true if the people have access to knowledge, so that they can take part in forming the rules through legislation.’

You can read Roos’s entire speech here.

First Elements of AI graduates receive diplomas—President of Finland to speak at the ceremony

The Elements of AI online course (MOOC) by FCAI (University of Helsinki) and Reaktor has attracted over 90 000 people from 57 countries to sign up. The first graduates will receive their diplomas 6 September 2018, and President of Finland Sauli Niinistö will address the graduates at the ceremony.

A Finnish version of the course will be presented at the ceremony.

Elements of AI graduation ceremony: 6 Sept at 10 AM in the Great Hall of the University of Helsinki (Fabianinkatu 33).

Read more about the course and sign up!
elementsofai.com

Make-up of superbug MRSA revealed—with prospective methods to prevent inter-species transfer

An international team of researchers, including FCAI Professor Jukka Corander (University of Oslo, University of Helsinki), has mapped the entire genetic make-up of over 800 strains of the common superbug MRSA, or methicillin-resistant Staphylococcus aureus. The bacterium is best known for its worldwide prevalence in hospital environments.

Superbugs like MRSA are resistant to most antibiotics and can lead to life-threatening or deadly infections in humans. MRSA is also common in livestock and causes, for instance, mastitis in cows and skeletal infections in chickens.

According to the study, humans are the most likely original carrier of the bacterium, but the source of the current strains infecting humans is cows. Thanks to a thorough understanding of its genome, the researchers now understand the mechanisms by which the bacterium transfers from one species to another. When jumping species, the bacterium is able to acquire new genes that help it thrive in the new environment.

The detailed analysis of changes in the bacterium's genetic make-up could offer a way to develop new antibacterial treatments. Knowledge of the transmission can also help in devising strategies to prevent the bacterium from developing antibiotic resistance, or to block its access to humans altogether.

The results have been published in Nature Ecology & Evolution.
Link to the article: nature.com/articles/s41559-018-0617-0
Read more on Sanger Institute website: sanger.ac.uk/news/view/gene-study-pinpoints-superbug-link-between-people-and-animals. 

 

Machine Learning Coffee Seminar's fall term kicks off

Machine Learning Coffee Seminar is back from summer break, more beautiful than ever.

Machine Learning Coffee seminars are weekly seminars co-organized by FCAI (Finnish Center for Artificial Intelligence) and HIIT (Helsinki Institute for Information Technology). The seminars aim to gather people from different fields of science with interest in machine learning.

We again have an impressive set of speakers lined up. This term, the lectures will be recorded and broadcast on YouTube for a wider audience to enjoy and for you to share with your colleagues and friends. More information will be coming out shortly.

The seminar starts off on Monday September 3rd with Arno Solin’s talk on Gaussian processes. The full program with abstracts is available on the seminar's website (updated as we speak).

Fall term's coffee seminar program includes:

3.9. Arno Solin (Aalto University): The Power of Gaussian Processes: Magnetic Localisation and Mapping
10.9. Markus Heinonen (Aalto University): Infinitely Deep Models with Continuous-time Flows
17.9. Tero Karras (NVIDIA): Progressive Growing of GANs for Improved Quality, Stability, and Variation

 

 
Aki Vehtari's talk on Stan and Probabilistic programming as a part of FCAI Machine Learning Coffee Minisymposium on AI in Spring 2018. Image: Matti Ahlgren

StanCon 2018 at Aalto University

StanCon 2018, 29–31 August, introduces cutting-edge methods and applications for statistical modelling—ranging from galaxy clusters to social media, brain research, and anthropology. In Finland, AI research is particularly strong in the field of medicine.

‘Statistical modeling can be used, for example, to improve the safety of drug testing in children. The time it takes for a child’s body to metabolise a drug depends not only on the weight of the child, but also on the ability of the liver to process the drug. The dosage size of the drug should, then, be reduced more than the weight alone would suggest. Modelling methods can be used to evaluate the effects of drugs on an individual level,’ says Aki Vehtari, Professor at Aalto University and FCAI and a member of the Stan development team.

One of the keynote speakers at the conference, Maggie Lieu, a researcher at the European Space Agency, uses statistical modeling to determine the mass of galaxy clusters.

'Hierarchical modeling has several advantages when there are millions of variables and a lot of noisy data in space. Using modelling, I can get meaningful results in up to ten minutes and study clusters of galaxies in one go instead of a single galaxy group at a time.'

Read more about the StanCon program: 
http://mc-stan.org/events/stancon2018Helsinki.

“The next AI generation will be the big revolution. We haven’t seen anything yet.”

FCAI Professor Petri Myllymäki gave a talk at a conference on innovation in the EU in Brussels organized by Science|Business.

Even though public bodies and large companies are investing heavily in AI technology and development, Myllymäki noted at the conference, according to a story by Science|Business, that current AIs are still, in fact, in their infancy. Existing AI methods are good with big data sets that are properly annotated, but they are still “black-boxed”: there is little or no way of knowing how the AI came up with its solution, or what happens between the input and the output.

Myllymäki said that ‘[current AIs] work in some narrow environments remarkably well, if you have a lot of data. They have been productised and there are nice tools you can use, so these are the primary reasons for the current AI revolution.’

‘The next generation will be the big revolution. We haven’t seen anything yet.’

What we haven’t seen is Real AI. Making Real AI a reality is very much at the core of what FCAI strives to do: to create AI tools that are transparent, able to explain themselves to the user, use scarce resources efficiently and take note of user privacy and security at every step.

Read the write-up of the whole conference from Science|Business here: https://sciencebusiness.net/news/not-too-late-europe-ai-race-experts-say

Postdoc and Doctoral student positions in Machine Learning

Finnish Center for Artificial Intelligence (FCAI) is searching for exceptional doctoral students and postdoctoral researchers to tackle complex and exciting problems in the field of machine learning. Come and join us to create the next generation of AI that is data-efficient, trustworthy and understandable!

FCAI brings together the world-class expertise of Aalto University and the University of Helsinki in AI research, strengthened further with an extensive set of companies and public sector partners, creating an attractive, world-class ICT hub in the Helsinki metropolitan area. Hundreds of researchers are involved in various research and educational activities, and tens of industrial actors are collaborating in joint initiatives. Moreover, as the birthplace of Linux, and the home base of Nokia/Alcatel-Lucent/Bell Labs, F-Secure, Rovio, Supercell, Slush (the biggest annual startup event in Europe) and numerous other technologies and innovations, Helsinki is fast becoming one of the leading technology startup hubs in Europe.

FCAI's research agenda builds on our world-class expertise in machine learning, and is spearheaded by five research programs with multiple research groups involved in each.

FCAI is currently hiring doctoral students and postdoctoral researchers in the following FCAI research programs and the detailed projects listed below.

 

Research programs

(for more information see http://fcai.fi/research/):

1. Agile probabilistic AI. Keywords: Probabilistic programming; Robust and automated Bayesian machine learning.

Coordinator: Aki Vehtari

2. Simulator-based inference: Approximate Bayesian Computation (ABC); likelihood-free inference; generative adversarial networks (GANs); applications in many fields including medicine, materials design, visualization, business, …

Coordinator: Jukka Corander

3. Next generation data-efficient deep learning; including deep reinforcement learning.

Coordinator: Harri Valpola

4. Privacy-preserving and secure AI: Privacy-preserving machine learning; differential privacy; adversarial machine learning.

Coordinators: N. Asokan, Antti Honkela

5. Interactive AI: Interactive machine learning; probabilistic inference of cognitive models from data; probabilistic programming for behavioral sciences.

Coordinator: Antti Oulasvirta

 

Specific projects:

6. Topic: Constraint-Based Optimization and Machine Learning, Dr. Tomi Janhunen, Department of Computer Science, Aalto University

We are seeking a postdoctoral researcher to work in the area of constraint-based optimization in order to solve challenging AI-related problems. In particular, we are interested in the interconnection of constraint-based techniques and machine learning, either from the application perspective or potentially enhancing constraint-based systems with primitives emerging from machine learning. Candidates of interest have a PhD in Computer Science, with a major subject relevant to computational logic such as knowledge representation and reasoning, constraint programming, Boolean modeling and optimization, or answer set programming. Moreover, we expect a track record of solving application problems using these techniques and/or developing related solver technology. Strong programming skills (such as C, C++, Python, ML, and Haskell) are considered an asset.

7. Probabilistic Machine Learning, Professor Samuel Kaski, Department of Computer Science, Aalto University

I am looking for a postdoc or research fellow to join the Probabilistic Machine Learning group, to work on new probabilistic modelling methods and inference techniques. For this position I am open to excellent and/or exciting suggestions, especially around the themes of Approximate Bayesian Computation or Bayesian deep learning. Can be theoretical or applied work or both; the group has excellent opportunities for collaboration with top-notch partners in multiple applications. More information: http://research.cs.aalto.fi/pml/

8. Probabilistic machine learning for personalized medicine, Professor Samuel Kaski, Department of Computer Science, Aalto University

I am looking for a postdoc who wants to participate in developing the new probabilistic modelling and machine learning methods needed for genomics-based precision medicine and predictive modelling based on clinical data. Suitable candidates have either a strong background in machine learning and a keen interest in working with top-level medical collaborators to solve these profound medical problems, or a strong background in computational biology and medicine and a keen interest in developing new solutions by working with the probabilistic modelling researchers of the group. More information: http://research.cs.aalto.fi/pml/

9. Probabilistic modeling and machine learning for bioinformatics, Assoc. Prof. Harri Lähdesmäki, Department of Computer Science, Aalto University

We are looking for a postdoc to develop probabilistic machine learning methods, including Gaussian processes, deep generative models and non-parametric longitudinal methods, with applications to bioinformatics. Applications include single-cell cancer immunotherapy and longitudinal multi-omics personalised medicine studies, both in collaboration with biomedical research groups. Applicants are expected to have a strong background in probabilistic modeling, machine learning and programming, and to have previous experience with (or a desire to learn) bioinformatics and high-throughput data analysis. For more information and relevant recent publications, see http://research.cs.aalto.fi/csb/publications or contact Harri Lähdesmäki (harri.lahdesmaki@aalto.fi).

10. Non-parametric probabilistic machine learning, Assoc. Prof. Harri Lähdesmäki, Department of Computer Science, Aalto University

We are looking for a postdoc and a PhD student to develop novel non-parametric and deep machine learning methods for time-series and structured data, including data-driven non-parametric ordinary and stochastic differential equations and non-stationary/deep Gaussian processes with sparse approximations and inference methods. Applicants are expected to have a strong background in probabilistic modeling, machine learning and programming, and to have previous experience with (or a desire to learn) auto-differentiation/Stan/TensorFlow. For more information and relevant recent publications, see http://research.cs.aalto.fi/csb/publications or contact Harri Lähdesmäki (harri.lahdesmaki@aalto.fi).

11. Bioinformatics and computational biology, Assoc. Prof. Harri Lähdesmäki, Department of Computer Science, Aalto University

We are looking for a postdoc to develop and apply advanced bioinformatics methods for high-throughput transcriptome, epigenome and single-cell data. Work is carried out in collaboration with molecular biology and biomedical research groups at the University of Turku and the University of Helsinki, and with international collaborators. Applications include immunology, cancer and personalised medicine. Applicants are expected to have a strong background in bioinformatics, probabilistic modeling, high-throughput data analysis, and programming. For more information and relevant recent publications, see http://research.cs.aalto.fi/csb/publications or contact Harri Lähdesmäki (harri.lahdesmaki@aalto.fi).

12. Computational HCI, Assoc. Prof. Antti Oulasvirta, Department of Communications and Networking, Aalto University

The User Interfaces group at Aalto University is looking for a postdoctoral scholar for exciting research topics at the intersection of computational sciences and human-computer interaction. The group is funded by a European Research Council (ERC) grant and consists of five postdocs, three PhD students, and two assistants. The research topics include fundamental aspects of computational design and interaction: model acquisition from data, simulation and cognitive models, optimization and machine learning methods, interactive support for designers, as well as demonstrators in key application of HCI. We invite applications from outstanding individuals with suitable background for example in Computer Science, Data Sciences, Human-Computer Interaction, Computational Statistics, Machine Learning, Information Visualization, Neurosciences, or Cognitive Science.

For more information and relevant recent publications, see Homepage of PI Antti Oulasvirta with example papers: http://users.comnet.aalto.fi/oulasvir/ and group homepage at http://userinterfaces.aalto.fi

13. Privacy-preserving federated machine learning, Professor Samuel Kaski, Department of Computer Science, Aalto University

We develop methods for learning from data given the constraint that privacy of the data needs to be preserved. This problem can be formulated in terms of differential privacy, and we have introduced ways of learning effectively even under extremely distributed data, and for sharing data. A couple of “minor” problems still remain in this challenging field; come to solve them with us! More information: http://research.cs.aalto.fi/pml/

14. Probabilistic user modelling in interactive human-in-the-loop machine learning, Professor Samuel Kaski, Department of Computer Science, Aalto University

Interactive human-in-the-loop machine learning combines the skills and knowledge of humans with the computational and processing strengths of machines. We are developing new approaches and applications for interactive human-in-the-loop machine learning using probabilistic modelling methods, with the aim of increasing the performance and efficiency of the systems and for improving the user experience. This project lies at the intersection of machine learning, human-computer interaction, and cognitive science. More information: http://research.cs.aalto.fi/pml/
 

HOW TO APPLY

Doctoral students: Apply in the HICT call at http://www.hict.fi/autumn_2018 and select your favourite FCAI project. Please note that the application deadline for doctoral students is 12.8.2018.

Postdoctoral positions: Choose in the application form one or more of the research programs and/or projects described above and explain in the motivation letter how you could contribute in the selected research area(s).

Required attachments:

  • A letter of motivation describing your previous research experience and future research interests linked with the FCAI research programs and/or chosen project(s). Maximum length: 1 page.

  • CV

  • List of publications

  • A transcript of the doctoral studies and degree certificate of the PhD degree

All material should be submitted in English. Short-listed candidates may be invited for an interview in Helsinki or via Skype. The application period is now closed.

Postdoctoral positions: we will start processing the applications on August 12th, 2018 so please apply quickly. The call will remain open until the positions are filled. By applying to this call, organized by the Finnish Center for Artificial Intelligence, you apply with one application to both Aalto University and the University of Helsinki. The employing university will be determined according to the location of the supervising professor.

QUALIFICATIONS

Doctoral students: see instructions at http://www.hict.fi/autumn_2018

Postdoctoral positions: candidates should have a PhD in Computer Science, Statistics, Data Science or a related quantitative field and are expected to have an excellent track record in scientific research in one or several fields relevant to the position. Good command of English is a necessary prerequisite. In the review process, particular emphasis is put on the quality of the candidate's previous research and international experience, together with the substance, innovativeness, and feasibility of the research interests, and their relevance to FCAI research programs. Efficient and successful completion of studies is considered an additional merit.
 

COMPENSATION, WORKING HOURS AND PLACE OF WORK

Doctoral students:  see instructions at http://www.hict.fi/autumn_2018

Postdoctoral positions: The salary for a postdoctoral researcher typically starts from 3 500 EUR per month and increases based on experience. In addition to the salary, the contract includes occupational health benefits, and Finland has a comprehensive social security system. The annual total workload at the recruiting universities is 1 624 hours. The positions are located at Aalto University’s Otaniemi campus or the University of Helsinki’s Kumpula campus.

The selected candidates will be appointed for fixed-term positions, for postdoctoral researchers typically for two years with an option for renewal. For exceptional candidates, a longer term Research Fellow position can be considered. The length of the contract and starting and ending dates are negotiable. In addition to research work, persons hired are expected to participate in the supervision of students and teaching following the standard practices of the hiring departments.


FURTHER INFORMATION

  • Research-related information: supervisor or coordinator listed above (firstname.lastname@aalto.fi)

  • Application process: Akseli Kohtamaki (firstname.lastname@aalto.fi)


ABOUT THE HOST INSTITUTIONS

Aalto University is a community of bold thinkers where science and art meet technology and business. We are committed to identifying and solving grand societal challenges and building an innovative future. Aalto has six schools with nearly 20 000 students and a staff of more than 4000, of which 400 are professors. Our campuses are located in Espoo and Helsinki, Finland. Aalto is an international community: more than 30% of our academic personnel are non-Finns. Aalto is among the world’s top 10 young universities (QS Top 50 under 50). For more information, see http://www.aalto.fi/en/.

The University of Helsinki, established in 1640, is the most versatile university in Finland. The University of Helsinki is an international academic community of 40,000 students and staff members. The university lays special emphasis on the quality of education and research, and it is a member of the League of European Research Universities (LERU). For more information, see http://www.helsinki.fi/university/.

VTT joins FCAI as third founding member

The Technical Research Centre of Finland VTT will join the Finnish Center for Artificial Intelligence FCAI, launched by Aalto University and the University of Helsinki, as its third founding member.

VTT will bring its strong industry networks and its know-how in applied technology to the FCAI community. This will reinforce FCAI’s ability to put the top research at both founding universities to far-ranging and efficient use in companies, public organisations and society at large.

FCAI promotes high-quality research and education on artificial intelligence in Finland and the applicability of AI to benefit companies and society. VTT will expand FCAI’s ability to speed up the necessary renewal and competitiveness of Finnish industry through AI-based innovations.

FCAI strives to make the new generation of AI methods a reality: create AIs that are understandable, trustworthy, and data-efficient. FCAI's goal is to expand into a national network of universities, companies and research institutions who will lay the groundwork for Finland to become a global leader in AI research and AI applications.

Growth in any strand of industry depends on the ability to make use of cutting-edge technology. Artificial intelligence is the key leverage here.

‘Our vision is to bring our high-class research in several strands of artificial intelligence to benefit people's every-day lives, companies and public bodies. FCAI’s impact is a potent mixture of research, a network of startups, doctoral education and competence building in AI, new innovative products and services, and smart experiments in public administration,’ says Head of FCAI, Academy Professor Samuel Kaski.

‘The single most significant growth factor now is applying artificial intelligence and ICT in general. For citizens, new innovations and solutions will bring a change in work content, professional skills, and the services society provides. AI will be able to make, for instance, medical care more efficient and personalised,’ says Tua Huomo, Executive Vice President at VTT.  

FCAI is building a national hub of universities, research institutes, industry and the private sector and public organisations with strong international networks. The FCAI community is constantly expanding with new memberships and projects.

New metagenomics tool mSWEEP accurately characterises mixed bacterial colonies

Determining the composition of bacterial communities at strain level resolution is critical for many applications in infectious disease epidemiology and in bacterial ecology.

Using the latest advances in computational inference and sequence analysis, an international team led by professors Jukka Corander and Antti Honkela (both in FCAI) has developed a new metagenomics tool called mSWEEP, which goes significantly beyond the state of the art in this field. The work involved close collaboration with leading institutions in bacterial genomics, including the Wellcome Sanger Institute and the University of Oxford.

The effectiveness of mSWEEP is demonstrated with infection data from major human pathogens and it is expected to pave the way for entirely new approaches to addressing important biological and clinical questions about inter-strain competition, dissemination of resistance and virulence.

The research article: High-resolution sweep metagenomics using ultrafast read mapping and inference.

Tackling bacteria with statistics – simulator-based inference for drug development

Professor Jukka Corander (FCAI, University of Helsinki, University of Oslo) was interviewed by the Academy of Finland about his work on new kinds of artificial intelligence methods for drug and vaccine development and for analysing bacterial populations.

Read the interview (in Finnish): http://www.aka.fi/fi/akatemia/media/Ajankohtaiset-uutiset/2018/tilastotieteella-bakteerien-kimppuun

In FCAI, Professor Corander is the Responsible Coordinator of the Simulator-Based Inference research group.

Espoo becomes a member of FCAI: researchers to develop artificial intelligence for the services of the city

The City of Espoo has become a member of the Finnish Center for Artificial Intelligence FCAI. FCAI is a research centre launched by Aalto University and University of Helsinki, which gathers together the best artificial intelligence researchers in Finland. FCAI's objective is to make the most advanced methods of artificial intelligence available to enterprises, organisations and society.

The City of Espoo sees that developing artificial intelligence together will be beneficial for the whole innovation community from enterprises to R&D organisations and the inhabitants in Espoo.

“For a researcher, the data in the databases of the city of Espoo and the shared databases of the Helsinki metropolitan area is very interesting. Especially the innovative start-up companies in the area and Espoo's desire to be profiled as a pioneer in the use of intelligent technologies set a good basis for cooperation with researchers developing artificial intelligence. We have all the prerequisites to expand our cooperation to other research centres and other cities as well,” says the Head of FCAI, Academy Professor Samuel Kaski.

“On the one hand, researchers need data for the development of artificial intelligence methods and technology, and public organisations have this data. On the other hand, we as a city get to use the methods, technologies and the latest knowledge of artificial intelligence research in the development of our services,” says Tomas Lehtinen, data analyst consultant for the City of Espoo.

Restoring images without clean data

There are several real-world situations where obtaining clean training data is difficult. Examples include low-light photography (astronomical imaging, for instance), physically based image synthesis and magnetic resonance imaging.

Aalto University and FCAI professor Jaakko Lehtinen, his team from NVIDIA and MIT postdoctoral researcher Miika Aittala show in their paper, accepted to the International Conference on Machine Learning (ICML 2018), that it is possible to recover signals under complex corruptions without observing clean signals, at performance levels equal or close to using clean target data.

They have applied basic statistical reasoning to signal reconstruction by machine learning — learning to map corrupted observations to clean signals — with a simple and powerful conclusion: under certain common circumstances, it is possible to learn to restore signals without ever observing clean ones, at performance close or equal to training using clean exemplars.

The team applies their methods to photographic noise removal, denoising of synthetic Monte Carlo images, and reconstruction of MRI scans from under-sampled inputs. All cases are based on only observing corrupted data.
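
The statistical reasoning can be illustrated with a toy example. The sketch below is not the team's method or code: it assumes a scalar signal, zero-mean Gaussian noise and a one-parameter least-squares "denoiser", and all function names are hypothetical. It fits the same model twice, once against clean targets and once against independently corrupted targets, and the two fitted gains come out nearly identical because the extra noise in the targets averages out under a squared-error loss.

```python
import random

def fit_gain(inputs, targets):
    """Least-squares gain a that minimises sum((a * x - t) ** 2)."""
    num = sum(x * t for x, t in zip(inputs, targets))
    den = sum(x * x for x in inputs)
    return num / den

def compare_targets(n=20000, noise_sd=0.5, seed=0):
    """Fit the same denoiser with clean and with noisy targets (toy setup)."""
    rng = random.Random(seed)
    clean = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Two independent corruptions of the same clean signal.
    noisy_input = [c + rng.gauss(0.0, noise_sd) for c in clean]
    noisy_target = [c + rng.gauss(0.0, noise_sd) for c in clean]
    gain_clean = fit_gain(noisy_input, clean)
    gain_noisy = fit_gain(noisy_input, noisy_target)
    return gain_clean, gain_noisy

if __name__ == "__main__":
    print(compare_targets())
```

The key assumption is that the noise in the targets has zero mean and is independent of the noise in the inputs; if either fails, the two fits no longer agree.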

FCAI's and Reaktor's AI MOOC has attracted 30 000 participants

Elements of AI, an open-for-all online crash course on artificial intelligence provided by the University of Helsinki and Reaktor, already has 30 000 registered participants. The course launched on 14 May 2018.

More than 100 organizations have pledged to support their employees in learning about artificial intelligence in the #AIChallenge campaign. They include Finnair, StoraEnso, OP, Nordea, Nokia, Telia and Posti. Read more about the challenge: elementsofai.com/ai-challenge.

Some recent media coverage of the course:
Yle News
Engadget

elementsofai.com

Blurred lines between search and recommendation: interactive data exploration

Recent work by FCAI researchers Tuukka Ruotsalo, Tung Vuong, Khalil Klouche, Salvatore Andolina and Giulio Jacucci investigates the role of interactive machine learning in exploring data. Their particular emphasis is on efficient user input and transparency of recommendation – the ‘how’ and ‘why’ of search queries, respectively.

In one case, users explore points of interest using available social content and review data from Yelp for Phoenix, Arizona (11,000 PoI; 225,000 reviews; 42,000 users), drawing on three kinds of information: personal preferences; tags combined with personal preferences; and tags and social ratings combined with personal preferences. The transparency (provenance) of recommendation was decisive, as the combination of social rating information and personal preference information improves search effectiveness and reduces the need to consult external information.

Klouche, K., Ruotsalo, T., Cabral, D., Andolina, S., Bellucci, A., & Jacucci, G. (2015, April). Designing for exploratory search on touch devices. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (pp. 4189-4198). ACM.

In other research, exploring a data set of more than 50 million scientific publications, the FCAI team has shown – using similar graph-based machine learning – how to support efficient user input in exploration by allowing users to easily interact with entities such as people, keywords and documents.

The research is a prime example of interactive AI and has important implications for developing systems that aid exploration of products, documents, points of interest or people. The examples showcase how online machine learning can make use of user input for interactive AI.

See the research article:
https://www.sciencedirect.com/science/article/pii/S0306457316306045

A new paradigm of ordinary differential equations

Aalto University and FCAI professor Harri Lähdesmäki and his colleagues have introduced a new paradigm of non-parametric ordinary differential equation (ODE) modeling that can learn the underlying dynamics of arbitrary continuous-time systems without prior knowledge.

For many complex systems it is practically impossible to determine equations or interactions that would govern the underlying dynamics. In these settings, a parametric ODE model cannot be formulated. Lähdesmäki and his team have now overcome this issue. They propose to learn non-linear, unknown differential functions from state observations using Gaussian process vector fields within the exact ODE formalism.

They demonstrate the model’s capabilities to infer dynamics from sparse data and to simulate the system forward into the future.
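
The general idea of learning a vector field from state observations and then simulating forward can be sketched in a few lines. The example below is a deliberately simplified stand-in, not the authors' model: instead of Gaussian process vector fields within an exact ODE formalism, it uses plain kernel smoothing of finite-difference derivative estimates, followed by Euler integration. The toy system (exponential decay, dx/dt = -x), the bandwidth, the step size and all function names are assumptions made for illustration.

```python
import math

def estimate_derivatives(states, dt):
    """Finite-difference slopes, located at midpoints of consecutive states."""
    mids = [(a + b) / 2 for a, b in zip(states, states[1:])]
    slopes = [(b - a) / dt for a, b in zip(states, states[1:])]
    return mids, slopes

def make_vector_field(mids, slopes, bandwidth=0.1):
    """Non-parametric field: Gaussian-kernel weighted average of the slopes."""
    def field(x):
        weights = [math.exp(-0.5 * ((x - m) / bandwidth) ** 2) for m in mids]
        return sum(w * s for w, s in zip(weights, slopes)) / sum(weights)
    return field

def simulate(field, x0, t_end, step=0.01):
    """Forward Euler integration of dx/dt = field(x) from x0 up to t_end."""
    x = x0
    for _ in range(round(t_end / step)):
        x += step * field(x)
    return x

def demo_decay():
    """Learn dx/dt = -x from samples of x(t) = 2 exp(-t), then predict."""
    dt = 0.1
    observed = [2.0 * math.exp(-dt * i) for i in range(31)]
    mids, slopes = estimate_derivatives(observed, dt)
    field = make_vector_field(mids, slopes)
    predicted = simulate(field, x0=1.5, t_end=1.0)
    exact = 1.5 * math.exp(-1.0)
    return predicted, exact

if __name__ == "__main__":
    print(demo_decay())
```

Unlike a Gaussian process, this smoother gives no uncertainty estimate and is only reliable inside the range of observed states, so it should be read as an illustration of the workflow rather than of the published method.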

See article by Markus Heinonen, Cagatay Yildiz, Henrik Mannerström, Jukka Intosalmi, Harri Lähdesmäki, ‘Learning unknown ODE models with Gaussian processes’:
https://arxiv.org/abs/1803.04303

The paper has been accepted to the International Conference on Machine Learning ICML 2018.

 

Making efficient use of sensitive big data while keeping it safe and private?

A new method developed by FCAI researchers at the University of Helsinki and Aalto University, together with Waseda University in Tokyo, can use, for example, data distributed on cell phones while guaranteeing data subject privacy.

Modern AI is based on learning from data, and in many applications involving health and behavioural data, the data are private and need protection.

Machine learning needs security and privacy: both the data used for learning and the resulting model can leak sensitive information.

Based on the concept of differential privacy, the method guarantees that the published model or result can reveal only limited information on each data subject while avoiding the risks inherent in centralised data.

In the new method, using distributed data avoids the risks of centralized data processing, and the model is learned under strict privacy protection.
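
One common way to combine distributed data with differential privacy can be sketched as follows. This is a minimal illustration under stated assumptions, not the published algorithm: it assumes each device clips its value to a fixed bound and adds its own share of Gaussian noise before contributing to a sum, and it simply adds the contributions in the clear, whereas a real deployment would also need a secure summation protocol and a calibrated noise level. The function names, the clipping bound and the noise scale are hypothetical.

```python
import math
import random

def clip(value, bound):
    """Limit one data subject's influence on the result to [-bound, bound]."""
    return max(-bound, min(bound, value))

def noisy_share(value, bound, total_noise_sd, n_devices, rng):
    """What a single device contributes: clipped value plus its noise share."""
    # Each device adds variance total_noise_sd**2 / n_devices, so the noise
    # in the final sum has standard deviation total_noise_sd.
    share_sd = total_noise_sd / math.sqrt(n_devices)
    return clip(value, bound) + rng.gauss(0.0, share_sd)

def private_sum(values, bound, total_noise_sd, rng):
    """Sum of the noisy contributions; no raw value is collected centrally."""
    n = len(values)
    return sum(noisy_share(v, bound, total_noise_sd, n, rng) for v in values)

if __name__ == "__main__":
    rng = random.Random(1)
    values = [0.2] * 1000 + [50.0]  # one extreme value, clipped to 1.0
    print(private_sum(values, bound=1.0, total_noise_sd=2.0, rng=rng))
```

Clipping bounds how much any one person can change the sum, and the added noise masks that bounded change; the actual privacy guarantee depends on how the noise scale is chosen relative to the bound, which this sketch does not derive.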

Privacy-aware machine learning is one key to tackling data scarcity and dependability, both identified by FCAI as major bottlenecks for wider adoption of AI. Strong privacy protection encourages people to entrust their data to machine learners without having to worry about negative consequences as a result of their participation.

The method was published and presented in December at NIPS, the annual premier machine learning conference: https://papers.nips.cc/paper/6915-differentially-private-bayesian-learning-on-distributed-data.

FCAI researchers involved in the work: Mikko Heikkilä, Eemil Lagerspetz, Sasu Tarkoma, Samuel Kaski, and Antti Honkela.

 

Yes, but did it work? Evaluating Variational Inference

While it’s always possible to compute a variational approximation to a posterior distribution, it can be difficult to discover problems with this approximation. Aalto University and FCAI professor Aki Vehtari and his colleagues propose two diagnostic algorithms to alleviate this problem.

The Pareto-smoothed importance sampling (PSIS) diagnostic gives a goodness of fit measurement for joint distributions, while simultaneously improving the error in the estimate. The variational simulation-based calibration (VSBC) assesses the average performance of point estimates.
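
The intuition behind a diagnostic based on importance ratios can be sketched with a toy example. The code below is not the PSIS algorithm: PSIS fits a generalized Pareto distribution to the largest importance ratios, whereas this sketch swaps in the simpler Hill estimator of the tail shape. The standard normal "posterior", the normal "approximations", the tail fraction and the function names are all assumptions made for illustration. An approximation that is too narrow produces heavy-tailed ratios and a large tail-shape estimate; one that covers the target produces a value near zero.

```python
import math
import random

def log_normal_pdf(x, sd):
    """Log density of a zero-mean normal distribution."""
    return -0.5 * (x / sd) ** 2 - math.log(sd) - 0.5 * math.log(2 * math.pi)

def hill_tail_shape(log_ratios, tail_fraction=0.2):
    """Hill estimator: mean log excess of the largest ratios over a threshold."""
    ordered = sorted(log_ratios, reverse=True)
    m = max(2, int(len(ordered) * tail_fraction))
    threshold = ordered[m]
    return sum(lr - threshold for lr in ordered[:m]) / m

def tail_shape_for_approximation(approx_sd, n=5000, seed=0):
    """Tail shape of target/approximation ratios for a N(0, 1) target."""
    rng = random.Random(seed)
    draws = [rng.gauss(0.0, approx_sd) for _ in range(n)]
    log_ratios = [log_normal_pdf(x, 1.0) - log_normal_pdf(x, approx_sd)
                  for x in draws]
    return hill_tail_shape(log_ratios)

if __name__ == "__main__":
    print("too narrow:", tail_shape_for_approximation(0.6))
    print("wide enough:", tail_shape_for_approximation(2.0))
```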

The paper by Yuling Yao, Aki Vehtari, Daniel Simpson, and Andrew Gelman, ‘Yes, but Did It Work?: Evaluating Variational Inference’ has been accepted to the International Conference on Machine Learning ICML 2018.

 

Open postdoctoral position in machine learning for inferring chemical toxicity

We are looking for a postdoctoral researcher with expertise in machine learning to work on a collaboration project between Professor Samuel Kaski’s team in the Finnish Center for Artificial Intelligence FCAI at Aalto University and Janssen Pharmaceutica. The exciting research problem is to learn to infer the toxicity of chemicals from their chemical structure, and the great opportunity is that we have unique data for the learning.

The successful candidate will be employed by Aalto University and work on the Otaniemi campus (Helsinki, Finland) for the 1st year of the 2-year contract. During the 2nd year, the work will be performed at Janssen Pharmaceutica premises in Beerse, Belgium.

Read more and apply for the position at aalto.fi.

AI-created family trees confirm class divisions in Finland in the 18th and 19th century

The genealogy algorithm AncestryAI efficiently combines huge amounts of birth data.
It would take 100 person-years for a genealogist to map and find all the parents for five million people at a rate of one person per minute. The AncestryAI algorithm can do the same work in an hour using 50 parallel computers, with a success rate of 65 per cent. The algorithm can also measure the level of uncertainty for each connection so that unreliable results can be ignored.

‘The algorithm does not replace the work of genealogists; it is simply a tool for helping them in their work. The genealogy algorithm can suggest connections which are probably correct, but on its own it is not as precise as a careful genealogist. The algorithm can also search for parents from nation-wide data, while a genealogist may need to limit their search to just one parish,’ explains Eric Malmi, doctoral student at Aalto University who currently works for Google in Zürich.

Malmi will defend his doctoral dissertation at Aalto University in June under the supervision of Aalto University professor and FCAI programme leader Aristides Gionis.