European AI Night: Why Should AI Be Explainable?

1. Introduction

Interpretable, or Explainable, Artificial Intelligence (“AI”) has become an important topic for the software vendors and users working in this space in today's business world. As AI has an increasing impact on day-to-day activities, trust, transparency, liability and auditability have become prerequisites for any project deployed at a large scale.

A workshop was organized on this theme at the 2019 European AI Night in Paris. France Digitale and Hub France IA welcomed four noteworthy French AI players to examine why they are now increasingly focused on Explainable AI (“XAI”): Bleckwen, D-Edge, Craft.ai and Thales.

Three AI use cases, already running in production today, were presented, demonstrating how explainable AI can be leveraged to build better, more efficient and more usable tools for corporate projects.

2. Presentation

Interpretability is about communication: it is mandatory to know the end users' activities and processes in order to adapt the presentation of the results to their needs.

Yannick Martel, Chief Strategist at Bleckwen

Created in 2016, Bleckwen is a French fintech leveraging Behavioural Analytics and Machine Learning to help banks and financial institutions fight fraud. Up to now, the adoption of Artificial Intelligence in a critical sector such as Financial Services has been limited. Yannick Martel believes that interpretability is a key success factor in AI adoption, as experts, customers and compliance officers all need a better understanding of the results of algorithmic models to establish a trustful collaboration with technology-based solutions such as Bleckwen's.

A significant challenge is ensuring providers give the best explanations to clients, choosing among all mathematically correct explanations those that match clients' thought processes and activities. This is a key strength of the Bleckwen platform, as Yannick outlined and illustrated during the discussion. Another challenge, as Yannick explained, is to make the platform's decision making clear, highlighting the factors that lead to each decision so that users understand it. In Bleckwen's case, Yannick showed how this thinking guided the design process and how it fostered trust in the resulting detection processes.

Explanations are mandatory when AI empowers humans to perform complex tasks

Antoine Buhl, CTO @D-Edge

D-Edge offers SaaS solutions for hotels and hotel chains. 11,000 hotels in Europe and Asia use the D-Edge solution to optimize their distribution. D-Edge uses Artificial Intelligence alongside statistical models to improve room pricing and to predict reservation cancellations.

Choosing the right price for a room is very complicated and requires combining numerous elements (rooms already sold, competitors' prices, nearby events, and so on), including external events that cannot be foreseen. Antoine Buhl took the example of the ongoing “Gilets Jaunes” crisis in France, which started at the end of 2018 and caused an unusual and significant rise in hotel cancellation rates. What looks like an “AI bug” can be analysed effectively if the AI tells the revenue manager that it does not recognise those elements. D-Edge faces another challenge as well: assessing, even after the fact, whether a room price was optimal is nearly impossible in an ever-evolving environment.

The D-Edge solution presents recommendations, but the final decision belongs to the Revenue Managers. To make the right choices in this evolving and complex environment, Revenue Managers need explanations of the suggestions. Adoption is key in this cooperation between humans and machines. At D-Edge, they measure how Revenue Managers use the recommended prices to continuously quantify this adoption (both the quality of the suggestion and the quality of the explanation). More and more, they see Revenue Managers letting the AI adjust the price proposal autonomously, based on the explanations and other parameters.

Without interpretability, predictions have no value

Caroline Chopinaud, CCO @ Craft.ai

Craft.ai offers Explainable AI as a service to empower product and operational teams to develop and run XAI projects. Craft.ai processes data streams to automate business processes, enable predictive maintenance or boost user engagement. Caroline explained how Dalkia uses Craft.ai to improve the efficiency of its energy managers by providing them with detailed analyses and recommendations. Explainability is a prerequisite; without it, human specialists would need to re-investigate to understand the results, negating the efficiency gain. That is only one illustration among others of why explainability is key for AI deployment, and it is the reason why Craft.ai builds its own whitebox Machine Learning models!

When it comes to creating AI for critical systems, trustability and certifiability are mandatory

David Sadek, VP Research, Innovation & Technology @ Thales

David Sadek presented the challenges faced by Thales as they create AI for critical systems: space, communications, avionics, defence…

A key issue is building trust between machines and the people who collaborate with them. It is critical to consider how explanations are delivered: for instance, through a conversational interface able to dialogue in natural language and using explanatory variables that matter to the operators. Another significant field where explainability is key is the certification of autonomous vehicles. While the inner workings of current algorithmic models are opaque, being able to understand their decisions will be critical to certify such systems: why an obstacle was perceived, why an identified shape was not considered an obstacle, and so on. To this end, Thales is investigating hybrid solutions combining effective but unexplainable deep learning techniques with symbolic AI reasoning.

3. Roundtable

The workshop concluded with a discussion between the participants and the panellists on the key issues for interpretable AI.

The main issue raised concerned the nature of the explanations: their fidelity and the trust people can place in them. The panellists pointed out that those two aspects are closely connected.

Yannick Martel explained that because fraud is a complex phenomenon, especially regarding the number of meaningful features that need to be considered, Bleckwen decided to develop a dual approach: predictions based on non-explainable AI models, combined with local explanations based on surrogate models. This approach helps provide efficient insights to users. While building the AI, Bleckwen verified that the predictions did not miss genuine frauds and that the explanations made sense to business experts.

Caroline Chopinaud described an explainable-by-design approach where the same model is used for prediction and explanation, which means there is no gap between the prediction and the insights provided to users. To be really insightful, the algorithms have to work on business-meaningful features and combinations of features, not just any combination that “works” for the data scientists but those which speak to business experts. This is the reason for Craft.ai's investment in natively interpretable machine learning algorithms. Evaluating whether an explanation is useful and understandable requires feedback from users; no quantitative assessment is currently provided.

A comparable explainable-by-design approach is also used by D-Edge, Antoine Buhl explained, relying on various AI models. Since validating a recommended price is complex, D-Edge focuses its KPI on the trust Revenue Managers put in the suggestions and the explanations, by tracking how regularly Revenue Managers approve the suggested prices as is.

David Sadek ended the discussion by raising the issue of ethics in AI. For him, AI should be evaluated on three dimensions: accuracy, interpretability and morality. For a long time, most AI players focused on the first aspect. The other two are critical when putting AI into production, especially in complex systems. Explainability is mandatory to control and audit the ethics of an AI model, helping to spot bias for instance, yet it is not enough to guarantee ethical behaviour.

4. Key Takeaways

Explainable AI may have arrived in the spotlight only recently, but for certain players in the field it has been key for quite a while. It is no coincidence that those players are the ones running AI projects in production that affect key parts of their organizations.

Explainability is not just another feature of those AI projects,

it is a critical factor in the decision to go live!

> Want to know more about explainability, a key success factor in fighting financial crime?

Contact our experts: contact@bleckwen.ai


Key Takeaways From The 2nd Fraud And Financial Crime Conference In London

Bleckwen was delighted to participate in and contribute to what was a fascinating and high-energy event in London this week.

During two days filled with presentations and panel discussions, a wide-ranging audience discussed rapidly moving developments across the technology landscape and regulatory topics, debated the psychology of crime and, of course, looked into evolving criminal strategies. Participants left rich with an updated set of relevant market statistics, estimates and predictions.

Fraud: the 15th largest country on earth!

Businesses everywhere are now affected by the broad wave of financial crime in all its guises: 47% of businesses have been affected by financial crime within the last 12 months. An estimated $1.47tn is lost to financial crime globally, around 5% of global GDP according to one estimate. This criminal community (call it “Gotham City”, just without Batman to oversee things!) would rank as the 15th largest country on earth. Staggering stuff!

5% of global GDP was lost to financial crime during the last 12 months  

At the same time, it was also noted how cultural and reputational issues mean that not all financial crime is disclosed, which prevents the broader financial community from effectively mitigating or organizing against repeat attacks with similar profiles or bad actors.

Regulation speeds up

During the conference, there were many engaging discussions around banks' levels of readiness for the 5th Money Laundering Directive (5MLD). For a number of banks, the pace at which regulatory change is approaching is raising concerns.

Considering that the 4th AMLD is not yet fully deployed, the 5th AMLD will put additional pressure on financial institutions and require new processes to be implemented. As an example, they will have to use the Ultimate Beneficial Owner (“UBO”) registries put in place by local authorities and report any discrepancies they identify between the information gathered from customers and that available in the official registry.

Given the pace of regulatory change and the corresponding new requirements placed upon financial institutions, the upcoming 5MLD was compared to the recent implementation of SEPA, where, as an industry, there were varying degrees of bank preparedness as deadline dates approached, along with an awareness that not everybody would be ready in time to meet the new obligations.

The Arms Race

One recurring theme throughout many discussions was an acknowledgement of the rising level of collaboration amongst criminal fraternities. Against this backdrop there was a recognition that the evolution of real-time payment platforms and corporate/bank strategies is changing the nature of the threat. A corresponding response in solution technology, approach and architecture is now required. The situation was characterised in one panel session as akin to an ‘arms race’ developing between criminals and businesses/individuals when it comes to fraud, anti-money laundering and broader financial crime.

With views shared by senior experts from, for example, the Metropolitan Police, large banks, technology providers and European regulators, it was recognized that sophistication, speed and threat levels are all increasing, and that as an industry and as a community we need to respond quickly.

The most effective counter-strategy will need to be federated and collaborative 

 

Leveraging technology and AI

David Christie, Bleckwen's CEO, joined a key panel discussing the “Uses of technology, AI and Machine Learning to automate and increase fraud and AML detection”. In today's faster-moving and more dynamic space, models based on averages or generic rules can no longer be considered best-in-class: the rules can be quickly subverted by an intelligent, well mobilized and connected criminal fraternity. Broad averages also fail to adapt and adjust to the development of individual profiles, and hence generate too many unnecessary investigations, leading to operational, cost and client friction.

AI and ML are now being effectively deployed in the fight against financial crime, and particularly in fraud detection. Noting the processing power, detection capabilities and intuitive nature of Bleckwen's technology, and its enhancements over basic ‘rules-based’ platforms, David highlighted how Bleckwen's solution is geared towards addressing both authorized and unauthorized fraud.

A critical differentiator amongst AI-based solution providers is the “interpretability” of the output: humans need to know which elements the algorithm has based its decision on (the contribution of each variable to the results). Interpretability allows them to make an informed decision in an efficient and reliable way. Bleckwen, as a company, believes interpretability to be key to effectiveness.

 

To effectively fight fraud, AI-based solutions need interpretability

 

Another takeaway from this conference is that regulators are now placing increasing weight and focus on the interpretability of AI, both to understand and to further mobilise these critical, advancing technologies.


Interpretability Of Machine Learning Models – Part 2

In the previous article, we explained why the interpretability of machine learning models is an important factor in the adoption of AI in industry, and more specifically in fraud detection (https://www.bleckwen.ai/2017/09/06/interpretable-machine-learning-in-fraud-prevention/).

In this article, we’re going to explain how LIME works. It’s an intuitive technique that we have tested at Bleckwen.

Before looking at LIME in detail, it is necessary to situate it among other existing techniques. In general, interpretability techniques are categorized along two axes:

  • Applicability: Model-specific versus Model-agnostic
  • Scope: Global versus Local

Model-specific versus Model-agnostic

There are two types of techniques:

  • Model-specific techniques: these apply to a single type of model because they rely on the internal structure of the machine learning algorithm. Examples include Deeplift for Deep Learning and Treeinterpreter for tree-based models such as RandomForest or XGBoost.
  • One of the biggest advantages of model-specific techniques is that they can generate potentially more precise explanations, because they depend directly on the model being interpreted.
  • The disadvantage is that the explainability process is tied to the algorithm used by the model, so switching to another model can become complicated.
  • Model-agnostic techniques: these do not take the underlying model into account; they only analyze the input data and the decisions produced. Examples include LIME, SHAP and Influence Functions.
  • The main advantage of agnostic techniques is their flexibility: the data scientist is free to use any type of machine learning model, because the explanation process is separate from the algorithm used by the model.
  • The disadvantage is that these techniques are often based on replacement models (surrogate models), which can seriously reduce the quality of the explanations provided.

Global versus Local

The underlying logic of a machine learning model can be explained on two levels:

Global explanation: understanding the model as a whole before focusing on a specific case (or group of cases). The global explanation provides an overview of the most influential variables in the model, based on the input data and the predicted variable. The most common method for obtaining an overall explanation of a model is the computation of feature importance.

Local explanations identify the specific variables that contributed to an individual decision, a requirement that is increasingly critical for apps using machine learning.

The most important variables in the global explanation of an algorithm don't necessarily correspond to the most important variables of a local prediction.

When trying to understand why a machine learning algorithm reaches a particular decision, especially when this decision has an impact on an individual with a “right to explanation” (as stated in the service provider obligations under the GDPR), local explanations are generally more relevant.

Case study for banks

Let’s take an illustrative case study to understand MLI (machine learning interpretability) techniques better:

BankCorp offers its customers a mobile application for instant loans. The loan application consists of four pieces of information: age, income, SPC (socio-professional category) and amount requested. To respond quickly to its customers, BankCorp uses a machine learning model that assigns a risk score (between 0 and 100) to each case in real time. Cases with a score greater than 50 require a manual review by the bank's risk analysts. The image below illustrates how this model is used:

Scoring of credit applications with a black box machine learning model.

A BankCorp risk analyst believes that the score of Case 3 is strangely high compared to the application's characteristics and wants to obtain detailed reasons for the score. The BankCorp data science team uses a complex black-box model, given the performance constraints of the market, and cannot provide an explanation for each case. However, the model used makes it possible to extract a global explanation of the important variables according to the model (figure below):

Global interpretation of the BankCorp black box model.

The global interpretation of the model provides an insight into its logic through the level of importance of each variable. The level of importance of a variable is assigned by the model during the learning process (training), but this doesn't indicate the absolute contribution of each factor to the final score. In our example, we can see that the requested amount variable is, as expected, the most important variable from the model's point of view for calculating the score. The income and age variables are slightly less important, while the borrower's SPC doesn't seem to affect the score much.

Although this level of interpretation offers a first understanding of the model, it's not sufficient to explain why Case 3 is twice as poorly rated as Case 1, when both ask for the same amount and have similar incomes and ages. To answer this question, we must use a local and agnostic method (since the model is a black box).

Understanding the decisions made by a machine learning model with LIME

LIME (Local Interpretable Model-Agnostic Explanations) is an interpretation technique applicable to all types of models (agnostic) that provides an explanation at the individual level (local). It was created in 2016 by three researchers from the University of Washington and remains one of the best-known methods.

The idea behind LIME is quite intuitive: instead of explaining the results of a complex model as a whole, LIME creates another model, simple and explainable, applicable only in the vicinity of the case to be explained. By vicinity we mean the cases close to the one we want to explain (in our example, Case 3). LIME's mathematical hypothesis is that this new model, also known as the “surrogate model” or replacement model, approximates the complex (black-box) model with good precision in a very limited region.

The only prerequisites for using LIME are to have the input data (cases) and to be able to query the black-box model as many times as necessary to obtain its scores. LIME then carries out a kind of “reverse engineering” to reconstruct the model's internal logic around the specific case.

To do this, LIME creates new examples that are slightly different from the case you want to explain. This consists of changing the information in the original case, a little at a time, and presenting it to the original (black-box) model. The process is repeated a few thousand times depending on the number of variables to be modified. It is known as “data perturbation” and the modified cases are called “perturbed data”.

Eventually, LIME will have built a database of “local” labelled data (i.e., case → score), where LIME knows what it has changed from one case to another and the decision issued by the black-box model.

Construction of the training database from the case to be explained, by the data perturbation process.

From this database of cases similar to the one we want to explain, LIME creates a new machine learning model that is simpler but explainable. It is this “replacement” model that LIME uses to extract the explanations.

The replacement machine learning model created by LIME

The figure below shows the explanation of the score provided by LIME for Case 3. The SPC variable and the amount requested contribute to a high score (+49 and +29 points respectively). On the other hand, the age and income variables reduce the risk score of the application (-6 and -2 points respectively). This level of interpretation highlights that, for this particular case, the SPC variable is very important, contrary to what one might expect by looking only at the global interpretation of the model.

The risk analyst is now able to understand the particular reasons that led to this case having a poor score (in this case, an SPC equal to craftspeople). The risk analyst can then compare this decision with their experience to judge whether the model responds correctly to the bank's granting policy or whether it is biased against a population.

Explanation of the score for Case 3

In its current version, LIME uses a linear regression (Ridge regression) to build the replacement model. The explanations are therefore derived from the regression coefficients, which are directly interpretable. It should be noted that some of the concepts explained here differ slightly in LIME's Python implementation; however, the presentation above captures the intuition of the technique as a whole. This video made by the author of the framework offers more detail on how LIME works.

The official implementation of LIME is available in Python. Other frameworks also offer LIME in Python (eli5 and Skater), and a port to the R language is available as well.

Advantages and disadvantages of LIME

At Bleckwen, we have been able to test LIME with real data and in different case studies. From our experience, we can share the following advantages and disadvantages:

Advantages:

  • The use of LIME means that the data scientist doesn’t need to change the way they work or the models deployed to make them interpretable.
  • The official implementation supports structured (tabular), textual, and image (pixel) data.
  • The method is easy to understand and the implementation is well documented and open source.

Disadvantages:

  • Finding the right neighborhood level (close cases): LIME has a parameter to tune the neighborhood radius. However, its tuning is empirical and requires a trial-and-error approach.
  • The discretization of the variables: continuous variables can be discretized in several ways. We found that the explanations were highly unstable depending on the parameters used.
  • For rare targets, which are common in fraud detection, LIME gives rather unstable results because it's difficult to perturb new data sufficiently to cover enough fraud cases.
  • Time-consuming: LIME is a little slow at computing explanations (a matter of seconds per case), which prevents us from using it in real time.

Conclusion

The interpretability of machine learning models is a burgeoning field where much remains to be done. Over the past three years, a growing number of new approaches have appeared, and it is important to be able to classify them along two main axes: their applicability (agnostic vs. specific methods) and their scope of interpretation (global vs. local).

In this article, we introduced LIME, a local and agnostic technique created in 2016. LIME works by creating a local model from the inputs and outputs of the black-box model and then deriving the explanations from this replacement model, which is easier to interpret. The replacement model is only applicable in a well-defined region around the case one wants to explain.

Other techniques like SHAP and Influence Functions are also promising because they are based on strong mathematical theory and will be the subject of a future blog post.


AI Books Summer Reading List

At Bleckwen we love reading books! For your summer break, we've compiled this list of our favourite AI books. We are happy to share it with you. Enjoy!

WEAPONS OF MATH DESTRUCTION

Author: Cathy O’Neil

Cathy O’Neil (a Harvard PhD graduate in mathematics) has worked as a professor, hedge-fund analyst and data scientist. She founded ORCAA, an algorithmic auditing company. In her book, she explores how algorithms can threaten many aspects of our lives if they are used without control.

Algorithms, rule-based processes for solving mathematical and business problems, are being applied to a wide variety of fields. Their decisions directly affect our daily lives: which high school can we register at? Which car loan can we get? How much do we have to pay for our health insurance?

In theory, since mathematics is neutral, we might say that this is fine: everyone is judged according to the same rules. But in practice, algorithmic decisions can be biased because the models widely used today are opaque and unregulated. They provide only black-box decisions: nobody can explain the logic and the reasons that lead an algorithm to produce its result. Thus they cannot easily be challenged or audited. Can we let models rule parts of our lives and shape our future?

Cathy O'Neil calls on data scientists to take more responsibility for their models and on governments to regulate their use. Ultimately, every citizen should be savvy about the use of their personal data and the algorithmic models that govern our lives.

 

Why we love this book: we enjoyed O'Neil's accurate description of our present world and the weaknesses she points out, weaknesses that will grow with the increasing use of AI. At Bleckwen, we believe that interpretability is necessary to create a trustful collaboration between humans and machines.

On the same subject, you can also take a look at our post: Interpretability, The Key Success Factor When Opting For Artificial Intelligence.

HUMANS NEED NOT APPLY

Author: Jerry Kaplan

Jerry Kaplan is a Silicon Valley serial entrepreneur and a pioneer in tablet computing. At Stanford University, he teaches ethics and the impact of artificial intelligence at the Computer Science Department.

In his book, Jerry Kaplan looks at the profound transformations Artificial Intelligence technologies are already bringing to our society and their consequences. Kaplan warns of a future in which growth is driven more by assets than by labor, as AI, through automation, decreases the value of labor. One possible consequence of the rise of AI would be unemployment and broader income disparity.

Another consequence could be the risk of placing part of our economy under the control of algorithmic systems. This could happen if we decide to create cybernetic persons with the right to sign contracts and own property: this would grant them high autonomy with a limited capacity for control.

Sidestepping techno-optimism, Kaplan lays out the regulatory adaptations our society needs to make to artificial intelligence in order to ensure a prosperous and fair future. It is important to tackle the moral, ethical and political issues created by AI before it is too late.

 

Why we love this book: “Science without conscience is but the ruin of the soul”, and technology without politics could bring ruin to society! This book presents the promise and perils of Artificial Intelligence. AI has many individual and societal benefits but also significant risks if no ethical and political reflection is undertaken. We appreciate Kaplan's vision of the future and his deep reflections on AI.

SUPERINTELLIGENCE: Paths, Dangers, Strategies

Author: Nick Bostrom

Nick Bostrom is a professor in the Faculty of Philosophy at Oxford University, specializing in foresight, especially concerning the future of humanity. He is also the Director of the Future of Humanity Institute and of the Programme on the Impacts of Future Technology within the Oxford Martin School. He is the author of some 200 publications.

What happens when machines surpass humans in general intelligence?  This new superintelligence could become extremely powerful and possibly beyond our control.

In his book, Nick Bostrom lays the foundation for understanding the future of humanity and intelligent life.

A superintelligent agent could arise from the extension of techniques we already use today, such as Artificial Intelligence, brain emulation, genetic selection… Nick Bostrom calls on us to engineer the initial conditions so as to make this superintelligence compatible with human survival and well-being. It seems necessary to solve the “control problem” of this intelligence: to ensure that our control mechanisms grow in parallel with its capabilities. The author distinguishes two broad classes of potential methods for addressing this problem: capability control and motivation selection.

 

Why we love this book: at Bleckwen, we believe that debates about the future of AI and Machine Learning are very important for society. We appreciate Bostrom's practical vision of the potential risks entailed by the development of superintelligence. In order to keep such a superintelligence aligned with humanity, he recommends that research be guided and managed within a strict, transparent and ethical framework. He calls for collective responsibility.

NEUROMANCER

Author: William Gibson

We could not leave out of our list this multi-award-winning book written in 1984 by the American-Canadian author William Ford Gibson. This well-known science fiction novel spawned the cyberpunk movement and its rather bleak vision of our future.

The novel tells the near-future story of Case, a washed-up computer hacker hired by a mysterious employer for one last job against a powerful corporation. Case and his cohorts have to fight against the domination of a corporate-controlled society by breaking through the global computer network's cyberspace matrix.

When Neuromancer was published in the early 80s, only around 1% of Americans owned a computer and most people were unaware of the potential of networked computing. Gibson not only conceived of a credible evolution of virtual reality, but had already anticipated the kind of hacker culture that would emerge as the dark side of the web.

 

Why we love this book: we love this book because we love science fiction! It has astounding predictive power and keeps us on our toes. This novel also challenges our assumptions about our technology and ourselves. Beyond the story, this book raises a lot of ethical, philosophical and legal questions around the theme of control (and how to escape from it).


Using Graphs To Reconstruct Catalan Crisis Events With Tweets.

In 2017, over 330 million people around the world used Twitter each month to comment and react instantly to events, and around 180 billion tweets were sent over the year! In autumn 2017, Spain experienced one of its most important social events since the Spanish Civil War: the Catalan independence referendum. Throughout the crisis, the population widely used Twitter to react to events in real time. This major event in the recent history of Spain also received massive media coverage.

At Bleckwen, we believe data and analytics can be used to answer challenging questions in many fields. On a daily basis, we use these techniques to fight fraud and protect our clients. So we decided to apply similar techniques to a completely different field: we asked ourselves whether we could reconstruct the timeline of the Catalan crisis and correctly identify the main events using only Twitter metadata, i.e. without analyzing the content of the tweets.

Our goal: use the power of analytics to answer two questions:

  • Can we identify major events during the crisis from the metadata of the tweets, and then reconstruct the timeline of events?
  • Are we able to detect important events that have not been covered by traditional media?

On the morning of 27th October 2017, the Spanish Senate gave full powers to the head of government, Mariano Rajoy, who could now put Catalonia under guardianship. On the same day, in the afternoon, Carles Puigdemont, President of the Generalitat of Catalonia, proclaimed the independence of the region, following the results of the referendum. Spain was experiencing a major crisis in its history.

Could we reconstruct the events that marked this crisis using only Twitter metadata?

To study the Catalan crisis, we collected, via Twitter's Streaming API, all tweets sent from October 3rd to November 6th, 2017, written in Spanish, Catalan, Galician and Basque. We kept only tweets containing the words [catalogne, catalunia, catalunya, etc.]. We created a data set of 824 influential tweets and their 1 million retweets.

At the same time, we manually listed the dates of the 18 major events of the Catalan crisis (see image below) covered by 8 major traditional media outlets: BBC, The Independent, The Local, Fox News, NBC, Euronews, US News, and Politico.eu.

Tweets are reactions to real life events

People tend to use Twitter in different ways: to share their moments, their ideas or just to post photos of their cat! During important events like social movements, the World Cup or a US election, tweets are the voice of people reacting to what they see, feel or experience in real life.

Based on this assumption, we collected all tweets of a given population in a delimited period of time. Then we tried to group the tweets reacting to a single event into clusters. The result of this categorization is what we call an “event abstraction”. Example of tweet clustering during the Catalan crisis:

However, in order to group tweets together with analytical methods, we need to measure how close one tweet is to another, so we have to define a similarity metric. One could analyze the content of the tweets and assess whether they talk about the same event. But as we like challenges, we tried to do this without content analysis!

Defining similarity of tweets with no content analysis

In order to compute the similarity between tweets we first need to understand two important concepts:

  1. Co-occurrence: tweet A and tweet B were sent close together in time
  2. Co-retweeting: both tweets were retweeted by the same people

We can now state that the similarity between two tweets is defined by the product of these two measures, as illustrated in the figure below:

In other words, two tweets are more similar the closer in time they were sent and the more the same users retweeted both of them.

 

Clustering tweets with a graph approach

Now that we have a good way to measure how similar one tweet is to another, it is time to group tweets together and discover the event they are correlated with. For that, we use a data structure called a graph.

Graphs are a powerful concept, widely used in many fields such as chemistry, security, fraud prevention and, most famously, social networks.

According to Wikipedia's definition:

A graph is a structure amounting to a set of objects in which some pairs of the objects are in some sense “related”. The objects correspond to mathematical abstractions called nodes and each of the related pairs of nodes is called an edge.

In our case, the nodes of the graph are the tweets we collected, and the edges between them carry the pairwise similarity computed according to the definition above.

We then applied a community detection algorithm called Walktrap to the graph we built for Catalonia. The assumption behind this approach is that each detected community corresponds to a cluster of homogeneous tweets linked to a specific event that happened during the crisis. That is what we call “the abstraction of an event”.

Is our model able to identify the events covered by the media?

We applied our approach to tweets sent between October 3 and November 6, 2017 and found 34 event abstractions. Are these 34 sets of events relevant? Do they match events that really happened in the real world?

Remember that we assumed tweets inside the same cluster should be homogeneous because they were sent in reaction to the same event. However, assessing the homogeneity of an event abstraction is quite a subjective task. It implies looking at the content of each tweet and judging whether the majority of tweets composing the abstraction are related to the same subject.

We manually reviewed the content of the tweets of each abstraction.

Here are the results:

Here is an example to visualize an “event abstraction”. Abstraction e34: 100% of the 18 tweets that compose this abstraction are linked to the event “Demonstration for the Union on October 8th, 2017”.

 

Now that we have a reasonable way to assess the relevance of the events found by our model, we are able to answer our first question. Remember, our model uses only tweet metadata to find events. The figure below shows that 12 of the 18 events covered by traditional media are correctly identified by our model:

Of the 6 events not found by our model, 3 happened on the same day as a very large event. For example, the model does not identify two small events listed on October 3rd; however, that same day saw a massive demonstration in Barcelona against the police violence of the previous day.

Partially recovered events refer to events that are not directly identified. For example, on 3rd and 5th November, two events were covered by the media:

  • the arrest warrant required against Carles Puigdemont
  • Carles Puigdemont’s submission to the Belgian authorities

We noticed the model combined these two events into a single “three-day” event that could be described as “Puigdemont's flight”.

 

Is our model able to find significant events not covered by the media?

Here again, our results are quite interesting: the model identified reactions to 11 events, mostly lower-profile than those reported by the media. Among the events detected, we find for example:

– the publication of a press article;

– a media debate about the real or supposed indoctrination of children in Catalan schools;

– the agreement in principle between the PP (Partido Popular – People’s party) and the PSOE (Partido Socialista Obrero Español – Spanish Socialist Workers Party) on the organisation of new elections;

– the announcement of a demonstration in Brussels;

– the broadcast of the YouTube video HELP CATALONIA which denounces a “fascist Spanish state”.

The detection of minor events not mentioned by the media allows a more accurate understanding of what happened. Moreover, our model complements traditional media coverage of events.

Our model detects the abstractions of 12 of the 18 events mentioned by the 8 media outlets in the first month of the Catalan crisis. It also detects 11 other “minor” events.

 

Conclusion

We have shown that it is possible to apply analytical methods to Twitter metadata to reconstruct a timeline of events occurring in the real world. Our approach is based on tweet metadata analysis (i.e. no content analysis) and graph models.

The presented model allowed us to correctly recover 12 of the 18 main events of the Catalan crisis covered by the 8 media outlets. The 6 undetected events are relatively minor.

In addition, our graph-based approach identifies 11 additional events of relatively low intensity and therefore more difficult to detect. This adds to the chronology of events reported by the media.

As a next step, we would like to develop a real-time version of our model. We look forward to seeing you for our next challenge!

On October 20th, 2017, an agreement in principle was signed between the PP and the PSOE concerning new elections to be held in Catalonia. This event was not listed by the 8 media outlets, nor even included in Wikipedia's Catalan Crisis article. Our model detected it!


Interpretability, The Key Success Factor When Opting For Artificial Intelligence.

According to Yannick Martel, Managing Director of Bleckwen, a fintech specialized in Artificial Intelligence for fraud prevention, interpretability is today a major stake in ensuring the broad adoption of Artificial Intelligence.

First of all, could you explain what interpretability is?

Interpretability is the ability to explain the logic and the reasons that lead an algorithm to produce its results. It applies at two levels:

  • Global: to understand the major trends, i.e. the most significant factors
  • Local: to precisely analyze the specific factors that contributed to the machine’s decision for a group of closely-related individuals

How does it work?

The interpretation of a model is obtained by applying an algorithm which explains the contribution of each variable to the results. Imagine that you enjoyed a delicious Black Forest cake bought in a bakery (the bakery being a black box, i.e. a model with opaque internal operations). If you want to make this cake at home, you will gather the ingredients (the data), follow the recipe (the algorithm) and you will get your cake (the model). But why is it not as good as the one from the bakery? Although you have used exactly the same ingredients, you probably lack the chef's tips explaining why certain ingredients, at certain stages of the recipe, are important and how to combine them!

In this example, interpretability techniques will allow you to discover the chef’s tips.

There are two types of interpretability techniques:

  • Model-agnostic techniques: these do not take into consideration the model to which they are applied and only analyze the input data and the decisions (e.g. LIME: Local Interpretable Model-Agnostic Explanations, SHAP: SHapley Additive exPlanations, etc.)
  • Model-specific techniques: these rely on analyzing the inherent architecture of the model one wants to explain in order to understand it (e.g. Deeplift for Deep Learning, the Ando Saabas approach for random forests, etc.). Both types of techniques can provide local and global model interpretation; a short sketch of a model-specific technique follows this list.

Could you give a concrete example of the application of interpretability?

At Bleckwen, we apply both types of techniques interchangeably, depending on what we aim to explain. For example, for a customer's credit request, we will look for the reasons behind its score by combining agnostic and specific techniques at the local level. At another level, global interpretability makes it possible to understand the overall logic of a model and to check the variables deemed important (for example, to ensure that an explanatory variable does not contain “too much” information, which is usually suspect…).

Why this need for transparency towards machine learning models?

AI is starting to be used to make critical or even vital decisions, for example in medical diagnosis, the fight against fraud or terrorism, autonomous vehicles… At Bleckwen, we apply AI to sensitive safety topics to help our customers make decisions that are often difficult to take (e.g. accepting a 40,000-euro credit application or validating a 3,000,000-euro transfer to a risky country). The challenges of avoiding an error and understanding a decision, or helping an expert confirm a decision, are all the more important.

Our systems interact with human beings who ultimately must be in control of their decisions. Humans need to understand the reasoning followed by our algorithms and to know which elements the algorithm has based its decision on. Interpretability allows them to make an informed decision in an efficient and reliable way.

How does interpretability become a societal and political issue?

Recent events show a growing concern about the use of personal data. The entry into force of the GDPR in May this year is an important step for the data protection of European citizens. It also requires companies to be able to justify algorithmic decision making. Techniques for understanding algorithms have therefore become critical. In the United States as well, many people are questioning the “fair use” of the data behind algorithms in order to structure their uses. This is what Cathy O'Neil, a renowned mathematician and data scientist, suggests on her blog (https://mathbabe.org) and in her book “Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy”, inviting us to be careful about the trust we place in Big Data and algorithmic decisions.

The Cambridge Analytica case will certainly reinforce this trend. Algorithmic decision-making is becoming a major societal, political and ethical subject.

The adoption of AI will not happen without transparency.

At Bleckwen, we have made it a major focus of our offer and our technological developments.

 


 

Do you want to understand the topic of interpretability better? Read on:


Social Engineering And Its Consequences.

While the media highlight increasingly sophisticated cyber-attacks, decision-makers could easily forget that humans are one of the main weak links in IT security. According to the latest IBM and Ponemon Institute report, published in 2016, 25% of data leaks are due to human error or negligence.

Social engineering is about exploiting human weakness to obtain goods, services or key information.

Social engineering existed before the digital era. For example, during the 2000s, organized scammers used personal information available in Alumni directories to impersonate alumni of a prestigious university and extract money from their fellow classmates.

There is no need today to use malware or ransomware to access personal information: it is readily available on social media such as Facebook and LinkedIn. A white paper published by Alban Jarry in 2016 shows that 43% of people accept strangers on their LinkedIn network[1].

The president of a French bank recently showed us the Facebook profile of an individual allegedly working at the bank and trying to get in touch with clients: fake profile, fake identity obviously … In the same manner, how do you know who is behind the LinkedIn profile inviting you?

These “simple” techniques allow fraudsters to deceitfully obtain key information about a payer, a supplier… and subsequently impersonate them to initiate fraudulent wire transfers.

According to Grant Thornton, at least 3 out of 4 companies were targeted by fraud attempts over the past two years. While 80% of all attempts fail, successful attacks can cause damages upward of $10 million.

$2.3 billion were stolen from businesses between 2013 and 2016, according to the FBI, and the number of victims identified in 2015 increased by 270%[2].

The phenomenon is significant, and companies have begun to build walls to contain it: behavioral measures (e.g. paying attention to corporate data published on personal social media, refraining from clicking on suspicious e-mails from unknown parties…), business processes to improve internal controls, etc. But these measures are not sufficient, even when correctly applied, because they still rely too much on humans. This is why new solutions are emerging, based on machine learning and big data processing. They automate the detection of attacks and fraud more and more effectively, complementing human activities and processes.

You will find out more by reading our next post!

[1] https://fr.slideshare.net/AlbanJarry/livre-blanc-612-rencontres-sur-les-reseaux-sociaux-partie-2-etude

[2] https://www.lesechos.fr/08/04/2016/lesechos.fr/021827593152_le-boom-inquietant-de-la—fraude-au-president—.htm


Fighting Fraud : From Big Data To Fast Data.

Credit card fraud is the most visible type of consumer fraud: according to The Nilson Report, global damages caused by credit card fraud reached 21 billion dollars (18.4 billion euros) in 2015. Less known to consumers, wire transfer fraud (see https://www.bleckwen.ai/2018/05/31/social-engineering-consequences/) catches the attention of banks seeking to protect their customers, as a single attack may siphon off millions. It is important to realize that the system managing wire transfers is critical to the proper operation of a country's economy; it cannot be subjected to major breaches.

Traditional protection methods involve the implementation of expert rules and manual controls to identify and verify the most suspicious operations, but they negatively impact the customer journey.

Machine Learning is a good candidate to improve the level of protection while reducing friction and manual processing during this journey.

During the design phase, creating models requires cold data analysis, in particular to build and choose the variables that will reveal specific fraud patterns. The machine learning model is then trained on historical data. This step uses technologies that are specific to cold (batch) processing.

While this part is essential, it is also necessary to consider very early on how the model will be deployed and used on “hot” data. To be effective, fraud-fighting tools must be implemented on large data streams but must also minimize processing delays for each wire transfer. New legislation related to instant payments further increases the requirements on processing speed (less than 20 seconds to fully process a wire transfer[1] and a few hundred milliseconds to detect fraud). Fraud detection systems must operate in this context, which requires designing a specific architecture, supported by appropriate technologies.

The main challenge of implementing a fraud detection system is the operational capacity to manage the flow of wire transfers, particularly during peaks. A fraud detection system must therefore meet at least the following requirements:

  • Comply with delay limitations per wire transfer and debit operation
  • In case of failure, switch to a fallback system (simple rules or automated approval) so as not to disrupt the complete chain
  • Maintain the integrity of the wire transfer chain (no duplicates or missing wire transfers)

The diagram below provides a macro view of the processing chain required to score a wire transfer.

The processing chain indicated in red must be completed in less than 20 seconds. To ensure this, some of the calculations must be performed offline.

  1. Fetching data history: variables identifying fraud must be able to distinguish between “legitimate” and fraudulent wire transfers. Based on customer habits, they often query both old and recent history. Old history can usually be pre-calculated, since it characterizes phenomena observed over long periods of time with little variation. Recent history must sometimes be calculated on the fly, depending on the time scale observed.
  2. Querying the pre-trained model: the time required for a prediction is generally negligible compared to the time required to train the model on the data. Training is therefore also performed upstream.
  3. Interpretation: analysis and decision assistance are an essential part of an effective fraud detection system: an effective control call relies on precise indications given to the customer, because the risk of authorizing a detected fraud is real. Identity theft combined with social engineering sometimes places the payer in a situation of trust (usual supplier, request from management), even when the alert is raised. A simplified sketch of the real-time part of this chain follows the list.

To implement this processing chain, the requirement for streaming technologies (Fast Data) is added to existing big data requirements. There is a real technological challenge in providing tools that meet the level of reliability required by the banking industry and support recent innovations such as instant payments.

Our next blog post will take an in-depth look at these technologies!

 

[1] https://www.europeanpaymentscouncil.eu/what-we-do/sepa-instant-credit-transfer


Fraud And Interpretability Of Machine Learning Models – Part 1

Interpretability: the missing link in Machine Learning adoption for fraud detection

Machine learning methods are increasingly used, especially in anti-fraud products (developed by Bleckwen and other vendors), to capture weak signals and spot patterns in data that humans would otherwise miss.

While the relevance of these methods for fraud detection is widely recognized, they are still mistrusted in certain industries such as banking, insurance or healthcare, due to their “black box” nature. The decisions made by a predictive model can be difficult for a business analyst to interpret, in part because of the complexity of the calculations and the lack of transparency in the “recipe” used to produce the final output. It therefore seems quite understandable that an analyst who has to make an important decision, for example granting a credit application or refusing the reimbursement of healthcare expenses, is reluctant to apply the predictive model output automatically without understanding the underlying reasons.

The predictive power of a machine learning model and its interpretability have long been considered opposites. But that was before! For the past two or three years, there has been renewed interest from researchers, the industry and, more broadly, the data science community in making machine learning more transparent, or even “white box”.

Advantages of Machine Learning for fraud detection

Fraud is a complex phenomenon to detect because fraudsters are always one step ahead and constantly adapt their techniques. Rare by definition, fraud comes in many forms (from the simple falsification of an identity card to very sophisticated social engineering techniques) and represents a potentially high financial and reputational risk (money laundering, terrorism financing…). On top of that, fraud is known to be “adversarial”: fraudsters are constantly working to subvert the procedures and detection systems in place and exploit the slightest breach.

Most anti-fraud systems currently in place are based on rules determined by humans, because the resulting decisions are relatively simple to understand and considered transparent by the industry. At first, these systems are easy to set up and prove effective. However, they become very difficult to maintain as the number of rules increases. With fraudsters adapting to the rules in place, the system requires additional or updated rules, which makes it more and more complicated to maintain.

One of the perverse effects is a steady degradation of the anti-fraud defence. The system ends up becoming either too intrusive (with rules capturing the specificities of the data) or, conversely, too broad. In both cases, it has a negative impact on good customers, because fraudsters know how to mimic the “average customer” perfectly. It is a well-known fact for risk managers: “The typical fraudster profile? My best customers!”

Tracking fraudsters is therefore a difficult task and often causes friction in the customer experience, which generates significant direct and indirect costs.

As a result, building an effective detection system that is not too intrusive and detects the latest fraud techniques raises considerable challenges. Machine learning is proving to be an effective way around this problem.

Moreover, with the latest interpretability techniques, business analysts can be shown the reasons that led the machine learning algorithm to produce a given output.

Interpretability, why is it important?

Machine learning is becoming ubiquitous in our lives, and the need to understand and collaborate with machines is growing. On the other hand, machines do not often explain the results of their predictions, which can lead to a lack of confidence from end users and ultimately hinder the adoption of these methods.

Obviously, certain machine learning applications do not require explanations. When used in a low-risk environment, such as music recommendation engines or the optimization of online advertisements, errors have no significant impact. In contrast, when deciding who will be hired, when braking a self-driving car or when deciding whether to release someone on bail, the lack of transparency in the decision raises legitimate concerns from users, regulators and, more broadly, society.

In her book published in 2016, Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy, Cathy O'Neil, a renowned mathematician and data scientist, calls on society and politicians to be extremely vigilant about what she defines as “the era of blind faith in big data”. Among the most damning flaws, she highlights the lack of transparency and the discriminatory nature of the algorithms that govern us. Techniques for understanding decisions made by a machine have become a societal issue.

Interpretability means that we are able to understand why an algorithm makes a particular decision. Even if there is no real consensus on its definition, an interpretable model can increase confidence, meet regulatory requirements (e.g. the GDPR and CNIL, the French data protection authority), explain decisions to humans and improve existing models.

The need for interpretable models is not shared by all leading researchers in the artificial intelligence field. Critics suggest instead a paradigm shift in how we model and interpret the world around us. For example, few people really worry today about the explainability of a computer processor, yet they trust the results displayed on screen. This topic is a source of debate even at machine learning conferences like NIPS.

Conclusion

Fraud is a complex phenomenon to detect, and machine learning is a strong ally in fighting it effectively. Interpretability favours its adoption by business analysts. The emergence of a new category of techniques in the last two years has made the interpretability of machine learning more accessible and directly applicable to AI products. With these techniques, we can now obtain models with very high predictive power without compromising the ability to explain their results to a human. In our next blog post, we will explain how techniques such as LIME, Influence Functions and SHAP are used with machine learning models to produce more transparent decisions.

Further reading:

Miller, Tim. 2017. “Explanation in Artificial Intelligence: Insights from the Social Sciences.”

The Business Case for Machine Learning Interpretability http://blog.fastforwardlabs.com/2017/08/02/business-interpretability.html

Is there a ‘right to explanation’ for machine learning in the GDPR? https://iapp.org/news/a/is-there-a-right-to-explanation-for-machine-learning-in-the-gdpr/