Using Italian Flags quantitatively

What are the advantages and disadvantages of quantifying our Italian Flags?

Advantages:

• rigour – ensuring we do not make logical errors and that our judgements are consistent
• conflict – identifying conflicting evidence
• learning – being able to explore alternative judgements quite easily
• repeatability – judgements are repeatable

Disadvantages:

• risk – a danger of putting too much emphasis on the actual numbers
• dependability – there is no guarantee that the judgements correspond to the facts
• awareness – consequently a risk of being unaware that the methodology provides a necessary but not sufficient condition for success. In other words we must have logical consistency but we must also have dependable judgements made, with a proper duty of care, by people who are suitably qualified to make them.

What are the mappings from a word, phrase or sentence to an Italian Flag?

The diagram below illustrates these mappings from an uncertain word (such as an ambiguity) set in a context (a sentence or paragraph) to the FIR space and then to the Italian Flag (IF). The idea is to use the IF as a way of representing our judgement about the strength of the evidence about some proposition. The proposition will contain a mix of FIR (Fuzziness, Incompleteness and Randomness). For example the proposition ‘Jo will represent us well as a member of Parliament’ is fuzzy – what does representing us well actually mean? It is incomplete – there is no mention of whether we agree with her policies. It contains randomness – in the sense that Jo’s effectiveness and eloquence as a public speaker are perhaps known to be somewhat erratic. So our IF will contain a lot of white.

Another example is that we may say we are ‘more or less satisfied that the building is safe’. We are being rather vague or fuzzy – what does safe actually mean? We are being incomplete since ‘more’ implies some positive evidence whilst ‘less’ implies some negative evidence. Then there is some randomness since our judgements may vary over time.

A more precise example is when we decide to measure the length of a cricket pitch. We do it a number of times and we get almost the same answer each time, with a small variation. It is common in those situations to call that variation a random error. In that case our IF will contain mostly green, no white and a very small amount of red depending on the random variations in our results.

Some may argue that propositions containing fuzziness and incompleteness are not acceptable in professional practice. Of course people (especially professionals with a duty of care to others) have to be as precise as they can be in their language and in their technical and business models. However it is inevitable in practical matters that uncertain propositions are proposed and acted upon. This is especially the case when communicating with the wider public and media. One important example is the debate about the effects of climate change.

How do we start our learning journey towards using IFs quantitatively?

Let’s start by looking at IFs with no white interval, in other words with standard probability theory.

Imagine that we have 10 people in a room who need to elect a chair person who will represent them in discussions with higher authorities. They have two candidates Fred and Vera. They decide to vote on two attributes – policy and trustworthiness. They express these attributes in two propositions.

A – Has the candidate expressed policies with which you agree?

B – Can you trust the candidate to represent the group well in the discussions with the higher authorities?

They vote as follows:

Proposition A – Fred: 3 votes for, 7 votes against; Vera: 5 votes for, 5 votes against

Proposition B – Fred: 4 votes for, 6 votes against; Vera: 5 votes for, 5 votes against

Now we know nothing about voter behaviour and the chances that they influenced each other. We have only the results of the voting to work with.

We want to shed light on whether the voters influenced each other before we decide who should be elected.

Let’s number the 10 people from 0 to 9.

How might the votes be distributed? They could be at two extremes and all possibilities in between.

The extremes are that the votes are totally dependent on each other

OR

that they are perversely opposite.

If they are totally dependent then for example if member 3 thought member 7 was voting for Fred then he would vote for Fred too. Others might vote in a similar way.

If the votes are perversely opposite then member 3 would vote for Vera if he thought member 7 was voting for Fred. Again others may vote in a similar way.

Somewhere in between the votes might be genuinely independent. This means that the chance of a member voting for a particular candidate is uninfluenced by how that member thinks the others are voting.

Let’s draw the Venn diagrams for these cases. The left diagram below shows a possible distribution of votes for Fred based on proposition A and the right diagram shows one based on proposition B. We use the colourful Italian flag but with no white in the middle. Remember we do not know the actual distribution of votes – we only have the results, which are that 30% of the votes on A and 40% of the votes on B were for Fred.

We can imagine a similar but different set of votes for Vera.

We can combine Fred’s votes as below. You will see that we have chosen a distribution in which voter nos. 1, 3 voted for Fred on both propositions A and B so they are included in the intersection of the sets A and B. Voter 2 is in the set (A & notB). Voters 5, 6 are in the set (notA & B) and voters 0, 4, 7, 8, 9 are in the set (notA & notB). Voters in the set (A OR B) are 1, 2, 3, 5, 6.

We can assume all votes have equal weightings and we express the probability of a set X as p(X).

It follows that  p(A & B) = 2/10 = 0.2 and p(A OR B) = 5/10 = 0.5
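The set arithmetic for this one assumed distribution can be checked with a short Python sketch. The variable names are ours, chosen for illustration; the voter numbers are those listed in the assumed distribution for Fred.

```python
# Voters are numbered 0 to 9; each set holds the voters who voted for Fred.
voters = set(range(10))
votes_a = {1, 2, 3}        # for Fred on proposition A
votes_b = {1, 3, 5, 6}     # for Fred on proposition B


def p(subset):
    """Probability of a set when every vote carries equal weight."""
    return len(subset) / len(voters)


both = votes_a & votes_b             # A & B
either = votes_a | votes_b           # A OR B
only_a = votes_a - votes_b           # A & notB
only_b = votes_b - votes_a           # notA & B
neither = voters - either            # notA & notB

print(p(votes_a), p(votes_b))   # 0.3 0.4
print(p(both), p(either))       # 0.2 0.5
```

Changing the membership of votes_a and votes_b (while keeping 3 and 4 voters respectively) is a quick way to explore other possible distributions.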

So far we have simply assumed one possible distribution of votes. There are other possibilities but we don’t know what was actually the case.  There are too many combinations for us to consider realistically.
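To get a feel for how many distributions are consistent with Fred's 3 votes on A and 4 votes on B, they can be counted by the size of the overlap between A and B. This is a sketch using a standard counting argument rather than anything taken from the voting results themselves, and the function name is hypothetical.

```python
from math import comb

N, N_A, N_B = 10, 3, 4   # voters, votes for Fred on A, votes for Fred on B


def distributions_with_overlap(k):
    """Count the ways to pick the A voters and B voters so that exactly k voted for both."""
    # Choose the A voters, then k of them for B, then the other B voters from outside A.
    return comb(N, N_A) * comb(N_A, k) * comb(N - N_A, N_B - k)


counts = {k: distributions_with_overlap(k) for k in range(min(N_A, N_B) + 1)}
total = sum(counts.values())

for k, n in counts.items():
    print(f"p(A & B) = {k / N:.1f}: {n} distributions")
print("total:", total)   # equals comb(10, 3) * comb(10, 4)
```

The voting totals alone therefore allow p(A & B) for Fred to be anywhere from 0 to 0.3, which is why some assumption about dependence is needed.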

Another way of tackling this is not to try to guess the distribution of votes but to estimate some form of dependence. If we assume, arbitrarily for illustration, that the votes are totally dependent or mutually exclusive then we get the results below. The left diagram shows Fred’s results assuming total dependency between the votes for A and B. The right diagram shows Vera’s results assuming mutual exclusion between the votes for A and the votes for B.

The problem we have is getting logical consistency. Our original assumption of the distribution of votes is inconsistent with both of our new assumptions about dependence. Votes with the blue background are being combined in a logically inconsistent way. For example in our original possible distribution voter 2 voted for Fred on proposition A but against him on proposition B. That is not consistent with an assumption of total dependence. Likewise Vera’s voters 2, 3 voted for both A and B as did voter 5 so the sets of votes cannot be mutually exclusive.

Finding a logically consistent answer is actually not straightforward. In practice most analysts using this kind of approach assume that propositions are independent.
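The effect of the dependence assumption on p(A & B) can be sketched as follows. The three formulas are the standard upper bound, product rule and lower bound from probability theory. Treating ‘totally dependent’ as the upper bound and ‘perversely opposite’ as the lower bound is our reading of the text, and the function names are hypothetical.

```python
def p_and_total_dependence(p_a, p_b):
    """Largest possible overlap: the smaller set lies wholly inside the larger."""
    return min(p_a, p_b)


def p_and_independent(p_a, p_b):
    """Product rule, valid only if A and B are genuinely independent."""
    return p_a * p_b


def p_and_opposite(p_a, p_b):
    """Smallest possible overlap; zero (mutually exclusive) whenever p_a + p_b <= 1."""
    return max(0.0, p_a + p_b - 1.0)


for name, p_a, p_b in [("Fred", 0.3, 0.4), ("Vera", 0.5, 0.5)]:
    print(name,
          p_and_opposite(p_a, p_b),
          round(p_and_independent(p_a, p_b), 2),
          p_and_total_dependence(p_a, p_b))
```

For Fred this gives 0, 0.12 and 0.3 respectively, so the choice of assumption matters a good deal.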

The trouble is that in many cases they may not be. This can lead to ‘double counting’ where evidence is weighted too strongly in one direction. For example if the votes for Fred were such that propositions A and B were totally dependent, i.e. a vote for A is necessary (a must have) for a vote for B, we would get a different result from the one where the two votes were independent.

Another example can occur in a risk analysis when there is a common cause of failure between two failure events. If two pieces of equipment E1 and E2 share a common source of electrical power C then we can get a misleading result if we assume independence.

To illustrate this assume that the probabilities of failure of E1 and E2 are p(E1) = 0.002  and  p(E2) = 0.005. If we assume E1 and E2 are independent then p(E1 & E2) = 0.002*0.005 = 0.00001

Now assume that the probability of failure of the power source is p(C) = 0.001 – a figure less than that for E1 and E2 separately.

As the failure of C is common to both E1 and E2 it is contained in the set formed by the intersection of E1 & E2. Failure event C is sufficient for E1 and for E2 to fail separately but because it is a common cause it is also sufficient for the set (E1 & E2). Therefore p(E1 & E2 & C) = p(C) = 0.001.

So when E1 and E2 are both totally dependent on C the chance of both E1 and E2 failing is 100 times more than if we assume that E1 and E2 fail independently.
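The arithmetic of the common-cause example can be reproduced exactly with fractions. This is a minimal sketch of the numbers quoted above; the variable names are ours.

```python
from fractions import Fraction

p_e1 = Fraction(2, 1000)   # p(E1) = 0.002
p_e2 = Fraction(5, 1000)   # p(E2) = 0.005
p_c = Fraction(1, 1000)    # p(C)  = 0.001, the shared power source

# If E1 and E2 are assumed independent, the joint probability is the product.
p_both_independent = p_e1 * p_e2

# If a failure of C makes both fail, then C lies inside (E1 & E2),
# so p(E1 & E2) can be no smaller than p(C).
p_both_common_cause = p_c

ratio = p_both_common_cause / p_both_independent
print(float(p_both_independent), float(p_both_common_cause), ratio)
```

Using Fraction rather than floating point keeps the factor of 100 exact.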

How effective will our votes be at a higher level?

Going back to Fred and Vera we also need to consider how effective Fred and Vera might be in doing their job as chair person. Let us introduce a third proposition R as ‘The candidate will chair the group effectively‘.

We will now use the notation that p(X/Y) denotes the probability of X given Y has occurred.

For the purposes of this example we will assume that the propositions A and B are jointly logically necessary (must have) and sufficient (strong enough) for R. The question now is how do our votes as expressed in our flags get propagated upwards from A and B to R?

We do this by allocating ‘weightings’ to the flags. These weightings are actually conditional probabilities p(R/A) and p(notR/notA) with similar weightings for B. For example a voter may think that Fred is lazy and will not ‘fight their corner’ in discussions relevant to the group and so will allocate a low weighting value to p(R/A) or p(notR/notA).

p(R/A) is interpreted as capturing the strength of the evidence for R being true if A is true. We call p(R/A) the sufficiency of A for R.

In other words sufficiency is the degree to which the voter judges that the evidence A will lead to a positive result for R.  If a bridge is well designed then that is sufficient evidence for it to be safe – a well designed bridge cannot be unsafe.

p(notR/notA) is more tricky. It is interpreted as capturing the strength of the evidence that if A fails or is false then R will fail. We call p(notR/notA) the necessity of A for R.

In other words necessity is the degree to which the voter judges that if A fails then that will lead to a negative result for R. If a bridge is not safe then it is not a well designed bridge.

Note that mathematically p(R/notA) = 1- p(notR/notA).

So for example R could be the proposition that (the bridge is well designed given that it is not safe). A could be the proposition (the bridge is not well designed given that it is not safe). We would expect p(R) to be 0 and p(A) to be 1.

Other examples will be less clear cut. If R was the proposition (Fred would be a good chair) and A was the proposition (we feel Fred would represent us well) then p(R/notA) would enable us to express the idea that (Fred would be a good chair given we do not think he would represent us well). p(notR/notA) allows us to express the idea that (Fred would not be a good chair given we don’t think he would represent us well).

We may still decide to vote for Fred.

How does the algorithm for the Italian Flag work?

The calculation of the Italian Flag for R is based on the total probability theorem:

p(R) = p(R/A)*p(A) + p(R/notA)*p(notA)

It follows that

p(R) = p(R/A)*p(A) + (1 – p(notR/notA))*(1 – p(A))

Now returning to the colourful notation of the Italian flag with g for green and r for red then

p(A) = g and p(notA) = 1 – g = r (with no white, g + r = 1), and if we label p(R/A) as suff and p(notR/notA) as nec then

p(R) = suff*g + (1 – nec)*(1 – g)
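A small sketch of this calculation follows. It is written in terms of p(A) directly, so it uses only the total probability theorem and the identity p(R/notA) = 1 – p(notR/notA) stated earlier; with no white in the flag p(A) is simply the green proportion. The sufficiency and necessity values in the example are invented purely for illustration and are not taken from the votes.

```python
def p_r(p_a, suff, nec):
    """Total probability theorem with suff = p(R/A) and nec = p(notR/notA)."""
    p_r_given_not_a = 1.0 - nec      # p(R/notA) = 1 - p(notR/notA)
    p_not_a = 1.0 - p_a              # no white in the flag, so p(notA) = 1 - p(A)
    return suff * p_a + p_r_given_not_a * p_not_a


# Fred on proposition A: p(A) = 0.3 from the votes.
# suff = 0.8 and nec = 0.9 are hypothetical weightings.
print(round(p_r(0.3, suff=0.8, nec=0.9), 2))   # 0.31
```

Lowering either weighting shows how a voter's doubts about Fred change the result for R even when the votes themselves stay the same.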

Introducing the white interval in the Italian Flag of uncertainty

So far we have not admitted any white into the flag. Indeed all that happens when we do is that the maths gets more complicated but the principles remain the same.

Websites such as Tesla say the IF is based on a 3 valued logic. Strictly this is incorrect. In practical matters we want to work with clear statements. The statements are about our goals or conditions representing our progress towards our goals. They are about ‘states of affairs’ that can clearly be attained (i.e. true or dependable) or not (false or undependable) or about states of affairs that are clearly understandable or not. Of course those statements may be defined at various levels of definition or vagueness/fuzziness as holons in a hierarchy. However the underlying logic with which we make deductions concerning the truthlikeness or dependability of those statements is two valued – as in standard set theory. We will either attain the objectives of a process (no matter how vaguely defined) or we will fail. That clarity is important in knowing what success and failure look like.

So it is important to realise that the uncertainty (which is FIR – fuzzy/vague, incomplete and random) is not about truthlikeness or dependability but rather is a characteristic of the evidence and information about a state of affairs. This uncertain evidence is all we have to make decisions and take actions. The interval measure of the IF over the truth space (either true or false) captures the incompleteness and randomness concerning the quality of our evidence about a fuzzy statement.

So when we use an IF we are not using a truth functional logic but an interval measure – however it has to be admitted that in much published research in probability and fuzzy logics this distinction does get blurred.

A word of explanation (as far as we understand it) concerning the fundamentals of this blurring may help. Truth in classical logic and set theory is captured by the indicator function. The indicator function can take only two values (true, 1) or (false, 0). In 3 valued logic the indicator can be true, false or neutral. In a multivalued logic the indicator is many valued. In fuzzy logic the indicator function is a real (continuous) number on the interval [0, 1] and is renamed a membership function. Unfortunately as soon as real numbers are used to define a set the philosophers of mathematics and logical purists have a problem since the classical concept of a true/false set is used to define those numbers (Korner, S. 1986 The Philosophy of Mathematics, Dover Publications). The fundamental issue seems to be how one defines infinity but in practice these distinctions seem to have little significance.

In practice we can assume that in all of these logics the probability function is a measure over a set. In the Venn diagram of the voters for Fred and Vera the rectangle represents the system under consideration. All sets (representing true/false outcomes) are contained in that system.  The whole system is the set of all subsets. In other words we can draw all possible subsets of the system in that rectangle. The probability measure for the whole system is 1 and the sizes of all the sets in the system are in proportion. So in the Venn diagrams the sizes of the sets are proportional to 1 or the total size of the entire system represented by the rectangle.

In summary an Italian flag is an interval measure over a two valued set. To be clear an interval measure is one in which we state lower and upper bounds. For example we may say that the length of a line is between 20 m and 22 m.

In theory we could allocate a probability distribution over the interval as is common in probability and statistics. Indeed that is the usual way of capturing random variations in say the length of a line or the mass of an object. In mathematical jargon we model the length of the line as a random variable. However if we were to use that idea on an Italian Flag then we would have to define a double probability measure. In effect we would have a probability measure of a probability and that, we feel, is unnecessarily complicated in practice.

The practical point is that the white intermediate region between the green and red represents incompleteness of our knowledge regarding a particular judgement. In simple terms it represents ‘don’t know’.
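One way to hold a flag with a white region is as green and red proportions, from which the white proportion and the lower and upper bounds of the interval follow. This is a sketch under our own assumptions: the class name and fields are hypothetical, and taking the bounds as green and 1 – red is our reading of the interval measure described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ItalianFlag:
    green: float   # evidence for the proposition
    red: float     # evidence against the proposition

    def __post_init__(self):
        if self.green < 0 or self.red < 0 or self.green + self.red > 1:
            raise ValueError("green and red must be non-negative and sum to at most 1")

    @property
    def white(self):
        """The 'don't know' region left over between green and red."""
        return 1.0 - self.green - self.red

    @property
    def interval(self):
        """Lower and upper bounds on the probability that the proposition is true."""
        return (self.green, 1.0 - self.red)


flag = ItalianFlag(green=0.3, red=0.5)
print(flag.white, flag.interval)
```

When green and red sum to 1 the white region vanishes and the interval collapses to a single value, which is the standard probability case used for Fred and Vera so far.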

Back to Fred and Vera

Then as far as Fred and Vera are concerned we proceed as before. But we do need an algorithm to distribute the votes in a logically consistent way.