UNIVERSIDADE DE VIGO
ESCUELA TÉCNICA SUPERIOR DE INGENIEROS DE TELECOMUNICACIÓN
FINAL YEAR PROJECT
CONTRIBUTIONS TO THE MODELLING AND SIMULATION OF URBAN TRAFFIC IN NETLOGO
AUTHOR: Ana Peleteiro Ramallo
SUPERVISOR: Juan Carlos Burguillo Rial
ACADEMIC YEAR 2008-2009
Faculty of Engineering
Department of Electrical Engineering – ESAT
KATHOLIEKE UNIVERSITEIT LEUVEN
Contributions to the modelling and simulation
of urban traffic in Netlogo
Master's thesis submitted in partial fulfillment of the requirements
for the degree of Telecommunications Engineering
Ana Peleteiro Ramallo
Promotor:
Prof. Dr. Ir. Bart Preneel
Daily Supervisors:
Dr. Ir. Claudia Diaz
Ir. Carmela Troncoso
2008 – 2009
Framework: Erasmus program
Abstract
Performing real-world experiments is often expensive in terms of resources, as well as the time
needed to collect the necessary information. For example, experiments that involve car traffic
require drivers who volunteer to carry tracking devices for a long period of time.
This is why Netlogo, a multi-agent modeling language that provides a programmable environment
for simulating natural and social phenomena, can be useful. There are many fields in which we
can use Netlogo as a tool for simulating real situations. In this thesis, we use it to simulate
car traffic scenarios and perform experiments in two different fields: Location Privacy and
Multi-agent Systems.
Location privacy is gaining importance with the increasing popularity of Location Based Services
(LBS). If a person can get our location data, this person may be able to obtain useful and
private information. This is why location privacy defense algorithms must be developed, to
prevent the use of location services from revealing detailed information about our movements that
enables tracking and profiling, thus compromising our privacy. In GPS anonymization,
the goal is to anonymize location samples so that they can be used by external entities while
preventing user re-identification and protecting the privacy of the users providing the samples.
In Multi-agent Systems, groups of agents interact to try to solve a problem. Moreover, in Multi-agent
Learning Systems, agents not only cooperate, but also learn and adapt to their environment.
The problem is to be able to observe and compare the different behaviors obtained with different
Learning Algorithms without using lots of expensive resources.
In this thesis, we address two main tasks. First, we develop a Netlogo city scenario with cars
that send location samples. We implement and run defense (anonymization) and attack
(tracking) algorithms over these samples and use the results to compare the different defenses
and attacks. In the second task, we build a Netlogo city and implement different Learning
Algorithms that agents perform in that scenario. The performance of these algorithms in
reaching a goal is compared.
Acknowledgements
I would like to thank the whole COSIC group for the opportunity to work with them and for
making me feel like just another researcher in the group.
To Carmela and Claudia, because you have not only been my supervisors, but also my advisors.
Thank you for your support, your dedication... thank you for everything.
To Juan Carlos, my supervisor at the University of Vigo, and to Pedro and Kike, for introducing
me to the research world.
To my non-telecom friends, for laughing at my geeky jokes even without understanding them.
To my telecom friends, and especially the 4 best friends who have made these 5 years
unforgettable:
To Sara, for her inexhaustible energy, and for having broken an Olympic swimming record, by a
conservative count.
To Antía, for all the laughs in class, and for your legendary 'no way' that always turns into a
'for sure'.
To Majo, for living life while the rest of us slept, and for helping me rule the dark side as
vice-president.
To Humberto, for all the walks to the CUVI and the lab hours we shared, and for never
denying me an argument when I needed one.
Thanks to the 4 of you for your support, your friendship... and for everything we have shared. In
short, because just getting to know you already made telecom worth it.
To Marta, for being the best sister and friend I could ever have had, for always giving me all
your support, for being my counselor, and for listening to me whenever I need it.
To Jota, my favorite bro, for brightening my days, for putting up with my lectures, for being
there no matter what, and for being my partner in sports competitions (though always the runner-up).
Thanks to you both, because it is impossible to express in a few lines everything you mean to me.
Last and most important, this project is dedicated to my parents, because everything I am
is thanks to you, because you are the people I love and admire most, for always giving me the
best advice, and for putting up with me in the moments when not even I could stand myself. All my
successes are also yours.
Contents

Abstract
Acknowledgements
Contents
List of Figures
List of Tables

Part I: Presentation
1 Introduction
   1.1 Motivation
   1.2 Goals
   1.3 Structure of the thesis
2 Netlogo
   2.1 Introduction
   2.2 Agents
   2.3 Model and settings
   2.4 Example

Part II: Location Privacy
3 Preliminaries
   3.1 Introduction
   3.2 Classification of privacy methods according to the communication with the LBS
   3.3 Attacks and countermeasures
4 System model
   4.1 Introduction
   4.2 Scenario
   4.3 Participants
   4.4 Adversarial model
5 Algorithms
   5.1 Introduction
   5.2 Mathematical Background
   5.3 Privacy enhancing algorithms
   5.4 Tracking algorithms
6 Experiments
   6.1 Introduction
   6.2 Experimental setup
   6.3 Experiments releasing all samples
   6.4 Experiments subsampling
   6.5 Experiments Uncertainty-aware algorithm
   6.6 Comparison of different defense algorithms

Part III: Learning algorithms
7 Preliminaries
   7.1 Introduction
   7.2 Multi-agent Learning
   7.3 Artificial Neural Networks
   7.4 Machine Learning
   7.5 Reinforcement Learning
   7.6 Predator-prey problem
8 System Model
   8.1 Introduction
   8.2 Scenario
   8.3 Participants
9 Algorithms
   9.1 Introduction
   9.2 Korff Algorithm
   9.3 Self Organizing Maps Algorithm
   9.4 Learning Automata
   9.5 Q-learning
10 Experiments
   10.1 Introduction
   10.2 Experimental setup
   10.3 Korff
   10.4 SOM
   10.5 Learning Automata
   10.6 Q-learning
   10.7 Comparison of the different Learning Algorithms

Part IV: Conclusions
11 Conclusions and future work
   11.1 Conclusions
   11.2 Future work
Bibliography
List of Figures

2.1 Netlogo environment. Controls and action scenario.
2.2 Netlogo. Turtle shapes.
2.3 Netlogo example scenario: wolves and sheep.
2.4 Netlogo. Controls to change the size of the scenario.
2.5 Netlogo. Button to control the beginning of the simulation.
2.6 Netlogo Switch (left) and slider (right).
2.7 Netlogo Chooser.
2.8 Netlogo Plot.
2.9 Netlogo Monitor to display variable values.
2.10 Tumor model and its components.
3.1 GPS service.
3.2 Communication between the user and the LBS without third party.
3.3 Classification of privacy methods.
3.4 Communication between the user and the LBS with third party (TTP-based).
3.5 TTP-free communication.
3.6 Deleting data to hide the home's location.
4.1 System model.
4.2 Example of a reticular city: New York.
4.3 Netlogo city view.
6.1 Netlogo implementation of the city.
6.2 Maximum permitted speed in a city.
6.3 Empirical distribution of vehicle trip times.
6.4 Distance errors in tracking.
6.5 Releasing all samples. N=10.
6.6 Releasing all samples. N=100.
6.7 Subsampling algorithm. N=100, φ=0.8.
6.8 Subsampling algorithm. N=100, φ=0.8.
6.9 Subsampling algorithm. N=100, φ=0.5.
6.10 Uncertainty-aware algorithm. N=10, k=2, β=0.4.
6.11 Uncertainty-aware algorithm. N=100, k=2, β=0.4.
6.12 Uncertainty-aware algorithm. N=100, k=10, β=0.2.
6.13 Comparison between releasing all samples and subsampling. N=100.
6.14 Comparison between releasing all samples and Uncertainty-aware algorithm. N=100.
6.15 Comparison between subsampling and Uncertainty-aware algorithm. N=100.
7.1 Turing test.
7.2 Agent action in some environment.
7.3 Non-learning agent.
7.4 Learning agent.
7.5 Multi-agent Learning. Intersection of two fields.
7.6 Simple scheme of the nodes of a Neural Network.
7.7 Predator-prey model.
7.8 Capture of a prey by surrounding.
8.1 Traffic basic model.
8.2 Traffic grid model.
8.3 SOTL model.
8.4 SinCity model.
10.1 Two different combinations of possible directions (sub-states).
10.2 Results with thief using Korff.
10.3 Results with thief using SOM.
10.4 Results with thief using LA.
10.5 Results with thief using QL.
10.6 Results with police using Korff in a 10x10 traffic map.
List of Tables

10.1 Number of states for each algorithm and each learning mode.
10.2 Parameters for the different Learning Algorithms.
10.3 Results in a 5x5 patches scenario. Algorithm for the thief: Korff.
10.4 Results in a 10x10 patches scenario. Algorithm for the thief: Korff.
10.5 Results in a 5x5 patches scenario. Algorithm for the thief: SOM.
10.6 Results in a 10x10 patches scenario. Algorithm for the thief: SOM.
10.7 Results in a 5x5 patches scenario. Algorithm for the thief: LA.
10.8 Results in a 10x10 patches scenario. Algorithm for the thief: LA.
10.9 Results in a 5x5 patches scenario. Algorithm for the thief: QL.
10.10 Results in a 10x10 patches scenario. Algorithm for the thief: QL.
Part I
Part I: Presentation
Chapter 1
Introduction
1.1 Motivation
Nowadays, there is a growing interest in simulations that closely reflect the real world, since
we are interested in observing and modelling the behaviour of different systems. However,
performing real-world experiments is often expensive in terms of resources, the time needed to
collect the necessary information, etc. For example, if we wanted to simulate the wolf-sheep
population cycle (the evolution of these population sizes), we would need to place all the
animals, observe the evolution of the population over a long time, etc. These kinds of
real-world experiments are normally difficult to handle without simulation tools that allow us
to accelerate and simplify the process.
Urban traffic simulation is a field that attracts the attention of the research community. For
instance, the development of several traffic-related applications, such as GPS systems, and the
study of traffic congestion in cities are important research areas nowadays. The problem with
obtaining results in urban traffic experiments is that the resources tend to be expensive and
limited, since we need cars, the recruitment of drivers who volunteer to carry tracking devices,
etc. This is why Netlogo, a multi-agent modeling language that allows us to simulate natural and
social phenomena, can be useful.
Netlogo is useful in many fields as a tool for simulating real-world experiments. It allows us to
define agents that perform their actions independently in a scenario, while we can control,
measure and observe their parameters and behaviour. Concretely, in urban traffic simulations,
it allows us to define all the resources we need (e.g., the number of cars), control the speed
of the simulation, and measure and redefine the model whenever necessary. This reduces
the time needed for simulations and improves the quality of observation of the Netlogo world.
1.2 Goals
There are many fields in which we can use Netlogo as a tool for performing real-world experiments;
concretely, in this thesis we use it to perform experiments in Location Privacy and Multi-agent
Systems, showing the utility of this tool in different research areas.
Location Privacy is becoming more important every day because people carry more and more
personal devices that reveal their location (GSM phones, PDAs, etc.). This means that we are
revealing our position to servers that we have to trust, but these servers can be malicious, or
even the communication channel may be compromised. Although it may seem harmless, the problem
of revealing data is not trivial. In fact, if a person can get our data, this person may be able to
obtain useful and private information, such as where we live or which places we visit. For example,
a company could decline to hire an applicant because, observing this person's data, it has
'discovered' that the person may be ill, since he visits the hospital every day. This is why privacy
defense algorithms must be developed.
A special case of Location Privacy is the anonymization of GPS samples. When a person uses a
position transmitter and sends location samples to a Location Based Service (LBS), if
these samples are not anonymized, privacy may be compromised, since we are revealing our
geographical position, as well as time and identification information. This is why it is important
to develop and test anonymization algorithms. However, to do this we need location samples,
and these samples are difficult to obtain because we need real users to give us their data. In this
thesis we do not use real data; instead, we build a Netlogo city scenario with cars that send
location samples. We implement defense (anonymization) and attack (tracking) algorithms over
these samples and we compare the different defenses and attacks.
In the second part of the thesis, we use Netlogo to compare the behaviour of different Learning
Algorithms. The research in this part resulted in the publication of the paper SinCity: a
pedagogical testbed for checking Multi-agent Learning techniques [1]. In Multi-agent Systems,
agents collaborate to solve a problem. This collaboration may be important, for example, when
we want to coordinate robots to save a person in a natural disaster scenario. We have to be
sure that these agents will behave correctly, and test these robots in all situations before
a real trial, since a mistake could be fatal in a critical situation.
In Multi-agent Learning Systems, besides cooperating, the agents also learn and adapt to their
environment. With the same robot example from the previous paragraph, an agent may learn that
getting closer to the person to be saved is a good decision, while heading toward a cliff is a bad
one. The challenge is to be able to observe and compare the different behaviors resulting from
different Learning Algorithms without using lots of expensive resources. To achieve this, we build
a Netlogo city where agents (thieves) escape from other agents (police). We implement different
Learning Algorithms that agents perform in that scenario and we compare the performance of
these algorithms in reaching a goal. In our thesis, the goal of the thief is to escape after the
robbery, and the goal of the police is to catch the thief.
1.3 Structure of the thesis
The thesis is structured in four parts.
Part I: Presentation is an introductory part:
Chapter 2: Netlogo: This Chapter presents and explains the Netlogo tool, with which we
simulate our real-world situations.
Part II: Location Privacy describes the Location Privacy part of the thesis. It is structured as
follows:
Chapter 3: Preliminaries: This Chapter introduces the concepts of Location Privacy, a
classification of the privacy methods, and the principal attacks and countermeasures.
Chapter 4: System model: This Chapter describes our model, i.e., the scenario where our
experiments are developed, the participants, and the adversarial model.
Chapter 5: Algorithms: This Chapter explains the different privacy enhancing and attack
algorithms used in the experiments.
Chapter 6: Experiments: This Chapter explains the experimental setup and shows the
different results when performing the algorithms from Chapter 5.
Part III: Learning Algorithms describes the Multi-agent Learning Systems part of the thesis. It
is structured as follows:
Chapter 7: Preliminaries: This Chapter introduces the concepts needed for a theoretical
background in Multi-agent Learning Systems.
Chapter 8: System model: This Chapter describes our model, i.e., the scenario where our
experiments are developed and the participants.
Chapter 9: Algorithms: This Chapter explains the different Learning Algorithms used in
the experiments.
Chapter 10: Experiments: This Chapter explains the experimental setup and shows the
different results when performing the algorithms from Chapter 9.
Finally, Part IV: Conclusions:
Chapter 11: Conclusions and Future Work: This Chapter presents the conclusions and
possible ways to improve this work.
Chapter 2
Netlogo
2.1 Introduction
Netlogo [2] is a multi-agent modeling language that provides a programmable environment
for simulating natural and social phenomena, for example the wolf-sheep predation cycle, which
studies the stability of predator-prey ecosystems, or the spreading of a virus over a network. In
particular, it provides a good environment for modeling complex systems that develop over time.
When there is a multiplicity of simple interactions between agents (entities that carry out
their actions in a concrete environment), patterns and complex systems normally arise. This
is known as emergent phenomena. Netlogo allows the exploration of these phenomena in a
created environment, since we can give instructions to a large number of agents, all operating
independently. When given instructions, the agents perform actions, and in this way we
can model the behaviour of an entity (or even of a group). Agents are the principal 'actors' in
the Netlogo scenario.
Sometimes it is difficult to experiment with a system in a real-world situation. However, using
Netlogo to model situations permits the user to investigate a system in a rapid and flexible
way. Netlogo provides a good environment to simulate scenarios from different fields where the
connection between the individual behaviour of the agents and the patterns emerging from the
interaction of the individuals can be studied.
One big advantage of Netlogo is that the user can choose between building their own model or
using one of the predefined models provided by the tool. The Models Library [3] provides a
collection of pre-built models to work with. Further, we can also modify these models to adapt
them to our needs. These models allow us to run simulations in different areas of the natural and
social sciences, like economics, psychology, mathematics, computer science, machine learning,
location privacy, etc.
If no pre-built model is suitable for the purposes of the user, he can create a new one. Netlogo
provides many features for a designer to model, control and measure the behaviour of the agents
in the scenario. Netlogo is an easy language and it provides predefined language primitives, so
that amateur programmers can easily build their models and run simulations. Further, as we
show in this thesis, it can also be used as a powerful tool for researchers.
Figure 2.1: Netlogo environment. Controls and action scenario. This figure shows a model to
study the evolution of the sheep-grass cycle.
Another advantage of Netlogo is that it is a cross-platform system; that is, we can run our
simulations on Mac, Linux, Windows, or any other operating system. Not only that, but runs
are exactly reproducible across platforms, i.e., we can build a model on one platform and run it
on another, and the results are the same as if we had run it on the original platform. This
provides a flexibility that allows all kinds of programmers to use it and to develop their models.
In the next sections we explain the agents and their relations, give an overview of the model
and its settings, and finally show an example.
2.2 Agents
In Netlogo, we can model agents that perform actions, and we can give them commands. These
agents are part of the Netlogo world and they perform their activities independently of each
other, simultaneously in the world.
There are four kinds of agents: patches, turtles, links and the observer.
Patches Patches are individual squares on a grid. They are stationary, but this does not mean
they do not perform actions and have a behaviour. Patches compose what is known as
the world. This world is two-dimensional, although Netlogo also has the option of a 3D
view.
Turtles Turtles are mobile agents that move around in the grid provided by the patches. We
can control their number, behaviour, shape, etc. In Fig. 2.2 we show some of the different
shapes that the turtles can adopt.
Figure 2.2: Netlogo. Turtle shapes
For a better and more graphical understanding of the last two agent types, we show in Fig.
2.3 one of the best-known predefined models in Netlogo: Wolf Sheep Predation. In
this figure we can see the patches, which are the green and brown squares, and the agents,
which in this case are the wolves (in black) and the sheep (in white). This figure also
shows the different shapes that we can use for the agents, and their movements
and actions. Furthermore, as we can control the speed of the simulation (this will be
explained in Sect. 2.3), we can view in slow motion what is happening in our model,
and also fast-forward in time to the action we want to study.
Links A link connects two turtles, which makes possible the creation of aggregates (a collection
of beings, for example a population of wolves, the grass in a field, etc.), networks and
graphs.
Observer The observer is the controller of the world. It oversees the scenario and gives
orders to the rest of the agents.
We can also define 'breeds' of turtles and links. A breed is a group of agents with a homogeneous
appearance and behaviour. All the agents of a breed follow the same behaviour, but within the
breed every agent acts independently. A minimal code sketch of a breed follows below.
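As an illustration, here is a minimal sketch of our own (not taken from any particular Models
Library model) that defines a breed of wolves and asks each one to wander independently;
"wolf" is one of the default turtle shapes:

  breed [wolves wolf]

  to setup
    clear-all
    create-wolves 50 [               ;; 50 turtles of breed "wolves"
      set shape "wolf"
      setxy random-xcor random-ycor  ;; scatter them over the patches
    ]
    reset-ticks
  end

  to go
    ask wolves [                     ;; same rules, independent execution
      right random 50
      left random 50
      forward 1
    ]
    tick
  end

Note that ask processes the agents in a random order on every call, which reinforces the idea
that the members of a breed share rules but act independently.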
2.3 Model and settings
The scenario in Netlogo can be viewed in 2D or 3D, although normally the scenario is 2D. The
size of the scenario can be changed, which gives the model flexibility. We modify the size of the
world by changing the number of patches that compose it. We show in Fig. 2.4 a 50x50 patches
scenario.
Figure 2.3: Netlogo example scenario: wolves and sheep.
Figure 2.4: Netlogo. Controls to change the size of the scenario
The interface of Netlogo allows us to control the agents and the simulation variables. We can
control the world using buttons: first we define the actions to be performed, and when we press
the button, they are executed. In Fig. 2.5 we can see a typical control button.
Figure 2.5: Netlogo. Button to control the beginning of the simulation
There are also sliders and switches (shown in Fig. 2.6) that allow us to change variable values
(like the size, the number of agents, etc.). The difference between them is that switches can
only be set to on or off, while sliders take values in an interval defined by the user. These
controls allow us, for example, to change the speed of the turtles and even the speed of the
simulation. With the switches, we can enable or disable an action.
Figure 2.6: Netlogo Switch (left) and slider (right)
With the choosers (Fig. 2.7) we can also change the value of a variable. They allow us to define
a list of values that a variable can take, and then choose one of them.
Figure 2.7: Netlogo Chooser
Netlogo also provides tools for plotting and displaying results: plots and monitors. They
allow us to observe the evolution of variables during the simulation. The plot (Fig. 2.8)
graphically represents the value and state of the variables, and records them for later use
in the analysis of the experiment. To display the state and numerical value of
different variables we can use monitors. We see one example in Fig. 2.9.
Figure 2.8: Netlogo Plot
Figure 2.9: Netlogo Monitor to display variable values
All these tools make Netlogo a friendly environment for simulations, since it allows an easy
change of variable values, and with this a change in the behaviour of the system. Besides, it
provides a good means of interpreting results and also of debugging, since we can plot results
and view variable values during the simulation.
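To make the connection between these interface controls and the code concrete, here is a hedged
sketch assuming a hypothetical interface with a slider named num-cars and a switch named
show-trails? (neither taken from a specific Models Library model); widget values are read in
code like ordinary global variables:

  to setup
    clear-all
    create-turtles num-cars [        ;; slider value read like a variable
      set shape "car"
      setxy random-xcor random-ycor
    ]
    reset-ticks
  end

  to go
    ask turtles [
      ifelse show-trails?            ;; switch value is a boolean
        [ pen-down ]
        [ pen-up ]
      forward 1
    ]
    tick                             ;; plots with update commands refresh here
  end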
2.4 Example
To get a final overview of the whole model, we show the Tumor model in Fig. 2.10. Using
this model we can simulate the growth of a tumor and its resistance to chemical treatment.
In this figure we can see arrows pointing to the different elements described in the previous
section. First, the one named Scenario points to the graphical representation of the scenario,
where the action takes place. The arrow named Buttons shows the buttons, in this case the
Setup and Go buttons. As we can see, the Go button carries two arrows, which means that it is
a 'forever' button, i.e., the action is carried out repeatedly from the moment we click this button
until we click it again. The Setup button, however, is a 'once' button, which means that the
action is only performed once. The arrow Monitor points to a monitor that counts the living
cells. Further, the Plot arrow points to a plot that shows the evolution of the living cells over
time. We also have the arrows Command center and Observer. In the Command center the
user can insert commands, which are actions the agents will perform or directions in the model;
writing in the observer line, we can give commands to the turtles and the patches. Finally, we
have the arrows Slider, with which we can vary the speed of the simulation, and Switch, which
allows us to choose whether the cells leave a trace (if On) or not (if Off).
Figure 2.10: Tumor model and its components.
Part II
Part II: Location Privacy
Chapter 3
Preliminaries
3.1 Introduction
Location-based applications (also known as Location-Based Services, LBS) and traffic monitoring
applications use people's movements in order to offer several services (one example is GPS; see
Fig. 3.1). However, in order to get these services, users need to provide private information
(their geographical location). Location information is a set of data describing an
individual's geographical position over a period of time. It is important to protect because a
sequence of recorded locations of a person constitutes a quasi-identifier, i.e., a set of attributes
that can be linked with external data to uniquely identify at least one individual in the general
population. This means that the data can reveal personal information (for example, the places we
visit or the times we arrive home). However, nowadays people do not really care [4] about
these location privacy matters, even though they are becoming more important every day [5].
Figure 3.1: GPS service.
We can define location privacy as the ability to prevent other parties from learning one's current
or past location. Past location information is important to protect [6]: although real-time
location could be useful to find a person, past data could help to discover more private
information, such as where a person lives or the places that this person visits during the day.
Moreover, stored data is difficult to protect, and third parties may be malicious.
A location privacy threat is the capacity of an adversary to obtain geographical location
information and in this way compromise the user's privacy. Even if data is revealed anonymously,
if the sampling frequency is sufficiently high compared to the user density in an area, an
adversary could link samples to the same user. An accumulation of path information about
individual users will likely lead to identification. The techniques described in the following
sections are used to avoid this identification, or at least make it more difficult.
When collecting information from the users, there is sensitive information that may become
public. Users should be able to use LBS without compromising their privacy. Data privacy
algorithms increase privacy through deliberate modifications of the dataset, such as omission,
perturbation or generalization. This is why there is normally a trade-off between privacy
and accuracy (quality of service) [7] [8].
In order to protect our privacy, there are some techniques that are applied to the location
samples to make tracking more difficult. Two of them are pseudonymity and anonymity, and
it is important to understand the difference between them. The former consists of stripping
names from location data and replacing them with arbitrary IDs (identities different from the
real ones). Pseudonymity is a simple privacy defense, but it is not effective: even though the
user's true identity is not revealed, an attacker, examining where the user spends more time,
could determine the place where the user lives [9]. With anonymity, we also give arbitrary IDs
to the users, but we change these IDs over time, so an attacker cannot assign a concrete
ID to a user. Anonymity requires unlinkability, meaning that an attacker cannot infer the
holder of an ID or link its samples. Anonymity aims to hide users' true identity with respect to
emitted location queries.
3.2 Classification of privacy methods according to the communication with the LBS
Location-based services (LBS) provide users with information accessible from mobile devices that
are able to locate themselves, such as a GPS receiver or even a cell phone. In this section we
explain two different approaches to LBS privacy: TTP-based (Trusted Third Party), where there
is a third party between the LBS provider and the user that acts as a proxy so that privacy
is ensured, and TTP-free. The latter gives better privacy than the TTP-based approach because
we do not have to trust a third party, which could be malicious. We can see a scheme of this
classification in Fig. 3.3. Note that, if the user does not care about her privacy, the
communication between the user and the LBS can take place without intermediaries (Fig. 3.2).
3.2.1 TTP-based
In this scheme, users do not communicate with applications directly, since otherwise they would
reveal their identity straight away. An option to avoid this identification is to use an anonymizing
proxy for all communication between users and applications (see Fig. 3.4). This proxy allows
applications to receive and reply to anonymous (more correctly, pseudonymous) messages
from the users. It removes any identifiers (IP addresses, etc.) and perturbs the location
information, so that each message sent to the LBS contains the location information of the
mobile client and a time stamp [8], but no identity information.
Figure 3.2: Communication between the user and the LBS without third party
Figure 3.3: Classification of privacy methods (TTP-based and TTP-free; the subtypes shown
include policy-based, simple, pseudonym-based, anonymity-based, collaboration-based and
obfuscation-based methods)
One of the simplest intermediate entities is one that replaces the real IDs of the users with
fake ones. Entities that provide anonymity aim to hide users' true identity with respect to the
emitted location information.
By using a TTP, we move trust from the LBS to intermediate entities, so the LBS are no longer
aware of the identities. In the TTP-based approach, the number of intermediate entities is
smaller than the number of LBS, and we have to trust these third parties.
Among the intermediate entities are the k-anonymizers, which can replace the real location of
the user with a blurred one. These entities take k users and cloak the area where they
are located. The cloaking is really simple: instead of reporting the exact location (for example,
the (x, y) coordinates), they report an area that contains all these k users. This leads to
uncertainty about which car has reported a sample and where this car is now. This way, LBS
providers cannot easily determine which of the k users is really submitting the query.
Figure 3.4: Communication between the user and the LBS with third party (TTP-based)
3.2.2 TTP-free
With this approach, privacy can be provided without seeking help from any centralized third
party (it is a P2P approach). Instead, users collaborate to protect their privacy, but they do
not even need to trust each other. We show a scheme in Fig. 3.5.
Before requesting any location-based service, the mobile user discovers peers and forms a group
of k users with her peers via single-hop communication and/or multi-hop routing. The area
where users are anonymous is computed as the region that covers the entire group of peers.
In the TTP-free approach, the mobile client can blur its exact location so that the adversary only
knows an area where the user could be. This area is the minimum area covering the client itself
and the k-1 peers that form the group.
With TTP-free, a user can choose between two approaches that define when a user tries to
find peers: on-demand mode and proactive mode. On the one hand, in on-demand mode,
mobile clients execute the cloaking algorithm when they need to access information from the
location-based database server. We obtain a better quality of service, but a longer response
time. On the other hand, in proactive mode, mobile clients periodically look around to find
the desired number of peers. They can cloak their exact locations into spatial regions whenever
they want to retrieve information from the data server.
In order to form a group, the user first adds Gaussian noise to his location. Then the user
broadcasts his perturbed location and requests his neighbors to return perturbed versions of
their locations. Among the replies received, the user selects k-1 neighbors such that the group
formed by the locations of these neighbors and his own perturbed location spans an area A. The
user sends to the LBS the centroid of the group of k perturbed locations, including his own.
Note that, as users only exchange perturbed data, there is no need to trust each other. The
perturbations tend to cancel each other out in the centroid, maintaining accuracy. A minimal
sketch of this step follows below.
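Here is a hedged Netlogo-style sketch of this exchange, under our own simplifying assumptions
(a global noise-sd for the noise standard deviation, and peers discovered simply as the nearest
turtles rather than via the single-hop/multi-hop protocol described above):

  ;; turtle procedure: this turtle's location plus Gaussian noise
  to-report perturbed-location
    report list (xcor + random-normal 0 noise-sd)
                (ycor + random-normal 0 noise-sd)
  end

  ;; turtle procedure: centroid of k perturbed locations, ours included
  to-report group-centroid [k]
    let peers min-n-of (k - 1) other turtles [distance myself]
    let locs fput perturbed-location [perturbed-location] of peers
    ;; this centroid is what gets sent to the LBS; the individual
    ;; perturbations tend to cancel out in the averages
    report list (mean map first locs) (mean map last locs)
  end

In a full implementation, min-n-of would be replaced by the actual peer-discovery step; it is
only a stand-in here.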
Other methods, like obfuscation or SpaceTwist [10], are an alternative to collaboration-based
methods. Degrading the quality of the information about users' locations protects users' privacy.
SpaceTwist generates a 'fake' point that is used to retrieve information on the k nearest points.
After successive queries to the LBS provider, SpaceTwist is able to determine the interest point
closest to the real location, while the LBS server cannot derive the real location of the user.
Figure 3.5: TTP-free communication
3.3 Attacks and countermeasures
3.3.1 Attacks
Since consecutive location samples from a vehicle exhibit temporal and spatial correlation,
paths of individual vehicles can be reconstructed from a mix of anonymous samples belonging
to several vehicles.
Location data is very vulnerable to privacy attacks [11]. We can make a simple classification of
these attacks into passive and active. In passive attacks, the attacker only listens to the
information he is given and uses this information to attack. Among them are the traffic
analysis techniques, which only use the space/time information. Among the most important
traffic analysis attacks we find brute force, timing, communication patterns, packet counting,
intersection, and statistical disclosure attacks.
In active attacks, the attacker modifies the packets, the time of sending, etc. He does something
active with the packets, for example removing one. These attacks are more powerful than
passive attacks, but passive attacks have one big advantage: they are not detectable.
In secure communications, nodes communicate with encrypted and authenticated data packets
to prevent active attacks. We assume that this encryption is perfect, so we have to take care of
passive attacks (and above all, traffic analysis attacks).
Tracking techniques are passive attacks, since even though the attackers use information from
the packets, they do not modify them. For an adversary, a path is a collection of location/time
samples that a single user has reported. Tracking techniques can be used to reconstruct
paths from anonymous samples or segments. They are useful once a home location has been
identified: they can allow an adversary to follow the traces reported by a vehicle to another
location, thereby linking information about other places to the driver's identity.
As we said in Sect. 3.1, the problem with location-based services is that our privacy is threatened.
There are four reported studies in which data was used to demonstrate an attack's efficiency [8].
In three cases, the data was pseudonymized, but even with this protection some tracks could be
followed and some homes located. In the fourth case, the data had been completely anonymized,
thus mixing together coordinates from different people. The results obtained show that this is
not enough to prevent an attacker from reassembling the data into individual tracks.
An adversary can use four approaches to link the samples reported by the users. First, trajectory-based
linking assumes that a user is more likely to continue traveling on the same trajectory.
Second, map-based linking correlates location samples with likely routes on a map; that way,
future location samples can be linked and user positions can be predicted. A third approach is
empirical linking, which uses past information (for example, the last movement at a concrete
position) to connect samples. Finally, the inference attack analyzes data in order to illegitimately
gain knowledge about a subject [6]; furthermore, it allows visited places to be extracted. A
sketch of the first approach follows below.
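As an illustration of trajectory-based linking, here is a hedged Netlogo-style sketch of our own
(not the algorithm evaluated in the cited studies): it predicts a path's next point by assuming
constant velocity, and claims the nearest anonymous sample; paths and samples are assumed to
be lists of [x y] pairs:

  to-report squared-dist [a b]
    report ((first a - first b) ^ 2) + ((last a - last b) ^ 2)
  end

  ;; link the anonymous sample that best continues the given path
  to-report link-next-sample [path samples]
    let p1 last path                 ;; most recent point of the path
    let p0 last but-last path        ;; the point before it
    ;; constant-velocity prediction: p1 + (p1 - p0)
    let guess list (2 * first p1 - first p0) (2 * last p1 - last p0)
    report first sort-by
      [ [a b] -> squared-dist a guess < squared-dist b guess ] samples
  end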
To find the coordinates of each subject's home based on their GPS data, there are four heuristic
algorithms [11]:
Last destination Exploits the fact that the last place a person goes is home. In [11] the
attacker could computationally locate a subject's home to within about 60 meters at least
half the time.
WeightedMedian The attacker exploits the fact that the subject spends more time at home
than at any other location. A longer stay in a place can be used to identify the subject's
home.
Largest cluster The attacker assumes that most of the reported coordinates will be at home.
Then, she builds a dendrogram (a hierarchical clustering technique that builds
a tree diagram showing an arrangement of clusters) of the subject's destinations,
where the merge criterion is the distance between the cluster centroids. When the nearest
two clusters are over 100 meters apart, the search ends. The home location is taken as
the centroid of the cluster with the most points.
Best time The attacker assumes that the probability of a subject being at home varies depending
on the time of day. The attacker obtains this distribution, so the relative probability of
being home vs. the time of day can be known. Applying this distribution, we compute the
relative probability for each measured sample of each user. Once this is done, we extract
the coordinates with the maximum relative probability and define home as the median of
those points.
Of the four heuristics, Last destination performs best, while with the Best time algorithm the
attacker obtains the worst results.
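A minimal sketch of the best-performing heuristic, Last destination, assuming each trip's
samples are stored oldest-to-newest as a list of [x y] pairs (a simplification of the evaluation
in [11]):

  ;; estimate home as the centroid of every trip's final reported point
  to-report estimate-home [trips]
    let last-stops map last trips    ;; final [x y] sample of each trip
    report list (mean map first last-stops)
                (mean map last last-stops)
  end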
If we can identify a subject's home, then we can search for their name (for example, on the
Internet) and perhaps obtain further personal information [11]. This is the reason why defenses
against these attacks must be investigated.
3.3.2 Countermeasures
There are some simple countermeasures that the reader should keep in mind when talking about
location privacy. Among them, we have to take into account regulatory measures, which are the
rules normally imposed by the government; these rules prescribe countermeasures that must be
followed, or legal problems will arise. Besides, there are privacy policies, which are privacy
rules normally defined within enterprises. These rules should always be defined in order to avoid
compromising the privacy of any user, and they are normally collected in a document that tries
to define how to deal with information related to users' private profiles. Finally, there are
pseudonymity and anonymity, introduced in the first part of this chapter, which avoid the
inference of information about the users.
As we introduced in Sect. 3.1, pseudonymity means using a pseudonym (an alternative ID) rather
than the user's actual identity. Such pseudonyms should be different for different services to
prevent tracking (users should adopt a new pseudonym for each application with which they
interact), because even if we give a pseudonym instead of our true identity, our samples can still
be linked, and thus we can be tracked and located. To be anonymous, the user changes his
pseudonym over time. These pseudonyms have to be generated so that relating a previous
pseudonym to the following one is difficult [12], for instance by using anonymous credentials [13].
Using anonymity or pseudonymity, applications only receive the 'fake' identity, not the real
one, and real IDs cannot be linked to locations. For this reason, whenever a user switches his
pseudonym, the following location information appears as a different path to the adversary [14],
since he does not know whether the new pseudonym belongs to the tracked user.
However, there are algorithms that link pseudonyms with real location data and can even
reconstruct paths from data consisting of mixed, anonymized location coordinates from multiple
users [15]. Besides, to link pseudonyms, an attacker might look for patterns in the particular
service requests, and in this way be able to link different IDs to one user.
In location-based systems, we can distinguish strong anonymity and weak anonymity. Strong
anonymity keeps user information safer. The aim of the adversary is to identify the
user who generated an anonymous path by linking other available information to the path, but
with strong anonymity this is more difficult, because a larger group of service users is required to
travel along the same path at the same time. With weak anonymity, there are not so many users,
or they do not travel at the same time; thus the data is more distinctive and could be linked to
an individual.
In the rest of the section we describe some algorithms to protect LBS users’ privacy.
3.3.2.1 k-anonymity
With k-anonymity, instead of pseudonymously reporting her exact location, a user reports a
region containing k-1 other people; that is, a person cannot be distinguished from k-1 other
people. We call these k people the anonymity set. The algorithm reduces the spatial or temporal
accuracy of each reported sample until it meets the defined k-anonymity constraint.
The larger k is, the higher the guarantees of location privacy. If we want high location
privacy, it is necessary to perform additional spatial and temporal cloaking, which results
in low spatial and temporal resolution for the anonymized messages. This means that the QoS
(defined by temporal and spatial tolerance specifications) will be diminished, because
if the region or the time between reported samples grows, the accuracy of the service is
reduced [16].
Besides, the k-anonymity model can be improved (in terms of privacy) by performing message
perturbation and location cloaking algorithms, which means that besides applying k-anonymity,
the location can be blurred or some packet fields may be changed. This way we can achieve
higher guarantees of k-anonymity and higher resilience to location privacy threats. In such a
scenario, each mobile client defines its desired anonymity level (the k value in k-anonymity),
its spatial tolerance, and its temporal tolerance. Each message is transformed into a message
that can be safely forwarded to the LBS provider. There are two techniques to obtain
k-anonymity (a sketch of the first follows the list):
• Spatial cloaking: decreasing the location accuracy by enlarging the exposed spatial
area such that k-1 other mobile clients are present in the same spatial area.
• Temporal cloaking: delaying the message until k-1 (k including the user) mobile clients have
visited the area reported by the message sender. Increasing the time between location
reports can also improve privacy.
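The following hedged Netlogo-style sketch illustrates only the spatial cloaking idea (our own
simplification, not the message perturbation engine described above): it grows a box around the
reporting client until at least k-1 other clients fall inside it, and reports the box instead of
the exact point:

  ;; turtle procedure: report a cloaking box [xmin ymin xmax ymax]
  ;; enlarged until it also covers at least k - 1 other mobile clients
  to-report spatial-cloak [k]
    let half 1
    while [count other turtles with
            [abs (xcor - [xcor] of myself) <= half and
             abs (ycor - [ycor] of myself) <= half] < (k - 1)]
      [ set half half + 1 ]          ;; enlarge the exposed area
    report (list (xcor - half) (ycor - half) (xcor + half) (ycor + half))
  end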
Data perturbation and resolution control mechanisms offer users additional options when releasing
data to semi-trusted providers (such as little-known service providers), but it has been shown
that service providers can still glean useful information from the data [14].
Since k-anonymity modifies the location traces substantially, it may not meet the accuracy
requirements of some applications, such as traffic monitoring systems. To address this, a
time-to-confusion metric [7] [17] is defined to evaluate the path privacy of a set of location
traces; it describes how long an individual vehicle can be tracked. Hoh et al. propose an
algorithm [17] that can guarantee a specified maximum time-to-confusion (see Sect. 5.3.3).
This algorithm provides more accurate location data than a random sampling algorithm (see
Sect. 5.3.2). It can guarantee a good level of privacy even for users driving in low-density areas.
3.3.2.2 l-diversity
k-anonymity can provide anonymity in certain scenarios, but it cannot avoid certain attacks,
such as the homogeneity attack and the background knowledge attack [18]. With the homogeneity
attack, an attacker can obtain sensitive information when there is little diversity in the values
of sensitive attributes. When all k users reside in the exact same location, if only k-anonymity
is used, a subject might be identifiable.
Background knowledge means that an adversary has prior knowledge of the users. l-diversity
provides privacy even when the data publisher does not know what kind of knowledge
is possessed by the adversary.
Incorporating l-diversity allows the mobile users to improve their privacy. With k-anonymity
and l-diversity, we use k for the privacy in terms of the set of subjects (the anonymity set of
users) and l for the set of locations (the anonymity set of locations). If all the users are
travelling along the same route and passing through the same set of identifiable locations,
m-invariance can be introduced to control the number of routes that a user wants to keep
anonymous.
The main reason to introduce l-diversity is to improve the privacy protection of location provided
by the k-anonymity algorithm. To ensure l-diversity, every group of tuples that share the same
non-sensitive information should be associated with at least l approximately equally distributed
sensitive values. If l is increased (and thus the diversity increased), the probability of linking a
static location (for example a church, a school, etc.) to a mobile user is reduced, and this allows
the user to be unidentifiable among a set of l different physical locations [19].
3.3.2.3 Spatial and temporal degradation
Apart from k-anonymity, there are other countermeasures. One of them is Spatial Cloaking [7],
which can be obtained by applying different techniques. It can be applied to a single user or to a
group of people in the same region. As we said before, a user is k-anonymous if the reported
location is imprecise enough to be indistinguishable from at least k-1 other objects.
Mix Zones (explained in more detail in the next section) are physical regions in
which subjects' pseudonyms can be shuffled among themselves to try to prevent an inference
attack (explained in Sect. 3.3.1) on the data. Using these Mix Zones we can achieve the goal of a
user being k-anonymous.
There are alternatives to these Mix Zones that use only a single user's data. One of them uses
the 'knowledge' that the most visited place is home: the algorithm deletes samples near a
subject's home, which creates ambiguity about the home's actual location, since no clouds of
samples point at it. We show an example in Fig. 3.6: we choose a point in the circle of radius r
(centered at the subject's home), and this point is the center of a circle of radius R, which is
the region where samples are deleted. A sketch of this step follows below.
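Here is a hedged sketch of this deletion step, assuming the defense already knows the home
coordinates and keeps the samples as a list of [x y] pairs (small-r and big-r play the roles
of r and R in Fig. 3.6):

  ;; delete all samples inside a circle of radius big-r whose center is
  ;; picked at random within distance small-r of the subject's home
  to-report hide-home [samples home-x home-y small-r big-r]
    let a random-float 360           ;; random direction from home...
    let d random-float small-r       ;; ...and random distance
    let cx home-x + d * sin a        ;; Netlogo trigonometry uses degrees
    let cy home-y + d * cos a
    report filter
      [ s -> sqrt ((first s - cx) ^ 2 + (last s - cy) ^ 2) > big-r ]
      samples
  end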
Another Spatial Cloaking technique is to create fake traces in certain areas to prevent tracking.
An attacker can also be fooled if false locations are appended to a true location report: the
location-based service responds to all the reports, and the client picks out only the response
based on the true location.
Another countermeasure is adding noise. If location data is noisy, it will not be useful for
inferring the actual location of the subject (value distortion). For example, we can add Gaussian
noise to each measured latitude and longitude. This inaccuracy means that we report a sample
different from the actual location, while imprecision means giving a plurality of possible
locations. However, to prevent an attack we need lots of additive noise. Rounding is also used
to prevent an attack: if the data is too coarse, it will not correspond to the subject's actual
location. Other ways to maintain location privacy are vagueness, where a subject reports a place
name instead of latitude and longitude, or subsampling, which means sending fewer samples.
Path changing [20] is also used as a countermeasure. There are two approaches: Path Confusion
and Path Perturbation. With the former, we take advantage of the fact that every time two
paths of different users meet, the probability that the adversary confuses the tracks and follows
the wrong user increases. With the latter, new segments are inserted in the map (so there are
more paths crossing) and some path segments are changed, raising the privacy by confusing the
opponent. Path Perturbation also depends on the characteristics of the original traces, because
if few crossing streets exist, the adversary could assume that all crossing segments have been
artificially inserted through a perturbation algorithm. Note that this approach obtains the best
performance for short parallel segments.
The privacy level depends on the density of the scenario and the quality of service constraint.
This dependency on user density means that, if it is low, we have to sacrifice QoS to obtain
the level of privacy that we need. Thus, adequate levels of privacy can only be obtained if the
user density is sufficiently high or the QoS is very low.
Figure 3.6: Deleting data to hide the home's location.
3.3.2.4 Mix Zones
The aim of Mix Zones is to prevent tracking of long term user movements, but still permit the
operation of many short-term location-aware applications. If sufficient users simultaneously
pass through these zones, an adversary cannot determine which path segments leading into and
out of a zone belong to the same users. This technique relies on statically defined zones, so it
cannot guarantee protection in areas with a low traffic density [8].
A Mix Zone for a group of users is a connected spatial region in which none of these users
has registered any application callback. We define the anonymity set as the group of people
visiting the Mix Zone during the same time period. The larger the anonymity set, the greater the anonymity offered, since the more users coming out of the Mix Zone, the easier it is for the attacker to get confused.
When users enter the Mix Zone, they become 'invisible' to the application: they stop receiving location information. Users change to a new pseudonym every time they enter a Mix Zone, so applications that see a user emerging from the Mix Zone cannot distinguish that user from any other who was in the Mix Zone, and cannot link people going into the Mix Zone with those coming out of it.
A user might enter different Mix Zones at different times. The attacker can observe the times, coordinates and pseudonyms of all the entry and exit events. The goal of the adversary is to reconstruct the correct mapping between new and old IDs, thereby identifying a user. If not all the members of the set are equally interesting to the observer, the anonymity set size is not a good measure. This is why other anonymity metrics have been defined [21] [22]. In this case, the members are not equiprobable and the entropy (explained in Sect. 5.2) will be lower, which means less anonymity, since the higher the entropy, the more uncertain an observer is about the true answer [22].
3.3.2.5 Privacy Grid
Privacy Grid is a framework for supporting anonymous location-based queries in mobile information delivery systems. Users can define their preferred privacy requirements in terms of both location hiding measures (k-anonymity, l-diversity, etc.) and location service quality measures (maximum spatial and temporal resolution.) Privacy Grid provides fast and effective cloaking algorithms for location k-anonymity and location l-diversity [19].
In Privacy Grid, the space is divided into cells. The smallest spatial cloaking area for each user must be found, and there are two approaches to find it: bottom-up and top-down. The bottom-up approach takes the base cell (minimum defined area) containing the mobile user and performs two checks for each message going to the LBS with k or l greater than one. The first check determines whether the current cell meets the user's spatial constraints; then the k-anonymity and l-diversity requirements are checked. If both conditions are fulfilled, the current cloaking area is chosen. If the constraints are not met, the algorithm expands to neighboring cells, making the cloaking area bigger; the cell chosen for each expansion is the one with the highest object count (see the sketch below.)
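To make the expansion concrete, here is a minimal C++ sketch restricted to the k-anonymity check; the Cell abstraction and all names are ours, and we assume the neighbor list is pre-sorted by decreasing object count to mimic the highest-count expansion rule:

#include <cstddef>
#include <vector>

struct Cell {
    int users;    // object count inside the cell (for k-anonymity)
    double area;  // cell area in square meters
};

// Grow the cloaking region from the base cell until it covers at least
// k users or exceeds the user's maximum spatial resolution.
bool bottomUpCloak(Cell region, int k, double maxArea,
                   const std::vector<Cell>& neighbors) {
    std::size_t next = 0;
    while (region.area <= maxArea) {
        if (region.users >= k)
            return true;               // constraints met: release this region
        if (next == neighbors.size())
            return false;              // no neighboring cells left to merge
        region.users += neighbors[next].users;
        region.area += neighbors[next].area;
        ++next;
    }
    return false;                      // exceeded the spatial constraint
}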
With top-down cloaking, we first find the largest grid cell region within the user-specified maximum spatial resolution area. This cloaking area is divided into a set of rows and columns. The algorithm checks whether the largest possible cloaking box meets the privacy requirements; if the region does not fulfill them, the message is not cloaked and the algorithm terminates. Otherwise, the algorithm iteratively removes rows and columns, checking the requirements at each step, until it finds the smallest cloaking region that still meets them.
Chapter 4
System model
4.1 Introduction
The definition of the model is very important, since the analysis methodology, the results obtained and their interpretation depend on it. In this chapter we include a description of the scenario and of every agent, with its functions and capabilities, trusted and untrusted entities, etc. In the following sections we describe the behaviour and characteristics of the participants: vehicles, Trusted Third Party (TTP) and public server (LBS.) We also describe the scenario in which the participants move, built in Netlogo. We show an example of our model in Fig. 4.1.
Figure 4.1: System model (User, TTP, LBS and Attacker).
4.2 Scenario
The scenario we simulate is the following: a city consisting of blocks, streets, traffic lights and cars (as could be, for example, the city of New York, shown in Fig. 4.2.) We only consider straight streets. The reason is that even if the streets of a city are curved, the user can only drive along the streets and turn at intersections. In our scenario there are bidirectional and unidirectional streets, as well as two-lane streets. Traffic lights control the traffic, as happens in a real scenario.
Figure 4.2: Example of a grid city: New York [23].
We choose a region of 12.5x12.5 km2, as we consider it big enough to represent a real city. Our scenario is not expressed in km, as we explain in Chapter 6; this is one big advantage of Netlogo, namely that we can easily change the size of our scenario (our city scenario in Netlogo is shown in Fig. 4.3.) We also consider a block size of 500 meters, and the maximum time that a traffic light remains red is 15 seconds.
In our model, cars drive around the city, reporting samples every t seconds, and we consider their maximum speed to be v. While driving, a car describes a path; we call this path the car's trajectory. However, the valuable samples (meaning those that contain useful information) are only the ones reported within a certain amount of time or space, even though the cars keep on reporting samples until the end. The reason is that if they stopped reporting samples, there would be fewer targets to follow, which would improve the tracking performance.
We chose an exponential distribution for the cars' travel times, inspired by the work of Krumm [24] and Hoh et al. [17]. We chose an exponential with median 14.4 minutes, as Krumm [24].
Figure 4.3: Netlogo city view.
4.3 Participants
4.3.1 Vehicles
In our model, we define two types of vehicles, to make our experiments as realistic as possible:
Commuters We consider N vehicles carrying a GPS receiver and a transmitter. They report samples containing the ID, timestamp, longitude, latitude, velocity, heading information and another timestamp (not related to real time, but to Netlogo simulation steps.) This last timestamp represents one simulation step in Netlogo. We need this second timestamp because our tracking algorithm needs to know how many cars are reporting samples at a concrete time and, since Netlogo shows small deviations in the time reports, this could cause problems in selecting which samples belong to a concrete discrete time. The transmitter only sends samples while the car is switched on, i.e., cars do not exist for the algorithm while they are parked. The sampling period is 5 seconds. In [17], samples are reported every minute, but our scenario is smaller, and 5 seconds is already challenging. The behaviour of the commuters is the following: they drive around the city and they stop whenever they have arrived at their destination or have driven for a certain time. We explain this in Sect. 6.2.
Non-commuters Besides the N commuters, we consider F vehicles that do not report their position, but drive from one place to a random destination, modifying the traffic conditions for the other vehicles. These vehicles aim at making our scenario as realistic as possible; we have no interest in them or in what they do.
In the scenario, both types of cars (R = N + F , in total) drive together, simulating real traffic
conditions.
4.3.2 Trusted Third Party and public server
In this model, we use a Trusted Third Party (TTP,) an entity which acts as a proxy so that privacy is ensured. As we explained in Sect. 3.2, there are other approaches to location privacy that do not use this entity [25]. However, the (centralized) Trusted Third Party is still the most widely used solution.
This entity receives the samples of the vehicles every second. We define a set of samples as all the samples released at a concrete time t. For example, the set of samples for t=0 would be s1, s2, ..., sn, with n the number of samples released at that time, each sample belonging to a different car. As explained in Sect. 4.3.1, N is the total number of cars reporting samples, thus n ≤ N, because some cars may be parked.
The public server is the server that receives the samples from the cars in order to offer the users various services. The TTP is located between this entity and the cars to ensure privacy, since the public server could be malicious and try to use the location information to infer private information about a user.
The TTP can release all samples directly or execute a privacy-enhancing algorithm (such as the Uncertainty-aware privacy algorithm [17].) When a privacy algorithm is performed, the TTP reveals to the public server only the samples resulting from the algorithm.
4.4 Adversarial model
4.4.1 Definition
Nowadays, there is a growing interest in tracking larger user populations rather than individual users. Anonymous location samples do not fully solve the privacy problem, because an adversary could link multiple samples to accumulate path information and eventually re-identify a user. Thus, sharing location information raises privacy concerns [26].
Tracking algorithms (explained in Sect. 5.4) aim at compromising the users' privacy. For example, multi-target tracking algorithms have applications in both military and civilian areas (e.g., air defense, air traffic control,) but not only in such high-profile fields: merely knowing where a person has been, or where that person lives, is valuable information nowadays. This is why tracking algorithms are continuously developed and improved.
Target tracking algorithms predict a target's position using the last known speed and heading information, and then decide which next sample to link to the same vehicle through Maximum Likelihood Detection [27]. If multiple candidate samples exist, the algorithm chooses the one with the highest a posteriori probability, based on a probability model of distance and time deviations from the prediction [17].
In our model, the attacker can be a malicious public server, or an entity that can eavesdrop on the communication channel between the TTP and the public server. In both cases, the attacker receives a set of anonymized samples and aims at linking them to obtain private information. The attacker can see the location and speed information, and we also assume that he knows the times at which the samples were released. The attacker therefore reads the samples in order, taking at each step all the samples of a concrete timestamp, and tries to link them using tracking algorithms. This can be achieved using Single Target or Multi Target Tracking algorithms.
4.4.2 Types of tracking
In the field of target tracking, we can follow two main approaches:
• Multi target tracking
• Single target tracking
Multi target tracking aims at tracking a larger user population. To this end, one of the most popular algorithms is Reid's multiple hypothesis tracking algorithm [28]. This algorithm is based on the Kalman filter [29], and it makes a prediction of the next possible state in order to link samples. The Kalman filter (explained in more detail in Sect. 5.2.2) is useful to predict the next state of the users and thereby track them. It is a set of mathematical equations that provides an efficient computational (recursive) means to estimate the state of a process, in a way that minimizes the mean squared error.
The biggest difficulty in applying multi target tracking is the problem of associating measurements with the appropriate tracks, above all when reports are missing. Another problem with multi target tracking is its high computational cost: it is not realistic to use it in a big scenario with more than three or four targets.
Single target tracking is easier to implement and needs fewer resources, so it can be performed on a big target population. However, it yields worse results than the multi target approach, since only one car is followed at a time. Although a multi target algorithm would be preferable, the computational effort required is too high to use it with a large number of cars. In our case, we consider that multi target tracking is beyond the attacker's possibilities, so he uses different types of single tracking algorithms (explained in Sect. 5.4.2.)
Chapter 5
Algorithms
5.1 Introduction
In this chapter we explain the different algorithms that we have used in our experiments. The importance of Netlogo is well illustrated here: normally, this type of experiment would require the collection of real data samples, which means that we would need access to cars, users that allow their data to be public, etc. With Netlogo, we only need to build a scenario and record the samples of the simulated cars (saving money and time while obtaining representative results.)
We deploy two types of algorithms: defense algorithms and attack algorithms. We use different attacks against different defenses and study their behaviour. Before explaining the algorithms, in Sect. 5.2 we introduce some preliminary concepts necessary to understand them.
The defense algorithms are used to avoid location privacy breaches. In this work we use two defenses: the random subsampling algorithm (Sect. 5.3.2) and the Uncertainty-aware privacy algorithm (Sect. 5.3.3.) These algorithms release fewer samples in order to prevent tracking, while still giving enough information for users to be able to use Location Based Services.
The attack algorithms aim at tracking users and obtaining information. It is important to ensure that an attacker does not learn a user's path or the places that user has visited, since this knowledge can be used in several (probably malicious) ways. We explain mainly two attacks in this chapter: Multi target tracking in Sect. 5.4.1 and the Single tracking algorithm in Sect. 5.4.2.
5.2 Mathematical Background
5.2.1 Entropy
Entropy is a measure of the uncertainty associated with a random variable. It is also called Shannon entropy (entropy in the information theory field) [30]. Entropy quantifies the information contained in a message or, equivalently, the average information content one is missing when one does not know the value of the random variable.
The entropy H of a discrete random variable X with possible values x1, ..., xn is:
$$H(X) = E(I(X))$$
where E is the expected value function and I(X) is the information content or self-information of X; I(X) is itself a random variable. If p(X) denotes the probability mass function of X, then the entropy can be explicitly written as
$$H(X) = -\sum_{i=1}^{n} p(x_i)\log_b(p(x_i))$$
with b the base of the logarithm. When computing entropy, b is normally equal to two, in which case the entropy is measured in bits.
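As a concrete illustration (ours, not from the thesis,) the entropy of a discrete distribution can be computed in a few lines of C++:

#include <cmath>
#include <vector>

// Shannon entropy, in bits, of a discrete probability distribution.
// The probabilities are assumed to sum to one; 0*log(0) is taken as 0.
double entropy(const std::vector<double>& p) {
    double h = 0.0;
    for (double pi : p)
        if (pi > 0.0)
            h -= pi * std::log2(pi);
    return h;
}

A uniform distribution over n outcomes gives log2(n) bits, the maximum uncertainty; e.g., entropy({0.5, 0.5}) is exactly 1 bit.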
5.2.2 Kalman filter
The Kalman filter is a recursive filter that estimates the state of a linear dynamic system from a series of noisy measurements in an efficient way. The main idea is that with the Kalman filter we can obtain the future state of a system, working on a prediction-correction basis. The state of the system is represented as a vector of real numbers, and at each discrete time increment a linear operator is applied to the state to generate a new one.
The problem is that normally the state variables are noisy and not directly observable, which means that there are errors from the sensors, the senders, etc. The filter predicts the next state using the measurements of the system (for example, the geographical position,) which are noisy and linearly related to the state. If the noise is Gaussian distributed, the Kalman filter estimator is statistically optimal [31].
The Kalman filter algorithm operates in three steps: prediction of a new system state; generation of hypotheses for the assignment of new samples to targets and selection of the most likely hypothesis; and adjustment of the system state. In the first step, the filter predicts the most likely next state of the system; the filter equations also allow computing the uncertainty of this estimation. In the second step, the filter generates hypotheses to assign the new state to the real situations that we have observed, and it then selects the most likely one. Finally, the system corrects the result, incorporating the information of the measurements.
The Kalman filter has a wide range of applications (for example, controlling a dynamic system.) However, its importance in this Master Thesis lies in its application to the prediction of dynamic systems that are difficult to control.
5.2.3 Euclidean metric
The Euclidean distance is the most common distance measure between two points. The Euclidean distance between two points A = (a_1, ..., a_n) and B = (b_1, ..., b_n) in a Euclidean n-space is defined as:
$$\sqrt{(a_1 - b_1)^2 + \cdots + (a_n - b_n)^2} = \sqrt{\sum_{i=1}^{n}(a_i - b_i)^2}$$
Netlogo provides a two-dimensional (2D) scenario, thus in our case A and B are 2D vectors, i.e., A = (a_x, a_y), B = (b_x, b_y). The Euclidean distance in this scenario is:
$$d = \sqrt{(a_x - b_x)^2 + (a_y - b_y)^2} \qquad (5.1)$$
5.2.4 Random number generator
One of the defense algorithms, the random subsampling algorithm explained in Sect. 5.3.2, needs random numbers in order to decide whether a sample is released or not. To that end, we programmed a pseudo-random number generator, more specifically a multiplicative linear congruential generator. A linear congruential pseudo-random number generator satisfies Eq. 5.2:
$$Z_i = (aZ_{i-1} + c) \bmod m, \qquad 0 \le Z_i \le m - 1 \qquad (5.2)$$
where $Z_i$ is the random number ($Z_0$ being the seed,) m must be a prime number, a must be a primitive root of m, which means that it has to satisfy Eq. 5.3, and c is an increment satisfying c < m (in our case c = 0):
$$a^n \bmod m \ne 1, \qquad n = 1, \ldots, m - 2 \qquad (5.3)$$
As we said, we use the multiplicative variant, which means that in Eq. 5.2 c = 0. We show how we obtain the random numbers in Algorithm 1.
Algorithm 1 Random number generator algorithm
1: h = Z_{i-1} div q
2: l = Z_{i-1} mod q
3: t = a · l − r · h
4: if t > 0 then
5:   Z_i = t
6: else
7:   Z_i = t + m
8: end if
where m, the maximum period (the maximum length of the series of random numbers we can obtain,) is equal to 2^31 − 1, a = 48271, q = m div a = 44488 and r = m mod a = 3399. This decomposition (due to Schrage) allows computing a·Z_{i-1} mod m without overflow.
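For completeness, a runnable C++ version of Algorithm 1 could look as follows; the function and constant names are ours (the parameters coincide with the well-known Park-Miller 'minimal standard' generator with a = 48271):

#include <cstdint>
#include <iostream>

const int32_t m = 2147483647;  // 2^31 - 1, prime: the maximum period
const int32_t a = 48271;       // primitive root of m
const int32_t q = 44488;       // m div a
const int32_t r = 3399;        // m mod a

// One step of Algorithm 1: returns Z_i given Z_{i-1}, never overflowing
// 32-bit arithmetic thanks to the q/r decomposition.
int32_t nextRandom(int32_t z) {
    int32_t h = z / q;         // integer division, not modulo
    int32_t l = z % q;
    int32_t t = a * l - r * h;
    return (t > 0) ? t : t + m;
}

int main() {
    int32_t z = 12345;         // seed Z_0, with 0 < Z_0 < m
    for (int i = 0; i < 5; ++i) {
        z = nextRandom(z);
        std::cout << z / double(m) << "\n";  // uniform value in (0, 1)
    }
}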
5.3 Privacy enhancing algorithms
5.3.1 Introduction
Privacy enhancing algorithms are a wide array of technical measures to protect users' privacy [32]. Nowadays, we live in a world where our data can identify us, so it is very important to have means of protecting it.
There are many types of privacy enhancements. In our case, the algorithms aim at anonymizing the samples reported by cars, so that private information cannot be inferred from those samples. In the next sections we explain two privacy enhancing algorithms. In Sect. 5.3.2 we explain the random subsampling algorithm, a simple algorithm that randomly removes samples. Then, in Sect. 5.3.3, we explain the Uncertainty-aware algorithm, which is also based on sample removal, but not at random: the algorithm chooses to remove the most 'compromising' samples. This way it removes fewer samples while maintaining the privacy level.
5.3.2 Random subsampling algorithm
We define subsampling as a reduction of the sampling rate. This is a really simple approach to increase privacy, since all the algorithm has to do is delete samples to make the tracking more difficult. The fewer samples we report, the more difficult it is to link them, since the samples are farther apart from each other; the prediction is then more likely to be wrong, and the attacker ends up assigning a wrong path to a user.
However, this algorithm does not obtain good results, since it deletes samples without taking into account the available information, such as the position of a car or the number of cars surrounding it. To illustrate this, consider the following example: if a user is at an intersection and the next sample is not reported, this is better in terms of privacy than if the user withholds a sample where there is no intersection and the user can only go straight. In the latter case, the attacker can infer the direction of the user even without the non-released sample. To overcome this weakness, other algorithms (such as the one explained in Sect. 5.3.3) have been developed.
In our simulations we have applied random subsampling. This variant of subsampling consists of the following: at every timestamp, we randomly decide to release a user's sample or to delete it. To make this decision we use a custom pseudo-random number generator instead of the default C++ function random(); we programmed this generator in C++ to make it probabilistically sound. The algorithm we use is the multiplicative linear congruential generator explained in Sect. 5.2.4. With this approach the tracking ability of an attacker is reduced, since the fewer samples are released, the more difficult it is for an attacker to follow the path.
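The release decision itself then reduces to a comparison, as in the following sketch, which reuses the nextRandom and m defined in the sketch of Sect. 5.2.4 above (names are ours):

#include <cstdint>

// Decide whether to release a sample given release probability phi.
bool releaseSample(int32_t& z, double phi) {
    z = nextRandom(z);          // advance the generator state
    double u = z / double(m);   // map to a uniform value in (0, 1)
    return u < phi;             // release the sample iff u < phi
}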
5.3.3 Uncertainty-aware privacy algorithm
5.3.3.1 Definitions
In this subsection we define some terms that are necessary to understand the Uncertainty-aware algorithm.
Definition 1 Time-to-confusion: the maximum time that an adversary can correctly follow a user, i.e., for how long an attacker can follow a target with sufficient certainty that the samples correspond to that user.
Definition 2 Tracking uncertainty: the entropy of the probability distribution describing the certainty of an attacker in the assignment of samples to users. In Eq. 5.4, $p_i$ denotes the probability that location sample i belongs to the vehicle currently tracked. Lower values of H denote more certainty, i.e., lower privacy.
$$H(X) = -\sum_{i=1}^{n} p_i \log_2(p_i) \qquad (5.4)$$
Definition 3 Mean time-to-confusion: the mean tracking time during which the uncertainty stays below a confusion threshold.
5.3.3.2 Algorithm definition
The Uncertainty-aware privacy algorithm was developed by Hoh et al. [17]. It is a privacy algorithm which aims at providing anonymity even in low-density areas. It works as follows: the user defines a maximum allowable time-to-confusion and an associated uncertainty threshold (explained in Sect. 5.3.3.1.) The algorithm receives sets of samples as input, processes each set, and releases only the samples that ensure that the tracking bounds are maintained. The strength of this algorithm is that it guarantees a maximum time-to-confusion.
Using this algorithm, the maximum time-to-confusion is maintained for a concrete uncertainty. To ensure this, the algorithm follows two criteria to release samples. The first rule mandates that a sample can be revealed if the time since the last point of confusion is less than the defined maximum time-to-confusion. The second rule mandates that a sample can only be released if the tracking uncertainty (Eq. 5.4) is above a defined threshold.
Algorithm 2 describes the processing of one set of samples; it is repeated until there are no more samples. The input of the algorithm is the set of GPS samples reported at time t (v.currentGPSSample, updated for each vehicle,) the maximum time-to-confusion (confusionTimeout,) and the associated uncertainty threshold (confusionLevel.) With these three inputs, it generates as output a set of GPS samples that can be released without compromising privacy.
Algorithm 2 Uncertainty-aware privacy algorithm
1: releaseSet = releaseCandidates = ∅
2: for all vehicles v do
3:   if start of trip then
4:     v.lastConfusionTime = t
5:   else
6:     v.predictedPos = v.lastVisible.position + (t − v.lastVisible.time) ∗ v.lastVisible.speed
7:   end if
8:   //release all vehicles below time-to-confusion threshold
9:   if t − v.lastConfusionTime < confusionTimeout then
10:     add v to releaseSet
11:   else
12:     //consider release of others dependent on uncertainty
13:     v.dependencies = k vehicles closest to v.predictedPos
14:     if uncertainty(v.predictedPos, v.dependencies) > confusionLevel then
15:       add v to releaseCandidates
16:     end if
17:   end if
18: end for
19: //prune releaseCandidates
20: for all v ∈ releaseCandidates do
21:   if ∃w ∈ v.dependencies : w ∉ (releaseCandidates ∪ releaseSet) then
22:     delete v from releaseCandidates
23:   end if
24: end for
25: repeat pruning until no more candidates to remove
26: releaseSet = releaseSet ∪ releaseCandidates
27: //release GPS samples and update time of confusion
28: for all v ∈ releaseSet do
29:   publish v.currentGPSSample
30:   v.lastVisible = v.currentGPSSample
31:   neighbors = k closest vehicles to v.predictedPos in releaseSet
32:   if uncertainty(v.predictedPos, neighbors) ≥ confusionLevel then
33:     v.lastConfusionTime = t
34:   end if
35: end for
In a first step, the algorithm selects the samples that can be safely revealed because less time than confusionTimeout has passed since the last point of confusion (line 9.) In the second step, from line 12 to line 27, the algorithm selects the vehicles that have a tracking uncertainty above the threshold, and releases them. In this step some approximations are made; even though without them more samples would be released, they are enough to maintain the privacy guarantees. The final tracking uncertainty is not calculated with all the samples reported at time t, but only with the k samples closest to the predicted point. Since the uncertainty would be bigger if more samples were taken into account, this is a conservative approach. Furthermore, since the uncertainty should only be computed over revealed samples, which are not determined yet, the algorithm chooses a set of releaseCandidates. This set is pruned until it contains only vehicles that meet the uncertainty threshold. 'The key property to achieve after the pruning step is that ∀v ∈ releaseCandidates : uncertainty(v.predictedPos, k closest neighbors in releaseSet ∪ releaseCandidates) ≥ confusionLevel' [17]. For every vehicle it must be ensured that, after pruning, all its k neighbors are in the releaseSet.
The last operations (line 27 and onward,) performed after deciding which GPS samples can be released, update the last confusion point and the last visible GPS sample for each vehicle. Note that confusion should only be calculated over released samples, not over all the samples. The last three lines are needed for path prediction in the uncertainty calculation.
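To connect Algorithm 2 with Eq. 5.4, here is a sketch of how uncertainty() could be computed over the k nearest candidate samples. The exponential distance weighting is our assumption for illustration; [17] derives the probabilities from a model of distance and time deviations from the prediction:

#include <cmath>
#include <vector>

// Entropy (Eq. 5.4) of the probabilities that each candidate sample
// belongs to the tracked vehicle, given their distances d_i to the
// predicted position; 'scale' controls the assumed decay.
double uncertainty(const std::vector<double>& distances, double scale) {
    std::vector<double> w;
    double sum = 0.0;
    for (double d : distances) {
        double wi = std::exp(-d / scale);   // assumed distance weighting
        w.push_back(wi);
        sum += wi;
    }
    double h = 0.0;
    for (double wi : w) {
        double p = wi / sum;                // normalize to probabilities
        if (p > 0.0) h -= p * std::log2(p); // Eq. 5.4
    }
    return h;                               // in bits; higher = more privacy
}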
5.4 Tracking algorithms
A tracking algorithm is an algorithm that tries to associate samples with the appropriate tracks. In our study, two tracking algorithms are applied to the samples collected in the Netlogo simulation in order to reconstruct the path followed by a user. We do not perform the step of recovering identity information, e.g. the name of the user.
When choosing a tracking algorithm we have to take into account not only its performance, but also the computational effort required, because resources and time are limited. An adversary could employ at least three approaches to link location samples [33]; we introduce them in the following paragraphs.
Trajectory-based linking assumes that a user is more likely to continue travelling on the same
trajectory, rather than changing direction.
A second approach is Map-based linking. This algorithm correlates location samples with likely routes on a road map. These routes can in turn be used to predict users' positions and to link future samples. In other words, the attacker uses geographic, traffic and map information to obtain a better prediction. For example, if a user is at an intersection where he can only turn right (because turning left is forbidden,) then the adversary uses this knowledge in order not to link the user's following sample to one going in a forbidden direction.
Finally, Empirical linking connects samples based on prior movements that have been observed
at a given location. An attacker uses past information to link samples. This means that if an
user is in a specific situation, and this situation has also happened in the past, the attacker
takes into account this information (e.g., which direction did the user follow) to make a better
prediction.
In the following sections we explain different target tracking algorithms. In Sect. 5.4.1 we explain the Multi target algorithm, and in Sect. 5.4.2 and Sect. 5.4.3 we explain two different types of Single tracking algorithms. The difference between single and multi target algorithms is that with the former we follow only one target at a time, while with the latter we take all the users into account. Besides, since the approach in most of the literature (for example, in [17]) is Trajectory-based linking, because it requires the least effort for large-scale outdoor positioning systems, we restrict our analysis to this approach.
5.4.1 Multi target tracking algorithm
In multi target tracking algorithms, the attacker is not interested in following every user independently: there is a higher interest in tracking larger user populations rather than individual users. The problem of linking location samples to potential users is known as the data association problem in multi target tracking systems. The idea of these algorithms is to minimize the error when assigning the positions of new location samples to the predicted positions of all targets.
We choose Reid's multiple hypothesis tracking algorithm, which is based on Kalman filtering (explained in Sect. 5.2.2.) The variant explained in this section considers that after every step only one hypothesis survives, i.e., at each step we calculate the likelihood between predictions and new samples assuming that the previous assignments were correct.
5.4.1.1 State prediction
The first step is the Kalman filter prediction step. In this step, given the current state, we predict the next state using the available measurements (in our case, samples) and adding noise, as explained in Sect. 5.2.2. The filter prediction is described by the following formulas:
$$x_k = F x_{k-1} + w_k \qquad (5.5)$$
$$z_k = H x_k + v_k \qquad (5.6)$$
In these equations, F is a matrix which, given a previous state $x_{k-1}$, describes the next one; $w_k$ is noise, and $x_k$ is the state vector of the process at time k. $z_k$ is the new observation vector, obtained from the prediction $x_k$ using H, which converts a state vector into the measurement domain. There are two types of noise in these equations: $w_k \sim N(0, Q_k)$ is the process noise vector and $v_k \sim N(0, R_k)$ is the measurement noise vector, where $N(\mu, cov)$ represents the Gaussian distribution with mean $\mu$ and covariance $cov$. $w_k$ and $v_k$ are assumed to be mutually independent, normally distributed with covariance matrices $Q_k$ (Eq. 5.7) and $R_k$ (Eq. 5.8.) As we can see, only the main diagonal of these matrices is different from zero, because the noises are assumed to be independent. The non-zero values represent the variance in the state variables (in $Q_k$) and in the measurement vector variables (in $R_k$,) since we cannot assume that the measurements are perfect.
$$Q = \begin{pmatrix} Q_k & 0 & 0 & 0 \\ 0 & Q_k & 0 & 0 \\ 0 & 0 & Q_k & 0 \\ 0 & 0 & 0 & Q_k \end{pmatrix} \qquad (5.7)$$
$$R = \begin{pmatrix} R_k & 0 & 0 & 0 \\ 0 & R_k & 0 & 0 \\ 0 & 0 & R_k & 0 \\ 0 & 0 & 0 & R_k \end{pmatrix} \qquad (5.8)$$
As our tracking application is two-dimensional, we can model the system state as $x_k = [p_x, p_y, v_x, v_y]'$, which contains the position and the speed in the x and y axes, respectively. We define F, the matrix that multiplies the state vector, as:
$$F = \begin{pmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \qquad (5.9)$$
If we use this matrix, we predict the position at the next timestamp (the x and the y; the predicted speed remains the same,) since $e = v \cdot t$, with e the space, v the speed and t the time.
At every new step, the filter makes a prediction of the new position of the user as:
$$\bar{x}^{k+1} = F\hat{x}^k \qquad (5.10)$$
$$\bar{P}^{k+1} = F\hat{P}^k F^T + Q \qquad (5.11)$$
where $\bar{x}$ is the mean and $\bar{P}$ is the covariance of a multivariate normal distribution, and $\hat{x}$ and $\hat{P}$ are the estimates after the last sample was received.
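As a small illustration, the prediction of Eq. 5.10 with the F of Eq. 5.9 can be written in C++ as follows (types and names are ours; the covariance update of Eq. 5.11 is omitted for brevity):

#include <array>

using Vec4 = std::array<double, 4>;                 // state [px, py, vx, vy]
using Mat4 = std::array<std::array<double, 4>, 4>;

// F from Eq. 5.9: each position advances by the speed, speeds are kept.
const Mat4 F = {{{1, 0, 1, 0},
                 {0, 1, 0, 1},
                 {0, 0, 1, 0},
                 {0, 0, 0, 1}}};

Vec4 multiply(const Mat4& A, const Vec4& x) {
    Vec4 y{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            y[i] += A[i][j] * x[j];
    return y;
}

// Prediction step of Eq. 5.10: x_pred = F * x_est.
Vec4 predict(const Vec4& xEst) { return multiply(F, xEst); }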
5.4.1.2 Second step: Hypothesis Generation and Selection
A hypothesis is defined as a possible assignment of new samples to users. In this step, when the attacker receives the new samples, the algorithm generates one hypothesis for each permutation of the sample set, i.e., it considers all possible assignments between samples and users. Finally, it computes the likelihood of each hypothesis and chooses the one with maximum likelihood.
Let $\Omega_i$ be a hypothesis and $Z^k$ a set of measurements with cardinality M. We define
$$P_i^k \equiv P(\Omega_i^k | Z^k) \approx \prod_{m=1}^{M} f(z_m) \qquad (5.12)$$
which is the probability of $\Omega_i$ given the measurements at time k. f represents the conditional probability density function of the vector $z^k$, which obeys a multivariate normal distribution (Eq. 5.13):
$$f(z^k | \bar{x}^k) = N(z^k - H\bar{x}^k, B) \qquad (5.13)$$
Equation 5.13 calculates the proximity of a prediction to the observation; these values are then combined into the probability of each hypothesis. B is defined as $B = H\bar{P}^k H^T + R$, and $N(x, P)$ is the normal distribution
$$N(x, P) = \frac{e^{-\frac{1}{2}x^T P^{-1} x}}{\sqrt{(2\pi)^n |P|}} \qquad (5.14)$$
where the values $\bar{x}^k$ and $\bar{P}$ are the ones calculated in the prediction step.
Finally, the hypothesis j with maximum probability is chosen and we calculate the log-likelihood ratio:
$$\log \Lambda^k = \log \frac{P_j^k}{\sum_{i=1, i \neq j}^{I} P_i^k} \qquad (5.15)$$
5.4.1.3 Third step: State correction
This is the last step, needed in order to update the predicted system state vector for each path with the Kalman gain $K = \hat{P}H^T R^{-1}$. After the assignment of the observation vectors to the targets, according to the chosen hypothesis, it is also necessary to update the difference between the predicted vector and the chosen observation. The new samples give direct information about the current state of the system, and with them the algorithm corrects the most recent predictions by incorporating the newly gained information.
$$\hat{x}^k = \bar{x}^k + K[z^k - H\bar{x}^k] \qquad (5.16)$$
$$\hat{P}^k = \bar{P} - \bar{P}H^T(H\bar{P}H^T + R)^{-1}H\bar{P} \qquad (5.17)$$
Equations 5.16 and 5.17 describe the correction step.
Once $\hat{x}^k$ and $\hat{P}^k$ have been updated, the three steps described above (State Prediction, Hypothesis Generation and Selection, and State Correction) are repeated for the following set of data (the data at time k+1,) using the new $\hat{x}^k$ and $\hat{P}^k$, so that the system is corrected.
5.4.1.4 Limitations
Although multi target tracking yields really good results, it is computationally expensive, and it becomes unfeasible when a big number of targets is tracked. This is the reason why, in real situations with a big number of targets, an attacker cannot use this algorithm and has to resort to simpler algorithms, obtaining worse results but needing less time and memory. In the next section we introduce a much simpler algorithm: the single tracking algorithm.
5.4.2 Simple Single Tracking Algorithm
5.4.2.1 Motivation
Although the optimal option would be to perform a multi tracking algorithm, its complexity makes it unworkable for a big number of cars. In [33], the authors are able to evaluate the privacy risk against multi-target tracking because the scenario is made up of only five cars, so it is computationally feasible. However, in the majority of the literature (e.g., [17],) the tracking algorithm performed is single tracking.
5.4.2.2 Description
An attacker performing the Single Tracking Algorithm can only follow one car at a time. This makes the computation much easier, but this time and memory reduction has its consequences, because the results are worse than with other, more complex algorithms.
In the Simple Single Tracking Algorithm (SSTA) we define $\hat{x}^k$ as the estimation of the position of a target vehicle, and $\bar{x}^k$ as the prediction of the next position of this target. We assume that the attacker has knowledge of the times the samples were released. The attacker reads set after set of samples (a set of samples being defined in Chapter 4,) until the data samples are finished. For the first set of samples, at k=0, we assign each sample to one $\hat{x}^k$, assuming that they are correct. After this initialization step, the algorithm works as follows: first, we read a new set of samples. Second, knowing the estimation of the prior position $\hat{x}^k$, we calculate the predicted position as:
$$\bar{x}^{k+1} = F\hat{x}^k \qquad (5.18)$$
where F is the matrix defined in Eq. 5.9. The last step is to find the sample closest to the predicted position; to this end, we use the Euclidean metric (Eq. 5.1.) When the closest sample is found, we assign it to $\hat{x}^{k+1}$. We repeat this procedure for all the samples. The output of this algorithm is the set of tracked paths.
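Putting Eq. 5.18 and Eq. 5.1 together, one SSTA step could be sketched as follows in C++ (types and names are ours; note that the loop always picks some sample, which is exactly the weakness discussed in the next subsection):

#include <cmath>
#include <limits>
#include <vector>

struct State { double px, py, vx, vy; };  // position and speed, in patches

// One SSTA step: predict the next position (position += speed, Eq. 5.18)
// and link the released sample closest to the prediction (Eq. 5.1).
State sstaStep(const State& est, const std::vector<State>& samples) {
    State pred{est.px + est.vx, est.py + est.vy, est.vx, est.vy};
    double best = std::numeric_limits<double>::max();
    State chosen = pred;
    for (const State& s : samples) {
        double d = std::hypot(s.px - pred.px, s.py - pred.py);
        if (d < best) { best = d; chosen = s; }
    }
    return chosen;  // becomes the new estimate for the next step
}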
5.4.2.3 Limitations
We have to be aware of the fact that the algorithm always assigns some sample to $\hat{x}^k$ (after the prediction step,) even if this sample is too far away to belong to the observed target, although cars have a limited speed and thus their movement within a concrete time is limited. When all samples are released this is not a problem, and good tracking results are obtained. However, when a privacy defense algorithm removes samples, the SSTA performs badly.
The reason for this bad performance is that the algorithm always chooses a sample, even if it is too far away to be part of a possible path for that car. Thus every time a car does not report a sample, the algorithm chooses the closest one, which never corresponds to that car. Suppose, for example, that at time k we have two cars, A and B, situated in the city at positions $S_A$=(1,1) and $S_B$=(9,9), with coordinates in km. If at this concrete time k car B does not report a sample, the algorithm will choose the $S_A$ coordinate (assuming that the other cars releasing samples are farther away,) since there is no other closer sample, even though it is impossible for car B to have moved that distance in one timestamp.
In the next section we present an algorithm that mitigates this effect.
5.4.3 Distance-aware Single Tracking Algorithm
The Distance-Aware Single Tracking Algorithm (DATA) works similarly to the previous one, introducing one modification so that, when a sample is not reported, the tracking algorithm does not assign impossible samples to cars. When using DATA, after calculating the closest sample, we check whether the distance to the last known position is more than two patches (patches are the individual squares of the Netlogo grid; each patch is equivalent to 125 meters in reality.) If so, then $\hat{x}^{k+1} = \hat{x}^k$, since the closest sample probably does not correspond to the path of the target vehicle. In our experiments we chose two patches as the threshold because the maximum speed of a car is 0.56 patches/tic: in the worst case, if the closest sample is farther than two patches, the target has not reported a sample for at least four timestamps, and in that case we would probably have lost the track anyway.
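In code, the DATA modification reduces to a single distance gate on top of the SSTA step sketched in Sect. 5.4.2 above (the threshold is two patches in our experiments; names are ours):

#include <cmath>
#include <vector>

// Reject physically impossible links: if the chosen sample is farther
// than 'threshold' from the last known position, keep the old estimate.
State dataStep(const State& est, const std::vector<State>& samples,
               double threshold) {
    State chosen = sstaStep(est, samples);
    double d = std::hypot(chosen.px - est.px, chosen.py - est.py);
    return (d > threshold) ? est : chosen;
}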
Chapter 6
Experiments
6.1 Introduction
In this chapter we present the results of the experiments performed using Netlogo to evaluate
Location Privacy defenses and attacks.
In Sect. 6.2 we explain the experimental setup, i.e., the conditions under which the experiments were performed. This is an adaptation of the model explained in Chapter 4 to the Netlogo world.
Section 6.3, Sect. 6.4 and Sect. 6.5 show the results obtained when applying the approaches to releasing samples described in Chapter 5: releasing all samples, random subsampling and the Uncertainty-aware algorithm. In each of these sections, we apply the two attacks explained in Sect. 5.4.2 and Sect. 5.4.3 to the outputs of the defense algorithms and show their tracking performance. We also study the effect of the different system parameters on our results: the number of cars reporting samples (N,) the total number of cars (R,) the anonymity set size (k,) the confusion level (β) and the sample release probability (φ).
Finally, in Sect. 6.6 we compare the effectiveness of the three defenses against the tracking algorithm explained in Sect. 5.4.3.
6.2 Experimental setup
In this section we explain the selection of the experimental parameters, the scenario, the participants and their behaviour. For our experiments, the samples were obtained with Netlogo, processed with algorithms programmed in C++, and the results were represented with Matlab.
The scenario is a 100x100 patch (the Netlogo distance unit) scenario. We consider that one patch is equivalent to σ meters in reality. We choose σ=125, which means that our scenario is equivalent to a region of 12.5x12.5 km2. We show the Netlogo scenario in Fig. 6.1.
Figure 6.1: Netlogo implementation of the city.
Further, we have to decide the speed of the cars in the experiments. As our experiments aim at simulating the traffic in a city, cars may drive with a speed v of at most 50 km/h (≈14 m/s, see Fig. 6.2.) To achieve this, we choose the Netlogo time unit, the tic, to be 5 seconds (1 tic = 5 seconds,) such that the maximum permitted speed is v = 0.56 patches/tic. We recall that one tic is one step of the Netlogo simulation, i.e., one time step.
In our scenario, N cars drive and report GPS location samples. In our first approach to modelling the traffic, which we call no-exponential, the vehicles are initially placed randomly in the city. We consider the place where they are first allocated to be their home. When the car's home is allocated, we define a random destination, which we consider the car owner's working place (work.) These cars behave as follows: they drive from home to work and vice versa until the simulation ends, reporting a sample every tic.
Contrary to [17], where most of the users worked in the same place, we choose to give cars random destinations. This corrects a potential error, since in [17] the common work place acted as a place of confusion where tracking algorithms failed. We think that the same-work-place assumption is not realistic, because in real life drivers have different destinations.
Figure 6.2: Maximum permitted speed in a city.
In [17], to obtain the N cars that report samples, the authors overlaid data obtained from the same cars on different days (thus the same paths from different days,) because they could not collect data from a large number of cars. What they did was the following: for every available car (they had M < N cars available for their experiments) they collected data on different days. As they wanted to show experiments corresponding to 24-hour GPS traces, they overlaid the data (choosing randomly from the 'paths' created by the cars) from different days into a single day. This overlaying method is a limitation, since it generates many similar routes by aggregating GPS traces from the same set of drivers. This may not represent true traffic conditions, since the overlaying creates a non-existent high-density scenario where vehicles might be driven by the same driver on the same route. To avoid these problems, we do not use an overlay method to obtain all the samples that we need: our scenario is made up of traces from N different cars. This approach provides more realistic results, and our simulation environment allows us to place as many cars as necessary.
In [24] and [17], the car trip times were exponentially distributed. However, with the model described above, the distribution of trip times (shown in Fig. 6.3a) was not exponential at all, so we decided on a change. As we would like our results to be comparable to those in [24] and [17], we define a new scenario, exponential, where instead of going from home to work and vice versa, cars randomly 'wander' for a concrete time τ; when τ expires, they park, remain parked for a random time κ, and then start driving again.
To ensure an exponential distribution of the trip times, every τ is a sample from an exponential distribution. We do this to follow the empirical statistics from real GPS traces in Krumm's work [24], confirmed in [17]. We take as the median of the exponential distribution 14.4 minutes, as reported in [24]. Since the samples follow an exponential distribution, we can obtain its mean knowing that:
$$mean = 1/\lambda \qquad (6.1)$$
$$median = \ln(2)/\lambda \qquad (6.2)$$
We obtain that the mean is approximately 1246.6 seconds and, since one tic corresponds to 5 seconds of real time, we have to generate samples from an exponential distribution with mean 249 tics, i.e., with parameter $\lambda = \frac{1}{249}$. The resulting distribution is shown in Fig. 6.3b.
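Drawing such trip times is straightforward with the standard C++ <random> facilities, as in the following sketch (the thesis does not specify how its samples were generated):

#include <random>

// Draw one trip duration tau, in tics, from an exponential distribution
// with mean 249 tics (median 14.4 minutes at 5 seconds per tic).
double drawTripTime() {
    static std::mt19937 gen{std::random_device{}()};
    static std::exponential_distribution<double> trip(1.0 / 249.0);
    return trip(gen);
}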
(a) no-exponential scenario  (b) exponential scenario
Figure 6.3: Empirical distribution of vehicle trip times (frequency vs. time per trip, in tics).
In [17], where the samples used were collected from real cars, the distance errors between the real and the predicted positions follow an exponential function. Figure 6.4 shows that the same error distribution is obtained in our Netlogo model, indicating that the modelling of traffic conditions with Netlogo is close to reality.
Figure 6.4: Distance errors in tracking (number of samples vs. distance deviation from prediction).
To prevent the variability between different scenarios from affecting our results, every graph in the following sections aggregates 10 Netlogo experiments. This provides more reliable results, as we are not looking at only one path per car from a single simulation. All results are shown as histograms representing the percentage of cars that are tracked for a given percentage of their path.
6.3 Experiments releasing all samples
In this section, we show the results obtained when the proxy releases all samples, and no
anonymity algorithm is applied.
Figure 6.5a shows the results in a scenario where N=10 and R, the total number of cars in the scenario, is 500, with the SSTA (Simple Single Tracking Algorithm, explained in Sect. 5.4.2) as the tracking algorithm. We observe that the percentage of tracked cars in this scenario is really high, and even the cars whose path is not fully tracked are tracked above 60% of it. This is a logical result, since with only 10 cars the samples from different cars are normally far from each other, which makes things easy for the tracking algorithm. However, in some cases the algorithm can be mistaken at intersections or crossing roads and follow another car from that confusion point.
(a) SSTA: N=10  (b) DATA: N=10
Figure 6.5: Releasing all samples. N =10.
In Fig. 6.5b, we use the same scenario as before, but we apply the DATA (Distance-Aware Tracking Algorithm, explained in Sect. 5.4.3.) As we can observe, the tracking results are the same, since DATA only provides better results when samples are removed.
In Fig. 6.6a and Fig. 6.6b, we show the results with more congested traffic conditions: instead of 10 cars, N is increased to 100 (with R=500.) Figure 6.6a shows the results when performing the SSTA. We observe that when the number of cars increases, the difficulty of tracking increases too, because there are more samples with which the algorithm can be 'fooled'. However, the percentage of cars tracked when releasing all samples is still high. Figure 6.6b shows the result when performing the DATA. We observe that, as in the previous case, the numbers of cars and percentages are the same, since no samples are removed.
(a) SSTA: N=100  (b) DATA: N=100
Figure 6.6: Releasing all samples. N =100.
6.4 Experiments subsampling
In this section we show the results obtained when cars do not release all the samples, but use
the subsampling algorithm explained in Sect. 5.3.2.
Figure 6.7a shows the results in a scenario where N=10, R=500 and the attacker uses the SSTA. The sample release probability, φ, is 0.8. This means that the probability of releasing a sample is 0.8, i.e., for each sample we generate a random number η, and if η < φ, the sample is released.
We observe that the percentage of path tracked is really low for the majority of the cars. This might seem odd, since only 20% of the samples are removed. It has a simple explanation: for every set of samples, the tracking algorithm chooses the sample closest to the prediction as the next sample in the path. The problem is that if the car has not reported a sample in that set (at a concrete timestamp,) the algorithm still assigns the car a sample, even one too far away to be a possible option for that car. From that point on, the algorithm follows an incorrect track. This is the reason for the poor behaviour even at a low removal rate.
In Fig. 6.7b, the DATA is applied to track the cars. We observe a significant improvement of the results, since this algorithm fixes the problem of the SSTA when a sample is not released.
In Fig. 6.8a and Fig. 6.8b the number of cars, N, is increased to 100, to show the effect of a higher density in the same scenario. Figure 6.8a shows the results when performing the SSTA. We observe that the percentage of path tracked improves for some cars (between 60% and 70%,) but this does not mean that having more cars is better for the tracking algorithm; actually, it is the opposite, because we also observe that the number of cars tracked for less time is higher (more than 70% of the cars are tracked less than 10% of the path, and we also observe a decrease in the 10%-20% bin.)
(a) SSTA: N=10, φ=0.8  (b) DATA: N=10, φ=0.8
Figure 6.7: Subsampling algorithm. N=10, φ=0.8.
Figure 6.7: Subsampling algorithm.N =100, φ=0.8.
100
100
90
90
80
80
Percentage of cars tracked
Percentage of cars tracked
higher (more then 70% of the cars are tracked only less than the 10%, and we also observe a
diminution between 10% and 20%.)
70
60
50
40
30
70
60
50
40
30
20
20
10
10
0
0
20
40
60
Percentage of path tracked
(a) SSTA: N =100, φ=0.8
80
100
0
0
20
40
60
Percentage of path tracked
80
100
(b) DATA: N =100, φ=0.8
Figure 6.8: Subsampling algorithm.N =100, φ=0.8.
In Fig. 6.8b, we observe a big improvement in tracking success. This is the result of applying the DATA instead of the SSTA. With DATA, if a sample is not sufficiently close, it is not chosen, i.e., the algorithm is not forced to choose a sample; one is only chosen if it is likely to belong to the tracked path.
In the next experiment, we change φ. Now, instead of releasing a sample with probability 0.8, we release it with probability 0.5, which means that on average we release only half of the samples. Figure 6.9a shows the results applying the SSTA. The results are really poor, and this shows that the bigger the sample removal, the worse the performance of the tracking algorithm. As we can see, more than 90% of the cars are tracked less than 10% of the path, and only one can be tracked for the whole path. Figure 6.9b shows the result of applying the DATA. Again, we can observe a big improvement compared to the prior case. There are more cars tracked 100% of the path, and moreover we see cars tracked 50% (and more) of the path, which does not happen when the SSTA is performed.
(a) SSTA: N=100, φ=0.5  (b) DATA: N=100, φ=0.5
Figure 6.9: Subsampling algorithm. N=100, φ=0.5.
6.5 Experiments Uncertainty-aware algorithm
In this section we show the results obtained when the Uncertainty-aware algorithm (explained in Sect. 5.3.3) decides whether a sample can be released or not.
Figure 6.10a shows the results in a scenario where N=10 and R=500. The parameters of the Uncertainty-aware algorithm are set as follows: k=2 and β=0.4.
First we apply the SSTA and observe that the tracking results are poor. This is a logical result for two reasons. First, as the number of cars is low, few cars meet the uncertainty constraint when the algorithm is performed. The second reason is the same as for Fig. 6.7a: the SSTA performs really badly when samples are removed.
In Fig. 6.10b, we see that applying the DATA improves the tracking results. We observe that the number of cars tracked between 0% and 10% decreases, and that the number of cars tracked more than 50% of the path increases. This is because DATA works better than SSTA when samples are removed.
In Fig. 6.11a and Fig. 6.11b we set N=100 to show the effect of more congested traffic conditions. Figure 6.11a shows the result when applying the SSTA. It provides worse results (we can see that the number of cars tracked 100% of the path decreases,) but not much worse than in the 10-car scenario (Fig. 6.10a.) This is because the Uncertainty-aware algorithm provides good privacy protection even in low-density scenarios (as proved in [17],) which is why the tracking results are poor.
(a) SSTA: N=10, k=2, β=0.4  (b) DATA: N=10, k=2, β=0.4
Figure 6.10: Uncertainty-aware algorithm. N=10, k=2, β=0.4.
(a) SSTA: N=100, k=2, β=0.4  (b) DATA: N=100, k=2, β=0.4
Figure 6.11: Uncertainty-aware algorithm. N =100, k=2, β=0.4.
In Fig. 6.11b we present the results of applying the DATA, which are better than those of the SSTA. We can see that the number of cars tracked 100% of the path is higher, and we also observe a decrease between 0% and 10%. Even though the results improve, the improvement is smaller than in the subsampling scenario (Fig. 6.8b.) We comment on this in Sect. 6.6.
In the next experiment we change k and β, now setting k=10 and β=0.2. This experiment aims at showing the importance of the selection of these parameters in the algorithm. Figure 6.12a shows the results when performing the SSTA. With a lower β, more samples are released, since their uncertainty is above the threshold imposed by β. As we can see in the figure, with the reduction of this threshold the percentages of path tracked are higher, with a significant difference compared to the situation in Fig. 6.11a. Figure 6.12b shows the result of performing the DATA. The tracking results are better than in Fig. 6.12a: the number of cars tracked between 10% and 30% decreases, while the number of cars tracked between 50% and 100% increases.
(a) SSTA: N=100, k=10, β=0.2  (b) DATA: N=100, k=10, β=0.2
Figure 6.12: Uncertainty-aware algorithm. N =100, k=10, β=0.2.
6.6 Comparison of different defense algorithms
In this section we compare the results obtained when applying the different defense algorithms, as well as the case in which no defense is applied and all samples are released. We choose relevant plots, comparing them two by two, always under the DATA attack. In these experiments, N=100.
The structure of this section is the following. First, we compare releasing all samples against subsampling. Second, we compare releasing all samples against the Uncertainty-aware algorithm. Finally, we compare random subsampling against the Uncertainty-aware algorithm.
In Fig. 6.13a and Fig. 6.13b we compare the results obtained when releasing all samples and when applying the subsampling algorithm with φ=0.8.
As we can see in Fig. 6.13a and Fig. 6.13b, the attacks perform much better when all samples are released. This is a logical result, since the farther apart consecutive samples are reported, the more difficult it is for the tracking algorithm to link them.
In Fig. 6.14a and Fig. 6.14b we compare the results obtained when releasing all samples and when applying the Uncertainty-aware algorithm with parameters k=2 and β=0.4.
Figure 6.13: Comparison between releasing all samples and subsampling. N=100. (a) DATA, reporting all samples; (b) DATA, subsampling algorithm with φ=0.8. [Plots of percentage of cars tracked vs. percentage of path tracked.]
Figure 6.14: Comparison between releasing all samples and the Uncertainty-aware algorithm. N=100. (a) DATA, reporting all samples; (b) DATA, Uncertainty-aware algorithm with k=2 and β=0.4. [Plots of percentage of cars tracked vs. percentage of path tracked.]
In Fig. 6.14a and Fig. 6.14b we can see that there is a clear difference between performing the Uncertainty-aware algorithm and releasing all samples. The tracking algorithm performs better when no defense is applied, since having all consecutive samples makes them easier to link.
In Fig. 6.15a and Fig. 6.15b we compare the results obtained by subsampling with φ=0.8 against the Uncertainty-aware algorithm with parameters k=2 and β=0.4.
The results, shown in Fig. 6.15a and Fig. 6.15b, are important because they show that choosing when to remove samples (based on the available information) achieves better protection than removing them at random.
Figure 6.15: Comparison between subsampling and the Uncertainty-aware algorithm. N=100. (a) DATA, subsampling algorithm; (b) DATA, k=2 and β=0.4, Uncertainty-aware algorithm. [Plots of percentage of cars tracked vs. percentage of path tracked.]
Part III: Learning algorithms
Chapter 7
Preliminaries
7.1 Introduction
Artificial Intelligence (AI) is a branch of computer science that deals with the simulation of intelligent behaviour in computers. The aim of AI is to design intelligent agents, where an intelligent agent is an entity capable of taking actions, depending on the perceived environment, that maximize its chances of success [34].
Thinking machines and artificial beings already appeared in the Greek myths. Throughout history, writers and thinkers have wondered whether an intelligent machine (where intelligence usually meant the capacity to act like a human) could be created and, if so, whether such a machine could feel. However, it was not until the middle of the 20th century that scientists began to design and build intelligent machines, based on new discoveries and on the improvement of computer technology. In fact, artificial intelligence as we know it today was born at the Dartmouth College conference in 1956. Since then, the field has gone through periods of great optimism (even believing that a robot could ultimately be human-like) and periods of disappointment, but today it is an important part of technology and industry, in continuous development.
Artificial Intelligence has many application fields, including medical diagnosis, robot control and video games, and its importance grows every day.
7.1.1 Intelligence
One of the main problems of the Artificial Intelligence field has been defining and agreeing on what intelligence is. In fact, there are several definitions and tests of a machine's ability to demonstrate intelligence, for example the Turing test [35]. The test consists of the following: a human judge has a written conversation with a machine and a human being, but the two are isolated and the judge cannot see them. If, after the conversation, the judge is not able to determine with certainty which was the human and which was the machine, the machine is said to be intelligent.
Figure 7.1: Turing test [36]. [Illustration.]
However, nowadays there is a widely accepted definition of intelligence. An agent is said
to be intelligent if it has the following properties: autonomy, social ability, reactivity and pro-activeness. An agent has autonomy if it exercises control over its own actions and state; that is, it does not need the intervention of other beings to act. This is the most important property of an agent. In Fig. 7.2 we can see that an agent receives some inputs, processes them, and acts, changing the environment. Social ability means that an agent interacts with other agents via an agent communication language; this property also covers possible communication with humans. A reactive agent is one that can perceive events in the environment and react to them appropriately. The last quality is pro-activeness: we do not want an agent to simply interact with the environment, driven solely by reaction to events; we want it to show goal-directed behaviour, i.e., to generate and attempt to achieve goals. If an agent has these properties, it is said to be intelligent. There are other qualities that are not required but are desirable, for example mobility (the capacity to move), veracity (not communicating false information), rationality (taking actions which, given its knowledge of the environment, maximize its chances of success) and adaptability (changing its behaviour in response to changes in the environment).
Figure 7.2: Agent action in an environment. [Diagram: the system receives input from the environment and produces output that acts on it.]
7.1.2 Subfields, tools and applications
AI is divided into different subfields and methods to solve the different problems found in this area. Some general methods are the following: search and optimization, which includes search algorithms, optimization and evolutionary computation; logic-based methods, mainly logic programming and automated reasoning; and probabilistic methods for uncertain reasoning, which include, for example, Bayesian networks, Hidden Markov Models and Kalman filters. Another important subfield (explained in more detail in Sect. 7.4) is statistical learning methods. Finally, we also find Neural Networks, which are explained in Sect. 7.3. These last two fields are described in detail because we use their algorithms in our experiments.
A simpler classification is the following: there are non-learning intelligent agents and learning intelligent agents. In this Master Thesis, we are interested in the second type. Figure 7.3 and Fig. 7.4 show the difference between a non-learning agent and a learning agent. We can see that the non-learning agent receives an input and simply generates an output, since its behaviour is already defined. The learning agent, however, processes this input and empirically learns how to act. In the following sections we introduce the learning concept, as well as different algorithms and learning techniques.
7.2 Multi-agent Learning
As we can see in Fig. 7.5, Multi-agent Learning is the intersection of Multi-agent Systems and Machine Learning [37].
In this setting, agents learn and adapt in the presence of other agents that are simultaneously learning and adapting. Moreover, an agent can learn even if it is the only one doing so; there is still interaction with other agents while learning.
Figure 7.3: Non-learning agent. [Diagram: sensors provide inputs observed from the environment, the agent decides an action, and actuators carry it out.]
7.3 Artificial Neural Networks
Artificial Neural Networks [38] are composed of interconnected artificial neurons: mathematical functions or programming constructs that act as abstractions of biological neurons. Put another way, neurons are simple nodes connected to form a network. These nodes are simple processing elements, although together they can exhibit global behaviour. An artificial neuron receives some inputs and sums them to produce an output (Fig. 7.6).
Artificial Neural Networks (referred to from now on as Neural Networks) are used either to understand biological neural networks or to solve artificial intelligence problems without creating a model of a real biological system.
The neural network has to be trained. The three major learning paradigms for doing this are Supervised Learning, Unsupervised Learning (both explained in Sect. 7.4) and Reinforcement Learning (Sect. 7.5).
These networks are useful because they can infer a function from observations, also in domains where designing a function by hand is difficult because the data or the tasks are too complex. This is why Neural Networks are used in several areas, for example game playing and decision making, pattern recognition, and system identification and control.
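To make the neuron model concrete, the following minimal Python sketch (illustrative only, not part of the thesis code) computes the weighted sum of the inputs and passes it through an activation function; the sigmoid activation and the example values are assumptions, since the text does not fix them.

import math

def neuron_output(inputs, weights, bias):
    # Artificial neuron: weighted sum of the inputs plus a bias term.
    s = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Sigmoid activation (an illustrative choice) squashes the sum into (0, 1).
    return 1.0 / (1.0 + math.exp(-s))

# Example: a neuron with two inputs.
print(neuron_output([0.5, -1.0], [0.8, 0.3], bias=0.1))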
Figure 7.4: Learning agent. [Diagram: a critic compares the sensors' feedback with a performance standard and suggests changes to the learning element, which uses its knowledge and learning algorithms to update the performance element; a problem generator proposes exploratory actions carried out through the actuators.]
Figure 7.5: Multi-agent Learning as the intersection of Multi-agent Systems and Machine Learning. [Venn diagram.]
7.4 Machine Learning
Machine Learning is a subfield of artificial intelligence that deals with the capacity of machines to learn from the actions they have performed or from the environment [39].
As with intelligence, there is no universally accepted definition of learning. However, as a guideline, whenever a machine changes its structure, program or data (based on inputs or in response to external information) to improve its performance, we can say that the machine learns. There are many reasons why a machine should learn, for example to cope with changes in the environment over time, or to extract relationships from large amounts of data (data mining).
Figure 7.6: Simple scheme of the nodes of a Neural Network. [Diagram.]
There are several Machine Learning algorithms. One possible classification is based on the desired output of the algorithm. Among them we have Supervised Learning, Unsupervised Learning and Semi-supervised Learning. In these three, the goal is to find a function that maps inputs to desired outputs; the difference is that in the first there are labelled examples (input-output pairs of a function), in the second there are no examples, and in the third there are both labelled and unlabelled examples. A fourth approach is Transduction, which is similar to Supervised Learning but does not explicitly construct a function. Finally, with Reinforcement Learning (explained in more detail in Sect. 7.5) the agent observes the world and learns a policy of how to act in each situation.
7.5 Reinforcement Learning
Reinforcement Learning (RL) is a sub-area of Machine Learning in which the goal of the agent is to maximize the long-term reward. The problem RL tries to solve is which action the agent ought to take in a given environment, i.e., to find a policy that defines which action the agent must take in each state. Nowadays there are many unsolved problems, for example in flight control systems or automated manufacturing systems. These problems are not unsolved because we need a faster or better computer; the difficulty is determining what the program should do. They could be solved if a computer could learn to solve tasks through trial and error. That is what Reinforcement Learning aims to achieve: to solve the problem of an agent that must learn a behaviour through trial-and-error interactions with a dynamic environment.
There are two main strategies for solving Reinforcement Learning problems. The first approach is used in genetic algorithms and genetic programming: it consists of searching the space of behaviours to find one that performs well in the environment. The second approach is to use statistical techniques and dynamic programming. In this case, Reinforcement Learning is a mix of two disciplines, Dynamic Programming [40] and Supervised Learning [41], and it manages to solve problems that neither of the two can solve individually [42].
Finally, the capability of yielding powerful machine-learning systems, its generality and the fact that we 'only' have to give the computer a goal to achieve have made Reinforcement Learning very appealing to researchers, for example in robot rescue.
7.5.1 Environment and reinforcement function
In the RL system model, an agent is placed in a dynamic environment, which it observes (at least partially) through sensors, readers, etc. In most problems, it is assumed that the agent perceives the exact state of the environment.
In every state (S) the agent chooses an action (A) and generates an output. This action changes the state of the environment, and the agent receives a scalar number called the reinforcement signal (R). This reward is positive or negative depending on the result of the performed action. The goal of the agent is to perform actions that maximize the sum of the reinforcements received. Thus a reward function (R) must be defined that rewards some actions while punishing others.
Using the reward function, the agent has to find a policy π that determines which action should be performed in each state. The optimal policy is the mapping from states to actions that maximizes the sum of the reinforcements.
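The interaction loop just described can be summarised in a few lines of Python. The sketch below is a hypothetical skeleton: the names env.state(), env.step() and policy are assumptions made for the example, not part of the thesis model.

def run_episode(env, policy, steps):
    # Generic RL interaction loop: observe the state, act, collect reward.
    total_reward = 0.0
    for _ in range(steps):
        s = env.state()    # the agent observes the environment (state S)
        a = policy(s)      # the policy pi maps the state to an action A
        r = env.step(a)    # the action changes the environment; the agent
        total_reward += r  # receives the reinforcement signal R
    return total_reward    # the goal: maximize the sum of reinforcements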
7.5.2 Future rewards
To maximize the long-term reward, an agent has to take the future into account in the decision it is making now. There are three main models for doing this [43]: the finite-horizon model, the infinite-horizon discounted model and the average-reward model.
With the finite-horizon model, at every step the agent optimizes its expected reward over the next n steps (see Eq. 7.1):
E\left( \sum_{t=0}^{n} r_t \right) \qquad (7.1)
where E is the average function, t are the steps and r_t represents a scalar reward.
The infinite-horizon discounted model takes all the long-run reward into account, but discounts the rewards received in the future with a discount factor 0 ≤ γ ≤ 1 (see Eq. 7.2):
E\left( \sum_{t=0}^{\infty} \gamma^{t} r_t \right) \qquad (7.2)
where again E is the average function, t are the steps and r_t represents a scalar reward.
Finally, in the average-reward model the agent aims to take actions that maximize the long-run average reward (see Eq. 7.3); such a policy is also known as a gain-optimal policy. The problem is that this criterion cannot distinguish between two policies where one gains a lot in the initial phases and the other does not (because the long-run average hides the initial gains). This is solved by generalizing the model (the bias-optimal model), where among such policies the one that also maximizes the initial gains is preferred.
\lim_{h \to \infty} E\left( \frac{1}{h} \sum_{t=0}^{h} \gamma^{t} r_t \right) \qquad (7.3)
7.5.3 Markov Decision Process
In Reinforcement Learning, the environment is typically formulated as a finite-state Markov Decision Process (MDP).
An MDP consists of four objects: S, the state space; A, the set of actions; P_a(s, s'), the probability of making a transition from state s to state s' when performing action a; and R_a(s, s'), the immediate reward that the agent receives after going from state s to s' with transition probability P_a(s, s').
In the Reinforcement Learning problem, the agent's actions determine its next reward and the next state of the environment. In delayed reinforcement [43], for example, the agent receives little reinforcement when taking some actions, but it finally arrives at a state with a high reward. In this case, the agent learns which actions to take even though it does not receive a big reward until the end.
Using the infinite-horizon criterion, we want to maximize the cumulative function of the rewards (see Eq. 7.4):
E\left( \sum_{t=0}^{\infty} \gamma^{t} R_{a_t}(s_t, s_{t+1}) \right) \qquad (7.4)
where γ is the discount factor and R_{a_t}(s_t, s_{t+1}) is the reward obtained when performing action a_t, going from state s_t to state s_{t+1}.
We now have to define a solution for the MDP. This solution can be expressed as a policy π, which maps states to actions (i.e., which action must be chosen in every state). We will show the optimal solution for the infinite-horizon criterion, although the optimal solution can also be derived for the finite-horizon function; with the infinite one, however, an optimal deterministic stationary policy exists [44].
With π the decision policy, the optimal value of a state is the expected sum of rewards that an agent will gain if it executes the optimal policy starting from that state:
V(s) = \max_{\pi} E\left( \sum_{t=0}^{\infty} \gamma^{t} r_t \right) \qquad (7.5)
V^* (see Eq. 7.6) is the optimal value function, and it is unique. Under this function, using the best available action in state s, the agent gets the expected instantaneous reward plus the expected discounted value of the following state. The optimal value function is defined as the solution to the equations:
V^{*}(s) = \max_{a} \left( R(s, a) + \gamma \sum_{s' \in S} P_{a}(s, s')\, V^{*}(s') \right), \quad \forall s \in S \qquad (7.6)
We define the optimal policy from the optimal value function:
\pi^{*}(s) = \arg\max_{a} \left( R(s, a) + \gamma \sum_{s' \in S} P_{a}(s, s')\, V^{*}(s') \right), \quad \forall s \in S \qquad (7.7)
There are two ways to find an optimal policy. One is to find the optimal value function, which can be determined iteratively with the value iteration algorithm; a sketch of value iteration is shown below. The other is the policy iteration algorithm which, instead of finding the optimal policy via the optimal value function, manipulates the policy directly.
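As an illustration of the first option, the following Python sketch implements value iteration for a small MDP; the dictionary layout for P and R is an assumption made for the example, not the thesis implementation.

def value_iteration(states, actions, P, R, gamma=0.9, eps=1e-6):
    # Iterate Eq. 7.6: V(s) = max_a (R(s,a) + gamma * sum_s' Pa(s,s') V(s')).
    # P[(s, a)] maps each next state s' to Pa(s, s'); R[(s, a)] is the reward.
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            v = max(R[(s, a)] + gamma * sum(p * V[s2]
                    for s2, p in P[(s, a)].items()) for a in actions)
            delta = max(delta, abs(v - V[s]))
            V[s] = v
        if delta < eps:          # stop when the values have converged
            break
    # Greedy policy with respect to V (Eq. 7.7).
    pi = {}
    for s in states:
        pi[s] = max(actions, key=lambda a: R[(s, a)] + gamma *
                    sum(p * V[s2] for s2, p in P[(s, a)].items()))
    return V, pi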
7.6 Predator-prey problem
The predator-prey problem [45] is a challenging problem in the field of distributed artificial intelligence. It is used as a generic multi-agent testbed because it can illustrate different multi-agent scenarios; it is a 'toy' problem for trying new algorithms and making concepts concrete. Although it does not capture a complex real-world domain by itself, more complex worlds can be based on it.
There are two kinds of participants: predators and preys. They move in a scenario formed by a discrete grid of squares (see Fig. 7.7). The goal of the predators is to capture the prey, which means being in the same grid square as the prey or surrounding it (Fig. 7.8). The goal of the prey, on the other hand, is not to be captured.
Predators and preys can only move to adjacent squares, and movement is turn-based, so each participant can only move in its turn.
Figure 7.7: Predator-prey model. [Grid diagram showing a predator, a prey and the possible movements.]
Figure 7.8: Capture of a prey by surrounding. [Grid diagram.]
To measure the 'quality' of an algorithm, the predators execute the developed MAS algorithm and we observe the results of the simulations (how many times the predators capture the prey, the time they need to do so, etc.). As for the prey, it can also use an algorithm to try to escape from the predators, or simply move randomly.
Everything described above is just the basic model, with one prey and four predators. The predator-prey problem also has variants to adapt it to different needs: we can change the definition of capture, the size and shape of the world, etc.
Chapter 8
System Model
8.1 Introduction
In this chapter we explain our model, named SinCity. We describe the scenario (a city), the participants and a new pursuit problem, which is a more complex version of the predator-prey problem.
This chapter is structured as follows: in Sect. 8.2.1 we review previously existing city models. In Sect. 8.2.2 we explain the improvements we made to those scenarios to obtain a more realistic one. Finally, in Sect. 8.3, we describe the participants in the scenario.
8.2 Scenario
8.2.1 Previous models
The first approach to modelling vehicular traffic, included in the NetLogo distribution [2], is Traffic Basic [46] (Fig. 8.1). It models the movement of cars on a highway. Each car follows a simple set of rules: it slows down (decelerates) if it sees a car close ahead, and it speeds up (accelerates) if it does not see a car ahead. The model demonstrates how traffic jams can form even without any 'centralized cause'.
Figure 8.1: Traffic basic model.
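The car-following rule can be paraphrased in a few lines of Python (the actual model is written in NetLogo); the acceleration and deceleration constants below are illustrative values, not the model's exact defaults.

def update_speed(speed, gap, accel=0.005, decel=0.03, max_speed=1.0):
    # Traffic Basic rule: slow down if a car is close ahead, else speed up.
    # `gap` is the distance to the car ahead, or None if the road is clear.
    if gap is not None and gap < speed:
        speed = max(speed - decel, 0.0)        # brake to avoid a collision
    else:
        speed = min(speed + accel, max_speed)  # clear road: accelerate
    return speed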
Using the car movement of the previous model, a small city with traffic lights is modelled in Traffic Grid [47] (shown in Fig. 8.2), also included in the NetLogo distribution. It consists of an abstract traffic grid with intersections between cyclic single-lane arteries of two types: vertical and horizontal. It is possible to control the traffic lights, the cars' speed limit and the number of cars, creating a real-time traffic simulation. This allows the user to explore traffic dynamics, develop strategies to improve traffic and understand the different ways of measuring traffic quality.
Figure 8.2: Traffic grid model.
Using the Traffic Grid model as a starting point, a more complex model called Self-Organizing Traffic Lights (SOTL) is presented in [48]. Cars flow in a straight line, eastbound or southbound by default. Each crossroad has traffic lights that allow cars to move, with a green light, along only one of the arteries that intersect it; yellow or red lights stop the traffic. The light sequence for a given artery is green-yellow-red-green. Cars simply try to drive at a maximum speed of one patch per time step, but they stop when a car or a red or yellow light is in front of them. A patch is a square of the environment with the size of a car. Time is discrete, but space is continuous. The environment is shown in Fig. 8.3. The user can change different parameters, such as the number of arteries or cars.
Figure 8.3: SOTL model.
This model displays different statistics: the number of stopped cars, their average speed and their average waiting times. In this scenario, three self-organizing methods for traffic light control outperform traditional ones, since the agents are 'aware' of changes in their environment and can therefore adapt to new situations.
8.2.2 City Improvements: our model
Our model is a more realistic city scenario. The main agents in this model are:
Intersections: Agentset containing the patches that are intersections of two roads.
Controllers: Agentset containing the intersections that control traffic lights. Controllers
occupy only one patch per intersection.
Roads: Agentset containing patches forming roads. There are four sub-agentsets, depending on whether the road is southbound, northbound, eastbound or westbound.
Buildings: Agentset containing patches forming buildings.
Exits: Agentset containing the patches where cars leave the simulation.
Gates: Agentset containing the patches where cars sprout from.
Given the model of city traffic described in Sect. 8.2.1, we have made some improvements to represent a more realistic scenario. In the previous model, roads have a single lane and direction, and there are only two directions by default, south and east, although this can be changed to four by adding north and west. We have added the possibility of bidirectional roads and roads with two lanes in the same direction. Also, the previous model used a torus by default, meaning that when a car travelling from west to east reaches the eastern edge of the scenario, the same car reappears at the western edge. To increase realism, we remove the torus, impose four directions (north, east, south and west) and create a by-pass road, the outermost road of the scenario, to improve traffic.
We have also changed the car creation and elimination scheme. In our model, for every car we define a source (a random road patch) and a destination (another random road patch), such that every car is created at its source and moves (following the shortest path) to its destination, where it is eliminated. The sources and destinations may lie outside the world, depending on the value of two sliders called origin-out and destination-out, leading to some cars appearing and disappearing at the borders of the world.
We have modified the control methods so that, instead of just one yellow-light cycle, there are now as many yellow-light cycles as patches in the intersection; i.e., if the traffic light protects a bidirectional road with two lanes in each direction, there will be four yellow-light cycles. To correct deadlocks at the intersections, a deadlock algorithm has been implemented: if a given car at an intersection has not moved after a given time, it tries to change direction in order to keep moving and leave the deadlock. This movement affects other cars and helps to resolve the current deadlock.
Due to all these improvements, especially the origin/destination scheme and the bidirectional roads, a more complex algorithm to guide the cars is needed. In previous models, a car only changes road or direction according to a probability prob-turn. In our model, whenever a car is on a patch that is an intersection (it belongs simultaneously to a horizontal and a vertical road), it runs a guiding algorithm to decide whether a change of direction is necessary before moving on. This algorithm works as follows: first it checks in which direction the car is moving and the possible directions the car can follow at the intersection. It then checks where the destination is and, with all this information, decides to go in the direction most likely to be the shortest. For example, if a car is moving north, at the intersection it can move north or east, and the destination lies more to the southeast, the algorithm decides to go east. If a car is not at an intersection, it keeps the same direction until the next intersection. A sketch of this decision is shown below.
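The sketch below, in Python, illustrates this decision under the assumption of grid coordinates where east increases x and north increases y; the function names and the set-based representation of the open exits are hypothetical.

def choose_direction(car_pos, dest_pos, allowed):
    # Guiding algorithm sketch: at an intersection, take the allowed exit
    # that most reduces the distance to the destination.
    dx = dest_pos[0] - car_pos[0]   # positive: destination lies to the east
    dy = dest_pos[1] - car_pos[1]   # positive: destination lies to the north
    gain = {"north": dy, "south": -dy, "east": dx, "west": -dx}
    return max(allowed, key=lambda d: gain[d])

# Example from the text: destination to the southeast, intersection allows
# north and east -> the car turns east.
print(choose_direction((0, 0), (3, -2), {"north", "east"}))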
As seen in Fig. 8.4, with these changes we obtain a more realistic scenario in which we can notice the different widths of the streets, depending on whether they are bidirectional and single- or dual-lane, as well as the distribution of the traffic lights and the by-pass road surrounding the city.
Figure 8.4: SinCity model.
8.3 Participants
Our city simulation is an extension of the predator-prey pursuit problem (explained in Sect. 7.6), where the prey is replaced by a thief car and the predators by a set of police cars. The goal of the thief is to rob the bank and arrive safely at the hideout, while the police cars have to catch it after it robs the bank.
In every challenge, the thief car starts driving at normal speed towards a city bank. It stops in front of the bank, commits the theft and gets away to its hideout at double speed. Police cars, on the other hand, patrol the city at random before the theft takes place. When the thief car robs the bank an alarm is triggered, and the police cars double their speed and patrol the city trying to identify the thief's car.
The chase begins when any police car sees the thief, before it arrives at its hideout, on the same road and at a distance of two blocks or less. If the thief's car is seen, all police cars learn its position; if it is lost, they keep the position where the thief was last seen as their target. If a police car arrives at that point but the thief car is not in sight, the chase stops, the other police cars are prevented from going to that place, and the patrol continues.
We consider that the thief is captured when it is surrounded by police on a road (two police cars) or at an intersection (four police cars). We consider that the thief escapes when it reaches its hideout and enters it without being seen by any police car (we call this event 'thief wins'). Note that during the chase the thief does not head for its hideout; instead, it tries to escape from the police. Besides, there are other cars in the city that cause the thief or police cars to reduce their speed during the pursuit, from double speed to normal, if they are directly ahead.
Fig. 8.4 shows a snapshot of the SinCity map and data, with the thief car highlighted. The bank is marked in red at the upper right and the hideout in green at the center. In every challenge the positions of the bank and the hideout change, but they must be at a distance greater than 25% of the map size. At the bottom and right of the map there is a graphic display with outputs related to the simulation. We can also configure several parameters related to the algorithms explained in Chapter 9.
Chapter 9
Algorithms
9.1 Introduction
In all the subfields of artificial intelligence, many algorithms have been developed to try to solve the problems that agents face. In this chapter we focus on four of them. In Sect. 9.2 we explain the Korff algorithm, a non-learning algorithm and an important solution to the predator-prey problem, on which our model is based. In Sect. 9.3 we explain the Self-Organizing Map (SOM), a Neural Network algorithm. Finally, in Sect. 9.4 and Sect. 9.5, we explain two Reinforcement Learning algorithms: Learning Automata and Q-learning.
9.2 Korff Algorithm
Korff developed this algorithm to solve the predator-prey problem (described in Sect. 7.6). It is a simple non-learning algorithm [49], but a really effective one. His solution requires only sensing (an agent can perceive the surrounding environment) and action from preys and predators. He considered that the predators are attracted towards the prey. The strength of this attraction can be computed with Eq. 9.1, where d is the distance from the predator to the prey and S is the score function; each predator chooses the neighbouring cell with the best (lowest) score.
S = d(\text{prey}) \qquad (9.1)
However, with this solution, the predators do not surround the prey, because they pile up and obstruct each other. The solution is to consider repulsion between predators. This repulsion appears in the second term of Eq. 9.2, where k is a constant that models the repulsive force between predators and the distance in that term is measured to the nearest other predator. In this variant, each predator again moves to the neighbouring cell with the best score.
S = d(\text{prey}) - k \cdot d(\text{nearest predator}) \qquad (9.2)
With this new approach, the predators do not obstruct each other, and they eventually surround the prey.
As for the prey, in both approaches its strategy is to move to the neighbouring cell that is farthest from the predators. If the prey is faster than the predators, this allows it to escape [50].
One problem with Korff's algorithm is that it assumes the predators can see the whole environment; when this is not true, its performance degrades. However, it works well when predators have a complete view of the field.
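A Python sketch of the scoring rule of Eq. 9.2, under the lowest-score convention used above; the Euclidean distance and the value of k are illustrative assumptions.

import math

def korff_score(cell, prey, others, k=0.5):
    # Eq. 9.2: distance to the prey minus k times the distance to the
    # nearest other predator (the repulsion term keeps predators apart).
    d = lambda a, b: math.hypot(a[0] - b[0], a[1] - b[1])
    repulsion = min(d(cell, p) for p in others) if others else 0.0
    return d(cell, prey) - k * repulsion

def korff_move(candidate_cells, prey, others, k=0.5):
    # The predator greedily takes the neighbouring cell with the best score.
    return min(candidate_cells, key=lambda c: korff_score(c, prey, others, k))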
9.3 Self-Organizing Maps Algorithm
Self-Organizing Maps (SOMs) [51] are a data visualization technique that reduces the dimensionality of data through the use of self-organizing Neural Networks. The algorithm is a type of Artificial Neural Network trained with unsupervised learning to produce a low-dimensional (typically two-dimensional), discretized representation of the training samples, called a map.
A Self-Organizing Map consists of components called neurons, each with an associated weight vector. The SOM describes a mapping from a higher-dimensional input space to a lower-dimensional map space: the algorithm places a data vector on the map by finding the neuron whose weight vector is closest to it and assigning that neuron's map coordinates to the data vector. This closest neuron is called the BMU (best matching unit).
In a SOM, the goal of learning is that different parts of the network respond similarly to similar input patterns. The SOM algorithm adjusts the weights of the BMU and of the neurons close to it, using the update formula shown in Eq. 9.3:
W_{n}(t+1) = W_{n}(t) + \Theta(t)\,\alpha(t)\,(I(t) - W_{n}(t)) \qquad (9.3)
where W_n(t) is the weight vector of neuron n, α(t) is a monotonically decreasing learning coefficient and I(t) is the input data vector. The neighbourhood function Θ(t) depends on the lattice distance between the winning (closest) neuron and neuron n. The magnitude of the change decreases with time and with distance from the BMU.
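One training step of Eq. 9.3 can be sketched with NumPy as follows; storing the map as an array of weight vectors and using a Gaussian neighbourhood are illustrative choices, not the thesis implementation.

import numpy as np

def som_step(weights, coords, x, alpha, sigma):
    # weights: (n, dim) weight vectors; coords: (n, 2) lattice positions.
    bmu = np.argmin(np.linalg.norm(weights - x, axis=1))  # best matching unit
    lattice_d2 = np.sum((coords - coords[bmu]) ** 2, axis=1)
    theta = np.exp(-lattice_d2 / (2 * sigma ** 2))   # neighbourhood Theta(t)
    weights += alpha * theta[:, None] * (x - weights)  # Eq. 9.3 update
    return weights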
9.4 Learning Automata
Learning Automata (LA) is a type of Reinforcement Learning algorithm and a branch of the theory of adaptive control. It was originally described as a finite-state automaton [52], but later a probability distribution was used to describe the internal state of the agent: the agent chooses actions according to the probabilities given by this distribution.
For each agent, a matrix (Ψ) is defined that contains the probability vector of each state. The items of the probability vector contain the probability of taking a given action in a given state. If, for example, an agent has three possible states and four possible actions, the matrix looks like Eq. 9.4, where P(s_i, x) represents the probability of taking action x in state s_i and each row is the probability vector of one state. The probability vectors are updated and adjusted depending on previous successes and failures; this is how the agents learn which actions to perform.
\Psi = \begin{pmatrix} P(s_1, a) & P(s_1, b) & P(s_1, c) & P(s_1, d) \\ P(s_2, a) & P(s_2, b) & P(s_2, c) & P(s_2, d) \\ P(s_3, a) & P(s_3, b) & P(s_3, c) & P(s_3, d) \end{pmatrix} \qquad (9.4)
Learning Automata uses two very simple update rules, shown in Eq. 9.5 and Eq. 9.6:
P(s, a) = P(s, a) + \alpha\,(1 - P(s, a)) \qquad (9.5)
P(s, b) = (1 - \alpha)\,P(s, b) \quad \text{for } b \neq a \qquad (9.6)
where P(s, a) is the probability that the agent takes action a in state s and α is a small learning factor. We apply these updates only when the performed action succeeds. In this way we increase the probability of taking action a in state s, while punishing (decreasing the probability of) the remaining actions for that state.
This algorithm always converges, giving a vector of zeros and a single one that marks the 'winning' action. However, it can converge to an incorrect action; this probability can be reduced by making α small [43].
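The two rules translate directly into a few lines of Python; the sketch below updates one row of Ψ (the probability vector of the current state) after a successful action.

def la_update(probs, action, alpha=0.1):
    # Apply Eqs. 9.5 and 9.6 only when the chosen action succeeded.
    for b in range(len(probs)):
        if b == action:
            probs[b] = probs[b] + alpha * (1.0 - probs[b])  # Eq. 9.5: reward
        else:
            probs[b] = (1.0 - alpha) * probs[b]             # Eq. 9.6: punish
    return probs

# Example: four actions with uniform probabilities; action 2 succeeded.
print(la_update([0.25, 0.25, 0.25, 0.25], action=2))

Note that the pair of updates preserves a valid probability vector: the probability added to the successful action, α(1 − P(s, a)), equals the total probability removed from the other actions.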
9.5 Q-learning
Q-learning (QL) [53] is also a Reinforcement Learning algorithm. It is used in Markovian domains and gives agents the capability of learning to act optimally through the experience of the consequences of their actions. The algorithm makes the agent learn a mapping representing the best action to perform in every possible state.
In Q-learning, the learned decision policy is determined by the value function Q(s, a) (Eq. 9.8). This function is used to update the components of a matrix (Γ), defined for each agent, that contains one state vector per state, each holding the Q value of every possible action. This matrix is shown in Eq. 9.7, where each row is a state vector (the Q values for one state) and Q(s_i, x) represents the Q value of performing action x in state s_i. The higher the Q value, the more likely we are to perform that action, because it is the best option in that state. In the end, we obtain a mapping between states and actions, given by the highest Q value of each state vector.
\Gamma = \begin{pmatrix} Q(s_1, a) & Q(s_1, b) & Q(s_1, c) & Q(s_1, d) \\ Q(s_2, a) & Q(s_2, b) & Q(s_2, c) & Q(s_2, d) \\ Q(s_3, a) & Q(s_3, b) & Q(s_3, c) & Q(s_3, d) \end{pmatrix} \qquad (9.7)
Given a state s, the agent has a set of actions it can perform, each taking it to a next state s':
Q(s, a) = Q(s, a) + \alpha\,\left( R(s) + \gamma \max_{a'} Q(s', a') - Q(s, a) \right) \qquad (9.8)
In Eq. 9.8, s is the current state, s' the next state, Q(s, a) the matrix value for that state and action, Q(s', a') the matrix value for the next state and action, and α the learning rate (0 ≤ α ≤ 1). The smaller this factor, the less the agent learns, while if it is close to one the agent only takes into account the most recent information.
Continuing with the parameters, R(s) is the reward and γ is the discount factor, which determines the importance of future rewards (0 ≤ γ ≤ 1). The smaller this parameter, the more importance the agent gives to immediate rewards, while if it is close to one the agent seeks long-term reward. Finally, max_{a'} Q(s', a') is the maximum Q value obtainable in the following state. Together, these three last terms define the expected discounted reward.
Note that values must be assigned to the Q matrix before learning starts. After that, new values are calculated and the agent obtains the best action for every state, which is the maximum Q(s, a) for a given s.
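A minimal Python sketch of the update of Eq. 9.8, using a table Q[s][a]; the α and γ values match those of the experiments (Table 10.2), while the ε-greedy action selection is an assumption, since the text does not specify the exploration scheme.

import random

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.2):
    # Eq. 9.8: move Q(s,a) toward the reward plus the discounted best
    # Q value of the next state.
    best_next = max(Q[s_next].values())
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])

def choose_action(Q, s, epsilon=0.1):
    # Epsilon-greedy selection (assumed): usually exploit, sometimes explore.
    if random.random() < epsilon:
        return random.choice(list(Q[s]))
    return max(Q[s], key=Q[s].get)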
Chapter 10
Experiments
10.1 Introduction
This chapter presents the results of our experiments, which also led to the publication of the paper SinCity: a pedagogical testbed for checking Multi-agent Learning techniques [1]. We show and explain the results of the experiments testing the performance of different Learning Algorithms applied to the pursuit problem in a city.
In Sect. 10.2 we explain the experimental setup, i.e., the conditions and parameters under which the experiments were performed. In Sect. 10.3, Sect. 10.4, Sect. 10.5 and Sect. 10.6 we present the results of the simulations. The section titles refer to the algorithm used by the thief; e.g., in Sect. 10.3 we present the results when the thief uses the Korff algorithm and the police use LA, QL and Korff. We remark that in our experiments SOM is only used by the thief when it escapes from the police. Finally, in Sect. 10.7, we compare and comment on the results.
10.2 Experimental setup
To implement the learning techniques described in the previous chapter we made several decisions. First, police cars and the thief car make decisions about which road to take only at intersections. This reduces the number of states of the system and speeds up the simulations.
The thief follows two different learning systems. The first is used to go from a particular location to the hideout when there is no police car in sight. The other is used when it escapes from the police. Police cars have only one learning system, used to go from their present location to a destination, for instance to pursue the thief during the chase. While the bank has not yet been robbed, the police cars patrol the city randomly, and the thief car uses a guidance algorithm (explained in Sect. 8.2.2) to arrive at the bank.
To compute the state (s) of an agent we proceed as follows. First we define a sub-state (∆) depending on the possible actions the agent can perform; in our case, the possible actions are moving north, south, east or west, and these possibilities depend on the allowed road directions. As an intersection has four possible exits, each blocked or not, there are P = 16 − 1 = 15 possibilities for this sub-state (15 because we do not consider the case where all roads are blocked); Fig. 10.1 shows two possible sub-states. We assign a number to each possible sub-state; for example, if a car is at an intersection where it can only move west, ∆=1. Secondly, we define a second sub-state (Υ), which is the position of the target relative to the agent, i.e., whether the target is more to the north, more to the south, more to the southeast, etc. For the police the target is the thief; for the thief the target can be one police car, more than one police car or the hideout. We assign a number to each of these possibilities; for example, if the target is more to the south, Υ=2. We denote by W the number of possible target locations. The total number of states is T = P · W, and the agent's current state is computed as s = ∆ · Υ. A sketch of this encoding is shown below.
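A Python sketch of this encoding; the bit ordering of the open exits is an illustrative assumption, and the combination rule follows the text's convention s = ∆ · Υ.

def sub_state_delta(north, south, east, west):
    # Encode the open exits (0/1 flags) as a 4-bit number in 1..15; the
    # all-blocked combination (0) never occurs, as stated in the text.
    return (north << 3) | (south << 2) | (east << 1) | west

def agent_state(delta, upsilon):
    # Combine the two sub-states following the text: s = delta * upsilon.
    return delta * upsilon

# Example from the text: only the west exit is open -> delta = 1.
print(sub_state_delta(0, 0, 0, 1))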
Figure 10.1: Two different combinations of possible directions (sub-states). (a) The car can go in four different directions; (b) the car can go in two different directions.
For the LA and QL techniques, when the thief car goes to the hideout or the police cars chase the thief, P=15 and W=8, so we have T=120 input states for LA and QL.
In the case of the thief using LA during the chase, we consider only the closest police car; therefore W=4 and P=15, which makes T=60 input states. For the QL case, we consider the discrete distance in blocks (0, 1 or 2) of every police car closer than two blocks; then we have W=81 possibilities and P=15, so there are T=1215 states. Finally, the SOM neural network is only used by the thief during the chase. First we identify the type of intersection (P=15 possibilities), and for each one we set up a SOM with 4 real-valued inputs (the 4 directions), each holding the exact distance to a police car (if any is closer than 2 blocks, or zero otherwise). We used a lattice of 16 x 16 = 256 neurons, so we have 256 x 15 = 3840 neurons with 4 inputs each. The output of every neuron is one of the four possible roads to take; it is based on a probability distribution over those possible exits and is trained as in the LA case. Table 10.1 summarizes the number of states.
In the following sections we compare the results of using the learning techniques in one run.
          Korff    QL     LA     SOM
Target      -      120    120    3840
Chase       -     1215     60    3840
Table 10.1: Number of states for each algorithm and each learning mode. The chase mode is only used for the thief.
        α      R(s)     γ        Θ(t)
QL     0.1    ±0.25    0.2         -
LA     0.1      -       -          -
SOM    0.1      -       -    1/(1 + 0.01·t)
Table 10.2: Parameters for the different Learning Algorithms.
We call a run a set of challenges, where a challenge is the time from when the cars are placed until either the thief or the police wins. A run stops when the average standard deviation of the thief wins (a thief car wins when it reaches the hideout and enters it without being seen) over the last 500 challenges is lower than 3%, provided that at least 1000 challenges have taken place. A sketch of this stopping rule is shown below.
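The stopping criterion can be sketched in Python as follows; representing the outcomes as a 0/1 list and reading the criterion as the standard deviation of that indicator over the window are assumptions about the exact computation.

import statistics

def run_finished(wins, window=500, min_challenges=1000, tol=0.03):
    # wins: list of 0/1 outcomes (1 = thief wins), one entry per challenge.
    if len(wins) < min_challenges:   # require at least 1000 challenges
        return False
    # Stop when the deviation over the last 500 challenges is below 3%.
    return statistics.pstdev(wins[-window:]) < tol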
The learning parameters used in the simulations (shown in Table 10.2) are: α=0.1 for LA, for QL and for the probability distribution of every neuron in the SOM. Besides, in QL we set R(s)=±0.25 and γ=0.2. Finally, for the SOM case we have α(t) = 1/(1 + 0.01t) and Θ(t) = 1/(1 + 0.01t), where t is the learning iteration. Of course, since there are several SOMs, each one has its own t and Θ(t) parameters.
We note that agents using Korff's algorithm do not learn; this algorithm is used as a baseline against which to compare the learning algorithms.
10.3 Korff
In this section we present the results obtained when the thief uses the Korff algorithm, while the police cars use QL, LA and Korff.
We perform two types of experiments. In the first, the size of the scenario is 5x5 and the number of police cars is 2, 3 or 4; the results are shown in Table 10.3. In the second, we use a more challenging 10x10 patches scenario and change the number of police cars to 2, 4 and 6; the results are shown in Table 10.4. For easy comparison, we also offer a graphical representation of the results in Fig. 10.2a and Fig. 10.2b.
We can see in Fig. 10.2a and Fig. 10.2b that the percentage of victories obtained by the thief is high: since the Korff algorithm is a non-learning one, it needs no time to converge and accumulates victories from the start.
Thief algorithm   Police algorithm   Police cars   % Thief wins
Korff             QL                 2             78.4
Korff             Korff              2             33.8
Korff             LA                 2             55.6
Korff             QL                 3             62.2
Korff             Korff              3             17.6
Korff             LA                 3             28.6
Korff             QL                 4             40.8
Korff             Korff              4             7.4
Korff             LA                 4             14.4
Table 10.3: Results in a 5x5 patches scenario. Algorithm for the thief: Korff.
Thief algorithm   Police algorithm   Police cars   % Thief wins
Korff             QL                 2             89.8
Korff             Korff              2             55.6
Korff             LA                 2             78.4
Korff             QL                 4             71.4
Korff             Korff              4             26.6
Korff             LA                 4             35.6
Korff             QL                 6             52.4
Korff             Korff              6             13.2
Korff             LA                 6             19.8
Table 10.4: Results in a 10x10 patches scenario. Algorithm for the thief: Korff.
Further, we observe that increasing the number of police cars decreases the percentage of thief victories: with more police cars, the thief is seen sooner and is easier to surround. For example, in the 10x10 scenario against Korff police, thief victories drop by about 30% between 2 and 4 police cars. We also observe that increasing the size of the scenario increases the percentage of thief victories (by about 10%, for example, with 2 police cars using QL). This is because the bigger the scenario, the more difficult it is to surround the thief, since it has more escape routes.
10.4 SOM
In this section we present the results obtained when the thief uses the SOM algorithm, while the police cars use QL, LA and Korff.
The experiments are deployed in two different scenarios: 5x5 patches and 10x10 patches. For the 5x5 patches scenario, the number of police cars is 2, 3 or 4; the results are shown in Table 10.5. The second scenario is 10x10 patches, with experiments using 2, 4
Figure 10.2: Results with thief using Korff. (a) 5x5 maps; (b) 10x10 maps. [Plots of percentage of thief wins vs. number of police cars, for police using Q-learning, Korff and LA.]
Thief algorithm   Police algorithm   Police cars   % Thief wins
SOM               QL                 2             58.8
SOM               Korff              2             27.8
SOM               LA                 2             39.2
SOM               QL                 3             37
SOM               Korff              3             14.2
SOM               LA                 3             18.6
SOM               QL                 4             29.2
SOM               Korff              4             5.4
SOM               LA                 4             12.6
Table 10.5: Results in a 5x5 patches scenario. Algorithm for the thief: SOM.
and 6 police cars; the results are shown in Table 10.6. For easy comparison, we also offer a graphical representation of the results in Fig. 10.3a and Fig. 10.3b.
In both experiments we see that increasing the number of police cars decreases the thief's win percentage. This decrease is smaller in the 5x5 scenario. For example, the difference between using 2 or 4 cars in the 10x10 scenario (police cars using LA) is a decrease of about 30%, while in the 5x5 scenario the decrease is about 20%.
10.5 Learning Automata
In this section we present the results obtained when the thief uses the LA algorithm, while the police cars use QL, Korff and LA.
Thief algorithm   Police algorithm   Police cars   % Thief wins
SOM               QL                 2             76.6
SOM               Korff              2             52.8
SOM               LA                 2             59.4
SOM               QL                 4             46.8
SOM               Korff              4             24.2
SOM               LA                 4             34.2
SOM               QL                 6             29.0
SOM               Korff              6             9.4
SOM               LA                 6             18.2
Table 10.6: Results in a 10x10 patches scenario. Algorithm for the thief: SOM.
Figure 10.3: Results with thief using SOM. (a) 5x5 maps; (b) 10x10 maps. [Plots of percentage of thief wins vs. number of police cars, for police using Q-learning, Korff and LA.]
The experiments are deployed in two different scenarios: 5x5 patches and 10x10 patches. For the 5x5 patches scenario, the number of police cars is 2, 3 or 4; the results are shown in Table 10.7. The second scenario is 10x10 patches, with experiments using 2, 4 and 6 police cars; the results are shown in Table 10.8. For easy comparison, we also offer a graphical representation of the results in Fig. 10.4a and Fig. 10.4b.
We can see that in both experiments the thief obtains its best performance when 'playing' against police cars that use the QL algorithm. This is an interesting result, since QL is the most complex and refined algorithm. The reason for this poor behaviour could be that the LA, Korff and SOM algorithms converge faster. This shows that in some cases simple solutions are good enough to solve complex problems.
We also observe that increasing the number of police cars reduces the thief's percentage of victories. This is a logical result, since with more police cars in the chase it is more likely that one of them sees the thief and that they surround it easily.
Thief algorithm   Police algorithm   Police cars   % Thief wins
LA                QL                 2             79.0
LA                Korff              2             31.2
LA                LA                 2             58.0
LA                QL                 3             65.0
LA                Korff              3             12.0
LA                LA                 3             35.8
LA                QL                 4             51.2
LA                Korff              4             5.2
LA                LA                 4             25.0
Table 10.7: Results in a 5x5 patches scenario. Algorithm for the thief: LA.
Thief algorithm   Police algorithm   Police cars   % Thief wins
LA                QL                 2             81.4
LA                Korff              2             49.6
LA                LA                 2             65.2
LA                QL                 4             77.2
LA                Korff              4             27.0
LA                LA                 4             45.4
LA                QL                 6             60.2
LA                Korff              6             12.0
LA                LA                 6             23.2
Table 10.8: Results in a 10x10 patches scenario. Algorithm for the thief: LA.
To show this, in Fig. 10.4b we see a reduction of almost 20% in the thief's victories when the number of police cars is 6 instead of 2 and the police use QL. Further, we see that increasing the size of the scenario benefits the thief: the bigger the scenario, the more difficult it is to surround the thief, since it has more escape routes.
10.6 Q-learning
In this section we present the results obtained when the thief uses the QL algorithm, while the police cars use QL, Korff and LA.
The experiments are deployed in two different scenarios: 5x5 patches and 10x10 patches. For the 5x5 patches scenario, the number of police cars is 2, 3 or 4; the results are shown in Table 10.9. The second scenario is 10x10 patches, with experiments using 2, 4 and 6 police cars; the results are shown in Table 10.10. For easy comparison, we also offer a graphical representation of the results in Fig. 10.5a and Fig. 10.5b.
Figure 10.4: Results with thief using LA. (a) 5x5 maps; (b) 10x10 maps. [Plots of percentage of thief wins vs. number of police cars, for police using Q-learning, Korff and LA.]
Thief algorithm   Police algorithm   Police cars   % Thief wins
QL                QL                 2             52.6
QL                Korff              2             19.2
QL                LA                 2             25.8
QL                QL                 3             28.4
QL                Korff              3             5.4
QL                LA                 3             15.8
QL                QL                 4             22.0
QL                Korff              4             2.8
QL                LA                 4             9.8
Table 10.9: Results in a 5x5 patches scenario. Algorithm for the thief: QL.
These experiments confirm the results of the previous section: the thief obtains better results when 'playing' against police cars using QL. For example, in Fig. 10.5a we see that with 2 police cars the thief obtains about 25% fewer victories when the police use LA instead of QL. We also observe that the bigger scenario benefits the thief (with 2 police cars using QL, the increase from the 5x5 to the 10x10 scenario is almost 20%).
10.7 Comparison of the different Learning Algorithms
In this section we compare the results obtained by LA, QL and SOM in a 10x10 scenario. We fix Korff's algorithm for the police cars and change the algorithm used by the thief car. Since Korff's algorithm does not learn, it serves as a baseline for comparing the thief's success under the different learning techniques.
Thief algorithm   Police algorithm   Police cars   % Thief wins
QL                QL                 2             69.2
QL                Korff              2             35.0
QL                LA                 2             48.2
QL                QL                 4             44.0
QL                Korff              4             16.2
QL                LA                 4             20.4
QL                QL                 6             23.8
QL                Korff              6             5.4
QL                LA                 6             10.4
Table 10.10: Results in a 10x10 patches scenario. Algorithm for the thief: QL.
Figure 10.5: Results with thief using QL. (a) 5x5 maps; (b) 10x10 maps. [Plots of percentage of thief wins vs. number of police cars, for police using Q-learning, Korff and LA.]
As we can see in Fig. 10.6, the best results are obtained on average by the LA algorithm, followed by SOM, with QL last. This is all the more interesting considering that the LA algorithm uses fewer states and only considers whether there are police cars in a direction, without determining their precise distance. It is a surprising result, demonstrating that an excess of information can sometimes be a disadvantage, and that simple solutions are good enough for complex problems. See [49] for more examples.
Besides, if we compare the results obtained in the previous sections, we see that if both the police and the thief cars use the same Learning Algorithm, the percentage of thief wins is similar. For example, we can see in Table 10.10 that with 6 police cars, and both police and thief using QL, the thief's win percentage is 23.8. Comparing this to the case where both police and thief use LA (Table 10.8, also with 6 police cars), the win percentage is 23.2, which is very close.
Figure 10.6: Results with police using Korff in a 10x10 traffic map. [Plot of percentage of thief wins vs. number of police cars, for thief using Q-learning, LA and SOM.]
Part IV: Conclusions
Chapter 11
Conclusions and future work
11.1 Conclusions
In this thesis we have performed two types of experiments: some in the Location Privacy field and others in the Multi-agent Learning Systems field. For both we have used the Netlogo tool, showing its flexibility and its capacity to perform real-world simulations. With the data obtained from Netlogo we can carry out different investigations and obtain results and conclusions without the need for expensive resources. Moreover, with this tool the control of the simulations is easier, and we can choose which parts to examine in more depth.
In the Location Privacy part, we have described some important existing algorithms and measures to protect our privacy, which in our context means remaining anonymous. We show that privacy can be compromised if no anonymization is applied to the samples. What is more, we have shown that even when current state-of-the-art anonymization algorithms are used, the privacy of users can be compromised.
To perform our experiments, we first define and build a model. Secondly, we build a Netlogo city in which cars drive while reporting samples. We implement attack (tracking) and defense (anonymization) algorithms, and with these algorithms and the samples obtained in the Netlogo simulation we perform our experiments. The results were striking: we confirmed that even using the Single Target Tracking Algorithm, which is not the best attack we could have used (but the most computationally affordable one), we obtain good tracking results. For the defense algorithms, it was shown that the more samples we remove, the more anonymous we remain, but we must be aware that the more samples we remove, the worse the location services become. Besides, we also show that removing samples while taking the available information into account (with the Uncertainty-aware algorithm) degrades the attacker's tracking results, and thus improves privacy.
In the Multi-agent Learning Systems part, we have shown the performance of different Learning
Algorithms. In order to do that, we have built a flexible and efficient model in Netlogo. This
model can be considered a more complex version of the predator-prey pursuit problem: in our
case, we model a police/thief pursuit in an urban grid environment where other elements
(cars, traffic lights, etc.) may interact with the agents during the simulation.
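As an illustration of one of the implemented techniques, the sketch below shows the core of a tabular Q-learning agent [53], written in Python for clarity rather than Netlogo. The state and action names are hypothetical, and alpha, gamma and epsilon are the usual learning rate, discount factor and exploration rate; this is a sketch of the general technique, not our Netlogo implementation.

    import random

    def q_update(Q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
        # One Q-learning step:
        # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        best_next = max(Q[next_state].values())
        Q[state][action] += alpha * (reward + gamma * best_next - Q[state][action])

    def choose_action(Q, state, epsilon=0.1):
        # Epsilon-greedy selection: mostly exploit, occasionally explore.
        if random.random() < epsilon:
            return random.choice(list(Q[state]))
        return max(Q[state], key=Q[state].get)

    # Toy table for a thief agent: two states, two actions.
    Q = {"police-near": {"flee": 0.0, "hide": 0.0},
         "police-far":  {"flee": 0.0, "hide": 0.0}}
    a = choose_action(Q, "police-near")
    q_update(Q, "police-near", a, reward=1.0, next_state="police-far")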
We present the results of the simulations to compare the performance of the different Learning
Algorithms. The agents carry out their tasks in the Netlogo scenario, and we compare the results
obtained in different situations and with different parameters. The results show that the most
sophisticated algorithm does not necessarily obtain the best results. This may be because
simplicity pays off in a simple scenario, which is why LA obtains better results than QL.
11.2 Future work
In the Location Privacy experiments, the results were obtained using the Single Target Tracking
Algorithm with Trajectory Based Linking, which is not the strongest possible approach. Even
within Single Target Tracking, the results could be improved by linking samples with Map-based
Linking (which uses map information) or Empirical Linking (which uses past information). A
major improvement of the attack would be the use of Multi-target Tracking Algorithms, which
take all the cars into account simultaneously when doing the linking. Besides, a clever attack
that takes into account how the Uncertainty-Aware algorithm works could improve the tracking
results when this algorithm is used to anonymize.
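To show how multi-target linking differs from the single-target case, the sketch below (again in Python, for illustration only) poses the linking of all cars at one time step as a global assignment problem and solves it with the Hungarian algorithm from SciPy. The Euclidean cost function and all variable names are our own illustrative assumptions, not a description of an existing implementation.

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    # Last known positions of the tracked cars and the new anonymous samples.
    tracks = np.array([[0.0, 0.0], [4.0, 4.0]])
    samples = np.array([[3.8, 4.1], [0.5, 0.2]])

    # Cost matrix: Euclidean distance between every track and every sample.
    cost = np.linalg.norm(tracks[:, None, :] - samples[None, :, :], axis=2)

    # Jointly assign each sample to a track, minimising the total distance,
    # instead of linking each track greedily and independently.
    rows, cols = linear_sum_assignment(cost)
    for t, s in zip(rows, cols):
        print("track", t, "-> sample", s)

Because the assignment is global, a sample that is the nearest neighbour of two tracks can only be claimed by one of them, which avoids the inconsistent links a greedy single-target tracker can produce.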
In our Location Privacy model there is a Trusted Third Party (TTP), which means that we have
to trust a third entity. This TTP may be malicious, and in that case no anonymization algorithm
would help. Another research line is to remove this entity and try a TTP-free scheme, where the
users collaborate to protect their privacy.
As an extension of the Multi-agent model, the traffic model of the city can be enriched, and
more complex interactions between normal traffic and the police/thief cars can be introduced.
Besides, in this thesis we have implemented three Learning Algorithms (LA, QL and SOM), but
other learning techniques could be added (Dynamic Programming [40], Temporal Difference
learning [54], etc.).
Bibliography
[1] A. Peleteiro-Ramallo, J. Burguillo-Rial, P. Rodríguez-Hernández, and E. Costa-Montenegro,
“Sincity: a pedagogical testbed for checking multi-agent learning techniques,” in ECMS
2009: 23rd European Conference on Modelling and Simulation, 2009.
[2] Netlogo. http://ccl.northwestern.edu/netlogo/
[3] Models library. http://ccl.northwestern.edu/netlogo/models
[4] G. Danezis and B. Wittneben, “The economics of mass surveillance - and the questionable
value of anonymous communications,” in Proceedings of the 5th Workshop on the Economics of
Information Security (WEIS 2006), 2006.
[5] J. Tsai, S. Egelman, L. Cranor, and A. Acquisti, “The effect of online privacy information
on purchasing behavior: An experimental study.” Carnegie Mellon University, Pittsburgh,
PA (USA), June 2007. http://www.weis2007.econinfosec.org/papers/57.pdf
[6] J. Krumm, “A survey of computational location privacy,” Personal and Ubiquitous
Computing. http://dx.doi.org/10.1007/s00779-008-0212-5
[7] M. Gruteser and D. Grunwald, “Anonymous usage of location-based services through spatial
and temporal cloaking.” http://www.usenix.org/events/mobisys03/tech/gruteser.html
[8] A. R. Beresford and F. Stajano, “Location privacy in pervasive computing,” IEEE Pervasive
Computing, vol. 2, nr. 1, pp. 46–55, 2003. http://dx.doi.org/10.1109/MPRV.2003.1186725
[9] B. Hoh, M. Gruteser, H. Xiong, and A. Alrabady, “Enhancing security and privacy in
traffic-monitoring systems,” IEEE Pervasive Computing, vol. 5, nr. 4, pp. 38–46, 2006.
[10] M. L. Yiu, C. S. Jensen, X. Huang, and H. Lu, “Spacetwist: Managing the trade-offs
among location privacy, query performance, and query accuracy in mobile services,” in
ICDE, 2008, pp. 366–375.
[11] J. Krumm, “Inference attacks on location tracks,” 2007, pp. 127–143.
http://dx.doi.org/10.1007/978-3-540-72037-9_8
[12] B. Gedik and L. Liu, “Protecting location privacy with personalized k-anonymity:
Architecture and algorithms,” IEEE Transactions on Mobile Computing, vol. 7, nr. 1, pp.
1–18, January 2008. http://dx.doi.org/10.1109/TMC.2007.1062
[13] D. S. Brands, “A technical overview of digital credentials,” 2002.
[14] M. Gruteser, J. Bredin, and D. Grunwald, “Path privacy in location-aware computing,”
2008. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.110.3891
[15] C. Bettini, X. S. Wang, and S. Jajodia, “Protecting privacy against location-based personal
identification,” in Secure Data Management, 2005, pp. 185–199.
[16] B. Gedik and L. Liu, “Protecting location privacy with personalized k-anonymity:
Architecture and algorithms,” IEEE Transactions on Mobile Computing, vol. 7, nr. 1, pp.
1–18, January 2008. http://dx.doi.org/10.1109/TMC.2007.1062
[17] B. Hoh, M. Gruteser, H. Xiong, and A. Alrabady, “Preserving privacy in gps traces via
uncertainty-aware path cloaking,” in CCS ’07: Proceedings of the 14th ACM conference on
Computer and communications security. New York, NY, USA: ACM, 2007, pp. 161–171.
http://dx.doi.org/10.1145/1315245.1315266
[18] A. Machanavajjhala, J. Gehrke, D. Kifer, and M. Venkitasubramaniam, “l-diversity:
Privacy beyond k-anonymity,” in 22nd IEEE International Conference on Data Engineering,
2006. http://www.cs.umass.edu/~mhay/links.html
[19] B. Bamba, L. Liu, P. Pesti, and T. Wang, “Supporting anonymous location queries in
mobile environments with privacygrid,” in WWW ’08: Proceeding of the 17th international
conference on World Wide Web. New York, NY, USA: ACM, 2008, pp. 237–246.
http://dx.doi.org/10.1145/1367497.1367531
[20] B. Hoh and M. Gruteser, “Protecting location privacy through path confusion,” in SECURECOMM ’05: Proceedings of the First International Conference on Security and
Privacy for Emerging Areas in Communications Networks. Washington, DC, USA: IEEE
Computer Society, 2005, pp. 194–205.
[21] C. Diaz, S. Seys, J. Claessens, and B. Preneel, “Towards measuring anonymity,” 2002.
[22] A. Serjantov and G. Danezis, “Towards an information theoretic metric for anonymity,”
2002. citeseer.ist.psu.edu/serjantov02towards.html
[23] Google maps. http://maps.google.es/
[24] J. Krumm and E. Horvitz, “Predestination: Inferring destinations from partial trajectories,”
2006, pp. 243–260. http://dx.doi.org/10.1007/11853565_15
[25] B. Hoh, M. Gruteser, R. Herring, J. Ban, D. Work, J. C. Herrera, A. M. Bayen,
M. Annavaram, and Q. Jacobson, “Virtual trip lines for distributed privacy-preserving
traffic monitoring.” in MobiSys, D. Grunwald, R. Han, E. de Lara, and C. S. Ellis, Eds.
ACM, 2008, pp. 15–28. http://dblp.uni-trier.de/db/conf/mobisys/mobisys2008.html#HohGHBWHBAJ08
[26] J. Warrior, E. McHenry, and K. McGee, “They know where you are,” IEEE Spectrum, pp.
20–25, August 2003.
[27] W. Tranter, K. Shanmugan, T. Rappaport, and K. Kosbar, Principles of communication
systems simulation with wireless applications. Upper Saddle River, NJ, USA: Prentice
Hall Press, 2003.
[28] D. Reid, “An algorithm for tracking multiple targets,” IEEE Transactions on Automatic
Control, vol. 24, nr. 6, pp. 843–854, Dec. 1979.
[29] G. Welch and G. Bishop, “An introduction to the Kalman filter,” University of North
Carolina at Chapel Hill, Tech. Rep. TR 95-041, 2004.
[30] C. E. Shannon, “A mathematical theory of communication,” Bell System Technical Journal,
vol. 27, pp. 379–423, 623–656, 1948.
http://cm.bell-labs.com/cm/ms/what/shannonday/paper.html
[31] M. S. Grewal and A. P. Andrews, Kalman Filtering: Theory and Practice Using MATLAB.
[32] Y. Wang and A. Kobsa, Privacy-Enhancing Technologies.
http://www.ics.uci.edu/~kobsa/papers/2008-Handbook-LiabSec-kobsa.pdf
[33] M. Gruteser and B. Hoh, “On the anonymity of periodic location samples,” in SPC, 2005,
pp. 179–192.
[34] S. J. Russell and P. Norvig, Artificial Intelligence: A Modern Approach, 2nd ed. Englewood
Cliffs, NJ: Prentice Hall.
[35] A. M. Turing, “Computing machinery and intelligence,” Mind, vol. LIX, pp. 433–460, 1950.
[36] Turing test. http://turing-machine.weblog.com.pt/arquivo/turingtest.gif
[37] T. Smithers, “On quantitative performance measures of robot behaviour,” Robotics and
Autonomous Systems, vol. 15, nr. 1-2, pp. 107–133, 1995.
[38] M. A. Arbib, The Handbook of Brain Theory and Neural Networks, 2nd ed. The MIT
Press, November 2002. http://www.amazon.ca/exec/obidos/redirect?tag=citeulike09-20&path=ASIN/0262011972
[39] Machine learning. http://robotics.stanford.edu/~nilsson/MLDraftBook/MLBOOK.pdf
[40] D. B. Wagner, “Dynamic programming.”
[41] S. B. Kotsiantis, “Supervised machine learning: A review of classification techniques,”
2007. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.95.9683
[42] M. Harmon, “Reinforcement learning: a tutorial,” 1996. http://citeseerx.ist.psu.edu/
viewdoc/summary?doi=10.1.1.33.2480
[43] L. P. Kaelbling, M. L. Littman, and A. P. Moore, “Reinforcement learning:
A survey,” Journal of Artificial Intelligence Research, vol. 4, pp. 237–285, 1996.
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.44.2707
[44] R. E. Bellman, Dynamic Programming. Princeton University Press, 1957.
http://www.amazon.com/exec/obidos/redirect?tag=citeulike07-20&path=ASIN/B0006AUXX8
[45] M. Benda, V. Jagannathan, and R. Dodhiawala, “On optimal cooperation of knowledge
sources - an empirical investigation,” Boeing Advanced Technology Center, Boeing Computing
Services, Seattle, Washington, Tech. Rep. BCS–G2010–28, July 1986.
[46] M. Wiering, J. Vreeken, J. Van Veenen, and A. Koopman, “Simulation and optimization
of traffic in a city,” in IEEE Intelligent Vehicles Symposium (IV’04). IEEE, 2004.
http://www.cs.uu.nl/groups/IS/archive/marco/simulating%5Foptimizing%5Ftraffic.ps.gz
[47] U. Wilensky, “NetLogo Traffic Model,” 2005. http://ccl.northwestern.edu/netlogo/
[48] C. Gershenson, “Self-Organizing Traffic Lights,” ArXiv Nonlinear Sciences e-prints,
November 2004.
[49] J. Reverte, F. Gallego, R. Satorre, and F. Llorens, “Mixing greedy and evolutive approaches
to improve pursuit strategies,” in IBERAMIA, 2008, pp. 203–212.
[50] R. E. Korf, “A simple solution to pursuit games,” in Proceedings of the 11th International
Workshop on Distributed Artificial Intelligence, 1992.
[51] T. Kohonen, Self-Organizing Maps. Springer, December 2000.
http://www.amazon.ca/exec/obidos/redirect?tag=citeulike09-20&path=ASIN/3540679219
[52] K. Narendra and M. A. L. Thathachar, Learning Automata: An Introduction. Prentice
Hall, 1989.
[53] C. J. C. H. Watkins and P. Dayan, “Q-learning,” Machine Learning, vol. 8, nr. 3-4, pp.
279–292, 1992. http://jmvidal.cse.sc.edu/library/watkins92a.pdf
[54] R. S. Sutton, “Learning to predict by the methods of temporal differences,” Machine
Learning, vol. 3, pp. 9–44, 1988. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.42.3191