UNIVERSIDADE DE VIGO
ESCUELA TÉCNICA SUPERIOR DE INGENIEROS DE TELECOMUNICACIÓN
Final Year Project (Proyecto Fin de Carrera)
CONTRIBUCIONES AL MODELADO Y SIMULACIÓN DEL TRÁFICO URBANO EN NETLOGO
Author: Ana Peleteiro Ramallo
Supervisor: Juan Carlos Burguillo Rial
Academic year 2008-2009

KATHOLIEKE UNIVERSITEIT LEUVEN
Faculty of Engineering, Department of Electrical Engineering (ESAT)
Contributions to the modelling and simulation of urban traffic in Netlogo
Master thesis submitted in partial fulfillment of the requirements for the degree of Telecommunications Engineering
Ana Peleteiro Ramallo
Promotor: Prof. Dr. Ir. Bart Preneel
Daily supervisors: Dr. Ir. Claudia Diaz, Ir. Carmela Troncoso
2008-2009
Framework: Erasmus program

Abstract

Performing real-world experiments is often expensive in terms of resources, as well as the time needed to collect the necessary information. For example, experiments that involve car traffic require drivers who volunteer to carry tracking devices for a long period of time. This is why Netlogo, a multi-agent modeling language that provides a programmable environment for simulating natural and social phenomena, can be useful. There are many fields in which we can use Netlogo as a tool for simulating real situations. In this thesis, we use it to simulate car traffic scenarios and perform experiments in two different fields: Location Privacy and Multi-agent Systems.

Location privacy is gaining importance with the increasing popularity of Location Based Services (LBS). If a person can get our location data, this person may be able to obtain useful and private information. This is why location privacy defense algorithms must be developed: to prevent the use of location services from revealing detailed information about our movements that enables tracking and profiling, thus compromising our privacy. In GPS anonymization, the goal is to anonymize location samples, so that they can be used by external entities while preventing user re-identification and protecting the privacy of the users providing the samples.

In Multi-agent Systems, groups of agents interact to try to solve a problem. Moreover, in Multi-agent Learning Systems, agents not only cooperate but also learn and adapt in an environment. The problem is to be able to observe and compare the different behaviors obtained with different Learning Algorithms without using lots of expensive resources.

In this thesis, we develop two main tasks. First, we develop a Netlogo city scenario with cars that send location samples. We implement and perform defense (anonymization) and attack (tracking) algorithms over these samples and use the results to compare the different defenses and attacks. In the second task, we build a Netlogo city and we implement different Learning Algorithms that agents perform in that scenario. The performance of these algorithms in reaching a goal is compared.

Acknowledgements

I would like to thank all the COSIC group for the opportunity of working with them and for making me feel like another researcher in the group. To Carmela and Claudia, because you have not only been my supervisors, but also my advisors. Thank you for your support, your dedication... thank you for everything. To Juan Carlos, my supervisor at the University of Vigo, and to Pedro and Kike, for introducing me to the research world. To my non-teleco friends, for laughing at my geeky jokes even without understanding them.
To my teleco friends, especially the 4 best friends who have made these 5 years unforgettable. To Sara, for her inexhaustible energy, and for having broken an Olympic swimming record, at a conservative count. To Antía, for all the laughs in class, and for your legendary 'no way' remarks that always turn into 'for sure'. To Majo, for living life while the rest of us slept and for helping me govern the dark side as vice-president. To Humberto, for all the walks to the CUVI and the shared lab hours, and for never denying me an argument when I needed one. Thanks to the 4 of you for your support, your friendship... and for everything we have shared. In short, just for having met you, teleco has already been worth it.

To Marta, for being the best sister and friend I could ever have had, for always giving me all your support, for being my advisor and for listening to me whenever I need it. To Jota, my favourite bro, for brightening my days, for putting up with my lectures, for being there for whatever is needed, and for being my partner in sports competitions (although always runner-up). Thanks to you both, because it is impossible to express in a few lines everything you mean to me.

Last and most important, this project is dedicated to my parents, because everything I am is thanks to you, because you are the people I love and admire most, for always giving me the best advice, and for putting up with me in the moments when not even I could put up with myself. All my successes are also yours.

Contents

Abstract
Acknowledgements
Contents
List of Figures
List of Tables

Part I: Presentation
1 Introduction
1.1 Motivation
1.2 Goals
1.3 Structure of the thesis
2 Netlogo
2.1 Introduction
2.2 Agents
2.3 Model and settings
2.4 Example

Part II: Location Privacy
3 Preliminaries
3.1 Introduction
3.2 Classification of privacy methods according to the communication with the LBS
3.3 Attacks and countermeasures
4 System model
4.1 Introduction
4.2 Scenario
4.3 Participants
4.4 Adversarial model
5 Algorithms
5.1 Introduction
5.2 Mathematical Background
5.3 Privacy enhancing algorithms
5.4 Tracking algorithms
6 Experiments
6.1 Introduction
6.2 Experimental setup
6.3 Experiments releasing all samples
6.4 Experiments subsampling
6.5 Experiments Uncertainty-aware algorithm
6.6 Comparison of different defense algorithms

Part III: Learning algorithms
7 Preliminaries
7.1 Introduction
7.2 Multi-agent Learning
7.3 Artificial Neural Networks
7.4 Machine Learning
7.5 Reinforcement Learning
7.6 Predator-prey problem
8 System Model
8.1 Introduction
8.2 Scenario
8.3 Participants
9 Algorithms
9.1 Introduction
9.2 Korff Algorithm
9.3 Self Organizing Maps Algorithm
9.4 Learning Automata
9.5 Q-learning
10 Experiments
10.1 Introduction
10.2 Experimental setup
10.3 Korff
10.4 SOM
10.5 Learning Automata
10.6 Q-learning
10.7 Comparison of the different Learning Algorithms

Part IV: Conclusions
11 Conclusions and future work
11.1 Conclusions
11.2 Future work
Bibliography

List of Figures
2.1 Netlogo environment. Controls and action scenario.
2.2 Netlogo. Turtle shapes.
2.3 Netlogo example scenario: wolves and sheep.
2.4 Netlogo. Controls to change the size of the scenario.
2.5 Netlogo. Button to control the beginning of the simulation.
2.6 Netlogo. Switch (left) and slider (right).
2.7 Netlogo. Chooser.
2.8 Netlogo. Plot.
2.9 Netlogo. Monitor to display variable values.
2.10 Tumor model and its components.
3.1 GPS service.
3.2 Communication between the user and the LBS without third party.
3.3 Classification of privacy methods.
3.4 Communication between the user and the LBS with third party (TTP-based).
3.5 TTP-free communication.
3.6 Deleting data to hide home's location.
4.1 System model.
4.2 Example of a reticular city: New York.
4.3 Netlogo city view.
6.1 Netlogo implementation of the city.
6.2 Maximum permitted speed in a city.
6.3 Empirical distribution of vehicle trip times.
6.4 Distance errors in tracking.
6.5 Releasing all samples. N=10.
6.6 Releasing all samples. N=100.
6.7 Subsampling algorithm. N=100, φ=0.8.
6.8 Subsampling algorithm. N=100, φ=0.8.
6.9 Subsampling algorithm. N=100, φ=0.5.
6.10 Uncertainty-aware algorithm. N=10, k=2, β=0.4.
6.11 Uncertainty-aware algorithm. N=100, k=2, β=0.4.
6.12 Uncertainty-aware algorithm. N=100, k=10, β=0.2.
6.13 Comparison between releasing all samples and subsampling. N=100.
6.14 Comparison between releasing all samples and Uncertainty-aware algorithm. N=100.
6.15 Comparison between subsampling and Uncertainty-aware algorithm. N=100.
7.1 Turing test.
7.2 Agent action in some environment.
7.3 Non-learning agent.
7.4 Learning agent.
7.5 Multi-agent Learning. Intersection of two fields.
7.6 Simple scheme of the nodes of a Neural Network.
7.7 Predator-prey model.
7.8 Capture of a prey by surrounding.
8.1 Traffic basic model.
8.2 Traffic grid model.
8.3 SOTL model.
8.4 SinCity model.
10.1 Two different combinations of possible directions (sub-states).
10.2 Results with thief using Korff.
10.3 Results with thief using SOM.
10.4 Results with thief using LA.
10.5 Results with thief using QL.
10.6 Results with police using Korff in a 10x10 traffic map.

List of Tables
10.1 Number of states for each algorithm and each learning mode.
10.2 Parameters for the different Learning Algorithms.
10.3 Results in a 5x5 patches scenario. Algorithm for the thief: Korff.
10.4 Results in a 10x10 patches scenario. Algorithm for the thief: Korff.
10.5 Results in a 5x5 patches scenario. Algorithm for the thief: SOM.
10.6 Results in a 10x10 patches scenario. Algorithm for the thief: SOM.
10.7 Results in a 5x5 patches scenario. Algorithm for the thief: LA.
10.8 Results in a 10x10 patches scenario. Algorithm for the thief: LA.
10.9 Results in a 5x5 patches scenario. Algorithm for the thief: QL.
10.10 Results in a 10x10 patches scenario. Algorithm for the thief: QL.

Part I: Presentation

Chapter 1: Introduction

1.1 Motivation

Nowadays, there is a growing interest in simulations that closely reflect the real world, since we are interested in observing and modelling the behaviour of different systems. However, performing real-world experiments is often expensive in terms of resources, the time needed to collect the necessary information, etc. For example, if we wanted to simulate the wolf-sheep population cycle (the evolution of these population sizes), we would need to gather all the animals, observe the evolution of the populations over a long time, etc. These kinds of real-world experiments are normally difficult to handle without simulation tools that allow us to accelerate and simplify the process.

Urban traffic simulation is a field that attracts the attention of the research community. For instance, the development of traffic-related applications such as GPS systems, or the study of traffic congestion in cities, are important research areas nowadays. The problem in obtaining results from urban traffic experiments is that resources tend to be expensive and limited: we need cars, the recruitment of drivers who volunteer to carry tracking devices, etc. This is why Netlogo, a multi-agent modeling language which allows us to simulate natural and social phenomena, can be useful.

Netlogo is useful in many fields as a tool for simulating real-world experiments. It allows us to define agents that perform their actions independently in a scenario, while we can control, measure and observe their parameters and behaviour. Concretely, in urban traffic simulations, it allows us to define all the resources we need (e.g., the number of cars), control the speed of the simulation, and measure and redefine the model whenever necessary.
This reduces the time needed for simulations and improves the quality of observation of the Netlogo world.

1.2 Goals

There are many fields in which we can use Netlogo as a tool for performing real-world experiments; concretely, in this thesis we use it to perform experiments in Location Privacy and Multi-agent Systems, showing the utility of this tool in different research areas.

Location Privacy is becoming more important every day because people carry more and more personal devices that reveal their location (GSM, PDA, etc.). This means that we are revealing our position to servers that we have to trust, but these servers can be malicious, or even the communication channel may be compromised. Although it may seem inoffensive, the problem of revealing data is not trivial. In fact, if a person can get our data, this person may be able to obtain useful and private information, such as where we live or which places we visit. For example, a company might refuse to hire an applicant because, observing this person's data, it has 'discovered' that the person may be ill, since he visits the hospital every day. This is why privacy defense algorithms must be developed.

A special case of Location Privacy is the anonymization of GPS samples. When a person uses a position transmitter and sends location samples to a Location Based Service (LBS), if these samples are not anonymized, privacy may be compromised, since the person is revealing his geographical position, as well as time and identification information. This is why it is important to develop and test anonymization algorithms. However, to do this we need location samples, and these samples are difficult to obtain because we need real users to give us their data. In this thesis we do not use real data; instead, we build a Netlogo city scenario with cars that send location samples. We implement defense (anonymization) and attack (tracking) algorithms over these samples, and we compare the different defenses and attacks.

In the second part of the thesis, we use Netlogo to compare the behaviour of different Learning Algorithms. The work in this part resulted in the publication of the paper 'SinCity: a pedagogical testbed for checking Multi-agent Learning techniques' [1].

In Multi-agent Systems, agents collaborate to solve a problem. This collaboration may be important, for example, when we want to coordinate robots to rescue a person in a natural disaster scenario. We have to be sure that these agents will behave correctly, and test the robots in all situations before a real trial, since a mistake could be fatal in a critical situation.

In Multi-agent Learning Systems, besides cooperating, the agents also learn and adapt to their environment. With the same robot example from the previous paragraph, an agent may learn that getting closer to the person to be rescued is a good decision, while heading toward a cliff is a bad one. The challenge is to be able to observe and compare the different behaviors resulting from different Learning Algorithms without using lots of expensive resources. To achieve this, we build a Netlogo city where agents (thieves) escape from other agents (police). We implement different Learning Algorithms that agents perform in that scenario, and we compare the performance of these algorithms in reaching a goal. In our thesis, the goal of the thief is to escape after the robbery, and the goal of the police is to catch the thief.

1.3 Structure of the thesis

The thesis is structured in four parts.
Part I: Presentation is an introductory part:
Chapter 2: Netlogo. This Chapter presents and explains the Netlogo tool, with which we simulate our scenarios.

Part II: Location Privacy describes the Location Privacy part of the thesis. It is structured as follows:
Chapter 3: Preliminaries. This Chapter introduces the concepts of Location Privacy, a classification of the privacy methods, and the principal attacks and countermeasures.
Chapter 4: System model. This Chapter describes our model, i.e., the scenario where our experiments take place, the participants, and the adversarial model.
Chapter 5: Algorithms. This Chapter explains the different privacy enhancing and attack algorithms used in the experiments.
Chapter 6: Experiments. This Chapter explains the experimental setup and shows the results of performing the algorithms from Chapter 5.

Part III: Learning Algorithms describes the Multi-agent Learning Systems part of the thesis. It is structured as follows:
Chapter 7: Preliminaries. This Chapter introduces the concepts needed for a theoretical background in Multi-agent Learning Systems.
Chapter 8: System model. This Chapter describes our model, i.e., the scenario where our experiments take place, and the participants.
Chapter 9: Algorithms. This Chapter explains the different Learning Algorithms used in the experiments.
Chapter 10: Experiments. This Chapter explains the experimental setup and shows the results of performing the algorithms from Chapter 9.

Finally, Part IV: Conclusions:
Chapter 11: Conclusions and Future Work. This Chapter presents the conclusions and possible ways to improve this work.

Chapter 2: Netlogo

2.1 Introduction

Netlogo [2] is a multi-agent modeling language that provides a programmable environment for simulating natural and social phenomena, for example the Wolf Sheep Predation model, which studies the stability of predator-prey ecosystems, or the spreading of a virus on a network. In particular, it provides a good environment for modeling complex systems that develop over time. When there is a multiplicity of simple interactions between agents (entities that develop their actions in a concrete environment), patterns and complex systems normally arise. This is known as emergent phenomena. Netlogo allows the exploration of these phenomena in a created environment, since we can give instructions to a large number of agents, all operating independently. When given instructions, the agents perform actions, and this way we can model the behaviour of an entity (or even of a group). Agents are the principal 'actors' in the Netlogo scenario.

Sometimes it is difficult to experiment with a system in a real-world situation. However, using Netlogo to model situations permits the user to investigate a system in a rapid and flexible way. Netlogo provides a good environment to simulate scenarios from different fields where the connection between the individual behaviour of the agents and the patterns emerging from the interaction of the individuals can be studied.

One big advantage of Netlogo is that the user can choose between building their own model or using one of the predefined models provided by the tool. The Models Library [3] provides a collection of pre-built models to work with. Further, we can also modify these models to adapt them to our needs.
These models make it possible to run simulations in different areas of the natural and social sciences, like economics, psychology, mathematics, computer science, machine learning, location privacy, etc. If no pre-built model is suitable for the purposes of the user, he can create a new one. Netlogo provides many features for a designer to model, control and measure the behaviour of the agents in the scenario. Netlogo is an easy language and it provides predefined language primitives, so that amateur programmers can easily build their models and run simulations. Further, as we show in this thesis, it can also be used as a powerful tool for researchers.

Figure 2.1: Netlogo environment. Controls and action scenario. This figure shows a model to study the evolution of the sheep-grass cycle.

Another advantage of Netlogo is that it is a cross-platform system, that is, we can run our simulations on Mac, Linux, Windows, or any other operating system. Not only that, but runs are exactly reproducible across platforms, i.e., we can build a model on one platform and run it on another, and the results are the same as if we had run it on the original platform. This provides a flexibility that allows all kinds of programmers to use Netlogo and develop their models.

In the next sections we will explain the agents and their relations, give an overview of the model and the different settings, and finally show an example.

2.2 Agents

In Netlogo, we can model agents that perform actions and give them commands. These agents are part of the Netlogo world, and each performs its activities in the world independently of, and simultaneously with, the others. There are four kinds of agents: patches, turtles, links and the observer.

Patches
Patches are individual squares on a grid. They are stationary, but this does not mean they do not perform actions and have a behaviour. Patches compose what is known as the world. This world is two-dimensional, although Netlogo also has the option of a 3D view.

Turtles
Turtles are mobile agents that move around on the grid provided by the patches. We can control their number, behaviour, shape, etc. In Fig. 2.2 we show some of the different shapes that turtles can adopt.

Figure 2.2: Netlogo. Turtle shapes.

For a better and more graphical understanding of these two kinds of agents, we show in Fig. 2.3 one of the best-known predefined models in Netlogo: the Wolf Sheep Predation. In this figure we can see the patches, which are the green and brown squares, and the agents, which in this case are the wolves (in black) and the sheep (in white). This figure also shows the different shapes that we can use for the agents, and their movements and actions. Furthermore, as we can control the speed of the simulation (this will be explained in Sect. 2.3), we can view in slow motion what is happening in our model, and also go forward in time until the action we want to study.

Figure 2.3: Netlogo example scenario: wolves and sheep.

Links
A link connects two turtles, which makes possible the creation of aggregates (a collection of beings, for example a population of wolves, the grass in a field, etc.), networks and graphs.

Observer
The observer is the controller of the world. It oversees the scenario and gives orders to the rest of the agents.

We can also define 'breeds' of turtles and links. A breed is a group of agents with a homogeneous appearance and behaviour: all the agents of a breed share the same behaviour, but within the breed every agent acts independently (while following that common behaviour).
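To make these concepts concrete, the following minimal sketch (our own illustration, not code from the thesis) defines two breeds in the style of the Wolf Sheep Predation model and gives every agent of both breeds a simple, independently executed behaviour:

  breed [ wolves wolf ]
  breed [ sheep a-sheep ]

  to setup
    clear-all
    ask patches [ set pcolor green ]   ;; grass on every patch
    create-wolves 10 [ set color black setxy random-xcor random-ycor ]
    create-sheep 50 [ set color white setxy random-xcor random-ycor ]
    reset-ticks
  end

  to go
    ;; each agent wanders on its own, independently of the others
    ask wolves [ rt random 50 lt random 50 fd 1 ]
    ask sheep [ rt random 50 lt random 50 fd 1 ]
    tick
  end

A 'forever' button in the interface would call go repeatedly, advancing every agent by one step per call.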
2.3 Model and settings

The scenario in Netlogo can be viewed in 2D or 3D, although normally it is 2D. The size of the scenario can be changed, which gives the model flexibility. We modify the size of the world by changing the number of patches that compose it. We show in Fig. 2.4 a 50x50 patches scenario.

Figure 2.4: Netlogo. Controls to change the size of the scenario.

The Netlogo interface allows us to control the agents and the simulation variables. We can control the world using buttons. First we define the actions to be performed when the button is pressed; pressing the button then executes them. In Fig. 2.5 we can see a typical control button.

Figure 2.5: Netlogo. Button to control the beginning of the simulation.

There are also sliders and switches (shown in Fig. 2.6) that allow us to change variable values (like the size, the number of agents, etc.). The difference between them is that switches can only be set to on or off, while sliders take values in an interval defined by the user. These controls allow us, for example, to change the speed of the turtles and even the speed of the simulation. With the switches, we can enable or disable an action.

Figure 2.6: Netlogo. Switch (left) and slider (right).

With the choosers (Fig. 2.7) we can also change the value of a variable. They allow us to define a list of values that a variable can take, and then choose one of them.

Figure 2.7: Netlogo. Chooser.

Netlogo also provides tools for plotting and displaying the results: plots and monitors. They allow us to observe the evolution of the variables during the simulation. A plot (Fig. 2.8) graphically represents the value and state of the variables, and records them to be used afterwards in the analysis of the experiment. To display the state and numerical value of different variables we can use monitors; we see one example in Fig. 2.9.

Figure 2.8: Netlogo. Plot.

Figure 2.9: Netlogo. Monitor to display variable values.

All these tools make Netlogo a friendly environment for simulations, since they allow an easy change of variable values, and with this a change in the behaviour of the system. Besides, they provide a good way to interpret the results and to debug, since we can plot results and view variable values at simulation time.
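As a hedged illustration of how these widgets connect to code (num-turtles, turtle-speed and show-trace? are hypothetical interface variable names of our own, not ones used elsewhere in the thesis), a slider or switch simply defines a global variable that procedures can read:

  to setup
    clear-all
    create-turtles num-turtles [ setxy random-xcor random-ycor ]  ;; num-turtles: a slider
    reset-ticks
  end

  to go
    ask turtles [
      ifelse show-trace? [ pen-down ] [ pen-up ]   ;; show-trace?: a switch
      fd turtle-speed                              ;; turtle-speed: a slider
    ]
    tick
    ;; a plot widget whose update command is, e.g., "plot count turtles"
    ;; redraws automatically at every tick
  end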
2.4 Example

To give a final overview of the whole model, we show the Tumor model in Fig. 2.10. Using this model we can simulate the growth of a tumor and its resistance to chemical treatment. In the figure we can see arrows pointing to the different elements described in the previous section. First, the arrow named Scenario points to the graphical representation of the scenario, where the action takes place. The arrow named Buttons shows the buttons, in this case the Setup and Go buttons. As we can see, the Go button is marked with two arrows, which means that it is a 'forever' button, i.e., the action is carried on from the moment we click this button until we click it again. The Setup button, however, is a 'once' button, which means that the action is only performed once. The Monitor arrow points to a monitor which counts the live cells. Further, the Plot arrow points to a plot which shows the evolution of the live cells over time. We also have the arrows Command center and Observer. In the Command center the user can type commands, which are actions the agents will perform or directives to the model; the context selected in the command line (observer, turtles or patches) determines which agents the commands address. Finally, we have the arrows Slider, with which we can vary the speed of the simulation, and Switch, which allows us to choose whether the cells leave a trace (On) or not (Off).

Figure 2.10: Tumor model and its components.

Part II: Location Privacy

Chapter 3: Preliminaries

3.1 Introduction

Location-based applications (also known as Location-Based Services, LBS) and traffic monitoring applications use people's movements in order to offer several services (for example GPS; see Fig. 3.1). However, in order to get these services, users need to provide private information: their geographical location. Location information is a set of data describing an individual's geographical position over a period of time. It is important to protect because a sequence of recorded locations of a person constitutes a quasi-identifier: a set of attributes that can be linked with external data to uniquely identify at least one individual in the general population. This means that the data can reveal personal information (for example, the places we visit or the times we arrive home). However, nowadays people do not really care [4] about these location privacy matters, even though they are becoming more important every day [5].

Figure 3.1: GPS service.

We can define location privacy as the ability to prevent other parties from learning one's current or past location. Past location information is important to protect [6]: although real-time location could be useful to find a person, past data could help to discover more private information, such as where a person lives or the places that this person visits during the day, since stored data is difficult to protect and third parties may be malicious.

A location privacy threat is the capacity of an adversary to get geographical location information and thereby compromise the user's privacy. Even if data is revealed anonymously, if the sample frequency is sufficiently high compared to the user density in an area, an adversary could link samples to the same user. An accumulation of path information about individual users will likely lead to identification. The techniques described in the following sections are used to avoid this identification, or at least make it more difficult.

When collecting information from the users, there is sensitive information that may become public. Users should be able to use LBS without compromising their privacy. Data privacy algorithms increase privacy through deliberate modifications of the dataset, such as omission, perturbation or generalization. This is why there must normally be a trade-off between privacy and accuracy (quality of service) [7] [8].

In order to protect our privacy, there are techniques which are applied to the location samples to make tracking more difficult. Two of them are pseudonymity and anonymity, and it is important to understand the difference between them. The former consists of stripping names from location data and replacing them with arbitrary IDs (identities different from the real ones).
Pseudonymity is a simple privacy defense, but it is not effective: even though the user's true identity is not revealed, an attacker, examining where the user spends more time, could determine the place where the user lives [9]. With anonymity, we also give arbitrary IDs to the users, but we change these IDs over time, so an attacker cannot assign a concrete ID to a user. Anonymity requires unlinkability, meaning that an attacker cannot infer the holder of an ID or link its samples. Anonymity aims to hide users' true identity with respect to emitted location queries.

3.2 Classification of privacy methods according to the communication with the LBS

A Location-Based Service (LBS) provides users with information accessible through mobile devices that are able to locate themselves, such as a GPS receiver or even a cell phone. In this section we explain two different approaches to LBS. The first is TTP-based (Trusted Third Party), where there is a third party between the LBS provider and the user which acts as a proxy such that privacy is ensured. The second approach is TTP-free; it gives better privacy than the TTP-based one because we do not have to trust a third party, which could be malicious. We can see a scheme of this classification in Fig. 3.3. Note that, if the user does not care about her privacy, the communication between the user and the LBS can take place without intermediaries (Fig. 3.2).

Figure 3.2: Communication between the user and the LBS without third party.

Figure 3.3: Classification of privacy methods.

3.2.1 TTP-based

In this scheme, users do not communicate with applications directly; otherwise they would reveal their identity straight away. An option to avoid this identification is to use an anonymizing proxy for all communication between users and applications (see Fig. 3.4). This proxy allows applications to receive and reply to anonymous (more correctly, pseudonymous) messages from the users. It removes any identifiers (e.g., IP addresses) and perturbs the location information, so that each message sent to the LBS contains the location information of the mobile client and a timestamp [8], but no identity information.

The simplest intermediate entities are the ones which replace the real IDs of the users by fake ones. Entities which provide anonymity aim to hide users' true identity with respect to emitted location information. By using a TTP, we move trust from the LBS to intermediate entities, so the LBS is no longer aware of the identities. With the TTP-based approach, the number of intermediate entities is smaller than the number of LBS, and we have to trust these third parties.

Among the intermediate entities are the k-anonymizers, which can replace the real location of the user with a blurred one. These entities take k users and cloak the area where they are located. The cloaking is really simple: instead of reporting the exact location (for example, the (x,y) coordinates), they report an area where all these k users have been. This leads to uncertainty about which car has reported a given sample and where this car is now. This way, LBS providers cannot easily determine which of the k users is really submitting the query.
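As a rough sketch of such cloaking (our own NetLogo illustration, assuming the users are turtles of a cars breed; the algorithms actually used in the experiments are described in Chapter 5), a k-anonymizer can report the bounding box covering the k nearest users instead of an exact position:

  to-report cloaked-area [k]
    ;; the k cars closest to the requesting car, the requester included
    let group min-n-of k cars [ distance myself ]
    ;; report the box [xmin xmax ymin ymax] covering the whole group
    report (list min [xcor] of group
                 max [xcor] of group
                 min [ycor] of group
                 max [ycor] of group)
  end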
Figure 3.4: Communication between the user and the LBS with third party (TTP-based).

3.2.2 TTP-free

With this approach, privacy can be provided without seeking help from any centralized third party (it is a P2P approach). Users collaborate to protect their privacy, using collaboration methods among themselves, but they do not even need to trust each other. We show a scheme in Fig. 3.5. Before requesting any location-based service, the mobile user discovers peers and forms a group of k users with her peers via single-hop communication and/or multi-hop routing. The area where users are anonymous is computed as the region that covers the entire group of peers. In the TTP-free approach the mobile client can blur its exact location, so the adversary only knows an area where the user could be. This area is the minimum area covering the client itself and the k-1 peers that form the group.

With the TTP-free approach, a user can choose between two modes that define when to try to find peers: on-demand mode and proactive mode. On the one hand, in on-demand mode mobile clients execute the cloaking algorithm when they need to access information from the location-based database server. We obtain a better quality of service, but a longer response time. On the other hand, in proactive mode mobile clients periodically look around to find the desired number of peers. They can then cloak their exact locations into spatial regions whenever they want to retrieve information from the data server.

In order to form a group, the user first adds Gaussian noise to his location. Then the user broadcasts this perturbed location and requests his neighbors to return perturbed versions of their locations. Among the replies received, the user selects k-1 neighbors such that the group formed by the locations of these neighbors and his own perturbed location spans an area A. The user sends to the LBS the centroid of the group of k perturbed locations, including his own. Note that, as users only exchange perturbed data, there is no need to trust each other. The perturbations tend to cancel each other in the centroid, maintaining accuracy.

Other methods, like obfuscation or SpaceTwist [10], are an alternative to collaboration-based methods: degrading the quality of information about users' location protects users' privacy. SpaceTwist generates a 'fake' point that is used to retrieve information on the k nearest points. After successive queries to the LBS provider, SpaceTwist is able to determine the interest point closest to the real location, while the LBS server cannot derive the real location of the user.

Figure 3.5: TTP-free communication.
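The group-forming step just described can be sketched as follows. This is an illustrative simplification, not the thesis code: it takes the k-1 nearest peers rather than negotiating an area A, and cars and sigma (the noise standard deviation) are assumed names:

  to-report perturbed-position [sigma]
    ;; add Gaussian noise to each coordinate
    report list (xcor + random-normal 0 sigma) (ycor + random-normal 0 sigma)
  end

  to-report group-centroid [k sigma]
    ;; perturbed locations of the k-1 nearest peers, plus our own
    let peers min-n-of (k - 1) other cars [ distance myself ]
    let locs fput (perturbed-position sigma) ([ perturbed-position sigma ] of peers)
    ;; the centroid sent to the LBS; the noise terms tend to cancel here
    report list (mean map first locs) (mean map last locs)
  end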
3.3 Attacks and countermeasures

3.3.1 Attacks

Since consecutive location samples from a vehicle exhibit temporal and spatial correlation, paths of individual vehicles can be reconstructed from a mix of anonymous samples belonging to several vehicles. Location data is very vulnerable to privacy attacks [11]. We can make a simple classification of these attacks: passive and active. In passive attacks, the attacker only listens to the information he is given and uses it to attack. Among them are the traffic analysis techniques, which only use the space/time information. The most important traffic analysis attacks are brute force, timing, communication patterns, packet counting, intersection and statistical disclosure.

In active attacks the attacker modifies the packets, the time of sending, etc. He does something active with the packets, for example removing one. These attacks are more powerful than passive attacks, but passive attacks have one big advantage: they are not detectable. In secure communications, nodes communicate with encrypted and authenticated data packets to prevent active attacks. We assume that this encryption is perfect, so we have to take care of passive attacks (and above all of traffic analysis attacks).

Tracking techniques are passive attacks, since even though the attackers use information from the packets, they do not modify them. For an adversary, a path is a sequence of location/time samples that a single user has reported. Tracking techniques can be used to reconstruct paths from anonymous samples or segments. They are useful once a home location has been identified: they can allow an adversary to follow the traces reported by a vehicle to other locations, thereby linking information about those places to the driver's identity.

As we said in Sect. 3.1, the problem with location-based services is that our privacy is threatened. There are four reported studies in which data was used to demonstrate the efficiency of an attack [8]. In three cases the data was pseudonymized, but even with this protection some tracks could be followed and some homes located. In the fourth case the data had been completely anonymized, thus mixing together coordinates from different people. The results show that even this is not enough to prevent an attacker from reassembling the data into individual tracks.

An adversary can use four approaches to link the samples reported by the users. First, trajectory-based linking assumes that a user is more likely to continue traveling on the same trajectory. Second, map-based linking correlates location samples with likely routes on a map; that way future location samples can be linked and users' positions can be predicted. A third approach is empirical linking, which uses past information (for example the last movement at a concrete position) to connect samples. Finally, the inference attack analyzes data in order to illegitimately gain knowledge about a subject [6]; furthermore, it allows the adversary to extract visited places.

To find the coordinates of each subject's home based on their GPS data, there are four heuristic algorithms [11]:

Last destination: exploits the fact that the last place a person goes is home. In [11] the attacker could computationally locate a subject's home within about 60 meters at least half the time.

WeightedMedian: the attacker exploits the fact that the subject spends more time at home than at any other location; a longer stay in a place can be used to identify the subject's home.

Largest cluster: the attacker assumes that most of the reported coordinates will be at home. She builds a dendrogram (a hierarchical clustering technique producing a tree diagram that shows an arrangement of clusters) of the subject's destinations, where the merge criterion is the distance between the cluster centroids. When the nearest two clusters are over 100 meters apart, the search ends. The home location is taken as the centroid of the cluster with most points.

Best time: the attacker assumes that the probability of a subject being at home varies depending on the time of day. The attacker obtains this distribution, so the relative probability of being home versus the time of day is known. Applying this distribution, the relative probability is computed for each measured sample of each user. Once this is done, the coordinates with the maximum relative probability are extracted and home is defined as the median of those points.
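As an illustrative sketch of the WeightedMedian idea (our own fragment, not the implementation evaluated in [11]; samples is assumed to be a list of [x y] positions in which places where the subject dwells longer appear more often), the home estimate is simply the coordinate-wise median:

  to-report estimated-home [samples]
    ;; the median is dominated by the most frequently reported place: home
    report list (median map first samples) (median map last samples)
  end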
Of the four heuristics, Last destination has the best performance, while the Best time algorithm gives the attacker the worst results. If we can identify a subject's home, then we can search for the name (for example on the Internet) and maybe find further personal information [11]. This is the reason why defenses against these attacks must be investigated.

3.3.2 Countermeasures

There are some simple countermeasures that the reader must keep in mind when talking about location privacy. Among them we have to take into account the regulatory measures, which are rules normally imposed by the government; they prescribe countermeasures that must be followed, or legal problems will arise. Besides, there are privacy policies, which are privacy rules normally defined within enterprises. These rules should always be defined in order to avoid compromising the privacy of any user, and they are normally collected in a document that defines how to deal with information related to users' private profiles. Finally, there are pseudonymity and anonymity, introduced at the beginning of Chapter 3, which aim to prevent the inference of information about the users.

As we introduced in Sect. 3.1, pseudonymity means using a pseudonym (an alternative ID) rather than the user's actual identity. Such pseudonyms should be different for different services to prevent tracking (users should adopt new pseudonyms for each application with which they interact), because even if we do not give our true identity but a pseudonym, our samples can still be linked, and thus we can be tracked and located. To be anonymous, the user changes his pseudonym over time. These pseudonyms have to be generated in such a way that relating a previous pseudonym to the following one is difficult [12], for instance using anonymous credentials [13]. Using anonymity or pseudonymity, applications only receive the 'fake' identity, not the real one, and real IDs cannot be linked to locations. For this reason, whenever a user switches his pseudonym, the subsequent location information appears as a different path to the adversary [14], since he does not know whether the new pseudonym belongs to the tracked user. However, there are algorithms that link pseudonyms with real location data and can even reconstruct paths from data consisting of mixed, anonymized location coordinates from multiple users [15]. Besides, to link pseudonyms, an attacker might look for patterns in the particular service requests, and in this way be able to link different IDs to one user.

In location-based systems, we can distinguish strong anonymity and weak anonymity. Strong anonymity keeps user information safer. The aim of the adversary is to identify the user who generated an anonymous path by linking other available information to the path, but with strong anonymity this is more difficult, because a larger group of service users would be required to travel along the same path at the same time. With weak anonymity, there are not so many users, or they do not travel at the same time, so the data is more distinctive and could be linked to an individual.

In the rest of the section we describe some algorithms to protect LBS users' privacy.
3.3.2.1 k-anonymity

With k-anonymity, instead of pseudonymously reporting her exact location, a user reports a region containing k-1 other people, so that she cannot be distinguished from those k-1 other people. We call these k people the anonymity set. The algorithm reduces the spatial or time accuracy of each reported sample until it meets the defined k-anonymity constraint. The larger k is, the higher the guarantees for location privacy. If we want high location privacy, it is necessary to perform additional spatial and temporal cloaking, and this results in low spatial and temporal resolution for the anonymized messages. This means that the QoS (which is defined by temporal and spatial tolerance specifications) will be diminished, because if the region or the time between reported samples grows, the accuracy of the service is reduced [16].

Besides, the k-anonymity model can be improved (in terms of privacy) by performing message perturbation and location cloaking algorithms: besides applying k-anonymity, the location can be blurred or some packet fields may be changed. This way we can achieve higher guarantees of k-anonymity and higher resilience to location privacy threats. In a scenario, each mobile client defines its desired anonymity level (the k value in k-anonymity), spatial tolerance and temporal tolerance. Each message is transformed into a message that can be safely forwarded to the LBS provider. There are two techniques to obtain k-anonymity:

• Spatial cloaking: Decreasing the location accuracy by enlarging the exposed spatial area such that there are another k-1 mobile clients present in the same spatial area.

• Temporal cloaking: Delaying the message until k-1 other mobile clients (k counting the user) have visited the same area reported by the message sender (see the sketch at the end of this subsection). Increasing the time between location reports can also improve privacy.

Data perturbation or resolution control mechanisms offer users additional options when releasing data to semi-trusted providers (such as little-known service providers), but it has been shown that service providers can still glean useful information from the data [14].

Since k-anonymity modifies the location traces substantially, it may not meet the accuracy requirements of some applications, such as traffic monitoring systems. To address this, a time-to-confusion metric [7] [17] is defined to evaluate the path privacy of a set of location traces: it describes how long an individual vehicle can be tracked. Hoh et al. propose an algorithm [17] which can guarantee a specified maximum time-to-confusion (see Sect. 5.3.3). This algorithm provides more accurate location data than a random sampling algorithm (see Sect. 5.3.2) and can guarantee a good level of privacy even for users driving in low-density areas.
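The temporal cloaking condition in the list above can be sketched in NetLogo as follows. This is a hypothetical fragment of our own: it assumes a cars breed whose visited-patches list is appended with patch-here as each car moves:

  cars-own [ visited-patches ]  ;; updated elsewhere as the car drives

  to-report ready-to-release? [k]
    ;; release a sample only once at least k cars (the sender included)
    ;; have passed through the sender's current patch
    let here patch-here
    report count cars with [ member? here visited-patches ] >= k
  end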
3.3.2.2 l-diversity

k-anonymity can provide anonymity in certain scenarios, but it cannot prevent certain attacks such as the homogeneity attack and the background knowledge attack [18]. With the homogeneity attack, an attacker can obtain sensitive information when there is little diversity in the values of the sensitive attributes: when all k users reside in the exact same location, a subject might be identifiable if only k-anonymity is used. Background knowledge means that the adversary has previous knowledge of the users; l-diversity provides privacy even when the data publisher does not know what kind of knowledge is possessed by the adversary. Incorporating l-diversity therefore allows mobile users to improve their privacy.

With k-anonymity and l-diversity, k accounts for privacy in terms of the set of subjects (the anonymity set of users) and l for the set of locations (the anonymity set of locations). If all the users are travelling along the same route and passing through the same set of identifiable locations, m-invariance can be introduced to control the number of routes that a user wants to keep anonymous.

The main reason to introduce l-diversity is to improve the location privacy protection provided by the k-anonymity algorithm. To ensure l-diversity, every group of tuples that share the same non-sensitive information should be associated with at least l approximately equally distributed sensitive values. If l is increased (and thus the diversity is increased), the probability of linking a static location (for example a church, a school, etc.) to a mobile user is reduced, and this allows the user to be unidentifiable within a set of l different physical locations [19].

3.3.2.3 Spatial and temporal degradation

Apart from k-anonymity, there are other countermeasures. One of them is Spatial Cloaking [7], which can be obtained by applying different techniques, either to a single user or to a group of people in the same region. As we said before, a user is k-anonymous if the reported location is imprecise enough to be indistinguishable from at least k-1 other objects. Mix Zones (explained in more detail in the next section) are physical regions in which subjects' pseudonyms can be shuffled among themselves to try to prevent an inference attack (explained in Sect. 3.3.1) on the data. Using these Mix Zones we can achieve the goal of a user being k-anonymous.

There are alternatives to Mix Zones that use only a single user's data. One of them uses the 'knowledge' that the most visited place is home. The algorithm deletes samples near a subject's home, which creates ambiguity about the home's actual location, since no cloud of samples points at it. It works as follows: we choose a point within a circle of radius r centered at the subject's home, and this point becomes the center of a circle of radius R, the region where samples are deleted. We show an example in Fig. 3.6.

Another Spatial Cloaking technique is to create fake traces in some areas to prevent tracking. An attacker can also be fooled if false locations are appended to a true location report: the location-based service responds to all the reports, and the client picks out only the response based on the true location.

Another countermeasure is adding noise. If location data is noisy, it will not be useful for inferring the actual location of the subject (value distortion). For example, we can add Gaussian noise to each measured latitude and longitude. With inaccuracy we report a sample different from the actual location, while imprecision means giving a plurality of possible locations. However, to prevent an attack we need a lot of additive noise. Rounding is also used to prevent attacks: if data is too coarse, it will not correspond to the subject's actual location. Other ways to maintain location privacy are vagueness, where a subject reports a place name instead of latitude and longitude, and subsampling, which means sending fewer samples.

Path changing [20] is also used as a countermeasure, with two approaches: Path Confusion and Path Perturbation. The former takes advantage of the fact that every time the paths of two different users meet, the probability that the adversary confuses the tracks and follows the wrong user increases. With the latter, new segments are inserted in the map (so that more paths cross) and some path segments are changed, raising privacy by confusing the opponent. Path Perturbation also depends on the characteristics of the original traces: if few crossing streets exist, the adversary could assume that all crossing segments have been artificially inserted by a perturbation algorithm. Note that this approach obtains the best performance for short parallel segments. The privacy level depends on the density of the scenario and on the quality-of-service constraint.

Figure 3.6: Deleting data to hide home's location.
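The deletion step of Fig. 3.6 can be sketched as follows (our own NetLogo 6 fragment, not the thesis code; samples is a list of [x y] pairs, hx and hy are the home coordinates, and big-r plays the role of R, since NetLogo identifiers are case-insensitive):

  to-report scrub-home [samples hx hy r big-r]
    ;; pick a random centre within distance r of the home...
    let a random-float 360
    let d random-float r
    let cx hx + d * sin a
    let cy hy + d * cos a
    ;; ...and drop every sample closer than big-r to that centre
    report filter [ s ->
      sqrt ((first s - cx) ^ 2 + (last s - cy) ^ 2) > big-r ] samples
  end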
With the former, we take advantage of the fact that every time two paths of different users’ meet, the probability the adversary to confuse the tracks and follow the wrong user is increased. With the later new segments are inserted in the map (so there are more paths crossing) and some path segments are changed, raising the privacy by confusing the opponent. Path Perturbation also depends on the characteristics of the original traces, because if few crossing streets exist, the adversary could assume that all crossing segments have been artificially inserted through a perturbation algorithm. Note that this approach obtains the best performance for short paralell segments. The privacy level depends on the density of the scenario and the quality of service constraint. 21 3. Preliminaries r R Figure 3.6: Deleting data to hide home’s location. This dependency on user density makes that, if it is low, we have to sacriface QoS to obtain the level of privacy that we need. Thus adequate levels of privacy can only be obtained if user density is sufficiently high, or the QoS is very low. 3.3.2.4 Mix Zones The aim of Mix Zones is to prevent tracking of long term user movements, but still permit the operation of many short-term location-aware applications. If sufficient users simultaneously pass through these zones, an adversary cannot determine which path segments leading into and out of a zone belong to the same users. This technique relies on statically defined zones, so it cannot guarantee protection in areas with a low traffic density [8]. A Mix Zone for a group of users is a connected spatial region in which none of these users has registered any application callback. We define the anonymity set as the group of people visiting the Mix Zone during the same time period. The larger the anonymity set is, the greater anonymity offered, since as there are more users going out of the Mix Zone, it is easier that the attacker gets confused. When users enter in the Mix ¿one, they are ’invisible’ for the application, they stop receiving 22 Attacks and countermeasures location information. Users change to a new pseudonym every time they enter in a Mix Zone, and applications that see a user emerging from the Mix Zone cannot distinguish that user from any other who was in the Mix Zone and cannot link people going into the mix zone with those coming out of it. An user might enter to different Mix Zones in different times. The attacker can observe the times, coordinates and pseudonyms of all the enter and egress events. The goal of the adversary is to reconstruct the correct mapping between new and old IDs, being able to identify an user. If not all the members of the set are equally interesting to the observer, the anonymity set is not a good measure. This is why other anonymity metrics has been defined [21] [22]. In this case, they are not equiprobable, and the entropy (explained in Sect. 5.2) will be lower, which means less anonymity, since the higher the entropy is, the more uncertain an observer will be about the true answer [22]. 3.3.2.5 Privacy grid A privacy grid is a framework for supporting anonymous location-based queries in mobile information delivery systems. Users can define their preferred privacy requirements in terms of both location hiding measures (k-anonymity, l-diversity, etc.) and location service quality measures (maximum spatial and temporal resolution.) The privacy grid provides fast and effective cloaking algorithms for location k-anonymity and location l-diversity [19]. 
3.3.2.5 Privacy grid

A privacy grid is a framework for supporting anonymous location-based queries in mobile information delivery systems. Users can define their preferred privacy requirements in terms of both location hiding measures (k-anonymity, l-diversity, etc.) and location service quality measures (maximum spatial and temporal resolution). The privacy grid provides fast and effective cloaking algorithms for location k-anonymity and location l-diversity [19].

In the privacy grid, the space is divided into cells. The smallest spatial cloaking area for each user must be found, and there are two approaches to find it: bottom-up and top-down. The bottom-up approach takes the base cell (the minimum defined area) containing the mobile user and performs two checks for each message going to the LBS with k or l higher than one. The first check determines whether the current cell meets the user's spatial constraints; then the k-anonymity and l-diversity requirements are checked. If both conditions are fulfilled, the current cloaking area is chosen. If the constraints are not met, the algorithm expands to neighboring cells, making the cloaking area bigger; the cell chosen for the expansion is the one with the highest object count.

With top-down cloaking, we first find the largest grid cell region within the user-specified maximum spatial resolution area. This cloaking area is divided into a set of rows and columns. The algorithm checks whether the largest possible cloaking box meets the privacy requirements; if the region does not fulfill them, the message is not cloaked and the algorithm terminates. Otherwise, the algorithm iteratively removes rows and columns, checking the requirements at each step, until it finds the smallest cloaking region that meets them.

Chapter 4

System model

4.1 Introduction

The definition of the model is very important, since the analysis methodology, the results obtained and their interpretation depend on it. In this chapter we include a description of the scenario and of every agent, its functions and capabilities, trusted and untrusted entities, etc. In the following sections we describe the behaviour and characteristics of the participants: vehicles, Trusted Third Party (TTP) and public server (LBS). We also describe the scenario in which the participants move, built in Netlogo. We show an example of our model in Fig. 4.1.

Figure 4.1: System model.

4.2 Scenario

The scenario we simulate is the following: a city consisting of blocks, streets, traffic lights and cars (as could be, for example, the city of New York, presented in Fig. 4.2). We only consider straight streets; even if the streets of a city are curved, a user can only travel along the streets and turn at intersections. In our scenario there are bidirectional and unidirectional streets, as well as two-lane streets. Lights control the traffic, as in a real scenario.

Figure 4.2: Example of a reticular city: New York [23].

We choose a region of 12.5x12.5 km2, as we consider it big enough to represent a real city. Our scenario is not expressed in km, as we explain in Chapter 6; this is one big advantage of Netlogo, that we can easily change the size of our scenario (our city scenario in Netlogo is shown in Fig. 4.3). We also consider a block size of 500 meters, and the maximum time that a traffic light remains red is 15 seconds.

In our model, cars drive around the city reporting samples every t seconds, and we consider that their maximum speed is v. While driving, a car describes a path, which we call the car's trajectory. However, the valuable samples (i.e., the ones that contain useful information) are only those within a certain interval of time or space, even though the cars keep on reporting samples until the end.
The reason for this is that if they stopped reporting samples, the number of targets to follow would be smaller, and that would improve the tracking performance. We chose an exponential distribution for the cars' travel times, inspired by the work of Krumm [24] and Hoh et al. [17], with median 14.4 minutes as in Krumm [24].

Figure 4.3: Netlogo city view.

4.3 Participants

4.3.1 Vehicles

In our model we define two types of vehicles, to make our experiments as realistic as possible:

Commuters. We consider N vehicles carrying a GPS receiver and a transmitter. They report samples containing the ID, timestamp, longitude, latitude, velocity, heading information and a second timestamp (not related to real time, but to Netlogo simulation steps). We need this second timestamp because our tracking algorithm needs to know how many cars are reporting samples at a given time and, since Netlogo introduces small deviations in the reported times, it could otherwise be hard to select which samples belong to a given discrete time. The transmitter only sends samples while the car is switched on, i.e., cars do not exist for the algorithm while they are parked. The sampling period is 5 seconds. In [17], samples are reported every minute, but our scenario is smaller, and 5 seconds is already challenging. The behaviour of the commuters is the following: they drive around the city and stop whenever they have arrived at their destination or have driven for a certain time. We explain this in Sect. 6.2.

Non-commuters. Besides the N commuters, we consider F vehicles that do not report their position, but drive from one place to a random destination, modifying the traffic conditions for the other vehicles. These vehicles only serve to make our scenario as realistic as possible; we have no interest in what they do. In the scenario, both types of cars (R = N + F in total) drive together, simulating real traffic conditions.

4.3.2 Trusted Third Party and public server

In this model we use a Trusted Third Party (TTP), an entity which acts as a proxy so that privacy is ensured. As we explained in Sect. 3.2, there are other approaches to location privacy that do not use this entity [25]; however, the centralized Trusted Third Party is still the most common solution. This entity receives the samples of the vehicles every second. We define a set of samples as all the samples released at a given time t. For example, the set of samples for t=0 would be s1, s2, ..., sn, where n is the number of samples released at that time, each belonging to a different car. As explained in Sect. 4.3.1, N is the total number of cars reporting samples, thus n ≤ N, because some cars can be parked.

The public server is the server which receives the samples from the cars in order to offer users various services. The TTP is located between this entity and the cars to ensure privacy, since the public server could be malicious and try to use the location information to infer private information about a user. The TTP can release all samples directly or execute a privacy enhancing algorithm (such as the 'Uncertainty-aware privacy algorithm' [17]). When a privacy algorithm is used, the TTP reveals to the public server only the samples resulting from the algorithm.
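For concreteness, the reported sample described above could be represented as follows. This is our own illustrative sketch; the field names are not taken from any particular system:

```cpp
#include <cstdint>

// Illustrative sketch of one reported location sample (field names are ours).
struct Sample {
    uint32_t id;        // pseudonym of the reporting car
    double   timestamp; // wall-clock time of the report, in seconds
    uint32_t step;      // Netlogo simulation step (the second timestamp)
    double   lon, lat;  // position
    double   speed;     // current speed
    double   heading;   // direction of travel, in degrees
};
```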
4.4 Adversarial model

4.4.1 Definition

Nowadays there is a growing interest in tracking large user populations, rather than individual users. Anonymizing location samples does not fully solve the privacy problem, because an adversary could link multiple samples to accumulate path information and eventually re-identify a user. Thus, sharing location information raises privacy concerns [26].

Tracking algorithms (explained in Sect. 5.4) aim at compromising the users' privacy. For example, multi target tracking algorithms have applications in both military and civilian areas (e.g., air defense, air traffic control), but not only in such high-stakes fields: just knowing where a person has been, or where that person lives, is valuable information nowadays. This is why tracking algorithms keep being developed and improved. Target tracking algorithms predict a target's position using the last known speed and heading information, and then decide which next sample to link to the same vehicle through Maximum Likelihood Detection [27]. If multiple candidate samples exist, the algorithm chooses the one with the highest a posteriori probability, based on a probability model of distance and time deviations from the prediction [17].

In our model, the attacker can be a malicious public server, or an entity that can eavesdrop on the communication channel between the TTP and the public server. In both cases, the attacker receives a set of anonymized samples and aims at linking them to obtain private information. The attacker can see the location and speed information, and we also assume that he knows the times at which the samples were released. The attacker therefore reads the samples in order, taking at each step all the samples of a given timestamp, and tries to link them using tracking algorithms. This can be achieved using single target tracking algorithms or multi target tracking algorithms.

4.4.2 Types of tracking

In the field of target tracking, we can follow two main approaches:

• Multi target tracking
• Single target tracking

Multi target tracking aims at tracking a large user population. To this end, one of the most popular algorithms is Reid's multiple hypothesis tracking algorithm [28]. This algorithm is based on the Kalman filter [29], and it makes a prediction of the next possible state in order to link samples. The Kalman filter (explained in more detail in Sect. 5.2.2) is useful to predict the next state of the users and thereby track them. It is a set of mathematical equations that provides an efficient computational (recursive) means to estimate the state of a process, in a way that minimizes the mean of the squared error. The biggest difficulty in the application of multi target tracking is the problem of associating measurements with the appropriate tracks, above all when reports are missing. Another problem with multi target tracking is its high computational cost: it is not realistic to use it in a big scenario with more than three or four targets.

Single target tracking is easier to implement, and it needs fewer resources, so it can be performed on a large target population. However, it yields worse results than multi target tracking, since only one car is followed at a time. Although a multi target algorithm would perform better, the computational effort required is too high to run it with a large number of cars. In our case, we consider that multi target tracking is beyond the attacker's capabilities, so he uses different types of single target tracking algorithms (explained in Sect. 5.4.2).
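A minimal sketch of how such an attacker could organize the received data, grouping the anonymized samples by timestamp before attempting to link them (reusing the illustrative Sample struct sketched in Sect. 4.3.2; the function name is ours):

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Group anonymized samples by their Netlogo step, so the attacker can
// process one "set of samples" (all reports of one discrete time) at a time.
std::map<uint32_t, std::vector<Sample>>
groupByStep(const std::vector<Sample>& all) {
    std::map<uint32_t, std::vector<Sample>> sets;
    for (const Sample& s : all)
        sets[s.step].push_back(s);
    return sets;  // iteration order is ascending step, i.e., time order
}
```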
Chapter 5

Algorithms

5.1 Introduction

In this chapter we explain the different algorithms that we have used in our experiments. The value of Netlogo is well illustrated here: normally, this type of experiment would require the collection of real data samples, which means that we would need access to cars, users who allow their data to be public, etc. With Netlogo, we only need to build a scenario and record the samples of the simulated cars, saving money and time while obtaining representative results.

We deploy two types of algorithms: defense algorithms and attack algorithms. We combine different attacks with different defenses and study their behaviour. In Sect. 5.2 we first explain some preliminary concepts necessary to understand these algorithms.

The defense algorithms are used to avoid location privacy breaches. In this work we use two defenses: the random subsampling algorithm (Sect. 5.3.2) and the Uncertainty-aware privacy algorithm (Sect. 5.3.3). These algorithms release fewer samples in order to avoid tracking, while still giving enough information for users to be able to use Location Based Services.

The attack algorithms aim at tracking users and obtaining information. It is important to ensure that an attacker cannot learn a user's path or the places that user has visited, since this knowledge could be used in several (probably malicious) ways. The attacks we explain in this chapter are mainly two: in Sect. 5.4.1 we explain multi target tracking, and the single target tracking algorithms are explained in Sect. 5.4.2.

5.2 Mathematical Background

5.2.1 Entropy

Entropy is a measure of the uncertainty associated with a random variable. It is also called Shannon entropy (entropy in the information theory field) [30]. Entropy quantifies the information contained in a message or, equivalently, the average information content one is missing when one does not know the value of the random variable. The entropy H of a discrete random variable X with possible values x_1, ..., x_n is

H(X) = E(I(X))

where E is the expected value function and I(X) is the information content or self-information of X; I(X) is itself a random variable. If p(X) denotes the probability mass function of X, then the entropy can be explicitly written as

H(X) = -\sum_{i=1}^{n} p(x_i) \log_b p(x_i)

where b is the base of the logarithm. When computing entropy, b is normally equal to two, in which case the entropy is measured in bits.
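As a quick illustration, a direct implementation of this formula with b = 2 (our own sketch; the same computation is reused later for the tracking uncertainty of Eq. 5.4):

```cpp
#include <cmath>
#include <vector>

// Shannon entropy, in bits, of a discrete probability distribution.
// Terms with p = 0 contribute nothing (the limit of p*log2(p) as p->0 is 0).
double entropyBits(const std::vector<double>& p) {
    double h = 0.0;
    for (double pi : p)
        if (pi > 0.0)
            h -= pi * std::log2(pi);
    return h;
}
```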
5.2.2 Kalman filter

The Kalman filter is a recursive filter that estimates the state of a linear dynamic system from a series of noisy measurements in an efficient way. The main idea is that with the Kalman filter we can obtain the future state of a system, working on a prediction-correction basis. The state of the system is represented as a vector of real numbers, and at each discrete time increment a linear operator is applied to the state to generate a new one. The difficulty is that normally the state variables are noisy and not directly observable, due to errors in the sensors, the transmitters, etc.

The filter predicts the next state using the measurements of the system (for example, the geographical position), which are noisy and linearly related to the state. If the noise is Gaussian distributed, the Kalman filter estimator is statistically optimal [31]. The Kalman filter algorithm operates in three steps: prediction of a new system state; generation of hypotheses for the assignment of new samples to targets and selection of the most likely hypothesis; and adjustment of the system state. In the first step, the filter predicts the most likely next state of the system; the filter equations also allow computing the uncertainty of this estimate. In the second step, the filter generates hypotheses assigning the new state to the real situations that we have observed, and selects the most likely one. Finally, the system corrects the result, incorporating the information of the measurements.

The Kalman filter has a wide range of applications (for example, the control of dynamic systems). However, its importance in this Master Thesis is its application to the prediction of dynamic systems that are difficult to control.

5.2.3 Euclidean metric

This is the most common distance measure between two points. The Euclidean distance between two points A = (a_1, ..., a_n) and B = (b_1, ..., b_n) in a Euclidean n-space is defined as

d(A, B) = \sqrt{(a_1 - b_1)^2 + ... + (a_n - b_n)^2} = \sqrt{\sum_{i=1}^{n} (a_i - b_i)^2}

Netlogo provides a two-dimensional (2D) scenario, so in our case A and B are 2D vectors, i.e., A = (a_x, a_y), B = (b_x, b_y). The Euclidean distance in this scenario is

d = \sqrt{(a_x - b_x)^2 + (a_y - b_y)^2}    (5.1)

5.2.4 Random number generator

One of the defense algorithms, the random subsampling algorithm explained in Sect. 5.3.2, needs the generation of random numbers in order to decide whether a sample is released or not. To that end, we programmed a pseudo-random number generator, more specifically a multiplicative linear congruential generator. A linear congruential pseudo-random number generator satisfies Eq. 5.2:

Z_i = (a Z_{i-1} + c) mod m,   0 ≤ Z_i ≤ m - 1    (5.2)

where Z_i is the random number (Z_0 is the seed), m must be a prime number, a must be a primitive root modulo m, which means that it has to satisfy Eq. 5.3, and c is an increment satisfying c < m (in our case c = 0):

a^n mod m ≠ 1,   n = 1, ..., m - 2    (5.3)

As we said, we use the multiplicative variant, which means that in Eq. 5.2 c = 0. We show how we obtain the random numbers in Algorithm 1 (Schrage's decomposition, which avoids overflow when computing a·Z_{i-1} mod m).

Algorithm 1 Random number generator algorithm
1: h = Z_i div q
2: l = Z_i mod q
3: t = a · l - r · h
4: if t > 0 then
5:   Z_i = t
6: else
7:   Z_i = t + m
8: end if

where m, the maximum period (the longest series of random numbers that we can obtain), is equal to 2^31 - 1, a = 48271, q = 44488 (q = m div a) and r = 3399 (r = m mod a).
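A runnable C++ version of Algorithm 1, as a sketch (the class and method names are ours); the uniform() helper is what the subsampling decision of Sect. 5.3.2 can compare against the release probability:

```cpp
#include <cstdint>

// Multiplicative linear congruential generator with Schrage's decomposition,
// using the constants of Sect. 5.2.4. The seed must lie in [1, m-1].
class LcgRng {
    static const int32_t m = 2147483647;  // 2^31 - 1, prime modulus (period)
    static const int32_t a = 48271;       // primitive root modulo m
    static const int32_t q = 44488;       // q = m div a
    static const int32_t r = 3399;        // r = m mod a
    int32_t z;                            // current state, Z0 is the seed
public:
    explicit LcgRng(int32_t seed) : z(seed) {}
    int32_t next() {
        int32_t h = z / q;                // high part (Algorithm 1, line 1)
        int32_t l = z % q;                // low part  (line 2)
        int32_t t = a * l - r * h;        // a*z mod m without overflow
        z = (t > 0) ? t : t + m;
        return z;
    }
    // Uniform value in (0,1), e.g. to compare against a release probability.
    double uniform() { return static_cast<double>(next()) / m; }
};
```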
5.3 Privacy enhancing algorithms

5.3.1 Introduction

Privacy enhancing algorithms are a wide array of technical measures to protect users' privacy [32]. Nowadays we live in a world where our data can identify us, so it is very important to have means of protecting it. There are many types of privacy enhancing technologies; in our case, our algorithms aim at anonymizing the samples reported by cars, so that private information cannot be inferred from those samples.

In the next sections we explain two privacy enhancing algorithms. In Sect. 5.3.2 we explain the random subsampling algorithm, a simple algorithm that randomly removes samples. Then, in Sect. 5.3.3, we explain the Uncertainty-aware algorithm, which is also based on sample removal, but not at random: it chooses to remove the most 'compromising' samples. This way the algorithm removes fewer samples while maintaining the privacy level.

5.3.2 Random subsampling algorithm

We define subsampling as a reduction of the sampling rate. This is a very simple approach to increasing privacy, since all the algorithm has to do is delete samples to make tracking more difficult. The fewer samples we report, the more difficult it is to link them: the samples are more separated from each other, so the prediction is more likely to be wrong, and the attacker ends up assigning a wrong path to a user. However, this algorithm does not obtain good results, since it deletes samples without taking into account available information such as the position of a car, the number of cars surrounding it, etc. To make this idea concrete, consider the following example: if a user is at an intersection and the next sample is not reported, this is better in terms of privacy than if the user fails to report a sample where there is no intersection and the user can only go straight. In the latter case, the attacker can infer the direction of the user even without the unreleased sample. Other algorithms (such as the one explained in Sect. 5.3.3) were developed to address this weakness.

In our simulations we have applied random subsampling. This variant of subsampling consists of the following: at every timestamp, we randomly decide whether to release a user's sample or to delete it. To make this decision we use a custom pseudo-random number generator instead of the default C++ function random(). We have programmed this generator in C++ to make it probabilistically sound; the algorithm we use is the multiplicative linear congruential generator explained in Sect. 5.2.4. With this approach the tracking ability of an attacker is reduced, since the fewer samples are released, the more difficult it is for an attacker to follow the path.

5.3.3 Uncertainty-aware privacy algorithm

5.3.3.1 Definitions

In this subsection we define some terms that are necessary to understand the Uncertainty-aware algorithm.

Definition 1 Time-to-confusion: the maximum time during which an adversary can correctly follow a user, i.e., for how long an attacker can follow a target with sufficient certainty that the samples correspond to that user.

Definition 2 Tracking uncertainty: the entropy of the probability distribution describing the certainty of an attacker in the assignment of samples to users. In Eq. 5.4, p_i denotes the probability that location sample i belongs to the vehicle currently tracked. Lower values of H denote more certainty, i.e., lower privacy.

H(X) = -\sum_{i=1}^{n} p_i \log_2(p_i)    (5.4)

Definition 3 Mean time-to-confusion: the mean tracking time during which uncertainty stays below a confusion threshold.

5.3.3.2 Algorithm definition

The Uncertainty-aware privacy algorithm was developed by Hoh et al. [17]. It is a privacy algorithm which aims at providing anonymity even in low-density areas. It works as follows: the user defines a maximum allowable time-to-confusion and an associated uncertainty threshold (explained in Sect. 5.3.3.1). The algorithm receives sets of samples as input, processes each set, and releases only the samples that keep the tracking bounds maintained. The strength of this algorithm is that it guarantees a maximum time-to-confusion.
Using this algorithm, the maximum time-to-confusion is maintained for a given uncertainty. To ensure this, the algorithm follows two criteria to release samples. The first rule mandates that a sample can be revealed if the time since the last point of confusion is less than the defined maximum time-to-confusion. The second rule mandates that a sample can only be released if the tracking uncertainty (Eq. 5.4) is above a defined threshold.

Algorithm 2 describes the processing of one set of samples; it is repeated until there are no more samples. The input of the algorithm is the set of GPS samples reported at time t (v.currentGPSSample, updated for each vehicle), the maximum time-to-confusion (confusionTimeout) and the associated uncertainty threshold (confusionLevel). With these three inputs, it generates as output a set of GPS samples that can be released without compromising privacy.

Algorithm 2 Uncertainty-aware privacy algorithm
1: releaseSet = releaseCandidates = ∅
2: for all vehicles v do
3:   if startOfTrip then
4:     v.lastConfusionTime = t
5:   else
6:     v.predictedPos = v.lastVisible.position + (t - v.lastVisible.time) * v.lastVisible.speed
7:   end if
8:   //release all vehicles below time-to-confusion threshold
9:   if t - v.lastConfusionTime < confusionTimeout then
10:    add v to releaseSet
11:  else
12:    //consider release of others dependent on uncertainty
13:    v.dependencies = k vehicles closest to v.predictedPos
14:    if uncertainty(v.predictedPos, v.dependencies) > confusionLevel then
15:      add v to releaseCandidates
16:    end if
17:  end if
18: end for
19: //prune releaseCandidates
20: for all v ∈ releaseCandidates do
21:   if ∃w ∈ v.dependencies. w ∉ (releaseCandidates ∪ releaseSet) then
22:     delete v from releaseCandidates
23:   end if
24: end for
25: repeat pruning until no more candidates to remove
26: releaseSet = releaseSet ∪ releaseCandidates
27: //release GPS samples and update time of confusion
28: for all v ∈ releaseSet do
29:   publish v.currentGPSSample
30:   v.lastVisible = v.currentGPSSample
31:   neighbors = k closest vehicles to v.predictedPos in releaseSet
32:   if uncertainty(v.predictedPos, neighbors) ≥ confusionLevel then
33:     v.lastConfusionTime = t
34:   end if
35: end for

In a first step, the algorithm selects the samples that can be safely revealed because less time than confusionTimeout has passed since the last point of confusion (line 9). In the second step, from line 12 to line 27, the algorithm selects the vehicles whose tracking uncertainty is above the threshold, and releases them. In this step some approximations are made; even though more samples could be released without them, they are conservative, so the privacy guarantees are maintained. The final tracking uncertainty is not calculated with all the samples reported at time t, but only with the k samples closest to the predicted point. The uncertainty would be larger if more samples were taken into account, so this is a conservative approach. Furthermore, since uncertainty should only be computed over revealed samples, and these are not yet determined, the algorithm chooses a set of releaseCandidates. This set is pruned until only the vehicles which meet the uncertainty threshold remain in it. 'The key property to achieve after the pruning step is that ∀v ∈ releaseCandidates. uncertainty(v.predictedPos, k closest neighbors in releaseSet ∪ releaseCandidates) ≥ confusionLevel' [17].
In other words, it must be ensured that, after pruning, all k neighbors of every vehicle are in the release set. The last operations (line 27 onward), performed after deciding which GPS samples can be released, update the last confusion point and the last visible GPS sample for each vehicle. Note that confusion should only be calculated over released samples, not over all samples. The last three lines are needed for the path prediction in the uncertainty calculation.

5.4 Tracking algorithms

A tracking algorithm tries to associate samples with the appropriate tracks. In our study, two tracking algorithms are applied to the samples collected in the Netlogo simulation to reconstruct the path followed by a user. We do not perform the further step of recovering identity information, e.g., the name of the user. When choosing a tracking algorithm we have to take into account the algorithm's performance, but also the computational effort required to run it, because resources and time are limited.

An adversary could employ at least three approaches to link location samples [33]; we introduce them in the following paragraphs. Trajectory-based linking assumes that a user is more likely to continue travelling on the same trajectory than to change direction. A second approach is map-based linking. This algorithm correlates location samples with likely routes on a road network or a map. These routes can in turn be used to predict users' positions and to link future samples. In other words, the attacker uses geographic, traffic and map information to obtain a better prediction. For example, if a user is at an intersection where he can only turn right (because turning left is forbidden), the adversary uses this knowledge so as not to link the user's next sample to one going in a forbidden direction. Finally, empirical linking connects samples based on prior movements that have been observed at a given location: the attacker uses past information to link samples. This means that if a user is in a specific situation that has also occurred in the past, the attacker takes this information into account (e.g., which direction did the user follow) to make a better prediction.

In the following sections we explain different target tracking algorithms. In Sect. 5.4.1 we explain the multi target algorithm, and in Sect. 5.4.2 and Sect. 5.4.3 we explain two different single target tracking algorithms. The difference between single and multi target algorithms is that with the former we only follow one target at a time, while with the latter we take all the users into account. Besides, since the approach followed in most of the literature (for example, in [17]) is trajectory-based linking, because it requires the least effort for large-scale outdoor positioning systems, we restrict our analysis to this approach.

5.4.1 Multi target tracking algorithm

In multi target tracking, the attacker is not interested in following every user independently; the interest is in tracking a larger user population rather than individual users. The problem of linking location samples to potential users is known as the data association problem in multi target tracking systems. The idea of these algorithms is to minimize the error when matching the positions of new location samples with the predicted positions of all targets.
We choose to use Reid’s multiple hypothesis tracking algorithm, which is based on Kalman filtering (explained in Sect. 5.2.2.) The algorithm explained in this section considers that after every step only one hypothesis survives, i.e., at each step we calculate likehood between predictions and new samples assuming that the previous assigments were correct. 5.4.1.1 State prediction The first step is the Kalman filter prediction step. In this step, given a current state,we predict the next state using the available measurements (in our case samples) and adding noise, as we explained in Subsect.5.2.2. The filter prediction is described by the following formulas: xk = F xk−1 + wk (5.5) zk = Hxk + vk (5.6) In this equations, F is a matrix which given a previous state xk−1 , describes the next one. wk is noise, and xk is the state vector of the process at time k. zk is the new observation vector, obtained from the prediction xk using H, that converts a state vector into measurement domain. There are two types of noise in this equations. wk ∼ N (0, Qk ) is the process noise vector and vk ∼ N (0, Rk ) is the measurement of noise, where N (µ, cov) represents the Gaussian function with mean µ and covariance cov. wk and vk are assumed to be independent between them and they are normally distributed with covariance matrices Qk (Eq. 5.7) and R (Eq. 5.8.) As we can see, only the main diagonal of these matrices is different to zero, because the noises are assumed to be independent. These non-zero values represent the variance in the state (in Qk ) 36 Tracking algorithms and in the measurement vector variables (in Rk ,) since we cannot assume that measures are perfect. Qk 0 0 0 0 Qk 0 0 Q= 0 0 Qk 0 0 0 0 Qk Rk 0 0 0 0 Rk 0 0 R= 0 0 Rk 0 0 0 0 Rk (5.7) (5.8) As our tracking applications are two-dimensional, we can model the system state as xk =[px ,py ,vx ,vy ]’, which are the position and the speed vector in the x and y axis, respectively. We define F, a matrix that multiplies the state vector as: 1 0 F = 0 0 0 1 0 0 0 1 0 0 1 0 1 0 (5.9) If we use this matrix, we predict the position in the next timestamp (the x and the y. The predicted speed remains the same,) since e = v · t, being e the space, v the speed and t the time. Every new step, the filter makes a prediction of the new position of the user as: x̄k+1 = F x̂k P̄ k+1 (5.10) k T T = F P̂ F + Q (5.11) where x̄ is the mean and P̄ is the covariance of a multivariate normal distribution. x̂ and P̂ are the estimates after the last sample was received. 5.4.1.2 Second step: Hypothesis Generation and Selection An hypothesis is defined as the possible assigment of new samples to users. In this step, when the attacker receives the new samples, the algorithm generates one set of hypothesis for each permutation of the sample set, i.e., the algorithm makes all the possible permutations with the assignment between samples and users. Finally, it computes the likehood for each hypothesis and chooses the one with maximum likehood. 37 5. Algorithms Being Ωi an hypothesis, Z k a set of measurements with cardinality M, we define Pik ≡ P (Ωki |Z k ) ≈ M Y f (zm ) (5.12) m=1 which is the probability of Ωi given the measurements at a time k. f represents the conditional probability density function of the vector zk , which obeys a multivariate normal distribution (Eq. 5.13.) f (z k |x̄k ) = N (z k − H x̄k , B) (5.13) Equation 5.13 calculates the proximity of a prediction to the observation and then these values are combined into the probability of each hypothesis. 
B is defined as B = H \bar{P}^k H^T + R. Equation 5.14 is the multivariate normal density; the values \bar{x}^k and \bar{P} used here are the ones calculated in the prediction step.

N(x, P) = e^{-\frac{1}{2} x^T P^{-1} x} / \sqrt{(2π)^n |P|}    (5.14)

Finally, the hypothesis j with maximum probability is chosen, and we calculate the log-likelihood ratio:

log Λ^k = log ( P_j^k / \sum_{i=1, i≠j}^{I} P_i^k )    (5.15)

5.4.1.3 Third step: state correction

This is the last step, needed to update the predicted system state vector for each path with the Kalman gain K = \hat{P} H^T R^{-1}. After assigning the observation vectors to the targets according to the chosen hypothesis, it is also necessary to correct the difference between the predicted vector and the chosen observation. The new samples give direct information about the current state of the system, and with them the algorithm corrects the most recent predictions:

\hat{x}^k = \bar{x}^k + K [z^k - H \bar{x}^k]    (5.16)
\hat{P}^k = \bar{P} - \bar{P} H^T (H \bar{P} H^T + R)^{-1} H \bar{P}    (5.17)

Equations 5.16 and 5.17 describe the correction step. Once we have updated x^k and P^k, the three steps described above (state prediction, hypothesis generation and selection, and state correction) are repeated for the following set of data (the data at time k+1), but using the new \hat{x}_k and \hat{P}_k, so that we correct our system.
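For the 2D constant-velocity model above, the state-prediction step (Eq. 5.10) is particularly simple; the following minimal sketch (ours) is also the prediction reused by the single target trackers below. The covariance update (Eq. 5.11) and the correction step (Eqs. 5.16-5.17) are omitted for brevity, and one time step (dt = 1 tic) is assumed:

```cpp
#include <array>

// State vector [px, py, vx, vy] and the constant-velocity prediction of
// Eq. 5.10 with F as in Eq. 5.9: the position advances by the current
// speed, and the speed stays the same.
using State = std::array<double, 4>;

State predict(const State& x) {
    return { x[0] + x[2],   // px' = px + vx
             x[1] + x[3],   // py' = py + vy
             x[2],          // vx' = vx
             x[3] };        // vy' = vy
}
```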
5.4.1.4 Limitations

Although multi target tracking yields very good results, it is computationally expensive, and it becomes unfeasible when a large number of targets is tracked. This is why, in real situations with a large number of targets, an attacker cannot use this algorithm and has to resort to simpler algorithms, getting worse results but needing less time and memory. In the next section we introduce a much simpler algorithm, the single target tracking algorithm.

5.4.2 Simple Single Tracking Algorithm

5.4.2.1 Motivation

Although the optimal option would be a multi target tracking algorithm, its complexity makes it unworkable for a large number of cars. In [33], the authors are able to evaluate the privacy risk against multi target tracking because the scenario is made up of only five cars, so it is computationally feasible. However, in the majority of the literature (e.g., [17]), the tracking algorithm performed is single target tracking.

5.4.2.2 Description

An attacker performing single target tracking can only follow one car at a time. This makes the computation much easier, but the reduction in time and memory has its consequences: the results are worse than with more complex algorithms. In the Simple Single Tracking Algorithm (SSTA) we define \hat{x}_k as the estimate of the position of a target vehicle, and \bar{x}_k as the prediction of its next position. We assume that the attacker knows the times at which the samples were released. The attacker reads the samples set by set (a set of samples as defined in Chapter 4) until the data samples are finished. Each sample in the first set, at k=0, is assigned to one \hat{x}_k, assuming these assignments are correct. After this initialization step, the algorithm works as follows. First, we read the next set of samples. Second, knowing the estimate of the previous position \hat{x}_k, we calculate the predicted position as

\bar{x}_{k+1} = F \hat{x}_k    (5.18)

where F is the matrix defined in Eq. 5.9. The last step is to find the sample closest to the predicted position; to this end we use the Euclidean metric (Eq. 5.1). When the closest sample is found, we assign it to \hat{x}_{k+1}. We repeat this procedure for all the samples. The output of this algorithm is the set of tracked paths.

5.4.2.3 Limitations

We have to be aware of the fact that we always assign one sample to \hat{x}_k (after the prediction step), even if this sample is too far away to belong to the observed target: cars have a limited speed, so their movement in a given time is limited. For this reason, when all samples are released, good tracking results are obtained; but when a privacy defense algorithm removes samples, the SSTA performs badly. The reason for this poor performance is that the algorithm always chooses a sample, even if it is too far away to be part of a possible path for that car. Thus, every time a car does not report a sample, the algorithm picks the closest one anyway, which never corresponds to that car. For example, suppose we have two cars A and B at time k, located in the city at positions S_A=(1,1) and S_B=(9,9), with the coordinates in km. If at this time k car B does not report a sample, the algorithm will choose the S_A coordinate (assuming the other cars releasing samples are further away), since there is no closer sample, even though it is impossible for car B to have moved that distance in one timestamp. In the next section we present an algorithm that mitigates this effect.

5.4.3 Distance-aware Single Tracking Algorithm

The Distance-aware Single Tracking Algorithm (DATA) works like the previous one, with one modification: when a sample is not reported, the tracking algorithm does not assign impossible samples to cars. When using DATA, after calculating the closest sample, we check whether its distance to the last known position is more than two patches (a patch is an individual square of the Netlogo grid, and each patch is equivalent to 125 meters in reality). If it is, then \hat{x}_{k+1} = \hat{x}_k, since the closest sample probably does not belong to the path of the target vehicle. In our experiments we have chosen two patches as the threshold because the maximum speed of a car is 0.56 patches/tic. This means that, in the worst case, if the closest sample is further than two patches away, the target has not reported a sample in four timestamps, and in that case we would probably have lost the track anyway.
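A sketch of one SSTA/DATA linking step, under our own simplifications (2D positions only, one target, and the predict() helper from Sect. 5.4.1.1 producing the predicted position; the 2-patch threshold enables the DATA behaviour):

```cpp
#include <cmath>
#include <limits>
#include <vector>

struct Point { double x, y; };

double dist(const Point& a, const Point& b) {
    return std::hypot(a.x - b.x, a.y - b.y);   // Euclidean metric, Eq. 5.1
}

// One linking step: pick the released sample closest to the predicted
// position. With distanceAware = true (DATA), a sample further than
// maxJump patches from the last known position is rejected and the old
// estimate is kept; with false (SSTA), the closest sample is always taken.
Point linkStep(const Point& predicted, const Point& lastEstimate,
               const std::vector<Point>& samples,
               bool distanceAware, double maxJump = 2.0) {
    Point best = lastEstimate;
    double bestDist = std::numeric_limits<double>::infinity();
    for (const Point& s : samples) {
        double d = dist(predicted, s);
        if (d < bestDist) { bestDist = d; best = s; }
    }
    if (distanceAware && dist(lastEstimate, best) > maxJump)
        return lastEstimate;   // impossible jump: keep the old estimate
    return best;
}
```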
Chapter 6

Experiments

6.1 Introduction

In this chapter we present the results of the experiments performed using Netlogo to evaluate location privacy defenses and attacks. In Sect. 6.2 we explain the experimental setup, i.e., the conditions under which the experiments were performed; this is an adaptation of the model explained in Chapter 4 to the Netlogo world. Sections 6.3, 6.4 and 6.5 show the results obtained with the three approaches to releasing samples described in Chapter 5: releasing all samples, random subsampling and the Uncertainty-aware algorithm. In each of these sections we apply the two attacks explained in Sect. 5.4.2 and Sect. 5.4.3 to the outputs of the defense algorithms and show their tracking performance. We also study the effect on our results of the different system parameters: the number of cars reporting samples (N), the total number of cars (R), the anonymity set size (k), the confusion level (β) and the sample release probability (φ). Finally, in Sect. 6.6 we compare the effectiveness of the three defenses against the tracking algorithm explained in Sect. 5.4.3.

6.2 Experimental setup

In this section we explain the selection of the experimental parameters, the scenario, the participants and their behaviour. In our experiments, the samples were obtained with Netlogo, processed with algorithms programmed in C++, and the results were plotted with Matlab.

The scenario is a 100x100 patch (the Netlogo distance unit) world. We consider that one patch is equivalent to σ meters in reality, and choose σ=125, which means that our scenario is equivalent to a region of 12.5x12.5 km2. We show the Netlogo scenario in Fig. 6.1.

Figure 6.1: Netlogo implementation of the city.

Further, we have to decide the speed of the cars in the experiments. As our experiments aim at simulating the traffic in a city, cars may drive with a speed v of at most 50 km/h (about 14 m/s, see Fig. 6.2). To achieve this, we choose the Netlogo time unit, the tic, to be 5 seconds (1 tic = 5 seconds), such that the maximum permitted speed is v=0.56 patches/tic. We recall that one tic is one step of the Netlogo simulation, i.e., one time step.

Figure 6.2: Maximum permitted speed in a city.

In our scenario, N cars drive and report GPS location samples. In our first approach to modelling the traffic, which we call no-exponential, the vehicles are initially placed at random in the city. We consider the place where they are first allocated to be their home. Once the car's home is allocated, we define a random destination, which we consider the car owner's working place (work). These cars behave as follows: they drive from home to work and vice-versa until the simulation ends, reporting a sample every tic. Contrary to [17], where most of the users worked in the same place, we choose to give cars random destinations. This corrects a potential error, since in [17] the common workplace acted as a place of confusion where tracking algorithms failed. We think that a shared workplace is not a realistic assumption, because in real life drivers have different destinations.

In [17], to obtain the N cars that report samples, the authors overlaid data obtained from the same cars on different days (thus the same paths from different days), because they could not collect data from a large number of cars. What they did was the following: for every available car (they had M < N cars available for their experiments) they collected data on different days. As they wanted to show experiments corresponding to 24-hour GPS traces, they overlaid the data from different days (choosing randomly from the 'paths' created by the cars) into a single day. This overlaying method is a limitation, since it generates many similar routes by aggregating GPS traces from the same set of drivers. It may not represent true traffic conditions, since the overlaying creates a non-existent high-density scenario where vehicles might be driven by the same driver and on the same route. To avoid these problems, we do not use an overlaying method to obtain the samples that we need: our scenario is made up of traces from N different cars. This approach provides more realistic results, and our simulation environment allows us to place as many cars as necessary.

In [24] and [17], the car trip times were exponentially distributed. However, with the model described above, the distribution of trip times (shown in Fig. 6.3a) was not exponential at all, so we changed the model.
As we would like our results to be comparable to the ones in [24] and [17], we define a new scenario, exponential, where instead of going from home to work and vice-versa, cars 'wander' randomly for a certain time τ; when τ expires, they park, remain parked for a random time κ, and then start driving again. To ensure an exponential distribution of the trip times, every τ is a sample from an exponential distribution. We do this to follow the empirical statistics from real GPS traces in Krumm's work [24], confirmed in [17]. We take as the median of the exponential distribution 14.4 minutes, as reported in [24]. Since the samples follow an exponential distribution, we can obtain its mean knowing that

mean = 1/λ    (6.1)
median = ln(2)/λ    (6.2)

We obtain that the mean is 1246.48 seconds and, as one tic corresponds to 5 seconds of real time, we have to generate samples from an exponential distribution with mean 249 tics, i.e., with parameter λ = 1/249. The resulting distribution is shown in Fig. 6.3b.

Figure 6.3: Empirical distribution of vehicle trip times. (a) no-exponential scenario; (b) exponential scenario.

In [17], where the samples used were collected from real cars, the distance error between the real and the predicted positions follows an exponential function. Figure 6.4 shows that the same error distribution is obtained in our Netlogo model, which indicates that the modelling of traffic conditions with Netlogo is close to reality.

Figure 6.4: Distance errors in tracking.

To prevent the variability between different scenarios from affecting our results, every experiment in the following sections aggregates 10 Netlogo runs in one graph. This provides more reliable results, as we are not seeing only one path of every car in a single simulation. All the results are shown as histograms representing the percentage of cars that are tracked for each percentage of their path.
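Trip durations like the τ described above can be drawn with the standard library; a minimal sketch of ours, using the parameters derived in this section:

```cpp
#include <random>

// Draw a trip duration tau, in tics, from an exponential distribution with
// mean 249 tics (median 14.4 minutes at 5 seconds per tic), as in Sect. 6.2.
double sampleTripTics(std::mt19937& rng) {
    std::exponential_distribution<double> trip(1.0 / 249.0);  // lambda = 1/mean
    return trip(rng);
}
```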
6.3 Experiments releasing all samples

In this section we show the results obtained when the proxy releases all samples and no anonymity algorithm is applied.

Figure 6.5a shows the results in a scenario where N=10, while R, the total number of cars in the scenario, is 500, and the tracking algorithm performed is SSTA (the Simple Single Tracking Algorithm, explained in Sect. 5.4.2). We observe that the percentage of tracked cars in this scenario is really high; even the cars which are not tracked along their whole path are tracked for more than 60% of it. This is a logical result, since with only 10 cars the samples from different cars are normally far from each other, which makes things easy for the tracking algorithm. However, in some cases the algorithm can be misled at intersections or crossing roads and follow another car from that confusion point.

Figure 6.5: Releasing all samples. N=10. (a) SSTA; (b) DATA.

In Fig. 6.5b we use the same scenario as before, but we apply DATA (the Distance-aware Single Tracking Algorithm, explained in Sect. 5.4.3). As we can observe, the tracking results are the same, since DATA only provides better results when samples are removed.

In Fig. 6.6a and Fig. 6.6b we show the results with more congested traffic conditions: instead of 10 cars, N is increased to 100 (and R=500). Figure 6.6a shows the results when performing the SSTA. We can observe that when the number of cars increases, the difficulty of tracking increases too, because there are more samples with which the algorithm can be 'fooled'. However, the percentage of cars tracked when releasing all samples is still high. Figure 6.6b shows the result when performing the DATA. As in the previous case, the numbers of cars and the percentages are the same, since no samples are removed.

Figure 6.6: Releasing all samples. N=100. (a) SSTA; (b) DATA.

6.4 Experiments subsampling

In this section we show the results obtained when cars do not release all their samples, but use the subsampling algorithm explained in Sect. 5.3.2.

Figure 6.7a shows the results in a scenario where N=10, R=500 and the attacker uses SSTA. The sample release probability φ is 0.8. This means that the probability of releasing each sample is 0.8, i.e., for each sample we generate a random number η, and if η < φ, the sample is released. We observe that the percentage of path tracked is really low for the majority of the cars. This might seem odd, since only 20% of the samples are removed, but it has a simple explanation: for every set of samples, the tracking algorithm chooses the sample closest to the prediction as the next sample in the path. If the car has not reported a sample in that set (that timestamp), the algorithm still assigns a sample to the car, even if it is too far away to be a possible option for that car, and from that point on the algorithm follows an incorrect track. This is the reason for the poor behaviour even with a low removal rate.

Figure 6.7: Subsampling algorithm. N=10, φ=0.8. (a) SSTA; (b) DATA.

In Fig. 6.7b, DATA is applied to track the cars. We observe a significant improvement of the results, since this algorithm fixes the problem of the SSTA when a sample is not released.

In Fig. 6.8a and Fig. 6.8b the number of cars N is increased to 100, to show the effect of a higher density in the same scenario. Figure 6.8a shows the results when performing the SSTA. We observe that the percentage of path tracked improves for some cars (between 60% and 70%), but this does not mean that having more cars is better for the tracking algorithm. Actually, it is the opposite: we also observe that the number of cars tracked for less time is higher (more than 70% of the cars are tracked for less than 10% of their path, and we also observe a decrease between 10% and 20%).
Figure 6.8: Subsampling algorithm. N=100, φ=0.8. (a) SSTA; (b) DATA.

In Fig. 6.8b we observe a big improvement in tracking success. This is the result of applying DATA instead of SSTA: with DATA, if a sample is not sufficiently close, it is not chosen, i.e., the algorithm is not forced to choose a sample; one is only chosen if it is likely to belong to the tracked path.

In the next experiment we change φ. Now, instead of releasing each sample with probability 0.8, we release it with probability 0.5, which means that on average we release only half of the samples. Figure 6.9a shows the results applying SSTA. The results are really poor, which shows that the larger the sample removal is, the worse the performance of the tracking algorithm. As we can see, more than 90% of the cars are tracked for less than 10% of their path, and only one can be tracked along the whole path. Figure 6.9b shows the result of applying DATA. Again, we can observe a big improvement compared to the previous case. There are more cars tracked for 100% of the path and, moreover, we see cars tracked for 50% (and more) of the path, which does not happen when SSTA is performed.

Figure 6.9: Subsampling algorithm. N=100, φ=0.5. (a) SSTA; (b) DATA.

6.5 Experiments Uncertainty-aware algorithm

In this section we show the results obtained when performing the Uncertainty-aware algorithm (explained in Sect. 5.3.3) to decide whether a sample can be released or not.

Figure 6.10a shows the results in a scenario where N=10 and R=500. The parameters of the Uncertainty-aware algorithm are set as follows: k=2 and β=0.4. First we apply SSTA, and we observe that the tracking results are poor. This is a logical result for two reasons. First, as the number of cars is low, few cars meet the uncertainty constraint when the algorithm is performed. The second reason is the same as for Fig. 6.7a: SSTA performs really badly when samples are removed.

In Fig. 6.10b we see that applying DATA improves the tracking results. The number of cars tracked between 0% and 10% of their path decreases, and the number of cars tracked for more than 50% increases. This is because DATA works better than SSTA when samples are removed.

Figure 6.10: Uncertainty-aware algorithm. N=10, k=2, β=0.4. (a) SSTA; (b) DATA.

In Fig. 6.11a and Fig. 6.11b we set N=100 to show the effect of more congested traffic conditions. Figure 6.11a shows the result of applying SSTA. It provides worse results (the number of cars tracked for 100% of the path decreases), but not much worse than in the 10-car scenario (Fig. 6.10a). This is because the Uncertainty-aware algorithm provides good privacy protection even in low-density scenarios (as proved in [17]), which is why the tracking results are poor.
Figure 6.11: Uncertainty-aware algorithm. N=100, k=2, β=0.4. (a) SSTA; (b) DATA.

In Fig. 6.11b we present the results of applying DATA, which are better than those of SSTA. We can see that the number of cars tracked for 100% of the path is higher, and we also observe a decrease between 0% and 10%. Even though the results improve, the improvement is smaller than in the subsampling scenario (Fig. 6.8b); we comment on this in Sect. 6.6.

In the next experiment we change k and β to k=10 and β=0.2. This experiment aims at showing the importance of the selection of these parameters in the algorithm. Figure 6.12a shows the results when performing SSTA. With a lower β, more samples are released, since their uncertainty is above the threshold imposed by β. As we can see in the figure, with the reduction of this threshold the percentages of path tracked are higher, with a significant difference compared to the situation of Fig. 6.11a. Figure 6.12b shows the result of performing DATA. The tracking results are better than in Fig. 6.12a: the number of cars tracked between 10% and 30% decreases, while the number of cars tracked between 50% and 100% increases.

Figure 6.12: Uncertainty-aware algorithm. N=100, k=10, β=0.2. (a) SSTA; (b) DATA.

6.6 Comparison of different defense algorithms

In this section we compare the results obtained when applying the different defense algorithms, and the case in which no algorithm is applied and all samples are released. We choose relevant graphics to compare them two by two when performing DATA. In these experiments, N=100. The structure of this section is the following: first, we compare releasing all samples against subsampling; second, releasing all samples against the Uncertainty-aware algorithm; finally, random subsampling against the Uncertainty-aware algorithm.

In Fig. 6.13a and Fig. 6.13b we compare the results obtained when releasing all samples and when applying the subsampling algorithm with φ=0.8. As we can see, the attacks perform much better when all samples are released. This is a logical result, since the further apart consecutive samples are reported, the more difficult it is for the tracking algorithm to link them.

In Fig. 6.14a and Fig. 6.14b we compare the results obtained when releasing all samples and when applying the Uncertainty-aware algorithm, with parameters k=2 and β=0.4.
Figure 6.13: Comparison between releasing all samples and subsampling. N=100. (a) DATA, reporting all samples; (b) DATA, φ=0.8, subsampling algorithm.

Figure 6.14: Comparison between releasing all samples and the Uncertainty-aware algorithm. N=100. (a) DATA, reporting all samples; (b) DATA, k=2 and β=0.4, Uncertainty-aware algorithm.

In Fig. 6.14a and Fig. 6.14b we can see that there is a difference between performing the Uncertainty-aware algorithm and releasing all samples. The tracking algorithm performs better when no defense algorithm is applied, since when it has all the consecutive samples it is easier to link them.

In Fig. 6.15a and Fig. 6.15b we compare the results obtained with subsampling (φ=0.8) and with the Uncertainty-aware algorithm (k=2 and β=0.4). This is an important result, since it shows that choosing when to remove samples obtains a better performance than random removal.

Figure 6.15: Comparison between subsampling and the Uncertainty-aware algorithm. N=100. (a) DATA, subsampling algorithm; (b) DATA, k=2 and β=0.4, Uncertainty-aware algorithm.

Part III: Learning algorithms

Chapter 7

Preliminaries

7.1 Introduction

Artificial Intelligence (AI) is a branch of computer science that deals with the simulation of intelligent behaviour in computers. The aim of AI is to design intelligent agents, where an intelligent agent is an entity capable of taking actions, depending on the perceived environment, that maximize its chances of success [34].

Thinking machines and artificial beings appeared already in the Greek myths. All along history, writers and thinkers have written about these topics, wondering whether an intelligent machine (where intelligence usually meant the capacity to act like a human) could be created, and if so, whether such a machine could feel. However, it was not until the middle of the 20th century that scientists began to design and build intelligent machines, based on new discoveries and on the improvement of computer technology. In fact, artificial intelligence as we know it today was born at the conference held on the campus of Dartmouth College in 1956. Since then, the field has gone through very optimistic times (even expecting that a robot could ultimately be a human), disappointing times, etc., but today it is an important part of technology and industry, in continuous development. Artificial Intelligence has many application fields, including medical diagnosis, robot control, video games, etc., and its importance grows every day.

7.1.1 Intelligence

One of the main problems of the Artificial Intelligence field has been the definition of, and agreement on, what intelligence is. In fact, there are several definitions and tests of a machine's ability to demonstrate intelligence, for example the Turing test [35]. This test consists of the following: a human judge has a written conversation with a machine and a human being, but these two are isolated and the judge cannot see them.
If after the conversation the judge is not able to determine with certainty which was the human and which was the machine, the machine is said to be intelligent.

Figure 7.1: Turing test [36].

Nowadays, however, there is a universally accepted definition of intelligence. An agent is said to be intelligent if it has the following properties: autonomy, social ability, reactivity and pro-activeness.

An agent has autonomy if it exercises control over its own actions and state; that is, it does not need the intervention of other beings to act. This is the most important property of an agent. In Fig. 7.2 we can see that an agent receives some inputs, processes them, and acts, changing the environment.

Figure 7.2: Agent action in an environment.

Social ability means that an agent interacts with other agents via an agent communication language. This property also refers to possible communication with humans.

A reactive agent is one that can perceive events in the environment and react to them in the correct fashion.

The last quality is pro-activeness. We do not want an agent simply to interact with the environment and be driven solely by reaction to events; we want it to show goal-directed behaviour, i.e., to generate goals and attempt to achieve them.

If an agent has these properties, it is said to be intelligent. There are, however, other qualities that are not required but are desirable, for example mobility (the capacity to move), veracity (not communicating false information), rationality (taking actions which, given the agent's knowledge of the environment, maximize its chances of success) and adaptability (changing behaviour in response to changes in the environment).

7.1.2 Subfields, tools and applications

AI is divided into different subfields and methods to solve the different problems we can find in this area. Some general methods are the following: Search and Optimization, which includes search algorithms, optimization and evolutionary computation; the logic field, mainly Logic Programming and Automated Reasoning; and probabilistic methods for uncertain reasoning, which include for example Bayesian networks, Hidden Markov Models and Kalman filters. Another important subfield (explained in more detail in Sect. 7.4) is statistical learning methods. Finally, we also find Neural Networks, which are explained in Sect. 7.3. These last two fields are covered in detail because we use their algorithms in our experiments.

A simpler classification is the following: there are non-learning intelligent agents and learning intelligent agents. In this Master Thesis we are interested in the second type. Figure 7.3 and Fig. 7.4 show the difference between a non-learning agent and a learning agent. The non-learning agent receives an input and simply generates an output, since its behaviour is already defined. The learning agent, however, processes the input and empirically learns how to act. In the following sections we introduce the learning concept, as well as different algorithms and learning techniques.

7.2 Multi-agent Learning

As we can see in Fig. 7.5, Multi-agent Learning is the intersection of Multi-agent Systems and Machine Learning [37]. In this setting, agents learn and adapt in the presence of other agents that are simultaneously learning and adapting.
What is more, agents can also learn even if they are the only ones learning; there is still interaction with the other agents while learning.

Figure 7.3: Non-learning agent. [The agent observes the environment through sensors, decides an action, and acts through actuators.]

Figure 7.4: Learning agent. [A critic compares the sensed feedback against a performance standard; the learning element changes the performance element, and a problem generator proposes new experiments.]

Figure 7.5: Multi-agent Learning: the intersection of Multi-agent Systems and Machine Learning.

7.3 Artificial Neural Networks

Artificial Neural Networks [38] are composed of interconnected artificial neurons: mathematical functions or programming constructs that act as abstractions of biological neurons. Put another way, neurons are simple nodes connected to form a network; these nodes are simple processing elements, although together they can exhibit global behaviour. An artificial neuron receives some inputs and sums them to produce an output (Fig. 7.6).

Figure 7.6: Simple scheme of the nodes of a Neural Network.

Artificial Neural Networks (referred to from now on as Neural Networks) are used either to understand biological neural networks or to solve artificial intelligence problems without creating a model of a real biological system. The network has to be trained; the three major learning paradigms to do this are Supervised Learning and Unsupervised Learning (both explained in Sect. 7.4), and Reinforcement Learning (Sect. 7.5). These networks are useful because they can be used to infer a function from observations, also in domains where it is difficult to design a function by hand because the data or the tasks are too complex. This is why Neural Networks are used in several areas, for example game playing and decision making, pattern recognition, and system identification and control.
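As an illustration of this weighted-sum behaviour, a minimal sketch of a single artificial neuron follows (not from the thesis; the sigmoid activation and the example values are illustrative assumptions):

    import math

    def neuron(inputs, weights, bias=0.0):
        # Weighted sum of the inputs, squashed by a sigmoid into (0, 1).
        s = sum(x * w for x, w in zip(inputs, weights)) + bias
        return 1.0 / (1.0 + math.exp(-s))

    # Example: three inputs with toy weights.
    print(neuron([0.5, -1.0, 2.0], [0.4, 0.3, 0.1]))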
7.4 Machine Learning

Machine Learning is a subfield of Artificial Intelligence that deals with the capacity of machines to learn from the actions they have performed or from the environment [39]. As with intelligence, there is no global definition of the word learning. As a guideline, however, we can say that a machine learns whenever it changes its structure, program or data (based on inputs or in response to external information) in a way that improves its performance. There are many reasons why a machine should learn, for example changes of the environment over time, or the need to extract relationships from large amounts of data (data mining).

There are several Machine Learning algorithms. One possible classification is based on the desired output of the algorithm. Among them we have Supervised Learning, Unsupervised Learning and Semi-supervised Learning. In these three the goal is to find a function that maps inputs to desired outputs; the difference is that in the first there are labelled examples (input-output examples of the function), in the second there are no examples, and in the third there are both labelled and unlabelled examples. A fourth approach is Transduction, which is similar to Supervised Learning but does not construct a function. Finally, with Reinforcement Learning (explained in more detail in Sect. 7.5) the agent observes the world and learns a policy of how to act in a given situation.

7.5 Reinforcement Learning

Reinforcement Learning (RL) is a sub-area of Machine Learning in which the goal of the agent is to maximize the long-term reward. The problem RL tries to solve is which action the agent ought to take in a given environment, i.e., to find a policy that defines which action the agent must take in each state.

Nowadays there are many unsolved problems, for example flight control systems or automated manufacturing systems. These problems are not unsolved because we need a faster or better computer; the problem is determining what the program should do. They could be solved if a computer could learn solutions through trial and error. That is what Reinforcement Learning aims at: solving the problem of an agent that must learn a behaviour through trial-and-error interactions with a dynamic environment.

There are two main strategies for solving Reinforcement Learning problems. The first approach is used in genetic algorithms and genetic programming: it consists of searching the space of behaviours to find one that performs well in the environment. The second approach is to use statistical techniques and Dynamic Programming. In this case, Reinforcement Learning is a mix of two disciplines, Dynamic Programming [40] and Supervised Learning [41], and it manages to solve problems that neither of the other two can solve individually [42]. Finally, the capability of yielding powerful machine-learning systems, its generality, and the fact that we 'only' have to give the computer a goal to achieve have made Reinforcement Learning a really appealing approach for researchers, for example in robot rescue.

7.5.1 Environment and reinforcement function

In the RL system model, an agent is in a dynamic environment, and the agent observes this environment (at least partially) through sensors, readers, etc. In most problems it is assumed that the agent perceives the exact state of the environment. In every state (S) the agent chooses an action (A) and generates an output. This action changes the state of the environment, and the agent is rewarded with a scalar number named the reinforcement signal (R). This reward is negative or positive depending on the result of the performed action. The goal of the agents is to perform actions that maximize the sum of reinforcements received. Thus a reward function (R) must be defined that rewards some actions while punishing others. Using the reward function, the agent has to find a policy π which determines which action should be performed in each state. The optimal policy is the mapping from states to actions that maximizes the sum of the reinforcements.

7.5.2 Future rewards

To maximize the long-term reward, an agent has to take the future into account for the decision it is making now. There are three main models to do this [43]: the finite-horizon model, the infinite-horizon discounted model and the average-reward model. With the finite-horizon model, at every step an agent has to optimize its expected reward for the next n steps (see Eq. 7.1):

E(\sum_{t=0}^{n} r_t)    (7.1)

where E denotes the expectation, t indexes the steps and r_t is a scalar reward.
The infinite-horizon discounted model takes all the long-run reward into account, but discounts rewards received in the future with a discount factor 0 ≤ γ ≤ 1 (see Eq. 7.2):

E(\sum_{t=0}^{\infty} \gamma^t r_t)    (7.2)

where again E denotes the expectation, t indexes the steps and r_t is a scalar reward.

Finally, in the average-reward model the agent aims at taking actions that maximize the long-run average reward (see Eq. 7.3). It is also known as a gain-optimal policy. The problem with it is that we cannot distinguish between two policies when one gains a lot in the initial phases and the other does not (because the long-run average hides the early gains). This is solved by generalizing the model (the bias-optimal model), where a policy is preferred if it maximizes the long-run average:

\lim_{h \to \infty} E((1/h) \sum_{t=0}^{h} \gamma^t r_t)    (7.3)

7.5.3 Markov Decision Process

In Reinforcement Learning, the environment is typically formulated as a finite-state Markov Decision Process (MDP). An MDP consists of four objects: S, the state space; A, the set of actions; P_a(s, s'), the probability of making a transition from state s to state s' when performing action a; and R_a(s, s'), the immediate reward the agent receives after going from state s to s' with transition probability P_a(s, s').

In the Reinforcement Learning problem, the agent's actions determine its next reward and the next state of the environment. In delayed reinforcement [43], for example, the agent receives little reinforcement when taking some actions, but it finally arrives at a state with a high reward. In this case the agent learns which actions to take, even though it does not receive a big reward until the end. Using the infinite-horizon model, we want to maximize the cumulative function of the rewards (see Eq. 7.4):

E(\sum_{t=0}^{\infty} \gamma^t R_{a_t}(s_t, s_{t+1}))    (7.4)

where γ^t is the discount factor and R_{a_t}(s_t, s_{t+1}) is the reward obtained when performing action a_t and going from state s_t to state s_{t+1}.

We now have to define a solution for the MDP. This solution can be expressed as a policy π, which maps states to action choices (i.e., which action must be chosen in every state). We show the optimal solution for the infinite horizon, although an optimal solution can also be obtained with the finite-horizon function; with the infinite one, however, an optimal deterministic stationary policy exists [44]. With π the decision policy, the optimal value of a state is the expected sum of rewards that an agent gains if it executes the optimal policy starting from that state:

V(s) = \max_{\pi} E(\sum_{t=0}^{\infty} \gamma^t r_t)    (7.5)

V* is the optimal value function and it is unique. With this function, using the best available action, in state s the agent gets the expected instantaneous reward plus the expected discounted value of the following state. The optimal value function is defined as the solution to the equations:

V^*(s) = \max_a (R(s,a) + \gamma \sum_{s' \in S} P_a(s,s') V^*(s')),  \forall s \in S    (7.6)

Knowing the optimal value function, we define the optimal policy:

\pi^*(s) = \arg\max_a (R(s,a) + \gamma \sum_{s' \in S} P_a(s,s') V^*(s')),  \forall s \in S    (7.7)

There are two ways to find an optimal policy. One is to find the optimal value function, which can be determined iteratively with the value iteration algorithm. The other is the policy iteration algorithm which, instead of finding the optimal policy via the optimal value function, manipulates the policy directly.
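As an illustration, a minimal sketch of value iteration follows (not the thesis code). It iterates Eq. 7.6 until the value function stabilizes and then extracts the greedy policy of Eq. 7.7; the dictionary layout of P and R is a hypothetical toy representation:

    def value_iteration(S, A, P, R, gamma=0.9, eps=1e-6):
        # P[a][s][s2] is the transition probability P_a(s, s'); R[s][a] the reward.
        V = {s: 0.0 for s in S}
        while True:
            # One Bellman backup per state (Eq. 7.6)
            V_new = {s: max(R[s][a] + gamma * sum(P[a][s][s2] * V[s2] for s2 in S)
                            for a in A)
                     for s in S}
            if max(abs(V_new[s] - V[s]) for s in S) < eps:
                break
            V = V_new
        # Greedy policy extracted from the optimal value function (Eq. 7.7)
        policy = {s: max(A, key=lambda a: R[s][a] +
                         gamma * sum(P[a][s][s2] * V_new[s2] for s2 in S))
                  for s in S}
        return V_new, policy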
7.6 Predator-prey problem

The predator-prey problem [45] is a challenging problem in the field of distributed artificial intelligence. It is used as a generic multi-agent testbed because it can illustrate different multi-agent scenarios, while remaining a 'toy' problem for trying new algorithms and making concepts concrete. Although it is not enough when we want a complex real-world domain, such a more complex world can be built upon it.

There are two kinds of participants in this problem: predators and preys. They move in a scenario formed by a discrete grid of squares (see Fig. 7.7). The goal of the predators is to capture the prey, which means being in the same grid square as the prey or surrounding it (Fig. 7.8). The goal of the prey, on the other hand, is not to be captured. Predators and preys can only move to adjacent squares, and turns are defined such that they can only move in their turn.

Figure 7.7: Predator-prey model. [Grid showing a predator, a prey and the possible movements.]

Figure 7.8: Capture of a prey by surrounding.

To measure the 'quality' of an algorithm, the predators run the developed MAS algorithm and we observe the results of the simulations (how many times the predators capture the preys, the time they need to do so, etc.). As for the prey, it can also use an algorithm to try to escape from the predators, or simply move randomly.

Everything described above is just the basic model, with one prey and four predators. The predator-prey problem also has variants that adapt it to different needs: we can change the definition of capture, the size and shape of the world, etc.

Chapter 8
System Model

8.1 Introduction

In this chapter we explain our model, named SinCity. We describe the scenario (a city), the participants and a new pursuit problem, which is a more complex version of the predator-prey problem. This chapter is structured as follows: in Sect. 8.2.1 we explain the previously existing city models. In Sect. 8.2.2 we explain the improvements we made to those scenarios to obtain a more realistic one. Finally, in Sect. 8.3 we explain the participants of the scenario.

8.2 Scenario

8.2.1 Previous models

The first approach to model vehicular traffic, included in the NetLogo distribution [2], is Traffic Basic [46] (Fig. 8.1). It models the movement of cars on a highway. Each car follows a simple set of rules: it slows down (decelerates) if it sees a car close ahead, and it speeds up (accelerates) if it does not see a car ahead. It demonstrates how traffic jams can form even without any 'centralized cause'.

Figure 8.1: Traffic Basic model.

Using the movement of the cars in the previous model, a small city with traffic lights is modeled in Traffic Grid [47] (Fig. 8.2), also included in the NetLogo distribution. It consists of an abstract traffic grid with intersections between cyclic single-lane arteries of two types: vertical and horizontal. It is possible to control the traffic lights, the cars' speed limit and the number of cars, creating a real-time traffic simulation. This allows the user to explore traffic dynamics, develop strategies to improve traffic, and understand the different ways of measuring traffic quality.

Figure 8.2: Traffic Grid model.

Using the Traffic Grid model as a starting point, a more complex model called Self-Organizing Traffic Lights (SOTL) is presented in [48]. Cars flow in a straight line, eastbound or southbound by default.
Each crossroad has traffic lights that only allow cars to move along one of the intersecting arteries with a green light. Yellow or red lights stop the traffic. The light sequence for a given artery is green-yellow-red-green. Cars simply try to drive at a maximum speed of one patch per time step, but they stop when a car or a red or yellow light is in front of them. A patch is a square of the environment with the size of a car. Time is discrete, but space is continuous. The environment is shown in Fig. 8.3. The user can change different parameters, such as the number of arteries or cars.

Figure 8.3: SOTL model.

This model shows different statistics: the number of stopped cars, their average speed, and their average waiting times. In this scenario, three self-organizing methods for traffic-light control outperform traditional ones, since the agents are 'aware' of changes in their environment and can therefore adapt to new situations.

8.2.2 City improvements: our model

Our model is a more realistic city scenario. The main agents in this model are:

- Intersections: agentset containing the patches that are intersections of two roads.
- Controllers: agentset containing the intersections that control traffic lights. Controllers occupy only one patch per intersection.
- Roads: agentset containing the patches forming roads. There are four sub-agentsets, depending on whether the road is southbound, northbound, eastbound or westbound.
- Buildings: agentset containing the patches forming buildings.
- Exits: agentset containing the patches where cars leave the simulation.
- Gates: agentset containing the patches where cars sprout from.

Given the model of city traffic described in Sect. 8.2.1, we have made some improvements to represent a more realistic scenario. In the previous model the roads have a single lane and direction, and there are only two directions by default, south and east, although they can be extended to four by adding north and west. We have added the possibility of bidirectional roads and of roads with two lanes in the same direction.

Also, the previous model used a torus by default, which means that when a car going from west to east reaches the eastern end of the scenario, the same car reappears at the western border. To increase realism, we remove the torus, impose four directions (north, east, south and west), and create a by-pass road, the outermost in the scenario, to improve traffic.

We have also changed the car creation and elimination scheme. In our model, for every car we define a source (a random road patch) and a destination (another random road patch), such that every car is created at its source and moves (following the shortest path) to its destination, where it is eliminated. The sources and destinations may lie outside the world, depending on the value of two sliders called origin-out and destination-out, leading to some cars appearing and disappearing at the borders of the world.

We have modified the control methods so that, instead of just one yellow-light cycle, there are now as many yellow-light cycles as patches in the intersection; i.e., if the traffic light protects a bidirectional road with two lanes in each direction, there will be four yellow-light cycles.

In order to correct deadlocks at the intersections, a deadlock algorithm has been implemented.
If a given car at an intersection has not moved after a given time, it tries to change direction in order to keep moving and leave the deadlock. This movement affects other cars and helps end the current deadlock.

Due to all these improvements, especially the possibility of an origin and a destination and of bidirectional roads, a more complex algorithm to guide the cars is needed. In previous models, a car only changes road or direction depending on a probability prob-turn. In our model, whenever a car is on a patch that is an intersection (it belongs simultaneously to a horizontal and a vertical road), it runs a guiding algorithm, before moving on, to decide whether a change of direction is necessary. This algorithm works as follows: first it checks in which direction the car is moving and the possible directions the car can follow at the intersection. Then it checks where the destination is and, with all this information, it chooses the way that is most likely to be the shortest. For example, if a car is moving north, at an intersection it can move north or east, and the destination lies to the southeast, the algorithm decides to go east. If a car is not at an intersection, it keeps the same direction until the next intersection. (A minimal sketch of this decision rule appears later in this chapter.)

As seen in Fig. 8.4, with these changes we obtain a more realistic scenario in which we can notice the different widths of the streets (depending on whether they are bidirectional and single- or dual-lane), the distribution of the traffic lights, and the by-pass road surrounding the city.

Figure 8.4: SinCity model.

8.3 Participants

Our city simulation is an extension of the predator-prey pursuit problem (explained in Sect. 7.6), where the prey is replaced by a thief car and the predators by a set of police cars. The goal of the thief is to rob the bank and arrive safely at its hideout, while the police cars have to catch it after it robs the bank.

In every challenge, the thief car starts driving at normal speed to a city bank. It stops in front of the bank, commits the theft and gets away to its hideout at double speed. Police cars, on the other hand, patrol the city at random before the theft is done. When the thief robs the bank an alarm is triggered, and police cars double their speed and patrol the city trying to identify the thief's car. The chase begins when any police car sees the thief, before it arrives at its hideout, on the same road and at a distance of two blocks or less. If the thief's car is seen, all police cars know its position; if it is lost, they keep the position where the thief was last seen as their target. If any police car arrives at that point and the thief car is not in sight, the chase stops (preventing the other police cars from going to that place) and the patrol continues.

We consider that the thief is captured if it is surrounded by police cars on a road (two police cars) or at an intersection (four police cars). We consider that the thief escapes when it reaches its hideout and enters it without being seen by any police car (we call this event 'thief wins'). Note that during the chase the thief does not go to its hideout; instead it tries to escape from the police. Besides, there are other cars in the city that cause the thief or police cars to reduce their speed during the pursuit, from double to normal, when such a car is in front of them.
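Returning to the guiding algorithm of Sect. 8.2.2, a minimal sketch of the direction choice follows (hypothetical names and a Manhattan-distance heuristic; the actual NetLogo code may differ):

    def choose_direction(pos, dest, allowed):
        # pos, dest: (x, y) grid coordinates; allowed: open directions here.
        step = {'north': (0, 1), 'south': (0, -1), 'east': (1, 0), 'west': (-1, 0)}
        def dist_after(d):
            # Manhattan distance to the destination after taking one step in d
            nx, ny = pos[0] + step[d][0], pos[1] + step[d][1]
            return abs(dest[0] - nx) + abs(dest[1] - ny)
        return min(allowed, key=dist_after)

    # The text's example: heading north, the car may go north or east, and the
    # destination lies to the southeast, so the sketch chooses east.
    print(choose_direction((0, 0), (5, -3), ['north', 'east']))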
Fig. 8.4 shows a snapshot of the SinCity map and data, with the thief car highlighted. The bank is marked in red at the upper right and the hideout in green at the center. In every challenge the positions of the bank and the hideout change, but they must be at a distance greater than 25% of the map size. At the bottom and the right of the map we can see a graphic display with some outputs of the simulation. We can also configure several parameters related to the algorithms explained in Chapter 9.

Chapter 9
Algorithms

9.1 Introduction

In all Artificial Intelligence subfields, many algorithms have been developed to try to solve the problems that agents face. In this chapter we focus on four of them. In Sect. 9.2 we explain the Korff algorithm, a non-learning algorithm that is an important solution to the predator-prey problem on which our model is based. In Sect. 9.3 we explain the Self-Organizing Map (SOM), a Neural Network algorithm. Finally, in Sect. 9.4 and Sect. 9.5 we explain two Reinforcement Learning algorithms: Learning Automata and Q-learning.

9.2 Korff Algorithm

Korff developed this algorithm to solve the predator-prey problem (described in Sect. 7.6). The algorithm is a simple non-learning algorithm [49], but a really effective one. His solution requires only sensing (meaning that an agent can perceive the surrounding environment) and action from preys and predators. He considered that the predators are attracted towards the prey. The force of this attraction can be computed with Eq. 9.1, where d is the distance from the predator to the prey and S is the score function. The predator chooses the neighbouring cell with the highest score.

S = d(prey)    (9.1)

However, with this solution the predators do not surround the prey, because they pile up and get in each other's way. The solution is to consider repulsion between wolves. This repulsion appears in the second term of Eq. 9.2, where k is a constant that models the repulsive force between wolves. In this variant, the predator again moves to the neighbouring cell with the highest score.

S = d(prey) - k · d(predator to predator)    (9.2)

With this new approach the predators do not disturb each other, and they finally surround the prey. As for the prey, in both approaches its strategy is to move to the neighbouring cell farthest from the predators. If the prey is faster than the wolves, this allows the prey to escape [50]. One limitation of Korff's algorithm is that it assumes the predators can see the whole environment; when this is not true, their performance worsens. However, it works well when predators have a complete view of the field.
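A minimal sketch of this greedy rule follows (not Korff's original code; the sign convention is one consistent reading of Eqs. 9.1 and 9.2, with attraction pulling the predator towards the prey and the k term keeping predators apart):

    def manhattan(a, b):
        return abs(a[0] - b[0]) + abs(a[1] - b[1])

    def score(cell, prey, other_predators, k=0.5):
        # Attraction to the prey plus a repulsion bonus for cells that keep
        # some distance from the nearest fellow predator.
        repulsion = min(manhattan(cell, p) for p in other_predators) \
            if other_predators else 0
        return -manhattan(cell, prey) + k * repulsion

    def best_move(neighbour_cells, prey, other_predators, k=0.5):
        # The predator moves to the neighbouring cell with the highest score.
        return max(neighbour_cells,
                   key=lambda c: score(c, prey, other_predators, k))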
9.3 Self-Organizing Maps Algorithm

Self-Organizing Maps (SOMs) [51] are a data visualization technique that reduces the dimensionality of data through the use of self-organizing Neural Networks. The algorithm is a type of Artificial Neural Network trained using unsupervised learning to produce a low-dimensional (typically two-dimensional) discretized representation of the training samples, called a map. A Self-Organizing Map consists of components called neurons, each with an associated weight vector, and it describes a mapping from a higher-dimensional input space to a lower-dimensional map space. This means that the algorithm aims at placing each data vector on the map by finding the neuron whose weight vector is closest to the vector taken from the data space, and assigning the map coordinates of this node to the data vector. The neuron closest to the data is called the BMU (best matching unit).

In SOM, the goal of learning is that different parts of the network respond similarly to certain input patterns. The SOM algorithm adjusts the weights of the BMU and of the neurons close to it, using the update formula shown in Eq. 9.3:

W_n(t+1) = W_n(t) + \Theta(t) \alpha(t) (I(t) - W_n(t))    (9.3)

where W_n(t) is the weight vector of neuron n, α(t) is a monotonically decreasing learning coefficient and I(t) is the input data vector. The neighbourhood Θ(t) depends on the lattice distance between the winner (closest) neuron and neuron n. The magnitude of change decreases with time and with distance from the BMU.
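As an illustration, a minimal sketch of one SOM training step following Eq. 9.3 is given below (not the thesis implementation; the Gaussian neighbourhood and the decay schedule are illustrative assumptions):

    import numpy as np

    def som_step(weights, x, t, alpha0=0.1, sigma=1.0):
        # weights: (rows, cols, dim) float lattice; x: input vector of length dim.
        dists = np.linalg.norm(weights - x, axis=2)
        bmu = np.unravel_index(np.argmin(dists), dists.shape)  # best matching unit
        alpha = alpha0 / (1.0 + 0.01 * t)                      # decaying rate
        rows, cols = np.indices(dists.shape)
        lattice_d2 = (rows - bmu[0]) ** 2 + (cols - bmu[1]) ** 2
        theta = np.exp(-lattice_d2 / (2.0 * sigma ** 2))       # neighbourhood
        weights += theta[..., None] * alpha * (x - weights)    # Eq. 9.3
        return bmu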
9.4 Learning Automata

Learning Automata (LA) is a type of Reinforcement Learning algorithm and a branch of the theory of adaptive control. Originally it was described as a finite-state automaton [52], but afterwards a probability distribution was adopted to describe the internal state of the agent; according to the probabilities given by this distribution, the agent chooses its actions. For each agent a matrix (Ψ) is defined which contains the probability vector of each state. The items of the probability vector contain the probability of taking a given action in a given state. Thus, if for example an agent has three possible states and can perform four actions, the matrix is as in Eq. 9.4, where P(s_i, x) represents the probability of taking action x in state s_i; each row of the matrix is the probability vector of one state. The probability vectors are updated and adjusted depending on previous successes and failures; this is how the agents learn which actions to perform.

\Psi = \begin{pmatrix} P(s_1,a) & P(s_1,b) & P(s_1,c) & P(s_1,d) \\ P(s_2,a) & P(s_2,b) & P(s_2,c) & P(s_2,d) \\ P(s_3,a) & P(s_3,b) & P(s_3,c) & P(s_3,d) \end{pmatrix}    (9.4)

Learning Automata uses two very simple rules, shown in Eq. 9.5 and Eq. 9.6:

P(s,a) = P(s,a) + \alpha (1 - P(s,a))    (9.5)

P(s,b) = (1 - \alpha) P(s,b)  for b \neq a    (9.6)

where P(s,a) is the probability that the agent takes action a in state s and α is a small learning factor. We only apply these equations when a performed action succeeds. This way we increase the probability of taking action a in state s, while we punish (decrease the probability of) the rest of the actions for that state. The algorithm always converges, giving us a vector of zeros and a single one, which marks the 'winner' action. However, it can converge to an incorrect action; we can reduce this probability by making α small [43].

9.5 Q-learning

Q-learning (QL) [53] is also a Reinforcement Learning algorithm. It is used in Markovian domains and provides the agents with the capability of learning to act optimally through the experience of the consequences of their actions. The algorithm makes the agent learn a mapping that represents the best action to perform in every possible state. In Q-learning, the learned decision policy is determined by the value function Q(s,a) (Eq. 9.8). This function is used to update the components of a matrix (Γ), defined for each agent, which contains state vectors, each holding the Q value of each possible action. We can see this matrix in Eq. 9.7, where each row is a state vector and Q(s_i, x) represents the Q value of performing action x in state s_i. The higher the Q value, the more likely the agent is to perform that action, because it is the best option in that state. In the end we have a mapping between states and actions, given by the highest Q value of each state vector.

\Gamma = \begin{pmatrix} Q(s_1,a) & Q(s_1,b) & Q(s_1,c) & Q(s_1,d) \\ Q(s_2,a) & Q(s_2,b) & Q(s_2,c) & Q(s_2,d) \\ Q(s_3,a) & Q(s_3,b) & Q(s_3,c) & Q(s_3,d) \end{pmatrix}    (9.7)

Given a state s, an agent has a series of actions that it can perform, each taking the agent to a next state s'. The update rule is:

Q(s,a) = Q(s,a) + \alpha (R(s) + \gamma \max_{a'} Q(s',a') - Q(s,a))    (9.8)

In Eq. 9.8, s is the current state, s' is the next state, Q(s,a) is the value of the matrix for that state and action, Q(s',a') is the value for the next state and action, and α is the learning rate (0 ≤ α ≤ 1). The smaller this factor, the less the agent learns, while if it is close to one the agent takes only the most recent information into account. R(s) is the reward and γ is the discount factor, which determines the importance of future rewards (0 ≤ γ ≤ 1): the smaller this parameter, the more importance the agent gives to current rewards, while if it is close to one it looks for long-term reward. Finally, max_{a'} Q(s',a') is the maximum Q value obtainable in the following state; together, these last terms form the expected discounted reward. Note that we have to assign initial values to the Q matrix before learning starts. After that, new values are calculated and the agent obtains the best action for every state, namely the maximum Q(s,a) for the given s.
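To make the two update rules concrete, minimal sketches follow (not the thesis code); P[s][a] and Q[s][a] are hypothetical dict-of-dicts tables for the action probabilities and Q values:

    def la_update(P, s, a, alpha=0.1):
        # Applied only after action a succeeds in state s.
        for b in P[s]:
            P[s][b] *= (1.0 - alpha)   # Eq. 9.6 for every action...
        P[s][a] += alpha               # ...so a ends at P + alpha*(1 - P), Eq. 9.5

    def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.2):
        # Eq. 9.8: move Q(s, a) towards r + gamma * max_a' Q(s', a').
        Q[s][a] += alpha * (r + gamma * max(Q[s_next].values()) - Q[s][a])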
Chapter 10
Experiments

10.1 Introduction

This chapter presents the results of the experiments, with which we also obtained a publication, the paper SinCity: a pedagogical testbed for checking Multi-agent Learning techniques [1]. We show and explain the results of our experiments testing the performance of different Learning Algorithms when applied to the pursuit problem in a city. In Sect. 10.2 we explain the experimental setup, i.e., the conditions and parameters under which the experiments were performed. In Sect. 10.3, Sect. 10.4, Sect. 10.5 and Sect. 10.6 we present the results of the simulations; the title of each section refers to the algorithm used by the thief, e.g., in Sect. 10.3 we present the results when the thief uses the Korff algorithm and the police use LA, QL and Korff. We remark that in our experiments SOM is only used by the thief when it escapes from the police. Finally, in Sect. 10.7, we compare and comment on the results.

10.2 Experimental setup

For implementing the learning techniques described in the previous chapter we took several decisions. First, police cars and the thief car make decisions about which road to take only at intersections. This reduces the number of states of the system and speeds up the simulations. The thief uses two different learning systems: the first to go from a particular location to the hideout when there is no police car in sight, the other when it escapes from the police. Police cars have only one learning system, used to go from their present location to a destination, for instance to pursue the thief during the chase. However, while the bank has not been robbed, the police cars patrol the city randomly, and the thief car uses the guidance algorithm (explained in Sect. 8.2.2) to arrive at the bank.

To compute the state (s) of an agent we proceed as follows. First we define a sub-state (∆) depending on the possible actions that an agent can perform. In our case, the possible actions are moving north, south, east or west; these possibilities depend on the allowed road directions. As an intersection has four possible exits, we have P = 16 - 1 = 15 possibilities for this sub-state, depending on whether each road is blocked or not (it is 15 because we do not consider the case where all roads are blocked; Fig. 10.1 shows two possible sub-states). We assign a number to each of these possible sub-states; for example, if a car is at an intersection where it can only move west, ∆=1. Secondly, we define a second sub-state (Υ), which is the position of the target with respect to the agent, i.e., whether the target is more to the north, more to the south, more to the southeast, etc. For the police the target is the thief; for the thief the target can be one police car, more than one police car, or the hideout. We assign a number to each of these possibilities and denote by W the number of possible locations for the target. The total number of states is T = P·W, and to compute the agent's current state we take s = ∆·Υ. (A minimal sketch of this encoding is given at the end of this section.)

Figure 10.1: Two different combinations of possible directions (sub-states). (a) The car can go in four different directions; (b) the car can go in two different directions.

For the LA and QL techniques, when the thief car goes to the hideout or the police cars chase the thief, P=15 and W=8, so we have T=120 input states for LA and QL. In the case of the thief using LA during the chase, we consider only the closest police car; therefore W=4 and P=15, which makes T=60 input states. For the QL case, we consider the discrete distance in blocks (0, 1 or 2) of every police car when it is closer than two blocks; then we have W=81 inputs and P=15, so there are T=1215 states. Finally, the SOM neural network is only used by the thief during the chase. First we identify the type of intersection (P=15 possibilities) and for each one we set a SOM with 4 real-valued inputs (the 4 directions), each a number describing the exact distance to a police car (if any is less than 2 blocks away, or zero if there is none). We use a lattice of 16 x 16 = 256 neurons; therefore we have 256 x 15 = 3840 neurons with 4 inputs each. The output of every neuron is one of the four possible roads to take; it is based on a probability distribution over those possible exits, trained as in the LA case. Table 10.1 summarizes the number of states.

          Korff   QL     LA    SOM
  Target  -       120    120   3840
  Chase   -       1215   60    3840
Table 10.1: Number of states for each algorithm and each learning mode. The chase mode is only used by the thief.

        QL      LA    SOM
  α     0.1     0.1   0.1
  R(s)  ±0.25   -     -
  γ     0.2     -     -
  Θ     -       -     1/(1+0.01·t)
Table 10.2: Parameters for the different Learning Algorithms.

In the following sections we compare the results of the learning techniques over one run. We call a run a set of challenges, where a challenge spans from the moment the cars are allocated until either the thief or the police wins. A run stops when the average standard deviation of the thief wins (a thief car wins when it reaches the hideout and enters it without being seen) over the last 500 challenges is lower than 3%, provided that at least 1000 challenges have taken place.
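A minimal sketch of this encoding follows (hypothetical helper names, not the thesis code). The exit sub-state is treated as a 4-bit open/blocked mask, which gives the 16 - 1 = 15 possibilities mentioned above, and the two sub-states are combined as s = ∆·Υ, exactly as defined in the text:

    def exit_substate(north, east, south, west):
        # Each flag is 1 if the corresponding road is open, 0 if blocked.
        mask = (north << 3) | (east << 2) | (south << 1) | west
        assert mask != 0, "the all-blocked case is excluded"
        return mask  # delta in 1..15

    def agent_state(delta, upsilon):
        # The thesis combines the two sub-states multiplicatively.
        return delta * upsilon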
The learning parameters (shown in Table 10.2) used in the simulations are α=0.1 for LA, QL and the probability distribution of every neuron in the SOM. Besides, in QL we set R(s)=±0.25 and γ=0.2. Finally, for the SOM case we have α(t) = 1/(1 + 0.01·t) and Θ(t) = 1/(1 + 0.01·t), where t is the learning iteration. Of course, as there are several SOMs, each one has its own α(t) and Θ(t) parameters. We note that agents using Korff's algorithm do not learn; this algorithm is used as a baseline against which to compare the learning algorithms.

10.3 Korff

In this section we present the results obtained when the thief performs the Korff algorithm, while the police cars use QL, LA and Korff. We perform two types of experiments. In the first, the size of the scenario is 5x5 and the number of police cars is 2, 3 or 4; the results are shown in Table 10.3. In the second experiment we use a more challenging 10x10 patches scenario and change the number of police cars to 2, 4 and 6; the results are shown in Table 10.4. For an easy comparison, we also offer a graphical representation of the results in Fig. 10.2a and Fig. 10.2b.

  Thief algorithm   Police algorithm   Police cars   % Thief wins
  Korff             QL                 2             78.4
  Korff             Korff              2             33.8
  Korff             LA                 2             55.6
  Korff             QL                 3             62.2
  Korff             Korff              3             17.6
  Korff             LA                 3             28.6
  Korff             QL                 4             40.8
  Korff             Korff              4             7.4
  Korff             LA                 4             14.4
Table 10.3: Results in a 5x5 patches scenario. Algorithm for the thief: Korff.

  Thief algorithm   Police algorithm   Police cars   % Thief wins
  Korff             QL                 2             89.8
  Korff             Korff              2             55.6
  Korff             LA                 2             78.4
  Korff             QL                 4             71.4
  Korff             Korff              4             26.6
  Korff             LA                 4             35.6
  Korff             QL                 6             52.4
  Korff             Korff              6             13.2
  Korff             LA                 6             19.8
Table 10.4: Results in a 10x10 patches scenario. Algorithm for the thief: Korff.

Figure 10.2: Results with thief using Korff. (a) 5x5 maps; (b) 10x10 maps. [Percentage of thief wins vs. number of police cars.]

We can see in Fig. 10.2a and Fig. 10.2b that the percentage of victories obtained by the thief is high: since the Korff algorithm is a non-learning one, it needs no time to converge and accumulates victories faster. Further, we also observe that increasing the number of police cars decreases the percentage of thief victories, since with more police cars the thief is seen sooner and is also easier to surround. For example, in the 10x10 scenario and against Korff police, there is a reduction of about 30% in thief victories between having 2 and 4 police cars. We also observe that increasing the size of the scenario increases the percentage of victories (by about 10%, for example, with 2 police cars using QL). This is because the bigger the scenario, the more difficult it is to surround the thief, since it can escape in more ways.

10.4 SOM

In this section we present the results obtained when the thief performs the SOM algorithm, while the police cars use QL, LA and Korff. The experiments are deployed in two different scenarios: 5x5 patches and 10x10 patches. For the 5x5 patches scenario, the number of police cars is 2, 3 and 4; we show the results in Table 10.5.
The second scenario is a 10x10 patches scenario, and the experiments are developed with 2, 4 and 6 police cars. The results are shown in Table 10.6. For an easy comparison, we also offer a graphical representation of the results in Fig. 10.3a and Fig. 10.3b.

  Thief algorithm   Police algorithm   Police cars   % Thief wins
  SOM               QL                 2             58.8
  SOM               Korff              2             27.8
  SOM               LA                 2             39.2
  SOM               QL                 3             37
  SOM               Korff              3             14.2
  SOM               LA                 3             18.6
  SOM               QL                 4             29.2
  SOM               Korff              4             5.4
  SOM               LA                 4             12.6
Table 10.5: Results in a 5x5 patches scenario. Algorithm for the thief: SOM.

  Thief algorithm   Police algorithm   Police cars   % Thief wins
  SOM               QL                 2             76.6
  SOM               Korff              2             52.8
  SOM               LA                 2             59.4
  SOM               QL                 4             46.8
  SOM               Korff              4             24.2
  SOM               LA                 4             34.2
  SOM               QL                 6             29.0
  SOM               Korff              6             9.4
  SOM               LA                 6             18.2
Table 10.6: Results in a 10x10 patches scenario. Algorithm for the thief: SOM.

Figure 10.3: Results with thief using SOM. (a) 5x5 maps; (b) 10x10 maps.

In both experiments we see that increasing the number of police cars results in a decrease of the thief's win percentage. This decrease is smaller in the 5x5 scenario: for example, the difference between using 2 or 4 cars in the 10x10 scenario (police cars using LA) is about a 30% decrease, while in the 5x5 scenario the decrease is about 20%.

10.5 Learning Automata

In this section we present the results obtained when the thief performs the LA algorithm, while the police cars use QL, SOM and Korff. The experiments are deployed in two different scenarios: 5x5 patches and 10x10 patches. For the 5x5 patches scenario, the number of police cars is 2, 3 and 4; we show the results in Table 10.7. The second scenario is a 10x10 patches scenario, and the experiments are developed with 2, 4 and 6 police cars; the results are shown in Table 10.8. For an easy comparison, we also offer a graphical representation of the results in Fig. 10.4a and Fig. 10.4b.
  Thief algorithm   Police algorithm   Police cars   % Thief wins
  LA                QL                 2             79.0
  LA                Korff              2             31.2
  LA                LA                 2             58.0
  LA                QL                 3             65.0
  LA                Korff              3             12.0
  LA                LA                 3             35.8
  LA                QL                 4             51.2
  LA                Korff              4             5.2
  LA                LA                 4             25.0
Table 10.7: Results in a 5x5 patches scenario. Algorithm for the thief: LA.

  Thief algorithm   Police algorithm   Police cars   % Thief wins
  LA                QL                 2             81.4
  LA                Korff              2             49.6
  LA                LA                 2             65.2
  LA                QL                 4             77.2
  LA                Korff              4             27.0
  LA                LA                 4             45.4
  LA                QL                 6             60.2
  LA                Korff              6             12.0
  LA                LA                 6             23.2
Table 10.8: Results in a 10x10 patches scenario. Algorithm for the thief: LA.

Figure 10.4: Results with thief using LA. (a) 5x5 maps; (b) 10x10 maps.

We can see that in both experiments the thief obtains its best performance when it 'plays' against police cars performing the QL algorithm. This is an interesting result, since QL is the most complex and refined of the algorithms. The reason for this poor behaviour could be that the LA, Korff and SOM algorithms learn faster; it shows that in some cases simple solutions are good enough to solve complex problems. We also observe that increasing the number of police cars reduces the thief's victory percentage. This is a logical result: the more police cars take part in the chase, the more probable it is that they see the thief and can surround it easily. To show this, in Fig. 10.4b we see a reduction of almost 20% in the thief's victory percentage when the number of police cars is 6 instead of 2 and the police use QL. Further, we see that increasing the size of the scenario benefits the thief: the bigger the scenario, the more difficult it is to surround the thief, since it can escape in more ways.

10.6 Q-learning

In this section we present the results obtained when the thief performs the QL algorithm, while the police cars use SOM, LA and Korff. The experiments are deployed in two different scenarios: 5x5 patches and 10x10 patches. For the 5x5 patches scenario, the number of police cars is 2, 3 and 4; we show the results in Table 10.9. The second scenario is a 10x10 patches scenario, and the experiments are developed with 2, 4 and 6 police cars; the results are shown in Table 10.10. For an easy comparison, we also offer a graphical representation of the results in Fig. 10.5a and Fig. 10.5b.

  Thief algorithm   Police algorithm   Police cars   % Thief wins
  QL                QL                 2             52.6
  QL                Korff              2             19.2
  QL                LA                 2             25.8
  QL                QL                 3             28.4
  QL                Korff              3             5.4
  QL                LA                 3             15.8
  QL                QL                 4             22.0
  QL                Korff              4             2.8
  QL                LA                 4             9.8
Table 10.9: Results in a 5x5 patches scenario. Algorithm for the thief: QL.

  Thief algorithm   Police algorithm   Police cars   % Thief wins
  QL                QL                 2             69.2
  QL                Korff              2             35.0
  QL                LA                 2             48.2
  QL                QL                 4             44.0
  QL                Korff              4             16.2
  QL                LA                 4             20.4
  QL                QL                 6             23.8
  QL                Korff              6             5.4
  QL                LA                 6             10.4
Table 10.10: Results in a 10x10 patches scenario. Algorithm for the thief: QL.

Figure 10.5: Results with thief using QL. (a) 5x5 maps; (b) 10x10 maps.

These experiments confirm the results of the previous section, where the thief obtains better results when 'playing' against police cars performing QL. For example, in Fig. 10.5a we see that, with 2 police cars, the thief obtains about 25% fewer victories when the police perform LA instead of QL. We also observe that increasing the scenario size benefits the thief (in the 5x5 scenario, with 2 police cars performing QL, the increment is almost 20%).

10.7 Comparison of the different Learning Algorithms

In this section we compare the results obtained by LA, QL and SOM in a 10x10 scenario. We set Korff's algorithm for the police cars and then change the algorithm used by the thief car. Korff's algorithm does not learn, and may thus be used to better compare the thief's success with the different learning techniques. As we can see in Fig. 10.6, the best results are obtained on average by the LA algorithm, followed by SOM, with QL last.
This is even more interesting considering that the LA algorithm uses fewer states: it only considers whether there are police cars in a given direction, without determining their precise distance. This is a surprising result, and it demonstrates that sometimes an excess of information can be a disadvantage, and that simple solutions are good enough for complex problems. See [49] for more examples.

Besides, if we compare the results obtained in the previous sections, we see that if both police and thief cars use the same Learning Algorithm, the percentage of thief wins is similar. For example, we can see in Table 10.10 that with 6 police cars, and both police and thief using QL, the thief wins 23.8% of the challenges. If we compare this to the case where both police and thief use LA (Table 10.8, also with 6 police cars), the percentage of victories is 23.2%, which is similar.

Figure 10.6: Results with police using Korff in a 10x10 traffic map. [Percentage of thief wins vs. number of police cars, for the thief using Q-learning, LA and SOM.]

Part IV: Conclusions

Chapter 11
Conclusions and future work

11.1 Conclusions

In this thesis we have performed two types of experiments: one set in the Location Privacy field and the other in the Multi-agent Learning Systems field. For both of them we have used the NetLogo tool, showing its flexibility and capacity to perform real-world simulations. With the data obtained from NetLogo, we can perform different investigations and obtain results and conclusions without the need for expensive resources. Besides, with this tool the control of simulations is easier, and we can choose which parts we want to examine in more depth.

In the Location Privacy part, we have described some important existing algorithms and measures to protect our privacy, which in our context means remaining anonymous. We show that privacy can be compromised if no anonymization is applied to the samples. What is more, we have shown that even when current state-of-the-art anonymization algorithms are used, the privacy of users can be compromised. In order to perform our experiments, we first define and build a model. Secondly, we build a NetLogo city in which cars drive while reporting samples. We implement attack (tracking) and defense (anonymization) algorithms, and with these algorithms and the samples obtained in the NetLogo simulation we perform experiments. The results were striking: we confirmed that even using the Single Target Tracking Algorithm, which is not the best attack we could have used (but the most computationally affordable one), we obtain good tracking results. For the defense algorithms, it was shown that the more samples we remove, the more anonymous we remain, but we have to be aware that the more samples we remove, the worse the location services become. Besides, we also show that removing samples taking the available information into account (with the Uncertainty-aware algorithm) yields worse tracking results, and thus improves privacy.

In the Multi-agent Learning Systems part, we have shown the performance of different Learning Algorithms. To do so, we have built a flexible and efficient model in NetLogo. This model can be considered a more complex version of the predator-prey pursuit problem. In our case, we model a police/thief pursuit in an urban grid environment where other elements (cars, traffic lights, etc.)
may interact with the agents during the simulation. We present the results of the simulations to show the performance of the different Learning Algorithms. The agents develop their tasks in the NetLogo scenario, and we compare the results obtained in different situations and with different parameters. The results show that the most refined algorithm does not obtain the best results. This could be because simplicity can be better in a simple scenario, and this is why LA obtains better results than QL.

11.2 Future work

In the Location Privacy experiments, results were obtained performing the Single Target Tracking Algorithm with Trajectory-Based Linking. This is not the best approach to get good tracking results. Even using Single Target Tracking, the results could be improved by using Map-based Linking (which uses map information) or Empirical Linking (which uses past information) to link samples. A big improvement of the attack is the use of Multi-target Tracking Algorithms, which take all the cars into account simultaneously to do the linking. Besides, a clever algorithm that takes into account how the Uncertainty-aware algorithm works could improve the tracking results when this algorithm is used to anonymize.

In our Location Privacy model there is a Trusted Third Party (TTP), which means that we have to trust a third entity. This TTP may be malicious, and if this happens, no anonymization algorithm would help. Another research line that can be followed is the removal of this entity, trying a TTP-free scheme where the users collaborate to protect their privacy.

As an extension of the Multi-agent model, the model of the traffic in the city can be extended, and more complex interactions among normal traffic and the police/thief cars can be introduced. Besides, in this thesis we have implemented three Learning Algorithms (LA, QL and SOM), but other complex learning techniques can be implemented (Dynamic Programming [40], Temporal Difference Learning [54], etc.).

Bibliography

[1] A. Peleteiro-Ramallo, J. Burguillo-Rial, P. Rodríguez-Hernández, and E. Costa-Montenegro, "SinCity: a pedagogical testbed for checking multi-agent learning techniques," in ECMS 2009: 23rd European Conference on Modelling and Simulation, 2009.
[2] NetLogo. http://ccl.northwestern.edu/netlogo/
[3] Models library. http://ccl.northwestern.edu/netlogo/models
[4] G. Danezis and B. Wittneben, "The economics of mass surveillance and the questionable value of anonymous communications," in Proceedings of the 5th Workshop on the Economics of Information Security (WEIS 2006), 2006.
[5] J. Tsai, S. Egelman, L. Cranor, and A. Acquisti, "The effect of online privacy information on purchasing behavior: An experimental study," Carnegie Mellon University, Pittsburgh, PA (USA), June 2007. http://www.weis2007.econinfosec.org/papers/57.pdf
[6] J. Krumm, "A survey of computational location privacy," Personal and Ubiquitous Computing. http://dx.doi.org/10.1007/s00779-008-0212-5
[7] M. Gruteser and D. Grunwald, "Anonymous usage of location-based services through spatial and temporal cloaking." http://www.usenix.org/events/mobisys03/tech/gruteser.html
[8] A. R. Beresford and F. Stajano, "Location privacy in pervasive computing," IEEE Pervasive Computing, vol. 2, no. 1, pp. 46–55, 2003. http://dx.doi.org/10.1109/MPRV.2003.1186725
[9] B. Hoh, M. Gruteser, H. Xiong, and A. Alrabady, "Enhancing security and privacy in traffic-monitoring systems," IEEE Pervasive Computing, vol. 5, no. 4, pp. 38–46, 2006.
[10] M. L. Yiu, C. S. Jensen, X. Huang, and H. Lu, "SpaceTwist: Managing the trade-offs among location privacy, query performance, and query accuracy in mobile services," in ICDE, 2008, pp. 366–375.
[11] J. Krumm, "Inference attacks on location tracks," 2007, pp. 127–143. http://dx.doi.org/10.1007/978-3-540-72037-9_8
[12] B. Gedik and L. Liu, "Protecting location privacy with personalized k-anonymity: Architecture and algorithms," IEEE Transactions on Mobile Computing, vol. 7, no. 1, pp. 1–18, January 2008. http://dx.doi.org/10.1109/TMC.2007.1062
[13] D. S. Brands, "A technical overview of digital credentials," 2002.
[14] M. Gruteser, J. Bredin, and D. Grunwald, "Path privacy in location-aware computing," 2008. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.110.3891
[15] C. Bettini, X. S. Wang, and S. Jajodia, "Protecting privacy against location-based personal identification," in Secure Data Management, 2005, pp. 185–199.
[16] B. Gedik and L. Liu, "Protecting location privacy with personalized k-anonymity: Architecture and algorithms," IEEE Transactions on Mobile Computing, vol. 7, no. 1, pp. 1–18, January 2008. http://dx.doi.org/10.1109/TMC.2007.1062
[17] B. Hoh, M. Gruteser, H. Xiong, and A. Alrabady, "Preserving privacy in GPS traces via uncertainty-aware path cloaking," in CCS '07: Proceedings of the 14th ACM Conference on Computer and Communications Security. New York, NY, USA: ACM, 2007, pp. 161–171. http://dx.doi.org/10.1145/1315245.1315266
[18] A. Machanavajjhala, J. Gehrke, D. Kifer, and M. Venkitasubramaniam, "l-diversity: Privacy beyond k-anonymity," in 22nd IEEE International Conference on Data Engineering, 2006. http://www.cs.umass.edu/~mhay/links.html
[19] B. Bamba, L. Liu, P. Pesti, and T. Wang, "Supporting anonymous location queries in mobile environments with PrivacyGrid," in WWW '08: Proceedings of the 17th International Conference on World Wide Web. New York, NY, USA: ACM, 2008, pp. 237–246. http://dx.doi.org/10.1145/1367497.1367531
[20] B. Hoh and M. Gruteser, "Protecting location privacy through path confusion," in SECURECOMM '05: Proceedings of the First International Conference on Security and Privacy for Emerging Areas in Communications Networks. Washington, DC, USA: IEEE Computer Society, 2005, pp. 194–205.
[21] C. Diaz, S. Seys, J. Claessens, and B. Preneel, "Towards measuring anonymity," 2002.
[22] A. Serjantov and G. Danezis, "Towards an information theoretic metric for anonymity," 2002. http://citeseer.ist.psu.edu/serjantov02towards.html
[23] Google Maps. http://maps.google.es/
[24] J. Krumm and E. Horvitz, "Predestination: Inferring destinations from partial trajectories," 2006, pp. 243–260. http://dx.doi.org/10.1007/11853565_15
[25] B. Hoh, M. Gruteser, R. Herring, J. Ban, D. Work, J. C. Herrera, A. M. Bayen, M. Annavaram, and Q. Jacobson, "Virtual trip lines for distributed privacy-preserving traffic monitoring," in MobiSys, D. Grunwald, R. Han, E. de Lara, and C. S. Ellis, Eds. ACM, 2008, pp. 15–28. http://dblp.uni-trier.de/db/conf/mobisys/mobisys2008.html#HohGHBWHBAJ08
[26] J. Warrior, E. McHenry, and K. McGee, "They know where you are," IEEE Spectrum, pp. 20–25, August 2003.
[27] W. Tranter, K. Shanmugan, T. Rappaport, and K. Kosbar, Principles of Communication Systems Simulation with Wireless Applications. Upper Saddle River, NJ, USA: Prentice Hall Press, 2003.
[28] D. Reid, "An algorithm for tracking multiple targets," IEEE Transactions on Automatic Control, vol. 24, no. 6, pp. 843–854, Dec. 1979.
[29] G. Welch and G. Bishop, “An introduction to the Kalman filter,” University of North Carolina at Chapel Hill, Tech. Rep. TR 95-041, 2004.
[30] C. E. Shannon, “A mathematical theory of communication,” Bell System Technical Journal, vol. 27, pp. 379–423, 623–656, 1948. http://cm.bell-labs.com/cm/ms/what/shannonday/paper.html
[31] M. S. Grewal and A. P. Andrews, Kalman Filtering: Theory and Practice Using MATLAB.
[32] Y. Wang and A. Kobsa, Privacy-Enhancing Technologies. http://www.ics.uci.edu/~kobsa/papers/2008-Handbook-LiabSec-kobsa.pdf
[33] M. Gruteser and B. Hoh, “On the anonymity of periodic location samples,” in SPC, 2005, pp. 179–192.
[34] S. J. Russell and P. Norvig, Artificial Intelligence: A Modern Approach, 2nd ed. Englewood Cliffs, NJ: Prentice Hall.
[35] A. M. Turing, “Computing machinery and intelligence,” Mind, vol. LIX, pp. 433–460, 1950.
[36] Turing test. http://turing-machine.weblog.com.pt/arquivo/turingtest.gif
[37] T. Smithers, “On quantitative performance measures of robot behaviour,” Robotics and Autonomous Systems, vol. 15, no. 1-2, pp. 107–133, 1995.
[38] M. A. Arbib, The Handbook of Brain Theory and Neural Networks, 2nd ed. The MIT Press, November 2002. http://www.amazon.ca/exec/obidos/redirect?tag=citeulike09-20&path=ASIN/0262011972
[39] Machine learning. http://robotics.stanford.edu/~nilsson/MLDraftBook/MLBOOK.pdf
[40] D. B. Wagner, “Dynamic programming.”
[41] S. B. Kotsiantis, “Supervised machine learning: A review of classification techniques,” 2007. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.95.9683
[42] M. Harmon, “Reinforcement learning: a tutorial,” 1996. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.33.2480
[43] L. P. Kaelbling, M. L. Littman, and A. P. Moore, “Reinforcement learning: A survey,” Journal of Artificial Intelligence Research, vol. 4, pp. 237–285, 1996. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.44.2707
[44] R. E. Bellman, Dynamic Programming. Princeton University Press, 1957. http://www.amazon.com/exec/obidos/redirect?tag=citeulike07-20&path=ASIN/B0006AUXX8
[45] M. Benda, V. Jagannathan, and R. Dodhiawala, “On optimal cooperation of knowledge sources - an empirical investigation,” Boeing Advanced Technology Center, Boeing Computing Services, Seattle, Washington, Tech. Rep. BCS-G2010-28, July 1986.
[46] M. Wiering, J. Vreeken, J. Van Veenen, and A. Koopman, “Simulation and optimization of traffic in a city,” in IEEE Intelligent Vehicles Symposium (IV'04). IEEE, 2004. http://www.cs.uu.nl/groups/IS/archive/marco/simulating%5Foptimizing%5Ftraffic.ps.gz
[47] U. Wilensky, NetLogo traffic model, 2005. http://ccl.northwestern.edu/netlogo/
[48] C. Gershenson, “Self-Organizing Traffic Lights,” ArXiv Nonlinear Sciences e-prints, November 2004.
[49] J. Reverte, F. Gallego, R. Satorre, and F. Llorens, “Mixing greedy and evolutive approaches to improve pursuit strategies,” in IBERAMIA, 2008, pp. 203–212.
[50] R. E. Korf, “A simple solution to pursuit games,” in Proceedings of the 11th International Workshop on Distributed Artificial Intelligence, 1992.
[51] T. Kohonen, Self-Organizing Maps. Springer, December 2000. http://www.amazon.ca/exec/obidos/redirect?tag=citeulike09-20&path=ASIN/3540679219
[52] K. Narendra and M. A. L. Thathachar, Learning Automata: An Introduction. Prentice Hall, 1989.
[53] C. J. C. H. Watkins and P. Dayan, “Q-learning,” Machine Learning, vol. 8, no. 3-4, pp. 279–292, 1992. http://jmvidal.cse.sc.edu/library/watkins92a.pdf
[54] R. S. Sutton, “Learning to predict by the methods of temporal differences,” Machine Learning, vol. 3, pp. 9–44, 1988.
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.42.3191