---
title: GNU Radio Implementation of Multi-Armed bandits Learning for Internet-of-things Networks
subtitle: IEEE WCNC 2019
author: Lilian Besson
institute: SCEE Team, IETR, CentraleSupélec, Rennes
date: Wednesday 17th of April, 2019
lang: english
---

### *IEEE WCNC 2019*: "**GNU Radio Implementation of Multi-Armed bandits Learning for Internet-of-things Networks**"

- *Date* :date: : $17$th of April $2019$

- *Who:* [Lilian Besson](https://GitHub.com/Naereen/slides/) :wave: , PhD Student in France, co-advised by

| *Christophe Moy* <br> @ IETR, Rennes | *Emilie Kaufmann* <br> @ CNRS & Inria, Lille |
|:---:|:---:|
| ![8%](../common/LogoCS.png) ![14%](../common/LogoIETR.png) | ![12%](../common/LogoInria.jpg) ![16%](../common/LogoCNRS.jpg) |

> See our paper at [`HAL.Inria.fr/hal-02006825`](https://hal.inria.fr/hal-02006825)

---

# Introduction

- We implemented a demonstration of a simple IoT network
- Using open-source software (GNU Radio) and USRP boards from Ettus Research / National Instruments
- In a wireless ALOHA-based protocol, IoT objects are able to improve their network access efficiency by using *embedded* *decentralized* *low-cost* machine learning algorithms
- The Multi-Armed Bandit model fits well for this problem
- Our demonstration shows that using the simple UCB algorithm can lead to great empirical improvement in terms of successful transmission rate for the IoT devices

> Joint work by R. Bonnefoi, L. Besson and C. Moy.

---

# :timer_clock: Outline

## 1. Motivations
## 2. System Model
## 3. Multi-Armed Bandit (MAB) Model and Algorithms
## 4. GNU Radio Implementation
## 5. Results

### Please :pray:
Ask questions *at the end* if you want!

---

# 1. Motivations

- IoT networks are of growing interest and will become more and more widespread
- The number of IoT objects keeps increasing
- $\Longrightarrow$ networks will be more and more occupied

But...

- Heterogeneous spectrum occupancy in most IoT network standards
- Maybe IoT objects can improve their communication by *learning* to access the network more efficiently (e.g., by using the least occupied spectrum channel)
- Simple but efficient learning algorithms can give great improvements in terms of successful communication rates
- $\Longrightarrow$ can fit more objects in the existing IoT networks :tada: !

---

# 2. System Model

Wireless network
- In ISM band, centered at $433.5$ MHz (in Europe)
- $K=4$ (or more) orthogonal channels

Gateway
- One gateway, handling different objects
- Communications with ALOHA protocol (without retransmission)
- Objects send data for $1$ s in one channel, wait for an *acknowledgement* for $1$ s in the same channel, and use the Ack as feedback: success / failure
- Each object communicates from time to time (e.g., every $10$ s)
- Goal: max successful communications $\Longleftrightarrow$ max nb of received Ack
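
A minimal sketch of this feedback loop, in Python, under simplifying assumptions that are not part of the demo: each channel is busy independently in every slot with a fixed probability, a busy channel always causes a failure, and collisions between IoT objects are ignored. The names `transmit` and `success_rate_uniform`, and the occupancy values in the usage note, are hypothetical.

```python
import random

def transmit(channel, occupancy, rng):
    """Return True if an Ack is received, i.e. the chosen channel was free.

    Assumption: channel k is busy with probability occupancy[k] in each
    slot, independently of the past, and a busy channel means no Ack.
    """
    return rng.random() >= occupancy[channel]

def success_rate_uniform(occupancy, n_slots, seed=0):
    """Fraction of Acks received by an object picking channels uniformly."""
    rng = random.Random(seed)
    acks = 0
    for _ in range(n_slots):
        channel = rng.randrange(len(occupancy))  # uniform random access
        acks += transmit(channel, occupancy, rng)
    return acks / n_slots
```

For instance, `success_rate_uniform([0.5, 0.3, 0.1, 0.05], 1000)` estimates the baseline that a learning object should try to beat by favoring the least occupied channels.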

---

# 2. System Model

![80%](plots/system_model.png)

---

# Hypotheses

1. We focus on **one gateway**

2. Different IoT objects using the same standard are able to run a low-cost learning algorithm on their embedded CPU

3. The spectrum occupancy generated by the rest of the environment is **assumed to be stationary**

4. And **non uniform traffic**:
   some channels are more occupied than others.

---

# 3. Multi-Armed Bandits (MAB)

## 3.1. Model

## 3.2. Algorithms

---

# 3.1. Multi-Armed Bandits Model
- $K \geq 2$ resources (*e.g.*, channels), called **arms**
- Each time slot $t=1,\ldots,T$, you must choose one arm, denoted $A(t)\in\{1,\ldots,K\}$
- You receive some reward $r(t) \sim \nu_k$ when playing $k = A(t)$
- **Goal:** maximize your sum reward $\sum\limits_{t=1}^{T} r(t)$, or expected $\sum\limits_{t=1}^{T} \mathbb{E}[r(t)]$
- Hypothesis: rewards are stochastic, of mean $\mu_k$.
  Example: Bernoulli distributions.

### Why is it famous?
Simple but good model for **exploration/exploitation** dilemma.
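
The interaction described above can be sketched as a small Python loop. This is only an illustration with Bernoulli rewards; the class `BernoulliBandit`, the function `play` and the `choose_arm(t, history)` policy signature are hypothetical names, and arms are numbered from $0$ instead of $1$.

```python
import random

class BernoulliBandit:
    """K arms; arm k returns reward 1 with probability means[k], else 0."""

    def __init__(self, means, seed=0):
        self.means = list(means)
        self.rng = random.Random(seed)

    def pull(self, k):
        return 1 if self.rng.random() < self.means[k] else 0

def play(bandit, choose_arm, horizon):
    """Run `horizon` time slots and return the sum reward.

    `choose_arm(t, history)` is any policy; `history` holds the
    (arm, reward) pairs observed strictly before slot t.
    """
    history, total = [], 0
    for t in range(1, horizon + 1):
        arm = choose_arm(t, history)   # A(t)
        reward = bandit.pull(arm)      # r(t)
        history.append((arm, reward))
        total += reward
    return total
```

A policy that always plays the same arm, such as `lambda t, history: 0`, only exploits; a policy that picks arms at random only explores. Good algorithms sit between the two.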

---

# 3.2. Multi-Armed Bandits Algorithms
### Often "*index* based"
- Keep *index* $I_k(t) \in \mathbb{R}$ for each arm $k=1,\ldots,K$
- Always play $A(t) = \arg\max I_k(t)$
- $I_k(t)$ should represent our belief of the *quality* of arm $k$ at time $t$

### Example: "Follow the Leader"
- $X_k(t) := \sum\limits_{s < t} r(s) \mathbf{1}(A(s)=k)$ sum reward from arm $k$
- $N_k(t) := \sum\limits_{s < t} \mathbf{1}(A(s)=k)$ number of samples of arm $k$
- And use $I_k(t) = \hat{\mu}_k(t) := \frac{X_k(t)}{N_k(t)}$.
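
A possible Python sketch of "Follow the Leader", computing $X_k(t)$, $N_k(t)$ and $\hat{\mu}_k(t)$ from the past (arm, reward) pairs. One assumption is added here, since the slide leaves the case $N_k(t)=0$ undefined: an arm that was never sampled gets an infinite index, so that every arm is tried once. The function names are hypothetical and arms are numbered from $0$.

```python
def empirical_means(history, n_arms):
    """Return X_k / N_k for each arm, or +inf for arms never sampled."""
    sums = [0.0] * n_arms    # X_k(t)
    counts = [0] * n_arms    # N_k(t)
    for arm, reward in history:
        sums[arm] += reward
        counts[arm] += 1
    return [s / n if n > 0 else float("inf") for s, n in zip(sums, counts)]

def follow_the_leader(history, n_arms):
    """Play the arm with the largest empirical mean (first one on ties)."""
    indexes = empirical_means(history, n_arms)
    return max(range(n_arms), key=lambda k: indexes[k])
```

Because this rule never revisits an arm whose first rewards were unlucky, it can lock onto a suboptimal arm.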

---

## *Upper Confidence Bounds* algorithm (UCB)
- Instead of using $I_k(t) = \frac{X_k(t)}{N_k(t)}$, add an *exploration term*
$$ I_k(t) = \frac{X_k(t)}{N_k(t)} + \sqrt{\frac{\alpha \log(t)}{2 N_k(t)}} $$

### Parameter $\alpha$: tradeoff exploration *vs* exploitation
- Small $\alpha$: focus more on **exploitation**,
- Large $\alpha$: focus more on **exploration**,
- Typically $\alpha=1$ works fine empirically and theoretically.
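
The index above translates almost directly into code. Here is a minimal Python sketch, assuming (as is common, but not stated on the slide) that an arm with $N_k(t)=0$ gets an infinite index so that it is played first. The names `ucb_indexes` and `ucb_choose` are hypothetical, and arms are numbered from $0$.

```python
import math

def ucb_indexes(sums, counts, t, alpha=1.0):
    """Compute I_k(t) = X_k/N_k + sqrt(alpha * log(t) / (2 * N_k)) per arm.

    sums[k] is X_k(t), counts[k] is N_k(t); unsampled arms get +inf
    (an assumption made here to force one initial try of each arm).
    """
    indexes = []
    for x, n in zip(sums, counts):
        if n == 0:
            indexes.append(float("inf"))
        else:
            exploration = math.sqrt(alpha * math.log(t) / (2 * n))
            indexes.append(x / n + exploration)
    return indexes

def ucb_choose(sums, counts, t, alpha=1.0):
    """Return the arm with the largest UCB index (first one on ties)."""
    indexes = ucb_indexes(sums, counts, t, alpha)
    return max(range(len(indexes)), key=lambda k: indexes[k])
```

With `sums=[5, 1]`, `counts=[10, 2]` and `t=13`, both arms have the same empirical mean $0.5$, but the second arm has fewer samples, hence a larger exploration term, and is the one selected. Setting `alpha=0.0` removes the exploration term and falls back to the plain empirical mean.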

---

# 4. GNU Radio Implementation

## 4.1. Physical layer and protocol

## 4.2. Equipment

## 4.3. Implementation

## 4.4. User interface

---

# 4.1. Physical layer and protocol

> Very simple ALOHA-based protocol

An uplink message $\,\nearrow\,$ is made of...
- a preamble (for phase synchronization)
- an ID of the IoT object, made of QPSK symbols $\pm1\pm1j \in \mathbb{C}$
- then arbitrary data, made of QPSK symbols $\pm1\pm1j \in \mathbb{C}$

A downlink (Ack) message $\,\swarrow\,$  is then...
- same preamble
- the same ID
  (so a device knows if the Ack was sent for itself or not)
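
The frame layout above can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration, not the demo's actual format: the four-point QPSK constellation $\pm1\pm1j$ with this particular bit-to-symbol mapping, the 4-symbol `PREAMBLE`, the 8-bit ID length, and all function names.

```python
# Assumed QPSK mapping: two bits per symbol, unnormalised constellation.
QPSK = {(0, 0): 1 + 1j, (0, 1): 1 - 1j, (1, 0): -1 + 1j, (1, 1): -1 - 1j}

# Hypothetical preamble; the real one is defined in the demo's flowgraphs.
PREAMBLE = [1 + 1j, -1 - 1j, 1 + 1j, -1 - 1j]

def to_symbols(value, n_bits):
    """Encode an integer on n_bits (even), most significant bit first."""
    bits = [(value >> i) & 1 for i in reversed(range(n_bits))]
    return [QPSK[(bits[i], bits[i + 1])] for i in range(0, n_bits, 2)]

def uplink(device_id, data_symbols, id_bits=8):
    """Uplink message: preamble, then device ID, then arbitrary data."""
    return PREAMBLE + to_symbols(device_id, id_bits) + list(data_symbols)

def ack(device_id, id_bits=8):
    """Downlink (Ack) message: same preamble, then the same ID."""
    return PREAMBLE + to_symbols(device_id, id_bits)

def ack_is_for(frame, device_id, id_bits=8):
    """Check whether a received Ack carries this device's ID."""
    start = len(PREAMBLE)
    return frame[start:start + id_bits // 2] == to_symbols(device_id, id_bits)
```

A device with ID `3` would accept `ack(3)` and ignore `ack(4)`, which is how it tells its own acknowledgement apart from those sent to other objects.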

---

# 4.2. Equipment
$\geq3$ USRP boards

1. gateway
2. traffic generator
3. IoT dynamic objects (as many as we want)

![55%](plots/our-demo.png)

---

# 4.3. Implementation

- Using GNU Radio and GNU Radio Companion
- Each USRP board is controlled by one *flowchart*
- Blocks are implemented in C++
- MAB algorithms are simple to code

(examples...)
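
To illustrate how little state such an algorithm needs, here is a hedged Python sketch of a UCB learner as it could run inside an IoT object: two counters per channel and one time counter. It is not the C++ code of the demo blocks; the class `UCBAgent` and its method names are hypothetical, and untried channels are played first by assumption.

```python
import math

class UCBAgent:
    """Keeps X_k (Acks received) and N_k (transmissions) for each channel."""

    def __init__(self, n_channels, alpha=1.0):
        self.alpha = alpha
        self.t = 0
        self.acks = [0] * n_channels    # X_k
        self.tries = [0] * n_channels   # N_k

    def choose_channel(self):
        """Pick the channel for the next transmission."""
        self.t += 1
        for k, n in enumerate(self.tries):
            if n == 0:                  # assumption: try each channel once
                return k

        def index(k):
            bonus = math.sqrt(self.alpha * math.log(self.t) / (2 * self.tries[k]))
            return self.acks[k] / self.tries[k] + bonus

        return max(range(len(self.tries)), key=index)

    def update(self, channel, ack_received):
        """Record the outcome once the Ack waiting period is over."""
        self.tries[channel] += 1
        self.acks[channel] += 1 if ack_received else 0
```

The object would call `choose_channel()` before each uplink message and `update(channel, ack_received)` after waiting for the Ack, so the per-message cost is a handful of arithmetic operations per channel.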

---

# Flowchart of the random traffic generator

![95%](plots/USRP_TX_PU__v1__simple_grc.png)

---

# Flowchart of the IoT gateway

![95%](plots/USRP_RX_BTS__v1__simple_grc.png)

---

# Flowchart of the IoT dynamic object

![95%](plots/USRP_TX_SU__v1__simple_grc.png)

---

# 4.4. User interface of our demonstration
→ See video of the demo: [`YouTu.be/HospLNQhcMk`](https://youtu.be/HospLNQhcMk)

![30%](plots/UI.png)


---

# 5. Example of simulation and results

On an example of a small IoT network:
- with $K=4$ channels,
- and *non uniform* "background" traffic (other networks),
  with channel occupancy rates of $15\%$, $10\%$, $2\%$, $1\%$

1. $\Longrightarrow$ the uniform access strategy obtains a successful communication rate of about $40\%$.

2. About $400$ communication slots are enough for the learning IoT objects to reach a successful communication rate close to $80\%$, using the UCB algorithm or another one (Thompson Sampling).

> Note: similar gains of performance were obtained in other scenarios.
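
Thompson Sampling is only named above, so here is a minimal Python sketch of its usual form for binary (Ack / no Ack) feedback, assuming a uniform Beta$(1,1)$ prior on each channel. The simulation loop is a toy model with a single object and independent channel availability; it ignores collisions between objects, so it is not expected to reproduce the $40\%$ and $80\%$ figures of the demo. All names are hypothetical.

```python
import random

def thompson_choose(successes, failures, rng):
    """Sample one value per arm from its Beta posterior, play the largest."""
    samples = [rng.betavariate(1 + s, 1 + f)
               for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=lambda k: samples[k])

def run_thompson(free_prob, horizon, seed=0):
    """Simulate one object; free_prob[k] is the chance channel k is free.

    Returns the per-channel counts of received and missing Acks.
    """
    rng = random.Random(seed)
    n = len(free_prob)
    successes, failures = [0] * n, [0] * n
    for _ in range(horizon):
        arm = thompson_choose(successes, failures, rng)
        if rng.random() < free_prob[arm]:
            successes[arm] += 1   # Ack received
        else:
            failures[arm] += 1    # no Ack
    return successes, failures
```

Calling `run_thompson([0.85, 0.90, 0.98, 0.99], 400)` gives a rough feel for how the transmissions concentrate on the less occupied channels over a few hundred slots.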

---

# Illustration

![70%](plots/plot_datafile_append_Uniform_vs_UCB_vs_TS.png)

---

# 6. Summary

## We showed
1. The system model and PHY/MAC layers of our demo
2. The Multi-Armed Bandits model and algorithms
3. Our demonstration written in C++ and GNU Radio Companion
4. *Empirical results*: the proposed approach works fine, is simple to set up in existing networks, and gives impressive results!

## Take home message

**Dynamically reconfigurable IoT objects can learn on their own to favor certain channels, if the environment traffic is not uniform between the $K$ channels, and greatly improve their successful communication rates!**

---

# 6. Future work

- Study a real IoT LPWAN protocol (e.g., LoRa)
- Implement our proposed approach in a realistic large-scale setting

### We are exploring these directions
- Extending the model for ALOHA-like retransmissions
  (→ [`HAL.Inria.fr/hal-02049824`](https://hal.inria.fr/hal-02049824) at MoTION Workshop @ WCNC)
- Experiments in a real LoRa network with dozens of nodes
  (→ IoTlligent project @ Rennes, France)

---

# 6. Conclusion

### → See our paper: [`HAL.Inria.fr/hal-02006825`](https://hal.inria.fr/hal-02006825)

### → See video of the demo: [`YouTu.be/HospLNQhcMk`](https://youtu.be/HospLNQhcMk)

### → See the code of our demo:
Under GPL open-source license, for GNU Radio:
[bitbucket.org/scee_ietr/malin-multi-arm-bandit-learning-for-iot-networks-with-grc](https://bitbucket.org/scee_ietr/malin-multi-arm-bandit-learning-for-iot-networks-with-grc/)

> Thanks for listening!