# 10 Minutes to DeepGraph¶

[`ipython notebook`

] [`python script`

] [`data`

]

This is a short introduction to DeepGraph. In the following, we demonstrate DeepGraph’s core functionalities by a toy data-set, “flying balls”.

First of all, we need to import some packages

```
# for plots
import matplotlib.pyplot as plt
# the usual
import numpy as np
import pandas as pd
import deepgraph as dg
# notebook display
%matplotlib inline
plt.rcParams['figure.figsize'] = 8, 6
pd.options.display.max_rows = 10
pd.set_option('expand_frame_repr', False)
```

**Loading Toy Data**

Then, we need data in the form of a pandas DataFrame, representing the nodes of our graph

```
v = pd.read_csv('flying_balls.csv', index_col=0)
print(v)
```

```
time x y ball_id
0 0 1692.000000 0.000000 0
1 0 8681.000000 0.000000 1
2 0 490.000000 0.000000 2
3 0 7439.000000 0.000000 3
4 0 4998.000000 0.000000 4
... ... ... ... ...
1163 45 2812.552734 16.503178 39
1164 46 5686.915998 14.161693 10
1165 46 3161.729086 19.381823 14
1166 46 5594.233413 57.701712 37
1167 47 5572.216748 20.588750 37
[1168 rows x 4 columns]
```

The data consists of 1168 space-time measurements of 50 different toy
balls in two-dimensional space. Each space-time measurement (i.e. row of
`v`

) represents a **node**.

Let’s plot the data such that each ball has it’s own color

```
plt.scatter(v.x, v.y, s=v.time, c=v.ball_id)
```

## Creating Edges¶

In order to create edges between these nodes, we now initiate a `dg.DeepGraph`

instance

```
g = dg.DeepGraph(v)
g
```

```
<DeepGraph object, with n=1168 node(s) and m=0 edge(s) at 0x7facf3b35dd8>
```

and use it to create edges between the nodes given by `g.v`

. For that matter, we may define a **connector** function

```
def x_dist(x_s, x_t):
dx = x_t - x_s
return dx
```

and pass it to `g.create_edges`

in order to compute the distance in the x-coordinate of each pair of nodes

```
g.create_edges(connectors=x_dist)
g
```

```
<DeepGraph object, with n=1168 node(s) and m=681528 edge(s) at 0x7facf3b35dd8>
```

```
print(g.e)
```

```
dx
s t
0 1 6989.000000
2 -1202.000000
3 5747.000000
4 3306.000000
5 2812.000000
... ...
1164 1166 -92.682585
1167 -114.699250
1165 1166 2432.504327
1167 2410.487662
1166 1167 -22.016665
[681528 rows x 1 columns]
```

Let’s say we’re only interested in creating edges between nodes with a
x-distance smaller than 1000. Then we may additionally define a
**selector**

```
def x_dist_selector(dx, sources, targets):
dxa = np.abs(dx)
sources = sources[dxa <= 1000]
targets = targets[dxa <= 1000]
return sources, targets
```

and pass both the **connector** and **selector** to `g.create_edges`

```
g.create_edges(connectors=x_dist, selectors=x_dist_selector)
g
```

```
<DeepGraph object, with n=1168 node(s) and m=156938 edge(s) at 0x7facf3b35dd8>
```

```
print(g.e)
```

```
dx
s t
0 6 416.000000
7 848.000000
19 -973.000000
24 437.000000
38 778.000000
... ...
1162 1167 -44.033330
1163 1165 349.176351
1164 1166 -92.682585
1167 -114.699250
1166 1167 -22.016665
[156938 rows x 1 columns]
```

There is, however, a much more efficient way of creating edges that involve a simple distance threshold such as the one above

## Creating Edges on a FastTrack¶

In order to efficiently create edges including a selection of edges via a simple distance threshold as above, one should use the `create_edges_ft`

method. It relies on a sorted DataFrame, so we need to sort `g.v`

first

```
g.v.sort_values('x', inplace=True)
```

```
g.create_edges_ft(ft_feature=('x', 1000))
g
```

```
<DeepGraph object, with n=1168 node(s) and m=156938 edge(s) at 0x7facf3b35dd8>
```

Let’s compare the efficiency

```
%timeit -n3 -r3 g.create_edges(connectors=x_dist, selectors=x_dist_selector)
```

```
3 loops, best of 3: 557 ms per loop
```

```
%timeit -n3 -r3 g.create_edges_ft(ft_feature=('x', 1000))
```

```
3 loops, best of 3: 167 ms per loop
```

The `create_edges_ft`

method also accepts **connectors** and **selectors** as input. Let’s connect only those measurements that are close in space and time

```
def y_dist(y_s, y_t):
dy = y_t - y_s
return dy
def time_dist(time_t, time_s):
dt = time_t - time_s
return dt
def y_dist_selector(dy, sources, targets):
dya = np.abs(dy)
sources = sources[dya <= 100]
targets = targets[dya <= 100]
return sources, targets
def time_dist_selector(dt, sources, targets):
dta = np.abs(dt)
sources = sources[dta <= 1]
targets = targets[dta <= 1]
return sources, targets
```

```
g.create_edges_ft(ft_feature=('x', 100),
connectors=[y_dist, time_dist],
selectors=[y_dist_selector, time_dist_selector])
g
```

```
<DeepGraph object, with n=1168 node(s) and m=1899 edge(s) at 0x7facf3b35dd8>
```

```
print(g.e)
```

```
dt dy ft_r
s t
890 867 -1 19.311136 33.415831
867 843 -1 17.678482 33.415831
843 818 -1 16.045829 33.415831
818 792 -1 14.413176 33.415831
792 766 -1 12.780523 33.415831
... .. ... ...
244 203 -1 -10.825226 15.455612
203 159 -1 -12.457879 15.455612
159 114 -1 -14.090532 15.455612
114 65 -1 -15.723185 15.455612
65 16 -1 -17.355838 15.455612
[1899 rows x 3 columns]
```

We can now plot the flying balls and the edges we just created with the `plot_2d`

method

```
obj = g.plot_2d('x', 'y', edges=True,
kwds_scatter={'c': g.v.ball_id, 's': g.v.time})
obj['ax'].set_xlim(1000,3000)
```

## Graph Partitioning¶

The `DeepGraph`

class also offers methods to partition `nodes`

, `edges`

and an entire `graph`

. See the docstrings and the other tutorials for details and examples.

## Graph Interfaces¶

Furthermore, you may inspect the docstrings of `return_cs_graph`

, `return_nx_graph`

and `return_gt_graph`

to see how to convert from DeepGraph’s DataFrame representation of a network to sparse adjacency matrices, NetworkX’s network representation and graph_tool’s network representation.

## Plotting Methods¶

DeepGraph also offers a number of useful Plotting methods. See plotting methods for details and have a look at the other tutorials for examples.