deepgraph.deepgraph.DeepGraph
class DeepGraph(v=None, e=None, supernode_labels_by=None, superedge_labels_by=None)
The core class of DeepGraph (dg).
This class encapsulates the graph representation as pandas.DataFrame objects in its attributes v and e. It can be initialized with a node table v, whose rows represent the nodes of the graph, as well as an edge table e, whose rows represent edges between the nodes.
Given a node table v, it provides methods to iteratively compute pairwise relations between the nodes using arbitrary, user-defined functions. These methods provide arguments to parallelize the computation and control memory consumption (see create_edges and create_edges_ft).
It also provides methods to partition nodes, edges or an entire graph by the graph's properties and labels, and to create common network representations and graph objects of popular Python network packages.
Furthermore, it provides methods to visualize graphs and their properties and to benchmark the graph construction parameters.
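The idea of computing pairwise relations with a user-defined function, one block of pairs at a time, can be sketched in plain Python. This is only an illustration of the concept, not DeepGraph's implementation: the names nodes, distance, pairwise_relations and the block_size argument are hypothetical, and the actual arguments are documented with create_edges and create_edges_ft.

```python
from itertools import combinations, islice

# Hypothetical node table: node index -> features of that node.
nodes = {
    0: {"x": 0.0},
    1: {"x": 1.5},
    2: {"x": 4.0},
    3: {"x": 4.5},
}


def distance(source, target):
    """User-defined relation: absolute difference of the feature x."""
    return abs(target["x"] - source["x"])


def pairwise_relations(nodes, relation, block_size=2):
    """Yield dicts {(source, target): value}, block_size pairs at a time.

    Only one block of pairs is held in memory at once, which is one way
    to bound memory use when the number of pairs is large.
    """
    pairs = combinations(sorted(nodes), 2)
    while True:
        block = list(islice(pairs, block_size))
        if not block:
            return
        yield {(s, t): relation(nodes[s], nodes[t]) for s, t in block}


# Collect the blocks into one edge table, keeping only close pairs.
edges = {}
for block in pairwise_relations(nodes, distance):
    edges.update({pair: d for pair, d in block.items() if d <= 2.0})

print(edges)
```

Because each block is independent of the others, blocks could in principle also be handed to separate workers.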
Optionally, the convenience parameter supernode_labels_by can be passed, creating supernode labels by enumerating all distinct (tuples of) values of one or more columns of v. Superedge labels can be created analogously, by passing the parameter superedge_labels_by.
Parameters:
- v (pandas.DataFrame or pandas.HDFStore, optional (default=None)) – The node table, a table representation of the nodes of a graph. The index of v must be unique and represents the node indices. The column names of v represent the types of features of the nodes, and each cell represents a feature of a node. Only a reference to the input DataFrame is created, not a copy. May also be a pandas.HDFStore, but only create_edges and create_edges_ft may then be used (so far).
- e (pandas.DataFrame, optional (default=None)) – The edge table, a table representation of the edges between the nodes given by v. Its index has to be a pandas.MultiIndex, whose first level contains the indices of the source nodes, and the second level contains the indices of the target nodes. Each row of e represents an edge, column names of e represent the types of relations of the edges, and each cell in e represents a relation of an edge. Only a reference to the input DataFrame is created, not a copy.
- supernode_labels_by (dict, optional (default=None)) – A dictionary whose keys are strings and whose values are (lists of) column names of v. Appends a column to v for each key, whose values correspond to supernode labels, enumerating all distinct (tuples of) values of the column(s) given by the dict's value.
- superedge_labels_by (dict, optional (default=None)) – A dictionary whose keys are strings and whose values are (lists of) column names of e. Appends a column to e for each key, whose values correspond to superedge labels, enumerating all distinct (tuples of) values of the column(s) given by the dict's value.
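The table layout described by these parameters can be sketched with pandas alone. The column names, index values and the level names "s" and "t" below are made up for illustration, and the groupby-based labelling only emulates what supernode_labels_by is described as doing; the label order DeepGraph itself assigns may differ.

```python
import pandas as pd

# Node table v: one row per node, unique index, one column per feature type.
v = pd.DataFrame(
    {
        "city": ["Berlin", "Berlin", "Paris", "Paris"],
        "year": [2000, 2001, 2000, 2000],
    },
    index=[10, 11, 12, 13],
)

# Edge table e: one row per edge, indexed by (source, target) node indices.
# The level names "s" and "t" are illustrative assumptions.
e = pd.DataFrame(
    {"dt": [1, 0]},
    index=pd.MultiIndex.from_tuples([(10, 11), (12, 13)], names=["s", "t"]),
)

# Emulate supernode_labels_by={"sn": ["city", "year"]}: enumerate the
# distinct (city, year) tuples and append the result as a column of v.
v["sn"] = v.groupby(["city", "year"]).ngroup()

print(v)
print(e)
```

Nodes 12 and 13 share the tuple ("Paris", 2000) and therefore receive the same label in this sketch.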
- v – See Parameters. Type: pandas.DataFrame
- e – See Parameters. Type: pandas.DataFrame
- n – Property: Number of nodes. Type: int
- m – Property: Number of edges. Type: int
- f – Property: Types of features and number of features of corresponding type. Type: pandas.DataFrame
- r – Property: Types of relations and number of relations of corresponding type. Type: pandas.DataFrame
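As a rough guide to what these properties summarize, the following pandas-only sketch computes comparable quantities by hand. It assumes that "number of features" means the number of non-missing cells per column; that reading, and the column names n_features and n_relations, are assumptions and not taken from this page.

```python
import pandas as pd

# Small node table with one missing feature value.
v = pd.DataFrame(
    {"x": [0.1, 0.2, float("nan")], "name": ["a", "b", "c"]},
    index=[0, 1, 2],
)

# Edge table with a single edge from node 0 to node 1.
e = pd.DataFrame(
    {"dx": [0.1]},
    index=pd.MultiIndex.from_tuples([(0, 1)]),
)

n = len(v)  # number of nodes (rows of v)
m = len(e)  # number of edges (rows of e)

# Per-column counts of non-missing cells (assumed meaning of f and r).
f = v.count().to_frame("n_features")
r = e.count().to_frame("n_relations")

print(n, m)
print(f)
print(r)
```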
- __init__(v=None, e=None, supernode_labels_by=None, superedge_labels_by=None) – Initialize self. See help(type(self)) for accurate signature.
Methods
- __init__([v, e, supernode_labels_by, …]) – Initialize self.
- append_binning_labels_v(col, col_name[, …]) – Append a column with binning labels of the values in v[col].
- append_cp([directed, connection, col_name, …]) – Append a component membership column to v.
- append_datetime_categories_v([col, …]) – Append datetime categories to v.
- create_edges([connectors, selectors, …]) – Create an edge table e linking the nodes in v.
- create_edges_ft(ft_feature[, connectors, …]) – Create (ft) an edge table e linking the nodes in v.
- filter_by_interval_e(col, interval[, endpoint]) – Keep only edges in e with relations of type col in interval.
- filter_by_interval_v(col, interval[, endpoint]) – Keep only nodes in v with features of type col in interval.
- filter_by_values_e(col, values) – Keep only edges in e with relations of type col in values.
- filter_by_values_v(col, values) – Keep only nodes in v with features of type col in values.
- partition_edges([relations, …]) – Return a superedge DataFrame se.
- partition_graph(features[, feature_funcs, …]) – Return supergraph DataFrames sv and se.
- partition_nodes(features[, feature_funcs, …]) – Return a supernode DataFrame sv.
- plot_2d(x, y[, edges, C, C_split_0, …]) – Plot nodes and corresponding edges in 2 dimensions.
- plot_2d_generator(x, y, by[, edges, C, …]) – Plot nodes and corresponding edges by groups.
- plot_3d(x, y, z[, edges, kwds_scatter, …]) – Work in progress!
- plot_hist(x[, bins, log_bins, density, …]) – Plot a histogram (or pdf) of x.
- plot_logfile(logfile) – Plot a logfile.
- plot_map(lon, lat[, edges, C, C_split_0, …]) – Plot nodes and corresponding edges on a basemap.
- plot_map_generator(lon, lat, by[, edges, C, …]) – Plot nodes and corresponding edges by groups, on basemaps.
- plot_raster(label[, time, ax]) – Work in progress!
- plot_rects_label_numeric(label, xl, xr[, …]) – Work in progress!
- plot_rects_numeric_numeric(yb, yt, xl, xr[, …]) – Work in progress!
- return_cs_graph([relations, dropna]) – Return scipy.sparse.coo_matrix representation(s).
- return_gt_graph([features, relations, …]) – Return a graph_tool.Graph representation.
- return_nx_graph([features, relations, dropna]) – Return a networkx.DiGraph representation.
- return_nx_multigraph([features, relations, …]) – Return a networkx.MultiDiGraph representation.
- update_edges() – After removing nodes in v, update e.
Attributes
- f – Types of features and number of features of corresponding type.
- m – The number of edges.
- n – The number of nodes.
- r – Types of relations and number of relations of corresponding type.
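The effect described for the node filters and for update_edges can be approximated with pandas alone. The sketch below is not DeepGraph's implementation: the data are made up, the closed interval is chosen only for illustration, and the handling of the endpoint argument is not reproduced.

```python
import pandas as pd

# Node table with one feature x, and an edge table chaining the nodes.
v = pd.DataFrame({"x": [1, 5, 9, 12]}, index=[0, 1, 2, 3])
e = pd.DataFrame(
    {"dx": [4, 4, 3]},
    index=pd.MultiIndex.from_tuples([(0, 1), (1, 2), (2, 3)]),
)

# Keep only nodes whose feature x lies in the closed interval [1, 9].
v = v[v["x"].between(1, 9)]

# After removing nodes, drop every edge whose source or target is gone.
sources = e.index.get_level_values(0)
targets = e.index.get_level_values(1)
e = e[sources.isin(v.index) & targets.isin(v.index)]

print(v)
print(e)
```

Node 3 is removed by the interval filter, so the edge (2, 3) is dropped while (0, 1) and (1, 2) remain.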