# Plot the data
The biconnected_network is a simple network made of two groups
of nodes that are linked by a single edge. This means the network is
made of three biconnected components, hence the name. For more information,
type ?biconnected_network.
biconnected_network %>%
ggraph() +
geom_edge_link() +
geom_node_point(aes(colour = group), size = 3)
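The count of three biconnected components can be checked on a toy graph. The sketch below does not use the package data: it builds a hypothetical graph of two triangles joined by one bridging edge and assumes each group is itself biconnected. It relies on igraph's `biconnected_components()` and `articulation_points()`.

```r
library(igraph)

# Hypothetical toy graph, not the package data: two triangles (A-B-C and D-E-F)
# joined by the single bridging edge C-D.
toy <- graph_from_literal(A-B-C-A, C-D, D-E-F-D)

# Each triangle is one biconnected component and the bridge is a third.
bc <- biconnected_components(toy)
n_components <- bc$no

# The end points of the bridge are the articulation (cut) nodes.
cut_nodes <- sort(as_ids(articulation_points(toy)))

print(n_components)
print(cut_nodes)
```

The same two calls can be applied to biconnected_network itself to confirm its structure.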
Creating the embeddings from this dataset is straightforward: it already has a force variable, and the edge weights can be used as the spring constant.
embeddings_cont <- biconnected_network %>%
prepare_edges(.) %>%
prepare_continuous_force(., node_names = "name", force_var = "force") %>%
setse_auto(., k = "weight")
# aggregate the edge embeddings to node level using the mean, median, and sum
out <- create_node_edge_df(embeddings_cont, function_names = c("mean", "median", "sum"))
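As a rough illustration of what "having a force variable" involves, the base R sketch below centres a node-level variable so that the forces across the network sum to zero. This is an assumption about the kind of balancing the preparation step performs, not the package's internal code, and the `force` values are hypothetical.

```r
# Hypothetical node-level variable (for example a centrality score);
# these values are illustrative and are not taken from biconnected_network.
force <- c(a = 0.1, b = 0.4, c = 0.3, d = 0.2)

# Centre the variable so that positive and negative forces cancel out.
# Assumption: a balanced network is needed for the system to reach equilibrium.
balanced_force <- force - mean(force)

print(balanced_force)
print(sum(balanced_force))
```

Nodes above the mean end up with a positive force and nodes below it with a negative force.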
However, we may want to set the spring constant to a different value. Below, we perform the embedding using a fixed k of 500.
embeddings_cont_fixed <- biconnected_network %>%
prepare_edges(., k = 500) %>%
prepare_continuous_force(., node_names = "name", force_var = "force") %>%
setse_auto(., k = "k")
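To build intuition for why the choice of k matters, here is a simplified two-node sketch in base R. It assumes a single edge that behaves as a Hookean spring, with two nodes a fixed horizontal distance `d` apart that can only move vertically under equal and opposite forces `f`. The function `equilibrium_dz()` is a hypothetical helper written for this illustration, not part of the package.

```r
# Vertical separation dz at which the vertical component of the spring
# tension balances the applied force f.
equilibrium_dz <- function(k, f = 1, d = 1) {
  net_force <- function(dz) {
    len <- sqrt(d^2 + dz^2)    # stretched length of the edge
    tension <- k * (len - d)   # Hooke's law: tension is k times the extension
    tension * dz / len - f     # vertical component of tension minus applied force
  }
  # net_force is negative at dz = 0 and positive for large dz,
  # so a root lies inside the interval.
  uniroot(net_force, interval = c(0, 1e3), tol = 1e-10)$root
}

dz_soft  <- equilibrium_dz(k = 5)    # weak spring
dz_stiff <- equilibrium_dz(k = 500)  # stiff spring, as in the fixed-k example

print(c(soft = dz_soft, stiff = dz_stiff))
```

Under these assumptions the stiffer spring settles at a smaller elevation difference for the same force, which is one reason fixed and variable k can give different embeddings.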
By aggregating the tension in each edge to node level using
create_node_edge_df() for both embedding methods, we
can see how the different nodes are embedded. The two
most central nodes experience much more tension and have a
substantially higher elevation than the other nodes. This is expected
because, in biconnected_network, the node force is the centrality of
the nodes.
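The idea of aggregating edge tension to node level can be sketched in base R. The example below is not the implementation of create_node_edge_df(); it uses a hypothetical three-edge table and assumes that a node's mean tension is the mean over all edges touching that node.

```r
# Hypothetical edge table; the tension values are illustrative only.
edges <- data.frame(
  from    = c("a", "a", "b"),
  to      = c("b", "c", "c"),
  tension = c(2, 4, 6),
  stringsAsFactors = FALSE
)

# Each edge contributes its tension to both of its end nodes.
long <- data.frame(
  node    = c(edges$from, edges$to),
  tension = rep(edges$tension, times = 2),
  stringsAsFactors = FALSE
)

# Mean tension per node, with rows ordered by node name.
node_tension <- aggregate(tension ~ node, data = long, FUN = mean)
names(node_tension)[2] <- "tension_mean"

print(node_tension)
```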
We can also see that the embeddings are similar, but using a fixed or a variable k has a clear impact on the final embeddings.
continuous_results <- bind_rows(create_node_edge_df(embeddings_cont) %>% mutate(type = "variable k"),
create_node_edge_df(embeddings_cont_fixed) %>% mutate(type = "fixed k")
)
continuous_results %>%
ggplot(aes(x = tension_mean, y = elevation, colour = node)) + geom_jitter() +
facet_wrap(~type) +
labs(title = "Continuous embeddings",
x = "mean tension")
Now we will use the group identity as a binary force variable. The network is made up of two groups, A and B. We arbitrarily set A to be the positive force.
As can be seen, embeddings that use the groups as force variables produce very different results, even though the networks are identical.
For factors with more than two levels, the high-dimensional setse should be used.
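One way to picture a binary force variable is as an indicator for group A that is centred so the network stays balanced. The base R sketch below illustrates that idea with hypothetical group labels. Whether the package encodes the force in exactly this way is an assumption, so treat it as a sketch rather than a description of prepare_categorical_force().

```r
# Hypothetical group membership for five nodes (not the package data).
group <- c("A", "A", "B", "B", "B")

# Indicator for membership of group A: 1 for A, 0 for B.
indicator <- as.numeric(group == "A")

# Centre the indicator so the forces sum to zero:
# A nodes receive a positive force and B nodes a negative one.
force_group_A <- indicator - mean(indicator)

print(force_group_A)
```

Choosing B as the positive group instead would flip the sign of every force.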
embeddings_binary <- biconnected_network %>%
prepare_edges(.) %>%
prepare_categorical_force(., node_names = "name", force_var = "group") %>%
setse_auto(.,
force = "group_A",
k = "weight")
#> Warning: There was 1 warning in `dplyr::mutate()`.
#> ℹ In argument: `dplyr::across(...)`.
#> Caused by warning:
#> ! Using `across()` without supplying `.cols` was deprecated in dplyr 1.1.0.
#> ℹ Please supply `.cols` instead.
embeddings_binary_fixed <- biconnected_network %>%
prepare_edges(., k = 500) %>%
prepare_categorical_force(., node_names = "name", force_var = "group") %>%
setse_auto(.,
force = "group_A",
k = "k")
binary_results <- bind_rows(create_node_edge_df(embeddings_binary) %>% mutate(type = "variable k"),
create_node_edge_df(embeddings_binary_fixed) %>% mutate(type = "fixed k")
)
binary_results %>%
ggplot(aes(x = tension_mean, y = elevation, colour = node)) + geom_jitter() +
facet_wrap(~type) +
labs(title = "Binary embeddings",
x = "mean tension")
Because this network has two features, we can embed it as a
high-dimensional network using the setse_auto_hd
function.
What we see is that, in this case, there is very little relationship between a node's elevation under one feature and its elevation under the other.
two_dimensional_embeddings <- biconnected_network %>%
prepare_edges(.) %>%
#prepare the continuous features as normal
prepare_continuous_force(., node_names = "name", force_var = "force") %>%
#prepare the categorical features as normal
prepare_categorical_force(., node_names = "name", force_var = "group") %>%
#embed them using the high dimensional function
setse_auto_hd(., force = c("group_A", "force"), k = "weight")
#> Warning: The `x` argument of `as_tibble.matrix()` must have unique column names if
#> `.name_repair` is omitted as of tibble 2.0.0.
#> ℹ Using compatibility `.name_repair`.
#> ℹ The deprecated feature was likely used in the rsetse package.
#> Please report the issue at <https://github.com/JonnoB/rSETSe/issues>.
#> This warning is displayed once every 8 hours.
#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
#> generated.
two_dimensional_embeddings_fixed <- biconnected_network %>%
prepare_edges(., k = 500) %>%
#prepare the continuous features as normal
prepare_continuous_force(., node_names = "name", force_var = "force") %>%
#prepare the categorical features as normal
prepare_categorical_force(., node_names = "name", force_var = "group") %>%
#embed them using the high dimensional function
setse_auto_hd(., force = c("group_A", "force"), k = "k")
bind_rows(two_dimensional_embeddings$node_embeddings %>% mutate(type = "variable k"),
two_dimensional_embeddings_fixed$node_embeddings %>% mutate(type = "fixed k")) %>%
#The elevation variables are renamed for simplicity
rename(categorical = elevation_group_A,
continuous = elevation_force) %>%
ggplot(aes(x = categorical, y = continuous, colour = node)) + geom_jitter() +
facet_wrap(~type) +
labs(title = "Node elevation for two different features",
x = "elevation with categorical embedding",
y = "elevation with continuous embedding")
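The relationship between the two elevations can also be summarised numerically, for example with a rank correlation per embedding type. The sketch below uses a small hypothetical data frame in place of the real node embeddings. The column names elevation_group_A and elevation_force follow the ones used above, and the values are illustrative only.

```r
# Hypothetical stand-in for the combined node embeddings (values are made up).
node_embeddings <- data.frame(
  type              = rep(c("variable k", "fixed k"), each = 4),
  elevation_group_A = c(1, 2, 3, 4, 1, 2, 3, 4),
  elevation_force   = c(2, 1, 4, 3, 4, 3, 2, 1),
  stringsAsFactors  = FALSE
)

# Spearman rank correlation between the two elevations, one value per type.
elevation_cor <- sapply(
  split(node_embeddings, node_embeddings$type),
  function(df) cor(df$elevation_group_A, df$elevation_force, method = "spearman")
)

print(elevation_cor)
```

A correlation near zero on the real embeddings would support the visual impression of little relationship between the two features.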