Social Network Analysis

June 3, 2023

SOST71032 Social Network Analysis
Termeh Shafie
Written assignment
Conditional Uniform Graph Distributions
E-mail: [email protected]
Topic: Conditional Uniform Graph Distributions
Introduction
In this practical, we will use network data sets and calculate some network statistics in R. We
will then compare these statistics to the equivalent statistics for random networks that share certain
features with the observed network (a uniform graph distribution conditional on those features). These random networks then represent the null distribution.
Make sure you have the package statnet loaded:
library('statnet')
The Coleman Data
Load a dataset and extract adjacency matrix
We are going to use a data set, coleman, that is automatically loaded with statnet. To get information about it, type ?coleman and select Coleman's High School Friendship Data. This should
open a help file with information about the data set. Read the description of the data in the help
file in order to know what you are working with. To load the data in your session:
data(coleman)
As described in the help file, the data set is an array with 2 observations on the friendship nominations of 73 students (one for fall and one for spring). We will focus on the fall network here, and
create the adjacency matrix for the network:
friendsFall <- coleman[1,,]
We know that the network has 73 nodes (students). But how many edges/ties do we have here?
Calculate the number of ties in the network as the sum of the adjacency matrix (because it is
a directed network, each entry equal to 1 is exactly one tie):
sum(friendsFall)
Uniform graph distribution given expected density: U|E(L)
Calculate and store the density of the Coleman fall network. Density is given as the number of
present ties divided by the total number of possible ties in the network. We use the adjacency
matrix to calculate this:
FallDens <- sum(friendsFall)/(73*72)
FallDens
## [1] 0.04623288
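The density calculation can be checked by hand on a small example. The sketch below uses a hypothetical 4-actor network (the matrix toy is made up for illustration and is not part of the Coleman data) and only base R:

```r
# Hypothetical toy network: entry [i, j] = 1 means actor i nominates actor j.
toy <- matrix(c(0, 1, 1, 0,
                1, 0, 0, 0,
                0, 0, 0, 1,
                0, 0, 0, 0), nrow = 4, byrow = TRUE)

n <- nrow(toy)            # number of actors
ties <- sum(toy)          # number of directed ties present
possible <- n * (n - 1)   # ordered pairs of distinct actors (no self-nominations)
toy_density <- ties / possible
toy_density
```

For the fall network the same arithmetic is 243 / (73 * 72), which gives the value of FallDens shown above.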
To generate one random graph with the same density on average as the observed fall network, we
write:
g <- rgraph(n = dim(friendsFall)[1], m = 1, tprob = FallDens, mode = "digraph")
Make sure you understand all the arguments: n is the number of nodes, m the number of graphs to generate, tprob the probability of each tie, and mode = "digraph" requests a directed graph. The random network and the observed network may not have exactly the same number of edges, but in expectation the random network has the same density:
sum(g)
## [1] 265
sum(friendsFall)
## [1] 243
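To see why the tie count varies from draw to draw, here is a minimal base R sketch of the underlying model, in which every possible tie is switched on independently with the same probability. The helper draw_bernoulli_digraph is a hypothetical name used only for illustration; it is not claimed to be how rgraph is implemented.

```r
set.seed(1)
n <- 73
p <- 243 / (n * (n - 1))   # observed density of the fall network

# Hypothetical helper: each off-diagonal entry is 1 with probability p.
draw_bernoulli_digraph <- function(n, p) {
  a <- matrix(rbinom(n * n, size = 1, prob = p), nrow = n)
  diag(a) <- 0             # no self-nominations
  a
}

# Number of ties in 200 independent draws
tie_counts <- replicate(200, sum(draw_bernoulli_digraph(n, p)))
range(tie_counts)          # individual draws scatter around 243
mean(tie_counts)           # the average is close to 243
```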
Now we can plot the random network we generated next to the observed one:
par(mfrow = c(1, 2))
plot(as.network(friendsFall))
plot(as.network(g))
Question 1: Can you note any obvious differences in structure?
Uniform graph distribution given density: U|L
Now generate a random network with exactly the same density, using a slightly different function
called rgnm():
g <- rgnm(n = 1, nv = dim(friendsFall)[1], m = sum(friendsFall), mode = "digraph")
Calculate the out-degree distribution for this random network (in the adjacency matrix, row i holds the ties sent by actor i, so the row sums are the out-degrees):
g.outdegree <- table(rowSums(g))
To calculate the out-degree distribution for the observed network, we likewise take the row sums to get
the number of nominations each actor has made:
rowSums(friendsFall) # to get the out-degree for each of the 73 actors
outdegree <- table(rowSums(friendsFall)) # to get the out-degree distribution
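The relation between row sums, column sums and degrees is easy to confuse, so here is a small base R check on a hypothetical 4-actor network. It assumes the usual convention that entry [i, j] = 1 means that actor i nominates actor j; the matrix toy is made up for illustration.

```r
# Hypothetical toy network: row = sender, column = receiver.
toy <- matrix(c(0, 1, 1, 0,
                1, 0, 0, 0,
                0, 0, 0, 1,
                0, 0, 0, 0), nrow = 4, byrow = TRUE)

out_degree <- rowSums(toy)   # nominations made by each actor (activity)
in_degree  <- colSums(toy)   # nominations received by each actor (popularity)

out_degree
in_degree
table(out_degree)            # the out-degree distribution
```

Both degree vectors must add up to the total number of ties, which is a quick sanity check when working with a new data set.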
Let's plot these two out-degree distributions next to each other (note that we use the same x and y axis
limits in both panels so that they are comparable):
par(mfrow = c(1, 2))
plot(outdegree, ylim = c(0,17), xlim = c(0,11),
     main = 'observed', xlab = 'out-degree')
plot(g.outdegree, ylim = c(0,17), xlim = c(0,11),
     main = 'random', xlab = 'out-degree')
[Figure: out-degree distributions in two panels, 'observed' and 'random'; x-axis: out-degree, y-axis: frequency (outdegree and g.outdegree)]
Question 2: If you interpret nominating many others as 'being active', are there actors in the
observed data that are more active than expected by pure chance?
Exercise 1: Compare the in-degree distribution (interpreted as 'popularity') of the students in the
observed network to that of the generated random network.
In-degree out-degree assortativity
Is there an association between popularity and activity? We plot the in-degree (column sums) against the out-degree (row sums) for the observed and the random network:
par(mfrow = c(1, 2))
plot(colSums(friendsFall), rowSums(friendsFall), xlab = 'in-degree',
     ylab = 'out-degree', main = 'observed')
plot(colSums(g), rowSums(g), xlab = 'in-degree',
     ylab = 'out-degree', main = 'random')
[Figure: scatter plots of in-degree against out-degree in two panels, 'observed' and 'random'; x-axis: in-degree, y-axis: out-degree]
Question 3: Is there more of a pattern in the observed data set than in the random network? What
does that tell you?
Null, asymmetric, and mutual dyads
Tabulate the number of dyads that do not have any ties, have exactly one tie, and that have two
ties (i.e. are reciprocated or mutual dyads), see Figure 1. We calculate these numbers for both the
observed and the random network:
dyad.census(friendsFall) # observed network
## Mut Asym Null
## [1,] 62 119 2447
dyad.census(g) # random network
## Mut Asym Null
## [1,] 10 223 2395
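To make explicit what the dyad census counts, the following base R sketch classifies every unordered pair of actors by hand. The function dyad_census_base and the matrix toy are hypothetical names used for illustration; this is not the implementation of dyad.census, only the same counting rule.

```r
# Hypothetical helper: count mutual, asymmetric and null dyads.
dyad_census_base <- function(a) {
  both <- a + t(a)                    # 2 = mutual, 1 = asymmetric, 0 = null
  pair_state <- both[upper.tri(both)] # each unordered pair exactly once
  c(Mut  = sum(pair_state == 2),
    Asym = sum(pair_state == 1),
    Null = sum(pair_state == 0))
}

toy <- matrix(c(0, 1, 1, 0,
                1, 0, 0, 0,
                0, 0, 0, 1,
                0, 0, 0, 0), nrow = 4, byrow = TRUE)

census <- dyad_census_base(toy)
census
```

Two identities are useful for checking any dyad census: the three counts add up to n(n-1)/2, and 2 * Mut + Asym equals the number of ties. For the observed network, 62 + 119 + 2447 = 2628 = 73 * 72 / 2 and 2 * 62 + 119 = 243.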
Question 4: Are there any differences?
Figure 1: Directed dyad census
Figure 2: Directed triad census
Transitive triads and non-transitive triads
Among the 16 unique triads in Figure 2, which ones are transitive, non-transitive or both?
What is the count for the observed and for the random network?
triad.census(friendsFall) # observed network
## 003 012 102 021D 021U 021C 111D 111U 030T 030C 201 120D 120U 120C 210
## [1,] 50171 7384 3957 64 121 128 139 70 23 1 20 43 10 9 34
## 300
## [1,] 22
triad.census(g) # random network
## 003 012 102 021D 021U 021C 111D 111U 030T 030C 201 120D 120U 120C 210
## [1,] 47062 13154 607 317 292 635 50 40 20 9 3 1 1 5 0
## 300
## [1,] 0

Question 5: Are there any differences?
Uniform graph distribution given dyad census: U|MAN
Generate a new type of random graph, namely one that is random but has exactly the same number
of null, mutual and asymmetric dyads as the observed network:
g2 <- rguman(n = 1, nv = 73, mut = 62, asym = 119, null = 2447, method = "exact")
Repeat the check of the triad census:
triad.census(friendsFall)
## 003 012 102 021D 021U 021C 111D 111U 030T 030C 201 120D 120U 120C 210
## [1,] 50171 7384 3957 64 121 128 139 70 23 1 20 43 10 9 34
## 300
## [1,] 22
triad.census(g2)
## 003 012 102 021D 021U 021C 111D 111U 030T 030C 201 120D 120U 120C 210
## [1,] 50216 7344 3780 87 88 176 195 181 2 0 114 2 2 5 3
## 300
## [1,] 1
Question 6: Did anything change?
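One way to picture a draw from U|MAN is to shuffle the labels 'mutual', 'asymmetric' and 'null' over all pairs of actors and then pick a random direction for each asymmetric pair. The base R sketch below does exactly that; draw_uman is a hypothetical helper written for illustration and is not claimed to match the internals of rguman.

```r
# Hypothetical helper: one directed graph with a fixed dyad census.
draw_uman <- function(n, mut, asym, null) {
  pairs <- t(combn(n, 2))                       # all unordered pairs i < j
  stopifnot(mut + asym + null == nrow(pairs))
  # Shuffle the dyad labels over the pairs
  state <- sample(rep(c("M", "A", "N"), times = c(mut, asym, null)))
  a <- matrix(0, n, n)
  # Mutual pairs: a tie in both directions
  m_pairs <- pairs[state == "M", , drop = FALSE]
  a[m_pairs] <- 1
  a[m_pairs[, 2:1, drop = FALSE]] <- 1
  # Asymmetric pairs: one tie, direction chosen by a fair coin
  a_pairs <- pairs[state == "A", , drop = FALSE]
  flip <- runif(nrow(a_pairs)) < 0.5
  a_pairs[flip, ] <- a_pairs[flip, 2:1, drop = FALSE]
  a[a_pairs] <- 1
  a
}

set.seed(3)
a <- draw_uman(n = 73, mut = 62, asym = 119, null = 2447)
sum(a)             # 2 * 62 + 119 ties
sum(a * t(a)) / 2  # number of mutual dyads
```

Every graph produced this way has the requested dyad census, while everything above the dyad level (such as the triad census) is left to chance.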
Comments on the results thus far
In the previous practical we calculated some subgraph counts for some data sets (e.g. the coleman data set, which will be used for reference here). More specifically, we calculated how many
'mutual' (M), 'asymmetric' (A), and 'null' (N) dyads there were in the fall network.
We compared the dyad census to one (1) random graph constrained to have exactly the same density
as the observed network, that is, the (U|L) model. For this single random network we noted the
following:
The number of mutual dyads is much smaller than for the 'real' network, i.e. randomly allocating ties among actors is not going to give us as many reciprocated ties as in
the observed network.
The number of complete triads (MAN: 300) is 22 in the observed coleman data and 0 for the
random network. However, this might be an unfair comparison, as the complete 300 triangle
contains three reciprocated ties and we already established that the coleman data had many
more mutual dyads than the random network.
Next we randomly distributed ties in the network, but in such a way as to keep the number of
mutual, asymmetric and null dyads fixed at 62, 119, and 2447, respectively (that is, the (U|MAN)
model). As seen from the results, this still produced hardly any complete 300 triangles (1, compared with 22 in the observed network).
How do we interpret this? We can interpret this as
"had allocation of ties in the network been completely random given the 'dyadic processes', it would be
unlikely that we would observe this many complete triangles"
Can we then say that there are more complete triangles than we expect by chance? Just how likely
or unlikely is it to observe this many triangles? In order to answer this we need to produce a
world of hypothetical networks by generating a large number of random graphs.
The world of hypothetical networks
When investigating whether there was an unusually large number of reciprocated ties (mutual ties), we
saw that one random network, even though it had the same density as the observed graph, had
a completely different count of reciprocated ties. Was this just a coincidence? Are most random
networks different in this way? Let us check a larger number of random networks. Let's create 5
random networks with the same density as the observed Coleman network:
g <- rgnm(n = 5, nv = dim(friendsFall)[1], m = sum(friendsFall), mode = "digraph")
You now have 5 hypothetical (random) networks the way they could have looked had the tendency towards reciprocation not mattered. The first network is stored in g[1,,] in the array, the
second in g[2,,], and so on. We can calculate the dyad census for all of them by typing
dyad.census(g)
## Mut Asym Null
## [1,] 5 233 2390
## [2,] 6 231 2391
## [3,] 8 227 2393
## [4,] 8 227 2393
## [5,] 4 235 2389
Question 7: How many have as high a number of reciprocated ties as the observed network?
You can visualise the 5 networks:
par(mfrow = c(1, 5))
apply(g, 1, function(x) plot(as.network(x)))
The first line sets up a plotting area of 1 by 5 panels. The second line uses the apply
function to call plot(as.network(x)) for every network in the array (the 1 tells apply to loop over the first dimension, which indexes the networks).
These are now 5 networks from the population, or alternative world, of networks that we could get
had ties been distributed at random.
Compare observed to the distribution of mutual ties from the U|L model
To see just how unusual mutual ties are in the alternative world, we can generate 1000 random
networks while conditioning on the observed number of ties (the exact density):
g <- rgnm(n = 1000, nv = dim(friendsFall)[1], m = sum(friendsFall), mode = "digraph")
To plot the number of mutual dyads for each one, store the dyad counts in a matrix:
BigDyadCensus <- dyad.census(g)
Each row of the matrix BigDyadCensus lists the dyad count for the corresponding network; for
example, BigDyadCensus[10,] is the dyad census for the 10th random graph. We can draw the
histogram for the distribution of mutual dyads through:
mutTies <- BigDyadCensus[,1]
hist(mutTies, main = 'histogram of mutual ties', xlab = 'nr mutual ties', ylab = 'freq')
[Figure: 'histogram of mutual ties'; x-axis: nr mutual ties (about 2 to 14), y-axis: freq]
Note that we take all rows but only the first column by writing BigDyadCensus[, 1]. This is because
the first column holds the number of mutual dyads of each generated graph.
Question 8: Do any of the 1000 random networks have as large a number of mutual dyads as in
the observed fall network (you calculated the observed counts earlier)?
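The comparison in Question 8 can also be expressed as a number. The base R sketch below draws 1000 graphs with exactly 243 ties by sampling off-diagonal cells without replacement, counts the mutual dyads in each, and reports the share of draws that reach the observed count of 62. The helpers draw_ul and count_mutual are hypothetical names for illustration and are not the statnet functions used above.

```r
set.seed(42)
n <- 73        # actors
m <- 243       # observed number of ties
obs_mut <- 62  # observed number of mutual dyads

# Hypothetical helper: one digraph with exactly m ties, uniformly at random.
draw_ul <- function(n, m) {
  a <- matrix(0, n, n)
  off_diag <- which(row(a) != col(a))  # cells that may hold a tie
  a[sample(off_diag, m)] <- 1
  a
}

# Hypothetical helper: a dyad is mutual when both a[i, j] and a[j, i] are 1.
count_mutual <- function(a) sum(a * t(a)) / 2

sim_mut <- replicate(1000, count_mutual(draw_ul(n, m)))
max(sim_mut)              # largest mutual count among the random graphs
mean(sim_mut >= obs_mut)  # share of random graphs reaching the observed count
```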
Compare transitivity to a random distribution from the U|MAN model
Here we compare the observed number of transitive triads (the 030T triads) to the number expected
under the U|MAN model, which is our null model. We first calculate the observed dyad and triad
census:
obsDC <- dyad.census(friendsFall)
obsTC <- triad.census(friendsFall)
Next, we generate 1000 random graphs from the null model (that is, given the fixed dyad census
we observed). We also calculate the triad census of the randomly generated graphs:
g <- rguman(n = 1000, nv = dim(friendsFall)[1],
            mut = obsDC[1], asym = obsDC[2], null = obsDC[3], method = "exact")
gTC <- triad.census(g, mode = ’digraph’)
Finally, we plot the distribution of the number of transitive triads under the null model, and include
a red line showing where the observed number falls in this distribution (we also tidy up the histogram). Column 9 of the triad census is the 030T count:
hist(gTC[, 9], main = 'distribution under null',
     xlab = 'nr transitive triads', col = 'grey', xlim = c(0, 25))
abline(v = obsTC[, 9], col = "red", lwd = 3, lty = 2)
[Figure: histogram 'distribution under null'; x-axis: nr transitive triads (0 to 25), y-axis: Frequency; dashed red vertical line at the observed count]
Question 9: Are we observing more or fewer transitive triads than expected under the null
model?
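If a numerical answer is wanted in addition to the histogram, one common choice is an empirical (Monte Carlo) upper-tail p-value: the share of simulated statistics that are at least as large as the observed one. The sketch below is a hypothetical helper, not part of statnet; the added 1 in numerator and denominator is a conventional correction that counts the observed graph as one of the draws, and the vector sim is made-up illustrative data.

```r
# Hypothetical helper: empirical upper-tail p-value with the +1 correction.
mc_p_upper <- function(sim, obs) {
  (1 + sum(sim >= obs)) / (1 + length(sim))
}

# Made-up simulated statistics, for illustration only
sim <- c(0, 1, 1, 2, 2, 2, 3, 4, 5)

p_mid  <- mc_p_upper(sim, obs = 4)   # two draws are >= 4
p_tail <- mc_p_upper(sim, obs = 23)  # no draw is >= 23
p_mid
p_tail
```

With the objects from this section, the corresponding call would be mc_p_upper(gTC[, 9], obsTC[, 9]).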
Summary: different null distributions
1. Fixed expected density on average: U|E(L)
Let mynet be the adjacency matrix of the observed network.
To generate 100 random graphs with the same density on average as mynet if it is a directed
graph:
g <- rgraph(dim(mynet)[1], m = 100,
            tprob = sum(mynet)/(dim(mynet)[1]*(dim(mynet)[1] - 1)), mode = "digraph")
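To generate 100 random graphs with the same density on average as mynet if it is an undirected
graph (assuming mynet is a symmetric adjacency matrix, so that sum(mynet) counts every edge twice and the density is sum(mynet)/2 edges out of n(n - 1)/2 possible edges):
g <- rgraph(dim(mynet)[1], m = 100,
            tprob = sum(mynet)/(dim(mynet)[1]*(dim(mynet)[1] - 1)),
            mode = "graph")
Note that this is the same as the so-called Bernoulli graph, which we cover next week.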
2. Fixed number of ties for every graph: U|L
Let numact be the number of actors and mynumties the sum of the adjacency matrix of the observed network, sum(mynet).
To generate 100 random graphs with exactly mynumties ties for directed graphs:
g <- rgnm(100, numact, mynumties, mode = "digraph")
To generate 100 random graphs with the observed number of edges for undirected graphs (a symmetric adjacency matrix counts every edge twice, hence the division by 2):
g <- rgnm(100, numact, mynumties/2, mode = "graph")
3. Fixed number of mutual, asymmetric and null dyads: U|MAN
Let M, A and N correspond to the observed numbers of mutual, asymmetric and null dyads.
To generate 100 random graphs with M mutual dyads, A asymmetric dyads, and N null dyads:
g <- rguman(100, numact, mut = M, asym = A, null = N, method = "exact")
This only makes sense for directed networks. Can you understand why?
4. Fixed degree: U|d
This is a little more complicated. In order to generate random graphs given a fixed degree sequence you
need to use the package igraph and the function sample_degseq. Read about this function by
typing ?sample_degseq.
Note that this is the same as the so-called configuration model.
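The idea behind the configuration model can be illustrated in base R with simple stub matching: every actor gets as many outgoing 'stubs' as its out-degree and as many incoming stubs as its in-degree, and the two sets of stubs are paired at random. The function draw_stub_matching is a hypothetical sketch for illustration only; it is not the igraph implementation, and as written it can produce self-ties and repeated ties.

```r
# Hypothetical helper: random pairing of out-stubs with in-stubs.
draw_stub_matching <- function(out_deg, in_deg) {
  stopifnot(sum(out_deg) == sum(in_deg))          # every tie has two ends
  n <- length(out_deg)
  senders   <- rep(seq_len(n), times = out_deg)   # one entry per out-stub
  receivers <- rep(seq_len(n), times = in_deg)    # one entry per in-stub
  receivers <- receivers[sample.int(length(receivers))]  # random pairing
  counts <- matrix(0, n, n)
  for (k in seq_along(senders)) {
    counts[senders[k], receivers[k]] <- counts[senders[k], receivers[k]] + 1
  }
  counts  # entry [i, j] = number of ties from i to j (may exceed 1)
}

set.seed(7)
out_deg <- c(2, 1, 1, 0)   # made-up degree sequences for illustration
in_deg  <- c(1, 1, 1, 1)
w <- draw_stub_matching(out_deg, in_deg)
rowSums(w)   # reproduces out_deg
colSums(w)   # reproduces in_deg
```

Whatever the random pairing, the row and column sums reproduce the degree sequences, which is the defining constraint of U|d.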