SOST71032 Social Network Analysis
Introduction to Statistical Network Analysis
Assignment 1
Dr Termeh Shafie
Department of Social Statistics
School of Social Sciences
The University of Manchester
Day 4
SOST71032 Social Network Analysis Introduction Day 4 1 / 15
welcome to part II of the course
overview
day 4:
• introduction to statistical network analysis
• parametric and non-parametric approaches
• conditional uniform graph distributions
• random graph models
• introduction to R (with András, Alejandro & Pete)
day 5–6 (with András):
• models for cross-sectional network data
– exponential random graph models
• models for longitudinal network data
– stochastic actor oriented models
software: R
• we use the statistical software R
• install R from
https://cran.r-project.org/
• install RStudio from
https://rstudio.com/
• a (very) short introduction to R here
what you need to know:
• import data
• work with vectors, matrices, data frames and network objects
• using functions
• primary packages:
statnet, igraph, snahelper, RSiena
statistical network analysis
what is statistical network analysis?
what is statistics?
descriptives
• simplify, present data
• describe, summarize data
looking at the data and reporting what’s there (∼part I of the course)
inference
• generalize from sample to population
• conclusion about the population based on your data
concluding with respect to the underlying population from which the data come (∼part II of the course)
statistical inference
the science of changing your mind under uncertainty
– a default action (frequentist) or a prior belief (Bayesian)
– hypotheses describe what the world might look like:
the null hypothesis describes the world with the default action
the alternative hypothesis is all other worlds
does our evidence [data] make the null hypothesis look ridiculous or unlikely?
making decisions based on facts [parameters] means having information
about all the facts
…but most of the time we don’t have all the information!
instead what we know [sample]
is different from what we wish we knew [population]
so we guess [estimate] under uncertainty
maths [probabilities, distributions] allows us
to express rules governing the null hypothesis world
and build a toy model of the null hypothesis world
from which we believe our data come
since we don’t have facts,
we combine data with assumptions to make reasonable decisions
there’s no magic that makes certainty out of uncertainty!
statistical inference
relies on the assumption that there is some randomness in the data
we need to model this randomness
we will return to modelling this randomness later, but first
what is statistical network analysis?
(i.e. what makes it statistical analysis of networks?)
it’s all in the data…
network data
what is statistical network analysis?
‘[…]the study of the collection, management, analysis,
interpretation, and presentation of relational data.’
[Brandes, Robins, McCranie, Wasserman, 2013]
[figure: atomic data, dyadic data, network data]
make use of the added information
that network data gives in comparison to atomic data
network data
what is statistical network analysis?
features of network data
overlapping dyads, interdependencies
imply challenges from a statistical perspective
data availability improved over the last decades
• traditional data collection by questionnaires:
“please name your best friends”
• now more automatically logged from electronic communication:
telephone calls, emails, social media, online markets, etc.
⇒ opportunities and challenges
statistical analysis of network data
research questions about networks typically include
• how do networks work?
• where could we best manipulate a network (diffusion or brokerage)?
• how similar are networks?
• what are the building principles of these networks?
• can we learn from real-life networks to build efficient man-made ones?
from a statistical viewpoint, questions include
• how to best describe networks?
• how to infer characteristics of nodes in the network?
• how to infer missing links?
• how to predict functions from networks?
• how to find relevant sub-structures of a network?
statistical inference
relies on the assumption that there is some randomness in the data
we need to model this randomness
to judge if a network summary is ‘unusual’ or if a motif is ‘frequent’,
there is an underlying assumption of randomness in the network
network models
what is statistical network analysis?
the elements of network models
from Brandes et al. (2013), What is Network Science?
a specification of
• how the phenomenon (in general, i.e. more generally than this particular
case) is abstracted to a network
• how this conceptual network is represented in data
(e.g. measured or observed)
statistical models of network data
what would happen, if we measured the data again?
• at a different point in time
• on a different set of actors
• with different environmental factors
…
want to estimate expected outcome ± variability
⇒ to explain and predict social relations and behaviour
statistical models of network data
specify realistic probability distributions for social networks,
formalizing hypothetical dependencies in the data.
statistical network models serve several purposes:
explaining social relations and/or behaviour
– search for rules that govern the evolution of social networks
predicting social relations and/or behaviour
– learn from given data and predict the data yet to come
random generation of networks that look like real data
– algorithm engineering; empirical estimation of average runtime or
performance
– simulation of network processes
(e.g., information spreading, spread of disease)
statistical models of network data
specify realistic probability distributions for social networks,
formalizing hypothetical dependencies in the data.
• sample graphs at random from this probability distribution
• compare sampled to observed network on a feature of interest
• if the model is a good fit, then the sampled networks resemble the observed one
• modelled structural effects might explain emergence of network
how do we determine this?
• parametric methods
• non-parametric methods
(topic of next video lecture)
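the sample-and-compare steps above can be sketched in a few lines. This is only an illustration under stated assumptions: it is written in Python with the standard library rather than in R, the model is a simple Bernoulli graph (every dyad is a tie independently with the same probability), the feature of interest is the number of triangles, and the observed network is a made-up toy example, not course data. The names `sample_gnp` and `count_triangles` are hypothetical helpers defined here.

```python
import itertools
import random
import statistics


def sample_gnp(n, p, rng):
    """Sample an undirected Bernoulli graph on n nodes.

    Each of the n*(n-1)/2 dyads becomes a tie independently with probability p.
    The graph is returned as a set of frozenset({i, j}) edges.
    """
    return {
        frozenset(dyad)
        for dyad in itertools.combinations(range(n), 2)
        if rng.random() < p
    }


def count_triangles(n, edges):
    """Count the triads in which all three ties are present."""
    return sum(
        1
        for a, b, c in itertools.combinations(range(n), 3)
        if frozenset((a, b)) in edges
        and frozenset((a, c)) in edges
        and frozenset((b, c)) in edges
    )


# Toy observed network on 6 actors (hypothetical data for illustration only).
n = 6
observed = {
    frozenset(e)
    for e in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
}

# Assumption: the tie probability of the model is set to the observed density.
density = len(observed) / (n * (n - 1) / 2)

# Sample graphs from the model and record the feature of interest for each.
rng = random.Random(1)  # fixed seed so the sketch is reproducible
simulated = [count_triangles(n, sample_gnp(n, density, rng)) for _ in range(1000)]

obs_triangles = count_triangles(n, observed)
print("observed triangles:", obs_triangles)
print("simulated mean:", statistics.mean(simulated))
print("simulated sd:", statistics.stdev(simulated))
```

if the observed triangle count sits well inside the simulated values, this simple model reproduces that feature; if it sits far in a tail, the model is missing some structure (for example, a tendency towards closure).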
hypothesis testing
null hypothesis
H0: observed network is created from specified model that does X
alternative hypothesis
H1: observed network is not created from specified model that does X
if H0 is true we expect to see networks like those we simulate from the model, so:
decision rule
if simulated networks look like the observed in only α·100% of the cases
we would reject H0 at the α·100% significance level
otherwise, we cannot reject H0
[figure: distribution of measure under null model (axis: summary measure), with the observed network marked]
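the decision rule can be written out as a small Monte Carlo test. This is a sketch under assumptions the slides do not state: it is in Python rather than R, "looks like the observed" is read as "the simulated summary measure is at least as large as the observed one" (a one-sided, upper-tail test), and the usual +1 correction counts the observed network among the draws. The function names and the list of simulated values are hypothetical placeholders; in practice the values would come from networks simulated under H0.

```python
def monte_carlo_p_value(observed_stat, simulated_stats):
    """Share of draws (simulated plus the observed one) at least as large as observed.

    Assumption: large values of the summary measure count as evidence against H0.
    """
    at_least_as_extreme = sum(1 for s in simulated_stats if s >= observed_stat)
    return (1 + at_least_as_extreme) / (1 + len(simulated_stats))


def reject_null(observed_stat, simulated_stats, alpha=0.05):
    """Decision rule: reject H0 when the Monte Carlo p-value is at most alpha."""
    return monte_carlo_p_value(observed_stat, simulated_stats) <= alpha


# Placeholder for summary measures of 99 networks simulated under H0
# (a deterministic stand-in taking the values 0, 1, 2, 3; not real simulation output).
simulated_stats = [k % 4 for k in range(99)]

for observed_stat in (2, 4):
    p = monte_carlo_p_value(observed_stat, simulated_stats)
    decision = "reject H0" if reject_null(observed_stat, simulated_stats) else "cannot reject H0"
    print(f"observed = {observed_stat}: p = {p:.2f}, {decision}")
```

a two-sided version would count simulated values that are at least as far from the centre of the simulated distribution as the observed one; which tail matters depends on what "does X" means for the model in H0.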