Social Network Analysis

June 3, 2023

A quick introduction to R
András Vörös
University of Manchester, Department of Social Statistics
SOST71032 Social Network Analysis
A few initial words about R
1. R is an object-oriented programming language
instead of a single data table (obs-vars), we work with multiple objects
that can be data tables, vectors, etc.
operations on matrices can be done very easily and quickly (without
looping) thanks to efficient functions
this is good news for network researchers, who usually have to work
with different kinds of data at the same time
2. R is a collection of statistical packages
it is free to use and open for the addition of new functionalities
result: many contributed packages – already programmed solutions
for problems – tools are developed by researchers themselves
this is good news for everyone
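The first point can be illustrated with a minimal sketch in base R. The object names and the friendship data below are made up for illustration; only the functions (rowSums, colSums, t, sum) come from the base package.

```r
# Several kinds of objects live side by side in one session
ages   <- c(ann = 23, bob = 31, cat = 27, dan = 45)     # named numeric vector
people <- data.frame(name = names(ages), age = ages)    # data table (obs-vars)

# A small directed friendship network as an adjacency matrix (made-up data):
# friends[i, j] == 1 means that person i names person j as a friend
friends <- matrix(c(0, 1, 1, 0,
                    1, 0, 0, 0,
                    0, 1, 0, 1,
                    0, 0, 1, 0),
                  nrow = 4, byrow = TRUE,
                  dimnames = list(names(ages), names(ages)))

# Whole-matrix operations, no explicit loops needed
outdegree <- rowSums(friends)        # ties sent by each person
indegree  <- colSums(friends)        # ties received by each person
mutual    <- friends * t(friends)    # 1 where a tie is reciprocated
density   <- sum(friends) / (nrow(friends) * (nrow(friends) - 1))

print(outdegree)
print(density)
```

The vector, the data frame and the matrix are separate objects, which is typical of network research, where actor attributes and ties are used together.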
Introduction to R – practical session
Open your “favourite” R GUI (for me it’s RStudio)
Open the 1_R_introduction.R script from the downloaded materials
Packages in R: the base package
Basically, everything in R is about packages
these are bundles of functions (operations on objects), class
definitions, etc.
once installed, you can attach/detach them at any time from R
Your installation comes with some packages, including base
the base package contains much of what is needed to get things
going
definition of basic object classes (vector, matrix, data frame, etc.)
functions to work with objects (assign, arithmetic operators, etc.)
But the really nice thing in R is that users can contribute to it by making
their own packages
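Attaching and detaching can be sketched as follows. The splines and tools packages are used here only because they ship with every R installation; this choice is an assumption made for the example, and any installed package would work the same way.

```r
# Attach a package: its functions become available by name
library(splines)
attached_before <- "package:splines" %in% search()   # is it on the search path?

# Detach it again at any time
detach("package:splines")
attached_after <- "package:splines" %in% search()

print(c(before = attached_before, after = attached_after))

# A single function can also be called without attaching its package,
# using the package::function notation
title <- tools::toTitleCase("social network analysis")
print(title)
```

The `::` notation is handy when two attached packages define functions with the same name.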

Packages in R: contributed packages
Various, officially approved contributed packages include
Matrix: lots of handy matrix functions, most of which you don’t call
directly but which other packages require (e.g. NA-methods, but also
colSums and so on)
MASS: lots of example datasets and functions for statistical methods
(e.g. scaling, canonical corr. analysis) – originally for the Venables–
Ripley stat. book, but by now many packages “depend” on it (i.e. use
some of its functions)



oz: functions for plotting Australia’s coastline and state boundaries
There are plenty of contributed packages…
CRAN – The Comprehensive R Archive Network – official repository
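Installing from CRAN usually takes one line. The sketch below wraps it in a small helper; use_package is a hypothetical name introduced for this example and is not part of R.

```r
# Hypothetical helper: attach a package, installing it from CRAN first if needed
use_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    # downloads from the CRAN mirror set in options("repos")
    install.packages(pkg)
  }
  library(pkg, character.only = TRUE)
  invisible(TRUE)
}

# use_package("sna")   # uncomment to run; may need an internet connection

# How many packages are installed locally at the moment?
n_installed <- nrow(installed.packages())
print(n_installed)
```

Installation is a one-off step, while attaching with library() has to be repeated in every new R session.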

There are plenty of contributed packages…
R-Forge – Official package development site and community
Network-related packages
Packages that have something to do with networks are also plentiful
But which network-related packages should we use?
The network keyword can cover a lot of things:
SocialNetworks: generate networks based on “area radius” as a
measure of association – pretty specific, right?
NetIndices: food web analysis
netweavers: “Weighted Averages for Networks: This package
summarizes quantified peptide data, fits linear models on and performs
network analysis for proteomics mass spectrometry data.”
As is apparent, some packages address the specific problems of
particular scientific communities or groups
→ divergent terminology and implementation of methods

Network-related packages – more examples
A few more examples for specific-scope network-related packages:
NetComp: generate and compare networks; by biologists; some
one-liners are in there, e.g. binarizing weighted nets, difference,
intersection, union
NetCluster: clustering functions tailored to networks; still a beta
version; might be convenient later, but you can do these things in other,
more basic packages
netmeta: some meta-analysis tools (can also be done in other
packages)
nettools: high-level interface functions to calculate a few things,
e.g. infer a network between variables from standard datasets
(observations by variables), calculate distance between matrices;
methods choice not too flexible

Our cup of tea:
Flexible packages for social network analysis
Basics – things we don’t use directly but usually need:
network: defines the object class “network” and functions to work
with it; other packages use some of these, e.g. plotting, dyad count (I
find storing networks in network-class objects confusing, but this may
be only my problem)
networkDynamic: dynamic extensions to the network class; own
class of objects, functions to work with them
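A minimal sketch of what a network-class object looks like, assuming the network package is installed (the code is skipped otherwise). The three-node adjacency matrix is made up for illustration.

```r
# Made-up directed network: adj[i, j] == 1 means a tie from i to j
adj <- matrix(c(0, 1, 0,
                0, 0, 1,
                1, 1, 0),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

if (requireNamespace("network", quietly = TRUE)) {
  # Store the matrix as an object of class "network"
  net <- network::network(adj, matrix.type = "adjacency", directed = TRUE)
  print(class(net))

  # Functions the package provides for working with such objects
  print(network::network.size(net))        # number of nodes
  print(network::network.edgecount(net))   # number of ties
  print(network::network.dyadcount(net))   # number of dyads

  # Convert back to a plain adjacency matrix
  back <- as.matrix(net)
}
```

Converting back with as.matrix is a simple way out if the network class feels confusing.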

Flexible packages for social network analysis
Things we use (not full list):
sna: the basic yet flexible package; offers a range of descriptive tools
for SNA; really a lot of things, from density, centrality measures, graph
plots to blockmodeling, community detection, network regressions
(even some version of ERGMs, but it looks less flexible than other
implementations)
igraph: nice visualization and other tools (centrality, homophily
measures, clustering, subgraph counts); defines its own class of objects
(“igraph”) – a lot different from simple matrices; good for working with
large networks
statnet: integration of many SNA packages (network, ergm, tergm,
sna, relevent, …)
RSiena: Stochastic Actor-oriented Models for network (and behavior
co-)evolution
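To show how an igraph object differs from a simple matrix, here is a minimal sketch, assuming the igraph package is installed (the code is skipped otherwise). The adjacency matrix is made up for illustration.

```r
# Made-up directed network as a plain adjacency matrix
nodes <- c("ann", "bob", "cat", "dan")
adj <- matrix(c(0, 1, 1, 0,
                1, 0, 0, 0,
                0, 1, 0, 1,
                0, 0, 1, 0),
              nrow = 4, byrow = TRUE,
              dimnames = list(nodes, nodes))

if (requireNamespace("igraph", quietly = TRUE)) {
  # Convert the matrix into igraph's own object class
  g <- igraph::graph_from_adjacency_matrix(adj, mode = "directed")
  print(class(g))

  # Descriptives are computed on the igraph object, not on the matrix
  indeg <- igraph::degree(g, mode = "in")   # ties received by each node
  dens  <- igraph::edge_density(g)          # share of possible ties present
  print(indeg)
  print(dens)

  # And back to a plain (dense) adjacency matrix when needed
  adj_back <- igraph::as_adjacency_matrix(g, sparse = FALSE)
}
```

The sna package offers similar descriptives but works directly on matrices, so the two can be combined by converting back and forth.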

Flexible packages for social network analysis
Things we use (not full list):
relevent: comes with the statnet suite; package for fitting
Relational Event Models (REMs), a family of models for time-stamped
network data. It implements different types of REMs, which are tailored
to the study of tie-oriented network processes through time.
goldfish: The Goldfish package in R allows the study of time-stamped
network data using a variety of models. In particular, it
implements different types of Dynamic Network Actor Models
(DyNAMs), a class of models that is tailored to the study of
actor-oriented network processes through time.
ggraph: The grammar of graphics as implemented in ggplot2 is a
poor fit for graph and network visualizations … ggraph is an extension
of the ggplot2 API tailored to graph visualizations and provides the
same flexible approach to building up plots layer by layer.

Other packages that might be useful for you
Again, only a few examples:
NetSim: simulate micro-models to study their impact on
macro-structures (SAOM, small-world, and other models)
ndtv: Network Dynamic Temporal Visualizations – make cool movies
to impress your audience; uses networkDynamic objects; part of the
statnet suite
multiplex: methods for multiplex networks, e.g. “bundle census”
(dyad census extended to multiple types of relations); however, these
problems can take so many forms, depending on your research, that
you probably have to program things for yourself in the end…
blockmodeling: an implementation of Generalized Blockmodels
Summary on packages
Why are packages useful?
they define object classes, generic functions for working with them
they are developed in response to demand, so a lot of popular methods
are already implemented somewhere
they are developed by people who are also users, researchers –
developments aim at solving problems, making life easier (with more or
less success)
you can choose to use what is already there, but you can also build on
existing packages and make your own functions/package without
having to start from nothing
Limits of packages
limited flexibility – in the end, you might need to program things for
yourself (but a lot of methods are available to save time)
time and energy costs of finding and learning new packages
One more thing, which can never be overemphasized:
How to get help with your R problems?
Use package documentation in R or from CRAN
Ultimate tool:
google your question (somebody has asked it before)
General R tutorials, websites – few useful links in the course scripts
What is in the base package?
https://stat.ethz.ch/R-manual/R-devel/library/base/html/00Index.html
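Much of this documentation can also be reached from inside R. A minimal sketch using only base and utils functions; the search terms are arbitrary examples.

```r
# Lookups that simply return character vectors
sum_functions <- apropos("Sums$")      # objects whose name ends in "Sums"
base_objects  <- ls("package:base")    # objects provided by the base package
print(sum_functions)
print(length(base_objects))

# Help pages open in a viewer, so only request them in an interactive session
if (interactive()) {
  print(help("rowSums"))                   # same as typing ?rowSums
  print(help(package = "base"))            # index of a whole package
  print(help.search("adjacency matrix"))   # search all installed documentation
}
```

When none of this helps, searching the web for the exact error message is usually the next step.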
Two examples for package help pages:
statnet website (+ tutorials for its components): statnet.org
RSiena website (+ manual): www.stats.ox.ac.uk/~snijders/siena/
Network analysis in R – practical session
Let’s return to some scripts!