Harry Potter Network Analysis

Published on 19 April 2021
Transcript
00:04
Have you ever wondered what the social network of Harry Potter looks like?
00:09
This project analyses the character network in the 7 Harry Potter books: 7 MB of text, 199 chapters, and 1 million words.
00:17
The books will mainly be used to investigate two questions...
00:23
What does the word sentiment in each community look like?
00:28
Can Harry Potter be used as a toy example of community splits, like Zachary's Karate Club?
00:36
So why use Harry Potter?...
00:40
The world of Harry Potter is interesting because it is well described.
00:45
And has clearly defined communities like the Hogwarts Houses.
00:53
So which elements do we need to perform our experiments?
00:57
First we need the 7 books...
00:59
They are available on GitHub in a digital text format.
01:03
Then a mapping of selected characters to their respective Hogwarts Houses
01:07
Generously provided by hp-lexicon.com, whose website we scraped...
01:12
Finally, a lexicon for sentiment analysis.
01:15
Obtained from: hedonometer.org and VADER
01:18
Then we stir it all together with creativity to achieve our research goal...
01:27
So what have we done so far?...
01:31
We split the book texts into documents for each chapter
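A minimal sketch of how such a chapter split could look. The transcript does not say how chapters are marked in the text files, so the heading pattern below (a line starting with "CHAPTER") and the function name are assumptions for illustration only.

```python
import re

# Assumed heading format: any line that starts with "CHAPTER" (hypothetical;
# the real files may mark chapters differently).
CHAPTER_HEADING = re.compile(r"^CHAPTER\b.*$", flags=re.MULTILINE | re.IGNORECASE)

def split_into_chapters(book_text):
    """Return one document (string) per chapter of a single book."""
    parts = CHAPTER_HEADING.split(book_text)
    # parts[0] is whatever precedes the first heading (title page etc.), so skip it.
    return [part.strip() for part in parts[1:] if part.strip()]
```

Calling this once per book and concatenating the results would give one flat list of chapter documents.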
01:35
We then cleaned the documents by removing stop words, new lines, etc.
01:39
And mapped all versions of character names to one name per character, to get around all the nicknames...
01:52
Before the cleaning:
01:54
Snape started the class by taking the roll call, and he paused at Harry’s name. “Ah, yes,” he said softly, “Harry Potter.”
02:03
After the cleaning:
02:03
Severus started class taking roll call, paused Harry name. Ah, yes, said softly, Harry.
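The cleaning and name-mapping steps can be sketched roughly as follows. The stop-word list and alias table here are tiny, hypothetical stand-ins (the project's actual lists are not given), so the output will not match the example above word for word.

```python
import re

# Hypothetical, deliberately tiny lists for illustration only.
STOP_WORDS = {"the", "by", "and", "he", "at", "a", "an", "of", "to"}
ALIASES = {"snape": "Severus", "harry potter": "Harry", "potter": "Harry"}

def clean(text):
    # Collapse new lines and repeated whitespace into single spaces.
    text = re.sub(r"\s+", " ", text).strip()
    # Replace longer aliases first so "Harry Potter" is handled before "Potter".
    for alias in sorted(ALIASES, key=len, reverse=True):
        pattern = rf"\b{re.escape(alias)}\b"
        text = re.sub(pattern, ALIASES[alias], text, flags=re.IGNORECASE)
    # Drop stop words, ignoring case and surrounding punctuation.
    kept = [w for w in text.split(" ")
            if w.lower().strip(".,!?\"'") not in STOP_WORDS]
    return " ".join(kept)
```

Mapping aliases before removing stop words keeps multi-word names such as "Harry Potter" intact long enough to be matched.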
02:11
We then created our initial character network as an undirected graph according to co-occurrences in the chapters.
02:19
The initial graph has 85 nodes and 1465 edges, with an average shortest path length of 1.5 and an average clustering coefficient of 0.83.
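To illustrate how a co-occurrence graph and these statistics can be computed, here is a small pure-Python sketch. It assumes the chapter documents are already cleaned so that each character appears under one single-word name; all function names are hypothetical, and a graph library would normally do this work.

```python
import re
from collections import Counter, deque
from itertools import combinations

def cooccurrence_edges(chapters, characters):
    """Count, per character pair, the chapters in which both names appear."""
    weights = Counter()
    for text in chapters:
        tokens = set(re.findall(r"\w+", text))
        present = sorted(c for c in characters if c in tokens)
        weights.update(combinations(present, 2))
    return weights

def adjacency(edges):
    """Undirected adjacency sets; characters without any edge are left out."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def average_clustering(adj):
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbours contribute 0
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)

def average_shortest_path(adj):
    """Mean breadth-first distance over all reachable ordered pairs."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs
```

The number of nodes is then `len(adj)` and the number of edges is `len(edges)`.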
02:35
Moving forward, we will:
02:37
- Compare community splits with the grouping of characters into their Hogwarts Houses
02:41
- Adjust edge weights based on positive or negative sentiment between the characters
02:45
- Investigate each Hogwarts House via word clouds and sentiment analysis
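As a rough sketch of how the first two steps might be approached: the code below gives each character pair a signed weight from the sentiment of the sentences they share, and scores a community split against the Hogwarts Houses with a simple purity measure. The lexicon values, the sentence-level scoring and the purity measure are all assumptions made for illustration, not the method the project will necessarily use.

```python
import re
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical word scores in [-1, 1]; a real run would load a lexicon instead.
LEXICON = {"friend": 0.8, "laughed": 0.6, "hate": -0.9, "sneered": -0.5}

def sentence_score(sentence):
    """Mean lexicon score of the scored words in a sentence (0.0 if none)."""
    words = re.findall(r"[a-z']+", sentence.lower())
    scores = [LEXICON[w] for w in words if w in LEXICON]
    return sum(scores) / len(scores) if scores else 0.0

def signed_edge_weights(sentences, characters):
    """Sum sentence sentiment over every pair of characters sharing a sentence."""
    weights = defaultdict(float)
    for sentence in sentences:
        tokens = set(re.findall(r"\w+", sentence))
        present = sorted(c for c in characters if c in tokens)
        score = sentence_score(sentence)
        for pair in combinations(present, 2):
            weights[pair] += score
    return dict(weights)

def purity(communities, house_of):
    """Share of characters that sit in the majority house of their community."""
    matched = total = 0
    for members in communities:
        houses = Counter(house_of[m] for m in members if m in house_of)
        if houses:
            matched += houses.most_common(1)[0][1]
            total += sum(houses.values())
    return matched / total if total else 0.0
```

A purity of 1.0 would mean every detected community contains characters from a single house; lower values indicate that communities cut across houses.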
02:51
Thank you for watching!
02:53
Produced by Lukas, Peter and William