For a word-count of the entire poetic corpus, click here.
You can download the Corpus of Old English from the Oxford Text Archive. The first four files contain the entire Anglo-Saxon poetic corpus, taken from Krapp & Dobbie's Anglo-Saxon Poetic Records.
To measure word frequency, 1) I removed all punctuation and extraneous matter from the files, leaving only words. 2) Then I substituted "th" for all thorns and eths, and roman letters for the other runes. 3) This left a very long string of words, so I substituted a carriage return for every space, leaving a single word on each line. 4) Finally, I piped a sort command to uniq, then to another sort ( sort FILENAME | uniq -c | sort -r ), following the advice of the Natural Language Group at USC. The results that follow are for file 1, which contains the contents of Oxford, Bodleian Library, MS Junius 11.
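The four steps above can be sketched as one shell pipeline. This is only an illustration under stated assumptions, not the exact commands used for these counts. The function name word_freq is hypothetical, and so are the mappings for ash and wynn, since the text says only that roman letters replaced "other runes". The sketch strips ASCII punctuation but does not remove any file-specific extraneous matter. It uses sort -rn rather than sort -r, so that counts are compared as numbers.

```shell
#!/bin/sh
# word_freq (hypothetical name): reads text on stdin and prints
# "count word" lines, most frequent word first.
# Usage (hypothetical filename): word_freq < junius.txt
word_freq() {
    # Step 1: delete ASCII punctuation.
    tr -d '[:punct:]' |
    # Step 2: thorn and eth become "th"; the ash and wynn
    # substitutions are assumptions, not taken from the text.
    sed -e 's/þ/th/g' -e 's/Þ/Th/g' -e 's/ð/th/g' -e 's/Ð/Th/g' \
        -e 's/æ/ae/g' -e 's/Æ/Ae/g' -e 's/ƿ/w/g' |
    # Step 3: turn every run of whitespace into one newline,
    # then drop any empty line left at the start.
    tr -s '[:space:]' '\n' |
    sed '/^$/d' |
    # Step 4: group identical words, count them, and sort by count.
    sort | uniq -c | sort -rn
}
```

Words with the same count may come out in a different order depending on the locale; setting LC_ALL=C before running the function makes the ordering reproducible.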
NB. Nouns are underlined.
Words used 100 or more times:
| Count | Word | Count | Word |
|---|---|---|---|
| 1079 | on | 280 | þaer |
| 942 | þa | 249 | þurh |
| 823 | þaet | 227 | þaes |
| 780 | and | 204 | of |
| 575 | þe | 186 | me |
| 520 | him | 169 | ofer |
| 497 | to | 165 | nu |
| 466 | he | 150 | is |
| 418 | ond | 147 | god |
| 371 | ne | 145 | þonne |
| 364 | waes | 143 | for |
| 351 | ic | 133 | aer |
| 346 | swa | 117 | under |
| 329 | se | 115 | we |
| 326 | þu | 112 | ac |
| 325 | mid | 111 | wordum |
| 321 | þam | 103 | aefter |
| 303 | his | 102 | eorðan |
| 294 | hie | 101 | þone |
| 287 | in | | |
Words used 50 to 99 times:
| Count | Word | Count | Word |
|---|---|---|---|
| 97 | siððan | 68 | waldend |
| 97 | drihten | 67 | bearn |
| 92 | wið | 66 | hwaet |
| 92 | hine | 65 | seo |
| 86 | waeron | 65 | gode |
| 83 | wearð | 64 | ongan |
| 82 | wuldres | 64 | hu |
| 82 | aet | 63 | weard |
| 80 | þaere | 63 | sunu |
| 78 | eft | 62 | faeder |
| 77 | word | 62 | ealle |
| 77 | heo | 61 | þeah |
| 76 | waere | 61 | maeg |
| 74 | wolde | 60 | us |
| 74 | gif | 59 | þara |
| 74 | engla | 58 | up |
| 73 | oððaet | 58 | þin |
| 73 | cyning | 57 | forð |
| 72 | haefde | 56 | ymb |
| 70 | ge | 56 | hit |
| | | 56 | gewat |
| | | 56 | ece |
| | | 55 | be |
| | | 54 | sceolde |
| | | 54 | biþ |
| | | 51 | sceal |
| | | 51 | her |
| | | 51 | heora |