Comments

quarterthrowback t1_jccxrmz wrote

The city Raqqa must account for 8,999 of those 9k...

92

OfficialWireGrind OP t1_jccva32 wrote

The bar chart counts the occurrences of double letters in all of English Wikipedia's article text.

Data Source: English Wikipedia's April 1st, 2022 article data dump

Tools: Python, Matplotlib
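
A minimal sketch of how a count like this could be produced, not the OP's actual script. The function name `count_double_letters` is hypothetical, and the sketch assumes case-insensitive matching on the letters a to z, with overlapping pairs counted separately (so "aaa" yields two "aa" pairs). The chart may have used different rules.

```python
from collections import Counter
import string


def count_double_letters(text: str) -> Counter:
    """Count adjacent identical-letter pairs (aa..zz), ignoring case."""
    counts = Counter()
    lowered = text.lower()
    # Walk over every adjacent pair of characters in the text.
    for first, second in zip(lowered, lowered[1:]):
        if first == second and first in string.ascii_lowercase:
            counts[first + second] += 1
    return counts


print(count_double_letters("Hawaii llama skiing").most_common())
```

The resulting `Counter` could be passed to Matplotlib's `plt.bar` to draw a bar chart of the pairs.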

17

sckurvee t1_jcdzwyx wrote

Wikipedia's got more llamas than accuracy.

17

Gr1ff1n90 t1_jce980l wrote

Omg! 😂 This made me snort out loud reading it

0

solarmelange t1_jcd1o0v wrote

They need to go case sensitive, eliminating double capitals. That is clearly where ii are coming from.
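
A rough sketch of what a case-sensitive count could look like, assuming only lowercase pairs are kept so that uppercase Roman numerals such as "XIII" or "XX" drop out. The function name `count_lowercase_doubles` is made up for illustration.

```python
from collections import Counter


def count_lowercase_doubles(text: str) -> Counter:
    """Count adjacent identical pairs of lowercase a-z letters only."""
    counts = Counter()
    for first, second in zip(text, text[1:]):
        # No lowercasing here, so "II" in a Roman numeral is skipped.
        if first == second and "a" <= first <= "z":
            counts[first + second] += 1
    return counts


print(count_lowercase_doubles("Louis XIII went skiing in Hawaii"))
```

Lowercase Roman numerals (for example "iii" in page numbering) would still be counted under this rule.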

10

DameKumquat t1_jcd4qen wrote

And skiing.

20

OfficialWireGrind OP t1_jcdrsql wrote

And Hawaii and Pompeii. Roman numerals do make up a lot of the ii's, roughly 75% of them. A lot of the xx's too.

20

imlookingatarhino t1_jcdqd5v wrote

I'm gonna put so many Q's in articles now. Gotta* work on those numbers

10

Only-Engineering6586 t1_jce8i7t wrote

Seems right, I’ve seen a couple double dd’s in my dday

5

PrompteRaith t1_jcdvsht wrote

I would expect XX to be much higher (genetics, the band, etc)

1

T-Dex_the_T-Rex t1_jcffbm2 wrote

Interestingly, in terms of words with consecutive double letters, there is only 1 word with 3 consecutive double letters and only 1 word with 4 consecutive double letters. These words are Bookkeeper and Subbookkeeper respectively.
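
For anyone who wants to check a word list themselves, here is a small sketch that measures the longest run of back-to-back double letters in a word. The helper name `max_consecutive_doubles` is hypothetical. It only measures individual words and does not verify the claim that these two are the only ones.

```python
def max_consecutive_doubles(word: str) -> int:
    """Return the longest run of back-to-back double letters in a word."""
    w = word.lower()
    best = 0
    # Try every starting position and count how many doubles follow in a row.
    for start in range(len(w)):
        i, run = start, 0
        while i + 1 < len(w) and w[i] == w[i + 1]:
            run += 1
            i += 2  # jump past the pair just matched
        best = max(best, run)
    return best


for word in ("llama", "bookkeeper", "subbookkeeper"):
    print(word, max_consecutive_doubles(word))
```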

1

CaptainBentham t1_jcfix2a wrote

How many of those zz’s are from the ZZ Top article

1

popeter45 t1_jcdpu6g wrote

So how many of the LL are just Welsh words/places?

−1

glidespokes t1_jcejo3e wrote

Probably not all of them because Spanish exists.

2

[deleted] t1_jcepbx7 wrote

There's also lots of double Ls in English, such as pull, allow, traveller...

4

glidespokes t1_jces3pn wrote

Same in German. There (and in English too, I believe?) it simply means the preceding vowel is pronounced short.

1

[deleted] t1_jcf02vn wrote

As with so many other 'rules' of English, there's lots of exceptions

1

Waxoplax t1_jce8pmx wrote

What does the k in the number stand for?

−1

[deleted] t1_jce96xt wrote

[deleted]

5

Waxoplax t1_jcf6hkz wrote

That’s what I thought, but how can there be 43 million words with ll when there’s roughly 200k words in the English vocabulary 🤔🤯

−1

dhkendall t1_jcfanit wrote

Each occurrence. I’m sure that, for example, the word “shell” appears more than once in Wikipedia, but it only appears once in the list of words in the English vocabulary.
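
To make the difference concrete, a tiny sketch that counts both every occurrence of a word containing "ll" and the number of distinct such words. The function name `ll_occurrences_and_distinct` and the sample sentence are made up for illustration, and the chart itself counts letter pairs rather than words.

```python
import re


def ll_occurrences_and_distinct(text: str) -> tuple[int, int]:
    """Return (occurrences, distinct words) for words containing 'll'."""
    words = re.findall(r"[a-z]+", text.lower())
    with_ll = [w for w in words if "ll" in w]
    # Every appearance counts once in the first number;
    # repeated words collapse to one entry in the second.
    return len(with_ll), len(set(with_ll))


print(ll_occurrences_and_distinct("The shell is a shell. All shells tell."))
```

Here "shell" appears twice in the text but is only one distinct word, which is why the occurrence count can far exceed the size of the vocabulary.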

3

Waxoplax t1_jcfuvws wrote

Ohh, each occurrence.. gotcha. Missed that, I thought it was just the number of individual words and I was hella confused

2

Loosestool421 t1_jcdr5y7 wrote

Double L is the Latin influence.

−2