Nomen est omen?

“Nomen est omen”. You have probably heard this saying, whose origin is attributed to the Roman dramatist Plautus some two hundred years before Christ. However, the history of names is far more ancient, so ancient that no one knows where the story begins; it is probably as old as human language itself. Since written history began, and as far back as oral history reaches, people have had names. In the beginning, most names appear to have had some sort of original meaning, usually descriptive (developed from both nouns and adjectives), rather than being simply a pleasing collection of sounds… think of something like Native American names, e.g. Sitting Bull. Early in prehistory, some descriptive names began to be used again and again until they formed a name pool for that culture. Parents would choose names from the pool of existing names rather than invent new ones for their children. As time went on, the language changed, and in many cases the words that formed the original name passed out of use, leaving a fossilized form in the name. This is why we do not recognize the meanings of many names today: their origins lie in ancient languages, in words that have passed out of use. With the rise of Christianity, certain trends in naming practices emerged. Christians were encouraged to name their children after saints and martyrs of the church. The oldest of these names were of Jewish and Greco-Roman origin: Jewish names of the prominent early Christians mentioned in the New Testament, such as Mary, Joseph or John, and Greco-Roman names such as Anthony, Catherine, Nicholas and Paul, in commemoration of the martyrs and saints. This is the case with many Croatian names as well, as we are going to see.


Okay, now you are getting an idea of what this post is going to be about… we are going to do a quantitative analysis of a corpus of Croatian names, in particular baby naming trends over the years, but we are also going to present some other interesting findings, so stick around. So how did we get the data? Well, there is a cool webpage called Acta Croatica, where you can research your family name, its history and ancestors, and where you can also search Croatian names and get something like the image above.

At first, I thought “relative popularity” meant the percentage of the total population in that year, but no: once we scraped the names, we figured out that all the data points for a name add up to 100, so this “relative popularity” refers to the popularity/frequency of the name (given to a baby) in a given year compared to the same name in other years. At first this was a bummer, but not so fast. The Croatian Bureau of Statistics compiled the results of the 2011 census and exposed a little service where you can search by name (or last name) and find out how many people with a given name were living in Croatia in 2011.

Now we are making progress, but not so fast… data from Acta Croatica gives us the relative frequency of people born with a given name, while data from the Croatian Bureau of Statistics gives us the number of living people with that name, and obviously not all people born in, let’s say, 1940 are still alive, right? So we must account for this, and after a bit of searching, the Croatian Bureau of Statistics comes to the rescue again, with mortality tables. By combining these three sets of info, we can finally calculate the absolute number of people born with a given name in a given year. Voilà!
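For the curious, here is a minimal sketch of how the three sources could be combined. It is one plausible reading of the method, not necessarily the exact calculation behind the charts: we assume births are proportional to the relative popularity curve, and we pick the scale so that the expected number of survivors matches the 2011 census count. The function name, the parameter names and the numbers are made up for illustration, and the sketch ignores migration and name changes.

```python
def estimate_births(relative_popularity, survival_to_census, living_at_census):
    """Scale one name's relative popularity curve to absolute births per year.

    relative_popularity: {birth_year: share} for a single name (shares sum to ~100)
    survival_to_census:  {birth_year: probability of being alive at the census}
    living_at_census:    number of living people with that name at the census
    """
    # Survivors we would expect if one unit of "share" equalled one birth.
    expected_survivors = sum(
        share * survival_to_census[year]
        for year, share in relative_popularity.items()
    )
    if expected_survivors <= 0:
        raise ValueError("no expected survivors; cannot scale the curve")
    # Births per unit of share, chosen so that survivors match the census count.
    scale = living_at_census / expected_survivors
    return {year: share * scale for year, share in relative_popularity.items()}


# Made-up numbers, purely for illustration.
popularity = {1940: 50.0, 1980: 50.0}
survival = {1940: 0.5, 1980: 1.0}
births = estimate_births(popularity, survival, living_at_census=1500)
print(births)  # {1940: 1000.0, 1980: 1000.0}
```

In the toy example, half of the 1940 cohort is assumed to have survived, so 1,000 births in each year leave 500 + 1,000 = 1,500 living people, which matches the census figure we fed in.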


Now it’s time to see if we can dig up something interesting… for a start, I would like to know the all-time name rankings, and by all-time let’s say from 1940 to 2010, because the reliability of data prior to 1940 is a little questionable. There are some usual suspects, but also a lot of surprises, names I would not have rated so high, such as Stjepan and Željko in the male names category and Mirjana and Nada for females. For our international readers, Ivan is John, Josip is Joseph… Marko, Marija and Ana I think you can figure out, while Ivana is Jane or Joan. As you can see, there is a strong influence of the Christian naming tradition that we mentioned at the beginning of the post. If you are curious about other names and their meanings, Google is your friend 😊


These were the all-time name rankings (i.e. 1940-2010), but let’s see naming trends over the decades. As we can see, it’s a diverse group of names and tastes certainly changed over the years, but Ivan and Josip remained popular choices and stayed in the TOP10 through the whole 70-year period. Milan, which is a name of Slavic origin (same as Željko), was very popular in the 40s and 50s but quickly went out of fashion, while Marko entered the TOP10 in the 80s and has remained one of the most popular baby name choices to the present day. It’s also worth noting that the 2000s were the first decade without a single Slavic-origin name in the TOP10 (in the 90s there was Tomislav, and in the 80s there were Tomislav, Goran and Igor).
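A per-decade ranking like this one can be sketched in a few lines once we have estimated births per name and year. The record layout `(name, year, count)` and the counts below are assumptions for illustration, not the real data.

```python
from collections import Counter, defaultdict


def top_names_by_decade(births, n=10):
    """Rank names within each decade by total births.

    births: iterable of (name, birth_year, count) records
    returns {decade_start: [names, most popular first]}
    """
    totals = defaultdict(Counter)
    for name, year, count in births:
        decade = year // 10 * 10  # 1945 -> 1940, 1988 -> 1980
        totals[decade][name] += count
    return {
        decade: [name for name, _ in counter.most_common(n)]
        for decade, counter in sorted(totals.items())
    }


# Made-up counts, purely for illustration.
sample = [
    ("Ivan", 1941, 300),
    ("Milan", 1945, 200),
    ("Ivan", 1985, 250),
    ("Marko", 1988, 400),
    ("Milan", 1982, 20),
]
ranking = top_names_by_decade(sample, n=2)
print(ranking)  # {1940: ['Ivan', 'Milan'], 1980: ['Marko', 'Ivan']}
```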


When it comes to female names, we can also notice interesting naming trends and changing preferences. Marija was the only name that remained in the TOP10 from 1940 till 2010, together with Ivan and Josip making a trio of Biblical names that rule the rankings. Marija dominated the female name rankings through the 40s, 50s and 60s, before Ivana entered the TOP10, immediately took the #1 spot and held it until the 2000s, when Ana finally managed to climb to the top. It’s interesting to compare the 70s and the 2000s when it comes to name length, something that’s worth looking at more closely.


As we can see from both male and female names, preferences changed considerably. Take a look, for example, at Milan and Marko: one was popular in the 40s and 50s, the other in the 80s and after; same with Dragica and Lucija. This raises the question: can we tell someone’s age when all we know is their name? Well, we can: knowing a person’s name, we can infer their age (or, better said, age range) with pretty decent probability. The following two violin plots show the age distribution for the 15 most popular male and female names. (Violin plots are similar to histograms and box plots in that they show a representation of the probability distribution of the sample.)


Someone named Milan or Željko is probably older than someone named Marko or Luka. At the same time, Josip, Ante and Ivan are kinda always in fashion, while most of the Mario-s are between 20 and 50, similar to Goran, Damir and Ivica (between 30 and 60). It’s interesting to observe that female names typically cycle in and out of fashion more quickly than male names, which means that they have narrower interquartile ranges. Katarina, Ana and Marija are names that are always in fashion, but still most Marija-s are much older than, let’s say, Petra-s. Most Martina-s, Maja-s, Kristina-s, Ivana-s and Jelena-s are between 20 and 50, while Gordana-s, Vesna-s, Mirjana-s and Nada-s are considerably older, between 40 and 70 years old.
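The age ranges quoted above are essentially quartiles of the age distribution of living people with a given name. A minimal sketch of that calculation follows, assuming we already have the number of living name-bearers per birth year; the function name and the counts are hypothetical, and 2011 is used as the reference year because that is the census year.

```python
def age_quantile(living_by_birth_year, q, reference_year=2011):
    """Age below which (roughly) a fraction q of living name-bearers fall.

    living_by_birth_year: {birth_year: number of living people with the name}
    q: fraction between 0 and 1, e.g. 0.25, 0.5 or 0.75
    """
    total = sum(living_by_birth_year.values())
    if total <= 0:
        raise ValueError("empty distribution")
    cumulative = 0
    # Walk from the youngest cohort to the oldest, accumulating head counts.
    for year in sorted(living_by_birth_year, reverse=True):
        cumulative += living_by_birth_year[year]
        if cumulative >= q * total:
            return reference_year - year
    return reference_year - min(living_by_birth_year)


# Made-up counts, purely for illustration.
living = {1950: 100, 1960: 100, 1970: 100, 1980: 100}
lower = age_quantile(living, 0.25)
median = age_quantile(living, 0.5)
upper = age_quantile(living, 0.75)
print(lower, median, upper)  # 31 41 51
```

The gap between the upper and lower quartile (20 years in the toy example) is the interquartile range: the narrower it is, the more a name gives away about its owner’s age.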


As we mentioned before, notice how female names were rather long in the 70s but very short in the 2000s? Well, if we plot average name length over the years, we can see this is not only our impression based on a cherry-picked piece of information. Names were indeed longest during the 60s and 70s, having on average more than 6 letters. Since then there has been an almost steady decline, especially during the 2000-2010 period, when names shortened considerably in a very short time, reaching an all-time low of 5.2 letters.
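Here is a rough sketch of how an average name length per year could be computed. We assume the average is weighted by the number of babies given each name, and that length means the number of characters (so Croatian digraphs such as “lj” or “nj” count as two letters); both are assumptions, and the counts are made up.

```python
import unicodedata
from collections import defaultdict


def average_name_length(births):
    """Birth-weighted mean name length per year.

    births: iterable of (name, birth_year, count) records
    returns {birth_year: average number of characters per baby name}
    """
    letters = defaultdict(float)
    babies = defaultdict(float)
    for name, year, count in births:
        # NFC keeps letters like "Ž" as one character instead of "Z" + accent.
        length = len(unicodedata.normalize("NFC", name))
        letters[year] += length * count
        babies[year] += count
    return {
        year: letters[year] / babies[year]
        for year in sorted(babies)
        if babies[year] > 0
    }


# Made-up counts, purely for illustration.
sample = [
    ("Dragica", 1970, 100),
    ("Ana", 1970, 100),
    ("Ana", 2005, 300),
    ("Lucija", 2005, 100),
]
lengths = average_name_length(sample)
print(lengths)  # {1970: 5.0, 2005: 3.75}
```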


As we previously saw, there are fewer and fewer Slavic-origin names, but names are also losing special Croatian characters like ‘č/ć/ž/š…’. Again, it looks like the 60s/70s were pretty crazy, with more than 20% of the names (one in five) having at least one of the mentioned characters. After reaching a peak in the late 60s, we witnessed a steady decline (similar to the one in length), reaching all-time lows during the 2000s and stabilizing at only 2.5% (1 in 40 names)! Looks like names are definitely getting a more “international” look & feel.
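Counting names with special characters is a simple membership test. The sketch below defaults to the four characters listed above; the Croatian alphabet also has “đ”, which can be added through the `special` parameter. The function name and the counts are made up for illustration.

```python
import unicodedata


def share_with_special_chars(births, special="čćžš"):
    """Percentage of babies whose name contains at least one special character.

    births: iterable of (name, count) records for one period
    special: characters to look for (lowercase)
    """
    total = 0
    flagged = 0
    for name, count in births:
        # Normalize so accented letters are single characters, then lowercase.
        normalized = unicodedata.normalize("NFC", name).lower()
        total += count
        if any(ch in special for ch in normalized):
            flagged += count
    return 100 * flagged / total if total else 0.0


# Made-up counts, purely for illustration.
sample = [("Željko", 20), ("Marko", 60), ("Nataša", 20)]
share = share_with_special_chars(sample)
print(share)  # 40.0
```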


It’s interesting to see how people’s names reflect some higher-level influences on society, so another crazy idea crossed my mind. There are a lot of Croatian names with a “SLAV” component (Tomislav, Slaven, Mislav…), obviously paying homage to the nation’s ethnic origins; I mean, even the old country was named Yugoslavia… and then it hit me: was the weakening of Yugoslav sentiment in Croatia evident from name selection? Was “nomen” really the “omen” for the end of the Yugoslav idea and the war for independence that erupted in 1991?


Looking at the above graph, it certainly appears so. The percentage of names with “SLAV” reached its peak around 1980 and then started declining, going down from almost 5% of the total population to less than 0.5% by 2010… and this wasn’t the only “omen”. Folks who survived WW2 understood and cherished the idea of living in peace, thus giving their children names containing “MIR”, meaning peace in basically all the Slavic languages. However, it appears that younger generations who never experienced war took the concept of peace too lightly, and as “MIR” was slowly going out of fashion, dark clouds started gathering… more than two thousand years after Plautus, names were an “omen” once again, a bad one this time.
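The “SLAV” and “MIR” curves boil down to a substring test per year. A minimal sketch, assuming `(name, year, count)` records and case-insensitive matching; the function name and the counts are made up. Keep in mind that a plain substring test cannot tell whether the component reflects the name’s actual etymology.

```python
from collections import defaultdict


def component_share_by_year(births, component):
    """Percentage of babies per year whose name contains `component`.

    births: iterable of (name, birth_year, count) records
    component: substring to look for, matched case-insensitively
    """
    needle = component.lower()
    hits = defaultdict(float)
    totals = defaultdict(float)
    for name, year, count in births:
        totals[year] += count
        if needle in name.lower():
            hits[year] += count
    return {
        year: 100 * hits[year] / totals[year]
        for year in sorted(totals)
        if totals[year] > 0
    }


# Made-up counts, purely for illustration.
sample = [
    ("Tomislav", 1980, 5),
    ("Ivan", 1980, 95),
    ("Mislav", 2010, 1),
    ("Luka", 2010, 199),
]
slav_share = component_share_by_year(sample, "SLAV")
print(slav_share)  # {1980: 5.0, 2010: 0.5}
```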