
Halite Map Gini Coefficient


#1

Was thinking a bit about the different kinds of maps, specifically regarding production distribution. Some maps seem to place more emphasis on controlling the right regions than others.

I then had an idea: to use the Gini coefficient to measure the distribution of production within maps. Pretend that each site is a person and the map is the nation; each site's production is effectively its income. Luckily, this is pretty easy to calculate too!
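For reference, one common discrete form of the Gini coefficient, written for a map with n sites. The symbols x_i, S_i and A are introduced here for illustration and do not appear in the original post. A is the trapezoidal area under the Lorenz curve, that is, the curve through the points (i/n, S_i/S_n).

```latex
\[
x_1 \le x_2 \le \dots \le x_n, \qquad S_i = \sum_{j=1}^{i} x_j, \qquad S_0 = 0,
\]
\[
A = \frac{1}{n}\sum_{i=1}^{n} \frac{S_{i-1} + S_i}{2\,S_n}, \qquad
G = 1 - 2A = \frac{n + 1 - \frac{2}{S_n}\sum_{i=1}^{n} S_i}{n}.
\]
```

G is 0 when every site has the same production and approaches 1 as production concentrates in a few sites.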

Here's a fairly golfed 3-line calculator (relies on numpy because I was lazy, sorry, but should be easy to rewrite it to not use numpy :slight_smile:):
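import sys, json, numpy
# Load the replay (JSON), flatten the production grid, sort it, and take the running totals.
prods = numpy.cumsum(sorted(numpy.array(json.load(open(sys.argv[1]))['productions']).flatten()))
# Gini = 1 - 2 * (area under the Lorenz curve). trapz skips the first trapezoid (0 up to prods[0]),
# hence the -prods[0]. On NumPy 2.x, numpy.trapz is named numpy.trapezoid.
print((len(prods)*prods[-1] - 2*numpy.trapz(prods) - prods[0]) / len(prods) / prods[-1])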

Pass it the filename of the replay as the command line argument.
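For anyone who wants to avoid numpy, here is a minimal sketch of the same idea in plain Python. It assumes, as the snippet above does, that the replay is a JSON file with a 2D `productions` grid. The names `gini` and `replay_gini` are made up for this example, and it uses the standard discrete Gini formula rather than reproducing the one-liner term by term.

```python
import json


def gini(values):
    """Gini coefficient of non-negative values (0 means perfectly equal)."""
    ordered = sorted(values)
    n = len(ordered)
    total = sum(ordered)
    if n == 0 or total == 0:
        return 0.0
    # Add up the running totals; each one is a point on the
    # (unnormalised) Lorenz curve.
    running = 0
    sum_of_running = 0
    for v in ordered:
        running += v
        sum_of_running += running
    # Same as 1 - 2 * (trapezoidal area under the Lorenz curve).
    return (n + 1 - 2 * sum_of_running / total) / n


def replay_gini(path):
    """Gini coefficient of the 'productions' grid in a replay file (assumed JSON)."""
    with open(path) as f:
        grid = json.load(f)["productions"]
    return gini(v for row in grid for v in row)
```

As a sanity check, `gini([1, 1, 1, 1])` is 0.0 and `gini([0, 0, 0, 1])` is 0.75. Calling `replay_gini` with the path to a replay gives the per-map value.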

I tried it on a sample replay (specifically ar1482947270-2437412300.hlt) and got a coefficient of 0.306, which is appreciably less than the US and on par with a country such as Hungary.


#2

I continued doing my analysis, more to come, but this is a little teaser...

I downloaded the 5 GB set of replay files and ran the calculator over all of them. The highest coefficient corresponded to this replay and the lowest to this replay, so I think this method may not be unreasonable.


#3

Here's a plot of the frequency of various Gini coefficients (of the 2095 games I analyzed):

It appears that one should optimize their bot for a coefficient of around 0.27-0.29, though be prepared for deviation from that.

Of course, this is only one of many factors that describe a map, but I think it's an important one.


#4

Lastly, for anyone who is interested in this idea but doesn't want to wait around for the analysis of all 2095 games to finish, here's a JSON file containing each replay name and the corresponding Gini coefficient:
https://drive.google.com/open?id=0B4e-jO5ldlVjUlFzRllEUk96TG8
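A rough sketch of how one might summarise such a file. The layout is an assumption: the code expects a single JSON object mapping replay name to coefficient, which may not match the actual download. `load_ginis` and `summarize` are hypothetical helper names.

```python
import json


def load_ginis(path):
    """Load a {replay name: Gini coefficient} mapping (assumed file layout)."""
    with open(path) as f:
        return json.load(f)


def summarize(ginis, bin_width=0.02):
    """Return (lowest replay, highest replay, histogram) for the mapping."""
    lowest = min(ginis, key=ginis.get)
    highest = max(ginis, key=ginis.get)
    counts = {}
    for g in ginis.values():
        # Index of the bin [k * bin_width, (k + 1) * bin_width) that holds g.
        k = int(g / bin_width)
        counts[k] = counts.get(k, 0) + 1
    # Key each bin by its lower edge, in ascending order.
    histogram = {round(k * bin_width, 10): c for k, c in sorted(counts.items())}
    return lowest, highest, histogram
```

The histogram is a plain dict from bin lower edge to count, so it can be printed directly or handed to whatever plotting tool you prefer.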


#5

If my economics 101 doesn't fail me, I believe that what you are showing here is the poor-to-rich distribution, from very equal to very unequal. In the games at 0.10 to 0.20, the production is spread very equally over the map, while at 0.45 to 0.60 it is spread very unequally (many low-production and many high-production tiles).

Thank you, I'll use this data to fine-tune my own simulator so that it generates maps with Gini coefficients closer to these.

I can't think of a way of optimizing specifically around 0.27 or 0.29, or maybe we are already doing that without realizing... but what I can imagine is that the bot's willingness to explore could be quantified by calculating the map's Gini coefficient.


#6

I have a request: could you show me the average Lorenz curve for maps with a Gini coefficient of around 0.25-0.30?
I'd imagine it looks quite boring but who knows!


#7

Here's the average Lorenz curve for maps with a Gini coefficient between 0.25 and 0.30

Enjoy!
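For anyone wanting to reproduce something like this, here is a minimal sketch of one way to average Lorenz curves. It is not necessarily the method used for the plot above. Because maps differ in size, each curve is resampled by linear interpolation onto the same evenly spaced site fractions before averaging. `lorenz_curve` and `average_lorenz` are hypothetical names.

```python
def lorenz_curve(values, points=101):
    """Cumulative production share at `points` evenly spaced site fractions."""
    ordered = sorted(values)
    n = len(ordered)
    total = sum(ordered)
    if n == 0 or total == 0:
        raise ValueError("need at least one site with non-zero production")
    # Knots of the curve: shares[i] is the share held by the i poorest sites.
    shares = [0.0]
    running = 0
    for v in ordered:
        running += v
        shares.append(running / total)
    curve = []
    for k in range(points):
        pos = k / (points - 1) * n      # fractional index into `shares`
        lo = min(int(pos), n - 1)       # left knot of the segment
        frac = pos - lo
        curve.append(shares[lo] + frac * (shares[lo + 1] - shares[lo]))
    return curve


def average_lorenz(maps, points=101):
    """Point-wise mean of the Lorenz curves of several maps."""
    curves = [lorenz_curve(m, points) for m in maps]
    return [sum(column) / len(curves) for column in zip(*curves)]
```

Each map is passed as a flat list of site productions. To restrict the average to a Gini band such as 0.25 to 0.30, filter the maps by their coefficient first.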


#8

@Sydriax, would you mind sharing the code to create these analyses so we can mess around with it?


#9

Sure!
The actual Gini code is just the 3 lines in the original post, but later today I can post the matplotlib code as well.


#10

Here's a Gist with the rest of the code. I added a bit of description for each file, but the code is not cleaned up: https://gist.github.com/Sydriax/947d6b331f9eb22e2ab6215e329db306.