RFC: Map Rating to Replace Map Categories

LaFayette

With the triplea_maps.yaml file becoming deprecated in 2.6 and map information being moved to a database, there is a need to define how categories are going to be managed. The map.yml file is the driver behind how we will remove the triplea_maps.yaml file. For more info on the map.yml file, please see:

Broad problems to solve:

A. how & where to manage the map category?
B. how can map makers be enabled to completely and accurately self manage categories?
C. how do new players when looking at the download list know which maps they should try first?

To solve these problems I am thinking we replace the concept of map category with map rating. The map rating would be defined in the map.yml file (solves problem A), which allows map makers to update it as they choose (solves problem B). If we create standard rating definitions, then the rating a map should get will hopefully be obvious (solves problem B). If players can then sort the map list by rating, then we'll have solved problem C.

The rating scale would be from 1-4, where maps start at rating 1 and can go up to rating 4:

Rating 1: map is broken, has error messages, a person likely won't be able to finish a game on the map. The map needs work and is not suitable for play.

Rating 2: map is not broken, but does not 'play well'. The map is imbalanced, not interesting, or is still a work-in-progress.

Rating 3: the map is play-balanced and interesting; compelling multiplayer games can be played on this map.

Rating 4: Map is rating 3 and also looks (relatively) good

For example Total World War, World at War, most of the WWII maps are rating 4. "Cold War" is a really fun map but has very grainy graphics, so it would be rating 3.
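To illustrate how sorting the download list by rating could work, here is a minimal sketch. It assumes each map's map.yml has already been parsed into a record; the `MapEntry` class, the `rating` field name, and the tie-breaking by name are assumptions for illustration, not a decided format.

```python
from dataclasses import dataclass

# Rating levels paraphrased from the definitions above.
RATING_LABELS = {
    1: "broken or not suitable for play",
    2: "playable, but imbalanced or work-in-progress",
    3: "balanced and interesting",
    4: "rating 3 and also looks good",
}


@dataclass
class MapEntry:
    """Hypothetical record for one map after its map.yml has been read."""
    name: str
    rating: int = 1  # maps start at rating 1


def sort_for_download_list(maps):
    """Highest rating first; ties are broken alphabetically by name."""
    return sorted(maps, key=lambda m: (-m.rating, m.name.lower()))


maps = [
    MapEntry("Cold War", 3),
    MapEntry("Total World War", 4),
    MapEntry("Some New Map"),  # placeholder name, no rating given yet
    MapEntry("World at War", 4),
]
ordered = [m.name for m in sort_for_download_list(maps)]
```

With this ordering the rating 4 maps appear at the top of the list, which is the behaviour problem C asks for.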

This question will need to be solved before 2.6 can be released. I think I'll need an answer to be able to code something up in 1-3 weeks from now. So please consider this and give feedback. All map makers will need to understand how to rate their map and will be responsible for creating a map.yml file for any new map, so it's really important to try and think this through as much as we can beforehand. Thank you in advance for the thoughts and feedback.

TheDog

C. how do new players ...
I would like a Map Rating system that has the following "periods", so something like;
Fantasy
Ancient/Medieval
Renaissance/ECW
Napoleonic
ACW
WW1
WW2
Modern
SciFi
None of the above (so could be abstract)

This will allow players to zone in on their favourite period, so they don't have to wade through all periods.

And also be graded as above 1-4.

WW2 is not my period, but where do I start as a new player, there are so many maps, so many rule sets?
If most of the WW2 maps will be rating 4, I will still be at a loss.

I would like another rating, a star rating, that means this map is played the most or similar.

This * rating probably only needs to apply to WW2 as there are so many maps to choose from. Perhaps the top 3 could be *.

Perhaps the * rating could be rating 5 ?

Cernel

I strongly believe game and graphic quality MUST be rated strictly separately (especially since you can have a very bad looking map with a very good looking map-skin). Graphic quality is not only about being pretty but firstly about being usable (for example, having a good quantity of well-placed placement spots per territory so as to conveniently limit overflow lines).

Obviously, map-skins should have only the graphic rating.

By the way, maps (and map-skins) may have a sound set which may be more or less good. That either has to go with the graphic (so graphic and sounds) or as a third rating (or it should be clarified if there is the intention completely to ignore sounds quality when rating).

RogerCooper

The difference between 3 and 4 is subjective. I would suggest an established category for ports of well-known boardgames and mods frequently played on-line. The rest could be categorized by period. Broken maps should not be in the repository at all.

LaFayette

Re: additional sort categories

Being able to sort by era and popularity are certainly desired features but not going to be tackled for 2.6. For context, the triplea_maps.yaml file is going away, that is where we stored the map categories. Something new must be done to replace it or failing any replacement strategy there would be no categories at all. In essence the existing RFC would replace the existing categories, with 'best' being '4' and 'good' being '3'. As an addition there is more definition so that 'best' and 'good' are less subjective.

RE: map skins

When skins and maps are bundled together it would be moot to rate them independently. Presumably as well any map would use its best looking skin. Regardless, rating would be based on the default skin and any other skins effectively ignored for the purpose of rating (and of course, if there is a better looking skin, then it should be considered to make that the default skin).

Re: 3 vs 4 subjectivity

I would agree that '3' and '4' has an element of subjectivity. Though, I do think it's more clear which maps look really bad compared to those that do not. Cold War IMO is a good example of an excellent map but poor graphics quality. The bar would then be 'does this look like Cold War or nicer?'. I believe that is less subjective, particularly compared to "best" and "good".

Re: Era Categorizations vs Rating

Creating categorization by era is again requested, but it opens a can of worms. How do we standardize the eras? How do we account for spelling mistakes? How do we ensure there is a reasonable and sensible number? If we are only categorizing by era, how do players find the "best" or "popular" maps? For example, with perhaps a majority being WWII maps, how do players know which ones of those are better or worse?

Re: Broken/Experimental Maps - '1 star' rated maps

I would tend to agree that broken maps should not be listed. I think the 'experimental' list has been perhaps baggage that we have been just carrying around. The confounding factor is map upload to repository is intended and encouraged for in-development maps (I am somewhat frustrated that this message has not resonated and upload is still viewed as a one-and-done final step, but that is another problem..).

So, essentially '1' was to represent all of the experimental maps. A map in-development and not yet done would also be a '1'. A map that is just finished but perhaps not balanced yet would be a '2'. Once balanced, it would be a '3'. If the map is then also really appealing, it would be a '4'. The latter point then creates some distinction otherwise if we drop the '1', there are no '2', and if we do not base it on anything else then we are left with all maps being the same category. Hence the last distinguishing factor.

Granted, with the definition above we could certainly drop all of the experimental maps and then new in-development maps would be a '1', and we would simply have many fewer '1' star maps.

LaFayette

To clarify a bit on the transition,
best -> 4
good -> 3
experimental -> 1

Out of the experimental, those that are not broken would be a '2'. If they turn out to be balanced and a good map then they move to a '3'.
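As a rough sketch of that transition, the one-time conversion could look like the following. The function name and the `is_broken` flag are hypothetical; nothing in the old category data says whether an experimental map is broken, so that would have to be decided per map.

```python
# Mapping from the legacy category names to the proposed ratings.
LEGACY_TO_RATING = {"best": 4, "good": 3, "experimental": 1}


def migrate_category(category, is_broken=True):
    """Translate a legacy category into a rating.

    Experimental maps default to 1; those known not to be broken become 2.
    The is_broken flag is an assumption and is ignored for other categories.
    """
    key = category.strip().lower()
    if key not in LEGACY_TO_RATING:
        raise ValueError(f"unknown legacy category: {category!r}")
    if key == "experimental" and not is_broken:
        return 2
    return LEGACY_TO_RATING[key]
```

Moving a map from '2' to '3' is then a manual edit by the map maker once the map turns out to be balanced.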

Hence, '4' would be the all-star list. '3' is the list of very good maps that do not look nice but would certainly be great to play. '2' is something that only should be played if you are actively trying to balance the map, otherwise avoid it. '1' is just to be avoided unless trying to help build a map or get some early feedback.

If we change our perspective that maps are never to be uploaded until done, then '1' and '2' mostly go away and then we are back to only having the sneaker-net solutions that many have developed over time (which is again frustrating since in many ways it is far more work for everyone than simply uploading to github (which is easier than uploading to dropbox even)).

So clearly I think it's beneficial for the rating system to also take into account maps that are not yet done, those that are newly done, and then the vast majority of maps that are done, and to have some philosophy on how to distinguish them so we are not left with a list of 70+ maps.

Cernel

The fact that the mention of map-skins was made between parentheses should have hinted that was unnecessary: I don't see how the value of a game and the quality of a skin for playing the game can be mixed together and I believe they ought not to even if every map would have only one game and every game would have only one skin (both being not the case).

Substantially, based on what you said, your 1, 2 and 3 categories would rate the game only, whereas the value 4 would be a 3 for the game and something not too low a rating for the graphic (and also the sounds set or not?).

All this, of course, assuming that one wants to discriminate maps based on a rating system (I'm not saying anything about it because I understand this topic implies that rating has to be the way.).

LaFayette

@Cernel Do you have any suggestions on how we can avoid the majority of all maps from having the exact same rating? If we delete the experimental maps, then presumably without any other factor all maps would have the same rating. This fails to solve problem (C) in the OP.

LaFayette

@Cernel said in RFC: Map Rating to Replace Map Categories:

I'm not saying anything about it because I understand this topic implies that rating has to be the way.).

Solving the 3 problems in the OP is the goal. How we do that is an open question.

This rating system suggestion is essentially the existing categories remapped to numbers with a definition between '1' and '2' (both considered experimental today) and a definition between '3' (good) and '4' (best).

If there is a better way to solve the 3 problems without incurring continuous overhead, then kindly offer it.

Cernel

@LaFayette said in RFC: Map Rating to Replace Map Categories:

@Cernel Do you have any suggestions on how we can avoid the majority of all maps from having the exact same rating? If we delete the experimental maps, then presumably without any other factor all maps would have the same rating. This fails to solve problem (C) in the OP.

I think this can be seriously solved only by having something collecting popularity data of some sort and returning information about what are the most played maps. Besides, I'm fairly certain the current divide between the highest and second highest category in the current download listing is not based on graphics (there are certainly many maps in the second level of quality which are about as good looking as what you can find in the first level), so I'm sorry to say you would be quite wrong in putting all first level maps on the 4 rating and all second level maps on the 3 rating under what is described (Of course, this is merely a legacy problem given by the fact that I don't believe there is any kind of information about what currently distinguishes the first and the second level.).

Of course, I realize this is substantially off-topic, so no: I don't have any suggestions on how to solve such a challenge within a rating system. After a level of decent quality, it is arguably just too subjective for a no-profit organization to be as judgmental as needed.

LaFayette

@Cernel Map popularity is not happening in 2.6 and triplea_maps.yaml is disappearing. Seemingly this must be solved or we can say we are unable to solve this. Furthermore, I don't agree that popularity solves (C). Popularity could be gamed, for example someone just repeatedly starts a map over and over again. You could also just leave a specific game open for a very long time, create many alias accounts (emails are infinite) and then start a game scenario many times. Last, popularity biases towards older maps, which is another challenge to solve and also fails (C).

We could just keep all the same categories, but that feels like a shame since they are completely subjective and are really just whatever Veq felt they should have been some 12 years ago.

I'm sorry to say you would be quite wrong in putting all first level maps on the 4 rating and all second level maps on the 3 rating under what is described

Yip, so there would be some shifting of some maps to 3 and others up to 4. Now that we have a working definition of what is a 3 vs 4, we can actually move stuff without being completely subjective or just leaving it up to me to make an executive decision. If I were to make an executive decision, then a year from now we'll be in the same place where 'best' is whatever I felt it should be and there is no way to shift maps between categories.

Of course, we can work on the rating definitions, for example maybe somehow 5 levels would make more sense. Maybe the difference between 3 and 4 could be clarified better. Maybe a different system that does not involve significant work (now and in an ongoing basis) could be suggested that also fits with the map development workflow.

IDK, my best idea for now is to give 'good' and 'best' definitions and to split 'experimental' between broken and in-progress.

Cernel

@LaFayette I would then have all 3 and 4 in 3 even if this means giving up on having a category limited to the very good maps.

What I think I made clear is that I believe it just makes no sense to have on the same metre two different things. The quality of the game is one thing and the quality of the graphic (and sounds?) is another. We cannot say what is better between a game which is very engaging but has very bad graphics and a game which is very boring but has very good graphics.

If keeping the current concept, I suggest changing the name of "4" to "3+". This way, you could say that "3" and "3+" maps are about the same quality of gaming (for at least one of their games) but "3+" have a particularly good graphic (and sounds?) too.

split 'experimental' between broken and in-progress.

I agree, but be ready having the category for not-yet-good maps which are actively under work be empty most of the time and with some significant over-head for assuring abandoned maps won't linger in there. Still, this category may be very good to have when there is actually a map in it (to avoid burying it within all abandoned not-very-good ones). I'm just saying it will be important to resist the temptation of removing it once it remains (because it will) completely empty for a long time.

LaFayette

@Cernel said in RFC: Map Rating to Replace Map Categories:

The quality of the game is one thing and the quality of the graphic (and sounds?) is another. We cannot say what is better between a game which is very engaging but has very bad graphics and a game which is very boring but has very good graphics.

I hear ya. The rating suggestion is not just a ranking, think of it more as levels.

Is the game complete? -> 2
Is the game good and balanced? -> 3
Is the game well polished and looks good? -> 4

A map cannot be a rating (4) without being good, balanced, and complete. Does this make sense and do you think it would work?
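Read as levels, the three questions above could be sketched like this. The function and parameter names are hypothetical; the point is only that each level requires every level below it.

```python
def rating_from_checks(complete, balanced, polished):
    """Return the rating level; each level requires all the levels below it."""
    if not complete:
        return 1
    if not balanced:
        return 2
    if not polished:
        return 3
    return 4
```

So a map that looks great but is not yet balanced still comes out as a 2, not a 4.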

I agree, but be ready having the category for not-yet-good maps which are actively under work be empty most of the time and with some significant over-head for assuring abandoned maps won't linger in there.

Indeed, and if we purge the broken maps then almost all maps are going to be a 3 or a 4. That I think is okay, the '4' are then the answer to (C) in the OP. Otherwise players should look through and download all the others as they are fine maps.

Cernel

There are actually some topics currently open in forum to gather popularity information:
https://forums.triplea-game.org/topic/2394/triplea-vote-most-popular-map
https://forums.triplea-game.org/topic/1234/what-games-are-you-all-playing

If the forum would accept polls with huge numbers of options, one could have a poll with one choice per map or game.

If anyone were to take upon himself or herself the difficult duty of indicating the very good maps amongst the good ones (as we guess Veqryn used somehow to do), he or she could do it by gathering the votes. Otherwise, I'm still inclined to suggest having only your first 3 levels and maybe a 3+ for the level 3 games which happen to have an original skin which is very good (if not a 1-3 rating for the game (the best one if the map has more than one) and a 1-3 rating for the skin).

Also, I would have 1 as the best and higher numbers the worst.

LaFayette

I was planning to replace the rating number with a star count.

I think it's okay to open up this conversation in retrospect, even if all elements of what we design are not onboarded in 2.6 at least we will have the roadmap and features that are more likely to fit together.

I think the three main 'attributes' (aka tags) on maps are:

rating
era
popularity

For rating we need to define definitions. I'm not clear why "3+" should not just be called "4"? I do think that whatever the best skin is, that should be default, beyond that, we ignore skins when it comes to this.

Era would take some work to define some good categories for it. I don't think it would be helpful to have as many eras as we do maps, or to have all maps be in their unique era and then a big mass in WWII.

If a map maker mistypes an era and defines one that does not exist, how should that be handled? Does that get created as a new era, or does the map appear as having no defined era?

For popularity, I think we need to be sure it stays sustainable. It may get a lot of excitement and action now, but will it 5 years from now? To this extent I think it probably needs to be automated. I also think we need to develop a popularity score that is applied to all maps. If we only define the top 3 most popular, I don't think that is necessarily useful or does justice to the fact that there are so many maps and that there are differences in popularity when we get to the long tail (for example, Cold War once used to be popular, and Napoleon used to be extremely popular and is less so today. Having those not be somewhere in the middle or top middle of the list seems inadequate).

ubernaut

@LaFayette so if the premise here is that our current categories really signal quality then I would agree in a general sense, at least. might be good to include tags as well in that case.

LaFayette

In essence I think rating, era, and popularity will essentially be implemented as tags that are produced from the maps server. Rather than having completely free-form tags, I'm leaning towards the tag keys and values being on a white list and perhaps ignored if not on that list. I'm still not sure whether we can avoid a lot of pain from typos, or from issues if we want to redo or change the era tag names.
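A minimal sketch of the white-list idea, assuming tags arrive as simple key/value pairs. The `ALLOWED_TAGS` table and the era values in it are placeholders for illustration only; the real keys and values are still undecided.

```python
# Hypothetical white list: tag key -> allowed values.
ALLOWED_TAGS = {
    "rating": {"1", "2", "3", "4"},
    "era": {"WWI", "WWII", "Fantasy"},  # placeholder values only
}


def filter_tags(raw_tags, allowed=ALLOWED_TAGS):
    """Keep tags whose key and value are both on the white list; ignore the rest."""
    kept = {}
    for key, value in raw_tags.items():
        k = str(key).strip().lower()
        v = str(value).strip()
        if k in allowed and v in allowed[k]:
            kept[k] = v
    return kept
```

With this approach a typo in an era value means the map simply shows up with no era tag, rather than creating a new era.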

TheDog

I would like to keep the experimental rating of 1 and display it for download. It has a lot of resources that can be plundered for maps, units, flags and ideas, I would be sad to see it go.

Also, I would like to incorporate a "played often at some point in the map's life" factor.

I like the star rating as per LaFayette with;
1-Incomplete (old Experimental)
2-Complete
3-Complete & Balanced
4-As 3, Looks good or is/has played often
5-As 3, Looks good & is/has played often, best in its era (there can be more than one 5-star map per era)

More on the era
As an example, the map maker assigns an era code (a 3-4 letter/number code), which is checked against a white list. If the map maker gets it wrong, it is changed to WRON=wrong.
This code is then displayed in English for the public to use for downloading.
eg WW2W=1939-1945 WW2 World theatre

Note that WW2 has been split into 3 theatres

FANT=Fantasy
ANCI=3000bce to 1450 Ancient/Medieval
RENA=1450-1793 Renaissance/ECW
NAPO=1793-1815 Napoleonic
ACW=1861-1865 ACW
WW1=1914-1918 WW1
WW2W=1939-1945 WW2 World theatre
WW2A=1939-1945 WW2 European/Atlantic theatre
WW2P=1939-1945 WW2 Far East/Pacific theatre
MODE=1946+ Modern
SIFI=SciFi
UNAS=Unassigned, between some eras, none of the above, could be abstract
WRON=Wrong era entered by mapmaker
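The lookup could be sketched as below, using the codes from the list above. The function name is hypothetical, and treating a missing or empty code the same as a wrong one is my assumption.

```python
# Era codes and display names copied from the list above.
ERA_CODES = {
    "FANT": "Fantasy",
    "ANCI": "3000bce to 1450 Ancient/Medieval",
    "RENA": "1450-1793 Renaissance/ECW",
    "NAPO": "1793-1815 Napoleonic",
    "ACW": "1861-1865 ACW",
    "WW1": "1914-1918 WW1",
    "WW2W": "1939-1945 WW2 World theatre",
    "WW2A": "1939-1945 WW2 European/Atlantic theatre",
    "WW2P": "1939-1945 WW2 Far East/Pacific theatre",
    "MODE": "1946+ Modern",
    "SIFI": "SciFi",
    "UNAS": "Unassigned, between some eras, none of the above, could be abstract",
    "WRON": "Wrong era entered by mapmaker",
}


def resolve_era(code):
    """Return (code, display name); anything not on the white list becomes WRON.

    Assumption: a missing or empty code is also treated as WRON.
    """
    normalized = (code or "").strip().upper()
    if normalized in ERA_CODES:
        return normalized, ERA_CODES[normalized]
    return "WRON", ERA_CODES["WRON"]
```

For instance, `resolve_era("ww2p")` resolves to the Pacific theatre entry, while a typo such as `"WWII"` falls back to WRON.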

Schulz

I am against the map rating idea because it leads to so many unfair situations.

Maps that were created earlier will have an unfair advantage since they have already been there for a long time, so it will be easier for them to get high ratings, whereas newer maps will struggle even to find someone to rate them.
WWI/WWII maps will have an unfair advantage as well, since their scenarios are based on history: well-known, multi-fronted, interesting conflicts. Creating a fictional scenario as exciting as these wars is harder than creating just another WWI/WWII map.
Giving every vote equal weight is another problem. If there is a rating system, only old members should be able to vote, because acquiring a refined, settled taste is not a quick process.
I don't even think balance is the most important thing (unless the map is extremely unbalanced) or that graphics can be rated objectively. I personally give more importance to variation (having the most different ways to achieve victory), speed and reversibility than to balance, since the former cannot be fixed while the latter can already be re-adjusted by bidding.

I do believe there should be many more criteria for rating games if this is certainly going to happen.

Variation: How many different ways can be used to achieve victory? (More is objectively better)
Reversibility: Do players have a good chance to recover from their mistakes and turn the tide of war? (Games should be reversible enough to prevent early drop-offs)
Luck: How much of a role does luck play? (Luck should not play a major role)
Speed: Game timespan? (Long gameplay is a drawback imho)
Balance: Is it at least reasonably balanced?
Elegance: Not having unneeded complexities, unit rosters, nations etc...
Historical Accuracy: Not meant to be perfect accuracy, but rather "does the game give the feeling of the war it's based on?" For example, a Japanese invasion of Vladivostok, India or Australia is a plausible outcome although none of them happened in WWII, but seeing Japan marching on Moscow is a totally different thing.
Graphics: Including flags, colors, unit images, default zoom, relief tiles.

I am in favour of categorizing maps based on their era. 5 categories in total: Pre WWI-WWI-WWII-Post WWII-Fantasy/Sci-Fi.

When any sub-era gets a sufficient number of maps, it will be able to have its own category. (For example, if there are many Cold War maps, then a Cold War category will be created)

Also, it's possible to use both the era and rating systems, for example by categorizing maps based on era and then listing them from top to bottom according to their ratings.
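A small sketch of that combined approach, grouping by era first and then listing by rating within each era. The tuple layout, the function name, and the placeholder map names and ratings are made up for illustration.

```python
from collections import defaultdict


def group_by_era(maps):
    """Group (name, era, rating) tuples by era, best-rated maps first."""
    grouped = defaultdict(list)
    for name, era, rating in maps:
        grouped[era].append((name, rating))
    return {
        era: [n for n, _ in sorted(entries, key=lambda e: (-e[1], e[0]))]
        for era, entries in sorted(grouped.items())
    }


# Placeholder data only; these are not real maps or real ratings.
listing = group_by_era([
    ("Map A", "WWII", 3),
    ("Map B", "WWII", 4),
    ("Map C", "WWI", 4),
    ("Map D", "WWII", 4),
])
```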

TheDog

@Schulz
It would be great to have a detailed rating system, but it takes a lot of time and effort to evaluate each map, who will do that?
Some of what you are asking for is subjective and personal and cannot easily be graded/rated.

@RogerCooper does a very good review job with enough detail to intrigue me to download a map I would otherwise not bother with.
https://forums.triplea-game.org/topic/889/roger-s-scenario-thread
Thank you Roger!