Ticket #42299

texai creating units twice loading savegame?

오픈 날짜: 2021-05-16 23:40 마지막 업데이트: 2021-05-21 21:40

Reporter:
소유자:
Type:
Status:
Closed
Component:
MileStone:
Priority:
5 - Medium
Severity:
5 - Medium
Resolution:
Fixed
File:
4

Details

Every time I load a savegame with texai.

If I don't have the -F server option, I get the assertion failures, but the game seems to work OK.

/load texonly-T0006-Y-3750-manual.sav.xz

3: Removing player AI*1. 3: Removing player AI*2. ... 3: Removing player AI*1. 3: Savegame: 'aifill' explicitly set to value same as default. 3: Savegame: 'alltemperate' explicitly set to value same as default. ... 3: Per Albin Hansson has been added as Easy level AI-controlled player (tex). 3: Red Cloud has been added as Easy level AI-controlled player (tex). 0: in idex_register_unit() [idex.c::89]: assertion '((void *)0) == old' failed. 0: IDEX: unit collision: new 113 0x7fa3640c44d0 Explorer, old 113 0x7fa3640c3d80 Explorer 0: Please report this message at https://osdn.net/projects/freeciv/ticket/ Aborted (core dumped)

#0 raise (sig=6) at ../sysdeps/unix/sysv/linux/raise.c:50

set = {val = {0, 140339871055018, 140339851740432, 140721747891268,
94801172287776, 12, 94801172381472, 0, 0, 0, 0, 21474836480, 140339851740632, 140339851740464, 94801172289120, 94801127663888}}
pid = <optimized out> tid = <optimized out> ret = <optimized out>

#1 0x000056389b859337 in fc_assert_fail (

file=file@entry=0x56389b8ebd5b "idex.c", function=function@entry=0x56389b8ebdd0 <FUNCTION.19266> "idex_register_unit", line=line@entry=89, assertion=assertion@entry=0x56389b8ebd48 "((void *)0) == old", message=message@entry=0x56389b8ebc50 "IDEX: unit collision: new %d %p %s, old %d %p %s") at log.c:570
level = LOG_FATAL

#2 0x000056389b7b9ae0 in idex_register_unit (

iworld=iworld@entry=0x56389bb62160 <texai_world>, punit=punit@entry=0x7fa3640c44d0) at idex.c:86
old = 0x7fa3640c3d80 FUNCTION = "idex_register_unit"

#3 0x000056389b729084 in texai_unit_info_recv (data=0x56389e5cfac0,

msgtype=<optimized out>) at texaiworld.c:296
plr_data = 0x5638a0212000 info = 0x56389e5cfac0 punit = 0x7fa3640c44d0 pplayer = 0x56389e3b9050 type = 0x56389c356420 <unit_types+127488> ptile = 0x7fa364036b30 FUNCTION = "texai_unit_info_recv"

#4 0x000056389b7282b1 in texai_check_messages (

ait=ait@entry=0x56389bb66780 <ai_types>) at texaiplayer.c:231
msg = 0x56389e5cf6a0 new_abort = TEXAI_ABORT_NONE ret_abort = TEXAI_ABORT_NONE FUNCTION = "texai_check_messages"

#5 0x000056389b7286a7 in texai_thread_start (arg=0x56389bb66780 <ai_types>)

at texaiplayer.c:100
finished = false texai = 0x56389bb66780 <ai_types>

#6 0x000056389b8564ca in fc_thread_wrapper (arg=0x5638a04bb130) at fcthread.c:40

data = 0x5638a04bb130

#7 0x00007fa36c42f0a2 in start_thread (arg=<optimized out>)

at pthread_create.c:483
start = <optimized out> ret = <optimized out> pd = <optimized out> now = <optimized out> unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140339851749120, 4286597848982146009, 140721747873390, 140721747873391, 140339851749120, 94801205168256, -4306611064108692519, -4306612714935235623}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}} not_first_call = <optimized out>

#8 0x00007fa36c35f4cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95 No locals.

Ticket History (3/28 Histories)

2021-05-16 23:40 Updated by: andrewmcg
  • New Ticket "texai creating units twice loading savegame?" created
2021-05-17 01:19 Updated by: andrewmcg
댓글 올리기

texai_control_gained(struct ai_type *ait, struct player *pplayer) is called for each texai player, but adds units and cities for all players into the shared texai_world cache. It looks like it's the same information for all of them, so we can just ignore repeated creates for items we already have

patch attached.

2021-05-17 05:24 Updated by: cazfi
댓글 올리기

Reply To andrewmcg

texai_control_gained(struct ai_type *ait, struct player *pplayer) is called for each texai player

Yes, but the first one should start the tex thread, and later ones should not add units or cities if thread is already running.

I've lately seen lots of thread start and end messages when running autogames with some tex players in a debugger, so it seems like the thread end thread unexepectedly ends. That's the real bug.

2021-05-17 16:45 Updated by: andrewmcg
댓글 올리기

Reply To cazfi

Yes, but the first one should start the tex thread, and later ones should not add units or cities if thread is already running. I've lately seen lots of thread start and end messages when running autogames with some tex players in a debugger, so it seems like the thread end thread unexepectedly ends. That's the real bug.

I put logging in texai_control_gained and in texai_city_info_recv : the TEXAI_MSG_CITY_CREATED message is being sent once and received twice. It looks like somehow there are two texai threads running. It happens 100% consistently for me, I will pin it down.

In the loading process, it creates the texai players as AI*1 etc, then deletes them, then creates them with the right names. Maybe the thread created the first time hangs around and still gets messages? Just a guess. I would have thought it would be listening on different message lists though.

(Edited, 2021-05-17 16:49 Updated by: andrewmcg)
2021-05-18 00:16 Updated by: andrewmcg
댓글 올리기

Reply To andrewmcg

I put logging in texai_control_gained and in texai_city_info_recv : the TEXAI_MSG_CITY_CREATED message is being sent once and received twice. It looks like somehow there are two texai threads running. It happens 100% consistently for me, I will pin it down. In the loading process, it creates the texai players as AI*1 etc, then deletes them, then creates them with the right names. Maybe the thread created the first time hangs around and still gets messages? Just a guess. I would have thought it would be listening on different message lists though.

No, it isn't caused by any threading issue.

The city is being added once from savegame_load, and once from the texai_control_gained

savegame_load -> savegame3_load -> sg_load_players -> sg_load_player_main -> ai.funcgained_control -> texai_control_gained

then

savegame_load -> call_func_each_ai(city_created)

https://github.com/freeciv/freeciv/blob/master/server/savegame/savemain.c#L81

At this point the texaiworld already has the city, because it added it when the first texai player was created.

The second call to city_created is needed, because when the first call runs we haven't yet loaded all the cities.

The first call is needed when you switch a player to the tex ai, but isn't needed when you load a savegame.

There are a couple of approaches:

1. move the gained_control call to the end of the savegame_load, after all the data has been loaded, and skip calling create_city directly from savegame3_load, so that all the cities get added in gained_control. (that might break a contract with other ai implementations though?)

2. Use my existing patch so that calling create_city and create_unit twice is OK

I kind of like the first approach. We load all the data, we have a valid consistent game, and then we initialise the AIs. Trying to start the AIs with incomplete data could lead to other issues.

2021-05-18 00:52 Updated by: cazfi
댓글 올리기

Reply To andrewmcg

In the loading process, it creates the texai players as AI*1 etc

So you have tex as the default AI type? I wonder if that makes a difference to any of this.

I see no such problems when loading europe_1900_WWI scenario in master, built with default configuration (ai types "classic,tex"). That scenario has players with tex ai type. The savegame format of the scenario is that of 3.0, so couple of rounds of compatibility conversions get run.

Can you attach a savegame to reproduce this from?

2021-05-18 01:54 Updated by: andrewmcg
댓글 올리기

I think making texai the default is what causes the AI*1 etc, but I don't think that's related to the bug. It is creating a new texai thread when it starts loading the players from the savegame, after the earlier thread is correctly destroyed, and it's in the new thread that the units and cities are being recreated.

If the first player loaded is a texai player, it will start the texai thread before it loads any units or cities, so you won't see the bug. No units or cities are added to the texai_world until the last part of savegame_load where it iterates over the units and cities and calls CALL_FUNC_EACH_AI(city_created, pcity);

Attaching the savegame I was testing with, but it's been every savegame.

As I said above, I think calling ai.func->gained_control() during the loading process, before all the players have been loaded, and before even the ai player's own cities and units have been loaded, is probably unsafe. It's better to load all the data, and then activate the AI.

https://github.com/freeciv/freeciv/blob/a978d996ec4c8235388eef850b32fe5cc9ab0f6e/server/savegame/savegame3.c#L4042

2021-05-18 03:09 Updated by: cazfi
댓글 올리기

Reply To andrewmcg

If the first player loaded is a texai player, it will start the texai thread before it loads any units or cities, so you won't see the bug.

That was what I had missed. I can reproduce now.

Moving the gained_control to later seems right thing to do (a bit risky, of course, in that savegame loading may have some other unseen dependencies on AI being functional), but I'm less convinced about removal of city_created etc calls. In AI API city_created is expected to be called for every city added to the game. Other AI types may have no city related handling at all in their gained_control (or no gained_control at all).

As a third option I wonder if we could remove adding of cities/units to world object in tex gained_control, and assume that city_created & friends are always adding all items after the tex AI has already been activated (gained_control has already been called).

2021-05-18 03:45 Updated by: andrewmcg
댓글 올리기

Reply To cazfi

Reply To andrewmcg

If the first player loaded is a texai player, it will start the texai thread before it loads any units or cities, so you won't see the bug.

That was what I had missed. I can reproduce now. Moving the gained_control to later seems right thing to do (a bit risky, of course, in that savegame loading may have some other unseen dependencies on AI being functional), but I'm less convinced about removal of city_created etc calls. In AI API city_created is expected to be called for every city added to the game. Other AI types may have no city related handling at all in their gained_control (or no gained_control at all). As a third option I wonder if we could remove adding of cities/units to world object in tex gained_control, and assume that city_created & friends are always adding all items after the tex AI has already been activated (gained_control has already been called).

But if you remove it from texai_control_gained, then it won't get them when you switch a player to ai in-game. That's the underlying issue, that the same function is being used for restoring a saved position as for starting the AI from scratch, and it's a little bit awkward. The cleanest solution would be to have specific "resume from saved data" function in the AI API -- particularly if you start saving more AI-specific state information. That's not essential now, but it might become so.

2021-05-18 04:00 Updated by: cazfi
댓글 올리기

Reply To andrewmcg

that the same function is being used for restoring a saved position as for starting the AI from scratch, and it's a little bit awkward. The cleanest solution would be to have specific "resume from saved data" function in the AI API

At the same forcing AI module author to define two callbacks for what often seem the same (AI type gained control of the player) wouldn't be ideal. But I think we can have the cake and eat it too - let's add (boolean?) parameter to gained_control telling if this is about loading a savegame or not. Those AI modules that want to have the same callback for both cases can just ignore the parameter.

2021-05-18 04:09 Updated by: cazfi
댓글 올리기

Shouldn't matter, but I don't remember from top-of-my-head if gained_control means just that the given AI module/type gets assigned to the player (even human players have AI type assigned), or that player gets under AI control (as opposed to human).

2021-05-18 16:40 Updated by: andrewmcg
댓글 올리기

Reply To cazfi

Shouldn't matter, but I don't remember from top-of-my-head if gained_control means just that the given AI module/type gets assigned to the player (even human players have AI type assigned), or that player gets under AI control (as opposed to human).

gained_control is only called for actual ai-controlled players.

I think we should move gained_control until after all players, cities and units have been loaded. But if we are still going to call city_created/city_got and unit_created/unit_got, that has to be after gained_control, since for texai, gained_control is what starts the thread.

(The classic ai doesn't have handlers for city_created/city_got and unit_created/unit_got)

However, if texai were developed further to trigger actions or calculations when, say, a unit is created, then it might make a big difference whether it is really a new unit being created, or whether it is just a notification of a unit loaded from a save file.

Adding a parameter to gained_control is fine, but wouldn't that break any existing out-of-tree AI modules? Whereas we can add functions to struct ai_type without breaking existing modules. If we add void (*loading_savegame)(bool), then we can call loading_savegame(true), then gained_control(), then loading_savegame(false). We then just need to add a flag to texai_plr (or maybe to ai_type.private?) to say whether we're in a savegame load or not.

So we can guarantee to AI modules that the game loading sequence is:

  • loading_savegame(true)
  • (all players, cities, units, map etc. loaded)
  • gained_control()
  • unit_created(each unit)
  • unit_got(each own unit)
  • city_created(each city)
  • city_got(each own city)
  • loading_savegame(false)

Then texai can skip copying cities and units into texaiworld in gained_control if loading_savegame has been set true, because it knows it's going to get the individual calls. Right now it doesn't care in city_created etc. whether it's loading or not, but if it does in future, it can just check the loading_savegame flag

That will be the same behaviour for texai as the cases which currently work (where gained_control is called before any players are loaded, so it doesn't add anything to texai_world from gained_control)

It will affect classic ai slightly -- right now it calls clear_worker_tasks on each city in dai_gained_control, but this has no effect because dai_gained_control is called before the player's cities are loaded. With the above, it will be called after all cities are loaded. The result might be better or worse, but it will be different, unless we check in dai_gained_control whether we are loading or not.

Last question -- do we need loading_savegame(bool) at all? Can we use server_state() ? I notice also that dai_gained_control calls dai_assess_danger, which checks server_state == S_S_RUNNING before doing anything.

2021-05-18 20:03 Updated by: andrewmcg
댓글 올리기

When I went right through it, the sequence of calls to the AI when it starts up is complex enough that moving the gained_control() to later doesn't really simplify anything or make any new useful guarantees.

So the patch now is to do the most straightforward fix for the problem, and to document what actually gets in what order when the AI starts.

2021-05-18 23:04 Updated by: andrewmcg
댓글 올리기

There's still more...

I noticed I didn't cover in the README.AI_modules what happens if you do an aitoggle before the game starts. The answer is: it crashes.

player_set_to_ai_mode calls restart_phase() (depending on details of the phase setup which are unpredictable), which I'm almost sure shouldn't be called when the game isn't running.

That's an easy fix, but it leaves more inconsistencies -- create_unit is called for texai when you load a savegame if there is a texai player at the point the savegame is loaded. If you add one later, it won't. Without my latest change, gained_control() will add all the units to the texai_world, if it is first called after loading but before starting, but with my change in, it won't, which means there will be a failure if there are no texai players at loading time but you activate one afterwards.

Your suggestion of a parameter to gained_control that tells whether we are loading (not whether we are still in S_S_INITIAL) would work for this -- activating an AI player after loading the savegame but before starting the game would import the units.

Delaying gained_control() until the end of loading and skipping the explicit unit_created calls from the loading would also work. Your point that other AIs might be relying on unit_created being called is not valid, because it is not called if you load a savegame with human players and toggle the AI after loading, and before or after starting the game (which might be quite a common scenario in multiplayer if one player has left and you want to resume without them?)

So, going round in a complete circle, I want to try again with gained_control() happening later in the load, and the explicit add_unit() taken out of the load. gained_control() can be called before, during, or after the interactive pre-game S_S_INITIAL period, and in all cases it should ensure that the AI has all the information that it needs extracted from the game state. unit_created(), tile_info(), etc. should be called only when there are changes in a game, not as part of the game setup. (In a clean new game situation, the starting units are created after S_S_RUNNING is reached, so the game starts with no units and then they are added).

2021-05-19 02:19 Updated by: cazfi
댓글 올리기

I assume unit|city_alloc still do the memory handling for the newly allocated units and cities even when gained_control has not yet been called. So those are the callbacks that logically should handle any unit|city setup that is agnostic to whether the unit|city is created in game or loaded from a savegame, making your suggestion more API-correct.

2021-05-19 02:47 Updated by: andrewmcg
댓글 올리기

Reply To cazfi

I assume unit|city_alloc still do the memory handling for the newly allocated units and cities even when gained_control has not yet been called. So those are the callbacks that logically should handle any unit|city setup that is agnostic to whether the unit|city is created in game or loaded from a savegame, making your suggestion more API-correct.

This patch delays calling gained_control until after save is complete, so that the behaviour is more consistent between loading an AI player, and switching a player to AI after loading or while playing.

The one remaining issue is the inconsistency around calling unit_got and city_got when starting up the AI.

Without this patch, they are called when loading an AI player from a savegame, but not when switching a player to AI control. With the patch, they are not called in either case.

city_got is not used by either default or tex AI, but unit_got is used by both to initialise the ferry data. I think it city_got and unit_got should probably be called for all the player's own units from inside gained_control()

(Edited, 2021-05-19 02:48 Updated by: andrewmcg)
2021-05-19 21:58 Updated by: cazfi
댓글 올리기

Reply To andrewmcg

The one remaining issue is the inconsistency around calling unit_got and city_got when starting up the AI. Without this patch, they are called when loading an AI player from a savegame, but not when switching a player to AI control.

But are they called *before* switching player to AI control, i.e., when the player actually gets the unit? I'm starting to suspect that the real problem is that the tex thread is running only when there's player toggled to AI, but things get handled correctly (even when player is under human control) only when the thread is running. Maybe the thread should be running always?

2021-05-19 22:01 Updated by: cazfi
댓글 올리기

Of course, handling things (in the thread) would have performance penalty even in the by-far more common situation that tex AI is not used at all, but just exist in the server (as it's built in by default).

2021-05-19 22:59 Updated by: andrewmcg
댓글 올리기

Currently unit_got in texai does not need the texai thread to be running: it only inherits the default ai behaviour, which runs in the main thread and acts on the pplayer->server.ais[] data

I think the tex ai has to be able to pick up the pieces mid-game without having been involved before. It is adding not only performance penalty but complexity and risk to have it keeping tabs on data when it doesn't need to.

I have an ulterior motive -- my Grand Plan is to write another ai, so I see tex as one among several that might be available, which would multiply the disadvantages of having inactive AIs still running code.

But if it is able to pick up a player mid-game, then there's no reason to do anything special when loading a savegame other than to manage AI-specific data that is in the savegame. That's done by player_load, unit_load, city_load. texai should be (and currently is) able to handle receiving that data when not activated (currently it only calls the default implementation for them).

Doing any processing that has to be done on units, cities when the AI is activated is the same whether it is resuming a savegame or taking over control from a human, so it shouldn't be specifically in the savegame load, it should be up to gained_control() to do all of it.

2021-05-19 23:13 Updated by: cazfi
댓글 올리기

This ticket is really getting too complex. It should fix just one issue, and the other issues should have tickets of their own.

Reply To andrewmcg

The one remaining issue is the inconsistency around calling unit_got and city_got when starting up the AI. ... With the patch, they are not called in either case.

So the patch breaks even classic AI (as you said that it's default AI behavior that is behind those callbacks) ?

2021-05-20 00:23 Updated by: andrewmcg
댓글 올리기

Yes, you're right.

To recap (after 24 comments):

The problem is that texai_control_gained either adds some (but not all) of the units and cities into the thread-local texaiworld, or doesn't, depending on the order that players are loaded in.

To my mind, the correct fix is to make it always add them, but that was based on the assumption that the AI starts from scratch when it takes control of a player. In fact, the AI does get called frequently through the game, whether it controls players or not.

/* FIXME: This should also check if player is ai controlled */
#define CALL_PLR_AI_FUNC(_func, _player, ...)                           \
  do {                                                                  \
    struct player *_plr_ = _player; /* _player expanded just once */    \
    if (_plr_ && _plr_->ai && _plr_->ai->funcs._func) {                 \
      ai_timer_player_start(_plr_);                                     \
      _plr_->ai->funcs._func( __VA_ARGS__ );                            \
      ai_timer_player_stop(_plr_);                                      \
    }                                                                   \
  } while (FALSE)

That explains the unit_got() inconsistency. Classic AI needs unit_got() called on savegame_load(), but not on toggle_ai_player(), because it's been getting it all along during the game whether players were under its control or not. So yes, my current patch would break it.

This is not really compatible with having multiple AI modules and being able to switch between them. It's also, as per your previous comment, not ideal when the AI needs to start up expensive resources to run at all. So the logical changes will have to wait until that problem is solved.

If you want the simplest possible fix for bug 42299, it's actually the first patch I posted -- 0001-Avoid-re-registering-units-and-cities-for-each-texai.patch. That just makes sure that if texai_unit_created() is called twice for the same unit, it ignores the second one. That's not elegant, but the alternative is to build even higher on the foundation that the AI has up-to-date information about players it doesn't control, which in the longer run it shouldn't have.

We can look at detaching the AI from human-controlled players afterwards. The question is whether that would interfere with the functioning of the advisors.

2021-05-20 00:58 Updated by: cazfi
댓글 올리기

Reply To andrewmcg

This is not really compatible with having multiple AI modules and being able to switch between them.

Yes. Having ability switch between the modules has not even been a goal in the first phase of AI module interface implementation. It has been knowingly left as a limitation that AI type must be set at the time of player creation and not changed afterwards.

Improving that is of course welcome.

If you want the simplest possible fix for bug 42299, it's actually the first patch I posted -- 0001-Avoid-re-registering-units-and-cities-for-each-texai.patch. That just makes sure that if texai_unit_created() is called twice for the same unit, it ignores the second one. That's not elegant, but the alternative is to build even higher on the foundation that the AI has up-to-date information about players it doesn't control, which in the longer run it shouldn't have.

Right, I'll check that again in the light of this discussion.

Feel free to open new tickets about the further development.

2021-05-20 04:09 Updated by: cazfi
  • 소유자 Update from (None) to cazfi
  • Resolution Update from None to Accepted
  • Milestone Update from (None) to 3.1.0 (closed)
  • Component Update from (None) to AI
2021-05-21 21:40 Updated by: cazfi
  • Status Update from Open to Closed
  • Resolution Update from Accepted to Fixed

Attachment File List

Edit

Please login to add comment to this ticket » Login