[Pgbigm-hackers] pg_gin_pending_cleanup function

Fujii Masao masao****@gmail*****
Fri Sep 25 19:04:02 JST 2015


On Tue, Sep 22, 2015 at 12:55 AM, Masahiko Sawada <sawad****@gmail*****> wrote:
> On Fri, Sep 18, 2015 at 1:00 AM, Fujii Masao <masao****@gmail*****> wrote:
>> On Thu, Aug 27, 2015 at 1:44 AM, Fujii Masao <masao****@gmail*****> wrote:
>>> On Wed, Aug 26, 2015 at 11:48 PM, Masahiko Sawada <sawad****@gmail*****> wrote:
>>>> On Wed, Aug 26, 2015 at 11:06 AM, Fujii Masao <masao****@gmail*****> wrote:
>>>>> Hi,
>>>>>
>>>>> Attached patch implements the pg_gin_pending_cleanup function which cleans up
>>>>> the pending list of the specified GIN index by moving tuples in it to the main
>>>>> GIN data structure in bulk. Then this function returns the number of pages in
>>>>> the pending list cleaned up. I'd like to add this function into the master.
>>>>>
>>>>> Even without this function, we can clean up the pending list by using VACUUM.
>>>>> However, since VACUUM needs to do not only the pending list cleanup but also
>>>>> other various jobs, it usually takes a long time and its performance impact is
>>>>> likely to be big. So I think that pg_gin_pending_cleanup function is useful
>>>>> because we can clean up the list more quickly and avoid such big performance
>>>>> impact by using the function.
>>>>
>>>> +1.
>>>> It will be a really useful function for maintaining GIN indexes.
>>>> I applied this patch to HEAD cleanly, and compiled without warning.
>>>> It looks good to me.
>>>
>>> Thanks for reviewing the patch! Applied the patch to the master.
>>
>> On second thought, the current version of pg_gin_pending_cleanup might not be
>> sufficient for real-world use, because it moves the tuples from the pending
>> list into the main GIN index structure but doesn't mark the removed pages as
>> free in the FSM. So even if pg_gin_pending_cleanup is called many times, the
>> garbage pages in the pending list will never be freed and reused later. This
>> leaves the GIN index unexpectedly bloated :(
>>
>> To fix that, I think that we should provide not only the tuple-moving but also
>> the mark-as-free functionality.
>
> +1.
>
>> One question here is: how should we provide that functionality?
>> There are basically three options.
>>
>> #1. Provide two separate functions, (1) tuple-move and (2) mark-as-free.
>>       The drawback of this option is that a user needs to call both functions
>>       when he or she wants to move tuples from the pending list and mark the
>>       removed pages as free in the FSM.
>>
>> #2. Provide three separate functions,
>>       (1) tuple-move, (2) mark-as-free and (3) tuple-move + mark-as-free
>>       But we might want to avoid providing three functions here...
>>
>> #3. Provide one function and let the user specify the operation to perform
>>       as an argument. For example, if a user specifies "free" as the
>>       argument, the function does only the mark-as-free operation. If "both"
>>       is specified, both tuple-move and mark-as-free are performed. Of course,
>>       the argument value "move" makes the function perform only tuple-move.
>>       Maybe the default should be "both".
>
> I think that the function that just moves tuples (i.e. function (1)) would
> be useful for testing GIN and pg_bigm on 9.4 or before.
> And function (3) will certainly be helpful in production environments.
> But I'm not sure that the function that just marks pages as free in the FSM
> (i.e. function (2)) would be useful for anything.

Okay.

> Also #3 seems to be overkill.
>
> So IMO, we should add functions (1) and (3).

But I'd like to avoid providing two very similar functions.
So I'm inclined to add something like

    pg_gin_pending_cleanup(index regclass, update_fsm boolean default true)

    If only the index name is given as an argument (or update_fsm is true),
    this function cleans up the pending list and adds the deleted pages to
    the FSM.

    If update_fsm is false, this function just moves the tuples from
    the pending list to the main GIN index structure.

What about adding only the single function above?
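
To make the proposed semantics concrete, here is a minimal sketch in Python.
It is only a toy model, not the actual patch: ToyGinIndex and its fields
(pending_pages, main_tuples, unrecorded_pages, fsm_free_pages) are
hypothetical names invented for illustration, and the real pending list and
FSM are of course more involved. The one thing it is meant to show is how a
single update_fsm flag would cover both the "move only" and the
"move + mark-as-free" cases.

```python
from dataclasses import dataclass, field


@dataclass
class ToyGinIndex:
    """Toy stand-in for a GIN index; NOT PostgreSQL's real data layout."""
    # Each pending-list page is modeled as a list of tuples awaiting insertion.
    pending_pages: list = field(default_factory=list)
    # Tuples already moved into the main GIN structure.
    main_tuples: list = field(default_factory=list)
    # Pages emptied by cleanup but never recorded in the FSM (the bloat case).
    unrecorded_pages: int = 0
    # Pages recorded in the FSM, i.e. available for reuse.
    fsm_free_pages: int = 0


def pg_gin_pending_cleanup(index: ToyGinIndex, update_fsm: bool = True) -> int:
    """Move pending tuples into the main structure; return pages cleaned up."""
    cleaned = len(index.pending_pages)
    for page in index.pending_pages:
        index.main_tuples.extend(page)
    index.pending_pages = []
    if update_fsm:
        # Default: the emptied pages are added to the FSM and can be reused.
        index.fsm_free_pages += cleaned
    else:
        # Move-only mode: the emptied pages stay unknown to the FSM.
        index.unrecorded_pages += cleaned
    return cleaned
```

In this model, calling pg_gin_pending_cleanup(idx) behaves like function (3)
in the list quoted above, and pg_gin_pending_cleanup(idx, update_fsm=False)
behaves like function (1). Function (2) is deliberately not reachable.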

Regards,

-- 
Fujii Masao



