potential bug on wal_files service on low activity standby with 0 wal file #314

Open
l00ptr opened this issue Jun 7, 2022 · 1 comment


l00ptr commented Jun 7, 2022

The check_wal_files function emits warnings when running the wal_files service. It also reports a wrong number of WAL files on one of our low-activity standbys:

$ ~/bin/check_pgactivity  --service wal_files

Use of uninitialized value in string gt at ./bin/check_pgactivity line 8790.
Use of uninitialized value $first_seg in string gt at ./bin/check_pgactivity line 8790.
Use of uninitialized value $seg_kept in numeric gt (>) at ./bin/check_pgactivity line 8802.
Use of uninitialized value in string ne at ./bin/check_pgactivity line 8825.
Use of uninitialized value $ratio in multiplication (*) at ./bin/check_pgactivity line 795.
Use of uninitialized value $ratio in multiplication (*) at ./bin/check_pgactivity line 795.

POSTGRES_WAL_FILES OK: 1 WAL files | total_wal=1 recycled_wal=0 tli=0 written_wal=1 kept_wal=0

I think this is because the following query returns an empty result set:

DEBUG: host service:main is version 9.0.23/90023
DEBUG: Query: 
        SELECT s.f,
          greatest(
            1 + current_setting('checkpoint_segments')::float4 *
              (2 + current_setting('checkpoint_completion_target')::float4),
            1 + current_setting('wal_keep_segments')::float4 +
              2 * current_setting('checkpoint_segments')::float4
          ),
          CASE WHEN pg_is_in_recovery()
            THEN NULL
            ELSE pg_current_xlog_location()
          END,
          current_setting('wal_keep_segments')::integer,
          substring(s.f from 1 for 8) AS tli
        FROM pg_ls_dir('pg_xlog') AS s(f)
        WHERE f ~ '^[0-9A-F]{24}$'
        ORDER BY
          (pg_stat_file('pg_xlog/'||s.f)).modification DESC,
          f DESC
DEBUG: Env. service:  main 
DEBUG: Query rc: 0
DEBUG: Query result: $VAR1 = [];

Dumping the rs variable with dprint at https://github.com/OPMDG/check_pgactivity/blob/master/check_pgactivity#L8946 shows this:

rs: $VAR1 = [
          []
        ];

So I think that is why we iterate (at least once) through this empty array on L8951 and get the result 1 WAL files | total_wal=1
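
A minimal sketch, not taken from check_pgactivity, of why the loop would run once: an array holding one empty row is not the same as an empty array. The names `count_rows`, `@empty` and `@one_empty_row` are illustrative only.

```perl
use strict;
use warnings;

# What the query returned: no rows at all.
my @empty = ();
# What the dumped rs variable looked like: one row with no columns.
my @one_empty_row = ( [] );

# Count iterations the same way a foreach over the rows would.
sub count_rows {
    my @rows = @_;
    my $n = 0;
    $n++ foreach @rows;
    return $n;
}

printf "empty result set: %d iteration(s)\n", count_rows(@empty);
printf "one empty row:    %d iteration(s)\n", count_rows(@one_empty_row);
```

The first call reports 0 iterations and the second reports 1, which matches the "1 WAL files" output on a standby that has none.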


l00ptr commented Jun 7, 2022

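After searching a little more, this issue seems related to the three lines of code just after @rs = @{ query_ver( $hosts[0], %queries ) };. When @rs is empty, reading $rs[0][0] autovivifies $rs[0] as an empty array reference, so @rs ends up holding one (empty) row: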

    $first_seg = $rs[0][0];
    $max_segs  = $rs[0][1]; #segments to keep including kept segments
    $tli = hex($rs[0][4]);

Here is an isolated test case that reproduces the error:

~ cat hydrat.pl 

sub machin() {
    return [];
}
@rs = @{ machin() };
my $var_machin = $rs[0][1];

foreach my $row (@rs) {
    print "oups";
}

#print( machin() );
~ perl hydrat.pl
oups% 
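
One possible way to avoid the phantom row, sketched under assumptions: check that the result set is non-empty before indexing into it. `fake_query` is a hypothetical stand-in for `query_ver`, and this is not necessarily how the maintainers would choose to fix it.

```perl
use strict;
use warnings;

# Hypothetical stand-in for query_ver(): returns a reference to an empty
# array, as the query did on the idle standby.
sub fake_query { return []; }

my @rs = @{ fake_query() };

my ( $first_seg, $max_segs, $tli );

# Only index into the first row when there is one. Reading $rs[0][0] on an
# empty @rs would autovivify $rs[0] as an empty array reference.
if (@rs) {
    $first_seg = $rs[0][0];
    $max_segs  = $rs[0][1];
    $tli       = hex( $rs[0][4] );
}

my $rows = scalar @rs;
print "rows after guarded read: $rows\n";
```

With the guard in place, @rs stays empty, so a later foreach over the rows does not run and the variables remain undefined until the caller decides how to report "no WAL files".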
