This reverts commit cc1fc6af4150b19f9c4c70d0463ff498703fb637, since it
causes a number of regressions that do not seem to be easily fixable.
The problem lies in the existence of "freestanding" code, that is, code
that is part of a CU but does not have any block associated with it.
Consider the following program:
__asm__(
".type foo,@function \n"
"foo: \n"
" mov %rdi, %rax \n"
" ret \n"
);
static int foo(int i);
int main(int argc, char **argv) {
return foo(argc);
}
When compiled, the foo function has no block of its own:
Blockvector:
no map
block #000, object at 0x55978957b510, 1 symbols in 0x1129..0x1148
int main(int, char **); block object 0x55978957b380, 0x112d..0x1148 section .text
block #001, object at 0x55978957b470 under 0x55978957b510, 2 symbols in 0x1129..0x1148
typedef int int;
typedef char char;
block #002, object at 0x55978957b380 under 0x55978957b470, 2 symbols in 0x112d..0x1148, function main
int argc; computed at runtime
char **argv; computed at runtime
In this case lookup(0x1129) returns the static block and, because of the
change in cc1fc6af4, contains(0x1129) returns false, which is wrong.
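The difference can be reproduced with a small stand-alone model. This is
only a sketch: toy_block, toy_blockvector, contains_cc1fc6af4,
contains_reverted and make_example are hypothetical names, not GDB's
classes, and lookup () here is a simplified linear scan that merely
mimics the no-map behaviour described above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

/* Simplified stand-in for a GDB block: half-open range [start, end).  */
struct toy_block { uint64_t start, end; };

struct toy_blockvector
{
  /* blocks[0] is the global block, blocks[1] the static block.  */
  std::vector<toy_block> blocks;

  /* Without an address map: return the last (innermost) non-global block
     covering ADDR, which is the static block if no function block does.  */
  const toy_block *lookup (uint64_t addr) const
  {
    const toy_block *best = nullptr;
    for (std::size_t i = 1; i < blocks.size (); ++i)
      if (addr >= blocks[i].start && addr < blocks[i].end)
        best = &blocks[i];
    return best;
  }

  /* Logic of the lines removed by this revert.  */
  bool contains_cc1fc6af4 (uint64_t addr) const
  {
    const toy_block *b = lookup (addr);
    if (b == nullptr)
      return false;
    return b != &blocks[1] || blocks.size () == 2;
  }

  /* Logic restored by this revert.  */
  bool contains_reverted (uint64_t addr) const
  {
    return lookup (addr) != nullptr;
  }
};

/* Ranges copied from the blockvector dump of the first program.  */
inline toy_blockvector make_example ()
{
  toy_blockvector bv;
  bv.blocks.push_back ({0x1129, 0x1148});  /* global */
  bv.blocks.push_back ({0x1129, 0x1148});  /* static */
  bv.blocks.push_back ({0x112d, 0x1148});  /* main */
  return bv;
}
```

In this model, 0x1129 (the start of foo) resolves to the static block, so
contains_cc1fc6af4 () rejects it while contains_reverted () accepts it.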
Such "freestanding" code is perhaps not common but it does exist,
especially in system code. In fact the regressions were at least in part
caused by such "freestanding" code in glibc (libc_sigaction.c).
The whole idea of commit cc1fc6af4 was to handle "holes" in CUs, a case
where one CU spans multiple disjoint regions, possibly interleaved
with other CUs. Consider a somewhat extreme case with two CUs:
/* hole-1.c */
int give_me_zero ();
int
main ()
{
return give_me_zero ();
}
/* hole-2.c */
int __attribute__ ((section (".text_give_me_one"))) __attribute__((noinline))
baz () { return 42; }
__asm__(
".section .text_give_me_one,\"ax\",@progbits\n"
".type foo,@function \n"
"foo: \n"
" mov %rdi, %rax \n"
" ret \n"
" nop \n"
" nop \n"
" nop \n"
);
int __attribute__ ((section (".text_give_me_one"))) __attribute__((noinline))
give_me_one ()
{
return 1;
}
__asm__(
".section .text_give_me_zero,\"ax\",@progbits\n"
"bar: \n"
" jmp give_me_one \n"
" nop \n"
" nop \n"
" nop \n"
);
int __attribute__ ((section (".text_give_me_zero")))
give_me_zero ()
{
extern int bar();
return give_me_one() - 1;
}
When compiled with a carefully crafted linker script that forces code to
certain positions, this creates the following layout:
0x080000..0x080007 # "freestanding" bar from hole-2.c
0x080008..0x080016 # give_me_zero() from hole-2.c
0x080109..0x080114 # main from hole-1.c
0xf00000..0xf0000b # baz() from hole-2.c
0xf0000b..0xf00011 # "freestanding" foo from hole-2.c
0xf00012..0xf0001c # give_me_one() from hole-2.c
The block vector for hole-1.c looks like this:
Blockvector:
no map
block #000, object at 0x555a5d85fb90, 1 symbols in 0x80109..0x80114
int main(void); block object 0x555a5d85faa0, 0x80109..0x80114 section .text
block #001, object at 0x555a5d85faf0 under 0x555a5d85fb90, 1 symbols in 0x80109..0x80114
typedef int int;
block #002, object at 0x555a5d85faa0 under 0x555a5d85faf0, 0 symbols in 0x80109..0x80114, function main
And for hole-2.c:
Blockvector:
map
0x0 -> 0x0
0x80008 -> 0x555a5d85ff50
0x80016 -> 0x0
0xf00000 -> 0x555a5d860280
0xf0000b -> 0x0
0xf00012 -> 0x555a5d860110
0xf0001d -> 0x0
block #000, object at 0x555a5d8603b0, 3 symbols in 0x80008..0xf0001d
int give_me_zero(void); block object 0x555a5d85ff50, 0x80008..0x80016 section .text
int give_me_one(void); block object 0x555a5d860110, 0xf00012..0xf0001d section .text
int baz(void); block object 0x555a5d860280, 0xf00000..0xf0000b section .text
block #001, object at 0x555a5d8602d0 under 0x555a5d8603b0, 1 symbols in 0x80008..0xf0001d
typedef int int;
block #002, object at 0x555a5d85ff50 under 0x555a5d8602d0, 0 symbols in 0x80008..0x80016, function give_me_zero
block #003, object at 0x555a5d860280 under 0x555a5d8602d0, 0 symbols in 0xf00000..0xf0000b, function baz
block #004, object at 0x555a5d860110 under 0x555a5d8602d0, 0 symbols in 0xf00012..0xf0001d, function give_me_one
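For illustration, the address map above can be modelled as a sorted set of
transition points, where a lookup returns the value at the greatest key
not above the address. This is a sketch only: hole2_map and map_lookup are
hypothetical names, block object addresses are replaced by function names,
and an empty string stands for the 0x0 ("no block") entries. It does not
reproduce GDB's actual addrmap implementation.

```cpp
#include <cstdint>
#include <iterator>
#include <map>
#include <string>

/* Transition points copied from the hole-2.c dump.  */
inline std::map<uint64_t, std::string> hole2_map ()
{
  return {
    {0x0, ""},
    {0x80008, "give_me_zero"},
    {0x80016, ""},
    {0xf00000, "baz"},
    {0xf0000b, ""},
    {0xf00012, "give_me_one"},
    {0xf0001d, ""},
  };
}

/* Return the entry in effect at ADDR, or "" if there is none.  */
inline std::string
map_lookup (const std::map<uint64_t, std::string> &m, uint64_t addr)
{
  auto it = m.upper_bound (addr);   /* first key strictly above ADDR */
  if (it == m.begin ())
    return "";
  return std::prev (it)->second;
}
```

With this model, both "freestanding" symbols (bar at 0x80000 and foo at
0xf0000b) map to no block at all.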
Note that although the "freestanding" bar belongs to hole-2.c, the
corresponding CU's global and static blocks start at 0x80008! Looking
at the DWARF for the second program, it appears that the compiler (GCC 15)
did not record the presence of the "freestanding" code:
<0><71>: Abbrev Number: 1 (DW_TAG_compile_unit)
<72> DW_AT_producer : (indirect string, offset: 0): GNU C23 15.2.0 -mtune=generic -march=x86-64 -g -fasynchronous-unwind-tables
<76> DW_AT_language : 29 (C11)
<77> Unknown AT value: 90: 3
<78> Unknown AT value: 91: 0x31647
<7c> DW_AT_name : (indirect line string, offset: 0x2d): hole-2.c
<80> DW_AT_comp_dir : (indirect line string, offset: 0): test_programs
<84> DW_AT_ranges : 0xc
<88> DW_AT_low_pc : 0
<90> DW_AT_stmt_list : 0x51
and the corresponding part of .debug_aranges:
Length: 76
Version: 2
Offset into .debug_info: 0x65
Pointer Size: 8
Segment Size: 0
Address Length
0000000000f00000 000000000000000b
0000000000f00012 000000000000000b
0000000000080008 000000000000000e
0000000000000000 0000000000000000
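A quick way to confirm that the "freestanding" code is missing from these
ranges is to test its addresses against the tuples. The following is a
sketch with hypothetical names (arange, hole2_aranges, covered); the
tuples are copied from the dump above and the terminating zero entry is
omitted.

```cpp
#include <cstdint>
#include <vector>

/* One (address, length) tuple from .debug_aranges.  */
struct arange { uint64_t address, length; };

inline std::vector<arange> hole2_aranges ()
{
  std::vector<arange> v;
  v.push_back ({0xf00000, 0xb});   /* baz */
  v.push_back ({0xf00012, 0xb});   /* give_me_one */
  v.push_back ({0x80008, 0xe});    /* give_me_zero */
  return v;
}

/* True if PC lies within any of the ranges in V.  */
inline bool covered (const std::vector<arange> &v, uint64_t pc)
{
  for (const arange &r : v)
    if (pc >= r.address && pc - r.address < r.length)
      return true;
  return false;
}
```

Neither 0x80000 (bar) nor 0xf0000b (foo) is covered by any tuple, which
matches the holes seen in the blockvector.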
Thiago suggested using minsymbols to tell whether a CU contains a
given address. I do not think this would work reliably, as minsymbols do
not know to which CU they belong. In the slightly more complicated case of
interleaved CUs, it does not seem possible to tell for sure to which
one a given minsymbol belongs.
Moreover, Tom suggested that the comment in find_compunit_symtab_for_pc_sect
(which led to cc1fc6af4) may be outdated [2].
Given all that, I'm just reverting the change.
[1]: https://sourceware.org/bugzilla/show_bug.cgi?id=33679#c13
[2]: https://inbox.sourceware.org/gdb-patches/87cy6xzd3j.fsf@tromey.com/
Approved-By: Tom Tromey <tom@tromey.com>
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=33679
SELF_CHECK (bv->contains (0x1500) == true);
/* Test address falling into a "hole". If BV has an address map,
- lookup () returns nullptr. If not, lookup () return static block.
- contains() returns false in both cases. */
+ lookup () returns nullptr and contains () returns false. If not,
+ lookup () returns the static block and contains () returns true. */
if (with_map)
- SELF_CHECK (bv->lookup (0x2500) == nullptr);
+ {
+ SELF_CHECK (bv->lookup (0x2500) == nullptr);
+ SELF_CHECK (bv->contains (0x2500) == false);
+ }
else
- SELF_CHECK (bv->lookup (0x2500) == bv->block (STATIC_BLOCK));
- SELF_CHECK (bv->contains (0x2500) == false);
+ {
+ SELF_CHECK (bv->lookup (0x2500) == bv->block (STATIC_BLOCK));
+ SELF_CHECK (bv->contains (0x2500) == true);
+ }
/* Test address falling into a block above the "hole". */
SELF_CHECK (bv->lookup (0x3500) == bv->block (3));
bool
blockvector::contains (CORE_ADDR addr) const
{
- auto b = lookup (addr);
- if (b == nullptr)
- return false;
-
- /* Handle the case that the blockvector has no address map but still has
- "holes". For example, consider the following blockvector:
-
- B0 0x1000 - 0x4000 (global block)
- B1 0x1000 - 0x4000 (static block)
- B3 0x1000 - 0x2000
- (hole)
- B4 0x3000 - 0x4000
-
- In this case, the above blockvector does not contain address 0x2500 but
- lookup (0x2500) would return the blockvector's static block.
-
- So here we check if the returned block is a static block and if yes, still
- return false. However, if the blockvector contains no blocks other than
- the global and static blocks and ADDR falls into the static block,
- conservatively return true.
-
- See comment in find_compunit_symtab_for_pc_sect, symtab.c.
-
- Also, note that if the blockvector in the above example would contain
- an address map, then lookup (0x2500) would return NULL instead of
- the static block.
- */
- return b != static_block () || num_blocks () == 2;
+ return lookup (addr) != nullptr;
}
/* See block.h. */