OCaml native code debugging

Note: English translation of my previous post

t-caml-valid-callgraph.png

Now that the fix for the "Improve gnu ELF" bug is committed in OCaml 3.11+, KCachegrind can generate beautiful call graphs.

This patch consists of adding .size directives (in the ELF assembly code) so that valgrind can resolve all symbols (camlT_entry, camlT_foo, camlT_bar, ...) and display symbol names instead of hexadecimal addresses!

ELF instructions for debug

Now we would like these tools to also display the file name and line number of every function. (The file name is already encoded in the symbol names, but that is not enough to make full use of these tools.)

Let's see how gcc does it:

  int bar(int a) {
        return 1+a;
  }

  int foo(int a) {
        return 2+bar(a);
  }

  int main() {
        foo(3);
  }

  $ gcc -O0 -g -S t.c
  .globl bar
        .type   bar, @function
  bar:
  .LFB2:
        .file 1 "t.c"
        .loc 1 1 0
        pushq   %rbp
  .LCFI0:
        movq    %rsp, %rbp
  .LCFI1:
        movl    %edi, -4(%rbp)
        .loc 1 2 0
        movl    -4(%rbp), %eax
        incl    %eax
        .loc 1 3 0
        leave
        ret
  .LFE2:
        .size   bar, .-bar
  .globl foo
        .type   foo, @function

The useful directives are .file and .loc:

-> .file $file_id "$file_path"

-> .loc $file_id $line $column
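As a rough illustration of these two directive shapes, here is a small OCaml sketch that formats them as strings. The helper names `file_directive` and `loc_directive` are hypothetical and are not part of the compiler; the sketch also assumes the path contains no characters that need escaping.

```ocaml
(* Hypothetical helpers, not taken from the OCaml compiler sources. *)

(* .file <file_id> "<file_path>": registers a source file under a numeric id.
   Assumption: the path contains no quotes or backslashes. *)
let file_directive ~file_id ~path =
  Printf.sprintf "\t.file\t%d \"%s\"" file_id path

(* .loc <file_id> <line> <column>: the instructions that follow
   come from this source position. *)
let loc_directive ~file_id ~line ~column =
  Printf.sprintf "\t.loc\t%d %d %d" file_id line column

let () =
  (* Reproduces the shape of the first two directives of the gcc output. *)
  print_endline (file_directive ~file_id:1 ~path:"t.c");
  print_endline (loc_directive ~file_id:1 ~line:1 ~column:0)
```

The column is passed explicitly here; the gcc output above always uses 0.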

Suggested solution

The compiler module that emits these directives is asmcomp/i386/emit.mlp,
and in particular its "fundecl" function:

  let fundecl fundecl =
    function_name := fundecl.fun_name;
    fastcode_flag := fundecl.fun_fast;
    (* ... *)
    `   .globl  {emit_symbol fundecl.fun_name}
`;
    `{emit_symbol fundecl.fun_name}:
`;
    if !Clflags.gprofile then emit_profile();
    let n = frame_size() - 4 in
    if n > 0 then
      ` subl    ${emit_int n}, %esp
`;
    `{emit_label !tailrec_entry_point}:
`;
    emit_all true fundecl.fun_body;
    List.iter emit_call_gc !call_gc_sites;
    emit_call_bound_errors ();
    List.iter emit_float_constant !float_constants;
    match Config.system with
      "linux_elf" | "bsd_elf" | "gnu" ->
        `       .type   {emit_symbol fundecl.fun_name},@function
`;
        `       .size   {emit_symbol fundecl.fun_name},.-{emit_symbol fundecl.fun_name}
`
    | _ -> ()

However, the only data available there is the fundecl value:

  type fundecl = 
  { fun_name: string;
    fun_body: instruction;
    fun_fast: bool } 
  type instruction =
  { mutable desc: instruction_desc;
    mutable next: instruction;
    arg: Reg.t array;
    res: Reg.t array;
    dbg: Debuginfo.t;
    live: Reg.Set.t }

Instructions do carry a dbg field, but it is rarely set. (Compiling with the -dlinear option shows this.)

I decided to add a fun_dbg : Debuginfo.t field to the "fundecl" type and to fill it in at every compilation step. It may be more clever to work on the (often empty) "dbg" field instead: that would allow position information on every instruction, which could be useful for valgrind and gdb. This patch is not optimised: it repeats the .file directive for each .loc, and therefore in each function header.
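One possible way to avoid the repeated .file directive would be to remember which files have already been given an id. The sketch below is only an illustration and is not the patch itself: `dbg` is a simplified stand-in record (not the real Debuginfo.t), `file_ids` and `emit_loc` are hypothetical names, and the function returns the directive text instead of writing to the assembly output.

```ocaml
(* Simplified stand-in for the compiler's Debuginfo.t (assumption). *)
type dbg = { dinfo_file : string; dinfo_line : int }

(* Maps each source file already announced with .file to its numeric id. *)
let file_ids : (string, int) Hashtbl.t = Hashtbl.create 7

(* Returns the directives to emit for a position: a .file line only
   the first time a file is seen, followed by a .loc line. *)
let emit_loc (d : dbg) : string list =
  match Hashtbl.find_opt file_ids d.dinfo_file with
  | Some id -> [ Printf.sprintf "\t.loc\t%d %d 0" id d.dinfo_line ]
  | None ->
      (* Ids start at 1, like in the gcc output. *)
      let id = Hashtbl.length file_ids + 1 in
      Hashtbl.add file_ids d.dinfo_file id;
      [ Printf.sprintf "\t.file\t%d \"%s\"" id d.dinfo_file;
        Printf.sprintf "\t.loc\t%d %d 0" id d.dinfo_line ]

let () =
  (* Two functions from the same file: .file is emitted only once. *)
  List.iter print_endline (emit_loc { dinfo_file = "t.ml"; dinfo_line = 2 });
  List.iter print_endline (emit_loc { dinfo_file = "t.ml"; dinfo_line = 6 })
```

A global table keeps the sketch short; in the compiler the state would have to be reset for each compilation unit.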

-> Patch based on release311 branch, but works on current trunk

Now let's see what this patch brings.

gdb results

  $ ocamlopt -g -inline 0 t.ml
  $ gdb a.out
  (gdb) break t.ml:6
  Breakpoint 1 at 0x8049940: file t.ml, line 6.
  (gdb) run
  Starting program: /home/alex/callgraph/a.out 
  

  Breakpoint 1, camlT__foo_60 () at t.ml:6
  6       let foo i =
  Current language: auto; currently asm
  (gdb) backtrace
  #0  camlT__foo_60 () at t.ml:7
  #1  0x0804c570 in camlT__entry () at t.ml:12
  #2  0x0806e4b7 in caml_start_program ()
  (gdb) step 1
  camlT__bar_58 () at t.ml:2
  2       let bar i =
  (gdb) list
  1
  2       let bar i =
  3               1+i
  4       ;;
  5
  6       let foo i =
  7               2+(bar i )
  8       ;;
  9
  10      let () =

gprof results

  $ ocamlopt -g -p -inline 0 t.ml
  $ ./a.out
  $ gprof -A
  *** File /home/alex/callgraph/t.ml:
                  
       1 -> let bar i =
                    Thread.delay 3.0;
                    1+i
            ;;
            
       1 -> let foo i  =
                    2+(bar i )
            ;;
            
            let () =
       1 ->         let closure() = 3 in
                    print_int ( foo (closure()) )
            ;;
            

  Top 10 Lines:

       Line      Count

          2          1
          7          1
         12          1

  Execution Summary:

          3   Executable lines in this file
          3   Lines executed
     100.00   Percent of the file executed

          3   Total number of line executions
       1.00   Average executions per line

valgrind/kcachegrind results

  $ ocamlopt -g -inline 0 t.ml
  $ valgrind --tool=callgrind ./a.out
  $ callgrind_annotate callgrind.out.2152 t.ml

  -- User-annotated source: t.ml

       .
       8  let bar i =
  77,715  => thread.ml:camlThread__delay_75 (1x)
       .          Thread.delay 3.0;
       .          1+i
       .  ;;
       .
       3  let foo i =
  77,723  => t.ml:camlT__bar_58 (1x)
       .          2+(bar i )
       .  ;;
       .
       .  let () =
      13          let closure() = 3 in
   1,692  => pervasives.ml:camlPervasives__output_string_215 (1x)
   2,312  => pervasives.ml:camlPervasives__string_of_int_154 (1x)
  77,726  => t.ml:camlT__foo_60 (1x)
       .          print_int ( foo (closure()) )
       .  ;;
       .

  $ kcachegrind callgrind.out.2152

kcachegrind-file-and-line.png



What's next?

First we need to wait for the OCaml community to approve this new feature; I've submitted it there.
If someone from INRIA reads this, don't hesitate to contact me: I'm open to working on a different approach.
After that, we can hope for a lot of new features, such as: