What does HackerNews think of bloaty?
Bloaty McBloatface: a size profiler for binaries
When I worked on Matter¹, the Xtensa and RISC-V versions were basically fungible from the software point of view. (And really, so were other vendors' various ARMs.) We did find that Bloaty McBloatface² didn't support Xtensa, so I had to write an alternative.
https://github.com/google/bloaty
Or goweight.
Like the other Google project, Bloaty McBloatface.
Is it reasonably easy to attribute individual entries in pclntab to specific symbols? If so I'd love to add this capability to https://github.com/google/bloaty which already tries to do per-symbol analysis of many other sections (eh_frame, rela.dyn, etc).
> At this time, I do not have a satisfying explanation for this “dark” file usage.
The author's journey of starting with "nm --size", discovering "dark" bytes, and wanting to attribute them properly, is exactly what led me to create and invest so much effort into Bloaty McBloatface: https://github.com/google/bloaty
Bloaty's core principle is that every byte of the file should be attributed to something, so that the sum of the parts always adds up to the total file size. If we can't get detailed symbol, etc. information for a given region of the file, we can at least fall back to describing what section the bytes were in.
Attributing all of the bytes requires parsing much more than just the symbol table. Bloaty parses many different sections of the binary, including unwind information, relocation information, debug info, and the data section itself in an attempt to attribute every part of the binary to the function/data that emitted it. It will even disassemble the binary looking for references to anonymous data (some data won't make it into the symbol table, especially things like string literals).
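The "every byte is attributed to something" principle can be illustrated with a small Python sketch. This is not Bloaty's implementation (Bloaty is written in C++ and draws on many more data sources); the function name `attribute`, the bracketed fallback labels, and the sample offsets are all hypothetical, chosen only to show the fallback from symbol to section to "unmapped".

```python
from collections import defaultdict


def attribute(file_size, sections, symbols):
    """Label every byte in [0, file_size) and return {label: byte_count}.

    sections and symbols are lists of (name, start, end) half-open ranges.
    """
    # Every offset where the label could change, clamped to the file.
    cuts = {0, file_size}
    for _, start, end in sections + symbols:
        cuts.add(max(0, min(start, file_size)))
        cuts.add(max(0, min(end, file_size)))
    cuts = sorted(cuts)

    def lookup(ranges, offset):
        for name, start, end in ranges:
            if start <= offset < end:
                return name
        return None

    totals = defaultdict(int)
    for lo, hi in zip(cuts, cuts[1:]):
        label = lookup(symbols, lo)              # most detailed source first
        if label is None:
            section = lookup(sections, lo)       # fall back to the section
            label = f"[{section}]" if section else "[Unmapped]"
        totals[label] += hi - lo
    return dict(totals)


# Hypothetical layout: two symbols cover only part of .text.
sections = [(".text", 0x40, 0x100), (".rodata", 0x100, 0x140)]
symbols = [("main", 0x40, 0xA0), ("helper", 0xA0, 0xE0)]
report = attribute(0x180, sections, symbols)

# The parts always add up to the whole file.
assert sum(report.values()) == 0x180
```

Because each byte gets exactly one label, nothing is counted twice and nothing is left out, which is what makes the totals trustworthy.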
I wrote up some details of how Bloaty works here: https://github.com/google/bloaty/blob/master/doc/how-bloaty-.... The section on the "Symbols" data source is particularly relevant here:
> I excerpted two symbols from the report. Between these two symbols, Bloaty has found seven distinct kinds of data that contributed to these two symbols. If you wrote a tool that naively just parsed the symbol table, you would only find the first of these seven:
The author's contention that these "dark" bytes are "non-useful" is not quite fair. There are plenty of things a binary contains that are useful even though they are not literally executable code. For example, making a binary position-independent (which is good for security) requires emitting relocations into the binary so that globals with pointer values can be relocated at program load time, once the base address of the binary is chosen. I don't know if Go does this or not, but it's just one example.
On the other hand, I do agree that the ability to produce slim binaries is an important and often undervalued property of modern compiler toolchains. All else being equal, I much prefer a toolchain that can make the smallest binaries.
Bloaty should work reasonably well for Go binaries, though I have gotten some bug reports about things Bloaty is not yet handling properly for Go: https://github.com/google/bloaty/issues/204 Bloaty is just a side thing for me, so I often don't get as much time as I'd like to fix bugs like this.
Bloaty McBloatface.
$ bloaty `which touch` -d segments,sections --domain=file
    FILE SIZE
 --------------
  55.1%  19.6Ki    __LINKEDIT
      91.7%  18.0Ki    Code Signature
       2.7%     544    Symbol Table
       1.9%     376    Lazy Binding Info
       1.6%     328    String Table
       1.2%     232    Indirect Symbol Table
       0.6%     112    Binding Info
       0.2%      32    Export Info
       0.1%      24    Function Start Addresses
       0.0%       8    Rebase Info
       0.0%       8    Table of Non-instructions
       0.0%       0    [__LINKEDIT]
  22.2%  7.90Ki    __TEXT
      35.0%  2.77Ki    __TEXT,__text
      33.6%  2.65Ki    [__TEXT]
      17.2%  1.36Ki    [Mach-O Headers]
       4.2%     339    __TEXT,__cstring
       3.4%     274    __TEXT,__const
       3.3%     266    __TEXT,__stub_helper
       1.9%     150    __TEXT,__stubs
       1.5%     120    __TEXT,__unwind_info
  11.2%  4.00Ki    __DATA
      94.9%  3.80Ki    [__DATA]
       4.9%     200    __DATA,__la_symbol_ptr
       0.2%       8    __DATA,__data
  11.2%  4.00Ki    __DATA_CONST
      98.4%  3.94Ki    [__DATA_CONST]
       1.6%      64    __DATA_CONST,__got
   0.3%     104    [Mach-O Headers]
 100.0%  35.6Ki    TOTAL
So that's 2.77Ki of actual code in "__TEXT,__text". The bracketed [__TEXT], [__DATA], and [__DATA_CONST] entries are the part that is lost to padding, so 10.4Ki or so.
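The padding estimate can be checked with a few lines of Python. The sizes are copied from the report; treating the bracketed entries as padding follows the commenter's reading and is an assumption here, not something the report itself states.

```python
# Sizes in KiB, copied from the bloaty report for `touch`.
bracketed = {"[__TEXT]": 2.65, "[__DATA]": 3.80, "[__DATA_CONST]": 3.94}
total_ki = 35.6

padding_ki = sum(bracketed.values())      # about 10.39
padding_share = padding_ki / total_ki     # a bit under 30% of the file

print(f"{padding_ki:.2f}Ki in bracketed entries, {padding_share:.0%} of the file")
```

On this reading, the bracketed entries take up nearly four times as much of the file as the 2.77Ki of code.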
Disclosure: I am the author of Bloaty.
I frequently use this tool to answer questions for C++ binaries, another language that has a penchant for producing large executables.
ELF (and Mach-O, PE, etc) are designed to optimize the creation of a process image. The runtime loader mainly just has to mmap() a bunch of file ranges into memory with various permissions. This is quite different than loading a .jar file, .pyc, etc. which involve building a runtime heap and loading objects into that heap.
ELF has two file-level tables: sections and segments (the latter are also called program headers). Things clicked for me when I realized: sections are for the linker and segments are for the runtime loader. Sections are the atomic unit of data that linkers operate on: the linker will never rearrange data within a section (it may concatenate several input sections into a single output section though). The loader doesn't even look at the section table AFAIK, everything needed to load the binary is put into segments / program headers.
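To make the "segments are for the loader" point concrete, here is a minimal Python sketch that reads only the ELF header and program headers. It assumes a little-endian ELF64 file and ignores everything else; the function name `load_segments` and the synthetic test image are made up for illustration. The image deliberately has no section table at all (`e_shoff` and `e_shnum` are zero), yet the load information is still fully recoverable.

```python
import struct

# ELF64 little-endian layouts, per the System V ABI.
ELF_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")
PROGRAM_HEADER = struct.Struct("<IIQQQQQQ")
PT_LOAD = 1


def load_segments(data):
    """Return (file_offset, vaddr, filesz, memsz, flags) for each PT_LOAD."""
    fields = ELF_HEADER.unpack_from(data, 0)
    ident, phoff, phentsize, phnum = fields[0], fields[5], fields[9], fields[10]
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("this sketch only handles little-endian ELF64")
    segments = []
    for i in range(phnum):
        (p_type, flags, offset, vaddr, _paddr,
         filesz, memsz, _align) = PROGRAM_HEADER.unpack_from(
            data, phoff + i * phentsize)
        if p_type == PT_LOAD:
            segments.append((offset, vaddr, filesz, memsz, flags))
    return segments


# Synthetic image: one read/write PT_LOAD segment and no section table.
ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)
header = ELF_HEADER.pack(ident, 2, 62, 1, 0x400000,
                         64, 0,          # e_phoff, e_shoff (no sections)
                         0, 64, 56, 1,   # e_flags, e_ehsize, e_phentsize, e_phnum
                         0, 0, 0)        # e_shentsize, e_shnum, e_shstrndx
segment = PROGRAM_HEADER.pack(PT_LOAD, 6, 0, 0x400000, 0x400000,
                              0x1000, 0x1800, 0x1000)
demo = header + segment

segs = load_segments(demo)
```

Each tuple is roughly what a loader needs for an mmap() call: where the bytes live in the file, where they go in memory, and with which permissions. A `memsz` larger than `filesz` means the tail of the segment is zero-filled rather than read from the file.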
Only some parts of the binary are actually read/loaded when the binary is executed. Debugging info may bloat the binary but it doesn't cost any RAM at runtime because it's never loaded unless you run a debugger. Bloaty makes this clear by showing both VM size and file size: https://github.com/google/bloaty#running-bloaty
My tool Bloaty (https://github.com/google/bloaty) attempts to do exactly this. It even disassembles the binary looking for instructions that reference other sections like .rodata.
It doesn't currently know anything about Rust's name mangling scheme. I'd be happy to add this, though I suppose Rust's mangling is probably written in Rust and Bloaty is written in C++.
I wrote a size profiling tool that can give much more precise measurements (like size(1) on steroids, see: https://github.com/google/bloaty). Here is output for LuaJIT:
$ bloaty src/luajit -n 5
     VM SIZE                      FILE SIZE
 --------------                --------------
  74.3%   323Ki .text            323Ki  73.8%
  12.5%  54.5Ki .eh_frame       54.5Ki  12.4%
   7.6%  33.2Ki .rodata         33.2Ki   7.6%
   2.2%  9.72Ki [Other]         12.9Ki   2.9%
   2.1%  9.03Ki .eh_frame_hdr   9.03Ki   2.1%
   1.2%  5.41Ki .dynsym         5.41Ki   1.2%
 100.0%   435Ki TOTAL            438Ki 100.0%
And for Pixie:
$ bloaty pixie/pixie-vm -n 5
     VM SIZE                      FILE SIZE
 --------------                --------------
  57.5%  4.39Mi .text           4.39Mi  44.7%
  33.7%  2.58Mi .data           2.58Mi  26.3%
   0.0%       0 .symtab         1.31Mi  13.4%
   0.0%       0 .strtab          978Ki   9.7%
   8.8%   688Ki [Other]          595Ki   5.9%
   0.0%       8 [None]               0   0.0%
 100.0%  7.64Mi TOTAL           9.82Mi 100.0%
In this case, neither binary had debug info. Pixie does appear to have a symbol table though, which LuaJIT has mostly stripped.
In general, I think "VM size" is the best general number to cite when talking about binary size, since it avoids penalizing binaries for keeping around debug info or symbol tables. Symbol tables and debug info are useful; we don't want people to feel pressured to strip them just to avoid looking bad in conversations about binary size.
> You can examine binary images using the nm and objdump commands to display symbols, their addresses, segments, and so on.
You can also use my new tool Bloaty McBloatface (https://github.com/google/bloaty). Check out the -v option especially, which will dump a memory map of both the file domain and the VM address domain:
$ ./bloaty `which ls` -v -d segments
FILE MAP:
[0, 19d44] LOAD [RX], LOAD [RX]
[19d44, 19df0] [None], [Unmapped]
[19df0, 1a5f4] LOAD [RW], LOAD [RW]
[1a5f4, 1a700] [None], [Unmapped]
[1a700, 1ae00] [None], [ELF Headers]
VM MAP:
[0, 400000] NO ENTRY
[400000, 419d44] LOAD [RX], LOAD [RX]
[419d44, 619df0] NO ENTRY
[619df0, 61a5f4] LOAD [RW], LOAD [RW]
[61a5f4, 61b360] LOAD [RW], LOAD [RW]
     VM SIZE                      FILE SIZE
 --------------                --------------
  95.1%   103Ki LOAD [RX]        103Ki  96.1%
   4.9%  5.36Ki LOAD [RW]       2.00Ki   1.9%
   0.0%       0 [ELF Headers]   1.75Ki   1.6%
   0.0%       0 [Unmapped]         440   0.4%
 100.0%   108Ki TOTAL            107Ki 100.0%
If you leave off "-d segments" the map will include all sections too (like .bss, .text, etc). Here is an example of that output: http://pastebin.com/3XGcqA8k
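The summary table can be re-derived from the two maps, which also shows why LOAD [RW] is larger in the VM domain than in the file. The Python sketch below copies the hexadecimal ranges from the `-v` output for `ls`, collapsing each doubled label to one. The helper name `totals` is made up, and reading the extra VM-only range as zero-initialized data such as .bss is an assumption.

```python
# (start, end, label) ranges copied from the FILE MAP and VM MAP output.
file_map = [
    (0x0,     0x19d44, "LOAD [RX]"),
    (0x19d44, 0x19df0, "[Unmapped]"),
    (0x19df0, 0x1a5f4, "LOAD [RW]"),
    (0x1a5f4, 0x1a700, "[Unmapped]"),
    (0x1a700, 0x1ae00, "[ELF Headers]"),
]
vm_map = [  # "NO ENTRY" gaps are omitted: they contribute no bytes
    (0x400000, 0x419d44, "LOAD [RX]"),
    (0x619df0, 0x61a5f4, "LOAD [RW]"),
    (0x61a5f4, 0x61b360, "LOAD [RW]"),  # in memory only, no file bytes
]


def totals(ranges):
    """Sum the byte length of each label's ranges."""
    sizes = {}
    for start, end, label in ranges:
        sizes[label] = sizes.get(label, 0) + (end - start)
    return sizes


file_sizes = totals(file_map)
vm_sizes = totals(vm_map)

# LOAD [RW]: 2052 bytes in the file (2.00Ki), 5488 bytes in memory (5.36Ki).
rw_extra = vm_sizes["LOAD [RW]"] - file_sizes["LOAD [RW]"]
```

The 3436-byte difference is exactly the last VM range, [61a5f4, 61b360], which has no counterpart in the file map. In the other direction, the 440 unmapped bytes and the 1.75Ki of ELF headers exist only in the file and never appear in the VM map.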