Way back when I was messing with Android reverse engineering, there were already a number of obfuscation/protection systems which screwed with Dalvik VM internals. One particularly memorable one had a native library, written using a completely incompatible ARM ABI (using the stack pointer as a normal register, a different register for stack-like operations in the opposite direction, using random registers and stack slots for arguments, etc.), whose only job was to patch the crap out of the Dalvik VM so it would load their custom obfuscated VM bytecode. The main issue (and the reason this kind of obfuscation seems to have gotten less popular) was that it depended extremely heavily on Dalvik internal structure offsets. It had a massive table of version-specific offsets and patch code, which presumably became unmaintainable with all the extant versions of Android.

Anyway, it’s fun to look at ways to obfuscate bytecode. It’s far too easy to decompile unobfuscated Java code to pretty much perfect source code these days (same goes for any .NET code) - you really do need a little bit of obfuscation to prevent people from trivially stealing your code.

I've heard similar tales about the JRE as well. Cross compilers and obfuscators found many, many patterns that the spec 'allowed' but the runtime did not.

I think the biggest trick I know of in obfuscators is abuse of overloading (calling every method 'a' until you run into a collision on argument types, then calling the colliding methods 'b'), which still decompiles to gibberish for most of us. The nastiest one I saw was that someone realized that keywords and symbols are reserved in Java but not in the JVM. So they started naming things "int" or "{".

I wouldn't be surprised if that's now old hat for decompilers though.
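To make those two renaming tricks concrete, here is a minimal Java sketch. Everything in it is made up for illustration: the class name `RenamingTricks`, the "original" method names in the comments, and the helper `isLegalJvmMethodName`, which only checks my reading of the unqualified-name rule in the JVM spec (no `.`, `;`, `[`, `/`, and no `<` or `>` in method names). It is not how any particular obfuscator works internally.

```java
import javax.lang.model.SourceVersion;

class RenamingTricks {
    // Trick 1: overload-based renaming. Every method is called "a" as long as
    // the parameter types differ. The "was ..." names are hypothetical originals.
    static int a(String s) { return Integer.parseInt(s.trim()); }  // was parsePort
    static String a(int n) { return "port=" + n; }                 // was formatPort
    static boolean a(String s, int n) { return a(s) == n; }        // was matchesPort
    // Same parameter list as a(String), so Java source needs the next name.
    static int b(String s) { return s.length(); }                  // was nameLength

    // Trick 2: names that the class file format accepts but javac rejects.
    // Assumption: this mirrors the JVM spec's unqualified-name rule for methods,
    // ignoring the special names <init> and <clinit>.
    static boolean isLegalJvmMethodName(String name) {
        if (name.isEmpty()) {
            return false;
        }
        for (char c : name.toCharArray()) {
            if (".;[/<>".indexOf(c) >= 0) {
                return false;
            }
        }
        return true;
    }

    // True only for names the Java language itself accepts (keywords are rejected).
    static boolean isLegalJavaName(String name) {
        return SourceVersion.isName(name);
    }

    public static void main(String[] args) {
        // Which "a" is which? The reader has to resolve the overloads by type.
        System.out.println(a(a(" 8080 ")));
        for (String name : new String[] {"a", "int", "{"}) {
            System.out.println(name + ": jvm=" + isLegalJvmMethodName(name)
                    + " java=" + isLegalJavaName(name));
        }
    }
}
```

Names like "int" and "{" pass the JVM-level check but fail the Java-level one, which is why a decompiler that emits them verbatim produces source that will not compile. As far as I know, the class file format also identifies methods by name plus full descriptor (return type included), so bytecode can carry overloads that differ only in return type, something Java source cannot express either.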

nneonneo

Yeah, calling everything “a” is pretty much the only thing ProGuard (the default Android obfuscator) really does that impacts reverse engineering in any way. Most reverse engineers I know working with Android already have good tools to deal with this (interactive renaming, very similar to refactor renaming, works wonders; as does the fact that most Android code leaves class and sometimes method names intact in logging and debug infrastructure!).

Renaming classes and methods into annoying names is something I see with .NET much more frequently - one obfuscator I’ve seen basically uses different mixes of Unicode spaces to name everything, which is pretty cute. Unfortunately, most of these techniques are easy to work around as a reverse engineer.

atesti

What are the best tools for reverse engineering Android apps?

For me, jd-gui did not work very well: it could not decompile many long methods and just threw errors. Is there an actively developed JAR or Dalvik decompiler?

nneonneo

I used a mix of apktool (reading raw Smali code) and jd-gui for a while, but recently transitioned mostly to using JADX: https://github.com/skylot/jadx

JADX is really nice - there’s a decent GUI, and it also supports exporting the whole thing as an Android Studio project so you can directly take advantage of the refactor tools there for more deobfuscation. It also has a smarter decompiler than JD as it’s specialized to go straight from dex to Java (JD makes a trip through the classic JVM class format, which loses some nuance of dex).

Of course, many of the apps I’m taking apart have native components, for which I use Ghidra and IDA.