How can someone learn more about reverse engineering and decompiling

a_regular_pear
Posts: 15
Joined: Sat Dec 31, 2022 6:00 pm

Post by a_regular_pear »

I've recently been fascinated by Matt's videos where he gets a rough idea of how a program works by looking at its hex code. So I've been wondering what a good way to learn how to do that is, and how to gain some general knowledge of reverse engineering.
acidiclight
Posts: 88
Joined: Tue Dec 27, 2022 10:53 pm
Location: MeteoTech Premises

Post by acidiclight »

As a programmer I would say that being able to engineer and compile software will naturally help out with reverse engineering and decompiling.

Over time you'll tend to pick up patterns and rough ideas of how software could've been written based on how you yourself would have done it.

I'm not Matt, but I know he also writes code. I'm sure some of that knowledge plays into his reverse engineering projects too.
acidic light

I'm a blind game developer. I write code because it's fun.
a_regular_pear
Posts: 15
Joined: Sat Dec 31, 2022 6:00 pm

Post by a_regular_pear »

I do have some programming knowledge. While it's mostly high level, I know and have used C. So should I first focus on becoming proficient in C++ or even x86 assembly before focusing on reverse engineering specifically?
acidiclight
Posts: 88
Joined: Tue Dec 27, 2022 10:53 pm
Location: MeteoTech Premises

Post by acidiclight »

I would rather just get more familiar with different languages, libraries and frameworks. Different OSes and platforms too if that's something you're interested in.

You could also dive into malware and ethical hacking. There are lots of YouTubers that take malware samples and reverse engineer them. You could learn a lot from that. Plus it can even be turned into a career if that's something you're interested in.
Halamix2
Posts: 6
Joined: Sat Dec 10, 2022 6:39 pm
Location: Poland

Post by Halamix2 »

In my case, I started reverse-engineering Stunt GP on and off a few years ago.
I started by loading the binary in Ghidra and looking around, finding the main function etc. Then I added and learned more and more tools and concepts along the way. I can't recall my exact learning steps, but I can write more about my current setup and useful knowledge:

Tools:
  • Ghidra, sometimes with astrelsky/Ghidra-Cpp-Class-Analyzer
  • Cheat Engine, as I combine static analysis with dynamic analysis. I use it for memory analysis and for patching code on the fly
  • Excel & Speedcrunch for calculations; the latter has useful functions to convert from/to hex and/or IEEE float
  • DXWnd if the game refuses to run windowed otherwise
  • SysInternals Process Monitor to check which files/registry keys are accessed by the game and at which moment
  • As I currently write .dll mods for the game, I use DebugView++ to view logs I send from my mods with the Windows OutputDebugString() function
  • Detect It Easy to find out which compiler was used for compilation. Back then, the Function ID plugin for Ghidra required additional files based on the MSVC version; I don't know how it works now
  • 010 Editor for binary files (paid). Basically any hex editor will do, but I love its binary templates and scripting support
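
For anyone without a calculator like that at hand, the hex/IEEE float conversion mentioned in the list can be roughly sketched in Python with the standard struct module. The helper names hex_to_float and float_to_hex are made up for this example, and it assumes 32-bit little-endian floats, which is what you'd typically see in an x86 Windows game but is still an assumption:

```python
import struct

def hex_to_float(hex_bytes: str) -> float:
    """Interpret 4 bytes, written as they appear in a hex editor, as a little-endian float32."""
    raw = bytes.fromhex(hex_bytes)  # spaces between byte pairs are allowed
    return struct.unpack("<f", raw)[0]

def float_to_hex(value: float) -> str:
    """Return the 4 little-endian bytes of a float32 as a hex-editor-style string."""
    return struct.pack("<f", value).hex(" ").upper()

if __name__ == "__main__":
    print(hex_to_float("00 00 80 3F"))  # 1.0
    print(float_to_hex(0.5))            # 00 00 00 3F
    print(int("FF", 16), hex(255))      # plain integer <-> hex conversion
```

Searching a binary for the byte string of a known float (a speed, a gravity constant) is one quick way to use this.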
Knowledge:
  • Start with older games (not too old, DOS is NOT fun) from the early 2000s, without DRM, as their code should be simpler to understand
  • Learning assembly along the way worked out for me; I can read it well enough, but I cannot write it to save my life
  • Windows API, e.g. knowing that WinMain() takes 4 arguments makes it easier to find it and correct its argument types
  • For file formats, look on XeNTaX to see if somebody has already cracked the format for the game, or for similar games (e.g. archives used in Stunt GP are identical to the Worms Armageddon ones)
  • Custom file formats: some developers LOVE aligning everything to 4 bytes or using bitfields everywhere; be aware of popular data structures
  • Document as much as you can, even if you're not 100% sure. Now I have lots of comments in the Ghidra project, a separate Discord and a wiki, and this finally seems to be enough to document findings properly :P
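
To make the point about 4-byte alignment and bitfields a bit more concrete, here is a small Python sketch. The record layout is entirely hypothetical (it is not the Stunt GP or Worms Armageddon format): a u16 name length, the name, padding up to the next 4-byte boundary, then a u32 whose bits are split into three made-up fields.

```python
import struct

def align_up(offset: int, alignment: int = 4) -> int:
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment

def parse_entry(data: bytes, offset: int = 0):
    """Parse one hypothetical entry and return (name, fields, next_offset)."""
    (name_len,) = struct.unpack_from("<H", data, offset)
    offset += 2
    name = data[offset:offset + name_len].decode("ascii")
    # Skip the padding bytes that follow the name (assumed 4-byte alignment).
    offset = align_up(offset + name_len)
    (flags,) = struct.unpack_from("<I", data, offset)
    offset += 4
    fields = {
        "compressed": bool(flags & 0x1),  # bit 0 (assumed meaning)
        "kind": (flags >> 1) & 0x7,       # bits 1-3 (assumed meaning)
        "size": flags >> 4,               # bits 4-31 (assumed meaning)
    }
    return name, fields, offset

if __name__ == "__main__":
    sample = (struct.pack("<H", 3) + b"car" + b"\x00" * 3
              + struct.pack("<I", (1234 << 4) | (5 << 1) | 1))
    print(parse_entry(sample))
```

If a parser like this keeps landing on garbage after variable-length data, unexpected padding is a good first suspect.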
TL;DR: I'd start with Ghidra, Excel/docs and expand knowledge/know-how
Your mileage may vary, of course; that's just my story
MattKC
Site Admin
Posts: 323
Joined: Mon Aug 22, 2022 1:05 am

Post by MattKC »

My advice, while it may sound initially unhelpful, is really to just start doing it. Find something you can do with the skills you have now, and work your way up from there.

For me, and I think it's the same for most people, it's much easier to motivate myself through learning something if I have a tangible end goal that I want to achieve. For years I was scared of even writing C++, let alone dabbling in anything machine code related. But one day I was writing a program in Java and I realized it needed to run a lot faster. While it was intimidating at first, I started rewriting it in C++, and just figured out whatever I needed to do to get that done as I went. It took over a year of working on that project, but eventually I felt pretty confident about C++.

By then, I too was getting really fascinated by the idea of reverse engineering and modding something (mostly from reading slowbeef's Policenauts Let's Play). I had no idea where to start, until a friend mentioned how cumbersome LEGO Island's turn speed was on modern computers. I knew enough about programming to know that the turn speed was probably just a number somewhere, and if I found it, I could change it. So I did whatever I could to try and find that number.

A lot of it was just trial and error that went nowhere. I tried using virtual machine save states to track changes in memory, I tried just watching a debugger blow through instructions to see if I could recognize anything, and I tried blindly searching the binary for anything that might be relevant. But eventually I discovered a tool called Cheat Engine, which can search through and modify memory. From there, I could find the turn speed value, which then showed me where that value was in the executable, which instructions read from and modified it, etc.
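
The general idea behind that kind of memory search can be sketched in a few lines of Python. This is only a rough model of the "scan, change the value in-game, scan again" workflow and not how Cheat Engine itself is implemented; the function names are made up, and it assumes the value is a 4-byte-aligned little-endian float32 inside a memory snapshot you already have as bytes.

```python
import struct

def first_scan(snapshot: bytes, value: float, tol: float = 1e-6) -> set:
    """Return every 4-byte-aligned offset whose float32 matches value."""
    hits = set()
    for off in range(0, len(snapshot) - 3, 4):
        (candidate,) = struct.unpack_from("<f", snapshot, off)
        if abs(candidate - value) <= tol:
            hits.add(off)
    return hits

def next_scan(candidates: set, snapshot: bytes, value: float, tol: float = 1e-6) -> set:
    """Keep only the earlier candidates that now hold the new value."""
    return {
        off for off in candidates
        if abs(struct.unpack_from("<f", snapshot, off)[0] - value) <= tol
    }

if __name__ == "__main__":
    # Two fake snapshots: the value at offset 8 changes from 20.0 to 35.0.
    before = struct.pack("<4f", 20.0, 1.5, 20.0, 3.0)
    after = struct.pack("<4f", 20.0, 1.5, 35.0, 3.0)
    candidates = first_scan(before, 20.0)
    print(sorted(candidates))                          # [0, 8]
    print(sorted(next_scan(candidates, after, 35.0)))  # [8]
```

Each rescan throws away addresses that merely happened to hold the same number, until only the real one is left.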

After figuring out how to do all of that, I then had the knowledge to tackle bigger challenges (including eventually a more sophisticated turn speed fix that completely unhooked it from the frame rate). But I could only get there by tackling smaller, more approachable tasks first.
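
As a generic illustration of what unhooking a turn speed from the frame rate means (this is not the actual LEGO Island fix, and all names and numbers here are invented), compare adding a fixed step every frame with scaling a per-second rate by the time each frame took:

```python
def turn_per_frame(angle: float, step: float) -> float:
    """Frame-rate-dependent: the same step is added on every frame."""
    return angle + step

def turn_per_second(angle: float, rate: float, dt: float) -> float:
    """Frame-rate-independent: the step is scaled by the frame time dt (seconds)."""
    return angle + rate * dt

def simulate(fps: int, seconds: float = 1.0):
    """Hold 'turn' for the given time and return (coupled, decoupled) angles in degrees."""
    dt = 1.0 / fps
    coupled = decoupled = 0.0
    for _ in range(int(fps * seconds)):
        coupled = turn_per_frame(coupled, 2.0)            # 2 degrees per frame
        decoupled = turn_per_second(decoupled, 60.0, dt)  # 60 degrees per second
    return coupled, decoupled

if __name__ == "__main__":
    for fps in (30, 120):
        print(fps, simulate(fps))
```

At 30 fps both versions turn about 60 degrees in a second, but at 120 fps the per-frame version turns four times as far, which is the sort of behavior that makes an old game feel wrong on a fast machine.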

Self-learning is definitely an acquired skill. I think the trick is finding things that you might be 80-90% sure of how to do. That way you'll at least have somewhere to start, from which you can push through and figure out the remaining 10-20%. That's almost always what I'm doing in my videos. While it may seem from the presentation that I had it all figured out from the beginning, I'm usually learning something new with every project. That's what makes me so excited to do them.

I'd recommend at least familiarizing yourself with C/C++ simply because 99% of compiled machine code you run into will be sourced from it. While you can certainly read/write assembly without knowing C/C++, knowing it and all of its "standard practices" will help you figure out what the programmers were trying to do that resulted in the machine code you see. From there, like I said, find something you think you might be able to do, and figure out how to do it. Also Ghidra is an incredibly powerful (and completely free!) reverse engineering tool. Your US tax dollars at work!

Best of luck, hope that helps!
a_regular_pear
Posts: 15
Joined: Sat Dec 31, 2022 6:00 pm

Post by a_regular_pear »

That's really interesting advice. Self-learning might be quite hard considering you may not know where to start, but research and good old trial and error might do the trick. I do have some projects in mind, so I might as well try them.