Want to learn assembly through C.

Posted: Thu Jul 16, 2015 5:11 pm UTC
by Denuis
I want to learn assembly language through C: writing my own code and then decompiling programs written in C should make it much easier to understand how assembly works and executes in memory. I know that C is not capable of the same things as C++, and that programming in C can be difficult if you are trying to build a larger piece of software. From what I have heard, C is made for low-level coding. However, here is my problem: there are a bunch of decompilers for C but almost none for C++. The main reason I want to learn one of these languages is to get into assembly coding.

Are there any decompilers for C++ on Windows that also support file formats such as ELF?

Thanks in advance.

Re: Want to learn assembly through C.

Posted: Fri Jul 17, 2015 6:23 pm UTC
by EvanED
I have no idea what you're asking.

You say you want to use C to learn assembly, but then you ask for a decompiler for C++. You're also asking for a decompiler, but do you really want a disassembler? What are you trying to do?

Re: Want to learn assembly through C.

Posted: Sat Jul 18, 2015 3:43 am UTC
by commodorejohn
Studying compiler output isn't a good way to learn assembly language, because compilers don't always generate terribly sensible or efficient assembly language. You'll do much better to just find a good tutorial and work at it.

Re: Want to learn assembly through C.

Posted: Mon Jul 20, 2015 3:42 am UTC
by Derek
commodorejohn wrote: Studying compiler output isn't a good way to learn assembly language, because compilers don't always generate terribly sensible or efficient assembly language. You'll do much better to just find a good tutorial and work at it.

If you compile C code with no optimizations, the resulting assembly is usually fairly readable.
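
To make this concrete, here is a minimal sketch of the kind of function whose unoptimized output tends to follow the source almost statement by statement. The file name sum_to.c and the function sum_to are made up for illustration and are not from this thread. With GCC, `gcc -S -O0 sum_to.c` writes the generated assembly to sum_to.s instead of producing an object file.

```c
/* sum_to.c: hypothetical example file, not taken from the thread. */

/* Adds the integers 1..n with a plain loop. Without optimization the
   compiler usually keeps `total` and `i` in stack slots, so each source
   statement shows up as its own short run of loads, an add, and a store. */
int sum_to(int n)
{
    int total = 0;
    for (int i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}
```

Comparing that output with what `-O2` produces for the same file is a quick way to see how far optimized code can drift from the source.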

Re: Want to learn assembly through C.

Posted: Mon Jul 20, 2015 2:49 pm UTC
by commodorejohn
Fairly, yes, but it's still not a particularly great way to learn assembly language. A supplemental learning aid, maybe, but not a main course of study.

Re: Want to learn assembly through C.

Posted: Sat Jul 25, 2015 11:02 pm UTC
by korona
commodorejohn wrote: Studying compiler output isn't a good way to learn assembly language, because compilers don't always generate terribly sensible or efficient assembly language. You'll do much better to just find a good tutorial and work at it.

Compilers are actually better at converting large programs to assembler code than humans. Humans might be able to come up with better snippets for very short functions or parts of functions (e.g. spin-wait loops, memcpy() functions or SSE computations) but compilers usually produce more efficient code when compiling large functions as they perform better register allocation and instruction scheduling and of course do better intermediate code level optimizations.

Re: Want to learn assembly through C.

Posted: Tue Sep 15, 2015 3:27 am UTC
by Wildcard
korona wrote:Compilers are actually better at converting large programs to assembler code than humans.

Except for Mel.

<.<

>.>

^.^

Re: Want to learn assembly through C.

Posted: Tue Sep 15, 2015 5:02 pm UTC
by commodorejohn
He's still out there, somewhere...

Re: Want to learn assembly through C.

Posted: Tue Sep 15, 2015 8:05 pm UTC
by Xanthir
That story is older than me, too, so whatever skills Mel had over compilers of the time probably aren't relevant after 30+ years of compiler development. ^_^

Re: Want to learn assembly through C.

Posted: Tue Sep 15, 2015 8:08 pm UTC
by commodorejohn
Xanthir wrote:That story is older than me, too, so whatever skills Mel had over compilers of the time probably aren't relevant after 30+ years of compiler development. ^_^

Looking at the size and performance characteristics of modern software compared to essentially functional equivalents from even ten years ago, I have a very, very hard time believing that modern compilers are really all that great. Or if they are, modern software engineering must be well more than bad enough to compensate.

Re: Want to learn assembly through C.

Posted: Fri Sep 18, 2015 10:07 pm UTC
by korona
Compilers are much better at exploiting the features of a modern CPU than humans are.

Let's look at an extreme case first: ISAs like IA64 (not to be confused with x64, the 64-bit extension to the x86 ISA) all but require compilers. In IA64 machine code, instructions are grouped into fixed-size blocks, and each block not only specifies the opcodes and operands of each contained instruction but also encodes predication information (i.e. specifies when each instruction should be discarded instead of being executed) and specifies which parts of the CPU are occupied by the instruction. Unless you explicitly specify that two instructions conflict (which hurts performance), the processor executes them in parallel (i.e. up to 6 consecutive instructions are executed in parallel). This is further complicated by the fact that instructions cannot be ordered arbitrarily. Some sequences of instructions are just illegal even if each instruction itself is perfectly legal. Optimizing even the most trivial code by hand is almost impossible on this architecture.

Now IA64 was not really a success story, but even modern implementations of x86 contain features that are hard to take into account manually, for example:
  • Instruction selection: There is no single best instruction for an operation. It depends on surrounding instructions and varies greatly between different implementations. A memcpy() function can perform very well on Sandy Bridge and suboptimally on Haswell.
  • Pipelining: Executing a "bad" sequence of instructions can stall or flush the pipeline. Different instructions have different latencies that must be taken into account. Latencies vary between different CPU implementations. On Atom an x87 float instruction takes two cycles while the processor can execute two equivalent SSE vector instructions in a single cycle.
  • Branch prediction: Mispredicted branches lead to bad performance.
  • Microcode: Instructions take different amounts of space in micro-op queues and caches.
  • Out-of-order execution: Instructions are not executed by the processor in the order they appear in. Independent micro-ops can be executed in parallel. Not exploiting this leads to suboptimal performance.
  • Exploiting different ALU units: For example a floating-point addition on Haswell is slower than a float multiplication! In fact a float addition has the same throughput as a float multiplication followed by a float addition (i.e. the multiplication is free if followed by an addition).
  • Register renaming: ISA registers do not physically correspond to CPU registers. Modern CPUs have more physical registers than ISA registers. Not exploiting this leads to suboptimal throughput.
  • Macro-op caches: Fewer macro-ops lead to better performance.
  • L1 caches: Different processors have different cache sizes, which affects loop performance.
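
The multiply-followed-by-add point in the list above can be illustrated from the C side. This is only a sketch under stated assumptions: the helper name dot3 is made up, and whether the compiler actually merges each multiply and add into a single fused multiply-add instruction depends on the target CPU and the flags in use (with GCC on x86 this typically requires enabling FMA, for example via -mfma or a suitable -march setting).

```c
/* Hypothetical helper, not from the thread: dot product of two
   3-element vectors, written so that every step is a multiplication
   whose result feeds directly into an addition. */
double dot3(const double a[3], const double b[3])
{
    double acc = 0.0;
    for (int i = 0; i < 3; i++) {
        /* A multiply feeding an add: the pattern a compiler may turn
           into one fused instruction on hardware that supports it. */
        acc = a[i] * b[i] + acc;
    }
    return acc;
}
```

Writing the same loop by hand in assembly means deciding per CPU model whether the fused form pays off, which is the kind of bookkeeping the list describes.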

Re: Want to learn assembly through C.

Posted: Fri Sep 18, 2015 11:39 pm UTC
by commodorejohn
You have an interesting definition of "almost impossible." It seems to mean "a lot of work, but actually completely possible."

Anyway, as you yourself admit, Itanium is a major outlier, being designed from the get-go as a deliberately compiler-optimized architecture (hence its infamously subpar performance early on, before the compilers were good enough to make decent use of it.) And the things you list for x86 all boil down to "different processors have different optimization characteristics and that's confusing" and "x86 has features now that veteran x86 programmers may not know to take advantage of," neither of which are actually limitations of hand-optimization, and both of which are pretty well applicable to compiler-generated code as well (at least assuming you ever distribute pre-baked binaries for use on processors for which they may not have been specifically optimized, a.k.a. What Every Software Distributor Ever Does All The Time, unless you're running something like Gentoo where everything gets compiled fresh and thus could be specifically optimized for the exact processor it's going to run on.) The only things that are even close to being prohibitive are the bits with micro-op sequencing and out-of-order execution, which are only really comparable to the Itanium thing at worst.

Re: Want to learn assembly through C.

Posted: Sat Sep 19, 2015 12:25 am UTC
by korona
commodorejohn wrote:You have an interesting definition of "almost impossible." It seems to mean "a lot of work, but actually completely possible."

Anyway, as you yourself admit, Itanium is a major outlier, being designed from the get-go as a deliberately compiler-optimized architecture (hence its infamously subpar performance early on, before the compilers were good enough to make decent use of it.) And the things you list for x86 all boil down to "different processors have different optimization characteristics and that's confusing" and "x86 has features now that veteran x86 programmers may not know to take advantage of," neither of which are actually limitations of hand-optimization, and both of which are pretty well applicable to compiler-generated code as well (at least assuming you ever distribute pre-baked binaries for use on processors for which they may not have been specifically optimized, a.k.a. What Every Software Distributor Ever Does All The Time, unless you're running something like Gentoo where everything gets compiled fresh and thus could be specifically optimized for the exact processor it's going to run on.) The only things that are even close to being prohibitive are the bits with micro-op sequencing and out-of-order execution, which are only really comparable to the Itanium thing at worst.

Well of course it's not literally impossible. After all anything the compiler can do can also be done by hand. It boils down to the amount of patience you have and how much time you're willing to invest. Handcrafted assembly has its uses and I still have to write assembly snippets quite often. But those are only snippets to optimize a certain function. I would never write a whole program in assembly. Humans are much better at writing efficient memcpy() implementations or synchronization primitives than compilers. But I'd rather spend my time using std::map<std::string, int> to do something productive than re-implement a red-black tree with string handling in assembly. The C++ implementation might be 5% slower than an assembly version that was hand-optimized for my specific CPU model but those few nanoseconds do not justify rewriting such a complex data structure for each new micro architecture that Intel releases.

Of course you cannot expect all binaries to be perfectly optimized for your CPU (unless you're running Gentoo and you're willing to invest multiple days compiling code in order to get unnoticeable performance improvements) but there are libc implementations that use different memcpy() functions depending on the CPU model. We can also tell the compiler to optimize for generic "modern" CPUs, which is still much better than casually written assembler code. It will still be slower than assembler code that was hand-optimized for your specific CPU by someone who knows the micro architecture really well, but by definition you cannot be faster than that.
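
The idea of picking a memcpy() variant per CPU model can be sketched as a function pointer chosen once at startup. Everything named here (copy_fn, copy_bytewise, copy_tuned, cpu_has_fast_path, select_copy) is hypothetical and is not how any particular libc does it; in particular the CPU probe is a stub that always answers "no", where a real library would query the processor.

```c
#include <stddef.h>
#include <string.h>

/* Signature shared by all copy variants (same shape as memcpy). */
typedef void *(*copy_fn)(void *dst, const void *src, size_t n);

/* Generic fallback: a plain byte-by-byte copy that works everywhere. */
static void *copy_bytewise(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--) {
        *d++ = *s++;
    }
    return dst;
}

/* Stand-in for a variant tuned for one CPU family. Here it simply
   defers to the library memcpy(). */
static void *copy_tuned(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

/* Hypothetical probe: a real implementation would inspect the CPU
   model or feature bits. This stub always reports "not available". */
static int cpu_has_fast_path(void)
{
    return 0;
}

/* Chooses the variant once; callers keep the returned pointer. */
copy_fn select_copy(void)
{
    return cpu_has_fast_path() ? copy_tuned : copy_bytewise;
}
```

A caller would run `copy_fn copy = select_copy();` once and then use `copy(dst, src, n)` wherever it would otherwise call memcpy() directly.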

Re: Want to learn assembly through C.

Posted: Sat Sep 19, 2015 2:54 am UTC
by commodorejohn
So your argument is that compilers are much better at doing something than people who do it badly? Quelle surprise.

Re: Want to learn assembly through C.

Posted: Sat Sep 19, 2015 4:14 am UTC
by Wildcard
korona wrote:
commodorejohn wrote: Studying compiler output isn't a good way to learn assembly language, because compilers don't always generate terribly sensible or efficient assembly language. You'll do much better to just find a good tutorial and work at it.

Compilers are actually better at converting large programs to assembler code than humans. Humans might be able to come up with better snippets for very short functions or parts of functions (e.g. spin-wait loops, memcpy() functions or SSE computations) but compilers usually produce more efficient code when compiling large functions as they perform better register allocation and instruction scheduling and of course do better intermediate code level optimizations.

I agree with both of you, and suspect you are actually in agreement if you would stop arguing. ;)

The way I see it there are two factors. Actually, three:

1. The best way to learn assembly language.
2. The most practical way to create large functioning programs.
2b. (...that can be maintained.)
3. The most efficient accomplishment of a specific computing task on a specific computer architecture.

The answers to these, I hope we can all agree, are:

1. Learn assembly language as itself; don't study compiler output as your primary or only approach to learning assembly.
2. Use higher-level languages (higher than assembly, anyway) for large and intricate projects.
3. No compiler will ever beat (efficiency-wise) a superb human computer programmer who knows the architecture inside out and upside down...but assembly language will never beat higher-level languages for maintainability and ease of reading.

Re: Want to learn assembly through C.

Posted: Sat Sep 19, 2015 4:34 am UTC
by commodorejohn
Oh, I certainly won't disagree with the idea that it's generally more practical to employ HLLs for most purposes, or that they're generally quite passable performance-wise, these days (at least provided you actually do thoughtful, non-stupid software engineering, which isn't really a given, but that's another discussion, and the point is just as applicable to assembler programming anyway.) It's the blanket statements like "compilers are better at generating code than humans, full stop" that get my hackles up.

Re: Want to learn assembly through C.

Posted: Sat Sep 19, 2015 11:42 am UTC
by korona
commodorejohn wrote:So your argument is that compilers are much better at doing something than people who do it badly? Quelle surprise.

My point is that compilers generate quite good assembly code (from sane C source code) and you're not going to beat that unless you're willing to invest quite a bit of time to learn the peculiarities of your CPU implementation. Just knowing the ISA and writing assembly the same way you're writing C code ("write basic functionality first, optimize bottlenecks later") is usually going to lead to worse code than GCC's -O3.

commodorejohn wrote:Oh, I certainly won't disagree with the idea that it's generally more practical to employ HLLs for most purposes, or that they're generally quite passable performance-wise, these days (at least provided you actually do thoughtful, non-stupid software engineering, which isn't really a given, but that's another discussion, and the point is just as applicable to assembler programming anyway.) It's the blanket statements like "compilers are better at generating code than humans, full stop" that get my hackles up.

I agree with that, I never claimed that "compilers are better at generating code than humans, full stop".

Re: Want to learn assembly through C.

Posted: Sat Sep 19, 2015 6:02 pm UTC
by commodorejohn
korona wrote:My point is that compilers generate quite good assembly code (from sane C source code) and you're not going to beat that unless you're willing to invest quite a bit of time to learn the peculiarities of your CPU implementation.

Which is an entirely fair point, but not what you actually said.

korona wrote: I agree with that, I never claimed that "compilers are better at generating code than humans, full stop".

korona wrote:Compilers are much better at exploiting the features of a modern CPU than humans are.