x86 assembly syntax

Which do you prefer?

Intel syntax: 56 votes (84%)
AT&T syntax: 11 votes (16%)

Total votes: 67

Postby EvanED » Mon Mar 10, 2008 10:47 pm UTC

I don't think this has come up. AT&T syntax, or Intel syntax?

Some differences:

- Source order. To store 5 into eax, Intel syntax would use
Code: Select all
mov eax, 5
while AT&T would be
Code: Select all
movl $5, %eax


- Decorations. Immediates are written with a $ in AT&T, and registers with a %. Intel doesn't use these decorations. (See above for examples.)

- Operand lengths. AT&T combines a suffix with the instruction mnemonic, while Intel uses qualifiers on the operand. To load a 1-byte value at the address in eax into ebx (zero-extended to 32 bits), Intel syntax would use
Code: Select all
movzx ebx, byte ptr [eax]
while AT&T would be
Code: Select all
movzbl (%eax), %ebx


- Memory offsets are written differently. A simple offset (e.g. a structure or array access) would, in Intel syntax, be
Code: Select all
mov ebx, [eax + 4]
and in AT&T
Code: Select all
movl 4(%eax), %ebx
The most complicated form, involving a segment register (here fs), base register (eax), index register (ebx), scale (8), and constant offset (4), e.g. loading a member of a structure in an array relative to a segment register, is, in Intel syntax:
Code: Select all
mov ecx, fs:[eax + ebx * 8 + 4]
and, in AT&T,
Code: Select all
mov %fs:4(%eax, %ebx, 8), %ecx
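Both spellings of the long addressing form name the same address: base plus index times scale plus offset. A minimal C sketch of that arithmetic follows. The helper name effective_address is hypothetical, and the segment base (fs here) is left out on purpose.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: the 32-bit address named by Intel
   [base + index*scale + disp] or AT&T disp(base, index, scale).
   The segment base is ignored in this sketch. */
static bool effective_address(uint32_t base, uint32_t index, uint32_t scale,
                              int32_t disp, uint32_t *out)
{
    /* x86 can only encode scale factors of 1, 2, 4 or 8. */
    if (scale != 1 && scale != 2 && scale != 4 && scale != 8)
        return false;
    /* Stored back into 32 bits, so the sum wraps modulo 2^32. */
    *out = base + index * scale + (uint32_t)disp;
    return true;
}
```

With base 0x1000, index 3, scale 8 and offset 4, the sketch gives 0x1000 + 24 + 4 = 0x101C, whichever syntax the operands were written in.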


Note: don't trust me on any of this. I don't know if I've actually written x86 in either syntax, though I have done a lot of reading it in Intel syntax. I prefer it a lot; I think it's much cleaner and makes more sense.
EvanED
 
Posts: 4141
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI

Re: x86 assembly syntax

Postby Rysto » Tue Mar 11, 2008 3:02 am UTC

Intel's syntax is far, far superior to AT&T's syntax. I've worked with both to a certain degree, and I don't think that you made any mistakes. The advantages to Intel's syntax:

Programmers write eax = 5; not 5 = eax. That's why mov eax, 5 makes so much more sense than mov $5, %eax.

Intel syntax only requires operand length qualifiers when the operation is ambiguous. If I say mov al, 5, then obviously I want a byte operation, because al is an 8-bit register. Requiring me to tell the assembler what the width is when it's unambiguous is annoying as hell.

Intel's syntax really shines for memory accesses. When you see [eax+4*ebx+20], it's immediately obvious what's the base register, what's the index register and what's the scale. 20(%eax, %ebx, 4) is completely unintuitive. This is the biggest reason for using Intel syntax.

It's a little-known fact that you can get gas to accept assembly code written in Intel syntax. Use the .intel_syntax noprefix directive. Gas will accept nearly any valid instruction in Intel syntax. It just misses some obscure things that nobody ever does anymore. It can't handle long jumps, for example. I had to write an operating system kernel for a course last semester (it goes without saying that this was a very simple kernel). I ended up writing 1573 lines of assembly code for the whole project, and there was only one line in the whole thing in which I was forced to resort to AT&T syntax despite using gas.
Last edited by Rysto on Tue Mar 11, 2008 4:25 am UTC, edited 1 time in total.
Rysto
 
Posts: 1443
Joined: Wed Mar 21, 2007 4:07 am UTC

Re: x86 assembly syntax

Postby EvanED » Tue Mar 11, 2008 4:20 am UTC

Rysto wrote:Intel's syntax really shines for memory accesses. When you see [eax+4*ebx+20], it's immediately obvious what's the base register, what's the index register and what's the scale. 20(%eax, %ebx, 4) is completely unintuitive. This is the biggest reason for using Intel syntax.

I would like to suggest another benefit of Intel syntax: a totally reasonable instruction won't get turned into an emoticon by forum software ;-)

(Also, I think all the % and $ occurrences are both ugly and unnecessary.)

It's a little-known fact that you can get gas to accept assembly code written in Intel syntax. Use the .intel_syntax noprefix directive.

I noticed that when I was looking for examples for that post. Looking into it further, it seems you can even use this from within an asm statement in GCC:
Code: Select all
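/* Switch gas to Intel syntax, then switch back for the compiler's own output. */
asm(".intel_syntax noprefix\n\t"
    "mov eax, ecx\n\t"
    ".att_syntax prefix"
    : : : "eax");  /* tell GCC that eax is overwritten */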

Sorta ugly, but if the code is non-trivial, better than dealing with AT&T dumps. ('course, if the code is nontrivial, maybe it's worth it to put it into a separate asm-only file.)
EvanED
 
Posts: 4141
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI

Re: x86 assembly syntax

Postby Rysto » Tue Mar 11, 2008 4:33 am UTC

EvanED wrote:I would like to suggest another benefit of Intel syntax: a totally reasonable instruction won't get turned into an emoticon by forum software ;-)

I have *no* idea what you're talking about. :mrgreen:


I did forget to mention the main failing of Intel's syntax: hexadecimal constants. You write a hexadecimal number and append an h to it: FFFFh is an example. The big problem with this is that it's non-trivial to differentiate between hexadecimal constants and labels. It's no problem to write a lexer to do so, but if you're doing ad-hoc transformations to code with regular expressions and "find and replace", you can really mess things up. For even more fun, you always have to remember that ah, bh, ch and dh are registers, not hexadecimal constants. And yes, I did learn all of this the hard way.
Rysto
 
Posts: 1443
Joined: Wed Mar 21, 2007 4:07 am UTC

Re: x86 assembly syntax

Postby qbg » Tue Mar 11, 2008 1:18 pm UTC

I'll go with Intel syntax because that is the syntax that SBCL's disassembler uses.
qbg
 
Posts: 586
Joined: Tue Dec 18, 2007 3:37 pm UTC

Re: x86 assembly syntax

Postby 0xDEADBEEF » Tue Mar 11, 2008 2:51 pm UTC

If you're geeky enough, gcc -S produces assembler code, in AT&T syntax. Once in a blue moon it's worth studying the output, to see just how your code is being optimized, or for some real hard-core bug chasing.

If you're at that level, then it becomes worthwhile to understand AT&T syntax well enough to read the output.


But if you're writing in assembler, Intel syntax and NASM are the way to go!
0xDEADBEEF
 
Posts: 284
Joined: Wed Jan 30, 2008 6:05 am UTC
Location: Austin, TX

Re: x86 assembly syntax

Postby Sc4Freak » Thu Mar 13, 2008 9:11 am UTC

Whatever it is that VS uses. Intel syntax, probably.
Sc4Freak
 
Posts: 673
Joined: Thu Jul 12, 2007 4:50 am UTC
Location: Redmond, Washington

Re: x86 assembly syntax

Postby Tei » Thu Mar 13, 2008 11:55 pm UTC

Sorry, I think I am already "tainted" by what you call Intel syntax, by the use of 6502 CPUs
Tei
 
Posts: 64
Joined: Fri Nov 30, 2007 2:58 pm UTC

Re: x86 assembly syntax

Postby You, sir, name? » Sat Mar 22, 2008 1:09 pm UTC

Tei wrote:Sorry, I think I am already "tainted" by what you call Intel syntax, by the use of 6502 CPUs


That is a quite different (but related) syntax.

In Intel syntax, if you want to store 5 in the accumulator, you write

Code: Select all
mov eax, 5


In 6502, you write

Code: Select all
LDA #5


You have specific instructions for each register, and a pound before immediate data.
I now occasionally update my rarely-updated blog.

I edit my posts a lot and sometimes the words wrong order words appear in sentences get messed up.
You, sir, name?
 
Posts: 6567
Joined: Sun Apr 22, 2007 10:07 am UTC
Location: Chako Paul City

Re: x86 assembly syntax

Postby Robin S » Sat Mar 22, 2008 3:03 pm UTC

I think the results of the poll are pretty definitive.
This is a placeholder until I think of something more creative to put here.
Robin S
 
Posts: 3579
Joined: Wed Jun 27, 2007 7:02 pm UTC
Location: London, UK

Re: x86 assembly syntax

Postby EvanED » Sat Mar 22, 2008 6:41 pm UTC

Yeah, that's what I'm thinking. With the slant towards GNU and Linux and stuff among CS techies, I'm surprised by it, but pleased, because it means that people have taste in this matter. ;-)
EvanED
 
Posts: 4141
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI

Re: x86 assembly syntax

Postby Dakman » Sat Mar 22, 2008 8:03 pm UTC

You guys suck, I can't believe I am the only one who prefers AT&T assembly... It's probably because I'm such a GNU whore.
Dakman
 
Posts: 50
Joined: Sat Jul 07, 2007 7:49 am UTC

Re: x86 assembly syntax

Postby zenten » Sat Mar 22, 2008 8:50 pm UTC

For me it's probably because Intel is what they taught me in school.
zenten
 
Posts: 3798
Joined: Fri Jun 22, 2007 7:42 am UTC
Location: Ottawa, Canada

Re: x86 assembly syntax

Postby EvanED » Sat Mar 22, 2008 8:57 pm UTC

I like Intel syntax for three reasons:

1. All the extra %s and $s in AT&T are unnecessary and ugly (code aesthetics are perhaps too important to me, but whatever)
2. Most of the time your operations are going to be on dwords. This means that you have 'l' at the end of all your operation names. Again, unnecessary, ugly, and something to forget.
3. You can't possibly tell me that "4(%eax)" makes more sense than "[eax + 4]"
EvanED
 
Posts: 4141
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI

Re: x86 assembly syntax

Postby Goplat » Wed Mar 26, 2008 3:45 pm UTC

Rysto wrote:I did forget to mention the main failing of Intel's syntax: hexadecimal constants. You write a hexadecimal number and append an h to it: FFFFh is an example. The big problem with this is that it's non-trivial to differentiate between hexadecimal constants and labels.
What assembler were you using? Every assembler I've used has the rule that if it starts with a digit, it's a number; if it starts with a letter, it's a label. So hex constants can't be written as FFFFh but must be 0FFFFh; no ambiguity.
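The leading-digit rule is easy to state in code. A minimal sketch in C, assuming a token has already been split out of the line; the names classify and token_kind are made up for illustration and are not taken from any real assembler:

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

enum token_kind { TOKEN_HEX, TOKEN_REGISTER, TOKEN_LABEL };

/* Leading decimal digit, hex digits in between, trailing 'h': e.g. 0FFFFh. */
static bool is_hex_constant(const char *s)
{
    size_t n = strlen(s);
    size_t i;

    if (n < 2 || !isdigit((unsigned char)s[0]) ||
        tolower((unsigned char)s[n - 1]) != 'h')
        return false;
    for (i = 1; i + 1 < n; i++)
        if (!isxdigit((unsigned char)s[i]))
            return false;
    return true;
}

/* The 8-bit registers that would otherwise read as one-digit hex constants. */
static bool is_high_byte_register(const char *s)
{
    return strcmp(s, "ah") == 0 || strcmp(s, "bh") == 0 ||
           strcmp(s, "ch") == 0 || strcmp(s, "dh") == 0;
}

/* Anything that is neither is treated as a label in this sketch. */
static enum token_kind classify(const char *s)
{
    if (is_hex_constant(s))
        return TOKEN_HEX;
    if (is_high_byte_register(s))
        return TOKEN_REGISTER;
    return TOKEN_LABEL;
}
```

Under this rule FFFFh is a label, 0FFFFh is a number, ah is a register and 0ah is the number ten, so the first character alone settles the question.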
Goplat
 
Posts: 490
Joined: Sun Mar 04, 2007 11:41 pm UTC

Re: x86 assembly syntax

Postby Dakman » Thu Mar 27, 2008 2:31 pm UTC

All of the extra % and $ signs give the code a more organized look, especially when dealing with labels, registers, hex, and decimal. And as for looks, I have never seen a pretty chunk of Intel assembly, but have seen plenty of pretty AT&T code, so it's really a matter of opinion I guess.

Oh... and address offsetting, no comment.
Dakman
 
Posts: 50
Joined: Sat Jul 07, 2007 7:49 am UTC

Re: x86 assembly syntax

Postby The Finn » Thu Mar 27, 2008 4:01 pm UTC

As the various opinions above reflect, my experience is that Intel syntax is the best for writing ASM code (I have never bothered trying to write AT&T), yet knowing AT&T is excellent if you need to chase down bugs (twice in my earlier career, the bug became blindingly obvious once an AT&T dump smacked me across the face).
The Finn
 
Posts: 23
Joined: Thu Mar 06, 2008 4:47 pm UTC

Re: x86 assembly syntax

Postby aldimond » Fri Mar 28, 2008 5:44 am UTC

3. You can't possibly tell me that "4(%eax)" makes more sense than "[eax + 4]"


I learned on AT&T syntax and never had a problem with the addressing modes. If your C-level thinking is *(a+4) then [eax + 4] makes sense. But that's very rarely how you actually write programs. Usually your C-level thinking is more like a->second. If you're writing well-documented assembly code you'll have the data structure pointed to by a (or %eax on the asm side) documented with #defines (do common assemblers that use Intel syntax let you run through cpp? Hopefully they have some reasonable way to do this. Magic numbers suck, especially in assembler).

So if you're doing it right, in Intel syntax you have [eax + SECOND], and in AT&T syntax, SECOND(%eax). Why have syntax that discusses adding when you almost never think about it as adding? Similarly for the longer forms. In [eax + ebx * PAIR_SIZE + SECOND] you're expressing *(a+i*sizeof(pair_t) + SECOND) and in SECOND(%eax, %ebx, PAIR_SIZE) you're thinking more along the lines of a[i].second (except that you have to include an argument for the size of your structure because assembler doesn't keep track of pointer types like C). When you learn the addressing mode syntax you learn each part by its function: member-offset(base-register, index-register, structure-size).
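For what it's worth, the two readings can be checked against each other in C. This is only a sketch: pair_t here is a hypothetical two-int structure, and SECOND and PAIR_SIZE are derived with offsetof and sizeof rather than written by hand.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure standing in for the pair_t of the post. */
typedef struct {
    int32_t first;
    int32_t second;
} pair_t;

/* Named constants an assembly file could share, instead of magic numbers. */
enum {
    SECOND    = offsetof(pair_t, second),
    PAIR_SIZE = sizeof(pair_t)
};

/* Address of a[i].second, computed the way SECOND(%eax, %ebx, PAIR_SIZE)
   or [eax + ebx * PAIR_SIZE + SECOND] would compute it. */
static const int32_t *second_by_offset(const pair_t *a, size_t i)
{
    const unsigned char *base = (const unsigned char *)a;
    return (const int32_t *)(base + i * PAIR_SIZE + SECOND);
}
```

The byte arithmetic lands on the same address as &a[i].second, which is the equivalence both syntaxes are spelling out. With two 32-bit members PAIR_SIZE is 8, so it happens to fit the hardware scale field.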

Even though the Intel syntax is just as structured, the AT&T syntax looks more structured.

Furthermore: Specifying the length of an operation "only when it's ambiguous" is troublesome; it asks the programmer to think about what's ambiguous to the assembler instead of thinking about the program. And writing movl $5, %eax isn't too awful, it reads like a sentence. I don't think it's any better or worse than flipping their order, really; there's no punctuation of assignment, so it's just a matter of how you get used to reading the operands.
One of these days my desk is going to collapse in the middle and all its weight will come down on my knee and tear my new fake ACL. It could be tomorrow. This is my concern.
aldimond
Otter-duck
 
Posts: 2665
Joined: Fri Nov 03, 2006 8:52 am UTC
Location: Uptown, Chicago

Re: x86 assembly syntax

Postby niteice » Sat Mar 29, 2008 4:11 am UTC

The way I see it, Intel's is very direct - you're still looking up the value pointed to by the value of eax + 4 bytes; Intel is just much more terse about the same idea. It's like comparing C to Pascal - C (Intel) practically helps you shoot yourself in the foot, Pascal (AT&T) tries to avoid it at the cost of readability.

aldimond wrote:do common assemblers that use intel syntax let you run through cpp?

nasm implements its own preprocessor, yasm has a nasm-like preprocessor, its own, and a wrapper around the system cpp, and I think masm will use their own cpp.
GENERATION 4294967292: The first time you see this, copy it into your sig on any forum, negate the generation, and convert it to a 32-bit unsigned integer. Social experiment.
niteice
 
Posts: 186
Joined: Wed May 02, 2007 4:17 am UTC

Re: x86 assembly syntax

Postby aldimond » Sat Mar 29, 2008 5:24 am UTC

I'm not sure how one lets you shoot yourself in the foot more than the other. It's the same instruction set. And Intel syntax isn't any more terse. strlen("4(%eax)")==strlen("[eax+4]")==7. Of course, if you're writing assembler code, I don't think terseness is something you're worried about. Your lines will be numerous and short no matter what.

Readability is incredibly subjective. I'm sure if I cut my teeth on Pascal I'd think it was more readable than C. And I think you're missing the point when you talk about directness. Intel syntax more directly expresses the internal components of the hypothetical minimal addressing unit needed to satisfy the ISA. But you're writing a program and accessing data. If you're thinking "I want data 4 bytes above what EAX points to" you're doing it wrong. If you're looking at a constant offset from where EAX is pointing then EAX is pointing at data with a specific structure and you're probably really thinking about grabbing a specific element of that structure. No place for magic numbers, no place for thinking about pointer arithmetic. SECOND(%eax), which reads like second-of-eax more directly represents the thought of anyone writing serious programs than [eax + SECOND].
aldimond
Otter-duck
 
Posts: 2665
Joined: Fri Nov 03, 2006 8:52 am UTC
Location: Uptown, Chicago

Re: x86 assembly syntax

Postby Goplat » Sat Mar 29, 2008 4:49 pm UTC

I prefer Intel syntax. I don't think there's really anything inherently bad about gas's AT&T syntax, but it goes against what every other x86 assembler recognizes. What if the authors of gcc were more familiar with Pascal and they made a "C compiler" that took code that looked like this:
Code: Select all
#include <stdio.h>

function main : integer;
var
   i : integer;
begin
   for i := 1 to 10 do
      printf('%d', i);
   main := 0;
end;
Some people would think it looks better than "K&R syntax C", but it would be a big pain for anyone who wanted to port their C code to gcc. Why is assembly any different?
Goplat
 
Posts: 490
Joined: Sun Mar 04, 2007 11:41 pm UTC

Re: x86 assembly syntax

Postby coppro » Sat Mar 29, 2008 6:04 pm UTC

aldimond wrote:strlen("4(%eax)")==strlen("[eax+4]")==7
False. strlen("4(%eax)")==strlen("[eax+4]") is 1, which most definitely does not equal 7.
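For anyone who wants to see it run, here is a small C sketch of the difference. The function names are made up for illustration.

```c
#include <stdbool.h>
#include <string.h>

/* As written in the post. C parses a == b == 7 as (a == b) == 7, so the
   inner comparison yields 0 or 1 and that result is compared with 7. */
static int chained(const char *a, const char *b)
{
    return (strlen(a) == strlen(b)) == 7;
}

/* What was meant: both strings are 7 characters long. */
static bool both_have_length_7(const char *a, const char *b)
{
    return strlen(a) == 7 && strlen(b) == 7;
}
```

Both operands really are 7 characters long, so the intended claim holds; it is only the chained spelling that evaluates to 0.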
coppro
 
Posts: 117
Joined: Mon Feb 04, 2008 6:04 am UTC

Re: x86 assembly syntax

Postby aldimond » Sat Mar 29, 2008 7:37 pm UTC

coppro wrote:
aldimond wrote:strlen("4(%eax)")==strlen("[eax+4]")==7
False. strlen("4(%eax)")==strlen("[eax+4]") is 1, which most definitely does not equal 7.


Ha, that's what I get for writing in C around here.

Goplat wrote:Some people would think it looks better than "K&R syntax C", but it would be a big pain for anyone who wanted to port their C code to gcc. Why is assembly any different?


I haven't been able to find any historical information for why AT&T saw fit to come up with their own version of the syntax. My guess would be that they thought the ambiguities in Intel's syntax were unbearable, especially when it came to writing an assembler. That is, they didn't want their assembler to have to figure out instruction length or decide whether a value was a register name or a hexadecimal constant. And the programmers here were probably old-school Unix guys, and you know how they are about complexity. I repeat, just a guess. These days they probably wouldn't have done it, but back then they did, and came up with something that's in my opinion better designed.

Ha, that gives yet another advantage to AT&T syntax: it's simpler to specify and simpler to implement a parser for it. That's actually a pretty big advantage to me. Simple, unambiguous specifications lead to programs (in this case the assembler) with simple, understandable behavior. The question, "What is ah?" never even enters the programmer's mind.
aldimond
Otter-duck
 
Posts: 2665
Joined: Fri Nov 03, 2006 8:52 am UTC
Location: Uptown, Chicago

Re: x86 assembly syntax

Postby Rysto » Sat Mar 29, 2008 8:50 pm UTC

Oh, bullcrap. It's absolutely trivial to write a lexer and parser for Intel's syntax. I've done it myself.
Rysto
 
Posts: 1443
Joined: Wed Mar 21, 2007 4:07 am UTC

Re: x86 assembly syntax

Postby aldimond » Sat Mar 29, 2008 10:15 pm UTC

I didn't mean it was unbearably hard to write, I meant it was unbearably ugly. I'm not saying it's hard to special-case out the register names. I'm saying it's needlessly complex. Needless complexity doesn't have to be very complicated, it just needs to be needless. And it propagates to the way you think about the language. A program just has to take some input and figure out what it is. It can get away with "Is it a register? No. Is it a literal? Aha!" Not even a special case in that logic. A programmer looking at code might go through that process, but might also want to think about generally what can be used as a hex literal. "Any hex number, followed by the h suffix, except when it collides with register names," is ugly.

It's a single-digit hex literal! How could they not have thought of that collision when designing the syntax? It doesn't seem very well thought-out.

I don't really like the decision to redo the whole syntax over just that, but I like their result better, and that's what this discussion is about, right? Which syntax we like better?
aldimond
Otter-duck
 
Posts: 2665
Joined: Fri Nov 03, 2006 8:52 am UTC
Location: Uptown, Chicago

Re: x86 assembly syntax

Postby Rysto » Sat Mar 29, 2008 10:59 pm UTC

intel.flex:

Code: Select all
e?[a-d][xhl]|e?[bs]p|e?[sd]i       { return reg_token(yytext); }
[a-fA-F0-9]+h                      { return hex_const(yytext); }


att.flex:
Code: Select all
%(e?[a-d][xhl]|e?[bs]p|e?[sd]i)    { return reg_token(yytext); }
0x[a-fA-F0-9]+                     { return hex_const(yytext); }


God, it really sucks having to put all those special cases in for ah and whatnot...
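To make the overlap concrete, here is a small C sketch that runs the two Intel patterns through POSIX regex.h (so it assumes a POSIX system). The patterns are copied from the rules and anchored to match a whole token; the helper names are invented for this example.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* The two Intel-syntax patterns, anchored so they must match the whole token. */
static const char REGISTER_RE[] = "^(e?[a-d][xhl]|e?[bs]p|e?[sd]i)$";
static const char HEX_RE[]      = "^[a-fA-F0-9]+h$";

/* True if the POSIX extended regex matches the token. */
static bool matches(const char *pattern, const char *token)
{
    regex_t re;
    bool ok;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;
    ok = regexec(&re, token, 0, NULL, 0) == 0;
    regfree(&re);
    return ok;
}

/* True when both rules accept the token, so only rule order decides. */
static bool is_ambiguous(const char *token)
{
    return matches(REGISTER_RE, token) && matches(HEX_RE, token);
}
```

Tokens such as ah and bh match both patterns. Flex breaks a tie between equally long matches in favour of the rule listed first, which is why the register rule sits on top and no explicit special case is needed.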
Rysto
 
Posts: 1443
Joined: Wed Mar 21, 2007 4:07 am UTC

Re: x86 assembly syntax

Postby EvanED » Sun Mar 30, 2008 2:52 am UTC

aldimond wrote:I didn't mean it was unbearably hard to write, I meant it was unbearably ugly. I'm not saying it's hard to special-case out the register names. I'm saying it's needlessly complex. Needless complexity doesn't have to be very complicated, it just needs to be needless.

I wouldn't say that code that obviates the need for all those %s is needless. I would argue that the presence of the %s adds needless ugliness that the people dealing with AT&T syntax have to put up with. ;-)

Imagine if every integer literal in C needed to have $ prepended to it.

It's a single-digit hex literal! How could they not have thought of that collision when designing the syntax? It doesn't seem very well thought-out.

I will admit something like this is a bit ugly. It would have been better to use 0x.
EvanED
 
Posts: 4141
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI

Re: x86 assembly syntax

Postby aldimond » Sun Mar 30, 2008 7:09 pm UTC

aldimond wrote:I didn't mean it was unbearably hard to write, I meant it was unbearably ugly.

aldimond wrote:It can get away with "Is it a register? No. Is it a literal? Aha!" Not even a special case in that logic. A programmer looking at code might go through that process, but might also want to think about generally what can be used as a hex literal. "Any hex number, followed by the h suffix, except when it collides with register names," is ugly.


Rysto wrote:Code samples that only illustrate exactly what I'm talking about


Most useful code is read more times than it's written (in fact, I'd say that's central to its being useful). When reading the Intel version, the line about hex constants is only understood correctly in the context of the first line being there. If you didn't think carefully you might reasonably think ah was a hex constant. This is an incredibly small problem when you look at these two lines of code, right next to each other, in isolation. But those lines probably aren't really right next to each other in the full program, and matching hex literals might be combined with matching other types of literals.

From the perspective of writing code... there is more to a language than programs written in it and programs that interpret it; in general there is more to a format than people that write files in it and programs that interpret it. Other programs may be written to generate or modify the input data. In the case where code is the input data, those programs are likely regexes written ad-hoc in a programmer's editor. Those often will need to have special cases in them. Matching the wrong thing, changing a value to the wrong thing, etc., in a regex can really mutilate a program.

And despite this it's an incredibly small problem. We as programmers in general write much worse stuff all the time. I do think, though, that if we care about code quality we should care about this.
aldimond
Otter-duck
 
Posts: 2665
Joined: Fri Nov 03, 2006 8:52 am UTC
Location: Uptown, Chicago

Re: x86 assembly syntax

Postby Rysto » Sun Mar 30, 2008 7:52 pm UTC

aldimond wrote:From the perspective of writing code... there is more to a language than programs written in it and programs that interpret it; in general there is more to a format than people that write files in it and programs that interpret it. Other programs may be written to generate or modify the input data. In the case where code is the input data, those programs are likely regexes written ad-hoc in a programmer's editor. Those often will need to have special cases in them. Matching the wrong thing, changing a value to the wrong thing, etc., in a regex can really mutilate a program.

I did mention this.

But let's be honest here. Virtually no assembly code is being written today, and that's not going to change. So I don't think that this argument really holds up, because how often is anybody making these kinds of changes to assembly nowadays?
Rysto
 
Posts: 1443
Joined: Wed Mar 21, 2007 4:07 am UTC

Re: x86 assembly syntax

Postby aldimond » Sun Mar 30, 2008 8:15 pm UTC

Sure, nobody's writing assembler, so why do we bother arguing at all?

(this is the Internet, that's what it's there for! :lol: )

Y'all are just lucky I didn't rant about using "bourgeois, regex-based lex" to read assembly language.
aldimond
Otter-duck
 
Posts: 2665
Joined: Fri Nov 03, 2006 8:52 am UTC
Location: Uptown, Chicago

Re: x86 assembly syntax

Postby coppro » Mon Mar 31, 2008 11:00 pm UTC

I find it funny that no one has mentioned GAS here - it's basically (so far as I can tell anyway) Intel syntax with AT&T disambiguation features, such as sigils for all the values and the "movl" convention. I find GAS the easiest to digest (AT&T is really hard to think in, and Intel can trip you up with the ambiguities sometimes), but I honestly don't have enough experience in asm to say.
coppro
 
Posts: 117
Joined: Mon Feb 04, 2008 6:04 am UTC

Re: x86 assembly syntax

Postby Rysto » Mon Mar 31, 2008 11:05 pm UTC

coppro wrote:I find it funny that no one has mentioned GAS here - it's basically (so far as I can tell anyway) Intel syntax with AT&T disambiguation features

No it isn't. GAS by default uses AT&T syntax. Using a directive you can tell it to use Intel syntax instead (except that hex constants are still of the form 0x<hex digits>).
Rysto
 
Posts: 1443
Joined: Wed Mar 21, 2007 4:07 am UTC

Re: x86 assembly syntax

Postby zenten » Tue Apr 01, 2008 3:11 am UTC

EvanED wrote:Imagine if every integer literal in C needed to have $ prepended to it.


*shudders* Perl *shudders*
zenten
 
Posts: 3798
Joined: Fri Jun 22, 2007 7:42 am UTC
Location: Ottawa, Canada

Re: x86 assembly syntax

Postby HappySmileMan » Thu May 22, 2008 9:05 pm UTC

Rysto wrote:I did forget to mention the main failing of Intel's syntax: hexadecimal constants. You write a hexadecimal number and append an h to it: FFFFh is an example. The big problem with this is that it's non-trivial to differentiate between hexadecimal constants and labels. It's no problem to write a lexer to do so, but if you're doing ad-hoc transformations to code with regular expressions and "find and replace", you can really mess things up. For even more fun, you always have to remember that ah, bh, ch and dh are registers, not hexadecimal constants. And yes, I did learn all of this the hard way.


Maybe I'm wrong; I've never actually written anything more than 10-15 lines in ASM (well, no idea how small exactly, but nothing more useful than a hello world or a program to do a billion decimal places of long division) and can't even remember that properly. But I thought Intel (or at least NASM) allowed 0x80 instead of 80h, or maybe it was 0x80h. Either way, I think that eliminates the confusion, since you can't mistake something starting with "0x" for a label or register.
HappySmileMan
 
Posts: 52
Joined: Fri Nov 09, 2007 11:46 pm UTC

Re: x86 assembly syntax

Postby timmmay » Thu Jun 09, 2011 4:27 am UTC

Hey, I love how riled up you guys get about syntax! Not being sarcastic, I really love your enthusiasm and have enjoyed reading these posts!

The first language that really taught me to understand how computers work was C, with GNU's C compiler on a GNU/Linux system.

I love both C and UNIX, and
AT&T invented both C and UNIX; so
I use AT&T syntax.

Plus, "Programming from the Ground Up" by Jonathan Bartlett is the only good book teaching assembly to beginners (while explaining how hardware interacts and not dumbing it down too much) that I've ever come across. His book uses AT&T syntax too.

I have the worst short-term memory of anyone on the planet, so I like having simple $ and % markers to remind me wtf I'm doing at any given time, and if you're not a really advanced programmer, it can be helpful having the l and b markers at the end of movs. I've actually come across an instance where, had the code said MOV instead of MOVL, I would never have fixed what I was debugging. I was supposed to be moving just one byte, not four, but with a plain MOV I never would have noticed that that was the mistake.
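A rough C analogue of that kind of bug, as a sketch only: store_byte and store_dword are hypothetical stand-ins for a movb and a movl store of the same value to the same address.

```c
#include <stdint.h>
#include <string.h>

/* Rough analogue of movb $0x41, (%eax): exactly one byte is written. */
static void store_byte(unsigned char *p, uint8_t v)
{
    p[0] = v;
}

/* Rough analogue of movl $0x41, (%eax): four bytes are written, so the
   three bytes next to the intended one are overwritten as well. */
static void store_dword(unsigned char *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}
```

Storing 0x41 with store_byte leaves the neighbouring bytes alone, while store_dword wipes them out. The suffix on the mnemonic is the only place in the source where that difference is visible.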
timmmay
 
Posts: 0
Joined: Thu Jun 09, 2011 4:14 am UTC

Re: x86 assembly syntax

Postby EvanED » Thu Jun 09, 2011 2:12 pm UTC

timmmay wrote:i have the worst short term memory of anyone on the planet ...

...so you go with the "wtf order are these in" %fs:4(%eax, %ebx, 4) instead of the eminently clear fs:[eax + ebx * 4 + 4]? :-)

I guess that could count as long-term memory.
EvanED
 
Posts: 4141
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI

Re: x86 assembly syntax

Postby Ptolom » Thu Jun 09, 2011 5:37 pm UTC

I've never been much of an asm programmer, but Intel syntax seems slightly more logical... aesthetically pleasing. I don't know.
It Should Be Real wrote:Fuck the wizard.
We're doing this manually.

http://www.hexifact.co.uk - Hacking blog: in which I take some things apart, and put other things together.
Ptolom
 
Posts: 1543
Joined: Mon Mar 24, 2008 1:55 pm UTC
Location: The entropy pool

Re: x86 assembly syntax

Postby Derek » Sat Jul 02, 2011 6:13 pm UTC

I wrote an OS for class last semester, so I got to write a good amount of assembly code. We used GAS, and therefore AT&T. Our debugger showed assembly code in Intel syntax though, which was especially fun when it also showed the line of source code below, so you would see both syntaxes side by side :P

One advantage (imo) of AT&T that hasn't been mentioned above though is that it has source->destination syntax, which corresponds with most copy functions/programs and which I prefer.

All in all though I prefer Intel syntax. The main reason for this is that it's an Intel architecture, and therefore I feel that Intel gets to set the assembly standards and everyone else should follow them as closely as possible. Another reason is that the reordering of parameters in AT&T becomes really confusing when doing comparisons. The result of
Code: Select all
cmp $1, $0
jg label

is not immediately obvious. On the one hand, 1 > 0, but since AT&T syntax reverses parameters, the actual comparison you're performing is 0 > 1. This can easily lead to bugs, and did when writing my OS. Most of the differences between the two syntaxes are purely stylistic, but this and the unclear memory offset syntax are significant downsides for AT&T for me.
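The operand-order trap can be modelled in a few lines of C. This is a sketch under one caveat: a real cmp cannot take two immediates, so the snippet is best read as pseudo-code, and the model below just uses plain signed integers. The function names are invented for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Intel order: "cmp first, second" then "jg" jumps when first > second. */
static bool jg_taken_intel(int32_t first, int32_t second)
{
    return first > second;
}

/* AT&T order: "cmp first, second" then "jg" jumps when second > first,
   because the operands are written the other way round. */
static bool jg_taken_att(int32_t first, int32_t second)
{
    return second > first;
}
```

With operands written as 1, 0 the Intel reading takes the jump and the AT&T reading does not, which is exactly the kind of slip described.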
Derek
 
Posts: 1603
Joined: Wed Aug 18, 2010 4:15 am UTC

Re: x86 assembly syntax

Postby MHD » Mon Jul 25, 2011 11:42 pm UTC

I personally prefer LLVM and having the guys working on that wonderful project figure out the code generation for me.
EvanED wrote:be aware that when most people say "regular expression" they really mean "something that is almost, but not quite, entirely unlike a regular expression"
MHD
 
Posts: 631
Joined: Fri Mar 20, 2009 8:21 pm UTC
Location: Denmark

