XML vs... sanity?


tetsujin
Posts: 426
Joined: Thu Nov 15, 2007 8:34 pm UTC
Location: Massachusetts

XML vs... sanity?

Postby tetsujin » Sat Jun 16, 2012 1:17 am UTC

I've never been too keen on XML - but I will say that I think in some regards it's got the right idea: establish a syntax that allows for clear-cut delineation of payload, make the whole thing flexible and extensible as hell. However, there's also the attitude that compactness doesn't matter, or that it can be left to a compression algorithm. It seems to lead to some cases of flagrant waste.

This came up recently as I was looking to lay out some sheet music with the software available on Debian. I started with "abc" because... I happened to have remembered it existed. Spent a few hours learning to use it while writing up the music I wanted... The syntax kind of sucks (can't space out the notes so they're legible, etc.), it's pretty arcane and there's apparently no simple way to transpose a section with "8va" notation such that it'll actually work with the various tools...

Mind you, I wasn't planning to actually write any MusicXML in a text editor - I know that's not what it's for. But working with ABC kind of established my mindset as, out of curiosity, I went to see what MusicXML was like...

On the MusicXML FAQ they say in no uncertain terms that they believe in the XML principle that "terseness is not valuable" and that it's better to strive for clarity... But it seems like they've failed to grasp the value of brevity.

Too brief: IDNU. TL;DR NWIM? LOL
Too verbose:
Spoiler:
If it pleases you sirs, or ladies, who may undertake the endeavour to entertain the humble author as he seeks, humbly, to share his own vision of enlightenment upon this particular topic, the manner in which the expression of ideas is pursued, which is to say the particular means of expressing oneself, via a particular choice of language, the quality of expression thereof depends not merely upon the merit of one's ideas, nor the particular qualities expressed in the formulation of said ideas; rather, as conscientious authors all, we must endeavour to consider other matters as well. These various issues, elusive, counter-intuitive, apparently nonsensical as they may seem to intelligent, enlightened minds, nevertheless must be acknowledged as the truth which they are. As a particular composition increases in size, the increased bulk of the text may serve, not to further enlighten the gentle reader in his quest to connect with the author and recognize his particular view upon a given subject of conversation: but rather, through the density with which ideas are presented, and sheer bulk of text may serve instead to distract, weary, or annoy the reader. Further, through well-intentioned exploration of related ideas presented in the text, these explorations having marginal value to the core idea of the thesis, an author may divert attention from the core ideas which he is attempting to present, weakening the overall presentation of his argument. In civil discourse one may compare this situation to a gentleman speaking at great length on a particular subject; though his arguments may be sound and his discussion enjoyable enough, through the great length at which his turn at conversation is made, those around him may tire, and come to find the discussion nevertheless unpleasant.

Just about right: If you take "verbosity" too far, you generate a "wall of text". The "clarity" of your information is lost because you weren't brief enough.

As a basic example, here's ABC notation for a C-major chord one octave down, and one half the unit note length:

Code:

[C,E,G,]/2


Now, in MusicXML:

Code:

      <note>
        <pitch>
          <step>C</step>
          <octave>3</octave>
        </pitch>
        <duration>12</duration>
        <type>eighth</type>
      </note>
      <note>
        <chord />
        <pitch>
          <step>E</step>
          <octave>3</octave>
        </pitch>
        <duration>12</duration>
        <type>eighth</type>
      </note>
      <note>
        <chord />
        <pitch>
          <step>G</step>
          <octave>3</octave>
        </pitch>
        <duration>12</duration>
        <type>eighth</type>
      </note>


It's not all bad, it seems like a very versatile, full-featured format. But I have to believe they could have made better decisions about how to represent a sequence of notes. Think about how many notes are in a typical piece of music, then multiply that by eight lines of text each. I'll grant that it is pretty easy to look at the file in a text editor and figure out how the file structure works - but one could read the documentation, too, and learn the same thing. Or, if the file syntax has the right balance between brevity and flexibility, one could still just look at the file and figure it out.

For instance, let's suppose the format used MusicXML style representation of the notes, but provided a compact syntax for a sequence of notes:

Code:

(C3-12 E3-12 G3-12)


I think someone could look at that and figure out that "C" is the note, "3" is the octave, and "12" is the duration without straining their brain too much. In the context of XML representation of music, to provide everything MusicXML does, something like that would still have to be contained within XML tags delineating the song, the individual measure, and providing modifiers on individual sets of notes. But pretty much any time you have a musical note, you need those three pieces of information: pitch with octave, and duration. So why be that verbose?
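A rough sketch of how that compact form could be read and expanded back into MusicXML-style elements. The token shape (step letter, octave, "-", duration) is only what the example above suggests; the function names `parse_chord` and `to_musicxml` are made up here, and the sketch ignores accidentals, rests, and the <type> element.

```python
import re
import xml.etree.ElementTree as ET

# Assumed token shape, taken from the example: step letter, octave, "-", duration.
TOKEN = re.compile(r"^([A-G])(\d+)-(\d+)$")

def parse_chord(text):
    """Parse '(C3-12 E3-12 G3-12)' into a list of (step, octave, duration)."""
    body = text.strip()
    if not (body.startswith("(") and body.endswith(")")):
        raise ValueError("expected a parenthesised group of notes")
    notes = []
    for token in body[1:-1].split():
        match = TOKEN.match(token)
        if match is None:
            raise ValueError(f"bad note token: {token!r}")
        notes.append((match.group(1), int(match.group(2)), int(match.group(3))))
    return notes

def to_musicxml(notes):
    """Expand parsed notes into MusicXML-style <note> elements."""
    elements = []
    for index, (step, octave, duration) in enumerate(notes):
        note = ET.Element("note")
        if index > 0:
            # Every note after the first is marked as part of the same chord.
            ET.SubElement(note, "chord")
        pitch = ET.SubElement(note, "pitch")
        ET.SubElement(pitch, "step").text = step
        ET.SubElement(pitch, "octave").text = str(octave)
        ET.SubElement(note, "duration").text = str(duration)
        elements.append(note)
    return elements
```

Calling `parse_chord("(C3-12 E3-12 G3-12)")` gives three tuples, and `to_musicxml` turns them back into the long form, so the compact notation carries the same pitch, octave, and duration information.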

End of rant. (EOR)
---GEC
I want to create a truly new command-line shell for Unix.
Anybody want to place bets on whether I ever get any code written?

Iranon
Posts: 49
Joined: Wed Jul 28, 2010 6:30 am UTC

Re: XML vs... sanity?

Postby Iranon » Sat Jun 16, 2012 2:46 am UTC

Sounds like you answered your own question:
tetsujin wrote:Mind you, I wasn't planning to actually write any MusicXML in a text editor - I know that's not what it's for. But working with ABC kind of established my mindset as, out of curiosity, I went to see what MusicXML was like...


Markup languages are generally annoying because they're made for dumb machines rather than impatient people. Completeness and clarity in the data allow one to get away with a dumb and robust program, which sounds like good practice. Hand editing should only be necessary for little tweaks or unusual things that aren't worth having dedicated features in the dedicated editors.

Now, a tool to non-destructively strip the syntax down where applicable would be nice - saving space and keeping things more friendly to impatient humans.
These things don't typically get a lot of interest, though. The lazy developer's answer is "use the right tool for what you want to do" for convenience and "if it's repetitive, dumb compression ought to work well too" for space concerns.
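For illustration, a very small sketch of what the stripping half of such a tool might look like. The output notation ("C3-12", chords in parentheses) is a made-up compact form and `strip_down` is a hypothetical name. As written it only handles pitched notes and throws away <type> and anything else it doesn't recognise, so it is lossy; a genuinely non-destructive tool would have to carry that data along somehow.

```python
import xml.etree.ElementTree as ET

def strip_down(xml_text):
    """Collapse <note> elements into 'C3-12' tokens, grouping chords in parentheses."""
    # Wrap the fragment so that a bare run of <note> elements parses as one document.
    root = ET.fromstring(f"<part>{xml_text}</part>")
    groups = []
    for note in root.iter("note"):
        token = "{}{}-{}".format(
            note.findtext("pitch/step"),
            note.findtext("pitch/octave"),
            note.findtext("duration"),
        )
        if note.find("chord") is not None and groups:
            # <chord/> means this note sounds together with the previous one.
            groups[-1].append(token)
        else:
            groups.append([token])
    return " ".join(
        "(" + " ".join(group) + ")" if len(group) > 1 else group[0]
        for group in groups
    )
```

Fed the three-note MusicXML chord from the first post, this should produce a single parenthesised group of three tokens.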
LEGO won't be ready for the average user until it comes pre-assembled, in a single unified theme, and glued together so it doesn't come apart.

EvanED
Posts: 4331
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI

Re: XML vs... sanity?

Postby EvanED » Sat Jun 16, 2012 2:53 am UTC

Right.

I tend to view XML as basically a binary format with really big, easy-to-see bits. :-)

It's not really meant to be hand-edited, despite what its proponents may say. Things are a bit different with very text-heavy XML (like XHTML), but those XML formats that are basically just a mess of tags... doing hand-editing is a last resort.

In the specific case of music, I'd suggest either using some notation software or writing Lilypond source.

Xanthir
My HERO!!!
Posts: 5413
Joined: Tue Feb 20, 2007 12:49 am UTC
Location: The Googleplex

Re: XML vs... sanity?

Postby Xanthir » Mon Jun 25, 2012 11:41 pm UTC

Don't use XML when the weight of the tags is more than 50% of the filesize.
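One crude way to put a number on that rule of thumb, as a sketch: count the characters inside <...> against everything else once whitespace is stripped. The helper name `tag_weight` is made up, and stripping all whitespace is a simplification that would mangle text-heavy XML or attribute values containing spaces.

```python
import re

def tag_weight(xml_text):
    """Return the fraction of non-whitespace characters that sit inside <...> tags."""
    compact = re.sub(r"\s+", "", xml_text)
    tag_chars = sum(len(tag) for tag in re.findall(r"<[^>]*>", compact))
    return tag_chars / len(compact)

# The first note from the MusicXML example earlier in the thread.
note = """
<note>
  <pitch>
    <step>C</step>
    <octave>3</octave>
  </pitch>
  <duration>12</duration>
  <type>eighth</type>
</note>
"""

print(f"{tag_weight(note):.0%} of the note is markup")
```

For that single note the markup share comes out at roughly 90%, well past the 50% line.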

If you're really set on using XML for some reason, use a saner language that doesn't overuse elements like that:

Code:

<music defaultdur='12' defaulttype='8'>
  <note pitch='c3' />
  <note pitch='e3' />
  <note pitch='g3' />
</music>


or even better, use those element names for something useful:

Code:

<music defaultdur='12' defaulttype='8'>
  <c3 />
  <e3 />
  <g3 />
</music>
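
A reader for the attribute-based variant stays short, too. This is only a sketch using Python's standard ElementTree: the per-note `dur` and `type` override attributes are an assumption added for illustration (the examples above only show the defaults on the root), and `read_notes` is a made-up name.

```python
import xml.etree.ElementTree as ET

doc = """
<music defaultdur='12' defaulttype='8'>
  <note pitch='c3' />
  <note pitch='e3' dur='24' />
  <note pitch='g3' />
</music>
"""

def read_notes(xml_text):
    """Return (pitch, duration, type) tuples, filling gaps from the root's defaults."""
    root = ET.fromstring(xml_text)
    default_dur = root.get("defaultdur")
    default_type = root.get("defaulttype")
    return [
        # Hypothetical per-note 'dur' / 'type' attributes win over the defaults.
        (n.get("pitch"), int(n.get("dur", default_dur)), n.get("type", default_type))
        for n in root.findall("note")
    ]
```

The defaults live in one place on the root, so each note only has to say what is different about it.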
(defun fibs (n &optional (a 1) (b 1)) (take n (unfold '+ a b)))

KnightExemplar
Posts: 5494
Joined: Sun Dec 26, 2010 1:58 pm UTC

Re: XML vs... sanity?

Postby KnightExemplar » Wed Jun 27, 2012 1:32 pm UTC

XML's main advantage is its extendability due to its hierarchical structure.

The best protocol that demonstrates the extendability of XML is XMPP (aka Jabber). All of the Jabber extensions can be safely ignored by a "core-compliant" client or server. New extensions to the protocol can be safely added knowing everyone else will ignore it except for your clients.
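
A minimal sketch of the "ignore what you don't understand" idea, using Python's standard ElementTree. This is not real XMPP: both namespace URIs and the `sparkles` element are invented for the example, and `core_children` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

CORE_NS = "urn:example:core"  # made-up namespace standing in for the core protocol

stanza = """
<message xmlns="urn:example:core" xmlns:x="urn:example:fancy-extension">
  <body>hello</body>
  <x:sparkles level="11"/>
</message>
"""

def core_children(xml_text):
    """Return (local name, text) for children in the core namespace, skipping the rest."""
    root = ET.fromstring(xml_text)
    prefix = "{" + CORE_NS + "}"
    return [
        (child.tag[len(prefix):], child.text)
        for child in root
        if child.tag.startswith(prefix)  # anything from another namespace is ignored
    ]
```

A client that only knows the core namespace sees the body and never has to care that the extension element exists.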

For something like music, which has had a specific format for the past 500 years, it is unlikely that the format of music would ever change. If you studied musical notation long enough, you probably will come up with a non-extendable protocol that makes sense. For something more like XMPP which changes from implementation to implementation... the flexibility of XML greatly adds to the protocol.

Now of course, there are less extendable protocols that are much simpler to use. JSON, for example, allows you to add a field that older readers can ignore, so I'd generally recommend JSON as a starting point for a text protocol. If JSON's extendability isn't enough, then XML would be a solution. (Ex: the XML community has defined namespaces so that names from independent extensions don't collide. That's just one of the ways XML is more extendable than JSON.)
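
The JSON version of "add a field that can be ignored" could look something like this sketch. The field names, including the extra `fingering` field, are invented for the example, and `read_note` is a hypothetical reader.

```python
import json

# The fields this hypothetical reader understands.
KNOWN_FIELDS = ("step", "octave", "duration")

def read_note(text):
    """Decode a note object, keeping only the fields this reader knows about."""
    data = json.loads(text)
    return {key: data[key] for key in KNOWN_FIELDS if key in data}

old_style = '{"step": "C", "octave": 3, "duration": 12}'
new_style = '{"step": "C", "octave": 3, "duration": 12, "fingering": 1}'
```

Both strings decode to the same note as far as this reader is concerned. What plain JSON doesn't give you is any protection if two independent extensions both pick the name "fingering", which is the gap namespaces fill.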
First Strike +1/+1 and Indestructible.

EvanED
Posts: 4331
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI

Re: XML vs... sanity?

Postby EvanED » Wed Jun 27, 2012 2:11 pm UTC

KnightExemplar wrote:For something like music, which has had a specific format for the past 500 years, it is unlikely that the format of music would ever change.

While I get what you're saying, a particular piece of music software certainly won't support anything close to everything, and XML still allows them to easily ignore the parts of the spec that they don't support. Furthermore, music does change. Composers still occasionally invent a new notation, a new instruction, etc.; and the invention of a new instrument is a rare event but one that does happen.

KnightExemplar
Posts: 5494
Joined: Sun Dec 26, 2010 1:58 pm UTC

Re: XML vs... sanity?

Postby KnightExemplar » Wed Jun 27, 2012 10:23 pm UTC

EvanED wrote:
KnightExemplar wrote:For something like music, which has had a specific format for the past 500 years, it is unlikely that the format of music would ever change.

While I get what you're saying, a particular piece of music software certainly won't support anything close to everything, and XML still allows them to easily ignore the parts of the spec that they don't support. Furthermore, music does change. Composers still occasionally invent a new notation, a new instruction, etc.; and the invention of a new instrument is a rare event but one that does happen.


Sure. For example, in a program like "CSound" it makes sense to use XML, because they create new instruments all the time. And for a synthesizer where you're required to support new instruments, new notations, and new paradigms of developing sound (let alone new sounds)... XML makes a fine choice for that.

But extendability comes at a price of simplicity or efficiency. It is the programmer's job to clearly define his requirements such that he doesn't make something too extendable at the cost of other parameters. If all the OP wants to do is write Piano sheet music, XML is going to be a waste of time and energy. (Ex: CSound music doesn't translate directly to Piano Sheet music). On the other hand, just using XML doesn't guarantee extendability... it just allows extendability in a specific direction. If you design XML for Piano Sheet music, it probably won't work for Guitar Tabs at all.

For the case of "designing software for Piano Sheet music" and then having to deal with a new requirement later (ie: support Guitar Tab notation), XML really doesn't offer you much of an advantage over JSON... or even simple Type-Length Fields. All of the programming you did to properly space half-notes or quarter rests just doesn't make sense when you move to Guitar Tabs, so you'd end up "extending" your file format in a way that even simple Type-Length Field formats can do.

You're gonna add a "new type" for the different chords in Guitar Tabs and be done with it. CSV files, Type Length Field files, ini files, JSON... just about every simple file format supports this kind of extendability. Many of which are simpler to implement than XML.
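
To make the Type-Length Field point concrete, here is a small sketch. The layout (1-byte type, 2-byte big-endian length, then payload) and the type codes are assumptions picked for the example, not any particular existing format.

```python
import struct

NOTE, TAB_CHORD = 1, 2          # hypothetical type codes
HEADER = struct.Struct(">BH")   # 1-byte type, 2-byte big-endian payload length

def encode(record_type, payload):
    """Prefix a payload with its type and length."""
    return HEADER.pack(record_type, len(payload)) + payload

def decode(data, known_types):
    """Yield (type, payload) for known types; skip unknown ones using their length."""
    offset = 0
    while offset < len(data):
        record_type, length = HEADER.unpack_from(data, offset)
        offset += HEADER.size
        payload = data[offset:offset + length]
        offset += length
        if record_type in known_types:
            yield record_type, payload

stream = encode(NOTE, b"C3-12") + encode(TAB_CHORD, b"Em7") + encode(NOTE, b"G3-12")
```

An old reader calling `decode(stream, {NOTE})` steps over the guitar-tab record because the length field tells it how far to jump, which is the "new type and be done with it" kind of extendability described above.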

So when to use XML? XML really shines when you want other people to extend your file format, and still have your software compatible with their software. If you control everything, I doubt XML really will offer you much over simpler formats.
First Strike +1/+1 and Indestructible.

billyswong
Posts: 41
Joined: Mon Nov 16, 2009 3:56 pm UTC

Re: XML vs... sanity?

Postby billyswong » Wed Jul 11, 2012 2:22 pm UTC

KnightExemplar mentioned JSON. Seeing that, I tried a direct conversion from OP's xml into JSON:

Code:

[
  {
    "note":{
      "pitch":{
        "step":"C",
        "octave":3
      },
      "duration":12,
      "type":"eighth"
    }
  },
  {
    "note":{
      "chord":true,
      "pitch":{
        "step":"E",
        "octave":3
      },
      "duration":12,
      "type":"eighth"
    }
  },
  {
    "note":{
      "chord":true,
      "pitch":{
        "step":"G",
        "octave":3
      },
      "duration":12,
      "type":"eighth"
    }
  }
]

It looks verbose, but length-wise it is a little shorter (after whitespace is stripped).
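
The size comparison is easy to check mechanically. This sketch builds both encodings of the same C-E-G chord with no whitespace and compares their lengths; `as_xml` and `as_json` are made-up helper names, and the XML is assembled by hand with the chord marker written as `<chord/>`.

```python
import json

notes = [
    {"step": "C", "octave": 3, "duration": 12, "type": "eighth", "chord": False},
    {"step": "E", "octave": 3, "duration": 12, "type": "eighth", "chord": True},
    {"step": "G", "octave": 3, "duration": 12, "type": "eighth", "chord": True},
]

def as_xml(notes):
    """Whitespace-free MusicXML-style encoding of the notes."""
    parts = []
    for n in notes:
        chord = "<chord/>" if n["chord"] else ""
        parts.append(
            f"<note>{chord}<pitch><step>{n['step']}</step>"
            f"<octave>{n['octave']}</octave></pitch>"
            f"<duration>{n['duration']}</duration><type>{n['type']}</type></note>"
        )
    return "".join(parts)

def as_json(notes):
    """Whitespace-free JSON encoding with the same nesting as the conversion above."""
    out = []
    for n in notes:
        body = {}
        if n["chord"]:
            body["chord"] = True
        body["pitch"] = {"step": n["step"], "octave": n["octave"]}
        body["duration"] = n["duration"]
        body["type"] = n["type"]
        out.append({"note": body})
    return json.dumps(out, separators=(",", ":"))

print(len(as_xml(notes)), len(as_json(notes)))
```

For this chord the JSON comes out roughly a quarter shorter than the XML, mostly because JSON has no closing tags to repeat each name.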

Jplus
Posts: 1721
Joined: Wed Apr 21, 2010 12:29 pm UTC
Location: Netherlands

Re: XML vs... sanity?

Postby Jplus » Mon Jul 23, 2012 10:32 am UTC

JSON has already been mentioned, so all there's left for me to say is YAML.

(Personally, even if you want others to extend your format I see no reason to prefer XML over JSON.)
"There are only two hard problems in computer science: cache coherence, naming things, and off-by-one errors." (Phil Karlton and Leon Bambrick)

coding and xkcd combined

(Julian/Julian's)

cjmcjmcjmcjm
Posts: 1158
Joined: Tue Jan 05, 2010 5:15 am UTC
Location: Anywhere the internet is strong

Re: XML vs... sanity?

Postby cjmcjmcjmcjm » Tue Jul 31, 2012 3:14 am UTC

Since you all are complaining about music and XML, may I point you to Lilypond?
frezik wrote:Anti-photons move at the speed of dark

DemonDeluxe wrote:Paying to have laws written that allow you to do what you want, is a lot cheaper than paying off the judge every time you want to get away with something shady.

tetsujin
Posts: 426
Joined: Thu Nov 15, 2007 8:34 pm UTC
Location: Massachusetts

Re: XML vs... sanity?

Postby tetsujin » Thu Aug 02, 2012 6:31 pm UTC

cjmcjmcjmcjm wrote:Since you all are complaining about music and XML, may I point you to Lilypond?


You may, but I've already seen it. The only reason I posted an example in ABC rather than Lilypond was because I had just recently used ABC to make some sheet music for the first time, and only learned about Lilypond once I started researching my other options (after having discovered a few things ABC apparently couldn't do.)

I'm not complaining about MusicXML because I think it's the only option; it just baffles me that people will design something like that without any thought given to compact representation. And then the claimed benefits of "human-readability" are just about crushed by the vanishing signal-to-noise ratio of content vs. tag structure.

To be fair, though, I don't suppose many sheet music scores are going to be so large that even MusicXML bloat (on the order of 100 bytes per note on the page) is going to be all that serious for today's hardware.
---GEC
I want to create a truly new command-line shell for Unix.
Anybody want to place bets on whether I ever get any code written?

soundandfury
Posts: 20
Joined: Wed May 27, 2009 2:59 pm UTC
Location: near Cambridge, UK

Re: XML vs... sexprs?

Postby soundandfury » Sat Aug 18, 2012 11:25 pm UTC

My preferred metaformat for these kinds of things is S-expressions. They express fully recursive tree structure with much much less markup than XML; they're even lighter than JSON.
(I'm actually working on an S-expression based textual representation of MIDI files, though it's only intended for debugging-ish purposes - I can't imagine someone writing out an entire MIDI file, by hand, from scratch)

Code:

(chord
    (note (type eighth) C (octave 3) (duration 12))
    (note (type eighth) E (octave 3) (duration 12))
    (note (type eighth) G (octave 3) (duration 12))
)


Or, with 'defaults' modifiers (an atom (defaults) whose first argument is a list of default properties),

Code:

(chord
    (defaults ((type eighth) (octave 3) (duration 12))
        (note C)
        (note E)
        (note G)
    )
)


Other kinds of modifiers (eg. (repeat)) could similarly be implemented as enclosing tags, like this: (repeat 3 (note C (duration 12)))
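
Part of the appeal is how little code a reader takes. Here is a sketch of a tiny S-expression parser plus a walker that applies enclosing (defaults ...) to the notes inside. The names `tokenize`, `parse`, and `notes_in` are made up, bare atoms inside a note are assumed to be the step letter, and (repeat) is not handled.

```python
def tokenize(text):
    """Split an S-expression into parentheses and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()

def parse(tokens):
    """Recursively build nested lists from a flat token list (consumes the list)."""
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens[0] != ")":
            items.append(parse(tokens))
        tokens.pop(0)  # drop the closing ")"
        return items
    return int(token) if token.isdigit() else token

def notes_in(expr, defaults=None):
    """Flatten a parsed tree into note dicts, applying any enclosing (defaults ...)."""
    defaults = dict(defaults or {})
    head, *rest = expr
    if head == "note":
        note = dict(defaults)
        for item in rest:
            if isinstance(item, list):
                note[item[0]] = item[1]   # (octave 3) style property
            else:
                note["step"] = item       # assumption: a bare atom is the step letter
        return [note]
    if head == "defaults":
        defaults.update({key: value for key, value in rest[0]})
        rest = rest[1:]
    return [n for child in rest for n in notes_in(child, defaults)]

text = "(chord (defaults ((type eighth) (octave 3) (duration 12)) (note C) (note E) (note G)))"
tree = parse(tokenize(text))
```

Running `notes_in(tree)` on the defaults example yields three note dicts that each carry the shared type, octave, and duration along with their own step.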
Q: Why don't matrices live in the suburbs?
A: Because they don't commute.

eternalfrost
Posts: 75
Joined: Mon Feb 11, 2008 1:06 am UTC

Re: XML vs... sanity?

Postby eternalfrost » Wed Nov 05, 2014 3:28 am UTC

I have had more luck with JSON, but I don't have much occasion to use it.

KnightExemplar
Posts: 5494
Joined: Sun Dec 26, 2010 1:58 pm UTC

Re: XML vs... sanity?

Postby KnightExemplar » Sun Nov 09, 2014 9:16 am UTC

Holy necro batman.

I talked in this thread before?
First Strike +1/+1 and Indestructible.

