Poll! What is the value of this constant?

A place to discuss the implementation and style of computer programs.

Moderators: phlip, Moderators General, Prelates

What is the value of FILE_EXTENSION_PDF?

pdf
46
88%
.pdf
6
12%
 
Total votes: 52

User avatar
tendays
Posts: 957
Joined: Sat Feb 17, 2007 6:21 pm UTC
Location: HCMC

Poll! What is the value of this constant?

Postby tendays » Wed Aug 20, 2014 5:56 am UTC

I'll explain the context later but I don't want to skew the results.
In a (java, but the question is language-independent) project I'm working on we have a String constant called FILE_EXTENSION_PDF (in a file called CommonConstants.java).
The question you have to answer is: Is that constant equal to "pdf" (without the quotes) or ".pdf" (still without the quotes, but with a dot).
Please vote for what feels more natural to you, thank you in advance!

EDIT: I don't know if I can close voting without losing the results? The answer to the question is in my last post. The results so far are:

Code: Select all

######################* pdf (85%)
##**                   .pdf (15%)

(Stars are the votes of three people who work in the same company but different projects, whom I asked before posting a poll in here)
Last edited by tendays on Thu Aug 21, 2014 3:05 am UTC, edited 1 time in total.
<Will> s/hate/love/
Hammer wrote:We are only mildly modly. :D
Beware of the shrolymerase!

EvanED
Posts: 4331
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI
Contact:

Re: Poll! What is the value of this constant?

Postby EvanED » Wed Aug 20, 2014 6:21 am UTC

Weak vote for "pdf". (I'd use suffix for ".pdf".) But I'd check the code before assuming too much.

User avatar
Zabaron
Posts: 113
Joined: Wed Jun 04, 2008 3:33 am UTC
Location: Georgia

Re: Poll! What is the value of this constant?

Postby Zabaron » Wed Aug 20, 2014 6:39 am UTC

I would also lean towards "pdf", but it's a horrible constant name either way. The point of a constant like that is so you could change all the uses at once, but then the constant would be named improperly. It's almost as bad as "const int FIVE = 6". Another would be to give an otherwise meaningless constant like 5 some context, but FILE_EXTENSION_PDF doesn't give any more context thant ".pdf".
I love deadlines. I love that wooshing sound they make as they fly by. -Douglas Adams

User avatar
Thesh
Made to Fuck Dinosaurs
Posts: 6327
Joined: Tue Jan 12, 2010 1:55 am UTC
Location: Colorado

Re: Poll! What is the value of this constant?

Postby Thesh » Wed Aug 20, 2014 7:17 am UTC

I would also assume it was "pdf" and then question why that needs to be a constant in the first place.
Summum ius, summa iniuria.

User avatar
Xenomortis
Not actually a special flower.
Posts: 1426
Joined: Thu Oct 11, 2012 8:47 am UTC

Re: Poll! What is the value of this constant?

Postby Xenomortis » Wed Aug 20, 2014 8:41 am UTC

Strong vote for "pdf".
You're asking for the extension part and the period is not part of the extension.

Odd that it's a constant - it's not even like FILE_EXTENSION_PDF is any more useful than "pdf".
Image

speising
Posts: 2288
Joined: Mon Sep 03, 2012 4:54 pm UTC
Location: wien

Re: Poll! What is the value of this constant?

Postby speising » Wed Aug 20, 2014 8:45 am UTC

i agree it's weird, but a possible use could be to prevent spelling errors. how often do we type "pfd" in our haste.

User avatar
The Great Hippo
Swans ARE SHARP
Posts: 7212
Joined: Fri Dec 14, 2007 4:43 am UTC
Location: behind you

Re: Poll! What is the value of this constant?

Postby The Great Hippo » Wed Aug 20, 2014 11:51 am UTC

Is FILE_EXTENSION_PFD a less likely error?

I guess it would show up faster during debugging.

User avatar
Xeio
Friends, Faidites, Countrymen
Posts: 5099
Joined: Wed Jul 25, 2007 11:12 am UTC
Location: C:\Users\Xeio\
Contact:

Re: Poll! What is the value of this constant?

Postby Xeio » Wed Aug 20, 2014 1:58 pm UTC

I voted for "pdf", but I'd assume it should match whatever the standard library returns from "Path.GetExtension(path)" in the Java equivalent.

I don't know if Java has something like that in the standard library though...

EDIT: Somewhat related, this is why I love that C# has methods like Path.ChangeExtension that handles files that don't currently have an extension, as well as properly consume both ".pdf" and "pdf" as passed extensions.

EvanED
Posts: 4331
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI
Contact:

Re: Poll! What is the value of this constant?

Postby EvanED » Wed Aug 20, 2014 2:29 pm UTC

This seems unlikely for PDF, but there are situations like this that are eminently reasonable. For example, consider FILE_EXTENSION_HTML. Perhaps you're undecided on whether you want to output files as ".htm" or ".html". Making a constant is perfect for that time. Or perhaps you want to do something analogous to JAR files -- you're considering using PDF for the format of something, but giving it a different extension to distinguish.

I see little reason to complain about the presence of the constant even ignoring this:
The Great Hippo wrote:I guess it would show up faster during debuggingcompilation.
which is reason enough IMO to have it. :-)

Edit: It also prevents things like "is it '.PDF' or '.pdf'?"

speising
Posts: 2288
Joined: Mon Sep 03, 2012 4:54 pm UTC
Location: wien

Re: Poll! What is the value of this constant?

Postby speising » Wed Aug 20, 2014 3:03 pm UTC

EvanED wrote:
The Great Hippo wrote:I guess it would show up faster during debuggingcompilationtyping with autocompletion.

User avatar
firechicago
Posts: 621
Joined: Mon Jan 11, 2010 12:27 pm UTC
Location: One time, I put a snowglobe in the microwave and pushed "Hot Dog"

Re: Poll! What is the value of this constant?

Postby firechicago » Wed Aug 20, 2014 3:25 pm UTC

speising wrote:
EvanED wrote:
The Great Hippo wrote:I guess it would show up faster during debuggingcompilationtyping with autocompletion.

That assuming it's the only constant using that format. Autocomplete doesn't help very much if you also have FILE_EXTENSION_JPG and FILE_EXTENSION_HTML.

User avatar
Diemo
Posts: 393
Joined: Mon Dec 03, 2007 8:43 pm UTC

Re: Poll! What is the value of this constant?

Postby Diemo » Wed Aug 20, 2014 4:11 pm UTC

So, I'm the only one who thought that would be ".pdf". Nice to know.

firechicago wrote:
speising wrote:
EvanED wrote:
The Great Hippo wrote:I guess it would show up faster during debuggingcompilationtyping with autocompletion.

That assuming it's the only constant using that format. Autocomplete doesn't help very much if you also have FILE_EXTENSION_JPG and FILE_EXTENSION_HTML.


Why would you ever have this?
In the beginning the Universe was created.
This has made a lot of people very angry and been widely regarded as a bad move.
--Douglas Adams

User avatar
Xenomortis
Not actually a special flower.
Posts: 1426
Joined: Thu Oct 11, 2012 8:47 am UTC

Re: Poll! What is the value of this constant?

Postby Xenomortis » Wed Aug 20, 2014 4:16 pm UTC

Wait, you're ok with FILE_EXTENSION_PDF, where pdf files have basically one accepted extension.
But you're confused as to why you might have FILE_EXTENSION_JPG or HTML; file types that have multiple extensions?
Image

User avatar
Diemo
Posts: 393
Joined: Mon Dec 03, 2007 8:43 pm UTC

Re: Poll! What is the value of this constant?

Postby Diemo » Wed Aug 20, 2014 4:19 pm UTC

Well, I wouldn't say that I am ok with FILE_EXTENSION _PDF.

I did forget that jpg had multiple extentions though, so they do make sense.
In the beginning the Universe was created.
This has made a lot of people very angry and been widely regarded as a bad move.
--Douglas Adams

User avatar
tendays
Posts: 957
Joined: Sat Feb 17, 2007 6:21 pm UTC
Location: HCMC

Re: Poll! What is the value of this constant?

Postby tendays » Thu Aug 21, 2014 3:00 am UTC

Thank you for your votes!
The correct answer was ".pdf" with a dot.
Now for the context as promised. The application we are developing permits users to upload files, but they are only permitted to upload pdf files, or more specifically files whose name ends in ".pdf", so we had something like

Code: Select all

if (!filename.endsWith(".pdf")) { throw some exception }

Secondly, the application sometimes generates pdf files. When generating the name of those files it does something like

Code: Select all

String filename = some + contextual + data + ".pdf";

That was noticed during a code review, and the reviewer said that having the same string literal more than once is A Bad Thing™, and required them to be replaced by a constant in a central location. So the above code was replaced by

Code: Select all

if (!filename.endsWith(CommonConstants.FILE_EXTENSION_PDF)) { throw some exception }

and

Code: Select all

String filename = some + contextual + data + CommonConstants.FILE_EXTENSION_PDF;

The point that I am trying to make is that it is less clear what the code does, and therefore more error-prone, in the revised versions.
I didn't see anything wrong with using string literals because when reading the code we immediately know if it's correct or not.
However when reading the revised versions I can't be sure if they're doing what they're supposed to do without knowing if the constant contains a dot or not (yes I know I can easily reach the constant file). For instance consider this piece of code:

Code: Select all

if (!FilenameUtils.getExtension(filename).equals(CommonConstants.FILE_EXTENSION_PDF)) { do something }

It seems obviously correct, but actually it's wrong because the constant contains a dot (getExtension doesn't return the dot).
But before making such a claim I needed to know if maybe it is obvious what the constant means, in which case my concern is without basis.

So, yeah, thanks again for the data.

I guess we will rename the constant to

Code: Select all

public static final String DOT_PDF = ".pdf";


Why are people so afraid of string literals?

EDIT: spelsing, you do have a point about typoes, I had not thought of that
<Will> s/hate/love/
Hammer wrote:We are only mildly modly. :D
Beware of the shrolymerase!

User avatar
Xeio
Friends, Faidites, Countrymen
Posts: 5099
Joined: Wed Jul 25, 2007 11:12 am UTC
Location: C:\Users\Xeio\
Contact:

Re: Poll! What is the value of this constant?

Postby Xeio » Thu Aug 21, 2014 2:00 pm UTC

tendays wrote:Why are people so afraid of string literals?
Well, it's significantly harder to make a typo in one of the dozens of instances using a constant.

I think this particular issue tends to be a problem with path manipulation as strings, not constant related.

What does

Code: Select all

string path = rootFolder + subFolder + filename + extension
result in? Did you include a / between each folder and filename? Did you accidentally include two? Did the extension have a ".", or were you supposed to add one?

You should really be using a library. It appears that java has one. But this is the C# way:

Code: Select all

string path = Path.Combine(rootFolder, subFolder, filename);
path = Path.ChangeExtension(path, extension);

User avatar
Qaanol
The Cheshirest Catamount
Posts: 3060
Joined: Sat May 09, 2009 11:55 pm UTC

Re: Poll! What is the value of this constant?

Postby Qaanol » Fri Aug 22, 2014 12:49 am UTC

tendays wrote:The application we are developing permits users to upload files, but they are only permitted to upload pdf files, or more specifically files whose name ends in ".pdf", so we had something like

Code: Select all

if (!filename.endsWith(".pdf")) { throw some exception }

Secondly, the application sometimes generates pdf files. When generating the name of those files it does something like

Code: Select all

String filename = some + contextual + data + ".pdf";

That was noticed during a code review, and the reviewer said that having the same string literal more than once is A Bad Thing™, and required them to be replaced by a constant in a central location. So the above code was replaced by

Code: Select all

if (!filename.endsWith(CommonConstants.FILE_EXTENSION_PDF)) { throw some exception }

and

Code: Select all

String filename = some + contextual + data + CommonConstants.FILE_EXTENSION_PDF;

The point that I am trying to make is that it is less clear what the code does, and therefore more error-prone, in the revised versions.
I didn't see anything wrong with using string literals because when reading the code we immediately know if it's correct or not.
However when reading the revised versions I can't be sure if they're doing what they're supposed to do without knowing if the constant contains a dot or not (yes I know I can easily reach the constant file).


The term “file extension” refers to the part after the dot, so the constant is poorly named.

But even more important than that, it is not at all clear to me that “extension of files that users can upload” and “extension of files the application generates” are related. In particular, it is entirely plausible that at some point in the future the application might be able to generate files of a type that users are not allowed to upload, and/or users might be allowed to upload files of a type that the application cannot generate.

If that ever happens, then it would be necessary to go back through the code and figure out which occurrences of the constant refer to which thing.

So I would say, if ".pdf" appears multiple times with respect to user uploads, then create a constant for UPLOAD_SUFFIX or some such.

Similarly, if ".pdf" appears multiple times with respect to generated files, then create a constant for NEW_FILE_SUFFIX or what-have-you.
wee free kings

Tub
Posts: 409
Joined: Wed Jul 27, 2011 3:13 pm UTC

Re: Poll! What is the value of this constant?

Postby Tub » Fri Aug 22, 2014 10:17 am UTC

tendays wrote:That was noticed during a code review, and the reviewer said that having the same string literal more than once is A Bad Thing™, and required them to be replaced by a constant in a central location.

Cargo cult, cargo cult! Switch off your brain and follow the rules!
Did the reviewer have anything else to say, or did he just complain about something trivial to maintain the illusion that he's been doing his job?


As a rule of thumb, if the name of your constant includes its value (FILE_EXTENSION_PDF = "pdf" or FIVE = 5), you're probably doing it wrong. Both constants are used for different purposes anyway, so I don't see why they need a common constant. Detecting jpg or html wouldn't work this way, and if you end up with FILE_EXTENSION_JPG and FILE_EXTENSION_JPEG, you still haven't reduced potential errors. Someone checking the file extension must remember to check both constants, and someone creating a new filename doesn't know which to use as default.

While it would be helpful to prevent typos like ".pfd", I think in the current case the dot creates more potential (and difficult to spot) errors than it prevents. It's not like the next guy HAS to use the constant, he can still type ".pfd" as often as he likes.

You could create helper functions

Code: Select all

bool doesFilenameMatchMIME(path, "image/jpeg")
string addExtensionByMIME(path, "image/jpeg")

or even a function per type so you can avoid the mime types

Code: Select all

bool doesFilenameMatchPDF(path)
string addExtensionForPDF(path)

bool doesFilenameMatchJPEG(path)
string addExtensionForJPEG(path)

but whether or not such an abstraction is useful or just architectural bloat depends on the application.

tendays wrote:

Code: Select all

if (!filename.endsWith(CommonConstants.FILE_EXTENSION_PDF)) { throw some exception }

Wait, is that comparison case sensitive? Let me just upload that .PDF ..

Never mind that you should probably be whitelisting instead of blacklisting, and should probably check for the actual file types instead of just names.

I don't know the rest of your code or the requirements of the application, but if a reviewer complains about string literals, but not about bugs within the code, then it doesn't look like he understood the code. Feel free to replace him with jenkins if you need someone to complain to you about superficial style problems. :evil:


/edit: ignore that, I overlooked the negation operator. I totally blame the lack of syntax highlighting for that one. ;)
Xeio wrote:What does

Code: Select all

string path = rootFolder + subFolder + filename + extension
result in? Did you include a / between each folder and filename? Did you accidentally include two? Did the extension have a ".", or were you supposed to add one?

If one doesn't have access to a path manipulation library, IMHO the clearest way to put it would be

Code: Select all

string path = rootFolder + PATH_SEPARATOR + subFolder + PATH_SEPARATOR + filename + "." + extension

Duplicated / are ignored by all operating systems I know, so better add one too many than one too few. (Unless you wish to use the filename as a key for caching purposes or something, but then you'll need to call realpath() anyway to resolve symbolic links)
You can replace PATH_SEPARATOR by "/" if you don't intend to pass the name to certain windows APIs.
Last edited by Tub on Fri Aug 22, 2014 1:55 pm UTC, edited 2 times in total.

EvanED
Posts: 4331
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI
Contact:

Re: Poll! What is the value of this constant?

Postby EvanED » Fri Aug 22, 2014 1:44 pm UTC

Qaanol wrote:Similarly, if ".pdf" appears multiple times with respect to generated files, then create a constant for NEW_FILE_SUFFIX or what-have-you.
The concept is kinda OK if you're right about upload extension and new file extension being unrelated (I think the constant becomes more questionable if they are unrelated, and my earlier message and position that the constant is a good idea is somewhat assuming they are related).

But in that case, I think "NEW_FILE_SUFFIX" is 10x worse than FILE_EXTENSION_PDF. The reason is it's sort of in an "uncanny valley" of containing semantic information: it doesn't contain the information about the concrete value (e.g. _PDF) but at the same time doesn't contain information about what "new file" it's talking about. So now you need to add a new output to your program. Which one is NEW_FILE_SUFFIX talking about? What do you call the next constant, if you have one... OTHER_NEW_FILE_SUFFIX? (That's a bit facetious, but it gets across my point.)

Tub wrote:
tendays wrote:

Code: Select all

if (!filename.endsWith(CommonConstants.FILE_EXTENSION_PDF)) { throw some exception }

Wait, is that comparison case sensitive? Let me just upload that .PDF ..

Never mind that you should probably be whitelisting instead of blacklisting, and should probably check for the actual file types instead of just names.
That is a whitelist. He's whitelisting .pdf. (Though I do agree that a check for the file type is probably a good idea, though I can imagine you may want to start with a filename check first.)

You can replace PATH_SEPARATOR by "/" if you don't intend to pass the name to certain windows APIs.
Well, or some other uses like passing a path to another program that doesn't understand that Windows actually deals pretty well with a/b/c in addition to a\b\c and does a manual str.split("\\") or something.

User avatar
Jplus
Posts: 1717
Joined: Wed Apr 21, 2010 12:29 pm UTC
Location: Netherlands

Re: Poll! What is the value of this constant?

Postby Jplus » Mon Aug 25, 2014 9:30 am UTC

I just submitted a late vote for ".pdf" (after reading only the OP). It was not a strong expectation, but I had two reasons to choose ".pdf" over "pdf" that made the deal.

First, in my experience, standard libraries that deal with file extensions explicitly (such as Python's os.path.splitext) tend to treat the dot as a part of the extension. So it would be good for interoperability if the constant did the same.

Secondly, when adding/detecting file extensions, you invariably have to deal with a dot. So you may as well include the dot in the constant.

That said, I have very little experience with Java and I didn't really ponder the virtues of a file extension constant, either.
"There are only two hard problems in computer science: cache coherence, naming things, and off-by-one errors." (Phil Karlton and Leon Bambrick)

coding and xkcd combined

(Julian/Julian's)

EvanED
Posts: 4331
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI
Contact:

Re: Poll! What is the value of this constant?

Postby EvanED » Mon Aug 25, 2014 2:20 pm UTC

I don't mind the argument that ".pdf" could be a more useful constant to have (it depends on your language, libraries, and purpose), but I think it shouldn't be called "extension" in that case. I'd prefer FILE_SUFFIX_PDF if you include the dot.

My view may be biased by the DOS 8.3 days. (How often did you see "you have an 8-character name plus a 4-character extension that must start with a dot" vs "you have an 8-character name plus a 3-character extension, separated by a dot"?)

heatsink
Posts: 86
Joined: Fri Jun 30, 2006 8:58 am UTC

Re: Poll! What is the value of this constant?

Postby heatsink » Mon Aug 25, 2014 3:08 pm UTC

If the extension format begins with a dot (".pdf"), are extensions without a dot allowed? If the answer is no, don't include it in the extension string. The only thing the dot character achieves is to introduce a new kind of invalid data—extension strings that don't begin with dot—that you have to reason about and check for.

In general, if you store data in a format that admits meaningless values, then there's a risk of errors from using meaningless values. You might forget an error check, or someone else may misunderstand the documentation and add buggy code. By picking the right format, you declare what values are meaningful in a way that's enforced by the programming language.

User avatar
You, sir, name?
Posts: 6983
Joined: Sun Apr 22, 2007 10:07 am UTC
Location: Chako Paul City
Contact:

Re: Poll! What is the value of this constant?

Postby You, sir, name? » Mon Aug 25, 2014 3:39 pm UTC

EvanED wrote:I don't mind the argument that ".pdf" could be a more useful constant to have (it depends on your language, libraries, and purpose), but I think it shouldn't be called "extension" in that case. I'd prefer FILE_SUFFIX_PDF if you include the dot.


I concur with this. SUFFIX makes me think there could be more than just the extension (which there appears to be in this case -- either way, it tells me I should check). EXTENSION makes me think I'll find exactly 3 characters, and that I'm charged with adding the dot.
I edit my posts a lot and sometimes the words wrong order words appear in sentences get messed up.

EvanED
Posts: 4331
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI
Contact:

Re: Poll! What is the value of this constant?

Postby EvanED » Mon Aug 25, 2014 4:05 pm UTC

I wouldn't expect to find exactly 3, because of file names like foo.html or foo.ly, even though I said my thinking is probably inspired by the terminology used for 8.3 names. :-) "html" and "ly" are still extensions in my mind.

troyp
Posts: 557
Joined: Thu May 22, 2008 9:20 pm UTC
Location: Lismore, NSW

Re: Poll! What is the value of this constant?

Postby troyp » Mon Aug 25, 2014 5:02 pm UTC

I've seen "extension" used to both include and exclude the dot. Personally, I tend to agree with most people that it should exclude the dot, and use "filetype suffix" to include it.

I think it's sensible to have a FILE_EXTENSION_PDF constant to standardize the extension used for a particular file type. That's assuming the "PDF" in the identifier is actually the file type and not the extension itself. ie.

Code: Select all

const String FILE_EXTENSION_POSTSCRIPT = "ps"
would make sense. OTOH, if it was just naming a particular string, so you have both FILE_EXTENSION_JPG and FILE_EXTENSION_JPEG, that seems pointless to me. I do wonder about the use that tendays described. A constant naming a canonical extension for a file type would make sense for writing the extension, but for identifying it, you'd want a list of all possible extensions (probably ignoring case), not a single string.*

I also think that while using the full suffix as the constant may be more convenient, heatsink is right about the raw extension being the 'right thing' to use.

* You could always allow either, so FILE_EXTENSION_PDF would be "pdf", but FILE_EXTENSION_JPEG would be ["jpg", "jpeg"], but that's not really practical in a statically typed language like Java.

User avatar
You, sir, name?
Posts: 6983
Joined: Sun Apr 22, 2007 10:07 am UTC
Location: Chako Paul City
Contact:

Re: Poll! What is the value of this constant?

Postby You, sir, name? » Mon Aug 25, 2014 6:20 pm UTC

EvanED wrote:I wouldn't expect to find exactly 3, because of file names like foo.html or foo.ly, even though I said my thinking is probably inspired by the terminology used for 8.3 names. :-) "html" and "ly" are still extensions in my mind.


Well, I'd certainly expect strlen("pdf") to be 3 anyway :P
I edit my posts a lot and sometimes the words wrong order words appear in sentences get messed up.

User avatar
Xenomortis
Not actually a special flower.
Posts: 1426
Joined: Thu Oct 11, 2012 8:47 am UTC

Re: Poll! What is the value of this constant?

Postby Xenomortis » Tue Aug 26, 2014 3:28 pm UTC

troyp wrote: You could always allow either, so FILE_EXTENSION_PDF would be "pdf", but FILE_EXTENSION_JPEG would be ["jpg", "jpeg"], but that's not really practical in a statically typed language like Java.

That'd be worse! Now I don't know how I'm supposed to use FILE_EXTENSION_X - is it an array or a string? If it's an array, how do I know which element I want to use?
And if you do that, you might as well let me people type the extension manually.
Image

troyp
Posts: 557
Joined: Thu May 22, 2008 9:20 pm UTC
Location: Lismore, NSW

Re: Poll! What is the value of this constant?

Postby troyp » Wed Aug 27, 2014 6:55 am UTC

You'd just test it. If it's a string, you convert it to a 1-element array. Then you check input against each element of the array and if it matches any, accept it.

That said, I admit it'd be a bit silly to allow either. An interface like that's just a convenience for the user, so they can pass multiple values but in the common case of one value, not have to bother wrapping it. It makes sense for a function parameter, but for a constant it's pretty overboard. Presumably it's not going to be changed often enough that care about something like this.

User avatar
WarDaft
Posts: 1583
Joined: Thu Jul 30, 2009 3:16 pm UTC

Re: Poll! What is the value of this constant?

Postby WarDaft » Wed Aug 27, 2014 1:33 pm UTC

It's terrible for any situation. "Just test for it" means that every time you use it, you must test for it or you have introduced a new opportunity for error.

Further, in many strongly typed languages, you probably can't use one in place of another. And this is not support for using a weakly typed language, because those run into the above problem which is even worse.
All Shadow priest spells that deal Fire damage now appear green.
Big freaky cereal boxes of death.

User avatar
Xenomortis
Not actually a special flower.
Posts: 1426
Joined: Thu Oct 11, 2012 8:47 am UTC

Re: Poll! What is the value of this constant?

Postby Xenomortis » Wed Aug 27, 2014 1:36 pm UTC

troyp wrote:You'd just test it. If it's a string, you convert it to a 1-element array. Then you check input against each element of the array and if it matches any, accept it.

Dude, this isn't the RW forum.
Image

User avatar
Yakk
Poster with most posts but no title.
Posts: 11092
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: Poll! What is the value of this constant?

Postby Yakk » Wed Aug 27, 2014 2:47 pm UTC

You do need both for recognition. A file with extension .jpeg and .jpg are both reasonable possibilities, and not detecting one or the other will screw you up.

When generating such extensions, you probably want one picked "by default" if the code above doesn't pick one.

So the right kind of "extension for a given file name" is a list of extensions, plus a default-default, and possibly a place to get more extensions that match that file type and/or change the default.

In practice, things like extensions for a given file type should usually be part of a plug-in architecture (so you can add more file types without changing the code base) and/or data rather than code constants.

Or even it should be simply handled by systems you do not write yourself.

All of this is in an increasing spiral of code robustness and complexity away from the throw-away script.

The important part is that the decision of ".jpg" or "jpg" is not an important one. Use whatever works: the important things to think about are removed from that problem.
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.

troyp
Posts: 557
Joined: Thu May 22, 2008 9:20 pm UTC
Location: Lismore, NSW

Re: Poll! What is the value of this constant?

Postby troyp » Wed Aug 27, 2014 2:56 pm UTC

WarDaft wrote:It's terrible for any situation. "Just test for it" means that every time you use it, you must test for it or you have introduced a new opportunity for error.

Further, in many strongly typed languages, you probably can't use one in place of another. And this is not support for using a weakly typed language, because those run into the above problem which is even worse.
What do you mean "every time you use it"? You just test it at the beginning of the function. (This is something you'd do in a dynamically-typed language.)

Xenomortis wrote:
troyp wrote:You'd just test it. If it's a string, you convert it to a 1-element array. Then you check input against each element of the array and if it matches any, accept it.
Dude, this isn't the RW forum.
Um? I just answered your question. I'm not sure what religious-warry about that.

User avatar
Xenomortis
Not actually a special flower.
Posts: 1426
Joined: Thu Oct 11, 2012 8:47 am UTC

Re: Poll! What is the value of this constant?

Postby Xenomortis » Wed Aug 27, 2014 3:09 pm UTC

troyp wrote:Um? I just answered your question. I'm not sure what religious-warry about that.

I expect a certain level of sanity here though. :P

Edit:
Ok, I'm being a little unfair.
But your solution only causes me problems when I want to save a jpg file following "the standard", whatever that is.
Image

User avatar
WarDaft
Posts: 1583
Joined: Thu Jul 30, 2009 3:16 pm UTC

Re: Poll! What is the value of this constant?

Postby WarDaft » Wed Aug 27, 2014 3:30 pm UTC

troyp wrote:What do you mean "every time you use it"? You just test it at the beginning of the function. (This is something you'd do in a dynamically-typed language.)

Yes, so the beginning of every function in which you use it you now have to remember to do another test.

This is like the exact opposite of good programming practice.
All Shadow priest spells that deal Fire damage now appear green.
Big freaky cereal boxes of death.

troyp
Posts: 557
Joined: Thu May 22, 2008 9:20 pm UTC
Location: Lismore, NSW

Re: Poll! What is the value of this constant?

Postby troyp » Wed Aug 27, 2014 3:49 pm UTC

Xenomortis wrote:
troyp wrote:Um? I just answered your question. I'm not sure what religious-warry about that.

I expect a certain level of sanity here though. :P
Oh, I see. You weren't accusing me of religious-warring... just being insane. Well, I guess that's okay then.

But your solution only causes me problems when I want to save a jpg file following "the standard", whatever that is.

Well, first, like I said, I wouldn't actually use that kind of "undiscriminated union" interface for a constant (even if we were talking about a dynamic language in the first case). It was just an offhand comment that I didn't think through.

But in any case, the business about using an array in the first place was under the assumption that the constant was being used for identifying file extensions that the software would accept (as stated in the OP). I did mention that if you were using the constant for writing file names, it should be a single string. The idea was you need a list of acceptable extensions you accept as input and a single canonical extension you use for output (if output is even an issue).

WarDaft wrote:Yes, so the beginning of every function in which you use it you now have to remember to do another test.
But... I'm only talking about a single function.*
* Besides, as I said, I'd only really do this for a function parameter. A function parameter by definition only occurs in a single function.
(But if I did use it in the situation described, the constant would only be used in the function that checks the input file.)

User avatar
Quizatzhaderac
Posts: 1600
Joined: Sun Oct 19, 2008 5:28 pm UTC
Location: Space Florida

Re: Poll! What is the value of this constant?

Postby Quizatzhaderac » Tue Sep 02, 2014 5:37 pm UTC

troyp wrote:
WarDaft wrote:Yes, so the beginning of every function in which you use it you now have to remember to do another test.
But... I'm only talking about a single function.*
* Besides, as I said, I'd only really do this for a function parameter. A function parameter by definition only occurs in a single function.
(But if I did use it in the situation described, the constant would only be used in the function that checks the input file.)
Dynamic type languages have their purposes, but you really don't want to try to force dynamic practices into static languages. In Java, String and String[] only have an ancestor in Object, which means a method that can take both, can take anything.
Wardaft, I think Troyp was suggesting something like

Code: Select all

public boolean checkFile(String fileName, Object extension) {

   if(extension instanceof String) {
      return fileName.endsWith( (String) extension);
   } else if (extension instanceof String[]) {
      boolean matchesAny = false;
      
      for(String anExtension : (String[]) extension) {
         matchesAny = matchesAny || fileName.endsWith( anExtension);
      }
      return matchesAny;
   } else {
      throw new IllegalArgumentException("Extension must be String or String array");
   }
}
Where the type checking is in the actual method itself.

Troyp, in Java, if you wanted that flexibility, you'd typically overload the method, one version taking a String, one taking an array of Strings. It's the caller's responsibility to know what to do if they don't have an argument of the correct type.

tendays wrote:Why are people so afraid of string literals?
They're afraid the developer will change, then the needed value of the string will change, then only one of the values of the literal will change. I've personally gotten into the habit (when I change of string literal) of looking up all other instances of the literal and inspecting the surrounding code to guess if they need to be changed to.

I don't really see this in your case; I doubt the "pdf2" file format will become popular (or real) soon.
Last edited by Quizatzhaderac on Thu Jun 04, 2015 3:25 pm UTC, edited 2 times in total.
The thing about recursion problems is that they tend to contain other recursion problems.

User avatar
Sizik
Posts: 1225
Joined: Wed Aug 27, 2008 3:48 am UTC

Re: Poll! What is the value of this constant?

Postby Sizik » Tue Sep 02, 2014 6:59 pm UTC

Quizatzhaderac wrote:In Java, String and String[] only have an ancestor in Object, which means a method that can take both, can take anything.


*cough*

Code: Select all

public void foo(String... arg)
{
    for(String a : arg) System.out.println(a);
}
gmalivuk wrote:
King Author wrote:If space (rather, distance) is an illusion, it'd be possible for one meta-me to experience both body's sensory inputs.
Yes. And if wishes were horses, wishing wells would fill up very quickly with drowned horses.

User avatar
Quizatzhaderac
Posts: 1600
Joined: Sun Oct 19, 2008 5:28 pm UTC
Location: Space Florida

Re: Poll! What is the value of this constant?

Postby Quizatzhaderac » Tue Sep 02, 2014 8:08 pm UTC

But at that point you've already determined if you're passing a String or an array. If you try to pass an uncasted object you'll get a compile error. Wardaft was discussing testing before the call, so that implied a context where it was unclear what the type of the constant was.
The thing about recursion problems is that they tend to contain other recursion problems.

User avatar
tendays
Posts: 957
Joined: Sat Feb 17, 2007 6:21 pm UTC
Location: HCMC

Re: Poll! What is the value of this constant?

Postby tendays » Thu Sep 04, 2014 2:13 am UTC

In our case the standard behaviour seems to be that if two or more identical String literals appear in in the codebase we're automatically supposed to replace all of them by a reference to a constant. That seems to imply some assumptions:
  • All string constants are likely to change in the future
  • When one of them changes, then all of them will change the same way
  • While it is easy to change the value of a single constant, it is difficult and error-prone to change the value of several string literals.
  • Reading code using a constant name is at least as easy as code using string literals, or even easier.
In this case the first assumption is extremely unlikely, the second one is plain wrong (if we decide to accept jpeg files that doesn't mean that now we'll be generating jpeg files as well), and the third one is not true either (if we change the value of the constant we still need to check every place it is used, and just changing the name of generated files won't suddenly make the pdf generator generate jpeg files). The funny thing is that the reviewer found the occurences of ".pdf" by doing a text search in the codebase, which probably took him one minute and a half to do.
In the highly unlikely case that the extension used for pdf files changes from "pdf" to "pfd" as in Portable Format for Documents because it is discovered that the three letter acronym "PDF" belongs to the Parkinson's Disease Foundation, so that anyone using it has to pay them a thousand dollars in royalties, we can just do the exact same text search and replace things on a case by case basis (it's likely we'll be accepting uploads of x.pdf and x.pfd in that hypothetical scenario).
The fourth assumption has been demonstrated wrong by this poll. If we don't choose constant names carefully they make the code harder to read and understand and especially more error prone (I guess we could have written functions isPdfFilename(filename) = filename.endsWith(".pdf"), and createPdfFilename(basename) = basename + ".pdf", that's probably pointless compared to plain literals but at least wouldn't have this ambiguity)

Of course (as some of you have been saying) constants do have their place. For instance having a constant PERMITTED_FILE_EXTENSION="pdf" or (better) PERMITTED_FILE_EXTENSIONS=("pdf", "jpeg", "jpg", "whateverelse") would make sense. While it's unlikely that the extension of pdf files ever changes, it's a possibility that we change the set of file extensions that we accept. I also use extensions when using String identifiers to refer to objects like i18n keys, HTTP GET parameter names or whatever, if those appear more than once.

heatsink wrote:If the extension format begins with a dot (".pdf"), are extensions without a dot allowed? If the answer is no, don't include it in the extension string. The only thing the dot character achieves is to introduce a new kind of invalid data—extension strings that don't begin with dot—that you have to reason about and check for.

In general, if you store data in a format that admits meaningless values, then there's a risk of errors from using meaningless values. You might forget an error check, or someone else may misunderstand the documentation and add buggy code. By picking the right format, you declare what values are meaningful in a way that's enforced by the programming language.

"The only thing that the dotless format achieves is to introduce a new kind of invalid data—extension strings that begin with dot—that you have to reason about and check for."
[offtopic]
Interesting article, by the way, even if I think it goes a bit too far.
Spoiler:
I've tried a few times in Java to use one of the Optional<X> types instead of nullable references, but then I realised that it just creates noise. People who remembered to do null-checks before will think to do isPresent()-checks, people who didn't think about that just start seeing get() as part of the method call. Instead of doing meal = whatsForDinnerToday() they'll do meal = whatsForDinnerToday().get(). If they're motivated they might add a comment "dinner is never null at this location" even when it might be null, and it still fails at runtime just like with plain null references.
He also advocates using return types instead of exceptions. There were a few places in our projects that used return types to signal failure, and developers didn't always remember to check the return value, so even at runtime, no errors, no warnings, just that the thing the user tried to do didn't actually take effect and we get bug reports "I do this, it seems to work, but when I click reload it goes back to the previous version".
[/offtopic]
<Will> s/hate/love/
Hammer wrote:We are only mildly modly. :D
Beware of the shrolymerase!

korona
Posts: 495
Joined: Sun Jul 04, 2010 8:40 pm UTC

Re: Poll! What is the value of this constant?

Postby korona » Thu Sep 04, 2014 9:41 am UTC

heatsink wrote:In general, if you store data in a format that admits meaningless values, then there's a risk of errors from using meaningless values. You might forget an error check, or someone else may misunderstand the documentation and add buggy code. By picking the right format, you declare what values are meaningful in a way that's enforced by the programming language.

I disagree with that article on almost every point, at least when talking about languages like Java, C# and C++.

Using a Null Object instead of null only makes sense if that object can provide a meaningful implementation of all methods. Using a Maybe/Optional/etc. return type makes sense when an empty result is part of the semantics of your function. It makes sense to write a Optional<Int> parseInt(String) function (to signal that the given string does not represent an integer) but it doesn't make sense to write a Optional<Char> readChar(Stream) function (to signal that an IO error occurred). Use assertions and exceptions to signal critical failure.

In general that talk about "domains of primitive data types are too large" contains a lot of bullshit.
So this function has 2^2 = 4 possible implementations. Perhaps we could write a test case for each one.

That is just totally stupid. I don't think I have to explain why. The next paragraph compares the possible complexity of an Int f(Int, Int) function to the complexity of a {-1,0,1} g({0, 1}, {0, 1}) function. The special case that both parameters are 0 or 1 is a stupid toy example and almost never comes up in a real world program. If the parameters are that simple you shouldn't use a function but a simple "if" expression. Are they seriously suggesting to build a three-valued type/enum with values "less", "equal", "greater" in a language like Java or C#? How much syntactical ugliness does that introduce? It is standard practice to use -1, 0, 1 return values for comparison operations. Doing something else certainly violates the principle of least surprise. It is as bad as the famous "true", "false", "file not found" enum.

Most of those practices make sense in languages like Haskell and Scala that have type systems that can properly deal with complex data types. But please don't try to write Haskell code in Java, C# and C++. The proper suggestion is "use Scala" and not "write Scala code in Java". If you have to write Java code then use Java idioms.


Return to “Coding”

Who is online

Users browsing this forum: No registered users and 4 guests