## Code duplication vs clarity

A place to discuss the implementation and style of computer programs.

Moderators: phlip, Moderators General, Prelates

ucim
Posts: 6757
Joined: Fri Sep 28, 2012 3:23 pm UTC

### Code duplication vs clarity

Ok, so I have incoming data, which is supposed to include some subset of fields A, B, C, D, and E. Depending on which fields are included, I do things: some subset (in order) of 1, 2, 3, 4, 5, 6.

For example:
if data includes only A, I do thing 1, thing 3, and thing 4.

If data includes A and C, I do thing 2.

If data includes B, C, and D, I do thing 1, thing 2, and thing 3.

etc.

So, I have three choices:

1: I can FIRST figure out which subset of fields I have, and THEN use a switch to do the proper subset of things.

Code: Select all

`if (A)  case=foo;if (A and C)  case=bar;if (B, C, and D)  case=baz;etc;switch (case)  foo:    thing1; thing3; thing4; break;  bar:    thing2; break;  baz:    thing2; thing3; break;etc;end switch`

2: I can do the things in order, and for each thing, figure out whether the data combination means I should do that thing in the first place.

Code: Select all

`if (A)  thing1;if (A and C) or (B and C and D)  thing2;if (A or (B and C and D))  thing3;if (A)  thing4;etc;`

3: I can do both, but it seems to be the long way around:

Code: Select all

`if (A)  case=foo;if (A and C)  case=bar;if (B, C, and D)  case=baz;etc;if (case==foo)  thing1;if (case==bar or case==baz)  thing2;if (case==foo or case==baz)  thing3;if (case==foo)  thing4;etc;`
...and then I wonder whether or not I should gather up the various combinations of foo, bar, and baz that lead to thing1, thing2... before calling them.

4: I can try to get clever using fall-thrus and such to reduce the duplicated code, but this is fragile.

5: something else.

It doesn't matter what the A, B, C, thing1, thing2... is. I've run into this general pattern before.

Further, sometimes thing-n comes in variations. For example, if the data associated with A is a particular (kind of) value, then thing3 uses that value, otherwise thing3 uses some default. As a concrete example, there are message boards (Science-1, Science-2, Math-1...) which are displayed one way, but TeacherLounge-# boards (of the same structure) are displayed differently. And the -0 boards (i.e. Science-0) are not displayed to the public, whereas the -n (n>0) are.

Is there any good way of thinking about it that would help choose between #1, #2, and #3? #3 seems the best to me, but none of them are elegant. Any better ways to reduce code duplication? Am I "doing it all wrong" (without getting into haskell-like languages)?

Jose
Order of the Sillies, Honoris Causam - bestowed by charlie_grumbles on NP 859 * OTTscar winner: Wordsmith - bestowed by yappobiscuts and the OTT on NP 1832 * Ecclesiastical Calendar of the Order of the Holy Contradiction * Heartfelt thanks from addams and from me - you really made a difference.

Flumble
Yes Man
Posts: 2183
Joined: Sun Aug 05, 2012 9:35 pm UTC

### Re: Code duplication vs clarity

I think it's mostly a matter of taste or highly dependent upon use case and environment (consistency with other libraries/projects improves clarity).
When the thing<k> operations mutate the data and require a particular order, my preference would be the conditional dataflow of #2. Well, as long as the things can't alter the result of the conditions, otherwise the clarity is easily lost.

ucim wrote:(without getting into haskell-like languages)

Speaking of, maybe I'm doing it all wrong, but why is there no function Bool->(a->a)->a->a or f a->(a->Bool)->(a->f a)->f a or something similar in the standard libraries? Function or case notation is verbose in such simple cases.

Tub
Posts: 446
Joined: Wed Jul 27, 2011 3:13 pm UTC

### Re: Code duplication vs clarity

Let me throw another suggestion onto the pile:

Code: Select all

`if (A) actions=[thing1, thing3, thing4]if (A and C) actions=[thing2];if (B, C, and D) actions=[thing2, thing3];etc;foreach(methods as method) { ...}`

Whether the array contains function pointers/function objects that can be called in a loop, or entries of an ENUM (or simply strings) that are later handled with a switch statement, I think this variant does multiple things well:

a) the logic remains as clear as in example 1. Your example 2 needs to reorder all the logic, which is difficult to read and maintain
b) it is immediately obvious what each case does, because the list of actions is immediately below the corresponding if. Your examples 1 and 3 suffer, because "case=bar" doesn't tell the reader anything unless the reader scrolls down.
c) it separates the selection logic from the actions.

About your variations, it's difficult to say without knowing what kind of variations you have, but here are two inspirations:

Code: Select all

`if (A) { if (A has special value)   actions=[thing1, thing3_with_value, thing4] else   actions=[thing2, thing3_with_default, thing4]`

Code: Select all

`params = { thing3_uses_default: true, ...}; if (A) { actions=[thing1, thing3, thing4] if (A has special value)  params.thing3_uses_default = false;}`

I like the second one for its flexibility, but if you have few variations with simple conditions, the first one might be preferable for its lower complexity.

Soupspoon
You have done something you shouldn't. Or are about to.
Posts: 4053
Joined: Thu Jan 28, 2016 7:00 pm UTC
Location: 53-1

### Re: Code duplication vs clarity

I'd personally be tempted to look at the truth array, and optimise:
Currently, it's incomplete, but looks like:

Code: Select all

`     /--!A-\ /--A--\     !C  /--C--\ !C/ !D ??? ???  2  134!B\  / ??? ???  2  ???    D/  \ ??? 123 123 ???B\ !D ??? ???  2  ??? `

(Table-tags would still be nice, ah well. But what you have is an array covering all sixteen possible combos, populated by function numbers we know we want according to not-variable and is-variable positions.)

Obviously you'd have etcs, including more than four tests, and you may even want to change things later, but by looking for the largest groupings (together or wrapped or, with more vars to worry about, spread-spanned unless you switch variables around (in your head, or actually) I can see, forthis particular example, that I can quickly summarise as:

Code: Select all

`If (B & C & D) // defines two positions, in the simplest + largest + exclusivest way  things123;elseif (A & C) // defines four positions, but the A&B&C&D one has already been iffed,               // and this is an else to that...  thing2;elseif (A & !B & !D) // defines two positions, but A&C has already been dealt with                     // which doesn't mean you *can't* add !C to that  things134;else // room for catchalls  etcs???endif`

Note that instead of overriding the "foo" situation with the "bar" one, I've just pre-emptively used the barish call, if necessary, by shortcutting.

And, based on the assumption of logical groupings of thingNs, I've called an actual envelope "thingsXYZ" calle, here, although you don't have to. My thinking, though, is that it first of all forces you to anticipate a combination in code (things234 could be a typo, as could calling "thing1: thing1; thing3;" in longhand due to a copypasta error) and also enables you to validate and extend such specific sequences. (You're probably not using thingN functions but something like "undock(RadarWindow); resize(RadarWindow,quarter_screen); recentre(RadarWindow.data);" and a function "zoomFocus(RadarWindow)" could be a good envelope function to have, so that adding "scale(RadarWindow.data)" i to zoomFocus also deals with other zoomFocussing (e.g. with MapWindow as the param,, another side-docked element in this hypothetical).)

But, it depends on your style (and expectations as to who might be maintaining/porting/reverse-engineering the code in the future, including your future self!) and how malleable your decision-making process is possibly going to be both in development and in foreseeable future tweaks.

And I'd probably do it different in Perl. Define @doFuncs=() and then push (and, in extreme circumstances, splice) \$subRefs into that possibly in the form of an Iterated function, then while ( \$doFunc = shift @doFuncs) {&\$doFunc()}, or somesuch guff. [Which it appears Tub has just ninjaed me about, whilst I was perfecting my pseudo-table formatting, except in OO form.] But the logic of the pushing would probably be much the same, with or without a truth-array (depending on whether I think I could keep the workings all safely in my head.

ThemePark
Posts: 450
Joined: Fri Jun 27, 2008 5:42 pm UTC
Location: Århus, Denmark

### Re: Code duplication vs clarity

Since it's first and foremost a matter of whether each piece of data is included or not, I would do an alternate version of your first suggestion using binary, like this.

Code: Select all

`case = 0;if (A)  case &= 1;if (B)  case &= 2;if (C)  case &= 4;if (D)  case &= 8;if (E)  case &= 16;  switch (case)  1: // A only    thing1; thing3; thing4; break;  5: // A and C    thing2; break;  14: // B, C and D    thing2; thing3; break;etc;end switch`

As for your other problem, I would add code like to all switch cases where it is needed:

Code: Select all

`thing3value = defaultif (A == particularvalue)  thing3value = A`
I have traveled from 1979 to be a member of the unofficial board Council of Elders. Phear M3

ucim
Posts: 6757
Joined: Fri Sep 28, 2012 3:23 pm UTC

### Re: Code duplication vs clarity

Thanks for the ideas.

@Flumble - that's kind of what I suspected. Yes, thing<k> mutates the data.

@Tub: Brilliant. But then, instead of

if (A)
..methods=[ thing1, thing3, thing4 ];
if (A and C)..
..
foreach (methods as method) {...}

why not just do:

if (A)
{ thing1; thing3; thing4; }

which would be even more straightforward?

My reason for trying to avoid this is that the things being done and the order in which they are done are pretty much mostly the same in all the cases. So whether I put the thing strings in an array first, or just do them straightaway, I'm still duplicating the logic of "thing1 comes first, thing2 is next..."

I often get to this point by starting with (the logic I don't want to be duplicating):

thing1;
thing2;
thing3;
thing4;
thing5;

where each "thing" could be a function or could be a simple statement (like foo=bar;)

but then, if B is populated, we don't need thing2, and if C is populated and has the value 0, we need to do thing3a before thing3. Now it's

thing1;
if (B) then thing2;
if (C and C=0) then thing3a;
thing3;
thing4;
thing5;

Thing1, thing2, and thing3 set up a form, but if the only incoming data is A, then it's a virgin form.

if !(A and !B and !C and !D) // if not virgin
{
thing1;
if (B) then thing2;
if (C and C=0) then thing3a;
thing3;
}
thing4;
thing5;

At this point, the logic flow of thing12345 is getting lost, and there are a lot of if statements, but at least no code duplication (except perhaps having to evaluate B and C twice so far)

So (sez I)... evaluate A, B, C... once, creating foo, bar, and baz cases, and switch on them... and you can see where that's going.

An example of thing3a would be that usually the table to use is in A, but if it's a certain kind of data coming in (we have A, B, and D populated), the table to use is in D. So usually I'll have the statement
table=A['tablename'];
but sometimes I'll have the statement
table=D['name'];
but in all cases followed by:
do_stuff_with_(table);

@ThemePark - also brilliant. I'll have to remember that for the future. It's especially elegant for pure cases.

@Soupspoon - that solution is the (4) in my OP; you using "already if'd" and me using case fall-thrus. But though most efficient, it looks fragile. Is the matrix itself something you're using in the code, or just as a guide? (I actually did use a matrix in plpgsql code to do that very thing... it was a nightmare (plpgsql ain't good at that!) but the best way to accomplish the task at hand.)

Jose
Order of the Sillies, Honoris Causam - bestowed by charlie_grumbles on NP 859 * OTTscar winner: Wordsmith - bestowed by yappobiscuts and the OTT on NP 1832 * Ecclesiastical Calendar of the Order of the Holy Contradiction * Heartfelt thanks from addams and from me - you really made a difference.

Xanthir
My HERO!!!
Posts: 5389
Joined: Tue Feb 20, 2007 12:49 am UTC
Contact:

### Re: Code duplication vs clarity

ucim wrote:1: I can FIRST figure out which subset of fields I have, and THEN use a switch to do the proper subset of things.
...
2: I can do the things in order, and for each thing, figure out whether the data combination means I should do that thing in the first place.
...
3: I can do both, but it seems to be the long way around:
...
4: I can try to get clever using fall-thrus and such to reduce the duplicated code, but this is fragile.
...
5: something else.
...

#1 is bad; it's just a less-readable variant of #2. If the stuff you need to do is long enough to make #2 less readable, it should be in functions so the conditional structure remains paramount.

I've done this kind of code before; the right thing to do is to make a few boolean variables for "do_thing_1", "do_thing_2", etc. Then, first, go through all your logical conditions, setting the correct variables to true. Then just run through the six things based on whether the corresponding variable is set.

Further, sometimes thing-n comes in variations. For example, if the data associated with A is a particular (kind of) value, then thing3 uses that value, otherwise thing3 uses some default. As a concrete example, there are message boards (Science-1, Science-2, Math-1...) which are displayed one way, but TeacherLounge-# boards (of the same structure) are displayed differently. And the -0 boards (i.e. Science-0) are not displayed to the public, whereas the -n (n>0) are.

Same deal - abstract the data you're actually going to work on into variables, and in the conditionals, set them to the correct value. Then in the six things, work on those variables, rather than on your incoming data directly.
(defun fibs (n &optional (a 1) (b 1)) (take n (unfold '+ a b)))

PM 2Ring
Posts: 3684
Joined: Mon Jan 26, 2009 3:19 pm UTC
Location: Sydney, Australia

### Re: Code duplication vs clarity

ThemePark wrote:Since it's first and foremost a matter of whether each piece of data is included or not, I would do an alternate version of your first suggestion using binary, like this.

Code: Select all

`case = 0;if (A)  case &= 1;if (B)  case &= 2;if (C)  case &= 4;if (D)  case &= 8;if (E)  case &= 16;  switch (case)  1: // A only    thing1; thing3; thing4; break;  5: // A and C    thing2; break;  14: // B, C and D    thing2; thing3; break;etc;end switch`

Good concept, but you need to use bitwise OR, not bitwise AND. Remember, 0 & x == 0 for all x.

Tub
Posts: 446
Joined: Wed Jul 27, 2011 3:13 pm UTC

### Re: Code duplication vs clarity

ucim wrote:@Tub: Brilliant. But then, instead of

if (A)
..methods=[ thing1, thing3, thing4 ];
if (A and C)..
..
foreach (methods as method) {...}

why not just do:

if (A)
{ thing1; thing3; thing4; }

which would be even more straightforward?

Because I assumed that "thing1" is a placeholder for something more complicated, along the lines of
results.thing1 = thing1factory.getInstance().thing1(data.A, data.B, params.thing1_parameter);
and the point of the exercise was to avoid duplicating that code multiple times.

Now that you've given a more detailed example, I see why my suggestion might not be as useful for your specific problem.

I also see that your problems are way too diverse to abstract them away in a simplified model. There's no one-size-fits-all solution that's more concise than what you're already doing.

The thing is, your logic is inherently messy, and any possible way to translate that logic into code is going to end up messy. It is going to be difficult to read precisely because you're trying to communicate something non-trivial.
That being said, you can always improve readability. For one, I'm not sure if the single "case" variable is a good idea. You have multiple independent conditions and you should probably track them separately. So why not use multiple bools with clear names?

if !(A and !B and !C and !D) // if not virgin
{
thing1;
if (B) then thing2;
if (C and C=0) then thing3a;
thing3;
}
thing4;
thing5;

Code: Select all

`bool is_virgin_form = A and !B and !C and !D;bool has_b = B;bool has_c = C;bool c_is_zero = has_c and C=0;if (!is_virgin_form) {  thing1;  if (has_b)    thing2;  if (c_is_zero)    thing3a;  thing3;}thing4;`

There. Ok, most of my booleans have stupid names because I don't know what those checks do, but I hope you get the idea of proper, readable naming anyway. Also the idea of piecewise assembly. For example
if (A and C) or (B and C and D)
thing2;
if (A or (B and C and D))
thing3;

Would become

Code: Select all

`bool BCD = B and C and D;bool needs_thing2 = (A and C) or BCD;bool needs_thing3 = A or BCD;`

if your bools have better names than mine, that should increase readability and reduce duplication.

There are plenty of variants with a framework with inversion of control and dependency injection that would yield a code structure with very clean separation of tasks, but in the end you end up with 1000+ LoC for essentially the same result. That's overkill and won't increase readability.

Soupspoon
You have done something you shouldn't. Or are about to.
Posts: 4053
Joined: Thu Jan 28, 2016 7:00 pm UTC
Location: 53-1

### Re: Code duplication vs clarity

ucim wrote:@Soupspoon - that solution is the (4) in my OP; you using "already if'd" and me using case fall-thrus.
For fall-throughs, as I use the term, it'd be a different optimal combo, but could still be worked out from the same diagram. Although, for that limited example, I can't see a more efficient fall-through.

From a quick glance at the example, but I could be wrong...

Code: Select all

`case (A&C):  thing2();  thing2done=true;  if not(C&D) break; // part-time (mis-)match to prevent unwanted fallthroughcase (!B&C&D or A&!B&!C&!D):  thing1();  thing3();  if (thing2done or A) break; // another conditional break, tuned to exclude two situations  thing2(); // a 'repeat', if the break above hadn't caught thatcase (A&!B&!C&!D):  thing4();`

That looks messy, and I've pseudocoded the cases, probably... You might even pre-if to set "docase1=true" and "case (docase1)" instead, etc, but then it might as well be serially (rather than elseif 'next test if not'ting) or nestingly if-controlled in its entirety, and not even Cased at all in the first place.

(Depends on your real-life scenario. The logic of your actual requirement(s) could give a more logical fall-through routing than this example ever could.)

But though most efficient, it looks fragile. Is the matrix itself something you're using in the code, or just as a guide?
Here just a guide (and easier jotted down on paper, for planning purposes) but stored in such a text form as a comment just above the code lets you buffer the 'fragility' by referring to it and altering it (via anothe scrap of paper, if necessary!) whenever you need to revise the logic you're taking. And inform future readers of the approach currently taken.

(It's not unreasonable to consider storing the table in human-editable and machine-parsable form (just the one entry for both, that is!) for either 'precompiled' or on-the-fly evaluation, but would writing the 'solver' code be worthwhile? Easier than a natural-language parser, but probably a diversion left for when you have time on your hands, if at all... Unless something like it is already Public Domain or otherwise rippable-off with impunity... )

If a future revision changes the diagram to work better in a different and more efficient form then a trained eye could discover this, as well, rather than have to work with pasta-code1 to derive the current workings, then decide whether to re-engineer it from scratch, in-built (but rarely encountered and never noticed) logical errors and all...

Not that I'm exactly the saintly supplier of super-readable script, all most much enough of the time, but I can still pontificate upon preferable programming paradigms, right?

1 Not necassarily spaghetti, but could look sufficiently like it at first glance!

ThemePark
Posts: 450
Joined: Fri Jun 27, 2008 5:42 pm UTC
Location: Århus, Denmark

### Re: Code duplication vs clarity

PM 2Ring wrote:
ThemePark wrote:Since it's first and foremost a matter of whether each piece of data is included or not, I would do an alternate version of your first suggestion using binary, like this.

Code: Select all

`case = 0;if (A)  case &= 1;if (B)  case &= 2;if (C)  case &= 4;if (D)  case &= 8;if (E)  case &= 16;  switch (case)  1: // A only    thing1; thing3; thing4; break;  5: // A and C    thing2; break;  14: // B, C and D    thing2; thing3; break;etc;end switch`

Good concept, but you need to use bitwise OR, not bitwise AND. Remember, 0 & x == 0 for all x.

You are of course right, it needs to be |= (i.e. OR) to set the bits. I must've been half asleep when I wrote that.
I have traveled from 1979 to be a member of the unofficial board Council of Elders. Phear M3

ucim
Posts: 6757
Joined: Fri Sep 28, 2012 3:23 pm UTC

### Re: Code duplication vs clarity

Thanks for all your ideas. Yes, there are several different kinds of ways to group things, and I've been taking a case-by-case approach. Sometimes this leads to one form (examine input, decide on the kind of thing to do, and then switch (or if..elseif) to choose.) The advantage is each logical grouping is clear and well named:
if (new_event) { bunch_of_stuff;}
elseif (edit_event) {similar_bunch_of_stuff;}
...

Sometimes it seemed clearer to do something like:
if (edit_event) {retrieve_old_data;}
else {set_up_defaults;}
validate_data;
// even defaults - defend against future
populate_form;
output_page;
...

Sometimes other things seemed better, and I was wondering if, being self-taught, there was some overall technique or methodology I was missing.

Tub wrote:The thing is, your logic is inherently messy, and any possible way to translate that logic into code is going to end up messy.
...and the question in my mind is, is it messy because I haven't thought it through properly, or is it irreducibly messy because that's just the nature of this particular problem.

Things that are almost the same aren't exactly the same, and I guess sometimes the differences are best expressed by near-duplication, and sometimes by embedded conditionals.

Tub wrote:Also the idea of piecewise assembly.
Yes, I do that; I probably should to it more. I also have the (probably bad) practice of using deep arrays directly. For example,

part_number is the index of an array of part.
part is an array of attribute (such as name, weight, price...)

So, given the part number, its name is part[part_number][name]

I get an order, which is an array of fields, one of which is items, which is an array of item. Each item is an array(quantity, part_number).

order[items][3][part_number] is the part number of the third item being ordered.

It's name, of course, is

part[order[items][3][part_number]][name]

etc.

So I then
do_something_with(part[order[items][3][part_number]][name]) // part name

Leaving it as is keeps it at one line of code. I could otherwise do:

orderline = order[items][3]
this_part_number = orderline[part_number]
this_part_name = part[this_part_number][name]
do_something_with(this_part_name)

which is sort of clearer, but turns fifteen lines into sixty, and makes me have to think of new names for the same thing. (The two hardest things are...)

What I have been doing is sticking with the long array notation until I can't stand it, and then taking a big shortcut. i.e. doing something like

this_name = part[order[items][3][part_number]][name]
do_something_with(this_name)

What should I be doing?

Jose
Order of the Sillies, Honoris Causam - bestowed by charlie_grumbles on NP 859 * OTTscar winner: Wordsmith - bestowed by yappobiscuts and the OTT on NP 1832 * Ecclesiastical Calendar of the Order of the Holy Contradiction * Heartfelt thanks from addams and from me - you really made a difference.

Tub
Posts: 446
Joined: Wed Jul 27, 2011 3:13 pm UTC

### Re: Code duplication vs clarity

ucim wrote:Yes, I do that; I probably should to it more. I also have the (probably bad) practice of using deep arrays directly. For example,

part_number is the index of an array of part.
part is an array of attribute (such as name, weight, price...)

So, given the part number, its name is part[part_number][name]

Shouldn't that be part[part_number].name or ->name or even ['name']? What language are you using, and why aren't you using proper classes with named attributes?

ucim wrote:So I then
do_something_with(part[order[items][3][part_number]][name]) // part name

Yeah, been there, done that. I usually add a bunch of spaces around the inner variable to make it easier to parse:

Code: Select all

`do_something_with(part[  order[items][3][part_number]  ][name])`

Of course there's no code formatter that I know of that would preserve such formatting.

Also, as a rule of thumb, if you feel like you need to explain a line of code with a comment: don't. Improve or split the line instead.

For example, if you're looping over an array, alias the current element instead of accessing the array. Compare these two php snippets; I would prefer the second.

Code: Select all

`for (\$i=0;\$i<count(\$orders);\$i++) {  \$partname = \$parts[  \$orders[\$i]->part  ]->name;}`

Code: Select all

`foreach (\$orders as \$order) {  \$partname = \$parts[ \$order->part ]->name;}`

ucim wrote:which is sort of clearer, but turns fifteen lines into sixty

Well, terseness does not improve clarity by default (see IOCCC), but neither does verbosity (see java).

ucim
Posts: 6757
Joined: Fri Sep 28, 2012 3:23 pm UTC

### Re: Code duplication vs clarity

Tub wrote:Shouldn't that be part[part_number].name or ->name or even ['name']? What language are you using, and why aren't you using proper classes with named attributes?
It's pseudocode, dimly based on (PHP hidden to protect the sensitive), which is what my latest project is in. I needed a scripting language, and chose it because it looked like C and I wanted to get going fast. I'd been out of coding since structural programming (C, pascal) was a thing, so missed out on the joys of OOP. And now, here I am. Sometimes I think I'm slowly inventing parts of OOP, but let's not go there! And converting the whole thing to something else (Python?) would be a huge project, and I need to know I have something marketable before I try that.

Jose
Order of the Sillies, Honoris Causam - bestowed by charlie_grumbles on NP 859 * OTTscar winner: Wordsmith - bestowed by yappobiscuts and the OTT on NP 1832 * Ecclesiastical Calendar of the Order of the Holy Contradiction * Heartfelt thanks from addams and from me - you really made a difference.

Boson Collider
Posts: 13
Joined: Wed Oct 29, 2014 6:20 pm UTC

### Re: Code duplication vs clarity

Well, if we do get into Haskell-like languages, you get useful syntactic constructs like guards.

For example, this is how I would probably do this in Haskell:

Code: Select all

`function data    |B & C & D  = thing3.thing2 data    |A & C      = thing2 data    |A          = thing4.thing3.thing1 data    |otherwise  = data`

Furthermore, if A,B,C,D,E are not general boolean cases but just checking whether an element is in a list, you could write something like:

Code: Select all

`containsElems elements data = and ( map (\e -> elem e data) elements )`

which checks whether everything in the "elements" list is part of data. This can simplify the boolean expressions if they are clunky when expanded. This would give you:

Code: Select all

`function data    |containsElems [B,C,D] data = thing3.thing2 data    |containsElems [A,C] data   = thing2 data    |elem A data                = thing4.thing3.thing1 data    |otherwise                  = data`

Or more generally you could write helper functions that do something similar.

Your variation sounds like a place where pattern matching would be very useful.

Tyndmyr
Posts: 11443
Joined: Wed Jul 25, 2012 8:38 pm UTC

### Re: Code duplication vs clarity

Honestly, duplication isn't so bad, usually. Worth worrying about if the lack of it results in massive bloat, or performance issues or whatever, but most of the time, the clarity is handy down the line, and the optimization *really* isn't important.

It's a tradeoff, and as such, there isn't a single perfect answer you should always do, you decide based on the needs of your current app. That said, I find that *most* coders are prone to premature optimization in search of the mythical perfect code.