Understanding Code Bases / Refactoring [Discussion,Research]

A place to discuss the implementation and style of computer programs.

Postby Breakfast » Thu Feb 23, 2012 5:25 pm UTC

So I'd like to know how individuals familiarize themselves with an existing code base of any size. Are there specific steps that you follow? What is your thought process behind choosing those steps and also the thought process that you use when carrying out those steps? How long did it take you to develop these skills and how long does it take you to progress through your steps? Do you even have a process or do you only learn what code you need as you work?

Where and how does refactoring play into this process (if at all)? Do you only refactor code as needed or will you do it to help you understand what is going on? What refactoring techniques do you find particularly useful? How about seldom used ones?

I’m hopeful that there will be some good responses, and if there are I will edit this first post to reflect the information gathered and to modify or add questions as they arise.

If anyone can think of a better thread title let me know and I’ll change it.
Breakfast
 
Posts: 91
Joined: Tue Jun 16, 2009 7:34 pm UTC
Location: Coming to a table near you

Re: Understanding Code Bases / Refactoring [Discussion,Resea

Postby Pepve » Thu Feb 23, 2012 9:38 pm UTC

I'm very skeptical of our ability to read code in general, let alone to understand sizeable code bases by reading them. Documentation is usually even worse (because it lags behind or was never written at all).

So I usually just go at it. I start some refactorings that I never finish (because you cannot properly refactor when you do not understand the code or the problems at hand). I look for APIs, contracts, architectural patterns, even source directory layout. And I just start implementing features, solving issues, doing the grunt work, and most importantly having my code reviewed by those who do know the software. At a certain point you start to understand how the code actually works, and a while later you'll see where it needs to go. After that, refactoring is just a part of the development effort.

This can take hours to months.
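
For the "source directory layout" part, a quick survey script can give a first map before reading any code. This is only a rough sketch and not something from the post above: the function name summarize_layout and the choice to skip hidden directories are assumptions made for illustration.

```python
import os
from collections import Counter


def summarize_layout(root):
    """Map each top-level directory under root to a Counter of file extensions."""
    summary = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune hidden directories such as .git in place (assumed to be noise).
        dirnames[:] = [name for name in dirnames if not name.startswith(".")]
        relative = os.path.relpath(dirpath, root)
        # Files directly in root are grouped under ".".
        top_level = "." if relative == "." else relative.split(os.sep)[0]
        counts = summary.setdefault(top_level, Counter())
        for filename in filenames:
            extension = os.path.splitext(filename)[1] or "(none)"
            counts[extension] += 1
    return summary
```

Calling summarize_layout on a checkout and printing the result shows roughly where the source, tests, and build files live, which can help decide where to start reading.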
Pepve
 
Posts: 57
Joined: Wed Jul 28, 2010 9:47 am UTC

Re: Understanding Code Bases / Refactoring [Discussion,Resea

Postby Breakfast » Thu Feb 23, 2012 11:00 pm UTC

I agree that any kind of formal documentation is typically a bad way to go about learning a code base. Requirements change too often and with a project of any substantial size it's probably been in development for years. This definitely makes the documentation lag behind current implementation. However, you mention code reviews and getting information from those that do understand the software and I have to wonder if there could be a good way to abstract that into documentation or the project itself. What if you had to make a change to a part of the code that no member of the current team has any experience with? This question is meant to remove outside sources from the list of resources at your disposal so that you must rely on only your skills and prior knowledge. This does not, however, exclude references that can be found online because those will not, in most circumstances, provide explicit solutions. They'll merely give you concepts that you can then implement.

Does knowledge of specific design patterns help or does it merely confuse when the project implements only parts of a pattern and then merges another pattern into it?

If you just "go for it", what do you think would help to reduce the time it takes to become familiar with a system (e.g. better understanding of patterns, documentation, better up-front knowledge transfer...)? I agree that at some point you do start to understand what is happening in the code base, but depending on the process used to gain that understanding, two programmers could take very different amounts of time to accomplish the same tasks.
Breakfast
 
Posts: 91
Joined: Tue Jun 16, 2009 7:34 pm UTC
Location: Coming to a table near you

Re: Understanding Code Bases / Refactoring [Discussion,Resea

Postby Webzter » Sun Feb 26, 2012 3:35 am UTC

Breakfast wrote:So I'd like to know how individuals familiarize themselves with an existing code base of any size. Are there specific steps that you follow? What is your thought process behind choosing those steps and also the thought process that you use when carrying out those steps? How long did it take you to develop these skills and how long does it take you to progress through your steps?


Well, here's exactly the process I used for my last major refactoring.

1. Start supporting the code. Take care of bug fixes. When I fix a bug, I generally introduce an automated unit test to verify the fix.
2. Start adding features to the code. When I add a feature, I generally introduce automated unit and behavior tests to verify the feature.
3. From steps 1 and 2, trouble spots naturally start to jump out. I never refactor because "that's not how I would have done it". I only refactor if the code is a significant source of bugs, requires a significant percentage of QA effort (from yourself or others), or is generally making itself a nuisance.
4. Write automated unit tests, behavior tests, and integration tests that are green against the code I'm going to refactor. (Integration tests are real where that's trivial and stubbed where it's not: calls to a local DB are trivial; calls to a remote queue on an AS/400 are not.) Then refactor and ensure the tests are still green.
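
Step 4 is sometimes called a characterization (or pinning) test. Here is a minimal sketch in Python, assuming a made-up legacy function: legacy_shipping_cost, shipping_cost, PINNED_CASES and find_mismatches are all hypothetical names, not code from the project described here. The expected values are recorded from what the old code does today, right or wrong, and the same cases are then run against the refactored version.

```python
# Hypothetical legacy function standing in for inherited, untested code.
def legacy_shipping_cost(weight_kg, express):
    cost = 0
    if weight_kg <= 1:
        cost = 5
    else:
        cost = 5 + (weight_kg - 1) * 2
    if express == True:
        cost = cost * 1.5
    return cost


# Expected values recorded by running the legacy code, not taken from a spec.
PINNED_CASES = {
    (0.5, False): 5,
    (1, False): 5,
    (3, False): 9,
    (3, True): 13.5,
    (10, True): 34.5,
}


def find_mismatches(cost_function):
    """Return the pinned cases where cost_function disagrees with the record."""
    return [
        (args, expected, cost_function(*args))
        for args, expected in PINNED_CASES.items()
        if cost_function(*args) != expected
    ]


# Green against the old code first...
assert find_mismatches(legacy_shipping_cost) == []


# ...then refactor (a hypothetical cleaned-up version of the same logic)...
def shipping_cost(weight_kg, express):
    base = 5 + max(weight_kg - 1, 0) * 2
    return base * 1.5 if express else base


# ...and confirm the same cases are still green.
assert find_mismatches(shipping_cost) == []
```

The point of pinning outputs instead of writing tests from the requirements is that, on a code base with no comments and no tests, the current behavior is often the only reliable specification available.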

This was on a fair (but not large) code base that had no comments and no unit tests. I supported it for 4-6 months and then rewrote it in my spare time over another month. I took it from around 120k lines with no unit tests to 17k lines including unit tests and significantly cut down the QA effort (partially because this actually recorded expected behaviors for the QA team and updated / made consistent the business requirements).

At my current job, a good chunk of what I do is greenfield development so it's less of an issue. For the parts that were inherited, I largely follow the same process as above just on a much compressed timeline (because the chunks I generally go after are much smaller).

A book I particularly enjoy is Refactoring to Patterns by Joshua Kerievsky. It might help with some situational refactoring.
Webzter
 
Posts: 179
Joined: Tue Dec 04, 2007 4:16 pm UTC
Location: Michigan, USA

Re: Understanding Code Bases / Refactoring [Discussion,Resea

Postby Breakfast » Tue Feb 28, 2012 12:53 am UTC

Thank you for the response; I'll look into Refactoring to Patterns. I'm not particularly looking for help when it comes to refactoring, testing, or understanding code, but what I've come to notice is that more experienced developers accomplish those tasks much faster than someone who is newer to the profession. While this may seem obvious, I think it's important to look at it, understand it, and come up with techniques to close that gap faster.

So what I'm really looking at from your comment is the first sentence of step 1. "Start supporting the code." Does this mean your first exposure comes with your first bug and you start encapsulating objects and patterns as you work? Or is there some preparatory work that comes before that? If so, is it a knowledge transfer, documentation, or some combination of many other things?

I'm essentially trying to qualify the knowledge that experience gets us, in the hope of transferring that knowledge better and aiding the training and teaching of new developers.
Breakfast
 
Posts: 91
Joined: Tue Jun 16, 2009 7:34 pm UTC
Location: Coming to a table near you

Re: Understanding Code Bases / Refactoring [Discussion,Resea

Postby Steax » Fri Mar 02, 2012 5:44 am UTC

For me, it's descriptive function names and descriptive variables. Stuff like that. Simply finding code where I have to mentally fill in variables is a giant pain in the rear. Even with for loops, if the loop is any longer than, say, 20 lines, I suggest using some other variable name than "i".
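
To make the naming point concrete, here is a small sketch with invented data: ORDERS, totals_terse and totals_by_customer are hypothetical names for illustration only. Both functions compute the same result; the first forces the reader to work out what i, j, o, t and c stand for, while the second says so directly.

```python
# Hypothetical sample data: each order has a customer and a list of line items.
ORDERS = [
    {"customer": "ann", "items": [{"price": 10, "quantity": 2},
                                  {"price": 5, "quantity": 1}]},
    {"customer": "bob", "items": [{"price": 7, "quantity": 3}]},
    {"customer": "ann", "items": [{"price": 1, "quantity": 4}]},
]


def totals_terse(o):
    # Terse names: the reader has to reconstruct what each letter means.
    t = {}
    for i in range(len(o)):
        for j in range(len(o[i]["items"])):
            c = o[i]["customer"]
            t[c] = t.get(c, 0) + o[i]["items"][j]["price"] * o[i]["items"][j]["quantity"]
    return t


def totals_by_customer(orders):
    # Descriptive names: the loop body reads close to the business rule.
    totals = {}
    for order in orders:
        for item in order["items"]:
            line_total = item["price"] * item["quantity"]
            customer = order["customer"]
            totals[customer] = totals.get(customer, 0) + line_total
    return totals
```

The behavior is identical, so a rename like this is a cheap refactoring to try while getting familiar with unfamiliar code.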
In Minecraft, I use the username Rirez.
Steax
SecondTalon's Goon Squad
 
Posts: 3037
Joined: Sat Jan 12, 2008 12:18 pm UTC

Re: Understanding Code Bases / Refactoring [Discussion,Resea

Postby EvanED » Fri Mar 02, 2012 5:47 am UTC

Heck, I'd put that at, I dunno, more like 5 lines. Depends a bit on the context and type of loop too. Some places I'd recommend a better name even if it were 2 lines; others it's fine if it's much longer.
EvanED
 
Posts: 4160
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI

Re: Understanding Code Bases / Refactoring [Discussion,Resea

Postby Breakfast » Wed Mar 07, 2012 4:24 pm UTC

Foolishly, I hadn't read Code Complete until now, and it seems to cover a good deal of the concepts and ideas that I am looking for. It also covers ideas that have been mentioned here, such as descriptive variable names and the appropriate length of functions and loops.

So now that I have good resources for these ideas, the question becomes, "We can teach these techniques, but how do we present them in a way that is easy to understand and that will also stick?" I believe this is harder to do at a professional level because experienced programmers might have experience in using bad practices, and since that's the way they've "always done it" they would be resistant to change.

Any additional resources or comments would be greatly appreciated.
Breakfast
 
Posts: 91
Joined: Tue Jun 16, 2009 7:34 pm UTC
Location: Coming to a table near you

Re: Understanding Code Bases / Refactoring [Discussion,Resea

Postby Steax » Wed Mar 07, 2012 4:44 pm UTC

Teaching by example has always been the way it works for me. Pretty much all (good) beginner programmers will want to contribute to or use, at some point, open-source code. I've found that this teaches them, the hard way, why code standards exist, what they mean, and how to use them. I think most of us have found a library we desperately need, but with horribly inadequate documentation, forcing us to dig into the code. The horrors of getting lost in code you need but cannot understand just... can't be overstated.

That's also why sites like The Daily WTF exist. They let programmers everywhere sanity-check what they're doing, and see if they're making horrible mistakes pretty early on. The whole "we'll laugh at your complete failure if you do this" atmosphere helps.
In Minecraft, I use the username Rirez.
Steax
SecondTalon's Goon Squad
 
Posts: 3037
Joined: Sat Jan 12, 2008 12:18 pm UTC

