@Charlie! and the OP:
There is no contradiction. The time ordering operator is not a map from operators to operators; rather, it is a map that takes a tuple of time-dependent operators A(t), B(t), C(t), ... AND a tuple of times t1, t2, t3, ... and yields a single, fixed operator (clearly not time dependent). Ergo, one cannot just time order the product A(1)A(2), because that is just a time-independent operator. What one means by
T(A(1) A(2))
is actually T[A(..),A(..)](1,2) (read as "the time ordered product of the time dependent operator A(..) with itself evaluated at times (t1,t2)" ). The equality
A(t1)A(t2)=B(-t1)B(-t2)
for arbitrary t1 and t2 is just "numerical" equality of time independent operators and not an equality of tuples [A(..),A(..)](t1,t2) = [B(..),B(..)](-t1,-t2) since B(..)!=A(..) -- note that the fact that B(-t)=A(t) is entirely irrelevant as far as the tuples are concerned.
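To make the tuple point concrete, here is a minimal numerical sketch. The choice A(t) = X + tZ (with X and Z the Pauli matrices) and the helper names `matmul` and `time_ordered` are my own assumptions for illustration, not anything from the thread. It treats time ordering as a function of a list of operator-valued curves and a list of times, and shows that the plain products agree while the time-ordered ones do not.

```python
# 2x2 real matrices stored as nested tuples; enough to show non-commutativity.
X = ((0.0, 1.0), (1.0, 0.0))   # Pauli sigma_x
Z = ((1.0, 0.0), (0.0, -1.0))  # Pauli sigma_z


def matmul(p, q):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def A(t):
    """Hypothetical time-dependent operator A(t) = X + t Z."""
    return tuple(tuple(X[i][j] + t * Z[i][j] for j in range(2)) for i in range(2))


def B(t):
    """The time-reversed curve B(t) = A(-t)."""
    return A(-t)


def time_ordered(curves, times):
    """T[curves](times): evaluate each curve at its own time and
    multiply the results with later times standing to the left."""
    pairs = sorted(zip(times, curves), key=lambda pair: pair[0], reverse=True)
    result = ((1.0, 0.0), (0.0, 1.0))
    for t, curve in pairs:
        result = matmul(result, curve(t))
    return result


# "Numerical" equality of time-independent operators: A(1)A(2) = B(-1)B(-2).
plain_A = matmul(A(1), A(2))
plain_B = matmul(B(-1), B(-2))

# The tuples differ, and so do the time-ordered products.
t_A = time_ordered([A, A], [1, 2])    # equals A(2) A(1)
t_B = time_ordered([B, B], [-1, -2])  # equals B(-1) B(-2) = A(1) A(2)

print(plain_A == plain_B, t_A == t_B)
```

The input to `time_ordered` is the pair (curves, times), never the already-multiplied operator, which is exactly why the "numerical" equality above carries no information about the time-ordered products.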
The definition given by tooyoo using Heaviside's theta is perfectly rigorous, and can easily be generalized to arbitrary finite products of operators.
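For reference, the theta-function definition presumably has the standard form below. I am assuming bosonic operators (no extra sign factors) and leaving the equal-time convention unspecified. In the notation used above, for two operators and then for n operators:

```latex
T[A,B](t_1,t_2) = \theta(t_1-t_2)\,A(t_1)B(t_2) + \theta(t_2-t_1)\,B(t_2)A(t_1)

T[A_1,\dots,A_n](t_1,\dots,t_n)
  = \sum_{\sigma\in S_n}
    \theta\big(t_{\sigma(1)}-t_{\sigma(2)}\big)\cdots
    \theta\big(t_{\sigma(n-1)}-t_{\sigma(n)}\big)\,
    A_{\sigma(1)}\big(t_{\sigma(1)}\big)\cdots A_{\sigma(n)}\big(t_{\sigma(n)}\big)
```

For distinct times exactly one permutation survives in the sum, namely the one that puts the latest time on the left.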
Now the interesting reason there's an apparent paradox in using time ordering naively is that B(t)=A(-t) is time-orientation reversed with respect to A(t), as an operator-valued curve, and time ordering "detects" this of course -- otherwise it would not really be time "ordering" at all! Look at http://en.wikipedia.org/wiki/Differential_geometry_of_curves#Reparametrization_and_equivalence_relation for details.
Anyway, in my experience time ordered products in physics appear in exactly two contexts:
1) When solving for the propagator of an equation
\dot{\Phi}(t)=A(t)\,\Phi(t) \quad \text{and} \quad \Phi(0)=1
where A and Phi are linear operators, the solution can be expressed as a time-ordered exponential,
\Phi(t)=T\exp \left(\int_0^t A(s) \ ds\right)
2) Inside correlation functions
where one does not need to consider such subtleties.
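For context 1), a rough numerical sketch may help show what the T in front of the exponential actually does. Everything specific here is an assumption of mine for illustration: the piecewise-constant generator (A = X on [0, 1), A = Z on [1, 2], with X and Z the Pauli matrices), the helper names, and the crude Euler product used to approximate the ordered exponential.

```python
import math

I2 = ((1.0, 0.0), (0.0, 1.0))
X = ((0.0, 1.0), (1.0, 0.0))   # Pauli sigma_x
Z = ((1.0, 0.0), (0.0, -1.0))  # Pauli sigma_z


def matmul(p, q):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def lin(a, p, b, q):
    """Linear combination a*p + b*q of two 2x2 matrices."""
    return tuple(tuple(a * p[i][j] + b * q[i][j] for j in range(2)) for i in range(2))


def max_diff(p, q):
    """Largest entrywise absolute difference."""
    return max(abs(p[i][j] - q[i][j]) for i in range(2) for j in range(2))


def A(t):
    """Hypothetical generator: X on [0, 1), Z from t = 1 onwards."""
    return X if t < 1.0 else Z


def time_ordered_exp(gen, t_final, steps=20000):
    """Approximate T exp(int_0^t gen(s) ds) by an ordered product of
    Euler steps (1 + gen(t_k) dt), later steps multiplying from the left."""
    dt = t_final / steps
    phi = I2
    for k in range(steps):
        phi = matmul(lin(1.0, I2, dt, gen(k * dt)), phi)
    return phi


# Closed forms, using X^2 = Z^2 = 1 and (X + Z)^2 = 2.
exp_X = lin(math.cosh(1.0), I2, math.sinh(1.0), X)
exp_Z = lin(math.cosh(1.0), I2, math.sinh(1.0), Z)
ordered = matmul(exp_Z, exp_X)  # later generator on the left

r = math.sqrt(2.0)
naive = lin(math.cosh(r), I2, math.sinh(r) / r, lin(1.0, X, 1.0, Z))  # exp(X + Z)

phi = time_ordered_exp(A, 2.0)
print(max_diff(phi, ordered), max_diff(phi, naive))
```

The Euler product converges to exp(Z) exp(X) and stays well away from the unordered exponential exp(X + Z), because X and Z do not commute. With a commuting family A(t) the two would coincide and the T would be redundant.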
EDIT: @ the OP: As for mathematical formulations of path integration, you might want to read A Modern Approach to Functional Integration by Klauder. He gives a construction based on (his?) coherent-state path integral, which is a phase space path integral interpretation of the transition amplitude from one coherent state to another. It solves the problem nicely (at least for QM in flat space) for a rather large class of Hamiltonians, IIRC. Another related approach is expounded in Functional Integration: Action and Symmetries by DeWitt-Morette and Cartier, but that book is much more difficult to read. There's also the book by Glimm and Jaffe, but I have not read it yet. Finally, you may want to look into white noise analysis; one (possibly the only) textbook is Lectures on White Noise Functionals by Hida and Si Si, which I have only started to read, but you should be able to find introductory papers that aim at applying the results of the theory to QM and QFT.