Two Simple Tips to Speed up Python Time Parsing

  1. Date parsing in Python can take a surprisingly long time. It can be worth writing a custom datestring converter that sacrifices generality for speed.
  2. Another oddity: forcing the timezone can speed up code as well, like this: os.environ['TZ'] = 'GMT'

Both tips are demo'd and tested in the code snippet below.

import os
import time

def _convert_date(string, year=None):
    ''' take a log string; return (epoch seconds, time tuple, original string)

    >>> _convert_date('Aug 19 13:45:01', 2009)
    (1250689501, (2009, 8, 19, 13, 45, 1, 2, 231, 0), 'Aug 19 13:45:01')
    '''
    if year is None:
        year = time.gmtime()[0]

    tt = list(time.strptime("%s " % year + string, "%Y %b %d %H:%M:%S"))
    tt[-1] = 0  # turn off the DST flag
    tt = tuple(tt)
    ts = int(time.mktime(tt))
    return (ts, tt, string)

_months = dict(jan=1, feb=2, mar=3, apr=4, may=5, jun=6,
               jul=7, aug=8, sep=9, oct=10, nov=11, dec=12)

def _convert_date2(string, year=None):
    ''' take a log string; return (epoch seconds, time tuple, original string)

    >>> _convert_date2('Aug 19 13:45:01', 2009)
    (1250689501, (2009, 8, 19, 13, 45, 1, 2, 231, 0), 'Aug 19 13:45:01')
    '''
    if year is None:
        year = time.gmtime()[0]

    # was, but this profiled 4x slower:
    # tt = list(time.strptime("%s " % year + string, "%Y %b %d %H:%M:%S"))
    mon, d, t = string.split()
    h, m, s = t.split(":")
    mon = _months[mon.lower()]
    tt = tuple(int(v) for v in (year, mon, d, h, m, s, 0, 0, 0))
    ts = int(time.mktime(tt))
    tt = time.gmtime(ts)
    return (ts, tt, string)

assert _convert_date('Aug 19 13:45:01',2009) == _convert_date2('Aug 19 13:45:01',2009)

# %timeit is an IPython magic that is like timeit.Timer with brains --
# including figuring out heuristically how many loops to run.
# key fact: a microsecond (µs) is 1000 nanoseconds

%timeit _convert_date('Aug 19 13:45:01', 2009)
%timeit _convert_date2('Aug 19 13:45:01', 2009)
os.environ['TZ'] = 'GMT'
%timeit _convert_date('Aug 19 13:45:01', 2009)
%timeit _convert_date2('Aug 19 13:45:01', 2009)
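A caveat on the TZ trick, from my own experience rather than the timings above: on Unix, the C library may cache the timezone, so changing os.environ['TZ'] should be followed by time.tzset() to make sure the change is actually seen:

```python
import os
import time

os.environ['TZ'] = 'GMT'
if hasattr(time, 'tzset'):  # tzset() exists on Unix only
    time.tzset()

# With TZ=GMT in effect the UTC offset is zero, and mktime() can skip
# DST/offset lookups -- which is where the reported ~20% saving comes from.
print(time.timezone)  # 0 on Unix once the new TZ is in effect
```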

Results (Python 2.4.3 on x64 Linux):

In [10]: timeit _convert_date('Aug 19 13:45:01',2009)
10000 loops, best of 3: 62 µs per loop

In [11]: timeit _convert_date2('Aug 19 13:45:01',2009)
10000 loops, best of 3: 18.3 µs per loop

In [12]: os.environ['TZ'] = 'GMT'

In [13]: timeit _convert_date('Aug 19 13:45:01',2009)
10000 loops, best of 3: 60.2 µs per loop

In [14]: timeit _convert_date2('Aug 19 13:45:01',2009)
100000 loops, best of 3: 13.3 µs per loop

The Win Factor:

  • custom parser: ~3x faster (62 µs → 18.3 µs)
  • setting TZ: another ~25% on top of that (18.3 µs → 13.3 µs)

Feedback and additional speedup improvements welcome.

(Thanks to Jon Nelson of the Pycurious Blog for the TZ idea.)


Deepcopy is a Pig (For Simple Data)

“Being this easy ain’t cheap.”  “There’s no such thing as a free lunch.”

We’ve all heard these tropes before, right?  Sometimes, without testing, it’s hard to see exactly how much that lunch costs.  This week’s example:  Python’s copy.deepcopy.

I fancy myself as using a lot of functional programming techniques in my code, and as part of that, I try to avoid modifying data by side effect. Deepcopy makes it easy to copy the original structure, modify the copy, and return it. After some profiling and timing work, I saw that, of all things, deepcopy was the bottleneck!

Sure, it's bulletproof, battle-tested, and designed to do the Right Thing™ in almost every case! But for simple data structures it can be overkill, since it does so much bookkeeping and reference tracking.

Most of the data I see in my day job has simple formats: mainly dictionaries of lists, sets, strings, tuples, and integers — the basic Python types we know and love, easily representable (in plain text, HTML, tables), and easy to munge and transmit (using JSON or the like). In short, they're nice to work with, and transparent.

As it turns out, when we control the input data, we don't need to worry as much about robustness. Sure, the code below for "deepish_copy" doesn't handle classes, nested iterables, generators, or nesting to arbitrary depth. But it runs fast, as the speed results below show.

import timeit
from copy import deepcopy

def deepish_copy(org):
    '''
    much, much faster than deepcopy, for a dict of the simple python types.
    '''
    out = dict.fromkeys(org)
    for k,v in org.iteritems():
        try:
            out[k] = v.copy()   # dicts, sets
        except AttributeError:
            try:
                out[k] = v[:]   # lists, tuples, strings, unicode
            except TypeError:
                out[k] = v      # ints

    return out

def test_deepish_copy():
    o1 = dict(name = u"blah", id=1, att0 = (1,2,3), att1 = range(10), att2 = set(range(10)))
    o2 = deepish_copy(o1)
    assert o2 == o1, "not equal, but should be"
    del o2['att1'][-1]
    assert o2 != o1, "are equal, shouldn't be"

#prun for ii in xrange(1000):  o2 = deepcopy(o1)
#prun for ii in xrange(1000):  o2 = dc2(o1)

o1 = dict(name = u"blah", id=1, att0 = (1,2,3), att1 = range(10), att2 = set(range(10)))

a = timeit.Timer("o2 = deepish_copy(o1)","from __main__ import deepish_copy,o1")
b = timeit.Timer("o2 = deepcopy(o1)","from __main__ import deepcopy,o1")

# 64-bit Linux, 1 GHz chip, Python 2.4.3
a.repeat(3,number=20000)
# [0.45441699028015137, 0.41893100738525391, 0.46757102012634277]
b.repeat(3,number=20000)
# [2.5441901683807373, 2.5316669940948486, 2.4751369953155518]

Using the custom-written code speeds things up quite a bit (5-fold!). For me, where this copying *was* the bottleneck, and I have to iterate over hundreds of thousands of these things, it made a noticeable difference in total run time. The 10 minutes it took to write this code were worth it. So was profiling (using IPython's simple %prun macro).
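Outside IPython, the same hotspot hunting can be done with the standard-library profiler. A minimal sketch (cProfile here; on older Pythons the pure-Python profile module fills the same role):

```python
import cProfile
import io
import pstats
from copy import deepcopy

o1 = dict(name=u"blah", id=1, att0=(1, 2, 3),
          att1=list(range(10)), att2=set(range(10)))

# Profile many copies, then print the hottest calls by cumulative time.
prof = cProfile.Profile()
prof.enable()
for _ in range(1000):
    deepcopy(o1)
prof.disable()

buf = io.StringIO()
pstats.Stats(prof, stream=buf).sort_stats("cumulative").print_stats(5)
print(buf.getvalue())  # deepcopy should dominate the listing
```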

As always, to end with another cliche: your mileage may vary… but if you're not relying on the car manufacturer to design an engine for exactly your needs, you can probably improve it.


Len() calls can be SLOW in Berkeley Database and Python bsddb.

In my day-to-day coding work, I make extensive use of Berkeley DB (bdb) hash and btree tables. They’re really fast, easy-ish to use, and work for the apps I need them for (persistent storage of json and other small data structures).

So, my Python code was having all kinds of weird slowdowns, and it was the len() call (of all things) that was causing the issue!

As it turns out, the Berkeley database sometimes has to iterate over all keys to give a proper answer. Even the "fast stats" *number of records* call can require walking the whole database, because (for hash tables especially) the key count simply isn't stored anywhere.
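The workaround I settled on is to stop asking the database for its size and track the record count myself. A minimal sketch of the idea (mine, not from the references below), using a plain dict as a stand-in for a bsddb handle:

```python
class CountingStore(object):
    """Wrap a dict-like store (e.g. a bsddb handle) and maintain the
    record count explicitly, so len() never has to scan the backend."""

    def __init__(self, backend):
        self._db = backend
        self._count = len(backend)  # pay the full-scan cost once, up front

    def __setitem__(self, key, value):
        if key not in self._db:     # only new keys change the count
            self._count += 1
        self._db[key] = value

    def __delitem__(self, key):
        del self._db[key]
        self._count -= 1

    def __len__(self):
        return self._count          # O(1), no key iteration

store = CountingStore({})
store['a'] = '1'
store['b'] = '2'
store['a'] = '3'   # overwrite: count unchanged
print(len(store))  # 2
```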

References:
Jesus Cea's comments on why bdbs don't know how many keys they have
db_stat tool description
DB->stat API


Using Python “timeit” to time functions

Python’s timeit module is not intuitive for me.

Steve D'Aprano's thread on the python mailing list, Timing a function object…, and especially Carl Banks's response of


def foo(x):
    return x+1

import timeit
timeit.Timer("foo(1)","from __main__ import foo")

was a godsend!

Reading this hint contradicted my blithe statement to a friend this morning that "the only time __main__ appears in user code is to determine when a script is being run from the command line". Such ignorance and folly!
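One thing the snippet above glosses over: constructing the Timer doesn't run anything — you have to call .timeit() or .repeat() to get numbers. A small sketch of the full dance (the inline-setup variant is my addition, for when the code isn't running as __main__):

```python
import timeit

# The Timer object only *describes* the measurement; setup runs at
# timing time, not at construction time.
t = timeit.Timer("foo(1)", "from __main__ import foo")  # as in the post
# ...or, equivalently, inline the setup so it works anywhere:
t = timeit.Timer("foo(1)", "def foo(x):\n    return x + 1")

total = t.timeit(number=100000)                 # total seconds for 100000 calls
best = min(t.repeat(repeat=3, number=100000))   # min of 3 runs is customary
print(total, best)
```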


Quick and Dirty JSON Speed Testing in Python

[See updated post for analysis using timeit]

As per Poromenos' request on Reddit, I decided to expand a bit on my cryptic comment about the major JSON packages in Python (simplejson, cjson, demjson):

My conclusion: use demjson if you really, really want to make sure everything is right and you don't care at all about time. Use simplejson if you're in the 99% of users who want reasonable performance over a broad range of objects. And use enhanced cjson 1.0.3x if you're in the camp with reasonable JSON inputs and need much faster (10x) speed… that is, if the JSON step is actually the bottleneck.

More worrisome — demjson didn’t handle the unicode string I threw at it properly…

Benchmark code and more in-depth discussion
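The measurement itself is easy to reproduce. A rough sketch using the stdlib json module as a stand-in (an anachronism on my part — any of simplejson/cjson/demjson can be swapped into the same harness):

```python
import json      # stand-in; swap in simplejson / cjson / demjson to compare
import timeit

# A small mixed-type payload, including a non-ASCII string (the case
# that tripped up demjson above).
DATA = {"name": "blah", "ids": list(range(50)),
        "nested": {"ok": True, "vals": [1.5, 2.5], "note": "caf\u00e9"}}

def roundtrip():
    return json.loads(json.dumps(DATA))

assert roundtrip() == DATA  # sanity check before timing

# Best of 3 runs of 2000 round-trips each.
best = min(timeit.Timer(roundtrip).repeat(repeat=3, number=2000))
print("%.4f s per 2000 round-trips" % best)
```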


Sorting Dictionaries by Value in Python (improved?)

Sorting dictionaries by value in Python is a chronic annoyance. The dictionary implementation is very efficient, well thought out, and easy to use, but sorting by value isn't part of the spec. So I was pretty excited when I read this post on sorting by value at Digital Sanitation Engineering. Finally, a method that "is the fastest way to do this and it uses the least amount of memory. Enjoy."

Oh, enjoy I will.

But then, a wrench in the works. I'm not sure the claim is true. PEP 265 has a different opinion on the matter, and suggests a different idiom, so I decided to do a little testing.

(Preview for the time-impatient: the PEP 265 version is 10x faster than any of the others in this test.)

def sbv0(adict,reverse=False):
    ''' proposed at Digital Sanitation Engineering
    http://blog.modp.com/2007/11/sorting-python-dict-by-value.html '''
    return sorted(adict.iteritems(), key=lambda (k,v): (v,k), reverse=reverse)

def sbv1(d,reverse=False):
    '''  explicit list expansion '''
    L = [(k,v) for (k,v) in d.iteritems()]
    return sorted(L, key=lambda x: x[1] , reverse=reverse)

def sbv2(d,reverse=False):
    '''  generator '''
    L = ((k,v) for (k,v) in d.iteritems())
    return sorted(L, key=lambda x: x[1] , reverse=reverse)

def sbv3(d,reverse=False):
    ''' using a lambda to get the key, rather than "double-assignment" '''

    return sorted(d.iteritems(), key=lambda x: x[1] , reverse=reverse)

def sbv4(d,reverse=False):
    ''' using a formal function to get the sorting key, rather than a lambda'''
    def sk(x):  return x[1]
    return sorted(d.iteritems(), key=sk , reverse=reverse)

def sk(x):  return x[1]

def sbv5(d,reverse=False):
    ''' using a formal function, defined in outer scope
    to get the sorting key, rather than a lambda
    '''
    return sorted(d.iteritems(), key=sk , reverse=reverse)

from operator import itemgetter
def sbv6(d,reverse=False):
    ''' proposed in PEP 265, using  the itemgetter '''
    return sorted(d.iteritems(), key=itemgetter(1), reverse=reverse)

D = dict(zip(range(100),range(100)))

from profile import run

run("for ii in xrange(10000):  sbv0(D, reverse=True)")
run("for ii in xrange(10000):  sbv1(D, reverse=True)")
run("for ii in xrange(10000):  sbv2(D, reverse=True)")
run("for ii in xrange(10000):  sbv3(D, reverse=True)")
run("for ii in xrange(10000):  sbv4(D, reverse=True)")
run("for ii in xrange(10000):  sbv5(D, reverse=True)")
run("for ii in xrange(10000):  sbv6(D, reverse=True)")

>>> run("for ii in xrange(10000):  sbv0(D, reverse=True)")
         1020003 function calls in 5.500 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    10000    0.040    0.000    0.040    0.000 :0(iteritems)
        1    0.000    0.000    0.000    0.000 :0(setprofile)
    10000    2.880    0.000    5.440    0.001 <stdin>:1(sbv0)
  1000000    2.520    0.000    2.520    0.000 <stdin>:4(<lambda>)
        1    0.060    0.060    5.500    5.500 <string>:1(?)
        1    0.000    0.000    5.500    5.500 profile:0(for ii in xrange(10000):  sbv0(D, reverse=True))
        0    0.000             0.000          profile:0(profiler)

>>> run("for ii in xrange(10000):  sbv1(D, reverse=True)")
         1020003 function calls in 5.030 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    10000    0.030    0.000    0.030    0.000 :0(iteritems)
        1    0.000    0.000    0.000    0.000 :0(setprofile)
    10000    3.080    0.000    4.930    0.000 <stdin>:1(sbv1)
  1000000    1.820    0.000    1.820    0.000 <stdin>:4(<lambda>)
        1    0.100    0.100    5.030    5.030 <string>:1(?)
        1    0.000    0.000    5.030    5.030 profile:0(for ii in xrange(10000):  sbv1(D, reverse=True))
        0    0.000             0.000          profile:0(profiler)

>>> run("for ii in xrange(10000):  sbv2(D, reverse=True)")
         2030003 function calls in 8.900 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    10000    0.020    0.000    0.020    0.000 :0(iteritems)
        1    0.000    0.000    0.000    0.000 :0(setprofile)
    10000    4.460    0.000    8.870    0.001 <stdin>:1(sbv2)
  1010000    2.630    0.000    2.630    0.000 <stdin>:3(<generator expression>)
  1000000    1.760    0.000    1.760    0.000 <stdin>:4(<lambda>)
        1    0.030    0.030    8.900    8.900 <string>:1(?)
        1    0.000    0.000    8.900    8.900 profile:0(for ii in xrange(10000):  sbv2(D, reverse=True))
        0    0.000             0.000          profile:0(profiler)

>>> run("for ii in xrange(10000):  sbv3(D, reverse=True)")
         1020003 function calls in 4.860 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    10000    0.050    0.000    0.050    0.000 :0(iteritems)
        1    0.000    0.000    0.000    0.000 :0(setprofile)
    10000    2.650    0.000    4.820    0.000 <stdin>:1(sbv3)
  1000000    2.120    0.000    2.120    0.000 <stdin>:4(<lambda>)
        1    0.040    0.040    4.860    4.860 <string>:1(?)
        1    0.000    0.000    4.860    4.860 profile:0(for ii in xrange(10000):  sbv3(D, reverse=True))
        0    0.000             0.000          profile:0(profiler)

>>> run("for ii in xrange(10000):  sbv4(D, reverse=True)")
         1020003 function calls in 4.850 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    10000    0.020    0.000    0.020    0.000 :0(iteritems)
        1    0.000    0.000    0.000    0.000 :0(setprofile)
    10000    2.770    0.000    4.810    0.000 <stdin>:1(sbv4)
  1000000    2.020    0.000    2.020    0.000 <stdin>:3(sk)
        1    0.040    0.040    4.850    4.850 <string>:1(?)
        1    0.000    0.000    4.850    4.850 profile:0(for ii in xrange(10000):  sbv4(D, reverse=True))
        0    0.000             0.000          profile:0(profiler)

>>> run("for ii in xrange(10000):  sbv5(D, reverse=True)")
         1020003 function calls in 4.670 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    10000    0.010    0.000    0.010    0.000 :0(iteritems)
        1    0.000    0.000    0.000    0.000 :0(setprofile)
    10000    2.690    0.000    4.650    0.000 <stdin>:1(sbv5)
  1000000    1.950    0.000    1.950    0.000 <stdin>:1(sk)
        1    0.020    0.020    4.670    4.670 <string>:1(?)
        1    0.000    0.000    4.670    4.670 profile:0(for ii in xrange(10000):  sbv5(D, reverse=True))
        0    0.000             0.000          profile:0(profiler)

>>> run("for ii in xrange(10000):  sbv6(D, reverse=True)")
         20003 function calls in 0.460 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    10000    0.020    0.000    0.020    0.000 :0(iteritems)
        1    0.000    0.000    0.000    0.000 :0(setprofile)
    10000    0.340    0.000    0.360    0.000 <stdin>:1(sbv6)
        1    0.100    0.100    0.460    0.460 <string>:1(?)
        1    0.000    0.000    0.460    0.460 profile:0(for ii in xrange(10000):  sbv6(D, reverse=True))
        0    0.000             0.000          profile:0(profiler)

I can't speak to memory use, but I can speak to speed: itemgetter, and the functions in the operator module in general, are fast. Use them. As usual, the overhead of Python-level function calls kills performance, and eliminating those calls makes all the difference.

Good luck, pythonistas!

gL

(comments welcome!)