Writing User Defined Functions (UDFs)

Exposing functions as UDFs

Python functions to be exposed as UDFs are decorated with the xl_func decorator, imported from the pyxll module. The pyxll module is compiled into the pyxll.xll addin so it doesn’t have to be on the pythonpath.

pyxll.xl_func(signature [, category=PyXLL] [, help_topic=""] [, thread_safe=False] [, macro=False] [, allow_abort=?] [, volatile=False] [, disable_function_wizard_calc=False] [, disable_replace_calc=False])

xl_func is a function that returns a decorator for exposing python functions to Excel.

Parameters:
  • signature (string) –
    string specifying the argument types and, optionally, their names and the return type.
    If the return type isn’t specified the var type is assumed. eg:

    "int x, string y: double" for a function that takes two arguments, x and y and returns a double.

    "float x" or "float x: var" for a function that takes a float x and returns a variant type.

  • category (string) – String that sets the category in the Excel function wizard the exposed function will appear under.
  • help_topic (string) – Path of the help file (.chm) that will be available from the function wizard in Excel.
  • thread_safe (boolean) – Indicates whether the function is thread-safe or not. If True the function may be called from multiple threads in Excel 2007 or later
  • macro (boolean) – If True the function will be registered as a macro sheet equivalent function. Macro sheet equivalent functions are less restricted in what they can do, and in particular they can call Excel macro sheet functions such as xlfCaller.
  • allow_abort (boolean) – If True the function may be cancelled by the user pressing Esc. A KeyboardInterrupt exception is raised when Esc is pressed. If not specified the behavior is determined by the allow_abort setting in the config (see Config file).
  • volatile (boolean) – if True the function will be registered as a volatile function, which means it will be called everytime Excel recalculates regardless of whether any of the parameters to the function have changed or not
  • disable_function_wizard_calc (boolean) – Don’t call from the Excel function wizard. This is useful for functions that take a long time to complete that would otherwise make the function wizard unresponsive
  • disable_replace_calc (boolean) – Set to True to stop the function being called from Excel’s find and replace dialog.
  • arg_descriptions – dict of parameter names to help strings

Example usage:

from pyxll import xl_func

@xl_func("string name: string", thread_safe=True)
def hello(name):
    """return a familiar greeting"""
    return "Hello, %s" % name

@xl_func("int n: int", category="fibonacci", thread_safe=True)
def fibonacci(n):
    """naive iterative implementation of fibonacci"""
    a, b = 0, 1
    for i in xrange(n):
        a, b = b, a + b
    return a

Documenting functions

When a python function is exposed to Excel the docstring of that function is visible in Excel’s function wizard dialog.

Parameter documentation may also be provided help the user know how to call the function. The most convenient way to add parameter documentation is to add it to the docstring as shown in the following example:

from pyxll import xl_func

@xl_func("float x, int n: float")
def py_round(x, n):
    """
    Return a number to a given precision in decimal digits.
    x: floating point number to round
    n: number of decimal digits
    """
    return round(x, n)

Here PyXLL will automatically detect that the last two lines of the docstring are parameter documentation. They will appear in the function wizard as help strings for the parameters when selected. The first line will be used as the function description.

One or more of any of the characters :, - or = may be used to separate the parameter name from it’s description, and the Sphinx style :param x: description is also recognized.

Parameter documentation may also be added by passing a dictionary of parameter names to help strings to xl_func as the keyword argument arg_descriptions if it is not desirable to add it to the docstring for any reason.

Standard argument and return types

Several standard types may be used in the signature specifed when exposing a Python UDF. It is also possible to pass arrays and custom types, which are discussed later.

Below is a list of the standard types. Any of these can be specified as an argument type or return type in a function signature. If a type passed from Excel or returned from Python is not (or cannot be converted to) the Python type in this list an error will be written to the log file and NaN will be returned to Excel if possible.

PyXLL type Python type
int int
float float
string str
bool bool
datetime datetime.datetime
date datetime.date
time datetime.time
unicode unicode [1]

The var type

In addition to the standard types there is also the var type. This can be used when the argument or return type isn’t fixed. Using the strong types has the advantage that arguments passed from Excel will get coerced correctly. For example if your function takes an int you’ll always get an int and there’s no need to do type checking in your function. If you use a var, you may get a float if a number is passed to your function, and if the user passes a non-numeric value your function will still get called so you need to check the type and raise an exception yourself.

Using arrays

Ranges of cells can be passed from Excel to Python as a 2d array, represented in python as a list of lists.

Any type can be used as an array type by appending [], as shown in the following example:

from pyxll import xl_func

@xl_func("float[] array: float")
def py_sum(array):
    """return the sum of a range of cells"""
    total = 0.0

    # array is a list of lists of floats
    for row in array:
        for cell_value in row:
            total += cell_value

    return total

Arrays can be used as return values as well. When returning an array remember that it has to be a list of lists. This means to return a row of data you would return [[1,2,3,4]], for example. To enter an array forumla in Excel you select the cells, enter the formula and then press Ctrl+Shift+Enter. Please refer to the Excel documentation for more information about array formula.

Any type can be used as an array type, but float[] requires the least marshalling between Excel and python and is therefore the fastest of the array types.

If you use the var type in your function signature then an array type will be used if you return a list of lists, or if the argument to your function is a range of data.

Using NumPy arrays

To be able to use numpy arrays you must have numpy installed and in your pythonpath.

You can use numpy 1d and 2d arrays as argument types to pass ranges of data into your function, and as return types for returing for array functions. Only up to 2d arrays are supported, as higher dimension arrays don’t fit well with how data is arranged in a spreadsheet.

The most common type of numpy array to use is a 2d array of floats, for which the type to use in the function signature is numpy_array. For 1d arrays, the types numpy_row and numpy_column may be used.

Types other than floating point arrays are supported too, and are listed below for numpy_array. The same applies to the 1d array types.

PyXLL type Python type
numpy_array numpy.array of float
numpy_array<float> numpy.array of float
numpy_array<int> numpy.array of int
numpy_array<bool> numpy.array of bool

Passing errors as values

Sometimes it is useful to be able to pass a cell value from Excel to python when the cell value is actually an error, or vice-versa.

PyXLL has two different ways of doing this.

The first is to use the var type, which passes Excel errors as Python exception objects. Below is a table that shows how Excel errors are converted to python exception objects when the var type is used.

Excel error Python exception type
#NULL! LookupError
#DIV/0! ZeroDivisionError
#VALUE! ValueError
#REF! ReferenceError
#NAME! NameError
#NUM! ArithmeticError
#NA! RuntimeError

The second is to use the special type: float_nan.

float_nan behaves in almost exactly the same way as the normal float type. It can be used as an array type, or as an element type in a numpy array, e.g. numpy_array<float_nan>. The only difference is that if the Excel value is an error or a non-numeric type (e.g. an empty cell), the value passed to python will be float('nan') or 1.#QNAN, which is equivalent to numpy.nan.

The two different float types exist because sometimes you don’t want your function to be called if there’s an error with the inputs, but sometimes you do. There is also a slight performance penalty for using the float_nan type when compared to a plain float.

Errors can also be returned to Excel using instances of python exception types. This way, it is possible to return arrays where some values are errors but some aren’t.

Custom types

As well as the standard types listed above, it’s also possible to define your own argument and return types that can then be used in your function signatures.

Custom argument types need a function that will convert a standard type to the custom type, which will then be passed to your function. For example, if you have a function that takes an instance of type X, you can declare a function to convert from a standard type to X and then use X as a type in your function signature. When called from Excel, your conversion function will be called with an instance of the base type, and then your exposed UDF will be called with the result of that conversion.

To declare a custom type, you use the xl_arg_type decorator on your conversion function. The xl_arg_type decorator takes at least two arguments, the name of your custom type and the base type.

pyxll.xl_arg_type(name, base_type [, allow_arrays=True] [, macro=None] [, thread_safe=None])

Returns a decorator for registering a function for converting from a base type to a custom type.

Parameters:
  • name (string) – custom type name
  • base_type (string) – base type
  • allow_arrays (boolean) – custom type may be passed in an array using the standard [] notation
  • macro (boolean) – If True all functions using this type will automatically be registered as a macro sheet equivalent function
  • thread_safe (boolean) – If False any function using this type will never be registered as thread safe

Here’s an example of a simple custom type:

from pyxll import xl_arg_type

class CustomType:
    def __init__(self, x):
        self.x = x

@xl_arg_type("custom", "string")
def string_to_customtype(x):
    return CustomType(x)

@xl_func("custom x: bool")
def test_custom_type_arg(x):
    # this function is called from Excel with a string, and then
    # string_to_customtype is called to convert that to a CustomType
    # and then this function is called with that instance
    return isinstance(x, CustomType)

custom can now be used as an argument type in a function signature. The Excel UDF will take a string, but before your Python function is called the conversion function will be used to convert that string to a CustomType instance.

To use a custom type as a return type you also have to specify the conversion function from your custom type to a base type. This is exactly the reverse of the custom argument type conversion described previously.

The custom return type conversion function is decorated with the xl_return_type decorator.

pyxll.xl_return_type(name, base_type [, allow_arrays=True] [, macro=None] [, thread_safe=None])

Returns a decorator for registering a function for converting from a custom type to a base type.

Parameters:
  • name (string) – custom type name
  • base_type (string) – base type
  • allow_arrays (boolean) – custom type may be returned as an array using the standard [] notation
  • macro (boolean) – If True all functions using this type will automatically be registered as a macro sheet equivalent function
  • thread_safe (boolean) – If False any function using this type will never be registered as thread safe

For the previous example the return type conversion function could look like:

from pyxll import xl_return_type, xl_func

@xl_return_type("custom", "string")
def customtype_to_string(x):
    # x is an instance of CustomType
    return x.x

@xl_func("string x: custom")
def test_returning_custom_type(x):
    # the returned object will get converted to a string
    # using customtype_to_string before being returned to Excel
    return CustomType(x)

Any recognized type can be used as a base type. That can be a standard type, an array type or another custom type (or even an array of a custom type!). The only restriction is that it must resolve to a standard type eventually.

There are more examples of custom types included in the PyXLL download.

Type conversion

Sometimes it’s useful to be able to convert from one type to another, but it’s not always convenient to have to determine the chain of functions to call to convert from one type to another.

For example, you might have a function that takes an array of var types, but some of those may actually be datetimes, or one of your own custom types. To convert them to those types you would have to check what type has actually been passed to your function and then decide what to call to get it into exactly the type you want.

PyXLL includes the function get_type_converter to do this for you. It takes source and target types by name and returns a function that will perform the conversion, if possible.

pyxll.get_type_converter(src_type, dest_type)

Returns a function to convert objects of type src_type to dest_type.

Parameters:
  • src_type (string) – name of type to convert from
  • dest_type (string) – name of type to convert to
Returns:

function to convert from src_type to dest_type

Even if there is no function registered that converts exactly from src_type to dest_type, as long as there is a way to convert from src_type to dest_type using one or more intermediate types this function will create a function to do that.

Here’s an example that shows how to get a datetime from a var parameter:

from pyxll import xl_func, get_type_converter
from datetime import datetime

@xl_func("var x: string")
def var_datetime_func(x):
    var_to_datetime = get_type_converter("var", "datetime")
    dt = var_to_datetime(x)
    # dt is now of type 'datetime'
    return "%s : %s" % (dt, type(dt))

Passing cell metadata

As well as passing arguments from Excel to Python by value, it’s also possible to pass data about the cell being passed to your function using the xl_cell type.

When you use the xl_cell type you get passed an object of type XLCell.

class pyxll.XLCell

XLCell represents the data and metadata for a cell in Excel passed as an xl_cell argument to a function registered with xl_func.

Some of the properties of XLCell instances can only be accessed if the calling function has been registered as a macro sheet equivalent function. See Exposing functions as UDFs.

value

value of the cell argument, passed in the same way as the var type.

Must be called from a macro sheet equivalent function

address

string representing the address of the cell, or None if a value was passed to the function and not a cell reference.

Must be called from a macro sheet equivalent function

formula

formula of the cell as a string, or None if a value was passed to the function and not a cell reference or if the cell has no formula.

Must be called from a macro sheet equivalent function

note

note on the cell as a string, or None if a value was passed to the function and not a cell reference or if the cell has no note.

Must be called from a macro sheet equivalent function

sheet_name

name of the sheet this cell belongs to.

sheet_id

integer id of the sheet this cell belongs to.

rect

XLRect instance with the coordinates of the cell.

from pyxll import xl_func

@xl_func("xl_cell cell: string", macro=True)
def xl_cell_test(cell):
    return "[value=%s, address=%s, formula=%s, note=%s]" % (
                cell.value,
                cell.address,
                cell.formula,
                cell.note)
class pyxll.XLRect

XLRect instances are accessed via XLCell.rect to get the coordinates of the cell.

first_row
last_row
first_col
last_col

Asynchronous functions

In Excel 2010 Microsoft introduced asynchronous functions. Instead of returning a value immediately, an asynchronous function initiates a potentially slow calculation in another thread, or perhaps via a server request, and returns immediately. When the result of the calculation is ready xlAsyncReturn is called to inform Excel.

PyXLL makes registering an asynchronous function very simple. By using the type async_handle, in the function signature passed to xl_func, the function will automatically get registered as an asynchronous function.

The async_handle parameter will be a unique handle for that function call and must be used to return the result when it’s ready. The async_handle type should be considered opaque and any functions using that type shouldn’t return a value.

Here’s an example of an asynchronous function [2]

from pyxll import xl_func, xlAsyncReturn
from threading import Thread
import time

class MyThread(Thread):
    def __init__(self, async_handle, x):
        Thread.__init__(self)
        self.__async_handle = async_handle
        self.__x = x

    def run(self):
        # here would be your long running function or a call to a server
        # or something like that
        time.sleep(5)
        xlAsyncReturn(self.__async_handle, self.__x)

# no return type required as async functions don't return a value
# the excel function will just take x, the async_handle is added automatically by Excel
@xl_func("async_handle h, int x")
def my_async_function(h, x):
    # start the long calculation in another thread
    thread = MyThread(h, x)
    thread.start()

    # return immediately, the real result will be returned by the thread function
    return

Footnotes

[1]Unicode was only introduced in Excel 2007 and is not available in earlier versions. Use xl_version to check what version of Excel is being used if in doubt.
[2]Asynchronous functions are only available in Excel 2010. Attempting to use them in an earlier version will result in an error.