User-defined Functions (12 of 14)

A user-defined function can be regarded as equivalent to a program in another language. Like a program, it consists of statements and has a name. When the name is typed in at the keyboard, the statements are executed.

A function can call other functions. Since several functions can exist in the workspace, this makes it possible to adopt a very modular approach to design.

The diagram below shows how a task might be split into functions. The function called Control at the top calls each function on the level below to perform a specific sub-task. These functions call other functions in the course of their execution.

                        Control
                    [1] Setup
                    [2] Calc
                    [3] Output
                    [4] .
 
        Setup           Calc            Output
    [ ] Vars        [ ] Sales       [ ] .
    [ ] .           [ ] .           [ ] Format
    [ ] .           [ ] Stats
    [ ] Window      [ ] .
    [ ] .
 
    Vars        Window      Sales       Stats       Format
[ ] .       [ ] .       [ ] .       [ ] .       [ ] . 
 
 
 

Any of the functions could of course be used with other functions to do a different overall task.

Window, for example, might create a new window and give it a title. Given that the title could be whatever text was currently in a particular variable, such a function might be useful in a number of different applications.

Functions are often only a few lines long, so the structure shown doesn't necessarily represent some vast commercial project. With APL a modular approach comes naturally even for smallish programming tasks.

Arguments and results

User functions need not have arguments. A user function may be a series of APL lines with a name which, when entered, causes the lines to be executed. The name on its own calls the function and no arguments are specified. Such functions are called niladic functions.

Alternatively, functions can be defined in such a way that when you call them, you must provide arguments, just as you would with a built-in APL function. Here, for example, the built-in function ⌈ (ceiling) is being invoked to round up some numbers:

      ⌈6.5009 12.7 33.33333 909.01

The numbers are supplied as the right-hand argument.

If you defined a function called, say, SD which found the standard deviation of a set of numbers, you could write it so that it expected the data as its right-hand argument. You would then call SD in exactly the same way as a primitive function such as ⌈:

      SD 23 89 56 12 99 2 16 92

Functions with one argument, like SD, are called monadic functions. You can equally well define and use functions with two arguments, known as dyadic functions. Indeed, if you want, you can write a function which (like a built-in function) can sometimes have one argument and sometimes two, and you can make the action the function takes depend on whether one or two arguments are submitted. (The first line of such a function would normally be a test to determine how many arguments had been submitted.) Another useful option is the ability to return a result from a function.

You specify the number of arguments the function is to have, and the name of the result field (if there is one) when you define the function header of the function you are about to write.

User-defined operators

User-defined operators are rather more complex. They have one or two operands, that is, functions that they apply to data, and the function that results from combining the operator with its operands may itself have one or two arguments. Since operators exist to modify the behaviour of functions, a user-defined operator must have at least one operand.

User-defined operators can be treated like user-defined functions for the purposes of editing and entry. They are available in all APL dialects except APL+Win.

Editing functions

Most modern APLs include an editor window which you can use to create or edit a function. The editor is either invoked through the application's menu bar, or with the )EDIT system command (or the ⎕EDIT system function), e.g.

      )EDIT FUNK

Note: Some APLs like Dyalog use )ED to invoke the editor, e.g. )ED FUNK

Some older APL systems use a primitive line-at-a-time editor called the Del editor. To enter definition mode and create a new function you type ∇ (Del) followed by the function name. If you type nothing else, you are defining a function that will take no arguments:

      ∇FUNK

For clarity, we will list functions here as though they were entered using the Del editor, where a ∇ character is used to mark the start and end of the function listing. Listing functions in this way makes it clear at a glance that you are looking at a function. It's also a convention commonly used in other APL documentation on the Internet.

If you are using the normal full-screen editor, you do not type the ∇ characters or the line numbers. NARS2000 is an exception: it does not have a Del editor at all, so it treats a ∇ character as a shortcut for )Edit.

The function header

The first line of a function is called the function header. This example is the header for a function called FUNK:

      ∇FUNK

If you want the function you are defining to have arguments you must put them in the header by typing a suitable function header:

      ∇SD X

The above header specifies that SD will take one argument. Here is what SD might look like when you had defined it:

      ∇SD X
  [1] SUM ← +/X
  [2] AV ← SUM÷⍴X
  [3] DIFF ← AV-X
  [4] SQDIFF ← DIFF*2
  [5] SQAV ← (+/SQDIFF)÷⍴SQDIFF
  [6] RESULT ← SQAV*0.5

It's quite unimportant what the statements in the function are doing. The point to notice is that they use the variable X named in the function header. When SD is run, the numbers typed as its right-hand argument are put into X and become the data for the statements that use X. So if you type:

      SD 12 45 20 68 92 108

those numbers are put in X. Even if you type the name of a variable instead of the numbers themselves, the numbers in the variable will be put into X.
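
As a small illustration of the last point, the session below passes a variable rather than literal numbers. DATA is a name made up for this sketch. Because the SD listed above has no result named in its header, the call itself displays nothing; the answer is left in the variable RESULT, which line 6 of the function sets.

      DATA ← 12 45 20 68 92 108
      SD DATA
      RESULT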

The function header for a dyadic (two-argument) function would be defined on the same lines:

      ∇X CALC Y

(Remember that you don't type the ∇ character if you are entering the function in an editor window.)

When you subsequently use CALC you must supply two arguments:

      1 4 7 CALC 0 92 3

When CALC is run the left argument will be put into X and the right argument into Y.

If you want the result of a function to be put into a specified variable, you can arrange that in the function header too:

      ∇Z ← X CALC Y

In practice most APL functions return a result, which can then be used in expressions for further calculations, or stored in variables.

Defining Z to be the result of X CALC Y allows the outcome of CALC to be either assigned to a variable, or passed as a right argument to another (possibly user-defined) function, or simply displayed, by not making any assignment. The variable Z acts as a kind of surrogate for the final result during execution of CALC.
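
A minimal sketch of a result-returning function may make this concrete. The one-line body given to CALC here is purely an assumption for illustration; the text does not say what CALC actually computes.

      ∇Z ← X CALC Y
  [1] Z ← X+2×Y       ⍝ hypothetical body: X plus twice Y
      ∇

      1 4 7 CALC 0 92 3
1 188 13
      T ← 1 4 7 CALC 0 92 3
      +/1 4 7 CALC 0 92 3
202

The first call simply displays the result, the second stores it in T (displaying nothing), and the third passes it on as the right argument of +/.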

The operator header

The operator header must accommodate operands as well as the arguments of the function derived from it and so the header is more complex. The operator name and its operands are enclosed in parentheses. Thus a monadic operator whose derived functions will take two arguments and return a result, has a header:

      ∇R ← X (LOP OPERATE) Y

where LOP is the left operand and X and Y the left and right arguments. A dyadic operator whose derived function will take two arguments and return a result, has a header:

      ∇R ← X (LOP OPERATE ROP) Y

Apart from their special header line, user-defined operators obey the same internal rules as detailed below for user-defined functions.
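
As a hedged sketch, here is a small monadic operator written with the first header form above. The name SWAP and its behaviour are invented for this example, and it assumes a dialect that accepts the parenthesised header (APL2 and Dyalog, for instance). The operator applies its operand F with the two arguments exchanged.

      ∇R ← X (F SWAP) Y
  [1] R ← Y F X       ⍝ apply the operand with the arguments exchanged
      ∇

      2 -SWAP 10
8

Here the derived function -SWAP is dyadic: it receives 2 and 10 as X and Y and evaluates 10-2.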

Local and global variables

Variable names quoted in the header of a function are local. They exist only while the function is running and it doesn't matter if they duplicate the names of other variables in the workspace.

The other variables - those used in the body of a function but not quoted in the header, or those created in calculator mode - are called global variables.

In the SD example above, X was named in the header so X is a local variable. If another X already exists in the workspace, there will be no problem. When SD is called, the X local to SD will be set up and will be the one used. The other X will take second place till the function has been executed - and of course, its value won't be affected by anything done to the local X. The process whereby a local name overrides a global name is known as 'shadowing'.

It's obviously convenient to use local variables in a function. It means that if you decide to make use of a function written some time before, you don't have to worry about the variable names it uses duplicating names already in the workspace.

But to go back to the SD example. Only X is quoted in the header, so only X is local. It uses a number of other variables, including one called SUM. If you already had a variable called SUM in the workspace, running SD would change its value.

You can 'localise' any variable used in a function by putting a semicolon at the end of the function header and typing the variable name after it:

      ∇SD X;SUM
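
Several names can be localised by repeating the semicolon. The following is a sketch of how SD might look with every working variable localised; it also names a result R in the header, which the original listing does not do.

      ∇R ← SD X;SUM;AV;DIFF;SQDIFF;SQAV
  [1] SUM ← +/X
  [2] AV ← SUM÷⍴X
  [3] DIFF ← AV-X
  [4] SQDIFF ← DIFF*2
  [5] SQAV ← (+/SQDIFF)÷⍴SQDIFF
  [6] R ← SQAV*0.5
      ∇

      SUM ← 100
      S ← SD 23 89 56 12 99 2 16 92
      SUM
100

The global SUM still holds 100 after the call, because the SUM used inside SD is now local to it.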

You may wonder what happens if functions that call each other use duplicate local variable names. You can think of the functions as forming a stack with the one currently running at the top, the one that called it next down, and so on. A reference to a local variable name applies to the variable used by the function currently at the top of the stack.
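
A small sketch of two functions that each localise the same name shows the effect. OUTER and INNER are hypothetical names used only for this illustration.

      ∇OUTER;N
  [1] N ← 1
  [2] INNER
  [3] N               ⍝ display the N that belongs to OUTER
      ∇

      ∇INNER;N
  [1] N ← 2           ⍝ sets the N local to INNER only
      ∇

      OUTER
1

While INNER is at the top of the stack, N refers to its own local variable; once INNER finishes, the N belonging to OUTER is visible again and is unchanged.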

Branching

Traditionally, the APL right arrow '→' has been used to control execution in user-defined functions and operators. It can be used as a conditional or unconditional branch, and thus allows conditional execution and loops to be programmed.

We'll start by introducing the traditional APL branching technique, which is supported by all APL dialects, before considering a more modern APL alternative of using structured-control keywords like :IF and :WHILE.

The → symbol is usually followed by an integer scalar, vector, or label name which identifies the line to branch to. If the argument is a vector, the first element of the vector determines the line at which execution will continue, and subsequent elements are ignored. If the line number does not exist, the function terminates (often a line number of 0 is used for this purpose). If the argument is an empty vector, no branch is taken and execution continues at the next statement. Thus, conditional branches can be programmed by using a right argument which, at run-time, evaluates either to an integer scalar/vector, or to an empty vector.

You will rarely use → on its own, that is, unconditionally. Consider the following case:

  [1] ...
  [2] →4
  [3] ...
  [4] ...

When this function is run, line 1 is obeyed, then line 2, then line 4. Line 3 is always omitted because the → branches round it. This seems pointless. Similarly, the unconditional → in the following sequence seems to have created a closed loop of instructions that will repeat forever:

  [1] ...
  [2] ...
  [3] ...
  [4] → 1

It's more common to use → conditionally as in the following example:

  [3] →(MARK<PASS)/7
  [4] 'YOU PASSED. CONGRATULATIONS.'
  [5] ...
  [6] ...
  [7] 'BAD LUCK. TRY AGAIN.'

The condition (MARK<PASS) will generate a 1 or a 0 depending on the values contained in the two variables, MARK and PASS. If the condition is met, the result is 1. Using the / function in its selection role, as was illustrated earlier, the right argument of the → is 7. Thus execution 'goes to' or 'branches to' line 7. On the other hand, if the condition is not met, we do not select 7; in other words, an empty vector is generated as the right argument to → and execution carries on to the next line.

The statement on line [3] could thus be read as:

  [3] goto 7 if MARK<PASS
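
The selection can be tried on its own in calculator mode. This is only a sketch, with literal numbers standing in for MARK and PASS; the shape function ⍴ is used to show that the second expression yields an empty vector.

      (30<40)/7
7
      ⍴(50<40)/7
0

In the first case → would receive 7 and branch; in the second it would receive a vector of length 0 and execution would simply continue with the next line.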

There are very many different ways of generating branches within an APL function; for now, the expression used in the example above will be used.

The last example provides a situation where an unconditional branch may be appropriate. If MARK is not less than PASS, we proceed with line 4, but it looks unlikely that we would also want to execute line 7. We put a → before line 7 and branch round it:

  [3] →(MARK<PASS)/7
  [4] 'YOU PASSED. CONGRATULATIONS.'
  [5] ...
  [6] →9
  [7] 'BAD LUCK. TRY AGAIN.'
  [8] ...
  [9] ...

Looping

Branching in many programming languages is used to set up loops: sequences of instructions that are obeyed repeatedly till a count reaches a certain value. The count, of course, is incremented each time the loop is executed.

Loops are rarely necessary in APL, since much of the counting that has to be specified in other languages is implicit in the data structures used in APL and is done automatically. For example, the following statement will add the values in SALES, whether there are two values only, or a thousand:

      +/SALES

If a loop is necessary, it can be constructed using a statement similar to the branch statement shown above, the condition test being the value of a loop count. Alternatively you can use structured control statements like :WHILE and :REPEAT.
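
A minimal sketch of a counted loop built from branches is shown below. SUMTO is a hypothetical function, written only to show the mechanics; it adds up the whole numbers from 1 to N, assuming N is a non-negative integer scalar.

      ∇R ← SUMTO N;I
  [1] R ← 0
  [2] I ← 0
  [3] →(I≥N)/0        ⍝ leave the function once the count reaches N
  [4] I ← I+1
  [5] R ← R+I
  [6] →3              ⍝ go round again
      ∇

      SUMTO 4
10

Line 3 is the condition test on the loop count, and line 6 is an unconditional branch back to it.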

Labels

After an editing session in which you've inserted or deleted lines, most APL function editors renumber the function to make sure lines are whole numbers and there are no gaps. So next time you edit or run the function, the line numbers may be different. For this reason it's much safer to 'goto' labels rather than to line numbers.

Here's an earlier example, this time with →s referencing labels rather than line numbers:

  [3] →(MARK<PASS)/FAIL
  [4] 'YOU PASSED. CONGRATULATIONS.'
  [5] ...
  [6] →NEXT
  [7] FAIL: 'BAD LUCK. TRY AGAIN.'
  [8] ...
  [9] NEXT: ...

Labels are names followed by colons. They are treated as local variables and have the value of the line numbers with which they are associated. For example, the label FAIL in the extract above will be set up when the function is run and will have the value 7.

Ending execution of a function

When the last line in a function is executed, the function stops naturally (unless, of course, the last line is a branch back to an earlier line). To end a function before the last line is encountered, you can go to a line number which doesn't exist in the function. The safest line number for this purpose (and the one conventionally used) is 0.

The following statement causes a branch to 0 (in other words, terminates the function) if a variable called X currently has a value less than 1.

  [4] →(X<1)/0
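
As a sketch of an early exit in a complete function, the hypothetical RECIP below returns 0 for a zero argument instead of attempting the division. It assumes X is a single number.

      ∇R ← RECIP X
  [1] R ← 0
  [2] →(X=0)/0        ⍝ nothing more to do if X is zero
  [3] R ← ÷X
      ∇

      RECIP 4
0.25
      RECIP 0
0

Note that R is given a value before the branch, so the function always has a result to return when it stops early.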

Structured control keywords

As well as the conventional branch arrow, some versions of APL support structured-control keywords for flow control, often making for more readable functions. The keywords all begin with a colon character, and usually appear at the start of the line (some APL editors will automatically indent lines within a block for you). For example:

  [3] :If MARK ≥ PASS
  [4]     'YOU PASSED. CONGRATULATIONS.'
  [5]     ...
  [6] :Else
  [7]     'BAD LUCK. TRY AGAIN.'
  [8]     ...
  [9] :EndIf

The structured control keywords are not part of the International Organization for Standardization (ISO) specification of the APL language, but they are supported by a number of APL implementations.

Structured control keywords include:

Function                      Keyword
Conditional execution         :If / :ElseIf / :Else / :EndIf
For loop                      :For / :EndFor
While loop                    :While / :EndWhile
Repeat loop                   :Repeat / :EndRepeat
Case selection                :Select / :Case / :CaseList / :Else / :EndSelect
Branch                        :GoTo
Terminate current function    :Return (equivalent to →0)

Here is a simple example:

     ∇GUESS;VAL
[1]   'Guess a number'
[2]   :Repeat
[3]     VAL ← ⎕
[4]     :If VAL=231153
[5]       'You were right!'
[6]       :Leave
[7]     :EndIf
[8]     'Sorry, try again..'
[9]   :EndRepeat

The amount of indentation does not affect the execution of the function, but it does make it easier to read. Some APL editors allow you to clean up the indentation automatically to make the function more readable.
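
For comparison with branch-based loops, here is a sketch of a counted loop written with :While. SUMTOW is a hypothetical function that adds up the whole numbers from 1 to N, and the example assumes a dialect that supports the structured control keywords.

      ∇R ← SUMTOW N;I
  [1] R ← 0
  [2] I ← 0
  [3] :While I<N
  [4]     I ← I+1
  [5]     R ← R+I
  [6] :EndWhile
      ∇

      SUMTOW 4
10

The test on line 3 is made before each pass through the block, so the block is skipped altogether when N is 0.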

Comments in functions

If you want to include comments in a function, simply type them in, preceded by a ⍝ symbol (known as 'lamp'):

      ∇R ← AV X
  [1] ⍝ This function finds the average of some numbers
  [2] R ← (+/X)÷⍴X ⍝ The numbers are in X

There are two comments in the example above. Note that the one on line 2 doesn't start at the beginning of a line.

Ambivalent functions

All dyadic functions may be used monadically. If used monadically, the left argument is undefined (i.e. has a Name Classification, ⎕NC of 0). This type of function is known as an ambivalent or nomadic function, and will usually start by testing for the existence of the left argument.

      ∇R←A NOMADIC B
  [1] :If 0=⎕NC 'A'        ⍝ DOES A EXIST?
  [2]   A←5                ⍝ NO, SO WE HAVE BEEN USED MONADICALLY
  [3] :EndIf
      ...etc

Note that this is true in APL2. NARS2000 and Dyalog, however, use a particular header syntax to distinguish between ambivalent and dyadic functions, with the optional left argument enclosed in braces:

      ∇R←A NOMADIC B     ⍝ dyadic     ∇
      ∇R←{A} NOMADIC B   ⍝ Ambivalent ∇
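
Putting the pieces together, the following is a sketch of a complete ambivalent function using the braces form of the header, so it assumes Dyalog or NARS2000 with structured control keywords available. ROUND is a hypothetical function: it rounds B to A decimal places, and to a whole number when no left argument is given.

      ∇R ← {A} ROUND B
  [1] :If 0=⎕NC 'A'   ⍝ no left argument supplied?
  [2]     A ← 0       ⍝ default: no decimal places
  [3] :EndIf
  [4] R ← (⌊0.5+B×10*A)÷10*A
      ∇

      ROUND 3.14159
3
      2 ROUND 3.14159
3.14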

LearnApl/FunctionsAndOperators (last edited 2017-02-16 19:43:14 by KaiJaeger)