Substring (General Concept)

A substring is a contiguous sequence of characters taken from a string. In SPSS, substrings are extracted with CHAR.SUBSTR (SPSS version 16 and later) or just SUBSTR (SPSS version 15 and earlier). CHAR.SUBSTR takes two or three arguments, as shown by the minimal example below.

SPSS CHAR.SUBSTR – Minimal Example

COMPUTE var_2 = CHAR.SUBSTR(var_1,3,2).

The three arguments mean the following:

  • var_1 denotes the variable from which the substring is taken;
  • 3 is the position of the first character that’s extracted;
  • 2 is the number of characters to extract.

Altogether, this first example implies that var_2 will consist of characters 3 and 4 of var_1.
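For readers who know a general-purpose language, the position-and-length convention can be mimicked in Python. This is only an illustrative sketch: the helper name char_substr is hypothetical (it is not part of SPSS or Python), the sample value is made up, and positions are assumed to be 1-based as described above.

```python
def char_substr(text, start, length=None):
    """Mimic the position-and-length convention: 1-based start, optional length."""
    begin = start - 1  # Python counts from 0, the convention above counts from 1
    if length is None:
        # No length given: take everything from the start position onwards
        return text[begin:]
    return text[begin:begin + length]


var_1 = "abcdef"  # made-up value for illustration
var_2 = char_substr(var_1, 3, 2)
print(var_2)  # characters 3 and 4 of var_1
```

Subtracting 1 from the start position is the only adjustment needed, because Python slices count from 0.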

SPSS Substring Syntax Examples

The examples below use webdesigners.sav.

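* Set the working directory and open the data file.
cd 'C:\xampp\htdocs\spss-tutorials\wp-content\themes\spss-tutorials-10\dont_upload\@external files\SPSS\test_data_creation\webdesigners'.

get file 'webdesigners.sav'.

* Declare the new string variables (30 characters each).
string fname lname company tld (a30).

* Extract the first character of email.
compute fname = char.substr(email,1,1).
execute.

* Extract 2 characters, starting at position 3.
compute fname = char.substr(email,3,2).
execute.

* Extract everything before the first period.
compute fname = char.substr(email,1,char.index(email,'.') - 1).
execute.

* Capitalize the first letter of fname.
compute fname = concat(upper(char.substr(fname,1,1)),char.substr(fname,2)).
execute.

* Extract the characters between the first period and the @ sign.
compute lname = char.substr(email,char.index(email,'.') + 1,char.index(email,'@') - 1 - char.index(email,'.')).
execute.

* Capitalize the first letter of lname.
compute lname = concat(upper(char.substr(lname,1,1)),char.substr(lname,2)).
execute.

* Extract everything after the @ sign.
compute company = char.substr(email,char.index(email,'@') + 1).
execute.

* Extract everything after the last period in company.
compute tld = char.substr(company,char.rindex(company,'.') + 1).
execute.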

document 'bal'.

document 'bol'.

display documents.

Explanation

  • In SPSS, a substring can be extracted by using CHAR.SUBSTR(a,b,c).
  • Here, a refers to the string from which the substring should be taken.
  • The second argument b indicates the starting position (“start at the bth character”).
  • The third argument c is the length of the substring. It may be omitted, in which case all characters after the starting position will be extracted.
  • As seen in the later examples, b and c don’t have to be constants. They may be replaced by (for example) the position of a particular character in the string, which is returned by CHAR.INDEX (first occurrence) or CHAR.RINDEX (last occurrence).
  • The CHAR prefix may often be omitted. Exactly when is explained in Unicode mode.
  • Plain SUBSTR (without the CHAR prefix) can in many cases also be used for modifying the original string, by placing it on the left-hand side of COMPUTE. This is shown in the final example.
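The same splitting logic can be sketched outside SPSS. The following is a minimal Python illustration, assuming addresses of the form first.last@company.tld. The address used is made up and does not come from webdesigners.sav. Python's find and rfind play the role of CHAR.INDEX and CHAR.RINDEX here, but they count positions from 0 rather than 1.

```python
email = "john.doe@example.com"  # hypothetical address, for illustration only

dot = email.find(".")  # 0-based position of the first period
at = email.find("@")   # 0-based position of the @ sign

# Everything before the first period, with the first letter capitalized
fname = email[:dot]
fname = fname[:1].upper() + fname[1:]

# Everything between the first period and the @ sign, capitalized likewise
lname = email[dot + 1:at]
lname = lname[:1].upper() + lname[1:]

# Everything after the @ sign
company = email[at + 1:]

# Everything after the last period in company
tld = company[company.rfind(".") + 1:]

print("%s | %s | %s | %s" % (fname, lname, company, tld))
```

Because the end of a Python slice is given as a position rather than a length, the last-name step needs no subtraction of the period's position, unlike the three-argument CHAR.SUBSTR call.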

Python Substring Examples

begin program.
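pets = 'Cat Dog Rat'
# Characters at indices 4 through 6 (the end index 7 is excluded): Dog
print(pets[4:7])
# Everything after the last space: Rat
print(pets[pets.rfind(" ") + 1:])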
end program.

Explanation

  • In Python, a substring can be extracted from a string by using square brackets []. The latter enclose the relevant index or indices of the character(s) to be extracted.
  • Extracting a range of characters this way is called slicing. (Square brackets are used for more than just substrings. For instance, mylist[1] would return the second element from a list called “mylist”, and mylist[1:3] would return its second and third elements.)
  • A range of characters is specified by a colon :.
  • For example, [1:4] returns the second through the fourth elements. This is because Python counts from 0, so index 1 is the second element, and because the end index itself is excluded: the slice stops at the end index minus 1.
  • In a similar vein, if the start index is omitted (as in [:4]), the first through the fourth elements are returned.
  • Finally, if the end index is omitted ([1:]), the second through the final elements are returned.
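The slicing rules above can be checked with a short sketch. The string and list below are made-up values chosen so that each position is easy to see.

```python
letters = "abcdef"  # made-up string: index 0 is "a", index 5 is "f"

second_to_fourth = letters[1:4]  # start index 1 included, end index 4 excluded
first_to_fourth = letters[:4]    # start omitted: begin at the first character
second_to_last = letters[1:]     # end omitted: run through the final character

print(second_to_fourth)
print(first_to_fourth)
print(second_to_last)

# Square brackets work the same way on lists
mylist = ["cat", "dog", "rat"]
print(mylist[1])    # a single index returns one element
print(mylist[1:3])  # a slice returns a new list
```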