STRING

A string is a sequence of characters; in Python 3, each character is a Unicode code point. Like a list, a string can be indexed and sliced, but unlike a list it is immutable: operations that appear to change a string return a new one. You can split strings, extract part of a string, combine strings, search them, insert substrings into them, and so on. String literals are enclosed in single quotes, double quotes, or triple quotes.

print("Hello World")

Format Strings

Formatting gives you control over how strings are displayed. It also lets you put placeholders inside a string that are filled in when the string is processed. For example, you can substitute variables into a template to produce several variations of the same string.

Formatting by % builtin Operator

Variable Substitution

A value can be inserted into a string with the % operator. Use %s for a string, %d or %i for an integer, and %f for a float; a precision of the form .N sets the number of decimal places (%.2f formats a float with 2 decimal places). The value on the right-hand side may be any expression: %s converts it with str(), so you can insert a list built by a list comprehension or the result of an arithmetic expression.

=>< "%s|f|i" % variable >
var = 123456789
"%s" % var                        # insert var as a string
"%.2f" % (var / 12034)            # insert the result of the division with 2 decimal places
"%s" % [x for x in range(10)]     # insert the list into the string

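As a minimal sketch of the substitutions described above, the following combines several conversion types in one string by passing a tuple on the right-hand side of %. The variable names and values are illustrative assumptions, not part of the original notes.

```python
# Illustrative values only; these names are assumptions made for the example.
name = "Adam"
age = 23
balance = 1234.5678

# A tuple on the right-hand side supplies several values in order.
line = "%s is %d years old and has %.2f in the bank" % (name, age, balance)
print(line)     # Adam is 23 years old and has 1234.57 in the bank

# %s calls str() on any object, so a list can be inserted too.
squares = "%s" % [x * x for x in range(5)]
print(squares)  # [0, 1, 4, 9, 16]
```
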
Padding strings

You may need to align a sequence of digits or characters to the right or left within a field of a given width. For this, use %Nd or %Ns, where N is the field width in characters, s is for strings and d is for integers. The default is right alignment.

=>< "%Nd" % digits >
# right-align the digits in a field N characters wide
"%10d" % 123
st = 'Hello World'
"%40s" % st
'                             Hello World'

To align a number or substring to the left of its field, precede N with a minus sign (-).

=>< "%-Nd" % digits >
"%-10d %s" % (123, "number")    # ==> '123        number'
"%10d %s" % (123, "number")     # ==> '       123 number'

Alignment is done by padding the number or string with spaces. To pad a number with zeros instead, put a zero before N. Note that N is the total number of digits including the zeros, so a 4-digit number with N = 8 is padded with 4 zeros.

=>< "%0Nd" % digits >
"%05d" % 123      # ==> '00123' (5 is the total number of digits)
"%08d" % 1234     # ==> '00001234'

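The padding rules above can be combined to line up columns. This is a small sketch under assumed data (the `rows` list is hypothetical): the first column is left-aligned in 8 characters and the second is zero-padded to 5 digits.

```python
# Hypothetical inventory rows used only for illustration.
rows = [("apple", 3), ("kiwi", 12), ("melon", 150)]

lines = []
for item, qty in rows:
    # %-8s: left-align the name in 8 columns; %05d: zero-pad the number to 5 digits
    lines.append("%-8s|%05d" % (item, qty))

print("\n".join(lines))
# apple   |00003
# kiwi    |00012
# melon   |00150
```
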
Formatting by {} string format function

The format() method is called on a string that contains curly-bracket placeholders; its arguments are substituted into the placeholders in order. Placeholders may also be identified by position number or by keyword argument name.

=>< "{}".format("string" or variable) >
"foo = {0}".format('bar')
=>< "{0}, {1}, {2}".format(var1, string, object) >
"{0},{1},{2}".format('foo', 'bar', 'zar')
"{first},{second},{third}".format(first='foo', second='bar', third='zar')

Padding with format

Like %, format() can pad a value to a given width. Strings are left-aligned by default (numbers are right-aligned). Inside the curly brackets, after the position number or keyword name, enter a colon and then the field width; the padding spaces are added to the right of a left-aligned value.

=>< "{:N}, {:N}, {:N}".format(var1, var2, var3) >    left-aligned by default
"{0:10},{1:10},{2:10}".format('foo', 'bar', 'zar')
# ==> 'foo       ,bar       ,zar       '
"{first:10},{second:10},{third:10}".format(first='foo', second='bar', third='zar')

To choose the direction of alignment, and so where the padding goes, use < for left alignment (optional for strings, as it is their default) or > for right alignment.

=>< "{:<N}, {:<N}, {:<N}".format(var1, var2, var3) >    left-aligned in N columns
=>< "{:>N}, {:>N}, {:>N}".format(var1, var2, var3) >    right-aligned in N columns
"{:>10},{:>10},{:>10}".format('foo', 'bar', 'zar')

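A short sketch of the alignment markers with format(), using made-up column values. The last line goes slightly beyond the text above: a fill character may be placed before the alignment marker, which is part of the standard format specification.

```python
# Left-align the first column in 10 characters, right-align the second in 6.
header = "{:<10}|{:>6}".format("item", "qty")
row = "{name:<10}|{qty:>6}".format(name="apple", qty=3)
print(header)
print(row)

# A fill character before the alignment marker replaces the padding spaces.
filled = "{:*>8}".format("foo")
print(filled)   # *****foo
```
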
Formatting by F literal

Formatted string literals (f-strings, available from Python 3.6) are usually easier to read and faster than the two previous formatting methods. Put f or F before a string literal and use {} curly brackets as placeholders; a placeholder may contain a variable, an expression, a function or method call, or a lookup in a dictionary or list.

a = "Hello"
b = "World"
f"{a} {b}"
f"{120*400/23}"
f"{a.lower()} {b.title()}"
D = dict(name="Adam", Age="23", sex="male", salary="$2300")
f"{D['name']}'s salary is {D['salary']}"    # ==> "Adam's salary is $2300"

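F-string placeholders accept the same width, alignment and precision codes as format(), written after a colon. A minimal sketch with assumed values:

```python
# Illustrative values; the names are assumptions for this example.
name = "Adam"
salary = 2300.5

# Left-align the name in 10 columns; right-align the salary in 10 columns with 2 decimals.
row = f"{name:<10}|{salary:>10.2f}"
print(row)

# Zero-pad an integer to 3 digits.
code = f"{7:03d}"
print(code)     # 007
```
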
Search a String

To check whether a string contains a certain substring, character, number or word, use the in operator, which returns True or False.

=>< sub in string > --> bool
s = "Hello World"
'W' in s            # ==> True
"World" in s        # ==> True
"lo" in "hello"     # ==> True
"ol" in "hello"     # ==> False
"h" in "hello"      # ==> True

The in operator does not tell you where the substring is. For that, use the str.find() method, which returns the start position (offset) of the first occurrence of the substring, or -1 if it is not found.

s = "http://Everyday the sun shines on the world.html"
s.find('world')     # ==> 38
=>< "string".find(pattern, offset, end) >
# offset: position to start searching from; end: position to stop at
def findit(string, pattern, offset=0, end=None):
  r = string.find(pattern, offset, end)
  if r < 0:
    print("no matching substring")
  else:
    print("There is a matching string at position", r)
findit("Hello World", "Sam")
findit("Hello World", "H")
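
Since find() reports only the first match, a loop that restarts the search after each hit can collect every position. The helper below, `find_all`, is a hypothetical name introduced for this sketch; it is not a built-in string method.

```python
def find_all(text, pattern):
    """Return the start offsets of every non-overlapping occurrence of pattern."""
    positions = []
    if not pattern:              # an empty pattern would never advance the search
        return positions
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        # Continue searching just past the match that was found.
        start = text.find(pattern, start + len(pattern))
    return positions

s = "http://Everyday the sun shines on the world.html"
print(find_all(s, "the"))              # [16, 34]
print(find_all("Hello World", "Sam"))  # []
```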

To split the string at that position, slice it just as you would slice a list:

s[0:37]     # ==> 'http://Everyday the sun shines on the'
s[38:]      # ==> 'world.html'

You can extract a substring from a long string if you know the words or characters that mark its start and end. First get the position of the start word, then search from there for the position of the end word, and slice the string between the two positions. The slice stops just before the end word.

s = "http://Everyday the sun shines on the world.html"
start = s.find("the")           # 16
end = s.find("on", start)       # 31
s[start:end]                    # ==> 'the sun shines '
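
The same idea can be wrapped in a small function. This is a sketch only: `between` is a hypothetical helper, and unlike the example above it excludes both markers from the result and returns None when either marker is missing.

```python
def between(text, start_marker, end_marker):
    """Return the text found between two markers, or None if either is missing."""
    start = text.find(start_marker)
    if start == -1:
        return None
    start += len(start_marker)           # skip past the start marker itself
    end = text.find(end_marker, start)   # look for the end marker after it
    if end == -1:
        return None
    return text[start:end]

s = "http://Everyday the sun shines on the world.html"
print(between(s, "://", ".html"))   # Everyday the sun shines on the world
print(between(s, "://", ".php"))    # None
```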

To get the Nth character in a string, use its position as the index. Positions are counted from 0.

  =>< string[N] >
  a = "abcde"
  a[0]      # ==> 'a'

Find and Replace

Find a Match

You can find all substrings matching a regex pattern using findall() from the re module.

  import re
  string = "too may cooks tried to spoil the broth"
  print(re.findall(r"\bt\w*", string))     # words starting with t: ['too', 'tried', 'to', 'the']

To test how a string begins or ends, use the startswith() and endswith() methods. They return True if the string starts with the given prefix (or ends with the given suffix) and False otherwise. Optional start and end positions restrict the test to that part of the string. The prefix or suffix can also be a tuple of substrings, in which case True is returned if any element matches.

  =>< S.startswith(prefix[, start[, end]]) -> bool >
  "congratulation".startswith(("con", "c", "co"))     # ==> True
  "congratulation".startswith("grat", 3, 10)          # ==> True
  "congratulation".startswith("tion", -4)             # ==> True
  =>< S.endswith(suffix[, start[, end]]) -> bool >
  string = "concatenation"
  string.endswith("tion")           # ==> True
  string.endswith("tion", 1, 6)     # ==> False (only positions 1 to 5 are tested)
  tupl = ("ion", "on", "tion", "n")
  string.endswith(tupl)             # ==> True if any element in the tuple matches
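
A common use of the tuple form is filtering file names by extension. The file list below is made up for illustration.

```python
# Hypothetical file names used only for this example.
files = ["report.txt", "photo.JPG", "notes.md", "draft.txt", "logo.png"]

# A tuple lets one call test several suffixes at once.
text_files = [f for f in files if f.endswith((".txt", ".md"))]

# Lowercase the name first so the test ignores case.
images = [f for f in files if f.lower().endswith((".jpg", ".png"))]

print(text_files)   # ['report.txt', 'notes.md', 'draft.txt']
print(images)       # ['photo.JPG', 'logo.png']
```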

Replace Function

The replace() method searches for a substring and replaces it with the given argument. It returns a new string; the original remains intact.

  =>< string.replace("old", "new") >
  string  = "Everyday the sun shines on the world"
  new_string = string.replace("Everyday", "Always,")

replace() changes all occurrences of the substring. To limit it to the first one or first few occurrences, use the count argument.

  =>< S.replace(old, new[, count]) -> string >
  a = "hello world world world world"
  a.replace('world', "Everyone", 2)     # ==> 'hello Everyone Everyone world world'

Regex Pattern Searching

The re module provides regex pattern searching in any string. It is a powerful and versatile tool for searching and replacing inside strings or text documents. re.search() returns a match object for the first match, or None if there is none.

  import re
  text = 'hello world'
  re.search(r'h(.+?)\s', text).group(0)     # 'hello ': h and what follows, up to the first whitespace
  re.search(r'(.)\s.*', text).group(0)      # 'o world': a character followed by whitespace, then the rest of the string
  re.search(r'[aeiou]', text).group(0)      # 'e': first vowel in the string
  re.search(r'\w.+', text).group(0)         # 'hello world': a word character and everything after it
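
For the replace side of regex searching, re.sub() substitutes every match, and a group captured in the pattern can be reused in the replacement. The sample text and patterns below are assumptions made for this sketch; it also shows that re.search() returns None when nothing matches, so the result should be checked before calling group().

```python
import re

# Made-up sample text for illustration.
text = "Call 555-1234 or 555-9876"

# findall() collects every match of the pattern.
phones = re.findall(r"\d{3}-\d{4}", text)
print(phones)   # ['555-1234', '555-9876']

# sub() replaces each match; \1 refers to the first captured group.
masked = re.sub(r"\d{3}-(\d{4})", r"XXX-\1", text)
print(masked)   # Call XXX-1234 or XXX-9876

# search() returns None when there is no match.
match = re.search(r"z+", text)
found = match.group(0) if match else None
print(found)    # None
```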

Find Position of substring

The index() method returns the position of a substring in the string. Unlike find(), it raises ValueError if the substring is not found. You can restrict the search to a portion of the string by giving the start and end positions of that segment.

  =>< S.index(sub [,start [,end]]) -> int >
  S = "The quick brown fox jumps over the lazy dog"
  S.index('fox')     # ==> 16
  S.index('dog')     # ==> 40
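
Because index() raises ValueError for a missing substring, calls are often wrapped in try/except. The helper name `position_of` is hypothetical and used only in this sketch.

```python
S = "The quick brown fox jumps over the lazy dog"

def position_of(text, sub, start=0):
    """Return the index of sub in text at or after start, or None if absent."""
    try:
        return text.index(sub, start)
    except ValueError:
        return None

print(position_of(S, "fox"))     # 16
print(position_of(S, "cat"))     # None
print(position_of(S, "o"))       # 12, the 'o' in 'brown'
print(position_of(S, "o", 13))   # 17, the next 'o', in 'fox'
```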

Slice a String

Like a list, a string can be sliced at any index position or between any two positions, optionally with a step.

  =>< string[start:stop] >  < string[start:stop:step] >
  string = "too may cooks tried to spoil the broth"
  string[1:10:2]

Extended slicing can also be done with the built-in slice() function:
* slice(stop)
* slice(start, stop, step)

This creates a slice object, which is then used as the index to slice the string.

  string = "too may cooks tried to spoil the broth"
  print(string[1:10])             # simple slicing
  slice_object = slice(8, 100)    # create a slice object
  string[slice_object]            # same as string[8:100]
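
One reason to build a slice object is that it can be named and reused on many strings. The fixed-width log records below are invented for this sketch.

```python
# Hypothetical fixed-width records: a 10-character date, a space, a 5-character level.
records = [
    "2024-01-31 INFO  started",
    "2024-02-01 ERROR failed",
]

DATE = slice(0, 10)      # characters 0 to 9
LEVEL = slice(11, 16)    # characters 11 to 15

dates = [r[DATE] for r in records]
levels = [r[LEVEL].strip() for r in records]
print(dates)    # ['2024-01-31', '2024-02-01']
print(levels)   # ['INFO', 'ERROR']

# A slice object can carry a step as well.
every_other = slice(None, None, 2)
print("abcdefgh"[every_other])   # aceg
```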

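The partition() method splits a string into three parts at the first occurrence of a separator, which may be one or more characters. It returns the part before the separator, the separator itself, and the part after it. If the separator is not found, it returns the whole string followed by two empty strings.

  =>< S.partition(sep) -> (head, sep, tail) >
  S = "The quick brown fox jumps over the lazy dog"
  S.partition(" ")       # returns a tuple: ('The', ' ', 'quick brown fox jumps over the lazy dog')
  S.partition("fox")     # ==> ('The quick brown ', 'fox', ' jumps over the lazy dog')
  S.partition("jump")    # ==> ('The quick brown fox ', 'jump', 's over the lazy dog')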
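
partition() is convenient for splitting a "key=value" line exactly once, since only the first separator is used. `parse_setting` is a hypothetical helper written for this sketch.

```python
def parse_setting(line):
    """Split 'key = value' at the first '='; return None if there is no '='."""
    key, sep, value = line.partition("=")
    if not sep:                  # separator not found: sep is an empty string
        return None
    return key.strip(), value.strip()

print(parse_setting("name = Adam"))              # ('name', 'Adam')
print(parse_setting("url=http://x.test/?a=b"))   # ('url', 'http://x.test/?a=b')
print(parse_setting("no separator here"))        # None
```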

Change String Properties

Make a copy

A string can be sliced like a list, so you can copy the whole string or part of it. Because strings are immutable, a full slice gives you the same value as a plain assignment.

  hi = "Hello World"
  hi2 = hi[:]     # same result as hi2 = hi
  hi3 = hi
  print(hi, "=", hi2, "=", hi3)

Repeat a String N number of times

  =>< String*N >
  "Ho! " * 3

Hide the String

Multiplying a string by zero gives an empty string; the original string is not changed or removed from memory. To drop the value a variable refers to, assign None to the variable.

  =>< String*0 >
  "Ho! " * 0      # ==> ''

  a = "abcde"
  a = None        # a no longer refers to the string

Count characters in a string

To get the size of a string in characters, use the len() function. To count how many times a substring occurs, use the count() method. It returns the number of non-overlapping occurrences of substring sub in S[start:end], so you can decide where to start and where to end. Used creatively, it can count substrings, sequences of characters, or special characters such as carriage returns and newlines, and so estimate how many lines or paragraphs are in a document.

  =>< S.count(sub[, start[, end]]) >
  a = "hello world\r\nGood Morning\n\nHappy New World"
  len(a)                # number of characters, including \r and \n
  a.count("l")          # count 'l' from start to end
  a.count("l", 5)       # count 'l' from position 5 to end
  a.count("o", 3, 7)    # count 'o' from position 3 up to (not including) position 7
  a.count("")           # an empty substring is counted len(a) + 1 times
  a.count('\n')         # count newline characters
  a.count('\n\n')       # count blank lines, which separate paragraphs
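
The sketch below works through the same sample string to show what len() and count() return, and what "non-overlapping" means in practice. The variable names are chosen for this example.

```python
a = "hello world\r\nGood Morning\n\nHappy New World"

length = len(a)               # total characters, including \r and \n
l_count = a.count("l")        # occurrences of the letter l
newlines = a.count("\n")      # newline characters
blank_lines = a.count("\n\n") # consecutive newlines, i.e. paragraph breaks

print(length)        # 42
print(l_count)       # 4
print(newlines)      # 3
print(blank_lines)   # 1

# Matches are non-overlapping: 'aa' fits twice in 'aaaa', not three times.
print("aaaa".count("aa"))   # 2

# An empty substring matches between every character and at both ends.
print(a.count(""))          # 43
```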

Split String Into a List

A string is a sequence of characters; to split it into segments and put these in a list, use the split() method. It returns a list of the words or segments in the string, using sep as the delimiter. The separator is optional: if sep is not specified or is None, any run of whitespace (spaces, tabs, \n, \r, \f) acts as a separator and empty strings are removed from the result. To split on single spaces only, pass a quoted space as the separator; in that case empty strings are kept.

  =>< S.split([sep [,maxsplit]]) -> list of strings >
  "This is a string\nThis is another line".split()        # any whitespace: spaces and newlines
  "This is a string\nThis is another line".split(" ")     # only spaces
  "This is a string\nThis is another line".split("\n")    # only newlines

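A brief sketch of how the separator changes the result, and of the maxsplit argument from the signature above. The sample strings are made up for illustration.

```python
# With no separator, runs of whitespace collapse and no empty strings appear.
words = "a  b \n c".split()
print(words)          # ['a', 'b', 'c']

# With an explicit separator, every occurrence splits, so empty strings can appear.
pieces = "a  b".split(" ")
print(pieces)         # ['a', '', 'b']

# maxsplit limits the number of splits, counted from the left.
key_value = "key=value=with=equals".split("=", 1)
print(key_value)      # ['key', 'value=with=equals']
```
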
Reverse

The characters in a string can be reversed with a slice that has a step of -1, as in this example:

  print("stressed"[::-1])     # ==> desserts

Join Strings

You can join (concatenate) two or more strings with a plus sign, or by putting string literals next to each other. Separating them with a comma creates a tuple instead; when passed to print(), comma-separated strings are printed with a space between them.

  "Hello from " + "Sam"     # ==> 'Hello from Sam'
  "Hello from ""Sam"        # ==> 'Hello from Sam'
  "Hello from ","Sam"       # ==> ('Hello from ', 'Sam')

If you assign the strings to variables, you can add them to each other in any order you like. The result of an expression or function call can also be added with the plus sign, provided it is a string; other objects must first be converted with str().

  =>< "string" + str(object) >
  L = ["Carol", "Karl", "Ian", "Mary"]
  a = "hello  "
  print(a + "world")
  print(a + chr(33))      # 33 is the code point for "!"
  for x in L:
    print(a + x)

You can join string literals written on several lines by enclosing them in parentheses:

  ("Hello from "
  "Sam")    

The same can be achieved with a backslash at the end of the line:

  "Hello from " \
  "Sam" 

Triple quotes preserve the line breaks and spacing of a multiline string:

  """Hello from
    Sam             and Michael"""
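
When many strings have to be joined, the str.join() method is a common alternative to repeated use of the plus sign. It is not covered in the text above, so treat this as a supplementary sketch; the list of names is reused from the earlier example.

```python
names = ["Carol", "Karl", "Ian", "Mary"]

# The string that join() is called on is placed between the items.
line = ", ".join(names)
print(line)       # Carol, Karl, Ian, Mary

# join() accepts only strings, so convert other objects with str() first.
numbers = ", ".join(str(n) for n in [1, 2, 3])
print(numbers)    # 1, 2, 3

# An empty separator simply concatenates the pieces.
word = "".join(["Hel", "lo"])
print(word)       # Hello
```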

Compare Strings

To compare strings for equality in characters and length, use the == operator, which returns True or False. The is operator tests something different: whether two names refer to the same object. It may return True or False for equal strings depending on the implementation, so do not use it to compare values. Note that uppercase and lowercase characters are not equal in a comparison.

  print(144 == 12*12)                                   # True
  print("Hello World" == "Hello World")                 # True
  print("Hello World" == "H"+"ello"+" "+"W"+"orld")     # True, values are equal
  a = "Hello World"
  b = " ".join(["Hello", "World"])
  print(a is b)                                         # identity test, not a value comparison
  L = ["abcdef" == "abcde", "abcdef" == "abcdef", "abcdef" == "abcdefg", "abcdef" == "ABCDEF"]
  for x in L:
    if x:
      print(x, "Because strings are equal in character and length")
    else:
      print(x, "Because strings are not equal in either character or length or both")

Style of Strings

Capitalize a String

To capitalize the first character of a string, use the capitalize() method; the rest of the string is converted to lowercase. To capitalize every word in a string, use the title() method.

  =>< String.capitalize() >
  =>< String.title() >
  s = "everyday the sun shines on the world"
  s.title()          # ==> 'Everyday The Sun Shines On The World'
  s.capitalize()     # ==> 'Everyday the sun shines on the world'

Swap Case

To swap uppercase for lowercase in a string, use the swapcase() method. It returns a copy of the string with uppercase characters converted to lowercase and vice versa.

  =>< S.swapcase() -> string >
  "Hello".swapcase()            # ==> 'hELLO'
  "cYbEr_PuNk11".swapcase()     # ==> 'CyBeR_pUnK11'

Convert to Lowercase

To convert a string to lowercase, use the lower() method. It returns a copy of the string converted to lowercase. This is useful for making a search case-insensitive: convert both the pattern and the searched string to lowercase before searching.

  =>< S.lower() -> string >
  a = "hElLO"
  a.lower()     # ==> 'hello'
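
The case-insensitive search described above can be sketched as follows. `contains_ignore_case` is a hypothetical helper name, and the sample sentence is made up.

```python
def contains_ignore_case(text, pattern):
    """Return True if pattern occurs in text, ignoring letter case."""
    # Lowercase both sides so that 'SUN', 'Sun' and 'sun' all compare equal.
    return pattern.lower() in text.lower()

sentence = "Everyday the Sun shines on the World"
print(contains_ignore_case(sentence, "SUN"))     # True
print(contains_ignore_case(sentence, "world"))   # True
print(contains_ignore_case(sentence, "moon"))    # False
```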

Convert to Uppercase

To convert all characters in a string to uppercase, use the upper() method. It returns a copy of the string converted to uppercase.

  =>< S.upper() -> string >
  "hEllO".upper()     # ==> 'HELLO'

Another way to capitalize each word is to use the capwords() function from the string module.

  import string
  s = 'The quick brown fox jumped over the lazy dog.'
  string.capwords(s)

Centered

To center a string, use the center() method. The first argument is the total width of the result (the string plus the padding around it), and the optional second argument is the padding character (default is a space).

  =>< String.center(width[, padding character]) >
  "hello".center(4)           # width less than string length: string returned unchanged
  "hello".center(20)          # padded with spaces
  "hello".center(20, '*')     # padded with '*'

Trimmed

To remove trailing newline characters (\n, \r, \r\n), use the rstrip() method with the characters to remove as its argument. The argument is treated as a set of characters, and any trailing run of those characters is removed. If no argument is given, all trailing whitespace characters are removed.

  =>< string.rstrip([chars]) >
  "Whitespace    ".rstrip()               # trailing whitespace removed
  "Newline\n".rstrip()                    # trailing newline \n removed
  "hello\r\n".rstrip('\r')                # nothing removed, the string ends with \n, not \r
  "hello\n\r".rstrip('\n')                # nothing removed, the string ends with \r, not \n
  "Linefeed\r".rstrip('\r')               # trailing \r removed
  "Nothing \n there".rstrip('\n')         # nothing removed, no trailing \n
  "hello".rstrip("llo")                   # trailing l and o characters removed: 'he'
  "Multiple Linefeeds\r\n\r\n".rstrip()   # all trailing \r and \n removed
  "Nothing here\r\n\r\r\n".rstrip('')     # nothing removed, the character set is empty

Another way to remove characters from a string is to slice it, using square brackets and a colon to drop the last N characters. Slicing also lets you remove characters from the start of the string.

  =>< string[:-N] >
  "string\r\n"[:-1]         # remove \n only
  "string\n\r"[:-2]         # remove \n\r
  "string.txt"[:-4]         # remove the file extension
  "concentration"[3:-5]     # remove the first 3 and last 5 characters: 'centr'

To remove whitespace from both ends of a string (leading and trailing whitespace), use the strip() method. It returns a copy of the string with leading and trailing whitespace removed. If a chars argument is given and is not None, the characters in it are removed from both ends instead. The argument is a set of characters, not a substring: stripping stops at the first character that is not in the set.

  =>< S.strip([chars]) -> string >
  "    hello    ".strip()      # ==> 'hello'
  "\tgoodbye\r\n".strip()      # ==> 'goodbye'
  string = "too may cooks tried to spoil the broth"
  string.strip("too many * the broth")     # ==> 'cooks tried to spoil': the listed characters are stripped from both ends
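
The argument to strip() is easy to mistake for a substring. This sketch, using invented sample strings, shows that it behaves as a set of characters, and that lstrip() and rstrip() apply the same rule to one end only.

```python
# Every character in "w.com" is stripped from both ends until another character is met.
host = "www.example.com".strip("w.com")
print(host)      # example

# One-sided variants: lstrip() for the start, rstrip() for the end.
left = "xxhixx".lstrip("x")
right = "xxhixx".rstrip("x")
print(left)      # hixx
print(right)     # xxhi

# Characters inside the string are never touched.
inner = "--a-b--".strip("-")
print(inner)     # a-b
```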

Justified and Padded

A string can be right- or left-justified and padded using the rjust() or ljust() methods, which take the width and, optionally, the padding character as arguments. rjust() returns the string right-justified in a field of the given width, padded with the fill character (default is a space). If the width is smaller than the length of the string, the original string is returned unchanged.

  =>< S.rjust(width[, fillchar]) -> string >
  "hello".rjust(4)           # ==> 'hello'
  "hello".rjust(20)          # ==> '               hello'
  "hello".rjust(20, '*')     # ==> '***************hello'

ljust() works the same way for left justification. It returns the string left-justified in a field of the given width, padded with the fill character (default is a space).

  =>< S.ljust(width[, fillchar]) -> string >
  print("hello".ljust(4))           # width less than string length: string unchanged
  print("hello".ljust(20))
  print("hello".ljust(20, '*'))