How can I get the position of a character inside a string in Python?
bad_coder
asked Feb 19, 2010 at 6:32
There are two string methods for this: find() and index(). The difference between the two is what happens when the search string isn't found: find() returns -1, and index() raises a ValueError.
Using find()
>>> myString = 'Position of a character'
>>> myString.find('s')
2
>>> myString.find('x')
-1
Using index()
>>> myString = 'Position of a character'
>>> myString.index('s')
2
>>> myString.index('x')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: substring not found
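From the Python manual (these are the Python 2 string module functions; the str methods s.find(sub[, start[, end]]) and s.index(sub[, start[, end]]) behave the same way):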
string.find(s, sub[, start[, end]])
Return the lowest index in s where the substring sub is found such that sub is wholly contained in s[start:end]. Return -1 on failure. Defaults for start and end and interpretation of negative values are the same as for slices.
And:
string.index(s, sub[, start[, end]])
Like find() but raise ValueError when the substring is not found.
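To make the difference concrete, here is a minimal sketch that wraps each method in a small helper. The helper names position_via_find and position_via_index are hypothetical and only serve the illustration; both return None when the substring is absent.

```python
def position_via_find(text, sub):
    """Return the index of the first occurrence of sub, or None if absent."""
    pos = text.find(sub)  # find() signals "not found" with -1
    return None if pos == -1 else pos


def position_via_index(text, sub):
    """Same result, built on index(), which raises ValueError instead."""
    try:
        return text.index(sub)
    except ValueError:
        return None


my_string = 'Position of a character'
print(position_via_find(my_string, 's'))   # 2
print(position_via_index(my_string, 'x'))  # None
```

Which style reads better is a matter of taste: find() suits a quick check against -1, while index() suits code where a missing substring is a genuine error.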
Tomerikoo
answered Feb 19, 2010 at 6:35
Eli Bendersky
Just for the sake of completeness, if you need to find all positions of a character in a string, you can do the following:
s = 'shak#spea#e'
c = '#'
print([pos for pos, char in enumerate(s) if char == c])
which will print: [4, 9]
Jolbas
answered Sep 26, 2015 at 7:59
Salvador Dali
>>> s="mystring"
>>> s.index("r")
4
>>> s.find("r")
4
"Long-winded" way
>>> for i, c in enumerate(s):
...     if "r" == c: print(i)
...
4
To get a substring:
>>> s="mystring"
>>> s[4:10]
'ring'
answered Feb 19, 2010 at 6:36
ghostdog74
Just for completeness: when I want to find the extension in a file name in order to check it, I need to find the last '.'. In that case, use rfind():
path = 'toto.titi.tata..xls'
path.find('.')
4
path.rfind('.')
15
In my case, I use the following, which works as long as the complete file name contains at least one '.':
filename_without_extension = complete_name[:complete_name.rfind('.')]
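One caveat with the slice above: rfind() returns -1 when there is no '.', and complete_name[:-1] then silently drops the last character. The sketch below guards against that case; strip_extension is a hypothetical helper name, and os.path.splitext from the standard library is shown as an alternative that gives the same result for this example name.

```python
import os.path


def strip_extension(complete_name):
    """Drop everything from the last '.' onward; leave dot-free names alone."""
    dot = complete_name.rfind('.')
    # rfind() returns -1 when there is no '.'; slicing with [:-1] would then
    # chop off the last character, so return the name unchanged instead.
    return complete_name if dot == -1 else complete_name[:dot]


print(strip_extension('toto.titi.tata..xls'))      # toto.titi.tata.
print(strip_extension('README'))                   # README
print(os.path.splitext('toto.titi.tata..xls')[0])  # toto.titi.tata.
```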
answered Sep 28, 2017 at 6:37
A.Joly
What happens when the string contains a duplicate character?
In my experience, index() returns the same index for every duplicate, namely that of the first occurrence.
For example:
s = 'abccde'
for c in s:
    print('%s, %d' % (c, s.index(c)))
would print:
a, 0
b, 1
c, 2
c, 2
d, 4
e, 5
In that case you can use enumerate() instead:
for i, character in enumerate(s):
    # i is the position of the character in the string
    print('%s, %d' % (character, i))
answered Jul 1, 2015 at 12:40
DimSarak
string.find(character)
string.index(character)
Perhaps you’d like to have a look at the documentation to find out what the difference between the two is.
Brad Koch
answered Feb 19, 2010 at 6:37
John Machin
A character might appear multiple times in a string. For example, in the string sentence, the positions of e are 1, 4, 7 (because indexing starts from zero). However, both find() and index() return only the first position of a character. This can be solved as follows:
def charposition(string, char):
    pos = []  # list to store positions for each 'char' in 'string'
    for n in range(len(string)):
        if string[n] == char:
            pos.append(n)
    return pos

s = "sentence"
print(charposition(s, 'e'))
# Output: [1, 4, 7]
answered Sep 16, 2018 at 9:33
itssubas
If you want to find the first match.
Python has a built-in string method that does the work: index().
string.index(value, start, end)
Where:
- value: (Required) The value to search for.
- start: (Optional) Where to start the search. Default is 0.
- end: (Optional) Where to end the search. Default is to the end of the string.
def character_index():
    string = "Hello World! This is an example sentence with no meaning."
    match = "i"
    return string.index(match)

print(character_index())
# 15
If you want to find all the matches.
Let's say you need all the indexes where the character match is and not just the first one.
The pythonic way would be to use enumerate().
def character_indexes():
    string = "Hello World! This is an example sentence with no meaning."
    match = "i"
    indexes_of_match = []
    for index, character in enumerate(string):
        if character == match:
            indexes_of_match.append(index)
    return indexes_of_match

print(character_indexes())
# [15, 18, 42, 53]
Or even better with a list comprehension:
def character_indexes_comprehension():
    string = "Hello World! This is an example sentence with no meaning."
    match = "i"
    return [index for index, character in enumerate(string) if character == match]

print(character_indexes_comprehension())
# [15, 18, 42, 53]
answered Jan 26, 2021 at 5:01
Guzman Ojero
more_itertools.locate is a third-party tool that finds all indices of items that satisfy a condition.
Here we find all index locations of the letter "i".
Given
import more_itertools as mit
text = "supercalifragilisticexpialidocious"
search = lambda x: x == "i"
Code
list(mit.locate(text, search))
# [8, 13, 15, 18, 23, 26, 30]
answered Feb 9, 2018 at 0:46
pylang
Most methods I found refer to finding the first substring in a string. To find all the substrings, you need a workaround.
For example:
Define the string
text = 'iloveyoutosimidaandilikeyou'
Define the substring
key = 'you'
Define a function that can find the location for all the substrings within the string
def find_all_loc(text, key):
    pos = []
    start = 0
    end = len(text)
    while True:
        loc = text.find(key, start, end)
        if loc == -1:  # compare with ==, not "is": identity of ints is an implementation detail
            break
        else:
            pos.append(loc)
            start = loc + len(key)
    return pos
pos = find_all_loc(text, key)
print(pos)
[5, 24]
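The function above advances by len(key) after each hit, so it reports non-overlapping matches only. If overlapping matches matter, one possible variant (a sketch; find_all_overlapping is a hypothetical name) steps forward by a single character instead:

```python
def find_all_overlapping(text, key):
    """Return every index where key starts in text, overlaps included."""
    if not key:
        return []  # assumption: an empty key is treated as "no match"
    positions = []
    start = text.find(key)
    while start != -1:
        positions.append(start)
        # Step forward by one character instead of len(key) so that
        # matches sharing characters with the previous one are not skipped.
        start = text.find(key, start + 1)
    return positions


print(find_all_overlapping('aaaa', 'aa'))                          # [0, 1, 2]
print(find_all_overlapping('iloveyoutosimidaandilikeyou', 'you'))  # [5, 24]
```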
Emi OB
answered Nov 5, 2021 at 8:44
A solution with numpy for quick access to all indexes:
import numpy as np
string_array = np.array(list(my_string))
char_indexes = np.where(string_array == 'C')[0]  # np.where returns a tuple of arrays; [0] is the index array
answered Jan 15, 2020 at 20:40
Seb
Given a string and a character, your task is to find the first position of the character in the string using Python. Problems of this type are common in competitive programming, where you need to locate the position of a character in a string. Let's discuss a few methods to solve the problem.
Method 1: Get the position of a character in Python using rfind()
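Python String rfind() method returns the highest index of the substring if found in the given string, that is, the position of its last occurrence (which is also the first position only when the character occurs once). If not found then it returns -1.
string = 'Geeks'
letter = 'k'
print(string.rfind(letter))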
Method 2: Get the position of a character in Python using regex
The re.search() function returns None if the pattern doesn't match; otherwise it returns a match object that contains information about the matching part of the string. It stops after the first match.
import re

string = 'Geeksforgeeks'
pattern = 'for'
match = re.search(pattern, string)
print("starting index", match.start())
print("start and end index", match.span())
Output
starting index 5
start and end index (5, 8)
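Two details are easy to miss with re.search(): it returns None when nothing matches, so calling .start() on the result can fail, and characters such as '.' or '+' have special meaning in a pattern. A minimal sketch that handles both follows; first_position is a hypothetical helper name, and returning -1 on failure is just a convention chosen here to mirror str.find().

```python
import re


def first_position(text, literal):
    """Return the index of the first occurrence of literal, or -1 if absent."""
    # re.escape() makes characters such as '.' or '+' match literally.
    match = re.search(re.escape(literal), text)
    # re.search() returns None when nothing matches, so check before .start().
    return match.start() if match is not None else -1


print(first_position('Geeksforgeeks', 'for'))  # 5
print(first_position('a+b.c', '.'))            # 3
print(first_position('Geeks', 'z'))            # -1
```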
Method 3: Get the position of a character in Python using index()
This method raises a ValueError if the character is not present.
ini_string1 = 'xyze'
c = "b"
print("initial_strings : ", ini_string1, "\ncharacter_to_find : ", c)
try:
    res = ini_string1.index(c)
    # res + 1 reports a 1-based position rather than the 0-based index
    print("Character {} in string {} is present at {}".format(c, ini_string1, str(res + 1)))
except ValueError as e:
    print("No such character available in string {}".format(ini_string1))
Output:
initial_strings :  xyze
character_to_find :  b
No such character available in string xyze
Method 4: Get the position of a character in Python using the loop
In this example, we will use the Python loop to find the position of a character in a given string.
ini_string = 'abcdef'
c = "b"
print("initial_string : ", ini_string, "\ncharacter_to_find : ", c)
res = None
for i in range(0, len(ini_string)):
    if ini_string[i] == c:
        res = i + 1  # 1-based position
        break
if res is None:
    print("No such character available in string")
else:
    print("Character {} is present at {}".format(c, str(res)))
Output:
initial_string :  abcdef
character_to_find :  b
Character b is present at 2
Time Complexity: O(n), where n is length of string.
Auxiliary Space: O(1)
Method 5: Get the position of a character in Python using str.find
This method returns -1 in case the character is not present.
ini_string = 'abcdef'
ini_string2 = 'xyze'
c = "b"
print("initial_strings : ", ini_string, " ", ini_string2, "\ncharacter_to_find : ", c)
res1 = ini_string.find(c)
res2 = ini_string2.find(c)
if res1 == -1:
    print("No such character available in string {}".format(ini_string))
else:
    print("Character {} in string {} is present at {}".format(c, ini_string, str(res1 + 1)))
if res2 == -1:
    print("No such character available in string {}".format(ini_string2))
else:
    print("Character {} in string {} is present at {}".format(c, ini_string2, str(res2 + 1)))
Output:
initial_strings :  abcdef   xyze
character_to_find :  b
Character b in string abcdef is present at 2
No such character available in string xyze
Method 6: Get the position of a character in Python using a generator expression and the next function
This method uses a generator expression over enumerate() to produce the index of every character that matches the one we are searching for. The next function then returns the first such index. If the character is not found, next raises StopIteration, which the function catches in order to return -1.
def find_position(string, char):
    try:
        return 1 + next(i for i, c in enumerate(string) if c == char)
    except StopIteration:
        return -1

string = 'xybze'
char = 'b'
print("initial_strings : ", string)
print("character_to_find : ", char)
print(find_position(string, char))
Output
initial_strings :  xybze
character_to_find :  b
3
Time complexity: O(n)
Auxiliary Space: O(1), since the generator expression yields indices lazily instead of building a list
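As a possible simplification of Method 6, next() accepts a default value as its second argument, which avoids the try/except entirely. The sketch below keeps the same 1-based convention and the same function name; using None as the sentinel is a choice made here for illustration.

```python
def find_position(string, char):
    """Return the 1-based position of char in string, or -1 if it is absent."""
    # The second argument to next() is returned when the generator is empty,
    # so no exception handling is needed.
    index = next((i for i, c in enumerate(string) if c == char), None)
    return -1 if index is None else index + 1


print(find_position('xybze', 'b'))  # 3
print(find_position('xybze', 'q'))  # -1
```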
Last Updated :
21 Mar, 2023
When you’re working with a Python program, you might need to search for and locate a specific string inside another string.
This is where Python’s built-in string methods come in handy.
In this article, you will learn how to use Python's built-in find() string method to help you search for a substring inside a string.
Here is what we will cover:
- Syntax of the find() method
- How to use find() with no start and end parameters example
- How to use find() with start and end parameters example
- Substring not found example
- Is the find() method case-sensitive?
- find() vs in keyword
- find() vs index()
The find() Method: A Syntax Overview
The find() string method is built into Python, so there is nothing to import.
It takes a substring as input and finds its index, that is, the position of the substring inside the string you call the method on.
The general syntax for the find() method looks something like this:
string_object.find("substring", start_index_number, end_index_number)
Let's break it down:
- string_object is the original string you are working with and the string you will call the find() method on. This could be any word you want to search through.
- The find() method takes three parameters: one required and two optional.
- "substring" is the first parameter and the only required one. This is the substring you are trying to find inside string_object. Make sure to include quotation marks.
- start_index_number is the second parameter and it's optional. It specifies the starting index, the position from which the search will start. The default value is 0.
- end_index_number is the third parameter and it's also optional. It specifies the end index, where the search will stop. The default is the length of the string.
- Both the start_index_number and the end_index_number specify the range over which the search will take place and they narrow the search down to a particular section.
The return value of the find() method is an integer value.
If the substring is present in the string, find() returns the index, or the character position, of the first occurrence of the specified substring from that given string.
If the substring you are searching for is not present in the string, then find() will return -1. It will not throw an exception.
How to Use find() with No Start and End Parameters Example
The following examples illustrate how to use the find() method using the only required parameter, the substring you want to search for.
You can take a single word and search to find the index number of a specific letter:
fave_phrase = "Hello world!"
# find the index of the letter 'w'
search_fave_phrase = fave_phrase.find("w")
print(search_fave_phrase)
#output
# 6
I created a variable named fave_phrase and stored the string Hello world!.
I called the find() method on the variable containing the string and searched for the letter 'w' inside Hello world!.
I stored the result of the operation in a variable named search_fave_phrase and then printed its contents to the console.
The return value was the index of w, which in this case was the integer 6.
Keep in mind that indexing in Python, as in most programming languages, starts at 0 and not 1.
How to Use find() with Start and End Parameters Example
Using the start and end parameters with the find() method lets you limit your search.
For example, if you wanted to find the index of the letter 'w' and start the search from position 3 and not earlier, you would do the following:
fave_phrase = "Hello world!"
# find the index of the letter 'w' starting from position 3
search_fave_phrase = fave_phrase.find("w",3)
print(search_fave_phrase)
#output
# 6
Since the search starts at position 3, the return value is the index of the first 'w' found at or after that position.
You can also narrow the search further with the end parameter:
fave_phrase = "Hello world!"
# find the index of the letter 'w' between the positions 3 and 8
search_fave_phrase = fave_phrase.find("w",3,8)
print(search_fave_phrase)
#output
# 6
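A small sketch of how the bounds behave, reusing the same phrase: the end index is exclusive, as with slices, and negative values count from the end of the string.

```python
fave_phrase = "Hello world!"

# 'w' sits at index 6. The end index is exclusive, like a slice bound,
# so a search over positions 3..5 does not see it.
print(fave_phrase.find("w", 3, 6))   # -1
print(fave_phrase.find("w", 3, 7))   # 6

# Negative values count from the end of the string, as they do in slices.
print(fave_phrase.find("w", -6))     # 6
```

Note that the returned index is always relative to the start of the whole string, not to the start of the searched range.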
Substring Not Found Example
As mentioned earlier, if the substring you specify with find() is not present in the string, then the output will be -1 and not an exception.
fave_phrase = "Hello world!"
# search for the index of the letter 'a' in "Hello world"
search_fave_phrase = fave_phrase.find("a")
print(search_fave_phrase)
# -1
Is the find() Method Case-Sensitive?
What happens if you search for a letter in a different case?
fave_phrase = "Hello world!"
#search for the index of the letter 'W' capitalized
search_fave_phrase = fave_phrase.find("W")
print(search_fave_phrase)
#output
# -1
In an earlier example, I searched for the index of the letter w in the phrase "Hello world!" and the find() method returned its position.
In this case, searching for the letter W capitalized returns -1, meaning the letter is not present in the string.
So, when searching for a substring with the find() method, remember that the search will be case-sensitive.
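If a case-insensitive search is what you need, one common approach is to normalise both strings before calling find(). The sketch below uses str.casefold(); find_ignore_case is a hypothetical helper name, and it assumes that casefolding does not change the length of the text, which holds for ASCII input but not for every Unicode string.

```python
def find_ignore_case(text, sub):
    """Return the index of sub in text, ignoring letter case, or -1."""
    # casefold() is a more aggressive lower() intended for caseless matching.
    # Assumption: casefolding keeps the text length unchanged, so the index
    # found in the folded text is also valid for the original text.
    return text.casefold().find(sub.casefold())


print(find_ignore_case("Hello world!", "W"))  # 6
print(find_ignore_case("Hello world!", "A"))  # -1
```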
The find() Method vs the in Keyword: What's the Difference?
Use the in keyword to check if the substring is present in the string in the first place.
The general syntax for the in keyword is the following:
substring in string
The in keyword returns a Boolean value, a value that is either True or False.
>>> "w" in "Hello world!"
True
The in operator returns True when the substring is present in the string.
And if the substring is not present, it returns False:
>>> "a" in "Hello world!"
False
Using the in keyword is a helpful first step before using the find() method.
You first check to see if a string contains a substring, and then you can use find() to find the position of the substring. That way, you know for sure that the substring is present.
So, use find() to find the index position of a substring inside a string, and not to check whether the substring is present in the string.
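One reason not to use find() as a presence test is a truthiness pitfall, sketched below: a match at the start of the string returns 0, which is falsy, while a failed search returns -1, which is truthy.

```python
fave_phrase = "Hello world!"

# Pitfall: find() returns 0 for a match at the start (falsy) and -1 for
# no match (truthy), so the result must not be used directly as a condition.
print(bool(fave_phrase.find("H")))   # False, although "H" is present
print(bool(fave_phrase.find("a")))   # True, although "a" is absent

# Either test membership with `in`, or compare the result with -1.
position = fave_phrase.find("H")
if position != -1:
    print("found at", position)      # found at 0
```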
The find() Method vs the index() Method: What's the Difference?
Similar to the find() method, the index() method is a string method used for finding the index of a substring inside a string.
So, both methods work in the same way.
The difference between the two methods is that the index() method raises an exception when the substring is not present in the string, in contrast to the find() method, which returns -1.
fave_phrase = "Hello world!"
# search for the index of the letter 'a' in 'Hello world!'
search_fave_phrase = fave_phrase.index("a")
print(search_fave_phrase)
#output
# Traceback (most recent call last):
# File "/Users/dionysialemonaki/python_article/demopython.py", line 4, in <module>
# search_fave_phrase = fave_phrase.index("a")
# ValueError: substring not found
The example above shows that index() throws a ValueError when the substring is not present.
You may want to use find() over index() when you don't want to deal with catching and handling any exceptions in your programs.
Conclusion
And there you have it! You now know how to search for a substring in a string using the find() method.
I hope you found this tutorial helpful.
To learn more about the Python programming language, check out freeCodeCamp’s Python certification.
You’ll start from the basics and learn in an interactive and beginner-friendly way. You’ll also build five projects at the end to put into practice and help reinforce your understanding of the concepts you learned.
Thank you for reading, and happy coding!
Basic operations
# Concatenation (addition)
>>> s1 = 'spam'
>>> s2 = 'eggs'
>>> print(s1 + s2)
spameggs
# String repetition
>>> print('spam' * 3)
spamspamspam
# String length
>>> len('spam')
4
# Access by index
>>> S = 'spam'
>>> S[0]
's'
>>> S[2]
'a'
>>> S[-2]
'a'
# Slice
>>> s = 'spameggs'
>>> s[3:5]
'me'
>>> s[2:-2]
'ameg'
>>> s[:6]
'spameg'
>>> s[1:]
'pameggs'
>>> s[:]
'spameggs'
# Slice step
>>> s[::-1]
'sggemaps'
>>> s[3:5:-1]
''
>>> s[2::2]
'aeg'
Other string functions and methods
# String literals
S = 'str'; S = "str"; S = '''str'''; S = """str"""
# Escape sequences
S = "s\np\ta\nbbb"
# Raw strings (suppress escaping)
S = r"C:\temp\new"
# Byte string
S = b"byte"
# Concatenation (adding strings)
S1 + S2
# String repetition
S1 * 3
# Access by index
S[i]
# Slice extraction
S[i:j:step]
# String length
len(S)
# Search for a substring. Returns the index of the first occurrence, or -1
S.find(str, [start],[end])
# Search for a substring. Returns the index of the last occurrence, or -1
S.rfind(str, [start],[end])
# Search for a substring. Returns the index of the first occurrence, or raises ValueError
S.index(str, [start],[end])
# Search for a substring. Returns the index of the last occurrence, or raises ValueError
S.rindex(str, [start],[end])
# Replace a pattern
S.replace(old, new)
# Split the string on a separator
S.split(sep)
# Does the string consist of digits
S.isdigit()
# Does the string consist of letters
S.isalpha()
# Does the string consist of digits or letters
S.isalnum()
# Does the string consist of lowercase characters
S.islower()
# Does the string consist of uppercase characters
S.isupper()
# Does the string consist of whitespace characters (space, form feed '\f', newline '\n', carriage return '\r', horizontal tab '\t' and vertical tab '\v')
S.isspace()
# Do the words in the string start with a capital letter
S.istitle()
# Convert the string to uppercase
S.upper()
# Convert the string to lowercase
S.lower()
# Does the string S start with the pattern str
S.startswith(str)
# Does the string S end with the pattern str
S.endswith(str)
# Build a string from a list, using S as the separator
S.join(list)
# Character to its code (Unicode code point)
ord(char)
# Code to character
chr(number)
# Converts the first character to uppercase and all the others to lowercase
S.capitalize()
# Returns a centered string padded on both sides with the character fill (a space by default)
S.center(width, [fill])
# Returns the number of non-overlapping occurrences of the substring in the range [start, end] (0 and the string length by default)
S.count(str, [start],[end])
# Returns a copy of the string in which every tab character is replaced by one or more spaces, depending on the current column. If tabsize is not given, a tab size of 8 spaces is assumed
S.expandtabs([tabsize])
# Remove whitespace characters at the beginning of the string
S.lstrip([chars])
# Remove whitespace characters at the end of the string
S.rstrip([chars])
# Remove whitespace characters at the beginning and at the end of the string
S.strip([chars])
# Returns a tuple containing the part before the first separator, the separator itself, and the part after it. If the separator is not found, returns a tuple containing the string itself followed by two empty strings
S.partition(sep)
# Returns a tuple containing the part before the last separator, the separator itself, and the part after it. If the separator is not found, returns a tuple containing two empty strings followed by the string itself
S.rpartition(sep)
# Converts lowercase characters to uppercase and uppercase characters to lowercase
S.swapcase()
# Converts the first letter of every word to uppercase and all the others to lowercase
S.title()
# Makes the string at least width long, padding the beginning with zeros if necessary
S.zfill(width)
# Makes the string at least width long, padding the end with the character fillchar if necessary
S.ljust(width, fillchar=" ")
# Makes the string at least width long, padding the beginning with the character fillchar if necessary
S.rjust(width, fillchar=" ")
String formatting
S.format(*args, **kwargs)
Examples
Python: Finding the position of a substring (the str.find and str.rfind functions)
Finding the position of a substring in a string with the str.find and str.rfind functions.
In [1]: str = 'ftp://dl.dropbox.com/u/7334460/Magick_py/py_magick.pdf'
The str.find function returns the first occurrence of the substring. All positions are reported relative to the beginning of the string.
In [2]: str.find('/')
Out[2]: 4
In [3]: str[4]
Out[3]: '/'
You can also search within a slice. The first number gives the start of the slice in which the search is performed, and the second number gives its end. If the substring does not occur, -1 is returned.
In [4]: str.find('/', 8, 18)
Out[4]: -1
In [5]: str[8:18]
Out[5]: '.dropbox.c'
In [6]: str.find('/', 8, 22)
Out[6]: 20
In [7]: str[8:22]
Out[7]: '.dropbox.com/u'
In [8]: str[20]
Out[8]: '/'
The str.rfind function searches from the end of the string, but returns the position of the substring relative to the beginning of the string.
In [9]: str.rfind('/')
Out[9]: 40
In [10]: str[40]
Out[10]: '/'
Python: Extracting the file name from a URL
I needed to cut out of a URL everything that comes after the last slash, i.e. the file name. The URL can be anything. I know the task can easily be solved with a dedicated module, but I wanted to avoid that. There are at least two ways to handle the problem.
Method 1
A fairly simple way. Split the string on slashes with the split() function, which returns a list. Then take the last element of that list. It will be the file name.
In [1]: str = 'http://dl.dropbox.com/u/7334460/Magick_py/py_magick.pdf'
In [2]: str.split('/')
Out[2]: ['http:', '', 'dl.dropbox.com', 'u', '7334460', 'Magick_py', 'py_magick.pdf']
Let's repeat the step, assigning the result to a variable:
In [3]: file_name = str.split('/')[-1]
In [4]: file_name
Out[4]: 'py_magick.pdf'
Method 2
The second way is more interesting. First, use the rfind() function to find the first occurrence of the target substring counting from the end. The function returns the position of the substring relative to the beginning of the string. Then simply take a slice.
In [5]: str = 'http://dl.dropbox.com/u/7334460/Magick_py/py_magick.pdf'
In [6]: str.rfind('/')
Out[6]: 41
Take the slice:
In [7]: file_name = str[42:]
In [8]: file_name
Out[8]: 'py_magick.pdf'
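Method 2 hard-codes the slice start (42) from the previous output. A minimal sketch of the same idea without the magic number follows; the variable name url is an assumption made here to avoid shadowing the built-in str, and rpartition() is shown as a closely related built-in alternative.

```python
url = 'http://dl.dropbox.com/u/7334460/Magick_py/py_magick.pdf'

# rpartition() splits around the last '/', returning (head, sep, tail).
head, sep, file_name = url.rpartition('/')
print(file_name)  # py_magick.pdf

# The same idea with rfind(), computing the slice start instead of typing 42.
file_name_2 = url[url.rfind('/') + 1:]
print(file_name_2)  # py_magick.pdf
```

When the string contains no '/', rfind() returns -1, so the slice starts at 0 and the whole string is returned, which matches what rpartition() gives as its tail.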
In this post, you’ll learn how to find an index of a substring in a string, whether it’s the first substring or the last substring. You’ll also learn how to find every index of a substring in a string.
Knowing how to work with strings is an important skill in your Python journey. You’ll learn how to create a list of all the index positions where that substring occurs.
How to Use Python to Find the First Index of a Substring in a String
If all you want to do is find the first index of a substring in a Python string, you can do this easily with the str.index() method. This method comes built into Python, so there's no need to import any packages.
Let’s take a look at how you can use Python to find the first index of a substring in a string:
a_string = "the quick brown fox jumps over the lazy dog. the quick brown fox jumps over the lazy dog"
# Find the first index of 'the'
index = a_string.index('the')
print(index)
# Returns: 0
We can see here that the .index() method takes the substring that we're looking for as its parameter. When we apply the method to our string a_string, it returns 0. This means that the substring begins at index position 0 of our string (i.e., it's the first word).
Let’s take a look at how you can find the last index of a substring in a Python string.
How to Use Python to Find the Last Index of a Substring in a String
There may be many times when you want to find the last index of a substring in a Python string. To accomplish this, we cannot use the .index() string method. However, Python comes built in with a string method that searches right to left, meaning it'll return the furthest right index. This is the .rindex() method.
Let's see how we can use the str.rindex() method to find the last index of a substring in Python:
a_string = "the quick brown fox jumps over the lazy dog. the quick brown fox jumps over the lazy dog"
# Find the last index of 'the'
index = a_string.rindex('the')
print(index)
# Returns: 76
In the example above, we applied the .rindex() method to the string to return the position of the last occurrence of our substring.
How to Use Regular Expression (Regex) finditer to Find All Indices of a Substring in a Python String
The above examples returned different indices, but each returned only a single index. There may be other times when you may want to return all indices of a substring in a Python string.
For this, we'll use Python's built-in regular expression module, re. In particular, we'll use the finditer function, which returns an iterator over all non-overlapping matches of a pattern in a string.
Let's see how we can use regular expressions to find all indices of a substring in a Python string:
import re
a_string = "the quick brown fox jumps over the lazy dog. the quick brown fox jumps over the lazy dog"
# Find all indices of 'the'
indices_object = re.finditer(pattern='the', string=a_string)
indices = [index.start() for index in indices_object]
print(indices)
# Returns: [0, 31, 45, 76]
This example has a few more moving parts. Let's break down what we've done step by step:
- We imported re and set up our variable a_string just as before
- We then use re.finditer to create an iterable object containing all the matches
- We then created a list comprehension to find the .start() value, meaning the starting index position of each match, within that
- Finally, we printed our list of index start positions
In the next section, you’ll learn how to use a list comprehension in Python to find all indices of a substring in a string.
How to Use a Python List Comprehension to Find All Indices of a Substring in a String
Let’s take a look at how you can find all indices of a substring in a string in Python without using the regular expression library. We’ll accomplish this by using a list comprehension.
Let’s see how we can accomplish this using a list comprehension:
a_string = "the quick brown fox jumps over the lazy dog. the quick brown fox jumps over the lazy dog"
# Find all indices of 'the'
indices = [index for index in range(len(a_string)) if a_string.startswith('the', index)]
print(indices)
# Returns: [0, 31, 45, 76]
Let's take a look at how this list comprehension works:
- We iterate over the numbers from 0 up to (but not including) the length of the string
- We include that index if the string, read from that index onwards, begins with our substring
- We get a list returned of all the positions where that substring occurs in our string
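The two approaches are not fully equivalent, which the sample sentence happens to hide: startswith() is tested at every index and therefore reports overlapping matches, while re.finditer() only reports non-overlapping ones. A small sketch with a deliberately overlapping input (the strings "aaaa" and "aa" are chosen here purely for illustration):

```python
import re

text = "aaaa"
sub = "aa"

# startswith() is tested at every index, so overlapping matches are reported.
by_startswith = [i for i in range(len(text)) if text.startswith(sub, i)]
print(by_startswith)  # [0, 1, 2]

# re.finditer() resumes after the end of each match: non-overlapping only.
by_finditer = [m.start() for m in re.finditer(re.escape(sub), text)]
print(by_finditer)    # [0, 2]

# A zero-width lookahead makes the regex report overlapping starts as well.
by_lookahead = [m.start() for m in re.finditer(f"(?={re.escape(sub)})", text)]
print(by_lookahead)   # [0, 1, 2]
```

Which behaviour is wanted depends on the task, so it is worth deciding explicitly whether overlapping matches should count.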
In the final section of this tutorial, you’ll learn how to build a custom function to return the indices of all substrings in our Python string.
How to Build a Custom Function to Find All Indices of a Substring in a String in Python
Now that you’ve learned two different methods to return all indices of a substring in Python string, let’s learn how we can turn this into a custom Python function.
Why would we want to do this? Neither of the methods demonstrated above makes it immediately clear to a reader what it accomplishes. This is where a function comes in handy, since it lets a future reader (who may very well be you!) know what your code is doing.
Let’s get started!
# Create a custom function to return the indices of all substrings in a Python string
a_string = "the quick brown fox jumps over the lazy dog. the quick brown fox jumps over the lazy dog"
def find_indices_of_substring(full_string, sub_string):
    return [index for index in range(len(full_string)) if full_string.startswith(sub_string, index)]
indices = find_indices_of_substring(a_string, 'the')
print(indices)
# Returns: [0, 31, 45, 76]
In this sample custom function, we used our list comprehension method of finding the indices of all substrings. The reason for this is that it does not create any additional dependencies.
Conclusion
In this post, you learned how to use Python to find the first index, the last index, and all indices of a substring in a string. You learned how to do this with regular string methods, with regular expressions, list comprehensions, as well as a custom built function.
To learn more about the re.finditer() method, check out the official documentation.