How do I search for items that contain the string 'abc' in the following list?
xs = ['abc-123', 'def-456', 'ghi-789', 'abc-456']
The following checks if 'abc' is in the list, but does not detect 'abc-123' and 'abc-456':
if 'abc' in xs:
asked Jan 30, 2011 at 13:29
To check for the presence of 'abc' in any string in the list:
xs = ['abc-123', 'def-456', 'ghi-789', 'abc-456']
if any("abc" in s for s in xs):
    ...
To get all the items containing 'abc':
matching = [s for s in xs if "abc" in s]
answered Jan 30, 2011 at 13:32
Sven Marnach
Just throwing this out there: if you happen to need to match against more than one string, for example abc and def, you can combine two comprehensions as follows:
my_list = ['abc-123', 'def-456', 'ghi-789', 'abc-456']
matchers = ['abc', 'def']
matching = [s for s in my_list if any(m in s for m in matchers)]
Output:
['abc-123', 'def-456', 'abc-456']
answered Aug 3, 2014 at 6:00
fantabolous
Use filter to get all the elements that have 'abc':
>>> xs = ['abc-123', 'def-456', 'ghi-789', 'abc-456']
>>> list(filter(lambda x: 'abc' in x, xs))
['abc-123', 'abc-456']
One can also use a list comprehension:
>>> [x for x in xs if 'abc' in x]
answered Jan 30, 2011 at 13:34
MAK
If you just need to know if 'abc' is in one of the items, this is the shortest way:
if 'abc' in str(my_list):
Note: this assumes 'abc' is alphanumeric text. Do not use it if the search string could consist of characters that str() adds when formatting the list (e.g. [, ], ', a comma or a space).
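To make the caveat concrete, here is a small sketch (my own illustration, reusing the list from the question) of a case where the str() shortcut reports a match that no individual item contains:

```python
# The text form of a list contains brackets, quotes, commas and spaces
# that are not part of any item.
my_list = ['abc-123', 'def-456', 'ghi-789', 'abc-456']

as_text = str(my_list)  # "['abc-123', 'def-456', 'ghi-789', 'abc-456']"

# Alphanumeric needle: both approaches agree.
print('abc' in as_text)                  # True
print(any('abc' in s for s in my_list))  # True

# Needle made of list punctuation: str() reports a match that no item has.
print(', ' in as_text)                   # True
print(any(', ' in s for s in my_list))   # False
```

If the search string can come from user input, the any(...) form is the safer of the two.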
answered Apr 13, 2016 at 8:19
RogerS
This is quite an old question, but I offer this answer because the previous answers do not cope with items in the list that are not strings (or some kind of iterable object). Such items would cause the entire list comprehension to fail with an exception.
To gracefully deal with such items in the list by skipping the non-iterable items, use the following (the Iterable class lives in collections.abc; the old collections.Iterable alias was removed in Python 3.10):
import collections.abc
[el for el in lst if isinstance(el, collections.abc.Iterable) and (st in el)]
then, with such a list:
lst = [None, 'abc-123', 'def-456', 'ghi-789', 'abc-456', 123]
st = 'abc'
you will still get the matching items (['abc-123', 'abc-456']).
The test for iterable may not be the best. Got it from here: In Python, how do I determine if an object is iterable?
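As a possible alternative to the iterable test, here is a minimal sketch that assumes only str items should ever be searched. The extra bytes and list items are my own additions for illustration: a bytes object is iterable but raises TypeError for `'abc' in b'abc'`, and a nested list such as ['abc'] would itself count as a match under the iterable test.

```python
# Assumption: only str items are candidates; everything else is skipped.
lst = [None, 'abc-123', 'def-456', 'ghi-789', 'abc-456', 123, b'abc', ['abc']]
st = 'abc'

# isinstance(el, str) is checked first, so `st in el` only runs on strings.
matching = [el for el in lst if isinstance(el, str) and st in el]
print(matching)  # ['abc-123', 'abc-456']
```

Whether to skip or reject non-string items depends on the data; this version silently skips them.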
answered Oct 20, 2011 at 13:24
Robert Muil
x = 'aaa'
L = ['aaa-12', 'bbbaaa', 'cccaa']
res = [y for y in L if x in y]
answered Jan 30, 2011 at 13:31
Mariy
for item in my_list:
    if item.find("abc") != -1:
        print(item)
answered Jan 30, 2011 at 13:38
Rubycon
any('abc' in item for item in mylist)
answered Jan 30, 2011 at 13:34
Imran
I am new to Python. I got the code below working and made it easy to understand:
my_list = ['abc-123', 'def-456', 'ghi-789', 'abc-456']
for item in my_list:
    if 'abc' in item:
        print(item)
answered Apr 7, 2018 at 7:52
Amol Manthalkar
Use the __contains__() method of Python's string class:
a = ['abc-123', 'def-456', 'ghi-789', 'abc-456']
for i in a:
    if i.__contains__("abc"):
        print(i, " is containing")
answered Feb 8, 2019 at 16:37
Harsh Lodhi
I needed the list indices that correspond to a match as follows:
lst=['abc-123', 'def-456', 'ghi-789', 'abc-456']
[n for n, x in enumerate(lst) if 'abc' in x]
output
[0, 3]
answered Jan 5, 2020 at 19:02
Grant Shannon
If you want to get the items that match any of several substrings, you can change it this way:
some_list = ['abc-123', 'def-456', 'ghi-789', 'abc-456']
# select element where "abc" or "ghi" is included
find_1 = "abc"
find_2 = "ghi"
result = [element for element in some_list if find_1 in element or find_2 in element]
# Output ['abc-123', 'ghi-789', 'abc-456']
answered Jul 14, 2020 at 2:43
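import re
mylist = ['abc', 'def', 'ghi', 'abc']
pattern = re.compile(r'abc')
# findall() expects a string, not a list, so test each item with search()
[s for s in mylist if pattern.search(s)]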
answered Jul 4, 2018 at 13:32
If the list also contains NaN, the following works for me:
import numpy as np
some_list = ['abc-123', 'def-456', 'ghi-789', 'abc-456', np.nan]
any("abc" in x for x in some_list if str(x) != 'nan')
answered Feb 18, 2021 at 2:38
Sam S.
my_list = ['abc-123', 'def-456', 'ghi-789', 'abc-456']
for item in my_list:
    if item.find('abc') != -1:
        print('Found at ', item)
answered Mar 16, 2018 at 9:14
I wrote a small search that asks for a value and then prints every item in the list that contains it:
my_list = ['abc-123',
           'def-456',
           'ghi-789',
           'abc-456'
           ]
imp = input('Search item: ')  # raw_input() on Python 2
for item in my_list:
    if imp in item:
        print(item)
Try searching for 'abc'.
answered Jan 26, 2019 at 2:44
def find_dog(new_ls):
    splt = new_ls.split()
    if 'dog' in splt:
        print("True")
    else:
        print('False')

find_dog("Is there a dog here?")
answered Jul 18, 2019 at 8:22
Question: get the items that contain abc
a = ['abc-123', 'def-456', 'ghi-789', 'abc-456']
aa = [ string for string in a if "abc" in string]
print(aa)
Output => ['abc-123', 'abc-456']
answered Jun 16, 2018 at 10:52
Soudipta Dutta
All the answers work but they always traverse the whole list. If I understand your question, you only need the first match. So you don’t have to consider the rest of the list if you found your first match:
mylist = ['abc123', 'def456', 'ghi789']
sub = 'abc'
next((s for s in mylist if sub in s), None) # returns 'abc123'
If the match is at the end of the list or for very small lists, it doesn’t make a difference, but consider this example:
import timeit
mylist = ['abc123'] + ['xyz123']*1000
sub = 'abc'
timeit.timeit('[s for s in mylist if sub in s]', setup='from __main__ import mylist, sub', number=100000)
# for me 7.949463844299316 with Python 2.7, 8.568840944994008 with Python 3.4
timeit.timeit('next((s for s in mylist if sub in s), None)', setup='from __main__ import mylist, sub', number=100000)
# for me 0.12696599960327148 with Python 2.7, 0.09955992100003641 with Python 3.4
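For reuse, the next(...) idiom can be wrapped in a small helper. This is only a sketch: first_match and its default parameter are names I made up, not anything from the standard library.

```python
def first_match(items, sub, default=None):
    """Return the first item containing sub, or default if none does."""
    # The generator is consumed lazily, so iteration stops at the first hit.
    return next((s for s in items if sub in s), default)


mylist = ['abc123', 'def456', 'ghi789']
print(first_match(mylist, 'def'))       # def456
print(first_match(mylist, 'zzz'))       # None
print(first_match(mylist, 'zzz', ''))   # prints an empty line
```

Without the second argument to next(), a search with no match would raise StopIteration instead of returning a default.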
Method #1: Using list comprehension
A list comprehension is a concise and readable way to build the filtered list. The task can be written as a plain loop, so it can be expressed as a list comprehension as well.
test_list = ['GeeksforGeeks', 'Geeky', 'Computers', 'Algorithms']
print("The original list is : " + str(test_list))
subs = 'Geek'
res = [i for i in test_list if subs in i]
print("All strings with given substring are : " + str(res))
Output
The original list is : ['GeeksforGeeks', 'Geeky', 'Computers', 'Algorithms']
All strings with given substring are : ['GeeksforGeeks', 'Geeky']
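Time complexity: O(n * m), where n is the length of test_list and m is the length of the longest string, because each substring check has to scan a string.
Auxiliary space: O(k), where k is the number of strings in the result list.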
Method #2: Using filter() + lambda
The filter() function can also perform this task with the help of a lambda. It keeps only the strings that contain the given substring, and list() collects them into a new list.
test_list = ['GeeksforGeeks', 'Geeky', 'Computers', 'Algorithms']
print("The original list is : " + str(test_list))
subs = 'Geek'
res = list(filter(lambda x: subs in x, test_list))
print("All strings with given substring are : " + str(res))
Output
The original list is : ['GeeksforGeeks', 'Geeky', 'Computers', 'Algorithms']
All strings with given substring are : ['GeeksforGeeks', 'Geeky']
Time complexity: O(n) where n is the number of elements in the test_list.
Auxiliary space: O(m) where m is the number of elements in the result list.
Method #3: Using re + search()
Regular expressions can be used to perform many tasks in Python, and they come in handy for this one too. re.search() checks each string for the substring, and the strings that produce a match are kept in the result.
import re
test_list = ['GeeksforGeeks', 'Geeky', 'Computers', 'Algorithms']
print("The original list is : " + str(test_list))
subs = 'Geek'
res = [x for x in test_list if re.search(subs, x)]
print("All strings with given substring are : " + str(res))
Output
The original list is : ['GeeksforGeeks', 'Geeky', 'Computers', 'Algorithms']
All strings with given substring are : ['GeeksforGeeks', 'Geeky']
Time complexity: O(n * m), where n is the length of the list and m is the length of the substring.
Auxiliary space: O(k), where k is the number of strings that contain the substring.
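One thing to watch with the regular expression approach: re.search() interprets the substring as a pattern, so characters such as . or + have special meaning. The sketch below uses a made-up list and substring to show the difference; re.escape() is the standard-library way to force a literal match.

```python
import re

# Hypothetical data chosen so that the substring contains a regex metacharacter.
test_list = ['a.c-1', 'abc-2', 'a+c-3']
subs = 'a.c'

# Unescaped: '.' matches any character, so all three strings match.
as_pattern = [x for x in test_list if re.search(subs, x)]

# Escaped: the pattern matches the literal text 'a.c', like `subs in x`.
as_literal = [x for x in test_list if re.search(re.escape(subs), x)]

print(as_pattern)  # ['a.c-1', 'abc-2', 'a+c-3']
print(as_literal)  # ['a.c-1']
```

For plain substring tests the in operator avoids the issue entirely.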
Method #4 : Using find() method
test_list = ['GeeksforGeeks', 'Geeky', 'Computers', 'Algorithms']
print("The original list is : " + str(test_list))
subs = 'Geek'
res = []
for i in test_list:
    if i.find(subs) != -1:
        res.append(i)
print("All strings with given substring are : " + str(res))
Output
The original list is : ['GeeksforGeeks', 'Geeky', 'Computers', 'Algorithms']
All strings with given substring are : ['GeeksforGeeks', 'Geeky']
Time complexity: O(n*m), where n is the length of the input list and m is the length of the substring to search for. The loop iterates through each element of the list and calls the find() method on each element, which has a time complexity of O(m) in the worst case.
Auxiliary space: O(k), where k is the number of elements in the result list. The result list res is created to store all the strings that contain the given substring. The maximum size of res is n, the length of the input list, if all elements contain the substring.
Method #5 : Using replace() and len() methods
test_list = ['GeeksforGeeks', 'Geeky', 'Computers', 'Algorithms']
print("The original list is : " + str(test_list))
subs = 'Geek'
res = []
for i in test_list:
    x = i.replace(subs, "")
    if len(x) != len(i):
        res.append(i)
print("All strings with given substring are : " + str(res))
Output
The original list is : ['GeeksforGeeks', 'Geeky', 'Computers', 'Algorithms']
All strings with given substring are : ['GeeksforGeeks', 'Geeky']
Method #6 : Using a try/except block and the index()
Here is an example of using a try/except block and the index() method to find strings with a given substring in a list:
res = []
test_list = ['GeeksforGeeks', 'Geeky', 'Computers', 'Algorithms']
subs = 'Geek'
for element in test_list:
    try:
        index = element.index(subs)
        res.append(element)
    except ValueError:
        pass
print("All strings with given substring are : " + str(res))
Output
All strings with given substring are : ['GeeksforGeeks', 'Geeky']
Time complexity: O(n) since it involves a single pass through the input list. It is a simple and efficient method for finding strings with a given substring in a list, and it allows you to handle the case where the substring is not present in the string using a try/except block.
Auxiliary Space: O(n)
Method#7: Using for loop
Here’s the step-by-step algorithm for finding strings with a given substring in a list
- Initialize the list of strings and the substring to search for.
- Initialize an empty list to store the strings that contain the substring.
- Loop through each string in the original list.
- Check if the substring is present in the current string.
- If the substring is present in the current string, add the string to the result list.
- Print the original list and the result list.
test_list = ['GeeksforGeeks', 'Geeky', 'Computers', 'Algorithms']
subs = 'Geek'
res = []
for i in test_list:
    if subs in i:
        res.append(i)
print("The original list is : " + str(test_list))
print("All strings with given substring are : " + str(res))
Output
The original list is : ['GeeksforGeeks', 'Geeky', 'Computers', 'Algorithms']
All strings with given substring are : ['GeeksforGeeks', 'Geeky']
The time complexity of this algorithm is O(n*m), where n is the number of strings in the original list and m is the length of the longest string in the list. This is because in the worst case, we have to loop through every string in the list and check if the substring is present in each string, which takes O(m) time.
The space complexity of this algorithm is O(k), where k is the number of strings in the original list that contain the substring. This is because we are storing the result strings in a list, which can have a maximum size of k.
NumPy approach to find strings with a given substring in a list:
Algorithm:
1. Convert the given list to a NumPy array.
2. Create an empty NumPy array of the same shape as the input array to store the boolean values for the given condition.
3. Use the numpy.char.find() function to find the indices of the given substring in the input array.
4. Use the numpy.where() function to get the indices where the given condition is True.
5. Use the indices obtained from step 4 to extract the required elements from the input array.
import numpy as np
test_list = ['GeeksforGeeks', 'Geeky', 'Computers', 'Algorithms']
print("The original list is : " + str(test_list))
subs = 'Geek'
test_array = np.array(test_list)
output_array = np.zeros_like(test_array, dtype=bool)
index_array = np.char.find(test_array, subs)
output_array[np.where(index_array != -1)] = True
res = test_array[output_array]
print("All strings with given substring are : " + str(res))
Output:
The original list is : ['GeeksforGeeks', 'Geeky', 'Computers', 'Algorithms']
All strings with given substring are : ['GeeksforGeeks' 'Geeky']
Time Complexity: O(n*m), where n is the number of elements in the input array and m is the length of the substring. This is the worst-case cost of the numpy.char.find() call.
Auxiliary Space: O(n), where n is the number of elements in the input array. numpy.char.find() returns an array of the same size as the input that holds, for each element, the index of the substring (or -1), and the boolean mask has the same size.
Last Updated : 02 May, 2023
- Use the for Loop to Find Elements From a List That Contain a Specific Substring in Python
- Use the filter() Function to Find Elements From a Python List Which Contain a Specific Substring
- Use the Regular Expressions to Find Elements From a Python List Which Contain a Specific Substring
This tutorial will introduce how to find elements from a Python list that have a specific substring in them.
We will work with the following list and extract strings that have ack in them.
my_list = ['Jack', 'Mack', 'Jay', 'Mark']
Use the for Loop to Find Elements From a List That Contain a Specific Substring in Python
In this method, we iterate through the list and check whether the substring is present in each element. If it is, we store the element in a new list. The following code shows how, with the loop written as a list comprehension:
str_match = [s for s in my_list if "ack" in s]
print(str_match)
Output:
['Jack', 'Mack']
The in keyword checks whether the given string, "ack" in this example, is present in the string or not. It can also be replaced by the __contains__ method, which is a magic method of the string class. For example:
str_match = [s for s in my_list if s.__contains__("ack")]
print(str_match)
Output:
['Jack', 'Mack']
Use the filter() Function to Find Elements From a Python List Which Contain a Specific Substring
The filter() function retrieves a subset of the data from a given object with the help of a function. This method will use the lambda keyword to define the condition for filtering data. The lambda keyword creates a one-line anonymous function in Python. See the following code snippet.
str_match = list(filter(lambda x: 'ack' in x, my_list))
print(str_match)
Output:
['Jack', 'Mack']
Use the Regular Expressions to Find Elements From a Python List Which Contain a Specific Substring
A regular expression is a sequence of characters that can act as a matching pattern to search for elements. To use regular expressions, we have to import the re module. In this method, we will use a list comprehension and the re.search() method, which scans a string and returns a match object if the pattern is found anywhere in it, or None otherwise. The following code shows how:
import re
pattern = re.compile(r'ack')
str_match = [x for x in my_list if pattern.search(x)]
print(str_match)
Output:
['Jack', 'Mack']
Getting the index for strings: str.index(), str.rindex() and str.find(), str.rfind()
A string also has an index method, as well as more advanced options and the additional str.find. Both of them have a complementary reverse method.
astring = 'Hello on StackOverflow'
astring.index('o') # 4
astring.rindex('o') # 20
astring.find('o') # 4
astring.rfind('o') # 20
The difference between index / rindex and find / rfind is what happens if the substring is not found in the string:
astring.index('q') # ValueError: substring not found
astring.find('q') # -1
All of these methods accept a start and an end index:
astring.index('o', 5) # 6
astring.index('o', 6) # 6 - start is inclusive
astring.index('o', 5, 7) # 6
astring.index('o', 5, 6) # - end is not inclusive
ValueError: substring not found
astring.rindex('o', 20) # 20
astring.rindex('o', 19) # 20 - still from left to right
astring.rindex('o', 4, 7) # 6
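Because find() accepts a start index, it can be called repeatedly to collect every position of a substring. The following is a minimal sketch; find_all is a hypothetical helper name, not a built-in string method.

```python
def find_all(text, sub):
    """Return the start index of every (possibly overlapping) occurrence of sub."""
    positions = []
    start = text.find(sub)
    while start != -1:
        positions.append(start)
        # Resume the search one character after the previous hit.
        start = text.find(sub, start + 1)
    return positions


astring = 'Hello on StackOverflow'
print(find_all(astring, 'o'))  # [4, 6, 20]
print(find_all(astring, 'q'))  # []
```

The search is case-sensitive, which is why the capital O at index 14 is not reported.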
Searching for an element
All built-in collections in Python implement a way to check element membership using in.
List
alist = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
5 in alist # True
10 in alist # False
Tuple
atuple =('0', '1', '2', '3', '4')
4 in atuple # False
'4' in atuple # True
String
astring = 'i am a string'
'a' in astring # True
'am' in astring # True
'I' in astring # False
Set
aset = {(10, 10), (20, 20), (30, 30)}
(10, 10) in aset # True
10 in aset # False
Dict
dict is a bit special: the normal in only checks the keys. If you want to search in the values, you need to say so explicitly. The same applies if you want to search for key-value pairs.
adict = {0: 'a', 1: 'b', 2: 'c', 3: 'd'}
1 in adict # True - implicitly searches in keys
'a' in adict # False
2 in adict.keys() # True - explicitly searches in keys
'a' in adict.values() # True - explicitly searches in values
(0, 'a') in adict.items() # True - explicitly searches key/value pairs
Getting the index for lists and tuples: list.index(), tuple.index()
list and tuple have an index method to get the position of an element:
alist = [10, 16, 26, 5, 2, 19, 105, 26]
# search for 16 in the list
alist.index(16) # 1
alist[1] # 16
alist.index(15)
ValueError: 15 is not in list
But it returns only the position of the first element found:
atuple = (10, 16, 26, 5, 2, 19, 105, 26)
atuple.index(26) # 2
atuple[2] # 26
atuple[7] # 26 - is also 26!
Searching for key(s) by value in a dict
dict has no built-in method for looking up a key by its value, because dictionaries are designed for lookup by key. You can write a function that returns the key (or keys) for a given value:
def getKeysForValue(dictionary, value):
    foundkeys = []
    for key in dictionary:
        if dictionary[key] == value:
            foundkeys.append(key)
    return foundkeys
This can also be written as an equivalent list comprehension:
def getKeysForValueComp(dictionary, value):
    return [key for key in dictionary if dictionary[key] == value]
If you only need one of the matching keys:
def getOneKeyForValue(dictionary, value):
    return next(key for key in dictionary if dictionary[key] == value)
The first two functions return a list of all keys that have the specified value:
adict = {'a': 10, 'b': 20, 'c': 10}
getKeysForValue(adict, 10)     # ['a', 'c'] on Python 3.7+ (insertion order); arbitrary order on older versions
getKeysForValueComp(adict, 10) # ['a', 'c'] - ditto
getKeysForValueComp(adict, 20) # ['b']
getKeysForValueComp(adict, 25) # []
The other one returns only a single key:
getOneKeyForValue(adict, 10)   # 'a' on Python 3.7+; could also be 'c' on older versions
getOneKeyForValue(adict, 20)   # 'b'
and raises a StopIteration exception if the value is not in the dict:
getOneKeyForValue(adict, 25)
StopIteration
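If an exception is not wanted for a missing value, next() accepts a default as its second argument. A minimal sketch follows; the function name and the default parameter are my own, chosen for illustration.

```python
def get_one_key_for_value(dictionary, value, default=None):
    """Return one key whose value equals `value`, or `default` if there is none."""
    # The second argument to next() is returned instead of raising StopIteration.
    return next((key for key, val in dictionary.items() if val == value), default)


adict = {'a': 10, 'b': 20, 'c': 10}
print(get_one_key_for_value(adict, 20))             # b
print(get_one_key_for_value(adict, 25))             # None
print(get_one_key_for_value(adict, 25, 'missing'))  # missing
```

Returning None is ambiguous when None is itself a valid key, in which case a dedicated sentinel default may be a better fit.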
Getting the index for sorted sequences: bisect.bisect_left()
Sorted sequences allow the use of faster search algorithms: bisect.bisect_left() [1]:
import bisect

def index_sorted(sorted_seq, value):
    """Locate the leftmost value exactly equal to value or raise a ValueError"""
    i = bisect.bisect_left(sorted_seq, value)
    if i != len(sorted_seq) and sorted_seq[i] == value:
        return i
    raise ValueError
alist = [i for i in range(1, 100000, 3)] # Sorted list from 1 to 100000 with step 3
index_sorted(alist, 97285) # 32428
index_sorted(alist, 4) # 1
index_sorted(alist, 97286)
ValueError
For very large sorted sequences the speed gain can be quite high. In the case of the first search it is roughly 500 times faster:
%timeit index_sorted(alist, 97285)
# 100000 loops, best of 3: 3 µs per loop
%timeit alist.index(97285)
# 1000 loops, best of 3: 1.58 ms per loop
Although it is a bit slower if the element is one of the very first:
%timeit index_sorted(alist, 4)
# 100000 loops, best of 3: 2.98 µs per loop
%timeit alist.index(4)
# 1000000 loops, best of 3: 580 ns per loop
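The same binary search can back a plain membership test when the position itself is not needed. This is a sketch under the assumption that the sequence is already sorted; contains_sorted is a hypothetical name.

```python
import bisect


def contains_sorted(sorted_seq, value):
    """Return True if value occurs in the already-sorted sequence."""
    i = bisect.bisect_left(sorted_seq, value)
    # bisect_left returns the insertion point; the value is present only
    # if that slot exists and already holds an equal element.
    return i < len(sorted_seq) and sorted_seq[i] == value


alist = list(range(1, 100000, 3))  # sorted: 1, 4, 7, ...
print(contains_sorted(alist, 97285))  # True
print(contains_sorted(alist, 97286))  # False
```

On an unsorted sequence the result is meaningless, so the caller has to guarantee the ordering.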
Searching nested sequences
Searching in nested sequences, such as a list of tuples, requires an approach similar to searching for keys by value in a dict, but needs custom functions.
The index of the outermost sequence in which the value was found:
def outer_index(nested_sequence, value):
    return next(index for index, inner in enumerate(nested_sequence)
                for item in inner
                if item == value)
alist_of_tuples = [(4, 5, 6), (3, 1, 'a'), (7, 0, 4.3)]
outer_index(alist_of_tuples, 'a') # 1
outer_index(alist_of_tuples, 4.3) # 2
or the index of both the outer and the inner sequence:
def outer_inner_index(nested_sequence, value):
    return next((oindex, iindex) for oindex, inner in enumerate(nested_sequence)
                for iindex, item in enumerate(inner)
                if item == value)
outer_inner_index(alist_of_tuples, 'a') # (1, 2)
alist_of_tuples[1][2] # 'a'
outer_inner_index(alist_of_tuples, 7) # (2, 0)
alist_of_tuples[2][0] # 7
In general (though not always), using next and a generator expression with conditions to find the first occurrence of the searched value is the most efficient approach.
Searching in custom classes: __contains__ and __iter__
To allow the use of in with custom classes, the class must either provide the magic method __contains__ or, failing that, an __iter__ method.
Suppose you have a class containing a list of lists:
class ListList:
    def __init__(self, value):
        self.value = value
        # Create a set of all values for fast access
        self.setofvalues = set(item for sublist in self.value for item in sublist)

    def __iter__(self):
        print('Using __iter__.')
        # A generator over all sublist elements
        return (item for sublist in self.value for item in sublist)

    def __contains__(self, value):
        print('Using __contains__.')
        # Just lookup if the value is in the set
        return value in self.setofvalues
        # Even without the set you could use the iter method for the contains-check:
        # return any(item == value for item in iter(self))
Membership testing is possible using in:
a = ListList([[1,1,1],[0,1,1],[1,5,1]])
10 in a # False
# Prints: Using __contains__.
5 in a # True
# Prints: Using __contains__.
even after deleting the __contains__ method:
del ListList.__contains__
5 in a # True
# Prints: Using __iter__.
Note: looping with in (as in for i in a) will always use __iter__, even if the class implements a __contains__ method.