The code is:
import pyautogui
startButton = pyautogui.locateOnScreen('start.png')
print startButton
Or:
import pyautogui
startButton = pyautogui.locateCenterOnScreen('start.png')
print startButton
The output is:
None
Note: the correct syntax seems to be in place according to the documentation.
Note: I have also tried the image's full path. The image is on the screen and is not overlapped by other images. The PIL library is also installed. Other pyautogui features work (including taking a screenshot).
Please let me know what I am missing, or suggest another Python library for image detection.
asked Oct 2, 2015 at 12:35
Here is the syntax I use for this:
import pyautogui
start = pyautogui.locateCenterOnScreen('start.png')  # If the file is not a png file it will not work
print(start)
pyautogui.moveTo(start)  # Moves the mouse to the coordinates of the image
If you are using multiple monitors at the same time, it only scans the primary one.
This function scans the pixels of your screen and matches their colors against the pixels of your PNG file. If the image's colors change in any way (shadows, hover effects, a button changing color, etc.), it will return None.
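To see why any color change leads to None, here is a minimal pure-Python sketch of exact pixel matching. It is only an illustration on tiny made-up grids of numbers, not PyAutoGUI's actual implementation; locate_exact, screen, needle, and shaded are hypothetical names.

```python
def locate_exact(screen, needle):
    """Return (left, top) of the first exact match of needle in screen, else None.

    Both arguments are lists of rows; each row is a list of pixel values.
    """
    needle_h, needle_w = len(needle), len(needle[0])
    for top in range(len(screen) - needle_h + 1):
        for left in range(len(screen[0]) - needle_w + 1):
            # Every row of the needle must equal the same-sized slice of the screen.
            if all(screen[top + dy][left:left + needle_w] == needle[dy]
                   for dy in range(needle_h)):
                return (left, top)
    return None


screen = [
    [0, 0, 0, 0],
    [0, 9, 8, 0],
    [0, 7, 6, 0],
]
needle = [[9, 8], [7, 6]]   # identical to a region of the screen
shaded = [[9, 8], [7, 5]]   # same button, but one pixel is slightly different

print(locate_exact(screen, needle))
print(locate_exact(screen, shaded))
```

The second search fails even though only a single value differs, which is the behavior described above for shadows and color changes.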
answered Apr 15, 2016 at 19:43
Malachi Bazar
None means that PyAutoGUI was unable to find your image on the screen.
Make sure:
- The window that start.png was sampled from is open, and
- The resolution at the time you took the sample and the current monitor resolution are the same.
answered Oct 30, 2015 at 15:27
AP.
Try installing OpenCV and lowering the confidence. That worked for me:
import pyautogui
startButton = pyautogui.locateOnScreen('start.png', confidence = 0.7)
print(startButton)
answered Jul 5, 2020 at 15:36
KKW
As I understand it, the problem can be fixed by converting the image to RGB. The code will look something like this:
import pyautogui
from PIL import Image
im1 = pyautogui.screenshot()
im2 = pyautogui.screenshot("newone.png")
Image.open("newone.png").convert("RGB").save("newone.png")
answered Nov 1, 2017 at 22:09
Franco M
You need to put the whole path of the image in pyautogui.locateCenterOnScreen(), like this:
pyautogui.locateCenterOnScreen('C:/Users/User/Desktop/start.png')
This worked for me. You might also want to add x_cord, y_cord = in front of that command, so you can use the coordinates later.
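A relative filename is resolved against the current working directory, which is not necessarily the folder the script lives in. The following is a small sketch of building an absolute path first; image_path is a hypothetical helper, not part of pyautogui, and the pyautogui call is shown only as a comment.

```python
from pathlib import Path


def image_path(filename, base_dir):
    """Return an absolute path string for filename inside base_dir."""
    return str((Path(base_dir) / filename).resolve())


# Typical use inside a script (assumes start.png sits next to the script):
#   import pyautogui
#   here = Path(__file__).parent
#   x_cord, y_cord = pyautogui.locateCenterOnScreen(image_path("start.png", here))
print(image_path("start.png", "."))
```

Anchoring the path to the script's own folder avoids depending on where the script was launched from.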
answered Mar 13, 2020 at 16:29
Urban P.
I had a very similar problem. For me, the solution was to use Python 2.7 instead of 3.x. It is probably caused by less flexible functionality in Python 3. I prefer to use Python 2.7, and pyautogui works fantastically with it. The Pillow module (commonly known as PIL), which must be installed when installing pyautogui, seems however to have less functionality when working with Python 3.
answered Mar 1, 2016 at 9:58
L_Pav
I faced the same issue with pyautogui. Though it is a very convenient library, it is quite slow.
I gained a x10 speedup relying on cv2 and PIL:
def benchmark_opencv_pil(method):
    img = ImageGrab.grab(bbox=REGION)
    img_cv = cv.cvtColor(np.array(img), cv.COLOR_RGB2BGR)
    res = cv.matchTemplate(img_cv, GAME_OVER_PICTURE_CV, method)
    # print(res)
    return (res >= 0.8).any()
Using TM_CCOEFF_NORMED as the method worked well. (Obviously, you can also adjust the 0.8 threshold.)
Source: Fast locateOnScreen with Python
For the sake of completeness, here is the full benchmark:
import pyautogui as pg
import numpy as np
import cv2 as cv
from PIL import ImageGrab, Image
import time
REGION = (0, 0, 400, 400)
GAME_OVER_PICTURE_PIL = Image.open("./balloon_fight_game_over.png")
GAME_OVER_PICTURE_CV = cv.imread('./balloon_fight_game_over.png')
def timing(f):
    def wrap(*args, **kwargs):
        time1 = time.time()
        ret = f(*args, **kwargs)
        time2 = time.time()
        print('{:s} function took {:.3f} ms'.format(
            f.__name__, (time2-time1)*1000.0))
        return ret
    return wrap

@timing
def benchmark_pyautogui():
    res = pg.locateOnScreen(GAME_OVER_PICTURE_PIL,
                            grayscale=True,  # should provide a speed up
                            confidence=0.8,
                            region=REGION)
    return res is not None

@timing
def benchmark_opencv_pil(method):
    img = ImageGrab.grab(bbox=REGION)
    img_cv = cv.cvtColor(np.array(img), cv.COLOR_RGB2BGR)
    res = cv.matchTemplate(img_cv, GAME_OVER_PICTURE_CV, method)
    # print(res)
    return (res >= 0.8).any()

if __name__ == "__main__":
    im_pyautogui = benchmark_pyautogui()
    print(im_pyautogui)
    methods = ['cv.TM_CCOEFF', 'cv.TM_CCOEFF_NORMED', 'cv.TM_CCORR',
               'cv.TM_CCORR_NORMED', 'cv.TM_SQDIFF', 'cv.TM_SQDIFF_NORMED']
    # cv.TM_CCOEFF_NORMED actually seems to be the most relevant method
    for method in methods:
        print(method)
        im_opencv = benchmark_opencv_pil(eval(method))
        print(im_opencv)
And the results show a x10 improvement.
benchmark_pyautogui function took 175.712 ms
False
cv.TM_CCOEFF
benchmark_opencv_pil function took 21.283 ms
True
cv.TM_CCOEFF_NORMED
benchmark_opencv_pil function took 23.377 ms
False
cv.TM_CCORR
benchmark_opencv_pil function took 20.465 ms
True
cv.TM_CCORR_NORMED
benchmark_opencv_pil function took 25.347 ms
False
cv.TM_SQDIFF
benchmark_opencv_pil function took 23.799 ms
True
cv.TM_SQDIFF_NORMED
benchmark_opencv_pil function took 22.882 ms
True
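A caveat when reading the True/False values above: the check (res >= 0.8).any() only has a clear meaning for the normalized correlation methods. For the squared-difference methods the best match is the minimum of the result, not the maximum, and the unnormalized methods are not bounded to the 0 to 1 range, so a fixed 0.8 threshold says little for them. The sketch below makes that distinction explicit. The helper is_match and the "1 - threshold" convention for TM_SQDIFF_NORMED are my own assumptions for illustration; min_val and max_val stand for the extremes of the array returned by cv.matchTemplate.

```python
HIGH_IS_BEST = {"TM_CCOEFF_NORMED", "TM_CCORR_NORMED"}
LOW_IS_BEST = {"TM_SQDIFF_NORMED"}


def is_match(method_name, min_val, max_val, threshold=0.8):
    """Decide whether a template match succeeded from the result's extremes."""
    if method_name in HIGH_IS_BEST:
        # Correlation: values near 1.0 mean a strong match.
        return max_val >= threshold
    if method_name in LOW_IS_BEST:
        # Squared difference: 0.0 is a perfect match, so invert the threshold
        # (a convention assumed here, not something OpenCV prescribes).
        return min_val <= 1.0 - threshold
    raise ValueError(
        method_name + " is not normalized; a fixed threshold is not meaningful")


print(is_match("TM_CCOEFF_NORMED", min_val=-0.2, max_val=0.93))
print(is_match("TM_SQDIFF_NORMED", min_val=0.5, max_val=1.0))
```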
Automating any kind of task is pretty tricky, and the more complex the task, the harder it is to automate. At some point or another, the need for Image Recognition with Pyautogui arises, so that we can locate elements on the screen. Without this feature, we would face numerous issues, such as being unable to find an element if its position on screen changed slightly.
How Image Recognition works in PyAutoGUI
First we need to understand how exactly image recognition works in Pyautogui. Simply put, PyAutoGUI has a variety of “locate” functions, which take a source image as a parameter, and then try to find a match from whatever is currently displaying on your screen. (This means if you are searching for something that is currently minimized, or off-screen, then PyAutoGUI will not be able to find it).
Once a match has been found, depending on the function called, we get back either the x and y coordinates of its center, or a “Box” object with the top-left coordinates as well as the width and height.
Here are some of the various functions that we can use in Pyautogui for Image Recognition.
- locateOnScreen(image) -> Returns the (left, top, width, height) coordinates of the first found instance of the image on the screen.
- locateCenterOnScreen(image) -> Returns the x and y coordinates of the center of the first found instance of the image on the screen.
- locateAllOnScreen(image) -> Returns a generator that yields a (left, top, width, height) tuple for each instance of the image found on the screen.
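The relationship between the box returned by locateOnScreen and the point returned by locateCenterOnScreen can be sketched without touching the screen at all. Box, Point, and center_of below are stand-ins defined here for illustration; they only mirror the shape of the values described above.

```python
from collections import namedtuple

# Stand-ins for the values the locate functions hand back.
Box = namedtuple("Box", "left top width height")
Point = namedtuple("Point", "x y")


def center_of(box):
    """Return the center point of a (left, top, width, height) box."""
    return Point(box.left + box.width // 2, box.top + box.height // 2)


button = Box(left=100, top=200, width=40, height=20)
print(center_of(button))
```

If you only need to click the element, the center is enough; keep the full box when you also care about the element's size, for example to restrict a later search to that region.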
Image Detection with PyAutoGUI
We will now show you a small example of how we can practically apply Image recognition in Pyautogui, and automate a task.
Calculator App
We will be creating a simple program that uses the Calculator app in Windows, and performs a simple addition operation on it between two numbers. Here is what the app looks like on PC.
For this program, I have taken screenshots of 4 of its buttons: 5, plus, 7, and equals.
Note: These images may or may not work for you. The Pyautogui image detection functions need the image dimensions to match what is on screen, so if the default Calculator app size is different on your system, they won’t return a match. I advise you to take the screenshots yourself. (I used Windows Snipping Tool + Paint.)
Now let’s try using one of the above-mentioned functions for image detection. We will use the one that just returns the x and y coordinates, as we aren’t concerned with the width or height right now.
import pyautogui

x, y = pyautogui.locateCenterOnScreen("5.jpg")
pyautogui.moveTo(x, y, duration=0.1)
pyautogui.leftClick()
What this code should do is locate the number 5 on the calculator, move the mouse cursor over to it, and then click it. However, this code and the underlying logic are pretty tricky, so there is a very good chance you will face an error here.
Let’s try and analyze why these errors can occur, and how to get around them.
Possible Errors, and their Reasons
1# The most obvious issue is that you don’t have your Calculator App open on screen. Even if it’s open however, there can be issues. For example, you have it minimized or it is being covered by some other window. Both of these cases will cause it to fail.
2# You may have tried re-sizing the Calculator Window. If for example, the image you took was of dimensions 20×20, and the button on the calculator has been resized to 40×40, then it will not return a match.
3# (MOST IMPORTANT) You tried using a .jpg instead of a .png. (And yes, I know I used a .jpg in the above example). I will be addressing this issue in the next section in more detail.
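Since a failed search is so common, it can help to retry for a few seconds before giving up, for instance while the Calculator window is still opening. The following is a rough sketch under some assumptions: wait_for is a hypothetical helper, and it takes the locate call as a function so it works both with versions that return None and with newer PyAutoGUI releases that raise an ImageNotFoundException instead.

```python
import time


def wait_for(locate, timeout=5.0, interval=0.5):
    """Call locate() repeatedly until it returns a result or timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = locate()
        except Exception:
            # Treat "image not found" exceptions like a None result.
            # Note: this broad catch also hides other errors, such as a
            # misspelled filename, so narrow it in real code.
            result = None
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)


# Intended use (not run here, needs a screen and pyautogui installed):
#   import pyautogui
#   pos = wait_for(lambda: pyautogui.locateCenterOnScreen("5.png"))
#   if pos is None:
#       print("button not found")
```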
Using PyAutoGUI with OpenCV
(Don’t be scared off by the title, it’s a lot easier than it looks, trust me)
The problem with the default use of pyautogui is that it searches only for exact matches. But often that’s not very practical, as there can be minor changes in color, tone, etc. Luckily pyautogui offers a simple solution.
All of its locate functions have the confidence parameter, which takes a float value ranging from 0 to 1.0. By default, effectively only 100% matches are returned. If you give a value of 0.8, on the other hand, it will return matches that are at least 80% similar.
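To get a feel for what a similarity score below 1.0 means, here is a pure-Python sketch of a normalized correlation coefficient between two small lists of grayscale pixel values. As far as I know, PyAutoGUI hands this work to OpenCV's template matching; the function match_score and the pixel values below are made up for illustration and are not the library's code.

```python
import math


def match_score(patch, template):
    """Normalized correlation coefficient of two equal-length pixel lists."""
    mean_p = sum(patch) / len(patch)
    mean_t = sum(template) / len(template)
    dev_p = [p - mean_p for p in patch]
    dev_t = [t - mean_t for t in template]
    denom = math.sqrt(sum(d * d for d in dev_p) * sum(d * d for d in dev_t))
    if denom == 0:
        return 0.0  # a completely flat image has no pattern to correlate
    return sum(a * b for a, b in zip(dev_p, dev_t)) / denom


template = [10, 200, 10, 200]     # the saved button image
exact = [10, 200, 10, 200]        # the same pixels on screen
compressed = [12, 195, 9, 203]    # slightly altered, e.g. by lossy compression
inverted = [200, 10, 200, 10]     # a different pattern

for name, patch in [("exact", exact), ("compressed", compressed), ("inverted", inverted)]:
    print(name, round(match_score(patch, template), 4))
```

The slightly altered patch scores just below 1.0, so it is rejected by an exact search but accepted with a confidence such as 0.9.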
Now, I mentioned earlier that a .jpg would (likely) not work. This is because JPEG applies lossy compression to the image, which slightly changes the pixel values, so an exact search reports the image as not found. Using the confidence parameter usually gets around this problem. Just set a value like 0.9, and the compressed image should still match.
There is one slight caveat to this though. You must have OpenCV installed, as the confidence parameter depends on it. The full OpenCV package is a bit complex to install; luckily, there is a smaller version (with fewer features) that will be enough for our purposes.
Run the following command:
pip install opencv-python
Now let’s get back to our code.
Completing our Program
I will write out the complete code here, that I have used to automate this task. Try it out yourself and let me know the results!
import pyautogui

x, y = pyautogui.locateCenterOnScreen("5.jpg", confidence=0.9)
pyautogui.moveTo(x, y, duration=0.1)
pyautogui.leftClick()

x, y = pyautogui.locateCenterOnScreen("plus.jpg", confidence=0.9)
pyautogui.moveTo(x, y, duration=0.1)
pyautogui.leftClick()

x, y = pyautogui.locateCenterOnScreen("7.jpg", confidence=0.9)
pyautogui.moveTo(x, y, duration=0.1)
pyautogui.leftClick()

x, y = pyautogui.locateCenterOnScreen("equals.jpg", confidence=0.9)
pyautogui.moveTo(x, y, duration=0.1)
pyautogui.leftClick()
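The four locate-move-click steps repeat the same pattern, so they could be folded into a loop. The sketch below is one possible way to do that; press_buttons is a hypothetical helper that receives the locate and click functions as arguments, which keeps it independent of the screen and makes it stop at the first button it cannot find.

```python
def press_buttons(image_files, locate_center, click):
    """Click each button image in order.

    Returns None when every button was clicked, or the name of the first
    image that could not be found (nothing after it is clicked).
    """
    for image_file in image_files:
        position = locate_center(image_file)
        if position is None:
            return image_file
        x, y = position
        click(x, y)
    return None


# Intended use with PyAutoGUI (not run here, needs the Calculator on screen):
#   import pyautogui
#   missing = press_buttons(
#       ["5.jpg", "plus.jpg", "7.jpg", "equals.jpg"],
#       lambda f: pyautogui.locateCenterOnScreen(f, confidence=0.9),
#       pyautogui.click,
#   )
#   if missing:
#       print("could not find", missing)
```

Stopping early matters here: pressing the remaining buttons after a miss would make the calculator compute a different expression. With PyAutoGUI releases that raise an exception when nothing is found, the locate function passed in would need to catch it and return None.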
Normally I would have provided the images as a download, but due to various problems that can occur, it’s best if you do that part yourselves.
Another interesting feature of PyAutoGUI that may come in handy is its screenshot ability. This allows you to take screenshots of the full screen or of specific regions, and save them to a file.
Note: For a complete project based around this idea, you would need to take images for each button. To make it even more interesting, you could have the user enter expressions, and the program will automatically solve it.
This marks the end of the Image Recognition with Pyautogui Tutorial.
Project description
screen_search is a small Python library used to search for an image on the screen.
install:
To install the library, type in your terminal:
pip install screen_search
requirements
screen_search supports Python 2 and 3.
If you are installing screen_search from PyPI using pip:
Windows has no dependencies.
The Win32 extensions do not need to be installed.
OS X needs the pyobjc-core and pyobjc module installed (in that order).
Linux needs the python3-xlib (or python-xlib for Python 2) module installed.
Pillow needs to be installed, and on Linux you may need to install additional libraries to make sure Pillow’s PNG/JPEG works correctly.
Example
This is a simple example of looking for a picture on the screen and moving the pointer to it.
from screen_search import *
import pyautogui

# Search for the github logo on the whole screen
# note that the search only works on your primary screen.
search = Search("github.png")
pos = search.imagesearch()
if pos[0] != -1:
    print("position : ", pos[0], pos[1])
    pyautogui.moveTo(pos[0], pos[1])
else:
    print("image not found")
Sikuli
With OpenCV
Click a button on the screen. The image of the button is in the file 'butt01.png':
import cv2
import numpy as np
import pyscreenshot as ImageGrab
import pyautogui

def find_patt(image, patt, thres):
    # The screenshot comes from PIL, so its channels are in RGB order
    img_grey = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
    (patt_H, patt_W) = patt.shape[:2]
    res = cv2.matchTemplate(img_grey, patt, cv2.TM_CCOEFF_NORMED)
    loc = np.where(res > thres)
    # loc holds (rows, cols); reverse it to get (x, y) pairs.
    # A list is needed because zip() is lazy in Python 3 and has no len()
    return patt_H, patt_W, list(zip(*loc[::-1]))

if __name__ == '__main__':
    screenshot = ImageGrab.grab()
    # Convert the PIL screenshot to a (height, width, 3) uint8 array
    img = np.array(screenshot.convert('RGB'), dtype='uint8')
    patt = cv2.imread('butt01.png', 0)  # 0 = load as grayscale
    h, w, points = find_patt(img, patt, 0.60)
    if len(points) != 0:
        # Click the center of the first match
        pyautogui.moveTo(points[0][0] + w / 2, points[0][1] + h / 2)
        pyautogui.click()