How to Find the Median of Numbers in Python

To calculate the median value in Python:

  1. Import the statistics module.
  2. Call the statistics.median() function on a list of numbers.

For example, let’s calculate the median of a list of numbers:

import statistics

numbers = [1, 2, 3, 4, 5, 6, 7]
med = statistics.median(numbers)

print(med)

Output:

4

The median value is a common way to measure the “centrality” of a dataset.

If you are looking for a quick answer, the above example will do. But to learn what the median really is, why it is useful, and how to find it, read on.

This is a comprehensive guide to finding the median in Python.

What Is the Median Value in Maths

The Median is the middle value of a given dataset.

If you have a list of 3 numbers, the median is the second number as it is in the middle.

But if you have a list of 4 values, there is no single “middle value”. For an even-sized dataset, the median is the average of the two middle values.

Median odd or even number of items
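
As a quick check of the odd and even cases, here is a minimal sketch using the standard-library statistics module. The two sample lists are made up for illustration.

```python
import statistics

odd_sized = [3, 1, 2]        # illustrative values
even_sized = [4, 1, 3, 2]    # illustrative values

# Odd length: the middle value of the sorted data.
print(statistics.median(odd_sized))   # 2
# Even length: the average of the two middle values (2 and 3).
print(statistics.median(even_sized))  # 2.5
```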

Why and When Is Median Value Useful

When dealing with statistics, you usually want to have a single number that describes the nature of a dataset.

Think about your school grades for example. Instead of seeing the dozens of grades, you want to know the average (the mean).

Usually, measuring the “centrality” of a dataset means calculating the mean value. But if you have a skewed distribution, the mean value can be unintuitive.

Let’s say you drive to your nearby shopping mall 7 times. Usually, the drive takes around 10 minutes. But one day the traffic jam makes it last 2 hours.

Here is a list of driving times to the mall:

[9, 120, 10, 9, 10, 10, 10]

Now if you take the average of this list, you get ~25 minutes. But how well does this number really describe your trip?

Pretty badly.

As you can see, most of the time the trip takes around 10 minutes.

To better describe the driving time, you should use a median value instead. To calculate the median value, you need to sort the driving times first:

[9, 9, 10, 10, 10, 10, 120]

Then you can choose the middle value, which in this case is 10 minutes. 10 minutes describes your typical trip length way better than 25, right?

The usefulness of calculating the median, in this case, is that the unusually high value of 120 does not matter.
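
To see the difference in numbers, here is a small sketch that computes both the mean and the median of the driving times with the statistics module. The variable names are only illustrative.

```python
import statistics

# Driving times to the mall, in minutes (from the example above).
driving_times = [9, 120, 10, 9, 10, 10, 10]

mean_time = statistics.mean(driving_times)
median_time = statistics.median(driving_times)

print(round(mean_time, 1))  # 25.4
print(median_time)          # 10
```

The single 120-minute trip pulls the mean up by about 15 minutes, while the median stays at the typical value.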

In short, use the median when the average gives an unintuitive measure of centrality.

In Python, you can either create a function that calculates the median or use existing functionality.

How to Implement Median Function in Python

If you want to implement the median function, you need to understand the procedure of finding the median.

The median function works such that it:

  1. Takes a dataset as input.
  2. Sorts the dataset.
  3. Checks if the dataset is odd/even in length.
  4. If the dataset is odd in length, the function picks the mid-value and returns it.
  5. If the dataset is even, the function picks the two mid values, calculates the average, and returns the result.

Here is how it looks in the code:

def median(data):
    sorted_data = sorted(data)
    data_len = len(sorted_data)

    middle = (data_len - 1) // 2

    if data_len % 2:
        return sorted_data[middle]
    else:
        return (sorted_data[middle] + sorted_data[middle + 1]) / 2.0

Example usage:

numbers = [1, 2, 3, 4, 5, 6, 7]
med = median(numbers)

print(med)

Output:

4

Now, this is a valid approach if you need to write the median function yourself. But with common maths operations, you should use a built-in function to save time and headaches.

Let’s next take a look at how to calculate the median with a built-in function in Python.

How to Use a Built-In Median Function in Python

In Python, there is a module called statistics. This module contains useful mathematical tools for data science and statistics.

One of the functions in this module is median().

As the name suggests, this function calculates the median of a given dataset.

To use the median function from the statistics module, remember to import it into your project.

Here is an example of calculating the median for a bunch of numbers:

import statistics

numbers = [1, 2, 3, 4, 5, 6, 7]
med = statistics.median(numbers)

print(med)

Result:

4

Conclusion

Today you learned how to calculate the median value in Python.

To recap, the median is a way to measure the centrality of a dataset. It is useful when the average doesn’t describe the dataset well and gives misleading results.

To calculate the median in Python, use the built-in median() function from the statistics module.

import statistics

numbers = [1, 2, 3, 4, 5, 6, 7]
med = statistics.median(numbers)

Thanks for reading. Happy coding!


In this tutorial, we will look at how to get the median value of a list of values in Python. We will walk you through the usage of the different methods with the help of examples.

Median of a list in Python.

What is median?

The median is a descriptive statistic used as a measure of the central tendency of a distribution. It is equal to the middle value of the distribution: there are an equal number of values smaller and larger than the median. Unlike the mean (another measure of central tendency), it is not very sensitive to outliers in the data.

To calculate the median of a list of values –

  1. Sort the values in ascending or descending order (either works).
  2. If the number of values, n, is odd, then the median is the value in the (n+1)/2 position in the sorted list (or array) of values.
     If the number of values, n, is even, then the median is the average of the values in the n/2 and n/2 + 1 positions in the sorted list (or array) of values.
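
The positions in the rule above are 1-based, while Python lists are 0-based. The sketch below translates the rule literally, subtracting 1 from each position. The function name median_by_position and the sample values are hypothetical, and the input is assumed to be non-empty.

```python
def median_by_position(values):
    """Median via the 1-based position rule, shifted to 0-based indexes."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        position = (n + 1) // 2          # 1-based position of the middle value
        return ordered[position - 1]     # shift to a 0-based index
    lower, upper = n // 2, n // 2 + 1    # 1-based positions of the two middle values
    return (ordered[lower - 1] + ordered[upper - 1]) / 2


print(median_by_position([7, 1, 5]))     # 5
print(median_by_position([7, 1, 5, 3]))  # 4.0
```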

For example, calculate the median of the following values –

A bunch of numbers whose median is to be calculated.

First, let’s sort these numbers in ascending order.

numbers sorted in ascending order

Now, since the total number of values is even (8), the median is the average of the 4th and the 5th value.

Median calculation

Thus, median comes out to be 3.5

Now that we have seen how the median is calculated mathematically, let’s look at how to compute it in Python.

To compute the median of a list of values in Python, you can write your own function, or use methods available in libraries like numpy, statistics, etc. Let’s look at these methods with the help of examples.

1. From scratch implementation of median in Python

You can write your own function in Python to compute the median of a list.

def get_median(ls):
    # sort a copy so the caller's list is left unchanged
    ls_sorted = sorted(ls)
    n = len(ls_sorted)
    # find the median
    if n % 2 != 0:
        # total number of values is odd
        # subtract 1 since indexing starts at 0
        m = (n + 1) // 2 - 1
        return ls_sorted[m]
    else:
        # total number of values is even: average the two middle values
        m1 = n // 2 - 1
        m2 = n // 2
        return (ls_sorted[m1] + ls_sorted[m2]) / 2

# create a list
ls = [3, 1, 4, 9, 2, 5, 3, 6]
# get the median
print(get_median(ls))

Output:

3.5

Here, we use the built-in sorted() function to get a sorted copy of the list and then, depending on the length of the list, return the median. We get 3.5 as the median, the same value we calculated manually above.

Note that, compared to the above function, the libraries you’ll see next are better optimized to compute the median of a list of values.

2. Using statistics library

You can also use the statistics standard library in Python to get the median of a list. Pass the list as argument to the statistics.median() function.

import statistics

# create a list
ls = [3, 1, 4, 9, 2, 5, 3, 6]
# get the median
print(statistics.median(ls))

Output:

3.5

We get the same results as above.

For more on the statistics library in Python, refer to its documentation.

3. Using numpy library

The numpy library’s median() function is generally used to calculate the median of a numpy array. You can also use this function on a Python list.

import numpy as np

# create a list
ls = [3, 1, 4, 9, 2, 5, 3, 6]
print(np.median(ls))

Output:

3.5

You can see that we get the same result.


  • Piyush Raj


How do you find the median of a list in Python? The list can be of any size and the numbers are not guaranteed to be in any particular order.

If the list contains an even number of elements, the function should return the average of the middle two.

Here are some examples (sorted for display purposes):

median([1]) == 1
median([1, 1]) == 1
median([1, 1, 2, 4]) == 1.5
median([0, 2, 5, 6, 8, 9, 9]) == 6
median([0, 0, 0, 0, 4, 4, 6, 8]) == 2

asked Jun 7, 2014 at 21:04 by ChucksPlace

Python 3.4 has statistics.median:

Return the median (middle value) of numeric data.

When the number of data points is odd, return the middle data point.
When the number of data points is even, the median is interpolated by taking the average of the two middle values:

>>> median([1, 3, 5])
3
>>> median([1, 3, 5, 7])
4.0

Usage:

import statistics

items = [6, 1, 8, 2, 3]

statistics.median(items)
#>>> 3

It’s pretty careful with types, too:

statistics.median(map(float, items))
#>>> 3.0

from decimal import Decimal
statistics.median(map(Decimal, items))
#>>> Decimal('3')
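
The same holds for fractions.Fraction. A minimal sketch with made-up values, showing that averaging the two middle values keeps the result exact:

```python
import statistics
from fractions import Fraction

values = [Fraction(1, 3), Fraction(1, 2)]  # illustrative data

# Even count: the two middle values are averaged, and the result stays a Fraction.
result = statistics.median(values)
print(result)  # 5/12
```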

answered Jun 8, 2014 at 0:08 by Veedrac

(Works with python-2.x):

def median(lst):
    n = len(lst)
    s = sorted(lst)
    return (s[n//2-1]/2.0+s[n//2]/2.0, s[n//2])[n % 2] if n else None

>>> median([-5, -5, -3, -4, 0, -1])
-3.5

numpy.median():

>>> from numpy import median
>>> median([1, -4, -1, -1, 1, -3])
-1.0

For python-3.x, use statistics.median:

>>> from statistics import median
>>> median([5, 2, 3, 8, 9, -2])
4.0

answered Jun 7, 2014 at 23:33 by A.J. Uppal

The sorted() function is very helpful for this. Use it to order the list,
then simply return the middle value (or average the two middle values if
the list contains an even number of elements).

def median(lst):
    sortedLst = sorted(lst)
    lstLen = len(lst)
    index = (lstLen - 1) // 2
   
    if (lstLen % 2):
        return sortedLst[index]
    else:
        return (sortedLst[index] + sortedLst[index + 1])/2.0

answered Jun 7, 2014 at 22:09 by swolfe

Of course, in Python 3 you can use the built-in functions, but if you are using Python 2 or would just like to write your own, you can do something like this. The trick is to use the ~ operator, which maps a non-negative integer n to -(n + 1). For instance, ~2 is -3, and a negative index in Python counts items from the end of the list. So if mid == 2, data[mid] is the third element from the beginning and data[~mid] is the third element from the end.

def median(data):
    data.sort()
    mid = len(data) // 2
    return (data[mid] + data[~mid]) / 2.0
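
To make the index arithmetic concrete, here is a small demonstration with made-up lists of even and odd length:

```python
data = sorted([7, 1, 5, 3])      # [1, 3, 5, 7], even length
mid = len(data) // 2             # 2
print(~mid)                      # -3
print(data[mid], data[~mid])     # 5 3 -> the two middle values

data = sorted([7, 1, 5])         # [1, 5, 7], odd length
mid = len(data) // 2             # 1
print(data[mid], data[~mid])     # 5 5 -> the same middle element twice
```

For an odd length, both indexes hit the same element, so averaging them returns the middle value itself (as a float).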

answered Jan 21, 2018 at 17:22 by Vlad Bezden

Here’s a cleaner solution:

def median(lst):
    quotient, remainder = divmod(len(lst), 2)
    if remainder:
        return sorted(lst)[quotient]
    return sum(sorted(lst)[quotient - 1:quotient + 1]) / 2.

Note: Answer changed to incorporate suggestion in comments.

answered Apr 25, 2015 at 20:18 by Batuhan Ulug

You can try the quickselect algorithm if faster average-case running times are needed. Quickselect has average (and best) case performance O(n), although it can end up O(n²) on a bad day.

Here’s an implementation with a randomly chosen pivot:

import random

def select_nth(n, items):
    pivot = random.choice(items)

    lesser = [item for item in items if item < pivot]
    if len(lesser) > n:
        return select_nth(n, lesser)
    n -= len(lesser)

    numequal = items.count(pivot)
    if numequal > n:
        return pivot
    n -= numequal

    greater = [item for item in items if item > pivot]
    return select_nth(n, greater)

You can trivially turn this into a method to find medians:

def median(items):
    if len(items) % 2:
        return select_nth(len(items)//2, items)

    else:
        left  = select_nth((len(items)-1) // 2, items)
        right = select_nth((len(items)+1) // 2, items)

        return (left + right) / 2

This is very unoptimised, but it’s not likely that even an optimised version will outperform Tim Sort (CPython’s built-in sort) because that’s really fast. I’ve tried before and I lost.

answered Jun 8, 2014 at 0:49 by Veedrac

You can use the list.sort to avoid creating new lists with sorted and sort the lists in place.

Also you should not use list as a variable name as it shadows python’s own list.

def median(l):
    half = len(l) // 2
    l.sort()
    if not len(l) % 2:
        return (l[half - 1] + l[half]) / 2.0
    return l[half]
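
One caveat with sorting in place is that the caller's list gets reordered as a side effect. A short illustration of the difference, using a made-up list:

```python
original = [3, 1, 2]

copy_sorted = sorted(original)   # builds a new list; original is untouched
print(original)                  # [3, 1, 2]

original.sort()                  # sorts in place and returns None
print(original)                  # [1, 2, 3]
```

If the caller still needs the original order, sorted() is the safer choice at the cost of one extra list.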

answered Jun 7, 2014 at 22:48 by Padraic Cunningham

def median(x):
    x = sorted(x)
    listlength = len(x) 
    num = listlength//2
    if listlength%2==0:
        middlenum = (x[num]+x[num-1])/2
    else:
        middlenum = x[num]
    return middlenum

answered Sep 25, 2018 at 18:22 by Bulent

def median(array):
    """Calculate median of the given list.
    """
    # TODO: use statistics.median in Python 3
    array = sorted(array)
    half, odd = divmod(len(array), 2)
    if odd:
        return array[half]
    return (array[half - 1] + array[half]) / 2.0

answered Mar 4, 2016 at 11:50 by warvariuc

A simple function to return the median of the given list:

def median(lst):
    lst = sorted(lst)  # Sort the list first
    if len(lst) % 2 == 0:  # Checking if the length is even
        # Applying formula which is sum of middle two divided by 2
        return (lst[len(lst) // 2] + lst[(len(lst) - 1) // 2]) / 2
    else:
        # If length is odd then get middle value
        return lst[len(lst) // 2]

Some examples with the median function:

>>> median([9, 12, 20, 21, 34, 80])  # Even
20.5
>>> median([9, 12, 80, 21, 34])  # Odd
21

If you want to use a library, you can simply do:

>>> import statistics
>>> statistics.median([9, 12, 20, 21, 34, 80])  # Even
20.5
>>> statistics.median([9, 12, 80, 21, 34])  # Odd
21

answered Jul 5, 2020 at 23:16 by The AG

I posted my solution at Python implementation of «median of medians» algorithm , which is a little bit faster than using sort(). My solution uses 15 numbers per column, for a speed ~5N which is faster than the speed ~10N of using 5 numbers per column. The optimal speed is ~4N, but I could be wrong about it.

Per Tom’s request in his comment, I added my code here, for reference. I believe the critical part for speed is using 15 numbers per column, instead of 5.

#!/bin/pypy
#
# TH @stackoverflow, 2016-01-20, linear time "median of medians" algorithm
#
import sys, random


items_per_column = 15


def find_i_th_smallest( A, i ):
    t = len(A)
    if(t <= items_per_column):
        # if A is a small list with less than items_per_column items, then:
        #
        # 1. do sort on A
        # 2. find i-th smallest item of A
        #
        return sorted(A)[i]
    else:
        # 1. partition A into columns of k items each. k is odd, say 5.
        # 2. find the median of every column
        # 3. put all medians in a new list, say, B
        #
        B = [ find_i_th_smallest(k, (len(k) - 1)//2) for k in [A[j:(j + items_per_column)] for j in range(0,len(A),items_per_column)]]

        # 4. find M, the median of B
        #
        M = find_i_th_smallest(B, (len(B) - 1)//2)


        # 5. split A into 3 parts by M, { < M }, { == M }, and { > M }
        # 6. find which above set has A's i-th smallest, recursively.
        #
        P1 = [ j for j in A if j < M ]
        if(i < len(P1)):
            return find_i_th_smallest( P1, i)
        P3 = [ j for j in A if j > M ]
        L3 = len(P3)
        if(i < (t - L3)):
            return M
        return find_i_th_smallest( P3, i - (t - L3))


# How many numbers should be randomly generated for testing?
#
number_of_numbers = int(sys.argv[1])


# create a list of random positive integers
#
L = [ random.randint(0, number_of_numbers) for i in range(0, number_of_numbers) ]


# Show the original list
#
# print L


# This is for validation
#
# print sorted(L)[int((len(L) - 1)/2)]


# This is the result of the "median of medians" function.
# Its result should be the same as the above.
#
print(find_i_th_smallest( L, (len(L) - 1) // 2))

answered Jan 21, 2016 at 0:00 by user5818263

In case you need additional information on the distribution of your list, the percentile method will probably be useful. And a median value corresponds to the 50th percentile of a list:

import numpy as np
a = np.array([1,2,3,4,5,6,7,8,9])
median_value = np.percentile(a, 50) # return 50th percentile
print(median_value)
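
If you'd rather stay in the standard library, a similar idea can be sketched with statistics.quantiles (available from Python 3.8). With n=4 it returns the three quartile cut points, and the second one is the 50th percentile. Treat this as an illustration, not a drop-in replacement for np.percentile, since the interpolation methods can differ.

```python
import statistics

a = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 splits the data into quartiles; the second cut point is the 50th percentile.
quartiles = statistics.quantiles(a, n=4, method="inclusive")
print(quartiles)             # [3.0, 5.0, 7.0]
print(statistics.median(a))  # 5
```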

answered Apr 22, 2020 at 12:07 by Gabriel123

Here's what I came up with during this exercise in Codecademy:

def median(data):
    new_list = sorted(data)
    if len(new_list) % 2 > 0:
        return new_list[len(new_list) // 2]
    else:
        return (new_list[len(new_list) // 2] + new_list[len(new_list) // 2 - 1]) / 2.0

print(median([1,2,3,4,5,9]))

answered May 27, 2016 at 8:52 by BynderRox

Just two lines are enough.

def get_median(arr):
    '''
    Calculate the median of a sequence.
    :param arr: list
    :return: int or float
    '''
    arr = sorted(arr)
    return arr[len(arr)//2] if len(arr) % 2 else (arr[len(arr)//2] + arr[len(arr)//2-1])/2

answered Sep 17, 2020 at 2:32 by Rt.Tong

median Function

def median(midlist):
    midlist.sort()
    lens = len(midlist)
    if lens % 2 != 0:
        # integer division so the result is a valid index in Python 3
        midl = lens // 2
        res = midlist[midl]
    else:
        odd = lens // 2 - 1
        ev = lens // 2
        res = float(midlist[odd] + midlist[ev]) / float(2)
    return res

answered May 21, 2015 at 13:55 by Юрий Мойдом Киев

I had some problems with lists of float values. I ended up using a code snippet adapted from Python 3's statistics.median, and it works perfectly with float values without any imports. source

def calculateMedian(list):
    data = sorted(list)
    n = len(data)
    if n == 0:
        return None
    if n % 2 == 1:
        return data[n // 2]
    else:
        i = n // 2
        return (data[i - 1] + data[i]) / 2

answered May 3, 2017 at 16:54 by Dan

def midme(list1):

    list1.sort()
    if len(list1)%2>0:
            x = list1[int((len(list1)/2))]
    else:
            x = ((list1[int((len(list1)/2))-1])+(list1[int(((len(list1)/2)))]))/2
    return x


midme([4,5,1,7,2])

answered Feb 18, 2018 at 18:00 by vk123

def median(array):
    if len(array) < 1:
        return(None)
    # the middle elements are only meaningful once the data is sorted
    array = sorted(array)
    if len(array) % 2 == 0:
        median = (array[len(array)//2-1: len(array)//2+1])
        return sum(median) / len(median)
    else:
        return(array[len(array)//2])

answered Apr 6, 2018 at 21:55 by Luke Willey

I defined a median function for a list of numbers as

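def median(numbers):
    s = sorted(numbers)
    n = len(s)
    # (n - 1) // 2 and n // 2 are the same index for odd n and the two
    # middle indexes for even n; this avoids round(), which rounds halves
    # to the nearest even number in Python 3
    return (s[(n - 1) // 2] + s[n // 2]) / 2.0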

answered Oct 14, 2014 at 14:12 by Fred Beck

import numpy as np
def get_median(xs):
        mid = len(xs) // 2  # Take the mid of the list
        if len(xs) % 2 == 1: # check if the len of list is odd
            return sorted(xs)[mid] #if true then mid will be median after sorting
        else:
            #return 0.5 * sum(sorted(xs)[mid - 1:mid + 1])
            return 0.5 * np.sum(sorted(xs)[mid - 1:mid + 1]) #if false take the avg of mid
print(get_median([7, 7, 3, 1, 4, 5]))
print(get_median([1,2,3, 4,5]))

answered Aug 26, 2019 at 7:12

A more generalized approach for median (and percentiles) would be:

def get_percentile(data, percentile):
    # Get the number of observations
    cnt=len(data)
    # Sort the list
    data=sorted(data)
    # Determine the split point
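    i=(cnt-1)*percentile
    # Fractional part of the split point, used as the interpolation weight
    diff=i-int(i)
    # Clamp the upper index so that percentile=1.0 does not run past the end
    upper=min(int(i)+1, cnt-1)
    # Return the weighted average of the value above and below the split point
    return data[int(i)]*(1-diff)+data[upper]*(diff)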

# Data
data=[1,2,3,4,5]
# For the median
print(get_percentile(data=data, percentile=.50))
# > 3
print(get_percentile(data=data, percentile=.75))
# > 4

# Note the weighted average difference when an int is not returned by the percentile
print(get_percentile(data=data, percentile=.51))
# > 3.04

answered May 7, 2020 at 19:46 by conmak

Try This

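def find_median(arr):
    arr = sorted(arr)  # the median is defined on sorted data
    mid = len(arr) // 2
    if len(arr) % 2 == 1:
        return arr[mid]
    else:
        # even length: average the two middle values
        return (arr[mid - 1] + arr[mid]) / 2
print(find_median([1,2,3,4,5,6,7,8]))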

answered Dec 20, 2021 at 13:32 by 0xN1nja

Implement it:

def median(numbers):
    """
    Calculate median of a list numbers.
    :param numbers: the numbers to be calculated.
    :return: median value of numbers.

    >>> median([1, 3, 3, 6, 7, 8, 9])
    6
    >>> median([1, 2, 3, 4, 5, 6, 8, 9])
    4.5
    >>> import statistics
    >>> import random
    >>> numbers = random.sample(range(-50, 50), k=100)
    >>> statistics.median(numbers) == median(numbers)
    True
    """
    numbers = sorted(numbers)
    mid_index = len(numbers) // 2
    return (
        (numbers[mid_index] + numbers[mid_index - 1]) / 2 if len(numbers) % 2 == 0
        else numbers[mid_index]
    )


if __name__ == "__main__":
    from doctest import testmod

    testmod()

source from

answered Oct 4, 2020 at 16:36 by duyuanchao

Function median:

import numpy as np

def median(d):
    d=np.sort(d)
    n=len(d)
    n2=n//2
    if (n%2==1):
        # odd length: the single middle value
        med=d[n2]
    else:
        # even length: average the two middle values
        med=(d[n2-1] + d[n2]) / 2
    return med

answered Feb 15, 2020 at 11:03 by fati

Simply create a median function that takes a list of numbers as its argument and call it.

def median(l):
    l = sorted(l)
    lent = len(l)
    if (lent % 2) == 0:
        m = int(lent / 2)
        # even length: average the two middle values
        result = (l[m - 1] + l[m]) / 2
    else:
        m = int(float(lent / 2) - 0.5)
        result = l[m]
    return result

answered Apr 27, 2021 at 5:17 by Romesh Borawake

What I did was this:

def median(a):
    a = sorted(a)
    if len(a) % 2 != 0:
        return a[len(a) // 2]
    else:
        return (a[len(a) // 2] + a[(len(a) // 2) - 1]) / 2

Explanation: if the number of items in the list is odd, return the middle number. Otherwise, len(a) // 2 is the index of the higher of the two middle numbers, so the number before it (one index lower, since the list is sorted) is the lower one. Add the two and divide by 2 to find the median.

answered Nov 6, 2020 at 6:31 by CodingCuber

Here’s the tedious way to find median without using the median function:

def median(*arg):
    ordered = order(arg)  # work on the sorted copy, not the raw arguments
    numArg = len(ordered)
    half = numArg // 2
    if numArg % 2 == 0:
        print((ordered[half-1]+ordered[half])/2)
    else:
        print(ordered[half])

def order(tup):
    ordered = list(tup)
    # keep making passes until a pass finishes without swapping anything
    while(test(ordered)):
        pass
    print(ordered)
    return ordered


def test(ordered):
    whileloop = 0 
    for i in range(len(ordered)-1):
        print(i)
        if (ordered[i]>ordered[i+1]):
            print(str(ordered[i]) + ' is greater than ' + str(ordered[i+1]))
            original = ordered[i+1]
            ordered[i+1]=ordered[i]
            ordered[i]=original
            whileloop = 1 #run the loop again if you had to switch values
    return whileloop

answered Jan 24, 2017 at 19:05 by I Like

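It is very simple:

def median(alist):
    # to find the median you have to sort the list first
    sList = sorted(alist)
    first = 0
    last = len(sList)-1
    midpoint = (first + last)//2
    if len(sList) % 2:
        # odd length: return the middle value itself, not its index
        return sList[midpoint]
    # even length: average the two middle values
    return (sList[midpoint] + sList[midpoint + 1]) / 2

And you can use the return value like this: med = median(anyList)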

answered Dec 7, 2018 at 16:11 by Farhan

New in version 3.4.

Source code: Lib/statistics.py


This module provides functions for calculating mathematical statistics of
numeric (Real-valued) data.

Note

Unless explicitly noted otherwise, these functions support int,
float, decimal.Decimal and fractions.Fraction.
Behaviour with other types (whether in the numeric tower or not) is
currently unsupported. Mixed types are also undefined and
implementation-dependent. If your input data consists of mixed types,
you may be able to use map() to ensure a consistent result, e.g.
map(float, input_data).
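
A minimal sketch of that suggestion, using made-up mixed-type data. Whether float is an acceptable common type depends on the precision your data needs.

```python
import statistics
from decimal import Decimal
from fractions import Fraction

mixed = [1, Fraction(3, 2), Decimal("2.5")]  # illustrative mixed-type data

# Coerce everything to float first so the result does not depend on how
# the different numeric types happen to interact.
print(statistics.median(map(float, mixed)))  # 1.5
```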

9.7.1. Averages and measures of central location

These functions calculate an average or typical value from a population
or sample.

mean() Arithmetic mean (“average”) of data.
harmonic_mean() Harmonic mean of data.
median() Median (middle value) of data.
median_low() Low median of data.
median_high() High median of data.
median_grouped() Median, or 50th percentile, of grouped data.
mode() Mode (most common value) of discrete data.

9.7.2. Measures of spread

These functions calculate a measure of how much the population or sample
tends to deviate from the typical or average values.

pstdev() Population standard deviation of data.
pvariance() Population variance of data.
stdev() Sample standard deviation of data.
variance() Sample variance of data.

9.7.3. Function details

Note: The functions do not require the data given to them to be sorted.
However, for reading convenience, most of the examples show sorted sequences.

statistics.mean(data)

Return the sample arithmetic mean of data which can be a sequence or iterator.

The arithmetic mean is the sum of the data divided by the number of data
points. It is commonly called “the average”, although it is only one of many
different mathematical averages. It is a measure of the central location of
the data.

If data is empty, StatisticsError will be raised.

Some examples of use:

>>> mean([1, 2, 3, 4, 4])
2.8
>>> mean([-1.0, 2.5, 3.25, 5.75])
2.625

>>> from fractions import Fraction as F
>>> mean([F(3, 7), F(1, 21), F(5, 3), F(1, 3)])
Fraction(13, 21)

>>> from decimal import Decimal as D
>>> mean([D("0.5"), D("0.75"), D("0.625"), D("0.375")])
Decimal('0.5625')

Note

The mean is strongly affected by outliers and is not a robust estimator
for central location: the mean is not necessarily a typical example of the
data points. For more robust, although less efficient, measures of
central location, see median() and mode(). (In this case,
“efficient” refers to statistical efficiency rather than computational
efficiency.)

The sample mean gives an unbiased estimate of the true population mean,
which means that, taken on average over all the possible samples,
mean(sample) converges on the true mean of the entire population. If
data represents the entire population rather than a sample, then
mean(data) is equivalent to calculating the true population mean μ.

statistics.harmonic_mean(data)

Return the harmonic mean of data, a sequence or iterator of
real-valued numbers.

The harmonic mean, sometimes called the subcontrary mean, is the
reciprocal of the arithmetic mean() of the reciprocals of the
data. For example, the harmonic mean of three values a, b and c
will be equivalent to 3/(1/a + 1/b + 1/c).

The harmonic mean is a type of average, a measure of the central
location of the data. It is often appropriate when averaging quantities
which are rates or ratios, for example speeds. For example:

Suppose an investor purchases an equal value of shares in each of
three companies, with P/E (price/earning) ratios of 2.5, 3 and 10.
What is the average P/E ratio for the investor’s portfolio?

>>> harmonic_mean([2.5, 3, 10])  # For an equal investment portfolio.
3.6

Using the arithmetic mean would give an average of about 5.167, which
is too high.
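
As an illustrative check, the definition can be reproduced by hand and compared with harmonic_mean() and mean(). The variable names here are arbitrary.

```python
import math
import statistics

ratios = [2.5, 3, 10]  # P/E ratios from the example above

# Reciprocal of the arithmetic mean of the reciprocals.
by_hand = len(ratios) / sum(1 / r for r in ratios)

print(math.isclose(by_hand, statistics.harmonic_mean(ratios)))  # True
print(round(statistics.mean(ratios), 3))                        # 5.167
```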

StatisticsError is raised if data is empty, or any element
is less than zero.

New in version 3.6.

statistics.median(data)

Return the median (middle value) of numeric data, using the common “mean of
middle two” method. If data is empty, StatisticsError is raised.
data can be a sequence or iterator.
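
A short sketch of both behaviours, with arbitrary sample values. The exact error message is not shown because it may vary between versions.

```python
import statistics

# An iterator works as well as a list.
print(statistics.median(iter([5, 1, 3])))  # 3

# Empty input raises StatisticsError rather than returning a default.
try:
    statistics.median([])
except statistics.StatisticsError as exc:
    print("no median:", exc)
```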

The median is a robust measure of central location, and is less affected by
the presence of outliers in your data. When the number of data points is
odd, the middle data point is returned:

>>> median([1, 3, 5])
3

When the number of data points is even, the median is interpolated by taking
the average of the two middle values:

>>> median([1, 3, 5, 7])
4.0

This is suited for when your data is discrete, and you don’t mind that the
median may not be an actual data point.

statistics.median_low(data)

Return the low median of numeric data. If data is empty,
StatisticsError is raised. data can be a sequence or iterator.

The low median is always a member of the data set. When the number of data
points is odd, the middle value is returned. When it is even, the smaller of
the two middle values is returned.

>>> median_low([1, 3, 5])
3
>>> median_low([1, 3, 5, 7])
3

Use the low median when your data are discrete and you prefer the median to
be an actual data point rather than interpolated.

statistics.median_high(data)

Return the high median of data. If data is empty, StatisticsError
is raised. data can be a sequence or iterator.

The high median is always a member of the data set. When the number of data
points is odd, the middle value is returned. When it is even, the larger of
the two middle values is returned.

>>> median_high([1, 3, 5])
3
>>> median_high([1, 3, 5, 7])
5

Use the high median when your data are discrete and you prefer the median to
be an actual data point rather than interpolated.
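
The three median variants differ only for even-sized data. A minimal side-by-side sketch on the same list used in the examples above:

```python
import statistics

even_sized = [1, 3, 5, 7]

interpolated = statistics.median(even_sized)   # mean of the two middle values, 3 and 5
low = statistics.median_low(even_sized)        # smaller of the two middle values
high = statistics.median_high(even_sized)      # larger of the two middle values

print(interpolated, low, high)
```

Only median_low() and median_high() are guaranteed to return a value that actually occurs in the data.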

statistics.median_grouped(data, interval=1)

Return the median of grouped continuous data, calculated as the 50th
percentile, using interpolation. If data is empty, StatisticsError
is raised. data can be a sequence or iterator.

>>> median_grouped([52, 52, 53, 54])
52.5

In the following example, the data are rounded, so that each value represents
the midpoint of data classes, e.g. 1 is the midpoint of the class 0.5–1.5, 2
is the midpoint of 1.5–2.5, 3 is the midpoint of 2.5–3.5, etc. With the data
given, the middle value falls somewhere in the class 3.5–4.5, and
interpolation is used to estimate it:

>>> median_grouped([1, 2, 2, 3, 4, 4, 4, 4, 4, 5])
3.7

Optional argument interval represents the class interval, and defaults
to 1. Changing the class interval naturally will change the interpolation:

>>> median_grouped([1, 3, 3, 5, 7], interval=1)
3.25
>>> median_grouped([1, 3, 3, 5, 7], interval=2)
3.5

This function does not check whether the data points are at least
interval apart.

CPython implementation detail: Under some circumstances, median_grouped() may coerce data points to
floats. This behaviour is likely to change in the future.
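
The interpolation can be sketched by hand. The function below is a hypothetical re-implementation based on the usual grouped-median formula L + interval * (n/2 - cf) / f, where L is the lower limit of the median class, cf is the number of values below that class, and f is the number of values in it. It is an illustration under those assumptions, not the library's source code.

```python
import math
import statistics


def grouped_median_sketch(data, interval=1):
    """Hypothetical helper: interpolate the median inside its class."""
    values = sorted(data)
    n = len(values)
    if n == 0:
        raise statistics.StatisticsError("no median for empty data")
    x = values[n // 2]                        # midpoint of the median class
    lower_limit = x - interval / 2            # L
    below = sum(1 for v in values if v < x)   # cf
    inside = values.count(x)                  # f
    return lower_limit + interval * (n / 2 - below) / inside


sample = [1, 2, 2, 3, 4, 4, 4, 4, 4, 5]
sketch_result = grouped_median_sketch(sample)
library_result = statistics.median_grouped(sample)
print(sketch_result, library_result)
```

For this sample, L = 3.5, cf = 4 and f = 5, so the estimate is 3.5 + (5 - 4) / 5 = 3.7.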

See also

  • “Statistics for the Behavioral Sciences”, Frederick J Gravetter and
    Larry B Wallnau (8th Edition).
  • Calculating the median.
  • The SSMEDIAN function in the Gnome Gnumeric spreadsheet, including this discussion.

statistics.mode(data)

Return the most common data point from discrete or nominal data. The mode
(when it exists) is the most typical value, and is a robust measure of
central location.

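If data is empty, StatisticsError is raised. In the Python version documented
here, it is also raised if there is not exactly one most common value; from
Python 3.8 onward, the first mode encountered is returned instead.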

mode assumes discrete data, and returns a single value. This is the
standard treatment of the mode as commonly taught in schools:

>>> mode([1, 1, 2, 3, 3, 3, 3, 4])
3

The mode is unique in that it is the only statistic which also applies
to nominal (non-numeric) data:

>>> mode(["red", "blue", "blue", "red", "green", "red", "red"])
'red'
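
When several values tie for most common, it can help to list all of them. The following is a minimal sketch using collections.Counter; the helper name all_modes is hypothetical and is not one of the functions documented in this section.

```python
from collections import Counter


def all_modes(data):
    """Hypothetical helper: return every value that has the highest count."""
    counts = Counter(data)
    if not counts:
        return []
    highest = max(counts.values())
    # On Python 3.7+ Counter keeps first-seen order, so ties come back in input order.
    return [value for value, count in counts.items() if count == highest]


print(all_modes(["red", "blue", "blue", "red", "green"]))
print(all_modes([1, 1, 2, 3, 3, 3, 3, 4]))
```
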
statistics.pstdev(data, mu=None)

Return the population standard deviation (the square root of the population
variance). See pvariance() for arguments and other details.

>>> pstdev([1.5, 2.5, 2.5, 2.75, 3.25, 4.75])
0.986893273527251

statistics.pvariance(data, mu=None)

Return the population variance of data, a non-empty iterable of real-valued
numbers. Variance, or second moment about the mean, is a measure of the
variability (spread or dispersion) of data. A large variance indicates that
the data is spread out; a small variance indicates it is clustered closely
around the mean.

If the optional second argument mu is given, it should be the mean of
data. If it is missing or None (the default), the mean is
automatically calculated.

Use this function to calculate the variance from the entire population. To
estimate the variance from a sample, the variance() function is usually
a better choice.

Raises StatisticsError if data is empty.

Examples:

>>> data = [0.0, 0.25, 0.25, 1.25, 1.5, 1.75, 2.75, 3.25]
>>> pvariance(data)
1.25

If you have already calculated the mean of your data, you can pass it as the
optional second argument mu to avoid recalculation:

>>> mu = mean(data)
>>> pvariance(data, mu)
1.25

This function does not attempt to verify that you have passed the actual mean
as mu. Using arbitrary values for mu may lead to invalid or impossible
results.

Decimals and Fractions are supported:

>>> from decimal import Decimal as D
>>> pvariance([D("27.5"), D("30.25"), D("30.25"), D("34.5"), D("41.75")])
Decimal('24.815')

>>> from fractions import Fraction as F
>>> pvariance([F(1, 4), F(5, 4), F(1, 2)])
Fraction(13, 72)

Note

When called with the entire population, this gives the population variance
σ². When called on a sample instead, this is the biased sample variance
s², also known as variance with N degrees of freedom.

If you somehow know the true population mean μ, you may use this function
to calculate the variance of a sample, giving the known population mean as
the second argument. Provided the data points are representative
(e.g. independent and identically distributed), the result will be an
unbiased estimate of the population variance.
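
As a sketch of what the population variance measures, the value can be reproduced by hand as the mean of the squared deviations from the mean, dividing by N. The variable names below are illustrative only.

```python
import math
import statistics

data = [0.0, 0.25, 0.25, 1.25, 1.5, 1.75, 2.75, 3.25]

mu = statistics.mean(data)
# Population variance: sum of squared deviations divided by N.
by_hand = sum((x - mu) ** 2 for x in data) / len(data)

print(by_hand, statistics.pvariance(data, mu))
# The population standard deviation is the square root of that value.
print(math.sqrt(by_hand), statistics.pstdev(data))
```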

statistics.stdev(data, xbar=None)

Return the sample standard deviation (the square root of the sample
variance). See variance() for arguments and other details.

>>> stdev([1.5, 2.5, 2.5, 2.75, 3.25, 4.75])
1.0810874155219827

statistics.variance(data, xbar=None)

Return the sample variance of data, an iterable of at least two real-valued
numbers. Variance, or second moment about the mean, is a measure of the
variability (spread or dispersion) of data. A large variance indicates that
the data is spread out; a small variance indicates it is clustered closely
around the mean.

If the optional second argument xbar is given, it should be the mean of
data. If it is missing or None (the default), the mean is
automatically calculated.

Use this function when your data is a sample from a population. To calculate
the variance from the entire population, see pvariance().

Raises StatisticsError if data has fewer than two values.

Examples:

>>> data = [2.75, 1.75, 1.25, 0.25, 0.5, 1.25, 3.5]
>>> variance(data)
1.3720238095238095

If you have already calculated the mean of your data, you can pass it as the
optional second argument xbar to avoid recalculation:

>>> m = mean(data)
>>> variance(data, m)
1.3720238095238095

This function does not attempt to verify that you have passed the actual mean
as xbar. Using arbitrary values for xbar can lead to invalid or
impossible results.

Decimal and Fraction values are supported:

>>> from decimal import Decimal as D
>>> variance([D("27.5"), D("30.25"), D("30.25"), D("34.5"), D("41.75")])
Decimal('31.01875')

>>> from fractions import Fraction as F
>>> variance([F(1, 6), F(1, 2), F(5, 3)])
Fraction(67, 108)

Note

This is the sample variance s² with Bessel’s correction, also known as
variance with N-1 degrees of freedom. Provided that the data points are
representative (e.g. independent and identically distributed), the result
should be an unbiased estimate of the true population variance.

If you somehow know the actual population mean μ you should pass it to the
pvariance() function as the mu parameter to get the variance of a
sample.
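
The difference between the two variance functions comes down to the divisor. A short sketch, with illustrative variable names, that computes both by hand from the same sum of squared deviations:

```python
import math
import statistics

data = [2.75, 1.75, 1.25, 0.25, 0.5, 1.25, 3.5]
n = len(data)
xbar = statistics.mean(data)
sum_sq = sum((x - xbar) ** 2 for x in data)

sample_var = sum_sq / (n - 1)   # Bessel's correction: divide by N-1
population_var = sum_sq / n     # no correction: divide by N

print(sample_var, statistics.variance(data))
print(population_var, statistics.pvariance(data))
```

The sample variance is always the population variance scaled by N / (N - 1), so the two converge as the number of data points grows.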

9.7.4. Exceptions

A single exception is defined:

exception statistics.StatisticsError

Subclass of ValueError for statistics-related exceptions.
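
A minimal sketch of handling this exception. The helper name safe_median and its default parameter are hypothetical; the only library behaviour relied on is that statistics.median() raises StatisticsError for empty data.

```python
import statistics


def safe_median(values, default=None):
    """Hypothetical helper: return the median, or `default` for empty input."""
    try:
        return statistics.median(values)
    except statistics.StatisticsError:
        return default


print(safe_median([3, 1, 2]))
print(safe_median([]))
# Because StatisticsError subclasses ValueError, `except ValueError` would also catch it.
print(issubclass(statistics.StatisticsError, ValueError))
```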

    Python is a very popular language for data analysis and statistics. Python 3 provides the statistics module, which comes with useful functions such as mean(), median(), and mode().
    The median() function in the statistics module calculates the median value of a data list. Its main convenience is that the list does not need to be sorted before it is passed to median().
    The median is the value that separates the higher half of a data sample or probability distribution from the lower half. For a dataset, it may be thought of as the middle value. In statistics and probability theory, the median is a measure of the central tendency of a dataset. Its main advantage over the mean is that it is not skewed as much by extremely large or small values. The median is either one of the values in the dataset or the mean of two of them, so it always lies within the range of the data.
    For an odd number of elements, the median is the middle element of the sorted data.
    For an even number of elements, the median is the mean of the two middle elements of the sorted data.
     

    The median can be represented by the following formula, where a is the data sorted in ascending order, n is its length, and indices start at 1:

    $$\mathrm{median}(a) = \frac{a_{\lfloor (n+1)/2 \rfloor} + a_{\lceil (n+1)/2 \rceil}}{2}$$
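
The rule for odd and even sizes described above can be sketched in plain Python. The helper name median_sketch is hypothetical. Because Python lists are indexed from 0, the two middle positions of the sorted data are (n - 1) // 2 and n // 2, which coincide when n is odd.

```python
import statistics


def median_sketch(data):
    """Hypothetical helper: take the middle item, or average the two middle items."""
    values = sorted(data)
    n = len(values)
    if n == 0:
        raise ValueError("no median for empty data")
    lower = values[(n - 1) // 2]   # same element as `upper` when n is odd
    upper = values[n // 2]
    return lower if n % 2 else (lower + upper) / 2


even_case = median_sketch([2, -2, 3, 6, 9, 4, 5, -1])
odd_case = median_sketch([2, 3, 4, 5, 7, 9, 11])
print(even_case, odd_case)
```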

    Syntax : median( [data-set] )
    Parameters : 
    [data-set] : A list, tuple, or other iterable of numeric values
    Returns : The median (middle value) of the data in the iterable
    Exceptions : StatisticsError is raised when the iterable passed is empty.
     

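    Code #1 : Working

    import statistics

    # Unsorted list with an even number of elements
    data1 = [2, -2, 3, 6, 9, 4, 5, -1]

    # median() sorts the data internally and averages the two middle values (3 and 4)
    print("Median of data-set is : %s" % statistics.median(data1))

    Output :

    Median of data-set is : 3.5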

      
    Code #2 : Tuples of integers, floats, fractions, and negative values

    from statistics import median
    from fractions import Fraction as fr

    # Tuple of positive integers (odd number of elements)
    data1 = (2, 3, 4, 5, 7, 9, 11)
    # Tuple of floating-point values (even number of elements)
    data2 = (2.4, 5.1, 6.7, 8.9)
    # Tuple of fractions
    data3 = (fr(1, 2), fr(44, 12), fr(10, 3), fr(2, 3))
    # Tuple of negative integers
    data4 = (-5, -1, -12, -19, -3)
    # Tuple of mixed positive and negative integers
    data5 = (-1, -2, -3, -4, 4, 3, 2, 1)

    print("Median of data-set 1 is %s" % median(data1))
    print("Median of data-set 2 is %s" % median(data2))
    print("Median of data-set 3 is %s" % median(data3))
    print("Median of data-set 4 is %s" % median(data4))
    print("Median of data-set 5 is %s" % median(data5))

    Output :

    Median of data-set 1 is 5
    Median of data-set 2 is 5.9
    Median of data-set 3 is 2
    Median of data-set 4 is -5
    Median of data-set 5 is 0.0

      
    Code #3 : Demonstrating StatisticsError

    from statistics import median

    # An empty list has no median, so median() raises StatisticsError
    empty = []
    print(median(empty))

    Output :

    Traceback (most recent call last):
      File "/home/3c98774036f97845ee9f65f6d3571e49.py", line 12, in <module>
        print(median(empty))
      File "/usr/lib/python3.5/statistics.py", line 353, in median
        raise StatisticsError("no median for empty data")
    statistics.StatisticsError: no median for empty data

      
    Applications :
    In practice, different measures of location and dispersion are compared on the basis of how well the corresponding population values can be estimated from a sample. For example, the sample mean is more statistically efficient than the sample median when the data are not contaminated by heavy-tailed distributions or by mixtures of distributions, but it is less efficient otherwise, and the sample median remains reasonably efficient across a wide range of distributions. More specifically, for large samples from a normal distribution, the median has about 64% of the efficiency of the minimum-variance mean.
     

    Last Updated :
    27 Sep, 2021
