How to find the sum of a column in Python

Aug 17, 2022

You will often want to calculate the sum of one or more columns in a pandas DataFrame. Fortunately, this is easy to do with the sum() function.

This tutorial shows several examples of how to use this function.

Example 1: Find the sum of a single column

Suppose we have the following pandas DataFrame:

import pandas as pd
import numpy as np

#create DataFrame
df = pd.DataFrame({'rating': [90, 85, 82, 88, 94, 90, 76, 75, 87, 86],
 'points': [25, 20, 14, 16, 27, 20, 12, 15, 14, 19],
 'assists': [5, 7, 7, 8, 5, 7, 6, 9, 9, 5],
 'rebounds': [np.nan, 8, 10, 6, 6, 9, 6, 10, 10, 7]})

#view DataFrame 
df

   rating  points  assists  rebounds
0      90      25        5       NaN
1      85      20        7       8.0
2      82      14        7      10.0
3      88      16        8       6.0
4      94      27        5       6.0
5      90      20        7       9.0
6      76      12        6       6.0
7      75      15        9      10.0
8      87      14        9      10.0
9      86      19        5       7.0

We can find the sum of the column named 'points' using the following syntax:

df['points'].sum()

182

The sum() function also excludes NA values by default. For example, if we find the sum of the 'rebounds' column, the first value, NaN, is simply left out of the calculation:

df['rebounds'].sum()

72.0
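As a small sketch of the NA handling described above, the snippet below rebuilds the 'rebounds' values as a standalone Series (the variable names are only illustrative) and compares the default behavior with skipna=False, which lets a missing value propagate into the result.

```python
import numpy as np
import pandas as pd

# Same values as the 'rebounds' column in the example above.
rebounds = pd.Series([np.nan, 8, 10, 6, 6, 9, 6, 10, 10, 7])

default_total = rebounds.sum()              # NaN is skipped by default
strict_total = rebounds.sum(skipna=False)   # NaN propagates into the result

print(default_total)
print(strict_total)
```

Using skipna=False can be useful when a missing value should be treated as a signal that the total is unreliable.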

Example 2: Find the sum of multiple columns

We can find the sum of multiple columns using the following syntax:

#find sum of points and rebounds columns
df[['rebounds', 'points']].sum()

rebounds     72.0
points      182.0
dtype: float64

Example 3: Find the sum of all columns

We can also find the sum of all columns using the following syntax:

#find sum of all columns in DataFrame
df.sum()

rating      853.0
points      182.0
assists      68.0
rebounds     72.0
dtype: float64

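For columns that are not numeric, the behavior of sum() depends on the pandas version: older versions silently skipped such columns, while newer versions try to sum them as well (for example, by concatenating strings). Passing numeric_only=True restricts the calculation to numeric columns.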
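To illustrate the point about non-numeric columns, here is a minimal sketch with a hypothetical DataFrame (df_mixed and its 'team' column are made up for this example) that restricts the sum to numeric columns explicitly.

```python
import pandas as pd

# Hypothetical data: one text column and two numeric columns.
df_mixed = pd.DataFrame({'team': ['A', 'B', 'C'],
                         'points': [25, 20, 14],
                         'assists': [5, 7, 7]})

# numeric_only=True leaves the text column out of the result.
numeric_totals = df_mixed.sum(numeric_only=True)
print(numeric_totals)
```

Being explicit here avoids depending on how a particular pandas version treats text columns.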

You can find the complete documentation for the sum() function in the pandas documentation.


I can sum the items in column zero fine. But where do I change the code to sum column 2, or 3, or 4 in the matrix?
I’m easily stumped.

def main():
    matrix = []

    for i in range(2):
        s = input("Enter a 4-by-4 matrix row " + str(i) + ": ") 
        items = s.split() # Extracts items from the string
        list = [ eval(x) for x in items ] # Convert items to numbers   
        matrix.append(list)

    print("Sum of the elements in column 0 is", sumColumn(matrix))

def sumColumn(m):
    for column in range(len(m[0])):
        total = 0
        for row in range(len(m)):
            total += m[row][column]
        return total

main()

lvc


asked Apr 18, 2014 at 0:40

numpy could do this for you quite easily:

import numpy

def sumColumn(matrix):
    return numpy.sum(matrix, axis=0)  # axis=0 collapses the rows, giving one sum per column
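As a quick check of what the axis argument does, this sketch sums a small made-up 3-by-3 matrix along both axes; axis=0 yields one total per column and axis=1 yields one total per row.

```python
import numpy as np

matrix = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]

column_sums = np.sum(matrix, axis=0)  # one sum per column
row_sums = np.sum(matrix, axis=1)     # one sum per row

print(column_sums)
print(row_sums)
```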

Of course, if you wanted to do it by hand, here’s how I would fix your code:

def sumColumn(m):
    answer = []
    for column in range(len(m[0])):
        t = 0
        for row in m:
            t += row[column]
        answer.append(t)
    return answer

Still, there is a simpler way, using zip:

def sumColumn(m):
    return [sum(col) for col in zip(*m)]

answered Apr 18, 2014 at 0:49

inspectorG4dget

One-liner:

column_sums = [sum([row[i] for row in M]) for i in range(0,len(M[0]))]

also

row_sums = [sum(row) for row in M]

for any rectangular, non-empty matrix (list of lists) M. e.g.

>>> M = [[1,2,3],
...      [4,5,6],
...      [7,8,9]]
>>>
>>> [sum([row[i] for row in M]) for i in range(0,len(M[0]))]
[12, 15, 18] 
>>> [sum(row) for row in M]
[6, 15, 24]

answered Jan 30, 2015 at 17:55

ChrisW

Here is your code changed to return the sum of whatever column you specify:

def sumColumn(m, column):
    total = 0
    for row in range(len(m)):
        total += m[row][column]
    return total

column = 1
print("Sum of the elements in column", column, "is", sumColumn(matrix, column))

answered Apr 18, 2014 at 1:08

user3286261

To get the sum of all columns in the matrix you can use the below python numpy code:

matrixname.sum(axis=0)

answered Oct 30, 2018 at 10:43

import numpy as np
np.sum(M, axis=0)  # axis=0 gives one sum per column; axis=1 would give the row sums

where M is the matrix

Buddy


answered Nov 27, 2018 at 22:47

This can be made easier if you represent the matrix as a flat array:

m = [
    1,2,3,4,
    10,11,12,13,
    100,101,102,103,
    1001,1002,1003,1004
]

def sum_column(m, n):
    return sum(m[i] for i in range(n, 4 * 4, 4))
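The same flat-array idea can be sketched without hard-coding the 4-by-4 size. The function name sum_flat_column and the n_cols parameter are assumptions introduced for this illustration; the slice flat[col::n_cols] picks every n_cols-th element starting at the requested column.

```python
def sum_flat_column(flat, n_cols, col):
    """Sum one column of a matrix stored row by row in a flat list."""
    # Start at index `col` and step by the row width to stay in the same column.
    return sum(flat[col::n_cols])


m = [
    1, 2, 3, 4,
    10, 11, 12, 13,
    100, 101, 102, 103,
    1001, 1002, 1003, 1004,
]

first_column = sum_flat_column(m, 4, 0)
last_column = sum_flat_column(m, 4, 3)
print(first_column, last_column)
```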

answered Apr 18, 2014 at 0:50

michaelmeyer


How do I add up all of the values of a column in a python array? Ideally I want to do this without importing any additional libraries.

input_val = [[1, 2, 3, 4, 5],
             [1, 2, 3, 4, 5],
             [1, 2, 3, 4, 5]]

output_val = [3, 6, 9, 12, 15]

I know this can be done in a nested for loop; I'm wondering whether there is a better way (like a list comprehension).

Stephen Rauch♦


asked Apr 17, 2017 at 21:04

zip and sum can get that done:

Code:

[sum(x) for x in zip(*input_val)]

zip takes the contents of the input list and transposes them, so that the elements at the same position in each contained list are produced together. This lets sum see the first element of each contained list, then on the next iteration the second element of each list, and so on.

Test Code:

input_val = [[1, 2, 3, 4, 5],
             [1, 2, 3, 4, 5],
             [1, 2, 3, 4, 5]]

print([sum(x) for x in zip(*input_val)])

Results:

[3, 6, 9, 12, 15]
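To see the transposition step on its own, the following sketch uses a small made-up matrix and prints what zip(*rows) produces before any summing happens; the variable names are only illustrative.

```python
rows = [[1, 2, 3],
        [4, 5, 6]]

# zip(*rows) pairs up the elements that share a column position.
columns = list(zip(*rows))
column_sums = [sum(col) for col in columns]

print(columns)
print(column_sums)
```

Note that zip stops at the shortest row, so this approach assumes every row has the same length.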

answered Apr 17, 2017 at 21:08

Stephen Rauch

In case you decide to use any library, numpy easily does this:

import numpy as np
np.sum(input_val, axis=0)

answered Apr 17, 2017 at 21:09

JavNoor

You may also use sum with zip within the map function:

# In Python 3.x, map returns a lazy iterator, so convert it to a list explicitly
>>> list(map(sum, zip(*input_val)))
[3, 6, 9, 12, 15]

# In Python 2.x, the explicit conversion is not needed because `map` returns a list
>>> map(sum, zip(*input_val))
[3, 6, 9, 12, 15]

answered Apr 17, 2017 at 21:10

Moinuddin Quadri

Try this:

input_val = [[1, 2, 3, 4, 5],
             [1, 2, 3, 4, 5],
             [1, 2, 3, 4, 5]]

output_val = [sum([i[b] for i in input_val]) for b in range(len(input_val[0]))]

print(output_val)

answered Apr 17, 2017 at 21:12

Ajax1234

Please construct your array using the NumPy library:

import numpy as np

Create the array using the array() function and save it in a variable:

arr = np.array(([1, 2, 3, 4, 5],[1, 2, 3, 4, 5],[1, 2, 3, 4, 5]))

Apply the sum() function to the array, selecting the columns by setting the axis parameter to zero:

arr.sum(axis = 0)

answered Aug 1, 2020 at 22:15

Zhannie

This should work:

[sum(i) for i in zip(*input_val)]

answered Apr 17, 2017 at 21:09

Alex

I guess you can use:

import numpy as np
new_list = sum(map(np.array, input_val))

answered Apr 17, 2017 at 21:11

Pedro Lobito

I think this is the most pythonic way of doing this

list(map(sum, zip(*input_val)))  # list() is needed in Python 3, where map is lazy

answered Apr 17, 2017 at 21:14

Asav Patel

One-liner using list comprehensions: for each column (length of one row), make a list of all the entries in that column, and sum that list.

output_val = [sum([input_val[i][j] for i in range(len(input_val))]) 
                 for j in range(len(input_val[0]))]

answered Apr 17, 2017 at 21:10

Prune

Try this code. This will make output_val end up as [3, 6, 9, 12, 15] given your input_val:

input_val = [[1, 2, 3, 4, 5],
             [1, 2, 3, 4, 5],
             [1, 2, 3, 4, 5]]

vals_length = len(input_val[0])
output_val = [0] * vals_length # init empty output array with 0's
for i in range(vals_length): # iterate for each index in the inputs
    for vals in input_val:
        output_val[i] += vals[i] # add to the same index

print(output_val) # [3, 6, 9, 12, 15]

Al Sweigart


answered Apr 17, 2017 at 21:11

LLL

Using Numpy you can easily solve this issue in one line:

1: Input

input_val = [[1, 2, 3, 4, 5],
             [1, 2, 3, 4, 5],
             [1, 2, 3, 4, 5]]

2: Numpy does the math for you

import numpy as np
np.sum(input_val, axis=0)

3: Then finally the results

array([ 3,  6,  9, 12, 15])

answered Nov 22, 2018 at 7:30

output_val=input_val.sum(axis=0)

This would make the code even simpler, I guess, provided input_val is a NumPy array (a plain list of lists has no sum method).

Stephen Rauch♦


answered Jan 28, 2018 at 1:26

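You can simply use the built-in sum function instead of np.sum. Iterating over a 2-D array yields its rows, so the rows are added together element by element.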

input_val = np.array([[1, 2, 3, 4, 5],
         [1, 2, 3, 4, 5],
         [1, 2, 3, 4, 5]])
sum(input_val)

output: array([ 3,  6,  9, 12, 15])

answered Jun 21, 2022 at 7:40


In today’s recipe we’ll touch on the basics of adding numeric values in a pandas DataFrame.

We’ll cover the following cases:

Sum all rows of one or multiple columns
Sum by column name/label into a new column
Adding values by index
Dealing with nan values
Sum values that meet a certain condition

Creating the dataset

We’ll start by creating a simple dataset

# Python3
# import pandas into your Python environment.
import pandas as pd

# Now, let's create the dataframe 
budget = pd.DataFrame({"person": ["John", "Kim", "Bob"],
                        "quarter": [1, 1, 1] ,
                        "consumer_budg": [15000, 35000, 45000],
                         "enterprise_budg": [20000, 30000, 40000] })
budget.head()

	person	quarter	consumer_budg	enterprise_budg
0	John	1	15000	20000
1	Kim	1	35000	30000
2	Bob	1	45000	40000

How to sum a column? (or more)

For a single column we’ll simply use the Series sum() method.

# one column
budget['consumer_budg'].sum()

95000

The DataFrame also has a sum() method, which we’ll use to add multiple columns:

# adding multiple columns
cols = ['consumer_budg', 'enterprise_budg']
budget[cols].sum()

We’ll receive a Series object with the results:

consumer_budg      95000
enterprise_budg    90000
dtype: int64

Sum row values into a new column

More interesting is the case where we want to compute a value for each row by adding several column values. See the simple example below:

# using the column label names
budget['total_budget'] = budget['consumer_budg'] + budget['enterprise_budg']

We have created a new column as shown below:

	person	quarter	consumer_budg	enterprise_budg	total_budget
0	John	1	15000	20000	35000
1	Kim	1	35000	30000	65000
2	Bob	1	45000	40000	85000

Note: We could have also used the loc indexer to subset by label.
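As a sketch of the label-based alternative mentioned in the note, the snippet below selects the two budget columns with loc and sums across each row with sum(axis=1). The DataFrame is rebuilt here so that the example is self-contained.

```python
import pandas as pd

budget = pd.DataFrame({"person": ["John", "Kim", "Bob"],
                       "quarter": [1, 1, 1],
                       "consumer_budg": [15000, 35000, 45000],
                       "enterprise_budg": [20000, 30000, 40000]})

cols = ["consumer_budg", "enterprise_budg"]

# Select the columns by label, then add the values within each row.
budget["total_budget"] = budget.loc[:, cols].sum(axis=1)
print(budget)
```

This form scales more easily than chaining + when there are many columns to add.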

Adding columns by index

We can also refer to the columns to sum by position, using the iloc indexer.

# by index
budget['total_budget'] = budget.iloc[:, 2] + budget.iloc[:, 3]

The result will be the same as above.

Sum with conditions

In this example, we would like to define a column named high_budget and populate it only if the total_budget is over the 80K threshold.

budget['high_budget'] = budget.query('consumer_budg + enterprise_budg > 80000')['total_budget']
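The same conditional column can be sketched with a boolean mask instead of query(). This is an alternative formulation, not the approach used above: where() keeps the total only for rows that pass the threshold, and loc with the same mask gives the overall sum of the qualifying rows. The names over_threshold and high_sum are introduced just for this example.

```python
import pandas as pd

budget = pd.DataFrame({"person": ["John", "Kim", "Bob"],
                       "consumer_budg": [15000, 35000, 45000],
                       "enterprise_budg": [20000, 30000, 40000]})
budget["total_budget"] = budget["consumer_budg"] + budget["enterprise_budg"]

# True for rows whose total exceeds the 80K threshold.
over_threshold = budget["total_budget"] > 80000

# Keep the total where the condition holds; other rows become NaN.
budget["high_budget"] = budget["total_budget"].where(over_threshold)

# Sum of the totals for qualifying rows only.
high_sum = budget.loc[over_threshold, "total_budget"].sum()
print(budget)
print(high_sum)
```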

Adding columns with null values

Here we might need a bit of pre-processing to get rid of the null values using fillna().

Let’s quickly create a sample dataset containing null values (see last row).

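# with nan
import numpy as np
budget_nan = pd.DataFrame({"person": ["John", "Kim", "Bob", 'Court'],
                        "quarter": [1, 1, 1,1] ,
                        "consumer_budg": [15000, 35000, 45000, 50000],
                         "enterprise_budg": [20000, 30000, 40000, np.nan ] })
# adding the two columns directly propagates the NaN into the result
budget_nan['high_budget'] = budget_nan['consumer_budg'] + budget_nan['enterprise_budg']
budget_nan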

	person	quarter	consumer_budg	enterprise_budg	high_budget
0	John	1	15000	20000.0	35000.0
1	Kim	1	35000	30000.0	65000.0
2	Bob	1	45000	40000.0	85000.0
3	Court	1	50000	NaN	NaN

Now let’s use the DataFrame fillna() method to replace all the null values with zeros so that we can sum the column values.

budget_nan.fillna(0, inplace=True)
budget_nan['high_budget'] = budget_nan['consumer_budg'] + budget_nan['enterprise_budg']
budget_nan

Voilà

	person	quarter	consumer_budg	enterprise_budg	high_budget
0	John	1	15000	20000.0	35000.0
1	Kim	1	35000	30000.0	65000.0
2	Bob	1	45000	40000.0	85000.0
3	Court	1	50000	0.0	50000.0
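If overwriting the missing values in the DataFrame is not desirable, one possible alternative is the Series add() method with fill_value=0, which treats a missing operand as zero only for the purpose of the addition. The sketch below rebuilds a reduced version of the sample data; the variable name total is illustrative.

```python
import numpy as np
import pandas as pd

budget_nan = pd.DataFrame({"person": ["John", "Kim", "Bob", "Court"],
                           "consumer_budg": [15000, 35000, 45000, 50000],
                           "enterprise_budg": [20000, 30000, 40000, np.nan]})

# fill_value=0 stands in for a missing operand during the addition only.
total = budget_nan["consumer_budg"].add(budget_nan["enterprise_budg"], fill_value=0)
print(total)
```

The original enterprise_budg column keeps its NaN, so the information that a value was missing is not lost.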


How to get the sum of a Pandas column

We will look at how to get the sum of a Pandas DataFrame column, as well as how to compute a cumulative sum with groupby and how to sum DataFrame columns based on conditions on the values of other columns.

Method to get the sum of a column

First we create a random array using the NumPy library, and then we get the sum of each column using the sum() function.

import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.random.randint(0,10,size=(10, 4)),
    columns=list('1234'))
print(df)
Total = df['1'].sum()
print ("Column 1 sum:",Total)
Total = df['2'].sum()
print ("Column 2 sum:",Total)
Total = df['3'].sum()
print ("Column 3 sum:",Total)
Total = df['4'].sum()
print ("Column 4 sum:",Total)

If you run this code, you will get output like the following (the values may differ in your case):

   1  2  3  4
0  2  2  3  8
1  9  4  3  1
2  8  5  6  0
3  9  5  7  4
4  2  7  3  7
5  9  4  1  3
6  6  7  7  3
7  0  4  2  8
8  0  6  6  4
9  5  8  7  2
Column 1 sum: 50
Column 2 sum: 52
Column 3 sum: 45
Column 4 sum: 40
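Rather than summing each column in a separate statement, the totals can be computed in one call and then printed in a loop. This is a sketch with a small fixed DataFrame (three made-up rows instead of random data) so that the result is reproducible.

```python
import pandas as pd

# Fixed values instead of random ones, so the totals are predictable.
df = pd.DataFrame([[2, 2, 3, 8],
                   [9, 4, 3, 1],
                   [8, 5, 6, 0]],
                  columns=list('1234'))

totals = df.sum()  # one total per column, returned as a Series

for name, total in totals.items():
    print("Column", name, "sum:", total)
```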

Cumulative sum with groupby

We can get a cumulative sum using the groupby method. Consider the following DataFrame with the columns Date, Fruit, and Sale:

import pandas as pd

df = pd.DataFrame(
    {
        'Date': 
             ['08/09/2018', 
              '10/09/2018', 
              '08/09/2018', 
              '10/09/2018'],
        'Fruit': 
             ['Apple', 
              'Apple', 
              'Banana', 
              'Banana'],
        'Sale':
             [34,
              12,
              22,
              27]
    })

If we want to calculate the cumulative sum of Sale per fruit across dates, we can do this:

import pandas as pd

df = pd.DataFrame(
    {
        'Date': 
             ['08/09/2018', 
              '10/09/2018', 
              '08/09/2018', 
              '10/09/2018'],
        'Fruit': 
             ['Apple', 
              'Apple', 
              'Banana', 
              'Banana'],
        'Sale':
             [34,
              12,
              22,
              27]
    })

print(df.groupby(by=['Fruit','Date']).sum().groupby(level=[0]).cumsum())

After running the code above, we get the following output, which shows the cumulative sum of sales for each fruit by date:

Fruit  Date         Sale
Apple  08/09/2018    34
       10/09/2018    46
Banana 08/09/2018    22
       10/09/2018    49
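A related sketch keeps the original row layout and stores the running total in a new column. The column name CumSale is an assumption made for this example, and the approach relies on the rows already being in date order within each fruit, as they are in the sample data.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['08/09/2018', '10/09/2018', '08/09/2018', '10/09/2018'],
    'Fruit': ['Apple', 'Apple', 'Banana', 'Banana'],
    'Sale': [34, 12, 22, 27],
})

# Running total of Sale within each fruit, aligned with the original rows.
df['CumSale'] = df.groupby('Fruit')['Sale'].cumsum()
print(df)
```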

Method to get the sum of columns based on conditions on other columns' values

This method gives us the sum where a given condition is true and replaces the sum with a given value where the condition is false. Consider the following code:

import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.random.randn(5,3), 
    columns=list('xyz'))

df['sum'] = df.loc[df['x'] > 0,['x','y']].sum(axis=1)

df['sum'] = df['sum'].fillna(0)
print(df)

In the code above we added a new column sum to the DataFrame. It holds the sum of the columns ['x', 'y'] for rows where x is greater than 0; for all other rows the sum is missing, and we replace it with 0.

After running the code, we get output like the following (the values may differ in your case).

          x         y         z       sum
0 -1.067619  1.053494  0.179490  0.000000
1 -0.349935  0.531465 -1.350914  0.000000
2 -1.650904  1.534314  1.773287  0.000000
3  2.486195  0.800890 -0.132991  3.287085
4  1.581747 -0.667217 -0.182038  0.914530
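The conditional sum can also be sketched in a single step with where(), which takes the replacement value directly. The data here is fixed and made up (rather than random) so that the result can be checked by hand.

```python
import pandas as pd

# Small fixed example in place of random values.
df = pd.DataFrame({'x': [-1.0, 2.0, 0.5],
                   'y': [1.0, 3.0, -0.25]})

# Keep x + y where x > 0; use 0 everywhere else.
df['sum'] = (df['x'] + df['y']).where(df['x'] > 0, 0.0)
print(df)
```

This avoids the separate fillna step, at the cost of computing x + y for every row.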
