Pandas find column contains a certain value

In this article, we learn how to find the columns that contain a certain value in pandas. This is a very common scenario that you come across during deployment. Let's see how to find the columns that contain a certain value –

Pandas find column contains a certain value

Suppose you have a data frame with the following columns –

import pandas as pd
df = pd.DataFrame({
    'A': [1, 5, 7, 1, 5],
    'B': [2, 5, 8, 2, 5],
    'C': [3, 6, 9, 3, 6]
})

The data frame above has three columns, A, B, and C, with the following values –

    A   B   C
0   1   2   3
1   5   5   6
2   7   8   9
3   1   2   3
4   5   5   6

Now let's find the index of each column that contains the value 5. To do this we use “np.where” from NumPy, which reports the positions at which a condition is true.

The “np.where” function can therefore locate the column index of a certain value. The comparison that builds the condition can be against a number or a string.

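Below is the syntax of the NumPy np.where function.

np.where(condition, [x, y])

When x and y are given, this function returns an array with elements from x where the condition is true and elements from y elsewhere. When only the condition is given, it returns a tuple of index arrays, one per dimension, marking where the condition is true.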
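
To make the two calling forms concrete, here is a minimal sketch on a small array. The array `arr` and the variable names are made up for illustration only. The one-argument form is the one used for the column search that follows.

```python
import numpy as np

# Hypothetical 2-D array, used only for illustration
arr = np.array([[1, 5],
                [5, 9]])

# Three-argument form: take -1 where the condition holds, else the original value
replaced = np.where(arr == 5, -1, arr)
# replaced is [[1, -1], [-1, 9]]

# One-argument form: a tuple of (row_indices, column_indices) for the True cells
rows, cols = np.where(arr == 5)
# rows is [0, 1] and cols is [1, 0]
```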

import numpy as np
col_index = pd.DataFrame(np.where(df.eq(5))[1] + 1, columns=['col_index'])

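The above code finds the 1-based position of every column that contains the numeric value 5. np.where returns a (rows, columns) tuple, so [1] selects the column positions and + 1 converts them from 0-based to 1-based. The value 5 appears in columns A and B, in rows 1 and 4, so each matching cell produces one entry.

Output

   col_index
0          1
1          2
2          1
3          2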
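
If column names are more useful than 1-based positions, one possible approach is to collapse the boolean mask with `any()` and use it to index `df.columns`. This is a sketch of an alternative, not the method shown above, and the names `mask` and `matching_columns` are chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 5, 7, 1, 5],
    'B': [2, 5, 8, 2, 5],
    'C': [3, 6, 9, 3, 6]
})

# True for every cell equal to 5
mask = df.eq(5)

# any() collapses each column to a single True/False value
matching_columns = df.columns[mask.any()].tolist()
print(matching_columns)  # ['A', 'B']
```

Each column is reported once here, however many of its cells match.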

 

Pandas find column contains a certain string value

The example above checks for a numeric value. What if you need to find whether a column contains a string?

Example –

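The code below returns the rows in which column A contains the string “hello”. It assumes column A holds strings, because the .str accessor is only available on string data.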

df[df['A'].str.contains("hello")]

Let's work through this with an example. Suppose we have the following data frame –

df1 = pd.DataFrame({'col': ['^Mac', '^Macbook', 'Mac11', 'Mini']})
df1[df1['col'].str.contains('^Mac', regex=False)]

Output

        col
0      ^Mac
1  ^Macbook


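Here I have used regex=False because the search text contains the special character ^, so it is treated as a plain string rather than as a regular expression. By default regex is True. Alternatively, you can keep regex enabled and escape the special character with a backslash, written as a raw string so that Python passes the backslash through unchanged. The pattern below matches any value that contains a literal ^ –

df1['col'].str.contains(r'\^')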
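
The difference between the three ways of writing the pattern can be checked side by side. The following sketch reuses the `df1` data from this section, with variable names that are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'col': ['^Mac', '^Macbook', 'Mac11', 'Mini']})

# Default (regex=True): '^' is an anchor, so this matches values that START with "Mac"
as_regex = df1['col'].str.contains('^Mac')

# regex=False: '^Mac' is matched as literal text
as_literal = df1['col'].str.contains('^Mac', regex=False)

# Escaped caret in a raw string: matches any value containing a literal '^'
escaped = df1['col'].str.contains(r'\^')

print(as_regex.tolist())    # [False, False, True, False]
print(as_literal.tolist())  # [True, True, False, False]
print(escaped.tolist())     # [True, True, False, False]
```

With the default setting, only 'Mac11' matches, which is why the literal search needs either regex=False or an escaped caret.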

I hope you now understand how to find a column that contains a certain value in pandas.
