Given a dataframe column, I divide its numerical values into 10 groups, and I am trying to assign a label to each group and build a list of these labels. To do so, I need to check which interval each value in the column falls into. However, I get the following error:

AttributeError: float object has no attribute 'between'

so it seems a float has no 'between' method I could use for this check.

l2=[29.69911764705882, 32.5, 32.5, 54.0, 12.0, 29.69911764705882, 24.0, 29.69911764705882, 45.0, 33.0, 20.0, 47.0, 29.0,
    25.0, 23.0, 19.0, 37.0, 16.0, 24.0, 29.69911764705882, 22.0, 24.0, 19.0, 18.0, 19.0, 27.0, 9.0, 36.5, 42.0, 51.0, 22.0,
    55.5, 40.5, 29.69911764705882, 51.0, 16.0, 30.0, 29.69911764705882, 29.69911764705882, 44.0, 40.0, 26.0, 17.0, 1.0, 9.0,
    29.69911764705882, 45.0, 29.69911764705882, 28.0, 61.0, 4.0, 1.0, 21.0, 56.0, 18.0, 29.69911764705882, 50.0, 30.0, 36.0,
    29.69911764705882, 29.69911764705882, 9.0, 1.0, 4.0, 29.69911764705882, 29.69911764705882, 45.0, 40.0, 36.0, 32.0, 19.0,
    19.0, 3.0, 44.0, 58.0, 29.69911764705882, 42.0, 29.69911764705882, 24.0, 28.0, 29.69911764705882, 34.0, 45.5, 18.0, 2.0,
    32.0, 26.0, 16.0, 40.0, 24.0, 35.0, 22.0, 30.0, 29.69911764705882, 31.0, 27.0, 42.0, 32.0, 30.0, 16.0, 27.0, 51.0, 
    29.69911764705882, 38.0, 22.0, 19.0, 20.5, 18.0, 29.69911764705882, 35.0, 29.0, 59.0, 5.0, 24.0, 29.69911764705882, 
    44.0, 8.0, 19.0, 33.0, 29.69911764705882, 29.69911764705882, 29.0, 22.0, 30.0, 44.0, 25.0, 24.0, 37.0, 54.0, 
    29.69911764705882, 29.0, 62.0, 30.0, 41.0, 29.0, 29.69911764705882, 30.0, 35.0, 50.0, 29.69911764705882, 3.0]
d = {'col1': []}
df = pd.DataFrame(data=d)
df['col1']=l2
print(df['col1'])

df['col2'] = pd.cut(df.col1,10)
print(df['col2'].value_counts())
new_list=[]
labels=['25-31','19-25','13-19','31-37','0-7','37-43','43-49','49-55','7-13','55-62']
for i in df['col1']:
    for j in df['col2'].value_counts():
        if i.between(j):
            new_list.append(inter_list.index(j))
print(new_list)
        
  • Can you provide the expected output? As written, every element of l2 is compared with the aggregate counts of the bins; on a match, I assume, new_list is appended with the index of that count in inter_list, which is not defined anywhere in the post. Commented Oct 13, 2021 at 11:34
  • Since the expected output is a list as long as l2, I will only provide the first 5 items: new_list=[25-31, 31-37, 31-37, 49-55, 7-13, ...]. As you can see, each item of l2 belongs to one of the intervals listed in the 'labels' list. Let me know if this is clear. Commented Oct 13, 2021 at 11:43

1 Answer

According to the pandas.cut documentation, you can pass the labels directly in the function call. The return value is a pandas Series containing, for each value in df.col1, the label of the bin it falls into. The following code does the trick:

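# pd.cut pairs labels with bins from lowest to highest,
# so list them in ascending order (not in value_counts order).
labels = ['0-7', '7-13', '13-19', '19-25', '25-31',
          '31-37', '37-43', '43-49', '49-55', '55-62']
df['labels'] = pd.cut(df.col1, 10, labels=labels)
new_list = df['labels'].tolist()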
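
As a minimal, self-contained sketch of the above, the snippet below runs the same idea on a small sample taken from the question's l2 (the first five values plus the minimum 1.0 and maximum 62.0, so the ten equal-width bins match those of the full list). The sample itself and the name in_25_31 are assumptions made for illustration only. It also shows why the original loop failed: between is a method of a pandas Series, not of the individual floats you get when iterating over a column.

```python
import pandas as pd

# Illustrative sample from the question's l2: first five values plus min and max.
df = pd.DataFrame({'col1': [29.69911764705882, 32.5, 32.5, 54.0, 12.0, 1.0, 62.0]})

# A plain float has no .between; the method belongs to the Series.
assert not hasattr(df['col1'].iloc[0], 'between')
in_25_31 = df['col1'].between(25, 31)  # boolean Series, one entry per row

# Labels in ascending order, because pd.cut assigns them to bins left to right.
labels = ['0-7', '7-13', '13-19', '19-25', '25-31',
          '31-37', '37-43', '43-49', '49-55', '55-62']

# Ten equal-width bins over [min, max], each value replaced by its bin's label.
df['labels'] = pd.cut(df['col1'], 10, labels=labels)

# Convert the categorical column to a plain list of strings.
new_list = df['labels'].astype(str).tolist()
print(new_list)
```

Note that the label names only approximate the real bin edges: with 10 equal-width bins over 1 to 62, each bin is 6.1 wide, so the edges are not whole numbers. If exact edges such as 7, 13, 19 matter, a list of edges can be passed as the bins argument instead of the integer 10.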