def section_articles():

    Biology = (df2["Section"]=="Biology").sum()
    Chemistry = (df2["Section"]=="Chemistry").sum()
    Computer_Science = (df2["Section"]=="Computer Science").sum()
    Earth_Environment = (df2["Section"]=="Earth & Environment").sum()
    Mathematics = (df2["Section"]=="Mathematics").sum()
    Physics = (df2["Section"]=="Physics").sum()
    Statistics = (df2["Section"]=="Statistics").sum()

    return() 
print ("Biology",Biology)
print ("Chemistry",Chemistry)
print ("Computer_Science",Computer_Science)
print ("Earth_Environment",Earth_Environment)
print ("Mathematics",Mathematics)
print ("Physics",Physics)
print ("Statistics",Statistics)

section_articles()

I am expecting the number of articles in each section, but I am getting the error "Biology is not defined". Can someone help me, please?

  • I think you can use the groupby method, but we would need to know more about the DataFrame. Commented Feb 4, 2023 at 17:41

2 Answers

The issue is that the variables Biology, Chemistry, etc. are local variables defined inside the section_articles function, so they are not accessible outside of it. The function also returns an empty tuple, so nothing is passed back to the caller. To use the counts, return them from the function and assign the function's output to a variable:

def section_articles():
    Biology = (df2["Section"]=="Biology").sum()
    Chemistry = (df2["Section"]=="Chemistry").sum()
    Computer_Science = (df2["Section"]=="Computer Science").sum()
    Earth_Environment = (df2["Section"]=="Earth & Environment").sum()
    Mathematics = (df2["Section"]=="Mathematics").sum()
    Physics = (df2["Section"]=="Physics").sum()
    Statistics = (df2["Section"]=="Statistics").sum()

    return (Biology, Chemistry, Computer_Science, Earth_Environment, Mathematics, Physics, Statistics)

section_counts = section_articles()

print ("Biology",section_counts[0])
print ("Chemistry",section_counts[1])
print ("Computer_Science",section_counts[2])
print ("Earth_Environment",section_counts[3])
print ("Mathematics",section_counts[4])
print ("Physics",section_counts[5])
print ("Statistics",section_counts[6])
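
The scoping rule can be seen in isolation with a minimal sketch that involves no pandas at all. The function name count_items and the sample list are made up for illustration. A name assigned inside a function exists only while that function runs, so referencing it afterwards raises a NameError, while the returned value stays usable:

```python
def count_items(items):
    # `total` is a local name: it exists only while count_items is running.
    total = len(items)
    return total

# The returned value is kept by assigning it to a module-level name.
result = count_items(["a", "b", "c"])
print("returned value:", result)

try:
    print(total)  # the local name is not visible outside the function
except NameError as exc:
    error_message = str(exc)
    print("NameError:", error_message)
```

This is the same kind of error the question runs into with Biology.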

A tidier version stores the count for each section in a dictionary and then loops over the dictionary to print the values:

def section_articles():
    sections = {"Biology": (df2["Section"]=="Biology").sum(),
                "Chemistry": (df2["Section"]=="Chemistry").sum(),
                "Computer Science": (df2["Section"]=="Computer Science").sum(),
                "Earth & Environment": (df2["Section"]=="Earth & Environment").sum(),
                "Mathematics": (df2["Section"]=="Mathematics").sum(),
                "Physics": (df2["Section"]=="Physics").sum(),
                "Statistics": (df2["Section"]=="Statistics").sum()}
    return sections

section_counts = section_articles()

for section, count in section_counts.items():
    print(f"{section}: {count}")
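
If df2["Section"] is a pandas column of section names, as the question's code suggests, Series.value_counts() can produce the same counts in one call. The sketch below uses a small made-up df2 (the real data is not shown in the question) and takes the DataFrame as a parameter, which is a choice made here and not part of the original code:

```python
import pandas as pd

# Hypothetical stand-in for the question's df2; the real data is not shown.
df2 = pd.DataFrame({"Section": ["Biology", "Physics", "Biology",
                                "Mathematics", "Physics", "Biology"]})

def section_articles(df):
    # value_counts() tallies each distinct value found in the column.
    return df["Section"].value_counts().to_dict()

section_counts = section_articles(df2)

for section, count in section_counts.items():
    print(f"{section}: {count}")
```

Note that a section with no articles does not appear in the result at all, whereas the explicit dictionary above would report it with a count of 0.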

Your function returns an empty tuple (), and the counts are local variables, so you can't access them outside it.

One way to fix the error and reduce visual noise is to build and return a dictionary, then loop over it:

def section_articles():
    list_of_sections = ["Biology", "Chemistry", "Computer Science",
                        "Earth & Environment", "Mathematics", "Physics", "Statistics"]
    # Map each section name to the number of rows in df2 with that section
    return {k: (df2["Section"] == k).sum() for k in list_of_sections}

for k, v in section_articles().items():
    print(k, v)

Another variant:

list_of_sections = ["Biology", "Chemistry", "Computer Science",
                    "Earth & Environment", "Mathematics", "Physics", "Statistics"]

def section_articles(section):
    # Count the rows of df2 whose "Section" value matches the given name
    return (df2["Section"] == section).sum()


for section in list_of_sections:
    print(section, section_articles(section))
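
The groupby approach suggested in the comment under the question could look roughly like this. It is only a sketch: the small df2 below is made up because the real data is not shown, and reindex is used here so that sections with no articles still show up with a count of 0:

```python
import pandas as pd

# Hypothetical stand-in for the question's df2; the real data is not shown.
df2 = pd.DataFrame({"Section": ["Biology", "Physics", "Biology", "Statistics"]})

list_of_sections = ["Biology", "Chemistry", "Computer Science",
                    "Earth & Environment", "Mathematics", "Physics", "Statistics"]

# size() counts the rows in each group; reindex keeps the order of the list
# and fills in 0 for sections that have no rows.
counts = (df2.groupby("Section").size()
             .reindex(list_of_sections, fill_value=0))

for section, count in counts.items():
    print(section, count)
```

This makes a single pass over the column instead of one comparison per section.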
