The following code creates a list of questions and options from a multiline string.
import json
import re
text_string = '''
Question 1
### Consider the following figure:
Select one:
* **a. The optimal solution**
* b. An infeasible solution
* c. An Alternate vertex
* d. None of these answers
***The correct answer is: The optimal solution***
-----
Question 2
### If the characters 'D', 'C', 'B', 'A' are placed in a queue (in that order), and then removed one at a time, in what order will they be removed?
```
// initially called with low = 0, high = N - 1
BinarySearch_Right(A[0..N-1], v alue, low , high) {
// in variants: v alue >= A[i] for all i < low
v alue < A[i] for all i > high
if (high < low)
return low
mid = (low + high) / 2
if (A[mid] > v alue)
return BinarySearch_Right(A, v alue, low , mid-1)
else
return BinarySearch_Right(A, v alue, mid+1, high)
}
```
Select one:
* a. ABCD
* b. ABDC
* c. DCAB
* **d. DCBA **
* e. ACDB
***The correct answer is: DCBA***
'''
questions = text_string.split('-----')
quizzes = []
for ques in questions:
    # Everything before the first '*' is the question header and text;
    # dropping the first line removes the question number (e.g. "Question 1")
    question_array = ques.strip().split('*')[0].split('\n')
    # Question text
    question = '\n'.join(question_array[1:])
    # Remove the leading ### if present
    if question.startswith("###"):
        question = question[3:]
    # build a dict item to add to the quizzes list
    quiz_item = {
        'question': question.strip(),
        'options': [],
        'answer_string': ''
    }
    # collect the options and the answer line
    for option in ques.strip().split('\n'):
        if option.startswith("*") and not (option.startswith("***") and option.endswith("***")):
            quiz_item['options'].append({
                'option': option.replace('*', '').strip(),
                'answer': option.startswith("* **") and option.endswith("**")
            })
        if option.startswith("***") and option.endswith("***"):
            quiz_item['answer_string'] = option.replace('*', '').strip()
    quizzes.append(quiz_item)
print(json.dumps(quizzes, indent=2))
It works, and I get the results I want.
However, I feel it is not efficient enough.
Is there a better way to write this? Thank you.
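
For comparison, here is a minimal sketch of a single-pass variant. It is one possible direction, not a definitive rewrite. The helper names `parse_block` and `parse_quizzes` and the short `sample` string are hypothetical and do not come from the code above. The sketch assumes the same input layout: a "Question N" header line, option lines starting with `*`, and an answer line wrapped in `***`. Unlike the original, it treats every line before the first option line as question text instead of cutting at the first `*` character, which gives the same result as long as the question text itself contains no `*`.

```python
import json


def parse_block(block):
    """Parse one question block into a dict (hypothetical helper)."""
    lines = block.strip().split('\n')
    item = {'question': '', 'options': [], 'answer_string': ''}
    question_lines = []
    for line in lines[1:]:  # lines[0] is the "Question N" header
        if line.startswith('***') and line.endswith('***'):
            # Answer line, e.g. ***The correct answer is: ...***
            item['answer_string'] = line.replace('*', '').strip()
        elif line.startswith('*'):
            # Option line; the correct one is wrapped in ** ... **
            item['options'].append({
                'option': line.replace('*', '').strip(),
                'answer': line.startswith('* **') and line.endswith('**'),
            })
        elif not item['options']:
            # Anything before the first option belongs to the question text
            question_lines.append(line)
    question = '\n'.join(question_lines).strip()
    if question.startswith('###'):
        question = question[3:].strip()
    item['question'] = question
    return item


def parse_quizzes(text, separator='-----'):
    """Split the text on the separator and parse each non-empty block."""
    return [parse_block(b) for b in text.split(separator) if b.strip()]


# Small made-up sample in the same layout as text_string
sample = '''
Question 1
### Pick one:
Select one:
* **a. Yes**
* b. No
***The correct answer is: Yes***
-----
Question 2
### Again?
* a. X
* **b. Y **
***The correct answer is: Y***
'''

print(json.dumps(parse_quizzes(sample), indent=2))
```

Each block is walked once, and the option and answer checks are ordered so that the `***` answer line is tested before the more general `*` option check. Whether this is measurably faster on real input is untested; the main gain is that the parsing rules live in one loop.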