I am getting the following error: AttributeError: 'DataFrame' object has no attribute 'data_type'. I am trying to recreate the code from this link, which is based on this article, using my own dataset, which is similar to the one in the article.
from sklearn.model_selection import train_test_split
from transformers import BertTokenizer
X_train, X_val, y_train, y_val = train_test_split(df.index.values,
                                                  df.label.values,
                                                  test_size=0.15,
                                                  random_state=42,
                                                  stratify=df.label.values)
df['data_type'] = ['not_set']*df.shape[0]
df.loc[X_train, 'data_type'] = 'train'
df.loc[X_val, 'data_type'] = 'val'
df.groupby(['Conference', 'label', 'data_type']).count()
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased',
                                          do_lower_case=True)

encoded_data_train = tokenizer.batch_encode_plus(
    df[df.data_type=='train'].example.values,
    add_special_tokens=True,
    return_attention_mask=True,
    pad_to_max_length=True,
    max_length=256,
    return_tensors='pt'
)
This is the error I get:
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_24180/2662883887.py in <module>
      3
      4 encoded_data_train = tokenizer.batch_encode_plus(
----> 5     df[df.data_type=='train'].example.values,
      6     add_special_tokens=True,
      7     return_attention_mask=True,

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\generic.py in __getattr__(self, name)
   5485         ):
   5486             return self[name]
-> 5487         return object.__getattribute__(self, name)
   5488
   5489     def __setattr__(self, name: str, value) -> None:

AttributeError: 'DataFrame' object has no attribute 'data_type'
I am using Python 3.9, PyTorch 1.10.1, pandas 1.3.5, and transformers 4.15.0.
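
The traceback says that df has no data_type column at the moment the tokenizer cell runs. A quick way to narrow this down is to check df.columns right before that cell. The sketch below reproduces the labelling step on a toy DataFrame. The data values, the has_column helper, and the hand-picked index lists standing in for the train_test_split output are all hypothetical. It is only a diagnostic aid and does not confirm the cause; one possibility worth ruling out is that df was reloaded or reassigned after the data_type column was created.

```python
import pandas as pd

# Toy stand-in for the real dataset (hypothetical values).
df = pd.DataFrame({
    "example": ["first text", "second text", "third text", "fourth text"],
    "label": [0, 1, 0, 1],
})


def has_column(frame: pd.DataFrame, name: str) -> bool:
    """Return True if `name` is an actual column of `frame` (hypothetical helper)."""
    return name in frame.columns


# Before the column is created, df.data_type would raise AttributeError.
before = has_column(df, "data_type")

# Same labelling pattern as in the question, with hand-picked index values
# standing in for the output of train_test_split.
train_idx, val_idx = [0, 1, 2], [3]
df["data_type"] = ["not_set"] * df.shape[0]
df.loc[train_idx, "data_type"] = "train"
df.loc[val_idx, "data_type"] = "val"

after = has_column(df, "data_type")

# Bracket indexing raises a KeyError naming the missing column, which is
# often easier to read than the AttributeError raised by dot access.
train_texts = df.loc[df["data_type"] == "train", "example"].tolist()

print(before, after, train_texts)
```

If the same check on the real df shows no data_type column just before the batch_encode_plus call, the cell that creates the column either did not run in the current kernel session or ran against a different df object.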