Pandas - Convert list of numpy arrays into one single list?

Question

While adding one-hot encoding to my machine learning project, I am using the following code to encode my 3 categorical features (job, marital status and education):

encoder = OneHotEncoder(categories = 'auto')
feature_array = encoder.fit_transform(df[['job', 'marital', 'education']]).toarray()
feature_labels = encoder.categories_

This returns the categories for each of the 3 features into 3 different arrays captured in a list.

[array(['admin.', 'blue-collar', 'management', 'retired', 'self-employed',
        'services', 'student', 'technician', 'unemployed', 'unknown'],
       dtype=object),
 array(['divorced', 'married', 'single'], dtype=object),
 array(['primary', 'secondary', 'tertiary', 'unknown'], dtype=object)]

I understand that looping over this list gives me the labels for each of the 3 features, one array at a time:

for value in feature_labels:
    print(value)

['admin.' 'blue-collar' 'management' 'retired' 'self-employed' 'services'
 'student' 'technician' 'unemployed' 'unknown']
['divorced' 'married' 'single']
['primary' 'secondary' 'tertiary' 'unknown']

That being said, is there a more elegant way, ideally a one-liner, to create a single list containing all the categories for my 3 features? In the end, I'd love to have a single list that looks like the one below, so I can use it to label all 3 encoded features in a single dataframe:

['admin.', 'blue-collar', 'management', 'retired', 'self-employed', 'services', 'student', 'technician', 'unemployed', 'unknown', 'divorced', 'married', 'single', 'primary', 'secondary', 'tertiary', 'unknown']

scikit-learn.org/stable/modules/generated/…

– Vishesh Mangla, Commented Aug 16, 2020 at 16:59

Anthony · Accepted Answer · 2020-08-16 17:09:40Z

1

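You can use NumPy's concatenate (https://numpy.org/doc/stable/reference/generated/numpy.concatenate.html) to join your 3 arrays into a single array (call .tolist() on the result if you need a plain Python list):

import numpy as np

labels = np.concatenate(feature_labels)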

# The result:
array(['admin.', 'blue-collar', 'management', 'retired', 'self-employed',
       'services', 'student', 'technician', 'unemployed', 'unknown',
       'divorced', 'married', 'single', 'primary', 'secondary',
       'tertiary', 'unknown'], dtype=object)
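As a minimal, self-contained sketch of the same idea: the `feature_labels` list below is a hand-built stand-in for `encoder.categories_` (values copied from the question), so no fitted encoder is needed. It also converts the result to a plain list, which is what the question asks for.

```python
import numpy as np

# Stand-in for encoder.categories_: a list of object-dtype arrays
# (assumption: built by hand from the values shown in the question).
feature_labels = [
    np.array(['admin.', 'blue-collar', 'management', 'retired',
              'self-employed', 'services', 'student', 'technician',
              'unemployed', 'unknown'], dtype=object),
    np.array(['divorced', 'married', 'single'], dtype=object),
    np.array(['primary', 'secondary', 'tertiary', 'unknown'], dtype=object),
]

# Join the arrays end to end, then convert the array to a Python list.
labels = np.concatenate(feature_labels).tolist()

print(type(labels).__name__, len(labels))
```

The order of the labels follows the order of the arrays in the list, so it should line up with the column order of the encoded array.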


ipj · Accepted Answer · 2020-08-16 16:58:46Z

0

If you have a nested list:

l = [['admin.', 'blue-collar', 'management', 'retired', 'self-employed', 'services', 'student', 'technician', 'unemployed', 'unknown'],
     ['divorced', 'married', 'single'],
     ['primary', 'secondary', 'tertiary', 'unknown']]

one way to flatten it is:

import itertools

flat_l = list(itertools.chain(*l))

result:

['admin.',
 'blue-collar',
 'management',
 'retired',
 'self-employed',
 'services',
 'student',
 'technician',
 'unemployed',
 'unknown',
 'divorced',
 'married',
 'single',
 'primary',
 'secondary',
 'tertiary',
 'unknown']
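Two closely related variants, sketched here on a shortened version of the nested list (the shortening is only for brevity): `itertools.chain.from_iterable` takes the outer list directly instead of unpacking it with `*`, and a nested list comprehension does the same job without any import. Both only iterate over the inner sequences, so they should work equally well when the inner items are NumPy arrays rather than lists.

```python
from itertools import chain

# Shortened stand-in for the nested list of labels above.
nested = [
    ['admin.', 'blue-collar'],
    ['divorced', 'married', 'single'],
    ['primary', 'secondary', 'tertiary', 'unknown'],
]

# Variant 1: chain.from_iterable consumes the outer list lazily.
flat = list(chain.from_iterable(nested))

# Variant 2: nested comprehension, outer loop first, inner loop second.
flat_comprehension = [label for group in nested for label in group]

print(flat)
```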


Gil Pinsky · Accepted Answer · 2020-08-16 17:12:50Z

0

Since you have a list of NumPy arrays, you could also use:

import numpy as np

l = list(np.concatenate(feature_labels))
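One thing to watch for if the flattened labels become dataframe column names: 'unknown' appears under both job and education, so the flat list contains a duplicate. A possible workaround, sketched below, is to prefix each label with its feature name. The `feature_names` list, the shortened label lists, and the `name_label` naming scheme are illustrative assumptions, not part of the question's code.

```python
# Assumed feature order, matching the columns passed to the encoder.
feature_names = ['job', 'marital', 'education']

# Shortened stand-in for encoder.categories_ (plain lists for brevity).
feature_labels = [
    ['admin.', 'unknown'],
    ['divorced', 'married', 'single'],
    ['primary', 'unknown'],
]

# Pair each feature name with its labels and build unique column names.
columns = [
    f"{name}_{label}"
    for name, labels in zip(feature_names, feature_labels)
    for label in labels
]

print(columns)
```

Recent scikit-learn releases also offer `encoder.get_feature_names_out()`, which produces similarly prefixed names directly from the fitted encoder; whether it is available depends on the installed version.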
