Creating a dictionary with list of lists in Python

Question

I have a huge file (with around 200k inputs). The inputs are in the form:

A B C D
B E F
C A B D
D

I am reading this file and storing it in a list as follows:

text = f.read().split('\n')

This splits the file at every newline, so text is a list of lines, like this:

['A B C D', 'B E F', 'C A B D', 'D']

I now have to store these values in a dictionary where the keys are the first element of each line, i.e. the keys will be A, B, C, D. I am finding it difficult to store the remaining elements of each line as the values, i.e. the dictionary should look like:

{A: [B, C, D], B: [E, F], C: [A, B, D], D: []}

I have done the following:

    inlinkDict = {}
    for doc in text:
        adoc = doc.split(' ')
        docid = adoc[0]
        inlinkDict[docid] = inlinkDict.get(docid, 0) + ...  # I do not understand what to put in here

Please help me understand how I should add the values to my dictionary. The value should be 0 if the line has no elements other than the one that becomes the key, as for D in the example.

Do you want the dictionary to be {A: [B, C, D]; B: [E, F]; C: [A, B, D]; D: []}? Or maybe {A: "B C D"; B: "E F"; C: "A B D"; D: 0}?
– huon, Commented Mar 25, 2012 at 5:27
Please edit your question to say what you want to do about duplicate keys; for example, what if you have a 5th line containing A P Q R? Do you want to store the values B C D ... as a list ['B', 'C', 'D']? If you do, it will be much better to represent the empty case as an empty list [], not as the integer 0.
– John Machin, Commented Mar 25, 2012 at 5:34
@JohnMachin: There are no duplicate values. And yes, storing the values as a list will definitely help. I will edit my question.
– gsb, Commented Mar 25, 2012 at 5:38

Raymond Hettinger · Accepted Answer · 2012-03-25 05:43:59Z

27

A dictionary comprehension makes short work of this task:

>>> s = [['A','B','C','D'], ['B','E','F'], ['C','A','B','D'], ['D']]
>>> {t[0]:t[1:] for t in s}
{'A': ['B', 'C', 'D'], 'C': ['A', 'B', 'D'], 'B': ['E', 'F'], 'D': []}
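
For the file in the question, the nested list s has to be built from the raw lines first. A minimal sketch, assuming whitespace-separated tokens and that blank lines (such as the empty string a trailing newline leaves behind after split('\n')) should be skipped; raw and inlink_dict are illustrative names, with raw standing in for f.read():

```python
# Stand-in for f.read(); the real text would come from the file.
raw = "A B C D\nB E F\nC A B D\nD\n"

# split() with no argument splits on any run of whitespace;
# the `if line.strip()` filter drops blank lines.
s = [line.split() for line in raw.split('\n') if line.strip()]

# First token becomes the key, the rest of the line becomes the value.
inlink_dict = {t[0]: t[1:] for t in s}
print(inlink_dict)
```

Without the blank-line filter, the empty string at the end of the split would produce an empty list, and t[0] would raise an IndexError.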


3 Comments

forivall Over a year ago

If you're using an old version of Python that doesn't have dict comprehensions, you can use dict((t[0], t[1:]) for t in s) instead (note the parentheses around each key/value pair).

Raymond Hettinger Over a year ago

And if you're using a version of python that predates generator expressions, you can use dict([(t[0], t[1:]) for t in s]). And, if you're using a version older than that, you can use for t in s: d[t[0]] = t[1:]. And, if you're so far back in time that Python doesn't exist, you can use Dartmouth BASIC to DIM an array so that you can simulate a hash table by writing your own hash function. And, if you're working on a system without a higher level language, you can hand translate your assembler code into machine language and input your program with toggle switches ...

forivall Over a year ago

Ha, ha, ha. It's just that 2.5 and 2.6 are still very common, and dict comprehensions were only added in 2.7.
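
For reference, the variants mentioned in these comments can be put side by side. This is only a sketch to show they build the same dictionary; the names d1 through d4 are illustrative:

```python
s = [['A', 'B', 'C', 'D'], ['B', 'E', 'F'], ['C', 'A', 'B', 'D'], ['D']]

# Dict comprehension (Python 2.7+ and 3.x).
d1 = {t[0]: t[1:] for t in s}

# Generator expression of (key, value) tuples; the inner parentheses are required.
d2 = dict((t[0], t[1:]) for t in s)

# List of tuples, for versions without generator expressions.
d3 = dict([(t[0], t[1:]) for t in s])

# Plain loop, which works on any version.
d4 = {}
for t in s:
    d4[t[0]] = t[1:]

print(d1 == d2 == d3 == d4)  # True
```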

wim · Accepted Answer · 2019-04-12 04:31:51Z

22

Try using a slice:

inlinkDict[docid] = adoc[1:]

This will give you an empty list instead of a 0 for the case where only the key value is on the line. To get a 0 instead, use an or (which always returns one of the operands):

inlinkDict[docid] = adoc[1:] or 0
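
The or trick works because an empty list is falsy, so the right-hand operand is returned in its place. A small sketch, where links_or_zero is a hypothetical helper name used only for illustration:

```python
def links_or_zero(line):
    """Return the tokens after the first one, or 0 if there are none."""
    adoc = line.split()
    # adoc[1:] is [] when the line holds only the key; [] is falsy,
    # so `or` falls through to 0.
    return adoc[1:] or 0

print(links_or_zero("A B C D"))  # ['B', 'C', 'D']
print(links_or_zero("D"))        # 0
```

Mixing lists and integers as values means later code has to check the type before using a value, which is why the empty list is usually the more convenient choice.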

Easier way with a dict comprehension:

>>> with open('/tmp/spam.txt') as f:
...     data = [line.split() for line in f]
... 
>>> {d[0]: d[1:] for d in data}
{'A': ['B', 'C', 'D'], 'C': ['A', 'B', 'D'], 'B': ['E', 'F'], 'D': []}
>>> {d[0]: ' '.join(d[1:]) if d[1:] else 0 for d in data}
{'A': 'B C D', 'C': 'A B D', 'B': 'E F', 'D': 0}

Note: dict keys must be unique, so if you have, say, two lines beginning with 'C', the value from the first one will be overwritten by the second.
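
To illustrate the overwrite, and one possible way to check for it up front, here is a sketch using collections.Counter; the sample lines and the names key_counts and duplicates are made up for the example:

```python
from collections import Counter

lines = ["A B C D", "C A B D", "C H I J"]
data = [line.split() for line in lines]

# The second 'C' line replaces the value stored by the first.
d = {row[0]: row[1:] for row in data}
print(d['C'])  # ['H', 'I', 'J']

# Count how often each first token occurs to spot duplicate keys.
key_counts = Counter(row[0] for row in data)
duplicates = [key for key, n in key_counts.items() if n > 1]
print(duplicates)  # ['C']
```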

edited Apr 12, 2019 at 4:31

answered Mar 25, 2012 at 5:26


Burhan Khalid · Accepted Answer · 2012-03-25 08:49:08Z

4

The accepted answer is correct, except that it reads the entire file into memory (may not be desirable if you have a large file), and it will overwrite duplicate keys.

An alternative approach using defaultdict, which has been available since Python 2.5, solves both problems:

from collections import defaultdict
d = defaultdict(list)
with open('/tmp/spam.txt') as f:
  for line in f:
    parts = line.strip().split()
    d[parts[0]] += parts[1:]

Input:

A B C D
B E F
C A B D
D  
C H I J

Result:

>>> d = defaultdict(list)
>>> with open('/tmp/spam.txt') as f:
...    for line in f:
...      parts = line.strip().split()
...      d[parts[0]] += parts[1:]
...
>>> d['C']
['A', 'B', 'D', 'H', 'I', 'J']
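
The same loop can be tried without a file on disk. A sketch under a few assumptions: io.StringIO stands in for the opened file, blank lines are skipped (parts[0] would otherwise raise an IndexError), and the result is copied into a plain dict at the end so that later lookups of missing keys raise KeyError instead of silently creating entries:

```python
import io
from collections import defaultdict

# Stand-in for open('/tmp/spam.txt'); includes a blank line and a repeated key.
f = io.StringIO("A B C D\nB E F\nC A B D\nD\n\nC H I J\n")

d = defaultdict(list)
for line in f:          # iterating reads one line at a time
    parts = line.split()
    if not parts:       # skip blank lines
        continue
    # += extends the existing list, so repeated keys are merged.
    d[parts[0]] += parts[1:]

result = dict(d)        # plain dict: no auto-created keys on lookup
print(result['C'])      # ['A', 'B', 'D', 'H', 'I', 'J']
print(result['D'])      # []
```

Note that += flattens everything for a key into one list; using d[parts[0]].append(parts[1:]) instead would keep each line's values as a separate inner list.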

