11

I'm using SQLite, and I need to load hundreds of CSV files into one table. I didn't manage to find a way to do this on the web. Is it possible?

Please note that I started out with Oracle, but since Oracle has a limit of 1000 columns per table and my CSV files have more than 1500 columns each, I had to find another solution. I want to try SQLite, since I can install it quickly and easily. The CSV files were supplied with that many columns and I can't change or split them (never mind why).

Please advise.

  • Are all the CSVs going into the same table? If so, you can do cat *.csv > big.csv and just load big.csv. Commented Jan 17, 2015 at 22:05
  • Yes. Some of the files are bigger than 1GB. Merging so many huge files into one file will create an enormous file. I'm afraid it would be problematic somehow... Commented Jan 17, 2015 at 22:28
  • If your system can't handle a multi-GB CSV file, it is going to have trouble with a multi-GB database. Commented Jan 18, 2015 at 11:40
  • It is unclear, to me at least, whether your problem is that you don't know how to load a single CSV file into SQLite at all, or if the problem is that you don't know how to handle hundreds of files. Commented Jan 18, 2015 at 11:41
  • The problem is that I don't know how to handle hundreds of files. Commented Jan 18, 2015 at 15:20

4 Answers

16

I ran into a similar problem, and the comments on your question actually gave me the answer that finally worked for me.

Step 1: Merge the many CSVs into a single file. Keep the header row from just one of them at the top and exclude the headers of all the others.

Step 2: Load the single merged csv into SQLite.

For step 1 I used:

$ head -1 one.csv > all_combined.csv
$ tail -n +2 -q *.csv >> all_combined.csv

The first command writes only the first line of one CSV file (any file will do, since they all share the same header); the second command appends every file starting from line 2, thereby excluding the headers. The -q option makes sure that tail never writes the file name as a header.

Make sure to put all_combined.csv in a separate folder; otherwise the *.csv glob also matches it, and tail ends up reading from the very file it is appending to, so the output grows until the disk fills. A sketch of a safe layout follows.
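
For example, writing the combined file to a separate directory keeps it out of the glob's reach (a minimal sketch; the directory name out/ is just an example):

$ mkdir -p out
$ head -1 one.csv > out/all_combined.csv
$ tail -n +2 -q *.csv >> out/all_combined.csv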

To load into SQLite (Step 2) the answer given by Hot Licks worked for me:

 sqlite> .mode csv
 sqlite> .import all_combined.csv my_new_table

This assumes that my_new_table hasn't been created yet, so .import creates it and treats the first row as column names. Alternatively, you can create the table beforehand and then load, but in that case exclude the header in Step 1 (or skip it at import time, as sketched below).
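
For the create-beforehand variant, newer versions of the sqlite3 shell (3.32.0 and later) can skip the header row at import time with --skip, so Step 1 doesn't have to strip it. A rough sketch, with a hypothetical two-column schema standing in for your real 1500+ columns:

 sqlite> CREATE TABLE my_new_table (col_a TEXT, col_b TEXT); -- your real columns here
 sqlite> .mode csv
 sqlite> .import --skip 1 all_combined.csv my_new_table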


2 Comments

I'm not sure exactly why, but step 1 here made all_combined.csv grow recursively on my Ubuntu 20.04 until I ran out of disk space. Using a different extension, or storing it in a different folder, solves the issue.
@FonsMA That happens because the second command also reads all_combined.csv, the very file it is writing to. The best fix is to give the files you want to concatenate a distinguishing prefix (tail -n +2 -q some_*.csv >> all_combined.csv) or to write the output under a temporary name such as all_combined.commasv until done.
4

I didn't find a nicer way to solve this, so I used find along with xargs to avoid creating a huge intermediate .csv file:

find . -type f -name '*.csv' | xargs -I% sqlite3 database.db ".mode csv" ".import % new_table" ".exit"

find prints out the file names, and the -I% parameter to xargs causes the command after it to be run once for each line, with % replaced by the name of a CSV file.
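
One caveat: after the first import has created new_table, the header row of every subsequent file is inserted as ordinary data. If your sqlite3 is 3.32.0 or newer, a loop that imports the first file as-is (letting its header become the column names) and skips the header for the rest avoids that. A sketch, using a plain shell glob (non-recursive, unlike find) and surviving file names that contain spaces:

first=1
for f in ./*.csv; do
    if [ "$first" = 1 ]; then
        # first file: the table is created and its header row becomes the column names
        sqlite3 database.db ".mode csv" ".import \"$f\" new_table"
        first=0
    else
        # remaining files: the table already exists, so skip each header row
        sqlite3 database.db ".mode csv" ".import --skip 1 \"$f\" new_table"
    fi
done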


2

From http://www.sqlite.org/cli.html:

Use the ".import" command to import CSV (comma separated value) data into an SQLite table. The ".import" command takes two arguments which are the name of the disk file from which CSV data is to be read and the name of the SQLite table into which the CSV data is to be inserted.

Note that it is important to set the "mode" to "csv" before running the ".import" command. This is necessary to prevent the command-line shell from trying to interpret the input file text as some other format.

sqlite> .mode csv
sqlite> .import C:/work/somedata.csv tab1

There are two cases to consider: (1) Table "tab1" does not previously exist and (2) table "tab1" does already exist.

In the first case, when the table does not previously exist, the table is automatically created and the content of the first row of the input CSV file is used to determine the name of all the columns in the table. In other words, if the table does not previously exist, the first row of the CSV file is interpreted to be column names and the actual data starts on the second row of the CSV file.

For the second case, when the table already exists, every row of the CSV file, including the first row, is assumed to be actual content. If the CSV file contains an initial row of column labels, that row will be read as data and inserted into the table. To avoid this, make sure that table does not previously exist.


Note that you need to make sure that the files DO NOT have an initial line defining the field names. And for "hundreds" of files you will probably want to prepare a script rather than typing each import command individually; one possible sketch follows.
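
For example, you could generate the .import commands into a file and feed that to the sqlite3 shell (a sketch; it assumes the header lines have already been stripped from the files, per the note above, and that the file names contain no spaces):

$ { echo ".mode csv"; for f in *.csv; do echo ".import $f tab1"; done; } > import.sql
$ sqlite3 database.db < import.sql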

1 Comment

As can be read at the link in your answer, in the section named "Importing CSV files" (section 7.5 at the time of writing; the numbering may have changed since): if the CSV file contains an initial row of column labels, you can cause the .import command to skip that initial row using the "--skip 1" option.
0

You can use DB Browser for SQLite to do this pretty easily. File > Import > Table from CSV file... and then select all the files to open them together into a single table.

I just tested this out with a dozen CSV files and got a single 1 GB table from them without any work. As long as they have the same schema, DB Browser is able to put them together. You'll want to keep the 'Column Names in first line' option checked.

