Is there a tool which would automatically provide me with a SQL interface on top of CSV files?
4 Answers
You can use http://harelba.github.io/q/
Example (from their README):
q "SELECT COUNT(*) FROM ./my_file.csv WHERE c3 > 32.3"
There is also a Python module & program: https://pythonhosted.org/querycsv/
Or in Go: https://github.com/dinedal/textql
You can also use this PowerShell script: Invoke-CsvSqlcmd
Many other tools can be found here.
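If you would rather not install anything, the same idea can be sketched with Python's standard library alone: load the CSV into an in-memory SQLite table, then run ordinary SQL against it. This is only a minimal sketch and not how any of the tools above is implemented. The helper names `load_csv` and `query_csv` and the default table name `data` are made up for illustration, and it assumes the file has a header row and every data row has the same number of fields.

```python
import csv
import sqlite3


def load_csv(conn, table, path):
    """Load a CSV file with a header row into a new SQLite table.

    Hypothetical helper for illustration. Every column is declared NUMERIC
    so that numeric-looking text is stored as a number and comparisons such
    as `c3 > 32.3` behave as expected. Side effect: a value like "007" is
    stored as the integer 7.
    """
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        cols = ", ".join(f'"{name}" NUMERIC' for name in header)
        marks = ", ".join("?" for _ in header)
        conn.execute(f'CREATE TABLE "{table}" ({cols})')
        # Skip completely empty lines; other malformed rows will raise.
        rows = (row for row in reader if row)
        conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', rows)


def query_csv(path, sql, table="data"):
    """Run `sql` against the CSV at `path`, exposed as table `table`."""
    conn = sqlite3.connect(":memory:")
    try:
        load_csv(conn, table, path)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()
```

With a file whose header is `c1,c2,c3`, calling `query_csv("./my_file.csv", "SELECT COUNT(*) FROM data WHERE c3 > 32.3")` returns a list with a single one-element tuple holding the count. For large files or automatic dialect detection, one of the dedicated tools is the better choice.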
Comments
Have a look at http://harelba.github.io/q/. You can then run a command like this:
q "SELECT COUNT(*) FROM ./clicks_file.csv WHERE c3 > 32.3"
Comments
Oracle's External Table feature.
You can use a local installation of Oracle, for example Oracle Database Express Edition 11g Release 2 (Download).
This might require more setup than the other tools, but it gives you the full data-processing capabilities of the database.
create directory c_temp as 'c:\Temp';
create table myfile (c1 int,c2 int,c3 int)
organization external (type oracle_loader default directory c_temp location ('myfile.csv'));
select * from myfile;
Comments
One of the most powerful SQL OLAP tools for automatically querying CSV files (and more) is DuckDB.
You can run complex SQL queries by ingesting CSV/TSV files with automatic type and dialect detection, creating in-memory tables even on small desktop computers.
For instance:
CREATE TABLE temp_table AS SELECT * from read_csv_auto('/tmp/sample_*.csv');
This imports many CSV files into a single table, automatically detecting the headers, separator, and type of each column, so you can later run complex SQL queries.
It runs on Windows/Linux/macOS from a single binary.