Create a large sql table using TPC-H - sql

I want to create an SQL file like this:
create table genre (id int, name char(50));
insert into genre (id, name) values (1, 'Test');
insert into genre (id, name) values (2, 'Test2');
However, I want to generate it with TPC-H, at a size of 1 GB or more. I tried to generate the file with dbgen using this command:
dbgen -s 100 -S 1 -C 100 -T p -v
The output is a pipe-delimited table file like the one below, but I want it as an SQL file:
1|155190|7706|1|17|21168.23|0.04|0.02|N|O|1996-03-13|1996-02-12|1996-03-22|DELIVER IN PERSON|TRUCK|egular courts above the|
1|67310|7311|2|36|45983.16|0.09|0.06|N|O|1996-04-12|1996-02-28|1996-04-20|TAKE BACK RETURN|MAIL|ly final dependencies: slyly bold |
1|63700|3701|3|8|13309.60|0.10|0.02|N|O|1996-01-29|1996-03-05|1996-01-31|TAKE BACK RETURN|REG AIR|riously. regular, express dep|
Can you help me figure out which command creates an SQL file that contains the data shown above?
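One possible workaround, in case dbgen only gives you the delimited file: post-process it into INSERT statements yourself. Below is a minimal Python sketch of that idea. The function names, the table name `lineitem`, and the rule "anything that looks like a number is left unquoted" are my own assumptions, not part of TPC-H or dbgen.

```python
import re

# Heuristic (assumption): integers and decimals are emitted unquoted,
# everything else (flags, dates, free text) becomes a quoted string.
_NUMERIC = re.compile(r"-?\d+(\.\d+)?")


def sql_literal(field: str) -> str:
    """Return the field as an SQL literal, doubling embedded single quotes."""
    if _NUMERIC.fullmatch(field):
        return field
    return "'" + field.replace("'", "''") + "'"


def tbl_line_to_insert(table: str, line: str) -> str:
    """Convert one pipe-delimited line into a single INSERT statement."""
    fields = line.rstrip("\n").split("|")
    # Each line in the sample ends with '|', which leaves an empty last field.
    if fields and fields[-1] == "":
        fields.pop()
    values = ", ".join(sql_literal(f) for f in fields)
    return f"INSERT INTO {table} VALUES ({values});"


def convert(lines, table):
    """Lazily convert an iterable of lines, so large files are streamed."""
    for line in lines:
        if line.strip():
            yield tbl_line_to_insert(table, line)


sample = (
    "1|155190|7706|1|17|21168.23|0.04|0.02|N|O|1996-03-13|1996-02-12|"
    "1996-03-22|DELIVER IN PERSON|TRUCK|egular courts above the|"
)
print(tbl_line_to_insert("lineitem", sample))
```

To convert a whole file, pass an open file object to `convert` and write each statement out as it is produced. For multi-gigabyte data, loading the delimited file directly with the database's bulk loader is usually much faster than row-by-row INSERTs.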

Related

Postgres source SQL for existing table

I need to extract the source SQL for an existing table in Postgres. Is it possible?
Example: if I have a customer table, created in pgAdmin, with 2 columns: id (integer), name (text). Is there a way I can extract the CREATE statement for this existing table, so that I get SQL with “CREATE TABLE customer ...” without having to write the whole table definition by hand?
pgAdmin (III, 4):
right-click on the table name -> Scripts -> CREATE Script
Command line (pg_dump):
pg_dump -s -t table_name dbname

How to find out the row count of an input file using SQL Server?

My requirement: I have a text file (test.txt) that contains 5 lines of data. I need to get the total count of records in test.txt using a SQL query in SQL Server.
Input: the file contains data like this:
a
b
c
d
e
Required OUTPUT:
Count: 5
I removed my earlier answer because @sepupic was correct. I was too quick and posted a wrong answer. Apologies for that.
Here's what I did. I created a text file, and dropped it in my C:\temp folder. The name of the file is c:\temp\MyText.txt, and it contains a few lines (7) of text.
I then created a table to host the data:
CREATE TABLE dbo.myTableName
(myText nvarchar(50));
Using BULK INSERT I then import the contents into my table:
BULK INSERT dbo.myTableName
FROM 'c:\temp\MyText.txt'
WITH
(
CODEPAGE = '1252',
FIELDTERMINATOR = ';',
CHECK_CONSTRAINTS
)
Finally, I perform a row count on the table:
SELECT COUNT(*) AS LineCount FROM dbo.myTableName
This yields the desired result:
LineCount
7
Contents of the table:
SELECT * FROM dbo.myTableName
Result:
myText
THIS IS A FILE
WITH SEVERAL
LINES
AND CONTINUING
ON
A FEW
MORE
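If you just want to try the "load each line into a one-column table, then COUNT(*)" idea without a SQL Server instance, here is a rough Python sketch that uses the standard-library sqlite3 module instead. SQLite stands in for SQL Server here, so this is not BULK INSERT itself; the function name and the table name `lines` are made up for the illustration.

```python
import os
import sqlite3
import tempfile


def count_lines_via_table(path):
    """Load every line of a text file into a one-column table and count rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE lines (myText TEXT)")
    with open(path, encoding="utf-8") as f:
        conn.executemany(
            "INSERT INTO lines VALUES (?)",
            ((line.rstrip("\n"),) for line in f),
        )
    (count,) = conn.execute("SELECT COUNT(*) FROM lines").fetchone()
    conn.close()
    return count


# Build a small input file like the one in the question (a..e, one per line).
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("a\nb\nc\nd\ne\n")

print(count_lines_via_table(tmp.name))
os.remove(tmp.name)
```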

Hive insert with multiple select

I want to execute something like this in hive:
insert into mytable values ((select count(*) from test2), (select count(*) from test3));
Is there a way to do this?
Why do you need a Hive table with row counts as columns? Assuming that you have to log the row counts every day, I am not sure this can be done in Hive alone.
But if you want a snapshot of the row counts of all the tables, you can try a shell script like this:
$ hive -e 'use schema_name; show tables' | tee tables.txt
This stores the names of all tables in the database in a text file, tables.txt.
Now write a shell script, count_tables.sh, that gets the count for each table that was gathered:
#!/bin/bash
# Read one table name per line from stdin and print its name and row count
while read -r line
do
echo "$line "
hive -e "use schema_name; select count(*) from $line"
done
Make the script executable and run it:
$ chmod +x count_tables.sh
$ ./count_tables.sh < tables.txt > counts.txt
If you want to log the row counts periodically, you can store them in a CSV file by writing the values comma-separated, and then create an external table pointing to that file.
Something like:
$ ./count_tables.sh < tables.txt | sed 's/\t/,/g' > counts.txt
I hope this is a workable way to achieve it.
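One thing to watch for: the loop prints the table name and its count on separate lines, so replacing tabs with commas may not be enough to get one "table,count" row per table. A small Python sketch of the pairing step follows. It assumes the output strictly alternates name line, count line (that is, hive writes only the count to stdout); the function name `counts_to_csv` is hypothetical.

```python
import csv
import io


def counts_to_csv(lines):
    """Pair alternating 'table name' / 'count' lines into CSV rows."""
    cleaned = [ln.strip() for ln in lines if ln.strip()]
    if len(cleaned) % 2 != 0:
        raise ValueError("expected alternating table-name and count lines")
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # Even positions hold table names, odd positions hold their counts.
    for name, count in zip(cleaned[0::2], cleaned[1::2]):
        writer.writerow([name, int(count)])
    return out.getvalue()


example = ["orders \n", "42\n", "customers \n", "7\n"]
print(counts_to_csv(example), end="")
```

The resulting text can be written to the file that the external table points to.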
I found out the answer. It should be something like this:
INSERT INTO TABLE mytable
SELECT a.c1, b.c2 FROM
(SELECT count(*) AS c1 FROM test2) a
CROSS JOIN
(SELECT count(*) AS c2 FROM test3) b;
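The general pattern (two single-row count subqueries, each with a named column, cross-joined into one row) can be tried outside Hive as well. Here is a small sketch using Python's built-in sqlite3 module. SQLite is only a stand-in for Hive here, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE test2 (x INTEGER);
    CREATE TABLE test3 (x INTEGER);
    CREATE TABLE mytable (c1 INTEGER, c2 INTEGER);
    INSERT INTO test2 VALUES (1), (2), (3);
    INSERT INTO test3 VALUES (10), (20);
    """
)

# Each subquery yields exactly one row, so the cross join yields one row
# holding both counts side by side.
conn.execute(
    """
    INSERT INTO mytable
    SELECT a.c1, b.c2
    FROM (SELECT count(*) AS c1 FROM test2) AS a
    CROSS JOIN (SELECT count(*) AS c2 FROM test3) AS b
    """
)

rows = conn.execute("SELECT c1, c2 FROM mytable").fetchall()
print(rows)
```

The important detail is that the counts are given column aliases inside the subqueries, and the outer SELECT refers to those columns through the subquery aliases.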

Efficient way to compare data

I want to upload a CSV file into the database via my PHP script.
The database contains these parts:
part Description
1 test
2 pc
3 monitor
When I upload the CSV, the code must check whether each part already exists in the database and, if not, add it.
For example:
My CSV:
1 test
2 pc
4 monitor
5 keyboard
If I upload this file, it must only add the keyboard.
The database contains 8000 products. How can I do this comparison efficiently and quickly?
Load the CSV file into a temporary table, then try an INSERT ... SELECT:
INSERT INTO mytable SELECT * FROM temptable MINUS SELECT * FROM mytable;
(MINUS is Oracle syntax; PostgreSQL and SQL Server use EXCEPT instead.)
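To illustrate the staging-table idea, here is a sketch using Python's sqlite3 module with a NOT EXISTS anti-join in place of MINUS. The table names `products` and `staging` are invented, and I am assuming that `part` is the key that decides whether a row already exists. Under that assumption both part 4 and part 5 count as new; if the description is what identifies a product, compare on that column instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE products (part INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE staging (part INTEGER, description TEXT);
    INSERT INTO products VALUES (1, 'test'), (2, 'pc'), (3, 'monitor');
    """
)

# Rows as they would arrive from the uploaded CSV.
csv_rows = [(1, "test"), (2, "pc"), (4, "monitor"), (5, "keyboard")]
conn.executemany("INSERT INTO staging VALUES (?, ?)", csv_rows)

# One set-based statement: insert only staging rows whose part number
# is not yet present in products.
conn.execute(
    """
    INSERT INTO products (part, description)
    SELECT s.part, s.description
    FROM staging AS s
    WHERE NOT EXISTS (SELECT 1 FROM products AS p WHERE p.part = s.part)
    """
)
conn.commit()

parts = [row[0] for row in conn.execute("SELECT part FROM products ORDER BY part")]
print(parts)
```

With only a few thousand rows, a single statement like this lets the database do the comparison, rather than issuing one lookup per CSV line from the script.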

Inserting Data into Hive Table

I am new to Hive. I have successfully set up a single-node Hadoop cluster for development purposes, and on top of it I have installed Hive and Pig.
I created a dummy table in Hive:
create table foo (id int, name string);
Now I want to insert data into this table. Can I add data one record at a time, just like in SQL? Kindly help me with a command analogous to:
insert into foo (id, name) VALUES (12, "xyz");
Also, I have a CSV file which contains data in the format:
1,name1
2,name2
..
..
..
1000,name1000
How can I load this data into the dummy table?
I think the best way is:
a) Copy data into HDFS (if it is not already there)
b) Create external table over your CSV like this
CREATE EXTERNAL TABLE TableName (id int, name string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION 'place in HDFS';
c) You can start using TableName already by issuing queries to it.
d) If you want to insert the data into another Hive table:
insert overwrite table finalTable select * from TableName;
There's no direct way to insert one record at a time from the terminal. However, here's an easy, straightforward workaround that I usually use when I want to test something:
Assume that t is a table with at least one record. The type and number of its columns do not matter.
INSERT INTO TABLE foo
SELECT '12', 'xyz'
FROM t
LIMIT 1;
Hive apparently supports INSERT...VALUES starting in Hive 0.14.
Please see the section 'Inserting into tables from SQL' at: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML
Whatever data you have in a text file or log file can be put at a path in HDFS and then loaded with a query like the following in Hive:
hive> load data inpath '<input path>' into table <tablename>;
Example:
hive> create table foo (id int, name string)
row format delimited
fields terminated by ','  -- or '\t' or '|', depending on the file
stored as textfile;
Table created.
Data insertion:
hive> load data inpath '/home/hive/foodata.log' into table foo;
To insert an ad-hoc value like (12, "xyz"), do this:
insert into table foo select * from (select 12, "xyz") a;
This is supported from Hive version 0.14.
INSERT INTO TABLE pd_temp(dept,make,cost,id,asmb_city,asmb_ct,retail) VALUES('production','thailand',10,99202,'northcarolina','usa',20);
These are limitations of Hive (in versions before 0.14):
1. You cannot update data after it is inserted
2. There is no "insert into table values ... " statement
3. You can only load data using bulk load
4. There is no "delete from " command
5. You can only do bulk delete
But if you still want to insert a record from the Hive console, you can do a select from stack.
You may try this: I have developed a tool that generates Hive scripts from a CSV file. The following are a few examples of the files it generates.
Tool -- https://sourceforge.net/projects/csvtohive/?source=directory
Select a CSV file using Browse and set the Hadoop root directory, e.g. /user/bigdataproject/
The tool generates a Hadoop script covering all the CSV files. The following is a sample of the generated Hadoop script that puts the CSV files into Hadoop:
#!/bin/bash -v
hadoop fs -put ./AllstarFull.csv /user/bigdataproject/AllstarFull.csv
hive -f ./AllstarFull.hive
hadoop fs -put ./Appearances.csv /user/bigdataproject/Appearances.csv
hive -f ./Appearances.hive
hadoop fs -put ./AwardsManagers.csv /user/bigdataproject/AwardsManagers.csv
hive -f ./AwardsManagers.hive
Sample of generated Hive scripts
CREATE DATABASE IF NOT EXISTS lahman;
USE lahman;
CREATE TABLE AllstarFull (playerID string,yearID string,gameNum string,gameID string,teamID string,lgID string,GP string,startingPos string) row format delimited fields terminated by ',' stored as textfile;
LOAD DATA INPATH '/user/bigdataproject/AllstarFull.csv' OVERWRITE INTO TABLE AllstarFull;
SELECT * FROM AllstarFull;
Thanks
Vijay
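For anyone curious how such a generator could work, here is a small Python sketch that builds a Hive script of the same shape from a CSV header line. This is not the tool's actual code. The function name and parameters are my own, and like the sample above it types every column as string.

```python
def hive_script_for_csv(table, header_line, hdfs_path, database="lahman"):
    """Build a Hive script that creates a table for a CSV file and loads it.

    Every column is typed as string, mirroring the generated sample above.
    """
    columns = [c.strip() for c in header_line.strip().split(",")]
    column_defs = ",".join(f"{c} string" for c in columns)
    return "\n".join(
        [
            f"CREATE DATABASE IF NOT EXISTS {database};",
            f"USE {database};",
            f"CREATE TABLE {table} ({column_defs}) "
            "row format delimited fields terminated by ',' stored as textfile;",
            f"LOAD DATA INPATH '{hdfs_path}' OVERWRITE INTO TABLE {table};",
        ]
    )


script = hive_script_for_csv(
    "AllstarFull",
    "playerID,yearID,gameNum,gameID,teamID,lgID,GP,startingPos",
    "/user/bigdataproject/AllstarFull.csv",
)
print(script)
```

Note that if the CSV file itself contains the header row, that row would be loaded as data unless it is removed or skipped.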
You can use the following lines of code to insert values into an already existing table. Here the table is db_name.table_name, which has two columns, and I am inserting 'ALL','Done' as a row in the table.
insert into table db_name.table_name
select 'ALL','Done';
Hope this was helpful.
The Hadoop file system does not support appending data to existing files. However, you can load your CSV file into HDFS and tell Hive to treat it as an external table.
Use this -
create table dummy_table_name as select * from source_table_name;
This will create the new table with the existing data from source_table_name.
LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE <table_name>;
Use this command to load the data all at once; just specify the file path.
If the file is in the local file system, use LOCAL; if the file is in HDFS, omit LOCAL.